Designing data processing systems Questions

Practice questions for the "Designing data processing systems" topic in Google Cloud Professional Data Engineer. 44 questions cover this domain.

44 questions: 12 easy, 21 medium, 11 hard
Q1
medium

An enterprise wants a centralized governance layer that connects to BigQuery, Cloud Storage, Pub/Sub, and Spanner, then enriches metadata with profili...

Q2
medium

Analysts need to build ELT workflows in BigQuery using SQL, keep the code in version control, define dependencies between tables, and add assertions f...

Q3
easy

A data platform team needs an asynchronous, scalable service that decouples event producers from event consumers and is commonly used for streaming an...

Q4
medium

A data engineering team wants a fully managed, cloud-native service for visually building and managing scalable enterprise data integration pipelines ...

Q5
hard

A workflow must pause until an external approval system calls back, and the design should avoid polling while allowing the workflow to resume later. W...

Q6
medium

A migration project needs to move 50 TiB of data from Amazon S3 and on-premises file storage into Cloud Storage using a managed service optimized for ...

Q7
hard

A platform team wants to build an open lakehouse on Cloud Storage using Apache Iceberg so Spark, Flink, Hive, and BigQuery can work with the same data...

Q8
easy

An architecture must orchestrate calls to Cloud Run, BigQuery, and external HTTP APIs in a defined order, scale on demand, and incur no charges while ...

Q9
hard

A governance initiative requires automatic harvesting of technical metadata from distributed Google Cloud sources and enrichment with business context...

Q10
easy

A company wants to replicate ongoing changes from an operational database into BigQuery with a serverless change data capture service. Which service s...

Q11
medium

A publisher wants to share BigQuery data with subscribers without copying the underlying data and wants subscribers to access it through a linked data...

Q12
medium

A retailer needs to store five years of IoT telemetry from millions of devices, supporting low-latency reads on the most recent values for a single de...

Q13
medium

A team wants to build an open data lakehouse on Cloud Storage where Iceberg tables are managed with a unified metastore that BigQuery and Spark can bo...

Q14
easy

Which Google Cloud service is a fully managed, globally distributed relational database that provides external consistency and horizontal scaling for ...

Q15
hard

A team must choose between Workflows and Managed Service for Apache Airflow for orchestrating complex data pipelines. Which factor most strongly favor...

Q16
hard

A streaming architecture must process millions of events per second with low latency, support late-arriving data with watermarks, and reuse the same c...

Q17
easy

Which fully managed Google Cloud service runs Apache Beam pipelines for unified batch and stream processing?

Q18
medium

A team needs to migrate an operational MySQL database to Google Cloud with continuous replication and minimal downtime cutover, with built-in schema c...

Q19
medium

A SaaS provider needs a relational database that supports global users with strong transactional consistency and minimal operational overhead for repl...

Q20
easy

Which Google Cloud service is a managed Spark and Hadoop platform suited for migrating existing on-premises Hadoop and Spark workloads?
