Designing Data Processing Systems: Questions
Practice questions for the "Designing data processing systems" topic of the Google Cloud Professional Data Engineer certification. This domain has 44 questions.
An enterprise wants a centralized governance layer that connects to BigQuery, Cloud Storage, Pub/Sub, and Spanner, then enriches metadata with profili...
Analysts need to build ELT workflows in BigQuery using SQL, keep the code in version control, define dependencies between tables, and add assertions f...
A data platform team needs an asynchronous, scalable service that decouples event producers from event consumers and is commonly used for streaming an...
A data engineering team wants a fully managed, cloud-native service for visually building and managing scalable enterprise data integration pipelines ...
A workflow must pause until an external approval system calls back, and the design should avoid polling while allowing the workflow to resume later. W...
A migration project needs to move 50 TiB of data from Amazon S3 and on-premises file storage into Cloud Storage using a managed service optimized for ...
A platform team wants to build an open lakehouse on Cloud Storage using Apache Iceberg so Spark, Flink, Hive, and BigQuery can work with the same data...
An architecture must orchestrate calls to Cloud Run, BigQuery, and external HTTP APIs in a defined order, scale on demand, and incur no charges while ...
A governance initiative requires automatic harvesting of technical metadata from distributed Google Cloud sources and enrichment with business context...
A company wants to replicate ongoing changes from an operational database into BigQuery with a serverless change data capture service. Which service s...
A publisher wants to share BigQuery data with subscribers without copying the underlying data and wants subscribers to access it through a linked data...
A retailer needs to store five years of IoT telemetry from millions of devices, supporting low-latency reads on the most recent values for a single de...
A team wants to build an open data lakehouse on Cloud Storage where Iceberg tables are managed with a unified metastore that BigQuery and Spark can bo...
Which Google Cloud service is a fully managed, globally distributed relational database that provides external consistency and horizontal scaling for ...
A team must choose between Workflows and Managed Service for Apache Airflow for orchestrating complex data pipelines. Which factor most strongly favor...
A streaming architecture must process millions of events per second with low latency, support late-arriving data with watermarks, and reuse the same c...
Which fully managed Google Cloud service runs Apache Beam pipelines for unified batch and stream processing?
A team needs to migrate an operational MySQL database to Google Cloud with continuous replication and minimal downtime cutover, with built-in schema c...
A SaaS provider needs a relational database that supports global users with strong transactional consistency and minimal operational overhead for repl...
Which Google Cloud service is a managed Spark and Hadoop platform suited for migrating existing on-premises Hadoop and Spark workloads?