
Ingesting and processing the data: Questions

Practice questions for the "Ingesting and processing the data" topic of the Google Cloud Professional Data Engineer exam. 53 questions cover this domain.

53 questions: 13 easy, 28 medium, 12 hard
Q1
medium

Pub/Sub uses per-message parallelism instead of partition-based messaging. What is a primary benefit of that design?

Q2
hard

Which statement about Pub/Sub dead-letter delivery attempts is accurate?

Q3
hard

A team wants to trigger DAG executions when objects arrive in a Cloud Storage bucket while keeping the orchestration logic in Airflow. Which service b...

Q4
medium

What does Dataform's hermetic compilation model guarantee?

Q5
medium

Which destinations does Datastream replicate to directly according to the product overview?

Q6
easy

In Dataform, what is the primary purpose of the ref function?

Q7
hard

A Dataform team schedules workflow configuration runs every 15 minutes, but some runs exceed that duration. According to the documentation, what happe...

Q8
hard

A workflow today contains complex transformations and reusable business logic written directly in Workflows YAML. The team wants to align with Google'...

Q9
medium

A team wants a visual pipeline service that can build both batch and real-time data pipelines without writing most of the orchestration code by hand. ...

Q10
medium

In a Managed Service for Apache Airflow environment, where are DAGs, logs, custom plugins, and environment data stored?

Q11
easy

Which service is the fully managed workflow orchestration offering based on Apache Airflow, where workflows are created as DAGs in Python files?

Q12
easy

In Pub/Sub, dead-letter topics are configured on which resource?

Q13
medium

A downstream integration pipeline must run automatically when one or more upstream Cloud Data Fusion pipelines complete, and the trigger should be abl...

Q14
medium

Which type of Apache Beam window assigns each element to a single window of fixed length that does not overlap with other windows?

Q15
hard

A streaming pipeline must include events that arrive after their event-time window closes, up to 30 minutes late. Which Apache Beam concept directly c...

Q16
medium

A team wants to package and parameterize a Dataflow pipeline so non-developers can launch it from the console with custom arguments and so the pipelin...

Q17
medium

A team wants to ingest streaming data into BigQuery using a high-throughput, exactly-once method that supports stream-level transactions and is the re...

Q18
hard

A data team wants to run Apache Spark batch jobs on Google Cloud without provisioning, sizing, or managing a cluster, and pay only for the duration th...

Q19
medium

A team needs to schedule recurring imports from Amazon S3, Google Ads, and YouTube reports into BigQuery without writing custom code. Which managed se...

Q20
easy

Which Pub/Sub feature allows messages with the same key to be delivered to subscribers in the order they were published?
