Ingesting and processing the data: Questions
Practice questions for the "Ingesting and processing the data" topic of the Google Cloud Professional Data Engineer exam. 53 questions cover this domain.
Pub/Sub uses per-message parallelism instead of partition-based messaging. What is a primary benefit of that design?
Which statement about Pub/Sub dead-letter delivery attempts is accurate?
A team wants to trigger DAG executions when objects arrive in a Cloud Storage bucket while keeping the orchestration logic in Airflow. Which service b...
What does Dataform's hermetic compilation model guarantee?
Which destinations does Datastream replicate to directly according to the product overview?
In Dataform, what is the primary purpose of the ref function?
A Dataform team schedules workflow configuration runs every 15 minutes, but some runs exceed that duration. According to the documentation, what happe...
A workflow today contains complex transformations and reusable business logic written directly in Workflows YAML. The team wants to align with Google'...
A team wants a visual pipeline service that can build both batch and real-time data pipelines without writing most of the orchestration code by hand. ...
In a Managed Service for Apache Airflow environment, where are DAGs, logs, custom plugins, and environment data stored?
Which service is the fully managed workflow orchestration offering based on Apache Airflow, where workflows are created as DAGs in Python files?
In Pub/Sub, dead-letter topics are configured on which resource?
A downstream integration pipeline must run automatically when one or more upstream Cloud Data Fusion pipelines complete, and the trigger should be abl...
Which type of Apache Beam window assigns each element to a single window of fixed length that does not overlap with other windows?
A streaming pipeline must include events that arrive after their event-time window closes, up to 30 minutes late. Which Apache Beam concept directly c...
A team wants to package and parameterize a Dataflow pipeline so non-developers can launch it from the console with custom arguments and so the pipelin...
A team wants to ingest streaming data into BigQuery using a high-throughput, exactly-once method that supports stream-level transactions and is the re...
A data team wants to run Apache Spark batch jobs on Google Cloud without provisioning, sizing, or managing a cluster, and pay only for the duration th...
A team needs to schedule recurring imports from Amazon S3, Google Ads, and YouTube reports into BigQuery without writing custom code. Which managed se...
Which Pub/Sub feature allows messages with the same key to be delivered to subscribers in the order they were published?