Productionizing Data Pipelines Questions

Practice questions for the Productionizing Data Pipelines topic of the Databricks Certified Data Engineer Associate exam. 36 questions cover this domain.

36 questions: 8 easy, 18 medium, 10 hard

Q1
medium

A data engineer is designing a Lakeflow Job with three tasks: Task A ingests data, Task B transforms it, and Task C validates the output. Tasks B and ...

Q2
hard

A data engineer wants to programmatically create and deploy Lakeflow Jobs as part of a CI/CD pipeline using infrastructure-as-code. Which Databricks t...

Q3
medium

A data engineering team stores their job notebooks in a Git repository. They want to ensure that each Lakeflow Job run uses a specific tagged version ...

Q4
easy

What does a Lakeflow Jobs trigger define?

Q5
medium

A data engineer needs to pass a runtime parameter `run_date` to all tasks in a Lakeflow Job so each task can filter data for the correct date. Which L...

Q6
medium

A data engineer wants to receive an email alert whenever a Lakeflow Job run fails. Which Lakeflow Jobs feature should they configure?

Q7
hard

A data engineer is running a Lakeflow Job with 5 tasks. Task 3 fails intermittently due to transient network errors. The engineer wants to automatical...

Q8
easy

In Lakeflow Jobs, what is a task?

Q9
hard

A Lakeflow Job runs daily and processes data from the previous day. The job has been running successfully for months. After a code change deployed on ...

Q10
hard

A data engineer's production Lakeflow Job fails at Task C during an overnight run. Tasks A and B completed successfully. After fixing the bug in Task ...

Q11
medium

A data engineer wants to programmatically monitor a Lakeflow Job run and check its completion status from outside Databricks (e.g., from a CI/CD pipel...

Q12
hard

A data engineer's Lakeflow Job fails intermittently during Task C with a transient `java.lang.OutOfMemoryError`. The task processes a large dataset an...

Q13
medium

A data engineer wants to use Databricks Asset Bundles to manage a production Lakeflow Job as code. After setting up the bundle YAML, which CLI command...

Q14
medium

A data engineer wants to pass a dynamic date parameter to all tasks in a Lakeflow Job so the job processes only data for the previous day. The enginee...

Q15
hard

A Lakeflow Job runs a multi-task ETL pipeline nightly. After a recent deployment, Task D (which runs a complex transformation notebook) starts failing...

Q16
easy

In Lakeflow Jobs, what does setting a task's `depends_on` property do?

Q17
medium

A Lakeflow Job has five tasks (A, B, C, D, E). Tasks B and C both depend on Task A. Task D depends on both B and C. Task E depends on D. Task B fails....

Q18
easy

A data engineer creates a Lakeflow Job with a cron trigger set to `0 0 * * *`. What does this cron expression mean?

Q19
easy

Which Lakeflow Jobs trigger starts a run when new files appear in a monitored Unity Catalog storage location?

Q20
medium

Several tasks in one job share the same jobs compute resource. Which behavior should the engineer keep in mind?
