Cost & Performance Optimisation Questions
Practice questions for the Cost & Performance Optimisation topic of the Databricks Certified Data Engineer Professional exam. 26 questions cover this domain.
Which statement best describes Databricks predictive optimization for Unity Catalog managed Delta tables?
What is the primary purpose of running the `OPTIMIZE` command on a Delta table?
A data engineering team switches their Delta Lake ETL workloads from standard Databricks Runtime to a Photon-enabled runtime. Which type of operations...
A data engineer changes the liquid clustering keys on an existing table from `(created_date)` to `(created_date, region)` using `ALTER TABLE`. They th...
A data engineer creates a new Delta table for event analytics. Queries will filter on different combinations of `event_date`, `region`, and `event_typ...
A PySpark job joins a large transactions table (200 GB) with a small currency rates lookup table (10 MB). The job experiences slow performance due to ...
A data engineer profiles a Spark job in the Spark UI and notices that one stage has a very high ratio of shuffle write bytes to shuffle read bytes in ...
A data engineering team runs a Lakeflow Job with 10 tasks daily. Each task creates a new all-purpose cluster (due to legacy configuration), which incu...
A data engineering team uses a shared all-purpose cluster that runs continuously 24/7. Multiple engineers use it interactively during business hours b...
A data engineer wants to reduce storage costs for a Delta table that has daily `OPTIMIZE` runs. The table has a 30-day retention requirement for time ...
What does the `ZORDER BY` clause in a Delta Lake `OPTIMIZE` command do?
A data engineer observes that a large PySpark job has a very long GC (Garbage Collection) pause time visible in the Spark UI executor metrics. Which c...
Which maintenance operations can predictive optimization run on Unity Catalog managed tables?
A new Delta table is expected to stay under 10 TB, and most queries filter on one or two columns. Which clustering-key guidance matches the Databricks...
A large existing Delta table is altered to add liquid clustering keys for the first time. After a normal `OPTIMIZE`, older files still follow the old ...
A table already has `CLUSTER BY AUTO`. An engineer runs `CREATE OR REPLACE TABLE` for it but forgets to include `CLUSTER BY AUTO` in the replacement s...
Which statement about liquid clustering is correct?
A team wants Databricks to choose and adjust clustering keys automatically over time based on query patterns. What prerequisite must be in place?
Which tables are explicitly excluded from predictive optimization?
A governance team enabled predictive optimization at the account level and asks which table types it will skip. Which answer is correct?