A data engineer needs to perform a one-time batch load of Parquet files from an S3 bucket into an existing Delta table. Previously loaded files must not be re-ingested on future runs of the same command. Which approach is most appropriate?
`spark.read.parquet(path).write.format("delta").mode("append").save(table_path)`
`COPY INTO` SQL command
`spark.readStream.format("cloudFiles")` with a trigger
`MERGE INTO` with the Parquet files as the source
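For reference while weighing the options, here is a minimal sketch of what the `COPY INTO` option could look like. Databricks documents `COPY INTO` as a retriable, idempotent operation: files in the source location that were already loaded are skipped when the same command runs again. The helper `build_copy_into`, the table name `main.sales.orders`, and the bucket path `s3://example-bucket/orders/` are hypothetical names chosen for illustration; they do not come from the question.

```python
def build_copy_into(target_table: str, source_path: str,
                    file_format: str = "PARQUET") -> str:
    """Build a COPY INTO statement for an existing Delta table.

    Hypothetical helper: it only assembles the SQL text and does not
    validate or escape its inputs.
    """
    return (
        f"COPY INTO {target_table}\n"
        f"FROM '{source_path}'\n"
        f"FILEFORMAT = {file_format}"
    )


# Hypothetical table name and S3 path, for illustration only.
statement = build_copy_into("main.sales.orders", "s3://example-bucket/orders/")
print(statement)

# On a Databricks cluster, where a `spark` session is assumed to exist,
# the statement could then be executed with:
#   spark.sql(statement)
```

By contrast, the plain `spark.read.parquet(...).write...mode("append")` option keeps no record of which files it has already processed, so re-running it would append the same data again.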
Educational Content: CertQnA practice questions are written against official exam objectives, covering the same domains tested on the real exam. All content is original and independent; it consists of no actual exam questions and is not affiliated with any certification vendor.