A data science team deploys a model for batch inference using mlflow.pyfunc.spark_udf() on a DataFrame with 1 million rows. The job is very slow despite having a large cluster. Which change is most likely to improve performance?
spark_udf to mlflow.pyfunc.load_model() and call predict() on the full DataFrame on the driver
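For context, loading the model with mlflow.pyfunc.load_model() and calling predict() on the driver would pull all the data onto a single node and leave the executors idle, so it is unlikely to speed up a job on a large cluster. Below is a minimal sketch of the distributed pattern the question describes. It assumes (this is not stated in the question) that the slowdown comes from the DataFrame having too few partitions to keep the cluster busy. The function name score_in_parallel and its parameters are hypothetical, chosen only for illustration.

```python
def score_in_parallel(spark, df, model_uri, feature_cols, num_partitions):
    """Score `df` on the executors with an MLflow pyfunc model.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Imported inside the function so this sketch can be loaded
    # without a Spark or MLflow installation on the local machine.
    import mlflow.pyfunc
    from pyspark.sql.functions import struct

    # Wrap the logged model as a Spark UDF so prediction runs on the executors.
    # result_type="double" assumes the model returns one numeric value per row.
    predict_udf = mlflow.pyfunc.spark_udf(
        spark, model_uri=model_uri, result_type="double"
    )

    # Assumption: the input has too few partitions. Repartitioning spreads
    # the rows across more tasks so more executor cores can work at once.
    repartitioned = df.repartition(num_partitions)

    # Pass the feature columns to the UDF as a single struct column.
    return repartitioned.withColumn(
        "prediction", predict_udf(struct(*feature_cols))
    )
```

A reasonable starting point for num_partitions is a small multiple of the total executor cores, though the right value depends on the cluster and the model's memory footprint.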
Educational Content: CertQnA practice questions are written against official exam objectives, covering the same domains tested on the real exam. All content is original and independent: not actual exam questions, and not affiliated with any certification vendor.