MLA
Model Deployment
hard

A data science team deploys a model for batch inference using mlflow.pyfunc.spark_udf() on a DataFrame with 1 million rows. The job is very slow despite having a large cluster. Which change is most likely to improve performance?

A. Switch from spark_udf to mlflow.pyfunc.load_model() and call predict() on the full DataFrame on the driver
B. Increase model complexity to improve prediction throughput
C. Repartition the input DataFrame to increase parallelism across executors
D. Log the model as a smaller artifact to reduce load time
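
To make option C concrete, here is a minimal sketch of repartitioning a DataFrame before applying the UDF. It assumes the slowdown comes from the input having too few partitions, so most executor cores sit idle. The helper names (target_partitions, score_in_parallel), the "prediction" column name, and the "double" result type are hypothetical choices for illustration; the two-tasks-per-core default is only a common rule of thumb and should be tuned for the actual cluster.

```python
def target_partitions(executors, cores_per_executor, tasks_per_core=2):
    """Rule-of-thumb partition count: a few tasks per available core.

    Hypothetical helper; the default multiplier is an assumption.
    """
    return max(1, executors * cores_per_executor * tasks_per_core)


def score_in_parallel(spark, df, model_uri, feature_cols, num_partitions):
    """Repartition df, then apply the MLflow model as a Spark UDF.

    Hypothetical helper. Assumes the model returns one numeric
    prediction per row.
    """
    # Imports are deferred so target_partitions() can be used
    # without Spark or MLflow installed.
    import mlflow.pyfunc
    from pyspark.sql.functions import struct

    # Wrap the logged model as a UDF that runs on the executors.
    predict_udf = mlflow.pyfunc.spark_udf(spark, model_uri, result_type="double")

    # Spread the rows over more partitions so more tasks can run at once.
    repartitioned = df.repartition(num_partitions)

    # Each task scores only the rows in its own partition.
    return repartitioned.withColumn(
        "prediction", predict_udf(struct(*feature_cols))
    )
```

Checking df.rdd.getNumPartitions() before and after the repartition() call is a quick way to see whether the input was under-partitioned in the first place.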

Educational Content: CertQnA practice questions are written against official exam objectives and cover the same domains tested on the real exam. All content is original and independent; these are not actual exam questions, and CertQnA is not affiliated with any certification vendor.
