ML Workflows Questions
Practice questions for the ML Workflows topic of the Databricks Certified Machine Learning Associate exam. 38 questions cover this domain.
A data scientist uses `hp.loguniform("learning_rate", np.log(1e-5), np.log(1e-1))` in a Hyperopt search space. What is the practical implication of us...
What does the `cross_val_score()` function from scikit-learn return when used for model evaluation?
A data scientist uses Hyperopt with SparkTrials for distributed hyperparameter tuning. Which code pattern correctly implements distributed tuning?
A data scientist computes the following confusion matrix on the test set: True Positives=80, False Positives=20, False Negatives=30, True Negatives=12...
A data scientist is evaluating a regression model and finds a high R² value but also a high RMSE. What is the most likely explanation?
When creating a training dataset using the Databricks Feature Store, a data scientist wants to ensure that features reflect only the values available ...
A data scientist needs to search all runs in a specific experiment and filter only those with a validation accuracy greater than 0.90, then retrieve t...
A data scientist runs 50 Hyperopt trials in an MLflow experiment. They want to programmatically select the run with the highest `val_f1` and retrieve ...
In the MLflow tracking UI, what is the easiest way to identify which training run produced the model with the lowest validation RMSE across an experim...
A data scientist needs to programmatically retrieve a completed MLflow run's metric value using the run ID. Which code correctly does this?
In scikit-learn, which method on a fitted `StandardScaler` applies the learned mean and standard deviation to new data without refitting?
A data scientist evaluates a multiclass classifier on 3 classes and wants a single performance metric that weights each class by the number of support...
A data scientist is debugging a Hyperopt run where all trials return the same loss value regardless of the hyperparameter configuration. What is the m...
In scikit-learn, which class is used to chain preprocessing steps and a final estimator into a single object that can be trained and used for predicti...
A data scientist compares two scikit-learn models using 5-fold cross-validation. Model A has mean_accuracy=0.88, std=0.01. Model B has mean_accuracy=0...
A data scientist uses Hyperopt with the TPE algorithm to search over the following search space. Which statement correctly describes the behavior of `...
A data scientist uses `train_test_split()` from scikit-learn to split data into training and test sets. To ensure the same split is produced every tim...
A data scientist builds a preprocessing and model pipeline in scikit-learn and wants to ensure it is logged correctly to MLflow so that preprocessing ...
A data scientist wants to evaluate a scikit-learn model with stratified k-fold cross-validation (k=5) to ensure class proportions are preserved in eac...
A data scientist implements a Hyperopt objective function for XGBoost and uses `SparkTrials(parallelism=8)`. Their cluster has 16 worker cores. After ...
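Several of the questions above touch on the same scikit-learn building blocks: a reproducible `train_test_split()`, a fitted `StandardScaler`, a `Pipeline`, and stratified 5-fold cross-validation with `cross_val_score()`. The sketch below shows how these pieces fit together. It is only an illustration: the synthetic dataset, the `LogisticRegression` estimator, and every parameter value are assumptions chosen for the example and do not come from the question set.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, purely illustrative (an assumption, not taken from the questions).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# A fixed random_state makes the split reproducible from run to run.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# fit() learns the mean and standard deviation from the training data;
# transform() applies those learned statistics to new data without refitting.
scaler = StandardScaler().fit(X_train)
X_test_scaled = scaler.transform(X_test)

# A Pipeline chains preprocessing and the final estimator into one object,
# so the scaler is refit inside each cross-validation fold.
pipeline = Pipeline(
    [
        ("scaler", StandardScaler()),
        ("model", LogisticRegression(max_iter=1000)),
    ]
)

# StratifiedKFold preserves class proportions in each of the 5 folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# cross_val_score returns a NumPy array with one score per fold.
scores = cross_val_score(pipeline, X_train, y_train, cv=cv, scoring="accuracy")
print("per-fold accuracy:", scores)
print("mean:", scores.mean(), "std:", scores.std())
```

Comparing the mean and standard deviation of the per-fold scores, rather than a single number, is what makes a comparison between two models under cross-validation meaningful.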