Modeling Questions

Practice questions for the Modeling topic of AWS Certified Machine Learning - Specialty. 69 questions cover this domain.

69 questions: 19 easy, 35 medium, 15 hard
Q1
medium

A linear regression model has many features, and the data scientist suspects most of them are irrelevant. They want a regularization technique that au...

Q2
easy

A machine learning engineer wants to automatically find the optimal combination of hyperparameters such as learning rate and max depth for a SageMaker...

Q3
easy

A classification model achieves 98% accuracy on the training set but only 62% accuracy on the validation set. What problem does this indicate and what...

Q4
easy

A retail company wants to segment its customers into distinct groups based on purchasing behavior without any predefined labels. Which type of ML algo...

Q5
hard

A financial company must audit a loan approval model for bias before deployment. They specifically need to measure whether a demographic group receive...

Q6
medium

A manufacturing plant collects sensor data in a continuous stream. The operations team needs to detect anomalous sensor readings in near-real-time wit...

Q7
hard

An e-commerce team needs to train a recommendation model on a dataset of (user, item) pairs with sparse one-hot encoded features for user demographics...

Q8
hard

A data scientist applies SMOTE to the entire dataset and then runs 5-fold cross-validation. After training, cross-validation performance is significan...

Q9
medium

A data scientist wants to build an ensemble model that trains multiple decision trees on different bootstrap samples of the training data and averages...

Q10
medium

A retail company needs to forecast the sales of 10,000 individual products simultaneously, leveraging historical patterns across all products. Which S...

Q11
medium

A binary classifier for disease diagnosis achieves an AUC of 0.95 on the ROC curve. A naive majority-class classifier achieves an AUC of 0.5. What doe...

Q12
medium

A team is training a large ML model on SageMaker with a 500 GB dataset stored in S3. Training currently takes a long time because SageMaker downloads ...

Q13
medium

During a deep learning training job on SageMaker, a data scientist suspects the model is not converging due to vanishing gradients. Which SageMaker fe...

Q14
medium

A data scientist needs to train a binary classification model on a dataset with tens of millions of rows. The model must be interpretable, and trainin...

Q15
medium

A data scientist needs to train a word2vec model on a large text corpus to generate word embeddings for downstream NLP tasks. Which SageMaker built-in...

Q16
easy

A binary classifier for spam detection produces 90 true positives, 10 false negatives, 5 false positives, and 895 true negatives. Which metric directl...

Q17
hard

A data scientist is training a very large deep learning model where the model itself is too large to fit in the memory of a single GPU. Which distribu...

Q18
easy

A data scientist needs a supervised ML algorithm for a tabular regression problem. The algorithm should handle missing values natively and is known fo...

Q19
easy

A data scientist needs an unsupervised algorithm to reduce a high-dimensional dataset with 200 features to the top 2 dimensions for visualization purp...

Q20
medium

A company trains a SageMaker XGBoost model but training is taking too long because the 100 GB training dataset must be fully downloaded to each instan...
