Modeling Questions
Practice questions for the Modeling topic of the AWS Certified Machine Learning - Specialty exam. 69 questions cover this domain.
A linear regression model has many features, and the data scientist suspects most of them are irrelevant. They want a regularization technique that au...
A machine learning engineer wants to automatically find the optimal combination of hyperparameters such as learning rate and max depth for a SageMaker...
A classification model achieves 98% accuracy on the training set but only 62% accuracy on the validation set. What problem does this indicate and what...
A retail company wants to segment its customers into distinct groups based on purchasing behavior without any predefined labels. Which type of ML algo...
A financial company must audit a loan approval model for bias before deployment. They specifically need to measure whether a demographic group receive...
A manufacturing plant collects sensor data in a continuous stream. The operations team needs to detect anomalous sensor readings in near-real-time wit...
An e-commerce team needs to train a recommendation model on a dataset of (user, item) pairs with sparse one-hot encoded features for user demographics...
A data scientist applies SMOTE to the entire dataset and then runs 5-fold cross-validation. After training, cross-validation performance is significan...
A data scientist wants to build an ensemble model that trains multiple decision trees on different bootstrap samples of the training data and averages...
A retail company needs to forecast the sales of 10,000 individual products simultaneously, leveraging historical patterns across all products. Which S...
A binary classifier for disease diagnosis achieves an AUC of 0.95 on the ROC curve. A naive majority-class classifier achieves an AUC of 0.5. What doe...
A team is training a large ML model on SageMaker with a 500 GB dataset stored in S3. Training currently takes a long time because SageMaker downloads ...
During a deep learning training job on SageMaker, a data scientist suspects the model is not converging due to vanishing gradients. Which SageMaker fe...
A data scientist needs to train a binary classification model on a dataset with tens of millions of rows. The model must be interpretable, and trainin...
A data scientist needs to train a word2vec model on a large text corpus to generate word embeddings for downstream NLP tasks. Which SageMaker built-in...
A binary classifier for spam detection produces 90 true positives, 10 false negatives, 5 false positives, and 895 true negatives. Which metric directl...
A data scientist is training a very large deep learning model where the model itself is too large to fit in the memory of a single GPU. Which distribu...
A data scientist needs a supervised ML algorithm for a tabular regression problem. The algorithm should handle missing values natively and is known fo...
A data scientist needs an unsupervised algorithm to reduce a high-dimensional dataset with 200 features to the top 2 dimensions for visualization purp...
A company trains a SageMaker XGBoost model but training is taking too long because the 100 GB training dataset must be fully downloaded to each instan...
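The spam-detection question above gives concrete confusion-matrix counts (90 true positives, 10 false negatives, 5 false positives, 895 true negatives). Its wording is cut off in this listing, so the sketch below does not pick an answer; it only works through the standard metrics those counts produce. The function name `confusion_metrics` and the dictionary keys are illustrative choices, not part of the original question.

```python
# Counts taken from the spam-detection question in the list above.
TP, FN, FP, TN = 90, 10, 5, 895


def confusion_metrics(tp, fn, fp, tn):
    """Return common binary-classification metrics from raw counts.

    Hypothetical helper for illustration; names are not from the source.
    """
    total = tp + fn + fp + tn
    precision = tp / (tp + fp)   # of messages flagged as spam, share that are spam
    recall = tp / (tp + fn)      # of actual spam, share that was caught
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),          # of legitimate mail, share left alone
        "false_positive_rate": fp / (fp + tn),  # of legitimate mail, share flagged
        "f1": 2 * precision * recall / (precision + recall),
    }


metrics = confusion_metrics(TP, FN, FP, TN)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts, precision is 90/95, recall is 90/100, and the false positive rate is 5/900, so each metric answers a different question about the classifier's mistakes.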