A team needs to run distributed Apache Spark jobs to preprocess petabyte-scale datasets stored in Amazon S3 before ML training. They need to choose specific instance types and use Spot Instances. Which AWS service best fits?
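One service that matches these requirements is Amazon EMR, which runs managed Spark clusters on EC2 instance types you select and lets each instance group use either On-Demand or Spot capacity. The sketch below is illustrative only: it builds the keyword arguments for the EMR RunJobFlow API as a plain dictionary, without calling AWS. The function name, cluster name, bucket path, release label, instance types, and counts are all assumptions chosen for the example, not values from the question.

```python
from typing import Any, Dict


def build_emr_spark_request(
    script_uri: str,
    core_instance_type: str = "r5.4xlarge",   # assumed instance type
    task_instance_type: str = "r5.4xlarge",   # assumed instance type
    task_instance_count: int = 10,            # assumed fleet size
) -> Dict[str, Any]:
    """Build RunJobFlow keyword arguments for a transient Spark cluster.

    All concrete values here are hypothetical placeholders.
    """
    return {
        "Name": "spark-preprocessing",
        "ReleaseLabel": "emr-6.15.0",  # placeholder; pick a current release
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": [
                {   # primary node stays On-Demand so the cluster survives
                    "Name": "primary",
                    "InstanceRole": "MASTER",
                    "Market": "ON_DEMAND",
                    "InstanceType": "m5.xlarge",
                    "InstanceCount": 1,
                },
                {   # core nodes hold HDFS data, so keep them On-Demand too
                    "Name": "core",
                    "InstanceRole": "CORE",
                    "Market": "ON_DEMAND",
                    "InstanceType": core_instance_type,
                    "InstanceCount": 2,
                },
                {   # task nodes only add compute, so Spot is a natural fit
                    "Name": "task-spot",
                    "InstanceRole": "TASK",
                    "Market": "SPOT",
                    "InstanceType": task_instance_type,
                    "InstanceCount": task_instance_count,
                },
            ],
            # Terminate the cluster once the steps finish.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [
            {
                "Name": "preprocess",
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri],
                },
            }
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


request = build_emr_spark_request("s3://example-bucket/jobs/preprocess.py")
# With boto3 installed and credentials configured, this could be submitted as:
#   boto3.client("emr").run_job_flow(**request)
```

Placing only the task group on Spot is a common design choice: an interrupted task node costs some recomputation, while losing a primary or core node can fail the cluster or lose HDFS blocks.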
Educational Content: CertQnA practice questions are written against official exam objectives, covering the same domains tested on the real exam. All content is original and independent; these are not actual exam questions, and CertQnA is not affiliated with any certification vendor.