Optimize generative AI systems and model performance Questions

Practice questions for the "Optimize generative AI systems and model performance" topic in Microsoft Certified: Machine Learning Operations (MLOps) Engineer Associate. 24 questions cover this domain.

24 questions: 5 easy, 12 medium, 7 hard

hard

A team has released a fine-tuned model and now needs disciplined operational control as it moves from experimentation into steady production use. Whic...

medium

A RAG system returns too much irrelevant context. Which tuning area should the engineer review first?

easy

A team lacks enough domain examples for a fine-tuning pilot and needs more training material. Which technique is explicitly in scope for AI-300?

medium

A domain-specific application needs better semantic retrieval quality than the general embedding model is delivering. Which action is most aligned wit...

easy

A search architect wants one retrieval strategy that blends semantic understanding with keyword matching to improve result quality. Which approach sho...

medium

Two retrieval designs seem promising, and the team wants evidence about which one actually improves grounded answers in production-like testing. What ...

medium

A team prepares supervised fine-tuning data for an Azure OpenAI model. Which file format is required for the training and validation files?

hard

Two embedding models are candidates for a domain-specific RAG system. Which evaluation approach is most aligned with the AI-300 study guide for select...

easy

To improve final ranking quality of retrieved chunks before they are passed to the LLM, which Azure AI Search feature reorders results using a deeper ...

Q10

medium

A RAG pipeline returns chunks that often span across topic boundaries, causing irrelevant context. Which optimization should be evaluated first?

Q11

medium

To reduce latency and cost for repeated identical or near-identical prompts, which optimization technique is in scope for AI-300?

Q12

hard

A team will fine-tune a smaller foundation model so it can serve a high-volume internal task at lower latency and cost. Which sequence is best aligned...

Q13

medium

A team fine-tunes an Azure OpenAI model but the fine-tuned model's responses are often too short and lack detail. Which fine-tuning data characteristi...

Q14

easy

Which Azure AI Search index feature stores a dense vector representation of a document chunk to enable similarity search against a query embedding?

Q15

medium

A RAG system chunks documents into 1000-token segments with no overlap. Users report that answers often miss context that spans the boundary between t...
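To make the scenario concrete, here is a minimal sketch of fixed-size chunking with a configurable overlap. It illustrates the setup described in the question and is not tied to any Azure service. The function name `chunk_tokens`, the parameter names, and the tiny sizes are assumptions chosen for illustration, and a plain list of integers stands in for real tokenizer output.

```python
def chunk_tokens(tokens, size, overlap=0):
    """Split a token list into windows of `size` tokens.

    Consecutive windows share `overlap` tokens; overlap=0 reproduces
    the no-overlap setup described in the question.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


# Stand-in "tokens"; a real pipeline would use the model's tokenizer.
tokens = list(range(10))

no_overlap = chunk_tokens(tokens, size=4, overlap=0)
with_overlap = chunk_tokens(tokens, size=4, overlap=1)

print(no_overlap)    # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
print(with_overlap)  # [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

With no overlap, tokens 3 and 4 never appear in the same chunk, so content that straddles that boundary is split. With an overlap of 1, token 3 appears in both neighbouring chunks.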

Q16

hard

A team runs A/B evaluation between two chunking strategies: 500-token fixed-size chunks and 500-token semantic chunks (using sentence boundaries). The...

Q17

hard

A production generative AI application faces high token costs during peak hours. The team observes that roughly 40% of incoming requests are near-iden...

Q18

medium

An engineer wants to reduce the number of tokens sent to the LLM on each request by filtering irrelevant chunks before the final prompt is assembled. ...

Q19

medium

A team is deciding between fine-tuning an Azure OpenAI model and using RAG to improve responses for a customer support use case. Which scenario most s...

Q20

medium

A RAG system has high recall but low precision—it retrieves many documents but most are not relevant. Which query optimization technique can improve p...
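For reference, the two metrics named in this question can be computed as in the sketch below. The document ids and relevance judgments are made up for illustration, and `precision_recall` is a hypothetical helper, not part of any Azure SDK.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: ids returned by the retriever
    relevant:  ids judged relevant for the query
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Made-up example: 10 documents retrieved; 3 relevant documents exist
# and all 3 are among the retrieved ones.
retrieved = [f"doc{i}" for i in range(1, 11)]
relevant = {"doc2", "doc5", "doc9"}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
# precision=0.30 recall=1.00
```

This is the "high recall, low precision" pattern: every relevant document is found, but most of what is retrieved is not relevant.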
