Optimize generative AI systems and model performance Questions
Practice questions for the "Optimize generative AI systems and model performance" topic in the Microsoft Certified: Machine Learning Operations (MLOps) Engineer Associate certification. 24 questions cover this domain.
A team has released a fine-tuned model and now needs disciplined operational control as it moves from experimentation into steady production use. Whic...
A RAG system returns too much irrelevant context. Which tuning area should the engineer review first?
A team lacks enough domain examples for a fine-tuning pilot and needs more training material. Which technique is explicitly in scope for AI-300?
A domain-specific application needs better semantic retrieval quality than the general embedding model is delivering. Which action is most aligned wit...
A search architect wants one retrieval strategy that blends semantic understanding with keyword matching to improve result quality. Which approach sho...
Two retrieval designs seem promising, and the team wants evidence about which one actually improves grounded answers in production-like testing. What ...
A team prepares supervised fine-tuning data for an Azure OpenAI model. Which file format is required for the training and validation files?
Two embedding models are candidates for a domain-specific RAG system. Which evaluation approach is most aligned with the AI-300 study guide for select...
To improve final ranking quality of retrieved chunks before they are passed to the LLM, which Azure AI Search feature reorders results using a deeper ...
A RAG pipeline returns chunks that often span across topic boundaries, causing irrelevant context. Which optimization should be evaluated first?
To reduce latency and cost for repeated identical or near-identical prompts, which optimization technique is in scope for AI-300?
A team will fine-tune a smaller foundation model so it can serve a high-volume internal task at lower latency and cost. Which sequence is best aligned...
A team fine-tunes an Azure OpenAI model but the fine-tuned model's responses are often too short and lack detail. Which fine-tuning data characteristi...
Which Azure AI Search index feature stores a dense vector representation of a document chunk to enable similarity search against a query embedding?
A RAG system chunks documents into 1000-token segments with no overlap. Users report that answers often miss context that spans the boundary between t...
A team runs A/B evaluation between two chunking strategies: 500-token fixed-size chunks and 500-token semantic chunks (using sentence boundaries). The...
A production generative AI application faces high token costs during peak hours. The team observes that roughly 40% of incoming requests are near-iden...
An engineer wants to reduce the number of tokens sent to the LLM on each request by filtering irrelevant chunks before the final prompt is assembled. ...
A team is deciding between fine-tuning an Azure OpenAI model and using RAG to improve responses for a customer support use case. Which scenario most s...
A RAG system has high recall but low precision—it retrieves many documents but most are not relevant. Which query optimization technique can improve p...
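Several of the questions above concern chunking, including the one about 1000-token segments with no overlap missing context that spans a chunk boundary. As an illustration only, and not as the answer key for any question, here is a minimal sketch of fixed-size chunking with overlapping windows. The function name `chunk_with_overlap`, the toy token list, and the small sizes are all hypothetical. A real pipeline would use the model's tokenizer and much larger values.

```python
def chunk_with_overlap(tokens, chunk_size, overlap):
    """Split a token list into fixed-size windows that share `overlap` tokens.

    Hypothetical helper for illustration; tokens are assumed to be
    pre-tokenized (a real system would use the embedding model's tokenizer).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Toy input: ten placeholder tokens standing in for a document.
tokens = [f"t{i}" for i in range(10)]

no_overlap = chunk_with_overlap(tokens, chunk_size=4, overlap=0)
with_overlap = chunk_with_overlap(tokens, chunk_size=4, overlap=2)

print(no_overlap)
print(with_overlap)
```

Without overlap, the neighbouring tokens `t3` and `t4` land in different chunks. With an overlap of 2, the second window contains both, so content that straddles a boundary appears intact in at least one chunk. The trade-off is more chunks to embed and store, so overlap size is usually evaluated empirically rather than fixed up front.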