Testing, Validation, and Troubleshooting Questions

Practice questions for the Testing, Validation, and Troubleshooting topic of the AWS Certified Generative AI Developer - Professional exam. 22 questions covering this domain.

22 questions: 6 easy, 9 medium, 7 hard

Q1

easy

A brand team wants reviewers to judge whether model outputs match brand voice and friendliness. Which evaluation approach in Amazon Bedrock is most ap...

Q2

easy

Which set of metrics is explicitly associated with Amazon Bedrock automatic model evaluation?

Q3

hard

A platform team wants to automatically test guardrail effectiveness and send detailed policy violation information into production dashboards for ongo...

Q4

medium

An operations team needs step-by-step visibility into live agent workflows and wants telemetry compatible with OpenTelemetry for troubleshooting. Whic...

Q5

medium

A team wants to submit a prompt dataset to Bedrock, run model inference, score the responses automatically, and view aggregated results in reports. Wh...

Q6

medium

An evaluator wants automated metrics for summarization quality during a Bedrock automatic model evaluation job. Which metric is commonly available for...

Q7

hard

An AgentCore-based agent occasionally fails mid-workflow with cryptic errors. The operations team needs step-level traces compatible with OpenTelemetry to inves...

Q8

easy

Which Amazon Bedrock capability lets you compare model outputs across multiple foundation models on the same prompt dataset to choose a model?

Q9

hard

A RAG application returns confident but wrong answers when the retriever returns no relevant chunks. Which combination of changes BEST mitigates the i...

Q10

medium

A team observes a Bedrock invocation returning a ThrottlingException intermittently. Which mitigations align with AWS guidance? (Choose the BEST singl...

Q11

hard

A RAG-based assistant returns answers that are highly relevant when the knowledge base contains the answer but confidently hallucinates when it does n...

Q12

easy

Which Amazon Bedrock model evaluation job type uses an LLM judge rather than human reviewers to score model responses automatically at scale?

Q13

easy

Which Amazon Bedrock evaluation metric assesses whether a model's response is factually supported by the retrieved context rather than invented?

Q14

medium

An operations team sees a high rate of ValidationException errors in Bedrock invocation logs. Which diagnostic step is most appropriate?

Q15

hard

An agentic Bedrock workflow sporadically produces incorrect tool arguments. Inspection of orchestration traces shows the agent reasoning is correct bu...

Q16

medium

A developer's Bedrock Agent returns a response that does not call the expected action group tool and instead responds with irrelevant text. Which debu...

Q17

medium

A developer submits prompts to a Bedrock model and consistently receives responses that are truncated mid-sentence. Which inference parameter is most ...

Q18

medium

A team runs a Bedrock automatic model evaluation job on a 500-prompt dataset and wants to compare results between two foundation models. Which Bedrock...

Q19

medium

An operations team sees frequent ServiceQuotaExceededException errors in Bedrock CloudWatch logs. Which combination of actions directly addresses this...

Q20

hard

A Bedrock Knowledge Base RAG application shows low groundedness scores (model produces answers not traceable to retrieved context) on automatic evalua...
