Applying site reliability engineering practices Questions
Practice questions for the "Applying site reliability engineering practices" topic of the Google Professional Cloud DevOps Engineer exam. 37 questions cover this domain.
Why is a 100% SLO generally considered a poor target in SRE guidance?
A service is close to exhausting its error budget. According to SRE guidance, which action is most appropriate?
A service has a 99.9% SLO over its compliance period. What error budget does that imply?
A team argues that deploying a service in two zones guarantees near-perfect availability because the two instances are independent. What is the main f...
A dependency owned by another team causes your service to miss its SLO. Which response is presented as the more user-centered approach in the SRE Work...
You only have a latency percentile metric per 10-minute interval. Which SLO type is the appropriate fit?
Which activity is the clearest example of toil?
Which compliance period is more closely aligned with recent user experience because it continuously evaluates the latest interval, such as the last 30...
Which definition best describes a service-level indicator (SLI)?
Which SRE concept describes the explicit maximum proportion of bad events a service is allowed to experience within the SLO compliance period?
A service's error budget is being consumed faster than expected due to dependency failures outside the team's control. The SRE Workbook recommends a p...
According to SRE guidance, what type of work is toil most likely to crowd out if left uncontrolled?
What is the key difference between a calendar-window SLO and a rolling-window SLO?
A services team reviews their SLO monthly and consistently achieves far above the target, maintaining a large unused error budget. According to SRE gu...
A team is designing SLIs for a batch data-processing pipeline. Requests are not user-facing; instead, correctness of processed records matters. Which ...
A team's SLO is based on request availability. Over the past compliance period, they experienced a 2-hour full outage and several shorter degraded-ava...
A new microservice is being instrumented for its first SLI. The team debates using availability vs. latency as the primary SLI. According to SRE guida...
A team experiences alert fatigue because their CPU-based alert fires every night due to a scheduled batch job, but the batch job never causes a user-f...
A team wants to reduce on-call burden. Their first step is to identify which alerts are consuming the most engineer time. Which type of analysis direc...
A team designs an SLO for a data pipeline. The pipeline must process all records without dropping any. Which SLI type from the SRE Workbook is most ap...
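Several of the questions above turn on error-budget arithmetic, for example the one about a 99.9% SLO. The following is a minimal sketch of that calculation. The function names, the 30-day window, and the event counts are assumptions chosen for illustration; they do not come from the exam or from the SRE books.

```python
def error_budget(slo_target: float) -> float:
    """Fraction of events (or time) allowed to be bad under an SLO target.

    slo_target is a fraction, e.g. 0.999 for a 99.9% SLO.
    """
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in (0, 1]")
    return 1.0 - slo_target


def allowed_downtime_minutes(slo_target: float, window_days: int) -> float:
    """Time-based view: minutes of full outage the budget covers in the window."""
    return error_budget(slo_target) * window_days * 24 * 60


def budget_remaining(slo_target: float, total_events: int, bad_events: int) -> float:
    """Request-based view: fraction of the error budget still unspent.

    A negative result means the budget is exhausted and the SLO is missed.
    """
    allowed_bad = error_budget(slo_target) * total_events
    if allowed_bad == 0:
        # A 100% target leaves no budget at all: any bad event is a miss.
        return 0.0 if bad_events == 0 else float("-inf")
    return 1.0 - bad_events / allowed_bad


if __name__ == "__main__":
    slo = 0.999  # 99.9% SLO, as in the question above
    print(f"error budget: {error_budget(slo):.3%}")
    # Hypothetical 30-day compliance period
    print(f"downtime allowed in 30 days: {allowed_downtime_minutes(slo, 30):.1f} min")
    # Hypothetical traffic: 1,000,000 requests, 250 of them bad
    print(f"budget remaining: {budget_remaining(slo, 1_000_000, 250):.0%}")
```

Under these assumptions a 99.9% SLO leaves a 0.1% error budget, which over a 30-day period is roughly 43 minutes of full outage. The 100% branch shows why such a target leaves no room for any failure or change.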