Metrics and Continuous Improvement

Agile teams talk a lot about continuous improvement, but few measure whether they are actually improving. The right metrics, used the right way, turn the retrospective from a feelings session into a learning loop.

Goodhart's Law

"When a measure becomes a target, it ceases to be a good measure." Every metric in this lesson is for the team to use — not for management to use against the team. The moment velocity becomes a quota, story points inflate. The moment unit-test coverage becomes a target, you get tests of getters and setters.

DORA Metrics

These four metrics come from the DevOps Research and Assessment (DORA) group and were popularised in the book Accelerate. The research found that they empirically distinguish high-performing software organisations from low performers.

Metric	Question it answers	Elite	Low
Deployment Frequency	How often do we release?	Multiple per day	Less than monthly
Lead Time for Changes	How long from commit to production?	Under 1 hour	More than 6 months
Change Failure Rate	What percentage of deploys cause an incident?	0–15%	40–60%
Mean Time to Recovery	How fast do we recover?	Under 1 hour	More than 1 week

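The first two measure speed; the last two measure stability. High performers do not trade one for the other; they get both. The benchmark thresholds shift between annual DORA reports, so treat the Elite and Low columns as indicative rather than fixed.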
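As a rough illustration, the four metrics in the table can be derived from a log of deployments. This is a minimal sketch, not a standard implementation: the `Deploy` record, its field names, and the choice of median for lead time are assumptions made for the example, and the sample data is invented.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deploy:
    """Hypothetical deployment record; field names are illustrative."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    caused_incident: bool = False
    restored_at: Optional[datetime] = None  # set only if an incident occurred


def dora_summary(deploys: list[Deploy], window_days: int) -> dict:
    """Summarise the four metrics over a window of `window_days` days."""
    if not deploys:
        raise ValueError("need at least one deploy")
    lead_times = [d.deployed_at - d.committed_at for d in deploys]
    failures = [d for d in deploys if d.caused_incident]
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deploys_per_day": len(deploys) / window_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deploys),
        "mean_time_to_recovery": (
            sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
        ),
    }


# Invented sample: four deploys over two days, one of which caused an incident.
t0 = datetime(2024, 1, 1, 9, 0)
deploys = [
    Deploy(t0, t0 + timedelta(hours=2)),
    Deploy(t0, t0 + timedelta(hours=4), caused_incident=True,
           restored_at=t0 + timedelta(hours=5)),
    Deploy(t0, t0 + timedelta(hours=6)),
    Deploy(t0, t0 + timedelta(hours=8)),
]
summary = dora_summary(deploys, window_days=2)
```

With this sample the summary shows 2 deploys per day, a median lead time of 5 hours, a 25% change failure rate, and a 1-hour recovery time. Real pipelines usually need a team decision on what counts as an "incident" before the failure rate means anything.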

Flow Metrics

From Kanban thinking; valuable in Scrum too:

Cycle time. Time from "started" to "done" per item. Shorter is better. Watch the distribution, not just the mean.
Throughput. Items completed per Sprint or week. Stable throughput makes forecasts reliable and usually indicates a healthy team.
WIP. Items in progress right now. Spikes signal trouble before cycle time worsens.
Aging WIP. How long each unfinished item has been in progress; flag anything still open after N days. The single most useful early-warning metric.
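The flow metrics above are simple arithmetic over start and finish dates. The sketch below assumes each work item is a `(started, done)` pair with `done` set to `None` while the item is open; the function names, the nearest-rank percentile, and the sample data are all illustrative choices.

```python
from datetime import date
from math import ceil


def cycle_times(items):
    """Sorted days from start to done, for finished items only."""
    return sorted((done - start).days for start, done in items if done is not None)


def percentile(sorted_values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    rank = max(1, ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def aging_wip(items, today, max_age_days):
    """Ages in days of open items that have been in progress longer than max_age_days."""
    return [(today - start).days for start, done in items
            if done is None and (today - start).days > max_age_days]


# Invented sample: five finished items and two still open.
items = [
    (date(2024, 3, 1), date(2024, 3, 3)),
    (date(2024, 3, 1), date(2024, 3, 4)),
    (date(2024, 3, 2), date(2024, 3, 5)),
    (date(2024, 3, 4), date(2024, 3, 8)),
    (date(2024, 3, 4), date(2024, 3, 18)),
    (date(2024, 3, 5), None),
    (date(2024, 3, 18), None),
]
today = date(2024, 3, 20)

times = cycle_times(items)                         # [2, 3, 3, 4, 14]
mean_days = sum(times) / len(times)                # 5.2
wip = sum(1 for _, done in items if done is None)  # 2
stale = aging_wip(items, today, max_age_days=7)    # [15]
```

Here the mean cycle time is 5.2 days, yet the 50th percentile is 3 days and the 85th is 14, which is why the distribution tells you more than the mean. The one stale item (15 days old) shows up in aging WIP before it ever affects cycle time.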

Quality Metrics

Escaped defects. Bugs found in production within 30 days of release. The DoD's report card.
Test coverage. Useful as a floor, not a target. As a rule of thumb, below 60% suggests untested layers, and pushing above 90% rarely justifies the cost.
Time to detect / time to mitigate. Operational health.
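Escaped defects can be counted directly from release and bug-report dates. A minimal sketch, assuming bug reports are plain dates and that "within 30 days" includes both the release day and day 30; the function name and data are hypothetical.

```python
from datetime import date, timedelta


def escaped_defects(release_day, bug_reports, window_days=30):
    """Count production bugs reported within window_days of the release (inclusive)."""
    cutoff = release_day + timedelta(days=window_days)
    return sum(1 for found in bug_reports if release_day <= found <= cutoff)


# Invented sample: one report predates the release, one falls outside the window.
release = date(2024, 5, 1)
reports = [
    date(2024, 4, 28),
    date(2024, 5, 3),
    date(2024, 5, 20),
    date(2024, 5, 31),
    date(2024, 6, 15),
]
count = escaped_defects(release, reports)  # 3
```

Tracking this count per release, rather than as a running total, makes it easier to see whether a change to the DoD had any effect.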

Team Health Metrics

Sustainable pace. Are we consistent over months, or sprinting then collapsing?
Happiness index. Anonymous 1–5 weekly. Trend matters more than absolute value.
On-call burden. Hours paged per week per person.
Retro action follow-through. Of the last 5 retro actions, how many actually happened?
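Two of these health metrics reduce to a ratio and a trend. The sketch below is one possible way to compute them: the four-week comparison window, the function names, and the sample scores are assumptions, not an established method.

```python
from statistics import mean


def follow_through(actions):
    """Share of retro actions marked done; `actions` maps action name to True/False."""
    return sum(actions.values()) / len(actions)


def trend(scores, window=4):
    """Mean of the latest `window` scores minus the mean of the `window` before them."""
    if len(scores) < 2 * window:
        raise ValueError("need at least two full windows of scores")
    recent = scores[-window:]
    previous = scores[-2 * window:-window]
    return mean(recent) - mean(previous)


# Invented sample: the last five retro actions and eight weeks of average happiness.
last_actions = {
    "pair on deploy script": True,
    "add lint to CI": True,
    "timebox stand-up": False,
    "rotate on-call earlier": True,
    "document release steps": False,
}
weekly_happiness = [4.0, 4.0, 3.5, 4.5, 3.5, 3.5, 3.0, 3.0]

ratio = follow_through(last_actions)  # 0.6
change = trend(weekly_happiness)      # -0.75
```

A negative trend like the one here is the signal worth raising at the retrospective; the absolute score of 3.0 on its own says much less.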

Outcome vs Output

Output: how much we shipped.
Outcome: what changed for the user or business.

Output metrics (velocity, story count, lines of code) are easy. Outcome metrics (activation rate, retention, conversion, NPS, time-to-value) are harder but more honest. A team that ships 50 features no one uses has high output and zero outcome.

Pair Agile delivery metrics with product analytics. The product manager's job is to keep outcome at the centre of the conversation.

The Improvement Loop

Observe. Pick a metric or qualitative pattern you want to improve.
Hypothesise. "We think doing X for one Sprint will improve Y."
Experiment. Try it for one Sprint, no more.
Measure. Did Y move? Did anything else move?
Decide. Adopt, drop, or run another iteration.

Limit experiments to one or two at a time. Otherwise you cannot tell what worked.
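The Measure step can be as simple as comparing the target metric and one guard metric before and after the experiment. This is a minimal sketch under stated assumptions: the metric names, the use of the median, and the numbers are invented, and one Sprint of data is usually too small for any statistical claim.

```python
from statistics import median


def relative_change(before, after):
    """Relative change in the median; -0.25 means a 25% drop."""
    baseline = median(before)
    return (median(after) - baseline) / baseline


# Hypothetical data: hours each PR waited for review, and escaped defects per week.
baseline_sprint = {
    "review_hours": [20, 30, 26, 40, 24],
    "escaped_defects": [2, 1, 3],
}
experiment_sprint = {
    "review_hours": [12, 18, 15, 30, 16],
    "escaped_defects": [2, 2, 3],
}

changes = {
    name: relative_change(baseline_sprint[name], experiment_sprint[name])
    for name in baseline_sprint
}
```

In this example the median review wait drops from 26 to 16 hours (about 38%) while the guard metric is unchanged, which answers both "Did Y move?" and "Did anything else move?".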

Common Improvement Targets

Symptom	Experiment
Long PR review times	WIP limit on "in review", review-first morning policy
Items rolling over Sprints	Smaller stories, lower Sprint commitment, harder slicing
Fragile DoD	Add specific quality gates, automate them
Stand-ups taking 30 minutes	Switch to walking the board, focus on Sprint Goal
Retros raising same issues	Action owner per item, follow-up at next retro
High change failure rate	Trunk-based development, feature flags, smaller deploys

What Not to Measure

Individual velocity. Team game; individual numbers create perverse incentives.
Hours worked. The principles behind the Agile Manifesto call for a sustainable pace that the team can maintain indefinitely.
Lines of code, commits per day, tickets closed per person. All gameable, none meaningful.
"Maturity scores." Theatre.

Reporting Up

Executives and stakeholders want to see progress. Report:

Outcome metrics. Are users adopting? Is revenue moving?
DORA metrics. Are we delivering reliably?
Forecast ranges. Delivery dates at 50%, 85%, and 95% confidence.
Risks and decisions needed. Where leadership help is required.

Avoid reporting velocity to executives. It is meaningless to them and corrupts the team's use of it.
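One common way to produce the forecast ranges listed above without reporting velocity is a Monte Carlo simulation over past throughput. The sketch below is a simplified illustration: it assumes future weeks resemble randomly sampled past weeks, and the function names, run count, and sample history are hypothetical.

```python
import random
from math import ceil


def weeks_to_finish(remaining, weekly_throughput, rng):
    """One simulated run: draw past weeks at random until the backlog is empty."""
    weeks = 0
    while remaining > 0:
        remaining -= rng.choice(weekly_throughput)
        weeks += 1
    return weeks


def forecast(remaining, weekly_throughput, runs=10_000, seed=0):
    """Weeks needed at 50/85/95% confidence, based on sampled history."""
    if not any(t > 0 for t in weekly_throughput):
        raise ValueError("history must contain at least one productive week")
    rng = random.Random(seed)  # fixed seed so the report is reproducible
    results = sorted(
        weeks_to_finish(remaining, weekly_throughput, rng) for _ in range(runs)
    )
    # Nearest-rank percentile over the simulated outcomes.
    return {pct: results[ceil(pct * runs / 100) - 1] for pct in (50, 85, 95)}


# Invented sample: items finished in each of the last six weeks, 40 items remaining.
history = [3, 5, 4, 6, 2, 5]
ranges = forecast(remaining=40, weekly_throughput=history)
```

Each value in `ranges` is a number of weeks; adding it to today's date gives the confidence date to report. The output is only as good as the assumption that the backlog size and the team's throughput pattern stay roughly stable.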

The Improving Team

You can recognise an improving team by these signs:

Retrospective actions have owners and get done.
Cycle time is falling or stable, not creeping up.
The same bugs do not recur; learnings are codified.
Bench depth grows — more people can do more parts of the work.
Members would join the team again given the choice.

Cert Mapping

Cert	Scope
PSM I	Empirical process control, Scrum metrics
PMI-ACP	Earned value, cycle time, quality, team metrics
SAFe Agilist	Plus PI predictability, business value, lean portfolio metrics
DevOps certs (DASA, AWS DevOps Pro)	DORA metrics directly

Closing

You now have the foundation: the Agile mindset, Scrum's roles, events, and artifacts, user stories, estimation, Kanban, scaling frameworks, and the metrics that tell you whether any of it is working. The frameworks are scaffolding; the values are the building. Apply the practices that fit your context, measure outcomes, and improve continuously. Pair this knowledge with hands-on practice on a real team, and you have what the certifications test — and more importantly, what teams need to deliver.