6 min read·Lesson 4 of 8

Rightsizing and Utilisation

Eliminate the gap between provisioned capacity and actual usage — usually 30-60% on day one.

The first rule of cloud optimisation: most things are too big. CPU utilisation across enterprise cloud fleets typically averages 10-20%. Memory averages 30-40%. Storage usage often falls below 50% of provisioned capacity. The gap between provisioned and used is the largest single category of cloud waste.

Rightsizing means closing that gap. It comes after killing the obviously idle and scheduling non-production environments.

Step 1: Kill the Idle

Before rightsizing, eliminate resources that are not used at all:

  • EC2 instances with <5% CPU and <5% network for 7+ days
  • Unattached EBS volumes — they bill regardless of attachment
  • Old EBS snapshots beyond retention policy
  • Idle load balancers (zero traffic)
  • Unused Elastic IPs (AWS bills for public IPv4 addresses whether or not they are associated, so an unattached one is pure waste)
  • RDS / databases with no connections
  • Empty S3 buckets in expensive classes
  • NAT Gateways in subnets with no outbound traffic
  • VM scale sets / Auto Scaling Groups at min=N when usage justifies min=0

Each provider has native tooling: AWS Trusted Advisor and AWS Cost Optimization Hub, Azure Advisor, GCP Recommender. They will surface idle resources in minutes. Third-party tools (Vantage, CloudHealth, ProsperOps, Kion, Apptio Cloudability) add automation.

Make this an automated weekly job, not a one-off campaign. Idle resources reappear constantly.
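The first criterion in the list above (under 5% CPU and under 5% network for 7 or more days) could be checked by a weekly job along these lines. This is a minimal sketch, not a provider API: the `DailyUsage` record, the `is_idle` helper, and the sample fleet are all hypothetical, and it assumes daily averages have already been exported from your monitoring tool.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DailyUsage:
    cpu_pct: float      # average CPU utilisation for the day, 0-100
    network_pct: float  # average network utilisation for the day, 0-100


def is_idle(history: List[DailyUsage],
            threshold_pct: float = 5.0,
            min_days: int = 7) -> bool:
    """True if each of the last `min_days` days is below the threshold on both metrics."""
    if len(history) < min_days:
        return False  # not enough data to judge; leave the resource alone
    recent = history[-min_days:]
    return all(d.cpu_pct < threshold_pct and d.network_pct < threshold_pct
               for d in recent)


# Hypothetical fleet: instance name -> daily usage, oldest first.
fleet: Dict[str, List[DailyUsage]] = {
    "build-agent-old": [DailyUsage(1.2, 0.4)] * 10,
    "api-prod": [DailyUsage(35.0, 22.0)] * 10,
    "new-box": [DailyUsage(0.5, 0.1)] * 3,  # too young to classify
}

idle = sorted(name for name, hist in fleet.items() if is_idle(hist))
print(idle)  # ['build-agent-old']
```

Treating resources with too little history as not idle is a deliberate choice here: a false "idle" flag on a freshly launched instance costs more trust than a week's delay costs money.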

Step 2: Schedule Non-Prod

Development, staging, QA, and demo environments are typically used 8 hours a day, 5 days a week — 40 hours out of 168, or 24% of the week. Running them 24/7 wastes 76%.

Solutions:

  • Auto-shutdown. AWS Instance Scheduler, Azure DevTest Labs auto-shutdown, GCP scheduled instance group resizing. Tag-driven: schedule=weekday-9-7 shuts down outside hours.
  • Spin-up on demand. Ephemeral preview environments per PR; tear down on merge. Qovery, Shipyard, GitHub Codespaces and similar make this practical.
  • Spot/preemptible for non-prod. Acceptable if interruption is recoverable.

Expected savings on non-prod: 60-70%. Often the largest single FinOps win.
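The arithmetic behind these figures is simple enough to check for any schedule. The sketch below assumes savings are proportional to instance-hours avoided; `schedule_savings` is an illustrative helper, and the 10-hour window is one reading of the `weekday-9-7` tag.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def schedule_savings(hours_per_day: float, days_per_week: int) -> float:
    """Fraction of always-on instance-hours avoided by running only on the schedule."""
    running_hours = hours_per_day * days_per_week
    return 1 - running_hours / HOURS_PER_WEEK


eight_by_five = schedule_savings(8, 5)   # the 8-hour, 5-day pattern
ten_by_five = schedule_savings(10, 5)    # a 09:00-19:00 weekday window

print(f"{eight_by_five:.1%}")  # 76.2%
print(f"{ten_by_five:.1%}")    # 70.2%
```

Realised savings tend to land below the hours avoided, plausibly because working windows run longer than eight hours and because some charges, such as storage attached to stopped instances, continue while compute is off.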

Step 3: Rightsize the Survivors

For workloads that genuinely run continuously, match instance size to observed usage.

The methodology

  1. Capture 2-4 weeks of CloudWatch / Azure Monitor / Cloud Monitoring metrics: CPU, memory, network, disk IO, IOPS.
  2. Compute P95 of each metric.
  3. Pick the smallest instance type whose limits exceed P95 by a target buffer (commonly 25-40%).
  4. Check instance family — newer generations are usually cheaper and faster.
  5. Consider ARM instances for 20-40% additional savings: Graviton (AWS), Ampere (Azure / Oracle), Tau T2A (GCP).
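Steps 2 and 3 can be sketched in a few lines. This is an illustration under stated assumptions, not a recommendation engine: the catalogue sizes and prices are made up, `p95` uses the nearest-rank definition (monitoring tools may interpolate differently), and usage is assumed to be expressed in vCPUs and GiB rather than percentages.

```python
from typing import Dict, List, Optional, Tuple


def p95(samples: List[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]


# Hypothetical catalogue: name -> (vCPUs, memory GiB, hourly price). Not real prices.
CATALOGUE: Dict[str, Tuple[int, int, float]] = {
    "small": (2, 8, 0.10),
    "medium": (4, 16, 0.20),
    "large": (8, 32, 0.40),
}


def pick_size(cpu_used: List[float], mem_used: List[float],
              buffer: float = 0.30) -> Optional[str]:
    """Cheapest catalogue entry whose limits cover P95 usage plus the buffer."""
    need_cpu = p95(cpu_used) * (1 + buffer)
    need_mem = p95(mem_used) * (1 + buffer)
    fits = [(price, name)
            for name, (vcpus, mem, price) in CATALOGUE.items()
            if vcpus >= need_cpu and mem >= need_mem]
    return min(fits)[1] if fits else None


# 100 samples: CPU mostly 1 vCPU with four short spikes; memory steady at 9 GiB.
cpu_samples = [1.0] * 96 + [6.0] * 4
mem_samples = [9.0] * 100

choice = pick_size(cpu_samples, mem_samples)
print(choice)  # medium
```

In this example CPU alone would fit the smallest size, but memory (9 GiB plus a 30% buffer) pushes the choice up one step. The four spikes sit above P95 and are ignored, which is exactly the kind of burst a product team should be asked about before the change goes ahead.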

Recommendation engines

Use them but verify:

  • AWS Compute Optimizer
  • Azure Advisor cost recommendations
  • GCP Recommender
  • Third-party: Vantage, ProsperOps, Cloudability, Kubecost (Kubernetes)

They are accurate for steady-state workloads. They miss burst patterns, batch jobs, periodic load tests, and warm pools for failover. Always overlay product-team knowledge.

Don't forget memory and IO

Many recommendations focus on CPU. A right-CPU but wrong-memory choice still wastes money. R instance families exist because some workloads are memory-bound; M-family is general-purpose; C-family is CPU-bound. Pick the family that matches the dominant resource.
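One rough way to choose a family is to compare the workload's memory-per-vCPU need with each family's ratio. The sketch below assumes ratios of about 2, 4, and 8 GiB per vCPU for C, M, and R, which is typical of recent AWS generations but should be checked against current instance specifications; `pick_family` is a hypothetical helper.

```python
# Assumed approximate memory-per-vCPU ratios (GiB); verify for the generation you use.
FAMILY_GIB_PER_VCPU = {"C": 2, "M": 4, "R": 8}


def pick_family(need_vcpus: float, need_mem_gib: float) -> str:
    """Leanest family whose memory-per-vCPU ratio covers the workload's ratio."""
    ratio = need_mem_gib / need_vcpus
    for family, gib_per_vcpu in sorted(FAMILY_GIB_PER_VCPU.items(),
                                       key=lambda kv: kv[1]):
        if gib_per_vcpu >= ratio:
            return family
    return "R"  # hungrier than any listed ratio; R is the closest fit


print(pick_family(8, 12))  # ratio 1.5 -> C
print(pick_family(4, 14))  # ratio 3.5 -> M
print(pick_family(4, 28))  # ratio 7.0 -> R
```

Feed it the buffered P95 requirements rather than averages, so the family decision and the size decision rest on the same numbers.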

RDS, ElastiCache, Redshift

Same approach. Many database instances are sized for peak. Consider moving from Provisioned IOPS to General Purpose storage, Aurora Serverless v2 for variable workloads, and a reader/writer split rather than upsizing the primary.

Step 4: Storage Class and Lifecycle

Storage tiers offer dramatic cost differences:

S3 class                        Approx $/GB-month   Access pattern
S3 Standard                     $0.023              Active
S3 Standard-IA                  $0.0125             Infrequent
S3 One Zone-IA                  $0.01               Infrequent, lower availability OK
S3 Glacier Instant Retrieval    $0.004              Archived, occasional access
S3 Glacier Flexible Retrieval   $0.0036             Archived, hours to retrieve
S3 Glacier Deep Archive         $0.00099            Cold, 12-hour retrieve
S3 Intelligent-Tiering          tiered              Auto-moves; small monitoring fee

Equivalent tiers exist on Azure (Hot/Cool/Cold/Archive) and GCP (Standard/Nearline/Coldline/Archive).
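To make the differences concrete, here is what 10 TB (taken as 10,000 GB) would cost per month in each fixed-price S3 class, using the approximate prices from the table. This is storage cost only; it deliberately ignores retrieval fees, request charges, and minimum storage durations, all of which matter for the colder classes.

```python
# Approximate $/GB-month from the table above; check current regional pricing.
PRICE_PER_GB_MONTH = {
    "S3 Standard": 0.023,
    "S3 Standard-IA": 0.0125,
    "S3 One Zone-IA": 0.01,
    "S3 Glacier Instant Retrieval": 0.004,
    "S3 Glacier Flexible Retrieval": 0.0036,
    "S3 Glacier Deep Archive": 0.00099,
}


def monthly_storage_cost(gb: float, storage_class: str) -> float:
    """Storage-only monthly cost in dollars, rounded to cents."""
    return round(gb * PRICE_PER_GB_MONTH[storage_class], 2)


costs = {name: monthly_storage_cost(10_000, name) for name in PRICE_PER_GB_MONTH}
for name, cost in costs.items():
    print(f"{name:<32} ${cost:>8.2f}")
```

At these prices the same data costs $230.00 a month in Standard and $9.90 in Deep Archive, roughly a 23x spread.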

  • Enable lifecycle rules on every bucket: transition to IA after 30 days, Glacier after 90, Deep Archive after 365.
  • Use S3 Intelligent-Tiering for unknown access patterns. It is almost always a net saving; the monitoring fee is only $0.0025 per 1,000 objects per month.
  • Set incomplete-multipart-upload abort rules — orphaned multi-part uploads accumulate invisibly.
  • Delete old object versions if versioning is enabled.
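The rules above map onto a single S3 lifecycle configuration. The sketch below builds it as a plain dictionary in the shape accepted by boto3's `put_bucket_lifecycle_configuration`; the rule ID, the 7-day multipart abort window, and the 90-day noncurrent-version expiry are assumptions chosen for illustration, not values from this lesson.

```python
# Lifecycle rule covering every object in the bucket (empty prefix).
lifecycle = {
    "Rules": [
        {
            "ID": "default-tiering",          # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
                {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
            ],
            # Assumed window: clean up multipart uploads abandoned for a week.
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            # Assumed retention: drop superseded versions after 90 days.
            "NoncurrentVersionExpiration": {"NoncurrentDays": 90},
        }
    ]
}

# Applying it would look roughly like this (requires boto3 and credentials):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=lifecycle)

transition_days = [t["Days"] for t in lifecycle["Rules"][0]["Transitions"]]
print(transition_days)  # [30, 90, 365]
```

Keeping the configuration as data makes it easy to embed in an IaC module so that new buckets get it by default.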

Step 5: Modernise the Architecture

Bigger gains come from changing what you run, not how big it is.

  • Spot / preemptible instances — up to 90% off for interruptible workloads (batch, CI, stateless workers, fault-tolerant training). Karpenter and Cluster Autoscaler make this routine on Kubernetes.
  • Serverless — pay per request. For low-volume or bursty endpoints often cheaper than always-on containers. Lambda, Cloud Run, Container Apps.
  • Managed services over self-hosted — DynamoDB vs Cassandra on EC2; managed Redis vs self-managed. Usually higher unit cost but lower total cost when ops, patching, and on-call are included.
  • Consolidation — multi-tenant where it fits; one well-utilised cluster beats five half-empty ones.
  • Region choice — same workload can be 20% cheaper in one region than another; especially relevant for batch / training.

Step 6: Continuous Rightsizing

Rightsizing once is a campaign; rightsizing continuously is operating discipline:

  • Monthly review of top 50 spenders.
  • Automated reports of P95 utilisation per resource, flagging anything below 30%.
  • Kubernetes Vertical Pod Autoscaler in recommendation mode; HPA on the right metric.
  • Storage lifecycle rules in place by default in your IaC modules.
  • Spot adoption tracked as a percentage of compute hours.
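Two of these checks, the sub-30% P95 flag and the Spot adoption percentage, reduce to a few lines once utilisation and billing data are joined. The sketch below assumes that join has already happened; the `Resource` record and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Resource:
    name: str
    p95_utilisation_pct: float  # P95 of the dominant metric over the review window
    compute_hours: float        # billed hours in the window
    is_spot: bool


def underutilised(resources: List[Resource], threshold_pct: float = 30.0) -> List[str]:
    """Names of resources whose P95 utilisation is below the threshold."""
    return [r.name for r in resources if r.p95_utilisation_pct < threshold_pct]


def spot_adoption_pct(resources: List[Resource]) -> float:
    """Spot compute hours as a percentage of all compute hours."""
    total = sum(r.compute_hours for r in resources)
    spot = sum(r.compute_hours for r in resources if r.is_spot)
    return 100 * spot / total if total else 0.0


month = [
    Resource("reporting-db", 12.0, 720, False),
    Resource("api-prod", 64.0, 720, False),
    Resource("ci-runners", 55.0, 360, True),
]

flagged = underutilised(month)
adoption = spot_adoption_pct(month)
print(flagged)            # ['reporting-db']
print(f"{adoption:.1f}%")  # 20.0%
```

Tracking both numbers month over month shows whether rightsizing is holding or drifting as new workloads arrive.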

What to Expect

Optimisation                 Typical savings         Effort
Idle resource cleanup        5-15%                   Low
Non-prod scheduling          10-20%                  Medium
Storage tiering              5-10%                   Low
Rightsizing                  10-20%                  Medium
Spot adoption                10-25%                  Medium-High
ARM / Graviton               10-20% of compute       Medium (compat testing)
Architecture modernisation   varies, often largest   High

These compound. Combined, a first-year FinOps programme commonly delivers a 25-40% reduction on like-for-like workloads, and the gains grow as new workloads land in the optimised pattern by default.
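One way to read "compound" is that each lever applies to the bill left over after the previous ones, so the percentages multiply rather than add. The sketch below works that through using the low end of the first five ranges in the table; it assumes every lever applies to the whole bill, which is a simplification (storage tiering, for instance, only touches storage spend).

```python
from typing import Iterable


def combined_saving(savings: Iterable[float]) -> float:
    """Overall saving when each fractional saving applies to the remaining bill."""
    remaining = 1.0
    for s in savings:
        remaining *= 1 - s
    return 1 - remaining


# Low end of each range: idle, non-prod, storage, rightsizing, Spot.
low_end = [0.05, 0.10, 0.05, 0.10, 0.10]

additive = sum(low_end)
compounded = combined_saving(low_end)
print(f"{additive:.1%}")    # 40.0%
print(f"{compounded:.1%}")  # 34.2%
```

Under these assumptions the low-end figures combine to about 34%, inside the 25-40% range quoted for a first-year programme, whereas simply adding them would overstate the result.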

One major rate-based lever remains: commitments. That is the next lesson.

Key Takeaways

  • Typical idle and over-provisioning waste is 30-60% of cloud spend before any optimisation.
  • Rightsizing means matching instance size, count, and family to observed demand.
  • Schedule non-prod workloads off outside business hours for ~65% savings on those resources.
  • Use native and third-party rightsizing recommendations as starting points, not gospel.
  • Modernise architecture (Spot, serverless, managed services) for the biggest gains.
