6 min read·Lesson 9 of 10

Cloud Databases: Managed, Serverless, and Distributed

Tour the managed database services on AWS, Azure, and GCP — when to pick which, and what serverless and distributed SQL bring to the table.

Running a database yourself means dealing with backups, replication, patching, failover, capacity planning, and 3am pages. Managed cloud databases offload most of that. This lesson maps the landscape.

The Managed-Service Layers

From most-managed to least-managed:

  1. Serverless databases — scale to zero, pay per request
  2. Cloud-native managed — Aurora, AlloyDB, Spanner, Cosmos DB
  3. Standard managed — RDS, Cloud SQL, Azure SQL Database
  4. VM + database — EC2 + Postgres yourself
  5. On-prem / co-located

You generally want to live as high on this list as price and feature constraints allow.

Standard Managed Relational Databases

Cloud | Service | Engines
AWS | RDS | Postgres, MySQL, MariaDB, Oracle, SQL Server
Azure | Azure Database for PostgreSQL/MySQL/MariaDB | Postgres, MySQL, MariaDB
GCP | Cloud SQL | Postgres, MySQL, SQL Server
Azure | Azure SQL Database / Managed Instance | SQL Server

What you get:

  • Automated backups, point-in-time recovery
  • Minor-version patching during a maintenance window
  • Read replicas with one click
  • Multi-AZ / zone-redundant high availability
  • Encryption at rest and in transit
  • Metrics, logs, and alerting integrated with the cloud monitoring stack

What's still on you:

  • Schema design and migrations
  • Indexes and query performance
  • Connection pooling at the app layer
  • Right-sizing the instance and storage
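
Connection pooling is the item on this list that is easiest to overlook. The following is a minimal sketch of the idea using only Python's standard library, with `sqlite3` standing in for a real database driver. The `SimplePool` class and its method names are hypothetical, chosen for illustration; a production app would normally rely on the pool that ships with its driver or on a proxy such as PgBouncer.

```python
import queue
import sqlite3
from contextlib import contextmanager


class SimplePool:
    """A fixed-size pool: connections are opened once and then reused."""

    def __init__(self, size, connect):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=5.0):
        # Wait for a free connection instead of opening a new one,
        # which caps how many connections the database ever sees.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the query raised.
            self._idle.put(conn)

    def idle_count(self):
        return self._idle.qsize()


# sqlite3 is an assumption for the sketch; swap in your real driver's connect().
pool = SimplePool(
    size=2,
    connect=lambda: sqlite3.connect(":memory:", check_same_thread=False),
)

with pool.connection() as conn:
    value = conn.execute("SELECT 1 + 1").fetchone()[0]
```

The point of the fixed size is that the app, not the database, absorbs bursts: extra requests wait briefly for a connection rather than pushing the instance past its connection limit.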

Cloud-Native Reimplementations

Amazon Aurora

Aurora speaks the Postgres or MySQL wire protocol but stores data in a distributed, log-structured storage layer that keeps six copies of the data across three availability zones. Compared with standard RDS, that gives faster failover (often under 30 seconds), lower replication lag, and up to 15 read replicas that share the same storage.
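
To see why six copies across three zones matters, here is a small sketch of the quorum arithmetic. It assumes the 4-of-6 write quorum and 3-of-6 read quorum described in Aurora's published design; the function name and structure are illustrative, not an AWS API.

```python
COPIES_PER_AZ = 2   # 6 copies spread evenly over 3 availability zones
AZ_COUNT = 3
WRITE_QUORUM = 4    # assumption: 4-of-6 write quorum
READ_QUORUM = 3     # assumption: 3-of-6 read quorum


def availability(failed_azs=0, extra_failed_copies=0):
    """Return (can_write, can_read) after the given failures."""
    total = COPIES_PER_AZ * AZ_COUNT
    healthy = total - failed_azs * COPIES_PER_AZ - extra_failed_copies
    return healthy >= WRITE_QUORUM, healthy >= READ_QUORUM


# Losing a whole zone leaves 4 copies: reads and writes both continue.
whole_az_down = availability(failed_azs=1)

# Losing a zone plus one more copy leaves 3: reads continue, writes pause.
az_plus_one = availability(failed_azs=1, extra_failed_copies=1)
```

Under these assumptions the storage layer survives the loss of an entire zone without losing write availability, which is what makes the fast failover possible.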

Google AlloyDB

PostgreSQL-compatible. AlloyDB separates compute from storage, adds a columnar engine to accelerate analytical queries, and offers cross-region replicas. It targets demanding OLTP workloads.

Azure Cosmos DB for PostgreSQL

Distributed Postgres built on the open-source Citus extension (the service was formerly called Hyperscale (Citus)). Tables are sharded across nodes for horizontal scale-out.
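
Sharding of this kind boils down to hashing a distribution key and mapping the result to a node. The sketch below is a simplified illustration, not Citus's actual implementation: real systems typically assign hash ranges to shards rather than using a modulo, and the shard count and worker names here are made up.

```python
import hashlib
from collections import Counter

SHARD_COUNT = 8                                  # illustrative value
WORKERS = ["worker-a", "worker-b", "worker-c"]   # hypothetical node names


def shard_for(key: str) -> int:
    """Hash the distribution key; the same key always lands on the same shard."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % SHARD_COUNT


def worker_for(shard: int) -> str:
    """Place shards on workers round-robin."""
    return WORKERS[shard % len(WORKERS)]


# Spread 1000 hypothetical tenants across the workers.
tenants = [f"tenant-{i}" for i in range(1000)]
rows_per_worker = Counter(worker_for(shard_for(t)) for t in tenants)
```

Because all rows for one key live on one node, queries that filter on the distribution key touch a single shard, while queries that do not must fan out to every node.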

Distributed SQL Databases

These give you a SQL interface and ACID transactions but spread data across many nodes and regions.

Google Spanner | Globally consistent, Paxos-replicated; powers Google Ads. Pricey but unique.
CockroachDB | Postgres-wire-compatible; survives node and region failures.
YugabyteDB | Postgres-compatible SQL API plus a Cassandra-compatible API.
TiDB | MySQL-compatible distributed SQL with HTAP capabilities.

Use these when single-node Postgres can't keep up with writes or you need multi-region strong consistency. They cost more and have higher per-query latency than a single-node database, because each write has to be acknowledged by a quorum of replicas before it commits.
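
The latency cost can be estimated with a simplified model: a consensus-replicated write commits once a majority of replicas (leader included) has acknowledged it, so commit latency is roughly the round trip to the slowest replica in the fastest majority. The sketch below ignores disk and processing time, and the round-trip figures are hypothetical.

```python
def quorum_commit_latency_ms(follower_rtts_ms):
    """Estimate commit latency for one leader plus the given followers.

    follower_rtts_ms: round-trip times from the leader to each follower.
    """
    group_size = len(follower_rtts_ms) + 1   # followers + leader
    majority = group_size // 2 + 1
    acks_needed = majority - 1               # the leader counts as one ack
    if acks_needed == 0:
        return 0.0
    # The commit waits for the acks_needed-th fastest follower.
    return sorted(follower_rtts_ms)[acks_needed - 1]


# Hypothetical round trips: followers in nearby zones vs. other regions.
single_region = quorum_commit_latency_ms([1.0, 1.5])
multi_region = quorum_commit_latency_ms([35.0, 80.0])
```

With these made-up numbers a write that takes about a millisecond of network time within one region takes tens of milliseconds once the nearest majority spans regions, which is the trade-off the paragraph above describes.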

Managed NoSQL

Service | Cloud | Notes
DynamoDB | AWS | Key-value & document; single-digit-ms latency at any scale; pay-per-request or provisioned capacity
Cosmos DB | Azure | Multi-model (SQL API, MongoDB API, Cassandra API, Gremlin); global distribution; tunable consistency
Firestore | GCP | Document DB with realtime listeners; great for mobile apps
Bigtable | GCP | Wide-column store for very large, high-throughput operational and analytical workloads
Cloud Memorystore / ElastiCache / Azure Cache for Redis | GCP / AWS / Azure | Managed Redis
OpenSearch / Elastic Cloud | All | Managed search

Serverless Databases

Pay only for what you use; scale to zero (or near-zero) when idle. Great for spiky workloads and dev environments.

Aurora Serverless v2 | Postgres or MySQL; scales between min and max ACUs
Neon | Postgres-compatible; branching like Git; scales to zero
PlanetScale | MySQL-compatible (Vitess); branching, deploy requests, no foreign keys by default
Cloudflare D1 | SQLite at the edge
Supabase | Hosted Postgres + auth + realtime + storage; great for full-stack apps
DynamoDB on-demand | True per-request billing

Cloud Data Warehouses

Different beast — columnar, optimised for analytics over huge datasets. Don't use them as your transactional DB; sync data into them.

  • Amazon Redshift
  • Google BigQuery — serverless, scan-priced
  • Snowflake — multi-cloud, separates compute and storage
  • Databricks SQL — built on Delta Lake
  • ClickHouse Cloud — open-source columnar, very fast

Cost Models

Be aware of how each service bills:

  • Provisioned (RDS, Cloud SQL) — you pay for the instance whether or not it's used. Storage and IOPS billed separately.
  • Per-request (DynamoDB on-demand, BigQuery) — pay per read/write or per byte scanned. Spiky traffic friendly; runaway queries can be expensive.
  • Serverless ACU/RU (Aurora Serverless, Cosmos DB) — pay for compute units consumed. Usually a min floor.
  • Storage — usually billed per GB-month. Backups and replicas count.
  • Egress — pulling data out of the cloud or across regions can dwarf compute costs. Watch this.
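
A quick way to compare the provisioned and per-request models is to compute the monthly request volume at which they cost the same. The sketch below does that arithmetic; every price in it is a made-up placeholder rather than a real list price, and the function names are illustrative.

```python
HOURS_PER_MONTH = 730


def provisioned_monthly(instance_per_hour, storage_gb, storage_per_gb_month):
    """Instance is billed for every hour, used or not, plus storage."""
    return instance_per_hour * HOURS_PER_MONTH + storage_gb * storage_per_gb_month


def per_request_monthly(requests, price_per_million, storage_gb, storage_per_gb_month):
    """Compute is billed per request, plus the same storage charge."""
    return requests / 1_000_000 * price_per_million + storage_gb * storage_per_gb_month


def break_even_requests(instance_per_hour, price_per_million):
    """Monthly requests at which per-request compute equals the instance cost."""
    return instance_per_hour * HOURS_PER_MONTH / price_per_million * 1_000_000


# Placeholder prices (assumptions): $0.20/hour, $1.00 per million requests,
# $0.10 per GB-month of storage.
provisioned = provisioned_monthly(0.20, storage_gb=100, storage_per_gb_month=0.10)
on_demand = per_request_monthly(20_000_000, 1.00, storage_gb=100, storage_per_gb_month=0.10)
threshold = break_even_requests(0.20, 1.00)
```

Below the break-even volume the per-request model is cheaper; above it, the always-on instance wins. Egress is left out of the sketch on purpose, since it depends on where the data goes rather than on the billing model.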

Choosing Between Managed and Self-Hosted

Reason to self-host | Reason to use managed
Heavy customisation (custom extensions, kernel tuning) | You don't want to wake up at 3am for a failed replica
Compliance restricts cloud usage | You'd rather pay engineers for product, not for ops
Cost at very large scale (100s of TB) | You want one-click multi-AZ HA
Vendor lock-in concerns | You want point-in-time recovery built in

For 95% of teams, managed wins. The remaining 5% generally know who they are.

Practical Migration Path

  1. Start with managed Postgres on your cloud (RDS/Cloud SQL/Azure DB)
  2. Add read replicas when reads dominate
  3. Add a Redis cache for hot read paths
  4. Move to Aurora / AlloyDB when you need faster failover or more read scale, and to distributed SQL (Spanner, CockroachDB) only when you've outgrown a single primary
  5. Sync analytics into BigQuery / Redshift / Snowflake

Don't pre-optimise. Most products never need step 4.
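
The caching step in the path above is usually implemented as the cache-aside pattern: check the cache first, fall back to the database on a miss, then store the result with a time-to-live. Here is a minimal sketch in which a plain dict stands in for Redis; the `CacheAside` class and `load_user` function are hypothetical names for illustration.

```python
import time


class CacheAside:
    """Read-through helper: cache hits skip the database entirely."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}   # key -> (expires_at, value); stands in for Redis

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]                      # fresh hit
        value = self._load(key)                  # miss or expired: hit the DB
        self._store[key] = (self._clock() + self._ttl, value)
        return value

    def invalidate(self, key):
        """Call after a write so readers don't see stale data."""
        self._store.pop(key, None)


db_reads = []   # records which keys actually reached the "database"


def load_user(user_id):
    db_reads.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


cache = CacheAside(load_user, ttl_seconds=60.0)
first = cache.get(7)    # miss: loads from the database
second = cache.get(7)   # hit: served from the cache
```

The time-to-live bounds how stale a cached value can get, and explicit invalidation on writes tightens that further for data that must stay current.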

Key Takeaways

  • Managed services (RDS, Cloud SQL, Azure SQL) handle backups, patching, replication — but you still own schema and indexes.
  • Cloud-native databases (Aurora, AlloyDB) reimplement Postgres/MySQL on distributed storage for better scale and faster failover.
  • Distributed SQL (Spanner, CockroachDB) gives you SQL with global scale and strong consistency.
  • Serverless databases (Aurora Serverless, Neon, PlanetScale) scale down to zero or a small floor when idle and bill for usage rather than for a fixed instance.
  • Choose managed by default — self-host only when you have specific control or cost reasons.
