Skip to content
5 min read·Lesson 5 of 10

SQL vs NoSQL

Relational and non-relational databases compared honestly. Where each fits, the families inside NoSQL, and how to choose without dogma.

The "SQL vs NoSQL" debate generated more heat than light. Today's reality: relational databases handle far bigger workloads than people remember, and NoSQL has matured into specific tools for specific shapes of problem. Pick by data shape and access pattern, not by fashion.

What SQL Gives You

  • A relational model — tables, rows, columns, foreign keys.
  • Joins — composing data across tables in queries.
  • ACID transactions — atomicity, consistency, isolation, durability.
  • Strong schema — columns and types declared and enforced.
  • Mature tooling, decades of operational knowledge, every ORM ever.

Tools: PostgreSQL, MySQL, SQL Server, Oracle, plus cloud-native: Aurora (AWS), Cloud SQL / Spanner (GCP), Azure SQL DB, CockroachDB, YugabyteDB.

What NoSQL Gives You (Family by Family)

Key-value

  • Hash table at huge scale: get(key), set(key, value).
  • Sub-ms latency, horizontal scale via consistent hashing.
  • No queries beyond key lookup.
  • Examples: Redis, DynamoDB (in basic mode), Memcached, etcd.
  • Use for: caches, session stores, leaderboards, simple lookups.

Document

  • JSON-shaped documents in collections.
  • Flexible schema; nested data first-class.
  • Secondary indexes; rich queries by field; weak/no joins.
  • Examples: MongoDB, DynamoDB, Cosmos DB, Firestore, Couchbase.
  • Use for: catalogues with variable shape, user profiles, content stores.

Wide-column

  • Tables with billions of rows and millions of (sparse) columns.
  • Rows partitioned by key; queries use the partition key + clustering key range.
  • Designed for write-heavy, very large scale.
  • Examples: Cassandra, ScyllaDB, HBase, Bigtable, DynamoDB (with sort keys).
  • Use for: time-series, IoT, event logs, large-scale audit trails.

Graph

  • Nodes and edges with properties.
  • Optimised for traversals — friend-of-friend, path-finding, recommendations.
  • Examples: Neo4j, Amazon Neptune, JanusGraph, ArangoDB.
  • Use for: social graphs, fraud rings, knowledge graphs, recommendations.

Search

  • Inverted-index stores for full-text and faceted search.
  • Examples: Elasticsearch, OpenSearch, Algolia, Meilisearch.
  • Use for: search bars, log analytics, observability.

Time-series

  • Append-mostly numeric data over time, with downsampling and retention.
  • Examples: InfluxDB, TimescaleDB (Postgres extension), Prometheus, ClickHouse for analytics.
  • Use for: metrics, IoT telemetry, financial tick data.

Picking Between Them

If your data is...Default to
Relational with joins, complex queries, transactionsSQL (Postgres)
Single-key lookups at huge scaleKey-value (Redis, DynamoDB)
Variable-shape JSON documentsDocument (Mongo, DynamoDB, Firestore)
Massive write throughput, time/event seriesWide-column (Cassandra, Bigtable)
Heavy traversal of relationshipsGraph (Neo4j, Neptune)
Full-text searchSearch (Elasticsearch)
Metrics over timeTime-series (Prometheus, Timescale)

The "Default to Postgres" Argument

For most apps you'll start, Postgres is the right default:

  • Handles millions of rows easily, hundreds of millions with thought.
  • JSONB columns let you store document-shaped data when you need to.
  • Full-text search built in; pgvector for embeddings; PostGIS for geo.
  • Logical replication, partitioning, foreign data wrappers.
  • Aurora PostgreSQL, Cloud SQL, RDS handle the operations.

Only reach for NoSQL when a specific access pattern, shape, or scale demands it. "We might need to scale" is rarely a good enough reason — premature NoSQL costs you joins and transactional integrity you'll miss.

Modern "NewSQL" — SQL at Scale

Distributed SQL databases now provide horizontal scale with relational semantics:

  • Google Spanner — global, externally consistent, used inside Google for years.
  • CockroachDB — Spanner-inspired, open-source, multi-region.
  • YugabyteDB — Postgres-compatible distributed SQL.
  • Aurora — managed Postgres / MySQL with separated storage and compute.

The historical "SQL doesn't scale" claim is mostly outdated. The newer trade-offs are cost and write latency, not "can it scale at all".

Polyglot Persistence

Real systems often combine multiple databases:

  • Postgres for the system of record.
  • Redis for sessions and hot caches.
  • Elasticsearch for search.
  • S3 + Iceberg for analytics.
  • DynamoDB for one specific high-scale low-latency lookup table.

Each store is chosen for what it does best. The cost: data must stay in sync across them — usually via change data capture (Debezium, Datastream) or event-driven updates.

ACID vs BASE

ACID (relational tradition)BASE (NoSQL tradition)
Atomic, Consistent, Isolated, DurableBasically Available, Soft state, Eventually consistent
Strong guarantees per transactionLooser guarantees, optimised for scale

Modern systems are not strictly one or the other. DynamoDB has ACID transactions; Postgres has logical replication producing eventual consistency at replicas. Don't treat the labels as absolute.

Cost and Operations

  • Self-managed Postgres on a VM: cheapest at small scale, expensive in operator hours.
  • Managed Postgres (RDS, Aurora, Cloud SQL): expensive at small scale, cheap in operator hours; the right default for most teams.
  • DynamoDB / Bigtable / Cosmos: pay per request and per stored GB. Cheap at low traffic, can balloon at high QPS without thought.
  • Cassandra / self-managed Mongo: powerful, but operating at scale is a real specialisation.

Cert Mapping

CertDatabase scope
AWS SAA-C03RDS, Aurora, DynamoDB, ElastiCache, Neptune, OpenSearch
Azure AZ-104 / AZ-305Azure SQL, Cosmos DB (multi-API), Cache for Redis
GCP PCA / PDECloud SQL, Spanner, Bigtable, Firestore, BigQuery

Decision Heuristic

  1. Start with Postgres unless you have a specific reason not to.
  2. If access is "given key, get value" at huge QPS, add a key-value store as a cache or primary.
  3. If data shape is fluid documents, consider a document DB.
  4. If you have time-series / event-stream / massive write fan-in, consider wide-column.
  5. If you have a true graph problem (multi-hop traversals are central), use a graph DB.
  6. For analytics, use a warehouse / lakehouse — not the operational store.

The next two lessons take whichever database you picked and ask: how do we replicate it for availability, and shard it for scale?

Key Takeaways

  • SQL databases offer relational integrity, joins, and ACID transactions; NoSQL trades these for scale and flexibility.
  • NoSQL is not one thing — key-value, document, wide-column, and graph stores have very different shapes.
  • Most apps default to SQL (Postgres) and reach for NoSQL only for specific scaling or shape needs.
  • Modern SQL (Postgres, Aurora, Spanner, CockroachDB) closes much of the historical NoSQL scaling gap.
  • Choosing wrong is expensive — migration is hard once data is in.

Test your knowledge

Try exam-style practice questions to reinforce what you've learned.

Practice Questions →