NoSQL Databases: A Practical Tour

"NoSQL" started as a label for databases that don't use SQL or fixed schemas. Today it's a broad umbrella covering very different systems. The right way to think about NoSQL is by data model — pick the model that matches your access pattern.

Why NoSQL Exists

Relational databases shine when:

Data has rich relationships
You need ACID transactions
Queries are varied and ad-hoc

They struggle when:

Write volume exceeds what one node can handle
Data is naturally hierarchical (deeply nested objects)
Schemas evolve every week
You need globally distributed, low-latency reads

NoSQL systems specialise: each gives up some relational feature, such as joins, ad-hoc queries, or multi-row transactions, to scale or simplify in a specific direction.

The Four Main Families

1. Key-Value Stores

The simplest model: a giant hash map. SET foo bar / GET foo.

Examples	Redis, Memcached, AWS DynamoDB (key-value mode), etcd, Consul KV
Use for	Caches, session storage, rate limit counters, leaderboards (Redis sorted sets), feature flags
Trade-off	You can only fetch by key — no `WHERE` clauses

redis-cli
> SET user:42 '{"name":"Alex"}'
OK
> GET user:42
"{\"name\":\"Alex\"}"
> INCR page-views:home
(integer) 1
> ZADD leaderboard 1500 alice
(integer) 1
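> ZREVRANGE leaderboard 0 9 WITHSCORES
1) "alice"
2) "1500"

ZREVRANGE orders members from highest score to lowest, so ranks 0 to 9 are the top 10; plain ZRANGE would return the ten lowest scores. On Redis 6.2 and later, ZRANGE leaderboard 0 9 REV WITHSCORES is the preferred spelling.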
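The rate limit counter use case listed above needs nothing more than an atomic increment on a key. Here is a minimal sketch, assuming a fixed-window policy. InMemoryKV, allow_request, and the rate:<client>:<window> key naming are hypothetical stand-ins for illustration, not a real Redis client API.

```python
import time
from typing import Callable, Dict


class InMemoryKV:
    """Tiny stand-in for a key-value store; supports only what the demo needs."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}

    def incr(self, key: str) -> int:
        # Same contract as Redis INCR: a missing key counts as 0 before the increment.
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]


def allow_request(kv: InMemoryKV, client_id: str, limit: int,
                  window_seconds: int = 60,
                  now: Callable[[], float] = time.time) -> bool:
    """Fixed-window limiter: one counter per client per time window."""
    window = int(now() // window_seconds)
    key = f"rate:{client_id}:{window}"  # hypothetical key naming scheme
    return kv.incr(key) <= limit


kv = InMemoryKV()
fixed_clock = lambda: 120.0  # fake clock so the demo is deterministic
results = [allow_request(kv, "alex", limit=3, now=fixed_clock) for _ in range(5)]
print(results)  # [True, True, True, False, False]
```

Against a real store the counter keys would also need an expiry, otherwise one key per client per window accumulates forever.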

2. Document Databases

Store JSON-shaped records ("documents") in collections. The schema is per-document — different documents in the same collection can have different fields.

Examples	MongoDB, AWS DocumentDB, Couchbase, Firebase Firestore, ArangoDB
Use for	Catalogues, content stores, user profiles, anything with nested data
Trade-off	Joins are awkward; you typically embed related data

// MongoDB
db.users.insertOne({
    _id: 'alex',
    name: 'Alex',
    addresses: [
        { city: 'Berlin', country: 'DE' },
        { city: 'Lisbon', country: 'PT' }
    ]
});

db.users.find({ 'addresses.country': 'DE' });
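The dotted path in the find call above reaches into every element of the addresses array. The sketch below imitates that matching rule over plain Python dictionaries to show what the query is doing conceptually. values_at and find are hypothetical helpers, and real MongoDB matching has many more rules (operators, indexes, type ordering) than this covers.

```python
from typing import Any, Dict, Iterable, List


def values_at(doc: Any, path: List[str]) -> Iterable[Any]:
    """Yield every value reachable by following `path`, fanning out over lists."""
    if isinstance(doc, list):
        # An array matches if any of its elements matches.
        for item in doc:
            yield from values_at(item, path)
    elif not path:
        yield doc
    elif isinstance(doc, dict) and path[0] in doc:
        yield from values_at(doc[path[0]], path[1:])
    # Documents missing the field simply yield nothing.


def find(collection: List[Dict[str, Any]], dotted_path: str, expected: Any) -> List[Dict[str, Any]]:
    path = dotted_path.split(".")
    return [d for d in collection if any(v == expected for v in values_at(d, path))]


users = [
    {"_id": "alex", "name": "Alex", "addresses": [
        {"city": "Berlin", "country": "DE"},
        {"city": "Lisbon", "country": "PT"},
    ]},
    {"_id": "sam", "name": "Sam", "nickname": "S"},  # different fields, no addresses
    {"_id": "kim", "name": "Kim", "addresses": [{"city": "Porto", "country": "PT"}]},
]

matches = [u["_id"] for u in find(users, "addresses.country", "DE")]
print(matches)  # ['alex']
```

Note that the document without an addresses field is skipped rather than raising an error, which is the practical meaning of a per-document schema.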

3. Wide-Column Stores

Tables of rows, but each row can have different columns and very many of them. Designed to scale horizontally across hundreds of nodes.

Examples	Apache Cassandra, ScyllaDB, Google Bigtable, HBase
Use for	Time-series, IoT telemetry, event logs, message inboxes — any massive-write workload with predictable access patterns
Trade-off	You must design tables around queries; ad-hoc analytics are hard
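To make "design tables around queries" concrete, here is a toy model of a wide-column table: rows are grouped by a partition key and kept sorted by a clustering key, so the one query the table was designed for is a cheap range scan. SensorReadings and its method names are hypothetical, and the sketch ignores replication, on-disk layout, and everything else a real store does.

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[int, Dict[str, float]]  # (clustering key: timestamp, columns)


class SensorReadings:
    """Partition key = sensor_id, clustering key = timestamp."""

    def __init__(self) -> None:
        self._partitions: Dict[str, List[Row]] = defaultdict(list)

    def insert(self, sensor_id: str, ts: int, **columns: float) -> None:
        rows = self._partitions[sensor_id]
        # Keep each partition sorted by timestamp so range reads are contiguous.
        index = bisect.bisect_left([r[0] for r in rows], ts)
        rows.insert(index, (ts, dict(columns)))

    def read_range(self, sensor_id: str, start_ts: int, end_ts: int) -> List[Row]:
        """The query this table was designed for: one sensor, one time range."""
        rows = self._partitions.get(sensor_id, [])
        keys = [r[0] for r in rows]
        lo = bisect.bisect_left(keys, start_ts)
        hi = bisect.bisect_right(keys, end_ts)
        return rows[lo:hi]


table = SensorReadings()
table.insert("sensor-1", 30, temp=21.5)
table.insert("sensor-1", 10, temp=20.0, humidity=0.41)  # rows may carry different columns
table.insert("sensor-1", 20, temp=20.7)
table.insert("sensor-2", 15, temp=18.2)

print(table.read_range("sensor-1", 10, 20))
# [(10, {'temp': 20.0, 'humidity': 0.41}), (20, {'temp': 20.7})]
```

A question such as "all readings above 21 degrees across every sensor" has no matching key here and would require scanning every partition, which is why ad-hoc analytics are hard on this model.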

4. Graph Databases

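First-class support for nodes and edges. In engines that store direct references between neighbouring nodes (Neo4j calls this index-free adjacency), following one edge is a constant-time lookup, so a traversal like "friends of friends who like X" costs in proportion to the edges it touches rather than the total size of the graph. The relational equivalent needs one self-join per hop.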

Examples	Neo4j, AWS Neptune, ArangoDB, JanusGraph
Use for	Social networks, recommendations, fraud rings, dependency graphs, knowledge graphs
Trade-off	Specialist tooling; smaller ecosystem; usually a side database alongside a primary store

// Cypher (Neo4j)
MATCH (me:User {id: 'alex'})-[:FRIEND]->()-[:FRIEND]->(fof:User)
WHERE NOT (me)-[:FRIEND]->(fof) AND me <> fof
RETURN fof.name, count(*) AS mutual
ORDER BY mutual DESC LIMIT 10;
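For readers who do not know Cypher, the same friends-of-friends traversal can be sketched over a plain adjacency map. This is only an illustration of the traversal logic, assuming directed FRIEND edges as in the query above; friends_of_friends and the sample graph are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def friends_of_friends(friends: Dict[str, Set[str]], me: str,
                       limit: int = 10) -> List[Tuple[str, int]]:
    """Return (person, mutual_count) pairs, most mutual friends first."""
    direct = friends.get(me, set())
    mutual: Counter = Counter()
    for friend in direct:                      # first hop
        for fof in friends.get(friend, set()):  # second hop
            # Skip myself and people I am already friends with.
            if fof != me and fof not in direct:
                mutual[fof] += 1
    # Sort by count descending, then by name so ties are deterministic.
    return sorted(mutual.items(), key=lambda item: (-item[1], item[0]))[:limit]


graph = {
    "alex": {"bo", "cy"},
    "bo": {"alex", "dee", "eli"},
    "cy": {"alex", "dee"},
    "dee": set(),
    "eli": set(),
}

print(friends_of_friends(graph, "alex"))  # [('dee', 2), ('eli', 1)]
```

Each hop only looks at the neighbours of the nodes already reached, which is the property that makes this kind of query cheap in a graph store.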

Search Engines

Often grouped with NoSQL: Elasticsearch, OpenSearch, Meilisearch, Typesense, Algolia. They build inverted indexes for fast full-text and faceted search. Use them as a side database synced from your primary store.
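An inverted index maps each term to the set of documents that contain it, so a query becomes a set intersection instead of a scan. The sketch below shows only that core idea; tokenize, build_index, and search are hypothetical names, and real engines add relevance scoring, stemming, and faceting on top.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each term to the ids of the documents containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Return ids of documents containing every term in the query (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


products = {
    1: "Red running shoes",
    2: "Blue running jacket",
    3: "Red rain jacket",
}
index = build_index(products)

print(sorted(search(index, "red jacket")))  # [3]
print(sorted(search(index, "running")))     # [1, 2]
```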

Time-Series Databases

Specialised for high-rate metrics: InfluxDB, TimescaleDB (Postgres extension), Prometheus, AWS Timestream. Optimised for writes, time-bucketed queries, and downsampling.
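Downsampling means replacing raw points with one aggregate per time bucket. A minimal sketch, assuming integer Unix-style timestamps in seconds and a mean aggregate; downsample is a hypothetical helper, not an API from any of the databases named above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def downsample(points: Iterable[Tuple[int, float]],
               bucket_seconds: int) -> List[Tuple[int, float]]:
    """Average (timestamp, value) points into fixed-width time buckets."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in points:
        bucket_start = ts - (ts % bucket_seconds)  # floor to the bucket boundary
        buckets[bucket_start].append(value)
    return [(start, sum(values) / len(values))
            for start, values in sorted(buckets.items())]


raw = [(0, 10.0), (20, 20.0), (59, 30.0), (60, 40.0), (125, 50.0)]
print(downsample(raw, 60))  # [(0, 20.0), (60, 40.0), (120, 50.0)]
```

Five raw points collapse to three, and the same idea applied to millions of points is what keeps long-range dashboards cheap to query.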

The CAP Theorem

In a distributed database, when a network partition happens you must choose between:

Consistency — every read sees the latest write or fails
Availability — every request gets a response, possibly stale

Partition tolerance is not really optional: partitions happen, and you have to tolerate them. The real choice is C versus A during a partition. Examples:

CP systems: MongoDB (default), HBase, etcd, Spanner — refuse stale reads
AP systems: Cassandra, DynamoDB (eventually consistent reads), Riak — always answer, may be slightly stale

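Many systems are tunable per request, so you pick the trade-off at query time. Cassandra, for example, lets each query set a consistency level such as ONE, QUORUM, or ALL, and DynamoDB reads accept a ConsistentRead flag.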
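One common way to implement per-request tuning is with quorums: with N replicas, a write acknowledged by W of them and a read that consults R of them must share at least one replica whenever R + W > N. The sketch below illustrates that rule with a toy model. The function names and the (version, value) representation are assumptions for illustration, and real systems add read repair, hinted handoff, and conflict resolution on top.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Versioned = Tuple[int, str]  # (version, value); a higher version is newer


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True if every read set of size r must intersect every write set of size w."""
    return r + w > n


def write(replicas: List[Versioned], value: Versioned, targets: Sequence[int]) -> None:
    # Only the targeted replicas acknowledge the write; the rest stay stale.
    for i in targets:
        replicas[i] = value


def read(replicas: List[Versioned], targets: Sequence[int]) -> Versioned:
    # Return the newest version among the replicas consulted.
    return max(replicas[i] for i in targets)


N = 3
replicas: List[Versioned] = [(1, "old")] * N
write(replicas, (2, "new"), targets=[0, 1])  # W = 2

print(quorums_overlap(N, r=1, w=2))  # False: a single-replica read may be stale
print(read(replicas, targets=[2]))   # (1, 'old')

print(quorums_overlap(N, r=2, w=2))  # True: every 2-replica read sees the write
print([read(replicas, pair) for pair in combinations(range(N), 2)])
# [(2, 'new'), (2, 'new'), (2, 'new')]
```

Lower R and W give faster, more available requests at the cost of possible staleness, which is the C-versus-A choice expressed as two numbers.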

BASE vs ACID

NoSQL systems often advertise BASE: Basically Available, Soft state, Eventually consistent. The bargain: relax strict consistency to keep serving traffic during failures and to scale across regions.

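Note: many "NoSQL" systems now offer ACID transactions that span multiple records. MongoDB supports multi-document transactions (across shards since version 4.2), and DynamoDB supports transactions over multiple items and tables. The line keeps blurring.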

How to Choose

Start with Postgres. It does most things competently. Don't pick NoSQL because it's trendy.
Add a cache (Redis) when you need sub-millisecond reads or you're hitting the DB too hard.
Add a search engine (Elasticsearch / Meilisearch) when relevance scoring or faceting becomes core.
Pick a document store if your data is genuinely document-shaped and your team is already there.
Pick a wide-column store when single-node Postgres can't handle write volume and the access pattern is predictable.
Pick a graph database when relationships, not records, are the product.

Polyglot Persistence

Most large systems use several. A typical e-commerce stack might use:

Postgres — orders, products, users (system of record)
Redis — sessions, cart cache, rate limits
Elasticsearch — product search and faceting
S3 / object store — images, invoices
ClickHouse / BigQuery — analytics

Each piece does what it's best at. The job of an architect is to pick the smallest set of stores that covers the requirements without creating a sync nightmare.