Volumes and Persistent Data — Docker and Containers Basics | CertQnA

Containers are designed to be ephemeral — start one, stop it, throw it away, start a fresh one. But real applications need to persist data: database files, uploaded user content, cached models. Volumes are how containers do that.

Three Ways to Mount Storage

Type	Where it lives	Use for
Named volume	Managed by Docker (typically under /var/lib/docker/volumes)	Databases, app state, anything you want Docker to look after
Bind mount	Any path on the host	Development (live source code), reading host config files
tmpfs mount	Host RAM only — never written to disk	Secrets, scratch space, security-sensitive temp files

Named Volumes

docker volume create pgdata

docker run -d --name db \
  -v pgdata:/var/lib/postgresql/data \
  -e POSTGRES_PASSWORD=secret \
  postgres:16

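The volume pgdata survives container removal: you can stop, remove, and replace the postgres container, and the data persists. Pointing a newer container at the same volume is enough for a minor-version upgrade. A major-version upgrade (for example 15 to 16) changes the on-disk format, so it needs pg_upgrade or a dump and restore before the new container can use the data.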
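A quick way to see this behaviour for yourself is a throwaway check like the sketch below. The function name check_volume_persistence and the volume name demo-data are made up for illustration, and the sketch assumes the public alpine image can be pulled.

```shell
# Sketch: show that a named volume outlives the container that wrote to it.
# "demo-data" is a hypothetical throwaway volume name, not one used elsewhere.
check_volume_persistence() {
  vol="${1:-demo-data}"
  docker volume create "$vol" >/dev/null || return 1
  # The first container writes a marker file into the volume and is removed on exit (--rm).
  docker run --rm -v "$vol":/data alpine sh -c 'echo hello > /data/marker' || return 1
  # A second, brand-new container mounts the same volume and reads the file back.
  docker run --rm -v "$vol":/data alpine cat /data/marker
}
```

Calling check_volume_persistence should print the marker text written by the first container. Remove the scratch volume afterwards with docker volume rm demo-data.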

docker volume ls
docker volume inspect pgdata
docker volume rm pgdata    # only when no container uses it
docker volume prune        # remove unused anonymous volumes; add --all (Docker 23.0+) to include unused named volumes
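Before removing or pruning anything, it can help to check where a volume lives and which volumes are actually unreferenced. The two helper functions below are a small sketch; their names are illustrative, while the --format and --filter dangling=true options are standard Docker CLI flags.

```shell
# Print the host directory that backs a named volume.
# On Docker Desktop this path is inside the Docker VM, not on the host itself.
volume_mountpoint() {
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}

# List the names of volumes that no container currently references.
unused_volumes() {
  docker volume ls --quiet --filter dangling=true
}
```

For example, volume_mountpoint pgdata prints the backing directory, and unused_volumes gives a preview of what a prune could delete.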

Bind Mounts

The killer use case for bind mounts is local development — edit a file on the host, see the change reflected instantly inside the container:

docker run --rm -it \
  -v "$(pwd)":/app \
  -w /app \
  node:20-alpine \
  npm run dev

The current directory is mounted at /app inside the container. This is also how you mount config files, certificates, or specific data sets without baking them into the image.

Beware: by default a bind mount gives the container read/write access to that host path, so a process inside the container can modify or delete host files. Mount only what is necessary, and prefer read-only where possible:

docker run -v /etc/myapp/config.yaml:/etc/myapp/config.yaml:ro myapp
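To audit which mounts of a container are still writable, the mount list reported by docker inspect can be filtered. This is a sketch with illustrative function names; it assumes mount paths contain no spaces, because awk splits fields on whitespace.

```shell
# Print one line per mount of a container: type, source, destination, writable (true/false).
list_mounts() {
  docker inspect --format '{{range .Mounts}}{{println .Type .Source .Destination .RW}}{{end}}' "$1"
}

# Print only the container paths that are mounted read-write.
writable_mounts() {
  list_mounts "$1" | awk '$4 == "true" { print $3 }'
}
```

Running writable_mounts against a container started with the :ro suffix should not list the read-only path.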

The --mount Syntax

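Docker also accepts a more explicit --mount flag that spells out each option as a key=value pair. One behavioural difference for bind mounts: -v creates a missing host path as an empty directory, whereas --mount fails with an error.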

docker run -d -e POSTGRES_PASSWORD=secret --mount type=volume,src=pgdata,dst=/var/lib/postgresql/data postgres:16
docker run --mount type=bind,src=/host/path,dst=/in/container,readonly nginx
docker run --mount type=tmpfs,dst=/tmp,tmpfs-size=64m busybox

Backups

To back up a named volume, attach a one-shot helper container that tars the volume contents to the host:

docker run --rm \
  -v pgdata:/data:ro \
  -v "$(pwd)":/backup \
  alpine \
  tar czf /backup/pgdata-$(date +%F).tgz -C / data

Restore is the reverse: create a new volume and untar into it.
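A restore along those lines might look like the sketch below. It assumes the archive was produced by the backup command above (so its entries start with data/), that it sits in the current directory, and that the target volume name is new; the function name restore_volume is illustrative.

```shell
# Sketch: restore a tar archive into a fresh named volume.
restore_volume() {
  vol="$1"       # name of the new volume, e.g. pgdata-restored (hypothetical)
  archive="$2"   # archive file name inside the current directory
  docker volume create "$vol" >/dev/null || return 1
  # The archive entries are prefixed with data/, so extracting at / fills /data,
  # which is where the new volume is mounted.
  docker run --rm \
    -v "$vol":/data \
    -v "$(pwd)":/backup:ro \
    alpine \
    tar xzf "/backup/$archive" -C /
}
```

After a call such as restore_volume pgdata-restored followed by the archive's file name, a new container can mount pgdata-restored in place of the original volume.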

For production databases, this is too crude — use the database's native backup tool (e.g., pg_dump, mongodump, RDS automated snapshots). The volume is just where the data sits; backup is a database concern.
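For the Postgres container used earlier, a logical backup with pg_dump could be sketched as follows. The sketch assumes the container is named db and still uses the image's default postgres superuser and database; the function name dump_postgres is illustrative.

```shell
# Sketch: logical backup of a Postgres container with pg_dump.
# Assumes the default "postgres" superuser; adjust for a real setup.
dump_postgres() {
  container="${1:-db}"
  database="${2:-postgres}"
  outfile="${database}-$(date +%F).sql"
  # pg_dump runs inside the container; its output is redirected to a host file.
  if docker exec "$container" pg_dump -U postgres "$database" > "$outfile"; then
    echo "$outfile"
  else
    rm -f "$outfile"   # do not leave a partial dump behind
    return 1
  fi
}
```

The resulting file can be loaded into another instance by piping it to psql, for example through docker exec -i with the target container's name.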

Common Pitfalls

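Mounting onto a path that already has files in the image: a bind mount hides the image's files at that path for as long as it is mounted, because the host directory takes over immediately. An empty named volume behaves differently: on first mount Docker copies the image's files at that path into the volume, and later mounts use whatever the volume already contains.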
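Permissions: the user the container process runs as needs read (and usually write) permission on the mounted path, and ownership is matched by numeric UID/GID rather than by user name. Common fixes: chown the directory in the Dockerfile (the ownership carries over when an empty named volume is first populated) or pass --user $(id -u):$(id -g) at run time so files created on a bind mount belong to your host user.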
Forgetting that docker rm deletes the container's writable layer — anything written outside a volume is gone.
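The --user fix from the permissions item can be wrapped in a small helper. This is a sketch with an illustrative function name and a hypothetical /work mount point; it assumes a Linux host, since Docker Desktop maps file ownership differently.

```shell
# Sketch: run a container as the calling host user with the current
# directory bind-mounted at /work, so new files are owned by that user.
run_as_host_user() {
  docker run --rm \
    --user "$(id -u):$(id -g)" \
    -v "$(pwd)":/work \
    -w /work \
    "$@"
}
```

For example, run_as_host_user alpine touch created-in-container leaves a file in the current directory that ls -l shows as owned by you rather than by root.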

Production Patterns

For a single host, named volumes plus regular backups work fine. At scale, consider:

Managed databases (RDS, Cloud SQL, Azure Database) — let the provider handle replication, backups, failover.
Object storage (S3, GCS, Azure Blob) for user uploads and large files — virtually unlimited and accessed via SDK rather than mounted.
Network filesystems (EFS, Azure Files, Filestore) when multiple containers need shared read/write access.
Kubernetes Persistent Volumes (PVs) and StatefulSets — the cloud-native way to schedule stateful workloads.