5 min read·Lesson 8 of 10

Volumes and Persistent Data

Learn how containers store data, why named volumes are preferred over bind mounts in production, and how to back up and restore container data.

Containers are designed to be ephemeral — start one, stop it, throw it away, start a fresh one. But real applications need to persist data: database files, uploaded user content, cached models. Volumes are how containers do that.

Three Ways to Mount Storage

| Type | Where it lives | Use for |
| --- | --- | --- |
| Named volume | Managed by Docker (typically under /var/lib/docker/volumes) | Databases, app state, anything you want Docker to look after |
| Bind mount | Any path on the host | Development (live source code), reading host config files |
| tmpfs mount | Host memory only; not written to the host filesystem | Secrets, scratch space, security-sensitive temp files |

Named Volumes

docker volume create pgdata

docker run -d --name db \
  -v pgdata:/var/lib/postgresql/data \
  -e POSTGRES_PASSWORD=secret \
  postgres:16

The volume pgdata survives container removal. You can stop, remove, and replace the postgres container, and the data persists. Minor-version upgrades (for example, 16.x to a newer 16.y image) can reuse the same volume directly. A new major version cannot read the old data directory as-is, so migrate first with pg_upgrade or a dump and restore.
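
To see that persistence in action without waiting for a database to start, here is a minimal sketch using two throwaway alpine containers. The function name demo_volume_persistence, the default volume name demo_data, and the file name greeting.txt are made up for illustration.

```shell
# Sketch: show that data in a named volume outlives the container that wrote it.
# demo_volume_persistence and demo_data are hypothetical names.
demo_volume_persistence() {
  vol="${1:-demo_data}"

  # Create the volume (a no-op if it already exists).
  docker volume create "$vol" >/dev/null

  # First container writes a file into the volume, then is removed (--rm).
  docker run --rm -v "$vol":/data alpine sh -c 'echo hello > /data/greeting.txt'

  # A brand-new container mounts the same volume and reads the file back.
  docker run --rm -v "$vol":/data alpine cat /data/greeting.txt
}
```

Running demo_volume_persistence scratch should print the contents of greeting.txt from the second container, even though the first one no longer exists. Clean up afterwards with docker volume rm scratch.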

docker volume ls
docker volume inspect pgdata
docker volume rm pgdata    # only when no container uses it
docker volume prune        # remove unused anonymous volumes (Docker 23.0+; add -a to include unused named volumes)

Bind Mounts

The killer use case for bind mounts is local development — edit a file on the host, see the change reflected instantly inside the container:

docker run --rm -it \
  -v "$(pwd)":/app \
  -w /app \
  node:20-alpine \
  npm run dev

The current directory is mounted at /app inside the container. This is also how you mount config files, certificates, or specific data sets without baking them into the image.

Beware: by default, a bind mount gives the container full read/write access to that host path. Mount only what's necessary, and prefer read-only (:ro) when possible:

docker run -v /etc/myapp/config.yaml:/etc/myapp/config.yaml:ro myapp

The --mount Syntax

Modern Docker also accepts a more explicit --mount flag:

docker run -e POSTGRES_PASSWORD=secret --mount type=volume,src=pgdata,dst=/var/lib/postgresql/data postgres:16
docker run --mount type=bind,src=/host/path,dst=/in/container,readonly nginx
docker run --mount type=tmpfs,dst=/tmp,tmpfs-size=64m busybox

Backups

To back up a named volume, attach a one-shot helper container that tars the volume contents to the host:

docker run --rm \
  -v pgdata:/data:ro \
  -v "$(pwd)":/backup \
  alpine \
  tar czf /backup/pgdata-$(date +%F).tgz -C / data

Restore is the reverse: create a new volume and untar into it.
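
A restore along those lines might look like the sketch below. It assumes the archive was produced by the backup command above, so its entries start with data/. The function name restore_volume and its argument order are hypothetical.

```shell
# Sketch: restore a .tgz made with "tar czf ... -C / data" into a named volume.
# restore_volume is a hypothetical helper; usage: restore_volume ARCHIVE VOLUME
restore_volume() {
  archive="$1"
  volume="$2"

  if [ ! -f "$archive" ] || [ -z "$volume" ]; then
    echo "usage: restore_volume ARCHIVE VOLUME" >&2
    return 1
  fi

  # -v needs an absolute host path, so resolve the archive's directory.
  archive_dir=$(cd "$(dirname "$archive")" && pwd)
  archive_name=$(basename "$archive")

  # Create the target volume (a no-op if it already exists).
  docker volume create "$volume" >/dev/null

  # Archive entries are prefixed with data/, so extracting at / fills /data,
  # which is where the volume is mounted. The backup directory is read-only.
  docker run --rm \
    -v "$volume":/data \
    -v "$archive_dir":/backup:ro \
    alpine \
    tar xzf "/backup/$archive_name" -C /
}
```

For example, restore_volume ./pgdata-2024-01-01.tgz pgdata_restored (the file name is illustrative) fills a fresh volume that a new container can then mount. Restoring into a new, empty volume rather than the live one avoids mixing old and new files.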

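For production databases, this is too crude: a file-level copy of a running database can capture a half-written, inconsistent state. Use the database's native backup tool (e.g., pg_dump, mongodump) or your provider's snapshots (e.g., RDS automated snapshots). The volume is just where the data sits; backup is a database concern.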
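
As a rough sketch of the native-tool approach for the Postgres container from earlier, pg_dump can be run inside the container with docker exec and its output redirected to a file on the host. The function name dump_postgres is hypothetical, and the sketch assumes the image's default superuser postgres (that is, POSTGRES_USER was not changed).

```shell
# Sketch: logical backup of a Postgres database running in a container.
# dump_postgres is a hypothetical helper; usage: dump_postgres [CONTAINER] [DATABASE] [OUTFILE]
# Assumes the default "postgres" superuser of the official image.
dump_postgres() {
  container="${1:-db}"
  database="${2:-postgres}"
  outfile="${3:-$database-$(date +%F).sql}"

  # No -t flag: a pseudo-terminal would alter the dump's line endings.
  # The redirect runs on the host, so the dump lands outside the container.
  docker exec "$container" pg_dump -U postgres "$database" > "$outfile"
}
```

Calling dump_postgres db postgres backup.sql writes a plain SQL dump that can be replayed with psql into a fresh database. Unlike the tar approach, this yields a consistent snapshot while the database keeps running.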

Common Pitfalls

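  • Mounting onto a path that already has files in the image. A bind mount always hides the image's files at that path; the host directory wins immediately. A named volume hides them too, except on first use: if the volume is empty, Docker copies the image's files into it before mounting.
  • Permissions: the user the container runs as needs read/write permission on mounted paths. Common fix: chown the target directory in the Dockerfile (this carries over to a new named volume on first mount), or use --user $(id -u):$(id -g) at run time so files written to a bind mount belong to your host user.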
  • Forgetting that docker rm deletes the container's writable layer — anything written outside a volume is gone.
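
The --user fix from the permissions pitfall can be wrapped in a small helper for development use. This is a sketch only; the name run_as_host_user is made up, and it assumes the image works when run as a non-root user.

```shell
# Sketch: run a command in a container as the current host user, with the
# current directory bind-mounted at /app. run_as_host_user is hypothetical.
# usage: run_as_host_user IMAGE [COMMAND...]
run_as_host_user() {
  # --user makes files created under /app belong to you on the host,
  # instead of to root.
  docker run --rm \
    --user "$(id -u):$(id -g)" \
    -v "$(pwd)":/app \
    -w /app \
    "$@"
}
```

For instance, run_as_host_user node:20-alpine npm install leaves node_modules owned by your own user, so you can delete or edit it later without sudo.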

Production Patterns

For a single host, named volumes plus regular backups work fine. At scale, consider:

  • Managed databases (RDS, Cloud SQL, Azure Database) — let the provider handle replication, backups, failover.
  • Object storage (S3, GCS, Azure Blob) for user uploads and large files — virtually unlimited and accessed via SDK rather than mounted.
  • Network filesystems (EFS, Azure Files, Filestore) when multiple containers need shared read/write access.
  • Kubernetes Persistent Volumes (PVs) and StatefulSets — the cloud-native way to schedule stateful workloads.

Key Takeaways

  • Container filesystems are ephemeral: anything written outside a volume or mount is lost when the container is removed.
  • Named volumes are managed by Docker and are the right default for databases and stateful workloads.
  • Bind mounts map a host path into the container. They are ideal for development, but in production keep them narrowly scoped and read-only where possible.
  • tmpfs mounts live in host memory only — useful for secrets and scratch space.
  • For production-grade persistence at scale, prefer object storage (S3) or managed databases over local volumes.
