Imagine you are writing a complex program and something breaks. You want to go back to yesterday's working version, but you only have one copy of the files. Or you are working with a teammate and you both edit the same file simultaneously — whose changes win? Version control solves both problems.
What is Version Control?
A version control system (VCS) records changes to a set of files over time. It gives you:
- History: See every change ever made, who made it, and why.
- Revert: Undo mistakes by rolling back to any previous state.
- Collaboration: Multiple people work on the same codebase simultaneously without overwriting each other.
- Branching: Work on a new feature in isolation, then merge it back when ready.
- Audit trail: Know exactly when a bug was introduced and by which commit.
Centralised vs Distributed VCS
| | Centralised (e.g. SVN) | Distributed (e.g. Git) |
|---|---|---|
| Repository location | Single server | Every developer has a full copy |
| Offline work | Limited — needs server | Full history available offline |
| Single point of failure | Yes — server goes down, work stops | No — any copy can restore the project |
| Speed | Slower (network required for most ops) | Fast (most ops are local) |
| Branching | Heavy — copies directory trees | Lightweight — just a pointer |
Git's History
Git was created by Linus Torvalds in April 2005 — in just a few weeks — after the Linux kernel project lost access to its proprietary VCS, BitKeeper. Torvalds had specific requirements:
- Speed
- Simple design
- Strong support for non-linear development (thousands of parallel branches)
- Fully distributed
- Able to handle large projects efficiently
Git met all of these. It went on to become the dominant VCS in the open-source world, and today the large majority of software teams use it.
How Git Stores Data
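Unlike older VCS tools that store file diffs (what changed between versions), Git models each commit as a snapshot of the entire project. If a file has not changed, Git stores a reference to the previous identical file, not a copy. This keeps many operations fast: to decide whether two files or directories are identical, Git only has to compare their checksums rather than their contents. (Behind the scenes Git may still delta-compress objects when it packs them, but that is a storage optimisation, not the model you work with.)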
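The snapshot idea can be illustrated with a toy content-addressed store. This is a minimal sketch, not Git's real storage format: the names `object_store`, `store_blob`, `take_snapshot`, and `changed_paths` are hypothetical, and real Git also compresses objects and records directory trees.

```python
import hashlib

# Toy content-addressed store: maps a checksum to file contents.
# Hypothetical names; real Git keeps compressed objects under .git/objects.
object_store: dict[str, bytes] = {}


def store_blob(content: bytes) -> str:
    """Save content under its checksum and return the checksum."""
    digest = hashlib.sha1(content).hexdigest()
    object_store.setdefault(digest, content)  # identical content is stored once
    return digest


def take_snapshot(files: dict[str, bytes]) -> dict[str, str]:
    """Record a whole-project snapshot as a mapping of path -> checksum."""
    return {path: store_blob(content) for path, content in files.items()}


def changed_paths(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Compare two snapshots by checksum only, without reading file contents."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))


first = take_snapshot({"README.md": b"hello\n", "main.py": b"print('v1')\n"})
second = take_snapshot({"README.md": b"hello\n", "main.py": b"print('v2')\n"})

print(changed_paths(first, second))  # ['main.py']
print(len(object_store))             # 3 blobs, not 4: README.md is shared
```

Both snapshots describe the whole project, yet the unchanged README.md is stored only once, and finding what changed needs nothing more than a checksum comparison.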
Every object in Git is identified by a hash of its contents, by default SHA-1 (a 40-character hexadecimal string). This means Git can detect corruption: if a single byte changes, the hash no longer matches and Git knows something is wrong.
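To make the hashing concrete, the sketch below computes the ID Git gives to a file's contents (a "blob") in a default SHA-1 repository: the SHA-1 of a short header, `blob <size>` followed by a null byte, and then the content. The function name `git_blob_hash` is a hypothetical helper for illustration, not part of Git.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes a header ("blob <size>\0") followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


original = git_blob_hash(b"test content\n")
altered = git_blob_hash(b"test content!\n")  # one extra byte

print(original)             # d670460b4b4aece5915caf5c68d12f560a9fe3e4
print(len(original))        # 40
print(original == altered)  # False: any change produces a different ID
```

If Git is installed, the first value should match what `git hash-object` reports for a file containing the same bytes.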
The Three Areas of Git
Understanding these three areas is the mental model that makes everything else in Git click:
| Area | Also called | What it holds |
|---|---|---|
| Working tree | Working directory | Your actual files on disk — what you see in your editor |
| Staging area | Index | A preview of your next commit: the changes you have staged with git add |
| Repository | .git directory | The permanent record of all commits, branches, and history |
The typical flow is: edit files in the working tree → stage changes you want to include → commit the staged snapshot to the repository. This two-step process (add then commit) lets you craft precise commits from a messy set of changes.
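The edit, stage, commit flow can be modelled in a few lines. This is a toy sketch under simplifying assumptions: the class `ToyRepo` and its attributes are hypothetical, and it stages whole files, whereas real Git can also stage parts of a file and stores far more metadata with each commit.

```python
class ToyRepo:
    """Toy model of Git's three areas (illustrative only, not Git's real data structures)."""

    def __init__(self) -> None:
        self.working_tree: dict[str, str] = {}   # files on disk
        self.index: dict[str, str] = {}          # staged snapshot for the next commit
        self.commits: list[dict[str, str]] = []  # committed snapshots, oldest first

    def add(self, path: str) -> None:
        """Stage the current working-tree content of one file (like `git add <path>`)."""
        self.index[path] = self.working_tree[path]

    def commit(self) -> dict[str, str]:
        """Record a copy of the index as a new snapshot (like `git commit`)."""
        snapshot = dict(self.index)
        self.commits.append(snapshot)
        return snapshot


repo = ToyRepo()
repo.working_tree["app.py"] = "print('feature')\n"
repo.working_tree["notes.txt"] = "scratch notes, not ready\n"

repo.add("app.py")        # stage only the change we want in this commit
snapshot = repo.commit()

print(sorted(snapshot))                  # ['app.py']
print("notes.txt" in repo.working_tree)  # True: still on disk, just not committed
```

The commit contains only what was staged, which is why the index lets you carve a clean commit out of a working tree that also holds unfinished changes.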
Git vs GitHub
These are not the same thing:
- Git is the command-line tool and the version control system itself. It runs on your machine.
- GitHub is a web platform that hosts Git repositories. It adds collaboration features: pull requests, code review, issues, Actions (CI/CD), and more.
Alternatives to GitHub for hosting Git repositories include GitLab, Bitbucket, and Gitea (self-hosted). They all host standard Git repositories, so the same Git commands work with any of them.