Security Operations is where prevention meets reality: it's the team and tooling that detect, investigate, and respond to threats day in, day out. This lesson walks through how a Security Operations Centre (SOC) is structured and how an incident actually unfolds.
The SOC
A SOC is the team responsible for continuously monitoring an organisation's security and responding to incidents. Typical tiered structure:
| Tier | Responsibilities |
|---|---|
| T1 Analyst | First-line triage of alerts; close false positives; escalate real ones |
| T2 Analyst | Deeper investigation; correlate across systems; contain incidents |
| T3 / Threat Hunter / Detection Engineer | Proactive hunting, building new detections, reverse-engineering malware |
| SOC Manager | Runs the team, reports to CISO, owns metrics and process |
SOCs run 24×7 — internally (follow-the-sun across regions) or via a Managed Security Service Provider (MSSP) / Managed Detection & Response (MDR) vendor.
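Incident Response Lifecycle (NIST SP 800-61)
NIST SP 800-61 Rev. 2 defines four phases; the list below splits its combined Containment, Eradication & Recovery phase into separate steps.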
- Preparation. Build runbooks, train responders, maintain contact lists, run tabletop exercises, deploy detection tooling.
- Detection & Analysis. An alert fires (or someone reports something). Validate it. Classify severity. Open an incident ticket. Begin chain-of-custody.
- Containment. Stop the bleeding. Isolate the host from the network, disable compromised accounts, block IOCs. Short-term containment buys time; long-term containment prevents reinfection.
- Eradication. Remove the attacker — wipe and rebuild systems, rotate credentials, close the entry point.
- Recovery. Bring systems back to production, monitor closely for recurrence.
- Post-incident Activity. Blameless retrospective: what happened, what we did well, what we'd change. Update detections and runbooks.
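The phases above can be sketched as a small state machine that an incident ticket moves through. This is a minimal illustration, not part of NIST's guidance: the names `Phase`, `ALLOWED`, and `Incident` are hypothetical, and the permitted back-transitions (for example, returning to containment after a failed eradication) are assumptions.

```python
from enum import Enum


class Phase(Enum):
    PREPARATION = "preparation"
    DETECTION_AND_ANALYSIS = "detection_and_analysis"
    CONTAINMENT = "containment"
    ERADICATION = "eradication"
    RECOVERY = "recovery"
    POST_INCIDENT = "post_incident"


# Assumed transition table. Moving backwards is legitimate: new findings
# during eradication or recovery can send responders back a step.
ALLOWED = {
    Phase.PREPARATION: {Phase.DETECTION_AND_ANALYSIS},
    Phase.DETECTION_AND_ANALYSIS: {Phase.CONTAINMENT, Phase.POST_INCIDENT},
    Phase.CONTAINMENT: {Phase.ERADICATION, Phase.DETECTION_AND_ANALYSIS},
    Phase.ERADICATION: {Phase.RECOVERY, Phase.CONTAINMENT},
    Phase.RECOVERY: {Phase.POST_INCIDENT, Phase.CONTAINMENT},
    Phase.POST_INCIDENT: {Phase.PREPARATION},
}


class Incident:
    """Tracks which lifecycle phase a ticket is in and how it got there."""

    def __init__(self, ticket_id: str):
        self.ticket_id = ticket_id
        # A ticket is opened once an alert has been validated.
        self.phase = Phase.DETECTION_AND_ANALYSIS
        self.history = [self.phase]

    def advance(self, new_phase: Phase) -> None:
        if new_phase not in ALLOWED[self.phase]:
            raise ValueError(f"{self.phase.name} -> {new_phase.name} is not allowed")
        self.phase = new_phase
        self.history.append(new_phase)


incident = Incident("INC-0001")
for step in (Phase.CONTAINMENT, Phase.ERADICATION, Phase.RECOVERY, Phase.POST_INCIDENT):
    incident.advance(step)
```

Recording `history` gives the retrospective a timeline for free, and rejecting skipped phases makes it harder to declare recovery before eradication has happened.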
The discipline of running this loop matters more than any single tool. A team that has never practised will fumble a real incident regardless of how much it spent on the SIEM.
The Tooling Stack
SIEM — Security Information and Event Management
Aggregates logs from everywhere (endpoints, servers, firewalls, cloud, apps), correlates them, and runs detection rules. The eyes of the SOC. Examples: Splunk, Microsoft Sentinel, Elastic Security, Sumo Logic, Google Security Operations (formerly Chronicle), Panther.
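To make "correlates them and runs detection rules" concrete, here is a minimal sketch of one correlation rule in plain Python: a successful login that follows a burst of failures from the same source. The event field names, the five-minute window, and the threshold of five are all assumptions chosen for illustration; a real SIEM would express this in its own query language.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)  # assumption: correlation window
THRESHOLD = 5                  # assumption: failures that make a success suspicious


def correlate_logins(events):
    """Alert when a success follows >= THRESHOLD failures for the same
    (user, source IP) pair inside WINDOW."""
    failures = defaultdict(deque)
    alerts = []
    for ev in sorted(events, key=lambda e: e["time"]):
        recent = failures[(ev["user"], ev["src_ip"])]
        # Drop failures that have aged out of the window.
        while recent and ev["time"] - recent[0] > WINDOW:
            recent.popleft()
        if ev["outcome"] == "failure":
            recent.append(ev["time"])
        elif ev["outcome"] == "success":
            if len(recent) >= THRESHOLD:
                alerts.append({"user": ev["user"], "src_ip": ev["src_ip"],
                               "time": ev["time"], "failures": len(recent)})
            recent.clear()
    return alerts


start = datetime(2024, 1, 1, 3, 0, 0)
# Six failures ten seconds apart, then a success: should alert.
events = [
    {"time": start + timedelta(seconds=10 * i), "user": "alice",
     "src_ip": "203.0.113.7", "outcome": "failure"}
    for i in range(6)
]
events.append({"time": start + timedelta(seconds=70), "user": "alice",
               "src_ip": "203.0.113.7", "outcome": "success"})
# One mistyped password followed by a success: should not alert.
events.append({"time": start, "user": "bob",
               "src_ip": "198.51.100.4", "outcome": "failure"})
events.append({"time": start + timedelta(seconds=5), "user": "bob",
               "src_ip": "198.51.100.4", "outcome": "success"})

alerts = correlate_logins(events)
```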
EDR — Endpoint Detection and Response
Agents on every endpoint that record process, file, network, and registry activity, run behavioural detections, and let analysts respond remotely (kill process, isolate host). Examples: CrowdStrike Falcon, Microsoft Defender for Endpoint, SentinelOne.
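A behavioural detection of the kind described above can be sketched as a check on process ancestry: an Office application spawning a script interpreter is a common sign of a malicious document. The process lists, event fields, and action names below are assumptions for illustration and do not correspond to any vendor's API.

```python
# Assumed process names; real rules are broader and tuned per environment.
OFFICE_PARENTS = {"winword.exe", "excel.exe", "powerpnt.exe", "outlook.exe"}
SCRIPT_CHILDREN = {"powershell.exe", "cmd.exe", "wscript.exe", "cscript.exe", "mshta.exe"}


def is_suspicious_spawn(event: dict) -> bool:
    """True when an Office application launches a script interpreter."""
    return (event["parent_image"].lower() in OFFICE_PARENTS
            and event["image"].lower() in SCRIPT_CHILDREN)


def plan_response(event: dict) -> list:
    """Return the (hypothetical) remote actions an analyst might queue."""
    if not is_suspicious_spawn(event):
        return []
    return [
        ("kill_process", event["host"], event["pid"]),
        ("isolate_host", event["host"]),
    ]


macro_event = {"host": "LAPTOP-42", "pid": 4312,
               "parent_image": "WINWORD.EXE", "image": "powershell.exe"}
normal_event = {"host": "LAPTOP-42", "pid": 991,
                "parent_image": "explorer.exe", "image": "cmd.exe"}

actions = plan_response(macro_event)
```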
XDR — Extended Detection and Response
EDR plus correlation across email, identity, cloud, and network. Tries to give one unified view rather than a stack of disconnected consoles.
SOAR — Security Orchestration, Automation, and Response
Runs playbooks. When a phishing alert fires, a playbook can automatically pull the email headers, check the attachment in a sandbox, search for who else received it, and pre-fill a containment ticket. Examples: Palo Alto Networks Cortex XSOAR, Splunk SOAR, Tines.
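The phishing playbook described above might look roughly like this when written out. It is a sketch under stated assumptions: `sandbox_verdict` is a stub standing in for a real sandbox lookup, `MAIL_LOG` is a hypothetical mail-gateway log, and the ticket fields are invented for illustration. Only the header parsing uses a real library, Python's standard `email` package.

```python
import hashlib
from email import message_from_string
from email.utils import parseaddr

RAW_EMAIL = """\
From: "IT Support" <helpdesk@phish.example>
To: alice@corp.example
Subject: Password expires today
Message-ID: <abc123@phish.example>

Click the link to keep your account.
"""

# Hypothetical mail-gateway log: one row per delivered copy.
MAIL_LOG = [
    {"message_id": "<abc123@phish.example>", "recipient": "alice@corp.example"},
    {"message_id": "<abc123@phish.example>", "recipient": "bob@corp.example"},
    {"message_id": "<zzz999@partner.example>", "recipient": "carol@corp.example"},
]

KNOWN_BAD_HASHES = {hashlib.sha256(b"fake malware sample").hexdigest()}


def sandbox_verdict(attachment: bytes) -> str:
    """Stub: a real playbook would submit the file to a sandbox service."""
    digest = hashlib.sha256(attachment).hexdigest()
    return "malicious" if digest in KNOWN_BAD_HASHES else "unknown"


def phishing_playbook(raw_email: str, attachment: bytes, mail_log: list) -> dict:
    msg = message_from_string(raw_email)          # 1. pull the headers
    verdict = sandbox_verdict(attachment)         # 2. check the attachment
    recipients = sorted(                          # 3. who else received it?
        {row["recipient"] for row in mail_log
         if row["message_id"] == msg["Message-ID"]}
    )
    return {                                      # 4. pre-fill the ticket
        "title": f"Phishing: {msg['Subject']}",
        "sender": parseaddr(msg["From"])[1],
        "verdict": verdict,
        "recipients": recipients,
        "proposed_actions": ["quarantine message", "block sender domain"],
    }


ticket = phishing_playbook(RAW_EMAIL, b"fake malware sample", MAIL_LOG)
```

Note that the playbook only proposes containment actions; leaving the final decision to an analyst is a common design choice for steps that could disrupt users.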
Threat Intelligence Platforms
Ingest feeds of IOCs (IPs, domains, hashes) and TTPs from commercial and open sources (MISP, AlienVault OTX) and feed them into detections.
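Feeding IOCs into detections usually comes down to indexing the feed and matching it against telemetry. The sketch below assumes a simplified feed format (`type`, `value`) and hypothetical event fields (`dest_ip`, `domain`, `sha256`); it does not reflect the actual MISP or OTX schemas.

```python
import hashlib

# Assumed, simplified feed format; addresses use documentation ranges.
IOC_FEED = [
    {"type": "ip", "value": "203.0.113.7"},
    {"type": "domain", "value": "malicious.example"},
    {"type": "sha256", "value": hashlib.sha256(b"sample").hexdigest()},
]


def build_index(feed):
    """Group indicator values by type for constant-time lookups."""
    index = {}
    for ioc in feed:
        index.setdefault(ioc["type"], set()).add(ioc["value"].lower())
    return index


def domain_matches(domain, bad_domains):
    """Match the domain itself or any parent domain on the list."""
    parts = domain.lower().rstrip(".").split(".")
    return any(".".join(parts[i:]) in bad_domains for i in range(len(parts)))


def match_event(event, index):
    hits = []
    if event.get("dest_ip") in index.get("ip", set()):
        hits.append(("ip", event["dest_ip"]))
    if "domain" in event and domain_matches(event["domain"], index.get("domain", set())):
        hits.append(("domain", event["domain"]))
    if event.get("sha256", "").lower() in index.get("sha256", set()):
        hits.append(("sha256", event["sha256"]))
    return hits


index = build_index(IOC_FEED)
ip_hits = match_event({"dest_ip": "203.0.113.7"}, index)
domain_hits = match_event({"domain": "cdn.malicious.example"}, index)
clean_hits = match_event({"dest_ip": "192.0.2.10", "domain": "corp.example"}, index)
```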
Other tooling
- UEBA — user/entity behavioural analytics; spots impossible travel, unusual data access
- NDR — network detection and response; analyses traffic flow and metadata
- Deception — honeypots, canary tokens that scream when touched
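The "impossible travel" check mentioned under UEBA is simple arithmetic: compute the great-circle distance between two logins and divide by the time between them. The sketch below uses the haversine formula; the 900 km/h ceiling (roughly airliner cruising speed) and the login record fields are assumptions, and real products also account for VPNs and geolocation error.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_KMH = 900.0  # assumption: roughly airliner cruising speed


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def impossible_travel(login_a, login_b):
    """True when the implied speed between two logins exceeds the ceiling."""
    seconds = abs((login_b["time"] - login_a["time"]).total_seconds())
    hours = max(seconds, 1.0) / 3600  # avoid dividing by zero
    distance = haversine_km(login_a["lat"], login_a["lon"],
                            login_b["lat"], login_b["lon"])
    return distance / hours > MAX_PLAUSIBLE_KMH


london = {"time": datetime(2024, 5, 1, 9, 0), "lat": 51.5074, "lon": -0.1278}
sydney = {"time": datetime(2024, 5, 1, 11, 0), "lat": -33.8688, "lon": 151.2093}
paris = {"time": datetime(2024, 5, 1, 12, 0), "lat": 48.8566, "lon": 2.3522}

flag_sydney = impossible_travel(london, sydney)  # two hours apart
flag_paris = impossible_travel(london, paris)    # three hours apart
```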
Detection Engineering
Modern SOCs treat detections like code. Practices:
- Detections live in a Git repo (Sigma rules, Splunk SPL, KQL queries) with code review
- Each detection links to the MITRE ATT&CK technique it covers
- Tests run against synthetic and historical data before deployment
- False-positive rates are tracked; noisy detections are tuned or retired
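The practices above can be illustrated with a detection expressed as a plain Python object that carries its ATT&CK mapping and its own test cases. This is a sketch, not Sigma, SPL, or KQL: the `Detection` class, the event fields, and the matching logic are assumptions. The technique ID T1059.001 is MITRE ATT&CK's identifier for PowerShell.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Detection:
    name: str
    attack_technique: str            # MITRE ATT&CK technique ID
    rule: Callable[[dict], bool]
    positives: list = field(default_factory=list)  # events that must match
    negatives: list = field(default_factory=list)  # events that must not

    def self_test(self) -> bool:
        """Run before deployment, for example in CI on every pull request."""
        return (all(self.rule(e) for e in self.positives)
                and not any(self.rule(e) for e in self.negatives))


def encoded_powershell(event: dict) -> bool:
    # Simplified: matches -enc and -EncodedCommand, not every abbreviation.
    return (event.get("image", "").lower().endswith("powershell.exe")
            and " -enc" in event.get("command_line", "").lower())


PS_PATH = r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"

detection = Detection(
    name="Encoded PowerShell command",
    attack_technique="T1059.001",
    rule=encoded_powershell,
    positives=[{"image": PS_PATH,
                "command_line": "powershell.exe -EncodedCommand SQBFAFgA"}],
    negatives=[{"image": PS_PATH,
                "command_line": "powershell.exe -File backup.ps1"}],
)


def false_positive_rate(verdicts: list) -> float:
    """Share of triaged alerts that analysts closed as false positives."""
    if not verdicts:
        return 0.0
    return verdicts.count("false_positive") / len(verdicts)


passed = detection.self_test()
fp_rate = false_positive_rate(
    ["false_positive", "true_positive", "false_positive", "false_positive"]
)
```

A rule with a false-positive rate this high would be a candidate for tuning or retirement.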
Threat Hunting
Don't wait for alerts. Threat hunting starts from a hypothesis:
"If an attacker compromised our CI/CD service account, they would clone unusual repositories and download large amounts of data outside business hours."
The hunter then queries telemetry to find evidence — or absence — of that activity. Findings either become new detections, or improve the team's mental model of what normal looks like.
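The hypothesis above translates fairly directly into a query over source-control audit events. The sketch below is one possible reading of it: the service account name, the baseline of expected repositories, the business-hours range, and the size threshold are all assumptions a hunter would set from their own environment.

```python
from datetime import datetime

SERVICE_ACCOUNT = "svc-cicd"                 # hypothetical account name
BASELINE_REPOS = {"web-app", "api-gateway"}  # assumed: repos CI normally clones
BUSINESS_HOURS = range(8, 18)                # assumed: 08:00 to 17:59 local time
LARGE_TRANSFER_BYTES = 1_000_000_000         # assumed threshold: 1 GB


def hunt(events):
    """Return (event, reasons) pairs that support the hypothesis."""
    findings = []
    for ev in events:
        if ev["actor"] != SERVICE_ACCOUNT or ev["action"] != "clone":
            continue
        reasons = []
        if ev["repo"] not in BASELINE_REPOS:
            reasons.append("unusual repository")
        if ev["time"].hour not in BUSINESS_HOURS:
            reasons.append("outside business hours")
        if ev["bytes"] >= LARGE_TRANSFER_BYTES:
            reasons.append("large transfer")
        if reasons:
            findings.append((ev, reasons))
    return findings


audit_log = [
    {"actor": "svc-cicd", "action": "clone", "repo": "web-app",
     "time": datetime(2024, 6, 3, 10, 0), "bytes": 50_000_000},
    {"actor": "svc-cicd", "action": "clone", "repo": "hr-payroll",
     "time": datetime(2024, 6, 4, 2, 30), "bytes": 4_000_000_000},
    {"actor": "alice", "action": "clone", "repo": "hr-payroll",
     "time": datetime(2024, 6, 4, 2, 0), "bytes": 10_000_000},
]

findings = hunt(audit_log)
```

An empty result is still useful: it either weakens the hypothesis or shows that the telemetry needed to test it is missing.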
Forensics and Evidence
If an incident may go to legal action, evidence handling matters from minute one:
- Preserve before you investigate: capture memory first (it is lost at power-off), then image disks; don't reboot
- Chain of custody: log who touched what artefact, when
- Hash everything so you can prove evidence wasn't altered
- Use write blockers for physical media
Tools: Volatility (memory), Autopsy (disk), Velociraptor (live response), KAPE (artefact collection).
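Hashing and chain of custody can be combined in a few lines: hash the artefact at acquisition, log who handled it, and re-hash later to show it is unchanged. The sketch below uses only the Python standard library; the log format and the function names are assumptions, and the temporary file stands in for a real disk image.

```python
import hashlib
import os
import tempfile
from datetime import datetime, timezone


def sha256_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_custody(log, artefact_path, handler, action):
    """Append a who/what/when entry, including the artefact's current hash."""
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "artefact": artefact_path,
        "handler": handler,
        "action": action,
        "sha256": sha256_file(artefact_path),
    }
    log.append(entry)
    return entry


def is_unaltered(log, artefact_path):
    """Compare the current hash with the one recorded at acquisition."""
    return sha256_file(artefact_path) == log[0]["sha256"]


# Demo with a stand-in for a disk image.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"pretend this is a disk image")
    image_path = tmp.name

custody_log = []
record_custody(custody_log, image_path, "analyst-1", "acquired")
intact = is_unaltered(custody_log, image_path)

with open(image_path, "ab") as f:  # simulate tampering
    f.write(b"tampered")
tamper_detected = not is_unaltered(custody_log, image_path)
os.remove(image_path)
```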
Key Metrics
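- MTTD (Mean Time To Detect): from compromise to first alert
- MTTR (Mean Time To Respond / Resolve): from first alert to containment or full resolution; state which endpoint you report
- Dwell time: total time the attacker was inside (Mandiant's M-Trends reports put the industry median above 200 days in the mid-2010s and at a few weeks or less in recent years, driven by faster detection and by ransomware that announces itself)
- False positive rate per detection
- Coverage: % of MITRE ATT&CK techniques you can detect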
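As a worked example of the metrics above, the sketch below computes them from three made-up incident records. The timestamps, the record fields, and the set of relevant techniques are invented for illustration. Here MTTR is measured from detection to resolution and dwell time from compromise to resolution, which is one of several definitions in use.

```python
from datetime import datetime
from statistics import mean, median

# Invented incident records for illustration.
incidents = [
    {"compromised": datetime(2024, 3, 1, 8, 0),
     "detected": datetime(2024, 3, 1, 20, 0),
     "resolved": datetime(2024, 3, 2, 8, 0)},
    {"compromised": datetime(2024, 3, 5, 0, 0),
     "detected": datetime(2024, 3, 6, 12, 0),
     "resolved": datetime(2024, 3, 7, 0, 0)},
    {"compromised": datetime(2024, 3, 10, 10, 0),
     "detected": datetime(2024, 3, 10, 22, 0),
     "resolved": datetime(2024, 3, 11, 10, 0)},
]


def hours(delta):
    return delta.total_seconds() / 3600


# MTTD: compromise to first alert.
mttd_hours = mean(hours(i["detected"] - i["compromised"]) for i in incidents)
# MTTR: first alert to resolution (assumed definition).
mttr_hours = mean(hours(i["resolved"] - i["detected"]) for i in incidents)
# Dwell time: compromise to resolution; the median resists outliers.
median_dwell_hours = median(hours(i["resolved"] - i["compromised"]) for i in incidents)

# Coverage: share of the techniques you care about that have a detection.
relevant_techniques = {"T1059", "T1078", "T1110", "T1566"}
detected_techniques = {"T1059", "T1110", "T1566"}
coverage_pct = 100 * len(detected_techniques & relevant_techniques) / len(relevant_techniques)
```

With these records MTTD is 20 hours, MTTR is 12 hours, median dwell time is 24 hours, and coverage is 75%.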
Tabletops and Exercises
Run regular exercises so muscle memory exists when a real incident hits:
- Tabletop — discussion-only, walk through a scenario in a meeting room
- Purple team — red attackers and blue defenders work together to test detections
- Full-scale simulation — replay a realistic attack against a non-prod environment with real tooling
The lessons from these exercises feed back into detections, runbooks, and architecture — closing the loop with the rest of the IR lifecycle.