You cannot respond to what you cannot see. Cloud security depends on a well-instrumented pipeline of logs, metrics, and managed detections feeding a place where humans can investigate.
The Three Layers of Cloud Logs
| Layer | AWS | Azure | GCP |
|---|---|---|---|
| Control plane (API calls) | CloudTrail | Activity Log | Audit Logs (Admin Activity) |
| Data plane (object/data access) | CloudTrail Data Events, S3 access logs | Storage diagnostic logs | Audit Logs (Data Access) |
| Network | VPC Flow Logs, Route 53 query logs | NSG Flow Logs, DNS analytics | VPC Flow Logs, Cloud DNS logs |
Turn on all three at the org level — a missing log is a missing investigation.
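One way to make "all three at the org level" checkable is to compare each account against the required layers. The sketch below is a minimal illustration that assumes you already have an inventory of which layers each account has enabled. The `AccountLogging` type, the layer names, and the account IDs are hypothetical and do not come from any cloud SDK.

```python
from dataclasses import dataclass, field

# The three layers from the table above (names are illustrative).
REQUIRED_LAYERS = {"control_plane", "data_plane", "network"}


@dataclass
class AccountLogging:
    account_id: str
    enabled_layers: set = field(default_factory=set)


def find_logging_gaps(accounts):
    """Return {account_id: sorted missing layers} for accounts lacking any layer."""
    gaps = {}
    for acct in accounts:
        missing = REQUIRED_LAYERS - acct.enabled_layers
        if missing:
            gaps[acct.account_id] = sorted(missing)
    return gaps


# Hypothetical inventory, e.g. exported from your own asset database.
inventory = [
    AccountLogging("prod-payments", {"control_plane", "data_plane", "network"}),
    AccountLogging("dev-sandbox", {"control_plane"}),
]
print(find_logging_gaps(inventory))
```

Running a check like this on a schedule turns a missing log source into a finding rather than a surprise during an investigation.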
Centralisation: Send Everything to One Place
Three common destinations:
- Cloud-native data lake — AWS Security Lake (OCSF), Azure Data Explorer, GCP BigQuery. Cheap, queryable, but you build detections.
- SIEM — Splunk, Microsoft Sentinel, Sumo Logic, Elastic Security, Chronicle (Google SecOps). Pre-built detections, alert pipelines, case management.
- XDR / CNAPP combined — Wiz Defend, Sysdig Secure, Palo Alto Cortex Cloud. Posture + runtime + identity correlation.
For multi-cloud, the SIEM/CNAPP route is usually the right answer — one place for the security team to look.
Managed Threat Detection
| Cloud | Service | What it watches |
|---|---|---|
| AWS | GuardDuty | VPC Flow, DNS, CloudTrail, S3 events, EKS audit, Lambda, Malware |
| Azure | Defender for Cloud (CWP) | VMs, containers, App Services, SQL, storage, key vault, DNS, ARM |
| GCP | Security Command Center (Premium / Enterprise) | SCC findings, Event Threat Detection, Container Threat Detection |
Each ingests logs and runs ML / signature detections. Examples:
- EC2 instance communicating with a known C2 IP.
- API call from an anonymising proxy or unusual country.
- Sudden spike in `iam:CreateAccessKey` followed by `s3:GetObject` in many buckets.
- Container running a crypto-miner.
- Storage account exposed publicly.
Findings flow into Security Hub / Defender for Cloud / SCC for unified views, then to your SIEM and ticketing.
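To give a feel for the kind of correlation a managed detector performs, here is a rough sketch of the "new access key followed by reads across many buckets" pattern. It is not how GuardDuty or any other service implements it. The event shape (`time`, `user`, `action`, `bucket`), the 30-minute window, and the three-bucket threshold are all simplifying assumptions.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)  # assumed correlation window
MIN_BUCKETS = 3                 # assumed threshold for "many buckets"


def key_then_bulk_read(events):
    """Return (user, buckets) pairs where a new key is followed by broad S3 reads."""
    events = sorted(events, key=lambda e: e["time"])
    findings = []
    for i, ev in enumerate(events):
        if ev["action"] != "iam:CreateAccessKey":
            continue
        deadline = ev["time"] + WINDOW
        # Distinct buckets read by the same user inside the window.
        buckets = {
            later.get("bucket")
            for later in events[i + 1:]
            if later["action"] == "s3:GetObject"
            and later["user"] == ev["user"]
            and later["time"] <= deadline
        }
        if len(buckets) >= MIN_BUCKETS:
            findings.append((ev["user"], sorted(buckets)))
    return findings


t0 = datetime(2024, 1, 1, 12, 0)
sample = [
    {"time": t0, "user": "alice", "action": "iam:CreateAccessKey"},
    {"time": t0 + timedelta(minutes=1), "user": "alice", "action": "s3:GetObject", "bucket": "hr-data"},
    {"time": t0 + timedelta(minutes=2), "user": "alice", "action": "s3:GetObject", "bucket": "finance-reports"},
    {"time": t0 + timedelta(minutes=3), "user": "alice", "action": "s3:GetObject", "bucket": "backups"},
    {"time": t0 + timedelta(minutes=2), "user": "bob", "action": "s3:GetObject", "bucket": "hr-data"},
]
print(key_then_bulk_read(sample))
```

Real services add baselining and threat intelligence on top, which is a large part of why the managed option is worth enabling before writing custom rules.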
Identity Detection
Most cloud attacks go through identity. Set up alerts on:
- Console login from new country / impossible travel.
- MFA bypass / disabled.
- New IAM access key on a long-existing user.
- New role assumption pattern (role A never assumed role B before).
- Privilege escalation chains (for example, `iam:PutUserPolicy` attaching an inline admin policy).
- Federation provider changes.
Your IdP (Entra ID, Okta, Google) plus the cloud's IAM logs together tell the story. Defender for Identity, Okta ThreatInsight, and similar add behavioural baselines.
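The "impossible travel" alert above can be sketched as a speed check between two consecutive logins by the same user. This is a simplified illustration, not any vendor's algorithm. It assumes each login has already been geolocated to a latitude and longitude. The 900 km/h ceiling and the 100 km jitter allowance are assumed values you would tune.

```python
import math
from datetime import datetime

MAX_PLAUSIBLE_KMH = 900  # assumed ceiling, roughly airliner cruising speed
MIN_DISTANCE_KM = 100    # assumed allowance for coarse IP geolocation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def is_impossible_travel(prev, curr):
    """True if moving between the two logins would require implausible speed."""
    km = haversine_km(prev["lat"], prev["lon"], curr["lat"], curr["lon"])
    if km < MIN_DISTANCE_KM:
        return False
    hours = (curr["time"] - prev["time"]).total_seconds() / 3600
    if hours <= 0:
        return True  # far apart at the same moment
    return km / hours > MAX_PLAUSIBLE_KMH


london = {"lat": 51.5074, "lon": -0.1278, "time": datetime(2024, 1, 1, 9, 0)}
sydney = {"lat": -33.8688, "lon": 151.2093, "time": datetime(2024, 1, 1, 11, 0)}
paris = {"lat": 48.8566, "lon": 2.3522, "time": datetime(2024, 1, 1, 12, 0)}

print(is_impossible_travel(london, sydney))  # two hours apart
print(is_impossible_travel(london, paris))   # three hours apart
```

In practice VPNs and corporate egress points cause false positives, so this kind of rule usually works best combined with other identity signals.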
Continuous Compliance / Posture (CSPM)
While threat detection looks for active attacks, posture management looks for misconfiguration before it is exploited.
- AWS Config + Security Hub, Azure Policy + Defender for Cloud, GCP Security Command Center / Policy Intelligence ship many built-in rules.
- CSPM tools (Wiz, Prisma, Lacework, Orca) cover all three clouds and add graph reasoning ("this public bucket is reachable by this role used by this exposed Lambda").
Track Mean Time to Remediate (MTTR) for high-severity findings as a security KPI.
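As a rough illustration of the KPI, MTTR can be computed from the open and resolve timestamps of findings. The field names (`severity`, `opened_at`, `resolved_at`) are assumptions about an export from your CSPM or ticketing tool, not a specific product's schema.

```python
from datetime import datetime
from statistics import mean


def mttr_hours(findings, severity="HIGH"):
    """Mean hours from open to resolve for resolved findings of one severity."""
    durations = [
        (f["resolved_at"] - f["opened_at"]).total_seconds() / 3600
        for f in findings
        if f["severity"] == severity and f.get("resolved_at") is not None
    ]
    return mean(durations) if durations else None


findings = [
    {"severity": "HIGH", "opened_at": datetime(2024, 1, 1), "resolved_at": datetime(2024, 1, 2)},
    {"severity": "HIGH", "opened_at": datetime(2024, 1, 1), "resolved_at": datetime(2024, 1, 4)},
    {"severity": "HIGH", "opened_at": datetime(2024, 1, 3), "resolved_at": None},  # still open
    {"severity": "LOW", "opened_at": datetime(2024, 1, 1), "resolved_at": datetime(2024, 1, 20)},
]
print(mttr_hours(findings))
```

Note that this version ignores findings that are still open, which can flatter the number. Tracking the age of open high-severity findings alongside it gives a fuller picture.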
Detection-as-Code
Treat detections like application code:
- Detection rules in Git, code-reviewed, with tests.
- Sigma rules / KQL / SPL queries / YARA-L versioned.
- CI runs synthetic events against the rules to confirm they fire.
- Tuning is a PR, not an undocumented click.
Tools like Panther, a Snowflake security data lake, and Sigma converters make this practical.
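The workflow above can be sketched in plain Python: a rule is a function, and the synthetic events that CI replays against it live next to it in the repository. This is a minimal illustration rather than the format of Panther or any other tool. The rule shown (root console login without MFA) uses CloudTrail field names, while the test harness `run_rule_tests` is a hypothetical helper.

```python
def root_login_without_mfa(event):
    """Fire on an AWS root console login where MFA was not used."""
    return (
        event.get("eventName") == "ConsoleLogin"
        and event.get("userIdentity", {}).get("type") == "Root"
        and event.get("additionalEventData", {}).get("MFAUsed") == "No"
    )


# Synthetic events paired with the expected rule outcome.
SYNTHETIC_CASES = [
    ({"eventName": "ConsoleLogin", "userIdentity": {"type": "Root"},
      "additionalEventData": {"MFAUsed": "No"}}, True),
    ({"eventName": "ConsoleLogin", "userIdentity": {"type": "Root"},
      "additionalEventData": {"MFAUsed": "Yes"}}, False),
    ({"eventName": "ConsoleLogin", "userIdentity": {"type": "IAMUser"},
      "additionalEventData": {"MFAUsed": "No"}}, False),
    ({"eventName": "GetObject"}, False),
]


def run_rule_tests(rule, cases):
    """Return the indexes of cases where the rule disagrees with the expectation."""
    return [i for i, (event, expected) in enumerate(cases) if rule(event) != expected]


failures = run_rule_tests(root_login_without_mfa, SYNTHETIC_CASES)
assert not failures, f"rule regressions in cases {failures}"
```

With this shape, a tuning change shows up as a diff to the rule and its cases, and a failing assertion blocks the merge.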
Honeytokens and Canaries
Plant a fake AWS access key in a place an attacker would look (a config file, a dummy repo). Set up an alert that fires the moment anyone tries to use it. Because nothing legitimate should ever use the key, the signal-to-noise ratio is exceptional. Tools: Thinkst Canary and its free Canarytokens service.
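A honeytoken alert can be as simple as matching the access key ID on every audit event. The sketch below assumes CloudTrail-style events (`userIdentity.accessKeyId`, `sourceIPAddress`, `eventName`) and uses AWS's documented example key ID as a stand-in for your planted key. The alert dictionary shape is hypothetical.

```python
# Key IDs of planted decoy credentials (this one is AWS's documentation example).
CANARY_KEY_IDS = {"AKIAIOSFODNN7EXAMPLE"}


def canary_alerts(events):
    """Return a P1 alert for every event made with a canary access key."""
    alerts = []
    for ev in events:
        key_id = ev.get("userIdentity", {}).get("accessKeyId")
        if key_id in CANARY_KEY_IDS:
            alerts.append({
                "severity": "P1",
                "key_id": key_id,
                "source_ip": ev.get("sourceIPAddress"),
                "action": ev.get("eventName"),
            })
    return alerts


events = [
    {"eventName": "GetCallerIdentity", "sourceIPAddress": "203.0.113.7",
     "userIdentity": {"accessKeyId": "AKIAIOSFODNN7EXAMPLE"}},
    {"eventName": "ListBuckets", "sourceIPAddress": "198.51.100.4",
     "userIdentity": {"accessKeyId": "AKIAI44QH8DHBEXAMPLE"}},
]
print(canary_alerts(events))
```

The rule deliberately has no threshold or allowlist: any use of the key is treated as hostile, which is what keeps the signal clean.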
Building the Pipeline
    [ Cloud audit + flow + data logs ]
            │
            ▼
    [ Org-level central account / SIEM ]
            │
            ├──→ [ Managed detection: GuardDuty, Defender, SCC ]
            ├──→ [ CSPM: AWS Config, Wiz, Prisma ]
            ├──→ [ Custom rules in SIEM ]
            │
            ▼
    [ Alert pipeline: severity routing ]
            │
            ├── P1 → on-call SOC, automated containment (disable creds, isolate instance)
            ├── P2 → ticket + business-hour triage
            └── P3 → metrics / dashboards
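The severity-routing stage at the bottom of the pipeline might look something like the following. This is a sketch under assumptions: findings carry a numeric `severity` from 0 to 10, the thresholds are illustrative, and the action names are placeholders for whatever your paging, ticketing, and metrics integrations are called.

```python
from enum import Enum


class Priority(Enum):
    P1 = "P1"
    P2 = "P2"
    P3 = "P3"


# Placeholder action names; wire these to real integrations.
ROUTES = {
    Priority.P1: ["page_oncall", "auto_contain"],
    Priority.P2: ["open_ticket"],
    Priority.P3: ["record_metric"],
}


def classify(finding):
    """Map an assumed 0-10 severity score to a priority tier."""
    severity = finding.get("severity", 0)
    if severity >= 8:
        return Priority.P1
    if severity >= 4:
        return Priority.P2
    return Priority.P3


def route(finding):
    """Return the list of actions to run for a finding."""
    return ROUTES[classify(finding)]


print(route({"title": "Credential exfiltration", "severity": 9}))
print(route({"title": "Security group allows 0.0.0.0/0 on 22", "severity": 5}))
print(route({"title": "Unused IAM role", "severity": 1}))
```

Keeping the mapping in one small table makes it easy to review who gets woken up for what.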
What "Good" Looks Like
- Every account has CloudTrail / Activity Log / Audit Logs flowing to a central location that source accounts can write to but cannot alter or delete from.
- The central log store retains data for more than one year to support forensic investigations.
- Managed detection (GuardDuty / Defender / SCC) is on org-wide.
- A handful of high-fidelity custom detections cover the top org-specific risks.
- Critical findings page the on-call engineer within minutes; everything else is tracked against an SLA.
- Practice runs (purple team, tabletop) keep the pipeline exercised.
Anti-Patterns
- Logs only in the same account as the workload — an attacker with the workload can delete them.
- 30-day retention for security logs — many real-world breaches surface months after the initial intrusion.
- Alert thresholds set so high that nothing fires, or so low that everything fires and is ignored.
- Findings landing in a dashboard with no owner.
- SIEM with no automation — humans triage every benign finding manually.
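As one small example of the automation the last point calls for, known-benign findings can be closed automatically from a reviewed suppression list so that humans only see the rest. The sketch below is illustrative. The `SUPPRESSIONS` entries, field names, and the expiry date are hypothetical.

```python
from datetime import date

# Reviewed exceptions. Each entry expires so it gets re-examined.
SUPPRESSIONS = [
    {
        "rule": "public_bucket",
        "resource": "static-website-assets",
        "expires": date(2030, 1, 1),
        "reason": "intentionally public site assets",
    },
]


def triage(finding, today):
    """Return 'auto_closed' for suppressed findings, else 'needs_human'."""
    for s in SUPPRESSIONS:
        if (
            s["rule"] == finding["rule"]
            and s["resource"] == finding["resource"]
            and today <= s["expires"]
        ):
            return "auto_closed"
    return "needs_human"


today = date(2025, 6, 1)
print(triage({"rule": "public_bucket", "resource": "static-website-assets"}, today))
print(triage({"rule": "public_bucket", "resource": "customer-exports"}, today))
```

Giving every suppression a reason and an expiry keeps the list from quietly turning into a blind spot.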
Detection earns its keep only when it shortens incidents. The next lesson takes the alert from fired to contained.