Logging and Threat Detection — Cloud Security Fundamentals | CertQnA

You cannot respond to what you cannot see. Cloud security depends on a well-instrumented pipeline of logs, metrics, and managed detections feeding a place where humans can investigate.

The Three Layers of Cloud Logs

Layer	AWS	Azure	GCP
Control plane (API calls)	CloudTrail	Activity Log	Audit Logs (Admin Activity)
Data plane (object/data access)	CloudTrail Data Events, S3 access logs	Storage diagnostic logs	Audit Logs (Data Access)
Network	VPC Flow Logs, Route 53 query logs	NSG Flow Logs, DNS analytics	VPC Flow Logs, Cloud DNS logs

Turn on all three at the org level — a missing log is a missing investigation.
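
One way to make "all three, everywhere" checkable is to compare an inventory of enabled log layers against the required set. The sketch below is illustrative only: the layer names, the account names, and the `inventory` structure are assumptions, and in practice the inventory would be built from each provider's own APIs.

```python
from typing import Dict, List, Set

# The three layers from the table above (identifiers are hypothetical).
REQUIRED_LAYERS: Set[str] = {"control_plane", "data_plane", "network"}


def missing_layers(inventory: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return, per account, the required log layers that are not enabled."""
    gaps: Dict[str, List[str]] = {}
    for account, enabled in inventory.items():
        missing = sorted(REQUIRED_LAYERS - enabled)
        if missing:
            gaps[account] = missing
    return gaps


# Hypothetical inventory: account name -> layers currently enabled.
inventory = {
    "prod-payments": {"control_plane", "data_plane", "network"},
    "dev-sandbox": {"control_plane"},
}

print(missing_layers(inventory))
# prints {'dev-sandbox': ['data_plane', 'network']}
```

Any account that shows up in the result is an investigation you would not be able to run.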

Centralisation: Send Everything to One Place

Three common destinations:

Cloud-native data lake — AWS Security Lake (OCSF), Azure Data Explorer, GCP BigQuery. Cheap, queryable, but you build detections.
SIEM — Splunk, Microsoft Sentinel, Sumo Logic, Elastic Security, Chronicle (Google SecOps). Pre-built detections, alert pipelines, case management.
XDR / CNAPP combined — Wiz Defend, Sysdig Secure, Palo Alto Cortex Cloud. Posture + runtime + identity correlation.

For multi-cloud, the SIEM/CNAPP route is usually the right answer — one place for the security team to look.
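
Whichever destination you choose, records from different clouds need a common shape before one query can span them. The following is a minimal sketch of that normalisation, not a full schema such as OCSF. The output field names are invented for illustration, and the provider field names are simplified assumptions (the Azure ones in particular vary by export path), so verify them against the provider documentation.

```python
from typing import Any, Dict


def normalize(cloud: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Map a provider-specific audit record to a small common shape."""
    if cloud == "aws":  # CloudTrail-style record
        return {
            "cloud": "aws",
            "time": record.get("eventTime"),
            "actor": record.get("userIdentity", {}).get("arn"),
            "action": record.get("eventName"),
            "source_ip": record.get("sourceIPAddress"),
        }
    if cloud == "azure":  # Activity Log-style record (assumed field names)
        return {
            "cloud": "azure",
            "time": record.get("time"),
            "actor": record.get("caller"),
            "action": record.get("operationName"),
            "source_ip": record.get("callerIpAddress"),
        }
    if cloud == "gcp":  # Cloud Audit Logs-style LogEntry
        payload = record.get("protoPayload", {})
        return {
            "cloud": "gcp",
            "time": record.get("timestamp"),
            "actor": payload.get("authenticationInfo", {}).get("principalEmail"),
            "action": payload.get("methodName"),
            "source_ip": payload.get("requestMetadata", {}).get("callerIp"),
        }
    raise ValueError(f"unknown cloud: {cloud}")
```

With every record reduced to the same five keys, a single detection ("same source_ip acting in two clouds within an hour") no longer needs three implementations.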

Managed Threat Detection

Cloud	Service	What it watches
AWS	GuardDuty	VPC Flow Logs, DNS query logs, CloudTrail events, S3 data events, EKS audit logs, Lambda network activity, malware scanning
Azure	Defender for Cloud (CWP)	VMs, containers, App Services, SQL, storage, key vault, DNS, ARM
GCP	Security Command Center (Premium / Enterprise)	SCC findings, Event Threat Detection, Container Threat Detection

Each ingests logs and runs ML / signature detections. Examples:

EC2 instance communicating with a known C2 IP.
API call from an anonymising proxy or unusual country.
Sudden spike in iam:CreateAccessKey followed by s3:GetObject in many buckets.
Container running a crypto-miner.
Storage account exposed publicly.

Findings flow into Security Hub / Defender for Cloud / SCC for unified views, then to your SIEM and ticketing.
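
To give a feel for what sits behind such a finding, here is a deliberately simplified sketch of the "access key created, then objects read from many buckets" example. It is not how any managed service implements its detections. The event shape (`time`, `principal`, `action`, `bucket`) and both thresholds are assumptions, and it treats the creating principal and the reading principal as the same, which a real rule could not rely on.

```python
from collections import defaultdict
from datetime import timedelta
from typing import Dict, Iterable, List

WINDOW = timedelta(minutes=30)  # assumed correlation window
MIN_BUCKETS = 5                 # assumed "many buckets" threshold


def suspicious_principals(events: Iterable[Dict]) -> List[str]:
    """Flag principals that create an access key and then read from
    at least MIN_BUCKETS distinct buckets within WINDOW."""
    key_created = {}             # principal -> time of latest key creation
    buckets = defaultdict(set)   # principal -> buckets read since then
    for e in sorted(events, key=lambda e: e["time"]):
        who = e["principal"]
        if e["action"] == "iam:CreateAccessKey":
            key_created[who] = e["time"]
            buckets[who].clear()  # restart the count from the new key
        elif e["action"] == "s3:GetObject" and who in key_created:
            if e["time"] - key_created[who] <= WINDOW:
                buckets[who].add(e["bucket"])
    return sorted(p for p, b in buckets.items() if len(b) >= MIN_BUCKETS)
```

The value of the managed services is that they maintain hundreds of rules like this, plus the baselines and threat intelligence behind them, so that you do not have to.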

Identity Detection

Most cloud attacks go through identity. Set up alerts on:

Console login from new country / impossible travel.
MFA bypassed or disabled.
New IAM access key on a long-existing user.
New role assumption pattern (role A never assumed role B before).
Privilege escalation chains (for example, PutUserPolicy granting a user admin rights).
Federation provider changes.

Your IdP (Entra ID, Okta, Google) plus the cloud's IAM logs together tell the story. Defender for Identity, Okta ThreatInsight, and similar add behavioural baselines.
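
The "impossible travel" alert from the list above comes down to a speed check between two consecutive logins. The sketch below assumes each login has already been geolocated to `lat`/`lon` and carries a `time`; the speed ceiling and the minimum distance (to absorb geo-IP noise) are assumed values, not vendor defaults.

```python
import math
from datetime import datetime
from typing import Dict

MAX_SPEED_KMH = 900.0    # assumed ceiling, roughly airliner cruise speed
MIN_DISTANCE_KM = 100.0  # assumed floor to ignore geo-IP jitter


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def is_impossible_travel(prev: Dict, curr: Dict) -> bool:
    """True if moving from prev to curr would need more than MAX_SPEED_KMH."""
    distance = haversine_km(prev["lat"], prev["lon"], curr["lat"], curr["lon"])
    if distance < MIN_DISTANCE_KM:
        return False
    hours = (curr["time"] - prev["time"]).total_seconds() / 3600
    if hours <= 0:
        return True  # far apart at the same moment
    return distance / hours > MAX_SPEED_KMH


london = {"lat": 51.5, "lon": -0.13, "time": datetime(2024, 5, 1, 9, 0)}
sydney = {"lat": -33.87, "lon": 151.21, "time": datetime(2024, 5, 1, 10, 0)}
print(is_impossible_travel(london, sydney))  # prints True
```

VPNs and corporate egress points produce false positives here, which is one reason the behavioural baselines mentioned above are worth having.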

Continuous Compliance / Posture (CSPM)

While threat detection looks for active attacks, posture management looks for misconfiguration before it is exploited.

AWS Config + Security Hub, Azure Policy + Defender for Cloud, GCP Security Command Center / Policy Intelligence ship many built-in rules.
CSPM tools (Wiz, Prisma, Lacework, Orca) cover all three clouds and add graph reasoning ("this public bucket is reachable by this role used by this exposed Lambda").

Track Mean Time to Remediate (MTTR) for high-severity findings as a security KPI.
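
The MTTR calculation itself is simple arithmetic over finding timestamps. A minimal sketch, assuming each finding is a dict with hypothetical `severity`, `detected_at`, and `remediated_at` fields exported from your CSPM tool:

```python
from datetime import datetime
from statistics import mean
from typing import Dict, List, Optional


def mttr_hours(findings: List[Dict], severity: str = "HIGH") -> Optional[float]:
    """Mean hours from detection to remediation for closed findings
    of the given severity; None if there are no such findings."""
    durations = [
        (f["remediated_at"] - f["detected_at"]).total_seconds() / 3600
        for f in findings
        if f["severity"] == severity and f.get("remediated_at") is not None
    ]
    return mean(durations) if durations else None


findings = [
    {"severity": "HIGH", "detected_at": datetime(2024, 3, 1, 9), "remediated_at": datetime(2024, 3, 1, 13)},
    {"severity": "HIGH", "detected_at": datetime(2024, 3, 2, 9), "remediated_at": datetime(2024, 3, 2, 17)},
    {"severity": "HIGH", "detected_at": datetime(2024, 3, 3, 9), "remediated_at": None},  # still open
    {"severity": "LOW", "detected_at": datetime(2024, 3, 1, 9), "remediated_at": datetime(2024, 3, 9, 9)},
]

print(mttr_hours(findings))  # prints 6.0
```

Note that this version ignores findings that are still open, which flatters the number; tracking the age of open high-severity findings alongside MTTR gives a more honest picture.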

Detection-as-Code

Treat detections like application code:

Detection rules in Git, code-reviewed, with tests.
Sigma rules and KQL, SPL, or YARA-L queries kept under version control.
CI runs synthetic events against the rules to confirm they fire.
Tuning is a PR, not an undocumented click.

Tools like Panther, a Snowflake security data lake, and Sigma converters make this practical.
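
As an illustration of the idea, a detection and its tests can live in the same file and go through the same review as any other change. The sketch below is loosely modelled on the `rule(event)` function shape that some Python-based detection platforms use; it is not tied to any specific product. The rule fires on CloudTrail-style events where an IAM MFA device is deactivated or deleted, and the synthetic events are made up for the test.

```python
# Detection: an IAM MFA device was deactivated or deleted.
MFA_REMOVAL_EVENTS = {"DeactivateMFADevice", "DeleteVirtualMFADevice"}


def rule(event: dict) -> bool:
    """Return True when the event should raise an alert."""
    return (
        event.get("eventSource") == "iam.amazonaws.com"
        and event.get("eventName") in MFA_REMOVAL_EVENTS
    )


# Tests: synthetic events that CI would replay on every pull request,
# for example by running pytest against this file.
def test_fires_on_mfa_deactivation():
    assert rule({"eventSource": "iam.amazonaws.com", "eventName": "DeactivateMFADevice"})


def test_ignores_unrelated_iam_call():
    assert not rule({"eventSource": "iam.amazonaws.com", "eventName": "ListUsers"})


def test_ignores_same_name_from_other_service():
    assert not rule({"eventSource": "example.amazonaws.com", "eventName": "DeactivateMFADevice"})
```

A tuning change, such as excluding a break-glass automation role, then arrives as a diff to `rule` plus a new test case, which leaves a reviewable record of why the detection behaves as it does.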

Honeytokens and Canaries

Plant a fake AWS access key in a place an attacker would look (a config file, a dummy repo). Set up an alert that fires the moment anyone tries to use it. Because no legitimate process ever uses the key, every alert is a high-confidence signal. Options include Thinkst Canary and the free Canarytokens service.
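
If you run your own honeytoken rather than a vendor's, the detection is a plain match on the key ID in CloudTrail-style records. A minimal sketch follows; the key ID is a made-up placeholder, and the alert dictionary is an invented shape for illustration.

```python
from typing import Dict, Iterable, List

# Hypothetical canary key ID; a real one would come from your secrets inventory.
CANARY_KEY_IDS = {"AKIAEXAMPLECANARY001"}


def canary_alerts(records: Iterable[Dict]) -> List[Dict]:
    """Return one alert per record that used a canary access key."""
    alerts = []
    for r in records:
        key_id = r.get("userIdentity", {}).get("accessKeyId")
        if key_id in CANARY_KEY_IDS:
            alerts.append({
                "severity": "P1",  # any use at all is worth a page
                "key_id": key_id,
                "source_ip": r.get("sourceIPAddress"),
                "action": r.get("eventName"),
            })
    return alerts
```

There is no threshold and no baseline to tune, which is exactly why the signal is so clean.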

Building the Pipeline

[ Cloud audit + flow + data logs ]
        │
        ▼
[ Org-level central account / SIEM ]
        │
        ├──→ [ Managed detection: GuardDuty, Defender, SCC ]
        ├──→ [ CSPM: AWS Config, Wiz, Prisma ]
        ├──→ [ Custom rules in SIEM ]
        │
        ▼
[ Alert pipeline: severity routing ]
        │
        ├── P1 → on-call SOC, automated containment (disable creds, isolate instance)
        ├── P2 → ticket + business-hour triage
        └── P3 → metrics / dashboards
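
The severity-routing step at the bottom of the diagram can be sketched as a small dispatch function. The action names are hypothetical labels for whatever your paging, ticketing, and metrics integrations are called, and treating unknown severities as P3 is a design choice you may well want to reverse.

```python
from typing import Dict, List


def route(finding: Dict) -> List[str]:
    """Return the actions to take for a finding, based on its severity."""
    severity = finding.get("severity", "P3")
    if severity == "P1":
        # Page a human and contain automatically (disable creds, isolate instance).
        return ["page_oncall", "auto_contain"]
    if severity == "P2":
        return ["open_ticket"]  # triaged during business hours
    return ["record_metric"]    # P3 and anything unrecognised


print(route({"id": "f-1", "severity": "P1"}))
# prints ['page_oncall', 'auto_contain']
```

Keeping this mapping in code rather than in per-tool settings means the routing policy is reviewable in one place.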

What "Good" Looks Like

Every account has CloudTrail / Activity Log / Audit Logs flowing to a central location that workload accounts can write to but cannot read, modify, or delete.
The central log store retains data for more than one year to support security forensics.
Managed detection (GuardDuty / Defender / SCC) is on org-wide.
A handful of high-fidelity custom detections cover the top org-specific risks.
Critical findings page the on-call within minutes; everything else is tracked against an SLA.
Practice runs (purple team, tabletop) keep the pipeline exercised.

Anti-Patterns

Logs kept only in the workload's own account, where an attacker who compromises that account can delete them.
30-day retention for security logs — many real-world breaches surface months after the initial intrusion.
Alert thresholds set so high that nothing fires, or so low that everything fires and is ignored.
Findings landing in a dashboard with no owner.
SIEM with no automation — humans triage every benign finding manually.
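
The last anti-pattern is often the cheapest to fix: known-benign findings can be closed by rule instead of by hand. The sketch below assumes a hypothetical suppression list in which every entry carries a reason and an expiry date, so that a suppression cannot quietly become permanent. The finding fields and resource names are invented for illustration.

```python
from datetime import date
from typing import Dict, List

# Hypothetical suppressions; each needs a documented reason and an expiry.
SUPPRESSIONS: List[Dict] = [
    {
        "finding_type": "PortScan",
        "resource": "vuln-scanner-01",
        "reason": "authorised internal vulnerability scanner",
        "expires": date(2030, 1, 1),
    },
]


def triage(finding: Dict, today: date) -> str:
    """Return 'auto_closed' for suppressed findings, else 'needs_human'."""
    for s in SUPPRESSIONS:
        if (
            s["finding_type"] == finding["type"]
            and s["resource"] == finding["resource"]
            and today <= s["expires"]
        ):
            return "auto_closed"
    return "needs_human"
```

Once a suppression expires, the finding goes back to a human, which forces a periodic check that the exception is still justified.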

Detection earns its keep only when it shortens incidents. The next lesson takes the alert from fired to contained.