OpenTelemetry (OTel) is now the second-largest CNCF project after Kubernetes by contributor count. It's the de facto vendor-neutral standard for collecting traces, metrics, and logs from modern distributed systems. The OpenTelemetry Certified Associate (OTCA) is the Linux Foundation's certification validating practitioner-level OTel knowledge — and as of 2026, it's becoming a standard ask in SRE and platform engineering job descriptions.
Why OpenTelemetry Matters
Before OTel, observability data was fragmented across proprietary agents (New Relic, Datadog, Dynatrace, Splunk) and incompatible open-source instrumentation libraries and APIs (Zipkin, Jaeger, OpenTracing, OpenCensus). OTel unifies the producer side: instrument once with vendor-neutral SDKs, then export to any backend that accepts OTLP. This gives organisations vendor independence and lets engineers carry their instrumentation knowledge across companies.
Exam At a Glance
| Attribute | Value |
|---|---|
| Exam code | OTCA |
| Cost (USD) | $250 |
| Format | 60 multiple-choice questions |
| Duration | 90 minutes |
| Passing score | 75% |
| Validity | 3 years |
| Free retake | 1 included |
| Prerequisites | None |
Exam Domains
| Domain | Approx weight |
|---|---|
| 1. Observability Primer | 15% |
| 2. OpenTelemetry Concepts (Signals, API/SDK) | 25% |
| 3. OpenTelemetry Collector | 25% |
| 4. Instrumentation | 20% |
| 5. Semantic Conventions & Resource Detection | 15% |
Core Concepts to Master
The Three Signals
| Signal | What it tells you | Status |
|---|---|---|
| Traces | Latency and causality across services | Stable |
| Metrics | Aggregate numerical measurements over time | Stable |
| Logs | Discrete event records | Stable |
| Profiles | CPU and memory profiling data | In development |
API vs SDK vs Instrumentation Libraries
- API: the interface application code calls (vendor-neutral, no-op by default)
- SDK: the implementation that processes and exports telemetry
- Instrumentation libraries: per-framework auto-instrumentation (HTTP servers, gRPC, database clients)
Library authors can depend on the API alone: if the application never installs an SDK, the API calls are no-ops, and no specific SDK is forced on users. This separation is a key OTel design principle.
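The API/SDK split can be pictured with a small stdlib-only sketch. It is an illustration of the pattern, not the real `opentelemetry` package: `NoOpTracer`, `RecordingTracer`, `set_tracer`, `get_tracer`, and `fetch_user` are all hypothetical names chosen for this example.

```python
from contextlib import contextmanager


# "API" layer: the only thing library code imports.
# Hypothetical stand-in for a tracing API; it records nothing by default.
class NoOpTracer:
    @contextmanager
    def start_span(self, name):
        yield None  # no span is created or stored


_tracer = NoOpTracer()


def set_tracer(tracer):
    """Called once by the application to plug in an implementation."""
    global _tracer
    _tracer = tracer


def get_tracer():
    return _tracer


# "SDK" layer: an implementation the application opts into.
class RecordingTracer:
    def __init__(self):
        self.finished = []  # names of spans that have ended

    @contextmanager
    def start_span(self, name):
        try:
            yield name
        finally:
            self.finished.append(name)


# Library code: depends on the API only and works with or without an SDK.
def fetch_user(user_id):
    with get_tracer().start_span("fetch_user"):
        return {"id": user_id}
```

Calling `fetch_user` before `set_tracer` records nothing; after `set_tracer(RecordingTracer())` the same library code produces spans without being changed.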
The OpenTelemetry Collector
The Collector is one of the two most heavily weighted domains. Understand:
- Receivers — accept telemetry (OTLP, Prometheus, Jaeger, Zipkin, log file)
- Processors — transform, batch, filter, sample (batch, memory_limiter, attributes, tail_sampling)
- Exporters — send to backends (OTLP, Prometheus remote write, AWS X-Ray, Tempo, Loki)
- Connectors — link pipelines together (e.g., spanmetrics generates metrics from traces)
- Extensions — non-pipeline capabilities (health_check, pprof)
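To see how these component types fit together, here is a minimal sketch of a Collector configuration, written as a Python dict that mirrors the layout of the YAML file. The endpoint `gateway.example.internal:4317` and the `limit_mib` value are placeholders, and `undefined_components` is a hypothetical helper for this example, not part of the Collector.

```python
# Top-level sections define components; service.pipelines wires them together.
config = {
    "receivers": {"otlp": {"protocols": {"grpc": {}, "http": {}}}},
    "processors": {
        "memory_limiter": {"check_interval": "1s", "limit_mib": 512},
        "batch": {},
    },
    "exporters": {"otlp": {"endpoint": "gateway.example.internal:4317"}},
    "extensions": {"health_check": {}},
    "service": {
        "extensions": ["health_check"],
        "pipelines": {
            "traces": {
                "receivers": ["otlp"],
                "processors": ["memory_limiter", "batch"],
                "exporters": ["otlp"],
            }
        },
    },
}


def undefined_components(cfg):
    """Return (pipeline, kind, name) for each pipeline entry with no definition."""
    connectors = set(cfg.get("connectors", {}))
    missing = []
    for pipeline, spec in cfg["service"]["pipelines"].items():
        for kind in ("receivers", "processors", "exporters"):
            defined = set(cfg.get(kind, {}))
            if kind != "processors":
                # A connector acts as an exporter in one pipeline
                # and as a receiver in another.
                defined |= connectors
            for name in spec.get(kind, []):
                if name not in defined:
                    missing.append((pipeline, kind, name))
    return missing
```

A component that is defined but not referenced in any pipeline is simply unused, while a pipeline that references an undefined component is a configuration error, which is the case the helper reports.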
Deployment Patterns
| Pattern | When to use |
|---|---|
| Agent (per-host or sidecar) | Close-to-source collection, low-latency local buffering |
| Gateway (cluster-level) | Centralised processing, sampling, multi-backend fanout |
| Agent + Gateway | Most common production pattern |
Sampling Strategies
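- Head sampling: the decision is made at trace start; simple and cheap, but it can drop rare interesting traces (errors, slow requests) because the outcome is not yet known
- Tail sampling: the decision is made after the trace completes; retains errors and high-latency traces but needs more memory, since spans must be buffered until the decision
- Probabilistic vs deterministic: keeping a random fraction of traces vs deriving the decision from a consistent hash (typically of the trace ID), so every component makes the same choice for a given trace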
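The difference between the strategies can be sketched in a few lines of stdlib Python. This is an illustration under stated assumptions, not the SDK's or the Collector's implementation: `head_sample` and `tail_keep` are hypothetical names, spans are plain dicts with `trace_id`, `status`, and `duration_ms` keys, and the 500 ms threshold and 1% baseline are arbitrary example values.

```python
def head_sample(trace_id_hex: str, ratio: float) -> bool:
    """Deterministic head decision: compare the low 64 bits of the trace ID
    against ratio * 2**64. The same trace ID always yields the same answer,
    so independent services agree without coordinating."""
    bound = int(ratio * (1 << 64))
    low_bits = int(trace_id_hex, 16) & ((1 << 64) - 1)
    return low_bits < bound


def tail_keep(spans, latency_threshold_ms=500.0, baseline_ratio=0.01) -> bool:
    """Tail decision over a completed trace: keep every trace with an error
    or a slow span, plus a small deterministic share of normal traffic."""
    if any(span["status"] == "ERROR" for span in spans):
        return True
    if max(span["duration_ms"] for span in spans) >= latency_threshold_ms:
        return True
    return head_sample(spans[0]["trace_id"], baseline_ratio)
```

Note that `tail_keep` needs the whole trace in hand before it can answer, which is where the extra memory cost of tail sampling comes from.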
Semantic Conventions
OTel's semantic conventions standardise attribute names (e.g., http.request.method, service.name, k8s.pod.uid). They are why a Datadog dashboard can interpret data from code that was originally instrumented to export to Tempo: both sides agree on what each attribute is called. Know that conventions are versioned and that breaking changes are rare and clearly signalled.
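As a small illustration of versioned conventions, the sketch below maps a few older HTTP attribute names to their current equivalents. The mapping covers only three renames and is not a complete migration table; `LEGACY_TO_CURRENT` and `upgrade_attributes` are hypothetical names for this example.

```python
# A few HTTP attribute renames from the stabilisation of the HTTP conventions.
LEGACY_TO_CURRENT = {
    "http.method": "http.request.method",
    "http.status_code": "http.response.status_code",
    "http.url": "url.full",
}


def upgrade_attributes(attributes: dict) -> dict:
    """Return a copy of the attributes with known legacy keys renamed.
    Keys that are not in the mapping pass through unchanged."""
    return {LEGACY_TO_CURRENT.get(key, key): value for key, value in attributes.items()}


old_span_attributes = {
    "http.method": "GET",
    "http.status_code": 200,
    "service.name": "checkout",
}
new_span_attributes = upgrade_attributes(old_span_attributes)
```

A dashboard that queries `http.request.method` would miss spans still emitting `http.method`, which is why convention versions matter in practice.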
Context Propagation
- W3C Trace Context (traceparent, tracestate) is the default
- B3 propagation supported for Zipkin interop
- Baggage carries cross-cutting key/values across service boundaries
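The traceparent header has a fixed layout in version 00: version, trace ID, parent span ID, and flags, separated by hyphens. The following is a minimal stdlib sketch of parsing and formatting it; `SpanContext`, `parse_traceparent`, and `format_traceparent` are hypothetical names, and the sketch handles the version 00 layout only.

```python
import re
from typing import NamedTuple, Optional

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class SpanContext(NamedTuple):
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[SpanContext]:
    """Parse a traceparent value; return None if it is malformed."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff is forbidden, and all-zero IDs are invalid.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    # The lowest flag bit marks the trace as sampled.
    return SpanContext(trace_id, parent_id, bool(int(flags, 16) & 0x01))


def format_traceparent(ctx: SpanContext) -> str:
    """Build the header value a client would send downstream."""
    flags = "01" if ctx.sampled else "00"
    return f"00-{ctx.trace_id}-{ctx.parent_id}-{flags}"
```

In a real service the propagator does this work: it extracts the context from incoming request headers and injects a new header, carrying the current span's ID as the parent, into outgoing requests.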
4-Week Study Plan
| Week | Focus | Hands-on task |
|---|---|---|
| 1 | Observability primer, three signals, OTel architecture | Instrument a simple Python/Node app with OTel SDK |
| 2 | Collector: receivers, processors, exporters, pipelines | Build a Collector config with OTLP → batch → OTLP gateway |
| 3 | Sampling strategies, semantic conventions, resource detection | Configure tail sampling on errors + 1% of normal traffic |
| 4 | Auto-instrumentation, K8s deployment patterns, practice exams | Deploy OTel Operator on KIND, auto-inject into pods |
Recommended Free Resources
- opentelemetry.io/docs — the canonical reference; the "Concepts" and "Collector" sections cover most exam content
- The OpenTelemetry Demo (otel-demo) — a polyglot microservice app with full OTel instrumentation; clone and run locally
- Honeycomb's OpenTelemetry guide — vendor-flavoured but technically excellent
- CNCF Webinars on OpenTelemetry — recorded sessions on collector design and sampling
- LFS148: Introduction to OpenTelemetry — free Linux Foundation course mapped to OTCA
Common Pitfalls
- Treating logs as a first-class OTel signal everywhere — logs reached stability later than traces and metrics; some SDKs still vary in support maturity
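- Confusing Prometheus exposition format with OTel metrics format: Prometheus is supported as a receiver, but OTel's native data model is different. OTel metrics can use delta or cumulative aggregation temporality, while Prometheus counters are cumulative, so conversions between the two matter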
- Assuming the Collector replaces a backend — it doesn't store telemetry, it routes and transforms it
- Mixing up service.name (required resource attribute) with service.instance.id (unique per instance)
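For the delta vs cumulative pitfall above, a short stdlib sketch shows the same counter expressed both ways. The function names are hypothetical, and the reset handling is a simplifying assumption: a drop in a cumulative value is treated as a restart from zero.

```python
from itertools import accumulate


def deltas_to_cumulative(deltas):
    """Delta temporality reports the change per interval;
    cumulative reports the running total since the start."""
    return list(accumulate(deltas))


def cumulative_to_deltas(totals):
    """Recover per-interval changes from running totals.
    A value lower than its predecessor is assumed to be a counter reset,
    so the new value itself is taken as the change for that interval."""
    deltas = []
    previous = 0
    for total in totals:
        deltas.append(total if total < previous else total - previous)
        previous = total
    return deltas


requests_per_interval = [5, 3, 0, 7]
running_total = deltas_to_cumulative(requests_per_interval)
```

Here `running_total` is `[5, 8, 8, 15]`, and `cumulative_to_deltas(running_total)` returns the original per-interval values.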
Should You Take OTCA?
OTCA is a 4–6 week part-time effort and a strong addition to a platform engineering, SRE, or backend developer resume. It pairs naturally with PCA (Prometheus Certified Associate) for a complete observability story, and with CKA for the full platform stack.
If you already instrument applications with OTel at work, you can probably pass with 2–3 weekends of focused study. If OTel is brand new, plan for the full 4 weeks and build the demo app to internalise the data flow.