6 min read·Lesson 9 of 10

Async, Threads, and Parallelism

Use asyncio, threads, and processes to speed up Python — and know which to pick for I/O-bound vs CPU-bound work.

Python performance is rarely about Python being slow — it's about waiting on the network. This lesson covers the three concurrency models you'll actually use and which one to pick for which problem.

The Decision

| Workload | Tool | Why |
| --- | --- | --- |
| Many slow HTTP/DB/cloud calls | asyncio + httpx / aiobotocore | Single thread, no GIL contention, scales to thousands of connections |
| Same as above but only sync libraries available | ThreadPoolExecutor | Threads release the GIL while waiting on I/O |
| Heavy CPU work (image processing, hashing, ML) | ProcessPoolExecutor or numpy | Separate processes bypass the GIL; numpy runs the heavy loops in compiled code |

Most DevOps work is I/O-bound — async is your default.

Threads for I/O-Bound Sync Code

from concurrent.futures import ThreadPoolExecutor, as_completed
import requests

urls = ["https://example.com/" + path for path in ["a", "b", "c", "d", "e"]]

def fetch(url: str) -> tuple[str, int]:
    resp = requests.get(url, timeout=10)
    return url, resp.status_code

with ThreadPoolExecutor(max_workers=10) as pool:
    futures = [pool.submit(fetch, url) for url in urls]
    for future in as_completed(futures):
        url, status = future.result()
        print(url, status)

Python's GIL (Global Interpreter Lock) prevents two threads from running Python bytecode at the same time, but threads do release the GIL while waiting on I/O — which is exactly when we want concurrency.
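To see the overlap without touching the network, here is a minimal sketch that uses time.sleep as a stand-in for a blocking I/O call (like a socket read, it releases the GIL while waiting). The function name fake_io and the 0.2 s delay are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(n: int) -> int:
    # time.sleep releases the GIL, much like a blocking socket read would.
    time.sleep(0.2)
    return n * n

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=5) as pool:
    # pool.map yields results in input order, unlike as_completed.
    squares = list(pool.map(fake_io, range(5)))
elapsed = time.perf_counter() - start

print(squares)
print(f"{elapsed:.2f}s with threads; five sequential calls would take about 1.0s")
```

Five waits of 0.2 s each finish in roughly the time of one, because the threads spend that time blocked rather than holding the GIL.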

Processes for CPU-Bound Work

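import hashlib
from concurrent.futures import ProcessPoolExecutor

def hash_file(path: str) -> str:
    # Read in chunks so large files never sit fully in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

paths = ["a.bin", "b.bin", "c.bin", "d.bin"]

# The guard matters where workers are started with "spawn" (the default on
# Windows and macOS): each worker re-imports this module on startup.
if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        for path, digest in zip(paths, pool.map(hash_file, paths)):
            print(path, digest)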

Each worker runs in its own Python process, which gives true parallelism across CPU cores. The cost: starting a process is heavier than starting a thread, and both the arguments and the return values must be picklable.
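One way to check the picklable constraint before submitting work is to try pickling the value yourself. The helper is_picklable below is hypothetical, not part of concurrent.futures; it is a rough pre-flight check, assuming the pool uses standard pickling.

```python
import pickle
import threading

def is_picklable(obj: object) -> bool:
    """Return True if pickle.dumps accepts obj (hypothetical helper)."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True

print(is_picklable("a.bin"))                         # plain data is fine
print(is_picklable({"path": "a.bin", "size": 123}))  # so are dicts of plain data
print(is_picklable(threading.Lock()))                # locks cannot be pickled
```

In practice this usually means passing paths, IDs, and plain data to workers, and opening files, sockets, or clients inside the worker function.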

Note: Python 3.13 introduced an experimental free-threaded build that lets you turn the GIL off; the standard build still has it.

asyncio Basics

import asyncio

async def hello(name: str) -> str:
    await asyncio.sleep(1)
    return f"hello, {name}"

async def main():
    # Sequential — total ~3s
    a = await hello("a")
    b = await hello("b")
    c = await hello("c")

    # Concurrent — total ~1s
    results = await asyncio.gather(
        hello("a"),
        hello("b"),
        hello("c"),
    )
    print(results)

asyncio.run(main())

Key vocabulary:

  • Coroutine: a function defined with async def. Calling it returns a coroutine object; nothing runs until that object is awaited or scheduled.
  • await: pauses the current coroutine until the awaited thing finishes, letting the event loop run other work in the meantime.
  • Event loop: single-threaded scheduler that runs coroutines. asyncio.run(main()) starts one, runs main, closes it.
  • asyncio.gather: schedules many coroutines concurrently, waits for all of them, and returns their results in argument order.
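The vocabulary above can be seen in a few lines. This is a small sketch with made-up names (double, demo); the short sleep only exists to give the event loop something to wait on.

```python
import asyncio
import inspect

async def double(x: int) -> int:
    await asyncio.sleep(0.01)   # yield control to the event loop
    return x * 2

async def demo() -> list[int]:
    coro = double(1)                    # a coroutine object; nothing has run yet
    assert inspect.iscoroutine(coro)
    first = await coro                  # now it runs to completion
    rest = await asyncio.gather(double(2), double(3))  # results in argument order
    return [first, *rest]

results = asyncio.run(demo())
print(results)
```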

Async HTTP with httpx

import asyncio
import httpx

async def fetch(client: httpx.AsyncClient, url: str) -> tuple[str, int]:
    resp = await client.get(url, timeout=10)
    return url, resp.status_code

async def main(urls: list[str]) -> None:
    # http2=True needs the optional extra: pip install "httpx[http2]"
    async with httpx.AsyncClient(http2=True) as client:
        results = await asyncio.gather(*(fetch(client, u) for u in urls))
    for url, status in results:
        print(url, status)

asyncio.run(main(["https://a.example", "https://b.example", "https://c.example"]))

Hundreds of requests can be in flight on one thread, with far less memory and context-switching overhead than one thread per request.
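The single-thread claim can be checked without a network. The sketch below replaces the HTTP call with asyncio.sleep (an assumption made so the example runs anywhere) and records which thread each coroutine ran on.

```python
import asyncio
import threading
import time

async def fake_request(i: int) -> tuple[int, int]:
    await asyncio.sleep(0.1)            # stand-in for waiting on a response
    return i, threading.get_ident()     # record which thread ran this coroutine

async def run_all(n: int) -> list[tuple[int, int]]:
    return await asyncio.gather(*(fake_request(i) for i in range(n)))

start = time.perf_counter()
results = asyncio.run(run_all(500))
elapsed = time.perf_counter() - start

thread_ids = {tid for _, tid in results}
print(f"{len(results)} waits on {len(thread_ids)} thread in {elapsed:.2f}s")
```

All 500 waits overlap on the same thread, so the total time stays close to a single 0.1 s wait rather than 50 s.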

Bounding Concurrency with a Semaphore

Unbounded gather on a list of 50 000 URLs will exhaust file descriptors and get you rate-limited. Bound it:

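import asyncio
import httpx

async def main(urls: list[str]) -> list[httpx.Response]:
    sem = asyncio.Semaphore(50)

    # One shared client, so connections are pooled and reused.
    async with httpx.AsyncClient(timeout=10) as client:

        async def fetch_bounded(url: str) -> httpx.Response:
            async with sem:  # at most 50 coroutines get past this line at once
                return await client.get(url)

        return await asyncio.gather(*(fetch_bounded(u) for u in urls))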

Now at most 50 requests are in flight at any moment.
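To convince yourself the bound holds, you can count concurrent holders of the semaphore. This is an offline sketch: asyncio.sleep stands in for the HTTP call, and run_bounded, job, and the limit of 3 are illustrative choices.

```python
import asyncio

async def run_bounded(n_tasks: int, limit: int) -> int:
    sem = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def job() -> None:
        nonlocal in_flight, peak
        async with sem:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)   # stand-in for the HTTP call
            in_flight -= 1

    await asyncio.gather(*(job() for _ in range(n_tasks)))
    return peak

peak = asyncio.run(run_bounded(n_tasks=20, limit=3))
print("peak concurrency:", peak)
```

The counters need no lock here because coroutines only switch at await points, and all of them run on one thread.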

Mixing Sync and Async

If you call a blocking function inside async def, you stall the event loop and freeze every other coroutine. Prefer an async-native library where one exists; otherwise, hand the blocking call to a thread:

import asyncio
import time

def sync_work():
    time.sleep(2)        # blocks
    return 42

async def main():
    # WRONG — blocks the loop for 2 seconds
    # result = sync_work()

    # RIGHT — run in a thread, await its completion
    result = await asyncio.to_thread(sync_work)
    print(result)

asyncio.run(main())

asyncio.to_thread (3.9+) hands work off to the default thread pool and returns an awaitable.
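A quick way to see that the loop stays responsive is to run a background "heartbeat" coroutine while the blocking call sits in a thread. The names blocking_work and heartbeat and the timings are assumptions for this sketch.

```python
import asyncio
import time

def blocking_work() -> int:
    time.sleep(0.3)                     # blocks its own thread only
    return 42

async def heartbeat(ticks: list[float]) -> None:
    while True:
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.05)

async def demo() -> tuple[int, int]:
    ticks: list[float] = []
    beat = asyncio.create_task(heartbeat(ticks))
    result = await asyncio.to_thread(blocking_work)
    beat.cancel()                       # stop the background task
    try:
        await beat
    except asyncio.CancelledError:
        pass                            # expected: we cancelled it ourselves
    return result, len(ticks)

result, tick_count = asyncio.run(demo())
print(result, tick_count)
```

If blocking_work() were called directly inside demo(), the heartbeat would record no ticks during those 0.3 seconds.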

Cancellation and Timeouts

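import asyncio

async def long_running_task():
    await asyncio.sleep(60)  # stand-in for real work

async def main():
    try:
        async with asyncio.timeout(5):
            await long_running_task()
    except TimeoutError:  # asyncio.TimeoutError is the same class on 3.11+
        print("took longer than 5 seconds")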

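asyncio.timeout (3.11+) cancels whatever is running inside the block when the deadline passes, then raises TimeoutError. On older versions, asyncio.wait_for(coro, timeout=5) does the same job for a single awaitable. Coroutines should handle cancellation politely: clean up resources in try/finally.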
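Here is a minimal sketch of cleanup under cancellation. It cancels the task directly with task.cancel() rather than through a timeout, so it also runs on versions before 3.11; the worker and the events list are made up for illustration.

```python
import asyncio

events: list[str] = []

async def worker() -> None:
    events.append("start")
    try:
        await asyncio.sleep(10)         # cancellation is delivered at this await
    finally:
        events.append("cleanup")        # runs even when the task is cancelled

async def main() -> None:
    task = asyncio.create_task(worker())
    await asyncio.sleep(0.01)           # give the worker time to start
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        events.append("cancelled")

asyncio.run(main())
print(events)
```

The finally block is where you would close files, release locks, or roll back partial work.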

Common Pitfalls

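  • Forgetting await. foo() returns a coroutine object that does nothing on its own. Python emits a "coroutine ... was never awaited" RuntimeWarning, and linters and type checkers can flag it earlier.
  • Sharing one client across event loops. Async clients hold connections bound to the loop that opened them, so create the client inside async def main(), not as a module-level global.
  • Calling blocking code unawares. Even requests.get or a CPU-heavy JSON parse stalls the loop. Profile, then move blocking I/O to asyncio.to_thread and CPU-heavy work to a process pool.
  • Swallowing asyncio.CancelledError. Since Python 3.8 it derives from BaseException, so except Exception lets it through, but a bare except: or except BaseException catches it. Re-raise it, or put except asyncio.CancelledError: raise first.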
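How a cancellation interacts with except clauses depends on the Python version. The following check assumes Python 3.8 or later, where asyncio.CancelledError is a BaseException subclass; the coroutine names are illustrative.

```python
import asyncio

async def stubborn() -> str:
    try:
        await asyncio.sleep(10)
    except Exception:
        # Not reached on cancellation in 3.8+: CancelledError is a BaseException.
        return "swallowed"
    return "finished"

async def demo() -> str:
    task = asyncio.create_task(stubborn())
    await asyncio.sleep(0.01)           # let the task reach its sleep
    task.cancel()
    try:
        return await task
    except asyncio.CancelledError:
        return "cancelled"

outcome = asyncio.run(demo())
print(outcome)
print(issubclass(asyncio.CancelledError, Exception))
```

Changing except Exception to a bare except: in stubborn() would make the task return "swallowed" and ignore the cancellation.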

When Not to Bother

If your script only makes one or two API calls, you don't need any of this. Concurrency adds complexity; introduce it when you measure a real wall-clock problem worth solving.
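Measuring can be as simple as wrapping the suspect section in a timer. The timed context manager below is a hypothetical helper built on time.perf_counter, and the sleeps stand in for three sequential API calls.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label: str, results: dict[str, float]):
    """Record wall-clock seconds for the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start

timings: dict[str, float] = {}
with timed("sequential", timings):
    for _ in range(3):
        time.sleep(0.05)                # stand-in for three API calls

print({k: round(v, 2) for k, v in timings.items()})
```

If the measured time is small next to the rest of the script, the sequential version is probably fine as it is.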

Key Takeaways

  • Use asyncio for I/O-bound concurrency: HTTP, DNS, cloud APIs.
  • Use ThreadPoolExecutor when you need parallel I/O but the libraries are sync.
  • Use ProcessPoolExecutor for CPU-bound work to bypass the GIL.
  • asyncio.gather runs many coroutines concurrently; bound it with a Semaphore to avoid floods.
  • Don’t mix sync and async carelessly — calling a blocking function from async code stalls everything.
