FastAPI Lifespan vs Startup/Shutdown: A Migration Guide That Won't Silently Break

A FastAPI app booted, served traffic, and crashed three hours later because its database pool was never initialized. The codebase had a perfectly valid @app.on_event("startup") handler that opened the pool. It also had a lifespan parameter on the FastAPI() constructor. Only one of those ran. Starlette picked the lifespan and silently dropped every on_event handler in the file.

That is the trap. It is also why the FastAPI docs flag on_startup and on_shutdown as deprecated and direct everyone toward the lifespan async context manager. This article walks through the migration, then handles the cases that the official one-paragraph migration note skips over: exception propagation, stacking multiple lifespans, and the coexistence rule that makes the old API fail quietly.

Why on_startup and on_shutdown are being retired

Starlette has supported the ASGI lifespan protocol since 0.13. The protocol gives the server a way to send lifespan.startup and lifespan.shutdown messages around the request-serving phase. FastAPI originally exposed two helpers, app.add_event_handler("startup", ...) and the @app.on_event("startup") decorator, that wrapped those messages with a list of plain async callables.
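To make the message flow concrete, here is a minimal sketch of the protocol with no framework involved. The `app` callable and the `drive` helper are hypothetical stand-ins written for this illustration: `drive` plays the part a server such as uvicorn would normally play, and only the message type strings come from the ASGI lifespan specification.

```python
import asyncio


async def app(scope, receive, send):
    """Bare ASGI callable that handles only the lifespan scope."""
    if scope["type"] != "lifespan":
        return
    while True:
        message = await receive()
        if message["type"] == "lifespan.startup":
            # Resource setup would happen here, before acknowledging.
            await send({"type": "lifespan.startup.complete"})
        elif message["type"] == "lifespan.shutdown":
            # Resource cleanup would happen here, before acknowledging.
            await send({"type": "lifespan.shutdown.complete"})
            return


async def drive(asgi_app):
    """Hypothetical stand-in for the server: send startup, then shutdown."""
    inbox = asyncio.Queue()
    replies = []

    async def send(message):
        replies.append(message["type"])

    await inbox.put({"type": "lifespan.startup"})
    await inbox.put({"type": "lifespan.shutdown"})
    await asgi_app({"type": "lifespan"}, inbox.get, send)
    return replies


replies = asyncio.run(drive(app))
```

Both the on_event lists and the lifespan context manager are ways of filling in the two commented slots; they differ in how setup and cleanup are tied together.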

That worked for a single resource. The moment a project needed three (a database pool, a Redis client, and a background task scheduler), the implicit ordering and lack of cleanup symmetry started to bite. If the second startup handler raised, the third never ran, and because shutdown handlers are skipped after a failed startup, whatever the first handler had opened was never closed. Resource leaks at boot are awkward; resource leaks at shutdown are silent.

Python's contextlib.asynccontextmanager already solves this exact shape of problem. One try/yield/finally block expresses setup, the live phase, and cleanup as a single unit. FastAPI 0.93 introduced the lifespan parameter that accepts such a context manager, and the official events documentation now treats on_event as a legacy shim retained for backwards compatibility.
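The shape is easy to see outside FastAPI. The following is a small sketch using only the standard library; the `resource` function and the `events` list are made up for the illustration. It shows that the code after `finally` runs even when the body of the `async with` raises.

```python
import asyncio
from contextlib import asynccontextmanager

events = []


@asynccontextmanager
async def resource(name):
    events.append(f"open {name}")        # setup
    try:
        yield name                       # live phase runs while suspended here
    finally:
        events.append(f"close {name}")   # cleanup, even on error


async def main():
    try:
        async with resource("pool"):
            raise RuntimeError("live-phase failure")
    except RuntimeError:
        events.append("error seen by caller")


asyncio.run(main())
```

The cleanup entry lands in `events` before the caller sees the error, which is the guarantee the lifespan style leans on.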

The minimal migration

Here is the before-and-after for a common shape: an HTTP client that needs to be open for the lifetime of the app.

from fastapi import FastAPI
import httpx

app = FastAPI()

@app.on_event("startup")
async def open_client():
    app.state.client = httpx.AsyncClient(timeout=10.0)

@app.on_event("shutdown")
async def close_client():
    await app.state.client.aclose()

The same code in the lifespan style:

from contextlib import asynccontextmanager
from fastapi import FastAPI
import httpx

@asynccontextmanager
async def lifespan(app: FastAPI):
    app.state.client = httpx.AsyncClient(timeout=10.0)
    try:
        yield
    finally:
        await app.state.client.aclose()

app = FastAPI(lifespan=lifespan)

Two structural wins fall out of this. Cleanup happens inside finally, so it runs even if shutdown is triggered by an unhandled exception during the live phase. The dependency between "client was opened" and "client should be closed" is now lexical, not split across two top-level decorators that could drift apart during a refactor.

For multiple resources, nest more context managers or use contextlib.AsyncExitStack:

from contextlib import AsyncExitStack, asynccontextmanager
from fastapi import FastAPI
import httpx
import redis.asyncio as redis

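@asynccontextmanager
async def lifespan(app: FastAPI):
    async with AsyncExitStack() as stack:
        # httpx.AsyncClient is itself an async context manager,
        # so the stack can enter it and close it on exit.
        app.state.http = await stack.enter_async_context(
            httpx.AsyncClient(timeout=10.0)
        )
        # from_url() is synchronous; register the close explicitly.
        app.state.cache = redis.Redis.from_url("redis://localhost:6379")
        stack.push_async_callback(app.state.cache.aclose)
        yield

app = FastAPI(lifespan=lifespan)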

AsyncExitStack runs cleanups in reverse order of registration, which matches what most operators want: tear down the highest-level resource first, then walk down to the connection pools.
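A quick way to convince yourself of the ordering is to record it. This sketch uses placeholder resources (the `resource` helper and its names are illustrative, not real clients) and only the standard library.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

order = []


@asynccontextmanager
async def resource(name):
    order.append(f"open {name}")
    try:
        yield
    finally:
        order.append(f"close {name}")


async def main():
    async with AsyncExitStack() as stack:
        await stack.enter_async_context(resource("http"))
        await stack.enter_async_context(resource("cache"))
        order.append("serving")
    # Leaving the stack unwinds the registered resources last-in, first-out.


asyncio.run(main())
```

The cache, opened last, is closed first; the HTTP client, opened first, is closed last.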

Exception propagation before yield

The first non-obvious difference between the two APIs is what happens when setup fails.

In the on_event model, a handler that raises propagates the exception out of Starlette's Router.startup(). The ASGI server (uvicorn, hypercorn) sees the failure during the lifespan handshake and refuses to start serving requests. Logging is decent: the traceback hits stderr, the process exits non-zero, and a container orchestrator notices the failed health check.

The lifespan model behaves the same way before the yield, but the failure mode looks different in tracebacks. Consider this:

@asynccontextmanager
async def lifespan(app: FastAPI):
    pool = await create_pool()  # raises OperationalError on bad DSN
    app.state.pool = pool
    try:
        yield
    finally:
        await pool.close()

If create_pool() raises, the try/finally is never entered, so pool.close() is not called. That is correct: there is nothing to close because pool was never bound. The exception propagates up through the ASGI lifespan handshake and the server exits.
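The same reasoning extends to several resources. The sketch below is a standard-library illustration with invented names (`resource`, the `fail` flag, and the `log` list are not part of any library): the second resource fails to open, so only the first one, which was actually entered, gets closed, and the failure still reaches whoever tried to start the app.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

log = []


@asynccontextmanager
async def resource(name, fail=False):
    if fail:
        raise ConnectionError(f"cannot open {name}")
    log.append(f"open {name}")
    try:
        yield
    finally:
        log.append(f"close {name}")


@asynccontextmanager
async def lifespan(app):
    async with AsyncExitStack() as stack:
        await stack.enter_async_context(resource("pool"))
        await stack.enter_async_context(resource("cache", fail=True))
        log.append("serving")  # never reached in this run
        yield


async def boot():
    # Stand-in for the server entering the lifespan at startup.
    try:
        async with lifespan(app=None):
            pass
    except ConnectionError as exc:
        log.append(f"startup failed: {exc}")


asyncio.run(boot())
```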

The trap appears when developers, trying to be defensive, wrap the setup in a try block that swallows the exception:

@asynccontextmanager
async def lifespan(app: FastAPI):
    try:
        app.state.pool = await create_pool()
    except Exception:
        app.state.pool = None  # don't crash the app
    yield

The app now boots. Every request that touches app.state.pool will crash with AttributeError: 'NoneType' object has no attribute 'acquire', the health check returns 200 because it never hits the database, and the only signal of trouble is a flood of 500s in the logs five minutes later when traffic ramps. The honest pattern is to let the exception propagate. If the database is unreachable at boot, that is information the orchestrator needs to act on, not a state the app should normalize.

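The same logic applies to cleanup. If pool.close() raises inside finally, the exception is not hidden: Starlette reports it to the server as a failed shutdown and re-raises it, and a server such as uvicorn logs the error while exiting. What does get lost is any cleanup that was meant to run after the failing line in the same finally block. Keep cleanup code simple, and surface anything you care about before reaching finally.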
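One way to keep a failing cleanup from starving the others is to register each cleanup separately on an AsyncExitStack, which keeps unwinding after a callback raises and re-raises at the end. A minimal sketch, with hypothetical `close_pool` and `close_cache` coroutines standing in for real clients:

```python
import asyncio
from contextlib import AsyncExitStack

log = []


async def close_pool():
    log.append("pool closed")


async def close_cache():
    raise RuntimeError("cache close failed")


async def shutdown():
    try:
        async with AsyncExitStack() as stack:
            stack.push_async_callback(close_pool)    # registered first, runs last
            stack.push_async_callback(close_cache)   # registered last, runs first
    except RuntimeError as exc:
        # The failure still propagates after every callback has run.
        log.append(f"reported: {exc}")


asyncio.run(shutdown())
```

The pool is closed even though the cache cleanup raised first, and the error is still reported rather than dropped.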

Stacking multiple lifespans

Once an app grows beyond one team, having a single lifespan() function for the whole codebase becomes a merge-conflict magnet. The fastapi-lifespan-manager library, maintained at github.com/uriyyo/fastapi-lifespan-manager, solves this by letting each module register its own lifespan and composing them at app construction time.

from fastapi import FastAPI
from fastapi_lifespan_manager import LifespanManager

manager = LifespanManager()

@manager.add
async def db_lifespan(app: FastAPI):
    app.state.pool = await create_pool()
    try:
        yield
    finally:
        await app.state.pool.close()

@manager.add
async def cache_lifespan(app: FastAPI):
    app.state.cache = await connect_cache()
    try:
        yield
    finally:
        await app.state.cache.aclose()

app = FastAPI(lifespan=manager)

Registration order matters: db_lifespan runs setup first, then cache_lifespan, then the app serves. Shutdown runs in reverse. That ordering guarantee is what lets one module assume another module's resource is already available.

A common pattern is to define a per-feature lifespan next to the routes it touches, then have the composition root import each manager and combine them. The same module structure can be done by hand with AsyncExitStack, but the manager handles the boilerplate of attaching child lifespans to a parent app, and survives app.include_router(other_app) mounting in a way that nested context managers do not.

For comparison, an equivalent hand-rolled version of two lifespans without the library:

@asynccontextmanager
async def combined_lifespan(app: FastAPI):
    async with db_cm(app), cache_cm(app):
        yield

That works for a flat two-lifespan case and adds no new dependency. The library starts paying off around three lifespans, when you want each to live in its own module without the composition root importing every internal context manager directly.
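For more than two, the hand-rolled approach can be generalized with a small helper. This is a sketch, not the library's implementation: `compose` is a hypothetical function, and the fake app object exists only so the example runs without FastAPI installed. With a real app, the result would be passed as FastAPI(lifespan=compose(...)).

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager
from types import SimpleNamespace


def compose(*lifespans):
    """Hypothetical helper: merge lifespans, entered in the order given."""

    @asynccontextmanager
    async def combined(app):
        async with AsyncExitStack() as stack:
            for lifespan in lifespans:
                await stack.enter_async_context(lifespan(app))
            yield

    return combined


trace = []


@asynccontextmanager
async def db_lifespan(app):
    app.state.pool = "pool"  # placeholder for a real connection pool
    trace.append("db up")
    try:
        yield
    finally:
        trace.append("db down")


@asynccontextmanager
async def cache_lifespan(app):
    assert app.state.pool == "pool"  # may rely on db_lifespan having run
    trace.append("cache up")
    try:
        yield
    finally:
        trace.append("cache down")


async def main():
    fake_app = SimpleNamespace(state=SimpleNamespace())
    async with compose(db_lifespan, cache_lifespan)(fake_app):
        trace.append("serving")


asyncio.run(main())
```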

The silent-fail coexistence trap

This is the section that motivates the article. Nothing in FastAPI's constructor rejects the combination: if you pass a lifespan argument, the on_startup and on_shutdown lists are still constructed and handlers can still be appended to them, but Starlette's Router uses the lifespan context manager and never iterates those lists.

Concretely:

@asynccontextmanager
async def lifespan(app: FastAPI):
    print("lifespan setup")
    yield
    print("lifespan teardown")

app = FastAPI(lifespan=lifespan)

@app.on_event("startup")
async def legacy_startup():
    print("legacy startup")  # never runs

The legacy_startup print never fires. The handler is registered, the app does not warn about the mismatch, and a contributor who adds @app.on_event("startup") to a file expecting it to run will be confused for an afternoon.

There are two practical defenses:

Pick one style per app and grep for the other. A pre-commit hook that fails on on_event\( in a codebase that has migrated to lifespan= catches new regressions in under 50ms per file.
Migrate every handler in the same PR you introduce lifespan=. Splitting the migration across two PRs is the most common way teams end up with both APIs in the same file.
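The first defense can be a few lines of Python rather than a shell one-liner. The script below is a sketch of such a hook; the function names are invented here, and it assumes the hook runner passes the changed file paths as arguments. It is deliberately naive: a comment containing on_event( would also trip it.

```python
import re
from pathlib import Path

# Same pattern the text suggests grepping for.
LEGACY = re.compile(r"\bon_event\(")


def find_legacy(text):
    """Return (line number, stripped line) for each legacy decorator use."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if LEGACY.search(line)
    ]


def main(paths):
    """Print every hit and return 1 if any file still uses on_event."""
    failed = False
    for path in paths:
        for number, line in find_legacy(Path(path).read_text(encoding="utf-8")):
            print(f"{path}:{number}: legacy event handler: {line}")
            failed = True
    return 1 if failed else 0


# In a hook script, the entry point would be:
#     raise SystemExit(main(sys.argv[1:]))
```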

The opposite mix is safer: if you have not yet introduced lifespan=, the on_event handlers run normally. The trap is one-directional.

A comparison of the two APIs

Concern	`on_startup` / `on_shutdown`	`lifespan` context manager
Order of cleanup	Shutdown handlers run in registration order, with no coupling to startup order	Lexically symmetric to setup via `try`/`finally` or `AsyncExitStack`
Failure mid-setup	Later startup handlers are skipped and shutdown handlers do not run, so anything already opened leaks	Failed setup short-circuits; only cleanups inside an entered `with` run
Multi-module composition	Each module imports the `app` and decorates handlers	Each module exports a context manager; composition root combines them
Coexistence behavior	Works alone	Wins when both are present; the legacy handlers run zero times
Status in FastAPI docs	Deprecated as of 0.93	Recommended since 0.93

The lifespan API is roughly 30% fewer lines for a 3-resource app once AsyncExitStack replaces the paired startup and shutdown decorators, and zero lines longer for a single-resource app. The library option (fastapi-lifespan-manager) shifts that to roughly equal at one resource and 40% fewer at five.

Testing lifespans

A subtle gotcha: the lifespan only runs when something drives the ASGI lifespan protocol, which a real server does on boot. FastAPI's TestClient, which is built on httpx, does not trigger the lifespan unless you use it as a context manager:

from fastapi.testclient import TestClient

def test_pool_is_initialized():
    with TestClient(app) as client:
        response = client.get("/healthz")
        assert response.status_code == 200

Without the with block, the lifespan never fires, and app.state.pool never exists. The asgi-lifespan project documents the underlying contract if you need to test lifespan behavior outside of a FastAPI client.

When migrating, run the test suite with both styles and watch for tests that started failing only after the lifespan move. Those are almost always tests that constructed a TestClient without the context-manager form and were relying on a module-level resource that no longer initializes.
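Because a lifespan is just an async context manager, its setup and cleanup can also be unit-tested directly, without a client or a server. The sketch below assumes a hypothetical `FakePool` class and uses a SimpleNamespace as a stand-in for the app object; a real test would pass the actual FastAPI instance.

```python
import asyncio
from contextlib import asynccontextmanager
from types import SimpleNamespace


class FakePool:
    """Hypothetical pool that only records whether close() was awaited."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


@asynccontextmanager
async def lifespan(app):
    app.state.pool = FakePool()
    try:
        yield
    finally:
        await app.state.pool.close()


async def exercise():
    app = SimpleNamespace(state=SimpleNamespace())  # stand-in for FastAPI()
    async with lifespan(app):
        open_during = not app.state.pool.closed
    return open_during, app.state.pool.closed


open_during_serving, closed_after = asyncio.run(exercise())
```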

What the migration looks like end-to-end

Concretely, the steps that worked across a 12-route service:

Inventory every on_event decorator and every app.add_event_handler call. There are usually fewer than ten.
Group setups that share a resource. Each group becomes one @asynccontextmanager function or one @manager.add entry.
Replace each setup body with the equivalent setup-then-yield-then-cleanup shape. Resist the urge to add new try/except blocks for "safety"; let real failures propagate.
Convert tests to use with TestClient(app) as client:.
Delete the on_event decorators in the same commit.
Add a pre-commit grep that fails if on_event\( ever comes back.
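The inventory in the first step can be automated. The following is a rough sketch using the standard ast module; the `inventory` function is hypothetical and matches on attribute names only, so it would also flag unrelated objects that happen to expose methods called on_event or add_event_handler.

```python
import ast


def inventory(source):
    """Return (line, api, event) for each legacy registration in source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        if node.func.attr in ("on_event", "add_event_handler") and node.args:
            first = node.args[0]
            # The event name is usually a string literal; mark anything else "?".
            event = first.value if isinstance(first, ast.Constant) else "?"
            found.append((node.lineno, node.func.attr, event))
    return sorted(found)


sample = '''
@app.on_event("startup")
async def open_client(): ...

app.add_event_handler("shutdown", close_client)
'''

report = inventory(sample)
```

Grouping the report by event name gives a quick view of which startup handlers still need a matching cleanup when they are folded into a context manager.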

The whole migration for a small service takes a few hours. The payoff is twofold: cleanup actually runs on failure paths, and adding a new resource means editing one file in one place instead of two decorators 200 lines apart.

The framework's direction is clear. Starlette built the lifespan protocol because the ASGI spec required it, FastAPI exposed it because async context managers express resource lifetime better than parallel callback lists, and the deprecation notice in the official docs is a signal that the legacy API will eventually disappear. Moving early avoids both the silent-fail trap and an inevitable larger migration when the deprecation becomes a removal.
