Skip to content

Instantly share code, notes, and snippets.

@nymous
Last active November 15, 2024 19:24
Show Gist options
  • Save nymous/f138c7f06062b7c43c060bf03759c29e to your computer and use it in GitHub Desktop.
Save nymous/f138c7f06062b7c43c060bf03759c29e to your computer and use it in GitHub Desktop.
Logging setup for FastAPI, Uvicorn and Structlog (with Datadog integration)

Logging setup for FastAPI

This logging setup configures Structlog to output pretty logs in development, and JSON log lines in production.

Then, you can use Structlog loggers or standard logging loggers, and they both will be processed by the Structlog pipeline (see the hello() endpoint for reference). That way any log generated by your dependencies will also be processed and enriched, even if they know nothing about Structlog!

Requests are assigned a correlation ID with the asgi-correlation-id middleware (either captured from incoming request or generated on the fly). All logs are linked to the correlation ID, and to the Datadog trace/span if instrumented. This data "global to the request" is stored in context vars, and automatically added to all logs produced during the request thanks to Structlog. You can add to these "global local variables" at any point in an endpoint with structlog.contextvars.bind_contextvars(customer_id=123). Any log sent after this call will contain this attribute, whether it comes from structlog or logging.

The duration of the request is also captured, and all technical information about the request (URL, method, client IP/port, status code) are logged as attributes.

In production, Uvicorn startup logs are silenced because they are printed before we have time to configure JSON logging (and log ingestion systems don't like unformatted logs 😅). Uvicorn standard access logs are disabled, and reimplemented using Structlog (to be formatted in JSON or in the console).

How to run

Install dependencies with Poetry.

Run in development:

uvicorn main:app --reload

and in production (to log in JSON format and instrument the code with Datadog):

LOG_JSON_FORMAT=true DD_TRACE_ENABLED=true ddtrace-run uvicorn main:app --host 0.0.0.0 --log-config uvicorn_disable_logging.json

Note

I will add my references and inspirations soon.

The "duration" logged and returned in the header does not encompass the full duration of the request. The code could be moved to a higher middleware running earlier in the chain, to also capture the duration of "inner" middlewares such as the correlation-id middleware, or others...

If you install Rich or better-exceptions, you will automagically get pretty formatted exception with context, variable values... Try it and see if you like it

There are some TODOs left in the code, you can suggest improvements in the comments 😄 The Datadog middleware is especially horrible, there must be a better way to add it manually. I also need to open an issue in Starlette comparing in which order middlewares are activated in different languages and frameworks, to try and make it more consistent with other ecosystems.

License

This code is available under the MIT license.

import logging
import sys
import structlog
from ddtrace import tracer
from structlog.types import EventDict, Processor
# https://github.com/hynek/structlog/issues/35#issuecomment-591321744
def rename_event_key(_, __, event_dict: EventDict) -> EventDict:
"""
Log entries keep the text message in the `event` field, but Datadog
uses the `message` field. This processor moves the value from one field to
the other.
See https://github.com/hynek/structlog/issues/35#issuecomment-591321744
"""
event_dict["message"] = event_dict.pop("event")
return event_dict
def drop_color_message_key(_, __, event_dict: EventDict) -> EventDict:
"""
Uvicorn logs the message a second time in the extra `color_message`, but we don't
need it. This processor drops the key from the event dict if it exists.
"""
event_dict.pop("color_message", None)
return event_dict
def tracer_injection(_, __, event_dict: EventDict) -> EventDict:
# get correlation ids from current tracer context
span = tracer.current_span()
trace_id, span_id = (span.trace_id, span.span_id) if span else (None, None)
# add ids to structlog event dictionary
event_dict["dd.trace_id"] = str(trace_id or 0)
event_dict["dd.span_id"] = str(span_id or 0)
return event_dict
def setup_logging(json_logs: bool = False, log_level: str = "INFO"):
timestamper = structlog.processors.TimeStamper(fmt="iso")
shared_processors: list[Processor] = [
structlog.contextvars.merge_contextvars,
structlog.stdlib.add_logger_name,
structlog.stdlib.add_log_level,
structlog.stdlib.PositionalArgumentsFormatter(),
structlog.stdlib.ExtraAdder(),
drop_color_message_key,
tracer_injection,
timestamper,
structlog.processors.StackInfoRenderer(),
]
if json_logs:
# We rename the `event` key to `message` only in JSON logs, as Datadog looks for the
# `message` key but the pretty ConsoleRenderer looks for `event`
shared_processors.append(rename_event_key)
# Format the exception only for JSON logs, as we want to pretty-print them when
# using the ConsoleRenderer
shared_processors.append(structlog.processors.format_exc_info)
structlog.configure(
processors=shared_processors
+ [
# Prepare event dict for `ProcessorFormatter`.
structlog.stdlib.ProcessorFormatter.wrap_for_formatter,
],
logger_factory=structlog.stdlib.LoggerFactory(),
cache_logger_on_first_use=True,
)
log_renderer: structlog.types.Processor
if json_logs:
log_renderer = structlog.processors.JSONRenderer()
else:
log_renderer = structlog.dev.ConsoleRenderer()
formatter = structlog.stdlib.ProcessorFormatter(
# These run ONLY on `logging` entries that do NOT originate within
# structlog.
foreign_pre_chain=shared_processors,
# These run on ALL entries after the pre_chain is done.
processors=[
# Remove _record & _from_structlog.
structlog.stdlib.ProcessorFormatter.remove_processors_meta,
log_renderer,
],
)
handler = logging.StreamHandler()
# Use OUR `ProcessorFormatter` to format all `logging` entries.
handler.setFormatter(formatter)
root_logger = logging.getLogger()
root_logger.addHandler(handler)
root_logger.setLevel(log_level.upper())
for _log in ["uvicorn", "uvicorn.error"]:
# Clear the log handlers for uvicorn loggers, and enable propagation
# so the messages are caught by our root logger and formatted correctly
# by structlog
logging.getLogger(_log).handlers.clear()
logging.getLogger(_log).propagate = True
# Since we re-create the access logs ourselves, to add all information
# in the structured log (see the `logging_middleware` in main.py), we clear
# the handlers and prevent the logs to propagate to a logger higher up in the
# hierarchy (effectively rendering them silent).
logging.getLogger("uvicorn.access").handlers.clear()
logging.getLogger("uvicorn.access").propagate = False
def handle_exception(exc_type, exc_value, exc_traceback):
"""
Log any uncaught exception instead of letting it be printed by Python
(but leave KeyboardInterrupt untouched to allow users to Ctrl+C to stop)
See https://stackoverflow.com/a/16993115/3641865
"""
if issubclass(exc_type, KeyboardInterrupt):
sys.__excepthook__(exc_type, exc_value, exc_traceback)
return
root_logger.error(
"Uncaught exception", exc_info=(exc_type, exc_value, exc_traceback)
)
sys.excepthook = handle_exception
import os
import time
# This line needs to be run before any `ddtrace` import, to avoid sending traces
# in local dev environment (we don't have a Datadog agent configured locally, so
# it prints a stacktrace every time it tries to send a trace)
# TODO: Find a better way to activate Datadog traces?
os.environ["DD_TRACE_ENABLED"] = os.getenv("DD_TRACE_ENABLED", "false") # noqa
import structlog
import uvicorn
from asgi_correlation_id import CorrelationIdMiddleware
from asgi_correlation_id.context import correlation_id
from ddtrace.contrib.asgi.middleware import TraceMiddleware
from fastapi import FastAPI, Request, Response
from pydantic import parse_obj_as
from uvicorn.protocols.utils import get_path_with_query_string
from custom_logging import setup_logging
LOG_JSON_FORMAT = parse_obj_as(bool, os.getenv("LOG_JSON_FORMAT", False))
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
setup_logging(json_logs=LOG_JSON_FORMAT, log_level=LOG_LEVEL)
access_logger = structlog.stdlib.get_logger("api.access")
app = FastAPI(title="Example API", version="1.0.0")
@app.middleware("http")
async def logging_middleware(request: Request, call_next) -> Response:
structlog.contextvars.clear_contextvars()
# These context vars will be added to all log entries emitted during the request
request_id = correlation_id.get()
structlog.contextvars.bind_contextvars(request_id=request_id)
start_time = time.perf_counter_ns()
# If the call_next raises an error, we still want to return our own 500 response,
# so we can add headers to it (process time, request ID...)
response = Response(status_code=500)
try:
response = await call_next(request)
except Exception:
# TODO: Validate that we don't swallow exceptions (unit test?)
structlog.stdlib.get_logger("api.error").exception("Uncaught exception")
raise
finally:
process_time = time.perf_counter_ns() - start_time
status_code = response.status_code
url = get_path_with_query_string(request.scope)
client_host = request.client.host
client_port = request.client.port
http_method = request.method
http_version = request.scope["http_version"]
# Recreate the Uvicorn access log format, but add all parameters as structured information
access_logger.info(
f"""{client_host}:{client_port} - "{http_method} {url} HTTP/{http_version}" {status_code}""",
http={
"url": str(request.url),
"status_code": status_code,
"method": http_method,
"request_id": request_id,
"version": http_version,
},
network={"client": {"ip": client_host, "port": client_port}},
duration=process_time,
)
response.headers["X-Process-Time"] = str(process_time / 10 ** 9)
return response
# This middleware must be placed after the logging, to populate the context with the request ID
# NOTE: Why last??
# Answer: middlewares are applied in the reverse order of when they are added (you can verify this
# by debugging `app.middleware_stack` and recursively drilling down the `app` property).
app.add_middleware(CorrelationIdMiddleware)
# UGLY HACK
# Datadog's `TraceMiddleware` is applied as the very first middleware in the list, by patching `FastAPI` constructor.
# Unfortunately that means that it is the innermost middleware, so the trace/span are created last in the middleware
# chain. Because we want to add the trace_id/span_id in the access log, we need to extract it from the middleware list,
# put it back as the outermost middleware, and rebuild the middleware stack.
# TODO: Open an issue in dd-trace-py to ask if it can change its initialization, or if there is an easy way to add the
# middleware manually, so we can add it later in the chain and have it be the outermost middleware.
# TODO: Open an issue in Starlette to better explain the order of middlewares
tracing_middleware = next(
(m for m in app.user_middleware if m.cls == TraceMiddleware), None
)
if tracing_middleware is not None:
app.user_middleware = [m for m in app.user_middleware if m.cls != TraceMiddleware]
structlog.stdlib.get_logger("api.datadog_patch").info(
"Patching Datadog tracing middleware to be the outermost middleware..."
)
app.user_middleware.insert(0, tracing_middleware)
app.middleware_stack = app.build_middleware_stack()
@app.get("/")
def hello():
custom_structlog_logger = structlog.stdlib.get_logger("my.structlog.logger")
custom_structlog_logger.info("This is an info message from Structlog")
custom_structlog_logger.warning("This is a warning message from Structlog, with attributes", an_extra="attribute")
custom_structlog_logger.error("This is an error message from Structlog")
custom_logging_logger = logging.getLogger("my.logging.logger")
custom_logging_logger.info("This is an info message from standard logger")
custom_logging_logger.warning("This is a warning message from standard logger, with attributes", extra={"another_extra": "attribute"})
return "Hello, World!"
if __name__ == "__main__":
uvicorn.run(app, host="127.0.0.1", port=8000, log_config=None)
[tool.poetry]
name = "example-api"
version = "0.1.0"
description = ""
authors = ["Nymous"]
[tool.poetry.dependencies]
python = "^3.10"
fastapi = "^0.78.0"
uvicorn = "^0.18.1"
ddtrace = "^1.2.1"
structlog = "^21.5.0"
asgi-correlation-id = "^3.0.0"
[tool.poetry.dev-dependencies]
black = "^22.3.0"
isort = "^5.10.1"
mypy = "^0.961"
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
{
"version": 1,
"disable_existing_loggers": false,
"formatters": {
"default": {
"()": "uvicorn.logging.DefaultFormatter",
"format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
},
"access": {
"()": "uvicorn.logging.AccessFormatter",
"format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
}
},
"handlers": {
"default": {
"formatter": "default",
"class": "logging.NullHandler"
},
"access": {
"formatter": "access",
"class": "logging.NullHandler"
}
},
"loggers": {
"uvicorn.error": {
"level": "INFO",
"handlers": [
"default"
],
"propagate": false
},
"uvicorn.access": {
"level": "INFO",
"handlers": [
"access"
],
"propagate": false
}
}
}
MIT License
Copyright (c) 2023 Thomas GAUDIN
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
@basilnsage
Copy link

This was incredibly helpful, thanks for sharing!

@ClaytonJY
Copy link

Great stuff. Should this work when including an APIRouter sub-application? I can do extra, non-access logging for endpoints in my main.py, similar to the the hello() function/endpoint here, but I can't seem to do the same within-endpoint logging for endpoints I define in another file with APIRouter 😢

@toadjaune
Copy link

Thanks @nymous, this was a huge help !

Small note, the feature implemented in your custom rename_event_key can now be done natively with structlog.processors.EventRenamer :)

@bnjmn
Copy link

bnjmn commented Jan 8, 2024

Hi! Thanks for sharing this. I think you're missing an import logging in your main.py. Once I added that, everything ran as expected.

@jmy12k3
Copy link

jmy12k3 commented Jan 17, 2024

Thanks bro, it is super helpful!

For anyone looking for similar resources, this open source project might help you to code your structured logs with Uvicorn compatible.

(im not affliated btw) happy coding!

@Sammyjroberts
Copy link

This is one of the best things on github, thank you

@quiqueg
Copy link

quiqueg commented Apr 26, 2024

Heads up that Datadog updated their guide doc for the tracer_injection function:

import ddtrace
from ddtrace import tracer

import structlog

def tracer_injection(logger, log_method, event_dict):
    # get correlation ids from current tracer context
    span = tracer.current_span()
    trace_id, span_id = (str((1 << 64) - 1 & span.trace_id), span.span_id) if span else (None, None)

    # add ids to structlog event dictionary
    event_dict['dd.trace_id'] = str(trace_id or 0)
    event_dict['dd.span_id'] = str(span_id or 0)

    # add the env, service, and version configured for the tracer
    event_dict['dd.env'] = ddtrace.config.env or ""
    event_dict['dd.service'] = ddtrace.config.service or ""
    event_dict['dd.version'] = ddtrace.config.version or ""

    return event_dict

structlog.configure(
    processors=[
        tracer_injection,
        structlog.processors.JSONRenderer()
    ]
)
log = structlog.get_logger()

Notably, the span.trace_id is now 128-bit by default (configurable via DD_TRACE_128_BIT_TRACEID_GENERATION_ENABLED). But, as of writing, Datadog's log<>trace correlation still looks for 64-bit trace IDs in logs.

So the span.trace_id will need to be converted to 64-bit before injecting it into the logs:

str((1 << 64) - 1 & span.trace_id)

@AllanGH
Copy link

AllanGH commented May 1, 2024

Small note, the feature implemented in your custom rename_event_key can now be done natively with structlog.processors.EventRenamer :)

In addition to this, ConsoleRenderer doc shows an option to rename the event key, allowing you to stick with 'message' for both JSON and console logging.

event_key (str) – The key to look for the main log message. Needed when you rename it e.g. using structlog.processors.EventRenamer. Ignored if columns are passed.

@tobiasbueschel
Copy link

tobiasbueschel commented Jul 30, 2024

This is a great example @nymous - thank you! 🙏

I'm curious why you configured the uvicorn.error logger with log level info? Wouldn't the uvicorn.error logger always log with level ERROR? I might miss some context on the inner workings of uvicorn...

https://gist.github.com/nymous/f138c7f06062b7c43c060bf03759c29e#file-uvicorn_disable_logging-json-L25-L31

"uvicorn.error": {
    "level": "INFO",
    "handlers": [
        "default"
    ],
    "propagate": false
},

UPDATE: nevermind, I found my answer in this open issue: encode/uvicorn#562 - I'll leave my comment here in case anyone new to uvicorn also stumbles upon this.

Have a good remaining week :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment