🐍 Flask Python Structured Logging — What Most Miss in Production

The article explains that approximately 80% of Flask applications still use basic `print()` statements or unstructured logging in production, which hinders effective debugging and monitoring despite the availability of modern tools like Datadog and Elasticsearch. It demonstrates how to implement structured JSON logging using Python's built-in `logging` module with a custom `JsonFormatter`, and also highlights the simpler alternative of using the Loguru library, which offers cleaner syntax and native support for structured output through features like contextual binding with `bind()`.

Roughly 80% of Flask applications still rely on basic `print` statements or unstructured `logging.info` calls for observability in production. Despite widespread adoption of modern monitoring tools like Datadog, Loki, and Elasticsearch, most Python web apps ship logs as plain text, which makes debugging slow, filtering unreliable, and alerting brittle. This isn't a legacy issue; it's happening in brand-new Flask services today.

The Python `logging` module is not a thin wrapper around `print`; it's a fully composable system for routing, formatting, and filtering log records based on severity, source, and custom context.

Every log call (e.g., `logger.info("User logged in")`) creates a `LogRecord` object. This record carries metadata (timestamp, filename, line number, function name, log level) before any formatter processes it. That metadata enables deterministic serialization into JSON without context loss.

To emit structured output, replace the default `logging.Formatter` with one that serializes the record.
```python
import json
import logging
import sys

# Attributes every LogRecord carries; anything else was passed via `extra=`.
_STANDARD_ATTRS = set(vars(logging.makeLogRecord({}))) | {"message", "asctime"}


class JsonFormatter(logging.Formatter):
    def format(self, record):
        log_entry = {
            "timestamp": self.formatTime(record, self.datefmt),
            "level": record.levelname,
            "logger": record.name,
            "module": record.module,
            "function": record.funcName,
            "line": record.lineno,
            "message": record.getMessage(),
        }
        # Copy fields supplied through `extra=` into the top level.
        for key, value in vars(record).items():
            if key not in _STANDARD_ATTRS:
                log_entry[key] = value
        if record.exc_info:
            log_entry["exception"] = self.formatException(record.exc_info)
        # default=str keeps non-JSON types (e.g. datetime) from raising.
        return json.dumps(log_entry, default=str)


# Configure the root logger
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
logging.basicConfig(handlers=[handler], level=logging.INFO)

logger = logging.getLogger("flask_app")
```

Now, when you log:

```python
logger.info("User login attempted", extra={"user_id": 123, "ip": "192.168.1.1"})
```

You get:

```json
{"timestamp": "-11-15 14:22:30,123", "level": "INFO", "logger": "flask_app", "module": "auth", "function": "login", "line": 45, "message": "User login attempted", "user_id": 123, "ip": "192.168.1.1"}
```

Keys passed through `extra` become attributes on the `LogRecord` instance. They do not reach the output on their own: the formatter has to copy them, which is what the loop over non-standard attributes does. Once that loop is in place the behavior is consistent and predictable, with no additional configuration needed.

The standard `logging` module requires boilerplate and careful handler management. Loguru reduces that surface area with better defaults, cleaner composition, and native support for structured output.

Its core abstraction is the sink, a generalized destination for log events. Sinks can be streams, files, or callables (for example, a function that forwards to a network endpoint), and each can have its own format, filter, and serialization logic.
Install it:

```
$ pip install loguru
Collecting loguru
  Downloading loguru-0.7.2-py3-none-any.whl (58 kB)
Installing collected packages: loguru
Successfully installed loguru-0.7.2
```

Configure JSON output. A callable passed as `format` must return a format template, so JSON braces in its return value would be misread as placeholders. A sink function that builds the JSON itself avoids that problem (Loguru's built-in `serialize=True` is the zero-code alternative, with its own nested schema):

```python
import json
import sys

from loguru import logger


def json_sink(message):
    # Loguru passes a str subclass whose .record attribute holds the log data.
    record = message.record
    entry = {
        "time": record["time"].isoformat(),
        "level": record["level"].name,
        "message": record["message"],
        "module": record["module"],
        "function": record["function"],
        "line": record["line"],
        **record["extra"],  # bound context and keyword arguments
    }
    sys.stdout.write(json.dumps(entry, default=str) + "\n")


# Remove the default handler
logger.remove()

# Add the JSON sink
logger.add(json_sink, level="INFO")
```

Loguru supports contextual binding via `bind()`:

```python
@app.route("/login", methods=["POST"])
def login():
    user_id = authenticate(request.json)
    if user_id:
        authenticated_logger = logger.bind(user_id=user_id, ip=request.remote_addr)
        authenticated_logger.info("User authenticated")
        return {"status": "ok"}
    else:
        logger.warning("Login failed", ip=request.remote_addr)
        return {"status": "unauthorized"}, 401
```

Output:

```json
{"time": "-11-15T14:25:10.123456+00:00", "level": "INFO", "message": "User authenticated", "module": "app", "function": "login", "line": 23, "user_id": 456, "ip": "192.168.1.1"}
```

`bind()` returns a new logger with the key-value pairs attached; every subsequent call on that returned instance carries them, while the original `logger` is left unchanged. This avoids repetitive `extra` kwargs and reduces error surface.

Structured logging isn't about format; it's about making every log line queryable, filterable, and traceable. In Flask, request-scoped data like trace IDs or user identifiers should appear in all logs for that request without manual pass-through.

Loguru's `contextualize()` builds on Python's `contextvars` to maintain state across async and threaded contexts. In a Flask app, a patcher function can inject request-scoped data into every log record within the request lifecycle.
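```python
from flask import g, has_request_context, request
from loguru import logger


def add_request_context(record):
    # Copy request-scoped fields from flask.g into every record.
    if has_request_context():
        record["extra"].setdefault("trace_id", g.get("trace_id", "unknown"))


# Register the patcher once at startup; it runs on every log call.
logger.configure(patcher=add_request_context)


@app.before_request
def attach_log_context():
    g.trace_id = request.headers.get("X-Trace-ID", "unknown")
```

Calling `logger.bind(...)` inside a hook and discarding the result has no effect, because `bind()` returns a new logger rather than mutating the global one, and Loguru has no `unbind()`. Storing the value on `g` needs no cleanup hook: Flask discards `g` when the request ends.

With the patcher registered, every `logger.info` or `logger.error` call made while handling a request includes the `trace_id` field. This aligns logs across functions and services during incident investigation.

Loguru captures the full stack trace when you call `logger.exception()` inside an `except` block:

```python
try:
    risky_operation()
except Exception:
    logger.exception("Operation failed")
```

Output includes:

```
"exception": "Traceback (most recent call last):\n  File \"app.py\", line 30, in login\n    risky_operation()\n  File \"utils.py\", line 12, in risky_operation\n    raise ValueError('Boom')\nValueError: Boom"
```

For non-critical paths, use the `@logger.catch` decorator:

```python
@logger.catch
def risky_operation():
    return 1 / 0
```

This logs the traceback and, by default, suppresses the exception, so the decorated function returns `None` instead of raising. It is useful for optional processing or background tasks where a failure shouldn't crash the request.

To gain observability at the HTTP layer, capture request metadata (method, path, status, duration) automatically. Use Flask's `before_request` and `after_request` hooks to wrap each incoming request.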
```python
from time import perf_counter

from flask import g, request
from loguru import logger


@app.before_request
def start_timer():
    g.start = perf_counter()


@app.after_request
def log_request(response):
    if "start" not in g:
        return response  # timing was skipped for this request
    duration = perf_counter() - g.start
    # bind() returns a new logger, so log through the returned instance.
    logger.bind(
        method=request.method,
        path=request.path,
        ip=request.remote_addr,
    ).info(
        "Request completed",
        status=response.status_code,
        duration=f"{duration:.4f}s",
        length=response.content_length or "-",
    )
    return response
```

Example output:

```json
{"time": "-11-15T14:30:00.123456+00:00", "level": "INFO", "message": "Request completed", "module": "app", "function": "log_request", "line": 45, "method": "POST", "path": "/login", "ip": "192.168.1.1", "status": 200, "duration": "0.1234s", "length": 15}
```

This adds full request observability without touching application logic.

Health endpoints like `/health` or `/metrics` generate high-volume, low-value logs. Filter them early to reduce noise and storage cost.

Skip timing for known endpoints; because `g.start` is then never set, `log_request` returns early for them:

```python
@app.before_request
def start_timer():
    if request.path in ("/health", "/metrics"):
        return  # no g.start, so log_request skips this request
    g.start = perf_counter()
```

Alternatively, opt out per route with a decorator. Loguru's `disable()` and `enable()` act process-wide on a module name, so they are a poor fit for a single request; marking the view function is safer:

```python
def no_log(func):
    # Mark the view so the after_request hook can skip it.
    func.skip_request_log = True
    return func


@app.route("/health")
@no_log
def health():
    return "OK"
```

`log_request` can then return early when `getattr(app.view_functions.get(request.endpoint), "skip_request_log", False)` is true.

Never log passwords, authentication tokens, or personally identifiable information (PII). Sanitize request payloads before inclusion:

```python
safe_data = {k: v for k, v in request.json.items() if k not in {"password", "token"}}
logger.bind(body=safe_data).info("Login request received")
```

Prefer allowlists over denylists:

```python
logged_fields = {k: request.json[k] for k in ("email", "country") if k in request.json}
```

This ensures only explicitly permitted fields enter the log stream.

Structured logs only deliver value if used correctly in production environments. First, always emit logs to `stdout`.
Container orchestrators like Kubernetes expect applications to write logs to standard output so agents (e.g., Fluentd, Vector, Filebeat) can collect and forward them. Avoid writing directly to files.

Second, standardize field names. Use consistent keys such as `http.method`, `http.status_code`, `user.id`, and `trace.id` across services. This enables reusable dashboards and alerting rules in tools like Grafana or Datadog.

Third, adopt correlation IDs. Generate a unique ID per request and propagate it through logs and downstream services.

```python
import uuid

from flask import g, request
from loguru import logger


@app.before_request
def add_correlation_id():
    cid = request.headers.get("X-Correlation-ID") or str(uuid.uuid4())
    g.correlation_id = cid
    # bind() returns a new logger; keep it on g and log through g.log.
    g.log = logger.bind(correlation_id=cid)


@app.after_request
def add_correlation_header(response):
    response.headers["X-Correlation-ID"] = g.correlation_id
    return response
```

Fourth, manage log levels rigorously. Use `DEBUG` for detailed traces, `INFO` for operational milestones, `WARNING` for recoverable anomalies, and `ERROR` for failures. Apply level filtering at the sink:

```python
logger.add(sys.stdout, level="INFO", serialize=True)
```

Fifth, consider performance. JSON serialization adds measurable CPU overhead under load. For high-throughput services, use `orjson`, an optimized JSON library written in Rust.

```python
import orjson


def json_serializer(obj):
    # orjson.dumps returns bytes; decode to get a str for text sinks.
    return orjson.dumps(obj).decode()
```

`orjson` is substantially faster than the standard `json` module (the project's own benchmarks report the largest gains, on the order of 40-50×, for `dataclass` serialization) and handles common types like `datetime` and `dataclass` natively.

In Kubernetes, container output written to `stdout` is captured by default. No custom application configuration is required if your app emits JSON. Verify output:

```
$ kubectl logs my-flask-pod-7x9f2
{"time": "-11-15T14:35:00.123456+00:00", "level": "INFO", "message": "Request completed", "method": "GET", "path": "/api/users", "status": 200}
```

Ensure your log agent parses JSON correctly. For Fluentd, use the JSON parser (`@type json` inside a `<parse>` section). For Grafana Loki, configure pipeline stages in your agent to extract structured labels.
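Before relying on an agent, it can help to run a few captured lines through a small validator. The sketch below is illustrative only: `validate_log_lines` and `REQUIRED_KEYS` are hypothetical names, and the required fields are an assumption based on the sample output shown above, not a schema any agent enforces.

```python
import json

# Assumed minimum schema, taken from the sample output in this article.
REQUIRED_KEYS = {"time", "level", "message"}


def validate_log_lines(lines):
    """Return (line_number, problem) pairs for lines an agent could not parse."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(entry, dict):
            problems.append((number, "not a JSON object"))
            continue
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            problems.append((number, f"missing keys: {sorted(missing)}"))
    return problems


# Example: one structured line and one stray plain-text line.
sample = [
    '{"time": "t", "level": "INFO", "message": "Request completed"}',
    "Starting server on port 5000",
]
for number, problem in validate_log_lines(sample):
    print(f"line {number}: {problem}")
```

A stray plain-text line, such as a startup banner from a dependency, is the usual reason an otherwise structured stream fails to parse, so checking captured output like this tends to catch it early.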
With JSON logs, you move from text scanning to precise querying.

In Loki:

```
{job="flask"} | json | level="ERROR" and path="/login"
```

In Datadog:

```
service:flask @level:ERROR @http.status_code:5xx
```

In Elasticsearch:

```json
{"query": {"term": {"http.status_code": "500"}}}
```

Filtering by `status:500` or `path:/login` executes in milliseconds instead of scanning gigabytes of text. That precision is the core advantage of structured logging.

Good logs don't just tell you what failed; they tell you who, when, where, and how it mattered.

Adding structured JSON logging to a Flask app isn't a refactor; it's a shift in how you treat logs. They become first-class data pipelines, not side-effect outputs.

Both the built-in `logging` module and Loguru can achieve this. The former offers full control and zero dependencies. The latter delivers simpler syntax, better context handling, and native async support. Choose based on team familiarity and long-term maintainability, but don't skip the step.

Your logs will be queried during outages, often under pressure. Give your team structured, consistent, and secure data, not unstructured noise.

Structured logging isn't optional for modern systems. It's the baseline for reliable observability in distributed environments.

Using the standard `logging` module and Loguru together is possible, but it's not recommended. Loguru can receive standard `logging` calls through a custom `logging.Handler` that forwards each record to Loguru (the `InterceptHandler` recipe in its documentation), but mixing both increases complexity. Pick one and standardize across the codebase.

For log rotation, use Loguru's built-in support: `logger.add("logs/app.json", rotation="100 MB", serialize=True)`. For file-based logging, ensure your log shipper (e.g., Filebeat) can handle log rotation without missing entries.

Structured logging does affect performance, marginally: serialization adds CPU cost. But the trade-off in observability is almost always worth it. For high-throughput services, use `orjson` or consider sampling non-critical logs.
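Sampling can be sketched with a standard-library `logging.Filter`. This is a minimal illustration, not a recommended implementation: `SamplingFilter` and its `rate` parameter are hypothetical names, and the policy (keep every record at `WARNING` or above, keep a random fraction of the rest) is an assumption rather than something prescribed here.

```python
import logging
import random


class SamplingFilter(logging.Filter):
    """Keep all WARNING+ records and a random fraction of lower-level ones."""

    def __init__(self, rate, rng=None):
        super().__init__()
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be between 0 and 1")
        self.rate = rate
        # An injectable random source makes the filter easy to test.
        self._rng = rng or random.Random()

    def filter(self, record):
        if record.levelno >= logging.WARNING:
            return True  # never drop warnings or errors
        return self._rng.random() < self.rate


# Attach the filter to a handler so it applies to everything that handler emits.
handler = logging.StreamHandler()
handler.addFilter(SamplingFilter(rate=0.1))
```

With `rate=0.1`, roughly one in ten `DEBUG` and `INFO` records reaches the handler, while warnings and errors always pass. Loguru sinks accept a `filter` callable that could apply the same idea.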