Diagnosing PostgreSQL Connection Leaks on RDS

The site started throwing 502 Bad Gateway errors. Everything stopped. Restarting Gunicorn fixed it within seconds. Then it would happen again, at a completely random time, once every few days. This went on for about two weeks before we decided to properly dig in.

That pattern is almost always a leak. In our case it was database connections.

Infrastructure context

RDS: db.m5.4xlarge, max_connections=5000, tcp_keepalives_idle=300s
Gunicorn: 8 workers x 25 threads = 200 concurrent connections
Celery: 3 instances (1 main worker at concurrency=35, 2 side workers at concurrency=25 each) = 85 worker threads total
Total max DB connections across all processes: 285
Peak traffic (8am to 8pm SGT): ...
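
The totals in the infrastructure context can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, and it assumes each Gunicorn thread and each Celery worker slot holds at most one database connection, which is what the figures above imply.

```python
# Connection budget from the figures quoted in the infrastructure context.
# Assumption: one DB connection per Gunicorn thread and per Celery worker slot.

GUNICORN_WORKERS = 8
GUNICORN_THREADS_PER_WORKER = 25
CELERY_CONCURRENCY = [35, 25, 25]  # 1 main worker, 2 side workers
RDS_MAX_CONNECTIONS = 5000


def connection_budget(workers, threads, celery_concurrency, max_connections):
    """Return the expected connection ceiling and the headroom left on RDS."""
    gunicorn = workers * threads
    celery = sum(celery_concurrency)
    total = gunicorn + celery
    return {
        "gunicorn": gunicorn,
        "celery": celery,
        "total": total,
        "headroom": max_connections - total,
        "utilization": total / max_connections,
    }


budget = connection_budget(
    GUNICORN_WORKERS,
    GUNICORN_THREADS_PER_WORKER,
    CELERY_CONCURRENCY,
    RDS_MAX_CONNECTIONS,
)

for name, value in budget.items():
    print(f"{name}: {value}")
```

Under that assumption the processes should never need more than 285 connections, under 6% of max_connections. A live connection count that keeps climbing past this ceiling would suggest connections are being opened without being released.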

March 16, 2026 · Pranav Gore