
11am Monday (Oct 14th) state of affairs:
Things started to deteriorate the night of Oct 9th (PST)
API latency slowly builds, leading to site outages every few hours
Redeploying the API fixes things for a time
Running the API on localhost works fine during an outage
Simple queries start to take forever, filling up the connection pool
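The failure mode above (a few slow queries starving everything else) can be sketched with a toy pool. This is not Manifold's code (the real backend uses pg-promise); it's a minimal simulation showing how two long-running queries holding a size-2 pool force even 1ms queries to queue behind them:

```typescript
// Toy connection pool: `size` concurrent slots, waiters queue FIFO.
// Illustrative only; numbers and names are made up.
class ToyPool {
  private free: number;
  private waiters: Array<() => void> = [];
  constructor(size: number) { this.free = size; }

  async acquire(): Promise<void> {
    if (this.free > 0) { this.free--; return; }
    await new Promise<void>(res => this.waiters.push(res));
  }
  release(): void {
    const next = this.waiters.shift();
    if (next) next(); else this.free++;
  }
  get queued(): number { return this.waiters.length; }
}

const sleep = (ms: number) => new Promise(res => setTimeout(res, ms));

// Simulate one query: hold a pool slot for `ms` milliseconds.
async function run(pool: ToyPool, ms: number): Promise<void> {
  await pool.acquire();
  try { await sleep(ms); } finally { pool.release(); }
}

async function main() {
  const pool = new ToyPool(2);
  // Two "slow" queries grab both slots...
  const slow = [1, 2].map(() => run(pool, 200));
  await sleep(10);
  // ...so even trivial 1ms queries now wait in line behind them.
  const fast = [1, 2, 3].map(() => run(pool, 1));
  await sleep(10);
  console.log(`queued while slow queries hold the pool: ${pool.queued}`);
  await Promise.all([...slow, ...fast]);
}
main();
// prints: queued while slow queries hold the pool: 3
```

Once enough requests pile up like this, latency compounds until the pool is fully saturated, which matches the "redeploy fixes it for a while" symptom (a restart drops all held connections).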
So far we've:
Set up a read replica that the frontend uses when querying the db directly with the supabase-js library
Updated our backend's pg-promise version
Checked open connections on the API (1.3k), total ingress/egress bytes, total API query counts, CPU usage (10%), and memory usage (10%); all are normal
Reverted suspicious-looking commits from the past few days
Increased, then decreased, the pg pool size
Moved some requests from the API to the supabase-js client, which talks directly to the db's load balancer
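The read-replica mitigation amounts to routing read-only traffic away from the primary. A rough sketch of that routing idea, with stub clients standing in for real supabase-js/pg-promise connections (all names and URLs here are hypothetical, not Manifold's actual config):

```typescript
// Minimal read/write router sketch. Real code would tag queries explicitly
// rather than sniffing SQL text, but this shows the shape of the mitigation.
interface DbClient {
  url: string;
  query(sql: string): Promise<string>;
}

// Stub client factory; a real setup would construct supabase-js clients here.
const makeClient = (url: string): DbClient => ({
  url,
  query: async (sql: string) => `${url} ran: ${sql}`,
});

const primary = makeClient("db-primary.example.internal"); // hypothetical URL
const replica = makeClient("db-replica.example.internal"); // hypothetical URL

// Crude read detection: SELECTs go to the replica, everything else to primary.
const isRead = (sql: string) => /^\s*select\b/i.test(sql);

function routed(sql: string): DbClient {
  return isRead(sql) ? replica : primary;
}

routed("SELECT * FROM contracts LIMIT 10")
  .query("SELECT * FROM contracts LIMIT 10")
  .then(console.log);
// prints: db-replica.example.internal ran: SELECT * FROM contracts LIMIT 10
```

This takes read load off the primary but won't fix the outage if the slow queries themselves are the problem, since they'd just saturate the replica's pool instead.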
Typical API stats during an outage:
[screenshot omitted]
DB stats during an example outage:
[screenshot omitted]
I'm happy to provide more info, stats, etc.
repo: https://github.com/manifoldmarkets/manifold
This resolves NO if we haven't fixed the underlying problem: for example, a cron job restarting the server every 2 hours does not count as a fix.