This morning, our database went down for 38 minutes. No read or write queries were being returned starting at 8:38am.
The cause seems to be a memory bug in a library we were using:
We've detected a memory leak within PostgREST, triggered by long-running connections, documented here:
https://github.com/PostgREST/postgrest/issues/2638
This bug affected PostgREST versions under v10.2, with it being addressed in v10.2 - your project was among the affected ones running v10.1.2.
— Supabase engineer
Supabase has since upgraded the version of that library to solve the memory leak, so hopefully that specific problem won't happen again.
Will we lose read access again this year for > 5 minutes, or write access for > 12 hours?
In total, we've had two outages this year:
🏅 Top traders
# | Name | Total profit |
---|---|---|
1 | | Ṁ50 |
2 | | Ṁ40 |
3 | | Ṁ5 |
4 | | Ṁ2 |
5 | | Ṁ1 |
We've been having issues with CPU and such, which is probably our fault.
But! We also restarted the DB yesterday, and it got stuck in a boot loop, which was mostly a bug in the way Supabase was hosting it (they only gave it 90 seconds to start up, but we needed longer to replay transactions). Doc on the outage: https://manifoldmarkets.notion.site/DB-Outage-849d055d43f64807b98ca0930f890346?pvs=4
So Supabase is responsible for the full outage that lasted 2 hours. That's a clear YES resolution.
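For reference, here is a minimal sketch of how the thresholds in the question could be checked against a single outage. The function name `resolves_yes` is hypothetical, and the sketch assumes durations are measured per outage (not cumulatively) and that access is fully lost; partial failures are not modeled.

```python
from datetime import timedelta

# Thresholds taken from the market question.
READ_THRESHOLD = timedelta(minutes=5)
WRITE_THRESHOLD = timedelta(hours=12)


def resolves_yes(read_downtime: timedelta, write_downtime: timedelta) -> bool:
    """Hypothetical helper: True if one outage strictly exceeds either threshold."""
    return read_downtime > READ_THRESHOLD or write_downtime > WRITE_THRESHOLD


# The 2-hour full outage described above: reads and writes were both down.
full_outage = timedelta(hours=2)
print(resolves_yes(full_outage, full_outage))
```

Under these assumptions the 2-hour outage clears the read threshold by a wide margin, even though it is well short of the 12-hour write threshold.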
Suppose read/write access isn't completely gone, but fails for some substantial percentage of requests (say, 30%)? 👀