The flagging system and bluechecks are good enough at this scale, but they won't work forever. They also lack any shades of grey: you're either unreliable, Trustworthyish™️, or a lowly peasant.
Here's how I might do it:
After a market is resolved, bettors can review the resolution and either approve or disapprove.
Once someone has 100(?) reviews, they get a badge next to their name showing how reliable they are: either the percentage itself or a grade of some kind.
Mousing over the badge gives more detailed info; maybe "98% reliable from 2342 reviews".
There's a lot to fiddle with in that system, like:
Disapprovals from bettors who profited should probably be worth more than disapprovals from bettors who didn't, while approvals from bettors who didn't profit should be worth more than approvals from those who did. Maybe those should even be the only ones that count?
Should the average be per-market or per-bettor? If someone resolves 100 5-bettor markets right but flubs one 500-bettor market, how reliable should they be considered?
Not bothering to mark a market as approved or disapproved should probably count as a weak implicit approval, given that the user has used Manifold after the resolution took place.
What about abandonment? Should you be able to disapprove of a closed-not-resolved market?
"Approval" is probably not the best term for it
Vouching? If someone new but famous joins Manifold, maybe highly-approved users should be able to click a button to say "I trust this person".
I like vouching. Another controversial idea: reviews from users with high vouch scores could carry more weight.
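The weighting ideas above could combine into something like this sketch. The exact multipliers, the `WeightedReview` fields, and the vouch-score scale are all invented for illustration; the post deliberately leaves them open.

```python
from dataclasses import dataclass

@dataclass
class WeightedReview:
    approved: bool      # did this bettor approve the resolution?
    profited: bool      # did the reviewer profit from the resolution?
    vouch_score: float  # reviewer's own vouch standing, assumed 0..1

def review_weight(r: WeightedReview) -> float:
    # A disapproval from someone who profited, or an approval from
    # someone who lost money, goes against the reviewer's own
    # interest, so count it more heavily (2x is an arbitrary choice).
    against_interest = r.approved != r.profited
    base = 2.0 if against_interest else 1.0
    # The controversial part: highly-vouched reviewers count for more.
    return base * (1.0 + r.vouch_score)

def reliability(reviews: list[WeightedReview]) -> float:
    total = sum(review_weight(r) for r in reviews)
    approved = sum(review_weight(r) for r in reviews if r.approved)
    # No reviews yet: treat as neutral/benefit of the doubt.
    return approved / total if total else 1.0
```

For example, an approval from a losing bettor with no vouch standing weighs 2.0, while a disapproval from another losing bettor weighs only 1.0, so those two reviews together yield a reliability of 2/3 rather than 1/2.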