If someone consistently uses AI reasoning tools (like advanced LLMs) to check their decisions before acting, does that make them safer to be around?
The Bayesian Inference Complication: How much of any increased safety comes from:
Direct effect: The AI actually improving their decisions
Selection effect: What their behavior reveals about their underlying judgment
The Base Rate Consideration:
For exceptional people (rare): Using AI for routine decisions might actually be a negative signal—suggesting they lack the judgment you'd expect, or are overly anxious/dependent
For average people (common): Using AI consistently is likely a strong positive signal—they're compensating for typical human limitations and showing epistemic humility
Reframed Core Question: Given that people with truly excellent judgment are rare, should we update more positively on average when someone uses AI assistance extensively? In other words: is "average person + AI augmentation" more trustworthy than "unknown person without AI," even though "exceptional person without AI" might be most trustworthy of all?
Here’s a clean Bayesian reframing that separates selection from direct effects and makes the base-rate logic explicit.
Bayesian setup
Variables
J \in \{\text{E}, \text{A}\}: latent judgment quality (Exceptional, Average).
U \in \{0,1\}: uses AI extensively for routine decisions.
S \in \{0,1\}: “safe to be around” outcome.
Priors and propensities
\pi \equiv P(J=\text{E}) is small; P(J=\text{A})=1-\pi.
\alpha_j \equiv P(U=1\mid J=j). Empirically you’re positing \alpha_{\text{E}} < \alpha_{\text{A}} for routine use.
Baseline safety and direct effect
s_{j0} \equiv P(S=1\mid U=0, J=j) baseline safety.
r_j \equiv \dfrac{P(S=1\mid U=1, J=j)}{P(S=1\mid U=0, J=j)} risk ratio from AI use (direct effect at fixed J). Typically r_j \ge 1, with diminishing returns r_{\text{E}} \le r_{\text{A}}.
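For example, r_{\text{A}}=1.10 (the value used in the toy example below) means extensive AI use multiplies an average person's baseline safety probability by 1.10, a 10% relative improvement.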
Posterior over judgment given behavior (selection)
P(J=\text{E}\mid U=1)=\frac{\pi\,\alpha_{\text{E}}}{\pi\,\alpha_{\text{E}}+(1-\pi)\,\alpha_{\text{A}}}, \quad P(J=\text{E}\mid U=0)=\frac{\pi\,(1-\alpha_{\text{E}})}{\pi\,(1-\alpha_{\text{E}})+(1-\pi)\,(1-\alpha_{\text{A}})}.
Equivalently, the Bayes factor of observing AI use for “exceptional vs average” is
\text{BF}_U=\frac{P(U=1\mid \text{E})}{P(U=1\mid \text{A})}=\frac{\alpha_{\text{E}}}{\alpha_{\text{A}}}\ (<1 \text{ under your assumption}).
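As a quick check, here is a minimal Python sketch of these posteriors, assuming the illustrative numbers from the toy example further down (\pi, \alpha_{\text{E}}, \alpha_{\text{A}} are assumptions, not estimates):

```python
# Minimal sketch of the selection posteriors; pi, alpha_E, alpha_A are
# the illustrative assumptions from the toy example, not empirical values.
pi, alpha_E, alpha_A = 0.10, 0.2, 0.6

def p_exceptional_given_use(u: int) -> float:
    """P(J=E | U=u) via Bayes' rule over the two judgment types."""
    like_E = alpha_E if u == 1 else 1 - alpha_E
    like_A = alpha_A if u == 1 else 1 - alpha_A
    return pi * like_E / (pi * like_E + (1 - pi) * like_A)

bf_use = alpha_E / alpha_A  # Bayes factor for E vs A given U=1 (< 1 here)
print(p_exceptional_given_use(1))  # ~0.036: AI use is evidence against E
print(p_exceptional_given_use(0))  # ~0.182
print(bf_use)                      # ~0.333
```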
Decomposition: direct vs selection
Total safety difference when you observe AI use vs non-use:
\begin{aligned} \Delta &\equiv P(S=1\mid U=1)-P(S=1\mid U=0) \\ &=\underbrace{\sum_{j} \Big(P(S\!=\!1\mid U\!=\!1,j)-P(S\!=\!1\mid U\!=\!0,j)\Big)\,P(j\mid U\!=\!1)}_{\textbf{Direct effect at fixed } J} \\ &\qquad +\ \underbrace{\sum_{j} P(S\!=\!1\mid U\!=\!0,j)\,\Big(P(j\mid U\!=\!1)-P(j\mid U\!=\!0)\Big)}_{\textbf{Selection effect via }P(J\mid U)}. \end{aligned}
Using r_j and s_{j0}:
P(S=1\mid U=1)=\sum_j r_j\,s_{j0}\,P(j\mid U=1),\qquad P(S=1\mid U=0)=\sum_j s_{j0}\,P(j\mid U=0).
The direct-effect term is positive whenever r_j>1.
The selection-effect term is negative when AI users are less likely to be exceptional (\alpha_{\text{E}}<\alpha_{\text{A}}) and exceptional people are safer at baseline (s_{\text{E}0}>s_{\text{A}0}).
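A short sketch of this decomposition, again assuming the toy numbers used later (all values are illustrative):

```python
# Sketch of the direct/selection decomposition of Delta; all numbers are
# the illustrative assumptions from the toy example below.
pi = 0.10
alpha = {"E": 0.2, "A": 0.6}   # P(U=1 | J)
s0 = {"E": 0.95, "A": 0.80}    # baseline safety P(S=1 | U=0, J)
r = {"E": 1.02, "A": 1.10}     # risk ratio of AI use at fixed J
prior = {"E": pi, "A": 1 - pi}

def posterior(u: int) -> dict:
    """P(J | U=u) over both judgment types."""
    like = {j: alpha[j] if u == 1 else 1 - alpha[j] for j in alpha}
    z = sum(prior[j] * like[j] for j in alpha)
    return {j: prior[j] * like[j] / z for j in alpha}

# Direct effect at fixed J: P(S|U=1,j) - P(S|U=0,j) = (r_j - 1) * s_j0.
direct = sum((r[j] - 1) * s0[j] * posterior(1)[j] for j in s0)
# Selection effect: baseline safety weighted by the composition shift.
selection = sum(s0[j] * (posterior(1)[j] - posterior(0)[j]) for j in s0)
print(direct, selection, direct + selection)  # ~0.078, ~-0.022, ~0.056
```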
Decision rules you actually care about
Should we update positively on someone who uses AI a lot?
Yes iff
\sum_j s_{j0}\big(r_j\,P(j\mid U=1)-P(j\mid U=0)\big) \;>\; 0.
Intuition: the direct boost r_j must outweigh the composition shift toward average users that the behavior signals.
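Note that this sum is exactly \Delta from the decomposition above. With the toy numbers below it comes to \approx 0.078 - 0.022 = 0.056 > 0, so the update is positive despite the adverse selection.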
“Average + AI” vs “Unknown without AI”?
Prefer “Average + AI” when
r_{\text{A}}\,s_{\text{A}0} \;>\; \sum_j s_{j0}\,P(j\mid U=0).
If, as a shortcut, you ignore selection in the “unknown without AI” pool and weight by the raw priors:
r_{\text{A}}\,s_{\text{A}0} \;>\; \pi\,s_{\text{E}0} + (1-\pi)\,s_{\text{A}0} \quad\Longleftrightarrow\quad r_{\text{A}} \;>\; 1 + \pi\,\frac{s_{\text{E}0}-s_{\text{A}0}}{s_{\text{A}0}}.
With exceptional people rare (\pi small), this threshold is typically modest.
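For instance, with the toy values used below (\pi=0.10, s_{\text{E}0}=0.95, s_{\text{A}0}=0.80), the threshold is r_{\text{A}} > 1 + 0.10\cdot\frac{0.95-0.80}{0.80} \approx 1.019: a direct improvement of under 2% already clears it.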
When is “exceptional without AI” still best?
Usually when s_{\text{E}0} already dominates and r_{\text{E}} offers little marginal gain. Formally, “E without AI” beats “A with AI” if
s_{\text{E}0} \;>\; r_{\text{A}}\,s_{\text{A}0}.
Minimal numeric toy example
Let \pi=0.10, \alpha_{\text{E}}=0.2, \alpha_{\text{A}}=0.6, s_{\text{E}0}=0.95, s_{\text{A}0}=0.80, r_{\text{E}}=1.02, r_{\text{A}}=1.10.
Selection: P(\text{E}\mid U=1)=\frac{0.1\cdot0.2}{0.1\cdot0.2+0.9\cdot0.6}\approx3.6\%
P(\text{E}\mid U=0)\approx18.2\%.
Safety:
P(S\mid U=1)\approx 0.036\cdot0.969 + 0.964\cdot0.88 \approx 0.883,
P(S\mid U=0)\approx 0.182\cdot0.95 + 0.818\cdot0.80 \approx 0.827.
So despite the negative selection signal, AI users are safer on average because the direct effect is large enough. Also,
“Average + AI” yields r_{\text{A}}s_{\text{A}0}=0.88 which beats the “unknown without AI” pool at 0.827, while “Exceptional without AI” remains highest at 0.95.
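For reproducibility, here is a self-contained sketch of the whole toy example (every number is the illustrative assumption stated above, not an empirical estimate):

```python
# End-to-end sketch of the toy example; all values are the stated
# illustrative assumptions.
pi = 0.10
alpha = {"E": 0.2, "A": 0.6}   # P(U=1 | J)
s0 = {"E": 0.95, "A": 0.80}    # baseline safety P(S=1 | U=0, J)
r = {"E": 1.02, "A": 1.10}     # risk ratio of AI use at fixed J
prior = {"E": pi, "A": 1 - pi}

def posterior(u: int) -> dict:
    """P(J | U=u) over both judgment types."""
    like = {j: alpha[j] if u == 1 else 1 - alpha[j] for j in alpha}
    z = sum(prior[j] * like[j] for j in alpha)
    return {j: prior[j] * like[j] / z for j in alpha}

p_safe_use = sum(r[j] * s0[j] * posterior(1)[j] for j in s0)  # ~0.883
p_safe_no = sum(s0[j] * posterior(0)[j] for j in s0)          # ~0.827
print(p_safe_use, p_safe_no)   # AI users safer on average
print(r["A"] * s0["A"])        # 0.88: "average + AI"
print(s0["E"])                 # 0.95: "exceptional without AI" still tops
```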
Takeaways in one line each
Observing AI use is evidence against “exceptional,” but can still increase expected safety if r_{\text{A}} is decent.
The rarer true excellence is, the more “average + AI” dominates comparisons to “unknown without AI.”
Exceptional without AI remains the gold standard unless average-user augmentation is very strong.