In general, UI changes seem to be rolled out when devs feel like it and decide they look good, and changed in response to vocal feedback.
Will the Manifold team try rolling out a UI change with systematic or quantitative testing, before 2024-04-01?
Things that would count:
- A/B testing the rollout so that only some users get it at first, with some kind of metric tracking.
- Publicly discussing what metrics were tracked around the UI change, and how they were interpreted as good or bad.
- Systematically and intentionally seeking feedback on a proposed or actual change, via a mechanism likely to have significantly less selection bias than "voluntary feedback from users engaged enough to be on Discord".
- Creating a decision market that will resolve based on quantitative metrics and reporting those metrics.
The testing must be public enough that I can find evidence of it in order to count. This question resolves N/A if there are no UI changes whatsoever before close. I wouldn't be surprised if there are weird edge cases; please ask about them. I won't trade in this market.
I think it's unfortunate that the market was so mispriced, but I think it simply was mispriced. I think what actually happened was a pretty clear case of the first bullet point:
A/B testing the rollout so that only some users get it at first, with some kind of metric tracking.
It's public in the sense that the testing was done on the users, on the public site, not a private test group (e.g., an employee-only beta). I think I wrote the title a little poorly, in that it overly emphasized the "public" aspect. On a re-read of the market text, I think it's clear that the "public" aspect can be fairly minimal. In particular:
The testing must be public enough that I can find evidence of it in order to count.
That bar was clearly met.
My apologies for the confusion. I regret that I wasn't terribly engaged with Manifold for most of the duration of this question, but there doesn't appear to have been much relevant discussion that I missed, or questions I failed to answer, so I don't think that actually had a large impact. My apologies also for the slow resolution; trying to figure out what to do with what appeared to be a badly mispriced market was stressful, and I procrastinated on it.
@EvanDaniel I saw this question a few days ago and it was only priced at 10% or something. I didn't bet, but I'm pretty sure they did do A/B testing in March.
Found two instances:
It's up to you to decide if it meets your criteria, but my view based on reading Standup was that they intentionally made an effort to do this within the last month. I guess I know why I didn't bet: I wasn't sure how clearly these had to qualify as "UI" changes under the criteria, and I couldn't figure out why the market was trading at only 10% when it was obvious (to me) that this was going on...
@Eliza That sure looks like A/B testing of display features. I don't see anything I wrote that said it had to be about major changes; what text to display somewhere sure seems like it counts.
I didn't read in detail, but I'm assuming they tracked some kind of metric to go with this.
I'm inclined to resolve Yes based on this, but will wait a couple days in case anyone wants to present a counter-argument. Resolving a market Yes that closed at 3% makes me a little nervous, so I'd like to give No bettors time to respond.
Anyway, sounds like you should have bet on it... not all Manifold markets are efficient ;)
@EvanDaniel Hi, I guess I'm the one who bet it to 3%. I have no problem with this counting for a YES resolution against me; it's just $M470. Yes, @Eliza should've gotten most of it, but with no actual YES holders, who gets my mana? The AMM? Anyway, I'll share my reasoning behind the bet.
I bet with the expectation that a YES resolution required something more public: not just UI changes visible to users, but public testing results, metrics, discussion, and/or analysis. I based this assumption on the points mentioned in the description:
Publicly discussing what metrics were tracked around the UI change, and how they were interpreted as good or bad.
Systematically and intentionally seeking feedback on a proposed or actual change, via a mechanism likely to have significantly less selection bias than "voluntary feedback from users engaged enough to be on Discord".
I suppose I should have clarified before betting whether this was the intended meaning of "public" in the title and description. I feel the description emphasized this meaning, but I see that the first point (before what I quoted) does cover A/B testing whose results and discussion are kept private. So, up to you.
Here's what I could do: if you decide to bail me out and resolve NO, I'll send $M200 to @Eliza as a reward for surfacing the evidence. Note that my profit on a NO resolution would only be about $M33.
@eppsilon If they publicly tested the April Fools feature before rollout, thus causing this market to resolve Yes, I will laugh my ass off.
For example, with the new UI, I don’t understand where to find the list of limit orders anymore.
It is probably somewhere, but I expect that with some testing, they would have seen it was hard to find.