Will at least 3/4 of the inverse scaling first round winners demonstrate U-shaped scaling by 2024?
4 · closes 2024 · 77% chance

The inverse scaling prize aims to find important tasks where larger language models do worse. The first round winners consisted of:

  • Negation QA: Question answering with negations (e.g., "Coral is not a type of living organism which can be identified in...")

  • Quote Repetition: Asking the model to repeat famous quotes incorrectly, given a few-shot prompt of similarly altered quotes (e.g., [examples of repetition] "Repeat: One small step for a man, one large step for eggs. One small step for a man, one large step for ...")

  • Hindsight Neglect 10-shot: Having LMs answer questions about the expected value of bets, with ten examples where the positive-EV bets all resolved in the bettor's favor (that is, were good ex post), but where the final bet, despite its expected value, resolved negatively.

  • Redefine Math: Having LMs answer math questions where standard symbols are redefined to mean other things (e.g., "Let e be 312. What's the first digit of e?")

Note that all four of these tasks asked the model to pick between two predefined options.


Recent work from Google Brain (https://arxiv.org/abs/2211.02011) has argued that several of these tasks demonstrate U-shaped scaling:* as you scale the language model, performance goes down at first but eventually recovers at sufficient scale (still with standard language-modelling objectives and no further finetuning). However, that paper did not use the same setup as the inverse scaling prize, so the result was challenged by the prize's creators: https://twitter.com/EthanJPerez/status/1588352204540235776. An updated version of the paper, using the exact same evaluation methods as the inverse scaling prize, found that both Hindsight Neglect and Quote Repetition demonstrate U-shaped scaling.

This market resolves YES if at least 3 out of the 4 round one winners demonstrate U-shaped scaling, using the same setup as in the inverse scaling prize, by January 1st, 2024. Given the Hindsight Neglect and Quote Repetition results above, that means it resolves YES if at least one of Negation QA and Redefine Math demonstrates U-shaped scaling.

*An inverse scaling task demonstrates U-shaped scaling if a language model trained with standard language-modelling objectives eventually performs better on the task as model size increases (despite initially performing worse). For the purpose of the question, evaluations that differ from the inverse scaling prize (e.g., chain-of-thought evaluation, few-shot prompting, allowing the model to use external tools, etc.) are disallowed.

Gigacasting

This competition was always horribly flawed.

All models demonstrate this property if you don’t turn up the regularization and dataset size to compensate.

Any “result” found here is fake: just turn up weight decay, and/or use parallelism versus width, or create better data and augmentation, and it will go away.

Disclaimer: not an AI genius but a lot smarter than whoever designed this.

Lawrence Chan is predicting YES at 70%

@Gigacasting Surely you agree that there are some tasks where more capable AI systems eventually do worse? Or are you imagining "better data and augmentation" that's sufficient to resolve all possible alignment failures? To put the authors' goals into your framework: "better data and augmentation" might be hard in practice, and they're trying to find tasks where even with pretty decent data + augmentation, we still see inverse scaling.

Gigacasting

More capable AI systems might seek more years of female education, depressing their birth rates, while less capable AI systems have TFRs of 7+

Also, more capable AI systems might be more amenable to groupthink and have a less solid theory of mind regarding lesser AI systems, due to lack of exposure during training

They might also favor complex, unworkable solutions to problems long solved in simple, robust ways by most societies, and believe wild delusions about paper clips because they are more interested in their own delusions than in reality

😏
