Will I be impressed by someone using RL through self-play to improve model creativity or aesthetics in 2025?
47% chance (closes 2026)

Researchers have used reinforcement learning to improve LLM performance on math problems, among other things (one example). It's also widely known that firms like Midjourney use human feedback to improve the aesthetic quality of their images. Could LLMs bootstrap aesthetic/creative abilities through self-play (e.g. by using other models as judges of quality and deriving a reward function from their ratings)?
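The "other models as judges" loop described above could be sketched roughly as follows. This is a hypothetical toy illustration, not any lab's actual method: `generate_candidates` and `judge_score` are stand-ins for real generator and judge models, and the rank-centering step is just one plausible way to turn judge scores into a reward signal.

```python
def generate_candidates(prompt, n=4):
    # Stand-in for sampling n completions from a generator LLM.
    return [f"{prompt} (draft {i})" for i in range(n)]

def judge_score(text):
    # Stand-in for a judge model rating aesthetic quality in [0, 1].
    # A real system would prompt a separate LLM to score the artifact.
    return (hash(text) % 100) / 100.0

def reward_function(candidates):
    # Center judge scores so a policy update would favor the judge's
    # preferred outputs and penalize the rest (a common RLHF-style trick).
    scores = [judge_score(c) for c in candidates]
    mean = sum(scores) / len(scores)
    return [s - mean for s in scores]

candidates = generate_candidates("Write a haiku about autumn")
rewards = reward_function(candidates)
best = candidates[max(range(len(rewards)), key=lambda i: rewards[i])]
```

The open question the market asks is whether optimizing against such a judge-derived reward actually improves creative quality, rather than just exploiting the judge.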

Resolves YES if someone releases a model that is commonly understood to have used RL through self-play, and that model produces a publicly accessible artifact in 2025 (a poem, text, song, image, video, interpretive dance, etc.) that I find aesthetically or creatively impressive compared to the pre-self-play version of the model that produced it.


@MalachiteEagle good market!
