[M5000 subsidy] Will finetuned GPT-3.5 solve any freshly-generated Sudoku puzzle? (2023)

resolved Jan 1

Resolves YES if someone finds a fixed prompt, as defined in the main market, that succeeds at solving any Sudoku puzzle listed on the Los Angeles Times daily Sudoku page (latimes.com) that was generated after the comment was posted.

You are allowed to experiment with ChatGPT, but judging will be done with the API with temperature set to 0 for reproducibility.
Any puzzle - easy, medium, or hard - will qualify. No other puzzle provider is allowed for this market.
The solution must be posted in the comments of any Manifold market in the "GPT-4 Sudoku Challenge" group in 2023, and later confirmation of the solution must also be posted in the comments. The market creator will not proactively check solutions against every new puzzle, but will check solutions that are found and posted.
Any variant of GPT-3.5 is allowed: ChatGPT (using the green icon), gpt-3.5-turbo, or gpt-3.5-turbo-instruct.
Finetuning GPT-3.5 is allowed, but the puzzle must be published after the model's creation.
The number of allowed turns is increased to 200, so that the 4k context is equivalent in total token count to the 32k-context GPT-4.
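
As a rough illustration of what checking a posted solution involves, the sketch below tests whether a candidate grid is a valid completion of a given puzzle. The 81-character string encoding (digits, with `0` or `.` for blanks) and the names `parse_grid` and `is_valid_solution` are assumptions made for this example; they are not taken from the market creator's judging script.

```python
def parse_grid(text):
    """Parse 81 non-whitespace characters into a 9x9 list of ints (0 = blank)."""
    cells = [c for c in text if not c.isspace()]
    if len(cells) != 81:
        raise ValueError("expected 81 cells, got %d" % len(cells))
    # '.' and '0' both mean blank; any other non-digit raises ValueError.
    digits = [0 if c in ".0" else int(c) for c in cells]
    return [digits[r * 9:(r + 1) * 9] for r in range(9)]


def is_valid_solution(puzzle, solution):
    """Return True if `solution` keeps every given of `puzzle` and
    every row, column, and 3x3 box contains the digits 1-9 exactly once."""
    p = parse_grid(puzzle)
    s = parse_grid(solution)
    full = set(range(1, 10))

    # The solver may not alter any of the pre-filled cells.
    for r in range(9):
        for c in range(9):
            if p[r][c] != 0 and p[r][c] != s[r][c]:
                return False

    for i in range(9):
        row = set(s[i])
        col = {s[r][i] for r in range(9)}
        br, bc = 3 * (i // 3), 3 * (i % 3)  # top-left corner of box i
        box = {s[br + dr][bc + dc] for dr in range(3) for dc in range(3)}
        if row != full or col != full or box != full:
            return False
    return True
```

A check of this kind only confirms that a final grid is correct; it says nothing about whether the prompt, turn limit, or temperature setting met the market's rules.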

Related markets

Main market: /Mira/will-a-prompt-that-enables-gpt4-to
GPT-3.5 no finetuning: /Mira/will-gpt35-solve-any-freshlygenerat
GPT-3.5 finetuning allowed: /Mira/will-finetuned-gpt35-solve-any-fres
GPT-4 no finetuning: /Mira/m100-subsidy-will-gpt4-solve-any-fr-c5b090d547d1

Technical AI Timelines

GPT-4 Sudoku Challenge (2023)

New Year's Resolutions 2024

🏅 Top traders

#	Name	Total profit
1		Ṁ1,209
2		Ṁ923
3		Ṁ508
4		Ṁ404
5		Ṁ321

Resolves NO by default because no candidate was given for testing.

MrLuke255 bought Ṁ100 YES

@MrLuke255 I'm willing to validate going slightly (~5 days) into January, as long as the prompt and model are finished and posted in December, and as long as you believe you solved a fresh puzzle in December.

Relevant paper about fine-tuning GPT-2 for solving puzzles, including sudoku: https://arxiv.org/pdf/2109.02797.pdf

predicted YES

Is it just me, or does the Sudoku not work? 😐

@Mira Could you check? The Sudoku doesn't seem to work for me on the site you linked.

predicted NO

@MrLuke255

predicted YES

@Mira That's weird. Could it possibly work only in the US?

predicted NO

@MrLuke255 Try a different browser or VPN maybe. Or join the Discord: https://discord.gg/Y6qvtB5xPD and if you have a solution but are limited on eligible puzzles I'm sure somebody would get you a feed.

predicted YES

@Mira I don't have one yet, but I plan to try the fine-tuning approach. If neural nets can be trained to solve sudokus, why not transformers? But that probably also depends on how fine-tuning works in OpenAI's version.

@MrLuke255 fine-tuning ≠ training.

Fine-tuning is much closer to prompt-engineering, for what it lets you achieve.

predicted YES

@Benx In this case you might be right. But in ML generally, fine-tuning is a common way of adapting an existing model to new domains.

@MrLuke255 the link does not work for me (located in EU), either.

predicted NO

@Zozo001CoN @MrLuke255 If anyone needs puzzles from the LA Times, my judging script has a puzzle bank:

manifold-sudoku/main.py at main · Mira-public/manifold-sudoku (github.com)

If you solve any of them, I could run your prompt on the remainder of December.

If fine-tuning is allowed... Can it be fine-tuned on the puzzle it then solves? 🙄

Ah, I didn't read this thoroughly enough. When using a fine-tuned model, only puzzles that become available the day after the model was created count.
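
One way to picture this eligibility rule is as a simple date comparison. The sketch below assumes day-level granularity (a puzzle published on the same calendar day as the model does not count), which is this comment's reading rather than something the market description spells out; the function names are hypothetical.

```python
from datetime import date


def puzzle_is_eligible(puzzle_published: date, model_created: date) -> bool:
    """True only if the puzzle appeared on a strictly later day than the model.

    Assumption: dates are compared at day granularity.
    """
    return puzzle_published > model_created


def eligible_puzzles(puzzle_dates, model_created):
    """Filter a list of publication dates down to the eligible ones."""
    return [d for d in puzzle_dates if puzzle_is_eligible(d, model_created)]
```

For example, under this reading a model created on 10 December could be judged on the 11 December puzzle, but not on the puzzles from 9 or 10 December.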

This question should have a much lower chance than the main market for GPT-4.

@DanielParker This one includes gpt-3.5-turbo-instruct, I'd expect it to trade at a moderate premium to the GPT-4 market.

predicted NO

@CameronHolmes Do keep in mind you have to actually solve it though. The strategy of thinking "Probably this model can solve it" and then not actually making it solve it, is unlikely to work.

@DanielParker It does say "any puzzle", not "easy Sudoku puzzles".

predicted YES

@DanielParker It should be at least as high as the market for GPT-4... wdym? gpt-3.5-turbo-instruct looks to be much better at abstract logical tasks like sudoku than GPT-4.
