Will more than 5% of GPT-4’s training data be YouTube transcripts?
1kṀ3629Jun 2
11% chance
If an estimate of the composition of GPT-4's training data becomes available, this market will resolve to YES if YouTube transcripts make up more than 5% of it. If GPT-4 ends up being multimodal, raw YouTube videos don't count towards the resolution.
This question is managed and resolved by Manifold.
Related questions
Did OpenAI transcribe Youtube videos to train a GPT model as claimed by NYT?
89% chance
Will OpenAI be sued (with standing) for using transcribed YouTube videos for GPT before 2026?
10% chance
Will GPT-5 be capable of some form of online learning?
27% chance
How much compute will be used to train GPT-5?
What percentage of 2025 will be left when OpenAI releases GPT-5?
38% chance
What percentage of 2025 will be left when OpenAI announces GPT-5?
38% chance
What hardware will GPT-5 be trained on?
What will be true about GPT-5?
Will there be an LLM (as good as GPT-4) that was trained with 1/100th the energy consumed to train GPT-4, by 2026?
83% chance
Will the ratio of inference runs to training runs on GPT5 decrease from the ratio on GPT4?
50% chance