Does a few-shot prompt improve GPT-4 debugging performance?


Resolved N/A on Mar 1


If I use a 3-shot prompt giving examples of spotting bugs in my PyTorch code, will GPT-4 improve its debugging ability?
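For illustration only, here is a minimal sketch of what a 3-shot debugging prompt versus a zero-shot prompt might look like. The market does not publish the actual prompt, so the helper names, the prompt wording, and the three example bugs below are all assumptions made up for this sketch.

```python
from typing import List, Tuple

# Hypothetical helpers: none of these names or strings come from the market itself.

def build_few_shot_prompt(examples: List[Tuple[str, str]], target_code: str) -> str:
    """Place (buggy code, bug explanation) pairs ahead of the snippet to debug."""
    parts = ["Find the bug in each PyTorch snippet."]
    for i, (code, bug) in enumerate(examples, start=1):
        parts.append(f"### Example {i}\nCode:\n{code}\nBug: {bug}")
    # The final block leaves "Bug:" open for the model to complete.
    parts.append(f"### Your turn\nCode:\n{target_code}\nBug:")
    return "\n\n".join(parts)


def build_zero_shot_prompt(target_code: str) -> str:
    """Same template with no worked examples, used as the baseline."""
    return build_few_shot_prompt([], target_code)


# Illustrative (assumed) examples of common PyTorch mistakes.
EXAMPLES = [
    ("loss.backward()\noptimizer.step()",
     "optimizer.zero_grad() is never called, so gradients accumulate across steps."),
    ("out = model(x)  # shape (batch, classes)\npred = out.argmax(dim=0)",
     "argmax runs over the batch dimension; dim=1 gives per-sample classes."),
    ("model.eval()\nloss = criterion(model(x), y)\nloss.backward()",
     "the model is left in eval mode while training, which disables dropout."),
]

few_shot = build_few_shot_prompt(EXAMPLES, "y = x.view(-1, 10)")
zero_shot = build_zero_shot_prompt("y = x.view(-1, 10)")
```

Both prompts end with the same open "Bug:" slot, so any difference in GPT-4's answers can be attributed to the three worked examples rather than to the template.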

This will be a noisy manual eval: I will evaluate on 10 code snippets and resolve Yes if the few-shot-prompted version does better than zero-shot on 2+ cases (net change), otherwise No. I'll either go through with this eval or cancel by the end of Feb 2024. If I don't end up finding or creating such a prompt, this market will be cancelled.
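The resolution rule can be sketched as a small tally, reading "net change" as the number of snippets where few-shot did better minus the number where it did worse, which is how the author clarifies it in the comments. The function names and the per-snippet encoding (+1, 0, -1) are assumptions for this sketch, not part of the market.

```python
from typing import List


def net_change(outcomes: List[int]) -> int:
    """Sum per-snippet outcomes: +1 few-shot better, -1 few-shot worse, 0 tie."""
    if any(o not in (-1, 0, 1) for o in outcomes):
        raise ValueError("each outcome must be -1, 0, or +1")
    return sum(outcomes)


def resolves_yes(outcomes: List[int], threshold: int = 2) -> bool:
    """Yes iff (better answers - worse answers) >= threshold."""
    return net_change(outcomes) >= threshold


# Made-up judgments for 10 snippets: 3 better, 1 worse, 6 ties.
example = [1, 1, 0, 0, -1, 1, 0, 0, 0, 0]
print(net_change(example), resolves_yes(example))  # prints "2 True"
```

Under this reading, 2 better and 1 worse answers give a net change of +1 and would resolve No, even though few-shot "did better on 2 cases".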




bought Ṁ10 NO

I'd say it depends on the code snippet itself; it's pretty good at basic programming with few-shot, but for cp (competitive programming) it's pretty bad at debugging.

predicted NO

Hmm, so I realized the question description is ambiguous. I intended the criterion to resolve Yes iff the net improvement is >= +2 (i.e. number of better answers minus number of worse answers >= 2). If this isn't how people read it, I will cancel the market.

predicted NO

@JacobPfau I'll leave this comment standing for a week, and resolve N/A if any existing traders understood the question differently (sorry if so). If no one complains, I'll update the description to clarify and let the market continue.

predicted NO

@JacobPfau Updated to add 'net change'
