When will I spend >=30 minutes, on a median workday, interacting with a GPT-like tool?
Any GPT-series model would qualify. Any app using an LM-pretrained transformer for token generation would qualify, if most interactions with the app include queries to the model.

Currently I use GitHub Copilot, but most interactions with my code editor do not involve Copilot, so this does not qualify yet.

For context, I'd estimate I currently spend roughly 30-60 minutes per work week interacting with LMs. I work on language model alignment as a PhD student. If I work on a project that requires intensive direct work with an LM on a temporary basis, this does not qualify. I will resolve this question by taking an average over a one-month period.

Resolving this, as these days I frequently at least read Copilot's recommendations when coding (though it remains highly unreliable). I also query GPT-4 more frequently, as I've gotten a better sense of where it can speed me up vs. just mislead me on coding-related questions.

pinging @BionicD0LPH1N since you have a derivative market.

Do you have access to GPT-4?

@BionicD0LPH1N Yes, chat but not API access, though that'll likely change soon.

There's a decent chance this will resolve soon. Within the next few weeks I intend to commit to a randomly selected subset of days over a one-month period on which I'll monitor my AI usage time.
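
As a rough sketch of how that monitoring could be set up (not a committed procedure): the snippet below picks a reproducible random subset of workdays in a month and checks the median logged time against the 30-minute threshold from the question title. The function names, the Monday-Friday definition of a workday, the sample size, and the seed are all assumptions made for illustration. Note that it uses the median from the title; the description's "average over a one month period" could also be read as a mean.

```python
import random
import statistics
from datetime import date, timedelta

THRESHOLD_MINUTES = 30  # threshold taken from the question title


def workdays_in_month(year: int, month: int) -> list[date]:
    """Return all Monday-Friday dates in the month (assumption: holidays ignored)."""
    day = date(year, month, 1)
    days = []
    while day.month == month:
        if day.weekday() < 5:  # 0-4 are Monday-Friday
            days.append(day)
        day += timedelta(days=1)
    return days


def pick_monitoring_days(year: int, month: int, n_days: int, seed: int) -> list[date]:
    """Pick n_days workdays at random; a fixed seed makes the choice reproducible."""
    rng = random.Random(seed)
    return sorted(rng.sample(workdays_in_month(year, month), n_days))


def meets_threshold(minutes_by_day: dict[date, float]) -> bool:
    """True if the median logged minutes across monitored days is >= 30."""
    return statistics.median(minutes_by_day.values()) >= THRESHOLD_MINUTES
```

Fixing the seed before the month starts is what makes this a commitment: the monitored days cannot be chosen after seeing which days had heavy usage. For example, `pick_monitoring_days(2023, 4, 8, seed=0)` would select 8 of April 2023's workdays, and the logged minutes for those days would then be passed to `meets_threshold`.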

Increasingly defaulting to asking ChatGPT to write parts of scripts for me. I'd guess I'm currently at 90+ minutes per week.
FWIW I intend not to 'front-run' this market, and will trade as if this market referred to e.g. one of my colleagues at the NYU ARG.

@JacobPfau I also find it more worthwhile than before to spend time probing capabilities and biases of LMs.

I can't extract any money from this market, so I created a new one asking specifically about 2023. I placed a 100 mana YES limit order at 35% if anyone wants to bet against me.

Does it count if increasingly large language models end up built into apps you use every day, providing higher-quality autocomplete, auto-fill, recommendations, content fill-in or similar? Or does it have to be something like hand-tuning prompts and feeding them to a model to complete/infill (like GPT-3 sandbox) or doing interactive completions (like ChatGPT)?

@ML Yes, if I start using auto-complete for email 15 minutes a day plus 15 minutes of QA interaction, this would count. I use auto-complete very infrequently though.

Something I do use more is general spell/grammar check. I will not count this towards the 30 minutes, even if a language model is used for it on the backend. I'm stipulating this a bit arbitrarily, but it's mainly because an LM would probably provide only a minimal improvement in that domain.

