Options are inclusive: if this happens tomorrow then all options resolve to YES.
This market will of course resolve somewhat subjectively. The overall idea is "there is a model that I can use instead of hiring an intern". Some things I expect such a model to be able to do:
Take a paper as input and give a runnable rough draft implementation (does not need to be bug free)
Experiment with hyperparameters for an existing model ("grad student descent")
Make some nice loss curve graphs (using third-party tools for this, e.g. WandB, is acceptable)
Take two existing algorithms/models and run comparisons between them ("Model A performs better on this benchmark", "Model B trains 3x faster")
Avoid trivial mistakes such as omitting a validation set, testing on the training set, etc.
Some things I don't expect it to do:
Develop novel ML algorithms
Any kind of non-trivial distributed training (anything more complicated than "run your code with this flag to make it distributed")
Any kind of performance optimization (e.g. writing Triton kernels)
I will give myself one month (until 2024-07-20) to modify the resolution criteria based on feedback.