Will AI agents be used to develop software commercially by the end of 2023?
closes Dec 31

An AI agent is something which takes a task and directly applies changes to a code base (possibly via a merge request, letting a human review the changes). I.e. it works similarly to giving a task to a programmer.

The market resolves to "YES" if such agents exist by the end of the year and are used in commercial environments, essentially displacing the work of programmers.

The agent must work for a mainstream programming language and a commonly used code base format. An "AI app generator" which produces something from a template does not count, nor do specialized "no code" environments.

Tools like Copilot do not count - they are designed to help a programmer to write code, not to replace a programmer.

Experiments in a lab setting do not count - it's much easier to operate in a controlled environment.

JustNo is predicting NO at 66%

Does it have to be good? 🤣 I fully expect to see a lot of AI hype scams (I suspect this is the first I've seen in the wild, $99 to talk to a private AI bot - https://twitter.com/ReadMultiplex) and I can just about guarantee someone will make code this way and market their product as being written by AI. I also strongly suspect the code produced will be absolute trash.

Alex Mizrahi is predicting YES at 65%

@JustNo It has to meet the quality standards of commercial software development.

Braulio Valdivielso Martínez is predicting NO at 64%

@AlexMizrahi any examples of such standards? Test coverage, for instance?

Alex Mizrahi is predicting YES at 65%

@BraulioValdivielsoMartine There is no formal standard. The best way to assess quality is to sample the opinions of senior software developers. Information about such assessments can be posted in the press, blogs, etc. For example, if we see that Google considers the quality acceptable, that would be it.

firstuserhere bought Ṁ100 of YES

Would this type of stuff count? (keeping aside scale or commercial environment for now)

Alex Mizrahi is predicting YES at 70%

Yes. It is sufficiently general and it does the work which otherwise would be done by a human programmer.

Yonatan Cale

This bot automatically opens pull requests to update the dependencies in a repo:

This does replace some (though not much) of the work of a programmer, and is prompted by the bot (not by a human)

I'm guessing this doesn't count. Do you mean because it can't do a variety of tasks like a human programmer? Or some other reason maybe?

Alex Mizrahi is predicting YES at 70%

@YonatanCale It needs to be sufficiently general, i.e. it should be able to take a task in natural language and carry it out. It is mentioned in the description: "takes a task".

Dependabot does only one thing. Narrow tools like that have existed for decades, so it doesn't make sense to create a prediction market about them. The question is whether we'll get something new - more general, more powerful. It should be almost as powerful as a human programmer.

Yonatan Cale

"Almost as powerful as a human programmer" - I'd be happy if you were more specific (maybe give 10 tasks and say it should be able to do 7 of them?)

But this is enough for me to buy NO anyway

Alex Mizrahi is predicting YES at 67%

@YonatanCale Results are already available for tasks which are easy to specify and measure: "AlphaCode achieved an estimated rank within the top 54% of participants in programming competitions".

Commercial software development, however, does not have a well-defined measure of complexity. We can't use things like coding competitions, as they are skewed towards more self-contained tasks which are uncharacteristic of commercial software development.

So I'm afraid it's better to leave this open-ended.

If this resolves to YES, most likely the evidence will be in the form of articles claiming that programmers are being replaced by AI agents. I will use my own judgement as an expert (I am a CTO of a software company and a senior programmer) to ignore irrelevant evidence - for example, a bot with only 'narrow' functionality.

Yonatan Cale is predicting NO at 68%

@AlexMizrahi "articles claiming that programmers are being replaced by AI agents" (judged by you) adds relevant info for me.

Together with "Almost as powerful as a human programmer" - that removes stuff like "just generate css"


Collected Over Spread bought Ṁ10 of NO

I think AI agents could potentially be used to automate some mundane tasks like "rename Foo to Bar throughout the codebase, including when they appear as part of larger names (FooStatus, currentFoo, updateFoo), except in the SuperfooCoApi module." I want to say that tools already exist to automate a task like this, but not with a natural-language prompt.
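
As a rough illustration of the scripted (non-natural-language) version of that rename, here is a minimal Python sketch. It assumes the codebase is a hypothetical dict mapping file paths to source text, that the rename is a plain case-sensitive substring replacement, and that "the SuperfooCoApi module" means any path containing that name. The function name `rename_in_codebase` and the sample paths are made up for illustration.

```python
def rename_in_codebase(files, old="Foo", new="Bar", skip="SuperfooCoApi"):
    """Return a copy of `files` with `old` replaced by `new` in every file
    whose path does not contain `skip`.

    The replacement is a case-sensitive substring match, so larger names
    such as FooStatus, currentFoo and updateFoo are renamed as well.
    """
    renamed = {}
    for path, source in files.items():
        if skip in path:
            # Assumption: the excluded module is identified by its path.
            renamed[path] = source
        else:
            renamed[path] = source.replace(old, new)
    return renamed


# Hypothetical two-file codebase.
files = {
    "app/status.py": "class FooStatus: pass\ncurrentFoo = updateFoo()",
    "SuperfooCoApi/client.py": "def getFoo(): pass",
}
result = rename_in_codebase(files)
print(result["app/status.py"])
```

The hard part a natural-language agent would add is working out rules like the exclusion from the prompt itself, rather than having them hard-coded as parameters.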

Yonatan Cale

@CollectedOverSpread Copilot and GPT-4 are both way stronger than that

