
Will an AI system be able to fully refactor a 10k+ line codebase before 2026?
The growing capabilities and increasing context lengths of recent AI systems may enable ever more powerful applications for code and IT infrastructure in general.
A full refactoring is a long and intensive process that requires a substantial amount of skill and knowledge. A good refactoring usually increases the efficiency and readability of a codebase while facilitating further improvements to it.
Refactoring & generation rules
To be considered valid, the AI refactoring must demonstrate, in one go: good readability, efficiency gains (where possible), and harmonization of the code's syntax and structure, without any loss of features or deviation from the specification.
The system would need to work out everything on its own: the code, the configuration files, and essentially the entire GitHub repository.
Pre-generation user feedback is possible but must be 100% optional, and should only concern architecture preferences, naming conventions, and other high-level considerations.
Re-running the same input until a valid result is obtained will not count as a success.
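As a toy illustration of the "no loss in feature" criterion above (the function names and data here are hypothetical, chosen only for the example): a valid refactoring must preserve observable behavior exactly while improving readability and structure.

```python
# Before: verbose, repetitive implementation
def get_active_user_names(users):
    names = []
    for user in users:
        if user["active"] == True:
            names.append(user["name"])
    return names

# After: same observable behavior, clearer and more idiomatic
def get_active_user_names_refactored(users):
    return [user["name"] for user in users if user["active"]]

# Behavior preservation can be checked mechanically:
users = [{"name": "Ada", "active": True}, {"name": "Bob", "active": False}]
assert get_active_user_names(users) == get_active_user_names_refactored(users)
```

At a 10k+ line scale the same principle applies, just across the whole repository: the existing test suite (or an equivalence check) must pass identically on the original and refactored code.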
Reliability
It would need a very high average reliability (~95%+) across common programming languages (Python, Java, C++, C#, etc.) and libraries.
Allowed human interactions
Interactions that require administrator privileges and are directly requested by the system, such as package installation (feedback is possible for these).
Additional
There is one attempt for the final code generation, but internally the system may run as many iterative test loops and use as many external tools as needed.
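The internal test loop described above can be sketched as follows. This is a minimal, hypothetical sketch: `run_tests` stands in for the project's real test suite, and the `improve` callback stands in for the model's next internal attempt; none of these names come from an actual system.

```python
def run_tests(code):
    """Stand-in for the project's real test suite: success means
    features and specifications are preserved."""
    try:
        ns = {}
        exec(code, ns)
        assert ns["total"]([1, 2, 3]) == 6
        return True, ""
    except Exception as exc:
        return False, str(exc)

def refactor_with_internal_loop(candidate, improve, max_iters=5):
    """Iterate privately as often as needed; only the final passing
    candidate is ever shown to the user (the single allowed attempt)."""
    for _ in range(max_iters):
        ok, feedback = run_tests(candidate)
        if ok:
            return candidate  # the one final generation
        candidate = improve(candidate, feedback)
    return None  # no valid candidate within the iteration budget

# Toy "improvement" step standing in for the model's next attempt.
buggy = "def total(xs):\n    return sum(xs) - 1"
fixed = "def total(xs):\n    return sum(xs)"
result = refactor_with_internal_loop(buggy, lambda c, fb: fixed)
```

The point of the sketch is the asymmetry: the loop may fail and retry internally any number of times, but the user only ever sees one final output, which either resolves the question positively or does not.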
For resolution
I would prefer not to rely on a single source (including myself) for resolution,
which is why I will prefer to use public benchmarks (which, of course, do not exist yet...).
If none are available, I will fall back on online forum consensus.