
I'm thinking of something like https://mentat.ai/, but that actually works.
I will provide a paragraph or so describing the change I want made. Then it should create a GitHub PR, which I will review and leave only a few comments before merging. The whole process should take less than 30 minutes. This should work fairly reliably.
I tried this yesterday and it failed haha:
https://github.com/manifoldmarkets/manifold/pull/2694
See more discussion in my post:
This looks good to me. Stephen gave it two prompts to create this, and I think it took less than 10 minutes: https://github.com/manifoldmarkets/manifold/pull/3588
GPT 4.1 is awesome for coding.
It's genuinely really good. (mini is ok, nano is dogwater). I have been using it off Azure with Cursor, both as an assistant and as a tedious-implementation speedrunner. It has one-shot so many instructions that 4o would have a bad time with, and that Claude would overthink.
Not tab complete, mostly just asking stuff. Really has come a long way with code
Crazy how AI agents are building small features for me almost daily and this market is still at 80%
I'd like to conduct some tests using codebuff/cursor. What are acceptable small features in your mind? I have a couple ideas:
- Add a button to the comment's bottom row that allows users to tip the commenter. Denormalize the tip amount onto the comment and display the total tipped amount on the button.
- Add a delete button for admins/mods that marks a comment as deleted (don't actually delete the comment, just set both the deleted and hidden flags) so that the comment is hidden completely from the market.
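To make the two ideas concrete, here is a rough sketch of the data side of each one. Everything in it is an assumption for illustration: the `MarketComment` and `Actor` shapes, the field names (`deleted`, `hidden`, `totalTipped`), and the helper functions are hypothetical and not taken from the actual Manifold codebase.

```typescript
// Hypothetical comment shape. Field names are assumptions, not Manifold's real schema.
type MarketComment = {
  id: string
  userId: string
  text: string
  deleted: boolean
  hidden: boolean
  totalTipped: number // denormalized running total of tips on this comment
}

// Hypothetical minimal user shape for permission checks.
type Actor = { id: string; isAdmin: boolean; isMod: boolean }

// Soft delete: keep the record, set both flags so the comment disappears from the market.
function softDeleteComment(comment: MarketComment, actor: Actor): MarketComment {
  if (!actor.isAdmin && !actor.isMod) {
    throw new Error('Only admins or mods can delete comments')
  }
  return { ...comment, deleted: true, hidden: true }
}

// Tipping: add the tip to the denormalized total stored on the comment itself,
// so the button can show the total without summing individual tip records.
function tipComment(comment: MarketComment, amount: number): MarketComment {
  if (!Number.isFinite(amount) || amount <= 0) {
    throw new Error('Tip amount must be a positive number')
  }
  return { ...comment, totalTipped: comment.totalTipped + amount }
}

// What the market page would render: anything flagged deleted or hidden is dropped.
function visibleComments(comments: MarketComment[]): MarketComment[] {
  return comments.filter((c) => !c.deleted && !c.hidden)
}
```

A real implementation would also need the button UI, an API endpoint, and the balance transfer for the tip. This only shows the flag and denormalization logic described in the two bullets.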
@JamesGrugett said the delete comment button for spam fit the bill, I'll try using codebuff to do this soon
@ian I am aware that you work on Manifold, but since you are also the largest YES holder, can we maybe agree to let @JamesGrugett do these kinds of evaluations when the time comes?
@CalibratedNeutral That sounds reasonable, although he doesn't work at manifold anymore so I'm not sure if he'll want to put 30 mins in to do this. I was going to film my attempt from scratch
@CalibratedNeutral I was not aware of that. Then maybe a third party (another developer working on Manifold)? The stakes are reasonably high for me, so I would strongly prefer to have everything as unbiased as possible.
@CalibratedNeutral Alternatively, @JamesGrugett could test this question on his new startup, codebuff. He uses codebuff to help develop codebuff
@ian Either option sounds good to me as long as the resolution criteria are followed according to @JamesGrugett's judgement