I want to open any text box on my phone
I want to hit a button and talk to Claude or a GPT, instead of typing
It would write what I told it to
Either transcribing what I say or creatively writing things
It'd be able to read its own writing and edit it
I can ask it to edit to make things more / less formal, etc etc
Afaict this is shippable today if the user adds their own API key... Would be incredible. I want it really bad. It's slow but typing is hell already
It doesn't have to read context. It'll just write things into the box. But the more context it can get, the better.
Not required but mind blowing:
Upload images along with it
If the LLM easily knows my bio data and can fill things in well
If the LLM has personalities or modes I can switch between, each with its own context
Also I want this now. If you have experience writing keyboards please contact me I'll pay. How hard is this?
Why are we waiting for interruptibility? This is it, we are done. Every single bit of extra context you add powers this up - for example the label before the text box ("first name") or the full page etc.
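Rough sketch of what I mean by context powering it up: take what I said, plus whatever the keyboard can see about the text box, and fold it all into one prompt. Everything here is made up for illustration (`FieldContext`, `MODES`, `build_prompt`, and the wording of the instructions); it only builds the prompt string and doesn't call any real API.

```python
from dataclasses import dataclass


@dataclass
class FieldContext:
    """Whatever the keyboard can see about the focused text box (all optional)."""
    label: str = ""          # e.g. "first name"
    current_text: str = ""   # text already in the box, so the model can edit it
    page_excerpt: str = ""   # optional snippet of the surrounding page


# Hypothetical modes; the names and wording are assumptions, not a real product's.
MODES = {
    "transcribe": "Write exactly what the user said, fixing only spelling and punctuation.",
    "creative": "Treat the user's speech as instructions and write the text they asked for.",
    "formal": "Rewrite the existing text in a more formal tone.",
}


def build_prompt(spoken: str, ctx: FieldContext, mode: str = "transcribe") -> str:
    """Assemble one prompt string from speech plus whatever context is available."""
    parts = [MODES[mode]]
    # Each piece of context is added only if the keyboard actually has it.
    if ctx.label:
        parts.append(f"Field label: {ctx.label}")
    if ctx.page_excerpt:
        parts.append(f"Page context: {ctx.page_excerpt}")
    if ctx.current_text:
        parts.append(f"Text already in the box: {ctx.current_text}")
    parts.append(f"User said: {spoken}")
    parts.append("Reply with only the text to put in the box.")
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("ernie", FieldContext(label="first name")))
```

With no context at all it still works (just the mode line, what I said, and the reply instruction), so reading context stays optional and each extra field is a pure bonus.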
@KTibow oh really? how does it work? I'd like to try. It's really weird that google voice => text (while awesome) hasn't improved in a year or more. That is a totally solved problem. Same for spell correction - it's still using some old heuristic model rather than an LLM to just intelligently figure out what I mean to say, and have it say that. Yes I would pay a lot to never ever ever have to manage moving my cursor around a tiny window to do spell corrections that an LLM could completely do.
@Ernie ah not quite to the extent of convenience of course...
in terms of correction:
it has better-than-gboard autocorrection by default
it lets you turn on microsoft editor to view grammar suggestions
and you can explicitly ask for an llm to edit/generate text
but it doesn't have whisper-level transcription or llm correction by default
@KTibow ah, okay. oh wait if you ask for an llm what happens? can I just talk to it and it'll write things?
there's one workflow that I'd take over for example - entering image descriptions into ideogram. right now I have claude expand them / make up random stories and stuff, fill in details. it's very teachable. I'd like to go to ideogram.ai and open the textbox on firefox on my phone, then just hit a button and talk to claude and have it write stuff (with perfect spellcheck and transcription too obv). paying a cent or whatever for it is fine...
@KTibow general question to you as a keyboard-things-knower: is there any way to get gboard (or a similar product) on PC? seems crazy but like I want really good voice to text, etc here too, yet PCs just seem to be lagging infinitely behind...
@Ernie idk.
what is out right now can kind of do that (eg swiftkey has bing chat/"copilot" for built in chat, windows has win+h for built in dictation, and you can always switch between apps). i don't use those so i don't know the quality though. (i could try rigging together a script to call whisper on a hotkey if you want)
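a rough sketch of the shape such a hotkey script could take: record, transcribe, optionally let an llm clean it up, then type the result. the recorder, the whisper call, the llm cleanup and the typing are all passed in as plain functions here, because which libraries to use for them is an open choice; `dictate_once` and its parameter names are hypothetical, and nothing below talks to a real microphone or model.

```python
from typing import Callable, Optional


def dictate_once(
    record: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    insert_text: Callable[[str], None],
    cleanup: Optional[Callable[[str], str]] = None,
) -> str:
    """Run one record -> transcribe -> (optional cleanup) -> insert pass."""
    audio = record()                      # e.g. capture until the hotkey is released
    text = transcribe(audio).strip()      # e.g. a Whisper call would go here
    if cleanup is not None and text:
        text = cleanup(text)              # e.g. LLM spell/grammar correction
    if text:
        insert_text(text)                 # e.g. simulate typing into the focused box
    return text


if __name__ == "__main__":
    # Stand-in functions so the flow can be run without any audio hardware.
    typed = []
    dictate_once(
        record=lambda: b"fake audio",
        transcribe=lambda audio: " helo world ",
        insert_text=typed.append,
        cleanup=lambda t: t.replace("helo", "hello"),
    )
    print(typed)
```

a real version would bind `dictate_once` to a global hotkey and swap the lambdas for actual recording, transcription and typing code.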