Will dictation via free tools be faster and more accurate than typing in ~~April~~ June 2023?
Resolved NO (Jul 23)

The goal of this market is to assess whether free dictation tools are ready for regular use in professional writing.

In June 2023, I will dictate a 1000-word English text, run a spell checker / grammar checker to correct errors, and then diff the result against a ground truth of the same text. If the diff reveals any remaining mistakes, I will proofread the text manually and repeat the diff until all errors are corrected. I will then repeat the same process, but typing by hand in both qwerty and dvorak. This market resolves YES if the time to reach a perfect match with the ground truth is lower by dictation than by hand typing (the faster of qwerty and dvorak), once time spent proofreading is included.
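The diff step and the resolution rule can be sketched roughly as follows. This is only an illustration, not the tooling actually used for this market: the function names `remaining_errors` and `resolves_yes` are hypothetical, and comparing word by word (ignoring whitespace differences) is an assumption about what counts as a "perfect match".

```python
import difflib


def remaining_errors(candidate: str, ground_truth: str):
    """Return word-level differences between an attempt and the ground truth.

    Assumption: texts are compared word by word, so differences in spacing
    or line breaks are ignored. An empty list means a perfect match.
    """
    truth_words = ground_truth.split()
    cand_words = candidate.split()
    matcher = difflib.SequenceMatcher(a=truth_words, b=cand_words, autojunk=False)
    return [
        (tag, truth_words[i1:i2], cand_words[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def resolves_yes(dictation_s: float, qwerty_s: float, dvorak_s: float) -> bool:
    """YES only if dictation (including proofreading) beats the faster layout."""
    return dictation_s < min(qwerty_s, dvorak_s)


if __name__ == "__main__":
    truth = "They left their bags over there"
    attempt = "They left there bags over there"  # typical homophone error
    for tag, expected, got in remaining_errors(attempt, truth):
        print(tag, expected, got)
```

Proofreading would repeat until `remaining_errors` returns an empty list, with the clock running the whole time.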

Please leave suggestions for realistic text and dictation engines to use in the comments. The text should include some language that dictation software usually has trouble with, such as their/there/they're and two/to/too, but be representative of professional writing. In the interest of fairness, I will attempt all suggested dictation engines that I can use for free and take the fastest. I will also consider dictation engines included in software I already use (i.e., zero marginal cost to me). I currently use Linux for personal use and MS Office / Win10 at work.

Because resolution requires trust, I will not be betting in this market.



Legal typing (Qwerty in Word) with initial corrections and formatting 21:10.85

Scientific typing (Qwerty in Word) with initial corrections and formatting 25:42.02

Corrections (Qwerty in Word), v1: 5:10.73

Corrections (Qwerty in Word), v2: 3:11.87

Total for typing: 55:14

STT in Word

Legal (STT in Word) 19:47

Scientific (STT in Word) 21:26

Corrections, v1 by hand: 21:43

Corrections, v1 via GPT4: 00:20 (time to write prompt)

Corrections, v2 by hand following v1 via GPT4: 16:30

Lower bound for STT in Word: 62:56

Lower bound for STT in Word with GPT: 58:03
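As a sanity check, the totals above can be recomputed from the individual timings. The sketch below is not part of the original experiment, and the helper names `to_seconds`, `fmt`, and `total` are made up for illustration. The two STT lower bounds reproduce exactly; the listed typing components sum to about 55:15, roughly a second off the stated 55:14, which may simply reflect rounding.

```python
def to_seconds(stamp: str) -> float:
    """Convert 'mm:ss' or 'mm:ss.ff' into seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


def fmt(total_seconds: float) -> str:
    """Format seconds as 'mm:ss', rounded to the nearest second."""
    minutes, seconds = divmod(round(total_seconds), 60)
    return f"{minutes}:{seconds:02d}"


def total(stamps) -> float:
    """Sum a list of timing strings, in seconds."""
    return sum(to_seconds(s) for s in stamps)


# Timings copied from the results above.
TYPING = ["21:10.85", "25:42.02", "5:10.73", "3:11.87"]
STT_BY_HAND = ["19:47", "21:26", "21:43"]
STT_WITH_GPT = ["19:47", "21:26", "00:20", "16:30"]

if __name__ == "__main__":
    print("Typing:", fmt(total(TYPING)))
    print("STT in Word:", fmt(total(STT_BY_HAND)))
    print("STT in Word with GPT:", fmt(total(STT_WITH_GPT)))
```

Either way, typing comes out ahead of both STT lower bounds, consistent with the NO resolution.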

Tried Talon. Speech recognition is pretty good, but it fails hard on homophones. It also completely fails when entering text into Notepad, but succeeds with Word. Odd. Speech to text: 14:30. Corrections: > 30 min.

Well, Talon is very good at voice recognition, but "dictation mode" is unable to accurately paste in what it reads.

I have decided on a test text: (1) the preliminary opinion from DUPREE v. YOUNGER
CERTIORARI TO THE UNITED STATES COURT OF APPEALS FOR THE FOURTH CIRCUIT
No. 22–210. Decided May 25, 2023, without the note (765 words). https://en.wikisource.org/wiki/Dupree_v._Younger

and the first page of the first interesting actual journal article in Nature this week:

(2) Calvin, A., Eierman, S., Peng, Z. et al. Single Molecule Infrared Spectroscopy in the Gas Phase. Nature (2023). https://doi.org/10.1038/s41586-023-06351-7 (786 words).

@brp Time to record: Legal 06:44; Scientific 7:46.

predicted NO

@brp Any progress since this?

This very much depends on how good a typist you are. I can type ~140 WPM while concentrating, so for me dictation would be much slower, but for someone who types 40 WPM I could see it being useful.

predicted YES

Any plans to resolve this @brp?

@DanMan314 Plans, yes. Schedule, no. Will open up and change date to June. Sorry about this.

Reopening for two days so I have time to actually do the experiment.

To get a well-written but obscure text in English with high probability, just visit https://en.wikisource.org/wiki/Special:Random

bought Ṁ200 of NO

@duck_master Testing with this method, I came up with https://en.wikisource.org/wiki/Proclamation_1506B and recorded the first two paragraphs. I then ran whisper with the small.en model on the recording (that's the biggest model that will fit into my laptop's VRAM).

Recording + fixing took me 3 minutes, typing took me 2 minutes 58 seconds (that's slightly slower than @brp if we're counting).

I then went through and compared the two versions and the original. My typed version had no errors (I corrected as I went), and my corrected whisper version still had 7: things like extra commas, small wording changes, and capitalization differences. Correcting those without the benefit of a diff may not be easy; on the other hand, they may not matter so much for dictation?

This is far better than whisper did on the Jabberwocky poem, which I didn't even bother comparing because I had to correct every third word.

Whisper doesn't have any integration to actually use it in real time - much less as a keyboard substitute - so I'm not sure it really counts as dictation software?

Are you going to benchmark these dictation tools to your typing speed or some average typing speed from a survey? How fast do you type?

@lukalot I will be doing my own typing and proofreading. A quick online typing speed test says I type at 62 wpm qwerty and 35 wpm dvorak (correcting for errors as I go). For this question I will use the faster of the two.
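For a rough sense of scale, those speeds can be turned into a lower bound on raw typing time. This is a back-of-the-envelope sketch only: the 1551-word figure is an assumption taken from the word counts quoted elsewhere in this thread for the two test texts (765 + 786), and `minutes_to_type` is a hypothetical helper.

```python
def minutes_to_type(words: int, wpm: float) -> float:
    """Minutes needed to type `words` words at a steady `wpm`, ignoring proofreading."""
    return words / wpm


# Assumption: combined length of the two test texts quoted in this thread.
TEST_WORDS = 765 + 786

if __name__ == "__main__":
    for layout, wpm in [("qwerty", 62), ("dvorak", 35)]:
        print(f"{layout}: {minutes_to_type(TEST_WORDS, wpm):.1f} min")
```

An online speed test uses easy text, so real typing of legal or scientific prose with formatting and corrections can be expected to take noticeably longer than this estimate.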

Ok! I've tried comparing my own typing speed to several internet demos of voice dictation, and while I'm usually faster, they can be pretty close (especially with code examples).
If you've chosen the text you're going to use, that could be relevant (is it just random words or actual sentences with punctuation?), but I think I'm more optimistic than the current market anyway, so I'll wager yes. Thanks for the info.

candidates, quite possibly not there yet but close enough that they could plausibly compete:

  • openai whisper (currently just a component; inclusion depends on if you're filtering by full deployability or just raw accuracy without ui consideration)

  • talon voice recognition

  • dragon voice recognition (I would short this one)

  • google voice input (I would short this one)

very, very important question here: what microphone will you use? even for humans, microphones are really not all the same for understandability. current voice rec algos are a bit weak.

@L Thank you. This comment is gold. I hadn't thought about the microphone at all. Was going to use whatever is built into my USB headset (PTC Plantronics C3210), as it seems to elicit fewer complaints than the built-in mic of my computer or phone.

I found a website advertising it as "Frequency Response 20 - 20000 Hz". A search for "PTC 208" (written on the device) turns up some regulatory forms in New Zealand, so it might be subject to regulations regarding gain, wave distortion, etc.