(Throughout Advent I’m sharing some hints as to how web developers can make my life as a speech recognition user easier.)
I wrote yesterday about enhancing long drop-down menus to turn them into combo boxes, which act more like text areas and so are somewhat more tractable to voice. However, you can still screw them up; here are two ways I’ve seen recently.
The first is where you’re implementing autocomplete on a text area. The best way of doing this is to provide a drop-down menu of possible completions and only fill the text area when one is explicitly selected. (This is how Google Search does it, for instance.) This means that until I’ve finished dictating into the text area I can continue to use Dragon commands to correct that dictation. If you remember, Dragon maintains an internal view of what the text is in the current input field, so if you complete automatically in the text field this internal view is now incorrect. We might have the following, having dictated “James”:
“James” has been input, but the web app has added “Aylett”, and the text cursor will be after that.
If I actually meant “chains”, and I use Dragon’s correction commands, Dragon will try to correct the word immediately before the text cursor, which it thinks is “James”, but which is actually “Aylett”. Dragon typically uses character by character selection, so what we are likely to end up with is something like “James Achains”.
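To make the arithmetic concrete, here is a rough sketch of that misfire. It is not how Dragon is actually implemented; correctLastWord is a hypothetical function that assumes the tool simply selects backwards from the real cursor by the length of the word it believes it dictated.

```typescript
// Hypothetical model of a dictation tool correcting the last dictated word.
// It only knows its own view of the text, so it selects backwards from the
// real cursor by the length of the word it believes is there.
function correctLastWord(
  actualText: string,
  cursor: number,
  believedWord: string,
  replacement: string,
): string {
  const selectionStart = Math.max(0, cursor - believedWord.length);
  return (
    actualText.slice(0, selectionStart) +
    replacement +
    actualText.slice(cursor)
  );
}

// No auto-completion: the field and the tool's view agree.
const inSyncResult = correctLastWord("James", 5, "James", "chains");

// Auto-completion has appended " Aylett" and moved the cursor to the end,
// so the five characters before the cursor are "ylett", not "James".
const outOfSyncResult = correctLastWord("James Aylett", 12, "James", "chains");

console.log(inSyncResult); // chains
console.log(outOfSyncResult); // James Achains
```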
Note that once the user has selected a completion from the menu, the text input is naturally going to contain some stuff that Dragon doesn’t know about. Voice users should be able to spot these kinds of explicit situations, and have a command specifically to resynchronize Dragon’s view of an input, if they need to edit it further.
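The suggest-then-select approach recommended above might be sketched like this. It is a minimal model rather than a real widget: FieldState, onInput and onSelect are hypothetical names, the candidate list is made up, and a real page would wire these up to DOM events and a proper combo box. The point is only that typing never changes the field's value; an explicit selection does.

```typescript
// Hypothetical state for an autocompleting field: what is in the text input,
// and what is being offered in the drop-down menu.
interface FieldState {
  value: string;
  suggestions: string[];
}

// Called as the user types or dictates. The field keeps exactly what was
// entered; completions are only offered alongside it.
function onInput(typed: string, candidates: string[]): FieldState {
  const prefix = typed.toLowerCase();
  const suggestions =
    prefix === ""
      ? []
      : candidates.filter((c) => c.toLowerCase().startsWith(prefix));
  return { value: typed, suggestions };
}

// Called only when the user explicitly picks an item from the menu.
// This is the one place where the field's contents are replaced.
function onSelect(state: FieldState, index: number): FieldState {
  if (index < 0 || index >= state.suggestions.length) {
    return state; // nothing valid was picked; leave the field alone
  }
  return { value: state.suggestions[index], suggestions: [] };
}

// Made-up candidate list for illustration.
const people = ["James Aylett", "Jamie Example"];

const afterDictation = onInput("James", people);
console.log(afterDictation.value); // James
console.log(afterDictation.suggestions); // [ 'James Aylett' ]

const afterSelection = onSelect(afterDictation, 0);
console.log(afterSelection.value); // James Aylett
```

Because the field still reads “James” until something is chosen, a correction to “chains” at that point lands on the right characters.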
The other problem is more insidious, although I haven’t seen it in a web app as yet. It’s the way Google Chrome makes its address bar work. Someone on the Chrome team clearly decided that the “http://” part of the URL wasn’t necessary; other schemes are shown, so why take up space showing the most common? Except that when you copy and paste the URL from Chrome, it always includes the scheme, even if it’s HTTP. This is very clever, but trips up Dragon.
If you want to synchronize Dragon’s view of an input with what’s already there, you say “cache document”. Dragon then selects all the text and copies it into its own view; after that it manually moves the cursor to the beginning, then forward to the place where Dragon believes it to be. At this point (usually) Dragon and the input match each other, and voice editing commands will work smoothly.
But when the URL is copied out of Chrome’s address bar, the “http://” part is added to the front, meaning that Dragon thinks it’s there but Chrome, when editing commands are applied to the input field, does not. This creates a similar problem to the first example, in that trying to select parts of the text (to replace it, to add more before or after it, or to apply formatting commands such as capitalization) will select the wrong characters.
So with Google’s URL in the address bar (Google actually uses HTTPS for everything these days, but it’s easier to pretend it doesn’t for the purposes of explanation than to find a website that won’t move to HTTPS in the future), and Dragon thinking it’s synchronized, we might choose to go to Microsoft’s website instead, saying “select Google / Microsoft”. Dragon’s internal view is now “http://Microsoft.com”. However, it will attempt to make that edit by selecting the eighth to thirteenth characters and typing “Microsoft” over them. Because the scheme isn’t in the actual text input, those positions cover “com” rather than “google”, and you end up with “google.Microsoft”.
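The same edit can be worked through in a few lines. Again this is only an illustration under assumptions: replaceRange is a hypothetical helper, and the real mechanics of how Dragon selects and types are not something I can vouch for. It applies one pair of offsets, computed from the copied text, to both the copied text and the text that is really in the field.

```typescript
// Replace the characters from start (0-based, inclusive) to end (exclusive).
// Offsets past the end of the text are clamped by slice().
function replaceRange(
  text: string,
  start: number,
  end: number,
  replacement: string,
): string {
  return text.slice(0, start) + replacement + text.slice(end);
}

// What the tool copied out of the address bar, scheme included.
const copiedView = "http://google.com";
// What the address bar's editable text really contains.
const actualField = "google.com";

// "select Google": offsets are computed against the copied view.
// Index 7 up to (not including) 13 is the eighth to thirteenth characters.
const selStart = copiedView.indexOf("google");
const selEnd = selStart + "google".length;

const viewAfterEdit = replaceRange(copiedView, selStart, selEnd, "Microsoft");
const fieldAfterEdit = replaceRange(actualField, selStart, selEnd, "Microsoft");

console.log(viewAfterEdit); // http://Microsoft.com
console.log(fieldAfterEdit); // google.Microsoft
```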
The only way of working round this is to copy the contents of the address bar out, edit them separately from Google Chrome, and then copy them back in again.