Vocola

Vocola is a voice command language—a language for creating commands to control a computer by voice. Two versions are available: Vocola 2 works with Dragon NaturallySpeaking (DNS) and Vocola 3 works with Windows Speech Recognition (WSR) on Windows. While DNS and WSR handle the heavy lifting, Vocola (pronounced "vo-CO-luh") concentrates on features and ease of use. In particular, Vocola offers the following:

Easy to use:

  • Simple, concise command syntax—most commands are one-liners
  • Easy to view and modify commands
  • Changed commands are loaded automatically
  • Large set of useful sample commands
  • Free

Features:

  • Create commands which capture any dictated words
  • Use concise number ranges, optional words, and inline word lists
  • Specify different actions for variable words
  • Speak a continuous sequence of commands
  • Re-use work with include files and user-defined functions

See Vocola 2 or Vocola 3 for details on these and other features.

WSR users and tire-kickers should note that Vocola 3 supports dictation to any application and calling extension functions written in any .NET language.

Examples

Here are four voice commands defined in Vocola:

Copy That = {Ctrl+c};

Copy to WordPad = {Ctrl+a}{Ctrl+c} AppBringUp(WordPad);

1..40 (Left | Right | Up | Down) = {$2_$1};

Sort by (Date=e | Sender=n | Subject=s) = {Alt+v}o $1;

The first is a simple keystroke command—saying "Copy That" sends the keystroke Control-C, which copies the current selection to the clipboard. The great majority of commands needed for controlling a computer by voice are simple keystroke commands like this.

The second command, invoked by saying "Copy to WordPad", copies a window of text (Control-A selects all text and Control-C copies it) and brings up the WordPad editor (using the built-in function AppBringUp).

The third command controls the cursor: saying "3 Left" moves left three characters, and "6 Down" moves down six lines. Spoken words match variable terms on the left and are substituted into the keystroke sequence on the right. For example, when saying "3 Left" the spoken "3" matches the numeric range 1..40 (referenced as $1) and the spoken "Left" matches the alternative set (Left | Right | Up | Down) (referenced as $2). The keystroke sequence {Left_3} is constructed and sent, and the cursor moves left three characters.
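The substitution step can be illustrated outside Vocola. The following Python sketch is only a model of the idea, not Vocola's implementation; the function name expand and the list-of-matches representation are assumptions made for illustration. It replaces each $N reference in the template {$2_$1} with the N-th spoken term.

```python
import re

def expand(template, matched):
    """Replace each $N in the template with the N-th matched term (1-based).

    Hypothetical helper for illustration; Vocola performs this step itself.
    """
    return re.sub(r"\$(\d+)", lambda m: matched[int(m.group(1)) - 1], template)

# Saying "3 Left": "3" matches the range 1..40 ($1), "Left" matches the alternatives ($2)
print(expand("{$2_$1}", ["3", "Left"]))   # {Left_3}
print(expand("{$2_$1}", ["6", "Down"]))   # {Down_6}
```

The order of references in the template is independent of the order in which the words are spoken, which is why the direction ($2) can appear before the count ($1) in the keystroke.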

The fourth command sorts messages in Mozilla's Thunderbird Mailer when you say "Sort by Date", "Sort by Sender", or "Sort by Subject". The matched word "Date", "Sender", or "Subject" causes the corresponding keystroke "e", "n", or "s" to be inserted into the keystroke sequence, choosing the desired option in Thunderbird's View > Sort menu.
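The word-to-keystroke mapping in this command behaves like a small lookup table. The Python sketch below models that idea only; parse_alternatives and sort_by are hypothetical names, and it assumes that the space before $1 in the command separates actions and is not itself sent as a keystroke.

```python
def parse_alternatives(spec):
    """Parse 'Date=e | Sender=n' into {'Date': 'e', 'Sender': 'n'}.

    An alternative written without '=' stands for itself.
    Hypothetical helper for illustration, not part of Vocola.
    """
    table = {}
    for item in spec.split("|"):
        spoken, sep, value = item.strip().partition("=")
        table[spoken] = value if sep else spoken
    return table

SORT_KEYS = parse_alternatives("Date=e | Sender=n | Subject=s")

def sort_by(word):
    # Assumption: whitespace between "o" and "$1" is not sent as a keystroke
    return "{Alt+v}o" + SORT_KEYS[word]

print(sort_by("Sender"))  # {Alt+v}on
```

Writing the spoken word and its keystroke side by side keeps the command a one-liner while still letting each alternative trigger a different action.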

Why a custom voice command language?

Other systems for defining voice commands are grafted onto existing programming languages. This means you can program any behavior you want, but you're stuck with the syntactic overhead of the base language. In contrast, Vocola is designed specifically as a voice command language, not as a general-purpose programming language. This means you can write quickly and concisely the great majority of voice commands you need, and use another language in the few cases where you need more power.

When I switched from the Dragon Macro Language to Vocola, I was able to convert all but two of my 200+ Dragon macros, reducing the source line count by roughly 6:1. At this writing I use well over 1000 Vocola commands.

Source code

Vocola is open sourced under the MIT license. Source code can be found on GitHub; there are separate repositories for Vocola 2 and Vocola 3.

Acknowledgments

Thanks to Joel Gould for NatLink, which enabled and inspired Vocola; to Mark Lillibridge for Vocola 2 enhancements, support, and great ideas; to Rob Chambers and the Microsoft speech group for helping with many WSR issues in Vocola 3; and to the speech recognition community listed in Voice Resources.

 
Copyright © 2002-2023 Rick Mohr