Introduction: Untethered Speech Recognition and Speech Synthesis With Arduino

MOVI stands for 'My Own Voice Interface'. It is a Kickstarter-backed Arduino shield that makes it easy to build your own speech dialogs for controlling devices, all from within the Arduino IDE. This quick Instructable shows you how to set up an Arduino with MOVI. It assumes basic familiarity with Arduino; if you haven't used Arduino before, read an introductory Instructable on it first.

You need:

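- An Arduino Uno, Mega, or Leonardo (other boards also work, but the setup is more involved)

- Speakers that plug into a headphone jack, such as active (powered) speakers. Passive 4 ohm or 8 ohm loudspeakers will not work.

- A power supply, e.g. 12V 500mA or 4 AA alkaline batteries (with rechargeables, use 5, because they supply only about 1.2V per cell).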

- A MOVI shield.

- An LED

To program the Arduino, you will also need a computer and the Arduino programming cable.

If you haven't already, download the latest version of the MOVI Arduino library as a zip file and install the library into the Arduino IDE. Instructions can be found here.

Step 1: Mount the MOVI Shield

Disconnect all power and USB cables from your Arduino board and plug the MOVI shield onto it, making sure all the pins align correctly. If two pins are left sticking out, you have an older board. MOVI will still work, but you should read Appendix A of the User Manual first to see if you need to set a jumper.

Connect an external speaker or a headset to the Audio Out. Audio Out is labeled “HEADPHONES” and is the audio jack further away from the onboard microphone, closer to the Arduino headers. You can only connect speakers with headphone-level impedance, such as active speakers made for laptops and cell phones.

Connect an LED to Arduino header pin D13 (+) and GND (-). A series resistor is recommended, but the LED will survive without one long enough for this test.
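If you do want to add the resistor, its value follows from Ohm's law. The snippet below is only a small sketch of that arithmetic. The function name ledResistorOhms is made up for illustration, and the numbers in the usage note (a 5V output pin, a red LED with about 2V forward voltage, 15mA of current) are assumptions; check the datasheet of your own LED.

```cpp
// Series resistor for an LED, from Ohm's law: R = (V_pin - V_forward) / I_led.
// ledResistorOhms is a hypothetical helper for illustration only.
double ledResistorOhms(double pinVolts, double forwardVolts, double ledAmps) {
  // The resistor has to drop whatever voltage the LED itself does not.
  return (pinVolts - forwardVolts) / ledAmps;
}
```

With the assumed values, ledResistorOhms(5.0, 2.0, 0.015) comes out at about 200 ohms, so a common 220 ohm resistor would be a reasonable pick.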

Step 2: Power On

Connect the external power supply to the Arduino board and switch it on. After about 2 seconds, you should see MOVI’s LED (close to the microphone) blinking with increasing frequency. The speakers will say “MOVI is booting”. Eventually the LED will stop blinking and stay on. This indicates MOVI is ready. If the LED does not come on at all, turn off the power and consult the user manual. If you have not heard anything by the time the LED turns steady red, check your speakers or headset and their connection.

Step 3: Program Arduino

Connect the USB programming cable to the Arduino board.

Important: Always connect the USB cable after you have connected the external power supply. It is safe to disconnect the USB cable while the power is on. You can also unplug the power safely at any time, except while MOVI is updating, learning a new call sign, learning new sentences, or resetting to factory settings. MOVI’s LED blinks randomly while it is not safe to unplug. However, do not disconnect the external power while the USB cable is plugged into the Arduino. USB alone does not supply enough voltage to the board, which leaves MOVI in an unstable state where the LED might be on or blinking but MOVI does not work properly.

Load the LightSwitch example: open the File menu, go to Examples, and choose MOVI or MOVI(tm) Voice Control Shield, depending on the version of the IDE. The screen should look similar to the image.

Compile and upload.

Step 4: Test the Code

Operate the example as follows:

- Get close to the microphone capsule and say “Arduino” in a normal voice. Wait for two beeps.

- Say “Let there be light”. Wait for two beeps again.

- The speakers should say “and there was light” and the LED on the Arduino board should turn on.

- Now say “Arduino” and wait for two beeps.

- Then say “Go dark”. Wait for two beeps again. The LED on the Arduino board turns off.

The video shows the execution.


Step 5: Write Your Own Code

Congratulations! Your first speech-controlled device is running.

Now go ahead and change the code. For example:

1) Change the call sign from "Arduino" to "Computer" (Star Trek!) by changing the line after init() to recognizer.callSign("COMPUTER").

2) Change the recognized sentence from "Go dark" to "I am tired".

3) Add a spoken response for when the second sentence is recognized by adding:

recognizer.say("Good night!");

inside the if (res==2) {} block.
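Put together, the three changes might look roughly like the sketch below. It is not a copy of the LightSwitch example: the Action struct and the handleResult() function are hypothetical helpers added here so the decision logic sits in one place and can be tried on a PC, and the setup code is written from memory of the library's examples, so compare it against the LightSwitch file that ships with your version of the MOVI library.

```cpp
// LightSwitch-style sketch with the three changes from this step applied.
// Action and handleResult() are hypothetical helpers, not part of the MOVI library.

// What to do in response to one recognition result.
struct Action {
  bool changeLed;     // true if the LED state should change
  bool ledOn;         // new LED state, only meaningful if changeLed is true
  const char *reply;  // sentence to speak, or nullptr to stay silent
};

// Map the sentence number returned by poll() to an action.
// Sentences are numbered from 1 in the order they were added.
Action handleResult(int res) {
  if (res == 1) return {true, true, "and there was light"};  // "Let there be light"
  if (res == 2) return {true, false, "Good night!"};         // "I am tired"
  return {false, false, nullptr};                            // anything else: do nothing
}

#ifdef ARDUINO  // the hardware part only builds inside the Arduino IDE
#include "MOVIShield.h"
#ifdef ARDUINO_ARCH_AVR
#include <SoftwareSerial.h>  // needed on AVR boards such as the Uno
#endif

const int led = 13;
MOVI recognizer(true);  // true enables the serial monitor interface

void setup() {
  pinMode(led, OUTPUT);
  recognizer.init();
  recognizer.callSign("COMPUTER");               // change 1: new call sign
  recognizer.addSentence("Let there be light");  // sentence 1
  recognizer.addSentence("I am tired");          // change 2: sentence 2
  recognizer.train();
}

void loop() {
  Action a = handleResult(recognizer.poll());
  if (a.changeLed) digitalWrite(led, a.ledOn ? HIGH : LOW);
  if (a.reply) recognizer.say(a.reply);          // change 3: "Good night!" for sentence 2
}
#endif
```

Keeping the decision in handleResult() is just one way to organize it. Adding the say() call directly inside the existing if (res==2) {} block, as described above, works the same way.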

To learn more, you can check out the other examples in the MOVI library.


Two quick notes:

- Whenever you change the call sign or add/change sentences, MOVI will have to retrain, which takes about 30 seconds.

- Once you are finished programming MOVI, you can disconnect the USB cable and run MOVI off the batteries or power supply alone. MOVI does not need any connection to a PC or the Internet.