This project uses a Raspberry Pi with a USB microphone and speaker to build an interactive answering machine. We call it Talking Ganesha. In Hindu culture, God Ganesha is regarded as the God of Knowledge, and we built this project to create a talking Ganesha for the Ganesh festival. We use the Google Speech API to convert speech to text, the WolframAlpha API to answer questions, and eSpeak to speak out the answer returned by WolframAlpha.

It is based on initial work done by Mr Dave. His instructions can be found at http://makezine.com/projects/universal-translator/.

This project was developed with the help of Utsav (my Infosys teammate) and Suhas (Vigyan Ashram Fab Lab instructor).

You need the following hardware:

1. Raspberry Pi B+

2. USB sound card (in case you don't have a USB headset or a USB mic + speaker)

3. Headphone or Mic + Speaker with audio jack

4. Internet Connection

https://youtu.be/Ofez1F3VLAE

Step 1: Set Up Raspberry Pi

(This is taken from step 1 of Universal Translator: http://makezine.com/projects/universal-translator/)

This step assumes you have Raspbian installed on your Raspberry Pi.

Update the software on your Raspberry Pi:

sudo apt-get update

sudo apt-get upgrade

Install the software required for this project with the following commands:

sudo apt-get install python-pip mplayer flac python2.7-dev libcurl4-gnutls-dev

# To fetch objects over HTTP, e.g. calling the Speech API from a Python script

sudo pip install requests pycurl

# To get answers for your questions

sudo pip install wolframalpha

# To manage sound / audio devices

sudo apt-get install alsa-utils

# To convert text to speech

sudo apt-get install espeak

# We shall use the Google Speech API to convert speech to text
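Before moving on, it can help to confirm that everything from this step is really installed. The following is only a rough sketch and not part of the original project: the helper names (find_on_path, missing_commands, can_import, missing_modules, print_report) are made up for illustration, and the list of executables is an assumption about what the packages above provide. It avoids Python 3 only features so that it should also run under the Python 2.7 that this project installs.

```python
import importlib
import os

# Executables assumed to be provided by the packages installed above.
REQUIRED_COMMANDS = ["mplayer", "flac", "arecord", "aplay", "espeak"]
# Python modules installed with pip above.
REQUIRED_MODULES = ["requests", "pycurl", "wolframalpha"]


def find_on_path(name, path=None):
    """Return the full path of an executable, or None if it is not on PATH."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_commands(commands, lookup=find_on_path):
    """Return the commands that the lookup function cannot find."""
    return [name for name in commands if lookup(name) is None]


def can_import(module_name):
    """Return True if the module can be imported by this Python."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


def missing_modules(modules, check=can_import):
    """Return the modules that fail the import check."""
    return [name for name in modules if not check(name)]


def print_report():
    """Print one line per missing command or module."""
    for name in missing_commands(REQUIRED_COMMANDS):
        print("missing command: %s" % name)
    for name in missing_modules(REQUIRED_MODULES):
        print("missing Python module: %s" % name)
```

Calling print_report() on the Raspberry Pi prints nothing when all the listed commands and modules are found. Run it with the same Python interpreter that you use for the project scripts, because pip installs modules per interpreter.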

Step 2: Create Google Speech API Key and Wolfram API Key

Download the required Python and shell scripts.

Create a new folder called "Talking Ganesha" inside your home folder.

Download the following scripts from the download section:

text-to-translate.py

queryProcess.py

stt.sh

Execute this command on the Raspberry Pi to make stt.sh executable:

sudo chmod +x stt.sh

Google Speech API Key :

Use instructions from http://makezine.com/projects/universal-translator (Step 4).

This Google API key is used in the script "text-to-translate.py" (key='GoogleKey').

For Wolfram Key :

http://products.wolframalpha.com/api/

Complete the form to activate your account.

Click the "Get an AppID" button on the right.

Enter an application name and description.

Now copy the AppID.

This key is used in the script "queryProcess.py" (app_id = "WolframAlpha Key").
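To give an idea of what the AppID is used for, here is a minimal sketch of asking Wolfram Alpha a question from Python. It is not the actual content of queryProcess.py. It assumes the interface documented by the wolframalpha package installed in Step 1 (a Client built from the AppID, a query() method, and a results iterator whose items have a text attribute), and first_answer is a made-up helper name.

```python
def first_answer(client, question):
    """Return the text of the first result pod, or None if there is none.

    `client` is expected to behave like wolframalpha.Client (assumption).
    """
    result = client.query(question)
    pod = next(iter(result.results), None)
    if pod is None:
        return None
    return pod.text


# Example use on the Raspberry Pi (needs internet and a valid AppID):
#
#   import wolframalpha
#   client = wolframalpha.Client("WolframAlpha Key")
#   print(first_answer(client, "What is the capital of India?"))
```

Passing the client in as an argument keeps the AppID in one place and makes the function easy to try with a stand-in object when there is no network.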

Note that you can make 50 calls per day to the Google Speech API and 2000 calls per month to the Wolfram Alpha API. This usage must be restricted to personal (non-commercial) use.
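A quick bit of arithmetic shows which of the two limits you will hit first. This sketch assumes that every question costs exactly one Google Speech call and one Wolfram Alpha call, which is an assumption about how the scripts behave. The function name max_questions_per_month is made up for illustration.

```python
GOOGLE_CALLS_PER_DAY = 50        # limit quoted above
WOLFRAM_CALLS_PER_MONTH = 2000   # limit quoted above


def max_questions_per_month(days_in_month=30):
    """Return how many questions fit within both limits in one month."""
    google_monthly = GOOGLE_CALLS_PER_DAY * days_in_month
    return min(google_monthly, WOLFRAM_CALLS_PER_MONTH)
```

For a 30 day month the Google limit allows 50 x 30 = 1500 questions, which is below the 2000 allowed by Wolfram Alpha, so the daily Google limit is the one to watch.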

Step 3: Set Up Audio Devices (USB Sound Card)

(This is taken from step 2 of Universal Translator: http://makezine.com/projects/universal-translator/...)

Plug in the USB headset (use a powered USB hub, if necessary).
Run the following commands, which will list your sound devices:

cat /proc/asound/cards

cat /proc/asound/modules

You should see that the headset is listed as card 1. The second command should show that the driver for card 0 (the default output) is snd_bcm2835, which is the Raspberry Pi's analog audio output. The driver for card 1 (our headset) is snd_usb_audio. If you don't see the headset listed, try rebooting:

sudo reboot

In order to set the USB headset as the default for both audio input and output, you'll need to update the ALSA configuration file. Open it in the text editor nano:

sudo nano /etc/modprobe.d/alsa-base.conf

Change the line that says:

options snd-usb-audio index=-2

to:

options snd-usb-audio index=0

Save and close the file with Ctrl-X and typing y. Reboot the Raspberry Pi using the following command:

sudo reboot
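If you prefer not to edit the file by hand, the same one-line change can be made with a small script. This is only a sketch: set_usb_audio_index is a made-up helper that works on the text of the file, and it assumes the option line looks exactly like the one shown above.

```python
import re

# Matches an uncommented "options snd-usb-audio index=<number>" line.
USB_INDEX_LINE = re.compile(
    r"^([ \t]*options[ \t]+snd-usb-audio[ \t]+index=)-?\d+[ \t]*$",
    re.MULTILINE,
)


def set_usb_audio_index(config_text, new_index=0):
    """Return config_text with the snd-usb-audio index set to new_index."""
    return USB_INDEX_LINE.sub(
        lambda match: match.group(1) + str(new_index), config_text
    )


# Example use (run as root, and keep a backup copy of the file first):
#
#   path = "/etc/modprobe.d/alsa-base.conf"
#   with open(path) as handle:
#       original = handle.read()
#   with open(path, "w") as handle:
#       handle.write(set_usb_audio_index(original))
```

Lines that are commented out with # and options for other modules are left untouched, so only the USB audio entry changes.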

After the reboot, the sound system is reloaded. Run the same commands again:

cat /proc/asound/cards

cat /proc/asound/modules

You should now see the USB headset listed as the default input/output device (card 0).
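The check can also be done from Python by parsing the output of /proc/asound/cards. The sketch below is illustrative only: parse_cards and default_card_is_usb are made-up names, the sample text (including the device names) is an invented example of the usual layout, and it assumes that ALSA labels USB devices with the driver name USB-Audio in this file.

```python
import re

# One card line looks like: " 0 [Headset        ]: USB-Audio - Some USB Headset"
CARD_LINE = re.compile(r"^\s*(\d+)\s+\[([^\]]*)\]:\s+(\S+)\s+-\s+(.*)$")

# Invented sample of /proc/asound/cards after the change (assumption).
SAMPLE_CARDS = """\
 0 [Headset        ]: USB-Audio - Example USB Headset
                      Example USB Headset at usb-example, full speed
 1 [ALSA           ]: bcm2835 - bcm2835 ALSA
                      bcm2835 ALSA
"""


def parse_cards(text):
    """Return a list of dicts, one per sound card found in the text."""
    cards = []
    for line in text.splitlines():
        match = CARD_LINE.match(line)
        if match:
            cards.append({
                "index": int(match.group(1)),
                "id": match.group(2).strip(),
                "driver": match.group(3),
                "name": match.group(4).strip(),
            })
    return cards


def default_card_is_usb(text):
    """Return True if card 0 is handled by the USB audio driver."""
    return any(
        card["index"] == 0 and card["driver"] == "USB-Audio"
        for card in parse_cards(text)
    )
```

On the Raspberry Pi you would pass in the real file contents, for example default_card_is_usb(open("/proc/asound/cards").read()).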

Test it out by recording a 5 second clip from the microphone:

arecord -d 5 -r 48000 ganesha.wav

Play it back through the headphone speakers:

aplay ganesha.wav
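If the playback is silent or sounds wrong, it is worth checking what was actually recorded. Here is a small sketch using Python's standard wave module to report the sample rate and length of the test clip. describe_wav is a made-up helper name, and the expected values in the comment assume the arecord command above was used unchanged.

```python
import contextlib
import wave


def describe_wav(source):
    """Return channel count, sample rate and duration of a WAV file.

    `source` may be a file name such as "ganesha.wav" or an open file object.
    """
    with contextlib.closing(wave.open(source, "rb")) as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "seconds": wav.getnframes() / float(rate),
        }


# Example use after the recording test above:
#
#   info = describe_wav("ganesha.wav")
#   print(info)   # expect a sample_rate of 48000 and roughly 5 seconds
```

A duration close to zero usually means that nothing was captured, which points back to the microphone level in alsamixer or to the default card setting.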

To adjust the levels you can use the built-in utility alsamixer. This tool handles both audio input and output levels.

sudo alsamixer

Step 4: Ganesha Starts Answering

Use the following command:

sudo python text-to-translate.py

You should now hear "Please ask me any question now". You then have 5 seconds to ask your question. Wait for some time, and Ganesha will speak the answer.

Wait until you hear "Please ask me any question now" again before asking another question.

Ganesha keeps going until the user presses Ctrl+Z to halt the script.

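You can change how long it listens for a question by updating the recording duration in the stt.sh script. For the arecord command this is the -d parameter (duration in seconds); the -t parameter only sets the file type.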
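The overall flow described in this step (prompt, listen, answer, speak, repeat) can be sketched in a few lines of Python. This is not the actual text-to-translate.py. The function names, the fallback sentences, and the idea of passing the listen and answer steps in as functions are all assumptions made for illustration. Only the espeak command and the prompt text come from the steps above.

```python
import subprocess

PROMPT = "Please ask me any question now"


def speak(text):
    """Speak the text aloud with the espeak command-line tool."""
    subprocess.call(["espeak", text])


def run_once(listen, answer, say=speak):
    """Run one question-and-answer round and return the spoken reply.

    listen() should return the recognised question as text (or "").
    answer(question) should return the answer as text (or None).
    """
    say(PROMPT)
    question = listen()
    if not question:
        # Hypothetical fallback, not taken from the original scripts.
        say("Sorry, I did not hear a question")
        return None
    reply = answer(question) or "Sorry, I do not know the answer"
    say(reply)
    return reply


def run_forever(listen, answer, say=speak):
    """Keep answering questions until the script is stopped."""
    while True:
        run_once(listen, answer, say)
```

Keeping listen, answer and say as separate functions means each part (speech recognition, Wolfram Alpha lookup, eSpeak output) can be tried on its own, for example by passing a function that returns a typed question instead of recording audio.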

Step 5: In Case of Issues Getting Proper Speech Recognition

Replace the recording line in the stt.sh file as shown below:

***********************************************************************************

echo "Recording your Speech (Ctrl+C to Transcribe)"
#arecord -q -f cd -t wav -d 4 -r 16000 | flac - -f --best --sample-rate 16000 -s -o test.flac; # old code

# try to record audio with sox

rec -q -c 1 -r 16000 test.flac trim 0 5 # this is new code

***********************************************************************************

Also install the sox package:

sudo apt-get install sox
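If you would rather start the sox recording from Python than from stt.sh, the same command line can be built as a list of arguments. This is a sketch only. rec_command is a made-up helper, and its default values simply mirror the new rec line shown above.

```python
def rec_command(seconds=5, rate=16000, output="test.flac"):
    """Build the sox `rec` command used above as a list of arguments."""
    return [
        "rec", "-q",          # quiet mode, no progress display
        "-c", "1",            # one channel (mono)
        "-r", str(rate),      # sample rate in Hz
        output,
        "trim", "0", str(seconds),  # keep the first `seconds` seconds
    ]


# Example use on the Raspberry Pi (needs sox and a working microphone):
#
#   import subprocess
#   subprocess.call(rec_command(seconds=7))
```

Changing the seconds argument is then the only thing needed to give people more or less time to ask their question.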