Introduction: Raspberry Pi Based Answering Ganesha
This project uses a Raspberry Pi with a USB microphone and speaker to build an interactive answering machine, which we call Talking Ganesha. In Hindu culture, Ganesha is regarded as the god of knowledge, so we built a Talking Ganesha for the Ganesh festival. The project uses the Google Speech API to convert speech to text, the WolframAlpha API to answer the question, and eSpeak to speak the answer returned by WolframAlpha.
It is based on initial work done by Mr Dave. His instructions can be found at http://makezine.com/projects/universal-translator/
This project was developed with the help of Utsav (my Infosys teammate) and Suhas (Vigyan Ashram Fab Lab Instructor).
You need the following hardware:
1. Raspberry Pi B+
2. USB sound card (only if you don't have a USB headset or a USB mic and speaker)
3. Headset, or mic and speaker, with audio jacks
4. Internet Connection
https://youtu.be/Ofez1F3VLAE
Step 1: Set Up Raspberry Pi
(This is taken from step 1 of Universal Translator: http://makezine.com/projects/universal-translator/)
This guide assumes that Raspbian is already installed on your Raspberry Pi.
Update the software on your Raspberry Pi:
sudo apt-get update
sudo apt-get upgrade
Install the software required for this project with the following commands:
sudo apt-get install python-pip mplayer flac python2.7-dev libcurl4-gnutls-dev
# To make HTTP requests, e.g. calling the Speech API from a Python script
sudo pip install requests pycurl
# To get answers for your questions
sudo pip install wolframalpha
# To manage sound / audio devices
sudo apt-get install alsa-utils
# To convert text to speech
sudo apt-get install espeak
# The Google Speech API is used to convert speech to text
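Before moving on, it can help to confirm that everything installed correctly. The following is a minimal sketch, not part of the original scripts, that checks for the command-line tools and Python modules installed above. The helper names (find_on_path, missing_tools, missing_modules) are our own, and the tool list assumes that alsa-utils provides arecord and aplay. It should run under Python 2.7 or Python 3.

```python
import os

# Command-line tools installed by the apt-get commands above
# (arecord and aplay are assumed to come from alsa-utils).
REQUIRED_TOOLS = ["mplayer", "flac", "arecord", "aplay", "espeak"]
# Python modules installed by the pip commands above.
REQUIRED_MODULES = ["requests", "pycurl", "wolframalpha"]


def find_on_path(tool, path=None):
    """Return the full path of an executable, or None if it is not on PATH."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, tool)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(tools, path=None):
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if find_on_path(tool, path) is None]


def missing_modules(modules):
    """Return the modules that this Python interpreter cannot import."""
    missing = []
    for name in modules:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Missing tools: %s" % (", ".join(missing_tools(REQUIRED_TOOLS)) or "none"))
    print("Missing modules: %s" % (", ".join(missing_modules(REQUIRED_MODULES)) or "none"))
```

Run it with the same interpreter you use for the project (python), because pip installs the modules for that interpreter only.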
Step 2: Create Google Speech API Key and Wolfram API Key
Download the required Python and shell scripts.
Create a new folder called "Talking Ganesha" inside your home folder.
Download the following scripts from the download section:
text-to-translate.py
queryProcess.py
stt.sh
Run this command on the Raspberry Pi to make stt.sh executable:
sudo chmod +x stt.sh
Google Speech API Key:
Follow the instructions at http://makezine.com/projects/universal-translator (Step 4).
Put this Google API key in the script "text-to-translate.py" (key='GoogleKey').
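To make the speech-to-text step less of a black box, here is a hedged sketch of how a reply from the Speech API could be turned into a question string. It is not taken from text-to-translate.py. It assumes the reply format of the older Speech API v2: one JSON object per line, where the first line is often an empty placeholder. The function name best_transcript is our own.

```python
import json


def best_transcript(response_text):
    """Return the first transcript found in a Speech API style reply.

    Assumed format (not confirmed by the original scripts): one JSON object
    per line, each with a "result" list; the first line may be empty.
    """
    for line in response_text.splitlines():
        line = line.strip()
        if not line:
            continue
        results = json.loads(line).get("result", [])
        if results and results[0].get("alternative"):
            # Alternatives are assumed to be ordered best-first.
            return results[0]["alternative"][0].get("transcript")
    return None
```

If the function returns None, nothing was recognized, and the script can simply ask the user to repeat the question.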
Wolfram API Key:
Go to http://products.wolframalpha.com/api/
Sign up.
Complete the form to activate your account.
Click the "Get an AppID" button on the right.
Enter an application name and description.
Copy the AppID.
Put this key in the script "queryProcess.py" (app_id = "WolframAlpha Key").
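As an illustration of how the AppID is used, here is a minimal sketch of a WolframAlpha query made with the wolframalpha package installed in Step 1. It is not the contents of queryProcess.py. The helper names (make_client, first_answer) and the fallback sentence are our own, and the sketch assumes a version of the package whose response object has a results iterator.

```python
def make_client(app_id):
    """Create a WolframAlpha client for the given AppID."""
    # Imported here so that first_answer() can be tried without the package.
    import wolframalpha
    return wolframalpha.Client(app_id)


def first_answer(client, question, fallback="Sorry, I do not know the answer."):
    """Return the text of the first result that has text, else the fallback."""
    response = client.query(question)
    for pod in response.results:
        text = getattr(pod, "text", None)
        if text:
            return text
    return fallback
```

A typical call would be first_answer(make_client("WolframAlpha Key"), "what is the height of Mount Everest"), with your own AppID in place of the placeholder.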
Note that you can make 50 calls per day to the Google Speech API and 2000 calls per month to the WolframAlpha API, and that this usage is restricted to personal (non-commercial) use.
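Because the limits are easy to hit during a festival, you may want the script to keep count. The sketch below is one possible way to do that and is not part of the original scripts. The class name CallBudget is our own, and the counts live only in memory, so they reset whenever the script restarts.

```python
import datetime


class CallBudget(object):
    """Count API calls within a period and refuse calls past the limit."""

    def __init__(self, limit, period_key):
        self.limit = limit
        self.period_key = period_key  # function mapping a date to a period id
        self.current_period = None
        self.used = 0

    def try_consume(self, today=None):
        """Record one call; return False if the limit is already reached."""
        today = today or datetime.date.today()
        period = self.period_key(today)
        if period != self.current_period:
            # A new day (or month) has started, so the count starts over.
            self.current_period = period
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


# Limits quoted in the note above.
google_budget = CallBudget(50, lambda d: d)                     # per day
wolfram_budget = CallBudget(2000, lambda d: (d.year, d.month))  # per month
```

Call google_budget.try_consume() before each speech request and skip the request when it returns False.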
Step 3: Set Up Audio Devices (USB Sound Card)
(This is taken from step 2 of Universal Translator: http://makezine.com/projects/universal-translator/...)
Plug in the USB headset (use a powered USB hub, if necessary).
Run the following commands, which will list your sound devices:
cat /proc/asound/cards
cat /proc/asound/modules
You should see that the headset is listed as card 1. The second command should show that the driver for card 0 (the default output) is snd_bcm2835, which is the Raspberry Pi's analog audio output. The driver for card 1 (our headset) is snd_usb_audio. If you don't see the headset listed, try rebooting:
sudo reboot
To set the USB headset as the default for both audio input and output, you'll need to update the ALSA configuration file. Open it in the text editor nano:
sudo nano /etc/modprobe.d/alsa-base.conf
Change the line that says:
options snd-usb-audio index=-2
to:
options snd-usb-audio index=0
Save and close the file by pressing Ctrl-X, typing y, and pressing Enter. Then reboot the Raspberry Pi using the following command:
sudo reboot
After the reboot, the sound system is reloaded. Run the same commands again:
cat /proc/asound/cards
cat /proc/asound/modules
You should now see the USB headset as the default input/output device (card 0).
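The same check can be done from Python, which is handy if you want the script to warn you when the wrong card is the default. This is a minimal sketch with our own function names (parse_cards, usb_is_default). It assumes the usual layout of /proc/asound/cards, where each card starts with a line like "0 [Headset ]: USB-Audio - name", and that USB sound devices are reported as USB-Audio.

```python
import re

# Matches the first line of each card entry: index, [id], driver - name.
CARD_LINE = re.compile(r"^\s*(\d+)\s+\[(.*?)\s*\]:\s*(.+?)\s+-\s+(.*)$")


def parse_cards(text):
    """Return {card_index: (card_id, driver, name)} from /proc/asound/cards text."""
    cards = {}
    for line in text.splitlines():
        match = CARD_LINE.match(line)
        if match:
            index, card_id, driver, name = match.groups()
            cards[int(index)] = (card_id, driver, name.strip())
    return cards


def usb_is_default(text):
    """Return True if card 0 is reported as a USB audio device."""
    cards = parse_cards(text)
    return 0 in cards and cards[0][1] == "USB-Audio"
```

To use it on the Raspberry Pi, read the file with open("/proc/asound/cards").read() and pass the text to usb_is_default.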
Test it out by recording a 5-second clip from the microphone:
arecord -d 5 -r 48000 ganesha.wav
Play it back through the headphone speakers:
aplay ganesha.wav
To adjust the levels you can use the built-in utility alsamixer. This tool handles both audio input and output levels.
sudo alsamixer
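If the playback sounds wrong, it can help to look at what was actually recorded. The sketch below reads the header of the test recording with Python's standard wave module and reports its sample rate, channel count and length. The function name describe_wav is our own.

```python
import contextlib
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with contextlib.closing(wave.open(path, "rb")) as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return rate, wav.getnchannels(), frames / float(rate)
```

For the recording made above, describe_wav("ganesha.wav") should report a rate of 48000 and a duration of about 5 seconds. A much shorter duration suggests that the recording was cut off.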
Step 4: Ganesha Starts Answering
Run the following command:
sudo python text-to-translate.py
You will hear the prompt "Please ask me any question now". Ask your question within 5 seconds, then wait a little while for the answer.
Wait until you hear "Please ask me any question now" again before asking another question.
Ganesha keeps answering until you press Ctrl+Z to halt the script.
You can change how long it listens for a question by changing the recording duration in the stt.sh script (for arecord this is the -d option, in seconds; -t sets the file type).
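The cycle described in this step (prompt, listen, answer, repeat) can be sketched as follows. This is an illustration under our own assumptions, not the contents of text-to-translate.py. The function names, and the sentence spoken when nothing is heard, are hypothetical. The listen and ask arguments stand for the speech-to-text and WolframAlpha steps.

```python
import subprocess

PROMPT = "Please ask me any question now"


def speak(text):
    """Speak the text aloud with eSpeak (installed in Step 1)."""
    subprocess.call(["espeak", text])


def answer_once(listen, ask, say=speak):
    """Run one prompt / listen / answer cycle and return what was spoken."""
    say(PROMPT)
    question = listen()
    if not question:
        reply = "Sorry, I did not hear a question."
    else:
        reply = ask(question)
    say(reply)
    return reply


def run_forever(listen, ask, say=speak):
    """Keep answering until the script is interrupted, for example with Ctrl+C."""
    try:
        while True:
            answer_once(listen, ask, say)
    except KeyboardInterrupt:
        pass
```

Passing the steps in as functions makes it possible to try the loop without a microphone, for example answer_once(lambda: "what is 2 plus 2", lambda q: "4").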
Step 5: In Case of Issues Getting Proper Speech Recognition
If speech recognition does not work well, replace the recording line in the stt.sh file as shown below:
***********************************************************************************
echo "Recording your Speech (Ctrl+C to Transcribe)"
#arecord -q -f cd -t wav -d 4 -r 16000 | flac - -f --best --sample-rate 16000 -s -o test.flac; # old code
# try to record audio with sox
rec -q -c 1 -r 16000 test.flac trim 0 5 # this is new code
***********************************************************************************
Also install the sox package:
sudo apt-get install sox
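If you would rather start the sox recording from Python than from stt.sh, the command above can be wrapped as in the following sketch. The function names (rec_command, record_question) are our own, and the defaults simply mirror the rec line shown in this step.

```python
import subprocess


def rec_command(output_path="test.flac", seconds=5, rate=16000):
    """Build the sox 'rec' command line shown in this step."""
    return ["rec", "-q", "-c", "1", "-r", str(rate),
            output_path, "trim", "0", str(seconds)]


def record_question(output_path="test.flac", seconds=5):
    """Record from the default input device and return rec's exit status."""
    return subprocess.call(rec_command(output_path, seconds))
```

For example, record_question(seconds=8) gives the speaker 8 seconds instead of 5.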