Introduction: Door Control Using Voice
This project uses voice commands to control an Intel Edison. It is divided into two main parts. The first part covers configuring the Intel Edison with the required sound packages. The second part covers a Python script that runs on the Edison and uses speech-processing library packages.
The article assumes that you have completed the initial configuration of the Intel Edison.
The greatest problem I faced while doing this was not writing the code but configuring the Intel Edison with certain packages and libraries. While experimenting, I nearly messed up my Edison to the point where re-flashing was the only option, and I had to carry out all the steps once again. Having no prior Linux experience beyond writing some C code in the well-known vi editor, I found parts of the configuration confusing. However, once you know what you need to install, or if you have previous experience in a Linux environment, it will take much less time. I would like to thank Esther Jun Kim, whose amazing article on GitHub about voice processing with the Edison helped me a lot with this project. The link is here.
Step 1: Configuration of Edison With Sound Card
Add AlexT's unofficial opkg repository.
To configure the repository, add the following lines to /etc/opkg/base-feeds.conf:
src/gz all http://repo.opkg.net/edison/repo/all
src/gz edison http://repo.opkg.net/edison/repo/edison
src/gz core2-32 http://repo.opkg.net/edison/repo/core2-32
The next job is to update the package index and install git:
opkg update
opkg install git
The next phase is to configure the Intel Edison for sound. Before that, connect the USB sound card to the Edison and switch the tiny mechanical switch toward the USB port. Please note that this article only targets the use of the Arduino expansion board. Now run the command lsusb in the console. Your device must appear in the list shown.
Once the sound card is detected, the next course of action is to install Advanced Linux Sound Architecture (ALSA) and the other libraries required for speech processing.
opkg install alsa-utils libportaudio2 libasound2 bison espeak alsa-lib-dev alsa-utils-dev alsa-dev python-numpy
Now you need to check whether ALSA is able to detect the sound card. List the available playback devices from the console:
aplay -L
Once you find your device in the list, note its name and proceed to the next step. In this case, the headset is shown under sysdefault:CARD=Device.
Create a ~/.asoundrc file and add the following line to configure the headset:
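A minimal sketch of that line, assuming the device name noted in the previous step (sysdefault:CARD=Device); substitute the name your own device reports:

```
pcm.!default sysdefault:CARD=Device
```

This makes the USB headset the default ALSA PCM device, so aplay and arecord use it without extra flags.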
Play something with aplay to test the headset, for example the standard ALSA test file:
aplay /usr/share/sounds/alsa/Front_Center.wav
You should hear the words "front center".
Similarly, record something with arecord to test the microphone:
arecord ~/test.wav
Press CTRL+C to stop recording, then play the file back:
aplay ~/test.wav
Step 2: Adding Packages for Speech Processing
This is the major step, where the packages required for speech processing need to be installed. Here PocketSphinx is used, a lightweight version of CMU Sphinx.
You need to install the packages and add a .dic and a .lm file. To generate those files, use the Sphinx knowledge base tool.
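The Sphinx knowledge base tool takes a plain-text corpus of the phrases you want to recognise, one phrase per line, and generates the matching .dic and .lm files for you. A hypothetical corpus for this project (the exact phrases are up to you) could be as simple as:

```
ON
OFF
```

Upload the corpus file to the tool and download the generated files; the tool names its output with a number, which is why the script later refers to files such as 1505.dic and 1505.lm.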
Download packages.rar and extract the shell scripts. Execute the shell scripts from the console to install the packages, one at a time.
1: pip install cython
2: Add the following paths to ~/.profile and reload it:
echo 'export LD_LIBRARY_PATH=/usr/local/lib' >> ~/.profile
echo 'export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig' >> ~/.profile
source ~/.profile
Step 3: Circuitry and Hardware
The circuit consists of the following components:
1: Intel Edison
2: Seeed Studio Grove servo
3: Seeed Studio Grove base shield
4: An LED
Connect the LED to pin 13. Attach the base shield and connect the servo to digital pin 3. Attach the sound card, and you are done.
Step 4: Python Script for Intel Edison
This section deals with the code. Download the code from the file section.
import collections
import mraa
import os
import sys
import time
from Servo import *

# Imports for PocketSphinx
import pyaudio
import wave
import pocketsphinx as ps
import sphinxbase

led = mraa.Gpio(13)
led.dir(mraa.DIR_OUT)
myServo = Servo("First Servo")
myServo.attach(3)

print("Starting")
while 1:
    # PocketSphinx parameters
    LMD = "/home/root/vcreg/1505.lm"
    DICTD = "/home/root/vcreg/1505.dic"
    CHUNK = 1024
    FORMAT = pyaudio.paInt16
    CHANNELS = 1
    RATE = 16000
    RECORD_SECONDS = 3
    PATH = 'vcreg'
    p = pyaudio.PyAudio()
    speech_rec = ps.Decoder(lm=LMD, dict=DICTD)

    # Record audio
    stream = p.open(format=FORMAT, channels=CHANNELS, rate=RATE,
                    input=True, frames_per_buffer=CHUNK)
    print("* recording")
    frames = []
    for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
        data = stream.read(CHUNK)
        frames.append(data)
    print("* done recording")
    stream.stop_stream()
    stream.close()
    p.terminate()

    # Write .wav file
    fn = "test.wav"
    # wf = wave.open(os.path.join(PATH, fn), 'wb')
    wf = wave.open(fn, 'wb')
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(p.get_sample_size(FORMAT))
    wf.setframerate(RATE)
    wf.writeframes(b''.join(frames))
    wf.close()

    # Decode speech
    # wav_file = os.path.join(PATH, fn)
    wav_file = fn
    wav_file = file(wav_file, 'rb')  # Python 2
    wav_file.seek(44)  # skip the 44-byte WAV header
    speech_rec.decode_raw(wav_file)
    result = speech_rec.get_hyp()  # returns (hypothesis, uttid, score)
    recognised = result[0]
    print("* LED section begins")
    print(recognised)
    if recognised == 'ON.':
        led.write(1)
        myServo.write(90)
    else:
        led.write(0)
        myServo.write(0)
    cm = 'espeak "' + recognised + '"'
    os.system(cm)
The algorithm for the above code is as follows.
1: Import necessary libraries
2: Record the sound
3: Check with the language model and the dictionary
4: Get the string that was decoded
5: Use the string to match predefined conditions
6: If matched, rotate the servo that acts as a lock and light up the green LED
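Steps 5 and 6 can be sketched as a small pure function, kept separate from the hardware calls so it is easy to test. This is only an illustration: the function name decide_action is mine, and the normalisation step is an assumption (the script above compares the raw hypothesis against 'ON.' directly), while the LED state and servo angle mirror the script.

```python
def decide_action(recognised):
    """Map a decoded hypothesis to (led_state, servo_angle).

    The hypothesis may carry trailing punctuation (e.g. 'ON.'),
    so normalise before matching.
    """
    command = recognised.strip().rstrip('.').upper()
    if command == 'ON':
        return (1, 90)   # light the LED, rotate the servo to unlock
    return (0, 0)        # otherwise keep the door locked

print(decide_action('ON.'))   # -> (1, 90)
print(decide_action('OFF.'))  # -> (0, 0)
```

Factoring the decision out like this also makes it simple to add more commands later without touching the recording or decoding code.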
A video of the working model is attached.