Here is a short video that explains what I managed to build:
Step 1: Get Device's Components
Vintage phone for the project enclosure. I was very attracted by the esthetics of this old device: the mix of materials, rich textures and shapes certainly adds to the experience. I found a reasonably priced candlestick phone and a ringer box on eBay.
Archos 28 as the driving device. The Archos 28 is a reasonably priced tablet that has all the features I need: 4 GB of internal memory, Wi-Fi, a microphone, audio out and an 800 MHz CPU.
One might ask: why not use a microcontroller and a set of chips? It looked simpler and more efficient to use the Archos 28, as it has all the components on one board and also comes with Android. Since my phone has to work 24/7, it remains plugged in all the time, so power consumption is not an issue.
IOIO board to interact with the hardware. The IOIO board is an amazing device: it plugs into an Android device via USB, and the Android device discovers it as an ADB host. There is a nice little API that allows any Android application to read the state of a line (as either a digital or an analog input) and to generate either a digital or a PWM signal on a line.
One might ask: why not use the Android ADK? Unfortunately, ADK support was added only in Android 2.3.4, and the Archos 28 runs 2.2.
Step 2: Clean Up
The attached picture shows all the components cleaned up and ready to be transformed into a new device.
Step 3: Prepare Components
The candlestick holder of the old device was also very simple to work with: it has a simple mechanism that pulls a line to ground while the telephone is on the hook.
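The hook line can be turned into clean on-hook/off-hook events with a little debouncing, since mechanical contacts tend to bounce. The following is a minimal Java sketch, assuming the line reads low while the handset is on the hook (as described above) and that raw samples come from periodically reading a digital input through the IOIO API. The class `HookSwitch`, its methods and the 50 ms window in the usage note are hypothetical names and values chosen for illustration; they are not part of the IOIO library.

```java
/** Debounces raw samples of the hook line. Hypothetical helper, not part of the IOIO library. */
final class HookSwitch {
    private final long debounceMillis;
    private boolean stableOffHook = false; // debounced state; starts as "on hook"
    private boolean lastRawOffHook = false; // most recent raw reading
    private long lastChangeMillis = 0;      // time of the most recent raw change

    HookSwitch(long debounceMillis) {
        this.debounceMillis = debounceMillis;
    }

    /**
     * Feeds one raw sample into the debouncer.
     *
     * @param lineHigh  raw line level; assumed low (false) while the handset is on the hook
     * @param nowMillis timestamp of the sample in milliseconds
     * @return true if the debounced state changed with this sample
     */
    boolean update(boolean lineHigh, long nowMillis) {
        boolean rawOffHook = lineHigh;
        if (rawOffHook != lastRawOffHook) {
            // The raw level changed: restart the stability timer.
            lastRawOffHook = rawOffHook;
            lastChangeMillis = nowMillis;
        }
        // Accept the new level only after it has been stable long enough.
        if (rawOffHook != stableOffHook && nowMillis - lastChangeMillis >= debounceMillis) {
            stableOffHook = rawOffHook;
            return true;
        }
        return false;
    }

    boolean isOffHook() {
        return stableOffHook;
    }
}
```

In use, one would create `new HookSwitch(50)` and call `update(...)` with every new reading of the line; a `true` result means the handset was just lifted or replaced.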
The Archos 28 didn't fit the old ringer box very well. To be exact, the USB cable was sticking out and preventing the device from fitting in. I used a Dremel tool to fix this: I cut a small slope so that the USB cable fits in nicely.
I also cut out holes for the wires and installed a couple of hinges.
Step 4: Preserve the Ringer
I used Inkscape to create the 2D design. I must mention that using this tool feels very much like torture. I really hope I'll be able to find better affordable tools for my next project.
Once the design was ready, I submitted it to Ponoko for laser cutting. The laser-cut panel arrived a couple of weeks later; it can be seen in the attached picture. Another picture shows how the components actually looked.
Step 5: Assemble the Circuit
Once the circuit was in the ringer box, Wi-Fi quality dropped significantly, so I had to purchase and attach an external Wi-Fi antenna. The Archos 28 happens to have a very well-defined U.FL port. Here is how the system fits into the ringer box:
You can see the circuit and how it fits into the enclosure in the attached pictures.
Step 6: Write Software
CMU Sphinx is an open-source voice recognition project maintained by Carnegie Mellon University. The system consists of two parts: the recognizer code, and files containing a voice model and a language model. It was easy to compile the library code for Android, and there is a great example posted by CMU Sphinx's creators. One can teach CMU Sphinx one's own pronunciation: all one has to do is record 20 sentences and run the generated files through a supplied tool. This can significantly increase recognition quality. What is more, one can build a language model, which tells the recognizer what words and phrases to expect. In my case the primary phrase was "call name", where name is one of the names from my address book. Having such a model also increases recognition quality.
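To illustrate how a recognizer can be told to expect only "call name" phrases, here is a small Java sketch that generates a JSGF grammar from a list of contact names. This is only one possible way to constrain recognition and may differ from the language model actually used in this project. The class `CallGrammar`, the grammar name `phone` and the sample names in the usage note are assumptions made for the example, and the sketch assumes the names contain only letters and spaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Builds a JSGF grammar for "call <name>". Hypothetical helper for illustration. */
final class CallGrammar {
    static String build(List<String> contactNames) {
        // Normalize names: trim whitespace, lower-case, skip empty entries.
        List<String> alternatives = new ArrayList<>();
        for (String name : contactNames) {
            String cleaned = name.trim().toLowerCase(Locale.ROOT);
            if (!cleaned.isEmpty()) {
                alternatives.add(cleaned);
            }
        }
        if (alternatives.isEmpty()) {
            throw new IllegalArgumentException("need at least one contact name");
        }
        StringBuilder sb = new StringBuilder();
        sb.append("#JSGF V1.0;\n");
        sb.append("grammar phone;\n");
        // The only public rule: the word "call" followed by one known name.
        sb.append("public <command> = call <name>;\n");
        sb.append("<name> = ").append(String.join(" | ", alternatives)).append(";\n");
        return sb.toString();
    }
}
```

For example, `CallGrammar.build(Arrays.asList("Alice", "Bob Smith"))` yields a grammar whose `<name>` rule is `alice | bob smith`. Every word in the grammar would also need an entry in the recognizer's pronunciation dictionary.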
One might ask: why not use Google Voice? Unfortunately, it is really bad at understanding my pronunciation, and it is also not very good at recognizing names.
One might ask: why not use a special microcontroller? I certainly considered this approach. One solution I found was Sensory. Unfortunately, it looked too expensive: it seemed that I would have to do the same amount of work as with CMU Sphinx and get comparable quality, but I would also have to pay for the chip.
"No speech generator": I became convinced of this after trying several different generators. All the text-to-speech engines produced a very unnatural voice, so I had to ask a human to record all the phrases that my phone can possibly say. What is more, I had her read each phrase several times. During playback I pick a random version of the phrase; this creates a strong illusion of a real human on the other end.
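The random-version trick can be sketched in a few lines of Java. The class `PhraseBank`, the phrase keys and the file names are hypothetical; the sketch only chooses which recording to play and leaves the actual audio playback out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Maps each phrase to several recorded takes and picks one at random. Hypothetical helper. */
final class PhraseBank {
    private final Map<String, List<String>> takes = new HashMap<>();
    private final Random random;

    PhraseBank(Random random) {
        this.random = random;
    }

    /** Registers all recorded takes (for example, file names) of one phrase. */
    void add(String phrase, List<String> recordingFiles) {
        if (recordingFiles.isEmpty()) {
            throw new IllegalArgumentException("need at least one recording for: " + phrase);
        }
        takes.put(phrase, new ArrayList<>(recordingFiles));
    }

    /** Returns one take of the phrase, chosen uniformly at random. */
    String pick(String phrase) {
        List<String> files = takes.get(phrase);
        if (files == null) {
            throw new IllegalArgumentException("no recordings for: " + phrase);
        }
        return files.get(random.nextInt(files.size()));
    }
}
```

With three takes registered under a key such as `number_please`, repeated calls to `pick("number_please")` return varying file names, so the same prompt does not sound identical every time.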
PJSIP is an open-source implementation of the SIP stack; in other words, it is an open VoIP library. I didn't have much trouble with it: I downloaded it, compiled it and used it. CSipSimple is a big open-source project that also uses it. That project was very helpful, as it contains some great usage examples.
One might ask: why not use Skype? This was my original idea, and I subscribed to the Skype Developer Program. Unfortunately, reading the license agreement revealed that the Skype SDK cannot be installed on any device running Android.
One might ask: why not use the SIP stack that is built into Android? Unfortunately, the stack was added only in Android 2.3, and the Archos 28 runs 2.2.
When the telephone is off the hook:
1. Wait one second
2. Say "Number, please!"
3. Start voice recognition
4. If recognized "call name", go to the next step; otherwise say "Sorry, I didn't get that" and go to step 3
5. Say "Calling name..."
6. Start voice recognition
7. If recognized "no" or "stop", go to step 2; otherwise go to the next step
8. Place a VoIP call
9. Say "Call placed"
10. Wait until the call is terminated
11. Say "Call terminated"
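The steps above can be read as a small state machine. The Java sketch below is one way to express it as a pure transition function; the state and event names are hypothetical. It also makes two assumptions that the list does not spell out: putting the handset back on the hook returns to idle from any state, and after "Call terminated" the phone simply waits for the handset to be replaced.

```java
/** Transition table for the call flow. Names are hypothetical, chosen for illustration. */
final class CallFlow {
    enum State { IDLE, PROMPTING, LISTENING_FOR_NAME, CONFIRMING, IN_CALL, WAIT_FOR_HANGUP }

    enum Event {
        OFF_HOOK, ON_HOOK,
        PROMPT_FINISHED,   // "Number, please!" has been played
        HEARD_CALL_NAME,   // recognizer returned "call <name>"
        HEARD_NO_OR_STOP,  // recognizer returned "no" or "stop"
        HEARD_OTHER,       // anything else, including silence
        CALL_TERMINATED
    }

    static State next(State state, Event event) {
        // Assumption: replacing the handset always resets the flow.
        if (event == Event.ON_HOOK) {
            return State.IDLE;
        }
        switch (state) {
            case IDLE:
                return event == Event.OFF_HOOK ? State.PROMPTING : state;
            case PROMPTING:
                return event == Event.PROMPT_FINISHED ? State.LISTENING_FOR_NAME : state;
            case LISTENING_FOR_NAME:
                // Anything but "call <name>" keeps listening (after the apology).
                return event == Event.HEARD_CALL_NAME ? State.CONFIRMING : state;
            case CONFIRMING:
                if (event == Event.HEARD_NO_OR_STOP) {
                    return State.PROMPTING; // back to "Number, please!"
                }
                if (event == Event.HEARD_CALL_NAME || event == Event.HEARD_OTHER) {
                    return State.IN_CALL;   // no objection heard: place the call
                }
                return state;
            case IN_CALL:
                return event == Event.CALL_TERMINATED ? State.WAIT_FOR_HANGUP : state;
            default:
                return state;
        }
    }
}
```

Keeping the transitions in a pure function like this makes the flow easy to test without a phone attached; the spoken prompts and the VoIP call would be triggered by whatever code observes the state changes.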
Android App Format
The phone application is actually a background service. There is also a lightweight user application that displays the current status. The service starts on app startup or on user app launch.
Where to Find Source
All the code that I wrote can be found on Google Code. You will also need to download and compile PJSIP and CMU Sphinx.
Step 7: Put It All Together
This project was a lot of fun to create: it involved woodwork, 2D CAD modeling and ordering a laser cut, Android software, hacking phone hardware and, of course, preserving vintage artifacts.
I would love to hear any comments or suggestions about this project.