Introduction: Add Custom Alexa Control to Raspberry Pi Project

This project is intended for anyone with a Raspberry Pi project written in Python who wants to add voice control via their existing Amazon Echo devices. You do not need to be an experienced programmer, but you should be comfortable using the command line and adapting existing code to fit your needs.

I initially set out on a project to enable my Raspberry Pi to be voice-controlled with Alexa so that it could heat water in a kettle to a specific temperature. Although the interaction I wanted was quite simple (pass one number from Alexa to the Raspberry Pi), it took a lot of work to get to that state from the existing tutorials. I hope this tutorial will make that process as fast as possible for others.

In my example, I start out with a Raspberry Pi Zero W with Raspbian. I have a Python3 program on my Pi that is capable of writing text to an SPI display, and I have a thermometer probe that I can read. For you, this program could be almost anything, but the idea is that you might have some input devices that you want to read via Alexa and/or some output devices that you want to control using Alexa.

The goal is to go from a basic program such as the one described above to a device that you can easily control with your Echo. Assuming you have this hardware already, this project should not cost you any money. In the end, you will get to the point where you can say things like:

Me: "Alexa, ask my gadget to check the temperature from sensor 1."

Alexa's response: "The probe reads 72.31 degrees."

or

Me: "Alexa, tell my gadget to write George Washington"

Response: The display connected to my Raspberry Pi now reads "George Washington"

In the next section, I will describe what needs to happen behind the scenes to make this work. If you just want to get this working on your project and don't care how it works, feel free to skip it (though it might make it harder if something goes wrong).

Step 1: Background

In this image (credit: https://developer.amazon.com/en-US/docs/alexa/alex...), we can see the general architecture of Alexa Gadgets.

When you say something to your Echo device, it sends the audio to the Alexa Cloud, where it is processed and where a response is generated to respond to you. When you ask what the weather is, it is just these two in communication. Now suppose that you want to add voice control to one of your small projects on a Raspberry Pi. Processing everything onboard would require significant hardware and a very sophisticated codebase to get things going. A better solution would be to leverage the Alexa Cloud, which is very sophisticated and has gotten very good at handling complex speech patterns. Alexa Gadgets provide a good way for you to do this.

An Alexa Gadget communicates with an Echo device over Bluetooth. Once this connection is established, the two pass each other messages using UTF-8 encoding. A message sent from the Echo to the gadget is called a directive. A message sent in the other direction, from the gadget to the Echo, is called an event. Before going into the exact flow of all this, we should introduce another key element: custom Alexa Skills.
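To make the encoding step concrete, here is a minimal sketch of how a payload could be turned into UTF-8 bytes and back. It assumes the payload is JSON, which matches how the gadget code later in this tutorial decodes directives. The helper names encode_message and decode_message are hypothetical, and the real Bluetooth transport is handled for you by Amazon's library.

```python
import json


def encode_message(payload: dict) -> bytes:
    """Serialize a payload dictionary to UTF-8 encoded JSON bytes."""
    return json.dumps(payload).encode("utf-8")


def decode_message(raw: bytes) -> dict:
    """Reverse of encode_message: UTF-8 bytes back to a dictionary."""
    return json.loads(raw.decode("utf-8"))


# Illustrative payload only; the real structure is shown later in the tutorial.
directive_payload = {"data": {"sensor_num": {"value": "1"}}}
raw_bytes = encode_message(directive_payload)
round_trip = decode_message(raw_bytes)
```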

Alexa allows developers to create their own custom skills, which lets them design their own interactions and behaviors for use on all Echo devices. For example, a developer could create a custom skill to tell you the distance between two airports in the US. A user would say: "Alexa, ask my custom distance calculator what the distance is between LAX and JFK" and it could respond with "2475 miles".

How does it do this? When a developer makes a custom skill, they define what are called "custom intents" with "sample utterances" containing "slots". For example, in this skill I might have the intent "calc_dist" to calculate the distance between two points. A sample utterance would be "what the distance is between {slot1} and {slot2}" or "how far between {slot1} and {slot2}". The slots shown in braces have specific types. In this case those types would be airport codes such as LAX, JFK, BOS, ATL.

When a user asks for the custom skill, the Alexa Cloud tries to match what the user says to a custom intent using the supplied sample utterances and tries to find valid slot values for that request. In this example, it would find that the user wanted the "calc_dist" intent and that slot1 is LAX and slot2 is JFK. At this point, the Alexa Cloud passes off the work to the developer's own code. Basically, it tells the developer's code what intent it received and what all the slot values were, among other details.
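The matching idea can be illustrated with a toy sketch. This is not how the Alexa Cloud works internally (its language understanding is far more sophisticated); it only shows the concept of mapping text to an intent name plus slot values. The function names are hypothetical.

```python
import re

# Sample utterances from the airport example, keyed by intent name.
SAMPLE_UTTERANCES = {
    "calc_dist": [
        "what the distance is between {slot1} and {slot2}",
        "how far between {slot1} and {slot2}",
    ],
}


def utterance_to_regex(sample):
    """Turn 'text {slot} text' into a regex with one named group per slot."""
    # re.split with a capturing group alternates literal text and slot names.
    parts = re.split(r"\{(\w+)\}", sample)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)          # literal text
        else:
            pattern += "(?P<%s>\\w+)" % part    # slot placeholder
    return re.compile("^" + pattern + "$", re.IGNORECASE)


def match_intent(text):
    """Return (intent_name, slot_values), or (None, {}) if nothing matches."""
    for intent, samples in SAMPLE_UTTERANCES.items():
        for sample in samples:
            match = utterance_to_regex(sample).match(text.strip())
            if match:
                return intent, match.groupdict()
    return None, {}


result = match_intent("what the distance is between LAX and JFK")
```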

The developer gets to decide where their code lives, but a very popular option is to use an AWS Lambda function. If you don't know what that is, it is essentially a service that allows you to upload code that can be run at any time and then charges you only for the amount of time that your code gets run. If we continue with our example, the developer's code might be a Python function that receives the two airport codes, looks up their locations, calculates the distances, and then sends a response back to the Alexa Cloud to speak something out to the user. The Alexa Cloud would then send that speech information back to the user's device, and they would get the answer.
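As a rough sketch of what such a function might look like, the code below reads the two slot values from an incoming intent request, computes a great-circle distance, and returns a plain-text speech response. The airport table with approximate coordinates, the Earth radius constant, and the helper great_circle_miles are assumptions made for illustration. The request and response dictionaries follow the general shape of the Alexa Skills Kit JSON interface, trimmed to the fields used here.

```python
import math

# Hypothetical lookup table: approximate (latitude, longitude) in degrees.
AIRPORTS = {
    "LAX": (33.94, -118.41),
    "JFK": (40.64, -73.78),
}

EARTH_RADIUS_MILES = 3958.8  # mean radius, an approximation


def great_circle_miles(origin, destination):
    """Haversine distance between two (lat, lon) points in miles."""
    lat1, lon1 = map(math.radians, origin)
    lat2, lon2 = map(math.radians, destination)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def lambda_handler(event, context):
    """Entry point: read the slots and answer with spoken text."""
    slots = event["request"]["intent"]["slots"]
    origin = slots["slot1"]["value"].upper()
    destination = slots["slot2"]["value"].upper()
    miles = great_circle_miles(AIRPORTS[origin], AIRPORTS[destination])
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": f"{round(miles)} miles"},
            "shouldEndSession": True,
        },
    }


# A trimmed-down request containing only the fields this sketch reads.
sample_event = {
    "request": {
        "type": "IntentRequest",
        "intent": {
            "name": "calc_dist",
            "slots": {"slot1": {"value": "LAX"}, "slot2": {"value": "JFK"}},
        },
    }
}
sample_response = lambda_handler(sample_event, None)
```

A real skill would also need to handle unknown airport codes and other request types, which this sketch leaves out.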

Now we can get back to the gadget. We can create custom skills that are designed to work specifically with gadgets. A developer can write a skill that sends out a directive to a connected gadget. That directive has a payload that can be used however it is needed by the gadget. That skill can also send a directive and then listen for an event from the gadget so that the skill code can have access to information sent from the gadget.
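For readers curious about what the skill code sends, the sketch below builds the two directive dictionaries involved, following the shapes I understand the Custom Interface Controller to use. It is not the Lambda code from this tutorial. The endpoint ID, token, and spoken fallback text are placeholders, and the directive names "AlexaToPi" and "PiToAlexa" are inferred from the gadget code shown later.

```python
def build_send_directive(namespace, name, endpoint_id, payload):
    """Directive that asks the Echo to forward a payload to a connected gadget."""
    return {
        "type": "CustomInterfaceController.SendDirective",
        "header": {"namespace": namespace, "name": name},
        "endpoint": {"endpointId": endpoint_id},
        "payload": payload,
    }


def build_start_event_handler(token, namespace, name, duration_ms, expiration_payload):
    """Directive that tells Alexa to listen for one matching event from the gadget."""
    return {
        "type": "CustomInterfaceController.StartEventHandler",
        "token": token,
        "eventFilter": {
            # Only accept events with the expected namespace and name.
            "filterExpression": {
                "and": [
                    {"==": [{"var": "header.namespace"}, namespace]},
                    {"==": [{"var": "header.name"}, name]},
                ]
            },
            "filterMatchAction": "SEND_AND_TERMINATE",
        },
        "expiration": {
            "durationInMilliseconds": duration_ms,
            "expirationPayload": expiration_payload,
        },
    }


# Placeholder values for illustration only.
send = build_send_directive(
    "Custom.MyGadget", "PiToAlexa", "ENDPOINT_ID_PLACEHOLDER",
    {"data": {"sensor_num": {"value": "1"}}},
)
listen = build_start_event_handler(
    "TOKEN_PLACEHOLDER", "Custom.MyGadget", "PiToAlexa", 5000,
    {"data": "The gadget did not respond in time."},
)
```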

Establishing this flow creates a very powerful tool because inexpensive gadgets can communicate with code in the cloud and respond to voice commands using some of the best voice recognition available.

It should be noted that most skills allow various ways of interacting with them. For example, a user might jump straight into an intent by saying "Alexa, ask my custom distance calculator what the distance is between LAX and JFK" (called a one-shot invocation) or they might simply use a launch intent: "Alexa, open my custom distance calculator". This last example would typically be followed by Alexa responding with a prompt for more information. This tutorial intentionally omits support for launch intents. More specifically, without modifying the Lambda function, you can only invoke the skill using a one-shot invocation. This design choice keeps the model simpler (it does not have to support launch intents or conversation flow), and I have found that I usually want to interact with my gadgets using one-shot invocations anyway since they are usually faster.

Step 2: Register the Gadget on Alexa Voice Service Developer Console

The following is a description of the steps needed. I have created an equivalent video that shows how to do all of these steps. You can use either, or both, to complete this step.

  1. Navigate to https://developer.amazon.com/alexa/console/avs/hom...
  2. If you don't already have a free account, make one
  3. Click on "Products"
  4. Fill out labels and select "Alexa Gadget"
  5. Fill in whatever you want for the rest of the fields
  6. Click Finish

Step 3: Create AWS Lambda Function and Custom Skill

Create Custom Skill on Alexa Skills Kit Developer Console

Code for this tutorial can be found here

Before completing this step, you will need to create a .zip file that contains the deployment package for the AWS Lambda function as shown in the tutorial here.

  1. Download the folder "lambda" from my Github which contains "lambda_function.py" and "requirements.txt"
  2. Open the terminal and change the current directory to be inside this folder.
  3. Run the following sequence:
pip install -r requirements.txt -t skill_env
cp lambda_function.py skill_env
cd skill_env
zip -r ../../skill-code.zip .

Your .zip file will now be located in the directory where the lambda folder was and will be called "skill-code.zip".
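If the upload later fails to find your handler, one thing worth checking is that "lambda_function.py" sits at the top level of the archive rather than inside a subfolder. The sketch below is one way to check this with Python's standard zipfile module. The helper names are hypothetical, and the small in-memory archive exists only to demonstrate the check.

```python
import io
import zipfile


def top_level_entries(zip_source):
    """Return the set of top-level file and folder names inside a zip archive."""
    with zipfile.ZipFile(zip_source) as archive:
        return {name.split("/")[0] for name in archive.namelist()}


def handler_at_root(zip_source, handler_file="lambda_function.py"):
    """True if the handler file is at the root of the archive."""
    return handler_file in top_level_entries(zip_source)


def demo_archive(prefix=""):
    """Build a tiny in-memory archive to demonstrate the check."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(prefix + "lambda_function.py", "# handler code")
        archive.writestr(prefix + "some_dependency/__init__.py", "")
    buffer.seek(0)
    return buffer


good_layout = handler_at_root(demo_archive())
nested_layout = handler_at_root(demo_archive("skill_env/"))
```

On your own machine you would call handler_at_root("skill-code.zip") from the directory that contains the archive.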

A note on the cost of hosting on AWS: This tutorial requires that you have an AWS account (free to create). Lambda functions do cost money; however, their current pricing in the N. Virginia region is $0.000000208 per 100ms of use with 128MB of memory. For reference, each invocation of my skill bills about 800ms of use at this tier. To rack up a bill of $1.00 USD, you would have to invoke this function about 600,000 times, which (at 5 seconds per invocation) would take over 34 days of calling your function nonstop. Cost should not be a significant issue unless you publish your skill and a huge number of people start using it. If you are concerned about getting bills on AWS, consider setting up usage alarms that notify you if usage passes a defined threshold.
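The cost estimate above can be reproduced with a few lines of arithmetic. The figures are the ones quoted in this tutorial; pricing may have changed since, so treat the constants as assumptions to update.

```python
PRICE_PER_100MS = 0.000000208      # USD, 128MB tier, as quoted above
BILLED_MS_PER_INVOCATION = 800     # observed billing for this skill
SECONDS_PER_INVOCATION = 5         # time to speak one command
SECONDS_PER_DAY = 24 * 60 * 60

# Cost of a single invocation: eight billing units of 100ms each.
cost_per_invocation = PRICE_PER_100MS * (BILLED_MS_PER_INVOCATION / 100)

# How many invocations it takes to reach one dollar.
invocations_per_dollar = 1.00 / cost_per_invocation

# How long that would take when calling the skill back to back.
days_of_nonstop_calls = invocations_per_dollar * SECONDS_PER_INVOCATION / SECONDS_PER_DAY

print(f"Cost per invocation: ${cost_per_invocation:.9f}")
print(f"Invocations per dollar: {invocations_per_dollar:,.0f}")
print(f"Days of nonstop calling: {days_of_nonstop_calls:.1f}")
```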

The following is a description of the steps needed. I have created an equivalent video that shows how to do all of these steps. You can use either, or both, to complete this step.

  1. Navigate to https://aws.amazon.com/ and sign in to the console or create a free account if you don't have one
  2. Search for and click on Lambda under services
  3. Click "Create Function"
  4. Select "Author from scratch", give it a name, and choose the latest Python 3 version for runtime
  5. Change "edit code inline" to "upload a .zip file" and select the .zip file created above
  6. In a new window, navigate to https://developer.amazon.com/alexa/console/ask and sign in
  7. Click on "Create Skill"
  8. Label it, choose "Custom" model and "Provision your own" and click "Create Skill"
  9. Click "Start from Scratch" and click "Choose"
  10. Under "Intents", click "Add"
  11. Create a custom intent called "alexa_to_pi" and fill in "write {person}" as a sample utterance
  12. Make an intent slot called "person" with type "AMAZON.Person"
  13. Create a custom intent called "pi_to_alexa" and fill in "check the temperature from sensor {sensor_num}" as a sample utterance
  14. Make an intent slot called "sensor_num" with type "AMAZON.NUMBER"
  15. Under Interfaces, turn on "Custom Interface Controller"
  16. Under Endpoint, select "AWS Lambda ARN" and copy the "Your Skill ID"
  17. Navigate back to the AWS Console
  18. Click "Add Trigger"
  19. Select "Alexa Skills Kit", check "Enable" under Skill ID verification, paste in the Skill ID you just copied and click add
  20. Copy the Lambda ARN in the upper right corner
  21. Navigate back to the Alexa Developer Console and paste the Lambda ARN into the "Default Region" field
  22. Under Invocation, set the Skill Invocation Name to be "my gadget"
  23. Click "Save Model" and then "Build Model"
  24. Click "Test" in the top tabs and change the selector from "Off" to "Development"
  25. Note that logs for the Lambda function are found in the "CloudWatch" service on AWS.
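For reference, the intents, slots, and invocation name configured in the steps above correspond roughly to the interaction model below, written here as a Python dictionary and printed as JSON. This is a sketch of what I would expect the console's JSON view to resemble, not an export from the actual skill, and the console may add further entries (such as built-in intents) that are omitted here.

```python
import json

INTERACTION_MODEL = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "my gadget",
            "intents": [
                {
                    "name": "alexa_to_pi",
                    "slots": [{"name": "person", "type": "AMAZON.Person"}],
                    "samples": ["write {person}"],
                },
                {
                    "name": "pi_to_alexa",
                    "slots": [{"name": "sensor_num", "type": "AMAZON.NUMBER"}],
                    "samples": ["check the temperature from sensor {sensor_num}"],
                },
            ],
        }
    }
}

model_json = json.dumps(INTERACTION_MODEL, indent=2)
print(model_json)
```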

Step 4: Set Up the Code on Your Raspberry Pi

For your Raspberry Pi to communicate with the Alexa device, it needs some code to pass information over Bluetooth and maintain that connection, in addition to a few other files. The easiest way to get started with the most up-to-date files from Amazon is to clone their Raspberry Pi Gadgets repository. Navigate to the directory of your current project and run

git clone https://github.com/alexa/Alexa-Gadgets-Raspberry-...

This will load their whole repository with all the necessary code onto your Pi. It has some example projects that show off some of the capabilities of Alexa Gadgets. If you would like more information, see the readme on their Github page.

Run their setup function to get everything configured.

cd /home/pi/Alexa-Gadgets-Raspberry-Pi-Samples
sudo python3 launch.py --setup

Follow the prompts and respond "y" when asked if you want to configure using your Gadget credentials. You will be asked for the Amazon ID and Gadget Secret from when you registered your gadget on the developer console, so have them ready. I chose the "bt" transmission mode for my Raspberry Pi Zero W. Some older Echo devices do not support BLE, so look up what your hardware is capable of. If you are using your Pi in Desktop mode, Amazon recommends right-clicking on the Bluetooth icon in the top right and clicking "Remove "Bluetooth" from Panel" to avoid connectivity issues.

Note: this step may take a while depending on how much needs to be installed.

Now you will have all the necessary support files to go back to your project and start adding in the functions to allow communication with your Echo.

If you choose, you can delete the "examples" folder in "Alexa-Gadgets-Raspberry-Pi-Samples/src"

You can keep your project code wherever you like, but I'll make a folder in the home directory for it. Alternatively, you can download the folder with the code from my Github. Just be sure to edit the .ini file as described below.

cd /home/pi
mkdir my_project
cd my_project
touch my_gadget.py
touch my_gadget.ini

I have now created two files in a folder called "my_project". The .ini file is important. Be sure that it contains the following and substitute in your Amazon ID and Gadget Secret:

[GadgetSettings]
amazonId = INSERT_AMAZON_ID_HERE
alexaGadgetSecret = INSERT_ALEXA_GADGET_SECRET_HERE

[GadgetCapabilities]
Custom.MyGadget = 1.0
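Since a forgotten placeholder in the .ini file is an easy mistake to make, here is a small optional check using Python's standard configparser module. The function check_gadget_ini is a hypothetical helper and not part of Amazon's tooling. It only looks for the sections and keys shown above.

```python
import configparser

PLACEHOLDERS = {"INSERT_AMAZON_ID_HERE", "INSERT_ALEXA_GADGET_SECRET_HERE"}


def check_gadget_ini(text):
    """Return a list of problems found in the .ini text (empty if none)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    problems = []
    if not parser.has_section("GadgetSettings"):
        problems.append("missing [GadgetSettings] section")
    else:
        for key in ("amazonId", "alexaGadgetSecret"):
            value = parser.get("GadgetSettings", key, fallback="")
            if not value or value in PLACEHOLDERS:
                problems.append(key + " is not filled in")
    if not parser.has_section("GadgetCapabilities"):
        problems.append("missing [GadgetCapabilities] section")
    return problems


TEMPLATE = """
[GadgetSettings]
amazonId = INSERT_AMAZON_ID_HERE
alexaGadgetSecret = INSERT_ALEXA_GADGET_SECRET_HERE

[GadgetCapabilities]
Custom.MyGadget = 1.0
"""

template_problems = check_gadget_ini(TEMPLATE)
```

To check your own file, you could pass in open("my_gadget.ini").read() instead of the template text.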

Now, let's take a look at the Python file before going into the details:

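import json

from agt import AlexaGadget


class MyGadget(AlexaGadget):
    # write_text() and get_temp() are hardware-specific helpers
    # that are defined elsewhere in this file (omitted here).

    def __init__(self):
        super().__init__()

    def on_custom_mygadget_alexatopi(self, directive):
        # Alexa -> Pi: decode the JSON payload and show the "person" slot.
        payload = json.loads(directive.payload.decode("utf-8"))
        print("Received data: " + str(payload))
        write_text(str(payload['data']['person']['value']))

    def on_custom_mygadget_pitoalexa(self, directive):
        # Pi -> Alexa: read the requested sensor, then send back an event
        # whose payload is the sentence Alexa should speak.
        payload = json.loads(directive.payload.decode("utf-8"))
        print("Received data: " + str(payload))
        sensor_num = payload['data']['sensor_num']['value']
        payload = {'data': "The probe reads " + str(get_temp(sensor_num)) + " degrees."}
        self.send_custom_event('Custom.MyGadget', 'PiToAlexa', payload)


MyGadget().main()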

First, you will notice that it calls two functions: write_text() and get_temp(). In my code, I define these functions in the same file, but they depend on my hardware, so I have chosen to omit them. I have attached this file with those functions defined to just print and return dummy data in case you want to run this exact code. I would suggest testing with this exact code before you modify it to work with your project. I have also attached the .ini file, but make sure you go in and change the ID and gadget secret.

The first handler, on_custom_mygadget_alexatopi, receives data passed in from Alexa. The second handler, on_custom_mygadget_pitoalexa, receives data in the same format, but the Alexa device will wait five seconds for an event to be passed back with its own payload. This payload is special in that the Alexa device will speak its contents.
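If you do not have the attached file handy, stand-in versions of the two helpers could look something like the sketch below. These are my own placeholders, not the author's attached code, and the readings are made-up numbers. The build_speech_payload function is a hypothetical helper that mirrors how the second handler assembles the sentence Alexa speaks.

```python
def write_text(text):
    """Stand-in for the display driver: print instead of drawing to a screen."""
    print(text)


def get_temp(sensor_num):
    """Stand-in for the thermometer probe: return a made-up reading."""
    dummy_readings = {1: 72.31, 2: 22.4}  # invented values for testing
    # Slot values may arrive as strings, so convert before the lookup.
    return dummy_readings.get(int(sensor_num), 0.0)


def build_speech_payload(sensor_num):
    """Assemble the event payload in the same way as the second handler."""
    return {"data": "The probe reads " + str(get_temp(sensor_num)) + " degrees."}


example_payload = build_speech_payload("1")
```

Once the voice flow works with placeholders like these, you can swap in the functions that talk to your real hardware.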

Once you have these files, reboot the Pi, then navigate to the "my_project" folder and run the Python file.

sudo reboot

cd /home/pi/my_project
sudo python3 ./my_gadget.py

If this is the first time you are running the program, you will need to pair it to your Echo device. Make sure your Echo device is near the Raspberry Pi, since the two need to be within Bluetooth range.

In the Alexa app on your mobile device, click "devices" in the bottom right corner.

Click "Echo & Alexa" in the top left.

Click on your Echo device.

Under "WIRELESS", tap "Bluetooth Devices".

Tap "PAIR A NEW DEVICE" and you should see your gadget on the list.

Tap on your gadget. You should see the Pi report that it successfully paired.

While watching the output on your Pi, try giving a voice command to the Echo:

You: "Alexa, ask my gadget to check the temperature from sensor one"

If everything worked properly, you should hear:

Echo: "The probe reads 120.505 degrees."

You: "Alexa, tell my gadget to write George Washington."

The Pi should print:

"Received data: {'data': {'person': {'name': 'person', 'value': 'George Washington', 'confirmationStatus': 'NONE'}}}

George Washington"
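The printed payload shows how each slot is nested: the spoken value sits under 'data', then the slot name, then 'value'. A small helper like the hypothetical slot_value below can make it safer to pull values out, since it returns a default instead of raising an error when a slot is missing.

```python
def slot_value(payload, slot_name, default=None):
    """Return payload['data'][slot_name]['value'], or default if absent."""
    try:
        return payload["data"][slot_name]["value"]
    except (KeyError, TypeError):
        return default


# The payload printed in the example above.
received = {
    "data": {
        "person": {
            "name": "person",
            "value": "George Washington",
            "confirmationStatus": "NONE",
        }
    }
}

person = slot_value(received, "person")
missing = slot_value(received, "sensor_num", default="1")
```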

Step 5: Wrapping Up

The video shown here is an example of the gadget working with reading the temperature (the same probe in F vs. C) and writing names to a simple display.

Now that you hopefully have something working, you should try to go and customize this to make your own project more capable. Remember that you can easily edit the intents in the Alexa Developer Console and that all of the slots you use will be passed to your Pi in the payload. Furthermore, you can have Alexa say anything you would like by just editing the payload you pass back in the event from your Raspberry Pi code.
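When you add handlers of your own, the method name matters. Judging from the example in this tutorial, where the namespace "Custom.MyGadget" and a directive that appears to be named "AlexaToPi" are handled by on_custom_mygadget_alexatopi, the handler name seems to be built from the namespace and directive name in lowercase, with dots replaced by underscores. The helper below sketches that inferred convention. It is my own illustration and should be checked against the readme in Amazon's repository.

```python
def handler_name(namespace, directive_name):
    """Build the handler method name implied by the tutorial's example."""
    combined = "_".join([namespace, directive_name])
    return "on_" + combined.lower().replace(".", "_")


first = handler_name("Custom.MyGadget", "AlexaToPi")
second = handler_name("Custom.MyGadget", "PiToAlexa")
```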

Please note that this tutorial is not intended to be the final solution for all of the capabilities you could want with an Alexa Gadget. It is intentionally limited to give you two simple functions for passing data in each direction between Alexa and a Gadget. If you are interested in building more sophisticated interaction models, I would encourage you to read all of the readme files in https://github.com/alexa/Alexa-Gadgets-Raspberry-P... and to try all of the examples they provide. I would also suggest that you read the documentation for the Alexa Gadgets Toolkit and the Alexa Skills Kit.