Introduction: Image Recognition With K210 Boards and Arduino IDE/Micropython
I already wrote one article on how to run OpenMV demos on Sipeed Maix Bit and also made a video of an object detection demo with this board. One of the many questions people have asked is: how can I recognize an object that the neural network was not trained on? In other words, how do you make your own image classifier and run it with hardware acceleration?
This is an understandable question, since for your project you probably don't need to recognize generic objects like cats, dogs and airplanes. You want to recognize something specific, for example a breed of dog for that automatic pet door, or a plant species for sorting, or any other exciting application you can think of!
I got you! In this article I will teach you how to create your own custom image classifier with transfer learning in Keras, convert the trained model to .kmodel format and run it on a Sipeed board (any of them will do: Bit, Dock or Go) using Micropython or Arduino IDE. Only your imagination will be the limit to the tasks you can do with this knowledge.
UPDATE MAY 2020: Seeing how my article and video on Image Recognition with K210 boards are still very popular and among the top results on YouTube and Google, I decided to update the article to include information about aXeleRate, the Keras-based framework for AI on the Edge that I develop.
aXeleRate is essentially the collection of scripts I used for training image recognition/object detection models, combined into a single framework and optimized for a Google Colab workflow. It is more convenient to use and more up to date.
For the old version of the article, you can still see it on steemit.com.
Step 1: CNN and Transfer Learning: Some Theory
Convolutional Neural Networks (CNNs) are a class of deep neural networks most commonly applied to analyzing visual imagery. There is a lot of literature on the internet on the topic and I'll give some links in the last part of the article. In short, you can think of a CNN as a series of filters applied to the image, each filter looking for a specific feature. On the lower convolutional layers the features are usually lines and simple shapes, and on the higher layers the features can be more specific, e.g. body parts, specific textures, parts of animals or plants, etc. The presence of a certain set of features can give us a clue to what the object in the image might be. Whiskers, two eyes and a black nose? Must be a cat! Green leaves, a tree trunk? Looks like a tree!
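To make the "series of filters" idea concrete, here is a minimal sketch in plain Python of a single 3x3 filter sliding over a tiny grayscale image. The image and the kernel values are made up for illustration; in a real CNN the kernel values are learned during training rather than written by hand.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and return the response map."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            total = 0
            # Multiply the kernel with the image patch under it and sum up.
            for ky in range(kernel_h):
                for kx in range(kernel_w):
                    total += image[y + ky][x + kx] * kernel[ky][kx]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 5x6 image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]

# Hand-written filter that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

response = convolve2d(image, vertical_edge)
for row in response:
    print(row)  # each row is [0, 3, 3, 0]: strong response only around the edge
```

The filter stays silent over the flat dark and flat bright areas and fires only where the brightness changes, which is the kind of "line" feature the lower layers pick up.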
I hope you get the idea about the working principle of a CNN now. Normally a deep neural network needs thousands of images and hours of training time (depending on the hardware you are using for training) to "develop" filters that are useful for recognizing the types of objects you want. But there is a shortcut.
A model trained to recognize a lot of different common objects (cats, dogs, house appliances, transport, etc.) already has a lot of those useful filters "developed", so we don't need it to learn to recognize the basic shapes and parts of objects again. We can just re-train the last few layers of the network to recognize the specific classes of objects that are important to us. This is called "transfer learning". You need significantly less training data and compute time with transfer learning, since you are only training the last few layers of the network, composed of maybe a few hundred neurons.
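The "freeze the base, train only the last layers" idea can be sketched in a few lines of plain Python. This is a toy stand-in, not what aXeleRate or Keras do internally: the "pre-trained base" here is a hand-written fixed layer (BASE_WEIGHTS), the trainable "head" is a single logistic regression unit, and the four-sample dataset is made up. All names in it are hypothetical.

```python
import math

# Stand-in for a pre-trained base network: these weights are fixed ("frozen").
# In the article's workflow this role is played by a pre-trained CNN.
BASE_WEIGHTS = (
    (1.0, 0.0),
    (0.0, 1.0),
    (-1.0, 0.0),
    (0.0, -1.0),
)


def base_features(x):
    """Frozen feature extractor: one fixed linear layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in BASE_WEIGHTS]


def predict(head, bias, x):
    """Probability of class 1, computed by the small trainable head."""
    z = sum(w * f for w, f in zip(head, base_features(x))) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, epochs=200, learning_rate=0.5):
    """Train only the head with gradient descent; BASE_WEIGHTS is never updated."""
    head = [0.0] * len(BASE_WEIGHTS)
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            features = base_features(x)
            error = predict(head, bias, x) - y  # gradient of the log loss w.r.t. z
            head = [w - learning_rate * error * f for w, f in zip(head, features)]
            bias -= learning_rate * error
    return head, bias


# Made-up two-class dataset with only four samples.
samples = [(1.0, 0.5), (-1.0, -0.5), (0.8, 1.0), (-0.7, -1.0)]
labels = [1, 0, 1, 0]

head, bias = train_head(samples, labels)
for x, y in zip(samples, labels):
    print(x, "label:", y, "predicted:", round(predict(head, bias, x), 3))
```

Only five numbers (four head weights and a bias) are learned here, which is why a handful of samples is enough. The same reasoning explains why re-training the last few layers of a real network needs far less data than training it from scratch.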
Sounds awesome, right? Let's see how to implement it.
Step 2: Prepare Your Environment
There are two ways to use aXeleRate: running locally on an Ubuntu machine or in Google Colab. For running in Google Colab, have a look at this example:
Training your model locally and exporting it to be used with hardware acceleration is also much easier now.
My working environment is Ubuntu 16.04, 64-bit. You can use a virtual machine to run the Ubuntu image, since we will not use a GPU for training. With some modifications you can also run the training script on Windows, but for model conversion you will need a Linux system. So the preferred environment for this tutorial is Ubuntu 16.04, running natively or in a virtual machine.
Let's start by installing Miniconda, which is an environment manager for Python. We will create an isolated environment, so we won't accidentally change anything in your system Python environment.
Download the installer here
After installation is complete, create a new environment:
conda create -n ml python=3.7
Let's activate the new environment:
conda activate ml
A prefix with the name of the environment will appear before your bash prompt, indicating that you are now working in that environment.
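If you want to double-check from inside Python that you really are in the new environment, a small sketch like the one below can help. It relies on the CONDA_DEFAULT_ENV variable that conda sets on activation; the function name and its parameters are hypothetical, and the defaults simply mirror the "ml" / Python 3.7 environment created above.

```python
import os
import sys


def in_expected_env(env_name="ml", python_version=(3, 7), environ=None, version_info=None):
    """Return True if the active conda environment and Python version match."""
    # The optional arguments make the check easy to try with fake values.
    environ = os.environ if environ is None else environ
    version_info = sys.version_info if version_info is None else version_info
    active_env = environ.get("CONDA_DEFAULT_ENV", "")
    return active_env == env_name and tuple(version_info[:2]) == tuple(python_version)


print("Active conda environment:", os.environ.get("CONDA_DEFAULT_ENV", "(none)"))
print("Python version:", "%d.%d" % sys.version_info[:2])
print("Matches the 'ml' / Python 3.7 setup:", in_expected_env())
```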
Step 3: Install aXeleRate and Run Tests
Install aXeleRate on your local machine with
pip install git+https://github.com/AIWintermuteAI/aXeleRate
To download examples run:
git clone https://github.com/AIWintermuteAI/aXeleRate
You can run quick tests with tests_training.py in the aXeleRate folder. It will run training and inference for each model type, then save and convert the trained models. Since it only trains for 5 epochs and the dataset is very small, you will not get useful models out of it; this script is only meant for checking that everything runs without errors.
Step 4: Re-train the Model, Convert Keras Model to .kmodel
For this toy example we will be training the model to recognize Santa Claus and Arduino Uno. Obviously you can choose other classes. Download the dataset from here. Create a copy of the classifier.json file in the configs folder (the command below expects the copy to be named santa_uno.json), then change it accordingly, similar to the config file in the screenshot. Make sure the paths to the training and validation folders are correct!
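A wrong folder path is the easiest mistake to make at this step, so here is a hedged sketch of a quick check you could run before training. It assumes that folder paths in the config are stored under keys whose names end in "_folder"; that suffix and the key names in the demo are assumptions about the config layout, so adjust them to match your own file.

```python
import json
import os


def find_missing_folders(config, key_suffix="_folder"):
    """Return (key, path) pairs for folder entries that do not exist on disk."""
    missing = []
    if isinstance(config, dict):
        for key, value in config.items():
            if isinstance(value, str) and key.endswith(key_suffix):
                # Empty values are skipped: they are treated as "not set".
                if value and not os.path.isdir(value):
                    missing.append((key, value))
            else:
                missing.extend(find_missing_folders(value, key_suffix))
    elif isinstance(config, list):
        for item in config:
            missing.extend(find_missing_folders(item, key_suffix))
    return missing


def check_config_file(path):
    """Load a JSON config file and list the folder entries that are missing."""
    with open(path) as config_file:
        return find_missing_folders(json.load(config_file))


# Demo on an in-memory config with hypothetical key names.
example = {
    "train": {
        "train_image_folder": os.getcwd(),        # exists
        "valid_image_folder": "/no/such/folder",  # expected to be missing
        "batch_size": 8,
    }
}
print(find_missing_folders(example))
```

On your own file you would call check_config_file("configs/santa_uno.json") from the aXeleRate folder; an empty list means every folder entry it found exists.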
Run the following command from aXeleRate folder:
python axelerate/train.py -c configs/santa_uno.json
The training will start. If the validation accuracy (our validation metric) does not improve for 20 epochs, the training will stop early. Every time the validation accuracy improves, the model is saved in the project folder. After training is over, aXeleRate automatically converts the best model to the specified formats. As of now you can choose "tflite", "k210" or "edgetpu".
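The stopping rule described above can be sketched as a short function that replays a list of per-epoch validation accuracies. This illustrates the general "patience" rule under the 20-epoch figure from the text; it is not aXeleRate's actual code, and the function name and the accuracy values are made up.

```python
def train_with_early_stopping(val_accuracies, patience=20):
    """Replay per-epoch validation accuracies and report when training would stop.

    Returns (best_epoch, best_accuracy, stopped_epoch); epochs are counted from 1.
    """
    best_accuracy = float("-inf")
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch, accuracy in enumerate(val_accuracies, start=1):
        if accuracy > best_accuracy:
            # Improvement: this is the point where the model would be saved.
            best_accuracy = accuracy
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, best_accuracy, epoch
    return best_epoch, best_accuracy, len(val_accuracies)


# Accuracy improves for 3 epochs, then stays below the best value.
history = [0.5, 0.7, 0.8] + [0.75] * 30
print(train_with_early_stopping(history))  # (3, 0.8, 23)
```

In this made-up run the best model comes from epoch 3, and training stops at epoch 23, after 20 epochs in a row without improvement.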
Step 5: Run the Model on Sipeed Maix Bit
There are two ways to run the model you have now on Sipeed Maix hardware: Micropython firmware and Arduino IDE. Micropython firmware is easier to use, but it occupies a significant portion of the available memory, so there is less space left for the model. Arduino IDE is basically C code, which is much more efficient and has a smaller memory footprint. My model is just 1.9 MB, so both options work for it. You can use models as large as 2.9 MB with Micropython; for anything larger you need to consider using Arduino IDE.
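The size rule of thumb above is easy to turn into a quick check on your PC. The sketch below uses the 2.9 MB figure from the text; the function name is hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
import os

MICROPYTHON_LIMIT_MB = 2.9  # largest model size quoted for Micropython in the text


def pick_runtime(model_size_bytes, limit_mb=MICROPYTHON_LIMIT_MB):
    """Suggest 'micropython' for models within the limit, 'arduino' otherwise."""
    size_mb = model_size_bytes / (1024 * 1024)  # assumption: 1 MB = 1024 * 1024 bytes
    return "micropython" if size_mb <= limit_mb else "arduino"


def pick_runtime_for_file(model_path):
    """Same check, using the size of a .kmodel file on disk."""
    return pick_runtime(os.path.getsize(model_path))


print(pick_runtime(int(1.9 * 1024 * 1024)))  # the 1.9 MB model from the text
print(pick_runtime(4 * 1024 * 1024))         # a hypothetical 4 MB model
```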
Burn the firmware with the kflash_gui tool. You can also choose to burn the trained model to flash, as shown in the screenshot, or copy it to an SD card (in that case copy the .kmodel file to the root of the SD card and insert the SD card into the Sipeed Maix Bit).
Open OpenMV IDE and press the connect button. Open the santa_uno.py script from the example_scripts folder and press the Start button. You should see a live stream from the camera, and if you open the Serial Terminal you will see the top image recognition result with its confidence score!
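The "top result with confidence score" is simply the class with the highest score in the network output. Here is a minimal sketch of that last step in plain Python, so you can try it on a PC. The label order and the scores are made up, and the function is a hypothetical helper rather than a copy of santa_uno.py.

```python
def top_prediction(scores, labels):
    """Return (label, score) for the class with the highest score."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best_index], scores[best_index]


# Hypothetical label order; it must match the order used during training.
labels = ["santa", "uno"]

# On the board the scores would come from the network output for one camera
# frame; here they are made-up numbers.
label, confidence = top_prediction([0.12, 0.88], labels)
print(label, confidence)  # uno 0.88
```

If the labels on the board look swapped, the usual culprit is a label list whose order differs from the one the model was trained with.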
To use Arduino IDE, you first need to follow the procedure for adding Sipeed boards to Arduino IDE, which is documented here. Your Arduino IDE version needs to be at least 1.8.12. After you have added the boards, open the mobilenet_v1_transfer_learning.ino sketch and upload it to the Sipeed Maix Bit. Change the name of the model on the SD card to "model" (or make a copy with this name). You can change the label names in names.cpp. The sketch will show the live camera stream on the Sipeed Maix screen along with the top image recognition result.
Step 6: Conclusions
Here are some more materials to read on the topic of CNNs and transfer learning:
Transfer Learning using Mobilenet and Keras A great explanation of Transfer learning, this tutorial uses a modified version of the code from that article.
Cats and dogs and convolutional neural networks Explains basics behind CNNs and visualizes some of the filters. With cats!
Train, Convert, Run MobileNet on Sipeed MaixPy and MaixDuino! A tutorial from the Sipeed team on how to train Mobilenet 1000 classes from scratch (no transfer learning). You can download their pre-trained model and try it out!
Hope you can use the knowledge you have now to build some awesome projects with machine vision! You can buy Sipeed boards here, they are among the cheapest options available for ML on embedded systems.