Introduction: Object Detection With Sipeed MaiX Boards (Kendryte K210)
As a continuation of my previous article about image recognition with Sipeed MaiX Boards, I decided to write another tutorial, focusing on object detection. Some interesting hardware with the Kendryte K210 chip has popped up recently, including Seeed AI Hat for Edge Computing, M5 stack's M5StickV and DFRobot's HuskyLens (although that one has proprietary firmware and is targeted more at complete beginners). Because of its low price, Kendryte K210 has appealed to people wishing to add computer vision to their projects. But as usual with Chinese hardware products, the tech support is lacking, and this is something that I'm trying to improve with my articles and videos. Do keep in mind that I am not on the Kendryte or Sipeed developers team and cannot answer all the questions related to their products.
With that in mind, let's start! We'll begin with a short (and simplified) overview of how object detection CNN models work.
UPDATE MAY 2020: Seeing how my article and video on Object Detection with K210 boards are still very popular and among top results on YouTube and Google, I decided to update the article to include the information about aXeleRate, Keras-based framework for AI on the Edge I develop. aXeleRate, essentially, is based off the collection of scripts I used for training image recognition/object detection models - combined into a single framework and optimized for workflow on Google Colab. It is more convenient to use and more up to date.
The old version of the article is still available on steemit.com.
Step 1: Object Detection Model Architecture Explained
Image recognition (or image classification) models take the whole image as an input and output a list of probabilities for each class we're trying to recognize. This is very useful if the object we're interested in occupies a large portion of the image and we don't care much about its location. But what if our project (say, a face-tracking camera) requires us to know not only the type of object in the image, but also its coordinates? And what about projects that require detecting multiple objects (for example, for counting)?
This is where Object Detection Models come in handy. In this article we'll be using the YOLO (you only look once) architecture and focus the explanation on the internal mechanics of this particular architecture.
We're trying to determine what objects are present in the picture and what their coordinates are. Machine learning is not magic and not "a thinking machine", but just an algorithm which uses statistics to optimize a function (the neural network) to better solve a particular problem, so we need to rephrase this problem to make it more "optimizable". A naive approach here would be to have the algorithm minimize the loss (difference) between its prediction and the correct coordinates of the object. That would work pretty well, as long as we have only one object in the image. For multiple objects we take a different approach - we add a grid and make our network predict the presence (or absence) of the object(s) in each grid cell. Sounds great, but it still leaves too much uncertainty for the network - how should it output the prediction, and what should it do when there are multiple objects with their centers inside one grid cell? We need to add one more constraint - so-called anchors. Anchors are initial sizes (width, height), some of which (the closest to the object size) will be resized to the object size using some outputs from the neural network (the final feature map).
So, here's a top-level view of what's going on when a YOLO architecture neural network performs object detection on an image. Based on the features detected by the feature extractor network, a set of predictions is made for each grid cell, which includes the anchor offsets, anchor probability and anchor class. Then we discard the predictions with low probability and voila!
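To make the grid-and-anchors idea more concrete, here is a minimal sketch of how a single raw prediction could be turned into a box, following the YOLOv2-style decoding (sigmoid for the center offset and confidence, exponential scaling of the anchor size). The function names, the 7x7 grid and the 0.3 threshold are assumptions chosen for illustration; the actual aXeleRate and K210 decoding code may differ in its details.

```python
import math


def sigmoid(x):
    """Squash a raw network output into the 0..1 range."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_cell(raw, col, row, anchor, grid_size):
    """Decode one raw prediction for one anchor in one grid cell.

    raw       -- (tx, ty, tw, th, to) raw network outputs
    col, row  -- position of the grid cell
    anchor    -- (width, height) of the anchor, in grid-cell units
    grid_size -- number of cells along one side of the grid (assumed square)

    Returns (cx, cy, w, h, confidence), with the box normalized to 0..1.
    """
    tx, ty, tw, th, to = raw
    anchor_w, anchor_h = anchor
    # The box center stays inside its own cell: cell index + offset in 0..1.
    cx = (col + sigmoid(tx)) / grid_size
    cy = (row + sigmoid(ty)) / grid_size
    # The anchor is stretched or shrunk to fit the object.
    w = anchor_w * math.exp(tw) / grid_size
    h = anchor_h * math.exp(th) / grid_size
    return cx, cy, w, h, sigmoid(to)


def keep_confident(boxes, threshold=0.3):
    """Discard predictions whose confidence is below the threshold."""
    return [box for box in boxes if box[4] >= threshold]


# All-zero raw outputs in the center cell of a hypothetical 7x7 grid.
box = decode_cell((0.0, 0.0, 0.0, 0.0, 0.0), col=3, row=3,
                  anchor=(1.0, 2.0), grid_size=7)
print(box)
```

With all raw outputs at zero, the box sits in the middle of its cell and keeps exactly the anchor's size, which shows why anchors that are close to the real object sizes make the network's job easier.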
Step 2: Prepare the Environment
aXeleRate is based on a wonderful project by penny4860, the SVHN yolo-v2 digit detector. aXeleRate takes this implementation of the YOLO detector in Keras to the next level and uses its convenient configuration system to perform training and conversion of image recognition/object detection and image segmentation networks with various backends.
There are two ways to use aXeleRate: running locally on an Ubuntu machine or in Google Colab. For running in Google Colab, have a look at this example:
PASCAL-VOC Object Detection Colab Notebook
Training your model locally and exporting it to be used with hardware acceleration is also much easier now. I highly recommend installing all the necessary dependencies in an Anaconda environment to keep your project separated from others and avoid conflicts.
Download the installer here.
After installation is complete, create a new environment:
conda create -n yolo python=3.7
Let's activate the new environment
conda activate yolo
A prefix before your bash shell will appear with the name of the environment, indicating that you work now in that environment.
Install aXeleRate on your local machine with
pip install git+https://github.com/AIWintermuteAI/aXeleRate
And then run this to download scripts you will need for training and inference:
git clone https://github.com/AIWintermuteAI/aXeleRate.git
You can run quick tests with tests_training.py in the aXeleRate folder. It will run training and inference for each model type, then save and convert the trained models. Since it only trains for 5 epochs and the dataset is very small, you will not get useful models out of it; the script is only meant to check that everything runs without errors.
Step 3: Train an Object Detection Model With Keras
Now we can run a training script with the configuration file. Since the Keras implementation of the YOLO object detector is quite complicated, instead of explaining every relevant piece of code, I will explain how to configure the training and also describe the relevant modules, in case you want to make some changes to them yourself.
Let's start with a toy example and train a raccoon detector. There is a config file inside the configs folder, raccoon_detector.json. We choose MobileNet7_5 as the architecture (where 7_5 is the alpha parameter of the original Mobilenet implementation, which controls the width of the network) and 224x224 as the input size. Let's have a look at the most important parameters in the config:
Type is model frontend - Classifier, Detector or Segnet
Architecture is model backend (feature extractor)
- Full Yolo
- Tiny Yolo
- MobileNet1_0
- MobileNet7_5
- MobileNet5_0
- MobileNet2_5
- SqueezeNet
- VGG16
- ResNet50
For more information on anchors, please read here https://github.com/pjreddie/darknet/issues/568
Labels are labels present in your dataset. IMPORTANT: Please, list all the labels present in the dataset.
object_scale determines how much to penalize wrong prediction of confidence of object predictors
no_object_scale determines how much to penalize wrong prediction of confidence of non-object predictors
coord_scale determines how much to penalize wrong position and size predictions (x, y, w, h)
class_scale determines how much to penalize wrong class prediction
augumentation - image augmentation: resizing, shifting and blurring the images in order to prevent overfitting and get greater variety in the dataset.
train_times, validation_times - how many times to repeat the dataset. Useful if you have augumentation enabled
first_trainable_layer - allows you to freeze certain layers if you're using a pre-trained feature network
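To show how the four scale parameters above interact, here is a simplified sketch that combines four already-computed loss components into a single number. The default values and the function name are assumptions for illustration only; the real loss in yolo/backend/loss.py applies the scales per grid cell and per anchor, so this is only meant to illustrate what "penalize more" means.

```python
def total_yolo_loss(coord_loss, object_loss, no_object_loss, class_loss,
                    coord_scale=1.0, object_scale=5.0,
                    no_object_scale=1.0, class_scale=1.0):
    """Weighted sum of the four loss components (simplified illustration).

    The default scale values are placeholders, not aXeleRate's defaults.
    """
    return (coord_scale * coord_loss
            + object_scale * object_loss
            + no_object_scale * no_object_loss
            + class_scale * class_loss)


# With equal component losses, the term with the largest scale dominates.
print(total_yolo_loss(1.0, 1.0, 1.0, 1.0))
```

Raising one scale relative to the others tells the optimizer to care more about that kind of mistake, for example a higher object_scale pushes the network to be more confident about cells that really contain an object.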
Now we need to download the dataset, which I shared on my Google Drive (original dataset). It is a raccoon detection dataset containing 150 annotated pictures.
Make sure to change the lines in configuration file (train_image_folder, train_annot_folder) accordingly and then start the training with the following command:
python axelerate/train.py -c configs/raccoon_detector.json
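If you prefer to set the dataset folders from a script rather than editing the file by hand, a sketch like the following could be used. It assumes that train_image_folder and train_annot_folder live under a "train" section of the JSON file, which you should verify against your copy of raccoon_detector.json; the folder names in the example are hypothetical.

```python
import json


def set_dataset_paths(config, image_folder, annot_folder):
    """Return a copy of the config with the training folders replaced.

    Assumes both keys sit under a "train" section; check your config file.
    """
    updated = json.loads(json.dumps(config))  # simple deep copy
    train = updated.setdefault("train", {})
    train["train_image_folder"] = image_folder
    train["train_annot_folder"] = annot_folder
    return updated


# Stand-in for the parsed contents of the config file.
example = {"train": {"train_image_folder": "old/imgs",
                     "train_annot_folder": "old/anns"}}
patched = set_dataset_paths(example,
                            "raccoon_dataset/images",       # hypothetical path
                            "raccoon_dataset/annotations")  # hypothetical path
print(json.dumps(patched, indent=2))
```

In practice you would read the real file with json.load, pass the result through the function and write it back with json.dump.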
train.py reads the configuration from the .json file and trains the model with the axelerate/networks/yolo/yolo_frontend.py script. yolo/backend/loss.py is where the custom loss function is implemented and yolo/backend/network.py is where the model is created (input, feature extractor and detection layers put together). axelerate/networks/common_utils/fit.py is the script that implements the training process and axelerate/networks/common_utils/feature.py contains the feature extractors. If you intend to use the trained model with the K210 chip and Micropython firmware, due to memory limitations you can choose between MobileNet (2_5, 5_0 and 7_5) and TinyYolo, but I've found MobileNet gives better detection accuracy.
Since it is a toy example and only contains 150 images of raccoons, the training process should be pretty fast, even without a GPU, although the accuracy will be far from stellar. For a work-related project I've trained a traffic sign detector and a number detector; both datasets included several thousand training examples.
Step 4: Convert It to .kmodel Format
With aXeleRate, model conversion is performed automatically - this is probably the biggest difference from the old version of the training scripts! Plus you get the model files and training graph saved neatly in the project folder. I also found that validation accuracy sometimes fails to give a good estimate of the model's real performance for object detection, and this is why I added mAP as a validation metric for object detection models. You can read more about mAP here.
If the mAP, mean average precision (our validation metric), does not improve for 20 epochs, the training will stop early. Every time mAP improves, the model is saved in the project folder. After training is over, aXeleRate automatically converts the best model to the specified formats - as of now you can choose "tflite", "k210" or "edgetpu".
Now to the last step, actually running our model on Sipeed hardware!
Step 5: Run on Micropython Firmware
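The stopping rule described above can be sketched in a few lines of plain Python. This is not aXeleRate's code (which relies on Keras callbacks); the function name and the sample mAP values are made up for illustration, and a short patience is used so the example stays small.

```python
def best_epoch_with_patience(map_per_epoch, patience=20):
    """Walk through per-epoch mAP values and apply early stopping.

    Returns (best_epoch, best_map, stopped_at). stopped_at is None when
    training ran through all epochs without triggering the stop.
    """
    best_map = float("-inf")
    best_epoch = None
    stopped_at = None
    for epoch, value in enumerate(map_per_epoch):
        if value > best_map:
            # mAP improved: this is where the model would be saved.
            best_map, best_epoch = value, epoch
        elif best_epoch is not None and epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row: stop training.
            stopped_at = epoch
            break
    return best_epoch, best_map, stopped_at


# Made-up mAP history with a patience of 3 epochs.
history = [0.1, 0.3, 0.2, 0.2, 0.2, 0.4]
print(best_epoch_with_patience(history, patience=3))
```

Note that in this example the run stops before reaching the last value, so the later, better mAP is never seen; that is the trade-off of any patience-based rule, and the reason the patience is set fairly generously.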
It is possible to run inference with our object detection model with C code, but for the sake of convenience we will use Micropython firmware and MaixPy IDE instead.
Download MaixPy IDE from here and micropython firmware from here. You can use python script kflash.py to burn the firmware or download separate GUI flash tool here.
Copy model.kmodel to the root of an SD card and insert the SD card into the Sipeed Maix Bit (or other K210 device). Alternatively, you can burn the .kmodel to the device's flash memory. My example script reads the .kmodel from flash memory. If you are using an SD card, please change this line
task = kpu.load(0x200000)
to
task = kpu.load("/sd/model.kmodel")
Open MaixPy IDE and press the connect button. Open the raccoon_detector.py script from the example_scripts/k210/detector folder and press the Start button. You should be seeing a live stream from the camera with bounding boxes around ... well, raccoons. You can increase the accuracy of the model by providing more training examples, but do keep in mind that it is a fairly small model (1.9 M) and it will have trouble detecting small objects (due to low resolution).
One of the questions I received in comments to my previous article on image recognition is how to send the detection results over UART/I2C to another device connected to Sipeed development boards. In my github repository you will be able to find another example script, raccoon_detector_uart.py, which (you guessed it) detects raccoons and sends the coordinates of bounding boxes over UART. Keep in mind that the pins used for UART communication differ between boards; this is something you need to check yourself in the documentation.
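As an illustration of what sending bounding boxes over a serial link can look like, here is a sketch of a simple text format with matching encode and decode helpers. The message layout (comma-separated fields, boxes separated by semicolons, newline at the end) and the function names are my assumptions for this example and are not necessarily what raccoon_detector_uart.py uses.

```python
def encode_detections(boxes):
    """Pack (x, y, w, h, class_id) integer tuples into one ASCII line.

    Hypothetical format: "x,y,w,h,class;x,y,w,h,class\n"
    """
    parts = ["{},{},{},{},{}".format(*box) for box in boxes]
    return (";".join(parts) + "\n").encode("ascii")


def decode_detections(message):
    """Reverse of encode_detections, for the receiving device."""
    text = message.decode("ascii").strip()
    if not text:
        return []
    return [tuple(int(v) for v in item.split(",")) for item in text.split(";")]


# Two made-up detections on a 224x224 frame.
message = encode_detections([(10, 20, 50, 60, 0), (100, 90, 40, 40, 0)])
print(message)
print(decode_detections(message))
```

On the board, the encoded bytes would be handed to the UART object's write call, and the receiving microcontroller would read one line at a time and parse it the same way. A newline-terminated text format is easy to debug in a serial terminal, at the cost of a few extra bytes compared to a binary format.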
Step 6: Summary
Kendryte K210 is a solid chip for computer vision, flexible, albeit with limited memory available. So far, in my tutorials we have covered using it for recognizing custom objects, detecting custom objects and running some OpenMV based computer vision tasks. I know for a fact that it is also suitable for face recognition and with some tinkering it should be possible to do pose detection and image segmentation (you can use aXeleRate to train semantic segmentation model, but I did not yet implement the inference with K210). Feel free to have a look at aXeleRate repository issues and make a PR if you think there are some improvements that you can contribute!
Here are some articles I used in writing this tutorial, have a look if you want to learn more about object detection with neural networks:
Bounding box object detectors: understanding YOLO, You Look Only Once
Understanding YOLO (more math)
Gentle guide on how YOLO Object Localization works with Keras (Part 2)
Real-time Object Detection with YOLO, YOLOv2 and now YOLOv3
Hope you can use the knowledge you have now to build some awesome projects with machine vision! You can buy Sipeed boards here, they are among the cheapest options available for ML on embedded systems.
Add me on LinkedIn if you have any questions and subscribe to my YouTube channel to get notified about more interesting projects involving machine learning and robotics.
63 Comments
5 months ago
hi. i can't convert tflite to kmodel file with this issue.
1 year ago
I am working on my capstone project, which is person detection, in only one class. if have followed the same as racoon detector, my training images size is 224 * 224 and annot using labelling, but when I completed training with ---
person 1.0000
mAP: 1.0000
mAP did not improve from 1.0.
Epoch 00048: Learning rate is 1.1540750901886111e-07.
Epoch 50/50
60/60 [==============================] - 14s 221ms/step - loss: 0.0925 - val_loss: 0.0907
and import kmodel to sipeed m1 dock but is not detecting anything.
I use the same anchors as the racoon detector, I think it's not perfect for my dataset
could you please help me out,
I want a 90% accuracy person detection model
Thank You.
2 years ago
Would this also work with HuskyLens?
Question 2 years ago
Hi, i tried to follow your tutorial but at point i runt this command ( ""python axelerate/train.py -c configs/raccoon_detector.json"") it gives me an error "Illegal instruction (core dumped)".
i changed path in raccoon_detector.json file accordingly my path.
please tell me what is the problem. i tried in both environment conda and in base linux. more important thing it is not showing traceback information which can help more.
please tell me what i can do?
Thanks.
Question 3 years ago
Hi,I tried to follow your tutorial,but without success,I don't know what the problem is.Can you give me some advice?
Here is the error when I try to run the train command:
Epoch 00007: val_loss did not improve from 0.69008
Epoch 8/25
106/106 [==============================] - 102s 959ms/step - loss: 0.3750 - val_loss: 0.7024
Epoch 00008: val_loss did not improve from 0.69008
Epoch 9/25
106/106 [==============================] - 101s 955ms/step - loss: 0.3097 - val_loss: 0.7333
Epoch 00009: val_loss did not improve from 0.69008
Epoch 10/25
106/106 [==============================] - 93s 880ms/step - loss: 0.3156 - val_loss: 0.7190
Epoch 00010: val_loss did not improve from 0.69008
Traceback (most recent call last):
File "train.py", line 77, in <module>
config['train']['is_only_detect'])
File "/home/imliubo/Kendryte/Yolo-digit-detector/yolo/frontend.py", line 140, in train
saved_weights_name = saved_weights_name)
File "/home/imliubo/Kendryte/Yolo-digit-detector/yolo/backend/utils/fit.py", line 140, in train
max_queue_size = 4)
File "/home/imliubo/miniconda3/envs/yolo/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/imliubo/miniconda3/envs/yolo/lib/python3.6/site-packages/keras/engine/training.py", line 1418, in fit_generator
initial_epoch=initial_epoch)
File "/home/imliubo/miniconda3/envs/yolo/lib/python3.6/site-packages/keras/engine/training_generator.py", line 251, in fit_generator
callbacks.on_epoch_end(epoch, epoch_logs)
File "/home/imliubo/miniconda3/envs/yolo/lib/python3.6/site-packages/keras/callbacks.py", line 79, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "/home/imliubo/miniconda3/envs/yolo/lib/python3.6/site-packages/keras/callbacks.py", line 1127, in on_epoch_end
K.set_value(self.model.optimizer.lr, new_lr)
File "/home/imliubo/miniconda3/envs/yolo/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2440, in set_value
assign_op = x.assign(assign_placeholder)
File "/home/imliubo/miniconda3/envs/yolo/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 1952, in assign
name=name)
File "/home/imliubo/miniconda3/envs/yolo/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 227, in assign
validate_shape=validate_shape)
File "/home/imliubo/miniconda3/envs/yolo/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 66, in assign
use_locking=use_locking, name=name)
File "/home/imliubo/miniconda3/envs/yolo/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 366, in _apply_op_helper
g = ops._get_graph_from_inputs(_Flatten(keywords.values()))
File "/home/imliubo/miniconda3/envs/yolo/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 6135, in _get_graph_from_inputs
_assert_same_graph(original_graph_element, graph_element)
File "/home/imliubo/miniconda3/envs/yolo/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 6071, in _assert_same_graph
(item, original_item))
ValueError: Tensor("Placeholder_137:0", shape=(), dtype=float32) must be from the same graph as Tensor("Adam/lr:0", shape=(), dtype=float32_ref).
Answer 3 years ago
Hi! Please, try with https://github.com/AIWintermuteAI/aXeleRate
This is a new framework I wrote; it is tailored for use with Google Colab. I updated the article to reflect the latest changes!
Reply 2 years ago
Hi DmitryM8,
I train kmodel follow this instruction : https://colab.research.google.com/github/AIWinterm...
And get a YOLO_best_mAP.kmodel and save it on sd card
But, when I run kpu.load("/sd/YOLO_best_mAP.kmodel") I got the error: "we only support V3 now". How can I fix it? Thanks in advance.
Reply 2 years ago
https://dl.sipeed.com/fileDownload?verify_code=9kb...
sudo ./kflash_gui
2 years ago
Something wrong... Training and inference successful... texts and boxes all over...
2 years ago
Hi , i tried to follow the tutorial but it is not working for me. I got the this error in the terminal (Error:13:) and i dont know how to fix it. Can u help me , please? I follow the instructions to the Colab and i wanted to tested it but when i run in the IDE that's what happen.
Reply 2 years ago
Hi! Is it you on GitHub issues?
https://github.com/sipeed/MaixPy/issues/299
Question 2 years ago
Hey Thank you for your tutorial it was really helpful, but I am not able to run my kmodel on my Maix dock device when I use the firmware ( maixpy_v0.4.0_50_gcafae9d ) that you used in this video,
I get an error:
ValueError: b>>> init i2c2
my kmodel size is 1.81 MB
also when I try to run it with the latest firmware
(maixpy_v0.5.0_106_g67c538f_minimum_with_ide_support)I get an error:
ERR_KMODEL_VERSION: only support kmodel V3
Thank you.
Answer 2 years ago
What about this firmware?
https://cn.dl.sipeed.com/MAIX/MaixPy/release/maste...
Plus, can you create an issue on MaixPy/aXeleRate github with screenshots/code/full description of the problem?
2 years ago
The first train it's ok, but second,third traning i got error:
Epoch 00011: val_loss did not improve from 0.55500
Traceback (most recent call last):
File "train.py", line 77, in <module>
config['train']['is_only_detect'])
File "/home/ducanh/Yolo-digit-detector/yolo/frontend.py", line 140, in train
saved_weights_name = saved_weights_name)
File "/home/ducanh/Yolo-digit-detector/yolo/backend/utils/fit.py", line 140, in train
max_queue_size = 4)
File "/home/ducanh/anaconda3/envs/yolo/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/ducanh/anaconda3/envs/yolo/lib/python3.6/site-packages/keras/engine/training.py", line 1418, in fit_generator
initial_epoch=initial_epoch)
File "/home/ducanh/anaconda3/envs/yolo/lib/python3.6/site-packages/keras/engine/training_generator.py", line 251, in fit_generator
callbacks.on_epoch_end(epoch, epoch_logs)
File "/home/ducanh/anaconda3/envs/yolo/lib/python3.6/site-packages/keras/callbacks.py", line 79, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "/home/ducanh/anaconda3/envs/yolo/lib/python3.6/site-packages/keras/callbacks.py", line 1127, in on_epoch_end
K.set_value(self.model.optimizer.lr, new_lr)
File "/home/ducanh/anaconda3/envs/yolo/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2440, in set_value
assign_op = x.assign(assign_placeholder)
File "/home/ducanh/anaconda3/envs/yolo/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 1762, in assign
name=name)
File "/home/ducanh/anaconda3/envs/yolo/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 223, in assign
validate_shape=validate_shape)
File "/home/ducanh/anaconda3/envs/yolo/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 64, in assign
use_locking=use_locking, name=name)
File "/home/ducanh/anaconda3/envs/yolo/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 350, in _apply_op_helper
g = ops._get_graph_from_inputs(_Flatten(keywords.values()))
File "/home/ducanh/anaconda3/envs/yolo/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 5713, in _get_graph_from_inputs
_assert_same_graph(original_graph_element, graph_element)
File "/home/ducanh/anaconda3/envs/yolo/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 5649, in _assert_same_graph
original_item))
ValueError: Tensor("Placeholder_137:0", shape=(), dtype=float32) must be from the same graph as Tensor("Adam/lr:0", shape=(), dtype=float32_ref).
Reply 2 years ago
Hi! You're using an outdated version of training script - please use this one https://github.com/AIWintermuteAI/aXeleRate as described in the article.
2 years ago
i always get error OSError: run error: 13 at line a = kpu.init_yolo2(task, 0.3, 0.3, 5, anchor)
i tried loading .kmodel from sd card but still get same error
i have also tried minimum firmware and minimum with ide (0.5.0_29)
my code is as follows:
import sensor,image,lcd
import KPU as kpu
lcd.init()
sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.set_windowing((224, 224))
sensor.set_vflip(1)
sensor.run(1)
classes = ["bearing_not_ok","bearing_ok"]
task = kpu.load(0x300000)
#task = kpu.load('/sd/m7_5.kmodel')
anchor = (0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828)
a = kpu.init_yolo2(task, 0.3, 0.3, 5, anchor)
while(True):
    #.rotation_corr(z_rotation=90.0)
    #a = img.pix_to_ai()
    code = kpu.run_yolo2(task, img)
    if code:
        for i in code:
            a=img.draw_rectangle(i.rect(),color = (0, 255, 0))
            a = img.draw_string(i.x(),i.y(), classes[i.classid()], color=(255,0,0), scale=3)
        a = lcd.display(img)
    else:
        a = lcd.display(img)
Reply 2 years ago
Hi! Please check this example in my GitHub https://github.com/AIWintermuteAI/aXeleRate/blob/master/example_scripts/k210/detector/racoon_detector.py specifically line 14
Question 2 years ago
Hi Just tried a fresh install on a new VM (Ubuntu Desktop 16.04.6 64-bit), followed instructions but when I tried running tests_training.py I get the following error:
ModuleNotFoundError: No module named 'tensorflow.keras.layers.experimental.preprocessing'
...
ImportError: Keras requires Tensorflow 2.2 or higher.
Any suggestions? The only thing I did differently to your instructions was that I had to install git before using the pip command.
Thanks.
Tim
Answer 2 years ago
Fixed that :) there was a problem with requirements.txt file
Question 3 years ago
hi,
from some days I got this error on every script I try to load on the board (maix dock)...
after some seconds the camera stops working, the board reboots and from maixpyIDE (or from PuttY) i got only this error line:
"RuntimeError: Sensor Timeout!!"
I've already tried changing a lot of different firmwares...
sorry for the bit "off topic" question, but i've already ask for this on official sypeed forum and their github profile but i'm waiting for a response :)
is my sensor faulty? i've to buy a new ov2640?
thanks!