Raspberry Pi 4 Traffic Sign Recognition Robot

This instructable is based on my university project. The aim was to create a system in which a neural network analyses an image and, based on what it recognises, tells an Arduino robot how to move via ROS.

For example, if a turn right sign is recognised the robot turns right, if a turn left sign is recognised the robot turns left, and if neither is recognised the robot continues forward. The dataset used is the official traffic sign recognition dataset from the INI (Institut für Neuroinformatik, 2019). It has 43 classes, but only two are needed here: the left and right turn signs, found in folders 00033 and 00034 of the dataset.

Step 1: Requirements

The requirements for this project are the following:

An Arduino robot (basically an Arduino Uno, a motor driver and motors). Not needed if you aren't using a robot.

A Raspberry Pi 4.

A Pi camera.

Software required:

Python 3.

OpenCV 4.

TensorFlow.

Arduino IDE (not needed if you aren't using a robot).

ROS (not needed if you aren't using a robot).

Whatever your favourite Python IDE is (on the Raspberry Pi, I use Thonny).

To set up OpenCV and TensorFlow, follow the instructions by Adrian at PyImageSearch. Link: https://www.pyimagesearch.com/

I recommend looking at as many of his tutorials as possible. They are really interesting and are useful for beginners and intermediate users alike.

Step 2: Training of Data

The training script reads the dataset, which comprises around 50,000 images from 43 classes. It is written in Python and uses the following libraries:

os: points the script at the directory where the dataset is located.

Matplotlib: plots the results of training the model.

TensorFlow and Keras: used to design and build the artificial neural network model.

NumPy: turns images into arrays that can be put through the model to retrieve a prediction.

The attached script is the Python code that builds a model from the dataset. The model starts with a 2D convolution with a (5,5) kernel and ReLU activation, followed by pooling. The output then goes through another convolution with a (3,3) kernel, the same activation and pooling. This happens one last time before the output is flattened and passed to a dense layer with one output per class, in this case 43.
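To see what these layers do to a 32 x 32 input image, the sketch below traces the width and height of the feature map through the three convolution and pooling stages. It is only an illustration, not the attached script: it assumes 'valid' padding with stride 1 and 2 x 2 pooling (the Keras defaults) and a (3,3) kernel for the third convolution, none of which are stated above. The function names are made up for this example.

```python
def conv_out(size, kernel, stride=1):
    """Output width/height of a convolution with 'valid' padding (assumed)."""
    return (size - kernel) // stride + 1


def pool_out(size, pool=2):
    """Output width/height of non-overlapping pooling (2 x 2 assumed)."""
    return size // pool


def trace_shapes(input_size=32, kernels=(5, 3, 3)):
    """Return (stage name, spatial size) after each conv and pool stage."""
    stages = []
    size = input_size
    for kernel in kernels:
        size = conv_out(size, kernel)
        stages.append(("conv %dx%d" % (kernel, kernel), size))
        size = pool_out(size)
        stages.append(("pool 2x2", size))
    return stages


for name, size in trace_shapes():
    print(name, "->", size, "x", size)
# conv 5x5 -> 28 x 28
# pool 2x2 -> 14 x 14
# conv 3x3 -> 12 x 12
# pool 2x2 -> 6 x 6
# conv 3x3 -> 4 x 4
# pool 2x2 -> 2 x 2
```

Under these assumptions the flatten step receives a 2 x 2 feature map per filter, which is then fed to the 43-way dense layer.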

The next step is to compile the model. This is where the optimiser is set. I chose SGD (stochastic gradient descent) because it was similar to the optimiser I had used in an earlier assignment. The loss also needs to be set at compile time. sparse_categorical_crossentropy is the best fit because the class labels are integers and the model outputs a prediction for each class as a float between 0 and 1, with 1 meaning 100% certainty.
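As a worked illustration of why this loss suits integer labels, here is a minimal sketch in plain Python rather than the Keras implementation. For a single image, the loss is the negative log of the probability the model gave to the true class, so the label is used directly as an index. The three-class probabilities and the clipping constant are assumptions made for the example.

```python
import math


def sparse_categorical_crossentropy(probabilities, true_class):
    """Loss for one image: -log of the probability given to the true class.

    probabilities: list of floats between 0 and 1, one per class.
    true_class: integer label, used directly as an index.
    """
    eps = 1e-7  # assumed guard so log(0) is never evaluated
    p = min(max(probabilities[true_class], eps), 1.0)
    return -math.log(p)


# Toy example with 3 classes instead of 43.
prediction = [0.9, 0.05, 0.05]
loss_if_correct = sparse_categorical_crossentropy(prediction, 0)  # about 0.105
loss_if_wrong = sparse_categorical_crossentropy(prediction, 1)    # about 2.996
print(loss_if_correct, loss_if_wrong)
```

The loss is small when the model is confident about the right class and grows quickly when the true class was given a low probability.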

Once the model is compiled, a generator is used so the model can start processing the image inputs. The training call consists of multiple parts: training_set is the link to the dataset used for training, steps_per_epoch is the number of steps required per epoch, epochs is how many times the program will iterate through a full set of data, validation_data is the link to the dataset used for validation, and validation_steps is the number of steps used for validation. Validation happens at the end of each epoch.

Generally, one complete pass over the whole dataset should be made per epoch. For example, a dataset of 1024 images with a batch size of 32 needs 32 steps per epoch (1024 / 32 = 32), so a single epoch covers every image once. Each step processes one full batch. It is also best to have a batch size bigger than the number of classes, because with a smaller batch a single step can't include an image from each class.
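The arithmetic can be checked with a small helper. This is a sketch, not part of the attached script, and the function name is made up for the example. It rounds up so that a final, partly filled batch still counts as a step.

```python
import math


def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once in an epoch."""
    return math.ceil(num_images / batch_size)


print(steps_per_epoch(1024, 32))  # 32, the example from the text
print(steps_per_epoch(1000, 32))  # 32, because 31.25 is rounded up
```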

Once the model has finished training, the program uses Matplotlib to make a graph of the outputs, which shows the history of the training from start to finish. The graph consists of accuracy, validation accuracy, loss and validation loss, split up per epoch to show how the training progressed. The final stage is to save the model as a .h5 file, which can be accessed later on for the prediction process. Saving the model means that the training program doesn't need to be run again each time the prediction program is run. The training program can take up to 10 minutes per epoch on a Raspberry Pi.

Attached is the Training script:

traffic2.py

Step 3: Implementing the Pi Camera Predictions

The next program is the prediction and publisher script.

The first stage is to load the saved model (in Keras this is done with load_model()). The second stage is to iterate through the frames from the Pi camera using OpenCV and resize each frame to the input size used in the training stage, 32 x 32 pixels. The resized frame is then put through the model using model.predict(), which outputs a matrix. Each element of the matrix is a float from 0 to 1, and the element's index corresponds to the class it represents, so the first element belongs to the first class and its value is the certainty that the image is from that class. E.g. [[1.0, 0.0, 0.0,…]] shows that the prediction is 100% for the first class and zero for the others.

In real cases the output will show some certainty for multiple classes and will not always reach 100%, hence the added if statement: if the certainty is more than 95% (0.95), return the index. This index is looked up in a list of lists holding the actual names of the classes, e.g. ’20 mph speed’ or ‘turn left’. Once this is complete there is an if statement which says ‘if turn left or turn right then publish that sign to ROS, else publish ‘ahead’’. The publisher publishes the string to ROS under the topic called robot. In the next step the robot will listen to that topic.
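The decision logic described above can be sketched in plain Python without the camera, the model or ROS. This is an illustration under assumptions, not the attached robotmove.py: the function names, the flat three-entry class_names list and the exact strings are hypothetical, and the prediction is passed in as a nested list shaped like the output of model.predict().

```python
CONFIDENCE_THRESHOLD = 0.95  # the 95% cut-off from the text


def confident_class(prediction_row, threshold=CONFIDENCE_THRESHOLD):
    """Return the index of the most certain class, or None if below threshold."""
    best_index = max(range(len(prediction_row)), key=lambda i: prediction_row[i])
    if prediction_row[best_index] > threshold:
        return best_index
    return None


def command_from_prediction(prediction, class_names):
    """Turn a model.predict()-style matrix into the string to publish."""
    row = prediction[0]  # one row per input image; a single frame is assumed
    index = confident_class(row)
    if index is None:
        return "ahead"
    label = class_names[index]
    if label in ("turn left", "turn right"):
        return label
    return "ahead"


# Hypothetical 3-class label list standing in for the 43 real classes.
class_names = ["20 mph speed", "turn left", "turn right"]

print(command_from_prediction([[0.01, 0.98, 0.01]], class_names))  # turn left
print(command_from_prediction([[0.50, 0.30, 0.20]], class_names))  # ahead
print(command_from_prediction([[0.97, 0.02, 0.01]], class_names))  # ahead
```

A frame that is confidently recognised as some other sign still results in 'ahead', as does a frame where no class passes the threshold.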

NOTE: If you aren't using the robot side, just remove the following lines:

"import rospy"

"def talker(direction):
    message = String()
    pub = rospy.Publisher('robot', String, queue_size=10)
    rospy.init_node('talker', anonymous=True)
    message = direction
    rospy.loginfo(message)
    pub.publish(message)"

"talker(direction)"

Attached is the Pi camera script.

robotmove.py

Step 4: Arduino Robot

The last step is the robot program script.

This is written in C++ and is a .ino file for the Arduino Uno. The program requires the ROS library, which can be found in the library manager within the IDE. Once it is installed there are example files. I chose to expand on the LED blink example since it does something similar to what I needed. The program loops until the power is disconnected. It listens to the topic robot, and when it receives a command from that topic an if statement checks what the command says. If the command is left the script runs the turn left method, if the command is right it runs the turn right method, and otherwise it runs the forward method. These three methods are very similar to each other: they set each motor pin to either LOW (ground) or a PWM value of 100, which tells the motor driver to let only part of the voltage through so that the robot isn't too fast. The pattern of these outputs is what makes the robot turn left, turn right or go forwards, because it determines the direction of the voltage going to each motor.
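To reason about the robot logic without the hardware, the dispatch can be modelled in a few lines of Python. This is not the .ino file. The pin names and the wiring pattern are hypothetical (one forward and one reverse input per motor on the driver), and matching on the words 'left' and 'right' inside the command is an assumption, since the exact strings depend on what the publisher sends. Only LOW and the PWM value of 100 come from the text.

```python
LOW = 0      # pin held at ground
SPEED = 100  # PWM value from the text (an Arduino Uno accepts 0 to 255)

# Hypothetical wiring: each motor has a forward and a reverse driver input.
PATTERNS = {
    "forward": {"left_fwd": SPEED, "left_rev": LOW, "right_fwd": SPEED, "right_rev": LOW},
    "left": {"left_fwd": LOW, "left_rev": SPEED, "right_fwd": SPEED, "right_rev": LOW},
    "right": {"left_fwd": SPEED, "left_rev": LOW, "right_fwd": LOW, "right_rev": SPEED},
}


def motion_for(command):
    """Mirror the if / else if / else chain: left, right, otherwise forward."""
    if "left" in command:
        return "left"
    if "right" in command:
        return "right"
    return "forward"


def pin_levels(command):
    """Return the value each (hypothetical) driver input would be set to."""
    return PATTERNS[motion_for(command)]


print(motion_for("turn left"))  # left
print(motion_for("ahead"))      # forward
print(pin_levels("turn right"))
```

In this model, turning is done by driving the two motors in opposite directions, which is one common way to wire a two-motor robot. Your own pin order may differ.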

Attached is the .ino script for the Arduino.

Step 5: Testing

The images attached show the project from start to finish. The first image shows the training in progress. Once that is complete, a printout of the model that was made is shown. The third image shows a prediction from the training script, which is the last stage of that script. If you look in the folder the training script is in, a graph and a model have been made. The graph should look like image 4 here, which shows the history of the training from start to finish.

The final image was taken while running the Pi camera script, which shows a live stream from the Pi camera. A prediction is made on each frame and printed in the terminal. The frame shows what the camera is seeing.

Attached is my university report for this project. Please read it for more detail on the project.