Using Sonar, Lidar, and Computer Vision on Microcontrollers to Aid the Visually Impaired

I want to create an intelligent ‘cane’ that can help people with visual impairments much more than existing solutions. The cane will be able to notify the user of objects in front or on the sides by playing a sound in surround-sound-style headphones. The cane will also have a small camera and LIDAR (Light Detection and Ranging) so that it can recognize objects and people in the room and notify the user through the headphones. For safety reasons, the headphones will not block out all outside noise: a microphone will filter out unnecessary sounds while keeping car horns and people talking. Lastly, the system will have a GPS so that it can give directions and tell the user where to go.


Step 1: Overview of Project

According to World Access for the Blind, physical movement is one of the biggest challenges for blind people. Traveling or simply walking down a crowded street may be very difficult. Traditionally, the only solution was the commonly known “white cane”, which is primarily used to scan the surroundings by hitting obstacles in proximity of the user. A better solution would be a device that can replace the sighted assistant by providing information about the location of obstacles so that the blind person can go out in unknown environments and feel safe. During this project, a small battery-operated device that meets these criteria was developed. The device can detect the size and location of objects by means of sensors which measure the position of objects in relation to the user, relay that information to a microcontroller, and then convert it to audio to provide information to the user. The device was built using commercially available LIDAR (Light Detection and Ranging), SONAR (Sound Navigation and Ranging), and computer vision technologies linked to microcontrollers and programmed to provide the required audible output through earbuds or headphones. The detection technology was embedded within a “white cane” to indicate the user’s condition to others and provide additional safety.

Step 2: Background Research

In 2017, the World Health Organization reported that there were 285 million visually-impaired people worldwide, of whom 39 million are completely blind. Most people don't think about the issues that visually-impaired people face every day. According to World Access for the Blind, physical movement is one of the biggest challenges for blind people. Traveling or simply walking down a crowded street may be very difficult. Because of this, many people who are visually-impaired prefer bringing a sighted friend or family member to help navigate new environments. Traditionally, the only solution was the commonly known “white cane”, which is primarily used to scan the surroundings by hitting obstacles in proximity of the user. A better solution would be a device that can replace the sighted assistant by providing information about the location of obstacles so that the blind person can go out in unknown environments and feel safe. NavCog, a collaboration between IBM and Carnegie Mellon University, has attempted to solve the problem by creating a system that uses Bluetooth beacons and smartphones to help guide users. However, the solution was cumbersome and proved to be very costly for large-scale implementations. My solution addresses this by eliminating any need for external devices and by using a voice to guide the user throughout the day (Figure 3). The advantage of having the technology embedded within a “white cane” is that it signals the user’s condition to the rest of the world, which changes the behavior of the surrounding people.

Step 3: Design Requirements

After researching the technologies available, I discussed possible solutions with vision professionals on the best approach to helping the visually-impaired navigate their environment. The table below lists the most important features required for someone to transition to my device.

Feature - Description:

Computation - The system needs to quickly process the information exchanged between the user and the sensors. For example, the system needs to be able to inform the user of obstacles in front that are at least 2 m away.
Coverage - The system needs to provide its services indoors and outdoors to improve the quality of visually-impaired people’s lives.
Time - The system should perform as well in daytime as at night time.
Range - The range is the distance between the user and the object to be detected by the system. Ideal minimum range is 0.5 m, whereas the maximum range should be more than 5 m. Further distances would be even better but more challenging to compute.
Object Type - The system should detect the sudden appearance of objects. The system should be able to tell the difference between moving objects and static objects.
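
To put the Range and Computation requirements in perspective, a rough back-of-the-envelope check can help. The sketch below assumes a typical walking speed of 1.4 m/s (an illustrative figure, not a measurement from this project) and estimates how much warning time each detection range gives the user.

```python
# Rough warning-time estimate for the range requirement.
# WALKING_SPEED_M_S is an assumed typical walking pace, not a measured value.
WALKING_SPEED_M_S = 1.4


def warning_time_s(detection_range_m, speed_m_s=WALKING_SPEED_M_S):
    """Seconds until the user reaches an obstacle first seen at the given range."""
    if speed_m_s <= 0:
        raise ValueError("speed must be positive")
    return detection_range_m / speed_m_s


# The three ranges mentioned in the requirements above.
for range_m in (0.5, 2.0, 5.0):
    print(f"{range_m} m -> {warning_time_s(range_m):.2f} s of warning")
```

Under that assumption, a 2 m detection range leaves a little under a second and a half to detect, process, and announce an obstacle, which is why fast processing matters.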

Step 4: Engineering Design and Equipment Selection

After looking at many different components, I decided on parts selected from the different categories below.

Price of selected parts:

Zungle Panther: $149.99
LiDAR Lite V3: $149.99
LV-MaxSonar-EZ1: $29.95
Ultrasonic Sensor - HC-SR04: $3.95
Raspberry Pi 3: $39.95
Arduino: $24.95
Kinect: $32.44
Floureon 11.1v 3s 1500mAh: $19.99
LM2596HV: $9.64

Step 5: Equipment Selection: Method of Interaction

I decided to use voice control as the method to interact with the device because having multiple buttons on a cane can be challenging for a visually-impaired person, especially if some functions require combinations of buttons. With voice control, the user can use preset commands to communicate with the cane, which reduces potential errors.

Device: Pros --- Cons:

Buttons: No command errors when the right button is pressed --- It may be challenging to ensure the correct buttons are pressed
Voice control: Easy because the user can use preset commands --- Incorrect pronunciation may induce errors
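
One way to soften the pronunciation problem is to match whatever the speech recognizer heard against the preset commands with some tolerance. The sketch below is only an illustration: the command list is hypothetical (the actual commands are not listed in this write-up), and it uses Python's standard `difflib` module for fuzzy matching rather than any particular speech library.

```python
import difflib

# Hypothetical preset commands; the actual command set is not listed in this write-up.
PRESET_COMMANDS = ["what is ahead", "navigate home", "stop guidance", "battery level"]


def match_command(heard_text, cutoff=0.6):
    """Return the closest preset command, or None if nothing is similar enough."""
    cleaned = heard_text.strip().lower()
    matches = difflib.get_close_matches(cleaned, PRESET_COMMANDS, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(match_command("navigate hom"))    # slightly misheard phrase
print(match_command("play some jazz"))  # not one of the preset commands
```

Raising `cutoff` makes the matcher stricter (fewer wrong commands, more "please repeat" prompts); lowering it does the opposite.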

Step 6: Equipment Selection: Microcontroller

The device uses the Raspberry Pi because of its low cost and because it has sufficient processing power to calculate the depth map. The Intel Joule would have been the preferred option, but its price would have doubled the cost of the system, which is not ideal for a device intended to be a lower-cost option for users. The Arduino is used in the system because it can easily read information from sensors. The BeagleBone and Intel Edison were not used because of their poor price-to-performance ratio, which makes them a bad fit for this low-cost system.

Microcontroller: Pros --- Cons:

Raspberry Pi: Has enough processing power for finding obstacles and has integrated WiFi/Bluetooth --- Not many options for receiving data from sensors
Arduino: Easily receives data from small sensors such as LIDAR, ultrasonic, and SONAR --- Not enough processing power for finding obstacles
Intel Edison: Can process obstacles quickly with fast processor --- Requires extra developer pieces to function for system
Intel Joule: Has double the processing speed of any of the microcontrollers on the consumer market to date --- Very high cost for this system and difficult to interact with GPIO for sensor interaction
BeagleBone Black: Compact and compatible with sensors used in project by using the General Purpose Input Output (GPIO) --- Not enough processing power to effectively find objects

Step 7: Equipment Selection: Sensors

A combination of several sensors is used in order to obtain high location accuracy. The Kinect is the main sensor because of the amount of area it can scan for obstacles at one time. LIDAR, which stands for LIght Detection and Ranging, is a remote sensing method that uses light in the form of a pulsed laser to rapidly measure distances from the sensor to objects. It is used because it can track an area up to 40 meters (m) away and, since it can scan at various angles, it can detect if any steps are going up or down. The SOund Navigation And Ranging (SONAR) and ultrasonic sensors are used as backup tracking in the event the Kinect misses a pole or bump in the ground that would pose a hazard to the user. The 9 Degrees of Freedom Sensor is used for tracking what direction the user is facing so that the device can store the information and give more accurate directions the next time the person walks in the same place.
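
The backup role of the SONAR and ultrasonic sensors can be sketched as a simple "closest valid reading wins" rule. This is a minimal illustration, not the project's actual code: the sensor names, the `nearest_obstacle` helper, and the range limits are assumptions loosely based on the figures quoted in this write-up.

```python
# Assumed usable ranges in metres, loosely based on the figures quoted in this write-up.
# The sensor names and limits are illustrative, not taken from the project code.
SENSOR_RANGES_M = {
    "kinect": (0.5, 4.5),
    "lidar": (0.05, 40.0),
    "sonar": (0.15, 5.0),
    "ultrasonic": (0.02, 3.0),
}


def nearest_obstacle(readings):
    """Return (sensor, distance_m) for the closest in-range reading, or None."""
    valid = {}
    for sensor, distance_m in readings.items():
        low, high = SENSOR_RANGES_M[sensor]
        # Drop missing readings and anything outside the sensor's usable range.
        if distance_m is not None and low <= distance_m <= high:
            valid[sensor] = distance_m
    if not valid:
        return None
    sensor = min(valid, key=valid.get)
    return sensor, valid[sensor]


# Example: the Kinect reports nothing, but the sonar picks up a pole at 1.2 m.
readings = {"kinect": None, "lidar": 6.3, "sonar": 1.2, "ultrasonic": 9.9}
print(nearest_obstacle(readings))
```

In this example the ultrasonic value is discarded as out of range and the sonar reading is reported, so an obstacle the Kinect missed still reaches the user.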

Sensors: Pros --- Cons:

Kinect V1: Can track 3D objects with --- Only one camera to detect surroundings
Kinect V2: Has 3 infrared cameras and a Red, Green, Blue, Depth (RGB-D) camera for high precision 3D object detection --- Can heat up and may need a cooling fan, and is larger than other sensors
LIDAR: Beam that can track locations up to 40 m away --- Needs to be positioned towards object and can only look in that direction
SONAR: Beam that can track 5 m away but in a far range --- Small objects like feathers can trigger the sensor
Ultrasonic: Has a range of up to 3 m and is very inexpensive --- Distances can occasionally be inaccurate
9 Degrees of Freedom Sensor: Good for sensing orientation and speed of the user --- If anything interferes with the sensors, the distance calculations can be incorrect
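
For the ultrasonic sensor, distance comes from timing an echo: the pulse travels to the object and back, so the one-way distance is half the round-trip time multiplied by the speed of sound. The sketch below shows that conversion plus a median over several readings as one possible way to tame occasional bad values. The function names are made up for illustration, and the speed of sound is an approximate room-temperature figure.

```python
import statistics

SPEED_OF_SOUND_M_S = 343.0  # dry air at roughly 20 °C; varies with temperature


def echo_to_distance_m(echo_pulse_s):
    """Convert a round-trip echo time in seconds to a one-way distance in metres."""
    if echo_pulse_s < 0:
        raise ValueError("echo time cannot be negative")
    return echo_pulse_s * SPEED_OF_SOUND_M_S / 2.0


def median_distance_m(echo_pulses_s):
    """Median of several readings, which suppresses occasional outliers."""
    return statistics.median(echo_to_distance_m(t) for t in echo_pulses_s)


# Four echo times in seconds; the third is a deliberately bad reading.
pulses = [0.0058, 0.0059, 0.0200, 0.0058]
print(f"{median_distance_m(pulses):.2f} m")
```

A plain average would be pulled toward the bad reading, while the median stays close to the three consistent ones.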

Step 8: Equipment Selection: Software

The software selected for the first few prototypes, built with the Kinect V1 sensor, was Freenect, but it was not very accurate. After switching to the Kinect V2 and Freenect2, the tracking results improved significantly because the V2 has an HD camera and 3 infrared cameras, as opposed to a single camera on the Kinect V1. When I used OpenNi2 with the Kinect V1, the functions were limited and I could not control some of the functions of the device.

Software: Pros --- Cons:

Freenect: Provides low-level control over the whole device --- Only supports the Kinect V1
OpenNi2: Can easily create the point cloud data from the information stream from the Kinect --- Only supports the Kinect V1 and doesn't have support for low-level control
Freenect2: Provides low-level control over the sensor bar --- Only works with the Kinect V2
ROS: Operating system ideal for programming camera functions --- Needs to be installed on a fast SD card so that the software will work

Step 9: Equipment Selection: Other Parts

Lithium Ion batteries were selected due to being light, having a high power capacity, and being rechargeable. The 18650 variant of the lithium ion battery has a cylindrical shape and fits perfectly into the cane prototype. The 1st prototype cane is made of PVC pipe because it is hollow and reduces the weight of the cane.

Step 10: System Development: Creating the Hardware Part 1

First we have to disassemble the Kinect to make it lighter and so that it will fit inside the cane. I started by removing all of the outside casing from the Kinect, as the plastic used weighs a LOT. Then I had to cut the cable so that the base could be removed. I took the wires from the connector shown in the picture and soldered the signal wires to a USB cable; the other two connections are for the 12V input power. Since I wanted the fan inside the cane to run at full power to cool all of the other components, I cut the connector off the Kinect's fan and wired the fan to 5V from the Raspberry Pi. I also made a small adapter for the LiDAR wire so that it can connect directly to the Raspberry Pi without any other systems in between.

I accidentally soldered the white wire to the black one so don't look at the images for wiring diagrams.

Step 11: System Development: Creating the Hardware Part 2

I created a regulator piece to provide power to all the devices that require 5V like the Raspberry Pi. I tuned the regulator by putting a meter on the output and adjusting the resistor so that the regulator would provide 5.05V. I put it a little bit higher than 5V because over time, the battery voltage goes down and slightly affects the output voltage. I also made an adapter which allows me to power up to 5 devices that require the 12V from the battery.
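
For anyone replacing the adjustment resistor with fixed parts, the output of an adjustable LM2596-type regulator is set by its feedback divider: Vout = Vref × (1 + R2 / R1), with a typical reference of about 1.23 V. The sketch below works out the resistance needed for 5.05 V. The 1 kΩ value for R1 is an assumption for illustration, so check what is actually fitted on your module.

```python
V_REF = 1.23  # typical feedback reference voltage of the adjustable LM2596, in volts


def output_voltage(r1_ohm, r2_ohm, v_ref=V_REF):
    """Regulated output for a feedback divider: Vout = Vref * (1 + R2 / R1)."""
    return v_ref * (1.0 + r2_ohm / r1_ohm)


def r2_for_target(v_out, r1_ohm, v_ref=V_REF):
    """Resistance between output and feedback pin needed to reach v_out."""
    return r1_ohm * (v_out / v_ref - 1.0)


R1_OHM = 1000.0  # assumed fixed resistor; check the value fitted on the actual module
r2 = r2_for_target(5.05, R1_OHM)
print(f"R2 = {r2:.0f} ohm gives {output_voltage(R1_OHM, r2):.2f} V")
```

With these assumed values R2 comes out at roughly 3.1 kΩ. Measuring the output with a meter, as described above, is still the final check.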

Step 12: System Development: Programming the System Part 1

One of the most challenging parts of this system is the programming. When I first got the Kinect to play around with, I installed a program called RTAB-Map, which takes the data stream from the Kinect and converts it into a point cloud. From the point cloud, it created a 3D image that can be rotated to see the depth of where all of the objects are. After playing around with it for a while and adjusting all of the settings, I decided to install some software on the Raspberry Pi to allow me to see the data stream from the Kinect. The last two images above show what the Raspberry Pi can produce at about 15-20 frames per second.
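
To give an idea of what "converting the data stream into a point cloud" involves, here is a minimal sketch of the usual pinhole back-projection from a depth image to 3D points. It is not RTAB-Map's internal code. The function name and the camera intrinsics (`fx`, `fy`, `cx`, `cy`) are made-up values for illustration; real values come from the sensor's calibration.

```python
def depth_to_points(depth_m, fx, fy, cx, cy):
    """Back-project a depth image (rows of metres) into (x, y, z) camera-frame points.

    Pixels with a depth of 0 or less are treated as "no reading" and skipped.
    """
    points = []
    for v, row in enumerate(depth_m):      # v = pixel row
        for u, z in enumerate(row):        # u = pixel column
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Tiny 2x3 "depth image" and made-up intrinsics, purely for illustration.
depth = [
    [0.0, 2.0, 2.0],
    [1.0, 1.0, 0.0],
]
cloud = depth_to_points(depth, fx=2.0, fy=2.0, cx=1.0, cy=0.5)
print(len(cloud), "points:", cloud)
```

On a real frame the same arithmetic would normally be done on whole arrays at once for speed; the loops here just keep each step visible.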

Step 13: System Development: Programming the System Part 2

I implemented OpenCV in my program because after doing some research on how the library works, I figured that I could split the 3D point cloud from the Raspberry Pi into many different layers. The first program that I made was simple as it split a 3D point cloud into different layers so that I could examine how the program could work. Then I found all the objects by finding "blobs" of points in a certain distance away from the user and marked them with a red circle so that I could see if the Raspberry Pi could disregard small objects such as leaves or snow. The third program incorporates the live stream from the Kinect rather than just a still 3D point cloud used in the previous two programs. The final program adds the sound when finding the object in the proximity of the user. It says how far the object is from the user as well as whether the object is to the left, right, or in front of them.
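
The layer-and-blob idea can be sketched without any libraries. The example below is a simplified stand-in for the OpenCV version, not the program from this project: it slices a depth frame into one distance layer, groups touching pixels into blobs with a flood fill, and drops blobs below a minimum size (the "leaves or snow" filter). The function names and the toy frame are invented for illustration.

```python
from collections import deque


def layer_mask(depth_m, near_m, far_m):
    """Boolean grid marking pixels whose depth falls inside one distance layer."""
    return [[near_m <= z < far_m for z in row] for row in depth_m]


def find_blobs(mask, min_pixels):
    """4-connected blobs in a boolean grid, ignoring those smaller than min_pixels.

    Returns a list of (pixel_count, (centroid_row, centroid_col)) tuples.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            seen[r0][c0] = True
            queue, pixels = deque([(r0, c0)]), []
            while queue:  # breadth-first flood fill from the seed pixel
                r, c = queue.popleft()
                pixels.append((r, c))
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and mask[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if len(pixels) >= min_pixels:  # small blobs are treated as noise
                centroid = (sum(p[0] for p in pixels) / len(pixels),
                            sum(p[1] for p in pixels) / len(pixels))
                blobs.append((len(pixels), centroid))
    return blobs


# Toy 4x6 depth frame in metres: a 2x2 object at 1.5 m and one stray pixel (a "leaf").
frame = [
    [4.0, 4.0, 4.0, 4.0, 4.0, 4.0],
    [4.0, 1.5, 1.5, 4.0, 4.0, 1.4],
    [4.0, 1.5, 1.5, 4.0, 4.0, 4.0],
    [4.0, 4.0, 4.0, 4.0, 4.0, 4.0],
]
blobs = find_blobs(layer_mask(frame, near_m=1.0, far_m=2.0), min_pixels=3)
print(blobs)
```

Only the 2x2 object survives the size filter. With OpenCV, `cv2.inRange` and `cv2.connectedComponentsWithStats` cover roughly the same two steps on full-size frames.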

I included the files and the zipped Raspberry Pi image in the Google Drive folder.
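
The spoken cue from the final program (how far away the object is, and whether it is left, right, or in front) can be sketched as a small text-building step before the audio is played. This is an illustrative guess at the logic, not the shipped code: the `describe_obstacle` name and the split of the frame into equal thirds are assumptions.

```python
METERS_TO_FEET = 3.28084


def describe_obstacle(distance_m, centroid_col, frame_width):
    """Build a short spoken cue such as 'object 10 feet to the left'."""
    feet = round(distance_m * METERS_TO_FEET)
    # Assumption: split the frame into equal thirds for left / front / right.
    if centroid_col < frame_width / 3:
        side = "to the left"
    elif centroid_col < 2 * frame_width / 3:
        side = "in front"
    else:
        side = "to the right"
    return f"object {feet} feet {side}"


# An object about 3 m away, near the left edge of a 640-pixel-wide frame.
print(describe_obstacle(3.05, 100, 640))
```

The resulting string could either select a pre-recorded clip or be passed to a text-to-speech engine.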


Step 14: Conclusion

After many trials, a final design was selected. I used the Kinect with the Freenect software to create a high-quality depth map. Alongside that, I used an array of sensors which included a LIDAR, a SONAR, an ultrasonic sensor, and a 9 Degrees Of Freedom sensor which significantly increased the accuracy of the object detection. The Raspberry Pi was used to calculate and plot the depth map, find obstacles, and warn the user verbally through the speakers in the sunglasses. My device improved the navigation capabilities for visually-impaired people but could benefit from some future prototypes.

Step 15: Future Designs

In the future, a smaller sensor to replace the Kinect would be ideal to keep cost and weight of the system down. The device could have the ability to recognize faces and could identify people present in the room by comparing to a database of persons known to the user. Another feature would consist of scene description which would differentiate between objects and people and allow the user to pose more specific questions such as who is in the room, what color is the chair, etc… Finally, connecting the device to the internet and using GPS would allow storage of data and continual update of surroundings; this information could be saved for future use to make the calculation process quicker.