Raspberry Pi Visual Queries

Smart glasses enable people to view and interact with the world around them in new ways. The most common forms of interaction today are voice commands and hand gestures that issue commands to the device.

Our project’s main goal is to provide a more fluent and natural interface that uses hand gestures not only to give commands but also to reference the reality we see.

We aim to provide a new API for the smart glasses developer community, enabling new applications.

The API is demonstrated through two applications: a program that describes an image (called CamFind) and an OCR program.

As can be seen in the demonstration, the system first recognizes the hand gesture. Next, the image from the Raspberry Pi camera is cropped to the square area indicated by the user's hands. Finally, the system receives a voice command that lets the user choose which application to run on the cropped image, and the answer is returned through the headset.
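The crop and dispatch steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual API: the function names (crop_box, crop, dispatch), the idea that each hand contributes one corner point, and the command words "describe" and "read" are all assumptions made for the example. The frame is modeled as a plain list of pixel rows so the sketch runs without a camera.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[int, int]            # (x, y) in pixel coordinates
Box = Tuple[int, int, int, int]    # (left, top, right, bottom)


def crop_box(p1: Point, p2: Point, width: int, height: int) -> Box:
    """Return the box spanned by two corner points, clamped to the frame.

    Assumption: each hand yields one corner point of the framed area.
    """
    left, right = sorted((p1[0], p2[0]))
    top, bottom = sorted((p1[1], p2[1]))
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)
    return left, top, right, bottom


def crop(frame: Sequence[Sequence[int]], box: Box) -> List[List[int]]:
    """Cut the boxed region out of a frame stored as a list of pixel rows."""
    left, top, right, bottom = box
    return [list(row[left:right]) for row in frame[top:bottom]]


def dispatch(command: str, image: List[List[int]],
             apps: Dict[str, Callable[[List[List[int]]], str]]) -> str:
    """Run the application selected by the (already transcribed) voice command."""
    handler = apps.get(command.strip().lower())
    if handler is None:
        return "unknown command"
    return handler(image)


# Hypothetical stand-ins for the image-description and OCR applications.
apps = {
    "describe": lambda img: f"{len(img[0])}x{len(img)} region",
    "read": lambda img: "no text found",
}

frame = [[10 * y + x for x in range(6)] for y in range(4)]  # 6 wide, 4 high
box = crop_box((4, 3), (1, 1), width=6, height=4)
region = crop(frame, box)
answer = dispatch("Describe", region, apps)
```

Keeping the crop box computation separate from the cropping itself means the same box could be applied to a real camera frame, and the dispatch table makes it easy to register further applications under new command words.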

The specific hand gesture and the captured area can be seen in the image above. The backs of both hands face the camera; on each hand only the thumb and index finger are extended, and all other fingers are closed.
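The gesture rule above can be written down as a small predicate. This is only a sketch under stated assumptions: it presumes some upstream detector already reports, per hand, which fingers are extended and whether the back of the hand faces the camera. The HandState type and the function names are hypothetical and do not describe how the project actually detects hands.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandState:
    """Hypothetical per-hand detector output; True means the finger is extended."""
    back_facing_camera: bool
    thumb: bool
    index: bool
    middle: bool
    ring: bool
    pinky: bool


def is_frame_hand(hand: HandState) -> bool:
    """Back of hand toward the camera, thumb and index open, other fingers closed."""
    others_closed = not (hand.middle or hand.ring or hand.pinky)
    return hand.back_facing_camera and hand.thumb and hand.index and others_closed


def is_frame_gesture(left: HandState, right: HandState) -> bool:
    """The framing gesture requires both hands to satisfy the rule."""
    return is_frame_hand(left) and is_frame_hand(right)
```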

The code is modular, and each part can be used separately. The project's code is available on GitHub.

Hardware:

1. Raspberry Pi Model B+

2. Raspberry Pi HD Camera:
   a. Max video resolution: 1080p
   b. Size: 36 x 36 mm
   c. Lens holder: M12x0.5 or CS Mount

3. RPi Camera Extension Kit Bitwizard

4. 300 Mbps Wireless N USB Adapter TL-WN821N - an internet connection is needed only for the applications, not for the hand recognition part

5. Headset - needed for the applications only

6. Power bank - for Raspberry Pi mobility