Raspberry Pi Visual Queries


Introduction: Raspberry Pi Visual Queries

Smart glasses let people view and interact with the world around them in new ways. The most common forms of interaction today are voice commands and hand gestures that give commands to the device.

Our project’s main goal is to provide a more fluent and natural interface in which hand gestures are used not only to give commands but also to reference the reality we see.

We aim to provide a new API for the smart glasses developer community, making new applications possible.

The API is demonstrated through two applications: a program that describes an image (called CamFind) and OCR.

As can be seen in the demonstration, the system first recognizes the hand gesture. Next, the image from the RPi is cropped to the square area indicated by the user's hands. Finally, the system receives a voice command that lets the user choose which application to run on the cropped image, and the answer is returned through the headset.
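The crop and dispatch steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `crop_between` and `run_query`, the use of two corner points to describe the area framed by the hands, and the `apps` dictionary are all assumptions made for this example. The frame is modelled as a plain list of pixel rows so that the sketch runs without any camera or imaging library.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]  # rows of pixel values; a stand-in for a camera frame
Point = Tuple[int, int]  # (x, y) in pixel coordinates


def crop_between(image: Image, corner_a: Point, corner_b: Point) -> Image:
    """Crop to the axis-aligned box spanned by two opposite corners.

    Hypothetical helper: the corners are assumed to come from the
    hand-recognition step, one per hand, in either order.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    left, right = sorted((corner_a[0], corner_b[0]))
    top, bottom = sorted((corner_a[1], corner_b[1]))
    # Clamp to the frame so a corner outside the image does not wrap around.
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)
    return [row[left:right] for row in image[top:bottom]]


def run_query(
    image: Image,
    corner_a: Point,
    corner_b: Point,
    command: str,
    apps: Dict[str, Callable[[Image], str]],
) -> str:
    """Crop the frame, then hand the crop to the application named by the command."""
    cropped = crop_between(image, corner_a, corner_b)
    handler = apps.get(command.strip().lower())
    if handler is None:
        return "Unknown command: " + command
    return handler(cropped)
```

In use, `apps` would map each recognized voice command (for example "describe" or "ocr", both placeholder names here) to a function that takes the cropped image and returns the text to be spoken through the headset.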

The specific hand gesture and the captured area can be seen in the above image. The backs of both hands face the camera; on each hand only the thumb and index finger are open, and all other fingers are closed.
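The gesture rule above can be written as a simple predicate. The sketch below assumes that some earlier detection step has already produced, for each hand, which fingers are open and whether the back of the hand faces the camera; `HandState`, `is_framing_hand` and `is_framing_gesture` are hypothetical names for this illustration and are not taken from the project's code.

```python
from dataclasses import dataclass
from typing import FrozenSet

FINGERS = ("thumb", "index", "middle", "ring", "pinky")


@dataclass(frozen=True)
class HandState:
    """Assumed output of a hand detector for one hand (hypothetical type)."""
    back_facing_camera: bool
    open_fingers: FrozenSet[str]  # subset of FINGERS


def is_framing_hand(hand: HandState) -> bool:
    # Exactly the thumb and index finger are open; every other finger is closed.
    return hand.back_facing_camera and hand.open_fingers == frozenset({"thumb", "index"})


def is_framing_gesture(left: HandState, right: HandState) -> bool:
    # Both hands must show the same pose for the gesture to count.
    return is_framing_hand(left) and is_framing_hand(right)
```

Requiring an exact match on the set of open fingers means that an open palm or a pointing index finger alone is rejected, which matches the description of the gesture given here.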

The code is modular and each part can be used separately. The project's code is available on GitHub.

Hardware:

1. Raspberry Pi Model B+

2. Raspberry Pi HD Camera:
   a. Max video resolution: 1080p
   b. Size: 36 x 36 mm
   c. Lens holder: M12x0.5 or CS Mount

3. RPi Camera Extension Kit Bitwizard

4. 300 Mbps Wireless N USB Adapter TL-WN821N - an internet connection is needed only for the applications; it is not necessary for the hand recognition part

5. Headset - needed for the applications only

6. Power bank - for RPi mobility


    6 Comments

    wow... that is.... something!!!!!cool man

    sure, how can I help?

    The project also uses the tesseract :-)

    You can see it in the detailed documentation within the github (setup.txt).

    Thank you, glad you like it!