Introduction: DEB-Forge - Viam Powered Smart Desk With Contextual Lighting System

After moving house to a place where I can comfortably have a whole room as a workshop, I realised that I desperately needed to upgrade my desk: from a bunch of cheap cardboard IKEA desks to a large, multi-function "smart" desk to be my forever desk. But what would be smart about it?

Lighting is one of the most important features of a workshop, but it is a tricky one for me to implement. I suffer from moderate light sensitivity, and so I hate the "big light" or "house lights" found in many interiors, and especially workshops, that flood a room with incredibly bright, usually cold, uniform light. As a result my workshop becomes littered with point-source lamps, re-positioned hastily and haphazardly to cover whatever spot I'm soldering or painting at. Lighting, therefore, should be the first "smart" feature added to the planned desk.

What if my desk could detect where I was working and illuminate brightly only the space where I was doing something?

Enter Viam, the robotics platform.

Supplies

  • Raspberry Pi 4 (although any reasonably powerful SBC should work) with 'viam-server' installed and accessible over SSH for configuration. May be referred to as RPi or RPi4 in this document.
  • GoPro Hero 3 with Camlink 4k (my webcam is awful; this has better field of view and automatic exposure)
  • WS281x 5V RGBW LED light strip, aka Neopixels, 1m long with 60 LEDs/m. (A 12V system is better due to less voltage drop, but this is what I had around.) May be referred to as LEDs, LED strip or pixels in this document.
  • 5V 10A PSU (LEDs, especially at 5V, are current hungry: 60-100mA per LED at full brightness! Always spec up!)
  • Wiring supplies for soldering the LED strip
  • A "Desk", currently a large shelf.
  • A Viam robotics platform account. (Viam is a software platform that helps in constructing and deploying robotic systems by breaking them down into separate Components, Services and Processes across Machines that can be configured and controlled remotely.)

Step 1: Setting Up Vision System

If this smart desk is going to detect anything it needs to have some vision! My current webcam setup, an old-model GoPro Hero 3 with a Camlink 4k to connect it to a computer via USB, works very well and better than a cheap webcam, so that was chosen, plugged into the Raspberry Pi 4 and positioned above my desk looking down onto the desk's working area.

After configuring a 'Machine' in Viam to be my desk's controller (the aptly named DEB-Forge, Digital-Exo-Brain Forge) I added a 'Camera' Component to it. I first tried the 'webcam' sub-component, but ran into an error suggesting it did not agree with my Camlink's input video stream. To fix this I used the 'ffmpeg' camera sub-component instead and streamed the video using ffmpeg into a dummy stream set up with v4l2loopback, using the console commands below.

sudo modprobe v4l2loopback devices=1 exclusive_caps=1
ffmpeg -f v4l2 -input_format yuyv422 -framerate 60 -video_size 1920x1080 -i /dev/video0 -pix_fmt yuyv422 -codec copy -f v4l2 /dev/video2

I did want to run this as a Viam 'Process', but I was having issues with that, so I ran it as a cron job at startup instead.

With that I had a video feed of my desk working in the Viam platform.

Step 2: Gathering Data and Training ML Model

I decided that to detect activity on the desk it should probably detect my hands, and if I was going to detect a specific object like human hands then I needed to use machine learning. Viam has an add-on module for YOLOv8, so I tested that, but unfortunately I could find no pre-trained nano model for hand detection, only a large model for hand gesture recognition with an inference time of over 20,000ms on the RPi4! Too heavy and slow for this use case. No problem. Viam has data capture, data labeling AND training systems built into the platform!

After adding a 'data_manager' service to Viam and enabling data capture on the camera component, I was capturing images of the desk every 2 seconds and syncing them to Viam.

I captured a few images of the empty desk along with 160 images of my hands, both one and two hands at the desk, in various positions and angles, to give the detection model the best chance with limited data.

Once I had the images, they needed to be labeled to be able to train an ML model. As the location of the hand was important I needed to train what Viam calls an 'Object Detection Model' rather than just a classification model. To do so, a bounding box with the label 'hand' was drawn over every hand in every image.

Once the dataset was sanitized, segmented and labelled, I began training a model on it using the Viam platform. After just 5-10 minutes the model was trained! Easily one of the most accessible online ML systems I've used!

Step 3: Deploying and Testing Model

Deploying the model onto the Raspberry Pi was as simple as adding two Viam 'Services' to the 'Machine'. First, the model itself: a 'TFLite CPU' service under ML Model, where the model that was just trained is declared. Second, the actual Vision Detection 'Service' that uses the ML model you just declared. Once the configuration is saved, that is all set up on the Raspberry Pi.

To test the model, a 'Detection Camera' component is added with the following JSON configuration, which defines the camera's overlay pipeline. In this case it's looking for detections with a confidence of over 0.5 (out of 1) using the detection service we just set up.

{
  "pipeline": [
    {
      "type": "detections",
      "attributes": {
        "confidence_threshold": 0.5,
        "detector_name": "hand-detector-service"
      }
    }
  ],
  "source": "camera-1"
}

With that done, we can see that the detection camera shows us a bounding box overlaid when it detects some hands!

Step 4: Setting Up Lighting System

Now that we can detect hands reliably, we need a way of illuminating them! Being an addressable LED specialist, I turned to what I use professionally: the trusty WS281x addressable RGB LED tape, known in maker communities as Neopixels!

The idea being that if I placed a strip (preferably multiple rows of strips, or a matrix) of addressable LED tape above a desk, I could have full software control of where the illumination was. So that's what was done. A 1m strip of LED tape was stuck to the bottom of the shelf above my "desk", with 3 wires soldered to the pads of the strip: one for Gnd (ground), one for data and one for positive voltage. These were then plugged into their respective locations as per the diagram above, with a breadboard facilitating the connection of data to the RPi and the grounding to it. (SAFETY NOTE: Don't use a breadboard for high current applications! Only use it for data and for grounding to a common supply as in the pictures above, never for high current +ve and return lines.) (SAFETY NOTE: Do not adhere LED strip tape directly to wood for any permanent installation; the tape can get very hot when at full load continuously. Mount the LED strip to metal channels or other suitable heat-sinks for heat dissipation.)

The first problem was that Raspberry Pis give out 3.3V logic and WS281x LED strips require 5V logic. Usually you use a logic level converter to pull the logic up to 5V, but I was out of the 74AHCT125 level converter chips I usually use, so I implemented the quick and dirty trick of just... not using one (see diagram above). Thankfully, because electronic components are far more fuzzy and imprecise than what is expected in such an exact art, the short LED strip I was using worked OK with this trick, but a level converter should be added soon: with any longer a strip, the signal is going to struggle to propagate to the end of it.

The data signal is connected to pin 12 (GPIO 18) of the RPi4 GPIO header, as this pin supports direct memory access (DMA) and can be used, with some memory management magic, to control an incredible number of LEDs without using all of the RPi's resources. (2500 is my record!)

To power the LEDs a separate PSU needs to be used, as the RPi4 can't supply more to its 5V rail than its own power supply provides, which in this setup is 2A. (SAFETY NOTE: Never actually draw that much from the 5V rail on any RPi though; for one the RPi needs that power itself, and secondly you will likely damage the board.) Most WS281x LEDs draw around 60-100mA (depending on whether they have an extra white LED on the chip, among other factors) at 5V EACH, a hefty amount of current that adds up when you have 60 per metre on a strip. In this setup it was drawing about 4-5A at full all-on brightness, so a 5V 10A rated power supply was used. (SAFETY NOTE: Keep in mind the gauge of your wire when designing these higher current circuits!) Remember to also unify the grounds of the RPi and the LED strip by connecting them together, using the Gnd pin next to GPIO 18 (see picture above).
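(As a rough back-of-envelope check of those numbers: 60 LEDs x 100mA worst case is about 6A, so even the worst case sits comfortably inside a 10A supply with headroom to spare.)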

The second problem is that the Viam way of adding LEDs to a robotics system is to add an LED 'Component' to a 'Machine', as was done with the camera. The problem with doing this is that the existing Neopixel component, kindly contributed to Viam's 'Module' Registry as a wrapper of the Adafruit Neopixel Python library, does not expose the setup options needed to properly configure an addressable LED strip (no way to change the RGB order to GRB or to use RGBW), nor does it have any sane way of controlling the colour and brightness, at least none that I can work out. As I was not yet versed in creating a Module myself, I opted to configure and control the LEDs using the Adafruit Neopixel Python library locally within the main script, and to contribute a fully featured ws281x/neopixel Viam module when I have some more time.

(Image attribution: Adafruit)

Step 5: Linking Vision and Lighting System

Now that we have a vision system and a lighting system, it was time to link them and write the main script logic using the Viam SDK to bring it all together. (The complete code described here is available on GitHub here and is attached as a file.)

First, after SSHing into the RPi, a folder was created inside the /Scripts folder in my home directory, with a main.py file created within it.

mkdir Contextual_Lighting && touch Contextual_Lighting/main.py

The first task was to install the required Python packages, the Viam SDK and the Neopixel libraries, into the folder where our main script lives. (Python venvs DO NOT work with the Viam SDK when running as a Viam 'Process'.)

sudo pip3 install --target=Contextual_Lighting 'viam-sdk[mlmodel]' rpi_ws281x adafruit-circuitpython-neopixel

With our Python libraries installed we can start scripting! Open up main.py in a text editor (VSCode Remote SSH for me!).

First, import the required generic Python libraries.

import asyncio
import time
import math
import board

Then import the specific Python libraries that we just installed.

import neopixel
from viam.robot.client import RobotClient
from viam.rpc.dial import Credentials, DialOptions
from viam.components.camera import Camera
from viam.components.generic import Generic
from viam.services.vision import VisionClient


Then we set up the Neopixel strip to be used later. This is done in a global context, as many of these variables are used throughout the program, although this is still not ideal. Also note that these types of addressable LEDs are almost always colloquially known as 'pixels' and will be referred to as such throughout the code, as other code you encounter will likely also do.

The pin that the LED strip is connected to is set to RPi GPIO pin 18, chosen as explained earlier for that pin's sweet DMA capabilities.

pixel_pin = board.D18

Then the number of LEDs is set to 55, as we have just under a metre across the "desk": 55 LEDs of a 60 LED/m strip.

num_pixels = 55

Neopixel/WS281x-style LED strips come in every possible variation of addressable colour order depending on the type. This strip was GRBW.

ORDER = neopixel.GRBW 

Then we take these variables and use them to instantiate the NeoPixel object.

pixels = neopixel.NeoPixel(
    pixel_pin, num_pixels, brightness=0.5, auto_write=False, pixel_order=ORDER
)
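(Not part of the main script, but at this point it is worth a quick, throwaway sanity check that the wiring and colour order are right. The snippet below is just an illustrative test using the same Adafruit NeoPixel calls; if a "red" fill comes out as a different colour, the ORDER constant is wrong for your strip.)

# Quick throwaway wiring/colour-order test (run separately, not part of main.py)
pixels.fill((255, 0, 0, 0))   # should light the whole strip red (R, G, B, W)
pixels.show()
time.sleep(2)
pixels.fill((0, 0, 0, 0))     # then back to off
pixels.show()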

This variable is set up to make resetting to a default ambient brightness easier. Note that 0-255 are the allowed values for each of the R, G, B and W colour channels of an LED.

ambient_lum = (5,5,5,5) 


Then we must construct some functions that we are going to use in the main loop of the program.

The first one we can get as a snippet from the Connect tab on the Viam platform. It will look somewhat like this and is used by this script to connect to the viam-server.

async def connect():
    opts = RobotClient.Options.with_api_key(
        api_key='YOUR API KEY GOES HERE',
        api_key_id='YOUR API KEY ID GOES HERE'
    )
    return await RobotClient.at_address('deb-forge-main.x5shop4nza.viam.cloud', opts)

Then we need to construct a function that awaits the detection of hands using our Vision service client from Viam and then, once it has one or more detections, filters out the low-confidence ones. The confidence threshold is set quite high for this model because, due to the lack of training data on a busy desk, bits of wire and scissors were sometimes getting detected as hands!

async def detect_hands(camera_name, hand_detector):
    potential_hand_detections = await hand_detector.get_detections_from_camera(camera_name)
    confident_hand_detections = []
    for d in potential_hand_detections:
        if d.confidence > 0.8:
            print("hand detected : ", d)
            confident_hand_detections.append(d)
    return confident_hand_detections

We also need a function that maps what the camera can see to which lights should be turned on to illuminate it. This finds a scale factor between the 720px width of the image and the 55 LEDs that span that picture, then uses the scale factor along with the dimensions of the detection bounding box to construct a beam that will illuminate just the area of the image where the detections are. We are dealing with a single strip of LEDs going across the image, so only the width of the image was considered; in the future, with multiple rows of strips or a matrix, this function will need upgrading. (Note that the x_max and x_min of the beam are the inverse of the picture's x_max and x_min due to the orientation of the camera. I tried fixing this by rotating the camera 180 degrees, but then found hand detection was far less effective because the model was trained with a fixed camera orientation and my hands only appearing from one side! I'm learning the quirks and features of machine learning!)

async def beam_forming(camera_width_px, camera_height_px, detections):
    x_scale_factor = num_pixels / camera_width_px
    for d in range(len(detections)):
        x_max_beam = num_pixels-math.floor(detections[d].x_min * x_scale_factor)
        x_min_beam = num_pixels-math.floor(detections[d].x_max * x_scale_factor)
        print("Illuminating LEDs on X axis: ", x_min_beam, "-", x_max_beam)
        for led in range(x_min_beam, x_max_beam):
            pixels[led] = (255,255,255,255)
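To make the mapping concrete, here is a worked example using made-up detection numbers (assuming the 720px frame width and 55-pixel strip mentioned above); it is only an illustration of the arithmetic, not code to add to the script.

# Worked example with a made-up detection: x_min=200, x_max=400 on a 720px-wide frame
x_scale_factor = num_pixels / 720                            # 55 / 720 ≈ 0.0764
x_max_beam = num_pixels - math.floor(200 * x_scale_factor)   # 55 - 15 = 40
x_min_beam = num_pixels - math.floor(400 * x_scale_factor)   # 55 - 30 = 25
print(list(range(x_min_beam, x_max_beam)))                   # LEDs 25..39 lit, mirrored
                                                             # left-to-right due to the camera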

And finally, a little helper function to reset all the LEDs in the strip to the ambient light level we set globally.

async def refresh_led_pixels():
    for i in range(num_pixels):
        pixels[i] = ambient_lum

Then the core of the script: the main loop.

async def main():

First we call and await the connection to the viam-server.

    machine = await connect()

Then we set up the Camera object using the Component name of the camera in Viam.

    camera_name = "camera-1"
    camera = Camera.from_robot(machine, camera_name)

To get the true dimensions of the stream we are getting from the camera, we take an image and then read off its dimensions.

    calibration_image = await camera.get_image(mime_type="image/jpeg")    
    cam_width_px = calibration_image.width
    cam_height_px = calibration_image.height
    print("Camera stream resolution:", cam_height_px, " x ", cam_width_px)

We instantiate our hand_detector from the Viam Vision Client.

    hand_detector = VisionClient.from_robot(machine, "hand-detector-service")

Then, within the "forever" loop of the program:

    while True:

We continually check for confident detections of hands from our camera.

        hands = await detect_hands("camera-1", hand_detector)

And if some confident detections of hands are made then...

        if hands:

We first refresh all the pixels that may have been set in the previous loop (this needs to be done or pixels get "stuck" on their previous value).

            await refresh_led_pixels()

And we "form the beam" of light using the function we made. Turning up the brightness on the areas where there is a hand detection.

            await beam_forming(cam_width_px, cam_height_px, hands)

Otherwise, we fill all the pixels with the ambient light value we set.

        else:
            pixels.fill(ambient_lum)

Then finally, right at the end, we push all the set pixel values to the LED strip for them to illuminate (or not)!

        pixels.show()
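One last detail: at the very bottom of the file, outside of main(), the script needs the usual asyncio entry point so Python actually starts the loop. The attached script presumably ends with something close to this standard pattern from the Viam SDK examples:

if __name__ == '__main__':
    asyncio.run(main())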

Test the script with...

sudo python3 Contextual_Lighting/main.py

(Note that main.py must be run with 'sudo' for the neopixel libraries to work!)

Ideally this would then also be run as a Viam 'Process', but I was having trouble with that for now.

And... we "hopefully" see that everything works! (See the video below!)


Step 6: Next Steps

Bugs/Refactoring/Refinement

  • Fix the camera stream's reliance on copying the stream to a dummy stream, as this is using 5-15% of the CPU!
  • Wrap the Adafruit Neopixel library as a Viam Module so the lighting can be integrated as a Viam Component instead of being controlled only locally.
  • Retrain ML Model on images with desk in different orientations, lighting setups and states of disarray.
  • Add rugged error/exception handling to the main script.
  • Replace the strip with a 12V LED strip and add a level converter for more reliable lighting control.

Upgrades/Future Development

If some budget can be found for the development of my new desk I will implement this fully into it. Here are some potential upgrades if so.

  • Coral TPU USB Accelerator - The RPi4 has no dedicated GPU and is struggling to run the TFLite(?) model, leading to long inference times that make it very slow to respond to hand position changes. A dedicated GPU or TPU should help speed detection up.
  • Gesture/Hand-shape detection - With the extra ML power I might be able to train the detection of multiple hand-shapes that could control the brightness and colour of the light.
  • Other Object Detection - Could the detection of a particular tool, component or other thing trigger different lighting preferences? Party mode!?
  • LED Matrix Array - To cover a full desk properly there should be multiple rows of LED strips, or even better a proper 2D matrix array of lights, instead of just one strip to cover the entire area of the desk with the contextual lighting.
  • Powerful focused LEDs - Instead of using LED strips, higher power LEDs or clusters of LEDs should be used, along with small Fresnel lenses (or beam shutters or barn doors), to direct the beam, limit light spill and more precisely illuminate parts of the desk.

After Contextual Lighting

  • Height adjustment - My proposed desk design has electric motorized height adjustment using an old IKEA standing desk frame to take it from a standing desk to a sitting one. Could particular objects be detected or sensors used that trigger an adjustment in height?
  • Videography control - I want to make build vlogs and artistic videos. If there are going to be cameras and lights embedded into the desk could I use them to control a camera and required lights? Tracking a subject? Automated lighting for correct exposure?

Beyond the Desk

  • 3D Printer - It would be very useful if I could use Viam to detect a completed or failed 3D print.
  • An illuminating component drawer system akin to Zack Freedman's Litfinity?
  • Automatically detect Air Quality from Soldering/Resin Curing and activate fans?

Step 7: Hand Detecting Contextual Lighting IN ACTION!