Introduction: ESP32-cam Standalone With Robot Arm

The ESP32cam is a very nice processor and camera which should be useful in a wide range of robotics projects. But when you look on the web most ESP32cam projects are very similar: "here's an ESP32cam, let's connect it to a PC over Wi-Fi and use the PC to recognise a face or record video".

The ESP32cam is acting as not much more than a wireless webcam. It is simply a peripheral to a PC. It should be capable of more than that. It ought to be very useful in robots that are not tethered to a PC through Wi-Fi.

I'm interested in assembly-robotics so I started to investigate how useful an ESP32cam would be. Can the robot use vision to put things together?

This project is the first part of that robotics project. I was surprised at how hard it is to get the popular AI-Thinker ESP32cam to do anything standalone. The more I searched the web, the more people I found running up against the same problems - there just aren't enough pins exposed. One commentator suggested that we should think of the ESP32cam as a single-purpose device - it's a Wi-Fi camera that can also save images to an SD card.

If that's true, it's rather disappointing. It's certainly the case that if you ignore any project that uses Wi-Fi, there are not many others left. One can think of lots of projects where you want a small camera module but don't want a PC running all the time.

What I want is an ESP32cam with a small display for debugging and lots of I/O for sensors and effectors. This Instructable isn't really a finished project. It's meant to be a set of tools that I hope you'll find useful in your own robot vision projects. I've provided a simple example of a robot-arm project but it's not meant to be the finished project. What I hope you'll do is work through each of the following Steps and at some point say "I now know enough to do the project that I personally want to do".

Step 1: Installing ESP32cam Into the Arduino IDE

This was my first encounter with the ESP32 and it was harder than I'd hoped.

In the Arduino IDE (ver 1.8 or higher) select the File|Preferences menu item to open the Preferences dialog. Select the Settings tab. Near the bottom is a box labeled “Additional Boards Manager URLs”; paste the following link into it:

https://raw.githubusercontent.com/espressif/arduino-esp32/gh-pages/package_esp32_index.json

(If the box already contains text, add a comma then the link.)

While you're there, turn off compiler warnings - my code produces a lot of warnings which I find excessive and unhelpful. Close the dialog.

Click the Tools|Board menu item (it might currently say e.g. Board: "Arduino Nano"). The top of the submenu says Boards Manager; click it to open the Boards Manager dialog. Scroll down to “esp32” and click the Install button. Wait until it's finished then close the dialog.

Click the Tools|Board menu item again and select the board that you think will match your ESP32cam. I chose "AI-Thinker ESP32cam".

I originally installed ESP32 boards into the Arduino IDE (ver 1.8.9) on a Windows XP machine that I use for hardware development. I wouldn't recommend it - the source files for some of the components seemed to have been put in the wrong directories and it took ages to sort them out. So I changed to Windows 8 and Arduino IDE ver 1.8.16 which went a lot more smoothly. I haven't tried it on my Windows 10 computer.

Step 2: ESP32cam Development Board

I bought a couple of the ESP32cam modules you can see in the photo. It's the most popular and is very cheap. I think it's an "AI-Thinker ESP32cam".

The easiest way to connect an ESP32cam to a desktop computer is to use an "ESP32 CAM Development Board" - search eBay or your favourite supplier. You'll see that they add just £2 to the cost of an ESP32cam. (If you don't have a "Development Board", see the next section.)

The "Development Board" is a pcb that the ESP32cam plugs into. It contains a CH340 USB-to-Serial chip and a couple of pushbuttons. One of the pushbuttons momentarily turns off the 5V supply to the ESP32cam and so acts as a reset button.

Simply clicking "Upload" in the Arduino IDE programs the ESP32cam - there's no messing around with pressing reset buttons or connecting jumpers. That's great.

On the other hand, it took me ages to work out how to get my own Windows programs to reset the "ESP32 CAM Development Board" and put it into "Run" mode. It kept wanting to start up in "Programming" mode. The answer seems to be:

  • set DTR true
  • set RTS true; delay 500 ms
  • set DTR false; delay 500 ms
  • set RTS false

The delays probably don't have to be that long.

As far as I can see,

  • DTR true resets the ESP32 by turning it off
  • DTR false allows the ESP32 to run
  • RTS true sets GPIO0 low (program mode)
  • RTS false sets GPIO0 high (run mode)

but I may have got those wrong.

Write a sketch, download it and make sure it runs. Here's a "Hello world" program:

void setup() {   
  Serial.begin(115200); 
  Serial.println("Hello world"); 
}

void loop() { 
}

Or you could try one of the many Blink programs on the web. The built-in small red LED is GPIO 33; the bright white LED is GPIO 4.

You are now able to program your ESP32cam.

Step 3: Connect a Serial Interface

If you bought an "ESP32 CAM Development Board" skip this section. If you didn't (like me at first) then you can make your own programmer.

If you have an FTDI USB-to-serial convertor, connect it as shown above. (The pin order may not match your FTDI module - check the pin names.)

I didn't have an FTDI convertor but I have several cheap Prolific ones. I wasted a couple of hours failing to find a Prolific driver that would run under Windows 8.

I was getting rather frustrated by this time. The Arduino system, AVRdude and gcc are real geekware and I loathe Windows 8. I couldn't be bothered to wait for delivery of an FTDI convertor so I bodged an Arduino Nano. I programmed it with an empty sketch that set RX and TX to input. Then I soldered a wire onto the CH340 USB-to-serial chip (I can give the details if anyone is getting as hacked-off as I was). It's a bit of a Frankenstein's monster but works fine. There are CH340 drivers for all versions of Windows from XP onwards. I think the CH340 is a Good Chip and the Prolific is a Bad Chip.

As you can see, I used a solderless breadboard. That's probably not ideal as it makes it a little hard to press the reset button (it's underneath). I used the corner of an old credit card.

When you want to program the chip, fit the "Jumper" wire shown above then press reset. Click the upload button in the IDE. Some people say you have to press reset when the IDE debugging window says "...." but I didn't need to - maybe it depends on the version of the bootloader.

To run your program, remove the "Jumper" wire shown above then press reset again.

There's a "Hello world" program in the previous Step.

You are now able to program your ESP32cam.

Step 4: Taking a Picture

The next stage is to get the camera to take a photo and send it over USB to a PC.

The camera is initialised in the setup() function. A camera_config_t struct is initialised and passed to the esp_camera_init() function. Most of the fields of the camera_config_t struct are concerned with defining what pins are connected to the camera - you can look at the source code for the details. The interesting fields are:

  • pixel_format: the format of each pixel
  • frame_size: the width and height of the image

The frame_size can take one of the values

  • FRAMESIZE_QQVGA: 160x120
  • FRAMESIZE_QVGA: 320x240
  • FRAMESIZE_CIF: 400x296
  • FRAMESIZE_VGA: 640x480
  • FRAMESIZE_SVGA: 800x600
  • FRAMESIZE_XGA: 1024x768
  • FRAMESIZE_SXGA: 1280x1024
  • FRAMESIZE_UXGA: 1600x1200

Other frame sizes are defined in the sensor.h file but many don't work. (I think the ones I've listed above all work).

The pixel_format can take one of the values

  • PIXFORMAT_RGB565 16 bits per pixel
  • PIXFORMAT_GRAYSCALE 8 bits per pixel
  • PIXFORMAT_JPEG

Other formats are defined but not all work with the "AI-Thinker ESP32cam" that I seem to have.

You must choose the pixel format before calling esp_camera_init(). As far as I can see, you can't subsequently choose another format. I guess calling esp_camera_init() allocates memory for something and there's no way to de-allocate it.

We want to process the image ourselves in the ESP32 so the JPEG format is no use to us. The ESP32 has 520KB of memory of which about 320KB is available to your program. Each colour pixel is two bytes so any image with more than about 160k pixels won't fit. I stuck with 320x240.

The PIXFORMAT_RGB565 format uses 16 bits per pixel:

  • 5 bits of red
  • 6 bits of green
  • 5 bits of blue

Having set up the camera_config_t struct and called the esp_camera_init() function, we can take a photo with:

  camera_fb_t * fb = NULL;
  fb = esp_camera_fb_get();

which copies an image into a camera_fb_t struct. A camera_fb_t struct has the fields:

    uint8_t * buf;                   Pointer to the pixel data 
    size_t len;                      Length of the buffer in bytes 
    size_t width;                    Width of the buffer in pixels 
    size_t height;                   Height of the buffer in pixels 
    pixformat_t format;              Format of the pixel data 
    struct timeval timestamp;        Timestamp 

The esp_camera_fb_get() function allocates memory for the *buf field (using malloc I suppose). You must remember to free the memory when you've used the buffer by calling esp_camera_fb_return().

You can test the camera with my TakePhoto.ino sketch. It takes a photo every few seconds then sends it to the PC over the serial line. Of course, using the serial line is very, very slow but it demonstrates that the system is working. (If you really wanted to see the photo on a PC then you'd use Wi-Fi.)

The format of the transmitted data is:

  • Image (width) (height) (len) (pixformat):(len bytes of pixel data)

Where

  • (width) is the width of the image
  • (height) is the height of the image
  • (len) is the length in bytes of the pixel data
  • (pixformat) is 0 for RGB565 or 2 for GRAYSCALE

The colon is followed by the pixel data as raw bytes.

The ShowPhoto.exe Windows program receives the image and displays it.

I've sometimes had trouble with the camera not initialising properly. The problem might be that the camera is reset via its CAM_RST line, which is connected to a capacitor that pulls the line low for a millisecond when the board is powered-down then powered-up. Maybe the board isn't powered-down properly or not for long enough. If the camera doesn't initialise properly, try unplugging the USB cable and plugging it back in again. That always works for me.

If you include this code before the call to initialise_camera(), it will turn the camera off and on, which seems to cure the problem:

  pinMode(PWDN_GPIO_NUM, OUTPUT);     // camera power-down pin
  digitalWrite(PWDN_GPIO_NUM, LOW);   // LOW = camera powered
  delay(10);
  digitalWrite(PWDN_GPIO_NUM, HIGH);  // HIGH = camera powered down
  delay(10);

You are now able to program your ESP32cam, take a picture and send it over the serial line.

You will find that the first photo that the camera takes is too bright or too dark. The camera takes one or two frames to adjust its automatic brightness control properly.

Step 5: Connecting to a Display

It's useful when debugging to be able to see what the ESP32 camera is seeing so it's worth adding a display even if it's only temporary.

I used a 320x240 pixel ILI9341 display - search eBay or your favourite supplier for "ILI9341 SPI display 240 320". You want one with an SPI interface. They come in different physical sizes but my SimpleILI9341 software should be able to drive them all. Look at the number of pins and how they're labelled in the supplier's photo. There should be around 14 pins some of which have names like MOSI, MISO, SCK.

(If it has lots of pins labelled LCD_00, LCD_01 ... LCD_07 then it has a parallel interface and won't work with the ESP32cam. An SPI display may have 4 extra pins at the other end of the board for an SD card holder - ignore those.)

I won't be using the touch-screen for this project so you can buy a display "Without Touch", which will be cheaper.

Connect it as shown above.

The "ESP32 CAM Development Board" I was using doesn't make it easy to connect to the GPIO pins of the ESP32 so I soldered two strips of header-pins to the underside of the "Development Board" then plugged it into a solderless breadboard.

I spent a day trying to work out why the ESP32cam couldn't talk to the display. It's really weird. It turns out that I was using the pin labelled "GND" at the bottom-right of the board but that pin is not connected to the ground of the PCB. The PCB legend and the published schematic say that it is but it isn't. (I wasted an afternoon finding that out! A careful search of the web suggests a couple of other people have noticed that pin is odd. One suggestion is that it's actually the reset pin and has been mis-labelled.)

The "ESP32 CAM Development Board" resets the ESP32cam by turning it off and on. It has a FET which connects the "0V" of the USB connector to the "GND" pins of the ESP32cam. So you have to be careful how you ground the ESP32cam board when using, say, an oscilloscope or an external power supply. You may have to remove any external equipment when programming the board.

Download my SimpleILI9341 software and test program (link below). Copy everything into a directory called TestILI9341sw. Compile them and upload them to the ESP32cam. (Turn off warnings. All my code produces a lot of unhelpful warnings which I can't be bothered to fix.)

SimpleILI9341 uses software SPI. The ESP32cam has hardware SPI on the chip but the web tells me that people have trouble using it - the SPI is already being used for the camera or the SD card. Other people have got it to work so you could look at their projects.

I didn't even attempt to use hardware SPI. I want to use the SPI pins to also talk to an Arduino so software SPI is easier to control. It takes 200 ms to send an image to the screen which is fast enough for the debugging I need.

You are now able to use your ESP32cam and control a TFT display.

Step 6: Send a Photo to the Display

Download my DisplayPhoto.ino sketch and its associated *.h and *.cpp files. Copy them all into a directory called DisplayPhoto. Compile them and upload them to the ESP32cam.
I've put the functions to initialise the camera and to take a photo into the esp32cam.h file.

Before you #include the esp32cam.h file, you must define one of

  • #define COLOR_PHOTO
  • #define MONO_PHOTO

The frame format is always FRAMESIZE_QVGA which is 320x240 pixels.

The pixel format is either

  • PIXFORMAT_RGB565 16-bits colour
  • PIXFORMAT_GRAYSCALE 8-bits monochrome

The take_photo() function returns a frame buffer struct. You can read but not alter the contents of the struct. Subsequent calls to take_photo() will de-allocate the memory - you don't have to worry about that.

The main loop of the DisplayPhoto.ino sketch repeatedly takes a photo and calls ShowPhoto() to show it on the display. ShowPhoto() does the necessary bit shuffling to convert the data from the camera into data for the display.

The camera RGB565 format is

  • 5 bits red; 6 bits green; 5 bits blue
  • (most significant byte first )

The camera monochrome format is

  • 8-bits grayscale

The display format is

  • 5 bits red; 6 bits green; 5 bits blue
  • least significant byte first

DisplayPhoto.ino is not fast but it's good for debugging.

SimpleILI9341sw.h contains the function DrawBitmapArray() which draws a coloured bitmap from the camera; in other words it's in the PIXFORMAT_RGB565 format. DrawBitmapArrayByte() draws a monochrome bitmap; i.e. it's in the PIXFORMAT_GRAYSCALE format.

You are now able to use your ESP32cam to take a photo and send it to the TFT display.

Step 7: Connect to I2C (Failed)

I want to use the ESP32cam to control a robot arm and connect to sensors - for instance to know whether it is holding anything. The ESP32cam has very few pins available. What we need are expansion boards that will give us more I/O. The PCA9685 PWM servo driver chip and the PCF8574 digital I/O chip look ideal.

The modules require 2 pins to drive the I2C interface.

At first sight, it seems as though you can use the built-in I2C hardware of the ESP32. It should be possible to assign the I2C function to any pair of pins.

Unfortunately, it's not so easy. The ESP32cam has 8 GPIO pins exposed:

  • GPIO 12, 13, 14, 15 are used for the SD card or, in our case the SPI display
  • GPIO 0 is dedicated to the CSI_MCLK clock of the camera and cannot be used
  • GPIO 4 controls the bright "flash" LED
  • GPIO 16 is used by the PSRAM
  • GPIO 2 appears to be available

I spent a couple of days trying to get I2C working. GPIO 0, 2 and 4 simply don't seem to work with I2C. Then I searched the web and found many other people having the same sort of trouble. It seemed that no-one could get the ESP32cam's I2C hardware assigned to useful pins.

How about bit-banging software I2C?

There aren't enough extra usable pins available once we've got SPI working. But we could use some of the SPI pins to also do I2C.

The pins I'm using for software SPI are

  • GPIO 12: CS low=select SPI
  • GPIO 13: DC high=data; low=command
  • GPIO 15: MOSI
  • GPIO 14: SCK clock

Could I use some of those same pins to do I2C? Could I use the CS=high to mean "talk to the I2C chips" and CS=low means "talk to the SPI chips"? Unfortunately the PCA9685 chip does not have a chip-select pin.

When I2C is transmitting data, the SDA line should only change when the SCL line is low. If the SCL line is high then

  • SDA high to low means "start"
  • SDA low to high means "stop" (for some chips)

Is it possible to get two of the SPI pins to not produce that sort of pattern during normal SPI operation? It's tricky. It's not clear from the data sheets what an I2C chip would do if it saw odd things happening on the SDA or SCL line. It would probably think it was getting more data as part of the previous transaction. I guess I could use some random logic to get the CS pin to multiplex the GPIO pins to I2C or SPI. Then I'll need extra chips or modules for controlling servos, digital I/O, analogue I/O, etc.

This is all getting rather silly. There must be an easier way. How about getting an Arduino to do all the IO?

Step 8: Getting an Arduino to Do the IO

The ESP32cam takes photos and does all the complicated algorithms. An Arduino Nano does basic I/O and maybe simple things like controlling servo trajectories or guarded moves.

How should the ESP32 talk to the Nano? It should do so as quickly as possible but not so fast that the Nano can't keep up. SPI, I2C and RS232 all use a fixed clock rate and that clock has to run slowly enough that the Nano can always respond. Is there a way that the communications can use handshaking to regulate the bit-rate? Yes. GPIB (General Purpose Interface Bus or IEEE-488) uses a two-phase clock with lines called DAV and NDAC. GPIB operates at the speed of the slowest device on the bus. It's a good starting point for inventing our own comms protocol.

GPIB is more complicated than we need here because it can have multiple devices on a single bus (by using open-collector and pullup resistors like I2C); we only need two devices, a master and a slave. No doubt we could have more devices by using pullup resistors but that's for the future. GPIB uses 8 bidirectional data lines in parallel. We'll use 2 unidirectional data lines because we're so short of pins. So let's call ours the Two-bit-Interface-Bus: "TBIB".

The 4 pins are

  • MCLK: master clock (output by ESP32)
  • SCLK: slave clock (output by Nano)
  • MOSIa: master-out-slave-in data (output by ESP32)
  • MISOa: master-in-slave-out data (output by Nano)

Both Master (the ESP32) and the Slave (the Nano) execute the protocol in a polled loop. There are no interrupts and the timing is entirely variable. If one or other of the processors needs to do some lengthy task, the other processor will wait for it (and can itself spend the time doing a task of its own).

As you'll see from the timing diagram above, the Slave copies the MCLK clock output by the Master. When the Slave sees a change, it copies the current value of MCLK to the SCLK clock.

The Master inverts the SCLK clock output by the Slave. When it sees a change, it copies the new inverted value to the MCLK clock.

The Master sends data on the MOSIa line while listening for incoming data on the MISOa line. The Slave sends data on the MISOa line while listening for incoming data on the MOSIa line. The master only changes the MOSIa data when the MCLK is low. The slave only reads the MOSIa data when the MCLK is high. Similarly, the slave only changes the MISOa data when the SCLK is low. The master only reads the MISOa data when the SCLK is high. The MOSIa and MISOa lines are not the same pins as MOSI and MISO of SPI - I've given them similar names because they have similar functions.

The MOSIa and MISOa lines are kept low when at rest. If either processor wants to send data to the other, it sets its data line high for one clock cycle then sends the message data one bit at a time (LSB first). In the above diagram, I've shown an 8-bit message being sent but the message could be of any length.

Usually, the Master will send a command or request to the Slave and the Slave will execute the command (e.g. move a servo) or return some data (e.g. a sensor reading). But you can imagine a scenario where the Slave senses something unusual (e.g. a collision) and immediately sends a message to the Master. So the Master continually toggles MCLK and watches MISOa even when it has no commands to send.

The Nano runs at 5V but the ESP32cam runs at 3.3V. Are the ESP32 inputs 5V tolerant?

There is some debate about it but the answer seems to be Yes. The chip itself runs at 3.3V but its GPIO pins can be connected to 5V when configured as inputs. The inputs are protected via “snap back” circuits - whatever they may be - rather than the usual two diodes. It looks like there's a diode to 0V but not a diode to 3V3 so feeding a small current into the pin won't raise Vcc. But that's all supposition on my part. I've protected the inputs with 2k2 resistors which will limit any current. It works fine for me.

Can the ESP32cam also talk to a display? Yes. We're really short of pins on the ESP32cam so SPI (to the display) and "TBIB" (to the Nano) take turns. When it's talking to the display, the ESP32cam sets CS low and GPIO12-15 to outputs. The 2k2 resistors also mean that the ESP32cam overwhelms the signals coming from the Nano. ESP32cam should only use SPI when both TBIB CLKs are low. When it's talking to the Nano, the ESP32cam sets CS high and GPIO13 to input.

The attached ESP32_to_Nano.ino sketch acts as a master for TBIB. Copy all the files to a folder called "ESP32_to_Nano" and upload the code to the ESP32cam. (The code for the Nano is in the next Step).

Step 9: Nano TBIB Code

The attached Nano_to_ESP32.ino sketch acts as a slave for TBIB. Copy all the files to a folder called "Nano_to_ESP32" and upload the code to the Nano. (Turn off warnings. My code produces a lot of unhelpful warnings which I can't be bothered to fix.)

I used two PCs - one to program and monitor the ESP32cam and the other to program and monitor the Nano. You could run two copies of the Arduino IDE on the same computer. With two PCs, I found it easier to keep in my mind who was doing what.

Open a serial monitor with the Tools|Serial Monitor menu command. Both the master and slave send characters to each other at around 1Hz.

As both the ESP32cam and the Nano are powered by USB do not connect their 5V lines together.

You could power the ESP32cam from USB and connect its 5V line to the Nano's 5V in order to power the Nano.

Alternatively, you might be able to power the Nano from USB and connect its 5V line to the ESP32cam's 5V in order to power the ESP32cam. But be careful doing that. I haven't measured it myself but trawling the web gives a lot of very different results. The ESP32cam's power consumption is maybe:

  • Active (everything on): 240mA
  • Wi-Fi/BT transmit: spikes at 790mA
  • "Modem Sleep" (Wi-Fi off): 45mA

Then you'll have the display:

  • ILI9341 display: 150mA

So it would be easy to exceed the maximum current through the Nano's protection diode which is rated at 500mA.

Run the sketch from the previous Step on the ESP32cam and the sketch from this Step on the Nano. Use the Arduino IDE Tools|SerialMonitor menu command to display the serial messages from both processors. You should see them passing bytes back and forth.

Step 10: Arduino Controls Servos

As an illustration of a Nano acting as a peripheral controller for an ESP32cam, here it is controlling a cheap robot arm.

We'll use a PCA9685 to drive the servos. You can get PCA9685 modules from eBay (or your favourite supplier). Adafruit provide free driver libraries so it would be polite to buy from them.

  • the PCA9685ServoTester.exe program runs on Windows PC 1
  • the TBIBSlaveServo.ino sketch runs on the Nano
  • the ESP32_test_robot.ino sketch runs on the ESP32cam

PC 1 runs a copy of the Arduino IDE to program the ESP32cam and also to send it commands. PC 2 runs a copy of the Arduino IDE to program the Nano and also to send it commands. (Of course, you could just use one PC to control both microprocessors.)

The ESP32cam receives the following commands from PC1 over the serial line:

  • 'p': take a photo and send it to the PC
  • 'q': take a photo and display it on the display
  • any other byte: send it to the Nano over the TBIB

The Nano receives the following commands from PC2 over the serial line:

  • 'g': report the current PWM output values
  • '0'-'7' followed by byte b: set a servo to position b
  • any other byte: send it to the ESP32cam over the TBIB

The Nano receives the following commands from the ESP32cam over the TBIB:

  • '0'-'7' followed by byte b: set a servo to position b
  • any other byte is ignored

"Servo to position b" means b is in the range 0..255 which corresponds with the PWM output in the range USMIN..USMAX.

The PCA9685ServoTester.exe program allows you easily to send commands to the ESP32cam and hence to the Nano. It can learn and replay servo positions. If you start the ESP32cam when it is connected to the PCA9685ServoTester.exe program, it boots up in the "waiting for a download" mode. The menu command Comms|ResetESP32cam toggles the RTS and DTR lines to force the ESP32cam to boot up in the "Run" mode.

The servos require far more current than the ESP32cam or the Nano can supply. The PCA9685 has a connector for an external PSU for the servos. Ideally, you'd use a 6V supply but 5V "wall wart" PSUs are cheaply available in almost every charity shop or car-boot sale. Assume you'll need at least 1A.

It's a common problem with servo-robots that they start up with a big jerk which might damage the servos. The difficulty is that you don't know where the servos are so the first command you send might move them a big distance and they'll move at their maximum speed. At startup, you could ask the PCA9685 what pulse length it's currently sending but that's no use if you're starting from power-up - at power-up it sends no pulses at all. It would be nice to be able to start the servos slowly: a "Soft Start".

Two solutions spring to mind: slowly ramp up the voltage of the servos or slowly ramp up the PWM frequency. Ramping up the voltage doesn't work: the servos just go crazy if the voltage is too low. Normal analogue servos produce one motor pulse for every command pulse so if you send no command pulses, the motors stop moving and "relax". If you send five command pulses per second, the motors will only receive five H-bridge pulses and will move slowly. (That description doesn't apply to some more expensive "digital" servos that remember the length of the last command pulse.)

The problem is that the slowest possible frequency of the PCA9685 is around 24 pulses per second and that's almost as powerful as 50 pulses per second. It may be possible for the Nano to read one of the control pulses and just allow occasional pulses through. I might try that at a later date.

The TBIBSlaveServo.ino sketch (in the Nano) executes a rather poor version of "Soft Start" by moving each servo in turn. It turns all the servos off then moves them one at a time to the positions in the InitialPositions[] array. You should arrange that the InitialPositions are in a safe position and maybe start by moving the "upper arm" (shoulder) vertically then the "lower arm" (elbow) to a mid position then the base rotation and finally the gripper.

After having initialised the servo positions, the TBIBSlaveServo.ino sketch subsequently moves the servos slowly. You can specify a maximum velocity of each servo in the MaximumVelocity variables. The maximum velocity for, say, the gripper can be much higher than for the base-axis. When the ESP32cam sends a "move servo" command to the Nano, the "target position" of the servo is set and the servo moves to that position at no faster than its maximum velocity. If several servos are not at their target positions then the Nano adjusts their velocities so that they all arrive at their targets at the same time.

When all the servos have reached their targets, the Nano tells the ESP32cam by sending over the TBIB:

  • 't': the servos have reached their targets

The cheap arm used in this Step has a very poor design of the base axis (about which the arm turns). The whole weight of the arm rotates on the axle of the base servo. It's a wonder it works at all. I improved the design by gluing a "quadrant" of 5mm acrylic sheet under the front of the base joint to support the arm's weight.

The arm sometimes starts oscillating. The servo tries to move to a position but the arm has so much momentum that it overshoots. The servo then tries to correct it and undershoots. And so on. The servo contains a feedback loop - you can think of it as a PID controller. A PID controller must be tuned to the characteristics of whatever hardware it's controlling. The servo manufacturer has tuned the PID to what they think is a "typical" scenario - often the control surfaces of a plane. A robot arm has far more momentum than the servo was designed for. This is a problem that occurs with all model servos but it's particularly bad when lightweight servos drive a heavyweight arm.

The simple solution is to add damping. I added washers made from thin sheets of packing foam between the joint surfaces. At the base, I put some grease-proof paper (cooking parchment) between my "quadrant" and the arm support.

Copy the following files to a folder named ESP32_test_robot and upload them to the ESP32cam

  • ESP32_test_robot.ino
  • SimpleILI9341sw.cpp
  • camera_pins.h
  • SimpleILI9341sw.h
  • TBIB_Master.h

Copy the following files to a folder named Nano_test_robot and upload them to the Nano

  • Nano_test_robot.ino
  • Adafruit_PWMServoDriver.h
  • TBIB_Slave.h Wire.h
  • Adafruit_PWMServoDriver.cpp

(The Adafruit files are copyright Adafruit. camera_pins.h is copyright espressif. I have included them here to ensure that you get the same version that I used. You can download the latest versions from the respective websites.)

Step 11: Colour Sorter

The most popular introductory program for a robot arm and camera is to sort blocks according to their colour. To make it easy, the blocks always start in the same location and the Red/Yellow/Blue destination bins are always in the same location. So the program is:

  • Park the arm out of the way
  • Wait until the camera sees a Red/Yellow/Blue block in the starting location
  • Blindly move the arm to the pick-up position
  • Close the gripper
  • Blindly move the arm to a destination bin
  • Open the gripper

There are lots of examples of colour sorters on the web - it's a project students are given in an introductory robotics course. Nevertheless it's a good project to start with to check that your basic systems are working.

I've extended the TBIB protocol so that only whole messages are accepted. A message starts with 0xA5 and ends with a checksum, so random noise is almost certain to be ignored.

The algorithm to tell whether there's a coloured block in the Region Of Interest (ROI) is to repeatedly:

  • take a photo
  • calculate the major colour in the ROI
  • if it's red, yellow or blue in 5 consecutive frames then move the brick to its bin
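The "5 consecutive frames" rule can be sketched as a small debounce helper, so that a single noisy frame can't trigger the arm. The names here are hypothetical; the real sketch does this inline:

```cpp
#include <cassert>

// Only act when the same non-background colour has been seen in
// 5 consecutive frames.
enum Colour { NONE, RED, YELLOW, BLUE };

struct ColourDebounce {
    Colour last = NONE;
    int count = 0;
    // Call once per frame; returns true once the colour has been
    // stable for 5 frames in a row.
    bool update(Colour c) {
        if (c == last && c != NONE) ++count;
        else { last = c; count = (c == NONE) ? 0 : 1; }
        return count >= 5;
    }
};
```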

Each photo is displayed on the ILI9341 display. Displaying it takes around 200 ms, so that's 5 frames per second.

The ROI is defined as the 32x32 square in the centre of the photo. The major colour is calculated by:

  • measure the "colour" of each pixel in the ROI
  • vote on which is the most frequent colour
  • the majority vote is the "major colour"

The total brightness in the ROI is measured and if it has changed significantly then that photo is discarded.

The "colour" of a pixel is

  • convert RGB to HSL
  • if L (lightness) is too low the pixel is "black"
  • if L is too high the pixel is "white"
  • if S (saturation) is too low the pixel is "gray"
  • otherwise H (hue) is divided into 6 equal parts: red, yellow, green, cyan, blue, magenta
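As a sketch, the classification above might look like this in C++, with H, S and L each scaled to 0..255. The exact threshold values are assumptions - tune them to your camera and lighting:

```cpp
#include <cassert>
#include <string>

// Classify one pixel from its HSL values (each scaled to 0..255).
// Threshold values are illustrative assumptions.
const char* classifyPixel(int h, int s, int l) {
    if (l < 40)  return "black";   // too dark to judge the hue
    if (l > 215) return "white";   // too bright
    if (s < 50)  return "gray";    // too washed-out
    // Divide the hue circle into 6 equal sectors, offset by half a
    // sector so "red" straddles the 0/255 wrap-around.
    static const char* names[6] =
        {"red", "yellow", "green", "cyan", "blue", "magenta"};
    return names[((h + 21) / 43) % 6];
}
```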

"Black", "white" and "gray" tend to be background colours so count for less. Yellow seemed to be confused with red. So to fudge the algorithm to be more reliable, not all votes count the same:

  • "black", "white" and "gray" count only 1/6
  • "yellow" counts double

I don't really recommend you copy my code. If you do then it will only work if your mechanical setup is exactly the same as mine. You will need to have built your arm with the servos centred in the same way as mine - which is unlikely. The various robot positions are "hard coded" into the sketch as constants. I have provided the code just as a starting point for you to write your own program.

In case you want to modify the code, the following functions will be of interest

  • the DetectColor() function measures the colour in the ROI
  • the ReportColor() function calls DetectColor(), reports to the PC and, if DetectMode == 2 then calls SortBlock()
  • the recvByte() function receives and processes commands from the PC
  • the recvTBIBmsg() function receives and processes messages from the Nano
  • the SendLearnedPos() function sends a single set of learned servo positions to the Nano which moves the servos
  • the SendLearnedSeq() function sends a sequence of learned servo positions to the Nano
  • given the colour of a block, the SortBlock() function tells the Nano which bin to put it in

The following variables will be of interest

  • the LearnedPosition[] array stores learned servo positions
  • the LearnedSequence[] array stores sequences of learned servo positions
  • the bAtTargets bool is set when the servos have reached their targets

You will probably want to alter the values in LearnedPosition[] and LearnedSequence[] to suit your own robot. I hope that the other functions and variables will be clear when you examine how they're used in the code.

The SortColors.exe program helped me initialise the LearnedPosition[] and LearnedSequence[] arrays. You don't need it in order to run the sketches but you might find it helpful too.

Copy the following files to a folder named ESP32_Sort_Colors and upload them to the ESP32cam

  • ESP32_Sort_Colors.ino
  • SimpleILI9341sw.cpp
  • camera_pins.h
  • SimpleILI9341sw.h
  • TBIB_Master.h

Copy the following files to a folder named Nano_Sort_Colors and upload them to the Nano

  • Nano_Sort_Colors.ino
  • Adafruit_PWMServoDriver.h
  • TBIB_Slave.h
  • Wire.h
  • Adafruit_PWMServoDriver.cpp

Step 12: The Future

This Instructable isn't really a finished project. It's meant to be a set of tools that I hope you'll find helpful in your own projects.

The next stage would be to add sensors to the robot. The most useful one is a "break the beam" sensor to tell it there's something in the gripper. Fix an IR LED on one side of the gripper and an IR phototransistor on the other. Measure the light level with the LED off (using a Nano ADC) then measure it again with the LED on. If the light level has increased significantly, there's nothing in the gripper.
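The decision logic is simple enough to sketch; the function name and threshold below are assumptions you'd calibrate against your own ADC readings:

```cpp
#include <cassert>

// Two ADC readings: ambient light with the IR LED off, then with it
// on. If switching the LED on raises the reading well above ambient,
// the beam reached the phototransistor, so the gripper is empty.
// The threshold is an assumption - calibrate it on your hardware.
bool gripperEmpty(int levelLedOff, int levelLedOn, int threshold = 50) {
    return (levelLedOn - levelLedOff) > threshold;
}
```

Taking the difference of the two readings cancels out ambient IR (sunlight, room lights) that would fool a single absolute threshold.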

You could extend my ESP32cam code to be able to see coloured bricks anywhere in front of the robot. Then you'd need to move the gripper to that location. How would you do that?

You could use OpenCV and neural net libraries to recognise different objects and sort them. For that sort of work, most people use a RaspberryPi or do the processing on a PC but an ESP32cam should be capable of running simple neural nets.

A cut-down version of OpenCV will run on the ESP32, or you could try ESP-WHO as an alternative. Espressif say it can do face detection and recognition, so it should be possible to train it to recognise other objects.

Using vision and a robot arm in industrial assembly is rare. I haven't found any commercial examples. Using vision for pick-and-place is the best manufacturers seem to manage.

Robot arms are popular - they seem like cool devices. Lots of hobbyists buy them but then don't do much with them. If I google for "robot arm project" I get ten pages of hits but almost all are about buying or building the arm, not what to do with it. It's a difficult field to work in. There are colour sorters - which recognise vague blobs - chess players - which don't recognise the pieces - and waldos with no autonomy (so they don't count as robots in my mind), and that's about it. If you are looking for challenging problems then I think assembly robotics is the place to start. Or perhaps you're more interested in self-driving outdoor mobile robots. You wouldn't want to be tethered to Wi-Fi for that.

I can't recommend the acrylic "4 DOF" arm I've used here. It is really rather poor. It's floppy and jerky. The servos aren't strong enough or accurate enough. I suppose it's cheap (approx £8 without servos, £18 with). Three DOF plus gripper isn't enough to do anything useful. You'd do better to buy a metal 6 DOF arm with standard servos (I paid £51 with servos). That's my next project and, I hope, my next instructable.