Introduction: Eye Tracking Animatronic Face
This is the final project for my UIC ME411 Mechatronics course: an interactive animatronic face! This build achieves realistic motion with full-range eye tracking, automatic blinking, and a moving mouth. It's designed to be responsive, beginning its simulated speech only after recognizing a thumbs-up hand gesture.
Supplies
General Electronics:
- Webcam for computer vision - Can use any webcam or integrated camera
- Laptop/Computer to run the Arduino IDE and Python
- Power Supply (I managed without one, but the microcontroller would brown out. As a workaround, I had the software reset the serial port connection whenever data transfer stopped; this is not the smart way to do it. A minimal sketch of that reconnect logic follows this supplies list.)
Eye Mechanism/Mouth:
- Servo Motors x7 - a less expensive variety works fine
- Breadboard
- M-M Jumper Wires
- ELEGOO UNO R3 Board + Cable
- Steel Wire
- Various building materials: Hot Glue Gun, Popsicle Sticks, Balsa Wood, Cardboard
- Access to 3D printer (see Step 1)
Costume Items:
Project code provided at the end
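
Regarding the power-supply workaround mentioned above, here is a minimal sketch of the reconnect-on-stall idea, assuming the pyserial library; the port name, baud rate, and retry timing are placeholders, not values from the attached code.

```python
import time
import serial

PORT, BAUD = "COM3", 9600   # placeholders; match your board's port and sketch baud rate

def open_port():
    """Open the serial port, retrying until the board enumerates again after a brown-out."""
    while True:
        try:
            return serial.Serial(PORT, BAUD, timeout=1)
        except serial.SerialException:
            time.sleep(1)   # board still resetting; try again shortly

ser = open_port()

def send(command: str):
    """Send one command; if the write fails (no data transfer), reopen the port and retry once."""
    global ser
    try:
        ser.write(command.encode())
    except serial.SerialException:
        ser.close()
        ser = open_port()
        ser.write(command.encode())
```

A dedicated external supply for the servos is still the better fix, since repeated brown-outs reset the board mid-motion.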
Step 1: Building the Eyes Mechanism
Follow the fantastic tutorial by Instructables creator MorganManly.
Step 2: Computer Vision
This system integrates OpenCV (Python) for computer vision processing with microcontroller control over serial communication. A spatial mapping step translates the target's pixel coordinates (0-640 range in X and Y) captured by the camera into servo commands, keeping the eyes within their physical limits of motion (left, right, up, and down).
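
As an illustration of that mapping, here is a minimal sketch assuming simple linear interpolation; the servo angle limits (50-130 and 60-120 degrees) are placeholder calibration values, not the ones used in the attached code.

```python
# Sketch of the pixel-to-servo mapping (illustrative values; calibrate for your build).
def map_range(value, in_min, in_max, out_min, out_max):
    """Linearly map a value from one range to another, clamped to the input range."""
    value = max(in_min, min(in_max, value))
    return out_min + (value - in_min) * (out_max - out_min) / (in_max - in_min)

# Example: a face detected at pixel (480, 200) in a 640x640 region of interest.
# The servo limits below are placeholders; find yours by jogging the servos by hand.
x_angle = map_range(480, 0, 640, 50, 130)   # left/right servo angle (degrees)
y_angle = map_range(200, 0, 640, 60, 120)   # up/down servo angle (degrees)
print(f"X:{int(x_angle)},Y:{int(y_angle)}") # e.g. sent over serial as "X:110,Y:78"
```

On the microcontroller side, the Arduino sketch parses these angle values and writes them to the corresponding eye servos.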
Step 3: Building the Frame
A robust structural frame was fabricated using accessible materials and secured with hot glue. This framework was designed to precisely mount and stabilize the critical components, including the eye mechanism, camera module, and decorative costume elements. The frame was then painted white to look fine as wine.
Step 4: Include Facial Identifiers
The final hardware modifications were implemented to achieve a complete facial structure, adding a dynamic mouth mechanism, a nose, and ears. These components used 3D-printable models graciously provided by the open-source community:
- Nose Model: 3DTux (Printables)
- Ear Model: Peter Farell (Printables)
Step 5: Integrate Gesture Controls
The final development phase focused on integrating gesture control into the computer vision system. This was achieved by leveraging Google's open-source MediaPipe library to analyze hand position and orientation in real time. Gesture detection was then synced to the mouth motion, triggering audio playback to simulate speech.
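
As a sketch of the gesture check, assuming the MediaPipe Hands solution: the "thumb tip above the other fingertips" rule below is an illustrative heuristic, not necessarily the exact logic in the attached script.

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def is_thumbs_up(landmarks):
    """Rough heuristic: thumb tip sits above (smaller y) the thumb joint and all other fingertips."""
    thumb_tip = landmarks[4]
    thumb_ip = landmarks[3]
    fingertips = [landmarks[i] for i in (8, 12, 16, 20)]  # index, middle, ring, pinky tips
    return thumb_tip.y < thumb_ip.y and all(thumb_tip.y < tip.y for tip in fingertips)

cap = cv2.VideoCapture(0)
with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            landmarks = results.multi_hand_landmarks[0].landmark
            if is_thumbs_up(landmarks):
                print("Thumbs up detected - start mouth/audio routine")
        cv2.imshow("gesture", frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
cap.release()
cv2.destroyAllWindows()
```

When the gesture fires, the script can start the audio clip and sweep the mouth servo in time with it.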
Step 6: Python Code
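The full scripts are attached at the end. As a rough outline of how the pieces fit together per frame, here is a hypothetical skeleton; the Haar-cascade face detector, serial protocol string, and angle limits are all assumptions for illustration, not necessarily what the attached code does.

```python
import cv2
import serial

def map_range(value, in_min, in_max, out_min, out_max):
    """Linearly map a value from one range to another, clamped to the input range."""
    value = max(in_min, min(in_max, value))
    return out_min + (value - in_min) * (out_max - out_min) / (in_max - in_min)

ser = serial.Serial("COM3", 9600, timeout=1)           # match your board's port and baud rate
cap = cv2.VideoCapture(0)
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.3, 5)
    if len(faces) > 0:
        x, y, w, h = faces[0]
        cx, cy = x + w // 2, y + h // 2                 # face center in pixels
        x_angle = int(map_range(cx, 0, frame.shape[1], 50, 130))
        y_angle = int(map_range(cy, 0, frame.shape[0], 60, 120))
        ser.write(f"X:{x_angle},Y:{y_angle}\n".encode())  # one command per frame
    cv2.imshow("tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
ser.close()
cv2.destroyAllWindows()
```

In the real script, the automatic blinking and the Step 5 gesture routine would slot into this same loop.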
Step 7: Arduino Code
Step 8: Reflection
The most significant revelation during this project was the maturity and completeness of open-source documentation and resources available for computer vision. Contrary to my initial assumptions, where I expected computer vision integration to be the major bottleneck, the mechanical construction of the eye mechanism emerged as the primary engineering challenge. The assembly demanded precise manipulation of steel wire, and the absence of appropriate tools required... creative solutions.
Two core concepts can be pursued for future development:
- Gaze Tracking and Convergence: The current system lacks depth perception, as the eyes remain fixed even when a target approaches closely. A valuable feature would be to enable the eyes to converge (cross inward) as a detected target moves closer to the center of the field of view, creating the effect of realistic focusing.
- System Consolidation: The current architecture is distributed across several components (an Arduino-based board, a webcam, and a host laptop). Scaling down the entire system by combining the processing and control onto a single embedded platform, such as a Raspberry Pi or a Jetson Nano, would allow the entire project to be packaged into a highly integrated and compact unit.



