Introduction: Self Walking Robot

What did I make?

● A bot that can be trained to walk (move forward) on different surfaces. The bot represents a simple creature with 4 ‘knee-less’ legs that is struggling to move forward. It knows only that each of its legs can be oriented in 3 possible ways, and it has to figure out the best sequence of steps to keep moving. Since its movement also depends on friction with the surface, we believe that each surface it walks on will have its own set of steps (not necessarily unique, but most likely similar across surfaces) that maximizes its forward movement.

What is it used for?

● It is best used for visualising the walking patterns of an AI robot.

Step 1: The Flow Diagram

Here is a breakdown of the whole project. Broadly, the project has 2 parts: the electronics together with the mechanical structure of the robot, and the software, which consists of the algorithm running on the PC and the code running on the Arduino.



Arduino UNO(!)

Ultrasonic sensor

Servo motors

Bluetooth module


Arduino IDE


Jupyter Notebook

Q-learning algorithm

Step 3: MODULE V1 :

Reinforcement Learning: We planned to train our robot using an ANN (Artificial Neural Network), and we came up with two possible methods.

Constraints: Each leg (servo motor) is constrained to only 3 possible positions: 60, 90 and 120 degrees.

Assumptions: We assume that the bot's motion consists of 4 states (a state is a particular orientation of all four servos). These 4 states are taken as 4 consecutive steps that make up one cycle of movement, in which the bot moves some distance ahead. The cycle is repeated indefinitely to keep the bot moving.
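As a rough illustration of the state space described above, the sketch below enumerates every orientation of the four servos. With 3 positions per servo there are 3^4 = 81 of them. The names ANGLES, STATES, state_to_index and index_to_state are our own for this example, and numbering the states from 1 to 81 in this particular order is just one possible labeling.

```python
from itertools import product

ANGLES = (60, 90, 120)   # allowed servo positions in degrees
NUM_LEGS = 4             # one servo per leg

# Every state is a tuple of four angles, one per leg.
STATES = list(product(ANGLES, repeat=NUM_LEGS))


def state_to_index(state):
    """Map a tuple of four angles to a number from 1 to 81 (base-3 encoding)."""
    index = 0
    for angle in state:
        index = index * len(ANGLES) + ANGLES.index(angle)
    return index + 1


def index_to_state(index):
    """Inverse of state_to_index: number from 1 to 81 back to four angles."""
    return STATES[index - 1]


print(len(STATES))                        # 81
print(state_to_index((60, 60, 60, 60)))   # 1
print(index_to_state(81))                 # (120, 120, 120, 120)
```

A single integer per state is convenient later on, because it can be used directly as a row or column number in a table.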

The only problem was the number of combinations to be assessed:
There are 3 possible orientations for each motor and 4 motors, giving 3^4 = 81 states in which the robot can exist at any single step. One cycle of movement takes 4 steps, which means 81^4 = 43,046,721 possible combinations to be checked for maximum efficiency. If testing a single combination takes 5 seconds, the training would take about 6.825 years to complete!
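The estimate above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the variable names are ours, and a year is taken as 365 days.

```python
POSITIONS_PER_SERVO = 3
SERVOS = 4
STEPS_PER_CYCLE = 4
SECONDS_PER_TRIAL = 5    # the 5-second figure assumed in the text

states = POSITIONS_PER_SERVO ** SERVOS        # orientations of all four servos
cycles = states ** STEPS_PER_CYCLE            # 4-step cycles to try
total_seconds = cycles * SECONDS_PER_TRIAL
years = total_seconds / (3600 * 24 * 365)     # assuming 365-day years

print(states)            # 81
print(cycles)            # 43046721
print(round(years, 3))   # 6.825
```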

Step 4: MODULE V2 :

Q-learning Algorithm

Q-learning is an early reinforcement learning algorithm developed for training systems that have a finite number of states and for finding shortest paths. source:
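For readers new to Q-learning, here is a minimal sketch of the standard tabular update rule, Q(s, a) += alpha * (reward + gamma * max Q(s', .) - Q(s, a)). It is a generic textbook form and not necessarily the exact code used in this project. The function name q_update and the values of alpha (learning rate) and gamma (discount factor) are assumptions chosen for illustration.

```python
def q_update(Q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place.

    Q is a list of lists: Q[state][action] is the current value estimate.
    """
    best_next = max(Q[next_state])   # value of the best action in the next state
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])


# Toy table with 3 states and 3 actions, all estimates starting at zero.
Q = [[0.0] * 3 for _ in range(3)]

# The bot was in state 0, took action 1, moved 2.0 units and ended in state 1.
q_update(Q, state=0, action=1, reward=2.0, next_state=1)
print(Q[0][1])   # 1.0
```

For a walking bot, a natural choice is to let the "action" be the next servo orientation and the "reward" be the distance moved.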

Math of the algorithm: At each step the bot can be in one of 81 possible states, which we number from 1 to 81. What we want to know is the transition value: the change in the robot's position (distance moved) when it goes from a state s1 to some other state s2 (both among the 81 states). This can be viewed as a matrix with 81 rows and 81 columns, where each element is the distance moved in the transition from the state given by its row number to the state given by its column number. The values can be positive or negative depending on how the robot actually moves in the real world. We then look for a closed loop of states in which the distance travelled is always positive. The 81 x 81 matrix has 81^2 = 6561 values to evaluate; if it takes 5 seconds to measure and store each value, filling the whole matrix takes only 9.1125 hours, after which a loop of steps that maximizes movement efficiency can be found easily.
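Once the matrix is filled, one way to look for a good loop is an exhaustive dynamic-programming search. The sketch below is an illustration and not the search we actually ran. It assumes T[i][j] holds the distance moved when going from state i to state j (0-based here), and it maximizes the total distance around the loop without checking that every single step is positive. The function name best_cycle is hypothetical.

```python
def best_cycle(T, length=4):
    """Return (total_distance, [s0, s1, ..., s0]) for the best closed loop.

    T[i][j] is the distance moved going from state i to state j.
    The loop has `length` transitions and ends where it started.
    """
    n = len(T)
    best_total, best_path = float("-inf"), None
    for start in range(n):
        # score[j]: best total distance from `start` to j in the steps so far
        score = [T[start][j] for j in range(n)]
        back = [[start] * n]          # back[k][j]: state visited just before j
        for _ in range(length - 1):
            new_score, prev = [], []
            for j in range(n):
                i_best = max(range(n), key=lambda i: score[i] + T[i][j])
                new_score.append(score[i_best] + T[i_best][j])
                prev.append(i_best)
            score = new_score
            back.append(prev)
        if score[start] > best_total:
            # Walk the predecessor tables backwards to recover the loop.
            path, node = [start], start
            for prev in reversed(back):
                node = prev[node]
                path.append(node)
            path.reverse()
            best_total, best_path = score[start], path
    return best_total, best_path


# Toy 3-state matrix with made-up distances, searched for a 3-step loop.
toy = [
    [0, 2, -1],
    [1, 0, 3],
    [4, -2, 0],
]
print(best_cycle(toy, length=3))   # (9, [0, 1, 2, 0])
```

With the full 81 x 81 matrix and length=4 this examines every closed 4-step loop implicitly, in well under a minute on a laptop, instead of testing 81^4 combinations on the robot.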


  1. For some states the bot's motion was very uneven, which affected the ultrasonic sensor readings: the bot would tilt and pick up the distance to a far-off wall.
  2. Disconnections from the laptop and restarts of the Arduino forced the training to start again from 0, which was very irritating.
  3. Watching the robot train for 5 hours continuously was very exhausting.
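The first two problems above could be softened on the PC side. The sketch below shows one possible approach, not what we actually implemented: take the median of a few readings and reject implausible jumps, and save the matrix to disk after each measurement so that training can resume after a disconnect. The function names, the JSON file format and the 30 cm threshold are all assumptions for illustration.

```python
import json
import os
import statistics


def robust_distance(readings, last_good=None, max_jump_cm=30.0):
    """Median of several ultrasonic readings, or None if it looks like a glitch.

    max_jump_cm is a hypothetical threshold: a larger change than this
    between two steps probably means the bot tilted and saw a far wall.
    """
    value = statistics.median(readings)
    if last_good is not None and abs(value - last_good) > max_jump_cm:
        return None   # caller should repeat the measurement
    return value


def save_checkpoint(path, matrix):
    """Write the matrix to a temp file, then rename it over the old one."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(matrix, f)
    os.replace(tmp, path)   # a crash never leaves a half-written file


def load_checkpoint(path, size=81):
    """Load a saved matrix, or start with an empty one (None = not measured)."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return [[None] * size for _ in range(size)]


print(robust_distance([50.0, 51.0, 200.0]))                     # 51.0
print(robust_distance([200.0, 210.0, 205.0], last_good=50.0))   # None
```

On restart, the training loop would call load_checkpoint and skip every cell that is not None, so only the missing transitions are measured again.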

Step 6: MODULE A1 and A2:

  • The mechanical part consists of the chassis board with four servos fixed to it. We used ice-cream sticks to make the legs.
  • Our principal task was to keep track of the bot’s distance from its initial position.
  • Our first approach was to use a gyro sensor and use the bot’s acceleration as it moves to extract its velocity and, subsequently, its position.
  • Problem: it turned out to be too complicated to implement! Alternative: we restricted the bot’s movement to 1 dimension only and used an ultrasonic sensor to measure the distance from a wall straight ahead.
  • The HC-05 Bluetooth module was used during the training period to transmit the distance moved in the transition between two steps to the PC, where the data was stored in a matrix.
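To make the PC side of the last point concrete, here is a minimal sketch of how one received line could be stored in the matrix. The message format "s1,s2,before_cm,after_cm" is purely hypothetical, as are the function names. Because the wall is straight ahead, moving forward makes the ultrasonic reading smaller, so the distance moved is taken as before minus after.

```python
def parse_transition(line):
    """Parse a hypothetical 's1,s2,before_cm,after_cm' text line.

    Returns (s1, s2, moved_cm), where moved_cm is positive when the bot
    got closer to the wall, i.e. moved forward.
    """
    parts = line.strip().split(",")
    if len(parts) != 4:
        raise ValueError("expected 4 comma-separated fields: " + repr(line))
    s1, s2 = int(parts[0]), int(parts[1])
    before_cm, after_cm = float(parts[2]), float(parts[3])
    return s1, s2, before_cm - after_cm


def record(matrix, line):
    """Store one transition; states are numbered 1 to 81 in the message."""
    s1, s2, moved_cm = parse_transition(line)
    matrix[s1 - 1][s2 - 1] = moved_cm
    return moved_cm


matrix = [[None] * 81 for _ in range(81)]
print(record(matrix, "5,17,42.0,39.5\n"))   # 2.5
print(matrix[4][16])                        # 2.5
```

On a real setup the lines would come from the serial port that the Bluetooth module exposes on the PC, for example through a library such as pyserial.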

Step 7: Link to Videos:

Participated in the First Time Author