Self-Learning Rock-Paper-Scissors Robot from Lego Mindstorms NXT!

Hi everyone!  This is my first instructable!

This is a REAL self-learning robot that learns how to play rock-paper-scissors!  It will learn how to beat a person 100% of the time!  Nobody needs to teach the robot the rules of the game; it really does learn them by itself!

This robot does not play rock-paper-scissors the way people do.  It first asks the user to input a move (rock, paper, or scissors).  The robot then calculates the best move to play and extends a retractable arm that shows its move (a Lego rock, paper, or scissors).  The player must then tell the robot whether it won, lost, or tied.

While you may think that this robot is cheating, since it waits for the player to make a move, I did not program the robot to know the rules of the game!  The robot does not know that rock beats scissors, paper beats rock, or scissors beats paper!  Instead, the robot relies on the player to tell it whether it won, lost, or tied, learns from its past successes and failures, and uses that information in future rounds!
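
The learning described above can be sketched as a small table of scores, one per (player move, robot move) pair, that is nudged toward whatever result the player reports. This is only an illustration under stated assumptions, not the robot's actual source: the name `qscore` is borrowed from a comment further down, while `best_move`, `update`, `train`, `human_feedback`, the learning rate `ALPHA`, and the encoding 0 = rock, 1 = paper, 2 = scissors are all made up for this example. The reward values (1 for a win, -10000 otherwise) follow the snippet quoted in the comments.

```c
#include <assert.h>

#define MOVES 3     /* 0 = rock, 1 = paper, 2 = scissors (assumed encoding) */
#define ALPHA 0.5   /* learning rate: an assumed value, not from the original */

/* qscore[i][j]: learned value of answering player move i with robot move j */
static double qscore[MOVES][MOVES];

/* Pick the robot move with the highest learned score for this player move. */
int best_move(int player) {
    assert(player >= 0 && player < MOVES);
    int best = 0;
    for (int j = 1; j < MOVES; j++)
        if (qscore[player][j] > qscore[player][best])
            best = j;
    return best;
}

/* Move the stored score part of the way toward the reported reward. */
void update(int player, int robot, double reward) {
    qscore[player][robot] += ALPHA * (reward - qscore[player][robot]);
}

/* Stand-in for the human pressing won/lost/tied.
   Only this function knows the rules; the learner never sees them. */
double human_feedback(int player, int robot) {
    return (robot == (player + 1) % MOVES) ? 1.0 : -10000.0;
}

/* Play `rounds` rounds against each player move, learning from feedback. */
void train(int rounds) {
    for (int r = 0; r < rounds; r++)
        for (int i = 0; i < MOVES; i++) {
            int j = best_move(i);
            update(i, j, human_feedback(i, j));
        }
}
```

After `train(10)`, `best_move(0)` returns 1 (paper against rock), `best_move(1)` returns 2, and `best_move(2)` returns 0, even though the learner itself was never told which move beats which.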

This robot was featured at Robogames 2011 for the Lego Open challenge (first place!).  Moar pics and the source code will be attached!
I just read through all of the code and your reward system is pointless. It does nothing except set a variable to 1 or -10000. The only time it uses the reward variable is when setting it. There is nothing that tells it that 1 is good or -10000 is bad. Please correct me if I'm wrong.
I'm very sorry, I just checked again and I somehow missed the point where it uses reward to update qscore or something like that. I don't know how I missed it; I even searched for it and didn't see it.
Very cool! I would appreciate it if you made a separate instructable going into the program and how it works. I find it very interesting, and you seem to have it figured out.
wusupworld 1 year ago
WARGAMES!!!!!!!!!! Next you should make a robot to play tic-tac-toe with one or zero players.
AJMansfield 3 years ago
You mentioned a 'reward points' system to encourage the robot to learn. But what would you do if the robot came to realize that the reward points were really meaningless, and started just messing with the human (always getting a tie, for instance?). This robot doesn't seem like it could take over the world or anything, but it might try.
prrgg14935 (author)  AJMansfield 3 years ago
I guess we're all screwed then! :D

Actually, the program CAN tie with a person... you just need to add an extra if statement. The code currently reads:

if((i == 0 && j == 0) || (i == 0 && j == 2) || (i == 1 && j == 0) || (i == 1 && j == 1) || (i == 2 && j == 1) || (i == 2 && j == 2)){
    //the robot tied or lost; take away 10K points!
    reward = -10000;
}//end if

//otherwise, the robot must win; reward it with a virtual point!
else{
    reward = 1;
}//end else

Change this to:
if((i == 0 && j == 2) || (i == 1 && j == 0) || (i == 2 && j == 1))
    reward = -10000; //For losing against human, lose 10K points!
else if((i == 0 && j == 0) || (i == 1 && j == 1) || (i == 2 && j == 2))
    reward = 1; //1 point for tying
else
    reward = 1; //1 point for winning

This way, the robot might tie or might decide to win, depending on how it's feeling :D I tested out this code before, and most of the time, it prefers to win....
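
The win/tie/lose conditions in the snippets above follow a pattern that can also be written with modular arithmetic. Here is a rough sketch, assuming `i` is the player's move and `j` is the robot's, with 0 = rock, 1 = paper, 2 = scissors (that encoding is inferred from the conditions, it is not stated in the code). `reward_for` and the three `*_REWARD` names are hypothetical.

```c
#include <assert.h>

#define LOSE_REWARD (-10000)  /* penalty used in the snippet above */
#define WIN_REWARD  1
#define TIE_REWARD  1         /* same as WIN_REWARD, so a tie scores like a win */

/* i: player's move, j: robot's move; 0 = rock, 1 = paper, 2 = scissors (assumed). */
int reward_for(int i, int j) {
    assert(i >= 0 && i < 3 && j >= 0 && j < 3);
    if (i == j)
        return TIE_REWARD;    /* same move: tie */
    if (j == (i + 1) % 3)
        return WIN_REWARD;    /* robot's move beats the player's */
    return LOSE_REWARD;       /* remaining pairs: (0,2), (1,0), (2,1) */
}
```

With `TIE_REWARD` equal to `WIN_REWARD` the two outcomes score the same. Setting `TIE_REWARD` lower than `WIN_REWARD`, but well above the penalty, should make ties acceptable yet less attractive than wins.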
AndyGadget 4 years ago

Interesting.  It's simulating a simple neural network.
There are programs around which will learn the quirks and frequencies of the person (or algorithm) it's playing against and make a good guess as to what they are going to play next.  That could be something to work towards.
Also, what's the range of the NXT colour sensor?  You could possibly use one to detect if the human was holding up a rock, scissors or paper marked with an appropriate colour.
(BTW, the word is 'more'.)
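
The frequency-learning idea mentioned above could look something like the sketch below: count how often the opponent plays each move and assume they will repeat their favourite. This is a hypothetical illustration, not part of the robot's program. The names `seen`, `observe`, `predict`, and `counter` are invented, the encoding 0 = rock, 1 = paper, 2 = scissors is assumed, and unlike the robot, `counter` has the rules built in.

```c
#include <assert.h>

#define MOVES 3  /* 0 = rock, 1 = paper, 2 = scissors (assumed encoding) */

static int seen[MOVES];  /* how often the opponent has played each move */

/* Record one observed opponent move. */
void observe(int move) {
    assert(move >= 0 && move < MOVES);
    seen[move]++;
}

/* Guess the opponent's next move: the one played most often so far.
   On a tie in the counts, the lowest-numbered move wins. */
int predict(void) {
    int guess = 0;
    for (int m = 1; m < MOVES; m++)
        if (seen[m] > seen[guess])
            guess = m;
    return guess;
}

/* The move that beats the predicted one (this part knows the rules). */
int counter(void) {
    return (predict() + 1) % MOVES;
}
```

For example, after observing rock once and scissors twice, `predict()` returns 2 (scissors) and `counter()` returns 0 (rock).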
prrgg14935 (author)  AndyGadget 4 years ago

Which color sensor are you referring to? The Retail 2.0 one can only see 6 colors: White/Black/Green/Red/Yellow/Blue, but the HiTechnic one (both 1.0 and 2.0) can see 17 different shades. I think it's 2 each of Red-Orange-Yellow-Green-Blue-Indigo-Purple, and Black/Gray/White.

That's an interesting option... reducing the number of sensors from 3 Touch to 1 Color! I think people liked pushing buttons more, so I chose the buttons over the color sensor!
I'm not that familiar with the NXT accessories but I knew there was (at least) one colour sensor.  Of course you'd only need to differentiate three colours here.  Fair enough if people prefer pushing buttons though.
prrgg14935 (author) 4 years ago
Oops, forgot to mention the port assignments!

Rock - Left Motor - Port A
Paper - Center Motor - Port B
Scissors - Right Motor - Port C

Left Touch - Port 1
Center Touch - Port 2
Right Touch - Port 3