Intro: Tic Tac Toe on Arduino With AI (Minimax Algorithm)
In this Instructable I am going to show you how to build a Tic Tac Toe game with an AI using an Arduino. You can either play against the Arduino or watch the Arduino play against itself.
I am using an algorithm called "minimax algorithm", which can be used not only to build an AI for Tic Tac Toe, but also for a variety of other games like Four in a Row, checkers or even chess. Games like chess are very complex and require much more refined versions of the algorithm. For our Tic Tac Toe game, we can use the simplest version of the algorithm, which is nonetheless pretty impressive. In fact, the AI is so good that it is impossible to beat the Arduino!
The game is easy to build. You only need a few components and the sketch that I've written. I also added a more detailed explanation of the algorithm, if you want to understand how it works.
Step 1: Build and Play
To build the Tic Tac Toe game you will need the following components:
- An Arduino Uno
- 9 WS2812 RGB LEDs
- 9 push buttons
- some wire and jumper cables
Wire up the components as shown in the Fritzing sketch. Then upload the code to your Arduino.
By default, the Arduino takes the first turn. To makes things a bit more interesting, the first move is chosen randomly. After the first move, the Arduino uses the minimax algorithm to determine the best possible move. You start a new game by resetting the Arduino.
You can watch the Arduino "think" by opening the Serial Monitor. For every possible move, the algorithm computes a rating that indicates whether this move will lead to a win (value of 10) or a loss (value of -10) for the Arduino or a draw (value of 0).
You can also watch the Arduino play against itself by uncommenting the line "#define DEMO_MODE" at the very beginning of the sketch. If you upload the modified sketch, the Arduino makes the first move randomly and then uses the minimax algorithm to determine the best move for each player in each turn.
Note that you cannot win against the Arduino. Every game will either end in a draw or you lose, if you make a mistake. This is because the algorithm always choses the best possible move. As you might know, a game of Tic Tac Toe will always end in a draw if both players make no mistake. In demo mode, every game ends in a draw because, as we all know, computers never make mistakes ;-)
Step 2: The Minimax Algorithm
The algorithm consists of two components: an evaluation function and a search strategy. The evaluation function is a function that assigns a numerical value to board positions. If the position is a final position (i.e., a position where either the blue player or the red player has won or where neither player has won), the evaluation function is very simple: Let's say the Arduino plays blue and the human player plays red. If the position is a winning position for blue, the function assigns a value of 10 to that position; if it's a winning position for red, the function assigns a value of -10 to the position; and if the position is a draw, the function assigns a value of 0.
When it's the the Arduino's turn, it wants to choose a move that maximizes the value of the evaluation function, because maximizing the value means that it will prefer a win over a draw (10 is greater than 0) and perfer a draw over losing (0 is greater than -10). By an analogous argument, the opponent wants to play in such a way that she minimizes the value of the evaluation function.
For a non-final position, the algorithm calculates the value of the evaluation function by a recursive search strategy. Starting from the current position, it alternatingly simulates all the moves that the blue player and the red player can take. This can be visualized as a tree, like in the diagram. When it arrives at a final position, it starts to step back, carrying the value of the evaluation function from the lower recursion level to the higher recursion level. It assigns the maximum (if in the corresponding recursion step it's the blue player's turn) or the minimum (if in the corresponding recursion step it's the red player's turn) of the values of the evaluation function from the lower recursion level to the position on the higher recursion level. Finally, when the algorithm has finished stepping back and has arrived at the current position again, it takes that move (or one of the moves) that has the maximum evaluation function value.
This may sound a bit abstract, but it's actually not that difficult. Consider the position shown at the top of the diagram. In the first recursion step, there are three different moves blue can take. Blue tries to maximize the value of the evaluation function. For each of the moves blue can take, there are two moves red can take. Red tries to minimize the value of the evaluation function. Consider the move where blue plays in the upper right corner. If red plays in the center box, red has won (-10). If, on the other hand, red plays in the center bottom box, blue will win in the next move (10). So, if blue plays in the upper right corner, red will play in the center box, since that minimizes the value of the evaluation function. Analogously, if blue plays in the center bottom box, red will again play in the center box because that minimizes the evaluation function. If, on the other hand, blue plays in the center box, it doesn't matter which move red takes, blue will always win (10). Since blue wants to maximize the evaluation function, it will play in the center box, since that position results in a greater value of the evaluation function (10) than the other two moves (-10).
Step 3: Troubleshooting and Further Steps
If you push a button and a different LED than the one corresponding to the button lights up, your probably either got the wires on pins A0-A2 or 4-6 mixed up, or you connected the LEDs in the wrong order.
Also note that the algorithm doesn't necessarily always choose a move that will let the Arduino win as fast as possible. In fact, I spent some time debugging the algorithm because the Arduino didn't choose a move that would have been a winning move. It took me some time until I realized that it instead had chosen a move that guaranteed that it would win one move later. If you want to, you can try to modify the algorithm so that it will always prefer a winning move to a later win.
A possible extension of this project would be to use the algorithm to build an AI for a 4x4 or even a 5x5 Tic Tac Toe. However, note that the number of positions the algorithm needs to examine grows very rapidly. You will need to find ways to make the evaluation function more intelligent by assining values to positions that are not final, based on the likelihood that the position is good or bad for the player in question. You might also try to make the search more intelligent by stopping the recursion early if a move turns out to be less worthy of further exploration than alternative moves.
The Arduino is probably not the best platform for such extensions because of its limited memory. Recursion lets the stack grow during program execution, and if the stack grows too much, it can corrupt the program memory, leading to crashes or erratic behavior. I chose the Arduino for this project mainly because I wanted to see if it could be done and for educational purposes, not because it's the best choice for this sort of problem.
edgardkoopman made it!