Introduction: PicChess

Let's play chess?

This project is a microcontroller chess game. The objective was to be able to play chess on a VGA monitor, including an intelligent computer opponent to play against. All of this has been accomplished with a microcontroller.

I started this as a project for college (I am an Electrical Engineering student), but it has grown way beyond that. Now it is a complete chess game with video output, a keyboard for the user interface, audio for some sound effects, a clock and a temperature meter. All the code was written from scratch, so you can ask me any questions about it. It took me a lot of time to write all this down, and to make the video and audio routines. The code is well commented (almost all in English) and modular, so it shouldn't be hard to understand.

The source code (attached as a RAR file) is divided into simple modules, so it's easy to debug and reuse. Some cool techniques were used in the routines that are worth a look. The division of the source code is as follows:
  • Audio
  • Keyboard
  • Video
  • Graphics Routines
  • Serial
  • Temperature sensing
  • External Flash Memory (NVM)
  • Real Time Clock and Calendar (RTCC)
  • Analog Clock
  • Chess Engine
  • Chess Human Interface
  • Chess Graphics
  • Conway's Game of Life
Each section of the code is explained in the next steps of the instructable. The entire code is huge (108 pages), so I will just scratch the surface of it. The routines are written in a non-blocking way, so adding more stuff is simple.

In the end I had a nice game, not so hard to beat but fun.

Thanks to my friend Igor for drawing the pieces for me (I suck at Paint).

And if you like the project and feel it deserves to win, vote for it in the Microcontroller Contest and in the Toy Challenge. To vote, go to the following links:
Microcontroller Contest
Toy Challenge

Arthur Benemann, Brazil 2011

Step 1: Hardware

For the hardware, the main challenge is to select a processor with enough power to handle the audio and video and still run the chess engine. The most powerful micro that I had at hand was a dsPIC33FJ128MC804 from Microchip, which I had bought to start playing around with the dsPIC33F family, and this seemed like a good project to do that.

I know that this micro is meant for motor control, with all those nice peripherals and the DSP instructions, but let's not use that stuff now. The interesting things here are the SPI module that can go up to 10MHz, the 40 MIPS core, 8 DMA channels, 4 Output Compare modules and the audio DAC.

The clock runs at 80MHz. This makes use of the full processor power, and it can also be scaled down to get the 10MHz clock for the SPI module that the video routine needs. This clock rate is obtained with the PLL block in the dsPIC33F.

If you don't know what DMA is, it's a feature that allows some peripherals to transfer data to or from the data memory without CPU intervention. (wiki reference)

So with the processor chosen, the rest is straightforward.
  • Keyboard: the PS/2 connection is as simple as two resistors, added just as a precaution (the 5V tolerant input pins of the micro must be used).
  • Serial: RS232 using an ST232 as the transceiver; no extra interface is needed for 3v3.
  • Temperature sensor: an LM35 (10mV/ºC) that just needs a low-pass filter on its output.
  • External flash memory: the SST25VF016B communicates via SPI and runs at 3v3, so it is just a direct connection. Two resistors are added in case a software problem accidentally leaves two pins driving against each other.
  • Audio: the output from the DAC is a 0.7V peak signal. AC coupling and amplification are done by a capacitor and an LM380 in the typical application from the datasheet, which is capable of outputting 2W with low distortion.
  • VGA: the signal is composed of two TTL sync signals, which need just a resistor each for interfacing, and three analog RGB signals. The input impedance of a monitor is 75 Ohms, so a single resistor would do, but the signal must have a 0.7V amplitude for full intensity on the screen. By Ohm's law this gives 0.7V / 75 Ohms = 9.3mA, more than the maximum current the processor can supply. A 74HCT14 inverter provides the current gain.

The power supply has three output rails. A 5V rail, regulated by a 7805, feeds the higher voltage chips. A 3v3 rail powers the processor and the flash memory; to get 3v3 an LM317 is used just as described in the datasheet. The amplifier is connected to the unregulated supply because it needs the higher voltage to drive the speaker.

I designed little modules that plug into a breadboard so they can be reused. All these modules are in the project files. Some I built on pre-drilled PCB, so there are no files for them, but these are simple to make.

At the beginning of this step there are pictures of the project mounted on my breadboard. The schematic is also there, and the Eagle files are in the project files.

Parts List:
Qty Value
1  CRYSTAL 32.768 kHz
1  74HCT14D
1  SST25VF016B
1  DSPIC33FJ128MC804-PT
1  LM380
1  LM35
1  LM317
1  7812
1  ST232
1  LED 5MM
2  1N4004
2  220R 1/8W
1  390R 1/8W
5  1k 1/8W
5  1k 1/8W
3  10k 1/8W
4  22pF 50V
7  100nF 16V
4  10uF 25V
1  470uF 16V
1  DB9 Female
1  DB15 Female
1  pinhead bar               

Step 2: Video

The video interface chosen was VGA, because it has the horizontal and vertical sync signals separated from the image signal. That's important to get good framing of the image. This routine would be really heavy for a 40 MIPS processor if it were bit-banged, but using the SPI and DMA modules it was reduced to about 10% of the processor time.
With the limited RAM of the processor (16k), the resolution of the image must be greatly reduced. The chosen video mode was 800x600 pixels at 60Hz, a standard resolution that practically every monitor supports. Another reason to choose this mode was its pixel clock of 40MHz (the frequency at which the pixels are serialized through the RGB signals).

To reduce memory consumption, the internal buffer stores just a 200x150 image. This is upscaled by running the SPI at one quarter of the pixel clock and by repeating each line on the display 4 times. The image is monochromatic, so the buffer and bandwidth requirements are low. With these considerations the buffer is reduced to about 4 Kbytes (you must double this number because double buffering is used), leaving plenty of memory for the other routines to run.
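The memory budget above can be checked with a little arithmetic. The sketch below is only an illustration, not the project's code: the names are hypothetical, and it assumes each 200 pixel line is padded to 13 whole 16-bit words.

```c
#include <stdint.h>

enum {
    SCREEN_W = 800,                      /* VGA mode used by the project */
    SCREEN_H = 600,
    SCALE    = 4,                        /* one buffer pixel covers 4x4 screen pixels */
    BUF_W    = SCREEN_W / SCALE,         /* 200 */
    BUF_H    = SCREEN_H / SCALE,         /* 150 */
    WORDS_PER_LINE = (BUF_W + 15) / 16   /* 13 words, assumed padding */
};

/* Bytes needed by one monochrome frame buffer (1 bit per pixel). */
uint32_t frameBytes(void)
{
    return (uint32_t)WORDS_PER_LINE * 2u * (uint32_t)BUF_H;
}

/* Row of the buffer that is shown on a given visible screen line. */
uint16_t bufferRow(uint16_t screenLine)
{
    return (uint16_t)(screenLine / SCALE);
}
```

With these assumptions one buffer takes 3900 bytes, so the two buffers together stay under 8 Kbytes of the 16k of RAM.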

Some graphical routines are added, such as plotLine, plotDot, plotSquare and plotCircle. They modify the video buffer and encapsulate the video module, so the user does not need to handle the buffer. Text routines are added so text can be placed on the screen on the fly. A bit block transfer (BitBlt) routine places a char anywhere on the screen, even if the position is not byte aligned with the buffer.
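To give an idea of what a routine like plotDot has to do on a 1 bit per pixel buffer, here is a minimal sketch. It is not the code from the project files: the buffer layout (13 words per line), the bit order and the helper getDot are assumptions made for the example.

```c
#include <stdint.h>

#define BUF_W 200
#define BUF_H 150
#define WORDS_PER_LINE 13   /* 200 bits rounded up to 16-bit words (assumption) */

uint16_t videoBuf[BUF_H][WORDS_PER_LINE];   /* 1 bit per pixel */

/* Set (color != 0, white) or clear (color == 0, black) one pixel.
   Bit 15 of each word is taken as the leftmost pixel, assuming the
   SPI shifts the most significant bit out first. */
void plotDot(int x, int y, int color)
{
    if (x < 0 || x >= BUF_W || y < 0 || y >= BUF_H)
        return;                                   /* clip outside the buffer */
    uint16_t mask = (uint16_t)(0x8000u >> (x & 15));
    if (color)
        videoBuf[y][x >> 4] |= mask;
    else
        videoBuf[y][x >> 4] &= (uint16_t)~mask;
}

/* Read one pixel back (hypothetical helper); returns 0 outside the buffer. */
int getDot(int x, int y)
{
    if (x < 0 || x >= BUF_W || y < 0 || y >= BUF_H)
        return 0;
    return (videoBuf[y][x >> 4] >> (15 - (x & 15))) & 1;
}
```

Lines, squares and circles can then be built on top of this single function, which keeps the rest of the code away from the bit packing.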

The way the video is generated is explained in the last step of this instructable.

Step 3: Conway's Game of Life

The project was asking me to include Conway's Game of Life when I was getting to the end. With just a few lines of code the game was working; that's the magic of the C language.

It can be run at any time in the game, since it just modifies the video buffer. Look it up on Wikipedia if you don't know what it is. Basically you start with a pattern on the screen and let it evolve following these rules:
  • Any live cell with fewer than two live neighbours dies, as if caused by under-population.
  • Any live cell with two or three live neighbours lives on to the next generation.
  • Any live cell with more than three live neighbours dies, as if by overcrowding.
  • Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction.
(a live cell is a white pixel, a dead cell is black)
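The four rules fit in a few lines of C. The sketch below is an illustration only: it uses a small 8x8 grid with one byte per cell and treats everything outside the grid as dead, while the project works directly on the video buffer. All names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define LIFE_W 8
#define LIFE_H 8

/* Count the live cells around (x, y); cells outside the grid count as dead. */
static int liveNeighbours(uint8_t g[LIFE_H][LIFE_W], int x, int y)
{
    int n = 0;
    for (int dy = -1; dy <= 1; dy++) {
        for (int dx = -1; dx <= 1; dx++) {
            int nx = x + dx, ny = y + dy;
            if (dx == 0 && dy == 0)
                continue;                      /* skip the cell itself */
            if (nx >= 0 && nx < LIFE_W && ny >= 0 && ny < LIFE_H)
                n += g[ny][nx];
        }
    }
    return n;
}

/* Advance the grid by one generation (1 = live, 0 = dead). */
void lifeStep(uint8_t g[LIFE_H][LIFE_W])
{
    uint8_t next[LIFE_H][LIFE_W];
    for (int y = 0; y < LIFE_H; y++) {
        for (int x = 0; x < LIFE_W; x++) {
            int n = liveNeighbours(g, x, y);
            /* birth with exactly 3, survival with 2 or 3, death otherwise */
            next[y][x] = (uint8_t)((n == 3) || (g[y][x] && n == 2));
        }
    }
    memcpy(g, next, sizeof next);
}
```

The next generation is computed into a temporary grid so that cells already updated do not disturb the neighbour counts of the ones still to be processed.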

It's really nice to see the gliders showing up from random start positions.

Step 4: Clock

It's nice to have a clock in the game, and the processor has an RTCC module (Real Time Clock and Calendar), so let's use it.

In hardware, a 32.768 kHz crystal feeds the RTCC circuit, which takes care of all the timekeeping.

I wrote routines to reinitialize the RTCC, set the date and time, and read the date and time. To get a nice visualization I added a routine that displays an analog clock on the screen. It's easy to draw the static part of the clock using the graphical functions, and the hands too, by using the sine and cosine functions from the standard library 'math.h'.

The current temperature is also shown in this mode of operation.

Step 5: Keyboard, Serial, Temperature, NVM

The serial, temperature and keyboard routines are simple, just some input/output routines. A small overview of these functions is presented below; for more information you can look at the code or ask me.

Temperature Routine

There is just one function, readTemp(), that uses the ADC to get a value from the LM35 sensor. This value is then scaled to get an integer with 10 counts per ºC that represents the current temperature.
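The scaling step could look like the sketch below. It is not the project's readTemp(): the function name is hypothetical, and the 10-bit conversion and the 3v3 reference are assumptions that must be matched to the real ADC setup.

```c
#include <stdint.h>

#define ADC_STEPS 1024u   /* assumed 10-bit conversion */
#define VREF_MV   3300u   /* assumed 3v3 reference, in millivolts */

/* Convert a raw ADC reading to tenths of a degree Celsius.
   The LM35 gives 10 mV per degree, so 1 mV equals 0.1 degree and the
   reading in millivolts is already the value we want. */
uint16_t adcToTenths(uint16_t counts)
{
    uint32_t mv = ((uint32_t)counts * VREF_MV + ADC_STEPS / 2u) / ADC_STEPS;
    return (uint16_t)mv;
}
```

Under these assumptions a reading of 78 counts gives 251, that is 25.1 ºC, and everything stays in integer math.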

Serial Routine

Just used to send characters or strings via the RS232 connection to a PC, mainly for debugging.


Keyboard Routine

This routine is fully independent, using its own interrupts to handle the keyboard communication. Data sent by the keyboard is decoded and processed in this module, and the values are saved in a circular FIFO buffer. The main function just needs to call getKey to receive the last key pressed.
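A circular FIFO between an interrupt and the main loop can be sketched as follows. This is an illustration under assumptions, not the project's module: the buffer size, the putKey name and the return conventions are made up, and this getKey hands back the oldest unread key.

```c
#include <stdint.h>

#define KEY_FIFO_SIZE 16   /* assumed size, must be a power of two */

static volatile uint8_t keyFifo[KEY_FIFO_SIZE];
static volatile uint8_t head;   /* next position to write (interrupt side) */
static volatile uint8_t tail;   /* next position to read (main loop side) */

/* Called from the keyboard interrupt with a decoded key.
   Returns 1 on success, 0 if the buffer is full and the key is dropped. */
int putKey(uint8_t key)
{
    uint8_t next = (uint8_t)((head + 1u) & (KEY_FIFO_SIZE - 1u));
    if (next == tail)
        return 0;
    keyFifo[head] = key;
    head = next;
    return 1;
}

/* Called from the main loop. Never blocks: returns -1 when no key waits. */
int getKey(void)
{
    if (head == tail)
        return -1;
    uint8_t key = keyFifo[tail];
    tail = (uint8_t)((tail + 1u) & (KEY_FIFO_SIZE - 1u));
    return key;
}
```

Because only the interrupt writes head and only the main loop writes tail, the two sides do not need to lock each other out in this simple single producer, single consumer arrangement.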

Non Volatile Memory

The SST25 memory chip is needed to store the sounds used by the chess game. It has an SPI interface, so an SPI module is used.
The communication is straightforward, as explained in the datasheet. Some functions are added to encapsulate the memory chip: initialization, byteRead, blockRead, byteWrite, blockWrite and chipErase; the names are self-explanatory. Most of the time the chip is just read in blocks to refill the audio buffer, and for that there is just a call to readBlock in the audio routine.
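As a small example of what a block read sends over SPI, the sketch below builds the 4 byte header of a read: an opcode followed by a 24-bit address, most significant byte first. The function name is hypothetical, and the 0x03 read opcode is the usual one for this kind of SPI flash; check it against the SST25VF016B datasheet before relying on it.

```c
#include <stdint.h>

#define SST25_CMD_READ 0x03u   /* assumed read opcode, verify in the datasheet */

/* Fill the 4 bytes that start a read: opcode plus 24-bit address. */
void sst25ReadHeader(uint32_t addr, uint8_t hdr[4])
{
    hdr[0] = SST25_CMD_READ;
    hdr[1] = (uint8_t)(addr >> 16);
    hdr[2] = (uint8_t)(addr >> 8);
    hdr[3] = (uint8_t)addr;
}
```

After these bytes are shifted out with chip select held low, the chip keeps returning consecutive data bytes for as long as it is clocked, which is what makes reading in blocks cheap.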

Step 6: Chess

Here is the core of this project, the chess game. It is divided in two parts: the interface routines and the chess engine.

Chess Engine
I could include a lot of stuff here, but it's better explained on dedicated chess programming sites, so I will just cover the basics.
Describing a simple program like this as Artificial Intelligence is not fair, since all it does is a brute force search.

A Chess program is made of these parts:
  • Move Generator: Given an initial state of the chess board, it produces all the valid moves allowed by the rules.
  • Evaluation: Returns a score for a board state. It can be as simple as counting the pieces with weights (a pawn weighs 1, a queen 9). The higher the score, the better this board state is for the player whose turn it is.
  • Make/Unmake: A way to apply a move to the board variable, and to undo it later.
  • Search: An algorithm has to search through all the moves generated by the move generator; that is the search function. One of the simplest kinds is the Minimax search. I used a slightly better one, alpha-beta pruning, which avoids searching useless branches of the search tree.
The search runs this way: it generates all moves up to a search depth (a certain number of turns to look ahead), gives a score to each of the positions at the tips of the tree, and then returns the branch that guarantees the best result.
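To show how alpha-beta pruning skips useless branches, here is a minimal sketch on a toy tree instead of a chess board. It is not the engine from the project: the tree is a complete tree with 3 moves per position whose leaf scores sit in an array, and all names are hypothetical.

```c
#include <limits.h>

#define BRANCH 3          /* moves per position in the toy tree */

int leavesVisited;        /* counts evaluated leaves, to see the pruning */

/* Minimax with alpha-beta pruning.
   scores:     leaf scores, from the point of view of the maximizing player
   index:      index of this node within its level
   depth:      remaining plies; 0 means a leaf
   maximizing: 1 if the player to move wants the highest score */
int alphaBeta(const int *scores, int index, int depth,
              int alpha, int beta, int maximizing)
{
    if (depth == 0) {
        leavesVisited++;
        return scores[index];
    }
    int best = maximizing ? INT_MIN : INT_MAX;
    for (int i = 0; i < BRANCH; i++) {
        int v = alphaBeta(scores, index * BRANCH + i, depth - 1,
                          alpha, beta, !maximizing);
        if (maximizing) {
            if (v > best) best = v;
            if (best > alpha) alpha = best;
        } else {
            if (v < best) best = v;
            if (best < beta) beta = best;
        }
        if (alpha >= beta)
            break;        /* the opponent will never allow this line */
    }
    return best;
}
```

With the nine leaves 3, 12, 8, 2, 4, 6, 14, 5, 2 and a depth of 2, the result is 3, the same as plain Minimax, but only 7 of the 9 leaves are evaluated: once the second move is shown to be worth at most 2, its remaining replies are skipped.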

It could be improved a lot by adding a quiescence search. This function would search a few more moves when there is a capture, to avoid mistakenly evaluating one of those positions in the middle of a rampage of exchanges.

Chess Interface
All the user interface is handled by these routines: things like how to select a piece and make a move. Just look at the code if you are interested; it's all based on a state machine, so the code is non-blocking.

One nice thing to note is that, to check whether a move the player tries to make is valid, I used the move generator to get all valid moves and then checked whether the player's move is one of them.
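The idea of validating by generation can be sketched with a toy generator. The example below only knows a lone knight on an empty board, and the Move type and function names are hypothetical; the real engine generates moves for every piece.

```c
#include <stdint.h>

typedef struct { int8_t fromX, fromY, toX, toY; } Move;

/* Toy move generator: moves of a lone knight on an empty 8x8 board.
   Returns the number of moves written to out (at most 8). */
int generateKnightMoves(int x, int y, Move out[8])
{
    static const int8_t dx[8] = { 1, 2,  2,  1, -1, -2, -2, -1 };
    static const int8_t dy[8] = { 2, 1, -1, -2, -2, -1,  1,  2 };
    int n = 0;
    for (int i = 0; i < 8; i++) {
        int tx = x + dx[i], ty = y + dy[i];
        if (tx >= 0 && tx < 8 && ty >= 0 && ty < 8) {
            out[n].fromX = (int8_t)x;
            out[n].fromY = (int8_t)y;
            out[n].toX = (int8_t)tx;
            out[n].toY = (int8_t)ty;
            n++;
        }
    }
    return n;
}

/* A player's move is accepted only if the generator produced it. */
int isMoveValid(Move m)
{
    Move list[8];
    int n = generateKnightMoves(m.fromX, m.fromY, list);
    for (int i = 0; i < n; i++)
        if (list[i].toX == m.toX && list[i].toY == m.toY)
            return 1;
    return 0;
}
```

The benefit of this design is that the rules live in one place only: the interface cannot accept a move that the engine itself would not consider.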

Step 7: Audio

For a better human interface, audio was added. This was greatly simplified because the processor contains an internal DAC. However, as the internal memory of the microcontroller is limited (1 Mbit of Flash), an external 16 Mbit memory was added to the project. The sounds that can be played are stored sequentially in this memory as 8000 Hz, 16 bit PCM, which has enough quality to play snippets of voice and does not take up a lot of space (128s of sound can be recorded in this memory).

For playback, the DAC must be loaded with data in a timed way. This is done with the help of a small buffer and a DMA channel. The channel is activated when the DAC needs more data, and the buffer is then transferred to it word by word. The DMA interrupt (raised when it reaches the end of the audio buffer) is used to refill the audio buffer with data from the external memory, as many times as needed to play the sound.

Thus it is easy to play a sound: it is only necessary to load the address of the sound that should be read and the number of times the audio buffer is to be filled. From this point on, the mechanism made of the DAC and DMA interrupts transfers all the data until the end of the sound.
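The refill mechanism can be sketched in plain C by replacing the hardware with stand-ins. Everything below is an assumption made for illustration: the buffer length, the names, and a RAM array that plays the role of the external flash. In the real firmware audioRefill would be the body of the DMA interrupt.

```c
#include <stdint.h>
#include <string.h>

#define AUDIO_BUF_SAMPLES 64            /* assumed buffer length */

const int16_t *flashImage;              /* stand-in for the external flash */
int16_t  audioBuf[AUDIO_BUF_SAMPLES];   /* buffer the DMA would feed to the DAC */
uint32_t nextSample;                    /* read position, in samples */
uint16_t blocksLeft;                    /* buffer fills still to do */

/* Stand-in for the flash block read; the real one talks SPI. */
static void readBlock(uint32_t sample, int16_t *dst, uint16_t count)
{
    memcpy(dst, flashImage + sample, count * sizeof *dst);
}

/* Start a sound: where it begins and how many buffers it fills. */
void playSound(uint32_t firstSample, uint16_t blocks)
{
    nextSample = firstSample;
    blocksLeft = blocks;
}

/* To be called when the DMA reaches the end of the audio buffer.
   Returns 1 while the sound is playing, 0 once it has finished. */
int audioRefill(void)
{
    if (blocksLeft == 0) {
        memset(audioBuf, 0, sizeof audioBuf);   /* silence */
        return 0;
    }
    readBlock(nextSample, audioBuf, AUDIO_BUF_SAMPLES);
    nextSample += AUDIO_BUF_SAMPLES;
    blocksLeft--;
    return 1;
}
```

The main code only calls playSound; all the rest happens in the background, one buffer at a time.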

The signal generated by the DAC is amplified by a simple circuit using an LM380, which has low distortion, runs from a single power supply and delivers reasonable power (about 2W).


WaveBurner is the name of a small program that I made to load the sounds into the Flash memory. It accepts only the correct type of WAVE file, rips out the data chunk of each file, and piles up all the files in a HEX file (with the padding necessary for the audio buffer). It has a small firmware that goes into the dsPIC to communicate with the PC program through the serial port.

The software was programmed in Delphi 2010 and is included in the project files. With a few clicks it can refill the data in the Flash memory, and as output it also gives a 'C' header file with the length and address of each sound. The second picture shows it burning the flash chip.

Step 8: Video Routine

I decided to keep this part away from the rest of the instructable because it covers the routine that generates the video, and it can get confusing. I recommend that you read the parts of this micro's datasheet that cover the peripherals used below.
The video signal we must generate is composed of three parts: the horizontal sync signal (TTL 0v - 5v), the vertical sync signal (TTL 0v - 5v) and the three color signals (Red, Green, Blue, analog 0v - 0.7v). As the image is chosen to be monochromatic, the three color lines are tied together and can only assume two levels. The monitor captures the last received image, saves it internally and displays it on the screen; each of these images is called a frame. Our mission is to create these frames at the rate the monitor consumes them. The images are 'painted' on the screen as shown in the first picture, line by line, left to right, just like reading.

Each new frame is flagged by a signal called the vertical sync pulse. Another signal, called the horizontal sync, generates continuous pulses. The horizontal pulses never stop, but the first and last pulses in a frame are discarded by the receiver. The 600 horizontal pulses in the middle of a frame are the ones that mark the lines that are shown on the screen. At some time between the edges of the horizontal signal, the color of each pixel on that line is put onto the RGB lines (black = 0v, white = 0.7v). The choice of a 40 MHz clock makes generating the video signal easier, because the pixel frequency is 40MHz for a resolution of 800x600. Each pixel is serialized from the video buffer by the SPI module, and each bit of the video buffer is stretched to fill four pixels.
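The numbers behind this can be worked out from the standard timing of the 800x600 at 60Hz mode. The porch and sync widths below are the commonly published values for this mode and are not taken from the project code; the function names are hypothetical.

```c
#include <stdint.h>

#define PIXEL_CLOCK_HZ 40000000UL

/* Commonly published timing for 800x600 at 60Hz (assumed here). */
enum {
    H_VISIBLE = 800, H_FRONT = 40, H_SYNC = 128, H_BACK = 88,   /* pixel clocks */
    V_VISIBLE = 600, V_FRONT = 1,  V_SYNC = 4,   V_BACK = 23    /* lines */
};

/* Pixel clocks in one full line, blanking included. */
uint32_t hTotal(void) { return H_VISIBLE + H_FRONT + H_SYNC + H_BACK; }

/* Lines (horizontal sync pulses) in one full frame, blanking included. */
uint32_t vTotal(void) { return V_VISIBLE + V_FRONT + V_SYNC + V_BACK; }

/* Horizontal sync frequency in Hz, rounded down. */
uint32_t lineRateHz(void) { return PIXEL_CLOCK_HZ / hTotal(); }

/* Frame rate in thousandths of a Hz, rounded down. */
uint32_t frameRateMilliHz(void)
{
    return (uint32_t)((uint64_t)PIXEL_CLOCK_HZ * 1000u /
                      ((uint64_t)hTotal() * vTotal()));
}
```

With a 40 MIPS core one instruction cycle lasts exactly one pixel clock, so under these assumptions a line is 1056 cycles long (about 37.9 kHz), a frame has 628 lines of which 600 are visible, and the frame rate comes out at about 60.3 Hz.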

The video module is controlled by Timer 2, which is programmed to overflow at exactly the frequency of the horizontal sync signal. To generate this signal, the Output Compare module is used to drive a pin of the micro automatically. The Timer 2 interrupt is used to decide whether this is the beginning of a new frame, controlling the vertical sync timing. The interrupt that controls the vertical sync signal takes some time to run, and this could cause problems in the frame timing; to correct this, the same delay is added to the horizontal sync signal, leaving the two signals synchronized.

With these two signals correctly synchronized we already have a synchronized frame. Now we need to paint the image. The SPI module is used to serialize a word (16 bits) bit by bit through a pin of the micro. Loading the module directly from the CPU would take almost all the processor time and would not be viable. A DMA channel is used instead to transfer a small 'line' buffer to the SPI for each line to be serialized (read 'horizontal sync pulse'). The line buffer must be loaded from the matching line of the video buffer in the TMR2 interrupt routine, which happens just before a new line. And before each transfer the DMA must be activated at the right time to start serializing the bits.
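The per-line work can be simulated on a PC to see the logic. The sketch below is not the project's interrupt routine: the line numbering (visible lines first, then front porch, sync and back porch), the timing constants and all names are assumptions, and flags stand in for the real sync pin and DMA channel.

```c
#include <stdint.h>
#include <string.h>

#define V_TOTAL        628   /* lines per frame (assumed standard timing) */
#define V_VISIBLE      600
#define V_SYNC_START   601   /* first line with vertical sync active */
#define V_SYNC_END     605   /* first line after the vertical sync */
#define SCALE          4     /* each buffer row is shown on 4 screen lines */
#define BUF_H          150
#define WORDS_PER_LINE 13

uint16_t videoBuf[BUF_H][WORDS_PER_LINE];
uint16_t lineBuf[WORDS_PER_LINE];   /* small buffer the DMA would send */
uint16_t line;                      /* line handled by the next interrupt */
int vsyncActive;                    /* stand-in for the vertical sync pin */
int dmaArmed;                       /* stand-in for enabling the DMA channel */

/* What the line timer interrupt would do once per horizontal period. */
void onLineTimer(void)
{
    vsyncActive = (line >= V_SYNC_START && line < V_SYNC_END);
    if (line < V_VISIBLE) {
        /* copy the row to show; every row repeats on SCALE lines */
        memcpy(lineBuf, videoBuf[line / SCALE], sizeof lineBuf);
        dmaArmed = 1;
    } else {
        dmaArmed = 0;               /* blanking: nothing to serialize */
    }
    line = (uint16_t)((line + 1) % V_TOTAL);
}

/* Helper for experiments: run the interrupt body n times. */
void runLines(unsigned n)
{
    while (n--)
        onLineTimer();
}
```

The copy of 13 words is the only work the CPU does per line in this scheme, which is in line with the small share of processor time quoted for the video routine.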

When using the SPI module's own IRQ to trigger the DMA there is a problem: between the IRQ and the DMA transfer there is a delay, which generates a wide pixel every 16 pixels. The way used to solve this problem is to generate a timed IRQ to request the DMA to load the SPI.

To generate this timed IRQ, another DMA channel and an OC module are connected together. The DMA changes the interrupt time of the output compare by transferring a buffer to it. When the OC has an interrupt it triggers the DMA, receives a new time to interrupt, and the loop continues.
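The buffer that this second DMA channel feeds to the output compare is just a table of timer values, one per word of the line. The sketch below shows how such a table could be built; the names and the starting cycle are hypothetical. The spacing follows from the numbers in the text: at 10MHz SPI and 40 MIPS each bit lasts 4 cycles, so a 16-bit word lasts 64 cycles.

```c
#include <stdint.h>

#define WORDS_PER_LINE  13   /* 200 bits rounded up to 16-bit words (assumption) */
#define CYCLES_PER_WORD 64   /* 16 bits x 4 CPU cycles per bit */

/* Fill the table of compare values for one line.
   firstWordCycle is the timer count at which the first word must be
   loaded into the SPI; its real value depends on the sync timing. */
void buildCompareTable(uint16_t firstWordCycle, uint16_t table[WORDS_PER_LINE])
{
    for (int k = 0; k < WORDS_PER_LINE; k++)
        table[k] = (uint16_t)(firstWordCycle + k * CYCLES_PER_WORD);
}
```

Since the table is the same for every line, it can be computed once at startup and simply replayed by the DMA on each line.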

That way, to transmit a video line (200 pixels), the processor only has to load the line buffer before each line. The line buffer could be completely eliminated if this micro's DMA could reach all of the RAM (it is limited to only 2K) and read directly from the video buffer. The second image tries to explain this.