Introduction: Code and Test a Computer in Machine Language
In this Instructable, I will show you how to code and test a computer program in machine language. Machine language is the native language of computers. Because it is composed of strings of 1s and 0s, it is not easily understood by humans. To work around this, we code programs first in a high-level language like C++ or Java, then use special computer programs to translate them into the 1s and 0s computers do understand. Learning to code in a high-level language is certainly a no-brainer, but a brief introduction to machine language can provide valuable insight into how computers work and increase appreciation of this very important technology.
To code and test a machine language program, we need access to a no-frills computer whose machine language is easily understood. Personal computers are far too complex to even consider. The solution is to use Logisim, a logic simulator, that runs on a personal computer. With Logisim we can simulate a computer that meets our needs. The video above gives you some idea what we can accomplish with Logisim.
For the computer design, I adapted one from my Kindle e-book Build Your Own Computer - From Scratch. I started with the BYOC computer described there and trimmed it down to the very basic BYOC-I (I for Instructable) we will use in this Instructable.
BYOC-I's machine language is simple and easy to understand. You won't need any special knowledge of computers or programming. All that is required is an inquisitive mind and a desire to learn.
You may wonder why we use "machine" to describe a computer when it is not a mechanical device. The reason is historic; the first computing devices were mechanical, consisting of gears and wheels. Allan Sherman's lyric, "It was all gears going clickety-clack..." was only off by a century or two. Read more about early computing here.
Step 1: Parts List
The parts list is short. Only these two items are required, both downloadable free:
- "Logisim-win-2.7.1.exe" - Logisim is a popular and easy to use logic simulator. Download the Logisim executable file from here then create a short cut in a convenient place like your desktop. Double click the Logisim icon to launch it. Note: Logisim uses Java Runtime Package located here. You may be asked to download it.
- "BYOC-I-Full.cir" - Download the Logisim circuit file below.
Launch Logisim then click "File-Open" and load the BYOC-I-Full.cir file. The image above shows the Logisim working environment. The BYOC-I is represented by the subcircuit block. Connected externally are two inputs, Reset and Run, and hexadecimal displays for the BYOC-I's registers and program memory.
The BYOC-I's program memory is pre-loaded with a simple program that counts from 1 to 5 in the A register. To execute (Run) the program, follow these steps.
Step 1 - Click on the Poke Tool. The cursor should change to the poking "finger".
Step 2 - Poke the Reset input twice, once changing it to "1" and again to change it back to "0". This resets the BYOC-I to start the program at address 0.
Step 3 - Poke the Run input once to change it to "1". The A register should show the count changing from 1 to 5 then repeating.
Step 4 - If the program doesn't execute, press control-K and it should start.
If you want to explore Logisim's capabilities, click the Help link in the Menu Bar. From there, you can explore the Logisim "Tutorial", "User Guide", and "Library Reference". An excellent video introduction is found here.
Step 2: Machine Language Hierarchy and Codes
The BYOC-I computer performs tasks based on programs written in machine language. BYOC-I programs, in turn, are composed of instructions executed in a well-defined sequence. Each instruction is made of fixed-length codes that represent various operational components of the BYOC-I. Finally, these codes consist of strings of 1s and 0s that constitute the machine language the BYOC-I actually executes.
By way of explanation, we'll start with codes and work our way up to the program level. Then we will code a simple program, load it into the BYOC-I's memory, and execute it.
Codes consist of a fixed number of binary (1 and 0) digits, or bits for short. For instance, the table below shows all the possible codes (16 in all) for a code 4 bits wide. Shown alongside each code are its hexadecimal (base 16) and decimal equivalents. Hexadecimal is used in referring to binary values as it is more compact than binary and easier to convert from binary than decimal. The "0x" prefix lets you know the number that follows is hexadecimal or "hex" for short.
Binary - Hexadecimal - Decimal
0000 0x0000 0
0001 0x0001 1
0010 0x0002 2
0011 0x0003 3
0100 0x0004 4
0101 0x0005 5
0110 0x0006 6
0111 0x0007 7
1000 0x0008 8
1001 0x0009 9
1010 0x000A 10
1011 0x000B 11
1100 0x000C 12
1101 0x000D 13
1110 0x000E 14
1111 0x000F 15
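The table above can also be generated rather than typed by hand. Here is a minimal Python sketch; the function name four_bit_table is just an illustrative choice, and the hex values are padded to four digits only to match the table's style.

```python
# Build every 4-bit code with its hexadecimal and decimal equivalents.
def four_bit_table():
    rows = []
    for value in range(16):                 # 2**4 = 16 possible codes
        binary = format(value, "04b")       # 4 binary digits, leading zeros kept
        hexa = "0x" + format(value, "04X")  # padded to match the table's style
        rows.append((binary, hexa, value))
    return rows

for binary, hexa, decimal in four_bit_table():
    print(binary, hexa, decimal)
```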
The width of a code determines how many items can be represented. As noted, the 4-bit wide code above can represent up to 16 items (0 to 15); that is, 2 x 2 x 2 x 2, or 2 to the 4th power, equals 16. In general, the number of items an n-bit code can represent is 2 raised to the nth power. Here is a short list of n-bit code capacities.
n - Number of Items
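Capacities like these follow directly from the 2-to-the-nth-power rule, so they are easy to compute. The short Python sketch below does so; the particular widths listed are my own selection (they happen to include the 1-, 2-, 3-, 8-, and 12-bit widths that show up in the BYOC-I), not necessarily the ones in the original list.

```python
def capacity(n_bits):
    """Number of distinct items an n-bit code can represent."""
    return 2 ** n_bits

# Widths chosen for illustration only.
for n in (1, 2, 3, 4, 8, 12):
    print(n, capacity(n))
```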
BYOC-I computer code widths are chosen to accommodate the number of items to be represented by the code. For example, there are four Instruction Types, so a 2-bit wide code is suitable. Here are the BYOC-I codes with a brief explanation of each.
Instruction Type Code (tt) There are four instruction types: (1) MVI - Move an immediate 8-bit constant value into a memory register. The memory register is a device that holds data to be used for a calculation, (2) MOV - Move data from one register to another, (3) RRC - Perform a register-to-register calculation, and (4) JMP - Jump to a different instruction instead of continuing at the next instruction. The BYOC-I Instruction Type Codes adopted are as follows:
00 MVI
01 MOV
10 RRC
11 JMP
Register Code (dd and ss) The BYOC-I has four 8-bit registers capable of storing values from 0 to 255. A 2-bit code is sufficient to designate the four registers:
00 F register
01 E register
10 D register
11 A register
Calculation Code (ccc) The BYOC-I supports four arithmetic/logic operations. To allow for future expansion to eight calculations, a 3-bit code is used:
000 ADD, add two 8-bit values in designated registers and store the result in one of the registers
001 SUB, subtract two 8-bit values in designated registers and store the result in one of the registers
010 to 011, Reserved for future use
100 AND, logically AND two 8-bit values in designated registers and store the result in one of the registers
101 OR, logically OR two 8-bit values in designated registers and store the result in one of the registers
110 to 111, Reserved for future use
Jump Code (j) A 1-bit code that indicates whether the jump is unconditional (j = 1) or conditioned on a not zero calculation result (j = 0).
Data/Address Code (v...v)/(a...a) 8-bit data can be included in certain instructions representing values from 00000000 to 11111111 or 0 to 255 decimal. This data is 8 bits wide for storage in BYOC-I's 8-bit registers. With decimal arithmetic, we don't show leading zeros. With computer arithmetic, we show leading zeros but they don't affect the value. 00000101 is numerically the same as 101 or 5 decimal.
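The point about leading zeros is easy to check for yourself. This is a small Python sketch, not part of the BYOC-I itself, that converts between the padded 8-bit form and the plain value.

```python
value = int("00000101", 2)       # parse the padded binary string
print(value)                     # 5
print(value == int("101", 2))    # True: leading zeros do not change the value
print(format(value, "08b"))      # 00000101, padded back out to 8 bits
print(format(255, "08b"))        # 11111111, the largest 8-bit value
```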
The idea of using codes to drive a process goes back a long way. One fascinating example is the Jacquard Loom. The automated loom was controlled by a chain of wooden cards in which holes were drilled representing codes for different colored yarn for weaving. I saw my first one in Scotland where it was used to make colorful tartans. Read more about Jacquard Looms here.
Step 3: Anatomy of BYOC-I Instructions
Given the BYOC-I's codes, we move up to the next level, instructions. To create an instruction for the BYOC-I, we place the codes together in specified order and in specific locations within the instruction. Not all codes appear in all instructions but, when they do, they occupy a specific location.
The MVI instruction type requires the most bits, 12 in all. By making the instruction word 12 bits in length, we accommodate all instructions. Unused (so called "don't care") bits are given the value 0. Here is the BYOC-I Instruction Set.
- Move Immediate (MVI) - 00 dd vvvvvvvv
Function: Move an 8-bit data value V = vvvvvvvv to the destination register dd. After execution, register dd will have the value vvvvvvvv.
Abbreviation: MVI R,V where R is A, D, E, or F.
Example: 00 10 00000101 - MVI D,5 - Move the value 5 to the D register.
- Move Register to Register (MOV) - 01 dd ss 000000
Function: Move data from source register ss to destination register dd. After execution, both registers have the same value as the source register.
Abbreviation: MOV Rd,Rs where Rd is the destination register A, D, E, or F and Rs is the source register A, D, E, or F.
Example: 01 11 01 000000 - MOV A,E - Move the value in register E to register A.
- Register to Register Calculation (RRC) - 10 dd ss ccc 000
Function: Perform the designated calculation ccc using source register ss and destination register dd, then store the result in the destination register.
Abbreviations: ADD Rd,Rs (ccc=000 Rd + Rs stored in Rd); SUB Rd,Rs (ccc=001 Rd - Rs stored in Rd); AND Rd,Rs (ccc=100 Rd AND Rs stored in Rd); OR Rd,Rs (ccc=101 Rd OR Rs stored in Rd).
Example: 10 00 11 001 000 - SUB F,A - Subtract the value in the A register from the F register with the result in the F register.
- Jump to Different Instruction (JMP) - 11 j 0 aaaaaaaa
Function: Change execution to a different instruction located at address aaaa aaaa.
(a) Unconditionally (j=1) - 11 1 0 aaaaaaaa
Abbreviation: JMP L where L is address aaaa aaaa
Example: 11 1 0 00001000 - JMP 8 - Change execution to address 8.
(b) Conditionally (j=0) when the previous calculation resulted in a not zero result - 11 0 0 aaaaaaaa
Abbreviation: JNZ L where L is address aaaa aaaa.
Example: 11 0 0 00000100 - JNZ 4 - If the last calculation yielded a non-zero value, change execution to address 4.
Instruction word bits are numbered left (most significant bit MSB) to right (least significant bit LSB) from 11 to 0. The fixed order and locations of the codes are as follows:
Bits - Code
11-10 Instruction Type
9-8 Destination Register
7-6 Source Register
5-3 Calculation: 000 - add; 001 - subtract; 100 - logical AND; 101 - logical OR
7-0 Constant value v...v and a...a (0 to 255)
The instruction set is summarized in the figure above. Note the structured and orderly appearance of the codes in each instruction. The result is a simpler design for the BYOC-I and it makes instructions easier for humans to understand.
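Because every code sits at a fixed bit position, the fields can be pulled out of an instruction word by shifting and masking. The Python sketch below follows the bit table above; the function name fields and the dictionary keys are illustrative choices of mine. Which fields are meaningful depends on the instruction type (for example, "value" only matters for MVI and JMP).

```python
def fields(word):
    """Split a 12-bit instruction word into the fields from the bit table."""
    return {
        "type": (word >> 10) & 0b11,   # bits 11-10, instruction type
        "dest": (word >> 8) & 0b11,    # bits 9-8, destination register
        "src": (word >> 6) & 0b11,     # bits 7-6, source register
        "calc": (word >> 3) & 0b111,   # bits 5-3, calculation
        "value": word & 0xFF,          # bits 7-0, constant or address
    }

sub_f_a = 0b100011001000   # 10 00 11 001 000 - SUB F,A
mvi_d_5 = 0b001000000101   # 00 10 00000101 - MVI D,5
print(fields(sub_f_a))
print(fields(mvi_d_5))
```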
Step 4: Coding a Computer Instruction
Before moving to the program level, let's construct some example instructions using the BYOC-I Instruction Set above.
1. Move the value 1 to register A. BYOC-I registers can store values from 0 to 255. In this case, register A will have the value 1 (00000001 binary) after execution of the instruction.
Abbreviation: MVI A,1
Codes Required: Type MVI - 00; Destination Register A - 11; Value - 00000001
Instruction Word: 00 11 00000001
2. Move the contents of register A to register D. After execution, both registers will have the value originally in register A.
Abbreviation: MOV D,A (Remember, the destination is first and source second in the list)
Codes Required: Type MOV - 01; Destination Register D - 10; Source Register A - 11
Instruction Word: 01 10 11 000000
3. Add contents of register D to register A and store in register A. After execution, register A's value will be the sum of the original value of register A and register D.
Abbreviation: ADD A,D (Result is stored in destination register)
Codes Required: Type RRC - 10; Destination Register A - 11; Source Register D - 10; Calculation Add - 000
Instruction Word: 10 11 10 000 000 (ccc is the first 000 - add)
4. Jump on not zero to address 3. If the result of the last calculation was not zero, execution will change to the instruction at the given address. If zero, execution resumes at the instruction following.
Abbreviation: JNZ 3
Codes Required: Type JMP - 11; Jump Type - 0; Address - 00000011
Instruction Word: 11 0 0 00000011 (Jump type is the first 0)
5. Jump unconditional to address 0. After execution, execution changes to the instruction at the given address.
Abbreviation: JMP 0
Codes Required: Type JMP - 11; Jump Type - 1; Address - 00000000
Instruction Word: 11 1 0 00000000
While machine coding is somewhat tedious, you can see that it's not impossibly difficult. If you were machine coding for real, you would use a computer program called an assembler to translate from the abbreviation (which is called assembly code) to machine code.
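To give a feel for what an assembler does, here is a minimal Python sketch that translates the abbreviations used in this step into 12-bit instruction words. It is only an illustration built from the code tables in this Instructable; the function name assemble and the one-instruction-per-call design are my own assumptions, and it does no error checking beyond rejecting unknown instructions.

```python
REG = {"F": 0b00, "E": 0b01, "D": 0b10, "A": 0b11}
CALC = {"ADD": 0b000, "SUB": 0b001, "AND": 0b100, "OR": 0b101}

def assemble(line):
    """Translate one abbreviation such as 'MVI A,1' into a 12-bit word."""
    mnemonic, _, operands = line.strip().partition(" ")
    args = [a.strip() for a in operands.split(",")]
    if mnemonic == "MVI":                      # 00 dd vvvvvvvv
        return (0b00 << 10) | (REG[args[0]] << 8) | (int(args[1]) & 0xFF)
    if mnemonic == "MOV":                      # 01 dd ss 000000
        return (0b01 << 10) | (REG[args[0]] << 8) | (REG[args[1]] << 6)
    if mnemonic in CALC:                       # 10 dd ss ccc 000
        return ((0b10 << 10) | (REG[args[0]] << 8)
                | (REG[args[1]] << 6) | (CALC[mnemonic] << 3))
    if mnemonic in ("JMP", "JNZ"):             # 11 j 0 aaaaaaaa
        j = 1 if mnemonic == "JMP" else 0
        return (0b11 << 10) | (j << 9) | (int(args[0]) & 0xFF)
    raise ValueError("unknown instruction: " + line)

for text in ["MVI A,1", "MOV D,A", "ADD A,D", "JNZ 3", "JMP 0"]:
    print(text, format(assemble(text), "012b"))
```

The five lines printed correspond to the five examples worked by hand in this step.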
Step 5: Anatomy of a Computer Program
A computer program is a list of instructions that the computer executes beginning at the start of the list continuing down the list to the end. Instructions like JNZ and JMP can change which instruction is executed next. Each instruction in the list occupies a single address in the computer's memory starting at 0. The BYOC-I memory can hold a list of 256 instructions, more than enough for our purposes.
Computer programs are designed to perform a given task. For our program, we'll choose a simple task, counting from 1 to 5. Obviously, there is no "count" instruction, so the first step is to break the task down into steps that can be handled by the BYOC-I's very limited instruction set.
Step 1 Move 1 to register A
Step 2 Move register A to register D
Step 3 Add register D to register A and store the result in register A
Step 4 Move 5 to register E
Step 5 Subtract register A from register E and store the result in register E
Step 6 If the subtraction result was not zero, go back to Step 3 and continue counting
Step 7 If the subtraction result was zero, go back and start over
The next step is to translate these steps into BYOC-I instructions. BYOC-I programs start at address 0 and are numbered consecutively. Jump target addresses are added last after all instructions are in place.
Address:Instruction - Abbreviation ;Description
0:00 11 00000001 - MVI A,1 ;Move 1 to register A
1:01 10 11 000000 - MOV D,A ;Move register A to register D
2:10 11 10 000 000 - ADD A,D ;Add register D to register A and store the result in register A
3:00 01 00000101 - MVI E,5 ;Move 5 to register E
4:10 01 11 001 000 - SUB E,A ;Subtract register A from register E and store the result in register E
5:11 0 0 00000010 - JNZ 2 ;If the subtraction result was not zero, go back to address 2 and continue counting
6:11 1 0 00000000 - JMP 0 ;If the subtraction result was zero, go back and start over
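Before loading the program into Logisim, you can trace it on paper or with a few lines of Python. The sketch below is a rough software model of the BYOC-I, not the actual circuit. It assumes results wrap around to 8 bits and that the "not zero" condition is remembered from the most recent calculation; both are my assumptions, and the names run and a_history are illustrative.

```python
PROGRAM = [
    0b00_11_00000001,    # 0: MVI A,1
    0b01_10_11_000000,   # 1: MOV D,A
    0b10_11_10_000_000,  # 2: ADD A,D
    0b00_01_00000101,    # 3: MVI E,5
    0b10_01_11_001_000,  # 4: SUB E,A
    0b11_0_0_00000010,   # 5: JNZ 2
    0b11_1_0_00000000,   # 6: JMP 0
]

def run(program, max_steps):
    """Execute max_steps instructions and return the values seen in register A."""
    regs = [0, 0, 0, 0]   # indexed by register code: F=0, E=1, D=2, A=3
    pc = 0                # address of the next instruction
    nonzero = False       # assumed flag, set by the most recent calculation
    a_history = []
    for _ in range(max_steps):
        word = program[pc]
        pc += 1
        kind = (word >> 10) & 0b11
        dest = (word >> 8) & 0b11
        src = (word >> 6) & 0b11
        if kind == 0b00:                       # MVI
            regs[dest] = word & 0xFF
        elif kind == 0b01:                     # MOV
            regs[dest] = regs[src]
        elif kind == 0b10:                     # RRC
            calc = (word >> 3) & 0b111
            x, y = regs[dest], regs[src]
            result = {0: x + y, 1: x - y, 4: x & y, 5: x | y}[calc] & 0xFF
            regs[dest] = result
            nonzero = result != 0
        else:                                  # JMP (j=1) or JNZ (j=0)
            unconditional = (word >> 9) & 1
            if unconditional or nonzero:
                pc = word & 0xFF
        if not a_history or a_history[-1] != regs[3]:
            a_history.append(regs[3])          # record each change in A
    return a_history

print(run(PROGRAM, 19))
```

Nineteen instructions cover one full pass through the count, ending with the JMP 0 that starts it over.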
Before transferring the program to memory, the binary instruction code has to be changed to hexadecimal to use with the Logisim Hex Editor. First, split the instruction into three groups of 4 bits each. Then translate the groups into hexadecimal using the table in Step 2. Only the last three hexadecimal digits (in bold below) will be used.
Address - Instruction Binary - Instruction Binary Split - Instruction (Hex)
0 001100000001 0011 0000 0001 - 0x0301
1 011011000000 0110 1100 0000 - 0x06C0
2 101110000000 1011 1000 0000 - 0x0B80
3 000100000101 0001 0000 0101 - 0x0105
4 100111001000 1001 1100 1000 - 0x09C8
5 110000000010 1100 0000 0010 - 0x0C02
6 111000000000 1110 0000 0000 - 0x0E00
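The split-and-translate procedure used for the table above can be written as a few lines of Python. This is only a sketch; the function name to_hex_word is my own, and the leading 0 is added simply to match the four-digit form shown in the table.

```python
def to_hex_word(bits):
    """Convert a 12-bit binary string to the 0x0XXX form used in the table."""
    bits = bits.replace(" ", "")                               # drop field spacing
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]   # three 4-bit groups
    digits = "".join(format(int(g, 2), "X") for g in groups)   # one hex digit each
    return "0x0" + digits

print(to_hex_word("00 11 00000001"))    # MVI A,1
print(to_hex_word("10 01 11 001 000"))  # SUB E,A
print(to_hex_word("11 0 0 00000010"))   # JNZ 2
```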
It's time to transfer the program to the BYOC-I's memory for testing.
Step 6: Transferring the Program to Memory and Testing
Looking at the Logisim "main" circuit, the BYOC-I block shown is the symbol for the actual computer circuit labelled "BYOC-I" in the Explorer Pane. To enter a program into the BYOC-I memory:
- Right click the BYOC-I block (called a "subcircuit") and select (hover over and left click) "View BYOC-I".
- The BYOC-I circuit will appear in the Work Area. Right click on the "Program Memory" symbol and select "Edit Contents...".
- Using the Logisim Hex Editor, enter the hexadecimal code (bold only) as shown above.
You are now ready to execute the program. Return to the main circuit by double clicking "BYOC-I" in the Explorer Pane. The Run and Reset inputs should be "0" to start. Using the Poke Tool, first change Reset to "1" then back to "0". This makes the starting address 0x0000 and prepares the BYOC-I circuit for execution. Now poke the Run input to "1" and the program will execute. (Note: You may need to tap Control-K once to start the Logisim clock. This is a feature that allows you to stop the Logisim clock and step through a program by tapping Control-T repeatedly. Try it sometime!)
The Logisim clock can be set to a wide range of frequencies. As downloaded, it is 8 Hz (8 cycles per second). The way the BYOC-I computer is designed, each instruction takes four clock cycles to complete. So, to calculate the BYOC-I speed, divide the clock frequency by 4. At 8 Hz, its speed is 2 instructions per second. You can change the clock by clicking "Simulate" on the menu bar and selecting "Tick Frequency". The possible range is 0.25 Hz to 4100 Hz. The slow speed at 8 Hz was chosen so you could watch the count in the A register.
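The speed calculation is just a division, shown here as a tiny Python sketch. The four cycles per instruction come from the text above; the function name is illustrative.

```python
CYCLES_PER_INSTRUCTION = 4   # each BYOC-I instruction takes four clock cycles

def instructions_per_second(clock_hz):
    """BYOC-I speed for a given Logisim tick frequency in Hz."""
    return clock_hz / CYCLES_PER_INSTRUCTION

print(instructions_per_second(8))      # 2.0 at the downloaded setting
print(instructions_per_second(4100))   # 1025.0 at the fastest setting
```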
The maximum speed of the BYOC-I simulation (~1000 instructions per second) is very slow compared to modern computers. The hardware version of the BYOC computer described in my book executes at greater than 12 million instructions per second!
I hope this Instructable has demystified machine language programming and given you insight into how computers work at their most basic level. To clinch your understanding, try to code the two programs below.
- Write a program that starts at 5 and counts down to 0. (ANS. Count5to0.txt below)
- Starting at 2, count by 3 until the number exceeds 7. You could do a little mental arithmetic, check for 8 knowing it would land there, then restart. Instead, write your program in a more general way that really tests if the count "exceeds" a specific number. Hint: Explore what happens when a subtract yields a negative value, say 8 - 9 = -1 for example. Then experiment with the logical AND to test whether the MSB in an 8-bit number is "1". (ANS. ExceedsCount.txt)
Can you think of other challenging problems for the BYOC-I computer? Given its limitations, what more can it do? Share your experiences with me at firstname.lastname@example.org. If you are interested in coding microprocessors, check out my website www.whippleway.com. There I carry machine coding to modern processors like the ATMEL Mega series used in Arduinos.