Today is the first day of the Lunar New Year, so first off, I wish you all a Happy New Year! I hope everyone stays healthy in the new year, has smooth sailing in studies and work, and is happy every day.
Starting today, we officially enter PA2. In PA2, we need to implement a Von Neumann Computer System. That is to say, in this chapter, we can actually run some programs on our virtual machine! (Yay!!!!!!!!)
More on Programs
To be able to run programs on the virtual machine we wrote ourselves, we must fully understand the rules of how a program runs. Although I touched on this in previous articles, I haven’t systematically presented the entire execution process. Therefore, today, I will introduce in more detail how a program runs on a computer.
Program
To get a program running on a computer, we must first have a program (how fresh). Below, I will introduce based on the following piece of program.
Some people might say: Why not write such a classic recursive algorithm like the Fibonacci sequence using recursion? And you even said you like recursion! Pfft! Disgusting!
For the reason, see Appendix
This program calculates the sequence of Fibonacci numbers within 10, but it doesn’t use it to accomplish anything. To execute this code, we also need a…
Compiler
Currently, the above code is just simple text. To the machine, they are nothing special. Although we might name it fib.c, the .c suffix actually has no special effect. If we wanted to, we could also change it to .txt, .py, or even .mp3 or other suffixes without causing any impact. To execute this code, we must have a binary executable file that the machine can understand.
A compiler is a tool that can turn high-level languages like C, Java, etc., into executable files. Different languages have different compilers, and each compiler also has multiple versions corresponding to different architectures. Since my virtual machine uses the Risc-V32 architecture, we need to use the Risc-V32 version of gcc (a compiler for the C language) to compile, and then convert the binary file into assembly language (a relatively low-level language), resulting in a file like this:
1# Memory Address Binary Instruction Assembly Code 2 380000000 <_start>: 480000000: 00000413 li s0,0 580000004: 00009117 auipc sp,0x9 680000008: ffc10113 addi sp,sp,-4 # 80009000 <_end> 78000000c: 00c000ef jal ra,80000018 <_trm_init> 8 980000010 <main>: 1080000010: 00000513 li a0,0 1180000014: 00008067 ret 12 1380000018 <_trm_init>: 1480000018: ff010113 addi sp,sp,-16 158000001c: 00000517 auipc a0,0x0 1680000020: 01c50513 addi a0,a0,28 # 80000038 <_etext> 1780000024: 00112623 sw ra,12(sp) 1880000028: fe9ff0ef jal ra,80000010 <main> 198000002c: 00050513 mv a0,a0 2080000030: 00100073 ebreak 2180000034: 0000006f j 80000034 <_trm_init+0x1c>
This file is divided into three columns. The first column is the memory address where a line of compiled binary instruction is located; the second column is the hexadecimal representation (8 digits) of the binary program (32 bits) stored in this address; and the third column is the assembly program corresponding to this binary instruction, where the first column indicates the name of the instruction (opcode), and the second column indicates the operand or register name required by the instruction (operand).
Such a binary file is a program that a computer can execute!
ISA (Instruction Set Architecture)
The Risc-V instruction set defines a series of instructions, and any chip with a Risc-V architecture must be able to implement these instructions (remember? the instruction set is a specification manual).
Different instruction sets provide different numbers of instructions, so the design of the compiler must also follow the constraints of the instruction set. Risc-V is a Reduced Instruction Set Computer (Risc), which means it contains fewer instructions and the length of the instructions is fixed; while the Intel x86 instruction set we more commonly see is a Complex Instruction Set Computer (Cisc), which defines more instructions, and the length of the instructions can vary.
It should be clarified here that Risc and Cisc are design choices, and there is no absolute superiority or inferiority between the two.
- Every Risc instruction can be completed in a single clock cycle, using simple instructions to achieve more compact program execution, thereby improving running efficiency. Also, because Cisc instructions are very complex, the circuitry designed to parse instructions is usually very complex, making it harder to improve and update the hardware.
- Cisc has a very mature commercial system. At the same time, Cisc makes it easier to support parallel computing. I provide two articles for reference here and here.
Below, I will use x86 gcc to compile the same program, which will result in this binary file:
100100000 <_start>: 2 100000: bd 00 00 00 00 mov $0x0,%ebp 3 100005: bc 00 90 10 00 mov $0x109000,%esp 4 10000a: e8 05 00 00 00 call 100014 <_trm_init> 5 10000f: 90 nop 6 700100010 <main>: 8 100010: 31 c0 xor %eax,%eax 9 100012: c3 ret 10 100013: 90 nop 11 1200100014 <_trm_init>: 13 100014: 55 push %ebp 14 100015: 89 e5 mov %esp,%ebp 15 100017: 83 ec 14 sub $0x14,%esp 16 10001a: 68 40 00 10 00 push $0x100040 17 10001f: e8 ec ff ff ff call 100010 <main> 18 100024: cc int3 19 100025: 83 c4 10 add $0x10,%esp 20 100028: eb fe jmp 100028 <_trm_init+0x14>
I’m sure everyone can clearly see the difference.
Hardware
When executing a program:
- The PC register in the CPU records the position (memory address) of the instruction currently being executed, reads the content from it, and stores it in an instruction register (Instruction Fetch);
- Next, the CPU extracts the opcode and operands from the instruction through a digital circuit (Decoding);
- Then, the machine performs certain operations on the operands according to the content of the instruction (Execution);
- Finally, the value stored in the PC is modified, thereby initializing the execution of the next instruction (Update PC).
These four steps are what all computers do when executing every single program instruction, and are also the operations we need to complete when writing the virtual machine.
Writing a Virtual Machine
I believe everyone can see that for the first three parts of program execution (program, compiler, instruction set), we have ready-made tools to use. Therefore, in PA 2.1, our task is to use code to accomplish the work of the hardware, which is using software to simulate a chip with the Risc-V architecture.
Memory is a continuous storage space in a computer, so we use an array to simulate memory; a register is a storage device with a clear structure, so we use a struct to simulate registers.
- Instruction Fetch: Fetching a value is very simple, we just need to read the space corresponding to the PC from the array.
- Decoding: Decoding is also not difficult; we still use regular expressions for pattern matching.
- Execution: We need to write corresponding code for each instruction to implement its functionality. For Risc-V, since the functions of the instructions in it are relatively singular, it is also relatively simple and direct. (Although the handout asks us to RTFM, I personally think STFW is a more reasonable tool here).
- Update PC: Under normal circumstances, programs only need to be executed sequentially, which means we only need to move the PC to the position of the next instruction each time. However, for some instructions designed for jumping, the PC needs to be specially handled.
And just like that, our virtual machine can run C code~
PA 2.1 Completed.
Appendix
Below is an algorithm for finding the Fibonacci sequence using recursion.
Below is the program compiled by riscv64-gcc:
```assembly 80000000 <_start>: 80000000: 00000413 li s0,0 80000004: 00009117 auipc sp,0x9 80000008: ffc10113 addi sp,sp,-4 # 80009000 <_end> 8000000c: 418000ef jal ra,80000424 <_trm_init>
80000010