PA from Scratch - Day 05

By ten of hearts

← Back

Today is the first day of the Lunar New Year, so first off, I wish you all a Happy New Year! I hope everyone stays healthy in the new year, has smooth sailing in studies and work, and is happy every day.

Starting today, we officially enter PA2. In PA2, we need to implement a Von Neumann Computer System. That is to say, in this chapter, we can actually run some programs on our virtual machine! (Yay!!!!!!!!)

More on Programs

To be able to run programs on the virtual machine we wrote ourselves, we must fully understand the rules of how a program runs. Although I touched on this in previous articles, I haven’t systematically presented the entire execution process. Therefore, today, I will introduce in more detail how a program runs on a computer.

Program

To get a program running on a computer, we must first have a program (how fresh). Below, I will introduce based on the following piece of program.

 1int main()
 2{
 3    int a = 1, b = 0, tmp = 0;
 4    for (int i = 1; i <= 10; i++)
 5    {
 6        tmp = a;
 7        a = a + b;
 8        b = tmp;
 9    }
10}

Some people might say: Why not write such a classic recursive algorithm like the Fibonacci sequence using recursion? And you even said you like recursion! Pfft! Disgusting!

For the reason, see Appendix

This program calculates the sequence of Fibonacci numbers within 10, but it doesn’t use it to accomplish anything. To execute this code, we also need a…

Compiler

Currently, the above code is just simple text. To the machine, they are nothing special. Although we might name it fib.c, the .c suffix actually has no special effect. If we wanted to, we could also change it to .txt, .py, or even .mp3 or other suffixes without causing any impact. To execute this code, we must have a binary executable file that the machine can understand.

A compiler is a tool that can turn high-level languages like C, Java, etc., into executable files. Different languages have different compilers, and each compiler also has multiple versions corresponding to different architectures. Since my virtual machine uses the Risc-V32 architecture, we need to use the Risc-V32 version of gcc (a compiler for the C language) to compile, and then convert the binary file into assembly language (a relatively low-level language), resulting in a file like this:

 1# Memory Address       Binary Instruction                Assembly Code
 2
 380000000 <_start>:
 480000000:       00000413                li      s0,0
 580000004:       00009117                auipc   sp,0x9
 680000008:       ffc10113                addi    sp,sp,-4 # 80009000 <_end>
 78000000c:       00c000ef                jal     ra,80000018 <_trm_init>
 8
 980000010 <main>:
1080000010:       00000513                li      a0,0
1180000014:       00008067                ret
12
1380000018 <_trm_init>:
1480000018:       ff010113                addi    sp,sp,-16
158000001c:       00000517                auipc   a0,0x0
1680000020:       01c50513                addi    a0,a0,28 # 80000038 <_etext>
1780000024:       00112623                sw      ra,12(sp)
1880000028:       fe9ff0ef                jal     ra,80000010 <main>
198000002c:       00050513                mv      a0,a0
2080000030:       00100073                ebreak
2180000034:       0000006f                j       80000034 <_trm_init+0x1c>

This file is divided into three columns. The first column is the memory address where a line of compiled binary instruction is located; the second column is the hexadecimal representation (8 digits) of the binary program (32 bits) stored in this address; and the third column is the assembly program corresponding to this binary instruction, where the first column indicates the name of the instruction (opcode), and the second column indicates the operand or register name required by the instruction (operand).

Such a binary file is a program that a computer can execute!

ISA (Instruction Set Architecture)

The Risc-V instruction set defines a series of instructions, and any chip with a Risc-V architecture must be able to implement these instructions (remember? the instruction set is a specification manual).

Different instruction sets provide different numbers of instructions, so the design of the compiler must also follow the constraints of the instruction set. Risc-V is a Reduced Instruction Set Computer (Risc), which means it contains fewer instructions and the length of the instructions is fixed; while the Intel x86 instruction set we more commonly see is a Complex Instruction Set Computer (Cisc), which defines more instructions, and the length of the instructions can vary.

It should be clarified here that Risc and Cisc are design choices, and there is no absolute superiority or inferiority between the two.

  • Every Risc instruction can be completed in a single clock cycle, using simple instructions to achieve more compact program execution, thereby improving running efficiency. Also, because Cisc instructions are very complex, the circuitry designed to parse instructions is usually very complex, making it harder to improve and update the hardware.
  • Cisc has a very mature commercial system. At the same time, Cisc makes it easier to support parallel computing. I provide two articles for reference here and here.

Below, I will use x86 gcc to compile the same program, which will result in this binary file:

 100100000 <_start>:
 2  100000:       bd 00 00 00 00          mov    $0x0,%ebp
 3  100005:       bc 00 90 10 00          mov    $0x109000,%esp
 4  10000a:       e8 05 00 00 00          call   100014 <_trm_init>
 5  10000f:       90                      nop
 6
 700100010 <main>:
 8  100010:       31 c0                   xor    %eax,%eax
 9  100012:       c3                      ret
10  100013:       90                      nop
11
1200100014 <_trm_init>:
13  100014:       55                      push   %ebp
14  100015:       89 e5                   mov    %esp,%ebp
15  100017:       83 ec 14                sub    $0x14,%esp
16  10001a:       68 40 00 10 00          push   $0x100040
17  10001f:       e8 ec ff ff ff          call   100010 <main>
18  100024:       cc                      int3
19  100025:       83 c4 10                add    $0x10,%esp
20  100028:       eb fe                   jmp    100028 <_trm_init+0x14>

I’m sure everyone can clearly see the difference.

Hardware

When executing a program:

  • The PC register in the CPU records the position (memory address) of the instruction currently being executed, reads the content from it, and stores it in an instruction register (Instruction Fetch);
  • Next, the CPU extracts the opcode and operands from the instruction through a digital circuit (Decoding);
  • Then, the machine performs certain operations on the operands according to the content of the instruction (Execution);
  • Finally, the value stored in the PC is modified, thereby initializing the execution of the next instruction (Update PC).

These four steps are what all computers do when executing every single program instruction, and are also the operations we need to complete when writing the virtual machine.

Writing a Virtual Machine

I believe everyone can see that for the first three parts of program execution (program, compiler, instruction set), we have ready-made tools to use. Therefore, in PA 2.1, our task is to use code to accomplish the work of the hardware, which is using software to simulate a chip with the Risc-V architecture.

Memory is a continuous storage space in a computer, so we use an array to simulate memory; a register is a storage device with a clear structure, so we use a struct to simulate registers.

  • Instruction Fetch: Fetching a value is very simple, we just need to read the space corresponding to the PC from the array.
  • Decoding: Decoding is also not difficult; we still use regular expressions for pattern matching.
  • Execution: We need to write corresponding code for each instruction to implement its functionality. For Risc-V, since the functions of the instructions in it are relatively singular, it is also relatively simple and direct. (Although the handout asks us to RTFM, I personally think STFW is a more reasonable tool here).
  • Update PC: Under normal circumstances, programs only need to be executed sequentially, which means we only need to move the PC to the position of the next instruction each time. However, for some instructions designed for jumping, the PC needs to be specially handled.

And just like that, our virtual machine can run C code~


PA 2.1 Completed.

Appendix

Below is an algorithm for finding the Fibonacci sequence using recursion.

 1int fib(int n)
 2{
 3    if (n <= 1)
 4    {
 5        return n;
 6    }
 7    else
 8    {
 9        return fib(n-1) + fib(n-2);
10    }
11}
12
13int main()
14{
15    for (int i = 0; i <= 10; i++)
16    {
17        fib(i);
18    }
19    return 0;
20}

Below is the program compiled by riscv64-gcc:

```assembly 80000000 <_start>: 80000000: 00000413 li s0,0 80000004: 00009117 auipc sp,0x9 80000008: ffc10113 addi sp,sp,-4 # 80009000 <_end> 8000000c: 418000ef jal ra,80000424 <_trm_init>

80000010 : 80000010: 00100793 li a5,1 80000014: 3aa7d063 bge a5,a0,800003b4 <fib+0x3a4> 80000018: f5010113 addi sp,sp,-176 8000001c: ffe50793 addi a5,a0,-2 80000020: 0a912223 sw s1,164(sp) 80000024: 02f12423 sw a5,40(sp) 80000028: ffd50493 addi s1,a0,-3 8000002c: ffe7f793 andi a5,a5,-2 80000030: 40f487b3 sub a5,s1,a5 80000034: fff50f93 addi t6,a0,-1 80000038: 00048293 mv t0,s1 8000003c: 0a112623 sw ra,172(sp) 80000040: 0a812423 sw s0,168(sp) 80000044: 0b212023 sw s2,160(sp) 80000048: 09312e23 sw s3,156(sp) 8000004c: 09412c23 sw s4,152(sp) 80000050: 09512a23 sw s5,148(sp) 80000054: 09612823 sw s6,144(sp) 80000058: 09712623 sw s7,140(sp) 8000005c: 09812423 sw s8,136(sp) 80000060: 09912223 sw s9,132(sp) 80000064: 09a12023 sw s10,128(sp) 80000068: 07b12e23 sw s11,124(sp) 8000006c: 02f12223 sw a5,36(sp) 80000070: 00012623 sw zero,12(sp) 80000074: 00100393 li t2,1 80000078: 000f8493 mv s1,t6 8000007c: 00048a93 mv s5,s1 80000080: 28748c63 beq s1,t2,80000318 <fib+0x308> 80000084: fff48793 addi a5,s1,-1 80000088: 00f12c23 sw a5,24(sp) 8000008c: ffd48793 addi a5,s1,-3 80000090: 00f12423 sw a5,8(sp) 80000094: 00012823 sw zero,16(sp) 80000098: 02912623 sw s1,44(sp) 8000009c: 02512823 sw t0,48(sp) 800000a0: 01812783 lw a5,24(sp) 800000a4: 30778063 beq a5,t2,800003a4 <fib+0x394> 800000a8: 00812783 lw a5,8(sp) 800000ac: ffca8f13 addi t5,s5,-4 800000b0: ffea8f93 addi t6,s5,-2 800000b4: ffe7f793 andi a5,a5,-2 800000b8: 40ff07b3 sub a5,t5,a5 800000bc: 00000493 li s1,0 800000c0: 01f12e23 sw t6,28(sp) 800000c4: 02f12023 sw a5,32(sp) 800000c8: 00048293 mv t0,s1 800000cc: 000f8a13 mv s4,t6 800000d0: 1e7f8463 beq t6,t2,800002b8 <fib+0x2a8> 800000d4: ffff8793 addi a5,t6,-1 800000d8: 00000913 li s2,0 800000dc: 02512c23 sw t0,56(sp) 800000e0: 00f12a23 sw a5,20(sp) 800000e4: ffdf8e93 addi t4,t6,-3 800000e8: 03e12a23 sw t5,52(sp) 800000ec: 00090293 mv t0,s2 800000f0: 01412783 lw a5,20(sp) 800000f4: 2a778263 beq a5,t2,80000398 <fib+0x388> 800000f8: ffea0f13 addi t5,s4,-2 800000fc: ffeef713 andi a4,t4,-2 80000100: ffca0a13 addi s4,s4,-4 80000104: 40ea0733 sub a4,s4,a4 80000108: 000f0993 mv s3,t5 8000010c: 00000d13 li s10,0 80000110: 00070d93 mv s11,a4 80000114: 00098793 mv a5,s3 80000118: 14798c63 beq s3,t2,80000270 <fib+0x260> 8000011c: fff98713 addi a4,s3,-1 80000120: ffd98913 addi s2,s3,-3 80000124: 00000c93 li s9,0 80000128: 24770c63 beq a4,t2,80000380 <fib+0x370> 8000012c: ffe78c13 addi s8,a5,-2 80000130: ffa78513 addi a0,a5,-6 80000134: ffe97693 andi a3,s2,-2 80000138: ffc78613 addi a2,a5,-4 8000013c: ffb78493 addi s1,a5,-5 80000140: 000c0593 mv a1,s8 80000144: 40d507b3 sub a5,a0,a3 80000148: 00090893 mv a7,s2 8000014c: 00000813 li a6,0 80000150: 00058693 mv a3,a1 80000154: 0e758063 beq a1,t2,80000234 <fib+0x224> 80000158: 000c0313 mv t1,s8 8000015c: 00048b13 mv s6,s1 80000160: 00088a93 mv s5,a7 80000164: 00000b93 li s7,0 80000168: 00048c13 mv s8,s1 8000016c: 227a8063 beq s5,t2,8000038c <fib+0x37c> 80000170: ffe68493 addi s1,a3,-2 80000174: ffeb7513 andi a0,s6,-2 80000178: ffc68693 addi a3,a3,-4 8000017c: 40a68433 sub s0,a3,a0 80000180: 00048e13 mv t3,s1 80000184: 00000693 li a3,0 80000188: 000e0513 mv a0,t3 8000018c: 06612623 sw t1,108(sp) 80000190: 07e12423 sw t5,104(sp) 80000194: 07112223 sw a7,100(sp) 80000198: 06b12023 sw a1,96(sp) 8000019c: 04c12e23 sw a2,92(sp) 800001a0: 04f12c23 sw a5,88(sp) 800001a4: 04e12a23 sw a4,84(sp) 800001a8: 04d12823 sw a3,80(sp) 800001ac: 05012623 sw a6,76(sp) 800001b0: 04512423 sw t0,72(sp) 800001b4: 05f12223 sw t6,68(sp) 800001b8: 05d12023 sw t4,64(sp) 800001bc: 03c12e23 sw t3,60(sp) 800001c0: e51ff0ef jal ra,80000010 800001c4: 03c12e03 lw t3,60(sp) 800001c8: 05012683 lw a3,80(sp) 800001cc: 04012e83 lw t4,64(sp) 800001d0: ffee0e13 addi t3,t3,-2 800001d4: 04412f83 lw t6,68(sp) 800001d8: 04812283 lw t0,72(sp) 800001dc: 04c12803 lw a6,76(sp) 800001e0: 05412703 lw a4,84(sp) 800001e4: 05812783 lw a5,88(sp) 800001e8: 05c12603 lw a2,92(sp) 800001ec: 06012583 lw a1,96(sp) 800001f0: 06412883 lw a7,100(sp) 800001f4: 06812f03 lw t5,104(sp) 800001f8: 06c12303 lw t1,108(sp) 800001fc: 00a686b3 add a3,a3,a0 80000200: 00100393 li t2,1 80000204: f9c412e3 bne s0,t3,80000188 <fib+0x178>

Tags: work cs

← Back