As mentioned earlier, the x86-64 architecture features 16 general-purpose 64-bit registers, 16 SSE registers that are 128 bits wide, and 8 floating-point registers that are 80 bits wide:
Figure 5.1 – x86-64 CPU registers
There are extensions that build upon this base, such as the Intel Advanced Vector Extensions (AVX), which widen the 16 SSE registers to 256 bits. Let’s take a look at a page from the System V ABI specification:
Figure 5.2 – Register usage
Figure 5.1 shows an overview of the general-purpose registers in the x86-64 architecture. Of special interest to us right now are the registers marked as callee saved. These are the registers we need in order to keep track of our context across function calls. That context includes the next instruction to run, the base pointer, the stack pointer, and so on. While the registers themselves are defined by the ISA, the rules on which of them are considered callee saved are defined by the System V ABI. We’ll get to know this in more detail later.
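To make the idea of "context" a little more concrete, here is a minimal sketch of a struct with one slot per register that the System V ABI treats as callee saved on x86-64 (rsp, rbx, rbp, and r12 through r15). The names `CalleeSavedRegisters` and `CALLEE_SAVED` are hypothetical and chosen only for illustration; the sketch stores nothing by itself and only shows how much state we would have to preserve.

```rust
/// Hypothetical container with one 64-bit slot for each register that the
/// System V ABI marks as callee saved on x86-64.
/// `repr(C)` keeps the fields in declaration order, so each one sits at a
/// predictable offset.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
#[repr(C)]
pub struct CalleeSavedRegisters {
    pub rsp: u64, // stack pointer
    pub rbx: u64,
    pub rbp: u64, // base (frame) pointer
    pub r12: u64,
    pub r13: u64,
    pub r14: u64,
    pub r15: u64,
}

/// The same registers by name, in the same order as the struct fields.
pub const CALLEE_SAVED: [&str; 7] = ["rsp", "rbx", "rbp", "r12", "r13", "r14", "r15"];
```

Seven registers of 8 bytes each means that `std::mem::size_of::<CalleeSavedRegisters>()` is 56 bytes.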
Note
Windows has a slightly different convention. On Windows, the registers XMM6:XMM15 are also callee-saved and must be saved and restored if our functions use them. The code we write in this first example runs fine on Windows since we don’t really adhere to any ABI yet and just focus on how we’ll instruct the CPU to do what we want.
If we want to issue a very specific set of commands to the CPU directly, we need to write small pieces of code in assembly. Fortunately, we only need to know some very basic assembly instructions for our first mission. Specifically, we need to know how to move values to and from registers:
mov rax, rsp
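As a rough illustration, the instruction above could be issued from Rust through the `asm!` macro. This is a minimal sketch that assumes an x86-64 target; the function name `read_stack_pointer` is hypothetical. It copies the stack pointer into `rax` and hands the value back to Rust.

```rust
use std::arch::asm;

/// Hypothetical helper: returns the current value of the stack pointer.
/// Only compiles on x86-64, since it names the `rax` and `rsp` registers.
#[cfg(target_arch = "x86_64")]
pub fn read_stack_pointer() -> u64 {
    let sp: u64;
    unsafe {
        // Intel dialect: the destination comes first, so this copies rsp into rax.
        // `out("rax") sp` tells the compiler that rax holds our output afterwards.
        asm!(
            "mov rax, rsp",
            out("rax") sp,
            options(nomem, nostack, preserves_flags)
        );
    }
    sp
}
```

Calling `read_stack_pointer()` yields an address that lies close to the addresses of the caller's local variables, since those live on the same stack.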
A quick introduction to Assembly language
First and foremost, Assembly language isn’t particularly portable. It’s the lowest level of human-readable instructions we can write for the CPU, and those instructions vary from architecture to architecture. Since we will only write assembly targeting the x86-64 architecture going forward, we only need to learn a few instructions for this particular architecture.
Before we go too deep into the specifics, you need to know that there are two popular dialects used in assembly: the AT&T dialect and the Intel dialect.
The Intel dialect is the default when writing inline assembly in Rust, but we can opt in to the AT&T dialect instead if we want to. Rust has its own take on inline assembly that, at first glance, looks foreign to anyone used to inline assembly in C. It’s well designed, however, and I’ll spend a bit of time explaining it in more detail as we go through the code, so both readers with experience with C-style inline assembly and readers with no experience at all should be able to follow along.
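The difference between the two dialects is easiest to see side by side. The following is a minimal sketch, assuming an x86-64 target, with the hypothetical function names `copy_intel` and `copy_att`. Both functions copy a value from one register to another; the only differences are the operand order and the `att_syntax` option that switches dialects.

```rust
use std::arch::asm;

/// Intel dialect (the default in Rust): destination first, then source.
#[cfg(target_arch = "x86_64")]
pub fn copy_intel(src: u64) -> u64 {
    let dst: u64;
    unsafe {
        asm!(
            "mov {dst}, {src}",
            dst = out(reg) dst,
            src = in(reg) src,
            options(nomem, nostack, preserves_flags)
        );
    }
    dst
}

/// AT&T dialect, selected with `options(att_syntax)`: source first, then destination.
#[cfg(target_arch = "x86_64")]
pub fn copy_att(src: u64) -> u64 {
    let dst: u64;
    unsafe {
        asm!(
            "movq {src}, {dst}",
            dst = out(reg) dst,
            src = in(reg) src,
            options(att_syntax, nomem, nostack, preserves_flags)
        );
    }
    dst
}
```

Both `copy_intel(42)` and `copy_att(42)` return 42. The compiler picks the actual registers for `{dst}` and `{src}`, which is one of the ways Rust's inline assembly differs from writing raw assembly by hand.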