Hello from a compiler-free x86-64 world
I've stumbled upon two posts where Jessica McKellar demonstrates how to make a C program on Linux without using libc. They were written in 2010 when 32-bit x86 was far from extinct though, and use the old 32-bit ABI—a problem most examples of low level programming on UNIX-like systems suffer from, even if it's not their fault.
I'll take it to the next level and show you how to live wild and free of libc, compilers, and common sense, all in the native x86-64 way. Both assembly programming and ABI conventions are of little practical use whenever you are not writing a compiler, a kernel, or a runtime library, but I like the premise of those posts: making a program whose every instruction can be easily explained.
This is more of a post about x86-64 than about assembly or ELF, so I assume some familiarity with the x86 assembly language, although I tried to annotate the examples with pseudocode and avoid some common tricks to make it more beginner-friendly. If you want to learn about x86 assembly, there's an excellent (even if old) Programming from the Ground Up book by Jonathan Bartlett.
We'll use the GNU assembler and linker for this example.
Simply printing a “hello world” line on screen is too trivial. I wanted something that can demonstrate more than one concept, preferably most of the important concepts, but still isn't overly complex for a first example.
We'll write a program that accepts an unlimited number of names as command line arguments and prints a greeting for each of them:
    $ greet Dennis Brian Ken
    Hello Dennis!
    Hello Brian!
    Hello Ken!
To do this, we need to know at least:
- How to find command line arguments in the process' memory
- How to use system calls
- How to implement and call functions
While in a pure assembly program one can do many things in an ad hoc way and still end up with a working program, we'll make our implementation more or less compliant with the system's conventions.
The system ABI is specific not only to the OS, but also to the CPU architecture. Most UNIX-like systems use a variant of the System V ABI, which is divided into a part applicable to all systems (it describes the ELF file format among other things) and architecture-specific supplements.
The ABI of Linux on x86-64 is defined by the x86-64 Architecture Processor Supplement.
The entry point that ld is looking for is _start. To prevent ld from linking the prologue required for linking to libc and other fancy things, we need to use the -nostdlib option.
System call conventions
The old convention used interrupt 0x80 for invoking system calls. The x86-64 architecture introduced a new instruction, syscall, specifically for this purpose. The convention for passing system call numbers and arguments has also changed.
Let's read section A.2.1. It says that:
- The system call number is passed in rax.
- System call arguments should be in registers rdi, rsi, rdx, r10, r8, and r9, in that order.
- System call arguments are never passed on the stack anymore (system calls are limited to six arguments).
- System calls destroy registers rcx and r11.
- The result is in rax; negative values indicate that it's an error (the value is -errno).
You can find system call numbers in /usr/include/asm/unistd_64.h. They are technically subject to change, but I don't think they ever changed.
With this knowledge we can write a minimal program that exits correctly:
    .text
    _start:
        mov $60, %rax # 60 = exit
        mov $0, %rdi
        syscall
    .global _start
    $ as -o dummy.o ./dummy.s && ld -nostdlib -o dummy ./dummy.o
    $ ./dummy && echo $?
    0
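The last rule of the convention, that a negative result means an error, is easy to see in action. Here is a minimal sketch, not part of the final program: it tries to write to file descriptor 42, which I assume is not open in the process, and uses the negated result as the exit status. The file name badfd.s and the labels msg, msg_len, and ok are made up for illustration; write is system call number 1 on Linux x86-64.

```asm
# badfd.s (hypothetical name): show how a failed system call reports its error
.section .rodata
msg:
    .ascii "never shown\n"
.set msg_len, . - msg       # length of msg, computed by the assembler

.section .text
.global _start
_start:
    mov $1, %rax            # syscall = __NR_write
    mov $42, %rdi           # fd 42: ASSUMED not to be open
    mov $msg, %rsi          # buffer
    mov $msg_len, %rdx      # length
    syscall                 # destroys rcx and r11, result is in rax

    # if(rax >= 0) goto ok   (rax = number of bytes written)
    cmp $0, %rax
    jge ok

    # rax is -errno here, turn it into a positive exit status
    neg %rax
    mov %rax, %rdi
    mov $60, %rax           # syscall = __NR_exit
    syscall

ok:
    mov $60, %rax           # syscall = __NR_exit
    mov $0, %rdi
    syscall
```

Built with the same as and ld commands as above, ./badfd followed by echo $? should print 9 on Linux (EBADF, bad file descriptor), provided descriptor 42 really is closed.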
Where are the arguments?
To get command line arguments, we need to know where they are in the program memory. Section 3.4.1 of the ABI document (Initial Stack and Register State) has figure 3.9, which is a table describing initial stack layout. It says that at the top of the stack is argument count, followed by pointers to argument strings.
Let's try something:
    .text
    _start:
        pop %rdi
        mov $60, %rax
        syscall
    .global _start
This program pops the supposed argument count off the top of the stack and uses it as an argument to the exit system call, so its exit code will be the number of arguments we gave it.
    $ as -o argc.o ./argc.s && ld -nostdlib -o argc ./argc.o
    $ ./argc || echo $?
    1
    $ ./argc foo || echo $?
    2
    $ ./argc foo bar baz quux || echo $?
    5
So far so good, it appears to work as expected.
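Popping is not the only way to get at these values. Since the argument count and the argument pointers sit at fixed offsets from the initial stack pointer, they can also be read in place. The sketch below (file name first.s and the label done are hypothetical) exits with the first byte of the first real argument as its status, or 0 if there is no argument.

```asm
# first.s (hypothetical name): read argc and argv[1] without popping
.text
.global _start
_start:
    mov (%rsp), %rax        # rax = argc (top of the stack)
    mov $0, %rdi            # default exit status

    # if(argc < 2) goto done
    cmp $2, %rax
    jl done

    # 8(%rsp) is argv[0], the program name; 16(%rsp) is argv[1]
    mov 16(%rsp), %rsi      # rsi = argv[1]
    movzbq (%rsi), %rdi     # rdi = first byte of argv[1], zero-extended

done:
    mov $60, %rax           # syscall = __NR_exit
    syscall
```

Running ./first A and then echo $? should print 65, the ASCII code of "A".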
System call arguments come in the same order as we see them in the API reference for C.
We can make a couple of macros for write and exit for future use right away:
    .macro write filedescr bufptr length
        mov $1, %rax # syscall = __NR_write
        mov \filedescr, %rdi
        mov \bufptr, %rsi
        mov \length, %rdx
        syscall
    .endm

    .macro exit status
        mov $60, %rax # syscall = __NR_exit
        mov \status, %rdi
        syscall
    .endm
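To see the macros at work before they are used in the real program, here is a small self-contained sketch. The message, the msg and msg_len labels, and the file name macros.s are made up for this example; the macros themselves are the ones from the text. I separate the macro arguments with commas here, which the GNU assembler accepts just like spaces.

```asm
# macros.s (hypothetical name): print a fixed string using the two macros
.macro write filedescr bufptr length
    mov $1, %rax            # syscall = __NR_write
    mov \filedescr, %rdi
    mov \bufptr, %rsi
    mov \length, %rdx
    syscall
.endm

.macro exit status
    mov $60, %rax           # syscall = __NR_exit
    mov \status, %rdi
    syscall
.endm

.section .rodata
msg:
    .ascii "macros work\n"
.set msg_len, . - msg       # length of msg, computed by the assembler

.section .text
.global _start
_start:
    # Expands to: mov $1, %rax / mov $1, %rdi / mov $msg, %rsi /
    #             mov $msg_len, %rdx / syscall
    write $1, $msg, $msg_len
    exit $0
```

One thing to keep in mind is that macros are plain text substitution, not function calls. An argument that names a register the macro overwrites earlier in its body will not survive: passing %rax as the file descriptor, for instance, would be clobbered by the first mov. Immediates, symbols, and registers the macro doesn't touch are safe.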
Function call conventions
Remember that we are trying to live free of the standard library? For writing characters to stdout we can use the write system call, but it needs to know the string length, and we do not have a ready-made function for this. We'll have to make one ourselves, and for that we need to learn about function calling conventions.
One important thing is register ownership. If you call a function and it mangles your data, it's not a nice situation, so all calling conventions specify which registers must be preserved by the callee, so that you can be sure that calling a function will not destroy them. In our x86-64 convention these are rbx, rbp, and r12-r15.
To save time, we can make a couple of macros that save those registers on the stack and then restore them:
    .macro save_registers
        push %rbx
        push %rbp
        push %r12
        push %r13
        push %r14
        push %r15
    .endm

    .macro restore_registers
        pop %r15
        pop %r14
        pop %r13
        pop %r12
        pop %rbp
        pop %rbx
    .endm
Of course we could use a more granular approach with fewer instructions, but then we'd have to keep track of the registers we used. I'd rather leave that to compilers since they don't get tired of tracking registers halfway through.
The other part is argument passing. At its core it's very similar to the system call convention, though it's more extensive and obviously does allow passing arguments on the stack. Since we only need pointers and integers no wider than 64 bits for our task, we can pass all arguments in registers. The sequence is rdi, rsi, rdx, rcx, r8, r9, as section 3.2.3 suggests.
The return value should be in rax.
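As a quick illustration of the argument and return value registers, here is a tiny sketch with a made-up function, add_two, that takes its two integer arguments in rdi and rsi and returns their sum in rax. It only touches registers the caller doesn't expect to survive, so it has nothing to save.

```asm
# add.s (hypothetical name): pass two arguments, get a result back
.text

add_two: # (rdi = a, rsi = b), returns a + b in rax
    mov %rdi, %rax
    add %rsi, %rax
    ret
.type add_two, @function

.global _start
_start:
    mov $2, %rdi            # first argument
    mov $3, %rsi            # second argument
    call add_two            # rax = add_two(2, 3)

    mov %rax, %rdi          # use the result as the exit status
    mov $60, %rax           # syscall = __NR_exit
    syscall
```

Assembled and linked like the earlier examples, the program should exit with status 5.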
Let's write a simple strlen function. Strings are null-terminated, so to find the length of a string we can take characters from it one by one, compare each to zero, and increment the accumulator if it's not a null byte. The accumulator will also serve as the string index.
    strlen:
        save_registers
        # r12 will be used as an accumulator and string index,
        # which is fine since maximum string index is its length
        mov $0, %r12
    strlen_loop:
        # r13b = rdi[r12]
        mov (%rdi, %r12, 1), %r13b
        # if(r13b == 0) goto strlen_return
        cmp $0, %r13b
        je strlen_return
        inc %r12
        jmp strlen_loop
    strlen_return:
        mov %r12, %rax # save return value in rax
        restore_registers
        ret
    .type strlen, @function
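For comparison, this is roughly what the more granular approach could look like. The sketch below is a hypothetical variant, strlen_scratch, that does the counting directly in rax. Since rax is not one of the callee-preserved registers, the function has nothing to push or pop. The _start part is only a test harness I added: it exits with the length of the first real argument, read from the stack at 16(%rsp), or with 0 if there is none.

```asm
# len.s (hypothetical name): strlen without saving any registers
.text

strlen_scratch: # (rdi = buf), returns the length in rax
    mov $0, %rax            # rax = accumulator and string index
strlen_scratch_loop:
    # if(buf[rax] == 0) goto strlen_scratch_return
    cmpb $0, (%rdi, %rax, 1)
    je strlen_scratch_return
    inc %rax
    jmp strlen_scratch_loop
strlen_scratch_return:
    ret
.type strlen_scratch, @function

.global _start
_start:
    mov $0, %rdi            # default exit status

    # if(argc < 2) goto done
    cmpq $2, (%rsp)
    jl done

    mov 16(%rsp), %rdi      # rdi = argv[1]
    call strlen_scratch
    mov %rax, %rdi          # exit status = length of argv[1]

done:
    mov $60, %rax           # syscall = __NR_exit
    syscall
```

With this harness, ./len hello followed by echo $? should print 5. The price of the shorter code is exactly the bookkeeping mentioned above: the moment the function starts using rbx or r12, it has to save and restore it.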
We'll also need a puts function, but now that we have strlen, it will be simple.
The complete program
Now we are ready to put the program together.
    # The ABI demands that registers rbx, rbp, and r12-r15
    # must be preserved by the callee, so we push them to stack
    # when a function is called and pop them back before returning
    # back to the caller
    .macro save_registers
        push %rbx
        push %rbp
        push %r12
        push %r13
        push %r14
        push %r15
    .endm

    .macro restore_registers
        pop %r15
        pop %r14
        pop %r13
        pop %r12
        pop %rbp
        pop %rbx
    .endm

    # Macros for system calls
    # The numbers can be found in /usr/include/asm/unistd_64.h on Linux
    .macro write filedescr bufptr length
        mov $1, %rax # syscall = __NR_write
        mov \filedescr, %rdi
        mov \bufptr, %rsi
        mov \length, %rdx
        syscall
    .endm

    .macro exit status
        mov $60, %rax # syscall = __NR_exit
        mov \status, %rdi
        syscall
    .endm

    .section .rodata

    hello_begin:
        .ascii "Hello \0"

    hello_end:
        .ascii "!\n\0"

    .section .text

    strlen: # (rdi = buf)
        save_registers
        # r12 will be used as an accumulator and string index,
        # which is fine since maximum string index is its length
        mov $0, %r12
    strlen_loop:
        # r13b = buf[r12]
        mov (%rdi, %r12, 1), %r13b
        # if(r13b == 0) goto strlen_return
        cmp $0, %r13b
        je strlen_return
        inc %r12
        jmp strlen_loop
    strlen_return:
        mov %r12, %rax # save the return value in rax
        restore_registers
        ret
    .type strlen, @function

    puts: # (rdi = filedescr, rsi = buf)
        save_registers
        # Save original arguments
        mov %rdi, %r12 # r12 = filedescr
        mov %rsi, %r13 # r13 = buf
        # rax = strlen(buf)
        mov %r13, %rdi
        call strlen
        mov %rax, %r14 # r14 = strlen(buf)
        write %r12 %r13 %r14
        restore_registers
        ret
    .type puts, @function

    _start:
        # argc is the first thing on the stack when a process is launched
        # load it into r12 for using it as a loop counter
        pop %r12 # r12 = argc
        # argc is followed by pointers to argument strings
        # the first argument is the program name, for the purpose of this program
        # we do not need it
        # *argv is the program name, ignore it
        pop %r13 # r13 = argv[0]
        dec %r12 # argc--
        # put the first real argument in r13
        pop %r13 # r13 = argv[1]
    main_loop:
        # if(argc == 0) goto exit
        cmp $0, %r12
        je exit
        # puts(hello_begin)
        mov $1, %rdi # rdi = STDOUT
        mov $hello_begin, %rsi
        call puts
        # puts(current argument)
        mov $1, %rdi # rdi = STDOUT
        mov %r13, %rsi
        call puts
        # puts(hello_end)
        mov $1, %rdi # rdi = STDOUT
        mov $hello_end, %rsi
        call puts
        pop %r13 # fetch next argument
        dec %r12
        jmp main_loop
    exit:
        exit $0
    .global _start
Let's try greeting the ABI document authors with it:
    $ as -o greet.o ./greet.s && ld -nostdlib -o greet ./greet.o
    $ ./greet Michael Jan Andreas Mark
    Hello Michael!
    Hello Jan!
    Hello Andreas!
    Hello Mark!
Porting to FreeBSD
An interesting fact is that a number of systems adopted the same ABI on x86-64 that was first introduced in Linux, including FreeBSD. I decided to test the portability of this program and the result is a partial success.
If you change the system call numbers to 1 for exit and 4 for write (they are listed in the FreeBSD system headers), the program assembles fine and works for some inputs, but has an issue that makes it segfault on others, which I didn't have time to track down:
    $ ./greet Marshall Bill Keith
    Hello Marshall!
    Hello Bill!
    Hello Keith!
    $ ./greet foo bar baz
    Hello ./greet!
    Hello foo!
    Hello bar!
    Hello baz!
    Hello Segmentation fault (core dumped)
If you find what the problem is, please let me know.