CS456 - Systems Programming

Using the C standard library in assembly

To use the C standard library we will use gcc (the driver program of the GNU Compiler Collection) to link our code against the library. The plain linker ld can do the job too, but it would need the C start-up object files, the dynamic linker and libc all named explicitly, which makes for an excessively long command. Linking with gcc also links our program with the C Run-Time (CRT) start-up code, which provides the _start entry point and expects a main function to call. So to use the C standard library functions in our programs we will need to do the following:

Use main as the entry point for our program. Like _start, it must be declared global. Our main function will be called by the C run-time.
Set up a stack frame for main and keep the space reserved in it a multiple of 16 bytes. The x86_64 calling convention requires rsp to be 16-byte aligned whenever a function is called, and some C library functions crash if it is not.
Optionally, use leave and ret to return from main, which ends the program. One may still use the exit system call, or preferably the C exit function (which flushes any buffered output first), to exit the program immediately.
The return value for a C function is stored in rax and so we should place the return value for main in rax prior to returning from the function.

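The C library functions use (almost) the same parameter order as system calls. The differences are that rax does not select the function being called, though it does usually hold the return value, and that the 4th parameter goes in rcx rather than r10 (system calls use r10 only because the syscall instruction overwrites rcx). For variadic functions such as printf, al must hold the number of vector registers used for arguments, which is 0 when no floating-point values are passed. To use a C function, it must be declared extern in the same way our library functions have been. Thus the parameter order for C functions is: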

(rdi, rsi, rdx, rcx, r8, r9, ...)

Any additional parameters should be pushed to the stack starting with the last parameter first, working down to the 7th parameter.
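
As a reference point for the register order above, here is a small C sketch of a function with eight integer parameters. The name sum8 is hypothetical and chosen only for illustration. The comments show where the calling convention places each integer argument, so an assembly caller would load the first six into registers and push the last two.

```c
/* Hypothetical helper: eight integer parameters, so two spill to the stack. */
long sum8(long a,   /* rdi */
          long b,   /* rsi */
          long c,   /* rdx */
          long d,   /* rcx */
          long e,   /* r8  */
          long f,   /* r9  */
          long g,   /* 7th: on the stack, pushed second (nearest the return address) */
          long h)   /* 8th: on the stack, pushed first */
{
    return a + b + c + d + e + f + g + h;   /* the result is returned in rax */
}
```

After the call returns, the caller is responsible for removing the two pushed values again, for example with add rsp, 16.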

To compile your program, assemble it in the normal way with nasm into a .o object file. The linking is then done with gcc in the same manner that gcc is used to link any other object file into a working program.

Example: print.s

extern printf

SECTION .data
      fmtstr: db `The answer is = %d\n`, 0

SECTION .text

global main
main:
        enter   16, 0

        mov     rax, 0          ; Tell printf we use 0 vector (SSE) registers.
        mov     rdi, fmtstr
        mov     rsi, 42
        call    printf          ; printf("The answer is = %d\n", 42);

        leave
        mov     rax, 0
        ret                     ; return 0;

Then to compile:

nasm -g -F dwarf -felf64 print.s

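# To link with shared libraries:
gcc -o print print.o
# If your gcc builds position-independent executables by default and the
# link fails with a relocation error, try: gcc -no-pie -o print print.o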

# To link as a static executable:
gcc -static -o print print.o

gcc will link with the C standard library (libc) by default. Any other library we might want to link to requires the -l option followed by the name of the library without its lib prefix, such as -lm to link to the math library (libm).

The "heap"

The heap is the area of our process's memory above the BSS section that can be extended upwards toward the memory mapping segment (where shared libraries are mapped into the process's address space). The size of the heap is adjusted with the kernel system call brk, which moves the program break, the point in memory where the heap ends. The program break can be moved up or down as the program requires memory. To use the heap we will use the C library's dynamic memory allocation functions, which make managing the heap's memory much simpler.
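
To make the program break a little more concrete, the following C sketch moves it up and back down with sbrk, the C library wrapper around brk. It assumes a Linux/glibc system where sbrk is available. The function name break_growth is hypothetical. In real programs we would leave this to malloc rather than move the break ourselves.

```c
#include <stdint.h>
#include <unistd.h>

/*
 * Grows the heap by `bytes` using sbrk() and reports how far the program
 * break actually moved. Returns -1 if the kernel refused the request.
 */
long break_growth(long bytes)
{
    void *before = sbrk(bytes);          /* returns the OLD program break */
    if (before == (void *)-1)
        return -1;

    void *after = sbrk(0);               /* sbrk(0) only reports the current break */
    long moved = (long)((intptr_t)after - (intptr_t)before);

    sbrk(-bytes);                        /* move the break back down again */
    return moved;
}
```

Calling break_growth(4096) asks for one page of extra heap and then gives it back.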

Dynamic memory allocation (in C)

The dynamic memory allocation functions in C consist primarily of the following:

       #include <stdlib.h>

       void *malloc(size_t size);
       void free(void *ptr);
       void *calloc(size_t nmemb, size_t size);
       void *realloc(void *ptr, size_t size);
       void *reallocarray(void *ptr, size_t nmemb, size_t size);

On x86_64 Linux the type size_t is an unsigned long integer (64 bits), large enough to represent the size of the largest in-memory object allowable on the system. The type void * represents a (64 bit) address of some memory area, in C often referred to as a pointer.

malloc

void *malloc(size_t size);

The malloc function allocates size bytes of memory from the heap and returns the address of the allocation, or 0 (NULL) if the memory could not be allocated. The contents of the new allocation are not initialized.

Example:

        mov     rdi, 42         ; Allocate 42 bytes of memory
        call    malloc          ; rax = malloc(42);
        ; address of the allocation is now in rax

free

void free(void *ptr);

free returns an allocated area back to the heap, where it may be handed out again by a later call to malloc. The ptr passed to free must be exactly the address returned by a previous malloc (or calloc or realloc) call that has not already been freed; anything else is an error and will likely cause the program to abort. Passing 0 (NULL) is allowed and does nothing.
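
The usual pairing of malloc and free can be sketched in C as follows. The function name sum_of_squares is hypothetical. It only illustrates the pattern of allocating, checking the returned address, using the memory, and then handing the same address back to free.

```c
#include <stdlib.h>

/*
 * Sums the squares 0*0 + 1*1 + ... + (n-1)*(n-1) using a temporary
 * heap array. Returns -1 if the allocation fails.
 */
long sum_of_squares(size_t n)
{
    long *squares = malloc(n * sizeof *squares);   /* contents start out uninitialized */
    if (squares == NULL)
        return -1;

    long total = 0;
    for (size_t i = 0; i < n; i++) {
        squares[i] = (long)(i * i);
        total += squares[i];
    }

    free(squares);      /* must be the exact address malloc returned */
    return total;
}
```

In assembly the same shape applies: test rax against 0 after the call to malloc, and keep the returned address somewhere safe so that it can later be placed in rdi for the call to free.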

realloc

void *realloc(void *ptr, size_t size);

If we want to resize a previous allocation we use the realloc function, passing it the address of the allocation we want to resize (ptr) and the new size required (size). Conceptually, realloc works like this:

mallocs a new area of size size for the allocation
copies the data in the old allocation to the new allocation (or as much as will fit)
frees the old allocation
returns the address of the new allocation

Thus the old address should be treated as invalid after a successful realloc and replaced with the address it returns. (In practice realloc may resize the allocation in place and return the same address.) If realloc fails it returns 0 (NULL) and the old allocation remains valid.
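
A common way to use realloc safely is to store its result in a temporary first, because realloc reports failure by returning NULL while leaving the original allocation intact. The C sketch below grows an array of integers on demand. The name append_int and the doubling strategy are assumptions made for illustration, not something the C library prescribes.

```c
#include <stdlib.h>

/*
 * Appends `value` to a growable int array. `*len` is the number of elements
 * in use and `*cap` the number allocated. Returns the (possibly moved) array,
 * or NULL on failure, in which case the original array is left untouched.
 */
int *append_int(int *arr, size_t *len, size_t *cap, int value)
{
    if (*len == *cap) {
        size_t new_cap = (*cap == 0) ? 4 : *cap * 2;
        int *tmp = realloc(arr, new_cap * sizeof *tmp);  /* realloc(NULL, n) acts like malloc(n) */
        if (tmp == NULL)
            return NULL;        /* arr is still valid and still owned by the caller */
        arr = tmp;              /* the old address must not be used from here on */
        *cap = new_cap;
    }
    arr[(*len)++] = value;
    return arr;
}
```

The caller starts with a NULL array and zero length and capacity, and always continues with the address that append_int returns.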

calloc

void *calloc(size_t nmemb, size_t size);

Almost everything we want to do can be done with just malloc, free and realloc. calloc is useful for allocating arrays of data, and it also differs in that the allocated memory is "zeroed", i.e. every byte of the allocation is set to 0, so no left-over junk from a previously allocated, used and then freed section of memory ends up in the allocation. "Sanitizing" memory in this manner is good for security purposes and helps avoid unexpected errors.

The parameter nmemb is the number of elements, each of size bytes, to allocate; thus the total amount of memory allocated is nmemb × size bytes.
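
The zeroing behaviour can be checked with a short C sketch. The function name count_nonzero_after_calloc is hypothetical. It allocates an array with calloc and counts how many elements are not zero, which should always be none.

```c
#include <stdlib.h>

/*
 * Allocates `n` ints with calloc and counts how many of them are non-zero.
 * Returns -1 if the allocation fails; otherwise the count, which should be 0.
 */
long count_nonzero_after_calloc(size_t n)
{
    int *values = calloc(n, sizeof *values);   /* n elements of sizeof(int) bytes each */
    if (values == NULL)
        return -1;

    long nonzero = 0;
    for (size_t i = 0; i < n; i++) {
        if (values[i] != 0)
            nonzero++;
    }

    free(values);
    return nonzero;
}
```

From assembly the call looks like malloc with one extra parameter: the element count goes in rdi and the element size in rsi.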

reallocarray

void *reallocarray(void *ptr, size_t nmemb, size_t size);

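reallocarray is like a cross between realloc and calloc: it resizes an allocation like realloc, but takes the element count and element size as separate parameters like calloc. Unlike calloc it does not zero any of the new allocation. Its advantage over calling realloc with nmemb × size is that it fails, returning 0 (NULL), if that multiplication would overflow. If you are using calloc, then using reallocarray is more natural than realloc.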
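
Because reallocarray is not available in every C library, the sketch below shows a portable stand-in built on realloc that checks whether nmemb × size would overflow before resizing. The name checked_reallocarray is hypothetical, and this is not the C library's actual implementation, only an illustration of the idea.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for reallocarray: refuse the request if
 * nmemb * size would overflow size_t, otherwise defer to realloc.
 */
void *checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
    if (nmemb != 0 && size > SIZE_MAX / nmemb) {
        errno = ENOMEM;         /* the product does not fit in a size_t */
        return NULL;            /* ptr is left untouched */
    }
    return realloc(ptr, nmemb * size);
}
```

Without such a check, an overflowing multiplication would wrap around to a small number and the program would receive a far smaller allocation than it believes it has.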

C pointers and arrays of pointers in Assembly

In C, when defining the type of a variable, an asterisk (*) defines the variable as a pointer, i.e. the address where the actual data is located. It may point to an array of data (i.e. consecutive values stored one after the other), however the address only refers to the very first value. To access the next value in the array, the address is incremented by the size of the data type.

In X86_64, all addresses are always 64 bits or a quad-word in size.

Sizes of C integer types (on x86_64 Linux):

Type	Size	Size in bits
`char`	byte	8
`short`	word	16
`int`	double word	32
`long`	quad word	64
Any pointer	quad word	64
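
The sizes in the table above can be confirmed from C with sizeof. This small sketch assumes an x86_64 Linux system, and the function name sizes_match_table is hypothetical.

```c
#include <stddef.h>

/*
 * Returns 1 if this platform's type sizes (in bytes) match the table
 * above, which is the case on x86_64 Linux, and 0 otherwise.
 */
int sizes_match_table(void)
{
    return sizeof(char)   == 1 &&   /* byte        */
           sizeof(short)  == 2 &&   /* word        */
           sizeof(int)    == 4 &&   /* double word */
           sizeof(long)   == 8 &&   /* quad word   */
           sizeof(void *) == 8;     /* quad word   */
}
```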

char *var

With one * (asterisk) this represents a pointer to an array of characters. The value of var is the address of the first character.


  var ─► [ 'a' ][ 'b' ][ 'c' ][  0  ]
            └──────┴──────┴──────┴───── Individual bytes of memory

SECTION .bss
  var: resq 1

SECTION .text
        mov     rdi, 4
        call    malloc
        mov     QWORD[var], rax         ; var = malloc(4);

        mov     rsi, QWORD[var]         ; place address in a register so we can offset it.
        mov     BYTE[rsi+0], 'a'        ; var[0] = 'a';
        mov     BYTE[rsi+1], 'b'        ; var[1] = 'b';
        mov     BYTE[rsi+2], 'c'        ; var[2] = 'c';
        mov     BYTE[rsi+3], 0          ; var[3] = '\0';

int *var

Like char *, the variable var points to (i.e. is the address of) memory holding one or more integers. Since each integer is 4 bytes (a double word), to access the next integer in memory one has to increase the address by 4. We can also do this by multiplying the index position by 4, which the assembly language supports directly through the scaling factor in the X86_64 addressing mode.


  var ─► [ 1 ][ 2 ][ 3 ][ 0 ]
           └────┴────┴────┴───── Individual integers in memory (each 4 bytes in size)

  var[n] =>  DWORD [rsi + n * 4]      (rsi holds the address stored in var)

SECTION .bss
  var: resq 1

SECTION .text
        mov     rdi, 16                 ; 16 bytes is 4 integers of space
        call    malloc
        mov     QWORD[var], rax         ; var = malloc(4 * sizeof(int));

        mov     rsi, QWORD[var]         ; place address in a register so we can offset it.
        mov     DWORD[rsi+0*4], 1       ; var[0] = 1;
        mov     DWORD[rsi+1*4], 2       ; var[1] = 2;
        mov     DWORD[rsi+2*4], 3       ; var[2] = 3;
        mov     DWORD[rsi+3*4], 0       ; var[3] = 0;
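
The C compiler applies the same scaling automatically whenever an int pointer is indexed. The sketch below measures the distance in bytes between element n and the start of an array, which comes out as n × 4. The function name byte_offset is hypothetical, and the caller is assumed to pass an index that lies within the array.

```c
#include <stddef.h>

/*
 * Returns how many bytes past `base` the element base[n] is located,
 * computed from the actual addresses rather than from n * 4 directly.
 * The caller must make sure n is within the array that base points to.
 */
size_t byte_offset(const int *base, size_t n)
{
    const char *start = (const char *)base;         /* address of base[0] */
    const char *elem  = (const char *)(base + n);   /* address of base[n] */
    return (size_t)(elem - start);
}
```

For an array of four integers, byte_offset(values, 3) gives the same displacement of 12 bytes that the assembly above writes as 3*4.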

int **var

A double pointer is a pointer to (the address of) an array of pointers, each of which points to an array of the given type. In many ways this is like a 2-dimensional array: var[r][c] where r represents the row and c the column position. In C:

// Allocates 4 rows:                       
int **var = malloc(sizeof(int *) * 4);     // var ─► [0]
                                           //        [1]
// Allocates the columns for row 2:        //        [2] ─► [0][1][2][3]
var[2] = malloc(sizeof(int) * 4);          //        [3]

// Set column 1, row 2 to the value 0:
var[2][1] = 0;

In assembly:

SECTION .text
        mov     rdi, 32                 ; 32 bytes is 4 quad-words of space
        call    malloc
        mov     QWORD[var], rax         ; var = malloc(sizeof(int *) * 4);

        mov     rdi, 16                 ; 16 bytes is 4 double-words of space
        call    malloc
        mov     rsi, QWORD[var]         ; place address in a register so we can offset it.
        mov     QWORD[rsi+2*8], rax     ; var[2] = malloc(sizeof(int) * 4)

        mov     rdi, QWORD[rsi+2*8]     ; Could just put rax into rdi here.  rdi = var[2]
        mov     DWORD[rdi+1*4], 0       ; var[2][1] = 0    (rdi[1] = 0)
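
The example above allocates only row 2. A complete 2-dimensional array needs one allocation for the row pointers plus one per row, and each of these has to be freed again. The following C sketch shows one way to do that. The names alloc_grid and free_grid are hypothetical, and using calloc for the rows (so that every element starts at 0) is a choice made for this example.

```c
#include <stdlib.h>

/* Frees the first `rows` row arrays and then the array of row pointers. */
void free_grid(int **grid, size_t rows)
{
    if (grid == NULL)
        return;
    for (size_t r = 0; r < rows; r++)
        free(grid[r]);
    free(grid);
}

/*
 * Allocates rows x cols ints as an array of row pointers.
 * Returns NULL if any allocation fails (after releasing what was allocated).
 */
int **alloc_grid(size_t rows, size_t cols)
{
    int **grid = malloc(rows * sizeof *grid);          /* 8 bytes per row pointer */
    if (grid == NULL)
        return NULL;

    for (size_t r = 0; r < rows; r++) {
        grid[r] = calloc(cols, sizeof **grid);         /* 4 bytes per int, zeroed */
        if (grid[r] == NULL) {
            free_grid(grid, r);                        /* only rows 0..r-1 exist */
            return NULL;
        }
    }
    return grid;
}
```

Note the order in free_grid: the rows are freed before the array of row pointers, since the row addresses are stored inside that array and would be lost otherwise.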