Bits: 63 ..                        31               15      7       0
      ┌────────────────────────────┬────────────────┬───────┬───────┐
      │                            │                │  AH   │  AL   │
      └────────────────────────────┴────────────────┴───────┴───────┘
                                                    └      AX       ┘
                                   └              EAX               ┘
Some registers have specific purposes (rip, rsp, rbp), and some instructions require specific registers to do their work (such as DIV).
Bits  Registers
8     AL/AH CL/CH DL/DH BL/BH SPL BPL SIL DIL R8B-R15B
16    AX CX DX BX SP BP SI DI R8W-R15W
32    EAX ECX EDX EBX ESP EBP ESI EDI R8D-R15D
64    RAX RCX RDX RBX RSP RBP RSI RDI R8-R15
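The aliasing between the register names above, and DIV's dependence on fixed registers, can be sketched as follows. This is an illustrative program only: it assumes Linux x86-64, the ELF64 output format, and that exit is syscall 60 on that platform.

```nasm
; Sketch: how the sub-register names alias one 64-bit register.
; Assumed build: nasm -f elf64 regs.asm && ld regs.o -o regs
        global  _start

        section .text
_start:
        mov     rax, 0x1122334455667788 ; fill all 64 bits of rax
        mov     al, 0xFF                ; only bits 7..0 change  -> rax = 0x11223344556677FF
        mov     ah, 0xEE                ; only bits 15..8 change -> rax = 0x112233445566EEFF
        mov     ax, 0x1234              ; only bits 15..0 change -> rax = 0x1122334455661234
        mov     eax, 0x1                ; a 32-bit write zero-extends -> rax = 0x0000000000000001

        ; DIV needs specific registers: the dividend is rdx:rax,
        ; the quotient lands in rax and the remainder in rdx.
        mov     rax, 17                 ; low half of the dividend
        xor     edx, edx                ; high half of the dividend = 0
        mov     rcx, 5                  ; divisor (any register or memory operand)
        div     rcx                     ; rax = 3, rdx = 2

        mov     rdi, rdx                ; exit status = remainder
        mov     rax, 60                 ; assumption: exit is syscall 60 on Linux x86-64
        syscall
```

Running the program and then `echo $?` in the shell shows the remainder as the exit status.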
┌──────────────────────────┐ 0xFFFFFFFFFFFFFFFF
│    Kernel mode space     │
├──────────────────────────┤
│           ...            │
├──────────────────────────┤
│    Stack (Grows down)    │
├──────────────────────────┤
│           ...            │
├──────────────────────────┤
│  Memory mapping segment  │
│ │ File mappings(dyn libs)│
│ │ Anon mappings          │
│ ▽                        │
├──────────────────────────┤
│           ...            │
├──────────────────────────┤
│ △                        │
│ │ Heap                   │  C malloc'ed memory
├──────────────────────────┤
│           ...            │
├──────────────────────────┤
│ BSS Segment (uninitial-  │
│ ized static vars) Zero-  │  section .bss
│ Filled                   │
├──────────────────────────┤
│ Data Segment (static     │
│ Variables initialized by │  section .data
│ programmer.              │
├──────────────────────────┤
│ Text Segment (ELF)       │  section .text
│ Binary image of program  │
├──────────────────────────┤  (start of program)
└──────────────────────────┘ 0x0000000000000000
... = unmapped address space (i.e. no memory is mapped in these regions)
When writing assembly programs, we specify where data and program code are located by naming the section in which the elements that follow are stored, using the SECTION assembler directive (described further below).
man 2 syscall
The rax register holds the system call # (documented in /usr/include/asm/unistd_64.h)
Parameters 1-6 stored in: rdi, rsi, rdx, r10, r8, r9 (and in that order, i.e. the first parameter must be in rdi, the second if any in rsi, and so on.)
System calls take at most six parameters, so none are ever passed on the stack.
The return value of the system call is placed in rax upon return from the call.
The called routine is expected to preserve rsp, rbp, rbx, r12, r13, r14, and r15, but may trample any other registers. Thus the only safe registers to use for your program when you make system calls are rbx and r12-r15; you should also avoid rbp and rsp if you use any stack instructions (e.g. PUSH/POP).
Consider the following "mnemonic":
rax = rax(rdi, rsi, rdx, r10, r8, r9)
; The write system call is Syscall #1:
%define SYS_write 1
; Descriptor #1 is the standard output:
%define STDOUT_FILENO 1

; w = write(1, buf, r); would be converted to:
; rax = rax (rdi, rsi, rdx); converting to mnemonic:
; [w] = 1 (1, buf, [r]) values to be put into the above registers
; Note that 'buf' is already an address, so may be copied into rsi as-is.

SECTION .bss              ; RESx reserves uninitialized space, so it belongs in .bss
r:  resq 1                ; Reserve one quad-word at address 'r'
w:  resq 1                ;    "     "      "      "    "    'w'
; NOTE: both r and w are "addresses", not values. Most labels are
;       addresses (like a pointer).

SECTION .text
;...
  mov rax, SYS_write      ; Load rax with the syscall number (i.e. 1)
  mov rdi, STDOUT_FILENO  ; Load rdi with the file descriptor number (i.e. 1)
  mov rsi, buf            ; Load rsi with the buffer address
  mov rdx, [r]            ; Load rdx with the value at address r
  ; If the system call needs more parameters it would use r10, r8, etc.
  ; If it needs fewer then use fewer, but always in the same order,
  ; i.e. rdi is the first parameter, rsi the second and so on.
  syscall                 ; Perform the system call
  mov [w], rax            ; Return value is in rax, so store it @ w
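For reference, here is a minimal sketch of a complete program built around the same write call. It is not the author's program: the message text, the `len` label and the `SYS_exit` definition are assumptions added for illustration, and it targets Linux x86-64 with a static, non-PIE link.

```nasm
; Sketch: a whole program that performs write(1, buf, len) and then exits.
; Assumed build: nasm -f elf64 hello.asm && ld hello.o -o hello
%define SYS_write      1
%define SYS_exit       60               ; assumption: exit is syscall 60 in unistd_64.h
%define STDOUT_FILENO  1

        global  _start

        section .data
buf:    db      "Hello, world!", 10     ; the bytes to write, ending in a newline
len:    equ     $ - buf                 ; assembler-computed length of buf

        section .bss
w:      resq    1                       ; room for write's return value

        section .text
_start:
        mov     rax, SYS_write          ; rax = syscall number
        mov     rdi, STDOUT_FILENO      ; 1st parameter: file descriptor
        mov     rsi, buf                ; 2nd parameter: address of the buffer
        mov     rdx, len                ; 3rd parameter: number of bytes
        syscall
        mov     [w], rax                ; bytes written, or a negative error code

        mov     rax, SYS_exit
        xor     edi, edi                ; exit status 0
        syscall
```

Here `len` is a constant rather than a memory location, so it is loaded without brackets, unlike `[r]` in the fragment above.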
Layout of a NASM source line:
label: instruction operands ; comment
Intel syntax does not require instruction suffixes to specify the size of data. For example, where AT&T syntax writes:
movl $0x000F, %eax ; Store hex 0xF (32 bits) in eax, AT&T syntax
Intel syntax uses:
MOV eax, 0x000F ; the register name and/or immediate value indicates the size (usually)
The size of the data (in comparisons and other operations) sometimes needs to be specified explicitly. To do this, put a size specifier before one of the arguments:
CMP BYTE [rax+rbx], 0 or: CMP [rax+rbx], BYTE 0
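To see why the specifier matters, the sketch below compares the same memory location once as a byte and once as a quad-word. The label `value` and its contents are hypothetical, and the program assumes Linux x86-64 with exit as syscall 60.

```nasm
; Sketch: the same address compared at two different operand sizes.
; Assumed build: nasm -f elf64 size.asm && ld size.o -o size
        global  _start

        section .data
value:  dq      0x0000000100000000      ; one quad-word whose lowest byte is 0

        section .text
_start:
        mov     rbx, value              ; rbx = address of value
        xor     eax, eax                ; rax = 0, so [rax+rbx] addresses value
        xor     edi, edi                ; rdi = 0

        ; cmp   [rax+rbx], 0            ; rejected by NASM: operation size not specified
        cmp     BYTE [rax+rbx], 0       ; 1 byte compared: 0x00 == 0, so ZF is set
        sete    dil                     ; dil = 1 because the byte matched

        cmp     QWORD [rax+rbx], 0      ; 8 bytes compared: not equal, so ZF is clear
        sete    cl                      ; cl = 0

        mov     [rax+rbx], cl           ; no specifier needed: cl fixes the size at 1 byte

        mov     rax, 60                 ; assumption: exit is syscall 60 on Linux x86-64
        syscall                         ; exit status = rdi = 1
```

When one operand is a register, as in the last `mov`, the register's width settles the size and the specifier can be left out.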
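(R) register
(I) immediate - i.e. constants:
    Hexadecimal: prefix 0x or $, or suffix H or h, e.g. 0xFF, $0FF, 0FFh (with the $ prefix or the h suffix the number must begin with a digit)
    Octal: prefix 0o, or suffix Q, q or o, e.g. 0o777, 777o
    Binary: prefix 0b, or suffix B or b, e.g. 0b110101, 01010011b
    Character constants: 'a', 'abcd'
    Strings: "abc", 'abc', or `abc\n` (only backquoted strings process C-style escapes such as \n)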
(M) effective addresses (memory):
    [label]                             data located at the address of label
    [label+1]                           data at label plus some constant offset
    [label+register]                    data at label offset by the amount in register
    [label+register*scale], [register*scale]
    [label+register*scale+constant], [register*scale + constant]
    [register:label+register]
For a char array[] whose address is in esi:
Assembly                        C equivalent
mov al, BYTE [esi]              al = array[0]
mov al, [esi + 10]              al = array[10]
mov al, [esi + ecx]             al = array[ecx]
mov al, [esi + ecx*8 + 100]     al = array[ecx*8 + 100]
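The same addressing forms can be tried with 64-bit registers in a small program. The labels `array` and `words` and their contents are made up for illustration. The sketch assumes Linux x86-64 and a static, non-PIE link, so that absolute label addresses are usable.

```nasm
; Sketch: effective-address forms applied to two small arrays.
; Assumed build: nasm -f elf64 ea.asm && ld ea.o -o ea
        global  _start

        section .data
array:  db      10, 11, 12, 13, 14, 15, 16, 17  ; like char array[8]
words:  dd      100, 200, 300, 400              ; like int words[4]

        section .text
_start:
        mov     rsi, array              ; rsi = address of array[0]
        mov     rcx, 3                  ; index register

        mov     al, [rsi]               ; al  = array[0]  (10)
        mov     al, [rsi + 2]           ; al  = array[2]  (12)
        mov     al, [rsi + rcx]         ; al  = array[3]  (13)
        mov     al, [array + rcx]       ; same element, label+register form
        mov     edx, [words + rcx*4]    ; edx = words[3]  (400), scale = element size
        mov     edx, [words + rcx*4 - 4] ; edx = words[2] (300), scale plus constant

        movzx   edi, al                 ; exit status = array[3] = 13
        mov     rax, 60                 ; assumption: exit is syscall 60 on Linux x86-64
        syscall
```

The scale factor is what lets one index register step through elements wider than a byte: with 4-byte elements, `rcx*4` turns an element index into a byte offset.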
A label refers to the address of the code or data on the line it occurs on.
A label may start with a letter, . (dot, with special meaning), _ (underscore) or ?, and may contain letters, numbers, _, $, #, @, ~, . and ?.
A label may be up to 4095 characters in length.
A label that starts with a . (dot) is a "local" label and is associated with the previous non-local label. It may be combined with its associated non-local label to be accessed from outside its local code.
func1:
    ...
.loop:
    ...
func2:
    ...
.loop:
    ...
    jmp func1.loop ; go to the .loop inside of func1.
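A runnable variant of the same idea is sketched below. The routines `count_down` and `count_up` are hypothetical names invented for this example, and the program assumes Linux x86-64 with exit as syscall 60.

```nasm
; Sketch: two routines that each own a local label named .loop.
; Assumed build: nasm -f elf64 local.asm && ld local.o -o local
        global  _start

        section .text
count_down:                     ; hypothetical routine: counts rcx down to 0
.loop:                          ; full name: count_down.loop
        dec     rcx
        jnz     .loop           ; resolves to count_down.loop
        ret

count_up:                       ; hypothetical routine: counts rax up to rdx
.loop:                          ; full name: count_up.loop, no clash with the one above
        inc     rax
        cmp     rax, rdx
        jb      .loop           ; resolves to count_up.loop
        ret

_start:
        mov     rcx, 5
        call    count_down      ; rcx = 0 afterwards
        xor     eax, eax        ; rax = 0
        mov     rdx, 7
        call    count_up        ; rax = 7 afterwards
        mov     rdi, rax        ; exit status = 7
        mov     rax, 60         ; assumption: exit is syscall 60 on Linux x86-64
        syscall
```

Each `.loop` is resolved against the nearest preceding non-local label, so common names such as `.loop` or `.done` can be reused in every routine.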
SECTION section name
Section name
.bss    statically allocated objects that are un-initialized (usually zeroed)
.data   statically allocated objects that are pre-initialized, sometimes read-only.
.text   The program code
GLOBAL symbol
EXTERN symbol
RESB, RESW, RESD, RESQ
TIMES #
label: times 10 resd 10 ; set aside 10x10 double-words (400 bytes) of memory
label: times 8 db 'abcd' ; repeats "abcd" 8 times.
buf: times 64 db 0 ; Makes a zeroed 64 byte buffer
DB, DW, DD and DQ
INCBIN "file"[,skip[,amount]]
EQU constant
x: equ 10
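The data directives above can be combined as in the following sketch. Every label name here is illustrative rather than taken from the original, and the program assumes Linux x86-64 with exit as syscall 60.

```nasm
; Sketch: DB/DW/DD/DQ, TIMES, RESx and EQU used together.
; Assumed build: nasm -f elf64 data.asm && ld data.o -o data
COUNT:  equ     4                       ; a constant: occupies no memory

        global  _start

        section .data
bytes:  db      1, 2, 3, 'a'            ; 4 bytes
word1:  dw      0x1234                  ; one 16-bit word
dword1: dd      0x12345678              ; one 32-bit double-word
qword1: dq      0x1122334455667788      ; one 64-bit quad-word
pad:    times COUNT db 0xFF             ; 4 bytes, each 0xFF
datalen: equ    $ - bytes               ; 4 + 2 + 4 + 8 + 4 = 22 bytes

        section .bss
buf:    resb    64                      ; 64 uninitialized bytes
table:  resq    COUNT                   ; 4 quad-words = 32 bytes

        section .text
_start:
        mov     rdi, datalen            ; exit status = 22
        mov     rax, 60                 ; assumption: exit is syscall 60 on Linux x86-64
        syscall
```

Because `$` evaluates to the current assembly position, `$ - bytes` gives the number of bytes emitted since the `bytes` label, which is a common way to let the assembler compute a buffer's length.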
%include "file"
%define define
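As a small illustration of %define, the sketch below shows one plain macro and one macro with a parameter. `SCALE` is a hypothetical name invented for this example, and the program assumes Linux x86-64 with exit as syscall 60.

```nasm
; Sketch: %define performs textual substitution before assembly.
; Assumed build: nasm -f elf64 define.asm && ld define.o -o define
%define SYS_exit    60                  ; plain single-line macro
%define SCALE(x)    ((x) * 8)           ; hypothetical macro taking one parameter

        global  _start

        section .text
_start:
        mov     rdi, SCALE(2) + 1       ; expands to ((2) * 8) + 1 = 17
        mov     rax, SYS_exit           ; expands to 60
        syscall                         ; exit status = 17
```

Unlike EQU, a %define name has no value of its own; its text is simply pasted wherever the name is used, which is why the parentheses in `SCALE` are worth keeping.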