CS456 - Systems Programming

More Machine Language / Instructions

Note on memory to memory transfers:

Data normally cannot be moved directly from memory to memory; it must first be loaded into a register and then stored from that register. About the only exceptions to this rule are the MOVS* instructions. The rule applies to almost all instructions, including comparison operations, which do not save the result of their operation. If both operands of an instruction are memory locations, the instruction is almost certainly incorrect.


MOVZX reg, arg2
MOVSX reg, arg2

  • reg: must be a Register
  • arg2: may be a Register or Memory

These instructions are like the MOV instruction, but when a value is moved from a smaller register or memory location into a larger register, it is padded to fill the destination register. If the value to be copied is signed, use MOVSX, which copies the sign bit (the uppermost bit) of the source into all the remaining upper bits of the destination, so the larger register holds the same two's complement value as the original.

MOVZX always pads the upper bits of the destination with 0s, which is what you want when the original value is unsigned. There is no MOVZX form with a 32-bit source, because a plain MOV into a 32-bit register already clears the upper 32 bits of the corresponding 64-bit register.


MOVSX rax, ebx ; Copies the signed 32 bit ebx value to the 64 bit rax
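
The difference between the two kinds of padding can be modeled in C. This is only a sketch of the semantics for an 8-bit source and a 64-bit destination; the function names are made up for illustration and are not part of any library.

```c
#include <stdint.h>

/* Models MOVZX r64, r/m8: bits 8..63 of the result are always zero. */
uint64_t movzx_8_to_64(uint8_t src) {
    return (uint64_t)src;
}

/* Models MOVSX r64, r/m8: bit 7 of src (the sign bit) is copied
   into bits 8..63 of the result. */
uint64_t movsx_8_to_64(uint8_t src) {
    uint64_t value = src;
    if (src & 0x80u) {               /* sign bit set? */
        value |= ~(uint64_t)0xFF;    /* fill bits 8..63 with ones */
    }
    return value;
}
```

With the byte 0xFE (which is -2 as a signed byte), the MOVZX model yields 0x00000000000000FE (254) while the MOVSX model yields 0xFFFFFFFFFFFFFFFE (still -2 in two's complement).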


LEA reg, memory

  • The first parameter must be a register and the second a memory location.

LEA allows us to use the same semantics for computing an address inside of []'s that MOV allows, but rather than loading the value at that address it just computes the address and loads the register with the computed address.


LEA rsi, [buf + r14] ; Loads rsi with the address of buf offset by the value in r14
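
In C terms, LEA corresponds roughly to taking the address of an array element, while MOV with the same bracket expression corresponds to reading the element. A minimal sketch, with hypothetical function names:

```c
#include <stddef.h>
#include <stdint.h>

/* Models MOV al, [base + offset]: reads the byte stored at the address. */
uint8_t mov_load_model(const uint8_t *base, size_t offset) {
    return base[offset];
}

/* Models LEA reg, [base + offset]: only computes the address,
   memory is never accessed. */
uint8_t *lea_model(uint8_t *base, size_t offset) {
    return base + offset;
}
```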


MOVSB / MOVSW / MOVSD / MOVSQ

The MOVS* instructions move one byte (MOVSB), word (MOVSW), double word (MOVSD) or quad word (MOVSQ) at a time. None of these instructions take any parameters; instead they use the registers rsi and rdi as the source and destination addresses respectively. If the direction flag (DF) is clear (i.e. 0), rsi and rdi are incremented by the size of the element after the copy. If the direction flag is set (i.e. 1), they are decremented by that size after the copy.

NOTE: Keep in mind that under the usual x86-64 convention system calls take their first two arguments in rdi and rsi, so setting up a system call overwrites these registers. Using MOVS* in a loop that also makes a system call can therefore be a problem unless rsi/rdi are saved and restored.


; Copies 10 bytes from srcbuf to dstbuf.  Uses LOOP instruction, see below:
       MOV rcx, 10      ; loop counter: number of bytes to copy
       MOV rsi, srcbuf  ; source address
       MOV rdi, dstbuf  ; destination address
       CLD              ; increment rsi/rdi after copy
.loop:
       MOVSB            ; C equivalent to: *(rdi++) = *(rsi++);
       LOOP .loop       ; == dec rcx; jnz .loop
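
The effect of MOVSB and the direction flag can be sketched in C. The struct and function names below are hypothetical and exist only to model the registers involved; this is not how the hardware is implemented.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the state MOVSB works with. */
struct movs_state {
    const uint8_t *rsi;  /* source address */
    uint8_t *rdi;        /* destination address */
    int df;              /* direction flag: 0 after CLD, 1 after STD */
};

/* One MOVSB: copy a byte, then step both pointers by one element
   in the direction selected by DF. */
void movsb(struct movs_state *s) {
    *s->rdi = *s->rsi;
    if (s->df == 0) {
        s->rsi++;
        s->rdi++;
    } else {
        s->rsi--;
        s->rdi--;
    }
}

/* Executes MOVSB count times with DF clear, like the loop above. */
void copy_bytes(uint8_t *dst, const uint8_t *src, size_t count) {
    struct movs_state s = { src, dst, 0 };   /* CLD */
    for (size_t i = 0; i < count; i++) {
        movsb(&s);
    }
}
```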


CLD - Clears the direction flag

STD - Sets the direction flag


LOOP addr

  • addr: is a location in memory (address to jump to)

LOOP uses the rcx register as a loop counter. Each time the LOOP instruction runs, rcx is decremented by one and then checked against zero; if it is not zero, a jump to addr is performed. Unlike DEC, LOOP does not modify any flags. Apart from that, LOOP is equivalent to:

       DEC rcx
       JNZ addr

NOTE: Keep in mind that system calls will destroy the value of rcx, so using LOOP around a loop body that contains a system call can be a problem.
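
Because LOOP decrements first and tests afterwards, the loop body always runs at least once, and a counter that starts at zero wraps around instead of skipping the loop. The sketch below models this in C with a hypothetical 8-bit counter standing in for the 64-bit rcx, so that the wraparound stays small enough to observe.

```c
#include <stdint.h>

/* Counts how often the loop body runs for a given starting counter.
   An 8-bit counter is an assumption made for illustration only;
   the real rcx is 64 bits wide. */
unsigned loop_iterations(uint8_t counter) {
    unsigned iterations = 0;
    do {
        iterations++;          /* the loop body runs before the first test */
        counter--;             /* LOOP: decrement the counter ... */
    } while (counter != 0);    /* ... then jump back if it is not zero */
    return iterations;
}
```

Starting from 10 the body runs 10 times, but starting from 0 the 8-bit model runs it 256 times. With the real 64-bit rcx the same mistake would give 2^64 iterations, so make sure rcx is at least 1 before entering a LOOP-based loop.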



CALL arg

RET [ val ]

CALL pushes the current value of the instruction pointer (rip, i.e. the address of the next instruction) onto the stack, then jumps to arg. The address pushed onto the stack is called the return address. When a RET instruction is encountered, the return address is popped off the stack and loaded into the instruction pointer, so execution continues immediately after the original CALL instruction.

RET optionally accepts an immediate value giving the number of additional bytes to remove from the stack after the return address has been popped (i.e. to remove any parameters pushed onto the stack prior to the call).


; Copies the string pointed to by rsi to destination pointed to by rdi
strcpy:
        MOV   al, BYTE [rsi]
        MOV   BYTE [rdi], al
        CMP   al, 0             ; Check for end of string
        JZ    .finish
        INC   rsi
        INC   rdi
        JMP   strcpy
.finish:
        RET                     ; Return to the caller

        ; Setup parameters to strcpy:
        MOV   rsi, string
        MOV   rdi, buf
        CALL  strcpy                    ; Copy string to buf


PUSH arg

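  • arg: may be a register, memory location or immediate value. In 64-bit mode the value pushed is 16 or 64 bits in size (32-bit pushes exist only in 32-bit mode); an immediate of up to 32 bits is sign-extended to the size being pushed.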

Pushes a value onto the hardware stack. The stack pointer is decremented by the size of the data then the data is moved onto the stack at the new stack pointer location.

POP arg

  • arg: for POP may be a register or memory location that is 16 or 64 bits in size in 64-bit mode (32-bit pops exist only in 32-bit mode).

Places the value at the current stack location into either a register or memory location then increments the stack pointer by the size of the data.
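
The order of operations (PUSH decrements first and then stores, POP loads first and then increments) can be modeled in C with a small byte array standing in for stack memory. The struct, the array size and the function names are assumptions made for this sketch, and no bounds checking is done.

```c
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64              /* hypothetical, tiny stack */

struct stack_model {
    uint8_t  memory[STACK_BYTES];
    uint64_t rsp;                   /* offset into memory; an empty stack
                                       has rsp == STACK_BYTES */
};

/* PUSH of a 64-bit value: the stack pointer moves down by 8 bytes,
   then the value is stored at the new stack pointer location. */
void push64(struct stack_model *s, uint64_t value) {
    s->rsp -= 8;
    memcpy(&s->memory[s->rsp], &value, 8);
}

/* POP of a 64-bit value: the value at the stack pointer is read,
   then the stack pointer moves up by 8 bytes. */
uint64_t pop64(struct stack_model *s) {
    uint64_t value;
    memcpy(&value, &s->memory[s->rsp], 8);
    s->rsp += 8;
    return value;
}
```

Values come back in the reverse order they were pushed, and after a matching number of pushes and pops the stack pointer is back where it started.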

Arithmetic operations

Almost all arithmetic operations modify the flags CF, OF, SF, ZF, AF, and PF according to the result of the particular operation. Thus most arithmetic operations can be followed directly by a conditional jump.


INC arg
DEC arg

  • arg is a Register or Memory (R/M)

Increments (INC) or decrements (DEC) the register or memory location by 1.


Instruction   C equivalent   Notes
ADD a, b      a = a + b
SUB a, b      a = a - b
NEG a         a = - a
AND a, b      a = a & b
OR  a, b      a = a | b
XOR a, b      a = a ^ b
NOT a         a = ~ a        Does not affect flags
SHL a, c      a = a << c
SHR a, c      a = a >> c     unsigned shift right
SAL a, c      a = a << c
SAR a, c      a = a >> c     signed shift right

  • a should be a register or memory location (the result of the operation will be stored in/at 'a')
  • b may be a register, memory or immediate
  • c may be the cl register or an immediate

Note that the shift instructions accept only the cl register or an immediate value as the second parameter.
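
The difference between the unsigned (SHR) and signed (SAR) right shifts can be sketched in C. The function names are hypothetical, and the sketch assumes a shift count below 64. The signed version is written out explicitly because right-shifting a negative value with >> is implementation-defined in C.

```c
#include <stdint.h>

/* SHR: logical shift right. The vacated upper bits are filled with 0. */
uint64_t shr64(uint64_t a, unsigned count) {
    return a >> count;
}

/* SAR: arithmetic shift right. The vacated upper bits are filled with
   copies of the sign bit, so the sign of the value is preserved. */
int64_t sar64(int64_t a, unsigned count) {
    if (a >= 0) {
        return a >> count;
    }
    /* For negative a, ~a is non-negative: shift it, then invert again
       so that the upper bits end up as ones. */
    return ~(~a >> count);
}
```

For example, the bit pattern of -8 shifted right by one gives -4 with the SAR model, but the large positive value 0x7FFFFFFFFFFFFFFC with the SHR model.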

MUL - unsigned multiply

MUL arg

  • arg: may be a register or memory only.

The multiplication that MUL performs depends on the size of the argument:

Size of arg   Operation
BYTE          AX = AL * arg
WORD          DX:AX = AX * arg
DWORD         EDX:EAX = EAX * arg
QWORD         RDX:RAX = RAX * arg

The RDX:RAX register pairing, for example, represents a 128-bit quantity, since multiplying the 64-bit RAX by another 64-bit quantity can produce a value that requires 128 bits to represent. For most simple multiplications where overflow is not a concern, the two-operand IMUL instruction is more convenient.
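
To see why two registers are needed, the 64-bit by 64-bit unsigned multiply can be modeled in portable C by splitting each operand into 32-bit halves. This is a sketch of the arithmetic only; the function name and the output parameters are made up for illustration.

```c
#include <stdint.h>

/* Models MUL with a 64-bit argument: rdx_out:rax_out = rax * arg. */
void mul64(uint64_t rax, uint64_t arg,
           uint64_t *rdx_out, uint64_t *rax_out) {
    uint64_t a_lo = rax & 0xFFFFFFFFu, a_hi = rax >> 32;
    uint64_t b_lo = arg & 0xFFFFFFFFu, b_hi = arg >> 32;

    /* Four 32x32 -> 64 bit partial products. */
    uint64_t p0 = a_lo * b_lo;
    uint64_t p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo;
    uint64_t p3 = a_hi * b_hi;

    /* Sum of everything that lands in bits 32..63, including carries. */
    uint64_t mid = (p0 >> 32) + (p1 & 0xFFFFFFFFu) + (p2 & 0xFFFFFFFFu);

    *rax_out = (mid << 32) | (p0 & 0xFFFFFFFFu);             /* low 64 bits  */
    *rdx_out = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);   /* high 64 bits */
}
```

Multiplying 0xFFFFFFFFFFFFFFFF by 2, for instance, leaves 1 in the rdx half and 0xFFFFFFFFFFFFFFFE in the rax half, a result that would not fit in rax alone.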

IMUL - signed multiply

IMUL arg             rdx:rax ⟵ rax * arg
IMUL reg, arg        reg ⟵ reg * arg
IMUL reg, arg, imm   reg ⟵ arg * imm

  • arg: may be a register or memory only.
  • imm: is an immediate value only

The first form of IMUL works like the unsigned MUL instruction, except that the operands are treated as signed values; the registers used for the result depend on the size of the argument.

DIV / IDIV - unsigned / signed divide

DIV arg
IDIV arg

  • arg: may be a register or memory only.

DIV (unsigned divide) and IDIV (signed divide) operate in a manner similar to the MUL instruction, merely in reverse.

Size of divisor   Operation         Quotient (result)   Remainder
BYTE              AX / arg          AL                  AH
WORD              (DX:AX) / arg     AX                  DX
DWORD             (EDX:EAX) / arg   EAX                 EDX
QWORD             (RDX:RAX) / arg   RAX                 RDX

When the value to be divided fits in the ?A? register alone, care must be taken to initialize the ?D? register properly before the division is performed. If the division is unsigned, setting the ?D? register to zero is sufficient. If the value to be divided may be negative, one of the following instructions should be used to extend the sign bit of the ?A? register into all the bits of the ?D? register.

Note: ?A? = one of AX, EAX, RAX and ?D? = the matching DX, EDX, RDX, depending on the size of register required.



CWD

  • Copies the sign bit of AX into all the bits of DX.
    DX:AX ⟵ AX

CDQ

  • Copies the sign bit of EAX into all the bits of EDX.
    EDX:EAX ⟵ EAX

CQO

  • Copies the sign bit of RAX into all the bits of RDX.
    RDX:RAX ⟵ RAX

These instructions take no operands. They are used to create the 32, 64 or 128 bit quantities used by DIV or IDIV.

DIV / modulus example

To get the modulus (i.e. the remainder of integer division), use the DIV or IDIV instruction and read the remainder from the ?D? register:

mod 10 example

        ; We assume that rax holds the unsigned number to be divided.
        MOV rbx, 10     ; We'll use rbx to hold the divisor
        MOV rdx, 0      ; Clear the upper half of the dividend rdx:rax.
                        ; For a signed division, use CQO here instead (to copy
                        ; the sign bit of rax into rdx) and IDIV below.
        DIV rbx         ; Perform the division.
        ; At this point rax holds the result of the division and
        ; rdx holds the remainder (i.e. modulus result)
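
The quotient and remainder that end up in rax and rdx correspond to C's / and % operators, which (since C99) truncate toward zero just as IDIV does. The sketch below models the 64-bit case under the assumption that the dividend fits in rax alone, i.e. rdx was cleared before DIV or filled by CQO before IDIV. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Register contents after the division: quotient in rax, remainder in rdx. */
struct udiv_result { uint64_t rax, rdx; };
struct sdiv_result { int64_t  rax, rdx; };

/* Models DIV with rdx cleared to 0 beforehand.
   The divisor must not be zero. */
struct udiv_result div_model(uint64_t rax, uint64_t divisor) {
    struct udiv_result r = { rax / divisor, rax % divisor };
    return r;
}

/* Models IDIV after CQO. The divisor must not be zero, and
   INT64_MIN / -1 must be avoided because the quotient overflows. */
struct sdiv_result idiv_model(int64_t rax, int64_t divisor) {
    struct sdiv_result r = { rax / divisor, rax % divisor };
    return r;
}
```

Note that the choice of instruction matters for negative values: idiv_model(-7, 10) gives quotient 0 and remainder -7, while div_model applied to the same bit pattern treats it as a very large unsigned number and gives remainder 9.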