Note that the IA32 instruction set is described in several large volumes made freely available by Intel. These volumes should not be read cover to cover, but should be used to look up particular technical details once you have read this introduction. In particular, volumes 2A and 2B describe every instruction in great detail.
#include <stdio.h>

int main( int argc, char *argv[] )
{
printf("hello world!\n");
return 0;
}
Compiling this program with gcc -S will yield a file hello.s that looks something like this:
.file "test.c"
.section .rodata
.LC0:
.string "hello world!\n"
.text
.globl main
.type main,@function
main:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
andl $-16, %esp
movl $0, %eax
subl %eax, %esp
subl $12, %esp
pushl $.LC0
call printf
addl $16, %esp
movl $0, %eax
leave
ret
.Lfe1:
.size main,.Lfe1-main
.section .note.GNU-stack,"",@progbits
.ident "GCC: (GNU) 3.2.3 20030502 (Red Hat Linux 3.2.3-54)"
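Note that the assembly code has three different kinds of elements: directives, which begin with a dot (such as .file and .text); labels, which end with a colon (such as main:); and instructions (such as pushl).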
By default, the assembly output is not optimized: it has many unnecessary instructions. It is interesting to consider the output of the compiler when we turn on optimization with the -O flag:
.file "test.c"
.section .rodata.str1.1,"aMS",@progbits,1
.LC0:
.string "hello world!"
.text
.globl main
.type main,@function
main:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
andl $-16, %esp
subl $12, %esp
pushl $.LC0
call puts
movl $0, %eax
leave
ret
.Lfe1:
.size main,.Lfe1-main
.section .note.GNU-stack,"",@progbits
.ident "GCC: (GNU) 3.2.3 20030502 (Red Hat Linux 3.2.3-54)"
This is a very aggressive optimizer!
Not only have several unnecessary instructions been removed,
but the compiler has determined that a call to the (complicated)
function printf can be replaced with a call to puts,
which only outputs strings! Your compiler need not be quite this clever.
To take this assembly code and turn it into a runnable program, run the as tool to create an object file, and then gcc to link the object file with the standard library and create an executable:
% as hello.s -o hello.o
% gcc hello.o -o hello
Be sure to take advantage of the fact that GCC emits assembly code. If you don't know quite what instructions to generate with your compiler, see what GCC emits and then look up the details in the Intel manual.
Now that you know what tools to use, let's begin to look at the assembly instructions in detail.
IA32 has six almost general purpose 32-bit registers:
| %eax | %ebx | %ecx | %edx | %esi | %edi |
We say almost general purpose because earlier versions of the processors had restrictions on which registers could be used for various purposes. As the design developed, new instructions and addressing modes were added to make the various registers almost equal. A few remaining instructions, particularly related to string processing, require the use of %esi and %edi. In addition, two registers are employed as the stack pointer and the base pointer:
| %esp | %ebp |
The IA32 architecture has expanded from 8 to 16 to 32 bits over the years, and so each register has some internal structure that you should know about:
| Register | Bits | Size |
| %eax | 31-0 | 32 bits |
| %ax | 15-0 | 16 bits |
| %ah | 15-8 | 8 bits |
| %al | 7-0 | 8 bits |
The lowest 8 bits of the %eax register are an 8-bit register %al, and the next 8 bits are known as %ah. The low 16 bits are collectively known as %ax, and the entire 32 bits are known as %eax. A similar naming scheme applies to the remaining registers. Although we will generally make use of the full 32 bits, you are likely to encounter code that uses the smaller registers.
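To make the overlap concrete, here is a small sketch in C that models the register views with masks and shifts. The function names (reg_ax, reg_al, reg_ah, write_al, check_views) are hypothetical helpers invented for this illustration; they are not part of any toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers that model how the smaller registers overlay %eax. */

/* %ax: the low 16 bits of %eax */
uint16_t reg_ax(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }

/* %al: the lowest 8 bits of %eax */
uint8_t reg_al(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }

/* %ah: bits 8 through 15 of %eax */
uint8_t reg_ah(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }

/* Writing %al replaces only the lowest 8 bits and leaves the rest alone. */
uint32_t write_al(uint32_t eax, uint8_t value)
{
    return (eax & 0xFFFFFF00u) | value;
}

/* Sanity check: %ax is always %ah and %al glued together. */
void check_views(uint32_t eax)
{
    assert(reg_ax(eax) == (uint16_t)((reg_ah(eax) << 8) | reg_al(eax)));
}
```

For instance, if %eax held 0x12345678, this model gives %ax = 0x5678, %ah = 0x56, and %al = 0x78.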
MOV, like most instructions, has a single letter suffix that determines the amount of data to be moved. The following names are used to describe data values of various sizes:
| Suffix | Name | Size |
| B | BYTE | 8 bits |
| W | WORD | 16 bits |
| L | LONG | 32 bits |
So, MOVB moves a byte, MOVW moves a word, and MOVL moves a long. Clearly, the size of the locations you are moving to and from must match the suffix. It is possible to leave off the suffix, and the assembler will attempt to choose the right size based on the arguments. However, this is not recommended, as it can have unexpected effects.
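A compiler that always emits the suffix explicitly might pick it from the operand size. The following is only a sketch of that idea; mov_suffix is a hypothetical helper name, and it assumes operand sizes are given in bytes.

```c
#include <assert.h>

/* Hypothetical helper: choose the instruction suffix for an operand
   of the given size in bytes (1 = BYTE, 2 = WORD, 4 = LONG). */
char mov_suffix(int size_in_bytes)
{
    switch (size_in_bytes) {
    case 1: return 'B';
    case 2: return 'W';
    case 4: return 'L';
    default:
        /* Any other size is not covered by the table above. */
        assert(!"unsupported operand size");
        return '?';
    }
}
```

With this helper, a 4-byte integer would produce MOVL and a 1-byte character would produce MOVB.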
The arguments to MOV can have one of several addressing modes.
A global value is simply referred to by an unadorned name such as x or printf.
An immediate value is a constant value indicated by a dollar sign such as $56.
A register value is the name of a register such as %ebx.
An indirect value refers to a value by the address contained in a register. For example, (%esp) refers to the value pointed to by %esp.
A base-relative value is given by adding a constant to the name of a register. For example, -12(%ecx) refers to the value at the memory location twelve bytes below the address indicated by %ecx. This mode is important for manipulating stacks, local values, and function parameters.
There are a variety of complex variations on base-relative. For example, -12(%esi,%ebx,4) refers to the value at the address -12+%esi+%ebx*4. This mode is useful for accessing elements of arrays, where the scale (1, 2, 4, or 8) is the size of each element.
Here is an example of using each kind of addressing mode to load a value into %eax:
| Global Symbol | MOVL x, %eax |
| Immediate | MOVL $56, %eax |
| Register | MOVL %ebx, %eax |
| Indirect | MOVL (%esp), %eax |
| Base-Relative | MOVL -4(%ebp), %eax |
| Offset-Scaled-Base-Relative | MOVL -12(%esi,%ebx,4), %eax |
Of course, the same addressing modes may be used to store data into registers and memory locations. IA32 is a CISC architecture, so most instructions allow many combinations of these addressing modes. However, not all modes are supported. For example, it is not possible to use base-relative for both arguments of MOV: MOVL -4(%ebx), -4(%ebx). To see exactly what combinations of addressing modes are supported, you must read the manual pages for the instruction in question.
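The address arithmetic behind the offset-scaled-base-relative mode can be sketched in C. This is only a model for illustration: effective_address is a hypothetical name, and the register contents are passed in as plain integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the operand offset(base,index,scale):
   the address it names is offset + base + index*scale. */
uint32_t effective_address(int32_t offset, uint32_t base,
                           uint32_t index, uint32_t scale)
{
    /* The hardware only encodes scale factors of 1, 2, 4, or 8. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    /* Unsigned arithmetic wraps around modulo 2^32, like 32-bit addresses. */
    return (uint32_t)offset + base + index * scale;
}
```

Under this model, -12(%esi,%ebx,4) with %esi = 1000 and %ebx = 3 names address 1000, and the simpler base-relative form -4(%ebp) is the same calculation with an index of zero.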
You will need four basic arithmetic instructions for your compiler: ADD, SUB, IMUL, and IDIV. The first three instructions have two operands: a source and a destructive target. For example, this instruction:
ADDL %ebx, %eax
adds %ebx to %eax, and places the result in %eax, overwriting what might have been there before. This requires that you be a little careful in how you make use of registers. For example, suppose that you wish to translate c = b*(b+a), where a and b are global integers. To do this, you must be careful not to clobber the value of b when performing the addition. Here is one possible translation:
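MOVL a, %eax
MOVL b, %ebx
ADDL %ebx, %eax
IMULL %ebx, %eax
MOVL %eax, c
The IDIV instruction is a little unusual. It implicitly expects the 64-bit dividend to be available in the register pair %edx:%eax, and accepts the divisor as a register or memory argument (an immediate value is not allowed). The quotient is placed in %eax and the remainder in %edx. For example, to divide a by five, load a into %eax, sign-extend it into %edx with CLTD, and then divide:
MOVL a, %eax
CLTD
MOVL $5, %ebx
IDIVL %ebx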
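When testing generated code, it can help to have a reference model of what the division should produce. The sketch below relies on the fact that C (since C99) truncates integer division toward zero, which matches IDIV: the remainder takes the sign of the dividend. The names idiv_result and model_idivl are hypothetical, and the model only covers a 32-bit dividend that has been sign-extended into %edx.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit signed divide. */
struct idiv_result {
    int32_t quotient;   /* what ends up in %eax */
    int32_t remainder;  /* what ends up in %edx */
};

struct idiv_result model_idivl(int32_t dividend, int32_t divisor)
{
    /* The real instruction faults on these inputs instead of returning. */
    assert(divisor != 0);
    assert(!(dividend == INT32_MIN && divisor == -1));

    struct idiv_result r;
    r.quotient = dividend / divisor;   /* truncates toward zero */
    r.remainder = dividend % divisor;  /* same sign as the dividend */
    return r;
}
```

For example, dividing -17 by 5 in this model gives a quotient of -3 and a remainder of -2.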
The instructions INC and DEC increment and decrement a register destructively. For example, the statement a = ++b could be translated as:
MOVL b, %eax
INCL %eax
MOVL %eax, b
MOVL %eax, a
Boolean operations work in a very similar manner: AND, OR, and XOR perform destructive boolean operations on two operands, while NOT performs a destructive boolean-not on one operand.
Like the MOV instruction, the various arithmetic instructions can work on a variety of addressing modes. However, for your compiler project, you will likely find it most convenient to use MOV to load values in and out of registers, and then use only registers to perform arithmetic.
Note that the GNU tools use the traditional AT&T syntax, which is used across many processors on Unix-like operating systems, as opposed to the Intel syntax typically used on DOS and Windows systems. The following instruction is given in AT&T syntax:
movl %esp, %ebp
movl is the name of the instruction, and the percent signs indicate that esp and ebp are registers. In the AT&T syntax, the source is always given first, and the destination is always given second. In other places (such as the Intel manual), you will see the Intel syntax, which (among other things) dispenses with the percent signs and the size suffix, and reverses the order of the arguments. For example, this is the same instruction in the Intel syntax:
MOV EBP, ESP
When reading manuals and web pages, be careful to determine whether you are looking at AT&T or Intel syntax: look for the percent signs!
MOVL $0, %eax
loop:
INCL %eax
JMP loop
To define more useful structures such as terminating
loops and if-then statements,
we must have a mechanism for evaluating values and changing
program flow. In most assembly languages, these are handled
by two different kinds of instructions: compares and jumps.
All comparisons are done with the CMP instruction. CMP compares two values (registers, memory locations, or constants) and then sets a few bits in the internal EFLAGS register, recording whether the values are the same, greater, or lesser. You don't need to look at the EFLAGS register directly. Instead, a selection of conditional jumps examine the EFLAGS register and jump appropriately:
| JE | Jump If Equal |
| JNE | Jump If Not Equal |
| JL | Jump If Less Than |
| JLE | Jump If Less or Equal |
| JG | Jump If Greater Than |
| JGE | Jump If Greater or Equal |
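Because AT&T syntax puts the source first, a compare followed by a jump is easy to read backwards: after CMPL src, dst, the jump tests "dst compared to src". The following C sketch models that rule for signed values. The function jump_taken is a hypothetical helper for illustration only; it models the outcome and not the individual EFLAGS bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: would the conditional jump `jcc` be taken right
   after "CMPL src, dst" (AT&T operand order)?  The test reads as
   "dst <op> src", using signed comparison. */
bool jump_taken(const char *jcc, int32_t src, int32_t dst)
{
    if (strcmp(jcc, "JE")  == 0) return dst == src;
    if (strcmp(jcc, "JNE") == 0) return dst != src;
    if (strcmp(jcc, "JL")  == 0) return dst <  src;
    if (strcmp(jcc, "JLE") == 0) return dst <= src;
    if (strcmp(jcc, "JG")  == 0) return dst >  src;
    if (strcmp(jcc, "JGE") == 0) return dst >= src;
    assert(!"unknown conditional jump");
    return false;
}
```

So after CMPL $5, %eax, a JLE is taken when %eax is less than or equal to five, not when five is less than or equal to %eax.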
For example, here is a loop to count %eax from zero to five:
MOVL $0, %eax
loop:
INCL %eax
CMPL $5, %eax
JL loop # repeat while %eax is less than five
And here is a conditional assignment: if global variable
x is greater than zero, then global variable y gets ten, else twenty:
MOVL x, %eax
CMPL $0, %eax
JLE twenty
ten:
MOVL $10, %ebx
JMP done
twenty:
MOVL $20, %ebx
JMP done
done:
MOVL %ebx, y
Note that jumps require the compiler to define target labels.
These labels must be unique within one assembly file, and they
are private to that file: they cannot be seen outside the file
unless a .globl directive is given. In C parlance, a plain
assembly label is static, while a .globl label is extern.
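One common way for a compiler to guarantee unique labels is a simple counter. The sketch below is an assumption about how you might do it, not a required design: new_label and next_label_id are hypothetical names, and the ".L" prefix follows the style of the labels GCC emitted in the listings earlier.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical label generator: every call hands out a new number. */
static int next_label_id = 0;

/* Write a fresh label name such as ".L0", ".L1", ... into buf. */
void new_label(char *buf, size_t size)
{
    int n = snprintf(buf, size, ".L%d", next_label_id++);
    /* The name must fit in the caller's buffer. */
    assert(n > 0 && (size_t)n < size);
}
```

A code generator would typically call this once for each jump target it needs (for example, the else branch and the end of an if-then statement) and then print the name followed by a colon at the target.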
SUBL $4, %esp
MOVL %eax, (%esp)
Popping a value from the stack involves the opposite:
MOVL (%esp), %eax
ADDL $4, %esp
And, if we wish to discard the top value from the stack:
ADDL $4, %esp
Of course, pushing to and popping from the stack referred to by %esp is so common that the two operations have their own instructions:
PUSHL %eax
POPL %eax
Using the stack, calling a function is straightforward. Each argument to the function must be pushed onto the stack in reverse order, then the function called. When the function returns, the result is found in the %eax register, overwriting whatever was there. The caller is then responsible for removing or discarding the arguments from the stack.
For example, the following C code:
x = printf("value: %d",y);
could be translated to this:
.data
x:
.long 0
y:
.long 0
.LC0:
.string "value: %d"
.text
start:
PUSHL y # push the last argument
PUSHL $.LC0 # push the first argument (the address of the string)
CALL printf # invoke printf
ADDL $8, %esp # discard the arguments
MOVL %eax, x # save the result in x
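In a compiler, this calling sequence is usually produced by one routine in the code generator. Here is a minimal sketch of such a routine in C. The name emit_call and its interface are hypothetical, and it assumes that every argument is already available as a 4-byte operand string such as "y" or "$.LC0".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical code-generator helper: writes the instructions for calling
   `name` into `out`.  args[0] is the leftmost C argument.  Returns the
   number of characters written. */
int emit_call(char *out, size_t size, const char *name,
              const char *args[], int nargs)
{
    size_t used = 0;

    /* Push arguments right to left, so argument 0 is pushed last. */
    for (int i = nargs - 1; i >= 0; i--) {
        used += (size_t)snprintf(out + used, size - used,
                                 "PUSHL %s\n", args[i]);
        assert(used < size);  /* output must fit in the buffer */
    }

    used += (size_t)snprintf(out + used, size - used, "CALL %s\n", name);
    assert(used < size);

    /* Each argument is assumed to occupy 4 bytes; the caller discards them. */
    if (nargs > 0) {
        used += (size_t)snprintf(out + used, size - used,
                                 "ADDL $%d, %%esp\n", 4 * nargs);
        assert(used < size);
    }
    return (int)used;
}
```

Called with the name printf and the operands "$.LC0" and "y", this sketch writes the same PUSHL, PUSHL, CALL, ADDL sequence as the example above; storing the result from %eax is left to the caller of emit_call.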
.globl func
func:
pushl %ebp # save the old base pointer
movl %esp, %ebp # set ebp to the current esp
subl $12,%esp # allocate three local variables
# body of function goes here
addl $12,%esp # de-allocate local variables
leave # restore ebp and esp
ret # return to the caller
A function has quite a few details that must be
kept track of: the arguments given to the function,
the information necessary to return, and space for local
computations. For this purpose, we use the base register
pointer %ebp. Whereas the stack pointer %esp points to
the end of the stack where new data will be pushed,
the base pointer %ebp points to the middle of the
values needed by the function.
Note that the nomenclature here is a little confusing: the stack grows down toward smaller addresses, so the top of the stack is located at a lower address than the bottom of the stack. To avoid confusion, we will simply refer to offsets as either positive or negative relative to the base pointer.
Consider the stack layout for func, defined above:
| locals of calling function | | |
| argument 2 | 16(%ebp) | |
| argument 1 | 12(%ebp) | |
| argument 0 | 8(%ebp) | |
| old %eip register | 4(%ebp) | |
| old %ebp register | (%ebp) | <-- %ebp |
| local variable 0 | -4(%ebp) | |
| local variable 1 | -8(%ebp) | |
| local variable 2 | -12(%ebp) | <-- %esp |
| space for current function | | |
Note that the base pointer points to the middle of the stack layout. The arguments to the function are located at positive offsets relative to %ebp: argument zero (the leftmost argument in C) is always at 8(%ebp), argument one at 12(%ebp), and so forth. The old instruction pointer and base pointer are stored at 4(%ebp) and (%ebp); these are needed to return when the function is complete. Variables local to the function are stored at negative offsets. Finally, the stack pointer points to the last local variable. If we must use the stack for additional purposes, data will be pushed at further negative offsets.
Take a moment to sketch out how this stack layout was arrived at. The caller is responsible for setting up items at positive values. First, the caller must push all the arguments onto the stack in reverse order. When the CALL instruction is executed, the old program counter is pushed onto the stack so that control can be returned to the caller when the function completes. The called function then pushes the old base pointer onto the stack, and makes space for three local variables.
Within the function, we may use base-relative addressing against the base pointer to refer to both arguments and locals. Argument N is located at +8+N*4(%ebp) and local variable N is located at -4-N*4(%ebp). (Although, you can't use that syntax directly.)
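Since N is only known to the compiler, a code generator would typically compute the constant offset itself and print the finished operand. A minimal sketch in C follows; arg_offset, local_offset, and format_operand are hypothetical names, and the sketch assumes every argument and local variable is 4 bytes wide, as in the layout above.

```c
#include <assert.h>
#include <stdio.h>

/* Offset of argument n from %ebp: skips the saved %ebp and the
   return address, which together occupy 8 bytes. */
int arg_offset(int n)
{
    assert(n >= 0);
    return 8 + 4 * n;
}

/* Offset of local variable n from %ebp: locals sit just below
   the saved %ebp. */
int local_offset(int n)
{
    assert(n >= 0);
    return -4 - 4 * n;
}

/* Format the operand text, e.g. "12(%ebp)" or "-8(%ebp)". */
void format_operand(char *buf, size_t size, int offset)
{
    snprintf(buf, size, "%d(%%ebp)", offset);
}
```

For example, argument 2 comes out as 16(%ebp) and local variable 1 as -8(%ebp), matching the stack layout table.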
There is one more complication: each function needs to use a selection of registers to perform computations. However, what happens when one function is called in the middle of another? We do not want any registers currently in use by the caller to be clobbered by the called function. To prevent this, each function must save and restore all of the registers that it uses by pushing them onto the stack at the beginning, and popping them off of the stack before returning.
Here is a complete example that puts it all together. Suppose that you have a C function defined as follows:
int addthree( int a, int b, int c )
{
int x;
x = a+b+c;
return x;
}
Here is a straightforward translation of the function:
.globl addthree
addthree:
pushl %ebp # save the base pointer
movl %esp, %ebp # set new base pointer to esp
subl $4,%esp # allocate one local variable
pushl %ebx # save registers that we will use
pushl %ecx
pushl %edx
movl 8(%ebp), %ebx # load each arg into a register
movl 12(%ebp), %ecx
movl 16(%ebp), %edx
addl %edx, %ecx # add the args together
addl %ecx, %ebx
movl %ebx, -4(%ebp) # store the result into local 0
movl -4(%ebp), %eax # move local 0 into the result
popl %edx # restore temporary registers
popl %ecx
popl %ebx
addl $4,%esp # de-allocate local variables
leave
ret
Happy compiling!