Arguments and Result
In x86-64 assembly, the function prologue and epilogue manage the stack frame of a function call: the prologue sets the frame up on entry, and the epilogue tears it down just before the function returns. Here's an overview:
Setting up the stack frame (Prologue)
- Push the base pointer onto the stack to save the previous base pointer value.
  push rbp
- Move the stack pointer into the base pointer to set up a new base frame.
  mov rbp, rsp
- Allocate space on the stack for local variables by moving the stack pointer. <size> represents the number of bytes needed for local variables.
  sub rsp, <size>
Cleaning up the stack frame (Epilogue)
- Move the base pointer back into the stack pointer to deallocate local variables.
  mov rsp, rbp
- Pop the old base pointer off the stack to restore the previous base pointer value.
  pop rbp
- Return from the function using the ret instruction.
  ret
Note: The leave instruction is shorthand for the mov and pop operations in the epilogue. Essentially, leave resets the stack pointer and base pointer to the values they had on entry to the function, leaving the return address on top of the stack for ret.
These steps ensure that each function has its own stack frame, which is essential for local variable management and for the return address to be properly restored when the function completes. Remember that the calling convention may require you to save and restore other registers as well.
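To make the "each function has its own stack frame" point concrete, here is a minimal C sketch. The names record_frames, frames_are_distinct, local_addr, and MAX_DEPTH are hypothetical helpers invented for this illustration. It records the address of one local variable per active call and checks that no two active calls share it; the exact addresses depend on the compiler and platform.

```c
#include <stdint.h>

#define MAX_DEPTH 8

/* Hypothetical storage: one recorded address per active call. */
static uintptr_t local_addr[MAX_DEPTH];

/* Recurses until `level` reaches `depth` and returns the level reached.
   Each call records where its own local variable lives. */
int record_frames(int level, int depth) {
    volatile int local = level;            /* stored in this call's frame */
    if (level >= depth || level >= MAX_DEPTH) {
        return level;
    }
    local_addr[level] = (uintptr_t)&local;
    int reached = record_frames(level + 1, depth);
    /* The local still holds its value: deeper calls used their own frames. */
    return local == level ? reached : -1;
}

/* Returns 1 if the first `count` recorded addresses are all different. */
int frames_are_distinct(int count) {
    for (int i = 0; i < count; i++) {
        for (int j = i + 1; j < count; j++) {
            if (local_addr[i] == local_addr[j]) {
                return 0;
            }
        }
    }
    return 1;
}
```

Calling record_frames(0, 4) and then frames_are_distinct(4) from a main function should report that all four locals occupied different addresses, since each one belonged to a different frame that was live at the same time.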
Passing Arguments to Functions
Arguments can be passed to functions in a couple of ways:
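- Register usage for arguments - In many calling conventions, the first few arguments are passed in registers, which is faster than using the stack. The specific registers depend on the architecture and the calling convention. On x86-64, the System V ABI (used by Linux and macOS) passes the first six integer or pointer arguments in RDI, RSI, RDX, RCX, R8, and R9, while the Windows x64 convention uses RCX, RDX, R8, and R9. On 32-bit x86, the default C convention passes every argument on the stack, although some register-based conventions use EAX, EDX, and ECX.
- Stack usage for additional arguments - If there are more arguments than there are registers available for argument passing, or if the calling convention dictates, additional arguments are passed on the stack. Arguments that are too large to fit in registers, such as big structs passed by value, are also passed on the stack.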
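As a small sketch of the second case, consider a function with more integer arguments than there are argument registers. The function name sum8 is hypothetical, and the register assignments in the comments assume the System V x86-64 convention; other conventions place the arguments differently.

```c
/* Assuming the System V x86-64 convention:
   a, b, c, d, e, f arrive in rdi, rsi, rdx, rcx, r8, r9;
   g and h do not fit in registers, so the caller passes them on the stack. */
long sum8(long a, long b, long c, long d,
          long e, long f, long g, long h) {
    return a + b + c + d + e + f + g + h;
}
```

Compiling a call such as sum8(1, 2, 3, 4, 5, 6, 7, 8) without optimization and reading the generated assembly should show six register moves plus two values placed on the stack before the call instruction.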
So, with this information in mind, let's look at an example of a C function call that takes just a few simple arguments:
#include <stdio.h>

// Function declaration
void greet(char *name, int age);

int main() {
    // Function call with arguments
    greet("Alice", 30);
    return 0;
}

// Function definition
void greet(char *name, int age) {
    printf("Hello, %s! You are %d years old.\n", name, age);
}
In this example, the function greet is declared and defined to take two arguments: a string name and an integer age. The main function then calls greet with the arguments "Alice" and 30. When you run this program, it will output: Hello, Alice! You are 30 years old.
Let's look at it as assembly:
section .rodata
    helloString db "Hello, %s! You are %d years old.", 10, 0 ; Format string with a newline and null terminator

section .text
    global main
    extern printf

; greet function: receives name in rdi and age in esi
greet:
    push rbp
    mov rbp, rsp
    sub rsp, 16             ; Allocate space for local variables if needed
    mov edx, esi            ; Third printf argument: age (arrived in esi)
    mov rsi, rdi            ; Second printf argument: name (arrived in rdi)
    mov rdi, helloString    ; First printf argument: address of the format string
    mov eax, 0              ; No vector registers used for floating point
    call printf             ; Call printf function
    leave
    ret

; main function
main:
    push rbp
    mov rbp, rsp
    sub rsp, 16             ; Allocate space for local variables if needed
    mov rdi, name           ; First argument: address of the name string
    mov esi, 30             ; Second argument: age
    call greet              ; Call greet function
    mov eax, 0              ; Return 0
    leave
    ret

section .data
    name db "Alice", 0      ; Null-terminated string "Alice"
In this example, the greet function takes two arguments:
- A pointer to a character array (the name string)
- An integer (the age)
Here's how the registers are used in the greet function call:
- rdi is used to pass the first argument, which is the pointer to the string "Alice".
- rsi is used to pass the second argument, which is the integer 30.
When calling the printf function from within greet, the arguments are as follows:
- rdi is used for the format string.
- rsi is reused to pass the first variable argument, which is the name string.
- rdx is used to pass the second variable argument, which is the age integer.
- rax is set to 0 because printf is a variadic function, and rax is used to indicate the number of vector registers used; in this case, none are used.
The printf function then uses these register values to access the arguments and print the formatted string to the standard output. 🖨️🔢
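Since printf is variadic, it may help to see how a variadic function reads its own arguments in C. The sketch below uses the standard stdarg.h macros; the function name sum_ints is hypothetical. The compiler, not the programmer, emits the register setup (including the vector register count described above) at each call site.

```c
#include <stdarg.h>

/* Adds up `count` int arguments that follow the fixed parameter. */
int sum_ints(int count, ...) {
    va_list args;
    int total = 0;

    va_start(args, count);            /* start after the last fixed parameter */
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);   /* fetch the next argument as an int */
    }
    va_end(args);
    return total;
}
```

A call such as sum_ints(3, 10, 20, 30) passes the count first, just as printf receives its format string first, and the remaining values follow in the next argument positions.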
Return Values from Functions
Values can be returned from a function in a couple of different ways:
- Return values in registers: For small return values, typically those that are the size of a register or smaller, most calling conventions use registers to return values. On a 64-bit system, the rax register is commonly used (or its lower half, eax, for a 32-bit int). Returning values in registers is fast because it avoids the overhead of memory access. 🚀
- Handling larger return values: When a function needs to return a value too large for the return registers, such as a large struct or the contents of an array, the typical approach is to pass a pointer to memory allocated by the caller, where the function can write the return value. This technique is known as passing the return value by reference. The function signature might require the caller to provide this pointer explicitly; when a large struct is returned by value, the compiler typically passes the pointer as a hidden extra argument. The same approach can return multiple values from a function by passing one pointer for each value that needs to be returned.
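The pointer-based approach can be sketched in C as follows. The struct Stats type and the functions fill_stats and divide are hypothetical names chosen for this illustration: the first writes a large result into caller-provided memory, and the second returns two values through two pointers.

```c
#include <assert.h>

/* Hypothetical result type: four 8-byte fields, too large for rax alone. */
struct Stats {
    long min;
    long max;
    long sum;
    long count;
};

/* Writes the result into memory owned by the caller. Requires n >= 1. */
void fill_stats(const long *values, long n, struct Stats *out) {
    assert(n >= 1);
    out->min = values[0];
    out->max = values[0];
    out->sum = 0;
    out->count = n;
    for (long i = 0; i < n; i++) {
        if (values[i] < out->min) out->min = values[i];
        if (values[i] > out->max) out->max = values[i];
        out->sum += values[i];
    }
}

/* Returns two values at once through two caller-provided pointers. */
void divide(int dividend, int divisor, int *quotient, int *remainder) {
    assert(divisor != 0);
    *quotient = dividend / divisor;
    *remainder = dividend % divisor;
}
```

In both functions the caller declares the variables, passes their addresses, and reads the results after the call returns; nothing is returned in rax.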
Here's an example of a function returning a value.
int add(int a, int b) {
    return a + b;
}

int main() {
    int result = add(3, 4);
    return result;
}
Now, let's look at how this might be translated into assembly:
; Function: add
; Adds two integers and returns the result
add:
    mov eax, edi    ; Move first argument (a) into eax
    add eax, esi    ; Add second argument (b) to eax
    ret             ; Return with result in eax

; Function: main
; Calls add and returns its result
main:
    push rbp        ; Save base pointer
    mov rbp, rsp    ; Set up new base pointer
    mov edi, 3      ; Set first argument (3) for add function
    mov esi, 4      ; Set second argument (4) for add function
    call add        ; Call add function; result will be in eax
                    ; eax already holds the value main returns, so no extra move is needed
    pop rbp         ; Restore base pointer
    ret             ; Return with result in eax
In this assembly code, the add function needs no local variables and calls no other function, so it skips the prologue and epilogue entirely. The main function sets up the stack frame by pushing the base pointer (rbp) onto the stack and then copying the stack pointer (rsp) to rbp. It prepares the arguments for the add function by moving the values 3 and 4 into the edi and esi registers, respectively. The call instruction is used to call the add function. After the call, the result is in the eax register. Because eax is also the register that carries the return value of main, no further move is required. Copying the result through another register such as ebx for temporary storage would be redundant here, and since rbx is a callee-saved register, main would have to save and restore it first. Finally, the stack frame is cleaned up, and the ret instruction is used to return from main.
Check Your Understanding: Implement a simple C program that uses recursion to calculate a factorial (e.g., 6! = 6 * 5 * 4 * 3 * 2 * 1). Use your compiler or https://gcc.godbolt.org/ to generate the assembly. Study it and trace through the instructions that modify the stack frame, pass arguments, and return values.
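If you would like a starting point, the following is one possible sketch of the recursive function; the name factorial and the choice of unsigned types are assumptions, not requirements of the exercise. Compile it without optimization when studying the output, since an optimizing compiler may replace the recursion with a loop and hide the stack frames you are looking for.

```c
/* Recursive factorial: each call multiplies n by the result of the call below it.
   The result overflows a 64-bit value for n > 20. */
unsigned long long factorial(unsigned int n) {
    if (n <= 1) {
        return 1;                     /* base case: 0! and 1! are both 1 */
    }
    return n * factorial(n - 1);      /* recursive case */
}
```

When tracing the assembly, look for where n is placed before each recursive call and where the returned product is read after it.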
This concludes the explanation of how the computer provides low-level support for function calls. In the next section, we will dive into a couple of more complex data structures that we commonly use in C: arrays and structures (aka structs).