Tuesday 21 July 2015

Function call implementation in assembly language

Prerequisites:
1. What is variable - variables for hackers: http://programmingethicalhackerway.blogspot.in/2015/07/what-is-variable-variables-for-hackers.html
2. Concept of function in C - Programmer section: http://programmingethicalhackerway.blogspot.in/2015/07/concept-of-function-in-c-programmer.html
3. Introduction to stack: http://programmingethicalhackerway.blogspot.in/2015/07/introduction-to-stack.html

A function can be defined as:
A set of instructions grouped into a smaller sub-program.
OR
A set of instructions that performs a particular task. Because we need that task several times in our program, the instructions are grouped into a smaller sub-program called a function.

How do functions work?
Functions are composed of several different pieces:
Function name: A function’s name is a symbol that represents the address where the function’s code starts. In assembly language, the symbol is defined by typing the function’s name as a label before the function’s code. This is just like the labels you have used for jumping.

Function parameters: When a function is invoked, you pass a value for each parameter. This value is referred to as the actual parameter or argument. For example, in mathematics there is a sine function. If you were to ask a computer to find the sine of 2, sine would be the function’s name, and 2 would be the parameter.

Local variables: Variables that are declared inside a function or block are called local variables. They can be used only by statements inside that function or block of code. They are like a scratch pad of paper: a function gets a new piece of paper every time it is activated, and it has to throw the paper away when it finishes processing.

Static variables: According to Wikipedia, a static variable is a variable that has been allocated statically, so its lifetime or "extent" extends across the entire run of the program. Static variables are generally not used unless absolutely necessary, because a function that keeps hidden state between calls is harder to reuse and to debug.

Global variables: Variables defined outside of any function, typically at the beginning of the code, are called global variables.
A global variable can be read from and written to by any function, and changes to it persist between function calls.

Return address: The return address is a parameter that tells the function where to resume executing after the function is completed. This is needed because functions can be called from many different parts of your program, and the function needs to be able to get back to wherever it was called from. In most programming languages, this parameter is passed automatically when the function is called. In assembly language, the call instruction handles passing the return address for you, and ret handles using that address to return to where you called the function from.

Return value: The return value is the main method of transferring data back to the caller. Most programming languages allow only a single return value for a function.

Stack-based exploits are made possible by the way the call and ret instructions work. When a function is called, the address of the instruction following the call (the return address) is pushed onto the stack, beginning the stack frame. When the function finishes, the ret instruction pops the return address from the stack and loads it into EIP, so execution continues from there. By overwriting the stored return address on the stack before the ret instruction executes, we can take control of a program’s execution.

A function call in assembly language simply requires pushing the arguments to the function onto the stack in reverse order and issuing a call instruction. After the call returns, the arguments are removed from the stack.
For example, consider the C code:
printf("The sum is %d", 10);

This code can be translated into assembly language as such:
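.section .data
    string:
        .ascii "The sum is %d\0"

.section .text
    pushl $10          # second argument, pushed first
    pushl $string      # first argument: address of the format string
    call printf
    addl $8, %esp      # remove both 4-byte arguments from the stack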
   

C calling conventions: The way variables are stored, and the way parameters and return values are transferred, varies from language to language. This variance is known as a language’s calling convention, because it describes how functions expect to receive and return data when they are called.

You cannot write assembly-language functions without understanding how the computer’s stack works. Each computer program that runs uses a region of memory called the stack to enable functions to work properly.


In the previous section we saw a simple example of calling a subroutine in x86 assembly language. In practice, such simple examples are rarely enough. When more complex subroutines are combined in a single program, a number of complicating issues arise. For example, how are parameters passed to a subroutine? Can subroutines overwrite the values in a register, or does the caller expect the register contents to be preserved? Where should local variables in a subroutine be stored? How should results be returned from functions?
To allow separate programmers to share code and develop libraries for use by many programs, and to simplify the use of subroutines in general, programmers typically adopt a common calling convention.


The C calling convention is based heavily on the use of the hardware-supported stack. It is based on the push, pop, call, and ret instructions. Subroutine parameters are passed on the stack. Registers are saved on the stack, and local variables used by subroutines are placed in memory on the stack. The vast majority of high-level procedural languages implemented on most processors have used similar calling conventions.
The calling convention is broken into two sets of rules. The first set of rules is employed by the caller of the subroutine, and the second set of rules is observed by the writer of the subroutine (the “callee”).

The Caller’s Rules :
1. Before calling a subroutine, the caller should save the contents of certain registers that are designated caller-saved. The caller-saved registers are EAX, ECX, and EDX (EBX, in contrast, is callee-saved). If you want the contents of these registers to be preserved across the subroutine call, push them onto the stack.
2. To pass parameters to the subroutine, push them onto the stack before the call. The parameters should be pushed in inverted order (i.e. last parameter first). Since the stack grows down, the first parameter will be stored at the lowest address (this inversion of parameters was historically used to allow functions to be passed a variable number of parameters).
3. To call the subroutine, use the call instruction. This instruction places the return address on top of the parameters on the stack, and branches to the subroutine code.
4. After the subroutine returns (i.e. immediately following the call instruction), the caller must remove the parameters from the stack. This restores the stack to its state before the call was performed.
5. The caller can expect to find the return value of the subroutine in the register EAX.
6. The caller restores the contents of the caller-saved registers (EAX, ECX, EDX) that it saved by popping them off of the stack. The caller can assume that no other registers were modified by the subroutine.

The Callee’s Rules:
1. At the beginning of the subroutine, the function should push the value of EBP onto the stack, and then copy the value of ESP into EBP using the following instructions:
    pushl %ebp
    movl %esp, %ebp
The reason for this initial action is the maintenance of the base pointer, EBP. The base pointer is used by convention as a point of reference for finding parameters and local variables on the stack. Essentially, when any subroutine is executing, the base pointer is a “snapshot” of the stack pointer value from when the subroutine started executing. Parameters and local variables will always be located at known, constant offsets away from the base pointer value. We push the old base pointer value at the beginning of the subroutine so that we can later restore the appropriate base pointer value for the caller when the subroutine returns. Remember, the caller isn’t expecting the subroutine to change the value of the base pointer. We then move the stack pointer into EBP to obtain our point of reference for accessing parameters and local variables.
2. Next, allocate local variables by making space on the stack. For example, if 3 local integers (4 bytes each) were required, the stack pointer would need to be decremented by 12 to make space for these local variables, i.e.:
    subl $12, %esp
3. Next, the values of any registers that are designated callee-saved and that will be used by the function must be saved. To save registers, push them onto the stack. The callee-saved registers are EBX, EDI, and ESI (in the standard C convention EBX belongs to this group, not to the caller-saved registers). ESP and EBP will also be preserved by the calling convention, but need not be pushed on the stack during this step.
4. When the function is done, the return value for the function should be placed in EAX if it is not already there.
5. The function must restore the old values of any callee-saved registers (EBX, EDI, and ESI) that were modified. The register contents are restored by popping them from the stack. Note that the registers should be popped in the inverse order that they were pushed.
6. Next, we deallocate local variables. The obvious way to do this might be to add the appropriate value to the stack pointer (since the space was allocated by subtracting the needed amount from the stack pointer). In practice, a less error-prone way is to copy the value of the base pointer into the stack pointer:
    movl %ebp, %esp
This trick works because the base pointer always contains the value that the stack pointer contained immediately prior to the allocation of the local variables.
7. Immediately before returning, we must restore the caller’s base pointer value by popping EBP off the stack. Remember, the first thing we did on entry to the subroutine was to push the base pointer to save its old value.
8. Finally, we return to the caller by executing a ret instruction. This instruction will find and remove the appropriate return address from the stack.

The first half of the callee’s rules apply to the beginning of the function, and are therefore commonly said to define the prologue of the function. The latter half of the rules apply to the end of the function, and are thus commonly said to define the epilogue of the function.

Call Convention Example: The calling convention example is covered in detail here: Concept of function in C - Hacker section: http://programmingethicalhackerway.blogspot.in/2015/07/concept-of-function-in-c-hacker-section.html
A good way to visualize the operation of the calling convention is to draw the contents of the nearby region of the stack during subroutine execution.

If you like this post or have any questions, please feel free to comment!
