Monday, 20 July 2015

Basic Assembly Programming Survival Skills for Hackers!

Many people (programmers and hackers alike) think that high-level languages such as Java and C/C++ are more useful than assembly language. That is a myth, because assembly language lets you do things you cannot do in other programming languages. Without it you will struggle to find a 0-day in software, because a debugger shows you the program as assembly code. In reality you do not need to be able to write assembly; you need to be able to analyze malware and exploits, which is a completely different skill from coding for functionality. You do not need to be an assembly coder, but you should be able to read it and understand it. So let's start...

At the heart of every computer are exactly two things: the CPU and memory.
The 8086 was the first x86 processor. It was developed and manufactured by Intel.
The processor (CPU) does the actual work in your system. Basically, it is designed to do the following things:
->  Fetch an instruction from memory.
->  Execute the instruction.

The CPU contains the following elements to accomplish this:
Program Counter
Instruction Decoder
Data bus
General-purpose registers
Arithmetic and logic unit
I will not discuss these in detail here, for the sake of simplicity.
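
The fetch-and-execute cycle described above can be sketched in C as a toy machine. This is only an illustration: the three opcodes, the ToyCpu struct, and toy_run are hypothetical names invented here, and the encoding has nothing to do with real x86 machine code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for a toy machine; these are NOT real x86 encodings. */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2 };

typedef struct {
    size_t  pc;   /* program counter: index of the next byte to fetch */
    int32_t acc;  /* a single general-purpose register */
} ToyCpu;

/* Run the program stored in memory until OP_HALT or the end of memory.
   Returns the final value of the accumulator. */
int32_t toy_run(const uint8_t *memory, size_t size)
{
    ToyCpu cpu = { 0, 0 };

    while (cpu.pc < size) {
        uint8_t opcode = memory[cpu.pc++];   /* fetch the instruction */

        if (opcode == OP_HALT)               /* decode and execute */
            break;
        if (cpu.pc >= size)                  /* operand byte is missing */
            break;

        uint8_t operand = memory[cpu.pc++];  /* fetch the operand */
        if (opcode == OP_LOAD)
            cpu.acc = operand;
        else if (opcode == OP_ADD)
            cpu.acc += operand;
        /* unknown opcodes are skipped together with their operand byte */
    }
    return cpu.acc;
}
```

For example, a memory image holding the five bytes 1, 5, 2, 7, 0 loads 5, adds 7, and halts, so toy_run returns 12.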

Now let's write a very simple C program:
#include <stdio.h>


int main()
{
    printf("Hello World");
 
}

Let’s start by looking at the machine code the main() function was translated into.
The GNU development tools include a program called objdump, which can be used to examine compiled binaries.
At least one of the options -a, -d, -D, -f, -g, -G, -h, -H, -p, -r, -S, -t, -T, -V, -x must be given to tell objdump what information to show.
Usage :
objdump [options] objfiles

options :
-d, --disassemble
Display assembler mnemonic names for the machine instructions. Disassemble only sections that are expected to contain instructions.
-D, --disassemble-all
Disassemble all sections, not just those expected to contain instructions.
-f, --file-headers
Display overall header summary information.
-g, --debugging
Display debugging information.
-h, --section-headers, --headers
Display section-header summary information.
-H, --help
Display help information and exit.
-S, --source
Display source code intermixed with disassembly, if possible. Implies -d.
-V, --version
Print version information and exit.

root@r00t:~/Desktop/c_programming/blog_tutorial# gcc firstprog.c
root@r00t:~/Desktop/c_programming/blog_tutorial# objdump -D a.out | grep -A20 main.:
0804841c <main>:
 804841c:    55                       push   %ebp
 804841d:    89 e5                    mov    %esp,%ebp
 804841f:    83 e4 f0                 and    $0xfffffff0,%esp
 8048422:    83 ec 10                 sub    $0x10,%esp
 8048425:    c7 04 24 d0 84 04 08     movl   $0x80484d0,(%esp)
 804842c:    e8 cf fe ff ff           call   8048300 <printf@plt>
 8048431:    c9                       leave 
 8048432:    c3                       ret   
 8048433:    90                       nop
 8048434:    90                       nop
 8048435:    90                       nop
 8048436:    90                       nop
 8048437:    90                       nop
 8048438:    90                       nop
 8048439:    90                       nop
 804843a:    90                       nop
 804843b:    90                       nop
 804843c:    90                       nop
 804843d:    90                       nop
 804843e:    90                       nop

                                                   



Programs are stored in the computer's memory. A computer program is nothing more than a collection of numbers stored in memory. The processor also needs its own "memory" to hold its current working information locally. These storage locations are called registers, and they are used to store data temporarily. Registers are physically located in the processor itself, so it does not have to fetch their contents from RAM. A register can be thought of as a sort of basic variable that holds whatever value the processor stores in it.
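
To make the "collection of numbers" idea concrete, the sketch below stores the 23 bytes of main() from the objdump listing above in a C array and decodes two of the operands by hand. The byte values and addresses are copied from that listing; the names main_code, read_le32, and call_target are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Address of main(), taken from the objdump listing. */
#define MAIN_ADDR 0x0804841cu

/* The machine code of main(), byte for byte, up to and including ret. */
static const uint8_t main_code[] = {
    0x55,                                      /* push %ebp                */
    0x89, 0xe5,                                /* mov  %esp,%ebp           */
    0x83, 0xe4, 0xf0,                          /* and  $0xfffffff0,%esp    */
    0x83, 0xec, 0x10,                          /* sub  $0x10,%esp          */
    0xc7, 0x04, 0x24, 0xd0, 0x84, 0x04, 0x08,  /* movl $0x80484d0,(%esp)   */
    0xe8, 0xcf, 0xfe, 0xff, 0xff,              /* call printf@plt          */
    0xc9,                                      /* leave                    */
    0xc3                                       /* ret                      */
};

/* x86 stores multi-byte values least significant byte first (little-endian). */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* The e8 form of call holds a 32-bit displacement that is relative to the
   address of the NEXT instruction. offset is the index of the e8 byte. */
uint32_t call_target(size_t offset)
{
    uint32_t next = MAIN_ADDR + (uint32_t)offset + 5u;  /* e8 plus 4 bytes */
    return next + read_le32(&main_code[offset + 1]);    /* wraps modulo 2^32 */
}
```

Here read_le32(&main_code[12]) gives 0x080484d0, the value that the movl instruction stores on the stack, and call_target(16) gives 0x08048300, the address that objdump labels printf@plt.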

The x86 (Intel family) CPUs provide several general-purpose registers for application use:
EAX, ECX, EDX, EBX, ESP, EBP, ESI, and EDI
The first four registers (EAX, ECX, EDX, and EBX) are known as general-purpose registers. They are called the Accumulator, Counter, Data, and Base registers, respectively.
The second four registers (ESP, EBP, ESI, and EDI) are also general-purpose registers, but they are mostly used as pointers and indexes. Their names stand for Stack Pointer, Base Pointer, Source Index, and Destination Index, respectively.
The EIP register is the Instruction Pointer register, which points to the current instruction the processor is reading. Like a child pointing a finger at each word while reading, the processor reads each instruction using the EIP register as its finger.
GDB can show the state of the processor registers right before our code in main() runs. Let's see...

root@r00t:~/Desktop/c_programming/blog_tutorial# gcc simple.c
root@r00t:~/Desktop/c_programming/blog_tutorial# gdb ./a.out
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/Desktop/c_programming/blog_tutorial/a.out...(no debugging symbols found)...done.
(gdb) break main
Breakpoint 1 at 0x804841f
(gdb) run
Starting program: /root/Desktop/c_programming/blog_tutorial/a.out

Breakpoint 1, 0x0804841f in main ()
(gdb) info registers
eax            0xbffff554    -1073744556
ecx            0x66aa2f28    1722429224
edx            0x1    1
ebx            0xb7fc1ff4    -1208213516
esp            0xbffff4a8    0xbffff4a8
ebp            0xbffff4a8    0xbffff4a8
esi            0x0    0
edi            0x0    0
eip            0x804841f    0x804841f <main+3>
eflags         0x246    [ PF ZF IF ]
cs             0x73    115
ss             0x7b    123
ds             0x7b    123
es             0x7b    123
fs             0x0    0
gs             0x33    51
(gdb)

A breakpoint is set on the main() function so that execution stops right before our code is executed. GDB then runs the program, stops at the breakpoint, and is told to display all the processor registers and their current state.
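
In the info registers output above, the second column is the raw register value in hexadecimal. For registers such as eax, ecx, and ebx, the third column is the same 32 bits read as a signed integer, which is why some of the values look negative. A small sketch of that conversion, using a hypothetical helper named as_signed32:

```c
#include <stdint.h>

/* Interpret a 32-bit register value as a signed two's-complement number,
   matching the third column GDB prints for eax, ecx, ebx, and so on. */
int64_t as_signed32(uint32_t raw)
{
    if (raw >= 0x80000000u)                  /* top bit set: negative value */
        return (int64_t)raw - 4294967296LL;  /* subtract 2^32 */
    return (int64_t)raw;
}
```

With the values from the session, as_signed32(0xbffff554) is -1073744556 (eax) and as_signed32(0x66aa2f28) is 1722429224 (ecx).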

Most assembly instructions use these registers to read or write data, so understanding the registers of a processor is essential to understanding the instructions.
A small set of basic x86 instructions is enough to begin with:
MOV, ADD, SUB, CMP, AND, OR, NOT, JE, JNE, JB, JBE, JA, JAE, JMP, IRET, HLT, and NOP.
Six of these instructions (MOV, ADD, SUB, CMP, AND, OR) take two operands, eight (NOT and the jump instructions) take a single operand, and three (IRET, HLT, NOP) take no operands at all. Names such as BRK, HALT, GET, and PUT, which sometimes appear in lists like this one, are not real x86 mnemonics.

The assembly instructions in Intel syntax generally follow this style:
operation <destination>, <source>
The destination and source will each be a register, a memory address, or an immediate value.
The mov operation moves (copies) a value from the source to the destination, sub subtracts the source from the destination, and inc increments its single operand.
For example:
 804841d:    89 e5                   mov    ebp,esp
 8048422:    83 ec 10                sub    esp,0x10

These instructions copy the value of ESP into EBP and then subtract 0x10 from ESP (storing the result in ESP). The cmp operation is used to compare values, and jmp performs an unconditional jump.
We will discuss them in detail later.
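
The effect of the mov and sub instructions above can be mimicked in C, with plain variables standing in for the registers. This is only a model: the Regs struct and run_prologue are hypothetical names, and the and instruction that sits between the two in the full listing of main() is included so that the three steps match the real code.

```c
#include <stdint.h>

typedef struct {
    uint32_t esp;  /* stack pointer */
    uint32_t ebp;  /* base pointer  */
} Regs;

/* Apply, in order, the three instructions at 0x804841d to 0x8048422 of main(). */
Regs run_prologue(Regs r)
{
    r.ebp = r.esp;         /* mov ebp,esp        : copy source into destination */
    r.esp &= 0xfffffff0u;  /* and esp,0xfffffff0 : clear the low four bits      */
    r.esp -= 0x10u;        /* sub esp,0x10       : result is stored in esp      */
    return r;
}
```

Starting from esp = 0xbffff4a8, the value GDB reported at the breakpoint, the model ends with ebp = 0xbffff4a8 and esp = 0xbffff490.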

The code can be shown in Intel syntax by giving objdump an additional command-line option, -M intel. Many people find Intel syntax more readable and easier to understand.
root@r00t:~/Desktop/c_programming/blog_tutorial# objdump -M intel -D a.out | grep -A20 main.:
0804841c <main>:
 804841c:    55                       push   ebp
 804841d:    89 e5                    mov    ebp,esp
 804841f:    83 e4 f0                 and    esp,0xfffffff0
 8048422:    83 ec 10                 sub    esp,0x10
 8048425:    c7 04 24 d0 84 04 08     mov    DWORD PTR [esp],0x80484d0
 804842c:    e8 cf fe ff ff           call   8048300 <printf@plt>
 8048431:    c9                       leave 
 8048432:    c3                       ret   
 8048433:    90                       nop
 8048434:    90                       nop
 8048435:    90                       nop
 8048436:    90                       nop
 8048437:    90                       nop
 8048438:    90                       nop
 8048439:    90                       nop
 804843a:    90                       nop
 804843b:    90                       nop
 804843c:    90                       nop
 804843d:    90                       nop
 804843e:    90                       nop

In GDB, the disassembly syntax can be set to Intel by typing set disassembly-flavor intel. The shorter form set dis intel is ambiguous in this version of GDB, as the session below shows:
root@r00t:~/Desktop/c_programming/blog_tutorial# gdb -q
(gdb) set dis intel
Ambiguous set command "dis intel": disable-randomization, disassemble-next-line, disassembly-flavor, disconnected-tracing...
(gdb)

Well, that is all for basic assembly language. It will help you when debugging binary programs and make you realize that the compiled program is what actually gets executed out in the real world. I will discuss assembly language concepts in detail later. I have tried to keep things simple so that you do not have any problem understanding them.

If you like this post or have any questions, please feel free to comment.
