Wednesday, 24 April 2013

What's in that wchar_t*?!

I came across a situation yesterday where I wanted to see the contents of a wide character string while in GDB. For regular strings, GDB has the 'x' command, which examines memory and takes a format specifier, so x/s usually does the job. Unfortunately, GDB doesn't seem to have built-in support for printing wide character strings (or maybe I missed it while going through the docs). A quick search turned up workarounds that seemed a bit like overkill to me.

(gdb) disas main
Dump of assembler code for function main:
0x08048596 <main+0>: push ebp
0x08048597 <main+1>: mov ebp,esp
0x08048599 <main+3>: sub esp,0x18
0x0804859c <main+6>: and esp,0xfffffff0
0x0804859f <main+9>: mov eax,0x0
0x080485a4 <main+14>: sub esp,eax
0x080485a6 <main+16>: cmp DWORD PTR [ebp+0x8],0x2
0x080485aa <main+20>: je 0x80485ca <main+52>
0x080485ac <main+22>: mov eax,DWORD PTR [ebp+0xc]
0x080485af <main+25>: mov eax,DWORD PTR [eax]
0x080485b1 <main+27>: mov DWORD PTR [esp+0x4],eax
0x080485b5 <main+31>: mov DWORD PTR [esp],0x8048760
0x080485bc <main+38>: call 0x80483b8 <printf@plt>
0x080485c1 <main+43>: mov DWORD PTR [ebp-0x4],0x0
0x080485c8 <main+50>: jmp 0x8048618 <main+130>
0x080485ca <main+52>: call 0x804852d <pass>
0x080485cf <main+57>: mov DWORD PTR [esp+0x8],0x64
0x080485d7 <main+65>: mov eax,DWORD PTR [ebp+0xc]
0x080485da <main+68>: add eax,0x4
0x080485dd <main+71>: mov eax,DWORD PTR [eax]
0x080485df <main+73>: mov DWORD PTR [esp+0x4],eax
0x080485e3 <main+77>: mov DWORD PTR [esp],0x80491a0
0x080485ea <main+84>: call 0x80483a8 <mbstowcs@plt>
0x080485ef <main+89>: mov DWORD PTR [esp+0x4],0x8049140
0x080485f7 <main+97>: mov DWORD PTR [esp],0x80491a0
0x080485fe <main+104>: call 0x80483d8 <wcscmp@plt>
0x08048603 <main+109>: test eax,eax
0x08048605 <main+111>: jne 0x804860c <main+118>
0x08048607 <main+113>: call 0x80484b4 <win>
0x0804860c <main+118>: mov DWORD PTR [esp],0x8048795
0x08048613 <main+125>: call 0x80483e8 <puts@plt>
0x08048618 <main+130>: mov eax,DWORD PTR [ebp-0x4]
0x0804861b <main+133>: leave
0x0804861c <main+134>: ret

Looking at the disassembly of main, we can see that one of the command line arguments (argv[1]) is converted to a wide character string by mbstowcs and stored in a global buffer at 0x80491a0 (I guess it could be static as well, but meh). The contents of that string are already known, since we supply them ourselves, so it isn't too interesting. Looking a little further down..

0x080485ef <main+89>: mov DWORD PTR [esp+0x4],0x8049140
0x080485f7 <main+97>: mov DWORD PTR [esp],0x80491a0
0x080485fe <main+104>: call 0x80483d8 <wcscmp@plt>

Now we see a comparison taking place between our string and another string. What is it being compared against? The disassembly shows that the other string lives in a global as well, at 0x8049140. In a small program that doesn't contain many wide character strings, there's a very simple but imprecise way of finding out.
There is the Linux strings command. By default, strings only prints sequences of regular (single-byte) printable characters that are at least 4 characters long and followed by an unprintable character. Luckily for us, strings takes a switch that lets us specify the encoding of the strings we're looking for.

 -e --encoding={s,S,b,l,B,L} Select character size and endianness: s = 7-bit, S = 8-bit, {b,l} = 16-bit, {B,L} = 32-bit

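So strings -e L <binary> will get us our answer, somewhere inside a mess of other things. Note the capital L: wchar_t is 32 bits wide on Linux, and the lowercase l would look for 16-bit characters instead. In a small program this shouldn't be too troublesome, but in larger programs it could become annoying.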
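To make it clearer what that kind of scan is actually looking for, here is a rough sketch in Python of the same idea: walk the file and report runs of printable ASCII characters that are each stored as a 32-bit little-endian unit. It assumes a 4-byte little-endian wchar_t (as on x86 Linux). The function name find_wide_strings and the sample blob are made up for illustration, and this is not how strings itself is implemented.

```python
import struct


def find_wide_strings(data: bytes, min_len: int = 4):
    """Return (offset, text) pairs for runs of printable ASCII characters
    stored as 32-bit little-endian units (hypothetical helper)."""
    results = []
    pos = 0
    while pos + 4 <= len(data):
        chars = []
        cur = pos
        # Collect consecutive 4-byte units that hold a printable ASCII code.
        while cur + 4 <= len(data):
            (unit,) = struct.unpack_from("<I", data, cur)
            if not 0x20 <= unit <= 0x7E:
                break
            chars.append(chr(unit))
            cur += 4
        if len(chars) >= min_len:
            results.append((pos, "".join(chars)))
            pos = cur          # skip past the run we just reported
        else:
            pos += 1           # no run here, try the next byte offset
    return results


if __name__ == "__main__":
    # Made-up blob: padding, one wide string, then an ordinary narrow string.
    blob = (b"\x00" * 6
            + "SecretPW".encode("utf-32-le") + b"\x00" * 4
            + b"narrow string\x00")
    for offset, text in find_wide_strings(blob):
        print(hex(offset), text)
```

The narrow string is skipped because its bytes, read four at a time, don't form small character codes. That is also why a plain strings run never shows the wide string.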

The better, more precise way to get this done is from within gdb itself. If the program is using those wide char functions, then libc is loaded, so its functions for printing wide character strings should be available to call as well; wprintf in particular.

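So all that is really necessary is to call wprintf and pass the address of the string as the argument. (The string ends up being used as the format string here, so this assumes it doesn't contain any % characters.)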


(gdb) call wprintf(0x8049140)
$1 = 8
(gdb) call fflush(stdout)
SecretPW$2 = 0

Since stdout is buffered and only gets flushed when something like a newline is printed, wprintf's output doesn't show up right away; the $1 = 8 is just its return value, the number of wide characters written. To see the output we can either print a newline or flush stdout. A call to fflush() does the trick, and the contents of the string (SecretPW) appear right before the $2 = 0 that fflush returns.
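
If the buffering part seems a bit magical, the same behaviour can be imitated outside of gdb. The sketch below is only an analogy in Python, not glibc's stdio: a line-buffered text stream sits on top of an in-memory sink standing in for the terminal, and we check what has reached the sink before and after an explicit flush. The helper name write_and_peek is made up for illustration.

```python
import io


def write_and_peek(text: str):
    """Write text to a line-buffered stream and return what the underlying
    sink holds before and after an explicit flush (hypothetical helper)."""
    sink = io.BytesIO()  # stands in for the terminal
    out = io.TextIOWrapper(sink, encoding="utf-8",
                           newline="\n", line_buffering=True)
    out.write(text)
    before = sink.getvalue()   # what has actually been emitted so far
    out.flush()                # plays the role of fflush(stdout)
    after = sink.getvalue()
    return before, after


if __name__ == "__main__":
    # No newline: nothing reaches the sink until the flush.
    print(write_and_peek("SecretPW"))
    # With a newline, the line-buffered stream flushes on its own.
    print(write_and_peek("SecretPW\n"))
```

Without the newline the text just sits in the buffer, which is the situation we were in after the call to wprintf.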
