Introduction |
This article teaches you how to become a smarter reverser by automating your reverse engineering tasks through scripting. It is part of our free "Reverse Engineering & Malware Analysis Course" [Reference 1]. It is primarily written as additional learning material for our session 'Part 5 - Reverse Engineering Tools', in which we will demonstrate important reversing tools. You can visit our training page here [Reference 1] and find the presentations of all previous sessions here [Reference 2].
Reverse engineering is a sophisticated task, especially when we analyse large applications or packed files, whether malware or normal applications, for vulnerabilities.
Some of the common tasks include |
|
These are just some simple examples where automation helps a great deal. For example, let's say that we want to monitor HeapAlloc calls in an application. The application may call HeapAlloc hundreds of times, but we only want to log the calls that match specific values, such as an allocation request greater than 1024 bytes. A simple script gives us all of this information almost immediately, whereas by hand we would have to set a breakpoint on HeapAlloc and check on every hit whether the allocation size is greater than 1024 bytes, which increases the analysis time for such a simple task.
In this article, I will show you how to automate some of these common tasks through scripting for the main reversing debuggers, i.e. Ollydbg, Immunity Debugger, Pydbg & Windbg, with practical code samples.
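To make the HeapAlloc example concrete, here is a minimal sketch of the filtering step on its own, written in plain Python with no debugger attached. The call records are made-up sample data and the helper name interesting_allocations is hypothetical; in a real script the records would come from a breakpoint handler.

```python
# Hypothetical sketch: filter recorded HeapAlloc calls by requested size.
# The records below are invented sample data, not real debugger output.

SIZE_THRESHOLD = 1024  # only allocations larger than this are of interest


def interesting_allocations(calls, threshold=SIZE_THRESHOLD):
    """Return the (caller, size) records whose size exceeds the threshold."""
    return [(caller, size) for caller, size in calls if size > threshold]


# (return address of the caller, requested size in bytes)
recorded_calls = [
    (0x00401020, 16),
    (0x00401050, 4096),
    (0x004010A0, 1024),
    (0x00401100, 2048),
]

for caller, size in interesting_allocations(recorded_calls):
    print("HeapAlloc(%d bytes) called from 0x%08x" % (size, caller))
```

Only the 4096 and 2048 byte requests are reported; the request of exactly 1024 bytes is skipped because the condition is "greater than".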
Ollydbg - Playing with OllyScript |
Ollydbg [Reference 3] is one of the best ring 3 (user-land) debuggers. It has a very nice GUI, is one of the most popular debuggers on the planet, and has very mature community support. Ollydbg is my all-time favourite debugger :) However, Ollydbg doesn't support scripting natively; instead it supports plugins, so people have written scripting plugins for it. The one that I will use in this article is Ollyscript by ShaG. You can download Ollyscript from here [Reference 4]. Ollyscript comes with a nice help file. Its syntax is similar to assembly language and very easy to understand. It supports almost all the functionality you would expect, such as dumping memory and decision making. But when you compare it with the scripting environments of other debuggers, it seems rather rigid; I will discuss this later in the article. So let's understand the Ollydbg scripting environment, i.e. Ollyscript, with the help of a simple example.
Problem Statement: |
Let's say we are analysing an application for a simple bug and we want to identify the function that is actually causing the problem. But the function is deep inside the application, and finding it manually would take hours of analysis. So here we want to track the execution flow from a specific point up to the function that is causing the problem; more precisely, we want to log the return address of each function call.
Solution: |
The above problem can be solved by multiple methods but to demonstrate it in a very simple way I will use the following steps, |
|
Below is a tiny script that accomplishes this task. Please note that the script only demonstrates the concept; it may fail when a call follows a decision (conditional branch) instruction. :)
    /*
    Author: Amit Malik
    http://www.securityxploded.com
    */
    EOB breakprocess
    var return
    var infunction
    var x
    var y
    mov infunction,EIP
    mov return,EIP
    start:
    findop return,#E8#
    mov x,$RESULT
    findop infunction,#E8#
    mov y,$RESULT
    cmp x,0
    ja breaksetx
    backx:
    cmp y,0
    ja breaksety
    backy:
    run
    breakprocess:
    sti
    mov return,[esp]
    msg return
    sti
    mov infunction,EIP
    jmp start
    breaksetx:
    bp x
    jmp backx
    breaksety:
    bp y
    jmp backy
Please refer to the Ollyscript help file [Reference 4] for more details. Here I will explain only the important keywords and terms. The script starts with EOB (Execute over breakpoint). As the name states, when a breakpoint is hit it executes the code under the label specified with EOB. In this script that is the breakprocess label.
var - declares a variable
mov - is similar to mov in assembly
findop - searches for an opcode starting from the specified address and stores the result in the $RESULT variable
run - is similar to F9 in Ollydbg
sti - step into, similar to F7 in Ollydbg
msg - shows a message box (log should be used, but I used msg just for visual pleasure :))
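The script looks for the byte E8, which is the x86 opcode for a near relative call, and then reads the return address from [esp] after stepping into the call. As a rough illustration of why that works, the sketch below decodes such a call from raw bytes in plain Python. The function name decode_call, the sample bytes, and the base address are all hypothetical, and the sketch assumes 32-bit code where an E8 call is always 5 bytes long.

```python
import struct

CALL_REL32 = 0xE8  # opcode of the x86 near relative CALL (E8 + 4-byte offset)
CALL_LENGTH = 5    # 1 opcode byte + 4 bytes of signed displacement


def decode_call(code, offset, base):
    """Decode the E8 call at code[offset]; base is the address of code[0].

    Returns (call_address, return_address, target_address).
    """
    if code[offset] != CALL_REL32:
        raise ValueError("no E8 call at this offset")
    (rel32,) = struct.unpack_from("<i", code, offset + 1)  # signed displacement
    call_addr = base + offset
    return_addr = call_addr + CALL_LENGTH   # value the CPU pushes on the stack
    target = (return_addr + rel32) & 0xFFFFFFFF
    return call_addr, return_addr, target


# nop; call +0x10; ret  (invented sample bytes)
sample = b"\x90\xe8\x10\x00\x00\x00\xc3"
call_addr, return_addr, target = decode_call(sample, 1, 0x00401000)
print("call at %08x returns to %08x and jumps to %08x"
      % (call_addr, return_addr, target))
```

The return address is simply the address of the instruction after the call, which is the value the script reads from [esp] once it has stepped into the function.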
As you can see, the scripting is similar to assembly language. Most of the time people use Ollyscript for unpacking malware; I have never seen anyone use it for vulnerability analysis. It is not very flexible and is limited in functionality, but it can be used for tasks that we want to automate in Ollydbg.
Immunity Debugger |
Immunity Debugger [Reference 3] is a debugger with a built-in Python API and a GUI similar to Ollydbg. It is developed by Immunity Inc., and according to Immunity it is the only debugger designed specifically for vulnerability research. It has some very powerful pycommands such as heap, lookasidelist etc. One of the major advantages of this debugger is that it provides a plethora of APIs for various reversing tasks and supports Python, which makes it one of the best debuggers for reversing. In the reference section [Reference 6] you can find some good tutorials and projects based on Immunity Debugger. It also comes with a nice help file, so don't forget to check that as well.
Problem statement: |
We want to search all "jmp esp" instruction addresses. |
Solution Script: |
You can run the script below directly in the Immunity Debugger Python shell.
    data = "jmp esp"
    asm = imm.assemble(data)  # imm is an object of the immlib class
    results = imm.search(asm)
    for addr in results:
        print "%s %0.8x" % (data,addr)
The above 5 lines of code will give you all the "jmp esp" addresses. This is the beauty of scripting :) |
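For readers without Immunity Debugger at hand, the same idea can be sketched in plain Python: "jmp esp" assembles to the two bytes FF E4, so the search amounts to finding every occurrence of that byte pair in memory. The helper find_all, the sample buffer, and the base address below are hypothetical stand-ins for process memory; this is only an illustration of the concept, not how immlib implements its search.

```python
JMP_ESP = b"\xff\xe4"  # machine code for "jmp esp" on x86


def find_all(haystack, needle, base=0):
    """Return the address of every occurrence of needle in haystack."""
    addresses = []
    pos = haystack.find(needle)
    while pos != -1:
        addresses.append(base + pos)
        pos = haystack.find(needle, pos + 1)  # keep scanning past this hit
    return addresses


# Invented bytes standing in for a region of process memory
sample = b"\x90\xff\xe4\xc3\x90\x90\xff\xe4"
for addr in find_all(sample, JMP_ESP, base=0x7C800000):
    print("jmp esp %08x" % addr)
```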
Pydbg |
Pydbg [Reference 3] is a debugger written in pure Python. Pydbg is my favourite debugger: I use it in various automation tasks, and it is extremely flexible and powerful.
Problem Statement: |
We want to track the VirtualAlloc API: whenever VirtualAlloc is called, our script should display its arguments and the returned pointer.
VirtualAlloc:
    LPVOID WINAPI VirtualAlloc(
        __in_opt LPVOID lpAddress,
        __in SIZE_T dwSize,
        __in DWORD flAllocationType,
        __in DWORD flProtect
    );
Solution: |
    # Author: Amit Malik
    # http://www.securityxploded.com
    import sys
    import pefile
    import struct
    from pydbg import *
    from pydbg.defines import *

    def ret_addr_handler(dbg):
        lpAddress = dbg.context.Eax  # Get value returned by VirtualAlloc
        print " Returned Pointer: ",hex(int(lpAddress))
        return DBG_CONTINUE

    def virtual_handler(dbg):
        print "****************"
        pdwSize = dbg.context.Esp + 8  # 2nd argument to VirtualAlloc
        rdwSize = dbg.read_process_memory(pdwSize,4)
        dwSize = struct.unpack("L",rdwSize)[0]
        dwSize = int(dwSize)
        print "Allocation Size: ",hex(dwSize)
        pflAllocationType = dbg.context.Esp + 12  # 3rd argument to VirtualAlloc
        rflAllocationType = dbg.read_process_memory(pflAllocationType,4)
        flAllocationType = struct.unpack("L",rflAllocationType)[0]
        flAllocationType = int(flAllocationType)
        print "Allocation Type: ",hex(flAllocationType)
        pflProtect = dbg.context.Esp + 16  # 4th argument to VirtualAlloc
        rflProtect = dbg.read_process_memory(pflProtect,4)
        flProtect = struct.unpack("L",rflProtect)[0]
        flProtect = int(flProtect)
        print "Protection Type: ",hex(flProtect)
        pret_addr = dbg.context.Esp  # Get return address
        rret_addr = dbg.read_process_memory(pret_addr,4)
        ret_addr = struct.unpack("L",rret_addr)[0]
        ret_addr = int(ret_addr)
        dbg.bp_set(ret_addr,description="ret_addr breakpoint",restore = True,handler = ret_addr_handler)
        return DBG_CONTINUE

    def entry_handler(dbg):
        virtual_addr = dbg.func_resolve("kernel32.dll","VirtualAlloc")  # Get VirtualAlloc address
        if virtual_addr:
            dbg.bp_set(virtual_addr,description="Virtualalloc breakpoint",restore = True,handler = virtual_handler)
        return DBG_CONTINUE

    def main():
        file = sys.argv[1]
        pe = pefile.PE(file)
        # get entry point
        entry_addr = pe.OPTIONAL_HEADER.AddressOfEntryPoint + pe.OPTIONAL_HEADER.ImageBase
        dbg = pydbg()  # get pydbg object
        dbg.load(file)
        dbg.bp_set(entry_addr,description="Entry point breakpoint",restore = True,handler = entry_handler)
        dbg.run()

    if __name__ == '__main__':
        main()
Notice that in this script I first set a breakpoint on the entry point and only then on VirtualAlloc, rather than setting it on VirtualAlloc directly, because pydbg does not support deferred breakpoints. I am also ignoring the 1st argument to VirtualAlloc, i.e. lpAddress (see the VirtualAlloc specification in the problem statement). This script uses two modules, PEFile and Pydbg; PEFile is used to get the entry point.
Windbg |
Windbg [Reference 3] is the official Microsoft debugger. It is the most powerful debugger available for reversing on the Windows platform (mainly the kernel side of it), and it also supports symbols. Windbg provides its own scripting language, which is similar to C, and it comes with a great help file. I highly recommend reading the help file before starting with Windbg.
Problem Statement: |
We want to track malloc: whenever malloc is called, our script should display the requested allocation size and the returned pointer.
Solution: |
Along the same lines as the previous example:
bp msvcrt!malloc ".printf \"Size: %x\n\",poi(esp+4);gu;.printf \"Returned Pointer: %x\n\",eax;g" |
When we use multiple commands on a single line, we have to separate them with a semicolon (;).
bp - sets a breakpoint
msvcrt!malloc - this is DLL!Method (the DLL name and function name are separated by !)
This is the same mechanism that conditional breakpoints use: a command block that runs when the breakpoint is hit. In our case we want to extract the allocation size from the stack. The simple syntax is:
    bp address or dll!method or dll!method+offset "block that should be executed when breakpoint hits"
poi - dereferences a pointer-sized value, similar to a pointer in C
gu - go up, i.e. execute until the current function returns
g - go, i.e. continue execution
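To illustrate what poi(esp+4) evaluates to at the malloc breakpoint, here is a small Python model of a pointer dereference over a fake stack. The function name poi is borrowed from Windbg for readability, but this is only a hypothetical simulation; the ESP value and stack contents are invented, and 32-bit pointers are assumed.

```python
import struct


def poi(memory, base, address):
    """Return the 32-bit little-endian value stored at address.

    memory is a bytes object whose first byte lives at address base.
    """
    (value,) = struct.unpack_from("<L", memory, address - base)
    return value


ESP = 0x0012FF00
# On entry to malloc: [esp] = return address, [esp+4] = requested size
stack = struct.pack("<2L", 0x00401500, 0x200)

print("Size: %x" % poi(stack, ESP, ESP + 4))
```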
For more interesting commands please check out the Windbg help file. |
Conclusion |
This article is additional learning material for our next session, 'Part 5 - Reverse Engineering Tools', which is part of our FREE Reversing/Malware Analysis course [Reference 1].
Wednesday, 29 February 2012
Automation of Reversing Through Scripting -Amit Malik
Labels:
reverse engineering