Reversing Microsoft
 patches to reveal
  vulnerable code
         Harsimran Walia

   Computer Security Enthusiast




                                  2011
Abstract


This paper sets out to reveal the vulnerable code behind a particular disclosed vulnerability, which is
the first step towards both writing an exploit that has not been publicly released and verifying the
patch. The same process can be used to create vulnerability based signatures, which are far better than
exploit signatures. A vulnerability signature is a superset of all the inputs satisfying a particular
vulnerability condition, whereas an exploit based signature caters to only one type of input satisfying
that condition. The paper pinpoints the vulnerable code and the affected files in Microsoft products by
reverse engineering Microsoft patches.

The method is to take a binary difference of the patched file at two points in time: the most recent
version before patching and the version after the patch is applied. Finding these two files is a problem
in itself. Microsoft now releases two different versions of each patch: GDR (General Distribution
Release), which contains only security related updates, and QFE (Quick Fix Engineering) or LDR (Limited
Distribution Release), which contains both security related and functional updates. The versions of the
two files being compared must match, that is, both should be GDR or both should be LDR. The file after
patching is obtained by extracting the patch for the vulnerability under consideration. The file before
patching, of a matching version, can be extracted from the patch for the vulnerability in the same
software that was disclosed just before the one under consideration. The process of extracting files
from patches on Windows Vista and Windows 7 differs from the traditional way used on Windows XP.

Once the correct files have been obtained, the next step is to compute a binary difference between
them, which can be done easily and effectively with a tool called DarunGrim. The tool reports the
difference between subroutines as a percentage match. Subroutines from both files can be viewed in
graph mode and compared to find the vulnerability. The code change that fixes the vulnerability may be
the removal of one piece of code and the addition of another. A problem at this point is that compiler
optimizations are applied every time code is compiled, so if the two files were built with different
compilers or compiler versions, the differing optimizations also show up as changes in the code. Simple
instruction reordering between releases causes a similar problem: even when only the order of
instructions changes, the code is reported as changed. Among the functions reported as changed, the one
in the pre-patch file that was modified to fix the issue contains the vulnerable code. From here the
knowledge of the reverse engineer comes into play, in how quickly and effectively the vulnerability can
be found in the changed code. Up to this point the process is static analysis; from here on dynamic
analysis is used: breakpoints are set at the changed functions and the software is run. When a
breakpoint is hit we can check which of the functions handles user input. All of this information can
then be used to write an exploit.

This process of reversing the patch and finding the details about the vulnerability would definitely help
in creating vulnerability signatures.
Introduction
We start by describing the life cycle of patch development. It starts with a 0day vulnerability being
found and used to create an exploit and compromise systems. When the vulnerability reaches the
vendor, the vendor develops a fix and releases a patch to its customer base so that they can secure
their systems against malicious activity. In this paper I describe how the patches released by
Microsoft can be reverse engineered to locate exactly the code where the vulnerability exists. The paper
also highlights the difficulties faced during the process and, wherever applicable, how to overcome
them.

In this paper we use DarunGrim, a tool that produces the binary difference easily and effectively. For
a better understanding of the tool and its context of use, its working is described first.

How DarunGrim works




                                       Figure 1 Schema of DarunGrim2



The schema of DarunGrim is shown in the figure above. It is built around a SQLite database generated
with the help of IDA Pro. The heart of DarunGrim is its diffing engine (DiffEngine), which does all the
processing, analogous to the CPU in a computer system.

To generate diffing results, both binaries are first disassembled using IDA Pro, which runs as a
background process and is not visible on the screen. After the disassemblies are generated, the
DarunGrim IDA plug-in is run automatically in the background. Finally both files are fed to the
DiffEngine, which generates the diffing results.
Algorithm
The main algorithm used by DarunGrim for binary diffing is called the basic block fingerprint hash map.
In this algorithm each basic block of assembly code is treated as a single entity, and a fingerprint of
the basic block is generated from its instruction sequence. The fingerprint is generated with the help
of IDA Pro. Two fingerprint hash tables are built from the basic blocks, one for the original binary and
the other for the patched binary.

For the comparison, each unique fingerprint hash from the original binary is looked up in the
fingerprint hash table of the patched binary. In this way every fingerprint from the original binary is
marked as matched or unmatched. The comparison serves the larger purpose of finding unmatched
functions: for a function to match completely, all the basic blocks inside it must match. The match
rate of a function is calculated by fingerprint string matching, which works the way GNU diff does,
that is, by finding the longest common subsequence.
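
The matching described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not DarunGrim's implementation: the fingerprint here is a SHA-1 hash of the instruction text (the real tool derives fingerprints through IDA Pro), the function names are hypothetical, and the match rate formula (twice the longest common subsequence divided by the total number of blocks) is one plausible way to turn the subsequence length into a percentage.

```python
import hashlib


def fingerprint(block):
    """Fingerprint one basic block, given as a list of instruction strings.

    Assumption: hashing the normalised instruction text stands in for the
    fingerprint that the real tool derives with the help of IDA Pro.
    """
    joined = "\n".join(ins.strip().lower() for ins in block)
    return hashlib.sha1(joined.encode()).hexdigest()


def unmatched_blocks(original, patched):
    """Return the blocks of `original` whose fingerprint is absent from `patched`."""
    table = {fingerprint(block) for block in patched}  # fingerprint hash table
    return [block for block in original if fingerprint(block) not in table]


def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def match_rate(func_a, func_b):
    """Percentage match between two functions, each a list of basic blocks.

    Assumption: 2 * LCS / (len(a) + len(b)) is used as the score.
    """
    fa = [fingerprint(block) for block in func_a]
    fb = [fingerprint(block) for block in func_b]
    if not fa and not fb:
        return 100.0
    return 200.0 * lcs_length(fa, fb) / (len(fa) + len(fb))
```

A function whose blocks all match scores 100, and any changed, added or removed block pulls the score below 100, which is the signal used later to shortlist patched functions.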

Vulnerability Vs Exploit based signatures
Exploit signatures are created using byte string patterns or regular expressions. They are specific to
one exploit, yet they are the most widely used, mainly because they are easy to create. An exploit
based signature caters to only one type of input satisfying the vulnerability condition. The problem is
that different attacks can exploit the same vulnerability, in which case an exploit based signature
fails against every attack except the one it was created for. Consider the following exploit signature
for a buffer overflow triggered by a long string of A’s.

                                      ESig = “docx?AAAAAAAAAAA...”

This stops every exploit with the pattern shown above, but it cannot stop an exploit in which the A’s
are changed to B’s or any other character.

In contrast, vulnerability based signatures are based on the properties of the vulnerability and not on
the properties of the exploit. A vulnerability signature is a superset of all the inputs satisfying a
particular vulnerability condition. For the example above, the vulnerability based signature would be
something like

                               VSig = MATCH_STR (Buffer,"docx?(.*)$",limit)

The signature matches the string in the buffer against the regular expression, with the size of the
string specified by limit. It is effective whatever characters are used, because it is based on how the
vulnerability is actually exploited, unlike the exploit signature, which is created for one particular
exploit pattern.
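
The difference between the two kinds of signature can be shown with a small Python sketch. Several details are assumptions made for illustration: "docx?" is treated as a literal prefix, limit is read as the largest payload length the vulnerable buffer can hold, the value 64 is invented, and the function names are hypothetical.

```python
import re

LIMIT = 64  # hypothetical capacity of the vulnerable buffer, in characters


def exploit_signature(data):
    """ESig: the literal prefix followed by a long run of 'A' characters."""
    return re.match(r"docx\?A{%d,}" % (LIMIT + 1), data) is not None


def vulnerability_signature(data, limit=LIMIT):
    """VSig: the prefix followed by any payload longer than `limit`."""
    m = re.match(r"docx\?(.*)$", data, re.DOTALL)
    return m is not None and len(m.group(1)) > limit
```

With these definitions a payload of 100 B’s slips past exploit_signature but is still caught by vulnerability_signature, while a short, harmless payload is rejected by both.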
A good vulnerability signature should exhibit three properties:

1. It should allow no false negatives, as even one successful exploit can compromise the system and
    give the attacker a gateway into the network.

2. It should allow very few false positives, as too many false positives block legitimate input and
    effectively cause a denial of service for the system.

3. The signature matching time should not add a considerable delay to the software and services.
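
One way to check a candidate signature against these three properties is to run it over a labelled set of inputs and count the misses. The sketch below is a minimal illustration: the evaluate helper, the toy too_long signature and the sample data are all hypothetical.

```python
import time


def evaluate(signature, samples):
    """Run `signature` over (data, is_exploit) pairs.

    Returns (false_negatives, false_positives, elapsed_seconds).
    """
    false_negatives = false_positives = 0
    start = time.perf_counter()
    for data, is_exploit in samples:
        hit = bool(signature(data))
        if is_exploit and not hit:
            false_negatives += 1  # an exploit got through
        elif hit and not is_exploit:
            false_positives += 1  # legitimate input was blocked
    return false_negatives, false_positives, time.perf_counter() - start


def too_long(data, limit=64):
    """Toy signature: flag any input longer than a hypothetical limit."""
    return len(data) > limit


samples = [("A" * 100, True), ("B" * 100, True), ("hello", False)]
fn, fp, seconds = evaluate(too_long, samples)
```

The first count should be zero, the second should stay small, and the elapsed time gives a rough idea of the delay the signature adds.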
Need
•   When an exploit is to be created for a vulnerability for which no exploit has been disclosed, the
    first step is to find the vulnerability and the vulnerable code.

•   To verify that the patch released by Microsoft works as designed.

•   The process can be used to create vulnerability based signatures which are far better than exploit
    signatures.




                                            Procedure

    Finding patches
To start, pick any vulnerability and search for its Microsoft security bulletin. For simplicity, this
paper considers MS10-016. The bulletin page lists all the affected OS/software versions and, for each,
the previous bulletin that addressed a similar issue in the same version of the OS/software. In our
case it is None, which means the previous file version should be the one installed by default on the
system, although this may not always be the case. Sometimes the page refers to another bulletin, in
which case you should use the file from the patch mentioned there and not the one from the installed
system.

Problem: Finding the two files is a problem in itself. Microsoft now releases two different versions of
each patch: GDR (General Distribution Release), which contains only security related updates, and QFE
(Quick Fix Engineering) or LDR (Limited Distribution Release), which contains both security and
functional updates. The versions of the two files being compared must match, that is, both should be
GDR or both should be LDR. For this example, download the GDR version of the patch for Windows XP.
Figure 2 Selection of correct file version




As shown in the image above, the patch replaces the existing Moviemk.exe file with version
2.1.4027.0. Fortunately, the file installed by default with an XP SP2 system is version 2.1.4026.0,
which means this patch fixes the issue in the default installation. We use these two files for the
comparison.

Quick workaround: Use the queryMSDB.py Python script from the open source ‘ms-patch-tools’ to get the
versions of the files to be compared. Use the file versions from two consecutive advisories.
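
A quick sanity check on a candidate pair of files can be scripted. The following Python sketch compares dotted version numbers and guesses the release branch from the folder a file was extracted into. Both helper names are hypothetical, and the folder naming (for example SP2GDR and SP2QFE) is an assumption about how an extracted Windows XP package is laid out, so verify it against the package at hand.

```python
from pathlib import PureWindowsPath


def parse_version(text):
    """Turn a dotted version such as '2.1.4026.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def is_older(before, after):
    """True when version `before` is strictly older than version `after`."""
    return parse_version(before) < parse_version(after)


def branch_of(path):
    """Guess the release branch from a folder name in `path`.

    Assumption: extracted folders end in GDR, QFE or LDR. Returns 'GDR',
    'LDR' (which also covers QFE) or None when no such folder is present.
    """
    for part in PureWindowsPath(path).parts:
        name = part.upper()
        if name.endswith("GDR"):
            return "GDR"
        if name.endswith("QFE") or name.endswith("LDR"):
            return "LDR"
    return None
```

For the pair used in this paper, is_older("2.1.4026.0", "2.1.4027.0") holds, so the files are in the right order for diffing.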




                                            Figure 3 ms-patch-tools
Extraction of files
The traditional way of extracting files from a patch delivered as an exe file is:

                                             <patchfilename>.exe /x

This works only on Windows XP and earlier versions of Windows.

Problem: The patches for Windows Vista and Windows 7 are delivered as msu files, and the way to extract
them is completely different from the traditional method:

Create a folder named “MSUFolder” in C:\ and enter the following commands

1.        expand -F:* <Saved_MSU_File_Name>.msu C:\MSUFolder

2.        expand -F:* <Saved_MSU_File_Name>.cab C:\MSUFolder

These commands extract a large number of files and folders; use the file from the folder that holds
the correct GDR version of the file to be compared.
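
The two extraction steps can be wrapped in a short Python helper. This is a sketch under assumptions: the function names are hypothetical, and the .cab produced by the first step is assumed to land in the destination folder with the same base name as the .msu file, as the commands above suggest.

```python
import subprocess
from pathlib import PureWindowsPath


def expand_commands(msu_path, dest=r"C:\MSUFolder"):
    """Build the two `expand` command lines for an .msu package.

    Assumption: the inner .cab shares the base name of the .msu and is
    found in `dest` after the first command has run.
    """
    msu = PureWindowsPath(msu_path)
    cab = PureWindowsPath(dest) / (msu.stem + ".cab")
    return [
        ["expand", "-F:*", str(msu), dest],
        ["expand", "-F:*", str(cab), dest],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure (Windows only)."""
    for command in commands:
        subprocess.run(command, check=True)
```

On a Windows machine with the destination folder already created, run_all(expand_commands(path_to_msu)) performs both steps, where path_to_msu is the saved package.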




     Binary Differencing
Use DarunGrim to get the binary difference between the selected files in preparation for the analysis
step. Select both binary files, specify an output file in DarunGrim as shown in the figure, and let it
do the rest.
Figure 4 DarunGrim file selection

The tool matches every function between the two files using the algorithm explained earlier and
reports the percentage match for each function. The functions patched to fix the vulnerability will
have a percentage match of less than 100.
Figure 5 Darungrim results

Problem: Not every function whose match percentage is less than 100% was changed by the fix.
There are several causes, such as

      Instruction reordering
      A lot of reordering happens between releases, which breaks the matching algorithm and
      marks otherwise identical blocks as unmatched.

      Split blocks
      A block that has only one parent, where that parent has only one child, is a split block,
      which causes a problem in the matching process. Matching improves if the two blocks are
      merged and treated as a single block.

      Hot patching
      An instruction like mov edi, edi at the start of a function is a sign of hot patching, and it
      also leads to a mismatch in the block. The solution is simply to ignore that instruction.


These problems create false positives, in which identical functions are reported as different; they
have to be eliminated by manual inspection of the functions.
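
Some of this manual triage can be assisted by normalising blocks before comparing them. The Python sketch below illustrates the three ideas above under stated assumptions; it is not how DarunGrim works internally. Blocks are lists of instruction strings, the control flow graph is a dictionary of successor lists, and all names are hypothetical. Comparing instructions as an unordered collection is only a heuristic, because reordering dependent instructions can change behaviour.

```python
import re
from collections import Counter

# A move from a register to itself, the hot patching placeholder.
SELF_MOVE = re.compile(r"mov\s+(\w+)\s*,\s*\1")


def normalize(block):
    """Lower-case the instructions and drop register-to-itself moves."""
    cleaned = [ins.strip().lower() for ins in block]
    return [ins for ins in cleaned if not SELF_MOVE.fullmatch(ins)]


def same_modulo_order(block_a, block_b):
    """True when two blocks hold the same instructions in any order."""
    return Counter(normalize(block_a)) == Counter(normalize(block_b))


def merge_split_blocks(blocks, successors):
    """Merge a block into its parent when it is the parent's only child
    and the parent is its only parent. Returns new (blocks, successors)."""
    blocks = {key: list(value) for key, value in blocks.items()}
    succ = {key: list(successors.get(key, [])) for key in blocks}
    changed = True
    while changed:
        changed = False
        preds = {key: [] for key in blocks}
        for src, dsts in succ.items():
            for dst in dsts:
                preds[dst].append(src)
        for parent in list(blocks):
            if len(succ[parent]) != 1:
                continue
            child = succ[parent][0]
            if child != parent and preds[child] == [parent]:
                blocks[parent] += blocks[child]
                succ[parent] = succ[child]
                del blocks[child], succ[child]
                changed = True
                break  # predecessor map is stale, rebuild it
    return blocks, succ
```

Functions that differ only in these ways can then be set aside, which leaves fewer candidates for manual inspection.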
Differencing Analysis
Binary differencing gives a list of unmatched functions. After the false positives from the previous
step are removed, we arrive at the function shown in the image, which is likely the one that could be
exploited in the earlier version of moviemk.exe. We then analyze the function in depth, looking for
differences between its patched and unpatched versions.




                                      Figure 6 Vulnerable moviemk.exe
Figure 7 Patched version of moviemk.exe




Binary Analysis of the unpatched function

push        [ebp-2Ch]             ; unsigned int: size to allocate
call        ??2@YAPAXI@Z          ; operator new(uint), first allocation
mov         ebx, eax              ; ebx = pointer to the first buffer
pop         ecx
mov         [ebp-18h], ebx
mov         [ebp-3Ch], ebx
mov         byte ptr [ebp-4], 1
push        dword ptr [ebp-2Ch]   ; long argument
mov         ecx, esi
push        ebx                   ; void * argument: the first buffer
push        [ebp-30h]             ; const * argument
call        sub_118000C           ; func(const *,void *,long)
mov         edi, eax
test        edi, edi
jge         short loc_1181503     ; taken when the result is non-negative
The first time operator new is called, the pointer to the allocated space is stored in ebx, and the
function sub_118000C is then called with this pointer as an argument. A little reverse engineering
shows that this function is used to fill the allocated space with content.

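push        [ebp-2Ch]             ; unsigned int: size to allocate
call        ??2@YAPAXI@Z          ; operator new(uint), second allocation
pop         ecx
mov         [ebp-14h], eax        ; ebp-14h = pBuffer (the second buffer)
mov         [ebp-40h], eax
mov         byte ptr [ebp-4], 2
push        [ebp-2Ch]             ; long argument
mov         ecx, esi
push        ebx                   ; void * argument: still the FIRST buffer, not pBuffer
push        edi                   ; const * argument
call        sub_118000C           ; func(const *,void *,long)
mov         esi, eax
test        esi, esi
jge         short loc_118158A     ; taken when the result is non-negative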


operator new is then called a second time, but the pointer to the space allocated by this second call
is not passed on. Instead, the pointer to the space allocated by the first call, still held in ebx, is
passed to sub_118000C again, and this is where the vulnerability appears to lie. Data that was supposed
to be copied into the second allocation can therefore be copied into the first allocation, and if that
data is larger than the first allocation the result is a buffer overflow.
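
The flaw can be modelled in Python to make the reasoning concrete. This is a simplified model and not a reconstruction of the real code: Allocation stands in for the memory returned by operator new, fill stands in for sub_118000C and raises an error where the real function would write past the buffer, and the patched variant assumes the fix is simply to pass the second buffer. All names are hypothetical.

```python
class Allocation:
    """Stand-in for a heap block returned by operator new."""

    def __init__(self, size):
        self.size = size
        self.data = bytearray(size)


def fill(dest, payload):
    """Stand-in for sub_118000C: copy `payload` into `dest`.

    Raises OverflowError where the real code would corrupt memory.
    """
    if len(payload) > dest.size:
        raise OverflowError(
            "write of %d bytes into %d byte buffer" % (len(payload), dest.size)
        )
    dest.data[: len(payload)] = payload


def load_unpatched(first, second):
    """Model of the unpatched flow: the second fill reuses the first buffer."""
    buf1 = Allocation(len(first))
    fill(buf1, first)
    buf2 = Allocation(len(second))
    fill(buf1, second)  # bug: destination should be buf2
    return buf1, buf2


def load_patched(first, second):
    """Assumed fix: the second fill targets the second buffer."""
    buf1 = Allocation(len(first))
    fill(buf1, first)
    buf2 = Allocation(len(second))
    fill(buf2, second)
    return buf1, buf2
```

In this model load_unpatched(b"ab", b"abcdef") raises OverflowError, while load_patched accepts the same input, which mirrors the condition a crafted file has to create: a second block of data larger than the first allocation.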




    Debugging
Having identified the vulnerability, we can validate the finding by trying to crash the application.
This step involves debugging the application in order to create a file that triggers the crash. Set a
breakpoint at each call to operator new, run the application attached to Immunity Debugger, and open
an MSWMM (Windows Movie Maker) file.
Figure 8 Immunity Debugger showing the two breakpoints




When each breakpoint is hit we can read the size of the space being allocated. Now create an MSWMM
file that makes the first allocation smaller than the second, so that the copy overflows the first
buffer and crashes the application.



                                           Conclusion
This paper is an overview of how 1-day exploits and vulnerability signatures can be created by
reversing the patches supplied by the vendor. It has attempted to explain the process involved in
reversing and the problems faced along the way. The concepts presented here still need to be mastered
by the interested reader through further research and practice. This paper has discussed only
Microsoft patches, which served our purpose; patches from other vendors are left to the reader.

More Related Content

PDF
nullcon 2011 - Penetration Testing a Biometric System
PDF
WhitePaper : Security issues in android custom rom
PDF
Frankenstein. stitching malware from benign binaries
PDF
Crisis. advanced malware
PDF
Breaking av software
ODP
Call Graph Agnostic Malware Indexing (EuskalHack 2017)
ODP
The Nightmare Fuzzing Suite and Blind Code Coverage Fuzzer
nullcon 2011 - Penetration Testing a Biometric System
WhitePaper : Security issues in android custom rom
Frankenstein. stitching malware from benign binaries
Crisis. advanced malware
Breaking av software
Call Graph Agnostic Malware Indexing (EuskalHack 2017)
The Nightmare Fuzzing Suite and Blind Code Coverage Fuzzer

What's hot (20)

DOCX
Mobile binary code - Attack Tree and Mitigation
PDF
64 bit rugrats
PDF
.NET for hackers
PDF
Automatic binary deobfuscation
PPTX
File inflection techniques
PDF
Half-automatic Compilable Source Code Recovery
PDF
Defeating spyware and forensics on the black berry draft
PDF
Penetrating Windows 8 with syringe utility
PDF
[Usenix's WOOT'14] Attacking the Linux PRNG and Android - Weaknesses in Seedi...
PDF
Espressif IoT Development Framework: 71 Shots in the Foot
PDF
DEFCON 21: EDS: Exploitation Detection System WP
PPTX
DEFCON 21: EDS: Exploitation Detection System Slides
PPTX
Penetration testing using metasploit
PDF
Az4301280282
PDF
A Security Barrier Device That Can Protect Critical Data Regardless of OS or ...
PDF
Meet the potnet - AboutAndroid | Malware Analysis Report
PDF
Bb31166168
PDF
Metasploit
PDF
Malware Analysis: Ransomware
Mobile binary code - Attack Tree and Mitigation
64 bit rugrats
.NET for hackers
Automatic binary deobfuscation
File inflection techniques
Half-automatic Compilable Source Code Recovery
Defeating spyware and forensics on the black berry draft
Penetrating Windows 8 with syringe utility
[Usenix's WOOT'14] Attacking the Linux PRNG and Android - Weaknesses in Seedi...
Espressif IoT Development Framework: 71 Shots in the Foot
DEFCON 21: EDS: Exploitation Detection System WP
DEFCON 21: EDS: Exploitation Detection System Slides
Penetration testing using metasploit
Az4301280282
A Security Barrier Device That Can Protect Critical Data Regardless of OS or ...
Meet the potnet - AboutAndroid | Malware Analysis Report
Bb31166168
Metasploit
Malware Analysis: Ransomware
Ad

Similar to nullcon 2011 - Reversing MicroSoft patches to reveal vulnerable code (20)

PDF
Reverse engineering – debugging fundamentals
PPTX
nullcon 2011 - Reversing MicroSoft patches to reveal vulnerable code
PDF
SOURCE CODE ANALYSIS TO REMOVE SECURITY VULNERABILITIES IN JAVA SOCKET PROGR...
PDF
SOURCE CODE ANALYSIS TO REMOVE SECURITY VULNERABILITIES IN JAVA SOCKET PROGRA...
PDF
Detection of vulnerabilities in programs with the help of code analyzers
PDF
IRJET- Development of Uncrackable Software
PDF
A Platform for Application Risk Intelligence
PDF
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
PDF
Exploits Attack on Windows Vulnerabilities
PDF
White Paper - Are antivirus solutions enough to protect industrial plants?
PDF
PVS-Studio's New Message Suppression Mechanism
DOC
Exploit Frameworks
PPTX
IDAPRO
PDF
Cyber Defense Forensic Analyst - Real World Hands-on Examples
PDF
Bypassing anti virus scanners
PPTX
Seh based exploitation
PDF
Blackhat Europe 2009 - Detecting Certified Pre Owned Software
PDF
Addressing New Challenges in Software Protection for .NET
PDF
Ijetr012045
Reverse engineering – debugging fundamentals
nullcon 2011 - Reversing MicroSoft patches to reveal vulnerable code
SOURCE CODE ANALYSIS TO REMOVE SECURITY VULNERABILITIES IN JAVA SOCKET PROGR...
SOURCE CODE ANALYSIS TO REMOVE SECURITY VULNERABILITIES IN JAVA SOCKET PROGRA...
Detection of vulnerabilities in programs with the help of code analyzers
IRJET- Development of Uncrackable Software
A Platform for Application Risk Intelligence
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
Exploits Attack on Windows Vulnerabilities
White Paper - Are antivirus solutions enough to protect industrial plants?
PVS-Studio's New Message Suppression Mechanism
Exploit Frameworks
IDAPRO
Cyber Defense Forensic Analyst - Real World Hands-on Examples
Bypassing anti virus scanners
Seh based exploitation
Blackhat Europe 2009 - Detecting Certified Pre Owned Software
Addressing New Challenges in Software Protection for .NET
Ijetr012045
Ad

More from n|u - The Open Security Community (20)

PDF
Hardware security testing 101 (Null - Delhi Chapter)
PPTX
SSRF exploit the trust relationship
PDF
PDF
Api security-testing
PDF
Introduction to TLS 1.3
PDF
Gibson 101 -quick_introduction_to_hacking_mainframes_in_2020_null_infosec_gir...
PDF
Talking About SSRF,CRLF
PPTX
Building active directory lab for red teaming
PPTX
Owning a company through their logs
PPTX
Introduction to shodan
PDF
Detecting persistence in windows
PPTX
Frida - Objection Tool Usage
PDF
OSQuery - Monitoring System Process
PDF
DevSecOps Jenkins Pipeline -Security
PDF
Extensible markup language attacks
PPTX
PDF
Hardware security testing 101 (Null - Delhi Chapter)
SSRF exploit the trust relationship
Api security-testing
Introduction to TLS 1.3
Gibson 101 -quick_introduction_to_hacking_mainframes_in_2020_null_infosec_gir...
Talking About SSRF,CRLF
Building active directory lab for red teaming
Owning a company through their logs
Introduction to shodan
Detecting persistence in windows
Frida - Objection Tool Usage
OSQuery - Monitoring System Process
DevSecOps Jenkins Pipeline -Security
Extensible markup language attacks

Recently uploaded (20)

PPTX
Spectroscopy.pptx food analysis technology
PDF
Machine learning based COVID-19 study performance prediction
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PPTX
MYSQL Presentation for SQL database connectivity
PDF
The Rise and Fall of 3GPP – Time for a Sabbatical?
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PDF
Spectral efficient network and resource selection model in 5G networks
PPTX
sap open course for s4hana steps from ECC to s4
PDF
Electronic commerce courselecture one. Pdf
PPTX
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PPTX
Big Data Technologies - Introduction.pptx
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PDF
Chapter 3 Spatial Domain Image Processing.pdf
PDF
Encapsulation theory and applications.pdf
PDF
NewMind AI Weekly Chronicles - August'25 Week I
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
Spectroscopy.pptx food analysis technology
Machine learning based COVID-19 study performance prediction
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
MYSQL Presentation for SQL database connectivity
The Rise and Fall of 3GPP – Time for a Sabbatical?
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
Spectral efficient network and resource selection model in 5G networks
sap open course for s4hana steps from ECC to s4
Electronic commerce courselecture one. Pdf
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
Agricultural_Statistics_at_a_Glance_2022_0.pdf
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
Big Data Technologies - Introduction.pptx
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
Chapter 3 Spatial Domain Image Processing.pdf
Encapsulation theory and applications.pdf
NewMind AI Weekly Chronicles - August'25 Week I
Diabetes mellitus diagnosis method based random forest with bat algorithm
Per capita expenditure prediction using model stacking based on satellite ima...

nullcon 2011 - Reversing MicroSoft patches to reveal vulnerable code

  • 1. Reversing Microsoft patches to reveal vulnerable code Harsimran Walia Computer Security Enthusiast 2011
  • 2. Abstract The paper would try to reveal the vulnerable code for a particular disclosed vulnerability, which is the first and foremost step for making undisclosed exploit and patch verification. The process used herein could be used to create vulnerability based signatures which are far better than exploit signatures. Vulnerability signature is a superset of all the inputs satisfying a particular vulnerability condition whereas exploit based signature would only cater to one type of input satisfying that vulnerability condition. This paper would try to pin point the vulnerable code and the files in Microsoft products by reverse engineering the Microsoft patches. The method used would be to take a binary difference of the file which was patched taken at two different instances, one is the most recent file before patching and the second is after applying the patch but finding the two files is in itself another problem. Windows now releases two different versions of patches, GDR (General distribution) which contains only security related updates and the other QFE (Quick Fix Engineering) or LDR (Limited Distribution Release) which has both security related and functional updates. The problem addressed is that the versions of the two files to be compared should match that is either both should be GDR or LDR. The file after patching can be obtained by extracting the patch of the considered vulnerability. The second file to be compared with a matching version with the first one could be extracted from some other vulnerability patch addressing the issue with the same software disclosed just before the vulnerability considered. The process of extraction of files from patches differs in Vista and Windows 7 from the traditional way used in Windows XP. After obtaining the correct files to be compared, the next step would be to get a binary difference between the files which can be done very easily and effectively with the use of a tool called DarunGrim. 
The tool provides a well illustrated difference between the subroutines in the term of percentage match between them. Subroutines from both the files can be viewed in graph mode and can be compared to find the vulnerability. The change in the code is done to fix that particular vulnerability which may be removal of a piece of code and addition of another. Another problem arises at this point is that compiler optimizations happen every-time a code is compiled, so if both the files are compiled with different compilers or compiler versions, they would have different compiler optimizations and that would also show up as a change in code. Simple Instruction reordering keeps happening over different releases which give rise to another problem as when only the instructions are reordered, still it would show up as changed code. The code change in one of the functions out of several functions in the file before applying the patch would be the vulnerable code. From here knowledge of the reverse engineer would come into play as how effectively and fast he can find the vulnerability from the code shown as being changed from the previous file. Till now the process used was static analysis but from now onwards dynamic analysis would be used as breakpoints could be set at these changed functions and run the software. When a breakpoint is hit we can check in which of the functions is user input being dealt. Obtaining all this information can then be used to write an exploit. This process of reversing the patch and finding the details about the vulnerability would definitely help in creating vulnerability signatures.
  • 3. Introduction We start by describing the life cycle of patch development. It starts with a 0day vulnerability being found and used to create an exploit and compromise systems. When the vulnerability reaches the vendor, it finds a fix and releases a patch of the vulnerability to its customer base in order for them to secure their systems from malicious activity. In this paper I will talk about how the patches released by Microsoft can be reverse engineered to exactly locate the code where the vulnerability exists. The paper would also highlight the difficulties faced during the process and how to overcome those difficulties wherever applicable. In the paper, we would be using DarunGrim, a tool that gives the binary difference very easily and effectively. For a better understanding of the tool and its context of use, I would like to mention its working How DarunGrim works? Figure 1 Schema of DarunGrim2 The schema of DarunGrim is shown in the figure above which comprises of sqlite database generated with the help of IDA Pro .The heart of DarunGrim is its Diffing Engine which does all the processing analogous to a CPU in the computer system. In order to generate diffing results, both the binaries are first disassembled using IDA Pro which runs as a background process and is not visible on the screen. After generating the disassemblies the DarunGrim IDA plug-in is run automatically in the background. Finally both the files are fed to the DiffEngine, which runs and generate the diffing results.
  • 4. Algorithm ? The main algorithm used by DarunGrim for binary diffing is called Basic block fingerprint hash map. In this algorithm each basic block of assembly code is considered as a single entity and a fingerprint of this basic block is generated from the instruction sequences. The fingerprint is generated with the help of IDA Pro. Two fingerprint hash tables are generated from the basic blocks, one for the original binary and the other for the patched binary. For the comparison, each unique fingerprint hash from the original binary is searched against the fingerprint hash table of the patched binary for a match. Likewise all the fingerprints from the original binary are marked as matched or unmatched. The main purpose of the comparison exercise is to serve a bigger purpose of finding unmatched functions. In order for a function to match, all the basic blocks inside the function should match. Match rate of the function is calculated based on the fingerprint string matching which is done like GNU diff works i.e. finding the longest common subsequence. Vulnerability Vs Exploit based signatures Exploit signatures are created by using byte string patterns or regular expressions. These signatures are exploit specific but are the ones used widely, the main reason being the ease of their creation. Exploit based signature would only cater to one type of input satisfying that vulnerability condition. The problem with these types of signatures is that different attacks can exploit the same vulnerability, in which case exploit based signatures will fail except for the one attack for which it is created. Consider the following exploit signature for a buffer overflow with a long string of A’s. ESig = “docx?AAAAAAAAAAA...” This will stop all the exploits with the pattern shown above but it cannot stop the exploits if I change the A’s to B’s or any other alphabet. 
On the contrary, vulnerability based signatures are based on the properties of the vulnerability and not on the properties of the exploit. A vulnerability signature is a superset of all the inputs satisfying a particular vulnerability condition. For the example above, the vulnerability based signature would be something like

VSig = MATCH_STR (Buffer,"docx?(.*)$",limit)

The signature matches the string in Buffer against the regular expression, with the size of the string specified by limit. It is effective against any character, because it is based on how the vulnerability is actually exploited, unlike an exploit signature, which is created for one particular exploit pattern.

A good vulnerability signature should exhibit three properties:

1. It should strictly not allow any false negatives, as even one successful exploit can compromise the system and create a gateway for the attacker into the network.
2. It should allow very few false positives, as too many false positives may amount to a denial of service (DoS) for legitimate users of the system.
3. The signature matching time should not create a considerable delay for the software and services.

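The difference between the ESig and VSig examples can be illustrated with a minimal Python sketch. Several details are assumptions made only for illustration: "docx?" is treated as a literal marker, the exploit signature is taken to mean a run of at least eight A's, and limit is read as the number of payload bytes the vulnerable buffer can hold, so that any longer payload satisfies the vulnerability condition. The function names are hypothetical and MATCH_STR may be defined differently in practice.

```python
import re

# Exploit signature (assumed form): the literal marker followed by a run of A's.
ESIG = re.compile(rb"docx\?A{8,}")

# Vulnerability signature (assumed form): the marker followed by any payload.
VSIG = re.compile(rb"docx\?(.*)$", re.DOTALL)


def match_esig(buffer):
    """Return True if the buffer contains the one payload the exploit signature knows."""
    return ESIG.search(buffer) is not None


def match_str(buffer, limit):
    """Hypothetical reading of MATCH_STR: flag any payload longer than `limit` bytes."""
    found = VSIG.search(buffer)
    return found is not None and len(found.group(1)) > limit


if __name__ == "__main__":
    for payload in (b"A" * 64, b"B" * 64, b"short"):
        data = b"docx?" + payload
        print(payload[:8], match_esig(data), match_str(data, 32))
```

Under these assumptions the exploit signature misses the payload made of B's, while the vulnerability signature flags every oversized payload regardless of its content and lets the short one through.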
Need

• When an undisclosed exploit is to be created, the first step is to find the vulnerability and the vulnerable code in order to exploit it.
• To verify that the patch released by Microsoft works as designed.
• The process can be used to create vulnerability based signatures, which are far better than exploit signatures.

Procedure

Finding patches

To start, pick any vulnerability and search for its Microsoft security bulletin. For simplicity, this paper considers MS10-016. The bulletin page lists all the affected OS/software versions and, for each of them, the bulletin just before this one that addresses a similar issue in the same version of the OS/software. In our case this is None, which means the preceding file version should be the one installed by default on the system, although this may not always be the case. When the page refers to some other bulletin, use the file from that patch and not the one from the installed system.

Problem: Finding the two files is a problem in itself. Microsoft now releases two different versions of patches for Windows: GDR (General Distribution Release), which contains only security related updates, and QFE (Quick Fix Engineering) or LDR (Limited Distribution Release), which contains both security and functional updates. The versions of the two files to be compared must match, that is, both should be GDR or both should be LDR.

Now download the GDR version of the patch for Windows XP.

Figure 2 Selection of correct file version

As shown in the image above, the patch replaces the existing Moviemk.exe with version 2.1.4027.0. Fortunately, the file installed by default on an XP SP2 system is version 2.1.4026.0, which means this patch fixes the issue in the default installation. We use these two files for the comparison.

Quick workaround: Use the queryMSDB.py Python script from the open source 'ms-patch-tools' to get the versions of the files to be compared. Use the file versions from two consecutive advisories.

Figure 3 ms-patch-tools

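Choosing the baseline file amounts to picking the most recent version that is older than the patched one. The sketch below is one possible way to do this in Python; it is not part of ms-patch-tools, the function names are hypothetical, and it assumes the candidate versions have already been restricted to the same branch (all GDR or all LDR). Versions are compared as tuples of integers because plain string comparison would order "2.1.10.0" before "2.1.9.0".

```python
def parse_version(text):
    """Turn a dotted file version such as '2.1.4027.0' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def pick_baseline(patched, candidates):
    """Return the highest candidate version strictly lower than the patched version.

    Returns None when no older candidate exists.
    """
    target = parse_version(patched)
    older = [c for c in candidates if parse_version(c) < target]
    if not older:
        return None
    return max(older, key=parse_version)


if __name__ == "__main__":
    # 2.1.4026.0 and 2.1.4027.0 come from the Moviemk.exe example above;
    # 2.0.3312.0 is a made-up older version used only to exercise the code.
    known = ["2.0.3312.0", "2.1.4026.0", "2.1.4027.0"]
    print(pick_baseline("2.1.4027.0", known))
```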
Extraction of files

The traditional way of extracting the files from patches delivered as exe files is:

<patchfilename>.exe /x

This works only for Windows XP and earlier versions of Windows.

Problem: The patches for Windows Vista and Windows 7 are delivered as msu files, and the way to extract them is completely different from the traditional method. Create a folder named "MSUFolder" in C:\ and enter the following commands:

1. expand -F:* <Saved_MSU_File_Name>.msu C:\MSUFolder
2. expand -F:* <Saved_MSU_File_Name>.cab C:\MSUFolder

These commands extract a large number of files and folders. Use the file inside the folder that holds the correct GDR version of the file to be compared.

Binary Differencing

DarunGrim is used to get the binary difference between the selected files in preparation for the analysis step. Select both binary files, specify an output file in DarunGrim as shown in the figure, and let it do the rest.

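The two expand commands from the extraction step can be scripted when many msu packages have to be unpacked. The following is a minimal sketch, not an official tool: the function names are hypothetical, it assumes the msu contains a cab with the same base name (as the commands above imply), and it assumes that cab is read from the output folder where the first command placed it. Only the expand options already shown in this paper are used.

```python
import subprocess
from pathlib import PureWindowsPath


def build_expand_commands(msu_path, out_dir=r"C:\MSUFolder"):
    """Build the two 'expand' invocations for one msu package.

    Step 1 unpacks the msu into out_dir; step 2 unpacks the cab that
    step 1 is assumed to have produced inside out_dir.
    """
    msu = PureWindowsPath(msu_path)
    cab = PureWindowsPath(out_dir) / msu.with_suffix(".cab").name
    return [
        ["expand", "-F:*", str(msu), out_dir],
        ["expand", "-F:*", str(cab), out_dir],
    ]


def extract_msu(msu_path, out_dir=r"C:\MSUFolder"):
    """Run both commands in order; only meaningful on a Windows host."""
    for command in build_expand_commands(msu_path, out_dir):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # The package name is a placeholder, not a real update.
    for command in build_expand_commands(r"C:\Downloads\Example-KB000000-x86.msu"):
        print(" ".join(command))
```

Keeping command construction separate from execution makes it possible to review the exact command lines before anything is run.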
Figure 4 DarunGrim file selection

The tool matches every function between the two files using the algorithm explained earlier and reports the match percentage for each function. The functions patched to fix the vulnerability will have a match percentage below 100.

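The matching idea behind these percentages can be sketched in a few lines of Python. This is an illustration of the technique, not DarunGrim's implementation: the fingerprint here is simply a hash of the instruction text of a block, and the match rate formula (twice the longest common subsequence length divided by the total number of blocks) is an assumption chosen so that identical functions score 100.

```python
import hashlib


def fingerprint(block):
    """Hash the instruction sequence of one basic block (a list of strings)."""
    return hashlib.sha1("\n".join(block).encode("utf-8")).hexdigest()


def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    previous = [0] * (len(b) + 1)
    for x in a:
        current = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def match_rate(original_fn, patched_fn):
    """Percentage match of two functions, each given as a list of basic blocks."""
    fa = [fingerprint(block) for block in original_fn]
    fb = [fingerprint(block) for block in patched_fn]
    total = len(fa) + len(fb)
    if total == 0:
        return 100.0
    return 200.0 * lcs_length(fa, fb) / total


def unmatched_blocks(original_fn, patched_fn):
    """Blocks of the original whose fingerprint is absent from the patched hash table."""
    patched_table = {fingerprint(block) for block in patched_fn}
    return [block for block in original_fn if fingerprint(block) not in patched_table]
```

A function whose blocks all find a partner scores 100, and the blocks returned by unmatched_blocks are the ones worth inspecting manually.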
Figure 5 DarunGrim results

Problem: Not every function whose match percentage is below 100% has actually been changed. Several issues cause mismatches:

Instruction reordering
A lot of reordering happens between releases, which breaks the matching algorithm and marks identical blocks as unmatched.

Split blocks
A block that has only one parent, where that parent has only one child, is a split block and causes a problem in the matching process. Matching improves if the two blocks are merged and treated as a single block.

Hot patching
Instructions such as mov edi, edi at the start of functions are a sign of hot patching and also lead to a mismatch in the block. The solution is simply to ignore that instruction.

These problems create false positives, in which identical functions are reported as different, and they have to be eliminated by manual inspection of the functions.

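Two of the remedies mentioned above, ignoring the hot-patch instruction and merging split blocks, can be applied as a normalisation pass before fingerprints are computed. The sketch below is an illustration under simplifying assumptions and is not DarunGrim's code: blocks are lists of instruction strings, the control flow graph is a dictionary of successor lists, and the hot-patch placeholder is recognised as any register moved onto itself at the start of a block.

```python
import re

# A register moved onto itself, for example "mov edi, edi".
SELF_MOVE = re.compile(r"^mov\s+(\w+),\s*\1$")


def strip_hotpatch_prologue(block):
    """Drop a leading self-move so it does not affect the block fingerprint."""
    if block and SELF_MOVE.match(block[0].strip().lower()):
        return list(block[1:])
    return list(block)


def merge_split_blocks(blocks, edges):
    """Merge every child that has a single parent into a parent that has a single child.

    blocks maps a block name to its instructions; edges maps a block name
    to the names of its successors. New dictionaries are returned.
    """
    blocks = {name: list(body) for name, body in blocks.items()}
    edges = {name: list(edges.get(name, [])) for name in blocks}
    merged = True
    while merged:
        merged = False
        parents = {name: [] for name in blocks}
        for source, targets in edges.items():
            for target in targets:
                parents[target].append(source)
        for parent, children in list(edges.items()):
            if len(children) != 1:
                continue
            child = children[0]
            if child != parent and len(parents[child]) == 1:
                # Append the child to its parent and inherit the child's successors.
                blocks[parent].extend(blocks.pop(child))
                edges[parent] = edges.pop(child)
                merged = True
                break
    return blocks, edges
```

Running both steps on the original and the patched binary before fingerprinting should remove some of the false positives, although manual inspection remains necessary for reordered instructions.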
Differencing Analysis

Binary differencing gives a list of unmatched functions. After removing the false positives described in the previous step, we arrive at the function shown in the image, which may be the function that could be compromised in the earlier version of moviemk.exe. We then began a deep analysis of the function, looking for differences between the patched and unpatched versions.

Figure 6 Vulnerable moviemk.exe

Figure 7 Patched version of moviemk.exe

Binary Analysis of the unpatched function

    push    [ebp-2Ch]               ; unsigned int
    call    ??2@YAPAXI@Z            ; operator new(uint)
    mov     ebx, eax
    pop     ecx
    mov     [ebp-18h], ebx
    mov     [ebp-3Ch], ebx
    mov     byte ptr [ebp-4], 1
    push    dword ptr [ebp-2Ch]
    mov     ecx, esi
    push    ebx
    push    [ebp-30h]
    call    sub_118000C             ; func(const *,void *,long)
    mov     edi, eax
    test    edi, edi
    jge     short loc_1181503

The first time operator new is called, the pointer to the allocated space is stored in ebx, and the function sub_118000C is called with this pointer as an argument. A little reverse engineering shows that this function fills the allocated space with content.

    push    [ebp-2Ch]               ; unsigned int
    call    ??2@YAPAXI@Z            ; operator new(uint)
    pop     ecx
    mov     [ebp-14h], eax          ; ebp-14h = pBuffer
    mov     [ebp-40h], eax
    mov     byte ptr [ebp-4], 2
    push    [ebp-2Ch]
    mov     ecx, esi
    push    ebx
    push    edi
    call    sub_118000C             ; func(const *,void *,long)
    mov     esi, eax
    test    esi, esi
    jge     short loc_118158A

The second time operator new is called, the pointer to the newly allocated space is not the one passed on. Instead, the pointer to the space allocated by the first call, i.e. ebx, is passed to the same function sub_118000C again to be filled, which is where the vulnerability might exist. Hence larger data that was supposed to be copied to the second allocation can be copied into the smaller first allocation, causing a buffer overflow.

Debugging

Having identified the likely vulnerability, we can validate the finding by trying to crash the application. This step involves debugging the application in order to create a file that triggers the crash. Put a breakpoint at each call to operator new, run the application attached to Immunity Debugger, and open an MSWMM (Windows Movie Maker) file.

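The overflow condition derived from the disassembly can be expressed as a small model. This Python sketch only mirrors the reasoning and does not touch the real binary: the function name is hypothetical, the two bytearrays stand for the two operator new allocations, and the assumption that a fixed version would write into the second buffer is our reading of the analysis, not something taken from the patched code.

```python
def overflow_bytes(first_size, second_size, use_second_buffer=False):
    """Return how many bytes the second fill writes past its destination.

    first_size and second_size are the sizes requested by the first and
    second call to operator new. In the unpatched flow the second fill
    still targets the first buffer (the pointer kept in ebx).
    """
    first = bytearray(first_size)    # first allocation, pointer kept in ebx
    second = bytearray(second_size)  # second allocation, pointer in [ebp-14h]
    destination = second if use_second_buffer else first
    # The second fill is asked to copy second_size bytes into `destination`.
    return max(0, second_size - len(destination))


if __name__ == "__main__":
    # Sizes are arbitrary illustration values, not taken from a real MSWMM file.
    print(overflow_bytes(16, 64))                          # unpatched flow
    print(overflow_bytes(16, 64, use_second_buffer=True))  # assumed fixed flow
```

The model makes the trigger explicit: a crash is only expected when the second requested size is larger than the first.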
Figure 8 Immunity Debugger showing the two breakpoints

When execution hits each breakpoint, we can read the size of the space being allocated. Now create an MSWMM file that makes the first allocation smaller than the second, which causes a buffer overflow and crashes the application.

Conclusion

This paper is an overview of how 1-day exploits and vulnerability signatures can be created by reversing the patches supplied by the vendor. An attempt has been made to understand the process involved in reversing and the problems faced while carrying it out. The concepts presented here will, however, need to be refined by the interested reader through further research and practice. This paper discusses only Microsoft patches, which served our purpose; other vendors' patches are left to the reader.
