Introduction RevEngE Evaluation Final Remarks
RevEngE is a dish served cold: Debug-Oriented
Malware Decompilation and Reassembly
Marcus Botacin1, Lucas Galante2,
Paulo Lício de Geus2, André Grégio1
1Federal University of Paraná (UFPR-BR)
{mfbotacin, gregio}@inf.ufpr.br
2University of Campinas (UNICAMP-BR)
{galante, paulo}@lasca.ic.unicamp.br
RevEngE is a dish served cold: Debug-Oriented Malware Decompilation and Reassembly 1 / 45 ROOTS’19
Who Am I?
Background
Computer Engineer (University of Campinas–Brazil).
CS Master (University of Campinas–Brazil).
CS PhD Student (Federal University of Paraná–Brazil).
Malware Analyst (Since 2012).
Research Interests
Malware Analysis & Detection.
Hardware-Assisted Security.
The Problem
Topics
1 Introduction
The Problem
Background
2 RevEngE
Overview
Architecture
3 Evaluation
Malware Decompilation
Malware Reassembly
4 Final Remarks
Limitations
Conclusion
Questions?
The Problem
Malware
Hard to understand at a low level (e.g., assembly).
Decompilers
Lift low level constructions to high level semantics.
Allow API and/or source code analyses.
Decompilation Challenges
Malware does not behave well.
Malware implements anti-analysis tricks.
Malware binaries exhibit dead code.
Insights & Proposal (1/2)
Current Decompilers
They perform reasonably well with small pieces of code.
They do not perform well with static disassembly.
State-of-the-art Decompilers
Current Debuggers
They can perform dynamic disassembly and/or inspection.
Insights & Proposal (2/2)
Current Analysts’ Tasks
Analysts already debug binaries in a sliced manner.
Analysts perform their own anti-anti-analysis routines.
What If...
We could combine the analyst's manual work with a decompiler?
We could decompile the small pieces debugged by the analyst?
We could allow analysts to overcome anti-analysis by themselves?
Background
Compiler
Parsing, Pre-Processing, Assembling, Optimization, Linking,
and Code Generation.
Decompiler
Disassembly, Lifting, data type recovery, and Code Generation.
Notice that:
Not the same code generation routines.
Decompiler is an inverse compiler.
There are cross-platform compilers and decompilers.
The Challenges (1/2)
Disassembly
Opaque Constants.
Overlapping Instructions.
Data and Code are mixed.
Lifting
A typical ISA is VERY large.
Have you ever executed VFMADDSUBPS?
And OS support matters as well...
Do you know what NUMA is?
The Challenges (2/2)
Data Type Reconstruction
What is the difference between an array (int a[2];) and
consecutive variables (int a,b;)?
Is 0x77FF... an integer or a pointer?
Code Generation
How to implement?
Which optimizations?
How to name variables?
Evaluation
Is recovered code a good metric for malware decompilation?
Overview
Reverse Engineering Engine
Overview
PoC Decompiler focused on malware analysis.
GDB-powered (no reimplementation).
Dynamic Inspection (no static analysis constraints).
Trace-Oriented (decompile what is debugged).
Reassembler (merges the decompiled pieces into a new program).
Architecture
RevEngE-GDB Integration
Figure: RevEngE Architecture. GDB provides the basic debugging
capabilities and was armored to handle malware anti-analysis techniques.
RevEngE decompiler is developed on top of the armored GDB.
GDB Armoring
    __libc_start_main (main=<value>, argc=<value>, ubp_av=<value>,
                       init=<value>, fini=<value>, rtld_fini=<value>,
                       stack_end=<value>)
Code Snippet 1: Libc Entry Point. The first argument points to the
application entry point.
    output = gdb.execute("set $eflags |= 0x%x" % self.flag_map[flag], to_string=True)
Code Snippet 2: Invert Branch Direction. The flags register is changed
according to a map of the possible flags for the given command.
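Code Snippet 2 sets flag bits with an OR. As a hedged illustration of what flipping a branch outcome can look like, the sketch below toggles a single flag with XOR and formats the matching GDB command. The bit masks are the architectural x86 EFLAGS values; the helper names toggled_eflags and eflags_command are hypothetical and not part of RevEngE. Inside GDB, the resulting string would be passed to gdb.execute as in the snippet above.

```python
# Architectural x86 EFLAGS bit masks.
EFLAGS_BITS = {
    "CF": 0x0001,  # carry
    "PF": 0x0004,  # parity
    "ZF": 0x0040,  # zero
    "SF": 0x0080,  # sign
    "OF": 0x0800,  # overflow
}


def toggled_eflags(current: int, flag: str) -> int:
    """Return the EFLAGS value with one flag flipped (hypothetical helper)."""
    return current ^ EFLAGS_BITS[flag]


def eflags_command(value: int) -> str:
    """Format a GDB command that writes the whole flags register."""
    return "set $eflags = 0x%x" % value


# Flipping ZF inverts the outcome of a pending je/jne.
before = 0x246                      # example value with ZF set
after = toggled_eflags(before, "ZF")
print(eflags_command(after))
```

Branches that read several flags at once (jle, for example) may need more than one toggle, which is presumably why the original code keeps a per-command flag map.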
Instruction Representation
Figure: Instruction Representation. RevEngE relies on Python's
polymorphism to model each instruction's behavior and overloads
methods to support the multiple argument types that an x86 instruction
may take.
Instruction Factory
    class IFactory(...):
        def get(self, args):
            newclass = globals()[name](args)
            return newclass
Code Snippet 3: Instruction Factory. The Factory design pattern
allows instantiating objects of the proper class by exploiting Python's
OOP capabilities.
    self.classes['div'] = "IDiv"
    self.classes['divl'] = "IDiv"
    self.classes['idiv'] = "IDiv"
    self.classes['idivl'] = "IDiv"
Code Snippet 4: Instruction Lifting. RevEngE assumes only signed
integer operations to handle all instructions via the same high-level
class.
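The factory and the mnemonic map from Code Snippets 3 and 4 can be put together in a small self-contained sketch. This is an illustration under assumptions, not RevEngE's code: it stores class objects in an explicit registry instead of looking names up through globals(), and the IDiv body and its to_c method are hypothetical.

```python
class IDiv:
    """High-level signed division shared by div, divl, idiv, and idivl."""

    def __init__(self, args):
        self.args = args

    def to_c(self, dividend="var0", quotient="var1"):
        # Hypothetical rendering of the lifted operation as a C statement.
        return f"{quotient} = {dividend} / {self.args[0]};"


class IFactory:
    """Maps assembly mnemonics to the class that models them."""

    def __init__(self):
        # All four mnemonics collapse into one signed-division class,
        # mirroring the simplification described in Code Snippet 4.
        self.classes = {m: IDiv for m in ("div", "divl", "idiv", "idivl")}

    def get(self, mnemonic, args):
        return self.classes[mnemonic](args)


factory = IFactory()
instruction = factory.get("idivl", ["var3"])
print(instruction.to_c())
```

Treating div and idiv alike keeps the class hierarchy small, at the cost of wrong results for unsigned operands with the top bit set.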
Lifting Complex Instructions
    0x4004eb cmp -0x8(%rbp),%eax
    0x4004ee jle 4004fb <main+0x25>
Code Snippet 5: Low-level representation of a conditional decision. IF
statements are composed of multiple assembly instructions.
    class HighLevelCompare():
        def __init__(self, cmp, set):
            self.op1 = cmp.op1
            self.op2 = cmp.op2
            self.op3 = set.op3
Code Snippet 6: High-level conditional decision representation.
Assembly instructions are promoted to a single class that represents a
high-level conditional structure (e.g., IFs).
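To make the promotion step concrete, here is a minimal sketch of fusing a cmp and the conditional jump that follows it into one high-level condition. The class and function names (Cmp, Jcc, HighLevelIf, fuse) and the goto-style output are assumptions chosen for illustration; only the idea of merging the two instructions comes from the slides.

```python
from dataclasses import dataclass

# In AT&T syntax, `cmp src, dst` followed by `jle` jumps when dst <= src
# (signed comparison). Only a few signed conditions are listed here.
JCC_TO_OPERATOR = {
    "je": "==", "jne": "!=",
    "jl": "<", "jle": "<=",
    "jg": ">", "jge": ">=",
}


@dataclass
class Cmp:
    src: str
    dst: str


@dataclass
class Jcc:
    mnemonic: str
    target: str


@dataclass
class HighLevelIf:
    left: str
    operator: str
    right: str
    target: str

    def to_c(self) -> str:
        return f"if ({self.left} {self.operator} {self.right}) goto {self.target};"


def fuse(cmp: Cmp, jcc: Jcc) -> HighLevelIf:
    """Promote a cmp/jcc pair to a single conditional object."""
    return HighLevelIf(cmp.dst, JCC_TO_OPERATOR[jcc.mnemonic], cmp.src, jcc.target)


print(fuse(Cmp("var0", "eax"), Jcc("jle", "L_4004fb")).to_c())
```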
Handling Variables
    self.vars = VariableManager()
    self.vars.remove_registers(reg=arg1.get_operand())
    self.vars.check_is_pointer(var.get_value())
Code Snippet 7: Variable Management. RevEngE does not handle
variables directly but via a centralized manager to keep context
consistent.
    self.var = self.vars.new_var(reg="%eax")
    self.var = self.vars.new_var(reg=arg1.get_operand(), value=val)
    self.var = self.vars.new_var(value=arg1.get_value(), mem=arg2.get_operand())
Code Snippet 8: Variable Manager. Context complexity is
encapsulated by the manager, thus freeing RevEngE to focus on
decompilation logic.
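The method names in Code Snippets 7 and 8 suggest roughly the following shape for the manager. The sketch is hypothetical: the method bodies, the Variable record, and the pointer test (a value counts as a pointer if it falls inside a known mapped address range, such as the ranges a debugger reports for the process) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Variable:
    name: str
    value: Optional[int] = None
    reg: Optional[str] = None
    mem: Optional[int] = None


class VariableManager:
    """Hypothetical stand-in for the manager named in the snippets above."""

    def __init__(self, mapped_ranges: List[Tuple[int, int]]):
        self.mapped_ranges = mapped_ranges  # assumed [start, end) ranges
        self.variables: List[Variable] = []

    def new_var(self, reg=None, value=None, mem=None) -> Variable:
        # Reuse the variable already bound to this memory address, if any.
        if mem is not None:
            for var in self.variables:
                if var.mem == mem:
                    if value is not None:
                        var.value = value
                    if reg is not None:
                        var.reg = reg
                    return var
        var = Variable(f"var{len(self.variables)}", value, reg, mem)
        self.variables.append(var)
        return var

    def remove_registers(self, reg: str) -> None:
        # Forget register bindings once the register is overwritten.
        for var in self.variables:
            if var.reg == reg:
                var.reg = None

    def check_is_pointer(self, value: int) -> bool:
        return any(start <= value < end for start, end in self.mapped_ranges)
```

With a stack range of 0x7ffffffde000 to 0x7ffffffff000, the value 0x7fffffffdc7c would be classified as a pointer and 0xF as a plain integer, which is one simple way to approach the integer-or-pointer question raised earlier in the talk.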
Variable Disambiguation
    main movl $0xF -0x4(%rbp)
    NAME: [var0]
    VAL: [0xF]
    REG: [NONE]
    MEM: [7fffffffdc7c]

    main mov -0x8(%rbp) %eax
    NAME: [var0]
    VAL: [0xF]
    REG: [NONE]
    MEM: [7fffffffdc7c]
Code Snippet 9: Memory References Disambiguation. Variables are
referenced by their memory addresses instead of by the registers that
point to them.
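Resolving a register-relative operand to an absolute address is what makes this disambiguation possible: two operands that look different name the same variable whenever they resolve to the same address. The sketch below handles only the displacement(base) form of AT&T operands; effective_address is a hypothetical helper, and the register value is chosen so that the result matches the address shown in Code Snippet 9.

```python
import re

# Matches AT&T operands such as "-0x4(%rbp)" or "(%rsp)".
OPERAND = re.compile(r"^(?P<disp>-?0x[0-9a-fA-F]+)?\(%(?P<base>\w+)\)$")


def effective_address(operand: str, registers: dict) -> int:
    """Resolve disp(%base) to an absolute address using live register values."""
    match = OPERAND.match(operand)
    if match is None:
        raise ValueError(f"unsupported operand: {operand!r}")
    displacement = int(match.group("disp") or "0", 16)
    return registers[match.group("base")] + displacement


# Assumed register state at the time of the access.
registers = {"rbp": 0x7FFFFFFFDC80}
print(hex(effective_address("-0x4(%rbp)", registers)))
```

Because the debugger supplies concrete register values at each step, no static alias analysis is needed to tell whether two operands refer to the same slot.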
Function Introspection
    printf@stdio.h: int printf ( const char * format, ... ); (Return: int) (N_Args: 2)
Code Snippet 10: Introspection Procedure. External function
prototypes are identified by searching for function and library names
on the Internet and parsing them to a format suitable for RevEngE
decompilation.
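The parsing half of this procedure can be sketched with a regular expression. The code below is an assumption-laden illustration, not RevEngE's parser: parse_prototype and the Prototype record are hypothetical names, the lookup step is omitted, and the variadic marker is counted as an argument only because that reproduces the N_Args value shown in Code Snippet 10.

```python
import re
from typing import NamedTuple


class Prototype(NamedTuple):
    name: str
    return_type: str
    n_args: int


PROTOTYPE_RE = re.compile(
    r"^\s*(?P<ret>.+?)\s*\b(?P<name>\w+)\s*\((?P<args>.*)\)\s*;?\s*$"
)


def parse_prototype(declaration: str) -> Prototype:
    """Extract name, return type, and argument count from a C prototype."""
    match = PROTOTYPE_RE.match(declaration)
    if match is None:
        raise ValueError(f"not a function prototype: {declaration!r}")
    args = [arg.strip() for arg in match.group("args").split(",")]
    # An empty list or a lone `void` means the function takes no arguments.
    args = [arg for arg in args if arg and arg != "void"]
    return Prototype(match.group("name"), match.group("ret").strip(), len(args))


print(parse_prototype("int printf ( const char * format, ... );"))
```

Splitting on commas is enough for simple libc prototypes but would miscount arguments that are themselves function pointers.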
Code Generation
Figure: Code Generation. RevEngE keeps distinct objects for the
same instruction address, thus representing the multiple calling contexts.
Loop unrolling is performed by removing the top of stack each time a
given instruction address is referred.
Malware Decompilation
Instructions Per Binary
Figure: Number of instructions per binary (x-axis: instructions, 1K to
800K; y-axis: number of binaries; series: goodware and malware).
Malware samples executed more instructions than goodware samples.
Handled Instructions per binary
Figure: Handled instructions per binary (x-axis: handled instructions,
70% to 96%; y-axis: number of binaries; series: goodware and malware).
Most binaries were successfully handled. Malware samples impose greater
challenges than goodware samples.
Malware Reassembly
Tsunami/Backdoor
    call 0x8048dfc <rand@plt>
    mov  %eax,%ecx
    mov  $0x66666667,%eax
    imul %ecx
    sar  %edx
    mov  %ecx,%eax
    sar  $0x1f,%eax
    sub  %eax,%edx
    mov  %edx,%eax
    shl  $0x2,%eax
Code Snippet 11: Tsunami/Backdoor. Assembly code for the traced
function.
    void makestring(char *var3) {
        int var1=0, var2=MAX_STRING,
            var6=0x666667, var9=0x1f, var12=2;
        for(var4=var1; var4<var2; var4++){
            var5=rand(); var7=var6/var5;
            var8=var6%var5; var10=var7>>var9;
            var11=var8-var10; var13=var11<<var12;
            var3[var4]=var13;
Code Snippet 12: Tsunami/Backdoor. Decompiled code for the function.
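As background for reading the traced assembly: the constant 0x66666667 followed by an arithmetic shift and a sign correction is the pattern compilers commonly emit for signed 32-bit division by 5. This is a general observation about compiler output, not a claim made in the slides, and div5_magic is a hypothetical name. The sketch replays the traced arithmetic step by step using Python integers.

```python
MAGIC = 0x66666667  # multiplier seen in the traced assembly


def div5_magic(x: int) -> int:
    """Signed 32-bit division by 5, following the traced instruction sequence."""
    high = (x * MAGIC) >> 32   # imul: upper half of the 64-bit product (%edx)
    quotient = high >> 1       # sar %edx
    sign = x >> 31             # sar $0x1f: -1 for negative x, 0 otherwise
    return quotient - sign     # sub %eax,%edx


for sample in (0, 7, 12345, -6, 2**31 - 1):
    print(sample, div5_magic(sample))
```

Patterns like this are one reason lifting arithmetic instruction by instruction tends to yield code that is correct but harder to read than the original source.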
Exploit/Trojan
    call 0x80484b4 <atoi@plt>
    add  $0x10,%esp
    mov  %eax,%eax
    mov  %eax,%eax
    mov  %eax,-0x18(%ebp)
    cmpl $0x2,0x8(%ebp)
    jle  0x804862a <main+90>
    push $0x1
    call 0x80484a4 <exit@plt>
Code Snippet 13: Exploit/Trojan. Assembly code for the traced
function.
    char var1[MAX_STRING];
    int var2=0, var3=3, var4=1,
        var6=0xf, var7=2, var8=0xff;
    if(argc == var3){ var5=atoi(argv[var4]);
    if(var5 == var6){ var5=atoi(argv[var7]);
    if(var5 == var8){
Code Snippet 14: Exploit/Trojan. Decompiled code for the function.
Micmp/Backdoor
    call 0x8048734 <time@plt>
    add  $0x4,%esp
    push %eax
    call 0x8048794 <srand@plt>
    add  $0x10,%esp
    sub  $0x4,%esp
    sub  $0xc,%esp
    call 0x8048814 <rand@plt>
    add  $0xc,%esp
    mov  %eax,%edx
    sar  $0x1f,%edx
    idiv %ecx
Code Snippet 15: Micmp/Backdoor. Assembly code for the traced
function.
    void return_randip(char *var1){
        int var3=0xB; srand(time(NULL));
        var2 = rand(); var4 = var2 / var3;
        var5 = rand(); var6 = var5 / var3;
        var7 = rand(); var8 = var7 / var3;
        var9 = rand(); var10 = var9 / var3;
        sprintf(var1,"%d.%d.%d.%d",var ...);
Code Snippet 16: Micmp/Backdoor. Decompiled code for the function.
Small/Backdoor
    movl $0x8049798,(%esp)
    call 0x80487a8 <system@plt>
    movl $0x80497bb,(%esp)
    call 0x80487a8 <system@plt>
Code Snippet 17: Small/Backdoor. Assembly code for the traced
function.
    void open_firewall(){
        char var1[]="iptables -F INPUT";
        char var2[]="iptables -P INPUT ACCEPT";
        system(var1); system(var2);
Code Snippet 18: Small/Backdoor. Decompiled code for the function.
RST/Virus
    call 0x804a104 <openlog@plt>
    push %ebx
    push $0x806f5e7
    push $0x7
    call 0x8049fa4 <syslog@plt>
    call 0x804a1b4 <closelog@plt>
    <userfile_remove>:
    call 8049f54 <remove@plt>
Code Snippet 19: RST/Virus. Assembly code for the traced function.
    int debug(){
        FILE *var1;
        char var2[]="/var/log/syslog";
        char var4[]="r";
        int var3=0;
        var1 = fopen(var2,var4);
        if(var1){ var3=1; }
        return var3;
Code Snippet 20: RST/Virus. Decompiled code for the function.
Reassembled Malware Detection
Figure: No AV detected the reassembled malware sample.
Limitations & Future Work
Limitations
Proof-of-Concept (PoC) for future developments.
Limited instruction set (x86, no floats).
C-like binaries only.
ELF binaries only.
Future Work
Implementing RevEngE in a real decompiler.
Radare2? IDA/HexRays? What else?
Conclusion
Takeaways
Decompilers enable high-level analyses.
Full semantic reconstruction is challenging.
We know how to decompile small pieces of code.
Analysts already debug sliced binaries.
Moving towards trace-driven decompilation is the right move!
Try RevEngE (1/2)
https://github.com/marcusbotacin/Reverse.Engineering.Engine
Figure: RevEngE source code.
Try RevEngE (2/2)
https://corvus.inf.ufpr.br/
Figure: Interactive, web-based RevEngE console.
Questions?
Contact
mfbotacin@inf.ufpr.br
@MarcusBotacin
Towards Malware Decompilation and Reassembly

  • 1. Introduction RevEngE Evaluation Final Remarks RevEngE is a dish served cold: Debug-Oriented Malware Decompilation and Reassembly Marcus Botacin1, Lucas Galante2, Paulo L´ıcio de Geus2, Andr´e Gr´egio1 1Federal University of Paran´a (UFPR-BR) {mfbotacin, gregio}@inf.ufpr.br 2University of Campinas (UNICAMP-BR) {galante, paulo}@lasca.ic.unicamp.br RevEngE is a dish served cold: Debug-Oriented Malware Decompilation and Reassembly 1 / 45 ROOTS’19
  • 2. Introduction RevEngE Evaluation Final Remarks Who Am I? Background Computer Engineer (University of Campinas–Brazil). CS Master (University of Campinas–Brazil). CS PhD Student (Federal University of Paran´a–Brazil). Malware Analyst (Since 2012). Research Interests Malware Analysis & Detection. Hardware-Assisted Security. RevEngE is a dish served cold: Debug-Oriented Malware Decompilation and Reassembly 2 / 45 ROOTS’19
  • 3. Introduction RevEngE Evaluation Final Remarks The Problem Topics 1 Introduction The Problem Background 2 RevEngE Overview Architecture 3 Evaluation Malware Decompilation Malware Reassembly 4 Final Remarks Limitations Conclusion Questions? RevEngE is a dish served cold: Debug-Oriented Malware Decompilation and Reassembly 3 / 45 ROOTS’19
  • 4. Introduction RevEngE Evaluation Final Remarks The Problem The Problem Malware Hard to Understand at low level (e.g. assembly). Decompilers Lift low level constructions to high level semantics. Allow API and/or source code analyses. Decompilation Challenges Malware do not behave well. Malware implement anti-analysis tricks. Malware binaries exhibit dead code. RevEngE is a dish served cold: Debug-Oriented Malware Decompilation and Reassembly 4 / 45 ROOTS’19
  • 5. Introduction RevEngE Evaluation Final Remarks The Problem Insights & Proposal (1/2) Current Decompilers They perform reasonably well with small pieces of code. They do not perform well with static disassembly. State-of-the-art Decompilers Current Debuggers They can perform dynamic disassembly and/or inspection. RevEngE is a dish served cold: Debug-Oriented Malware Decompilation and Reassembly 5 / 45 ROOTS’19
  • 6. Introduction RevEngE Evaluation Final Remarks The Problem Insights & Proposal (2/2) Current Analysts’ Tasks Analysts already debug binaries in a sliced manner. Analysts perform their own anti-anti-analysis routines. What If... We could combine analysts manual work with decompiler? We could decompile the small pieces debugged by the analyst? We could allow the analyst to overcome anti-analysis by themselves? RevEngE is a dish served cold: Debug-Oriented Malware Decompilation and Reassembly 6 / 45 ROOTS’19
  • 7. Introduction RevEngE Evaluation Final Remarks Background Topics 1 Introduction The Problem Background 2 RevEngE Overview Architecture 3 Evaluation Malware Decompilation Malware Reassembly 4 Final Remarks Limitations Conclusion Questions? RevEngE is a dish served cold: Debug-Oriented Malware Decompilation and Reassembly 7 / 45 ROOTS’19
  • 8. Introduction RevEngE Evaluation Final Remarks Background Background Compiler Parsing, Pre-Processing, Assemblying, Optimization, Linking and Code Generation. Decompiler Disassembly, Lifting, data type recovery, and Code Generation. Notice that: Not the same code generation routines. Decompiler is an inverse compiler. There are cross-platform compilers and decompilers. RevEngE is a dish served cold: Debug-Oriented Malware Decompilation and Reassembly 8 / 45 ROOTS’19
  • 9. The Challenges (1/2)
      Disassembly: opaque constants; overlapping instructions; data and code are mixed.
      Lifting: a typical ISA is VERY large. Have you ever executed VFMADDSUBPS? And O.S. support as well... Do you know what NUMA is?
  • 10. The Challenges (2/2)
      Data Type Reconstruction: What is the difference between an array (int a[2];) and consecutive variables (int a,b;)? Is 0x77FF... an integer or a pointer?
      Code Generation: How to implement? Which optimizations? How to name variables?
      Evaluation: Is recovered code a good metric for malware decompilation?
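The integer-or-pointer question above can be approached dynamically: a value is a pointer candidate only if it falls inside a region that is currently mapped in the traced process. The following is a minimal sketch of that check, not RevEngE's implementation; `MemoryRegion`, `looks_like_pointer`, and the sample address ranges are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class MemoryRegion:
    """Hypothetical record for one mapped region of the traced process."""
    start: int  # inclusive lower bound
    end: int    # exclusive upper bound
    name: str


def looks_like_pointer(value: int, regions: Iterable[MemoryRegion]) -> bool:
    """Return True if the value falls inside any mapped region."""
    return any(r.start <= value < r.end for r in regions)


# Illustrative layout only; real ranges would come from the debugger.
regions = [
    MemoryRegion(0x400000, 0x401000, "text"),
    MemoryRegion(0x7FFFFFFDE000, 0x7FFFFFFFF000, "stack"),
]

print(looks_like_pointer(0x7FFFFFFFDC7C, regions))  # stack-like value
print(looks_like_pointer(0xF, regions))             # small constant
```

A range check of this kind is only a heuristic: an integer may coincidentally land inside a mapped region, so a real tool would likely combine it with how the value is used.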
  • 12. Reverse Engineering Engine: Overview
      PoC decompiler focused on malware analysis.
      GDB-powered (no reimplementation).
      Dynamic inspection (no static analysis constraints).
      Trace-oriented (decompile what is debugged).
      Reassembler (merge the decompiled pieces into a new program).
  • 14. RevEngE-GDB Integration
      Figure: RevEngE Architecture. GDB provides the basic debugging capabilities and was armored to handle malware anti-analysis techniques. The RevEngE decompiler is built on top of the armored GDB.
  • 15. GDB Armoring
          __libc_start_main (main=<value>, argc=<value>, ubp_av=<value>, init=<value>, fini=<value>, rtld_fini=<value>, stack_end=<value>)
      Code Snippet 1: Libc Entry Point. The first argument points to the application entry point.
          output = gdb.execute("set $eflags |= 0x%x" % self.flag_map[flag], to_string=True)
      Code Snippet 2: Invert Branch Direction. The flags register is changed according to a map of the possible flags for such a command.
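To make the flag manipulation in Code Snippet 2 concrete, the sketch below builds the GDB command strings for setting and clearing a single x86 flag. The bit masks are the architectural EFLAGS positions; the names `FLAG_MAP`, `set_flag_command`, and `clear_flag_command` are hypothetical, and the clearing variant is an assumption added here for the complementary case rather than something shown on the slide. Inside GDB, the resulting string would be passed to `gdb.execute`, which is only available in GDB's embedded Python.

```python
# x86 EFLAGS bit masks (architectural constants).
FLAG_MAP = {"CF": 0x001, "PF": 0x004, "ZF": 0x040, "SF": 0x080, "OF": 0x800}


def set_flag_command(flag: str) -> str:
    """Build a GDB command that sets one flag, mirroring Code Snippet 2."""
    return "set $eflags |= 0x%x" % FLAG_MAP[flag]


def clear_flag_command(flag: str) -> str:
    """Build a GDB command that clears one flag (assumed complementary case)."""
    return "set $eflags &= ~0x%x" % FLAG_MAP[flag]


# Forcing ZF on makes a pending `je` taken; forcing it off makes it fall through.
print(set_flag_command("ZF"))
print(clear_flag_command("ZF"))
```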
  • 16. Instruction Representation
      Figure: Instruction Representation. RevEngE relies on Python’s polymorphism to model instruction behaviors and overloads methods to support the multiple argument types each x86 instruction may take.
  • 17. Instruction Factory
          class IFactory(...):
              def get(self, args):
                  newclass = globals()[name](args)
                  return newclass
      Code Snippet 3: Instruction Factory. The Factory design pattern allows instantiating objects from the proper class by exploiting Python’s OOP capabilities.
          self.classes['div'] = "IDiv"
          self.classes['divl'] = "IDiv"
          self.classes['idiv'] = "IDiv"
          self.classes['idivl'] = "IDiv"
      Code Snippet 4: Instruction Lifting. RevEngE assumes only signed integer operations, so all of these instructions are handled by the same high-level class.
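The factory and the mnemonic table from Code Snippets 3 and 4 can be combined into one small, runnable sketch. This is an illustration under assumptions, not RevEngE's code: the `Instruction` base class, the `IAdd` class, and the `lift` method are hypothetical, and the table maps mnemonics to class objects directly instead of to class names resolved through `globals()`.

```python
class Instruction:
    """Hypothetical base class for a lifted instruction."""

    def __init__(self, mnemonic, operands):
        self.mnemonic = mnemonic
        self.operands = operands


class IDiv(Instruction):
    def lift(self):
        # All division variants lift to the same high-level operator.
        return "/"


class IAdd(Instruction):
    def lift(self):
        return "+"


class IFactory:
    """Maps assembly mnemonics to the class that models them."""

    def __init__(self):
        self.classes = {
            "div": IDiv, "divl": IDiv, "idiv": IDiv, "idivl": IDiv,
            "add": IAdd, "addl": IAdd,
        }

    def get(self, mnemonic, operands):
        try:
            cls = self.classes[mnemonic]
        except KeyError:
            raise NotImplementedError("unhandled instruction: " + mnemonic) from None
        return cls(mnemonic, operands)


factory = IFactory()
insn = factory.get("idivl", ["%ecx"])
print(type(insn).__name__, insn.lift())
```

Storing class objects in the table avoids a string lookup at instantiation time and turns an unknown mnemonic into an explicit error, which is convenient when measuring how many traced instructions are handled.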
  • 18. Lifting Complex Instructions
          0x4004eb cmp -0x8(%rbp),%eax
          0x4004ee jle 4004fb <main+0x25>
      Code Snippet 5: Low-level representation of a conditional decision. IF statements are composed of multiple assembly instructions.
          class HighLevelCompare():
              def __init__(self, cmp, set):
                  self.op1 = cmp.op1
                  self.op2 = cmp.op2
                  self.op3 = set.op3
      Code Snippet 6: High-level representation of a conditional decision. Assembly instructions are promoted to a single class that represents a high-level conditional structure (e.g., IFs).
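The promotion of a compare-and-jump pair to a single condition can be sketched as follows. In AT&T syntax, `cmp src, dst` followed by `jle` branches when `dst <= src` (signed). The `Cmp` and `Jcc` records, the `JCC_TO_OPERATOR` table, and `lift_compare` are hypothetical names for illustration; only signed conditional jumps are covered, in line with the signed-only assumption stated on the previous slide.

```python
from dataclasses import dataclass

# Signed conditional jumps and the C operator for the "jump taken" case.
JCC_TO_OPERATOR = {
    "je": "==", "jne": "!=",
    "jl": "<", "jle": "<=",
    "jg": ">", "jge": ">=",
}


@dataclass
class Cmp:
    src: str  # first operand in AT&T order
    dst: str  # second operand in AT&T order


@dataclass
class Jcc:
    mnemonic: str
    target: int


def lift_compare(cmp: Cmp, jcc: Jcc) -> str:
    """Render the condition under which the jump is taken."""
    operator = JCC_TO_OPERATOR[jcc.mnemonic]
    # AT&T `cmp src, dst` evaluates dst - src, so dst goes on the left.
    return "%s %s %s" % (cmp.dst, operator, cmp.src)


# cmp -0x8(%rbp),%eax ; jle 4004fb, with operands already mapped to variables.
print(lift_compare(Cmp(src="var1", dst="var0"), Jcc("jle", 0x4004FB)))
```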
  • 19. Handling Variables
          self.vars = VariableManager()
          self.vars.remove_registers(reg=arg1.get_operand())
          self.vars.check_is_pointer(var.get_value())
      Code Snippet 7: Variable Management. RevEngE does not handle variables directly but through a centralized manager that keeps the context consistent.
          self.var = self.vars.new_var(reg="%eax")
          self.var = self.vars.new_var(reg=arg1.get_operand(), value=val)
          self.var = self.vars.new_var(value=arg1.get_value(), mem=arg2.get_operand())
      Code Snippet 8: Variable Manager. Context complexity is encapsulated by the manager, freeing RevEngE to focus on decompilation logic.
  • 20. Variable Disambiguation
          main movl $0xF -0x4(%rbp)
          NAME: [var0]
          VAL: [0xF]
          REG: [NONE]
          MEM: [7fffffffdc7c]

          main mov -0x8(%rbp) %eax
          NAME: [var0]
          VAL: [0xF]
          REG: [NONE]
          MEM: [7fffffffdc7c]
      Code Snippet 9: Memory References Disambiguation. Variables are referenced by their memory addresses instead of by the registers pointing to them.
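The idea of naming variables by resolved address rather than by operand text can be sketched as below. This is a simplified illustration, not RevEngE's `VariableManager`: the class here only assigns names, and `effective_address`, the operand pattern, and the register values are assumptions made for the example.

```python
import re
from typing import Dict

# Matches AT&T memory operands such as "-0x4(%rbp)" or "(%rax)".
OPERAND = re.compile(r"^(-?0x[0-9a-fA-F]+)?\(%(\w+)\)$")


def effective_address(operand: str, registers: Dict[str, int]) -> int:
    """Resolve a displacement(base) operand using current register values."""
    match = OPERAND.match(operand)
    if match is None:
        raise ValueError("unsupported operand: " + operand)
    displacement = int(match.group(1), 16) if match.group(1) else 0
    return registers[match.group(2)] + displacement


class VariableManager:
    """Hands out one variable name per distinct memory address."""

    def __init__(self):
        self._by_address = {}

    def name_for(self, address: int) -> str:
        if address not in self._by_address:
            self._by_address[address] = "var%d" % len(self._by_address)
        return self._by_address[address]


manager = VariableManager()
# Illustrative register values, as a debugger would report them.
first = effective_address("-0x4(%rbp)", {"rbp": 0x7FFFFFFFDC80})
second = effective_address("(%rax)", {"rax": 0x7FFFFFFFDC7C})
print(hex(first), manager.name_for(first))
print(hex(second), manager.name_for(second))
```

Because both operands resolve to the same address, they receive the same variable name even though the operand text differs.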
  • 21. Function Introspection
          printf@stdio.h: int printf ( const char * format, ... ); (Return: int) (N_Args: 2)
      Code Snippet 10: Introspection Procedure. External function prototypes are identified by searching for function and library names on the Internet and parsing them to a format suitable for RevEngE decompilation.
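The parsing half of this procedure, turning a C prototype into a return type and an argument count, can be sketched as follows. The function `parse_prototype` and the fields of its result are hypothetical; the sketch assumes simple prototypes and would not handle function-pointer parameters, whose declarations contain nested parentheses and commas.

```python
def parse_prototype(declaration: str) -> dict:
    """Split a simple C prototype into name, return type, and arguments."""
    head, _, rest = declaration.partition("(")
    args_text = rest.rpartition(")")[0]

    # Separate '*' so "char *strcpy" yields return type "char *".
    tokens = head.replace("*", " * ").split()
    name = tokens[-1]
    return_type = " ".join(tokens[:-1])

    args = [arg.strip() for arg in args_text.split(",") if arg.strip()]
    if args == ["void"]:
        args = []

    return {
        "name": name,
        "return": return_type,
        "args": args,
        "n_args": len(args),
        "variadic": "..." in args,
    }


info = parse_prototype("int printf ( const char * format, ... );")
print(info["name"], info["return"], info["n_args"], info["variadic"])
```

Counting the ellipsis as one argument reproduces the `N_Args: 2` shown for `printf` in Code Snippet 10.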
  • 22. Code Generation
      Figure: Code Generation. RevEngE keeps distinct objects for the same instruction address, thus representing the multiple calling contexts. Loop unrolling is performed by removing the top of the stack each time a given instruction address is referenced.
  • 24. Instructions Per Binary
      Figure: Number of instructions per binary (axes: Instructions (#) and Binaries (#); series: Goodware, Malware). Malware samples executed more instructions than goodware samples.
  • 25. Handled Instructions Per Binary
      Figure: Handled instructions per binary (axes: Instructions (%) and Binaries (#); series: Goodware, Malware). Most binaries were successfully handled. Malware samples impose greater challenges than goodware samples.
  • 27. Tsunami/Backdoor
          call 0x8048dfc <rand@plt>
          mov %eax,%ecx
          mov $0x66666667,%eax
          imul %ecx
          sar %edx
          mov %ecx,%eax
          sar $0x1f,%eax
          sub %eax,%edx
          mov %edx,%eax
          shl $0x2,%eax
      Code Snippet 11: Tsunami/Backdoor. Assembly code for the traced function.
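One way to read the listing above: multiplying by the constant 0x66666667 and then applying `sar` and `sub` is the pattern compilers commonly emit for signed 32-bit division by 5, and the final `shl $0x2` multiplies the quotient by 4. The slide does not state this, so the sketch below is an interpretation; it emulates the sequence step by step, and the names `emulate_sequence`, `to_signed32`, and `truncated_div` are hypothetical.

```python
MAGIC = 0x66666667  # constant loaded into %eax in Code Snippet 11


def to_signed32(value: int) -> int:
    """Wrap an integer to a signed 32-bit value."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def emulate_sequence(x: int):
    """Emulate the listing for a signed 32-bit x, the value returned by rand."""
    ecx = x                      # mov %eax,%ecx
    product = MAGIC * ecx        # imul %ecx -> edx:eax (signed 64-bit)
    edx = product >> 32          # high half of the product
    edx >>= 1                    # sar %edx (shift by one)
    eax = ecx >> 31              # mov %ecx,%eax ; sar $0x1f,%eax (sign mask)
    edx = edx - eax              # sub %eax,%edx -> quotient
    eax = to_signed32(edx << 2)  # mov %edx,%eax ; shl $0x2,%eax
    return edx, eax


def truncated_div(a: int, b: int) -> int:
    """C-style integer division, rounding toward zero."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


for sample in (0, 4, 5, 12345, 2**31 - 1, -1, -5, -7):
    quotient, shifted = emulate_sequence(sample)
    print(sample, quotient, shifted, quotient == truncated_div(sample, 5))
```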
  • 28. Tsunami/Backdoor
          void makestring(char *var3) {
            int var1=0, var2=MAX_STRING,
                var6=0x666667, var9=0x1f, var12=2;
            for(var4=var1; var4<var2; var4++){
              var5=rand(); var7=var6/var5;
              var8=var6%var5; var10=var7>>var9;
              var11=var8-var10; var13=var11<<var12;
              var3[var4]=var13;
      Code Snippet 12: Tsunami/Backdoor. Decompiled function code.
  • 29. Exploit/Trojan
          call 0x80484b4 <atoi@plt>
          add $0x10,%esp
          mov %eax,%eax
          mov %eax,%eax
          mov %eax,-0x18(%ebp)
          cmpl $0x2,0x8(%ebp)
          jle 0x804862a <main+90>
          push $0x1
          call 0x80484a4 <exit@plt>
      Code Snippet 13: Exploit/Trojan. Assembly code for the traced function.
  • 30. Exploit/Trojan
          char var1[MAX_STRING];
          int var2=0, var3=3, var4=1,
              var6=0xf, var7=2, var8=0xff;
          if(argc == var3){ var5=atoi(argv[var4]);
          if(var5 == var6){ var5=atoi(argv[var7]);
          if(var5 == var8){
      Code Snippet 14: Exploit/Trojan. Decompiled function code.
  • 31. Micmp/Backdoor
          call 0x8048734 <time@plt>
          add $0x4,%esp
          push %eax
          call 0x8048794 <srand@plt>
          add $0x10,%esp
          sub $0x4,%esp
          sub $0xc,%esp
          call 0x8048814 <rand@plt>
          add $0xc,%esp
          mov %eax,%edx
          sar $0x1f,%edx
          idiv %ecx
      Code Snippet 15: Micmp/Backdoor. Assembly code for the traced function.
  • 32. Micmp/Backdoor
          void return_randip(char *var1){
            int var3=0xB; srand(time(NULL));
            var2 = rand(); var4 = var2 / var3;
            var5 = rand(); var6 = var5 / var3;
            var7 = rand(); var8 = var7 / var3;
            var9 = rand(); var10 = var9 / var3;
            sprintf(var1,"%d.%d.%d.%d",var ...);
      Code Snippet 16: Micmp/Backdoor. Decompiled function code.
  • 33. Small/Backdoor
          movl $0x8049798,(%esp)
          call 0x80487a8 <system@plt>
          movl $0x80497bb,(%esp)
          call 0x80487a8 <system@plt>
      Code Snippet 17: Small/Backdoor. Assembly code for the traced function.
  • 34. Small/Backdoor
          void open_firewall(){
            char var1[]="iptables -F INPUT";
            char var2[]="iptables -P INPUT ACCEPT";
            system(var1); system(var2);
      Code Snippet 18: Small/Backdoor. Decompiled function code.
  • 35. RST/Virus
          call 0x804a104 <openlog@plt>
          push %ebx
          push $0x806f5e7
          push $0x7
          call 0x8049fa4 <syslog@plt>
          call 0x804a1b4 <closelog@plt>
          <userfile_remove>:
          call 8049f54 <remove@plt>
      Code Snippet 19: RST/Virus. Assembly code for the traced function.
  • 36. RST/Virus
          int debug(){
            FILE *var1;
            char var2[]="/var/log/syslog";
            char var4[]="r";
            int var3=0;
            var1 = fopen(var2,var4);
            if(var1){ var3=1; }
            return var3;
      Code Snippet 20: RST/Virus. Decompiled function code.
  • 37. Reassembled Malware Detection
      Figure: No AV detected the reassembled malware sample.
  • 39. Limitations & Future Work
      Limitations: Proof-of-Concept (PoC) for future developments; limited instruction set (x86, no floats); C-like binaries only; ELF binaries only.
      Future Work: implementing RevEngE in a real decompiler. Radare2? IDA/HexRays? What else?
  • 41. Conclusion
      Takeaways:
      Decompilers enable high-level analyses.
      Full semantic reconstruction is challenging.
      We know how to decompile small pieces of code.
      Analysts already debug binaries in slices.
      Moving towards trace-driven decompilation is the right move!
  • 42. Try RevEngE (1/2)
      https://github.com/marcusbotacin/Reverse.Engineering.Engine
      Figure: RevEngE source code.
  • 43. Try RevEngE (2/2)
      https://corvus.inf.ufpr.br/
      Figure: Interactive, web-based RevEngE console.
  • 45. Questions?
      Contact: mfbotacin@inf.ufpr.br, @MarcusBotacin