Deobfuscation and beyond
Vasily Bukasov and Dmitry Schelkunov
https://re-crypt.com
Agenda
• We'll discuss the obfuscation techniques that
commercial (and other) obfuscators use, and how
symbolic equation systems can help deobfuscate
such transformations
• We'll define the requirements for these
systems
• We'll briefly go over the design of our mini
symbolic equation system and show the results of
using it for deobfuscation (and more)
Software obfuscation
• Protects software against piracy
• Protects malware against signature-based and
heuristic-based antivirus detection
Common obfuscation techniques
Recursive substitution
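The slide's illustration is not preserved in this transcript. As a hypothetical sketch of the idea, an operation is replaced by a longer equivalent expression, and the operations inside that replacement are then substituted again. The function names and the two identities chosen here are assumptions for illustration; both identities hold in 32-bit unsigned arithmetic.

```cpp
#include <cstdint>

// Original operation.
std::uint32_t add_plain(std::uint32_t a, std::uint32_t b) {
    return a + b;
}

// Pass 1: a + b  ->  (a ^ b) + 2 * (a & b)
std::uint32_t add_pass1(std::uint32_t a, std::uint32_t b) {
    return (a ^ b) + 2u * (a & b);
}

// Pass 2: the XOR introduced by pass 1 is substituted in turn:
// a ^ b  ->  (a | b) - (a & b)
std::uint32_t add_pass2(std::uint32_t a, std::uint32_t b) {
    return ((a | b) - (a & b)) + 2u * (a & b);
}
```

Each further pass grows the expression while keeping its value, which is why undoing the substitutions one pattern at a time becomes impractical.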
Code duplication
Code duplication in
virtualization obfuscators
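The diagrams for these slides are missing from the transcript. One plausible reading, sketched below under that assumption, is that a virtual machine carries several handlers with the same semantics under different opcodes, so a pattern written for one copy misses the others. The opcode values, the `run` function and the stack layout are all hypothetical, and the sketch does no bounds checking.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stack VM: OP_ADD_A and OP_ADD_B are two copies of "add".
enum : std::uint8_t { OP_PUSH = 0x01, OP_ADD_A = 0x10, OP_ADD_B = 0x7a, OP_HALT = 0xff };

std::uint32_t run(const std::vector<std::uint8_t>& code) {
    std::vector<std::uint32_t> stack;
    std::size_t pc = 0;
    while (pc < code.size()) {
        std::uint8_t op = code[pc++];
        switch (op) {
        case OP_PUSH:
            stack.push_back(code[pc++]);  // one-byte immediate operand
            break;
        case OP_ADD_A: {                  // first copy: plain addition
            std::uint32_t b = stack.back(); stack.pop_back();
            std::uint32_t a = stack.back(); stack.pop_back();
            stack.push_back(a + b);
            break;
        }
        case OP_ADD_B: {                  // duplicate: same result, different body
            std::uint32_t b = stack.back(); stack.pop_back();
            std::uint32_t a = stack.back(); stack.pop_back();
            stack.push_back(a - (~b) - 1u);
            break;
        }
        case OP_HALT:
        default:
            return stack.empty() ? 0u : stack.back();
        }
    }
    return stack.empty() ? 0u : stack.back();
}
```

A bytecode generator could pick either add opcode at random for each addition, so two protected builds of the same function need not share a byte pattern.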
Previous research and products
• The Case for Semantics-Based Methods in Reverse Engineering, Rolf
Rolles, RECON 2012
• Software deobfuscation methods: analysis and implementation, Sh.F.
Kurmangaleev, K.Y. Dolgorukova, V.V. Savchenko, A.R. Nurmukhametov,
H.A. Matevosyan, V.P. Korchagin, Proceedings of the Institute for
System Programming of RAS, volume 24, 2013
• CodeDoctor
– deobfuscates simple expressions
– plugin for OllyDbg and IDA Pro
• VMSweeper
– claims to deobfuscate (devirtualize) Code
Virtualizer/CISC and VMProtect (works well on about 30% of
virtualized samples)
– not a generic tool (heavily relies on templates)
– works as a decompiler, not an optimizer
– weak symbolic equation system
• CodeUnvirtualizer
– claims to deobfuscate (devirtualize) Code
Virtualizer/CISC/RISC and the new Themida VMs
– not a generic tool (heavily relies on templates)
– no symbolic equation system
• Ariadne
– complex toolset for deobfuscation and data flow analysis
– includes a lot of optimization algorithms from compiler theory
– no symbolic equation system
– it seems to be dead :(
• LLVM forks
– are based on LLVM optimization algorithms (classical compiler
theory algorithms)
– we couldn’t find a version that works reasonably well
– are limited by the LLVM architecture (How fast does LLVM handle
500 000 IR instructions? How many system resources does it need?)
The problem
Existing deobfuscation solutions are mostly
based on classical compiler theory algorithms
and in most cases are too weak against modern
obfuscators
Solution
• Use a symbolic equation system (SES) for
deobfuscation
• Build the SES input (translate the source IR
code to the SES representation)
• Simplify the expressions using the SES
• Translate the results from the SES representation
back to IR
• Apply other deobfuscation transformations
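The middle step of this pipeline can be illustrated with a toy expression simplifier. This is a minimal sketch under stated assumptions, not the authors' engine: the `Expr` node layout, the helper names and the three rewrite rules (constant folding plus the identities x + 0, x ^ 0, x * 1) are hypothetical, and all arithmetic is 32-bit.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Hypothetical expression node: a constant, a named variable, or a binary op.
struct Expr;
using ExprPtr = std::shared_ptr<const Expr>;

struct Expr {
    char op;              // 'c' constant, 'v' variable, or '+', '^', '*'
    std::uint32_t value;  // used when op == 'c'
    std::string name;     // used when op == 'v'
    ExprPtr lhs, rhs;     // used for binary operators
};

ExprPtr constant(std::uint32_t v) {
    return std::make_shared<Expr>(Expr{'c', v, "", nullptr, nullptr});
}
ExprPtr variable(std::string n) {
    return std::make_shared<Expr>(Expr{'v', 0, std::move(n), nullptr, nullptr});
}
ExprPtr binary(char op, ExprPtr l, ExprPtr r) {
    return std::make_shared<Expr>(Expr{op, 0, "", std::move(l), std::move(r)});
}

// Bottom-up simplification with a deliberately tiny rule set.
ExprPtr simplify(const ExprPtr& e) {
    if (e->op == 'c' || e->op == 'v') return e;
    ExprPtr l = simplify(e->lhs), r = simplify(e->rhs);
    if (l->op == 'c' && r->op == 'c') {  // constant folding, wraps mod 2^32
        switch (e->op) {
        case '+': return constant(l->value + r->value);
        case '^': return constant(l->value ^ r->value);
        case '*': return constant(l->value * r->value);
        }
    }
    if (r->op == 'c') {                  // identity elements on the right
        if ((e->op == '+' || e->op == '^') && r->value == 0) return l;
        if (e->op == '*' && r->value == 1) return l;
    }
    return binary(e->op, l, r);
}

// Translate an expression back to text (standing in for "back to IR").
std::string show(const ExprPtr& e) {
    if (e->op == 'c') return std::to_string(e->value);
    if (e->op == 'v') return e->name;
    return "(" + show(e->lhs) + " " + e->op + " " + show(e->rhs) + ")";
}
```

For example, `((eax.0 ^ (3 + 4294967293)) * 1)` reduces to `eax.0` with these rules. A real system needs far more rules, for instance to recognise that multiplying by 0xffffffff is a negation.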
Symbolic equation system
Unfortunately, we couldn’t find a suitable
third-party symbolic equation system engine,
so we decided to write our own.
We called it Project Eq.
Eq design
eax.1 = ( ( eax.0 * 0xffffffff ) + 0xffffffff ) ^ 0xffffffff
eax.0 (v)
eax.1 = eax.0
Profit! :)
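The intermediate graphs from these slides are not preserved, but the final result can be checked by hand in 32-bit arithmetic: multiplying by 0xffffffff negates the value, adding 0xffffffff to a negated value gives its bitwise complement, and XOR with 0xffffffff complements it back. The sketch below encodes that chain; the step order is an assumption and may differ from the order Eq uses.

```cpp
#include <cstdint>

// The right-hand side from the slide, evaluated in 32-bit arithmetic.
constexpr std::uint32_t obfuscated(std::uint32_t x) {
    return ((x * 0xffffffffu) + 0xffffffffu) ^ 0xffffffffu;
}

// x * 0xffffffff == 0 - x
constexpr std::uint32_t step1(std::uint32_t x) {
    return ((0u - x) + 0xffffffffu) ^ 0xffffffffu;
}

// (0 - x) + 0xffffffff == ~x
constexpr std::uint32_t step2(std::uint32_t x) {
    return (~x) ^ 0xffffffffu;
}

// ~x ^ 0xffffffff == x
constexpr std::uint32_t step3(std::uint32_t x) {
    return x;
}

static_assert(obfuscated(0u) == 0u, "identity at zero");
static_assert(obfuscated(0xdeadbeefu) == 0xdeadbeefu, "identity at a sample value");
static_assert(step1(0x12345678u) == step2(0x12345678u), "step 1 equals step 2");
static_assert(step2(0x12345678u) == step3(0x12345678u), "step 2 equals step 3");
```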
Eq in work
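#include <windows.h>  // UINT32, WORD, BYTE

// Overlapping views of a 32-bit register: ebx, bx and the two low bytes.
union rebx_type
{
    UINT32 rebx;
    WORD rbx;
    BYTE rblow[2];
};

void vmp_constant_playing(rebx_type &rebx)
{
    BYTE var0;
    union var1_type
    {
        UINT32 var;
        WORD var_med;
        BYTE var_low;
    } var1;

    var0 = rebx.rblow[0];       // save the low byte
    rebx.rblow[0] = 0xe7;       // junk constant
    var1.var_med = rebx.rbx;    // copy bx into a temporary
    var1.var_low = 0x18;        // junk constant in the temporary's low byte
    rebx.rbx = var1.var_med;    // write the temporary back to bx
    rebx.rblow[0] = var0;       // restore the saved low byte
}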
A C++ sample of obfuscated code.
It was borrowed :) from VMProtect.
Profit! :)
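The slides showing Eq's output for this sample are not preserved. Tracing the code by hand suggests that every constant write is overwritten before it can be observed, so the function leaves the register unchanged. The sketch below models the same steps with masks instead of unions; the function names are hypothetical, and it assumes the little-endian layout the original relies on (rblow[0] is bits 0..7).

```cpp
#include <cstdint>

// Byte-level model of vmp_constant_playing without unions.
std::uint32_t vmp_constant_playing_model(std::uint32_t rebx) {
    auto var0 = static_cast<std::uint8_t>(rebx & 0xffu);        // var0 = rebx.rblow[0]
    rebx = (rebx & 0xffffff00u) | 0xe7u;                        // rebx.rblow[0] = 0xe7
    auto var_med = static_cast<std::uint16_t>(rebx & 0xffffu);  // var1.var_med = rebx.rbx
    var_med = static_cast<std::uint16_t>((var_med & 0xff00u) | 0x18u);  // var1.var_low = 0x18
    rebx = (rebx & 0xffff0000u) | var_med;                      // rebx.rbx = var1.var_med
    rebx = (rebx & 0xffffff00u) | var0;                         // rebx.rblow[0] = var0
    return rebx;
}

// What remains once the dead constant writes are removed.
std::uint32_t vmp_constant_playing_simplified(std::uint32_t rebx) {
    return rebx;
}
```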
Eq in work
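#include <windows.h>  // UINT32

void rustock_sample(UINT32 &rebp, UINT32 &redi, UINT32 &resi)
{
    UINT32 var0, var1, var2;

    var0 = rebp;          // save the original ebp
    rebp = redi | rebp;   // ebp = edi | ebp
    var1 = redi & var0;
    resi = ~var1;         // esi = ~(edi & ebp)
    var2 = rebp & resi;   // (edi | ebp) & ~(edi & ebp) == edi ^ ebp
    redi = var0 ^ var2;   // ebp ^ (edi ^ ebp) == edi
}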
A C++ sample of obfuscated code.
It was borrowed :) from Rustock.
Profit! :)
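Eq's output for this sample is likewise missing from the transcript. Simplifying by hand gives a plausible target: (a | b) & ~(a & b) is a ^ b, and b ^ (a ^ b) is a, so redi keeps its value and only rebp and resi change. The `Regs` struct and both function names below are assumptions made so the two versions can be compared directly.

```cpp
#include <cstdint>

struct Regs { std::uint32_t rebp, redi, resi; };

// The sample from the slide, with the references replaced by a small struct.
Regs rustock_sample(Regs r) {
    std::uint32_t var0 = r.rebp;
    r.rebp = r.redi | r.rebp;
    std::uint32_t var1 = r.redi & var0;
    r.resi = ~var1;
    std::uint32_t var2 = r.rebp & r.resi;
    r.redi = var0 ^ var2;
    return r;
}

// Hand-simplified form: redi is left untouched.
Regs rustock_simplified(Regs r) {
    std::uint32_t old_rebp = r.rebp;
    r.rebp = r.redi | old_rebp;
    r.resi = ~(r.redi & old_rebp);
    return r;
}
```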
Deobfuscation with Eq
After code virtualization
• ASProtect
• CodeVirtualizer/Themida/WinLicense
– old CISC/RISC
– new Fish/Tiger
• ExeCryptor
• NoobyProtect/SafeEngine
• Tages
• VMProtect
• Some others…
All were deobfuscated successfully :)
Some numbers
Instructions initially:             ~100
Instructions after obfuscation:     ~300 000
Instructions after deobfuscation:   ~200
Code generation time:               ~4 min
Code deobfuscation time:            ~2 min
Memory:                             ~300 MB
Obfuscation with Eq
Optimization is useful for more than
deobfuscation.
What if we stopped the optimization
process at a random step?
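The slides that illustrate this idea are not preserved. One way to read it, sketched here as an assumption, is that every intermediate form a simplifier passes through is equivalent to the final result, so emitting a randomly chosen intermediate form yields code that computes the same value but looks different. The `Step` struct, the hard-coded trace and `pick_obfuscated_form` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <vector>

// One intermediate form: its text and a function that evaluates it.
struct Step {
    std::string text;
    std::uint32_t (*eval)(std::uint32_t);
};

// Hypothetical trace of a simplifier reducing one expression to "x".
const std::vector<Step>& trace() {
    static const std::vector<Step> steps = {
        {"((x * 0xffffffff) + 0xffffffff) ^ 0xffffffff",
         [](std::uint32_t x) -> std::uint32_t {
             return ((x * 0xffffffffu) + 0xffffffffu) ^ 0xffffffffu;
         }},
        {"((0 - x) + 0xffffffff) ^ 0xffffffff",
         [](std::uint32_t x) -> std::uint32_t {
             return ((0u - x) + 0xffffffffu) ^ 0xffffffffu;
         }},
        {"(~x) ^ 0xffffffff",
         [](std::uint32_t x) -> std::uint32_t { return (~x) ^ 0xffffffffu; }},
        {"x",
         [](std::uint32_t x) -> std::uint32_t { return x; }},
    };
    return steps;
}

// Stop at a random step short of the fully simplified form.
const Step& pick_obfuscated_form(std::mt19937& rng) {
    const std::vector<Step>& steps = trace();
    std::uniform_int_distribution<std::size_t> dist(0, steps.size() - 2);
    return steps[dist(rng)];
}
```

Because the choice is random per expression, two builds of the same code need not share a recognisable pattern.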
• Easy to implement
• Hard to deobfuscate using classical
compiler theory optimization algorithms
• Hard to deobfuscate using reverse
recursive substitution
• No templates and signatures in the
obfuscated code
But this tricky obfuscation is still weak.
These expressions can be deobfuscated with Eq
or with another symbolic equation system.
So we have to go deeper!
Profit! :)
Perspectives
• Obfuscation is getting stronger
– Complex mathematical expressions are
used more frequently
– It is merging with cryptography
• Obfuscation is migrating to the dark side
– Protectors are dying
– The malware market is growing
• Obfuscation is becoming undetectable
– Mimicry methods are improving
– Obfuscators try to avoid the recursive
substitution method
– Obfuscators use well-known high-level
platforms
• LLVM is becoming a generic platform for
creating obfuscators
Questions?

Deobfuscation and beyond (ZeroNights, 2014)
