Messing with
binary formats
London, England
Ange Albertini
2013/09/13
ΠΟΛΎΓΛΩΣΣΟΣ
Welcome!
● this is the non-live version of my slides
○ more text
○ standard PDF file ;)
About me:
● Reverse engineer
● my website: http://corkami.com
○ reverse engineering &
visual documentations
to extract the live deck. 61 slides:
pdftk 44con-albertini.pdf cat 1 3 5 7 9 10 12 14 16 18 23 25 29 31 33 35 37 39 41 43 45 47 49 51 53 55-57 59 63 65 67 69 71 73 75 77 79 81 83 85 87 89 90 94 96 98 101-102 104 106-107 109-112 114-117 119 output 44con-albertini(live).pdf
119 2 4 6 8 11 13 15 17 19+4 24 26+3 32 34 36 38 40 42 44 46 48 50 52 54 58 60+3 64 66 68 70 72 74 76 78 80 82 84 86 88 91+3 95 97 99+2 103 105 108 113 118
low-level ones,
that is
I just like to play
with lego blocks
generate files byte per byte
Goals
● explore the format
● make sure that's how things
work
● full control over the structure
result:
● a complete executable
● all bytes defined by hand
our problem
● is related to viruses (malware)
● they use many file formats
● it's critical to identify them reliably
○ and to tell whether corrupted or well-formed
standard infection chain
the most common chain:
1. a web page, in HTML format
a. launching an applet
2. an evil applet, in CLASS format
a. exploiting a Java vulnerability
b. dropping an executable
3. a malicious executable, in Portable
Executable format
(the vast majority of malware
relies on an executable)
another classic chain
● open a PDF document
○ with an exploit inside
■ dropping or downloading a PE executable
● get a malicious executable on your machine
the challenge
it might look obvious:
● tell whether it's a PDF, a PE, a JAVA, an
HTML...
● typical formats are clearly defined
○ Magic signature enforced at offset 0
reality
some formats have no header at all
● Command File (DOS 16 bits)
● Master Boot record
some formats don't need to start at offset 0
● Archives (Zips, Rars...)
● HTML
○ but text-only?
some formats accept a large controllable block
early in their header
● Portable Executable
● PICT image
How did this start?
a real-life problem:
1. a (malicious) HTML page
2. started with 'MZ' (the signature of PE)
3. just scanned as a PE!
a. wow, this PE is highly corrupted :)
b. it must be clean :p
?
MZ
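A minimal sketch of how such a misdetection can happen, assuming a scanner that picks a format from the first bytes only. The signature table and the function names are hypothetical and not taken from any real engine:

```python
# Hypothetical signature table: format name -> magic bytes expected at offset 0.
MAGICS_AT_ZERO = {
    "PE": b"MZ",
    "PDF": b"%PDF-",
    "ZIP": b"PK\x03\x04",
}


def naive_identify(data: bytes) -> str:
    """Return the first format whose magic matches at offset 0."""
    for name, magic in MAGICS_AT_ZERO.items():
        if data.startswith(magic):
            return name
    return "unknown"


def looks_like_html(data: bytes) -> bool:
    """Browsers tolerate junk before the markup, so search the whole buffer."""
    return b"<html" in data.lower()


# An HTML page that happens to start with the PE signature.
page = b"MZ<html><body><script>/* payload */</script></body></html>"
print(naive_identify(page))   # PE
print(looks_like_html(page))  # True
```

The page is dispatched to the PE scanner, which finds nothing usable, while a browser would still render it.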
polyglots in the wild
GIFAR = GIF + JAR
● an uploaded image
○ an avatar in a forum
● with a malicious JAVA appended as JAR
hosted on the server!
● bypasses the same-origin policy
● now useable via its JAVA=EVIL payload
+ =
let's get started
PE, the executable format of windows
● it's central to windows malware
● it enforces a magic signature at offset 0
○ game over for other formats?
● starts with a compulsory header
● made of sub-headers
overview
a historical sandwich
1. a deprecated but required header
2. a modern header
old header content
● almost completely ignored
● only required:
○ 2 byte signature
○ pointer to new header
the new header can be
anywhere
ex: at the end of the file!
such as Corkami Standard Test
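A minimal sketch of the two required fields, assuming the usual layout: the 'MZ' signature at offset 0 and a 32-bit little-endian pointer (e_lfanew) at offset 0x3C. The toy buffer below is not a loadable executable; it only shows that the bytes in between are free for other formats:

```python
import struct


def find_pe_header(data: bytes):
    """Return the offset of the 'PE\\0\\0' header, or None if it cannot be found."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # e_lfanew: pointer to the new header, stored at offset 0x3C of the old one
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        return None
    return e_lfanew


# Everything between the signature and the pointer is ignored by this check.
filler = b"%PDF-1.1\n<!-- anything goes here -->"
old_header = (b"MZ" + filler).ljust(0x3C, b"\x00") + struct.pack("<I", 0x200)
blob = old_header.ljust(0x200, b"\x00") + b"PE\x00\x00"
print(hex(find_pe_header(blob)))  # 0x200
```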
let's look at HTML format
it enforces NOTHING!
anything before the <html> tag!
even 28 Mb of binary!
and it's been the same
since Mozilla 1.0 in 2002
thanks to Nicolas Grégoire!
now, the PDF format
signature position?
● officially at offset 0
● officially tolerated until offset 1024
● wtf?
○ it actually gets worse later
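A minimal sketch of a tolerant signature check, assuming the parser simply searches the first 1024 bytes for '%PDF-'. The function name and the window parameter are illustrative; real viewers differ in the details:

```python
def pdf_signature_offset(data: bytes, window: int = 1024):
    """Return the offset of '%PDF-' inside the first `window` bytes, else None."""
    pos = data[:window].find(b"%PDF-")
    return pos if pos != -1 else None


# 502 bytes of foreign data still leave the signature inside the window.
prefix = b"MZ" + b"\x90" * 500
print(pdf_signature_offset(prefix + b"%PDF-1.1\n"))           # 502
print(pdf_signature_offset(b"\x00" * 2000 + b"%PDF-1.1\n"))   # None
```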
PDF trick 1
put a small executable within 1024 bytes
(just concatenate)
trick 2
1. start a fake PDF + object in a PE header
2. finish fake object at the end of the PE
3. end fake object
4. put PDF real structure
works with real-life example!
(PE data might contain PDF keywords)
JAR = ZIP + Class
just enforced at the very end of the file
but CRCs are just ignored
it was too easy :p
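A minimal sketch of why a ZIP can sit at the bottom of another file, using Python's standard zipfile module as the reader (an assumption for illustration; the slides are about JAR loading by Java). The archive is located from its end-of-central-directory record, so bytes prepended to the file do not prevent it from being opened:

```python
import io
import zipfile

# Build a small ZIP archive in memory.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("readme.txt", "hello")

# Prepend unrelated bytes, as a polyglot would.
polyglot = b"MZ" + b"\x00" * 256 + archive.getvalue()

# The end-of-central-directory record is still the last thing in the file.
# Without an archive comment, that record is 22 bytes long.
eocd = polyglot.rfind(b"PK\x05\x06")
print(eocd == len(polyglot) - 22)  # True

with zipfile.ZipFile(io.BytesIO(polyglot)) as zf:
    names = zf.namelist()
print(names)  # ['readme.txt']
```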
Summary
Structure
1. start
○ PE Signature
■ %PDF + fake obj start
■ HTML comment start
2. next
○ PE (next)
○ HTML
○ PDF (next)
3. bottom
○ ZIP
it’s time for a real example!
an inception demo!
wait, what?
we’re already in the demo!
the live version file is simultaneously:
● the PDF slides themselves
● a PDF viewer executable
○ ie, the file is loading itself
● the PoCs in a ZIP
● an HTML readme
○ with JavaScript mario
so, it works
but it lacks something
● not artistic enough
● not advanced enough
let's build a 'well representative' (=nasty) PoC
the PE specs
● Official MS specs = big joke
○ 'the gentle guide for beginners'
○ barely describes standard PEs
stripped down PE
many elements removed
● including no sections
imports
(imports = communication between executables and libraries)
imports are made of 3 lists
evil imports
● let's make these lists into each other
● with extra tricks to make parsers fail!
ultimate import fail
● failing all tools
○ including IDA & Hiew
● now fixed :)
let's put some code
● some undocumented opcodes!
● big blank spaces in Intel official docs
let's check AMD's
● miracle!
result in WinDbg
● '???' == clueless (tool/user)
don't rely (only) on official docs
messing with PDF
there is a so-called standard
and the reality of existing parsers
looking at: Adobe, MuPDF, Chrome
● 3 different files
○ working each on a specific viewer
○ failing on the other 2
let's look inside
● MuPDF
○ no %PDF sig required
■ a PDF without a PDF sig ? WTF ?!?!
○ no trailer keyword required either
● Chrome
○ integer overflows: -4294967275 = 21
○ trailer in a comment
■ it can actually be almost ANYWHERE
■ even inside another object
● Adobe
○ looks almost sane compared to the other 2
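The integer overflow above is plain 32-bit wraparound. A minimal sketch, assuming the value ends up in a 32-bit field (the helper name is hypothetical):

```python
def wrap_u32(value: int) -> int:
    """Reduce a Python integer modulo 2**32, as an unsigned 32-bit field would."""
    return value & 0xFFFFFFFF


# -4294967275 + 2**32 == 21, so both numbers name the same object or offset.
print(wrap_u32(-4294967275))  # 21
```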
Chrome insanity++
(thx to Jonas Magazinius)
● a single object
● no 'trailer'
● inline stream
● brackets are not even closed
● * are required - it just checks for minimum
space
%PDF*****
1 0 obj
<<
/Size 2
/W[[]1/]
/Root 1 0 R
/Pages<<
/Kids[<<
/Contents<<>>
stream
BT{99
Tf{Td(Inlined PDF)'
endstream
>>]
>>
>>
stream
*
endstream
startxref%*******
PDF.JS
● very strict
○ 'too' strict / naive ?
○ I don't want to be their QA ;)
● requires a lot of information usually ignored
○ xref
○ /Length
%PDF-1.1
1 0 obj
<<
% /Type /Catalog
...
>>
endobj
2 0 obj
<<
/Type /Pages
...
>>
endobj
3 0 obj
<<
/Type /Page
/Resources <<
/Font <<
/F1 <<
/Type /Font
/Subtype /Type1
...
>>
>>
>>
>>
endobj
4 0 obj
<< /Length 47>>
stream
...
xref
0 1
0000000000 65535 f
0000000010 00000 n
...
let's play further
combine 3 documents in a single file
● it's actually 3 sets of 'independent' objects
● objects are parsed
○ but not used
alternate reality demo
the live slide-deck contains 2 PDFs
● bogus one under Chrome
● real one under MuPDF (Sumatra, Linux...)
● rejected under Acrobat
○ because of the PE signature (see later)
DEMO
final PoC
● combine most previously mentioned tricks
● many fails on many tools
● total control of the structure
○ the PDF 'ends' in the Java class
Adobe rejects 'weird
magics' after 10.1.5
not in their own specs :p
10.1.4 10.1.5
also in ELF/Linux flavor
● starring a signature-less PDF
○ which won't run on other viewers
and Apple too
PS: I don't have a Mac, this was built blindly
Thanks to Nicolas Seriot for testing
why should we care?
like washing powders
security tools are selected:
● speed
● {files} → {[clean/detected]}
file types not taken into consideration
type confusion
make the tool believe it's another type, which
will fool the engine
engine with checksum caching will be fooled:
1. scanned as HTML, clean
2. reused as PE but malicious
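A minimal sketch of the caching problem, assuming an engine that stores verdicts keyed on the file checksum alone. The scan function and the CachingScanner class are hypothetical stand-ins, not a real product's design:

```python
import hashlib


def scan(data: bytes, as_type: str) -> str:
    """Stand-in for a real engine: only the PE path notices the payload."""
    if as_type == "PE" and b"EVIL" in data:
        return "detected"
    return "clean"


class CachingScanner:
    def __init__(self, key_on_type: bool):
        self.key_on_type = key_on_type
        self.cache = {}

    def verdict(self, data: bytes, as_type: str) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Keying on the checksum alone reuses a verdict across file types.
        key = (digest, as_type) if self.key_on_type else digest
        if key not in self.cache:
            self.cache[key] = scan(data, as_type)
        return self.cache[key]


polyglot = b"MZ<html>EVIL</html>"
weak = CachingScanner(key_on_type=False)
print(weak.verdict(polyglot, "HTML"), weak.verdict(polyglot, "PE"))      # clean clean
strict = CachingScanner(key_on_type=True)
print(strict.verdict(polyglot, "HTML"), strict.verdict(polyglot, "PE"))  # clean detected
```

Including the detected type in the cache key is one possible mitigation; it costs a second scan per type.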
engine exhaustion
rankings in magazines are based on scanning
time
→ scanning per file must stop arbitrarily
→ waste scanning cycle by adding extra
formats
Weaknesses
● evasion
○ filters → exfiltration
○ same origin policy
○ detection
■ ex: clean PE but malicious PDF/HTML/...
■ exhaust checks
■ pretend to be corrupt
● DoS
Conclusion
● type confusion is bad
○ succinct docs too
○ lazy software as well
● go beyond the specs
○ Adobe: good
● suggestions
○ more extension checks
○ isolate downloaded files
○ enforce magic signature at offset 0
Questions ?
thank YOU !
http://
reverseengineering.stackexchange.com
@angealbertini
✉ ange@corkami.com
Bonus
Valid image as JavaScript
Highlighted by Saumil Shah
● abusing header and parser laxness
● turn a field into /*
● close comment after the picture data
