DTrace + OS X = Fun
Andrzej Dyjak (@dyjakan)
Confidence 2015, Kraków
www.census-labs.com
> AGENDA
• Part 1: Introduction
I. What is DTrace?
II. D language
III. Past work
IV. Similar projects
• Part 2: Usage
I. One-liners
II. Scripts
III. Future work
IV. References
> PART 1: INTRODUCTION
> What is DTrace?
„DTrace is a comprehensive dynamic tracing facility
(...) that can be used by administrators and
developers on live production systems to examine
the behavior of both user programs and of the
operating system itself. DTrace enables you to
explore your system to understand how it works,
track down performance problems across many
layers of software, or locate the cause of aberrant
behavior.”
To put it simply: a deliberately limited debugger / DBI (dynamic
binary instrumentation) engine for user and kernel modes.
# cat example.d
PROVIDER:MODULE:FUNCTION:NAME
/PREDICATE/
{
actions;
}
# dtrace -s example.d
# dtrace -n 'PROVIDER:MODULE:FUNCTION:NAME /PREDICATE/ {action;}'
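As a concrete instance of the template above, here is a minimal sketch of a complete script. It assumes the stock OS X location of the binary (/usr/sbin/dtrace) and uses the open syscall purely as an illustration; none of this comes from the original slides.

```d
#!/usr/sbin/dtrace -s
#pragma D option quiet

/* dtrace provider: BEGIN fires once when tracing starts. */
dtrace:::BEGIN
{
    printf("Tracing open() calls, press Ctrl-C to stop.\n");
}

/* provider = syscall, module = (any), function = open, name = entry */
syscall::open:entry
/execname != "dtrace"/
{
    /* arg0 is a user-space pointer to the path, so copy it in first. */
    printf("%s opened %s\n", execname, copyinstr(arg0));
}

/* END fires once when tracing stops. */
dtrace:::END
{
    printf("Done.\n");
}
```

Saved as example.d (a hypothetical name), it would be run with `sudo dtrace -s example.d`.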
BONUS: USDT (User-Level Statically
Defined Tracing)
„(…) providing debug macros that can be
customized and placed throughout the
code.”
Debugging / analysis capabilities can be
improved even more.
> D language
• Data types
• Variables
• Built-ins
• Operators
• Control statements
• Actions & subroutines
• Default providers
> Data types
• char, short, int, long, long long, float,
double, long double
• Aliases (like int32_t)
• You can dereference pointers and walk
structure chains
• You can cast things
> Variables
Types:
• Scalars
• Strings (differs from C)
• Arrays
• Associative arrays
Scope:
• Globals: foobar = 1337
• Clause-locals: this->foo = 13
• Thread-locals: self->bar = 37
• External variables: `internal_kernel_variable
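The three writable scopes can be seen side by side in a short sketch. The variable names (total, start, elapsed) are made up for illustration, and measuring read() latency is just an assumed use case.

```d
#pragma D option quiet

BEGIN
{
    total = 0;                  /* global: shared by all probes and threads */
}

syscall::read:entry
{
    total++;
    self->start = timestamp;    /* thread-local: survives until the return probe */
}

syscall::read:return
/self->start/
{
    /* clause-local: only meaningful within this clause */
    this->elapsed = timestamp - self->start;
    @latency[execname] = quantize(this->elapsed);
    self->start = 0;            /* zeroing a thread-local frees it */
}

END
{
    printf("read() calls seen: %d\n", total);
}
```

Thread-locals are the usual way to carry a value from an entry probe to the matching return probe, which is the pattern the later scripts rely on.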
> Built-ins
Built-in variables:
• *curpsinfo, *curlwpsinfo, *curthread, caller,
arg0-9 and args[], execname, pid, ppid,
timestamp, uregs[], …
> Operators
• Arithmetic
• Relational (these also apply to strings, e.g. as a
predicate: /execname == "foobar"/)
• Logical (XOR is ^^)
• Bitwise (XOR is ^)
• Assignment
• Increment / Decrement
> Control statements
None. Loops and IFs (apart from predicates
and ?:) are not implemented.
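In practice, an if/else is usually emulated with two clauses on the same probe carrying opposite predicates, while ?: picks a value inside a single clause. A minimal sketch, with the 4096-byte threshold and the aggregation names chosen arbitrarily:

```d
#pragma D option quiet

/* "if (requested size >= 4096)" written as a predicate */
syscall::read:entry
/arg2 >= 4096/
{
    @sizes["large (>= 4096)"] = count();
}

/* the "else" branch: same probe, negated predicate */
syscall::read:entry
/arg2 < 4096/
{
    @sizes["small (< 4096)"] = count();
}

/* the ternary operator selects a value without a second clause */
syscall::write:entry
{
    @writes[arg0 <= 2 ? "stdin/stdout/stderr" : "other fd"] = count();
}
```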
> Actions & subroutines
Generic and safe:
• stack() / ustack()
• tracemem()
• alloca()
• bcopy()
• copyin() / copyinstr() / copyinto()
• msgsize() / strlen()
[ … ]
> Actions & subroutines cont’d
Destructive for specific process:
• stop()
• raise()
• copyout() / copyoutstr()
• system()
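Destructive actions have to be enabled explicitly, either with the -w flag or with the destructive pragma. The sketch below stops the target process when it is about to open a given file; the path is a placeholder, and the exact syscall a given program uses to open files may differ (this is an assumption, not something taken from the slides).

```d
#!/usr/sbin/dtrace -s
#pragma D option quiet
#pragma D option destructive

syscall::open:entry
/pid == $target && copyinstr(arg0) == "/etc/hosts"/
{
    printf("pid %d is about to open /etc/hosts, stopping it\n", pid);
    stop();     /* the process is stopped when it next leaves the kernel */
}
```

A process stopped this way can then be inspected with a debugger or resumed with `kill -CONT <pid>`.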
> Actions & subroutines cont’d
Destructive for the system:
• breakpoint()
• panic()
• chill()
> Default providers
Most interesting:
• syscall
• pid
• objc
• fbt
• proc
[ … ]
> Past work (in the context of
security)
• BlackHat 2008 (and some others)
– „RE:Trace - Applied Reverse Engineering on
OS X” by Tiller Beauchamp and David Weston
• Infiltrate 2013
– „Destructive D-Trace” by nemo
> Similar projects (among others)
• SystemTap (Red Hat)
– Very similar to DTrace, kinda like a response from
Red Hat for Linux
– For an interesting use case see
http://census-labs.com/news/2014/11/06/systemtap-unbound-overflow/
• Detours (Microsoft)
– „Software package for re-routing Win32 APIs
underneath applications.”
– Similar in functionality, differs in the implementation, e.g.
http://blogs.msdn.com/b/oldnewthing/archive/2011/09/21/10214405.aspx
> PART 2: USAGE
> One-liners
• Syscalls stats
• Bytes read by process stats
• Process creation logging
> Syscalls stats
$ sudo dtrace -n 'syscall:::entry /pid == 3589/ { @syscalls[probefunc] = count(); }'
dtrace: description 'syscall:::entry' matched 490 probes
^C
bsdthread_create 1
[ ... ]
fstat64 22
fsgetpath 36
proc_info 38
[ ... ]
mmap 352
munmap 357
bsdthread_ctl 542
workq_kernreturn 620
> Bytes read by process stats
$ sudo dtrace -n 'syscall::read:entry { @bytes[execname] = sum(arg2); }'
dtrace: description 'syscall::read:entry ' matched 1 probe
^C
Google Chrome H 26
authd 64
SFLIconTool 504
cfprefsd 858
CoreServicesUIA 1024
iTerm 1056
[ ... ]
mds 589696
fseventsd 76866
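Note that at read:entry, arg2 is the number of bytes requested, not the number actually read. A variant that sums the bytes actually returned could look like the sketch below; it relies on the syscall provider exposing the return value as arg0 in the return probe.

```d
/* Sum the bytes actually returned by read(), per process name. */
syscall::read:return
/arg0 > 0/
{
    @bytes[execname] = sum(arg0);
}
```

The predicate skips failed calls (negative return value) and end-of-file (zero), which would otherwise distort the sums.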
> Process creation logging
$ sudo dtrace -qn 'syscall::posix_spawn:entry { printf("%Y %s\n", walltimestamp, copyinstr(arg1)); }'
2015 May 26 13:39:35 /usr/libexec/xpcproxy
2015 May 26 13:39:35 /Applications/Safari.app/Contents/MacOS/Safari
2015 May 26 13:39:35 /usr/libexec/xpcproxy
2015 May 26 13:39:35 /usr/libexec/xpcproxy
2015 May 26 13:39:35 /System/Library/StagedFrameworks/Safari/WebKit.framework/Versions/A/XPCServices/com.apple.WebKit.Networking.xpc/Contents/MacOS/com.apple.WebKit.Networking
2015 May 26 13:39:35 /System/Library/StagedFrameworks/Safari/WebKit.framework/Versions/A/XPCServices/com.apple.WebKit.WebContent.xpc/Contents/MacOS/com.apple.WebKit.WebContent
2015 May 26 13:39:36 /usr/libexec/xpcproxy
2015 May 26 13:39:36 /usr/libexec/SafariNotificationAgent
> One-liners cont’d
For some more ideas you can quickly check
http://mfukar.github.io/2014/03/19/dtrace.html
or just google for them.
> Scripts
• Tracking input
• Memory allocation snooping
• Hit tracing
> Tracking input
• I’ve covered this on my blog for read()
• However, often times mmap() is used
instead and this led to an interesting
problem
• Also, this can be reimplemented for
network input as well
BEGIN
{
trackedfd[0] = 0;
trackedmmap[0] = 0;
}
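/* Remember the path when the target opens the file we care about. */
pid$target::__open:entry
/copyinstr(arg0) == "/Users/ad/Desktop/test"/
{
    self->fname = copyinstr(arg0);
    self->openok = 1;
}

/* In a return probe, arg1 is the return value, i.e. the new fd. */
pid$target::__open:return
/self->openok/
{
    trackedfd[arg1] = 1;
    printf("Opening %s with fd %#x\n", self->fname, arg1);
    self->fname = 0;
    self->openok = 0;
}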
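/* mmap(addr, len, prot, flags, fd, offset): arg1 is len, arg4 is fd. */
pid$target::__mmap:entry
/trackedfd[arg4] == 1/
{
    self->msz = arg1;
    self->mfd = arg4;
}

/* In the return probe, arg1 is the address of the new mapping. */
pid$target::__mmap:return
/self->msz/
{
    trackedmmap[arg1] = 1;
    printf("Mapping fd %#x to %#p size %#x\n", self->mfd, arg1, self->msz);
    ustack(); printf("\n");
    /* Reset so later mmap() calls on untracked fds do not match. */
    self->msz = 0;
    self->mfd = 0;
}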
/* Dump the first 128 bytes of a tracked mapping just before it goes away. */
pid$target::__munmap:entry
/trackedmmap[arg0] == 1/
{
    printf("Unmapping %#p\n", arg0);
    tracemem(copyin(arg0, arg1), 128);
    self->msz = 0;
    self->mfd = 0;
    trackedmmap[arg0] = 0;
}
pid$target::close:entry
/trackedfd[arg0] == 1/
{
trackedfd[arg0] = 0;
}
> Memory allocation snooping
• Implementation of a simple tool that
imitates output of ltrace for memory
allocation functions from libc
But there are more possible scenarios, e.g.:
• Heap layout analysis
• Snooping into custom memory allocators
• Tracking kernel memory allocations
pid$target::malloc:entry
{
    self->msize = arg0;
}
pid$target::malloc:return
/self->msize/
{
    printf("malloc(%d) = %#p\n", self->msize, arg1);
    self->msize = 0;
}
pid$target::valloc:entry
{
    self->vsize = arg0;
}
pid$target::valloc:return
/self->vsize/
{
    printf("valloc(%d) = %#p\n", self->vsize, arg1);
    self->vsize = 0;
}
pid$target::calloc:entry
{
    self->ccount = arg0;
    self->csize = arg1;
}
pid$target::calloc:return
/self->csize/
{
    printf("calloc(%d, %d) = %#p\n", self->ccount, self->csize, arg1);
    self->ccount = 0;
    self->csize = 0;
}
pid$target::realloc:entry
{
    self->raddr = arg0;
    self->rsize = arg1;
}
pid$target::realloc:return
/self->rsize/
{
    printf("realloc(%#p, %d) = %#p\n", self->raddr, self->rsize, arg1);
    self->rsize = 0;
    self->raddr = 0;
}
pid$target::reallocf:entry
{
    self->rfaddr = arg0;
    self->rfsize = arg1;
}
pid$target::reallocf:return
/self->rfsize/
{
    printf("reallocf(%#p, %d) = %#p\n", self->rfaddr, self->rfsize, arg1);
    self->rfaddr = 0;
    self->rfsize = 0;
}
pid$target::free:entry
{
    printf("free(%#p) = <void>\n", arg0);
}
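The slides do not show the top of the script. Since it is run as an executable with -c and prints only its own printf() output, its header is presumably something along these lines (an assumption, not the author's actual file):

```d
#!/usr/sbin/dtrace -s
/*
 * Interpreter line: lets the script be run directly, e.g.
 *   sudo ./memtrace.d -c /bin/ls
 * With -c (or -p), $target expands to the PID of the traced process,
 * which is what the pid$target probes in the clauses refer to.
 */
#pragma D option quiet    /* suppress the default per-probe output */
```

The file also needs to be executable (chmod +x) for the direct invocation to work.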
[mbp:~/] ad% sudo ./memtrace.d -c /bin/ls
README.md memtrace.d tests
malloc(3312) = 0x7f90ec802000
malloc(4096) = 0x7f90ec801000
realloc(0x7f90ec802000, 91380) = 0x7f90ec802e00
reallocf(0x7f90ec802000, 91380) = 0x7f90ec802e00
free(0x7f90ec801000) = <void>
malloc(231) = 0x7f90ebd00000
malloc(72) = 0x7f90ebd00100
[ ... ]
> Hit tracing
• Kinda like a code coverage but the end-goal
is different
• Two modes of operation:
– Shallow would mark functions within module
– Deep would mark instructions within a function
• Output is pre-processed and lands in IDA for
graph colorization
• Similar to
http://dvlabs.tippingpoint.com/blog/2008/07/17/mindshare-hit-tracing-in-windbg
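The tool itself is not shown in the slides, so the following is only a rough sketch of how the two modes could map onto pid provider probes. The module name libfoo.dylib and the function name parse_input are hypothetical placeholders.

```d
#!/usr/sbin/dtrace -s
#pragma D option quiet

/* Shallow mode: count every function entry inside one module. */
pid$target:libfoo.dylib::entry
{
    @functions[probemod, probefunc] = count();
}

/*
 * Deep mode: leaving the NAME field blank matches entry, return and
 * every instruction offset within the function; probename then holds
 * "entry", "return" or the hexadecimal offset that was hit.
 */
pid$target:libfoo.dylib:parse_input:
{
    @instructions[probefunc, probename] = count();
}
```

The aggregations are printed when tracing ends, and that output could then be post-processed into whatever format the IDA side expects.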
> Future work
• More kernel work
• More USDT work (V8?)
• Python-based DTrace consumer (a.k.a.
Python bindings)
I’m open to ideas, don’t be shy and mail me.
> References
• http://dtrace.org/blogs/
• https://wikis.oracle.com/display/DTrace/Documentation
• http://dtracebook.com
• http://dtracehol.com
• http://phrack.org/issues/63/3.html
• „Dynamic Instrumentation of Production
Systems” Cantrill, Shapiro, Leventhal
• Apple TN2124, DTrace entry
Q & A

Editor's Notes

  • #2: Test.
  • #3: Agenda. Walk through briefly.
  • #4: Go!
  • #5: DTrace was designed and implemented by Bryan Cantrill, Mike Shapiro, and Adam Leventhal. It was released in 2005. It has since been open-sourced and is now supported by Solaris, Mac OS X, FreeBSD, NetBSD, and (partially) the Linux kernel. OS X has included DTrace since 2007 (version 10.5, Leopard) as part of the Instruments testing suite. Stability was a core assumption: that's why there is no overhead when probes are disabled, and also why it's limited in functionality.
  • #6: DTrace mascot.
  • #7: PROVIDER gives us general functionality; MODULE sets the module we're focusing on (e.g. a specific dylib); FUNCTION specifies a function within the module (this poses a limitation: you can't trace binaries that have their symbols stripped; of course that also applies to unusual 'calls' like JMPing into a code chunk instead of calling it, which will be invisible to DTrace); NAME gives us some idea about semantic meaning (e.g. entry/return, BEGIN/END; when tracing a function you can also specify an offset within it, where you can e.g. peek into memory pointed to by some register, or leave NAME blank to trace all the instructions within); PREDICATE acts as a conditional, and ACTIONS are what happens when the probe fires. NOTE: For PROVIDER:MODULE:FUNCTION:NAME you can use wildcards, e.g. *open* will trace any function with the string 'open' in its name. You can use DTrace quickly via one-liners, and for more challenging tasks switch to scripting. D scripts can also be embedded into e.g. bash scripts to gain additional possibilities like argument parsing. Note: dtrace requires root privileges (interaction with kernel mode + destructive actions).
  • #8: Example of a DTrace script and a one-liner. Talk about the dtrace provider and its BEGIN and END probes (they fire when a DTrace script starts and ends).
  • #9: The dtrace command invokes the compiler for the D language, which outputs D Intermediate Format that is sent to the kernel part. As previously mentioned, DTrace is pretty strict about correctness and guarantees safety with no additional overhead when probes are disabled (in fact, a system with disabled probes is identical to a system without DTrace at all). A stand-alone consumer is possible, e.g. Python bindings. DTrace providers are kernel modules that talk to the dtrace kernel module through an API.
  • #10: You can put static probes inside the application yourself, re-compile it, and improve analysis by dynamically activating the USDT probes when required (negligible overhead when disabled). Example of the possibilities: the JavaScript provider uses USDT to instrument the Mozilla JavaScript engine (SpiderMonkey). It provides probes for function calls, object creation, garbage collection, and code execution. Basically you can use this provider to trace the operation of JavaScript code. I did not test it, so I'm not sure if this is still 'the thing', but the sole possibility is enough (e.g. wouldn't a V8 equivalent be awesome? maybe)
  • #11: D is a C-like language as they will see in a second
  • #12: C-like syntax and functionality. Pointer dereferencing & traversing structure chains. Also, character escape sequences are the same as in C (e.g. \n = backslash n for newline)
  • #13: We do not declare the data type of a variable explicitly; associative arrays = keys are tuples; we can declare a variable without initialization; when we zero out a variable it's freed. Talk about global / clause / thread locality (when would we use it? Mention later examples). External variables are a way to access kernel variables in your DTrace script. You need to prepend the variable name with a backtick character to access them. Also worth mentioning that DTrace supports structs and unions along with typedefs. And even bit fields!
  • #14: Talk about each and every one. curpsinfo points to a psinfo_t struct, curlwpsinfo points to lwpsinfo_t, and curthread points to kthread_t; these are internal structures for the current process and thread. Give examples for args usage, e.g. file descriptors from your scripts + Note: args for C++ methods can be tricky to access, because it's up to the C++ compiler to organize arguments and you need to know how they're organized before tracing (i.e. which argument is the this pointer, and are there any compiler bonus args). ppid is the parent PID. Double note that there are some more built-ins, worth checking for yourself.
  • #15: All of these are C-like. Nothing much to say on this slide. Run through them and say e.g. 'usual + - * /' etc.
  • #16: This is due to guaranteed safety. Loops and ifs can too easily lead to a never-ending story (= break the system), which would break the core assumption about safe usage on production systems. However, there are reserved keywords for loops, ifs, gotos, et cetera, but they never saw an actual implementation (or a release, who knows what Sun/Oracle and later Joyent did).
  • #17: stack() / ustack() – self-explanatory, display the kernel and user-mode stack. tracemem() – dumps memory to the screen (peek-a-boo!). alloca() – dynamic memory allocation inside the DTrace script. bcopy() – might be used to copy data into a newly allocated buffer. copyin() / copyinstr() / copyinto() – used to peek into data from user-mode processes (playing with user mode requires these in order to transfer data to the kernel). msgsize() / strlen() – sometimes we want to measure something.
  • #18: Talk for a second about pragmas, like quiet or destructive. Stop() – stopping process at point XYZ Raise() – sending signal, similar to kill cmd Copyout() / copyoutstr() – allows data modification, nemo used it to tamper function call arguments (he did so for x86 where fcalls args are usually all passed via stack; for x64 where first 6 args are passed via registers it might work if the argument is a pointer not a value => take the arg from register and mangle memory pointed by it) System() – execs an application
  • #19: Breakpoint() – puts kernel-mode breakpoint (sucks if you don’t have connected debugger) Panic() – induces kernel panic at specific point Chill() – causing dtrace to spin for N nanoseconds, this is interesting when testing race conditions (you can slow down execution on purpose just to win races more often)
  • #20: There are more providers and some of them might be interesting to _you_, e.g. the tcp/ip/udp providers might be interesting to sysadmins/network operators. syscall – for tracing syscalls. pid – for tracing specific processes. objc – Apple specific, more in 'man dtrace', for tracing Objective-C functionality. fbt – function boundary tracing, you can use it to trace functions in the kernel (a usage example for vulnerability analysis is in the 'Guide to Kernel Exploitation' book). proc – process creation and termination monitoring (nicely used in the recent 'launchd' blog post by wuntee). Also, mention the PHP and Python providers as interesting (e.g. with the Python provider you can watch the internals of the Python script, not the Python VM) and the JS provider in conjunction with the previously mentioned Firefox USDT.
  • #21: Examples of active providers probes list (with counting!). Pid provider is a-ok because by default nothing uses it (that’s why it shows 0). Fbt is huge because you can probe most of the kernel functions.
  • #22: Tiller and David's talk touches on vulnerability research (ease of analysis + HIDS + code coverage). They also introduce the „RE:Trace framework”, a mixture of DTrace and Ruby bindings; however, I was not able to find it on the Internet. Also, I'm not a jeweller so that's a no-no for Ruby bindings. Nemo's talk is mostly about rootkit-like functionality implemented via DTrace (e.g. hiding files). I'm not so sure if this is the best course of action for gaining persistence, but he presents interesting examples, mostly tampering with syscalls. In the general context there are loads of resources about general DTrace usage; I will list the most interesting ones in the reference section.
  • #23: Regarding SystemTap: there even exists compatibility between Dtrace and Systemtap when it comes to USDTs Regarding provided link: Detours is the reason behind „mov edi, edi” hot-patch point inside of Microsoft’s DLLs. So it does introduce slight overhead even when disabled (as opposed to dtrace).
  • #24: Go!
  • #26: We’re snooping into Preview
  • #28: These are logs for Safari execution
  • #29: Googling for ideas is still work-in-progress at Google.
  • #32: Global arrays for flagging FDs and MMAPs
  • #33: Marking opened FD and printing out logging information.
  • #34: Marking mmaps. Hm, but where’s our tracemem at mmap()’s return?
  • #35: What? tracemem() at munmap()? Well, as opposed to read(), we can't peek into memory at mmap()'s return even though the pointer is already valid. I found out that DTrace can't peek into memory that was not previously touched, and it seems that this is the case here. This has some downsides (memory could be altered at this point) but it's better than nothing, and I actually successfully used it when tracking input for some OS X applications.
  • #36: Just to be nice, we’re freeing closed FDs. This concludes input tracking, for working examples go to my blog and read latest post (even though for read() it’s basically the same).
  • #37: Heap layout analysis when you’re performing heap exploitation (e.g. Can you somehow influence the heap layout? How reliably? Often times you can tinker with the heap but you don’t always get 100% reliability, then it’s good to know how many times your object is in the range where you want it to be). Custom memory allocators are also very interesting, mainly because if you would snoop only into system API for memory allocs you wouldn’t get much meaning (=application makes its own pools) but if you study the mechanism of the allocator then you can insert probes at appropriate functions via pid provider and get meaningful information. For kernel memory allocation we would need to utilize FBT provider in order to snoop into BSD wrappers or just go straight into the dragons den and snoop into MACH internals.
  • #38: Note that in the return probe arg1 holds the return value instead of the original argument. For the other allocation functions (valloc, calloc, realloc, reallocf) the probes look similar, hence there is no point in going through all of them; however, I did include them for the sake of completeness.
  • #42: Sidenote: what’s the difference between realloc() and reallocf()? When reallocf() fails it frees source buffer (this call is FreeBSD specific to which OS X is closely related).
  • #43: Freeing is as simple as it gets (not really, but for current version this is how we do things).
  • #44: Output example. However we can pipe this into villoc and get visualizations!
  • #45: Merging with villoc is an ongoing project; we needed to discuss a couple of things (e.g. is memory allocation on OS X thread-safe or not? Apparently it is, since it's a POSIX requirement; and which faults do we want to detect). In any case, a working alpha version is available on my GitHub.
  • #46: This is work in progress (mainly due to IDA side of the tool). It should be soon available on my github. Typical end goal for code coverage is to shrink an input pool for fuzzing operation of the application XYZ, I want to mark what code was touched for very specific input in order to speed up my analysis inside of a tool like IDA or Hopper. Yes, I am aware of IDA’s mac_servers for debugging integration. I’ve had some problems with them.
  • #47: Regarding Python: When and if I finish python-based dtrace consumer I will open source it (if you’re interested you can follow me on twitter or github).
  • #49: Questions and answers.