Ruby Performance Secrets and 
How to Uncover Them 
http://www.slideshare.net/adymo/adymo-rubyconf-performance
Who am I? 
Alexander Dymo 
C/C++ since 2000 
Ruby/Rails since 2006 
Started to optimize back in 2007 
Never stopped since then
Rails Performance: What You Need to Know 
https://www.airpair.com/ruby-on-rails/performance 
Make Your Ruby/Rails App Fast: Performance And Memory Profiling Using ruby-prof and Kcachegrind 
http://www.acunote.com/blog/2008/02/make-your-ruby-rails-applications-fast-performance-and-memory-profiling.html 
Ruby Performance Tuning 
http://theprosegarden.com/contents-of-recent-issues/#10-14
Ruby 
Performance 
The first comprehensive book 
on Ruby Performance 
I'm 50% done. Beta soon. 
ruby-performance-book.com
Big thanks to:
What do we talk about today? 
Performance tips 
Performance best practices 
How to understand what's wrong 
How to find your own performance tips/best practices
In examples
Example 1
What can go wrong with this code?
This was faster
Alexander Dymo - RubyConf 2014 - Ruby Performance Secrets and How to Uncover Them
100-200ms faster 
Sometimes 
…
Smells like...
https://www.flickr.com/photos/timquijano/5720765523/
Let's check what happens:
Let's profile memory allocations 
Need patched ruby 
rvm reinstall 1.9.3 --patch railsexpress 
rvm reinstall 2.0.0 --patch railsexpress 
rvm reinstall 2.1.4 --patch railsexpress
Let's profile memory allocations 
Need profiler 
gem install ruby-prof
Let's profile memory allocations 
Need visualization tool 
Mac: 
brew install qcachegrind 
Linux: 
<your package manager> install kcachegrind 
Windows: 
http://sourceforge.net/projects/qcachegrindwin/
Let's profile memory allocations 
ruby-prof -p call_tree --mode=allocations before.rb > callgrind.out.before 
ruby-prof -p call_tree --mode=allocations after.rb > callgrind.out.after 
kcachegrind callgrind.out.before 
kcachegrind callgrind.out.after
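Before setting up the full ruby-prof toolchain, a rough allocation count is available from the standard library alone. The sketch below is not part of the talk: it assumes a Ruby new enough to expose `GC.stat(:total_allocated_objects)` (2.2 or later), and `count_allocations` plus the two workloads are hypothetical stand-ins for `before.rb` and `after.rb`.

```ruby
# Run the block and return how many objects Ruby allocated while it ran.
# GC.stat(:total_allocated_objects) only ever grows, so the difference
# does not depend on whether a GC cycle happens during the block.
def count_allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Hypothetical workloads standing in for before.rb and after.rb.
words = %w[alpha beta gamma]
join_allocations = count_allocations { 1_000.times { words.join(",") } }
map_allocations  = count_allocations { 1_000.times { words.map(&:upcase) } }

puts "join: #{join_allocations} objects"
puts "map:  #{map_allocations} objects"
```

This only gives a total; it does not say which method allocated what, which is exactly the gap the call-tree profile fills.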
static VALUE enum_inject(int argc, VALUE *argv, VALUE obj) 
{ 
    NODE *memo; 
    VALUE init, op; 
    rb_block_call_func *iter = inject_i; 
    … 
    memo = NEW_MEMO(init, argc, op); 
    rb_block_call(obj, id_each, 0, 0, iter, (VALUE)memo); 
    return memo->u1.value; 
}
> gdb `rbenv which ruby` 
GNU gdb (GDB) SUSE (7.5.1-2.5.1) 
Reading symbols from /home/gremlin/.rbenv/versions/2.1.4/bin/ruby...done. 
(gdb)
(gdb) l enum_inject 
632 * longest #=> "sheep" 
633 * 
634 */ 
635 static VALUE 
636 enum_inject(int argc, VALUE *argv, VALUE obj) 
637 { 
638 NODE *memo; 
639 VALUE init, op; 
640 rb_block_call_func *iter = inject_i; 
641 ID id; 
(gdb)
(gdb) b 638 
Breakpoint 1 at 0x1cbc0a: file enum.c, line 638. 
(gdb)
(gdb) r -e '[1,2,3].inject {}' 
Starting program: /home/gremlin/.rbenv/versions/2.1.4/bin/ruby -e '[1,2,3].inject {}' 
[Thread debugging using libthread_db enabled] 
Using host libthread_db library "/lib64/libthread_db.so.1". 
[New Thread 0x7ffff7ff2700 (LWP 3893)] 
Breakpoint 1, enum_inject (argc=0, argv=<optimized out>, obj=93825001586240) at enum.c:640 
640 rb_block_call_func *iter = inject_i; 
(gdb)
640 rb_block_call_func *iter = inject_i; 
(gdb) n 
665 memo = NEW_MEMO(init, argc, op); 
(gdb) n 
666 rb_block_call(obj, id_each, 0, 0, iter, (VALUE)memo); 
(gdb) s 
rb_block_call (obj=93825001586240, mid=1456, argc=0, argv=0x0, bl_proc=0x555555722460 <inject_i>, data2=93825001586200) at vm_eval.c:1142 
1142 { 
(gdb) s 
1145 arg.obj = obj; 
(gdb) s 
1146 arg.mid = mid; 
(gdb) s 
1147 arg.argc = argc; 
(gdb)
(gdb) s 
1147 arg.argc = argc; 
(gdb) s 
1148 arg.argv = argv; 
(gdb) s 
1149 return rb_iterate(iterate_method, (VALUE)&arg, bl_proc, data2); 
(gdb) s 
rb_iterate (it_proc=it_proc@entry=0x5555556c0790 <iterate_method>, data1=data1@entry=140737488340304, bl_proc=0x555555722460 <inject_i>, data2=93825001586200) at vm_eval.c:1054 
1054 { 
(gdb) s 
1057 NODE *node = NEW_IFUNC(bl_proc, data2); 
(gdb)
static VALUE enum_inject(int argc, VALUE *argv, VALUE obj) 
{ 
    NODE *memo; 
    VALUE init, op; 
    rb_block_call_func *iter = inject_i; 
    … 
    memo = NEW_MEMO(init, argc, op); 
    rb_block_call(obj, id_each, 0, 0, iter, (VALUE)memo); 
    return memo->u1.value; 
}
VALUE rb_block_call(…) 
{ 
    … 
    return rb_iterate(iterate_method, (VALUE)&arg, bl_proc, data2); 
} 
VALUE rb_iterate(…) 
{ 
    int state; 
    volatile VALUE retval = Qnil; 
    NODE *node = NEW_IFUNC(bl_proc, data2); 
    … 
}
2 T_NODEs per inject() call
10000.times { [].inject } 
20000 extra T_NODE objects 
some work for GC
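The per-call overhead traced in the debugger can also be observed from Ruby itself. The following is a sketch rather than anything from the talk: `allocated_by_type` is a hypothetical helper built on `ObjectSpace.count_objects`, and the internal type names it reports depend on the Ruby version (the slides show `T_NODE` on 2.1.4; it is an assumption worth verifying that newer versions report these objects under a different type). No particular result is promised here, only a way to look.

```ruby
# Return a hash of {internal_type => extra_live_slots} created by the block.
# GC is switched off while measuring so that collected objects do not
# shrink the numbers; it is always re-enabled afterwards.
def allocated_by_type
  GC.start
  GC.disable
  before = ObjectSpace.count_objects
  yield
  after = ObjectSpace.count_objects
  delta = {}
  after.each do |type, count|
    next if type == :TOTAL || type == :FREE # heap bookkeeping, not objects
    diff = count - before.fetch(type, 0)
    delta[type] = diff if diff > 0
  end
  delta
ensure
  GC.enable
end

data = [1, 2, 3]

with_inject = allocated_by_type do
  10_000.times { data.inject { |sum, x| sum + x } }
end

with_each = allocated_by_type do
  10_000.times do
    sum = 0
    data.each { |x| sum += x }
  end
end

puts "inject: #{with_inject.inspect}"
puts "each:   #{with_each.inspect}"
```

The counts include a little noise from the measurement itself (the hashes returned by `count_objects`), so only differences on the order of the iteration count are meaningful.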
Ruby 
Performance 
More in my book 
ruby-performance-book.com
Lessons learned: 
1. use profiler to understand why your code is slow 
2. use C debugger to understand Ruby behavior
Example 2
What's the difference? 
str = 'a'*1024*1024*10 
str = str.gsub('a', 'b') 
  replaces 'a' with 'b' 
  creates a new object 
  reuses "str" name 

str = 'a'*1024*1024*10 
str.gsub!('a', 'b') 
  replaces 'a' with 'b' 
  changes the original
Supposedly
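Whatever the memory cost turns out to be, the object-identity half of that description is easy to check without a profiler. A minimal sketch (variable names are illustrative, not from the slides):

```ruby
# gsub returns a new String and leaves the receiver alone.
original = "aaa".dup
replaced = original.gsub("a", "b")
puts replaced.equal?(original) # false: a different object
puts original                  # aaa

# gsub! modifies the receiver and returns that same object.
in_place = "aaa".dup
result = in_place.gsub!("a", "b")
puts result.equal?(in_place)   # true: the same object
puts in_place                  # bbb

# gsub! returns nil when nothing was substituted.
untouched = "aaa".dup
no_match = untouched.gsub!("x", "y")
p no_match                     # nil
```

Identity says nothing about how much temporary memory either call needs, which is the question the memory profile answers.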
Let's profile memory usage 
ruby-prof -p call_tree --mode=memory after.rb > callgrind.out.after 
kcachegrind callgrind.out.after
So, gsub! doesn't save any memory 
except one slot on Ruby heap 
… which is 40 bytes
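To put that one slot in proportion, the standard `objspace` extension can report an approximate size per object. This is a sketch, not the measurement used in the talk: `ObjectSpace.memsize_of` is documented as only a hint, and the exact figure for a short string varies by Ruby version and platform, so no specific byte count is asserted here.

```ruby
require "objspace"

small = "a"                      # short string, stored inside its heap slot
big   = "a" * (10 * 1024 * 1024) # 10 MB of character data in a separate buffer

small_size = ObjectSpace.memsize_of(small)
big_size   = ObjectSpace.memsize_of(big)

puts "small string: #{small_size} bytes"
puts "big string:   #{big_size} bytes"
puts "ratio:        #{big_size / [small_size, 1].max}x"
```

Saving one slot is negligible next to the 10 MB buffer that both variants still have to produce.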
Not all bang! functions are the same 
str = 'a'*1024*1024*10 
str.downcase! 
ruby-prof -p call_tree --mode=memory downcase.rb > callgrind.out.downcase 
kcachegrind callgrind.out.downcase
Lessons learned: 
1. profile memory 
2. challenge all tips/tricks/best practices
Conclusions 
1. Don't guess. Profile. 
2. Guess. Profile. 
3. Profile not only CPU, but Memory. 
4. Look at the source, use GDB if not enlightened. 
5. Challenge all tips/tricks. Understand instead.
Big thanks to:
Ruby 
Performance 
ruby-performance-book.com 
airpair.me/adymo 
@alexander_dymo
