Lessons learnt in 2009
         Pratik Naik



                       Jobless

                       Freelancer/
                       ActionRails

                       Blogger
                       m.onkey.org

                       Rails Core Team
                       member
And some hobby apps


   http://planetrubyonrails.com



     http://tweetmuffler.com
• My first ever presentation.
• Usually don’t like any conferences
• But it’s Brazil!!
So if I screw up...




DON’T Tweet FFS
Overview
•   Using Ruby Enterprise Edition

•   Testing

    •   Faster tests

    •   Factory v/s Fixtures

•   Security

    •   Auto escaping for XSS prevention

    •   More MessageVerifier

•   Asynchronous job processing with DJ

    •   How to use

    •   Fork based batch processing

•   Scaling

    •   Scaling file uploads with mod_porter

    •   Better Pagination
This talk is targeted at
      Ruby Web Developers
      Not everything may apply to things outside the web




Also if this doesn’t interest you, now is the time to go to the
                        next room :-)
Lesson 1



Always use REE
What is REE ?
Ruby + COW GC
 And a bunch of other cool patches
Maintained by




(Photo acquired by the use of force)
                                                       (That was just googling)




               The Phusion Guys
Who uses REE ?



   And many others
  In Production
Why should you use it
   for development ?
Topmost Reason

Super Fast Tests
MRI - Ruby 1.8.6
    $ time rake

    real    1m45.293s
    user    0m54.341s
    sys     0m33.008s
REE - Ruby 1.8.6
    $ time rake

    real    1m30.219s
    user    0m40.290s
    sys     0m25.433s
That’s 15 seconds faster
       Completely Free

          YMMV
Only Catch
 Twitter’s GC Settings
.profile

   export RUBY_HEAP_MIN_SLOTS=500000
   export RUBY_HEAP_SLOTS_INCREMENT=250000
   export RUBY_HEAP_SLOTS_GROWTH_FACTOR=1
   export RUBY_GC_MALLOC_LIMIT=50000000


http://blog.evanweaver.com/articles/2009/04/09/ruby-gc-tuning/
Second Reason
ruby-prof & Rails Performance Tests
What are Rails Performance tests ?
      Integration Tests + ruby-prof
http://guides.rubyonrails.org/performance_testing.html
$ script/generate performance_test Home
       exists test/performance/
       create test/performance/home_test.rb



class HomeTest < ActionController::PerformanceTest
  def test_homepage
   get '/'
  end
end
MRI




$ rake test:profile
HomeTest#test_homepage (29 ms warmup)
        wall_time: 25 ms
          memory: 0.00 KB
         objects: 0


Needs a custom installation with special Patches
REE




$ rake test:profile
HomeTest#test_homepage (48 ms warmup)
        wall_time: 16 ms
          memory: 698.18 KB
         objects: 21752


                 Just “works”
Lesson 2
       Efficient Testing

• Faster Tests
• Factory
• More Integration Tests & Less Unit Tests
Faster Tests




        Tickle
http://github.com/lifo/tickle
$ script/plugin install git://github.com/lifo/tickle.git
Benchmarks
  ( they’re real )

$ time rake

real    1m30.219s
user    0m40.290s
sys     0m25.433s
Benchmarks
   ( they’re real )

$ time rake tickle

real    0m55.691s
user    0m37.532s
sys     0m22.563s
That’s another 35
            seconds
( On top of the 15 seconds initially saved by REE )
To get the Best of Tickle


• Experiment with the number of processes
• Create more test databases


          It’s all in the README
Parallel Specs
http://github.com/jasonm/parallel_specs
Faster Tests




    fast_context
http://github.com/lifo/fast_context
Problem




That’s 5 DB Queries and 5 GET Requests
Solution




That’s 1 DB Query and 1 GET Request
   5x Faster w/ one word change
Catch ?

• Tests no longer atomic
• Developers should not need to care about
  atomicity of the tests
• It’s an optimization
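To make the trade-off concrete, here is a toy model in plain Ruby. It is not Shoulda or fast_context; `ToyContext`, `run_isolated` and `run_merged` are made-up names for illustration. It counts how often an expensive setup block (standing in for the DB query and the GET request) runs when every check gets its own setup, versus when all checks in a context share a single one.

```ruby
# Toy model only: names and structure are assumptions, not the fast_context API.
class ToyContext
  def initialize(&setup)
    @setup = setup
    @checks = {}
  end

  # Register a named check that receives whatever the setup block returned.
  def should(name, &check)
    @checks[name] = check
    self
  end

  # One setup per check: isolated, but the expensive setup runs N times.
  def run_isolated
    setup_runs = 0
    results = @checks.map do |name, check|
      setup_runs += 1
      [name, check.call(@setup.call)]
    end
    { :setup_runs => setup_runs, :results => results.to_h }
  end

  # One setup shared by all checks: faster, but the checks are no longer atomic.
  def run_merged
    state = @setup.call
    results = @checks.map { |name, check| [name, check.call(state)] }
    { :setup_runs => 1, :results => results.to_h }
  end
end
```

With five checks registered, `run_isolated` reports five setup runs and `run_merged` reports one, which is the 5x difference the slides describe.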
Writing more effective tests in less time
Factory v/s Fixtures
Factory                  Fixtures
★ Slow                   ★ Fast
★ Very easy to manage    ★ Hard to manage
★ Describes the data     ★ Doesn’t describe the
   being tested             data
★ Hardly breaks          ★ Very brittle
★ Runs all the callbacks ★ Does not run callbacks
Example : Fixtures



      What can go wrong ?
• Someone could change users(:lifo) to be no
    longer a ‘free’ account holder
•   Someone could add more items to ‘lifo’
•   Someone could remove lifo’s item!
•   ‘create_more_items’ could fail because ‘lifo’
    failed validations
Example : Factory



 What can go wrong ?


   Not much
Factory + Faker
Awesome Development Data
  (Clients and Designers Love it)
Writing Good Factories


• Should be able to loop
  10.times { Factory(:user) }


• No associations in the base Factory
  Factory(:user) and Factory(:user_with_items)


• Should pass the validations
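The three rules can be sketched without any factory library. Everything below is hypothetical (`FactoryUser`, `SketchFactory`, the attribute names); it is a hand-rolled stand-in for the `Factory(:user)` style used on the slides, showing a sequence so that the factory can be called in a loop, a base factory with no associations, and a separate factory that adds items.

```ruby
# Hypothetical model with a trivial validation, standing in for an ActiveRecord class.
FactoryUser = Struct.new(:login, :email, :items) do
  def valid?
    !login.to_s.empty? && email.to_s.include?("@")
  end
end

module SketchFactory
  @sequence = 0

  # A sequence keeps unique attributes unique when the factory is looped.
  def self.next_sequence
    @sequence += 1
  end

  # Base factory: valid attributes, no associations.
  def self.user(overrides = {})
    n = next_sequence
    attrs = { :login => "user#{n}", :email => "user#{n}@example.com", :items => [] }
    attrs = attrs.merge(overrides)
    FactoryUser.new(attrs[:login], attrs[:email], attrs[:items])
  end

  # Separate factory for the "user with items" case.
  def self.user_with_items(count = 2, overrides = {})
    user(overrides).tap do |u|
      count.times { |i| u.items << "item #{i + 1}" }
    end
  end
end

users = 10.times.map { SketchFactory.user }
puts users.map(&:login).uniq.size
```

Keeping associations out of the base factory means a test that only needs a user never pays for creating items it does not look at.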
Integration Tests > Unit Tests
Lesson 3
Improved Security
To know more




http://guides.rubyonrails.org/security.html
rails_xss
http://github.com/NZKoz/rails_xss
 By Michael Koziarski of http://therailsway.com/
rails_xss
 Without rails_xss                           With rails_xss



<%= "<script>alert('foo')</script>" %>   <%= "<script>alert('foo')</script>" %>

                  =>                                       =>

     <script>alert('foo')</script>       &lt;script&gt;alert('foo')&lt;/script&gt;


    Must use h() explicitly                       No need of h()
rails_xss

• Built into Rails 3
• Enabled by the rails_xss plugin in Rails
  2.3.next
• Requires Erubis
rails_xss
Introduces the concept of SafeBuffer




         ( * just the relevant bits )
rails_xss
>> buffer = ActionView::SafeBuffer.new
=> ""

>> buffer << "Hello ! "
=> "Hello ! "

>> buffer << "<script>"
=> "Hello ! &lt;script&gt;"
rails_xss

• Uses Erubis hooks to make <%= %> tags
  always return a SafeBuffer
• Modifies all the relevant Rails helpers and
  marks their output as html_safe!
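The SafeBuffer behaviour from the console session can be approximated in a few lines of plain Ruby. This is a sketch of the idea, not the rails_xss source: `SketchSafeBuffer` is a made-up class, and it treats "is an instance of the buffer class" as the only marker of trusted content, which is a simplifying assumption.

```ruby
require "erb"

# Sketch only: a String subclass that escapes everything appended to it
# unless the appended value is itself a trusted buffer.
class SketchSafeBuffer < String
  def <<(value)
    if value.is_a?(SketchSafeBuffer)
      super(value)                              # already trusted, append as-is
    else
      super(ERB::Util.html_escape(value.to_s))  # untrusted, escape first
    end
  end
end

buffer = SketchSafeBuffer.new
buffer << "Hello ! "
buffer << "<script>"
puts buffer
```

Appending a `SketchSafeBuffer` such as `SketchSafeBuffer.new("<b>ok</b>")` skips the escaping, which is roughly the role `html_safe!` plays for helper output.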
rails_xss
       When you don’t want to escape


<%= "<a href='#{foo_path}'>foo</a>".html_safe! %>
MessageVerifier
http://api.rubyonrails.org/classes/ActiveSupport/MessageVerifier.html
MessageVerifier

Secret + Ruby Data  =>  Signed Message



(Derived from Cookie Session Store)
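The idea of combining a secret with Ruby data to get a signed message can be sketched with the standard library alone. This is an illustration of the concept, not ActiveSupport's implementation: `SketchVerifier` and its error class are made-up names, and the `data--digest` layout with an HMAC-SHA1 hex digest is an assumption modelled on how such tokens commonly look.

```ruby
require "openssl"

# Sketch only: signs Marshal-dumped data with an HMAC so tampering is detectable.
class SketchVerifier
  class InvalidSignature < StandardError; end

  def initialize(secret)
    @secret = secret
  end

  # Returns "<base64 data>--<hex digest>".
  def generate(value)
    data = [Marshal.dump(value)].pack("m0")
    "#{data}--#{digest(data)}"
  end

  # Returns the original value, or raises if the token was altered.
  def verify(token)
    data, signature = token.to_s.split("--", 2)
    raise InvalidSignature if data.nil? || signature.nil?
    raise InvalidSignature unless secure_compare(signature, digest(data))
    Marshal.load(data.unpack1("m0"))
  end

  private

  def digest(data)
    OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA1"), @secret, data)
  end

  # Constant-time comparison so the check does not leak where strings differ.
  def secure_compare(a, b)
    return false unless a.bytesize == b.bytesize
    diff = 0
    a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
    diff.zero?
  end
end
```

The data itself is only encoded, not encrypted, so anyone holding the token can read it. What the secret guarantees is that nobody without it can change the data and still pass `verify`.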
MessageVerifier
>> verifier = ActiveSupport::MessageVerifier.new("my super secret")
=> #<ActiveSupport::MessageVerifier:0x2d559ec @secret="my super secret",
@digest="SHA1">

>> data = [1, 10.days.from_now.utc]
=> [1, Sat Oct 24 05:28:07 UTC 2009]

>> token = verifier.generate(data) # Generate a token that is safe to distribute
=>
"BAhbB2kGSXU6CVRpbWUNBWcbgO7HdXAGOh9AbWFyc2hhbF93aXRoX3V0Y19jb2Vy
Y2lvblQ=--ff41cf5575006a2797cad49e6738361346292bfa"

>> id, expiry_time = verifier.verify(token) # Get the data back
=> [1, Sat Oct 24 05:28:07 UTC 2009]
MessageVerifier

      Example use case
 “Remember Me” Functionality
MessageVerifier
    When you store the ‘remember me’ tokens in the db



• Extra column
• More maintenance
  (expiring tokens after every use or after password reset)
• Doesn’t play well with multiple browsers
MessageVerifier
# user.rb
# Assumes User.remember_me_verifier returns an ActiveSupport::MessageVerifier
# created with its own secret.
def remember_me_token
  User.remember_me_verifier.generate([self.id, self.salt])
end

# Controller - called when the user checks ‘remember me’
def send_remember_cookie!
  cookies[:auth_token] = {
    :value   => @current_user.remember_me_token,
    :expires => 20.years.from_now.utc }
end
MessageVerifier

• Use a different secret for every use of
  MessageVerifier (generate one with rake secret)

• Include the ‘salt’ when generating the token,
  so that the token stops working when the
  password changes
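The reading side of the remember-me cookie might look like the sketch below. It assumes the salt is regenerated whenever the password changes, which is what makes old tokens stop working. `EncodingOnlyVerifier`, `RememberedUser` and `user_from_remember_token` are hypothetical names; the stand-in verifier only mimics the `generate`/`verify` shape and does not sign anything, so a real application would pass in an `ActiveSupport::MessageVerifier` instead.

```ruby
# Stand-in with the same generate/verify shape as ActiveSupport::MessageVerifier.
# It only encodes, it does NOT sign: use the real verifier in an application.
class EncodingOnlyVerifier
  def generate(value)
    [Marshal.dump(value)].pack("m0")
  end

  def verify(token)
    Marshal.load(token.unpack1("m0"))
  end
end

# Hypothetical user record; only the fields the token needs.
RememberedUser = Struct.new(:id, :salt)

# Returns the user for a token, or nil when the salt no longer matches
# (for example after a password change).
def user_from_remember_token(token, verifier, users_by_id)
  id, salt = verifier.verify(token)
  user = users_by_id[id]
  user if user && user.salt == salt
end
```

With the real verifier, a tampered cookie raises `ActiveSupport::MessageVerifier::InvalidSignature`, so the calling code should rescue that and treat the visitor as logged out.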
bcrypt
http://bcrypt-ruby.rubyforge.org
bcrypt
             Bcrypt                          MD5/SHA1

★ Designed for generating          ★ Designed for detecting data
    password hashes                     tampering

★ Meant to be “slow”               ★ Meant to be super “fast”
bcrypt

• bcrypt-ruby gem by Coda Hale works great
• Removes the need for a separate ‘salt’ column by
  storing the salt inside the hashed password
  column
• Allows you to increase the ‘cost factor’ as
  the computers get faster
Lesson 4
Background processing
     with the DJ
      http://github.com/tobi/delayed_job
What is DJ ?
[Diagram: Webservers -> Jobs -> Database -> Jobs -> DJ Workers]

                   Database backed asynchronous priority queue
How to use DJ ?




Minimal Example using Delayed::Job.enqueue
How to use DJ ?




  More practical example
How to use DJ ?




   Using send_later
How to use DJ ?




Using handle_asynchronously
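As a rough illustration of the job contract behind `Delayed::Job.enqueue` (any object that responds to `perform` can be queued), here is an in-memory toy. It is not delayed_job: `ToyJobQueue` and `ToyNewsletterJob` are invented names, there is no database, and the rule that a higher priority number runs first is an assumption of this sketch only.

```ruby
# Toy stand-in for a database backed priority queue; nothing here is the DJ API.
class ToyJobQueue
  Entry = Struct.new(:priority, :sequence, :job)

  def initialize
    @entries = []
    @sequence = 0
  end

  # Accepts any object that responds to #perform.
  def enqueue(job, priority = 0)
    raise ArgumentError, "job must respond to perform" unless job.respond_to?(:perform)
    @sequence += 1
    @entries << Entry.new(priority, @sequence, job)
    job
  end

  # What a worker does: take the most urgent job, run it, repeat.
  # Assumption: higher priority first, then first in, first out.
  def work_off
    done = 0
    until @entries.empty?
      entry = @entries.min_by { |e| [-e.priority, e.sequence] }
      @entries.delete(entry)
      entry.job.perform
      done += 1
    end
    done
  end
end

# A job is just data plus a perform method.
ToyNewsletterJob = Struct.new(:text, :outbox) do
  def perform
    outbox << "sent: #{text}"
  end
end
```

The web request only pays for the `enqueue` call; the slow work happens later in whichever worker process picks the job up.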
Batch Processing w/ DJ
Batch Processing w/ DJ
              Tweetmuffler Requirement




That’s an average of 4-10 external calls per user, every 2 minutes.
Batch Processing w/ DJ
     Initial Implementation




          1 job/user
Batch Processing w/ DJ


      Problems with that ?
Batch Processing w/ DJ

     DID NOT SCALE

   • Too Slow
   • Way too much memory required
   • Too many workers required
Batch Processing w/ DJ

             Solution


     Fork based workers w/ REE
Batch Processing w/ DJ
Batch Processing w/ DJ

  Has scaled great so far

     • 10x faster
     • Uses 40% less memory
     • Just 1 worker needed
Batch Processing w/ DJ
General things to remember when forking w/ Ruby

     • Always reset the database
       connections
     • Always reset any open file handles
     • Make sure the child calls exit! from
       an ensure block
     • Make sure mysql allows sufficient
       number of connections
Batch Processing w/ DJ
REE Specific things to remember when forking


   • Call GC.start before you fork
   • Call GC.copy_on_write_friendly = true as
    early as possible. Possibly from the top of
    the Rakefile and environment.rb
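The forking rules can be put together in a small sketch. It assumes a Unix-like system where `fork` is available, and `process_in_forks` is a made-up helper rather than Tweetmuffler's or DJ's code. Each batch runs in its own child process, the child leaves through `exit!` inside an `ensure` block, and results travel back to the parent over a pipe.

```ruby
# REE-only switch (assumption: absent on other Rubies, hence the guard).
GC.copy_on_write_friendly = true if GC.respond_to?(:copy_on_write_friendly=)

# Runs each batch in a forked child and returns all results in order.
def process_in_forks(items, batch_size, &work)
  children = items.each_slice(batch_size).map do |batch|
    reader, writer = IO.pipe
    reader.binmode
    writer.binmode
    GC.start # collect garbage before forking

    pid = fork do
      status = 1
      begin
        reader.close
        # In a Rails app, re-establish database connections here
        # (for example with ActiveRecord::Base.establish_connection).
        writer.write(Marshal.dump(batch.map(&work)))
        status = 0
      ensure
        writer.close unless writer.closed?
        exit!(status) # skip at_exit hooks inherited from the parent
      end
    end

    writer.close # the parent only reads
    [pid, reader]
  end

  children.flat_map do |pid, reader|
    payload = reader.read
    reader.close
    _, exit_status = Process.wait2(pid)
    raise "child #{pid} failed" unless exit_status.success?
    Marshal.load(payload)
  end
end

puts process_in_forks((1..10).to_a, 3) { |n| n * 2 }.inspect
```

Because the children are short-lived, any memory a batch allocates goes back to the operating system when the child exits, so a single long-running worker stays small.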
Lesson 5
  Scaling
http://modporter.com




Scaling file uploads
Mod Porter
            What’s the problem ?


• Rails processes are resource intensive
• Multipart parsing for large files can get
  slow
• Keeping a Rails process occupied with
  multipart parsing of large files can cause
  serious scaling issues
Mod Porter
       How does mod_porter work ?

• mod_porter is an apache module built on
  top of libapreq
• libapreq does the heavy job of multipart
  parsing in a cheap little apache process
• mod_porter sends those multipart files as
  tmpfile urls to the Rails app
• mod_porter Rails plugin makes the whole
  thing transparent to the application
Mod Porter
                    Apache Config File

<VirtualHost *:8080>
  ServerName actionrails.com
  DocumentRoot /Users/actionrails/application/current/public

  Porter On
  PorterSharedSecret secret
</VirtualHost>


                     Rails Configuration

     class ApplicationController < ActionController::Base
       self.mod_porter_secret = "secret"
     end
will_paginate
   Does not scale
will_paginate
        The common pattern




SELECT * FROM `posts` LIMIT 10,10
will_paginate
            Scaling Problems

• Large OFFSETs are harder to scale
• Problems become clear when you have more rows
  than the memory can hold
• Very hard to cache
• Extra COUNT queries
How to scale Pagination?
Scalable Pagination




       Github
Scalable Pagination




       Twitter
Scalable Pagination
   What’s common with Github and Twitter ?

• Don’t show all the page links
• Don’t show the total count
• AJAX is much easier to scale when it
  comes to pagination
• Pagination query does not use OFFSET,
  just LIMIT.
Scalable Pagination
                         Page 1
page1 = SELECT * FROM `posts` WHERE id > 0 ORDER BY id ASC LIMIT 10
                page2_min_id = page1.last.id

                         Page 2
page2 = SELECT * FROM `posts` WHERE id > page2_min_id ORDER BY id ASC LIMIT 10
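The same "remember the last id, then ask for rows after it" logic can be tried out in plain Ruby. This sketch works on an in-memory array instead of a database; `SketchPost` and `page_after` are made-up names, and the select/sort/first chain plays the role of `WHERE id > ?`, `ORDER BY id ASC` and `LIMIT`.

```ruby
# In-memory stand-in for the posts table.
SketchPost = Struct.new(:id, :title)

# Returns the next page of posts whose id is greater than last_seen_id.
def page_after(posts, last_seen_id, per_page = 10)
  posts.select { |post| post.id > last_seen_id } # WHERE id > last_seen_id
       .sort_by(&:id)                            # ORDER BY id ASC
       .first(per_page)                          # LIMIT per_page
end

posts = (1..25).map { |n| SketchPost.new(n, "Post #{n}") }.shuffle

page1 = page_after(posts, 0)
page2 = page_after(posts, page1.last.id)
puts page2.map(&:id).inspect
```

In a Rails 2.x application the equivalent query would be along the lines of `Post.find(:all, :conditions => ["id > ?", last_id], :order => "id ASC", :limit => 10)`, with the last id passed along in the "next page" link.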
Scalable Pagination
                  Benefits ?

• Using no OFFSET is much faster
• Plays great with caching. No records ever
  get repeated
• A little less user friendly as you cannot
  show all the page numbers
Source
http://www.scribd.com/doc/14683263/Efficient-Pagination-Using-MySQL
                         By the Yahoo folks
That’s all !
Thank you.
Blog http://m.onkey.org

        @lifo
