Python Performance Profiling: The Guts And The Glory
A. Jesse Jiryu Davis
@jessejiryudavis
MongoDB
“PyMongo is slower compared to the JavaScript version”
MongoDB Node.js driver: 88,000 per second
PyMongo: 29,000 per second
“Why Is PyMongo Slower?”
From: steve@mongodb.com
To: jesse@mongodb.com
CC: eliot@mongodb.com

Hi Jesse,

Why is the Node MongoDB driver 3 times faster than PyMongo?


http://dzone.com/articles/mongodb-facts-over-80000
The Python Code
# Obtain a MongoDB collection.
import pymongo

client = pymongo.MongoClient('localhost')
db = client.random
collection = db.randomData
collection.remove()

n_documents = 80000
batch_size = 5000
batch = []

import time
start = time.time()
import random
from datetime import datetime

min_date = datetime(2012, 1, 1)
max_date = datetime(2013, 1, 1)
delta = (max_date - min_date).total_seconds()
What?!
for i in range(n_documents):
    date = datetime.fromtimestamp(
        time.mktime(min_date.timetuple())
        + int(round(random.random() * delta)))

    value = random.random()
    document = {
        'created_on': date,
        'value': value}

    batch.append(document)
    if len(batch) == batch_size:
        collection.insert(batch)
        batch = []

duration = time.time() - start

print 'inserted %d documents per second' % (
    n_documents / duration)
inserted 30,000 documents per second
The Node.js Code
(not shown)
The Question
Why is the Python script
3 times slower than the
equivalent Node script?
Why Profile?
• Optimization is like debugging
• Hypothesis: “The following change will yield a worthwhile improvement.”
• Experiment
• Repeat until fast enough
Profiling is a way to generate hypotheses.
Which Profiler?
• cProfile
• GreenletProfiler
• Yappi
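As a point of reference for the first profiler on this list, here is a minimal sketch of profiling with cProfile from the standard library. It is written for Python 3. The functions `make_document` and `build_batch` are hypothetical stand-ins for the benchmark loop, not code from the talk, and no MongoDB server is involved.

```python
import cProfile
import io
import pstats
import random
from datetime import datetime


def make_document():
    # Hypothetical stand-in for one iteration of the benchmark loop.
    return {'created_on': datetime.now(), 'value': random.random()}


def build_batch(n):
    # Build n documents without inserting them anywhere.
    return [make_document() for _ in range(n)]


# Profile only the code between enable() and disable().
profiler = cProfile.Profile()
profiler.enable()
batch = build_batch(10000)
profiler.disable()

# Sort by cumulative time and print the ten most expensive entries.
output = io.StringIO()
stats = pstats.Stats(profiler, stream=output)
stats.sort_stats('cumulative').print_stats(10)
report = output.getvalue()
print(report)
```

The report lists each function with its call count and cumulative time, which is the raw material for a hypothesis about what to change.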
Yappi
By Sümer Cip
Compared to cProfile, Yappi:
• Is as fast
• Also measures functions
• Can measure CPU time, not just wall-clock time
• Can measure all threads
• Can export to callgrind
import yappi

yappi.set_clock_type('cpu')
yappi.start(builtins=True)

start = time.time()

for i in range(n_documents):
    # ... same code as before ...

duration = time.time() - start
stats = yappi.get_func_stats()
stats.save('callgrind.out', type='callgrind')
KCacheGrind
One third of the time is spent in the date computation:
for index in range(n_documents):
    date = datetime.fromtimestamp(
        time.mktime(min_date.timetuple())
        + int(round(random.random() * delta)))

    value = random.random()
    document = {
        'created_on': date,
        'value': value}

    batch.append(document)
    if len(batch) == batch_size:
        collection.insert(batch)
        batch = []
for index in range(n_documents):
    date = datetime.now()

    value = random.random()
    document = {
        'created_on': date,
        'value': value}

    batch.append(document)
    if len(batch) == batch_size:
        collection.insert(batch)
        batch = []
• Before: 30,000 inserts per second
• After: 50,000 inserts per second
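The "experiment" step can also be run in isolation. The sketch below, written for Python 3, times the two ways of producing a date with the standard `timeit` module. The wrapper names `random_date` and `current_date` are assumptions introduced here; the expressions inside them come from the slides. Absolute timings depend on the machine, so none are quoted.

```python
import random
import time
import timeit
from datetime import datetime

min_date = datetime(2012, 1, 1)
max_date = datetime(2013, 1, 1)
delta = (max_date - min_date).total_seconds()


def random_date():
    # Original approach: a random timestamp between min_date and max_date.
    return datetime.fromtimestamp(
        time.mktime(min_date.timetuple())
        + int(round(random.random() * delta)))


def current_date():
    # Replacement approach: just take the current time.
    return datetime.now()


n = 80000  # same number of iterations as the benchmark
slow = timeit.timeit(random_date, number=n)
fast = timeit.timeit(current_date, number=n)
print('random_date: %.3f s, current_date: %.3f s' % (slow, fast))
```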
Why Profile?
• Generate hypotheses
• Estimate possible improvement
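One way to turn a profile into an estimate is simple arithmetic: if a fraction of the run time could be removed entirely, the rest of the run sets an upper bound on the speedup. The sketch below applies that to the figures from the slides (30,000 inserts per second, one third of the time in the date computation). The helper name `max_speedup` is an assumption introduced for illustration.

```python
def max_speedup(fraction_removed):
    """Upper bound on speedup if that fraction of the run time vanished."""
    return 1.0 / (1.0 - fraction_removed)


before = 30000        # inserts per second, from the benchmark
fraction = 1.0 / 3.0  # share of time the profile attributes to the dates

estimate = before * max_speedup(fraction)
print('roughly %d inserts per second at best' % round(estimate))
```

This gives a bound of about 45,000 inserts per second. Because "one third" is itself a rounded reading of the profile, the estimate is only a rough guide to whether a change is worth trying.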
How Does Profiling Work?
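/* Simplified for the slide: the real Py_tracefunc takes an extra
 * leading PyObject * argument, and PyEval_SetProfile takes two. */
int callback(PyFrameObject *frame,
             int what,
             PyObject *arg);

int start(void)
{
    PyEval_SetProfile(callback);
}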
PyObject *
PyEval_EvalFrameEx(PyFrameObject *frame)
{
    if (tstate->c_profilefunc != NULL) {
        tstate->c_profilefunc(frame,
                              PyTrace_CALL,
                              Py_None);
    }

    /* ... execute bytecode in the frame
     * until return or exception... */

    if (tstate->c_profilefunc != NULL) {
        tstate->c_profilefunc(frame,
                              PyTrace_RETURN,
                              retval);
    }
}
int callback(PyFrameObject *frame,
             int what,
             PyObject *arg)
{
    switch (what) {
    case PyTrace_CALL:
        {
            PyCodeObject *cobj = frame->f_code;
            PyObject *filename = cobj->co_filename;
            PyObject *funcname = cobj->co_name;

            /* ... record the function call ... */
        }
        break;

    /* ... other cases ... */

    }
    return 0;  /* 0 signals success to the interpreter */
}
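The same mechanism is reachable from Python through `sys.setprofile`, which installs a callback that the interpreter invokes on function calls and returns. The sketch below, written for Python 3, mirrors the C callback above by reading `co_filename` and `co_name` from the frame's code object. The functions `helper` and `work` and the `call_counts` table are hypothetical names introduced for the example.

```python
import sys
from collections import Counter

# Maps (filename, function name) to the number of calls seen.
call_counts = Counter()


def callback(frame, event, arg):
    # The 'call' event corresponds to PyTrace_CALL in the C API.
    if event == 'call':
        code = frame.f_code
        call_counts[(code.co_filename, code.co_name)] += 1


def helper():
    return 1


def work():
    total = 0
    for _ in range(3):
        total += helper()
    return total


sys.setprofile(callback)   # start receiving profile events
work()
sys.setprofile(None)       # stop profiling

for (filename, funcname), count in call_counts.items():
    print('%s:%s called %d times' % (filename, funcname, count))
```

A real profiler also records a timestamp on each call and return event so it can attribute elapsed time to each function; this sketch only counts calls.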
A. Jesse Jiryu Davis


@jessejiryudavis


MongoDB
