David Arcos - @DZPM | Efficient Django – #EuroPython 2016
Efficient Django
Abstract
Tips and best practices for avoiding scalability
issues and performance bottlenecks in Django
● 1) Basic concepts: the theory
● 2) Measuring: how to find bottlenecks
● 3) Tips and tricks
● 4) Conclusion (yes, it scales!)
Hi!
● I'm David Arcos
● Python/Django developer since 2008
● Co-organizer at Python Barcelona
● CTO at Lead Ratings
● “We improve your sales conversions, using predictive algorithms to rate the leads”
● Prediction API, “Machine Learning as a Service”
● http://lead-ratings.com
1) Basic concepts
The Pareto Principle
"For many events, roughly 80% of the effects
come from 20% of the causes"
Prioritize and focus
Focus on the few tasks that will have the most impact
Basic scalability
“Potential to be enlarged to handle a growing
amount of work”
● Stateless app servers
– Load balance them, scale horizontally
● Keep the state on the database(s)
– This is the difficult part! Each system is different
Database performance
● Make fewer requests:
– Fewer reads
– Fewer writes
● Make requests faster:
– Indexed fields
– De-normalize
Templates
● Cache them
● Jinja2 is a bit faster than the default engine
– but cache them anyway
● You can do fragment caching (for blocks)
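A minimal sketch of the "cache them" advice, assuming a stock settings.py: wrapping the usual loaders in Django's cached loader keeps compiled templates in memory. The empty DIRS list is a placeholder, and recent Django versions already enable this loader when no loaders are configured, so treat it as illustrative only.

```python
# settings.py fragment (illustrative): cache compiled templates in memory.
TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "DIRS": [],  # placeholder: add your template directories here
        # APP_DIRS must not be set to True when "loaders" is given explicitly.
        "OPTIONS": {
            "loaders": [
                (
                    "django.template.loaders.cached.Loader",
                    [
                        "django.template.loaders.filesystem.Loader",
                        "django.template.loaders.app_directories.Loader",
                    ],
                ),
            ],
        },
    },
]
```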
Cache
● Generic approach: cache at each stack level
● The cache documentation is excellent
● Beware of cache invalidation!
Bottlenecks
● Where is your bottleneck?
● CPU bound or I/O bound?
– CPU? Run heavy calculations in async workers
– Memory? Compress objects before caching
– Database? Read from db replicas
● How to find it?
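The "compress objects before caching" tip could look roughly like the following standard-library sketch. The helper names and the sample payload are invented for illustration; the trade-off is extra CPU time in exchange for less cache memory, so it is worth measuring both.

```python
import pickle
import zlib


def compress_for_cache(obj, level=6):
    """Serialize and compress an object before storing it in a cache."""
    return zlib.compress(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL), level)


def decompress_from_cache(blob):
    """Inverse of compress_for_cache(). Only unpickle data you wrote yourself."""
    return pickle.loads(zlib.decompress(blob))


# Hypothetical payload: many similar rows, which compress well.
payload = [{"id": i, "status": "pending", "score": 0.5} for i in range(1000)]

raw = pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL)
packed = compress_for_cache(payload)

print(f"raw: {len(raw)} bytes, compressed: {len(packed)} bytes")
```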
2) Measuring
Can't improve what you don't measure
● Measure your system to find bottlenecks
● Optimize those bottlenecks
● Verify the improvements
● Rinse and repeat!
Monitoring
● System: load, CPU, memory...
● Database: queries per second, response time, size
● Cache: queries per second, hit rate
● Queue: length
● Custom: metrics for your app
Profiling
● The cProfile module provides profiling of Python programs by collecting data:
– Number of calls, running time, time per call...
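A small sketch of how cProfile might be used around a piece of code. The functions slow_sum() and handler() are stand-ins invented for the example; in a Django project the profiled call would be whatever view or task is suspected of being slow.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive loop so it shows up in the profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def handler():
    """Stand-in for the code path under investigation."""
    return [slow_sum(10_000) for _ in range(50)]


profiler = cProfile.Profile()
profiler.enable()
handler()
profiler.disable()

# Render the top entries, sorted by cumulative time, into a string.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

The report lists ncalls, tottime, percall and cumtime per function, which correspond to the number of calls, running time and time per call mentioned above.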
timeit
● The timeit module is a simple way to measure the execution time of small bits of Python code
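For instance, timeit can compare two ways of building the same list. The snippets being timed are arbitrary examples, and the absolute numbers depend on the machine, so only the relative difference is meaningful.

```python
import timeit

# Shared setup code, executed once per measurement and not included in the timing.
setup = "data = list(range(1000))"

loop_time = timeit.timeit(
    "result = []\nfor x in data:\n    result.append(x * 2)",
    setup=setup,
    number=2000,
)
comprehension_time = timeit.timeit(
    "[x * 2 for x in data]",
    setup=setup,
    number=2000,
)

print(f"explicit loop:      {loop_time:.4f} s for 2000 runs")
print(f"list comprehension: {comprehension_time:.4f} s for 2000 runs")
```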
ipdb
● Like pdb, but with IPython features
– tab completion, syntax highlighting, better tracebacks, better introspection…
● Use ipdb.set_trace() to add a breakpoint and jump in with the debugger
django-debug-toolbar
● Display debug information about the current request/response
● Panels, very modular
django-debug-toolbar-line-profiler
● A toolbar panel for profiling
Django Debug Panel
● Chrome extension
● For AJAX requests and non-HTML responses
3) Tips and tricks
Add db indexes
● Single-column (db_index) or multi-column (index_together)
● Be sure to profile and measure!
– Sometimes it’s not obvious (e.g., in the admin)
– Huge difference, e.g. from 15 s to 3 ms (3.5M rows)
● But: indexes use more space and slow down writes
Do bulk operations
● Will greatly reduce the number of SQL queries:
– Model.objects.bulk_create()
– qs.update(), possibly with F() expressions
– qs.delete()
Get related objects
● Fetch FK fields in the same query:
– qs.select_related()
● Fetch M2M fields with an extra query:
– qs.prefetch_related()
Slow admin?
● Use list_select_related
● Override get_queryset() to add prefetch_related()
● Is ordering using an index? Same for search_fields
● readonly_fields will avoid FK/M2M queries
● Use the raw_id_fields widget (or better: django-salmonella)
● Extend admin/filter.html to show filters as <select>
Cachalot
● Caches your Django ORM queries and automatically invalidates them
Queues and workers
● Do slow stuff later
● Some operations can be queued, and executed asynchronously in workers
● Use Celery
Cached sessions
● Use SESSION_ENGINE to set cached sessions:
– Non-persistent: don’t hit the DB
– Persistent: don’t hit the DB… so often
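The two options map onto Django's built-in session backends roughly as in this settings fragment. Which one to choose, and the "default" cache alias, are assumptions for the sake of the example; the cache-only backend loses sessions whenever the cache is flushed or evicts them.

```python
# settings.py fragment (illustrative)

# Persistent: reads come from the cache, writes also go to the database.
SESSION_ENGINE = "django.contrib.sessions.backends.cached_db"

# Non-persistent alternative: sessions live only in the cache.
NON_PERSISTENT_SESSION_ENGINE = "django.contrib.sessions.backends.cache"  # name is illustrative only

# Which entry of CACHES to use for session data.
SESSION_CACHE_ALIAS = "default"
```

Either backend only helps if CACHES points at a shared cache such as Memcached or Redis; the local-memory cache is per-process and not suitable for sessions in production.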
Persistent connections
● Use CONN_MAX_AGE to set the lifetime of a database connection (persistence)
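In settings, this is a per-database option. The fragment below is a sketch with a hypothetical PostgreSQL database name and an arbitrary 60-second lifetime. The default of 0 closes the connection at the end of each request, and None keeps it open indefinitely.

```python
# settings.py fragment (illustrative)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mydb",  # hypothetical database name
        # Reuse each connection for up to 60 seconds instead of
        # reconnecting on every request (0 = close after each request).
        "CONN_MAX_AGE": 60,
    }
}
```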
UUIDs
● Use UUIDs for primary keys (instead of incremental IDs)
– Practically guaranteed uniqueness, avoids collisions
– UUIDs are well-indexed
● Easier db sharding
Slow tests?
● Keep the test database between runs (skips re-creating it): --keepdb
● Run in parallel: --parallel
● Disable unused middleware, installed apps, password hashers, logging, etc.
● Use mocking whenever possible
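Some of these tips can be collected in a separate test settings module. The fragment below is a sketch: the file name settings_test.py and the project name in the usage line are hypothetical, and a real module would first import everything from the main settings.

```python
# settings_test.py fragment (illustrative). A real module would typically
# start with "from .settings import *" and then override a few values.
import logging

# A fast hasher is enough for tests; never use MD5 for real passwords.
PASSWORD_HASHERS = ["django.contrib.auth.hashers.MD5PasswordHasher"]

# Silence all log records up to and including CRITICAL during test runs.
logging.disable(logging.CRITICAL)
```

With such a module in place, the test run might look like `python manage.py test --settings=myproject.settings_test --keepdb --parallel`.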
4) Conclusions
● Measure first
● Optimize only the bottleneck
● Go for the low-hanging fruit
● Measure again
Good resources
● The official Django documentation
● Book: “High Performance Django”
● Blog: “Instagram Engineering”
● “Latency Numbers Every Programmer Should Know”
Thanks for attending!
- Get the slides at http://slideshare.net/DZPM
- We are looking for engineers and data scientists!
