Django and working with
large database tables
Django Stockholm Meetup Group
March 30, 2017
About me
Ilian Iliev
Platform Engineer at Lifesum
ilian@ilian.io
www.ilian.io
The setup
2.5GHz i7, 16GB Ram, MacBook Pro
Django 1.10
MySQL 5.7.14
PostgreSQL 9.5.4
The Models
from django.db import models

class Tag(models.Model):
    name = models.CharField(max_length=255)

class User(models.Model):
    name = models.CharField(max_length=255)
    date = models.DateTimeField(null=True)

class Message(models.Model):
    sender = models.ForeignKey(User, related_name='sent_messages')
    receiver = models.ForeignKey(User, related_name='recieved_messages', null=True)
    tags = models.ManyToManyField(Tag)
The Change
class Message(models.Model):
    sender = models.ForeignKey(User, related_name='sent_messages')
    receiver = models.ForeignKey(User, related_name='recieved_messages', null=True)
    tags = models.ManyToManyField(Tag, blank=True)
The weird migration
ALTER TABLE `big_tables_message_tags`
    DROP FOREIGN KEY `big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id`;
ALTER TABLE `big_tables_message_tags`
    ADD CONSTRAINT `big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id`
    FOREIGN KEY (`tag_id`) REFERENCES `big_tables_tag` (`id`);
ALTER TABLE `big_tables_message_tags`
    DROP FOREIGN KEY `big_tables_message__message_id_95bfb6e6_fk_big_tables_message_id`;
ALTER TABLE `big_tables_message_tags`
    ADD CONSTRAINT `big_tables_message__message_id_95bfb6e6_fk_big_tables_message_id`
    FOREIGN KEY (`message_id`) REFERENCES `big_tables_message` (`id`);
MySQL
Rows ~ 2.7M
Size ~ 88MB
message_id index size ~ 48MB
tags_id index size ~ 61MB
Migration time ~ 41 sec
The weird migration
ALTER TABLE "big_tables_message_tags"
    DROP CONSTRAINT "big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id";
ALTER TABLE "big_tables_message_tags"
    ADD CONSTRAINT "big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id"
    FOREIGN KEY ("tag_id") REFERENCES "big_tables_tag" ("id")
    DEFERRABLE INITIALLY DEFERRED;
ALTER TABLE "big_tables_message_tags"
    DROP CONSTRAINT "big_tables_message_message_id_95bfb6e6_fk_big_tables_message_id";
ALTER TABLE "big_tables_message_tags"
    ADD CONSTRAINT "big_tables_message_message_id_95bfb6e6_fk_big_tables_message_id"
    FOREIGN KEY ("message_id") REFERENCES "big_tables_message" ("id")
    DEFERRABLE INITIALLY DEFERRED;
PostgreSQL
Rows ~ 2.8M
Size ~ 83MB
message_id index size ~ 77MB
tags_id index size ~ 119MB
Migration time ~ 3.2 sec
Solution
Modify the migration that created the field and add the change there
* It is a known issue: https://code.djangoproject.com/ticket/25253
Adding fields to big tables
class MessagesTags(models.Model):
    message = models.ForeignKey(Message)
    tag = models.ForeignKey(Tag)
    added_by = models.ForeignKey(User, null=True)
Timing
MySQL: 31 sec
PostgreSQL: 5.3 sec
MySQL INPLACE
ALTER TABLE `big_tables_message_tags`
    ADD COLUMN `added_by_id` integer NULL,
    ALGORITHM INPLACE, LOCK NONE;
ALTER TABLE `big_tables_message_tags`
    ADD CONSTRAINT `big_tables_message_ta_added_by_id_88e3a4dc_fk_big_tables_user_id`
    FOREIGN KEY (`added_by_id`) REFERENCES `big_tables_user` (`id`),
    ALGORITHM INPLACE, LOCK NONE;
* The INPLACE algorithm is supported when foreign_key_checks is disabled.
  Otherwise, only the COPY algorithm is supported.
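The footnote above means the session has to switch foreign_key_checks off before the ADD CONSTRAINT statement can run with the INPLACE algorithm. A minimal sketch of that wrapping, assuming any DB-API style cursor with an execute() method; the helper name run_with_fk_checks_disabled is hypothetical and not part of Django or MySQL:

```python
def run_with_fk_checks_disabled(cursor, statements):
    """Run DDL statements with MySQL foreign key checks off for this session.

    Assumption: foreign_key_checks was 1 before the call, so it is simply
    set back to 1 afterwards, even if one of the statements fails.
    """
    cursor.execute("SET foreign_key_checks = 0")
    try:
        for sql in statements:
            cursor.execute(sql)
    finally:
        # Always restore the checks so later queries in this session
        # are validated again.
        cursor.execute("SET foreign_key_checks = 1")


# Possible usage inside a Django project (not executed here):
#
#     from django.db import connection
#     with connection.cursor() as cursor:
#         run_with_fk_checks_disabled(cursor, [add_column_sql, add_fk_sql])
#
# add_column_sql and add_fk_sql stand for the two ALTER TABLE statements
# shown on the slide.
```

Keep in mind that while the checks are off MySQL does not validate existing rows against the new constraint. That is harmless for a freshly added NULL column, but worth remembering in other cases.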
Running this on prod
Running it on prod resulted in the API crashing
Non-locking query, but still too heavy for the DB
Aurora appears even slower
Alternative
class MessagesTagsExtend(models.Model):
    STATUS_PENDING_REVIEW = 0
    STATUS_APPROVED = 10
    DEFAULT_STATUS = STATUS_PENDING_REVIEW

    message_tag = models.OneToOneField(MessagesTags)
    status = models.IntegerField(default=DEFAULT_STATUS)
Alternative
class MessagesTags(models.Model):
    ...

    @property
    def status(self):
        # Fall back to the default when no extension row exists yet.
        try:
            return self.messagestagsextend.status
        except MessagesTagsExtend.DoesNotExist:
            print('here')  # debug output kept from the original slide
            return MessagesTagsExtend.DEFAULT_STATUS

    @status.setter
    def status(self, value):
        # Create the extension row on first write, then update it.
        obj, _ = MessagesTagsExtend.objects.get_or_create(message_tag=self)
        obj.status = value
        obj.save()
        self.messagestagsextend = obj
* Performance has not been tested in a production environment
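The speaker's note for this slide says you will always have to add select_related. Without it, reading status on each row is likely to cost one extra query per object. A small sketch of that idea; the helper name with_status is hypothetical, and 'messagestagsextend' is the default reverse name Django gives the OneToOneField defined on the previous slide:

```python
def with_status(queryset):
    """Return a queryset that also loads the extension row via a JOIN.

    `queryset` is expected to be a MessagesTags queryset. select_related
    follows the reverse one-to-one relation, so the `status` property
    should not need an additional query per object.
    """
    return queryset.select_related('messagestagsextend')


# Possible usage inside the project (not executed here):
#
#     for message_tag in with_status(MessagesTags.objects.all()):
#         print(message_tag.status)
```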
Iterating on big tables
for x in MessagesTags.objects.all():
    print(x)
+ Single SQL query
- Loads everything in memory
Iterating on big tables
for x in MessagesTags.objects.iterator():
    print(x)
+ Single SQL query
+ Loads pieces of the result in memory
- prefetch_related is not working
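One possible way around the prefetch_related limitation, not shown in the talk, is to walk the table in primary-key order one chunk at a time. Each chunk is an ordinary evaluated queryset, so prefetch_related applies to it, at the price of several SQL queries instead of one. A minimal sketch, assuming an orderable primary key; the helper name iterate_in_chunks and the default chunk size are hypothetical:

```python
def iterate_in_chunks(queryset, chunk_size=1000):
    """Yield objects from `queryset` in primary-key order, chunk by chunk.

    Only `chunk_size` objects are held in memory at a time. Each chunk is
    fetched with its own query filtered on the last primary key seen.
    """
    queryset = queryset.order_by('pk')
    last_pk = None
    while True:
        page = queryset if last_pk is None else queryset.filter(pk__gt=last_pk)
        chunk = list(page[:chunk_size])
        if not chunk:
            return
        for obj in chunk:
            yield obj
        # Remember where this chunk ended so the next query starts after it.
        last_pk = chunk[-1].pk


# Possible usage inside the project (not executed here):
#
#     for x in iterate_in_chunks(MessagesTags.objects.all()):
#         print(x)
```

Rows inserted with a lower primary key while the loop is running will be missed, so this fits append-mostly tables best.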
Questions?



Editor's Notes

  • #4: How many of you use MySQL? How many use PostgreSQL? Anyone using SQLite or Oracle?
  • #17: And of course you will always have to add select_related.
  • #18: Single SQL query. Loads everything in memory. I killed it after it had taken 2 GB of RAM and still hadn't started printing the results.
  • #19: And of course you will always have to add select_related. Consider using values() and values_list().
  • #20: Thank you for listening. Do you have any questions?