The Hacker’s Database

Amir Salihefendic (amix)
  
About Me

•  Co-founder and former CTO of Plurk.com

•  Helped Plurk scale to millions of users,
   billions of page views and 8+ billion unique
   data items. With minimal hardware!

•  Founder of Doist.io,
   creators of Todoist and Wedoist
  
Outline of the talk

•  Plurk Timelines optimization: How we saved
   hundreds of thousands of dollars

•  What’s great about Redis?

•  Different sample implementations:
   –  redis_wrap
   –  redis_graph
   –  redis_queue

•  Advanced analytics using Redis
   –  bitmapist and bitmapist.cohort
  
Amir Salihefendic: Redis - the hacker's database
Problem

Exponential data growth in Social Networks

[Figure: data size vs. number of users]

The Easy Solution

Throw money at the problem

The Smarter Solution

Reduce to linear data growth

[Figure: data size vs. number of users]

Example: Timelines

[Figure: timeline data size vs. number of users]

Example: Timelines

Solution: Cheating!
Make timelines a fixed size - 500 messages

•  O(1) insertion
•  O(1) update
•  Cacheable

[Figure: timeline data size vs. number of users]
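
The fixed-size idea can be sketched without a server. This is a minimal in-memory illustration, not Plurk's actual code: `collections.deque` with `maxlen` stands in for a capped Redis list (on a real server, a common pairing would be `LPUSH` followed by `LTRIM`), and the names `TIMELINE_SIZE`, `new_timeline` and `add_message` are made up for the example.

```python
from collections import deque

# Assumption: 500 matches the fixed timeline size quoted on the slide.
TIMELINE_SIZE = 500

def new_timeline():
    """Create an empty timeline that never holds more than TIMELINE_SIZE items."""
    return deque(maxlen=TIMELINE_SIZE)

def add_message(timeline, message_id):
    """Insert the newest message at the front; the oldest falls off the end."""
    timeline.appendleft(message_id)

timeline = new_timeline()
for message_id in range(1200):
    add_message(timeline, message_id)

print(len(timeline))   # stays at 500 no matter how many messages arrive
print(timeline[0])     # newest message id: 1199
print(timeline[-1])    # oldest message still kept: 700
```

Because every timeline is capped, total timeline storage grows with the number of users rather than with the number of messages ever posted.
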
Plurk’s timelines migration path

Tokyo Tyrant

•  Problem with MySQL and Tokyo Tyrant?
   Death by IO
  
What’s great about Redis?

•  Everything is in memory,
   but the data is persistent.

•  Amazing performance:
   100,000+ SETs per sec
   80,000+ GETs per sec
  
Redis Rich Datatypes

•  Relational databases
   Schemas, tables, columns, rows, indexes etc.

•  Column databases (BigTable, HBase etc.)
   Schemas, columns, column families, rows etc.

•  Redis
   key-value, sets, lists, hashes, bitmaps, etc.
  
Redis datatypes resemble datatypes
in programming languages.

They are natural to us!
  
redis_wrap

•  Implements a wrapper for Redis datatypes so
   they mimic the datatypes found in Python

•  100 lines of code

•  https://github.com/Doist/redis_wrap
  	
  
redis_wrap

from redis_wrap import *

# Mimic of Python lists
bears = get_list('bears')
bears.append('grizzly')

assert len(bears) == 1
assert 'grizzly' in bears

# Mimic of Python sets
fishes = get_set('fishes')
assert 'nemo' not in fishes

fishes.add('nemo')
assert 'nemo' in fishes

for item in fishes:
    assert item == 'nemo'

# Mimic of hashes
villains = get_hash('villains')
assert 'riddler' not in villains

villains['riddler'] = 'Edward Nigma'
assert 'riddler' in villains
assert len(villains.keys()) == 1

del villains['riddler']
assert len(villains) == 0
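
To give a feel for how such a wrapper can work, here is a minimal sketch. It is not the redis_wrap source: `ListWrapper` and `InMemoryClient` are hypothetical names, and the stand-in client only imitates three list methods (`rpush`, `llen`, `lrange`, named after the redis-py methods) so that the example runs without a server.

```python
class InMemoryClient(object):
    """Hypothetical stand-in for a Redis client so the sketch runs offline.

    Only rpush, llen and lrange are provided; their names mirror the
    redis-py methods of the same name.
    """
    def __init__(self):
        self._lists = {}

    def rpush(self, name, value):
        self._lists.setdefault(name, []).append(value)
        return len(self._lists[name])

    def llen(self, name):
        return len(self._lists.get(name, []))

    def lrange(self, name, start, end):
        items = self._lists.get(name, [])
        # Redis treats `end` as inclusive and -1 as "last element".
        stop = len(items) if end == -1 else end + 1
        return items[start:stop]


class ListWrapper(object):
    """Illustrative list-like view over one Redis list key."""
    def __init__(self, client, name):
        self.client = client
        self.name = name

    def append(self, item):
        self.client.rpush(self.name, item)

    def __len__(self):
        return self.client.llen(self.name)

    def __iter__(self):
        return iter(self.client.lrange(self.name, 0, -1))

    def __contains__(self, item):
        return item in self.client.lrange(self.name, 0, -1)


bears = ListWrapper(InMemoryClient(), 'bears')
bears.append('grizzly')
assert len(bears) == 1
assert 'grizzly' in bears
```

The point of the design is that Python's special methods (`__len__`, `__iter__`, `__contains__`) let calling code treat a remote list like a local one.
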
  
redis_graph

•  Implements a simple graph database in Python

•  Can scale to a few million nodes easily

•  You could use something similar to implement
   LinkedIn’s “who is connected to who” feature

•  Under 40 lines of code

•  https://github.com/Doist/redis_graph
  	
  
redis_graph

from redis_graph import *

# Adding an edge between nodes
add_edge(from_node='frodo', to_node='gandalf')
assert has_edge(from_node='frodo', to_node='gandalf')

# Getting neighbors of a node
assert list(neighbors('frodo')) == ['gandalf']

# Deleting edges
delete_edge(from_node='frodo', to_node='gandalf')

# Setting node values
set_node_value('frodo', '1')
assert get_node_value('frodo') == '1'

# Setting edge values
set_edge_value('frodo_baggins', '2')
assert get_edge_value('frodo_baggins') == '2'
  
redis_graph: The implementation

from redis_wrap import *

#--- Edges ----------------------------------------------
def add_edge(from_node, to_node, system='default'):
    edges = get_set(from_node, system=system)
    edges.add(to_node)

def delete_edge(from_node, to_node, system='default'):
    edges = get_set(from_node, system=system)

    key_node_y = to_node
    if key_node_y in edges:
        edges.remove(key_node_y)

def has_edge(from_node, to_node, system='default'):
    edges = get_set(from_node, system=system)
    return to_node in edges

def neighbors(node_x, system='default'):
    return get_set(node_x, system=system)

#--- Node values ----------------------------
def get_node_value(node_x, system='default'):
    node_key = 'nv:%s' % node_x
    return get_redis(system).get(node_key)

def set_node_value(node_x, value, system='default'):
    node_key = 'nv:%s' % node_x
    return get_redis(system).set(node_key, value)

#--- Edge values -----------------------------
def get_edge_value(edge_x, system='default'):
    edge_key = 'ev:%s' % edge_x
    return get_redis(system).get(edge_key)

def set_edge_value(edge_x, value, system='default'):
    edge_key = 'ev:%s' % edge_x
    return get_redis(system).set(edge_key, value)
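
A "who is connected to who" query can be sketched on the same one-set-of-neighbors-per-node layout. This is an in-memory illustration only: a plain dict of Python sets stands in for the Redis sets, and `second_degree` and `mutual` are hypothetical helpers that are not part of redis_graph. On a real server the intersection could plausibly be done with the `SINTER` command.

```python
# In-memory stand-in: node name -> set of neighbor names (directed edges).
edges = {}

def add_edge(from_node, to_node):
    edges.setdefault(from_node, set()).add(to_node)

def neighbors(node):
    return edges.get(node, set())

def second_degree(node):
    """Nodes two hops away that are not already direct neighbors."""
    direct = neighbors(node)
    result = set()
    for friend in direct:
        result |= neighbors(friend)
    return result - direct - {node}

def mutual(node_a, node_b):
    """Neighbors shared by both nodes (a set intersection)."""
    return neighbors(node_a) & neighbors(node_b)

add_edge('frodo', 'gandalf')
add_edge('frodo', 'sam')
add_edge('gandalf', 'aragorn')
add_edge('sam', 'frodo')
add_edge('bilbo', 'gandalf')

print(sorted(second_degree('frodo')))    # ['aragorn']
print(sorted(mutual('frodo', 'bilbo')))  # ['gandalf']
```
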
                                                     	
  
redis_queue

•  Implements a queue in Python using Redis

•  Used to process millions of background tasks on
   Plurk / Todoist / Wedoist daily (billions in total)

•  Implementation: 18 lines
   (“real” implementation a bit bigger)

•  https://github.com/Doist/redis_simple_queue
  	
  
redis_queue	
  
from redis_simple_queue import *	
	
delete_jobs('tasks')	
	
put_job('tasks', '42')	
	
assert 'tasks' in get_all_queues()	
assert queue_stats('tasks')['queue_size'] == 1	
	
assert reserve_job('tasks') == '42'	
assert queue_stats('tasks')['queue_size'] == 0	
  
redis_queue: Implementation

from redis_wrap import *

def put(queue, job_data, system='default'):
    get_list(queue, system=system).append(job_data)

def reserve(queue, system='default'):
    return get_list(queue, system=system).pop()

def delete_jobs(queue, system='default'):
    get_redis(system).delete(queue)

def get_all_queues(system='default'):
    # Current redis-py clients return the matching keys as a list,
    # so no string splitting is needed.
    return get_redis(system).keys('*')

def queue_stats(queue, system='default'):
    return {
        # Pass `system` through so the stats come from the same Redis instance.
        'queue_size': len(get_list(queue, system=system))
    }
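
A worker that drains such a queue can be sketched as follows. This is an in-memory illustration under stated assumptions: a `deque` stands in for the Redis list, jobs are handed out oldest first (FIFO, which on a server would correspond to pairing `RPUSH` with `LPOP`), and `work` is a hypothetical helper. Which end the slide's `pop()` removes from depends on the list wrapper and is not verified here.

```python
from collections import deque

# Hypothetical in-memory stand-in: queue name -> deque of pending jobs.
queues = {}

def put(queue, job_data):
    queues.setdefault(queue, deque()).append(job_data)

def reserve(queue):
    """Return the oldest job, or None when the queue is empty."""
    jobs = queues.get(queue)
    return jobs.popleft() if jobs else None

def work(queue, handler):
    """Drain the queue, passing each job to handler; returns the job count."""
    processed = 0
    while True:
        job = reserve(queue)
        if job is None:
            break
        handler(job)
        processed += 1
    return processed

results = []
for job in ('1', '2', '3'):
    put('tasks', job)
count = work('tasks', results.append)

print(count)    # 3
print(results)  # ['1', '2', '3']
```
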
  
bitmapist and bitmapist.cohort

•  Implements an advanced analytics library on top
   of Redis bitmaps. Saved us $2000 USD/month
   (Mixpanel)!

•  bitmapist
   https://github.com/Doist/bitmapist

•  bitmapist.cohort
   Cohort analytics (retention)
  
bitmapist: What does it help with?

•  Has user 123 been online today? This week?
•  Has user 123 performed action "X"?
•  How many users have been active this month?
•  How many unique users have performed action "X"
   this week?
•  What percentage of users that were active last week are
   still active?
•  What percentage of users that were active last month are
   still active this month?
•  Bitmapist can answer this for millions of users, and
   most operations are O(1)! Using very small amounts
   of memory.
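
The memory claim can be checked with simple arithmetic. The sketch below assumes user IDs are dense integers used directly as bit offsets, with one bitmap per event and time period; `bitmap_bytes` is a hypothetical helper, not part of bitmapist. A Redis string grows to cover the highest bit ever set, so the size is the highest offset divided by 8, rounded down, plus one byte.

```python
def bitmap_bytes(highest_user_id):
    """Bytes needed for a bitmap whose highest set bit is `highest_user_id`."""
    return highest_user_id // 8 + 1

for users in (1000000, 10000000, 100000000):
    size = bitmap_bytes(users - 1)   # user ids 0 .. users-1
    print('%d users -> %d bytes (%.2f MiB)'
          % (users, size, size / (1024.0 * 1024.0)))
```

On the complexity side, setting or reading a single bit is constant time, while counting or combining whole bitmaps has to scan them, which is why the slide says "most" operations.
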
  
What are bitmaps?

•  Operations: SETBIT, GETBIT, BITCOUNT, BITOP

•  SETBIT somekey 8 1

•  GETBIT somekey 8

•  BITOP AND destkey somekey1 somekey2

•  http://en.wikipedia.org/wiki/Bit_array
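
The four operations can be imitated in plain Python to see what they do. This is only a sketch: a Python integer stands in for the Redis string, the helper names `setbit`, `getbit` and `bitcount` are chosen to echo the commands, and the byte-level layout Redis uses (bits numbered from the most significant bit of the first byte) is ignored.

```python
def setbit(bitmap, offset, value):
    """Return a copy of `bitmap` with the bit at `offset` set to 0 or 1."""
    if value:
        return bitmap | (1 << offset)
    return bitmap & ~(1 << offset)

def getbit(bitmap, offset):
    """Return the bit (0 or 1) stored at `offset`."""
    return (bitmap >> offset) & 1

def bitcount(bitmap):
    """Count how many bits are set to 1."""
    return bin(bitmap).count('1')

somekey1 = setbit(0, 8, 1)          # like: SETBIT somekey1 8 1
somekey1 = setbit(somekey1, 3, 1)   # like: SETBIT somekey1 3 1
somekey2 = setbit(0, 8, 1)          # like: SETBIT somekey2 8 1
destkey = somekey1 & somekey2       # like: BITOP AND destkey somekey1 somekey2

print(getbit(somekey1, 8))  # 1
print(getbit(somekey1, 9))  # 0
print(bitcount(somekey1))   # 2
print(bitcount(destkey))    # 1
```
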
  	
  
bitmapist: Using it

from datetime import datetime, timedelta
from bitmapist import mark_event, MonthEvents, WeekEvents, BitOpAnd

# Assumed definitions (not shown on the slide): the current time and
# a date roughly one month back.
now = datetime.utcnow()
last_month = now - timedelta(days=30)

# Mark user 123 as active and as having played a song
mark_event('active', 123)
mark_event('song:played', 123)

# Answer if user 123 has been active this month
assert 123 in MonthEvents('active', now.year, now.month)
assert 123 in MonthEvents('song:played', now.year, now.month)

# How many users have been active this week?
print(len(WeekEvents('active', now.year, now.isocalendar()[1])))

# Perform bit operations. How many users that
# were active last month are still active this month?
active_2_months = BitOpAnd(
    MonthEvents('active', last_month.year, last_month.month),
    MonthEvents('active', now.year, now.month)
)
print(len(active_2_months))
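
Turning two such counts into a retention percentage is a short step. The sketch below runs without Redis or bitmapist: Python integers stand in for the monthly bitmaps, and `to_bitmap`, `count` and `retention_percent` are hypothetical helpers written for this illustration.

```python
def to_bitmap(user_ids):
    """Build an integer bitmap with one bit set per user id."""
    bitmap = 0
    for user_id in user_ids:
        bitmap |= 1 << user_id
    return bitmap

def count(bitmap):
    """Number of users marked in the bitmap."""
    return bin(bitmap).count('1')

def retention_percent(previous, current):
    """Share of users in `previous` that also appear in `current`, in percent."""
    base = count(previous)
    if base == 0:
        return 0.0
    return 100.0 * count(previous & current) / base

last_month_active = to_bitmap([1, 2, 3, 123])
this_month_active = to_bitmap([2, 123, 500])

print(retention_percent(last_month_active, this_month_active))  # 50.0
```
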
	
  
bitmapist.cohort:
Manage retention!

http://amix.dk/blog/post/19718
  	
  
•  Goal: Inventing a modern way to work together

•  Join an amazing team of 13 people from all around
   the world. A profitable business. 500,000+ users.

•  Work from anywhere. Hacker friendly culture.
   Python. Competitive salaries.

•  We are hiring: jobs@doist.io

www.doist.io
  	
  
Questions and Answers

•  Slides will be posted to
   http://amix.dk/

•  For “offline” questions contact:
   amix@doist.io
  	
  
