The NoSQL Store Everyone Ignored
By Zohaib Sibte Hassan @ DoorDash
About Me
Zohaib Sibte Hassan
@zohaibility
Dad, engineer, tinkerer, love open
source & working for DoorDash!
At DoorDash we use Postgres!
History
2009 - FriendFeed blog
2011 - I discovered HSTORE and blogged about it
2012 - I revisited, imagining FriendFeed on Postgres & HSTORE
2015 - Talk with same title in Dublin
Our Roadmap Today
•A brief look at FriendFeed use-case
•Warming up with HSTORE
•Taking it to next level:
•JSONB
•Complex yet simple queries
•Partitioning our documents
Postgres Is Always Evolving
•Robust schemaless-types:
•Array
•HSTORE
•XML
•JSON & JSONB
•Improved storage engine
•Improved Foreign Data Wrappers
•Partitioning support
Brief History for HSTORE
May 2003 - First version of hstore
•May 16, 2003 - first (unpublished) version of hstore for PostgreSQL 7.3
•Dec 05, 2006 - hstore is a part of PostgreSQL 8.2
•May 23, 2007 - GIN index for hstore, PostgreSQL 8.3
•Sep 20, 2010 - Andrew Gierth improved hstore, PostgreSQL 9.0
•May 24, 2013 - Nested hstore with array support, key->value model -> document-based model (Stay tuned for more on this)
HSTORE
•Benefits:
•Provides a flexible model for storing semi-structured data in Postgres.
•Binary representation, so selecting fields or properties is extremely fast!
•GIN and GiST indexes can be used!
•Drawbacks:
•Too flat! Doesn't support tree-like structures (e.g. JSON, introduced in 2006, 3 years after hstore)
Enough Theory
Let's build something serious
FriendFeed
Using SQL to Build NoSQL
•https://backchannel.org/blog/friendfeed-schemaless-mysql
Why FriendFeed?
•Good example of understanding the available technology and the problem at hand.
•Did not cave in to buzzwords and start using something less known/reliable.
•Large-scale problem with a good example of how modern SQL tooling solves it.
•Using a tool that you are comfortable with.
•Read the blog post!
FriendFeed
{
    "id": "71f0c4d2291844cca2df6f486e96e37c",
    "user_id": "f48b0440ca0c4f66991c4d5f6a078eaf",
    "feed_id": "f48b0440ca0c4f66991c4d5f6a078eaf",
    "title": "We just launched a new backend system for FriendFeed!",
    "link": "http://friendfeed.com/e/71f0c4d2-2918-44cc-a2df-6f486e96e37c",
    "published": 1235697046,
    "updated": 1235697046
}
FriendFeed
CREATE TABLE entities (
    added_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    id BINARY(16) NOT NULL,
    updated TIMESTAMP NOT NULL,
    body MEDIUMBLOB,
    UNIQUE KEY (id),
    KEY (updated)
) ENGINE=InnoDB;
How Can We Index?
{
    "id": "71f0c4d2291844cca2df6f486e96e37c",
    "user_id": "f48b0440ca0c4f66991c4d5f6a078eaf",
    "feed_id": "f48b0440ca0c4f66991c4d5f6a078eaf",
    "title": "We just launched a new backend system for FriendFeed!",
    "link": "http://friendfeed.com/e/71f0c4d2-2918-44cc-a2df-6f486e96e37c",
    "published": 1235697046,
    "updated": 1235697046
}
FriendFeed Indexing
CREATE TABLE index_user_id (
    user_id BINARY(16) NOT NULL,
    entity_id BINARY(16) NOT NULL UNIQUE,
    PRIMARY KEY (user_id, entity_id)
) ENGINE=InnoDB;
•Create a table for each indexed field.
•Have background workers populate newly created indexes.
•A complete language framework ensures documents are indexed as they are inserted.
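The pattern described above, an opaque document table plus one hand-maintained table per indexed field, can be sketched in a few lines. This is only an illustration, not FriendFeed's actual code: SQLite stands in for MySQL so the example is self-contained, ids are stored as text instead of BINARY(16), and the helper names `save_entity` and `find_by_user` are hypothetical.

```python
import json
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entities (
        added_id INTEGER PRIMARY KEY AUTOINCREMENT,
        id       TEXT NOT NULL UNIQUE,
        updated  INTEGER NOT NULL,
        body     BLOB
    );
    CREATE TABLE index_user_id (
        user_id   TEXT NOT NULL,
        entity_id TEXT NOT NULL UNIQUE,
        PRIMARY KEY (user_id, entity_id)
    );
""")


def save_entity(entity):
    """Store the opaque document, then write its index row by hand."""
    with conn:  # simplification: document and index row share one transaction
        conn.execute(
            "INSERT INTO entities (id, updated, body) VALUES (?, ?, ?)",
            (entity["id"], entity["updated"], json.dumps(entity).encode()),
        )
        conn.execute(
            "INSERT INTO index_user_id (user_id, entity_id) VALUES (?, ?)",
            (entity["user_id"], entity["id"]),
        )


def find_by_user(user_id):
    """Resolve entity ids through the index table, then load the documents."""
    rows = conn.execute(
        "SELECT e.body FROM index_user_id i"
        " JOIN entities e ON e.id = i.entity_id"
        " WHERE i.user_id = ?",
        (user_id,),
    ).fetchall()
    return [json.loads(body) for (body,) in rows]


save_entity({
    "id": "71f0c4d2291844cca2df6f486e96e37c",
    "user_id": "f48b0440ca0c4f66991c4d5f6a078eaf",
    "title": "We just launched a new backend system for FriendFeed!",
    "updated": 1235697046,
})
print(find_by_user("f48b0440ca0c4f66991c4d5f6a078eaf")[0]["title"])
```

The database never looks inside `body`; every indexed field costs an extra table and extra application code, which is the burden the rest of the talk removes.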
Coding Framework
HSTORE
The Key-Value Store Everyone Ignored
CREATE TABLE feed (
    id varchar(64) NOT NULL PRIMARY KEY,
    doc hstore
);

INSERT INTO feed VALUES (
    'ff923c93-7769-4ef6-b026-50c5a87a79c5',
    'id=>zohaibility, post=>hello'::hstore
);
HSTORE
SELECT doc->'post' as post, doc->'undefined_field' as undefined
FROM feed
WHERE doc->'id' = 'zohaibility';
 post  | undefined
-------+-----------
 hello |
(1 row)
HSTORE
EXPLAIN SELECT *
FROM feed
WHERE doc->'id' = 'zohaibility';
                      QUERY PLAN
-------------------------------------------------------
 Seq Scan on feed  (cost=0.00..1.03 rows=1 width=178)
   Filter: ((doc -> 'id'::text) = 'zohaibility'::text)
(2 rows)
Indexing HSTORE
CREATE INDEX feed_user_id_index
ON feed ((doc->'id'));
HSTORE ❤ GiST
CREATE INDEX feed_gist_idx
ON feed
USING gist (doc);
HSTORE ❤ GiST
SELECT doc->'post' as post, doc->'undefined_field' as should_be_null
FROM feed
WHERE doc @> 'id=>zohaibility';
 post  | should_be_null
-------+----------------
 hello |
(1 row)
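To make the two operators used so far concrete, here is a rough Python model of them. It is a sketch of the semantics only, assuming an hstore value behaves like a flat map of text keys to text values; the function names `fetch` and `contains` are made up for this illustration and are not part of any Postgres client library.

```python
# An hstore value modelled as a flat map of text keys to text values.
doc = {"id": "zohaibility", "post": "hello"}


def fetch(hstore, key):
    """Rough analogue of hstore -> key: a missing key yields None (SQL NULL)."""
    return hstore.get(key)


def contains(left, right):
    """Rough analogue of left @> right: every pair in right appears in left."""
    return all(key in left and left[key] == value for key, value in right.items())


print(fetch(doc, "post"))                      # hello
print(fetch(doc, "undefined_field"))           # None
print(contains(doc, {"id": "zohaibility"}))    # True
print(contains(doc, {"id": "someone_else"}))   # False
```

Because `@>` is phrased as "does this document contain these pairs", a single GiST or GIN index on the whole column can serve lookups on any key, whereas the expression index on `doc->'id'` only helps queries on that one key.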
Reimagining FriendFeed
CREATE TABLE entities (
    id BIGINT PRIMARY KEY,
    updated TIMESTAMP NOT NULL,
    body HSTORE,
    …
);
CREATE TABLE index_user_id (
    user_id BINARY(16) NOT NULL,
    entity_id BINARY(16) NOT NULL UNIQUE,
    PRIMARY KEY (user_id, entity_id)
) ENGINE=InnoDB;
CREATE INDEX CONCURRENTLY entity_id_index
ON entities ((body->'entity_id'));
More Operators!
https://www.postgresql.org/docs/current/hstore.html
JSONB
To Infinity and Beyond
Why JSON?
•Well understood, and the go-to standard for almost everything on the modern web.
•"Self-describing", hierarchical, with parsing and serialization libraries for every programming language.
•Describes a loose shape of the object, which might be necessary in some cases.
Tweets
Tweets Table
CREATE TABLE tweets (
    id varchar(64) NOT NULL PRIMARY KEY,
    content jsonb NOT NULL
);
Basic Query
SELECT "content"->'text' as txt, "content"->'favorite_count' as cnt
FROM tweets
WHERE "content"->>'id_str' = '…';
And YES, you can index this!!!
Peeking into Structure
SELECT *
FROM tweets
WHERE (content->>'favorite_count')::integer >= 1;
😭
EXPLAIN SELECT *
FROM tweets
WHERE (content->>'favorite_count')::integer >= 1;
                            QUERY PLAN
------------------------------------------------------------------
 Seq Scan on tweets  (cost=0.00..2453.28 rows=6688 width=718)
   Filter: (((content ->> 'favorite_count'::text))::integer >= 1)
(2 rows)
Basic Indexing
CREATE INDEX fav_count_index
ON tweets (((content->'favorite_count')::INTEGER));
Basic Indexing
EXPLAIN SELECT *
FROM tweets
WHERE (content->'favorite_count')::integer >= 1;
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Bitmap Heap Scan on tweets  (cost=128.12..2297.16 rows=6688 width=718)
   Recheck Cond: (((content -> 'favorite_count'::text))::integer >= 1)
   ->  Bitmap Index Scan on fav_count_index  (cost=0.00..126.45 rows=6688 width=0)
         Index Cond: (((content -> 'favorite_count'::text))::integer >= 1)
(4 rows)
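Why does indexing an expression turn the sequential scan into an index scan? The toy model below may help. It is a conceptual sketch, not how Postgres stores a B-tree: the "index" is just a sorted list of (expression value, row id) pairs, and the sample rows and function names are hypothetical.

```python
import bisect

# Hypothetical sample rows: (row id, JSON-like document).
tweets = [
    (1, {"text": "a", "favorite_count": 0}),
    (2, {"text": "b", "favorite_count": 3}),
    (3, {"text": "c", "favorite_count": 1}),
    (4, {"text": "d", "favorite_count": 0}),
]


def favorite_count(doc):
    """The indexed expression: extract the field and cast it to an integer."""
    return int(doc["favorite_count"])


# "CREATE INDEX": evaluate the expression once per row and keep results sorted.
index = sorted((favorite_count(doc), row_id) for row_id, doc in tweets)


def seq_scan(minimum):
    """Without an index: evaluate the expression for every row."""
    return sorted(row_id for row_id, doc in tweets if favorite_count(doc) >= minimum)


def index_scan(minimum):
    """With the index: binary-search to the first match, then read onwards."""
    start = bisect.bisect_left(index, (minimum,))
    return sorted(row_id for _, row_id in index[start:])


print(seq_scan(1))    # [2, 3]
print(index_scan(1))  # [2, 3]
```

The index only helps when the query uses the same expression that was indexed, which is why the WHERE clause has to match the CREATE INDEX expression exactly.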
JSONB-JI
S
SELECT content#>>'{text}' as txt
FROM tweets
WHERE (content#>'{entities,hashtags}') @> '[{"text": "python"}]'::jsonb;
JSON Operators
JSONB Operators
Matching Tags
SELECT content#>>'{text}' as txt
FROM tweets
WHERE (content#>'{entities,hashtags}') @> '[{"text": "python"}]'::jsonb;
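The `@>` containment test on JSONB is recursive, which is what lets `'[{"text": "python"}]'` match a hashtag object that carries extra fields. The following is a simplified Python model of that behaviour, written as an assumption-laden sketch: `jsonb_contains` is a made-up name, the sample tweet is hypothetical, and edge cases such as a top-level array containing a bare scalar, or JSON `true` versus the number 1, are not modelled.

```python
def jsonb_contains(left, right):
    """Simplified model of left @> right for already-parsed JSON values."""
    if isinstance(left, dict) and isinstance(right, dict):
        # Every key of right must exist in left with a contained value.
        return all(
            key in left and jsonb_contains(left[key], value)
            for key, value in right.items()
        )
    if isinstance(left, list) and isinstance(right, list):
        # Every element of right must be contained in some element of left;
        # order and duplicates are ignored.
        return all(
            any(jsonb_contains(candidate, wanted) for candidate in left)
            for wanted in right
        )
    # Scalars (and mismatched container types) fall back to plain equality.
    return left == right


# Hypothetical tweet document.
tweet = {
    "text": "Learning #python and #sql",
    "entities": {
        "hashtags": [
            {"text": "python", "indices": [9, 16]},
            {"text": "sql", "indices": [21, 25]},
        ]
    },
}

hashtags = tweet["entities"]["hashtags"]  # content #> '{entities,hashtags}'
print(jsonb_contains(hashtags, [{"text": "python"}]))    # True
print(jsonb_contains(hashtags, [{"text": "postgres"}]))  # False
```

The query does not need to know a hashtag's position in the array or its other fields; it only states the fragment that must be present.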
Indexing
CREATE INDEX idx_gin_hashtags
ON tweets
USING GIN ((content#>'{entities,hashtags}') jsonb_ops);
Complex Search
CREATE INDEX idx_gin_rt_hashtags
ON tweets
USING GIN ((content#>'{retweeted_status,entities,hashtags}') jsonb_ops);
SELECT content#>'{text}' as txt
FROM tweets
WHERE (
    (content#>'{entities,hashtags}') @> '[{"text": "python"}]'::jsonb
    OR
    (content#>'{retweeted_status,entities,hashtags}') @> '[{"text": "postgres"}]'::jsonb
);
JSONB + Ecosystem
The Power of Alchemy
JSONB + TSVECTOR
CREATE INDEX idx_gin_tweet_text
ON tweets
USING GIN (to_tsvector('english', content->>'text') tsvector_ops);
SELECT content->>'text' as txt
FROM tweets
WHERE to_tsvector('english', content->>'text') @@ to_tsquery('english', 'python');
JSONB + Partition
CREATE TABLE part_tweets (
    id varchar(64) NOT NULL,
    content jsonb NOT NULL
) PARTITION BY hash (md5(content->'user'->>'id'));

CREATE TABLE part_tweets_0 PARTITION OF part_tweets FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE part_tweets_1 PARTITION OF part_tweets FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE part_tweets_2 PARTITION OF part_tweets FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE part_tweets_3 PARTITION OF part_tweets FOR VALUES WITH (MODULUS 4, REMAINDER 3);
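The MODULUS / REMAINDER clauses describe a simple routing rule: hash the partition key, divide by the modulus, and send the row to the partition that owns the remainder. A minimal sketch of that rule follows. It assumes MD5 as a stand-in for PostgreSQL's internal hash function, so the partition chosen here will generally differ from what a real server picks; `partition_for` and the user ids are hypothetical.

```python
import hashlib

MODULUS = 4


def partition_for(user_id):
    """Pick a partition: hash the key, then take the remainder modulo MODULUS.

    Assumption: MD5 stands in for PostgreSQL's internal hash function, so the
    partition numbers here will not match what a real server chooses.
    """
    digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
    remainder = int.from_bytes(digest[:8], "big") % MODULUS
    return f"part_tweets_{remainder}"


# Hypothetical user ids; the same id always routes to the same partition.
for user_id in ("1001", "1002", "1003", "1001"):
    print(user_id, "->", partition_for(user_id))
```

Since the key is derived from the user id inside the document, all tweets of one user land in the same partition, while different users spread across the four tables.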
JSONB + Partition + Indexing
CREATE INDEX pidx_gin_hashtags ON part_tweets USING GIN
    ((content#>'{entities,hashtags}') jsonb_ops);

CREATE INDEX pidx_gin_rt_hashtags ON part_tweets USING GIN
    ((content#>'{retweeted_status,entities,hashtags}') jsonb_ops);

CREATE INDEX pidx_gin_tweet_text ON part_tweets USING GIN
    (to_tsvector('english', content->>'text') tsvector_ops);

INSERT INTO part_tweets SELECT * from tweets;
JSONB + Partition + Indexing
EXPLAIN SELECT content#>'{text}' as txt
FROM part_tweets
WHERE (content#>'{entities,hashtags}') @> '[{"text": "postgres"}]'::jsonb;
                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Append  (cost=24.26..695.46 rows=131 width=32)
   ->  Bitmap Heap Scan on part_tweets_0  (cost=24.26..150.18 rows=34 width=32)
         Recheck Cond: ((content #> '{entities,hashtags}'::text[]) @> '[{"text": "postgres"}]'::jsonb)
         ->  Bitmap Index Scan on part_tweets_0_expr_idx  (cost=0.00..24.25 rows=34 width=0)
               Index Cond: ((content #> '{entities,hashtags}'::text[]) @> '[{"text": "postgres"}]'::jsonb)
   ->  Bitmap Heap Scan on part_tweets_1  (cost=80.25..199.02 rows=32 width=32)
         Recheck Cond: ((content #> '{entities,hashtags}'::text[]) @> '[{"text": "postgres"}]'::jsonb)
         ->  Bitmap Index Scan on part_tweets_1_expr_idx  (cost=0.00..80.24 rows=32 width=0)
               Index Cond: ((content #> '{entities,hashtags}'::text[]) @> '[{"text": "postgres"}]'::jsonb)
   ->  Bitmap Heap Scan on part_tweets_2  (cost=28.25..147.15 rows=32 width=32)
         Recheck Cond: ((content #> '{entities,hashtags}'::text[]) @> '[{"text": "postgres"}]'::jsonb)
         ->  Bitmap Index Scan on part_tweets_2_expr_idx  (cost=0.00..28.24 rows=32 width=0)
               Index Cond: ((content #> '{entities,hashtags}'::text[]) @> '[{"text": "postgres"}]'::jsonb)
   ->  Bitmap Heap Scan on part_tweets_3  (cost=76.26..198.46 rows=33 width=32)
         Recheck Cond: ((content #> '{entities,hashtags}'::text[]) @> '[{"text": "postgres"}]'::jsonb)
         ->  Bitmap Index Scan on part_tweets_3_expr_idx  (cost=0.00..76.25 rows=33 width=0)
               Index Cond: ((content #> '{entities,hashtags}'::text[]) @> '[{"text": "postgres"}]'::jsonb)
(17 rows)
JSONB + Partition + Indexing
EXPLAIN SELECT content#>'{text}' as txt FROM tweets WHERE (
    (content#>'{entities,hashtags}') @> '[{"text": "python"}]'::jsonb
    OR
    (content#>'{retweeted_status,entities,hashtags}') @> '[{"text": "python"}]'::jsonb
);
Limit Is Your Imagination
Don't Underestimate the Power of
Links & Resources
•https://www.linuxjournal.com/content/postgresql-nosql-database
•http://www.sai.msu.su/~megera/postgres/talks/hstore-dublin-2013.pdf
•https://maxpert.tumblr.com/post/14062057061/the-key-value-store-everyone-ignored-postgresql
•https://maxpert.tumblr.com/post/32461917960/migrating-friendfeed-to-postgresql
•https://www.postgresql.org/docs/current/datatype-json.html
•https://www.postgresql.org/docs/current/functions-json.html
•https://www.postgresql.org/docs/current/gin-builtin-opclasses.html
•https://www.postgresql.org/docs/current/ddl-partitioning.html
•https://www.postgresql.org/docs/current/textsearch-tables.html
•https://pgdash.io/blog/partition-postgres-11.html
•https://talks.bitexpert.de/dpc15-postgres-nosql/#/
•https://www.postgresql.org/docs/current/hstore.html
Thank You!

NoSQL store everyone ignored - Postgres Conf 2021