sales@chartio.com
(855) 232-0320
Using the PostgreSQL Extension Ecosystem for Advanced Analytics
Agenda
- The problem
  - The prevailing view vs. the practical reality
- A possible solution
  - Or just building blocks?
- Nearness
  - Near at hand, near to our skill set, near to our capabilities
- A more complete solution
  - The PostgreSQL extension ecosystem
The Problem
The Prevailing View vs. The Practical Reality
The Prevailing View - Logical
- Schema objects
  - Relational: structured rows and columns; schema on write; referential integrity; painful migrations
  - Non-relational: unstructured files, docs, etc.; schema on read; no referential integrity; no migrations
- Query languages
  - Relational: SQL; declarative; easy enough for non-tech users
  - Non-relational: various; procedural; requires some programming skills
- Exploratory analysis
  - Relational: native support for joins; interactive/low execution overhead
  - Non-relational: no native support for joins; OLAP - batch processing
- Data science and ML
  - Relational: only descriptive statistics; requires exporting dumps/samples
  - Non-relational: robust ecosystem; does not require exports
The Prevailing View - Physical
- Parallel query processing
  - Relational: single node system; single process per query
  - Non-relational: multiple node system; multiple processes per query
- Concurrency
  - Relational: high concurrency; single process per connection
  - Non-relational: OLAP - low concurrency/high scheduling overhead
- High availability & replication
  - Relational: async and sync replication; HA may not be native
  - Non-relational: async and sync replication; HA likely to be native
- Sharding
  - Relational: sharding may not be native; difficult to manage
  - Non-relational: sharding likely to be native; easy to manage
The Prevailing View - Summary
- RDBMSs have nice properties for producing rich data
  - ACID, relational integrity, constraints, strong data types
  - Easier for non-tech users and exploratory analysis
- Probably don’t meet the needs of today’s analysts
  - Data science & machine learning
  - Parallel processing
- Definitely don’t meet the needs of today’s apps
  - Schema migrations
  - Replication and sharding
The Practical Reality
But we still want more advanced functionality.
A Possible Solution
Or Just Building Blocks?
Modern SQL
- Many people still think of SQL in terms of SQL-92
- Since then we’ve had: SQL:1999, SQL:2003, SQL:2006, SQL:2008, SQL:2011
  - http://use-the-index-luke.com/blog/2015-02/modern-sql
- Common Table Expressions (CTEs) / Recursive CTEs
- Window Functions
- Ordered-set Aggregates
- Lateral joins
- Temporal support
- The list goes on...
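
As a small illustration of the recursive CTE item in the list above, the sketch below walks a reporting hierarchy. It is not from the slides: the `staff` table, its rows, and the function names are invented for the example, and Python's built-in `sqlite3` module stands in for PostgreSQL so the code runs without a server. The `WITH RECURSIVE` query shape itself should carry over to PostgreSQL largely unchanged.

```python
import sqlite3


def demo_connection():
    """Build an in-memory database with a hypothetical staff hierarchy."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO staff VALUES (?, ?, ?)",
        [(1, "ada", None), (2, "bo", 1), (3, "cy", 1), (4, "di", 2)],
    )
    return conn


def reporting_chain(conn, root_name):
    """Return (name, depth) for root_name and everyone reporting below them."""
    query = """
        WITH RECURSIVE chain(id, name, depth) AS (
            -- anchor member: the starting person
            SELECT id, name, 0 FROM staff WHERE name = ?
            UNION ALL
            -- recursive member: direct reports of rows found so far
            SELECT s.id, s.name, c.depth + 1
            FROM staff AS s
            JOIN chain AS c ON s.manager_id = c.id
        )
        SELECT name, depth FROM chain ORDER BY depth, name
    """
    return conn.execute(query, (root_name,)).fetchall()
```

Calling `reporting_chain(demo_connection(), "ada")` lists the whole tree with each person's depth, which in SQL-92 would have needed one self-join per level or procedural code.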
Procedural Languages
- Native: PL/pgSQL, PL/Tcl, PL/Perl, PL/Python
- Community: Java, PHP, R, JavaScript, Ruby, Scheme, sh
Building Blocks
These solve some problems.
For others, they are just building blocks.
Nearness
Near at Hand
Near to Our Skill Set
Near to Our Capabilities
Nearness
- http://www.infoq.com/presentations/Simple-Made-Easy
Nearness Drives Adoption
- Near at hand
  - Easily installable
- Near to our skill set
  - Familiar tool/language/abstraction
  - Modular and composable
- Near to our capabilities
  - Capable of solving a problem in our domain
A More Complete Solution
The PostgreSQL Extension Ecosystem
Postgres Extension Ecosystem Examples
- PostgreSQL Extension Network: http://pgxn.org/
- UDFs & operators: https://github.com/eulerto/pg_similarity
- UDAs & data types: https://github.com/aggregateknowledge/postgresql-hll
- Foreign Data Wrappers: http://multicorn.org/, https://github.com/shish/pgosquery
- Indexes: https://github.com/zombodb/zombodb
- Composing Extension Methods: http://doc.madlib.net/
- MPP: https://www.citusdata.com/, https://github.com/greenplum-db/gpdb
- Composing Extensions
  - Custom Background Workers: https://github.com/no0p/alps
  - Record linking: http://no0p.github.io/2015/10/20/record_linking.html#/
The PostgreSQL Extension Network
- Package manager: pgxn
- Index/network: http://pgxn.org/
- Comparable to PyPI, RubyGems, CPAN, CRAN
The PostgreSQL Extension Network
- Near at hand
  - pgxn search semver
  - pgxn info semver
  - pgxn install semver
  - pgxn load -d somedb semver
  - pgxn unload -d somedb semver
  - pgxn uninstall semver
- Without pgxn
  - Search GitHub? Google? Mailing list?
  - GitHub README?
  - git clone; make; make install;
  - psql -c "CREATE EXTENSION IF NOT EXISTS"
  - psql -c "DROP EXTENSION IF EXISTS"
  - make uninstall?
UDFs & Operators: pg_similarity
- Near to our capabilities
- Similarity coefficient algorithms
  - L1 Distance
  - Cosine Distance
  - Dice Coefficient
  - Euclidean Distance
  - Hamming Distance
  - Jaccard Coefficient
  - Jaro Distance
  - Jaro-Winkler Distance
  - Levenshtein Distance
  - Matching Coefficient
  - Monge-Elkan Coefficient
  - Needleman-Wunsch Coefficient
  - Overlap Coefficient
  - Q-Gram Distance
  - Smith-Waterman Coefficient
  - Smith-Waterman-Gotoh Coefficient
  - Soundex Distance
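
To make two of the measures listed above concrete, here is a minimal pure-Python sketch of Levenshtein distance and the Jaccard coefficient. It only illustrates what the measures compute and is not pg_similarity's implementation; in particular, the whitespace tokenization used for Jaccard is an assumption, and the extension's own tokenizer and normalization may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ca
                    current[j - 1] + 1,      # insert cb
                    previous[j - 1] + cost,  # substitute (or keep) the character
                )
            )
        previous = current
    return previous[-1]


def jaccard(a: str, b: str) -> float:
    """Jaccard coefficient over lower-cased, whitespace-separated tokens (assumed tokenizer)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty strings are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)
```

For example, `levenshtein("kitten", "sitting")` counts three edits, while `jaccard("big data", "big query")` compares token sets and shares one token out of three.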
UDFs & Operators: pg_similarity
- Near to our skill set
UDFs & Operators: pg_similarity
- Implementation
UDAs & Data Types: postgresql-hll
- Near to our capabilities & near to our skill set
- Data type
- Estimate count distinct with tunable precision
- A 1,280-byte hll can estimate the count of tens of billions of distinct values with only a few percent error
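
The idea behind an estimated count distinct with tunable precision can be sketched in a few lines of Python. This is a simplified HyperLogLog written for illustration, not postgresql-hll's code or storage format: the class name, the SHA-1 hash, and the default of 2^11 registers are assumptions, and registers are kept in a plain list rather than bit-packed. For scale, 2,048 registers packed at 5 bits each would occupy 1,280 bytes.

```python
import hashlib
import math


class HyperLogLog:
    """Simplified HyperLogLog sketch; p sets precision (2**p registers)."""

    def __init__(self, p: int = 11):
        if not 7 <= p <= 16:
            raise ValueError("p must be between 7 and 16 for this sketch")
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, value: str) -> None:
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")       # 64-bit hash
        index = h >> (64 - self.p)                  # top p bits choose a register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # rank = 1-based position of the leftmost 1-bit in the remaining bits
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def merge(self, other: "HyperLogLog") -> "HyperLogLog":
        """Union of two sketches: element-wise maximum of the registers."""
        if self.p != other.p:
            raise ValueError("sketches must use the same precision")
        merged = HyperLogLog(self.p)
        merged.registers = [max(x, y) for x, y in zip(self.registers, other.registers)]
        return merged

    def cardinality(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)       # bias constant for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # small-range correction: fall back to linear counting
            return self.m * math.log(self.m / zeros)
        return raw
```

Because a merge is just a register-wise maximum, sketches built per day or per shard can be combined later without revisiting the raw rows, which is what makes the structure a good fit for a user-defined aggregate.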
UDAs & Data Types: postgresql-hll
UDAs & Data Types: postgresql-hll
- Implementation
Foreign Data Wrappers: API
Foreign Data Wrappers: multicorn
- Near to our skill set
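
The slide's original illustration did not survive, so the following is a stand-in sketch of why Multicorn is "near to our skill set": a foreign data wrapper is an ordinary Python class. The `CounterFdw` class, its `limit` option, and its `n`/`square` columns are hypothetical. The `ForeignDataWrapper` base class with `__init__(options, columns)` and `execute(quals, columns)` follows Multicorn's documented interface, and the fallback stub is there only so the example runs outside PostgreSQL.

```python
try:
    from multicorn import ForeignDataWrapper
except ImportError:
    class ForeignDataWrapper:
        """Stand-in base class so the sketch runs without Multicorn installed."""

        def __init__(self, options, columns):
            pass


class CounterFdw(ForeignDataWrapper):
    """Hypothetical wrapper exposing the integers 0..limit-1 and their squares."""

    def __init__(self, options, columns):
        super().__init__(options, columns)
        # Options arrive as strings from the foreign table definition.
        self.limit = int(options.get("limit", "5"))

    def execute(self, quals, columns):
        # quals (pushed-down filters) are ignored in this sketch;
        # PostgreSQL re-checks WHERE conditions on the returned rows.
        for n in range(self.limit):
            row = {"n": n, "square": n * n}
            # Return only the columns the query asked for.
            yield {name: row.get(name) for name in columns}
```

Inside PostgreSQL, a class like this would typically be referenced through the `wrapper` option of a server created with the `multicorn` foreign data wrapper, after which the foreign table can be queried with plain SQL.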
Foreign Data Wrappers: pgosquery
- Near at hand
Indexes: ZomboDB
- Index Access Method API
- http://www.postgresql.org/docs/9.4/static/indexam.html
Composing Extension Methods: MADlib
- Near to our capabilities
Composing Extension Methods: MADlib
- Near to our skill set
Composing Extension Methods: MADlib
Parallel Processing
- Parallel sequential scan
  - http://rhaas.blogspot.com/2015/11/parallel-sequential-scan-is-committed.html
- Columnar FDW
  - https://github.com/citusdata/cstore_fdw
Composing Extensions: Alps
Composing Extensions: Record Linking
Beyond Analytics
- Web app framework
  - http://blog.aquameta.com/
- REST API
  - https://github.com/begriffs/postgrest
- Unit testing framework
  - http://pgtap.org/
- Firewall
  - https://github.com/uptimejp/sql_firewall
- More every week!
Conclusion
- With PostgreSQL, you get
  - more than rows and columns
  - more than SELECT, FROM, WHERE, GROUP BY, ORDER BY
  - more than a single machine
- Make sure you get the full return on your investment!
Get your Chartio free trial!
sales@chartio.com
(855) 232-0320

More Related Content

PPTX
Producing and Analyzing Rich Data with PostgreSQL
PPTX
Learn How to Run Python on Redshift
PDF
Apache Arrow Flight: A New Gold Standard for Data Transport
PDF
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
PDF
Apache Arrow: Cross-language Development Platform for In-memory Data
PPTX
Future of pandas
PDF
Ursa Labs and Apache Arrow in 2019
PDF
Spark Meetup Amsterdam - Dealing with Bad Actors in ETL, Databricks
Producing and Analyzing Rich Data with PostgreSQL
Learn How to Run Python on Redshift
Apache Arrow Flight: A New Gold Standard for Data Transport
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
Apache Arrow: Cross-language Development Platform for In-memory Data
Future of pandas
Ursa Labs and Apache Arrow in 2019
Spark Meetup Amsterdam - Dealing with Bad Actors in ETL, Databricks

What's hot (20)

PDF
ODI11g, Hadoop and "Big Data" Sources
PPTX
BDM9 - Comparison of Oracle RDBMS and Cloudera Impala for a hospital use case
PPTX
BDM8 - Near-realtime Big Data Analytics using Impala
PPTX
HBase and Drill: How loosley typed SQL is ideal for NoSQL
PDF
OpenStack Trove Day (19 Aug 2014, Cambridge MA) - Sahara
PDF
Apache Arrow at DataEngConf Barcelona 2018
PDF
ACM TechTalks : Apache Arrow and the Future of Data Frames
PDF
Apache Arrow -- Cross-language development platform for in-memory data
PDF
DataFrames: The Good, Bad, and Ugly
PDF
Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)
PDF
Apache Arrow: Present and Future @ ScaledML 2020
PDF
introduction to Neo4j (Tabriz Software Open Talks)
PDF
Apache Arrow: Leveling Up the Analytics Stack
PPT
Hadoop Frameworks Panel__HadoopSummit2010
PDF
Apache Arrow Workshop at VLDB 2019 / BOSS Session
PDF
PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"
PDF
From flat files to deconstructed database
PDF
Performance of Spark vs MapReduce
PPTX
Apache drill
ODP
The other Apache Technologies your Big Data solution needs
ODI11g, Hadoop and "Big Data" Sources
BDM9 - Comparison of Oracle RDBMS and Cloudera Impala for a hospital use case
BDM8 - Near-realtime Big Data Analytics using Impala
HBase and Drill: How loosley typed SQL is ideal for NoSQL
OpenStack Trove Day (19 Aug 2014, Cambridge MA) - Sahara
Apache Arrow at DataEngConf Barcelona 2018
ACM TechTalks : Apache Arrow and the Future of Data Frames
Apache Arrow -- Cross-language development platform for in-memory data
DataFrames: The Good, Bad, and Ugly
Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)
Apache Arrow: Present and Future @ ScaledML 2020
introduction to Neo4j (Tabriz Software Open Talks)
Apache Arrow: Leveling Up the Analytics Stack
Hadoop Frameworks Panel__HadoopSummit2010
Apache Arrow Workshop at VLDB 2019 / BOSS Session
PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"
From flat files to deconstructed database
Performance of Spark vs MapReduce
Apache drill
The other Apache Technologies your Big Data solution needs
Ad

Similar to Using the PostgreSQL Extension Ecosystem for Advanced Analytics (20)

PDF
PostgreSQL Extension APIs are Changing the Face of Relational Databases | PGC...
PDF
Must Know Postgres Extension for DBA and Developer during Migration
PDF
Beyond Postgres: Interesting Projects, Tools and forks
PDF
Postgresql Up And Running Regina Obe Leo Hsu
PDF
Postgres в основе вашего дата-центра, Bruce Momjian (EnterpriseDB)
PDF
Useful PostgreSQL Extensions
 
PDF
Making Postgres Central in Your Data Center
 
PDF
Learning postgresql
PDF
Building a Complex, Real-Time Data Management Application
PDF
Making Postgres Central in Your Data Center
 
PDF
Making.postgres.central.2015
 
PDF
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
PDF
Open Source SQL Databases
PPTX
Postgres for Digital Transformation: NoSQL Features, Replication, FDW & More
PDF
PostgreSQL - Case Study
PDF
PostgreSQL, Extensible to the Nth Degree: Functions, Languages, Types, Rules,...
PPT
Postgres for the Future
 
KEY
Releasing PostgreSQL Extension on PGXN
PPTX
Chjkkkkkkkkkkkkkkkkkjjjjjjjjjjjjjjjjjjjjjjjjjj01_The Basics.pptx
PDF
TechEvent 2019: Oracle to PostgreSQL - a Travel Guide from Practice; Roland S...
PostgreSQL Extension APIs are Changing the Face of Relational Databases | PGC...
Must Know Postgres Extension for DBA and Developer during Migration
Beyond Postgres: Interesting Projects, Tools and forks
Postgresql Up And Running Regina Obe Leo Hsu
Postgres в основе вашего дата-центра, Bruce Momjian (EnterpriseDB)
Useful PostgreSQL Extensions
 
Making Postgres Central in Your Data Center
 
Learning postgresql
Building a Complex, Real-Time Data Management Application
Making Postgres Central in Your Data Center
 
Making.postgres.central.2015
 
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
Open Source SQL Databases
Postgres for Digital Transformation: NoSQL Features, Replication, FDW & More
PostgreSQL - Case Study
PostgreSQL, Extensible to the Nth Degree: Functions, Languages, Types, Rules,...
Postgres for the Future
 
Releasing PostgreSQL Extension on PGXN
Chjkkkkkkkkkkkkkkkkkjjjjjjjjjjjjjjjjjjjjjjjjjj01_The Basics.pptx
TechEvent 2019: Oracle to PostgreSQL - a Travel Guide from Practice; Roland S...
Ad

More from Chartio (6)

PPTX
Rethinking Your Ad Spend: 5 Tips for intelligent digital advertising
PPTX
How To Drive Exponential Growth Using Unconventional Data Sources
PPTX
7 Reasons You Haven't Reached Hyper-Growth
PPTX
Redshift Chartio Event Presentation
PPTX
The Vital Metrics Every Sales Team Should Be Measuring
PDF
CSV and XLS Best Practices
Rethinking Your Ad Spend: 5 Tips for intelligent digital advertising
How To Drive Exponential Growth Using Unconventional Data Sources
7 Reasons You Haven't Reached Hyper-Growth
Redshift Chartio Event Presentation
The Vital Metrics Every Sales Team Should Be Measuring
CSV and XLS Best Practices

Recently uploaded (20)

PPTX
Online Work Permit System for Fast Permit Processing
PPTX
Agentic AI : A Practical Guide. Undersating, Implementing and Scaling Autono...
PDF
How to Choose the Right IT Partner for Your Business in Malaysia
PPTX
CHAPTER 2 - PM Management and IT Context
PDF
Audit Checklist Design Aligning with ISO, IATF, and Industry Standards — Omne...
PDF
SAP S4 Hana Brochure 3 (PTS SYSTEMS AND SOLUTIONS)
PDF
T3DD25 TYPO3 Content Blocks - Deep Dive by André Kraus
PDF
Which alternative to Crystal Reports is best for small or large businesses.pdf
PPTX
Lecture 3: Operating Systems Introduction to Computer Hardware Systems
PDF
Navsoft: AI-Powered Business Solutions & Custom Software Development
PDF
Digital Strategies for Manufacturing Companies
PDF
Odoo Companies in India – Driving Business Transformation.pdf
PDF
Upgrade and Innovation Strategies for SAP ERP Customers
PDF
Flood Susceptibility Mapping Using Image-Based 2D-CNN Deep Learnin. Overview ...
PDF
Raksha Bandhan Grocery Pricing Trends in India 2025.pdf
PPTX
VVF-Customer-Presentation2025-Ver1.9.pptx
PPTX
Agentic AI Use Case- Contract Lifecycle Management (CLM).pptx
PPTX
Operating system designcfffgfgggggggvggggggggg
PDF
Understanding Forklifts - TECH EHS Solution
PDF
Internet Downloader Manager (IDM) Crack 6.42 Build 42 Updates Latest 2025
Online Work Permit System for Fast Permit Processing
Agentic AI : A Practical Guide. Undersating, Implementing and Scaling Autono...
How to Choose the Right IT Partner for Your Business in Malaysia
CHAPTER 2 - PM Management and IT Context
Audit Checklist Design Aligning with ISO, IATF, and Industry Standards — Omne...
SAP S4 Hana Brochure 3 (PTS SYSTEMS AND SOLUTIONS)
T3DD25 TYPO3 Content Blocks - Deep Dive by André Kraus
Which alternative to Crystal Reports is best for small or large businesses.pdf
Lecture 3: Operating Systems Introduction to Computer Hardware Systems
Navsoft: AI-Powered Business Solutions & Custom Software Development
Digital Strategies for Manufacturing Companies
Odoo Companies in India – Driving Business Transformation.pdf
Upgrade and Innovation Strategies for SAP ERP Customers
Flood Susceptibility Mapping Using Image-Based 2D-CNN Deep Learnin. Overview ...
Raksha Bandhan Grocery Pricing Trends in India 2025.pdf
VVF-Customer-Presentation2025-Ver1.9.pptx
Agentic AI Use Case- Contract Lifecycle Management (CLM).pptx
Operating system designcfffgfgggggggvggggggggg
Understanding Forklifts - TECH EHS Solution
Internet Downloader Manager (IDM) Crack 6.42 Build 42 Updates Latest 2025

Using the PostgreSQL Extension Ecosystem for Advanced Analytics

  • 1. sales@chartio.com (855) 232-0320 sales@chartio.com (855) 232-0320 Using the PostgreSQL Extension Ecosystem for Advanced Analytics
  • 2. sales@chartio.com (855) 232-0320 - The problem - The prevailing view vs. the practical reality - A possible solution - Or just building blocks? - Nearness - Near at hand, near to our skill set, near to our capabilities - A more complete solution - The PostgreSQL extension ecosystem Agenda
  • 3. sales@chartio.com (855) 232-0320 sales@chartio.com (855) 232-0320 The Problem The Prevailing View vs. The Practical Reality
  • 4. sales@chartio.com (855) 232-0320 The Prevailing View - Logical Dimension Relational Non-Relational Schema objects ● Structured rows and columns ● Schema on write ● Referential integrity ● Painful migrations ● Unstructured files, docs, etc ● Schema on read ● No referential integrity ● No migrations Query languages ● SQL ● Declarative ● Easy enough for non-tech users ● Various ● Procedural ● Requires some programming skills Exploratory analysis ● Native support for joins ● Interactive/low execution overhead ● No native support for joins ● OLAP - Batch processing Data science and ML ● Only descriptive statistics ● Requires exporting dumps/samples ● Robust ecosystem ● Does not require exports
  • 5. sales@chartio.com (855) 232-0320 The Prevailing View - Physical Dimension Relational Non-Relational Parallel query processing ● Single node system ● Single process per query ● Multiple node system ● Multiple processes per query Concurrency ● High concurrency ● Single process per connection ● OLAP - low concurrency/high scheduling overhead High Availability & Replication ● Async and sync replication ● HA may not be native ● Async and sync replication ● HA likely to be native Sharding ● Sharding may not be native ● Difficult to manage ● Sharding likely to be native ● Easy to manage
  • 6. sales@chartio.com (855) 232-0320 The Prevailing View - Summary - RDBMS have nice properties for producing rich data - ACID, relational integrity, constraints, strong data types - Easier for non-tech users and exploratory analysis - Probably don’t meet the needs of today’s analysts - Data science & Machine Learning - Parallel processing - Definitely don’t meet the needs of today’s apps - Schema migrations - Replication and sharding
  • 8. sales@chartio.com (855) 232-0320 sales@chartio.com (855) 232-0320 But we still want more advanced functionality. The Practical Reality
  • 9. sales@chartio.com (855) 232-0320 sales@chartio.com (855) 232-0320 A Possible Solution Or Just Building Blocks?
  • 10. sales@chartio.com (855) 232-0320 Modern SQL - Many people still think of SQL in terms of SQL-92 - Since then we’ve had: SQL:1999, SQL:2003, SQL:2006, SQL:2008, SQL:2011 - http://use-the-index-luke.com/blog/2015-02/modern-sql - Common Table Expressions (CTEs) / Recursive CTEs - Window Functions - Ordered-set Aggregates - Lateral joins - Temporal support - The list goes on...
  • 11. sales@chartio.com (855) 232-0320 Procedural Languages - Native pgSQL Tcl Perl Python - Community Java PHP R Javascript Ruby Scheme sh
  • 12. sales@chartio.com (855) 232-0320 sales@chartio.com (855) 232-0320 These solve some problems. For others, they are just building blocks. Building Blocks
  • 13. sales@chartio.com (855) 232-0320 sales@chartio.com (855) 232-0320 Nearness Near at Hand Near to Our Skill Set Near to Our Capabilities
  • 15. sales@chartio.com (855) 232-0320 - Near at hand - Easily installable - Near to our skill set - Familiar tool/language/abstraction - Modular and composable - Near to our capabilities - Capable of solving a problem in our domain Nearness Drives Adoption
  • 16. sales@chartio.com (855) 232-0320 sales@chartio.com (855) 232-0320 A More Complete Solution The PostgreSQL Extension Ecosystem
  • 17. sales@chartio.com (855) 232-0320 Postgres Extension Ecosystem Examples - PostgreSQL Extension Network: http://pgxn.org/ - UDFs & operators: https://github.com/eulerto/pg_similarity - UDAs & data types: https://github.com/aggregateknowledge/postgresql-hll - Foreign Data Wrappers: http://multicorn.org/, https://github.com/shish/pgosquery - Indexes: https://github.com/zombodb/zombodb - Composing Extension Methods: http://doc.madlib.net/ - MPP: https://www.citusdata.com/, https://github.com/greenplum-db/gpdb - Composing Extensions - Custom Background Workers: https://github.com/no0p/alps - Record linking: http://no0p.github.io/2015/10/20/record_linking.html#/
  • 18. sales@chartio.com (855) 232-0320 Postgres Extension Ecosystem Examples - PostgreSQL Extension Network: http://pgxn.org/ - UDFs & operators: https://github.com/eulerto/pg_similarity - UDAs & data types: https://github.com/aggregateknowledge/postgresql-hll - Foreign Data Wrappers: http://multicorn.org/, https://github.com/shish/pgosquery - Indexes: https://github.com/zombodb/zombodb - Composing Extension Methods: http://doc.madlib.net/ - MPP: https://www.citusdata.com/, https://github.com/greenplum-db/gpdb - Composing Extensions - Custom Background Workers: https://github.com/no0p/alps - Record linking: http://no0p.github.io/2015/10/20/record_linking.html#/
  • 19. sales@chartio.com (855) 232-0320 - Package Manager: pgxn - Index/Network: http://pgxn.org/ - PyPI, RubyGems, CPAN, CRAN The PostgreSQL Extension Network
  • 20. sales@chartio.com (855) 232-0320 The PostgreSQL Extension Network - Near at hand - pgxn search semver - pgxn info semver - pgxn install semver - pgxn load –d somedb semver - pgxn unload –d somedb semver - pgxn uninstall semver - Search github? google? mailing list? - Github README? - git clone; make; make install; - psql –c “CREATE EXTENSION IF NOT EXISTS” - psql –c “DROP EXTENSION IF EXISTS” - make uninstall?
  • 21. sales@chartio.com (855) 232-0320 Postgres Extension Ecosystem Examples - PostgreSQL Extension Network: http://pgxn.org/ - UDFs & operators: https://github.com/eulerto/pg_similarity - UDAs & data types: https://github.com/aggregateknowledge/postgresql-hll - Foreign Data Wrappers: http://multicorn.org/, https://github.com/shish/pgosquery - Indexes: https://github.com/zombodb/zombodb - Composing Extension Methods: http://doc.madlib.net/ - MPP: https://www.citusdata.com/, https://github.com/greenplum-db/gpdb - Composing Extensions - Custom Background Workers: https://github.com/no0p/alps - Record linking: http://no0p.github.io/2015/10/20/record_linking.html#/
  • 22. sales@chartio.com (855) 232-0320 UDFs & Operators: pg_similarity - Near to our capabilities - Similarity coefficient algorithms - L1 Distance - Cosine Distance - Dice Coefficient - Euclidean Distance - Hamming Distance - Jaccard Coefficient - Jaro Distance - Jaro-Winkler Distance - Levenshtein Distance - Matching Coefficient - Monge-Elkan Coefficient - Needleman-Wunsch Coefficient - Overlap Coefficient - Q-Gram Distance - Smith-Waterman Coefficient - Smith-Waterman-Gotoh Coefficient - Soundex Distance
  • 23. sales@chartio.com (855) 232-0320 UDFs & Operators: pg_similarity - Near to our skill set
  • 24. sales@chartio.com (855) 232-0320 UDFs & Operators: pg_similarity - Implementation
  • 25. sales@chartio.com (855) 232-0320 Postgres Extension Ecosystem Examples - PostgreSQL Extension Network: http://pgxn.org/ - UDFs & Operators: https://github.com/eulerto/pg_similarity - UDAs & Data Types: https://github.com/aggregateknowledge/postgresql-hll - Foreign Data Wrappers: http://multicorn.org/, https://github.com/shish/pgosquery - Indexes: https://github.com/zombodb/zombodb - Composing Extension Methods: http://doc.madlib.net/ - MPP: https://www.citusdata.com/, https://github.com/greenplum-db/gpdb - Composing Extensions - Custom Background Workers: https://github.com/no0p/alps - Record linking: http://no0p.github.io/2015/10/20/record_linking.html#/
  • 26. UDAs & Data Types: postgresql-hll - Near to our capabilities & near to our skill set
      - Data type
      - Estimate count distinct with tunable precision
      - 1280 bytes can estimate tens of billions of distinct values with an error of a few percent
  • 27. UDAs & Data Types: postgresql-hll
  • 28. UDAs & Data Types: postgresql-hll - Implementation
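
The idea behind the hll data type can be sketched in a few lines of Python. This is a simplified HyperLogLog, not the postgresql-hll implementation: the class name, the use of SHA-1 as the hash, and the `log2m` parameter name are assumptions for illustration. With `log2m = 11` there are 2048 registers; at 5 bits per register that would be 10240 bits, or 1280 bytes, which is consistent with the figure on the slide. The Python list used here is not packed that tightly.

```python
import hashlib
import math


class HyperLogLog:
    """Simplified HyperLogLog sketch for estimating count distinct."""

    def __init__(self, log2m: int = 11):
        self.p = log2m                 # bits used to pick a register
        self.m = 1 << log2m            # number of registers
        self.registers = [0] * self.m

    def add(self, value) -> None:
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")      # 64-bit hash
        idx = h >> (64 - self.p)                   # top p bits: register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining bits
        # Rank: position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other: "HyperLogLog") -> None:
        """Union of two sketches: element-wise maximum of registers."""
        if other.m != self.m:
            raise ValueError("sketches must use the same log2m")
        self.registers = [max(a, b)
                          for a, b in zip(self.registers, other.registers)]

    def cardinality(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)      # constant for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


if __name__ == "__main__":
    sketch = HyperLogLog()
    for user_id in range(100_000):
        sketch.add(user_id)
    print(round(sketch.cardinality()))
```

The expected relative error of this estimator is roughly 1.04 / sqrt(m), about 2.3% for 2048 registers, and raising `log2m` trades memory for precision. Because merging is just an element-wise maximum, sketches stored per day or per segment can be combined later without rescanning the raw rows.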
  • 31. Foreign Data Wrappers: multicorn - Near to our skill set
  • 32. Foreign Data Wrappers: pgosquery - Near at hand
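
Multicorn lets a foreign data wrapper be written as a Python class, which is what puts it near to a typical analyst's skill set. The sketch below shows the general shape, assuming Multicorn's `ForeignDataWrapper` base class with its `__init__(options, columns)` and `execute(quals, columns)` methods. The `StaticRowsFdw` class, its `rows` option, and the `id`/`label` columns are hypothetical, and the fallback base class exists only so the sketch runs outside PostgreSQL.

```python
try:
    # Available when running inside a Multicorn-enabled PostgreSQL.
    from multicorn import ForeignDataWrapper
except ImportError:
    class ForeignDataWrapper:
        """Stand-in base class so the sketch runs without Multicorn."""

        def __init__(self, options, columns):
            pass


class StaticRowsFdw(ForeignDataWrapper):
    """Hypothetical wrapper that exposes generated rows as a table."""

    def __init__(self, options, columns):
        super().__init__(options, columns)
        # Options arrive as strings from the foreign table definition.
        self.row_count = int(options.get("rows", 3))

    def execute(self, quals, columns):
        # quals describe WHERE conditions; this sketch ignores them and
        # leaves filtering to PostgreSQL.
        for i in range(self.row_count):
            row = {"id": i, "label": "row-%d" % i}
            # Yield only the columns the query asked for.
            yield {name: row.get(name) for name in columns}


if __name__ == "__main__":
    fdw = StaticRowsFdw({"rows": "2"}, ["id", "label"])
    for row in fdw.execute([], ["id", "label"]):
        print(row)
```

A real wrapper would replace the generated rows with calls to an external source such as a file, an API, or another database, and could inspect `quals` to push filters down to that source.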
  • 34. Indexes: ZomboDB
      - Index Access Method API: http://www.postgresql.org/docs/9.4/static/indexam.html
  • 35. Postgres Extension Ecosystem Examples
      - PostgreSQL Extension Network: http://pgxn.org/
      - UDFs & Operators: https://github.com/eulerto/pg_similarity
      - UDAs & Data Types: https://github.com/aggregateknowledge/postgresql-hll
      - Foreign Data Wrappers: http://multicorn.org/, https://github.com/shish/pgosquery
      - Indexes (GiST, GIN): https://github.com/zombodb/zombodb
      - Composing Extension Methods: http://doc.madlib.net/
      - MPP: https://www.citusdata.com/, https://github.com/greenplum-db/gpdb
      - Composing Extensions
          - Custom Background Workers: https://github.com/no0p/alps
          - Record linking: http://no0p.github.io/2015/10/20/record_linking.html#/
  • 36. Composing Extension Methods: MADlib - Near to our capabilities
  • 37. Composing Extension Methods: MADlib - Near to our skill set
  • 40. Parallel Processing
      - Parallel sequential scan: http://rhaas.blogspot.com/2015/11/parallel-sequential-scan-is-committed.html
      - Columnar FDW: https://github.com/citusdata/cstore_fdw
  • 44. Beyond Analytics
      - Web app framework: http://blog.aquameta.com/
      - REST API: https://github.com/begriffs/postgrest
      - Unit testing framework: http://pgtap.org/
      - Firewall: https://github.com/uptimejp/sql_firewall
      - More every week!
  • 45. Conclusion
      - With PostgreSQL, you get
          - more than rows and columns
          - more than SELECT, FROM, WHERE, GROUP BY, ORDER BY
          - more than a single machine
      - Make sure you get the full return on your investment!