PostgreSQL – Tomasz Borek
Teaching PostgreSQL to new people
@LAFK_pl
Consultant @
About me
@LAFK_pl
Consultant @
Tomasz Borek
What will I tell you?
● About me (done)
● Show of hands
● Who „new people” might be
– And usually – in my case – are
● About teaching
– Comfort zone, learners, stepping back
● Chosen approaches, features, gotchas and the like
● Why, why, why
● And yes, this’ll be about Postgres, but in an unusual way
Show of hands
● Developers (not PL/SQL ones)
● Developers (PL/SQL ones)
● DBA (Admin, Architect)
● DevOps
● SysAdmin
● Trainers / consultants
● Other?
„New” people
Surprisingly
● Often your colleagues
● Sometimes older
● Sometimes more senior
● Experienced
● With success under their belts
● Basically: FORMED already
– Or MADE, if you will
Developers are problem solvers
● Your colleagues have certain problems
● Is Postgres the solution?
– Or „a solution” at least?
● And how steep is the learning curve?
– Including the time it takes
Developers are not SQL people!
● Not many know JOINs very well
● Not many know how indexes work
● Not many know the weaknesses of indexes
● CTEs, window functions, procedures, cursors…
● They „omit” this
● Comfort zone is nice
Do not abandon them
Or they’ll abandon you
Do not abandon them
● Docs
● Materials
● Tools
● Links to good content
● Pictures, pictures, pictures
● They can edit / comment (Wiki)
● Your (and your colleagues') time
Teaching
What is YOUR problem?
● DBA wanting respite for your DB?
● Malpractice in SQL queries?
● Why don’t they use XYZ feature?
● From tomorrow on, teach them some SQL
● Migration from X to Postgres
● Guidelines creation
Xun Kuang once said
不闻不若闻之,闻之不若见之,见之不若知之,知之不若行之
“Not having heard something is not as good as
having heard it; having heard it is not as good as
having seen it; having seen it is not as good as
knowing it; knowing it is not as good as putting it
into practice.”
Xunzi book 8: Ruxiao, chapter 11
A paraphrase of Xun Kuang would be
不闻不若闻之,闻之不若见之,见之不若知之,知之不若行之
“Not having heard something < having heard it;
having heard it < having seen it;
having seen it < knowing it;
knowing it < putting it into practice.”
Xunzi book 8: Ruxiao, chapter 11
How do they learn?
● "Practice makes perfect"
– Except it doesn't
● Learning styles
● Docs still relevant
– If well placed, accessible and easy to get into
Repetitio est mater studiorum
● Crash course
● Workshop
● Problem solving on their own
● Docs to help
● Code reviews
Comfort zone
● Setup / install
● Moving around
● Logs, timing queries
● EXPLAIN + ANALYZE
● Indexes
● PL/pgSQL and variants
● NoSQL + XML
Chosen features, gotchas etc.
so
How to teach Postgres?
In short
● History – battle-tested, feature-rich, used
● Basics – moving around, commands, etc.
● Prepare your bait accordingly
– My faves
– Advanced features
– NoSQL angle
– …
● Don’t just drink the KoolAid!
Battle-tested
● Maturing since 1987
● Comes in many flavours (forks)
● Largest cluster – 2 PB at Yahoo
● Skype, NASA, Instagram
● Stable:
– Many years on one version
– Good version support
– Every year something new
– Follows ANSI SQL standards
https://www.postgresql.org/about/users/
In-/Postgres forks
Support?
Great angles
● Procedures: Java, Perl, Python, CTEs...
● Enterprise / NoSQL - handles XMLs and JSONs
● Index power – spatial or geo or your own
● CTEs and FDWs => great ETL or µservice
● Pure dev: error reporting / logging, MVCC (dirty read gone), own index, plenty of data types, Java/Perl/… inside
● Solid internals: processes, sec built-in,
Basics
● Setup
● Psql
– Moving around
– What’s in
● Indexes
● Joins
● Query path
● Explain, Explain Analyze
Query Path
http://www.slideshare.net/SFScon/sfscon15-peter-moser-the-path-of-a-query-postgresql-internals
Parser
● Syntax checks, e.g. FRIM is not a keyword
– SELECT * FRIM myTable;
● Catalog lookup
– myTable may not exist
● In the end, a query tree is built
– Query tokenization: SELECT (keyword), employeeName (field id), count (function call)...
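The tokenization step can be illustrated with a toy tokenizer. This is only a sketch in Python and not how PostgreSQL's parser works (the real one is generated from a full grammar); the `KEYWORDS` set and the helpers `classify` and `looks_like_select` are made up for illustration.

```python
import re

# Tiny, hypothetical keyword list; PostgreSQL's real list is far longer.
KEYWORDS = {"SELECT", "FROM", "WHERE", "GROUP", "BY"}

# Words (names or keywords) and a handful of punctuation symbols.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z_0-9]*|[*,();]")


def classify(sql):
    """Label each token as keyword, function call, identifier or symbol."""
    tokens = TOKEN_RE.findall(sql)
    labelled = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        if tok.upper() in KEYWORDS:
            labelled.append((tok, "keyword"))
        elif tok[0].isalpha() or tok[0] == "_":
            # A name directly followed by "(" is treated as a function call.
            kind = "function call" if nxt == "(" else "identifier"
            labelled.append((tok, kind))
        else:
            labelled.append((tok, "symbol"))
    return labelled


def looks_like_select(sql):
    """Very rough check: True only if the first two keywords are SELECT, FROM."""
    words = [tok.upper() for tok, kind in classify(sql) if kind == "keyword"]
    return words[:2] == ["SELECT", "FROM"]


print(classify("SELECT employeeName, count(*) FROM emp"))
print(looks_like_select("SELECT * FRIM myTable;"))
```

Because FRIM is not in the keyword set, it is labelled as an ordinary identifier and the rough check fails, which mirrors the kind of syntax error the real parser reports.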
Grammar and a query tree
Planner
● Where the plan tree is built
● Where the best execution strategy is decided upon
– Seq or index scan? Index or bitmap index?
– Which join order?
– Which join strategy (nested loop, hash, merge)?
– Inner or outer?
– Aggregation: plain, hashed, sorted…
● Heuristics, if enumerating all plans is too costly
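To see why the planner cannot always enumerate every plan, here is a small Python sketch that counts possible join orders. It assumes left-deep plans only and ignores the choice of join strategy and scan type, so the real search space is larger; `left_deep_join_orders` is a hypothetical helper. In PostgreSQL the genetic query optimizer takes over once a query reaches `geqo_threshold` FROM items (12 by default).

```python
import math


def left_deep_join_orders(tables):
    """Number of left-deep join orders for `tables` relations (n factorial)."""
    return math.factorial(tables)


# The count grows factorially with the number of joined tables.
for n in (2, 4, 8, 12):
    print(n, left_deep_join_orders(n))
```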
Full query path
Explaining EXPLAIN - what
EXPLAIN SELECT * FROM tenk1;
QUERY PLAN
------------------------------------------------------------
Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244)
● Startup cost (0.00) – estimated time expended before the output phase can begin
● Total cost (458.00) – in arbitrary units, conventionally sequential page fetches; assumes the node is run to completion
● Rows – estimated number of rows output by the node (not scanned); LIMIT etc. may stop it early
● Width – estimated average width of rows output by that node (in bytes)
Explaining EXPLAIN - how
EXPLAIN SELECT * FROM tenk1;
QUERY PLAN
------------------------------------------------------------
Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244)
SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1'; -- relpages = 358, reltuples = 10000
● No WHERE, no index
● Cost = disk pages read * seq_page_cost + rows scanned * cpu_tuple_cost
● 358 * 1.0 + 10000 * 0.01 = 458 (with default settings)
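The arithmetic above fits in a few lines of Python. This is a sketch of the formula for this simple case only (a sequential scan with no WHERE clause); the function name `seq_scan_cost` is made up, and the two defaults mirror PostgreSQL's default `seq_page_cost` and `cpu_tuple_cost` settings.

```python
def seq_scan_cost(relpages, reltuples, seq_page_cost=1.0, cpu_tuple_cost=0.01):
    """Estimated total cost of a plain sequential scan with no WHERE clause."""
    return relpages * seq_page_cost + reltuples * cpu_tuple_cost


# relpages and reltuples as reported by pg_class for tenk1 in the example.
print(seq_scan_cost(358, 10000))
```

Changing either cost setting changes the estimate, which is why the numbers are only comparable within one configuration.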
Analyzing EXPLAIN ANALYZE
EXPLAIN ANALYZE SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=4.65..118.62 rows=10 width=488) (actual time=0.128..0.377 rows=10 loops=1)
  -> Bitmap Heap Scan on tenk1 t1 (cost=4.36..39.47 rows=10 width=244) (actual time=0.057..0.121 rows=10 loops=1)
       Recheck Cond: (unique1 < 10)
       -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0) (actual time=0.024..0.024 rows=10 loops=1)
            Index Cond: (unique1 < 10)
  -> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..7.91 rows=1 width=244) (actual time=0.021..0.022 rows=1 loops=10)
       Index Cond: (unique2 = t1.unique2)
Planning time: 0.181 ms
Execution time: 0.501 ms
● Actually runs the query
● More info: actual times, rows removed by filter, sort method used, disk/memory used...
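One detail that is easy to miss when reading such a plan: for a node executed several times (loops > 1), the actual time and rows shown are per-loop averages. The Python sketch below pulls the numbers out of one plan line and multiplies by loops; the regular expression and the helper `summarize` are illustrative assumptions that only handle lines shaped like the ones in this example.

```python
import re

# Matches "(cost=a..b rows=n width=w) (actual time=x..y rows=m loops=k)".
NODE_RE = re.compile(
    r"\(cost=(?P<startup>\d+\.\d+)\.\.(?P<total>\d+\.\d+) "
    r"rows=(?P<est_rows>\d+) width=\d+\) "
    r"\(actual time=(?P<first>\d+\.\d+)\.\.(?P<last>\d+\.\d+) "
    r"rows=(?P<rows>\d+) loops=(?P<loops>\d+)\)"
)


def summarize(line):
    """Return estimates vs. actuals for one plan node, or None if no match."""
    m = NODE_RE.search(line)
    if m is None:
        return None
    loops = int(m.group("loops"))
    return {
        "estimated_rows": int(m.group("est_rows")),
        "actual_rows_per_loop": int(m.group("rows")),
        "loops": loops,
        # Per-loop average time multiplied by loops gives the node total.
        "total_ms": float(m.group("last")) * loops,
    }


line = ("-> Index Scan using tenk2_unique2 on tenk2 t2 "
        "(cost=0.29..7.91 rows=1 width=244) "
        "(actual time=0.021..0.022 rows=1 loops=10)")
print(summarize(line))
```

Comparing `estimated_rows` with the actual row count is usually the first thing to check: a large mismatch suggests stale statistics.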
My Faves
● Error reporting
● PL/xSQL – feel free to use Perl, Python, Ruby, Java, LISP...
● Data types
– XML and JSON handling
● Foreign Data Wrappers (FDW)
● Window functions
● Common table expressions (CTE) and recursive queries
● Power of Indexes
Will DB eat your cake?
● Thanks @anandology
The cake is a lie!
Consider password VARCHAR(8)
Logging, ‘gotchas’
● Default is to stderr only
● Set on the command line or in the config file, not through SET
● Where is it?
● How to log queries… or turning logging_collector on
Where is it?
● Default
– data/pg_log
● Launchers can set it (Mac Homebrew/plist)
● Version and config dependent
Ask DB
Logging, turn it on
● Default is to stderr only
● In postgresql.conf:
logging_collector = on
log_filename = strftime-patterned filename
[log_destination = [stderr|syslog|csvlog] ]
log_statement = [none|ddl|mod|all] # e.g. all
log_min_error_statement = ERROR
log_line_prefix = '%t %c %u ' # time, session ID, user
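What "strftime-patterned filename" means in practice can be shown with a short Python sketch. Python's own `strftime` is used here as a stand-in for the server's expansion, so treat it as an approximation; the helper `expand_log_filename` and the timestamp are made up, while the pattern is PostgreSQL's default `log_filename`.

```python
from datetime import datetime


def expand_log_filename(pattern, when):
    """Expand strftime escapes in a log_filename pattern for a given moment."""
    return when.strftime(pattern)


# Default pattern; the server fills in the time the log file was created.
pattern = "postgresql-%Y-%m-%d_%H%M%S.log"
print(expand_log_filename(pattern, datetime(2017, 2, 3, 14, 5, 9)))
```

A coarser pattern such as `postgresql-%a.log` yields one file per weekday, which combined with log rotation settings gives a simple weekly cycle.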
Log line prefix
PL/pgSQL
● Stored procedure dilemma
– Where to keep your logic?
– How your logic is NOT in your SCM
● Over a dozen options:
– Perl, Python, Ruby,
– pgSQL, Java,
– TCL, LISP…
● DevOps, SysAdmins, DBAs… ETLs etc.
Perl function example
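-- Returns the larger of two integers; a NULL argument arrives in Perl as undef.
CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$
    my ($x, $y) = @_;
    # If $x is NULL, the result is $y (which may itself be NULL).
    if (not defined $x) {
        return undef if not defined $y;
        return $y;
    }
    # $x is not NULL here; fall back to it when $y is NULL.
    return $x if not defined $y;
    return $x if $x > $y;
    return $y;
$$ LANGUAGE plperl;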
XML or JSON support
● Parsing and retrieving XML (functions)
● Valid JSON checks (type)
● Careful with encoding!
– PG allows only one server encoding per database
– Set it to UTF-8 or weep
● Document database instead of object-oriented or relational
– JSON, JSONB, HSTORE – NoSQL fun welcome!
HSTORE?
-- Requires the extension: CREATE EXTENSION hstore;
CREATE TABLE example (
  id serial PRIMARY KEY,
  data hstore);
INSERT INTO example (data) VALUES
  ('name => "John Smith", age => 28, gender => "M"'),
  ('name => "Jane Smith", age => 24');
SELECT id, data->'name'
FROM example;
-- hstore values are text, so cast before comparing numerically
SELECT id, data->'age'
FROM example
WHERE (data->'age')::int >= 25;
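A gotcha worth showing learners here: hstore stores every value as text, so a comparison such as data->'age' >= '25' compares strings character by character unless the value is cast, for example with (data->'age')::int. The Python sketch below imitates the difference with plain dictionaries; the extra row "Ann Elder" and both helper functions are made up for illustration.

```python
# Each dict stands in for one hstore value; note that ages are strings.
rows = [
    {"name": "John Smith", "age": "28"},
    {"name": "Jane Smith", "age": "24"},
    {"name": "Ann Elder", "age": "100"},  # hypothetical extra row
]


def at_least_as_text(rows, limit):
    """Compare ages as text, like an uncast comparison would."""
    return [r["name"] for r in rows if r["age"] >= limit]


def at_least_as_int(rows, limit):
    """Compare ages as integers, like a comparison with an int cast."""
    return [r["name"] for r in rows if int(r["age"]) >= limit]


print(at_least_as_text(rows, "25"))
print(at_least_as_int(rows, 25))
```

The text comparison silently drops the 100-year-old because "1" sorts before "2", which is exactly the kind of bug that passes a quick test with two-digit ages.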
XML and JSON datatype
CREATE TABLE test (
  ...,
  xml_file xml,
  json_file json,
  ...
);
XML functions example
XMLROOT (
  XMLELEMENT (
    NAME gazonk,
    XMLATTRIBUTES (
      'val' AS name,
      1 + 1 AS num
    ),
    XMLELEMENT (
      NAME qux,
      'foo'
    )
  ),
  VERSION '1.0',
  STANDALONE YES
)
Result:
<?xml version='1.0' standalone='yes' ?>
<gazonk name='val' num='2'>
  <qux>foo</qux>
</gazonk>
Two ways to write an xml value:
xml '<foo>bar</foo>'
'<foo>bar</foo>'::xml
Architecture and internals
Check out processes
● pgrep -l postgres
● htop > filter: postgres
● Or whatever tool you usually use
● Careful with kill -9 on connections
– kill -15 is better
Summary
Before
● Who are they?
● What is your problem?
● How large is their comfort zone, and how do you push them out of it?
● Materials, docs, workshop preparation
● How much time for training?
● How much time after?
● How many people will there be?
● What indicates that the problem is solved?
During
● Establish the goal
– And, if possible, learning styles
● Promise support (and tell them how!)
– Push them out of the comfort zone!
● Ask for hard work and stupid questions
● Show documentation, do a live tour
● Do the workshop
● Involve them, find the best ones
– You will have them help you later
● Expect questions, make them ask
– Again, push them out of the comfort zone!
After
● Where are the docs?
– Are they using them?
● Answer the questions
– Again, and again
● Code reviews
– Deliver on support promise!
– Involve promising students
● Is the problem gone / better?
Don’t omit the basics
● Joins
● Indexes – how they work
● Query path (EXPLAIN, EXPLAIN ANALYZE)
● Moving around (psql)
● Setup and getting to DB
Postgres is cool
● Goodies like error reporting or log line prefix
● Well-thought-out processes
● Good for µservices and enterprise
● Not only SQL (XML, JSON, Perl, Python...)
● Ask DB
● Indexes
● Powerful: CTEs, recursive queries, FDWs...
● Battle tested and always high
Teaching Postgres – Tomasz Borek
Teaching Postgres
to new people
@LAFK_pl
Consultant @

More Related Content

PDF
Better Full Text Search in PostgreSQL
PDF
Полнотекстовый поиск в PostgreSQL за миллисекунды (Олег Бартунов, Александр К...
PDF
Новые возможности полнотекстового поиска в PostgreSQL / Олег Бартунов (Postgr...
PDF
Full Text Search in PostgreSQL
PDF
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)
PDF
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
PDF
On Beyond (PostgreSQL) Data Types
PDF
Mastering PostgreSQL Administration
 
Better Full Text Search in PostgreSQL
Полнотекстовый поиск в PostgreSQL за миллисекунды (Олег Бартунов, Александр К...
Новые возможности полнотекстового поиска в PostgreSQL / Олег Бартунов (Postgr...
Full Text Search in PostgreSQL
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
On Beyond (PostgreSQL) Data Types
Mastering PostgreSQL Administration
 

What's hot (20)

PDF
Postgresql search demystified
PDF
Accelerating Local Search with PostgreSQL (KNN-Search)
PDF
Flexible Indexing with Postgres
 
PDF
Cassandra summit 2013 - DataStax Java Driver Unleashed!
PDF
PostgreSQL Replication Tutorial
PDF
John Melesky - Federating Queries Using Postgres FDW @ Postgres Open
PDF
PostgreSQL 9.4, 9.5 and Beyond @ COSCUP 2015 Taipei
PDF
Effective testing for spark programs scala bay preview (pre-strata ny 2015)
PDF
Pgbr 2013 fts
PDF
2015-12-05 Александр Коротков, Иван Панченко - Слабо-структурированные данные...
PDF
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
PDF
PostgreSQL WAL for DBAs
ODP
PostgreSQL Administration for System Administrators
PDF
What is the best full text search engine for Python?
PDF
Neo4j after 1 year in production
PDF
PostgreSQL query planner's internals
PPTX
Introduction to Apache Cassandra
PDF
[Pgday.Seoul 2017] 3. PostgreSQL WAL Buffers, Clog Buffers Deep Dive - 이근오
PPTX
Exploring Parallel Merging In GPU Based Systems Using CUDA C.
PDF
Advanced backup methods (Postgres@CERN)
Postgresql search demystified
Accelerating Local Search with PostgreSQL (KNN-Search)
Flexible Indexing with Postgres
 
Cassandra summit 2013 - DataStax Java Driver Unleashed!
PostgreSQL Replication Tutorial
John Melesky - Federating Queries Using Postgres FDW @ Postgres Open
PostgreSQL 9.4, 9.5 and Beyond @ COSCUP 2015 Taipei
Effective testing for spark programs scala bay preview (pre-strata ny 2015)
Pgbr 2013 fts
2015-12-05 Александр Коротков, Иван Панченко - Слабо-структурированные данные...
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
PostgreSQL WAL for DBAs
PostgreSQL Administration for System Administrators
What is the best full text search engine for Python?
Neo4j after 1 year in production
PostgreSQL query planner's internals
Introduction to Apache Cassandra
[Pgday.Seoul 2017] 3. PostgreSQL WAL Buffers, Clog Buffers Deep Dive - 이근오
Exploring Parallel Merging In GPU Based Systems Using CUDA C.
Advanced backup methods (Postgres@CERN)
Ad

Viewers also liked (8)

PDF
Managing thousands of databases
PDF
Gbroccolo pgconfeu2016 pgnfs
PDF
Multimaster
PDF
Managing PostgreSQL with PgCenter
PDF
PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya Kosmodemiansky
PDF
Modern SQL in Open Source and Commercial Databases
PDF
Life on a_rollercoaster
PDF
The future is CSN
Managing thousands of databases
Gbroccolo pgconfeu2016 pgnfs
Multimaster
Managing PostgreSQL with PgCenter
PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya Kosmodemiansky
Modern SQL in Open Source and Commercial Databases
Life on a_rollercoaster
The future is CSN
Ad

Similar to Teaching PostgreSQL to new people (20)

PDF
query-optimization-techniques_talk.pdf
PDF
PostgreSQL 9.0 & The Future
PPTX
PostgreSQL - It's kind've a nifty database
PDF
Postgres can do THAT?
PDF
Postgres performance for humans
PDF
SQL: Query optimization in practice
PDF
Postgres Performance for Humans
PDF
query_tuning.pdf
PDF
Does PostgreSQL respond to the challenge of analytical queries?
PDF
Tech Talk - JPA and Query Optimization - publish
PDF
Pg for web developer
PPTX
PostgreSQL - Object Relational Database
PDF
Manipulating Data in Style with SQL
PDF
10 Reasons to Start Your Analytics Project with PostgreSQL
PPTX
Modern sql
PDF
Indexes don't mean slow inserts.
PDF
PostgreSQL performance improvements in 9.5 and 9.6
KEY
PostgreSQL
PDF
Pdxpugday2010 pg90
PDF
Beyond EXPLAIN: Query Optimization From Theory To Code
query-optimization-techniques_talk.pdf
PostgreSQL 9.0 & The Future
PostgreSQL - It's kind've a nifty database
Postgres can do THAT?
Postgres performance for humans
SQL: Query optimization in practice
Postgres Performance for Humans
query_tuning.pdf
Does PostgreSQL respond to the challenge of analytical queries?
Tech Talk - JPA and Query Optimization - publish
Pg for web developer
PostgreSQL - Object Relational Database
Manipulating Data in Style with SQL
10 Reasons to Start Your Analytics Project with PostgreSQL
Modern sql
Indexes don't mean slow inserts.
PostgreSQL performance improvements in 9.5 and 9.6
PostgreSQL
Pdxpugday2010 pg90
Beyond EXPLAIN: Query Optimization From Theory To Code

More from Tomek Borek (20)

PDF
Noc informatyka - co ja wiem o testowaniu
PDF
Nowoczesne architektury
PDF
Java tuning on GNU/Linux for busy dev
ODP
Jvm tuning in a rush! - Lviv JUG
ODP
Java Memory Consistency Model - concepts and context
ODP
Seeing through the smoke
PDF
AR drone - Polish JUG short demo
PDF
Testing SAAS, how to go about it?
ODP
Spróbujmy szczęścia bo zaciskanie pięści nie działa
PDF
Łukasz Romaszewski on Internet of Things Raspberry Pi and Java Embedded JavaC...
ODP
Lightning talk on Java Memory Consistency Model Java Day Kiev 2014
ODP
Few words about happiness (Polish talk) / O szczęściu słów kilka
ODP
Jak użytecznie, prawdziwie i solidnie odpowiedzieć na pytanie "jak było"
PDF
It's not always the application's fault
PDF
To nie zawsze wina aplikacji!
PPT
Wprowadzenie do optymalizacji wielokryterialnej / Intro to multicriteria opti...
ODP
Git nie dla początkujących
ODP
Architecture visualizers - tools usability study
PDF
Meta on HCI - keyword analysis and trends
PDF
"Narco" emotions - description of study on whether Twitter can be used to gle...
Noc informatyka - co ja wiem o testowaniu
Nowoczesne architektury
Java tuning on GNU/Linux for busy dev
Jvm tuning in a rush! - Lviv JUG
Java Memory Consistency Model - concepts and context
Seeing through the smoke
AR drone - Polish JUG short demo
Testing SAAS, how to go about it?
Spróbujmy szczęścia bo zaciskanie pięści nie działa
Łukasz Romaszewski on Internet of Things Raspberry Pi and Java Embedded JavaC...
Lightning talk on Java Memory Consistency Model Java Day Kiev 2014
Few words about happiness (Polish talk) / O szczęściu słów kilka
Jak użytecznie, prawdziwie i solidnie odpowiedzieć na pytanie "jak było"
It's not always the application's fault
To nie zawsze wina aplikacji!
Wprowadzenie do optymalizacji wielokryterialnej / Intro to multicriteria opti...
Git nie dla początkujących
Architecture visualizers - tools usability study
Meta on HCI - keyword analysis and trends
"Narco" emotions - description of study on whether Twitter can be used to gle...

Recently uploaded (20)

PDF
RMMM.pdf make it easy to upload and study
PDF
Anesthesia in Laparoscopic Surgery in India
PPTX
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
PDF
Complications of Minimal Access Surgery at WLH
PDF
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
PPTX
PPH.pptx obstetrics and gynecology in nursing
PPTX
master seminar digital applications in india
PPTX
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
PDF
VCE English Exam - Section C Student Revision Booklet
PDF
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
PDF
2.FourierTransform-ShortQuestionswithAnswers.pdf
PDF
Black Hat USA 2025 - Micro ICS Summit - ICS/OT Threat Landscape
PDF
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
PPTX
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
PDF
102 student loan defaulters named and shamed – Is someone you know on the list?
PDF
Computing-Curriculum for Schools in Ghana
PDF
01-Introduction-to-Information-Management.pdf
PDF
Sports Quiz easy sports quiz sports quiz
PDF
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
PDF
Pre independence Education in Inndia.pdf
RMMM.pdf make it easy to upload and study
Anesthesia in Laparoscopic Surgery in India
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
Complications of Minimal Access Surgery at WLH
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
PPH.pptx obstetrics and gynecology in nursing
master seminar digital applications in india
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
VCE English Exam - Section C Student Revision Booklet
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
2.FourierTransform-ShortQuestionswithAnswers.pdf
Black Hat USA 2025 - Micro ICS Summit - ICS/OT Threat Landscape
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
102 student loan defaulters named and shamed – Is someone you know on the list?
Computing-Curriculum for Schools in Ghana
01-Introduction-to-Information-Management.pdf
Sports Quiz easy sports quiz sports quiz
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
Pre independence Education in Inndia.pdf

Teaching PostgreSQL to new people

  • 1. PostgreSQL – Tomasz Borek Teaching PostgreSQL to new people @LAFK_pl Consultant @
  • 4. What will I tell you? ● About me (done) ● Show of hands ● Who „new people” might be – And usually – in my case – are ● About teaching – Comfort zone, learners, stepping back ● Chosen approaches, features, gotchas and the like ● Why, why, why ● And yes, this’ll be about Postgres, but in an unusual way
  • 5. Show of hands ● Developers (not PL/SQL ones)
  • 6. Show of hands ● Developers ● Developers (PL/SQL ones)
  • 7. Show of hands ● Developers ● Developers (PL/SQL ones) ● DBA (Admin, Architect)
  • 8. Show of hands ● Developers ● Developers (PL/SQL ones) ● DBA (Admin, Architect) ● DevOps
  • 9. Show of hands ● Developers ● Developers (PL/SQL ones) ● DBA (Admin, Architect) ● DevOps ● SysAdmin
  • 10. Show of hands ● Developers ● Developers (PL/SQL ones) ● DBA (Admin, Architect) ● DevOps ● SysAdmin ● Trainers / consultants
  • 11. Show of hands ● Developers ● Developers (PL/SQL ones) ● DBA (Admin, Architect) ● DevOps ● SysAdmin ● Trainers / consultants ● Other?
  • 13. Surprisingly ● Often your colleagues ● Sometimes older ● Sometimes more senior ● Experienced ● With success under their belts
  • 14. Surprisingly ● Often your colleagues ● Sometimes older ● Sometimes more senior ● Experienced ● With success under their belts ● Basically: FORMED already – Or MADE, if you will
  • 15. Developers are problem solvers ● Your colleagues have certain problems ● Is Postgres the solution? – Or „a solution” at least? ● And how is the learning curve – Time including
  • 16. Developers are not SQL people! ● Not many know JOINs very well ● Not many know how indexes work ● Not many know indexes weaknesses ● CTEs, window functions, procedures, cursors… ● They „omit” this ● Comfort zone is nice
  • 17. Do not abandon them Or they’ll abandon you
  • 18. Do not abandon them ● Docs ● Materials ● Tools ● Links to good content ● Pictures, pictures, pictures ● They can edit / comment (Wiki) ● Your (colleagues) time
  • 20. What is YOUR problem? ● DBA wanting respite for your DB? ● Malpractice in SQL queries? ● Why don’t they use XYZ feature? ● From tomorrow on, teach them some SQL ● Migration from X to Postgres ● Guidelines creation
  • 21. Xun Kuang once said 不闻不若闻之 , 闻之不若见之 , 见之不若知之 , 知 之不若行之 Xunzi book 8: Ruxiao, chapter 11
  • 23. Xun Kuang once said 不闻不若闻之 , 闻之不若见之 , 见之不若知之 , 知 之不若行之 “Not having heard something is not as good as having heard it; having heard it is not as good as having seen it; having seen it is not as good as knowing it; knowing it is not as good as putting it into practice.” Xunzi book 8: Ruxiao, chapter 11
  • 24. Xun Kuang paraphrase would be 不闻不若闻之 , 闻之不若见之 , 见之不若知之 , 知 之不若行之 “Not having heard something < having heard it; having heard it < having seen it; having seen it < knowing it; knowing it < putting it into practice.” Xunzi book 8: Ruxiao, chapter 11
  • 25. How do they learn? ● „Practice makes master” – Except it doesn’t ● Learning styles ● Docs still relevant – If well-placed, accessible and easy to get in
  • 26. Repetitio est mater studiorum ● Crash course ● Workshop ● Problem solving on their own ● Docs to help ● Code reviews
  • 28. Comfort zone ● Setup / install ● Moving around ● Logs, timing queries ● EXPLAIN + ANALYZE ● Indexes ● PgSQL and variants ● NoSQL + XML
  • 29. Chosen features, gotchas etc. so How to teach Postgres?
  • 30. In short ● History – battle-tested, feature-rich, used ● Basics – moving around, commands, etc. ● Prepare your bait accordingly – My faves – Advanced features – NoSQL angle – … ● Don’t just drink the KoolAid!
  • 31. Battle-tested ● Matures since 1987 ● Comes in many flavours (forks) ● Largest cluster – 2PBs in Yahoo ● Skype, NASA, Instagram ● Stable: – Many years on one version – Good version support – Every year something new – Follows ANSI SQL standards https://www.postgresql.org/about/users/
  • 35. Great angles ● Procedures: Java, Perl, Python, CTEs... ● Enterprise / NoSQL - handles XMLs and JSONs ● Index power – spatial or geo or your own ● CTEs and FDWs => great ETL or µservice ● Pure dev: error reporting / logging, MVCC (dirty read gone), own index, plenty of data types, Java/Perl/… inside ● Solid internals: processes, sec built-in,
  • 36. Basics ● Setup ● Psql – Moving around – What’s in ● Indexes ● Joins ● Query path ● Explain, Explain Analyze
  • 38. Parser ● Syntax checks, like FRIM is not a keyword – SELECT * FRIM myTable; ● Catalog lookup – MyTable may not exist ● In the end query tree is built – Query tokenization: SELECT (keyword) employeeName (field id) count (function call)...
  • 39. Grammar and a query tree
  • 40. Planner ● Where Planner Tree is built ● Where best execution is decided upon – Seq or index scan? Index or bitmap index? – Which join order? – Which join strategy (nested, hashed, merge)? – Inner or outer? – Aggregation: plain, hashed, sorted… ● Heuristic, if finding all plans too costly
  • 42. Example to explain EXPLAIN EXPLAIN SELECT * FROM tenk1; QUERY PLAN ------------------------------------------------------------ Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244)
  • 43. Explaining EXPLAIN - what EXPLAIN SELECT * FROM tenk1; QUERY PLAN ------------------------------------------------------------ Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244) ● Startup cost – time before output phase begins ● Total cost – in page fetches, may change, assumed to run node to completion ● Rows – estimated number to scan (but LIMIT etc.) ● Estimated average width of output from that node (in bytes)
  • 44. Explaining EXPLAIN - how EXPLAIN SELECT * FROM tenk1; QUERY PLAN ------------------------------------------------------------ Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244) SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1'; //358|10k ● No WHERE, no index ● Cost = disk pages read * seq page cost + rows scanned * cpu tuple cost ● 358 * 1.0 + 10000 * 0.01 = 458 // default values
  • 45. Analyzing EXPLAIN ANALYZE EXPLAIN ANALYZE SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=4.65..118.62 rows=10 width=488) (actual time=0.128..0.377 rows=10 loops=1) -> Bitmap Heap Scan on tenk1 t1 (cost=4.36..39.47 rows=10 width=244) (actual time=0.057..0.121 rows=10 loops=1) Recheck Cond: (unique1 < 10) -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0) (actual time=0.024..0.024 rows=10 loops=1) Index Cond: (unique1 < 10) -> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..7.91 rows=1 width=244) (actual time=0.021..0.022 rows=1 loops=10) Index Cond: (unique2 = t1.unique2) Planning time: 0.181 ms Execution time: 0.501 ms ● Actually runs the query ● More info: actual times, rows removed by filter, sort method used, disk/memory used...
  • 49. My Faves ● Error reporting ● Procedural languages (PL/*) – feel free to use Perl, Python, Ruby, Java, LISP... ● Data types – XML and JSON handling ● Foreign Data Wrappers (FDW) ● Window functions ● Common table expressions (CTE) and recursive queries ● Power of indexes
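Two of the favourites listed on this slide, window functions and recursive CTEs, can be tried without creating any tables. This is an illustrative sketch only; the `staff` data and its column names are made up for the example.

```sql
-- Window function: rank salaries inside each department
-- without collapsing the rows the way GROUP BY would.
SELECT dept, person, salary,
       rank() OVER (PARTITION BY dept ORDER BY salary DESC) AS rank_in_dept
FROM (VALUES
        ('dev', 'Ann', 5200),
        ('dev', 'Bob', 4800),
        ('ops', 'Cid', 4500)
     ) AS staff(dept, person, salary);

-- Recursive CTE: generate the integers 1..100 and add them up.
WITH RECURSIVE t(n) AS (
    VALUES (1)
  UNION ALL
    SELECT n + 1 FROM t WHERE n < 100
)
SELECT sum(n) FROM t;  -- 5050
```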
  • 50. Will DB eat your cake? ● Thanks @anandology
  • 53. The cake is a lie!
  • 56. Will DB eat your cake? ● Thanks @anandology Consider password VARCHAR(8)
  • 57. Logging, ‘gotchas’ ● Default is to stderr only ● Set on CLI or in config, not through SET ● Where is it? ● How to log queries… or turning logging_collector on
  • 58. Where is it? ● Default – data/pg_log ● Launchers can set it (Mac Homebrew/plist) ● Version and config dependent
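Since the log location depends on version and configuration, one way to answer "where is it?" is to ask the server itself. A small sketch, assuming a superuser session (some of these settings are hidden from ordinary users):

```sql
SHOW config_file;        -- which postgresql.conf is in effect
SHOW data_directory;     -- base directory of the cluster
SHOW logging_collector;  -- 'on' means the server writes its own log files
SHOW log_directory;      -- a relative path is resolved against data_directory
SHOW log_filename;       -- strftime pattern used for the file names
```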
  • 60. Logging, turn it on ● Default is to stderr only ● In PG:
    logging_collector = on
    log_filename = strftime-patterned filename
    [log_destination = [stderr|syslog|csvlog] ]
    log_statement = [none|ddl|mod|all]       # all
    log_min_error_statement = ERROR
    log_line_prefix = '%t %c %u '            # time sessionid user
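The same settings can also be written from a SQL session instead of editing the file by hand. This is a sketch under assumptions: PostgreSQL 9.4 or later (for `ALTER SYSTEM`), a superuser connection, and example values chosen here rather than taken from the talk. `logging_collector` only takes effect after a server restart; the other settings are picked up on reload.

```sql
-- Written to postgresql.auto.conf, which overrides postgresql.conf.
ALTER SYSTEM SET logging_collector = on;            -- needs a restart
ALTER SYSTEM SET log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log';
ALTER SYSTEM SET log_statement = 'all';             -- log every statement
ALTER SYSTEM SET log_line_prefix = '%t %c %u ';     -- time, session id, user

-- Ask the server to re-read its configuration.
SELECT pg_reload_conf();

-- The context column shows how each setting may be changed
-- (for example 'postmaster' means "only at server start").
SELECT name, setting, context
FROM pg_settings
WHERE name IN ('logging_collector', 'log_filename',
               'log_statement', 'log_line_prefix');
```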
  • 65. PL/pgSQL ● Stored procedure dilemma – Where to keep your logic? – How your logic is NOT in your SCM ● Over a dozen options: – Perl, Python, Ruby, – pgSQL, Java, – TCL, LISP… ● DevOps, SysAdmins, DBAs… ETLs etc.
  • 66. Perl function example
    CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$
        my ($x, $y) = @_;
        if (not defined $x) {
            return undef if not defined $y;
            return $y;
        }
        return $x if not defined $y;
        return $x if $x > $y;
        return $y;
    $$ LANGUAGE plperl;
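For comparison, here is the same NULL-tolerant maximum sketched in PL/pgSQL, which is available without installing an extra language (PL/Perl typically needs `CREATE EXTENSION plperl;` first). The function name `plpgsql_max` is invented for this example.

```sql
CREATE FUNCTION plpgsql_max(x integer, y integer) RETURNS integer AS $$
BEGIN
    -- If x is NULL the answer is y (which may itself be NULL).
    IF x IS NULL THEN
        RETURN y;
    END IF;
    -- x is known to be non-NULL here.
    IF y IS NULL OR x > y THEN
        RETURN x;
    END IF;
    RETURN y;
END;
$$ LANGUAGE plpgsql;

SELECT plpgsql_max(3, 7)       AS both_set,    -- 7
       plpgsql_max(NULL, 7)    AS first_null,  -- 7
       plpgsql_max(NULL, NULL) AS both_null;   -- NULL
```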
  • 67. XML or JSON support ● Parsing and retrieving XML (functions) ● Valid JSON checks (type) ● Careful with encoding! – PG allows only one server encoding per database – Set it to UTF-8 or weep ● Document database instead of object-oriented or relational – JSON, JSONB, HSTORE – NoSQL fun welcome!
  • 70. HSTORE?
    CREATE EXTENSION IF NOT EXISTS hstore;  -- hstore ships as an extension
    CREATE TABLE example ( id serial PRIMARY KEY, data hstore );
    INSERT INTO example (data) VALUES
      ('name => "John Smith", age => 28, gender => "M"'),
      ('name => "Jane Smith", age => 24');
    SELECT id, data->'name' FROM example;
    -- hstore values are text, so cast before comparing numerically
    SELECT id, data->'age' FROM example WHERE (data->'age')::int >= 25;
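The same data can be kept in a JSONB column, the other document-style type named on the XML/JSON slide. A minimal sketch, assuming PostgreSQL 9.4 or later; the table name `example_json` is made up here.

```sql
CREATE TABLE example_json (
    id   serial PRIMARY KEY,
    data jsonb
);

-- The json/jsonb types validate their input: malformed JSON text
-- is rejected with an error instead of being stored.
INSERT INTO example_json (data) VALUES
    ('{"name": "John Smith", "age": 28, "gender": "M"}'),
    ('{"name": "Jane Smith", "age": 24}');

-- ->> returns text, so cast before comparing numerically.
SELECT id, data->>'name' AS name
FROM example_json
WHERE (data->>'age')::int >= 25;

-- @> tests containment: rows whose document includes this key/value pair.
SELECT id
FROM example_json
WHERE data @> '{"gender": "M"}';
```

With the two rows inserted above, both queries return only the John Smith row.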
  • 71. XML and JSON datatype CREATE TABLE test ( ..., xml_file xml, json_file json, ... );
  • 72. XML functions example
    XMLROOT (
      XMLELEMENT ( NAME gazonk,
        XMLATTRIBUTES ( 'val' AS name, 1 + 1 AS num ),
        XMLELEMENT ( NAME qux, 'foo' ) ),
      VERSION '1.0', STANDALONE YES )
    <?xml version='1.0' standalone='yes' ?>
    <gazonk name='val' num='2'>
      <qux>foo</qux>
    </gazonk>
    xml '<foo>bar</foo>'
    '<foo>bar</foo>'::xml
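Retrieving values from XML is done with functions such as `xpath`. A short sketch, assuming a server built with libxml support and version 9.1 or later for `xml_is_well_formed`; the document literal mirrors the slide's example.

```sql
-- xpath returns an array of xml values matching the expression.
SELECT xpath('/gazonk/qux/text()',
             '<gazonk name="val" num="2"><qux>foo</qux></gazonk>'::xml);
-- {foo}

-- Check a text value before casting it to xml.
SELECT xml_is_well_formed('<foo>bar</foo>');
```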
  • 76. Check out processes ● pgrep -l postgres ● htop > filter: postgres ● Whatever you usually use ● Careful with kill -9 on connections – kill -15 is better
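The processes can also be inspected and stopped from inside the database, which avoids signalling the wrong PID from the shell. A sketch assuming PostgreSQL 9.2 or later (for these column names); the pid 12345 is a placeholder, not a real backend.

```sql
-- One row per server connection, excluding this session.
SELECT pid, usename, state, query_start, query
FROM pg_stat_activity
WHERE pid <> pg_backend_pid();

-- Cancel only the running query of a backend (sends SIGINT).
SELECT pg_cancel_backend(12345);      -- 12345 is a hypothetical pid

-- Close the whole connection (sends SIGTERM, the kill -15 equivalent).
SELECT pg_terminate_backend(12345);   -- 12345 is a hypothetical pid
```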
  • 79. Before ● Who are they? ● What is your problem? ● How large is their comfort zone, and how do you push them out of it? ● Materials, docs, workshop preparation ● How much time for training? ● How much time after? ● How many people will there be? ● What indicates that the problem is solved?
  • 80. During ● Establish the goal – And – if possible – learning styles ● Promise support (and tell how!) – Push them out of the comfort zone! ● Ask for hard work and stupid questions ● Show documentation, do a live tour ● Do the workshop ● Involve, find the best ones – You will have them help you later ● Expect questions, make them ask – Again, push them out of the comfort zone!
  • 81. After ● Where are the docs? – Are they using them? ● Answer the questions – Again, and again ● Code reviews – Deliver on support promise! – Involve promising students ● Is the problem gone / better?
  • 82. Don’t omit the basics ● Joins ● Indexes – how they work ● Query path (EXPLAIN, EXPLAIN ANALYZE) ● Moving around (psql) ● Setup and getting to DB
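As one example of a basic worth covering, the difference between an inner and an outer join can be shown with throwaway data. This is an illustrative sketch; the `authors` and `books` names and rows are invented.

```sql
WITH authors(id, name) AS (
    VALUES (1, 'Ann'), (2, 'Bob')
),
books(author_id, title) AS (
    VALUES (1, 'SQL Basics')
)
-- LEFT JOIN keeps every author; authors without a book get NULL for title.
-- Replacing LEFT JOIN with JOIN would drop Bob from the result.
SELECT a.name, b.title
FROM authors a
LEFT JOIN books b ON b.author_id = a.id
ORDER BY a.name;
```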
  • 83. Postgres is cool ● Goodies like error reporting or log line prefix ● Well-thought-out processes ● Good for µservices and enterprise ● Not only SQL (XML, JSON, Perl, Python...) ● Ask DB ● Indexes ● Powerful: CTEs, recursive queries, FDWs... ● Battle tested and always high
  • 84. Teaching Postgres – Tomasz Borek Teaching Postgres to new people @LAFK_pl Consultant @