Practical Partitioning in Production with Postgres
Jimmy Angelakos
Senior PostgreSQL Architect
Postgres Vision 2021-06-23
© Copyright EnterpriseDB Corporation, 2021. All rights reserved.
We’ll be looking at:
• Intro to Partitioning in PostgreSQL
• Why?
• How?
• Practical Example
Introduction to Partitioning in PostgreSQL

What is partitioning?
• RDBMS context: division of a table into distinct independent tables
• Horizontal partitioning (by row) – different rows in different tables
• Why?
– Easier to manage
– Performance

Partitioning in PostgreSQL: history
• PostgreSQL has had partitioning for quite some time now: since PG 8.1 (2005)
– Inheritance-based
– Why haven’t I heard of this before?
– It’s not great tbh...
• Declarative Partitioning: PG 10 (2017)
– Massive improvement

Declarative Partitioning (PG 10+)
Specification of:
• Partitioning method
• Partition key
– Column(s) or expression(s)
– Value determines data routing
• Partition boundaries
By declaring a table (DDL):
CREATE TABLE cust (id INT, signup DATE)
PARTITION BY RANGE (signup);
CREATE TABLE cust_2020
PARTITION OF cust FOR VALUES FROM
('2020-01-01') TO ('2021-01-01');
• Partitions may be partitioned themselves (sub-partitioning)
Why?

PostgreSQL limits
(Hard limits, hard to reach)
• Database size: unlimited ✅
• Tables per database: 1.4 billion ✅
• Table size: 32 TB 😐
– Default block size: 8192 bytes
• Rows per table: depends
– As many as can fit onto 4.2 billion blocks 😐
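
The 32 TB table size limit follows from the block arithmetic on this slide. A quick check, assuming the default 8192-byte block size and rounding the block count up to 2^32 (about 4.2 billion):

```sql
-- Assumes the default block size of 8192 bytes; 4294967296 is 2^32 blocks.
-- 1099511627776 is 1024^4, i.e. the number of bytes in one binary terabyte.
SELECT 4294967296 * 8192                 AS max_table_bytes,
       4294967296 * 8192 / 1099511627776 AS max_table_tb;
```

The second column comes out as 32. A cluster compiled with a larger block size would have a proportionally higher limit.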

What partitioning can help with (i)
(Very large tables)
• Disk size limitations
– You can put partitions on different tablespaces
• Performance
– Partition pruning
– Table scans
– Index scans
– Hidden pitfalls of very large tables*
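
A minimal sketch of the first two points, reusing the cust table from the declarative partitioning slide. The tablespace name fast_disk is an assumption for illustration and would have to be created beforehand with CREATE TABLESPACE:

```sql
-- Assumption: a tablespace named fast_disk already exists
CREATE TABLE cust_2021
    PARTITION OF cust FOR VALUES FROM ('2021-01-01') TO ('2022-01-01')
    TABLESPACE fast_disk;

-- Partition pruning: the filter is on the partition key, so the plan
-- should only need to touch cust_2021 and can skip cust_2020
EXPLAIN
SELECT * FROM cust
WHERE signup >= '2021-03-01' AND signup < '2021-04-01';
```

Pruning only helps when the query filters on the partition key, which is why key selection matters so much later on.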

What partitioning can help with (ii)
(Very large tables)
• Maintenance
– Deletions (some filesystems are bad at deleting large numbers of files) 🤭
– DROP TABLE cust_2020;
– ALTER TABLE cust DETACH PARTITION cust_2020;
• VACUUM
– Bloat
– Freezing → xid wraparound
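
To make the maintenance point concrete, here is a sketch of removing one year of data, again using the cust example. The DELETE variant is shown only for contrast:

```sql
-- Row-by-row removal: every deleted row becomes a dead tuple
-- that VACUUM has to clean up afterwards
DELETE FROM cust
WHERE signup >= '2020-01-01' AND signup < '2021-01-01';

-- Partition-level removal: detach first if the data should be kept
-- around as a standalone table (e.g. for archiving) ...
ALTER TABLE cust DETACH PARTITION cust_2020;

-- ... then drop it once it is no longer needed
DROP TABLE cust_2020;
```

The partition-level commands work on the whole table at once, so they leave no bloat behind in the remaining partitions.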

What partitioning is not
• Magic bullet
– No substitute for rational database design
• Sharding
– Not about putting part of the data on different nodes
• Performance tuning
– Unless you have one of the mentioned issues
How?

Dimensioning
Plan ahead!
• Get your calculator out
– Data ingestion rate (both rows and size in bytes)
– Projected increases (e.g. 25 locations projected to be 200 by end of year)
– Data retention requirements
• Will inform choice of partitioning method and key
• For instance: 1440 measurements/day from each of 1000 sensors – extrapolate per year
• Keep checking if this is valid and be prepared to revise
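
The sensor example can be worked through directly in psql. The row counts use the figures from the slide; the 64 bytes per row is a made-up assumption purely for illustration, so substitute a measured average row size:

```sql
-- 1440 measurements/day from each of 1000 sensors (figures from the slide)
-- Assumption: roughly 64 bytes per stored row (hypothetical)
SELECT 1440 * 1000                                    AS rows_per_day,
       1440 * 1000 * 365::bigint                      AS rows_per_year,
       pg_size_pretty(1440 * 1000 * 365::bigint * 64) AS approx_size_per_year;
```

That is 1,440,000 rows per day and 525,600,000 rows per year, or somewhere around 31 GB per year under the assumed row size, before indexes.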

Partitioning method
Dimensioning usually makes this clearer
• Range: For key column(s) e.g. ranges of dates, identifiers, etc.
– Lower end: inclusive, upper end: exclusive
• List: Explicit key values stated for each partition
• Hash (PG 11+): If you have a column with values close to unique
– Define modulus (& remainder) for number of almost-evenly-sized partitions
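
Range partitioning was shown earlier with the cust table. A rough sketch of the other two methods follows; the orders and events tables and their columns are hypothetical and exist only for this illustration:

```sql
-- LIST: explicit key values stated for each partition
CREATE TABLE orders (id BIGINT, region TEXT) PARTITION BY LIST (region);
CREATE TABLE orders_emea PARTITION OF orders FOR VALUES IN ('EU', 'UK');
CREATE TABLE orders_amer PARTITION OF orders FOR VALUES IN ('US', 'CA');

-- HASH (PG 11+): modulus = number of partitions, one partition per remainder
CREATE TABLE events (id BIGINT, payload TEXT) PARTITION BY HASH (id);
CREATE TABLE events_p0 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE events_p1 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE events_p2 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE events_p3 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 3);
```

With list partitioning, a row whose key value is not listed anywhere is rejected unless a DEFAULT partition exists (PG 11+).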

Partition Key selection
Choose wisely - know your data!
• Analysis
– Determine main keys used for retrieval from queries
– Proper key selection enables partition pruning
– Can use multiple columns for higher granularity (more partitions)
• Desirable
– High enough cardinality (range of values) for the number of partitions needed
– A column that doesn’t change often, to avoid moving rows among partitions

Sub-partitioning
• Simply put, partitions can be partitioned tables themselves. Plan ahead!
• CREATE TABLE transactions ( … , location_code TEXT, tstamp TIMESTAMPTZ)
PARTITION BY RANGE (tstamp);
• CREATE TABLE transactions_2021_06
PARTITION OF transactions FOR VALUES FROM ('2021-06-01') TO ('2021-07-01')
PARTITION BY HASH (location_code);
• CREATE TABLE transactions_2021_06_p1
PARTITION OF transactions_2021_06 FOR VALUES WITH (MODULUS 4, REMAINDER 0);
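
The slide only creates the first hash sub-partition. A self-contained sketch of the full set, with a query to see where rows end up, might look like this. The id column stands in for the columns elided on the slide and is an assumption:

```sql
-- Assumption: a hypothetical id column replaces the elided column list
CREATE TABLE transactions (id BIGINT, location_code TEXT, tstamp TIMESTAMPTZ)
    PARTITION BY RANGE (tstamp);

CREATE TABLE transactions_2021_06
    PARTITION OF transactions FOR VALUES FROM ('2021-06-01') TO ('2021-07-01')
    PARTITION BY HASH (location_code);

-- All four remainders are needed; a row hashing to a missing
-- remainder would have no partition to go to and would be rejected
CREATE TABLE transactions_2021_06_p1
    PARTITION OF transactions_2021_06 FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE transactions_2021_06_p2
    PARTITION OF transactions_2021_06 FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE transactions_2021_06_p3
    PARTITION OF transactions_2021_06 FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE transactions_2021_06_p4
    PARTITION OF transactions_2021_06 FOR VALUES WITH (MODULUS 4, REMAINDER 3);

INSERT INTO transactions VALUES
    (1, 'AAA', '2021-06-15 12:00+00'),
    (2, 'BAA', '2021-06-16 12:00+00');

-- tableoid shows which leaf partition each row was routed to
SELECT tableoid::regclass AS stored_in, count(*)
FROM transactions
GROUP BY 1;
```

Inserts always go through the top-level table; routing through both levels is automatic.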

Partitioning by multiple columns
Be careful!
• CREATE TABLE transactions ( … , location_code TEXT, tstamp TIMESTAMPTZ)
PARTITION BY RANGE (tstamp, location_code);
• CREATE TABLE transactions_2021_06_a PARTITION OF transactions
FOR VALUES FROM ('2021-06-01', 'AAA') TO ('2021-07-01', 'AZZ');
• CREATE TABLE transactions_2021_06_b PARTITION OF transactions
FOR VALUES FROM ('2021-06-01', 'BAA') TO ('2021-07-01', 'BZZ');
ERROR: partition "transactions_2021_06_b" would overlap partition "transactions_2021_06_a"
• Because bounds are compared as whole rows, tstamp first: any tstamp after '2021-06-01' and before '2021-07-01' already belongs to the first partition, whatever its location_code!
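
The overlap can be checked by hand with a row comparison, which orders values the same way as multi-column range bounds: first column first, later columns only as tie-breakers. This sketch tests whether the lower bound of partition b falls inside the range of partition a:

```sql
-- Is ('2021-06-01', 'BAA') >= a's lower bound and < a's upper bound?
SELECT ROW(TIMESTAMPTZ '2021-06-01', 'BAA'::text) >= ROW(TIMESTAMPTZ '2021-06-01', 'AAA'::text)
   AND ROW(TIMESTAMPTZ '2021-06-01', 'BAA'::text) <  ROW(TIMESTAMPTZ '2021-07-01', 'AZZ'::text)
       AS b_lower_bound_falls_in_a;
```

The result is true: the upper-bound comparison is already decided by the tstamp column, so location_code never gets a say. If the goal is one partition per month and per group of locations, sub-partitioning as on the previous slide is the safer route.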

What core Postgres does not do
• Automatic creation of partitions
– Create in advance
– Use a cronjob
• Imperative merging/splitting of partitions
– Move rows manually
• Sharding to different nodes
– You may have to configure FDW manually
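
As one possible shape for the cronjob approach, the block below creates next month's partition if it does not exist yet. It is only a sketch: it assumes a transactions table range-partitioned by tstamp alone, and the transactions_YYYY_MM naming scheme is a convention chosen here, not something Postgres requires:

```sql
-- Intended to be run periodically, e.g. via: psql -f create_next_partition.sql
DO $$
DECLARE
    next_month DATE;
    part_name  TEXT;
BEGIN
    -- First day of the following month
    next_month := date_trunc('month', current_date + interval '1 month')::date;
    part_name  := 'transactions_' || to_char(next_month, 'YYYY_MM');

    -- DDL cannot take PL/pgSQL variables directly, so build it with format()
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I PARTITION OF transactions
             FOR VALUES FROM (%L) TO (%L)',
        part_name, next_month, (next_month + interval '1 month')::date);
END;
$$;
```

Creating partitions well ahead of time means a missed cron run does not immediately cause failed inserts.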
Practical Example

Partitioning a live production system
• Is your table too large to handle?
• Can partitioning help?
• What if it’s in constant use?

The situation
Huge 20 TB table
• OLTP workload, transactions keep flowing in
– Table keeps increasing in size
• VACUUM never ends
– Has been running for a full month already…
• Queries are getting slower
– Not just because of sheer number of rows...

* Hidden performance pitfall (i)
For VERY large tables
• Postgres has 1GB segment size
– Can only be changed at compilation time
– 20 TB table = 20000 segments (files on disk)
• Why is this a problem?
– md.c →
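
The segment count is simple arithmetic and can be checked on a live system. This sketch assumes the default 1GB segment size and that the big table is called dailytotals, the name used later in the example:

```sql
-- Read-only setting fixed at compile time
SHOW segment_size;

-- 20 TB expressed in 1GB segments (binary units)
SELECT 20::bigint * 1024 * 1024 * 1024 * 1024 / (1024 * 1024 * 1024)
       AS segment_files_for_20_tb;

-- Segment files currently backing the main fork of an actual table
SELECT ceil(pg_relation_size('dailytotals') / (1024 * 1024 * 1024)::numeric)
       AS segment_files;
```

The second query gives 20480, which the slide rounds to 20000. Each partition gets its own, much shorter, set of segment files.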

* Hidden performance pitfall (ii)
• The md.c code loops 20000 times every time you want to access a table page
– Linked list of segments
• Code from PG 9.6
• It has been heavily optimised recently (caching, etc.)
• Still needs to run a lot of times

So what do we do?
Next steps
• Need to partition the huge table
– Dimensioning
– Partition method
– Partition key
• Make sure we’re on the latest version (PG 13)
– Get latest features & performance enhancements

What is our table like?
It holds daily transaction totals for each point of sales
• Dimensioning
– One partition per month will be about 30GB of data, so acceptable size
• Method, Key
– Candidate key is transaction date, which we can partition by range
– Check that there are no data errors (e.g. dates in the future when they shouldn’t be)
• Partition sizes don’t have to be equal
– We can partition older, less often accessed data by year
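
The data error check could be as simple as the queries below. They use the dailytotals table and totaldate column from the slides that follow, and they assume an index on totaldate exists; without one, each query scans the whole 20 TB table, so treat this as a sketch rather than something to fire off blindly:

```sql
-- Range of the candidate partition key
SELECT min(totaldate) AS earliest, max(totaldate) AS latest
FROM dailytotals;

-- Rows dated in the future would end up in unexpected partitions
SELECT count(*) AS rows_dated_in_future
FROM dailytotals
WHERE totaldate > current_date;
```

Any surprises here should be fixed before choosing partition boundaries.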

Problems
Things you cannot do in production
• Lock the table totally (ACCESS EXCLUSIVE) or prevent writes
– People will start yelling, and they will be right
• Cause excessive load on the system (e.g. I/O) or cause excessive disk space usage
– Can’t copy whole 20 TB table into empty partitioned table
– See above about yelling
• Present an inconsistent or incomplete view of the data

The plan
Take it step by step
• Rename the huge table and its indices
• Create an empty partitioned table with the old huge table’s name
• Create the required indices on the new partitioned table
– They will be created automatically for each new partition
• Create first new partition for new incoming data
• Attach the old table as a partition of the new table so it can be used normally*
• Move data out of the old table incrementally at our own pace

Rename the huge table and its indices
-- Do this all in one transaction
BEGIN;
ALTER TABLE dailytotals RENAME TO dailytotals_legacy;
ALTER INDEX dailytotals_batchid RENAME TO dailytotals_legacy_batchid;
ALTER INDEX …
…
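
With many indices, the rename statements can be generated rather than typed. The helper query below is an illustration, to be run before the transaction starts (while the table is still called dailytotals); it assumes every index name contains the string dailytotals, as dailytotals_batchid does:

```sql
-- Prints one ALTER INDEX statement per index; review before running them
SELECT format('ALTER INDEX %I.%I RENAME TO %I;',
              schemaname,
              indexname,
              replace(indexname, 'dailytotals', 'dailytotals_legacy'))
FROM pg_indexes
WHERE tablename = 'dailytotals';
```

The output is plain text, so it can be pasted into the transaction after a quick sanity check.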

Create empty partitioned table & indices
CREATE TABLE dailytotals (
totalid BIGINT NOT NULL DEFAULT nextval('dailytotals_totalid_seq')
, totaldate DATE NOT NULL
, totalsum BIGINT
…
, batchid BIGINT NOT NULL
)
PARTITION BY RANGE (totaldate);
CREATE INDEX dailytotals_batchid ON dailytotals (batchid);
…

Create partition for new incoming data
CREATE TABLE dailytotals_202106
PARTITION OF dailytotals
FOR VALUES FROM ('2021-06-01') TO ('2021-07-01');

Attach old table as a partition
DO $$
DECLARE earliest DATE;
DECLARE latest DATE;
BEGIN
  -- Set boundaries
  SELECT min(totaldate) INTO earliest FROM dailytotals_legacy;
  latest := '2021-06-01'::DATE;
  -- HACK HACK HACK (only because we know and trust our data)
  -- DDL cannot see PL/pgSQL variables, so the statements are built with format()
  EXECUTE format(
    'ALTER TABLE dailytotals_legacy
     ADD CONSTRAINT dailytotals_legacy_totaldate
     CHECK (totaldate >= %L AND totaldate < %L)
     NOT VALID', earliest, latest);
  -- You should not touch pg_catalog directly 😕
  UPDATE pg_constraint
  SET convalidated = true
  WHERE conname = 'dailytotals_legacy_totaldate';
  EXECUTE format(
    'ALTER TABLE dailytotals
     ATTACH PARTITION dailytotals_legacy
     FOR VALUES FROM (%L) TO (%L)', earliest, latest);
END;
$$ LANGUAGE PLPGSQL;
COMMIT;

Move data from old table at our own pace
• For instance, during quiet hours for the system, in scheduled batch jobs, etc.
WITH rows AS (
DELETE FROM dailytotals_legacy d
WHERE (totaldate >= '2020-01-01' AND totaldate < '2021-01-01')
RETURNING d.* )
INSERT INTO dailytotals SELECT * FROM rows;
• In the same transaction: DETACH the old table, perform the move, reATTACH with changed boundaries. Rinse and repeat!
• Make sure the target partition exists!
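
Put together, one round of the detach, move, reattach cycle might look like the sketch below. It is not the author's exact script. It assumes the newest month still held in the legacy table is May 2021, and '2015-01-01' is a placeholder for the real lower bound found when the table was first attached:

```sql
BEGIN;

ALTER TABLE dailytotals DETACH PARTITION dailytotals_legacy;

-- The target partition must exist before rows are routed to it
CREATE TABLE dailytotals_202105
    PARTITION OF dailytotals
    FOR VALUES FROM ('2021-05-01') TO ('2021-06-01');

WITH rows AS (
    DELETE FROM dailytotals_legacy d
    WHERE (totaldate >= '2021-05-01' AND totaldate < '2021-06-01')
    RETURNING d.* )
INSERT INTO dailytotals SELECT * FROM rows;

-- Re-attach with the upper bound lowered to exclude the month just moved.
-- Placeholder lower bound; unless a validated CHECK constraint matching the
-- new bounds exists, this step scans the legacy table to verify its rows.
ALTER TABLE dailytotals
    ATTACH PARTITION dailytotals_legacy
    FOR VALUES FROM ('2015-01-01') TO ('2021-05-01');

COMMIT;
```

Locks taken by DETACH and ATTACH are held until COMMIT, so batch sizes are worth keeping small enough to finish within the quiet window.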

Partitioning improvements
Make sure you’re on the latest release so you have them!
• PG11: DEFAULT partition, UPDATE on partition key, HASH method, PKs, FKs, Indexes, Triggers
• PG12: Performance (pruning, COPY), FK references for partitioned tables, ordered scans
• PG13: Logical replication for partitioned tables, improved performance (JOINs, pruning)
• (Soon) PG14: REINDEX CONCURRENTLY, DETACH CONCURRENTLY, faster UPDATE/DELETE
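
For reference, the two PG14 commands named above look roughly like this when applied to the example tables from this talk. Both need PostgreSQL 14 or later, and neither can run inside a transaction block:

```sql
-- Detach without blocking queries on the parent for the duration
-- (not allowed if the partitioned table has a DEFAULT partition)
ALTER TABLE dailytotals DETACH PARTITION dailytotals_202106 CONCURRENTLY;

-- Rebuild the indexes of a partitioned table, partition by partition
REINDEX TABLE CONCURRENTLY dailytotals;
```

Because they cannot be wrapped in BEGIN/COMMIT, they do not fit into the single-transaction move cycle shown earlier and would need to be run as separate steps.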

To conclude...
• Know your data!
• Upgrade – be on the latest release!
• Partition before you get in deep water!
• Find me on Twitter: @vyruss
