HOW ZHEAP WORKS
REINVENTING POSTGRESQL STORAGE
BY HANS-JÜRGEN SCHÖNIG
ABOUT ME AND MY COMPANY
■ Who is the guy?
■ Who is CYBERTEC?
HANS-JÜRGEN SCHÖNIG
CEO & SENIOR DATABASE CONSULTANT
■ PostgreSQL since 1999
■ author of various database books
MAIL hs@cybertec.at
PHONE +43 2622 930 22-2
WEB www.cybertec-postgresql.com
DATABASE SERVICES
DATA Science
▪ Artificial Intelligence
▪ Machine Learning
▪ Big Data
▪ Business Intelligence
▪ Data Mining
▪ etc.
POSTGRESQL Services
▪ 24/7 Support
▪ Training
▪ Consulting
▪ Performance Tuning
▪ Clustering
▪ etc.
CLIENT SECTORS
▪ ICT
▪ University
▪ Government
▪ Automotive
▪ Industry
▪ Trade
▪ Finance
▪ etc.
AGENDA
■ traditional tables
■ table bloat and VACUUM
■ Why a new storage system?
■ zheap: the goal
■ zheap: basic architecture
■ zheap: transaction slots, etc.
■ performance impacts
■ roadmap
TRADITIONAL TABLES
HEAP: STANDARD TABLES
■ Data structure looks as follows:
HEAP AND TRANSACTIONS
UPDATES AND VISIBILITY
PROBLEMS WITH HEAP
MAIN ISSUE: TABLE BLOAT
test=# CREATE TABLE a (aid int) WITH (autovacuum_enabled = off);
CREATE TABLE
test=# INSERT INTO a SELECT * FROM generate_series(1, 1000000);
INSERT 0 1000000
test=# SELECT pg_size_pretty(pg_relation_size('a'));
pg_size_pretty
----------------
35 MB
(1 row)
MAIN ISSUE: TABLE BLOAT
test=# UPDATE a SET aid = aid + 1;
UPDATE 1000000
test=# SELECT pg_size_pretty(pg_relation_size('a'));
pg_size_pretty
----------------
69 MB
(1 row)
MAIN ISSUE: TABLE BLOAT
test=# VACUUM VERBOSE a;
INFO: vacuuming "public.a"
INFO: "a": removed 1000000 row versions in 4425 pages
INFO: "a": found 1000000 removable, 1000000 nonremovable row versions in 8850 out of 8850 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 539
...
VACUUM
test=# SELECT pg_size_pretty(pg_relation_size('a'));
pg_size_pretty
----------------
69 MB
(1 row)
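The page counts in the VACUUM output line up with the sizes reported by pg_size_pretty. The following is a small sketch of that arithmetic. It assumes PostgreSQL's default block size of 8192 bytes, which the slides do not state, and the helper name pages_to_mib is made up for illustration.

```python
# Assumption: default PostgreSQL block size (not stated on the slides).
BLOCK_SIZE = 8192


def pages_to_mib(pages: int, block_size: int = BLOCK_SIZE) -> float:
    """Convert a page count into MiB, the unit pg_size_pretty labels as MB."""
    return pages * block_size / (1024 * 1024)


total_pages = 8850  # "in 8850 out of 8850 pages" from VACUUM VERBOSE
dead_pages = 4425   # "removed 1000000 row versions in 4425 pages"

print(f"table after UPDATE:         {pages_to_mib(total_pages):.1f} MiB")
print(f"pages cleaned by VACUUM:    {pages_to_mib(dead_pages):.1f} MiB")
print(f"share of the table cleaned: {dead_pages / total_pages:.0%}")
```

Half of the file is reusable after VACUUM, yet the file does not shrink: plain VACUUM only gives back empty pages at the end of a table, and here the live row versions presumably sit in the pages written by the UPDATE.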
ONE WORD ABOUT VACUUM
■ VACUUM is not always allowed to reclaim dead rows
■ A row must be REALLY dead (no longer visible to any transaction) for VACUUM to do its job
■ Long-running transactions can be an enemy
→ Once you are in pain, it tends not to go away
WAYS OUT
■ VACUUM FULL: needs a table lock
■ pg_squeeze:
  ■ Shrinks tables with less locking
  ■ Moves tables between tablespaces
  ■ Organizes tables by an index
HINT: Try to avoid bloat in the first place!
ZHEAP: COMING TO THE RESCUE
ZHEAP: DESIGN GOALS
■ Perform UPDATE in place
■ Have smaller tables
  ■ smaller tuple headers
  ■ improved alignment
■ Reduce writes as much as possible
  ■ avoid dirtying pages unless data is modified
  ■ normal heaps dirty pages in some cases during reads
■ Reuse space more quickly
■ Get rid of VACUUM
ZHEAP: TUPLE HEADERS
■ Heap: 20+ bytes per row
■ Zheap: 5 bytes per row
How can this be achieved?
■ The tuple header controls “visibility”
■ “Normalize tuple header”
■ Move visibility info to the page level
ZHEAP: TRANSACTION SLOTS
Transaction slots hold transactional visibility
ZHEAP: TRANSACTION SLOTS
Transaction slots:
■ 16 bytes of storage
■ contains the following information:
  ■ transaction id
  ■ epoch
  ■ latest undo record pointer of that transaction
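One way the three fields could add up to 16 bytes is sketched below. The field order, the widths (32-bit epoch, 32-bit transaction id, 64-bit undo record pointer) and the little-endian packing are assumptions made for illustration; the slide only gives the total size and the field names. The values passed in are arbitrary.

```python
import struct

# Assumed layout: 32-bit epoch, 32-bit transaction id, 64-bit undo record
# pointer, little-endian, no padding. Only the 16-byte total is from the slide.
SLOT_FORMAT = "<IIQ"


def pack_slot(epoch: int, xid: int, undo_ptr: int) -> bytes:
    """Serialize one transaction slot into its on-page byte form."""
    return struct.pack(SLOT_FORMAT, epoch, xid, undo_ptr)


def unpack_slot(raw: bytes) -> tuple:
    """Read (epoch, xid, undo_ptr) back from the packed bytes."""
    return struct.unpack(SLOT_FORMAT, raw)


slot = pack_slot(epoch=0, xid=539, undo_ptr=0x3EC00000)  # arbitrary values
print(len(slot))          # size of one slot in bytes
print(unpack_slot(slot))  # round trip back to the three fields
```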
What if we need more slots?
ZHEAP: TPD PAGES
■ TPD pages store additional transaction slots when the 4 slots on a page are not enough
■ TPD pages are interleaved with normal pages
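The overflow idea can be sketched with a toy page model: a page offers 4 slots, and a fifth concurrent transaction is sent to an overflow area standing in for a TPD page. The class and method names are hypothetical, and the sketch deliberately leaves out slot reuse after commit, which a real implementation would need.

```python
from dataclasses import dataclass, field

SLOTS_PER_PAGE = 4  # the "4" from the slide


@dataclass
class Page:
    # On-page slots: each entry is the xid using the slot, or None if free.
    slots: list = field(default_factory=lambda: [None] * SLOTS_PER_PAGE)
    # Hypothetical stand-in for extra slots kept on a TPD page.
    tpd_slots: list = field(default_factory=list)

    def reserve_slot(self, xid: int) -> tuple:
        """Return ("page" or "tpd", index) of the slot used by xid."""
        # A transaction that already owns a slot keeps using it.
        if xid in self.slots:
            return ("page", self.slots.index(xid))
        if xid in self.tpd_slots:
            return ("tpd", self.tpd_slots.index(xid))
        # Otherwise take the first free on-page slot.
        for i, owner in enumerate(self.slots):
            if owner is None:
                self.slots[i] = xid
                return ("page", i)
        # All on-page slots are taken: spill into the overflow area.
        self.tpd_slots.append(xid)
        return ("tpd", len(self.tpd_slots) - 1)


page = Page()
for xid in (100, 101, 102, 103, 104):
    print(xid, page.reserve_slot(xid))
```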
UNDO: HANDLING STALE DATA
OPERATION: INSERT
■ Allocate a transaction slot
■ Emit an undo entry to fix things on error
■ Space can be reclaimed instantly after a ROLLBACK
→ The simplest operation
OPERATION: UPDATE
■ More complicated, with two cases:
  ■ The new row fits into the old space
  ■ The new row does not fit into the old space
OPERATION: UPDATE FITS
■ If the new row is shorter than the old one:
  ■ We can overwrite it
  ■ Emit undo record
In short: We hold the new row in zheap and a copy of the old row in undo so that we can copy it back to the old structure in case it is needed.
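The in-place case can be illustrated with a toy model: the table keeps only the current row, and the previous version goes into an undo list so that a rollback can copy it back. This is a simplified sketch, not zheap's code. The class name, the dict-based storage and the length check as the "fits" test are all assumptions.

```python
class ToyZheapTable:
    """Toy model: current rows in a dict, previous versions in an undo list."""

    def __init__(self):
        self.rows = {}  # row id -> current row bytes
        self.undo = []  # (row id, previous row bytes), newest last

    def insert(self, row_id: int, data: bytes) -> None:
        self.rows[row_id] = data

    def update_in_place(self, row_id: int, new_data: bytes) -> bool:
        old = self.rows[row_id]
        if len(new_data) > len(old):
            # Does not fit: a caller would fall back to delete + insert.
            return False
        self.undo.append((row_id, old))  # emit the undo record first
        self.rows[row_id] = new_data     # then overwrite the row
        return True

    def rollback(self) -> None:
        # Apply undo records newest-first, copying the old rows back.
        while self.undo:
            row_id, old = self.undo.pop()
            self.rows[row_id] = old


table = ToyZheapTable()
table.insert(1, b"hello world")
print(table.update_in_place(1, b"hello"))  # shorter row: overwritten in place
print(table.rows[1])
table.rollback()
print(table.rows[1])                       # old version restored from undo
```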
OPERATION: UPDATE DOESN’T FIT
■ The row cannot be overwritten in place:
  ■ DELETE old row
  ■ INSERT new row in a different place
■ Less efficient than an in-place UPDATE
Space can instantly be reclaimed in the following cases:
■ When updating a row to a shorter version
■ When non-inplace UPDATEs are performed
OPERATION: DELETE
■ How it works
■ Emit undo record
■ DELETE row from zheap
Old row can be moved back into zheap during ROLLBACK.
UNDO PAGE FORMAT
ROLLBACK
■ In case a ROLLBACK happens:
■ undo has to make sure that the old state of the table is restored.
■ Old rows have to be copied back
■ ROLLBACK takes longer than with a traditional heap!
Undo itself can be removed in three cases:
■ as soon as no transaction can see the data anymore
■ as soon as all undo actions have been completed
■ for committed transactions, once their changes are all-visible
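One reading of these rules is sketched below as a single decision function. It is an interpretation rather than zheap's actual logic: the record fields and the comparison against the oldest active snapshot are assumptions, and transaction id wraparound is ignored.

```python
from dataclasses import dataclass


@dataclass
class UndoLogEntry:
    xid: int            # transaction that wrote the undo
    committed: bool     # True if that transaction committed
    undo_applied: bool  # aborted transactions: rollback actions finished


def can_discard(entry: UndoLogEntry, oldest_active_xmin: int) -> bool:
    """Decide whether the undo of one transaction is still needed."""
    if entry.committed:
        # Committed: keep the undo until every running snapshot sees the
        # new data, i.e. the change is all-visible.
        return entry.xid < oldest_active_xmin
    # Aborted: the undo is needed until the old rows have been copied back.
    return entry.undo_applied


print(can_discard(UndoLogEntry(10, True, False), oldest_active_xmin=20))
print(can_discard(UndoLogEntry(30, True, False), oldest_active_xmin=20))
print(can_discard(UndoLogEntry(30, False, True), oldest_active_xmin=20))
```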
UNDO WORKERS
■ Discarding undo logs is performed by the discard worker
■ The undo launcher checks the rollback_hash_table periodically
■ It spawns new undo workers to perform the rollback
■ Each spawned undo worker processes the rollback requests for a particular database
UNDO LOG PROCESSING
OBSERVATIONS
PREPARING DATA
■ Creating some random data
test=# SET temp_buffers TO '1 GB';
SET
test=# CREATE TEMP TABLE raw AS
SELECT id,
hashtext(id::text) as name,
random() * 10000 AS n, true AS b
FROM generate_series(1, 10000000) AS id;
SELECT 10000000
LOADING A HEAP
■ Populating a normal table
test=# \timing
Timing is on.
test=# CREATE TABLE h1 (LIKE raw) USING heap;
CREATE TABLE
Time: 7.836 ms
test=# INSERT INTO h1 SELECT * FROM raw;
INSERT 0 10000000
Time: 7495.798 ms (00:07.496)
LOADING A ZHEAP
■ Mind the runtime
test=# CREATE TABLE z1 (LIKE raw) USING zheap;
CREATE TABLE
Time: 8.045 ms
test=# INSERT INTO z1 SELECT * FROM raw;
INSERT 0 10000000
Time: 27947.516 ms (00:27.948)
ZHEAP IS SMALLER
■ Smaller tuple headers make a difference
test=# \d+
List of relations
Schema | Name | Type | Owner | Persistence | Size | ...
-----------+------+-------+-------+-------------+--------+----
pg_temp_5 | raw | table | hs | temporary | 498 MB |
public | h1 | table | hs | permanent | 498 MB |
public | z1 | table | hs | permanent | 251 MB |
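A rough back-of-the-envelope estimate shows how the header sizes translate into the table sizes listed above. Several figures are assumptions not given on the slides: an 8192-byte block, a 24-byte page header, a 4-byte line pointer per row, a heap tuple header padded to 24 bytes, no alignment padding in zheap rows, and 4 transaction slots of 16 bytes on every zheap page. TPD pages and other overhead are ignored.

```python
import math

BLOCK_SIZE = 8192   # assumed default block size
PAGE_HEADER = 24    # assumed page header size
LINE_POINTER = 4    # assumed per-row line pointer
ROWS = 10_000_000   # rows loaded into h1 and z1

# Column widths of "raw": int4 id, int4 name (hashtext), float8 n, bool b.
DATA_BYTES = 4 + 4 + 8 + 1


def align8(n: int) -> int:
    """Round up to the next multiple of 8 bytes."""
    return (n + 7) // 8 * 8


def estimate_mib(row_bytes: int, page_overhead: int) -> float:
    """Estimate table size from row size and fixed per-page overhead."""
    rows_per_page = (BLOCK_SIZE - page_overhead) // (row_bytes + LINE_POINTER)
    pages = math.ceil(ROWS / rows_per_page)
    return pages * BLOCK_SIZE / (1024 * 1024)


# Heap: header padded to 24 bytes, whole row padded to an 8-byte boundary.
heap_row = align8(24 + DATA_BYTES)
# zheap: 5-byte header, assuming no alignment padding.
zheap_row = 5 + DATA_BYTES

print(f"heap : {estimate_mib(heap_row, PAGE_HEADER):.0f} MiB")
print(f"zheap: {estimate_mib(zheap_row, PAGE_HEADER + 4 * 16):.0f} MiB")
```

Under these assumptions the estimate lands close to the 498 MB and 251 MB reported by \d+, which suggests that most of the saving comes from the smaller header and the dropped padding. The close match may be partly coincidental.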
ZHEAP IN ACTION
test=# BEGIN;
BEGIN
test=*# SELECT pg_size_pretty(pg_relation_size('z1'));
pg_size_pretty
----------------
251 MB
(1 row)
test=*# UPDATE z1 SET id = id + 1;
UPDATE 10000000
test=*# SELECT pg_size_pretty(pg_relation_size('z1'));
pg_size_pretty
----------------
251 MB
(1 row)
UNDO IN ACTION
[hs@hs-MS-7817 undo]$ pwd
/home/hs/db13/base/undo
[hs@hs-MS-7817 undo]$ ls -l | tail -n 10
-rw-------. 1 hs hs 1048576 Oct 8 12:08 000001.003EC00000
-rw-------. 1 hs hs 1048576 Oct 8 12:08 000001.003ED00000
-rw-------. 1 hs hs 1048576 Oct 8 12:08 000001.003EE00000
-rw-------. 1 hs hs 1048576 Oct 8 12:08 000001.003EF00000
-rw-------. 1 hs hs 1048576 Oct 8 12:08 000001.003F000000
-rw-------. 1 hs hs 1048576 Oct 8 12:08 000001.003F100000
-rw-------. 1 hs hs 1048576 Oct 8 12:08 000001.003F200000
-rw-------. 1 hs hs 1048576 Oct 8 12:08 000001.003F300000
-rw-------. 1 hs hs 1048576 Oct 8 12:08 000001.003F400000
-rw-------. 1 hs hs 1048576 Oct 8 12:08 000001.003F500000
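The file names in this listing appear to encode a position in the undo log. The sketch below reads the part before the dot as a log number and the part after it as a hexadecimal byte offset, then checks that consecutive files are exactly one file size apart. That interpretation of the names is an assumption based only on the listing.

```python
SEGMENT_SIZE = 1048576  # file size shown by ls -l

# A few consecutive names copied from the listing.
names = [
    "000001.003EC00000",
    "000001.003ED00000",
    "000001.003EE00000",
    "000001.003EF00000",
    "000001.003F000000",
]


def parse(name: str) -> tuple:
    """Split a file name into (log number, offset), both read as hex."""
    log, offset = name.split(".")
    return int(log, 16), int(offset, 16)


offsets = [parse(n)[1] for n in names]
# Distances between neighbouring files; a single value means a fixed stride.
steps = {b - a for a, b in zip(offsets, offsets[1:])}
print(steps == {SEGMENT_SIZE})
```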
ROADMAP
WHAT WE ARE WORKING ON
■ agree on final design issues
■ fix bugs in current code
  ■ large code base
  ■ not easy to handle
■ preparing a patch to move “undo” to core
  ■ “undo” is core infrastructure
We hope to bring this into core some day.
QUESTIONS?
Feel free to contact me!
MAIL hs@cybertec.at
PHONE +43 2622 930 22-2
TWITTER @postgresql_007
