www.postgrespro.ru
In-core compression
Anastasia Lubennikova
Postgres Pro
● Russian PostgreSQL vendor
• Technical support
• Database migration
• Core PostgreSQL development
● Postgres Pro and Postgres Pro EE forks
● Contacts
• postgrespro.com
• info@postgrespro.ru
Agenda
● What does Postgres store?
• A couple of words about storage internals
● Checklist for your schema
• A set of tricks to optimize database size
● In-core block-level compression
• Out-of-the-box feature of Postgres Pro EE
What this talk doesn’t cover
● MVCC bloat
• Tune autovacuum properly
• Drop unused indexes
• Use pg_repack
• Try pg_squeeze
● WAL-log size
• Enable wal_compression
Data layout
Empty tables are not that empty
● Imagine we have no data
create table tbl();
insert into tbl select from generate_series(0,1e07);
select pg_size_pretty(pg_relation_size('tbl'));
pg_size_pretty
---------------
268 MB
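The 268 MB figure can be reproduced with simple arithmetic. The sketch below assumes stock PostgreSQL defaults that the slide does not state: 8 kB pages, a 24-byte page header, a 23-byte tuple header padded to 24 bytes, and a 4-byte line pointer per row.

```python
import math

PAGE_SIZE = 8192      # assumed default block size, in bytes
PAGE_HEADER = 24      # assumed page header size
TUPLE_HEADER = 24     # 23-byte heap tuple header padded to an 8-byte boundary
LINE_POINTER = 4      # item identifier stored in the page for every tuple

rows = 10_000_001     # generate_series(0, 1e07) includes both endpoints

# Each zero-column row still costs a tuple header plus a line pointer.
rows_per_page = (PAGE_SIZE - PAGE_HEADER) // (TUPLE_HEADER + LINE_POINTER)
pages = math.ceil(rows / rows_per_page)
size_bytes = pages * PAGE_SIZE
size_mb = size_bytes / (1024 * 1024)

print(rows_per_page, pages, round(size_mb))
```

Under these assumptions every row costs 28 bytes even though it carries no user data.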
Meta information
db=# select * from heap_page_items(get_raw_page('tbl',0));
-[ RECORD 1 ]-------------------
lp | 1
lp_off | 8160
lp_flags | 1
lp_len | 32
t_xmin | 720
t_xmax | 0
t_field3 | 0
t_ctid | (0,1)
t_infomask2 | 2
t_infomask | 2048
t_hoff | 24
t_bits |
t_oid |
t_data |
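The numeric fields in this record can be decoded by hand. The sketch below uses a subset of the flag values from PostgreSQL's `htup_details.h`; the helper name `decode_tuple_header` is made up for illustration.

```python
# Subset of t_infomask flag bits (values as defined in htup_details.h).
INFOMASK_FLAGS = {
    0x0001: "HEAP_HASNULL",
    0x0002: "HEAP_HASVARWIDTH",
    0x0100: "HEAP_XMIN_COMMITTED",
    0x0200: "HEAP_XMIN_INVALID",
    0x0400: "HEAP_XMAX_COMMITTED",
    0x0800: "HEAP_XMAX_INVALID",
}
HEAP_NATTS_MASK = 0x07FF  # low 11 bits of t_infomask2 hold the attribute count


def decode_tuple_header(lp_len, t_hoff, t_infomask, t_infomask2):
    """Hypothetical helper: summarize one heap_page_items record."""
    return {
        "natts": t_infomask2 & HEAP_NATTS_MASK,
        "flags": [name for bit, name in INFOMASK_FLAGS.items() if t_infomask & bit],
        "header_bytes": t_hoff,
        "data_bytes": lp_len - t_hoff,
    }


info = decode_tuple_header(lp_len=32, t_hoff=24, t_infomask=2048, t_infomask2=2)
print(info)
```

Read this way, the record describes a tuple with two attributes and 8 bytes of user data behind a 24-byte header, so it appears to come from a two-column table rather than from the zero-column `tbl`.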
Order matters
● Attributes must be aligned inside the row
Save up to 20% of space.
create table bad (i1 int, b1 bigint, i2 int);
create table good (i1 int, i2 int, b1 bigint);
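The padding cost of the two column orders can be estimated with a small calculator. This is a sketch that assumes the usual 64-bit layout (4-byte aligned `int`, 8-byte aligned `bigint`, 24-byte tuple header, rows padded to 8 bytes); it ignores the 4-byte line pointer.

```python
# (size, alignment) in bytes; assumed values for a typical 64-bit build.
TYPES = {"int": (4, 4), "bigint": (8, 8)}


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def row_size(columns, header=24, maxalign=8):
    """Estimate the on-disk tuple size for fixed-width columns."""
    offset = header
    for col in columns:
        size, alignment = TYPES[col]
        offset = align_up(offset, alignment) + size
    return align_up(offset, maxalign)


bad = row_size(["int", "bigint", "int"])    # int, 4 bytes padding, bigint, int, 4 bytes padding
good = row_size(["int", "int", "bigint"])   # no padding needed
saving_pct = 100 * (bad - good) / bad

print(bad, good, round(saving_pct, 1))
```

For this three-column case the reordered table is about 17% smaller per row, in line with the "up to 20%" claim.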
Alignment and B-tree
All index entries are 8-byte aligned
create table good (i1 int, i2 int, b1 bigint);
create index idx on good (i1);
create index idx_multi on good (i1, i2);
create index idx_big on good (b1);
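A rough estimate shows why all three indexes end up with entries of the same size. The sketch assumes an 8-byte index tuple header and 8-byte rounding of the whole entry; it skips per-column alignment, which makes no difference for these key combinations.

```python
INDEX_TUPLE_HEADER = 8   # assumed: 6-byte heap pointer + 2-byte info word
MAXALIGN = 8             # assumed alignment of every index entry
KEY_SIZES = {"int": 4, "bigint": 8}


def index_entry_size(key_types):
    """Estimate the size of one B-tree entry for fixed-width keys."""
    size = INDEX_TUPLE_HEADER + sum(KEY_SIZES[t] for t in key_types)
    return (size + MAXALIGN - 1) // MAXALIGN * MAXALIGN


sizes = {
    "idx": index_entry_size(["int"]),               # 12 bytes rounded up to 16
    "idx_multi": index_entry_size(["int", "int"]),  # exactly 16
    "idx_big": index_entry_size(["bigint"]),        # exactly 16
}
print(sizes)
```

The single-column `int` index wastes 4 bytes per entry on padding, which is the room a second `int` column can occupy for free.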
Alignment and B-tree
● It cannot be smaller, but it can keep more data
● Covering indexes* may come in handy here
• CREATE INDEX tbl_pkey ON tbl (i1) INCLUDING (i2)
● + It enables index-only scan for READ queries
● – It disables HOT updates for WRITE queries
*Already in Postgres Pro; planned for PostgreSQL 10 at the time of the talk (released in PostgreSQL 11 with the INCLUDE keyword)
Use proper data types
CREATE TABLE b AS
SELECT 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'::bytea;
select lp_len, t_data from heap_page_items(get_raw_page('b',0));
lp_len | t_data
-------+----------------------------------------------------------------------------
    61 | \x4b61306565626339392d396330622d346566382d626236642d366262396264333830613131
CREATE TABLE u AS
SELECT 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'::uuid;
select lp_len, t_data from heap_page_items(get_raw_page('u',0));
lp_len | t_data
-------+------------------------------------
    40 | \xa0eebc999c0b4ef8bb6d6bb9bd380a11
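The two `lp_len` values follow directly from how each type stores the value. The sketch below assumes a 24-byte tuple header and a 1-byte short varlena header for the `bytea` value; it uses Python's standard `uuid` module only to obtain the 16-byte binary form.

```python
import uuid

text = "a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11"

TUPLE_HEADER = 24          # assumed heap tuple header size
SHORT_VARLENA_HEADER = 1   # assumed 1-byte length header for short bytea values

# bytea keeps all 36 ASCII characters, hyphens included.
as_bytea = TUPLE_HEADER + SHORT_VARLENA_HEADER + len(text.encode("ascii"))

# uuid keeps the 128-bit value itself: 16 bytes, no length header.
binary = uuid.UUID(text).bytes
as_uuid = TUPLE_HEADER + len(binary)

print(as_bytea, as_uuid, binary.hex())
```

The leading byte `4b` in the `bytea` dump is that 1-byte length header; the remaining bytes are the ASCII codes of the string.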
Know your data and your database
● Use proper data types
● Reorder columns to avoid padding
● Do not normalize everything
● Pack data into bigger chunks to trigger TOAST
CFS
● CFS: "compressed file system"
• Out of the box (Postgres Pro Enterprise Edition)
[Diagram: compress / decompress]
Layout changes
● Postgres layout
● CFS layout
CFS parameters
● cfs_gc_workers = 1
• Number of background workers performing CFS garbage collection
● cfs_gc_threshold = 50%
• Percentage of garbage in a file after which defragmentation begins
● cfs_gc_period = 5 seconds
• Interval between CFS garbage collection iterations
● cfs_gc_delay = 0 milliseconds
• Delay between defragmentation of individual files
● cfs_encryption = off
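The 50% garbage threshold can be read as a simple per-file rule. The following is a simplified model of that rule, not Postgres Pro's implementation; the function names and the strict "greater than" comparison are assumptions.

```python
def garbage_percent(physical_bytes, live_bytes):
    """Share of a compressed file occupied by stale page versions."""
    if physical_bytes == 0:
        return 0.0
    return 100.0 * (physical_bytes - live_bytes) / physical_bytes


def needs_defragmentation(physical_bytes, live_bytes, threshold_pct=50.0):
    """Hypothetical check: defragment once garbage exceeds the threshold."""
    return garbage_percent(physical_bytes, live_bytes) > threshold_pct


# A 100 MB file with 40 MB of live pages is 60% garbage and gets collected.
print(needs_defragmentation(100, 40))
# With 60 MB of live pages it is only 40% garbage and is left alone.
print(needs_defragmentation(100, 60))
```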
Encryption & compression
● $ export PG_CIPHER_KEY="my top secret"
CFS usage
CREATE TABLESPACE cfs LOCATION
'/home/tblspc/cfs' with (compression=true);
SET default_tablespace=cfs;
CREATE TABLE tbl (x int);
INSERT INTO tbl SELECT generate_series(1, 1000000);
UPDATE tbl set x=x+1;
SELECT cfs_start_gc(4); /* 4 — number of workers */
Pgbench performance
● pgbench -s 1000 -i
• 2 times slower
• 98 sec → 214 sec
● database size
• 18 times smaller
• 15334 MB → 827 MB
● pgbench -c 10 -j 10 -t 10000
• 5% better
• 3904 TPS → 4126 TPS
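The rounded claims on this slide can be checked against the raw numbers. The snippet only restates the figures above; the variable names are illustrative.

```python
# Figures copied from the slide: without CFS -> with CFS.
init_before, init_after = 98, 214        # pgbench -s 1000 -i, seconds
size_before, size_after = 15334, 827     # database size, MB
tps_before, tps_after = 3904, 4126       # pgbench -c 10 -j 10 -t 10000

slowdown = init_after / init_before
shrink_factor = size_before / size_after
tps_gain_pct = (tps_after / tps_before - 1) * 100

print(round(slowdown, 2), round(shrink_factor, 1), round(tps_gain_pct, 1))
```

Initialization is roughly 2.2 times slower, the database about 18.5 times smaller, and throughput close to 6% higher.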
Comparison of compression algorithms
Configuration           Size (GB)   Time (sec)
no compression            15.31         92
snappy                     5.18         99
lz4                        4.12         91
postgres internal lz       3.89        214
lzfse                      2.80       1099
zlib (best speed)          2.43        191
zlib (default level)       2.37        284
zstd                       1.69        125
pgbench -i -s 1000
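To compare the algorithms on one scale, the table can be turned into compression ratios relative to the uncompressed run. The numbers are copied from the table; the ratio is simply the uncompressed size divided by the compressed size.

```python
# (configuration, size in GB, load time in seconds), as listed in the table.
results = [
    ("no compression", 15.31, 92),
    ("snappy", 5.18, 99),
    ("lz4", 4.12, 91),
    ("postgres internal lz", 3.89, 214),
    ("lzfse", 2.80, 1099),
    ("zlib (best speed)", 2.43, 191),
    ("zlib (default level)", 2.37, 284),
    ("zstd", 1.69, 125),
]

baseline_size = results[0][1]
compressed = results[1:]

ratios = {name: baseline_size / size for name, size, _ in compressed}
fastest = min(compressed, key=lambda row: row[2])[0]
smallest = min(compressed, key=lambda row: row[1])[0]

for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:22s} {ratio:5.2f}x")
```

In this data set zstd gives the best ratio (about 9x), while lz4 is the only algorithm that loads no slower than the uncompressed baseline.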
[Charts: compression ratio, I/O usage, CPU usage]
CFS pros
● Good compression ratio:
• All information on the page is compressed, including headers
● Better locality:
• CFS always writes new pages sequentially
● Minimal changes in Postgres core:
• CFS works at the lowest level
● Flexibility:
• Easy to use various compression algorithms
CFS cons
● Shared buffers utilization:
• The buffer cache keeps pages uncompressed
● Inefficient WAL and replication:
• Replica has to perform compression and GC itself
● Fragmentation:
• CFS needs its own garbage collector
Roadmap
● CFS
• Compression of already existing tables
• Support of multiple compression algorithms
• Optimizations for append-only tables
● Postgres data layout
• Alternative storages for specific use-cases
Thank you for your attention!
Any questions?
• postgrespro.com
• info@postgrespro.ru
• a.lubennikova@postgrespro.ru
