P R E S E N T A T I O N
PostgreSQL Training
Part 1:
Terminology
Some of what I have here is copy-pasta from https://www.postgresql.org/docs/current/glossary.html
with some extra information added from their respective pages as well as some of my own knowledge
and research.
Glossary
Command
You will see this used all over the documentation, but it's never explained.
A command is a string that is sent to the server in order for it to do something for you. In psql, commands are
separated by semicolons.
A command generally is used to:
● fetch data
● modify data
● administer the PostgreSQL instance.
SELECT * FROM my_table;
CREATE EXTENSION pg_stat_statements;
BEGIN; DELETE FROM my_table;
Object
Any object that can be created with a CREATE command.
Most objects are specific to one database, and are commonly known as local SQL objects.
Local Object
Schema Local Objects: Name and type are unique within each schema
● Relations: CREATE TABLE; CREATE VIEW; CREATE INDEX
● Routines: CREATE FUNCTION
● Data types: CREATE TYPE
Non-schema Local Objects
Local Objects: Name and type are unique within each database
● Extensions: CREATE EXTENSION
● Data type casts: CREATE CAST
● Foreign data wrappers: CREATE FOREIGN DATA WRAPPER
Global Objects
Exist entirely outside of any specific database. Names are unique within the database cluster.
● Roles: CREATE ROLE
● Tablespaces: CREATE TABLESPACE
● Replication origins: SELECT pg_replication_origin_create('origin_name')
● Subscriptions for logical replication: CREATE SUBSCRIPTION
● Databases: CREATE DATABASE
Tablespace
A named location on the server file system.
It allows database administrators to define locations in the file system where the files representing
database objects can be stored.
This is very useful if you have databases of varying sizes, or for optimizing performance. You can put a
larger or less frequently used database on a slower disk, and a very active database on a faster disk.
Initially, a database cluster contains a single usable tablespace, called pg_default, which is used as the
default for all SQL objects.
Some examples:
CREATE TABLESPACE tablespace_name LOCATION 'directory';
Then you can create an SQL Object:
CREATE DATABASE name TABLESPACE tablespace_name;
CREATE TABLE name (column_name data_type) TABLESPACE tablespace_name;
CREATE INDEX name ON table_name (column_name) TABLESPACE tablespace_name;
Database
A named collection of local SQL objects.
You need to connect to a database when connecting to a cluster.
The SQL standard calls databases “catalogs”, but there is no difference in practice.
There are 2 ways to create a database:
1. CREATE DATABASE dbname [OWNER rolename] from an SQL environment
2. createdb [-O rolename] dbname from the shell
There are 2 ways to destroy a database:
1. DROP DATABASE dbname from an SQL environment
2. dropdb dbname from the shell
Database Cluster
A collection of databases and global SQL objects, and their common static and dynamic metadata.
In PostgreSQL, the term cluster is also sometimes used to refer to an instance.
Instance
A group of backend and auxiliary processes that communicate using a common shared memory area.
One postmaster process manages the instance.
One instance manages exactly one database cluster with all its databases.
Many instances can run on the same server as long as their TCP ports do not conflict.
Postmaster
The very first process of an instance.
It manages the other processes and creates backend processes on demand.
Backend
Process of an instance which acts on behalf of a client session and handles its requests.
One backend process will be forked for each client session.
Session
A state that allows a client and backend to interact, communicating over a connection.
Connection
An established line of communication between a client process and a backend process, supporting a
session.
Usually over a network, but it can also work over a local Unix-domain socket.
Query
A type of command sent by a client to a backend.
Most of the time, a query will be retrieving data or modifying the database.
Relation
The generic term for all objects in a database that have
1. A name
2. A list of attributes defined in a specific order
Includes:
● Tables
● Sequences
● Views
● Foreign Tables
● Materialized views
● Composite types
● Indexes
Heap
This is not the memory heap of the application.
It is the data for a relation.
The heap is stored in one or more file segments.
File Segment
A physical file which stores data for a given relation.
File size is limited with --with-segsize during compilation; the default is 1 GB.
If a relation exceeds the size limit, it is split into multiple segments.
To know more than you ever needed to, see: https://www.postgresql.org/docs/current/storage-file-layout.html
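To make the splitting concrete, here is a small Python sketch (not PostgreSQL code) that lists the segment files a relation of a given size would need. It assumes the default 1 GB segment size and the naming scheme from the storage file layout page, where the first segment is named after the relation's filenode and later ones get a numeric suffix. The filenode 16384 and the helper name segment_names are made up for illustration.

```python
SEGMENT_SIZE = 1 << 30  # default segment size: 1 GB


def segment_names(filenode: int, relation_bytes: int,
                  segment_size: int = SEGMENT_SIZE) -> list[str]:
    """Return the segment file names needed to hold relation_bytes of data."""
    # Integer ceiling division; even an empty relation has its first segment.
    count = max(1, (relation_bytes + segment_size - 1) // segment_size)
    # The first segment is named after the filenode, later ones get ".1", ".2", ...
    return [str(filenode) if i == 0 else f"{filenode}.{i}" for i in range(count)]


# A hypothetical 2.5 GB relation with filenode 16384 needs three segment files.
print(segment_names(16384, 5 * SEGMENT_SIZE // 2))
```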
Table
A relation that stores a collection of tuples having a common data structure.
TOAST
Stands for: The Oversized-Attribute Storage Technique
A mechanism by which large attributes of table rows are split and stored in a secondary table, called the
TOAST table.
Each relation with large attributes has its own TOAST table.
Long string storage is generally where you will find TOAST being used.
Column
An attribute found in a table or view.
Tuple
A collection of attributes in a fixed order.
That order may be defined by the relation where the tuple is contained.
When talking about a table, a tuple is generally referred to as a row.
View
A relation that is defined by a SELECT statement, but has no storage of its own.
Any time a query references a view, the definition of the view is substituted into the query.
This substitution happens before the query planner or optimizer.
Materialized
The property that some information has been pre-computed and stored for later use, rather than
computing it on-the-fly.
Materialized View
Like a table whose contents cannot be modified directly.
Update the results with REFRESH MATERIALIZED VIEW.
You can CREATE INDEX on it.
You can also ALTER or DROP MATERIALIZED VIEW, like you can with a table.
Transaction
A combination of commands that must act as a single atomic command.
They all succeed or all fail as a single unit.
Their effects are not visible to other sessions until the transaction is complete.
Each transaction has a Transaction ID, or XID. A transaction is assigned its XID when it first
causes a database modification.
A transaction is manually started with BEGIN, and ends with ROLLBACK or COMMIT.
Commit
Finalizes a transaction, which makes it visible to the other transactions and assures its durability.
Rollback
Rolls back all of the changes made since the beginning of the transaction.
Index
A relation that contains data derived from a table or materialized view.
Its internal structure supports fast retrieval of and access to the original data.
Write-Ahead Log (WAL)
The journal that keeps track of the changes in the database cluster.
Consists of multiple WAL records, written sequentially to WAL files.
There is only 1 WAL per cluster.
WAL Record
A low-level, binary description of an individual change.
Replayed in the event of a database failure.
Appending to this sequential log is more efficient than modifying the data files directly for every change.
WAL is also a method of Postgres replication: records are streamed to the replicas and replayed.
A change to the cluster is considered persistent when its WAL record is written to disk.
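The log-first, replay-after-crash idea can be sketched with a toy model in Python. This is only an illustration of the principle, not how PostgreSQL is implemented: the class ToyDatabase, its fields, and the key/value "pages" are all invented for the example.

```python
class ToyDatabase:
    """Toy model: a durable append-only log plus pages that live in memory."""

    def __init__(self):
        self.wal = []           # durable log of (key, value) change records
        self.disk_pages = {}    # durable data files, updated lazily
        self.memory_pages = {}  # volatile state, lost on a crash

    def write(self, key, value):
        # Write-ahead: record the change in the log first,
        # then apply it to the in-memory page.
        self.wal.append((key, value))
        self.memory_pages[key] = value

    def crash(self):
        # Everything that only existed in memory is gone.
        self.memory_pages = {}

    def recover(self):
        # Start from the data files and replay every log record in order.
        self.memory_pages = dict(self.disk_pages)
        for key, value in self.wal:
            self.memory_pages[key] = value


db = ToyDatabase()
db.write("a", 1)
db.write("a", 2)
db.write("b", 3)
db.crash()
db.recover()
print(db.memory_pages)
```

Because the data files were never touched in this run, the recovered state comes entirely from replaying the log, which is the property that makes a logged change persistent.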
WAL File
A.K.A. WAL segment. A.K.A. WAL segment file.
If the system crashes, the files are read in order, eventually restoring the last state of the database.
Barman ships these WAL files to allow you to restore your database to any point in time by replaying
the WAL records until the requested time has been reached.
Each WAL file can be released after a checkpoint writes all the changes to the corresponding data files.
Releasing the file can be done by either:
● Deleting it
● Changing its name so that it will be used in the future. A.K.A. recycling.
Checkpoint
A point in the WAL sequence at which it is guaranteed that the heap and index data files have been
updated with all information from shared memory modified before that checkpoint.
A checkpoint record is written and flushed to WAL to mark that point.
A checkpoint is started:
● Every checkpoint_timeout seconds
● If max_wal_size is about to be exceeded
● When calling CHECKPOINT
Whichever comes first.
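The three triggers can be read as a simple "any of these" condition. The Python sketch below is a simplification under stated assumptions: the function checkpoint_due and its parameters are hypothetical, and the WAL size check is modelled as a plain threshold, whereas the real server starts the checkpoint before max_wal_size is actually reached. The values 300 seconds and 1 GB are used only as example settings.

```python
def checkpoint_due(seconds_since_last: float, wal_bytes_since_last: int,
                   checkpoint_timeout: float, max_wal_size: int,
                   manual_request: bool = False) -> bool:
    """Return True if any of the three checkpoint triggers applies."""
    return (
        manual_request                               # someone ran CHECKPOINT
        or seconds_since_last >= checkpoint_timeout  # timer expired
        or wal_bytes_since_last >= max_wal_size      # too much WAL accumulated
    )


GB = 1 << 30
# Quiet system: the timer is what eventually triggers the checkpoint.
print(checkpoint_due(300, 10_000, checkpoint_timeout=300, max_wal_size=GB))
# Busy system: the WAL volume triggers it long before the timer.
print(checkpoint_due(20, GB, checkpoint_timeout=300, max_wal_size=GB))
```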
Multi-version concurrency control (MVCC)
A mechanism designed to allow several transactions to be reading and writing the same rows without
one process causing other processes to stall.
A read will not block a write and a write will not block a read.
How MVCC works
Postgres stores transaction information with each row: xmin and xmax.
These are used to determine if a row is visible to a transaction or not.
As a simplification, a row is visible to a transaction if xmin < XID < xmax.
The exact rule depends on the isolation level.
By default (READ COMMITTED), as soon as a transaction is committed, the new visibility is applied to all transactions.
The SERIALIZABLE isolation level works as described above: visibility is fixed for the whole transaction.
What actually happens
A row is given:
● xmin when it is INSERTed
● xmax when it is marked as DELETEd.
Updating a row is like inserting a new row and deleting the old one.
You can query xmin and xmax from any row.
SELECT xmin, xmax FROM my_table;
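The bookkeeping above can be modelled in a few lines of Python. This is a toy sketch of the simplified rule xmin < XID < xmax, not PostgreSQL's real visibility check: it ignores whether transactions committed, ignores isolation levels and XID wraparound, and treats xmax = 0 as "not deleted". The names RowVersion, is_visible, update and visible_values are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class RowVersion:
    value: str
    xmin: int      # XID of the transaction that inserted this version
    xmax: int = 0  # XID of the transaction that deleted it; 0 means not deleted


def is_visible(row: RowVersion, xid: int) -> bool:
    """Simplified rule: xmin < XID < xmax, with an unset xmax meaning 'forever'."""
    return row.xmin < xid and (row.xmax == 0 or xid < row.xmax)


def update(table: list, old: RowVersion, new_value: str, xid: int) -> RowVersion:
    """An UPDATE marks the old version as deleted and inserts a new version."""
    old.xmax = xid
    new = RowVersion(new_value, xmin=xid)
    table.append(new)
    return new


def visible_values(table: list, xid: int) -> list:
    return [row.value for row in table if is_visible(row, xid)]


table = [RowVersion("v1", xmin=100)]        # inserted by transaction 100
update(table, table[0], "v2", xid=105)      # updated by transaction 105

print(visible_values(table, 103))  # a transaction older than the update
print(visible_values(table, 110))  # a transaction newer than the update
```

Both versions of the row exist in the table at the same time, which is why a reader and a writer do not need to block each other.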
Dead tuples (Dead rows)
When a tuple (row) is no longer visible to any transaction, it is considered dead.
Time for a little quest
Transaction Exhaustion (Wraparound)
Transaction Exhaustion
Transaction IDs are 32 bits, so you can have a total of 2^32 transactions.
The XIDs are split into 2 parts:
● XIDs in the past
● XIDs in the future
You can get your current XID with:
SELECT txid_current();
Transaction Exhaustion
When txid_current reaches 2^32, the next transaction will wrap around back to 0.
All of a sudden, all rows appear to be in the future.
All deleted rows are not deleted anymore and all created rows are not created.
This is referred to as Not a good time.
But it actually works a bit differently...
Transaction Exhaustion
Basically (all arithmetic is modulo 2^32):
● Past XIDs are txid_current - 2^31 to txid_current - 1.
● Future XIDs are txid_current + 1 to txid_current + 2^31 - 1.
So we have ~2 billion transactions.
About 1 million transactions before the Not a good time, Postgres will not allow any new
transactions, and will start a VACUUM, even if autovacuum is not enabled.
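The past/future split can be expressed as a comparison on a circle of 2^32 values. The Python sketch below mirrors the idea of treating the difference between two XIDs as a signed 32-bit number. The function name precedes is made up, and the sketch ignores the handful of reserved special XIDs that the real server treats separately.

```python
XID_SPACE = 1 << 32  # XIDs are 32-bit values
HALF = 1 << 31       # half of the space is "past", the other half "future"


def precedes(a: int, b: int) -> bool:
    """True if XID a is in the past of XID b, using modulo-2^32 arithmetic."""
    diff = (a - b) % XID_SPACE
    # Read as a signed 32-bit number, diff is negative when it is >= 2^31,
    # which means a is older than b.
    return diff >= HALF


# Just after the counter wrapped: an XID near the top of the range
# is still correctly seen as being in the past of XID 5.
print(precedes(4_294_967_290, 5))

# A row written at XID 100, seen from a transaction more than 2^31 XIDs later:
# it no longer counts as "past", so it would wrongly appear to be in the future.
print(precedes(100, 100 + HALF + 5))
```

The second case is the failure that freezing old tuples is meant to prevent.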
Transaction Exhaustion
What Not a good time looks like in the logs:
WARNING: database "mydb" must be vacuumed within x transactions
HINT: To avoid database shutdown, execute a database-wide VACUUM in "mydb"
Vacuum
It has 2 jobs:
1. Remove dead tuples from tables or materialized views.
2. Freeze tuples.
VACUUM phases, as reported by SELECT datname, phase FROM pg_stat_progress_vacuum;
1. initializing
2. scanning heap
3. vacuuming indexes
4. vacuuming heap
5. cleaning up indexes
6. truncating heap
7. performing final cleanup
See: https://www.postgresql.org/docs/current/progress-reporting.html
Tuple Freezing
Each row has a frozen bit which, if set, means that no matter what xmin and xmax are set
to, this row is always in the past.
Tuple freezing is the process of setting this bit on all tuples that are in the past of all the current
transactions.
Each table has a relfrozenxid value that is the xmin of the oldest row that is not frozen.
datfrozenxid is the oldest relfrozenxid for the database.
So datfrozenxid + 2^31 - 1 million is actually Not a good time.
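As a worked example of that formula, the Python sketch below computes how many XIDs are left before the stop point for a given datfrozenxid and current XID. It is a back-of-the-envelope model only: the function name xids_until_stop is hypothetical, the 1 million margin is the figure quoted above, and the counter is assumed not to have wrapped yet so plain integer arithmetic is enough.

```python
HALF = 1 << 31             # 2^31 XIDs can be "in the future" of datfrozenxid
SAFETY_MARGIN = 1_000_000  # margin quoted in the slides


def xids_until_stop(datfrozenxid: int, current_xid: int) -> int:
    """XIDs remaining before new transactions are refused (no wraparound assumed)."""
    stop_point = datfrozenxid + HALF - SAFETY_MARGIN
    return stop_point - current_xid


# Nothing frozen yet and 1 400M transactions used: the margin is shrinking.
print(xids_until_stop(0, 1_400_000_000))

# After freezing has advanced datfrozenxid to 1 800M, there is plenty of room again.
print(xids_until_stop(1_800_000_000, 2_000_000_000))
```

On a live system, a comparable figure can be read with SELECT datname, age(datfrozenxid) FROM pg_database; which reports how far each database's datfrozenxid lags behind the current XID.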
Transaction Exhaustion and Tuple Freezing
Visualized
Each frame of the diagram marks four points: datfrozenxid, datfrozenxid + 2^31, txid_current and txid_current + 2^31.
● datfrozenxid = 0, txid_current = 0.
● datfrozenxid = 0, txid_current = ~500M. Transactions start.
● datfrozenxid = 0, txid_current = ~900M. Everything still seems normal.
● datfrozenxid = 0, txid_current = ~1 400M. We’re starting to get close to the database shutting down.
● datfrozenxid = 0, txid_current = ~2 000M. Not a good time happened. At around 2 billion transactions, the database stops accepting connections and starts VACUUM.
● datfrozenxid = ~350M, txid_current = ~2 000M. VACUUM starts freezing tuples. We can only connect again when it’s done.
● datfrozenxid = ~700M, txid_current = ~2 000M. VACUUM is still freezing tuples. Still no new transactions allowed.
● datfrozenxid = ~1 000M, txid_current = ~2 000M. VACUUM is still freezing tuples. Still no new transactions allowed.
● datfrozenxid = ~1 400M, txid_current = ~2 000M. VACUUM is still freezing tuples. Still no new transactions allowed.
● datfrozenxid = ~1 800M, txid_current = ~2 000M. VACUUM is done. New transactions are allowed to start again.
● datfrozenxid = ~1 800M, txid_current = ~2 300M. Database is working again.
I made a little animation to help understand:
https://tuple-freezing-demo.angusd.com

More Related Content

PDF
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
PDF
Apache Calcite Tutorial - BOSS 21
PPTX
Hive + Tez: A Performance Deep Dive
PDF
PostgreSQL WAL for DBAs
PDF
The Apache Spark File Format Ecosystem
PPTX
How to Actually Tune Your Spark Jobs So They Work
PDF
Apache Spark on K8S Best Practice and Performance in the Cloud
PDF
A Deep Dive into Query Execution Engine of Spark SQL
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Apache Calcite Tutorial - BOSS 21
Hive + Tez: A Performance Deep Dive
PostgreSQL WAL for DBAs
The Apache Spark File Format Ecosystem
How to Actually Tune Your Spark Jobs So They Work
Apache Spark on K8S Best Practice and Performance in the Cloud
A Deep Dive into Query Execution Engine of Spark SQL

What's hot (20)

PDF
PySpark Best Practices
PDF
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
PDF
Making Apache Spark Better with Delta Lake
PDF
Top 5 Mistakes When Writing Spark Applications
PPTX
Zero to Snowflake Presentation
PDF
Bucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
PDF
Tanel Poder - Scripts and Tools short
PDF
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
PDF
EM12c: Capacity Planning with OEM Metrics
PDF
Practical Partitioning in Production with Postgres
 
PDF
Oracle Active Data Guard 12c: Far Sync Instance, Real-Time Cascade and Other ...
PPTX
Survey of High Performance NoSQL Systems
PPTX
From cache to in-memory data grid. Introduction to Hazelcast.
PDF
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
PPTX
Apache Tez - A unifying Framework for Hadoop Data Processing
PDF
ksqlDB: A Stream-Relational Database System
PDF
Mastering PostgreSQL Administration
 
PDF
Apache Calcite: One planner fits all
PDF
Oracle RAC 19c: Best Practices and Secret Internals
PPTX
Snowflake Architecture.pptx
PySpark Best Practices
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Making Apache Spark Better with Delta Lake
Top 5 Mistakes When Writing Spark Applications
Zero to Snowflake Presentation
Bucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
Tanel Poder - Scripts and Tools short
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
EM12c: Capacity Planning with OEM Metrics
Practical Partitioning in Production with Postgres
 
Oracle Active Data Guard 12c: Far Sync Instance, Real-Time Cascade and Other ...
Survey of High Performance NoSQL Systems
From cache to in-memory data grid. Introduction to Hazelcast.
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
Apache Tez - A unifying Framework for Hadoop Data Processing
ksqlDB: A Stream-Relational Database System
Mastering PostgreSQL Administration
 
Apache Calcite: One planner fits all
Oracle RAC 19c: Best Practices and Secret Internals
Snowflake Architecture.pptx
Ad

Similar to PostgreSQL Terminology (20)

PDF
PostgreSQL Prologue
PPTX
PDF
An evening with Postgresql
ODP
Introduction to PostgreSQL
PDF
PostgreSQL 9.0 & The Future
PPTX
PostgreSQL - Object Relational Database
PPTX
PostgreSQL- An Introduction
PPTX
PostgreSQL Database Slides
PDF
9.6_Course Material-Postgresql_002.pdf
PDF
Bn 1016 demo postgre sql-online-training
KEY
PostgreSQL
PPT
Object Relational Database Management System
PPTX
Chjkkkkkkkkkkkkkkkkkjjjjjjjjjjjjjjjjjjjjjjjjjj01_The Basics.pptx
PDF
Grokking TechTalk #20: PostgreSQL Internals 101
PDF
Discover Database
PPT
A brief introduction to PostgreSQL
PPTX
Modern sql
PDF
PostgreSQL - Case Study
PDF
Introduction to PostgreSQL for System Administrators
PDF
Demystifying PostgreSQL
PostgreSQL Prologue
An evening with Postgresql
Introduction to PostgreSQL
PostgreSQL 9.0 & The Future
PostgreSQL - Object Relational Database
PostgreSQL- An Introduction
PostgreSQL Database Slides
9.6_Course Material-Postgresql_002.pdf
Bn 1016 demo postgre sql-online-training
PostgreSQL
Object Relational Database Management System
Chjkkkkkkkkkkkkkkkkkjjjjjjjjjjjjjjjjjjjjjjjjjj01_The Basics.pptx
Grokking TechTalk #20: PostgreSQL Internals 101
Discover Database
A brief introduction to PostgreSQL
Modern sql
PostgreSQL - Case Study
Introduction to PostgreSQL for System Administrators
Demystifying PostgreSQL
Ad

More from Showmax Engineering (6)

PDF
Networking fundamentals
PPTX
Android crash course
PDF
PostgreSQL Monitoring using modern software stacks
PDF
Implementing GraphQL - Without a Backend
PDF
Deep learning features and similarity of movies based on their video content
PDF
2015 11-12 GoLang Meetup Praha
Networking fundamentals
Android crash course
PostgreSQL Monitoring using modern software stacks
Implementing GraphQL - Without a Backend
Deep learning features and similarity of movies based on their video content
2015 11-12 GoLang Meetup Praha

Recently uploaded (20)

PDF
Design an Analysis of Algorithms II-SECS-1021-03
PPTX
Essential Infomation Tech presentation.pptx
PPTX
Materi-Enum-and-Record-Data-Type (1).pptx
PDF
How to Migrate SBCGlobal Email to Yahoo Easily
PDF
medical staffing services at VALiNTRY
PPT
Introduction Database Management System for Course Database
PPTX
Operating system designcfffgfgggggggvggggggggg
PDF
Digital Strategies for Manufacturing Companies
PPTX
Oracle E-Business Suite: A Comprehensive Guide for Modern Enterprises
PPTX
CHAPTER 12 - CYBER SECURITY AND FUTURE SKILLS (1) (1).pptx
PDF
T3DD25 TYPO3 Content Blocks - Deep Dive by André Kraus
PDF
Claude Code: Everyone is a 10x Developer - A Comprehensive AI-Powered CLI Tool
PPTX
Transform Your Business with a Software ERP System
PDF
Which alternative to Crystal Reports is best for small or large businesses.pdf
PDF
Audit Checklist Design Aligning with ISO, IATF, and Industry Standards — Omne...
PDF
How to Choose the Right IT Partner for Your Business in Malaysia
PPTX
history of c programming in notes for students .pptx
PPTX
Lecture 3: Operating Systems Introduction to Computer Hardware Systems
PDF
Wondershare Filmora 15 Crack With Activation Key [2025
PDF
System and Network Administration Chapter 2
Design an Analysis of Algorithms II-SECS-1021-03
Essential Infomation Tech presentation.pptx
Materi-Enum-and-Record-Data-Type (1).pptx
How to Migrate SBCGlobal Email to Yahoo Easily
medical staffing services at VALiNTRY
Introduction Database Management System for Course Database
Operating system designcfffgfgggggggvggggggggg
Digital Strategies for Manufacturing Companies
Oracle E-Business Suite: A Comprehensive Guide for Modern Enterprises
CHAPTER 12 - CYBER SECURITY AND FUTURE SKILLS (1) (1).pptx
T3DD25 TYPO3 Content Blocks - Deep Dive by André Kraus
Claude Code: Everyone is a 10x Developer - A Comprehensive AI-Powered CLI Tool
Transform Your Business with a Software ERP System
Which alternative to Crystal Reports is best for small or large businesses.pdf
Audit Checklist Design Aligning with ISO, IATF, and Industry Standards — Omne...
How to Choose the Right IT Partner for Your Business in Malaysia
history of c programming in notes for students .pptx
Lecture 3: Operating Systems Introduction to Computer Hardware Systems
Wondershare Filmora 15 Crack With Activation Key [2025
System and Network Administration Chapter 2

PostgreSQL Terminology

  • 1. P R E S E N T A T I O N PostgreSQL Training
  • 2. Part 1: Terminology Some of what I have here is copy-pasta from https://www.postgresql.org/docs/current/glossary.html with some extra information added from their respective pages as well as some of my own knowledge and research. Glossary
  • 3. Command You will see this used all over the documentation, but it's never explained. A command is a string that is sent to the server in order for it to do something for you. In PSQL, are separated by semicolons. A command generally is used to: ● fetch data ● modify data ● administer the PostgreSQL instance. SELECT * FROM table CREATE EXTENSION pg_stat_statements BEGIN; DELETE FROM TABLE
  • 4. Object Any object that can be created with a CREATE command. Most objects are specific to one database, and commonly known as local SQL objects.
  • 5. Local Object Schema Local Objects: Name and type are unique within each schema ● Relations ● Routines ● Data types CREATE TABLE; CREATE VIEW; CREATE INDEX CREATE FUNCTION CREATE TYPE
  • 6. Non-schema Local Objects Local Objects: Name and type are unique within each database ● Extensions ● Data type casts ● Foreign data wrappers CREATE EXTENSION CREATE CAST CREATE FOREIGN DATA WRAPPER
  • 7. Global Objects Exist entirely outside of any specific database. Names are unique within the database cluster. ● Roles ● Tablespaces ● Replication origins ● Subscriptions for logical replication ● Databases CREATE ROLE CREATE TABLESPACE CALL pg_replication_origin_create() CREATE SUBSCRIPTION CREATE DATABASE
  • 8. Tablespace A named location on the server file system. Allows database admins to define locations of the filesystem where the files representing the database objects can be stored. This is very useful if you have databases of varying sizes or for optimizing performance. You can put a bigger, or less needed database on a slower disk, and a very active database on a faster disk. Initially, a database cluster contains a single usable tablespace which is used as the default for all SQL objects, called pg_default .
  • 9. Tablespace Some examples: CREATE TABLESPACE tablespace_name LOCATION 'directory'; . Then you can create an SQL Object: CREATE DATABASE name TABLESPACE tablespace_name; . CREATE TABLE name TABLESPACE tablespace_name; . CREATE INDEX name ON table_name TABLESPACE tablespace_name; .
  • 10. Database A named collection of local SQL objects. You need to connect to a database when connecting to a cluster. The SQL standard calls databases “catalogs”, but there is no difference in practice. There’s 2 ways to create a database: 1. CREATE DATABASE dbname [OWNER relename] from an SQL environment 2. createdb [-O rolename] dbname from the shell There’s 2 ways to destroy a database: 1. DROP DATABASE dbname from an SQL environment 2. dropdb dbname from the shell
  • 11. Database Cluster A collection of databases and global SQL objects, and their common static and dynamic metadata. In PostgreSQL, the term cluster is also sometimes used to refer to an instance.
  • 12. Instance A group of backend and auxiliary processes that communicate using a common shared memory area. One postmaster process manages the instance. One instance manages exactly one database cluster with all its databases. Many instances can run on the same server as long as their TCP ports do not conflict.
  • 13. Postmaster The very first process of an instance. It manages the other processes and creates backend processes on demand.
  • 14. Backend Process of an instance which acts on behalf of a client session and handles its requests. One backed process will be forked for each client session.
  • 15. Session A state that allows a client and backend to interact, communicating over a connection.
  • 16. Connection An established line of communication between a client process and a backend process, supporting a session. Usually over a network, but also can work over a socket.
  • 17. Query A type of command sent by a client to a backend. Most of the time, a query will be retrieving data or modifying the database.
  • 18. Relation The generic term for all objects in a database that have 1. A name 2. A list of attributes defined in a specific order Includes: ● Tables ● Sequences ● Views ● Foreign Tables ● Materialized views ● Composite types ● Indexes
  • 19. Heap This is not the memory heap of the application. It is the data for a relation. The heap is stored in one or more file segments.
  • 20. File Segment A physical file which stores data for a given relation. File size is limited with --with-segsize during compilation, default is 1 GB. If a relation exceeds the size limit, it is split into multiple segments. To know more than you ever needed to, see: https://www.postgresql.org/docs/current/storage-file- layout.html Storage File Layout
  • 21. Table A relation that stores a collection of tuples having a common data structure.
  • 22. TOAST Stands for: The Oversized-Attribute Storage Technique A mechanism by which large attributes of table rows are split and stored in a secondary table, called the TOAST table. Each relation with large attributes has its own TOAST table. Long string storage is generally where you will find TOAST being used.
  • 23. Column An attribute found in a table or view.
  • 24. Tuple A collection of attributes in a fixed order. That order may be defined by the relation where the tuple is contained. When talking about a table, a tuple is generally referred to as a row.
  • 25. View A relation that is defined by a SELECT statement, but has no storage of its own. Any time a query references a view, the definition of the view is substituted into the query. This substitution happens before the query planner or optimizer.
  • 26. Materialized The property that some information has been pre-computed and stored for later use, rather than computing it on-the-fly.
  • 27. Materialized View Like an immutable table. Update the results with REFRESH MATERIALIZED VIEW . You can CREATE INDEX . You can also ALTER|DROP MATERIALIZED VIEW like you can with a table.
  • 28. Transaction A combination of commands that must act as a single atomic command. They all succeed or all fail as a single unit. Their effects are not visible to other sessions until the transaction is complete. Each transaction has a Transaction ID, or XID . A session is assigned a Transaction ID when it first causes a database modification. Manually started with BEGIN , and ends with ROLLBACK or COMMIT .
  • 29. Commit Finalizes a transaction, which makes it visible to the other transactions and assures its durability.
  • 30. Rollback Rolls back all of the changes made since the beginning of the transaction.
  • 31. Index A relation that contains data derived from a table or materialized view. Its internal structure supports fast retrieval of and access to the original data.
  • 32. Write-Ahead Log (WAL) The journal that keeps track of the changes in the database cluster. Consists of multiple WAL records, written sequentially to WAL files. There is only 1 WAL per cluster.
  • 33. WAL Record A low-level, binary description of an individual change. Replayed in the event of a database failure. It's more efficient having this write-only log instead of modifying the page files directly. Also a method of Postgres replication. Records are streamed to the replicas and replayed. A change to the cluster is considered persistent when it’s WAL record is written to disk.
  • 34. WAL File A.K.A. WAL segment. A.K.A. WAL segment file. If the system crashes, the files are read in order, eventually restoring the last state of the database. Barman ships these WAL files to allow you to restore your database to any point in time by replaying the WAL records until the requested time has been reached. Each WAL file can be released after a checkpoint writes all the changes to the corresponding data files. Releasing the file can be done by either: ● Deleting it ● Changing its name so that it will be used in the future. A.K.A. recycling.
  • 35. Checkpoint A point in the WAL sequence at which it is guaranteed that the heap and index data files have been updated with all information from shared memory modified before that checkpoint. A checkpoint record is written and flushed to WAL to mark that point. A checkpoint is started: ● Every checkpoint_timeout seconds ● If max_wal_size is about to be exceeded ● When calling CHECKPOINT Whichever comes first.
  • 36. Multi-version concurrency control (MVCC) A mechanism designed to allow several transactions to be reading and writing the same rows without one process causing other processes to stall. A read will not block a write and a write will not block a read.
  • 37. How MVCC works Postgres stores transaction information with each row: xmin and xmax . These are used to determine if a row is visible to a transaction or not. A row is visible to a transaction if xmin < XID < xmax . This depends on the isolation level. By default, as soon as a transaction is committed, the new visibility is applied to all transactions. The SERIALIZABLE isolation level works as described before.
  • 38. What actually happens A row is given: ● xmin when it is INSERTed ● xmax when it is marked as DELETEd. Updating a row is like inserting a new row and deleting the old one. You can query xmin and xmax from any row. SELECT xmin, xmax FROM table;
  • 39. Dead tuples (Dead rows) When a tuple (row) is no longer visible to any transaction, it is considered dead.
  • 40. Time for a little quest Transaction Exhaustion (Wraparound)
  • 41. Transaction Exhaustion Transaction IDs are 32-bits , so you can have a total of 232 transactions. The XID s are split into 2 parts: ● XID s in the past ● XID s in the future You can get your current XID with: SELECT txid_current() .
  • 42. Transaction Exhaustion When txid_current reaches 232 , the next transaction will wrap around back to 0. All of a sudden, all rows appear to be in the future. All deleted rows are not deleted anymore and all created rows are not created.
  • 43. Transaction Exhaustion When txid_current reaches 232 , the next transaction will wrap around back to 0. All of a sudden, all rows appear to be in the future. All deleted rows are not deleted anymore and all created rows are not created. This is referred to as Not a good time .
  • 44. Transaction Exhaustion When txid_current reaches 232 , the next transaction will wrap around back to 0. All of a sudden, all rows appear to be in the future. All deleted rows are not deleted anymore and all created rows are not created. This is referred to as Not a good time . But it actually works a bit differently...
  • 45. Transaction Exhaustion Basically: ● Past XIDs are txid_current - 2^31 to txid_current - 1. ● Future XIDs are txid_current + 1 to txid_current + 2^31 - 1. So we have ~2 billion transactions. About 1 million transactions before the Not a good time, Postgres will not allow any new transactions, and will start a VACUUM, even if autovacuum is not enabled.
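The past/future split above amounts to comparing XIDs on a circle of size 2^32. A minimal sketch of that comparison follows. The function name `xid_precedes` is an assumption for illustration, and the special reserved XIDs that real PostgreSQL handles separately are ignored here.

```python
XID_SPACE = 2**32  # XIDs are 32-bit counters
HALF = 2**31       # size of the "past" half as seen from any XID


def xid_precedes(a: int, b: int) -> bool:
    """True if XID a is in the past relative to XID b, modulo 2**32.

    a counts as past when it lies between b - 2**31 and b - 1 on the circle.
    """
    return (a - b) % XID_SPACE >= HALF


# Plain case: no wraparound involved.
print(xid_precedes(100, 200))        # True
print(xid_precedes(200, 100))        # False

# The counter has wrapped: an XID just below 2**32 is still in the past of 10.
print(xid_precedes(2**32 - 5, 10))   # True

# A row written at XID 10 flips into the future once the current XID
# moves more than 2**31 ahead of it. This is the wraparound problem.
print(xid_precedes(10, 10 + 2**31 + 5))  # False
```

The last call shows why old rows must be frozen before the counter gets 2^31 ahead of them: without freezing, the row written at XID 10 would stop being visible.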
  • 46. Transaction Exhaustion What Not a good time looks like in the logs: WARNING: database "mydb" must be vacuumed within x transactions HINT: To avoid database shutdown, execute a database-wide VACUUM in "mydb"
  • 47. Vacuum It has 2 jobs: 1. Remove dead tuples from tables or materialized views. 2. Freeze tuples. VACUUM steps: SELECT datname, phase FROM pg_stat_progress_vacuum . 1. initializing 2. scanning heap 3. vacuuming indexes 4. vacuuming heap 5. cleaning up indexes 6. truncating heap 7. performing final cleanup See: https://www.postgresql.org/docs/current/progress-reporting.html Postgres Reporting
  • 48. Tuple Freezing Each row has a frozen bit which, if set, means that no matter what xmin and xmax are set to, this row is always in the past. Tuple freezing is the process of setting this bit on all tuples that are in the past of all the current transactions. Each table has a relfrozenxid value that is the xmin of the oldest row that is not frozen. datfrozenxid is the oldest relfrozenxid for the database. So datfrozenxid + 2^31 - 1 million is actually Not a good time.
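The threshold formula on this slide can be worked through with plain arithmetic. The sketch below is only that, assuming the 1 million safety margin quoted above (the exact margin may differ between PostgreSQL versions). The function names are hypothetical.

```python
XID_SPACE = 2**32
HALF = 2**31
SAFETY_MARGIN = 1_000_000  # margin quoted on the slide, treated here as an assumption


def xid_age(datfrozenxid: int, current_xid: int) -> int:
    """How many XIDs the counter has advanced since the oldest unfrozen XID."""
    return (current_xid - datfrozenxid) % XID_SPACE


def xids_remaining(datfrozenxid: int, current_xid: int) -> int:
    """XIDs left before datfrozenxid + 2**31 - 1 million is reached."""
    return HALF - SAFETY_MARGIN - xid_age(datfrozenxid, current_xid)


# Nothing frozen yet and ~2 000M transactions used: very little room left.
print(xids_remaining(0, 2_000_000_000))              # 146483648

# After freezing has moved datfrozenxid up to ~1 800M, the budget is back.
print(xids_remaining(1_800_000_000, 2_300_000_000))  # 1646483648
```

Freezing does not change the current XID. It only moves datfrozenxid forward, which is what pushes the limit further away.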
  • 49. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = 0. txid_current = 0. [Diagram: number line marked at datfrozenxid and datfrozenxid+2^31]
  • 50. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = 0. txid_current = ~500M. Transactions start. [Diagram, repeated on the following slides: number line marked at datfrozenxid, datfrozenxid+2^31, txid_current and txid_current+2^31]
  • 51. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = 0. txid_current = ~900M. Everything still seems normal.
  • 52. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = 0. txid_current = ~1 400M. We’re starting to get close to the database shutting down.
  • 53. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = 0. txid_current = ~2 000M. Not a good time happened. At around 2 billion transactions, the database stops accepting connections and starts VACUUM.
  • 54. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = ~350M. txid_current = ~2 000M. VACUUM starts freezing tuples. We can only connect again when it’s done.
  • 55. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = ~700M. txid_current = ~2 000M. VACUUM is still freezing tuples. Still no new transactions allowed.
  • 56. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = ~1 000M. txid_current = ~2 000M. VACUUM is still freezing tuples. Still no new transactions allowed.
  • 57. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = ~1 400M. txid_current = ~2 000M. VACUUM is still freezing tuples. Still no new transactions allowed.
  • 58. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = ~1 800M. txid_current = ~2 000M. VACUUM is done. New transactions are allowed to start again.
  • 60. Transaction Exhaustion and Tuple Freezing Visualized I made a little animation to help understand: https://tuple-freezing-demo.angusd.com Tuple Freezing Demo