© 2013 EDB All rights reserved. 1
Materialized views in PostgreSQL
Ashutosh Bapat | 28th March, 2014
Theoretical background
PostgreSQL's support
Use cases
(SQL) View
● “Virtual relation” defined by a query
● Represents the result of the query
● Can be queried like a table
● Referencing a view in a query requires the defining query to be executed each time
View: emp_with_good_salary
SELECT emp_name
FROM emp
WHERE salary > 15000;
Table: emp
emp_name salary
Kiran 10000
Mohan 20000
Leela 30000
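The view on this slide can be reproduced with a few statements. This is a minimal sketch for a scratch database; the column types are assumptions, since the slide shows only column names and values.

```sql
-- Column types are assumptions; the slide gives only names and sample values.
CREATE TABLE emp (emp_name text, salary integer);
INSERT INTO emp VALUES ('Kiran', 10000), ('Mohan', 20000), ('Leela', 30000);

-- An ordinary view stores only the query text, not its result.
CREATE VIEW emp_with_good_salary AS
SELECT emp_name
FROM emp
WHERE salary > 15000;

-- The defining query runs every time the view is referenced, so the
-- result (Mohan and Leela here) always reflects the current rows of emp.
SELECT emp_name FROM emp_with_good_salary;
```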
Materialized View (MV)
● A “view” whose query results are stored in the database
● Referencing a materialized view does not require execution of the query
● Needs to be “maintained” to keep up with changes in the underlying objects (tables or views)
● Can be indexed, unlike a non-materialized view
Table: emp
emp_name salary
Kiran 10000
Mohan 20000
Leela 30000
MV: emp_with_good_salary
emp_name salary
Mohan 20000
Leela 30000
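The need for maintenance can be seen in a short session. The sketch below assumes a scratch database in which the names emp and emp_with_good_salary are unused; the column types and the extra row ('Asha') are made up for illustration.

```sql
-- Column types are assumptions; the slide gives only names and sample values.
CREATE TABLE emp (emp_name text, salary integer);
INSERT INTO emp VALUES ('Kiran', 10000), ('Mohan', 20000), ('Leela', 30000);

-- The query result is computed once and stored.
CREATE MATERIALIZED VIEW emp_with_good_salary AS
SELECT emp_name, salary
FROM emp
WHERE salary > 15000;

-- Hypothetical new row that satisfies the view's WHERE clause.
INSERT INTO emp VALUES ('Asha', 25000);

-- Still returns only Mohan and Leela: the stored result is now stale.
SELECT * FROM emp_with_good_salary;

-- Re-run the defining query and replace the stored result.
REFRESH MATERIALIZED VIEW emp_with_good_salary;

-- Now returns Mohan, Leela and Asha.
SELECT * FROM emp_with_good_salary;
```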
Theoretical background
PostgreSQL's support
Use cases
Materialized Views in PostgreSQL
● Creation
– CREATE MATERIALIZED VIEW
● Maintenance
– REFRESH MATERIALIZED VIEW
● Destruction
– DROP MATERIALIZED VIEW
● Supported from 9.3
● Enhancements in 9.4
– REFRESH MATERIALIZED VIEW CONCURRENTLY
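The commands listed on this slide fit together roughly as follows. This is only a sketch: the orders table, the order_totals view and the index name are hypothetical. The CONCURRENTLY form needs PostgreSQL 9.4 or later and a unique index on the materialized view.

```sql
-- Hypothetical base table, for illustration only.
CREATE TABLE orders (
    order_id integer PRIMARY KEY,
    customer text,
    amount   numeric(13, 2)
);

-- Creation. WITH NO DATA defers running the query; the view cannot be
-- queried until it has been refreshed at least once.
CREATE MATERIALIZED VIEW order_totals AS
SELECT customer, sum(amount) AS total
FROM orders
GROUP BY customer
WITH NO DATA;

-- The pg_matviews catalog view reports whether the view holds data yet.
SELECT matviewname, ispopulated
FROM pg_matviews
WHERE matviewname = 'order_totals';

-- Maintenance: run the defining query and store its result.
REFRESH MATERIALIZED VIEW order_totals;

-- 9.4 and later: refresh without blocking concurrent SELECTs on the view.
-- This form requires a unique index on the materialized view.
CREATE UNIQUE INDEX order_totals_customer_idx ON order_totals (customer);
REFRESH MATERIALIZED VIEW CONCURRENTLY order_totals;

-- Destruction.
DROP MATERIALIZED VIEW order_totals;
```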
Refreshing MV
● Lazy refresh
– Materialized view usually contains stale data
– REFRESH periodically or at a suitable time, independent of DML activity
● Aggressive refresh
– Materialized view contains the latest data in serializable transactions and nearly fresh data at other isolation levels
– REFRESH using triggers/rules
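For lazy refresh, one possible arrangement is a helper function that an external scheduler (for example a cron job running psql) calls at a quiet time. The function name refresh_all_mvs and its parameter are hypothetical. The sketch refreshes views in no particular order, so it does not handle materialized views that depend on other materialized views.

```sql
-- Hypothetical helper: refresh every materialized view in one schema.
-- Returns the number of views refreshed.
CREATE FUNCTION refresh_all_mvs(target_schema text) RETURNS integer
LANGUAGE plpgsql AS
$$
DECLARE
    mv        record;
    refreshed integer := 0;
BEGIN
    FOR mv IN
        SELECT schemaname, matviewname
        FROM pg_matviews
        WHERE schemaname = target_schema
    LOOP
        -- %I quotes the identifiers safely.
        EXECUTE format('REFRESH MATERIALIZED VIEW %I.%I',
                       mv.schemaname, mv.matviewname);
        refreshed := refreshed + 1;
    END LOOP;
    RETURN refreshed;
END;
$$;

-- Intended to be run periodically by the external scheduler.
SELECT refresh_all_mvs('public');
```

Between runs the views serve stale data, which is the trade-off that lazy refresh accepts.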
What's not supported in 9.4
● Incremental refresh
– Refreshing only those rows affected by changes to the underlying table
– Being worked on in the community
● Using materialized views for query optimization
– Using MVs automatically
● Auto-refresh
– Refreshing a materialized view automatically when the underlying tables change
Theoretical background
PostgreSQL's support
Use cases
Reporting using stale data
● Very frequently updated tables
● Approximate reports are fine
● Create materialized views for reporting queries
● Refresh every night or on a weekly/monthly basis
Reporting region-wise sales
● Table schema
CREATE TABLE salesman(salesman_no integer PRIMARY KEY,
name varchar(100),
region varchar(100));
CREATE TABLE invoice (invoice_no integer PRIMARY KEY,
salesman_no integer REFERENCES salesman,
invoice_amt numeric(13, 2),
invoice_date date);
● Reporting Query
SELECT sum(i.invoice_amt) region_sale, s.region region
FROM salesman s, invoice i
WHERE i.salesman_no = s.salesman_no
GROUP BY s.region
ORDER BY region_sale
LIMIT 10;
Reporting region-wise sales
EXPLAIN ANALYZE SELECT sum(i.invoice_amt) region_sale, s.region region
FROM salesman s, invoice i
WHERE i.salesman_no = s.salesman_no
GROUP BY s.region
ORDER BY region_sale
LIMIT 10;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=44294.16..44294.18 rows=10 width=234) (actual time=2609.868..2609.870 rows=10 loops=1)
-> Sort (cost=44294.16..44294.66 rows=200 width=234) (actual time=2609.860..2609.861 rows=10 loops=1)
Sort Key: (sum(i.invoice_amt))
Sort Method: top-N heapsort Memory: 26kB
-> HashAggregate (cost=44287.84..44289.84 rows=200 width=234) (actual time=2609.347..2609.366 rows=26 loops=1)
-> Hash Join (cost=559.84..39828.84 rows=891800 width=234) (actual time=29.751..1374.305 rows=1000000 loops=1)
Hash Cond: (i.salesman_no = s.salesman_no)
-> Seq Scan on invoice i (cost=0.00..15288.00 rows=891800 width=20) (actual time=0.048..398.745 rows=1000000 loops=1)
-> Hash (cost=345.15..345.15 rows=5015 width=222) (actual time=29.602..29.602 rows=10000 loops=1)
Buckets: 1024 Batches: 2 Memory Usage: 685kB
-> Seq Scan on salesman s (cost=0.00..345.15 rows=5015 width=222) (actual time=0.009..5.221 rows=10000 loops=1)
Total runtime: 2610.316 ms
Reporting region-wise sales
CREATE MATERIALIZED VIEW sales_by_region AS
SELECT sum(i.invoice_amt) region_sale, s.region region
FROM salesman s, invoice i
WHERE i.salesman_no = s.salesman_no
GROUP BY s.region;
EXPLAIN ANALYZE SELECT * FROM sales_by_region ORDER BY region_sale LIMIT 10;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Limit (cost=19.17..19.19 rows=10 width=250) (actual time=0.065..0.066 rows=10 loops=1)
-> Sort (cost=19.17..19.89 rows=290 width=250) (actual time=0.064..0.064 rows=10 loops=1)
Sort Key: region_sale
Sort Method: top-N heapsort Memory: 26kB
-> Seq Scan on sales_by_region (cost=0.00..12.90 rows=290 width=250) (actual time=0.007..0.013 rows=26 loops=1)
Total runtime: 0.094 ms
(6 rows)
Complex queries
● Relatively stable underlying tables
● Complex and slow running queries
● Bonus
– Stale data not tolerable – use triggers to refresh
– Faster query results – use indexes on MV
Shortest route problem
● Table schema
CREATE TABLE roads (source char,
dest char,
length numeric(5, 2));
● Slow query
WITH RECURSIVE paths (source, dest, length, path) AS (
SELECT source, dest, length::float, '{}'::bpchar[]
FROM roads
WHERE source = 'A'
UNION ALL
SELECT p.source, r.dest, p.length + r.length, p.path || ARRAY[r.source]
FROM paths p, roads r
WHERE p.dest = r.source AND not (r.dest = ANY(p.path))
)
SELECT * FROM paths
WHERE dest = 'L'
ORDER BY length LIMIT 1;
SRP: without MV
EXPLAIN ANALYZE output for the slow query (query body abbreviated)
WITH RECURSIVE paths (source, dest, length, path) AS (
...
ORDER BY length LIMIT 1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=686.43..686.43 rows=1 width=56) (actual time=897.159..897.159 rows=1 loops=1)
CTE paths
-> Recursive Union (cost=0.00..581.31 rows=4667 width=76) (actual time=0.039..720.175 rows=138640 loops=1)
-> Seq Scan on roads (cost=0.00..27.52 rows=7 width=28) (actual time=0.036..0.061 rows=5 loops=1)
Filter: (source = 'A'::bpchar)
Rows Removed by Filter: 75
-> Hash Join (cost=2.28..46.04 rows=466 width=76) (actual time=9.528..38.388 rows=8665 loops=16)
Hash Cond: (r.source = p.dest)
Join Filter: (r.dest <> ALL (p.path))
-> Seq Scan on roads r (cost=0.00..24.00 rows=1400 width=28) (actual time=0.010..0.025 rows=80 loops=16)
-> Hash (cost=1.40..1.40 rows=70 width=56) (actual time=9.159..9.159 rows=8665 loops=16)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> WorkTable Scan on paths p (cost=0.00..1.40 rows=70 width=56) (actual time=0.008..3.959 rows=8665 loops=16)
-> Sort (cost=105.12..105.18 rows=23 width=56) (actual time=897.154..897.154 rows=1 loops=1)
Sort Key: paths.length
Sort Method: top-N heapsort Memory: 25kB
-> CTE Scan on paths (cost=0.00..105.01 rows=23 width=56) (actual time=0.696..896.652 rows=912 loops=1)
Filter: (dest = 'L'::bpchar)
Rows Removed by Filter: 137728
Total runtime: 900.970 ms
(20 rows)
SRP: Materialized View
CREATE MATERIALIZED VIEW paths AS
WITH RECURSIVE paths (source, dest, length, path) AS (
SELECT source, dest, length::float, '{}'::bpchar[]
FROM roads
UNION ALL
SELECT p.source, r.dest, p.length + r.length, p.path || ARRAY[r.source]
FROM paths p, roads r
WHERE p.dest = r.source AND not (r.dest = ANY(p.path))
)
SELECT * FROM paths;
EXPLAIN ANALYZE SELECT * FROM paths WHERE source = 'A' and dest = 'L' ORDER BY length LIMIT 1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Limit (cost=10623.33..10623.33 rows=1 width=56) (actual time=125.326..125.327 rows=1 loops=1)
-> Sort (cost=10623.33..10623.35 rows=10 width=56) (actual time=125.324..125.324 rows=1 loops=1)
Sort Key: length
Sort Method: top-N heapsort Memory: 25kB
-> Seq Scan on paths (cost=0.00..10623.28 rows=10 width=56) (actual time=0.283..124.988 rows=912 loops=1)
Filter: ((source = 'A'::bpchar) AND (dest = 'L'::bpchar))
Rows Removed by Filter: 281233
Total runtime: 125.377 ms
(8 rows)
SRP: MV with indexes
CREATE INDEX i_paths_source on paths(source, dest);
EXPLAIN ANALYZE SELECT * FROM paths WHERE source = 'A' and dest = 'L' ORDER BY length LIMIT 1;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=31.80..31.80 rows=1 width=56) (actual time=1.265..1.265 rows=1 loops=1)
-> Sort (cost=31.80..31.81 rows=7 width=56) (actual time=1.264..1.264 rows=1 loops=1)
Sort Key: length
Sort Method: top-N heapsort Memory: 25kB
-> Bitmap Heap Scan on paths (cost=4.49..31.76 rows=7 width=56) (actual time=0.327..0.982 rows=912 loops=1)
Recheck Cond: ((source = 'A'::bpchar) AND (dest = 'L'::bpchar))
-> Bitmap Index Scan on i_paths_source (cost=0.00..4.49 rows=7 width=0) (actual time=0.304..0.304 rows=912 loops=1)
Index Cond: ((source = 'A'::bpchar) AND (dest = 'L'::bpchar))
Total runtime: 1.317 ms
(9 rows)
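The plan above still fetches all 912 matching rows and sorts them to find one. A possible refinement, not measured in these slides, is to add length as a trailing index column. With equality conditions on source and dest, a B-tree index on (source, dest, length) can return rows already ordered by length, so the planner may be able to stop after the first row. The index name below is hypothetical.

```sql
-- Hypothetical index name; the columns are those of the paths materialized view.
CREATE INDEX i_paths_source_dest_length ON paths (source, dest, length);

-- Ascending order puts the shortest route first.
-- Whether the planner actually uses the new index depends on its cost estimates.
EXPLAIN ANALYZE
SELECT *
FROM paths
WHERE source = 'A' AND dest = 'L'
ORDER BY length
LIMIT 1;
```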
SRP: latest data using triggers
CREATE FUNCTION refresh_mvs() RETURNS trigger LANGUAGE plpgsql AS
$$
BEGIN
REFRESH MATERIALIZED VIEW paths;
RETURN NULL;
END;
$$;
CREATE TRIGGER paths_trig AFTER INSERT OR UPDATE OR DELETE OR TRUNCATE
ON roads
FOR EACH STATEMENT
EXECUTE PROCEDURE refresh_mvs();
SRP: latest data using triggers
SELECT * FROM paths WHERE source = 'T';
source | dest | length | path
--------+------+--------+------
(0 rows)
EXPLAIN ANALYZE INSERT INTO roads VALUES ('T', 'Z', 100.4);
QUERY PLAN
---------------------------------------------------------------------------------------------
Insert on roads (cost=0.00..0.01 rows=1 width=0) (actual time=0.033..0.033 rows=0 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
Trigger paths_trig: time=9080.960 calls=1
Total runtime: 9081.028 ms
(4 rows)
SELECT * FROM paths WHERE source = 'T';
source | dest | length | path
--------+------+--------+------
T | Z | 100.4 | {}
(1 row)
Caching foreign data
● Materialized views on foreign tables
– Data availability in case of foreign server failure
– Faster data access
– Possibly stale data
● Aggressive refresh
– Triggers on foreign tables not supported
● Being discussed in the community
– External method for firing REFRESH when foreign data changes
● Lazy refresh
– Fire REFRESH periodically
Caching foreign data
postgres=# \d+ remote_emp
Foreign table "public.remote_emp"
Column | Type | Modifiers | FDW Options | Storage | Stats target | Description
--------+-----------------------+-----------+-------------+----------+--------------+-------------
empno | numeric(4,0) | | | main | |
ename | character varying(10) | | | extended | |
job | character varying(10) | | | extended | |
Server: local_ppas
FDW Options: (schema_name 'public', table_name 'emp')
Has OIDs: no
postgres=# create materialized view cached_remote_emp as select * from remote_emp;
postgres=# explain analyze select * from cached_remote_emp;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Seq Scan on cached_remote_emp (cost=0.00..16.90 rows=690 width=88) (actual time=0.020..0.024 rows=14 loops=1)
Planning time: 0.076 ms
Total runtime: 0.068 ms
(3 rows)
postgres=# explain analyze select * from remote_emp;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Foreign Scan on remote_emp (cost=100.00..131.93 rows=731 width=88) (actual time=0.834..0.836 rows=14 loops=1)
Planning time: 0.077 ms
Total runtime: 1.451 ms
(3 rows)
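The slides do not show how remote_emp was defined. The sketch below is one way such a setup might look, assuming the postgres_fdw extension is used. The host, port, database name and credentials are placeholders; only the server name, the column list and the table options are taken from the \d+ output above.

```sql
-- Assumption: the foreign server is reached through postgres_fdw.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Connection details are placeholders, not values from the slides.
CREATE SERVER local_ppas
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'localhost', port '5444', dbname 'edb');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER local_ppas
    OPTIONS (user 'remote_user', password 'secret');

-- Columns and table options as shown by \d+ remote_emp.
CREATE FOREIGN TABLE remote_emp (
    empno numeric(4,0),
    ename varchar(10),
    job   varchar(10)
)
SERVER local_ppas
OPTIONS (schema_name 'public', table_name 'emp');

-- Local cache of the foreign data; it stays readable if the foreign server is down.
CREATE MATERIALIZED VIEW cached_remote_emp AS
SELECT * FROM remote_emp;

-- Lazy refresh: re-read the foreign table whenever the cache should be brought up to date.
REFRESH MATERIALIZED VIEW cached_remote_emp;
```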
Thank you
