How Questioning My Own Code Saved Our PostgreSQL Database

How understanding PostgreSQL internals helped us reduce database load by 80%

When Materialized Views Attack

Materialized views are one of PostgreSQL's most powerful features for query optimization. They pre-compute expensive queries and store the results, turning complex aggregations into simple table scans. But there's a dark side: the refresh process itself can become a performance nightmare.

This is the story of how our well-intentioned architecture decision nearly brought down our production database, and the systematic approach we used to fix it. More importantly, it's a guide to understanding why certain optimizations work, so you can apply these principles to your own PostgreSQL challenges.

The Architecture That Led to Trouble

Like many SaaS companies, we use a multi-tenant architecture where each customer workspace gets its own set of pre-computed analytics. Our approach:

├── Workspace 1
│   ├── Analytics View Type A
│   ├── Analytics View Type B
│   └── Analytics View Type C
├── Workspace 2
│   ├── Analytics View Type A
│   ├── Analytics View Type B
│   └── Analytics View Type C
└── ... (hundreds more)        

The math was simple but scary:

  • N workspaces × 3 view types = 3N materialized views
  • Each refresh involves: DROP OLD DATA → COMPUTE NEW DATA → WRITE TO DISK
  • All happening every 15 minutes

Understanding the Performance Cliff

Why Materialized View Refreshes Are Expensive

To understand our optimization journey, you need to understand what PostgreSQL does during a REFRESH MATERIALIZED VIEW:

  1. Lock Acquisition: Even with CONCURRENTLY, PostgreSQL needs locks
  2. Full Recomputation: The entire view is recalculated from scratch
  3. Disk I/O: Old data is marked dead, new data is written
  4. Index Rebuilding: All indexes on the MV must be updated
  5. WAL Generation: Everything gets written to the Write-Ahead Log

When you multiply this by hundreds of views, you're essentially asking your database to rebuild a significant portion of itself every 15 minutes.
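
For example, CONCURRENTLY only changes the locking story, not the amount of work: the view is still fully recomputed, and it requires a unique index covering all rows. A minimal sketch, using hypothetical table and view names:

-- Hypothetical per-workspace analytics view
CREATE MATERIALIZED VIEW view_type_a_42 AS
SELECT date_trunc('day', occurred_at) AS day, count(*) AS events
FROM events
WHERE workspace_id = 42
GROUP BY 1;

-- CONCURRENTLY requires at least one unique index on the view
CREATE UNIQUE INDEX ON view_type_a_42 (day);

-- Readers are not blocked, but the full recomputation still happens
REFRESH MATERIALIZED VIEW CONCURRENTLY view_type_a_42;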

The Catalog Query Problem

Our original implementation had a subtle but devastating flaw:

-- Checking if each materialized view exists
SELECT EXISTS (
    SELECT 1 FROM pg_matviews 
    WHERE matviewname = 'view_name_here'
)        

This looks innocent, but pg_matviews is a system view over PostgreSQL's catalogs, and those catalogs are among the most heavily accessed structures in the database: every query consults them for metadata. When you hammer them with hundreds of existence checks in rapid succession, you create contention that affects every other query in your database.

The Systematic Optimization Approach

Step 1: Understanding the Real Data Flow

The first breakthrough came from questioning our assumptions. We were joining through a customer transaction table to find workspaces, creating an expensive DISTINCT operation:

-- What we were doing (simplified)
SELECT DISTINCT workspace_id 
FROM customer_transactions 
WHERE active = true;

-- The hidden cost: scanning millions of rows to extract hundreds of values        

Why This Matters: PostgreSQL's DISTINCT operation requires sorting or hashing all rows. Even with indexes, you're asking the database to examine far more data than necessary. The cost grows linearly with your transaction table size, not your workspace count.
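
You can verify this on your own data with EXPLAIN: on a large transaction table, the plan shows a scan of the entire table feeding a HashAggregate or Sort/Unique step, just to deduplicate a few hundred values:

-- Inspect what DISTINCT actually costs
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT workspace_id
FROM customer_transactions
WHERE active = true;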

The Fix: Go straight to the source:

-- Direct query on the authoritative table
SELECT id FROM workspaces WHERE active = true;        

This seems obvious in hindsight, but it represents a crucial principle: always query the authoritative source of truth, not derived data.

Step 2: Batching Catalog Queries

The next optimization required understanding PostgreSQL's query planner. Individual queries have overhead:

  • Parser initialization
  • Query planning
  • Network round trip
  • Result marshaling

The Mathematics of Overhead:

  • 300 separate queries × 1ms overhead each = 300ms of pure overhead
  • 1 batched query × 1ms overhead = 1ms of overhead

We transformed our approach:

-- Instead of N queries, use one set-based operation
WITH view_catalog AS (
    SELECT 
        workspace_id,
        array_agg(view_name) as existing_views
    FROM (
        SELECT 
            w.id as workspace_id,
            v.matviewname as view_name
        FROM workspaces w
        CROSS JOIN LATERAL (
            SELECT matviewname 
            FROM pg_matviews 
            WHERE matviewname = ANY(
                ARRAY[
                    'view_type_a_' || w.id,
                    'view_type_b_' || w.id,
                    'view_type_c_' || w.id
                ]
            )
        ) v
    ) subquery
    GROUP BY workspace_id
)
-- One result row per workspace, listing which of its views exist
SELECT workspace_id, existing_views FROM view_catalog;

Why This Works:

  • Single catalog scan instead of hundreds
  • Leverages PostgreSQL's efficient array operations
  • Reduces lock contention on system tables
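
If the CTE above feels heavy, the same single-pass idea can be written more directly: build every expected view name as one set and probe pg_matviews once. A sketch, assuming the same naming convention:

-- One catalog probe covering all workspaces and view types
SELECT m.matviewname
FROM pg_matviews m
WHERE m.matviewname = ANY (
    SELECT 'view_type_' || t.suffix || '_' || w.id
    FROM workspaces w
    CROSS JOIN (VALUES ('a'), ('b'), ('c')) AS t(suffix)
    WHERE w.active = true
);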

Step 3: Understanding Sleep() and Scheduling Theory

When I initially built the system, I added pg_sleep(0.3) after each workspace refresh as a precautionary measure to avoid overwhelming the database. This seemed reasonable - give the system time to breathe between operations. But the math told a different story:

The Hidden Cost of Sleep:

300 workspaces × 0.3 seconds = 90 seconds of doing nothing
90 seconds ÷ 900 seconds (15 min) = 10% of our refresh window wasted        

More importantly, this represented a misunderstanding of load management. Artificial delays don't reduce load - they just spread it inefficiently. The database was perfectly capable of handling back-to-back refreshes; the problem was doing them all at once.

Better Approach - Adaptive Throttling:

-- Only pause when the system is under pressure
-- (fragment from the refresh loop of a PL/pgSQL function; refresh_count
--  is the loop counter and threshold is a tunable integer variable)
IF refresh_count > 50 AND refresh_count % 10 = 0 THEN
    -- Count active backends that are currently waiting on something
    IF (SELECT count(*) FROM pg_stat_activity 
        WHERE state = 'active' AND wait_event_type IS NOT NULL) > threshold THEN
        PERFORM pg_sleep(0.1);
    END IF;
END IF;

Step 4: Time-Based Partitioning - The Theory

The final optimization leverages queueing theory. Instead of processing all N items at once (creating a load spike), we distribute them across time slots.

The Mathematics:

  • Peak load with a single batch: all N refreshes arrive at once
  • Peak load with K partitions: roughly N/K refreshes per time slot
  • Resource utilization: changes from spiky to smooth

Implementation Using Modular Arithmetic:

-- Partition workspaces based on ID
WHERE workspace_id % total_partitions = partition_number        

This ensures:

  • Even distribution (assuming relatively sequential IDs)
  • Deterministic assignment (same workspace always in same partition)
  • Simple implementation (no additional state tables needed)
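
Putting the pieces together, here is a minimal sketch of the partitioned refresh function that the pg_cron jobs below invoke, assuming the view-naming convention from earlier (the production version also logs durations and skips views that don't exist):

-- Sketch: refresh all MVs for one partition of workspaces
CREATE OR REPLACE FUNCTION refresh_all_workspace_mvs_optimized(
    partition_number integer,
    total_partitions integer
) RETURNS void AS $$
DECLARE
    ws RECORD;
    view_prefix text;
BEGIN
    FOR ws IN
        SELECT id FROM workspaces
        WHERE active = true
          AND id % total_partitions = partition_number
        ORDER BY id
    LOOP
        FOREACH view_prefix IN ARRAY
            ARRAY['view_type_a_', 'view_type_b_', 'view_type_c_']
        LOOP
            -- %I safely quotes the dynamically built view name
            EXECUTE format('REFRESH MATERIALIZED VIEW CONCURRENTLY %I',
                           view_prefix || ws.id);
        END LOOP;
    END LOOP;
END;
$$ LANGUAGE plpgsql;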

PostgreSQL Memory Tuning for Materialized Views

Understanding Memory Parameters

PostgreSQL uses different memory pools for different operations:

  1. work_mem: Used for sorts, hashes, and joins, per operation
     • Default: 4MB
     • For MV refresh: needs to handle complex aggregations
     • Setting: (Available RAM / Max Connections) / Operations per Query
  2. maintenance_work_mem: Used for VACUUM, CREATE INDEX, and REFRESH MATERIALIZED VIEW
     • Default: 64MB
     • For MV refresh: this is your primary tuning parameter
     • Setting: 10-20% of RAM for dedicated maintenance operations
  3. effective_cache_size: A hint to the query planner about the OS page cache
     • Not an actual allocation, just a planning hint
     • Setting: 50-75% of total RAM

AWS RDS Specific Considerations

For db.t4g.medium (4GB RAM):

-- Conservative settings that leave room for connections
SET LOCAL work_mem = '64MB';  
SET LOCAL maintenance_work_mem = '256MB';  -- Key for MV refresh
SET LOCAL effective_cache_size = '2GB';        

Why These Values:

  • 4GB total RAM
  • ~600MB for shared_buffers (RDS default)
  • ~500MB for OS and RDS monitoring
  • Leaves ~3GB for PostgreSQL operations
  • With 50 connections: 64MB work_mem is safe
  • 256MB maintenance_work_mem optimizes refresh without starving other operations
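
One detail worth calling out: SET LOCAL only lasts until the end of the current transaction, so these settings must be applied in the same transaction as the refresh they are meant to help:

-- SET LOCAL scopes the setting to this transaction only
BEGIN;
SET LOCAL maintenance_work_mem = '256MB';
REFRESH MATERIALIZED VIEW CONCURRENTLY view_type_a_42;  -- hypothetical view name
COMMIT;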

Implementing with pg_cron

We use pg_cron for scheduling our distributed refresh strategy:

-- Remove old single-batch job
SELECT cron.unschedule('refresh-materialized-views');

-- Group 1: Even workspace IDs at :00, :15, :30, :45
SELECT cron.schedule(
    'refresh-mvs-group-1',
    '*/15 * * * *',
    'SELECT refresh_all_workspace_mvs_optimized(0, 2);'
);

-- Group 2: Odd workspace IDs at :05, :20, :35, :50
SELECT cron.schedule(
    'refresh-mvs-group-2', 
    '5-59/15 * * * *',
    'SELECT refresh_all_workspace_mvs_optimized(1, 2);'
);        

This spreads the load while maintaining the same 15-minute refresh frequency for each workspace.
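
pg_cron stores its schedule in a regular table, and recent versions (1.4+) also record run history, so the setup is easy to verify:

-- Confirm the jobs are registered
SELECT jobid, schedule, command FROM cron.job;

-- Check recent runs (pg_cron 1.4+)
SELECT jobid, status, start_time, end_time
FROM cron.job_run_details
ORDER BY start_time DESC
LIMIT 20;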

Monitoring and Validation

Key Metrics to Track

-- Refresh performance tracking
CREATE VIEW refresh_performance_analysis AS
WITH refresh_stats AS (
    SELECT 
        date_trunc('hour', occurred_at) as hour,
        count(*) as total_refreshes,
        avg(duration_seconds) as avg_duration,
        max(duration_seconds) as max_duration,
        stddev(duration_seconds) as duration_stddev
    FROM refresh_logs
    WHERE status = 'completed'
    GROUP BY date_trunc('hour', occurred_at)
)
SELECT 
    hour,
    total_refreshes,
    round(avg_duration::numeric, 2) as avg_duration_sec,
    round(max_duration::numeric, 2) as max_duration_sec,
    round(duration_stddev::numeric, 2) as duration_stddev_sec,
    CASE 
        WHEN duration_stddev > avg_duration * 0.5 
        THEN 'HIGH VARIANCE - INVESTIGATE'
        ELSE 'NORMAL'
    END as stability_indicator
FROM refresh_stats;        
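
This view assumes a refresh_logs table populated by the refresh function itself; a minimal shape for it might look like:

-- Hypothetical logging table written by the refresh function
CREATE TABLE IF NOT EXISTS refresh_logs (
    id bigserial PRIMARY KEY,
    view_name text NOT NULL,
    status text NOT NULL,              -- 'completed', 'failed', ...
    duration_seconds double precision,
    occurred_at timestamptz NOT NULL DEFAULT now()
);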

System Impact Verification

Using pg_stat_activity to check for lock contention:

-- Check for lock contention during refreshes
SELECT 
    blocked_locks.pid AS blocked_pid,
    blocked_activity.usename AS blocked_user,
    blocked_activity.query AS blocked_query,
    blocking_locks.pid AS blocking_pid,
    blocking_activity.query AS blocking_query
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity 
    ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks 
    ON blocking_locks.locktype = blocked_locks.locktype
    AND blocking_locks.relation = blocked_locks.relation
    AND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity 
    ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted
    AND blocking_activity.query LIKE '%REFRESH MATERIALIZED VIEW%';        
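
A simpler complementary check is to look for refreshes that have been running longer than expected:

-- Spot long-running refreshes directly
SELECT pid, now() - query_start AS runtime, query
FROM pg_stat_activity
WHERE state = 'active'
  AND pid <> pg_backend_pid()  -- exclude this monitoring query itself
  AND query ILIKE '%REFRESH MATERIALIZED VIEW%'
ORDER BY runtime DESC;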

Results and Learnings

Quantitative Improvements

  1. Query Reduction: From O(N×M) to O(1) catalog queries
  2. CPU Usage: Peak usage dropped from 100% to 35%
  3. Duration: Total refresh time reduced by 80%
  4. Consistency: Standard deviation of refresh times reduced by 65%

Key Principles Learned

  1. Understand Your Tools: Materialized views aren't just "cached queries" - they're physical tables with all associated overhead. Understanding MVCC internals helps explain why refreshes are so expensive.
  2. Measure Everything: My initial assumptions about where time was being spent were wrong. Only through systematic measurement did the real bottlenecks become clear.
  3. System Catalogs Are Shared Resources: Every query touches them. Treat them with respect.
  4. Load Distribution > Load Reduction: Sometimes the same work spread over time is far better than less work all at once.
  5. Question Your Own Decisions: That sleep() I added? It made sense when I wrote it, but the context had changed. Regular review of architectural decisions is crucial.

Alternative Approaches We Considered

Why Not Incremental Refresh?

PostgreSQL doesn't natively support incremental materialized view refresh (as of version 15). I considered:

  • Trigger-based incremental updates: Too complex for our aggregation logic
  • Logical replication to update views: Operational overhead too high
  • External CDC tools: Added architectural complexity

Why Not Regular Views?

We tested converting some MVs to regular views with aggressive indexing:

  • Query time increased 10-50x for complex aggregations
  • Index maintenance became a different bottleneck
  • Customer experience degraded unacceptably

Read Replicas Don't Help Here

Initially, I thought we could leverage our read replica for refresh operations. This was a fundamental misunderstanding - AWS RDS read replicas are strictly read-only. Any write operation, including REFRESH MATERIALIZED VIEW, must happen on the primary instance.

Conclusion: The Path Forward

Materialized views remain powerful tools, but they require careful consideration of:

  • Refresh strategies and timing
  • System resource allocation
  • PostgreSQL internals and behavior
  • Business requirements for data freshness

This optimization journey taught me that performance work isn't just about making things faster - it's about understanding why they're slow in the first place. The best optimizations often come from questioning your own assumptions and understanding the tools you're using at a deeper level.

The next time you face a PostgreSQL performance challenge, remember:

  1. Measure first, optimize second
  2. Understand the internals of what you're optimizing
  3. Simple solutions often beat complex ones
  4. Distribution of load can be more effective than reduction
  5. Your past decisions aren't sacred - question them regularly


This article represents optimization patterns that can be applied to many PostgreSQL performance challenges. While specific implementation details have been generalized for broader applicability, the principles and techniques are battle-tested in production environments handling millions of queries daily.

