Postgres Index Optimization: A Complete Guide to B-tree, GIN, and Composite Indexes

When your Postgres database starts slowing down under load, the culprit is often missing or poorly configured indexes. In this comprehensive guide, you’ll learn how to diagnose slow queries using EXPLAIN ANALYZE and implement the right indexing strategy for your workload.

The Performance Crisis: Why Indexes Matter

Let’s start with a real-world scenario. You’ve built a SaaS application on Supabase, and everything runs smoothly in development. But as you scale to 10,000 users, queries that took 50ms now take 3 seconds. Your application is grinding to a halt.

The problem? Sequential scans on large tables. Without proper indexes, Postgres must read every single row to find matches. With indexes, it can jump directly to the relevant data.

Performance Impact:

  • Without index: 3,200ms for 100,000 rows
  • With B-tree index: 12ms (266x faster)
  • With optimized composite index: 3ms (1,066x faster)
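
These numbers are illustrative and vary with hardware and table shape, but the gap is easy to reproduce. Here is a minimal sketch you can run in a scratch database (table and index names are made up for the demo):

-- Build a 100,000-row scratch table
CREATE TABLE bench_users AS
SELECT g AS id, 'user' || g || '@example.com' AS email
FROM generate_series(1, 100000) AS g;

-- No index yet: the planner must read every row (Seq Scan)
EXPLAIN ANALYZE SELECT * FROM bench_users WHERE email = 'user50000@example.com';

CREATE INDEX idx_bench_users_email ON bench_users(email);

-- The same query now jumps straight to the row (Index Scan)
EXPLAIN ANALYZE SELECT * FROM bench_users WHERE email = 'user50000@example.com';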

Understanding EXPLAIN ANALYZE: Your Diagnostic Tool

Before creating indexes, you need to understand what’s actually slow. EXPLAIN ANALYZE is your X-ray vision into query execution.

Basic EXPLAIN ANALYZE Syntax

EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'user@example.com';

Reading EXPLAIN ANALYZE Output

Let’s break down a real output:

-- Query WITHOUT index
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE user_id = 12345 AND status = 'pending';

-- Output:
Seq Scan on orders  (cost=0.00..18334.00 rows=5 width=120) (actual time=145.234..2847.123 rows=5 loops=1)
  Filter: ((user_id = 12345) AND (status = 'pending'))
  Rows Removed by Filter: 999995
Planning Time: 0.234 ms
Execution Time: 2847.456 ms

Key Metrics to Watch:

  1. Seq Scan - Sequential scan (BAD for large tables)
  2. actual time - Real execution time in milliseconds
  3. rows - Estimated vs actual rows
  4. Rows Removed by Filter - Wasted work (999,995 rows scanned but discarded!)
  5. Execution Time - Total query time

This query scanned 1 million rows to return 5 results. That’s a 99.9995% waste of resources.
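
A related red flag is a large gap between estimated and actual rows: the planner chooses plans from table statistics, and stale statistics produce bad plans. Refreshing them is cheap (autovacuum normally handles this, but run it manually after bulk loads or large deletes):

-- Refresh planner statistics for one table
ANALYZE orders;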

EXPLAIN ANALYZE with Index

-- After creating index
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

EXPLAIN ANALYZE
SELECT * FROM orders
WHERE user_id = 12345 AND status = 'pending';

-- Output:
Index Scan using idx_orders_user_status on orders  (cost=0.42..8.45 rows=5 width=120) (actual time=0.123..0.234 rows=5 loops=1)
  Index Cond: ((user_id = 12345) AND (status = 'pending'))
Planning Time: 0.134 ms
Execution Time: 0.289 ms

Results:

  • Execution time: 2847ms → 0.29ms (9,817x faster!)
  • Method changed from Seq Scan to Index Scan
  • Zero rows removed by filter (perfect targeting)

Index Types in Postgres: Choosing the Right Tool

Postgres offers multiple index types. Choosing the wrong type is like using a hammer to tighten a screw - it won’t work well.

1. B-tree Indexes: The Default Workhorse

Best For:

  • Equality comparisons (=)
  • Range queries (<, >, BETWEEN)
  • Sorting (ORDER BY)
  • Pattern matching with left-anchored patterns (LIKE 'prefix%' - see the collation caveat under Creating B-tree Indexes)

Use Cases:

  • User lookups by ID or email
  • Date range filtering
  • Numeric comparisons
  • Foreign key relationships

Creating B-tree Indexes

-- Single column index
CREATE INDEX idx_users_email ON users(email);

-- Multiple indexes for different query patterns
CREATE INDEX idx_orders_created_at ON orders(created_at);
CREATE INDEX idx_orders_total ON orders(total_amount);

-- Index with expression
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
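
A caveat on the LIKE 'prefix%' bullet above: a plain B-tree only serves left-anchored LIKE when the column uses the C collation. Under a typical UTF-8 locale, create the index with the text_pattern_ops operator class (the index name here is illustrative):

-- Enables LIKE 'prefix%' lookups under non-C collations
CREATE INDEX idx_users_email_pattern ON users(email text_pattern_ops);

-- Can now use the index
SELECT * FROM users WHERE email LIKE 'john%';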

B-tree Performance Example

-- Before index
EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'john@example.com';
-- Seq Scan: 1,234ms for 500,000 rows

-- Create index
CREATE INDEX idx_users_email ON users(email);

-- After index
EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'john@example.com';
-- Index Scan: 0.234ms (5,274x faster)

2. GIN Indexes: Full-Text Search and Arrays

Best For:

  • Full-text search
  • JSONB queries
  • Array containment (@>, <@)
  • Text search with tsvector

Use Cases:

  • Searching product descriptions
  • Filtering by tags (array columns)
  • JSONB attribute queries
  • Multi-language text search

Creating GIN Indexes

-- JSONB index
CREATE INDEX idx_products_metadata ON products USING GIN (metadata);

-- Array index
CREATE INDEX idx_posts_tags ON posts USING GIN (tags);

-- Full-text search index
CREATE INDEX idx_articles_search ON articles USING GIN (to_tsvector('english', content));

-- GIN with the jsonb_path_ops operator class: smaller and faster,
-- but supports only @> containment queries
CREATE INDEX idx_users_preferences ON users USING GIN (preferences jsonb_path_ops);

GIN Performance Example

-- Product search in JSONB metadata
-- Without index
EXPLAIN ANALYZE
SELECT * FROM products
WHERE metadata @> '{"category": "electronics", "brand": "Apple"}';
-- Seq Scan: 4,567ms

-- Create GIN index
CREATE INDEX idx_products_metadata ON products USING GIN (metadata);

-- With index
EXPLAIN ANALYZE
SELECT * FROM products
WHERE metadata @> '{"category": "electronics", "brand": "Apple"}';
-- Bitmap Index Scan using idx_products_metadata: 23ms (198x faster)

Full-Text Search with GIN

-- Create tsvector column for better performance
ALTER TABLE articles ADD COLUMN search_vector tsvector;

-- Populate search vector (coalesce prevents a NULL title or content
-- from nulling out the entire vector)
UPDATE articles
SET search_vector = to_tsvector('english', coalesce(title, '') || ' ' || coalesce(content, ''));

-- Create GIN index
CREATE INDEX idx_articles_search ON articles USING GIN (search_vector);

-- Create trigger to keep it updated
CREATE TRIGGER articles_search_update
BEFORE INSERT OR UPDATE ON articles
FOR EACH ROW EXECUTE FUNCTION
tsvector_update_trigger(search_vector, 'pg_catalog.english', title, content);

-- Query with full-text search
EXPLAIN ANALYZE
SELECT * FROM articles
WHERE search_vector @@ to_tsquery('english', 'postgres & performance');
-- Bitmap Index Scan: 12ms for complex text search across 100,000 articles
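
On Postgres 12 and newer, a stored generated column is a simpler alternative to the trigger approach above: the vector stays in sync automatically with no trigger to maintain (column and index names below are illustrative):

-- Generated column alternative (Postgres 12+)
ALTER TABLE articles ADD COLUMN search_vector_gen tsvector
GENERATED ALWAYS AS (
  to_tsvector('english', coalesce(title, '') || ' ' || coalesce(content, ''))
) STORED;

CREATE INDEX idx_articles_search_gen ON articles USING GIN (search_vector_gen);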

3. Composite Indexes: Multiple Columns, One Index

Best For:

  • Queries filtering on multiple columns
  • Queries with specific column order in WHERE clause
  • Covering indexes (index-only scans)

Column Order Matters!

  • Put equality-filtered, high-selectivity columns first; range and ORDER BY columns last
  • A query can use the index efficiently only if it constrains a leading prefix of the columns (the order of conditions inside your WHERE clause itself doesn’t matter)
  • Design for your most common query patterns

Creating Composite Indexes

-- Basic composite index
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);

-- Three-column composite
CREATE INDEX idx_events_user_type_date ON events(user_id, event_type, created_at);

-- Composite with INCLUDE for covering index
CREATE INDEX idx_users_email_include ON users(email) INCLUDE (first_name, last_name);

-- Partial composite index (filtered)
CREATE INDEX idx_active_orders ON orders(user_id, created_at)
WHERE status = 'active';

Composite Index: Column Order Impact

-- Query pattern: Filter by user_id, then sort by date
SELECT * FROM orders
WHERE user_id = 12345
ORDER BY created_at DESC
LIMIT 10;

-- GOOD: Matches query pattern
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);
-- Index Scan: 0.234ms

-- BAD: Wrong column order
CREATE INDEX idx_orders_date_user ON orders(created_at, user_id);
-- Seq Scan: 234ms (index exists but can't be used efficiently)

Covering Indexes: Index-Only Scans

-- Query needs email, first_name, last_name
SELECT first_name, last_name
FROM users
WHERE email = 'user@example.com';

-- Without INCLUDE: Index Scan + Table Lookup
CREATE INDEX idx_users_email ON users(email);
-- Execution: 0.345ms (index scan + heap fetch)

-- With INCLUDE: Index-Only Scan (no table access needed)
CREATE INDEX idx_users_email_include ON users(email)
INCLUDE (first_name, last_name);
-- Execution: 0.089ms (74% faster, no heap access)

-- Verify with EXPLAIN ANALYZE
EXPLAIN ANALYZE
SELECT first_name, last_name
FROM users
WHERE email = 'user@example.com';
-- Output: "Index Only Scan using idx_users_email_include"

Practical Indexing Strategies

Strategy 1: Start with High-Impact Queries

-- Find your slowest queries (requires the pg_stat_statements extension)
SELECT
  query,
  calls,
  total_exec_time,
  mean_exec_time,
  stddev_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
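
Once you have added indexes for the worst offenders, reset the counters so the next measurement window reflects the new plans (again assuming pg_stat_statements is installed):

-- Clear accumulated query statistics and start a fresh window
SELECT pg_stat_statements_reset();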

Strategy 2: Index Foreign Keys

-- ALWAYS index foreign keys
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_order_items_order_id ON order_items(order_id);
CREATE INDEX idx_order_items_product_id ON order_items(product_id);
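
Postgres creates indexes for primary keys and unique constraints automatically, but never for foreign keys. A simplified catalog query can surface unindexed foreign keys; it only checks the leading index column, so treat the results as candidates to review rather than a definitive list:

-- Foreign keys whose first column is not the leading column of any index
SELECT c.conrelid::regclass AS table_name,
       c.conname AS constraint_name
FROM pg_constraint c
WHERE c.contype = 'f'
  AND NOT EXISTS (
    SELECT 1
    FROM pg_index i
    WHERE i.indrelid = c.conrelid
      AND i.indkey[0] = c.conkey[1]
  );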

Strategy 3: Partial Indexes for Filtered Queries

-- Only index active records
CREATE INDEX idx_active_users ON users(email)
WHERE deleted_at IS NULL;

-- Only index recent orders
-- (partial index predicates must be IMMUTABLE, so NOW() is not allowed here;
-- use a fixed cutoff date - illustrative below - and recreate it periodically)
CREATE INDEX idx_recent_orders ON orders(user_id, created_at)
WHERE created_at > DATE '2024-01-01';

-- Only index specific status
CREATE INDEX idx_pending_orders ON orders(user_id)
WHERE status = 'pending';
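
Keep in mind that the planner only uses a partial index when the query's WHERE clause provably implies the index predicate:

-- Uses idx_pending_orders: the query predicate matches the index predicate
SELECT * FROM orders WHERE status = 'pending' AND user_id = 12345;

-- Cannot use idx_pending_orders: 'completed' rows were never indexed
SELECT * FROM orders WHERE status = 'completed' AND user_id = 12345;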

Strategy 4: Expression Indexes for Computed Values

-- Case-insensitive email search
CREATE INDEX idx_users_email_lower ON users(LOWER(email));

-- Query using the index
SELECT * FROM users WHERE LOWER(email) = LOWER('User@Example.com');

-- Date part extraction (note the extra parentheses required around the
-- expression; also assumes created_at is timestamp - with timestamptz,
-- EXTRACT is not immutable and cannot be indexed directly)
CREATE INDEX idx_orders_month ON orders((EXTRACT(MONTH FROM created_at)));

-- JSONB attribute extraction (the cast expression needs its own parentheses)
CREATE INDEX idx_products_price ON products(((metadata->>'price')::numeric));

Real-World Case Study: E-commerce Order System

Problem: Slow Order Dashboard

Scenario:

  • 5 million orders table
  • Dashboard query taking 8 seconds
  • Filtering by user, status, and date range

-- Original query (SLOW)
EXPLAIN ANALYZE
SELECT
  o.id,
  o.order_number,
  o.created_at,
  o.status,
  o.total_amount,
  u.email
FROM orders o
JOIN users u ON o.user_id = u.id
WHERE o.user_id = 12345
  AND o.status IN ('pending', 'processing')
  AND o.created_at >= NOW() - INTERVAL '30 days'
ORDER BY o.created_at DESC
LIMIT 25;

-- Output:
Hash Join  (cost=1234.56..890234.67 rows=250 width=200) (actual time=3456.123..8234.567 rows=25 loops=1)
  Hash Cond: (o.user_id = u.id)
  ->  Seq Scan on orders o  (cost=0.00..887654.00 rows=250 width=180) (actual time=2345.234..7890.123 rows=25 loops=1)
        Filter: ((user_id = 12345) AND (status = ANY ('{pending,processing}')) AND (created_at >= (now() - '30 days'::interval)))
        Rows Removed by Filter: 4999975
  ->  Hash  (cost=1234.00..1234.00 rows=45 width=20) (actual time=12.345..12.345 rows=1 loops=1)
        ->  Index Scan using users_pkey on users u  (cost=0.42..1234.00 rows=45 width=20) (actual time=0.123..12.234 rows=1 loops=1)
Execution Time: 8234.789 ms

Problems Identified:

  1. Sequential scan on 5M rows orders table
  2. 4,999,975 rows filtered out (massive waste)
  3. No index on (user_id, status, created_at)

Solution: Strategic Composite Index

-- Create optimized composite index
CREATE INDEX idx_orders_dashboard ON orders(user_id, status, created_at DESC)
WHERE status IN ('pending', 'processing');

-- Also ensure users.id has an index (it should be the primary key already).
-- If the dashboard always reads order_number and total_amount, replace the
-- index above with a covering variant rather than keeping both overlapping
-- indexes (see Mistake 1: Over-Indexing below):
DROP INDEX idx_orders_dashboard;
CREATE INDEX idx_orders_dashboard_covering ON orders(user_id, status, created_at DESC)
INCLUDE (order_number, total_amount)
WHERE status IN ('pending', 'processing');

After Optimization

EXPLAIN ANALYZE
SELECT
  o.id,
  o.order_number,
  o.created_at,
  o.status,
  o.total_amount,
  u.email
FROM orders o
JOIN users u ON o.user_id = u.id
WHERE o.user_id = 12345
  AND o.status IN ('pending', 'processing')
  AND o.created_at >= NOW() - INTERVAL '30 days'
ORDER BY o.created_at DESC
LIMIT 25;

-- Output:
Nested Loop  (cost=0.84..156.23 rows=25 width=200) (actual time=0.234..2.456 rows=25 loops=1)
  ->  Index Scan using idx_orders_dashboard_covering on orders o  (cost=0.42..123.45 rows=25 width=180) (actual time=0.123..1.234 rows=25 loops=1)
        Index Cond: ((user_id = 12345) AND (status = ANY ('{pending,processing}')) AND (created_at >= (now() - '30 days'::interval)))
  ->  Index Scan using users_pkey on users u  (cost=0.42..1.31 rows=1 width=20) (actual time=0.012..0.012 rows=1 loops=25)
        Index Cond: (id = o.user_id)
Execution Time: 2.567 ms

Results:

  • 8,235ms → 2.6ms (roughly 3,200x faster!)
  • Method changed from Seq Scan to Index Scan
  • Zero wasted row filtering
  • INCLUDE columns enable index-only scans for queries that read only indexed columns

Advanced Indexing Techniques

1. Index Maintenance and Monitoring

-- Check index usage statistics
-- (pg_stat_user_indexes exposes relname/indexrelname, not tablename/indexname)
SELECT
  schemaname,
  relname as table_name,
  indexrelname as index_name,
  idx_scan as index_scans,
  idx_tup_read as tuples_read,
  idx_tup_fetch as tuples_fetched
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC;

-- Find unused indexes (candidates for removal)
SELECT
  schemaname,
  relname as table_name,
  indexrelname as index_name,
  idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE '%_pkey';

-- Check index size
SELECT
  relname as table_name,
  indexrelname as index_name,
  pg_size_pretty(pg_relation_size(indexrelid)) as index_size
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;

-- Flag unused or unusually large indexes (bloat candidates)
SELECT
  schemaname,
  relname as table_name,
  indexrelname as index_name,
  pg_size_pretty(pg_relation_size(indexrelid)) as size,
  idx_scan,
  CASE WHEN idx_scan = 0 THEN 'UNUSED'
       WHEN pg_relation_size(indexrelid) > 100000000 THEN 'LARGE'
       ELSE 'OK'
  END as status
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;

2. REINDEX to Rebuild Bloated Indexes

-- Rebuild a specific index without blocking writes (PostgreSQL 12+)
REINDEX INDEX CONCURRENTLY idx_orders_user_status;

-- Reindex entire table
REINDEX TABLE CONCURRENTLY orders;

-- Reindex database (use carefully)
REINDEX DATABASE CONCURRENTLY your_database_name;

3. Concurrent Index Creation

-- Create index without blocking writes (production-safe)
CREATE INDEX CONCURRENTLY idx_orders_status ON orders(status);

-- Drop index without blocking
DROP INDEX CONCURRENTLY idx_old_index;
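
If a CONCURRENTLY build fails or is cancelled partway, it leaves behind an INVALID index that still adds write overhead. Check for leftovers before retrying:

-- Find indexes left INVALID by a failed concurrent build
SELECT indexrelid::regclass as index_name
FROM pg_index
WHERE NOT indisvalid;

-- Drop the broken index, then retry the build
DROP INDEX CONCURRENTLY idx_orders_status;
CREATE INDEX CONCURRENTLY idx_orders_status ON orders(status);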

Common Indexing Mistakes to Avoid

Mistake 1: Over-Indexing

-- BAD: Too many overlapping indexes
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_email_status ON users(email, status);
CREATE INDEX idx_users_email_created ON users(email, created_at);

-- GOOD: Strategic composite that serves multiple queries
CREATE INDEX idx_users_email_multi ON users(email, status, created_at);

Why it matters:

  • Each index slows down INSERT/UPDATE/DELETE operations
  • Indexes consume disk space
  • Maintenance overhead increases
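
To spot overlap like this in an existing database, list index definitions side by side; indexes that share a leading-column prefix are consolidation candidates:

-- Review all index definitions per table
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'public'
ORDER BY tablename, indexname;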

Mistake 2: Wrong Column Order in Composite Index

-- Query pattern
SELECT * FROM orders WHERE status = 'pending' AND user_id = 12345;

-- BAD: Low selectivity column first
CREATE INDEX idx_orders_wrong ON orders(status, user_id);
-- status has only a handful of distinct values, and queries that filter
-- on user_id alone cannot use this index at all

-- GOOD: High selectivity column first
CREATE INDEX idx_orders_correct ON orders(user_id, status);
-- user_id is highly selective (thousands of unique values), and the
-- index also serves user_id-only lookups

Mistake 3: Not Using Partial Indexes

-- BAD: Indexing all rows including deleted
CREATE INDEX idx_users_email ON users(email);
-- Wastes space on soft-deleted users

-- GOOD: Only index active users
CREATE INDEX idx_active_users_email ON users(email)
WHERE deleted_at IS NULL;
-- Smaller, faster, more efficient

Mistake 4: Ignoring Index Maintenance

-- Schedule regular analysis
ANALYZE orders;

-- Enable auto-vacuum (should be default)
ALTER TABLE orders SET (autovacuum_enabled = true);

-- Check for bloat and reindex periodically
REINDEX TABLE CONCURRENTLY orders;

Supabase-Specific Indexing Considerations

1. RLS (Row Level Security) and Indexes

-- RLS predicates are appended to every query on the table,
-- so index the columns your policies filter on

-- RLS Policy
CREATE POLICY "Users see own orders" ON orders
FOR SELECT USING (user_id = auth.uid());

-- Supporting index so the policy check can use an index scan
CREATE INDEX idx_orders_user_auth ON orders(user_id);
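
Supabase's performance guidance also recommends wrapping auth.uid() in a subselect so it is evaluated once per statement (as an InitPlan) instead of once per row:

-- Recreate the policy with a per-statement auth.uid() call
DROP POLICY "Users see own orders" ON orders;
CREATE POLICY "Users see own orders" ON orders
FOR SELECT USING (user_id = (SELECT auth.uid()));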

2. Realtime Subscriptions

-- Optimize realtime channel queries
CREATE INDEX idx_messages_room_created ON messages(room_id, created_at DESC)
WHERE deleted_at IS NULL;

3. Full-Text Search with Postgres and Supabase

-- Create search function
CREATE OR REPLACE FUNCTION search_products(search_term TEXT)
RETURNS TABLE (
  id UUID,
  name TEXT,
  description TEXT,
  rank REAL
) AS $$
BEGIN
  RETURN QUERY
  SELECT
    p.id,
    p.name,
    p.description,
    ts_rank(p.search_vector, query) as rank
  FROM products p,
       -- websearch_to_tsquery tolerates raw user input; to_tsquery errors on it
       websearch_to_tsquery('english', search_term) query
  WHERE p.search_vector @@ query
  -- Order by the expression: "ORDER BY rank" is ambiguous in plpgsql
  -- because it collides with the rank OUT parameter
  ORDER BY ts_rank(p.search_vector, query) DESC;
END;
$$ LANGUAGE plpgsql;

-- Create index for the search
CREATE INDEX idx_products_search ON products USING GIN (search_vector);

Performance Testing and Validation

Benchmarking Your Indexes

-- Enable client-side timing (\timing is a psql meta-command, not SQL)
\timing on

-- Test without index
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE user_id = 12345;

-- Create index
CREATE INDEX idx_orders_user ON orders(user_id);

-- Test with index
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE user_id = 12345;

-- Compare results

Load Testing

-- Generate test data
INSERT INTO orders (user_id, status, total_amount, created_at)
SELECT
  (random() * 100000)::int as user_id,
  (ARRAY['pending', 'processing', 'completed', 'cancelled'])[floor(random() * 4 + 1)::int] as status,
  (random() * 1000)::numeric(10,2) as total_amount,
  NOW() - (random() * 365 || ' days')::interval as created_at
FROM generate_series(1, 1000000);

-- Run your queries and measure

Index Strategy Checklist

Before Creating an Index:

  1. Run EXPLAIN ANALYZE on your slow query
  2. Identify whether it’s doing a Seq Scan on a large table
  3. Check if relevant columns have high cardinality
  4. Verify the query pattern is consistent
  5. Check existing indexes for overlap

When Creating an Index:

  1. Use CONCURRENTLY in production
  2. Choose appropriate index type (B-tree, GIN, etc.)
  3. Order composite index columns by selectivity
  4. Consider partial indexes for filtered queries
  5. Use INCLUDE for covering indexes

After Creating an Index:

  1. Run EXPLAIN ANALYZE to verify index usage
  2. Monitor index size and scan statistics
  3. Check impact on write operations
  4. Schedule periodic REINDEX if needed
  5. Review unused indexes monthly

Conclusion

Proper indexing can transform your Postgres database from a performance bottleneck into a speed demon. Here’s what we covered:

  1. EXPLAIN ANALYZE is your diagnostic tool - use it religiously
  2. B-tree indexes for equality, ranges, and sorting
  3. GIN indexes for full-text search, JSONB, and arrays
  4. Composite indexes for multi-column queries (column order matters!)
  5. Covering indexes eliminate table lookups entirely
  6. Partial indexes reduce size and improve performance
  7. Index maintenance prevents bloat and keeps performance high

Remember: Measure first, optimize second, validate always. Every database is different, so use EXPLAIN ANALYZE to guide your indexing decisions.

Next Steps

  1. Audit your slowest queries with pg_stat_statements
  2. Run EXPLAIN ANALYZE on each slow query
  3. Create targeted indexes based on query patterns
  4. Monitor index usage with pg_stat_user_indexes
  5. Remove unused indexes to reduce overhead

Your users will thank you when their 8-second queries drop to 8 milliseconds.

Have questions about indexing your specific use case? Drop them in the comments below.