Bitmap scans in Postgres: bridging the gap between index and sequential scans
When PostgreSQL needs to retrieve rows matching a query, its strategy is largely driven by selectivity. The query planner has several tools at its disposal, but two common options are:
- Sequential scan: read the entire table page by page, checking every row against the query conditions. Generally efficient when a large fraction of the table needs to be accessed.
- Index scan: use an index to identify the locations of matching rows, then fetch each row directly from the heap. Optimal for highly selective queries that return a small number of rows.
Index scans excel at pinpointing a few rows; sequential scans are best for grabbing large portions of the table. But what about a query that falls somewhere in between?
An index scan could find all those rows, but consider the cost: for each row found in the index, the database has to jump to a different location on disk to fetch the row from the heap. If the matching rows are scattered across many pages, this index-lookup → heap-fetch cycle repeated thousands of times produces a lot of random I/O. Fetching that many rows this way can be more expensive than just reading the whole table sequentially.
Bitmap scans bridge this gap by decoupling the index search from the heap access. The work splits across two plan nodes.
The Bitmap Index Scan walks the relevant index for entries matching the query conditions. Instead of fetching rows from the heap using the TIDs it finds, it builds an in-memory bitmap that records the locations of potentially matching rows. Ideally the bitmap operates in exact mode, tracking the specific location of every matching tuple. If too many tuples match and a fully exact bitmap would consume too much memory, Postgres transitions parts of the bitmap to lossy mode: it stops tracking individual tuples and only records which pages contain at least one matching tuple.
The Bitmap Heap Scan then reads the bitmap and accesses only the heap pages it marks, visiting them in physical order to minimize random I/O. For pages whose bitmap entry is exact, Postgres knows which tuples to fetch. For lossy entries it only knows the page might contain matches, so it examines every tuple on that page and rechecks it against the original query conditions.
When does Postgres choose a bitmap scan?
The planner picks the plan with the lowest estimated cost. Table statistics, configuration parameters (e.g., random_page_cost, work_mem), data distribution, and available indexes all feed into that estimate, so you might see different plans when running these queries yourself.
Let’s set up a sample table and walk through common scenarios where Postgres may pick a bitmap scan:
SELECT setseed(0.1);

CREATE TABLE random_table (id int, val float);
CREATE INDEX rand_ix ON random_table (val);
CREATE INDEX idx ON random_table (id);

INSERT INTO random_table (id, val)
SELECT s, random()
FROM generate_series(1, 1000000) AS s;

ANALYZE random_table;

Moderate selectivity
At moderate selectivity, a query returns too many rows for an index scan to be efficient (the estimated random I/O cost gets too high), but not enough to make a full sequential scan worthwhile.
-- High selectivity -> index scan
EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF)
SELECT * FROM random_table WHERE val < 0.000001;
-- Index Scan using rand_ix on random_table (rows=1)

-- Moderate selectivity -> bitmap scan
EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF)
SELECT * FROM random_table WHERE val < 0.01;
-- Bitmap Heap Scan + Bitmap Index Scan (rows=10092)

-- Low selectivity -> sequential scan
EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF)
SELECT * FROM random_table WHERE val < 0.5;
-- Seq Scan (rows=499449)

Low index correlation
Correlation measures how well the physical order of rows in the table matches the logical order of the index keys. Low correlation means rows with similar index values are scattered across many table pages. Fetching even a moderate number of rows via an index scan then requires visiting many distinct heap pages, driving up the estimated cost. A bitmap scan becomes relatively more attractive because it consolidates those reads into a single ordered pass over the heap.
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'random_table';

 attname | correlation
---------+--------------
 id      |            1
 val     | 0.0024226764

Let’s force high correlation using CLUSTER and see the impact:
-- Cluster the table physically according to the index
CLUSTER random_table USING rand_ix;
ANALYZE random_table;

-- Rerun the moderately selective query
EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF)
SELECT * FROM random_table WHERE val < 0.01;
-- Output now shows index scan

After clustering, the rows the query needs are physically close together on disk. The index scan’s estimated cost drops because fewer distinct pages need to be fetched randomly, making it the preferred plan over the bitmap scan.
Combining multiple index conditions
Bitmap scans are particularly good at combining results from multiple indexes, especially with OR conditions. Bitmaps from individual index scans can be combined in memory (BitmapOr, BitmapAnd) before any heap access happens.
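As a sketch, here is what this looks like against the sample table above (the exact plan shape and row counts depend on your statistics and settings):

```sql
-- OR across two indexed columns: each index is scanned separately,
-- the resulting bitmaps are merged in memory by a BitmapOr node,
-- and the heap is read only once afterwards.
EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF)
SELECT * FROM random_table
WHERE id < 5000 OR val < 0.01;
-- Bitmap Heap Scan on random_table
--   Recheck Cond: ((id < 5000) OR (val < 0.01))
--   ->  BitmapOr
--         ->  Bitmap Index Scan on idx
--               Index Cond: (id < 5000)
--         ->  Bitmap Index Scan on rand_ix
--               Index Cond: (val < '0.01'::double precision)
```

Neither a plain index scan nor a sequential scan can use two indexes for one table scan this way, which is why OR conditions over separately indexed columns so often plan as bitmap scans.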
Bitmap internals
When building a bitmap, Postgres uses a data structure called TIDBitmap: a hash table keyed by page numbers. Each entry contains the actual bitmap data — either matching tuples for a specific page (in exact mode) or presence information for a chunk of pages (in lossy mode). Each entry is a PagetableEntry:
typedef struct PagetableEntry
{
    BlockNumber blockno;    /* page number (hashtable key) */
    char        status;     /* hash entry status */
    bool        ischunk;    /* T = lossy storage, F = exact */
    bool        recheck;    /* should the tuples be rechecked? */
    bitmapword  words[Max(WORDS_PER_PAGE, WORDS_PER_CHUNK)];
} PagetableEntry;

A PagetableEntry operates in one of two modes, dictated by the ischunk flag:
- Exact mode (ischunk = false)
  - Represents a single table page identified by blockno.
  - The words[] array is a precise bitmap for that page. Each set bit corresponds to a specific tuple offset that matched the index condition.
  - The Bitmap Heap Scan can fetch those specific tuples directly.
- Lossy mode (ischunk = true)
  - To conserve memory when many tuples match (the bitmap would otherwise exceed work_mem), Postgres switches to lossy storage for chunks of pages. blockno now stores the starting page number of the chunk.
  - The words[] array is reinterpreted: each bit corresponds to an entire page within the chunk (blockno + k). A set bit only indicates that the page contains at least one potentially matching tuple — information about which specific tuples match is lost.
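To make the lossy mapping concrete, here is a back-of-the-envelope sketch. Assume, purely for illustration, that a chunk covers 256 consecutive pages (the real chunk size depends on the server's block size and bitmapword width); a page number then maps to a chunk start and a bit position with simple integer arithmetic:

```sql
-- Hypothetical chunk size of 256 pages; page 70000 lands in the
-- chunk whose blockno is 69888, at bit position 112 within it.
SELECT 70000 / 256 * 256 AS chunk_start,      -- blockno stored for the chunk
       70000 % 256       AS bit_within_chunk; -- which bit marks this page
```

One chunk entry thus summarizes hundreds of pages in the space that would otherwise describe a single page exactly, which is how the bitmap stays within its memory budget.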
The bitmap’s memory budget — set by work_mem — decides whether it stays exact or transitions to lossy. Run the same query under default and tightly-restricted work_mem and compare:
SET max_parallel_workers_per_gather = 0; -- disable parallel plans
SET work_mem = '4MB'; -- the default
EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF, BUFFERS OFF, COSTS OFF)
SELECT * FROM random_table WHERE val < 0.05;

                           QUERY PLAN
---------------------------------------------------------------
 Bitmap Heap Scan on random_table (actual rows=50195.00 loops=1)
   Recheck Cond: (val < '0.05'::double precision)
   Heap Blocks: exact=5406
   ->  Bitmap Index Scan on rand_ix (actual rows=50195.00 loops=1)
         Index Cond: (val < '0.05'::double precision)
         Index Searches: 1

Now reduce work_mem significantly:
SET work_mem = '64kB';
EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF, BUFFERS OFF, COSTS OFF)
SELECT * FROM random_table WHERE val < 0.05;

                           QUERY PLAN
---------------------------------------------------------------
 Bitmap Heap Scan on random_table (actual rows=50195.00 loops=1)
   Recheck Cond: (val < '0.05'::double precision)
   Rows Removed by Index Recheck: 817154
   Heap Blocks: exact=747 lossy=4659
   ->  Bitmap Index Scan on rand_ix (actual rows=50195.00 loops=1)
         Index Cond: (val < '0.05'::double precision)
         Index Searches: 1

The bitmap can no longer hold a per-tuple entry for every page, so most pages flip to lossy storage. The heap scan compensates by rechecking each tuple on those pages against the original condition.
Putting it together
Bitmap scans sit between the two extremes: cheaper than an index scan when matching rows are spread across many heap pages, cheaper than a sequential scan when only a fraction of the table needs to be read. They show up most often for medium-selectivity queries, low-correlation columns, and OR’d conditions across multiple indexes. And when EXPLAIN ANALYZE shows lossy next to Heap Blocks, work_mem is the knob to reach for.
The planner’s choice between access methods is heavily influenced by the cost parameters seq_page_cost (default 1.0) and random_page_cost (default 4.0). The defaults assume traditional spinning disks where random access is much slower than sequential. On SSDs, random access is much closer to sequential, and lowering random_page_cost (e.g., to 1.1 or 1.5) tells the planner that random I/O isn’t as expensive — which makes index scans look relatively cheaper and can shift moderately-selective queries away from bitmap scans.
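You can try this safely in a single session before changing anything cluster-wide (the value and the resulting plan shift are illustrative):

```sql
-- Lower random_page_cost for this session only and re-plan the
-- moderately selective query from earlier.
SET random_page_cost = 1.1;
EXPLAIN SELECT * FROM random_table WHERE val < 0.01;
-- With cheaper random I/O, the planner may now prefer a plain
-- Index Scan where it previously chose a bitmap scan.
RESET random_page_cost;
```

As always, verify with EXPLAIN (ANALYZE) on your own workload before persisting a change like this via ALTER SYSTEM or postgresql.conf.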