文章

Index Only Scan 的“Only”是理想状态,不是保证

如何使用 pg_visibility 查看表的 visibility map

Index Only Scan 的“Only”是理想状态,不是保证

Index Only Scan 的“Only”是理想状态,不是保证

有如下这样一张表:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
test=# \d+ t2;
                                                              Table "test.t2"
   Column   |         Type          | Collation | Nullable |            Default             | Storage  | Compression | Stats target | Description
------------+-----------------------+-----------+----------+--------------------------------+----------+-------------+--------------+-------------
 id         | integer               |           | not null | nextval('t2_id_seq'::regclass) | plain    |             |              |
 name       | character varying(32) |           |          |                                | extended |             |              |
 identifier | character(18)         |           | not null |                                | extended |             |              |
Indexes:
    "t2_pkey" PRIMARY KEY, btree (id)
    "t2_identifier_key" UNIQUE CONSTRAINT, btree (identifier)
Not-null constraints:
    "t2_id_not_null" NOT NULL "id"
    "t2_identifier_not_null" NOT NULL "identifier"
Access method: heap

执行如下SQL,并查看执行计划:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
test=# EXPLAIN ANALYZE
test-# SELECT /*+ IndexOnlyScan(t t2_identifier_key) */
test-# identifier
test-# FROM t2 t
test-# WHERE identifier = '110101198601235432';
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using t2_identifier_key on t2 t  (cost=0.13..8.15 rows=1 width=19) (actual time=0.024..0.024 rows=1.00 loops=1)
   Index Cond: (identifier = '210101198501235432'::bpchar)
   Heap Fetches: 1
   Index Searches: 1
   Buffers: shared hit=2
 Planning:
   Buffers: shared hit=38
 Planning Time: 0.214 ms
 Execution Time: 0.037 ms
(9 rows)

发现一个问题,即使执行计划走Index Only Scan进行扫描,最终还是有一个回表动作:Heap Fetches: 1,这是怎么回事呢?

[!WARNING]

❗ Heap Fetches 出现的原因不是索引问题

👉 而是:

Visibility Map 没标记该 page 为 all-visible

为什么不是 all-visible?

这是重点👇

[!IMPORTANT]

只有当一个 page 满足:

✔ 所有行对所有事务都可见 ✔ 没有未提交事务 ✔ 没有 recently updated/delete

👉 才会标记:

all-visible = true

[!IMPORTANT]

👉 pg_visibility 是一个系统扩展,用来查看:

  • 哪些数据页是 all-visible
  • 哪些不是(会导致 Heap Fetch)

先确认是否安装了pg_visibility, 如果没有就安装扩展

1
2
3
4
5
6
7
test=# SELECT * FROM pg_extension WHERE extname = 'pg_visibility';
 oid | extname | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition
-----+---------+----------+--------------+----------------+------------+-----------+--------------
(0 rows)

test=# CREATE EXTENSION pg_visibility;
CREATE EXTENSION

[!IMPORTANT]

✅ 1️⃣ 查看每个 page 的可见性

1
2
3
4
5
test=# select * from pg_visibility('t2');
 blkno | all_visible | all_frozen | pd_all_visible
-------+-------------+------------+----------------
     0 | f           | f          | f
(1 row)

✅ 2️⃣ 统计多少 page 是 all-visible

1
2
3
4
test=# select count(*) total_pages,SUM(CASE WHEN all_visible THEN 1 ELSE 0 END) all_visible_pages from pg_visibility('t2');
 total_pages | all_visible_pages
-------------+-------------------
           1 |                 0

🔴 整张表的唯一一个数据页,不是 all-visible

👉 推导结论:

任何 Index Only Scan → 都必须回表 → Heap Fetches = 行数

那问题就来了,为什么这个 page 不是 all-visible 呢?

这是由于刚才对 t2 表进行了一个update操作,但没 VACUUM;

[!NOTE]

也就是说 DML → 不会更新 Visibility Map

👉 VM 只在这些时候更新:

✔ VACUUM ✔ autovacuum

1
2
3
4
5
6
7
test=# vacuum t2;
VACUUM
test=# select * from pg_visibility('t2');
 blkno | all_visible | all_frozen | pd_all_visible
-------+-------------+------------+----------------
     0 | t           | t          | t
(1 row)

再次执行上述查询:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
test=# EXPLAIN ANALYZE
test-# SELECT /*+ IndexOnlyScan(t t2_identifier_key) */
test-# identifier
test-# FROM t2 t
test-# WHERE identifier = '110101198601235432';
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using t2_identifier_key on t2 t  (cost=0.13..4.15 rows=1 width=19) (actual time=0.023..0.024 rows=1.00 loops=1)
   Index Cond: (identifier = '110101198601235432'::bpchar)
   Heap Fetches: 0
   Index Searches: 1
   Buffers: shared hit=2
 Planning:
   Buffers: shared hit=26
 Planning Time: 0.158 ms
 Execution Time: 0.053 ms
(9 rows)
本文由作者按照 CC BY 4.0 进行授权