Index Only Scan 的“Only”是理想状态,不是保证
如何使用 pg_visibility 查看表的 visibility map
Index Only Scan 的“Only”是理想状态,不是保证
Index Only Scan 的“Only”是理想状态,不是保证
有如下这样一张表:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
test=# \d+ t2;
Table "test.t2"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
------------+-----------------------+-----------+----------+--------------------------------+----------+-------------+--------------+-------------
id | integer | | not null | nextval('t2_id_seq'::regclass) | plain | | |
name | character varying(32) | | | | extended | | |
identifier | character(18) | | not null | | extended | | |
Indexes:
"t2_pkey" PRIMARY KEY, btree (id)
"t2_identifier_key" UNIQUE CONSTRAINT, btree (identifier)
Not-null constraints:
"t2_id_not_null" NOT NULL "id"
"t2_identifier_not_null" NOT NULL "identifier"
Access method: heap
执行如下SQL,并查看执行计划:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
test=# EXPLAIN ANALYZE
test-# SELECT /*+ IndexOnlyScan(t t2_identifier_key) */
test-# identifier
test-# FROM t2 t
test-# WHERE identifier = '110101198601235432';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using t2_identifier_key on t2 t (cost=0.13..8.15 rows=1 width=19) (actual time=0.024..0.024 rows=1.00 loops=1)
Index Cond: (identifier = '210101198501235432'::bpchar)
Heap Fetches: 1
Index Searches: 1
Buffers: shared hit=2
Planning:
Buffers: shared hit=38
Planning Time: 0.214 ms
Execution Time: 0.037 ms
(9 rows)
发现一个问题,即使执行计划走Index Only Scan进行扫描,最终还是有一个回表动作:Heap Fetches: 1,这是怎么回事呢?
[!WARNING]
❗ Heap Fetches 出现的原因不是索引问题
👉 而是:
Visibility Map 没标记该 page 为 all-visible
为什么不是 all-visible?
这是重点👇
[!IMPORTANT]
只有当一个 page 满足:
✔ 所有行对所有事务都可见 ✔ 没有未提交事务 ✔ 没有 recently updated/delete
👉 才会标记:
all-visible = true
[!IMPORTANT]
👉
pg_visibility是一个系统扩展,用来查看:
- 哪些数据页是 all-visible
- 哪些不是(会导致 Heap Fetch)
先确认是否安装了pg_visibility, 如果没有就安装扩展
1
2
3
4
5
6
7
test=# SELECT * FROM pg_extension WHERE extname = 'pg_visibility';
oid | extname | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition
-----+---------+----------+--------------+----------------+------------+-----------+--------------
(0 rows)
test=# CREATE EXTENSION pg_visibility;
CREATE EXTENSION
[!IMPORTANT]
✅ 1️⃣ 查看每个 page 的可见性
1 2 3 4 5 test=# select * from pg_visibility('t2'); blkno | all_visible | all_frozen | pd_all_visible -------+-------------+------------+---------------- 0 | f | f | f (1 row)✅ 2️⃣ 统计多少 page 是 all-visible
1 2 3 4 test=# select count(*) total_pages,SUM(CASE WHEN all_visible THEN 1 ELSE 0 END) all_visible_pages from pg_visibility('t2'); total_pages | all_visible_pages -------------+------------------- 1 | 0🔴 整张表的唯一一个数据页,不是 all-visible
👉 推导结论:
任何 Index Only Scan → 都必须回表 → Heap Fetches = 行数
那问题就来了,为什么这个 page 不是 all-visible 呢?
这是由于刚才对 t2 表进行了一个update操作,但没 VACUUM;
[!NOTE]
也就是说 DML → 不会更新 Visibility Map
👉 VM 只在这些时候更新:
✔ VACUUM ✔ autovacuum
1
2
3
4
5
6
7
test=# vacuum t2;
VACUUM
test=# select * from pg_visibility('t2');
blkno | all_visible | all_frozen | pd_all_visible
-------+-------------+------------+----------------
0 | t | t | t
(1 row)
再次执行上述查询:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
test=# EXPLAIN ANALYZE
test-# SELECT /*+ IndexOnlyScan(t t2_identifier_key) */
test-# identifier
test-# FROM t2 t
test-# WHERE identifier = '110101198601235432';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using t2_identifier_key on t2 t (cost=0.13..4.15 rows=1 width=19) (actual time=0.023..0.024 rows=1.00 loops=1)
Index Cond: (identifier = '110101198601235432'::bpchar)
Heap Fetches: 0
Index Searches: 1
Buffers: shared hit=2
Planning:
Buffers: shared hit=26
Planning Time: 0.158 ms
Execution Time: 0.053 ms
(9 rows)
本文由作者按照
CC BY 4.0
进行授权