排查分析Empty regions 較大原因

語言: CN / TW / HK

作者:郭鵬霞

原文來源:http://tidb.net/blog/0b1b8ff3

 

問題背景:

通過granfan-->pd-->region health 圖表監控到empty region 較多,且此叢集目前業務並不多,資料量也很小,初始化時只建立了較表。

問題排查:

檢視region merge相關引數:

max-merge-region-size:

##控制region merge的大小上限,當region大於指定值時pd不會將相鄰的region合併(預設20)

max-merge-region-keys:

##控制region merge的key上限,當region大於指定值時pd不會將相鄰的region合併(預設2000)

merge-schedule-limit:

##同時進行region merge排程的任務數,設定為則關閉region merge(預設值8)

如下,這些引數都是預設值,

$ pd-ctl config show max-merge-region-size |grep max-merge-region-size  "max-merge-region-size": 20,   $ pd-ctl config show max-merge-region-keys |grep max-merge-region-keys    "max-merge-region-keys": 200000,$ pd-ctl config show merge-schedule-limit |grep merge-schedule-limit    "merge-schedule-limit": 8,

檢視region的split和merge頻率相關引數,

patrol-region-interval:

##控制replicaChecker檢查region監控狀態的執行頻率,越短則執行越快(10ms)

split-merge-interval:

##控制對同一個region做split和merge的操作間隔,對應新split的region一段時間不會merge(預設1h)

$ pd-ctl config show patrol-region-interval |grep patrol    "patrol-region-interval": "100ms",$ pd-ctl config show split-merge-interval |grep split    "split-merge-interval": "1h0m0s",

 

測試ddl時與分配的region是否會merge:

突然想到,空regions多的叢集在ddl時使用了region預分配引數,檢視相關表時只有1個region,在社群諮詢老師說是被merge了。

建立測試表並預分配region,檢視region數是指定的:

mysql> ** create table guo004(id int, name varchar(10))  SHARD_ROW_ID_BITS=4 PRE_SPLIT_REGIONS=4;**Query OK, 0 rows affected (0.51 sec)mysql> show table guo004 regions;+-----------+-----------------------------+-----------------------------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+| REGION_ID | START_KEY                   | END_KEY                     | LEADER_ID | LEADER_STORE_ID | PEERS               | SCATTERING | WRITTEN_BYTES | READ_BYTES | APPROXIMATE_SIZE(MB) | APPROXIMATE_KEYS |+-----------+-----------------------------+-----------------------------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+|     25373 | t_437_                      | t_437_r_576460752303423488  |     25375 |               5 | 25374, 25375, 25376 |          0 |           341 |          0 |                    1 |                0 ||     25377 | t_437_r_576460752303423488  | t_437_r_1152921504606846976 |     25379 |               5 | 25378, 25379, 25380 |          0 |            29 |          0 |                    1 |                0 ||     25381 | t_437_r_1152921504606846976 | t_437_r_1729382256910270464 |     25383 |               5 | 25382, 25383, 25384 |          0 |             0 |          0 |                    1 |                0 ||     25385 | t_437_r_1729382256910270464 | t_437_r_2305843009213693952 |     25387 |               5 | 25386, 25387, 25388 |          0 |             0 |          0 |                    1 |                0 ||     25389 | t_437_r_2305843009213693952 | t_437_r_2882303761517117440 |     25391 |               5 | 25390, 25391, 25392 |          0 |            29 |          0 |                    1 |                0 ||     25393 | t_437_r_2882303761517117440 | t_437_r_3458764513820540928 |     25395 |               5 | 25394, 25395, 25396 |          0 |             0 |          0 |                    1 |                0 ||     25397 | t_437_r_3458764513820540928 | t_437_r_4035225266123964416 |     25399 |               5 | 25398, 25399, 25400 |          0 |            29 |          0 |                    1 |                0 ||     25401 | t_437_r_4035225266123964416 | t_437_r_4611686018427387904 |     25403 |               5 | 25402, 25403, 25404 |          0 |             0 |          0 |                    1 |                0 ||     25405 | t_437_r_4611686018427387904 | t_437_r_5188146770730811392 |     25407 |               5 | 25406, 25407, 25408 |          0 |            29 |          0 |                    1 |                0 ||     25409 | t_437_r_5188146770730811392 | t_437_r_5764607523034234880 |     25411 |               5 | 25410, 25411, 25412 |          0 |             0 |          0 |                    1 |                0 ||     25413 | t_437_r_5764607523034234880 | t_437_r_6341068275337658368 |     25415 |               5 | 25414, 25415, 25416 |          0 |            29 |          0 |                    1 |                0 ||     25417 | t_437_r_6341068275337658368 | t_437_r_6917529027641081856 |     25419 |               5 | 25418, 25419, 25420 |          0 |            29 |          0 |                    1 |                0 ||     25421 | t_437_r_6917529027641081856 | t_437_r_7493989779944505344 |     25423 |               5 | 25422, 25423, 25424 |          0 |             0 |          0 |                    1 |                0 ||     25425 | t_437_r_7493989779944505344 | t_437_r_8070450532247928832 |     25427 |               5 | 25426, 25427, 25428 |          0 |            29 |          0 |                    1 |                0 ||     25429 | t_437_r_8070450532247928832 | t_437_r_8646911284551352320 |     25431 |               5 | 25430, 25431, 25432 |          0 |             0 |          0 |                    1 |                0 ||     25169 | t_437_r_8646911284551352320 |                             |     25171 |               5 | 25170, 25171, 25172 |          0 |             0 |          0 |                    1 |                0 |+-----------+-----------------------------+-----------------------------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+16 rows in set (0.01 sec)

檢查預分配region是否merge

根據以上split-merge-interval引數,1小時後測試表的guo004空regions將被回收:

mysql> mysql> show table guo004 regions;+-----------+-----------+---------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+| REGION_ID | START_KEY | END_KEY | LEADER_ID | LEADER_STORE_ID | PEERS               | SCATTERING | WRITTEN_BYTES | READ_BYTES | APPROXIMATE_SIZE(MB) | APPROXIMATE_KEYS |+-----------+-----------+---------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+|     25425 | t_437_    |         |     25427 |               5 | 25426, 25427, 25428 |          0 |          1285 |          0 |                    1 |                0 |+-----------+-----------+---------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+1 row in set (0.01 sec)

即使是ddl預分配的region,如果沒有regions為空的話,都會自動merge.

檢視空regions 資訊:

引數說明:

approximate_size:估算的region的資料量大小,單位為MB

approximate_keys:估算的region內的key的個數

注意關注approximate_size、approximate_keys 的值。

檢視所有空region資訊:

$ ./pd-ctl region check empty-region |more

$ ./pd-ctl region check empty-region |more{  "count": 272,  "regions": [    {      "id": 202,      "start_key": "7480000000000000FF5C00000000000000F8",      "end_key": "7480000000000000FF5E00000000000000F8",      "epoch": {        "conf_ver": 71639,        "version": 43      },      "peers": [        {          "id": 7796112,          "store_id": 21        },        {          "id": 7952908,          "store_id": 2        },        {          "id": 7955094,          "store_id": 19        }      ],      "leader": {        "id": 7796112,        "store_id": 21      },      "written_bytes": 0,      "read_bytes": 0,      "written_keys": 0,      "read_keys": 0,      "approximate_size": 1,      "approximate_keys": 0    },    {      "id": 217800,      "start_key": "7480000000000002FF8500000000000000F8",      "end_key": "7480000000000002FF8700000000000000F8",      "epoch": {        "conf_ver": 71369,        "version": 290      },      "peers": [        {          "id": 7932728,          "store_id": 3        },        {          "id": 7954710,          "store_id": 1        },        {          "id": 7955013,          "store_id": 5        }      ],      "leader": {        "id": 7932728,        "store_id": 3      },      "written_bytes": 0,      "read_bytes": 0,      "written_keys": 0,      "read_keys": 0,**      "approximate_size": 1,****      "approximate_keys": 0**    },--More--

檢視單個region資訊:

$ ./pd-ctl  region 217800

$ ./pd-ctl  region 217800{  "id": 217800,  "start_key": "7480000000000002FF8500000000000000F8",  "end_key": "7480000000000002FF8700000000000000F8",  "epoch": {    "conf_ver": 71369,    "version": 290  },  "peers": [    {      "id": 7932728,      "store_id": 3    },    {      "id": 7954710,      "store_id": 1    },    {      "id": 7955013,      "store_id": 5    }  ],  "leader": {    "id": 7932728,    "store_id": 3  },  "written_bytes": 0,  "read_bytes": 0,  "written_keys": 0,  "read_keys": 0,**  "approximate_size": 1,****  "approximate_keys": 0**}

或者通過系統表TIKV_REGION_STATUS檢視:

(我個人更傾向於查系統表,比對的可視度更強些):

mysql> select REGION_ID,START_KEY,END_KEY,TABLE_ID,DB_NAME,TABLE_NAME,APPROXIMATE_SIZE,APPROXIMATE_KEYS from TIKV_REGION_STATUS where DB_NAME='dzdz' order by START_KEY limit 30;+-----------+--------------------------------------+--------------------------------------+----------+---------+--------------------------+------------------+------------------+| REGION_ID | START_KEY                            | END_KEY                              | TABLE_ID | DB_NAME | TABLE_NAME               | APPROXIMATE_SIZE | APPROXIMATE_KEYS |+-----------+--------------------------------------+--------------------------------------+----------+---------+--------------------------+------------------+------------------+|    217152 | 7480000000000001FFAD00000000000000F8 | 7480000000000001FFAF00000000000000F8 |      429 | dzdz    | qrtz_blob_triggers       |                1 |                0 ||    217152 | 7480000000000001FFAD00000000000000F8 | 7480000000000001FFAF00000000000000F8 |      429 | dzdz    | qrtz_blob_triggers       |                1 |                0 ||    217152 | 7480000000000001FFAD00000000000000F8 | 7480000000000001FFAF00000000000000F8 |      429 | dzdz    | qrtz_blob_triggers       |                1 |                0 ||    217156 | 7480000000000001FFAF00000000000000F8 | 7480000000000001FFB100000000000000F8 |      431 | dzdz    | qrtz_calendars           |                1 |                0 ||    217156 | 7480000000000001FFAF00000000000000F8 | 7480000000000001FFB100000000000000F8 |      431 | dzdz    | qrtz_calendars           |                1 |                0 ||    217160 | 7480000000000001FFB100000000000000F8 | 7480000000000001FFB300000000000000F8 |      433 | dzdz    | qrtz_cron_triggers       |                1 |                0 ||    217160 | 7480000000000001FFB100000000000000F8 | 7480000000000001FFB300000000000000F8 |      433 | dzdz    | qrtz_cron_triggers       |                1 |                0 ||    217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 |      435 | dzdz    | qrtz_fired_triggers      |                1 |                0 ||    217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 |      435 | dzdz    | qrtz_fired_triggers      |                1 |                0 ||    217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 |      435 | dzdz    | qrtz_fired_triggers      |                1 |                0 ||    217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 |      435 | dzdz    | qrtz_fired_triggers      |                1 |                0 ||    217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 |      435 | dzdz    | qrtz_fired_triggers      |                1 |                0 ||    217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 |      435 | dzdz    | qrtz_fired_triggers      |                1 |                0 ||    217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 |      435 | dzdz    | qrtz_fired_triggers      |                1 |                0 ||    217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 |      435 | dzdz    | qrtz_fired_triggers      |                1 |                0 ||    217168 | 7480000000000001FFB500000000000000F8 | 7480000000000001FFB700000000000000F8 |      437 | dzdz    | qrtz_job_details         |                1 |                0 ||    217168 | 7480000000000001FFB500000000000000F8 | 7480000000000001FFB700000000000000F8 |      437 | dzdz    | qrtz_job_details         |                1 |                0 ||    217168 | 7480000000000001FFB500000000000000F8 | 7480000000000001FFB700000000000000F8 |      437 | dzdz    | qrtz_job_details         |                1 |                0 ||    217168 | 7480000000000001FFB500000000000000F8 | 7480000000000001FFB700000000000000F8 |      437 | dzdz    | qrtz_job_details         |                1 |                0 ||    217172 | 7480000000000001FFB700000000000000F8 | 7480000000000001FFB900000000000000F8 |      439 | dzdz    | qrtz_locks               |                1 |                0 ||    217172 | 7480000000000001FFB700000000000000F8 | 7480000000000001FFB900000000000000F8 |      439 | dzdz    | qrtz_locks               |                1 |                0 ||    217176 | 7480000000000001FFB900000000000000F8 | 7480000000000001FFBB00000000000000F8 |      441 | dzdz    | qrtz_paused_trigger_grps |                1 |                0 ||    217176 | 7480000000000001FFB900000000000000F8 | 7480000000000001FFBB00000000000000F8 |      441 | dzdz    | qrtz_paused_trigger_grps |                1 |                0 ||    217180 | 7480000000000001FFBB00000000000000F8 | 7480000000000001FFBD00000000000000F8 |      443 | dzdz    | qrtz_scheduler_state     |               10 |            20343 ||    217180 | 7480000000000001FFBB00000000000000F8 | 7480000000000001FFBD00000000000000F8 |      443 | dzdz    | qrtz_scheduler_state     |               10 |            20343 ||    217184 | 7480000000000001FFBD00000000000000F8 | 7480000000000001FFBF00000000000000F8 |      445 | dzdz    | qrtz_simple_triggers     |                1 |                0 ||    217184 | 7480000000000001FFBD00000000000000F8 | 7480000000000001FFBF00000000000000F8 |      445 | dzdz    | qrtz_simple_triggers     |                1 |                0 ||    217188 | 7480000000000001FFBF00000000000000F8 | 7480000000000001FFC100000000000000F8 |      447 | dzdz    | qrtz_simprop_triggers    |                1 |                0 ||    217188 | 7480000000000001FFBF00000000000000F8 | 7480000000000001FFC100000000000000F8 |      447 | dzdz    | qrtz_simprop_triggers    |                1 |                0 ||    217192 | 7480000000000001FFC100000000000000F8 | 7480000000000001FFC300000000000000F8 |      449 | dzdz    | qrtz_triggers            |                1 |                0 |+-----------+--------------------------------------+--------------------------------------+----------+---------+--------------------------+------------------+------------------+30 rows in set (0.01 sec)mysql>

通過以上資訊顯示一個regions的END_KEY 是另一個region的START_KEY, 結合APPROXIMATE_SIZE和APPROXIMATE_KEYS欄位的值,可知部分空region是連續的。

 

檢查跨表合併是否啟用

enable-cross-table-merge:

##設定是否開啟跨表merge(預設v4.0:false,v6.0:true)

$  ./pd-ctl config show enable-cross-table-merge |grep cross    "enable-cross-table-merge": "false",$如上跨表合併未啟用

** **

測試啟用跨表合併是否能降低空region數:

檢查當前空region數:

測試環境的empty region 也比較多,如下圖:

 

啟用跨表合併:

確認誇表合併未啟用:» config show{  "replication": {    "enable-placement-rules": "true",    "location-labels": "dc,host",    "max-replicas": 3,    "strictly-match-label": "false"  },  "schedule": {    "enable-cross-table-merge": "false",......  }}啟用跨表合併:****» config set enable-cross-table-merge trueSuccess! 確認跨表合併啟用:» config show{  "replication": {    "enable-placement-rules": "true",    "location-labels": "dc,host",    "max-replicas": 3,    "strictly-match-label": "false"  },  "schedule": {    "enable-cross-table-merge": "true",......  }}»  

啟用跨表何必後空region數:

在grafana上看region health狀態,empty region 數量已經為0:

 

問題處理:

啟用跨表合併,可以減少或至0空region資料量

操作命令:$ ./pd-ctl config set enable-cross-table-merge true 即可。

綜上所述:

通過granfan-->pd-->region health 圖表監控到empty region 較多:

檢視region merge大小相關引數:

max-merge-region-size:

##控制region merge的大小上限,當region大於指定值時pd不會將相鄰的region合併(預設20)

max-merge-region-keys:

##控制region merge的key上限,當region大於指定值時pd不會將相鄰的region合併(預設2000)

merge-schedule-limit:

##同時進行region merge排程的任務數,設定為則關閉region merge(預設值8)

 

如果以上引數設定太小或不合理可以適當調整讓region加速merge

如果以上引數設定合理(一般預設值就可以),請檢查region merge頻率引數

檢視region的split和merge頻率相關引數,

patrol-region-interval:

##控制replicaChecker檢查region監控狀態的執行頻率,越短則執行越快(10ms)

split-merge-interval:

##控制對同一個region做split和merge的操作間隔,對應新split的region一段時間不會merge(預設1h)

 

如果以上引數設定過大或不合理可適當調小,但也不建議過小。

如果設定合理(一般預設值就可以),請檢查跨表合併引數。

檢查跨表合併是否啟用

enable-cross-table-merge:

##設定是否開啟跨表merge(預設v4.0:false,v5.0及其後版本:true)

 

跨表合併引數v5.0及其後版本預設已經啟用,使用v5.0及其後版本的應該不太會出現這個問題。

使用v4.0、小於v5.0的版本或enable-cross-table-merge 未啟用情況下可能會出現empty region較多。

一般多DDL on table情況下, enable-cross-table-merge結果較為明顯。