We now remove the hdshard4 shard from the sharded cluster.
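At a high level this follows the standard removeShard workflow; here is a minimal sketch of the steps walked through below (the shard and database names are this cluster's):

// Outline of the removeShard workflow, all run through mongos.
// 1. Start draining: the balancer migrates chunks off the shard.
db.adminCommand({ removeShard: "hdshard4" })
// 2. Re-run the same command at any time to poll draining progress.
db.adminCommand({ removeShard: "hdshard4" })
// 3. If the shard is the primary shard for any database, move that primary.
db.adminCommand({ movePrimary: "db2", to: "hdshard2" })
// 4. A final removeShard returns state "completed".
db.adminCommand({ removeShard: "hdshard4" })

First, check the current cluster state: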
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("60545017224c766911a9c440")
  }
  shards:
        {  "_id" : "hdshard1",  "host" : "hdshard1/172.16.254.136:40001,172.16.254.137:40001,172.16.254.138:40001",  "state" : 1 }
        {  "_id" : "hdshard2",  "host" : "hdshard2/172.16.254.136:40002,172.16.254.137:40002,172.16.254.138:40002",  "state" : 1 }
        {  "_id" : "hdshard3",  "host" : "hdshard3/172.16.254.136:40003,172.16.254.137:40003,172.16.254.138:40003",  "state" : 1 }
        {  "_id" : "hdshard4",  "host" : "hdshard4/172.16.254.139:40005,172.16.254.139:40006,172.16.254.139:40007",  "state" : 1 }
  active mongoses:
        "4.2.12" : 3
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  2
        Last reported error:  Could not find host matching read preference { mode: "primary" } for set hdshard4
        Time of Reported error:  Mon Apr 19 2021 17:11:42 GMT+0800 (CST)
        Migration Results for the last 24 hours:
                289 : Success
                11 : Failed with error 'aborted', from hdshard1 to hdshard4
                14 : Failed with error 'aborted', from hdshard2 to hdshard4
                3 : Failed with error 'aborted', from hdshard3 to hdshard4
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                hdshard1        256
                                hdshard2        256
                                hdshard3        256
                                hdshard4        256
                        too many chunks to print, use verbose if you want to force print
        {  "_id" : "db1",  "primary" : "hdshard3",  "partitioned" : true,  "version" : {  "uuid" : UUID("71bb472c-7896-4a31-a77c-e3aaf723be3c"),  "lastMod" : 1 } }
        {  "_id" : "db2",  "primary" : "hdshard4",  "partitioned" : false,  "version" : {  "uuid" : UUID("add90941-a8b1-4c40-94e9-9ccc38d73096"),  "lastMod" : 1 } }
        {  "_id" : "recommend",  "primary" : "hdshard1",  "partitioned" : true,  "version" : {  "uuid" : UUID("cb833b8e-cc4f-4c52-83c3-719aa383bac4"),  "lastMod" : 1 } }
                recommend.rcmd_1_min_tag_mei_rong
                        shard key: { "_id" : "hashed" }
                        unique: false
                        balancing: true
                        chunks:
                                hdshard1        2
                                hdshard2        2
                                hdshard3        2
                                hdshard4        2
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6701866976688134138") } on : hdshard4 Timestamp(7, 0)
                        { "_id" : NumberLong("-6701866976688134138") } -->> { "_id" : NumberLong("-4163240026901542572") } on : hdshard3 Timestamp(3, 0)
                        { "_id" : NumberLong("-4163240026901542572") } -->> { "_id" : NumberLong("-1616330844721205691") } on : hdshard2 Timestamp(7, 1)
                        { "_id" : NumberLong("-1616330844721205691") } -->> { "_id" : NumberLong("909129560750995399") } on : hdshard3 Timestamp(5, 0)
                        { "_id" : NumberLong("909129560750995399") } -->> { "_id" : NumberLong("3449289120186727718") } on : hdshard2 Timestamp(6, 0)
                        { "_id" : NumberLong("3449289120186727718") } -->> { "_id" : NumberLong("5980358241733552715") } on : hdshard4 Timestamp(8, 0)
                        { "_id" : NumberLong("5980358241733552715") } -->> { "_id" : NumberLong("8520801504243263436") } on : hdshard1 Timestamp(8, 1)
                        { "_id" : NumberLong("8520801504243263436") } -->> { "_id" : { "$maxKey" : 1 } } on : hdshard1 Timestamp(1, 7)
                recommend.rcmd_1_tag_li_liao
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                hdshard1        27
                                hdshard2        27
                                hdshard3        27
                                hdshard4        26
                        too many chunks to print, use verbose if you want to force print
mongos> db.databases.find()
{ "_id" : "recommend", "primary" : "hdshard1", "partitioned" : true, "version" : { "uuid" : UUID("cb833b8e-cc4f-4c52-83c3-719aa383bac4"), "lastMod" : 1 } }
{ "_id" : "db1", "primary" : "hdshard3", "partitioned" : true, "version" : { "uuid" : UUID("71bb472c-7896-4a31-a77c-e3aaf723be3c"), "lastMod" : 1 } }
{ "_id" : "db2", "primary" : "hdshard4", "partitioned" : false, "version" : { "uuid" : UUID("add90941-a8b1-4c40-94e9-9ccc38d73096"), "lastMod" : 1 } }
Querying config.databases shows that hdshard4 is also the primary shard for the db2 database.
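To pull out just the databases that would block draining, the same metadata can be filtered directly; a minimal sketch using a standard config.databases query (only the shard name is specific to this cluster):

// Databases whose primary shard is hdshard4; each must be dropped
// or moved with movePrimary before removeShard can complete.
db.getSiblingDB("config").databases.find({ primary: "hdshard4" })

Now start draining the shard. removeShard is run against the admin database: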
mongos> use admin
switched to db admin
mongos> db.runCommand( { removeShard: "hdshard4" } )
{
        "msg" : "draining started successfully",
        "state" : "started",
        "shard" : "hdshard4",
        "note" : "you need to drop or movePrimary these databases",
        "dbsToMove" : [
                "db2"
        ],
        "ok" : 1,
        "operationTime" : Timestamp(1618825477, 2),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1618825477, 2),
                "signature" : {
                        "hash" : BinData(0,"l0H3goYxmKMOFF1aWq5rCZcdhS8="),
                        "keyId" : NumberLong("6941260985399246879")
                }
        }
}
The response shows that draining has started, but it also warns that db2 lives on hdshard4: "you need to drop or movePrimary these databases" (see dbsToMove). We will deal with that shortly; first, let's monitor the chunk migration progress.
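While draining, re-running removeShard reports progress, and the config metadata gives an exact chunk count; a minimal sketch using standard commands (the shard name is this cluster's):

// Re-running removeShard during draining returns a progress report,
// e.g. { state: "ongoing", remaining: { chunks: ..., dbs: ... }, ... }
db.adminCommand({ removeShard: "hdshard4" })
// Equivalent low-level check: chunks still owned by the draining shard.
db.getSiblingDB("config").chunks.count({ shard: "hdshard4" })

Another way to watch an individual collection is getShardDistribution():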
mongos> db.rcmd_1_tag_li_liao.getShardDistribution()

Shard hdshard1 at hdshard1/172.16.254.136:40001,172.16.254.137:40001,172.16.254.138:40001
 data : 1.14GiB docs : 125924 chunks : 35
 estimated data per chunk : 33.61MiB
 estimated docs per chunk : 3597

Shard hdshard4 at hdshard4/172.16.254.139:40005,172.16.254.139:40006,172.16.254.139:40007
 data : 826.72MiB docs : 92767 chunks : 2
 estimated data per chunk : 413.36MiB
 estimated docs per chunk : 46383

Shard hdshard3 at hdshard3/172.16.254.136:40003,172.16.254.137:40003,172.16.254.138:40003
 data : 1.06GiB docs : 124880 chunks : 35
 estimated data per chunk : 31.28MiB
 estimated docs per chunk : 3568

Shard hdshard2 at hdshard2/172.16.254.136:40002,172.16.254.137:40002,172.16.254.138:40002
 data : 1.06GiB docs : 124879 chunks : 35
 estimated data per chunk : 31.26MiB
 estimated docs per chunk : 3567

Totals
 data : 4.09GiB docs : 468450 chunks : 107
 Shard hdshard1 contains 28.06% data, 26.88% docs in cluster, avg obj size on shard : 9KiB
 Shard hdshard4 contains 19.71% data, 19.8% docs in cluster, avg obj size on shard : 9KiB
 Shard hdshard3 contains 26.11% data, 26.65% docs in cluster, avg obj size on shard : 8KiB
 Shard hdshard2 contains 26.09% data, 26.65% docs in cluster, avg obj size on shard : 8KiB

mongos> db.rcmd_1_tag_li_liao.getShardDistribution()

Shard hdshard2 at hdshard2/172.16.254.136:40002,172.16.254.137:40002,172.16.254.138:40002
 data : 1.06GiB docs : 124879 chunks : 35
 estimated data per chunk : 31.26MiB
 estimated docs per chunk : 3567

Shard hdshard1 at hdshard1/172.16.254.136:40001,172.16.254.137:40001,172.16.254.138:40001
 data : 1.16GiB docs : 127813 chunks : 36
 estimated data per chunk : 33.18MiB
 estimated docs per chunk : 3550

Shard hdshard3 at hdshard3/172.16.254.136:40003,172.16.254.137:40003,172.16.254.138:40003
 data : 1.1GiB docs : 128448 chunks : 36
 estimated data per chunk : 31.36MiB
 estimated docs per chunk : 3568

Totals
 data : 3.33GiB docs : 381140 chunks : 107
 Shard hdshard2 contains 32.01% data, 32.76% docs in cluster, avg obj size on shard : 8KiB
 Shard hdshard1 contains 34.95% data, 33.53% docs in cluster, avg obj size on shard : 9KiB
 Shard hdshard3 contains 33.03% data, 33.7% docs in cluster, avg obj size on shard : 9KiB
Comparing the two outputs: hdshard4's remaining chunks of this collection have been drained to the other shards, and it no longer appears in the distribution at all.
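The same per-shard chunk counts can be read straight from the config metadata; a minimal sketch, assuming the 4.2 config.chunks schema where chunks are keyed by the ns field:

// Chunk count per shard for one namespace.
db.getSiblingDB("config").chunks.aggregate([
  { $match: { ns: "recommend.rcmd_1_tag_li_liao" } },
  { $group: { _id: "$shard", chunks: { $sum: 1 } } }
])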
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("60545017224c766911a9c440")
  }
  shards:
        {  "_id" : "hdshard1",  "host" : "hdshard1/172.16.254.136:40001,172.16.254.137:40001,172.16.254.138:40001",  "state" : 1 }
        {  "_id" : "hdshard2",  "host" : "hdshard2/172.16.254.136:40002,172.16.254.137:40002,172.16.254.138:40002",  "state" : 1 }
        {  "_id" : "hdshard3",  "host" : "hdshard3/172.16.254.136:40003,172.16.254.137:40003,172.16.254.138:40003",  "state" : 1 }
        {  "_id" : "hdshard4",  "host" : "hdshard4/172.16.254.139:40005,172.16.254.139:40006,172.16.254.139:40007",  "state" : 1,  "draining" : true }
  active mongoses:
        "4.2.12" : 3
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours:
                572 : Success
                14 : Failed with error 'aborted', from hdshard2 to hdshard4
                5 : Failed with error 'aborted', from hdshard4 to hdshard2
                1 : Failed with error 'aborted', from hdshard4 to hdshard3
                3 : Failed with error 'aborted', from hdshard4 to hdshard1
                11 : Failed with error 'aborted', from hdshard1 to hdshard4
                3 : Failed with error 'aborted', from hdshard3 to hdshard4
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                hdshard1        342
                                hdshard2        341
                                hdshard3        341
                        too many chunks to print, use verbose if you want to force print
        {  "_id" : "db1",  "primary" : "hdshard3",  "partitioned" : true,  "version" : {  "uuid" : UUID("71bb472c-7896-4a31-a77c-e3aaf723be3c"),  "lastMod" : 1 } }
        {  "_id" : "db2",  "primary" : "hdshard4",  "partitioned" : false,  "version" : {  "uuid" : UUID("add90941-a8b1-4c40-94e9-9ccc38d73096"),  "lastMod" : 1 } }
        {  "_id" : "recommend",  "primary" : "hdshard1",  "partitioned" : true,  "version" : {  "uuid" : UUID("cb833b8e-cc4f-4c52-83c3-719aa383bac4"),  "lastMod" : 1 } }
                recommend.rcmd_1_min_tag_mei_rong
                        shard key: { "_id" : "hashed" }
                        unique: false
                        balancing: true
                        chunks:
                                hdshard1        2
                                hdshard2        3
                                hdshard3        3
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6701866976688134138") } on : hdshard3 Timestamp(9, 0)
                        { "_id" : NumberLong("-6701866976688134138") } -->> { "_id" : NumberLong("-4163240026901542572") } on : hdshard3 Timestamp(3, 0)
                        { "_id" : NumberLong("-4163240026901542572") } -->> { "_id" : NumberLong("-1616330844721205691") } on : hdshard2 Timestamp(7, 1)
                        { "_id" : NumberLong("-1616330844721205691") } -->> { "_id" : NumberLong("909129560750995399") } on : hdshard3 Timestamp(5, 0)
                        { "_id" : NumberLong("909129560750995399") } -->> { "_id" : NumberLong("3449289120186727718") } on : hdshard2 Timestamp(6, 0)
                        { "_id" : NumberLong("3449289120186727718") } -->> { "_id" : NumberLong("5980358241733552715") } on : hdshard2 Timestamp(10, 0)
                        { "_id" : NumberLong("5980358241733552715") } -->> { "_id" : NumberLong("8520801504243263436") } on : hdshard1 Timestamp(8, 1)
                        { "_id" : NumberLong("8520801504243263436") } -->> { "_id" : { "$maxKey" : 1 } } on : hdshard1 Timestamp(1, 7)
                recommend.rcmd_1_tag_li_liao
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                hdshard1        36
                                hdshard2        35
                                hdshard3        36
                        too many chunks to print, use verbose if you want to force print
hdshard4 is now flagged "draining" : true. Following the hint we got when starting the removal, we now run movePrimary for db2. Note that the command is issued against the admin database:
mongos> use admin switched to db admin mongos> db.runCommand({"moveprimary":"db2","to":"hdshard2"}) { "ok" : 1, "operationTime" : Timestamp(1618828614, 2007), "$clusterTime" : { "clusterTime" : Timestamp(1618828614, 2007), "signature" : { "hash" : BinData(0,"v/C6nLIAKa1Fc16/ICGk3N5+dvc="), "keyId" : NumberLong("6941260985399246879") } } }
mongos> db.databases.find()
{ "_id" : "recommend", "primary" : "hdshard1", "partitioned" : true, "version" : { "uuid" : UUID("cb833b8e-cc4f-4c52-83c3-719aa383bac4"), "lastMod" : 1 } }
{ "_id" : "db1", "primary" : "hdshard3", "partitioned" : true, "version" : { "uuid" : UUID("71bb472c-7896-4a31-a77c-e3aaf723be3c"), "lastMod" : 1 } }
{ "_id" : "db2", "primary" : "hdshard2", "partitioned" : false, "version" : { "uuid" : UUID("add90941-a8b1-4c40-94e9-9ccc38d73096"), "lastMod" : 2 } }
db2's primary shard is now hdshard2 (its version "lastMod" was bumped from 1 to 2).
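After a movePrimary, other mongos instances may briefly hold stale routing metadata for the moved database; their cache can be refreshed explicitly. A minimal sketch using the standard flushRouterConfig command, run on each mongos (the per-database form is assumed to be available on this 4.2 cluster):

// Refresh this mongos's cached routing table after movePrimary.
db.adminCommand({ flushRouterConfig: 1 })
// Or refresh the cache for just the moved database.
db.adminCommand({ flushRouterConfig: "db2" })

With the primary moved, re-run removeShard to finish the removal: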
mongos> use admin
switched to db admin
mongos> db.runCommand( { removeShard: "hdshard4" } )
{
        "msg" : "removeshard completed successfully",
        "state" : "completed",
        "shard" : "hdshard4",
        "ok" : 1,
        "operationTime" : Timestamp(1618828812, 2),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1618828812, 2),
                "signature" : {
                        "hash" : BinData(0,"15Wkccw2xFhMFYWCFaEDXbpSH7E="),
                        "keyId" : NumberLong("6941260985399246879")
                }
        }
}
The response confirms it: "removeshard completed successfully", with state "completed".
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("60545017224c766911a9c440")
  }
  shards:
        {  "_id" : "hdshard1",  "host" : "hdshard1/172.16.254.136:40001,172.16.254.137:40001,172.16.254.138:40001",  "state" : 1 }
        {  "_id" : "hdshard2",  "host" : "hdshard2/172.16.254.136:40002,172.16.254.137:40002,172.16.254.138:40002",  "state" : 1 }
        {  "_id" : "hdshard3",  "host" : "hdshard3/172.16.254.136:40003,172.16.254.137:40003,172.16.254.138:40003",  "state" : 1 }
  active mongoses:
        "4.2.12" : 3
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours:
                572 : Success
                1 : Failed with error 'aborted', from hdshard4 to hdshard3
                5 : Failed with error 'aborted', from hdshard4 to hdshard2
                14 : Failed with error 'aborted', from hdshard2 to hdshard4
                3 : Failed with error 'aborted', from hdshard3 to hdshard4
                11 : Failed with error 'aborted', from hdshard1 to hdshard4
                3 : Failed with error 'aborted', from hdshard4 to hdshard1
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                hdshard1        342
                                hdshard2        341
                                hdshard3        341
                        too many chunks to print, use verbose if you want to force print
        {  "_id" : "db1",  "primary" : "hdshard3",  "partitioned" : true,  "version" : {  "uuid" : UUID("71bb472c-7896-4a31-a77c-e3aaf723be3c"),  "lastMod" : 1 } }
        {  "_id" : "db2",  "primary" : "hdshard2",  "partitioned" : false,  "version" : {  "uuid" : UUID("add90941-a8b1-4c40-94e9-9ccc38d73096"),  "lastMod" : 2 } }
        {  "_id" : "recommend",  "primary" : "hdshard1",  "partitioned" : true,  "version" : {  "uuid" : UUID("cb833b8e-cc4f-4c52-83c3-719aa383bac4"),  "lastMod" : 1 } }
                recommend.rcmd_1_min_tag_mei_rong
                        shard key: { "_id" : "hashed" }
                        unique: false
                        balancing: true
                        chunks:
                                hdshard1        2
                                hdshard2        3
                                hdshard3        3
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6701866976688134138") } on : hdshard3 Timestamp(9, 0)
                        { "_id" : NumberLong("-6701866976688134138") } -->> { "_id" : NumberLong("-4163240026901542572") } on : hdshard3 Timestamp(3, 0)
                        { "_id" : NumberLong("-4163240026901542572") } -->> { "_id" : NumberLong("-1616330844721205691") } on : hdshard2 Timestamp(7, 1)
                        { "_id" : NumberLong("-1616330844721205691") } -->> { "_id" : NumberLong("909129560750995399") } on : hdshard3 Timestamp(5, 0)
                        { "_id" : NumberLong("909129560750995399") } -->> { "_id" : NumberLong("3449289120186727718") } on : hdshard2 Timestamp(6, 0)
                        { "_id" : NumberLong("3449289120186727718") } -->> { "_id" : NumberLong("5980358241733552715") } on : hdshard2 Timestamp(10, 0)
                        { "_id" : NumberLong("5980358241733552715") } -->> { "_id" : NumberLong("8520801504243263436") } on : hdshard1 Timestamp(8, 1)
                        { "_id" : NumberLong("8520801504243263436") } -->> { "_id" : { "$maxKey" : 1 } } on : hdshard1 Timestamp(1, 7)
                recommend.rcmd_1_tag_li_liao
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                hdshard1        36
                                hdshard2        35
                                hdshard3        36
                        too many chunks to print, use verbose if you want to force print
hdshard4 no longer appears anywhere in the output.
Caveat: always check the sharded cluster status and confirm that the shard has been removed completely — no chunks left on it, no databases with it as primary, and no entry for it in sh.status() — before decommissioning its hosts.
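Those checks can be scripted against the config database; a minimal sketch using standard config collections (only the shard name is specific to this cluster):

// Final sanity checks before shutting down the hdshard4 replica set.
var cfg = db.getSiblingDB("config");
cfg.chunks.count({ shard: "hdshard4" });      // expect 0
cfg.databases.count({ primary: "hdshard4" }); // expect 0
cfg.shards.find({ _id: "hdshard4" }).count(); // expect 0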