Our Environment:
#HOSTNAME IP_ADDR
mongo-router 192.168.1.120
mongo-config 192.168.1.121
mongo-shard-1 192.168.1.126
mongo-shard-2 192.168.1.127
mongo-shard-3 192.168.1.128
Install Packages:
yum install epel-release -y
yum install mongodb-org*
Setup Hostnames:
echo "
192.168.1.120 mongo-router
192.168.1.121 mongo-config
192.168.1.126 mongo-shard-1
192.168.1.127 mongo-shard-2
192.168.1.128 mongo-shard-3
127.0.0.1 localhost
" > /etc/hosts
Copy the generated host entries over to all the hosts:
for x in 192.168.1.{120,121,126,127,128}; do scp /etc/hosts $x:/etc/hosts ; done
Shard Servers:
Shards host the actual data. Run the following command on each of these servers:
- mongo-shard-1
- mongo-shard-2
- mongo-shard-3
# on mongo-shard-{1,2,3}
mongod --shardsvr --bind_ip 0.0.0.0 --fork --logpath /var/log/mongodb.log --logappend
Config Servers:
In a production sharding environment we need 3 config servers to ensure redundancy and high availability. For our demonstration, we will only use 1.
# on mongo-config
mongod --configsvr --bind_ip 0.0.0.0 --fork --logpath /var/log/mongodb.log --logappend
Query Routers:
Query routers (mongos) are what your application connects to. A query router asks the config servers where the requested data lives, then forwards the operation to the shard servers that hold it.
# on mongo-router
mongos --configdb mongo-config --fork --logpath /var/log/mongodb.log --logappend
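To make the routing idea concrete, here is a toy sketch of the lookup a query router performs. It is not MongoDB's real implementation: the chunk table, its boundaries, and the function name `routeByShardKey` are all made up for illustration, and it assumes a collection sharded on a numeric key. It runs under Node.js as written.

```javascript
// Toy model of query routing, NOT MongoDB's actual code.
// Hypothetical chunk table, similar in spirit to what the config server
// stores for a collection sharded on { Name: 1 }.
// Each chunk covers the half-open range [min, max).
var chunks = [
  { min: -Infinity, max: 1000, shard: "shard0000" },
  { min: 1000, max: 5000, shard: "shard0001" },
  { min: 5000, max: Infinity, shard: "shard0002" }
];

// Conceptually what a router does: find the chunk whose range contains
// the shard key value, then send the operation to that chunk's shard.
function routeByShardKey(chunkTable, keyValue) {
  for (var i = 0; i < chunkTable.length; i++) {
    var c = chunkTable[i];
    if (keyValue >= c.min && keyValue < c.max) {
      return c.shard;
    }
  }
  throw new Error("no chunk covers key " + keyValue);
}

console.log(routeByShardKey(chunks, 42));   // shard0000
console.log(routeByShardKey(chunks, 1000)); // shard0001
console.log(routeByShardKey(chunks, 9999)); // shard0002
```

The point of the sketch is only that the router itself holds no data: it needs the chunk metadata from the config server before it can pick a shard.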
Setup Clustering:
Now that all the servers are running, we need to configure the nodes to talk to each other. Connect to the query router and list the shards; the list is still empty at this point:
$ mongo
use admin
db.runCommand({ listshards : 1 })
{ "shards" : [ ], "ok" : 1 }
Adding Shards to the Cluster:
First we will add our shards:
mongos> sh.addShard("mongo-shard-1:27018");
{ "shardAdded" : "shard0000", "ok" : 1 }
mongos> sh.addShard("mongo-shard-2:27018");
{ "shardAdded" : "shard0001", "ok" : 1 }
mongos> sh.addShard("mongo-shard-3:27018");
{ "shardAdded" : "shard0002", "ok" : 1 }
Then list your shards to verify that they are configured correctly:
db.runCommand({ listshards : 1 })
{
  "shards" : [
    {
      "_id" : "shard0000",
      "host" : "mongo-shard-1:27018"
    },
    {
      "_id" : "shard0001",
      "host" : "mongo-shard-2:27018"
    },
    {
      "_id" : "shard0002",
      "host" : "mongo-shard-3:27018"
    }
  ],
  "ok" : 1
}
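If you would rather check this programmatically than by eye, a small helper can compare the result with the hosts you expect. This is only a sketch: `missingShardHosts` is a hypothetical name, and the sample object is copied from the output above so that the block runs under Node.js without a cluster.

```javascript
// Hypothetical helper: return the expected hosts that are NOT present
// in a listShards-style result document.
function missingShardHosts(listShardsResult, expectedHosts) {
  var registered = listShardsResult.shards.map(function (s) {
    return s.host;
  });
  return expectedHosts.filter(function (host) {
    return registered.indexOf(host) === -1;
  });
}

// Sample result, copied from the listshards output above.
var sampleResult = {
  shards: [
    { _id: "shard0000", host: "mongo-shard-1:27018" },
    { _id: "shard0001", host: "mongo-shard-2:27018" },
    { _id: "shard0002", host: "mongo-shard-3:27018" }
  ],
  ok: 1
};

var expectedHosts = [
  "mongo-shard-1:27018",
  "mongo-shard-2:27018",
  "mongo-shard-3:27018"
];

// An empty array means every expected shard is registered.
console.log(missingShardHosts(sampleResult, expectedHosts)); // []
```

On a live cluster you could pass the real result instead of the sample, for example `db.adminCommand({ listShards: 1 })` from a mongo shell connected to the router, and replace `console.log` with `printjson`.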
We can also query the sharding status by running the following:
sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("572ca286c544101b9f9e6548")
}
shards:
{ "_id" : "shard0000", "host" : "mongo-shard-1:27018" }
{ "_id" : "shard0001", "host" : "mongo-shard-2:27018" }
{ "_id" : "shard0002", "host" : "mongo-shard-3:27018" }
active mongoses:
"3.2.5" : 1
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 5
Last reported error: Server's sharding metadata manager failed to initialize and will remain in this state until the instance is manually reset :: caused by :: HostNotFound: unable to resolve DNS for host mongo-config
Time of Reported error: Fri May 06 2016 10:36:00 GMT-0400 (EDT)
Migration Results for the last 24 hours:
No recent migrations
databases:
Enable sharding for a database and a collection:
db.runCommand({ enablesharding : "db1" })
db.runCommand({ shardcollection : "db1.col1", key : { Name : 1 }} );
Inserting Data:
use db1
db.col1.createIndex({ Name: "hashed" });
db.printShardingStatus()
for (var i = 1; i < 10000; i++) {db.col1.insert({ "Name": i }) }
Retrieving Sharding Statistics:
mongos> db.col1.getShardDistribution()
Shard shard0000 at mongo-shard-1:27018
data : 350KiB docs : 9970 chunks : 1
estimated data per chunk : 350KiB
estimated docs per chunk : 9970
Shard shard0001 at mongo-shard-2:27018
data : 36B docs : 1 chunks : 1
estimated data per chunk : 36B
estimated docs per chunk : 1
Shard shard0002 at mongo-shard-3:27018
data : 1008B docs : 28 chunks : 1
estimated data per chunk : 1008B
estimated docs per chunk : 28
Totals
data : 351KiB docs : 9999 chunks : 3
Shard shard0000 contains 99.7% data, 99.7% docs in cluster, avg obj size on shard : 36B
Shard shard0001 contains 0.01% data, 0.01% docs in cluster, avg obj size on shard : 36B
Shard shard0002 contains 0.28% data, 0.28% docs in cluster, avg obj size on shard : 36B
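The percentages in the summary follow directly from the per-shard document counts. As a quick sanity check, the sketch below recomputes them from the numbers shown above. The function name `docShares` is made up for this example and the block runs under Node.js.

```javascript
// Document counts per shard, copied from the getShardDistribution() output above.
var docsPerShard = { shard0000: 9970, shard0001: 1, shard0002: 28 };

// Hypothetical helper: total document count plus each shard's share,
// as a percentage rounded to two decimals.
function docShares(counts) {
  var names = Object.keys(counts);
  var total = 0;
  names.forEach(function (name) {
    total += counts[name];
  });
  var shares = {};
  names.forEach(function (name) {
    shares[name] = Math.round((counts[name] / total) * 10000) / 100;
  });
  return { total: total, shares: shares };
}

var result = docShares(docsPerShard);
console.log(result.total);            // 9999
console.log(result.shares.shard0000); // 99.71
console.log(result.shares.shard0001); // 0.01
console.log(result.shares.shard0002); // 0.28
```

Almost every document sits on shard0000, so this collection is not evenly distributed across the three shards.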
Another example:
db.runCommand({ enablesharding : "db2" })
db.runCommand({ shardcollection : "db2.col1", key : { Name : 1 }} );
use db2
db.col1.createIndex({ Name: "hashed" });
db.printShardingStatus()
// Build a random 100-character alphanumeric string
function randomString() {
  var chars = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
  var randomstring = '';
  var string_length = 100;
  for (var i = 0; i < string_length; i++) {
    // pick one character at a random position in chars
    var rnum = Math.floor(Math.random() * chars.length);
    randomstring += chars.substring(rnum, rnum + 1);
  }
  return randomstring;
}
for (var i = 1; i < 10000; i++) {db.col1.insert({Name:randomString() }) }
Other Commands Used:
sh.status()
sh.isBalancerRunning()
sh.getBalancerState()
sh.getBalancerHost()
sh.startBalancer()
Resources Used:
- https://docs.mongodb.com/manual/tutorial/deploy-shard-cluster/
- https://docs.mongodb.com/manual/reference/method/sh.setBalancerState/
- http://stackoverflow.com/questions/20461724/mongodb-compound-shard-key-using-three-values
- https://www.percona.com/blog/2015/03/19/choosing-a-good-sharding-key-in-mongodb-and-mysql/