Our Environment:

#HOSTNAME 	IP_ADDR

mongo-router 	192.168.1.120
mongo-config 	192.168.1.121
mongo-shard-1 	192.168.1.126
mongo-shard-2 	192.168.1.127
mongo-shard-3 	192.168.1.128

Install Packages:

yum install epel-release -y
yum install mongodb-org*

Setup Hostnames:

echo "
192.168.1.120 mongo-router
192.168.1.121 mongo-config
192.168.1.126 mongo-shard-1
192.168.1.127 mongo-shard-2
192.168.1.128 mongo-shard-3
127.0.0.1 localhost
" > /etc/hosts

Copy the generated host entries over to all the hosts:

for x in 192.168.1.{120..125}; do scp /etc/hosts $x:/etc/hosts ; done

Shard Servers:

Shards will be responsible for hosting the actual data. Here we will execute the following commands on:

  • mongo-shard-1
  • mongo-shard-2
  • mongo-shard-3
# on shard-server{1,2,3}
mongod --shardsvr --bind_ip 0.0.0.0 --fork --logpath /var/log/mongodb.log --logappend

Config Servers:

In a Production Sharding Environment, we need 3 config servers, this is to ensure redundancy and high-availability. For our demonstration, we will will only use 1.

# on mongo-config
mongod --configsvr --bind_ip 0.0.0.0 --fork --logpath /var/log/mongodb.log --logappend

Query Routers:

Query Routers will be interfacing with your application. The query router nodes will then communicate with the config servers in order to get the location of the data which will be shard servers.

# on mongo-router
mongos --configdb mongo-config --fork --logpath /var/log/mongodb.log --logappend

Setup Clustering:

Now that we have all our needed servers running, we now will have to configure these nodes to talk to each other.

$ mongo
use admin
db.runCommand({ listshards : 1 })
{ "shards" : [ ], "ok" : 1 }

Adding Shards to the Cluster:

First we will add our shards:

mongos> sh.addShard("mongo-shard-1:27018");
{ "shardAdded" : "shard0000", "ok" : 1 }

mongos> sh.addShard("mongo-shard-2:27018");
{ "shardAdded" : "shard0001", "ok" : 1 }

mongos> sh.addShard("mongo-shard-3:27018");
{ "shardAdded" : "shard0002", "ok" : 1 }

Then list your shards to verify that they are configured correctly:

db.runCommand({ listshards : 1 })
{
	"shards" : [
		{
			"_id" : "shard0000",
			"host" : "mongo-shard-1:27018"
		},
		{
			"_id" : "shard0001",
			"host" : "mongo-shard-2:27018"
		},
		{
			"_id" : "shard0002",
			"host" : "mongo-shard-3:27018"
		}
	],
	"ok" : 1
}

We can also query the sharding status by running the following:

sh.status()
--- Sharding Status --- 
  sharding version: {
	"_id" : 1,
	"minCompatibleVersion" : 5,
	"currentVersion" : 6,
	"clusterId" : ObjectId("572ca286c544101b9f9e6548")
}
  shards:
	{  "_id" : "shard0000",  "host" : "mongo-shard-1:27018" }
	{  "_id" : "shard0001",  "host" : "mongo-shard-2:27018" }
	{  "_id" : "shard0002",  "host" : "mongo-shard-3:27018" }
  active mongoses:
	"3.2.5" : 1
  balancer:
	Currently enabled:  yes
	Currently running:  no
	Failed balancer rounds in last 5 attempts:  5
	Last reported error:  Server's sharding metadata manager failed to initialize and will remain in this state until the instance is manually reset :: caused by :: HostNotFound: unable to resolve DNS for host mongo-config
	Time of Reported error:  Fri May 06 2016 10:36:00 GMT-0400 (EDT)
	Migration Results for the last 24 hours: 
		No recent migrations
  databases:

Enable sharding for a collection:

db.runCommand({ enablesharding : "db1" })
db.runCommand({ shardcollection : "db1.col1", key : { Name : 1 }} );

Inserting Data:

use db1
db.db1.createIndex({ Name: "hashed" });
db.printShardingStatus()
for (var i = 1; i < 10000; i++) {db.col1.insert({ "Name": i }) }

Retrieving Sharding Statistics:


mongos> db.col1.getShardDistribution()

Shard shard0000 at mongo-shard-1:27018
 data : 350KiB docs : 9970 chunks : 1
 estimated data per chunk : 350KiB
 estimated docs per chunk : 9970

Shard shard0001 at mongo-shard-2:27018
 data : 36B docs : 1 chunks : 1
 estimated data per chunk : 36B
 estimated docs per chunk : 1

Shard shard0002 at mongo-shard-3:27018
 data : 1008B docs : 28 chunks : 1
 estimated data per chunk : 1008B
 estimated docs per chunk : 28

Totals
 data : 351KiB docs : 9999 chunks : 3
 Shard shard0000 contains 99.7% data, 99.7% docs in cluster, avg obj size on shard : 36B
 Shard shard0001 contains 0.01% data, 0.01% docs in cluster, avg obj size on shard : 36B
 Shard shard0002 contains 0.28% data, 0.28% docs in cluster, avg obj size on shard : 36B

Another example:

db.runCommand({ enablesharding : "db2" })
db.runCommand({ shardcollection : "db2.col1", key : { Name : 1 }} );
use db2
db.db2.createIndex({ Name: "hashed" });
db.printShardingStatus()

function randomString() { 
        var chars = 
"0123456789ABCDEFGHIJKLMNOPQRSTUVWXTZabcdefghiklmnopqrstuvwxyz"; 
        var randomstring = ''; 
        var string_length = 100;
        for (var i=0; i<string_length; i++) { 
                var rnum = Math.floor(Math.random() * chars.length); 
                randomstring += chars.substring(rnum,rnum+1); 
        } 
        return randomstring; 
} 

for (var i = 1; i < 10000; i++) {db.col1.insert({Name:randomString() }) }

Other Commands Used:

sh.status()
sh.isBalancerRunning()
sh.getBalancerState()
sh.getBalancerHost()
sh.startBalancer()

Resources Used: