MongoDB Sharding
Uzzal Basak
MongoDB Sharding
Sharding is a method for distributing data across multiple machines. MongoDB uses
sharding to support deployments with very large data sets and high throughput operations.
Database systems with large data sets or high throughput applications can challenge the
capacity of a single server. For example, high query rates can exhaust the CPU capacity
of the server. Working set sizes larger than the system’s RAM stress the I/O capacity of
disk drives.
There are two methods for addressing system growth: vertical and horizontal scaling.
Vertical Scaling involves increasing the capacity of a single server, such as using a more
powerful CPU, adding more RAM, or increasing the amount of storage space. Limitations
in available technology may restrict a single machine from being sufficiently powerful
for a given workload. Additionally, Cloud-based providers have hard ceilings based on
available hardware configurations. As a result, there is a practical maximum for vertical
scaling.
Horizontal Scaling involves dividing the system dataset and load over multiple servers,
adding additional servers to increase capacity as required. While the overall speed or
capacity of a single machine may not be high, each machine handles a subset of the overall
workload, potentially providing better efficiency than a single high-speed high-capacity
server. Expanding the capacity of the deployment only requires adding additional servers
as needed, which can be a lower overall cost than high-end hardware for a single machine.
The trade off is increased complexity in infrastructure and maintenance for the
deployment.
Sharded Cluster
A MongoDB sharded cluster consists of the following components:
 shard: Each shard contains a subset of the sharded data. Each shard can be
deployed as a replica set.
 mongos: The mongos acts as a query router, providing an interface between
client applications and the sharded cluster.
 config servers: Config servers store metadata and configuration settings for the
cluster.
A replica set in MongoDB is a group of mongod processes that maintain the same data set. Replica
sets provide redundancy and high availability, and are the basis for all production deployments.
Advantages of a Replica Set:
1. It ensures data availability during a disaster. If the primary fails because of a hardware issue, no data is lost.
2. Read queries can be executed against a secondary, which reduces the load on the primary.
3. Delayed replication can be configured on a secondary, which protects against data corruption caused by a developer. For example, with a one-hour replication delay the secondary always lags the primary by one hour; if a developer accidentally damages data, it can be recovered from the delayed secondary, so this secondary can be treated as a hot backup.
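A delayed member is set through the replica set configuration. The sketch below uses MongoDB 3.6 shell syntax; the member index (2) and the one-hour delay are only example values, not part of this setup:
cfg = rs.conf()
cfg.members[2].priority = 0 // a delayed member should never become primary
cfg.members[2].hidden = true // hide it from normal client reads
cfg.members[2].slaveDelay = 3600 // lag behind the primary by one hour (seconds)
rs.reconfig(cfg)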
When should we go for sharding?
Sharding is the most complex architecture we can deploy with MongoDB, and there are two main approaches to deciding when to shard. The first is to configure the cluster as soon as possible – when we predict high throughput and fast data growth.
The second is to use a cluster only when the application demands more resources than the replica set can offer (such as low memory, an overloaded disk, or high processor load). This approach is more corrective than preventative, but we will discuss that later.
1) Disaster recovery plan
Disaster recovery (DR) is a very delicate topic: how long could you tolerate an outage? If necessary, how long would it take to restore the entire database? Depending on the database size and disk speed, a backup/restore process might take hours or even days!
There is no hard number in gigabytes that justifies a cluster, but in general, once the database grows beyond roughly 200GB the backup and restore processes can take a long time to finish.
Let’s consider the case where we have a replica set with a 300GB database. The full restore process
might last around four hours, whereas if the database has two shards, it will take about two hours –
and depending on the number of shards we can improve that time. Simple math: if there are two shards,
the restore process takes half of the time to restore when compared to a single replica set.
2) Hardware limitations
Disk and memory are inexpensive nowadays. However, this is not true when companies need to scale to very large configurations (such as terabytes of RAM). Suppose a cloud provider can only offer up to 5,000 IOPS in the disk subsystem, but the application needs more than that to work correctly. To work around this performance limitation, it is better to start a cluster and divide the writes among instances. With two shards, the application has 10,000 IOPS available for reads and writes in the disk subsystem.
3) Storage engine limitations
There are a few storage engine limitations that can become a bottleneck. MMAPv1 has a lock per collection, while WiredTiger uses tickets that limit the number of reads and writes happening concurrently. Although we can tweak the number of tickets available in WiredTiger, there is a practical limit – raising the available tickets may generate processor overload instead of increasing performance. If one of these situations becomes a bottleneck in the system, we can start a cluster. Once the collection is sharded, the load and locks are distributed among the different instances.
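The current ticket usage can be inspected and, if needed, adjusted at runtime. This is only a sketch of the relevant commands; the value 256 is an arbitrary example, not a recommendation:
db.serverStatus().wiredTiger.concurrentTransactions // tickets available vs. in use for reads and writes
db.adminCommand({ setParameter: 1, wiredTigerConcurrentWriteTransactions: 256 }) // raise the write ticket limit (example value)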
4) Hot data vs. cold data
Several databases only work with a small percentage of the data being stored. This is called hot data
or working set. Cold data or historical data is rarely read, and demands considerable system resources
when it is. So why spend money on expensive machines that only store cold data or low-value
data? With a cluster deployment we can choose where the cold data is stored, and use cheap devices
and disks to do so. The same is true for hot data – we can use better machines to have better
performance. This methodology also speeds up writes and reads on the hot data, as the indexes are
smaller and add less overhead to the system.
5) Geo-distributed data
It doesn’t matter whether this need comes from application design or legal compliance. If the data
must stay within continent or country borders, a cluster helps make that happen. It is possible to limit
data localization so that it is stored solely in a specific “part of the world.” The number of shards and
their geographic positions is not essential for the application, as it only views the database. This is
commonly used in worldwide companies for better performance, or simply to comply with the local
law.
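Zone (tag-aware) sharding is the mechanism that pins ranges of the shard key to specific shards. A minimal sketch from the mongos shell, assuming a collection mydb.users sharded on { country: 1, userId: 1 } (the namespace, zone name, and shard key are illustrative, not part of this setup):
sh.addShardToZone("shard1", "EU") // associate a shard with a zone
sh.updateZoneKeyRange("mydb.users", { country: "DE", userId: MinKey }, { country: "DE", userId: MaxKey }, "EU") // keep this key range on the EU zone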
6) Infrastructure limitations
Infrastructure and hardware limitations are very similar. When thinking about infrastructure, however,
we focus on specific cases when the instances should be small. An example is running MongoDB on
Mesos. Some providers only offer a few cores and a limited amount of RAM. Even if you are willing
to pay more, it is not possible to purchase more than what they offer. A cluster
provides the option to split a small amount of data among a lot of shards, reaching the same
performance a big and expensive machine provides.
7) Failure isolation
Consider that a replica set or a single instance holds all the data. If for any reason this instance/replica
set goes down, the whole application goes down. In a cluster with five shards, for example, losing one shard still leaves 80% of the data available. Running several shards helps to isolate failures. Of course, running many instances makes it more likely that some instance will fail at any given time, but since each shard should have at least three members, the probability of an entire shard being down is minimal. For providers that offer
different zones, it is good practice to have different members of the shard in different availability zones
(or even different regions).
8) Speed up queries
Queries can take too long, depending on the number of reads they perform. In a clustered deployment,
queries can run in parallel and speed up the query response time. If a query runs in ten seconds in a
replica set, it is very likely that the same query will run in five to six seconds if the cluster has two
shards, and so on.
The sharding configuration is below.
I have only two VirtualBox machines, so the 3-node config server replica set, the 3-node Shard1 replica set, and the mongos process all run on these two machines.
CRS (Config Server Replica Set)
Host IP          Port
192.168.56.103   7001
192.168.56.101   7002
192.168.56.103   7003
Mongos Process Machine
Host IP          Port
192.168.56.101   4000
Shard1
Host IP          Port
192.168.56.101   5001
192.168.56.103   6002
192.168.56.101   5003
Target 1: Create a replica set for Shard 1 and test that the replica set is working.
Target 2: Create a config server replica set with 3 nodes and the configsvr cluster role.
Target 3: Create a MONGOS process and add the config servers to the MONGOS process.
Target 4: Add the shardsvr cluster role to the Shard 1 members and restart them with the new config file.
Target 5: Add Shard 1 to the MONGOS process.
Create directories for the keyfile, datafiles, and logfiles on both nodes
Shard1:
Host and Port No Datafile Logfile
192.168.56.101:5001 /home/oracle/mongo_shard/mongoShard1_1/datafile /home/oracle/mongo_shard/mongoShard1_1/logfile
192.168.56.103:6002 /home/oracle/mongo_shard/mongoShard1_2/datafile /home/oracle/mongo_shard/mongoShard1_2/logfile
192.168.56.101:5003 /home/oracle/mongo_shard/mongoShard1_3/datafile /home/oracle/mongo_shard/mongoShard1_3/logfile
ConfigServer
Host and Port No Datafile Logfile
192.168.56.103:7001 /home/oracle/mongo_shard/mongoConfig_1/datafile /home/oracle/mongo_shard/mongoConfig_1/logfile
192.168.56.101:7002 /home/oracle/mongo_shard/mongoConfig_2/datafile /home/oracle/mongo_shard/mongoConfig_2/logfile
192.168.56.103:7003 /home/oracle/mongo_shard/mongoConfig_3/datafile /home/oracle/mongo_shard/mongoConfig_3/logfile
Mongos
Host and Port No Datafile Logfile
192.168.56.101:4000 N/a /home/oracle/mongo_shard/Shard/logfile
Generate the keyfile and transfer it to the other node:
[23:11:51 oracle@test2 keyfile]$ openssl rand -base64 741 > /home/oracle/mongodb/keyfile/keyfile
[23:12:15 oracle@test2 keyfile]$ chmod 600 /home/oracle/mongodb/keyfile/keyfile
[23:12:24 oracle@test2 keyfile]$ scp /home/oracle/mongodb/keyfile/keyfile oracle@ansible:/home/oracle/mongodb/keyfile
oracle@ansible's password:
keyfile
100% 1004 648.0KB/s 00:00
[23:13:29 oracle@test2 keyfile]$
Target 1: Create a replica set for Shard 1 and test that the replica set is working
Config files for all three nodes
Node 1:
replication:
  replSetName: shard1
storage:
  dbPath: /home/oracle/mongo_shard/mongoShard1_1/datafile
net:
  bindIp: 192.168.56.101,localhost
  port: 5001
security:
  authorization: enabled
  keyFile: /home/oracle/mongodb/keyfile/keyfile
systemLog:
  destination: file
  path: /home/oracle/mongo_shard/mongoShard1_1/logfile/mongod.log
  logAppend: true
processManagement:
  fork: true
Node 2
replication:
  replSetName: shard1
storage:
  dbPath: /home/oracle/mongo_shard/mongoShard1_2/datafile
net:
  bindIp: 192.168.56.103,localhost
  port: 6002
security:
  authorization: enabled
  keyFile: /home/oracle/mongodb/keyfile/keyfile
systemLog:
  destination: file
  path: /home/oracle/mongo_shard/mongoShard1_2/logfile/mongod.log
  logAppend: true
processManagement:
  fork: true
Node 3
replication:
  replSetName: shard1
storage:
  dbPath: /home/oracle/mongo_shard/mongoShard1_3/datafile
net:
  bindIp: 192.168.56.101,localhost
  port: 5003
security:
  authorization: enabled
  keyFile: /home/oracle/mongodb/keyfile/keyfile
systemLog:
  destination: file
  path: /home/oracle/mongo_shard/mongoShard1_3/logfile/mongod.log
  logAppend: true
processManagement:
  fork: true
Start the MongoDB daemon process on each node using its config file
Node 1
[02:02:39 oracle@test2 bin]$ mongod -f config_file/shard1/node1.conf
about to fork child process, waiting until server is ready for connections.
forked process: 6033
child process started successfully, parent exiting
Node 2
[23:34:11 oracle@ansible bin]$ mongod -f config_file/shard1/node2.conf
about to fork child process, waiting until server is ready for connections.
forked process: 9687
child process started successfully, parent exiting
[23:34:27 oracle@ansible bin]$
Node 3
[02:05:31 oracle@test2 bin]$ mongod -f config_file/shard1/node3.conf
about to fork child process, waiting until server is ready for connections.
forked process: 6102
child process started successfully, parent exiting
ReplicaSet Configuration
Connecting to node1:
[02:06:19 oracle@test2 bin]$ mongo --port 5001
MongoDB shell version v3.6.8
connecting to: mongodb://127.0.0.1:5001/
MongoDB server version: 3.6.8
MongoDB Enterprise >
Replication Initialization
MongoDB Enterprise > rs.initiate()
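rs.initiate() without arguments starts a one-member replica set on the current host; the remaining members are added later. As an alternative sketch, the full membership could also be supplied up front (the hosts and ports below are the ones used in this setup):
rs.initiate({
  _id: "shard1",
  members: [
    { _id: 0, host: "192.168.56.101:5001" },
    { _id: 1, host: "192.168.56.103:6002" },
    { _id: 2, host: "192.168.56.101:5003" }
  ]
})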
MongoDB user Creation
db.createUser({
  user: "admin",
  pwd: "pass",
  roles: [
    { role: "root", db: "admin" }
  ]
})
MongoDB Enterprise > use admin
switched to db admin
MongoDB Enterprise > db.createUser({
... user: "admin",
... pwd: "pass",
... roles: [
... {role: "root", db: "admin"}
... ]
... })
Successfully added user: {
"user" : "admin",
"roles" : [
{
"role" : "root",
"db" : "admin"
}
]
}
MongoDB Enterprise shard1:PRIMARY>
Connect with admin user
mongo --host "shard1/192.168.56.101:5001" -u "admin" -p "pass" --authenticationDatabase "admin"
Add Node
rs.add("192.168.56.103:6002")
rs.add("192.168.56.101:5003")
Now test that replication is working:
rs.stepDown()
The primary changes to another member.
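To confirm the membership and see which node is currently primary, rs.status() can be checked from any member (a small sketch; the exact output fields depend on the server version):
rs.status().members.forEach(function (m) {
  print(m.name + "  " + m.stateStr) // expect one PRIMARY and two SECONDARY entries
})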
Target 2: Create a config server replica set with 3 nodes and the configsvr cluster role.
This Part So Far
Config servers store the metadata for a sharded cluster. The metadata reflects the state and organization of all data and components within the sharded cluster. The metadata includes the list of chunks on every shard and the ranges that define the chunks.
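Once the whole cluster is assembled, this metadata can be browsed from a mongos through the config database (illustrative queries only):
use config
db.chunks.find().limit(5).pretty() // each chunk's key range and the shard that owns it
db.shards.find() // the shards registered in the cluster
db.databases.find() // which databases have sharding enabled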
Config files for all three config server replica set nodes
Node 1:
sharding:
  clusterRole: configsvr
replication:
  replSetName: Config
security:
  keyFile: /home/oracle/mongodb/keyfile/keyfile
net:
  bindIp: localhost,192.168.56.103
  port: 7001
systemLog:
  destination: file
  path: /home/oracle/mongo_shard/mongoConfig_1/logfile/csrsvr.log
  logAppend: true
processManagement:
  fork: true
storage:
  dbPath: /home/oracle/mongo_shard/mongoConfig_1/datafile
Node 2
sharding:
  clusterRole: configsvr
replication:
  replSetName: Config
security:
  keyFile: /home/oracle/mongodb/keyfile/keyfile
net:
  bindIp: localhost,192.168.56.101
  port: 7002
systemLog:
  destination: file
  path: /home/oracle/mongo_shard/mongoConfig_2/logfile/csrsvr.log
  logAppend: true
processManagement:
  fork: true
storage:
  dbPath: /home/oracle/mongo_shard/mongoConfig_2/datafile
Node 3
sharding:
  clusterRole: configsvr
replication:
  replSetName: Config
security:
  keyFile: /home/oracle/mongodb/keyfile/keyfile
net:
  bindIp: localhost,192.168.56.103
  port: 7003
systemLog:
  destination: file
  path: /home/oracle/mongo_shard/mongoConfig_3/logfile/csrsvr.log
  logAppend: true
processManagement:
  fork: true
storage:
  dbPath: /home/oracle/mongo_shard/mongoConfig_3/datafile
Start the MongoDB daemon process for all config servers using their config files
Node 1
[00:31:31 oracle@ansible bin]$ mongod -f
/home/oracle/mongodb_software/bin/config_file/config_svr/cnsvr1.conf
about to fork child process, waiting until server is ready for connections.
forked process: 13992
child process started successfully, parent exiting
Node 2
[03:03:18 oracle@test2 bin]$ mongod -f
/home/TimesTen_SFT/mongodb/bin/config_file/config_svr/cnsvr2.conf
about to fork child process, waiting until server is ready for connections.
forked process: 7168
child process started successfully, parent exiting
[03:03:34 oracle@test2 bin]$
Node 3
[00:31:44 oracle@ansible bin]$ mongod -f
/home/oracle/mongodb_software/bin/config_file/config_svr/cnsvr3.conf
about to fork child process, waiting until server is ready for connections.
forked process: 14031
child process started successfully, parent exiting
Connect to one of the config servers:
mongo --port 7001
Initiating the CSRS:
rs.initiate()
Creating super user on CSRS:
use admin
db.createUser({
  user: "admin",
  pwd: "pass",
  roles: [
    { role: "root", db: "admin" }
  ]
})
Authenticating as the super user:
db.auth("admin", "pass")
Add the second and third nodes to the CSRS:
rs.add("192.168.56.101:7002")
rs.add("192.168.56.103:7003")
Target 3: Create a MONGOS process and add Config server in MONGOS process
MongoDB mongos instances route queries and write operations to shards in a
sharded cluster. mongos provide the only interface to a sharded cluster from the
perspective of applications. Applications never connect or communicate directly with
the shards.
The mongos tracks what data is on which shard by caching the metadata from
the config servers. The mongos uses the metadata to route operations from
applications and clients to the mongod instances. A mongos has no persistent state
and consumes minimal system resources.
The most common practice is to run mongos instances on the same systems as your
application servers, but you can maintain mongos instances on the shards or on
other dedicated resources.
Routing and Results Process
A mongos instance routes a query to a cluster by:
1. Determining the list of shards that must receive the query.
2. Establishing a cursor on all targeted shards.
The mongos then merges the data from each of the targeted shards and returns the
result document. Certain query modifiers, such as sorting, are performed on a shard
such as the primary shard before mongos retrieves the results.
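The routing decision can be observed with explain() run against a mongos. This is only a sketch; the collection name and filter are examples, and the exact explain fields vary by version:
db.orders.find({ customerId: 42 }).explain("executionStats")
// In the output, queryPlanner.winningPlan shows a SINGLE_SHARD or SHARD_MERGE stage
// and lists the shards that were targeted for this query.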
This Part So Far
sharding:
  configDB: Config/192.168.56.103:7001,192.168.56.101:7002,192.168.56.103:7003
security:
  keyFile: /home/oracle/mongodb/keyfile/keyfile
net:
  bindIp: localhost,192.168.56.101
  port: 4000
systemLog:
  destination: file
  path: /home/oracle/mongo_shard/Shard/logfile/shard.log
  logAppend: true
processManagement:
  fork: true
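The router is started with the mongos binary pointing at this file. The file name below is an assumption; use whatever name the configuration above was saved under:
mongos -f config_file/mongos.conf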
The mongos log shows it is also working fine.
Target 4: Add the shardsvr cluster role to the Shard 1 members and restart them with the new config file
Check Current Shard Status
mongo --port 4000 --username admin --password pass --authenticationDatabase admin
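Once connected to the mongos, the sharding overview can be printed. At this point it should list the config server replica set but no shards yet:
sh.status()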
Update the config file for Shard 1, starting with Node 1
Previous config file:
replication:
  replSetName: shard1
storage:
  dbPath: /home/oracle/mongo_shard/mongoShard1_1/datafile
net:
  bindIp: 192.168.56.101,localhost
  port: 5001
security:
  authorization: enabled
  keyFile: /home/oracle/mongodb/keyfile/keyfile
systemLog:
  destination: file
  path: /home/oracle/mongo_shard/mongoShard1_1/logfile/mongod.log
  logAppend: true
processManagement:
  fork: true
New config file:
sharding:
  clusterRole: shardsvr
storage:
  dbPath: /home/oracle/mongo_shard/mongoShard1_1/datafile
  wiredTiger:
    engineConfig:
      cacheSizeGB: .1
net:
  bindIp: 192.168.56.101,localhost
  port: 5001
security:
  keyFile: /home/oracle/mongodb/keyfile/keyfile
systemLog:
  destination: file
  path: /home/oracle/mongo_shard/mongoShard1_1/logfile/mongod.log
  logAppend: true
processManagement:
  fork: true
replication:
  replSetName: shard1
Connect:
mongo --port 5001 -u "admin" -p "pass" --authenticationDatabase "admin"
db.shutdownServer()
Now start the node with the new config file above.
Check the replica set status and make the upgraded node the primary.
The current primary is the node on port 5003, so connect to it and step it down:
mongo --port 5003 -u "admin" -p "pass" --authenticationDatabase "admin"
rs.isMaster()
rs.stepDown()
Shut down Node 2 and restart it with the new config file
mongo --port 6002 -u "admin" -p "pass" --authenticationDatabase "admin"
db.shutdownServer()
Previous config file:
replication:
  replSetName: shard1
storage:
  dbPath: /home/oracle/mongo_shard/mongoShard1_2/datafile
net:
  bindIp: 192.168.56.103,localhost
  port: 6002
security:
  authorization: enabled
  keyFile: /home/oracle/mongodb/keyfile/keyfile
systemLog:
  destination: file
  path: /home/oracle/mongo_shard/mongoShard1_2/logfile/mongod.log
  logAppend: true
processManagement:
  fork: true
New config file:
sharding:
  clusterRole: shardsvr
storage:
  dbPath: /home/oracle/mongo_shard/mongoShard1_2/datafile
  wiredTiger:
    engineConfig:
      cacheSizeGB: .1
net:
  bindIp: 192.168.56.103,localhost
  port: 6002
security:
  keyFile: /home/oracle/mongodb/keyfile/keyfile
systemLog:
  destination: file
  path: /home/oracle/mongo_shard/mongoShard1_2/logfile/mongod.log
  logAppend: true
processManagement:
  fork: true
replication:
  replSetName: shard1
Start the mongod process for Node 2 with the new config file.
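Assuming the same config file was edited in place, the start command mirrors the one used earlier for this node:
mongod -f config_file/shard1/node2.conf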
Shut down Node 3 and restart it with the new config file
mongo --port 5003 -u "admin" -p "pass" --authenticationDatabase "admin"
db.shutdownServer()
Previous config file:
replication:
  replSetName: shard1
storage:
  dbPath: /home/oracle/mongo_shard/mongoShard1_3/datafile
net:
  bindIp: 192.168.56.101,localhost
  port: 5003
security:
  authorization: enabled
  keyFile: /home/oracle/mongodb/keyfile/keyfile
systemLog:
  destination: file
  path: /home/oracle/mongo_shard/mongoShard1_3/logfile/mongod.log
  logAppend: true
processManagement:
  fork: true
New config file:
sharding:
  clusterRole: shardsvr
storage:
  dbPath: /home/oracle/mongo_shard/mongoShard1_3/datafile
  wiredTiger:
    engineConfig:
      cacheSizeGB: .1
net:
  bindIp: 192.168.56.101,localhost
  port: 5003
security:
  keyFile: /home/oracle/mongodb/keyfile/keyfile
systemLog:
  destination: file
  path: /home/oracle/mongo_shard/mongoShard1_3/logfile/mongod.log
  logAppend: true
processManagement:
  fork: true
replication:
  replSetName: shard1
Start Node 3 with new config file
mongod -f config_file/shard1/node3.conf
and connect to it:
mongo --port 5003 -u "admin" -p "pass" --authenticationDatabase "admin"
Target 5: Add Shard 1 to the MONGOS process
Connect to the sharded cluster through the mongos:
mongo --port 4000 --username admin --password pass --authenticationDatabase admin
Adding new shard to cluster from mongos:
sh.addShard("shard1/192.168.56.101:5001")
The output shows the shard was added.
The log also shows the shard being added.
Check the shard status:
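Both of the following are standard admin helpers for this; shard1 should now be listed:
db.adminCommand({ listShards: 1 }) // shard1 should appear with its replica set members
sh.status() // full overview: shards, databases, and chunk distribution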
We can then add a new shard named shard2 by repeating the same steps.
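A sketch of what that would look like, plus actually sharding a collection once more than one shard exists. The shard2 host and port, the database name, and the shard key are all hypothetical examples:
sh.addShard("shard2/192.168.56.101:5004") // second replica set, hypothetical host:port
sh.enableSharding("test") // allow sharding for the example database
sh.shardCollection("test.orders", { _id: "hashed" }) // distribute the example collection across shards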