MongoDB
Replication & Sharding
What is replication?
• Replication is a way of keeping identical copies of data on multiple servers. It is recommended for all production deployments.
• Primary server: in the examples that follow, a standalone server on port 27017 becomes the primary.
• Secondary servers: two more servers are started on ports 27020 and 27021.
Best practices for replication
Replica sets are MongoDB's mechanism for providing redundancy, high availability and, under the right conditions, higher read throughput. Replication in MongoDB is easy to configure and light in operational terms.
• Always use replica sets: Even if your dataset is small at the moment and you don't expect it to grow exponentially, you never know when that might happen. A replica set of at least three servers also builds in data redundancy from day one and lets you separate real-time workloads from analytics (run on the secondaries).
• Use a replica set to your advantage: A replica set is not just for data replication. In most cases we can and should send writes to the primary server and prefer reads from one of the secondaries to offload the primary. This is done by setting a read preference for reads, together with the correct write concern to ensure that writes propagate as needed. Because replication is asynchronous, reads from a secondary can return slightly stale data.
• Use an odd number of members in a MongoDB replica set: If a server goes down or loses connectivity with the rest (a network partition), the remaining members have to vote on which one will be elected primary. With an odd number of members, each subset of servers knows whether it belongs to the majority or the minority of the replica set. If we can't have an odd number of members, we need one extra host set up as an arbiter, whose sole purpose is to vote in the election process. Even a micro instance in EC2 can serve this purpose.
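The odd-number rule follows from simple vote arithmetic. The sketch below is illustrative only (the function names are made up for this example and are not part of MongoDB): it computes how many votes form a majority and how many members can be lost before no primary can be elected.

```python
def majority(voting_members: int) -> int:
    """Smallest number of votes that is more than half of the voting members."""
    if voting_members < 1:
        raise ValueError("a replica set needs at least one voting member")
    return voting_members // 2 + 1


def fault_tolerance(voting_members: int) -> int:
    """How many voting members can be lost while a primary can still be elected."""
    return voting_members - majority(voting_members)


if __name__ == "__main__":
    for n in range(1, 8):
        print(f"{n} members: majority={majority(n)}, tolerates {fault_tolerance(n)} failure(s)")
```

Going from three to four members raises the majority from two to three without tolerating any additional failure, which is why an even member count buys nothing extra.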
Step-by-step MongoDB replication setup on Windows
Important note: Before setting up MongoDB replication, take a backup of all important data.
1. Start the standalone server as shown below.
mongod --dbpath "C:\Program Files\MongoDB\Server\4.0\data" --logpath "C:\Program Files\MongoDB\Server\4.0\log\mongod.log" --port 27017 --storageEngine=wiredTiger --journal --replSet testdb
2. Connect to the server on port 27017.
mongo --port 27017
3. Create the variable rsconf and initiate the replica set with it.
rsconf={_id:"testdb",members:[{_id:0,host:"localhost:27017"}]}
rs.initiate(rsconf)
Note: Here replication is configured on a single Windows machine. If you have three different machines, replace localhost with each machine's host name or IP address and port number.
4. Start a secondary server on port 27020.
mongod --dbpath "C:\data1\db" --logpath "C:\data1\log\mongod.log" --port 27020 --storageEngine=wiredTiger --journal --replSet testdb
5. Log on to that secondary server.
mongo --port 27020
6. Start another secondary server on port 27021.
mongod --dbpath "C:\data2\db" --logpath "C:\data2\log\mongod.log" --port 27021 --storageEngine=wiredTiger --journal --replSet testdb
7. Log on to that secondary server.
mongo --port 27021
8. Run the following commands on the primary server, one for each secondary.
rs.add("localhost:27020")
rs.add("localhost:27021")
9. Now go to the secondary servers and run the command below on both of them, so that the shell session is allowed to read from a secondary. In MongoDB 4.4 and later shells the equivalent helper is rs.secondaryOk().
rs.slaveOk()
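The rsconf document in step 3 lists only the first member, and the other two are added afterwards with rs.add(). As a rough sketch, assuming the same set name and ports as in this walkthrough, the Python below builds the equivalent three-member configuration document, which could instead be passed to rs.initiate() in one go. The helper name build_rsconf is hypothetical, and the script only prints JSON; it does not talk to MongoDB.

```python
import json


def build_rsconf(set_name, hosts):
    """Build a replica set configuration document like the rsconf variable above.

    Each member gets a unique integer _id, assigned in list order.
    """
    if len(set(hosts)) != len(hosts):
        raise ValueError("each member must have a distinct host:port")
    return {
        "_id": set_name,
        "members": [{"_id": i, "host": host} for i, host in enumerate(hosts)],
    }


# Set name and ports taken from the Windows walkthrough above.
rsconf = build_rsconf("testdb", ["localhost:27017", "localhost:27020", "localhost:27021"])

if __name__ == "__main__":
    print(json.dumps(rsconf, indent=2))
```

The duplicate check is there because listing the same host:port twice is an easy slip when typing member lists by hand.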
Replication setup verification
Create a collection on the primary server and verify that the change is reflected on the secondary servers.
1. Connect to the primary server and switch to a new database.
use ecom
2. Create a collection on the primary by inserting a document.
db.test.insert({name:"MongoDB"})
3. Now connect to the secondary servers and list the databases.
show dbs
4. Switch to the newly created database.
use ecom
5. Query the collection in the ecom database.
db.test.find().pretty()
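An application would normally connect to the replica set as a whole rather than to one member, and state its read preference and write concern in the connection string. The sketch below assembles such a URI for the three local members used here. The helper name is hypothetical; replicaSet, readPreference and w are standard MongoDB connection string options, and the values chosen (secondaryPreferred, majority) are just one reasonable assumption.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, database, set_name, read_preference="secondaryPreferred", w="majority"):
    """Assemble a mongodb:// connection string that names every member of the set."""
    options = urlencode({"replicaSet": set_name, "readPreference": read_preference, "w": w})
    return f"mongodb://{','.join(hosts)}/{database}?{options}"


# Hosts, database and set name from the Windows walkthrough above.
uri = replica_set_uri(["localhost:27017", "localhost:27020", "localhost:27021"], "ecom", "testdb")

if __name__ == "__main__":
    print(uri)
```

With a URI like this, a driver discovers which member is currently primary on its own, so the application keeps working after an election.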
MongoDB replication setup step by step on Linux
1. Back up /etc/hosts and /etc/mongod.conf
2. Configure /etc/hosts
3. Configure the firewall
4. Configure the MongoDB replica set
5. Initiate replication
6. Test the replication
In this step-by-step setup on Linux, the IP addresses and host names of the three servers are:
• 192.168.152.135 mongodb1
• 192.168.152.141 mongodb2
• 192.168.152.142 mongodb3
Step 1: Back up /etc/hosts and /etc/mongod.conf
cp /etc/hosts hosts_before_mod
cp /etc/mongod.conf mongod_before_mod
Step 2: Configure /etc/hosts on all servers
Edit the /etc/hosts file on every replica server.
vi /etc/hosts
Add the lines below and save the file.
192.168.152.135 mongodb1
192.168.152.141 mongodb2
192.168.152.142 mongodb3
To check a server's own host name and IP address, use the OS commands hostname and ifconfig.
Step 3: Configure the firewall on all nodes in the replica set
Install ufw (Uncomplicated Firewall) if it is not already installed, then enable it.
sudo apt install ufw
sudo ufw enable
Open port 27017 on all replica nodes.
sudo ufw allow 27017
Verify that the firewall allows port 27017.
sudo ufw status verbose
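Besides inspecting the firewall rules on each node, it can help to test from another machine whether the port actually accepts connections. The following is a minimal sketch using only the Python standard library; the function name port_open is made up for this example, and the host names are the ones assumed in the /etc/hosts entries above.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False


if __name__ == "__main__":
    # Host names from the /etc/hosts entries above; 27017 is the port opened with ufw.
    for host in ("mongodb1", "mongodb2", "mongodb3"):
        print(host, "reachable" if port_open(host, 27017) else "NOT reachable")
```

A successful connection only shows that something is listening and the firewall lets traffic through, so mongod has to be running on the target node for the check to pass.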
Step 4: Configure the MongoDB replica set
This is done by modifying the /etc/mongod.conf configuration file. In this step we set bindIp and the replica set name.
vi /etc/mongod.conf
Add the IP address of your host to the bindIp field.
Before modification of bindIp:
# network interfaces
net:
  port: 27017
  bindIp: 127.0.0.1
After modification of bindIp:
# network interfaces
net:
  port: 27017
  bindIp: 192.168.152.135 #127.0.0.1
Now enable replication and add the replica set name.
Before the replica set name is added:
#replication:
After the replica set name is added (remove the hash mark and add the replSetName field; the replica set name is case sensitive):
replication:
  replSetName: "testdb"
Important note: There must be a single space after the colon (:) and two spaces before replSetName.
Repeat step 4 on all replica nodes, using each node's own IP address for bindIp.
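Since the configuration file is YAML and indentation mistakes are easy to make when the same edit is repeated on three nodes, one option is to generate the fragment. This is only a sketch: the function name is hypothetical and it covers just the net and replication settings discussed in this step, not a complete mongod.conf.

```python
def mongod_conf_fragment(bind_ip, repl_set_name, port=27017):
    """Render the net and replication sections of mongod.conf as YAML text.

    Nested keys are indented with two spaces and every colon is followed by a
    single space. Tabs are not valid YAML indentation, so none are used.
    """
    lines = [
        "# network interfaces",
        "net:",
        f"  port: {port}",
        f"  bindIp: {bind_ip}",
        "",
        "replication:",
        f'  replSetName: "{repl_set_name}"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # IP address and set name as used for the first node in this walkthrough.
    print(mongod_conf_fragment("192.168.152.135", "testdb"), end="")
```

Calling it once per node with that node's IP address keeps the replica set name identical everywhere, which matters because the name is case sensitive.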
Step 5: Initiate replication on the three nodes
Restart the MongoDB service on all three nodes and check that it is running:
systemctl restart mongod.service
systemctl status mongod.service
Connect with the mongo shell to the server that will act as the primary (mongodb1 here). The replica set is initiated on one node only. Then execute the command:
rs.initiate()
Press Enter twice.
After this, add the other nodes to the replica set.
rs.add("192.168.152.141")
rs.add("192.168.152.142")
Now verify the replication status:
rs.status()
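The rs.status() output is long, and usually only a few fields per member matter at first glance: name, health and stateStr. The sketch below shows one way to summarize such a document in Python. The sample dictionary is hypothetical and heavily trimmed, and the function name is made up for this example.

```python
def summarize_status(status):
    """Return (primary_name, secondary_names, unhealthy_names) from a status document."""
    primary = None
    secondaries, unhealthy = [], []
    for member in status["members"]:
        if member.get("health") != 1:
            unhealthy.append(member["name"])
        elif member["stateStr"] == "PRIMARY":
            primary = member["name"]
        elif member["stateStr"] == "SECONDARY":
            secondaries.append(member["name"])
    return primary, secondaries, unhealthy


# Hypothetical, heavily trimmed sample; real rs.status() output has many more fields.
sample = {
    "set": "testdb",
    "members": [
        {"name": "192.168.152.135:27017", "health": 1, "stateStr": "PRIMARY"},
        {"name": "192.168.152.141:27017", "health": 1, "stateStr": "SECONDARY"},
        {"name": "192.168.152.142:27017", "health": 0, "stateStr": "(not reachable/healthy)"},
    ],
}

if __name__ == "__main__":
    print(summarize_status(sample))
```

A healthy three-node set should report one primary, two secondaries and an empty unhealthy list.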
Step 6: Test the replication
Create a database on the primary server, create a collection in it, and check whether the change shows up on the secondary servers.
Go to the primary node.
a) Create a database:
testdb:PRIMARY> use test
switched to db test
testdb:PRIMARY> db
test
testdb:PRIMARY>
b) Create a collection in that database by inserting a document:
db.eclhur.insert({"name":"Elchuru"})
Then run the command:
show dbs
c) Now switch to the secondary nodes and type show dbs. If the new database appears on the secondary nodes, the replication setup is successful; otherwise something went wrong and needs to be troubleshot.
testdb:SECONDARY> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
test 0.000GB
The new database has been replicated to the secondary node, so the replication test is successful.
Check the status of replication
Run the commands below from the primary node to get complete information about the replica set.
rs.conf()
rs.status()
rs0:PRIMARY> rs.conf()
{
    "_id" : "rs0",
    "version" : 2,
    "protocolVersion" : NumberLong(1),
    "writeConcernMajorityJournalDefault" : true,
    "members" : [
        {
            "_id" : 0,
            "host" : "localhost:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {
            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 1,
            "host" : "localhost:27018",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {
            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 2,
            "host" : "localhost:27019",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {
            ...
        {
            "_id" : 3,
            "host" : "localhost:27016",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {
            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        }
    ],
    "settings" : {
        "chainingAllowed" : true,
        "heartbeatIntervalMillis" : 2000,
        "heartbeatTimeoutSecs" : 10,
        "electionTimeoutMillis" : 10000,
        "catchUpTimeoutMillis" : -1,
        "catchUpTakeoverDelayMillis" : 30000,
        "getLastErrorModes" : {
        },
        "getLastErrorDefaults" : {
            "w" : 1,
            "wtimeout" : 0
        },
        "replicaSetId" : ObjectId("5ed6180d01a39f2162335de5")
    }
}
Add a new MongoDB instance to a replica set
Connect to the primary with the mongo shell and run the command below.
Syntax: rs.add("hostname:port")
Example:
rs0:PRIMARY> rs.add("localhost:27016")
{
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1591094195, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1591094195, 1)
}
Remove an existing MongoDB instance from the replica set
The command below removes the given secondary host from the replica set.
Syntax: rs.remove("hostname:port")
Example:
rs0:PRIMARY> rs.remove("localhost:27016")
{
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1591095681, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1591095681, 1)
}
rs0:PRIMARY>
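Make the primary step down to a secondary
MongoDB provides a command that instructs the primary to step down and become a secondary, which triggers the election of a new primary. stepDownSecs is the number of seconds during which the member that stepped down is not eligible to become primary again; secondaryCatchupSecs is the number of seconds to wait for an electable secondary to catch up with the primary.
Syntax: rs.stepDown( stepDownSecs , secondaryCatchupSecs )
Example:
rs0:PRIMARY> rs.stepDown(12)
{
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1591096055, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1591096055, 1)
}
rs0:SECONDARY>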
Check the replication lag between the primary and the secondaries
The command below, run on the primary, reports how far each secondary in the replica set is behind it.
Syntax: rs.printSlaveReplicationInfo()
Example:
rs0:PRIMARY> rs.printSlaveReplicationInfo()
source: localhost:27018
    syncedTo: Tue Jun 02 2020 16:14:04 GMT+0530 (India Standard Time)
    0 secs (0 hrs) behind the primary
source: localhost:27019
    syncedTo: Thu Jan 01 1970 05:30:00 GMT+0530 (India Standard Time)
    1591094644 secs (441970.73 hrs) behind the primary
source: localhost:27016
    syncedTo: Tue Jun 02 2020 16:14:04 GMT+0530 (India Standard Time)
    0 secs (0 hrs) behind the primary
rs0:PRIMARY>
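The huge figure for localhost:27019 is not a real 50-year delay. Its syncedTo value is the Unix epoch (time zero), which typically means the member has not synced any operation yet or could not be reached, so the lag is simply the primary's latest operation time minus zero. The sketch below reproduces the arithmetic; the function name is made up for this example, and the timestamps are the ones shown in the output.

```python
def replication_lag(primary_optime_secs, secondary_synced_to_secs):
    """Return (seconds, hours) that a secondary is behind the primary."""
    seconds = max(0, primary_optime_secs - secondary_synced_to_secs)
    return seconds, round(seconds / 3600, 2)


# Epoch seconds for Tue Jun 02 2020 16:14:04 GMT+0530, the primary's latest operation above.
PRIMARY_OPTIME = 1591094644

if __name__ == "__main__":
    print(replication_lag(PRIMARY_OPTIME, 1591094644))  # a member that is caught up
    print(replication_lag(PRIMARY_OPTIME, 0))           # a member whose syncedTo is the epoch
```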
Monitoring replication
• Replication lag: the delay in copying data from the primary node to a secondary node.
• Replica state: tracks whether secondary nodes have gone down and whether a new primary node has been elected.
• Locking state: shows which data locks are set and how long they have been in place.
• Disk utilization: how heavily the disks are being accessed.
• Memory usage: how much memory is being used, and how it is being used.
• Number of connections: how many connections the database has open in order to serve requests as quickly as possible.
MongoDB Sharding
• Sharding is an architecture for storing large data sets across distributed servers.
• In MongoDB, sharding is used for very large data sets and rapidly growing storage requirements. Large applications are built on end-to-end transactional data that grows day by day, so the space they need increases rapidly.
• As the stored information grows, a single machine can no longer provide the required storage capacity. We have to distribute the information in chunks across different servers.
• In MongoDB, sharding provides a horizontally scalable architecture in which data is divided among different servers.
• With sharding, we can add servers to the existing database deployment to support growing data easily. The architecture balances the data automatically across the connected servers.
• A single shard is a single instance of the database, and together the shards form one logical database. As the cluster grows by adding shards, the share of data each shard is responsible for shrinks.
• For example, suppose we have to store 1 GB (1024 MB) of information in MongoDB. With four shards each will hold about 256 MB, and with two shards each will hold about 512 MB.
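The per-shard figures in the example are plain division, taking 1 GB as 1024 MB and assuming the data is spread perfectly evenly (real clusters are only approximately balanced). A minimal sketch, with a made-up function name:

```python
def per_shard_mb(total_mb, shard_count):
    """Average data per shard in MB, assuming a perfectly even distribution."""
    if shard_count < 1:
        raise ValueError("need at least one shard")
    return total_mb / shard_count


if __name__ == "__main__":
    for shards in (1, 2, 4, 8):
        print(f"{shards} shard(s): {per_shard_mb(1024, shards):.0f} MB each")
```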
MongoDB Sharding Key
When implementing sharding in MongoDB, we have to define the key by which a sharded collection's documents are distributed. It plays a role similar to a primary key for the sharded collection.
For example, suppose we have a collection of student information for a particular class of 14 students, and two shard instances. The collection is then divided between these shards, with 7 documents on each. The field (or fields) whose value decides which shard each document goes to is known as the shard key. It may be a single field (numeric, for example), a compound of several fields, or a hashed field.
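To make the 14-student example concrete, the sketch below assigns student IDs 1 to 14 to two shards in two ways: by range on the key and by a hash of the key. It is an illustration under stated assumptions only. The split point, the function names and the MD5-modulo hash are made up for this example and are not how MongoDB computes hashed shard keys internally.

```python
import hashlib
from collections import Counter

STUDENT_IDS = range(1, 15)  # 14 students, as in the example above


def range_shard(student_id, split_point=8):
    """Ranged sharding: keys below the split point go to shard 0, the rest to shard 1."""
    return 0 if student_id < split_point else 1


def hashed_shard(student_id, shard_count=2):
    """Hashed sharding sketch: hash the key, then map the hash onto a shard.

    MD5-modulo is a stand-in for illustration, not MongoDB's internal hashing.
    """
    digest = hashlib.md5(str(student_id).encode()).hexdigest()
    return int(digest, 16) % shard_count


range_counts = Counter(range_shard(i) for i in STUDENT_IDS)
hashed_counts = Counter(hashed_shard(i) for i in STUDENT_IDS)

if __name__ == "__main__":
    print("ranged:", dict(range_counts))
    print("hashed:", dict(hashed_counts))
```

The ranged split gives exactly 7 documents per shard here, but consecutive IDs all land on the same shard. A hashed key scatters neighbouring IDs across shards, at the cost of making range queries touch every shard.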
What is Sharding in MongoDB?
Sharding is a concept in MongoDB that splits large data sets into smaller data sets across multiple MongoDB instances.
Sometimes the data within MongoDB is so large that queries against it cause heavy CPU utilization on the server. To tackle this situation, MongoDB splits the data set across multiple MongoDB instances.
A collection that could grow large is split across multiple instances, called shards. Logically, all the shards work as one collection.
How to Implement Sharding
Sharding is implemented with a cluster, which is a group of MongoDB instances.
The components of a sharded cluster are:
1. A shard – a MongoDB instance that holds a subset of the data. In production environments, every shard needs to be a replica set.
2. A config server – a MongoDB instance that holds metadata about the cluster, essentially information about which MongoDB instances hold which shard data.
3. A router – a mongos instance that is responsible for redirecting the commands sent by the client to the right servers.
Step by Step Sharding Cluster Example
Step 1) Create a separate data directory for the config server.
mkdir /data/configdb
Step 2) Start the MongoDB instance in configuration mode. Suppose Server D is to be our config server; run the command below on it. Since MongoDB 3.4 the config server must itself run as a replica set, named configReplSet here purely as an example. After starting it, connect with mongo --port 27019 and run rs.initiate() once.
mongod --configsvr --replSet configReplSet --dbpath /data/configdb --port 27019
Step 3) Start the mongos instance, specifying the config server.
mongos --configdb configReplSet/ServerD:27019
Step 4) From the mongo shell, connect to the mongos instance.
mongo --host ServerD --port 27017
Step 5) If Server A and Server B are to be added to the cluster, issue the commands below. On MongoDB 3.6 and later each shard must be a replica set, in which case the argument takes the form "replSetName/host:port".
sh.addShard("ServerA:27017")
sh.addShard("ServerB:27017")
Step 6) Enable sharding for the database. To shard the Employeedb database, issue the command below.
sh.enableSharding("Employeedb")
Step 7) Enable sharding for the collection. To shard the Employee collection, issue the command below. The first argument is the full namespace: the database name, a dot, then the collection name.
sh.shardCollection("Employeedb.Employee" , { "Employeeid" : 1 , "EmployeeName" : 1 })
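The sh.* shell helpers in steps 5 to 7 are wrappers around the admin commands addShard, enableSharding and shardCollection. As a sketch, assuming the Employee collection lives in the Employeedb database, the Python below builds those command documents so that the expected namespace format is explicit. The function names are hypothetical, and nothing is sent to a server.

```python
import json


def namespace(database, collection):
    """Full namespace of a collection: '<database>.<collection>'."""
    return f"{database}.{collection}"


def sharding_commands(database, collection, shard_key, shard_hosts):
    """Build the admin command documents for adding shards and sharding a collection."""
    commands = [{"addShard": host} for host in shard_hosts]
    commands.append({"enableSharding": database})
    commands.append({"shardCollection": namespace(database, collection), "key": shard_key})
    return commands


# Database, collection, shard key and hosts as used in the steps above.
commands = sharding_commands(
    "Employeedb",
    "Employee",
    {"Employeeid": 1, "EmployeeName": 1},
    ["ServerA:27017", "ServerB:27017"],
)

if __name__ == "__main__":
    for command in commands:
        print(json.dumps(command))
```

The order matters: shards are added first, sharding is enabled on the database next, and only then can a collection in that database be sharded.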
Summary:
• Sharding is a concept in MongoDB that splits large data sets into smaller data sets across multiple MongoDB instances.
•https://www.guru99.com/mongodb-vs-mysql.html
Why Sharding?
• In replication, all writes go to the primary node
• Latency-sensitive queries still go to the primary
• A single replica set is limited to 50 members, of which at most 7 can vote (the limit was 12 members before MongoDB 3.0)
• Memory can't be large enough when the active data set is big
• Local disk is not big enough
• Vertical scaling is too expensive
Sharding in MongoDB
The following diagram shows sharding in MongoDB using a sharded cluster.
• In the diagram, there are three main components −
• Shards − Shards are used to store data. They provide high availability and data consistency. In a production environment, each shard is a separate replica set.
• Config Servers − Config servers store the cluster's metadata. This data contains a mapping of the cluster's data set to the shards. The query router uses this metadata to target operations to specific shards. In a production environment the config servers are deployed as a replica set, commonly with three members (older releases required exactly 3 mirrored config servers).
• Query Routers − Query routers are mongos instances that interface with client applications and direct operations to the appropriate shard. The query router processes and targets the operations to shards and then returns the results to the clients. A sharded cluster can contain more than one query router to divide the client request load, and a client sends its requests to one query router. Generally, a sharded cluster has many query routers.
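The interplay between config server metadata and the query router can be sketched as a range lookup. In the example below the chunk boundaries, shard names and function name are all hypothetical; it only illustrates the idea of mapping a shard key value to the shard that owns the matching range, not how mongos is actually implemented.

```python
from bisect import bisect_right

# Hypothetical chunk metadata, as a config server might hold it: each entry is the
# inclusive lower bound of a chunk of the shard key range and the shard that owns it.
CHUNKS = [
    (float("-inf"), "shardA"),
    (1000, "shardB"),
    (2000, "shardA"),
    (3000, "shardB"),
]
LOWER_BOUNDS = [low for low, _ in CHUNKS]


def target_shard(shard_key_value):
    """Route an operation the way a query router would: find the chunk whose
    range contains the shard key value and return the shard that owns it."""
    index = bisect_right(LOWER_BOUNDS, shard_key_value) - 1
    return CHUNKS[index][1]


if __name__ == "__main__":
    for key in (5, 1000, 2500, 99999):
        print(key, "->", target_shard(key))
```

A query that includes the shard key can be sent to a single shard this way, whereas a query without it has to be broadcast to every shard.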

More Related Content

PDF
Introduction to elasticsearch
PPTX
Java script arrays
PPTX
React state
PDF
The New JavaScript: ES6
PPTX
PDF
Angular - Chapter 9 - Authentication and Authorization
PDF
MongoDB Aggregation Framework
PDF
HTML5: features with examples
Introduction to elasticsearch
Java script arrays
React state
The New JavaScript: ES6
Angular - Chapter 9 - Authentication and Authorization
MongoDB Aggregation Framework
HTML5: features with examples

What's hot (20)

PPTX
Introduction to HTML5 Canvas
PPTX
MongoDB - Aggregation Pipeline
PDF
Postgresql tutorial
PPTX
Spring Boot Tutorial
PDF
Atomic design
PPTX
Angular modules in depth
ODP
Introduction to jQuery
PPTX
Sequelize
PDF
TypeScript - An Introduction
PPTX
Owl: The New Odoo UI Framework
PPTX
Introduction to JSX
PPTX
jQuery
PPTX
Typescript ppt
PPT
Introduction to Javascript
PPTX
HTML, CSS, JavaScript for beginners
PDF
Introduction to Elasticsearch
PPTX
Elasticsearch
PDF
Java 17
ODP
Routing & Navigating Pages in Angular 2
Introduction to HTML5 Canvas
MongoDB - Aggregation Pipeline
Postgresql tutorial
Spring Boot Tutorial
Atomic design
Angular modules in depth
Introduction to jQuery
Sequelize
TypeScript - An Introduction
Owl: The New Odoo UI Framework
Introduction to JSX
jQuery
Typescript ppt
Introduction to Javascript
HTML, CSS, JavaScript for beginners
Introduction to Elasticsearch
Elasticsearch
Java 17
Routing & Navigating Pages in Angular 2
Ad

Similar to Mongodb replication (20)

PDF
Setting up mongo replica set
PPTX
MongoDB basics & Introduction
PPTX
Getting started with replica set in MongoDB
PDF
Configuring MongoDB HA Replica Set on AWS EC2
PPTX
Get expertise with mongo db
PPTX
EuroPython 2016 : A Deep Dive into the Pymongo Driver
PDF
Setting up mongodb sharded cluster in 30 minutes
PDF
Evolution Of MongoDB Replicaset
PDF
Chef solo the beginning
PDF
Webinar: MariaDB Provides the Solution to Ease Multi-Source Replication
PDF
Mongodb workshop
PDF
Evolution of MongoDB Replicaset and Its Best Practices
PDF
Exploring the replication in MongoDB
DOCX
MongoDB Replication and Sharding
PDF
오픈 소스 프로그래밍 - NoSQL with Python
PPTX
MongoDB for Beginners
PDF
Maintenance for MongoDB Replica Sets
PPTX
Orchestration tool roundup kubernetes vs. docker vs. heat vs. terra form vs...
PPTX
Uri Cohen & Dan Kilman, GigaSpaces - Orchestration Tool Roundup - OpenStack l...
PDF
How do i Meet MongoDB
Setting up mongo replica set
MongoDB basics & Introduction
Getting started with replica set in MongoDB
Configuring MongoDB HA Replica Set on AWS EC2
Get expertise with mongo db
EuroPython 2016 : A Deep Dive into the Pymongo Driver
Setting up mongodb sharded cluster in 30 minutes
Evolution Of MongoDB Replicaset
Chef solo the beginning
Webinar: MariaDB Provides the Solution to Ease Multi-Source Replication
Mongodb workshop
Evolution of MongoDB Replicaset and Its Best Practices
Exploring the replication in MongoDB
MongoDB Replication and Sharding
오픈 소스 프로그래밍 - NoSQL with Python
MongoDB for Beginners
Maintenance for MongoDB Replica Sets
Orchestration tool roundup kubernetes vs. docker vs. heat vs. terra form vs...
Uri Cohen & Dan Kilman, GigaSpaces - Orchestration Tool Roundup - OpenStack l...
How do i Meet MongoDB
Ad

More from PoguttuezhiniVP (7)

PDF
Sap basis administrator user guide
PDF
Mysql database basic user guide
PDF
MySQL database replication
PPTX
PostgreSQL – Logical Replication
PPTX
Postgresql Database Administration- Day4
PPTX
Postgresql Database Administration Basic - Day2
PPTX
Postgresql Database Administration Basic - Day1
Sap basis administrator user guide
Mysql database basic user guide
MySQL database replication
PostgreSQL – Logical Replication
Postgresql Database Administration- Day4
Postgresql Database Administration Basic - Day2
Postgresql Database Administration Basic - Day1

Recently uploaded (20)

PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PPT
“AI and Expert System Decision Support & Business Intelligence Systems”
PDF
KodekX | Application Modernization Development
PPTX
Cloud computing and distributed systems.
PPT
Teaching material agriculture food technology
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
PPTX
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
PDF
Machine learning based COVID-19 study performance prediction
PDF
Empathic Computing: Creating Shared Understanding
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PDF
Approach and Philosophy of On baking technology
PDF
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
PDF
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PPTX
sap open course for s4hana steps from ECC to s4
DOCX
The AUB Centre for AI in Media Proposal.docx
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PDF
Review of recent advances in non-invasive hemoglobin estimation
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
“AI and Expert System Decision Support & Business Intelligence Systems”
KodekX | Application Modernization Development
Cloud computing and distributed systems.
Teaching material agriculture food technology
Dropbox Q2 2025 Financial Results & Investor Presentation
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
Machine learning based COVID-19 study performance prediction
Empathic Computing: Creating Shared Understanding
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
Approach and Philosophy of On baking technology
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
Building Integrated photovoltaic BIPV_UPV.pdf
sap open course for s4hana steps from ECC to s4
The AUB Centre for AI in Media Proposal.docx
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
Review of recent advances in non-invasive hemoglobin estimation

Mongodb replication

  • 2. What is replication? • Replication is a way of keeping identical copies of data on multiple servers and it is recommended for all production deployments. • Primary Server: Making standalone as primary server with no: 27017 Secondary Servers: Started two servers with port numbers: 27020 and 27021
  • 6. Best practices for replication Replica sets are MongoDB's mechanism to provide redundancy, high availability, and higher read throughput under the right conditions. Replication in MongoDB is easy to configure and light in operational terms: Always use replica sets: Even if your dataset is at the moment small and you don't expect it to grow exponentially, you never know when that might happen. Also, having a replica set of at least three servers helps design for redundancy, separating work loads between real time and analytics (using the secondaries)and having data redundancy built from day one. Use a replica set to your advantage: A replica set is not just for data replication. We can and should in most cases use the primary server for writes and preference reads from one of the secondaries to offload the primary server. This can be done by setting read preference for reads, together with the correct writeconcern to ensure writes propagate as needed. Use an odd number of replicas in a MongoDB replica set: If a server does downor loses connectivity with the rest of them (network partitioning), the rest have tovote as to which one will be elected as the primary server. If we have an odd number of replica set members, we can guarantee that each subset of servers knows if they belong to the majority or the minority of the replica set members. If we can't have an odd number of replicas, we need to have one extra host set as anarbiter with the sole purpose of voting in the election process. Even a microinstance in EC2 could serve this purpose
  • 7. Step by Step MongoDB Replication setup on Windows Important Notes: Before going to setup MongoDB replication, please take backup all important. 1. Start standalone server as shown below. mongod --dbpath "C:Program FilesMongoDBServer4.0data" --logpath "C:Program FilesMongoDBServer4.0logmongod.log" --port 27017 --storageEngine=wiredTiger --journal --replSet testdb 2. Connect to the server with port number 27017 mongo --port 27017 3. Then, create variable rsconf. rsconf={_id:"testdb",members:[{_id:0,host:"localhost:27017"}]} rs.initiate(rsconf) Note: Here i am configuring replication on single windows machine. If you have three different machines, then localhost with name or IP address and port number. 4. Start secondary server on the port 27020. mongod --dbpath "C:data1db" --logpath "C:data1logmongod.log" --port 27020 --storageEngine=wiredTiger --journal --replSet testdb 5. Logon to secondary server. mongo --port 27020 6. Start secondary server on the port 27021. mongod --dbpath "C:data2db" --logpath "C:data2logmongod.log" --port 27021 --storageEngine=wiredTiger --journal --replSet testdb 7. Logon to secondary server. mongo --port 27021 8. Run the following commands on Primary server. rs.add("localhost:27020") rs.add("localhost:27020") 9. Now go to secondary servers and run below command on both the secondary servers. rs.slaveOk()
  • 8. Replication Setup verification Create a collection primary server and verify this change will reflect on secondary servers or not. 1. Connect to primary server. use ecom 2. Create a collection in Primary db.test.insert({name:"MongoDB"}) 1. Now connect to secondary servers and check the list of the database by running command show dbs 2. Switch to the newly created database. use ecom 3. run the command against ecom database. db.test.find().pretty().
  • 9. MongoDB replication setup : 1. Keep data backup /etc/hosts and /etc/mongod.conf 2. Configure hosts/ 3. Configure firewall 4. Configure MongoDB Replica Set 5. Initiate Replication 6. Test the replication In this mongodb replication setup step by step on linux, following are my ip addresses and their host names respectively. • 192.168.152.135 mongodb1 • 192.168.152.141 mongodb2 • 192.168.152.142 mongodb3 Mongodb replication setup step by step on linux
  • 10. Step1: Keep data backup /etc/hosts and /etc/mongod.conf cp /etc/hosts hosts_before_mod cp /etc/mongod.conf mongod_before_mod Step2: Configure hosts/ - all servers Edit the /etc/hosts file in all replica servers and add below lines. vi /etc/hosts Then add below lines and save hosts file. 192.168.152.135 mongodb1 192.168.152.141 mongodb2 192.168.152.142 mongodb3 OS command : Hostname & ifconfig Step3: Configure firewall in all nodes in replica set Now install ufw(uncomplicated firewall), if not installed by using below command. sudo apt install ufw sudo ufw enable Now enable the port 27017 on all replica nodes. sudo ufw allow 27017 Verify firewall opened for port 27017 or not sudo ufw statsu verbose Step1: Keep data backup /etc/hosts and /etc/mongod.conf Step2: Configure hosts/ - all servers Step3: Configure firewall in all nodes in replica set
  • 11. Step4: Configure MongoDB Replica Set This can be done by modifying the /etc/mongod.conf configuration file. In this step, we add bindIp and replica set name. vi /etc/mongod.conf Then add the ip address of your host to the field bindIp Before modification of bindIp # network interfaces net: port: 27017 bindIp: 127.0.0.1 After modification of bindIp # network interfaces net: port: 27017 bindIp: 192.168.152.135 #127.0.0.1 Now enable replication and add replica set name: Before replica set name added: #replication: After replica set name added: Remove the hash mark and field replSetName. replSetName is case sensitive. replication: replSetName: "testdb" Important note: There should be single space after colon(:) and two spaces before the replSetName repeat the above steps of 4th step in all replica nodes. Step4: Configure MongoDB Replica Set
  • 12. Step5: Initiate Replication on three nodes We have to restart MongoDB servers on all three nodes by using below command: systemctl restart mongod.service systemctl status mongod.service Connect to the mongodb servers. Then execute the command: rs.initiate() Press Enter twice. After this add another nodes to replica set. rs.add("192.168.152.141") rs.add("192.168.152.142") Now verify the replication status: rs.status() Step5: Initiate Replication on three nodes
  • 13. Now create a database on primary server. Then create a collection in that database and test this changes have been updated in the secondary servers or not. Goto Primary node. a) Create a database testdb>:PRIMARY> use test switched to db test testdb>:PRIMARY> db test >:PRIMARY> b) Create a collection in the above database: db.eclhur.insert({"name":"Elchuru"}) Then run the command show dbs c) Now switch to Secondary nodes and type show dbs. If the new database reflected in the seconday nodes then replication setup is successfull otherwise something went wrong and need to troubleshooted. testdb>:SECONDARY> show dbs admin 0.000GB config 0.000GB local 0.000GB test 0.000GB New database replicated in the secondary note. So replication test successful. Test the replication
  • 14. Check the status of Replication -- Run below command from the primary replica node to get complete info of the replica set. rs.conf() rs.status() rs0:PRIMARY> rs.conf() { "_id" : "rs0", "version" : 2, "protocolVersion" : NumberLong(1), "writeConcernMajorityJournalDefault" : true, "members" : [ { "_id" : 0, "host" : "localhost:27017", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 }, { "_id" : 1, "host" : "localhost:27018", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 }, { "_id" : 2, "host" : "localhost:27019", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { Check the status of Replication
  • 15. Check the status of Replication (continued)
          {
            "_id" : 3, "host" : "localhost:27016",
            "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false,
            "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1
          }
        ],
        "settings" : {
          "chainingAllowed" : true,
          "heartbeatIntervalMillis" : 2000,
          "heartbeatTimeoutSecs" : 10,
          "electionTimeoutMillis" : 10000,
          "catchUpTimeoutMillis" : -1,
          "catchUpTakeoverDelayMillis" : 30000,
          "getLastErrorModes" : { },
          "getLastErrorDefaults" : { "w" : 1, "wtimeout" : 0 },
          "replicaSetId" : ObjectId("5ed6180d01a39f2162335de5")
        }
      }
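    The configuration above lists four members, each with "votes" : 1. The sketch below counts voting members in an rs.conf()-style document and derives the majority needed to elect a primary. It is plain JavaScript over a hand-written sample; the helper name votingSummary and the sample documents are assumptions, reduced to the fields the calculation needs.

```javascript
// Summarise the voting members of an rs.conf()-style document.
// votingSummary is a hypothetical helper, not a mongo shell function.
function votingSummary(conf) {
  const voters = conf.members.filter((m) => {
    const votes = m.votes === undefined ? 1 : m.votes; // votes defaults to 1
    return votes > 0;
  }).length;
  const majority = Math.floor(voters / 2) + 1;
  // faultTolerance: how many voters can be lost while a majority remains.
  return { voters: voters, majority: majority, faultTolerance: voters - majority };
}

// Sample documents containing only the fields used here.
const fourMembers = {
  members: [
    { host: "localhost:27017", votes: 1 },
    { host: "localhost:27018", votes: 1 },
    { host: "localhost:27019", votes: 1 },
    { host: "localhost:27016", votes: 1 },
  ],
};
const threeMembers = { members: fourMembers.members.slice(0, 3) };

console.log(votingSummary(fourMembers));  // voters 4, majority 3, faultTolerance 1
console.log(votingSummary(threeMembers)); // voters 3, majority 2, faultTolerance 1
```

    Both sets tolerate the loss of only one member, which is why an odd number of voting members is usually recommended: the fourth voter adds a copy of the data but no extra fault tolerance.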
  • 16. Add new MongoDB instance to a replica set
    Connect to the primary with the MongoDB client and run the command below.
    Syntax:
      rs.add("hostname:port")
    Example:
      rs0:PRIMARY> rs.add("localhost:27016")
      {
        "ok" : 1,
        "$clusterTime" : {
          "clusterTime" : Timestamp(1591094195, 1),
          "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
          }
        },
        "operationTime" : Timestamp(1591094195, 1)
      }
  • 17. Remove existing MongoDB instance from the replica set
    The command below, run on the primary, removes the given secondary host from the replica set.
    Syntax:
      rs.remove("hostname:port")
    Example:
      rs0:PRIMARY> rs.remove("localhost:27016")
      {
        "ok" : 1,
        "$clusterTime" : {
          "clusterTime" : Timestamp(1591095681, 1),
          "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
          }
        },
        "operationTime" : Timestamp(1591095681, 1)
      }
      rs0:PRIMARY>
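  • 18. Step down the primary to a secondary
    MongoDB provides a command that instructs the primary to become a secondary, which triggers the election of a new primary.
    Syntax:
      rs.stepDown(stepDownSecs, secondaryCatchUpPeriodSecs)
    stepDownSecs is the number of seconds during which the member may not become primary again; secondaryCatchUpPeriodSecs is the number of seconds to wait for an electable secondary to catch up before stepping down.
    Example:
      rs0:PRIMARY> rs.stepDown(12)
      {
        "ok" : 1,
        "$clusterTime" : {
          "clusterTime" : Timestamp(1591096055, 1),
          "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
          }
        },
        "operationTime" : Timestamp(1591096055, 1)
      }
      rs0:SECONDARY>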
  • 19. Check the replication lag between primary and secondaries
    The command below, run on the primary, reports how far each secondary is behind it. Newer shells name this helper rs.printSecondaryReplicationInfo().
    Syntax:
      rs.printSlaveReplicationInfo()
    Example:
      rs0:PRIMARY> rs.printSlaveReplicationInfo()
      source: localhost:27018
        syncedTo: Tue Jun 02 2020 16:14:04 GMT+0530 (India Standard Time)
        0 secs (0 hrs) behind the primary
      source: localhost:27019
        syncedTo: Thu Jan 01 1970 05:30:00 GMT+0530 (India Standard Time)
        1591094644 secs (441970.73 hrs) behind the primary
      source: localhost:27016
        syncedTo: Tue Jun 02 2020 16:14:04 GMT+0530 (India Standard Time)
        0 secs (0 hrs) behind the primary
      rs0:PRIMARY>
    The member localhost:27019 reports a syncedTo of the Unix epoch, so its "lag" is simply the primary's optime counted from 1970. This typically means the member has not reported any optime (for example, because it is unreachable) rather than a real 441970-hour delay.
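    The same lag figure can be derived from the optimeDate field that rs.status() reports for each member. The sketch below does the subtraction in plain JavaScript over a hand-written status document; the helper name replicationLagSeconds and the second secondary's timestamp are assumptions for illustration.

```javascript
// Compute per-secondary lag (in seconds) from an rs.status()-style document.
// replicationLagSeconds is a hypothetical helper, not a mongo shell function.
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) {
    throw new Error("no primary in status document");
  }
  return status.members
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => ({
      host: m.name,
      lagSeconds: (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000,
    }));
}

// Sample document with only the fields used here; timestamps are illustrative.
const sampleStatus = {
  members: [
    { name: "localhost:27017", stateStr: "PRIMARY", optimeDate: new Date("2020-06-02T10:44:04Z") },
    { name: "localhost:27018", stateStr: "SECONDARY", optimeDate: new Date("2020-06-02T10:44:04Z") },
    { name: "localhost:27016", stateStr: "SECONDARY", optimeDate: new Date("2020-06-02T10:43:59Z") },
  ],
};

const lags = replicationLagSeconds(sampleStatus);
console.log(lags); // localhost:27018 lags 0 s, localhost:27016 lags 5 s
```

    In the shell, the argument would be the result of rs.status(); members that are not in the SECONDARY state are skipped here, which avoids the misleading epoch-based figure shown above.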
  • 20. Monitoring: Replication
    • Replication lag. The delay in copying data from the primary node to a secondary node.
    • Replica state. Tracks whether secondary nodes have gone down and whether a new primary has been elected.
    • Locking state. Shows which data locks are set and how long they have been held.
    • Disk utilization. How heavily the disks are being accessed.
    • Memory usage. How much memory is in use and what it is being used for.
    • Number of connections. How many connections the database currently has open to serve client requests.
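    The replica state check can be sketched as a small filter over the members array returned by rs.status(), which carries a health flag and a stateStr for each member. The helper name unhealthyMembers and the sample document are assumptions for illustration; a real monitoring tool would collect far more than this.

```javascript
// List members that are unhealthy or in a state other than
// PRIMARY, SECONDARY or ARBITER.
// unhealthyMembers is a hypothetical helper, not a mongo shell function.
function unhealthyMembers(status) {
  const okStates = new Set(["PRIMARY", "SECONDARY", "ARBITER"]);
  return status.members
    .filter((m) => m.health !== 1 || !okStates.has(m.stateStr))
    .map((m) => ({ host: m.name, state: m.stateStr }));
}

// Sample document with only the fields used here.
const statusSample = {
  members: [
    { name: "localhost:27017", health: 1, stateStr: "PRIMARY" },
    { name: "localhost:27018", health: 1, stateStr: "SECONDARY" },
    { name: "localhost:27019", health: 0, stateStr: "(not reachable/healthy)" },
  ],
};

const problems = unhealthyMembers(statusSample);
console.log(problems);
```

    An empty result would indicate that every member is reachable and in a normal serving state.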
  • 22. MongoDB Sharding
    • Sharding is an architecture for storing large data sets across distributed servers.
    • In MongoDB, sharding is mostly used when storage requirements grow rapidly. Large applications depend on end-to-end transactional data that grows every day, so the space they need keeps increasing.
    • As the stored information grows, a single machine can no longer provide the required capacity, and the data has to be divided into chunks that live on different servers.
    • Sharding gives MongoDB a horizontally scalable architecture in which data is divided across servers.
    • With sharding, additional servers can be attached to the existing database to absorb growth, and the data load is balanced automatically across the connected servers.
    • Each shard is a separate database instance; together the shards form one logical database. As more shards are added to the cluster, each shard is responsible for a smaller share of the data.
    • For example, suppose we have to store 1GB of information in MongoDB. With four shards, each holds about 256MB; with two shards, each holds about 512MB.
  • 23. MongoDB Sharding Key
    When implementing sharding in MongoDB, we have to define a key that determines how the documents of the sharded collection are distributed. For example, suppose we have a collection of student information for a class of 14 students, and two shard instances. The collection is then divided between the shards, with 7 documents on each. The common key that decides which shard a document belongs to is known as the shard key. It may be a single field, a compound of several fields, or a hash of a field.
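    The 7/7 split described above can be illustrated with a small range-partitioning sketch in plain JavaScript. It is a simplification: MongoDB itself divides a collection into chunks and balances them between shards, whereas this code just cuts the sorted key space into equal contiguous ranges. The field name rollNo and the helper name splitByShardKey are assumptions.

```javascript
// Divide documents into contiguous shard-key ranges, one range per shard.
// splitByShardKey is a hypothetical helper that only mimics range sharding.
function splitByShardKey(docs, keyField, shardCount) {
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...docs].sort((a, b) => a[keyField] - b[keyField]);
  const perShard = Math.ceil(sorted.length / shardCount);
  const shards = [];
  for (let i = 0; i < shardCount; i++) {
    shards.push(sorted.slice(i * perShard, (i + 1) * perShard));
  }
  return shards;
}

// 14 students identified by a numeric shard key (hypothetical field rollNo).
const students = Array.from({ length: 14 }, (_, i) => ({ rollNo: i + 1 }));
const shards = splitByShardKey(students, "rollNo", 2);

console.log(shards.map((s) => s.length)); // 7 documents on each of the two shards
```

    With a numeric key like this, the first shard ends up with roll numbers 1 to 7 and the second with 8 to 14; a hashed shard key would scatter neighbouring values across shards instead.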
  • 24. What is Sharding in MongoDB?
    Sharding is a MongoDB concept that splits large data sets into smaller ones across multiple MongoDB instances. The data in a single MongoDB server can become so large that queries against it cause heavy CPU utilization. To handle this, a large collection is split into several pieces, called shards, that live on different instances. Logically, all the shards work as one collection.
  • 25. How to Implement Sharding
    Sharding is implemented with a cluster, which is a group of MongoDB instances. The components of a sharded cluster are:
    1. A shard: a MongoDB instance that holds a subset of the data. In production environments, every shard should be a replica set.
    2. A config server: a MongoDB instance that holds metadata about the cluster, essentially which instances hold which part of the sharded data.
    3. A router: a mongos instance that redirects the commands sent by the client to the right servers.
    Step by Step Sharding Cluster Example
    Step 1) Create a separate data directory for the config server.
      mkdir /data/configdb
    Step 2) Start a MongoDB instance in configuration mode. Suppose Server D is to be our configuration server; run the command below on it.
      mongod --configsvr --dbpath /data/configdb --port 27019
    On MongoDB 3.4 and later the config server must itself run as a replica set, so --replSet <name> is also required and mongos must be given <name>/ServerD:27019.
    Step 3) Start the mongos instance, specifying the configuration server.
      mongos --configdb ServerD:27019
    Step 4) From the mongo shell, connect to the mongos instance.
      mongo --host ServerD --port 27017
    Step 5) If Server A and Server B need to be added to the cluster, issue the commands below.
      sh.addShard("ServerA:27017")
      sh.addShard("ServerB:27017")
    Step 6) Enable sharding for the database. To shard the Employeedb database, issue the command below.
      sh.enableSharding("Employeedb")
    Step 7) Enable sharding for the collection. To shard the Employee collection, issue the command below; the first argument has the form "<database>.<collection>".
      sh.shardCollection("Employeedb.Employee", { "Employeeid" : 1, "EmployeeName" : 1 })
    Summary:
    • Sharding is a MongoDB concept that splits large data sets into smaller data sets across multiple MongoDB instances.
    • https://www.guru99.com/mongodb-vs-mysql.html
  • 27. Why Sharding?
    • In replication, all writes go to the primary node
    • Latency-sensitive queries still go to the primary
    • A single replica set is limited in size (12 members in older releases; 50 members, of which at most 7 vote, since MongoDB 3.0)
    • Memory cannot be made large enough when the active data set is big
    • The local disk is not big enough
    • Vertical scaling is too expensive
  • 28. Sharding in MongoDB The following diagram shows the Sharding in MongoDB using sharded cluster.
  • 30. Sharding in MongoDB
    The diagram shows three main components:
    • Shards: Shards store the data. They provide high availability and data consistency. In a production environment, each shard is a separate replica set.
    • Config servers: Config servers store the cluster's metadata, which maps the cluster's data set to the shards. The query router uses this metadata to target operations to specific shards. Older releases required exactly 3 config servers in production; current releases deploy the config servers as a replica set.
    • Query routers: Query routers are mongos instances that interface with client applications and direct operations to the appropriate shard. The query router processes an operation, targets it to the relevant shards and returns the results to the client. A client sends its requests to one query router, but a sharded cluster generally runs several query routers to divide the client request load.
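    How a query router might use the config servers' metadata can be sketched as a lookup in a table of key ranges. This is a simplified model in plain JavaScript, not the mongos implementation: the chunk boundaries, the shard names and the helper names targetShard and targetShardsForRange are all assumptions.

```javascript
// Chunk metadata of the kind a config server holds: each chunk covers the
// half-open shard-key range [min, max) and lives on one shard.
// Boundaries and shard names are made up for this sketch.
const chunks = [
  { min: -Infinity, max: 1000, shard: "ServerA" },
  { min: 1000, max: Infinity, shard: "ServerB" },
];

// Return the shard that owns a single shard-key value.
function targetShard(chunkTable, keyValue) {
  const chunk = chunkTable.find((c) => keyValue >= c.min && keyValue < c.max);
  if (!chunk) {
    throw new Error("no chunk covers key " + keyValue);
  }
  return chunk.shard;
}

// Return every shard that a query on the inclusive key range [low, high] must visit.
function targetShardsForRange(chunkTable, low, high) {
  const hit = chunkTable.filter((c) => c.min <= high && c.max > low);
  return [...new Set(hit.map((c) => c.shard))];
}

console.log(targetShard(chunks, 42));                 // ServerA
console.log(targetShard(chunks, 1000));               // ServerB
console.log(targetShardsForRange(chunks, 900, 1100)); // both shards
```

    A query that includes the shard key can be sent to a single shard, while a range that crosses a chunk boundary has to be sent to every shard whose chunks overlap it.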
  • 31. Step by Step Sharding Cluster Example
    Step 1) Create a separate data directory for the config server.
      mkdir /data/configdb
    Step 2) Start a MongoDB instance in configuration mode on Server D.
      mongod --configsvr --dbpath /data/configdb --port 27019
    On MongoDB 3.4 and later the config server must itself run as a replica set, so --replSet <name> is also required and mongos must be given <name>/ServerD:27019.
    Step 3) Start the mongos instance, specifying the configuration server.
      mongos --configdb ServerD:27019
    Step 4) From the mongo shell, connect to the mongos instance.
      mongo --host ServerD --port 27017
    Step 5) If Server A and Server B need to be added to the cluster, issue the commands below.
      sh.addShard("ServerA:27017")
      sh.addShard("ServerB:27017")
    Step 6) Enable sharding for the database. To shard the Employeedb database, issue the command below.
      sh.enableSharding("Employeedb")
    Step 7) Enable sharding for the collection. To shard the Employee collection, issue the command below; the first argument has the form "<database>.<collection>".
      sh.shardCollection("Employeedb.Employee", { "Employeeid" : 1, "EmployeeName" : 1 })