MongoDB Documentation 
Release 2.6.4 
MongoDB Documentation Project 
September 16, 2014
Contents 
1 Introduction to MongoDB 3 
1.1 What is MongoDB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 
2 Install MongoDB 5 
2.1 Installation Guides . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 
2.2 First Steps with MongoDB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 
3 MongoDB CRUD Operations 51 
3.1 MongoDB CRUD Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 
3.2 MongoDB CRUD Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 
3.3 MongoDB CRUD Tutorials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 
3.4 MongoDB CRUD Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117 
4 Data Models 131 
4.1 Data Modeling Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 
4.2 Data Modeling Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133 
4.3 Data Model Examples and Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140 
4.4 Data Model Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158 
5 Administration 171 
5.1 Administration Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 
5.2 Administration Tutorials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205 
5.3 Administration Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266 
6 Security 279 
6.1 Security Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279 
6.2 Security Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281 
6.3 Security Tutorials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294 
6.4 Security Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360 
7 Aggregation 387 
7.1 Aggregation Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387 
7.2 Aggregation Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391 
7.3 Aggregation Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403 
7.4 Aggregation Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419 
8 Indexes 431 
8.1 Index Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431 
8.2 Index Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436 
8.3 Indexing Tutorials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464 
8.4 Indexing Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500 
9 Replication 503 
9.1 Replication Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503 
9.2 Replication Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 507 
9.3 Replica Set Tutorials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543 
9.4 Replication Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 593 
10 Sharding 607 
10.1 Sharding Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607 
10.2 Sharding Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613 
10.3 Sharded Cluster Tutorials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 634 
10.4 Sharding Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 678 
11 Frequently Asked Questions 687 
11.1 FAQ: MongoDB Fundamentals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 687 
11.2 FAQ: MongoDB for Application Developers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 690 
11.3 FAQ: The mongo Shell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 700 
11.4 FAQ: Concurrency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 702 
11.5 FAQ: Sharding with MongoDB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 706 
11.6 FAQ: Replication and Replica Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 711 
11.7 FAQ: MongoDB Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 714 
11.8 FAQ: Indexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 718 
11.9 FAQ: MongoDB Diagnostics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 720 
12 Release Notes 725 
12.1 Current Stable Release . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 725 
12.2 Previous Stable Releases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 762 
12.3 Other MongoDB Release Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 808 
12.4 MongoDB Version Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 808 
13 About MongoDB Documentation 811 
13.1 License . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 811 
13.2 Editions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 811 
13.3 Version and Revisions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 812 
13.4 Report an Issue or Make a Change Request . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 812 
13.5 Contribute to the Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 812 
Index 829 
MongoDB Documentation, Release 2.6.4 
See About MongoDB Documentation (page 811) for more information about the MongoDB Documentation project, 
this Manual and additional editions of this text. 
Note: This version of the PDF does not include the reference section, see MongoDB Reference Manual1 for a PDF 
edition of all MongoDB Reference Material. 
1. http://docs.mongodb.org/master/MongoDB-reference-manual.pdf
CHAPTER 1 
Introduction to MongoDB 
Welcome to MongoDB. This document provides a brief introduction to MongoDB and some key concepts. See the 
installation guides (page 5) for information on downloading and installing MongoDB. 
1.1 What is MongoDB 
MongoDB is an open-source document database that provides high performance, high availability, and automatic 
scaling. 
1.1.1 Document Database 
A record in MongoDB is a document, which is a data structure composed of field and value pairs. MongoDB documents 
are similar to JSON objects. The values of fields may include other documents, arrays, and arrays of documents. 
Figure 1.1: A MongoDB document. 
The advantages of using documents are: 
• Documents (i.e. objects) correspond to native data types in many programming languages. 
• Embedded documents and arrays reduce the need for expensive joins. 
• Dynamic schema supports fluent polymorphism. 
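The bullets above can be made concrete with a small sketch in the mongo shell's language, JavaScript; every field name and value below is a hypothetical example, not taken from this manual:

```javascript
// A MongoDB document sketched as a JavaScript object: field/value pairs whose
// values may themselves be documents, arrays, or arrays of documents.
var person = {
  name: { first: "Alan", last: "Turing" },      // embedded document
  contribs: ["Turing machine", "Turing test"],  // array
  awards: [                                     // array of documents
    { award: "Smith's Prize", year: 1936 }
  ]
};

// Dot notation reaches into embedded documents and arrays, as in queries.
console.log(person.name.last);        // prints "Turing"
console.log(person.contribs.length);  // prints 2
```

Because the structure is already a native object, no join is needed to assemble the person's awards with the person itself.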
1.1.2 Key Features 
High Performance 
MongoDB provides high-performance data persistence. In particular, 
• Support for embedded data models reduces I/O activity on the database system. 
• Indexes support faster queries and can include keys from embedded documents and arrays. 
High Availability 
To provide high availability, MongoDB’s replication facility, called replica sets, provides: 
• automatic failover. 
• data redundancy. 
A replica set (page 503) is a group of MongoDB servers that maintain the same data set, providing redundancy and 
increasing data availability. 
Automatic Scaling 
MongoDB provides horizontal scalability as part of its core functionality. 
• Automatic sharding (page 607) distributes data across a cluster of machines. 
• Replica sets can provide eventually-consistent reads for low-latency, high-throughput deployments. 
CHAPTER 2 
Install MongoDB 
MongoDB runs on most platforms and supports both 32-bit and 64-bit architectures. 
2.1 Installation Guides 
See the Release Notes (page 725) for information about specific releases of MongoDB. 
Install on Linux (page 6) Documentation for installing the official MongoDB distribution on Linux-based systems. 
Install on Red Hat (page 6) Install MongoDB on Red Hat Enterprise, CentOS, Fedora, and related Linux systems using .rpm packages. 
Install on Ubuntu (page 9) Install MongoDB on Ubuntu Linux systems using .deb packages. 
Install on Debian (page 12) Install MongoDB on Debian systems using .deb packages. 
Install on Other Linux Systems (page 14) Install the official build of MongoDB on other Linux systems from 
MongoDB archives. 
Install on OS X (page 16) Install the official build of MongoDB on OS X systems from Homebrew packages or from 
MongoDB archives. 
Install on Windows (page 19) Install MongoDB on Windows systems and optionally start MongoDB as a Windows 
service. 
Install MongoDB Enterprise (page 24) MongoDB Enterprise is available for MongoDB Enterprise subscribers and 
includes several additional features including support for SNMP monitoring, LDAP authentication, Kerberos 
authentication, and System Event Auditing. 
Install MongoDB Enterprise on Red Hat (page 24) Install the MongoDB Enterprise build and required dependencies on Red Hat Enterprise or CentOS systems using packages. 
Install MongoDB Enterprise on Ubuntu (page 27) Install the MongoDB Enterprise build and required dependencies on Ubuntu Linux systems using packages. 
Install MongoDB Enterprise on Debian (page 30) Install the MongoDB Enterprise build and required dependencies on Debian Linux systems using packages. 
Install MongoDB Enterprise on SUSE (page 32) Install the MongoDB Enterprise build and required dependencies on SUSE Enterprise Linux. 
Install MongoDB Enterprise on Amazon AMI (page 34) Install the MongoDB Enterprise build and required dependencies on Amazon Linux AMI. 
Install MongoDB Enterprise on Windows (page 36) Install the MongoDB Enterprise build and required dependencies using the .msi installer. 
2.1.1 Install on Linux 
These documents provide instructions to install MongoDB for various Linux systems. 
Recommended 
For easy installation, MongoDB provides packages for popular Linux distributions. The following guides detail the 
installation process for these systems: 
Install on Red Hat (page 6) Install MongoDB on Red Hat Enterprise, CentOS, Fedora and related Linux systems 
using .rpm packages. 
Install on Ubuntu (page 9) Install MongoDB on Ubuntu Linux systems using .deb packages. 
Install on Debian (page 12) Install MongoDB on Debian systems using .deb packages. 
For systems without supported packages, refer to the Manual Installation tutorial. 
Manual Installation 
Although packages are the preferred installation method, for Linux systems without supported packages, see the 
following guide: 
Install on Other Linux Systems (page 14) Install the official build of MongoDB on other Linux systems from MongoDB archives. 
Install MongoDB on Red Hat Enterprise, CentOS, Fedora, or Amazon Linux 
Overview Use this tutorial to install MongoDB on Red Hat Enterprise Linux, CentOS Linux, Fedora Linux, or a 
related system from .rpm packages. While some of these distributions include their own MongoDB packages, the 
official MongoDB packages are generally more up to date. 
Packages MongoDB provides packages of the officially supported MongoDB builds in its own repository. This 
repository provides the MongoDB distribution in the following packages: 
• mongodb-org 
This package is a metapackage that will automatically install the four component packages listed below. 
• mongodb-org-server 
This package contains the mongod daemon and associated configuration and init scripts. 
• mongodb-org-mongos 
This package contains the mongos daemon. 
• mongodb-org-shell 
This package contains the mongo shell. 
• mongodb-org-tools 
This package contains the following MongoDB tools: bsondump, mongodump, mongoexport, 
mongofiles, mongoimport, mongooplog, mongoperf, mongorestore, mongostat, and 
mongotop. 
Control Scripts The mongodb-org package includes various control scripts, including the init script 
/etc/rc.d/init.d/mongod. These scripts are used to stop, start, and restart daemon processes. 
The package configures MongoDB using the /etc/mongod.conf file in conjunction with the control scripts. See 
http://docs.mongodb.org/manual/reference/configuration-options for documentation of the 
configuration file. 
As of version 2.6.4, there are no control scripts for mongos. The mongos process is used only in sharding (page 613). 
You can use the mongod init script to derive your own mongos control script for use in such environments. See the 
mongos reference for configuration details. 
Warning: With the introduction of systemd in Fedora 15, the control scripts included in the packages available 
in the MongoDB downloads repository are not compatible with Fedora systems. A correction is forthcoming; see 
SERVER-7285a for more information. In the meantime, use your own control scripts or install using the procedure 
outlined in Install MongoDB on Linux Systems (page 14). 
a. https://jira.mongodb.org/browse/SERVER-7285 
Considerations For production deployments, always run MongoDB on 64-bit systems. 
The default /etc/mongod.conf configuration file supplied by the 2.6 series packages has bind_ip set to 
127.0.0.1 by default. Modify this setting as needed for your environment before initializing a replica set. 
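For reference, a minimal 2.6-era configuration file in the older INI style might look like the following sketch; bind_ip and the standard paths reflect the surrounding text, while the remaining lines are assumed defaults you should adapt to your deployment:

```ini
# Hypothetical minimal /etc/mongod.conf for the 2.6 series (INI-style options)
bind_ip = 127.0.0.1                      # listen on localhost only
port = 27017
dbpath = /var/lib/mongo
logpath = /var/log/mongodb/mongod.log
logappend = true
fork = true
```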
Changed in version 2.6: The package structure and names have changed as of version 2.6. For instructions on installation 
of an older release, please refer to the documentation for the appropriate version. 
Install MongoDB 
Step 1: Configure the package management system (YUM). Create a /etc/yum.repos.d/mongodb.repo 
file to hold the following configuration information for the MongoDB repository: 
If you are running a 64-bit system, use the following configuration: 
[mongodb] 
name=MongoDB Repository 
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/ 
gpgcheck=0 
enabled=1 
If you are running a 32-bit system, which is not recommended for production deployments, use the following configuration: 
[mongodb] 
name=MongoDB Repository 
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/i686/ 
gpgcheck=0 
enabled=1 
Step 2: Install the MongoDB packages and associated tools. When you install the packages, you choose whether 
to install the current release or a previous one. This step provides the commands for both. 
To install the latest stable version of MongoDB, issue the following command: 
sudo yum install -y mongodb-org 
To install a specific release of MongoDB, specify each component package individually and append the version number 
to the package name, as in the following example that installs the 2.6.1 release of MongoDB: 
sudo yum install -y mongodb-org-2.6.1 mongodb-org-server-2.6.1 mongodb-org-shell-2.6.1 mongodb-org-mongos-2.6.1 mongodb-org-tools-2.6.1 
You can specify any available version of MongoDB. However, yum will upgrade the packages when a newer version 
becomes available. To prevent unintended upgrades, pin the package. To pin a package, add the following exclude 
directive to your /etc/yum.conf file: 
exclude=mongodb-org,mongodb-org-server,mongodb-org-shell,mongodb-org-mongos,mongodb-org-tools 
Previous versions of MongoDB packages use different naming conventions. See the 2.4 version of documentation for 
more information1. 
Run MongoDB 
Important: You must configure SELinux to allow MongoDB to start on Red Hat Linux-based systems (Red Hat 
Enterprise Linux, CentOS, Fedora). Administrators have three options: 
• enable access to the relevant ports (e.g. 27017) for SELinux. See Default MongoDB Port (page 380) for more 
information on MongoDB’s default ports. For default settings, this can be accomplished by running 
semanage port -a -t mongodb_port_t -p tcp 27017 
• set SELinux to permissive mode in /etc/selinux/config. The line 
SELINUX=enforcing 
should be changed to 
SELINUX=permissive 
• disable SELinux entirely; as above but set 
SELINUX=disabled 
All three options require root privileges. The latter two options each require a system reboot and may have larger 
implications for your deployment. 
You may alternatively choose not to install the SELinux packages when you are installing your Linux operating system, 
or choose to remove the relevant packages. This option is the most invasive and is not recommended. 
The MongoDB instance stores its data files in /var/lib/mongo and its log files in /var/log/mongodb 
by default, and runs using the mongod user account. You can specify alternate log and data file directories in 
/etc/mongod.conf. See systemLog.path and storage.dbPath for additional information. 
If you change the user that runs the MongoDB process, you must modify the access control rights to the 
/var/lib/mongo and /var/log/mongodb directories to give that user access to these directories. 
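As a hedged sketch of that ownership change, the following uses throwaway paths under /tmp so it can be tried without root; on a real system you would run chown as root against /var/lib/mongo and /var/log/mongodb, and the service account name appuser below is purely illustrative:

```shell
# Stand-in directories mirroring the package's data and log locations.
mkdir -p /tmp/mongo-demo/var/lib/mongo /tmp/mongo-demo/var/log/mongodb

# On a real system (as root), with a hypothetical account "appuser":
#   chown -R appuser:appuser /var/lib/mongo /var/log/mongodb
# Here we chown to the current user so the sketch is safe to run anywhere.
chown -R "$(id -un)":"$(id -gn)" /tmp/mongo-demo/var/lib/mongo /tmp/mongo-demo/var/log/mongodb

ls -ld /tmp/mongo-demo/var/lib/mongo && echo "ownership updated"
```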
Step 1: Start MongoDB. You can start the mongod process by issuing the following command: 
sudo service mongod start 
Step 2: Verify that MongoDB has started successfully You can verify that the mongod process has started successfully 
by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading 
1. http://docs.mongodb.org/v2.4/tutorial/install-mongodb-on-linux 
[initandlisten] waiting for connections on port <port> 
where <port> is the port configured in /etc/mongod.conf, 27017 by default. 
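That check can be scripted; this sketch writes a sample of the startup line to a temporary file and greps it, exactly as you would grep the real /var/log/mongodb/mongod.log on a live system:

```shell
# Simulate the startup message mongod writes once it is ready.
printf '[initandlisten] waiting for connections on port 27017\n' > /tmp/mongod-sample.log

# The same grep, pointed at /var/log/mongodb/mongod.log, verifies a real instance.
if grep -q 'waiting for connections on port' /tmp/mongod-sample.log; then
  echo "mongod is accepting connections"
fi
```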
You can optionally ensure that MongoDB will start following a system reboot by issuing the following command: 
sudo chkconfig mongod on 
Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command: 
sudo service mongod stop 
Step 4: Restart MongoDB. You can restart the mongod process by issuing the following command: 
sudo service mongod restart 
You can follow the state of the process for errors or important messages by watching the output in the 
/var/log/mongodb/mongod.log file. 
Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also 
consider the Production Notes (page 188) document before deploying MongoDB in a production environment. 
Install MongoDB on Ubuntu 
Overview Use this tutorial to install MongoDB on Ubuntu Linux systems from .deb packages. While Ubuntu 
includes its own MongoDB packages, the official MongoDB packages are generally more up-to-date. 
Note: If you use an older Ubuntu that does not use Upstart (i.e. any version before 9.10 “Karmic”), please follow the 
instructions on the Install MongoDB on Debian (page 12) tutorial. 
Packages MongoDB provides packages of the officially supported MongoDB builds in its own repository. This 
repository provides the MongoDB distribution in the following packages: 
• mongodb-org 
This package is a metapackage that will automatically install the four component packages listed below. 
• mongodb-org-server 
This package contains the mongod daemon and associated configuration and init scripts. 
• mongodb-org-mongos 
This package contains the mongos daemon. 
• mongodb-org-shell 
This package contains the mongo shell. 
• mongodb-org-tools 
This package contains the following MongoDB tools: bsondump, mongodump, mongoexport, 
mongofiles, mongoimport, mongooplog, mongoperf, mongorestore, mongostat, and 
mongotop. 
Control Scripts The mongodb-org package includes various control scripts, including the init script 
/etc/init.d/mongod. These scripts are used to stop, start, and restart daemon processes. 
The package configures MongoDB using the /etc/mongod.conf file in conjunction with the control scripts. See 
http://docs.mongodb.org/manual/reference/configuration-options for documentation of the 
configuration file. 
As of version 2.6.4, there are no control scripts for mongos. The mongos process is used only in sharding (page 613). 
You can use the mongod init script to derive your own mongos control script for use in such environments. See the 
mongos reference for configuration details. 
Considerations For production deployments, always run MongoDB on 64-bit systems. 
You cannot install this package concurrently with the mongodb, mongodb-server, or mongodb-clients packages 
provided by Ubuntu. 
The default /etc/mongod.conf configuration file supplied by the 2.6 series packages has bind_ip set to 
127.0.0.1 by default. Modify this setting as needed for your environment before initializing a replica set. 
Changed in version 2.6: The package structure and names have changed as of version 2.6. For instructions on installation 
of an older release, please refer to the documentation for the appropriate version. 
Install MongoDB 
Step 1: Import the public key used by the package management system. The Ubuntu package management tools 
(i.e. dpkg and apt) ensure package consistency and authenticity by requiring that distributors sign packages with 
GPG keys. Issue the following command to import the MongoDB public GPG Key2: 
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10 
Step 2: Create a list file for MongoDB. Create the /etc/apt/sources.list.d/mongodb.list list file 
using the following command: 
echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list 
Step 3: Reload local package database. Issue the following command to reload the local package database: 
sudo apt-get update 
Step 4: Install the MongoDB packages. You can install either the latest stable version of MongoDB or a specific 
version of MongoDB. 
Install the latest stable version of MongoDB. Issue the following command: 
sudo apt-get install -y mongodb-org 
Install a specific release of MongoDB. Specify each component package individually and append the version number 
to the package name, as in the following example that installs the 2.6.1 release of MongoDB: 
2. http://docs.mongodb.org/10gen-gpg-key.asc 
sudo apt-get install -y mongodb-org=2.6.1 mongodb-org-server=2.6.1 mongodb-org-shell=2.6.1 mongodb-org-mongos=2.6.1 mongodb-org-tools=2.6.1 
Pin a specific version of MongoDB. Although you can specify any available version of MongoDB, apt-get will 
upgrade the packages when a newer version becomes available. To prevent unintended upgrades, pin the package. To 
pin the version of MongoDB at the currently installed version, issue the following command sequence: 
echo "mongodb-org hold" | sudo dpkg --set-selections 
echo "mongodb-org-server hold" | sudo dpkg --set-selections 
echo "mongodb-org-shell hold" | sudo dpkg --set-selections 
echo "mongodb-org-mongos hold" | sudo dpkg --set-selections 
echo "mongodb-org-tools hold" | sudo dpkg --set-selections 
Previous versions of MongoDB packages use different naming conventions. See the 2.4 version of documentation for 
more information3. 
Run MongoDB The MongoDB instance stores its data files in /var/lib/mongodb and its log files in 
/var/log/mongodb by default, and runs using the mongodb user account. You can specify alternate log and 
data file directories in /etc/mongod.conf. See systemLog.path and storage.dbPath for additional 
information. 
If you change the user that runs the MongoDB process, you must modify the access control rights to the 
/var/lib/mongodb and /var/log/mongodb directories to give that user access to these directories. 
Step 1: Start MongoDB. Issue the following command to start mongod: 
sudo service mongod start 
Step 2: Verify that MongoDB has started successfully Verify that the mongod process has started successfully 
by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading 
[initandlisten] waiting for connections on port <port> 
where <port> is the port configured in /etc/mongod.conf, 27017 by default. 
Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command: 
sudo service mongod stop 
Step 4: Restart MongoDB. Issue the following command to restart mongod: 
sudo service mongod restart 
Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also 
consider the Production Notes (page 188) document before deploying MongoDB in a production environment. 
3. http://docs.mongodb.org/v2.4/tutorial/install-mongodb-on-ubuntu 
Install MongoDB on Debian 
Overview Use this tutorial to install MongoDB on Debian systems from .deb packages. While some Debian 
distributions include their own MongoDB packages, the official MongoDB packages are generally more up to date. 
Note: This tutorial applies to both Debian systems and versions of Ubuntu Linux prior to 9.10 “Karmic” which do 
not use Upstart. Other Ubuntu users will want to follow the Install MongoDB on Ubuntu (page 9) tutorial. 
Packages MongoDB provides packages of the officially supported MongoDB builds in its own repository. This 
repository provides the MongoDB distribution in the following packages: 
• mongodb-org 
This package is a metapackage that will automatically install the four component packages listed below. 
• mongodb-org-server 
This package contains the mongod daemon and associated configuration and init scripts. 
• mongodb-org-mongos 
This package contains the mongos daemon. 
• mongodb-org-shell 
This package contains the mongo shell. 
• mongodb-org-tools 
This package contains the following MongoDB tools: bsondump, mongodump, mongoexport, 
mongofiles, mongoimport, mongooplog, mongoperf, mongorestore, mongostat, and 
mongotop. 
Control Scripts The mongodb-org package includes various control scripts, including the init script 
/etc/init.d/mongod. These scripts are used to stop, start, and restart daemon processes. 
The package configures MongoDB using the /etc/mongod.conf file in conjunction with the control scripts. See 
http://docs.mongodb.org/manual/reference/configuration-options for documentation of the 
configuration file. 
As of version 2.6.4, there are no control scripts for mongos. The mongos process is used only in sharding (page 613). 
You can use the mongod init script to derive your own mongos control script for use in such environments. See the 
mongos reference for configuration details. 
Considerations For production deployments, always run MongoDB on 64-bit systems. 
You cannot install this package concurrently with the mongodb, mongodb-server, or mongodb-clients packages 
that your release of Debian may include. 
The default /etc/mongod.conf configuration file supplied by the 2.6 series packages has bind_ip set to 
127.0.0.1 by default. Modify this setting as needed for your environment before initializing a replica set. 
Changed in version 2.6: The package structure and names have changed as of version 2.6. For instructions on installation 
of an older release, please refer to the documentation for the appropriate version. 
Install MongoDB The Debian package management tools (i.e. dpkg and apt) ensure package consistency and 
authenticity by requiring that distributors sign packages with GPG keys. 
Step 1: Import the public key used by the package management system. Issue the following command to add 
the MongoDB public GPG Key4 to the system key ring. 
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv 7F0CEB10 
Step 2: Create a /etc/apt/sources.list.d/mongodb.list file for MongoDB. Create the list file using 
the following command: 
echo 'deb http://downloads-distro.mongodb.org/repo/debian-sysvinit dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list 
Step 3: Reload local package database. Issue the following command to reload the local package database: 
sudo apt-get update 
Step 4: Install the MongoDB packages. You can install either the latest stable version of MongoDB or a specific 
version of MongoDB. 
Install the latest stable version of MongoDB. Issue the following command: 
sudo apt-get install -y mongodb-org 
Install a specific release of MongoDB. Specify each component package individually and append the version number 
to the package name, as in the following example that installs the 2.6.1 release of MongoDB: 
sudo apt-get install -y mongodb-org=2.6.1 mongodb-org-server=2.6.1 mongodb-org-shell=2.6.1 mongodb-org-mongos=2.6.1 mongodb-org-tools=2.6.1 
Pin a specific version of MongoDB. Although you can specify any available version of MongoDB, apt-get will 
upgrade the packages when a newer version becomes available. To prevent unintended upgrades, pin the package. To 
pin the version of MongoDB at the currently installed version, issue the following command sequence: 
echo "mongodb-org hold" | sudo dpkg --set-selections 
echo "mongodb-org-server hold" | sudo dpkg --set-selections 
echo "mongodb-org-shell hold" | sudo dpkg --set-selections 
echo "mongodb-org-mongos hold" | sudo dpkg --set-selections 
echo "mongodb-org-tools hold" | sudo dpkg --set-selections 
Previous versions of MongoDB packages use different naming conventions. See the 2.4 version of documentation for 
more information5. 
Run MongoDB The MongoDB instance stores its data files in /var/lib/mongodb and its log files in 
/var/log/mongodb by default, and runs using the mongodb user account. You can specify alternate log and 
data file directories in /etc/mongod.conf. See systemLog.path and storage.dbPath for additional 
information. 
If you change the user that runs the MongoDB process, you must modify the access control rights to the 
/var/lib/mongodb and /var/log/mongodb directories to give that user access to these directories. 
4. http://docs.mongodb.org/10gen-gpg-key.asc 
5. http://docs.mongodb.org/v2.4/tutorial/install-mongodb-on-ubuntu 
Step 1: Start MongoDB. Issue the following command to start mongod: 
sudo service mongod start 
Step 2: Verify that MongoDB has started successfully Verify that the mongod process has started successfully 
by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading 
[initandlisten] waiting for connections on port <port> 
where <port> is the port configured in /etc/mongod.conf, 27017 by default. 
Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command: 
sudo service mongod stop 
Step 4: Restart MongoDB. Issue the following command to restart mongod: 
sudo service mongod restart 
Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also 
consider the Production Notes (page 188) document before deploying MongoDB in a production environment. 
Install MongoDB on Linux Systems 
Overview Compiled versions of MongoDB for Linux provide a simple option for installing MongoDB on Linux 
systems for which no supported packages exist. 
Considerations For production deployments, always run MongoDB on 64-bit systems. 
Install MongoDB MongoDB provides archives for both 64-bit and 32-bit Linux. Follow the installation procedure 
appropriate for your system. 
Install for 64-bit Linux 
Step 1: Download the binary files for the desired release of MongoDB. Download the binaries from 
https://www.mongodb.org/downloads. 
For example, to download the latest release through the shell, issue the following: 
curl -O http://downloads.mongodb.org/linux/mongodb-linux-x86_64-2.6.4.tgz 
Step 2: Extract the files from the downloaded archive. For example, from a system shell, you can extract through 
the tar command: 
tar -zxvf mongodb-linux-x86_64-2.6.4.tgz 
Step 3: Copy the extracted archive to the target directory. Copy the extracted folder to the location from which 
MongoDB will run. 
mkdir -p mongodb 
cp -R -n mongodb-linux-x86_64-2.6.4/ mongodb 
Step 4: Ensure the location of the binaries is in the PATH variable. The MongoDB binaries are in the bin/ 
directory of the archive. To ensure that the binaries are in your PATH, you can modify your PATH. 
For example, you can add the following line to your shell’s rc file (e.g. ~/.bashrc): 
export PATH=<mongodb-install-directory>/bin:$PATH 
Replace <mongodb-install-directory> with the path to the extracted MongoDB archive. 
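The effect of the PATH change in Step 4 can be illustrated with a throwaway install directory and a stub binary. Everything below (the temporary directory, the stub script) is an illustration standing in for the real extracted archive, not the actual MongoDB binaries:

```shell
#!/usr/bin/env bash
# Illustrate prepending a MongoDB install directory to PATH.
# INSTALL_DIR is a stand-in for <mongodb-install-directory>.
INSTALL_DIR="$(mktemp -d)"
mkdir -p "$INSTALL_DIR/bin"

# Stub standing in for the real mongod binary from the archive.
printf '#!/bin/sh\necho stub-mongod\n' > "$INSTALL_DIR/bin/mongod"
chmod +x "$INSTALL_DIR/bin/mongod"

# The same line you would add to ~/.bashrc, evaluated here directly.
export PATH="$INSTALL_DIR/bin:$PATH"
command -v mongod   # now resolves inside $INSTALL_DIR/bin
```

Because the new directory is prepended, it takes precedence over any other mongod earlier in the PATH.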
Install for 32-bit Linux 
Step 1: Download the binary files for the desired release of MongoDB. Download the binaries from 
https://www.mongodb.org/downloads. 
For example, to download the latest release through the shell, issue the following: 
curl -O http://downloads.mongodb.org/linux/mongodb-linux-i686-2.6.4.tgz 
Step 2: Extract the files from the downloaded archive. For example, from a system shell, you can extract through 
the tar command: 
tar -zxvf mongodb-linux-i686-2.6.4.tgz 
Step 3: Copy the extracted archive to the target directory. Copy the extracted folder to the location from which 
MongoDB will run. 
mkdir -p mongodb 
cp -R -n mongodb-linux-i686-2.6.4/ mongodb 
Step 4: Ensure the location of the binaries is in the PATH variable. The MongoDB binaries are in the bin/ 
directory of the archive. To ensure that the binaries are in your PATH, you can modify your PATH. 
For example, you can add the following line to your shell’s rc file (e.g. ~/.bashrc): 
export PATH=<mongodb-install-directory>/bin:$PATH 
Replace <mongodb-install-directory> with the path to the extracted MongoDB archive. 
Run MongoDB 
Step 1: Create the data directory. Before you start MongoDB for the first time, create the directory to which 
the mongod process will write data. By default, the mongod process uses the /data/db directory. If you create a 
directory other than this one, you must specify that directory in the dbpath option when starting the mongod process 
later in this procedure. 
The following example command creates the default /data/db directory: 
mkdir -p /data/db 
Step 2: Set permissions for the data directory. Before running mongod for the first time, ensure that the user 
account running mongod has read and write permissions for the directory. 
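Steps 1 and 2 can be sketched together. This uses a temporary directory as a stand-in for /data/db and a plain read/write check in place of chown (which requires root); it is an illustration, not the literal production setup:

```shell
#!/usr/bin/env bash
# Create a data directory and verify the current user can use it.
DBPATH="${DBPATH:-$(mktemp -d)}"   # stand-in for /data/db
mkdir -p "$DBPATH"

if [ -r "$DBPATH" ] && [ -w "$DBPATH" ]; then
  echo "data directory $DBPATH is ready"
else
  echo "fix permissions on $DBPATH (e.g. chown/chmod as root)" >&2
fi
```

If the check fails on a real system, adjust ownership of the directory to the account that will run mongod before starting it.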
Step 3: Run MongoDB. To run MongoDB, run the mongod process at the system prompt. If necessary, specify the 
path of the mongod or the data directory. See the following examples. 
Run without specifying paths If your system PATH variable includes the location of the mongod binary and if you 
use the default data directory (i.e., /data/db), simply enter mongod at the system prompt: 
mongod 
Specify the path of the mongod If your PATH does not include the location of the mongod binary, enter the full 
path to the mongod binary at the system prompt: 
<path to binary>/mongod 
Specify the path of the data directory If you do not use the default data directory (i.e., /data/db), specify the 
path to the data directory using the --dbpath option: 
mongod --dbpath <path to data directory> 
Step 4: Stop MongoDB as needed. To stop MongoDB, press Control+C in the terminal where the mongod 
instance is running. 
Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also 
consider the Production Notes (page 188) document before deploying MongoDB in a production environment. 
2.1.2 Install MongoDB on OS X 
Overview 
Use this tutorial to install MongoDB on OS X systems. 
Platform Support 
Starting in version 2.4, MongoDB only supports OS X versions 10.6 (Snow Leopard) and later on Intel x86-64. 
MongoDB is available through the popular OS X package manager Homebrew [6] or through the MongoDB Download 
site [7]. 
Install MongoDB 
You can install MongoDB with Homebrew [8] or manually. This section describes both. 
[6] http://brew.sh/ 
[7] http://www.mongodb.org/downloads 
[8] http://brew.sh/ 
Install MongoDB with Homebrew 
Homebrew [9] installs binary packages based on published “formulae.” This section describes how to update brew to 
the latest packages and install MongoDB. Homebrew requires some initial setup and configuration, which is beyond 
the scope of this document. 
Step 1: Update Homebrew’s package database. 
In a system shell, issue the following command: 
brew update 
Step 2: Install MongoDB. 
You can install MongoDB via brew with several different options. Use one of the following operations: 
Install the MongoDB Binaries To install the MongoDB binaries, issue the following command in a system shell: 
brew install mongodb 
Build MongoDB from Source with SSL Support To build MongoDB from the source files and include SSL support, 
issue the following from a system shell: 
brew install mongodb --with-openssl 
Install the Latest Development Release of MongoDB To install the latest development release for use in testing 
and development, issue the following command in a system shell: 
brew install mongodb --devel 
Install MongoDB Manually 
Only install MongoDB using this procedure if you cannot use Homebrew (page 17). 
Step 1: Download the binary files for the desired release of MongoDB. 
Download the binaries from https://www.mongodb.org/downloads. 
For example, to download the latest release through the shell, issue the following: 
curl -O http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.6.4.tgz 
Step 2: Extract the files from the downloaded archive. 
For example, from a system shell, you can extract through the tar command: 
[9] http://brew.sh/ 
tar -zxvf mongodb-osx-x86_64-2.6.4.tgz 
Step 3: Copy the extracted archive to the target directory. 
Copy the extracted folder to the location from which MongoDB will run. 
mkdir -p mongodb 
cp -R -n mongodb-osx-x86_64-2.6.4/ mongodb 
Step 4: Ensure the location of the binaries is in the PATH variable. 
The MongoDB binaries are in the bin/ directory of the archive. To ensure that the binaries are in your PATH, you 
can modify your PATH. 
For example, you can add the following line to your shell’s rc file (e.g. ~/.bashrc): 
export PATH=<mongodb-install-directory>/bin:$PATH 
Replace <mongodb-install-directory> with the path to the extracted MongoDB archive. 
Run MongoDB 
Step 1: Create the data directory. 
Before you start MongoDB for the first time, create the directory to which the mongod process will write data. By 
default, the mongod process uses the /data/db directory. If you create a directory other than this one, you must 
specify that directory in the dbpath option when starting the mongod process later in this procedure. 
The following example command creates the default /data/db directory: 
mkdir -p /data/db 
Step 2: Set permissions for the data directory. 
Before running mongod for the first time, ensure that the user account running mongod has read and write permissions 
for the directory. 
Step 3: Run MongoDB. 
To run MongoDB, run the mongod process at the system prompt. If necessary, specify the path of the mongod or the 
data directory. See the following examples. 
Run without specifying paths If your system PATH variable includes the location of the mongod binary and if you 
use the default data directory (i.e., /data/db), simply enter mongod at the system prompt: 
mongod 
Specify the path of the mongod If your PATH does not include the location of the mongod binary, enter the full 
path to the mongod binary at the system prompt: 
<path to binary>/mongod 
Specify the path of the data directory If you do not use the default data directory (i.e., /data/db), specify the 
path to the data directory using the --dbpath option: 
mongod --dbpath <path to data directory> 
Step 4: Stop MongoDB as needed. 
To stop MongoDB, press Control+C in the terminal where the mongod instance is running. 
Step 5: Begin using MongoDB. 
To begin using MongoDB, see Getting Started with MongoDB (page 43). Also consider the Production Notes 
(page 188) document before deploying MongoDB in a production environment. 
2.1.3 Install MongoDB on Windows 
Overview 
Use this tutorial to install MongoDB on Windows systems. 
Platform Support 
Starting in version 2.2, MongoDB does not support Windows XP. Please use a more recent version of Windows to use 
more recent releases of MongoDB. 
Important: If you are running any edition of Windows Server 2008 R2 or Windows 7, please install a hotfix to 
resolve an issue with memory mapped files on Windows [10]. 
Install MongoDB 
Step 1: Determine which MongoDB build you need. 
There are three builds of MongoDB for Windows: 
MongoDB for Windows Server 2008 R2 edition (i.e. 2008R2) runs only on Windows Server 2008 R2, Windows 7 
64-bit, and newer versions of Windows. This build takes advantage of recent enhancements to the Windows Platform 
and cannot operate on older versions of Windows. 
MongoDB for Windows 64-bit runs on any 64-bit version of Windows newer than Windows XP, including Windows 
Server 2008 R2 and Windows 7 64-bit. 
MongoDB for Windows 32-bit runs on any 32-bit version of Windows newer than Windows XP. 32-bit versions of 
MongoDB are only intended for older systems and for use in testing and development systems. 32-bit versions of 
MongoDB only support databases smaller than 2GB. 
[10] http://support.microsoft.com/kb/2731284 
To find which version of Windows you are running, enter the following command in the Command Prompt: 
wmic os get osarchitecture 
Step 2: Download MongoDB for Windows. 
Download the latest production release of MongoDB from the MongoDB downloads page [11]. Ensure you download 
the correct version of MongoDB for your Windows system. The 64-bit versions of MongoDB do not work with 
32-bit Windows. 
Step 3: Install the downloaded file. 
In Windows Explorer, locate the downloaded MongoDB .msi file, which typically is located in the default Downloads 
folder. Double-click the .msi file. A set of screens will appear to guide you through the installation process. 
Step 4: Move the MongoDB folder to another location (optional). 
To move the MongoDB folder, you must issue the move command as an Administrator. For example, to move the 
folder to C:\mongodb: 
Select Start Menu > All Programs > Accessories. 
Right-click Command Prompt and select Run as Administrator from the popup menu. 
Issue the following commands: 
cd \ 
move C:\mongodb-win32-* C:\mongodb 
MongoDB is self-contained and does not have any other system dependencies. You can run MongoDB from any folder 
you choose. You may install MongoDB in any folder (e.g. D:\test\mongodb). 
Run MongoDB 
Warning: Do not make mongod.exe visible on public networks without running in “Secure Mode” with the 
auth setting. MongoDB is designed to be run in trusted environments, and the database does not enable “Secure 
Mode” by default. 
Step 1: Set up the MongoDB environment. 
MongoDB requires a data directory to store all data. MongoDB’s default data directory path is \data\db. Create 
this folder using the following commands from a Command Prompt: 
md \data\db 
You can specify an alternate path for data files using the --dbpath option to mongod.exe, for example: 
C:\mongodb\bin\mongod.exe --dbpath d:\test\mongodb\data 
If your path includes spaces, enclose the entire path in double quotes, for example: 
[11] http://www.mongodb.org/downloads 
C:\mongodb\bin\mongod.exe --dbpath "d:\test\mongo db data" 
Step 2: Start MongoDB. 
To start MongoDB, run mongod.exe. For example, from the Command Prompt: 
C:\Program Files\MongoDB\bin\mongod.exe 
This starts the main MongoDB database process. The waiting for connections message in the console 
output indicates that the mongod.exe process is running successfully. 
Depending on the security level of your system, Windows may pop up a Security Alert dialog box about blocking 
“some features” of C:\Program Files\MongoDB\bin\mongod.exe from communicating on networks. 
All users should select Private Networks, such as my home or work network and click Allow 
access. For additional information on security and MongoDB, please see the Security Documentation (page 281). 
Step 3: Connect to MongoDB. 
To connect to MongoDB through the mongo.exe shell, open another Command Prompt. When connecting, specify 
the data directory if necessary. This step provides several example connection commands. 
If your MongoDB installation uses the default data directory, connect without specifying the data directory: 
C:\mongodb\bin\mongo.exe 
If your installation uses a different data directory, specify the directory when connecting, as in this example: 
C:\mongodb\bin\mongod.exe --dbpath d:\test\mongodb\data 
If your path includes spaces, enclose the entire path in double quotes. For example: 
C:\mongodb\bin\mongod.exe --dbpath "d:\test\mongo db data" 
If you want to develop applications using .NET, see the documentation of C# and MongoDB [12] for more information. 
Step 4: Begin using MongoDB. 
To begin using MongoDB, see Getting Started with MongoDB (page 43). Also consider the Production Notes 
(page 188) document before deploying MongoDB in a production environment. 
Configure a Windows Service for MongoDB 
Note: There is a known issue for MongoDB 2.6.0, SERVER-13515 [13], which prevents the use of the instructions 
in this section. For MongoDB 2.6.0, use Manually Create a Windows Service for MongoDB (page 22) to create a 
Windows Service for MongoDB instead. 
[12] http://docs.mongodb.org/ecosystem/drivers/csharp 
[13] https://jira.mongodb.org/browse/SERVER-13515 
Step 1: Configure directories and files. 
Create a configuration file and a directory path for MongoDB log output (logpath): 
Create a specific directory for MongoDB log files: 
md "C:\Program Files\MongoDB\log" 
In the Command Prompt, create a configuration file for the logpath option for MongoDB: 
echo logpath="C:\Program Files\MongoDB\log\mongo.log" > "C:\Program Files\MongoDB\mongod.cfg" 
Step 2: Run the MongoDB service. 
Run all of the following commands in Command Prompt with “Administrative Privileges:” 
Install the MongoDB service. For --install to succeed, you must specify the logpath run-time option. 
"C:\Program Files\MongoDB\bin\mongod.exe" --config "C:\Program Files\MongoDB\mongod.cfg" --install 
Modify the path to the mongod.cfg file as needed. 
To use an alternate dbpath, specify the path in the configuration file (e.g. C:\Program 
Files\MongoDB\mongod.cfg) or on the command line with the --dbpath option. 
If the dbpath directory does not exist, mongod.exe will not start. The default value for dbpath is \data\db. 
If needed, you can install services for multiple instances of mongod.exe or mongos.exe. Install each service with 
a unique --serviceName and --serviceDisplayName. Use multiple instances only when sufficient system 
resources exist and your system design requires it. 
Step 3: Stop or remove the MongoDB service as needed. 
To stop the MongoDB service use the following command: 
net stop MongoDB 
To remove the MongoDB service use the following command: 
"C:\Program Files\MongoDB\bin\mongod.exe" --remove 
Manually Create a Windows Service for MongoDB 
The following procedure assumes you have installed MongoDB using the MSI installer, with the default path 
C:\Program Files\MongoDB 2.6 Standard. 
If you have installed in an alternative directory, you will need to adjust the paths as appropriate. 
Step 1: Open an Administrator command prompt. 
Windows 7 / Vista / Server 2008 (and R2) Press Win + R, then type cmd, then press Ctrl + Shift + 
Enter. 
Windows 8 Press Win + X, then press A. 
Execute the remaining steps from the Administrator command prompt. 
Step 2: Create directories. 
Create directories for your database and log files: 
mkdir c:\data\db 
mkdir c:\data\log 
Step 3: Create a configuration file. 
Create a configuration file. This file can include any of the configuration options for mongod, but 
must include a valid setting for logpath: 
The following creates a configuration file, specifying both the logpath and the dbpath settings in the configuration 
file: 
echo logpath=c:\data\log\mongod.log> "C:\Program Files\MongoDB 2.6 Standard\mongod.cfg" 
echo dbpath=c:\data\db>> "C:\Program Files\MongoDB 2.6 Standard\mongod.cfg" 
Step 4: Create the MongoDB service. 
Create the MongoDB service. 
sc.exe create MongoDB binPath= "\"C:\Program Files\MongoDB 2.6 Standard\bin\mongod.exe\" --service --config=\"C:\Program Files\MongoDB 2.6 Standard\mongod.cfg\"" 
sc.exe requires a space between “=” and the configuration values (e.g. “binPath= ”), and a “\” to escape double quotes. 
If successfully created, the following log message will display: 
[SC] CreateService SUCCESS 
Step 5: Start the MongoDB service. 
net start MongoDB 
Step 6: Stop or remove the MongoDB service as needed. 
To stop the MongoDB service, use the following command: 
net stop MongoDB 
To remove the MongoDB service, first stop the service and then run the following command: 
sc.exe delete MongoDB 
2.1.4 Install MongoDB Enterprise 
These documents provide instructions to install MongoDB Enterprise for Linux and Windows Systems. 
Install MongoDB Enterprise on Red Hat (page 24) Install the MongoDB Enterprise build and required dependencies 
on Red Hat Enterprise or CentOS Systems using packages. 
Install MongoDB Enterprise on Ubuntu (page 27) Install the MongoDB Enterprise build and required dependencies 
on Ubuntu Linux Systems using packages. 
Install MongoDB Enterprise on Debian (page 30) Install the MongoDB Enterprise build and required dependencies 
on Debian Linux Systems using packages. 
Install MongoDB Enterprise on SUSE (page 32) Install the MongoDB Enterprise build and required dependencies 
on SUSE Enterprise Linux. 
Install MongoDB Enterprise on Amazon AMI (page 34) Install the MongoDB Enterprise build and required dependencies 
on Amazon Linux AMI. 
Install MongoDB Enterprise on Windows (page 36) Install the MongoDB Enterprise build and required dependencies 
using the .msi installer. 
Install MongoDB Enterprise on Red Hat Enterprise or CentOS 
Overview 
Use this tutorial to install MongoDB Enterprise on Red Hat Enterprise Linux or CentOS Linux from .rpm packages. 
Packages 
MongoDB provides packages of the officially supported MongoDB Enterprise builds in its own repository. This 
repository provides the MongoDB Enterprise distribution in the following packages: 
• mongodb-enterprise 
This package is a metapackage that will automatically install the four component packages listed below. 
• mongodb-enterprise-server 
This package contains the mongod daemon and associated configuration and init scripts. 
• mongodb-enterprise-mongos 
This package contains the mongos daemon. 
• mongodb-enterprise-shell 
This package contains the mongo shell. 
• mongodb-enterprise-tools 
This package contains the following MongoDB tools: bsondump, mongodump, mongoexport, 
mongofiles, mongoimport, mongooplog, mongoperf, mongorestore, mongostat, and 
mongotop. 
Control Scripts 
The mongodb-enterprise package includes various control scripts, including the init script 
/etc/rc.d/init.d/mongod. 
The package configures MongoDB using the /etc/mongod.conf file in conjunction with the control scripts. 
As of version 2.6.4, there are no control scripts for mongos. The mongos process is used only in sharding (page 613). 
You can use the mongod init script to derive your own mongos control script. 
Considerations 
MongoDB only provides Enterprise packages for Red Hat Enterprise Linux and CentOS Linux versions 5 and 6, 
64-bit. 
The default /etc/mongodb.conf configuration file supplied by the 2.6 series packages has bind_ip set to 
127.0.0.1 by default. Modify this setting as needed for your environment before initializing a replica set. 
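A quick way to inspect the setting is sketched below. A sample configuration file is generated into a temporary path so the commands are self-contained; on a real system point CONF at /etc/mongodb.conf instead:

```shell
#!/usr/bin/env bash
# Inspect the bind_ip setting in a 2.6-style config file.
# CONF is a stand-in for /etc/mongodb.conf.
CONF="${CONF:-$(mktemp)}"
cat > "$CONF" <<'EOF'
# sample of the packaged 2.6 default
bind_ip=127.0.0.1
port=27017
EOF

grep '^bind_ip' "$CONF"
# Before initiating a replica set, edit the value so other members can
# reach this host, e.g.: bind_ip=127.0.0.1,192.0.2.10 (example address)
```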
Changed in version 2.6: The package structure and names have changed as of version 2.6. For instructions on installation 
of an older release, please refer to the documentation for the appropriate version. 
Install MongoDB Enterprise 
When you install the packages for MongoDB Enterprise, you choose whether to install the current release or a previous 
one. This procedure describes how to do both. 
Step 1: Configure repository. Create an /etc/yum.repos.d/mongodb-enterprise.repo file so that 
you can install MongoDB enterprise directly, using yum. 
Use the following repository file to specify the latest stable release of MongoDB enterprise. 
[mongodb-enterprise] 
name=MongoDB Enterprise Repository 
baseurl=https://repo.mongodb.com/yum/redhat/$releasever/mongodb-enterprise/stable/$basearch/ 
gpgcheck=0 
enabled=1 
Use the following repository to install only versions of MongoDB for the 2.6 release. If you’d like to install MongoDB 
Enterprise packages from a particular release series (page 808), such as 2.4 or 2.6, you can specify the release 
series in the repository configuration. For example, to restrict your system to the 2.6 release series, create a 
/etc/yum.repos.d/mongodb-enterprise-2.6.repo file to hold the following configuration information 
for the MongoDB Enterprise 2.6 repository: 
[mongodb-enterprise-2.6] 
name=MongoDB Enterprise 2.6 Repository 
baseurl=https://repo.mongodb.com/yum/redhat/$releasever/mongodb-enterprise/2.6/$basearch/ 
gpgcheck=0 
enabled=1 
.repo files for each release can also be found in the repository itself [14]. Remember that odd-numbered minor release 
versions (e.g. 2.5) are development versions and are unsuitable for production deployment. 
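Writing the repository file can itself be scripted. This sketch writes the 2.6 series definition to a temporary path; on a real system, write it (via sudo) to /etc/yum.repos.d/mongodb-enterprise-2.6.repo instead:

```shell
#!/usr/bin/env bash
# REPO_FILE is a stand-in for /etc/yum.repos.d/mongodb-enterprise-2.6.repo.
REPO_FILE="${REPO_FILE:-$(mktemp)}"
# Quoted heredoc delimiter keeps $releasever/$basearch literal for yum.
cat > "$REPO_FILE" <<'EOF'
[mongodb-enterprise-2.6]
name=MongoDB Enterprise 2.6 Repository
baseurl=https://repo.mongodb.com/yum/redhat/$releasever/mongodb-enterprise/2.6/$basearch/
gpgcheck=0
enabled=1
EOF
echo "wrote $REPO_FILE"
```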
Step 2: Install the MongoDB Enterprise packages and associated tools. You can install either the latest stable 
version of MongoDB Enterprise or a specific version of MongoDB Enterprise. 
Install the latest stable version of MongoDB Enterprise. Issue the following command: 
sudo yum install -y mongodb-enterprise 
Step 3: Optional: Manage installed version. 
Install a specific release of MongoDB Enterprise. Specify each component package individually and append the 
version number to the package name, as in the following example that installs the 2.6.1 release of MongoDB: 
sudo yum install -y mongodb-enterprise-2.6.1 mongodb-enterprise-server-2.6.1 mongodb-enterprise-shell-2.6.1 mongodb-enterprise-mongos-2.6.1 mongodb-enterprise-tools-2.6.1 
Pin a specific version of MongoDB Enterprise. Although you can specify any available version of MongoDB 
Enterprise, yum will upgrade the packages when a newer version becomes available. To prevent unintended upgrades, 
pin the package. To pin a package, add the following exclude directive to your /etc/yum.conf file: 
exclude=mongodb-enterprise,mongodb-enterprise-server,mongodb-enterprise-shell,mongodb-enterprise-mongos,mongodb-enterprise-tools 
Previous versions of MongoDB packages use different naming conventions. See the 2.4 version of the documentation [15] for 
more information. 
Step 4: When the install completes, you can run MongoDB. 
[14] https://repo.mongodb.com/yum/redhat/ 
Run MongoDB Enterprise 
Important: You must configure SELinux to allow MongoDB to start on Red Hat Linux-based systems (Red Hat 
Enterprise Linux, CentOS, Fedora). Administrators have three options: 
• enable access to the relevant ports (e.g. 27017) for SELinux. See Default MongoDB Port (page 380) for more 
information on MongoDB’s default ports. For default settings, this can be accomplished by running 
semanage port -a -t mongodb_port_t -p tcp 27017 
• set SELinux to permissive mode in /etc/selinux.conf. The line 
SELINUX=enforcing 
should be changed to 
SELINUX=permissive 
• disable SELinux entirely; as above but set 
SELINUX=disabled 
All three options require root privileges. The latter two options each require a system reboot and may have larger 
implications for your deployment. 
You may alternatively choose not to install the SELinux packages when you are installing your Linux operating system, 
or choose to remove the relevant packages. This option is the most invasive and is not recommended. 
The MongoDB instance stores its data files in /var/lib/mongo and its log files in /var/log/mongodb 
by default, and runs using the mongod user account. You can specify alternate log and data file directories in 
/etc/mongodb.conf. See systemLog.path and storage.dbPath for additional information. 
15http://docs.mongodb.org/v2.4/tutorial/install-mongodb-on-linux 
26 Chapter 2. Install MongoDB
MongoDB Documentation, Release 2.6.4 
If you change the user that runs the MongoDB process, you must modify the access control rights to the 
/var/lib/mongo and /var/log/mongodb directories to give that user access to these directories. 
Step 1: Start MongoDB. You can start the mongod process by issuing the following command: 
sudo service mongod start 
Step 2: Verify that MongoDB has started successfully You can verify that the mongod process has started 
successfully by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading 
[initandlisten] waiting for connections on port <port> 
where <port> is the port configured in /etc/mongod.conf, 27017 by default. 
You can optionally ensure that MongoDB will start following a system reboot by issuing the following command: 
sudo chkconfig mongod on 
Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command: 
sudo service mongod stop 
Step 4: Restart MongoDB. You can restart the mongod process by issuing the following command: 
sudo service mongod restart 
You can follow the state of the process for errors or important messages by watching the output in the 
/var/log/mongodb/mongod.log file. 
Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also 
consider the Production Notes (page 188) document before deploying MongoDB in a production environment. 
Install MongoDB Enterprise on Ubuntu 
Overview 
Use this tutorial to install MongoDB Enterprise on Ubuntu Linux systems from .deb packages. 
Packages 
MongoDB provides packages of the officially supported MongoDB Enterprise builds in its own repository. This 
repository provides the MongoDB Enterprise distribution in the following packages: 
• mongodb-enterprise 
This package is a metapackage that will automatically install the four component packages listed below. 
• mongodb-enterprise-server 
This package contains the mongod daemon and associated configuration and init scripts. 
• mongodb-enterprise-mongos 
This package contains the mongos daemon. 
• mongodb-enterprise-shell 
This package contains the mongo shell. 
• mongodb-enterprise-tools 
This package contains the following MongoDB tools: bsondump, mongodump, mongoexport, 
mongofiles, mongoimport, mongooplog, mongoperf, mongorestore, mongostat, and 
mongotop. 
Control Scripts 
The mongodb-enterprise package includes various control scripts, including the init script 
/etc/rc.d/init.d/mongod. 
The package configures MongoDB using the /etc/mongod.conf file in conjunction with the control scripts. 
As of version 2.6.4, there are no control scripts for mongos. The mongos process is used only in sharding (page 613). 
You can use the mongod init script to derive your own mongos control script. 
Considerations 
MongoDB only provides Enterprise packages for Ubuntu 12.04 LTS (Precise Pangolin). 
Changed in version 2.6: The package structure and names have changed as of version 2.6. For instructions on installation 
of an older release, please refer to the documentation for the appropriate version. 
Install MongoDB Enterprise 
Step 1: Import the public key used by the package management system. The Ubuntu package management tools 
(i.e. dpkg and apt) ensure package consistency and authenticity by requiring that distributors sign packages with 
GPG keys. Issue the following command to import the MongoDB public GPG Key [16]: 
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10 
Step 2: Create a /etc/apt/sources.list.d/mongodb-enterprise.list file for MongoDB. Create 
the list file using the following command: 
echo 'deb http://repo.mongodb.com/apt/ubuntu precise/mongodb-enterprise/stable multiverse' | sudo tee /etc/apt/sources.list.d/mongodb-enterprise.list 
If you’d like to install MongoDB Enterprise packages from a particular release series (page 808), such as 2.4 or 2.6, 
you can specify the release series in the repository configuration. For example, to restrict your system to the 2.6 release 
series, add the following repository: 
echo 'deb http://repo.mongodb.com/apt/ubuntu precise/mongodb-enterprise/2.6 multiverse' | sudo tee /etc/apt/sources.list.d/mongodb-enterprise-2.6.list 
Step 3: Reload local package database. Issue the following command to reload the local package database: 
sudo apt-get update 
[16] http://docs.mongodb.org/10gen-gpg-key.asc 
Step 4: Install the MongoDB Enterprise packages. When you install the packages, you choose whether to install 
the current release or a previous one. This step provides instructions for both. 
To install the latest stable version of MongoDB Enterprise, issue the following command: 
sudo apt-get install mongodb-enterprise 
To install a specific release of MongoDB Enterprise, specify each component package individually and append the 
version number to the package name, as in the following example that installs the 2.6.1 release of MongoDB 
Enterprise: 
sudo apt-get install mongodb-enterprise=2.6.1 mongodb-enterprise-server=2.6.1 mongodb-enterprise-shell=2.6.1 mongodb-enterprise-mongos=2.6.1 mongodb-enterprise-tools=2.6.1 
You can specify any available version of MongoDB Enterprise. However apt-get will upgrade the packages when 
a newer version becomes available. To prevent unintended upgrades, pin the package. To pin the version of MongoDB 
Enterprise at the currently installed version, issue the following command sequence: 
echo "mongodb-enterprise hold" | sudo dpkg --set-selections 
echo "mongodb-enterprise-server hold" | sudo dpkg --set-selections 
echo "mongodb-enterprise-shell hold" | sudo dpkg --set-selections 
echo "mongodb-enterprise-mongos hold" | sudo dpkg --set-selections 
echo "mongodb-enterprise-tools hold" | sudo dpkg --set-selections 
Previous versions of MongoDB Enterprise packages use different naming conventions. See the 2.4 version of the 
documentation [17] for more information. 
Run MongoDB Enterprise 
The MongoDB instance stores its data files in /var/lib/mongodb and its log files in /var/log/mongodb 
by default, and runs using the mongodb user account. You can specify alternate log and data file directories in 
/etc/mongodb.conf. See systemLog.path and storage.dbPath for additional information. 
If you change the user that runs the MongoDB process, you must modify the access control rights to the 
/var/lib/mongodb and /var/log/mongodb directories to give this user access to these directories. 
Step 1: Start MongoDB. Issue the following command to start mongod: 
sudo service mongod start 
Step 2: Verify that MongoDB has started successfully. Verify that the mongod process has started successfully 
by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading 
[initandlisten] waiting for connections on port <port> 
where <port> is the port configured in /etc/mongod.conf, 27017 by default. 
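The check above can be scripted. This sketch pulls the port number out of the startup line with sed; the sample line stands in for the real contents of /var/log/mongodb/mongod.log.

```shell
# A startup line of the form mongod writes when it is ready.
line='[initandlisten] waiting for connections on port 27017'
# Extract the port number that follows "port".
port=$(printf '%s\n' "$line" | sed -n 's/.*waiting for connections on port \([0-9]*\).*/\1/p')
echo "$port"
```

On a live system you would replace the sample line with `grep 'waiting for connections' /var/log/mongodb/mongod.log`.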
Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command: 
sudo service mongod stop 
Step 4: Restart MongoDB. Issue the following command to restart mongod: 
17http://docs.mongodb.org/v2.4/tutorial/install-mongodb-enterprise 
sudo service mongod restart 
Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also 
consider the Production Notes (page 188) document before deploying MongoDB in a production environment. 
Install MongoDB Enterprise on Debian 
Overview 
Use this tutorial to install MongoDB Enterprise on Debian Linux systems from .deb packages. 
Packages 
MongoDB provides packages of the officially supported MongoDB Enterprise builds in its own repository. This 
repository provides the MongoDB Enterprise distribution in the following packages: 
• mongodb-enterprise 
This package is a metapackage that will automatically install the four component packages listed below. 
• mongodb-enterprise-server 
This package contains the mongod daemon and associated configuration and init scripts. 
• mongodb-enterprise-mongos 
This package contains the mongos daemon. 
• mongodb-enterprise-shell 
This package contains the mongo shell. 
• mongodb-enterprise-tools 
This package contains the following MongoDB tools: bsondump, mongodump, mongoexport, 
mongofiles, mongoimport, mongooplog, mongoperf, mongorestore, 
mongostat, and mongotop. 
Control Scripts 
The mongodb-enterprise package includes various control scripts, including the init script 
/etc/rc.d/init.d/mongod. 
The package configures MongoDB using the /etc/mongod.conf file in conjunction with the control scripts. 
As of version 2.6.4, there are no control scripts for mongos. The mongos process is used only in sharding (page 613). 
You can use the mongod init script to derive your own mongos control script. 
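As a starting point, a derived mongos wrapper might look like the following sketch. The BINARY and CONF paths are assumptions, not paths defined by this guide, and a real init script would also implement stop and status actions in the style of the mongod init script.

```shell
# Hypothetical skeleton for a mongos control script derived from the
# mongod init script. Paths below are placeholders; adjust to your system.
BINARY=/usr/bin/mongos
CONF=/etc/mongos.conf

start() {
  echo "starting: $BINARY --config $CONF"
  # "$BINARY" --config "$CONF" --fork    # enabled on a real system
}

start
```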
Considerations 
Changed in version 2.6: The package structure and names have changed as of version 2.6. For instructions on installation 
of an older release, please refer to the documentation for the appropriate version. 
MongoDB only provides Enterprise packages for 64-bit versions of Debian Wheezy. 
Install MongoDB Enterprise 
Step 1: Import the public key used by the package management system. Issue the following command to add 
the MongoDB public GPG Key18 to the system key ring. 
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv 7F0CEB10 
Step 2: Create a /etc/apt/sources.list.d/mongodb-enterprise.list file for MongoDB. Create 
the list file using the following command: 
echo 'deb http://repo.mongodb.com/apt/debian wheezy/mongodb-enterprise/stable main' | sudo tee /etc/apt/sources.list.d/mongodb-enterprise.list
If you’d like to install MongoDB Enterprise packages from a particular release series (page 808), such as 2.6, you can 
specify the release series in the repository configuration. For example, to restrict your system to the 2.6 release series, 
add the following repository: 
echo 'deb http://repo.mongodb.com/apt/debian wheezy/mongodb-enterprise/2.6 main' | sudo tee /etc/apt/sources.list.d/mongodb-enterprise.list
Step 3: Reload local package database. Issue the following command to reload the local package database: 
sudo apt-get update 
Step 4: Install the MongoDB Enterprise packages. When you install the packages, you choose whether to install 
the current release or a previous one. This step provides instructions for both. 
To install the latest stable version of MongoDB Enterprise, issue the following command: 
sudo apt-get install mongodb-enterprise 
To install a specific release of MongoDB Enterprise, specify each component package individually and append the 
version number to the package name, as in the following example that installs the 2.6.1 release of MongoDB Enterprise: 
sudo apt-get install mongodb-enterprise=2.6.1 mongodb-enterprise-server=2.6.1 mongodb-enterprise-shell=2.6.1 mongodb-enterprise-mongos=2.6.1 mongodb-enterprise-tools=2.6.1
You can specify any available version of MongoDB Enterprise. However, apt-get will upgrade the packages when 
a newer version becomes available. To prevent unintended upgrades, pin the package. To pin the version of MongoDB 
Enterprise at the currently installed version, issue the following command sequence: 
echo "mongodb-enterprise hold" | sudo dpkg --set-selections 
echo "mongodb-enterprise-server hold" | sudo dpkg --set-selections 
echo "mongodb-enterprise-shell hold" | sudo dpkg --set-selections 
echo "mongodb-enterprise-mongos hold" | sudo dpkg --set-selections 
echo "mongodb-enterprise-tools hold" | sudo dpkg --set-selections 
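An alternative to dpkg holds is an apt preferences pin. The following sketch writes the pin to a scratch file; on a real system the file would live in /etc/apt/preferences.d/, and a Pin-Priority above 1000 locks apt to the pinned version even across upgrades.

```shell
# Write an apt preferences pin to a temporary file for illustration.
# Real location: /etc/apt/preferences.d/mongodb-enterprise
f=$(mktemp)
cat > "$f" <<'EOF'
Package: mongodb-enterprise*
Pin: version 2.6.1*
Pin-Priority: 1001
EOF
# Show that both Pin directives were written.
grep -c '^Pin' "$f"
```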
Run MongoDB Enterprise 
The MongoDB instance stores its data files in /var/lib/mongodb and its log files in /var/log/mongodb 
by default, and runs using the mongodb user account. You can specify alternate log and data file directories in 
/etc/mongodb.conf. See systemLog.path and storage.dbPath for additional information. 
If you change the user that runs the MongoDB process, you must modify the access control rights to the 
/var/lib/mongodb and /var/log/mongodb directories to give this user access to these directories. 
18http://docs.mongodb.org/10gen-gpg-key.asc 
Step 1: Start MongoDB. Issue the following command to start mongod: 
sudo service mongod start 
Step 2: Verify that MongoDB has started successfully. Verify that the mongod process has started successfully 
by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading 
[initandlisten] waiting for connections on port <port> 
where <port> is the port configured in /etc/mongod.conf, 27017 by default. 
Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command: 
sudo service mongod stop 
Step 4: Restart MongoDB. Issue the following command to restart mongod: 
sudo service mongod restart 
Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also 
consider the Production Notes (page 188) document before deploying MongoDB in a production environment. 
Install MongoDB Enterprise on SUSE 
Overview 
Use this tutorial to install MongoDB Enterprise on SUSE Linux. MongoDB Enterprise is available on select platforms 
and contains support for several features related to security and monitoring. 
Prerequisites 
To use MongoDB Enterprise on SUSE Enterprise Linux, you must install several prerequisite packages: 
• libopenssl0_9_8 
• libsnmp15 
• net-snmp 
• snmp-mibs 
• cyrus-sasl 
• cyrus-sasl-gssapi 
To install these packages, you can issue the following command: 
sudo zypper install libopenssl0_9_8 net-snmp libsnmp15 snmp-mibs cyrus-sasl cyrus-sasl-gssapi 
Install MongoDB Enterprise 
Note: The Enterprise packages include an example SNMP configuration file named mongod.conf. This file is not 
a MongoDB configuration file. 
Step 1: Download and install the MongoDB Enterprise packages. After you have installed 
the required prerequisite packages, download and install the MongoDB Enterprise packages 
from http://www.mongodb.com/subscription/downloads. The MongoDB binaries are located in the 
bin/ directory of the archive. To download and install, use the 
following sequence of commands. 
curl -O http://downloads.10gen.com/linux/mongodb-linux-x86_64-subscription-suse11-2.6.4.tgz 
tar -zxvf mongodb-linux-x86_64-subscription-suse11-2.6.4.tgz 
cp -R -n mongodb-linux-x86_64-subscription-suse11-2.6.4/ mongodb 
Step 2: Ensure the location of the MongoDB binaries is included in the PATH variable. Once you have copied 
the MongoDB binaries to their target location, ensure that the location is included in your PATH variable. If it is not, 
either include it or create symbolic links from the binaries to a directory that is included. 
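For example, assuming you copied the binaries to /opt/mongodb/bin (a hypothetical location, not one created by this guide), you could prepend it to PATH for the current shell session:

```shell
# Prepend the assumed install location to PATH for this shell session.
# To make this permanent, add the line to your shell profile.
export PATH="/opt/mongodb/bin:$PATH"
# Confirm the new directory is searched first.
echo "$PATH" | cut -d: -f1
```

Alternatively, symlink the binaries into a directory already on PATH, such as /usr/local/bin.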
Run MongoDB Enterprise 
Step 1: Create the data directory. Before you start MongoDB for the first time, create the directory to which 
the mongod process will write data. By default, the mongod process uses the /data/db directory. If you create a 
directory other than this one, you must specify that directory in the dbpath option when starting the mongod process 
later in this procedure. 
The following example command creates the default /data/db directory: 
mkdir -p /data/db 
Step 2: Set permissions for the data directory. Before running mongod for the first time, ensure that the user 
account running mongod has read and write permissions for the directory. 
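The two steps above can be sketched as follows, using a scratch directory created with mktemp in place of /data/db:

```shell
# Create a scratch directory standing in for /data/db.
dbpath=$(mktemp -d)
# Grant read/write/execute to the owning user only, so the account that
# runs mongod has the access it needs.
chmod 700 "$dbpath"
# Show the resulting permission bits.
ls -ld "$dbpath" | cut -c1-10
```

On a real system, also ensure the directory is owned by the user account that will run mongod (e.g. with chown).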
Step 3: Run MongoDB. To run MongoDB, run the mongod process at the system prompt. If necessary, specify the 
path of the mongod or the data directory. See the following examples. 
Run without specifying paths If your system PATH variable includes the location of the mongod binary and if you 
use the default data directory (i.e., /data/db), simply enter mongod at the system prompt: 
mongod 
Specify the path of the mongod If your PATH does not include the location of the mongod binary, enter the full 
path to the mongod binary at the system prompt: 
<path to binary>/mongod 
Specify the path of the data directory If you do not use the default data directory (i.e., /data/db), specify the 
path to the data directory using the --dbpath option: 
mongod --dbpath <path to data directory> 
Step 4: Stop MongoDB as needed. To stop MongoDB, press Control+C in the terminal where the mongod 
instance is running. 
Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also 
consider the Production Notes (page 188) document before deploying MongoDB in a production environment. 
Install MongoDB Enterprise on Amazon Linux AMI 
Overview 
Use this tutorial to install MongoDB Enterprise on Amazon Linux AMI. MongoDB Enterprise is available on select 
platforms and contains support for several features related to security and monitoring. 
Prerequisites 
To use MongoDB Enterprise on Amazon Linux AMI, you must install several prerequisite packages: 
• net-snmp 
• net-snmp-libs 
• openssl 
• net-snmp-utils 
• cyrus-sasl 
• cyrus-sasl-lib 
• cyrus-sasl-devel 
• cyrus-sasl-gssapi 
To install these packages, you can issue the following command: 
sudo yum install openssl net-snmp net-snmp-libs net-snmp-utils cyrus-sasl cyrus-sasl-lib cyrus-sasl-devel cyrus-sasl-gssapi
Install MongoDB Enterprise 
Note: The Enterprise packages include an example SNMP configuration file named mongod.conf. This file is not 
a MongoDB configuration file. 
Step 1: Download and install the MongoDB Enterprise packages. After you have installed 
the required prerequisite packages, download and install the MongoDB Enterprise packages 
from http://www.mongodb.com/subscription/downloads. The MongoDB binaries are located in the 
bin/ directory of the archive. To download and install, use the 
following sequence of commands. 
curl -O http://downloads.10gen.com/linux/mongodb-linux-x86_64-subscription-amzn64-2.6.4.tgz 
tar -zxvf mongodb-linux-x86_64-subscription-amzn64-2.6.4.tgz 
cp -R -n mongodb-linux-x86_64-subscription-amzn64-2.6.4/ mongodb 
Step 2: Ensure the location of the MongoDB binaries is included in the PATH variable. Once you have copied 
the MongoDB binaries to their target location, ensure that the location is included in your PATH variable. If it is not, 
either include it or create symbolic links from the binaries to a directory that is included. 
Run MongoDB Enterprise 
The MongoDB instance stores its data files in /var/lib/mongo and its log files in /var/log/mongodb 
by default, and runs using the mongod user account. You can specify alternate log and data file directories in 
/etc/mongodb.conf. See systemLog.path and storage.dbPath for additional information. 
If you change the user that runs the MongoDB process, you must modify the access control rights to the 
/var/lib/mongo and /var/log/mongodb directories to give this user access to these directories. 
Step 1: Create the data directory. Before you start MongoDB for the first time, create the directory to which 
the mongod process will write data. By default, the mongod process uses the /data/db directory. If you create a 
directory other than this one, you must specify that directory in the dbpath option when starting the mongod process 
later in this procedure. 
The following example command creates the default /data/db directory: 
mkdir -p /data/db 
Step 2: Set permissions for the data directory. Before running mongod for the first time, ensure that the user 
account running mongod has read and write permissions for the directory. 
Step 3: Run MongoDB. To run MongoDB, run the mongod process at the system prompt. If necessary, specify the 
path of the mongod or the data directory. See the following examples. 
Run without specifying paths If your system PATH variable includes the location of the mongod binary and if you 
use the default data directory (i.e., /data/db), simply enter mongod at the system prompt: 
mongod 
Specify the path of the mongod If your PATH does not include the location of the mongod binary, enter the full 
path to the mongod binary at the system prompt: 
<path to binary>/mongod 
Specify the path of the data directory If you do not use the default data directory (i.e., /data/db), specify the 
path to the data directory using the --dbpath option: 
mongod --dbpath <path to data directory> 
Step 4: Stop MongoDB as needed. To stop MongoDB, press Control+C in the terminal where the mongod 
instance is running. 
Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also 
consider the Production Notes (page 188) document before deploying MongoDB in a production environment. 
Install MongoDB Enterprise on Windows 
New in version 2.6. 
Overview 
Use this tutorial to install MongoDB Enterprise on Windows systems. MongoDB Enterprise is available on select 
platforms and contains support for several features related to security and monitoring. 
Prerequisites 
MongoDB Enterprise Server for Windows requires Windows Server 2008 R2 or later. The MSI installer includes all 
other software dependencies. 
Install MongoDB Enterprise 
Step 1: Download MongoDB Enterprise for Windows. Download the latest production release of MongoDB 
Enterprise19. 
Step 2: Install MongoDB Enterprise for Windows. Run the downloaded MSI installer. Make configuration 
choices as prompted. 
MongoDB is self-contained and does not have any other system dependencies. You can install MongoDB into any 
folder (e.g. D:\test\mongodb) and run it from there. The installation wizard includes an option to select an 
installation directory. 
Run MongoDB Enterprise 
Warning: Do not make mongod.exe visible on public networks without running in “Secure Mode” with the 
auth setting. MongoDB is designed to be run in trusted environments, and the database does not enable “Secure 
Mode” by default. 
Step 1: Set up the MongoDB environment. MongoDB requires a data directory to store all data. MongoDB’s 
default data directory path is \data\db. Create this folder using the following commands from a Command Prompt: 
md \data\db 
You can specify an alternate path for data files using the --dbpath option to mongod.exe, for example: 
C:\mongodb\bin\mongod.exe --dbpath d:\test\mongodb\data 
If your path includes spaces, enclose the entire path in double quotes, for example: 
19http://www.mongodb.com/products/mongodb-enterprise 
C:\mongodb\bin\mongod.exe --dbpath "d:\test\mongo db data" 
Step 2: Start MongoDB. To start MongoDB, run mongod.exe. For example, from the Command Prompt: 
C:\Program Files\MongoDB\bin\mongod.exe 
This starts the main MongoDB database process. The waiting for connections message in the console 
output indicates that the mongod.exe process is running successfully. 
Depending on the security level of your system, Windows may pop up a Security Alert dialog box about blocking 
“some features” of C:\Program Files\MongoDB\bin\mongod.exe from communicating on networks. 
All users should select Private Networks, such as my home or work network and click Allow 
access. For additional information on security and MongoDB, please see the Security Documentation (page 281). 
Step 3: Connect to MongoDB. To connect to MongoDB through the mongo.exe shell, open another Command 
Prompt. When connecting, specify the data directory if necessary. This step provides several example connection 
commands. 
If your MongoDB installation uses the default data directory, connect without specifying the data directory: 
C:\mongodb\bin\mongo.exe 
If your installation uses a different data directory, start mongod.exe with that directory specified, as in this example: 
C:\mongodb\bin\mongod.exe --dbpath d:\test\mongodb\data 
If your path includes spaces, enclose the entire path in double quotes. For example: 
C:\mongodb\bin\mongod.exe --dbpath "d:\test\mongo db data" 
If you want to develop applications using .NET, see the documentation of C# and MongoDB20 for more information. 
Step 4: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also 
consider the Production Notes (page 188) document before deploying MongoDB in a production environment. 
20http://docs.mongodb.org/ecosystem/drivers/csharp 
Configure a Windows Service for MongoDB Enterprise 
Note: There is a known issue for MongoDB 2.6.0, SERVER-1351521, which prevents the use of the instructions 
in this section. For MongoDB 2.6.0, use Manually Create a Windows Service for MongoDB Enterprise (page 39) to 
create a Windows Service for MongoDB. 
You can set up the MongoDB server as a Windows Service that starts automatically at boot time. 
Step 1: Configure directories and files. Create a configuration file and a directory path for MongoDB log 
output (logpath): 
Create a specific directory for MongoDB log files: 
md "C:\Program Files\MongoDB\log" 
In the Command Prompt, create a configuration file for the logpath option for MongoDB: 
echo logpath="C:\Program Files\MongoDB\log\mongo.log" > "C:\Program Files\MongoDB\mongod.cfg" 
Step 2: Run the MongoDB service. Run all of the following commands in Command Prompt with “Administrative 
Privileges:” 
Install the MongoDB service. For --install to succeed, you must specify the logpath run-time option. 
"C:\Program Files\MongoDB\bin\mongod.exe" --config "C:\Program Files\MongoDB\mongod.cfg" --install 
21https://jira.mongodb.org/browse/SERVER-13515 
21https://jira.mongodb.org/browse/SERVER-13515 
Modify the path to the mongod.cfg file as needed. 
To use an alternate dbpath, specify the path in the configuration file (e.g. C:\Program 
Files\MongoDB\mongod.cfg) or on the command line with the --dbpath option. 
If the dbpath directory does not exist, mongod.exe will not start. The default value for dbpath is \data\db. 
If needed, you can install services for multiple instances of mongod.exe or mongos.exe. Install each service with 
a unique --serviceName and --serviceDisplayName. Use multiple instances only when sufficient system 
resources exist and your system design requires it. 
Step 3: Stop or remove the MongoDB service as needed. To stop the MongoDB service use the following command: 
net stop MongoDB 
To remove the MongoDB service use the following command: 
"C:\Program Files\MongoDB\bin\mongod.exe" --remove 
Manually Create a Windows Service for MongoDB Enterprise 
The following procedure assumes you have installed MongoDB using the MSI installer, with the default path 
C:\Program Files\MongoDB 2.6 Enterprise. 
If you have installed in an alternative directory, you will need to adjust the paths as appropriate. 
Step 1: Open an Administrator command prompt. Press Win + R, then type cmd, then press Ctrl + Shift 
+ Enter. 
Execute the remaining steps from the Administrator command prompt. 
Step 2: Create directories. Create directories for your database and log files: 
mkdir c:\data\db 
mkdir c:\data\log 
Step 3: Create a configuration file. Create a configuration file. This file can include any of the 
configuration options for mongod, but must include a valid setting for logpath: 
The following creates a configuration file, specifying both the logpath and the dbpath settings in the configuration 
file: 
echo logpath=c:\data\log\mongod.log> "C:\Program Files\MongoDB 2.6 Enterprise\mongod.cfg" 
echo dbpath=c:\data\db>> "C:\Program Files\MongoDB 2.6 Enterprise\mongod.cfg" 
Step 4: Create the MongoDB service. Create the MongoDB service. 
sc.exe create MongoDB binPath= "\"C:\Program Files\MongoDB 2.6 Enterprise\bin\mongod.exe\" --service --config=\"C:\Program Files\MongoDB 2.6 Enterprise\mongod.cfg\""
sc.exe requires a space between "=" and the configuration values (e.g. "binPath= "), and a "\" to escape double quotes. 
If successfully created, the following log message will display: 
[SC] CreateService SUCCESS 
Step 5: Start the MongoDB service. 
net start MongoDB 
Step 6: Stop or remove the MongoDB service as needed. To stop the MongoDB service, use the following command: 
net stop MongoDB 
To remove the MongoDB service, first stop the service and then run the following command: 
sc.exe delete MongoDB 
2.1.5 Verify Integrity of MongoDB Packages 
Overview 
The MongoDB release team digitally signs all software packages to certify that a particular MongoDB package is a 
valid and unaltered MongoDB release. 
Before installing MongoDB, you can validate packages using either a PGP signature or MD5 and SHA checksums 
of the MongoDB packages. The PGP signatures store an encrypted hash of the software package, which you can validate 
to ensure that the package you have is consistent with the official package release. MongoDB also publishes MD5 and 
SHA hashes of the official packages that you can use to confirm that you have a valid package. 
Considerations 
MongoDB signs each release branch with a different PGP key. 
The public .asc and .pub key files for each branch are available for download. For example, the 2.2 keys are 
available at the following URLs: 
https://www.mongodb.org/static/pgp/server-2.2.asc 
https://www.mongodb.org/static/pgp/server-2.2.pub 
Replace 2.2 with the appropriate release number to download the public key. Keys are available for all MongoDB 
releases beginning with 2.2. 
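Since the key URLs follow a fixed pattern, you can derive them for any release series, as in this sketch:

```shell
# Build the .asc and .pub key URLs for a given release series.
series=2.6
for ext in asc pub; do
  echo "https://www.mongodb.org/static/pgp/server-${series}.${ext}"
done
```

Substitute the release series you are installing for the value of `series`.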
Procedures 
Use PGP/GPG 
Step 1: Download the MongoDB installation file. Download the binaries from 
https://www.mongodb.org/downloads based on your environment. 
For example, to download the 2.6.0 release for OS X through the shell, type this command: 
curl -LO http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.6.0.tgz 
Step 2: Download the public signature file. 
curl -LO http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.6.0.tgz.sig 
Step 3: Download then import the key file. If you have not downloaded and imported the key file, enter these 
commands: 
curl -LO https://www.mongodb.org/static/pgp/server-2.6.asc 
gpg --import server-2.6.asc 
You should receive this message: 
gpg: key AAB2461C: public key "MongoDB 2.6 Release Signing Key <packaging@mongodb.com>" imported 
gpg: Total number processed: 1 
gpg: imported: 1 (RSA: 1) 
Step 4: Verify the MongoDB installation file. Type this command: 
gpg --verify mongodb-osx-x86_64-2.6.0.tgz.sig mongodb-osx-x86_64-2.6.0.tgz 
You should receive this message: 
gpg: Signature made Thu Mar 6 15:11:28 2014 EST using RSA key ID AAB2461C 
gpg: Good signature from "MongoDB 2.6 Release Signing Key <packaging@mongodb.com>" 
Download and import the key file, as described above, if you receive a message like this one: 
gpg: Signature made Thu Mar 6 15:11:28 2014 EST using RSA key ID AAB2461C 
gpg: Can't check signature: public key not found 
gpg will return the following message if the package is properly signed, but you do not currently trust the signing key 
in your local trustdb. 
gpg: WARNING: This key is not certified with a trusted signature! 
gpg: There is no indication that the signature belongs to the owner. 
Primary key fingerprint: DFFA 3DCF 326E 302C 4787 673A 01C4 E7FA AAB2 461C 
Use SHA 
MongoDB provides checksums using both the SHA-1 and SHA-256 hash functions. You can use either. 
Step 1: Download the MongoDB installation file. Download the binaries from 
https://www.mongodb.org/downloads based on your environment. 
For example, to download the 2.6.3 release for OS X through the shell, type this command: 
curl -LO http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.6.3.tgz 
Step 2: Download the SHA1 and SHA256 file. 
curl -LO http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.6.3.tgz.sha1 
curl -LO http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.6.3.tgz.sha256 
Step 3: Use the SHA-1 checksum to verify the MongoDB package file. Compute the checksum of the package 
file: 
shasum mongodb-osx-x86_64-2.6.3.tgz 
which will generate this result: 
fe511ee40428edda3a507f70d2b91d16b0483674 mongodb-osx-x86_64-2.6.3.tgz 
Enter this command: 
cat mongodb-osx-x86_64-2.6.3.tgz.sha1 
which will generate this result: 
fe511ee40428edda3a507f70d2b91d16b0483674 mongodb-osx-x86_64-2.6.3.tgz 
The output of the shasum and cat commands should be identical. 
Step 4: Use the SHA-256 checksum to verify the MongoDB package file. Compute the checksum of the package 
file: 
shasum -a 256 mongodb-osx-x86_64-2.6.3.tgz 
which will generate this result: 
be3a5e9f4e9c8e954e9af7053776732387d2841a019185eaf2e52086d4d207a3 mongodb-osx-x86_64-2.6.3.tgz 
Enter this command: 
cat mongodb-osx-x86_64-2.6.3.tgz.sha256 
which will generate this result: 
be3a5e9f4e9c8e954e9af7053776732387d2841a019185eaf2e52086d4d207a3 mongodb-osx-x86_64-2.6.3.tgz 
The output of the shasum and cat commands should be identical. 
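The compute-and-compare cycle can be exercised end to end with a small sample file standing in for the MongoDB archive, as in this sketch. When the checksum file includes the file name, `sha256sum -c` automates the comparison for you:

```shell
# Create a sample file standing in for the downloaded .tgz archive.
f=$(mktemp)
printf 'sample package contents' > "$f"
# Record its checksum, then verify the file against that record.
sha256sum "$f" > "$f.sha256"
sha256sum -c "$f.sha256"
```

If the published checksum file contains only the bare hash, compare the `shasum` and `cat` output by eye as described above.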
Use MD5 
Step 1: Download the MongoDB installation file. Download the binaries from 
https://www.mongodb.org/downloads based on your environment. 
For example, to download the 2.6.0 release for OS X through the shell, type this command: 
curl -LO http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.6.0.tgz 
Step 2: Download the MD5 file. 
curl -LO http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.6.0.tgz.md5 
Step 3: Verify the checksum values for the MongoDB package file (OS X). Compute the checksum of the package 
file: 
md5 mongodb-osx-x86_64-2.6.0.tgz 
which will generate this result: 
MD5 (mongodb-osx-x86_64-2.6.0.tgz) = a937d49881f90e1a024b58d642011dc4 
Enter this command: 
cat mongodb-osx-x86_64-2.6.0.tgz.md5 
which will generate this result: 
a937d49881f90e1a024b58d642011dc4 
The output of the md5 and cat commands should be identical. 
Step 4: Verify the MongoDB installation file (Linux). Compute the checksum of the package file: 
md5sum -c mongodb-osx-x86_64-2.6.0.tgz.md5 mongodb-osx-x86_64-2.6.0.tgz 
which will generate this result: 
mongodb-osx-x86_64-2.6.0.tgz: OK 
2.2 First Steps with MongoDB 
After you have installed MongoDB, consider the following documents as you begin to learn about MongoDB: 
Getting Started with MongoDB (page 43) An introduction to the basic operation and use of MongoDB. 
Generate Test Data (page 47) To support initial exploration, generate test data to facilitate testing. 
2.2.1 Getting Started with MongoDB 
This tutorial provides an introduction to basic database operations using the mongo shell. mongo is a part of the 
standard MongoDB distribution and provides a full JavaScript environment with complete access to the JavaScript 
language and all standard functions as well as a full database interface for MongoDB. See the mongo JavaScript API22 
documentation and the mongo shell JavaScript Method Reference. 
The tutorial assumes that you’re running MongoDB on a Linux or OS X operating system and that you have a running 
database server; MongoDB does support Windows and provides a Windows distribution with identical operation. 
For instructions on installing MongoDB and starting the database server, see the appropriate installation (page 5) 
document. 
Connect to a Database 
In this section, you connect to the database server, which runs as mongod, and begin using the mongo shell to select 
a logical database within the database instance and access the help text in the mongo shell. 
Connect to a mongod 
From a system prompt, start mongo by issuing the mongo command, as follows: 
mongo 
22http://api.mongodb.org/js 
By default, mongo looks for a database server listening on port 27017 on the localhost interface. To connect to 
a server on a different port or interface, use the --port and --host options. 
Select a Database 
After starting the mongo shell your session will use the test database by default. At any time, issue the following 
operation at the mongo shell prompt to report the name of the current database: 
db 
1. From the mongo shell, display the list of databases, with the following operation: 
show dbs 
2. Switch to a new database named mydb, with the following operation: 
use mydb 
3. Confirm that your session has the mydb database as context, by checking the value of the db object, which 
returns the name of the current database, as follows: 
db 
At this point, if you issue the show dbs operation again, it will not include the mydb database. MongoDB 
will not permanently create a database until you insert data into that database. The Create a Collection and 
Insert Documents (page 44) section describes the process for inserting data. 
New in version 2.4: show databases also returns a list of databases. 
Display mongo Help 
At any point, you can access help for the mongo shell using the following operation: 
help 
Furthermore, you can append the .help() method to some JavaScript methods, to any cursor object, and to the db 
and db.collection objects to return additional help information. 
Create a Collection and Insert Documents 
In this section, you insert documents into a new collection named testData within the new database named mydb. 
MongoDB will create a collection implicitly upon its first use. You do not need to create a collection before inserting 
data. Furthermore, because MongoDB uses dynamic schemas (page 688), you also need not specify the structure of 
your documents before inserting them into the collection. 
1. From the mongo shell, confirm you are in the mydb database by issuing the following: 
db 
2. If mongo does not return mydb for the previous operation, set the context to the mydb database, with the 
following operation: 
use mydb 
3. Create two documents named j and k by using the following sequence of JavaScript operations: 
j = { name : "mongo" } 
k = { x : 3 } 
4. Insert the j and k documents into the testData collection with the following sequence of operations: 
db.testData.insert( j ) 
db.testData.insert( k ) 
When you insert the first document, the mongod will create both the mydb database and the testData 
collection. 
5. Confirm that the testData collection exists. Issue the following operation: 
show collections 
The mongo shell will return the list of the collections in the current (i.e. mydb) database. At this point, the only 
collection is testData. All mongod databases also have a system.indexes (page 271) collection. 
6. Confirm that the documents exist in the testData collection by issuing a query on the collection using the 
find() method: 
db.testData.find() 
This operation returns the following results. The ObjectId (page 165) values will be unique: 
{ "_id" : ObjectId("4c2209f9f3924d31102bd84a"), "name" : "mongo" } 
{ "_id" : ObjectId("4c2209fef3924d31102bd84b"), "x" : 3 } 
All MongoDB documents must have an _id field with a unique value. These operations do not explicitly 
specify a value for the _id field, so mongo creates a unique ObjectId (page 165) value for the field before 
inserting it into the collection. 
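The embedded structure of these values can be seen with a few lines of plain JavaScript. The sketch below (parseObjectIdTimestamp is our own helper name, not a mongo shell method) extracts the creation time that an ObjectId stores in its first four bytes:

```javascript
// Extract the creation Date embedded in a 24-hex-character ObjectId string.
// The first 8 hex digits (4 bytes) encode seconds since the Unix epoch.
function parseObjectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("not a valid ObjectId hex string: " + hex);
  }
  var seconds = parseInt(hex.substring(0, 8), 16);
  return new Date(seconds * 1000);
}

// The ObjectId from the example output above was generated in mid-2010:
console.log(parseObjectIdTimestamp("4c2209f9f3924d31102bd84a").toISOString());
```

Because the timestamp occupies the leading bytes, ObjectId values generated in sequence sort roughly by creation time.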
Insert Documents using a For Loop or a JavaScript Function 
To perform the remaining procedures in this tutorial, first add more documents to your database using one or both of 
the procedures described in Generate Test Data (page 47). 
Working with the Cursor 
When you query a collection, MongoDB returns a “cursor” object that contains the results of the query. The mongo 
shell then iterates over the cursor to display the results. Rather than returning all results at once, the shell iterates over 
the cursor 20 times to display the first 20 results and then waits for a request to iterate over the remaining results. In 
the shell, type it to iterate over the next set of results. 
The procedures in this section show other ways to work with a cursor. For comprehensive documentation on cursors, 
see crud-read-cursor. 
Iterate over the Cursor with a Loop 
Before using this procedure, add documents to a collection using one of the procedures in Generate Test Data 
(page 47). You can name your database and collections anything you choose, but this procedure assumes a 
database named test and a collection named testData. 
1. In the MongoDB JavaScript shell, query the testData collection and assign the resulting cursor object to the 
c variable: 
var c = db.testData.find() 
2. Print the full result set by using a while loop to iterate over the c variable: 
while ( c.hasNext() ) printjson( c.next() ) 
The hasNext() function returns true if the cursor has documents. The next() method returns the next 
document. The printjson() method renders the document in a JSON-like format. 
The operation displays all documents: 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be6"), "x" : 1 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be7"), "x" : 2 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be8"), "x" : 3 } 
... 
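The hasNext()/next() protocol itself is easy to model outside the shell. The following self-contained JavaScript sketch (ArrayCursor is a made-up stand-in; a real shell cursor is backed by server batches rather than an in-memory array) mimics the while loop above:

```javascript
// A minimal in-memory stand-in for a shell cursor: hasNext()/next() walk
// an array of documents, mirroring the while-loop idiom shown above.
function ArrayCursor(docs) {
  this.docs = docs;
  this.pos = 0;
}
ArrayCursor.prototype.hasNext = function () {
  return this.pos < this.docs.length;
};
ArrayCursor.prototype.next = function () {
  if (!this.hasNext()) throw new Error("cursor exhausted");
  return this.docs[this.pos++];
};

var c = new ArrayCursor([{ x: 1 }, { x: 2 }, { x: 3 }]);
var seen = [];
while (c.hasNext()) seen.push(c.next().x);
console.log(seen); // [ 1, 2, 3 ] -- one element per document printed
```

As with a real cursor, calling next() past the end raises an error, which is why the loop guards every call with hasNext().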
Use Array Operations with the Cursor 
The following procedure lets you manipulate a cursor object as if it were an array: 
1. In the mongo shell, query the testData collection and assign the resulting cursor object to the c variable: 
var c = db.testData.find() 
2. To find the document at the array index 4, use the following operation: 
printjson( c [ 4 ] ) 
MongoDB returns the following: 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bea"), "x" : 5 } 
When you access documents in a cursor using the array index notation, mongo first calls the 
cursor.toArray() method and loads into RAM all documents returned by the cursor. The index is then 
applied to the resulting array. This operation iterates the cursor completely and exhausts the cursor. 
For very large result sets, mongo may run out of available memory. 
For more information on the cursor, see crud-read-cursor. 
Query for Specific Documents 
MongoDB has a rich query system that allows you to select and filter the documents in a collection along specific 
fields and values. See Query Documents (page 87) and Read Operations (page 55) for a full account of queries in 
MongoDB. 
In this procedure, you query for specific documents in the testData collection by passing a “query document” as a 
parameter to the find() method. A query document specifies the criteria the query must match to return a document. 
In the mongo shell, query for all documents where the x field has a value of 18 by passing the { x : 18 } query 
document as a parameter to the find() method: 
db.testData.find( { x : 18 } ) 
MongoDB returns one document that matches these criteria: 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf7"), "x" : 18 } 
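To make the matching semantics concrete, the following standalone JavaScript sketch (matchesQuery is a hypothetical helper, not a MongoDB API) applies a top-level equality query document to an array of documents:

```javascript
// Return true if every field in the query document equals the corresponding
// field in the candidate document. This models top-level equality matching
// only; real MongoDB queries also support operators such as $gt, arrays,
// and dotted paths into nested documents.
function matchesQuery(query, doc) {
  return Object.keys(query).every(function (field) {
    return doc[field] === query[field];
  });
}

var testData = [{ x: 17 }, { x: 18 }, { x: 19 }];
var results = testData.filter(function (doc) {
  return matchesQuery({ x: 18 }, doc);
});
console.log(results); // [ { x: 18 } ]
```

An empty query document matches every document, which is why find() with no arguments returns the whole collection.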
Return a Single Document from a Collection 
With the findOne() method you can return a single document from a MongoDB collection. The findOne() 
method takes the same parameters as find(), but returns a document rather than a cursor. 
To retrieve one document from the testData collection, issue the following command: 
db.testData.findOne() 
For more information on querying for documents, see the Query Documents (page 87) and Read Operations (page 55) 
documentation. 
Limit the Number of Documents in the Result Set 
To increase performance, you can constrain the size of the result by limiting the amount of data your application must 
receive over the network. 
To specify the maximum number of documents in the result set, call the limit() method on a cursor, as in the 
following command: 
db.testData.find().limit(3) 
MongoDB will return the following result, with different ObjectId (page 165) values: 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be6"), "x" : 1 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be7"), "x" : 2 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be8"), "x" : 3 } 
Next Steps with MongoDB 
For more information on manipulating the documents in a database as you continue to learn MongoDB, consider the 
following resources: 
• MongoDB CRUD Operations (page 51) 
• SQL to MongoDB Mapping Chart (page 120) 
• http://docs.mongodb.org/manual/applications/drivers 
2.2.2 Generate Test Data 
This tutorial describes how to quickly generate test data for exercising basic MongoDB operations. 
Insert Multiple Documents Using a For Loop 
You can add documents to a new or existing collection by using a JavaScript for loop run from the mongo shell. 
1. From the mongo shell, insert new documents into the testData collection using the following for loop. If 
the testData collection does not exist, MongoDB creates the collection implicitly. 
for (var i = 1; i <= 25; i++) db.testData.insert( { x : i } ) 
2. Use find() to query the collection: 
db.testData.find() 
The mongo shell displays the first 20 documents in the collection. Your ObjectId (page 165) values will be 
different: 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be6"), "x" : 1 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be7"), "x" : 2 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be8"), "x" : 3 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be9"), "x" : 4 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bea"), "x" : 5 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990beb"), "x" : 6 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bec"), "x" : 7 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bed"), "x" : 8 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bee"), "x" : 9 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bef"), "x" : 10 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf0"), "x" : 11 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf1"), "x" : 12 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf2"), "x" : 13 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf3"), "x" : 14 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf4"), "x" : 15 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf5"), "x" : 16 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf6"), "x" : 17 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf7"), "x" : 18 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf8"), "x" : 19 } 
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf9"), "x" : 20 } 
3. The find() method returns a cursor. To iterate the cursor and return more documents, use the it operation in the 
mongo shell. The mongo shell will exhaust the cursor and return the following documents: 
{ "_id" : ObjectId("51a7dce92cacf40b79990bfc"), "x" : 21 } 
{ "_id" : ObjectId("51a7dce92cacf40b79990bfd"), "x" : 22 } 
{ "_id" : ObjectId("51a7dce92cacf40b79990bfe"), "x" : 23 } 
{ "_id" : ObjectId("51a7dce92cacf40b79990bff"), "x" : 24 } 
{ "_id" : ObjectId("51a7dce92cacf40b79990c00"), "x" : 25 } 
Insert Multiple Documents with a mongo Shell Function 
You can create a JavaScript function in your shell session to generate the above data. The insertData() JavaScript 
function, shown here, creates new data for use in testing or training by either creating a new collection or appending 
data to an existing collection: 
function insertData(dbName, colName, num) { 
    var col = db.getSiblingDB(dbName).getCollection(colName); 
    for (var i = 0; i < num; i++) { 
        col.insert( { x : i } ); 
    } 
    print(col.count()); 
} 
The insertData() function takes three parameters: a database, a new or existing collection, and the number of 
documents to create. The function creates documents with an x field that is set to an incremented integer, as in the 
following example documents: 
{ "_id" : ObjectId("51a4da9b292904caffcff6eb"), "x" : 0 } 
{ "_id" : ObjectId("51a4da9b292904caffcff6ec"), "x" : 1 } 
{ "_id" : ObjectId("51a4da9b292904caffcff6ed"), "x" : 2 } 
Store the function in your .mongorc.js file. The mongo shell loads the function for you every time you start a session. 
Example 
Specify database name, collection name, and the number of documents to insert as arguments to insertData(). 
insertData("test", "testData", 400) 
This operation inserts 400 documents into the testData collection in the test database. If the collection and 
database do not exist, MongoDB creates them implicitly before inserting documents. 
See also: 
MongoDB CRUD Concepts (page 53) and Data Models (page 131). 
CHAPTER 3 
MongoDB CRUD Operations 
MongoDB provides rich semantics for reading and manipulating data. CRUD stands for create, read, update, and 
delete. These terms are the foundation for all interactions with the database. 
MongoDB CRUD Introduction (page 51) An introduction to the MongoDB data model as well as queries and data 
manipulations. 
MongoDB CRUD Concepts (page 53) The core documentation of query and data manipulation. 
MongoDB CRUD Tutorials (page 84) Examples of basic query and data modification operations. 
MongoDB CRUD Reference (page 117) Reference material for the query and data manipulation interfaces. 
3.1 MongoDB CRUD Introduction 
MongoDB stores data in the form of documents, which are JSON-like field and value pairs. Documents are analogous 
to structures in programming languages that associate keys with values (e.g. dictionaries, hashes, maps, and associative 
arrays). Formally, MongoDB documents are BSON documents. BSON is a binary representation of JSON with 
additional type information. In the documents, the value of a field can be any of the BSON data types, including other 
documents, arrays, and arrays of documents. For more information, see Documents (page 158). 
Figure 3.1: A MongoDB document. 
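As an illustration, a single JavaScript object can serve as such a document, nesting another document and an array of documents (the field names below are our own invention):

```javascript
// A JSON-like document whose values include an embedded document ("name")
// and an array of documents ("contribs") -- both legal BSON value types.
var person = {
  _id: 1,
  name: { first: "Ada", last: "Lovelace" },   // embedded document
  birthYear: 1815,
  contribs: [                                 // array of documents
    { title: "Notes on the Analytical Engine", year: 1843 }
  ]
};

console.log(person.name.last);        // "Lovelace"
console.log(person.contribs[0].year); // 1843
```

The same dot and bracket access paths used here carry over to MongoDB's dotted field notation for querying nested values.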
MongoDB stores all documents in collections. A collection is a group of related documents that share a common set 
of indexes. Collections are analogous to tables in relational databases. 
Figure 3.2: A collection of MongoDB documents. 
3.1.1 Database Operations 
Query 
In MongoDB a query targets a specific collection of documents. Queries specify criteria, or conditions, that identify 
the documents that MongoDB returns to the clients. A query may include a projection that specifies the fields from 
the matching documents to return. You can optionally modify queries to impose limits, skips, and sort orders. 
In the following diagram, the query process specifies a query criteria and a sort modifier: 
See Read Operations Overview (page 55) for more information. 
Data Modification 
Data modification refers to operations that create, update, or delete data. In MongoDB, these operations modify the 
data of a single collection. For the update and delete operations, you can specify the criteria to select the documents 
to update or remove. 
In the following diagram, the insert operation adds a new document to the users collection. 
See Write Operations Overview (page 68) for more information. 
3.1.2 Related Features 
Indexes 
To enhance the performance of common queries and updates, MongoDB has full support for secondary indexes. These 
indexes allow applications to store a view of a portion of the collection in an efficient data structure. Most indexes store 
an ordered representation of all values of a field or a group of fields. Indexes may also enforce uniqueness (page 457), 
store objects in a geospatial representation (page 444), and facilitate text search (page 454). 
Figure 3.3: The stages of a MongoDB query with a query criteria and a sort modifier. 
Replica Set Read Preference 
For replica sets and sharded clusters with replica set components, applications specify read preferences (page 530). A 
read preference determines how the client directs read operations to the set. 
Write Concern 
Applications can also control the behavior of write operations using write concern (page 72). Particularly useful 
for deployments with replica sets, the write concern semantics allow clients to specify the assurance that MongoDB 
provides when reporting on the success of a write operation. 
Aggregation 
In addition to the basic queries, MongoDB provides several data aggregation features. For example, MongoDB can 
return counts of the number of documents that match a query, or return the number of distinct values for a field, or 
process a collection of documents using a versatile stage-based data processing pipeline or map-reduce operations. 
3.2 MongoDB CRUD Concepts 
The Read Operations (page 55) and Write Operations (page 67) documents introduce the behavior and operations of 
read and write operations for MongoDB deployments. 
Read Operations (page 55) Introduces all operations that select and return documents to clients, including the query 
specifications. 
Cursors (page 59) Queries return iterable objects, called cursors, that hold the full result set. 
Query Optimization (page 60) Analyze and improve query performance. 
Figure 3.4: The stages of a MongoDB insert operation. 
Distributed Queries (page 63) Describes how sharded clusters and replica sets affect the performance of read 
operations. 
Write Operations (page 67) Introduces data create and modify operations, their behavior, and performance. 
Write Concern (page 72) Describes the kind of guarantee MongoDB provides when reporting on the success 
of a write operation. 
Distributed Write Operations (page 76) Describes how MongoDB directs write operations on sharded clusters 
and replica sets and the performance characteristics of these operations. 
Continue reading from Write Operations (page 67) for additional background on the behavior of data modification 
operations in MongoDB. 
3.2.1 Read Operations 
The following documents describe read operations: 
Read Operations Overview (page 55) A high level overview of queries and projections in MongoDB, including a 
discussion of syntax and behavior. 
Cursors (page 59) Queries return iterable objects, called cursors, that hold the full result set. 
Query Optimization (page 60) Analyze and improve query performance. 
Query Plans (page 61) MongoDB executes queries using optimal plans. 
Distributed Queries (page 63) Describes how sharded clusters and replica sets affect the performance of read operations. 
Read Operations Overview 
Read operations, or queries, retrieve data stored in the database. In MongoDB, queries select documents from a single 
collection. 
Queries specify criteria, or conditions, that identify the documents that MongoDB returns to the clients. A query may 
include a projection that specifies the fields from the matching documents to return. The projection limits the amount 
of data that MongoDB returns to the client over the network. 
Query Interface 
For query operations, MongoDB provides a db.collection.find() method. The method accepts both the 
query criteria and projections and returns a cursor (page 59) to the matching documents. You can optionally modify 
the query to impose limits, skips, and sort orders. 
The following diagram highlights the components of a MongoDB query operation: 
Figure 3.5: The components of a MongoDB find operation. 
The next diagram shows the same query in SQL: 
Figure 3.6: The components of a SQL SELECT statement. 
Example 
db.users.find( { age: { $gt: 18 } }, { name: 1, address: 1 } ).limit(5) 
This query selects the documents in the users collection that match the condition that age is greater than 18. To specify 
the greater-than condition, the query criteria use the greater than (i.e. $gt) query selection operator. The query returns 
at most 5 matching documents (or, more precisely, a cursor to those documents). The matching documents will return 
with only the _id, name, and address fields. See Projections (page 57) for details. 
See 
SQL to MongoDB Mapping Chart (page 120) for additional examples of MongoDB queries and the corresponding 
SQL statements. 
Query Behavior 
MongoDB queries exhibit the following behavior: 
• All queries in MongoDB address a single collection. 
• You can modify the query to impose limits, skips, and sort orders. 
• The order of documents returned by a query is not defined unless you specify a sort(). 
• Operations that modify existing documents (page 98) (i.e. updates) use the same query syntax as queries to select 
documents to update. 
• In aggregation (page 391) pipeline, the $match pipeline stage provides access to MongoDB queries. 
MongoDB provides a db.collection.findOne() method as a special case of find() that returns a single 
document. 
Query Statements 
Consider the following diagram of the query process that specifies a query criteria and a sort modifier: 
In the diagram, the query selects documents from the users collection. Using a query selection operator 
to define the conditions for matching documents, the query selects documents that have age greater than (i.e. $gt) 
18. Then the sort() modifier sorts the results by age in ascending order. 
For additional examples of queries, see Query Documents (page 87). 
Figure 3.7: The stages of a MongoDB query with a query criteria and a sort modifier. 
Projections 
Queries in MongoDB return all fields in all matching documents by default. To limit the amount of data that MongoDB 
sends to applications, include a projection in the queries. By projecting results with a subset of fields, applications 
reduce their network overhead and processing requirements. 
Projections, which are the second argument to the find() method, may either specify a list of fields to return or list 
fields to exclude in the result documents. 
Important: Except for excluding the _id field in inclusive projections, you cannot mix exclusive and inclusive 
projections. 
Consider the following diagram of the query process that specifies a query criteria and a projection: 
In the diagram, the query selects from the users collection. The criteria matches the documents that have age equal 
to 18. Then the projection specifies that only the name field should return in the matching documents. 
Projection Examples 
Exclude One Field From a Result Set 
db.records.find( { "user_id": { $lt: 42 } }, { "history": 0 } ) 
This query selects documents in the records collection that match the condition { "user_id": { $lt: 42 
} }, and uses the projection { "history": 0 } to exclude the history field from the documents in the result 
set. 
Return Two Fields and the _id Field 
db.records.find( { "user_id": { $lt: 42 } }, { "name": 1, "email": 1 } ) 
Figure 3.8: The stages of a MongoDB query with a query criteria and projection. MongoDB only transmits the 
projected data to the clients. 
This query selects documents in the records collection that match the query { "user_id": { $lt: 42 } 
} and uses the projection { "name": 1, "email": 1 } to return just the _id field (implicitly included), 
name field, and the email field in the documents in the result set. 
Return Two Fields and Exclude _id 
db.records.find( { "user_id": { $lt: 42} }, { "_id": 0, "name": 1 , "email": 1 } ) 
This query selects documents in the records collection that match the query { "user_id": { $lt: 42} 
}, and only returns the name and email fields in the documents in the result set. 
See 
Limit Fields to Return from a Query (page 94) for more examples of queries with projection statements. 
Projection Behavior MongoDB projections have the following properties: 
• By default, the _id field is included in the results. To suppress the _id field from the result set, specify _id: 
0 in the projection document. 
• For fields that contain arrays, MongoDB provides the following projection operators: $elemMatch, $slice, 
and $. 
• For related projection functionality in the aggregation framework (page 391) pipeline, use the $project 
pipeline stage. 
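The top-level inclusion and exclusion rules above can be sketched in standalone JavaScript (applyProjection is a hypothetical helper, not shell API; it ignores nested fields, array operators, and the {_id: 0}-only edge case):

```javascript
// Apply a top-level projection document to one result document.
// { field: 1 } lists fields to include; { field: 0 } lists fields to
// exclude; _id is returned unless explicitly suppressed with _id: 0.
function applyProjection(doc, projection) {
  var fields = Object.keys(projection).filter(function (f) { return f !== "_id"; });
  var inclusive = fields.length === 0 || projection[fields[0]] === 1;
  var out = {};
  Object.keys(doc).forEach(function (f) {
    var keep;
    if (f === "_id") {
      keep = projection._id !== 0;          // _id kept unless suppressed
    } else if (inclusive) {
      keep = projection[f] === 1;           // inclusive: only listed fields
    } else {
      keep = projection[f] !== 0;           // exclusive: all but listed fields
    }
    if (keep) out[f] = doc[f];
  });
  return out;
}

var doc = { _id: 7, name: "ada", email: "a@x", history: [1, 2] };
console.log(applyProjection(doc, { name: 1, email: 1 })); // inclusive, _id kept
console.log(applyProjection(doc, { history: 0 }));        // exclusive
console.log(applyProjection(doc, { _id: 0, name: 1 }));   // inclusive, _id dropped
```

The three calls correspond to the three projection examples above: two fields plus _id, one excluded field, and two fields without _id.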
Cursors 
In the mongo shell, the primary method for the read operation is the db.collection.find() method. This 
method queries a collection and returns a cursor to the returning documents. 
To access the documents, you need to iterate the cursor. However, in the mongo shell, if the returned cursor is not 
assigned to a variable using the var keyword, then the cursor is automatically iterated up to 20 times to print up to 
the first 20 documents in the results. 
For example, in the mongo shell, the following read operation queries the inventory collection for documents that 
have type equal to 'food' and automatically prints up to the first 20 matching documents: 
db.inventory.find( { type: 'food' } ); 
To manually iterate the cursor to access the documents, see Iterate a Cursor in the mongo Shell (page 95). 
Cursor Behaviors 
Closure of Inactive Cursors By default, the server will automatically close the cursor after 10 minutes of inactivity 
or once the client has exhausted the cursor. To override this behavior, you can specify the noTimeout wire protocol flag 
in your query; however, you should then either close the cursor manually or exhaust it. In the mongo shell, you 
can set the noTimeout flag: 
var myCursor = db.inventory.find().addOption(DBQuery.Option.noTimeout); 
See your driver documentation for information on setting the noTimeout flag. For the mongo shell, see 
cursor.addOption() for a complete list of available cursor flags. 
Cursor Isolation Because the cursor is not isolated during its lifetime, intervening write operations on a document 
may result in a cursor that returns a document more than once if that document has changed. To handle this situation, 
see the information on snapshot mode (page 698). 
Cursor Batches The MongoDB server returns the query results in batches. Batch size will not exceed the maximum 
BSON document size. For most queries, the first batch returns 101 documents or just enough documents to exceed 1 
megabyte. Subsequent batch size is 4 megabytes. To override the default size of the batch, see batchSize() and 
limit(). 
For queries that include a sort operation without an index, the server must load all the documents in memory to perform 
the sort and will return all documents in the first batch. 
As you iterate through the cursor and reach the end of the returned batch, if there are more results, cursor.next() 
will perform a getmore operation to retrieve the next batch. To see how many documents remain in the batch 
as you iterate the cursor, you can use the objsLeftInBatch() method, as in the following example: 
var myCursor = db.inventory.find(); 
var myFirstDocument = myCursor.hasNext() ? myCursor.next() : null; 
myCursor.objsLeftInBatch(); 
Note: You can use DBQuery.shellBatchSize to change the number of documents iterated from the default value of 20. See Executing Queries 
(page 256) for more information. 
Note: The noTimeout flag is part of the MongoDB wire protocol; see http://docs.mongodb.org/meta-driver/latest/legacy/mongodb-wire-protocol. 
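The batch accounting can be simulated in standalone JavaScript. In the sketch below, BatchedCursor is a made-up class; it ignores the byte-size caps that real batches also obey, and fetches a fixed number of documents per getmore:

```javascript
// Simulate server-side batching: documents arrive in batches of `batchSize`;
// next() triggers a fetch ("getmore") whenever the current batch is drained.
function BatchedCursor(allDocs, batchSize) {
  this.remaining = allDocs.slice(); // documents still on the "server"
  this.batch = [];                  // documents already on the client
  this.batchSize = batchSize;
  this.getmores = 0;                // how many fetches have occurred
}
BatchedCursor.prototype.objsLeftInBatch = function () {
  return this.batch.length;
};
BatchedCursor.prototype.next = function () {
  if (this.batch.length === 0) {
    this.batch = this.remaining.splice(0, this.batchSize);
    this.getmores++;
  }
  return this.batch.shift();
};

var docs = [];
for (var i = 1; i <= 5; i++) docs.push({ x: i });
var cur = new BatchedCursor(docs, 2);
cur.next();                          // fetches the first batch of 2
console.log(cur.objsLeftInBatch());  // 1
```

Draining the batch with further next() calls drives the getmore counter up, mirroring how the shell transparently issues getmore operations as you iterate.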
Cursor Information 
The db.serverStatus() method returns a document that includes a metrics field. The metrics field contains 
a cursor field with the following information: 
• number of timed out cursors since the last server restart 
• number of open cursors with the option DBQuery.Option.noTimeout set to prevent timeout after a period 
of inactivity 
• number of “pinned” open cursors 
• total number of open cursors 
Consider the following example which calls the db.serverStatus() method and accesses the metrics field 
from the results and then the cursor field from the metrics field: 
db.serverStatus().metrics.cursor 
The result is the following document: 
{ 
   "timedOut" : <number>, 
   "open" : { 
      "noTimeout" : <number>, 
      "pinned" : <number>, 
      "total" : <number> 
   } 
} 
See also: 
db.serverStatus() 
Query Optimization 
Indexes improve the efficiency of read operations by reducing the amount of data that query operations need to process. 
This simplifies the work associated with fulfilling queries within MongoDB. 
Create an Index to Support Read Operations 
If your application queries a collection on a particular field or fields, then an index on the queried field or fields can 
prevent the query from scanning the whole collection to find and return the query results. For more information about 
indexes, see the complete documentation of indexes in MongoDB (page 436). 
Example 
An application queries the inventory collection on the type field. The value of the type field is user-driven. 
var typeValue = <someUserInput>; 
db.inventory.find( { type: typeValue } ); 
To improve the performance of this query, add an ascending, or a descending, index to the inventory collection 
on the type field. In the mongo shell, you can create indexes using the db.collection.ensureIndex() 
method: 
Note: For single-field indexes, the selection between ascending and descending order is immaterial. For compound indexes, the selection is important. 
See indexing order (page 441) for more details. 
db.inventory.ensureIndex( { type: 1 } ) 
This index can prevent the above query on type from scanning the whole collection to return the results. 
To analyze the performance of the query with an index, see Analyze Query Performance (page 97). 
In addition to optimizing read operations, indexes can support sort operations and allow for more efficient storage 
utilization. See db.collection.ensureIndex() and Indexing Tutorials (page 464) for more information about 
index creation. 
Query Selectivity 
Some query operations are not selective. These operations cannot use indexes effectively or cannot use indexes at all. 
The inequality operators $nin and $ne are not very selective, as they often match a large portion of the index. As a 
result, in most cases, a $nin or $ne query with an index may perform no better than a $nin or $ne query that must 
scan all documents in a collection. 
Queries that specify regular expressions, whether as inline JavaScript regular expressions or as $regex operator 
expressions, cannot use an index, with one exception: queries that specify a regular expression anchored at the 
beginning of a string can use an index. 
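The prefix-anchor exception holds because a pattern such as /^c/ is equivalent to a range scan over the index keys. A sketch of that bound computation in standalone JavaScript (prefixBounds is our own helper; real bound derivation must also consider case-insensitivity and multi-byte characters):

```javascript
// For an anchored prefix regex such as /^c/, an index on the field can be
// scanned over the half-open range [prefix, successor-of-prefix).
function prefixBounds(prefix) {
  var last = prefix.charCodeAt(prefix.length - 1);
  var upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { lower: prefix, upper: upper }; // keys where lower <= key < upper
}

var b = prefixBounds("c");
console.log(b); // { lower: 'c', upper: 'd' }
console.log("cheese" >= b.lower && "cheese" < b.upper); // true
console.log("dates" >= b.lower && "dates" < b.upper);   // false
```

A pattern without the ^ anchor, such as /c/, admits no such bounds, so the server must inspect every index key or document.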
Covering a Query 
An index covers (page 495) a query (a covered query) when: 
• all the fields in the query (page 87) are part of that index, and 
• all the fields returned in the documents that match the query are in the same index. 
For these queries, MongoDB does not need to inspect documents outside of the index. This is often more efficient 
than inspecting entire documents. 
Example 
Given a collection inventory with the following index on the type and item fields: 
{ type: 1, item: 1 } 
This index will cover the following query on the type and item fields, which returns only the item field: 
db.inventory.find( { type: "food", item:/^c/ }, 
{ item: 1, _id: 0 } ) 
However, the index will not cover the following query, which returns the item field and the _id field: 
db.inventory.find( { type: "food", item:/^c/ }, 
{ item: 1 } ) 
See Create Indexes that Support Covered Queries (page 495) for more information on the behavior and use of covered 
queries. 
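The covering rule can be stated as a simple membership check. The standalone JavaScript sketch below (isCovered is a hypothetical helper; the server's actual test also accounts for arrays and geospatial fields) applies it to the two examples above:

```javascript
// Decide whether an index with the given key fields covers a query:
// every filter field and every returned field must be in the index,
// and _id counts as returned unless the projection suppresses it.
function isCovered(indexFields, queryFields, projection) {
  var returned = Object.keys(projection).filter(function (f) {
    return projection[f] === 1;
  });
  if (projection._id !== 0) returned.push("_id");
  var inIndex = function (f) { return indexFields.indexOf(f) !== -1; };
  return queryFields.every(inIndex) && returned.every(inIndex);
}

var idx = ["type", "item"];
console.log(isCovered(idx, ["type", "item"], { item: 1, _id: 0 })); // true
console.log(isCovered(idx, ["type", "item"], { item: 1 }));         // false: _id not in index
```

The second call fails exactly as in the example: the implicitly returned _id field is not part of the { type: 1, item: 1 } index.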
Query Plans 
The MongoDB query optimizer processes queries and chooses the most efficient query plan for a query given the 
available indexes. The query system then uses this query plan each time the query runs. 
The query optimizer only caches the plans for those query shapes that can have more than one viable plan. 
The query optimizer occasionally reevaluates query plans as the content of the collection changes to ensure optimal 
query plans. You can also specify which indexes the optimizer evaluates with Index Filters (page 63). 
You can use the explain() method to view statistics about the query plan for a given query. This information can 
help as you develop indexing strategies (page 493). 
Query Optimization 
To create a new query plan, the query optimizer: 
1. runs the query against several candidate indexes in parallel. 
2. records the matches in a common results buffer or buffers. 
• If the candidate plans include only ordered query plans, there is a single common results buffer. 
• If the candidate plans include only unordered query plans, there is a single common results buffer. 
• If the candidate plans include both ordered query plans and unordered query plans, there are two common 
results buffers, one for the ordered plans and the other for the unordered plans. 
If an index returns a result already returned by another index, the optimizer skips the duplicate match. In the 
case of the two buffers, both buffers are de-duped. 
3. stops the testing of candidate plans and selects an index when one of the following events occurs: 
• An unordered query plan has returned all the matching results; or 
• An ordered query plan has returned all the matching results; or 
• An ordered query plan has returned a threshold number of matching results: 
– Version 2.0: Threshold is the query batch size. The default batch size is 101. 
– Version 2.2: Threshold is 101. 
The selected index becomes the index specified in the query plan; future iterations of this query or queries with the 
same query pattern will use this index. Query pattern refers to query select conditions that differ only in the values, as 
in the following two queries with the same query pattern: 
db.inventory.find( { type: 'food' } ) 
db.inventory.find( { type: 'utensil' } ) 
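The notion of a query pattern can be sketched in plain JavaScript. This is a hypothetical illustration, not MongoDB's internal representation: two filters share a pattern when they have the same keys and operators and differ only in values.

```javascript
// Hypothetical sketch: reduce a query filter to its "shape" by
// replacing every leaf value with a placeholder, so that queries
// differing only in values map to the same shape.
function queryShape(filter) {
  if (filter !== null && typeof filter === "object" && !Array.isArray(filter)) {
    const shape = {};
    // Sort keys so field order does not affect the shape string.
    for (const key of Object.keys(filter).sort()) {
      shape[key] = queryShape(filter[key]);
    }
    return shape;
  }
  return "?"; // leaf value (string, number, array, ...) becomes a placeholder
}

const a = JSON.stringify(queryShape({ type: "food" }));
const b = JSON.stringify(queryShape({ type: "utensil" }));
console.log(a === b); // both shapes are {"type":"?"} → true
```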
Query Plan Revision 
As collections change over time, the query optimizer deletes the query plan and re-evaluates it after any of the following 
events: 
• The collection receives 1,000 write operations. 
• The reIndex command rebuilds the index. 
• You add or drop an index. 
• The mongod process restarts. 
Cached Query Plan Interface 
New in version 2.6. 
MongoDB provides the plan cache methods (http://docs.mongodb.org/manual/reference/method/js-plan-cache) to 
view and modify the cached query plans. 
Index Filters 
New in version 2.6. 
Index filters determine which indexes the optimizer evaluates for a query shape. A query shape consists of a combination 
of query, sort, and projection specifications. If an index filter exists for a given query shape, the optimizer only 
considers those indexes specified in the filter. 
When an index filter exists for the query shape, MongoDB ignores the hint(). To see whether MongoDB applied 
an index filter for a query, check the explain.filterSet field of the explain() output. 
Index filters only affect which indexes the optimizer evaluates; the optimizer may still select the collection scan as 
the winning plan for a given query shape. 
Index filters exist for the duration of the server process and do not persist after shutdown. MongoDB also provides a 
command to manually remove filters. 
Because index filters override the expected behavior of the optimizer as well as the hint() method, use index filters 
sparingly. 
See planCacheListFilters, planCacheClearFilters, and planCacheSetFilter. 
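As a sketch of the commands involved, the following sets an index filter on the inventory collection for one query shape, lists the filters, and clears them again; the collection and field names are illustrative:

```javascript
// Restrict the optimizer to the { type: 1, item: 1 } index for the
// query shape { type: <value> }.
db.runCommand( {
   planCacheSetFilter: "inventory",
   query: { type: "food" },
   indexes: [ { type: 1, item: 1 } ]
} )

// List the index filters currently set on the collection.
db.runCommand( { planCacheListFilters: "inventory" } )

// Remove the filter for that query shape.
db.runCommand( { planCacheClearFilters: "inventory", query: { type: "food" } } )
```

Because filters do not persist across a server restart, they are typically set by an administrative script rather than by application code.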
Distributed Queries 
Read Operations to Sharded Clusters 
Sharded clusters allow you to partition a data set among a cluster of mongod instances in a way that is nearly transparent 
to the application. For an overview of sharded clusters, see the Sharding (page 607) section of this manual. 
For a sharded cluster, applications issue operations to one of the mongos instances associated with the cluster. 
Read operations on sharded clusters are most efficient when directed to a specific shard. Queries to sharded collections 
should include the collection’s shard key (page 620). When a query includes a shard key, the mongos can use cluster 
metadata from the config database (page 616) to route the queries to shards. 
If a query does not include the shard key, the mongos must direct the query to all shards in the cluster. These scatter-gather 
queries can be inefficient. On larger clusters, scatter-gather queries are infeasible for routine operations. 
For more information on read operations in sharded clusters, see the Sharded Cluster Query Routing (page 624) and 
Shard Keys (page 620) sections. 
Read Operations to Replica Sets 
Replica sets use read preferences to determine where and how to route read operations to members of the replica set. 
By default, MongoDB always reads data from a replica set’s primary. You can modify that behavior by changing the 
read preference mode (page 603). 
You can configure the read preference mode (page 603) on a per-connection or per-operation basis to allow reads from 
secondaries to: 
Figure 3.9: Diagram of a sharded cluster. 
Figure 3.10: Read operations to a sharded cluster. Query criteria includes the shard key. The query router mongos 
can target the query to the appropriate shard or shards. 
Figure 3.11: Read operations to a sharded cluster. Query criteria does not include the shard key. The query router 
mongos must broadcast query to all shards for the collection. 
• reduce latency in multi-data-center deployments, 
• improve read throughput by distributing high read-volumes (relative to write volume), 
• perform backup operations, and/or 
• allow reads during failover (page 523) situations. 
Figure 3.12: Read operations to a replica set. Default read preference routes the read to the primary. Read preference 
of nearest routes the read to the nearest member. 
Read operations from secondary members of replica sets are not guaranteed to reflect the current state of the primary, 
and the state of secondaries will trail the primary by some amount of time. Often, applications don’t rely on this kind 
of strict consistency, but application developers should always consider the needs of their application before setting 
read preference. 
For more information on read preference or on the read preference modes, see Read Preference (page 530) and Read 
Preference Modes (page 603). 
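As a sketch of setting read preference in the mongo shell, the following shows the per-operation and per-connection forms; the collection and mode values here are illustrative:

```javascript
// Per-operation: route this query to the nearest replica set member,
// whether primary or secondary.
db.users.find( { status: "A" } ).readPref( "nearest" )

// Per-connection: prefer secondaries for subsequent read operations
// on this connection.
db.getMongo().setReadPref( "secondaryPreferred" )
```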
3.2.2 Write Operations 
The following documents describe write operations: 
Write Operations Overview (page 68) Provides an overview of MongoDB’s data insertion and modification operations, 
including aspects of their syntax and behavior. 
Write Concern (page 72) Describes the kind of guarantee MongoDB provides when reporting on the success of a 
write operation. 
Distributed Write Operations (page 76) Describes how MongoDB directs write operations on sharded clusters and 
replica sets and the performance characteristics of these operations. 
Write Operation Performance (page 77) Introduces the performance constraints and factors for writing data to MongoDB 
deployments. 
Bulk Inserts in MongoDB (page 81) Describes behaviors associated with inserting an array of documents. 
Storage (page 82) Introduces the storage allocation strategies available for MongoDB collections. 
Write Operations Overview 
A write operation is any operation that creates or modifies data in the MongoDB instance. In MongoDB, write 
operations target a single collection. All write operations in MongoDB are atomic on the level of a single document. 
There are three classes of write operations in MongoDB: insert (page 68), update (page 69), and remove (page 70). 
Insert operations add new data to a collection. Update operations modify existing data, and remove operations delete 
data from a collection. No insert, update, or remove can affect more than one document atomically. 
For the update and remove operations, you can specify criteria, or conditions, that identify the documents to update or 
remove. These operations use the same query syntax to specify the criteria as read operations (page 55). 
MongoDB allows applications to determine the acceptable level of acknowledgement required of write operations. 
See Write Concern (page 72) for more information. 
Insert 
In MongoDB, the db.collection.insert() method adds new documents to a collection. 
The following diagram highlights the components of a MongoDB insert operation: 
Figure 3.13: The components of a MongoDB insert operation. 
The following diagram shows the same query in SQL: 
Example 
The following operation inserts a new document into the users collection. The new document has four fields: the name, 
age, and status fields, and an _id field. MongoDB always adds the _id field to the new document if that field does not 
exist. 
db.users.insert( 
{ 
name: "sue", 
age: 26, 
Figure 3.14: The components of a SQL INSERT statement. 
status: "A" 
} 
) 
For more information and examples, see db.collection.insert(). 
Insert Behavior If you add a new document without the _id field, the client library or the mongod instance adds an 
_id field and populates the field with a unique ObjectId. 
If you specify the _id field, the value must be unique within the collection. For operations with write concern 
(page 72), if you try to create a document with a duplicate _id value, mongod returns a duplicate key exception. 
Other Methods to Add Documents You can also add new documents to a collection using methods that have an 
upsert (page 70) option. If the option is set to true, these methods will either modify existing documents or add a 
new document when no matching documents exist for the query. For more information, see Update Behavior with the 
upsert Option (page 70). 
Update 
In MongoDB, the db.collection.update() method modifies existing documents in a collection. The 
db.collection.update() method can accept query criteria to determine which documents to update as well as 
an options document that affects its behavior, such as the multi option to update multiple documents. 
The following diagram highlights the components of a MongoDB update operation: 
Figure 3.15: The components of a MongoDB update operation. 
The following diagram shows the same query in SQL: 
Example 
Figure 3.16: The components of a SQL UPDATE statement. 
db.users.update( 
{ age: { $gt: 18 } }, 
{ $set: { status: "A" } }, 
{ multi: true } 
) 
This update operation on the users collection sets the status field to A for the documents that match the criteria 
of age greater than 18. 
For more information, see db.collection.update() and update() Examples. 
Default Update Behavior By default, the db.collection.update() method updates a single document. 
However, with the multi option, update() can update all documents in a collection that match a query. 
The db.collection.update() method either updates specific fields in the existing document or replaces the 
document. See db.collection.update() for details as well as examples. 
When performing update operations that increase the document size beyond the allocated space for that document, the 
update operation relocates the document on disk. 
MongoDB preserves the order of the document fields following write operations except for the following cases: 
• The _id field is always the first field in the document. 
• Updates that include renaming of field names may result in the reordering of fields in the document. 
Changed in version 2.6: Starting in version 2.6, MongoDB actively attempts to preserve the field order in a document. 
Before version 2.6, MongoDB did not actively preserve the order of the fields in a document. 
Update Behavior with the upsert Option If the update() method includes upsert: true and no documents 
match the query portion of the update operation, then the update operation creates a new document. If there are 
matching documents, then the update operation with the upsert: true modifies the matching document or documents. 
By specifying upsert: true, applications can indicate, in a single operation, that if no matching documents are found 
for the update, an insert should be performed. See update() for details on performing an upsert. 
Changed in version 2.6: In 2.6, the new Bulk() methods and the underlying update command allow you to perform 
many updates with upsert: true operations in a single call. 
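A minimal sketch of both forms in the 2.6 shell, using a hypothetical users collection:

```javascript
// upsert with update(): inserts a document for "amy" if no document
// matches { name: "amy" }; otherwise modifies the matching document.
db.users.update(
   { name: "amy" },
   { $set: { status: "A" } },
   { upsert: true }
)

// The same upsert expressed with the 2.6 Bulk API, which can batch
// many such operations into a single call.
var bulk = db.users.initializeUnorderedBulkOp();
bulk.find( { name: "amy" } ).upsert().updateOne( { $set: { status: "A" } } );
bulk.execute();
```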
Remove 
In MongoDB, the db.collection.remove() method deletes documents from a collection. The 
db.collection.remove() method accepts query criteria to determine which documents to remove. 
The following diagram highlights the components of a MongoDB remove operation: 
Figure 3.17: The components of a MongoDB remove operation. 
The following diagram shows the same query in SQL: 
Figure 3.18: The components of a SQL DELETE statement. 
Example 
db.users.remove( 
{ status: "D" } 
) 
This delete operation on the users collection removes all documents that match the criteria of status equal to D. 
For more information, see db.collection.remove() method and Remove Documents (page 101). 
Remove Behavior By default, the db.collection.remove() method removes all documents that match its query. 
However, the method can accept a flag that limits the delete operation to a single document. 
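A sketch of the single-document form, continuing the users example above; the second argument limits the remove to one matching document:

```javascript
// Remove at most one document with status "D".
db.users.remove( { status: "D" }, 1 )

// Equivalent form using the justOne option (2.6 shell).
db.users.remove( { status: "D" }, { justOne: true } )
```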
Isolation of Write Operations 
The modification of a single document is always atomic, even if the write operation modifies multiple sub-documents 
within that document. For write operations that modify multiple documents, the operation as a whole is not atomic, 
and other operations may interleave. 
No other operations are atomic. You can, however, attempt to isolate a write operation that affects multiple documents 
using the isolation operator. 
To isolate a sequence of write operations from other read and write operations, see Perform Two Phase Commits 
(page 102). 
Additional Methods 
The db.collection.save() method can either update an existing document or insert a new document if no 
existing document can be found by the _id field. See db.collection.save() for more information and examples. 
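A sketch of save() behavior, using a hypothetical products collection:

```javascript
// No _id field: behaves like an insert and generates an ObjectId.
db.products.save( { item: "water", qty: 30 } )

// With an _id: updates the document with _id 100 if it exists,
// otherwise inserts this document with _id 100.
db.products.save( { _id: 100, item: "soda", qty: 20 } )
```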
MongoDB also provides methods to perform write operations in bulk. See Bulk() for more information. 
Write Concern 
Write concern describes the guarantee that MongoDB provides when reporting on the success of a write operation. 
The strength of the write concern determines the level of guarantee. When inserts, updates, and deletes have a weak 
write concern, write operations return quickly. In some failure cases, write operations issued with weak write concerns 
may not persist. With stronger write concerns, clients wait after sending a write operation for MongoDB to confirm 
the write operations. 
MongoDB provides different levels of write concern to better address the specific needs of applications. Clients 
may adjust write concern to ensure that the most important operations persist successfully to an entire MongoDB 
deployment. For other less critical operations, clients can adjust the write concern to ensure faster performance rather 
than ensure persistence to the entire deployment. 
Changed in version 2.6: A new protocol for write operations (page 737) integrates write concern with the write 
operations. 
For details on write concern configurations, see Write Concern Reference (page 118). 
Considerations 
Default Write Concern The mongo shell and the MongoDB drivers use Acknowledged (page 73) as the default 
write concern. 
See Acknowledged (page 73) for more information, including when this write concern became the default. 
Read Isolation MongoDB allows clients to read documents inserted or modified before it commits these modifications 
to disk, regardless of write concern level or journaling configuration. As a result, applications may observe two 
classes of behaviors: 
• For systems with multiple concurrent readers and writers, MongoDB will allow clients to read the results of a 
write operation before the write operation returns. 
• If the mongod terminates before the journal commits, even if a write returns successfully, queries may have 
read data that will not exist after the mongod restarts. 
Other database systems refer to these isolation semantics as read uncommitted. For all inserts and updates, MongoDB 
modifies each document in isolation: clients never see documents in intermediate states. For multi-document 
operations, MongoDB does not provide any multi-document transactions or isolation. 
When mongod returns a successful journaled write concern, the data is fully committed to disk and will be available 
after mongod restarts. 
For replica sets, write operations are durable only after a write replicates and commits to the journal of a majority of 
the members of the set. MongoDB regularly commits data to the journal regardless of journaled write concern: use 
the commitIntervalMs setting to control how often a mongod commits the journal. 
Timeouts Clients can set a wtimeout (page 119) value as part of a replica acknowledged (page 75) write concern. If 
the write concern is not satisfied in the specified interval, the operation returns an error, even if the write concern will 
eventually succeed. 
MongoDB does not “rollback” or undo modifications made before the wtimeout interval expired. 
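A sketch of specifying a write concern, including a wtimeout, on a 2.6 insert; the collection, document, and values are illustrative:

```javascript
// Require acknowledgment from a majority of replica set members, but
// return an error if that is not satisfied within 5000 milliseconds.
db.users.insert(
   { name: "sue", status: "A" },
   { writeConcern: { w: "majority", wtimeout: 5000 } }
)
```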
Write Concern Levels 
MongoDB has the following levels of conceptual write concern, listed from weakest to strongest: 
Unacknowledged With an unacknowledged write concern, MongoDB does not acknowledge the receipt of write 
operations. Unacknowledged is similar to errors ignored; however, drivers will attempt to receive and handle network 
errors when possible. The driver’s ability to detect network errors depends on the system’s networking configuration. 
Before the releases outlined in Default Write Concern Change (page 808), this was the default write concern. 
Figure 3.19: Write operation to a mongod instance with write concern of unacknowledged. The client does not 
wait for any acknowledgment. 
Acknowledged With a receipt acknowledged write concern, the mongod confirms the receipt of the write operation. 
Acknowledged write concern allows clients to catch network, duplicate key, and other errors. 
MongoDB uses the acknowledged write concern by default starting in the driver releases outlined in Releases 
(page 808). 
Changed in version 2.6: The mongo shell write methods now incorporate the write concern (page 72) in the write 
methods and provide the default write concern whether run interactively or in a script. See Write Method Acknowledgements 
(page 743) for details. 
Journaled With a journaled write concern, MongoDB acknowledges the write operation only after committing 
the data to the journal. This write concern ensures that MongoDB can recover the data following a shutdown or power 
interruption. 
You must have journaling enabled to use this write concern. 
With a journaled write concern, write operations must wait for the next journal commit. To reduce latency for these operations, 
MongoDB also increases the frequency that it commits operations to the journal. See commitIntervalMs 
for more information. 
Note: Requiring journaled write concern in a replica set only requires a journal commit of the write operation to the 
primary of the set regardless of the level of replica acknowledged write concern. 
Figure 3.20: Write operation to a mongod instance with write concern of acknowledged. The client waits for 
acknowledgment of success or exception. 
Figure 3.21: Write operation to a mongod instance with write concern of journaled. The mongod sends acknowledgment 
after it commits the write operation to the journal. 
Replica Acknowledged Replica sets present additional considerations with regard to write concern. The default 
write concern only requires acknowledgement from the primary. 
With replica acknowledged write concern, you can guarantee that the write operation propagates to additional members 
of the replica set. See Write Concern for Replica Sets (page 528) for more information. 
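A sketch of a replica acknowledged write, requiring the write to propagate to the primary and at least one secondary; the collection and document are illustrative:

```javascript
// w: 2 → the primary and at least one secondary must acknowledge
// the write before the operation returns.
db.users.insert(
   { name: "ann", status: "A" },
   { writeConcern: { w: 2 } }
)
```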
Figure 3.22: Write operation to a replica set with write concern level of w:2 or write to the primary and at least one 
secondary. 
Note: Requiring journaled write concern in a replica set only requires a journal commit of the write operation to the 
primary of the set regardless of the level of replica acknowledged write concern. 
See also: 
Write Concern Reference (page 118) 
Distributed Write Operations 
Write Operations on Sharded Clusters 
For sharded collections in a sharded cluster, the mongos directs write operations from applications to the shards that 
are responsible for the specific portion of the data set. The mongos uses the cluster metadata from the config database 
(page 616) to route the write operation to the appropriate shards. 
Figure 3.23: Diagram of a sharded cluster. 
MongoDB partitions data in a sharded collection into ranges based on the values of the shard key. Then, MongoDB 
distributes these chunks to shards. The shard key determines the distribution of chunks to shards. This can affect the 
performance of write operations in the cluster. 
Important: Update operations that affect a single document must include the shard key or the _id field. Updates 
that affect multiple documents are more efficient in some situations if they have the shard key, but can be broadcast to 
all shards. 
If the value of the shard key increases or decreases with every insert, all insert operations target a single shard. As a 
result, the capacity of a single shard becomes the limit for the insert capacity of the sharded cluster. 
Figure 3.24: Diagram of the shard key value space segmented into smaller ranges or chunks. 
For more information, see Sharded Cluster Tutorials (page 634) and Bulk Inserts in MongoDB (page 81). 
Write Operations on Replica Sets 
In replica sets, all write operations go to the set’s primary, which applies the write operation and then records the operations 
on the primary’s operation log or oplog. The oplog is a reproducible sequence of operations to the data set. 
Secondary members of the set continuously replicate the oplog and apply the operations to themselves in an 
asynchronous process. 
Large volumes of write operations, particularly bulk operations, may create situations where the secondary members 
have difficulty applying the replicated operations from the primary at a sufficient rate: this can cause the secondary’s 
state to fall behind that of the primary. Secondaries that are significantly behind the primary present problems for 
normal operation of the replica set, particularly failover (page 523) in the form of rollbacks (page 527) as well as 
general read consistency (page 528). 
To help avoid this issue, you can customize the write concern (page 72) to return confirmation of the write operation 
to another member 4 of the replica set every 100 or 1,000 operations. This provides an opportunity for secondaries 
to catch up with the primary. Write concern can slow the overall progress of write operations but ensures that the 
secondaries can maintain a largely current state with respect to the primary. 
For more information on replica sets and write operations, see Replica Acknowledged (page 75), Oplog Size (page 535), 
and Change the Size of the Oplog (page 570). 
Write Operation Performance 
Indexes 
After every insert, update, or delete operation, MongoDB must update every index associated with the collection in 
addition to the data itself. Therefore, every index on a collection adds some amount of overhead for the performance 
of write operations. 5 
4 Intermittently issuing a write concern with a w value of 2 or majority will slow the throughput of write traffic; however, this practice will 
allow the secondaries to remain current with the state of the primary. 
Changed in version 2.6: In Master/Slave (page 538) deployments, MongoDB treats w: "majority" as equivalent to w: 1. In earlier 
versions of MongoDB, w: "majority" produces an error in master/slave (page 538) deployments. 
5 For inserts and updates to un-indexed fields, the overhead for sparse indexes (page 457) is less than for non-sparse indexes. Also for non-sparse 
indexes, updates that do not change the record size have less indexing overhead. 
Figure 3.25: Diagram of default routing of reads and writes to the primary. 
Figure 3.26: Write operation to a replica set with write concern level of w:2 or write to the primary and at least one 
secondary. 
In general, the performance gains that indexes provide for read operations are worth the insertion penalty. However, 
in order to optimize write performance when possible, be careful when creating new indexes and evaluate the existing 
indexes to ensure that your queries actually use these indexes. 
For indexes and queries, see Query Optimization (page 60). For more information on indexes, see Indexes (page 431) 
and Indexing Strategies (page 493). 
Document Growth 
If an update operation causes a document to exceed the currently allocated record size, MongoDB relocates the document 
on disk with enough contiguous space to hold the document. These relocations take longer than in-place updates, 
particularly if the collection has indexes. If a collection has indexes, MongoDB must update all index entries. Thus, 
for a collection with many indexes, the move will impact the write throughput. 
Some update operations, such as the $inc operation, do not cause an increase in document size. For these update 
operations, MongoDB can apply the updates in-place. Other update operations, such as the $push operation, change 
the size of the document. 
In-place-updates are significantly more efficient than updates that cause document growth. When possible, use data 
models (page 133) that minimize the need for document growth. 
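The distinction can be sketched with two updates to a hypothetical ratings collection:

```javascript
// $inc changes a numeric value without changing the document size,
// so MongoDB can apply the update in place.
db.ratings.update( { _id: 1 }, { $inc: { votes: 1 } } )

// $push appends to an array and grows the document; if the document
// outgrows its allocated record, MongoDB must relocate it on disk.
db.ratings.update( { _id: 1 }, { $push: { scores: 9 } } )
```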
See Storage (page 82) for more information. 
Storage Performance 
Hardware The capability of the storage system creates some important physical limits for the performance of MongoDB’s 
write operations. Many unique factors related to the storage system of the drive affect write performance, 
including random access patterns, disk caches, disk readahead, and RAID configurations. 
Solid state drives (SSDs) can outperform spinning hard disks (HDDs) by 100 times or more for random workloads. 
See Production Notes (page 188) for recommendations regarding additional hardware and configuration options. 
Journaling MongoDB uses write ahead logging to an on-disk journal to guarantee write operation (page 67) durability 
and to provide crash resiliency. Before applying a change to the data files, MongoDB writes the change operation 
to the journal. 
While the durability assurance provided by the journal typically outweighs the performance costs of the additional write 
operations, consider the following interactions between the journal and performance: 
• if the journal and the data file reside on the same block device, the data files and the journal may have to contend 
for a finite number of available write operations. Moving the journal to a separate device may increase the 
capacity for write operations. 
• if applications specify write concern (page 72) that includes journaled (page 73), mongod will decrease the 
duration between journal commits, which can increase the overall write load. 
• the duration between journal commits is configurable using the commitIntervalMs run-time option. Decreasing 
the period between journal commits will increase the number of write operations, which can limit 
MongoDB’s capacity for write operations. Increasing the amount of time between commits may decrease the 
total number of write operations, but also increases the chance that the journal will not record a write operation 
in the event of a failure. 
For additional information on journaling, see Journaling Mechanics (page 275). 
Bulk Inserts in MongoDB 
In some situations you may need to insert or ingest a large amount of data into a MongoDB database. These bulk 
inserts have some special considerations that are different from other write operations. 
Use the insert() Method 
The insert() method, when passed an array of documents, performs a bulk insert, and inserts each document 
atomically. Bulk inserts can significantly increase performance by amortizing write concern (page 72) costs. 
New in version 2.2: insert() in the mongo shell gained support for bulk inserts in version 2.2. 
In the drivers, you can configure write concern for batches rather than on a per-document level. 
Drivers have a ContinueOnError option in their insert operation, so that the bulk operation will continue to insert 
remaining documents in a batch even if an insert fails. 
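A sketch of a bulk insert in the 2.6 shell, passing an array of documents; the shell exposes the continue-on-error behavior through the ordered option (the collection and documents are illustrative):

```javascript
// Insert several documents in one call. With ordered: false, the
// operation continues inserting remaining documents even if one
// insert fails (for example, on a duplicate _id).
db.items.insert(
   [
      { _id: 1, item: "pencil" },
      { _id: 2, item: "pen" },
      { _id: 3, item: "eraser" }
   ],
   { ordered: false }
)
```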
Note: If multiple errors occur during a bulk insert, clients only receive the last error generated. 
See also: 
Driver documentation for details on performing bulk inserts in your application. Also see Import and Export 
MongoDB Data (page 186). 
Bulk Inserts on Sharded Clusters 
While ContinueOnError is optional on unsharded clusters, all bulk operations to a sharded collection run with 
ContinueOnError, which cannot be disabled. 
Large bulk insert operations, including initial data inserts or routine data import, can affect sharded cluster performance. 
For bulk inserts, consider the following strategies: 
Pre-Split the Collection If the sharded collection is empty, then the collection has only one initial chunk, which 
resides on a single shard. MongoDB must then take time to receive data, create splits, and distribute the split chunks 
to the available shards. To avoid this performance cost, you can pre-split the collection, as described in Split Chunks 
in a Sharded Cluster (page 666). 
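A sketch of pre-splitting an empty sharded collection at chosen shard key values; the namespace and split points here are illustrative:

```javascript
// Define split points for an empty sharded collection so that the
// initial chunks can be distributed across shards before the bulk
// insert begins.
sh.splitAt( "mydb.inventory", { sku: "C" } )
sh.splitAt( "mydb.inventory", { sku: "M" } )
sh.splitAt( "mydb.inventory", { sku: "T" } )
```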
Insert to Multiple mongos To parallelize import processes, send insert operations to more than one mongos 
instance. Pre-split empty collections first as described in Split Chunks in a Sharded Cluster (page 666). 
Avoid Monotonic Throttling If your shard key increases monotonically during an insert, then all inserted data goes 
to the last chunk in the collection, which will always end up on a single shard. Therefore, the insert capacity of the 
cluster will never exceed the insert capacity of that single shard. 
If your insert volume is larger than what a single shard can process, and if you cannot avoid a monotonically increasing 
shard key, then consider the following modifications to your application: 
• Reverse the binary bits of the shard key. This preserves the information and avoids correlating insertion order 
with increasing sequence of values. 
• Swap the first and last 16-bit words to “shuffle” the inserts. 
Example 
The following example, in C++, swaps the leading and trailing 16-bit words of generated BSON ObjectIds so that they 
are no longer monotonically increasing. 
#include <algorithm> // for std::swap 

using namespace mongo; 

OID make_an_id() { 
    OID x = OID::gen(); 
    // getData() returns a const pointer; cast away const to swap in place. 
    unsigned char *p = const_cast<unsigned char *>( x.getData() ); 
    std::swap( (unsigned short&) p[0], (unsigned short&) p[10] ); 
    return x; 
} 
void foo() { 
// create an object 
BSONObj o = BSON( "_id" << make_an_id() << "x" << 3 << "name" << "jane" ); 
// now we may insert o into a sharded collection 
} 
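The other suggested modification, reversing the binary bits of the shard key, can be sketched in the shell's JavaScript for a 32-bit integer key. The reverseBits32 helper below is illustrative, not a MongoDB built-in; an application would apply it to the key value before inserting.

```javascript
// Illustrative sketch: reverse the bits of a 32-bit integer shard key so
// that monotonically increasing keys scatter across the key space instead
// of all landing in the last chunk.
function reverseBits32(n) {
  var result = 0;
  for (var i = 0; i < 32; i++) {
    result = (result * 2) + (n & 1); // shift result left, append low bit of n
    n = Math.floor(n / 2);           // shift n right without 32-bit sign issues
  }
  return result;
}

// Consecutive keys map to values far apart:
reverseBits32(1); // 2147483648 (high bit set)
reverseBits32(2); // 1073741824
reverseBits32(3); // 3221225472
```

Because the transformation is a bijection, the original key can always be recovered by reversing the bits again.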
See also: 
Shard Keys (page 620) for information on choosing a shard key. Also see Shard Key Internals (page 620) (in 
particular, Choosing a Shard Key (page 639)). 
Storage 
Data Model 
MongoDB stores data in the form of BSON documents, which are rich mappings of keys, or field names, to values. 
BSON supports a rich collection of types, and fields in BSON documents may hold arrays of values or embedded 
documents. All documents in MongoDB must be smaller than 16MB, the maximum BSON document size. 
Every document in MongoDB is stored in a record which contains the document itself and extra space, or padding, 
which allows the document to grow as the result of updates. 
All records are contiguously located on disk, and when a document becomes larger than the allocated record, MongoDB 
must allocate a new record. New allocations require MongoDB to move a document and update all indexes that 
refer to the document, which takes more time than in-place updates and leads to storage fragmentation. 
All records are part of a collection, which is a logical grouping of documents in a MongoDB database. The documents 
in a collection share a set of indexes, and typically these documents share common fields and structure. 
In MongoDB the database construct is a group of related collections. Each database has a distinct set of data files and 
can contain a large number of collections. Also, each database has one distinct write lock that blocks operations to 
the database during write operations. A single MongoDB deployment may have many databases. 
Journal 
In order to ensure that all modifications to a MongoDB data set are durably written to disk, MongoDB records all 
modifications to a journal that it writes to disk more frequently than it writes the data files. The journal allows 
MongoDB to successfully recover data from data files after a mongod instance exits without flushing all changes. 
See Journaling Mechanics (page 275) for more information about the journal in MongoDB. 
Record Allocation Strategies 
MongoDB supports multiple record allocation strategies that determine how mongod adds padding to a document 
when creating a record. Because documents in MongoDB may grow after insertion and all records are contiguous on 
disk, the padding can reduce the need to relocate documents on disk following updates. Relocations are less efficient 
than in-place updates, and can lead to storage fragmentation. As a result, all padding strategies trade additional space 
for increased efficiency and decreased fragmentation. 
Different allocation strategies support different kinds of workloads: the power of 2 allocation (page 83) is more 
efficient for insert/update/delete workloads, while the exact fit allocation (page 83) is ideal for collections without update 
and delete workloads. 
Power of 2 Sized Allocations Changed in version 2.6: For all new collections, usePowerOf2Sizes 
became the default allocation strategy. To change the default allocation strategy, use the 
newCollectionsUsePowerOf2Sizes parameter. 
mongod uses an allocation strategy called usePowerOf2Sizes where each record has a size in bytes that is a 
power of 2 (e.g. 32, 64, 128, 256, 512...16777216.) The smallest allocation for a document is 32 bytes. The power of 
2 sizes allocation strategy has two key properties: 
• there are a limited number of record allocation sizes, which makes it easier for mongod to reuse existing 
allocations and reduces fragmentation in some cases. 
• in many cases, the record allocations are significantly larger than the documents they hold. This allows documents 
to grow while minimizing or eliminating the chance that the mongod will need to allocate a new record 
if the document grows. 
The usePowerOf2Sizes strategy does not eliminate document reallocation as a result of document growth, but it 
minimizes its occurrence in many common operations. 
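The record size produced by this strategy can be sketched as rounding the document size up to the next power of two, with the 32-byte minimum described above. The helper below is illustrative only; the server's internal logic may differ in edge cases.

```javascript
// Illustrative sketch: record size under the power-of-2 allocation
// strategy. Round the document size up to the next power of two,
// starting from the 32-byte minimum allocation.
function powerOf2RecordSize(docSize) {
  var size = 32; // smallest allocation
  while (size < docSize) {
    size *= 2;
  }
  return size;
}

powerOf2RecordSize(10);   // 32
powerOf2RecordSize(100);  // 128
powerOf2RecordSize(6000); // 8192
```

A 100-byte document therefore gets a 128-byte record, leaving 28 bytes of headroom for growth before any relocation is needed.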
Exact Fit Allocation The exact fit allocation strategy allocates record sizes based on the size of the document and 
an additional padding factor. Each collection has its own padding factor, which defaults to 1 when you insert the first 
document in a collection. MongoDB dynamically adjusts the padding factor up to 2 depending on the rate of growth 
of the documents over the life of the collection. 
To estimate total record size, compute the product of the padding factor and the size of the document. That is: 
record size = paddingFactor * <document size> 
The size of each record in a collection reflects the size of the padding factor at the time of allocation. See the 
paddingFactor field in the output of db.collection.stats() to see the current padding factor for a collection. 
On average, this exact fit allocation strategy uses less storage space than the usePowerOf2Sizes strategy but will 
result in higher levels of storage fragmentation if documents grow beyond the size of their initial allocation. 
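The record size formula above can be sketched directly. In this illustrative helper the padding factor is simply passed in; on a real collection the server starts it at 1 and adjusts it up to 2 as documents grow.

```javascript
// Illustrative sketch: record size under the exact fit allocation
// strategy, i.e. record size = paddingFactor * document size.
function exactFitRecordSize(docSize, paddingFactor) {
  return Math.ceil(docSize * paddingFactor);
}

exactFitRecordSize(1000, 1);   // 1000: no padding on a fresh collection
exactFitRecordSize(1000, 1.5); // 1500: room to grow after updates
```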
The compact and repairDatabase operations remove padding by default, as do mongodump and 
mongorestore. compact does allow you to specify a padding for records during compaction. 
Capped Collections 
Capped collections are fixed-size collections that support high-throughput operations and store documents in insertion 
order. Capped collections work like circular buffers: once a collection fills its allocated space, it makes room for new 
documents by overwriting the oldest documents in the collection. 
See Capped Collections (page 196) for more information. 
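The circular-buffer behavior can be modeled with a small toy structure. This in-memory sketch is purely illustrative (a capped collection is capped by bytes on disk, not by a document count) but shows the insertion-order overwrite semantics.

```javascript
// Illustrative toy model of a capped collection as a fixed-size circular
// buffer: once full, inserting a new document drops the oldest one.
function CappedBuffer(capacity) {
  this.capacity = capacity;
  this.docs = [];
}

CappedBuffer.prototype.insert = function (doc) {
  if (this.docs.length === this.capacity) {
    this.docs.shift(); // make room by dropping the oldest document
  }
  this.docs.push(doc); // newest document is always appended at the end
};

var buf = new CappedBuffer(3);
[1, 2, 3, 4].forEach(function (n) { buf.insert({ n: n }); });
// buf.docs now holds only the three newest documents: n = 2, 3, 4
```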
3.3 MongoDB CRUD Tutorials 
The following tutorials provide instructions for querying and modifying data. For a higher-level overview of these 
operations, see MongoDB CRUD Operations (page 51). 
Insert Documents (page 84) Insert new documents into a collection. 
Query Documents (page 87) Find documents in a collection using search criteria. 
Limit Fields to Return from a Query (page 94) Limit which fields are returned by a query. 
Iterate a Cursor in the mongo Shell (page 95) Access documents returned by a find query by iterating the cursor, 
either manually or using the iterator index. 
Analyze Query Performance (page 97) Analyze the efficiency of queries and determine how a query uses available 
indexes. 
Modify Documents (page 98) Modify documents in a collection. 
Remove Documents (page 101) Remove documents from a collection. 
Perform Two Phase Commits (page 102) Use two-phase commits when writing data to multiple documents. 
Create Tailable Cursor (page 109) Create tailable cursors for use in capped collections with high numbers of write 
operations for which an index would be too expensive. 
Isolate Sequence of Operations (page 111) Use the $isolated operator to isolate a single write 
operation that affects multiple documents, preventing other operations from interrupting the sequence of write 
operations. 
Create an Auto-Incrementing Sequence Field (page 113) Describes how to create an incrementing sequence number 
for the _id field using a Counters Collection or an Optimistic Loop. 
Limit Number of Elements in an Array after an Update (page 116) Use $push with various modifiers to sort and 
maintain an array of fixed size after an update. 
3.3.1 Insert Documents 
In MongoDB, the db.collection.insert() method adds new documents into a collection. 
Insert a Document 
Step 1: Insert a document into a collection. 
Insert a document into a collection named inventory. The operation will create the collection if the collection does 
not currently exist. 
db.inventory.insert( 
{ 
item: "ABC1", 
details: { 
model: "14Q3", 
manufacturer: "XYZ Company" 
}, 
stock: [ { size: "S", qty: 25 }, { size: "M", qty: 50 } ], 
category: "clothing" 
} 
) 
The operation returns a WriteResult object with the status of the operation. A successful insert of the document 
returns the following object: 
WriteResult({ "nInserted" : 1 }) 
The nInserted field specifies the number of documents inserted. If the operation encounters an error, the 
WriteResult object will contain the error information. 
Step 2: Review the inserted document. 
If the insert operation is successful, verify the insertion by querying the collection. 
db.inventory.find() 
The document you inserted should be returned. 
{ "_id" : ObjectId("53d98f133bb604791249ca99"), "item" : "ABC1", "details" : { "model" : "14Q3", "manufacturer" : "XYZ Company" }, "stock" : [ { "size" : "S", "qty" : 25 }, { "size" : "M", "qty" : 50 } ], "category" : "clothing" } 

The returned document shows that MongoDB added an _id field to the document. If a client inserts a document that 
does not contain the _id field, MongoDB adds the field with the value set to a generated ObjectId6. The ObjectId7 
values in your documents will differ from the ones shown. 
Insert an Array of Documents 
You can pass an array of documents to the db.collection.insert() method to insert multiple documents. 
Step 1: Create an array of documents. 
Define a variable mydocuments that holds an array of documents to insert. 
var mydocuments = 
[ 
{ 
item: "ABC2", 
details: { model: "14Q3", manufacturer: "M1 Corporation" }, 
stock: [ { size: "M", qty: 50 } ], 
category: "clothing" 
}, 
{ 
item: "MNO2", 
details: { model: "14Q3", manufacturer: "ABC Company" }, 
stock: [ { size: "S", qty: 5 }, { size: "M", qty: 5 }, { size: "L", qty: 1 } ], 
category: "clothing" 
}, 
{ 
item: "IJK2", 
details: { model: "14Q2", manufacturer: "M5 Corporation" }, 
stock: [ { size: "S", qty: 5 }, { size: "L", qty: 1 } ], 
category: "houseware" 
} 
]; 
6http://docs.mongodb.org/manual/reference/object-id 
7http://docs.mongodb.org/manual/reference/object-id 
Step 2: Insert the documents. 
Pass the mydocuments array to the db.collection.insert() method to perform a bulk insert. 
db.inventory.insert( mydocuments ); 
The method returns a BulkWriteResult object with the status of the operation. A successful insert of the documents 
returns the following object: 
BulkWriteResult({ 
"writeErrors" : [ ], 
"writeConcernErrors" : [ ], 
"nInserted" : 3, 
"nUpserted" : 0, 
"nMatched" : 0, 
"nModified" : 0, 
"nRemoved" : 0, 
"upserted" : [ ] 
}) 
The nInserted field specifies the number of documents inserted. If the operation encounters an error, the 
BulkWriteResult object will contain information regarding the error. 
The inserted documents will each have an _id field added by MongoDB. 
Insert Multiple Documents with Bulk 
New in version 2.6. 
MongoDB provides a Bulk() API that you can use to perform multiple write operations in bulk. The following 
sequence of operations describes how you would use the Bulk() API to insert a group of documents into a MongoDB 
collection. 
Step 1: Initialize a Bulk operations builder. 
Initialize a Bulk operations builder for the collection inventory. 
var bulk = db.inventory.initializeUnorderedBulkOp(); 
The operation returns an unordered operations builder which maintains a list of operations to perform. With an 
unordered list, MongoDB can execute the operations in parallel as well as in nondeterministic order. If an error occurs during 
the processing of one of the write operations, MongoDB will continue to process remaining write operations in the 
list. 
You can also initialize an ordered operations builder; see db.collection.initializeOrderedBulkOp() 
for details. 
Step 2: Add insert operations to the bulk object. 
Add two insert operations to the bulk object using the Bulk.insert() method. 
bulk.insert( 
{ 
item: "BE10", 
details: { model: "14Q2", manufacturer: "XYZ Company" }, 
stock: [ { size: "L", qty: 5 } ], 
category: "clothing" 
} 
); 
bulk.insert( 
{ 
item: "ZYT1", 
details: { model: "14Q1", manufacturer: "ABC Company" }, 
stock: [ { size: "S", qty: 5 }, { size: "M", qty: 5 } ], 
category: "houseware" 
} 
); 
Step 3: Execute the bulk operation. 
Call the execute() method on the bulk object to execute the operations in its list. 
bulk.execute(); 
The method returns a BulkWriteResult object with the status of the operation. A successful insert of the documents 
returns the following object: 
BulkWriteResult({ 
"writeErrors" : [ ], 
"writeConcernErrors" : [ ], 
"nInserted" : 2, 
"nUpserted" : 0, 
"nMatched" : 0, 
"nModified" : 0, 
"nRemoved" : 0, 
"upserted" : [ ] 
}) 
The nInserted field specifies the number of documents inserted. If the operation encounters an error, the 
BulkWriteResult object will contain information regarding the error. 
Additional Examples and Methods 
For more examples, see db.collection.insert(). 
The db.collection.update(), db.collection.findAndModify(), and 
db.collection.save() methods can also add new documents. See the individual reference pages for the 
methods for more information and examples. 
3.3.2 Query Documents 
In MongoDB, the db.collection.find() method retrieves documents from a collection. 8 The 
db.collection.find() method returns a cursor (page 59) to the retrieved documents. 
This tutorial provides examples of read operations using the db.collection.find() method in the mongo 
shell. In these examples, the retrieved documents contain all their fields. To restrict the fields to return in the retrieved 
documents, see Limit Fields to Return from a Query (page 94). 
8 The db.collection.findOne() method also performs a read operation to return a single document. Internally, the 
db.collection.findOne() method is the db.collection.find() method with a limit of 1. 
Select All Documents in a Collection 
An empty query document ({}) selects all documents in the collection: 
db.inventory.find( {} ) 
Not specifying a query document to the find() method is equivalent to specifying an empty query document. Therefore the 
following operation is equivalent to the previous operation: 
db.inventory.find() 
Specify Equality Condition 
To specify an equality condition, use the query document { <field>: <value> } to select all documents that 
contain the <field> with the specified <value>. 
The following example retrieves from the inventory collection all documents where the type field has the value 
snacks: 
db.inventory.find( { type: "snacks" } ) 
Specify Conditions Using Query Operators 
A query document can use the query operators to specify conditions in a MongoDB query. 
The following example selects all documents in the inventory collection where the value of the type field is either 
’food’ or ’snacks’: 
db.inventory.find( { type: { $in: [ 'food', 'snacks' ] } } ) 
Although you can express this query using the $or operator, use the $in operator rather than the $or operator when 
performing equality checks on the same field. 
Refer to http://docs.mongodb.org/manual/reference/operator for the complete list 
of query operators. 
Specify AND Conditions 
A compound query can specify conditions for more than one field in the collection’s documents. Implicitly, a logical 
AND conjunction connects the clauses of a compound query so that the query selects the documents in the collection 
that match all the conditions. 
In the following example, the query document specifies an equality match on the field type and a less than ($lt) 
comparison match on the field price: 
db.inventory.find( { type: 'food', price: { $lt: 9.95 } } ) 
This query selects all documents where the type field has the value ’food’ and the value of the price field is less 
than 9.95. See comparison operators for other comparison operators. 
Specify OR Conditions 
Using the $or operator, you can specify a compound query that joins each clause with a logical OR conjunction so 
that the query selects the documents in the collection that match at least one condition. 
In the following example, the query document selects all documents in the collection where the field qty has a value 
greater than ($gt) 100 or the value of the price field is less than ($lt) 9.95: 
db.inventory.find( 
{ 
$or: [ { qty: { $gt: 100 } }, { price: { $lt: 9.95 } } ] 
} 
) 
Specify AND as well as OR Conditions 
With additional clauses, you can specify precise conditions for matching documents. 
In the following example, the compound query document selects all documents in the collection where the value of 
the type field is ’food’ and either the qty has a value greater than ($gt) 100 or the value of the price field is 
less than ($lt) 9.95: 
db.inventory.find( 
{ 
type: 'food', 
$or: [ { qty: { $gt: 100 } }, { price: { $lt: 9.95 } } ] 
} 
) 
Embedded Documents 
When the field holds an embedded document, a query can either specify an exact match on the embedded document 
or specify a match by individual fields in the embedded document using the dot notation. 
Exact Match on the Embedded Document 
To specify an equality match on the whole embedded document, use the query document { <field>: <value> 
} where <value> is the document to match. Equality matches on an embedded document require an exact match of 
the specified <value>, including the field order. 
In the following example, the query matches all documents where the value of the field producer is an embedded 
document that contains only the field company with the value ’ABC123’ and the field address with the value 
’123 Street’, in the exact order: 
db.inventory.find( 
{ 
producer: 
{ 
company: 'ABC123', 
address: '123 Street' 
} 
} 
) 
Equality Match on Fields within an Embedded Document 
Use the dot notation to match by specific fields in an embedded document. Equality matches for specific fields in 
an embedded document will select documents in the collection where the embedded document contains the specified 
fields with the specified values. The embedded document can contain additional fields. 
In the following example, the query uses the dot notation to match all documents where the value of the field 
producer is an embedded document that contains a field company with the value ’ABC123’ and may contain 
other fields: 
db.inventory.find( { 'producer.company': 'ABC123' } ) 
Arrays 
When the field holds an array, you can query for an exact array match or for specific values in the array. If the array 
holds embedded documents, you can query for specific fields in the embedded documents using dot notation. 
If you specify multiple conditions using the $elemMatch operator, the array must contain at least one element that 
satisfies all the conditions. See Single Element Satisfies the Criteria (page 91). 
If you specify multiple conditions without using the $elemMatch operator, then some combination of the array 
elements, not necessarily a single element, must satisfy all the conditions; i.e. different elements in the array can 
satisfy different parts of the conditions. See Combination of Elements Satisfies the Criteria (page 91). 
Consider an inventory collection that contains the following documents: 
{ _id: 5, type: "food", item: "aaa", ratings: [ 5, 8, 9 ] } 
{ _id: 6, type: "food", item: "bbb", ratings: [ 5, 9 ] } 
{ _id: 7, type: "food", item: "ccc", ratings: [ 9, 5, 8 ] } 
Exact Match on an Array 
To specify equality match on an array, use the query document { <field>: <value> } where <value> is 
the array to match. Equality matches on the array require that the array field match exactly the specified <value>, 
including the element order. 
The following example queries for all documents where the field ratings is an array that holds exactly three elements, 
5, 8, and 9, in this order: 
db.inventory.find( { ratings: [ 5, 8, 9 ] } ) 
The operation returns the following document: 
{ "_id" : 5, "type" : "food", "item" : "aaa", "ratings" : [ 5, 8, 9 ] } 
Match an Array Element 
Equality matches can specify a single element in the array to match. These specifications match if the array contains 
at least one element with the specified value. 
The following example queries for all documents where ratings is an array that contains 5 as one of its elements: 
db.inventory.find( { ratings: 5 } ) 
The operation returns the following documents: 
{ "_id" : 5, "type" : "food", "item" : "aaa", "ratings" : [ 5, 8, 9 ] } 
{ "_id" : 6, "type" : "food", "item" : "bbb", "ratings" : [ 5, 9 ] } 
{ "_id" : 7, "type" : "food", "item" : "ccc", "ratings" : [ 9, 5, 8 ] } 
Match a Specific Element of an Array 
Equality matches can specify equality matches for an element at a particular index or position of the array using the 
dot notation. 
In the following example, the query uses the dot notation to match all documents where the ratings array contains 
5 as the first element: 
db.inventory.find( { 'ratings.0': 5 } ) 
The operation returns the following documents: 
{ "_id" : 5, "type" : "food", "item" : "aaa", "ratings" : [ 5, 8, 9 ] } 
{ "_id" : 6, "type" : "food", "item" : "bbb", "ratings" : [ 5, 9 ] } 
Specify Multiple Criteria for Array Elements 
Single Element Satisfies the Criteria Use the $elemMatch operator to specify multiple criteria on the elements of 
an array such that at least one array element satisfies all the specified criteria. 
The following example queries for documents where the ratings array contains at least one element that is greater 
than ($gt) 5 and less than ($lt) 9: 
db.inventory.find( { ratings: { $elemMatch: { $gt: 5, $lt: 9 } } } ) 
The operation returns the following documents, whose ratings array contains the element 8 which meets the criteria: 
{ "_id" : 5, "type" : "food", "item" : "aaa", "ratings" : [ 5, 8, 9 ] } 
{ "_id" : 7, "type" : "food", "item" : "ccc", "ratings" : [ 9, 5, 8 ] } 
Combination of Elements Satisfies the Criteria The following example queries for documents where the 
ratings array contains elements that in some combination satisfy the query conditions; e.g., one element can satisfy 
the greater than 5 condition and another element can satisfy the less than 9 condition, or a single element can satisfy 
both: 
db.inventory.find( { ratings: { $gt: 5, $lt: 9 } } ) 
The operation returns the following documents: 
{ "_id" : 5, "type" : "food", "item" : "aaa", "ratings" : [ 5, 8, 9 ] } 
{ "_id" : 6, "type" : "food", "item" : "bbb", "ratings" : [ 5, 9 ] } 
{ "_id" : 7, "type" : "food", "item" : "ccc", "ratings" : [ 9, 5, 8 ] } 
The document with the "ratings" : [ 5, 9 ] matches the query since the element 9 is greater than 5 (the 
first condition) and the element 5 is less than 9 (the second condition). 
Array of Embedded Documents 
Consider that the inventory collection includes the following documents: 
{ 
_id: 100, 
type: "food", 
item: "xyz", 
qty: 25, 
price: 2.5, 
ratings: [ 5, 8, 9 ], 
memos: [ { memo: "on time", by: "shipping" }, { memo: "approved", by: "billing" } ] 
} 
{ 
_id: 101, 
type: "fruit", 
item: "jkl", 
qty: 10, 
price: 4.25, 
ratings: [ 5, 9 ], 
memos: [ { memo: "on time", by: "payment" }, { memo: "delayed", by: "shipping" } ] 
} 
Match a Field in the Embedded Document Using the Array Index If you know the array index of the embedded 
document, you can specify the document by the subdocument’s position using the dot notation. 
The following example selects all documents where the memos field contains an array whose first element (i.e. index 0) 
is a document that contains the field by whose value is ’shipping’: 
db.inventory.find( { 'memos.0.by': 'shipping' } ) 
The operation returns the following document: 
{ 
_id: 100, 
type: "food", 
item: "xyz", 
qty: 25, 
price: 2.5, 
ratings: [ 5, 8, 9 ], 
memos: [ { memo: "on time", by: "shipping" }, { memo: "approved", by: "billing" } ] 
} 
Match a Field Without Specifying Array Index If you do not know the index position of the document in the array, 
concatenate the name of the field that contains the array, with a dot (.) and the name of the field in the subdocument. 
The following example selects all documents where the memos field contains an array that contains at least one 
embedded document that contains the field by with the value ’shipping’: 
db.inventory.find( { 'memos.by': 'shipping' } ) 
The operation returns the following documents: 
{ 
_id: 100, 
type: "food", 
item: "xyz", 
qty: 25, 
price: 2.5, 
ratings: [ 5, 8, 9 ], 
memos: [ { memo: "on time", by: "shipping" }, { memo: "approved", by: "billing" } ] 
} 
{ 
_id: 101, 
type: "fruit", 
item: "jkl", 
qty: 10, 
price: 4.25, 
ratings: [ 5, 9 ], 
memos: [ { memo: "on time", by: "payment" }, { memo: "delayed", by: "shipping" } ] 
} 
Specify Multiple Criteria for Array of Documents 
Single Element Satisfies the Criteria Use the $elemMatch operator to specify multiple criteria on an array of embedded 
documents such that at least one embedded document satisfies all the specified criteria. 
The following example queries for documents where the memos array has at least one embedded document that 
contains both the field memo equal to ’on time’ and the field by equal to ’shipping’: 
db.inventory.find( 
{ 
memos: 
{ 
$elemMatch: 
{ 
memo: 'on time', 
by: 'shipping' 
} 
} 
} 
) 
The operation returns the following document: 
{ 
_id: 100, 
type: "food", 
item: "xyz", 
qty: 25, 
price: 2.5, 
ratings: [ 5, 8, 9 ], 
memos: [ { memo: "on time", by: "shipping" }, { memo: "approved", by: "billing" } ] 
} 
Combination of Elements Satisfies the Criteria The following example queries for documents where the memos 
array contains elements that in some combination satisfy the query conditions; e.g. one element satisfies the field 
memo equal to ’on time’ condition and another element satisfies the field by equal to ’shipping’ condition, or 
a single element can satisfy both criteria: 
db.inventory.find( 
{ 
'memos.memo': 'on time', 
'memos.by': 'shipping' 
} 
) 
The query returns the following documents: 
{ 
_id: 100, 
type: "food", 
item: "xyz", 
qty: 25, 
price: 2.5, 
ratings: [ 5, 8, 9 ], 
memos: [ { memo: "on time", by: "shipping" }, { memo: "approved", by: "billing" } ] 
} 
{ 
_id: 101, 
type: "fruit", 
item: "jkl", 
qty: 10, 
price: 4.25, 
ratings: [ 5, 9 ], 
memos: [ { memo: "on time", by: "payment" }, { memo: "delayed", by: "shipping" } ] 
} 
3.3.3 Limit Fields to Return from a Query 
The projection document limits the fields to return for all matching documents. The projection document can specify 
the inclusion of fields or the exclusion of fields. 
The specifications have the following forms: 
Syntax Description 
<field>: <1 or true> Specify the inclusion of a field. 
<field>: <0 or false> Specify the suppression of the field. 
Important: The _id field is, by default, included in the result set. To suppress the _id field from the result set, 
specify _id: 0 in the projection document. 
You cannot combine inclusion and exclusion semantics in a single projection with the exception of the _id field. 
This tutorial offers various query examples that limit the fields to return for all matching documents. The examples in 
this tutorial use a collection inventory and use the db.collection.find() method in the mongo shell. The 
db.collection.find() method returns a cursor (page 59) to the retrieved documents. For examples on query 
selection criteria, see Query Documents (page 87). 
Return All Fields in Matching Documents 
If you specify no projection, the find() method returns all fields of all documents that match the query. 
db.inventory.find( { type: 'food' } ) 
This operation will return all documents in the inventory collection where the value of the type field is ’food’. 
The returned documents contain all their fields. 
Return the Specified Fields and the _id Field Only 
A projection can explicitly include several fields. In the following operation, the find() method returns all documents 
that match the query. In the result set, only the item and qty fields and, by default, the _id field return in the 
matching documents. 
db.inventory.find( { type: 'food' }, { item: 1, qty: 1 } ) 
Return Specified Fields Only 
You can remove the _id field from the results by specifying its exclusion in the projection, as in the following 
example: 
db.inventory.find( { type: 'food' }, { item: 1, qty: 1, _id:0 } ) 
This operation returns all documents that match the query. In the result set, only the item and qty fields return in 
the matching documents. 
Return All But the Excluded Field 
To exclude a single field or group of fields you can use a projection in the following form: 
db.inventory.find( { type: 'food' }, { type:0 } ) 
This operation returns all documents where the value of the type field is food. In the result set, the type field does 
not return in the matching documents. 
With the exception of the _id field you cannot combine inclusion and exclusion statements in projection documents. 
Projection for Array Fields 
For fields that contain arrays, MongoDB provides the following projection operators: $elemMatch, $slice, and 
$. 
For example, the inventory collection contains the following document: 
{ "_id" : 5, "type" : "food", "item" : "aaa", "ratings" : [ 5, 8, 9 ] } 
Then the following operation uses the $slice projection operator to return just the first two elements in the ratings 
array. 
db.inventory.find( { _id: 5 }, { ratings: { $slice: 2 } } ) 
$elemMatch, $slice, and $ are the only ways to project portions of an array. For instance, you cannot project a 
portion of an array using the array index; e.g. the { "ratings.0": 1 } projection will not project the array with 
the first element. 
3.3.4 Iterate a Cursor in the mongo Shell 
The db.collection.find() method returns a cursor. To access the documents, you need to iterate the cursor. 
However, in the mongo shell, if the returned cursor is not assigned to a variable using the var keyword, then the 
cursor is automatically iterated up to 20 times to print up to the first 20 documents in the results. The following 
describes ways to manually iterate the cursor to access the documents or to use the iterator index. 
Manually Iterate the Cursor 
In the mongo shell, when you assign the cursor returned from the find() method to a variable using the var 
keyword, the cursor does not automatically iterate. 
You can call the cursor variable in the shell to iterate up to 20 times 9 and print the matching documents, as in the 
following example: 
var myCursor = db.inventory.find( { type: 'food' } ); 
myCursor 
You can also use the cursor method next() to access the documents, as in the following example: 
var myCursor = db.inventory.find( { type: 'food' } ); 
while (myCursor.hasNext()) { 
print(tojson(myCursor.next())); 
} 
As an alternative print operation, consider the printjson() helper method to replace print(tojson()): 
var myCursor = db.inventory.find( { type: 'food' } ); 
while (myCursor.hasNext()) { 
printjson(myCursor.next()); 
} 
You can use the cursor method forEach() to iterate the cursor and access the documents, as in the following 
example: 
var myCursor = db.inventory.find( { type: 'food' } ); 
myCursor.forEach(printjson); 
See JavaScript cursor methods and your driver documentation for more information on cursor methods. 
Iterator Index 
In the mongo shell, you can use the toArray() method to iterate the cursor and return the documents in an array, 
as in the following: 
var myCursor = db.inventory.find( { type: 'food' } ); 
var documentArray = myCursor.toArray(); 
var myDocument = documentArray[3]; 
The toArray() method loads into RAM all documents returned by the cursor; the toArray() method exhausts 
the cursor. 
Additionally, some drivers provide access to the documents by using an index on the cursor (i.e. 
cursor[index]). This is a shortcut for first calling the toArray() method and then using an index on the 
resulting array. 
Consider the following example: 
var myCursor = db.inventory.find( { type: 'food' } ); 
var myDocument = myCursor[3]; 
The myCursor[3] is equivalent to the following example: 
myCursor.toArray()[3]; 
[9] You can use the DBQuery.shellBatchSize to change the number of iterations from the default value of 20. See Executing Queries 
(page 256) for more information. 
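The iteration patterns above can be sketched with a small in-memory stand-in for a cursor. The makeCursor helper and its sample documents are inventions for illustration, not part of the MongoDB shell API; the sketch only mirrors the hasNext()/next(), forEach(), and toArray() behaviors described in this tutorial.

```javascript
// Hypothetical in-memory stand-in for a cursor, for illustration only.
function makeCursor(docs) {
  let pos = 0;
  let cached = null; // toArray() exhausts the cursor and caches the result
  return {
    hasNext: function () { return pos < docs.length; },
    next: function () {
      if (pos >= docs.length) throw new Error("cursor exhausted");
      return docs[pos++];
    },
    forEach: function (fn) { while (this.hasNext()) fn(this.next()); },
    toArray: function () {
      if (cached === null) {
        cached = [];
        while (this.hasNext()) cached.push(this.next());
      }
      return cached;
    }
  };
}

// Manual iteration, as in the while (myCursor.hasNext()) example:
const c1 = makeCursor([{ _id: 1, type: "food" }, { _id: 2, type: "food" }]);
const seen = [];
while (c1.hasNext()) seen.push(c1.next()._id);

// cursor[index] is a shortcut for toArray() followed by array indexing:
const c2 = makeCursor([{ _id: 1 }, { _id: 2 }, { _id: 3 }, { _id: 4 }]);
const myDocument = c2.toArray()[3];
```

Because toArray() caches its result, a second call returns the same array even though the cursor is exhausted, matching the shortcut semantics described above.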
3.3.5 Analyze Query Performance 
The explain() cursor method allows you to inspect the operation of the query system. This method is useful for 
analyzing the efficiency of queries, and for determining how the query uses the index. The explain() method tests 
the query operation, and not the timing of query performance. Because explain() attempts multiple query plans, 
it does not reflect an accurate timing of query performance. 
Evaluate the Performance of a Query 
To use the explain() method, call the method on a cursor returned by find(). 
Example 
Evaluate a query on the type field on the collection inventory that has an index on the type field. 
db.inventory.find( { type: 'food' } ).explain() 
Consider the results: 
{ 
    "cursor" : "BtreeCursor type_1", 
    "isMultiKey" : false, 
    "n" : 5, 
    "nscannedObjects" : 5, 
    "nscanned" : 5, 
    "nscannedObjectsAllPlans" : 5, 
    "nscannedAllPlans" : 5, 
    "scanAndOrder" : false, 
    "indexOnly" : false, 
    "nYields" : 0, 
    "nChunkSkips" : 0, 
    "millis" : 0, 
    "indexBounds" : { "type" : [ 
        [ "food", "food" ] 
    ] }, 
    "server" : "mongodb0.example.net:27017" 
} 
The BtreeCursor value of the cursor field indicates that the query used an index. 
This query returned 5 documents, as indicated by the n field. 
To return these 5 documents, the query scanned 5 documents from the index, as indicated by the nscanned field, 
and then read 5 full documents from the collection, as indicated by the nscannedObjects field. 
Without the index, the query would have scanned the whole collection to return the 5 documents. 
See the explain-results reference for full details on the output. 
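As a rough illustration of these counters, the following sketch models in plain JavaScript (not MongoDB internals) why a usable index keeps nscanned equal to n, while a collection scan examines every document. The explainSketch function and its counters are hypothetical simplifications.

```javascript
// Toy model of explain() counters; not MongoDB internals.
function explainSketch(docs, query, useIndex) {
  let nscanned = 0;
  const results = [];
  if (useIndex) {
    // pretend the index lets us visit only entries whose key matches
    for (const d of docs) {
      if (d.type === query.type) { nscanned++; results.push(d); }
    }
  } else {
    for (const d of docs) { // full collection scan: every document is examined
      nscanned++;
      if (d.type === query.type) results.push(d);
    }
  }
  return { n: results.length, nscanned: nscanned };
}

const docs = [];
for (let i = 0; i < 100; i++) docs.push({ _id: i, type: i < 5 ? "food" : "other" });

const withIndex = explainSketch(docs, { type: "food" }, true);    // n: 5, nscanned: 5
const withoutIndex = explainSketch(docs, { type: "food" }, false); // n: 5, nscanned: 100
```

With the index, nscanned stays at 5, matching n; without it, all 100 documents are scanned to return the same 5 results.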
Compare Performance of Indexes 
To manually compare the performance of a query using more than one index, you can use the hint() and 
explain() methods in conjunction. 
Example 
Evaluate a query using different indexes: 
db.inventory.find( { type: 'food' } ).hint( { type: 1 } ).explain() 
db.inventory.find( { type: 'food' } ).hint( { type: 1, name: 1 } ).explain() 
These return the statistics regarding the execution of the query using the respective index. 
Note: If you run explain() without including hint(), the query optimizer reevaluates the query and runs against 
multiple indexes before returning the query statistics. 
For more detail on the explain output, see explain-results. 
3.3.6 Modify Documents 
MongoDB provides the update() method to update the documents of a collection. The method accepts as its 
parameters: 
• an update conditions document to match the documents to update, 
• an update operations document to specify the modification to perform, and 
• an options document. 
To specify the update condition, use the same structure and syntax as the query conditions. 
By default, update() updates a single document. To update multiple documents, use the multi option. 
Update Specific Fields in a Document 
To change a field value, MongoDB provides update operators [10], such as $set, to modify values. 
Some update operators, such as $set, will create the field if the field does not exist. See the individual update 
operator [11] reference. 
Step 1: Use update operators to change field values. 
For the document with item equal to "MNO2", use the $set operator to update the category field and the 
details field to the specified values and the $currentDate operator to update the field lastModified with 
the current date. 
db.inventory.update( 
   { item: "MNO2" }, 
   { 
     $set: { 
       category: "apparel", 
       details: { model: "14Q3", manufacturer: "XYZ Company" } 
     }, 
     $currentDate: { lastModified: true } 
   } 
) 
The update operation returns a WriteResult object which contains the status of the operation. A successful update 
of the document returns the following object: 
[10] http://docs.mongodb.org/manual/reference/operator/update 
[11] http://docs.mongodb.org/manual/reference/operator/update 
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 }) 
The nMatched field specifies the number of existing documents matched for the update, and nModified specifies 
the number of existing documents modified. 
Step 2: Update an embedded field. 
To update a field within an embedded document, use the dot notation. When using the dot notation, enclose the whole 
dotted field name in quotes. 
The following updates the model field within the embedded details document. 
db.inventory.update( 
{ item: "ABC1" }, 
{ $set: { "details.model": "14Q2" } } 
) 
The update operation returns a WriteResult object which contains the status of the operation. A successful update 
of the document returns the following object: 
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 }) 
Step 3: Update multiple documents. 
By default, the update() method updates a single document. To update multiple documents, use the multi option 
in the update() method. 
Update the category field to "apparel" and update the lastModified field to the current date for all documents 
that have the category field equal to "clothing". 
db.inventory.update( 
   { category: "clothing" }, 
   { 
     $set: { category: "apparel" }, 
     $currentDate: { lastModified: true } 
   }, 
   { multi: true } 
) 
The update operation returns a WriteResult object which contains the status of the operation. A successful update 
of the document returns the following object: 
WriteResult({ "nMatched" : 3, "nUpserted" : 0, "nModified" : 3 }) 
Replace the Document 
To replace the entire content of a document except for the _id field, pass an entirely new document as the second 
argument to update(). 
The replacement document can have different fields from the original document. In the replacement document, you 
can omit the _id field since the _id field is immutable. If you do include the _id field, it must be the same value as 
the existing value. 
Step 1: Replace a document. 
The following operation replaces the document with item equal to "BE10". The newly replaced document will only 
contain the _id field and the fields in the replacement document. 
db.inventory.update( 
   { item: "BE10" }, 
   { 
     item: "BE05", 
     stock: [ { size: "S", qty: 20 }, { size: "M", qty: 5 } ], 
     category: "apparel" 
   } 
) 
The update operation returns a WriteResult object which contains the status of the operation. A successful update 
of the document returns the following object: 
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 }) 
upsert Option 
By default, if no document matches the update query, the update() method does nothing. 
However, by specifying upsert: true, the update() method either updates the matching document or documents, or 
inserts a new document using the update specification if no matching document exists. 
Step 1: Specify upsert: true for the update replacement operation. 
When you specify upsert: true for an update operation to replace a document and no matching documents 
are found, MongoDB creates a new document using the equality conditions in the update conditions document, and 
replaces this document, except for the _id field if specified, with the update document. 
The following operation either updates a matching document by replacing it with a new document or adds a new 
document if no matching document exists. 
db.inventory.update( 
   { item: "TBD1" }, 
   { 
     item: "TBD1", 
     details: { "model" : "14Q4", "manufacturer" : "ABC Company" }, 
     stock: [ { "size" : "S", "qty" : 25 } ], 
     category: "houseware" 
   }, 
   { upsert: true } 
) 
The update operation returns a WriteResult object which contains the status of the operation, including whether 
the db.collection.update() method modified an existing document or added a new document. 
WriteResult({ 
   "nMatched" : 0, 
   "nUpserted" : 1, 
   "nModified" : 0, 
   "_id" : ObjectId("53dbd684babeaec6342ed6c7") 
}) 
The nMatched field shows that the operation matched 0 documents. 
The nUpserted of 1 shows that the update added a document. 
The nModified of 0 specifies that no existing documents were updated. 
The _id field shows the generated _id field for the added document. 
Step 2: Specify upsert: true for the update specific fields operation. 
When you specify upsert: true for an update operation that modifies specific fields and no matching documents 
are found, MongoDB creates a new document using the equality conditions in the update conditions document, 
and applies the modification as specified in the update document. 
The following update operation either updates specific fields of a matching document or adds a new document if no 
matching document exists. 
db.inventory.update( 
   { item: "TBD2" }, 
   { 
     $set: { 
       details: { "model" : "14Q3", "manufacturer" : "IJK Co." }, 
       category: "houseware" 
     } 
   }, 
   { upsert: true } 
) 
The update operation returns a WriteResult object which contains the status of the operation, including whether 
the db.collection.update() method modified an existing document or added a new document. 
WriteResult({ 
   "nMatched" : 0, 
   "nUpserted" : 1, 
   "nModified" : 0, 
   "_id" : ObjectId("53dbd7c8babeaec6342ed6c8") 
}) 
The nMatched field shows that the operation matched 0 documents. 
The nUpserted of 1 shows that the update added a document. 
The nModified of 0 specifies that no existing documents were updated. 
The _id field shows the generated _id field for the added document. 
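The update behaviors described in this tutorial ($set, $currentDate, the multi option, and upsert seeding a new document from the equality conditions) can be sketched with a minimal in-memory model. This is a simplification written for illustration, not the server's implementation; for example, it counts every matched document as modified.

```javascript
// Minimal in-memory sketch of db.collection.update() semantics; not the real server.
function matches(doc, query) {
  return Object.keys(query).every(function (k) { return doc[k] === query[k]; });
}

function applyOperators(doc, update) {
  for (const field in update.$set || {}) doc[field] = update.$set[field];
  for (const field in update.$currentDate || {}) doc[field] = new Date();
  return doc;
}

function update(coll, query, update_, options) {
  options = options || {};
  let nMatched = 0, nModified = 0, nUpserted = 0;
  for (const doc of coll) {
    if (!matches(doc, query)) continue;
    nMatched++;
    applyOperators(doc, update_);
    nModified++;                 // simplification: counts every match as modified
    if (!options.multi) break;   // single-document update by default
  }
  if (nMatched === 0 && options.upsert) {
    // seed the new document from the equality conditions, then apply operators
    const newDoc = applyOperators(Object.assign({}, query), update_);
    coll.push(newDoc);
    nUpserted = 1;
  }
  return { nMatched: nMatched, nUpserted: nUpserted, nModified: nModified };
}

const inventory = [
  { item: "MNO2", category: "clothing" },
  { item: "ABC1", category: "clothing" }
];

const r1 = update(inventory, { category: "clothing" },
                  { $set: { category: "apparel" } }, { multi: true });
const r2 = update(inventory, { item: "TBD2" },
                  { $set: { category: "houseware" } }, { upsert: true });
```

r1 reports nMatched and nModified of 2 because multi: true visits both matching documents; r2 matches nothing and so upserts a new document seeded with item: "TBD2".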
Additional Examples and Methods 
For more examples, see Update examples in the db.collection.update() reference page. 
The db.collection.findAndModify() and the db.collection.save() methods can also modify existing 
documents or insert a new one. See the individual reference pages for the methods for more information and 
examples. 
3.3.7 Remove Documents 
In MongoDB, the db.collection.remove() method removes documents from a collection. You can remove 
all documents from a collection, remove all documents that match a condition, or limit the operation to remove just a 
single document. 
This tutorial provides examples of remove operations using the db.collection.remove() method in the mongo 
shell. 
Remove All Documents 
To remove all documents from a collection, pass an empty query document {} to the remove() method. The 
remove() method does not remove the indexes. 
The following example removes all documents from the inventory collection: 
db.inventory.remove({}) 
To remove all documents from a collection, it may be more efficient to use the drop() method to drop the entire 
collection, including the indexes, and then recreate the collection and rebuild the indexes. 
Remove Documents that Match a Condition 
To remove the documents that match a deletion criteria, call the remove() method with the <query> parameter. 
The following example removes all documents from the inventory collection where the type field equals food: 
db.inventory.remove( { type : "food" } ) 
For large deletion operations, it may be more efficient to copy the documents that you want to keep to a new collection 
and then use drop() on the original collection. 
Remove a Single Document that Matches a Condition 
To remove a single document, call the remove() method with the justOne parameter set to true or 1. 
The following example removes one document from the inventory collection where the type field equals food: 
db.inventory.remove( { type : "food" }, 1 ) 
To delete a single document sorted by some specified order, use the findAndModify() method. 
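A minimal in-memory sketch of these remove semantics, assuming a plain JavaScript array in place of a collection (the remove helper here is an invention for illustration, not the shell method):

```javascript
// Sketch of remove() semantics: an empty query matches every document,
// and justOne limits the operation to the first matching document.
function remove(coll, query, justOne) {
  let nRemoved = 0;
  for (let i = 0; i < coll.length; ) {
    const doc = coll[i];
    const isMatch = Object.keys(query).every(function (k) { return doc[k] === query[k]; });
    if (isMatch) {
      coll.splice(i, 1);     // delete in place; do not advance the index
      nRemoved++;
      if (justOne) break;
    } else {
      i++;
    }
  }
  return { nRemoved: nRemoved };
}

const inv = [
  { _id: 1, type: "food" },
  { _id: 2, type: "food" },
  { _id: 3, type: "tools" }
];

const rOne = remove(inv, { type: "food" }, true);  // removes only the first match
const rAll = remove(inv, {}, false);               // empty query removes the rest
```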
3.3.8 Perform Two Phase Commits 
Synopsis 
This document provides a pattern for doing multi-document updates or “multi-document transactions” using a two-phase 
commit approach for writing data to multiple documents. Additionally, you can extend this process to provide 
a rollback-like (page 106) functionality. 
Background 
Operations on a single document are always atomic with MongoDB databases; however, operations that involve multiple 
documents, which are often referred to as “multi-document transactions”, are not atomic. Since documents can be 
fairly complex and contain multiple “nested” documents, single-document atomicity provides necessary support for 
many practical use cases. 
Despite the power of single-document atomic operations, there are cases that require multi-document transactions. 
When executing a transaction composed of sequential operations, certain issues arise, such as: 
• Atomicity: if one operation fails, the previous operation within the transaction must “rollback” to the previous 
state (i.e. the “nothing,” in “all or nothing”). 
• Consistency: if a major failure (i.e. network, hardware) interrupts the transaction, the database must be able to 
recover a consistent state. 
For situations that require multi-document transactions, you can implement two-phase commit in your application to 
provide support for these kinds of multi-document updates. Using two-phase commit ensures that data is consistent 
and, in case of an error, the state that preceded the transaction is recoverable (page 106). During the procedure, 
however, documents can represent pending data and states. 
Note: Because only single-document operations are atomic with MongoDB, two-phase commits can only offer 
transaction-like semantics. It is possible for applications to return intermediate data at intermediate points during the 
two-phase commit or rollback. 
Pattern 
Overview 
Consider a scenario where you want to transfer funds from account A to account B. In a relational database system, 
you can subtract the funds from A and add the funds to B in a single multi-statement transaction. In MongoDB, you 
can emulate a two-phase commit to achieve a comparable result. 
The examples in this tutorial use the following two collections: 
1. A collection named accounts to store account information. 
2. A collection named transactions to store information on the fund transfer transactions. 
Initialize Source and Destination Accounts 
Insert into the accounts collection a document for account A and a document for account B. 
db.accounts.insert( 
[ 
{ _id: "A", balance: 1000, pendingTransactions: [] }, 
{ _id: "B", balance: 1000, pendingTransactions: [] } 
] 
) 
The operation returns a BulkWriteResult() object with the status of the operation. Upon successful insert, the 
BulkWriteResult() has nInserted set to 2. 
Initialize Transfer Record 
For each fund transfer to perform, insert into the transactions collection a document with the transfer information. 
The document contains the following fields: 
• source and destination fields, which refer to the _id fields from the accounts collection, 
• value field, which specifies the amount of transfer affecting the balance of the source and 
destination accounts, 
• state field, which reflects the current state of the transfer. The state field can have the value of initial, 
pending, applied, done, canceling, and canceled. 
• lastModified field, which reflects last modification date. 
To initialize the transfer of 100 from account A to account B, insert into the transactions collection a document 
with the transfer information, the transaction state of "initial", and the lastModified field set to the current 
date: 
db.transactions.insert( 
{ _id: 1, source: "A", destination: "B", value: 100, state: "initial", lastModified: new Date() } 
) 
The operation returns a WriteResult() object with the status of the operation. Upon successful insert, the 
WriteResult() object has nInserted set to 1. 
Transfer Funds Between Accounts Using Two-Phase Commit 
Step 1: Retrieve the transaction to start. From the transactions collection, find a transaction in the initial 
state. Currently the transactions collection has only one document, namely the one added in the Initialize 
Transfer Record (page 103) step. If the collection contains additional documents, the query will return any transaction 
with an initial state unless you specify additional query conditions. 
var t = db.transactions.findOne( { state: "initial" } ) 
Type the variable t in the mongo shell to print the contents of the variable. The operation should print a document 
similar to the following except the lastModified field should reflect date of your insert operation: 
{ "_id" : 1, "source" : "A", "destination" : "B", "value" : 100, "state" : "initial", "lastModified" : ISODate("...") } 
Step 2: Update transaction state to pending. Set the transaction state from initial to pending and use the 
$currentDate operator to set the lastModified field to the current date. 
db.transactions.update( 
   { _id: t._id, state: "initial" }, 
   { 
     $set: { state: "pending" }, 
     $currentDate: { lastModified: true } 
   } 
) 
The operation returns a WriteResult() object with the status of the operation. Upon successful update, nMatched 
and nModified display 1. 
In the update statement, the state: "initial" condition ensures that no other process has already updated this 
record. If nMatched and nModified are 0, go back to the first step to get a different transaction and restart the 
procedure. 
Step 3: Apply the transaction to both accounts. Apply the transaction t to both accounts using the update() 
method if the transaction has not been applied to the accounts. In the update condition, include the condition 
pendingTransactions: { $ne: t._id } in order to avoid re-applying the transaction if the step is run 
more than once. 
To apply the transaction to the account, update both the balance field and the pendingTransactions field. 
Update the source account, subtracting from its balance the transaction value and adding to its 
pendingTransactions array the transaction _id. 
db.accounts.update( 
{ _id: t.source, pendingTransactions: { $ne: t._id } }, 
{ $inc: { balance: -t.value }, $push: { pendingTransactions: t._id } } 
) 
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1. 
Update the destination account, adding to its balance the transaction value and adding to its 
pendingTransactions array the transaction _id. 
db.accounts.update( 
{ _id: t.destination, pendingTransactions: { $ne: t._id } }, 
{ $inc: { balance: t.value }, $push: { pendingTransactions: t._id } } 
) 
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1. 
Step 4: Update transaction state to applied. Use the following update() operation to set the transaction’s 
state to applied and update the lastModified field: 
db.transactions.update( 
   { _id: t._id, state: "pending" }, 
   { 
     $set: { state: "applied" }, 
     $currentDate: { lastModified: true } 
   } 
) 
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1. 
Step 5: Update both accounts’ list of pending transactions. Remove the applied transaction _id from the 
pendingTransactions array for both accounts. 
Update the source account. 
db.accounts.update( 
{ _id: t.source, pendingTransactions: t._id }, 
{ $pull: { pendingTransactions: t._id } } 
) 
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1. 
Update the destination account. 
db.accounts.update( 
{ _id: t.destination, pendingTransactions: t._id }, 
{ $pull: { pendingTransactions: t._id } } 
) 
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1. 
Step 6: Update transaction state to done. Complete the transaction by setting the state of the transaction to 
done and updating the lastModified field: 
db.transactions.update( 
   { _id: t._id, state: "applied" }, 
   { 
     $set: { state: "done" }, 
     $currentDate: { lastModified: true } 
   } 
) 
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1. 
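The six steps above can be walked through end to end against plain in-memory arrays. The findOne helper and the direct state checks below are stand-ins for the corresponding update conditions; this is a sketch of the pattern's logic, not code that talks to a MongoDB server.

```javascript
// End-to-end sketch of the two-phase commit steps, on in-memory arrays.
function findOne(coll, pred) { return coll.find(pred) || null; }

const accounts = [
  { _id: "A", balance: 1000, pendingTransactions: [] },
  { _id: "B", balance: 1000, pendingTransactions: [] }
];
const transactions = [
  { _id: 1, source: "A", destination: "B", value: 100,
    state: "initial", lastModified: new Date() }
];

// Step 1: retrieve a transaction in the "initial" state
const t = findOne(transactions, function (x) { return x.state === "initial"; });

// Step 2: initial -> pending (the state check plays the role of the
// { _id: t._id, state: "initial" } update condition)
if (t.state === "initial") { t.state = "pending"; t.lastModified = new Date(); }

// Step 3: apply to both accounts; the pendingTransactions membership test
// mirrors the { $ne: t._id } condition that makes this step idempotent
const src = findOne(accounts, function (a) { return a._id === t.source; });
const dst = findOne(accounts, function (a) { return a._id === t.destination; });
if (src.pendingTransactions.indexOf(t._id) === -1) {
  src.balance -= t.value; src.pendingTransactions.push(t._id);
}
if (dst.pendingTransactions.indexOf(t._id) === -1) {
  dst.balance += t.value; dst.pendingTransactions.push(t._id);
}

// Step 4: pending -> applied
if (t.state === "pending") { t.state = "applied"; t.lastModified = new Date(); }

// Step 5: clear the applied transaction from both accounts' pending lists
src.pendingTransactions = src.pendingTransactions.filter(function (id) { return id !== t._id; });
dst.pendingTransactions = dst.pendingTransactions.filter(function (id) { return id !== t._id; });

// Step 6: applied -> done
if (t.state === "applied") { t.state = "done"; t.lastModified = new Date(); }
```

After the run, 100 has moved from A to B, both pending lists are empty, and the transaction is in the done state.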
Recovering from Failure Scenarios 
The most important part of the transaction procedure is not the prototypical example above, but rather the possibility 
for recovering from the various failure scenarios when transactions do not complete successfully. This section presents 
an overview of possible failures and provides steps to recover from these kinds of events. 
Recovery Operations 
The two-phase commit pattern allows applications running the sequence to resume the transaction and arrive at a 
consistent state. Run the recovery operations at application startup, and possibly at regular intervals, to catch any 
unfinished transactions. 
The time required to reach a consistent state depends on how long the application needs to recover each transaction. 
The following recovery procedures use the lastModified date as an indicator of whether a pending transaction 
requires recovery; specifically, if a pending or applied transaction has not been updated in the last 30 minutes, 
the procedures determine that these transactions require recovery. You can use different conditions to make this 
determination. 
Transactions in Pending State To recover from failures that occur after the “Update transaction state to pending. 
(page ??)” step but before the “Update transaction state to applied. (page ??)” step, retrieve from the transactions 
collection a pending transaction for recovery: 
var dateThreshold = new Date(); 
dateThreshold.setMinutes(dateThreshold.getMinutes() - 30); 
var t = db.transactions.findOne( { state: "pending", lastModified: { $lt: dateThreshold } } ); 
And resume from the “Apply the transaction to both accounts. (page ??)” step. 
Transactions in Applied State To recover from failures that occur after the “Update transaction state to applied. 
(page ??)” step but before the “Update transaction state to done. (page ??)” step, retrieve from the transactions 
collection an applied transaction for recovery: 
var dateThreshold = new Date(); 
dateThreshold.setMinutes(dateThreshold.getMinutes() - 30); 
var t = db.transactions.findOne( { state: "applied", lastModified: { $lt: dateThreshold } } ); 
And resume from the “Update both accounts’ list of pending transactions. (page ??)” step. 
Rollback Operations 
In some cases, you may need to “roll back” or undo a transaction; e.g., if the application needs to “cancel” the 
transaction or if one of the accounts does not exist or stops existing during the transaction. 
Transactions in Applied State After the “Update transaction state to applied. (page ??)” step, you should not roll 
back the transaction. Instead, complete that transaction and create a new transaction to reverse it by 
switching the values in the source and the destination fields. 
Transactions in Pending State After the “Update transaction state to pending. (page ??)” step, but before the 
“Update transaction state to applied. (page ??)” step, you can roll back the transaction using the following procedure: 
Step 1: Update transaction state to canceling. Update the transaction state from pending to canceling. 
db.transactions.update( 
   { _id: t._id, state: "pending" }, 
   { 
     $set: { state: "canceling" }, 
     $currentDate: { lastModified: true } 
   } 
) 
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1. 
Step 2: Undo the transaction on both accounts. To undo the transaction on both accounts, reverse the transaction 
t if the transaction has been applied. In the update condition, include the condition pendingTransactions: 
t._id in order to update the account only if the pending transaction has been applied. 
Update the destination account, subtracting from its balance the transaction value and removing the transaction 
_id from the pendingTransactions array. 
db.accounts.update( 
   { _id: t.destination, pendingTransactions: t._id }, 
   { 
     $inc: { balance: -t.value }, 
     $pull: { pendingTransactions: t._id } 
   } 
) 
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 
1. If the pending transaction has not been previously applied to this account, no document will match the update 
condition and nMatched and nModified will be 0. 
Update the source account, adding to its balance the transaction value and removing the transaction _id from 
the pendingTransactions array. 
db.accounts.update( 
   { _id: t.source, pendingTransactions: t._id }, 
   { 
     $inc: { balance: t.value }, 
     $pull: { pendingTransactions: t._id } 
   } 
) 
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 
1. If the pending transaction has not been previously applied to this account, no document will match the update 
condition and nMatched and nModified will be 0. 
Step 3: Update transaction state to canceled. To finish the rollback, update the transaction state from 
canceling to canceled. 
db.transactions.update( 
   { _id: t._id, state: "canceling" }, 
   { 
     $set: { state: "canceled" }, 
     $currentDate: { lastModified: true } 
   } 
) 
Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1. 
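The rollback procedure can be sketched the same way, again with plain objects standing in for documents. The membership test on pendingTransactions mirrors the pendingTransactions: t._id update condition, so an account to which the pending transaction was never applied is skipped.

```javascript
// Sketch of the pending-state rollback: a pending transaction that was applied
// to both accounts is reversed and marked canceled. In-memory stand-ins only.
const accts = [
  { _id: "A", balance: 900, pendingTransactions: [2] },
  { _id: "B", balance: 1100, pendingTransactions: [2] }
];
const txn = { _id: 2, source: "A", destination: "B", value: 100, state: "pending" };

// Step 1: pending -> canceling
if (txn.state === "pending") txn.state = "canceling";

// Step 2: undo on both accounts, but only where the transaction was applied
const d = accts.find(function (a) { return a._id === txn.destination; });
if (d.pendingTransactions.indexOf(txn._id) !== -1) {
  d.balance -= txn.value;
  d.pendingTransactions = d.pendingTransactions.filter(function (id) { return id !== txn._id; });
}
const s = accts.find(function (a) { return a._id === txn.source; });
if (s.pendingTransactions.indexOf(txn._id) !== -1) {
  s.balance += txn.value;
  s.pendingTransactions = s.pendingTransactions.filter(function (id) { return id !== txn._id; });
}

// Step 3: canceling -> canceled
if (txn.state === "canceling") txn.state = "canceled";
```

Both balances return to their pre-transaction values and the transaction ends in the canceled state.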
Multiple Applications 
Transactions exist, in part, so that multiple applications can create and run operations concurrently without causing 
data inconsistency or conflicts. In our procedure, to update or retrieve the transaction document, the update conditions 
include a condition on the state field to prevent reapplication of the transaction by multiple applications. 
For example, applications App1 and App2 both grab the same transaction, which is in the initial state. App1 
applies the whole transaction before App2 starts. When App2 attempts to perform the “Update transaction state to 
pending. (page ??)” step, the update condition, which includes the state: "initial" criterion, will not match 
any document, and the nMatched and nModified will be 0. This should signal to App2 to go back to the first step 
to restart the procedure with a different transaction. 
When multiple applications are running, it is crucial that only one application can handle a given transaction at any 
point in time. As such, in addition to including the expected state of the transaction in the update condition, you can 
also create a marker in the transaction document itself to identify the application that is handling the transaction. Use 
the findAndModify() method to modify the transaction and get it back in one step: 
t = db.transactions.findAndModify( 
   { 
     query: { state: "initial", application: { $exists: false } }, 
     update: { 
       $set: { state: "pending", application: "App1" }, 
       $currentDate: { lastModified: true } 
     }, 
     new: true 
   } 
) 
Amend the transaction operations to ensure that only applications that match the identifier in the application field 
apply the transaction. 
If the application App1 fails during transaction execution, you can use the recovery procedures (page 106), but applications 
should ensure that they “own” the transaction before applying it. For example, to find and resume 
the pending job, use a query that resembles the following: 
var dateThreshold = new Date(); 
dateThreshold.setMinutes(dateThreshold.getMinutes() - 30); 
db.transactions.find( 
   { 
     application: "App1", 
     state: "pending", 
     lastModified: { $lt: dateThreshold } 
   } 
) 
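The claim step can be sketched as a compare-and-set in plain JavaScript: two applications race for the same transaction, and the state check plus the application marker ensure exactly one wins. The claim helper is an invention for illustration, not a driver method.

```javascript
// Sketch of the claim step: mirrors findAndModify with the query
// { state: "initial", application: { $exists: false } }.
function claim(txn, appName) {
  if (txn.state === "initial" && txn.application === undefined) {
    txn.state = "pending";
    txn.application = appName;
    return txn;          // claimed: the caller now owns the transaction
  }
  return null;           // nMatched 0: caller must pick another transaction
}

const sharedTxn = { _id: 1, state: "initial" };
const app1Result = claim(sharedTxn, "App1");  // App1 gets there first
const app2Result = claim(sharedTxn, "App2");  // App2 finds nothing to match
```

App2's failed claim corresponds to nMatched and nModified of 0, signaling it to restart the procedure with a different transaction.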
Using Two-Phase Commits in Production Applications 
The example transaction above is intentionally simple. For example, it assumes that it is always possible to roll back 
operations to an account and that account balances can hold negative values. 
Production implementations would likely be more complex. Typically, accounts need information about current balance, 
pending credits, and pending debits. 
For all transactions, ensure that you use a level of write concern appropriate for your deployment. 
3.3.9 Create Tailable Cursor 
Overview 
By default, MongoDB will automatically close a cursor when the client has exhausted all results in the cursor. However, 
for capped collections (page 196) you may use a Tailable Cursor that remains open after the client exhausts 
the results in the initial cursor. Tailable cursors are conceptually equivalent to the tail Unix command with the -f 
option (i.e. with “follow” mode). After clients insert additional documents into a capped collection, the tailable 
cursor will continue to retrieve documents. 
Use tailable cursors on capped collections that have high write volumes where indexes aren’t practical. For instance, 
MongoDB replication (page 503) uses tailable cursors to tail the primary’s oplog. 
Note: If your query is on an indexed field, do not use tailable cursors, but instead, use a regular cursor. Keep track of 
the last value of the indexed field returned by the query. To retrieve the newly added documents, query the collection 
again using the last value of the indexed field in the query criteria, as in the following example: 
db.<collection>.find( { indexedField: { $gt: <lastvalue> } } ) 
Consider the following behaviors related to tailable cursors: 
• Tailable cursors do not use indexes and return documents in natural order. 
• Because tailable cursors do not use indexes, the initial scan for the query may be expensive; but, after initially 
exhausting the cursor, subsequent retrievals of the newly added documents are inexpensive. 
• Tailable cursors may become dead, or invalid, if either: 
– the query returns no match. 
– the cursor returns the document at the “end” of the collection and then the application deletes those documents. 
A dead cursor has an id of 0. 
See your driver documentation for the driver-specific method to specify the tailable cursor. For more information 
on the details of specifying a tailable cursor, see the MongoDB wire protocol [12] documentation. 
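The indexed-field alternative described in the note above can be sketched without a database: remember the last indexedField value returned and re-query with $gt. The in-memory collection and poll helper are inventions for this example.

```javascript
// Sketch of the indexed-field polling alternative to a tailable cursor.
const collection = [];
let lastValue = -Infinity;

function poll() {
  // stands in for db.<collection>.find( { indexedField: { $gt: lastValue } } )
  const fresh = collection.filter(function (d) { return d.indexedField > lastValue; });
  if (fresh.length > 0) lastValue = fresh[fresh.length - 1].indexedField;
  return fresh;
}

collection.push({ indexedField: 1 }, { indexedField: 2 });
const firstBatch = poll();   // returns both documents

collection.push({ indexedField: 3 });
const secondBatch = poll();  // returns only the newly added document
```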
C++ Example 
The tail function uses a tailable cursor to output the results from a query to a capped collection: 
• The function handles the case of the dead cursor by having the query be inside a loop. 
• To periodically check for new data, the cursor->more() statement is also inside a loop. 
[12] http://docs.mongodb.org/meta-driver/latest/legacy/mongodb-wire-protocol 
#include "client/dbclient.h" 
using namespace mongo; 

/* 
 * Example of a tailable cursor. 
 * The function "tails" the capped collection (ns) and output elements as they are added. 
 * The function also handles the possibility of a dead cursor by tracking the field 'insertDate'. 
 * New documents are added with increasing values of 'insertDate'. 
 */ 
void tail(DBClientBase& conn, const char *ns) { 

    BSONElement lastValue = minKey.firstElement(); 
    Query query = Query().hint( BSON( "$natural" << 1 ) ); 

    while ( 1 ) { 
        auto_ptr<DBClientCursor> c = 
            conn.query(ns, query, 0, 0, 0, 
                       QueryOption_CursorTailable | QueryOption_AwaitData ); 

        while ( 1 ) { 
            if ( !c->more() ) { 
                if ( c->isDead() ) { 
                    break; 
                } 
                continue; 
            } 

            BSONObj o = c->next(); 
            lastValue = o["insertDate"]; 
            cout << o.toString() << endl; 
        } 

        query = QUERY( "insertDate" << GT << lastValue ).hint( BSON( "$natural" << 1 ) ); 
    } 
} 
The tail function performs the following actions: 
• Initialize the lastValue variable, which tracks the last accessed value. The function will use the lastValue 
if the cursor becomes invalid and tail needs to restart the query. Use hint() to ensure that the query uses 
the $natural order. 
• In an outer while(1) loop, 
– Query the capped collection and return a tailable cursor that blocks for several seconds waiting for new 
documents 
auto_ptr<DBClientCursor> c = 
conn.query(ns, query, 0, 0, 0, 
QueryOption_CursorTailable | QueryOption_AwaitData ); 
* Specify the capped collection using ns as an argument to the function. 
* Set the QueryOption_CursorTailable option to create a tailable cursor. 
* Set the QueryOption_AwaitData option so that the returned cursor blocks for a few seconds to 
wait for data. 
– In an inner while (1) loop, read the documents from the cursor: 
* If the cursor has no more documents and is not invalid, loop the inner while loop to recheck for 
more documents. 
* If the cursor has no more documents and is dead, break the inner while loop. 
* If the cursor has documents: 
· output the document, 
· update the lastValue value, 
· and loop the inner while (1) loop to recheck for more documents. 
– If the logic breaks out of the inner while (1) loop and the cursor is invalid: 
* Use the lastValue value to create a new query condition that matches documents added after the 
lastValue. Explicitly ensure $natural order with the hint() method: 
query = QUERY( "insertDate" << GT << lastValue ).hint( BSON( "$natural" << 1 ) ); 
* Loop through the outer while (1) loop to re-query with the new query condition and repeat. 
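The resume step above can be sketched independently of the C++ driver. The following plain JavaScript model (not driver code; the capped collection is modeled as an array, and the field names are taken from the example above) shows how restarting the query from the last seen insertDate skips documents the application already read:

```javascript
// Model of restarting a tailable query from the last value seen.
// 'collection' stands in for a capped collection in natural order;
// each document carries an increasing 'insertDate' value.
function resumeQuery(collection, lastValue) {
  // Equivalent of QUERY( "insertDate" << GT << lastValue ).
  return collection.filter(function (doc) {
    return doc.insertDate > lastValue;
  });
}

var capped = [
  { insertDate: 1, msg: "a" },
  { insertDate: 2, msg: "b" },
  { insertDate: 3, msg: "c" }
];

// After reading through insertDate 2, a restarted query returns only
// the documents added afterward.
var resumed = resumeQuery(capped, 2);
console.log(resumed.length);   // 1
console.log(resumed[0].msg);   // "c"
```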
See also: 
Detailed blog post on tailable cursors [13] 
3.3.10 Isolate Sequence of Operations 
Overview 
Write operations are atomic on the level of a single document: no single write operation can atomically affect more 
than one document or more than one collection. 
When a single write operation modifies multiple documents, the operation as a whole is not atomic, and other 
operations may interleave. The modification of a single document, or record, is always atomic, even if the write operation 
modifies multiple sub-documents within the single record. 
No other operations are atomic; however, you can isolate a single write operation that affects multiple documents 
using the isolation operator. 
This document describes one method of updating documents only if the local copy of the document reflects the current 
state of the document in the database. In addition, the following methods provide a way to manage isolated sequences 
of operations: 
• findAndModify() provides an isolated update-and-return operation. 
• Perform Two Phase Commits (page 102) 
• Create a unique index (page 457), to ensure that a key doesn’t exist when you insert it. 
[13] http://shtylman.com/post/the-tail-of-mongodb 
Update if Current 
In this pattern, you will: 
• query for a document, 
• modify the fields in that document 
• and update the fields of a document only if the fields have not changed in the collection since the query. 
Consider the following example in JavaScript which attempts to update the qty field of a document in the products 
collection: 
Changed in version 2.6: The db.collection.update() method now returns a WriteResult() object that 
contains the status of the operation. Previous versions required an extra db.getLastErrorObj() method call. 
var myCollection = db.products; 
var myDocument = myCollection.findOne( { sku: 'abc123' } ); 
if (myDocument) { 
var oldQty = myDocument.qty; 
if (myDocument.qty < 10) { 
myDocument.qty *= 4; 
} else if ( myDocument.qty < 20 ) { 
myDocument.qty *= 3; 
} else { 
myDocument.qty *= 2; 
} 
var results = myCollection.update( 
{ 
_id: myDocument._id, 
qty: oldQty 
}, 
{ 
$set: { qty: myDocument.qty } 
} 
); 
if ( results.hasWriteError() ) { 
print("unexpected error updating document: " + tojson( results )); 
} else if ( results.nMatched == 0 ) { 
print("No update: no matching document for { _id: " + myDocument._id + ", qty: " + oldQty + " }"); 
} 
Your application may require some modifications of this pattern, such as: 
• Use the entire document as the query in the update() operation, to generalize the operation and guarantee 
that the original document was not modified, rather than ensuring that a single field was not changed. 
• Add a version variable to the document that applications increment upon each update operation to the document. 
Use this version variable in the query expression. You must be able to ensure that all clients that connect to your 
database obey this constraint. 
• Use $set in the update expression to modify only your fields and prevent overriding other fields. 
• Use one of the methods described in Create an Auto-Incrementing Sequence Field (page 113). 
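The compare-and-set nature of this pattern can be illustrated outside the shell. This plain JavaScript sketch (not a driver API; the document and field names are illustrative) models an update that applies only when the stored qty still matches the value the client originally read:

```javascript
// Model of the "update if current" check: the update succeeds only when
// the document's qty still equals the value the client read earlier,
// mirroring the { _id: ..., qty: oldQty } query condition above.
function updateIfCurrent(doc, expectedQty, newQty) {
  if (doc.qty !== expectedQty) {
    return { nMatched: 0 };   // another writer changed qty first
  }
  doc.qty = newQty;
  return { nMatched: 1 };
}

var product = { _id: 1, sku: "abc123", qty: 5 };

// The first writer read qty 5 and wins; a second, stale writer matches nothing.
console.log(updateIfCurrent(product, 5, 20).nMatched); // 1
console.log(updateIfCurrent(product, 5, 15).nMatched); // 0
console.log(product.qty);                              // 20
```

An application that sees nMatched: 0 would typically re-read the document and retry.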
3.3.11 Create an Auto-Incrementing Sequence Field 
Synopsis 
MongoDB reserves the _id field in the top level of all documents as a primary key. _id must be unique, and always 
has an index with a unique constraint (page 457). However, except for the unique constraint, you can use any value for 
the _id field in your collections. This tutorial describes two methods for creating an incrementing sequence number 
for the _id field using the following: 
• Use Counters Collection (page 113) 
• Optimistic Loop (page 115) 
Considerations 
Generally in MongoDB, you would not use an auto-increment pattern for the _id field, or any field, because it does 
not scale for databases with large numbers of documents. Typically the default ObjectId value is better suited for the 
_id field. 
Procedures 
Use Counters Collection 
Counter Collection Implementation Use a separate counters collection to track the last number sequence used. 
The _id field contains the sequence name and the seq field contains the last value of the sequence. 
1. Insert the initial value for the userid sequence into the counters collection: 
db.counters.insert( 
{ 
_id: "userid", 
seq: 0 
} 
) 
2. Create a getNextSequence function that accepts a name of the sequence. The function uses the 
findAndModify() method to atomically increment the seq value and return this new value: 
function getNextSequence(name) { 
var ret = db.counters.findAndModify( 
{ 
query: { _id: name }, 
update: { $inc: { seq: 1 } }, 
new: true 
} 
); 
return ret.seq; 
} 
3. Use this getNextSequence() function during insert(). 
db.users.insert( 
{ 
_id: getNextSequence("userid"), 
name: "Sarah C." 
} 
) 
db.users.insert( 
{ 
_id: getNextSequence("userid"), 
name: "Bob D." 
} 
) 
You can verify the results with find(): 
db.users.find() 
The _id fields contain incrementing sequence values: 
{ 
_id : 1, 
name : "Sarah C." 
} 
{ 
_id : 2, 
name : "Bob D." 
} 
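The counters pattern can be sketched in plain JavaScript. This model is single-threaded, so the increment is trivially atomic; in MongoDB the atomicity comes from findAndModify(). The counters collection is modeled here as an object keyed by sequence name:

```javascript
// In-memory model of the counters collection: _id -> seq.
var counters = { userid: 0 };

// Analogue of findAndModify with update: { $inc: { seq: 1 } } and new: true:
// increment the named counter and return the new value.
function getNextSequence(name) {
  counters[name] += 1;
  return counters[name];
}

var users = [];
users.push({ _id: getNextSequence("userid"), name: "Sarah C." });
users.push({ _id: getNextSequence("userid"), name: "Bob D." });

console.log(users[0]._id); // 1
console.log(users[1]._id); // 2
```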
findAndModify Behavior When findAndModify() includes the upsert: true option and the query 
field or fields are not uniquely indexed, the method could insert a document multiple times in certain circumstances. For 
instance, if multiple clients each invoke the method with the same query condition and these methods complete the 
find phase before any of the methods perform the modify phase, these methods could insert the same document. 
In the counters collection example, the query field is the _id field, which always has a unique index. Consider 
that the findAndModify() includes the upsert: true option, as in the following modified example: 
function getNextSequence(name) { 
var ret = db.counters.findAndModify( 
{ 
query: { _id: name }, 
update: { $inc: { seq: 1 } }, 
new: true, 
upsert: true 
} 
); 
return ret.seq; 
} 
If multiple clients were to invoke the getNextSequence() method with the same name parameter, then the 
methods would observe one of the following behaviors: 
• Exactly one findAndModify() would successfully insert a new document. 
• Zero or more findAndModify() methods would update the newly inserted document. 
• Zero or more findAndModify() methods would fail when they attempted to insert a duplicate. 
If the method fails due to a unique index constraint violation, retry the method. Absent a delete of the document, the 
retry should not fail. 
Optimistic Loop 
In this pattern, an Optimistic Loop calculates the incremented _id value and attempts to insert a document with the 
calculated _id value. If the insert is successful, the loop ends. Otherwise, the loop will iterate through possible _id 
values until the insert is successful. 
1. Create a function named insertDocument that performs the “insert if not present” loop. The function wraps 
the insert() method and takes doc and targetCollection arguments. 
Changed in version 2.6: The db.collection.insert() method now returns a WriteResult() object 
that contains the status of the operation. Previous versions required an extra db.getLastErrorObj() 
method call. 
function insertDocument(doc, targetCollection) { 
while (1) { 
var cursor = targetCollection.find( {}, { _id: 1 } ).sort( { _id: -1 } ).limit(1); 
var seq = cursor.hasNext() ? cursor.next()._id + 1 : 1; 
doc._id = seq; 
var results = targetCollection.insert(doc); 
if( results.hasWriteError() ) { 
if( results.writeError.code == 11000 /* dup key */ ) 
continue; 
else 
print( "unexpected error inserting data: " + tojson( results ) ); 
} 
break; 
} 
} 
The while (1) loop performs the following actions: 
• Queries the targetCollection for the document with the maximum _id value. 
• Determines the next sequence value for _id by: 
– adding 1 to the returned _id value if the cursor points to a document, or 
– setting the next sequence value to 1 if the cursor points to no document. 
• For the doc to insert, set its _id field to the calculated sequence value seq. 
• Insert the doc into the targetCollection. 
• If the insert operation errors with a duplicate key, repeat the loop. Otherwise, if the insert operation encounters 
some other error or if the operation succeeds, break out of the loop. 
2. Use the insertDocument() function to perform an insert: 
var myCollection = db.users2; 
insertDocument( 
{ 
name: "Grace H." 
}, 
myCollection 
); 
insertDocument( 
{ 
name: "Ted R." 
}, 
myCollection 
) 
You can verify the results with find(): 
db.users2.find() 
The _id fields contain incrementing sequence values: 
{ 
_id: 1, 
name: "Grace H." 
} 
{ 
_id: 2, 
name: "Ted R." 
} 
The while loop may iterate many times in collections with larger insert volumes. 
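The retry logic of the optimistic loop can be modeled without a server. In this plain JavaScript sketch (not shell code), a Set of taken _id values stands in for the collection, and a failed claim on an _id plays the role of the duplicate-key error (code 11000) that triggers another iteration; the seed values are illustrative:

```javascript
// tryInsert fails with a duplicate-key style result when the _id is taken,
// standing in for error code 11000 from the server.
function tryInsert(ids, doc) {
  if (ids.has(doc._id)) {
    return { hasWriteError: true, code: 11000 };
  }
  ids.add(doc._id);
  return { hasWriteError: false };
}

// In-memory model of the optimistic "insert if not present" loop.
function insertDocument(doc, ids) {
  while (true) {
    // Highest existing _id plus one, or 1 for an empty collection.
    var seq = ids.size ? Math.max.apply(null, Array.from(ids)) + 1 : 1;
    doc._id = seq;
    var result = tryInsert(ids, doc);
    if (result.hasWriteError && result.code === 11000) {
      continue;   // another writer claimed seq first: recompute and retry
    }
    break;
  }
  return doc;
}

var ids = new Set([1, 2]);   // collection already holds _id 1 and 2
console.log(insertDocument({ name: "Grace H." }, ids)._id); // 3
```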
3.3.12 Limit Number of Elements in an Array after an Update 
New in version 2.4. 
Synopsis 
Consider an application where users may submit many scores (e.g. for a test), but the application only needs to track 
the top three test scores. 
This pattern uses the $push operator with the $each, $sort, and $slice modifiers to sort and maintain an array 
of fixed size. 
Important: The array elements must be documents in order to use the $sort modifier. 
Pattern 
Consider the following document in the collection students: 
{ 
_id: 1, 
scores: [ 
{ attempt: 1, score: 10 }, 
{ attempt: 2, score: 8 } 
] 
} 
The following update uses the $push operator with: 
• the $each modifier to append to the array 2 new elements, 
• the $sort modifier to order the elements by ascending (1) score, and 
• the $slice modifier to keep the last 3 elements of the ordered array. 
db.students.update( 
{ _id: 1 }, 
{ $push: { scores: { $each : [ 
{ attempt: 3, score: 7 }, 
{ attempt: 4, score: 4 } 
], 
$sort: { score: 1 }, 
$slice: -3 
} 
} 
} 
) 
Note: When using the $sort modifier on the array element, access the field in the subdocument element directly 
instead of using the dot notation on the array field. 
After the operation, the document contains only the top 3 scores in the scores array: 
{ 
"_id" : 1, 
"scores" : [ 
{ "attempt" : 3, "score" : 7 }, 
{ "attempt" : 2, "score" : 8 }, 
{ "attempt" : 1, "score" : 10 } 
] 
} 
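The combined effect of $each, $sort, and $slice can be reproduced in plain JavaScript to check the expected result. This is a model of the server-side semantics, not shell code; the function name is illustrative:

```javascript
// Model of $push with $each, $sort: { score: 1 }, and $slice: -3:
// append the new elements, sort ascending by score, keep the last 3.
function pushEachSortSlice(arr, each, sliceN) {
  var merged = arr.concat(each);
  merged.sort(function (a, b) { return a.score - b.score; });
  return merged.slice(sliceN); // negative slice keeps the trailing elements
}

var scores = [
  { attempt: 1, score: 10 },
  { attempt: 2, score: 8 }
];

var result = pushEachSortSlice(scores, [
  { attempt: 3, score: 7 },
  { attempt: 4, score: 4 }
], -3);

// The top three scores remain, in ascending order: 7, 8, 10.
console.log(result.map(function (d) { return d.score; }).join(",")); // "7,8,10"
```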
See also: 
• $push operator, 
• $each modifier, 
• $sort modifier, and 
• $slice modifier. 
3.4 MongoDB CRUD Reference 
3.4.1 Query Cursor Methods 
Name Description 
cursor.count() Returns a count of the documents in a cursor. 
cursor.explain() Reports on the query execution plan, including index use, for a cursor. 
cursor.hint() Forces MongoDB to use a specific index for a query. 
cursor.limit() Constrains the size of a cursor’s result set. 
cursor.next() Returns the next document in a cursor. 
cursor.skip() Returns a cursor that begins returning results only after passing or skipping a number of 
documents. 
cursor.sort() Returns results ordered according to a sort specification. 
cursor.toArray() Returns an array that contains all documents returned by the cursor. 
3.4.2 Query and Data Manipulation Collection Methods 
Name Description 
db.collection.count() Wraps count to return a count of the number of documents in a collection or 
matching a query. 
db.collection.distinct() Returns an array of documents that have distinct values for the specified field. 
db.collection.find() Performs a query on a collection and returns a cursor object. 
db.collection.findOne() Performs a query and returns a single document. 
db.collection.insert() Creates a new document in a collection. 
db.collection.remove() Deletes documents from a collection. 
db.collection.save() Provides a wrapper around an insert() and update() to insert new 
documents. 
db.collection.update() Modifies a document in a collection. 
3.4.3 MongoDB CRUD Reference Documentation 
Write Concern Reference (page 118) Configuration options associated with the guarantee MongoDB provides when 
reporting on the success of a write operation. 
SQL to MongoDB Mapping Chart (page 120) An overview of common database operations showing both the 
MongoDB operations and SQL statements. 
The bios Example Collection (page 125) Sample data for experimenting with MongoDB. insert(), update() 
and find() pages use the data for some of their examples. 
Write Concern Reference 
Write concern (page 72) describes the guarantee that MongoDB provides when reporting on the success of a write 
operation. 
Changed in version 2.6: A new protocol for write operations (page 737) integrates write concerns with the write 
operations and eliminates the need to call the getLastError command. Previous versions required a getLastError 
command immediately after a write operation to specify the write concern. 
Read Isolation Behavior 
MongoDB allows clients to read documents inserted or modified before it commits these modifications to disk, 
regardless of write concern level or journaling configuration. As a result, applications may observe two classes of behaviors: 
• For systems with multiple concurrent readers and writers, MongoDB will allow clients to read the results of a 
write operation before the write operation returns. 
• If the mongod terminates before the journal commits, even if a write returns successfully, queries may have 
read data that will not exist after the mongod restarts. 
Other database systems refer to these isolation semantics as read uncommitted. For all inserts and updates, 
MongoDB modifies each document in isolation: clients never see documents in intermediate states. For multi-document 
operations, MongoDB does not provide any multi-document transactions or isolation. 
When mongod returns a successful journaled write concern, the data is fully committed to disk and will be available 
after mongod restarts. 
For replica sets, write operations are durable only after a write replicates and commits to the journal of a majority of 
the members of the set. MongoDB regularly commits data to the journal regardless of journaled write concern: use 
the commitIntervalMs to control how often a mongod commits the journal. 
Available Write Concern 
Write concern can include the w (page 119) option to specify the required number of acknowledgments before 
returning, the j (page 119) option to require writes to the journal before returning, and the wtimeout (page 119) 
option to specify a time limit to prevent write operations from blocking indefinitely. 
In sharded clusters, mongos instances will pass the write concern on to the shard. 
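A write concern combining these options is a plain document passed with a write operation. The following sketch builds one; the collection name and the specific values are illustrative, not a recommendation:

```javascript
// A write concern document combining the options described below:
// w: 2        -> the primary plus at least one secondary must acknowledge
// j: true     -> wait for the journal commit
// wtimeout    -> return an error after 5000 ms if the concern is unmet
var writeConcern = { w: 2, j: true, wtimeout: 5000 };

// In the shell this would accompany a write, for example:
//   db.products.insert(doc, { writeConcern: writeConcern })
console.log(JSON.stringify(writeConcern)); // {"w":2,"j":true,"wtimeout":5000}
```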
w Option The w option provides the ability to disable write concern entirely as well as specify the write concern for 
replica sets. 
MongoDB uses w: 1 as the default write concern. w: 1 provides basic receipt acknowledgment. 
The w option accepts the following values: 
Value Description 
1 Provides acknowledgment of write operations on a standalone mongod or the primary in a 
replica set. 
This is the default write concern for MongoDB. 
0 Disables basic acknowledgment of write operations, but returns information about socket 
exceptions and networking errors to the application. 
If you disable basic write operation acknowledgment but require journal commit 
acknowledgment, the journal commit prevails, and the server will require that mongod 
acknowledge the write operation. 
<Number greater than 1> 
Guarantees that write operations have propagated successfully to the specified number of replica 
set members including the primary. 
For example, w: 2 indicates acknowledgements from the primary and at least one secondary. 
If you set w to a number that is greater than the number of set members that hold data, 
MongoDB waits for the non-existent members to become available, which means MongoDB 
blocks indefinitely. 
"majority" Confirms that write operations have propagated to a majority of the configured replica set: a 
majority of the set’s configured members must acknowledge the write operation before it 
succeeds. This allows you to avoid hard coding assumptions about the size of your replica set 
into your application. 
Changed in version 2.6: In Master/Slave (page 538) deployments, MongoDB treats w: 
"majority" as equivalent to w: 1. In earlier versions of MongoDB, w: "majority" 
produces an error in master/slave (page 538) deployments. 
<tag set> By specifying a tag set (page 576), you can have fine-grained control over which replica set 
members must acknowledge a write operation to satisfy the required level of write concern. 
j Option The j option confirms that the mongod instance has written the data to the on-disk journal. This ensures 
that data is not lost if the mongod instance shuts down unexpectedly. Set to true to enable. 
Changed in version 2.6: Specifying a write concern that includes j: true to a mongod or mongos running with 
--nojournal option now errors. Previous versions would ignore the j: true. 
Note: Requiring journaled write concern in a replica set only requires a journal commit of the write operation to the 
primary of the set regardless of the level of replica acknowledged write concern. 
wtimeout This option specifies a time limit, in milliseconds, for the write concern. wtimeout is only applicable 
for w values greater than 1. 
wtimeout causes write operations to return with an error after the specified limit, even if the required write concern 
will eventually succeed. When these write operations return, MongoDB does not undo successful data modifications 
performed before the write concern exceeded the wtimeout time limit. 
If you do not specify the wtimeout option and the level of write concern is unachievable, the write operation will 
block indefinitely. Specifying a wtimeout value of 0 is equivalent to a write concern without the wtimeout option. 
See also: 
Write Concern Introduction (page 72) and Write Concern for Replica Sets (page 75). 
SQL to MongoDB Mapping Chart 
In addition to the charts that follow, you might want to consider the Frequently Asked Questions (page 687) section for 
a selection of common questions about MongoDB. 
Terminology and Concepts 
The following table presents the various SQL terminology and concepts and the corresponding MongoDB terminology 
and concepts. 
SQL Terms/Concepts MongoDB Terms/Concepts 
database database 
table collection 
row document or BSON document 
column field 
index index 
table joins embedded documents and linking 
primary key primary key 
In SQL, specify any unique column or column combination as the primary key. In MongoDB, the 
primary key is automatically set to the _id field. 
aggregation (e.g. group by) aggregation pipeline 
See the SQL to Aggregation Mapping Chart 
(page 426). 
Executables 
The following table presents some database executables and the corresponding MongoDB executables. This table is 
not meant to be exhaustive. 
MongoDB MySQL Oracle Informix DB2 
Database Server mongod mysqld oracle IDS DB2 Server 
Database Client mongo mysql sqlplus DB-Access DB2 Client 
Examples 
The following table presents the various SQL statements and the corresponding MongoDB statements. The examples 
in the table assume the following conditions: 
• The SQL examples assume a table named users. 
• The MongoDB examples assume a collection named users that contains documents of the following prototype: 
{ 
_id: ObjectId("509a8fb2f3f4948bd2f983a0"), 
user_id: "abc123", 
age: 55, 
status: 'A' 
} 
Create and Alter The following table presents the various SQL statements related to table-level actions and the 
corresponding MongoDB statements. 
SQL Schema Statements MongoDB Schema Statements 
CREATE TABLE users ( 
id MEDIUMINT NOT NULL 
AUTO_INCREMENT, 
user_id Varchar(30), 
age Number, 
status char(1), 
PRIMARY KEY (id) 
) 
Implicitly created on first insert() operation. The 
primary key _id is automatically added if the _id field is 
not specified. 
db.users.insert( { 
user_id: "abc123", 
age: 55, 
status: "A" 
} ) 
However, you can also explicitly create a collection: 
db.createCollection("users") 
ALTER TABLE users 
ADD join_date DATETIME 
Collections do not describe or enforce the structure of 
their documents; i.e. there is no structural alteration at the 
collection level. 
However, at the document level, update() operations 
can add fields to existing documents using the $set operator. 
db.users.update( 
{ }, 
{ $set: { join_date: new Date() } }, 
{ multi: true } 
) 
ALTER TABLE users 
DROP COLUMN join_date 
Collections do not describe or enforce the structure of 
their documents; i.e. there is no structural alteration at the 
collection level. 
However, at the document level, update() operations 
can remove fields from documents using the $unset 
operator. 
db.users.update( 
{ }, 
{ $unset: { join_date: "" } }, 
{ multi: true } 
) 
CREATE INDEX idx_user_id_asc 
ON users(user_id) 
db.users.ensureIndex( { user_id: 1 } ) 
CREATE INDEX 
idx_user_id_asc_age_desc 
ON users(user_id, age DESC) 
db.users.ensureIndex( { user_id: 1, age: -1 } ) 
DROP TABLE users db.users.drop() 
For more information, see db.collection.insert(), db.createCollection(), 
db.collection.update(), $set, $unset, db.collection.ensureIndex(), indexes (page 436), 
db.collection.drop(), and Data Modeling Concepts (page 133). 
Insert The following table presents the various SQL statements related to inserting records into tables and the 
corresponding MongoDB statements. 
SQL INSERT Statements MongoDB insert() Statements 
INSERT INTO users(user_id, 
age, 
status) 
VALUES ("bcd001", 
45, 
"A") 
db.users.insert( 
{ user_id: "bcd001", age: 45, status: "A" } 
) 
For more information, see db.collection.insert(). 
Select The following table presents the various SQL statements related to reading records from tables and the 
corresponding MongoDB statements. 
SQL SELECT Statements MongoDB find() Statements 
SELECT * 
FROM users 
db.users.find() 
SELECT id, 
user_id, 
status 
FROM users 
db.users.find( 
{ }, 
{ user_id: 1, status: 1 } 
) 
SELECT user_id, status 
FROM users 
db.users.find( 
{ }, 
{ user_id: 1, status: 1, _id: 0 } 
) 
SELECT * 
FROM users 
WHERE status = "A" 
db.users.find( 
{ status: "A" } 
) 
SELECT user_id, status 
FROM users 
WHERE status = "A" 
db.users.find( 
{ status: "A" }, 
{ user_id: 1, status: 1, _id: 0 } 
) 
SELECT * 
FROM users 
WHERE status != "A" 
db.users.find( 
{ status: { $ne: "A" } } 
) 
SELECT * 
FROM users 
WHERE status = "A" 
AND age = 50 
db.users.find( 
{ status: "A", 
age: 50 } 
) 
SELECT * 
FROM users 
WHERE status = "A" 
OR age = 50 
db.users.find( 
{ $or: [ { status: "A" } , 
{ age: 50 } ] } 
) 
SELECT * 
FROM users 
WHERE age > 25 
db.users.find( 
{ age: { $gt: 25 } } 
) 
SELECT * 
FROM users 
WHERE age < 25 
db.users.find( 
{ age: { $lt: 25 } } 
) 
SELECT * 
FROM users 
WHERE age > 25 
AND age <= 50 
db.users.find( 
{ age: { $gt: 25, $lte: 50 } } 
) 
SELECT * 
FROM users 
WHERE user_id like "%bc%" 
db.users.find( { user_id: /bc/ } )
For more information, see db.collection.find(), db.collection.distinct(), 
db.collection.findOne(), $ne, $and, $or, $gt, $lt, $exists, $lte, $regex, limit(), skip(), 
explain(), sort(), and count(). 
Update Records The following table presents the various SQL statements related to updating existing records in 
tables and the corresponding MongoDB statements. 
SQL Update Statements MongoDB update() Statements 
UPDATE users 
SET status = "C" 
WHERE age > 25 
db.users.update( 
{ age: { $gt: 25 } }, 
{ $set: { status: "C" } }, 
{ multi: true } 
) 
UPDATE users 
SET age = age + 3 
WHERE status = "A" 
db.users.update( 
{ status: "A" } , 
{ $inc: { age: 3 } }, 
{ multi: true } 
) 
For more information, see db.collection.update(), $set, $inc, and $gt. 
Delete Records The following table presents the various SQL statements related to deleting records from tables and 
the corresponding MongoDB statements. 
SQL Delete Statements MongoDB remove() Statements 
DELETE FROM users 
WHERE status = "D" 
db.users.remove( { status: "D" } ) 
DELETE FROM users db.users.remove({}) 
For more information, see db.collection.remove(). 
The bios Example Collection 
The bios collection provides example data for experimenting with MongoDB. Many of this guide’s examples on 
insert, update and read operations create or query data from the bios collection. 
The following documents comprise the bios collection. In the examples, the data might be different, as the examples 
themselves make changes to the data. 
{ 
"_id" : 1, 
"name" : { 
"first" : "John", 
"last" : "Backus" 
}, 
"birth" : ISODate("1924-12-03T05:00:00Z"), 
"death" : ISODate("2007-03-17T04:00:00Z"), 
"contribs" : [ 
"Fortran", 
"ALGOL", 
"Backus-Naur Form", 
"FP" 
], 
"awards" : [ 
{ 
"award" : "W.W. McDowell Award", 
"year" : 1967, 
"by" : "IEEE Computer Society" 
}, 
{ 
"award" : "National Medal of Science", 
"year" : 1975, 
"by" : "National Science Foundation" 
}, 
{ 
"award" : "Turing Award", 
"year" : 1977, 
"by" : "ACM" 
}, 
{ 
"award" : "Draper Prize", 
"year" : 1993, 
"by" : "National Academy of Engineering" 
} 
] 
} 
{ 
"_id" : ObjectId("51df07b094c6acd67e492f41"), 
"name" : { 
"first" : "John", 
"last" : "McCarthy" 
}, 
"birth" : ISODate("1927-09-04T04:00:00Z"), 
"death" : ISODate("2011-12-24T05:00:00Z"), 
"contribs" : [ 
"Lisp", 
"Artificial Intelligence", 
"ALGOL" 
], 
"awards" : [ 
{ 
"award" : "Turing Award", 
"year" : 1971, 
"by" : "ACM" 
}, 
{ 
"award" : "Kyoto Prize", 
"year" : 1988, 
"by" : "Inamori Foundation" 
}, 
{ 
"award" : "National Medal of Science", 
"year" : 1990, 
"by" : "National Science Foundation" 
} 
] 
} 
{ 
"_id" : 3, 
"name" : { 
"first" : "Grace", 
"last" : "Hopper" 
}, 
"title" : "Rear Admiral", 
"birth" : ISODate("1906-12-09T05:00:00Z"), 
"death" : ISODate("1992-01-01T05:00:00Z"), 
"contribs" : [ 
"UNIVAC", 
"compiler", 
"FLOW-MATIC", 
"COBOL" 
], 
"awards" : [ 
{ 
"award" : "Computer Sciences Man of the Year", 
"year" : 1969, 
"by" : "Data Processing Management Association" 
}, 
{ 
"award" : "Distinguished Fellow", 
"year" : 1973, 
"by" : "British Computer Society" 
}, 
{ 
"award" : "W. W. McDowell Award", 
"year" : 1976, 
"by" : "IEEE Computer Society" 
}, 
{ 
"award" : "National Medal of Technology", 
"year" : 1991, 
"by" : "United States" 
} 
] 
} 
{ 
"_id" : 4, 
"name" : { 
"first" : "Kristen", 
"last" : "Nygaard" 
}, 
"birth" : ISODate("1926-08-27T04:00:00Z"), 
"death" : ISODate("2002-08-10T04:00:00Z"), 
"contribs" : [ 
"OOP", 
"Simula" 
], 
"awards" : [ 
{ 
"award" : "Rosing Prize", 
"year" : 1999, 
"by" : "Norwegian Data Association" 
}, 
{ 
"award" : "Turing Award", 
"year" : 2001, 
"by" : "ACM" 
}, 
{ 
"award" : "IEEE John von Neumann Medal", 
"year" : 2001, 
"by" : "IEEE" 
} 
] 
} 
{ 
"_id" : 5, 
"name" : { 
"first" : "Ole-Johan", 
"last" : "Dahl" 
}, 
"birth" : ISODate("1931-10-12T04:00:00Z"), 
"death" : ISODate("2002-06-29T04:00:00Z"), 
"contribs" : [ 
"OOP", 
"Simula" 
], 
"awards" : [ 
{ 
"award" : "Rosing Prize", 
"year" : 1999, 
"by" : "Norwegian Data Association" 
}, 
{ 
"award" : "Turing Award", 
"year" : 2001, 
"by" : "ACM" 
}, 
{ 
"award" : "IEEE John von Neumann Medal", 
"year" : 2001, 
"by" : "IEEE" 
} 
] 
} 
{ 
"_id" : 6, 
"name" : { 
"first" : "Guido", 
"last" : "van Rossum" 
}, 
"birth" : ISODate("1956-01-31T05:00:00Z"), 
"contribs" : [ 
"Python" 
], 
"awards" : [ 
{ 
"award" : "Award for the Advancement of Free Software", 
"year" : 2001, 
"by" : "Free Software Foundation" 
}, 
{ 
"award" : "NLUUG Award", 
"year" : 2003, 
"by" : "NLUUG" 
} 
] 
} 
{ 
"_id" : ObjectId("51e062189c6ae665454e301d"), 
"name" : { 
"first" : "Dennis", 
"last" : "Ritchie" 
}, 
"birth" : ISODate("1941-09-09T04:00:00Z"), 
"death" : ISODate("2011-10-12T04:00:00Z"), 
"contribs" : [ 
"UNIX", 
"C" 
], 
"awards" : [ 
{ 
"award" : "Turing Award", 
"year" : 1983, 
"by" : "ACM" 
}, 
{ 
"award" : "National Medal of Technology", 
"year" : 1998, 
"by" : "United States" 
}, 
{ 
"award" : "Japan Prize", 
"year" : 2011, 
"by" : "The Japan Prize Foundation" 
} 
] 
} 
{ 
"_id" : 8, 
"name" : { 
"first" : "Yukihiro", 
"aka" : "Matz", 
"last" : "Matsumoto" 
}, 
"birth" : ISODate("1965-04-14T04:00:00Z"), 
"contribs" : [ 
"Ruby" 
], 
"awards" : [ 
{ 
"award" : "Award for the Advancement of Free Software", 
"year" : "2011", 
"by" : "Free Software Foundation" 
} 
] 
} 
{ 
"_id" : 9, 
"name" : { 
"first" : "James", 
"last" : "Gosling" 
}, 
"birth" : ISODate("1955-05-19T04:00:00Z"), 
"contribs" : [ 
"Java" 
], 
"awards" : [ 
{ 
"award" : "The Economist Innovation Award", 
"year" : 2002, 
"by" : "The Economist" 
}, 
{ 
"award" : "Officer of the Order of Canada", 
"year" : 2007, 
"by" : "Canada" 
} 
] 
} 
{ 
"_id" : 10, 
"name" : { 
"first" : "Martin", 
"last" : "Odersky" 
}, 
"contribs" : [ 
"Scala" 
] 
} 
CHAPTER 4 
Data Models 
Data in MongoDB has a flexible schema. Collections do not enforce document structure. This flexibility gives you 
data-modeling choices to match your application and its performance requirements. 
Read the Data Modeling Introduction (page 131) document for a high level introduction to data modeling, and proceed 
to the documents in the Data Modeling Concepts (page 133) section for additional documentation of the data model 
design process. The Data Model Examples and Patterns (page 140) documents provide examples of different data 
models. In addition, the MongoDB Use Case Studies [1] provide overviews of application design and include example 
data models with MongoDB. 
Data Modeling Introduction (page 131) An introduction to data modeling in MongoDB. 
Data Modeling Concepts (page 133) The core documentation detailing the decisions you must make when determining a data model, and discussing considerations that should be taken into account. 
Data Model Examples and Patterns (page 140) Examples of possible data models that you can use to structure your 
MongoDB documents. 
Data Model Reference (page 158) Reference material for data modeling for developers of MongoDB applications. 
4.1 Data Modeling Introduction 
Data in MongoDB has a flexible schema. Unlike SQL databases, where you must determine and declare a table’s 
schema before inserting data, MongoDB’s collections do not enforce document structure. This flexibility facilitates 
the mapping of documents to an entity or an object. Each document can match the data fields of the represented entity, 
even if the data has substantial variation. In practice, however, the documents in a collection share a similar structure. 
The key challenge in data modeling is balancing the needs of the application, the performance characteristics of the 
database engine, and the data retrieval patterns. When designing data models, always consider the application usage 
of the data (i.e. queries, updates, and processing of the data) as well as the inherent structure of the data itself. 
4.1.1 Document Structure 
The key decision in designing data models for MongoDB applications revolves around the structure of documents and 
how the application represents relationships between data. There are two tools that allow applications to represent 
these relationships: references and embedded documents. 
[1] http://docs.mongodb.org/ecosystem/use-cases 
References 
References store the relationships between data by including links or references from one document to another. Applications can resolve these references (page 161) to access the related data. Broadly, these are normalized data models. 
Figure 4.1: Data model using references to link documents. Both the contact document and the access document 
contain a reference to the user document. 
See Normalized Data Models (page 135) for the strengths and weaknesses of using references. 
Embedded Data 
Embedded documents capture relationships between data by storing related data in a single document structure. MongoDB documents make it possible to embed document structures as sub-documents in a field or array within a document. These denormalized data models allow applications to retrieve and manipulate related data in a single database operation. 
See Embedded Data Models (page 134) for the strengths and weaknesses of embedding sub-documents. 
4.1.2 Atomicity of Write Operations 
In MongoDB, write operations are atomic at the document level, and no single write operation can atomically affect 
more than one document or more than one collection. A denormalized data model with embedded data combines 
all related data for a represented entity in a single document. This facilitates atomic write operations since a single 
write operation can insert or update the data for an entity. Normalizing the data would split the data across multiple 
collections and would require multiple write operations that are not atomic collectively. 
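The contrast above can be sketched in plain JavaScript (client-side objects only, not server calls; the order/item structures are hypothetical illustrations, not MongoDB API):

```javascript
// Embedded model: the order document holds its items and total, so adding an
// item and updating the total together corresponds to one document-level
// (atomic) write in MongoDB.
const orderEmbedded = { _id: 1, items: [], total: 0 };

function addItemEmbedded(order, item) {
  // Would correspond to a single update with $push and $inc on one document.
  order.items.push(item);
  order.total += item.price;
}

// Normalized model: items live in their own collection, so the same change
// needs two writes that MongoDB does not make atomic as a pair.
const order = { _id: 1, total: 0 };
const items = [];

function addItemNormalized(order, items, item) {
  items.push({ order_id: order._id, ...item }); // write 1: insert the item
  order.total += item.price;                    // write 2: update the order
}

addItemEmbedded(orderEmbedded, { sku: "a1", price: 5 });
addItemNormalized(order, items, { sku: "a1", price: 5 });
console.log(orderEmbedded.total, order.total); // 5 5
```

A failure between the two writes of the normalized version leaves the item inserted but the total stale; the embedded version has no such window.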
Figure 4.2: Data model with embedded fields that contain all related information. 
However, schemas that facilitate atomic writes may limit ways that applications can use the data or may limit ways to 
modify applications. The Atomicity Considerations (page 136) documentation describes the challenge of designing a 
schema that balances flexibility and atomicity. 
4.1.3 Document Growth 
Some updates, such as pushing elements to an array or adding new fields, increase a document’s size. If the document size exceeds the allocated space for that document, MongoDB relocates the document on disk. The growth consideration can affect the decision to normalize or denormalize data. See Document Growth Considerations (page 136) for more about planning for and managing document growth in MongoDB. 
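The relocation threshold can be illustrated with a small sketch, assuming power-of-2 record allocation (the default strategy in this release); the minimum size used here is illustrative, not an exact server value:

```javascript
// A document is placed in the next power-of-2 sized record; growth within
// that record's padding is fine, but growth past it forces a relocation.
function recordSize(docBytes) {
  let size = 32; // illustrative minimum record size
  while (size < docBytes) size *= 2;
  return size;
}

function wouldRelocate(oldBytes, newBytes) {
  return newBytes > recordSize(oldBytes);
}

console.log(recordSize(900));          // 1024
console.log(wouldRelocate(900, 1000)); // false: still fits the 1024-byte record
console.log(wouldRelocate(900, 1100)); // true: exceeds the record, document moves
```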
4.1.4 Data Use and Performance 
When designing a data model, consider how applications will use your database. For instance, if your application only 
uses recently inserted documents, consider using Capped Collections (page 196). Or if your application needs are 
mainly read operations to a collection, adding indexes to support common queries can improve performance. 
See Operational Factors and Data Models (page 136) for more information on these and other operational considerations that affect data model designs. 
4.2 Data Modeling Concepts 
When constructing a data model for your MongoDB collection, there are various options you can choose from, each 
of which has its strengths and weaknesses. The following sections guide you through key design decisions and detail 
various considerations for choosing the best data model for your application needs. 
For a general introduction to data modeling in MongoDB, see the Data Modeling Introduction (page 131). For example 
data models, see Data Modeling Examples and Patterns (page 140). 
Data Model Design (page 134) Presents the different strategies that you can choose from when determining your data 
model, their strengths and their weaknesses. 
Operational Factors and Data Models (page 136) Details features you should keep in mind when designing your 
data model, such as lifecycle management, indexing, horizontal scalability, and document growth. 
GridFS (page 138) GridFS is a specification for storing files that exceed the BSON-document size limit of 16MB. 
4.2.1 Data Model Design 
Effective data models support your application needs. The key consideration for the structure of your documents is 
the decision to embed (page 134) or to use references (page 135). 
Embedded Data Models 
With MongoDB, you may embed related data in a single structure or document. These schemas are generally known as “denormalized” models, and take advantage of MongoDB’s rich documents. Consider the following diagram: 
Figure 4.3: Data model with embedded fields that contain all related information. 
Embedded data models allow applications to store related pieces of information in the same database record. As a 
result, applications may need to issue fewer queries and updates to complete common operations. 
In general, use embedded data models when: 
• you have “contains” relationships between entities. See Model One-to-One Relationships with Embedded Documents (page 140). 
• you have one-to-many relationships between entities. In these relationships the “many” or child documents 
always appear with or are viewed in the context of the “one” or parent documents. See Model One-to-Many 
Relationships with Embedded Documents (page 141). 
In general, embedding provides better performance for read operations, as well as the ability to request and retrieve 
related data in a single database operation. Embedded data models make it possible to update related data in a single 
atomic write operation. 
However, embedding related data in documents may lead to situations where documents grow after creation. Document growth can impact write performance and lead to data fragmentation. See Document Growth (page 136) for details. Furthermore, documents in MongoDB must be smaller than the maximum BSON document size. For bulk binary data, consider GridFS (page 138). 
To interact with embedded documents, use dot notation to “reach into” embedded documents. See query for data 
in arrays (page 90) and query data in sub-documents (page 89) for more examples on accessing data in arrays and 
embedded documents. 
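What dot notation expresses can be sketched client-side: a condition such as { "address.city": "Faketon" } reaches into the embedded document. The helper below resolves a dot path against a plain object (a sketch of the idea, not the server's matcher):

```javascript
// Resolve a dot-separated path like "address.city" against a document.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

const patron = {
  _id: "joe",
  name: "Joe Bookreader",
  address: { street: "123 Fake Street", city: "Faketon", state: "MA" }
};

console.log(getPath(patron, "address.city")); // "Faketon"
console.log(getPath(patron, "address.zip"));  // undefined (field absent)
```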
Normalized Data Models 
Normalized data models describe relationships using references (page 161) between documents. 
Figure 4.4: Data model using references to link documents. Both the contact document and the access document 
contain a reference to the user document. 
In general, use normalized data models: 
• when embedding would result in duplication of data but would not provide sufficient read performance advantages to outweigh the implications of the duplication. 
• to represent more complex many-to-many relationships. 
• to model large hierarchical data sets. 
References provide more flexibility than embedding. However, client-side applications must issue follow-up queries 
to resolve the references. In other words, normalized data models can require more round trips to the server. 
See Model One-to-Many Relationships with Document References (page 143) for an example of referencing. For 
examples of various tree models using references, see Model Tree Structures (page 144). 
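The extra round trips can be sketched client-side; the in-memory arrays and findOne helper below stand in for collections and a follow-up query (hypothetical data, not MongoDB API):

```javascript
// Normalized model: addresses reference their patron by patron_id.
const users = [{ _id: "joe", name: "Joe Bookreader" }];
const addresses = [
  { patron_id: "joe", city: "Faketon" },
  { patron_id: "joe", city: "Boston" }
];

// Stands in for a second query, e.g. db.users.findOne(...)
function findOne(collection, predicate) {
  return collection.find(predicate);
}

// Query 1 fetched the addresses; each reference then needs a follow-up lookup.
const resolved = addresses.map(addr => ({
  ...addr,
  patron: findOne(users, u => u._id === addr.patron_id)
}));

console.log(resolved[0].patron.name); // "Joe Bookreader"
```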
4.2.2 Operational Factors and Data Models 
Modeling application data for MongoDB depends on both the data itself and the characteristics of MongoDB itself. For example, different data models may allow applications to use more efficient queries, increase the throughput 
of insert and update operations, or distribute activity to a sharded cluster more effectively. 
These factors are operational or address requirements that arise outside of the application but impact the performance 
of MongoDB based applications. When developing a data model, analyze all of your application’s read operations 
(page 55) and write operations (page 67) in conjunction with the following considerations. 
Document Growth 
Some updates to documents can increase the size of documents. These updates include pushing elements to an array 
(i.e. $push) and adding new fields to a document. If the document size exceeds the allocated space for that document, 
MongoDB will relocate the document on disk. Relocating documents takes longer than in-place updates and can lead to 
fragmented storage. Although MongoDB automatically adds padding to document allocations (page 83) to minimize 
the likelihood of relocation, data models should avoid document growth when possible. 
For instance, if your applications require updates that will cause document growth, you may want to refactor your data 
model to use references between data in distinct documents rather than a denormalized data model. 
MongoDB adaptively adjusts the amount of automatic padding to reduce occurrences of relocation. You may also use a pre-allocation strategy to explicitly avoid document growth. Refer to the Pre-Aggregated Reports Use Case [2] for an example of the pre-allocation approach to handling document growth. 
See Storage (page 82) for more information on MongoDB’s storage model and record allocation strategies. 
Atomicity 
In MongoDB, operations are atomic at the document level. No single write operation can change more than one document. Operations that modify more than a single document in a collection still operate on one document at a time. [3] Ensure that your application stores all fields with atomic dependency requirements in the same document. If the application can tolerate non-atomic updates for two pieces of data, you can store these data in separate documents. 
A data model that embeds related data in a single document facilitates these kinds of atomic operations. For data models that store references between related pieces of data, the application must issue separate read and write operations to retrieve and modify these related pieces of data. 
See Model Data for Atomic Operations (page 154) for an example data model that provides atomic updates for a single 
document. 
[2] http://docs.mongodb.org/ecosystem/use-cases/pre-aggregated-reports 
[3] Document-level atomic operations include all operations within a single MongoDB document record: operations that affect multiple sub-documents within that single record are still atomic. 
Sharding 
MongoDB uses sharding to provide horizontal scaling. These clusters support deployments with large data sets and 
high-throughput operations. Sharding allows users to partition a collection within a database to distribute the collection’s documents across a number of mongod instances or shards. 
To distribute data and application traffic in a sharded collection, MongoDB uses the shard key (page 620). Selecting 
the proper shard key (page 620) has significant implications for performance, and can enable or prevent query isolation 
and increased write capacity. It is important to consider carefully the field or fields to use as the shard key. 
See Sharding Introduction (page 607) and Shard Keys (page 620) for more information. 
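Why the choice matters can be illustrated with a partitioning sketch (client-side only; the hash and the range splits are hypothetical, not the server's real chunking):

```javascript
// A toy string hash for demonstration purposes.
function hash(key) {
  let h = 0;
  for (const ch of String(key)) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h;
}

// Range partitioning: splits are the upper bounds of each shard's key range.
function shardByRange(key, splits) {
  return splits.findIndex(bound => key < bound);
}

const splits = [100, 200, Infinity]; // three shards by key range

// A monotonically increasing key sends every new insert to the last shard.
const monotonic = [201, 202, 203, 204].map(k => shardByRange(k, splits));
console.log(monotonic); // [2, 2, 2, 2] — one "hot" shard takes all writes

// Hashing the same keys spreads inserts across the shards.
const hashed = [201, 202, 203, 204].map(k => hash(k) % 3);
console.log(hashed);    // [0, 1, 2, 0]
```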
Indexes 
Use indexes to improve performance for common queries. Build indexes on fields that appear often in queries and for 
all operations that return sorted results. MongoDB automatically creates a unique index on the _id field. 
As you create indexes, consider the following behaviors of indexes: 
• Each index requires at least 8KB of data space. 
• Adding an index has some negative performance impact for write operations. For collections with high write-to-read ratio, indexes are expensive since each insert must also update any indexes. 
• Collections with high read-to-write ratio often benefit from additional indexes. Indexes do not affect un-indexed read operations. 
• When active, each index consumes disk space and memory. This usage can be significant and should be tracked 
for capacity planning, especially for concerns over working set size. 
See Indexing Strategies (page 493) for more information on indexes as well as Analyze Query Performance (page 97). 
Additionally, the MongoDB database profiler (page 210) may help identify inefficient queries. 
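The read/write trade-off above can be sketched with an index modeled as a Map from field value to documents (a client-side illustration, not the server's B-tree):

```javascript
const docs = [];
const byType = new Map(); // an "index" on the type field

function insert(doc) {
  docs.push(doc); // the write itself ...
  const bucket = byType.get(doc.type) || [];
  bucket.push(doc);
  byType.set(doc.type, bucket); // ... plus the index-maintenance cost
}

// Without an index: a full scan touches every document.
function findByTypeScan(type) {
  return docs.filter(d => d.type === type);
}

// With the index: a single lookup.
function findByTypeIndexed(type) {
  return byType.get(type) || [];
}

insert({ _id: 1, type: "a" });
insert({ _id: 2, type: "b" });
insert({ _id: 3, type: "a" });
console.log(findByTypeIndexed("a").length); // 2
```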
Large Number of Collections 
In certain situations, you might choose to store related information in several collections rather than in a single collection. 
Consider a sample collection logs that stores log documents for various environments and applications. The logs collection contains documents of the following form: 
{ log: "dev", ts: ..., info: ... } 
{ log: "debug", ts: ..., info: ...} 
If the total number of documents is low, you may group documents into collections by type. For logs, consider maintaining distinct log collections, such as logs_dev and logs_debug. The logs_dev collection would contain only the documents related to the dev environment. 
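The grouping described above amounts to deriving the collection name from the document type; a minimal sketch, assuming log values such as "dev" and "debug" as in the example documents:

```javascript
// Route a log document to a per-type collection name.
function collectionFor(logDoc) {
  return "logs_" + logDoc.log;
}

console.log(collectionFor({ log: "dev", ts: Date.now() }));   // "logs_dev"
console.log(collectionFor({ log: "debug", ts: Date.now() })); // "logs_debug"
```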
Generally, having a large number of collections has no significant performance penalty and results in very good 
performance. Distinct collections are very important for high-throughput batch processing. 
When using models that have a large number of collections, consider the following behaviors: 
• Each collection has a certain minimum overhead of a few kilobytes. 
• Each index, including the index on _id, requires at least 8KB of data space. 
• For each database, a single namespace file (i.e. <database>.ns) stores all meta-data for that database, and 
each index and collection has its own entry in the namespace file. MongoDB places limits on the size 
of namespace files. 
• MongoDB has limits on the number of namespaces. You may wish to know the current number 
of namespaces in order to determine how many additional namespaces the database can support. To get the 
current number of namespaces, run the following in the mongo shell: 
db.system.namespaces.count() 
The limit on the number of namespaces depends on the <database>.ns size. The namespace file defaults to 16 MB. 
To change the size of the new namespace file, start the server with the option --nssize <new size MB>. 
For existing databases, after starting up the server with --nssize, run the db.repairDatabase() command from the mongo shell. For impacts and considerations on running db.repairDatabase(), see repairDatabase. 
Data Lifecycle Management 
Data modeling decisions should take data lifecycle management into consideration. 
The Time to Live or TTL feature (page 198) of collections expires documents after a period of time. Consider using 
the TTL feature if your application requires some data to persist in the database for a limited period of time. 
Additionally, if your application only uses recently inserted documents, consider Capped Collections (page 196). 
Capped collections provide first-in-first-out (FIFO) management of inserted documents and efficiently support operations that insert and read documents based on insertion order. 
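The FIFO behavior can be sketched client-side as a fixed budget of slots: once full, inserting a new document evicts the oldest, and reads return documents in insertion order (an illustration of the semantics, not the server's fixed-size storage):

```javascript
class CappedCollection {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }
  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.maxDocs) this.docs.shift(); // evict oldest
  }
  find() {
    return [...this.docs]; // natural (insertion) order
  }
}

const capped = new CappedCollection(3);
[1, 2, 3, 4].forEach(n => capped.insert({ n }));
console.log(capped.find().map(d => d.n)); // [2, 3, 4] — document 1 was evicted
```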
4.2.3 GridFS 
GridFS is a specification for storing and retrieving files that exceed the BSON-document size limit of 16MB. 
Instead of storing a file in a single document, GridFS divides a file into parts, or chunks, [4] and stores each of those chunks as a separate document. By default GridFS limits chunk size to 255k. GridFS uses two collections to store files. One collection stores the file chunks, and the other stores file metadata. 
When you query a GridFS store for a file, the driver or client will reassemble the chunks as needed. You can perform 
range queries on files stored through GridFS. You also can access information from arbitrary sections of files, which 
allows you to “skip” into the middle of a video or audio file. 
GridFS is useful not only for storing files that exceed 16MB but also for storing any files for which you want access without having to load the entire file into memory. For more information on when to use GridFS, see When should I use GridFS? (page 693). 
Changed in version 2.4.10: The default chunk size changed from 256k to 255k. 
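The chunking scheme can be sketched in plain JavaScript: split a file's bytes into 255k pieces numbered from 0, each carrying the parent file's id, then reassemble them in n order. This mirrors the shape of the fs.chunks documents described below; it is not a driver implementation:

```javascript
const CHUNK_SIZE = 255 * 1024; // the default 255k chunk size

// Split a Buffer into chunk documents: { files_id, n, data }.
function toChunks(filesId, data) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < data.length; offset += CHUNK_SIZE, n++) {
    chunks.push({
      files_id: filesId,
      n, // sequence number, starting with 0
      data: data.subarray(offset, offset + CHUNK_SIZE)
    });
  }
  return chunks;
}

// Reassemble the original bytes by sorting on n, as a driver would.
function reassemble(chunks) {
  return Buffer.concat(
    chunks.slice().sort((a, b) => a.n - b.n).map(c => c.data)
  );
}

const file = Buffer.alloc(600 * 1024, 1); // 600k of data
const chunks = toChunks("myFileID", file);
console.log(chunks.length);                   // 3 chunks: 255k + 255k + 90k
console.log(reassemble(chunks).equals(file)); // true
```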
Implement GridFS 
To store and retrieve files using GridFS, use either of the following: 
• A MongoDB driver. See the drivers documentation for information on using GridFS with your driver. 
• The mongofiles command-line tool. See http://docs.mongodb.org/manual/reference/program/mongofiles. 
[4] The use of the term chunks in the context of GridFS is not related to the use of the term chunks in the context of sharding. 
GridFS Collections 
GridFS stores files in two collections: 
• chunks stores the binary chunks. For details, see The chunks Collection (page 164). 
• files stores the file’s metadata. For details, see The files Collection (page 165). 
GridFS places the collections in a common bucket by prefixing each with the bucket name. By default, GridFS uses two collections with names prefixed by the fs bucket: 
• fs.files 
• fs.chunks 
You can choose a different bucket name than fs, and create multiple buckets in a single database. 
Each document in the chunks collection represents a distinct chunk of a file as represented in the GridFS store. Each 
chunk is identified by its unique ObjectId stored in its _id field. 
For descriptions of all fields in the chunks and files collections, see GridFS Reference (page 164). 
GridFS Index 
GridFS uses a unique, compound index on the chunks collection for the files_id and n fields. The files_id 
field contains the _id of the chunk’s “parent” document. The n field contains the sequence number of the chunk. 
GridFS numbers all chunks, starting with 0. For descriptions of the documents and fields in the chunks collection, 
see GridFS Reference (page 164). 
The GridFS index allows efficient retrieval of chunks using the files_id and n values, as shown in the following 
example: 
cursor = db.fs.chunks.find({files_id: myFileID}).sort({n:1}); 
See the relevant driver documentation for the specific behavior of your GridFS application. If your driver does not 
create this index, issue the following operation using the mongo shell: 
db.fs.chunks.ensureIndex( { files_id: 1, n: 1 }, { unique: true } ); 
Example Interface 
The following is an example of the GridFS interface in Java. The example is for demonstration purposes only. For 
API specifics, see the relevant driver documentation. 
By default, the interface must support the default GridFS bucket, named fs, as in the following: 
// returns default GridFS bucket (i.e. "fs" collection) 
GridFS myFS = new GridFS(myDatabase); 
// saves the file to "fs" GridFS bucket 
myFS.createFile(new File("/tmp/largething.mpg")); 
Optionally, interfaces may support other additional GridFS buckets as in the following example: 
// returns GridFS bucket named "contracts" 
GridFS myContracts = new GridFS(myDatabase, "contracts"); 
// retrieve GridFS object "smithco" 
GridFSDBFile file = myContracts.findOne("smithco"); 
// saves the GridFS file to the file system 
file.writeTo(new File("/tmp/smithco.pdf")); 
4.3 Data Model Examples and Patterns 
The following documents provide overviews of various data modeling patterns and common schema design considerations: 
Model Relationships Between Documents (page 140) Examples for modeling relationships between documents. 
Model One-to-One Relationships with Embedded Documents (page 140) Presents a data model that uses embedded documents (page 134) to describe one-to-one relationships between connected data. 
Model One-to-Many Relationships with Embedded Documents (page 141) Presents a data model that uses 
embedded documents (page 134) to describe one-to-many relationships between connected data. 
Model One-to-Many Relationships with Document References (page 143) Presents a data model that uses 
references (page 135) to describe one-to-many relationships between documents. 
Model Tree Structures (page 144) Examples for modeling tree structures. 
Model Tree Structures with Parent References (page 146) Presents a data model that organizes documents in 
a tree-like structure by storing references (page 135) to “parent” nodes in “child” nodes. 
Model Tree Structures with Child References (page 148) Presents a data model that organizes documents in a 
tree-like structure by storing references (page 135) to “child” nodes in “parent” nodes. 
See Model Tree Structures (page 144) for additional examples of data models for tree structures. 
Model Specific Application Contexts (page 154) Examples for models for specific application contexts. 
Model Data for Atomic Operations (page 154) Illustrates how embedding fields related to an atomic update 
within the same document ensures that the fields are in sync. 
Model Data to Support Keyword Search (page 155) Describes one method for supporting keyword search by storing keywords in an array in the same document as the text field. Combined with a multi-key index, this pattern can support an application’s keyword search operations. 
4.3.1 Model Relationships Between Documents 
Model One-to-One Relationships with Embedded Documents (page 140) Presents a data model that uses embedded 
documents (page 134) to describe one-to-one relationships between connected data. 
Model One-to-Many Relationships with Embedded Documents (page 141) Presents a data model that uses embedded documents (page 134) to describe one-to-many relationships between connected data. 
Model One-to-Many Relationships with Document References (page 143) Presents a data model that uses references (page 135) to describe one-to-many relationships between documents. 
Model One-to-One Relationships with Embedded Documents 
Overview 
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how 
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 133) 
for a full high level overview of data modeling in MongoDB. 
This document describes a data model that uses embedded (page 134) documents to describe relationships between 
connected data. 
Pattern 
Consider the following example that maps patron and address relationships. The example illustrates the advantage of 
embedding over referencing if you need to view one data entity in context of the other. In this one-to-one relationship 
between patron and address data, the address belongs to the patron. 
In the normalized data model, the address document contains a reference to the patron document. 
{
  _id: "joe",
  name: "Joe Bookreader"
}

{
  patron_id: "joe",
  street: "123 Fake Street",
  city: "Faketon",
  state: "MA",
  zip: "12345"
}
If the address data is frequently retrieved with the name information, then with referencing, your application needs 
to issue multiple queries to resolve the reference. The better data model would be to embed the address data in the 
patron data, as in the following document: 
{
  _id: "joe",
  name: "Joe Bookreader",
  address: {
    street: "123 Fake Street",
    city: "Faketon",
    state: "MA",
    zip: "12345"
  }
}
With the embedded data model, your application can retrieve the complete patron information with one query. 
Model One-to-Many Relationships with Embedded Documents 
Overview 
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how 
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 133) 
for a full high level overview of data modeling in MongoDB. 
This document describes a data model that uses embedded (page 134) documents to describe relationships between 
connected data. 
4.3. Data Model Examples and Patterns 141
MongoDB Documentation, Release 2.6.4 
Pattern 
Consider the following example that maps patron and multiple address relationships. The example illustrates the advantage of embedding over referencing if you need to view many data entities in context of another. In this one-to-many relationship between patron and address data, the patron has multiple address entities. 
In the normalized data model, the address documents contain a reference to the patron document. 
{
  _id: "joe",
  name: "Joe Bookreader"
}

{
  patron_id: "joe",
  street: "123 Fake Street",
  city: "Faketon",
  state: "MA",
  zip: "12345"
}

{
  patron_id: "joe",
  street: "1 Some Other Street",
  city: "Boston",
  state: "MA",
  zip: "12345"
}
If your application frequently retrieves the address data with the name information, then your application needs to issue multiple queries to resolve the references. The better data model would be to embed the address data entities in the patron data, as in the following document: 
{
  _id: "joe",
  name: "Joe Bookreader",
  addresses: [
    {
      street: "123 Fake Street",
      city: "Faketon",
      state: "MA",
      zip: "12345"
    },
    {
      street: "1 Some Other Street",
      city: "Boston",
      state: "MA",
      zip: "12345"
    }
  ]
}
With the embedded data model, your application can retrieve the complete patron information with one query. 
Model One-to-Many Relationships with Document References 
Overview 
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how 
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 133) 
for a full high level overview of data modeling in MongoDB. 
This document describes a data model that uses references (page 135) between documents to describe relationships 
between connected data. 
Pattern 
Consider the following example that maps publisher and book relationships. The example illustrates the advantage of 
referencing over embedding to avoid repetition of the publisher information. 
Embedding the publisher document inside the book document would lead to repetition of the publisher data, as the 
following documents show: 
{
  title: "MongoDB: The Definitive Guide",
  author: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher: {
    name: "O'Reilly Media",
    founded: 1980,
    location: "CA"
  }
}

{
  title: "50 Tips and Tricks for MongoDB Developer",
  author: "Kristina Chodorow",
  published_date: ISODate("2011-05-06"),
  pages: 68,
  language: "English",
  publisher: {
    name: "O'Reilly Media",
    founded: 1980,
    location: "CA"
  }
}
To avoid repetition of the publisher data, use references and keep the publisher information in a separate collection 
from the book collection. 
When using references, the growth of the relationships determines where to store the reference. If the number of books per publisher is small with limited growth, storing the book reference inside the publisher document may sometimes be useful. Otherwise, if the number of books per publisher is unbounded, this data model would lead to mutable, growing arrays, as in the following example: 
{
  name: "O'Reilly Media",
  founded: 1980,
  location: "CA",
  books: [123456789, 234567890, ...]
}

{
  _id: 123456789,
  title: "MongoDB: The Definitive Guide",
  author: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

{
  _id: 234567890,
  title: "50 Tips and Tricks for MongoDB Developer",
  author: "Kristina Chodorow",
  published_date: ISODate("2011-05-06"),
  pages: 68,
  language: "English"
}
To avoid mutable, growing arrays, store the publisher reference inside the book document: 
{
  _id: "oreilly",
  name: "O'Reilly Media",
  founded: 1980,
  location: "CA"
}

{
  _id: 123456789,
  title: "MongoDB: The Definitive Guide",
  author: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher_id: "oreilly"
}

{
  _id: 234567890,
  title: "50 Tips and Tricks for MongoDB Developer",
  author: "Kristina Chodorow",
  published_date: ISODate("2011-05-06"),
  pages: 68,
  language: "English",
  publisher_id: "oreilly"
}
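Resolving the publisher_id references from the documents above can be sketched client-side; the in-memory lookup stands in for the follow-up query a real application would issue:

```javascript
const publishers = [
  { _id: "oreilly", name: "O'Reilly Media", founded: 1980, location: "CA" }
];
const books = [
  { _id: 123456789, title: "MongoDB: The Definitive Guide", publisher_id: "oreilly" },
  { _id: 234567890, title: "50 Tips and Tricks for MongoDB Developer", publisher_id: "oreilly" }
];

// Build a lookup table once, then attach each book's publisher.
const byId = new Map(publishers.map(p => [p._id, p]));
const booksWithPublisher = books.map(b => ({
  ...b,
  publisher: byId.get(b.publisher_id) // stands in for db.publishers.findOne(...)
}));

console.log(booksWithPublisher[0].publisher.name); // "O'Reilly Media"
```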
4.3.2 Model Tree Structures 
MongoDB allows various ways to use tree data structures to model large hierarchical or nested data relationships. 
Model Tree Structures with Parent References (page 146) Presents a data model that organizes documents in a tree-like structure by storing references (page 135) to “parent” nodes in “child” nodes. 
Figure 4.5: Tree data model for a sample hierarchy of categories. 
Model Tree Structures with Child References (page 148) Presents a data model that organizes documents in a tree-like structure by storing references (page 135) to “child” nodes in “parent” nodes. 
Model Tree Structures with an Array of Ancestors (page 149) Presents a data model that organizes documents in a 
tree-like structure by storing references (page 135) to “parent” nodes and an array that stores all ancestors. 
Model Tree Structures with Materialized Paths (page 151) Presents a data model that organizes documents in a tree-like structure by storing full relationship paths between documents. In addition to the tree node, each document stores the _id of the node’s ancestors or path as a string. 
Model Tree Structures with Nested Sets (page 153) Presents a data model that organizes documents in a tree-like 
structure using the Nested Sets pattern. This optimizes discovering subtrees at the expense of tree mutability. 
Model Tree Structures with Parent References 
Overview 
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how 
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 133) 
for a full high level overview of data modeling in MongoDB. 
This document describes a data model that describes a tree-like structure in MongoDB documents by storing references 
(page 135) to "parent" nodes in child nodes. 
Pattern 
The Parent References pattern stores each tree node in a document; in addition to the tree node, the document stores 
the id of the node’s parent. 
Consider the following hierarchy of categories: 
The following example models the tree using Parent References, storing the reference to the parent category in the 
field parent: 
db.categories.insert( { _id: "MongoDB", parent: "Databases" } ) 
db.categories.insert( { _id: "dbm", parent: "Databases" } ) 
db.categories.insert( { _id: "Databases", parent: "Programming" } ) 
db.categories.insert( { _id: "Languages", parent: "Programming" } ) 
db.categories.insert( { _id: "Programming", parent: "Books" } ) 
db.categories.insert( { _id: "Books", parent: null } ) 
• The query to retrieve the parent of a node is fast and straightforward: 
db.categories.findOne( { _id: "MongoDB" } ).parent 
• You can create an index on the field parent to enable fast search by the parent node: 
db.categories.ensureIndex( { parent: 1 } ) 
• You can query by the parent field to find its immediate children nodes: 
db.categories.find( { parent: "Databases" } ) 
The Parent Links pattern provides a simple solution to tree storage but requires multiple queries to retrieve subtrees. 
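The subtree cost can be illustrated with a plain-JavaScript sketch (no server required): the `categories` array below mirrors the documents above, and `findDescendants` is an illustrative helper, not a MongoDB API. Each loop iteration corresponds to one query an application must issue per tree level:

```javascript
// In-memory stand-in for the categories collection shown above.
const categories = [
  { _id: "MongoDB", parent: "Databases" },
  { _id: "dbm", parent: "Databases" },
  { _id: "Databases", parent: "Programming" },
  { _id: "Languages", parent: "Programming" },
  { _id: "Programming", parent: "Books" },
  { _id: "Books", parent: null }
];

// Retrieving a whole subtree requires one lookup per level:
// each round finds the children of the nodes found so far.
function findDescendants(docs, rootId) {
  let frontier = [rootId];
  const result = [];
  while (frontier.length > 0) {
    // Equivalent to db.categories.find( { parent: { $in: frontier } } )
    const children = docs.filter(d => frontier.includes(d.parent));
    result.push(...children.map(d => d._id));
    frontier = children.map(d => d._id);
  }
  return result;
}

console.log(findDescendants(categories, "Programming"));
// [ "Databases", "Languages", "MongoDB", "dbm" ]
```

A tree of depth *n* therefore costs up to *n* round trips to the server, which is the trade-off the paragraph above describes.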
Figure 4.6: Tree data model for a sample hierarchy of categories. 
Model Tree Structures with Child References 
Overview 
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how 
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 133) 
for a full high level overview of data modeling in MongoDB. 
This document describes a data model that describes a tree-like structure in MongoDB documents by storing references 
(page 135) to child nodes in parent nodes. 
Pattern 
The Child References pattern stores each tree node in a document; in addition to the tree node, the document stores in 
an array the id(s) of the node's children. 
Consider the following hierarchy of categories: 
Figure 4.7: Tree data model for a sample hierarchy of categories. 
The following example models the tree using Child References, storing the reference to the node’s children in the field 
children: 
db.categories.insert( { _id: "MongoDB", children: [] } ) 
db.categories.insert( { _id: "dbm", children: [] } ) 
db.categories.insert( { _id: "Databases", children: [ "MongoDB", "dbm" ] } ) 
db.categories.insert( { _id: "Languages", children: [] } ) 
db.categories.insert( { _id: "Programming", children: [ "Databases", "Languages" ] } ) 
db.categories.insert( { _id: "Books", children: [ "Programming" ] } ) 
• The query to retrieve the immediate children of a node is fast and straightforward: 
db.categories.findOne( { _id: "Databases" } ).children 
• You can create an index on the field children to enable fast search by the child nodes: 
db.categories.ensureIndex( { children: 1 } ) 
• You can query for a node in the children field to find its parent node as well as its siblings: 
db.categories.find( { children: "MongoDB" } ) 
The Child References pattern provides a suitable solution to tree storage as long as no operations on subtrees are 
necessary. This pattern may also provide a suitable solution for storing graphs where a node may have multiple 
parents. 
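The parent-and-siblings query can be sketched in plain JavaScript over an in-memory copy of the documents above (`findParent` is an illustrative helper, not a MongoDB API):

```javascript
// In-memory stand-in for the categories collection shown above.
const categories = [
  { _id: "MongoDB", children: [] },
  { _id: "dbm", children: [] },
  { _id: "Databases", children: ["MongoDB", "dbm"] },
  { _id: "Languages", children: [] },
  { _id: "Programming", children: ["Databases", "Languages"] },
  { _id: "Books", children: ["Programming"] }
];

// Equivalent to db.categories.find( { children: "MongoDB" } ):
// a multikey-style match against each element of the children array
// returns the parent node; its children array then yields the siblings.
function findParent(docs, childId) {
  return docs.find(d => d.children.includes(childId));
}

const parent = findParent(categories, "MongoDB");
console.log(parent._id);                                   // "Databases"
console.log(parent.children.filter(c => c !== "MongoDB")); // [ "dbm" ]
```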
Model Tree Structures with an Array of Ancestors 
Overview 
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how 
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 133) 
for a full high level overview of data modeling in MongoDB. 
This document describes a data model that describes a tree-like structure in MongoDB documents using references 
(page 135) to parent nodes and an array that stores all ancestors. 
Pattern 
The Array of Ancestors pattern stores each tree node in a document; in addition to the tree node, the document stores 
in an array the id(s) of the node's ancestors or path. 
Consider the following hierarchy of categories: 
The following example models the tree using Array of Ancestors. In addition to the ancestors field, these documents 
also store the reference to the immediate parent category in the parent field: 
db.categories.insert( { _id: "MongoDB", ancestors: [ "Books", "Programming", "Databases" ], parent: "Databases" } ) 
db.categories.insert( { _id: "dbm", ancestors: [ "Books", "Programming", "Databases" ], parent: "Databases" } ) 
db.categories.insert( { _id: "Databases", ancestors: [ "Books", "Programming" ], parent: "Programming" } ) 
db.categories.insert( { _id: "Languages", ancestors: [ "Books", "Programming" ], parent: "Programming" } ) 
db.categories.insert( { _id: "Programming", ancestors: [ "Books" ], parent: "Books" } ) 
db.categories.insert( { _id: "Books", ancestors: [ ], parent: null } ) 
• The query to retrieve the ancestors or path of a node is fast and straightforward: 
db.categories.findOne( { _id: "MongoDB" } ).ancestors 
• You can create an index on the field ancestors to enable fast search by the ancestors nodes: 
db.categories.ensureIndex( { ancestors: 1 } ) 
• You can query by the field ancestors to find all its descendants: 
Figure 4.8: Tree data model for a sample hierarchy of categories. 
db.categories.find( { ancestors: "Programming" } ) 
The Array of Ancestors pattern provides a fast and efficient solution to find the descendants and the ancestors of a node 
by creating an index on the elements of the ancestors field. This makes Array of Ancestors a good choice for working 
with subtrees. 
The Array of Ancestors pattern is slightly slower than the Materialized Paths (page 151) pattern but is more straightforward to use. 
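The single-pass descendant lookup can be sketched in plain JavaScript over an in-memory copy of the documents above:

```javascript
// In-memory stand-in for the categories collection shown above.
const categories = [
  { _id: "MongoDB", ancestors: ["Books", "Programming", "Databases"], parent: "Databases" },
  { _id: "dbm", ancestors: ["Books", "Programming", "Databases"], parent: "Databases" },
  { _id: "Databases", ancestors: ["Books", "Programming"], parent: "Programming" },
  { _id: "Languages", ancestors: ["Books", "Programming"], parent: "Programming" },
  { _id: "Programming", ancestors: ["Books"], parent: "Books" },
  { _id: "Books", ancestors: [], parent: null }
];

// Equivalent to db.categories.find( { ancestors: "Programming" } ):
// a single multikey-style match finds every descendant, with no
// per-level queries as in the Parent References pattern.
const descendants = categories
  .filter(d => d.ancestors.includes("Programming"))
  .map(d => d._id);

console.log(descendants); // [ "MongoDB", "dbm", "Databases", "Languages" ]
```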
Model Tree Structures with Materialized Paths 
Overview 
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how 
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 133) 
for a full high level overview of data modeling in MongoDB. 
This document describes a data model that describes a tree-like structure in MongoDB documents by storing full 
relationship paths between documents. 
Pattern 
The Materialized Paths pattern stores each tree node in a document; in addition to the tree node, the document stores 
as a string the id(s) of the node's ancestors or path. Although the Materialized Paths pattern requires additional steps of 
working with strings and regular expressions, the pattern also provides more flexibility in working with the path, such 
as finding nodes by partial paths. 
Consider the following hierarchy of categories: 
The following example models the tree using Materialized Paths, storing the path in the field path; the path string 
uses the comma , as a delimiter: 
db.categories.insert( { _id: "Books", path: null } ) 
db.categories.insert( { _id: "Programming", path: ",Books," } ) 
db.categories.insert( { _id: "Databases", path: ",Books,Programming," } ) 
db.categories.insert( { _id: "Languages", path: ",Books,Programming," } ) 
db.categories.insert( { _id: "MongoDB", path: ",Books,Programming,Databases," } ) 
db.categories.insert( { _id: "dbm", path: ",Books,Programming,Databases," } ) 
• You can query to retrieve the whole tree, sorting by the field path: 
db.categories.find().sort( { path: 1 } ) 
• You can use regular expressions on the path field to find the descendants of Programming: 
db.categories.find( { path: /,Programming,/ } ) 
• You can also retrieve the descendants of Books where the Books is also at the topmost level of the hierarchy: 
db.categories.find( { path: /^,Books,/ } ) 
• To create an index on the field path use the following invocation: 
db.categories.ensureIndex( { path: 1 } ) 
This index may improve performance depending on the query: 
Figure 4.9: Tree data model for a sample hierarchy of categories. 
– For queries of the Books sub-tree (e.g. /^,Books,/) an index on the path field improves the query 
performance significantly. 
– For queries of the Programming sub-tree (e.g. /,Programming,/), or similar queries of sub-trees, where 
the node might be in the middle of the indexed string, the query must inspect the entire index. 
For these queries an index may provide some performance improvement if the index is significantly smaller 
than the entire collection. 
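The two regular-expression queries can be sketched in plain JavaScript over an in-memory copy of the documents above, showing why only the anchored pattern can use the index as a prefix scan:

```javascript
// In-memory stand-in for the categories collection shown above.
const categories = [
  { _id: "Books", path: null },
  { _id: "Programming", path: ",Books," },
  { _id: "Databases", path: ",Books,Programming," },
  { _id: "Languages", path: ",Books,Programming," },
  { _id: "MongoDB", path: ",Books,Programming,Databases," },
  { _id: "dbm", path: ",Books,Programming,Databases," }
];

// Descendants of Programming: the node may appear anywhere in the path,
// so the pattern is unanchored, like db.categories.find( { path: /,Programming,/ } );
// an index over path must be scanned in full for this query.
const ofProgramming = categories
  .filter(d => d.path !== null && /,Programming,/.test(d.path))
  .map(d => d._id);

// Descendants of the top-level Books node: the pattern is anchored with ^,
// so an index on path can satisfy it with a prefix scan.
const ofBooks = categories
  .filter(d => d.path !== null && /^,Books,/.test(d.path))
  .map(d => d._id);

console.log(ofProgramming); // [ "Databases", "Languages", "MongoDB", "dbm" ]
console.log(ofBooks);       // [ "Programming", "Databases", "Languages", "MongoDB", "dbm" ]
```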
Model Tree Structures with Nested Sets 
Overview 
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how 
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 133) 
for a full high level overview of data modeling in MongoDB. 
This document describes a data model that describes a tree-like structure that optimizes discovering subtrees at the 
expense of tree mutability. 
Pattern 
The Nested Sets pattern identifies each node in the tree as stops in a round-trip traversal of the tree. The application 
visits each node in the tree twice; first during the initial trip, and second during the return trip. The Nested Sets pattern 
stores each tree node in a document; in addition to the tree node, the document stores the id of the node's parent, the node's 
initial stop in the left field, and its return stop in the right field. 
Consider the following hierarchy of categories: 
Figure 4.10: Example of hierarchical data. The numbers identify the stops at nodes during a round-trip traversal of a 
tree. 
The following example models the tree using Nested Sets: 
db.categories.insert( { _id: "Books", parent: 0, left: 1, right: 12 } ) 
db.categories.insert( { _id: "Programming", parent: "Books", left: 2, right: 11 } ) 
db.categories.insert( { _id: "Languages", parent: "Programming", left: 3, right: 4 } ) 
db.categories.insert( { _id: "Databases", parent: "Programming", left: 5, right: 10 } ) 
db.categories.insert( { _id: "MongoDB", parent: "Databases", left: 6, right: 7 } ) 
db.categories.insert( { _id: "dbm", parent: "Databases", left: 8, right: 9 } ) 
You can query to retrieve the descendants of a node: 
var databaseCategory = db.categories.findOne( { _id: "Databases" } ); 
db.categories.find( { left: { $gt: databaseCategory.left }, right: { $lt: databaseCategory.right } } ) 
The Nested Sets pattern provides a fast and efficient solution for finding subtrees but is inefficient for modifying the 
tree structure. As such, this pattern is best for static trees that do not change. 
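The range-based subtree query can be sketched in plain JavaScript (`findSubtree` is an illustrative helper, not a MongoDB API; the `categories` array mirrors the documents above):

```javascript
// In-memory stand-in for the categories collection shown above.
const categories = [
  { _id: "Books", parent: 0, left: 1, right: 12 },
  { _id: "Programming", parent: "Books", left: 2, right: 11 },
  { _id: "Languages", parent: "Programming", left: 3, right: 4 },
  { _id: "Databases", parent: "Programming", left: 5, right: 10 },
  { _id: "MongoDB", parent: "Databases", left: 6, right: 7 },
  { _id: "dbm", parent: "Databases", left: 8, right: 9 }
];

// A node's descendants are exactly the nodes whose [left, right] interval
// nests inside the node's own interval -- one range comparison, no recursion.
function findSubtree(docs, rootId) {
  const root = docs.find(d => d._id === rootId);
  return docs
    .filter(d => d.left > root.left && d.right < root.right)
    .map(d => d._id);
}

console.log(findSubtree(categories, "Databases")); // [ "MongoDB", "dbm" ]
```

Inserting or moving a node, by contrast, forces the left/right stops of many other nodes to be renumbered, which is why the pattern suits static trees.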
4.3.3 Model Specific Application Contexts 
Model Data for Atomic Operations (page 154) Illustrates how embedding fields related to an atomic update within 
the same document ensures that the fields are in sync. 
Model Data to Support Keyword Search (page 155) Describes one method for supporting keyword search by storing 
keywords in an array in the same document as the text field. Combined with a multi-key index, this pattern can 
support an application's keyword search operations. 
Model Monetary Data (page 156) Describes two methods to model monetary data in MongoDB. 
Model Data for Atomic Operations 
Pattern 
In MongoDB, write operations, e.g. db.collection.update(), db.collection.findAndModify(), 
db.collection.remove(), are atomic on the level of a single document. For fields that must be updated together, 
embedding the fields within the same document ensures that the fields can be updated atomically. 
For example, consider a situation where you need to maintain information on books, including the number of copies 
available for checkout as well as the current checkout information. 
The available copies of the book and the checkout information should be in sync. As such, embedding the 
available field and the checkout field within the same document ensures that you can update the two fields 
atomically. 
{ 
_id: 123456789, 
title: "MongoDB: The Definitive Guide", 
author: [ "Kristina Chodorow", "Mike Dirolf" ], 
published_date: ISODate("2010-09-24"), 
pages: 216, 
language: "English", 
publisher_id: "oreilly", 
available: 3, 
checkout: [ { by: "joe", date: ISODate("2012-10-15") } ] 
} 
Then to update with new checkout information, you can use the db.collection.update() method to atomically 
update both the available field and the checkout field: 
db.books.update ( 
{ _id: 123456789, available: { $gt: 0 } }, 
{ 
$inc: { available: -1 }, 
$push: { checkout: { by: "abc", date: new Date() } } 
} 
) 
The operation returns a WriteResult() object that contains information on the status of the operation: 
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 }) 
The nMatched field shows that 1 document matched the update condition, and nModified shows that the operation 
updated 1 document. 
If no document matched the update condition, then nMatched and nModified would be 0 and would indicate that 
you could not check out the book. 
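The conditional update can be sketched as a plain-JavaScript function over an in-memory stand-in for the document above (`checkOut` is an illustrative helper, not a MongoDB API; it mirrors the query condition and the two update modifiers as one step):

```javascript
// In-memory stand-in for the book document shown above.
const book = {
  _id: 123456789,
  title: "MongoDB: The Definitive Guide",
  available: 3,
  checkout: []
};

// Mirrors the conditional update: the test on `available` and the two
// field changes happen together, so the counts cannot drift out of sync.
function checkOut(doc, by, date) {
  if (doc.available > 0) {           // query condition: available: { $gt: 0 }
    doc.available -= 1;              // $inc: { available: -1 }
    doc.checkout.push({ by, date }); // $push: { checkout: { by, date } }
    return { nMatched: 1, nModified: 1 };
  }
  return { nMatched: 0, nModified: 0 };
}

console.log(checkOut(book, "abc", new Date()));
// { nMatched: 1, nModified: 1 }  -- available is now 2
```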
Model Data to Support Keyword Search 
Note: Keyword search is not the same as text search or full text search, and does not provide stemming or other 
text-processing features. See the Limitations of Keyword Indexes (page 156) section for more information. 
In version 2.4, MongoDB introduced a text search feature. See Text Indexes (page 454) for more information. 
If your application needs to perform queries on the content of a field that holds text, you can perform exact matches 
on the text or use $regex to use regular expression pattern matches. However, for many operations on text, these 
methods do not satisfy application requirements. 
This pattern describes one method for supporting application keyword search in MongoDB: store keywords in an 
array in the same document as the text field. Combined with a multi-key index (page 442), this pattern can support 
an application's keyword search operations. 
Pattern 
To add structures to your document to support keyword-based queries, create an array field in your documents and add 
the keywords as strings in the array. You can then create a multi-key index (page 442) on the array and create queries 
that select values from the array. 
Example 
Suppose you have a collection of library volumes for which you want to provide topic-based search. For each volume, 
you add the array topics, and you add as many keywords as needed for a given volume. 
For the Moby-Dick volume you might have the following document: 
{ title : "Moby-Dick" , 
author : "Herman Melville" , 
published : 1851 , 
ISBN : 0451526996 , 
topics : [ "whaling" , "allegory" , "revenge" , "American" , 
"novel" , "nautical" , "voyage" , "Cape Cod" ] 
} 
You then create a multi-key index on the topics array: 
db.volumes.ensureIndex( { topics: 1 } ) 
The multi-key index creates separate index entries for each keyword in the topics array. For example the index 
contains one entry for whaling and another for allegory. 
You then query based on the keywords. For example: 
db.volumes.findOne( { topics : "voyage" }, { title: 1 } ) 
Note: An array with a large number of elements, such as one with several hundred or several thousand keywords, 
will incur greater indexing costs on insertion. 
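One possible way to populate such a keyword array is sketched below. `toKeywords` is a hypothetical helper, not part of MongoDB, and a real application would likely need a more careful tokenizer:

```javascript
// Hypothetical helper: derive a keyword array from free text so the
// resulting documents can be queried through a multi-key index.
function toKeywords(text) {
  return [...new Set(
    text
      .toLowerCase()
      .split(/[^a-z0-9]+/)    // crude tokenizer; splits on non-alphanumerics
      .filter(w => w.length > 2) // drop very short tokens
  )];
}

console.log(toKeywords("Moby-Dick; or, The Whale"));
// [ "moby", "dick", "the", "whale" ]
```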
Limitations of Keyword Indexes 
MongoDB can support keyword searches using specific data models and multi-key indexes (page 442); however, these 
keyword indexes are not sufficient or comparable to full-text products in the following respects: 
• Stemming. Keyword queries in MongoDB cannot parse keywords for root or related words. 
• Synonyms. Keyword-based search features must provide support for synonym or related queries in the 
application layer. 
• Ranking. The keyword lookups described in this document do not provide a way to weight results. 
• Asynchronous Indexing. MongoDB builds indexes synchronously, which means that the indexes used for 
keyword indexes are always current and can operate in real-time. However, asynchronous bulk indexes may be 
more efficient for some kinds of content and workloads. 
Model Monetary Data 
Overview 
MongoDB stores numeric data as either IEEE 754 standard 64-bit floating point numbers or as 32-bit or 64-bit signed 
integers. Applications that handle monetary data often require capturing fractional units of currency. However, 
arithmetic on floating point numbers, as implemented in modern hardware, often does not conform to requirements for 
monetary arithmetic. In addition, some fractional numeric quantities, such as one third and one tenth, have no exact 
representation in binary floating point numbers. 
Note: Arithmetic mentioned on this page refers to server-side arithmetic performed by mongod or mongos, and not 
to client-side arithmetic. 
This document describes two ways to model monetary data in MongoDB: 
• Exact Precision (page 157) which multiplies the monetary value by a power of 10. 
• Arbitrary Precision (page 157) which uses two fields for the value: one field to store the exact monetary value 
as a non-numeric data type and another field to store a floating point approximation of the value. 
Use Cases for Exact Precision Model 
If you regularly need to perform server-side arithmetic on monetary data, the exact precision model may be appropriate. 
For instance: 
• If you need to query the database for exact, mathematically valid matches, use Exact Precision (page 157). 
• If you need to be able to do server-side arithmetic, e.g., $inc, $mul, and aggregation framework 
arithmetic, use Exact Precision (page 157). 
Use Cases for Arbitrary Precision Model 
If there is no need to perform server-side arithmetic on monetary data, modeling monetary data using the arbitrary 
precision model may be suitable. For instance: 
• If you need to handle an arbitrary or unforeseen amount of precision, see Arbitrary Precision (page 157). 
• If server-side approximations are sufficient, possibly with client-side post-processing, see Arbitrary Precision 
(page 157). 
Exact Precision 
To model monetary data using the exact precision model: 
1. Determine the maximum precision needed for the monetary value. For example, your application may require 
precision down to the tenth of one cent for monetary values in USD currency. 
2. Convert the monetary value into an integer by multiplying the value by a power of 10 that ensures the maximum 
precision needed becomes the least significant digit of the integer. For example, if the required maximum 
precision is the tenth of one cent, multiply the monetary value by 1000. 
3. Store the converted monetary value. 
For example, the following scales 9.99 USD by 1000 to preserve precision up to one tenth of a cent. 
{ price: 9990, currency: "USD" } 
The model assumes that for a given currency value: 
• The scale factor is consistent for a currency; i.e., the same scaling factor applies for a given currency. 
• The scale factor is a constant and known property of the currency; i.e., applications can determine the scale 
factor from the currency. 
When using this model, applications must be consistent in performing the appropriate scaling of the values. 
For use cases of this model, see Use Cases for Exact Precision Model (page 156). 
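The scaling step can be sketched with hypothetical helpers (`toStored` and `toDisplay` are illustrative, not MongoDB APIs; the scale factor 1000 is the assumed, known property of the currency, as in the example above):

```javascript
// Assumed scale factor: tenth-of-a-cent precision for USD.
const SCALE = 1000;

function toStored(value) {
  // Math.round guards against binary floating point drift:
  // in IEEE 754 doubles, 9.99 * 1000 is slightly more than 9990.
  return Math.round(value * SCALE);
}

function toDisplay(stored) {
  return stored / SCALE;
}

const doc = { price: toStored(9.99), currency: "USD" };
console.log(doc.price);            // 9990
console.log(toDisplay(doc.price)); // 9.99
```

The rounding step is exactly the kind of consistent scaling discipline the model requires of every writer and reader of the value.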
Arbitrary Precision 
To model monetary data using the arbitrary precision model, store the value in two fields: 
1. In one field, encode the exact monetary value as a non-numeric data type; e.g., BinData or a string. 
2. In the second field, store a double-precision floating point approximation of the exact value. 
The following example uses the arbitrary precision model to store 9.99 USD for the price and 0.25 USD for the 
fee: 
{ 
price: { display: "9.99", approx: 9.9900000000000002, currency: "USD" }, 
fee: { display: "0.25", approx: 0.2499999999999999, currency: "USD" } 
} 
With some care, applications can perform range and sort queries on the field with the numeric approximation. 
However, the use of the approximation field for the query and sort operations requires that applications perform client-side 
post-processing to decode the non-numeric representation of the exact value and then filter out the returned documents 
based on the exact monetary value. 
For use cases of this model, see Use Cases for Arbitrary Precision Model (page 157). 
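A sketch of that two-stage query flow, using in-memory stand-ins for the documents above (the epsilon tolerance is an assumption, chosen so floating point error in the approximation cannot exclude a true match):

```javascript
// In-memory stand-ins for documents using the arbitrary precision model.
const docs = [
  { price: { display: "9.99", approx: 9.9900000000000002, currency: "USD" } },
  { price: { display: "0.25", approx: 0.2499999999999999, currency: "USD" } }
];

// Stage 1 (server side): a range query on the approximation, widened
// slightly so float error cannot exclude a document that truly matches.
const EPS = 1e-9;
const candidates = docs.filter(d => d.price.approx >= 1 - EPS);

// Stage 2 (client side): decode the exact (string) value and filter precisely.
const exact = candidates.filter(d => Number(d.price.display) >= 1);

console.log(exact.map(d => d.price.display)); // [ "9.99" ]
```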
4.4 Data Model Reference 
Documents (page 158) MongoDB stores all data in documents, which are JSON-style data structures composed of 
field-and-value pairs. 
Database References (page 161) Discusses manual references and DBRefs, which MongoDB can use to represent 
relationships between documents. 
GridFS Reference (page 164) Convention for storing large files in a MongoDB Database. 
ObjectId (page 165) A 12-byte BSON type that MongoDB uses as the default value for its documents’ _id field if 
the _id field is not specified. 
BSON Types (page 167) Outlines the unique BSON types used by MongoDB. See BSONspec.org5 for the complete 
BSON specification. 
4.4.1 Documents 
MongoDB stores all data in documents, which are JSON-style data structures composed of field-and-value pairs: 
{ "item": "pencil", "qty": 500, "type": "no.2" } 
Most user-accessible data structures in MongoDB are documents, including: 
• All database records. 
• Query selectors (page 55), which define what records to select for read, update, and delete operations. 
• Update definitions (page 67), which define what fields to modify during an update. 
• Index specifications (page 436), which define what fields to index. 
• Data output by MongoDB for reporting and configuration, such as the output of the serverStatus and the 
replica set configuration document (page 594). 
Document Format 
MongoDB stores documents on disk in the BSON serialization format. BSON is a binary representation of JSON 
documents, though it contains more data types than JSON. For the BSON spec, see bsonspec.org6. See also BSON 
Types (page 167). 
The mongo JavaScript shell and the MongoDB language drivers translate between BSON and the language-specific 
document representation. 
5 http://bsonspec.org/ 
6 http://bsonspec.org/ 
Document Structure 
MongoDB documents are composed of field-and-value pairs and have the following structure: 
{ 
field1: value1, 
field2: value2, 
field3: value3, 
... 
fieldN: valueN 
} 
The value of a field can be any of the BSON data types (page 167), including other documents, arrays, and arrays of 
documents. The following document contains values of varying types: 
var mydoc = { 
_id: ObjectId("5099803df3f4948bd2f98391"), 
name: { first: "Alan", last: "Turing" }, 
birth: new Date('Jun 23, 1912'), 
death: new Date('Jun 07, 1954'), 
contribs: [ "Turing machine", "Turing test", "Turingery" ], 
views : NumberLong(1250000) 
} 
The above fields have the following data types: 
• _id holds an ObjectId. 
• name holds a subdocument that contains the fields first and last. 
• birth and death hold values of the Date type. 
• contribs holds an array of strings. 
• views holds a value of the NumberLong type. 
Field Names 
Field names are strings. 
Documents (page 158) have the following restrictions on field names: 
• The field name _id is reserved for use as a primary key; its value must be unique in the collection, is immutable, 
and may be of any type other than an array. 
• The field names cannot start with the dollar sign ($) character. 
• The field names cannot contain the dot (.) character. 
• The field names cannot contain the null character. 
BSON documents may have more than one field with the same name. Most MongoDB interfaces, however, 
represent MongoDB documents with a structure (e.g. a hash table) that does not support duplicate field names. If you 
need to manipulate documents that have more than one field with the same name, see the driver documentation for 
your driver. 
Some documents created by internal MongoDB processes may have duplicate fields, but no MongoDB process will 
ever add duplicate fields to an existing user document. 
Field Value Limit 
For indexed collections (page 431), the values for the indexed fields have a Maximum Index Key Length limit. 
See Maximum Index Key Length for details. 
Document Limitations 
Documents have the following attributes: 
Document Size Limit 
The maximum BSON document size is 16 megabytes. 
The maximum document size helps ensure that a single document cannot use an excessive amount of RAM or, during 
transmission, an excessive amount of bandwidth. To store documents larger than the maximum size, MongoDB provides 
the GridFS API. See mongofiles and the documentation for your driver for more information about GridFS. 
Document Field Order 
MongoDB preserves the order of the document fields following write operations except for the following cases: 
• The _id field is always the first field in the document. 
• Updates that include renaming of field names may result in the reordering of fields in the document. 
Changed in version 2.6: Starting in version 2.6, MongoDB actively attempts to preserve the field order in a document. 
Before version 2.6, MongoDB did not actively preserve the order of the fields in a document. 
The _id Field 
The _id field has the following behavior and constraints: 
• By default, MongoDB creates a unique index on the _id field during the creation of a collection. 
• The _id field is always the first field in the documents. If the server receives a document that does not have the 
_id field first, then the server will move the field to the beginning. 
• The _id field may contain values of any BSON data type (page 167), other than an array. 
Warning: To ensure functioning replication, do not store values that are of the BSON regular expression 
type in the _id field. 
The following are common options for storing values for _id: 
• Use an ObjectId (page 165). 
• Use a natural unique identifier, if available. This saves space and avoids an additional index. 
• Generate an auto-incrementing number. See Create an Auto-Incrementing Sequence Field (page 113). 
• Generate a UUID in your application code. For a more efficient storage of the UUID values in the collection 
and in the _id index, store the UUID as a value of the BSON BinData type. 
Index keys that are of the BinData type are more efficiently stored in the index if: 
– the binary subtype value is in the range of 0-7 or 128-135, and 
– the length of the byte array is: 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, or 32. 
• Use your driver’s BSON UUID facility to generate UUIDs. Be aware that driver implementations may 
implement UUID serialization and deserialization logic differently, which may not be fully compatible with other 
drivers. See your driver documentation7 for information concerning UUID interoperability. 
Note: Most MongoDB driver clients will include the _id field and generate an ObjectId before sending the insert 
operation to MongoDB; however, if the client sends a document without an _id field, the mongod will add the _id 
field and generate the ObjectId. 
Dot Notation 
MongoDB uses the dot notation to access the elements of an array and to access the fields of a subdocument. 
To access an element of an array by the zero-based index position, concatenate the array name with the dot (.) and 
zero-based index position, and enclose in quotes: 
'<array>.<index>' 
To access a field of a subdocument with dot-notation, concatenate the subdocument name with the dot (.) and the 
field name, and enclose in quotes: 
'<subdocument>.<field>' 
See also: 
• Embedded Documents (page 89) for dot notation examples with subdocuments. 
• Arrays (page 90) for dot notation examples with arrays. 
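The path semantics can be illustrated with a small plain-JavaScript resolver (`resolve` is not a MongoDB API; it only mirrors how the dot-separated keys are interpreted):

```javascript
// Illustrative resolver: walks a dot-notation path the way
// '<subdocument>.<field>' and '<array>.<index>' are interpreted.
function resolve(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

const mydoc = {
  name: { first: "Alan", last: "Turing" },
  contribs: ["Turing machine", "Turing test", "Turingery"]
};

console.log(resolve(mydoc, "name.last"));  // "Turing"
console.log(resolve(mydoc, "contribs.0")); // "Turing machine"
```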
4.4.2 Database References 
MongoDB does not support joins. In MongoDB, some data is denormalized, or stored with related data in documents, to 
remove the need for joins. However, in some cases it makes sense to store related information in separate documents, 
typically in different collections or databases. 
MongoDB applications use one of two methods for relating documents: 
1. Manual references (page 162) where you save the _id field of one document in another document as a reference. 
Then your application can run a second query to return the related data. These references are simple and 
sufficient for most use cases. 
2. DBRefs (page 162) are references from one document to another using the value of the first document’s _id 
field, collection name, and, optionally, its database name. By including these names, DBRefs allow documents 
located in multiple collections to be more easily linked with documents from a single collection. 
To resolve DBRefs, your application must perform additional queries to return the referenced documents. Many 
drivers have helper methods that form the query for the DBRef automatically. The drivers 8 do not automatically 
resolve DBRefs into documents. 
DBRefs provide a common format and type to represent relationships among documents. The DBRef format 
also provides common semantics for representing links between documents if your database must interact with 
multiple frameworks and tools. 
Unless you have a compelling reason to use DBRefs, use manual references instead. 
7 http://api.mongodb.org/ 
8 Some community supported drivers may have alternate behavior and may resolve a DBRef into a document automatically. 
Manual References 
Background 
Using manual references is the practice of including one document’s _id field in another document. The application 
can then issue a second query to resolve the referenced fields as needed. 
Process 
Consider the following operation to insert two documents, using the _id field of the first document as a reference in 
the second document: 
original_id = ObjectId() 
db.places.insert({ 
"_id": original_id, 
"name": "Broadway Center", 
"url": "bc.example.net" 
}) 
db.people.insert({ 
"name": "Erin", 
"places_id": original_id, 
"url": "bc.example.net/Erin" 
}) 
Then, when a query returns the document from the people collection you can, if needed, make a second query for 
the document referenced by the places_id field in the places collection. 
Use 
For nearly every case where you want to store a relationship between two documents, use manual references 
(page 162). The references are simple to create and your application can resolve references as needed. 
The only limitation of manual linking is that these references do not convey the database and collection names. If you 
have documents in a single collection that relate to documents in more than one collection, you may need to consider 
using DBRefs (page 162). 
DBRefs 
Background 
DBRefs are a convention for representing a document, rather than a specific reference type. They include the name of 
the collection, and in some cases the database name, in addition to the value from the _id field. 
Format 
DBRefs have the following fields: 
$ref 
The $ref field holds the name of the collection where the referenced document resides. 
$id 
The $id field contains the value of the _id field in the referenced document. 
$db 
Optional. 
Contains the name of the database where the referenced document resides. 
Only some drivers support $db references. 
Example 
DBRef documents resemble the following document: 
{ "$ref" : <value>, "$id" : <value>, "$db" : <value> } 
Consider a document from a collection that stored a DBRef in a creator field: 
{ 
"_id" : ObjectId("5126bbf64aed4daf9e2ab771"), 
// .. application fields 
"creator" : { 
"$ref" : "creators", 
"$id" : ObjectId("5126bc054aed4daf9e2ab772"), 
"$db" : "users" 
} 
} 
The DBRef in this example points to a document in the creators collection of the users database that has 
ObjectId("5126bc054aed4daf9e2ab772") in its _id field. 
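A hedged sketch of what a driver helper might do to resolve such a DBRef by hand: look up the database from the optional $db field, the collection from $ref, and match on $id. The resolveDBRef helper, the databases object, and the "Sam" document are hypothetical, not a real driver API:

```javascript
// Resolve a DBRef-shaped object against an in-memory stand-in for databases.
function resolveDBRef(ref, databases, defaultDb) {
  const dbName = ref.$db || defaultDb;                 // $db is optional
  const collection = databases[dbName][ref.$ref];      // collection named by $ref
  return collection.find(doc => doc._id === ref.$id);  // match on _id
}

const databases = {
  users: { creators: [{ _id: "5126bc054aed4daf9e2ab772", name: "Sam" }] }
};
const creator = resolveDBRef(
  { $ref: "creators", $id: "5126bc054aed4daf9e2ab772", $db: "users" },
  databases, "app");
// creator.name → "Sam"
```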
Note: The order of fields in the DBRef matters, and you must use the above sequence when using a DBRef. 
Support 
C++ The C++ driver contains no support for DBRefs. You can traverse references manually. 
C# The C# driver provides access to DBRef objects with the MongoDBRef Class9 and supplies the FetchDBRef 
Method10 for accessing these objects. 
Java The DBRef11 class provides support for DBRefs from Java. 
JavaScript The mongo shell’s JavaScript interface provides a DBRef. 
Perl The Perl driver contains no support for DBRefs. You can traverse references manually or use the MongoDBx::AutoDeref12 CPAN module. 
PHP The PHP driver supports DBRefs, including the optional $db reference, through The MongoDBRef class13. 
Python The Python driver provides the DBRef class14, and the dereference method15 for interacting with DBRefs. 
9http://api.mongodb.org/csharp/current/html/46c356d3-ed06-a6f8-42fa-e0909ab64ce2.htm 
10http://api.mongodb.org/csharp/current/html/1b0b8f48-ba98-1367-0a7d-6e01c8df436f.htm 
11http://api.mongodb.org/java/current/com/mongodb/DBRef.html 
12http://search.cpan.org/dist/MongoDBx-AutoDeref/ 
13http://www.php.net/manual/en/class.mongodbref.php/ 
14http://api.mongodb.org/python/current/api/bson/dbref.html 
15http://api.mongodb.org//python/current/api/pymongo/database.html#pymongo.database.Database.dereference 
Ruby The Ruby driver supports DBRefs using the DBRef class16 and the dereference method17. 
Use 
In most cases you should use the manual reference (page 162) method for connecting two or more related documents. 
However, if you need to reference documents from multiple collections, consider using DBRefs. 
4.4.3 GridFS Reference 
GridFS stores files in two collections: 
• chunks stores the binary chunks. For details, see The chunks Collection (page 164). 
• files stores the file’s metadata. For details, see The files Collection (page 165). 
GridFS places the collections in a common bucket by prefixing each with the bucket name. By default, GridFS uses 
two collections with names prefixed by the fs bucket: 
• fs.files 
• fs.chunks 
You can choose a different bucket name than fs, and create multiple buckets in a single database. 
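As a small illustration, deriving the two collection names from a bucket name might look like the following; the bucketCollections helper is hypothetical, not a driver API:

```javascript
// Derive the files and chunks collection names from a GridFS bucket name.
const bucketCollections = bucket => [bucket + ".files", bucket + ".chunks"];

// The default bucket is fs:
// bucketCollections("fs") → ["fs.files", "fs.chunks"]
// A custom bucket simply changes the prefix:
// bucketCollections("images") → ["images.files", "images.chunks"]
```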
See also: 
GridFS (page 138) for more information about GridFS. 
The chunks Collection 
Each document in the chunks collection represents a distinct chunk of a file as represented in the GridFS store. The 
following is a prototype document from the chunks collection: 
{ 
"_id" : <ObjectId>, 
"files_id" : <ObjectId>, 
"n" : <num>, 
"data" : <binary> 
} 
A document from the chunks collection contains the following fields: 
chunks._id 
The unique ObjectId of the chunk. 
chunks.files_id 
The _id of the “parent” document, as specified in the files collection. 
chunks.n 
The sequence number of the chunk. GridFS numbers all chunks, starting with 0. 
chunks.data 
The chunk’s payload as a BSON binary type. 
The chunks collection uses a compound index on files_id and n, as described in GridFS Index (page 139). 
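The chunking scheme can be sketched as follows. The makeChunks helper is illustrative only, operates on a string rather than BSON binary data, and uses a tiny chunk size instead of the real 255 kilobyte default:

```javascript
// Split file data into chunk documents of the shape shown above.
function makeChunks(filesId, data, chunkSize) {
  const chunks = [];
  for (let n = 0; n * chunkSize < data.length; n++) {
    chunks.push({
      files_id: filesId,                                    // _id of the "parent" files document
      n: n,                                                 // sequence number, starting with 0
      data: data.slice(n * chunkSize, (n + 1) * chunkSize)  // this chunk's payload
    });
  }
  return chunks;
}
// makeChunks("f1", "abcdefgh", 3) → three chunks with n of 0, 1, 2
```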
16http://api.mongodb.org//ruby/current/BSON/DBRef.html 
17http://api.mongodb.org//ruby/current/Mongo/DB.html#dereference-instance_method 
The files Collection 
Each document in the files collection represents a file in the GridFS store. Consider the following prototype of a 
document in the files collection: 
{ 
"_id" : <ObjectId>, 
"length" : <num>, 
"chunkSize" : <num>, 
"uploadDate" : <timestamp>, 
"md5" : <hash>, 
"filename" : <string>, 
"contentType" : <string>, 
"aliases" : <string array>, 
"metadata" : <dataObject>, 
} 
Documents in the files collection contain some or all of the following fields. Applications may create additional 
arbitrary fields: 
files._id 
The unique ID for this document. The _id is of the data type you chose for the original document. The default 
type for MongoDB documents is BSON ObjectId. 
files.length 
The size of the document in bytes. 
files.chunkSize 
The size of each chunk. GridFS divides the document into chunks of the size specified here. The default size is 
255 kilobytes. 
Changed in version 2.4.10: The default chunk size changed from 256k to 255k. 
files.uploadDate 
The date the document was first stored by GridFS. This value has the Date type. 
files.md5 
An MD5 hash returned by the filemd5 command. This value has the String type. 
files.filename 
Optional. A human-readable name for the document. 
files.contentType 
Optional. A valid MIME type for the document. 
files.aliases 
Optional. An array of alias strings. 
files.metadata 
Optional. Any additional information you want to store. 
4.4.4 ObjectId 
Overview 
ObjectId is a 12-byte BSON type, constructed using: 
• a 4-byte value representing the seconds since the Unix epoch, 
• a 3-byte machine identifier, 
• a 2-byte process id, and 
• a 3-byte counter, starting with a random value. 
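The four-part layout above can be illustrated by packing the components into a 24-character hex string. The makeObjectIdHex helper is hypothetical and does not reproduce how drivers actually obtain the machine identifier, process id, or counter; the component values below are taken from the example ObjectId used later in this section:

```javascript
// Pack the 4 + 3 + 2 + 3 byte components into a 12-byte (24 hex digit) ObjectId.
function makeObjectIdHex(seconds, machineId, pid, counter) {
  const hex = (v, bytes) => v.toString(16).padStart(bytes * 2, "0").slice(-bytes * 2);
  return hex(seconds, 4) + hex(machineId, 3) + hex(pid, 2) + hex(counter, 3);
}

// makeObjectIdHex(0x507f191e, 0x810c19, 0x729d, 0xe860ea)
//   → "507f191e810c19729de860ea"
```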
In MongoDB, documents stored in a collection require a unique _id field that acts as a primary key. Because ObjectIds 
are small, most likely unique, and fast to generate, MongoDB uses ObjectIds as the default value for the _id field if 
the _id field is not specified. MongoDB clients should add an _id field with a unique ObjectId. However, if a client 
does not add an _id field, mongod will add an _id field that holds an ObjectId. 
Using ObjectIds for the _id field provides the following additional benefits: 
• in the mongo shell, you can access the creation time of the ObjectId, using the getTimestamp() method. 
• sorting on an _id field that stores ObjectId values is roughly equivalent to sorting by creation time. 
Important: The relationship between the order of ObjectId values and generation time is not strict within a 
single second. If multiple systems, or multiple processes or threads on a single system, generate values within a 
single second, ObjectId values do not represent a strict insertion order. Clock skew between clients can also 
result in non-strict ordering, because client drivers generate ObjectId values, not the mongod process. 
Also consider the Documents (page 158) section for related information on MongoDB’s document orientation. 
ObjectId() 
The mongo shell provides the ObjectId() wrapper class to generate a new ObjectId, and to provide the following 
helper attribute and methods: 
• str 
The hexadecimal string representation of the object. 
• getTimestamp() 
Returns the timestamp portion of the object as a Date. 
• toString() 
Returns the JavaScript representation in the form of a string literal “ObjectId(...)”. 
Changed in version 2.2: In previous versions toString() returns the hexadecimal string representation, 
which as of version 2.2 can be retrieved by the str property. 
• valueOf() 
Returns the representation of the object as a hexadecimal string. The returned string is the str attribute. 
Changed in version 2.2: In previous versions, valueOf() returns the object. 
Examples 
Consider the following uses of the ObjectId() class in the mongo shell: 
Generate a new ObjectId 
To generate a new ObjectId, use the ObjectId() constructor with no argument: 
x = ObjectId() 
In this example, the value of x would be: 
ObjectId("507f1f77bcf86cd799439011") 
To generate a new ObjectId using the ObjectId() constructor with a unique hexadecimal string: 
y = ObjectId("507f191e810c19729de860ea") 
In this example, the value of y would be: 
ObjectId("507f191e810c19729de860ea") 
Convert an ObjectId into a Timestamp 
To return the timestamp of an ObjectId() object, use the getTimestamp() method as follows: 
ObjectId("507f191e810c19729de860ea").getTimestamp() 
This operation will return the following Date object: 
ISODate("2012-10-17T20:46:22Z") 
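Because the leading 4 bytes are a timestamp, the creation time can also be recovered directly from the hex string, which is essentially what getTimestamp() does; the objectIdTimestamp helper below is illustrative, not a shell method:

```javascript
// Read the first 4 bytes (8 hex digits) of an ObjectId as seconds since the epoch.
function objectIdTimestamp(hexString) {
  return new Date(parseInt(hexString.substring(0, 8), 16) * 1000);
}

// objectIdTimestamp("507f191e810c19729de860ea").toISOString()
//   → "2012-10-17T20:46:22.000Z"
```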
Convert ObjectIds into Strings 
Access the str attribute of an ObjectId() object, as follows: 
ObjectId("507f191e810c19729de860ea").str 
This operation will return the following hexadecimal string: 
507f191e810c19729de860ea 
To return the hexadecimal string representation of an ObjectId(), use the valueOf() method as follows: 
ObjectId("507f191e810c19729de860ea").valueOf() 
This operation returns the following output: 
507f191e810c19729de860ea 
To return the string representation of an ObjectId() object, use the toString() method as follows: 
ObjectId("507f191e810c19729de860ea").toString() 
This operation will return the following output: 
507f191e810c19729de860ea 
4.4.5 BSON Types 
BSON is a binary serialization format used to store documents and make remote procedure calls in MongoDB. The 
BSON specification is located at bsonspec.org18. 
18http://bsonspec.org/ 
BSON supports the following data types as values in documents. Each data type has a corresponding number that can 
be used with the $type operator to query documents by BSON type. 
Type Number 
Double 1 
String 2 
Object 3 
Array 4 
Binary data 5 
Undefined 6 
Object id 7 
Boolean 8 
Date 9 
Null 10 
Regular Expression 11 
JavaScript 13 
Symbol 14 
JavaScript (with scope) 15 
32-bit integer 16 
Timestamp 17 
64-bit integer 18 
Min key 255 
Max key 127 
To determine a field’s type, see Check Types in the mongo Shell (page 252). 
If you convert BSON to JSON, see http://docs.mongodb.org/manual/reference/mongodb-extended-json. 
Comparison/Sort Order 
When comparing values of different BSON types, MongoDB uses the following comparison order, from lowest to 
highest: 
1. MinKey (internal type) 
2. Null 
3. Numbers (ints, longs, doubles) 
4. Symbol, String 
5. Object 
6. Array 
7. BinData 
8. ObjectId 
9. Boolean 
10. Date, Timestamp 
11. Regular Expression 
12. MaxKey (internal type) 
MongoDB treats some types as equivalent for comparison purposes. For instance, numeric types undergo conversion 
before comparison. 
The comparison treats a non-existent field as it would an empty BSON Object. As such, a sort on the a field in 
documents { } and { a: null } would treat the documents as equivalent in sort order. 
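The type brackets above can be sketched as a simple rank lookup; within-type comparison and the type equivalences just described are omitted, and typeOrder and rank are illustrative names only:

```javascript
// The cross-type comparison order, lowest to highest.
const typeOrder = ["MinKey", "Null", "Number", "Symbol/String", "Object",
  "Array", "BinData", "ObjectId", "Boolean", "Date/Timestamp", "RegExp", "MaxKey"];

// Rank of a type bracket; values in a lower bracket sort before any value
// in a higher bracket, regardless of the values themselves.
const rank = name => typeOrder.indexOf(name);
// rank("Null") < rank("Number") → true
```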
With arrays, a less-than comparison or an ascending sort compares the smallest element of arrays, and a greater-than 
comparison or a descending sort compares the largest element of the arrays. As such, when comparing a field whose 
value is a single-element array (e.g. [ 1 ]) with non-array fields (e.g. 2), the comparison is between 1 and 2. A 
comparison of an empty array (e.g. [ ]) treats the empty array as less than null or a missing field. 
MongoDB sorts BinData in the following order: 
1. First, the length or size of the data. 
2. Then, by the BSON one-byte subtype. 
3. Finally, by the data, performing a byte-by-byte comparison. 
The following sections describe special considerations for particular BSON types. 
ObjectId 
ObjectIds are small, likely unique, fast to generate, and ordered. These values consist of 12 bytes, where the first 
four bytes are a timestamp that reflects the ObjectId’s creation. Refer to the ObjectId (page 165) documentation for 
more information. 
String 
BSON strings are UTF-8. In general, drivers for each programming language convert from the language’s string format 
to UTF-8 when serializing and deserializing BSON. This makes it possible to store most international characters in 
BSON strings with ease. 19 In addition, MongoDB $regex queries support UTF-8 in the regex string. 
Timestamps 
BSON has a special timestamp type for internal MongoDB use that is not associated with the regular Date (page 170) 
type. Timestamp values are a 64 bit value where: 
• the first 32 bits are a time_t value (seconds since the Unix epoch) 
• the second 32 bits are an incrementing ordinal for operations within a given second. 
Within a single mongod instance, timestamp values are always unique. 
In replication, the oplog has a ts field. The values in this field reflect the operation time, which uses a BSON 
timestamp value. 
Note: The BSON Timestamp type is for internal MongoDB use. For most cases, in application development, you 
will want to use the BSON date type. See Date (page 170) for more information. 
If you create a BSON Timestamp using the empty constructor (e.g. new Timestamp()), MongoDB will only 
generate a timestamp if you use the constructor in the first field of the document. 20 Otherwise, MongoDB will 
generate an empty timestamp value (i.e. Timestamp(0, 0)). 
Changed in version 2.1: mongo shell displays the Timestamp value with the wrapper: 
Timestamp(<time_t>, <ordinal>) 
Prior to version 2.1, the mongo shell displayed the Timestamp value as a document: 
19 Given strings using UTF-8 character sets, using sort() on strings will be reasonably correct. However, because internally sort() uses the 
C++ strcmp api, the sort order may handle some characters incorrectly. 
20 If the first field in the document is _id, then you can generate a timestamp in the second field of a document. 
{ t : <time_t>, i : <ordinal> } 
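The 64-bit layout described above (32-bit time_t plus 32-bit ordinal) can be sketched with BigInt arithmetic; packTimestamp and unpackTimestamp are illustrative helpers, not driver APIs:

```javascript
// Pack a seconds value and per-second ordinal into one 64-bit value.
function packTimestamp(t, i) {
  return (BigInt(t) << 32n) | BigInt(i);
}

// Recover the { t, i } pair shown in the pre-2.1 shell display.
function unpackTimestamp(v) {
  return { t: Number(v >> 32n), i: Number(v & 0xffffffffn) };
}
```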
Date 
BSON Date is a 64-bit integer that represents the number of milliseconds since the Unix epoch (Jan 1, 1970). This 
results in a representable date range of about 290 million years into the past and future. 
The official BSON specification21 refers to the BSON Date type as the UTC datetime. 
Changed in version 2.0: BSON Date type is signed. 22 Negative values represent dates before 1970. 
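For example, a negative millisecond value constructs a pre-1970 date in the mongo shell or any JavaScript environment:

```javascript
// A signed millisecond count below zero reaches dates before the epoch.
const before1970 = new Date(-86400000); // exactly one day before the epoch
// before1970.toISOString() → "1969-12-31T00:00:00.000Z"
```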
Example 
Construct a Date using the new Date() constructor in the mongo shell: 
var mydate1 = new Date() 
Example 
Construct a Date using the ISODate() constructor in the mongo shell: 
var mydate2 = ISODate() 
Example 
Return the Date value as string: 
mydate1.toString() 
Example 
Return the month portion of the Date value; months are zero-indexed, so that January is month 0: 
mydate1.getMonth() 
21http://bsonspec.org/#/specification 
22 Prior to version 2.0, Date values were incorrectly interpreted as unsigned integers, which affected sorts, range queries, and indexes on Date 
fields. Because indexes are not recreated when upgrading, please re-index if you created an index on Date values with an earlier version, and dates 
before 1970 are relevant to your application. 
CHAPTER 5 
Administration 
The administration documentation addresses the ongoing operation and maintenance of MongoDB instances and deployments. 
This documentation includes both high level overviews of these concerns as well as tutorials that cover 
specific procedures and processes for operating MongoDB. 
Administration Concepts (page 171) Core conceptual documentation of operational practices for managing MongoDB 
deployments and systems. 
MongoDB Backup Methods (page 172) Describes approaches and considerations for backing up a MongoDB 
database. 
Monitoring for MongoDB (page 175) An overview of monitoring tools, diagnostic strategies, and approaches 
to monitoring replica sets and sharded clusters. 
Production Notes (page 188) A collection of notes that describe best practices and considerations for the operations 
of MongoDB instances and deployments. 
Continue reading from Administration Concepts (page 171) for additional documentation of MongoDB administration. 
Administration Tutorials (page 205) Tutorials that describe common administrative procedures and practices for 
operations for MongoDB instances and deployments. 
Configuration, Maintenance, and Analysis (page 205) Describes routine management operations, including 
configuration and performance analysis. 
Backup and Recovery (page 229) Outlines procedures for data backup and restoration with mongod instances 
and deployments. 
Continue reading from Administration Tutorials (page 205) for more tutorials of common MongoDB maintenance 
operations. 
Administration Reference (page 266) Reference and documentation of internal mechanics of administrative features, 
systems, functions, and operations. 
See also: 
The MongoDB Manual contains administrative documentation and tutorials throughout several sections. See Replica 
Set Tutorials (page 543) and Sharded Cluster Tutorials (page 634) for additional tutorials and information. 
5.1 Administration Concepts 
The core administration documents address strategies and practices used in the operation of MongoDB systems and 
deployments. 
Operational Strategies (page 172) Higher level documentation of key concepts for the operation and maintenance of 
MongoDB deployments, including backup, maintenance, and configuration. 
MongoDB Backup Methods (page 172) Describes approaches and considerations for backing up a MongoDB 
database. 
Monitoring for MongoDB (page 175) An overview of monitoring tools, diagnostic strategies, and approaches 
to monitoring replica sets and sharded clusters. 
Run-time Database Configuration (page 182) Outlines common MongoDB configurations and examples of 
best-practice configurations for common use cases. 
Data Management (page 194) Core documentation that addresses issues in data management, organization, maintenance, 
and lifecycle management. 
Data Center Awareness (page 194) Presents the MongoDB features that allow application developers and 
database administrators to configure their deployments to be more data center aware or allow operational 
and location-based separation. 
Expire Data from Collections by Setting TTL (page 198) TTL collections make it possible to automatically 
remove data from a collection based on the value of a timestamp and are useful for managing data like 
machine generated event data that are only useful for a limited period of time. 
Capped Collections (page 196) Capped collections provide a special type of size-constrained collections that 
preserve insertion order and can support high volume inserts. 
Optimization Strategies for MongoDB (page 200) Techniques for optimizing application performance with MongoDB. 
5.1.1 Operational Strategies 
These documents address higher level strategies for common administrative tasks and requirements with respect to 
MongoDB deployments. 
MongoDB Backup Methods (page 172) Describes approaches and considerations for backing up a MongoDB 
database. 
Monitoring for MongoDB (page 175) An overview of monitoring tools, diagnostic strategies, and approaches to 
monitoring replica sets and sharded clusters. 
Run-time Database Configuration (page 182) Outlines common MongoDB configurations and examples of best-practice 
configurations for common use cases. 
Import and Export MongoDB Data (page 186) Provides an overview of mongoimport and mongoexport, the 
tools MongoDB includes for importing and exporting data. 
Production Notes (page 188) A collection of notes that describe best practices and considerations for the operations 
of MongoDB instances and deployments. 
MongoDB Backup Methods 
When deploying MongoDB in production, you should have a strategy for capturing and restoring backups in the case 
of data loss events. There are several ways to back up MongoDB clusters: 
• Backup by Copying Underlying Data Files (page 173) 
• Backup with mongodump (page 173) 
• MongoDB Management Service (MMS) Cloud Backup (page 174) 
• MongoDB Management Service (MMS) On Prem Backup Software (page 174) 
Backup by Copying Underlying Data Files 
You can create a backup by copying MongoDB’s underlying data files. 
If the volume where MongoDB stores data files supports point in time snapshots, you can use these snapshots to create 
backups of a MongoDB system at an exact moment in time. 
File system snapshots are an operating system volume manager feature, and are not specific to MongoDB. The 
mechanics of snapshots depend on the underlying storage system. For example, Amazon’s EBS storage 
system for EC2 supports snapshots. On Linux, the LVM manager can create a snapshot. 
To get a correct snapshot of a running mongod process, you must have journaling enabled and the journal must reside 
on the same logical volume as the other MongoDB data files. Without journaling enabled, there is no guarantee that 
the snapshot will be consistent or valid. 
To get a consistent snapshot of a sharded system, you must disable the balancer and capture a snapshot from every 
shard and a config server at approximately the same moment in time. 
If your storage system does not support snapshots, you can copy the files directly using cp, rsync, or a similar tool. 
Since copying multiple files is not an atomic operation, you must stop all writes to the mongod before copying the 
files. Otherwise, you will copy the files in an invalid state. 
Backups produced by copying the underlying data do not support point in time recovery for replica sets and are 
difficult to manage for larger sharded clusters. Additionally, these backups are larger because they include the indexes 
and duplicate underlying storage padding and fragmentation. mongodump, by contrast, creates smaller backups. 
For more information, see the Backup and Restore with Filesystem Snapshots (page 229) and Backup a Sharded Cluster 
with Filesystem Snapshots (page 239) for complete instructions on using LVM to create snapshots. Also see Back up 
and Restore Processes for MongoDB on Amazon EC21. 
Backup with mongodump 
The mongodump tool reads data from a MongoDB database and creates high fidelity BSON files. The 
mongorestore tool can populate a MongoDB database with the data from these BSON files. These tools are 
simple and efficient for backing up small MongoDB deployments, but are not ideal for capturing backups of larger 
systems. 
mongodump and mongorestore can operate against a running mongod process, and can manipulate the underlying 
data files directly. By default, mongodump does not capture the contents of the local database (page 598). 
mongodump only captures the documents in the database. The resulting backup is space efficient, but 
mongorestore or mongod must rebuild the indexes after restoring data. 
When connected to a MongoDB instance, mongodump can adversely affect mongod performance. If your data is 
larger than system memory, the queries will push the working set out of memory. 
To mitigate the impact of mongodump on the performance of the replica set, use mongodump to capture backups 
from a secondary (page 508) member of a replica set. Alternatively, you can shut down a secondary and use 
mongodump with the data files directly. If you shut down a secondary to capture data with mongodump, ensure that 
the operation can complete before its oplog becomes too stale to continue replicating. 
For replica sets, mongodump also supports a point in time feature with the --oplog option. Applications may 
continue modifying data while mongodump captures the output. To restore a point in time backup created with 
--oplog, use mongorestore with the --oplogReplay option. 
If applications modify data while mongodump is creating a backup, mongodump will compete for resources with 
those applications. 
1http://docs.mongodb.org/ecosystem/tutorial/backup-and-restore-mongodb-on-amazon-ec2 
See Back Up and Restore with MongoDB Tools (page 234), Backup a Small Sharded Cluster with mongodump 
(page 238), and Backup a Sharded Cluster with Database Dumps (page 241) for more information. 
MongoDB Management Service (MMS) Cloud Backup 
The MongoDB Management Service2 supports backup and restore for MongoDB deployments. 
MMS continually backs up MongoDB replica sets and sharded systems by reading the oplog data from your MongoDB 
cluster. 
MMS Backup offers point in time recovery of MongoDB replica sets and a consistent snapshot of sharded systems. 
MMS achieves point in time recovery by storing oplog data so that it can create a restore for any moment in time in 
the last 24 hours for a particular replica set. 
For sharded systems, MMS does not provide restores for arbitrary moments in time. MMS does provide periodic 
consistent snapshots of the entire sharded cluster. Sharded cluster snapshots are difficult to achieve with other MongoDB 
backup methods. 
To restore a MongoDB cluster from an MMS Backup snapshot, you download a compressed archive of your MongoDB 
data files and distribute those files before restarting the mongod processes. 
To get started with MMS Backup, sign up for MMS3. For complete documentation of MMS, see the MMS 
Manual4. 
MongoDB Management Service (MMS) On Prem Backup Software 
MongoDB Subscribers can install and run the same core software that powers MongoDB Management Service (MMS) 
Cloud Backup (page 174) on their own infrastructure. The On Prem version of MMS has similar functionality to the 
cloud version and is available with Standard and Enterprise subscriptions. 
For more information about On Prem MMS see the MongoDB subscription5 page and the MMS On Prem Manual6. 
Further Reading 
Backup and Restore with Filesystem Snapshots (page 229) An outline of procedures for creating MongoDB data set 
backups using a system-level file snapshot tool, such as LVM or native storage appliance tools. 
Restore a Replica Set from MongoDB Backups (page 232) Describes the procedure for restoring a replica set from an 
archived backup such as a mongodump or MMS Backup7 file. 
Back Up and Restore with MongoDB Tools (page 234) The procedure for writing the contents of a database to a 
BSON (i.e. binary) dump file for backing up MongoDB databases. 
Backup and Restore Sharded Clusters (page 238) Detailed procedures and considerations for backing up sharded 
clusters and single shards. 
Recover Data after an Unexpected Shutdown (page 246) Recover data from MongoDB data files that were not properly 
closed or have an invalid state. 
2https://mms.10gen.com/?pk_campaign=MongoDB-Org&pk_kwd=Backup-Docs 
3http://mms.mongodb.com 
4https://mms.mongodb.com/help/ 
5https://www.mongodb.com/products/subscriptions 
6https://mms.mongodb.com/help-hosted/current/ 
7https://mms.mongodb.com/?pk_campaign=mongodb-docs-admin-tutorials 
Monitoring for MongoDB 
Monitoring is a critical component of all database administration. A firm grasp of MongoDB’s reporting will allow you 
to assess the state of your database and maintain your deployment without crisis. Additionally, a sense of MongoDB’s 
normal operational parameters will allow you to diagnose issues before they escalate to failures. 
This document presents an overview of the available monitoring utilities and the reporting statistics available in 
MongoDB. It also introduces diagnostic strategies and suggestions for monitoring replica sets and sharded clusters. 
Note: MongoDB Management Service (MMS)8 is a hosted monitoring service which collects and aggregates data 
to provide insight into the performance and operation of MongoDB deployments. See the MMS documentation9 for 
more information. 
Monitoring Strategies 
There are three methods for collecting data about the state of a running MongoDB instance: 
• First, there is a set of utilities distributed with MongoDB that provides real-time reporting of database activities. 
• Second, database commands return statistics regarding the current database state with greater fidelity. 
• Third, MMS Monitoring Service10 collects data from running MongoDB deployments and provides visualization 
and alerts based on that data. MMS is a free service provided by MongoDB. 
Each strategy can help answer different questions and is useful in different contexts. These methods are complementary. 
MongoDB Reporting Tools 
This section provides an overview of the reporting methods distributed with MongoDB. It also offers examples of the 
kinds of questions that each method is best suited to help you address. 
Utilities The MongoDB distribution includes a number of utilities that quickly return statistics about instances’ 
performance and activity. Typically, these are most useful for diagnosing issues and assessing normal operation. 
mongostat mongostat captures and returns the counts of database operations by type (e.g. insert, query, update, 
delete, etc.). These counts report on the load distribution on the server. 
Use mongostat to understand the distribution of operation types and to inform capacity planning. See the 
mongostat manual for details. 
mongotop mongotop tracks and reports the current read and write activity of a MongoDB instance, and reports 
these statistics on a per collection basis. 
Use mongotop to check if your database activity and use match your expectations. See the mongotop manual 
for details. 
8https://mms.mongodb.com/?pk_campaign=mongodb-org&pk_kwd=monitoring 
9http://mms.mongodb.com/help/ 
10https://mms.mongodb.com/?pk_campaign=mongodb-org&pk_kwd=monitoring 
REST Interface MongoDB provides a simple REST interface that can be useful for configuring monitoring and 
alert scripts, and for other administrative tasks. 
To enable, configure mongod to use REST, either by starting mongod with the --rest option, or by setting the 
net.http.RESTInterfaceEnabled setting to true in a configuration file. 
For more information on using the REST Interface see, the Simple REST Interface11 documentation. 
HTTP Console MongoDB provides a web interface that exposes diagnostic and monitoring information in a simple 
web page. The web interface is accessible at localhost:<port>, where the <port> number is 1000 more than 
the mongod port. 
For example, if a locally running mongod is using the default port 27017, access the HTTP console at 
http://localhost:28017. 
Commands MongoDB includes a number of commands that report on the state of the database. 
These data may provide a finer level of granularity than the utilities discussed above. Consider using their output 
in scripts and programs to develop custom alerts, or to modify the behavior of your application in response to the 
activity of your instance. The db.currentOp method is another useful tool for identifying the database instance’s 
in-progress operations. 
serverStatus The serverStatus command, or db.serverStatus() from the shell, returns a general 
overview of the status of the database, detailing disk usage, memory use, connection, journaling, and index access. 
The command returns quickly and does not impact MongoDB performance. 
serverStatus outputs an account of the state of a MongoDB instance. This command is rarely run directly. In 
most cases, the data is more meaningful when aggregated, as one would see with monitoring tools including MMS12. 
Nevertheless, all administrators should be familiar with the data provided by serverStatus. 
dbStats The dbStats command, or db.stats() from the shell, returns a document that addresses storage use 
and data volumes. The dbStats reflect the amount of storage used, the quantity of data contained in the database, 
and object, collection, and index counters. 
Use this data to monitor the state and storage capacity of a specific database. This output also allows you to compare 
use between databases and to determine the average document size in a database. 
collStats The collStats command provides statistics that resemble dbStats at the collection level, including a count 
of the objects in the collection, the size of the collection, the amount of disk space used by the collection, and information 
about its indexes. 
replSetGetStatus The replSetGetStatus command (rs.status() from the shell) returns an 
overview of your replica set’s status. The replSetGetStatus document details the state and configuration of 
the replica set and statistics about its members. 
Use this data to ensure that replication is properly configured, and to check the connections between the current host 
and the other members of the replica set. 
Third Party Tools A number of third party monitoring tools have support for MongoDB, either directly, or through 
their own plugins. 
11http://docs.mongodb.org/ecosystem/tools/http-interfaces 
12http://mms.mongodb.com 
Self Hosted Monitoring Tools These are monitoring tools that you must install, configure and maintain on your 
own servers. Most are open source. 
Tool | Plugin | Description 
Ganglia26 | mongodb-ganglia27 | Python script to report operations per second, memory usage, btree statistics, master/slave status and current connections. 
Ganglia | gmond_python_modules28 | Parses output from the serverStatus and replSetGetStatus commands. 
Motop29 | None | Realtime monitoring tool for MongoDB servers. Shows current operations ordered by durations every second. 
mtop30 | None | A top like tool. 
Munin31 | mongo-munin32 | Retrieves server statistics. 
Munin | mongomon33 | Retrieves collection statistics (sizes, index sizes, and each (configured) collection count for one DB). 
Munin | munin-plugins Ubuntu PPA34 | Some additional munin plugins not in the main distribution. 
Nagios35 | nagios-plugin-mongodb36 | A simple Nagios check script, written in Python. 
Zabbix37 | mikoomi-mongodb38 | Monitors availability, resource utilization, health, performance and other important metrics. 
Also consider dex39, an index and query analyzing tool for MongoDB that compares MongoDB log files and indexes 
to make indexing recommendations. 
As part of MongoDB Enterprise40, you can run MMS On-Prem41, which offers the features of MMS in a package that 
runs within your infrastructure. 
Hosted (SaaS) Monitoring Tools These are monitoring tools provided as a hosted service, usually through a paid 
subscription. 
26http://sourceforge.net/apps/trac/ganglia/wiki 
27https://github.com/quiiver/mongodb-ganglia 
28https://github.com/ganglia/gmond_python_modules 
29https://github.com/tart/motop 
30https://github.com/beaufour/mtop 
31http://munin-monitoring.org/ 
32https://github.com/erh/mongo-munin 
33https://github.com/pcdummy/mongomon 
34https://launchpad.net/~chris-lea/+archive/munin-plugins 
35http://www.nagios.org/ 
36https://github.com/mzupan/nagios-plugin-mongodb 
37http://www.zabbix.com/ 
38https://code.google.com/p/mikoomi/wiki/03 
39https://github.com/mongolab/dex 
40http://www.mongodb.com/products/mongodb-enterprise 
41http://mms.mongodb.com 
Name | Notes 
MongoDB Management Service50 | MMS is a cloud-based suite of services for managing MongoDB deployments. MMS provides monitoring and backup functionality. 
Scout51 | Several plugins, including MongoDB Monitoring52, MongoDB Slow Queries53, and MongoDB Replica Set Monitoring54. 
Server Density55 | Dashboard for MongoDB56, MongoDB specific alerts, replication failover timeline, and iPhone, iPad and Android mobile apps. 
Application Performance Management57 | IBM has an Application Performance Management SaaS offering that includes monitoring for MongoDB and other applications and middleware. 
Process Logging 
During normal operation, mongod and mongos instances report a live account of all server activity and operations to 
either standard output or a log file. The following runtime settings control these options. 
• quiet. Limits the amount of information written to the log or output. 
• verbosity. Increases the amount of information written to the log or output. 
• path. Enables logging to a file, rather than the standard output. You must specify the full path to the log file 
when adjusting this setting. 
• logAppend. Adds information to a log file instead of overwriting the file. 
Note: You can specify these configuration options as command-line arguments to mongod or mongos. 
For example: 
mongod -v --logpath /var/log/mongodb/server1.log --logappend 
This starts a mongod instance in verbose mode, appending data to the log file at 
/var/log/mongodb/server1.log. 
The following database commands also affect logging: 
• getLog. Displays recent messages from the mongod process log. 
• logRotate. Rotates the log files for mongod processes only. See Rotate Log Files (page 214). 
50https://mms.mongodb.com/?pk_campaign=mongodb-org&pk_kwd=monitoring 
51http://scoutapp.com 
52https://scoutapp.com/plugin_urls/391-mongodb-monitoring 
53http://scoutapp.com/plugin_urls/291-mongodb-slow-queries 
54http://scoutapp.com/plugin_urls/2251-mongodb-replica-set-monitoring 
55http://www.serverdensity.com 
56http://www.serverdensity.com/mongodb-monitoring/ 
57http://ibmserviceengage.com 
Diagnosing Performance Issues 
Degraded performance in MongoDB is typically a function of the relationship between the quantity of data stored 
in the database, the amount of system RAM, the number of connections to the database, and the amount of time the 
database spends in a locked state. 
In some cases performance issues may be transient and related to traffic load, data access patterns, or the availability 
of hardware on the host system for virtualized environments. Some users also experience performance limitations as a 
result of inadequate or inappropriate indexing strategies, or as a consequence of poor schema design patterns. In other 
situations, performance issues may indicate that the database may be operating at capacity and that it is time to add 
additional capacity to the database. 
The following are some causes of degraded performance in MongoDB. 
Locks MongoDB uses a locking system to ensure data set consistency. However, if certain operations are long-running, 
or a queue forms, performance will slow as requests and operations wait for the lock. Lock-related slowdowns 
can be intermittent. To see if the lock has been affecting your performance, look to the data in the globalLock section 
of the serverStatus output. If globalLock.currentQueue.total is consistently high, then there is a 
chance that a large number of requests are waiting for a lock. This indicates a possible concurrency issue that may be 
affecting performance. 
If globalLock.totalTime is high relative to uptime, the database has existed in a lock state for a significant 
amount of time. If globalLock.ratio is also high, MongoDB has likely been processing a large number of 
long running queries. Long queries are often the result of a number of factors: ineffective use of indexes, non-optimal 
schema design, poor query structure, system architecture issues, or insufficient RAM resulting in page faults 
(page 179) and disk reads. 
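As an illustration, the checks described above can be sketched against a serverStatus-style document. The field names (globalLock.currentQueue.total, globalLock.ratio) follow the output described in this section, but the thresholds below are arbitrary assumptions for the example, not MongoDB recommendations:

```python
def lock_pressure(server_status, queue_threshold=50, ratio_threshold=0.2):
    """Flag possible lock contention from a serverStatus-like dict.

    Thresholds are illustrative assumptions; tune them for your
    deployment rather than treating them as MongoDB defaults.
    """
    gl = server_status.get("globalLock", {})
    queued = gl.get("currentQueue", {}).get("total", 0)
    ratio = gl.get("ratio", 0.0)
    findings = []
    if queued >= queue_threshold:
        # A consistently large queue suggests many requests waiting for the lock.
        findings.append("many operations queued for the lock")
    if ratio >= ratio_threshold:
        # A high lock ratio suggests long-running queries holding the lock.
        findings.append("high proportion of time spent locked")
    return findings
```

A monitoring script would call this on each sample of db.serverStatus() output and alert when the returned list is non-empty.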
Memory Usage MongoDB uses memory mapped files to store data. Given a data set of sufficient size, the MongoDB 
process will allocate all available memory on the system for its use. While this is part of the design, and affords 
MongoDB superior performance, the memory mapped files make it difficult to determine if the amount of RAM is 
sufficient for the data set. 
The memory usage metrics in the serverStatus output can provide insight into MongoDB’s memory use. 
Check the resident memory use (i.e. mem.resident): if this exceeds the amount of system memory and there is a 
significant amount of data on disk that isn’t in RAM, you may have exceeded the capacity of your system. 
You should also check the amount of mapped memory (i.e. mem.mapped). If this value is greater than the amount of 
system memory, some operations will incur page faults to read data from disk, which negatively 
affects performance. 
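The resident and mapped comparisons above can be sketched as a small check. The mem.resident and mem.mapped fields are reported in megabytes in serverStatus; the warning strings are illustrative:

```python
def memory_health(server_status, system_ram_mb):
    """Compare mem.resident / mem.mapped (in MB, as serverStatus reports
    them) against system RAM, returning a list of warnings."""
    mem = server_status.get("mem", {})
    resident = mem.get("resident", 0)
    mapped = mem.get("mapped", 0)
    warnings = []
    if resident > system_ram_mb:
        # Resident set larger than RAM: the working set no longer fits.
        warnings.append("resident set exceeds system memory")
    if mapped > system_ram_mb:
        # More mapped data than RAM: some reads must page in from disk.
        warnings.append("mapped memory exceeds system memory; expect page faults")
    return warnings
```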
Page Faults Page faults can occur as MongoDB reads from or writes data to parts of its data files that are not 
currently located in physical memory. In contrast, operating system page faults happen when physical memory is 
exhausted and pages of physical memory are swapped to disk. 
Page faults triggered by MongoDB are reported as the total number of page faults in one second. To check for page 
faults, see the extra_info.page_faults value in the serverStatus output. 
MongoDB on Windows counts both hard and soft page faults. 
The MongoDB page fault counter may increase dramatically in moments of poor performance and may correlate 
with limited physical memory environments. Page faults also can increase while accessing much larger data sets, 
for example, scanning an entire collection. Limited and sporadic MongoDB page faults do not necessarily indicate a 
problem or a need to tune the database. 
A single page fault completes quickly and is not problematic. However, in aggregate, large volumes of page faults 
typically indicate that MongoDB is reading too much data from disk. In many situations, MongoDB’s read locks will 
“yield” after a page fault to allow other processes to read and avoid blocking while waiting for the next page to read 
into memory. This approach improves concurrency, and also improves overall throughput in high volume systems. 
Increasing the amount of RAM accessible to MongoDB may help reduce the frequency of page faults. If this is not 
possible, you may want to consider deploying a sharded cluster or adding shards to your deployment to distribute load 
among mongod instances. 
See What are page faults? (page 715) for more information. 
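Because extra_info.page_faults is a cumulative counter, a per-second fault rate can be derived from two successive serverStatus samples, as in this sketch:

```python
def page_fault_rate(sample_a, sample_b, interval_seconds):
    """Approximate page faults per second between two serverStatus
    samples, using the cumulative extra_info.page_faults counter."""
    a = sample_a.get("extra_info", {}).get("page_faults", 0)
    b = sample_b.get("extra_info", {}).get("page_faults", 0)
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    # Guard against counter resets (e.g. after a mongod restart).
    return max(0, b - a) / interval_seconds
```

For example, samples taken 60 seconds apart showing counters of 100 and 700 indicate roughly 10 faults per second.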
Number of Connections In some cases, the number of connections between the application layer (i.e. clients) and 
the database can overwhelm the ability of the server to handle requests. This can produce performance irregularities. 
The following fields in the serverStatus document can provide insight: 
• globalLock.activeClients contains a counter of the total number of clients with active operations in 
progress or queued. 
• connections is a container for the following two fields: 
– current the total number of current clients that connect to the database instance. 
– available the total number of unused connections available for new clients. 
If requests are high because there are numerous concurrent application requests, the database may have trouble keeping 
up with demand. If this is the case, then you will need to increase the capacity of your deployment. For read-heavy 
applications increase the size of your replica set and distribute read operations to secondary members. For write heavy 
applications, deploy sharding and add one or more shards to a sharded cluster to distribute load among mongod 
instances. 
Spikes in the number of connections can also be the result of application or driver errors. All of the officially supported 
MongoDB drivers implement connection pooling, which allows clients to use and reuse connections more efficiently. 
Extremely high numbers of connections, particularly without corresponding workload, are often indicative of a driver or 
other configuration error. 
Unless constrained by system-wide limits, MongoDB has no limit on incoming connections. You can modify system 
limits using the ulimit command, or by editing your system’s /etc/sysctl file. See UNIX ulimit Settings 
(page 266) for more information. 
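The connections.current and connections.available counters together give the total capacity, so the fraction of capacity in use is straightforward to compute. A sketch, with field names taken from the serverStatus output described above:

```python
def connection_headroom(server_status):
    """Return the fraction of connection capacity currently in use,
    computed from connections.current and connections.available."""
    conns = server_status.get("connections", {})
    current = conns.get("current", 0)
    available = conns.get("available", 0)
    capacity = current + available
    if capacity == 0:
        return 0.0
    return current / capacity
```

Alerting when this fraction approaches 1.0 gives early warning before new clients are refused.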
Database Profiling MongoDB’s “Profiler” is a database profiling system that can help identify inefficient queries 
and operations. 
The following profiling levels are available: 
Level Setting 
0 Off. No profiling 
1 On. Only includes “slow” operations 
2 On. Includes all operations 
Enable the profiler by setting the profile value using the following command in the mongo shell: 
db.setProfilingLevel(1) 
The slowOpThresholdMs setting defines what constitutes a “slow” operation. To set the threshold above 
which the profiler considers operations “slow” (and thus, included in the level 1 profiling data), you can configure 
slowOpThresholdMs at runtime as an argument to the db.setProfilingLevel() operation. 
See the documentation of db.setProfilingLevel() for more information about this command. 
By default, mongod records all “slow” queries to its log, as defined by slowOpThresholdMs. 
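Since mongod appends the duration of each slow operation to its log entry (ending in a figure such as "145ms"), a small filter can pull slow operations out of a log file. The line format assumed here is a simplification of real mongod log lines:

```python
import re

# Matches a trailing duration such as "145ms" at the end of a log line.
_DURATION = re.compile(r"(\d+)ms\s*$")

def slow_ops(log_lines, threshold_ms=100):
    """Return the log lines whose trailing duration exceeds threshold_ms,
    mirroring the slowOpThresholdMs cutoff described above."""
    slow = []
    for line in log_lines:
        m = _DURATION.search(line)
        if m and int(m.group(1)) > threshold_ms:
            slow.append(line)
    return slow
```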
Note: Because the database profiler can negatively impact performance, only enable profiling for strategic intervals 
and as minimally as possible on production systems. 
You may enable profiling on a per-mongod basis. This setting will not propagate across a replica set or sharded 
cluster. 
You can view the output of the profiler in the system.profile collection of your database by issuing the show 
profile command in the mongo shell, or with the following operation: 
db.system.profile.find( { millis : { $gt : 100 } } ) 
This returns all operations that lasted longer than 100 milliseconds. Ensure that the value specified here (100, in this 
example) is above the slowOpThresholdMs threshold. 
See also: 
Optimization Strategies for MongoDB (page 200) addresses strategies that may improve the performance of your 
database queries and operations. 
Replication and Monitoring 
Beyond the basic monitoring requirements for any MongoDB instance, for replica sets, administrators must monitor 
replication lag. “Replication lag” refers to the amount of time that it takes to copy (i.e. replicate) a write operation 
on the primary to a secondary. Some small delay period may be acceptable, but two significant problems emerge as 
replication lag grows: 
• First, operations that occurred during the period of lag are not replicated to one or more secondaries. If you’re 
using replication to ensure data persistence, exceptionally long delays may impact the integrity of your data set. 
• Second, if the replication lag exceeds the length of the operation log (oplog) then MongoDB will have to perform 
an initial sync on the secondary, copying all data from the primary and rebuilding all indexes. This is uncommon 
under normal circumstances, but if you configure the oplog to be smaller than the default, the issue can arise. 
Note: The size of the oplog is only configurable during the first run using the --oplogSize argument to the 
mongod command, or preferably, the oplogSizeMB setting in the MongoDB configuration file. If you do not 
specify this on the command line before running with the --replSet option, mongod will create a default 
sized oplog. 
By default, the oplog is 5 percent of total available disk space on 64-bit systems. For more information about 
changing the oplog size, see Change the Size of the Oplog (page 570). 
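The 5 percent rule above can be sketched as a calculation. The 990 MB floor and 50 GB cap used here are assumptions drawn from commonly documented MongoDB defaults; verify them against the release notes for your version:

```python
def default_oplog_size_mb(free_disk_mb, floor_mb=990, cap_mb=50 * 1024):
    """Approximate the default oplog size on a 64-bit system: 5 percent
    of free disk space, clamped to an assumed floor and cap."""
    size = free_disk_mb * 0.05
    return min(max(size, floor_mb), cap_mb)
```

For example, a host with 100 GB free would get roughly a 5 GB oplog, while a host with only 1 GB free would be held at the floor.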
For causes of replication lag, see Replication Lag (page 589). 
Replication issues are most often the result of network connectivity issues between members, or the result of a primary 
that does not have the resources to support application and replication traffic. To check the status of a replica, use the 
replSetGetStatus or the following helper in the shell: 
rs.status() 
The http://docs.mongodb.org/manual/reference/command/replSetGetStatus document provides 
a more in-depth overview of this output. In general, watch the value of optimeDate, and pay particular 
attention to the time difference between the primary and the secondary members. 
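The optimeDate comparison described above can be sketched as a per-secondary lag calculation. The member field names (name, stateStr, optimeDate) follow the replSetGetStatus output:

```python
from datetime import datetime

def replication_lag_seconds(members):
    """Compute per-secondary replication lag, in seconds, from
    rs.status()-like member documents, by comparing each secondary's
    optimeDate to the primary's."""
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    lag = {}
    for m in members:
        if m["stateStr"] == "SECONDARY":
            delta = primary["optimeDate"] - m["optimeDate"]
            lag[m["name"]] = delta.total_seconds()
    return lag
```

A monitoring script would alert when any value in the returned mapping exceeds an acceptable delay for the deployment.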
Sharding and Monitoring 
In most cases, the components of sharded clusters benefit from the same monitoring and analysis as all other MongoDB 
instances. In addition, clusters require further monitoring to ensure that data is effectively distributed among nodes 
and that sharding operations are functioning appropriately. 
See also: 
See the Sharding Concepts (page 613) documentation for more information. 
Config Servers The config database maintains a map identifying which documents are on which shards. The cluster 
updates this map as chunks move between shards. When a configuration server becomes inaccessible, certain sharding 
operations become unavailable, such as moving chunks and starting mongos instances. However, clusters remain 
accessible from already-running mongos instances. 
Because inaccessible configuration servers can seriously impact the availability of a sharded cluster, you should 
monitor your configuration servers to ensure that the cluster remains well balanced and that mongos instances can restart. 
MMS Monitoring58 monitors config servers and can create notifications if a config server becomes inaccessible. 
Balancing and Chunk Distribution The most effective sharded cluster deployments evenly balance chunks among 
the shards. To facilitate this, MongoDB has a background balancer process that distributes data to ensure that chunks 
are always optimally distributed among the shards. 
Issue the db.printShardingStatus() or sh.status() command to the mongos by way of the mongo 
shell. This returns an overview of the entire cluster including the database name, and a list of the chunks. 
Stale Locks In nearly every case, all locks used by the balancer are automatically released when they become stale. 
However, because any long lasting lock can block future balancing, it’s important to ensure that all locks are legitimate. 
To check the lock status of the database, connect to a mongos instance using the mongo shell. Issue the following 
command sequence to switch to the config database and display all outstanding locks on the shard database: 
use config 
db.locks.find() 
For active deployments, the above query can provide insights. The balancing process, which originates on a randomly 
selected mongos, takes a special “balancer” lock that prevents other balancing activity from transpiring. Use the 
following command, also against the config database, to check the status of the “balancer” lock. 
db.locks.find( { _id : "balancer" } ) 
If this lock exists, make sure that the balancer process is actively using this lock. 
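As a heuristic for the legitimacy check above, a lock document can be flagged when it has been held unusually long. The field names (_id, state, when) mirror documents in the config database's locks collection; treating any state greater than 0 as "held" and the 30-minute age limit are both illustrative assumptions:

```python
from datetime import datetime, timedelta

def balancer_lock_suspect(lock_doc, now, max_age=timedelta(minutes=30)):
    """Return True if a config.locks document looks like a long-held
    balancer lock worth investigating. Heuristic only: a lock held a
    long time may still be legitimate during a large migration."""
    if lock_doc.get("_id") != "balancer":
        return False
    if lock_doc.get("state", 0) == 0:
        return False  # lock not held; nothing to investigate
    return (now - lock_doc["when"]) > max_age
```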
Run-time Database Configuration 
The command line and configuration file interfaces provide MongoDB administrators with a large number 
of options and settings for controlling the operation of the database system. This document provides an overview 
of common configurations and examples of best-practice configurations for common use cases. 
While both interfaces provide access to the same collection of options and settings, this document primarily uses the 
configuration file interface. If you run MongoDB using a control script or installed from a package for your operating 
system, you likely already have a configuration file located at /etc/mongodb.conf. Confirm this by checking the 
contents of the /etc/init.d/mongod or /etc/rc.d/mongod script to ensure that the control scripts start the 
mongod with the appropriate configuration file (see below.) 
58http://mms.mongodb.com 
To start a MongoDB instance using this configuration file, issue a command in one of the following forms: 
mongod --config /etc/mongodb.conf 
mongod -f /etc/mongodb.conf 
Modify the values in the /etc/mongodb.conf file on your system to control the configuration of your database 
instance. 
Configure the Database 
Consider the following basic configuration: 
fork = true 
bind_ip = 127.0.0.1 
port = 27017 
quiet = true 
dbpath = /srv/mongodb 
logpath = /var/log/mongodb/mongod.log 
logappend = true 
journal = true 
For most standalone servers, this is a sufficient base configuration. It makes several assumptions, but consider the 
following explanation: 
• fork is true, which enables a daemon mode for mongod. This detaches (i.e. “forks”) the MongoDB process from 
the current session and allows you to run the database as a conventional server. 
• bindIp is 127.0.0.1, which forces the server to only listen for requests on the localhost IP. Only bind to 
secure interfaces that the application-level systems can access with access control provided by system network 
filtering (i.e. “firewall”). 
New in version 2.6: mongod installed from official .deb (page 12) and .rpm (page 6) packages have the 
bind_ip configuration set to 127.0.0.1 by default. 
• port is 27017, which is the default MongoDB port for database instances. MongoDB can bind to any port. 
You can also filter access based on port using network filtering tools. 
Note: UNIX-like systems require superuser privileges to attach processes to ports lower than 1024. 
• quiet is true. This disables all but the most critical entries in the output/log file. In normal operation this is 
preferable to avoid log noise. In diagnostic or testing situations, set this value to false. Use 
setParameter to modify this setting during run time. 
• dbPath is /srv/mongodb, which specifies where MongoDB will store its data files. /srv/mongodb and 
/var/lib/mongodb are popular locations. The user account that mongod runs under will need read and 
write access to this directory. 
• systemLog.path is /var/log/mongodb/mongod.log, which is where mongod will write its output. 
If you do not set this value, mongod writes all output to standard output (e.g. stdout). 
• logAppend is true, which ensures that mongod does not overwrite an existing log file following the server 
start operation. 
• storage.journal.enabled is true, which enables journaling. Journaling ensures single-instance write 
durability. 64-bit builds of mongod enable journaling by default, so this setting may be redundant. 
Given the default configuration, some of these values may be redundant. However, in many situations explicitly stating 
the configuration increases overall system intelligibility. 
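For illustration, the "key = value" configuration format shown above is simple enough to parse with a few lines of code. This sketch coerces only the true/false booleans mongod accepts and leaves other values as strings:

```python
def parse_mongodb_conf(text):
    """Parse the 'key = value' configuration file format into a dict,
    skipping blank lines and '#' comments."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if value.lower() in ("true", "false"):
            value = value.lower() == "true"
        options[key.strip()] = value
    return options
```

Such a parser is useful in deployment tooling, for example to verify that every instance on a host has journaling enabled before starting it.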
Security Considerations 
The following collection of configuration options are useful for limiting access to a mongod instance. Consider the 
following: 
bind_ip = 127.0.0.1,10.8.0.10,192.168.4.24 
auth = true 
Consider the following explanation for these configuration decisions: 
• “bindIp” has three values: 127.0.0.1, the localhost interface; 10.8.0.10, a private IP address typically 
used for local networks and VPN interfaces; and 192.168.4.24, a private network interface typically used 
for local networks. 
Because production MongoDB instances need to be accessible from multiple database servers, it is important 
to bind MongoDB to multiple interfaces that are accessible from your application servers. At the same time it’s 
important to limit these interfaces to interfaces controlled and protected at the network layer. 
• “net.unixDomainSocket.enabled” set to false disables the UNIX socket, which is otherwise enabled by default. 
This limits access on the local system. This is desirable when running MongoDB on systems with shared access, but in most 
situations has minimal impact. 
• “authorization” set to true enables the authentication system within MongoDB. If enabled, you will need to 
log in by connecting over the localhost interface for the first time to create user credentials. 
See also: 
Security Concepts (page 281) 
Replication and Sharding Configuration 
Replication Configuration Replica set configuration is straightforward, and only requires that the replSetName 
have a value that is consistent among all members of the set. Consider the following: 
replSet = set0 
Use descriptive names for sets. Once configured, use the mongo shell to add hosts to the replica set. 
See also: 
Replica set reconfiguration. 
To enable authentication for the replica set, add the following option: 
keyFile = /srv/mongodb/keyfile 
New in version 1.8: for replica sets, and 1.9.1 for sharded replica sets. 
Setting keyFile enables authentication and specifies a key file for replica set members to use when authenticating 
to each other. The content of the key file is arbitrary but must be the same on all members of the replica set and on the 
mongos instances that connect to the set. The key file must be less than one kilobyte in size, may only contain 
characters in the base64 set, and must not have group or “world” permissions on UNIX systems. 
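The content constraints above (under one kilobyte, base64-set characters only) can be checked before deploying a key file. This sketch validates content only; it ignores whitespace, an assumption made for readability of the file, and does not check UNIX permissions:

```python
import string

# The base64 alphabet: letters, digits, '+', '/', and '=' padding.
_BASE64_CHARS = set(string.ascii_letters + string.digits + "+/=")

def validate_keyfile_content(content):
    """Return an error message if the keyfile content violates the
    documented constraints, or None if it looks acceptable."""
    if len(content.encode("utf-8")) >= 1024:
        return "keyfile must be less than one kilobyte"
    stripped = "".join(content.split())
    if not stripped:
        return "keyfile is empty"
    if not set(stripped) <= _BASE64_CHARS:
        return "keyfile may only contain base64 characters"
    return None
```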
See also: 
The Replica set Reconfiguration section for information regarding the process for changing a replica set during operation. 
Additionally, consider the Replica Set Security section for information on configuring authentication with replica sets. 
Finally, see the Replication (page 503) document for more information on replication in MongoDB and replica set 
configuration in general. 
Sharding Configuration Sharding requires a number of mongod instances with different configurations. The config 
servers store the cluster’s metadata, while the cluster distributes data among one or more shard servers. 
Note: Config servers are not replica sets. 
Set up one or three “config server” instances as normal (page 183) mongod instances, and then add the following 
configuration options: 
configsvr = true 
bind_ip = 10.8.0.12 
port = 27001 
This creates a config server running on the private IP address 10.8.0.12 on port 27001. Make sure that there are 
no port conflicts, and that your config server is accessible from all of your mongos and mongod instances. 
To set up shards, configure two or more mongod instances using your base configuration (page 183), adding the 
shardsvr value for the clusterRole setting: 
shardsvr = true 
Finally, to establish the cluster, configure at least one mongos process with the following settings: 
configdb = 10.8.0.12:27001 
chunkSize = 64 
You can specify multiple configDB instances by specifying hostnames and ports in the form of a comma-separated 
list. In general, avoid modifying the chunkSize from the default value of 64,59 and ensure this setting is 
consistent among all mongos instances. 
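Deployment tooling can validate a configdb string before starting a mongos. This sketch splits the comma-separated list and enforces the one-or-three config server counts described above:

```python
def parse_configdb(value):
    """Split a configdb setting into host:port entries and enforce the
    one-or-three config server counts a sharded cluster expects."""
    hosts = [h.strip() for h in value.split(",") if h.strip()]
    if len(hosts) not in (1, 3):
        raise ValueError("configdb expects one or three config servers")
    return hosts
```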
See also: 
The Sharding (page 607) section of the manual for more information on sharding and cluster configuration. 
Run Multiple Database Instances on the Same System 
In many cases, running multiple instances of mongod on a single system is not recommended. On some types of 
deployments60 and for testing purposes, you may need to run more than one mongod on a single system. 
In these cases, use a base configuration (page 183) for each instance, but consider the following configuration values: 
dbpath = /srv/mongodb/db0/ 
pidfilepath = /srv/mongodb/db0.pid 
The dbPath value controls the location of the mongod instance’s data directory. Ensure that each database has a 
distinct and well-labeled data directory. The pidFilePath controls where the mongod process places its process ID 
file. Because this file identifies the specific mongod process, it is crucial that it be unique and well labeled to make it easy to start 
and stop these processes. 
Create additional control scripts and/or adjust your existing MongoDB configuration and control script as needed to 
control these processes. 
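A helper can generate the distinct per-instance values described above. The directory layout and port spacing used here are illustrative assumptions following the example configuration:

```python
def instance_config(name, base_dir="/srv/mongodb", base_port=27017, index=0):
    """Generate distinct dbpath, pidfilepath, and port values for one of
    several mongod instances sharing a host."""
    return {
        "dbpath": "{}/{}/".format(base_dir, name),
        "pidfilepath": "{}/{}.pid".format(base_dir, name),
        "port": base_port + index,  # each instance listens on its own port
    }
```

Calling this for "db0", "db1", ... with increasing index values yields the non-conflicting settings each control script needs.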
59 Chunk size is 64 megabytes by default, which provides the ideal balance between the most even distribution of data, for which smaller chunk 
sizes are best, and minimizing chunk migration, for which larger chunk sizes are optimal. 
60 Single-tenant systems with SSD or other high performance disks may provide acceptable performance levels for multiple mongod instances. 
Additionally, you may find that multiple databases with small working sets may function acceptably on a single system. 
Diagnostic Configurations 
The following configuration options control various mongod behaviors for diagnostic purposes. The settings below 
override default values that are tuned for general production use: 
slowms = 50 
profile = 2 
verbose = true 
objcheck = true 
Use the base configuration (page 183) and add these options as needed if you are experiencing an unknown issue or performance 
problem: 
• slowOpThresholdMs configures the threshold at which the logging system and the database profiler consider a 
query “slow.” The default value is 100 milliseconds. Set a lower value if the database 
profiler does not return useful results, or a higher value to only log the longest running queries. See Optimization 
Strategies for MongoDB (page 200) for more information on optimizing operations in MongoDB. 
• mode sets the database profiler level. The profiler is not active by default because of the possible impact of the 
profiler itself on performance. Unless this setting has a value, queries are not profiled. 
• verbosity controls the amount of logging output that mongod writes to the log. Only use this option if you 
are experiencing an issue that is not reflected in the normal logging level. 
• wireObjectCheck forces mongod to validate all requests from clients upon receipt. Use this option to 
ensure that invalid requests are not causing errors, particularly when running a database with untrusted clients. 
This option may affect database performance. 
Import and Export MongoDB Data 
This document provides an overview of the import and export programs included in the MongoDB distribution. These 
tools are useful when you want to back up or export a portion of your data without capturing the state of the entire 
database, or for simple data ingestion cases. For more complex data migration tasks, you may want to write your own 
import and export scripts using a client driver to interact with the database itself. For disaster recovery protection and 
routine database backup operations, use full database instance backups (page 172). 
Warning: Because these tools primarily operate by interacting with a running mongod instance, they can impact 
the performance of your running database. 
Not only do these processes create traffic for a running database instance, they also force the database to read all 
data through memory. When MongoDB reads infrequently used data, it can supplant more frequently accessed 
data, causing a deterioration in performance for the database’s regular workload. 
See also: 
MongoDB Backup Methods (page 172) or MMS Backup Manual61 for more information on backing up MongoDB 
instances. Additionally, consider the following references for the MongoDB import/export tools: 
• http://docs.mongodb.org/manual/reference/program/mongoimport 
• http://docs.mongodb.org/manual/reference/program/mongoexport 
• http://docs.mongodb.org/manual/reference/program/mongorestore 
• http://docs.mongodb.org/manual/reference/program/mongodump 
61https://mms.mongodb.com/help/backup 
186 Chapter 5. Administration
MongoDB Documentation, Release 2.6.4 
Data Import, Export, and Backup Operations 
For resilient and non-disruptive backups, use a file system or block-level disk snapshot function, such as the methods 
described in the MongoDB Backup Methods (page 172) document. The import and export tools discussed here are 
useful only for certain kinds of backups. 
In contrast, use import and export tools to back up a small subset of your data or to move data to or from a third-party 
system. These backups may capture a small crucial set of data or a frequently modified section of data for extra 
insurance, or for ease of access. 
Warning: mongoimport and mongoexport do not reliably preserve all rich BSON data 
types because JSON can only represent a subset of the types supported by BSON. As a result, 
data exported or imported with these tools may lose some measure of fidelity. See 
http://docs.mongodb.org/manual/reference/mongodb-extended-json for more information. 
No matter how you decide to import or export your data, consider the following guidelines: 
• Label files so that you can identify the contents of the export or backup as well as the point in time the 
export/backup reflects. 
• Do not create or apply exports if the backup process itself will have an adverse effect on a production system. 
• Make sure that exports and backups reflect a consistent data state. Export or backup processes can impact data 
integrity (i.e. type fidelity) and consistency if updates continue during the backup process. 
• Test backups and exports by restoring and importing to ensure that the backups are useful. 
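The labeling guideline above can be scripted. A minimal sketch, assuming the database and collection names (`sales`, `contacts`) are hypothetical:

```shell
# Build a self-describing export file name of the form
# <db>.<collection>.<UTC timestamp>.json so the backup's contents and
# point in time are identifiable later.
STAMP=$(date -u +%Y-%m-%dT%H%M%SZ)
OUT="sales.contacts.${STAMP}.json"
echo "${OUT}"
# then run, for example:
#   mongoexport --db sales --collection contacts --out "${OUT}"
```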
Human Intelligible Import/Export Formats 
This section describes a process to import/export a collection to a file in a JSON or CSV format. 
The examples in this section use the MongoDB tools http://docs.mongodb.org/manual/reference/program/mongoimport 
and http://docs.mongodb.org/manual/reference/program/mongoexport. These tools may also 
be useful for importing data into a MongoDB database from third party applications. 
If you want to simply copy a database or collection from one instance to another, consider using the copydb, 
clone, or cloneCollection commands, which may be more suited to this task. The mongo shell provides 
the db.copyDatabase() method. 
Collection Export with mongoexport 
With the mongoexport utility you can create a backup file. In its simplest invocation, the command takes the 
following form: 
mongoexport --collection collection --out collection.json 
This will export all documents in the collection named collection into the file collection.json. Without 
the output specification (i.e. “--out collection.json”), mongoexport writes output to standard output (i.e. 
“stdout”). You can further narrow the results by supplying a query filter using the “--query” option and limit results 
to a single database using the “--db” option. For instance: 
5.1. Administration Concepts 187
mongoexport --db sales --collection contacts --query '{"field": 1}' 
This command returns all documents in the sales database’s contacts collection, with a field named field with 
a value of 1. Enclose the query in single quotes (e.g. ') to ensure that it does not interact with your shell environment. 
mongoexport writes the resulting documents to standard output. 
By default, mongoexport returns one JSON document per MongoDB document. Specify the “--jsonArray” 
argument to return the export as a single JSON array. Use the “--csv” option to return the result in CSV (comma 
separated values) format. 
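Note that CSV output also requires an explicit field list via “--fields” (or “--fieldFile”). A hypothetical invocation, with database, collection, and field names as assumptions:

```shell
mongoexport --db sales --collection contacts --csv \
    --fields name,email,phone --out contacts.csv
```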
If your mongod instance is not running, you can use the “--dbpath” option to specify the location of your 
MongoDB instance’s database files. See the following example: 
mongoexport --db sales --collection contacts --dbpath /srv/MongoDB/ 
This reads the data files directly. This locks the data directory to prevent conflicting writes. The mongod process must 
not be running or attached to these data files when you run mongoexport in this configuration. 
The “--host” and “--port” options allow you to specify a non-local host to connect to capture the export. Consider 
the following example: 
mongoexport --host mongodb1.example.net --port 37017 --username user --password pass --collection contacts 
On any mongoexport command you may, as above, specify username and password credentials. 
Collection Import with mongoimport 
To restore a backup taken with mongoexport, use mongoimport. Most of the arguments to mongoexport also exist for 
mongoimport. Consider the following command: 
mongoimport --collection collection --file collection.json 
This imports the contents of the file collection.json into the collection named collection. If you do not 
specify a file with the “--file” option, mongoimport accepts input over standard input (e.g. “stdin.”) 
If you specify the “--upsert” option, all mongoimport operations will attempt to update existing documents 
in the database and insert other documents. This option will cause some performance impact depending on your 
configuration. 
You can specify the database option --db to import these documents to a particular database. If your MongoDB 
instance is not running, use the “--dbpath” option to specify the location of your MongoDB instance’s database 
files. Consider using the “--journal” option to ensure that mongoimport records its operations in the journal. 
The mongod process must not be running or attached to these data files when you run mongoimport in this 
configuration. 
Use the “--ignoreBlanks” option to ignore blank fields. For CSV and TSV imports, this option provides the 
desired functionality in most cases: it avoids inserting blank fields in MongoDB documents. 
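Putting the CSV-related options together, a hypothetical import of a CSV file whose first line holds the field names might look like this (names are assumptions):

```shell
# --type csv sets the input format; --headerline takes field names from
# the first line of the file; --ignoreBlanks skips empty fields.
mongoimport --db sales --collection contacts --type csv \
    --headerline --ignoreBlanks --file contacts.csv
```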
Production Notes 
This page details system configurations that affect MongoDB, especially in production. 
Note: MongoDB Management Service (MMS)62 is a hosted monitoring service which collects and aggregates diagnostic 
data to provide insight into the performance and operation of MongoDB deployments. See the MMS Website63 
and the MMS documentation64 for more information. 
Packages 
MongoDB Be sure you have the latest stable release. All releases are available on the Downloads65 page. This is a 
good place to verify what is current, even if you then choose to install via a package manager. 
Always use 64-bit builds for production. The 32-bit build MongoDB offers for test and development environments 
is not suitable for production deployments as it can store no more than 2GB of data. See the 32-bit limitations page 
(page 690) for more information. 
32-bit builds exist to support use on development machines. 
Operating Systems MongoDB distributions are currently available for Mac OS X, Linux, Windows Server 2008 R2 
64-bit, Windows 7 (32 bit and 64 bit), Windows Vista, and Solaris platforms. 
Note: MongoDB uses the GNU C Library66 (glibc) if available on a system. MongoDB requires at least version 
glibc-2.12-1.2.el6 to avoid a known bug with earlier versions. For best results use at least version 2.13. 
Concurrency 
In earlier versions of MongoDB, all write operations contended for a single readers-writer lock on the MongoDB 
instance. As of version 2.2, each database has a readers-writer lock that allows concurrent read access to a database, 
but gives exclusive access to a single write operation per database. See the Concurrency (page 702) page for more 
information. 
Journaling 
MongoDB uses write-ahead logging to an on-disk journal to guarantee that MongoDB is able to quickly recover the 
write operations (page 67) following a crash or other serious failure. 
In order to ensure that mongod will be able to recover its data files and keep the data files in a valid state following a 
crash, leave journaling enabled. See Journaling (page 275) for more information. 
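As a sketch, journaling can be made explicit in a MongoDB 2.6 YAML configuration file; the dbPath and the file location below are assumptions, not values from this manual:

```shell
# Write a minimal 2.6-style YAML config fragment that enables journaling
# (journaling is already the default on 64-bit builds).
cat <<'EOF' > /tmp/mongod.conf.example
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true
EOF
cat /tmp/mongod.conf.example
# then: mongod --config /tmp/mongod.conf.example
```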
Networking 
Use Trusted Networking Environments Always run MongoDB in a trusted environment, with network rules that 
prevent access from all unknown machines, systems, and networks. As with any sensitive system dependent on 
network access, your MongoDB deployment should only be accessible to specific systems that require access, such as 
application servers, monitoring services, and other MongoDB components. 
Note: By default, authorization is not enabled and mongod assumes a trusted environment. You can enable 
security/auth (page 281) mode if you need it. 
62http://mms.mongodb.com 
63http://mms.mongodb.com/ 
64http://mms.mongodb.com/help/ 
65http://www.mongodb.org/downloads 
66http://www.gnu.org/software/libc/ 
See documents in the Security Section (page 279) for additional information, specifically: 
• Configuration Options (page 288) 
• Firewalls (page 289) 
• Network Security Tutorials (page 297) 
For Windows users, consider the Windows Server Technet Article on TCP Configuration67 when deploying MongoDB 
on Windows. 
Connection Pools To avoid overloading the connection resources of a single mongod or mongos instance, ensure 
that clients maintain reasonable connection pool sizes. 
The connPoolStats database command returns information regarding the number of open connections to the 
current database for mongos instances and mongod instances in sharded clusters. 
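For example, you can run connPoolStats from the mongo shell; the host name and port here are assumptions:

```shell
mongo --host mongos.example.net --port 27017 --eval '
    printjson( db.runCommand( { connPoolStats: 1 } ) )
'
```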
Hardware Considerations 
MongoDB is designed specifically with commodity hardware in mind and has few hardware requirements or 
limitations. MongoDB’s core components run on little-endian hardware, primarily x86/x86_64 processors. Client 
libraries (i.e. drivers) can run on big- or little-endian systems. 
Hardware Requirements and Limitations The hardware for the most effective MongoDB deployments has the 
following properties: 
Allocate Sufficient RAM and CPU As with all software, more RAM and a faster CPU clock speed are important 
for performance. 
In general, databases are not CPU bound. As such, increasing the number of cores can help, but does not provide 
significant marginal return. 
Use Solid State Disks (SSDs) MongoDB has good results and a good price-performance ratio with SATA SSD 
(Solid State Disk). 
Use SSD if available and economical. Spinning disks can be performant, but SSDs’ capacity for random I/O operations 
works well with the update model of mongod. 
Commodity (SATA) spinning drives are often a good option, as the random I/O performance increase with more 
expensive spinning drives is not that dramatic (only on the order of 2x). Using SSDs or increasing RAM may be more 
effective in increasing I/O throughput. 
Avoid Remote File Systems 
• Remote file storage can create performance problems in MongoDB. See Remote Filesystems (page 191) for 
more information about storage and MongoDB. 
67http://technet.microsoft.com/en-us/library/dd349797.aspx 
MongoDB and NUMA Hardware 
Important: The discussion of NUMA in this section only applies to Linux systems with multiple physical processors, 
and therefore does not affect deployments where mongod instances run on other UNIX-like systems, on Windows, or 
on a Linux system with only one physical processor. 
Running MongoDB on a system with Non-Uniform Memory Access (NUMA) can cause a number of operational 
problems, including slow performance for periods of time or high system process usage. 
When running MongoDB on NUMA hardware, you should disable NUMA for MongoDB and instead set an interleave 
memory policy. 
Note: MongoDB version 2.0 and greater checks these settings on start up when deployed on a Linux-based system, 
and prints a warning if the system is NUMA-based. 
To disable NUMA for MongoDB and set an interleave memory policy, use the numactl command and start mongod 
in the following manner: 
numactl --interleave=all /usr/local/bin/mongod 
Then, disable zone reclaim in the proc settings using the following command: 
echo 0 > /proc/sys/vm/zone_reclaim_mode 
To fully disable NUMA, you must perform both operations. For more information, see the Documentation for 
/proc/sys/vm/*68. 
See The MySQL “swap insanity” problem and the effects of NUMA69 post, which describes the effects of NUMA on 
databases. This blog post addresses the impact of NUMA for MySQL, but the issues for MongoDB are similar. The 
post introduces NUMA and its goals, and illustrates how these goals are not compatible with production databases. 
Disk and Storage Systems 
Swap Assign swap space for your systems. Allocating swap space can avoid issues with memory contention and 
can prevent the OOM Killer on Linux systems from killing mongod. 
The method mongod uses to map data files to memory ensures that the operating system will never store MongoDB 
data in swap space. 
RAID Most MongoDB deployments should use disks backed by RAID-10. 
RAID-5 and RAID-6 do not typically provide sufficient performance to support a MongoDB deployment. 
Avoid RAID-0 with MongoDB deployments. While RAID-0 provides good write performance, it also provides limited 
availability and can lead to reduced performance on read operations, particularly when using Amazon’s EBS volumes. 
Remote Filesystems The Network File System protocol (NFS) is not recommended for use with MongoDB as some 
versions perform poorly. 
Performance problems arise when both the data files and the journal files are hosted on NFS. You may experience 
better performance if you place the journal on local or iSCSI volumes. If you must use NFS, add the following NFS 
options to your /etc/fstab file: bg, nolock, and noatime. 
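For illustration, an /etc/fstab entry with the recommended options might look like the following; the server name and mount point are hypothetical, and the sketch writes the line to a scratch file rather than touching /etc/fstab:

```shell
# Hypothetical NFS mount for MongoDB data with bg, nolock, and noatime.
cat <<'EOF' > /tmp/fstab.example
nfs.example.net:/srv/mongodb  /data/db  nfs  bg,nolock,noatime  0  0
EOF
cat /tmp/fstab.example
```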
68http://www.kernel.org/doc/Documentation/sysctl/vm.txt 
69http://jcole.us/blog/archives/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/ 
Separate Components onto Different Storage Devices For improved performance, consider separating your 
database’s data, journal, and logs onto different storage devices, based on your application’s access and write pattern. 
Note: This will affect your ability to create snapshot-style backups of your data, since the files will be on different 
devices and volumes. 
Scheduling for Virtual Devices Local block devices attached to virtual machine instances via the hypervisor should 
use a noop scheduler for best performance. The noop scheduler allows the operating system to defer I/O scheduling to 
the underlying hypervisor. 
Architecture 
Write Concern Write concern describes the guarantee that MongoDB provides when reporting on the success of 
a write operation. The strength of the write concern determines the level of guarantee. When inserts, updates and 
deletes have a weak write concern, write operations return quickly. In some failure cases, write operations issued with 
weak write concerns may not persist. With stronger write concerns, clients wait after sending a write operation for 
MongoDB to confirm the write operations. 
MongoDB provides different levels of write concern to better address the specific needs of applications. Clients 
may adjust write concern to ensure that the most important operations persist successfully to an entire MongoDB 
deployment. For other less critical operations, clients can adjust the write concern to ensure faster performance rather 
than ensure persistence to the entire deployment. 
See the Write Concern (page 72) document for more information about choosing an appropriate write concern level 
for your deployment. 
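As a sketch of adjusting write concern per operation in the mongo shell (2.6 syntax; the database, collection, and document are hypothetical):

```shell
mongo sales --eval '
    // critical write: wait for acknowledgment from a majority of replica
    // set members, with a 5 second timeout
    db.orders.insert(
        { _id: 1, status: "paid" },
        { writeConcern: { w: "majority", wtimeout: 5000 } }
    )
'
```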
Replica Sets See the Replica Set Architectures (page 516) document for an overview of architectural considerations 
for replica set deployments. 
Sharded Clusters See the Sharded Cluster Production Architecture (page 618) document for an overview of 
recommended sharded cluster architectures for production deployments. 
Platforms 
MongoDB on Linux 
Important: The following discussion only applies to Linux, and therefore does not affect deployments where 
mongod instances run on other UNIX-like systems or on Windows. 
Kernel and File Systems When running MongoDB in production on Linux, it is recommended that you use Linux 
kernel version 2.6.36 or later. 
MongoDB preallocates its database files before using them and often creates large files. As such, you should use the 
Ext4 or XFS file systems: 
• In general, if you use the Ext4 file system, use at least version 2.6.23 of the Linux Kernel. 
• In general, if you use the XFS file system, use at least version 2.6.25 of the Linux Kernel. 
• Some Linux distributions require different versions of the kernel to support using ext4 and/or xfs: 
Linux Distribution Filesystem Kernel Version 
CentOS 5.5 ext4, xfs 2.6.18-194.el5 
CentOS 5.6 ext4, xfs 2.6.18-238.el5 
CentOS 5.8 ext4, xfs 2.6.18-308.8.2.el5 
CentOS 6.1 ext4, xfs 2.6.32-131.0.15.el6.x86_64 
RHEL 5.6 ext4 2.6.18-238 
RHEL 6.0 xfs 2.6.32-71 
Ubuntu 10.04.4 LTS ext4, xfs 2.6.32-38-server 
Amazon Linux AMI release 2012.03 ext4 3.2.12-3.2.4.amzn1.x86_64 
Important: MongoDB requires a filesystem that supports fsync() on directories. For example, HGFS and 
VirtualBox’s shared folders do not support this operation. 
Recommended Configuration 
• Turn off atime for the storage volume containing the database files. 
• Set the file descriptor limit, -n, and the user process limit (ulimit), -u, above 20,000, according to the 
suggestions in the ulimit (page 266) document. A low ulimit will affect MongoDB when under heavy use and can 
produce errors and lead to failed connections to MongoDB processes and loss of service. 
• Disable transparent huge pages as MongoDB performs better with normal (4096 bytes) virtual memory 
pages. 
• Disable NUMA in your BIOS. If that is not possible, see MongoDB on NUMA Hardware (page 191). 
• Ensure that readahead settings for the block devices that store the database files are appropriate. For random 
access use patterns, set low readahead values. A readahead of 32 (16kb) often works well. 
For a standard block device, you can run sudo blockdev --report to get the readahead settings and 
sudo blockdev --setra <value> <device> to change the readahead settings. Refer to your specific 
operating system manual for more information. 
• Use the Network Time Protocol (NTP) to synchronize time among your hosts. This is especially important in 
sharded clusters. 
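The settings in the list above can be spot-checked from a shell. A minimal sketch for Linux hosts; the thresholds come from the list above, and the sysfs path and blockdev availability are assumptions about the host:

```shell
echo "open files limit: $(ulimit -n)"   # guidance above: above 20,000
echo "user processes:   $(ulimit -u)"   # guidance above: above 20,000
# Transparent huge pages state (look for [never] once disabled):
cat /sys/kernel/mm/transparent_hugepage/enabled 2>/dev/null || true
# Readahead per block device (RA column, 512-byte sectors; 32 = 16 KB):
blockdev --report 2>/dev/null || true
```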
MongoDB on Virtual Environments This section describes considerations when running MongoDB in some of the 
more common virtual environments. 
For all platforms, consider Scheduling for Virtual Devices (page 192). 
EC2 MongoDB is compatible with EC2 and requires no configuration changes specific to the environment. 
You may alternately choose to obtain a set of Amazon Machine Images (AMI) that bundle together MongoDB and 
Amazon’s Provisioned IOPS storage volumes. Provisioned IOPS can greatly increase MongoDB’s performance and 
ease of use. For more information, see this blog post70. 
VMWare MongoDB is compatible with VMWare. As some users have run into issues with VMWare’s memory 
overcommit feature, disabling the feature is recommended. 
It is possible to clone a virtual machine running MongoDB. You might use this function to spin up a new virtual host 
to add as a member of a replica set. If you clone a VM with journaling enabled, the clone snapshot will be valid. If 
not using journaling, first stop mongod, then clone the VM, and finally, restart mongod. 
70http://www.mongodb.com/blog/post/provisioned-iops-aws-marketplace-significantly-boosts-mongodb-performance-ease-use 
OpenVZ Some users have had issues when running MongoDB on some older versions of OpenVZ due to its handling 
of virtual memory, as with VMWare. 
This issue seems to have been resolved in the more recent versions of OpenVZ. 
Performance Monitoring 
iostat On Linux, use the iostat command to check if disk I/O is a bottleneck for your database. Specify a number 
of seconds when running iostat to avoid displaying stats covering the time since server boot. 
For example, the following command will display extended statistics and the time for each displayed report, with 
traffic in MB/s, at one second intervals: 
iostat -xmt 1 
Key fields from iostat: 
• %util: this is the most useful field for a quick check; it indicates what percent of the time the device/drive is 
in use. 
• avgrq-sz: average request size. Smaller numbers for this value reflect more random IO operations. 
bwm-ng bwm-ng71 is a command-line tool for monitoring network use. If you suspect a network-based bottleneck, 
you may use bwm-ng to begin your diagnostic process. 
Backups 
To make backups of your MongoDB database, please refer to MongoDB Backup Methods Overview (page 172). 
5.1.2 Data Management 
These documents introduce data management practices and strategies for MongoDB deployments, including strategies 
for managing multi-data center deployments, managing larger file stores, and data lifecycle tools. 
Data Center Awareness (page 194) Presents the MongoDB features that allow application developers and database 
administrators to configure their deployments to be more data center aware or allow operational and location-based 
separation. 
Capped Collections (page 196) Capped collections provide a special type of size-constrained collection that preserves 
insertion order and can support high-volume inserts. 
Expire Data from Collections by Setting TTL (page 198) TTL collections make it possible to automatically remove 
data from a collection based on the value of a timestamp and are useful for managing data like machine generated 
event data that are only useful for a limited period of time. 
Data Center Awareness 
MongoDB provides a number of features that allow application developers and database administrators to customize 
the behavior of a sharded cluster or replica set deployment so that MongoDB may be more “data center aware,” or 
allow operational and location-based separation. 
71http://www.gropp.org/?id=projects&sub=bwm-ng 
MongoDB also supports segregation based on functional parameters, to ensure that certain mongod instances are 
only used for reporting workloads or that certain high-frequency portions of a sharded collection only exist on specific 
shards. 
The following documents, found either in this section or other sections of this manual, provide information on 
customizing a deployment for operation- and location-based separation: 
Operational Segregation in MongoDB Deployments (page 195) MongoDB lets you specify that certain application 
operations use certain mongod instances. 
Tag Aware Sharding (page 671) Tags associate specific ranges of shard key values with specific shards for use in 
managing deployment patterns. 
Manage Shard Tags (page 672) Use tags to associate specific ranges of shard key values with specific shards. 
Operational Segregation in MongoDB Deployments 
Operational Overview MongoDB includes a number of features that allow database administrators and developers 
to segregate application operations to MongoDB deployments by functional or geographical groupings. 
This capability provides “data center awareness,” which allows applications to target MongoDB deployments with 
consideration of the physical location of the mongod instances. MongoDB supports segmentation of operations 
across different dimensions, which may include multiple data centers and geographical regions in multi-data center 
deployments, racks, networks, or power circuits in single data center deployments. 
MongoDB also supports segregation of database operations based on functional or operational parameters, to ensure 
that certain mongod instances are only used for reporting workloads or that certain high-frequency portions of a 
sharded collection only exist on specific shards. 
Specifically, with MongoDB, you can: 
• ensure write operations propagate to specific members of a replica set, or to specific members of replica sets. 
• ensure that specific members of a replica set respond to queries. 
• ensure that specific ranges of your shard key balance onto and reside on specific shards. 
• combine the above features in a single distributed deployment, on a per-operation (for read and write operations) 
and per-collection (for chunk distribution in sharded clusters) basis. 
For full documentation of these features, see the following documentation in the MongoDB Manual: 
• Read Preferences (page 530), which controls how drivers help applications target read operations to members 
of a replica set. 
• Write Concerns (page 72), which controls how MongoDB ensures that write operations propagate to members 
of a replica set. 
• Replica Set Tags (page 576), which control how applications create and interact with custom groupings of replica 
set members to create custom application-specific read preferences and write concerns. 
• Tag Aware Sharding (page 671), which allows MongoDB administrators to define an application-specific 
balancing policy, to control how documents belonging to specific ranges of a shard key distribute to shards in the 
sharded cluster. 
See also: 
Before adding operational segregation features to your application and MongoDB deployment, become familiar with 
all documentation of replication (page 503), and sharding (page 607). 
Further Reading 
• The Write Concern (page 72) and Read Preference (page 530) documents, which address capabilities related to 
data center awareness. 
• Deploy a Geographically Redundant Replica Set (page 550). 
Capped Collections 
Capped collections are fixed-size collections that support high-throughput operations that insert, retrieve, and delete 
documents based on insertion order. Capped collections work in a way similar to circular buffers: once a collection 
fills its allocated space, it makes room for new documents by overwriting the oldest documents in the collection. 
See createCollection() or create for more information on creating capped collections. 
Capped collections have the following behaviors: 
• Capped collections guarantee preservation of the insertion order. As a result, queries do not need an index to 
return documents in insertion order. Without this indexing overhead, they can support higher insertion throughput. 
• Capped collections guarantee that insertion order is identical to the order on disk (natural order) and do so 
by prohibiting updates that increase document size. Capped collections only allow updates that fit the original 
document size, which ensures a document does not change its location on disk. 
• Capped collections automatically remove the oldest documents in the collection without requiring scripts or 
explicit remove operations. 
For example, the oplog.rs collection that stores a log of the operations in a replica set uses a capped collection. 
Consider the following potential use cases for capped collections: 
• Store log information generated by high-volume systems. Inserting documents in a capped collection without 
an index is close to the speed of writing log information directly to a file system. Furthermore, the built-in 
first-in-first-out property maintains the order of events, while managing storage use. 
• Cache small amounts of data in a capped collection. Since caches are read rather than write heavy, you would 
either need to ensure that this collection always remains in the working set (i.e. in RAM) or accept some write 
penalty for the required index or indexes. 
Recommendations and Restrictions 
• You can only make in-place updates of documents. If the update operation causes the document to grow beyond 
its original size, the update operation will fail. 
If you plan to update documents in a capped collection, create an index so that these update operations do not 
require a table scan. 
• If you update a document in a capped collection to a size smaller than its original size, and then a secondary 
resyncs from the primary, the secondary will replicate and allocate space based on the current smaller document 
size. If the primary then receives an update which increases the document back to its original size, the primary 
will accept the update but the secondary will fail with a failing update: objects in a capped 
ns cannot grow error message. 
To prevent this error, create your secondary from a snapshot of one of the other up-to-date members of the 
replica set. Follow our tutorial on filesystem snapshots (page 229) to seed your new secondary. 
Seeding the secondary with a filesystem snapshot is the only way to guarantee the primary and secondary binary 
files are compatible. MMS Backup snapshots are insufficient in this situation since you need more than the 
content of the secondary to match the primary. 
• You cannot delete documents from a capped collection. To remove all records from a capped collection, use the 
emptycapped command. To remove the collection entirely, use the drop() method. 
• You cannot shard a capped collection. 
• Capped collections created after 2.2 have an _id field and an index on the _id field by default. Capped 
collections created before 2.2 do not have an index on the _id field by default. If you are using capped 
collections with replication prior to 2.2, you should explicitly create an index on the _id field. 
Warning: If you have a capped collection in a replica set outside of the local database, before 2.2, 
you should create a unique index on _id. Ensure uniqueness using the unique: true option to 
the ensureIndex() method or by using an ObjectId for the _id field. Alternately, you can use the 
autoIndexId option to the create command when creating the capped collection, as in the Query a Capped 
Collection (page 197) procedure. 
• Use natural ordering to retrieve the most recently inserted elements from the collection efficiently. This is 
(somewhat) analogous to tail on a log file. 
• The aggregation pipeline operator $out cannot write results to a capped collection. 
Procedures 
Create a Capped Collection You must create capped collections explicitly using the createCollection() 
method, which is a helper in the mongo shell for the create command. When creating a capped collection you must 
specify the maximum size of the collection in bytes, which MongoDB will pre-allocate for the collection. The size of 
the capped collection includes a small amount of space for internal overhead. 
db.createCollection( "log", { capped: true, size: 100000 } ) 
You may also specify a maximum number of documents for the collection using the max field, as in the 
following document: 
db.createCollection("log", { capped : true, size : 5242880, max : 5000 } ) 
Important: The size argument is always required, even when you specify the max number of documents. MongoDB 
will remove older documents if a collection reaches the maximum size limit before it reaches the maximum document 
count. 
See 
createCollection() and create. 
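The interaction between the size and max limits can be sketched in plain JavaScript. This is a hypothetical illustration of the eviction rule, not MongoDB's implementation: CappedLog and sizeOf are invented names, and JSON string length stands in for the BSON document size.

```javascript
// Stand-in for BSON size; real capped collections measure BSON bytes.
function sizeOf(doc) {
  return JSON.stringify(doc).length;
}

class CappedLog {
  constructor(sizeBytes, maxDocs) {
    this.sizeBytes = sizeBytes; // the required "size" limit
    this.maxDocs = maxDocs;     // the optional "max" limit
    this.docs = [];
    this.bytes = 0;
  }
  insert(doc) {
    this.docs.push(doc);
    this.bytes += sizeOf(doc);
    // Evict the oldest documents first whenever either limit is exceeded;
    // the size cap binds even when max is also set.
    while (this.bytes > this.sizeBytes || this.docs.length > this.maxDocs) {
      this.bytes -= sizeOf(this.docs.shift());
    }
  }
}

const log = new CappedLog(100, 3);
for (let i = 0; i < 5; i++) log.insert({ n: i });
console.log(log.docs.map(d => d.n)); // → [ 2, 3, 4 ]: count limit binds, oldest evicted
```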
Query a Capped Collection If you perform a find() on a capped collection with no ordering specified, MongoDB 
guarantees that the ordering of results is the same as the insertion order. 
To retrieve documents in reverse insertion order, issue find() along with the sort() method with the $natural 
parameter set to -1, as shown in the following example: 
db.cappedCollection.find().sort( { $natural: -1 } ) 
Check if a Collection is Capped Use the isCapped() method to determine if a collection is capped, as follows: 
db.collection.isCapped() 
Convert a Collection to Capped You can convert a non-capped collection to a capped collection with the 
convertToCapped command: 
db.runCommand({"convertToCapped": "mycoll", size: 100000}); 
The size parameter specifies the size of the capped collection in bytes. 
Warning: This command obtains a global write lock and will block other operations until it has completed. 
Changed in version 2.2: Before 2.2, capped collections did not have an index on _id unless you specified 
autoIndexId to create. After 2.2, this index became the default. 
Automatically Remove Data After a Specified Period of Time For additional flexibility when expiring data, consider 
MongoDB’s TTL indexes, as described in Expire Data from Collections by Setting TTL (page 198). These indexes 
allow you to expire and remove data from normal collections using a special index type, based on the value of a date-typed 
field and a TTL value for the index. 
TTL Collections (page 198) are not compatible with capped collections. 
Tailable Cursor You can use a tailable cursor with capped collections. Similar to the Unix tail -f command, 
the tailable cursor “tails” the end of a capped collection. As new documents are inserted into the capped collection, 
you can use the tailable cursor to continue retrieving documents. 
See Create Tailable Cursor (page 109) for information on creating a tailable cursor. 
Expire Data from Collections by Setting TTL 
New in version 2.2. 
This document provides an introduction to MongoDB’s “time to live” or “TTL” collection feature. TTL collections 
make it possible to store data in MongoDB and have the mongod automatically remove data after a specified number 
of seconds or at a specific clock time. 
Data expiration is useful for some classes of information, including machine generated event data, logs, and session 
information that only need to persist for a limited period of time. 
A special index type supports the implementation of TTL collections. TTL relies on a background thread in mongod 
that reads the date-typed values in the index and removes expired documents from the collection. 
Considerations 
• The _id field does not support TTL indexes. 
• You cannot create a TTL index on a field that already has an index. 
• A document will not expire if the indexed field does not exist. 
• A document will not expire if the indexed field is not a date BSON type or an array of date BSON types. 
• The TTL index may not be compound (may not have multiple fields). 
• If the TTL field holds an array, and there are multiple date-typed values in the index, the document will expire 
when the lowest (i.e. earliest) date matches the expiration threshold. 
• You cannot create a TTL index on a capped collection, because MongoDB cannot remove documents from a 
capped collection. 
• You cannot use ensureIndex() to change the value of expireAfterSeconds. Instead use the 
collMod database command in conjunction with the index collection flag. 
• When you build a TTL index in the background (page 460), the TTL thread can begin deleting documents 
while the index is building. If you build a TTL index in the foreground, MongoDB begins removing expired 
documents as soon as the index finishes building. 
When the TTL thread is active, you will see delete (page 67) operations in the output of db.currentOp() or in the 
data collected by the database profiler (page 210). 
When using TTL indexes on replica sets, the TTL background thread only deletes documents on primary members. 
However, the TTL background thread does run on secondaries. Secondary members replicate deletion operations from 
the primary. 
The TTL index does not guarantee that expired data will be deleted immediately. There may be a delay between the 
time a document expires and the time that MongoDB removes the document from the database. 
The background task that removes expired documents runs every 60 seconds. As a result, documents may remain in a 
collection after they expire but before the background task runs or completes. 
The duration of the removal operation depends on the workload of your mongod instance. Therefore, expired data 
may exist for some time beyond the 60 second period between runs of the background task. 
All collections with an index using the expireAfterSeconds option have usePowerOf2Sizes enabled. Users 
cannot modify this setting. As a result of enabling usePowerOf2Sizes, MongoDB must allocate more disk space 
relative to data size. This approach helps mitigate the possibility of storage fragmentation caused by frequent delete 
operations and leads to more predictable storage use patterns. 
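The expiration rule described above can be sketched in plain JavaScript. This is an illustrative simplification, not mongod's implementation; isExpired is an invented helper name.

```javascript
// A document expires once the earliest date in the indexed field is more
// than expireAfterSeconds old; missing or non-date values never expire.
function isExpired(fieldValue, expireAfterSeconds, now) {
  const dates = (Array.isArray(fieldValue) ? fieldValue : [fieldValue])
    .filter(v => v instanceof Date);
  if (dates.length === 0) return false; // missing or non-date field: never expires
  const earliest = Math.min(...dates.map(d => d.getTime()));
  return now.getTime() - earliest >= expireAfterSeconds * 1000;
}

const now = new Date("2014-09-16T12:00:00Z");
console.log(isExpired(new Date("2014-09-16T10:00:00Z"), 3600, now)); // → true (2 hours old)
console.log(isExpired("not a date", 3600, now));                     // → false
```

Note that even in mongod, a document matching this predicate may persist until the next run of the background task, as described above.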
Procedures 
To enable TTL for a collection, use the ensureIndex() method to create a TTL index, as shown in the examples 
below. 
With the exception of the background thread, a TTL index supports queries in the same way normal indexes do. You 
can use TTL indexes to expire documents in one of two ways, either: 
• remove documents a certain number of seconds after creation. The index will support queries for the creation 
time of the documents. Alternately, 
• specify an explicit expiration time. The index will support queries for the expiration-time of the document. 
Expire Documents after a Certain Number of Seconds To expire data after a certain number of seconds, create 
a TTL index on a field that holds values of BSON date type or an array of BSON date-typed objects and specify a 
positive non-zero value in the expireAfterSeconds field. A document will expire when the number of seconds 
in the expireAfterSeconds field has passed since the time specified in its indexed field. 72 
For example, the following operation creates an index on the log_events collection’s createdAt field and specifies 
the expireAfterSeconds value of 3600 to set the expiration time to be one hour after the time specified by 
createdAt. 
db.log_events.ensureIndex( { "createdAt": 1 }, { expireAfterSeconds: 3600 } ) 
When adding documents to the log_events collection, set the createdAt field to the current time: 
72 If the field contains an array of BSON date-typed objects, data expires if at least one of the BSON date-typed objects is older than the number of 
seconds specified in expireAfterSeconds. 
db.log_events.insert( { 
"createdAt": new Date(), 
"logEvent": 2, 
"logMessage": "Success!" 
} ) 
MongoDB will automatically delete documents from the log_events collection when the document’s createdAt 
value is older than the number of seconds specified in expireAfterSeconds. 
See also: 
$currentDate operator 
Expire Documents at a Certain Clock Time To expire documents at a certain clock time, begin by creating a 
TTL index on a field that holds values of BSON date type or an array of BSON date-typed objects and specify an 
expireAfterSeconds value of 0. For each document in the collection, set the indexed date field to a value 
corresponding to the time the document should expire. If the indexed date field contains a date in the past, MongoDB 
considers the document expired. 
For example, the following operation creates an index on the log_events collection’s expireAt field and specifies 
the expireAfterSeconds value of 0: 
db.log_events.ensureIndex( { "expireAt": 1 }, { expireAfterSeconds: 0 } ) 
For each document, set the value of expireAt to correspond to the time the document should expire. For instance, 
the following insert() operation adds a document that should expire at July 22, 2013 14:00:00. 
db.log_events.insert( { 
"expireAt": new Date('July 22, 2013 14:00:00'), 
"logEvent": 2, 
"logMessage": "Success!" 
} ) 
MongoDB will automatically delete documents from the log_events collection when the documents’ expireAt 
value is older than the number of seconds specified in expireAfterSeconds, i.e. 0 seconds older in this case. As 
such, the data expires at the specified expireAt value. 
5.1.3 Optimization Strategies for MongoDB 
There are many factors that can affect database performance and responsiveness including index use, query structure, 
data models and application design, as well as operational factors such as architecture and system configuration. 
This section describes techniques for optimizing application performance with MongoDB. 
Evaluate Performance of Current Operations (page 201) MongoDB provides introspection tools that describe the 
query execution process, to allow users to test queries and build more efficient queries. 
Use Capped Collections for Fast Writes and Reads (page 201) Outlines a use case for Capped Collections 
(page 196) to optimize certain data ingestion work flows. 
Optimize Query Performance (page 202) Introduces the use of projections (page 57) to reduce the amount of data 
MongoDB must send to clients. 
Design Notes (page 203) A collection of notes related to the architecture, design, and administration of MongoDB-based 
applications. 
Evaluate Performance of Current Operations 
The following sections describe techniques for evaluating operational performance. 
Use the Database Profiler to Evaluate Operations Against the Database 
MongoDB provides a database profiler that shows performance characteristics of each operation against the database. 
Use the profiler to locate any queries or write operations that are running slow. You can use this information, for 
example, to determine what indexes to create. 
For more information, see Database Profiling (page 180). 
Use db.currentOp() to Evaluate mongod Operations 
The db.currentOp() method reports on current operations running on a mongod instance. 
Use $explain to Evaluate Query Performance 
The explain() method returns statistics on a query, and reports the index MongoDB selected to fulfill the query, as 
well as information about the internal operation of the query. 
Example 
To use explain() on a query for documents matching the expression { a: 1 }, in the collection named 
records, use an operation that resembles the following in the mongo shell: 
db.records.find( { a: 1 } ).explain() 
Use Capped Collections for Fast Writes and Reads 
Use Capped Collections for Fast Writes 
Capped Collections (page 196) are circular, fixed-size collections that keep documents well-ordered, even without the 
use of an index. This means that capped collections can receive very high-speed writes and sequential reads. 
These collections are particularly useful for keeping log files but are not limited to that purpose. Use capped collections 
where appropriate. 
Use Natural Order for Fast Reads 
To return documents in the order they exist on disk, sort operations using the $natural operator. On a 
capped collection, this also returns the documents in the order in which they were written. 
Natural order does not use indexes but can be fast for operations when you want to select the first or last items on disk. 
See also: 
sort() and limit(). 
Optimize Query Performance 
Create Indexes to Support Queries 
For commonly issued queries, create indexes (page 431). If a query searches multiple fields, create a compound index 
(page 440). Scanning an index is much faster than scanning a collection. Index structures are smaller than the 
documents they reference, and they store references in order. 
Example 
If you have a posts collection containing blog posts, and if you regularly issue a query that sorts on the 
author_name field, then you can optimize the query by creating an index on the author_name field: 
db.posts.ensureIndex( { author_name : 1 } ) 
Indexes also improve efficiency on queries that routinely sort on a given field. 
Example 
If you regularly issue a query that sorts on the timestamp field, then you can optimize the query by creating an 
index on the timestamp field: 
Creating this index: 
db.posts.ensureIndex( { timestamp : 1 } ) 
Optimizes this query: 
db.posts.find().sort( { timestamp : -1 } ) 
Because MongoDB can read indexes in both ascending and descending order, the direction of a single-key index does 
not matter. 
Indexes support queries, update operations, and some phases of the aggregation pipeline (page 393). 
Index keys that are of the BinData type are more efficiently stored in the index if: 
• the binary subtype value is in the range of 0-7 or 128-135, and 
• the length of the byte array is: 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, or 32. 
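The two conditions above can be encoded as a simple predicate. This is a hedged sketch for checking your own key shapes; isEfficientBinData is an invented name, and MongoDB applies the equivalent check internally.

```javascript
// Byte-array lengths that MongoDB stores efficiently in the index,
// per the list above.
const EFFICIENT_LENGTHS = new Set(
  [0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, 32]);

// True when a BinData value of this subtype and byte length qualifies
// for the more efficient index storage described above.
function isEfficientBinData(subtype, byteLength) {
  const subtypeOk = (subtype >= 0 && subtype <= 7) ||
                    (subtype >= 128 && subtype <= 135);
  return subtypeOk && EFFICIENT_LENGTHS.has(byteLength);
}

console.log(isEfficientBinData(0, 16));  // → true  (e.g. a UUID-sized generic binary)
console.log(isEfficientBinData(0, 9));   // → false (length not in the list)
console.log(isEfficientBinData(64, 16)); // → false (subtype out of range)
```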
Limit the Number of Query Results to Reduce Network Demand 
MongoDB cursors return results in groups of multiple documents. If you know the number of results you want, you 
can reduce the demand on network resources by issuing the limit() method. 
This is typically used in conjunction with sort operations. For example, if you need only 10 results from your query to 
the posts collection, you would issue the following command: 
db.posts.find().sort( { timestamp : -1 } ).limit(10) 
For more information on limiting results, see limit(). 
Use Projections to Return Only Necessary Data 
When you need only a subset of fields from documents, you can achieve better performance by returning only the 
fields you need: 
For example, if in your query to the posts collection, you need only the timestamp, title, author, and 
abstract fields, you would issue the following command: 
db.posts.find( {}, { timestamp : 1 , title : 1 , author : 1 , abstract : 1 } ).sort( { timestamp : -1 } ) 
For more information on using projections, see Limit Fields to Return from a Query (page 94). 
Use $hint to Select a Particular Index 
In most cases the query optimizer (page 61) selects the optimal index for a specific operation; however, you can force 
MongoDB to use a specific index using the hint() method. Use hint() to support performance testing, or on 
some queries where you must select a field or fields included in several indexes. 
Use the Increment Operator to Perform Operations Server-Side 
Use MongoDB’s $inc operator to increment or decrement values in documents. The operator increments the value 
of the field on the server side, as an alternative to selecting a document, making simple modifications in the client 
and then writing the entire document to the server. The $inc operator can also help avoid race conditions, which 
would result when two application instances queried for a document, manually incremented a field, and saved the 
entire document back at the same time. 
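The race condition described above can be sketched in plain JavaScript. This is a hypothetical simulation: doc and applyInc are invented names, and applyInc merely stands in for a server-side $inc.

```javascript
const doc = { hits: 0 };

// Client-side read-modify-write: two clients both read hits = 0,
// increment locally, then write the whole value back.
const read1 = doc.hits, read2 = doc.hits;
doc.hits = read1 + 1;
doc.hits = read2 + 1;  // second write overwrites the first
console.log(doc.hits); // → 1: one increment was lost

// Server-side increment: each operation applies against current state,
// so interleaved increments cannot overwrite each other.
doc.hits = 0;
function applyInc(d, field, n) { d[field] += n; } // stands in for $inc
applyInc(doc, "hits", 1);
applyInc(doc, "hits", 1);
console.log(doc.hits); // → 2: both increments survive
```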
Design Notes 
This page details features of MongoDB that may be important to bear in mind when designing your applications. 
Schema Considerations 
Dynamic Schema Data in MongoDB has a dynamic schema. Collections do not enforce document structure. This 
facilitates iterative development and polymorphism. Nevertheless, collections often hold documents with highly homogeneous 
structures. See Data Modeling Concepts (page 133) for more information. 
Some operational considerations include: 
• the exact set of collections to be used; 
• the indexes to be used: with the exception of the _id index, all indexes must be created explicitly; 
• shard key declarations: choosing a good shard key is very important as the shard key cannot be changed once 
set. 
Avoid importing unmodified data directly from a relational database. In general, you will want to “roll up” certain 
data into richer documents that take advantage of MongoDB’s support for sub-documents and nested arrays. 
Case Sensitive Strings MongoDB strings are case sensitive. So a search for "joe" will not find "Joe". 
Consider: 
• storing data in a normalized case format, or 
• using regular expressions ending with the case-insensitive /i option, and/or 
• using $toLower or $toUpper in the aggregation framework (page 391). 
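The first suggestion above can be sketched as follows. Storing a normalized-case copy of the field in a hypothetical name_lower field (an application convention, not something MongoDB adds for you) lets equality queries match regardless of case.

```javascript
// Store a lowercased copy of the field alongside the original so
// case-insensitive lookups can use a plain equality match.
function normalizeDoc(doc) {
  return { ...doc, name_lower: doc.name.toLowerCase() };
}

const stored = [normalizeDoc({ name: "Joe" }), normalizeDoc({ name: "joe" })];

// Query against the normalized field, normalizing the input the same way.
const hits = stored.filter(d => d.name_lower === "Joe".toLowerCase());
console.log(hits.length); // → 2: both case variants match
```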
Type Sensitive Fields MongoDB data is stored in the BSON73 format, a binary encoded serialization of JSON-like 
documents. BSON encodes additional type information. See bsonspec.org74 for more information. 
Consider the following document which has a field x with the string value "123": 
{ x : "123" } 
Then the following query which looks for a number value 123 will not return that document: 
db.mycollection.find( { x : 123 } ) 
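BSON's type-sensitive equality can be sketched with a toy matcher. strictMatch is a hypothetical helper mimicking the shell behavior shown above, not a MongoDB API.

```javascript
// A naive equality matcher: a query value matches only when both the
// type and the value agree, as in BSON comparisons of string vs. number.
function strictMatch(doc, query) {
  return Object.keys(query).every(
    k => typeof doc[k] === typeof query[k] && doc[k] === query[k]);
}

console.log(strictMatch({ x: "123" }, { x: 123 }));   // → false: number query misses string field
console.log(strictMatch({ x: "123" }, { x: "123" })); // → true:  types and values agree
```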
General Considerations 
By Default, Updates Affect one Document To update multiple documents that meet your query criteria, set the 
update multi option to true or 1. See: Update Multiple Documents (page 70). 
Prior to MongoDB 2.2, you would specify the upsert and multi options in the update method as positional 
boolean options. See: the update method reference documentation. 
BSON Document Size Limit The BSON Document Size limit is currently set at 16MB per document. If you 
require larger documents, use GridFS (page 138). 
No Fully Generalized Transactions MongoDB does not have fully generalized transactions (page 111). If you 
model your data using rich documents that closely resemble your application’s objects, each logical object will be in 
one MongoDB document. MongoDB allows you to modify a document in a single atomic operation. This kind of 
data modification pattern covers most common uses of transactions in other systems. 
Replica Set Considerations 
Use an Odd Number of Replica Set Members Replica sets (page 503) perform consensus elections. To ensure 
that elections will proceed successfully, either use an odd number of members, typically three, or else use an arbiter 
to ensure an odd number of votes. 
Keep Replica Set Members Up-to-Date MongoDB replica sets support automatic failover (page 523). It is important 
for your secondaries to be up-to-date. There are various strategies for assessing consistency: 
1. Use monitoring tools to alert you to lag events. See Monitoring for MongoDB (page 175) for a detailed discussion 
of MongoDB’s monitoring options. 
2. Specify appropriate write concern. 
3. If your application requires manual fail over, you can configure your secondaries as priority 0 (page 512). 
Priority 0 secondaries require manual action for a failover. This may be practical for a small replica set, but 
large deployments should fail over automatically. 
See also: 
replica set rollbacks (page 527). 
73http://docs.mongodb.org/meta-driver/latest/legacy/bson/ 
74http://bsonspec.org/#/specification 
Sharding Considerations 
• Pick your shard keys carefully. You cannot choose a new shard key for a collection that is already sharded. 
• Shard key values are immutable. 
• When enabling sharding on an existing collection, MongoDB imposes a maximum size on those collections 
to ensure that it is possible to create chunks. For a detailed explanation of this limit, see: 
<sharding-existing-collection-data-size>. 
To shard large amounts of data, create a new empty sharded collection, and ingest the data from the source 
collection using an application level import operation. 
• Unique indexes are not enforced across shards except for the shard key itself. See Enforce Unique Keys for 
Sharded Collections (page 674). 
• Consider pre-splitting (page 634) a sharded collection before a massive bulk import. 
5.2 Administration Tutorials 
The administration tutorials provide specific step-by-step instructions for performing common MongoDB setup, maintenance, 
and configuration operations. 
Configuration, Maintenance, and Analysis (page 205) Describes routine management operations, including configuration 
and performance analysis. 
Manage mongod Processes (page 207) Start, configure, and manage running mongod process. 
Rotate Log Files (page 214) Archive the current log files and start new ones. 
Continue reading from Configuration, Maintenance, and Analysis (page 205) for additional tutorials of fundamental 
MongoDB maintenance procedures. 
Backup and Recovery (page 229) Outlines procedures for data backup and restoration with mongod instances and 
deployments. 
Backup and Restore with Filesystem Snapshots (page 229) An outline of procedures for creating MongoDB 
data set backups using system-level file snapshot tools, such as LVM or native storage appliance tools. 
Backup and Restore Sharded Clusters (page 238) Detailed procedures and considerations for backing up 
sharded clusters and single shards. 
Recover Data after an Unexpected Shutdown (page 246) Recover data from MongoDB data files that were not 
properly closed or have an invalid state. 
Continue reading from Backup and Recovery (page 229) for additional tutorials of MongoDB backup and recovery 
procedures. 
MongoDB Scripting (page 248) An introduction to the scripting capabilities of the mongo shell and the scripting 
capabilities embedded in MongoDB instances. 
MongoDB Tutorials (page 225) A complete list of tutorials in the MongoDB Manual that address MongoDB operation 
and use. 
5.2.1 Configuration, Maintenance, and Analysis 
The following tutorials describe routine management operations, including configuration and performance analysis: 
Use Database Commands (page 206) The process for running database commands that provide basic database operations. 
Manage mongod Processes (page 207) Start, configure, and manage running mongod process. 
Terminate Running Operations (page 209) Stop in-progress MongoDB client operations using db.killOp() and 
maxTimeMS(). 
Analyze Performance of Database Operations (page 210) Collect data that introspects the performance of query and 
update operations on a mongod instance. 
Rotate Log Files (page 214) Archive the current log files and start new ones. 
Manage Journaling (page 215) Describes the procedures for configuring and managing MongoDB’s journaling system, 
which allows MongoDB to provide crash resiliency and durability. 
Store a JavaScript Function on the Server (page 217) Describes how to store JavaScript functions on a MongoDB 
server. 
Upgrade to the Latest Revision of MongoDB (page 218) Introduces the basic process for upgrading a MongoDB deployment 
between different minor release versions. 
Monitor MongoDB With SNMP on Linux (page 221) The SNMP extension, available in MongoDB Enterprise, allows 
MongoDB to report data into SNMP traps. 
Monitor MongoDB Windows with SNMP (page 223) The SNMP extension, available in the Windows build of MongoDB 
Enterprise, allows MongoDB to report data into SNMP traps. 
Troubleshoot SNMP (page 224) Outlines common errors and diagnostic processes useful for deploying MongoDB 
Enterprise with SNMP support. 
MongoDB Tutorials (page 225) A complete list of tutorials in the MongoDB Manual that address MongoDB operation 
and use. 
Use Database Commands 
The MongoDB command interface provides access to all non-CRUD database operations. Fetching server stats, 
initializing a replica set, and running a map-reduce job are all accomplished with commands. 
See http://docs.mongodb.org/manual/reference/command for a list of all commands sorted by function, 
and http://docs.mongodb.org/manual/reference/command for a list of all commands sorted alphabetically. 
Database Command Form 
You specify a command first by constructing a standard BSON document whose first key is the name of the command. 
For example, specify the isMaster command using the following BSON document: 
{ isMaster: 1 } 
Issue Commands 
The mongo shell provides a helper method for running commands called db.runCommand(). The following 
operation in mongo runs the above command: 
db.runCommand( { isMaster: 1 } ) 
Many drivers provide an equivalent for the db.runCommand() method. Internally, running commands with 
db.runCommand() is equivalent to a special query against the $cmd collection. 
Many common commands have their own shell helpers or wrappers in the mongo shell and drivers, such as the 
db.isMaster() method in the mongo JavaScript shell. 
You can use the maxTimeMS option to specify a time limit for the execution of a command; see Terminate a Command 
(page 210) for more information on operation termination. 
admin Database Commands 
You must run some commands on the admin database. Normally, these operations resemble the following: 
use admin 
db.runCommand( {buildInfo: 1} ) 
However, there’s also a command helper that automatically runs the command in the context of the admin database: 
db._adminCommand( {buildInfo: 1} ) 
Command Responses 
All commands return, at minimum, a document with an ok field indicating whether the command has succeeded: 
{ 'ok': 1 } 
Failed commands return the ok field with a value of 0. 
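A driver or script can turn this convention into an explicit check. The following is a sketch of a client-side helper; assertOk is an invented name, not a shell built-in.

```javascript
// Raise an error when a command response indicates failure (ok !== 1),
// surfacing the server's errmsg when one is present.
function assertOk(response) {
  if (response.ok !== 1) {
    throw new Error(response.errmsg || "command failed");
  }
  return response;
}

console.log(assertOk({ ok: 1 }).ok);              // → 1
try { assertOk({ ok: 0, errmsg: "no such command" }); }
catch (e) { console.log(e.message); }             // → no such command
```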
Manage mongod Processes 
MongoDB runs as a standard program. You can start MongoDB from a command line 
by issuing the mongod command and specifying options. For a list of options, see 
http://docs.mongodb.org/manualreference/program/mongod. MongoDB can also run as a 
Windows service. For details, see Configure a Windows Service for MongoDB (page 21). To install MongoDB, see 
Install MongoDB (page 5). 
The following examples assume the directory containing the mongod process is in your system paths. The mongod 
process is the primary database process that runs on an individual server. mongos provides a coherent MongoDB 
interface equivalent to a mongod from the perspective of a client. The mongo binary provides the administrative 
shell. 
This page discusses the mongod process; however, some portions of this document may be applicable to 
mongos instances. 
See also: 
Run-time Database Configuration (page 182), http://docs.mongodb.org/manual/reference/program/mongod, 
http://docs.mongodb.org/manual/reference/program/mongos, and 
http://docs.mongodb.org/manual/reference/configuration-options. 
Start mongod Processes 
By default, MongoDB stores data in the /data/db directory. On Windows, MongoDB stores data in C:\data\db. 
On all platforms, MongoDB listens for connections from clients on port 27017. 
To start MongoDB using all defaults, issue the following command at the system shell: 
mongod 
Specify a Data Directory If you want mongod to store data files at a path other than /data/db, you can specify 
a dbPath. The dbPath must exist before you start mongod. If it does not exist, create the directory and set the 
permissions so that mongod can read and write data to this path. For more information on permissions, see the 
security operations documentation. 
To specify a dbPath for mongod to use as a data directory, use the --dbpath option. The following invocation 
will start a mongod instance and store data in the /srv/mongodb path: 
mongod --dbpath /srv/mongodb/ 
Specify a TCP Port Only a single process can listen for connections on a network interface at a time. If you run 
multiple mongod processes on a single machine, or have other processes that must use this port, you must assign each 
a different port to listen on for client connections. 
To specify a port to mongod, use the --port option on the command line. The following command starts mongod 
listening on port 12345: 
mongod --port 12345 
Use the default port number when possible, to avoid confusion. 
Start mongod as a Daemon To run a mongod process as a daemon (i.e. fork), and write its output to a log file, 
use the --fork and --logpath options. You must create the log directory; however, mongod will create the log 
file if it does not exist. 
The following command starts mongod as a daemon and records log output to /var/log/mongodb.log. 
mongod --fork --logpath /var/log/mongodb.log 
Additional Configuration Options For an overview of common configurations and configurations for common use 
cases, see Run-time Database Configuration (page 182). 
Stop mongod Processes 
In a clean shutdown a mongod completes all pending operations, flushes all data to data files, and closes all data files. 
Other shutdowns are unclean and can compromise the validity of the data files. 
To ensure a clean shutdown, always shutdown mongod instances using one of the following methods: 
Use shutdownServer() Shut down the mongod from the mongo shell using the db.shutdownServer() 
method as follows: 
use admin 
db.shutdownServer() 
Calling the same method from a control script accomplishes the same result. 
For systems with authorization enabled, users may only issue db.shutdownServer() when authenticated 
to the admin database or via the localhost interface on systems without authentication enabled. 
Use --shutdown From the Linux command line, shut down the mongod using the --shutdown option in the 
following command: 
mongod --shutdown 
Use CTRL-C When running the mongod instance in interactive mode (i.e. without --fork), issue Control-C 
to perform a clean shutdown. 
Use kill From the Linux command line, shut down a specific mongod instance using the following command: 
kill <mongod process ID> 
Warning: Never use kill -9 (i.e. SIGKILL) to terminate a mongod instance. 
Stop a Replica Set 
Procedure If the mongod is the primary in a replica set, the shutdown process for these mongod instances has the 
following steps: 
1. Check how up-to-date the secondaries are. 
2. If no secondary is within 10 seconds of the primary, mongod will return a message that it will not shut down. 
You can pass the shutdown command a timeoutSecs argument to wait for a secondary to catch up. 
3. If there is a secondary within 10 seconds of the primary, the primary will step down and wait for the secondary 
to catch up. 
4. After 60 seconds or once the secondary has caught up, the primary will shut down. 
Force Replica Set Shutdown If there is no up-to-date secondary and you want the primary to shut down, issue the 
shutdown command with the force argument, as in the following mongo shell operation: 
db.adminCommand({shutdown : 1, force : true}) 
To keep checking the secondaries for a specified number of seconds if none are immediately up-to-date, issue 
shutdown with the timeoutSecs argument. MongoDB will keep checking the secondaries for the specified 
number of seconds if none are immediately up-to-date. If any of the secondaries catch up within the allotted time, the 
primary will shut down. If no secondaries catch up, it will not shut down. 
The following command issues shutdown with timeoutSecs set to 5: 
db.adminCommand({shutdown : 1, timeoutSecs : 5}) 
Alternately you can use the timeoutSecs argument with the db.shutdownServer() method: 
db.shutdownServer({timeoutSecs : 5}) 
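The decision rule in this procedure can be sketched as follows. This is a hypothetical simplification of mongod's logic: canShutDown is an invented name, and "catching up during the wait" is modeled naively as lag shrinking by the wait time.

```javascript
// Decide whether a primary may shut down: it proceeds only if some
// secondary is within 10 seconds of it, optionally waiting up to
// timeoutSecs for one to catch up, unless force is set.
function canShutDown(secondaryLagSecs, { force = false, timeoutSecs = 0 } = {}) {
  if (force) return true; // { shutdown: 1, force: true } always proceeds
  // Naive catch-up model: lag shrinks by at most the time we wait.
  const effectiveLags = secondaryLagSecs.map(l => Math.max(0, l - timeoutSecs));
  return effectiveLags.some(lag => lag <= 10);
}

console.log(canShutDown([45, 30]));                      // → false: all secondaries too far behind
console.log(canShutDown([45, 30], { timeoutSecs: 25 })); // → true: one catches up within the wait
console.log(canShutDown([45, 30], { force: true }));     // → true: forced shutdown
```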
Terminate Running Operations 
Overview 
MongoDB provides two facilities to terminate running operations: maxTimeMS() and db.killOp(). Use these 
operations as needed to control the behavior of operations in a MongoDB deployment. 
5.2. Administration Tutorials 209
MongoDB Documentation, Release 2.6.4 
Available Procedures 
maxTimeMS New in version 2.6. 
The maxTimeMS() method sets a time limit for an operation. When the operation reaches the specified time limit, 
MongoDB interrupts the operation at the next interrupt point. 
Terminate a Query From the mongo shell, use the following method to set a time limit of 30 milliseconds for this 
query: 
db.location.find( { "town": { "$regex": "(Pine Lumber)", 
"$options": 'i' } } ).maxTimeMS(30) 
Terminate a Command Consider a potentially long-running distinct operation that returns each distinct 
value of the city key in the collection collection: 
db.runCommand( { distinct: "collection", 
key: "city" } ) 
You can add the maxTimeMS field to the command document to set a time limit of 30 milliseconds for the operation: 
db.runCommand( { distinct: "collection", 
key: "city", 
maxTimeMS: 30 } ) 
db.getLastError() and db.getLastErrorObj() will return errors for interrupted operations: 
{ "n" : 0, 
"connectionId" : 1, 
"err" : "operation exceeded time limit", 
"ok" : 1 } 
killOp The db.killOp() method interrupts a running operation at the next interrupt point. db.killOp() 
identifies the target operation by operation ID. 
db.killOp(<opId>) 
Related 
To return a list of running operations see db.currentOp(). 
Analyze Performance of Database Operations 
The database profiler collects fine-grained data about MongoDB write operations, cursors, and database commands on 
a running mongod instance. You can enable profiling on a per-database or per-instance basis. The profiling level 
(page 211) is also configurable when enabling profiling. 
The database profiler writes all the data it collects to the system.profile (page 271) collection, which is a capped 
collection (page 196). See Database Profiler Output (page 271) for an overview of the data in the system.profile 
(page 271) documents created by the profiler. 
This document outlines a number of key administration options for the database profiler. For additional related 
information, consider the following resources: 
• Database Profiler Output (page 271) 
• Profile Command 
• http://docs.mongodb.org/manual/reference/method/db.currentOp 
Profiling Levels 
The following profiling levels are available: 
• 0 - the profiler is off and does not collect any data. mongod always writes operations longer than the 
slowOpThresholdMs threshold to its log. 
• 1 - collects profiling data for slow operations only. By default slow operations are those slower than 100 
milliseconds. 
You can modify the threshold for “slow” operations with the slowOpThresholdMs runtime option or the 
setParameter command. See the Specify the Threshold for Slow Operations (page 211) section for more 
information. 
• 2 - collects profiling data for all database operations. 
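The levels can be summarized as a small predicate; the function below is a toy model with an illustrative name, not a MongoDB API:

```javascript
// Toy model of which operations each profiling level records. The real
// profiler runs inside mongod; slowMs stands in for slowOpThresholdMs.
function isProfiled(level, opMillis, slowMs = 100) {
  switch (level) {
    case 0: return false;             // profiler off
    case 1: return opMillis > slowMs; // slow operations only
    case 2: return true;              // all operations
    default: throw new Error("profiling level must be 0, 1, or 2");
  }
}
```

Raising `slowMs` at level 1 shrinks the set of recorded operations; level 2 ignores the threshold entirely.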
Enable Database Profiling and Set the Profiling Level 
You can enable database profiling from the mongo shell or through a driver using the profile command. This 
section will describe how to do so from the mongo shell. See your driver documentation if you want to 
control the profiler from within your application. 
When you enable profiling, you also set the profiling level (page 211). The profiler records data in the 
system.profile (page 271) collection. MongoDB creates the system.profile (page 271) collection in a 
database after you enable profiling for that database. 
To enable profiling and set the profiling level, use the db.setProfilingLevel() helper in the mongo shell, 
passing the profiling level as a parameter. For example, to enable profiling for all database operations, consider the 
following operation in the mongo shell: 
db.setProfilingLevel(2) 
The shell returns a document showing the previous level of profiling. The "ok" : 1 key-value pair indicates the 
operation succeeded: 
{ "was" : 0, "slowms" : 100, "ok" : 1 } 
To verify the new setting, see the Check Profiling Level (page 212) section. 
Specify the Threshold for Slow Operations The threshold for slow operations applies to the entire mongod 
instance. When you change the threshold, you change it for all databases on the instance. 
Important: Changing the slow operation threshold for the database profiler also affects the profiling subsystem’s 
slow operation threshold for the entire mongod instance. Always set the threshold to the highest useful value. 
By default the slow operation threshold is 100 milliseconds. Databases with a profiling level of 1 will log operations 
slower than 100 milliseconds. 
To change the threshold, pass two parameters to the db.setProfilingLevel() helper in the mongo shell. The 
first parameter sets the profiling level for the current database, and the second sets the default slow operation threshold 
for the entire mongod instance. 
For example, the following command sets the profiling level for the current database to 0, which disables profiling, 
and sets the slow-operation threshold for the mongod instance to 20 milliseconds. Any database on the instance with 
a profiling level of 1 will use this threshold: 
db.setProfilingLevel(0,20) 
Check Profiling Level To view the profiling level (page 211), issue the following from the mongo shell: 
db.getProfilingStatus() 
The shell returns a document similar to the following: 
{ "was" : 0, "slowms" : 100 } 
The was field indicates the current level of profiling. 
The slowms field indicates how long, in milliseconds, an operation must run to pass the “slow” threshold. 
MongoDB will log operations that take longer than the threshold if the profiling level is 1. For an explanation of 
profiling levels, see Profiling Levels (page 211). 
To return only the profiling level, use the db.getProfilingLevel() helper in the mongo shell, as in the following: 
db.getProfilingLevel() 
Disable Profiling To disable profiling, use the following helper in the mongo shell: 
db.setProfilingLevel(0) 
Enable Profiling for an Entire mongod Instance For development purposes in testing environments, you can 
enable database profiling for an entire mongod instance. The profiling level applies to all databases provided by the 
mongod instance. 
To enable profiling for a mongod instance, pass the following parameters to mongod at startup or within the 
configuration file: 
mongod --profile=1 --slowms=15 
This sets the profiling level to 1, which collects profiling data for slow operations only, and defines slow operations as 
those that last longer than 15 milliseconds. 
See also: 
mode and slowOpThresholdMs. 
Database Profiling and Sharding You cannot enable profiling on a mongos instance. To enable profiling in a 
sharded cluster, you must enable profiling for each mongod instance in the cluster. 
View Profiler Data 
The database profiler logs information about database operations in the system.profile (page 271) collection. 
To view profiling information, query the system.profile (page 271) collection. To view example queries, see 
Example Profiler Data Queries (page 213). 
For an explanation of the output data, see Database Profiler Output (page 271). 
Example Profiler Data Queries This section displays example queries to the system.profile (page 271) collection. 
For an explanation of the query output, see Database Profiler Output (page 271). 
To return the most recent 10 log entries in the system.profile (page 271) collection, run a query similar to the 
following: 
db.system.profile.find().limit(10).sort( { ts : -1 } ).pretty() 
To return all operations except command operations ($cmd), run a query similar to the following: 
db.system.profile.find( { op: { $ne : 'command' } } ).pretty() 
To return operations for a particular collection, run a query similar to the following. This example returns operations 
in the mydb database’s test collection: 
db.system.profile.find( { ns : 'mydb.test' } ).pretty() 
To return operations slower than 5 milliseconds, run a query similar to the following: 
db.system.profile.find( { millis : { $gt : 5 } } ).pretty() 
To return information from a certain time range, run a query similar to the following: 
db.system.profile.find( 
{ 
ts : { 
$gt : new ISODate("2012-12-09T03:00:00Z") , 
$lt : new ISODate("2012-12-09T03:40:00Z") 
} 
} 
).pretty() 
The following example looks at the time range, suppresses the user field from the output to make it easier to read, 
and sorts the results by how long each operation took to run: 
db.system.profile.find( 
{ 
ts : { 
$gt : new ISODate("2011-07-12T03:00:00Z") , 
$lt : new ISODate("2011-07-12T03:40:00Z") 
} 
}, 
{ user : 0 } 
).sort( { millis : -1 } ) 
Show the Five Most Recent Events On a database that has profiling enabled, the show profile helper in the 
mongo shell displays the 5 most recent operations that took at least 1 millisecond to execute. Issue show profile 
from the mongo shell, as follows: 
show profile 
Profiler Overhead 
When enabled, profiling has a minor effect on performance. The system.profile (page 271) collection is a 
capped collection with a default size of 1 megabyte. A collection of this size can typically store several thousand 
profile documents, but some applications may use more or less profiling data per operation. 
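A back-of-the-envelope check on that capacity claim, with a guessed average document size (not a MongoDB constant):

```javascript
// Rough capacity estimate for the capped system.profile collection.
// The 300-byte average profile document size is an illustrative guess.
function estimatedProfileDocs(cappedSizeBytes, avgDocBytes = 300) {
  return Math.floor(cappedSizeBytes / avgDocBytes);
}
```

At the default 1 megabyte, this lands in the low thousands of documents, consistent with "several thousand" above.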
To change the size of the system.profile (page 271) collection, you must: 
1. Disable profiling. 
2. Drop the system.profile (page 271) collection. 
3. Create a new system.profile (page 271) collection. 
4. Re-enable profiling. 
For example, to create a new system.profile (page 271) collection that’s 4000000 bytes, use the following 
sequence of operations in the mongo shell: 
db.setProfilingLevel(0) 
db.system.profile.drop() 
db.createCollection( "system.profile", { capped: true, size:4000000 } ) 
db.setProfilingLevel(1) 
Change Size of system.profile Collection 
To change the size of the system.profile (page 271) collection on a secondary, you must stop the secondary, run 
it as a standalone, and then perform the steps above. When done, restart the standalone as a member of the replica set. 
For more information, see Perform Maintenance on Replica Set Members (page 572). 
Rotate Log Files 
Overview 
Log rotation using MongoDB’s standard approach archives the current log file and starts a new one. To do this, the 
mongod or mongos instance renames the current log file by appending a UTC (GMT) timestamp to the filename, in 
ISODate format. It then opens a new log file, closes the old log file, and sends all new log entries to the new log file. 
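The renaming step can be sketched as a small helper (the function name is hypothetical; mongod performs this rename internally):

```javascript
// Sketch of the rename logRotate performs: append a UTC timestamp in
// ISODate form, with colons replaced by dashes, to the current filename.
function rotatedLogName(logPath, when) {
  const stamp = when
    .toISOString()
    .replace(/\.\d+Z$/, "") // drop milliseconds and the trailing Z
    .replace(/:/g, "-");    // filesystem-safe separators
  return `${logPath}.${stamp}`;
}
```

For a rotation at 23:30:00 UTC on 2011-11-24, this yields the `server1.log.2011-11-24T23-30-00` name shown in the listing later in this section.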
MongoDB’s standard approach to log rotation only rotates logs in response to the logRotate command, or when 
the mongod or mongos process receives a SIGUSR1 signal from the operating system. 
Alternately, you may configure mongod to send log data to syslog. In this case, you can take advantage of alternate 
log rotation tools such as logrotate. 
See also: 
For information on logging, see the Process Logging (page 178) section. 
Log Rotation With MongoDB 
The following steps create and rotate a log file: 
1. Start a mongod with verbose logging, with appending enabled, and with the following log file: 
mongod -v --logpath /var/log/mongodb/server1.log --logappend 
2. In a separate terminal, list the matching files: 
ls /var/log/mongodb/server1.log* 
For results, you get: 
server1.log 
3. Rotate the log file using one of the following methods. 
• From the mongo shell, issue the logRotate command from the admin database: 
use admin 
db.runCommand( { logRotate : 1 } ) 
This is the only available method to rotate log files on Windows systems. 
• For Linux systems, rotate logs for a single process by issuing the following command: 
kill -SIGUSR1 <mongod process id> 
4. List the matching files again: 
ls /var/log/mongodb/server1.log* 
For results you get something similar to the following. The timestamps will be different. 
server1.log server1.log.2011-11-24T23-30-00 
The example results indicate a log rotation performed at exactly 11:30 pm on November 24th, 2011 UTC. The 
original log file is the one with the timestamp appended; the new server1.log file receives all new log output. 
If you issue a second logRotate command an hour later, then an additional file would appear when listing 
matching files, as in the following example: 
server1.log server1.log.2011-11-24T23-30-00 server1.log.2011-11-25T00-30-00 
This operation does not modify the server1.log.2011-11-24T23-30-00 file created earlier, while 
server1.log.2011-11-25T00-30-00 is the previous server1.log file, renamed. server1.log 
is a new, empty file that receives all new log output. 
Syslog Log Rotation 
New in version 2.2. 
To configure mongod to send log data to syslog rather than writing log data to a file, use the following procedure. 
1. Start a mongod with the syslogFacility option. 
2. Store and rotate the log output using your system’s default log rotation mechanism. 
Important: You cannot use syslogFacility with systemLog.path. 
Manage Journaling 
MongoDB uses write ahead logging to an on-disk journal to guarantee write operation (page 67) durability and to 
provide crash resiliency. Before applying a change to the data files, MongoDB writes the change operation to the 
journal. If MongoDB should terminate or encounter an error before it can write the changes from the journal to the 
data files, MongoDB can re-apply the write operation and maintain a consistent state. 
Without a journal, if mongod exits unexpectedly, you must assume your data is in an inconsistent state, and you must 
run either repair (page 246) or, preferably, resync (page 575) from a clean member of the replica set. 
With journaling enabled, if mongod stops unexpectedly, the program can recover everything written to the journal, 
and the data remains in a consistent state. By default, the writes at risk of loss, i.e., those not yet written to the 
journal, are those made in the last 100 milliseconds. See commitIntervalMs for more information on the default. 
With journaling, if you want a data set to reside entirely in RAM, you need enough RAM to hold the data set plus 
the “write working set.” The “write working set” is the amount of unique data you expect to see written between 
re-mappings of the private view. For information on views, see Storage Views used in Journaling (page 275). 
Important: Changed in version 2.0: For 64-bit builds of mongod, journaling is enabled by default. For other 
platforms, see storage.journal.enabled. 
Procedures 
Enable Journaling Changed in version 2.0: For 64-bit builds of mongod, journaling is enabled by default. 
To enable journaling, start mongod with the --journal command line option. 
If no journal files exist when mongod starts, it must preallocate new journal files. During this operation, the mongod 
does not listen for connections until preallocation completes; for some systems this may take several minutes. 
During this period your applications and the mongo shell are not available. 
Disable Journaling 
Warning: Do not disable journaling on production systems. If your mongod instance stops unexpectedly without 
shutting down cleanly for any reason (e.g. power failure) and you are not running with journaling, then you 
must recover from an unaffected replica set member or backup, as described in repair (page 246). 
To disable journaling, start mongod with the --nojournal command line option. 
Get Commit Acknowledgment You can get commit acknowledgment with the Write Concern (page 72) and the j 
option. For details, see Write Concern Reference (page 118). 
Avoid Preallocation Lag To avoid preallocation lag (page 275), you can preallocate files in the journal directory by 
copying them from another instance of mongod. 
Preallocated files do not contain data. It is safe to later remove them. But if you restart mongod with journaling, 
mongod will create them again. 
Example 
The following sequence preallocates journal files for an instance of mongod running on port 27017 with a database 
path of /data/db. 
For demonstration purposes, the sequence starts by creating a set of journal files in the usual way. 
1. Create a temporary directory into which to create a set of journal files: 
mkdir ~/tmpDbpath 
2. Create a set of journal files by starting a mongod instance that uses the temporary directory: 
mongod --port 10000 --dbpath ~/tmpDbpath --journal 
3. When you see the following log output, indicating mongod has the files, press CONTROL+C to stop the 
mongod instance: 
[initandlisten] waiting for connections on port 10000 
4. Preallocate journal files for the new instance of mongod by moving the journal files from the data directory of 
the existing instance to the data directory of the new instance: 
mv ~/tmpDbpath/journal /data/db/ 
5. Start the new mongod instance: 
mongod --port 27017 --dbpath /data/db --journal 
Monitor Journal Status Use the following commands and methods to monitor journal status: 
• serverStatus 
The serverStatus command returns database status information that is useful for assessing performance. 
• journalLatencyTest 
Use journalLatencyTest to measure how long it takes on your volume to write to the disk in an append-only 
fashion. You can run this command on an idle system to get a baseline sync time for journaling. You can 
also run this command on a busy system to see the sync time on a busy system, which may be higher if the 
journal directory is on the same volume as the data files. 
The journalLatencyTest command also provides a way to check if your disk drive is buffering writes in 
its local cache. If the number is very low (i.e., less than 2 milliseconds) and the drive is non-SSD, the drive 
is probably buffering writes. In that case, enable cache write-through for the device in your operating system, 
unless you have a disk controller card with battery backed RAM. 
Change the Group Commit Interval Changed in version 2.0. 
You can set the group commit interval using the --journalCommitInterval command line option. The allowed 
range is 2 to 300 milliseconds. 
Lower values increase the durability of the journal at the expense of disk performance. 
Recover Data After Unexpected Shutdown On a restart after a crash, MongoDB replays all journal files in the 
journal directory before the server becomes available. If MongoDB must replay journal files, mongod notes these 
events in the log output. 
There is no reason to run repairDatabase in these situations. 
Store a JavaScript Function on the Server 
Note: Do not store application logic in the database. There are performance limitations to running JavaScript inside 
of MongoDB. Application code also is typically most effective when it shares version control with the application 
itself. 
There is a special system collection named system.js that can store JavaScript functions for reuse. 
To store a function, you can use the db.collection.save() method, as in the following example: 
db.system.js.save( 
{ 
_id : "myAddFunction" , 
value : function (x, y){ return x + y; } 
} 
); 
• The _id field holds the name of the function and is unique per database. 
• The value field holds the function definition. 
Once you save a function in the system.js collection, you can use the function from any JavaScript context (e.g. 
eval command or the mongo shell method db.eval(), $where operator, mapReduce or mongo shell method 
db.collection.mapReduce()). 
Consider the following example from the mongo shell that first saves a function named echoFunction to the 
system.js collection and then calls the function using the db.eval() method: 
db.system.js.save( 
{ _id: "echoFunction", 
value : function(x) { return x; } 
} 
) 
db.eval( "echoFunction( 'test' )" ) 
See http://github.com/mongodb/mongo/tree/master/jstests/core/storefunc.js for a full example. 
New in version 2.1: In the mongo shell, you can use db.loadServerScripts() to load all the scripts saved in 
the system.js collection for the current database. Once loaded, you can invoke the functions directly in the shell, 
as in the following example: 
db.loadServerScripts(); 
echoFunction(3); 
myAddFunction(3, 5); 
Upgrade to the Latest Revision of MongoDB 
Revisions provide security patches, bug fixes, and new or changed features that do not contain any backward breaking 
changes. Always upgrade to the latest revision in your release series. The third number in the MongoDB version 
number (page 808) indicates the revision. 
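A minimal sketch of reading the revision out of a version string (the helper is illustrative only):

```javascript
// The third component of a MongoDB version string is the revision.
function revisionOf(version) {
  const parts = version.split(".").map(Number);
  if (parts.length < 3 || parts.some(Number.isNaN)) {
    throw new Error("expected a major.minor.revision version string");
  }
  return parts[2];
}
```

So "2.6.4" and "2.6.3" share the 2.6 release series and differ only in revision.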
Before Upgrading 
• Ensure you have an up-to-date backup of your data set. See MongoDB Backup Methods (page 172). 
• Consult the following documents for any special considerations or compatibility issues specific to your 
MongoDB release: 
– The release notes, located at Release Notes (page 725). 
– The documentation for your driver. See http://docs.mongodb.org/manual/applications/drivers. 
• If your installation includes replica sets, plan the upgrade during a predefined maintenance window. 
• Before you upgrade a production environment, use the procedures in this document to upgrade a staging 
environment that reproduces your production environment, to ensure that your production configuration is 
compatible with all changes. 
Upgrade Procedure 
Important: Always backup all of your data before upgrading MongoDB. 
Upgrade each mongod and mongos binary separately, using the procedure described here. When upgrading a binary, 
use the procedure Upgrade a MongoDB Instance (page 219). 
Follow this upgrade procedure: 
1. For deployments that use authentication, first upgrade all of your MongoDB drivers. To upgrade, see the 
documentation for your driver. 
2. Upgrade sharded clusters, as described in Upgrade Sharded Clusters (page 220). 
3. Upgrade any standalone instances. See Upgrade a MongoDB Instance (page 219). 
4. Upgrade any replica sets that are not part of a sharded cluster, as described in Upgrade Replica Sets (page 220). 
Upgrade a MongoDB Instance 
To upgrade a mongod or mongos instance, use one of the following approaches: 
• Upgrade the instance using the operating system’s package management tool and the official MongoDB 
packages. This is the preferred approach. See Install MongoDB (page 5). 
• Upgrade the instance by replacing the existing binaries with new binaries. See Replace the Existing Binaries 
(page 219). 
Replace the Existing Binaries 
Important: Always backup all of your data before upgrading MongoDB. 
This section describes how to upgrade MongoDB by replacing the existing binaries. The preferred approach to an 
upgrade is to use the operating system’s package management tool and the official MongoDB packages, as described 
in Install MongoDB (page 5). 
To upgrade a mongod or mongos instance by replacing the existing binaries: 
1. Download the binaries for the latest MongoDB revision from the MongoDB Download Page (http://downloads.mongodb.org/) and store the 
binaries in a temporary location. The binaries download as compressed files that uncompress to the directory 
structure used by the MongoDB installation. 
2. Shut down the instance. 
3. Replace the existing MongoDB binaries with the downloaded binaries. 
4. Restart the instance. 
Upgrade Sharded Clusters 
To upgrade a sharded cluster: 
1. Disable the cluster’s balancer, as described in Disable the Balancer (page 661). 
2. Upgrade each mongos instance by following the instructions below in Upgrade a MongoDB Instance 
(page 219). You can upgrade the mongos instances in any order. 
3. Upgrade each mongod config server (page 616) individually, starting with the last config server listed in your 
mongos --configdb string and working backward. To keep the cluster online, make sure at least one config 
server is always running. For each config server upgrade, follow the instructions below in Upgrade a MongoDB 
Instance (page 219). 
Example 
Given the following config string: 
mongos --configdb cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019 
You would upgrade the config servers in the following order: 
(a) cfg2.example.net 
(b) cfg1.example.net 
(c) cfg0.example.net 
4. Upgrade each shard. 
• If a shard is a replica set, upgrade the shard using the procedure below titled Upgrade Replica Sets 
(page 220). 
• If a shard is a standalone instance, upgrade the shard using the procedure below titled Upgrade a MongoDB 
Instance (page 219). 
5. Re-enable the balancer, as described in Enable the Balancer (page 661). 
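The config server ordering described in step 3, last entry of the --configdb string first, can be sketched as a one-liner (the function name is illustrative):

```javascript
// Config servers are upgraded starting with the last entry in the
// --configdb string and working backward.
function configServerUpgradeOrder(configdb) {
  return configdb.split(",").reverse();
}
```

Applied to the example config string above, this returns cfg2, then cfg1, then cfg0.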
Upgrade Replica Sets 
To upgrade a replica set, upgrade each member individually, starting with the secondaries and finishing with the 
primary. Plan the upgrade during a predefined maintenance window. 
Upgrade Secondaries Upgrade each secondary separately as follows: 
1. Upgrade the secondary’s mongod binary by following the instructions below in Upgrade a MongoDB Instance 
(page 219). 
2. After upgrading a secondary, wait for the secondary to recover to the SECONDARY state before upgrading the 
next instance. To check the member’s state, issue rs.status() in the mongo shell. 
The secondary may briefly go into STARTUP2 or RECOVERING. This is normal. Make sure to wait for the 
secondary to fully recover to SECONDARY before you continue the upgrade. 
Upgrade the Primary 
1. Step down the primary to initiate the normal failover (page 523) procedure, using one of the following: 
• The rs.stepDown() helper in the mongo shell. 
• The replSetStepDown database command. 
During failover, the set cannot accept writes. Typically this takes 10-20 seconds. Plan the upgrade during a 
predefined maintenance window. 
Note: Stepping down the primary is preferable to directly shutting down the primary. Stepping down expedites 
the failover procedure. 
2. Once the primary has stepped down, call the rs.status() method from the mongo shell until you see that 
another member has assumed the PRIMARY state. 
3. Shut down the original primary and upgrade its instance by following the instructions below in Upgrade a 
MongoDB Instance (page 219). 
Monitor MongoDB With SNMP on Linux 
New in version 2.2. 
Enterprise Feature 
SNMP is only available in MongoDB Enterprise (http://www.mongodb.com/products/mongodb-enterprise). 
Overview 
MongoDB Enterprise can report system information into SNMP traps, to support centralized data collection and 
aggregation. This procedure explains the setup and configuration of a mongod instance as an SNMP subagent, as 
well as initializing and testing of SNMP support with MongoDB Enterprise. 
See also: 
Troubleshoot SNMP (page 224) and Monitor MongoDB Windows with SNMP (page 223) for complete instructions on 
using MongoDB with SNMP on Windows systems. 
Considerations 
Only mongod instances provide SNMP support. mongos and the other MongoDB binaries do not support SNMP. 
Configuration Files 
Changed in version 2.6. 
MongoDB Enterprise contains the following configuration files to support SNMP: 
• MONGOD-MIB.txt: 
The management information base (MIB) file that defines MongoDB’s SNMP output. 
• mongod.conf.subagent: 
The configuration file to run mongod as the SNMP subagent. This file sets SNMP run-time configuration 
options, including the AgentX socket to connect to the SNMP master. 
• mongod.conf.master: 
The configuration file to run mongod as the SNMP master. This file sets SNMP run-time configuration options. 
Procedure 
Step 1: Copy configuration files. Use the following sequence of commands to move the SNMP configuration files 
to the SNMP service configuration directory. 
First, create the SNMP configuration directory if needed and then, from the installation directory, copy the 
configuration files to the SNMP service configuration directory: 
mkdir -p /etc/snmp/ 
cp MONGOD-MIB.txt /usr/share/snmp/mibs/MONGOD-MIB.txt 
cp mongod.conf.subagent /etc/snmp/mongod.conf 
The configuration filename is tool-dependent. For example, when using net-snmp the configuration file is 
snmpd.conf. 
By default SNMP uses UNIX domain sockets for communication between the agent (i.e. snmpd or the master) and 
subagent (i.e. MongoDB). 
Ensure that the agentXAddress specified in the SNMP configuration file for MongoDB matches the 
agentXAddress in the SNMP master configuration file. 
Step 2: Start MongoDB. Start mongod with the snmp-subagent to send data to the SNMP master. 
mongod --snmp-subagent 
Step 3: Confirm SNMP data retrieval. Use snmpwalk to collect data from mongod: 
Connect an SNMP client to verify the ability to collect SNMP data from MongoDB. 
Install the net-snmp (http://www.net-snmp.org/) package, which provides the snmpwalk SNMP client. 
snmpwalk -m /usr/share/snmp/mibs/MONGOD-MIB.txt -v 2c -c mongodb 127.0.0.1:<port> 1.3.6.1.4.1.34601 
<port> refers to the port defined by the SNMP master, not the primary port used by mongod for client communication. 
Optional: Run MongoDB as SNMP Master 
You can run mongod with the snmp-master option for testing purposes. To do this, use the SNMP master 
configuration file instead of the subagent configuration file. From the directory containing the unpacked MongoDB 
installation files: 
cp mongod.conf.master /etc/snmp/mongod.conf 
Additionally, start mongod with the snmp-master option, as in the following: 
mongod --snmp-master 
Monitor MongoDB Windows with SNMP 
New in version 2.6. 
Enterprise Feature 
SNMP is only available in MongoDB Enterprise (http://www.mongodb.com/products/mongodb-enterprise). 
Overview 
MongoDB Enterprise can report system information into SNMP traps, to support centralized data collection and 
aggregation. This procedure explains the setup and configuration of a mongod.exe instance as an SNMP subagent, 
as well as initializing and testing of SNMP support with MongoDB Enterprise. 
See also: 
Monitor MongoDB With SNMP on Linux (page 221) and Troubleshoot SNMP (page 224) for more information. 
Considerations 
Only mongod.exe instances provide SNMP support. mongos.exe and the other MongoDB binaries do not support 
SNMP. 
Configuration Files 
Changed in version 2.6. 
MongoDB Enterprise contains the following configuration files to support SNMP: 
• MONGOD-MIB.txt: 
The management information base (MIB) file that defines MongoDB’s SNMP output. 
• mongod.conf.subagent: 
The configuration file to run mongod.exe as the SNMP subagent. This file sets SNMP run-time configuration 
options, including the AgentX socket to connect to the SNMP master. 
• mongod.conf.master: 
The configuration file to run mongod.exe as the SNMP master. This file sets SNMP run-time configuration 
options. 
Procedure 
Step 1: Copy configuration files. Use the following sequence of commands to move the SNMP configuration files 
to the SNMP service configuration directory. 
First, create the SNMP configuration directory if needed and then, from the installation directory, copy the 
configuration files to the SNMP service configuration directory: 
md C:\snmp\etc\config 
copy MONGOD-MIB.txt C:\snmp\etc\config\MONGOD-MIB.txt 
copy mongod.conf.subagent C:\snmp\etc\config\mongod.conf 
The configuration filename is tool-dependent. For example, when using net-snmp the configuration file is 
snmpd.conf. 
Edit the configuration file to ensure that the communication between the agent (i.e. snmpd or the master) and subagent (i.e. MongoDB) uses TCP.
Ensure that the agentXAddress specified in the SNMP configuration file for MongoDB matches the 
agentXAddress in the SNMP master configuration file. 
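For example, both configuration files might carry the same line, such as the following (the TCP endpoint shown is an illustrative assumption; substitute the address your SNMP master actually uses):

```
agentXAddress tcp:localhost:705
```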
Step 2: Start MongoDB. Start mongod.exe with the snmp-subagent to send data to the SNMP master. 
mongod.exe --snmp-subagent 
Step 3: Confirm SNMP data retrieval. Use snmpwalk to collect data from mongod.exe: 
Connect an SNMP client to verify the ability to collect SNMP data from MongoDB. 
Install the net-snmp79 package, which provides the snmpwalk SNMP client. 
snmpwalk -m C:\snmp\etc\config\MONGOD-MIB.txt -v 2c -c mongodb 127.0.0.1:<port> 1.3.6.1.4.1.34601 
<port> refers to the port defined by the SNMP master, not the primary port used by mongod.exe for client 
communication. 
Optional: Run MongoDB as SNMP Master 
You can run mongod.exe with the snmp-master option for testing purposes. To do this, use the SNMP master 
configuration file instead of the subagent configuration file. From the directory containing the unpacked MongoDB 
installation files: 
copy mongod.conf.master C:\snmp\etc\config\mongod.conf 
Additionally, start mongod.exe with the snmp-master option, as in the following: 
mongod.exe --snmp-master 
Troubleshoot SNMP 
New in version 2.6. 
Enterprise Feature 
SNMP is only available in MongoDB Enterprise. 
Overview 
MongoDB Enterprise can report system information into SNMP traps, to support centralized data collection and 
aggregation. This document identifies common problems you may encounter when deploying MongoDB Enterprise 
with SNMP as well as possible solutions for these issues. 
See Monitor MongoDB With SNMP on Linux (page 221) and Monitor MongoDB Windows with SNMP (page 223) for 
complete installation instructions. 
79http://www.net-snmp.org/ 
Issues 
Failed to Connect The following in the mongod logfile: 
Warning: Failed to connect to the agentx master agent 
AgentX is the SNMP agent extensibility protocol defined in Internet RFC 274180. It explains how to define additional 
data to monitor over SNMP. When MongoDB fails to connect to the agentx master agent, use the following procedure 
to ensure that the SNMP subagent can connect properly to the SNMP master. 
1. Make sure the master agent is running. 
2. Compare the SNMP master’s configuration file with the subagent configuration file. Ensure that the agentx 
socket definition is the same between the two. 
3. Check the SNMP configuration files to see if they specify using UNIX Domain Sockets. If so, confirm that the 
mongod has appropriate permissions to open a UNIX domain socket. 
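The comparison in step 2 can be sketched as follows. This is a self-contained illustration using stand-in files in a temporary directory; the tcp:localhost:705 endpoint is an assumption. The point is that grep over both files should yield exactly one distinct agentXAddress value:

```shell
# Write two stand-in config files and confirm that master and subagent
# name the same agentx socket (tcp:localhost:705 is an assumption).
tmp=$(mktemp -d)
echo "agentXAddress tcp:localhost:705" > "$tmp/snmpd.conf"    # SNMP master
echo "agentXAddress tcp:localhost:705" > "$tmp/mongod.conf"   # MongoDB subagent
distinct=$(grep -h '^agentXAddress' "$tmp/snmpd.conf" "$tmp/mongod.conf" | sort -u | wc -l)
[ "$distinct" -eq 1 ] && echo "agentx addresses match"
```

Run the same grep against your real master and subagent configuration files; more than one distinct line means the socket definitions disagree.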
Error Parsing Command Line One of the following errors at the command line: 
Error parsing command line: unknown option snmp-master 
try 'mongod --help' for more information 
Error parsing command line: unknown option snmp-subagent 
try 'mongod --help' for more information 
mongod binaries that are not part of the Enterprise Edition produce this error. Install the Enterprise Edition (page 24) 
and attempt to start mongod again. 
Other MongoDB binaries, including mongos, will produce this error if you attempt to start them with snmp-master 
or snmp-subagent. Only mongod supports SNMP. 
Error Starting SNMPAgent The following line in the log file indicates that mongod cannot read the 
mongod.conf file: 
[SNMPAgent] warning: error starting SNMPAgent as master err:1 
If running on Linux, ensure mongod.conf exists in the /etc/snmp directory, and ensure that the mongod UNIX 
user has permission to read the mongod.conf file. 
If running on Windows, ensure mongod.conf exists in C:\snmp\etc\config. 
MongoDB Tutorials 
This page lists the tutorials available as part of the MongoDB Manual. In addition to these documents, you can refer 
to the introductory MongoDB Tutorial (page 43). If there is a process or pattern that you would like to see included 
here, please open a Jira Case81. 
Getting Started 
• Install MongoDB on Linux Systems (page 14) 
• Install MongoDB on Red Hat Enterprise, CentOS, Fedora, or Amazon Linux (page 6) 
80http://www.ietf.org/rfc/rfc2741.txt 
81https://jira.mongodb.org/browse/DOCS 
• Install MongoDB on Debian (page 12) 
• Install MongoDB on Ubuntu (page 9) 
• Install MongoDB on OS X (page 16) 
• Install MongoDB on Windows (page 19) 
• Getting Started with MongoDB (page 43) 
• Generate Test Data (page 47) 
Administration 
Replica Sets 
• Deploy a Replica Set (page 545) 
• Deploy Replica Set and Configure Authentication and Authorization (page 313) 
• Convert a Standalone to a Replica Set (page 556) 
• Add Members to a Replica Set (page 557) 
• Remove Members from Replica Set (page 560) 
• Replace a Replica Set Member (page 561) 
• Adjust Priority for Replica Set Member (page 562) 
• Resync a Member of a Replica Set (page 575) 
• Deploy a Geographically Redundant Replica Set (page 550) 
• Change the Size of the Oplog (page 570) 
• Force a Member to Become Primary (page 573) 
• Change Hostnames in a Replica Set (page 584) 
• Add an Arbiter to Replica Set (page 555) 
• Convert a Secondary to an Arbiter (page 568) 
• Configure a Secondary’s Sync Target (page 587) 
• Configure a Delayed Replica Set Member (page 566) 
• Configure a Hidden Replica Set Member (page 565) 
• Configure Non-Voting Replica Set Member (page 567) 
• Prevent Secondary from Becoming Primary (page 563) 
• Configure Replica Set Tag Sets (page 576) 
• Manage Chained Replication (page 583) 
• Reconfigure a Replica Set with Unavailable Members (page 580) 
• Recover Data after an Unexpected Shutdown (page 246) 
• Troubleshoot Replica Sets (page 588) 
Sharding 
• Deploy a Sharded Cluster (page 635) 
• Convert a Replica Set to a Replicated Sharded Cluster (page 643) 
• Add Shards to a Cluster (page 642) 
• Remove Shards from an Existing Sharded Cluster (page 663) 
• Deploy Three Config Servers for Production Deployments (page 643) 
• Migrate Config Servers with the Same Hostname (page 652) 
• Migrate Config Servers with Different Hostnames (page 652) 
• Replace Disabled Config Server (page 653) 
• Migrate a Sharded Cluster to Different Hardware (page 654) 
• Backup Cluster Metadata (page 657) 
• Backup a Small Sharded Cluster with mongodump (page 238) 
• Backup a Sharded Cluster with Filesystem Snapshots (page 239) 
• Backup a Sharded Cluster with Database Dumps (page 241) 
• Restore a Single Shard (page 243) 
• Restore a Sharded Cluster (page 244) 
• Schedule Backup Window for Sharded Clusters (page 243) 
• Manage Shard Tags (page 672) 
Basic Operations 
• Use Database Commands (page 206) 
• Recover Data after an Unexpected Shutdown (page 246) 
• Expire Data from Collections by Setting TTL (page 198) 
• Analyze Performance of Database Operations (page 210) 
• Rotate Log Files (page 214) 
• Build Old Style Indexes (page 471) 
• Manage mongod Processes (page 207) 
• Back Up and Restore with MongoDB Tools (page 234) 
• Backup and Restore with Filesystem Snapshots (page 229) 
Security 
• Configure Linux iptables Firewall for MongoDB (page 297) 
• Configure Windows netsh Firewall for MongoDB (page 300) 
• Enable Client Access Control (page 317) 
• Create a User Administrator (page 343) 
• Add a User to a Database (page 344) 
• Create a Role (page 347) 
• Modify a User’s Access (page 352) 
• View Roles (page 353) 
• Generate a Key File (page 338) 
• Configure MongoDB with Kerberos Authentication on Linux (page 331) 
• Create a Vulnerability Report (page 359) 
Development Patterns 
• Perform Two Phase Commits (page 102) 
• Isolate Sequence of Operations (page 111) 
• Create an Auto-Incrementing Sequence Field (page 113) 
• Enforce Unique Keys for Sharded Collections (page 674) 
• Aggregation Examples (page 403) 
• Model Data to Support Keyword Search (page 155) 
• Limit Number of Elements in an Array after an Update (page 116) 
• Perform Incremental Map-Reduce (page 413) 
• Troubleshoot the Map Function (page 415) 
• Troubleshoot the Reduce Function (page 416) 
• Store a JavaScript Function on the Server (page 217) 
Text Search Patterns 
• Create a text Index (page 486) 
• Specify a Language for Text Index (page 487) 
• Specify Name for text Index (page 489) 
• Control Search Results with Weights (page 490) 
• Limit the Number of Entries Scanned (page 491) 
Data Modeling Patterns 
• Model One-to-One Relationships with Embedded Documents (page 140) 
• Model One-to-Many Relationships with Embedded Documents (page 141) 
• Model One-to-Many Relationships with Document References (page 143) 
• Model Data for Atomic Operations (page 154) 
• Model Tree Structures with Parent References (page 146) 
• Model Tree Structures with Child References (page 148) 
• Model Tree Structures with Materialized Paths (page 151) 
• Model Tree Structures with Nested Sets (page 153) 
5.2.2 Backup and Recovery 
The following tutorials describe backup and restoration for a mongod instance: 
Backup and Restore with Filesystem Snapshots (page 229) An outline of procedures for creating MongoDB data set 
backups using a system-level file snapshot tool, such as LVM or native storage appliance tools. 
Restore a Replica Set from MongoDB Backups (page 232) Describes the procedure for restoring a replica set from an 
archived backup such as a mongodump or MMS Backup82 file. 
Back Up and Restore with MongoDB Tools (page 234) The procedure for writing the contents of a database to a 
BSON (i.e. binary) dump file for backing up MongoDB databases. 
Backup and Restore Sharded Clusters (page 238) Detailed procedures and considerations for backing up sharded 
clusters and single shards. 
Recover Data after an Unexpected Shutdown (page 246) Recover data from MongoDB data files that were not properly closed or have an invalid state. 
Backup and Restore with Filesystem Snapshots 
This document describes a procedure for creating backups of MongoDB systems using system-level tools, such as 
LVM or storage appliance, as well as the corresponding restoration strategies. 
These filesystem snapshots, or “block-level” backup methods, use system-level tools to create copies of the device that 
holds MongoDB’s data files. These methods complete quickly and work reliably, but require more system configuration outside of MongoDB. 
See also: 
MongoDB Backup Methods (page 172) and Back Up and Restore with MongoDB Tools (page 234). 
Snapshots Overview 
Snapshots work by creating pointers between the live data and a special snapshot volume. These pointers are theoretically equivalent to “hard links.” As the working data diverges from the snapshot, the snapshot process uses a 
copy-on-write strategy. As a result the snapshot only stores modified data. 
After making the snapshot, you mount the snapshot image on your file system and copy data from the snapshot. The 
resulting backup contains a full copy of all data. 
Snapshots have the following limitations: 
• The database must be valid when the snapshot takes place. This means that all writes accepted by the database 
need to be fully written to disk: either to the journal or to data files. 
If all writes are not on disk when the backup occurs, the backup will not reflect these changes. If writes are in 
progress when the backup occurs, the data files will reflect an inconsistent state. With journaling all data-file 
states resulting from in-progress writes are recoverable; without journaling you must flush all pending writes 
to disk before running the backup operation and must ensure that no writes occur during the entire backup 
procedure. 
If you do use journaling, the journal must reside on the same volume as the data. 
• Snapshots create an image of an entire disk. Unless you need to back up your entire system, consider 
isolating your MongoDB data files, journal (if applicable), and configuration on one logical disk that doesn’t 
contain any other data. 
82https://mms.mongodb.com/?pk_campaign=mongodb-docs-admin-tutorials 
Alternately, store all MongoDB data files on a dedicated device so that you can make backups without duplicating extraneous data. 
• Ensure that you copy data from snapshots onto other systems so that data is safe from site failures. 
• Although different snapshot methods provide different capabilities, the LVM method outlined below does not 
provide any capacity for capturing incremental backups. 
Snapshots With Journaling If your mongod instance has journaling enabled, then you can use any kind of file 
system or volume/block level snapshot tool to create backups. 
If you manage your own infrastructure on a Linux-based system, configure your system with LVM to provide your disk 
packages and provide snapshot capability. You can also use LVM-based setups within a cloud/virtualized environment. 
Note: Running LVM provides additional flexibility and enables the possibility of using snapshots to back up MongoDB. 
Snapshots with Amazon EBS in a RAID 10 Configuration If your deployment depends on Amazon’s Elastic 
Block Storage (EBS) with RAID configured within your instance, it is impossible to get a consistent state across all 
disks using the platform’s snapshot tool. As an alternative, you can do one of the following: 
• Flush all writes to disk and create a write lock to ensure consistent state during the backup process. 
If you choose this option, see Create Backups on Instances that do not have Journaling Enabled (page 232). 
• Configure LVM to run and hold your MongoDB data files on top of the RAID within your system. 
If you choose this option, perform the LVM backup operation described in Create a Snapshot (page 230). 
Backup and Restore Using LVM on a Linux System 
This section provides an overview of a simple backup process using LVM on a Linux system. While the tools, commands, and paths may be (slightly) different on your system, the following steps provide a high-level overview of the 
backup operation. 
Note: Only use the following procedure as a guideline for a backup system and infrastructure. Production backup 
systems must consider a number of application specific requirements and factors unique to specific environments. 
Create a Snapshot To create a snapshot with LVM, issue a command as root in the following format: 
lvcreate --size 100M --snapshot --name mdb-snap01 /dev/vg0/mongodb 
This command creates an LVM snapshot (with the --snapshot option) named mdb-snap01 of the mongodb 
volume in the vg0 volume group. 
This example creates a snapshot named mdb-snap01 located at /dev/vg0/mdb-snap01. 
The location and paths to your system’s volume groups and devices may vary slightly depending on your operating 
system’s LVM configuration. 
The snapshot has a cap of 100 megabytes, because of the parameter --size 100M. This size does not 
reflect the total amount of the data on the disk, but rather the quantity of differences between the current 
state of /dev/vg0/mongodb and the creation of the snapshot (i.e. /dev/vg0/mdb-snap01). 
Warning: Ensure that you create snapshots with enough space to account for data growth, particularly for the 
period of time that it takes to copy data out of the system or to a temporary image. 
If your snapshot runs out of space, the snapshot image becomes unusable. Discard this logical volume and create 
another. 
The snapshot will exist when the command returns. You can restore directly from the snapshot at any time or by 
creating a new logical volume and restoring from this snapshot to the alternate image. 
While snapshots are great for creating high quality backups very quickly, they are not ideal as a format for storing 
backup data. Snapshots typically depend and reside on the same storage infrastructure as the original disk images. 
Therefore, it’s crucial that you archive these snapshots and store them elsewhere. 
Archive a Snapshot After creating a snapshot, mount the snapshot and copy the data to separate storage. Your 
system might try to compress the backup images as you move them offline. Alternatively, take a block level copy of the 
snapshot image, such as with the following procedure: 
umount /dev/vg0/mdb-snap01 
dd if=/dev/vg0/mdb-snap01 | gzip > mdb-snap01.gz 
The above command sequence does the following: 
• Ensures that the /dev/vg0/mdb-snap01 device is not mounted. 
Never take a block level copy of a filesystem or filesystem snapshot that is mounted. 
• Performs a block level copy of the entire snapshot image using the dd command and compresses the result in a 
gzipped file in the current working directory. 
Warning: This command will create a large gz file in your current working directory. Make sure that you 
run this command in a file system that has enough free space. 
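The archive-and-restore pattern above can be exercised safely on an ordinary file standing in for the snapshot device. This self-contained sketch (temporary files only, no LVM required) mirrors the dd | gzip sequence and verifies the roundtrip:

```shell
# Use a plain file as a stand-in for /dev/vg0/mdb-snap01 to illustrate the
# block copy, compression, and restore roundtrip.
tmp=$(mktemp -d)
dd if=/dev/urandom of="$tmp/snap.img" bs=1024 count=64 2>/dev/null   # fake "device"
dd if="$tmp/snap.img" 2>/dev/null | gzip > "$tmp/snap.img.gz"        # archive + compress
gzip -d -c "$tmp/snap.img.gz" | dd of="$tmp/restored.img" 2>/dev/null  # restore
cmp -s "$tmp/snap.img" "$tmp/restored.img" && echo "roundtrip verified"
```

Against a real snapshot, the only change is substituting the logical volume path for the stand-in file, as shown in the commands above.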
Restore a Snapshot To restore a snapshot created with the above method, issue the following sequence of commands: 
lvcreate --size 1G --name mdb-new vg0 
gzip -d -c mdb-snap01.gz | dd of=/dev/vg0/mdb-new 
mount /dev/vg0/mdb-new /srv/mongodb 
The above sequence does the following: 
• Creates a new logical volume named mdb-new, in the /dev/vg0 volume group. The path to the new device will be /dev/vg0/mdb-new. 
Warning: This volume will have a maximum size of 1 gigabyte. The original file system must have had a 
total size of 1 gigabyte or smaller, or else the restoration will fail. 
Change 1G to your desired volume size. 
• Uncompresses and unarchives the mdb-snap01.gz into the mdb-new disk image. 
• Mounts the mdb-new disk image to the /srv/mongodb directory. Modify the mount point to correspond to 
your MongoDB data file location, or other location as needed. 
Note: The restored snapshot will have a stale mongod.lock file. If you do not remove this file from the snapshot, 
MongoDB may assume that the stale lock file indicates an unclean shutdown. If you’re running with 
storage.journal.enabled, and you do not use db.fsyncLock(), you do not need to remove 
the mongod.lock file. If you use db.fsyncLock(), you will need to remove the lock. 
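As a self-contained sketch of that cleanup, with a temporary directory standing in for your real dbpath:

```shell
# A restored snapshot may carry a stale mongod.lock; remove it before
# starting mongod when your configuration requires it.
dbpath=$(mktemp -d)              # stand-in for e.g. /srv/mongodb
touch "$dbpath/mongod.lock"      # stale lock left behind by the snapshot
rm -f "$dbpath/mongod.lock"
[ ! -e "$dbpath/mongod.lock" ] && echo "stale lock removed"
```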
Restore Directly from a Snapshot To restore a backup without writing to a compressed gz file, use the following 
sequence of commands: 
umount /dev/vg0/mdb-snap01 
lvcreate --size 1G --name mdb-new vg0 
dd if=/dev/vg0/mdb-snap01 of=/dev/vg0/mdb-new 
mount /dev/vg0/mdb-new /srv/mongodb 
Remote Backup Storage You can implement off-system backups using the combined process (page 232) and SSH. 
This sequence is identical to procedures explained above, except that it archives and compresses the backup on a 
remote system using SSH. 
Consider the following procedure: 
umount /dev/vg0/mdb-snap01 
dd if=/dev/vg0/mdb-snap01 | ssh username@example.com gzip > /opt/backup/mdb-snap01.gz 
lvcreate --size 1G --name mdb-new vg0 
ssh username@example.com gzip -d -c /opt/backup/mdb-snap01.gz | dd of=/dev/vg0/mdb-new 
mount /dev/vg0/mdb-new /srv/mongodb 
Create Backups on Instances that do not have Journaling Enabled 
If your mongod instance does not run with journaling enabled, or if your journal is on a separate volume, obtaining a 
functional backup of a consistent state is more complicated. As described in this section, you must flush all writes to 
disk and lock the database to prevent writes during the backup process. If you have a replica set configuration, then 
for your backup use a secondary that is not receiving reads (i.e. a hidden member). 
Step 1: Flush writes to disk and lock the database to prevent further writes. To flush writes to disk and to “lock” 
the database, issue the db.fsyncLock() method in the mongo shell: 
db.fsyncLock(); 
Step 2: Perform the backup operation described in Create a Snapshot. 
Step 3: After the snapshot completes, unlock the database. To unlock the database after the snapshot has completed, use the following command in the mongo shell: 
db.fsyncUnlock(); 
Changed in version 2.2: When used in combination with fsync or db.fsyncLock(), mongod may block some 
reads, including those from mongodump, when a queued write operation waits behind the fsync lock. 
Restore a Replica Set from MongoDB Backups 
This procedure outlines the process for taking MongoDB data and restoring that data into a new replica set. Use this 
approach for seeding test deployments from production backups as well as part of disaster recovery. 
You cannot restore a single data set to three new mongod instances and then create a replica set. In this situation 
MongoDB will force the secondaries to perform an initial sync. The procedures in this document describe the correct 
and efficient ways to deploy a replica set. 
Restore Database into a Single Node Replica Set 
Step 1: Obtain backup MongoDB Database files. The backup files may come from a file system snapshot 
(page 229). The MongoDB Management Service (MMS)83 produces MongoDB database files for stored snapshots84 
and point-in-time snapshots85. You can also use mongorestore to restore database files using data created with 
mongodump. See Back Up and Restore with MongoDB Tools (page 234) for more information. 
Step 2: Start a mongod using data files from the backup as the data path. The following example uses 
/data/db as the data path, as specified in the dbpath setting: 
mongod --dbpath /data/db 
Step 3: Convert the standalone mongod to a single-node replica set Convert the standalone mongod process to 
a single-node replica set by shutting down the mongod instance, and restarting it with the --replSet option, as in 
the following example: 
mongod --dbpath /data/db --replSet <replName> 
Optionally, you can explicitly set an oplogSizeMB to control the size of the oplog created for this replica set member. 
Step 4: Connect to the mongod instance. For example, use the following command to connect to a mongod instance 
running on the localhost interface: 
mongo 
Step 5: Initiate the new replica set. Use rs.initiate() to initiate the new replica set, as in the following 
example: 
rs.initiate() 
Add Members to the Replica Set 
MongoDB provides two options for restoring secondary members of a replica set: 
• Manually copy the database files to each data directory. 
• Allow initial sync (page 537) to distribute data automatically. 
The following sections outline both approaches. 
Note: If your database is large, initial sync can take a long time to complete. For large databases, it might be 
preferable to copy the database files onto each host. 
Copy Database Files and Restart mongod Instance Use the following sequence of operations to “seed” additional 
members of the replica set with the restored data by copying MongoDB data files directly. 
Step 1: Shut down the mongod instance that you restored. Use --shutdown or db.shutdownServer() 
to ensure a clean shut down. 
83https://mms.mongodb.com/?pk_campaign=mongodb-docs-restore-rs-tutorial 
84https://mms.mongodb.com/help/backup/tutorial/restore-from-snapshot/ 
85https://mms.mongodb.com/help/backup/tutorial/restore-from-point-in-time-snapshot/ 
Step 2: Copy the primary’s data directory to each secondary. Copy the primary’s data directory into the dbPath 
of the other members of the replica set. The dbPath is /data/db by default. 
Step 3: Start the mongod instance that you restored. 
Step 4: Add the secondaries to the replica set. In a mongo shell connected to the primary, add the secondaries to 
the replica set using rs.add(). See Deploy a Replica Set (page 545) for more information about deploying a replica 
set. 
Update Secondaries using Initial Sync Use the following sequence of operations to “seed” additional members of 
the replica set with the restored data using the default initial sync operation. 
Step 1: Ensure that the data directories on the prospective replica set members are empty. 
Step 2: Add each prospective member to the replica set. When you add a member to the replica set, Initial Sync 
(page 537) copies the data from the primary to the new member. 
Back Up and Restore with MongoDB Tools 
This document describes the process for writing and restoring backups to files in binary format with the mongodump 
and mongorestore tools. 
Use these tools for backups if other backup methods, such as the MMS Backup Service86 or file system snapshots 
(page 229), are unavailable. 
See also: 
MongoDB Backup Methods (page 172), http://docs.mongodb.org/manualreference/program/mongodump, 
and http://docs.mongodb.org/manualreference/program/mongorestore. 
Backup a Database with mongodump 
mongodump does not dump the content of the local database. 
To backup all the databases in a cluster via mongodump, you should have the backup (page 367) role. The backup 
(page 367) role provides all the needed privileges for backing up all databases. The role confers no additional access, 
in keeping with the policy of least privilege. 
To backup a given database, you must have read access on the database. Several roles provide this access, including 
the backup (page 367) role. 
To backup the system.profile collection in a database, you must have read access on certain system collections 
in the database. Several roles provide this access, including the clusterAdmin (page 364) and dbAdmin 
(page 363) roles. 
Changed in version 2.6. 
To backup users and user-defined roles (page 286) for a given database, you must have access to the admin database. 
MongoDB stores the user data and role definitions for all databases in the admin database. 
86https://mms.mongodb.com/?pk_campaign=mongodb-docs-tools 
Specifically, to backup a given database’s users, you must have the find (page 375) action (page 375) 
on the admin database’s admin.system.users (page 271) collection. The backup (page 367) and 
userAdminAnyDatabase (page 368) roles both provide this privilege. 
To backup the user-defined roles on a database, you must have the find (page 375) action on the admin database’s 
admin.system.roles (page 270) collection. Both the backup (page 367) and userAdminAnyDatabase 
(page 368) roles provide this privilege. 
Basic mongodump Operations The mongodump utility can back up data by either: 
• connecting to a running mongod or mongos instance, or 
• accessing data files without an active instance. 
The utility can create a backup for an entire server, database or collection, or can use a query to backup just part of a 
collection. 
When you run mongodump without any arguments, the command connects to the MongoDB instance on the local 
system (e.g. 127.0.0.1 or localhost) on port 27017 and creates a database backup named dump/ in the 
current directory. 
To backup data from a mongod or mongos instance running on the same machine and on the default port of 27017, 
use the following command: 
mongodump 
The data format used by mongodump from version 2.2 or later is incompatible with earlier versions of mongod. Do 
not use recent versions of mongodump to back up older data stores. 
You can also specify the --host and --port of the MongoDB instance that the mongodump should connect to. 
For example: 
mongodump --host mongodb.example.net --port 27017 
mongodump will write BSON files that hold a copy of data accessible via the mongod listening on port 27017 of 
the mongodb.example.net host. See Create Backups from Non-Local mongod Instances (page 236) for more 
information. 
To use mongodump without a running MongoDB instance, specify the --dbpath option to read directly from 
MongoDB data files. See Create Backups Without a Running mongod Instance (page 236) for details. 
To specify a different output directory, you can use the --out or -o option: 
mongodump --out /data/backup/ 
To limit the amount of data included in the database dump, you can specify --db and --collection as options to 
mongodump. For example: 
mongodump --collection myCollection --db test 
This operation creates a dump of the collection named myCollection from the database test in a dump/ subdirectory of the current working directory. 
mongodump overwrites output files if they exist in the backup data folder. Before running the mongodump command 
multiple times, either ensure that you no longer need the files in the output folder (the default is the dump/ folder) or 
rename the folders or files. 
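One common way to avoid overwriting an earlier dump is to compute a fresh, timestamped output directory for each run. In this sketch, the /data/backup prefix and the date format are assumptions; the mongodump invocation is shown but not executed:

```shell
# Build a per-run output directory name so repeated backups never collide,
# then pass it to mongodump with --out.
out="/data/backup/dump-$(date +%Y-%m-%d-%H%M%S)"
echo "mongodump --out $out"
```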
Point in Time Operation Using Oplogs Use the --oplog option with mongodump to collect the oplog entries to 
build a point-in-time snapshot of a database within a replica set. With --oplog, mongodump copies all the data from 
the source database as well as all of the oplog entries from the beginning of the backup procedure until the backup 
procedure completes. This backup procedure, in conjunction with mongorestore --oplogReplay, allows you 
to restore a backup that reflects the specific moment in time that corresponds to when mongodump completed creating 
the dump file. 
Create Backups Without a Running mongod Instance If your MongoDB instance is not running, you can use 
the --dbpath option to specify the location to your MongoDB instance’s database files. mongodump reads from 
the data files directly with this operation. This locks the data directory to prevent conflicting writes. The mongod 
process must not be running or attached to these data files when you run mongodump in this configuration. Consider 
the following example: 
Given a MongoDB instance that contains the customers, products, and suppliers databases, the following 
mongodump operation backs up the databases using the --dbpath option, which specifies the location of the 
database files on the host: 
mongodump --dbpath /data -o dataout 
The --out or -o option allows you to specify the directory where mongodump will save the backup. 
mongodump creates a separate backup directory for each of the backed up databases: dataout/customers, 
dataout/products, and dataout/suppliers. 
Create Backups from Non-Local mongod Instances The --host and --port options for mongodump allow 
you to connect to and backup from a remote host. Consider the following example: 
mongodump --host mongodb1.example.net --port 3017 --username user --password pass --out /opt/backup/mongodump- 
On any mongodump command you may, as above, specify username and password credentials for database 
authentication. 
Restore a Database with mongorestore 
Changed in version 2.6. 
To restore users and user-defined roles (page 286) on a given database, you must have access to the admin database. 
MongoDB stores the user data and role definitions for all databases in the admin database. 
Specifically, to restore users to a given database, you must have the insert (page 375) action (page 375) on the 
admin database’s admin.system.users (page 271) collection. The restore (page 367) role provides this 
privilege. 
To restore user-defined roles to a database, you must have the insert (page 375) action on the admin database’s 
admin.system.roles (page 270) collection. The restore (page 367) role provides this privilege. 
Basic mongorestore Operations The mongorestore utility restores a binary backup created by 
mongodump. By default, mongorestore looks for a database backup in the dump/ directory. 
The mongorestore utility can restore data either by: 
• connecting to a running mongod or mongos directly, or 
• writing to a set of MongoDB data files without use of a running mongod. 
mongorestore can restore either an entire database backup or a subset of the backup. 
To use mongorestore to connect to an active mongod or mongos, use a command with the following prototype 
form: 
236 Chapter 5. Administration
MongoDB Documentation, Release 2.6.4 
mongorestore --port <port number> <path to the backup> 
To use mongorestore to write to data files without using a running mongod, use a command with the following 
prototype form: 
mongorestore --dbpath <database path> <path to the backup> 
Consider the following example: 
mongorestore dump-2013-10-25/ 
Here, mongorestore imports the database backup in the dump-2013-10-25 directory to the mongod instance 
running on the localhost interface. 
Restore Point in Time Oplog Backup If you created your database dump using the --oplog option to ensure a 
point-in-time snapshot, call mongorestore with the --oplogReplay option, as in the following example: 
mongorestore --oplogReplay 
You may also consider using the mongorestore --objcheck option to check the integrity of objects while 
inserting them into the database, or you may consider the mongorestore --drop option to drop each collection 
from the database before restoring from backups. 
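The options can be combined. The following sketch, which reuses the dump directory from the earlier example, drops each collection before restoring it and validates each object as it is inserted:

```shell
# Drop existing collections, then restore while checking object integrity.
mongorestore --drop --objcheck dump-2013-10-25/
```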
Restore a Subset of data from a Binary Database Dump mongorestore also includes the ability to apply a filter to 
all input before inserting it into the new database. Consider the following example: 
mongorestore --filter '{"field": 1}' 
Here, mongorestore only adds documents to the database from the dump located in the dump/ folder if the 
documents have a field named field that holds a value of 1. Enclose the filter in single quotes (e.g. ') to prevent the 
filter from interacting with your shell environment. 
Restore Without a Running mongod mongorestore can write data to MongoDB data files without needing to 
connect to a mongod directly. 
Example 
Restore a Database Without a Running mongod 
Given a set of backed up databases in the /data/backup/ directory: 
• /data/backup/customers, 
• /data/backup/products, and 
• /data/backup/suppliers 
The following mongorestore command restores the products database. The command uses the --dbpath 
option to specify the path to the MongoDB data files: 
mongorestore --dbpath /data/db --journal /data/backup/products 
mongorestore imports the database backup in the /data/backup/products directory to the mongod 
instance that runs on the localhost interface. The mongorestore operation imports the backup even if the mongod 
is not running. 
The --journal option ensures that mongorestore records all operations in the durability journal. The journal 
prevents data file corruption if anything (e.g. power failure, disk failure, etc.) interrupts the restore operation. 
See also: 
http://docs.mongodb.org/manual/reference/program/mongodump and 
http://docs.mongodb.org/manual/reference/program/mongorestore. 
Restore Backups to Non-Local mongod Instances By default, mongorestore connects to a MongoDB instance 
running on the localhost interface (e.g. 127.0.0.1) and on the default port (27017). If you want to restore to a 
different host or port, use the --host and --port options. 
Consider the following example: 
mongorestore --host mongodb1.example.net --port 3017 --username user --password pass /opt/backup/mongodump- 
As above, you may specify username and password credentials if your mongod requires authentication. 
Backup and Restore Sharded Clusters 
The following tutorials describe backup and restoration for sharded clusters: 
Backup a Small Sharded Cluster with mongodump (page 238) If your sharded cluster holds a small data set, you 
can use mongodump to capture the entire backup in a reasonable amount of time. 
Backup a Sharded Cluster with Filesystem Snapshots (page 239) Use file system snapshots to back up each component 
in the sharded cluster individually. The procedure involves stopping the cluster balancer. If your system 
configuration allows file system backups, this might be more efficient than using MongoDB tools. 
Backup a Sharded Cluster with Database Dumps (page 241) Create backups using mongodump to back up each 
component in the cluster individually. 
Schedule Backup Window for Sharded Clusters (page 243) Limit the operation of the cluster balancer to provide a 
window for regular backup operations. 
Restore a Single Shard (page 243) An outline of the procedure and consideration for restoring a single shard from a 
backup. 
Restore a Sharded Cluster (page 244) An outline of the procedure and consideration for restoring an entire sharded 
cluster from backup. 
Backup a Small Sharded Cluster with mongodump 
Overview If your sharded cluster holds a small data set, you can connect to a mongos using mongodump. You can 
create backups of your MongoDB cluster, if your backup infrastructure can capture the entire backup in a reasonable 
amount of time and if you have a storage system that can hold the complete MongoDB data set. 
See MongoDB Backup Methods (page 172) and Backup and Restore Sharded Clusters (page 238) for complete information 
on backups in MongoDB and backups of sharded clusters in particular. 
Important: By default, mongodump issues its queries to the non-primary nodes. 
To back up all the databases in a cluster via mongodump, you should have the backup (page 367) role. The backup 
(page 367) role provides all the needed privileges for backing up all databases. The role confers no additional access, 
in keeping with the policy of least privilege. 
To back up a given database, you must have read access on the database. Several roles provide this access, including 
the backup (page 367) role. 
To back up the system.profile collection in a database, you must have read access on certain system collections 
in the database. Several roles provide this access, including the clusterAdmin (page 364) and dbAdmin 
(page 363) roles. 
Changed in version 2.6. 
To back up users and user-defined roles (page 286) for a given database, you must have access to the admin database. 
MongoDB stores the user data and role definitions for all databases in the admin database. 
Specifically, to back up a given database’s users, you must have the find (page 375) action (page 375) 
on the admin database’s admin.system.users (page 271) collection. The backup (page 367) and 
userAdminAnyDatabase (page 368) roles both provide this privilege. 
To back up the user-defined roles on a database, you must have the find (page 375) action on the admin database’s 
admin.system.roles (page 270) collection. Both the backup (page 367) and userAdminAnyDatabase 
(page 368) roles provide this privilege. 
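As a sketch (the user name is hypothetical), a user administrator could grant the backup role from the system prompt:

```shell
# Hypothetical example: grant an existing user the built-in "backup" role,
# sufficient to run mongodump against all databases.
mongo admin --eval 'db.grantRolesToUser( "backupUser",
    [ { role: "backup", db: "admin" } ] )'
```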
Considerations If you use mongodump without specifying a database or collection, mongodump will capture 
collection data and the cluster metadata from the config servers (page 616). 
You cannot use the --oplog option for mongodump when capturing data from mongos. As a result, if you need 
to capture a backup that reflects a single moment in time, you must stop all writes to the cluster for the duration of the 
backup operation. 
Procedure 
Capture Data You can perform a backup of a sharded cluster by connecting mongodump to a mongos. Use the 
following operation at your system’s prompt: 
mongodump --host mongos3.example.net --port 27017 
mongodump will write BSON files that hold a copy of data stored in the sharded cluster accessible via the mongos 
listening on port 27017 of the mongos3.example.net host. 
Restore Data Backups created with mongodump do not reflect the chunks or the distribution of data in the sharded 
collection or collections. Like all mongodump output, these backups contain separate directories for each database 
and BSON files for each collection in that database. 
You can restore mongodump output to any MongoDB instance, including a standalone, a replica set, or a new sharded 
cluster. When restoring data to a sharded cluster, you must deploy and configure sharding before restoring data from 
the backup. See Deploy a Sharded Cluster (page 635) for more information. 
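For example, a minimal sketch of preparing the target cluster before restoring (the database name, collection, and shard key are illustrative, not from this manual):

```shell
# Enable sharding on the target database and shard the collection before
# running mongorestore against the mongos.
mongo --eval 'sh.enableSharding("records")'
mongo --eval 'sh.shardCollection("records.people", { zipcode: 1 })'
```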
Backup a Sharded Cluster with Filesystem Snapshots 
Overview This document describes a procedure for taking a backup of all components of a sharded cluster. This procedure 
uses file system snapshots to capture a copy of the mongod instance. An alternate procedure uses mongodump 
to create binary database dumps when file-system snapshots are not available. See Backup a Sharded Cluster with 
Database Dumps (page 241) for the alternate procedure. 
See MongoDB Backup Methods (page 172) and Backup and Restore Sharded Clusters (page 238) for complete information 
on backups in MongoDB and backups of sharded clusters in particular. 
Important: To capture a point-in-time backup from a sharded cluster you must stop all writes to the cluster. On a 
running production system, you can only capture an approximation of a point-in-time snapshot. 
Considerations 
Balancing It is essential that you stop the balancer before capturing a backup. 
If the balancer is active while you capture backups, the backup artifacts may be incomplete and/or have duplicate data, 
as chunks may migrate while recording backups. 
Precision In this procedure, you will stop the cluster balancer and take a backup of the config database, and 
then take backups of each shard in the cluster using a file-system snapshot tool. If you need an exact moment-in-time 
snapshot of the system, you will need to stop all application writes before taking the filesystem snapshots; otherwise 
the snapshot will only approximate a moment in time. 
For approximate point-in-time snapshots, you can improve the quality of the backup while minimizing impact on the 
cluster by taking the backup from a secondary member of the replica set that provides each shard. 
Procedure 
Step 1: Disable the balancer. Disable the balancer process that equalizes the distribution of data among the shards. 
To disable the balancer, use the sh.stopBalancer() method in the mongo shell. For example: 
use config 
sh.stopBalancer() 
For more information, see the Disable the Balancer (page 661) procedure. 
Step 2: Lock one secondary member of each replica set in each shard. Lock one secondary member of each 
replica set in each shard so that your backups reflect the state of your database at the nearest possible approximation 
of a single moment in time. Lock these mongod instances in as short an interval as possible. 
To lock a secondary, connect through the mongo shell to the secondary member’s mongod instance and issue the 
db.fsyncLock() method. 
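A sketch of locking one secondary from the system prompt (the hostname and port are illustrative); while the lock is held, db.currentOp() reports fsyncLock: true:

```shell
# Lock a secondary against writes for the duration of the snapshot, and
# print the lock status as confirmation.
mongo --host shard1-secondary.example.net --port 27018 \
    --eval 'db.fsyncLock(); printjson(db.currentOp().fsyncLock)'
```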
Step 3: Back up one of the config servers. Backing up a config server (page 616) backs up the sharded cluster’s 
metadata. You need to back up only one config server, as they all hold the same data. Do one of the following to back up 
one of the config servers: 
Create a file-system snapshot of the config server. Do this only if the config server has journaling enabled. Use 
the procedure in Backup and Restore with Filesystem Snapshots (page 229). Never use db.fsyncLock() on config 
databases. 
Create a database dump to backup the config server. Issue mongodump against one of the config mongod 
instances or via the mongos. If you are running MongoDB 2.4 or later with the --configsvr option, then include 
the --oplog option to ensure that the dump includes a partial oplog containing operations from the duration of the 
mongodump operation. For example: 
mongodump --oplog --db config 
Step 4: Back up the replica set members of the shards that you locked. You may back up the shards in parallel. 
For each shard, create a snapshot. Use the procedure in Backup and Restore with Filesystem Snapshots (page 229). 
Step 5: Unlock locked replica set members. Unlock all locked replica set members of each shard using the 
db.fsyncUnlock() method in the mongo shell. 
Step 6: Enable the balancer. Re-enable the balancer with the sh.setBalancerState() method. Use the 
following command sequence when connected to the mongos with the mongo shell: 
use config 
sh.setBalancerState(true) 
Backup a Sharded Cluster with Database Dumps 
Overview This document describes a procedure for taking a backup of all components of a sharded cluster. This 
procedure uses mongodump to create dumps of the mongod instance. An alternate procedure uses file system snapshots 
to capture the backup data, and may be more efficient in some situations if your system configuration allows file 
system backups. See Backup and Restore Sharded Clusters (page 238) for more information. 
See MongoDB Backup Methods (page 172) and Backup and Restore Sharded Clusters (page 238) for complete information 
on backups in MongoDB and backups of sharded clusters in particular. 
Prerequisites 
Important: To capture a point-in-time backup from a sharded cluster you must stop all writes to the cluster. On a 
running production system, you can only capture an approximation of a point-in-time snapshot. 
To back up all the databases in a cluster via mongodump, you should have the backup (page 367) role. The backup 
(page 367) role provides all the needed privileges for backing up all databases. The role confers no additional access, 
in keeping with the policy of least privilege. 
To back up a given database, you must have read access on the database. Several roles provide this access, including 
the backup (page 367) role. 
To back up the system.profile collection in a database, you must have read access on certain system collections 
in the database. Several roles provide this access, including the clusterAdmin (page 364) and dbAdmin 
(page 363) roles. 
Changed in version 2.6. 
To back up users and user-defined roles (page 286) for a given database, you must have access to the admin database. 
MongoDB stores the user data and role definitions for all databases in the admin database. 
Specifically, to back up a given database’s users, you must have the find (page 375) action (page 375) 
on the admin database’s admin.system.users (page 271) collection. The backup (page 367) and 
userAdminAnyDatabase (page 368) roles both provide this privilege. 
To back up the user-defined roles on a database, you must have the find (page 375) action on the admin database’s 
admin.system.roles (page 270) collection. Both the backup (page 367) and userAdminAnyDatabase 
(page 368) roles provide this privilege. 
Consideration To create these backups of a sharded cluster, you will stop the cluster balancer and take a backup 
of the config database, and then take backups of each shard in the cluster using mongodump to capture the backup 
data. To capture a more exact moment-in-time snapshot of the system, you will need to stop all application writes 
before taking the filesystem snapshots; otherwise the snapshot will only approximate a moment in time. 
For approximate point-in-time snapshots, taking the backup from a single offline secondary member of the replica set 
that provides each shard can improve the quality of the backup while minimizing impact on the cluster. 
Procedure 
Step 1: Disable the balancer process. Disable the balancer process that equalizes the distribution of data among 
the shards. To disable the balancer, use the sh.stopBalancer() method in the mongo shell. For example: 
use config 
sh.setBalancerState(false) 
For more information, see the Disable the Balancer (page 661) procedure. 
If you do not stop the balancer, the backup could have duplicate data or omit data as chunks migrate while recording 
backups. 
Step 2: Lock replica set members. Lock one member of each replica set in each shard so that your backups reflect 
the state of your database at the nearest possible approximation of a single moment in time. Lock these mongod 
instances in as short of an interval as possible. 
To lock or freeze a sharded cluster, you shut down one member of each replica set. Ensure that the oplog has sufficient 
capacity to allow these secondaries to catch up to the state of the primaries after finishing the backup procedure. See 
Oplog Size (page 535) for more information. 
Step 3: Backup one config server. Use mongodump to backup one of the config servers (page 616). This backs up 
the cluster’s metadata. You only need to back up one config server, as they all hold the same data. 
Use the mongodump tool to capture the content of the config mongod instances. 
Your config servers must run MongoDB 2.4 or later with the --configsvr option, and the mongodump command 
must include the --oplog option to capture a consistent copy of the config database: 
mongodump --oplog --db config 
Step 4: Backup replica set members. Back up the replica set members of the shards that you shut down, using 
mongodump and specifying the --dbpath option. You may back up the shards in parallel. Consider the following 
invocation: 
mongodump --journal --dbpath /data/db/ --out /data/backup/ 
You must run mongodump on the same system where the mongod ran. This operation will create a dump of all the 
data managed by the mongod instances that used the dbPath /data/db/. mongodump writes the output of this 
dump to the /data/backup/ directory. 
Step 5: Restart replica set members. Restart all stopped replica set members of each shard as normal and allow 
them to catch up with the state of the primary. 
Step 6: Re-enable the balancer process. Re-enable the balancer with the sh.setBalancerState() method. 
Use the following command sequence when connected to the mongos with the mongo shell: 
use config 
sh.setBalancerState(true) 
Schedule Backup Window for Sharded Clusters 
Overview In a sharded cluster, the balancer process is responsible for distributing sharded data around the cluster, 
so that each shard has roughly the same amount of data. 
However, when creating backups from a sharded cluster it is important that you disable the balancer while taking 
backups to ensure that no chunk migrations affect the content of the backup captured by the backup procedure. Using 
the procedure outlined in the section Disable the Balancer (page 661) you can manually stop the balancer process 
temporarily. As an alternative you can use this procedure to define a balancing window so that the balancer is always 
disabled during your automated backup operation. 
Procedure If you have an automated backup schedule, you can disable all balancing operations for a period of time. 
For instance, consider the following command: 
use config 
db.settings.update( { _id : "balancer" }, { $set : { activeWindow : { start : "6:00", stop : "23:00" } } } ) 
This operation configures the balancer to run between 6:00am and 11:00pm, server time. Schedule your backup 
operation to run and complete outside of this time. Ensure that the backup can complete outside the window when 
the balancer is running and that the balancer can effectively balance the collection among the shards in the window 
allotted to each. 
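You can confirm the configured window by reading the balancer settings document back from the config database, as in the following sketch:

```shell
# Print the balancer settings document, including any activeWindow.
mongo config --eval 'printjson(db.settings.findOne( { _id : "balancer" } ))'
```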
Restore a Single Shard 
Overview Restoring a single shard from backup with other unaffected shards requires a number of special considerations 
and practices. This document outlines the additional tasks you must perform when restoring a single shard. 
Consider the following resources on backups in general as well as backup and restoration of sharded clusters specifically: 
• Backup and Restore Sharded Clusters (page 238) 
• Restore a Sharded Cluster (page 244) 
• MongoDB Backup Methods (page 172) 
Procedure Always restore sharded clusters as a whole. When you restore a single shard, keep in mind that the 
balancer process might have moved chunks to or from this shard since the last backup. If that’s the case, you must 
manually move those chunks, as described in this procedure. 
Step 1: Restore the shard as you would any other mongod instance. See MongoDB Backup Methods (page 172) 
for overviews of these procedures. 
Step 2: Manage the chunks. For all chunks that migrate away from this shard, you do not need to do anything at 
this time. You do not need to delete these documents from the shard because the chunks are automatically filtered out 
from queries by mongos. You can remove these documents from the shard, if you like, at your leisure. 
For chunks that migrate to this shard after the most recent backup, you must manually recover the chunks using backups 
of other shards, or some other source. To determine what chunks have moved, view the changelog collection 
in the Config Database (page 679). 
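As a sketch, the following query lists migration commits recorded in the changelog that moved chunks to this shard. The shard name shard0001 is illustrative, and the details field layout is an assumption based on typical changelog entries:

```shell
# Find moveChunk commits whose destination is the restored shard.
mongo config --eval 'db.changelog.find( { what : "moveChunk.commit",
    "details.to" : "shard0001" } ).forEach(printjson)'
```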
Restore a Sharded Cluster 
Overview You can restore a sharded cluster either from snapshots (page 229) or from BSON database dumps 
(page 241) created by the mongodump tool. This document provides procedures for both: 
• Restore a Sharded Cluster with Filesystem Snapshots (page 244) 
• Restore a Sharded Cluster with Database Dumps (page 245) 
Related Documents For an overview of backups in MongoDB, see MongoDB Backup Methods (page 172). For 
complete information on backups and backups of sharded clusters in particular, see Backup and Restore Sharded 
Clusters (page 238). 
For backup procedures, see: 
• Backup a Sharded Cluster with Filesystem Snapshots (page 239) 
• Backup a Sharded Cluster with Database Dumps (page 241) 
Procedures Use the procedure for the type of backup files to restore. 
Restore a Sharded Cluster with Filesystem Snapshots 
Step 1: Shut down the entire cluster. Stop all mongos and mongod processes, including all shards and all config 
servers. 
Connect to each member and use the following operation: 
use admin 
db.shutdownServer() 
For version 2.4 or earlier, use db.shutdownServer({force:true}). 
Step 2: Restore the data files. On each server, extract the data files to the location where the mongod instance 
will access them. Restore the following: 
Data files for each server in each shard. Because replica sets provide each production shard, restore all the members 
of the replica set or use the other standard approaches for restoring a replica set from backup. See the Restore a 
Snapshot (page 231) and Restore a Database with mongorestore (page 236) sections for details on these procedures. 
Data files for each config server. 
Step 3: Restart the config servers. Restart each config server (page 616) mongod instance by issuing a command 
similar to the following for each, using values appropriate to your configuration: 
mongod --configsvr --dbpath /data/configdb --port 27019 
Step 4: If shard hostnames have changed, update the config string and config database. If shard hostnames 
have changed, start one mongos instance using the updated config string with the new configdb hostnames and 
ports. 
Then update the shards collection in the Config Database (page 679) to reflect the new hostnames. Then stop the 
mongos instance. 
Step 5: Restart all the shard mongod instances. 
Step 6: Restart all the mongos instances. If shard hostnames have changed, make sure to use the updated config 
string. 
Step 7: Connect to a mongos to ensure the cluster is operational. Connect to a mongos instance from a mongo 
shell and use the db.printShardingStatus() method to ensure that the cluster is operational, as follows: 
db.printShardingStatus() 
show collections 
Restore a Sharded Cluster with Database Dumps 
Step 1: Shut down the entire cluster. Stop all mongos and mongod processes, including all shards and all config 
servers. 
Connect to each member and use the following operation: 
use admin 
db.shutdownServer() 
For version 2.4 or earlier, use db.shutdownServer({force:true}). 
Step 2: Restore the data files. On each server, use mongorestore to restore the database dump to the location 
where the mongod instance will access the data. 
The following example restores a database dump located at /opt/backup/ 
to the /data/ directory. This requires that there are no active mongod instances attached to the /data directory. 
mongorestore --dbpath /data /opt/backup 
Step 3: Restart the config servers. Restart each config server (page 616) mongod instance by issuing a command 
similar to the following for each, using values appropriate to your configuration: 
mongod --configsvr --dbpath /data/configdb --port 27019 
Step 4: If shard hostnames have changed, update the config string and config database. If shard hostnames 
have changed, start one mongos instance using the updated config string with the new configdb hostnames and 
ports. 
Then update the shards collection in the Config Database (page 679) to reflect the new hostnames. Then stop the 
mongos instance. 
Step 5: Restart all the shard mongod instances. 
Step 6: Restart all the mongos instances. If shard hostnames have changed, make sure to use the updated config 
string. 
Step 7: Connect to a mongos to ensure the cluster is operational. Connect to a mongos instance from a mongo 
shell and use the db.printShardingStatus() method to ensure that the cluster is operational, as follows: 
db.printShardingStatus() 
show collections 
Recover Data after an Unexpected Shutdown 
If MongoDB does not shut down cleanly87, the on-disk representation of the data files will likely reflect an inconsistent 
state, which could lead to data corruption.88 
To prevent data inconsistency and corruption, always shut down the database cleanly and use durability journaling. 
MongoDB writes data to the journal, by default, every 100 milliseconds, such that MongoDB can always recover to a 
consistent state even in the case of an unclean shutdown due to power loss or other system failure. 
If you are not running as part of a replica set and do not have journaling enabled, use the following procedure to 
recover data that may be in an inconsistent state. If you are running as part of a replica set, you should always restore 
from a backup or restart the mongod instance with an empty dbPath and allow MongoDB to perform an initial sync 
to restore the data. 
See also: 
The Administration (page 171) documents, including Replica Set Syncing (page 535), and the documentation on the 
--repair option and the repairPath and storage.journal.enabled settings. 
Process 
Indications When you are aware of a mongod instance running without journaling that stops unexpectedly and 
you’re not running with replication, you should always run the repair operation before starting MongoDB again. If 
you’re using replication, then restore from a backup and allow replication to perform an initial sync (page 535) to 
restore data. 
If the mongod.lock file in the data directory specified by dbPath, /data/db by default, is not a zero-byte file, 
then mongod will refuse to start, and you will find a message that contains the following line in your MongoDB log 
or output: 
Unclean shutdown detected. 
This indicates that you need to run mongod with the --repair option. If you run repair when the mongod.lock 
file exists in your dbPath, or the optional --repairpath, you will see a message that contains the following line: 
old lock file: /data/db/mongod.lock. probably means unclean shutdown 
If you see this message, as a last resort you may remove the lockfile and run the repair operation before starting the 
database normally, as in the following procedure: 
87 To ensure a clean shut down, use the db.shutdownServer() from the mongo shell, your control script, the mongod --shutdown 
option on Linux systems, “Control-C” when running mongod in interactive mode, or kill $(pidof mongod) or kill -2 $(pidof 
mongod). 
88 You can also use the db.collection.validate() method to test the integrity of a single collection. However, this process is time 
consuming, and without journaling you can safely assume that the data is in an invalid state and you should either run the repair operation or resync 
from an intact member of the replica set. 
Overview 
Warning: Recovering a member of a replica set. 
Do not use this procedure to recover a member of a replica set. Instead you should either restore from a backup 
(page 172) or perform an initial sync using data from an intact member of the set, as described in Resync a Member 
of a Replica Set (page 575). 
There are two processes to repair data files that result from an unexpected shutdown: 
• Use the --repair option in conjunction with the --repairpath option. mongod will read the existing 
data files, and write the existing data to new data files. This does not modify or alter the existing data files. 
You do not need to remove the mongod.lock file before using this procedure. 
• Use the --repair option. mongod will read the existing data files, write the existing data to new files and 
replace the existing, possibly corrupt, files with new files. 
You must remove the mongod.lock file before using this procedure. 
Note: --repair functionality is also available in the shell with the db.repairDatabase() helper for the 
repairDatabase command. 
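For example, a sketch of invoking the helper against a running mongod (the database name is illustrative):

```shell
# Repair a single database from the shell and print the result document.
mongo records --eval 'printjson(db.repairDatabase())'
```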
Procedures 
Important: Always Run mongod as the same user to avoid changing the permissions of the MongoDB data files. 
Repair Data Files and Preserve Original Files To repair your data files using the --repairpath option, 
preserving the original data files unmodified, use the following procedure: 
Step 1: Start mongod with the --repair and --repairpath options. Start the mongod 
instance using the --repair option and the --repairpath option. Issue a command similar to the following: 
mongod --dbpath /data/db --repair --repairpath /data/db0 
When this completes, the new repaired data files will be in the /data/db0 directory. 
Step 2: Start mongod with the new data directory. Start mongod using the following invocation to point the 
dbPath at /data/db0: 
mongod --dbpath /data/db0 
Once you confirm that the data files are operational you may delete or archive the old data files in the /data/db 
directory. You may also wish to move the repaired files to the old database location or update the dbPath to indicate 
the new location. 
Repair Data Files without Preserving Original Files To repair your data files without preserving the original files, 
do not use the --repairpath option, as in the following procedure: 
Warning: After you remove the mongod.lock file you must run the --repair process before using your 
database. 
Step 1: Remove the stale lock file. For example: 
rm /data/db/mongod.lock 
Replace /data/db with your dbPath where your MongoDB instance’s data files reside. 
Step 2: Start mongod using the option to replace the original files with the repaired files. Start the mongod 
instance using the --repair option, which replaces the original data files with the repaired data files. Issue a 
command similar to the following: 
mongod --dbpath /data/db --repair 
When this completes, the repaired data files will replace the original data files in the /data/db directory. 
Step 3: Start mongod as usual. Start mongod using the following invocation to point the dbPath at /data/db: 
mongod --dbpath /data/db 
mongod.lock 
In normal operation, you should never remove the mongod.lock file and start mongod. Instead, consider one 
of the above methods to recover the database and remove the lock files. In dire situations you can remove the lockfile, 
and start the database using the possibly corrupt files, and attempt to recover data from the database; however, it’s 
impossible to predict the state of the database in these situations. 
If you are not running with journaling, and your database shuts down unexpectedly for any reason, you should always 
proceed as if your database is in an inconsistent and likely corrupt state. If at all possible restore from backup 
(page 172) or, if running as a replica set, restore by performing an initial sync using data from an intact member of the 
set, as described in Resync a Member of a Replica Set (page 575). 
5.2.3 MongoDB Scripting 
The mongo shell is an interactive JavaScript shell for MongoDB, and is part of all MongoDB distributions
(http://www.mongodb.org/downloads). This section provides an introduction to the shell, and outlines key functions,
operations, and use of the mongo shell. Also consider FAQ: The mongo Shell (page 700), the shell method reference,
and other relevant reference material.
Note: Most examples in the MongoDB Manual use the mongo shell; however, many drivers provide similar 
interfaces to MongoDB. 
Server-side JavaScript (page 249) Details MongoDB’s support for executing JavaScript code for server-side operations.
Data Types in the mongo Shell (page 250) Describes the super-set of JSON available for use in the mongo shell. 
Write Scripts for the mongo Shell (page 253) An introduction to the mongo shell for writing scripts to manipulate 
data and administer MongoDB. 
Getting Started with the mongo Shell (page 255) Introduces the use and operation of the MongoDB shell. 
Access the mongo Shell Help Information (page 259) Describes the available methods for accessing online help for 
the operation of the mongo interactive shell. 
mongo Shell Quick Reference (page 261) A high level reference to the use and operation of the mongo shell. 
Server-side JavaScript 
Changed in version 2.4: The V8 JavaScript engine, which became the default in 2.4, allows multiple JavaScript 
operations to execute at the same time. Prior to 2.4, MongoDB operations that required the JavaScript interpreter had 
to acquire a lock, and a single mongod could only run a single JavaScript operation at a time. 
Overview 
MongoDB supports the execution of JavaScript code for the following server-side operations: 
• mapReduce and the corresponding mongo shell method db.collection.mapReduce(). See Map- 
Reduce (page 394) for more information. 
• eval command, and the corresponding mongo shell method db.eval() 
• $where operator 
• Running .js files via a mongo shell Instance on the Server (page 249) 
JavaScript in MongoDB 
Although the above operations use JavaScript, most interactions with MongoDB do not use JavaScript but use an 
idiomatic driver in the language of the interacting application. 
See also: 
Store a JavaScript Function on the Server (page 217) 
You can disable all server-side execution of JavaScript by passing the --noscripting option on the command
line or by setting security.javascriptEnabled to false in a configuration file.
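As a sketch, in a YAML configuration file (the 2.6 configuration-file format) the corresponding setting would look like the following:

```yaml
# Disables mapReduce, eval, $where, and all other server-side JavaScript execution.
security:
  javascriptEnabled: false
```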
Running .js files via a mongo shell Instance on the Server 
You can run a JavaScript (.js) file using a mongo shell instance on the server. This is a good technique for performing 
batch administrative work. When you run mongo shell on the server, connecting via the localhost interface, the 
connection is fast with low latency. 
The command helpers (page 261) provided in the mongo shell are not available in JavaScript files because they are 
not valid JavaScript. The following table maps the most common mongo shell helpers to their JavaScript equivalents. 
Shell Helpers               JavaScript Equivalents
show dbs, show databases    db.adminCommand('listDatabases')
use <db>                    db = db.getSiblingDB('<db>')
show collections            db.getCollectionNames()
show users                  db.getUsers()
show roles                  db.getRoles({showBuiltinRoles: true})
show log <logname>          db.adminCommand({ 'getLog' : '<logname>' })
show logs                   db.adminCommand({ 'getLog' : '*' })
it                          cursor = db.collection.find();
                            if ( cursor.hasNext() ){ cursor.next(); }
Concurrency 
Refer to the individual method or operator documentation for any concurrency information. See also the concurrency 
table (page 703). 
Data Types in the mongo Shell 
MongoDB BSON provides support for additional data types beyond those of JSON. Drivers provide native
support for these data types in host languages, and the mongo shell also provides several
helper classes to support the use of these data types in the mongo JavaScript shell. See
http://docs.mongodb.org/manual/reference/mongodb-extended-json for additional information.
Types 
Date The mongo shell provides various methods to return the date, either as a string or as a Date object: 
• Date() method which returns the current date as a string. 
• new Date() constructor which returns a Date object using the ISODate() wrapper. 
• ISODate() constructor which returns a Date object using the ISODate() wrapper. 
Internally, Date objects are stored as a 64 bit integer representing the number of milliseconds since the Unix epoch
(Jan 1, 1970), which results in a representable date range of about 290 million years into the past and future.
Return Date as a String To return the date as a string, use the Date() method, as in the following example: 
var myDateString = Date(); 
To print the value of the variable, type the variable name in the shell, as in the following: 
myDateString 
The result is the value of myDateString: 
Wed Dec 19 2012 01:03:25 GMT-0500 (EST) 
To verify the type, use the typeof operator, as in the following: 
typeof myDateString 
The operation returns string. 
Return Date The mongo shell wraps objects of Date type with the ISODate helper; however, the objects remain
of type Date.
The following example uses both the new Date() constructor and the ISODate() constructor to return Date 
objects. 
var myDate = new Date(); 
var myDateInitUsingISODateWrapper = ISODate(); 
You can use the new operator with the ISODate() constructor as well. 
To print the value of the variable, type the variable name in the shell, as in the following: 
myDate 
The result is the Date value of myDate wrapped in the ISODate() helper: 
ISODate("2012-12-19T06:01:17.171Z") 
To verify the type, use the instanceof operator, as in the following: 
myDate instanceof Date 
myDateInitUsingISODateWrapper instanceof Date 
The operation returns true for both. 
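The Date() versus new Date() distinction above is standard JavaScript behavior rather than something specific to the mongo shell, so it can be checked in any JavaScript runtime; only the ISODate() wrapper itself exists solely in the shell. A minimal sketch:

```javascript
// Calling Date() as a function returns the current date as a string;
// calling it as a constructor returns a Date object.
var myDateString = Date();
var myDate = new Date();

console.log(typeof myDateString);      // "string"
console.log(myDate instanceof Date);   // true

// Date objects store milliseconds since the Unix epoch (Jan 1, 1970):
console.log(typeof myDate.getTime());  // "number"
```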
ObjectId The mongo shell provides the ObjectId() wrapper class around the ObjectId data type. To generate a 
new ObjectId, use the following operation in the mongo shell: 
new ObjectId 
See 
ObjectId (page 165) for full documentation of ObjectIds in MongoDB. 
NumberLong By default, the mongo shell treats all numbers as floating-point values. The mongo shell provides 
the NumberLong() wrapper to handle 64-bit integers. 
The NumberLong() wrapper accepts the long as a string: 
NumberLong("2090845886852") 
The following examples use the NumberLong() wrapper to write to the collection: 
db.collection.insert( { _id: 10, calc: NumberLong("2090845886852") } ) 
db.collection.update( { _id: 10 }, 
{ $set: { calc: NumberLong("2555555000000") } } ) 
db.collection.update( { _id: 10 }, 
{ $inc: { calc: NumberLong(5) } } ) 
Retrieve the document to verify: 
db.collection.findOne( { _id: 10 } ) 
In the returned document, the calc field contains a NumberLong object: 
{ "_id" : 10, "calc" : NumberLong("2555555000005") } 
If you use the $inc to increment the value of a field that contains a NumberLong object by a float, the data type 
changes to a floating point value, as in the following example: 
1. Use $inc to increment the calc field by 5, which the mongo shell treats as a float: 
db.collection.update( { _id: 10 }, 
{ $inc: { calc: 5 } } ) 
2. Retrieve the updated document: 
db.collection.findOne( { _id: 10 } ) 
In the updated document, the calc field contains a floating point value: 
{ "_id" : 10, "calc" : 2555555000010 } 
NumberInt By default, the mongo shell treats all numbers as floating-point values. The mongo shell provides the 
NumberInt() constructor to explicitly specify 32-bit integers. 
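These wrappers exist because a JavaScript number is an IEEE 754 double, which represents integers exactly only up to 2^53. The following standard-JavaScript check (not mongo-specific) shows the precision loss that NumberLong() avoids:

```javascript
// Integers are exact in a double only up to 2^53 = 9007199254740992.
var maxExact = Math.pow(2, 53);

// Adding 1 past that limit is silently lost:
console.log(maxExact + 1 === maxExact);  // true

// This is why large 64-bit values are passed to NumberLong() as strings,
// e.g. NumberLong("2090845886852"), rather than as numeric literals.
```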
Check Types in the mongo Shell 
To determine the type of fields, the mongo shell provides the instanceof and typeof operators. 
instanceof instanceof returns a boolean to test if a value is an instance of some type. 
For example, the following operation tests whether the _id field is an instance of type ObjectId: 
mydoc._id instanceof ObjectId 
The operation returns true. 
typeof typeof returns the type of a field. 
For example, the following operation returns the type of the _id field: 
typeof mydoc._id 
In this case typeof will return the more generic object type rather than ObjectId type. 
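Because the ObjectId class exists only inside the mongo shell, the following sketch uses Date as a stand-in class to show the same typeof/instanceof contrast in plain JavaScript; the mydoc document is hypothetical:

```javascript
// Hypothetical document using Date in place of ObjectId for illustration.
var mydoc = { _id: new Date() };

// instanceof reports the specific class:
console.log(mydoc._id instanceof Date);  // true

// typeof reports only the generic category:
console.log(typeof mydoc._id);           // "object"
```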
Write Scripts for the mongo Shell 
You can write scripts for the mongo shell in JavaScript that manipulate data in MongoDB or perform administrative
operations. For more information about the mongo shell see MongoDB Scripting (page 248), and see the Running .js
files via a mongo shell Instance on the Server (page 249) section for more information about using these mongo scripts.
This tutorial provides an introduction to writing JavaScript that uses the mongo shell to access MongoDB. 
Opening New Connections 
From the mongo shell or from a JavaScript file, you can instantiate database connections using the Mongo() constructor:
new Mongo() 
new Mongo(<host>) 
new Mongo(<host:port>) 
Consider the following example that instantiates a new connection to the MongoDB instance running on localhost on 
the default port and sets the global db variable to myDatabase using the getDB() method: 
conn = new Mongo(); 
db = conn.getDB("myDatabase"); 
Additionally, you can use the connect() method to connect to the MongoDB instance. The following example
connects to the MongoDB instance that is running on localhost with the non-default port 27020 and sets the
global db variable:
db = connect("localhost:27020/myDatabase"); 
Differences Between Interactive and Scripted mongo 
When writing scripts for the mongo shell, consider the following: 
• To set the db global variable, use the getDB() method or the connect() method. You can assign the 
database reference to a variable other than db. 
• Write operations in the mongo shell use “safe writes” by default. If performing bulk operations, use the
Bulk() methods. See Write Method Acknowledgements (page 743) for more information.
Changed in version 2.6: Before MongoDB 2.6, call db.getLastError() explicitly to wait for the result of 
write operations (page 67). 
• You cannot use any shell helper (e.g. use <dbname>, show dbs, etc.) inside the JavaScript file because 
they are not valid JavaScript. 
The following table maps the most common mongo shell helpers to their JavaScript equivalents. 
Shell Helpers               JavaScript Equivalents
show dbs, show databases    db.adminCommand('listDatabases')
use <db>                    db = db.getSiblingDB('<db>')
show collections            db.getCollectionNames()
show users                  db.getUsers()
show roles                  db.getRoles({showBuiltinRoles: true})
show log <logname>          db.adminCommand({ 'getLog' : '<logname>' })
show logs                   db.adminCommand({ 'getLog' : '*' })
it                          cursor = db.collection.find();
                            if ( cursor.hasNext() ){ cursor.next(); }
• In interactive mode, mongo prints the results of operations including the content of all cursors. In scripts, either
use the JavaScript print() function or the mongo-specific printjson() function, which prints formatted
JSON.
Example 
To print all items in a result cursor in mongo shell scripts, use the following idiom: 
cursor = db.collection.find(); 
while ( cursor.hasNext() ) { 
printjson( cursor.next() ); 
} 
Scripting 
From the system prompt, use mongo to evaluate JavaScript. 
--eval option Use the --eval option to mongo to pass the shell a JavaScript fragment, as in the following: 
mongo test --eval "printjson(db.getCollectionNames())" 
This returns the output of db.getCollectionNames() using the mongo shell connected to the mongod or 
mongos instance running on port 27017 on the localhost interface. 
Execute a JavaScript file You can specify a .js file to the mongo shell, and mongo will execute the JavaScript 
directly. Consider the following example: 
mongo localhost:27017/test myjsfile.js 
This operation executes the myjsfile.js script in a mongo shell that connects to the test database on the 
mongod instance accessible via the localhost interface on port 27017. 
Alternately, you can specify the MongoDB connection parameters inside the JavaScript file using the Mongo()
constructor. See Opening New Connections (page 253) for more information.
You can execute a .js file from within the mongo shell, using the load() function, as in the following: 
load("myjstest.js") 
This function loads and executes the myjstest.js file. 
The load() method accepts relative and absolute paths. If the current working directory of the mongo shell is 
/data/db, and the myjstest.js resides in the /data/db/scripts directory, then the following calls within 
the mongo shell would be equivalent: 
load("scripts/myjstest.js") 
load("/data/db/scripts/myjstest.js") 
Note: There is no search path for the load() function. If the desired script is not in the current working directory 
or the full specified path, mongo will not be able to access the file. 
Getting Started with the mongo Shell 
This document provides a basic introduction to using the mongo shell. See Install MongoDB (page 5) for instructions 
on installing MongoDB for your system. 
Start the mongo Shell 
To start the mongo shell and connect to your MongoDB instance running on localhost with default port: 
1. Go to your <mongodb installation dir>: 
cd <mongodb installation dir> 
2. Type ./bin/mongo to start mongo: 
./bin/mongo 
If you have added the <mongodb installation dir>/bin to the PATH environment variable, you can 
just type mongo instead of ./bin/mongo. 
3. To display the database you are using, type db: 
db 
The operation should return test, which is the default database. To switch databases, issue the use <db> 
helper, as in the following example: 
use <database> 
To list the available databases, use the helper show dbs. See also How can I access different databases
temporarily? (page 700) to access a different database from the current database without switching your current
database context (i.e. the db variable).
To start the mongo shell with other options, see examples of starting up mongo and mongo reference which 
provides details on the available options. 
Note: When starting, mongo checks the user’s HOME directory for a JavaScript file named .mongorc.js. If found, 
mongo interprets the content of .mongorc.js before displaying the prompt for the first time. If you use the shell to 
evaluate a JavaScript file or expression, either by using the --eval option on the command line or by specifying a .js 
file to mongo, mongo will read the .mongorc.js file after the JavaScript has finished processing. You can prevent 
.mongorc.js from being loaded by using the --norc option. 
Executing Queries 
From the mongo shell, you can use the shell methods to run queries, as in the following example: 
db.<collection>.find() 
• The db refers to the current database. 
• The <collection> is the name of the collection to query. See Collection Help (page 259) to list the available 
collections. 
If the mongo shell does not accept the name of the collection, for instance if the name contains a space, hyphen, 
or starts with a number, you can use an alternate syntax to refer to the collection, as in the following: 
db["3test"].find() 
db.getCollection("3test").find() 
• The find() method is the JavaScript method to retrieve documents from <collection>. The find() 
method returns a cursor to the results; however, in the mongo shell, if the returned cursor is not assigned to a 
variable using the var keyword, then the cursor is automatically iterated up to 20 times to print up to the first 
20 documents that match the query. The mongo shell will prompt “Type it” to iterate another 20 times.
You can set the DBQuery.shellBatchSize attribute to change the number of iterations from the default
value of 20, as in the following example, which sets it to 10:
DBQuery.shellBatchSize = 10; 
For more information and examples on cursor handling in the mongo shell, see Cursors (page 59). 
See also Cursor Help (page 260) for list of cursor help in the mongo shell. 
For more documentation of basic MongoDB operations in the mongo shell, see: 
• Getting Started with MongoDB (page 43) 
• mongo Shell Quick Reference (page 261) 
• Read Operations (page 55) 
• Write Operations (page 67) 
• Indexing Tutorials (page 464) 
Print 
The mongo shell automatically prints the results of the find() method if the returned cursor is not assigned to
a variable using the var keyword. To format the result, you can append .pretty() to the operation, as in the
following:
db.<collection>.find().pretty() 
In addition, you can use the following explicit print methods in the mongo shell: 
• print() to print without formatting 
• print(tojson(<obj>)) to print with JSON formatting and equivalent to printjson() 
• printjson() to print with JSON formatting and equivalent to print(tojson(<obj>)) 
Evaluate a JavaScript File 
You can execute a .js file from within the mongo shell, using the load() function, as in the following: 
load("myjstest.js") 
This function loads and executes the myjstest.js file. 
The load() method accepts relative and absolute paths. If the current working directory of the mongo shell is 
/data/db, and the myjstest.js resides in the /data/db/scripts directory, then the following calls within 
the mongo shell would be equivalent: 
load("scripts/myjstest.js") 
load("/data/db/scripts/myjstest.js") 
Note: There is no search path for the load() function. If the desired script is not in the current working directory 
or the full specified path, mongo will not be able to access the file. 
Use a Custom Prompt 
You may modify the content of the prompt by creating the variable prompt in the shell. The prompt variable can 
hold strings as well as any arbitrary JavaScript. If prompt holds a function that returns a string, mongo can display 
dynamic information in each prompt. Consider the following examples: 
Example 
To create a prompt with the number of operations issued in the current session, define the following variables:
cmdCount = 1; 
prompt = function() { 
return (cmdCount++) + "> "; 
} 
The prompt would then resemble the following: 
1> db.collection.find() 
2> show collections 
3> 
Example 
To create a mongo shell prompt in the form of <database>@<hostname>$, define the following variables:
host = db.serverStatus().host; 
prompt = function() { 
return db+"@"+host+"$ "; 
} 
The prompt would then resemble the following: 
<database>@<hostname>$ use records 
switched to db records 
records@<hostname>$ 
Example 
To create a mongo shell prompt that contains the system uptime and the number of documents in the current database,
define the following prompt variable: 
prompt = function() { 
return "Uptime:"+db.serverStatus().uptime+" Documents:"+db.stats().objects+" > "; 
} 
The prompt would then resemble the following: 
Uptime:5897 Documents:6 > db.people.save({name : "James"}); 
Uptime:5948 Documents:7 > 
Use an External Editor in the mongo Shell 
New in version 2.2. 
In the mongo shell you can use the edit operation to edit a function or variable in an external editor. The edit
operation uses the value of your environment’s EDITOR variable.
At your system prompt you can define the EDITOR variable and start mongo with the following two operations: 
export EDITOR=vim 
mongo 
Then, consider the following example shell session: 
MongoDB shell version: 2.2.0 
> function f() {} 
> edit f 
> f 
function f() { 
print("this really works"); 
} 
> f() 
this really works 
> o = {} 
{ } 
> edit o 
> o 
{ "soDoes" : "this" } 
> 
Note: As the mongo shell interprets code edited in an external editor, it may modify code in functions, depending on
the JavaScript compiler. For example, mongo may convert 1+1 to 2 or remove comments. The actual changes affect
only the appearance of the code and will vary based on the version of JavaScript used, but will not affect the semantics
of the code.
Exit the Shell 
To exit the shell, type quit() or use the <Ctrl-c> shortcut. 
Access the mongo Shell Help Information 
In addition to the documentation in the MongoDB Manual, the mongo shell provides some additional information 
in its “online” help system. This document provides an overview of accessing this help information. 
See also: 
• mongo Manual Page 
• MongoDB Scripting (page 248), and 
• mongo Shell Quick Reference (page 261). 
Command Line Help 
To see the list of options and help for starting the mongo shell, use the --help option from the command line: 
mongo --help 
Shell Help 
To see the list of help, in the mongo shell, type help: 
help 
Database Help 
• To see the list of databases on the server, use the show dbs command: 
show dbs 
New in version 2.4: show databases is now an alias for show dbs 
• To see the list of help for methods you can use on the db object, call the db.help() method: 
db.help() 
• To see the implementation of a method in the shell, type the db.<method name> without the parentheses
(()), as in the following example, which will return the implementation of the method db.addUser():
db.addUser 
Collection Help 
• To see the list of collections in the current database, use the show collections command: 
show collections 
• To see the help for methods available on the collection objects (e.g. db.<collection>), use the 
db.<collection>.help() method: 
db.collection.help() 
<collection> can be the name of a collection that exists, although you may specify a collection that doesn’t 
exist. 
• To see the collection method implementation, type the db.<collection>.<method> name without the
parentheses (()), as in the following example, which will return the implementation of the save() method:
db.collection.save 
Cursor Help 
When you perform read operations (page 55) with the find() method in the mongo shell, you can use various 
cursor methods to modify the find() behavior and various JavaScript methods to handle the cursor returned from 
the find() method. 
• To list the available modifier and cursor handling methods, use the db.collection.find().help() 
command: 
db.collection.find().help() 
<collection> can be the name of a collection that exists, although you may specify a collection that doesn’t 
exist. 
• To see the implementation of the cursor method, type the db.<collection>.find().<method> name
without the parentheses (()), as in the following example, which will return the implementation of the
toArray() method:
db.collection.find().toArray 
Some useful methods for handling cursors are: 
• hasNext() which checks whether the cursor has more documents to return. 
• next() which returns the next document and advances the cursor position forward by one. 
• forEach(<function>) which iterates the whole cursor and applies the <function> to each document 
returned by the cursor. The <function> expects a single argument which corresponds to the document from 
each iteration. 
For examples on iterating a cursor and retrieving the documents from the cursor, see cursor handling (page 59). See
also the query cursor methods reference for all available cursor methods.
Type Help 
To get a list of the wrapper classes available in the mongo shell, such as BinData(), type help misc in the 
mongo shell: 
help misc 
mongo Shell Quick Reference 
mongo Shell Command History 
You can retrieve previous commands issued in the mongo shell with the up and down arrow keys. Command history
is stored in the ~/.dbshell file. See .dbshell for more information.
Command Line Options 
The mongo executable can be started with numerous options. See mongo executable page for details on all 
available options. 
The following table displays some common options for mongo: 
Option    Description
--help    Show command line options.
--nodb    Start mongo shell without connecting to a database. To connect later, see Opening New
          Connections (page 253).
--shell   Used in conjunction with a JavaScript file (i.e. <file.js>) to continue in the mongo shell after
          running the JavaScript file. See JavaScript file (page 254) for an example.
Command Helpers 
The mongo shell provides various help methods and commands. The following table displays some common ones:
Help Methods and Commands   Description
help                        Show help.
db.help()                   Show help for database methods.
db.<collection>.help()      Show help on collection methods. The <collection> can be the name of an
                            existing collection or a non-existing collection.
show dbs                    Print a list of all databases on the server.
use <db>                    Switch current database to <db>. The mongo shell variable db is set to the
                            current database.
show collections            Print a list of all collections for the current database.
show users                  Print a list of users for the current database.
show roles                  Print a list of all roles, both user-defined and built-in, for the current database.
show profile                Print the five most recent operations that took 1 millisecond or more. See
                            documentation on the database profiler (page 210) for more information.
show databases              New in version 2.4: Print a list of all available databases.
load()                      Execute a JavaScript file. See Getting Started with the mongo Shell (page 255)
                            for more information.
Basic Shell JavaScript Operations 
The mongo shell provides numerous methods for database operations; see
http://docs.mongodb.org/manual/reference/method.
In the mongo shell, db is the variable that references the current database. The variable is automatically set to the
default database test or is set when you use the use <db> helper to switch the current database.
The following table displays some common JavaScript operations: 
JavaScript Database Operations   Description
db.auth()                        If running in secure mode, authenticate the user.
coll = db.<collection>           Set a specific collection in the current database to a variable coll, as in the
                                 following example:
                                 coll = db.myCollection;
                                 You can perform operations on myCollection using the variable, as in the
                                 following example:
                                 coll.find();
find()                           Find all documents in the collection and return a cursor. See
                                 db.collection.find() and Query Documents (page 87) for more
                                 information and examples. See Cursors (page 59) for additional information
                                 on cursor handling in the mongo shell.
insert()                         Insert a new document into the collection.
update()                         Update an existing document in the collection. See Write Operations
                                 (page 67) for more information.
save()                           Insert either a new document or update an existing document in the
                                 collection. See Write Operations (page 67) for more information.
remove()                         Delete documents from the collection. See Write Operations (page 67) for
                                 more information.
drop()                           Drop (remove completely) the collection.
ensureIndex()                    Create a new index on the collection if the index does not exist; otherwise,
                                 the operation has no effect.
db.getSiblingDB()                Return a reference to another database using the same connection without
                                 explicitly switching the current database. This allows for cross-database
                                 queries. See How can I access different databases temporarily? (page 700)
                                 for more information.
For more information on performing operations in the shell, see: 
• MongoDB CRUD Concepts (page 53) 
• Read Operations (page 55) 
• Write Operations (page 67) 
• http://docs.mongodb.org/manualreference/method 
Keyboard Shortcuts 
Changed in version 2.2. 
The mongo shell provides most keyboard shortcuts similar to those found in the bash shell or in Emacs. For some 
functions mongo provides multiple key bindings, to accommodate several familiar paradigms. 
The following table enumerates the keystrokes supported by the mongo shell: 
Keystroke Function 
Up-arrow previous-history 
Down-arrow next-history 
Home beginning-of-line 
End end-of-line 
Tab autocomplete 
Left-arrow backward-character 
Right-arrow forward-character 
Ctrl-left-arrow backward-word 
Ctrl-right-arrow forward-word 
Meta-left-arrow backward-word 
Meta-right-arrow forward-word 
Ctrl-A beginning-of-line 
Ctrl-B backward-char 
Ctrl-C exit-shell 
Ctrl-D delete-char (or exit shell) 
Ctrl-E end-of-line 
Ctrl-F forward-char 
Ctrl-G abort 
Ctrl-J accept-line 
Ctrl-K kill-line 
Ctrl-L clear-screen 
Ctrl-M accept-line 
Ctrl-N next-history 
Ctrl-P previous-history 
Ctrl-R reverse-search-history 
Ctrl-S forward-search-history 
Ctrl-T transpose-chars 
Ctrl-U unix-line-discard 
Ctrl-W unix-word-rubout 
Ctrl-Y yank 
Ctrl-Z Suspend (job control works in Linux)
Ctrl-H (i.e. Backspace) backward-delete-char 
Ctrl-I (i.e. Tab) complete 
Meta-B backward-word 
Meta-C capitalize-word 
Meta-D kill-word 
Meta-F forward-word 
Meta-L downcase-word 
Meta-U upcase-word 
Meta-Y yank-pop 
Meta-[Backspace] backward-kill-word 
Meta-< beginning-of-history 
Meta-> end-of-history 
Queries 
In the mongo shell, perform read operations using the find() and findOne() methods. 
The find() method returns a cursor object which the mongo shell iterates to print documents on screen. By default, 
mongo prints the first 20. The mongo shell will prompt the user to “Type it” to continue iterating the next 20 
results. 
The following table provides some common read operations in the mongo shell: 
5.2. Administration Tutorials 263
MongoDB Documentation, Release 2.6.4 
db.collection.find(<query>)
Find the documents matching the <query> criteria in the collection. If the <query> criteria is not specified or is empty (i.e. {}), the read operation selects all documents in the collection.
The following example selects the documents in the users collection with the name field equal to "Joe":
coll = db.users;
coll.find( { name: "Joe" } );
For more information on specifying the <query> criteria, see Query Documents (page 87).

db.collection.find( <query>, <projection> )
Find documents matching the <query> criteria and return just the specific fields in the <projection>.
The following example selects all documents from the collection but returns only the name field and the _id field. The _id field is always returned unless explicitly specified to not return.
coll = db.users;
coll.find( { }, { name: true } );
For more information on specifying the <projection>, see Limit Fields to Return from a Query (page 94).

db.collection.find().sort( <sort order> )
Return results in the specified <sort order>.
The following example selects all documents from the collection and returns the results sorted by the name field in ascending order (1). Use -1 for descending order:
coll = db.users;
coll.find().sort( { name: 1 } );

db.collection.find( <query> ).sort( <sort order> )
Return the documents matching the <query> criteria in the specified <sort order>.

db.collection.find( ... ).limit( <n> )
Limit the result to <n> documents. Highly recommended if you need only a certain number of documents, for best performance.

db.collection.find( ... ).skip( <n> )
Skip <n> results.

count()
Returns the total number of documents in the collection.

db.collection.find( <query> ).count()
Returns the total number of documents that match the query.
count() ignores limit() and skip(). For example, if 100 records match but the limit is 10, count() will return 100. This will be faster than iterating yourself, but still take time.

db.collection.findOne( <query> )
Find and return a single document. Returns null if not found.
The following example selects a single document in the users collection where the name field matches "Joe":
coll = db.users;
coll.findOne( { name: "Joe" } );
Internally, the findOne() method is the find() method with a limit(1).
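limit() and skip() are commonly combined for paging: the skip offset for a 1-based page number p with page size n is (p - 1) * n. A minimal sketch in plain JavaScript (the skipFor helper is hypothetical, not part of the mongo shell):

```javascript
// Hypothetical helper: compute the skip offset for a 1-based page number.
function skipFor(page, pageSize) {
  if (page < 1) throw new RangeError("page must be >= 1");
  return (page - 1) * pageSize;
}

// In the mongo shell this would drive a paged, sorted query, e.g.:
//   db.users.find().sort({ name: 1 }).skip(skipFor(3, 20)).limit(20)
console.log(skipFor(1, 20)); // 0
console.log(skipFor(3, 20)); // 40
```

Note that skipping a large offset still forces the server to walk and discard the skipped documents, so range-based paging on an indexed field scales better for deep pages.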
See Query Documents (page 87) and Read Operations (page 55) documentation for more information and examples. 
See http://docs.mongodb.org/manual/reference/operator to specify other query operators.
Error Checking Methods 
Changed in version 2.6. 
The mongo shell write methods now integrate the Write Concern (page 72) directly into the method execution rather than using a separate db.getLastError() method. As such, the write methods now return a WriteResult() object that contains the results of the operation, including any write errors and write concern errors.
Previous versions used the db.getLastError() and db.getLastErrorObj() methods to return error information.
Administrative Command Helpers 
The following table lists some common methods to support database administration: 
db.cloneDatabase(<host>)
Clone the current database from the <host> specified. The <host> database instance must be in noauth mode.

db.copyDatabase(<from>, <to>, <host>)
Copy the <from> database from the <host> to the <to> database on the current server. The <host> database instance must be in noauth mode.

db.fromColl.renameCollection(<toColl>)
Rename collection from fromColl to <toColl>.

db.repairDatabase()
Repair and compact the current database. This operation can be very slow on large databases.

db.addUser( <user>, <pwd> )
Add user to current database.

db.getCollectionNames()
Get the list of all collections in the current database.

db.dropDatabase()
Drops the current database.
See also administrative database methods for a full list of methods. 
Opening Additional Connections 
You can create new connections within the mongo shell. 
The following table displays the methods to create the connections: 
db = connect("<host><:port>/<dbname>")
Open a new database connection.

conn = new Mongo()
db = conn.getDB("dbname")
Open a connection to a new server using new Mongo(). Use the getDB() method of the connection to select a database.
See also Opening New Connections (page 253) for more information on opening new connections from the mongo shell.
Miscellaneous 
The following table displays some miscellaneous methods: 
Method Description 
Object.bsonsize(<document>) Prints the BSON size of a <document> in bytes 
See the MongoDB JavaScript API Documentation 90 for a full list of JavaScript methods.
Additional Resources 
Consider the following reference material that addresses the mongo shell and its interface: 
• http://docs.mongodb.org/manual/reference/program/mongo
• http://docs.mongodb.org/manual/reference/method
• http://docs.mongodb.org/manual/reference/operator
• http://docs.mongodb.org/manual/reference/command
• Aggregation Reference (page 419) 
Additionally, the MongoDB source code repository includes a jstests directory91 which contains numerous mongo 
shell scripts. 
See also: 
The MongoDB Manual contains administrative documentation and tutorials throughout several sections. See Replica Set Tutorials (page 543) and Sharded Cluster Tutorials (page 634) for additional tutorials and information.
5.3 Administration Reference 
UNIX ulimit Settings (page 266) Describes user resource limits (i.e. ulimit) and introduces the considerations and optimal configurations for systems that run MongoDB deployments.
System Collections (page 270) Introduces the internal collections that MongoDB uses to track per-database metadata, 
including indexes, collections, and authentication credentials. 
Database Profiler Output (page 271) Describes the data collected by MongoDB’s operation profiler, which introspects operations and reports data for analysis on performance and behavior.
Journaling Mechanics (page 275) Describes the internal operation of MongoDB’s journaling facility and outlines how the journal allows MongoDB to provide durability and crash resiliency.
Exit Codes and Statuses (page 276) Lists the unique codes returned by mongos and mongod processes upon exit. 
5.3.1 UNIX ulimit Settings 
Most UNIX-like operating systems, including Linux and OS X, provide ways to limit and control the usage of system 
resources such as threads, files, and network connections on a per-process and per-user basis. These “ulimits” prevent 
single users from using too many system resources. Sometimes, these limits have low default values that can cause a 
number of issues in the course of normal MongoDB operation. 
Note: Red Hat Enterprise Linux and CentOS 6 place a max process limitation of 1024 which overrides ulimit settings. Create a file named /etc/security/limits.d/99-mongodb-nproc.conf with new soft nproc and hard nproc values to increase the process limit. See the /etc/security/limits.d/90-nproc.conf file as an example.
90 http://api.mongodb.org/js/index.html
91 https://github.com/mongodb/mongo/tree/master/jstests/
Resource Utilization 
mongod and mongos each use threads and file descriptors to track connections and manage internal operations. This 
section outlines the general resource utilization patterns for MongoDB. Use these figures in combination with the 
actual information about your deployment and its use to determine ideal ulimit settings. 
Generally, all mongod and mongos instances: 
• track each incoming connection with a file descriptor and a thread. 
• track each internal thread or pthread as a system process. 
mongod 
• 1 file descriptor for each data file in use by the mongod instance. 
• 1 file descriptor for each journal file used by the mongod instance when storage.journal.enabled is 
true. 
• In replica sets, each mongod maintains a connection to all other members of the set. 
mongod uses background threads for a number of internal processes, including TTL collections (page 198), replication, and replica set health checks, which may require a small number of additional resources.
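These per-file and per-connection patterns can be combined into a rough capacity estimate. The function below is a hypothetical illustration of that arithmetic, not an official formula; the 100-descriptor internal allowance is an assumption:

```javascript
// Rough, illustrative estimate of the file descriptors a mongod may need:
// one per data file, one per journal file, one per incoming connection,
// plus an assumed allowance for internal background threads.
function estimateFileDescriptors(dataFiles, journalFiles, connections) {
  const internalAllowance = 100; // assumption, not a documented figure
  return dataFiles + journalFiles + connections + internalAllowance;
}

console.log(estimateFileDescriptors(40, 3, 5000)); // 5143
```

Compare such an estimate against the -n (open files) ulimit to decide whether the recommended 64000 setting leaves adequate headroom.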
mongos 
In addition to the threads and file descriptors for client connections, mongos must maintain connections to all config servers and all shards, which includes all members of all replica sets.
For mongos, consider the following behaviors: 
• mongos instances maintain a connection pool to each shard so that the mongos can reuse connections and 
quickly fulfill requests without needing to create new connections. 
• You can limit the number of incoming connections using the maxIncomingConnections run-time option. 
By restricting the number of incoming connections you can prevent a cascade effect where the mongos creates 
too many connections on the mongod instances. 
Note: Changed in version 2.6: MongoDB removed the upward limit on the maxIncomingConnections 
setting. 
Review and Set Resource Limits 
ulimit 
Note: Both the “hard” and the “soft” ulimit affect MongoDB’s performance. The “hard” ulimit is the ceiling for a resource limit: no non-root process can increase the “hard” ulimit. In contrast, the “soft” ulimit is the limit that is actually enforced for a session or process, but any process can increase it up to the “hard” ulimit maximum.
A low “soft” ulimit can cause “can’t create new thread, closing connection” errors if the number of connections grows too high. For this reason, it is extremely important to set both ulimit values to the recommended values.
ulimit will modify both “hard” and “soft” values unless the -H or -S modifiers are specified when modifying limit 
values. 
You can use the ulimit command at the system prompt to check system limits, as in the following example: 
$ ulimit -a 
-t: cpu time (seconds) unlimited 
-f: file size (blocks) unlimited 
-d: data seg size (kbytes) unlimited 
-s: stack size (kbytes) 8192 
-c: core file size (blocks) 0 
-m: resident set size (kbytes) unlimited 
-u: processes 192276 
-n: file descriptors 21000 
-l: locked-in-memory size (kb) 40000 
-v: address space (kb) unlimited 
-x: file locks unlimited 
-i: pending signals 192276 
-q: bytes in POSIX msg queues 819200 
-e: max nice 30 
-r: max rt priority 65 
-N 15: unlimited 
ulimit refers to the per-user limitations for various resources. Therefore, if your mongod instance executes as 
a user that is also running multiple processes, or multiple mongod processes, you might see contention for these 
resources. Also, be aware that the processes value (i.e. -u) refers to the combined number of distinct processes 
and sub-process threads. 
You can change ulimit settings by issuing a command in the following form: 
ulimit -n <value> 
For many distributions of Linux you can change values by substituting any of the options shown in the output of ulimit -a for the -n option. On OS X, use the launchctl limit command. See your operating system documentation for the precise procedure for changing system limits on running systems.
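For example, the soft and hard limits for open files can be inspected and the soft limit raised for the current shell session (bash; the values shown will differ per system):

```shell
# Show the soft and hard limits for open file descriptors
echo "soft nofile: $(ulimit -Sn)"
echo "hard nofile: $(ulimit -Hn)"

# Raise the soft limit to the hard limit for this shell session only
# (a non-root process cannot exceed the hard limit)
hard="$(ulimit -Hn)"
if [ "$hard" != "unlimited" ]; then
  ulimit -Sn "$hard"
fi
echo "soft nofile now: $(ulimit -Sn)"
```

Changes made this way apply only to the current session; use limits.conf, Upstart, or systemd settings (below) to make them persistent.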
Note: After changing the ulimit settings, you must restart the process to take advantage of the modified settings. You can use the /proc file system to see the current limitations on a running process.
Depending on your system’s configuration and default settings, any change to system limits made using ulimit may revert following a system restart. Check your distribution and operating system documentation for more information.
Recommended ulimit Settings 
Every deployment may have unique requirements and settings; however, the following thresholds and settings are 
particularly important for mongod and mongos deployments: 
• -f (file size): unlimited 
• -t (cpu time): unlimited 
• -v (virtual memory): unlimited 92
• -n (open files): 64000
• -m (memory size): unlimited 93
• -u (processes/threads): 64000 
Always remember to restart your mongod and mongos instances after changing the ulimit settings to ensure that 
the changes take effect. 
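On systems using PAM, the equivalent settings can also be made persistent with a limits.conf drop-in. The following is a sketch only: the file name and the mongodb user name are assumptions; adjust both to your deployment.

```
# /etc/security/limits.d/99-mongodb.conf (hypothetical file name)
# <domain>  <type>  <item>   <value>
mongodb     soft    nofile   64000
mongodb     hard    nofile   64000
mongodb     soft    nproc    64000
mongodb     hard    nproc    64000
```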
Linux distributions using Upstart 
For Linux distributions that use Upstart, you can specify limits within service scripts if you start mongod and/or 
mongos instances as Upstart services. You can do this by using limit stanzas94. 
Specify the Recommended ulimit Settings (page 268), as in the following example: 
limit fsize unlimited unlimited # (file size) 
limit cpu unlimited unlimited # (cpu time) 
limit as unlimited unlimited # (virtual memory size) 
limit nofile 64000 64000 # (open files) 
limit nproc 64000 64000 # (processes/threads) 
Each limit stanza sets the “soft” limit to the first value specified and the “hard” limit to the second. 
After changing limit stanzas, ensure that the changes take effect by restarting the application services, using
the following form:
restart <service name> 
Linux distributions using systemd 
For Linux distributions that use systemd, you can specify limits within the [Service] sections of service scripts if you start mongod and/or mongos instances as systemd services. You can do this by using resource limit directives 95.
Specify the Recommended ulimit Settings (page 268), as in the following example: 
[Service] 
# Other directives omitted 
# (file size) 
LimitFSIZE=infinity 
# (cpu time) 
LimitCPU=infinity 
# (virtual memory size) 
LimitAS=infinity 
# (open files) 
LimitNOFILE=64000 
# (processes/threads) 
LimitNPROC=64000 
92 If you limit virtual or resident memory size on a system running MongoDB the operating system will refuse to honor additional allocation 
requests. 
93 The -m parameter to ulimit has no effect on Linux systems with kernel versions more recent than 2.4.30. You may omit -m if you wish. 
94http://upstart.ubuntu.com/wiki/Stanzas#limit 
95http://www.freedesktop.org/software/systemd/man/systemd.exec.html#LimitCPU= 
Each systemd limit directive sets both the “hard” and “soft” limits to the value specified. 
After changing limit directives, ensure that the changes take effect by restarting the application services, using
the following form:
systemctl restart <service name> 
/proc File System 
Note: This section applies only to Linux operating systems. 
The /proc file system stores the per-process limits in the file system object located at /proc/<pid>/limits, where <pid> is the process’s PID or process identifier. You can use the following bash function to return the content of the limits object for a process or processes with a given name:
return-limits(){
  for process in "$@"; do
    # Collect the PIDs of every running instance of the named process
    process_pids=$(ps -C "$process" -o pid --no-headers)
    if [ -z "$process_pids" ]; then
      echo "[no $process running]"
    else
      for pid in $process_pids; do
        echo "[$process #$pid -- limits]"
        cat "/proc/$pid/limits"
      done
    fi
  done
}
You can copy and paste this function into a current shell session or load it as part of a script. Call the function with one of the following invocations:
return-limits mongod 
return-limits mongos 
return-limits mongod mongos 
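For a quicker spot-check, the limits file of a single process can be read directly by PID; here $$ expands to the current shell’s PID, standing in for a mongod PID (Linux only):

```shell
# Print the open-files limit line for the current shell process.
# /proc is Linux-specific, so guard against its absence.
if [ -e "/proc/$$/limits" ]; then
  grep "Max open files" "/proc/$$/limits"
else
  echo "[/proc not available on this system]"
fi
```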
5.3.2 System Collections 
Synopsis 
MongoDB stores system information in collections that use the <database>.system.* namespace, which MongoDB reserves for internal use. Do not create collections that begin with system.
MongoDB also stores some additional instance-local metadata in the local database (page 598), specifically for replication purposes.
Collections 
System collections include these collections stored in the admin database: 
admin.system.roles 
New in version 2.6. 
The admin.system.roles (page 270) collection stores custom roles that administrators create and assign 
to users to provide access to specific resources. 
admin.system.users 
Changed in version 2.6. 
The admin.system.users (page 271) collection stores the user’s authentication credentials as well as any 
roles assigned to the user. Users may define authorization roles in the admin.system.roles (page 270) 
collection. 
admin.system.version 
New in version 2.6. 
Stores the schema version of the user credential documents. 
System collections also include these collections stored directly in each database: 
<database>.system.namespaces 
The <database>.system.namespaces (page 271) collection contains information about all of the 
database’s collections. Additional namespace metadata exists in the database.ns files and is opaque to 
database users. 
<database>.system.indexes
The <database>.system.indexes (page 271) collection lists all the indexes in the database. Add and remove data from this collection via the ensureIndex() and dropIndex() methods.
<database>.system.profile
The <database>.system.profile (page 271) collection stores database profiling information. For information on profiling, see Database Profiling (page 180).
<database>.system.js 
The <database>.system.js (page 271) collection holds special JavaScript code for use in server side 
JavaScript (page 249). See Store a JavaScript Function on the Server (page 217) for more information. 
5.3.3 Database Profiler Output 
The database profiler captures detailed information about read and write operations, cursor operations, and database commands. To configure the database profiler and set the thresholds for capturing profile data, see the Analyze Performance of Database Operations (page 210) section.
The database profiler writes data in the system.profile (page 271) collection, which is a capped collection. To 
view the profiler’s output, use normal MongoDB queries on the system.profile (page 271) collection. 
Note: Because the database profiler writes data to the system.profile (page 271) collection in a database, the 
profiler will profile some write activity, even for databases that are otherwise read-only. 
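For example, a query such as db.system.profile.find( { millis: { $gt: 100 } } ).sort( { ts: -1 } ) returns the slowest recent operations first. The selection and ordering it performs can be sketched in plain JavaScript over sample documents standing in for the capped collection:

```javascript
// Sample documents standing in for the capped system.profile collection.
const profile = [
  { op: "update", ns: "social.users", millis: 250, ts: 2 },
  { op: "query",  ns: "social.users", millis: 5,   ts: 3 },
  { op: "query",  ns: "social.posts", millis: 120, ts: 1 },
];

// Equivalent of: find({ millis: { $gt: 100 } }).sort({ ts: -1 })
const slow = profile
  .filter(d => d.millis > 100)   // { millis: { $gt: 100 } }
  .sort((a, b) => b.ts - a.ts);  // { ts: -1 }

console.log(slow.map(d => d.ns)); // [ "social.users", "social.posts" ]
```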
Example system.profile Document 
The documents in the system.profile (page 271) collection have the following form. This example document 
reflects an update operation: 
{ 
"ts" : ISODate("2012-12-10T19:31:28.977Z"), 
"op" : "update", 
"ns" : "social.users", 
"query" : { 
"name" : "j.r." 
}, 
"updateobj" : { 
"$set" : { 
"likes" : [ 
"basketball", 
"trekking" 
] 
} 
}, 
"nscanned" : 8, 
"scanAndOrder" : true, 
"moved" : true, 
"nmoved" : 1, 
"nupdated" : 1, 
"keyUpdates" : 0, 
"numYield" : 0, 
"lockStats" : { 
"timeLockedMicros" : { 
"r" : NumberLong(0), 
"w" : NumberLong(258) 
}, 
"timeAcquiringMicros" : { 
"r" : NumberLong(0), 
"w" : NumberLong(7) 
} 
}, 
"millis" : 0, 
"client" : "127.0.0.1", 
"user" : "" 
} 
Output Reference 
For any single operation, the documents created by the database profiler will include a subset of the following fields. 
The precise selection of fields in these documents depends on the type of operation. 
system.profile.ts 
The timestamp of the operation. 
system.profile.op 
The type of operation. The possible values are: 
•insert 
•query 
•update 
•remove 
•getmore 
•command 
system.profile.ns 
The namespace the operation targets. Namespaces in MongoDB take the form of the database, followed by a 
dot (.), followed by the name of the collection. 
system.profile.query 
The query document (page 87) used. 
system.profile.command 
The command operation. 
system.profile.updateobj 
The <update> document passed in during an update (page 67) operation. 
system.profile.cursorid 
The ID of the cursor accessed by a getmore operation. 
system.profile.ntoreturn 
Changed in version 2.2: In 2.0, MongoDB includes this field for query and command operations. In 2.2, MongoDB also includes this field for getmore operations.
The number of documents the operation specified to return. For example, the profile command would 
return one document (a results document) so the ntoreturn (page 273) value would be 1. The limit(5) 
command would return five documents so the ntoreturn (page 273) value would be 5. 
If the ntoreturn (page 273) value is 0, the command did not specify a number of documents to return, as 
would be the case with a simple find() command with no limit specified. 
system.profile.ntoskip 
New in version 2.2. 
The number of documents the skip() method specified to skip. 
system.profile.nscanned 
The number of documents that MongoDB scans in the index (page 431) in order to carry out the operation. 
In general, if nscanned (page 273) is much higher than nreturned (page 274), the database is scanning 
many objects to find the target objects. Consider creating an index to improve this. 
system.profile.scanAndOrder 
scanAndOrder (page 273) is a boolean that is true when a query cannot use the order of documents in the 
index for returning sorted results: MongoDB must sort the documents after it receives the documents from a 
cursor. 
If scanAndOrder (page 273) is false, MongoDB can use the order of the documents in an index to return 
sorted results. 
system.profile.moved 
This field appears with a value of true when an update operation moved one or more documents to a new 
location on disk. If the operation did not result in a move, this field does not appear. Operations that result in a 
move take more time than in-place updates and typically occur as a result of document growth. 
system.profile.nmoved 
New in version 2.2. 
The number of documents the operation moved on disk. This field appears only if the operation resulted in a 
move. The field’s implicit value is zero, and the field is present only when non-zero. 
system.profile.nupdated 
New in version 2.2. 
The number of documents updated by the operation. 
system.profile.keyUpdates 
New in version 2.2. 
The number of index (page 431) keys the update changed in the operation. Changing an index key carries a small performance cost because the database must remove the old key and insert a new key into the B-tree index.
system.profile.numYield 
New in version 2.2. 
The number of times the operation yielded to allow other operations to complete. Typically, operations yield 
when they need access to data that MongoDB has not yet fully read into memory. This allows other operations 
that have data in memory to complete while MongoDB reads in data for the yielding operation. For more 
information, see the FAQ on when operations yield (page 703). 
system.profile.lockStats 
New in version 2.2. 
The time in microseconds the operation spent acquiring and holding locks. This field reports data for the 
following lock types: 
•R - global read lock 
•W - global write lock 
•r - database-specific read lock 
•w - database-specific write lock 
system.profile.lockStats.timeLockedMicros 
The time in microseconds the operation held a specific lock. For operations that require more than one lock, like those that lock the local database to update the oplog, this value may be longer than the total length of the operation (i.e., millis (page 274)).
system.profile.lockStats.timeAcquiringMicros 
The time in microseconds the operation spent waiting to acquire a specific lock. 
system.profile.nreturned 
The number of documents returned by the operation. 
system.profile.responseLength 
The length in bytes of the operation’s result document. A large responseLength (page 274) can affect 
performance. To limit the size of the result document for a query operation, you can use any of the following: 
•Projections (page 94) 
•The limit() method 
•The batchSize() method 
Note: When MongoDB writes query profile information to the log, the responseLength (page 274) value 
is in a field named reslen. 
system.profile.millis 
The time in milliseconds from the perspective of the mongod from the beginning of the operation to the end of 
the operation. 
system.profile.client 
The IP address or hostname of the client connection where the operation originates. 
For some operations, such as db.eval(), the client is 0.0.0.0:0 instead of an actual client. 
system.profile.user 
The authenticated user who ran the operation. 
5.3.4 Journaling Mechanics 
When running with journaling, MongoDB stores and applies write operations (page 67) in memory and in the on-disk 
journal before the changes are present in the data files on disk. This document discusses the implementation and 
mechanics of journaling in MongoDB systems. See Manage Journaling (page 215) for information on configuring, 
tuning, and managing journaling. 
Journal Files 
With journaling enabled, MongoDB creates a journal subdirectory within the directory defined by dbPath, which is 
/data/db by default. The journal directory holds journal files, which contain write-ahead redo logs. The directory 
also holds a last-sequence-number file. A clean shutdown removes all the files in the journal directory. A dirty shutdown (crash) leaves files in the journal directory; these are used to automatically recover the database to a consistent state when the mongod process is restarted.
Journal files are append-only files and have file names prefixed with j._. When a journal file holds 1 gigabyte of data, 
MongoDB creates a new journal file. Once MongoDB applies all the write operations in a particular journal file to the 
database data files, it deletes the file, as it is no longer needed for recovery purposes. Unless you write many bytes of 
data per second, the journal directory should contain only two or three journal files. 
You can use the storage.smallFiles run time option when starting mongod to limit the size of each journal 
file to 128 megabytes, if you prefer. 
To speed the frequent sequential writes that occur to the current journal file, you can ensure that the journal directory 
is on a different filesystem from the database data files. 
Important: If you place the journal on a different filesystem from your data files you cannot use a filesystem snapshot 
alone to capture valid backups of a dbPath directory. In this case, use fsyncLock() to ensure that database files 
are consistent before the snapshot and fsyncUnlock() once the snapshot is complete. 
Note: Depending on your filesystem, you might experience a preallocation lag the first time you start a mongod 
instance with journaling enabled. 
MongoDB may preallocate journal files if the mongod process determines that it is more efficient to preallocate journal files than create new journal files as needed. The preallocation lag might last several minutes, during which you will not be able to connect to the database. This is a one-time preallocation and does not occur with future invocations.
To avoid preallocation lag, see Avoid Preallocation Lag (page 216). 
Storage Views used in Journaling 
Journaling adds three internal storage views to MongoDB. 
The shared view stores modified data for upload to the MongoDB data files. The shared view is the only view 
with direct access to the MongoDB data files. When running with journaling, mongod asks the operating system to 
map your existing on-disk data files to the shared view virtual memory view. The operating system maps the files 
but does not load them. MongoDB later loads data files into the shared view as needed. 
The private view stores data for use with read operations (page 55). The private view is the first place 
MongoDB applies new write operations (page 67). Upon a journal commit, MongoDB copies the changes made in 
the private view to the shared view, where they are then available for uploading to the database data files. 
The journal is an on-disk view that stores new write operations after MongoDB applies the operation to the private 
view but before applying them to the data files. The journal provides durability. If the mongod instance were to 
crash without having applied the writes to the data files, the journal could replay the writes to the shared view for 
eventual upload to the data files. 
How Journaling Records Write Operations 
MongoDB copies the write operations to the journal in batches called group commits. These “group commits” help 
minimize the performance impact of journaling, since a group commit must block all writers during the commit. See 
commitIntervalMs for information on the default commit interval. 
Journaling stores raw operations that allow MongoDB to reconstruct the following: 
• document insertion/updates 
• index modifications 
• metadata changes to the namespace files 
• creation and dropping of databases and their associated data files 
As write operations (page 67) occur, MongoDB writes the data to the private view in RAM and then copies the 
write operations in batches to the journal. The journal stores the operations on disk to ensure durability. Each journal 
entry describes the bytes the write operation changed in the data files. 
MongoDB next applies the journal’s write operations to the shared view. At this point, the shared view 
becomes inconsistent with the data files. 
At default intervals of 60 seconds, MongoDB asks the operating system to flush the shared view to disk. This 
brings the data files up-to-date with the latest write operations. The operating system may choose to flush the shared 
view to disk at a higher frequency than 60 seconds, particularly if the system is low on free memory. 
When MongoDB flushes write operations to the data files, MongoDB notes which journal writes have been flushed. 
Once a journal file contains only flushed writes, it is no longer needed for recovery, and MongoDB either deletes it or 
recycles it for a new journal file. 
As part of journaling, MongoDB routinely asks the operating system to remap the shared view to the private 
view, in order to save physical RAM. Upon a new remapping, the operating system knows that physical memory 
pages can be shared between the shared view and the private view mappings. 
Note: The interaction between the shared view and the on-disk data files is similar to how MongoDB works 
without journaling, which is that MongoDB asks the operating system to flush in-memory changes back to the data 
files every 60 seconds. 
5.3.5 Exit Codes and Statuses 
MongoDB will return one of the following codes and statuses when exiting. Use this guide to interpret logs and to troubleshoot issues with mongod and mongos instances.
0 
Returned by MongoDB applications upon successful exit. 
2 
The specified options are in error or are incompatible with other options. 
3 
Returned by mongod if there is a mismatch between hostnames specified on the command line and in the local.sources (page 600) collection. mongod may also return this status if the oplog collection in the local database is not readable.
4 
The version of the database is different from the version supported by the mongod (or mongod.exe) instance. 
The instance exits cleanly. Restart mongod with the --upgrade option to upgrade the database to the version 
supported by this mongod instance. 
5 
Returned by mongod if a moveChunk operation fails to confirm a commit. 
12 
Returned by the mongod.exe process on Windows when it receives a Control-C, Close, Break or Shutdown 
event. 
14 
Returned by MongoDB applications which encounter an unrecoverable error, an uncaught exception or uncaught 
signal. The system exits without performing a clean shut down. 
20 
Message: ERROR: wsastartup failed <reason> 
Returned by MongoDB applications on Windows following an error in the WSAStartup function. 
Message: NT Service Error
Returned by MongoDB applications for Windows due to failures installing, starting or removing the NT Service for the application.
45 
Returned when a MongoDB application cannot open a file or cannot obtain a lock on a file. 
47 
MongoDB applications exit cleanly following a large clock skew (32768 milliseconds) event. 
48 
mongod exits cleanly if the server socket closes. The server socket is on port 27017 by default, or as specified 
to the --port run-time option. 
49 
Returned by mongod.exe or mongos.exe on Windows when either receives a shutdown message from the 
Windows Service Control Manager. 
100 
Returned by mongod when the process throws an uncaught exception. 
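A supervision script can branch on these codes to decide whether a restart is safe. The sketch below simulates the clean socket-close status (48) with a stub in place of a real mongod invocation:

```shell
# Stub standing in for a real invocation such as:
#   mongod --config /etc/mongod.conf
sh -c 'exit 48'
status=$?

case "$status" in
  0)  echo "clean exit" ;;
  48) echo "server socket closed (status 48); safe to restart" ;;
  *)  echo "unexpected exit status: $status" ;;
esac
```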
CHAPTER 6 
Security 
This section outlines basic security and risk management strategies and access control. The included tutorials outline 
specific tasks for configuring firewalls, authentication, and system privileges. 
Security Introduction (page 279) A high-level introduction to security and MongoDB deployments. 
Security Concepts (page 281) The core documentation of security. 
Authentication (page 282) Mechanisms for verifying user and instance access to MongoDB. 
Authorization (page 285) Control access to MongoDB instances using authorization. 
Network Exposure and Security (page 288) Discusses potential security risks related to the network and strategies for decreasing possible network-based attack vectors for MongoDB. 
Continue reading from Security Concepts (page 281) for additional documentation of MongoDB’s security 
features and operation. 
Security Tutorials (page 294) Tutorials for enabling and configuring security features for MongoDB. 
Security Checklist (page 295) A high-level overview of global security considerations for administrators of 
MongoDB deployments. Use this checklist if you are new to deploying MongoDB in production and 
want to implement high quality security practices. 
Network Security Tutorials (page 297) Ensure that the underlying network configuration supports a secure operating environment for MongoDB deployments, and appropriately limits access to MongoDB deployments. 
Access Control Tutorials (page 316) These tutorials describe procedures relevant for the configuration, operation, and maintenance of MongoDB’s access control system. 
User and Role Management Tutorials (page 342) MongoDB’s access control system provides a flexible role-based access control system that you can use to limit access to MongoDB deployments. The tutorials in this section describe the configuration and setup of the authorization system. 
Continue reading from Security Tutorials (page 294) for additional tutorials that address the use and management 
of secure MongoDB deployments. 
Create a Vulnerability Report (page 359) Report a vulnerability in MongoDB. 
Security Reference (page 360) Reference for security related functions. 
6.1 Security Introduction 
Maintaining a secure MongoDB deployment requires administrators to implement controls to ensure that users and 
applications have access to only the data that they require. MongoDB provides features that allow administrators to 
implement these controls and restrictions for any MongoDB deployment. 
If you are already familiar with security and MongoDB security practices, consider the Security Checklist (page 295) 
for a collection of recommended actions to protect a MongoDB deployment. 
6.1.1 Authentication 
Before gaining access to a system, all clients should identify themselves to MongoDB. This ensures that no client can 
access the data stored in MongoDB without being explicitly allowed. 
MongoDB supports a number of authentication mechanisms (page 282) that clients can use to verify their identity. 
MongoDB supports two mechanisms: a password-based challenge and response protocol and x.509 certificates. Additionally, MongoDB Enterprise1 also provides support for LDAP proxy authentication (page 283) and Kerberos authentication (page 283). 
See Authentication (page 282) for more information. 
6.1.2 Role Based Access Control 
Access control, i.e. authorization (page 285), determines a user’s access to resources and operations. Clients should 
only be able to perform the operations required to fulfill their approved functions. This is the “principle of least 
privilege” and limits the potential risk of a compromised application. 
MongoDB’s role-based access control system allows administrators to control all access and ensure that all granted 
access applies as narrowly as possible. MongoDB does not enable authorization by default. When you enable authorization (page 285), MongoDB will require authentication for all connections. 
When authorization is enabled, MongoDB controls a user’s access through the roles assigned to the user. A role 
consists of a set of privileges, where a privilege consists of actions, or a set of operations, and a resource upon which 
the actions are allowed. 
Users may have one or more roles that describe their access. MongoDB provides several built-in roles (page 361) and 
users can construct specific roles tailored to clients’ actual requirements. 
See Authorization (page 285) for more information. 
6.1.3 Auditing 
Auditing provides administrators with the ability to verify that the implemented security policies are controlling activity in the system. Retaining audit information ensures that administrators have enough information to perform forensic investigations and comply with regulations and policies that require audit data. 
See Auditing (page 290) for more information. 
6.1.4 Encryption 
Transport Encryption 
You can use SSL to encrypt all of MongoDB’s network traffic. SSL ensures that MongoDB network traffic is only 
readable by the intended client. 
See Configure mongod and mongos for SSL (page 304) for more information. 
1http://www.mongodb.com/products/mongodb-enterprise 
Encryption at Rest 
There are two broad classes of approaches to encrypting data at rest with MongoDB. You can use these solutions 
together or independently: 
Application Level Encryption 
Provide encryption on a per-field or per-document basis within the application layer. To encrypt document or field 
level data, write custom encryption and decryption routines or use a commercial solution such as the Vormetric Data 
Security Platform2. 
Storage Encryption 
Encrypt all MongoDB data on the storage or operating system to ensure that only authorized processes can access 
protected data. A number of third-party libraries can integrate with the operating system to provide transparent disk-level 
encryption. For example: 
Linux Unified Key Setup (LUKS) LUKS is available for most Linux distributions. For configuration explanation, 
see the LUKS documentation from Red Hat3. 
IBM Guardium Data Encryption IBM Guardium Data Encryption4 provides support for disk-level encryption for 
Linux and Windows operating systems. 
Vormetric Data Security Platform The Vormetric Data Security Platform5 provides disk and file-level encryption in 
addition to application level encryption. 
BitLocker Drive Encryption BitLocker Drive Encryption6 is a feature available on Windows Server 2008 and 2012 
that provides disk encryption. 
Properly configured disk encryption, when used alongside good security policies that protect relevant accounts, passwords, and encryption keys, can help ensure compliance with standards, including HIPAA, PCI-DSS, and FERPA. 
6.1.5 Hardening Deployments and Environments 
In addition to implementing controls within MongoDB, you should also place controls around MongoDB to reduce 
the risk exposure of the entire MongoDB system. This is a defense in depth strategy. 
Hardening MongoDB extends the ideas of least privilege, auditing, and encryption outside of MongoDB. Reducing 
risk includes: configuring the network rules to ensure that only trusted hosts have access to MongoDB, and that the 
MongoDB processes only have access to the parts of the filesystem required for operation. 
6.2 Security Concepts 
These documents introduce and address concepts and strategies related to security practices in MongoDB deployments. 
Authentication (page 282) Mechanisms for verifying user and instance access to MongoDB. 
Authorization (page 285) Control access to MongoDB instances using authorization. 
2http://www.vormetric.com/sites/default/files/sb-MongoDB-Letter-2014-0611.pdf 
3https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Security_Guide/sect-Security_Guide- 
LUKS_Disk_Encryption.html 
4http://www-03.ibm.com/software/products/en/infosphere-guardium-data-encryption 
5http://www.vormetric.com/sites/default/files/sb-MongoDB-Letter-2014-0611.pdf 
6http://technet.microsoft.com/en-us/library/hh831713.aspx 
Collection-Level Access Control (page 287) Scope privileges to specific collections. 
Network Exposure and Security (page 288) Discusses potential security risks related to the network and strategies 
for decreasing possible network-based attack vectors for MongoDB. 
Security and MongoDB API Interfaces (page 289) Discusses potential risks related to MongoDB’s JavaScript, 
HTTP and REST interfaces, including strategies to control those risks. 
Auditing (page 290) Audit server and client activity for mongod and mongos instances. 
Kerberos Authentication (page 291) Kerberos authentication and MongoDB. 
6.2.1 Authentication 
Authentication is the process of verifying the identity of a client. When access control, i.e. authorization (page 285), 
is enabled, MongoDB requires all clients to authenticate themselves first in order to determine the access for the client. 
Although authentication and authorization (page 285) are closely connected, authentication is distinct from authorization. Authentication verifies the identity of a user; authorization determines the verified user’s access to resources and 
operations. 
MongoDB supports a number of authentication mechanisms (page 282) that clients can use to verify their identity. 
These mechanisms allow MongoDB to integrate into your existing authentication system. See Authentication Mechanisms (page 282) for details. 
In addition to verifying the identity of a client, MongoDB can require members of replica sets and sharded clusters to 
authenticate their membership (page 284) to their respective replica set or sharded cluster. See Authentication Between 
MongoDB Instances (page 284) for more information. 
Client Users 
To authenticate a client in MongoDB, you must add a corresponding user to MongoDB. When adding a user, you 
create the user in a specific database. Together, the user’s name and database serve as a unique identifier for that 
user. That is, if two users have the same name but are created in different databases, they are two separate users. To 
authenticate, the client must authenticate the user against the user’s database. For instance, if using the mongo shell 
as a client, you can specify the database for the user with the --authenticationDatabase option. 
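For example, a hypothetical user alice created in the admin database would authenticate against that database; the hostname and user name here are illustrative:

```shell
# Connect and authenticate alice against the admin database
# (mongo prompts for the password when --password is given no value)
mongo --host mongodb0.example.net --username alice --password --authenticationDatabase admin
```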
To add and manage user information, MongoDB provides the db.createUser() method as well as other user 
management methods. For an example of adding a user to MongoDB, see Add a User to a Database (page 344). 
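As an illustrative sketch, run from the mongo shell; the database name, user name, password, and role assignment are all placeholders:

```javascript
// Create a user in a hypothetical "records" database
var recordsDB = db.getSiblingDB("records")
recordsDB.createUser(
  {
    user: "appUser",
    pwd: "changeThisPassword",
    roles: [ { role: "readWrite", db: "records" } ]
  }
)
```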
MongoDB stores all user information, including name (page 372), password (page 372), and the user’s 
database (page 372), in the system.users (page 372) collection in the admin database. 
Authentication Mechanisms 
MongoDB supports multiple authentication mechanisms. MongoDB’s default authentication method is a challenge 
and response mechanism (MONGODB-CR) (page 283). MongoDB also supports x509 certificate authentication 
(page 283), LDAP proxy authentication (page 283), and Kerberos authentication (page 283). 
This section introduces the mechanisms available in MongoDB. 
To specify the authentication mechanism to use, see authenticationMechanisms. 
MONGODB-CR Authentication 
MONGODB-CR is a challenge-response mechanism that authenticates users through passwords. MONGODB-CR is the 
default mechanism. 
When you use MONGODB-CR authentication, MONGODB-CR verifies the user against the user’s name (page 372), 
password (page 372) and database (page 372). The user’s database is the database where the user was created, 
and the user’s database and the user’s name together serve to identify the user. 
Using key files, you can also use MONGODB-CR authentication for the internal member authentication (page 284) 
of replica set members and sharded cluster members. The contents of the key files serve as the shared password for 
the members. You must store the key file on each mongod or mongos instance for that replica set or sharded cluster. 
The content of the key file is arbitrary but must be the same on all mongod and mongos instances that connect to 
each other. 
See Generate a Key File (page 338) for instructions on generating a key file and turning on key file authentication for 
members. 
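A common way to produce such a key file, sketched here with openssl (the paths and replica set name are illustrative):

```shell
# Generate arbitrary key file content and restrict its permissions
openssl rand -base64 741 > /srv/mongodb/keyfile
chmod 600 /srv/mongodb/keyfile

# Start each member with the same key file
mongod --keyFile /srv/mongodb/keyfile --replSet rs0
```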
x.509 Certificate Authentication 
New in version 2.6. 
MongoDB supports x.509 certificate authentication for use with a secure SSL connection (page 304). 
To authenticate to servers, clients can use x.509 certificates instead of usernames and passwords. See Client x.509 
Certificate (page 321) for more information. 
For membership authentication, members of sharded clusters and replica sets can use x.509 certificates instead of key 
files. See Use x.509 Certificate for Membership Authentication (page 323) for more information. 
Kerberos Authentication 
MongoDB Enterprise7 supports authentication using a Kerberos service. Kerberos is an industry standard authentication protocol for large client/server systems. 
To use MongoDB with Kerberos, you must have a properly configured Kerberos deployment, configured Kerberos 
service principals (page 292) for MongoDB, and added a Kerberos user principal (page 292) to MongoDB. 
See Kerberos Authentication (page 291) for more information on Kerberos and MongoDB. To configure MongoDB to 
use Kerberos authentication, see Configure MongoDB with Kerberos Authentication on Linux (page 331) and Configure 
MongoDB with Kerberos Authentication on Windows (page 334). 
LDAP Proxy Authority Authentication 
MongoDB Enterprise8 supports proxy authentication through a Lightweight Directory Access Protocol (LDAP) service. See Authenticate Using SASL and LDAP with OpenLDAP (page 329) and Authenticate Using SASL and LDAP 
with ActiveDirectory (page 326). 
MongoDB Enterprise for Windows does not include LDAP support for authentication. However, MongoDB Enterprise 
for Linux supports using LDAP authentication with an Active Directory server. 
MongoDB does not support LDAP authentication in mixed sharded cluster deployments that contain both version 2.4 
and version 2.6 shards. 
7http://www.mongodb.com/products/mongodb-enterprise 
8http://www.mongodb.com/products/mongodb-enterprise 
Authentication Behavior 
Client Authentication 
Clients can authenticate using the challenge and response (page 283), x.509 (page 283), LDAP Proxy (page 283) and 
Kerberos (page 283) mechanisms. 
Each client connection should authenticate as exactly one user. If a client authenticates to a database as one user and 
later authenticates to the same database as a different user, the second authentication invalidates the first. While clients 
can authenticate as multiple users if the users are defined on different databases, we recommend authenticating as one 
user at a time, providing the user with appropriate privileges on the databases required by the user. 
See Authenticate to a MongoDB Instance or Cluster (page 336) for more information. 
Authentication Between MongoDB Instances 
You can authenticate members of replica sets and sharded clusters. To authenticate members of a single MongoDB 
deployment to each other, MongoDB can use the keyFile and x.509 (page 283) mechanisms. Using keyFile 
authentication for members also enables authorization. 
Always run replica sets and sharded clusters in a trusted networking environment. Ensure that the network permits 
only trusted traffic to reach each mongod and mongos instance. 
Use your environment’s firewall and network routing to ensure that traffic only from clients and other members can 
reach your mongod and mongos instances. If needed, use virtual private networks (VPNs) to ensure secure connec-tions 
over wide area networks (WANs). 
Always ensure that: 
• Your network configuration will allow every member of the replica set or sharded cluster to contact every other 
member. 
• If you use MongoDB’s authentication system to limit access to your infrastructure, ensure that you configure a 
keyFile on all members to permit authentication. 
See Generate a Key File (page 338) for instructions on generating a key file and turning on key file authentication for 
members. For an example of using key files for sharded cluster authentication, see Enable Authentication in a Sharded 
Cluster (page 318). 
Authentication on Sharded Clusters 
In sharded clusters, applications authenticate directly to mongos instances, using credentials stored in the admin 
database of the config servers. The shards in the sharded cluster also have credentials, and clients can authenticate 
directly to the shards to perform maintenance directly on the shards. In general, applications and clients should connect 
to the sharded cluster through the mongos. 
Changed in version 2.6: Previously, the credentials for authenticating to a database on a cluster resided on the primary 
shard (page 615) for that database. 
Some maintenance operations, such as cleanupOrphaned, compact, rs.reconfig(), require direct connections to specific shards in a sharded cluster. To perform these operations with authentication enabled, you must connect 
directly to the shard and authenticate as a shard local administrative user. To create a shard local administrative user, 
connect directly to the shard and create the user. MongoDB stores shard local users in the admin database of the shard 
itself. These shard local users are completely independent from the users added to the sharded cluster via mongos. 
Shard local users are local to the shard and are inaccessible by mongos. Direct connections to a shard should only be 
for shard-specific maintenance and configuration. 
Localhost Exception 
The localhost exception allows you to enable authorization before creating the first user in the system. When active, 
the localhost exception allows all connections from the localhost interface to have full access to that instance. The 
exception applies only when there are no users created in the MongoDB instance. 
If you use the localhost exception when deploying a new MongoDB system, the first user you create must be 
in the admin database with privileges to create other users, such as a user with the userAdmin (page 363) or 
userAdminAnyDatabase (page 368) role. See Enable Client Access Control (page 317) and Create a User Administrator (page 343) for more information. 
In the case of a sharded cluster, the localhost exception can apply to the cluster as a whole or separately to each shard. 
The localhost exception can apply to the cluster as a whole if there is no user information stored on the config servers 
and clients access via mongos instances. 
The localhost exception can apply separately to each shard if there is no user information stored on the shard itself and 
clients connect to the shard directly. 
To prevent unauthorized access to a cluster’s shards, you must either create an administrator on each shard 
or disable the localhost exception. To disable the localhost exception, use setParameter to set the 
enableLocalhostAuthBypass parameter to 0 during startup. 
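For example, a minimal sketch of disabling the localhost exception at startup (combine with your other startup options):

```shell
mongod --setParameter enableLocalhostAuthBypass=0
```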
6.2.2 Authorization 
MongoDB employs Role-Based Access Control (RBAC) to govern access to a MongoDB system. A user is granted 
one or more roles (page 285) that determine the user’s access to database resources and operations. Outside of role 
assignments, the user has no access to the system. 
MongoDB does not enable authorization by default. You can enable authorization using the --auth or 
the --keyFile options, or if using a configuration file, with the security.authorization or the 
security.keyFile settings. 
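As a sketch, either form below enables authorization; the port and dbpath values are illustrative:

```shell
# Command line:
mongod --auth --port 27017 --dbpath /srv/mongodb

# Or equivalently, in a YAML configuration file passed via --config:
#   security:
#     authorization: enabled
```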
MongoDB provides built-in roles (page 361), each with a dedicated purpose for a common use case. Examples include 
the read (page 362), readWrite (page 362), dbAdmin (page 363), and root (page 368) roles. 
Administrators also can create new roles and privileges to cater to operational needs. Administrators can assign 
privileges scoped as granularly as the collection level. 
When granted a role, a user receives all the privileges of that role. A user can have several roles concurrently, in which 
case the user receives the union of all the privileges of the respective roles. 
Roles 
A role consists of privileges that pair resources with allowed operations. Each privilege is defined directly in the role 
or inherited from another role. 
A role’s privileges apply to the database where the role is created. A role created on the admin database can include 
privileges that apply to all databases or to the cluster (page 374). 
A user assigned a role receives all the privileges of that role. The user can have multiple roles and can have different 
roles on different databases. 
Roles always grant privileges and never limit access. For example, if a user has both read (page 362) and 
readWriteAnyDatabase (page 368) roles on a database, the greater access prevails. 
Privileges 
A privilege consists of a specified resource and the actions permitted on the resource. 
A privilege resource (page 373) is either a database, collection, set of collections, or the cluster. If the cluster, the 
affiliated actions affect the state of the system rather than a specific database or collection. 
An action (page 375) is a command or method the user is allowed to perform on the resource. A resource can have 
multiple allowed actions. For available actions see Privilege Actions (page 375). 
For example, a privilege that includes the update (page 375) action allows a user to modify existing documents on 
the resource. To additionally grant the user permission to create documents on the resource, the administrator would 
add the insert (page 375) action to the privilege. 
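For instance, an administrator could extend an existing role with the insert action using db.grantPrivilegesToRole(); the role name and namespace here are hypothetical:

```javascript
// From the mongo shell, against the database where the role was created
var productsDB = db.getSiblingDB("products")
productsDB.grantPrivilegesToRole(
  "inventoryEditor",
  [
    { resource: { db: "products", collection: "inventory" },
      actions: [ "insert" ] }
  ]
)
```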
For privilege syntax, see admin.system.roles.privileges (page 370). 
Inherited Privileges 
A role can include one or more existing roles in its definition, in which case the role inherits all the privileges of the 
included roles. 
A role can inherit privileges from other roles in its database. A role created on the admin database can inherit 
privileges from roles in any database. 
User-Defined Roles 
New in version 2.6. 
User administrators can create custom roles to ensure collection-level and command-level granularity and to adhere to 
the policy of least privilege. Administrators create and edit roles using the role management commands. 
MongoDB scopes a user-defined role to the database in which it is created and uniquely identifies the role by the 
pairing of its name and its database. MongoDB stores the roles in the admin database’s system.roles (page 369) 
collection. Do not access this collection directly but instead use the role management commands to view and edit 
custom roles. 
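A minimal sketch of creating a custom role with db.createRole(); the role name and privilege are illustrative, and the roles array (empty here) is where inherited roles would be listed:

```javascript
// From the mongo shell — a role created in the admin database
var adminDB = db.getSiblingDB("admin")
adminDB.createRole(
  {
    role: "serverStatusMonitor",
    privileges: [
      { resource: { cluster: true }, actions: [ "serverStatus" ] }
    ],
    roles: []   // no inherited roles in this sketch
  }
)
```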
Collection-Level Access Control 
By creating a role with privileges (page 286) that are scoped to a specific collection in a particular database, administrators can implement collection-level access control. 
See Collection-Level Access Control (page 287) for more information. 
Users 
MongoDB stores user credentials in the protected admin.system.users (page 271). Use the user management 
methods to view and edit user credentials. 
Role Assignment to Users 
User administrators create the users that access the system’s databases. MongoDB’s user management commands let 
administrators create users and assign them roles. 
MongoDB scopes a user to the database in which the user is created. MongoDB stores all user definitions in the admin 
database, no matter which database the user is scoped to. MongoDB stores users in the admin database’s system.users 
collection (page 372). Do not access this collection directly but instead use the user management commands. 
The first role assigned in a database should be either userAdmin (page 363) or userAdminAnyDatabase 
(page 368). This user can then create all other users in the system. See Create a User Administrator (page 343). 
Protect the User and Role Collections 
MongoDB stores role and user data in the protected admin.system.roles (page 270) and 
admin.system.users (page 271) collections, which are only accessible using the user management methods. 
If you disable access control, do not modify the admin.system.roles (page 270) and admin.system.users 
(page 271) collections using normal insert() and update() operations. 
Additional Information 
See the reference section for documentation of all built-in roles (page 361) and all available privilege actions 
(page 375). Also consider the reference for the form of the resource documents (page 373). 
To create users see the Create a User Administrator (page 343) and Add a User to a Database (page 344) tutorials. 
6.2.3 Collection-Level Access Control 
Collection-level access control allows administrators to grant users privileges that are scoped to specific collections. 
Administrators can implement collection-level access control through user-defined roles (page 286). By creating a role 
with privileges (page 286) that are scoped to a specific collection in a particular database, administrators can provision 
users with roles that grant privileges on a collection level. 
Privileges and Scope 
A privilege consists of actions (page 375) and the resources (page 373) upon which the actions are permissible; i.e. 
the resources define the scope of the actions for that privilege. 
By specifying both the database and the collection in the resource document (page 373) for a privilege, administrators 
can limit the privilege actions just to a specific collection in a specific database. Each privilege action in a role can be 
scoped to a different collection. 
For example, a user-defined role can contain the following privileges: 
privileges: [ 
{ resource: { db: "products", collection: "inventory" }, actions: [ "find", "update", "insert" ] }, 
{ resource: { db: "products", collection: "orders" }, actions: [ "find" ] } 
] 
The first privilege scopes its actions to the inventory collection of the products database. The second privilege 
scopes its actions to the orders collection of the products database. 
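Continuing the example, an administrator might define such a role and assign it roughly as follows; the role and user names are hypothetical:

```javascript
// From the mongo shell — define the role in the products database...
var productsDB = db.getSiblingDB("products")
productsDB.createRole(
  {
    role: "inventoryManager",
    privileges: [
      { resource: { db: "products", collection: "inventory" },
        actions: [ "find", "update", "insert" ] },
      { resource: { db: "products", collection: "orders" },
        actions: [ "find" ] }
    ],
    roles: []
  }
)
// ...then grant it to an existing (hypothetical) user
productsDB.grantRolesToUser( "appUser", [ { role: "inventoryManager", db: "products" } ] )
```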
Additional Information 
For more information on user-defined roles and MongoDB authorization model, see Authorization (page 285). For a 
tutorial on creating user-defined roles, see Create a Role (page 347). 
6.2.4 Network Exposure and Security 
By default, MongoDB programs (i.e. mongos and mongod) will bind to all available network interfaces (i.e. IP 
addresses) on a system. 
This page outlines various runtime options that allow you to limit access to MongoDB programs. 
Configuration Options 
You can limit the network exposure with the following mongod and mongos configuration options: 
net.http.enabled, net.http.RESTInterfaceEnabled, bindIp, and port. You can use a configuration file to specify 
these settings. 
nohttpinterface 
The net.http.enabled setting for mongod and mongos instances disables the “home” status page when set to false. 
Changed in version 2.6: The mongod and mongos instances run with the http interface disabled by default. 
The status interface is read-only by default, and the default port for the status page is 28017. Authentication does not 
control or affect access to this interface. 
Important: Disable this interface for production deployments. If you enable this interface, you should only allow 
trusted clients to access this port. See Firewalls (page 289). 
rest 
The net.http.RESTInterfaceEnabled setting for mongod enables a fully interactive administrative REST 
interface, which is disabled by default. The net.http.RESTInterfaceEnabled configuration makes the http 
status interface9, which is read-only by default, fully interactive. Use the net.http.RESTInterfaceEnabled 
setting with the net.http.enabled setting. 
The REST interface does not support any authentication and you should always restrict access to this interface to only 
allow trusted clients to connect to this port. 
You may also enable this interface on the command line as mongod --rest --httpinterface. 
Important: Disable this option for production deployments. If you do leave this interface enabled, you should only 
allow trusted clients to access this port. 
bind_ip 
The bindIp setting for mongod and mongos instances limits the network interfaces on which MongoDB programs 
will listen for incoming connections. You can also specify a number of interfaces by passing bindIp a comma-separated 
list of IP addresses. You can use the mongod --bind_ip and mongos --bind_ip options on the 
command line at run time to limit the network accessibility of a MongoDB program. 
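For example, to restrict a mongod to the loopback interface and one private address (both addresses are illustrative):

```shell
mongod --bind_ip 127.0.0.1,10.8.0.10 --port 27017
```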
Important: Make sure that your mongod and mongos instances are only accessible on trusted networks. If your 
system has more than one network interface, bind MongoDB programs to the private or internal network interface. 
9 Starting in version 2.6, http interface is disabled by default. 
port 
The port setting for mongod and mongos instances changes the main port on which the mongod or mongos 
instance listens for connections. The default port is 27017. Changing the port does not meaningfully reduce risk or 
limit exposure. You may also specify this option on the command line as mongod --port or mongos --port. 
Setting port also indirectly sets the port for the HTTP status interface, which is always available on the port numbered 
1000 greater than the primary mongod port. 
Only allow trusted clients to connect to the port for the mongod and mongos instances. See Firewalls (page 289). 
See also Security Considerations (page 184) and Default MongoDB Port (page 380). 
Firewalls 
Firewalls allow administrators to filter and control access to a system by providing granular control over network 
communications. For administrators of MongoDB, the following capabilities are important: limiting incoming traffic 
on a specific port to specific systems, and limiting incoming traffic from untrusted hosts. 
On Linux systems, the iptables interface provides access to the underlying netfilter firewall. On Windows 
systems, the netsh command line interface provides access to the underlying Windows Firewall. For additional information about firewall configuration, see Configure Linux iptables Firewall for MongoDB (page 297) and Configure 
Windows netsh Firewall for MongoDB (page 300). 
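As a sketch of the iptables approach on Linux (the application server address 10.8.0.5 is hypothetical):

```shell
# Allow an application server to reach mongod on the default port
iptables -A INPUT -s 10.8.0.5 -p tcp --destination-port 27017 -m state --state NEW,ESTABLISHED -j ACCEPT
# Allow mongod's replies back to that server
iptables -A OUTPUT -d 10.8.0.5 -p tcp --source-port 27017 -m state --state ESTABLISHED -j ACCEPT
```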
For best results and to minimize overall exposure, ensure that only traffic from trusted sources can reach mongod and 
mongos instances and that the mongod and mongos instances can only connect to trusted outputs. 
See also: 
For MongoDB deployments on Amazon’s web services, see the Amazon EC210 page, which addresses Amazon’s 
Security Groups and other EC2-specific security features. 
Virtual Private Networks 
Virtual private networks, or VPNs, make it possible to link two networks over an encrypted and limited-access trusted 
network. Typically MongoDB users who use VPNs use SSL rather than IPSEC VPNs for performance reasons. 
Depending on configuration and implementation, VPNs provide for certificate validation and a choice of encryption 
protocols, which requires a rigorous level of authentication and identification of all clients. Furthermore, because 
VPNs provide a secure tunnel, by using a VPN connection to control access to your MongoDB instance, you can 
prevent tampering and “man-in-the-middle” attacks. 
6.2.5 Security and MongoDB API Interfaces 
The following section contains strategies to limit risks related to MongoDB’s available interfaces including JavaScript, 
HTTP, and REST interfaces. 
JavaScript and the Security of the mongo Shell 
The following JavaScript evaluation behaviors of the mongo shell represent risk exposures. 
10http://docs.mongodb.org/ecosystem/platforms/amazon-ec2 
6.2. Security Concepts 289
MongoDB Documentation, Release 2.6.4 
JavaScript Expression or JavaScript File 
The mongo program can evaluate JavaScript expressions using the command line --eval option. Also, the mongo 
program can evaluate a JavaScript file (.js) passed directly to it (e.g. mongo someFile.js). 
Because the mongo program evaluates the JavaScript directly, inputs should only come from trusted sources. 
.mongorc.js File 
If a .mongorc.js file exists 11, the mongo shell evaluates it before starting. You can disable 
this behavior by passing the mongo --norc option. 
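For illustration, a .mongorc.js might hold harmless shell customizations; the contents below are hypothetical, and --norc skips the file entirely:

```javascript
// ~/.mongorc.js -- hypothetical example contents
// Customize the shell prompt to show the current database.
prompt = function() { return db + "> "; };
```

Because the shell evaluates this file as JavaScript, treat it with the same caution as any other script input; start the shell with mongo --norc to skip it.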
HTTP Status Interface 
The HTTP status interface provides a web-based interface that includes a variety of operational data, logs, and status 
reports regarding the mongod or mongos instance. The HTTP interface is always available on the port numbered 
1000 greater than the primary mongod port. By default, the HTTP interface port is 28017, but is indirectly set using 
the port option which allows you to configure the primary mongod port. 
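As a quick sanity check, the status-interface port can be derived from the configured primary port; this sketch assumes the default port:

```shell
port=27017                  # the configured primary mongod port
http_port=$((port + 1000))  # HTTP status interface listens 1000 above it
echo "$http_port"           # prints 28017 for the default port
```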
Without the net.http.RESTInterfaceEnabled setting, this interface is entirely read-only and limited in 
scope; nevertheless, this interface may represent an exposure. To disable the HTTP interface, set the net.http.enabled 
setting to false or use the --nohttpinterface command line option. See also Configuration Options (page 288). 
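As a sketch, assuming the 2.6 YAML configuration format, disabling the HTTP interface looks like the following (equivalent to starting mongod with --nohttpinterface):

```yaml
# mongod configuration file fragment
net:
   http:
      enabled: false
```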
REST API 
The REST API to MongoDB provides additional information and write access on top of the HTTP Status interface. 
While the REST API does not provide any support for insert, update, or remove operations, it does provide administrative 
access, and its accessibility represents a vulnerability in a secure environment. The REST interface is disabled 
by default, and is not recommended for production use. 
If you must use the REST API, please control and limit access to the REST API. The REST API does not include any 
support for authentication, even when running with authorization enabled. 
See the following documents for instructions on restricting access to the REST API interface: 
• Configure Linux iptables Firewall for MongoDB (page 297) 
• Configure Windows netsh Firewall for MongoDB (page 300) 
6.2.6 Auditing 
New in version 2.6. 
MongoDB Enterprise includes an auditing capability for mongod and mongos instances. The auditing facility allows 
administrators and users to track system activity for deployments with multiple users and applications. The auditing 
facility can write audit events to the console, the syslog, a JSON file, or a BSON file. For details on the audit log 
messages, see System Event Audit Messages (page 380). 
11 On Linux and Unix systems, mongo reads the .mongorc.js file from $HOME/.mongorc.js (i.e. ~/.mongorc.js). On Windows, 
mongo.exe reads the .mongorc.js file from %HOME%\.mongorc.js or %HOMEDRIVE%\%HOMEPATH%\.mongorc.js. 
Audit Events and Filter 
The auditing system can record the following operations: 
• schema (DDL), 
• replica set, 
• authentication and authorization, and 
• general operations. 
See Event Actions, Details, and Results (page 381) for the specific actions recorded. 
By default, the auditing system records all these operations; however, you can configure the --auditFilter option 
to restrict the events captured. 
See Configure System Events Auditing (page 356) to enable and configure auditing for MongoDB Enterprise. To set 
up filters, see Filter Events (page 358). 
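For example, a MongoDB Enterprise 2.6 configuration file fragment might route audit events to a JSON file and capture only authentication events; the path here is illustrative:

```yaml
auditLog:
   destination: file
   format: JSON
   path: /var/log/mongodb/auditLog.json
   filter: '{ atype: "authenticate" }'
```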
Audit Guarantee 
The auditing system writes every audit event 12 to an in-memory buffer of audit events. MongoDB writes this buffer to 
disk periodically. For events collected from any single connection, the events have a total order: if MongoDB writes 
one event to disk, the system guarantees that it has written all prior events for that connection to disk. 
If an audit event entry corresponds to an operation that affects the durable state of the database, such as a modification 
to data, MongoDB will always write the audit event to disk before writing to the journal for that entry. 
That is, before adding an operation to the journal, MongoDB writes all audit events on the connection that triggered 
the operation, up to and including the entry for the operation. 
These auditing guarantees require that MongoDB runs with the journaling enabled. 
Warning: MongoDB may lose events if the server terminates before it commits the events to the audit log. 
The client may receive confirmation of the event before MongoDB commits to the audit log. For example, while 
auditing an aggregation operation, the server might crash after returning the result but before the audit log flushes. 
6.2.7 Kerberos Authentication 
New in version 2.4. 
Overview 
MongoDB Enterprise provides support for Kerberos authentication of MongoDB clients to mongod and mongos. 
Kerberos is an industry standard authentication protocol for large client/server systems. Kerberos allows MongoDB 
and applications to take advantage of existing authentication infrastructure and processes. 
Kerberos Components and MongoDB 
Principals 
In a Kerberos-based system, every participant in the authenticated communication is known as a “principal”, and every 
principal must have a unique name. 
12 Audit configuration can include a filter (page 358) to limit events to audit. 
Principals belong to administrative units called realms. For each realm, the Kerberos Key Distribution Center (KDC) 
maintains a database of the realm's principals and their associated "secret keys". 
For client-server authentication, the client requests from the KDC a "ticket" for access to a specific asset. The KDC 
uses the client's secret and the server's secret to construct the ticket, which allows the client and server to mutually 
authenticate each other while keeping the secrets hidden. 
For the configuration of MongoDB for Kerberos support, two kinds of principal names are of interest: user principals 
(page 292) and service principals (page 292). 
User Principal To authenticate using Kerberos, you must add the Kerberos user principals to the 
$external database in MongoDB. User principal names have the form: 
<username>@<KERBEROS REALM> 
For every user you want to authenticate using Kerberos, you must create a corresponding user in MongoDB in the 
$external database. 
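A sketch of creating such a user from the mongo shell; the principal name, database, and role below are placeholders:

```javascript
use $external
db.createUser(
   {
     user: "reportingapp@EXAMPLE.COM",           // Kerberos user principal
     roles: [ { role: "read", db: "records" } ]  // grant only the access needed
   }
)
```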
For examples of adding a user to MongoDB as well as authenticating as that user, see Configure MongoDB with 
Kerberos Authentication on Linux (page 331) and Configure MongoDB with Kerberos Authentication on Windows 
(page 334). 
See also: 
http://docs.mongodb.org/manual/reference/command/nav-user-management for general information 
regarding creating and managing users in MongoDB. 
Service Principal Every MongoDB mongod and mongos instance (or mongod.exe or mongos.exe on Windows) 
must have an associated service principal. Service principal names have the form: 
<service>/<fully qualified domain name>@<KERBEROS REALM> 
For MongoDB, the <service> defaults to mongodb. For example, if m1.example.com is a MongoDB server, 
and example.com maintains the EXAMPLE.COM Kerberos realm, then m1 should have the service principal name 
mongodb/m1.example.com@EXAMPLE.COM. 
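To make the name's structure concrete, its components can be pulled apart with ordinary shell parameter expansion; this is an illustration, not a MongoDB tool:

```shell
principal="mongodb/m1.example.com@EXAMPLE.COM"

service=${principal%%/*}   # text before the first "/"  -> mongodb
rest=${principal#*/}       # drop the service component
fqdn=${rest%%@*}           # text before the "@"        -> m1.example.com
realm=${principal##*@}     # text after the last "@"    -> EXAMPLE.COM

echo "$service $fqdn $realm"
```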
To specify a different value for <service>, use serviceName during the startup of mongod or mongos (or 
mongod.exe or mongos.exe). The mongo shell or other clients may also specify a different service principal name 
using serviceName. 
Each host that runs a mongod or mongos instance must be reachable over the network using the fully qualified domain 
name (FQDN) part of its service principal name. 
By default, Kerberos attempts to identify hosts using the /etc/krb5.conf file before using DNS to resolve hosts. 
On Windows, if running MongoDB as a service, see Assign Service Principal Name to MongoDB Windows Service 
(page 336). 
Linux Keytab Files 
Linux systems can store Kerberos authentication keys for a service principal (page 292) in keytab files. Each Kerberized 
mongod and mongos instance running on Linux must have access to a keytab file containing keys for its service 
principal (page 292). 
To keep keytab files secure, use file permissions that restrict access to only the user that runs the mongod or mongos 
process. 
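A sketch of restricting a keytab's permissions; the mktemp path stands in for the real keytab file, which should be owned by the user that runs the mongod or mongos process:

```shell
keytab=$(mktemp /tmp/mongodb.keytab.XXXXXX)  # stand-in for the real keytab path
chmod 600 "$keytab"                          # read/write for the owner only
stat -c '%a' "$keytab"                       # GNU stat; prints 600
```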
Tickets 
On Linux, MongoDB clients can use Kerberos’s kinit program to initialize a credential cache for authenticating the 
user principal to servers. 
Windows Active Directory 
Unlike on Linux systems, mongod and mongos instances running on Windows do not require access to keytab 
files. Instead, the mongod and mongos instances read their server credentials from a credential store specific to the 
operating system. 
However, you can export a keytab file from Windows Active Directory for use on Linux systems. See Ktpass13 
for more information. 
Authenticate With Kerberos 
To configure MongoDB for Kerberos support and authenticate, see Configure MongoDB with Kerberos Authentication 
on Linux (page 331) and Configure MongoDB with Kerberos Authentication on Windows (page 334). 
Operational Considerations 
The HTTP Console 
The MongoDB HTTP Console14 interface does not support Kerberos authentication. 
DNS 
Each host that runs a mongod or mongos instance must have both A and PTR DNS records to provide forward and 
reverse lookup. 
Without A and PTR DNS records, the host cannot resolve the components of the Kerberos domain or the Key Distribution 
Center (KDC). 
System Time Synchronization 
To successfully authenticate, the system time for each mongod and mongos instance must be within 5 minutes of the 
system time of the other hosts in the Kerberos infrastructure. 
Kerberized MongoDB Environments 
Driver Support 
The following MongoDB drivers support Kerberos authentication: 
• Java15 
13http://technet.microsoft.com/en-us/library/cc753771.aspx 
14http://docs.mongodb.org/ecosystem/tools/http-interfaces/#http-console 
15http://docs.mongodb.org/ecosystem/tutorial/authenticate-with-java-driver/ 
• C#16 
• C++17 
• Python18 
Use with Additional MongoDB Authentication Mechanism 
Although MongoDB supports the use of Kerberos authentication with other authentication mechanisms, only add 
the other mechanisms as necessary. See the Incorporate Additional Authentication Mechanisms 
section in Configure MongoDB with Kerberos Authentication on Linux (page 331) and Configure MongoDB with 
Kerberos Authentication on Windows (page 334) for details. 
6.3 Security Tutorials 
The following tutorials provide instructions for enabling and using the security features available in MongoDB. 
Security Checklist (page 295) A high level overview of global security considerations for administrators of MongoDB 
deployments. Use this checklist if you are new to deploying MongoDB in production and want to implement 
high quality security practices. 
Network Security Tutorials (page 297) Ensure that the underlying network configuration supports a secure operating 
environment for MongoDB deployments, and appropriately limits access to MongoDB deployments. 
Configure Linux iptables Firewall for MongoDB (page 297) Basic firewall configuration patterns and examples 
for iptables on Linux systems. 
Configure Windows netsh Firewall for MongoDB (page 300) Basic firewall configuration patterns and examples 
for netsh on Windows systems. 
Configure mongod and mongos for SSL (page 304) SSL allows MongoDB clients to support encrypted connections 
to mongod instances. 
Continue reading from Network Security Tutorials (page 297) for more information on running MongoDB in 
secure environments. 
Security Deployment Tutorials (page 313) These tutorials describe procedures for deploying MongoDB using authentication 
and authorization. 
Access Control Tutorials (page 316) These tutorials describe procedures relevant for the configuration, operation, 
and maintenance of MongoDB’s access control system. 
Enable Client Access Control (page 317) Describes the process for enabling authentication for MongoDB deployments. 
Use x.509 Certificates to Authenticate Clients (page 320) Use x.509 for client authentication. 
Use x.509 Certificate for Membership Authentication (page 323) Use x.509 for internal member authentication 
for replica sets and sharded clusters. 
Configure MongoDB with Kerberos Authentication on Linux (page 331) For MongoDB Enterprise Linux, 
describes the process to enable Kerberos-based authentication for MongoDB deployments. 
Continue reading from Access Control Tutorials (page 316) for additional tutorials on configuring MongoDB’s 
authentication systems. 
16http://docs.mongodb.org/ecosystem/tutorial/authenticate-with-csharp-driver/ 
17http://docs.mongodb.org/ecosystem/tutorial/authenticate-with-cpp-driver/ 
18http://api.mongodb.org/python/current/examples/authentication.html 
Enable Authentication after Creating the User Administrator (page 319) Describes an alternative process for 
enabling authentication for MongoDB deployments. 
User and Role Management Tutorials (page 342) MongoDB's access control system provides a flexible role-based 
access control system that you can use to limit access to MongoDB deployments. The tutorials in this section 
describe the configuration and setup of the authorization system. 
Add a User to a Database (page 344) Create non-administrator users using MongoDB's role-based authentication 
system. 
Create a Role (page 347) Create a custom role. 
Modify a User’s Access (page 352) Modify the actions available to a user on specific database resources. 
View Roles (page 353) View a role’s privileges. 
Continue reading from User and Role Management Tutorials (page 342) for additional tutorials on managing 
users and privileges in MongoDB’s authorization system. 
Configure System Events Auditing (page 356) Enable and configure the MongoDB Enterprise system event auditing feature. 
Create a Vulnerability Report (page 359) Report a vulnerability in MongoDB. 
6.3.1 Security Checklist 
This document provides a list of security measures that you should implement to protect your MongoDB installation. 
Require Authentication 
Enable MongoDB authentication and specify the authentication mechanism. You can use the MongoDB authentication 
mechanism or an existing external framework. Authentication requires that all clients and servers provide valid 
credentials before they can connect to the system. In clustered deployments, enable authentication for each MongoDB 
server. 
See Authentication (page 282), Enable Client Access Control (page 317), and Enable Authentication in a Sharded 
Cluster (page 318). 
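A minimal sketch of enabling authorization in a 2.6-style YAML configuration file (clustered deployments also need a keyFile or x.509 membership authentication):

```yaml
security:
   authorization: enabled
```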
Configure Role-Based Access Control 
Create roles that define the exact access a set of users needs. Follow a principle of least privilege. Then create users 
and assign them only the roles they need to perform their operations. A user can be a person or a client application. 
Create a user administrator first, then create additional users. Create a unique MongoDB user for each person and 
application that accesses the system. 
See Authorization (page 285), Create a Role (page 347), Create a User Administrator (page 343), and Add a User to 
a Database (page 344). 
Encrypt Communication 
Configure MongoDB to use SSL for all incoming and outgoing connections. Use SSL to encrypt communication 
between mongod and mongos components of a MongoDB deployment, as well as between all applications and MongoDB. 
See Configure mongod and mongos for SSL (page 304). 
Limit Network Exposure 
Ensure that MongoDB runs in a trusted network environment and limit the interfaces on which MongoDB instances 
listen for incoming connections. Allow only trusted clients to access the network interfaces and ports on which 
MongoDB instances are available. 
See the bindIp setting, and see Configure Linux iptables Firewall for MongoDB (page 297) and Configure Windows 
netsh Firewall for MongoDB (page 300). 
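For example, a configuration fragment that binds mongod to the loopback interface and one private address; the addresses are illustrative:

```yaml
net:
   port: 27017
   bindIp: 127.0.0.1,10.8.0.10
```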
Audit System Activity 
Track access and changes to database configurations and data. MongoDB Enterprise19 includes a system auditing 
facility that can record system events (e.g. user operations, connection events) on a MongoDB instance. These audit 
records permit forensic analysis and allow administrators to verify proper controls. 
See Auditing (page 290) and Configure System Events Auditing (page 356). 
Encrypt and Protect Data 
Encrypt MongoDB data on each host using file-system, device, or physical encryption. Protect MongoDB data using 
file-system permissions. MongoDB data includes data files, configuration files, auditing logs, and key files. 
Run MongoDB with a Dedicated User 
Run MongoDB processes with a dedicated operating system user account. Ensure that the account has permissions to 
access data but no unnecessary permissions. 
See Install MongoDB (page 5) for more information on running MongoDB. 
Run MongoDB with Secure Configuration Options 
MongoDB supports the execution of JavaScript code for certain server-side operations: mapReduce, group, eval, 
and $where. If you do not use these operations, disable server-side scripting by using the --noscripting option 
on the command line. 
Use only the MongoDB wire protocol on production deployments. Do not enable the following, all of which enable 
the web server interface: net.http.enabled, net.http.JSONPEnabled, and net.http.RESTInterfaceEnabled. 
Leave these disabled, unless required for backwards compatibility. 
Keep input validation enabled. MongoDB enables input validation by default through the wireObjectCheck 
setting. This ensures that all documents stored by the mongod instance are valid BSON. 
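Taken together, these recommendations correspond to a configuration fragment like the following sketch (2.6 YAML format; adjust for your deployment):

```yaml
security:
   javascriptEnabled: false        # same effect as --noscripting
net:
   wireObjectCheck: true           # the default; validates client BSON
   http:
      enabled: false               # web status interface off
      JSONPEnabled: false
      RESTInterfaceEnabled: false
```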
Consider Security Standards Compliance 
For applications requiring HIPAA or PCI-DSS compliance, please refer to the MongoDB Security Reference Architecture20 
to learn more about how you can use the key security capabilities to build compliant application infrastructure. 
19http://www.mongodb.com/products/mongodb-enterprise 
20http://info.mongodb.com/rs/mongodb/images/MongoDB_Security_Architecture_WP.pdf 
Contact MongoDB for Further Guidance 
MongoDB Inc. provides a Security Technical Implementation Guide (STIG) upon request. Please request a copy21 for 
more information. 
6.3.2 Network Security Tutorials 
The following tutorials provide information on handling network security for MongoDB. 
Configure Linux iptables Firewall for MongoDB (page 297) Basic firewall configuration patterns and examples for 
iptables on Linux systems. 
Configure Windows netsh Firewall for MongoDB (page 300) Basic firewall configuration patterns and examples for 
netsh on Windows systems. 
Configure mongod and mongos for SSL (page 304) SSL allows MongoDB clients to support encrypted connections 
to mongod instances. 
SSL Configuration for Clients (page 307) Configure clients to connect to MongoDB instances that use SSL. 
Upgrade a Cluster to Use SSL (page 311) Rolling upgrade process to use SSL. 
Configure MongoDB for FIPS (page 311) Configure for Federal Information Processing Standard (FIPS). 
Configure Linux iptables Firewall for MongoDB 
On contemporary Linux systems, the iptables program provides methods for managing the Linux Kernel’s 
netfilter or network packet filtering capabilities. These firewall rules make it possible for administrators to 
control what hosts can connect to the system, and limit risk exposure by limiting the hosts that can connect to a 
system. 
This document outlines basic firewall configurations for iptables firewalls on Linux. Use these approaches as a 
starting point for your larger networking organization. For a detailed overview of security practices and risk management 
for MongoDB, see Security Concepts (page 281). 
See also: 
For MongoDB deployments on Amazon’s web services, see the Amazon EC222 page, which addresses Amazon’s 
Security Groups and other EC2-specific security features. 
Overview 
Rules in iptables configurations fall into chains, which describe the process for filtering and processing specific 
streams of traffic. Chains have an order, and packets must pass through earlier rules in a chain to reach later rules. 
This document addresses only the following two chains: 
INPUT Controls all incoming traffic. 
OUTPUT Controls all outgoing traffic. 
Given the default ports (page 288) of all MongoDB processes, you must configure networking rules that permit only 
required communication between your application and the appropriate mongod and mongos instances. 
Be aware that the default policy of iptables is to allow all connections and traffic unless explicitly 
disabled. The configuration changes outlined in this document will create rules that explicitly allow traffic from 
specific addresses and on specific ports, using a default policy that drops all traffic that is not explicitly allowed. When 
21http://www.mongodb.com/lp/contact/stig-requests 
22http://docs.mongodb.org/ecosystem/platforms/amazon-ec2 
you have properly configured your iptables rules to allow only the traffic that you want to permit, you can Change 
Default Policy to DROP (page 300). 
Patterns 
This section contains a number of patterns and examples for configuring iptables for use with MongoDB deployments. 
If you have configured different ports using the port configuration setting, you will need to modify the rules 
accordingly. 
Traffic to and from mongod Instances This pattern is applicable to all mongod instances running as standalone 
instances or as part of a replica set. 
The goal of this pattern is to explicitly allow traffic to the mongod instance from the application server. In the 
following examples, replace <ip-address> with the IP address of the application server: 
iptables -A INPUT -s <ip-address> -p tcp --destination-port 27017 -m state --state NEW,ESTABLISHED -j ACCEPT 
iptables -A OUTPUT -d <ip-address> -p tcp --source-port 27017 -m state --state ESTABLISHED -j ACCEPT 
The first rule allows all incoming traffic from <ip-address> on port 27017, which allows the application server to 
connect to the mongod instance. The second rule allows outgoing traffic from the mongod to reach the application 
server. 
Optional 
If you have only one application server, you can replace <ip-address> with either the IP address itself, such as: 
198.51.100.55. You can also express this using CIDR notation as 198.51.100.55/32. If you want to permit 
a larger block of possible IP addresses, you can allow traffic from a /24 block 
using one of the following specifications for the <ip-address>: 
10.10.10.10/24 
10.10.10.10/255.255.255.0 
Traffic to and from mongos Instances mongos instances provide query routing for sharded clusters. Clients 
connect to mongos instances, which behave from the client’s perspective as mongod instances. In turn, the mongos 
connects to all mongod instances that are components of the sharded cluster. 
Use the same iptables command to allow traffic to and from these instances as you would from the mongod 
instances that are members of the replica set. Take the configuration outlined in the Traffic to and from mongod 
Instances (page 298) section as an example. 
Traffic to and from a MongoDB Config Server Config servers host the config database that stores metadata 
for sharded clusters. Each production cluster has three config servers, initiated using the mongod --configsvr 
option. 23 Config servers listen for connections on port 27019. As a result, add the following iptables rules to the 
config server to allow incoming and outgoing connection on port 27019, for connection to the other config servers. 
iptables -A INPUT -s <ip-address> -p tcp --destination-port 27019 -m state --state NEW,ESTABLISHED -j ACCEPT 
iptables -A OUTPUT -d <ip-address> -p tcp --source-port 27019 -m state --state ESTABLISHED -j ACCEPT 
Replace <ip-address> with the address or address space of all the mongod instances that provide config servers. 
Additionally, config servers need to allow incoming connections from all of the mongos instances in the cluster and 
all mongod instances in the cluster. Add rules that resemble the following: 
23 You also can run a config server by using the configsvr value for the clusterRole setting in a configuration file. 
iptables -A INPUT -s <ip-address> -p tcp --destination-port 27019 -m state --state NEW,ESTABLISHED -j ACCEPT 
Replace <ip-address> with the address of the mongos instances and the shard mongod instances. 
Traffic to and from a MongoDB Shard Server Shard servers run as mongod --shardsvr. 24 Because 
the default port number is 27018 when running with the shardsvr value for the clusterRole setting, you must 
configure the following iptables rules to allow traffic to and from each shard: 
iptables -A INPUT -s <ip-address> -p tcp --destination-port 27018 -m state --state NEW,ESTABLISHED -j ACCEPT 
iptables -A OUTPUT -d <ip-address> -p tcp --source-port 27018 -m state --state ESTABLISHED -j ACCEPT 
Replace the <ip-address> specification with the IP addresses of all mongod instances. This allows you to permit incoming 
and outgoing traffic between all shards, including constituent replica set members, to: 
• all mongod instances in the shard’s replica sets. 
• all mongod instances in other shards. 25 
Furthermore, shards need to be able to make outgoing connections to: 
• all mongos instances. 
• all mongod instances in the config servers. 
Create a rule that resembles the following, and replace the <ip-address> with the address of the config servers 
and the mongos instances: 
iptables -A OUTPUT -d <ip-address> -p tcp --source-port 27018 -m state --state ESTABLISHED -j ACCEPT 
Provide Access For Monitoring Systems 
1. The mongostat diagnostic tool, when running with the --discover option, needs to be able to reach all components 
of a cluster, including the config servers, the shard servers, and the mongos instances. 
2. If your monitoring system needs to access the HTTP interface, insert the following rule into the chain: 
iptables -A INPUT -s <ip-address> -p tcp --destination-port 28017 -m state --state NEW,ESTABLISHED -j ACCEPT 
Replace <ip-address> with the address of the instance that needs access to the HTTP or REST interface. 
For all deployments, you should restrict access to this port to only the monitoring instance. 
Optional 
For shard server mongod instances running with the shardsvr value for the clusterRole setting, the 
rule would resemble the following: 
iptables -A INPUT -s <ip-address> -p tcp --destination-port 28018 -m state --state NEW,ESTABLISHED -j ACCEPT 
For config server mongod instances running with the configsvr value for the clusterRole setting, the 
rule would resemble the following: 
iptables -A INPUT -s <ip-address> -p tcp --destination-port 28019 -m state --state NEW,ESTABLISHED -j ACCEPT 
24 You can also specify the shard server option with the shardsvr value for the clusterRole setting in the configuration file. Shard members 
are also often conventional replica sets using the default port. 
25 All shards in a cluster need to be able to communicate with all other shards to facilitate chunk and balancing operations. 
Change Default Policy to DROP 
The default policy for iptables chains is to allow all traffic. After completing all iptables configuration changes, 
you must change the default policy to DROP so that all traffic that isn’t explicitly allowed as above will not be able to 
reach components of the MongoDB deployment. Issue the following commands to change this policy: 
iptables -P INPUT DROP 
iptables -P OUTPUT DROP 
Manage and Maintain iptables Configuration 
This section contains a number of basic operations for managing and using iptables. There are various front end 
tools that automate some aspects of iptables configuration, but at the core all iptables front ends provide the 
same basic functionality: 
Make all iptables Rules Persistent By default all iptables rules are only stored in memory. When your 
system restarts, your firewall rules will revert to their defaults. When you have tested a rule set and have confirmed 
that it effectively controls traffic, you should make the rule set persistent. 
On Red Hat Enterprise Linux, Fedora Linux, and related distributions you can issue the following command: 
service iptables save 
On Debian, Ubuntu, and related distributions, you can use the following command to dump the iptables rules to 
the /etc/iptables.conf file: 
iptables-save > /etc/iptables.conf 
Run the following operation to restore the network rules: 
iptables-restore < /etc/iptables.conf 
Place this command in your rc.local file, or in the /etc/network/if-up.d/iptables file with other 
similar operations. 
List all iptables Rules To list all currently applied iptables rules, use the following operation at the system 
shell: 
iptables -L 
Flush all iptables Rules If you make a configuration mistake when entering iptables rules or simply need to 
revert to the default rule set, you can use the following operation at the system shell to flush all rules: 
iptables -F 
If you’ve already made your iptables rules persistent, you will need to repeat the appropriate procedure in the 
Make all iptables Rules Persistent (page 300) section. 
Configure Windows netsh Firewall for MongoDB 
On Windows Server systems, the netsh program provides methods for managing the Windows Firewall. These 
firewall rules make it possible for administrators to control what hosts can connect to the system, and limit risk 
exposure by limiting the hosts that can connect to a system. 
This document outlines basic Windows Firewall configurations. Use these approaches as a starting point for your 
larger networking organization. For a detailed overview of security practices and risk management for MongoDB, see 
Security Concepts (page 281). 
See also: 
Windows Firewall26 documentation from Microsoft. 
Overview 
Windows Firewall processes rules in an order determined by rule type, parsed in the following order: 
1. Windows Service Hardening 
2. Connection security rules 
3. Authenticated Bypass Rules 
4. Block Rules 
5. Allow Rules 
6. Default Rules 
By default, the policy in Windows Firewall allows all outbound connections and blocks all incoming connections. 
Given the default ports (page 288) of all MongoDB processes, you must configure networking rules that permit only 
required communication between your application and the appropriate mongod.exe and mongos.exe instances. 
The configuration changes outlined in this document will create rules which explicitly allow traffic from specific 
addresses and on specific ports, using a default policy that drops all traffic that is not explicitly allowed. 
You can configure the Windows Firewall using the netsh command line tool or through a Windows application. 
On Windows Server 2008 this application is Windows Firewall With Advanced Security in Administrative Tools. On 
previous versions of Windows Server, access the Windows Firewall application in the System and Security control 
panel. 
The procedures in this document use the netsh command line tool. 
Patterns 
This section contains a number of patterns and examples for configuring Windows Firewall for use with MongoDB 
deployments. If you have configured different ports using the port configuration setting, you will need to modify the 
rules accordingly. 
Traffic to and from mongod.exe Instances This pattern is applicable to all mongod.exe instances running as 
standalone instances or as part of a replica set. The goal of this pattern is to explicitly allow traffic to the mongod.exe 
instance from the application server. 
netsh advfirewall firewall add rule name="Open mongod port 27017" dir=in action=allow protocol=TCP localport=27017 
This rule allows all incoming traffic to port 27017, which allows the application server to connect to the 
mongod.exe instance. 
Windows Firewall also allows enabling network access for an entire application rather than to a specific port, as in the 
following example: 
26http://technet.microsoft.com/en-us/network/bb545423.aspx 
6.3. Security Tutorials 301
MongoDB Documentation, Release 2.6.4 
netsh advfirewall firewall add rule name="Allowing mongod" dir=in action=allow program="C:\mongodb\bin\mongod.exe" 
You can allow all access for a mongos.exe server, with the following invocation: 
netsh advfirewall firewall add rule name="Allowing mongos" dir=in action=allow program="C:\mongodb\bin\mongos.exe" 
Traffic to and from mongos.exe Instances mongos.exe instances provide query routing for sharded clusters. 
Clients connect to mongos.exe instances, which behave from the client’s perspective as mongod.exe instances. 
In turn, the mongos.exe connects to all mongod.exe instances that are components of the sharded cluster. 
Use the same Windows Firewall command to allow traffic to and from these instances as you would from the 
mongod.exe instances that are members of the replica set. 
netsh advfirewall firewall add rule name="Open mongod shard port 27018" dir=in action=allow protocol=TCP localport=27018 
Traffic to and from a MongoDB Config Server Configuration servers host the config database that stores metadata 
for sharded clusters. Each production cluster has three configuration servers, initiated using the mongod 
--configsvr option. 27 Configuration servers listen for connections on port 27019. As a result, add the following 
Windows Firewall rules to the config server to allow incoming and outgoing connections on port 27019, for 
connection to the other config servers. 
netsh advfirewall firewall add rule name="Open mongod config svr port 27019" dir=in action=allow protocol=TCP localport=27019 
Additionally, config servers need to allow incoming connections from all of the mongos.exe instances in the cluster 
and all mongod.exe instances in the cluster. Add rules that resemble the following: 
netsh advfirewall firewall add rule name="Open mongod config svr inbound" dir=in action=allow protocol=TCP remoteip=<ip-address> localport=27019 
Replace <ip-address> with the addresses of the mongos.exe instances and the shard mongod.exe instances. 
Traffic to and from a MongoDB Shard Server Shard servers run as mongod --shardsvr. 28 Because 
the default port number is 27018 when running with the shardsvr value for the clusterRole setting, you must 
configure the following Windows Firewall rules to allow traffic to and from each shard: 
netsh advfirewall firewall add rule name="Open mongod shardsvr inbound" dir=in action=allow protocol=TCP remoteip=<ip-address> localport=27018 
netsh advfirewall firewall add rule name="Open mongod shardsvr outbound" dir=out action=allow protocol=TCP remoteip=<ip-address> localport=27018 
Replace the <ip-address> specification with the IP address of all mongod.exe instances. This allows you to 
permit incoming and outgoing traffic between all shards, including constituent replica set members, to: 
• all mongod.exe instances in the shard’s replica sets. 
• all mongod.exe instances in other shards. 29 
Furthermore, shards need to be able to make outgoing connections to: 
• all mongos.exe instances. 
• all mongod.exe instances in the config servers. 
Create a rule that resembles the following, and replace the <ip-address> with the address of the config servers 
and the mongos.exe instances: 
27 You can also run a config server by using the configsvr value for the clusterRole setting in a configuration file. 
28 You can also specify the shard server option with the shardsvr value for the clusterRole setting in the configuration file. Shard members 
are also often conventional replica sets using the default port. 
29 All shards in a cluster need to be able to communicate with all other shards to facilitate chunk and balancing operations. 
302 Chapter 6. Security
netsh advfirewall firewall add rule name="Open mongod config svr outbound" dir=out action=allow protocol=TCP remoteip=<ip-address> localport=27019 
Provide Access For Monitoring Systems 
1. The mongostat diagnostic tool, when running with the --discover option, needs to be able to reach all 
components of a cluster, including the config servers, the shard servers, and the mongos.exe instances. 
2. If your monitoring system needs to access the HTTP interface, add the following rule: 
netsh advfirewall firewall add rule name="Open mongod HTTP monitoring inbound" dir=in action=allow protocol=TCP remoteip=<ip-address> localport=28017 
Replace <ip-address> with the address of the instance that needs access to the HTTP or REST interface. 
For all deployments, you should restrict access to this port to only the monitoring instance. 
Optional 
For shard server mongod instances running with the shardsvr value for the clusterRole setting, the 
rule would resemble the following: 
netsh advfirewall firewall add rule name="Open mongos HTTP monitoring inbound" dir=in action=allow protocol=TCP remoteip=<ip-address> localport=28018 
For config server mongod instances running with the configsvr value for the clusterRole setting, the 
rule would resemble the following: 
netsh advfirewall firewall add rule name="Open mongod configsvr HTTP monitoring inbound" dir=in action=allow protocol=TCP remoteip=<ip-address> localport=28019 
Manage and Maintain Windows Firewall Configurations 
This section contains a number of basic operations for managing and using netsh. While you can use the GUI front 
ends to manage the Windows Firewall, all core functionality is accessible from netsh. 
Delete all Windows Firewall Rules To delete the firewall rule allowing mongod.exe traffic: 
netsh advfirewall firewall delete rule name="Open mongod port 27017" protocol=tcp localport=27017 
netsh advfirewall firewall delete rule name="Open mongod shard port 27018" protocol=tcp localport=27018 
List All Windows Firewall Rules To return a list of all Windows Firewall rules: 
netsh advfirewall firewall show rule name=all 
Reset Windows Firewall To reset the Windows Firewall rules: 
netsh advfirewall reset 
Backup and Restore Windows Firewall Rules To simplify administration of a larger collection of systems, you can 
export and import firewall rules between servers on Windows: 
Export all firewall rules with the following command: 
netsh advfirewall export "C:\temp\MongoDBfw.wfw" 
Replace "C:\temp\MongoDBfw.wfw" with a path of your choosing. You can use a command in the following 
form to import a file created using this operation: 
netsh advfirewall import "C:\temp\MongoDBfw.wfw" 
Configure mongod and mongos for SSL 
This document helps you to configure MongoDB to support SSL. MongoDB clients can use SSL to encrypt 
connections to mongod and mongos instances. 
Note: The default distribution of MongoDB30 does not contain support for SSL. To use SSL, you must either build 
MongoDB locally passing the --ssl option to scons or use MongoDB Enterprise31. 
These instructions assume that you have already installed a build of MongoDB that includes SSL support and that your 
client driver supports SSL. For instructions on upgrading a cluster currently not using SSL to using SSL, see Upgrade 
a Cluster to Use SSL (page 311). 
Changed in version 2.6: MongoDB’s SSL encryption only allows use of strong SSL ciphers with a minimum of 
128-bit key length for all connections. MongoDB Enterprise for Windows includes support for SSL. 
See also: 
SSL Configuration for Clients (page 307) to learn about SSL support for Python, Java, Ruby, and other clients. 
.pem File 
Before you can use SSL, you must have a .pem file containing a public key certificate and its associated private key. 
MongoDB can use any valid SSL certificate issued by a certificate authority, or a self-signed certificate. If you use a 
self-signed certificate, although the communications channel will be encrypted, there will be no validation of server 
identity. Although such a situation will prevent eavesdropping on the connection, it leaves you vulnerable to a 
man-in-the-middle attack. Using a certificate signed by a trusted certificate authority will permit MongoDB drivers to verify 
the server’s identity. 
In general, avoid using self-signed certificates unless the network is trusted. 
Additionally, with regard to authentication among replica set/sharded cluster members (page 284), in order to 
minimize exposure of the private key and allow hostname validation, it is advisable to use different certificates on different 
servers. 
For testing purposes, you can generate a self-signed certificate and private key on a Unix system with a command that 
resembles the following: 
cd /etc/ssl/ 
openssl req -newkey rsa:2048 -new -x509 -days 365 -nodes -out mongodb-cert.crt -keyout mongodb-cert.key 
This operation generates a new, self-signed certificate with no passphrase that is valid for 365 days. Once you have 
the certificate, concatenate the certificate and private key to a .pem file, as in the following example: 
cat mongodb-cert.key mongodb-cert.crt > mongodb.pem 
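As a minimal sketch, you can sanity-check the concatenated file by looking for the standard PEM block delimiters. The sample content below is synthetic, and depending on the openssl version the key block may read BEGIN RSA PRIVATE KEY instead:

```python
# Minimal sketch: verify that a concatenated .pem contains both a private-key
# block and a certificate block. The marker strings are the standard PEM
# delimiters; openssl may emit "BEGIN RSA PRIVATE KEY" for RSA keys.
def pem_has_key_and_cert(pem_text):
    has_key = ("-----BEGIN PRIVATE KEY-----" in pem_text
               or "-----BEGIN RSA PRIVATE KEY-----" in pem_text)
    has_cert = "-----BEGIN CERTIFICATE-----" in pem_text
    return has_key and has_cert

# Synthetic sample standing in for the output of the cat command above.
sample = ("-----BEGIN RSA PRIVATE KEY-----\n...\n-----END RSA PRIVATE KEY-----\n"
          "-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----\n")
```

A file missing either block will cause mongod to fail at startup, so this kind of check is a quick way to catch a bad concatenation.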
See also: 
Use x.509 Certificates to Authenticate Clients (page 320) 
30http://www.mongodb.org/downloads 
31http://www.mongodb.com/products/mongodb-enterprise 
Set Up mongod and mongos with SSL Certificate and Key 
To use SSL in your MongoDB deployment, include the following run-time options with mongod and mongos: 
• net.ssl.mode set to requireSSL. This setting restricts each server to use only SSL encrypted connections. 
You can also specify either the value allowSSL or preferSSL to set up the use of mixed SSL modes on a 
port. See net.ssl.mode for details. 
• PEMKeyfile with the .pem file that contains the SSL certificate and key. 
Consider the following syntax for mongod: 
mongod --sslMode requireSSL --sslPEMKeyFile <pem> 
For example, given an SSL certificate located at /etc/ssl/mongodb.pem, configure mongod to use SSL 
encryption for all connections with the following command: 
mongod --sslMode requireSSL --sslPEMKeyFile /etc/ssl/mongodb.pem 
Note: 
• Specify <pem> with the full path name to the certificate. 
• If the private key portion of the <pem> is encrypted, specify the passphrase. See SSL Certificate Passphrase 
(page 306). 
• You may also specify these options in the configuration file, as in the following example: 
sslMode = requireSSL 
sslPEMKeyFile = /etc/ssl/mongodb.pem 
To connect to mongod and mongos instances using SSL, the mongo shell and MongoDB tools must include the 
--ssl option. See SSL Configuration for Clients (page 307) for more information on connecting to mongod and 
mongos running with SSL. 
See also: 
Upgrade a Cluster to Use SSL (page 311) 
Set Up mongod and mongos with Certificate Validation 
To set up mongod or mongos for SSL encryption using an SSL certificate signed by a certificate authority, include 
the following run-time options during startup: 
• net.ssl.mode set to requireSSL. This setting restricts each server to use only SSL encrypted connections. 
You can also specify either the value allowSSL or preferSSL to set up the use of mixed SSL modes on a 
port. See net.ssl.mode for details. 
• PEMKeyfile with the name of the .pem file that contains the signed SSL certificate and key. 
• CAFile with the name of the .pem file that contains the root certificate chain from the Certificate Authority. 
Consider the following syntax for mongod: 
mongod --sslMode requireSSL --sslPEMKeyFile <pem> --sslCAFile <ca> 
For example, given a signed SSL certificate located at /etc/ssl/mongodb.pem and the certificate authority file 
at /etc/ssl/ca.pem, you can configure mongod for SSL encryption as follows: 
mongod --sslMode requireSSL --sslPEMKeyFile /etc/ssl/mongodb.pem --sslCAFile /etc/ssl/ca.pem 
Note: 
• Specify the <pem> file and the <ca> file with either the full path name or the relative path name. 
• If the <pem> is encrypted, specify the passphrase. See SSL Certificate Passphrase (page 306). 
• You may also specify these options in the configuration file, as in the following example: 
sslMode = requireSSL 
sslPEMKeyFile = /etc/ssl/mongodb.pem 
sslCAFile = /etc/ssl/ca.pem 
To connect to mongod and mongos instances using SSL, the mongo tools must include both the --ssl and 
--sslPEMKeyFile option. See SSL Configuration for Clients (page 307) for more information on connecting to 
mongod and mongos running with SSL. 
See also: 
Upgrade a Cluster to Use SSL (page 311) 
Block Revoked Certificates for Clients To prevent clients with revoked certificates from connecting, include the 
sslCRLFile to specify a .pem file that contains revoked certificates. 
For example, the following mongod with SSL configuration includes the sslCRLFile setting: 
mongod --sslMode requireSSL --sslCRLFile /etc/ssl/ca-crl.pem --sslPEMKeyFile /etc/ssl/mongodb.pem --sslCAFile /etc/ssl/ca.pem 
Clients with revoked certificates in the /etc/ssl/ca-crl.pem will not be able to connect to this mongod instance. 
Validate Only if a Client Presents a Certificate In most cases it is important to ensure that clients present valid 
certificates. However, if you have clients that cannot present a client certificate, or are transitioning to using a certificate 
authority, you may want to validate certificates only from clients that present a certificate. 
If you want to bypass validation for clients that don’t present certificates, include the 
weakCertificateValidation run-time option with mongod and mongos. If the client does not present a 
certificate, no validation occurs. These connections, though not validated, are still encrypted using SSL. 
For example, consider the following mongod with an SSL configuration that includes the 
weakCertificateValidation setting: 
mongod --sslMode requireSSL --sslWeakCertificateValidation --sslPEMKeyFile /etc/ssl/mongodb.pem --sslCAFile /etc/ssl/ca.pem 
Then, clients can connect either with the option --ssl and no certificate or with the option --ssl and a valid 
certificate. See SSL Configuration for Clients (page 307) for more information on SSL connections for clients. 
Note: If the client presents a certificate, the certificate must be a valid certificate. 
All connections, including those that have not presented certificates, are encrypted using SSL. 
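The acceptance policy just described can be summarized in a short sketch (illustrative only, not MongoDB source code): with weak certificate validation a client may omit its certificate, but any certificate that is presented must still validate against the CA.

```python
# Sketch of the policy described above (illustrative, not MongoDB source):
# with weak validation, a missing client certificate is tolerated, but a
# presented certificate must always validate against the CA file.
def accept_connection(presented_cert, cert_is_valid, weak_validation):
    if presented_cert is None:
        return weak_validation   # no certificate: allowed only in weak mode
    return cert_is_valid         # a presented certificate must be valid
```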
SSL Certificate Passphrase 
The PEM files for PEMKeyfile and ClusterFile may be encrypted. With encrypted PEM files, you must specify 
the passphrase at startup with a command-line or a configuration file option or enter the passphrase when prompted. 
Changed in version 2.6: In previous versions, you could only specify the passphrase with a command-line or a 
configuration file option. 
To specify the passphrase in clear text on the command line or in a configuration file, use the PEMKeyPassword 
and/or the ClusterPassword option. 
To have MongoDB prompt for the passphrase at the start of mongod or mongos and avoid specifying the passphrase 
in clear text, omit the PEMKeyPassword and/or the ClusterPassword option. MongoDB will prompt for each 
passphrase as necessary. 
Important: The passphrase prompt option is available if you run the MongoDB instance in the foreground with 
a connected terminal. If you run mongod or mongos in a non-interactive session (e.g. without a terminal or as a 
service on Windows), you cannot use the passphrase prompt option. 
Run in FIPS Mode 
See Configure MongoDB for FIPS (page 311) for more details. 
SSL Configuration for Clients 
Clients must have support for SSL to work with a mongod or a mongos instance that has SSL support enabled. The 
current versions of the Python, Java, Ruby, Node.js, .NET, and C++ drivers have support for SSL, with full support 
coming in future releases of other drivers. 
See also: 
Configure mongod and mongos for SSL (page 304). 
mongo Shell SSL Configuration 
For SSL connections, you must use the mongo shell built with SSL support or distributed with MongoDB Enterprise. 
To support SSL, mongo has the following settings: 
• --ssl 
• --sslPEMKeyFile with the name of the .pem file that contains the SSL certificate and key. 
• --sslCAFile with the name of the .pem file that contains the certificate from the Certificate Authority (CA). 
Warning: If the mongo shell or any other tool that connects to mongos or mongod is run without 
--sslCAFile, it will not attempt to validate server certificates. This results in vulnerability to expired 
mongod and mongos certificates as well as to foreign processes posing as valid mongod or mongos 
instances. Ensure that you always specify the CA file against which server certificates should be validated 
in cases where intrusion is a possibility. 
• --sslPEMKeyPassword option if the client certificate-key file is encrypted. 
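The warning above has a direct analogue in Python's standard ssl module (shown here as an illustration, not as the mongo shell's implementation): a client context built without a CA bundle performs no server-certificate validation at all.

```python
# Illustrative analogue of the warning above, using Python's standard ssl
# module: without a CA bundle, the client accepts any server certificate.
import ssl

def client_context(ca_file=None):
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_file:
        ctx.load_verify_locations(ca_file)  # validate servers against this CA
    else:
        ctx.check_hostname = False          # must be disabled before CERT_NONE
        ctx.verify_mode = ssl.CERT_NONE     # no validation: MITM-vulnerable
    return ctx

unvalidated = client_context()
```

The unvalidated context still encrypts traffic, just as an un-validated mongo shell connection does; encryption without identity verification is exactly the exposure the warning describes.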
Connect to MongoDB Instance with SSL Encryption To connect to a mongod or mongos instance that requires 
only a SSL encryption mode (page 305), start mongo shell with --ssl, as in the following: 
mongo --ssl 
Connect to MongoDB Instance that Requires Client Certificates To connect to a mongod or mongos that 
requires CA-signed client certificates (page 305), start the mongo shell with --ssl and the --sslPEMKeyFile 
option to specify the signed certificate-key file, as in the following: 
mongo --ssl --sslPEMKeyFile /etc/ssl/client.pem 
Connect to MongoDB Instance that Validates when Presented with a Certificate To connect to a mongod or 
mongos instance that only requires valid certificates when the client presents a certificate (page 306), start mongo 
shell either with the --ssl option and no certificate or with the --ssl option and a valid signed certificate. 
For example, if mongod is running with weak certificate validation, both of the following mongo shell clients can 
connect to that mongod: 
mongo --ssl 
mongo --ssl --sslPEMKeyFile /etc/ssl/client.pem 
Important: If the client presents a certificate, the certificate must be valid. 
MMS Monitoring Agent 
The Monitoring agent will also have to connect via SSL in order to gather its stats. Because the agent already utilizes 
SSL for its communications to the MMS servers, this is just a matter of enabling SSL support in MMS itself on a per 
host basis. 
Use the “Edit” host button (i.e. the pencil) on the Hosts page in the MMS console to enable SSL. 
Please see the MMS documentation32 for more information about MMS configuration. 
PyMongo 
Add the “ssl=True” parameter to a PyMongo MongoClient33 to create a MongoDB connection to an SSL 
MongoDB instance: 
from pymongo import MongoClient 
c = MongoClient(host="mongodb.example.net", port=27017, ssl=True) 
To connect to a replica set, use the following operation: 
from pymongo import MongoReplicaSetClient 
c = MongoReplicaSetClient("mongodb.example.net:27017", 
replicaSet="mysetname", ssl=True) 
PyMongo also supports an “ssl=true” option for the MongoDB URI: 
mongodb://mongodb.example.net:27017/?ssl=true 
For more details, see the Python MongoDB Driver page34. 
32http://mms.mongodb.com/help 
33http://api.mongodb.org/python/current/api/pymongo/mongo_client.html#pymongo.mongo_client.MongoClient 
34http://docs.mongodb.org/ecosystem/drivers/python 
Java 
Consider the following example “SSLApp.java” class file: 
import com.mongodb.*; 
import javax.net.ssl.SSLSocketFactory; 
public class SSLApp { 
public static void main(String args[]) throws Exception { 
MongoClientOptions o = new MongoClientOptions.Builder() 
.socketFactory(SSLSocketFactory.getDefault()) 
.build(); 
MongoClient m = new MongoClient("localhost", o); 
DB db = m.getDB( "test" ); 
DBCollection c = db.getCollection( "foo" ); 
System.out.println( c.findOne() ); 
} 
} 
For more details, see the Java MongoDB Driver page35. 
Ruby 
The recent versions of the Ruby driver have support for connections to SSL servers. Install the latest version of the 
driver with the following command: 
gem install mongo 
Then connect to a standalone instance, using the following form: 
require 'rubygems' 
require 'mongo' 
connection = MongoClient.new('localhost', 27017, :ssl => true) 
Replace connection with the following if you’re connecting to a replica set: 
connection = MongoReplicaSetClient.new(['localhost:27017'], 
['localhost:27018'], 
:ssl => true) 
Here, mongod instances run on “localhost:27017” and “localhost:27018”. 
For more details, see the Ruby MongoDB Driver page36. 
Node.JS (node-mongodb-native) 
In the node-mongodb-native37 driver, use the following invocation to connect to a mongod or mongos instance via 
SSL: 
35http://docs.mongodb.org/ecosystem/drivers/java 
36http://docs.mongodb.org/ecosystem/drivers/ruby 
37https://github.com/mongodb/node-mongodb-native 
var db1 = new Db(MONGODB, new Server("127.0.0.1", 27017, 
{ auto_reconnect: false, poolSize:4, ssl:true } )); 
To connect to a replica set via SSL, use the following form: 
var replSet = new ReplSetServers( [ 
new Server( RS.host, RS.ports[1], { auto_reconnect: true } ), 
new Server( RS.host, RS.ports[0], { auto_reconnect: true } ), 
], 
{rs_name:RS.name, ssl:true} 
); 
For more details, see the Node.JS MongoDB Driver page38. 
.NET 
As of release 1.6, the .NET driver supports SSL connections with mongod and mongos instances. To connect using 
SSL, you must add an option to the connection string, specifying ssl=true as follows: 
var connectionString = "mongodb://localhost/?ssl=true"; 
var server = MongoServer.Create(connectionString); 
The .NET driver will validate the certificate against the local trusted certificate store, in addition to providing 
encryption of the connection. This behavior may produce issues during testing if the server uses a self-signed certificate. If 
you encounter this issue, add the sslverifycertificate=false option to the connection string to prevent the 
.NET driver from validating the certificate, as follows: 
var connectionString = "mongodb://localhost/?ssl=true&sslverifycertificate=false"; 
var server = MongoServer.Create(connectionString); 
For more details, see the .NET MongoDB Driver page39. 
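These options travel in the query component of the connection string. The following Python sketch (an illustration of the string format, not the .NET driver) shows how they parse out; driver option names are matched case-insensitively:

```python
# Illustrative check of how options ride in the connection string's query
# component (Python stdlib; not the .NET driver's parser).
from urllib.parse import urlsplit, parse_qs

def connection_options(uri):
    query = urlsplit(uri).query
    # Option names are matched case-insensitively; keep the last value given.
    return {k.lower(): v[-1] for k, v in parse_qs(query).items()}

opts = connection_options(
    "mongodb://localhost/?ssl=true&sslverifycertificate=false")
```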
MongoDB Tools 
Changed in version 2.6. 
Various MongoDB utility programs support SSL. These tools include: 
• mongodump 
• mongoexport 
• mongofiles 
• mongoimport 
• mongooplog 
• mongorestore 
• mongostat 
• mongotop 
To use SSL connections with these tools, use the same SSL options as the mongo shell. See mongo Shell SSL 
Configuration (page 307). 
38http://docs.mongodb.org/ecosystem/drivers/node-js 
39http://docs.mongodb.org/ecosystem/drivers/csharp 
Upgrade a Cluster to Use SSL 
Note: The default distribution of MongoDB40 does not contain support for SSL. To use SSL you can either compile 
MongoDB with SSL support or use MongoDB Enterprise. See Configure mongod and mongos for SSL (page 304) for 
more information about SSL and MongoDB. 
Changed in version 2.6. 
The MongoDB server supports listening for both SSL encrypted and unencrypted connections on the same TCP port. 
This allows upgrades of MongoDB clusters to use SSL encrypted connections. To upgrade from a MongoDB cluster 
using no SSL encryption to one using only SSL encryption, use the following rolling upgrade process: 
1. For each node of a cluster, start the node with the option --sslMode set to allowSSL. The --sslMode 
allowSSL setting allows the node to accept both SSL and non-SSL incoming connections. Its connections to 
other servers do not use SSL. Include other SSL options (page 304) as well as any other options that are required 
for your specific configuration. For example: 
mongod --replSet <name> --sslMode allowSSL --sslPEMKeyFile <path to SSL Certificate and key PEM file> --sslCAFile <path to root CA PEM file> 
Upgrade all nodes of the cluster to these settings. 
Note: You may also specify these options in the configuration file, as in the following example: 
sslMode = <disabled|allowSSL|preferSSL|requireSSL> 
sslPEMKeyFile = <path to SSL certificate and key PEM file> 
sslCAFile = <path to root CA PEM file> 
2. Switch all clients to use SSL. See SSL Configuration for Clients (page 307). 
3. For each node of a cluster, use the setParameter command to update the sslMode to preferSSL. 41 
With preferSSL as its net.ssl.mode, the node accepts both SSL and non-SSL incoming connections, 
and its connections to other servers use SSL. For example: 
db.getSiblingDB('admin').runCommand( { setParameter: 1, sslMode: "preferSSL" } ) 
Upgrade all nodes of the cluster to these settings. 
At this point, all connections should be using SSL. 
4. For each node of the cluster, use the setParameter command to update the sslMode to requireSSL. 41 
With requireSSL as its net.ssl.mode, the node will reject any non-SSL connections. For example: 
db.getSiblingDB('admin').runCommand( { setParameter: 1, sslMode: "requireSSL" } ) 
5. After the upgrade of all nodes, edit the configuration file with the appropriate SSL settings to ensure 
that upon subsequent restarts, the cluster uses SSL. 
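The invariant behind this rolling upgrade can be sketched as follows (an illustrative model, not a driver API): a given connection type keeps working only while every node it may reach accepts it.

```python
# Sketch (not a driver API) of which connection types each net.ssl.mode
# accepts, per the rolling-upgrade steps above.
ACCEPTS = {
    "disabled":   {"plain"},
    "allowSSL":   {"plain", "ssl"},
    "preferSSL":  {"plain", "ssl"},
    "requireSSL": {"ssl"},
}

def cluster_accepts(node_modes, conn_type):
    """A client connection works only if every node accepts its type."""
    return all(conn_type in ACCEPTS[mode] for mode in node_modes)

# SSL clients work once every node is at least allowSSL; plain clients keep
# working until any node reaches requireSSL -- hence switch clients to SSL
# (step 2) before moving any node past preferSSL.
```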
Configure MongoDB for FIPS 
New in version 2.6. 
40http://www.mongodb.org/downloads 
41 As an alternative to using the setParameter command, you can also restart the nodes with the appropriate SSL options and values. 
Overview 
The Federal Information Processing Standard (FIPS) is a U.S. government computer security standard used to certify 
software modules and libraries that encrypt and decrypt data securely. You can configure MongoDB to run with a 
FIPS 140-2 certified library for OpenSSL. Configure FIPS to run by default or as needed from the command line. 
Prerequisites 
Only the MongoDB Enterprise42 version supports FIPS mode. Download and install MongoDB Enterprise43 to use 
FIPS mode. 
Your system must have an OpenSSL library configured with the FIPS 140-2 module. At the command line, type 
openssl version to confirm your OpenSSL software includes FIPS support. 
For Red Hat Enterprise Linux 6.x (RHEL 6.x) or its derivatives such as CentOS 6.x, the OpenSSL toolkit must be 
at least openssl-1.0.1e-16.el6_5 to use FIPS mode. To upgrade the toolkit for these platforms, issue the 
following command: 
sudo yum update openssl 
Some versions of Linux periodically execute a process to prelink dynamic libraries with pre-assigned addresses. This 
process modifies the OpenSSL libraries, specifically libcrypto. The OpenSSL FIPS mode will subsequently fail 
the signature check performed upon startup to ensure libcrypto has not been modified since compilation. 
To configure the Linux prelink process to not prelink libcrypto: 
sudo bash -c "echo '-b /usr/lib64/libcrypto.so.*' >>/etc/prelink.conf.d/openssl-prelink.conf" 
Procedure 
Configure MongoDB to use SSL See Configure mongod and mongos for SSL (page 304) for details about 
configuring OpenSSL. 
Run mongod or mongos instance in FIPS mode Perform these steps after you Configure mongod and mongos 
for SSL (page 304). 
Step 1: Change configuration file. To configure your mongod or mongos instance to use FIPS mode, shut down 
the instance and update the configuration file with the following setting: 
net: 
   ssl: 
      FIPSMode: true 
Step 2: Start mongod or mongos instance with configuration file. For example, run this command to start the 
mongod instance with its configuration file: 
mongod --config /etc/mongodb.conf 
For more information about configuration files, see http://docs.mongodb.org/manual/reference/configuration-options. 
42http://www.mongodb.com/products/mongodb-enterprise 
43http://www.mongodb.com/products/mongodb-enterprise 
Confirm FIPS mode is running Check the server log file for a message FIPS is active: 
FIPS 140-2 mode activated 
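A minimal sketch for performing this check programmatically (the helper function is hypothetical; only the message string comes from the documentation above):

```python
# Minimal sketch: scan a mongod log for the FIPS activation message above.
def fips_active(log_lines):
    return any("FIPS 140-2 mode activated" in line for line in log_lines)

# Synthetic log lines for illustration.
log = [
    "[initandlisten] MongoDB starting : pid=1234",
    "[initandlisten] FIPS 140-2 mode activated",
]
```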
6.3.3 Security Deployment Tutorials 
The following tutorials provide information on deploying MongoDB using authentication and authorization. 
Deploy Replica Set and Configure Authentication and Authorization (page 313) Configure a replica set that has 
authentication enabled. 
Deploy Replica Set and Configure Authentication and Authorization 
Overview 
With authentication (page 282) enabled, MongoDB forces all clients to identify themselves before granting access to 
the server. Authorization (page 285), in turn, allows administrators to define and limit the resources and operations 
that a user can access. Using authentication and authorization is a key part of a complete security strategy. 
All MongoDB deployments support authentication. By default, MongoDB does not require authorization checking. 
You can enforce authorization checking when deploying MongoDB, or on an existing deployment; however, you 
cannot enable authorization checking on a running deployment without downtime. 
This tutorial provides a procedure for creating a MongoDB replica set (page 503) that uses the challenge-response 
authentication mechanism. The tutorial includes creation of a minimal authorization system to support basic operations. 
Considerations 
Authentication In this procedure, you will configure MongoDB using the default challenge-response authentication 
mechanism, using the keyFile to supply the password for inter-process authentication (page 284). The content of 
the key file is the shared secret used for all internal authentication. 
All deployments that enforce authorization checking should have one user administrator user that can create new users 
and modify existing users. During this procedure you will create a user administrator that you will use to administer 
this deployment. 
Architecture In production, deploy each member of the replica set to its own machine and, if possible, bind to the 
standard MongoDB port of 27017. Use the bind_ip option to ensure that MongoDB listens for connections from 
applications on configured addresses. 
For geographically distributed replica sets, ensure that the majority of the set’s mongod instances reside in the 
primary site. 
See Replica Set Deployment Architectures (page 516) for more information. 
Connectivity Ensure that network traffic can pass between all members of the set and all clients in the network 
securely and efficiently. Consider the following: 
• Establish a virtual private network. Ensure that your network topology routes all traffic between members within 
a single site over the local area network. 
• Configure access control to prevent connections from unknown clients to the replica set. 
• Configure networking and firewall rules so that incoming and outgoing packets are permitted only on the default 
MongoDB port and only from within your deployment. 
Finally ensure that each member of a replica set is accessible by way of resolvable DNS or hostnames. You should 
either configure your DNS names appropriately or set up your systems’ /etc/hosts file to reflect this configuration. 
Configuration Specify the run time configuration on each system in a configuration file stored in 
/etc/mongodb.conf or a related location. Create the directory where MongoDB stores data files before 
deploying MongoDB. 
For more information about the run time options used above and other configuration options, see 
http://docs.mongodb.org/manual/reference/configuration-options. 
Procedure 
This procedure deploys a replica set in which all members use the same key file. 
Step 1: Start one member of the replica set. This mongod should not enable auth. 
Step 2: Create administrative users. The following operations will create two users: a user administrator that will 
be able to create and modify users (siteUserAdmin), and a root (page 368) user (siteRootAdmin) that you 
will use to complete the remainder of the tutorial: 
use admin 
db.createUser( { 
user: "siteUserAdmin", 
pwd: "<password>", 
roles: [ { role: "userAdminAnyDatabase", db: "admin" } ] 
}); 
db.createUser( { 
user: "siteRootAdmin", 
pwd: "<password>", 
roles: [ { role: "root", db: "admin" } ] 
}); 
Step 3: Stop the mongod instance. 
Step 4: Create the key file to be used by each member of the replica set. Create the key file your deployment will 
use to authenticate servers to each other. 
To generate pseudo-random data to use for a keyfile, issue the following openssl command: 
openssl rand -base64 741 > mongodb-keyfile 
chmod 600 mongodb-keyfile 
You may generate a key file using any method you choose. Always ensure that the password stored in the key file is 
long and contains a high amount of entropy. Using openssl in this manner helps generate such a key. 
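As an illustration of what "long and high-entropy" means here, the following Python sketch produces keyfile content equivalent to the openssl command above; the 6-to-1024-character base64 limit mentioned in the comment is an assumption to verify against your server version:

```python
import base64
import secrets

def generate_keyfile_content(num_bytes: int = 741) -> str:
    """Generate high-entropy, base64-encoded keyfile content,
    mirroring `openssl rand -base64 741` (minus line wrapping).

    MongoDB accepts keyfile content of 6 to 1024 base64 characters
    (assumption; check against your server version). 741 random
    bytes encode to 988 base64 characters, within that limit.
    """
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")

key = generate_keyfile_content()
```

Write the result to a file and set its permissions to 600, exactly as the openssl and chmod steps above do.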
Step 5: Copy the key file to each member of the replica set. Copy the mongodb-keyfile to all hosts where 
components of a MongoDB deployment run. Set the permissions of these files to 600 so that only the owner of the 
file can read or write this file, preventing other users on the system from accessing the shared secret. 
Step 6: Start each member of the replica set with the appropriate options. For each member, start a mongod 
and specify the key file and the name of the replica set. Also specify other parameters as needed for your deployment. 
For replication-specific parameters, see the mongod replica set command-line options required by your deployment. 
If your application connects to more than one replica set, each set should have a distinct name. Some drivers group 
replica set connections by replica set name. 
The following example specifies parameters through the --keyFile and --replSet command-line options: 
mongod --keyFile /mysecretdirectory/mongodb-keyfile --replSet "rs0" 
The following example specifies parameters through a configuration file: 
mongod --config $HOME/.mongodb/config 
In production deployments, you can configure a control script to manage this process. Control scripts are beyond the 
scope of this document. 
Step 7: Connect to the member of the replica set where you created the administrative users. Connect to 
the replica set member you started and authenticate as the siteRootAdmin user. From the mongo shell, use the 
following operation to authenticate: 
use admin 
db.auth("siteRootAdmin", "<password>"); 
Step 8: Initiate the replica set. Use rs.initiate(): 
rs.initiate() 
MongoDB initiates a set that consists of the current member and that uses the default replica set configuration. 
Step 9: Verify the initial replica set configuration. Use rs.conf() to display the replica set configuration object 
(page 594): 
rs.conf() 
The replica set configuration object resembles the following: 
{ 
"_id" : "rs0", 
"version" : 1, 
"members" : [ 
{ 
"_id" : 1, 
"host" : "mongodb0.example.net:27017" 
} 
] 
} 
Step 10: Add the remaining members to the replica set. Add the remaining members with the rs.add() 
method. 
The following example adds two members: 
rs.add("mongodb1.example.net") 
rs.add("mongodb2.example.net") 
When complete, you have a fully functional replica set. The new replica set will elect a primary. 
Step 11: Check the status of the replica set. Use the rs.status() operation: 
rs.status() 
Step 12: Create additional users to address operational requirements. You can use built-in roles (page 361) to 
create common types of database users, such as the dbOwner (page 363) role to create a database administrator, the 
readWrite (page 362) role to create a user who can update data, or the read (page 362) role to create a user who 
can search data but no more. You also can define custom roles (page 286). 
For example, the following creates a database administrator for the products database: 
use products 
db.createUser( 
{ 
user: "productsDBAdmin", 
pwd: "password", 
roles: 
[ 
{ 
role: "dbOwner", 
db: "products" 
} 
] 
} 
) 
For an overview of roles and privileges, see Authorization (page 285). For more information on adding users, see Add 
a User to a Database (page 344). 
6.3.4 Access Control Tutorials 
The following tutorials provide instructions for MongoDB’s authentication- and authorization-related features. 
Enable Client Access Control (page 317) Describes the process for enabling authentication for MongoDB deployments. 
Enable Authentication in a Sharded Cluster (page 318) Control access to a sharded cluster through a key file and 
the keyFile setting on each of the cluster’s components. 
Enable Authentication after Creating the User Administrator (page 319) Describes an alternative process for 
enabling authentication for MongoDB deployments. 
Use x.509 Certificates to Authenticate Clients (page 320) Use x.509 for client authentication. 
Use x.509 Certificate for Membership Authentication (page 323) Use x.509 for internal member authentication for 
replica sets and sharded clusters. 
Authenticate Using SASL and LDAP with ActiveDirectory (page 326) Describes the process for authentication 
using SASL/LDAP with ActiveDirectory. 
Authenticate Using SASL and LDAP with OpenLDAP (page 329) Describes the process for authentication using 
SASL/LDAP with OpenLDAP. 
Configure MongoDB with Kerberos Authentication on Linux (page 331) For MongoDB Enterprise Linux, describes 
the process to enable Kerberos-based authentication for MongoDB deployments. 
Configure MongoDB with Kerberos Authentication on Windows (page 334) For MongoDB Enterprise for Windows, 
describes the process to enable Kerberos-based authentication for MongoDB deployments. 
Authenticate to a MongoDB Instance or Cluster (page 336) Describes the process for authenticating to MongoDB 
systems using the mongo shell. 
Generate a Key File (page 338) Use a key file to allow the components of a MongoDB sharded cluster or replica set to 
mutually authenticate. 
Troubleshoot Kerberos Authentication on Linux (page 338) Steps to troubleshoot Kerberos-based authentication 
for MongoDB deployments. 
Implement Field Level Redaction (page 340) Describes the process to set up and access document content that can 
have different access levels for the same data. 
Enable Client Access Control 
Overview 
Enabling access control on a MongoDB instance restricts access to the instance by requiring that users identify 
themselves when connecting. In this procedure, you enable access control and then create the instance’s first user, which 
must be a user administrator. The user administrator grants further access to the instance by creating additional users. 
Considerations 
If you create the user administrator before enabling access control, MongoDB disables the localhost exception 
(page 285). In that case, you must use the “Enable Authentication after Creating the User Administrator (page 319)” 
procedure to enable access control. 
This procedure uses the localhost exception (page 285) to allow you to create the first user after enabling authentication. 
See Localhost Exception (page 285) and Authentication (page 282) for more information. 
Procedure 
Step 1: Start the MongoDB instance with authentication enabled. Start the mongod or mongos instance with 
the authorization or keyFile setting. Use authorization on a standalone instance. Use keyFile on an 
instance in a replica set or sharded cluster. 
For example, to start a mongod with authentication enabled and a key file stored in /private/var, first set the 
following option in the mongod’s configuration file: 
security: 
keyFile: /private/var/key.pem 
Then start the mongod and specify the config file. For example: 
mongod --config /etc/mongodb/mongodb.conf 
After you enable authentication, only the user administrator can connect to the MongoDB instance. The user 
administrator must log in and grant further access to the instance by creating additional users. 
Step 2: Connect to the MongoDB instance via the localhost exception. Connect to the MongoDB instance from 
a client running on the same system. This access is made possible by the localhost exception (page 285). 
Step 3: Create the system user administrator. Add the user with the userAdminAnyDatabase (page 368) 
role, and only that role. 
The following example creates the user siteUserAdmin on the admin database: 
use admin 
db.createUser( 
{ 
user: "siteUserAdmin", 
pwd: "password", 
roles: 
[ 
{ 
role: "userAdminAnyDatabase", 
db: "admin" 
} 
] 
} 
) 
After you create the user administrator, the localhost exception (page 285) is no longer available. 
Step 4: Create additional users. Log in with the user administrator’s credentials and create additional users. See 
Add a User to a Database (page 344). 
Next Steps 
If you need to disable access control for any reason, restart the process without the authorization or keyFile 
setting. 
Enable Authentication in a Sharded Cluster 
New in version 2.0: Support for authentication with sharded clusters. 
Overview 
When authentication is enabled on a sharded cluster, every client that accesses the cluster must provide credentials. 
This includes MongoDB instances that access each other within the cluster. 
To enable authentication on a sharded cluster, you must enable authentication individually on each component of the 
cluster. This means enabling authentication on each mongos and each mongod, including each config server, and all 
members of a shard’s replica set. 
Authentication requires an authentication mechanism and, in most cases, a key file. The content of the key file 
must be the same on all cluster members. 
Procedure 
Step 1: Create a key file. Create the key file your deployment will use to authenticate servers to each other. 
To generate pseudo-random data to use for a keyfile, issue the following openssl command: 
openssl rand -base64 741 > mongodb-keyfile 
chmod 600 mongodb-keyfile 
You may generate a key file using any method you choose. Always ensure that the password stored in the key file is 
long and contains a high amount of entropy. Using openssl in this manner helps generate such a key. 
Step 2: Enable authentication on each component in the cluster. On each mongos and mongod in the cluster, 
including all config servers and shards, specify the key file using one of the following approaches: 
Specify the key file in the configuration file. In the configuration file, set the keyFile option to the key file’s path 
and then start the component, as in the following example: 
security: 
keyFile: /srv/mongodb/keyfile 
Specify the key file at runtime. When starting the component, set the --keyFile option, which is an option 
for both mongos instances and mongod instances. Set the --keyFile to the key file’s path. The keyFile 
setting implies the authorization setting, which means in most cases you do not need to set authorization 
explicitly. 
Step 3: Add users. While connected to a mongos, add the first administrative user and then add subsequent users. 
See Create a User Administrator (page 343). 
Related Documents 
• Authentication (page 282) 
• Security (page 279) 
• Use x.509 Certificate for Membership Authentication (page 323) 
Enable Authentication after Creating the User Administrator 
Overview 
Enabling authentication on a MongoDB instance restricts access to the instance by requiring that users identify 
themselves when connecting. In this procedure, you will create the instance’s first user, which must be a user 
administrator, and then enable authentication. Then, you can authenticate as the user administrator to create additional users and 
grant additional access to the instance. 
This procedure outlines how to enable authentication after creating the user administrator. The approach requires a 
restart. To enable authentication without restarting, see Enable Client Access Control (page 317). 
Considerations 
This document outlines a procedure for enabling authentication on a MongoDB instance: you create the first user 
on an existing MongoDB system that does not require authentication, then restart the instance with authentication 
required. Alternatively, you can use the localhost exception (page 285) to gain access to a system that has 
authentication enabled but no users. See Enable Client Access Control (page 317) for the description of that procedure. 
Procedure 
Step 1: Start the MongoDB instance without authentication. Start the mongod or mongos instance without the 
authorization or keyFile setting. For example: 
mongod --port 27017 --dbpath /data/db1 
For details on starting a mongod or mongos, see Manage mongod Processes (page 207) or Deploy a Sharded Cluster 
(page 635). 
Step 2: Create the system user administrator. Add the user with the userAdminAnyDatabase (page 368) 
role, and only that role. 
The following example creates the user siteUserAdmin on the admin database: 
use admin 
db.createUser( 
{ 
user: "siteUserAdmin", 
pwd: "password", 
roles: 
[ 
{ 
role: "userAdminAnyDatabase", 
db: "admin" 
} 
] 
} 
) 
Step 3: Re-start the MongoDB instance with authentication enabled. Re-start the mongod or mongos instance 
with the authorization or keyFile setting. Use authorization on a standalone instance. Use keyFile 
on an instance in a replica set or sharded cluster. 
The following example enables authentication on a standalone mongod using the authorization command-line 
option: 
mongod --auth --config /etc/mongodb/mongodb.conf 
Step 4: Create additional users. Log in with the user administrator’s credentials and create additional users. See 
Add a User to a Database (page 344). 
Next Steps 
If you need to disable authentication for any reason, restart the process without the authorization or keyFile 
option. 
Use x.509 Certificates to Authenticate Clients 
New in version 2.6. 
MongoDB supports x.509 certificate authentication for use with a secure SSL connection (page 304). The x.509 client 
authentication allows clients to authenticate to servers with certificates (page 321) rather than with a username and 
password. 
To use x.509 authentication for the internal authentication of replica set/sharded cluster members, see Use x.509 
Certificate for Membership Authentication (page 323). 
Client x.509 Certificate 
The client certificate must have the following properties: 
• A single Certificate Authority (CA) must issue the certificates for both the client and the server. 
• Client certificates must contain the following fields: 
keyUsage = digitalSignature 
extendedKeyUsage = clientAuth 
• A client x.509 certificate’s subject, which contains the Distinguished Name (DN), must differ from that of a 
Member x.509 Certificate (page 323) to prevent client certificates from identifying the client as a cluster member 
and granting full permission on the system. Specifically, the subjects must differ with regards to at least one of 
the following attributes: Organization (O), the Organizational Unit (OU) or the Domain Component (DC). 
• Each unique MongoDB user must have a unique certificate. 
Configure MongoDB Server 
Use Command-line Options You can configure the MongoDB server from the command line, e.g.: 
mongod --sslMode requireSSL --sslPEMKeyFile <path to SSL certificate and key PEM file> --sslCAFile <path to root CA PEM file> 
Warning: If the --sslCAFile option and its target file are not specified, x.509 client and member authentication 
will not function. mongod, and mongos in sharded systems, will not be able to verify the certificates of 
processes connecting to it against the trusted certificate authority (CA) that issued them, breaking the certificate 
chain. 
As of version 2.6.4, mongod will not start with x.509 authentication enabled if the CA file is not specified. 
Use Configuration File You may also specify these options in the configuration file. 
Starting in MongoDB 2.6, you can specify the configuration for MongoDB in YAML format, e.g.: 
net: 
ssl: 
mode: requireSSL 
PEMKeyFile: <path to SSL certificate and key PEM file> 
CAFile: <path to root CA PEM file> 
For backwards compatibility, you can also specify the configuration using the older configuration file format44, e.g.: 
sslMode = requireSSL 
sslPEMKeyFile = <path to SSL certificate and key PEM file> 
sslCAFile = <path to the root CA PEM file> 
Include any additional options, SSL or otherwise, that are required for your specific configuration. 
44http://docs.mongodb.org/v2.4/reference/configuration 
Add x.509 Certificate subject as a User 
To authenticate with a client certificate, you must first add the value of the subject from the client certificate as a 
MongoDB user. Each unique x.509 client certificate corresponds to a single MongoDB user; i.e. you cannot use a 
single client certificate to authenticate more than one MongoDB user. 
1. You can retrieve the subject from the client certificate with the following command: 
openssl x509 -in <pathToClient PEM> -inform PEM -subject -nameopt RFC2253 
The command returns the subject string as well as the certificate: 
subject= CN=myName,OU=myOrgUnit,O=myOrg,L=myLocality,ST=myState,C=myCountry 
-----BEGIN CERTIFICATE----- 
# ... 
-----END CERTIFICATE----- 
2. Add the value of the subject from the certificate as a user, omitting the spaces. 
For example, in the mongo shell, to add the user with both the readWrite role in the test database and the 
userAdminAnyDatabase role which is defined only in the admin database: 
db.getSiblingDB("$external").runCommand( 
{ 
createUser: "CN=myName,OU=myOrgUnit,O=myOrg,L=myLocality,ST=myState,C=myCountry", 
roles: [ 
{ role: 'readWrite', db: 'test' }, 
{ role: 'userAdminAnyDatabase', db: 'admin' } 
], 
writeConcern: { w: "majority" , wtimeout: 5000 } 
} 
) 
In the above example, to add the user with the readWrite role in the test database, the role specification 
document specified 'test' in the db field. To add the userAdminAnyDatabase role for the user, the above 
example specified 'admin' in the db field. 
Note: Some roles are defined only in the admin database, including: clusterAdmin, 
readAnyDatabase, readWriteAnyDatabase, dbAdminAnyDatabase, and 
userAdminAnyDatabase. To add a user with these roles, specify 'admin' in the db field. 
See Add a User to a Database (page 344) for details on adding a user with roles. 
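As a sketch of the subject-to-username step, the following hypothetical helper strips the subject= tag from the openssl output above so that the remaining DN string can be passed as the createUser value in the $external database:

```python
def subject_to_username(openssl_subject_line: str) -> str:
    """Convert the `subject=` line printed by openssl into the string
    stored as the MongoDB user name in the $external database.

    Drops the leading "subject=" tag and the space after it; the DN
    itself (RFC2253 form) is kept verbatim.
    """
    tag, _, dn = openssl_subject_line.partition("=")
    if tag.strip() != "subject":
        raise ValueError("expected a line of the form 'subject= <DN>'")
    return dn.strip()
```

The returned string must match the certificate subject exactly, since x.509 authentication compares it verbatim.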
Authenticate with an x.509 Certificate 
To authenticate with a client certificate, you must first add a MongoDB user that corresponds to the client certificate. 
See Add x.509 Certificate subject as a User (page 322). 
To authenticate, use the db.auth() method in the $external database, specifying "MONGODB-X509" for the 
mechanism field, and the user that corresponds to the client certificate (page 322) for the user field. 
For example, if using the mongo shell, 
1. Connect mongo shell to the mongod set up for SSL: 
mongo --ssl --sslPEMKeyFile <path to CA signed client PEM file> --sslCAFile <path to root CA PEM file> 
MongoDB Documentation, Release 2.6.4 
2. To perform the authentication, use the db.auth() method in the $external database. For the mechanism 
field, specify "MONGODB-X509", and for the user field, specify the user, or the subject, that corresponds 
to the client certificate. 
db.getSiblingDB("$external").auth( 
{ 
mechanism: "MONGODB-X509", 
user: "CN=myName,OU=myOrgUnit,O=myOrg,L=myLocality,ST=myState,C=myCountry" 
} 
) 
Use x.509 Certificate for Membership Authentication 
New in version 2.6. 
MongoDB supports x.509 certificate authentication for use with a secure SSL connection (page 304). Sharded cluster 
members and replica set members can use x.509 certificates to verify their membership to the cluster or the replica set 
instead of using keyfiles (page 282). The membership authentication is an internal process. 
For client authentication with x.509, see Use x.509 Certificates to Authenticate Clients (page 320). 
Member x.509 Certificate 
The member certificate, used for internal authentication to verify membership to the sharded cluster or a replica set, 
must have the following properties: 
• A single Certificate Authority (CA) must issue all the x.509 certificates for the members of a sharded cluster or 
a replica set. 
• The Distinguished Name (DN), found in the member certificate’s subject, must specify a non-empty value 
for at least one of the following attributes: Organization (O), the Organizational Unit (OU) or the Domain 
Component (DC). 
• The Organization attributes (O’s), the Organizational Unit attributes (OU’s), and the Domain Components (DC’s) 
must match those from the certificates for the other cluster members. To match, the certificates must agree on all 
specifications of these attributes, including the non-specification of these attributes. The order of the attributes 
does not matter. 
In the following example, the two DN’s contain matching specifications for O and OU, as well as the 
non-specification of the DC attribute. 
CN=host1,OU=Dept1,O=MongoDB,ST=NY,C=US 
C=US, ST=CA, O=MongoDB, OU=Dept1, CN=host2 
However, the following two DN’s contain a mismatch for the OU attribute, since one contains two OU 
specifications and the other only one. 
CN=host1,OU=Dept1,OU=Sales,O=MongoDB 
CN=host2,OU=Dept1,O=MongoDB 
• Either the Common Name (CN) or one of the Subject Alternative Name (SAN) entries must match the hostname 
of the server, used by the other members of the cluster. 
For example, the certificates for a cluster could have the following subjects: 
subject= CN=<myhostname1>,OU=Dept1,O=MongoDB,ST=NY,C=US 
subject= CN=<myhostname2>,OU=Dept1,O=MongoDB,ST=NY,C=US 
subject= CN=<myhostname3>,OU=Dept1,O=MongoDB,ST=NY,C=US 
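The O/OU/DC matching rule above can be sketched as a small check; parse_dn and member_dns_compatible are hypothetical helper names, and the naive comma split assumes no escaped commas inside attribute values:

```python
def parse_dn(dn: str) -> list:
    """Split a DN string into (attribute, value) pairs.

    Naive comma split: assumes no escaped commas in attribute values.
    """
    pairs = []
    for part in dn.split(","):
        attr, _, value = part.strip().partition("=")
        pairs.append((attr, value))
    return pairs

def member_dns_compatible(dn_a: str, dn_b: str) -> bool:
    """Apply the matching rule: the O, OU and DC attribute-value pairs,
    taken as a multiset (order-insensitive), must be identical in both
    subjects, including agreeing on which attributes are absent."""
    def cluster_attrs(dn):
        return sorted(p for p in parse_dn(dn) if p[0] in ("O", "OU", "DC"))
    return cluster_attrs(dn_a) == cluster_attrs(dn_b)
```

Run against the examples above, the host1/host2 pair with matching O and OU passes, while the pair with two OU specifications on one side and one on the other fails.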
It is possible to use a single x509 certificate for both member authentication and x.509 client authentication (page 320). 
To do so, obtain a certificate with both clientAuth and serverAuth (i.e. “TLS Web Client Authentication” and 
“TLS Web Server Authentication”) specified as Extended Key Usage (EKU) values, or simply do not specify any 
EKU values. Provide this file as the --sslPEMKeyFile option and omit the --sslClusterFile option described 
below. 
Configure Replica Set/Sharded Cluster 
Use Command-line Options To specify the x.509 certificate for internal cluster member authentication, append 
the additional SSL options --clusterAuthMode and --sslClusterFile, as in the following example for a 
member of a replica set: 
mongod --replSet <name> --sslMode requireSSL --clusterAuthMode x509 --sslClusterFile <path to membership certificate and key PEM file> 
Include any additional options, SSL or otherwise, that are required for your specific configuration. For instance, if 
the membership key is encrypted, set the --sslClusterPassword to the passphrase to decrypt the key or have 
MongoDB prompt for the passphrase. See SSL Certificate Passphrase (page 306) for details. 
Warning: If the --sslCAFile option and its target file are not specified, x.509 client and member authentication 
will not function. mongod, and mongos in sharded systems, will not be able to verify the certificates of 
processes connecting to it against the trusted certificate authority (CA) that issued them, breaking the certificate 
chain. 
As of version 2.6.4, mongod will not start with x.509 authentication enabled if the CA file is not specified. 
Use Configuration File You may also specify these options in the configuration file. 
YAML Formatted Configuration File Starting in MongoDB 2.6, you can specify the configuration for MongoDB 
in YAML format, as in the following example: 
security: 
clusterAuthMode: x509 
net: 
ssl: 
mode: requireSSL 
PEMKeyFile: <path to SSL certificate and key PEM file> 
CAFile: <path to root CA PEM file> 
clusterFile: <path to x.509 membership certificate and key PEM file> 
See security.clusterAuthMode, net.ssl.mode, net.ssl.PEMKeyFile, net.ssl.CAFile, and 
net.ssl.clusterFile for more information on the settings. 
v2.4 Configuration File For backwards compatibility, you can also specify the configuration using the v2.4 
configuration file format45, as in the following example: 
sslMode = requireSSL 
sslPEMKeyFile = <path to SSL certificate and key PEM file> 
sslCAFile = <path to root CA PEM file> 
clusterAuthMode = x509 
sslClusterFile = <path to membership certificate and key PEM file> 
45http://docs.mongodb.org/v2.4/reference/configuration 
Upgrade from Keyfile Authentication to x.509 Authentication 
To upgrade clusters that are currently using keyfile authentication to x.509 authentication, use a rolling upgrade 
process. 
Clusters Currently Using SSL For clusters using SSL and keyfile authentication, to upgrade to x.509 cluster 
authentication, use the following rolling upgrade process: 
1. For each node of a cluster, start the node with the option --clusterAuthMode set to sendKeyFile and 
the option --sslClusterFile set to the appropriate path of the node’s certificate. Include other SSL options 
(page 304) as well as any other options that are required for your specific configuration. For example: 
mongod --replSet <name> --sslMode requireSSL --clusterAuthMode sendKeyFile --sslClusterFile <path to membership certificate and key PEM file> 
With this setting, each node continues to use its keyfile to authenticate itself as a member. However, each 
node can now accept either a keyfile or an x.509 certificate from other members to authenticate those members. 
Upgrade all nodes of the cluster to this setting. 
2. Then, for each node of a cluster, connect to the node and use the setParameter command to update the 
clusterAuthMode to sendX509. 46 For example, 
db.getSiblingDB('admin').runCommand( { setParameter: 1, clusterAuthMode: "sendX509" } ) 
With this setting, each node uses its x.509 certificate, specified with the --sslClusterFile option in the 
previous step, to authenticate itself as a member. However, each node continues to accept either a keyfile or an 
x.509 certificate from other members to authenticate those members. Upgrade all nodes of the cluster to this 
setting. 
3. Optional but recommended. Finally, for each node of the cluster, connect to the node and use the 
setParameter command to update the clusterAuthMode to x509 to only use the x.509 certificate for 
authentication. 1 For example: 
db.getSiblingDB('admin').runCommand( { setParameter: 1, clusterAuthMode: "x509" } ) 
4. After the upgrade of all nodes, edit the configuration file with the appropriate x.509 settings to ensure 
that upon subsequent restarts, the cluster uses x.509 authentication. 
See --clusterAuthMode for the various modes and their descriptions. 
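The reason for the intermediate sendKeyFile and sendX509 steps can be sketched as a small model of what each clusterAuthMode sends to peers and accepts from them, as described above; the tables and function names are illustrative, not MongoDB internals:

```python
# What each clusterAuthMode value sends to peers, and what it accepts
# from peers, per the mode descriptions in this tutorial.
SENDS = {
    "keyFile": "keyfile",
    "sendKeyFile": "keyfile",
    "sendX509": "x509",
    "x509": "x509",
}
ACCEPTS = {
    "keyFile": {"keyfile"},
    "sendKeyFile": {"keyfile", "x509"},
    "sendX509": {"keyfile", "x509"},
    "x509": {"x509"},
}

def pair_ok(mode_a: str, mode_b: str) -> bool:
    """True if two members running these modes can mutually authenticate."""
    return SENDS[mode_a] in ACCEPTS[mode_b] and SENDS[mode_b] in ACCEPTS[mode_a]

def rolling_upgrade_safe(sequence: list) -> bool:
    """A rolling upgrade is safe if each adjacent pair of modes in the
    sequence interoperates, so a half-upgraded cluster (some nodes on
    the old mode, some on the new) stays fully connected."""
    return all(pair_ok(a, b) for a, b in zip(sequence, sequence[1:]))
```

In this model, pair_ok("keyFile", "x509") is False, which is why switching directly from keyfiles to x509 would partition a half-upgraded cluster, while the documented keyFile → sendKeyFile → sendX509 → x509 sequence keeps every adjacent pair compatible.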
Clusters Currently Not Using SSL For clusters using keyfile authentication but not SSL, to upgrade to x.509 
authentication, use the following rolling upgrade process: 
1. For each node of a cluster, start the node with the option --sslMode set to allowSSL, the option 
--clusterAuthMode set to sendKeyFile and the option --sslClusterFile set to the appropriate 
path of the node’s certificate. Include other SSL options (page 304) as well as any other options that are 
required for your specific configuration. For example: 
mongod --replSet <name> --sslMode allowSSL --clusterAuthMode sendKeyFile --sslClusterFile <path to membership certificate and key PEM file> 
The --sslMode allowSSL setting allows the node to accept both SSL and non-SSL incoming connections. 
Its outgoing connections do not use SSL. 
The --clusterAuthMode sendKeyFile setting allows each node to continue to use its keyfile to authenticate 
itself as a member. However, each node can now accept either a keyfile or an x.509 certificate from other 
members to authenticate those members. 
46 As an alternative to using the setParameter command, you can also restart the nodes with the appropriate SSL and x509 options and 
values. 
Upgrade all nodes of the cluster to these settings. 
2. Then, for each node of a cluster, connect to the node and use the setParameter command to update the 
sslMode to preferSSL and the clusterAuthMode to sendX509. 1 For example: 
db.getSiblingDB('admin').runCommand( { setParameter: 1, sslMode: "preferSSL", clusterAuthMode: "sendX509" } ) 
With the sslMode set to preferSSL, the node accepts both SSL and non-SSL incoming connections, and its 
outgoing connections use SSL. 
With the clusterAuthMode set to sendX509, each node uses its x.509 certificate, specified with the 
--sslClusterFile option in the previous step, to authenticate itself as a member. However, each node 
continues to accept either a keyfile or an x.509 certificate from other members to authenticate those members. 
Upgrade all nodes of the cluster to these settings. 
3. Optional but recommended. Finally, for each node of the cluster, connect to the node and use the 
setParameter command to update the sslMode to requireSSL and the clusterAuthMode to x509. 
1 For example: 
db.getSiblingDB('admin').runCommand( { setParameter: 1, sslMode: "requireSSL", clusterAuthMode: "x509" } ) 
With the sslMode set to requireSSL, the node only uses SSL connections. 
With the clusterAuthMode set to x509, the node only uses the x.509 certificate for authentication. 
4. After the upgrade of all nodes, edit the configuration file with the appropriate SSL and x.509 settings 
to ensure that upon subsequent restarts, the cluster uses x.509 authentication. 
See --clusterAuthMode for the various modes and their descriptions. 
Authenticate Using SASL and LDAP with ActiveDirectory 
MongoDB Enterprise provides support for proxy authentication of users. This allows administrators to configure 
a MongoDB cluster to authenticate users by proxying authentication requests to a specified Lightweight Directory 
Access Protocol (LDAP) service. 
Considerations 
MongoDB Enterprise for Windows does not include LDAP support for authentication. However, MongoDB Enterprise 
for Linux supports using LDAP authentication with an ActiveDirectory server. 
MongoDB does not support LDAP authentication in mixed sharded cluster deployments that contain both version 2.4 
and version 2.6 shards. See Upgrade MongoDB to 2.6 (page 751) for upgrade instructions. 
Use secure encrypted or trusted connections between clients and the server, as well as between saslauthd and the 
LDAP server. The LDAP server uses the SASL PLAIN mechanism, sending and receiving data in plain text. You 
should use only a trusted channel such as a VPN, a connection encrypted with SSL, or a trusted wired network. 
Configure saslauthd 
LDAP support for user authentication requires proper configuration of the saslauthd daemon process as well as 
the MongoDB server. 
Step 1: Specify the mechanism. On systems that configure saslauthd with the 
/etc/sysconfig/saslauthd file, such as Red Hat Enterprise Linux, Fedora, CentOS, and Amazon 
Linux AMI, set the mechanism MECH to ldap: 
MECH=ldap 
On systems that configure saslauthd with the /etc/default/saslauthd file, such as Ubuntu, set the 
MECHANISMS option to ldap: 
MECHANISMS="ldap" 
Step 2: Adjust caching behavior. On certain Linux distributions, saslauthd starts with the caching of 
authentication credentials enabled. Until restarted or until the cache expires, saslauthd will not contact the LDAP 
server to re-authenticate users in its authentication cache. This allows saslauthd to successfully authenticate users in its 
cache, even if the LDAP server is down or if the cached users’ credentials are revoked. 
To set the expiration time (in seconds) for the authentication cache, see the -t option47 of saslauthd. 
Step 3: Configure LDAP Options with Active Directory. If the saslauthd.conf file does not exist, create it. 
The saslauthd.conf file usually resides in the /etc folder. If specifying a different file path, see the -O option48 
of saslauthd. 
To use with Active Directory, start saslauthd with the following configuration options set in the 
saslauthd.conf file: 
ldap_servers: <ldap uri> 
ldap_use_sasl: yes 
ldap_mech: DIGEST-MD5 
ldap_auth_method: fastbind 
For the <ldap uri>, specify the URI of the LDAP server. For example, ldap_servers: 
ldaps://ad.example.net. 
For more information on saslauthd configuration, see http://www.openldap.org/doc/admin24/guide.html#Configuringsaslauthd. 
Step 4: Test the saslauthd configuration. Use the testsaslauthd utility to test the saslauthd configuration. 
For example: 
testsaslauthd -u testuser -p testpassword -f /var/run/saslauthd/mux 
Configure MongoDB 
Step 1: Add user to MongoDB for authentication. Add the user to the $external database in MongoDB. To 
specify the user’s privileges, assign roles (page 285) to the user. 
For example, the following adds a user with read-only access to the records database. 
db.getSiblingDB("$external").createUser( 
{ 
user : <username>, 
roles: [ { role: "read", db: "records" } ] 
} 
) 
47. http://www.linuxcommand.org/man_pages/saslauthd8.html 
48. http://www.linuxcommand.org/man_pages/saslauthd8.html 
Add additional principals as needed. For more information about creating and managing users, see 
http://docs.mongodb.org/manual/reference/command/nav-user-management. 
Step 2: Configure MongoDB server. To configure the MongoDB server to use the saslauthd instance for proxy 
authentication, start the mongod with the following options: 
• --auth, 
• authenticationMechanisms parameter set to PLAIN, and 
• saslauthdPath parameter set to the path to the Unix-domain socket of the saslauthd instance. 
Configure the MongoDB server using either the command line option --setParameter or the configuration 
file. Specify additional configurations as appropriate for your configuration. 
If you use the authorization option to enforce authentication, you will need privileges to create a user. 
Use specific saslauthd socket path. For a socket path of /<some>/<path>/saslauthd, set the 
saslauthdPath to /<some>/<path>/saslauthd/mux, as in the following command line example: 
mongod --auth --setParameter saslauthdPath=/<some>/<path>/saslauthd/mux --setParameter authenticationMechanisms=PLAIN 
Or if using a configuration file, specify the following parameters in the file: 
auth=true 
setParameter=saslauthdPath=/<some>/<path>/saslauthd/mux 
setParameter=authenticationMechanisms=PLAIN 
Use default Unix-domain socket path. To use the default Unix-domain socket path, set the saslauthdPath to 
the empty string "", as in the following command line example: 
mongod --auth --setParameter saslauthdPath="" --setParameter authenticationMechanisms=PLAIN 
Or if using a configuration file, specify the following parameters in the file: 
auth=true 
setParameter=saslauthdPath="" 
setParameter=authenticationMechanisms=PLAIN 
Step 3: Authenticate the user in the mongo shell. To perform the authentication in the mongo shell, use the 
db.auth() method in the $external database. 
Specify the value "PLAIN" in the mechanism field, the user and password in the user and pwd fields respectively, 
and the value false in the digestPassword field. You must specify false for digestPassword since the 
server must receive an undigested password to forward on to saslauthd, as in the following example: 
db.getSiblingDB("$external").auth( 
{ 
mechanism: "PLAIN", 
user: <username>, 
pwd: <cleartext password>, 
digestPassword: false 
} 
) 
The server forwards the password in plain text. In general, use only on a trusted channel (VPN, SSL, trusted wired 
network). See Considerations. 
Authenticate Using SASL and LDAP with OpenLDAP 
MongoDB Enterprise provides support for proxy authentication of users. This allows administrators to configure 
a MongoDB cluster to authenticate users by proxying authentication requests to a specified Lightweight Directory 
Access Protocol (LDAP) service. 
Considerations 
MongoDB Enterprise for Windows does not include LDAP support for authentication. However, MongoDB Enterprise 
for Linux supports using LDAP authentication with an Active Directory server. 
MongoDB does not support LDAP authentication in mixed sharded cluster deployments that contain both version 2.4 
and version 2.6 shards. See Upgrade MongoDB to 2.6 (page 751) for upgrade instructions. 
Use secure encrypted or trusted connections between clients and the server, as well as between saslauthd and the 
LDAP server. The LDAP server uses the SASL PLAIN mechanism, sending and receiving data in plain text. You 
should use only a trusted channel such as a VPN, a connection encrypted with SSL, or a trusted wired network. 
Configure saslauthd 
LDAP support for user authentication requires proper configuration of the saslauthd daemon process as well as 
the MongoDB server. 
Step 1: Specify the mechanism. On systems that configure saslauthd with the 
/etc/sysconfig/saslauthd file, such as Red Hat Enterprise Linux, Fedora, CentOS, and Amazon 
Linux AMI, set the mechanism MECH to ldap: 
MECH=ldap 
On systems that configure saslauthd with the /etc/default/saslauthd file, such as Ubuntu, set the 
MECHANISMS option to ldap: 
MECHANISMS="ldap" 
Step 2: Adjust caching behavior. On certain Linux distributions, saslauthd starts with the caching of authentication 
credentials enabled. Until restarted or until the cache expires, saslauthd will not contact the LDAP server 
to re-authenticate users in its authentication cache. This allows saslauthd to successfully authenticate users in its 
cache, even if the LDAP server is down or if the cached users’ credentials are revoked. 
To set the expiration time (in seconds) for the authentication cache, see the -t option49 of saslauthd. 
Step 3: Configure LDAP Options with OpenLDAP. If the saslauthd.conf file does not exist, create it. The 
saslauthd.conf file usually resides in the /etc folder. If specifying a different file path, see the -O option50 of 
saslauthd. 
To connect to an OpenLDAP server, update the saslauthd.conf file with the following configuration options: 
ldap_servers: <ldap uri> 
ldap_search_base: <search base> 
ldap_filter: <filter> 
49. http://www.linuxcommand.org/man_pages/saslauthd8.html 
50. http://www.linuxcommand.org/man_pages/saslauthd8.html 
The ldap_servers option specifies the URI of the LDAP server used for authentication. In general, for OpenLDAP installed 
on the local machine, you can specify the value ldap://localhost:389 or, if using LDAP over SSL, you can 
specify the value ldaps://localhost:636. 
The ldap_search_base option specifies the distinguished name to which the search is relative. The search includes the base 
and the objects below it. 
The ldap_filter specifies the search filter. 
The values for these configuration options should correspond to the values specific to your deployment. For example, to filter 
on email, specify ldap_filter: (mail=%n) instead. 
OpenLDAP Example A sample saslauthd.conf file for OpenLDAP includes the following content: 
ldap_servers: ldaps://ad.example.net 
ldap_search_base: ou=Users,dc=example,dc=com 
ldap_filter: (uid=%u) 
To use this sample OpenLDAP configuration, create users with a uid attribute (login name) and place them under the 
Users organizational unit (ou) under the domain components (dc) example and com. 
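As a rough illustration of how a filter template like the one above is applied, the following plain-JavaScript sketch (hypothetical helper, not part of saslauthd) substitutes a login name into the %u token; consult the saslauthd LDAP documentation for the authoritative token set:

```javascript
// Hypothetical sketch: expanding an ldap_filter template such as (uid=%u).
// The %u token stands for the login name supplied at authentication time.
function expandFilter(template, username) {
  return template.split('%u').join(username);
}

console.log(expandFilter('(uid=%u)', 'testuser')); // (uid=testuser)
```

With the sample configuration above, a login attempt by testuser would therefore search for entries matching (uid=testuser) under ou=Users,dc=example,dc=com.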
For more information on saslauthd configuration, see http://www.openldap.org/doc/admin24/guide.html#Configuringsaslauthd. 
Step 4: Test the saslauthd configuration. Use the testsaslauthd utility to test the saslauthd configuration. 
For example: 
testsaslauthd -u testuser -p testpassword -f /var/run/saslauthd/mux 
Configure MongoDB 
Step 1: Add user to MongoDB for authentication. Add the user to the $external database in MongoDB. To 
specify the user’s privileges, assign roles (page 285) to the user. 
For example, the following adds a user with read-only access to the records database. 
db.getSiblingDB("$external").createUser( 
{ 
user : <username>, 
roles: [ { role: "read", db: "records" } ] 
} 
) 
Add additional principals as needed. For more information about creating and managing users, see 
http://docs.mongodb.org/manual/reference/command/nav-user-management. 
Step 2: Configure MongoDB server. To configure the MongoDB server to use the saslauthd instance for proxy 
authentication, start the mongod with the following options: 
• --auth, 
• authenticationMechanisms parameter set to PLAIN, and 
• saslauthdPath parameter set to the path to the Unix-domain socket of the saslauthd instance. 
Configure the MongoDB server using either the command line option --setParameter or the configuration 
file. Specify additional configurations as appropriate for your configuration. 
If you use the authorization option to enforce authentication, you will need privileges to create a user. 
Use specific saslauthd socket path. For a socket path of /<some>/<path>/saslauthd, set the 
saslauthdPath to /<some>/<path>/saslauthd/mux, as in the following command line example: 
mongod --auth --setParameter saslauthdPath=/<some>/<path>/saslauthd/mux --setParameter authenticationMechanisms=PLAIN 
Or if using a configuration file, specify the following parameters in the file: 
auth=true 
setParameter=saslauthdPath=/<some>/<path>/saslauthd/mux 
setParameter=authenticationMechanisms=PLAIN 
Use default Unix-domain socket path. To use the default Unix-domain socket path, set the saslauthdPath to 
the empty string "", as in the following command line example: 
mongod --auth --setParameter saslauthdPath="" --setParameter authenticationMechanisms=PLAIN 
Or if using a configuration file, specify the following parameters in the file: 
auth=true 
setParameter=saslauthdPath="" 
setParameter=authenticationMechanisms=PLAIN 
Step 3: Authenticate the user in the mongo shell. To perform the authentication in the mongo shell, use the 
db.auth() method in the $external database. 
Specify the value "PLAIN" in the mechanism field, the user and password in the user and pwd fields respectively, 
and the value false in the digestPassword field. You must specify false for digestPassword since the 
server must receive an undigested password to forward on to saslauthd, as in the following example: 
db.getSiblingDB("$external").auth( 
{ 
mechanism: "PLAIN", 
user: <username>, 
pwd: <cleartext password>, 
digestPassword: false 
} 
) 
The server forwards the password in plain text. In general, use only on a trusted channel (VPN, SSL, trusted wired 
network). See Considerations. 
Configure MongoDB with Kerberos Authentication on Linux 
New in version 2.4. 
Overview 
MongoDB Enterprise supports authentication using a Kerberos service (page 291). Kerberos is an industry-standard 
authentication protocol for large client/server systems. 
Prerequisites 
Setting up and configuring a Kerberos deployment is beyond the scope of this document. This tutorial assumes 
you have configured a Kerberos service principal (page 292) for each mongod and mongos instance in your 
MongoDB deployment, and you have a valid keytab file (page 292) for each mongod and mongos instance. 
To verify MongoDB Enterprise binaries: 
mongod --version 
In the output from this command, look for the string modules: subscription or modules: enterprise 
to confirm your system has MongoDB Enterprise. 
Procedure 
The following procedure outlines the steps to add a Kerberos user principal to MongoDB, configure a standalone 
mongod instance for Kerberos support, and connect using the mongo shell and authenticate the user principal. 
Step 1: Start mongod without Kerberos. For the initial addition of Kerberos users, start mongod without Kerberos 
support. 
If a Kerberos user is already in MongoDB and has the privileges required to create a user, you can start mongod with 
Kerberos support. 
Step 2: Connect to mongod. Connect via the mongo shell to the mongod instance. If mongod has --auth 
enabled, ensure you connect with the privileges required to create a user. 
Step 3: Add Kerberos Principal(s) to MongoDB. Add a Kerberos principal, <username>@<KERBEROS 
REALM> or <username>/<instance>@<KERBEROS REALM>, to MongoDB in the $external database. 
Specify the Kerberos realm in all uppercase. The $external database allows mongod to consult an external source 
(e.g. Kerberos) to authenticate. To specify the user’s privileges, assign roles (page 285) to the user. 
The following example adds the Kerberos principal application/reporting@EXAMPLE.NET with read-only 
access to the records database: 
use $external 
db.createUser( 
{ 
user: "application/reporting@EXAMPLE.NET", 
roles: [ { role: "read", db: "records" } ] 
} 
) 
Add additional principals as needed. For every user you want to authenticate using Kerberos, you must 
create a corresponding user in MongoDB. For more information about creating and managing users, see 
http://docs.mongodb.org/manual/reference/command/nav-user-management. 
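The principal naming convention above can be sketched in plain JavaScript (a hypothetical helper for illustration only, not MongoDB or Kerberos code):

```javascript
// Hypothetical sketch: splitting a principal of the form
// <username>[/<instance>]@<KERBEROS REALM> into its parts.
// Note the realm portion must be specified in all uppercase.
function parsePrincipal(principal) {
  const atIndex = principal.lastIndexOf('@');
  const realm = principal.slice(atIndex + 1);
  const [user, instance] = principal.slice(0, atIndex).split('/');
  return { user: user, instance: instance || null, realm: realm };
}

const p = parsePrincipal('application/reporting@EXAMPLE.NET');
console.log(p.user, p.instance, p.realm); // application reporting EXAMPLE.NET
console.log(p.realm === p.realm.toUpperCase()); // true
```

A principal without an instance, such as reportingapp@EXAMPLE.NET, parses the same way with a null instance.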
Step 4: Start mongod with Kerberos support. To start mongod with Kerberos support, set the environment 
variable KRB5_KTNAME to the path of the keytab file and the mongod parameter authenticationMechanisms 
to GSSAPI in the following form: 
env KRB5_KTNAME=<path to keytab file> \
mongod \
--setParameter authenticationMechanisms=GSSAPI \
<additional mongod options> 
For example, the following starts a standalone mongod instance with Kerberos support: 
env KRB5_KTNAME=/opt/mongodb/mongod.keytab \
/opt/mongodb/bin/mongod --auth \
--setParameter authenticationMechanisms=GSSAPI \
--dbpath /opt/mongodb/data 
The path to your mongod as well as your keytab file (page 292) may differ. Modify or include additional mongod 
options as required for your configuration. The keytab file (page 292) must be accessible only to the owner of the 
mongod process. 
With the official .deb or .rpm packages, you can set the KRB5_KTNAME in an environment settings file. See 
KRB5_KTNAME (page 333) for details. 
Step 5: Connect mongo shell to mongod and authenticate. Connect the mongo shell client as the Kerberos principal 
application/reporting@EXAMPLE.NET. Before connecting, you must have used Kerberos’s kinit 
program to get credentials for application/reporting@EXAMPLE.NET. 
You can connect and authenticate from the command line. 
mongo --authenticationMechanism=GSSAPI --authenticationDatabase='$external' \
--username application/reporting@EXAMPLE.NET 
Or, alternatively, you can first connect mongo to the mongod, and then from the mongo shell, use the db.auth() 
method to authenticate in the $external database. 
use $external 
db.auth( { mechanism: "GSSAPI", user: "application/reporting@EXAMPLE.NET" } ) 
Additional Considerations 
KRB5_KTNAME If you installed MongoDB Enterprise using one of the official .deb or .rpm packages, and you 
use the included init/upstart scripts to control the mongod instance, you can set the KRB5_KTNAME variable in the 
default environment settings file instead of setting the variable each time. 
For .rpm packages, the default environment settings file is /etc/sysconfig/mongod. 
For .deb packages, the file is /etc/default/mongodb. 
Set the KRB5_KTNAME value in a line that resembles the following: 
export KRB5_KTNAME="<path to keytab>" 
Configure mongos for Kerberos To start mongos with Kerberos support, set the environment 
variable KRB5_KTNAME to the path of its keytab file (page 292) and the mongos parameter 
authenticationMechanisms to GSSAPI in the following form: 
env KRB5_KTNAME=<path to keytab file> \
mongos \
--setParameter authenticationMechanisms=GSSAPI \
<additional mongos options> 
For example, the following starts a mongos instance with Kerberos support: 
env KRB5_KTNAME=/opt/mongodb/mongos.keytab \
mongos \
--setParameter authenticationMechanisms=GSSAPI \
--configdb shard0.example.net,shard1.example.net,shard2.example.net \
--keyFile /opt/mongodb/mongos.keyfile 
The path to your mongos as well as your keytab file (page 292) may differ. The keytab file (page 292) must be 
accessible only to the owner of the mongos process. 
Modify or include any additional mongos options as required for your configuration. For example, instead of using 
--keyFile for internal authentication of sharded cluster members, you can use x.509 member authentication 
(page 323). 
Use a Config File To configure mongod or mongos for Kerberos support using a configuration file, 
specify the authenticationMechanisms setting in the configuration file: 
setParameter=authenticationMechanisms=GSSAPI 
Modify or include any additional mongod options as required for your configuration. 
For example, if /opt/mongodb/mongod.conf contains the following 
configuration settings for a standalone mongod: 
auth = true 
setParameter=authenticationMechanisms=GSSAPI 
dbpath=/opt/mongodb/data 
To start mongod with Kerberos support, use the following form: 
env KRB5_KTNAME=/opt/mongodb/mongod.keytab \
/opt/mongodb/bin/mongod --config /opt/mongodb/mongod.conf 
The path to your mongod, keytab file (page 292), and configuration file may differ. The keytab file (page 292) must 
be accessible only to the owner of the mongod process. 
Troubleshoot Kerberos Setup for MongoDB If you encounter problems when starting mongod or mongos with 
Kerberos authentication, see Troubleshoot Kerberos Authentication on Linux (page 338). 
Incorporate Additional Authentication Mechanisms Kerberos authentication (GSSAPI) can work alongside 
MongoDB’s challenge/response authentication mechanism (MONGODB-CR), MongoDB’s authentication mechanism 
for LDAP (PLAIN), and MongoDB’s authentication mechanism for x.509 (MONGODB-X509). Specify the mechanisms 
as follows: 
--setParameter authenticationMechanisms=GSSAPI,MONGODB-CR 
Only add the other mechanisms if in use. This parameter setting does not affect MongoDB’s internal authentication of 
cluster members. 
Configure MongoDB with Kerberos Authentication on Windows 
New in version 2.6. 
Overview 
MongoDB Enterprise supports authentication using a Kerberos service (page 291). Kerberos is an industry-standard 
authentication protocol for large client/server systems. Kerberos allows MongoDB and applications to take advantage 
of existing authentication infrastructure and processes. 
Prerequisites 
Setting up and configuring a Kerberos deployment is beyond the scope of this document. This tutorial assumes you have 
configured a Kerberos service principal (page 292) for each mongod.exe and mongos.exe instance. 
Procedures 
Step 1: Start mongod.exe without Kerberos. For the initial addition of Kerberos users, start mongod.exe 
without Kerberos support. 
If a Kerberos user is already in MongoDB and has the privileges required to create a user, you can start mongod.exe 
with Kerberos support. 
Step 2: Connect to mongod. Connect via the mongo.exe shell to the mongod.exe instance. If mongod.exe 
has --auth enabled, ensure you connect with the privileges required to create a user. 
Step 3: Add Kerberos Principal(s) to MongoDB. Add a Kerberos principal, <username>@<KERBEROS 
REALM>, to MongoDB in the $external database. Specify the Kerberos realm in all uppercase. The $external 
database allows mongod.exe to consult an external source (e.g. Kerberos) to authenticate. To specify the user’s 
privileges, assign roles (page 285) to the user. 
The following example adds the Kerberos principal reportingapp@EXAMPLE.NET with read-only access to the 
records database: 
use $external 
db.createUser( 
{ 
user: "reportingapp@EXAMPLE.NET", 
roles: [ { role: "read", db: "records" } ] 
} 
) 
Add additional principals as needed. For every user you want to authenticate using Kerberos, you must 
create a corresponding user in MongoDB. For more information about creating and managing users, see 
http://docs.mongodb.org/manual/reference/command/nav-user-management. 
Step 4: Start mongod.exe with Kerberos support. You must start mongod.exe as the service principal account 
(page 336). 
To start mongod.exe with Kerberos support, set the mongod.exe parameter authenticationMechanisms 
to GSSAPI: 
mongod.exe --setParameter authenticationMechanisms=GSSAPI <additional mongod.exe options> 
For example, the following starts a standalone mongod.exe instance with Kerberos support: 
mongod.exe --auth --setParameter authenticationMechanisms=GSSAPI 
Modify or include additional mongod.exe options as required for your configuration. 
Step 5: Connect mongo.exe shell to mongod.exe and authenticate. Connect the mongo.exe shell client as 
the Kerberos principal reportingapp@EXAMPLE.NET. 
You can connect and authenticate from the command line. 
mongo.exe --authenticationMechanism=GSSAPI --authenticationDatabase='$external' \
--username reportingapp@EXAMPLE.NET 
Or, alternatively, you can first connect mongo.exe to the mongod.exe, and then from the mongo.exe shell, use 
the db.auth() method to authenticate in the $external database. 
use $external 
db.auth( { mechanism: "GSSAPI", user: "reportingapp@EXAMPLE.NET" } ) 
Additional Considerations 
Configure mongos.exe for Kerberos To start mongos.exe with Kerberos support, set the mongos.exe 
parameter authenticationMechanisms to GSSAPI. You must start mongos.exe as the service principal 
account (page 336): 
mongos.exe --setParameter authenticationMechanisms=GSSAPI <additional mongos options> 
For example, the following starts a mongos instance with Kerberos support: 
mongos.exe --setParameter authenticationMechanisms=GSSAPI --configdb shard0.example.net,shard1.example.net,shard2.example.net 
Modify or include any additional mongos.exe options as required for your configuration. For example, instead of 
using --keyFile for internal authentication of sharded cluster members, you can use x.509 member authentication 
(page 323). 
Assign Service Principal Name to MongoDB Windows Service Use setspn.exe to assign the service 
principal name (SPN) to the account running the mongod.exe and the mongos.exe service: 
setspn.exe -A <service>/<fully qualified domain name> <service account name> 
For example, if mongod.exe runs as a service named mongodb on testserver.mongodb.com with the service 
account name mongodtest, assign the SPN as follows: 
setspn.exe -A mongodb/testserver.mongodb.com mongodtest 
Incorporate Additional Authentication Mechanisms Kerberos authentication (GSSAPI) can work alongside 
MongoDB’s challenge/response authentication mechanism (MONGODB-CR), MongoDB’s authentication mechanism 
for LDAP (PLAIN), and MongoDB’s authentication mechanism for x.509 (MONGODB-X509). Specify the mechanisms 
as follows: 
--setParameter authenticationMechanisms=GSSAPI,MONGODB-CR 
Only add the other mechanisms if in use. This parameter setting does not affect MongoDB’s internal authentication of 
cluster members. 
Authenticate to a MongoDB Instance or Cluster 
Overview 
To authenticate to a running mongod or mongos instance, you must have user credentials for a resource on that 
instance. When you authenticate to MongoDB, you authenticate either to a database or to a cluster. Your user privileges 
determine the resource you can authenticate to. 
You authenticate to a resource either by: 
• using the authentication options when connecting to the mongod or mongos instance, or 
• connecting first and then authenticating to the resource with the authenticate command or the db.auth() 
method. 
This section describes both approaches. 
In general, always use a trusted channel (VPN, SSL, trusted wired network) for connecting to a MongoDB instance. 
Prerequisites 
You must have user credentials on the database or cluster to which you are authenticating. 
Procedures 
Authenticate When First Connecting to MongoDB 
Step 1: Specify your credentials when starting the mongo shell. When using mongo to connect to a mongod 
or mongos, enter your username, password, and authenticationDatabase. For example: 
mongo --username "prodManager" --password "cleartextPassword" --authenticationDatabase "products" 
Step 2: Close the session when your work is complete. To close an authenticated session, use the logout command: 
db.runCommand( { logout: 1 } ) 
Authenticate After Connecting to MongoDB 
Step 1: Connect to a MongoDB instance. Connect to a mongod or mongos instance. 
Step 2: Switch to the database to which to authenticate. 
use <database> 
Step 3: Authenticate. Use either the authenticate command or the db.auth() method to provide your 
username and password to the database. For example: 
db.auth( "prodManager", "cleartextPassword" ) 
Step 4: Close the session when your work is complete. To close an authenticated session, use the logout command: 
db.runCommand( { logout: 1 } ) 
Generate a Key File 
Overview 
This section describes how to generate a key file to store authentication information. After generating a key file, 
specify the key file using the keyFile option when starting a mongod or mongos instance. 
A key’s length must be between 6 and 1024 characters and may only contain characters in the base64 set. The key 
file must not have group or world permissions on UNIX systems. Key file permissions are not checked on Windows 
systems. 
MongoDB strips whitespace characters (e.g. \x0d, \x09, and \x20) for cross-platform convenience. As a result, the 
following operations produce identical keys: 
echo -e "my secret key" > key1 
echo -e "my secret key\n" > key2 
echo -e "my    secret    key" > key3 
echo -e "my\r\nsecret\r\nkey\r\n" > key4 
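The equivalence of those four keys can be sketched in plain JavaScript (a hypothetical illustration of the stripping rule, not MongoDB's actual implementation):

```javascript
// Hypothetical illustration: after removing carriage returns, newlines,
// tabs, and spaces, the four echo variants above compare equal.
function stripWhitespace(key) {
  return key.replace(/[\r\n\t ]/g, '');
}

const variants = [
  'my secret key',
  'my secret key\n',
  'my    secret    key',
  'my\r\nsecret\r\nkey\r\n'
];
const stripped = variants.map(stripWhitespace);
console.log(stripped.every(function (k) { return k === stripped[0]; })); // true
```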
Procedure 
Step 1: Create a key file. Create the key file your deployment will use to authenticate servers to each other. 
To generate pseudo-random data to use for a keyfile, issue the following openssl command: 
openssl rand -base64 741 > mongodb-keyfile 
chmod 600 mongodb-keyfile 
You may generate a key file using any method you choose. Always ensure that the password stored in the key file is 
long and contains a high amount of entropy. Using openssl in this manner helps generate such a key. 
Step 2: Specify the key file when starting a MongoDB instance. Specify the path to the key file with the keyFile 
option. 
Troubleshoot Kerberos Authentication on Linux 
New in version 2.4. 
Kerberos Configuration Checklist 
If you have difficulty starting mongod or mongos with Kerberos (page 291) on Linux systems, ensure that: 
• The mongod and the mongos binaries are from MongoDB Enterprise. 
To verify MongoDB Enterprise binaries: 
mongod --version 
In the output from this command, look for the string modules: subscription or modules: 
enterprise to confirm your system has MongoDB Enterprise. 
• You are not using the HTTP Console51. MongoDB Enterprise does not support Kerberos authentication over the 
HTTP Console interface. 
51. http://docs.mongodb.org/ecosystem/tools/http-interface/#http-console 
• Either the service principal name (SPN) in the keytab file (page 292) matches the SPN for the 
mongod or mongos instance, or the mongod or mongos instance starts with --setParameter 
saslHostName=<host name> to match the name in the keytab file. 
• The canonical system hostname of the system that runs the mongod or mongos instance is a resolvable, fully 
qualified domain name for this host. You can test the system hostname resolution with the hostname -f command 
at the system prompt. 
• Each host that runs a mongod or mongos instance has both the A and PTR DNS records to provide forward 
and reverse lookup. The records allow the host to resolve the components of the Kerberos infrastructure. 
• Both the Kerberos Key Distribution Center (KDC) and the system running the mongod or mongos instance must 
be able to resolve each other using DNS. By default, Kerberos attempts to resolve hosts using the content of the 
/etc/krb5.conf file before using DNS to resolve hosts. 
• The time synchronization of the systems running the mongod or mongos instances and the Kerberos infrastructure 
must be within the maximum time skew (default is 5 minutes) of each other. Time differences greater than 
the maximum time skew will prevent successful authentication. 
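The time-skew condition in the last checklist item can be sketched as a simple comparison (a hypothetical illustration; the actual check is performed by the Kerberos libraries):

```javascript
// Hypothetical sketch: two clocks must agree within Kerberos's default
// maximum time skew of 5 minutes (300 seconds) for authentication to succeed.
const MAX_SKEW_SECONDS = 5 * 60;

function withinSkew(clientEpochSeconds, kdcEpochSeconds) {
  return Math.abs(clientEpochSeconds - kdcEpochSeconds) <= MAX_SKEW_SECONDS;
}

console.log(withinSkew(1700000000, 1700000120)); // true: 2 minutes apart
console.log(withinSkew(1700000000, 1700000400)); // false: 400s exceeds the skew
```

Running an NTP daemon on every host in the deployment keeps clocks well inside this window.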
Debug with More Verbose Logs 
If you still encounter problems with Kerberos on Linux, you can start both mongod and mongo (or another client) 
with the environment variable KRB5_TRACE set to different files to produce more verbose logging of the Kerberos 
process to help further troubleshooting. For example, the following starts a standalone mongod with KRB5_TRACE 
set: 
env KRB5_KTNAME=/opt/mongodb/mongod.keytab \
KRB5_TRACE=/opt/mongodb/log/mongodb-kerberos.log \
/opt/mongodb/bin/mongod --dbpath /opt/mongodb/data \
--fork --logpath /opt/mongodb/log/mongod.log \
--auth --setParameter authenticationMechanisms=GSSAPI 
Common Error Messages 
In some situations, MongoDB will return error messages from the GSSAPI interface if there is a problem with the 
Kerberos service. Some common error messages are: 
GSSAPI error in client while negotiating security context. This error occurs on the 
client and reflects insufficient credentials or a malicious attempt to authenticate. 
If you receive this error, ensure that you are using the correct credentials and the correct fully qualified domain 
name when connecting to the host. 
GSSAPI error acquiring credentials. This error occurs during the start of the mongod or mongos 
and reflects improper configuration of the system hostname or a missing or incorrectly configured keytab file. 
If you encounter this problem, consider the items in the Kerberos Configuration Checklist (page 338), in particular 
whether the SPN in the keytab file (page 292) matches the SPN for the mongod or mongos instance. 
To determine whether the SPNs match: 
1. Examine the keytab file, with the following command: 
klist -k <keytab> 
Replace <keytab> with the path to your keytab file. 
2. Check the configured hostname for your system, with the following command: 
hostname -f 
Ensure that this name matches the name in the keytab file, or start mongod or mongos with the 
--setParameter saslHostName=<hostname>. 
See also: 
• Kerberos Authentication (page 291) 
• Configure MongoDB with Kerberos Authentication on Linux (page 331) 
• Configure MongoDB with Kerberos Authentication on Windows (page 334) 
Implement Field Level Redaction 
The $redact pipeline operator restricts the contents of the documents based on information stored in the documents 
themselves. 
Figure 6.1: Diagram of security architecture with middleware and redaction. 
To store the access criteria data, add a field to the documents and subdocuments. To allow for multiple combinations 
of access levels for the same data, consider setting the access field to an array of arrays. Each array element contains 
a required set that allows a user with that set to access the data. 
Then, include the $redact stage in the db.collection.aggregate() operation to restrict contents of the 
result set based on the access required to view the data. 
For more information on the $redact pipeline operator, including its syntax and associated system variables as well 
as additional examples, see $redact. 
340 Chapter 6. Security
Procedure 
For example, a forecasts collection contains documents of the following form where the tags field determines 
the access levels required to view the data: 
{ 
_id: 1, 
title: "123 Department Report", 
tags: [ [ "G" ], [ "FDW" ] ], 
year: 2014, 
subsections: [ 
{ 
subtitle: "Section 1: Overview", 
tags: [ [ "SI", "G" ], [ "FDW" ] ], 
content: "Section 1: This is the content of section 1." 
}, 
{ 
subtitle: "Section 2: Analysis", 
tags: [ [ "STLW" ] ], 
content: "Section 2: This is the content of section 2." 
}, 
{ 
subtitle: "Section 3: Budgeting", 
tags: [ [ "TK" ], [ "FDW", "TGE" ] ], 
content: { 
text: "Section 3: This is the content of section 3.",
tags: [ [ "HCS"], [ "FDW", "TGE", "BX" ] ] 
} 
} 
] 
} 
For each document, the tags field contains the access groupings necessary to view the data. For example, the
value [ [ "G" ], [ "FDW", "TGE" ] ] specifies that a user requires either the "G" access level or both the
"FDW" and "TGE" access levels to view the data.
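The condition used in the $redact stage below can be sketched in plain JavaScript to show how $map, $setIsSubset, and $anyElementTrue combine. The helper names setIsSubset and redactDecision are illustrative only, not aggregation operators:

```javascript
// Decide, for one tags value, whether $redact keeps descending or prunes.
function setIsSubset(subset, superset) {
  // Mirrors $setIsSubset: every element of subset appears in superset.
  return subset.every(function (el) { return superset.indexOf(el) !== -1; });
}

function redactDecision(tags, userAccess) {
  // Mirrors $map over "$tags" followed by $anyElementTrue: a single
  // satisfied requirement set is enough to keep the level visible.
  var matches = tags.map(function (fieldTag) {
    return setIsSubset(fieldTag, userAccess);
  });
  return matches.some(Boolean) ? "$$DESCEND" : "$$PRUNE";
}

var userAccess = [ "FDW", "TGE" ];
console.log(redactDecision([ [ "G" ], [ "FDW" ] ], userAccess));  // $$DESCEND
console.log(redactDecision([ [ "STLW" ] ], userAccess));          // $$PRUNE
```

In the sample document below, Section 2 is pruned for this user because its only requirement set, [ "STLW" ], is not a subset of the user's access.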
Consider a user who only has access to view information tagged with either "FDW" or "TGE". To run a query on all 
documents with year 2014 for this user, include a $redact stage as in the following: 
var userAccess = [ "FDW", "TGE" ]; 
db.forecasts.aggregate( 
[ 
{ $match: { year: 2014 } }, 
{ $redact: 
{ 
$cond: { 
if: { $anyElementTrue: 
{ 
$map: { 
input: "$tags" , 
as: "fieldTag", 
in: { $setIsSubset: [ "$$fieldTag", userAccess ] } 
} 
} 
}, 
then: "$$DESCEND", 
else: "$$PRUNE" 
} 
} 
} 
] 
) 
The aggregation operation returns the following “redacted” document for the user: 
{ "_id" : 1, 
"title" : "123 Department Report", 
"tags" : [ [ "G" ], [ "FDW" ] ], 
"year" : 2014, 
"subsections" : 
[ 
{ 
"subtitle" : "Section 1: Overview", 
"tags" : [ [ "SI", "G" ], [ "FDW" ] ], 
"content" : "Section 1: This is the content of section 1." 
}, 
{ 
"subtitle" : "Section 3: Budgeting", 
"tags" : [ [ "TK" ], [ "FDW", "TGE" ] ] 
} 
] 
} 
See also: 
$map, $setIsSubset, $anyElementTrue 
6.3.5 User and Role Management Tutorials 
The following tutorials provide instructions on how to enable authentication and limit access for users with privilege 
roles. 
Create a User Administrator (page 343) Create users with special permissions to create, modify, and remove other
users, as well as administer authentication credentials (e.g. passwords).
Add a User to a Database (page 344) Create non-administrator users using MongoDB’s role-based authentication 
system. 
Create an Administrative User with Unrestricted Access (page 346) Create a user with unrestricted access. Create 
such a user only in unique situations. In general, all users in the system should have no more access than needed 
to perform their required operations. 
Create a Role (page 347) Create a custom role.
Assign a User a Role (page 349) Assign a user a role. A role grants the user a defined set of privileges. A user can 
have multiple roles. 
Verify User Privileges (page 350) View a user’s current privileges. 
Modify a User’s Access (page 352) Modify the actions available to a user on specific database resources. 
View Roles (page 353) View a role’s privileges. 
Change a User’s Password (page 354) Only user administrators can edit credentials. This tutorial describes the
process for editing an existing user’s password.
Change Your Password and Custom Data (page 355) Users with sufficient access can change their own passwords 
and modify the optional custom data associated with their user credential. 
Create a User Administrator 
Overview 
User administrators create users and create and assign roles. A user administrator can grant any privilege in the
database and can create new ones. In a MongoDB deployment, create the user administrator as the first user. Then let 
this user create all other users. 
To provide user administrators, MongoDB has userAdmin (page 363) and userAdminAnyDatabase (page 368) 
roles, which grant access to actions (page 375) that support user and role management. Following the policy of least
privilege, userAdmin (page 363) and userAdminAnyDatabase (page 368) confer no additional privileges.
Carefully control access to these roles. A user with either of these roles can grant itself unlimited additional privileges. 
Specifically, a user with the userAdmin (page 363) role can grant itself any privilege in the database. A user assigned 
either the userAdmin (page 363) role on the admin database or the userAdminAnyDatabase (page 368) can 
grant itself any privilege in the system. 
Prerequisites 
Required Access You must have the createUser (page 376) action (page 375) on a database to create a new user 
on that database. 
You must have the grantRole (page 376) action (page 375) on a role’s database to grant the role to another user. 
If you have the userAdmin (page 363) or userAdminAnyDatabase (page 368) role, you have those actions. 
First User Restrictions If your MongoDB deployment has no users, you must connect to mongod using the localhost
exception (page 285) or use the --noauth option when starting mongod to gain full access to the system. Once
you have access, you can skip to Creating the system user administrator in this procedure. 
If users exist in the MongoDB database, but none of them has the appropriate prerequisites to create a new user or you 
do not have access to them, you must restart mongod with the --noauth option. 
Procedure 
Step 1: Connect to MongoDB with the appropriate privileges. Connect to the mongod or mongos as a user 
with the privileges required in the Prerequisites (page 343) section. 
The following example operation connects to MongoDB as an authenticated user named manager: 
mongo --port 27017 -u manager -p 12345678 --authenticationDatabase admin 
Step 2: Verify your privileges. Use the usersInfo command with the showPrivileges option. 
The following example operation checks privileges for a user connected as manager: 
db.runCommand( 
{ 
usersInfo:"manager", 
showPrivileges:true 
} 
) 
The resulting users document displays the privileges granted to manager. 
Step 3: Create the system user administrator. Add the user with the userAdminAnyDatabase (page 368) 
role, and only that role. 
The following example creates the user siteUserAdmin user on the admin database: 
use admin 
db.createUser( 
{ 
user: "siteUserAdmin", 
pwd: "password", 
roles: 
[ 
{ 
role: "userAdminAnyDatabase", 
db: "admin" 
} 
] 
} 
) 
Step 4: Create a user administrator for a single database. Optionally, you may want to create user administrators 
that only have access to administer users in a specific database by way of the userAdmin (page 363) role. 
The following example creates the user recordsUserAdmin on the records database: 
use records
db.createUser( 
{ 
user: "recordsUserAdmin", 
pwd: "password", 
roles: 
[ 
{ 
role: "userAdmin", 
db: "records" 
} 
] 
} 
) 
Related Documents 
• Authentication (page 282) 
• Security Introduction (page 279) 
• Enable Client Access Control (page 317) 
• Access Control Tutorials (page 316) 
Add a User to a Database 
Changed in version 2.6. 
Overview 
Each application and user of a MongoDB system should map to a distinct user. This access
isolation facilitates access revocation and ongoing user maintenance. At the same time, users should have only the
minimal set of privileges required, to ensure a system of least privilege.
To create a user, you must define the user’s credentials and assign that user roles (page 285). Credentials verify the 
user’s identity to a database, and roles determine the user’s access to database resources and operations. 
For an overview of credentials and roles in MongoDB see Security Introduction (page 279). 
Considerations 
For users that authenticate using external mechanisms, 52 you do not need to provide credentials when creating users. 
For all users, select the roles that have the exact required privileges (page 286). If the correct roles do not exist, create 
roles (page 347). 
You can create a user without assigning roles, choosing instead to assign the roles later. To do so, create the user with 
an empty roles (page 372) array. 
When adding a user to multiple databases, use unique username-and-password combinations for each database, see 
Password Hashing Insecurity (page 385) for more information. 
Prerequisites 
To create a user on a system that uses authentication (page 282), you must authenticate as a user administrator. If you 
have not yet created a user administrator, do so as described in Create a User Administrator (page 343). 
Required Access You must have the createUser (page 376) action (page 375) on a database to create a new user 
on that database. 
You must have the grantRole (page 376) action (page 375) on a role’s database to grant the role to another user. 
If you have the userAdmin (page 363) or userAdminAnyDatabase (page 368) role, you have those actions. 
First User Restrictions If your MongoDB deployment has no users, you must connect to mongod using the localhost
exception (page 285) or use the --noauth option when starting mongod to gain full access to the system. Once
you have access, you can skip to creating the new user in the procedure below.
If users exist in the MongoDB database, but none of them has the appropriate prerequisites to create a new user or you 
do not have access to them, you must restart mongod with the --noauth option. 
Procedures 
Step 1: Connect to MongoDB with the appropriate privileges. Connect to the mongod or mongos with the 
privileges required in the Prerequisites (page 345) section. 
The following example operation connects to MongoDB as an authenticated user named manager: 
mongo --port 27017 -u manager -p 12345678 --authenticationDatabase admin 
52 Configure MongoDB with Kerberos Authentication on Linux (page 331), Authenticate Using SASL and LDAP with OpenLDAP (page 329), 
Authenticate Using SASL and LDAP with ActiveDirectory (page 326), and x.509 certificates provide external authentication mechanisms. 
Step 2: Verify your privileges. Use the usersInfo command with the showPrivileges option. 
The following example operation checks privileges for a user connected as manager: 
db.runCommand( 
{ 
usersInfo:"manager", 
showPrivileges:true 
} 
) 
The resulting users document displays the privileges granted to manager. 
Step 3: Create the new user. Create the user in the database to which the user will belong. Pass a well-formed user
document to the db.createUser() method. 
The following operation creates a user in the reporting database with the specified name, password, and roles. 
use reporting 
db.createUser( 
{ 
user: "reportsUser", 
pwd: "12345678", 
roles: [ 
{ role: "read", db: "reporting" }, 
{ role: "read", db: "products" }, 
{ role: "read", db: "sales" } 
] 
} 
) 
To authenticate as reportsUser, you must authenticate the user against the reporting database.
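For example, the following command line starts a mongo shell session authenticated as the new user, assuming the password assigned above:

```shell
mongo --port 27017 -u reportsUser -p 12345678 --authenticationDatabase reporting
```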
Create an Administrative User with Unrestricted Access 
Overview 
Most users should have only the minimal set of privileges required for their operations, in keeping with the policy of 
least privilege. However, some authorization architectures may require a user with unrestricted access. To support 
these super users, you can create users with access to all database resources (page 373) and actions (page 375). 
For many deployments, you may be able to avoid having any users with unrestricted access by having an administrative 
user with the createUser (page 376) and grantRole (page 376) actions granted as needed to support operations. 
If users truly need unrestricted access to a MongoDB deployment, MongoDB provides a built-in role (page 361) named 
root (page 368) that grants the combined privileges of all built-in roles. This document describes how to create an 
administrative user with the root (page 368) role. 
For descriptions of the access each built-in role provides, see the section on built-in roles (page 361). 
Prerequisites 
Required Access You must have the createUser (page 376) action (page 375) on a database to create a new user 
on that database. 
You must have the grantRole (page 376) action (page 375) on a role’s database to grant the role to another user. 
If you have the userAdmin (page 363) or userAdminAnyDatabase (page 368) role, you have those actions. 
First User Restrictions If your MongoDB deployment has no users, you must connect to mongod using the localhost
exception (page 285) or use the --noauth option when starting mongod to gain full access to the system. Once
you have access, you can skip to creating the administrative user in the procedure below.
If users exist in the MongoDB database, but none of them has the appropriate prerequisites to create a new user or you 
do not have access to them, you must restart mongod with the --noauth option. 
Procedure 
Step 1: Connect to MongoDB with the appropriate privileges. Connect to the mongod or mongos as a user 
with the privileges required in the Prerequisites (page 346) section. 
The following example operation connects to MongoDB as an authenticated user named manager: 
mongo --port 27017 -u manager -p 12345678 --authenticationDatabase admin 
Step 2: Verify your privileges. Use the usersInfo command with the showPrivileges option. 
The following example operation checks privileges for a user connected as manager: 
db.runCommand( 
{ 
usersInfo:"manager", 
showPrivileges:true 
} 
) 
The resulting users document displays the privileges granted to manager. 
Step 3: Create the administrative user. In the admin database, create a new user using the db.createUser() 
method. Give the user the built-in root (page 368) role. 
For example: 
use admin 
db.createUser( 
{ 
user: "superuser", 
pwd: "12345678", 
roles: [ "root" ] 
} 
) 
Authenticate against the admin database to test the new user account. Use db.auth() while using the admin 
database or use the mongo shell with the --authenticationDatabase option. 
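For example, from a mongo shell session on the admin database, a sketch using the account created above:

```javascript
use admin
db.auth( "superuser", "12345678" )  // returns 1 if authentication succeeds
```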
Create a Role 
Overview 
Roles grant users access to MongoDB resources. By default, MongoDB provides a number of built-in roles (page 361) 
that administrators may use to control access to a MongoDB system. However, if these roles cannot describe the 
desired privilege set of a particular user type in a deployment, you can define a new, customized role. 
A role’s privileges apply to the database where the role is created. The role can inherit privileges from other roles in 
its database. A role created on the admin database can include privileges that apply to all databases or to the cluster 
(page 374) and can inherit privileges from roles in other databases. 
The combination of the database name and the role name uniquely defines a role in MongoDB. 
Prerequisites 
You must have the createRole (page 375) action (page 375) on a database to create a role on that database. 
You must have the grantRole (page 376) action (page 375) on the database that a privilege targets in order to 
grant that privilege to a role. If the privilege targets multiple databases or the cluster resource, you must have the
grantRole (page 376) action on the admin database. 
You must have the grantRole (page 376) action (page 375) on a role’s database to grant the role to another role. 
To view a role’s information, you must be explicitly granted the role or must have the viewRole (page 376) action 
(page 375) on the role’s database. 
Procedure 
Step 1: Connect to MongoDB with the appropriate privileges. Connect to the mongod or mongos with the 
privileges required in the Prerequisites (page 348) section. 
The following example operation connects to MongoDB as an authenticated user named manager: 
mongo --port 27017 -u manager -p 12345678 --authenticationDatabase admin 
Step 2: Verify your privileges. Use the usersInfo command with the showPrivileges option. 
The following example operation checks privileges for a user connected as manager: 
db.runCommand( 
{ 
usersInfo:"manager", 
showPrivileges:true 
} 
) 
The resulting users document displays the privileges granted to manager. 
Step 3: Define the privileges to grant to the role. Decide which resources (page 373) to grant access to and which 
actions (page 375) to grant on each resource. 
When creating the role, you will enter the resource-action pairings as documents in the privileges array. Each
resource is itself a document, as in the following example:
{ db: "products", collection: "electronics" } 
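A complete entry in the privileges array pairs such a resource document with an actions array. For example, the following sketch would grant the find and update actions on that collection:

```javascript
{
  resource: { db: "products", collection: "electronics" },
  actions: [ "find", "update" ]
}
```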
Step 4: Check whether an existing role provides the privileges. If an existing role contains the exact set of 
privileges (page 286), the new role can inherit (page 286) those privileges. 
To view the privileges provided by existing roles, use the rolesInfo command, as in the following: 
db.runCommand( { rolesInfo: 1, showPrivileges: 1 } ) 
Step 5: Create the role. To create the role, use the createRole command. Specify privileges in the 
privileges array and inherited roles in the roles array. 
The following example creates the myClusterwideAdmin role in the admin database: 
use admin 
db.createRole( 
{ 
role: "myClusterwideAdmin", 
privileges: 
[ 
{ resource: { cluster: true }, actions: [ "addShard" ] }, 
{ resource: { db: "config", collection: "" }, actions: [ "find", "update", "insert" ] }, 
{ resource: { db: "users", collection: "usersCollection" }, actions: [ "update" ] }, 
{ resource: { db: "", collection: "" }, actions: [ "find" ] } 
], 
roles: 
[ 
{ role: "read", db: "admin" } 
], 
writeConcern: { w: "majority" , wtimeout: 5000 } 
} 
) 
The operation defines the myClusterwideAdmin role’s privileges in the privileges array. In the roles array,
myClusterwideAdmin inherits privileges from the admin database’s read role. 
Assign a User a Role 
Changed in version 2.6. 
Overview 
A role provides a user privileges to perform a set of actions (page 375) on a resource (page 373). A user can have 
multiple roles. 
In MongoDB systems with authorization enforced, you must grant a user a role for the user to access a database 
resource. To assign a role, first determine the privileges the user needs and then determine the role that grants those 
privileges. 
For an overview of roles and privileges, see Authorization (page 285). For descriptions of the access each built-in role 
provides, see the section on built-in roles (page 361). 
Prerequisites 
You must have the grantRole (page 376) action (page 375) on a database to grant a role on that database. 
To view a role’s information, you must be explicitly granted the role or must have the viewRole (page 376) action 
(page 375) on the role’s database. 
Procedure 
Step 1: Connect with the privilege to grant roles. Connect to the mongod or mongos either through the localhost 
exception (page 285) or as a user with the privileges required in the Prerequisites (page 349) section. 
The following example operation connects to the MongoDB instance as a user named roleManager: 
mongo --port 27017 -u roleManager -p 12345678 --authenticationDatabase admin 
Step 2: Verify your privileges. Use the usersInfo command with the showPrivileges option. 
The following example operation checks privileges for a user connected as manager: 
db.runCommand( 
{ 
usersInfo:"manager", 
showPrivileges:true 
} 
) 
The resulting users document displays the privileges granted to manager. 
Step 3: Identify the user’s roles and privileges. To display the roles and privileges of the user to be modified, use 
the db.getUser() and db.getRole() methods, as described in Verify User Privileges (page 350). 
To display the privileges granted by siteRole01 on the current database, issue: 
db.getRole( "siteRole01", { showPrivileges: true } ) 
Step 4: Identify the privileges to grant or revoke. Determine which role contains the privileges and only those
privileges. If no such role exists, granting the privileges requires creating a new role (page 347) with
the specific set of privileges. To revoke a subset of privileges provided by an existing role: revoke the original role,
create a new role (page 347) that contains the privileges to keep, and then grant that role to the user.
Step 5: Grant a role to a user. Grant the user the role using the db.grantRolesToUser() method. 
For example: 
use admin 
db.grantRolesToUser( 
"accountAdmin01", 
[ 
{ 
role: "readWrite", db: "products" 
}, 
{ 
role: "readAnyDatabase", db:"admin" 
} 
] 
) 
Verify User Privileges 
Overview 
A user’s privileges determine the access the user has to MongoDB resources (page 373) and the actions (page 375) 
that user can perform. Users receive privileges through role assignments. A user can have multiple roles, and each 
role can have multiple privileges. 
For an overview of roles and privileges, see Authorization (page 285). 
Prerequisites 
To view a role’s information, you must be explicitly granted the role or must have the viewRole (page 376) action 
(page 375) on the role’s database. 
Procedure 
Step 1: Identify the user’s roles. Use the usersInfo command or db.getUser() method to display user 
information. The roles (page 372) array specifies the user’s roles. 
For example, to view roles for accountUser01 on the accounts database, issue the following: 
use accounts 
db.getUser("accountUser01") 
The roles (page 372) array displays all roles for accountUser01: 
"roles" : [ 
{ 
"role" : "readWrite", 
"db" : "accounts" 
}, 
{ 
"role" : "siteRole01", 
"db" : "records" 
} 
] 
Step 2: Identify the privileges granted by the roles. For a given role, use the rolesInfo command or 
db.getRole() method, and include the showPrivileges parameter. The resulting role document displays 
both privileges granted directly and roles from which this role inherits privileges. 
For example, to view the privileges granted by siteRole01 on the records database, use the following operation, 
which returns a document with a privileges (page 370) array: 
use records 
db.getRole( "siteRole01", { showPrivileges: true } ) 
The returned document includes the roles (page 370) and privileges (page 370) arrays: 
"roles" : [ 
{ 
"role" : "read", 
"db" : "corporate" 
} 
], 
"privileges" : [ 
{ 
"resource" : { 
"db" : "records", 
"collection" : "" 
}, 
"actions" : [ 
"find", 
"insert", 
"update" 
] 
} 
] 
To view the privileges granted by the read (page 362) role, use db.getRole() again with the appropriate parameters.
Modify a User’s Access 
Overview 
When a user’s responsibilities change, modify the user’s access to include only those roles the user requires. This 
follows the policy of least privilege. 
To change a user’s access, first determine the privileges the user needs and then determine the roles that grant those
privileges. Grant and revoke roles using the db.grantRolesToUser() and db.revokeRolesFromUser()
methods.
For an overview of roles and privileges, see Authorization (page 285). For descriptions of the access each built-in role 
provides, see the section on built-in roles (page 361). 
Prerequisites 
You must have the grantRole (page 376) action (page 375) on a database to grant a role on that database. 
You must have the revokeRole (page 376) action (page 375) on a database to revoke a role on that database. 
To view a role’s information, you must be explicitly granted the role or must have the viewRole (page 376) action 
(page 375) on the role’s database. 
Procedure 
Step 1: Connect to MongoDB with the appropriate privileges. Connect to the mongod or mongos either through 
the localhost exception (page 285) or as a user with the privileges required in the Prerequisites (page 352) section. 
The following example operation connects to MongoDB as an authenticated user named manager: 
mongo --port 27017 -u manager -p 12345678 --authenticationDatabase admin 
Step 2: Verify your privileges. Use the usersInfo command with the showPrivileges option. 
The following example operation checks privileges for a user connected as manager: 
db.runCommand( 
{ 
usersInfo:"manager", 
showPrivileges:true 
} 
) 
The resulting users document displays the privileges granted to manager. 
Step 3: Identify the user’s roles and privileges. To display the roles and privileges of the user to be modified, use 
the db.getUser() and db.getRole() methods, as described in Verify User Privileges (page 350). 
To display the privileges granted by siteRole01 on the current database, issue: 
db.getRole( "siteRole01", { showPrivileges: true } ) 
Step 4: Identify the privileges to grant or revoke. Determine which role contains the privileges and only those
privileges. If no such role exists, granting the privileges requires creating a new role (page 347) with
the specific set of privileges. To revoke a subset of privileges provided by an existing role: revoke the original role,
create a new role (page 347) that contains the privileges to keep, and then grant that role to the user.
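The revoke-and-replace pattern can be sketched as follows. The readUpdateOnly role name and its privilege set are hypothetical; adapt them to your deployment:

```javascript
use accounts
// 1. Revoke the role that grants more than the user needs.
db.revokeRolesFromUser( "accountUser01", [ { role: "readWrite", db: "accounts" } ] )

// 2. Create a narrower role containing only the privileges to keep.
db.createRole(
  {
    role: "readUpdateOnly",
    privileges: [
      { resource: { db: "accounts", collection: "" }, actions: [ "find", "update" ] }
    ],
    roles: []
  }
)

// 3. Grant the narrower role to the user.
db.grantRolesToUser( "accountUser01", [ { role: "readUpdateOnly", db: "accounts" } ] )
```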
Step 5: Modify the user’s access. 
Revoke a Role Revoke a role with the db.revokeRolesFromUser() method. Access revocations apply as 
soon as the user tries to run a command. On a mongos, revocations are instant on the mongos on which the command
ran, but there is up to a 10-minute delay before the user cache is updated on the other mongos instances in the cluster. 
The following example operation removes the readWrite (page 362) role on the accounts database from the 
accountUser01 user’s existing roles: 
use accounts 
db.revokeRolesFromUser( 
"accountUser01", 
[ 
{ role: "readWrite", db: "accounts" } 
] 
) 
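If the propagation delay is unacceptable, you can flush a mongos instance's user cache immediately with the invalidateUserCache command; this requires the invalidateUserCache privilege action and must be run on each affected mongos:

```javascript
db.adminCommand( { invalidateUserCache: 1 } )
```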
Grant a Role Grant a role using the db.grantRolesToUser() method. For example, the following operation 
grants the accountUser01 user the read (page 362) role on the records database: 
use accounts 
db.grantRolesToUser( 
"accountUser01", 
[ 
{ role: "read", db: "records" } 
] 
) 
View Roles 
Overview 
A role (page 285) grants privileges to the users who are assigned the role. Each role is scoped to a particular 
database, but MongoDB stores all role information in the admin.system.roles (page 270) collection in the 
admin database. 
Prerequisites 
To view a role’s information, you must be explicitly granted the role or must have the viewRole (page 376) action 
(page 375) on the role’s database. 
Procedures 
The following procedures use the rolesInfo command. You can also use the db.getRole() (singular)
and db.getRoles() methods.
View a Role in the Current Database If the role is in the current database, you can refer to the role by name, as for 
the role dataEntry on the current database: 
db.runCommand({ rolesInfo: "dataEntry" }) 
View a Role in a Different Database If the role is in a different database, specify the role as a document. Use the 
following form: 
{ role: "<role name>", db: "<role db>" } 
To view the custom appWriter role in the orders database, issue the following command from the mongo shell: 
db.runCommand({ rolesInfo: { role: "appWriter", db: "orders" } }) 
View Multiple Roles To view information for multiple roles, specify each role as a document or string in an array. 
To view the custom appWriter and clientWriter roles in the orders database, as well as the dataEntry 
role on the current database, use the following command from the mongo shell: 
db.runCommand( { rolesInfo: [ { role: "appWriter", db: "orders" }, 
{ role: "clientWriter", db: "orders" }, 
"dataEntry" ] 
} ) 
View All Custom Roles To view all custom roles, query the admin.system.roles (page 369) collection directly, for
example:
db = db.getSiblingDB('admin') 
db.system.roles.find() 
Change a User’s Password 
Changed in version 2.6. 
Overview 
Strong passwords help prevent unauthorized access, and all users should have strong passwords. You can use the 
openssl program to generate unique strings for use in passwords, as in the following command: 
openssl rand -base64 48 
Prerequisites 
You must have the changeAnyPassword action (page 375) on a database to modify the password of any user on 
that database. 
You must have the changeOwnPassword (page 375) action (page 375) on your database to change your own 
password. 
Procedure 
Step 1: Connect to MongoDB with the appropriate privileges. Connect to the mongod or mongos with the 
privileges required in the Prerequisites (page 354) section. 
The following example operation connects to MongoDB as an authenticated user named manager: 
mongo --port 27017 -u manager -p 12345678 --authenticationDatabase admin 
Step 2: Verify your privileges. Use the usersInfo command with the showPrivileges option. 
The following example operation checks privileges for a user connected as manager: 
db.runCommand( 
{ 
usersInfo:"manager", 
showPrivileges:true 
} 
) 
The resulting users document displays the privileges granted to manager. 
Step 3: Change the password. Pass the user’s username and the new password to the 
db.changeUserPassword() method. 
The following operation changes the reporting user’s password to SOh3TbYhxuLiW8ypJPxmt1oOfL: 
db.changeUserPassword("reporting", "SOh3TbYhxuLiW8ypJPxmt1oOfL") 
Change Your Password and Custom Data 
Changed in version 2.6. 
Overview 
Users with appropriate privileges can change their own passwords and custom data. Custom data (page 373) stores 
optional user information. 
Considerations 
To generate a strong password for use in this procedure, you can use the openssl utility’s rand command. For 
example, issue openssl rand with the following options to create a base64-encoded string of 48 pseudo-random 
bytes: 
openssl rand -base64 48 
Prerequisites 
To modify your own password or custom data, you must have the changeOwnPassword (page 375) and 
changeOwnCustomData (page 375) actions (page 375) respectively on the cluster resource. 
Procedure 
Step 1: Connect with the appropriate privileges. Connect to the mongod or mongos with your username and 
current password. 
For example, the following operation connects to MongoDB as an authenticated user named manager. 
mongo --port 27017 -u manager -p 12345678 --authenticationDatabase admin 
Step 2: Verify your privileges. To check that you have the privileges specified in the Prerequisites (page 356) 
section, use the usersInfo command with the showPrivileges option. 
The following example operation checks privileges for a user connected as manager: 
db.runCommand( 
{ 
usersInfo:"manager", 
showPrivileges:true 
} 
) 
The resulting users document displays the privileges granted to manager. 
Step 3: View your custom data. Use the usersInfo command to view your current custom data.
For example, the following operation returns information for the manager user: 
db.runCommand( { usersInfo: "manager" } ) 
Step 4: Change your password and custom data. Pass your username, new password, and new custom data to the
updateUser command. 
For example, the following operation changes a user’s password to KNlZmiaNUp0B and custom data to { title: 
"Senior Manager" }: 
db.runCommand( 
{ updateUser: "manager", 
pwd: "KNlZmiaNUp0B", 
customData: { title: "Senior Manager" } 
} 
) 
6.3.6 Configure System Events Auditing 
New in version 2.6. 
MongoDB Enterprise supports auditing (page 290) of various operations. A complete auditing solution must involve 
all mongod server and mongos router processes. 
The audit facility can write audit events to the console, to syslog (unavailable on Windows), to a JSON file, 
or to a BSON file. For details on the audited operations and the audit log messages, see System Event Audit Messages 
(page 380). 
Enable and Configure Audit Output 
Use the --auditDestination option to enable auditing and specify where to output the audit events. 
Output to Syslog 
To enable auditing and print audit events to the syslog (unavailable on Windows) in JSON format, specify 
syslog for the --auditDestination setting. For example: 
mongod --dbpath data/db --auditDestination syslog 
Warning: The syslog message limit can result in the truncation of the audit messages. The auditing system will 
neither detect the truncation nor error upon its occurrence. 
You may also specify these options in the configuration file: 
dbpath=data/db 
auditDestination=syslog 
Output to Console 
To enable auditing and print the audit events to standard output (i.e. stdout), specify console for the 
--auditDestination setting. For example: 
mongod --dbpath data/db --auditDestination console 
You may also specify these options in the configuration file: 
dbpath=data/db 
auditDestination=console 
Output to JSON File 
To enable auditing and print audit events to a file in JSON format, specify file for the --auditDestination setting, 
JSON for the --auditFormat setting, and the output filename for the --auditPath. The --auditPath 
option accepts either a full path name or a relative path name. For example, the following enables auditing and records 
audit events to a file with the relative path name of data/db/auditLog.json: 
mongod --dbpath data/db --auditDestination file --auditFormat JSON --auditPath data/db/auditLog.json 
The audit file rotates at the same time as the server log file. 
You may also specify these options in the configuration file: 
dbpath=data/db 
auditDestination=file 
auditFormat=JSON 
auditPath=data/db/auditLog.json 
Note: Printing audit events to a file in JSON format degrades server performance more than printing to a file in BSON 
format. 
Output to BSON File 
To enable auditing and print audit events to a file in BSON binary format, specify file for the 
--auditDestination setting, BSON for the --auditFormat setting, and the output filename for the 
--auditPath. The --auditPath option accepts either a full path name or a relative path name. For example, 
the following enables auditing and records audit events to a BSON file with the relative path name of 
data/db/auditLog.bson: 
mongod --dbpath data/db --auditDestination file --auditFormat BSON --auditPath data/db/auditLog.bson 
The audit file rotates at the same time as the server log file. 
You may also specify these options in the configuration file: 
dbpath=data/db 
auditDestination=file 
auditFormat=BSON 
auditPath=data/db/auditLog.bson 
To view the contents of the file, pass the file to the MongoDB utility bsondump. For example, the following converts 
the audit log into a human-readable form and outputs it to the terminal: 
bsondump data/db/auditLog.bson 
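Each audit record carries fields such as atype, ts, users, and param. The following is an abbreviated, illustrative sketch of a single record as bsondump might render it; the values shown are hypothetical, and System Event Audit Messages (page 380) is the authoritative reference for the format:

```json
{ "atype" : "createCollection",
  "ts" : { "$date" : "2014-09-16T10:32:11.121-0400" },
  "local" : { "ip" : "127.0.0.1", "port" : 27017 },
  "remote" : { "ip" : "127.0.0.1", "port" : 51420 },
  "users" : [ { "user" : "manager", "db" : "admin" } ],
  "roles" : [ { "role" : "userAdminAnyDatabase", "db" : "admin" } ],
  "param" : { "ns" : "test.myCollection" },
  "result" : 0 }
```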
Filter Events 
By default, the audit facility records all auditable operations (page 381). The audit feature has an --auditFilter 
option to determine which events to record. The --auditFilter option takes a document of the form: 
{ atype: <expression> } 
The <expression> is a query condition expression to match on various actions (page 381). 
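Filter documents can also be assembled programmatically before being passed on the command line. The following standalone JavaScript sketch (runnable under Node.js; the toCliArgument helper is illustrative, not part of mongod) shows the mechanics of serializing a filter and quoting it as a single shell argument:

```javascript
// Build audit filter documents and serialize them for --auditFilter.
// Both the single-operation form and the $in form are shown.
const singleFilter = { atype: "createCollection" };
const multiFilter = { atype: { $in: ["createCollection", "dropCollection"] } };

// JSON.stringify yields double-quoted keys, which the filter parser accepts;
// wrapping the document in single quotes passes it as one shell argument.
function toCliArgument(filter) {
  return `--auditFilter '${JSON.stringify(filter)}'`;
}

console.log(toCliArgument(singleFilter));
console.log(toCliArgument(multiFilter));
```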
Filter for a Single Operation Type 
For example, to audit only the createCollection (page 375) action, use the filter { atype: 
"createCollection" }: 
Tip 
To specify the filter as a command-line option, enclose the filter document in single quotes to pass the document as a 
string. 
mongod --dbpath data/db --auditDestination file --auditFilter '{ atype: "createCollection" }' --auditFormat BSON --auditPath data/db/auditLog.bson 
Filter for Multiple Operation Types 
To match on multiple operations, use the $in operator in the <expression> as in the following: 
Tip 
To specify the filter as a command-line option, enclose the filter document in single quotes to pass the document as a 
string. 
mongod --dbpath data/db --auditDestination file --auditFilter '{ atype: { $in: [ "createCollection", "dropCollection" ] } }' --auditFormat BSON --auditPath data/db/auditLog.bson 
Filter on Authentication Operations on a Single Database 
For authentication operations, you can also specify a specific database with the param.db field: 
{ atype: <expression>, "param.db": <database> } 
For example, to audit only authenticate operations that occur against the test database, use the filter { 
atype: "authenticate", "param.db": "test" }: 
Tip 
To specify the filter as a command-line option, enclose the filter document in single quotes to pass the document as a 
string. 
mongod --dbpath data/db --auth --auditDestination file --auditFilter '{ atype: "authenticate", "param.db": "test" }' --auditFormat BSON --auditPath data/db/auditLog.bson 
To filter on all authenticate operations across databases, use the filter { atype: "authenticate" }. 
6.3.7 Create a Vulnerability Report 
If you believe you have discovered a vulnerability in MongoDB or have experienced a security incident related to 
MongoDB, please report the issue to aid in its resolution. 
To report an issue, we strongly suggest filing a ticket in the SECURITY53 project in JIRA. MongoDB, Inc. responds to 
vulnerability notifications within 48 hours. 
Create the Report in JIRA 
Submit a ticket in the Security54 project at: <http://jira.mongodb.org/browse>. The ticket number will become the 
reference identifier for the issue for its lifetime. You can use this identifier for tracking purposes. 
Information to Provide 
All vulnerability reports should contain as much information as possible so MongoDB’s developers can move quickly 
to resolve the issue. In particular, please include the following: 
• The name of the product. 
• Common Vulnerability information, if applicable, including: 
– CVSS (Common Vulnerability Scoring System) Score. 
– CVE (Common Vulnerability and Exposures) Identifier. 
• Contact information, including an email address and/or phone number, if applicable. 
53https://jira.mongodb.org/browse/SECURITY 
54https://jira.mongodb.org/browse/SECURITY 
Send the Report via Email 
While JIRA is the preferred reporting method, you may also report vulnerabilities via email to 
security@mongodb.com55. 
You may encrypt email using MongoDB’s public key at http://docs.mongodb.org/10gen-security-gpg-key.asc. 
MongoDB, Inc. responds to vulnerability reports sent via email with a response email that contains a reference number 
for a JIRA ticket posted to the SECURITY56 project. 
Evaluation of a Vulnerability Report 
MongoDB, Inc. validates all submitted vulnerabilities and uses JIRA to track all communications regarding a vulnerability, 
including requests for clarification or additional information. If needed, MongoDB representatives set up a 
conference call to exchange information regarding the vulnerability. 
Disclosure 
MongoDB, Inc. requests that you do not publicly disclose any information regarding the vulnerability or exploit the 
issue until it has had the opportunity to analyze the vulnerability, to respond to the notification, and to notify key users, 
customers, and partners. 
The amount of time required to validate a reported vulnerability depends on the complexity and severity of the issue. 
MongoDB, Inc. takes all reported vulnerabilities very seriously and will always ensure that there is a clear and open 
channel of communication with the reporter. 
After validating an issue, MongoDB, Inc. coordinates public disclosure of the issue with the reporter in a mutually 
agreed timeframe and format. If required or requested, the reporter of a vulnerability will receive credit in the published 
security bulletin. 
6.4 Security Reference 
6.4.1 Security Methods in the mongo Shell 
Name Description 
db.auth() Authenticates a user to a database. 
55security@mongodb.com 
56https://jira.mongodb.org/browse/SECURITY 
User Management Methods 
Name Description 
db.addUser() Deprecated. Adds a user to a database, and allows administrators to configure the 
user’s privileges. 
db.changeUserPassword() Changes an existing user’s password. 
db.createUser() Creates a new user. 
db.dropAllUsers() Deletes all users associated with a database. 
db.dropUser() Removes a single user. 
db.getUser() Returns information about the specified user. 
db.getUsers() Returns information about all users associated with a database. 
db.grantRolesToUser() Grants a role and its privileges to a user. 
db.removeUser() Deprecated. Removes a user from a database. 
db.revokeRolesFromUser() Removes a role from a user. 
db.updateUser() Updates user data. 
Role Management Methods 
Name Description 
db.createRole() Creates a role and specifies its privileges. 
db.dropAllRoles() Deletes all user-defined roles associated with a database. 
db.dropRole() Deletes a user-defined role. 
db.getRole() Returns information for the specified role. 
db.getRoles() Returns information for all the user-defined roles in a database. 
db.grantPrivilegesToRole() Assigns privileges to a user-defined role. 
db.grantRolesToRole() Specifies roles from which a user-defined role inherits privileges. 
db.revokePrivilegesFromRole() Removes the specified privileges from a user-defined role. 
db.revokeRolesFromRole() Removes an inherited role from a role. 
db.updateRole() Updates a user-defined role. 
6.4.2 Security Reference Documentation 
Built-In Roles (page 361) Reference on MongoDB provided roles and corresponding access. 
system.roles Collection (page 369) Describes the content of the collection that stores user-defined roles. 
system.users Collection (page 372) Describes the content of the collection that stores users’ credentials and role assignments. 
Resource Document (page 373) Describes the resource document for roles. 
Privilege Actions (page 375) List of the actions available for privileges. 
Default MongoDB Port (page 380) List of default ports used by MongoDB. 
System Event Audit Messages (page 380) Reference on system event audit messages. 
Built-In Roles 
MongoDB grants access to data and commands through role-based authorization (page 285) and provides built-in 
roles that provide the different levels of access commonly needed in a database system. You can additionally create 
user-defined roles (page 286). 
A role grants privileges to perform sets of actions (page 375) on defined resources (page 373). A given role applies to 
the database on which it is defined and can grant access down to a collection level of granularity. 
Each of MongoDB’s built-in roles defines access at the database level for all non-system collections in the role’s 
database and at the collection level for all system collections (page 270). 
MongoDB provides the built-in database user (page 362) and database administration (page 363) roles on every 
database. MongoDB provides all other built-in roles only on the admin database. 
This section describes the privileges for each built-in role. You can also view the privileges for a built-in role at any 
time by issuing the rolesInfo command with the showPrivileges and showBuiltinRoles fields both set 
to true. 
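For example, one way to issue this from the mongo shell is the following, which returns all roles on the current database, including the built-in roles, together with their privileges:

```javascript
db.runCommand(
   { rolesInfo: 1, showPrivileges: true, showBuiltinRoles: true }
)
```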
Database User Roles 
Every database includes the following client roles: 
read 
Provides the ability to read data on all non-system collections and on the following system collections: 
system.indexes (page 271), system.js (page 271), and system.namespaces (page 271). 
The role provides read access by granting the following actions (page 375): 
•collStats (page 379) 
•dbHash (page 379) 
•dbStats (page 379) 
•find (page 375) 
•killCursors (page 376) 
readWrite 
Provides all the privileges of the read (page 362) role plus the ability to modify data on all non-system collections 
and the system.js (page 271) collection. The role provides the following actions on those collections: 
•collStats (page 379) 
•convertToCapped (page 378) 
•createCollection (page 375) 
•dbHash (page 379) 
•dbStats (page 379) 
•dropCollection (page 376) 
•createIndex (page 375) 
•dropIndex (page 378) 
•emptycapped (page 376) 
•find (page 375) 
•insert (page 375) 
•killCursors (page 376) 
•remove (page 375) 
•renameCollectionSameDB (page 378) 
•update (page 375) 
Database Administration Roles 
Every database includes the following database administration roles: 
dbAdmin 
Provides the following actions (page 375) on the database’s system.indexes (page 271), 
system.namespaces (page 271), and system.profile (page 271) collections: 
•collStats (page 379) 
•dbHash (page 379) 
•dbStats (page 379) 
•find (page 375) 
•killCursors (page 376) 
•dropCollection (page 376) on system.profile (page 271) only 
Provides the following actions on all non-system collections. This role does not include full read access on 
non-system collections: 
•collMod (page 378) 
•collStats (page 379) 
•compact (page 378) 
•convertToCapped (page 378) 
•createCollection (page 375) 
•createIndex (page 375) 
•dbStats (page 379) 
•dropCollection (page 376) 
•dropDatabase (page 378) 
•dropIndex (page 378) 
•enableProfiler (page 376) 
•indexStats (page 379) 
•reIndex (page 378) 
•renameCollectionSameDB (page 378) 
•repairDatabase (page 378) 
•storageDetails (page 377) 
•validate (page 379) 
dbOwner 
The database owner can perform any administrative action on the database. This role combines the privileges 
granted by the readWrite (page 362), dbAdmin (page 363) and userAdmin (page 363) roles. 
userAdmin 
Provides the ability to create and modify roles and users on the current database. This role also indirectly 
provides superuser (page 368) access to either the database or, if scoped to the admin database, the cluster. 
The userAdmin (page 363) role allows users to grant any user any privilege, including themselves. 
The userAdmin (page 363) role explicitly provides the following actions: 
•changeCustomData (page 375) 
•changePassword (page 375) 
•createRole (page 375) 
•createUser (page 376) 
•dropRole (page 376) 
•dropUser (page 376) 
•grantRole (page 376) 
•revokeRole (page 376) 
•viewRole (page 376) 
•viewUser (page 376) 
Cluster Administration Roles 
The admin database includes the following roles for administering the whole system rather than just a single database. 
These roles include but are not limited to replica set and sharded cluster administrative functions. 
clusterAdmin 
Provides the greatest cluster-management access. This role combines the privileges granted by the 
clusterManager (page 364), clusterMonitor (page 365), and hostManager (page 366) roles. Additionally, 
the role provides the dropDatabase (page 378) action. 
clusterManager 
Provides management and monitoring actions on the cluster. A user with this role can access the config and 
local databases, which are used in sharding and replication, respectively. 
Provides the following actions on the cluster as a whole: 
•addShard (page 377) 
•applicationMessage (page 378) 
•cleanupOrphaned (page 376) 
•flushRouterConfig (page 377) 
•listShards (page 377) 
•removeShard (page 377) 
•replSetConfigure (page 377) 
•replSetGetStatus (page 377) 
•replSetStateChange (page 377) 
•resync (page 377) 
Provides the following actions on all databases in the cluster: 
•enableSharding (page 377) 
•moveChunk (page 377) 
•splitChunk (page 378) 
•splitVector (page 378) 
On the config database, provides the following actions on the settings (page 683) collection: 
•insert (page 375) 
•remove (page 375) 
•update (page 375) 
On the config database, provides the following actions on all configuration collections and on the 
system.indexes (page 271), system.js (page 271), and system.namespaces (page 271) collections: 
•collStats (page 379) 
•dbHash (page 379) 
•dbStats (page 379) 
•find (page 375) 
•killCursors (page 376) 
On the local database, provides the following actions on the replset (page 600) collection: 
•collStats (page 379) 
•dbHash (page 379) 
•dbStats (page 379) 
•find (page 375) 
•killCursors (page 376) 
clusterMonitor 
Provides read-only access to monitoring tools, such as the MongoDB Management Service (MMS)57 monitoring 
agent. 
Provides the following actions on the cluster as a whole: 
•connPoolStats (page 379) 
•cursorInfo (page 379) 
•getCmdLineOpts (page 379) 
•getLog (page 379) 
•getParameter (page 378) 
•getShardMap (page 377) 
•hostInfo (page 378) 
•inprog (page 376) 
•listDatabases (page 379) 
•listShards (page 377) 
•netstat (page 379) 
•replSetGetStatus (page 377) 
•serverStatus (page 379) 
•shardingState (page 377) 
•top (page 379) 
57http://mms.mongodb.com/help/ 
Provides the following actions on all databases in the cluster: 
•collStats (page 379) 
•dbStats (page 379) 
•getShardVersion (page 377) 
Provides the find (page 375) action on all system.profile (page 271) collections in the cluster. 
Provides the following actions on the config database’s configuration collections and system.indexes 
(page 271), system.js (page 271), and system.namespaces (page 271) collections: 
•collStats (page 379) 
•dbHash (page 379) 
•dbStats (page 379) 
•find (page 375) 
•killCursors (page 376) 
hostManager 
Provides the ability to monitor and manage servers. 
Provides the following actions on the cluster as a whole: 
•applicationMessage (page 378) 
•closeAllDatabases (page 378) 
•connPoolSync (page 378) 
•cpuProfiler (page 376) 
•diagLogging (page 379) 
•flushRouterConfig (page 377) 
•fsync (page 378) 
•invalidateUserCache (page 376) 
•killop (page 376) 
•logRotate (page 378) 
•resync (page 377) 
•setParameter (page 379) 
•shutdown (page 379) 
•touch (page 379) 
•unlock (page 376) 
Provides the following actions on all databases in the cluster: 
•killCursors (page 376) 
•repairDatabase (page 378) 
Backup and Restoration Roles 
The admin database includes the following roles for backing up and restoring data: 
backup 
Provides minimal privileges needed for backing up data. This role provides sufficient privileges to use the 
MongoDB Management Service (MMS)58 backup agent, or to use mongodump to back up an entire mongod 
instance. 
Provides the following actions (page 375) on the mms.backup collection in the admin database: 
•insert (page 375) 
•update (page 375) 
Provides the listDatabases (page 379) action on the cluster as a whole. 
Provides the find (page 375) action on the following: 
•all non-system collections in the cluster 
•all the following system collections in the cluster: system.indexes (page 271), 
system.namespaces (page 271), and system.js (page 271) 
•the admin.system.users (page 271) and admin.system.roles (page 270) collections 
•legacy system.users collections from versions of MongoDB prior to 2.6 
restore 
Provides minimal privileges needed for restoring data from backups. This role provides sufficient privileges to 
use the mongorestore tool to restore an entire mongod instance. 
Provides the following actions on all non-system collections and system.js (page 271) collections in the 
cluster; on the admin.system.users (page 271) and admin.system.roles (page 270) collections in 
the admin database; and on legacy system.users collections from versions of MongoDB prior to 2.6: 
•collMod (page 378) 
•createCollection (page 375) 
•createIndex (page 375) 
•dropCollection (page 376) 
•insert (page 375) 
Provides the following additional actions on admin.system.users (page 271) and legacy 
system.users collections: 
•find (page 375) 
•remove (page 375) 
•update (page 375) 
Provides the find (page 375) action on all the system.namespaces (page 271) collections in the cluster. 
Although restore (page 367) includes the ability to modify the documents in the admin.system.users 
(page 271) collection using normal modification operations, modify these data only with the user management 
methods. 
58http://mms.mongodb.com/help/ 
All-Database Roles 
The admin database provides the following roles that apply to all databases in a mongod instance and are roughly 
equivalent to their single-database counterparts: 
readAnyDatabase 
Provides the same read-only permissions as read (page 362), except it applies to all databases in the cluster. 
The role also provides the listDatabases (page 379) action on the cluster as a whole. 
readWriteAnyDatabase 
Provides the same read and write permissions as readWrite (page 362), except it applies to all databases in 
the cluster. The role also provides the listDatabases (page 379) action on the cluster as a whole. 
userAdminAnyDatabase 
Provides the same access to user administration operations as userAdmin (page 363), except it applies to all 
databases in the cluster. The role also provides the following actions on the cluster as a whole: 
•authSchemaUpgrade (page 376) 
•invalidateUserCache (page 376) 
•listDatabases (page 379) 
The role also provides the following actions on the admin.system.users (page 271) and 
admin.system.roles (page 270) collections on the admin database, and on legacy system.users 
collections from versions of MongoDB prior to 2.6: 
•collStats (page 379) 
•dbHash (page 379) 
•dbStats (page 379) 
•find (page 375) 
•killCursors (page 376) 
•planCacheRead (page 376) 
The userAdminAnyDatabase (page 368) role does not restrict the permissions that a user can grant. As 
a result, userAdminAnyDatabase (page 368) users can grant themselves privileges in excess of their current 
privileges and can even grant themselves all privileges, even though the role does not explicitly authorize 
privileges beyond user administration. This role is effectively a MongoDB system superuser (page 368). 
dbAdminAnyDatabase 
Provides the same access to database administration operations as dbAdmin (page 363), except it applies to 
all databases in the cluster. The role also provides the listDatabases (page 379) action on the cluster as a 
whole. 
Superuser Roles 
Several roles provide either indirect or direct system-wide superuser access. 
The following roles provide the ability to assign any user any privilege on any database, which means that users with 
one of these roles can assign themselves any privilege on any database: 
• dbOwner (page 363) role, when scoped to the admin database 
• userAdmin (page 363) role, when scoped to the admin database 
• userAdminAnyDatabase (page 368) role 
The following role provides full privileges on all resources: 
root 
Provides access to the operations and all the resources of the readWriteAnyDatabase (page 368), 
dbAdminAnyDatabase (page 368), userAdminAnyDatabase (page 368) and clusterAdmin 
(page 364) roles combined. 
root (page 368) does not include the ability to insert data directly into the system.users (page 271) and 
system.roles (page 270) collections in the admin database. Therefore, root (page 368) is not suitable for 
using mongorestore to restore data that includes these collections. To perform these kinds of restore operations, 
provision users with the restore (page 367) role. 
Internal Role 
__system 
MongoDB assigns this role to user objects that represent cluster members, such as replica set members and 
mongos instances. The role entitles its holder to take any action against any object in the database. 
Do not assign this role to user objects representing applications or human administrators, other than in exceptional 
circumstances. 
If you need access to all actions on all resources, for example to run the eval or applyOps commands, do 
not assign this role. Instead, create a user-defined role that grants anyAction (page 380) on anyResource 
(page 375) and ensure that only the users who need access to these operations have this access. 
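Such a role can be sketched as follows in the mongo shell; the role name superRole is illustrative:

```javascript
use admin
db.createRole(
   {
     role: "superRole",   // illustrative name
     privileges: [
       { resource: { anyResource: true }, actions: [ "anyAction" ] }
     ],
     roles: []
   }
)
```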
system.roles Collection 
New in version 2.6. 
The system.roles collection in the admin database stores the user-defined roles. To create and manage these 
user-defined roles, MongoDB provides role management commands. 
system.roles Schema 
The documents in the system.roles collection have the following schema: 
{ 
_id: <system-defined id>, 
role: "<role name>", 
db: "<database>", 
privileges: 
[ 
{ 
resource: { <resource> }, 
actions: [ "<action>", ... ] 
}, 
... 
], 
roles: 
[ 
{ role: "<role name>", db: "<database>" }, 
... 
] 
} 
A system.roles document has the following fields: 
admin.system.roles.role 
The role (page 369) field is a string that specifies the name of the role. 
admin.system.roles.db 
The db (page 370) field is a string that specifies the database to which the role belongs. MongoDB uniquely 
identifies each role by the pairing of its name (i.e. role (page 369)) and its database. 
admin.system.roles.privileges 
The privileges (page 370) array contains the privilege documents that define the privileges (page 286) for 
the role. 
A privilege document has the following syntax: 
{ 
resource: { <resource> }, 
actions: [ "<action>", ... ] 
} 
Each privilege document has the following fields: 
admin.system.roles.privileges[n].resource 
A document that specifies the resources upon which the privilege actions (page 370) apply. The document 
has one of the following forms: 
{ db: <database>, collection: <collection> } 
or 
{ cluster : true } 
See Resource Document (page 373) for more details. 
admin.system.roles.privileges[n].actions 
An array of actions permitted on the resource. For a list of actions, see Privilege Actions (page 375). 
admin.system.roles.roles 
The roles (page 370) array contains role documents that specify the roles from which this role inherits 
(page 286) privileges. 
A role document has the following syntax: 
{ role: "<role name>", db: "<database>" } 
A role document has the following fields: 
admin.system.roles.roles[n].role 
The name of the role. A role can be a built-in role (page 361) provided by MongoDB or a user-defined 
role (page 286). 
admin.system.roles.roles[n].db 
The name of the database where the role is defined. 
Examples 
Consider the following sample documents found in the system.roles collection of the admin database. 
A User-Defined Role Specifies Privileges The following is a sample document for a user-defined role appUser 
defined for the myApp database: 
{ 
_id: "myApp.appUser", 
role: "appUser", 
db: "myApp", 
privileges: [ 
{ resource: { db: "myApp" , collection: "" }, 
actions: [ "find", "createCollection", "dbStats", "collStats" ] }, 
{ resource: { db: "myApp", collection: "logs" }, 
actions: [ "insert" ] }, 
{ resource: { db: "myApp", collection: "data" }, 
actions: [ "insert", "update", "remove", "compact" ] }, 
{ resource: { db: "myApp", collection: "system.indexes" }, 
actions: [ "find" ] }, 
{ resource: { db: "myApp", collection: "system.namespaces" }, 
actions: [ "find" ] }, 
], 
roles: [] 
} 
The privileges array lists the five privileges that the appUser role specifies: 
• The first privilege permits its actions ("find", "createCollection", "dbStats", "collStats") on 
all the collections in the myApp database excluding its system collections. See Specify a Database as Resource 
(page 373). 
• The next two privileges permit additional actions on specific collections, logs and data, in the myApp 
database. See Specify a Collection of a Database as Resource (page 373). 
• The last two privileges permit actions on two system collections (page 270) in the myApp database. While 
the first privilege gives database-wide permission for the find action, the action does not apply to myApp's 
system collections. To give access to a system collection, a privilege must explicitly specify the collection. See 
Resource Document (page 373). 
As indicated by the empty roles array, appUser inherits no additional privileges from other roles. 
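Documents in system.roles are not inserted directly; they result from the role management commands. For instance, the following is a sketch of the db.createRole() call, run against the myApp database, that would produce the appUser document above:

```javascript
use myApp
db.createRole(
   {
     role: "appUser",
     privileges: [
       { resource: { db: "myApp", collection: "" },
         actions: [ "find", "createCollection", "dbStats", "collStats" ] },
       { resource: { db: "myApp", collection: "logs" }, actions: [ "insert" ] },
       { resource: { db: "myApp", collection: "data" },
         actions: [ "insert", "update", "remove", "compact" ] },
       { resource: { db: "myApp", collection: "system.indexes" }, actions: [ "find" ] },
       { resource: { db: "myApp", collection: "system.namespaces" }, actions: [ "find" ] }
     ],
     roles: []
   }
)
```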
User-Defined Role Inherits from Other Roles The following is a sample document for a user-defined role 
appAdmin defined for the myApp database. The document shows that the appAdmin role specifies privileges 
as well as inherits privileges from other roles: 
{ 
_id: "myApp.appAdmin", 
role: "appAdmin", 
db: "myApp", 
privileges: [ 
{ 
resource: { db: "myApp", collection: "" }, 
actions: [ "insert", "dbStats", "collStats", "compact", "repairDatabase" ] 
} 
], 
roles: [ 
{ role: "appUser", db: "myApp" } 
] 
} 
The privileges array lists the privileges that the appAdmin role specifies. This role has a single privilege that 
permits its actions ("insert", "dbStats", "collStats", "compact", "repairDatabase") on all the 
collections in the myApp database excluding its system collections. See Specify a Database as Resource (page 373). 
The roles array lists the roles, identified by the role names and databases, from which the role appAdmin inherits 
privileges. 
system.users Collection 
Changed in version 2.6. 
The system.users collection in the admin database stores user authentication (page 282) and authorization 
(page 285) information. To manage data in this collection, MongoDB provides user management commands. 
system.users Schema 
The documents in the system.users collection have the following schema: 
{ 
_id: <system defined id>, 
user: "<name>", 
db: "<database>", 
credentials: { <authentication credentials> }, 
roles: [ 
{ role: "<role name>", db: "<database>" }, 
... 
], 
customData: <custom information> 
} 
Each system.users document has the following fields: 
admin.system.users.user 
The user (page 372) field is a string that identifies the user. A user exists in the context of a single logical 
database but can have access to other databases through roles specified in the roles (page 372) array. 
admin.system.users.db 
The db (page 372) field specifies the database associated with the user. The user’s privileges are not necessarily 
limited to this database. The user can have privileges in additional databases through the roles (page 372) 
array. 
admin.system.users.credentials 
The credentials (page 372) field contains the user’s authentication information. For users with externally 
stored authentication credentials, such as users that use Kerberos (page 331) or x.509 certificates for authentication, 
the system.users document for that user does not contain the credentials (page 372) field. 
admin.system.users.roles 
The roles (page 372) array contains role documents that specify the roles granted to the user. The array 
contains both built-in roles (page 361) and user-defined roles (page 286). 
A role document has the following syntax: 
{ role: "<role name>", db: "<database>" } 
A role document has the following fields: 
admin.system.users.roles[n].role 
The name of a role. A role can be a built-in role (page 361) provided by MongoDB or a custom user-defined 
role (page 286). 
admin.system.users.roles[n].db 
The name of the database where role is defined. 
When specifying a role using the role management or user management commands, you can specify the role 
name alone (e.g. "readWrite") if the role exists on the database on which the command is run. 
admin.system.users.customData 
The customData (page 373) field contains optional custom information about the user. 
Example 
Consider the following document in the system.users collection: 
{ 
_id: "home.Kari", 
user: "Kari", 
db: "home", 
credentials: { "MONGODB-CR" :"<hashed password>" }, 
roles : [ 
{ role: "read", db: "home" }, 
{ role: "readWrite", db: "test" }, 
{ role: "appUser", db: "myApp" } 
], 
customData: { zipCode: "64157" } 
} 
The document shows that a user Kari is associated with the home database. Kari has the read (page 362) role 
in the home database, the readWrite (page 362) role in the test database, and the appUser role in the myApp 
database. 
Resource Document 
The resource document specifies the resources upon which a privilege permits actions. 
Database and/or Collection Resource 
To specify databases and/or collections, use the following syntax: 
{ db: <database>, collection: <collection> } 
Specify a Collection of a Database as Resource If the resource document specifies both the db and collection 
fields as non-empty strings, the resource is the specified collection in the specified database. For example, the following 
document specifies a resource of the inventory collection in the products database: 
{ db: "products", collection: "inventory" } 
For a user-defined role scoped for a non-admin database, the resource specification for its privileges must specify the 
same database as the role. User-defined roles scoped for the admin database can specify other databases. 
Specify a Database as Resource If only the collection field is an empty string (""), the resource is the specified 
database, excluding the system collections (page 270). For example, the following resource document specifies the 
resource of the test database, excluding the system collections: 
{ db: "test", collection: "" } 
For a user-defined role scoped for a non-admin database, the resource specification for its privileges must specify the 
same database as the role. User-defined roles scoped for the admin database can specify other databases. 
Note: When you specify a database as the resource, the system collections are excluded, unless you name them 
explicitly, as in the following: 
{ db: "test", collection: "system.namespaces" } 
System collections include but are not limited to the following: 
• <database>.system.profile (page 271) 
• <database>.system.namespaces (page 271) 
• <database>.system.indexes (page 271) 
• <database>.system.js (page 271) 
• local.system.replset (page 600) 
• system.users Collection (page 372) in the admin database 
• system.roles Collection (page 369) in the admin database 
Specify Collections Across Databases as Resource If only the db field is an empty string (""), the resource is all 
collections with the specified name across all databases. For example, the following document specifies the resource 
of all the accounts collections across all the databases: 
{ db: "", collection: "accounts" } 
For user-defined roles, only roles scoped for the admin database can have this resource specification for their privileges. 
Specify All Non-System Collections in All Databases If both the db and collection fields are empty strings 
(""), the resource is all collections, excluding the system collections (page 270), in all the databases: 
{ db: "", collection: "" } 
For user-defined roles, only roles scoped for the admin database can have this resource specification for their privileges. 
Cluster Resource 
To specify the cluster as the resource, use the following syntax: 
{ cluster : true } 
Use the cluster resource for actions that affect the state of the system rather than act on a specific set of databases 
or collections. Examples of such actions are shutdown, replSetReconfig, and addShard. For example, the 
following document grants the action shutdown on the cluster. 
{ resource: { cluster : true }, actions: [ "shutdown" ] } 
For user-defined roles, only roles scoped for the admin database can have this resource specification for their privileges. 
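As a summary of the resource forms above, here is a small JavaScript sketch (the helper name is hypothetical, not a MongoDB API) that classifies a resource document according to these rules:

```javascript
// Classify a privilege resource document per the rules described above.
function classifyResource(res) {
  if (res.cluster === true) return "cluster";
  if (res.db !== undefined && res.collection !== undefined) {
    if (res.db === "" && res.collection === "")
      return "all non-system collections in all databases";
    if (res.db === "")
      return "collection '" + res.collection + "' in all databases";
    if (res.collection === "")
      return "database '" + res.db + "' (excluding system collections)";
    return "collection '" + res.collection + "' in database '" + res.db + "'";
  }
  throw new Error("unrecognized resource document");
}

console.log(classifyResource({ db: "products", collection: "inventory" }));
console.log(classifyResource({ db: "test", collection: "" }));
console.log(classifyResource({ db: "", collection: "accounts" }));
console.log(classifyResource({ cluster: true }));
```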
anyResource 
The internal resource anyResource gives access to every resource in the system and is intended for internal use. 
Do not use this resource, other than in exceptional circumstances. The syntax for this resource is { anyResource: true }. 
Privilege Actions 
New in version 2.6. 
Privilege actions define the operations a user can perform on a resource (page 373). A MongoDB privilege (page 286) 
comprises a resource (page 373) and the permitted actions. This page lists available actions grouped by common 
purpose. 
MongoDB provides built-in roles with pre-defined pairings of resources and permitted actions. For lists of the actions 
granted, see Built-In Roles (page 361). To define custom roles, see Create a Role (page 347). 
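As a sketch of how resources and actions pair up in a custom role, the following builds the document that a db.createRole() call in the mongo shell would take (the role, database, and collection names are hypothetical):

```javascript
// Document form expected by the createRole command / db.createRole() helper.
// The role grants query/write actions on one collection plus one
// cluster-wide diagnostic action.
const inventoryRole = {
  role: "inventoryHandler",            // hypothetical custom role name
  privileges: [
    { resource: { db: "products", collection: "inventory" },
      actions: [ "find", "insert", "update" ] },
    { resource: { cluster: true },     // cluster actions need a cluster resource
      actions: [ "serverStatus" ] }
  ],
  roles: [ { role: "read", db: "products" } ]  // also inherit the built-in read role
};

// In the mongo shell, you would pass this document to db.createRole().
console.log(JSON.stringify(inventoryRole, null, 2));
```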
Query and Write Actions 
find 
User can perform the db.collection.find() method. Apply this action to database or collection resources. 
insert 
User can perform the insert command. Apply this action to database or collection resources. 
remove 
User can perform the db.collection.remove() method. Apply this action to database or collection 
resources. 
update 
User can perform the update command. Apply this action to database or collection resources. 
Database Management Actions 
changeCustomData 
User can change the custom information of any user in the given database. Apply this action to database 
resources. 
changeOwnCustomData 
Users can change their own custom information. Apply this action to database resources. 
changeOwnPassword 
Users can change their own passwords. Apply this action to database resources. 
changePassword 
User can change the password of any user in the given database. Apply this action to database resources. 
createCollection 
User can perform the db.createCollection() method. Apply this action to database or collection resources. 
createIndex 
Provides access to the db.collection.createIndex() method and the createIndexes command. 
Apply this action to database or collection resources. 
createRole 
User can create new roles in the given database. Apply this action to database resources. 
createUser 
User can create new users in the given database. Apply this action to database resources. 
dropCollection 
User can perform the db.collection.drop() method. Apply this action to database or collection resources. 
dropRole 
User can delete any role from the given database. Apply this action to database resources. 
dropUser 
User can remove any user from the given database. Apply this action to database resources. 
emptycapped 
User can perform the emptycapped command. Apply this action to database or collection resources. 
enableProfiler 
User can perform the db.setProfilingLevel() method. Apply this action to database resources. 
grantRole 
User can grant any role in the database to any user from any database in the system. Apply this action to database 
resources. 
killCursors 
User can kill cursors on the target collection. 
revokeRole 
User can remove any role from any user from any database in the system. Apply this action to database resources. 
unlock 
User can perform the db.fsyncUnlock() method. Apply this action to the cluster resource. 
viewRole 
User can view information about any role in the given database. Apply this action to database resources. 
viewUser 
User can view the information of any user in the given database. Apply this action to database resources. 
Deployment Management Actions 
authSchemaUpgrade 
User can perform the authSchemaUpgrade command. Apply this action to the cluster resource. 
cleanupOrphaned 
User can perform the cleanupOrphaned command. Apply this action to the cluster resource. 
cpuProfiler 
User can enable and use the CPU profiler. Apply this action to the cluster resource. 
inprog 
User can use the db.currentOp() method to return pending and active operations. Apply this action to the 
cluster resource. 
invalidateUserCache 
Provides access to the invalidateUserCache command. Apply this action to the cluster resource. 
killop 
User can perform the db.killOp() method. Apply this action to the cluster resource. 
planCacheRead 
User can perform the planCacheListPlans and planCacheListQueryShapes commands and the 
PlanCache.getPlansByQuery() and PlanCache.listQueryShapes() methods. Apply this action 
to database or collection resources. 
planCacheWrite 
User can perform the planCacheClear command and the PlanCache.clear() and 
PlanCache.clearPlansByQuery() methods. Apply this action to database or collection resources. 
storageDetails 
User can perform the storageDetails command. Apply this action to database or collection resources. 
Replication Actions 
appendOplogNote 
User can append notes to the oplog. Apply this action to the cluster resource. 
replSetConfigure 
User can configure a replica set. Apply this action to the cluster resource. 
replSetGetStatus 
User can perform the replSetGetStatus command. Apply this action to the cluster resource. 
replSetHeartbeat 
User can perform the replSetHeartbeat command. Apply this action to the cluster resource. 
replSetStateChange 
User can change the state of a replica set through the replSetFreeze, replSetMaintenance, 
replSetStepDown, and replSetSyncFrom commands. Apply this action to the cluster resource. 
resync 
User can perform the resync command. Apply this action to the cluster resource. 
Sharding Actions 
addShard 
User can perform the addShard command. Apply this action to the cluster resource. 
enableSharding 
User can enable sharding on a database using the enableSharding command and can shard a collection 
using the shardCollection command. Apply this action to database or collection resources. 
flushRouterConfig 
User can perform the flushRouterConfig command. Apply this action to the cluster resource. 
getShardMap 
User can perform the getShardMap command. Apply this action to the cluster resource. 
getShardVersion 
User can perform the getShardVersion command. Apply this action to database resources. 
listShards 
User can perform the listShards command. Apply this action to the cluster resource. 
moveChunk 
User can perform the moveChunk command. Apply this action to the cluster resource. 
removeShard 
User can perform the removeShard command. Apply this action to the cluster resource. 
shardingState 
User can perform the shardingState command. Apply this action to the cluster resource. 
splitChunk 
User can perform the splitChunk command. Apply this action to database or collection resources. 
splitVector 
User can perform the splitVector command. Apply this action to database or collection resources. 
Server Administration Actions 
applicationMessage 
User can perform the logApplicationMessage command. Apply this action to the cluster resource. 
closeAllDatabases 
User can perform the closeAllDatabases command. Apply this action to the cluster resource. 
collMod 
User can perform the collMod command. Apply this action to database or collection resources. 
compact 
User can perform the compact command. Apply this action to database or collection resources. 
connPoolSync 
User can perform the connPoolSync command. Apply this action to the cluster resource. 
convertToCapped 
User can perform the convertToCapped command. Apply this action to database or collection resources. 
dropDatabase 
User can perform the dropDatabase command. Apply this action to database resources. 
dropIndex 
User can perform the dropIndexes command. Apply this action to database or collection resources. 
fsync 
User can perform the fsync command. Apply this action to the cluster resource. 
getParameter 
User can perform the getParameter command. Apply this action to the cluster resource. 
hostInfo 
Provides information about the server the MongoDB instance runs on. Apply this action to the cluster 
resource. 
logRotate 
User can perform the logRotate command. Apply this action to the cluster resource. 
reIndex 
User can perform the reIndex command. Apply this action to database or collection resources. 
renameCollectionSameDB 
Allows the user to rename collections on the current database using the renameCollection command. 
Apply this action to database resources. 
Additionally, the user must either have find (page 375) on the source collection or not have find (page 375) 
on the destination collection. 
If a collection with the new name already exists, the user must also have the dropCollection (page 376) 
action on the destination collection. 
repairDatabase 
User can perform the repairDatabase command. Apply this action to database resources. 
setParameter 
User can perform the setParameter command. Apply this action to the cluster resource. 
shutdown 
User can perform the shutdown command. Apply this action to the cluster resource. 
touch 
User can perform the touch command. Apply this action to the cluster resource. 
Diagnostic Actions 
collStats 
User can perform the collStats command. Apply this action to database or collection resources. 
connPoolStats 
User can perform the connPoolStats and shardConnPoolStats commands. Apply this action to the 
cluster resource. 
cursorInfo 
User can perform the cursorInfo command. Apply this action to the cluster resource. 
dbHash 
User can perform the dbHash command. Apply this action to database or collection resources. 
dbStats 
User can perform the dbStats command. Apply this action to database resources. 
diagLogging 
User can perform the diagLogging command. Apply this action to the cluster resource. 
getCmdLineOpts 
User can perform the getCmdLineOpts command. Apply this action to the cluster resource. 
getLog 
User can perform the getLog command. Apply this action to the cluster resource. 
indexStats 
User can perform the indexStats command. Apply this action to database or collection resources. 
listDatabases 
User can perform the listDatabases command. Apply this action to the cluster resource. 
netstat 
User can perform the netstat command. Apply this action to the cluster resource. 
serverStatus 
User can perform the serverStatus command. Apply this action to the cluster resource. 
validate 
User can perform the validate command. Apply this action to database or collection resources. 
top 
User can perform the top command. Apply this action to the cluster resource. 
Internal Actions 
anyAction 
Allows any action on a resource. Do not assign this action except for exceptional circumstances. 
internal 
Allows internal actions. Do not assign this action except for exceptional circumstances. 
Default MongoDB Port 
The following table lists the default ports used by MongoDB: 
Port  Description 
27017 The default port for mongod and mongos instances. You can change this port with port or --port. 
27018 The default port when running with --shardsvr runtime operation or the shardsvr value for the clusterRole setting in a configuration file. 
27019 The default port when running with --configsvr runtime operation or the configsvr value for the clusterRole setting in a configuration file. 
28017 The default port for the web status page. The web status page is always accessible at a port number that is 1000 greater than the port determined by port. 
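For example, a hypothetical mongod configuration file might set the cluster role and pin the port explicitly (a sketch; adjust paths and values for a real deployment):

```yaml
# Hypothetical mongod configuration (YAML format, MongoDB 2.6+).
# Running as a shard server; mongod then defaults to port 27018
# unless net.port overrides it.
sharding:
  clusterRole: shardsvr
net:
  port: 27018        # explicit, matching the shardsvr default
```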
System Event Audit Messages 
Note: The audit system (page 290) is available only in MongoDB Enterprise (http://www.mongodb.com/products/mongodb-enterprise). 
The event auditing feature (page 290) can record events in JSON format. The recorded JSON messages have the 
following syntax: 
{ 
atype: <String>, 
ts : { "$date": <timestamp> }, 
local: { ip: <String>, port: <int> }, 
remote: { ip: <String>, port: <int> }, 
users : [ { user: <String>, db: <String> }, ... ], 
params: <document>, 
result: <int> 
} 
field String atype Action type. See Event Actions, Details, and Results (page 381). 
field document ts Document that contains the date and UTC time of the event, in ISO 8601 format. 
field document local Document that contains the local ip address and the port number of the running 
instance. 
field document remote Document that contains the remote ip address and the port number of the 
incoming connection associated with the event. 
field array users Array of user identification documents. Because MongoDB allows a session to log in 
with a different user per database, this array can have more than one user. Each document contains a 
user field for the username and a db field for the authentication database for that user. 
field document params Specific details for the event. See Event Actions, Details, and Results 
(page 381). 
field integer result Error code. See Event Actions, Details, and Results (page 381). 
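A recorded audit message can be consumed like any other line of JSON; the sketch below (sample values invented) parses one and reads out the fields described above:

```javascript
// Parse a sample audit log line and pull out the documented fields.
const line = JSON.stringify({
  atype: "authenticate",
  ts: { "$date": "2014-09-16T10:00:00.000Z" },
  local:  { ip: "127.0.0.1", port: 27017 },
  remote: { ip: "127.0.0.1", port: 51200 },
  users:  [ { user: "Kari", db: "home" } ],
  params: { user: "Kari", db: "home", mechanism: "MONGODB-CR" },
  result: 18                     // 18 - Authentication Failed
});

const event = JSON.parse(line);
if (event.atype === "authenticate" && event.result !== 0) {
  console.log("failed login for " + event.params.user + "@" +
              event.params.db + " from " + event.remote.ip);
}
```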
Event Actions, Details, and Results 
The following lists, for each atype (action type), the associated params details and the result values, if 
any. 
authenticate 
    params: { user: <user name>, db: <database>, mechanism: <mechanism> } 
    result: 0 - Success; 18 - Authentication Failed 

authCheck 
    params: { command: <name>, ns: <database>.<collection>, args: <command object> } 
    result: 0 - Success; 13 - Unauthorized to perform the operation. 
    notes: The auditing system logs only authorization failures. The ns field is optional. The args field may be redacted. 

createCollection (page 375) 
    params: { ns: <database>.<collection> } 
    result: 0 - Success 

createDatabase 
    params: { ns: <database> } 
    result: 0 - Success 

createIndex (page 375) 
    params: { ns: <database>.<collection>, indexName: <index name>, indexSpec: <full index specification> } 
    result: 0 - Success 

renameCollection 
    params: { old: <database>.<collection>, new: <database>.<collection> } 
    result: 0 - Success 

dropCollection (page 376) 
    params: { ns: <database>.<collection> } 
    result: 0 - Success 

dropDatabase (page 378) 
    params: { ns: <database> } 
    result: 0 - Success 

dropIndex (page 378) 
    params: { ns: <database>.<collection>, indexName: <index name> } 
    result: 0 - Success 

createUser (page 376) 
    params: { user: <user name>, db: <database>, customData: <document>, roles: [ <role1>, ... ] } 
    result: 0 - Success 
    notes: The customData field is optional. 

dropUser (page 376) 
    params: { user: <user name>, db: <database> } 
    result: 0 - Success 

dropAllUsersFromDatabase 
    params: { db: <database> } 
    result: 0 - Success 

updateUser 
    params: { user: <user name>, db: <database>, passwordChanged: <boolean>, customData: <document>, roles: [ <role1>, ... ] } 
    result: 0 - Success 
    notes: The customData field is optional. 

grantRolesToUser 
    params: { user: <user name>, db: <database>, roles: [ <role1>, ... ] } 
    result: 0 - Success 
    notes: The roles array contains role documents. See role Document (page 384). 

revokeRolesFromUser 
    params: { user: <user name>, db: <database>, roles: [ <role1>, ... ] } 
    result: 0 - Success 
    notes: The roles array contains role documents. See role Document (page 384). 

createRole (page 375) 
    params: { role: <role name>, db: <database>, roles: [ <role1>, ... ], privileges: [ <privilege1>, ... ] } 
    result: 0 - Success 
    notes: Either the roles or the privileges field can be omitted. The roles array contains role documents; see role Document (page 384). The privileges array contains privilege documents; see privilege Document (page 384). 

updateRole 
    params: { role: <role name>, db: <database>, roles: [ <role1>, ... ], privileges: [ <privilege1>, ... ] } 
    result: 0 - Success 
    notes: Either the roles or the privileges field can be omitted. The roles array contains role documents; see role Document (page 384). The privileges array contains privilege documents; see privilege Document (page 384). 

dropRole (page 376) 
    params: { role: <role name>, db: <database> } 
    result: 0 - Success 

dropAllRolesFromDatabase 
    params: { db: <database> } 
    result: 0 - Success 

grantRolesToRole 
    params: { role: <role name>, db: <database>, roles: [ <role1>, ... ] } 
    result: 0 - Success 
    notes: The roles array contains role documents. See role Document (page 384). 

revokeRolesFromRole 
    params: { role: <role name>, db: <database>, roles: [ <role1>, ... ] } 
    result: 0 - Success 
    notes: The roles array contains role documents. See role Document (page 384). 

grantPrivilegesToRole 
    params: { role: <role name>, db: <database>, privileges: [ <privilege1>, ... ] } 
    result: 0 - Success 
    notes: The privileges array contains privilege documents. See privilege Document (page 384). 

revokePrivilegesFromRole 
    params: { role: <role name>, db: <database name>, privileges: [ <privilege1>, ... ] } 
    result: 0 - Success 
    notes: The privileges array contains privilege documents. See privilege Document (page 384). 

replSetReconfig 
    params: { old: <configuration>, new: <configuration> } 
    result: 0 - Success 

enableSharding (page 377) 
    params: { ns: <database> } 
    result: 0 - Success 

shardCollection 
    params: { ns: <database>.<collection>, key: <shard key pattern>, options: { unique: <boolean> } } 
    result: 0 - Success 

addShard (page 377) 
    params: { shard: <shard name>, connectionString: <hostname>:<port>, maxSize: <maxSize> } 
    result: 0 - Success 
    notes: When a shard is a replica set, the connectionString includes the replica set name and can include other members of the replica set. 

removeShard (page 377) 
    params: { shard: <shard name> } 
    result: 0 - Success 

shutdown (page 379) 
    params: { } 
    result: 0 - Success 
    notes: Indicates commencement of database shutdown. 

applicationMessage (page 378) 
    params: { msg: <custom message string> } 
    result: 0 - Success 
    notes: See logApplicationMessage. 
Additional Information 
role Document 
The <role> document in the roles array has the following form: 
{ 
role: <role name>, 
db: <database> 
} 
privilege Document 
The <privilege> document in the privileges array has the following form: 
{ 
resource: <resource document> , 
actions: [ <action>, ... ] 
} 
See Resource Document (page 373) for details on the resource document. For a list of actions, see Privilege Actions 
(page 375). 
6.4.3 Security Release Notes Alerts 
Security Release Notes (page 385) Security vulnerability for password. 
Security Release Notes 
Access to system.users Collection 
Changed in version 2.4. 
In 2.4, only users with the userAdmin role have access to the system.users collection. 
In version 2.2 and earlier, the read-write users of a database all have access to the system.users collection, which 
contains the user names and user password hashes. 60 
Password Hashing Insecurity 
If a user has the same password for multiple databases, the hash will be the same. A malicious user could exploit this 
to gain access on a second database using a different user’s credentials. 
As a result, always use unique username and password combinations for each database. 
Thanks to Will Urbanski, from Dell SecureWorks, for identifying this issue. 
60 Read-only users do not have access to the system.users collection. 
CHAPTER 7 
Aggregation 
Aggregation operations process data records and return computed results. Aggregation operations group values from 
multiple documents together, and can perform a variety of operations on the grouped data to return a single result. 
MongoDB provides three ways to perform aggregation: the aggregation pipeline (page 391), the map-reduce function 
(page 394), and single purpose aggregation methods and commands (page 395). 
Aggregation Introduction (page 387) A high-level introduction to aggregation. 
Aggregation Concepts (page 391) Introduces the use and operation of the data aggregation modalities available in 
MongoDB. 
Aggregation Pipeline (page 391) The aggregation pipeline is a framework for performing aggregation tasks, 
modeled on the concept of data processing pipelines. Using this framework, MongoDB passes the documents 
of a single collection through a pipeline. The pipeline transforms the documents into aggregated 
results, and is accessed through the aggregate database command. 
Map-Reduce (page 394) Map-reduce is a generic multi-phase data aggregation modality for processing quantities 
of data. MongoDB provides map-reduce with the mapReduce database command. 
Single Purpose Aggregation Operations (page 395) MongoDB provides a collection of specific data aggregation 
operations to support a number of common data aggregation functions. These operations include 
returning counts of documents, distinct values of a field, and simple grouping operations. 
Aggregation Mechanics (page 398) Details internal optimization operations, limits, support for sharded collections, 
and concurrency concerns. 
Aggregation Examples (page 403) Examples and tutorials for data aggregation operations in MongoDB. 
Aggregation Reference (page 419) References for all aggregation operations material for all data aggregation methods 
in MongoDB. 
7.1 Aggregation Introduction 
Aggregations are operations that process data records and return computed results. MongoDB provides a rich set 
of aggregation operations that examine and perform calculations on the data sets. Running data aggregation on the 
mongod instance simplifies application code and limits resource requirements. 
Like queries, aggregation operations in MongoDB use collections of documents as an input and return results in the 
form of one or more documents. 
7.1.1 Aggregation Modalities 
Aggregation Pipelines 
MongoDB 2.2 introduced a new aggregation framework (page 391), modeled on the concept of data processing 
pipelines. Documents enter a multi-stage pipeline that transforms the documents into an aggregated result. 
The most basic pipeline stages provide filters that operate like queries and document transformations that modify the 
form of the output document. 
Other pipeline operations provide tools for grouping and sorting documents by specific field or fields as well as tools 
for aggregating the contents of arrays, including arrays of documents. In addition, pipeline stages can use operators 
for tasks such as calculating the average or concatenating a string. 
The pipeline provides efficient data aggregation using native operations within MongoDB, and is the preferred method 
for data aggregation in MongoDB. 
Figure 7.1: Diagram of the annotated aggregation pipeline operation. The aggregation pipeline has two stages: 
$match and $group. 
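In plain-JavaScript terms (a simulation of the concept, not the server's implementation), the two stages in the figure behave roughly like a filter followed by a keyed reduction:

```javascript
// Simulate a two-stage pipeline: $match then $group with a $sum accumulator.
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" }
];

// Stage 1 - $match: { status: "A" } passes only matching documents.
const matched = orders.filter(function (doc) { return doc.status === "A"; });

// Stage 2 - $group: { _id: "$cust_id", total: { $sum: "$amount" } }
const totals = {};
matched.forEach(function (doc) {
  totals[doc.cust_id] = (totals[doc.cust_id] || 0) + doc.amount;
});

console.log(totals); // { A123: 750, B212: 200 }
```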
Map-Reduce 
MongoDB also provides map-reduce (page 394) operations to perform aggregation. In general, map-reduce operations 
have two phases: a map stage that processes each document and emits one or more objects for each input document, 
and a reduce phase that combines the output of the map operation. Optionally, map-reduce can have a finalize stage to 
make final modifications to the result. Like other aggregation operations, map-reduce can specify a query condition to 
select the input documents as well as sort and limit the results. 
Map-reduce uses custom JavaScript functions to perform the map and reduce operations, as well as the optional finalize 
operation. While the custom JavaScript provides great flexibility compared to the aggregation pipeline, in general, map-reduce 
is less efficient and more complex than the aggregation pipeline. 
Note: Starting in MongoDB 2.4, certain mongo shell functions and properties are inaccessible in map-reduce operations. 
MongoDB 2.4 also provides support for multiple JavaScript operations to run at the same time. Before 
MongoDB 2.4, JavaScript code executed in a single thread, raising concurrency issues for map-reduce. 
Figure 7.2: Diagram of the annotated map-reduce operation. 
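As a rough sketch of the two phases (plain JavaScript, not the mapReduce command itself), map emits key-value pairs and reduce combines the values emitted for each key:

```javascript
// Simulate map-reduce: map emits (key, value) pairs; reduce combines per key.
const orders = [
  { cust_id: "A123", amount: 500 },
  { cust_id: "A123", amount: 250 },
  { cust_id: "B212", amount: 200 }
];

// Map phase: emit one (cust_id, amount) pair per input document.
const emitted = {};
orders.forEach(function (doc) {
  (emitted[doc.cust_id] = emitted[doc.cust_id] || []).push(doc.amount);
});

// Reduce phase: combine all values emitted for a given key.
const results = {};
Object.keys(emitted).forEach(function (key) {
  results[key] = emitted[key].reduce(function (a, b) { return a + b; }, 0);
});

console.log(results); // { A123: 750, B212: 200 }
```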
Single Purpose Aggregation Operations 
For a number of common single purpose aggregation operations (page 395), MongoDB provides special purpose 
database commands. These common aggregation operations are: returning a count of matching documents, returning 
the distinct values for a field, and grouping data based on the values of a field. All of these operations aggregate 
documents from a single collection. While these operations provide simple access to common aggregation processes, 
they lack the flexibility and capabilities of the aggregation pipeline and map-reduce. 
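In plain-JavaScript terms (an analogy, not the server commands themselves), count and distinct reduce to a filtered length and a unique-values pass:

```javascript
// Plain-JavaScript analogies of two single purpose operations: count and distinct.
const inventory = [
  { item: "abc", dept: "A" },
  { item: "def", dept: "A" },
  { item: "xyz", dept: "B" },
  { item: "abc", dept: "B" }
];

// count: the number of documents matching a condition (here dept "A").
const countA = inventory.filter(function (d) { return d.dept === "A"; }).length;

// distinct: the unique values of a single field across the collection.
const distinctDepts = inventory
  .map(function (d) { return d.dept; })
  .filter(function (v, i, arr) { return arr.indexOf(v) === i; });

console.log(countA);        // 2
console.log(distinctDepts); // [ 'A', 'B' ]
```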
Figure 7.3: Diagram of the annotated distinct operation. 
7.1.2 Additional Features and Behaviors 
Both the aggregation pipeline and map-reduce can operate on a sharded collection (page 607). Map-reduce operations 
can also output to a sharded collection. See Aggregation Pipeline and Sharded Collections (page 401) and Map-Reduce 
and Sharded Collections (page 402) for details. 
The aggregation pipeline can use indexes to improve its performance during some of its stages. In addition, the aggregation 
pipeline has an internal optimization phase. See Pipeline Operators and Indexes (page 393) and Aggregation 
Pipeline Optimization (page 398) for details. 
For a feature comparison of the aggregation pipeline, map-reduce, and the special group functionality, see Aggregation 
Commands Comparison (page 424). 
7.2 Aggregation Concepts 
MongoDB provides three approaches to aggregation, each with its own strengths and purposes for a given situation. 
This section describes these approaches and also describes behaviors and limitations specific to each approach. See 
also the chart (page 424) that compares the approaches. 
Aggregation Pipeline (page 391) The aggregation pipeline is a framework for performing aggregation tasks, modeled 
on the concept of data processing pipelines. Using this framework, MongoDB passes the documents of a single 
collection through a pipeline. The pipeline transforms the documents into aggregated results, and is accessed 
through the aggregate database command. 
Map-Reduce (page 394) Map-reduce is a generic multi-phase data aggregation modality for processing quantities of 
data. MongoDB provides map-reduce with the mapReduce database command. 
Single Purpose Aggregation Operations (page 395) MongoDB provides a collection of specific data aggregation operations 
to support a number of common data aggregation functions. These operations include returning counts 
of documents, distinct values of a field, and simple grouping operations. 
Aggregation Mechanics (page 398) Details internal optimization operations, limits, support for sharded collections, 
and concurrency concerns. 
7.2.1 Aggregation Pipeline 
New in version 2.2. 
The aggregation pipeline is a framework for data aggregation modeled on the concept of data processing pipelines. 
Documents enter a multi-stage pipeline that transforms the documents into an aggregated result. 
The aggregation pipeline provides an alternative to map-reduce and may be the preferred solution for aggregation tasks 
where the complexity of map-reduce may be unwarranted. 
The aggregation pipeline has some limitations on value types and result size. See Aggregation Pipeline Limits (page 401) 
for details on limits and restrictions on the aggregation pipeline. 
Pipeline 
The MongoDB aggregation pipeline consists of stages. Each stage transforms the documents as they pass through the 
pipeline. Pipeline stages do not need to produce one output document for every input document; e.g., some stages may 
generate new documents or filter out documents. Pipeline stages can appear multiple times in the pipeline. 
MongoDB provides the db.collection.aggregate() method in the mongo shell and the aggregate command 
for the aggregation pipeline. See aggregation-pipeline-operator-reference for the available stages. 
7.2. Aggregation Concepts 391
MongoDB Documentation, Release 2.6.4 
Figure 7.4: Diagram of the annotated aggregation pipeline operation. The aggregation pipeline has two stages: 
$match and $group. 
For example usage of the aggregation pipeline, consider Aggregation with User Preference Data (page 407) and 
Aggregation with the Zip Code Data Set (page 404). 
Pipeline Expressions 
Some pipeline stages take a pipeline expression as their operand. Pipeline expressions specify the transformation to 
apply to the input documents. Expressions have a document (page 158) structure and can contain other expressions 
(page 420). 
Pipeline expressions can only operate on the current document in the pipeline and cannot refer to data from other 
documents: expression operations provide in-memory transformation of documents. 
Generally, expressions are stateless and are only evaluated when seen by the aggregation process with one exception: 
accumulator expressions. 
The accumulators, used with the $group pipeline operator, maintain their state (e.g. totals, maximums, minimums, 
and related data) as documents progress through the pipeline. 
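A minimal sketch of how an accumulator differs from a stateless expression, using a hypothetical makeSumAccumulator helper in plain JavaScript (this is an illustrative model, not a MongoDB API):

```javascript
// An accumulator carries state forward as each document passes
// through, analogous to $sum inside a $group stage.
function makeSumAccumulator(field) {
  let state = 0; // accumulator state, updated per document
  return {
    accumulate(doc) { state += doc[field]; },
    value() { return state; }
  };
}

const totalPop = makeSumAccumulator("pop");
[{ pop: 100 }, { pop: 250 }, { pop: 50 }].forEach(d => totalPop.accumulate(d));
// totalPop.value() === 400
```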
For more information on expressions, see Expressions (page 420). 
Aggregation Pipeline Behavior 
In MongoDB, the aggregate command operates on a single collection, logically passing the entire collection into 
the aggregation pipeline. To optimize the operation, wherever possible, use the following strategies to avoid scanning 
the entire collection. 
Pipeline Operators and Indexes 
The $match and $sort pipeline operators can take advantage of an index when they occur at the beginning of the 
pipeline. 
New in version 2.4: The $geoNear pipeline operator takes advantage of a geospatial index. When using $geoNear, 
the $geoNear pipeline operation must appear as the first stage in an aggregation pipeline. 
Even when the pipeline uses an index, aggregation still requires access to the actual documents; i.e. indexes cannot 
fully cover an aggregation pipeline. 
Changed in version 2.6: In previous versions, for very select use cases, an index could cover a pipeline. 
Early Filtering 
If your aggregation operation requires only a subset of the data in a collection, use the $match, $limit, and $skip 
stages to restrict the documents that enter at the beginning of the pipeline. When placed at the beginning of a pipeline, 
$match operations use suitable indexes to scan only the matching documents in a collection. 
Placing a $match pipeline stage followed by a $sort stage at the start of the pipeline is logically equivalent to a 
single query with a sort and can use an index. When possible, place $match operators at the beginning of the pipeline. 
Additional Features 
The aggregation pipeline has an internal optimization phase that provides improved performance for certain sequences 
of operators. For details, see Aggregation Pipeline Optimization (page 398). 
The aggregation pipeline supports operations on sharded collections. See Aggregation Pipeline and Sharded Collections 
(page 401). 
7.2.2 Map-Reduce 
Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. For 
map-reduce operations, MongoDB provides the mapReduce database command. 
Consider the following map-reduce operation: 
Figure 7.5: Diagram of the annotated map-reduce operation. 
In this map-reduce operation, MongoDB applies the map phase to each input document (i.e. the documents in the 
collection that match the query condition). The map function emits key-value pairs. For those keys that have multiple 
values, MongoDB applies the reduce phase, which collects and condenses the aggregated data. MongoDB then stores 
the results in a collection. Optionally, the output of the reduce function may pass through a finalize function to further 
condense or process the results of the aggregation. 
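The map and reduce phases described above can be sketched in plain JavaScript; the orders documents and the emit bookkeeping are invented for illustration, and MongoDB actually runs these functions inside the mongod process:

```javascript
// Stand-in input collection.
const orders = [
  { cust_id: "A123", price: 25 },
  { cust_id: "A123", price: 50 },
  { cust_id: "B212", price: 200 }
];

// Map phase: emit a key-value pair for each input document.
const emitted = new Map();
function emit(key, value) {
  if (!emitted.has(key)) emitted.set(key, []);
  emitted.get(key).push(value);
}
orders.forEach(doc => emit(doc.cust_id, doc.price));

// Reduce phase: only keys with multiple values pass through reduce,
// which condenses the values for a key into a single result.
const reduce = (key, values) => values.reduce((a, b) => a + b, 0);
const results = {};
for (const [key, values] of emitted) {
  results[key] = values.length > 1 ? reduce(key, values) : values[0];
}
// results: { A123: 75, B212: 200 }
```

A finalize step, if used, would be one more function applied to each entry of `results`.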
All map-reduce functions in MongoDB are JavaScript and run within the mongod process. Map-reduce operations 
take the documents of a single collection as the input and can perform any arbitrary sorting and limiting before 
beginning the map stage. mapReduce can return the results of a map-reduce operation as a document, or may write 
the results to collections. The input and the output collections may be sharded. 
Note: For most aggregation operations, the Aggregation Pipeline (page 391) provides better performance and a more 
coherent interface. However, map-reduce operations provide some flexibility that is not presently available in the 
aggregation pipeline. 
Map-Reduce JavaScript Functions 
In MongoDB, map-reduce operations use custom JavaScript functions to map, or associate, values to a key. If a key 
has multiple values mapped to it, the operation reduces the values for the key to a single object. 
The use of custom JavaScript functions provides flexibility to map-reduce operations. For instance, when processing a 
document, the map function can create more than one key and value mapping or no mapping. Map-reduce operations 
can also use a custom JavaScript function to make final modifications to the results at the end of the map and reduce 
operation, such as performing additional calculations. 
Map-Reduce Behavior 
In MongoDB, the map-reduce operation can write results to a collection or return the results inline. If you write 
map-reduce output to a collection, you can perform subsequent map-reduce operations on the same input collection 
that merge, replace, or reduce new results with previous results. See mapReduce and Perform Incremental 
Map-Reduce (page 413) for details and examples. 
When returning the results of a map-reduce operation inline, the result documents must be within the BSON 
Document Size limit, which is currently 16 megabytes. For additional information on limits and restrictions on 
map-reduce operations, see the http://docs.mongodb.org/manual/reference/command/mapReduce 
reference page. 
MongoDB supports map-reduce operations on sharded collections (page 607). Map-reduce operations can also output 
the results to a sharded collection. See Map-Reduce and Sharded Collections (page 402). 
7.2.3 Single Purpose Aggregation Operations 
Aggregation refers to a broad class of data manipulation operations that compute a result based on an input and a 
specific procedure. MongoDB provides a number of aggregation operations that perform specific aggregation operations 
on a set of data. 
Although limited in scope, particularly compared to the aggregation pipeline (page 391) and map-reduce (page 394), 
these operations provide straightforward semantics for common data processing options. 
Count 
MongoDB can return a count of the number of documents that match a query. The count command as well as the 
count() and cursor.count() methods provide access to counts in the mongo shell. 
Example 
Given a collection named records with only the following documents: 
{ a: 1, b: 0 } 
{ a: 1, b: 1 } 
{ a: 1, b: 4 } 
{ a: 2, b: 2 } 
The following operation would count all documents in the collection and return the number 4: 
db.records.count() 
The following operation will count only the documents where the value of the field a is 1 and return 3: 
db.records.count( { a: 1 } ) 
Distinct 
The distinct operation takes a number of documents that match a query and returns all of the unique values for a field 
in the matching documents. The distinct command and db.collection.distinct() method provide this 
operation in the mongo shell. Consider the following examples of a distinct operation: 
Figure 7.6: Diagram of the annotated distinct operation. 
Example 
Given a collection named records with only the following documents: 
{ a: 1, b: 0 } 
{ a: 1, b: 1 } 
{ a: 1, b: 1 } 
{ a: 1, b: 4 } 
{ a: 2, b: 2 } 
{ a: 2, b: 2 } 
Consider the following db.collection.distinct() operation which returns the distinct values of the field b: 
db.records.distinct( "b" ) 
The results of this operation would resemble: 
[ 0, 1, 4, 2 ] 
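A sketch of the same distinct operation in plain JavaScript, using a Set to keep the first occurrence of each value, which matches the ordering shown above:

```javascript
// The records documents from the example above.
const records = [
  { a: 1, b: 0 },
  { a: 1, b: 1 },
  { a: 1, b: 1 },
  { a: 1, b: 4 },
  { a: 2, b: 2 },
  { a: 2, b: 2 }
];

// Collect unique values of field b in order of first appearance.
const distinctB = [...new Set(records.map(doc => doc.b))];
// distinctB: [ 0, 1, 4, 2 ]
```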
Group 
The group operation takes a number of documents that match a query, and then collects groups of documents based 
on the value of a field or fields. It returns an array of documents with computed results for each group of documents. 
Access the grouping functionality via the group command or the db.collection.group() method in the 
mongo shell. 
Warning: group does not support data in sharded collections. In addition, the results of the group operation 
must be no larger than 16 megabytes. 
Consider the following group operation: 
Example 
Given a collection named records with the following documents: 
{ a: 1, count: 4 } 
{ a: 1, count: 2 } 
{ a: 1, count: 4 } 
{ a: 2, count: 3 } 
{ a: 2, count: 1 } 
{ a: 1, count: 5 } 
{ a: 4, count: 4 } 
Consider the following group operation which groups documents by the field a, where a is less than 3, and sums the 
field count for each group: 
db.records.group( { 
key: { a: 1 }, 
cond: { a: { $lt: 3 } }, 
reduce: function(cur, result) { result.count += cur.count }, 
initial: { count: 0 } 
} ) 
The results of this group operation would resemble the following: 
[ 
{ a: 1, count: 15 }, 
{ a: 2, count: 4 } 
] 
See also: 
The $group for related functionality in the aggregation pipeline (page 391). 
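The group operation above can be sketched in plain JavaScript: apply the cond filter, bucket documents by the key field, and fold each bucket with the reduce function starting from the initial value. This is an illustrative model, not the server implementation:

```javascript
// The records documents from the example above.
const records = [
  { a: 1, count: 4 }, { a: 1, count: 2 }, { a: 1, count: 4 },
  { a: 2, count: 3 }, { a: 2, count: 1 }, { a: 1, count: 5 },
  { a: 4, count: 4 }
];

const cond = doc => doc.a < 3;                          // cond: { a: { $lt: 3 } }
const reduce = (cur, result) => { result.count += cur.count; };

const groups = new Map();
for (const doc of records.filter(cond)) {
  if (!groups.has(doc.a)) groups.set(doc.a, { a: doc.a, count: 0 }); // initial
  reduce(doc, groups.get(doc.a));
}
const result = [...groups.values()];
// result: [ { a: 1, count: 15 }, { a: 2, count: 4 } ]
```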
7.2.4 Aggregation Mechanics 
This section describes behaviors and limitations for the various aggregation modalities. 
Aggregation Pipeline Optimization (page 398) Details the internal optimization of certain pipeline sequences. 
Aggregation Pipeline Limits (page 401) Presents limitations on aggregation pipeline operations. 
Aggregation Pipeline and Sharded Collections (page 401) Mechanics of aggregation pipeline operations on sharded 
collections. 
Map-Reduce and Sharded Collections (page 402) Mechanics of map-reduce operation with sharded collections. 
Map Reduce Concurrency (page 403) Details the locks taken during map-reduce operations. 
Aggregation Pipeline Optimization 
Aggregation pipeline operations have an optimization phase which attempts to reshape the pipeline for improved 
performance. 
To see how the optimizer transforms a particular aggregation pipeline, include the explain option in the 
db.collection.aggregate() method. 
Optimizations are subject to change between releases. 
Projection Optimization 
The aggregation pipeline can determine if it requires only a subset of the fields in the documents to obtain the results. 
If so, the pipeline will only use those required fields, reducing the amount of data passing through the pipeline. 
Pipeline Sequence Optimization 
$sort + $match Sequence Optimization When you have a sequence with $sort followed by a $match, the 
$match moves before the $sort to minimize the number of objects to sort. For example, if the pipeline consists of 
the following stages: 
{ $sort: { age : -1 } }, 
{ $match: { status: 'A' } } 
During the optimization phase, the optimizer transforms the sequence to the following: 
{ $match: { status: 'A' } }, 
{ $sort: { age : -1 } } 
$skip + $limit Sequence Optimization When you have a sequence with $skip followed by a $limit, the 
$limit moves before the $skip. With the reordering, the $limit value increases by the $skip amount. 
For example, if the pipeline consists of the following stages: 
{ $skip: 10 }, 
{ $limit: 5 } 
During the optimization phase, the optimizer transforms the sequence to the following: 
{ $limit: 15 }, 
{ $skip: 10 } 
This optimization allows for more opportunities for $sort + $limit Coalescence (page 399), such as with $sort + 
$skip + $limit sequences. See $sort + $limit Coalescence (page 399) for details on the coalescence and $sort + 
$skip + $limit Sequence (page 400) for an example. 
For aggregation operations on sharded collections (page 401), this optimization reduces the results returned from each 
shard. 
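The reordering can be checked on a plain JavaScript array, where slice stands in for $skip and $limit; this is an illustrative model of the optimization, not the optimizer itself:

```javascript
// Thirty stand-in documents: 0, 1, ..., 29.
const docs = Array.from({ length: 30 }, (_, i) => i);

const skipThenLimit = docs.slice(10).slice(0, 5);   // { $skip: 10 }, { $limit: 5 }
const limitThenSkip = docs.slice(0, 15).slice(10);  // { $limit: 15 }, { $skip: 10 }
// both select the same elements: [ 10, 11, 12, 13, 14 ]
```

Because the limit grows by the skip amount, both orderings select the same window of documents.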
$redact + $match Sequence Optimization When possible, when the pipeline has the $redact stage 
immediately followed by the $match stage, the aggregation can sometimes add a portion of the $match stage before the 
$redact stage. If the added $match stage is at the start of a pipeline, the aggregation can use an index as well 
as query the collection to limit the number of documents that enter the pipeline. See Pipeline Operators and Indexes 
(page 393) for more information. 
For example, if the pipeline consists of the following stages: 
{ $redact: { $cond: { if: { $eq: [ "$level", 5 ] }, then: "$$PRUNE", else: "$$DESCEND" } } }, 
{ $match: { year: 2014, category: { $ne: "Z" } } } 
The optimizer can add the same $match stage before the $redact stage: 
{ $match: { year: 2014 } }, 
{ $redact: { $cond: { if: { $eq: [ "$level", 5 ] }, then: "$$PRUNE", else: "$$DESCEND" } } }, 
{ $match: { year: 2014, category: { $ne: "Z" } } } 
Pipeline Coalescence Optimization 
When possible, the optimization phase coalesces a pipeline stage into its predecessor. Generally, coalescence occurs 
after any sequence reordering optimization. 
$sort + $limit Coalescence When a $sort immediately precedes a $limit, the optimizer can coalesce the 
$limit into the $sort. This allows the sort operation to only maintain the top n results as it progresses, where 
n is the specified limit, and MongoDB only needs to store n items in memory 1. See sort-and-memory for more 
information. 
$limit + $limit Coalescence When a $limit immediately follows another $limit, the two stages can 
coalesce into a single $limit where the limit amount is the smaller of the two initial limit amounts. For example, a 
pipeline contains the following sequence: 
{ $limit: 100 }, 
{ $limit: 10 } 
Then the second $limit stage can coalesce into the first $limit stage and result in a single $limit stage where 
the limit amount 10 is the minimum of the two initial limits 100 and 10. 
{ $limit: 10 } 
$skip + $skip Coalescence When a $skip immediately follows another $skip, the two stages can coalesce 
into a single $skip where the skip amount is the sum of the two initial skip amounts. For example, a pipeline contains 
the following sequence: 
{ $skip: 5 }, 
{ $skip: 2 } 
1 The optimization will still apply when allowDiskUse is true and the n items exceed the aggregation memory limit (page 401). 
Then the second $skip stage can coalesce into the first $skip stage and result in a single $skip stage where the 
skip amount 7 is the sum of the two initial skip amounts 5 and 2. 
{ $skip: 7 } 
$match + $match Coalescence When a $match immediately follows another $match, the two stages can 
coalesce into a single $match combining the conditions with an $and. For example, a pipeline contains the following 
sequence: 
{ $match: { year: 2014 } }, 
{ $match: { status: "A" } } 
Then the second $match stage can coalesce into the first $match stage and result in a single $match stage 
{ $match: { $and: [ { "year" : 2014 }, { "status" : "A" } ] } } 
Examples 
The following examples are some sequences that can take advantage of both sequence reordering and coalescence. 
Generally, coalescence occurs after any sequence reordering optimization. 
$sort + $skip + $limit Sequence A pipeline contains a sequence of $sort followed by a $skip followed 
by a $limit: 
{ $sort: { age : -1 } }, 
{ $skip: 10 }, 
{ $limit: 5 } 
First, the optimizer performs the $skip + $limit Sequence Optimization (page 398) to transform the sequence to the 
following: 
{ $sort: { age : -1 } }, 
{ $limit: 15 } 
{ $skip: 10 } 
The $skip + $limit Sequence Optimization (page 398) increases the $limit amount with the reordering. See $skip + 
$limit Sequence Optimization (page 398) for details. 
The reordered sequence now has $sort immediately preceding the $limit, and the pipeline can coalesce the two 
stages to decrease memory usage during the sort operation. See $sort + $limit Coalescence (page 399) for more 
information. 
$limit + $skip + $limit + $skip Sequence A pipeline contains a sequence of alternating $limit and 
$skip stages: 
{ $limit: 100 }, 
{ $skip: 5 }, 
{ $limit: 10 }, 
{ $skip: 2 } 
The $skip + $limit Sequence Optimization (page 398) reverses the position of the { $skip: 5 } and { $limit: 
10 } stages and increases the limit amount: 
{ $limit: 100 }, 
{ $limit: 15 }, 
{ $skip: 5 }, 
{ $skip: 2 } 
The optimizer then coalesces the two $limit stages into a single $limit stage and the two $skip stages into a 
single $skip stage. The resulting sequence is the following: 
{ $limit: 15 }, 
{ $skip: 7 } 
See $limit + $limit Coalescence (page 399) and $skip + $skip Coalescence (page 399) for details. 
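The combined reordering and coalescence can be checked the same way on a plain JavaScript array; again, this is an illustrative model, not the optimizer itself:

```javascript
// Two hundred stand-in documents: 0, 1, ..., 199.
const docs = Array.from({ length: 200 }, (_, i) => i);

const originalSequence = docs
  .slice(0, 100)   // { $limit: 100 }
  .slice(5)        // { $skip: 5 }
  .slice(0, 10)    // { $limit: 10 }
  .slice(2);       // { $skip: 2 }

const coalesced = docs.slice(0, 15).slice(7); // { $limit: 15 }, { $skip: 7 }
// both select the same elements: [ 7, 8, ..., 14 ]
```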
See also: 
explain option in the db.collection.aggregate() 
Aggregation Pipeline Limits 
Aggregation operations with the aggregate command have the following limitations. 
Result Size Restrictions 
If the aggregate command returns a single document that contains the complete result set, the command will 
produce an error if the result set exceeds the BSON Document Size limit, which is currently 16 megabytes. To 
manage result sets that exceed this limit, the aggregate command can return result sets of any size if the command 
returns a cursor or stores the results to a collection. 
Changed in version 2.6: The aggregate command can return results as a cursor or store the results in a collection, 
which are not subject to the size limit. The db.collection.aggregate() method returns a cursor and can return 
result sets of any size. 
Memory Restrictions 
Changed in version 2.6. 
Pipeline stages have a limit of 100 megabytes of RAM. If a stage exceeds this limit, MongoDB will produce an error. 
To allow for the handling of large datasets, use the allowDiskUse option to enable aggregation pipeline stages to 
write data to temporary files. 
See also: 
sort-memory-limit and group-memory-limit. 
Aggregation Pipeline and Sharded Collections 
The aggregation pipeline supports operations on sharded collections. This section describes behaviors specific to the 
aggregation pipeline (page 391) and sharded collections. 
Behavior 
Changed in version 2.6. 
When operating on a sharded collection, the aggregation pipeline is split into two parts. The first pipeline runs on each 
shard, or if an early $match can exclude shards through the use of the shard key in the predicate, the pipeline runs on 
only the relevant shards. 
The second pipeline consists of the remaining pipeline stages and runs on the primary shard (page 615). The primary 
shard merges the cursors from the other shards and runs the second pipeline on these results. The primary shard 
forwards the final results to the mongos. In previous versions, the second pipeline would run on the mongos. 2 
Optimization 
When splitting the aggregation pipeline into two parts, the pipeline is split to ensure that the shards perform as many 
stages as possible with consideration for optimization. 
To see how the pipeline was split, include the explain option in the db.collection.aggregate() method. 
Optimizations are subject to change between releases. 
Map-Reduce and Sharded Collections 
Map-reduce supports operations on sharded collections, both as an input and as an output. This section describes the 
behaviors of mapReduce specific to sharded collections. 
Sharded Collection as Input 
When using a sharded collection as the input for a map-reduce operation, mongos will automatically dispatch the 
map-reduce job to each shard in parallel. There is no special option required. mongos will wait for jobs on all shards 
to finish. 
Sharded Collection as Output 
Changed in version 2.2. 
If the out field for mapReduce has the sharded value, MongoDB shards the output collection using the _id field 
as the shard key. 
To output to a sharded collection: 
• If the output collection does not exist, MongoDB creates and shards the collection on the _id field. 
• For a new or an empty sharded collection, MongoDB uses the results of the first stage of the map-reduce 
operation to create the initial chunks distributed among the shards. 
• mongos dispatches, in parallel, a map-reduce post-processing job to every shard that owns a chunk. During 
the post-processing, each shard will pull the results for its own chunks from the other shards, run the final 
reduce/finalize, and write locally to the output collection. 
Note: 
• During later map-reduce jobs, MongoDB splits chunks as needed. 
• Balancing of chunks for the output collection is automatically prevented during post-processing to avoid 
concurrency issues. 
In MongoDB 2.0: 
2 Until all shards upgrade to v2.6, the second pipeline runs on the mongos if any shards are still running v2.4. 
• mongos retrieves the results from each shard, performs a merge sort to order the results, and proceeds to the 
reduce/finalize phase as needed. mongos then writes the result to the output collection in sharded mode. 
• This model requires only a small amount of memory, even for large data sets. 
• Shard chunks are not automatically split during insertion. This requires manual intervention until the chunks 
are granular and balanced. 
Important: For best results, only use the sharded output options for mapReduce in version 2.2 or later. 
Map Reduce Concurrency 
The map-reduce operation is composed of many tasks, including reads from the input collection, executions of the 
map function, executions of the reduce function, writes to a temporary collection during processing, and writes to 
the output collection. 
During the operation, map-reduce takes the following locks: 
• The read phase takes a read lock. It yields every 100 documents. 
• The insert into the temporary collection takes a write lock for a single write. 
• If the output collection does not exist, the creation of the output collection takes a write lock. 
• If the output collection exists, then the output actions (i.e. merge, replace, reduce) take a write lock. This 
write lock is global, and blocks all operations on the mongod instance. 
Changed in version 2.4: The V8 JavaScript engine, which became the default in 2.4, allows multiple JavaScript 
operations to execute at the same time. Prior to 2.4, JavaScript code (i.e. map, reduce, finalize functions) 
executed in a single thread. 
Note: The final write lock during post-processing makes the results appear atomically. However, output actions 
merge and reduce may take minutes to process. For the merge and reduce, the nonAtomic flag is available, 
which releases the lock between writing each output document. See the db.collection.mapReduce() reference 
for more information. 
7.3 Aggregation Examples 
This document provides practical examples that demonstrate the capabilities of aggregation (page 391). 
Aggregation with the Zip Code Data Set (page 404) Use the aggregation pipeline to group values and to calculate 
aggregated sums and averages for a collection of United States zip codes. 
Aggregation with User Preference Data (page 407) Use the pipeline to sort, normalize, and sum data on a collection 
of user data. 
Map-Reduce Examples (page 411) Define map-reduce operations that select ranges, group data, and calculate sums 
and averages. 
Perform Incremental Map-Reduce (page 413) Run a map-reduce operation over one collection and output results 
to another collection. 
Troubleshoot the Map Function (page 415) Steps to troubleshoot the map function. 
Troubleshoot the Reduce Function (page 416) Steps to troubleshoot the reduce function. 
7.3.1 Aggregation with the Zip Code Data Set 
The examples in this document use the zipcode collection. This collection is available at: 
media.mongodb.org/zips.json3. Use mongoimport to load this data set into your mongod instance. 
Data Model 
Each document in the zipcode collection has the following form: 
{ 
"_id": "10280", 
"city": "NEW YORK", 
"state": "NY", 
"pop": 5574, 
"loc": [ 
-74.016323, 
40.710537 
] 
} 
The _id field holds the zip code as a string. 
The city field holds the city name. A city can have more than one zip code associated with it as different sections of 
the city can each have a different zip code. 
The state field holds the two letter state abbreviation. 
The pop field holds the population. 
The loc field holds the location as a longitude, latitude pair. 
All of the following examples use the aggregate() helper in the mongo shell. aggregate() provides a wrapper 
around the aggregate database command. See the documentation for your driver for a more idiomatic interface 
for data aggregation operations. 
Return States with Populations above 10 Million 
To return all states with a population greater than 10 million, use the following aggregation operation: 
db.zipcodes.aggregate( { $group : 
{ _id : "$state", 
totalPop : { $sum : "$pop" } } }, 
{ $match : {totalPop : { $gte : 10*1000*1000 } } } ) 
Aggregation operations using the aggregate() helper process all documents in the zipcodes collection. 
aggregate() connects a number of pipeline (page 391) operators, which define the aggregation process. 
In this example, the pipeline passes all documents in the zipcodes collection through the following steps: 
• the $group operator collects all documents and creates documents for each state. 
These new per-state documents have one field in addition to the _id field: totalPop, a generated field that 
uses the $sum operation to calculate the total value of all pop fields in the source documents. 
After the $group operation the documents in the pipeline resemble the following: 
3http://media.mongodb.org/zips.json 
{ 
"_id" : "AK", 
"totalPop" : 550043 
} 
• the $match operation filters these documents so that the only documents that remain are those where the value 
of totalPop is greater than or equal to 10 million. 
The $match operation does not alter the documents, which have the same format as the documents output by 
$group. 
The equivalent SQL for this operation is: 
SELECT state, SUM(pop) AS totalPop 
FROM zipcodes 
GROUP BY state 
HAVING totalPop >= (10*1000*1000) 
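The two stages can also be modeled in plain JavaScript on a tiny invented sample; the state populations below are made up for illustration:

```javascript
// Invented stand-in for the zipcodes collection.
const zipcodes = [
  { state: "NY", pop: 6000000 },
  { state: "NY", pop: 7000000 },
  { state: "AK", pop: 550043 }
];

// $group: sum pop per state.
const byState = new Map();
for (const z of zipcodes) {
  byState.set(z.state, (byState.get(z.state) || 0) + z.pop);
}
const grouped = [...byState].map(([state, totalPop]) => ({ _id: state, totalPop }));

// $match: keep states with totalPop of at least 10 million.
const result = grouped.filter(d => d.totalPop >= 10 * 1000 * 1000);
// result: [ { _id: "NY", totalPop: 13000000 } ]
```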
Return Average City Population by State 
To return the average populations for cities in each state, use the following aggregation operation: 
db.zipcodes.aggregate( [ 
{ $group : { _id : { state : "$state", city : "$city" }, pop : { $sum : "$pop" } } }, 
{ $group : { _id : "$_id.state", avgCityPop : { $avg : "$pop" } } } 
] ) 
Aggregation operations using the aggregate() helper process all documents in the zipcodes collection. 
aggregate() connects a number of pipeline (page 391) operators that define the aggregation process. 
In this example, the pipeline passes all documents in the zipcodes collection through the following steps: 
• the $group operator collects all documents and creates new documents for every combination of the city and 
state fields in the source document. A city can have more than one zip code associated with it as different 
sections of the city can each have a different zip code. 
After this stage in the pipeline, the documents resemble the following: 
{ 
"_id" : { 
"state" : "CO", 
"city" : "EDGEWATER" 
}, 
"pop" : 13154 
} 
• the second $group operator collects documents by the state field and uses the $avg expression to compute 
a value for the avgCityPop field. 
The final output of this aggregation operation is: 
{ 
"_id" : "MN", 
"avgCityPop" : 5335 
}, 
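The two $group stages can be modeled in plain JavaScript on a tiny invented sample; the state, cities, and populations below are made up for illustration:

```javascript
// Invented sample: one city with two zip codes, one with a single zip code.
const zipcodes = [
  { state: "MN", city: "A", pop: 3000 },
  { state: "MN", city: "A", pop: 2000 },  // second zip code, same city
  { state: "MN", city: "B", pop: 7000 }
];

// First $group: total pop per (state, city) pair.
const cityPop = new Map();
for (const z of zipcodes) {
  const key = `${z.state}|${z.city}`;
  cityPop.set(key, (cityPop.get(key) || 0) + z.pop);
}

// Second $group: average the city totals within each state.
const byState = new Map();
for (const [key, pop] of cityPop) {
  const state = key.split("|")[0];
  if (!byState.has(state)) byState.set(state, []);
  byState.get(state).push(pop);
}
const result = [...byState].map(([state, pops]) => ({
  _id: state,
  avgCityPop: pops.reduce((a, b) => a + b, 0) / pops.length
}));
// result: [ { _id: "MN", avgCityPop: 6000 } ]
```

The average is taken over city totals (5000 and 7000), not over individual zip-code documents.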
Return Largest and Smallest Cities by State 
To return the smallest and largest cities by population for each state, use the following aggregation operation: 
db.zipcodes.aggregate( { $group: 
{ _id: { state: "$state", city: "$city" }, 
pop: { $sum: "$pop" } } }, 
{ $sort: { pop: 1 } }, 
{ $group: 
{ _id : "$_id.state", 
biggestCity: { $last: "$_id.city" }, 
biggestPop: { $last: "$pop" }, 
smallestCity: { $first: "$_id.city" }, 
smallestPop: { $first: "$pop" } } }, 
// the following $project is optional, and 
// modifies the output format. 
{ $project: 
{ _id: 0, 
state: "$_id", 
biggestCity: { name: "$biggestCity", pop: "$biggestPop" }, 
smallestCity: { name: "$smallestCity", pop: "$smallestPop" } } } ) 
Aggregation operations using the aggregate() helper process all documents in the zipcodes collection. 
aggregate() combines a number of pipeline (page 391) operators that define the aggregation process. 
All documents from the zipcodes collection pass into the pipeline, which consists of the following steps: 
• the $group operator collects all documents and creates new documents for every combination of the city and 
state fields in the source documents. 
By specifying the value of _id as a sub-document that contains both fields, the operation preserves the state 
field for use later in the pipeline. The documents produced by this stage of the pipeline have a second field, 
pop, which uses the $sum operator to provide the total of the pop fields in the source document. 
At this stage in the pipeline, the documents resemble the following: 
{ 
"_id" : { 
"state" : "CO", 
"city" : "EDGEWATER" 
}, 
"pop" : 13154 
} 
• $sort operator orders the documents in the pipeline based on the value of the pop field from largest to smallest. 
This operation does not alter the documents. 
• the second $group operator collects the documents in the pipeline by the state field, which is a field inside 
the nested _id document. 
Within each per-state document this $group operator specifies four fields: Using the $last expression, the 
$group operator creates the biggestCity and biggestPop fields that store the city with the largest 
population and that population. Using the $first expression, the $group operator creates the smallestCity 
and smallestPop fields that store the city with the smallest population and that population. 
The documents at this stage in the pipeline resemble the following: 
{ 
"_id" : "WA", 
"biggestCity" : "SEATTLE", 
"biggestPop" : 520096, 
"smallestCity" : "BENGE", 
"smallestPop" : 2 
} 
• The final operation is $project, which renames the _id field to state and moves the biggestCity, 
biggestPop, smallestCity, and smallestPop fields into biggestCity and smallestCity 
sub-documents. 
The output of this aggregation operation is: 
{ 
"state" : "RI", 
"biggestCity" : { 
"name" : "CRANSTON", 
"pop" : 176404 
}, 
"smallestCity" : { 
"name" : "CLAYVILLE", 
"pop" : 45 
} 
} 
7.3.2 Aggregation with User Preference Data 
Data Model 
Consider a hypothetical sports club with a database that contains a users collection that tracks users' join 
dates and sport preferences, storing these data in documents that resemble the following: 
{ 
_id : "jane", 
joined : ISODate("2011-03-02"), 
likes : ["golf", "racquetball"] 
} 
{ 
_id : "joe", 
joined : ISODate("2012-07-02"), 
likes : ["tennis", "golf", "swimming"] 
} 
Normalize and Sort Documents 
The following operation returns user names in upper case and in alphabetical order. The aggregation includes user 
names for all documents in the users collection. You might do this to normalize user names for processing. 
db.users.aggregate( 
[ 
{ $project : { name:{$toUpper:"$_id"} , _id:0 } }, 
{ $sort : { name : 1 } } 
] 
) 
All documents from the users collection pass through the pipeline, which consists of the following operations: 
• The $project operator: 
– creates a new field called name. 
– converts the value of _id to upper case with the $toUpper operator and assigns the result to the 
name field. 
– suppresses the _id field. $project passes the _id field by default, unless explicitly suppressed. 
• The $sort operator orders the results by the name field. 
The results of the aggregation would resemble the following: 
{ 
"name" : "JANE" 
}, 
{ 
"name" : "JILL" 
}, 
{ 
"name" : "JOE" 
} 
Return Usernames Ordered by Join Month 
The following aggregation operation returns user names sorted by the month they joined. This kind of aggregation 
could help generate membership renewal notices. 
db.users.aggregate( 
[ 
{ $project : 
{ 
month_joined : { $month : "$joined" }, 
name : "$_id", 
_id : 0 
} 
}, 
{ $sort : { month_joined : 1 } } 
] 
) 
The pipeline passes all documents in the users collection through the following operations: 
• The $project operator: 
– Creates two new fields: month_joined and name. 
– Suppresses the _id field from the results. The aggregate() method includes the _id field, unless explicitly 
suppressed. 
• The $month operator converts the values of the joined field to integer representations of the month. Then 
the $project operator assigns those values to the month_joined field. 
• The $sort operator sorts the results by the month_joined field. 
The operation returns results that resemble the following: 
{ 
"month_joined" : 1, 
"name" : "ruth" 
}, 
{ 
"month_joined" : 1, 
"name" : "harold" 
}, 
{ 
"month_joined" : 1, 
"name" : "kate" 
} 
{ 
"month_joined" : 2, 
"name" : "jill" 
} 
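Outside the server, the effect of { $month: "$joined" } can be sketched in plain JavaScript: a date maps to an integer month (1 to 12), which then serves as the sort key. The sample dates mirror the documents shown earlier in this section:

```javascript
// Plain-JS sketch of what { $month: "$joined" } computes per document.
const users = [
  { _id: "jane", joined: new Date(Date.UTC(2011, 2, 2)) },  // 2011-03-02
  { _id: "joe",  joined: new Date(Date.UTC(2012, 6, 2)) }   // 2012-07-02
];

// $project: month_joined from the date (getUTCMonth is 0-based, so add 1),
// name from _id; then $sort: { month_joined: 1 }.
const projected = users
  .map(u => ({ name: u._id, month_joined: u.joined.getUTCMonth() + 1 }))
  .sort((a, b) => a.month_joined - b.month_joined);

console.log(projected);
// [ { name: 'jane', month_joined: 3 }, { name: 'joe', month_joined: 7 } ]
```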
Return Total Number of Joins per Month 
The following operation shows how many people joined each month of the year. You might use this aggregated data 
for recruiting and marketing strategies. 
db.users.aggregate( 
[ 
{ $project : { month_joined : { $month : "$joined" } } } , 
{ $group : { _id : {month_joined:"$month_joined"} , number : { $sum : 1 } } }, 
{ $sort : { "_id.month_joined" : 1 } } 
] 
) 
The pipeline passes all documents in the users collection through the following operations: 
• The $project operator creates a new field called month_joined. 
• The $month operator converts the values of the joined field to integer representations of the month. Then 
the $project operator assigns the values to the month_joined field. 
• The $group operator collects all documents with a given month_joined value and counts how many documents 
there are for that value. Specifically, for each unique value, $group creates a new “per-month” document 
with two fields: 
– _id, which contains a nested document with the month_joined field and its value. 
– number, which is a generated field. The $sum operator increments this field by 1 for every document 
containing the given month_joined value. 
• The $sort operator sorts the documents created by $group according to the contents of the month_joined 
field. 
The result of this aggregation operation would resemble the following: 
{ 
"_id" : { 
"month_joined" : 1 
}, 
"number" : 3 
}, 
{ 
"_id" : { 
"month_joined" : 2 
}, 
"number" : 9 
}, 
{ 
"_id" : { 
"month_joined" : 3 
}, 
"number" : 5 
} 
Return the Five Most Common “Likes” 
The following aggregation collects the five most “liked” activities in the data set. This type of analysis could help 
inform planning and future development. 
db.users.aggregate( 
[ 
{ $unwind : "$likes" }, 
{ $group : { _id : "$likes" , number : { $sum : 1 } } }, 
{ $sort : { number : -1 } }, 
{ $limit : 5 } 
] 
) 
The pipeline begins with all documents in the users collection, and passes these documents through the following 
operations: 
• The $unwind operator separates each value in the likes array, and creates a new version of the source 
document for every element in the array. 
Example 
Given the following document from the users collection: 
{ 
_id : "jane", 
joined : ISODate("2011-03-02"), 
likes : ["golf", "racquetball"] 
} 
The $unwind operator would create the following documents: 
{ 
_id : "jane", 
joined : ISODate("2011-03-02"), 
likes : "golf" 
} 
{ 
_id : "jane", 
joined : ISODate("2011-03-02"), 
likes : "racquetball" 
} 
• The $group operator collects all documents with the same value for the likes field and counts each grouping. 
With this information, $group creates a new document with two fields: 
– _id, which contains the likes value. 
– number, which is a generated field. The $sum operator increments this field by 1 for every document 
containing the given likes value. 
• The $sort operator sorts these documents by the number field in reverse order. 
• The $limit operator only includes the first 5 result documents. 
The results of aggregation would resemble the following: 
{ 
"_id" : "golf", 
"number" : 33 
}, 
{ 
"_id" : "racquetball", 
"number" : 31 
}, 
{ 
"_id" : "swimming", 
"number" : 24 
}, 
{ 
"_id" : "handball", 
"number" : 19 
}, 
{ 
"_id" : "tennis", 
"number" : 18 
} 
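Applied to the two sample user documents from this section, the unwind, group, sort, and limit stages can be sketched in plain JavaScript (this mimics the pipeline's semantics, not the server's implementation):

```javascript
const users = [
  { _id: "jane", likes: ["golf", "racquetball"] },
  { _id: "joe",  likes: ["tennis", "golf", "swimming"] }
];

// $unwind: one output document per likes array element.
const unwound = users.flatMap(u =>
  u.likes.map(like => ({ _id: u._id, likes: like }))
);

// $group: count documents per distinct likes value ({ $sum: 1 }).
const counts = {};
for (const doc of unwound) counts[doc.likes] = (counts[doc.likes] || 0) + 1;

// $sort: { number: -1 } then $limit: 5.
const top5 = Object.entries(counts)
  .map(([like, number]) => ({ _id: like, number }))
  .sort((a, b) => b.number - a.number)
  .slice(0, 5);

console.log(top5[0]);  // { _id: 'golf', number: 2 }
```

With only four distinct activities in the sample data, the $limit stage has nothing to trim; on a larger data set it would cap the output at five documents.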
7.3.3 Map-Reduce Examples 
In the mongo shell, the db.collection.mapReduce() method is a wrapper around the mapReduce command. 
The following examples use the db.collection.mapReduce() method: 
Consider the following map-reduce operations on a collection orders that contains documents of the following 
prototype: 
{ 
_id: ObjectId("50a8240b927d5d8b5891743c"), 
cust_id: "abc123", 
ord_date: new Date("Oct 04, 2012"), 
status: 'A', 
price: 25, 
items: [ { sku: "mmm", qty: 5, price: 2.5 }, 
{ sku: "nnn", qty: 5, price: 2.5 } ] 
} 
Return the Total Price Per Customer 
Perform the map-reduce operation on the orders collection to group by the cust_id, and calculate the sum of the 
price for each cust_id: 
1. Define the map function to process each input document: 
• In the function, this refers to the document that the map-reduce operation is processing. 
• The function maps the price to the cust_id for each document and emits the cust_id and price 
pair. 
var mapFunction1 = function() { 
emit(this.cust_id, this.price); 
}; 
2. Define the corresponding reduce function with two arguments keyCustId and valuesPrices: 
• The valuesPrices is an array whose elements are the price values emitted by the map function and 
grouped by keyCustId. 
• The function reduces the valuesPrice array to the sum of its elements. 
var reduceFunction1 = function(keyCustId, valuesPrices) { 
return Array.sum(valuesPrices); 
}; 
3. Perform the map-reduce on all documents in the orders collection using the mapFunction1 map function 
and the reduceFunction1 reduce function. 
db.orders.mapReduce( 
mapFunction1, 
reduceFunction1, 
{ out: "map_reduce_example" } 
) 
This operation outputs the results to a collection named map_reduce_example. If the 
map_reduce_example collection already exists, the operation will replace the contents with the results 
of this map-reduce operation. 
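Running mapFunction1 and reduceFunction1 requires a mongod, but their logic can be exercised in plain JavaScript by supplying a local emit function. The sample orders below are invented for illustration:

```javascript
// Minimal emulation of the map-reduce above: map emits (cust_id, price)
// pairs, then reduce sums each key's values.
const orders = [
  { cust_id: "abc123", price: 25 },
  { cust_id: "abc123", price: 30 },
  { cust_id: "xyz789", price: 10 }
];

const emitted = {};  // key -> array of emitted values
function emit(key, value) {
  (emitted[key] = emitted[key] || []).push(value);
}

// map phase: emit(this.cust_id, this.price) for each document.
for (const order of orders) emit(order.cust_id, order.price);

// reduce phase: sum the emitted prices for each cust_id.
const results = {};
for (const [key, values] of Object.entries(emitted)) {
  results[key] = values.reduce((sum, v) => sum + v, 0);
}

console.log(results);  // { abc123: 55, xyz789: 10 }
```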
Calculate Order and Total Quantity with Average Quantity Per Item 
In this example, you will perform a map-reduce operation on the orders collection for all documents that have 
an ord_date value greater than 01/01/2012. The operation groups by the item.sku field, and calculates the 
number of orders and the total quantity ordered for each sku. The operation concludes by calculating the average 
quantity per order for each sku value: 
1. Define the map function to process each input document: 
• In the function, this refers to the document that the map-reduce operation is processing. 
• For each item, the function associates the sku with a new object value that contains the count of 1 
and the item qty for the order and emits the sku and value pair. 
var mapFunction2 = function() { 
for (var idx = 0; idx < this.items.length; idx++) { 
var key = this.items[idx].sku; 
var value = { 
count: 1, 
qty: this.items[idx].qty 
}; 
emit(key, value); 
} 
}; 
2. Define the corresponding reduce function with two arguments keySKU and countObjVals: 
• countObjVals is an array whose elements are the objects mapped to the grouped keySKU values 
passed by map function to the reducer function. 
• The function reduces the countObjVals array to a single object reducedVal that contains the 
count and the qty fields. 
• In reducedVal, the count field contains the sum of the count fields from the individual array elements, 
and the qty field contains the sum of the qty fields from the individual array elements. 
var reduceFunction2 = function(keySKU, countObjVals) { 
reducedVal = { count: 0, qty: 0 }; 
for (var idx = 0; idx < countObjVals.length; idx++) { 
reducedVal.count += countObjVals[idx].count; 
reducedVal.qty += countObjVals[idx].qty; 
} 
return reducedVal; 
}; 
3. Define a finalize function with two arguments key and reducedVal. The function modifies the 
reducedVal object to add a computed field named avg and returns the modified object: 
var finalizeFunction2 = function (key, reducedVal) { 
reducedVal.avg = reducedVal.qty/reducedVal.count; 
return reducedVal; 
}; 
4. Perform the map-reduce operation on the orders collection using the mapFunction2, 
reduceFunction2, and finalizeFunction2 functions. 
db.orders.mapReduce( mapFunction2, 
reduceFunction2, 
{ 
out: { merge: "map_reduce_example" }, 
query: { ord_date: 
{ $gt: new Date('01/01/2012') } 
}, 
finalize: finalizeFunction2 
} 
) 
This operation uses the query field to select only those documents with ord_date greater than new 
Date('01/01/2012'). Then it outputs the results to a collection named map_reduce_example. If the 
map_reduce_example collection already exists, the operation will merge the existing contents with the 
results of this map-reduce operation. 
7.3.4 Perform Incremental Map-Reduce 
Map-reduce operations can handle complex aggregation tasks. To perform map-reduce operations, MongoDB provides 
the mapReduce command and, in the mongo shell, the db.collection.mapReduce() wrapper method. 
If the map-reduce data set is constantly growing, you may want to perform an incremental map-reduce rather than 
performing the map-reduce operation over the entire data set each time. 
To perform incremental map-reduce: 
1. Run a map-reduce job over the current collection and output the result to a separate collection. 
2. When you have more data to process, run a subsequent map-reduce job with: 
• the query parameter that specifies conditions that match only the new documents. 
• the out parameter that specifies the reduce action to merge the new results into the existing output 
collection. 
Consider the following example where you schedule a map-reduce operation on a sessions collection to run at the 
end of each day. 
Data Setup 
The sessions collection contains documents that log users’ sessions each day, for example: 
db.sessions.save( { userid: "a", ts: ISODate('2011-11-03 14:17:00'), length: 95 } ); 
db.sessions.save( { userid: "b", ts: ISODate('2011-11-03 14:23:00'), length: 110 } ); 
db.sessions.save( { userid: "c", ts: ISODate('2011-11-03 15:02:00'), length: 120 } ); 
db.sessions.save( { userid: "d", ts: ISODate('2011-11-03 16:45:00'), length: 45 } ); 
db.sessions.save( { userid: "a", ts: ISODate('2011-11-04 11:05:00'), length: 105 } ); 
db.sessions.save( { userid: "b", ts: ISODate('2011-11-04 13:14:00'), length: 120 } ); 
db.sessions.save( { userid: "c", ts: ISODate('2011-11-04 17:00:00'), length: 130 } ); 
db.sessions.save( { userid: "d", ts: ISODate('2011-11-04 15:37:00'), length: 65 } ); 
Initial Map-Reduce of Current Collection 
Run the first map-reduce operation as follows: 
1. Define the map function that maps the userid to an object that contains the fields userid, total_time, 
count, and avg_time: 
var mapFunction = function() { 
var key = this.userid; 
var value = { 
userid: this.userid, 
total_time: this.length, 
count: 1, 
avg_time: 0 
}; 
emit( key, value ); 
}; 
2. Define the corresponding reduce function with two arguments key and values to calculate the total time and 
the count. The key corresponds to the userid, and the values is an array whose elements corresponds to 
the individual objects mapped to the userid in the mapFunction. 
var reduceFunction = function(key, values) { 
var reducedObject = { 
userid: key, 
total_time: 0, 
count:0, 
avg_time:0 
}; 
values.forEach( function(value) { 
reducedObject.total_time += value.total_time; 
reducedObject.count += value.count; 
} 
); 
return reducedObject; 
}; 
3. Define the finalize function with two arguments key and reducedValue. The function modifies the 
reducedValue document to add another field average and returns the modified document. 
var finalizeFunction = function (key, reducedValue) { 
if (reducedValue.count > 0) 
reducedValue.avg_time = reducedValue.total_time / reducedValue.count; 
return reducedValue; 
}; 
4. Perform map-reduce on the sessions collection using the mapFunction, the reduceFunction, and the 
finalizeFunction functions. Output the results to a collection session_stat. If the session_stat 
collection already exists, the operation will replace the contents: 
db.sessions.mapReduce( mapFunction, 
reduceFunction, 
{ 
out: "session_stat", 
finalize: finalizeFunction 
} 
) 
Subsequent Incremental Map-Reduce 
Later, as the sessions collection grows, you can run additional map-reduce operations. For example, add new 
documents to the sessions collection: 
db.sessions.save( { userid: "a", ts: ISODate('2011-11-05 14:17:00'), length: 100 } ); 
db.sessions.save( { userid: "b", ts: ISODate('2011-11-05 14:23:00'), length: 115 } ); 
db.sessions.save( { userid: "c", ts: ISODate('2011-11-05 15:02:00'), length: 125 } ); 
db.sessions.save( { userid: "d", ts: ISODate('2011-11-05 16:45:00'), length: 55 } ); 
At the end of the day, perform incremental map-reduce on the sessions collection, but use the query field to select 
only the new documents. Output the results to the collection session_stat, but reduce the contents with the 
results of the incremental map-reduce: 
db.sessions.mapReduce( mapFunction, 
reduceFunction, 
{ 
query: { ts: { $gt: ISODate('2011-11-05 00:00:00') } }, 
out: { reduce: "session_stat" }, 
finalize: finalizeFunction 
} 
); 
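The reduce action in out: { reduce: "session_stat" } works only because reduceFunction can consume its own output. A plain-JavaScript check of that property, using the same reduce logic with invented session values:

```javascript
// Re-reduce check: reducing day one's stored result together with day two's
// new values must equal reducing all values at once.
var reduceFunction = function(key, values) {
  var reducedObject = { userid: key, total_time: 0, count: 0, avg_time: 0 };
  values.forEach(function(value) {
    reducedObject.total_time += value.total_time;
    reducedObject.count += value.count;
  });
  return reducedObject;
};

var day1 = [ { total_time: 95, count: 1 }, { total_time: 105, count: 1 } ];
var day2 = [ { total_time: 100, count: 1 } ];

var stored = reduceFunction("a", day1);                      // what session_stat holds
var incremental = reduceFunction("a", [ stored ].concat(day2));
var allAtOnce = reduceFunction("a", day1.concat(day2));

console.log(incremental.total_time === allAtOnce.total_time &&
            incremental.count === allAtOnce.count);          // true
```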
7.3.5 Troubleshoot the Map Function 
The map function is a JavaScript function that associates or “maps” a value with a key and emits the key and value 
pair during a map-reduce (page 394) operation. 
To verify the key and value pairs emitted by the map function, write your own emit function. 
Consider a collection orders that contains documents of the following prototype: 
{ 
_id: ObjectId("50a8240b927d5d8b5891743c"), 
cust_id: "abc123", 
ord_date: new Date("Oct 04, 2012"), 
status: 'A', 
price: 250, 
items: [ { sku: "mmm", qty: 5, price: 2.5 }, 
{ sku: "nnn", qty: 5, price: 2.5 } ] 
} 
1. Define the map function that maps the price to the cust_id for each document and emits the cust_id and 
price pair: 
var map = function() { 
emit(this.cust_id, this.price); 
}; 
2. Define the emit function to print the key and value: 
var emit = function(key, value) { 
print("emit"); 
print("key: " + key + " value: " + tojson(value)); 
} 
3. Invoke the map function with a single document from the orders collection: 
var myDoc = db.orders.findOne( { _id: ObjectId("50a8240b927d5d8b5891743c") } ); 
map.apply(myDoc); 
4. Verify the key and value pair is as you expected. 
emit 
key: abc123 value: 250 
5. Invoke the map function with multiple documents from the orders collection: 
var myCursor = db.orders.find( { cust_id: "abc123" } ); 
while (myCursor.hasNext()) { 
var doc = myCursor.next(); 
print ("document _id= " + tojson(doc._id)); 
map.apply(doc); 
print(); 
} 
6. Verify the key and value pairs are as you expected. 
See also: 
The map function must meet various requirements. For a list of all the requirements for the map function, see 
mapReduce, or the mongo shell helper method db.collection.mapReduce(). 
7.3.6 Troubleshoot the Reduce Function 
The reduce function is a JavaScript function that “reduces” to a single object all the values associated with a particular 
key during a map-reduce (page 394) operation. The reduce function must meet various requirements. This 
tutorial helps verify that the reduce function meets the following criteria: 
• The reduce function must return an object whose type must be identical to the type of the value emitted by 
the map function. 
• The order of the elements in the valuesArray should not affect the output of the reduce function. 
• The reduce function must be idempotent. 
For a list of all the requirements for the reduce function, see mapReduce, or the mongo shell helper method 
db.collection.mapReduce(). 
Confirm Output Type 
You can test that the reduce function returns a value that is the same type as the value emitted from the map function. 
1. Define a reduceFunction1 function that takes the arguments keyCustId and valuesPrices. 
valuesPrices is an array of integers: 
var reduceFunction1 = function(keyCustId, valuesPrices) { 
return Array.sum(valuesPrices); 
}; 
2. Define a sample array of integers: 
var myTestValues = [ 5, 5, 10 ]; 
3. Invoke the reduceFunction1 with myTestValues: 
reduceFunction1('myKey', myTestValues); 
4. Verify the reduceFunction1 returned an integer: 
20 
5. Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects. 
valuesCountObjects is an array of documents that contain two fields count and qty: 
var reduceFunction2 = function(keySKU, valuesCountObjects) { 
reducedValue = { count: 0, qty: 0 }; 
for (var idx = 0; idx < valuesCountObjects.length; idx++) { 
reducedValue.count += valuesCountObjects[idx].count; 
reducedValue.qty += valuesCountObjects[idx].qty; 
} 
return reducedValue; 
}; 
6. Define a sample array of documents: 
var myTestObjects = [ 
{ count: 1, qty: 5 }, 
{ count: 2, qty: 10 }, 
{ count: 3, qty: 15 } 
]; 
7. Invoke the reduceFunction2 with myTestObjects: 
reduceFunction2('myKey', myTestObjects); 
8. Verify the reduceFunction2 returned a document with exactly the count and the qty field: 
{ "count" : 6, "qty" : 30 } 
Ensure Insensitivity to the Order of Mapped Values 
The reduce function takes a key and a values array as its arguments. You can test that the result of the reduce 
function does not depend on the order of the elements in the values array. 
1. Define a sample values1 array and a sample values2 array that only differ in the order of the array elements: 
var values1 = [ 
{ count: 1, qty: 5 }, 
{ count: 2, qty: 10 }, 
{ count: 3, qty: 15 } 
]; 
var values2 = [ 
{ count: 3, qty: 15 }, 
{ count: 1, qty: 5 }, 
{ count: 2, qty: 10 } 
]; 
2. Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects. 
valuesCountObjects is an array of documents that contain two fields count and qty: 
var reduceFunction2 = function(keySKU, valuesCountObjects) { 
reducedValue = { count: 0, qty: 0 }; 
for (var idx = 0; idx < valuesCountObjects.length; idx++) { 
reducedValue.count += valuesCountObjects[idx].count; 
reducedValue.qty += valuesCountObjects[idx].qty; 
} 
return reducedValue; 
}; 
3. Invoke the reduceFunction2 first with values1 and then with values2: 
reduceFunction2('myKey', values1); 
reduceFunction2('myKey', values2); 
4. Verify the reduceFunction2 returned the same result: 
{ "count" : 6, "qty" : 30 } 
Ensure Reduce Function Idempotence 
Because the map-reduce operation may call a reduce multiple times for the same key, and won’t call a reduce for 
single instances of a key in the working set, the reduce function must return a value of the same type as the value 
emitted from the map function. You can test that the reduce function processes “reduced” values without affecting the 
final value. 
1. Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects. 
valuesCountObjects is an array of documents that contain two fields count and qty: 
var reduceFunction2 = function(keySKU, valuesCountObjects) { 
reducedValue = { count: 0, qty: 0 }; 
for (var idx = 0; idx < valuesCountObjects.length; idx++) { 
reducedValue.count += valuesCountObjects[idx].count; 
reducedValue.qty += valuesCountObjects[idx].qty; 
} 
return reducedValue; 
}; 
2. Define a sample key: 
var myKey = 'myKey'; 
3. Define a sample valuesIdempotent array that contains an element that is a call to the reduceFunction2 
function: 
var valuesIdempotent = [ 
{ count: 1, qty: 5 }, 
{ count: 2, qty: 10 }, 
reduceFunction2(myKey, [ { count:3, qty: 15 } ] ) 
]; 
4. Define a sample values1 array that combines the values passed to reduceFunction2: 
var values1 = [ 
{ count: 1, qty: 5 }, 
{ count: 2, qty: 10 }, 
{ count: 3, qty: 15 } 
]; 
5. Invoke the reduceFunction2 first with myKey and valuesIdempotent and then with myKey and 
values1: 
reduceFunction2(myKey, valuesIdempotent); 
reduceFunction2(myKey, values1); 
6. Verify the reduceFunction2 returned the same result: 
{ "count" : 6, "qty" : 30 } 
7.4 Aggregation Reference 
Aggregation Pipeline Quick Reference (page 420) Quick reference card for aggregation pipeline. 
http://docs.mongodb.org/manual/reference/operator/aggregation Aggregation pipeline operations 
have a collection of operators available to define and manipulate documents in pipeline stages. 
Aggregation Commands Comparison (page 424) A comparison of group, mapReduce and aggregate that explores 
the strengths and limitations of each aggregation modality. 
SQL to Aggregation Mapping Chart (page 426) An overview of common aggregation operations in SQL and MongoDB 
using the aggregation pipeline and operators in MongoDB and common SQL statements. 
Aggregation Interfaces (page 428) The data aggregation interfaces document the invocation format and output for 
MongoDB’s aggregation commands and methods. 
Variables in Aggregation Expressions (page 428) Use of variables in aggregation pipeline expressions. 
7.4.1 Aggregation Pipeline Quick Reference 
Stages 
Pipeline stages appear in an array. Documents pass through the stages in sequence. All except the $out and 
$geoNear stages can appear multiple times in a pipeline. 
db.collection.aggregate( [ { <stage> }, ... ] ) 
Name Description 
$geoNear Returns an ordered stream of documents based on the proximity to a geospatial point. Incorporates the 
functionality of $match, $sort, and $limit for geospatial data. The output documents include an 
additional distance field and can include a location identifier field. 
$group Groups input documents by a specified identifier expression and applies the accumulator expression(s), 
if specified, to each group. Consumes all input documents and outputs one document per each distinct 
group. The output documents only contain the identifier field and, if specified, accumulated fields. 
$limit Passes the first n documents unmodified to the pipeline where n is the specified limit. For each input 
document, outputs either one document (for the first n documents) or zero documents (after the first n 
documents). 
$match Filters the document stream to allow only matching documents to pass unmodified into the next 
pipeline stage. $match uses standard MongoDB queries. For each input document, outputs either one 
document (a match) or zero documents (no match). 
$out Writes the resulting documents of the aggregation pipeline to a collection. To use the $out stage, it 
must be the last stage in the pipeline. 
$project Reshapes each document in the stream, such as by adding new fields or removing existing fields. For 
each input document, outputs one document. 
$redact Reshapes each document in the stream by restricting the content for each document based on 
information stored in the documents themselves. Incorporates the functionality of $project and 
$match. Can be used to implement field level redaction. For each input document, outputs either one 
or zero documents. 
$skip Skips the first n documents where n is the specified skip number and passes the remaining documents 
unmodified to the pipeline. For each input document, outputs either zero documents (for the first n 
documents) or one document (if after the first n documents). 
$sort Reorders the document stream by a specified sort key. Only the order changes; the documents remain 
unmodified. For each input document, outputs one document. 
$unwind Deconstructs an array field from the input documents to output a document for each element. Each 
output document replaces the array with an element value. For each input document, outputs n 
documents where n is the number of array elements and can be zero for an empty array. 
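To see how documents flow from stage to stage, the following plain-JavaScript sketch simulates a $match, $group, $sort, $limit pipeline over in-memory documents (the data and field names are invented; a real pipeline would run via db.collection.aggregate()):

```javascript
const docs = [
  { state: "WA", pop: 520096 },
  { state: "WA", pop: 208916 },
  { state: "RI", pop: 176404 },
  { state: "RI", pop: 45 }
];

// $match: { pop: { $gt: 100 } } — documents pass unmodified or not at all.
const matched = docs.filter(d => d.pop > 100);

// $group: { _id: "$state", totalPop: { $sum: "$pop" } } — one doc per group.
const totals = {};
for (const d of matched) totals[d.state] = (totals[d.state] || 0) + d.pop;

// $sort: { totalPop: -1 } then $limit: 1.
const result = Object.entries(totals)
  .map(([state, totalPop]) => ({ _id: state, totalPop }))
  .sort((a, b) => b.totalPop - a.totalPop)
  .slice(0, 1);

console.log(result);  // [ { _id: 'WA', totalPop: 729012 } ]
```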
Expressions 
Expressions can include field paths and system variables (page 420), literals (page 421), expression objects (page 421), 
and expression operators (page 421). Expressions can be nested. 
Field Path and System Variables 
Aggregation expressions use field paths to access fields in the input documents. To specify a field path, prefix the 
field name, or the dotted field name if the field is in an embedded document, with a dollar sign $. For example, 
"$user" specifies the field path for the user field, and "$user.name" specifies the field path to the "user.name" 
field. 
"$<field>" is equivalent to "$$CURRENT.<field>", where CURRENT (page 429) is a system variable that 
defaults to the root of the current object in most stages, unless stated otherwise in specific stages. CURRENT 
(page 429) can be rebound. 
Along with the CURRENT (page 429) system variable, other system variables (page 428) are also available for use in 
expressions. To use user-defined variables, use $let and $map expressions. To access variables in expressions, use 
a string that prefixes the variable name with $$. 
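A field path such as "$user.name" simply walks the current document from its root. The following sketch illustrates that resolution, including the "$$CURRENT.<field>" equivalence (this models the rule described above, not the server's parser):

```javascript
// Hypothetical sketch of field-path resolution against the current document.
function resolveFieldPath(path, current) {
  let p = path;
  if (p.startsWith("$$CURRENT.")) p = p.slice("$$CURRENT.".length);
  else if (p.startsWith("$")) p = p.slice(1);
  // Walk the dotted path, yielding undefined for missing fields.
  return p.split(".")
          .reduce((doc, field) => (doc == null ? undefined : doc[field]), current);
}

const doc = { user: { name: "jane" } };
console.log(resolveFieldPath("$user.name", doc));           // jane
console.log(resolveFieldPath("$$CURRENT.user.name", doc));  // jane
```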
Literals 
Literals can be of any type. However, MongoDB parses string literals that start with a dollar sign $ as a path to a field 
and numeric/boolean literals in expression objects (page 421) as projection flags. To avoid parsing literals, use the 
$literal expression. 
Expression Objects 
Expression objects have the following form: 
{ <field1>: <expression1>, ... } 
If the expressions are numeric or boolean literals, MongoDB treats the literals as projection flags (e.g. 1 or true to 
include the field), valid only in the $project stage. To avoid treating numeric or boolean literals as projection flags, 
use the $literal expression to wrap the numeric or boolean literals. 
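The parsing rule can be illustrated with a toy evaluator: a string beginning with $ resolves as a field path unless wrapped in $literal. The evaluate function here is a hypothetical sketch, not the server's implementation:

```javascript
// Sketch: strings starting with "$" parse as field paths unless wrapped
// in { $literal: ... }, which passes the value through unparsed.
function evaluate(expr, doc) {
  if (typeof expr === "string" && expr.startsWith("$")) {
    return doc[expr.slice(1)];          // treated as a field path
  }
  if (expr !== null && typeof expr === "object" && "$literal" in expr) {
    return expr.$literal;               // value returned without parsing
  }
  return expr;                          // plain literal
}

const doc = { price: 25 };
console.log(evaluate("$price", doc));                // 25  (field path)
console.log(evaluate({ $literal: "$price" }, doc));  // $price  (literal string)
```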
Operator Expressions 
Operator expressions are similar to functions that take arguments. In general, these expressions take an array of 
arguments and have the following form: 
{ <operator>: [ <argument1>, <argument2> ... ] } 
If the operator accepts a single argument, you can omit the outer array designating the argument list: 
{ <operator>: <argument> } 
To avoid parsing ambiguity if the argument is a literal array, you must wrap the literal array in a $literal expression 
or keep the outer array that designates the argument list. 
Boolean Expressions Boolean expressions evaluate their argument expressions as booleans and return a boolean as 
the result. 
In addition to the false boolean value, Boolean expressions evaluate the following as false: null, 0, and 
undefined values. Boolean expressions evaluate all other values as true, including non-zero numeric values 
and arrays. 
Name Description 
$and Returns true only when all its expressions evaluate to true. Accepts any number of argument 
expressions. 
$not Returns the boolean value that is the opposite of its argument expression. Accepts a single argument 
expression. 
$or Returns true when any of its expressions evaluates to true. Accepts any number of argument 
expressions. 
Set Expressions Set expressions perform set operations on arrays, treating arrays as sets. Set expressions ignore 
duplicate entries in each input array and the order of the elements. 
If the set operation returns a set, the operation filters out duplicates in the result to output an array that contains only 
unique entries. The order of the elements in the output array is unspecified. 
If a set contains a nested array element, the set expression does not descend into the nested array but evaluates the 
array at top-level. 
Name Description 
$allElementsTrue Returns true if no element of a set evaluates to false; otherwise, returns false. Accepts a 
single argument expression. 
$anyElementTrue Returns true if any elements of a set evaluate to true; otherwise, returns false. Accepts a 
single argument expression. 
$setDifference Returns a set with elements that appear in the first set but not in the second set; i.e. performs a 
relative complement6 of the second set relative to the first. Accepts exactly two argument 
expressions. 
$setEquals Returns true if the input sets have the same distinct elements. Accepts two or more argument 
expressions. 
$setIntersection Returns a set with elements that appear in all of the input sets. Accepts any number of argument 
expressions. 
$setIsSubset Returns true if all elements of the first set appear in the second set, including when the first set 
equals the second set; i.e. not a strict subset7. Accepts exactly two argument expressions. 
$setUnion Returns a set with elements that appear in any of the input sets. Accepts any number of argument 
expressions. 
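The set semantics described above (duplicates and element order in the inputs are ignored) can be sketched with JavaScript Set objects, shown here for $setIntersection and $setDifference:

```javascript
// Sketches of $setIntersection and $setDifference semantics: input arrays
// are deduplicated first, and the order of elements does not matter.
function setIntersection(a, b) {
  const bSet = new Set(b);
  return [...new Set(a)].filter(x => bSet.has(x));
}

function setDifference(a, b) {
  const bSet = new Set(b);
  return [...new Set(a)].filter(x => !bSet.has(x));
}

console.log(setIntersection(["a", "b", "a"], ["b", "c"]));  // [ 'b' ]
console.log(setDifference(["a", "b", "a"], ["b", "c"]));    // [ 'a' ]
```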
Comparison Expressions Comparison expressions return a boolean except for $cmp which returns a number. 
The comparison expressions take two argument expressions and compare both value and type, using the specified 
BSON comparison order (page 168) for values of different types. 
Name Description 
$cmp Returns: 0 if the two values are equivalent, 1 if the first value is greater than the second, and -1 if the 
first value is less than the second. 
$eq Returns true if the values are equivalent. 
$gt Returns true if the first value is greater than the second. 
$gte Returns true if the first value is greater than or equal to the second. 
$lt Returns true if the first value is less than the second. 
$lte Returns true if the first value is less than or equal to the second. 
$ne Returns true if the values are not equivalent. 
Arithmetic Expressions Arithmetic expressions perform mathematical operations on numbers. Some arithmetic expressions 
can also support date arithmetic. 
6http://en.wikipedia.org/wiki/Complement_(set_theory) 
7http://en.wikipedia.org/wiki/Subset 
Name Description 
$add Adds numbers to return the sum, or adds numbers and a date to return a new date. If adding numbers 
and a date, treats the numbers as milliseconds. Accepts any number of argument expressions, but at 
most, one expression can resolve to a date. 
$divide Returns the result of dividing the first number by the second. Accepts two argument expressions. 
$mod Returns the remainder of the first number divided by the second. Accepts two argument expressions. 
$multiply Multiplies numbers to return the product. Accepts any number of argument expressions. 
$subtract Returns the result of subtracting the second value from the first. If the two values are numbers, return 
the difference. If the two values are dates, return the difference in milliseconds. If the two values are a 
date and a number in milliseconds, return the resulting date. Accepts two argument expressions. If the 
two values are a date and a number, specify the date argument first as it is not meaningful to subtract a 
date from a number. 
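The date semantics above can be illustrated outside MongoDB. The following is a plain JavaScript sketch (no server required) of how $add and $subtract treat a date operand; the helper names aggAdd and aggSubtract are illustrative, not MongoDB APIs:

```javascript
// Sketch of the date arithmetic described above: when one operand is a
// date, $add treats the numeric operand as milliseconds; $subtract of
// two dates yields the difference in milliseconds.
function aggAdd(a, b) {
  if (a instanceof Date) return new Date(a.getTime() + b); // date + ms -> date
  if (b instanceof Date) return new Date(b.getTime() + a);
  return a + b;                                            // number + number
}

function aggSubtract(a, b) {
  if (a instanceof Date && b instanceof Date) return a.getTime() - b.getTime(); // ms diff
  if (a instanceof Date) return new Date(a.getTime() - b); // date - ms -> date
  return a - b;                                            // number - number
}

const ord = new Date("2012-11-02T17:04:11.102Z");
const oneHourLater = aggAdd(ord, 60 * 60 * 1000);
const diffMs = aggSubtract(oneHourLater, ord);
```

As in the table, the date argument is given first when subtracting a number from a date.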
String Expressions String expressions, with the exception of $concat, only have a well-defined behavior for 
strings of ASCII characters. 
$concat behavior is well-defined regardless of the characters used. 
Name Description 
$concat Concatenates any number of strings. 
$strcasecmp Performs case-insensitive string comparison and returns: 0 if the two strings are equivalent, 1 if the first
string is greater than the second, and -1 if the first string is less than the second.
$substr Returns a substring of a string, starting at a specified index position up to a specified length. Accepts 
three expressions as arguments: the first argument must resolve to a string, and the second and third 
arguments must resolve to integers. 
$toLower Converts a string to lowercase. Accepts a single argument expression. 
$toUpper Converts a string to uppercase. Accepts a single argument expression. 
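The $strcasecmp return values can be sketched in plain JavaScript (this models the ASCII behavior the table describes; strcasecmp is an illustrative helper name, not a MongoDB API):

```javascript
// Case-insensitive comparison returning 0, 1, or -1, matching the
// $strcasecmp contract for ASCII strings.
function strcasecmp(a, b) {
  const x = a.toUpperCase();
  const y = b.toUpperCase();
  if (x === y) return 0;
  return x > y ? 1 : -1;
}
```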
Text Search Expressions 
Name Description 
$meta Access text search metadata. 
Array Expressions 
Name Description 
$size Returns the number of elements in the array. Accepts a single expression as argument. 
Variable Expressions 
Name Description 
$let Defines variables for use within the scope of a subexpression and returns the result of the subexpression. 
Accepts named parameters. 
$map Applies a subexpression to each element of an array and returns the array of resulting values in order. 
Accepts named parameters. 
Literal Expressions 
Name Description 
$literal Returns a value without parsing. Use for values that the aggregation pipeline may interpret as an
expression. For example, use a $literal expression for a string that starts with a $ to avoid parsing it as a field path.
7.4. Aggregation Reference 423
MongoDB Documentation, Release 2.6.4 
Date Expressions 
Name Description 
$dayOfMonth Returns the day of the month for a date as a number between 1 and 31. 
$dayOfWeek Returns the day of the week for a date as a number between 1 (Sunday) and 7 (Saturday). 
$dayOfYear Returns the day of the year for a date as a number between 1 and 366 (leap year). 
$hour Returns the hour for a date as a number between 0 and 23. 
$millisecond Returns the milliseconds of a date as a number between 0 and 999. 
$minute Returns the minute for a date as a number between 0 and 59. 
$month Returns the month for a date as a number between 1 (January) and 12 (December). 
$second Returns the seconds for a date as a number between 0 and 60 (leap seconds). 
$week Returns the week number for a date as a number between 0 (the partial week that precedes the first 
Sunday of the year) and 53 (leap year). 
$year Returns the year for a date as a number (e.g. 2014). 
Conditional Expressions 
Name Description 
$cond A ternary operator that evaluates one expression and, depending on the result, returns the value of one of the other two expressions. Accepts either three expressions in an ordered list or three named parameters. 
$ifNull Returns either the non-null result of the first expression or the result of the second expression if the first expression results in a null result. A null result encompasses instances of undefined values or missing 
fields. Accepts two expressions as arguments. The result of the second expression can be null. 
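A plain JavaScript sketch of these two conditionals (cond and ifNull are illustrative helper names; MongoDB's "null result" also covers missing fields, modeled here as undefined):

```javascript
// $cond: evaluate a test and return one of two values.
function cond(test, thenVal, elseVal) {
  return test ? thenVal : elseVal;
}

// $ifNull: return the second value when the first is null or undefined.
function ifNull(first, second) {
  return (first === null || first === undefined) ? second : first;
}
```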
Accumulators 
Accumulators, available only for the $group stage, compute values by combining documents that share the same 
group key. Accumulators take as input a single expression, evaluating the expression once for each input document, 
and maintain their state for the group of documents. 
Name Description 
$addToSet Returns an array of unique expression values for each group. Order of the array elements is 
undefined. 
$avg Returns an average for each group. Ignores non-numeric values. 
$first Returns a value from the first document for each group. Order is only defined if the documents are 
in a defined order. 
$last Returns a value from the last document for each group. Order is only defined if the documents are 
in a defined order. 
$max Returns the highest expression value for each group. 
$min Returns the lowest expression value for each group. 
$push Returns an array of expression values for each group. 
$sum Returns a sum for each group. Ignores non-numeric values. 
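The per-group state that accumulators maintain can be sketched in plain JavaScript. This is not the $group implementation, just a model of the behavior the paragraph above describes, showing $sum and $addToSet over documents sharing a group key (groupBy is an illustrative helper name):

```javascript
// Each accumulator sees every document in its group once and keeps
// running state: a total for $sum, a set of unique values for $addToSet.
function groupBy(docs, keyField) {
  const groups = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    if (!groups.has(key)) groups.set(key, { _id: key, total: 0, skus: new Set() });
    const g = groups.get(key);
    g.total += doc.price;   // $sum: accumulate a running total
    g.skus.add(doc.sku);    // $addToSet: unique values, order undefined
  }
  return [...groups.values()];
}

const results = groupBy(
  [ { cust_id: "abc123", price: 50, sku: "xxx" },
    { cust_id: "abc123", price: 25, sku: "xxx" },
    { cust_id: "xyz789", price: 10, sku: "yyy" } ],
  "cust_id");
```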
7.4.2 Aggregation Commands Comparison 
The following table provides a brief overview of the features of the MongoDB aggregation commands. 
Description 
• aggregate: New in version 2.2. Designed with specific goals of improving performance and usability for aggregation tasks. Uses a “pipeline” approach where objects are transformed as they pass through a series of pipeline operators such as $group, $match, and $sort. See http://docs.mongodb.org/manual/reference/operator/aggregation for more information on the pipeline operators. 
• mapReduce: Implements the Map-Reduce aggregation for processing large data sets. 
• group: Provides grouping functionality. Is slower than the aggregate command and has less functionality than the mapReduce command. 

Key Features 
• aggregate: Pipeline operators can be repeated as needed. Pipeline operators need not produce one output document for every input document. Can also generate new documents or filter out documents. 
• mapReduce: In addition to grouping operations, can perform complex aggregation tasks as well as perform incremental aggregation on continuously growing datasets. See Map-Reduce Examples (page 411) and Perform Incremental Map-Reduce (page 413). 
• group: Can either group by existing fields or, with a custom keyf JavaScript function, group by calculated fields. See group for information and an example using the keyf function. 

Flexibility 
• aggregate: Limited to the operators and expressions supported by the aggregation pipeline. However, can add computed fields, create new virtual sub-objects, and extract sub-fields into the top level of results by using the $project pipeline operator. See $project for more information as well as http://docs.mongodb.org/manual/reference/operator/aggregation for more information on all the available pipeline operators. 
• mapReduce: Custom map, reduce, and finalize JavaScript functions offer flexibility to aggregation logic. See mapReduce for details and restrictions on the functions. 
• group: Custom reduce and finalize JavaScript functions offer flexibility to grouping logic. See group for details and restrictions on these functions. 

Output Results 
• aggregate: Returns results in various options (inline as a document that contains the result set, or a cursor to the result set) or stores the results in a collection. The result is subject to the BSON document size limit if returned inline as a document that contains the result set. Changed in version 2.6: Can return results as a cursor or store the results to a collection. 
• mapReduce: Returns results in various options (inline, new collection, merge, replace, reduce). See mapReduce for details on the output options. Changed in version 2.2: Provides much better support for sharded map-reduce output than previous versions. 
• group: Returns results inline as an array of grouped items. The result set must fit within the maximum BSON document size limit. Changed in version 2.2: The returned array can contain at most 20,000 elements; i.e. at most 20,000 unique groupings. Previous versions had a limit of 10,000 elements. 

Sharding 
• aggregate: Supports non-sharded and sharded input collections. 
• mapReduce: Supports non-sharded and sharded input collections. 
• group: Does not support sharded collections. 

Notes 
• mapReduce: Prior to 2.4, JavaScript code executed in a single thread. 
• group: Prior to 2.4, JavaScript code executed in a single thread. 

More Information 
• aggregate: See Aggregation Pipeline (page 391) and aggregate. 
• mapReduce: See Map-Reduce (page 394) and mapReduce. 
• group: See group. 
7.4.3 SQL to Aggregation Mapping Chart 
The aggregation pipeline (page 391) allows MongoDB to provide native aggregation capabilities that correspond to 
many common data aggregation operations in SQL. 
The following table provides an overview of common SQL aggregation terms, functions, and concepts and the corresponding 
MongoDB aggregation operators: 
SQL Terms, 
Functions, and 
Concepts 
MongoDB Aggregation Operators 
WHERE $match 
GROUP BY $group 
HAVING $match 
SELECT $project 
ORDER BY $sort 
LIMIT $limit 
SUM() $sum 
COUNT() $sum 
join No direct corresponding operator; however, the $unwind operator allows for 
somewhat similar functionality, but with fields embedded within the document. 
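The "somewhat similar" behavior of $unwind can be sketched in plain JavaScript: it emits one output document per array element, with the array field replaced by that element (unwind is an illustrative helper name, not the server implementation):

```javascript
// One output document per element of the named array field.
function unwind(docs, arrayField) {
  const out = [];
  for (const doc of docs) {
    for (const element of doc[arrayField]) {
      // Copy the document, replacing the array with a single element.
      out.push({ ...doc, [arrayField]: element });
    }
  }
  return out;
}

const unwound = unwind(
  [ { _id: 1, items: [ { sku: "xxx", qty: 25 }, { sku: "yyy", qty: 25 } ] } ],
  "items");
```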
Examples 
The following table presents a quick reference of SQL aggregation statements and the corresponding MongoDB statements. 
The examples in the table assume the following conditions: 
• The SQL examples assume two tables, orders and order_lineitem that join by the 
order_lineitem.order_id and the orders.id columns. 
• The MongoDB examples assume one collection orders that contains documents of the following prototype: 
{ 
cust_id: "abc123", 
ord_date: ISODate("2012-11-02T17:04:11.102Z"), 
status: 'A', 
price: 50, 
items: [ { sku: "xxx", qty: 25, price: 1 }, 
{ sku: "yyy", qty: 25, price: 1 } ] 
} 
SQL Example: 
SELECT COUNT(*) AS count 
FROM orders 

MongoDB Example: 
db.orders.aggregate( [ 
   { 
     $group: { 
       _id: null, 
       count: { $sum: 1 } 
     } 
   } 
] ) 

Description: Count all records from orders. 

SQL Example: 
SELECT SUM(price) AS total 
FROM orders 

MongoDB Example: 
db.orders.aggregate( [ 
   { 
     $group: { 
       _id: null, 
       total: { $sum: "$price" } 
     } 
   } 
] ) 

Description: Sum the price field from orders. 

SQL Example: 
SELECT cust_id, 
       SUM(price) AS total 
FROM orders 
GROUP BY cust_id 

MongoDB Example: 
db.orders.aggregate( [ 
   { 
     $group: { 
       _id: "$cust_id", 
       total: { $sum: "$price" } 
     } 
   } 
] ) 

Description: For each unique cust_id, sum the price field. 

SQL Example: 
SELECT cust_id, 
       SUM(price) AS total 
FROM orders 
GROUP BY cust_id 
ORDER BY total 

MongoDB Example: 
db.orders.aggregate( [ 
   { 
     $group: { 
       _id: "$cust_id", 
       total: { $sum: "$price" } 
     } 
   }, 
   { $sort: { total: 1 } } 
] ) 

Description: For each unique cust_id, sum the price field, results sorted by sum. 

SQL Example: 
SELECT cust_id, 
       ord_date, 
       SUM(price) AS total 
FROM orders 
GROUP BY cust_id, 
         ord_date 

MongoDB Example: 
db.orders.aggregate( [ 
   { 
     $group: { 
       _id: { 
         cust_id: "$cust_id", 
         ord_date: { 
           month: { $month: "$ord_date" }, 
           day: { $dayOfMonth: "$ord_date" }, 
           year: { $year: "$ord_date" } 
         } 
       }, 
       total: { $sum: "$price" } 
     } 
   } 
] ) 

Description: For each unique cust_id, ord_date grouping, sum the price field. Excludes the time portion of the date. 

SQL Example: 
SELECT cust_id, 
       count(*) 
FROM orders 
GROUP BY cust_id 
HAVING count(*) > 1 

MongoDB Example: 
db.orders.aggregate( [ 
   { 
     $group: { 
       _id: "$cust_id", 
       count: { $sum: 1 } 
     } 
   }, 
   { $match: { count: { $gt: 1 } } } 
] ) 

Description: For cust_id with multiple records, return the cust_id and the corresponding record count. 
7.4.4 Aggregation Interfaces 
Aggregation Commands 
Name Description 
aggregate Performs aggregation tasks (page 391) such as group using the aggregation framework. 
count Counts the number of documents in a collection. 
distinct Displays the distinct values found for a specified key in a collection. 
group Groups documents in a collection by the specified key and performs simple aggregation. 
mapReduce Performs map-reduce (page 394) aggregation for large data sets. 
Aggregation Methods 
Name Description 
db.collection.aggregate() Provides access to the aggregation pipeline (page 391). 
db.collection.group() Groups documents in a collection by the specified key and performs simple 
aggregation. 
db.collection.mapReduce() Performs map-reduce (page 394) aggregation for large data sets. 
7.4.5 Variables in Aggregation Expressions 
Aggregation expressions (page 420) can use both user-defined and system variables. 
Variables can hold any BSON type data (page 167). To access the value of the variable, use a string with the variable 
name prefixed with double dollar signs ($$). 
If the variable references an object, to access a specific field in the object, use the dot notation; i.e. 
"$$<variable>.<field>". 
User Variables 
User variable names can contain the ASCII characters [_a-zA-Z0-9] and any non-ASCII character. 
User variable names must begin with a lowercase ASCII letter [a-z] or a non-ASCII character. 
System Variables 
MongoDB offers the following system variables: 
Variable Description 
ROOT References the root document, i.e. the top-level document, currently being processed in the aggregation 
pipeline stage. 
CURRENT References the start of the field path being processed in the aggregation pipeline stage. Unless 
documented otherwise, all stages start with CURRENT (page 429) the same as ROOT (page 429). 
CURRENT (page 429) is modifiable. However, since $<field> is equivalent to $$CURRENT.<field>, 
rebinding CURRENT (page 429) changes the meaning of $ accesses. 
DESCEND One of the allowed results of a $redact expression. 
PRUNE One of the allowed results of a $redact expression. 
KEEP One of the allowed results of a $redact expression. 
See also: 
$let, $redact 
CHAPTER 8 
Indexes 
Indexes provide high performance read operations for frequently used queries. 
This section introduces indexes in MongoDB, describes the types and configuration options for indexes, and describes 
special types of indexing MongoDB supports. The section also provides tutorials detailing procedures and operational 
concerns, and provides information on how applications may use indexes. 
Index Introduction (page 431) An introduction to indexes in MongoDB. 
Index Concepts (page 436) The core documentation of indexes in MongoDB, including geospatial and text indexes. 
Index Types (page 437) MongoDB provides different types of indexes for different purposes and different types 
of content. 
Index Properties (page 456) The properties you can specify when building indexes. 
Index Creation (page 460) The options available when creating indexes. 
Index Intersection (page 462) The use of index intersection to fulfill a query. 
Indexing Tutorials (page 464) Examples of operations involving indexes, including index creation and querying indexes. 
Indexing Reference (page 500) Reference material for indexes in MongoDB. 
8.1 Index Introduction 
Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must scan every document 
in a collection to select those documents that match the query statement. These collection scans are inefficient because 
they require mongod to process a larger volume of data than an index for each operation. 
Indexes are special data structures 1 that store a small portion of the collection’s data set in an easy to traverse form. 
The index stores the value of a specific field or set of fields, ordered by the value of the field. 
Fundamentally, indexes in MongoDB are similar to indexes in other database systems. MongoDB defines indexes at 
the collection level and supports indexes on any field or sub-field of the documents in a MongoDB collection. 
If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must 
inspect. In some cases, MongoDB can use the data from the index to determine which documents match a query. The 
following diagram illustrates a query that selects documents using an index. 
1 MongoDB indexes use a B-tree data structure. 
Figure 8.1: Diagram of a query selecting documents using an index. MongoDB narrows the query by scanning the 
range of documents with values of score less than 30. 
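The narrowing shown in the figure can be sketched in plain JavaScript: because index entries are kept in sorted order, a range query can stop scanning at the bound instead of visiting every document (rangeLessThan and the entry layout are illustrative, not MongoDB's B-tree internals):

```javascript
// indexEntries are sorted ascending by value, like an index on score.
// A query for values less than the bound stops at the first entry that
// reaches the bound, rather than scanning the whole collection.
function rangeLessThan(indexEntries, bound) {
  const matches = [];
  for (const entry of indexEntries) {
    if (entry.value >= bound) break;  // ordered entries allow early exit
    matches.push(entry.docId);
  }
  return matches;
}

const scoreIndex = [
  { value: 15, docId: "a" },
  { value: 25, docId: "b" },
  { value: 30, docId: "c" },
  { value: 45, docId: "d" } ];
const under30 = rangeLessThan(scoreIndex, 30);
```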
8.1.1 Optimization 
Consider the documentation of the query optimizer (page 61) for more information on the relationship between queries 
and indexes. 
Create indexes to support common and user-facing queries. Having these indexes will ensure that MongoDB only 
scans the smallest possible number of documents. 
Indexes can also optimize the performance of other operations in specific situations: 
Sorted Results 
MongoDB can use indexes to return documents sorted by the index key directly from the index without requiring an 
additional sort phase. 
Covered Results 
When the query criteria and the projection of a query include only the indexed fields, MongoDB will return results 
directly from the index without scanning any documents or bringing documents into memory. These covered queries 
can be very efficient. 
8.1.2 Index Types 
MongoDB provides a number of different index types to support specific types of data and queries. 
432 Chapter 8. Indexes
Figure 8.2: Diagram of a query that uses an index to select and return sorted results. The index stores score values 
in ascending order. MongoDB can traverse the index in either ascending or descending order to return sorted results. 
Figure 8.3: Diagram of a query that uses only the index to match the query criteria and return the results. MongoDB 
does not need to inspect data outside of the index to fulfill the query. 
8.1. Index Introduction 433
Default _id 
All MongoDB collections have an index on the _id field that exists by default. If applications do not specify a value 
for _id the driver or the mongod will create an _id field with an ObjectId value. 
The _id index is unique, and prevents clients from inserting two documents with the same value for the _id field. 
Single Field 
In addition to the MongoDB-defined _id index, MongoDB supports user-defined indexes on a single field of a document 
(page 438). Consider the following illustration of a single-field index: 
Figure 8.4: Diagram of an index on the score field (ascending). 
Compound Index 
MongoDB also supports user-defined indexes on multiple fields. These compound indexes (page 440) behave like 
single-field indexes; however, the query can select documents based on additional fields. The order of fields listed 
in a compound index has significance. For instance, if a compound index consists of { userid: 1, score: 
-1 }, the index sorts first by userid and then, within each userid value, sorts by score. Consider the following 
illustration of this compound index: 
Multikey Index 
MongoDB uses multikey indexes (page 442) to index the content stored in arrays. If you index a field that holds an 
array value, MongoDB creates separate index entries for every element of the array. These multikey indexes (page 442) 
allow queries to select documents that contain arrays by matching on element or elements of the arrays. MongoDB 
automatically determines whether to create a multikey index if the indexed field contains an array value; you do not 
need to explicitly specify the multikey type. 
Consider the following illustration of a multikey index: 
Geospatial Index 
To support efficient queries of geospatial coordinate data, MongoDB provides two special indexes: 2d indexes 
(page 451) that use planar geometry when returning results and 2dsphere indexes (page 447) that use spherical geometry 
to return results. 
Figure 8.5: Diagram of a compound index on the userid field (ascending) and the score field (descending). The 
index sorts first by the userid field and then by the score field. 
Figure 8.6: Diagram of a multikey index on the addr.zip field. The addr field contains an array of address 
documents. The address documents contain the zip field. 
See 2d Index Internals (page 452) for a high level introduction to geospatial indexes. 
Text Indexes 
MongoDB provides a text index type that supports searching for string content in a collection. These text indexes 
do not store language-specific stop words (e.g. “the”, “a”, “or”) and stem the words in a collection to only store root 
words. 
See Text Indexes (page 454) for more information on text indexes and search. 
Hashed Indexes 
To support hash based sharding (page 621), MongoDB provides a hashed index (page 455) type, which indexes the 
hash of the value of a field. These indexes have a more random distribution of values along their range, but only 
support equality matches and cannot support range-based queries. 
8.1.3 Index Properties 
Unique Indexes 
The unique (page 457) property for an index causes MongoDB to reject duplicate values for the indexed field. To 
create a unique index (page 457) on a field that already has duplicate values, see Drop Duplicates (page 461) for 
index creation options. Other than the unique constraint, unique indexes are functionally interchangeable with other 
MongoDB indexes. 
Sparse Indexes 
The sparse (page 457) property of an index ensures that the index only contain entries for documents that have the 
indexed field. The index skips documents that do not have the indexed field. 
You can combine the sparse index option with the unique index option to reject documents that have duplicate values 
for a field but ignore documents that do not have the indexed key. 
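The sparse-plus-unique combination can be sketched in plain JavaScript: documents missing the indexed field are skipped (so they never collide), while duplicate values are rejected (makeSparseUniqueIndex is an illustrative helper name, not how mongod enforces the constraint):

```javascript
// Returns an insert validator for one field with sparse + unique semantics.
function makeSparseUniqueIndex(field) {
  const seen = new Set();
  return function tryInsert(doc) {
    if (!(field in doc)) return true;       // sparse: docs without the field are not indexed
    if (seen.has(doc[field])) return false; // unique: reject duplicate values
    seen.add(doc[field]);
    return true;
  };
}

const insert = makeSparseUniqueIndex("email");
const r1 = insert({ email: "a@example.com" }); // accepted
const r2 = insert({ name: "no email" });       // accepted: field missing
const r3 = insert({ name: "also none" });      // accepted: field missing
const r4 = insert({ email: "a@example.com" }); // rejected: duplicate value
```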
8.1.4 Index Intersection 
New in version 2.6. 
MongoDB can use the intersection of indexes (page 462) to fulfill queries. For queries that specify compound query 
conditions, if one index can fulfill a part of a query condition, and another index can fulfill another part of the query 
condition, then MongoDB can use the intersection of the two indexes to fulfill the query. Whether the use of a 
compound index or the use of an index intersection is more efficient depends on the particular query and the system. 
For details on index intersection, see Index Intersection (page 462). 
8.2 Index Concepts 
These documents describe and provide examples of the types, configuration options, and behavior of indexes in MongoDB. 
For an overview of indexing, see Index Introduction (page 431). For operational instructions, see Indexing 
Tutorials (page 464). The Indexing Reference (page 500) documents the commands and operations specific to index 
construction, maintenance, and querying in MongoDB, including index types and creation options. 
Index Types (page 437) MongoDB provides different types of indexes for different purposes and different types of 
content. 
Single Field Indexes (page 438) A single field index only includes data from a single field of the documents in 
a collection. MongoDB supports single field indexes on fields at the top level of a document and on fields 
in sub-documents. 
Compound Indexes (page 440) A compound index includes more than one field of the documents in a collection. 
Multikey Indexes (page 442) A multikey index references an array and records a match if a query includes any 
value in the array. 
Geospatial Indexes and Queries (page 444) Geospatial indexes support location-based searches on data that is 
stored as either GeoJSON objects or legacy coordinate pairs. 
Text Indexes (page 454) Text indexes support search of string content in documents. 
Hashed Index (page 455) Hashed indexes maintain entries with hashes of the values of the indexed field. 
Index Properties (page 456) The properties you can specify when building indexes. 
TTL Indexes (page 456) The TTL index is used for TTL collections, which expire data after a period of time. 
Unique Indexes (page 457) A unique index causes MongoDB to reject all documents that contain a duplicate 
value for the indexed field. 
Sparse Indexes (page 457) A sparse index does not index documents that do not have the indexed field. 
Index Creation (page 460) The options available when creating indexes. 
Index Intersection (page 462) The use of index intersection to fulfill a query. 
8.2.1 Index Types 
MongoDB provides a number of different index types. You can create indexes on any field or embedded field within 
a document or sub-document. You can create single field indexes (page 438) or compound indexes (page 440). MongoDB 
also supports indexes of arrays, called multi-key indexes (page 442), as well as indexes on geospatial data 
(page 444). For a list of the supported index types, see Index Type Documentation (page 438). 
In general, you should create indexes that support your common and user-facing queries. Having these indexes will 
ensure that MongoDB scans the smallest possible number of documents. 
In the mongo shell, you can create an index by calling the ensureIndex() method. For more detailed instructions 
about building indexes, see the Indexing Tutorials (page 464) page. 
Behavior of Indexes 
All indexes in MongoDB are B-tree indexes, which can efficiently support equality matches and range queries. The 
index stores items internally in order sorted by the value of the index field. The ordering of index entries supports 
efficient range-based operations and allows MongoDB to return sorted results using the order of documents in the 
index. 
Ordering of Indexes 
MongoDB indexes may be ascending (i.e. 1) or descending (i.e. -1) in their ordering. Nevertheless, MongoDB can 
also traverse the index in either direction. As a result, for single-field indexes, ascending and descending indexes are 
8.2. Index Concepts 437
interchangeable. This is not the case for compound indexes: in compound indexes, the direction of the sort order can 
have a greater impact on the results. 
See Sort Order (page 441) for more information on the impact of index order on results in compound indexes. 
Index Intersection 
MongoDB can use the intersection of indexes to fulfill queries with compound conditions. See Index Intersection 
(page 462) for details. 
Limits 
Certain restrictions apply to indexes, such as the length of the index keys or the number of indexes per collection. See 
Index Limitations for details. 
Index Type Documentation 
Single Field Indexes (page 438) A single field index only includes data from a single field of the documents in a 
collection. MongoDB supports single field indexes on fields at the top level of a document and on fields in 
sub-documents. 
Compound Indexes (page 440) A compound index includes more than one field of the documents in a collection. 
Multikey Indexes (page 442) A multikey index references an array and records a match if a query includes any value 
in the array. 
Geospatial Indexes and Queries (page 444) Geospatial indexes support location-based searches on data that is stored 
as either GeoJSON objects or legacy coordinate pairs. 
Text Indexes (page 454) Text indexes support search of string content in documents. 
Hashed Index (page 455) Hashed indexes maintain entries with hashes of the values of the indexed field. 
Single Field Indexes 
MongoDB provides complete support for indexes on any field in a collection of documents. By default, all collections 
have an index on the _id field (page 439), and applications and users may add additional indexes to support important 
queries and operations. 
MongoDB supports indexes that contain either a single field or multiple fields depending on the operations that this 
index-type supports. This document describes indexes that contain a single field. Consider the following illustration 
of a single field index. 
See also: 
Compound Indexes (page 440) for information about indexes that include multiple fields, and Index Introduction 
(page 431) for a higher level introduction to indexing in MongoDB. 
Example Given the following document in the friends collection: 
{ "_id" : ObjectId(...), 
"name" : "Alice" 
"age" : 27 
} 
Figure 8.7: Diagram of an index on the score field (ascending). 
The following command creates an index on the name field: 
db.friends.ensureIndex( { "name" : 1 } ) 
Cases 
_id Field Index MongoDB creates the _id index, which is an ascending unique index (page 457) on the _id field, 
for all collections when the collection is created. You cannot remove the index on the _id field. 
Think of the _id field as the primary key for a collection. Every document must have a unique _id field. You may 
store any unique value in the _id field. The default value of _id is an ObjectId which is generated when the client 
inserts the document. An ObjectId is a 12-byte unique identifier suitable for use as the value of an _id field. 
Note: In sharded clusters, if you do not use the _id field as the shard key, then your application must ensure the 
uniqueness of the values in the _id field to prevent errors. This is most often done by using a standard auto-generated 
ObjectId. 
Before version 2.2, capped collections did not have an _id field. In version 2.2 and newer, capped collections do 
have an _id field, except those in the local database. See Capped Collections Recommendations and Restrictions 
(page 196) for more information. 
Indexes on Embedded Fields You can create indexes on fields embedded in sub-documents, just as you can index 
top-level fields in documents. Indexes on embedded fields differ from indexes on sub-documents (page 440), which 
include the full content of the sub-document in the index, up to the maximum index size. Instead, indexes on 
embedded fields allow you to use dot notation to introspect into sub-documents. 
Consider a collection named people that holds documents that resemble the following example document: 
{"_id": ObjectId(...) 
"name": "John Doe" 
"address": { 
"street": "Main", 
"zipcode": "53511", 
"state": "WI" 
} 
} 
You can create an index on the address.zipcode field, using the following specification: 
db.people.ensureIndex( { "address.zipcode": 1 } ) 
Indexes on Subdocuments You can also create indexes on subdocuments. 
For example, the factories collection contains documents that contain a metro field, such as: 
{ 
_id: ObjectId(...), 
metro: { 
city: "New York", 
state: "NY" 
}, 
name: "Giant Factory" 
} 
The metro field is a subdocument, containing the embedded fields city and state. The following command 
creates an index on the metro field as a whole: 
db.factories.ensureIndex( { metro: 1 } ) 
The following query can use the index on the metro field: 
db.factories.find( { metro: { city: "New York", state: "NY" } } ) 
This query returns the above document. When performing equality matches on subdocuments, field order matters and 
the subdocuments must match exactly. For example, the following query does not match the above document: 
db.factories.find( { metro: { state: "NY", city: "New York" } } ) 
See query-subdocuments for more information regarding querying on subdocuments. 
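The order sensitivity described above can be sketched in plain JavaScript: BSON documents are ordered, so subdocument equality compares fields in sequence rather than as an unordered set (subdocEquals is an illustrative helper, relying on JavaScript's insertion-ordered string keys to model BSON field order):

```javascript
// Exact subdocument match: same number of fields, same field names in
// the same positions, same values.
function subdocEquals(a, b) {
  const aKeys = Object.keys(a);
  const bKeys = Object.keys(b);
  if (aKeys.length !== bKeys.length) return false;
  return aKeys.every((key, i) => key === bKeys[i] && a[key] === b[key]);
}

const stored = { city: "New York", state: "NY" };
const sameOrder = subdocEquals(stored, { city: "New York", state: "NY" });
const swapped = subdocEquals(stored, { state: "NY", city: "New York" });
```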
Compound Indexes 
MongoDB supports compound indexes, where a single index structure holds references to multiple fields 2 within a 
collection’s documents. The following diagram illustrates an example of a compound index on two fields: 
Compound indexes can support queries that match on multiple fields. 
Example 
Consider a collection named products that holds documents that resemble the following document: 
{ 
"_id": ObjectId(...), 
"item": "Banana", 
"category": ["food", "produce", "grocery"], 
"location": "4th Street Store", 
"stock": 4, 
"type": "cases", 
"arrival": Date(...) 
} 
If applications query on the item field as well as query on both the item field and the stock field, you can specify 
a single compound index to support both of these queries: 
2 MongoDB imposes a limit of 31 fields for any compound index. 
Figure 8.8: Diagram of a compound index on the userid field (ascending) and the score field (descending). The 
index sorts first by the userid field and then by the score field. 
db.products.ensureIndex( { "item": 1, "stock": 1 } ) 
Important: You may not create compound indexes that have hashed index fields. You will receive an error if you 
attempt to create a compound index that includes a hashed index (page 455). 
The order of the fields in a compound index is very important. In the previous example, the index will contain 
references to documents sorted first by the values of the item field and, within each value of the item field, sorted 
by values of the stock field. See Sort Order (page 441) for more information. 
In addition to supporting queries that match on all the index fields, compound indexes can support queries that match 
on the prefix of the index fields. For details, see Prefixes (page 442). 
Sort Order Indexes store references to fields in either ascending (1) or descending (-1) sort order. For single-field 
indexes, the sort order of keys doesn’t matter because MongoDB can traverse the index in either direction. However, 
for compound indexes (page 440), sort order can matter in determining whether the index can support a sort operation. 
Consider a collection events that contains documents with the fields username and date. Applications can issue 
queries that return results sorted first by ascending username values and then by descending (i.e. more recent to last) 
date values, such as: 
db.events.find().sort( { username: 1, date: -1 } ) 
or queries that return results sorted first by descending username values and then by ascending date values, such 
as: 
db.events.find().sort( { username: -1, date: 1 } ) 
The following index can support both these sort operations: 
db.events.ensureIndex( { "username" : 1, "date" : -1 } ) 
However, the above index cannot support sorting by ascending username values and then by ascending date 
values, such as the following: 
db.events.find().sort( { username: 1, date: 1 } ) 
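To make the rule concrete, here is an illustrative check. It is a simplification of what the query planner does, not MongoDB's actual code: it treats a sort as index-supported only when the sort pattern equals the index key pattern exactly, or equals its complete inverse (the index can be traversed backwards):

```javascript
// Sketch: a compound index supports a sort if the sort pattern matches
// the index key pattern, or matches its complete inverse.
function sortSupported(indexSpec, sortSpec) {
  const iKeys = Object.keys(indexSpec);
  const sKeys = Object.keys(sortSpec);
  if (iKeys.length !== sKeys.length) return false;
  const same = sKeys.every((k, i) => k === iKeys[i] && sortSpec[k] === indexSpec[k]);
  const reversed = sKeys.every((k, i) => k === iKeys[i] && sortSpec[k] === -indexSpec[k]);
  return same || reversed;
}

const index = { username: 1, date: -1 };
console.log(sortSupported(index, { username: 1, date: -1 })); // true
console.log(sortSupported(index, { username: -1, date: 1 })); // true (walk the index backwards)
console.log(sortSupported(index, { username: 1, date: 1 }));  // false
```

This sketch ignores sorting on index prefixes, which MongoDB also supports; it only illustrates why mixed directions that match neither the pattern nor its inverse cannot use the index.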
8.2. Index Concepts 441
Prefixes Compound indexes support queries on any prefix of the index fields. Index prefixes are the beginning 
subset of indexed fields. For example, given the index { a: 1, b: 1, c: 1 }, both { a: 1 } and { 
a: 1, b: 1 } are prefixes of the index. 
If you have a collection with a compound index on { a: 1, b: 1 }, as well as an index that consists of a 
prefix of that index, i.e. { a: 1 }, and neither index has a sparse or unique constraint, then you can 
drop the { a: 1 } index. MongoDB will be able to use the compound index in all situations in which it would have 
used the { a: 1 } index. 
For example, given the following index: 
{ "item": 1, "location": 1, "stock": 1 } 
MongoDB can use this index to support queries that include: 
• the item field, 
• the item field and the location field, 
• the item field and the location field and the stock field, or 
• only the item and stock fields; however, this index would be less efficient than an index on only item and 
stock. 
MongoDB cannot use this index to support queries that include: 
• only the location field, 
• only the stock field, or 
• only the location and stock fields. 
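The prefix rule above can be sketched as a small, hypothetical helper (not part of MongoDB) that reports whether the queried fields include a leading prefix of the index's field list:

```javascript
// Sketch: an index can serve a query when the queried fields contain
// at least the first (leading) field of the index; longer contiguous
// prefixes make the index more selective.
function supportedByPrefix(indexFields, queryFields) {
  const qs = new Set(queryFields);
  let prefixLen = 0;
  while (prefixLen < indexFields.length && qs.has(indexFields[prefixLen])) {
    prefixLen++;
  }
  return prefixLen > 0;
}

const idx = ["item", "location", "stock"];
console.log(supportedByPrefix(idx, ["item"]));                      // true
console.log(supportedByPrefix(idx, ["item", "location"]));          // true
console.log(supportedByPrefix(idx, ["item", "location", "stock"])); // true
console.log(supportedByPrefix(idx, ["item", "stock"]));             // true (via the item prefix, less efficient)
console.log(supportedByPrefix(idx, ["location"]));                  // false
console.log(supportedByPrefix(idx, ["stock"]));                     // false
```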
Index Intersection Starting in version 2.6, MongoDB can use index intersection (page 462) to fulfill queries. The 
choice between creating compound indexes that support your queries or relying on index intersection depends on the 
specifics of your system. See Index Intersection and Compound Indexes (page 463) for more details. 
Multikey Indexes 
To index a field that holds an array value, MongoDB adds an index entry for each element of the array. These multikey indexes 
allow MongoDB to return documents from queries that match on individual values within an array. MongoDB automatically determines 
whether to create a multikey index if the indexed field contains an array value; you do not need to explicitly specify 
the multikey type. 
Consider the following illustration of a multikey index: 
Multikey indexes support all operations supported by other MongoDB indexes; however, applications may use multikey 
indexes to select documents based on ranges of values for the elements of an array. Multikey indexes support arrays 
that hold both scalar values (e.g. strings, numbers) and nested documents. 
Limitations 
Interactions between Compound and Multikey Indexes While you can create multikey compound indexes 
(page 440), at most one field in a compound index may hold an array. For example, given an index on { a: 1, 
b: 1 }, the following documents are permissible: 
{a: [1, 2], b: 1} 
{a: 1, b: [1, 2]} 
Figure 8.9: Diagram of a multikey index on the addr.zip field. The addr field contains an array of address 
documents. The address documents contain the zip field. 
However, the following document is impermissible, and MongoDB cannot insert such a document into a collection 
with the {a: 1, b: 1 } index: 
{a: [1, 2], b: [1, 2]} 
If you attempt to insert such a document, MongoDB will reject the insertion, and produce an error that says cannot 
index parallel arrays. MongoDB does not index parallel arrays because they require the index to include 
each value in the Cartesian product of the compound keys, which could quickly result in incredibly large and difficult 
to maintain indexes. 
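A short, illustrative sketch (not MongoDB internals) shows why parallel arrays are rejected: fanning out two array fields multiplies the number of index entries, producing the Cartesian product described above:

```javascript
// Sketch: multikey indexing creates one entry per array element.
// Indexing two parallel arrays would need one entry per (a, b)
// combination, i.e. the Cartesian product of the two arrays.
function multikeyEntries(doc, fields) {
  let combos = [[]];
  for (const f of fields) {
    const vals = Array.isArray(doc[f]) ? doc[f] : [doc[f]];
    const next = [];
    for (const c of combos) for (const v of vals) next.push([...c, v]);
    combos = next;
  }
  return combos;
}

console.log(multikeyEntries({ a: [1, 2], b: 1 }, ["a", "b"]).length);      // 2 entries (allowed)
console.log(multikeyEntries({ a: [1, 2], b: [1, 2] }, ["a", "b"]).length); // 4 entries (rejected: parallel arrays)
```

With larger arrays the product grows multiplicatively (a 100-element array against a 100-element array would need 10,000 entries for one document), which is why MongoDB refuses the insert instead.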
Shard Keys 
Important: The index of a shard key cannot be a multi-key index. 
Hashed Indexes Hashed indexes are not compatible with multi-key indexes. 
To compute the hash for a hashed index, MongoDB collapses sub-documents and computes the hash for the entire 
value. For fields that hold arrays or sub-documents, you cannot use the index to support queries that introspect the 
sub-document. 
Examples 
Index Basic Arrays Given the following document: 
{ 
"_id" : ObjectId("..."), 
"name" : "Warm Weather", 
"author" : "Steve", 
"tags" : [ "weather", "hot", "record", "april" ] 
} 
Then an index on the tags field, { tags: 1 }, would be a multikey index and would include these four separate 
entries for that document: 
• "weather", 
• "hot", 
• "record", and 
• "april". 
Queries could use the multikey index to select documents matching any of the above values. 
Index Arrays with Embedded Documents You can create multikey indexes on fields in objects embedded in arrays, 
as in the following example: 
Consider a feedback collection with documents in the following form: 
{ 
"_id": ObjectId(...), 
"title": "Grocery Quality", 
"comments": [ 
{ author_id: ObjectId(...), 
date: Date(...), 
text: "Please expand the cheddar selection." }, 
{ author_id: ObjectId(...), 
date: Date(...), 
text: "Please expand the mustard selection." }, 
{ author_id: ObjectId(...), 
date: Date(...), 
text: "Please expand the olive selection." } 
] 
} 
An index on the comments.text field would be a multikey index and would add items to the index for all embedded 
documents in the array. 
With the index { "comments.text": 1 } on the feedback collection, consider the following query: 
db.feedback.find( { "comments.text": "Please expand the olive selection." } ) 
The query would select the documents in the collection that contain the following embedded document in the 
comments array: 
{ author_id: ObjectId(...), 
date: Date(...), 
text: "Please expand the olive selection." } 
Geospatial Indexes and Queries 
MongoDB offers a number of indexes and query mechanisms to handle geospatial information. This section introduces 
MongoDB’s geospatial features. For complete examples of geospatial queries in MongoDB, see Geospatial Index 
Tutorials (page 476). 
Surfaces Before storing your location data and writing queries, you must decide the type of surface to use to perform 
calculations. The type you choose affects how you store data, what type of index to build, and the syntax of your 
queries. 
MongoDB offers two surface types: 
Spherical To calculate geometry over an Earth-like sphere, store your location data on a spherical surface and use a 
2dsphere (page 447) index. 
Store your location data as GeoJSON objects with this coordinate-axis order: longitude, latitude. The coordinate 
reference system for GeoJSON uses the WGS84 datum. 
Flat To calculate distances on a Euclidean plane, store your location data as legacy coordinate pairs and use a 2d 
(page 451) index. 
Location Data If you choose spherical surface calculations, you store location data as either: 
GeoJSON Objects Queries on GeoJSON objects always calculate on a sphere. The default coordinate reference 
system for GeoJSON uses the WGS84 datum. 
New in version 2.4: Support for GeoJSON storage and queries is new in version 2.4. Prior to version 2.4, all geospatial 
data used coordinate pairs. 
Changed in version 2.6: Support for additional GeoJSON types: MultiPoint, MultiLineString, MultiPolygon, GeometryCollection. 
MongoDB supports the following GeoJSON objects: 
• Point 
• LineString 
• Polygon 
• MultiPoint 
• MultiLineString 
• MultiPolygon 
• GeometryCollection 
Legacy Coordinate Pairs MongoDB supports spherical surface calculations on legacy coordinate pairs using a 
2dsphere index by converting the data to the GeoJSON Point type. 
If you choose flat surface calculations and use a 2d index, you can store data only as legacy coordinate pairs. 
Query Operations MongoDB’s geospatial query operators let you query for: 
Inclusion MongoDB can query for locations contained entirely within a specified polygon. Inclusion queries use 
the $geoWithin operator. 
Both 2d and 2dsphere indexes can support inclusion queries. Since version 2.2.3, MongoDB does not require an index for 
inclusion queries; however, these indexes will improve query performance. 
Intersection MongoDB can query for locations that intersect with a specified geometry. These queries apply only 
to data on a spherical surface. These queries use the $geoIntersects operator. 
Only 2dsphere indexes support intersection. 
Proximity MongoDB can query for the points nearest to another point. Proximity queries use the $near operator. 
The $near operator requires a 2d or 2dsphere index. 
Geospatial Indexes MongoDB provides the following geospatial index types to support the geospatial queries. 
2dsphere 2dsphere (page 447) indexes support: 
• Calculations on a sphere 
• GeoJSON objects, with backwards compatibility for legacy coordinate pairs. 
• A compound index with scalar index fields (i.e. ascending or descending) as a prefix or suffix of the 2dsphere 
index field 
New in version 2.4: 2dsphere indexes are not available before version 2.4. 
See also: 
Query a 2dsphere Index (page 478) 
2d 2d (page 451) indexes support: 
• Calculations using flat geometry 
• Legacy coordinate pairs (i.e., geospatial points on a flat coordinate system) 
• A compound index with only one additional field, as a suffix of the 2d index field 
See also: 
Query a 2d Index (page 481) 
Geospatial Indexes and Sharding You cannot use a geospatial index as the shard key index. 
You can create and maintain a geospatial index on a sharded collection if using fields other than shard key. 
For sharded collections, queries using $near are not supported. You can instead use either the geoNear command 
or the $geoNear aggregation stage. 
You also can query for geospatial data using $geoWithin. 
Additional Resources The following pages provide complete documentation for geospatial indexes and queries: 
2dsphere Indexes (page 447) A 2dsphere index supports queries that calculate geometries on an earth-like sphere. 
The index supports data stored as both GeoJSON objects and as legacy coordinate pairs. 
2d Indexes (page 451) The 2d index supports data stored as legacy coordinate pairs and is intended for use in MongoDB 
2.2 and earlier. 
geoHaystack Indexes (page 452) A haystack index is a special index optimized to return results over small areas. For 
queries that use spherical geometry, a 2dsphere index is a better option than a haystack index. 
2d Index Internals (page 452) Provides a more in-depth explanation of the internals of geospatial indexes. This material 
is not necessary for normal operations but may be useful for troubleshooting and for further understanding. 
2dsphere Indexes New in version 2.4. 
A 2dsphere index supports queries that calculate geometries on an earth-like sphere. The index supports data stored 
as both GeoJSON objects and as legacy coordinate pairs. The index supports legacy coordinate pairs by converting 
the data to the GeoJSON Point type. The default datum for an earth-like sphere in MongoDB 2.4 is WGS84. 
Coordinate-axis order is longitude, latitude. 
The 2dsphere index supports all MongoDB geospatial queries: queries for inclusion, intersection and proximity. 
See http://docs.mongodb.org/manual/reference/operator/query-geospatial for the 
query operators that support geospatial queries. 
To create a 2dsphere index, use the db.collection.ensureIndex method. A compound (page 440) 
2dsphere index can reference multiple location and non-location fields within a collection’s documents. See Create 
a 2dsphere Index (page 476) for more information. 
2dsphere Version 2 Changed in version 2.6. 
MongoDB 2.6 introduces a version 2 of 2dsphere indexes. Version 2 is the default version of 2dsphere 
indexes created in MongoDB 2.6. To create a 2dsphere index as a version 1, include the option { 
"2dsphereIndexVersion": 1 } when creating the index. 
Additional GeoJSON Objects Changed in version 2.6. 
Version 2 adds support for additional GeoJSON objects: MultiPoint (page 449), MultiLineString (page 450), MultiPolygon 
(page 450), and GeometryCollection (page 450). 
sparse Property Changed in version 2.6. 
Version 2 2dsphere indexes are sparse (page 457) by default and ignore the sparse: true (page 457) option. If 
a document lacks a 2dsphere index field (or the field is null or an empty array), MongoDB does not add an 
entry for the document to the 2dsphere index. For inserts, MongoDB inserts the document but does not add it to the 
2dsphere index. 
For a compound index that includes a 2dsphere index key along with keys of other types, only the 2dsphere 
index field determines whether the index references a document. 
Earlier versions of MongoDB only support Version 1 2dsphere indexes. Version 1 2dsphere indexes are not 
sparse by default and will reject documents with null location fields. 
Considerations 
geoNear and $geoNear Restrictions The geoNear command and the $geoNear pipeline stage require that 
a collection have at most only one 2dsphere index and/or only one 2d (page 451) index whereas geospatial query 
operators (e.g. $near and $geoWithin) permit collections to have multiple geospatial indexes. 
This geospatial index restriction for the geoNear command and the $geoNear pipeline stage exists because neither 
the geoNear command nor the $geoNear pipeline stage syntax includes the location field. As such, index selection 
among multiple 2d indexes or 2dsphere indexes is ambiguous. 
No such restriction applies for geospatial query operators since these operators take a location field, eliminating the 
ambiguity. 
Shard Key Restrictions You cannot use a 2dsphere index as a shard key when sharding a collection. However, 
you can create and maintain a geospatial index on a sharded collection by using a different field as the shard key. 
GeoJSON Objects MongoDB supports the following GeoJSON objects: 
• Point (page 448) 
• LineString (page 448) 
• Polygon (page 448) 
• MultiPoint (page 449) 
• MultiLineString (page 450) 
• MultiPolygon (page 450) 
• GeometryCollection (page 450) 
The MultiPoint (page 449), MultiLineString (page 450), MultiPolygon (page 450), and GeometryCollection (page 450) 
require 2dsphere index version 2. 
In order to index GeoJSON data, you must store the data in a location field that you name. The location field contains 
a subdocument with a type field specifying the GeoJSON object type and a coordinates field specifying the 
object’s coordinates. Always store coordinates in longitude, latitude order. 
Use the following syntax: 
{ <location field>: { type: "<GeoJSON type>" , coordinates: <coordinates> } } 
Point New in version 2.4. 
The following example stores a GeoJSON Point: 
{ loc: { type: "Point", coordinates: [ 40, 5 ] } } 
LineString New in version 2.4. 
The following example stores a GeoJSON LineString: 
{ loc: { type: "LineString", coordinates: [ [ 40, 5 ], [ 41, 6 ] ] } } 
Polygon New in version 2.4. 
Polygons consist of an array of GeoJSON LinearRing coordinate arrays. These LinearRings are closed 
LineStrings. Closed LineStrings have at least four coordinate pairs and specify the same position as the 
first and last coordinates. 
The line that joins two points on a curved surface may or may not contain the same set of coordinates that joins those 
two points on a flat surface. The line that joins two points on a curved surface will be a geodesic. Carefully check 
points to avoid errors with shared edges, as well as overlaps and other types of intersections. 
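A minimal structural check for a linear ring can be sketched as follows. This hypothetical helper only verifies closure and length; it does not detect self-intersection, which MongoDB also rejects:

```javascript
// Sketch: a GeoJSON linear ring is a closed LineString with at least
// four positions whose first and last positions are identical.
function isClosedRing(ring) {
  if (!Array.isArray(ring) || ring.length < 4) return false;
  const [x0, y0] = ring[0];
  const [xn, yn] = ring[ring.length - 1];
  return x0 === xn && y0 === yn;
}

console.log(isClosedRing([[0, 0], [3, 6], [6, 1], [0, 0]])); // true
console.log(isClosedRing([[0, 0], [3, 6], [6, 1]]));         // false (not closed)
console.log(isClosedRing([[0, 0], [3, 6], [6, 1], [1, 1]])); // false (first != last)
```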
Polygons with a Single Ring The following example stores a GeoJSON Polygon with an exterior ring and no 
interior rings (or holes). Note the first and last coordinate pair with the [ 0 , 0 ] coordinate: 
{ 
loc : 
{ 
type: "Polygon", 
coordinates: [ [ [ 0 , 0 ] , [ 3 , 6 ] , [ 6 , 1 ] , [ 0 , 0 ] ] ] 
} 
} 
For Polygons with a single ring, the ring cannot self-intersect. 
Polygons with Multiple Rings For Polygons with multiple rings: 
• The first described ring must be the exterior ring. 
• The exterior ring cannot self-intersect. 
• Any interior ring must be entirely contained by the outer ring. 
• Interior rings cannot intersect or overlap each other. Interior rings cannot share an edge. 
The following document represents a polygon with an interior ring as GeoJSON: 
{ loc : { 
type : "Polygon", 
coordinates : [ 
[ [ 0 , 0 ] , [ 3 , 6 ] , [ 6 , 1 ] , [ 0 , 0 ] ], 
[ [ 2 , 2 ] , [ 3 , 3 ] , [ 4 , 2 ] , [ 2 , 2 ] ] 
] 
} 
} 
Figure 8.10: Diagram of a Polygon with internal ring. 
MultiPoint New in version 2.6: Requires 2dsphere index version 2. 
The following example stores coordinates of GeoJSON type MultiPoint (http://geojson.org/geojson-spec.html#id5): 
{ loc: { 
type: "MultiPoint", 
coordinates: [ 
[ -73.9580, 40.8003 ], 
[ -73.9498, 40.7968 ], 
[ -73.9737, 40.7648 ], 
[ -73.9814, 40.7681 ] 
] 
} 
} 
MultiLineString New in version 2.6: Requires 2dsphere index version 2. 
The following example stores coordinates of GeoJSON type MultiLineString (http://geojson.org/geojson-spec.html#id6): 
{ loc: 
{ 
type: "MultiLineString", 
coordinates: [ 
[ [ -73.96943, 40.78519 ], [ -73.96082, 40.78095 ] ], 
[ [ -73.96415, 40.79229 ], [ -73.95544, 40.78854 ] ], 
[ [ -73.97162, 40.78205 ], [ -73.96374, 40.77715 ] ], 
[ [ -73.97880, 40.77247 ], [ -73.97036, 40.76811 ] ] 
] 
} 
} 
MultiPolygon New in version 2.6: Requires 2dsphere index version 2. 
The following example stores coordinates of GeoJSON type MultiPolygon (http://geojson.org/geojson-spec.html#id7): 
{ loc: 
{ 
type: "MultiPolygon", 
coordinates: [ 
[ [ [ -73.958, 40.8003 ], [ -73.9498, 40.7968 ], [ -73.9737, 40.7648 ], [ -73.9814, 40.7681 ], [ -73.958, 40.8003 ] ] ], 
[ [ [ -73.958, 40.8003 ], [ -73.9498, 40.7968 ], [ -73.9737, 40.7648 ], [ -73.958, 40.8003 ] ] ] 
] 
} 
} 
GeometryCollection New in version 2.6: Requires 2dsphere index version 2. 
The following example stores coordinates of GeoJSON type GeometryCollection (http://geojson.org/geojson-spec.html#geometrycollection): 
{ loc: 
{ 
type: "GeometryCollection", 
geometries: [ 
{ 
type: "MultiPoint", 
coordinates: [ 
[ -73.9580, 40.8003 ], 
[ -73.9498, 40.7968 ], 
[ -73.9737, 40.7648 ], 
[ -73.9814, 40.7681 ] 
] 
}, 
{ 
type: "MultiLineString", 
coordinates: [ 
[ [ -73.96943, 40.78519 ], [ -73.96082, 40.78095 ] ], 
[ [ -73.96415, 40.79229 ], [ -73.95544, 40.78854 ] ], 
[ [ -73.97162, 40.78205 ], [ -73.96374, 40.77715 ] ], 
[ [ -73.97880, 40.77247 ], [ -73.97036, 40.76811 ] ] 
] 
} 
] 
} 
} 
2d Indexes Use a 2d index for data stored as points on a two-dimensional plane. The 2d index is intended for 
legacy coordinate pairs used in MongoDB 2.2 and earlier. 
Use a 2d index if: 
• your database has legacy location data from MongoDB 2.2 or earlier, and 
• you do not intend to store any location data as GeoJSON objects. 
See http://docs.mongodb.org/manual/reference/operator/query-geospatial for the 
query operators that support geospatial queries. 
Considerations The geoNear command and the $geoNear pipeline stage require that a collection have at most 
only one 2d index and/or only one 2dsphere index (page 447) whereas geospatial query operators (e.g. $near and 
$geoWithin) permit collections to have multiple geospatial indexes. 
This geospatial index restriction for the geoNear command and the $geoNear pipeline stage exists because neither 
the geoNear command nor the $geoNear pipeline stage syntax includes the location field. As such, index selection 
among multiple 2d indexes or 2dsphere indexes is ambiguous. 
No such restriction applies for geospatial query operators since these operators take a location field, eliminating the 
ambiguity. 
Do not use a 2d index if your location data includes GeoJSON objects. To index on both legacy coordinate pairs and 
GeoJSON objects, use a 2dsphere (page 447) index. 
You cannot use a 2d index as a shard key when sharding a collection. However, you can create and maintain a 
geospatial index on a sharded collection by using a different field as the shard key. 
Behavior The 2d index supports calculations on a flat, Euclidean plane. The 2d index also supports distance-only 
calculations on a sphere, but for geometric calculations (e.g. $geoWithin) on a sphere, store data as GeoJSON 
objects and use the 2dsphere index type. 
A 2d index can reference two fields. The first must be the location field. A 2d compound index constructs queries 
that select first on the location field, and then filters those results by the additional criteria. A compound 2d index can 
cover queries. 
Points on a 2D Plane To store location data as legacy coordinate pairs, use an array or an embedded document. 
When possible, use the array format: 
loc : [ <longitude> , <latitude> ] 
Consider the embedded document form: 
loc : { lng : <longitude> , lat : <latitude> } 
Arrays are preferred as certain languages do not guarantee associative map ordering. 
For all points, if you use longitude and latitude, store coordinates in longitude, latitude order. 
sparse Property 2d indexes are sparse (page 457) by default and ignore the sparse: true (page 457) option. If 
a document lacks a 2d index field (or the field is null or an empty array), MongoDB does not add an entry for the 
document to the 2d index. For inserts, MongoDB inserts the document but does not add it to the 2d index. 
For a compound index that includes a 2d index key along with keys of other types, only the 2d index field determines 
whether the index references a document. 
geoHaystack Indexes A geoHaystack index is a special index that is optimized to return results over small 
areas. geoHaystack indexes improve performance on queries that use flat geometry. 
For queries that use spherical geometry, a 2dsphere index is a better option than a haystack index. 2dsphere indexes 
(page 447) allow field reordering; geoHaystack indexes require the first field to be the location field. Also, 
geoHaystack indexes are only usable via commands and so always return all results at once. 
Behavior geoHaystack indexes create “buckets” of documents from the same geographic area in order to improve 
performance for queries limited to that area. Each bucket in a geoHaystack index contains all the documents within 
a specified proximity to a given longitude and latitude. 
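The bucketing idea can be sketched with a hypothetical grid function. This is only an illustration of the concept; geoHaystack's actual bucketing is internal to MongoDB and is configured via the bucketSize option at index creation:

```javascript
// Sketch: group documents into coarse grid cells so that a query over
// a small area only needs to inspect one (or a few) buckets.
function bucketId(lng, lat, bucketSize) {
  return Math.floor(lng / bucketSize) + ":" + Math.floor(lat / bucketSize);
}

const docs = [
  { name: "a", loc: [55.1, 42.2] },
  { name: "b", loc: [55.3, 42.4] },
  { name: "c", loc: [-74.0, 44.7] },
];

const buckets = {};
for (const d of docs) {
  const id = bucketId(d.loc[0], d.loc[1], 1); // 1-degree-wide cells
  if (!buckets[id]) buckets[id] = [];
  buckets[id].push(d.name);
}
console.log(buckets); // "a" and "b" share a bucket; "c" is in its own bucket
```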
sparse Property geoHaystack indexes are sparse (page 457) by default and ignore the sparse: true (page 457) 
option. If a document lacks a geoHaystack index field (or the field is null or an empty array), MongoDB does 
not add an entry for the document to the geoHaystack index. For inserts, MongoDB inserts the document but does 
not add to the geoHaystack index. 
geoHaystack indexes include one geoHaystack index key and one non-geospatial index key; however, only the 
geoHaystack index field determines whether the index references a document. 
Create geoHaystack Index To create a geoHaystack index, see Create a Haystack Index (page 482). For 
information and example on querying a haystack index, see Query a Haystack Index (page 483). 
2d Index Internals This document provides a more in-depth explanation of the internals of MongoDB’s 2d geospatial 
indexes. This material is not necessary for normal operations or application development but may be useful for 
troubleshooting and for further understanding. 
Calculation of Geohash Values for 2d Indexes When you create a geospatial index on legacy coordinate pairs, 
MongoDB computes geohash values for the coordinate pairs within the specified location range (page 480) and then 
indexes the geohash values. 
To calculate a geohash value, recursively divide a two-dimensional map into quadrants. Then assign each quadrant a 
two-bit value. For example, a two-bit representation of four quadrants would be: 
01 11 
00 10 
These two-bit values (00, 01, 10, and 11) represent each of the quadrants and all points within each quadrant. For 
a geohash with two bits of resolution, all points in the bottom left quadrant would have a geohash of 00. The top 
left quadrant would have the geohash of 01. The bottom right and top right would have a geohash of 10 and 11, 
respectively. 
To provide additional precision, continue dividing each quadrant into sub-quadrants. Each sub-quadrant would have 
the geohash value of the containing quadrant concatenated with the value of the sub-quadrant. The geohash for the 
upper-right quadrant is 11, and the geohash for the sub-quadrants would be (clockwise from the top left): 1101, 
1111, 1110, and 1100, respectively. 
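The recursive quadrant subdivision can be sketched as follows. This illustrative function is not MongoDB's implementation; it assumes a map bounded at [0, 8) on both axes purely for the example, and reproduces the two-bit quadrant values and the sub-quadrant example above:

```javascript
// Sketch: build a geohash-style bit string by repeatedly halving the
// plane. Per step, the first bit encodes left/right of the x midpoint,
// the second bit encodes bottom/top of the y midpoint, matching the
// table above (00 bottom-left, 01 top-left, 10 bottom-right, 11 top-right).
function quadrantHash(x, y, bitsOfResolution, min = 0, max = 8) {
  let hash = "";
  let [x0, x1, y0, y1] = [min, max, min, max];
  for (let i = 0; i < bitsOfResolution; i++) {
    const xm = (x0 + x1) / 2, ym = (y0 + y1) / 2;
    hash += x < xm ? "0" : "1"; // left half : right half
    hash += y < ym ? "0" : "1"; // bottom half : top half
    if (x < xm) x1 = xm; else x0 = xm; // narrow to the chosen quadrant
    if (y < ym) y1 = ym; else y0 = ym;
  }
  return hash;
}

console.log(quadrantHash(1, 1, 1)); // "00" (bottom left)
console.log(quadrantHash(1, 7, 1)); // "01" (top left)
console.log(quadrantHash(7, 1, 1)); // "10" (bottom right)
console.log(quadrantHash(7, 7, 1)); // "11" (top right)
console.log(quadrantHash(5, 7, 2)); // "1101" (top-left sub-quadrant of the upper right)
```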
Multi-location Documents for 2d Indexes New in version 2.0: Support for multiple locations in a document. 
While 2d geospatial indexes do not support more than one set of coordinates in a document, you can use a multi-key 
index (page 442) to index multiple coordinate pairs in a single document. In the simplest example you may have a 
field (e.g. locs) that holds an array of coordinates, as in the following example: 
{ _id : ObjectId(...), 
locs : [ [ 55.5 , 42.3 ] , 
[ -74 , 44.74 ] , 
{ lng : 55.5 , lat : 42.3 } ] 
} 
The values of the array may be either arrays, as in [ 55.5, 42.3 ], or embedded documents, as in { lng : 
55.5 , lat : 42.3 }. 
You could then create a geospatial index on the locs field, as in the following: 
db.places.ensureIndex( { "locs": "2d" } ) 
You may also model the location data as a field inside of a sub-document. In this case, the document would contain 
a field (e.g. addresses) that holds an array of documents where each document has a field (e.g. loc:) that holds 
location coordinates. For example: 
{ _id : ObjectId(...), 
name : "...", 
addresses : [ { 
context : "home" , 
loc : [ 55.5, 42.3 ] 
} , 
{ 
context : "home", 
loc : [ -74 , 44.74 ] 
} 
] 
} 
You could then create the geospatial index on the addresses.loc field as in the following example: 
db.records.ensureIndex( { "addresses.loc": "2d" } ) 
To include the location field with the distance field in multi-location document queries, specify includeLocs: 
true in the geoNear command. 
See also: 
geospatial-query-compatibility-chart 
Text Indexes 
New in version 2.4. 
MongoDB provides text indexes to support text search of string content in documents of a collection. 
text indexes can include any field whose value is a string or an array of string elements. To perform queries that 
access the text index, use the $text query operator. 
Changed in version 2.6: MongoDB enables the text search feature by default. In MongoDB 2.4, you need to enable 
the text search feature manually to create text indexes and perform text search (page 455). 
Create Text Index To create a text index, use the db.collection.ensureIndex() method. To index a 
field that contains a string or an array of string elements, include the field and specify the string literal "text" in the 
index document, as in the following example: 
db.reviews.ensureIndex( { comments: "text" } ) 
A collection can have at most one text index. 
For examples of creating text indexes on multiple fields, see Create a text Index (page 486). 
Supported Languages and Stop Words MongoDB supports text search for various languages. text indexes drop 
language-specific stop words (e.g. in English, “the”, “an”, “a”, “and”, etc.) and use simple language-specific suffix 
stemming. For a list of the supported languages, see Text Search Languages (page 501). 
If you specify a language value of "none", then the text index uses simple tokenization with no list of stop words 
and no stemming. 
If the index language is English, text indexes are case-insensitive for non-diacritics; i.e. case insensitive for [A-z]. 
To specify a language for the text index, see Specify a Language for Text Index (page 487). 
sparse Property text indexes are sparse (page 457) by default and ignore the sparse: true (page 457) option. 
If a document lacks a text index field (or the field is null or an empty array), MongoDB does not add an entry for 
the document to the text index. For inserts, MongoDB inserts the document but does not add it to the text index. 
For a compound index that includes a text index key along with keys of other types, only the text index field 
determines whether the index references a document; the other keys do not. 
Restrictions 
Text Search and Hints You cannot use hint() if the query includes a $text query expression. 
Compound Index A compound index (page 440) can include a text index key in combination with 
ascending/descending index keys. However, these compound indexes have the following restrictions: 
A compound text index cannot include any other special index types, such as multi-key (page 442) or geospatial 
(page 446) index fields. 
If the compound text index includes keys preceding the text index key, to perform a $text search, the query 
predicate must include equality match conditions on the preceding keys. 
See Limit the Number of Entries Scanned (page 491). 
Drop a Text Index To drop a text index, pass the name of the index to the db.collection.dropIndex() 
method. To get the name of the index, run the getIndexes() method. 
For information on the default naming scheme for text indexes as well as overriding the default name, see Specify 
Name for text Index (page 489). 
Storage Requirements and Performance Costs text indexes have the following storage requirements and 
performance costs: 
• text indexes change the space allocation method for all future record allocations in a collection to 
usePowerOf2Sizes. 
• text indexes can be large. They contain one index entry for each unique post-stemmed word in each indexed 
field for each document inserted. 
• Building a text index is very similar to building a large multi-key index and will take longer than building a 
simple ordered (scalar) index on the same data. 
• When building a large text index on an existing collection, ensure that you have a sufficiently high limit on 
open file descriptors. See the recommended settings (page 266). 
• text indexes will impact insertion throughput because MongoDB must add an index entry for each unique 
post-stemmed word in each indexed field of each new source document. 
• Additionally, text indexes do not store phrases or information about the proximity of words in the documents. 
As a result, phrase queries will run much more effectively when the entire collection fits in RAM. 
Text Search Text search supports the search of string content in documents of a collection. MongoDB provides the 
$text operator to perform text search in queries and in aggregation pipelines (page 491). 
The text search process: 
• tokenizes and stems the search term(s) during both the index creation and the text command execution. 
• assigns a score to each document that contains the search term in the indexed fields. The score determines the 
relevance of a document to a given search query. 
The $text operator can search for words and phrases. The query matches on the complete stemmed words. For 
example, if a document field contains the word blueberry, a search on the term blue will not match the document. 
However, a search on either blueberry or blueberries will match. 
For information and examples on various text search patterns, see the $text query operator. For examples of text 
search in aggregation pipeline, see Text Search in the Aggregation Pipeline (page 491). 
Hashed Index 
New in version 2.4. 
Hashed indexes maintain entries with hashes of the values of the indexed field. The hashing function collapses 
subdocuments and computes the hash for the entire value but does not support multi-key (i.e. arrays) indexes. 
Hashed indexes support sharding (page 607) a collection using a hashed shard key (page 621). Using a hashed shard 
key to shard a collection ensures a more even distribution of data. See Shard a Collection Using a Hashed Shard Key 
(page 641) for more details. 
MongoDB can use the hashed index to support equality queries, but hashed indexes do not support range queries. 
You may not create compound indexes that have hashed index fields or specify a unique constraint 
on a hashed index; however, you can create both a hashed index and an ascending/descending 
(i.e. non-hashed) index on the same field: MongoDB will use the scalar index for range queries. 
Warning: MongoDB hashed indexes truncate floating point numbers to 64-bit integers before hashing. For 
example, a hashed index would store the same value for a field that held a value of 2.3, 2.2, and 2.9. To 
prevent collisions, do not use a hashed index for floating point numbers that cannot be reliably converted to 
64-bit integers (and then back to floating point). MongoDB hashed indexes do not support floating point values 
larger than 2^53. 
Create a hashed index using an operation that resembles the following: 
db.active.ensureIndex( { a: "hashed" } ) 
This operation creates a hashed index for the active collection on the a field. 
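The truncation behavior described in the warning above can be sketched in plain JavaScript. This is an illustration of why distinct floats collide, not MongoDB's actual hash function; `Math.trunc` stands in for the 64-bit integer truncation step:

```javascript
// MongoDB truncates floating point values to 64-bit integers before
// hashing, so distinct floats with the same integer part collide.
const values = [2.3, 2.2, 2.9];
const truncated = values.map(Math.trunc);
console.log(truncated);                 // [ 2, 2, 2 ] -- all three collide
console.log(new Set(truncated).size);   // 1: the index stores one hashed value

// Floats above 2^53 cannot round-trip through an integer reliably,
// which is why such values are unsupported.
console.log(Number.MAX_SAFE_INTEGER === 2 ** 53 - 1); // true
```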
8.2.2 Index Properties 
In addition to the numerous index types (page 437) MongoDB supports, indexes can also have various properties. The 
following documents detail the index properties that you can select when building an index. 
TTL Indexes (page 456) The TTL index is used for TTL collections, which expire data after a period of time. 
Unique Indexes (page 457) A unique index causes MongoDB to reject all documents that contain a duplicate value 
for the indexed field. 
Sparse Indexes (page 457) A sparse index does not index documents that do not have the indexed field. 
TTL Indexes 
TTL indexes are special indexes that MongoDB can use to automatically remove documents from a collection after 
a certain amount of time. This is ideal for some types of information like machine generated event data, logs, and 
session information that only need to persist in a database for a limited amount of time. 
Considerations 
TTL indexes have the following limitations: 
• Compound indexes (page 440) are not supported. 
• The indexed field must be a date type. 
• If the field holds an array, and the index has multiple date-typed entries for the document, the document will 
expire when the lowest (i.e. earliest) date value matches the expiration threshold. 
The TTL index does not guarantee that expired data will be deleted immediately. There may be a delay between the 
time a document expires and the time that MongoDB removes the document from the database. 
The background task that removes expired documents runs every 60 seconds. As a result, documents may remain in a 
collection after they expire but before the background task runs or completes. 
The duration of the removal operation depends on the workload of your mongod instance. Therefore, expired data 
may exist for some time beyond the 60 second period between runs of the background task. 
In all other respects, TTL indexes are normal indexes, and if appropriate, MongoDB can use these indexes to fulfill 
arbitrary queries. 
Additional Information 
Expire Data from Collections by Setting TTL (page 198) 
Unique Indexes 
A unique index causes MongoDB to reject all documents that contain a duplicate value for the indexed field. 
To create a unique index, use the db.collection.ensureIndex() method with the unique option set to 
true. For example, to create a unique index on the user_id field of the members collection, use the following 
operation in the mongo shell: 
db.members.ensureIndex( { "user_id": 1 }, { unique: true } ) 
By default, unique is false on MongoDB indexes. 
If you use the unique constraint on a compound index (page 440), then MongoDB will enforce uniqueness on the 
combination of values rather than the individual value for any or all values of the key. 
Behavior 
Unique Constraint Across Separate Documents The unique constraint applies to separate documents in the 
collection. That is, the unique index prevents separate documents from having the same value for the indexed key, but the 
index does not prevent a document from having multiple elements or embedded documents in an indexed array from 
having the same value. In the case of a single document with repeating values, the repeated value is inserted into the 
index only once. 
For example, a collection has a unique index on a.b: 
db.collection.ensureIndex( { "a.b": 1 }, { unique: true } ) 
The unique index permits the insertion of the following document into the collection if no other document in the 
collection has the a.b value of 5: 
db.collection.insert( { a: [ { b: 5 }, { b: 5 } ] } ) 
Unique Index and Missing Field If a document does not have a value for the indexed field in a unique index, the 
index will store a null value for this document. Because of the unique constraint, MongoDB will only permit one 
document that lacks the indexed field. If there is more than one document without a value for the indexed field, 
the index build will fail with a duplicate key error. 
You can combine the unique constraint with the sparse index (page 457) to filter these null values from the unique 
index and avoid the error. 
Restrictions You may not specify a unique constraint on a hashed index (page 455). 
See also: 
Create a Unique Index (page 467) 
Sparse Indexes 
Sparse indexes only contain entries for documents that have the indexed field, even if the index field contains a null 
value. The index skips over any document that is missing the indexed field. The index is “sparse” because it does not 
include all documents of a collection. By contrast, non-sparse indexes contain all documents in a collection, storing 
null values for those documents that do not contain the indexed field. 
To create a sparse index, use the db.collection.ensureIndex() method with the sparse option set to 
true. For example, the following operation in the mongo shell creates a sparse index on the xmpp_id field of the 
addresses collection: 
db.addresses.ensureIndex( { "xmpp_id": 1 }, { sparse: true } ) 
Note: Do not confuse sparse indexes in MongoDB with block-level7 indexes in other databases. Think of them as 
dense indexes with a specific filter. 
Behavior 
sparse Index and Incomplete Results Changed in version 2.6. 
If a sparse index would result in an incomplete result set for queries and sort operations, MongoDB will not use that 
index unless a hint() explicitly specifies the index. 
For example, the query { x: { $exists: false } } will not use a sparse index on the x field unless 
explicitly hinted. See Sparse Index On A Collection Cannot Return Complete Results (page 459) for an example that 
details the behavior. 
Indexes that are sparse by Default 2dsphere (version 2) (page 447), 2d (page 451), geoHaystack (page 452), and 
text (page 454) indexes are always sparse. 
sparse Compound Indexes Sparse compound indexes (page 440) that only contain ascending/descending index 
keys will index a document as long as the document contains at least one of the keys. 
For sparse compound indexes that contain a geospatial key (i.e. 2dsphere (page 447), 2d (page 451), or geoHaystack 
(page 452) index keys) along with ascending/descending index key(s), only the existence of the geospatial field(s) in 
a document determines whether the index references the document. 
For sparse compound indexes that contain text (page 454) index keys along with ascending/descending index keys, 
only the existence of the text index field(s) determines whether the index references a document. 
sparse and unique Properties An index that is both sparse and unique (page 457) prevents a collection from 
having documents with duplicate values for a field but allows multiple documents that omit the key. 
Examples 
Create a Sparse Index On A Collection Consider a collection scores that contains the following documents: 
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" } 
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 } 
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 } 
The collection has a sparse index on the field score: 
db.scores.ensureIndex( { score: 1 } , { sparse: true } ) 
Then, the following query on the scores collection uses the sparse index to return the documents that have the 
score field less than ($lt) 90: 
db.scores.find( { score: { $lt: 90 } } ) 
Because the document for the userid "newbie" does not contain the score field and thus does not meet the query 
criteria, the query can use the sparse index to return the results: 
7 http://en.wikipedia.org/wiki/Database_index#Sparse_index 
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 } 
Sparse Index On A Collection Cannot Return Complete Results Consider a collection scores that contains the 
following documents: 
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" } 
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 } 
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 } 
The collection has a sparse index on the field score: 
db.scores.ensureIndex( { score: 1 } , { sparse: true } ) 
Because the document for the userid "newbie" does not contain the score field, the sparse index does not contain 
an entry for that document. 
Consider the following query to return all documents in the scores collection, sorted by the score field: 
db.scores.find().sort( { score: -1 } ) 
Even though the sort is by the indexed field, MongoDB will not select the sparse index to fulfill the query in order to 
return complete results: 
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 } 
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 } 
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" } 
To use the sparse index, explicitly specify the index with hint(): 
db.scores.find().sort( { score: -1 } ).hint( { score: 1 } ) 
The use of the index results in the return of only those documents with the score field: 
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 } 
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 } 
See also: 
explain() and Analyze Query Performance (page 97) 
Sparse Index with Unique Constraint Consider a collection scores that contains the following documents: 
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" } 
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 } 
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 } 
You could create an index with a unique constraint (page 457) and sparse filter on the score field using the following 
operation: 
db.scores.ensureIndex( { score: 1 } , { sparse: true, unique: true } ) 
This index would permit the insertion of documents that had unique values for the score field or did not include a 
score field. Consider the following insert operation (page 84): 
db.scores.insert( { "userid": "AAAAAAA", "score": 43 } ) 
db.scores.insert( { "userid": "BBBBBBB", "score": 34 } ) 
db.scores.insert( { "userid": "CCCCCCC" } ) 
db.scores.insert( { "userid": "DDDDDDD" } ) 
However, the index would not permit the addition of the following documents, since documents already exist with 
score values of 82 and 90: 
db.scores.insert( { "userid": "AAAAAAA", "score": 82 } ) 
db.scores.insert( { "userid": "BBBBBBB", "score": 90 } ) 
8.2.3 Index Creation 
MongoDB provides several options that only affect the creation of the index. Specify these options in a document as 
the second argument to the db.collection.ensureIndex() method. This section describes the uses of these 
creation options and their behavior. 
Related 
Some options that you can specify to ensureIndex() control the properties of the index (page 456) rather than the 
creation process. For example, the unique (page 457) option affects the behavior of the index after 
creation. 
For a detailed description of MongoDB’s index types, see Index Types (page 437) and Index Properties (page 456) for 
related documentation. 
Background Construction 
By default, creating an index blocks all other operations on a database. When building an index on a collection, the 
database that holds the collection is unavailable for read or write operations until the index build completes. Any 
operation that requires a read or write lock on all databases (e.g. listDatabases) will wait for the foreground index 
build to complete. 
For potentially long running index builds, consider the background option so that the MongoDB 
database remains available during the index building operation. For example, to create an index in the background on 
the zipcode field of the people collection, issue the following: 
db.people.ensureIndex( { zipcode: 1}, {background: true} ) 
By default, background is false for building MongoDB indexes. 
You can combine the background option with other options, as in the following: 
db.people.ensureIndex( { zipcode: 1}, {background: true, sparse: true } ) 
Behavior 
As of MongoDB version 2.4, a mongod instance can build more than one index in the background concurrently. 
Changed in version 2.4: Before 2.4, a mongod instance could only build one background index per database at a time. 
Changed in version 2.2: Before 2.2, a single mongod instance could only build one index at a time. 
Background indexing operations run in the background so that other database operations can run while creating the 
index. However, the mongo shell session or connection where you are creating the index will block until the index 
build is complete. To continue issuing commands to the database, open another connection or mongo instance. 
Queries will not use partially-built indexes: the index will only be usable once the index build is complete. 
Note: If MongoDB is building an index in the background, you cannot perform other administrative 
operations involving that collection, including running repairDatabase, dropping the collection (i.e. 
db.collection.drop()), and running compact. These operations will return an error during background 
index builds. 
Performance 
The background index operation uses an incremental approach that is slower than the normal “foreground” index 
builds. If the index is larger than the available RAM, then the incremental process can take much longer than the 
foreground build. 
If your application includes ensureIndex() operations, and the index does not already exist, 
building the index at that point can have a severe impact on the performance of the database. 
To avoid performance issues, make sure that your application checks for the indexes at start up using the 
getIndexes() method or the equivalent method for your driver8 and terminates if the proper indexes do not 
exist. Always build indexes in production instances using separate application code, during designated maintenance 
windows. 
Building Indexes on Secondaries 
Changed in version 2.6: Secondary members can now build indexes in the background. Previously all index builds on 
secondaries were in the foreground. 
Background index operations on replica set secondaries begin after the primary completes building the index. If 
MongoDB builds an index in the background on the primary, the secondaries will then build that index in the 
background. 
To build large indexes on secondaries the best approach is to restart one secondary at a time in standalone mode and 
build the index. After building the index, restart as a member of the replica set, allow it to catch up with the other 
members of the set, and then build the index on the next secondary. When all the secondaries have the new index, step 
down the primary, restart it as a standalone, and build the index on the former primary. 
The amount of time required to build the index on a secondary must be within the window of the oplog, so that the 
secondary can catch up with the primary. 
Indexes on secondary members in “recovering” mode are always built in the foreground to allow them to catch up as 
soon as possible. 
See Build Indexes on Replica Sets (page 469) for a complete procedure for building indexes on secondaries. 
Drop Duplicates 
MongoDB cannot create a unique index (page 457) on a field that has duplicate values. To force the creation of a 
unique index, you can specify the dropDups option, which will only index the first occurrence of a value for the key, 
and delete all subsequent values. 
Important: As in all unique indexes, if a document does not have the indexed field, MongoDB will include it in the 
index with a “null” value. 
If subsequent documents do not have the indexed field, and you have set {dropDups: true}, MongoDB will remove 
these documents from the collection when creating the index. If you combine dropDups with the sparse (page 457) 
option, the index will include only the documents that have the indexed field, and the documents without the field 
will remain in the database. 
8 http://api.mongodb.org/ 
To create a unique index that drops duplicates on the username field of the accounts collection, use a command 
in the following form: 
db.accounts.ensureIndex( { username: 1 }, { unique: true, dropDups: true } ) 
Warning: Specifying { dropDups: true } will delete data from your database. Use with extreme caution. 
By default, dropDups is false. 
Index Names 
The default name for an index is the concatenation of the indexed keys and each key’s direction in the index, 1 or -1. 
Example 
Issue the following command to create an index on item and quantity: 
db.products.ensureIndex( { item: 1, quantity: -1 } ) 
The resulting index is named: item_1_quantity_-1. 
Optionally, you can specify a name for an index instead of using the default name. 
Example 
Issue the following command to create an index on item and quantity and specify inventory as the index 
name: 
db.products.ensureIndex( { item: 1, quantity: -1 } , { name: "inventory" } ) 
The resulting index has the name inventory. 
To view the name of an index, use the getIndexes() method. 
8.2.4 Index Intersection 
New in version 2.6. 
MongoDB can use the intersection of multiple indexes to fulfill queries.9 In general, each index intersection involves 
two indexes; however, MongoDB can employ multiple/nested index intersections to resolve a query. 
To illustrate index intersection, consider a collection orders that has the following indexes: 
{ qty: 1 } 
{ item: 1 } 
MongoDB can use the intersection of the two indexes to support the following query: 
db.orders.find( { item: "abc123", qty: { $gt: 15 } } ) 
For query plans that use index intersection, explain() returns the value Complex Plan in the cursor field. 
9 In previous versions, MongoDB could use only a single index to fulfill most queries. The exception to this is queries with $or clauses, which 
could use a single index for each $or clause. 
Index Prefix Intersection 
With index intersection, MongoDB can use an intersection of either the entire index or the index prefix. An index 
prefix is a subset of a compound index, consisting of one or more keys starting from the beginning of the index. 
Consider a collection orders with the following indexes: 
{ qty: 1 } 
{ status: 1, ord_date: -1 } 
To fulfill the following query which specifies a condition on both the qty field and the status field, MongoDB can 
use the intersection of the two indexes: 
db.orders.find( { qty: { $gt: 10 } , status: "A" } ) 
Index Intersection and Compound Indexes 
Index intersection does not eliminate the need for creating compound indexes (page 440). However, because both the 
list order (i.e. the order in which the keys are listed in the index) and the sort order (i.e. ascending or descending) 
matter in compound indexes (page 440), a compound index may not support a query condition that does not include 
the index prefix keys (page 442) or that specifies a different sort order. 
For example, if a collection orders has the following compound index, with the status field listed before the 
ord_date field: 
{ status: 1, ord_date: -1 } 
The compound index can support the following queries: 
db.orders.find( { status: { $in: ["A", "P" ] } } ) 
db.orders.find( 
{ 
ord_date: { $gt: new Date("2014-02-01") }, 
status: {$in:[ "P", "A" ] } 
} 
) 
But not the following two queries: 
db.orders.find( { ord_date: { $gt: new Date("2014-02-01") } } ) 
db.orders.find( { } ).sort( { ord_date: 1 } ) 
However, if the collection has two separate indexes: 
{ status: 1 } 
{ ord_date: -1 } 
The two indexes can, either individually or through index intersection, support all four aforementioned queries. 
The choice between creating compound indexes that support your queries or relying on index intersection depends on 
the specifics of your system. 
See also: 
compound indexes (page 440), Create Compound Indexes to Support Several Different Queries (page 494) 
Index Intersection and Sort 
Index intersection does not apply when the sort() operation requires an index completely separate from the query 
predicate. 
For example, the orders collection has the following indexes: 
{ qty: 1 } 
{ status: 1, ord_date: -1 } 
{ status: 1 } 
{ ord_date: -1 } 
MongoDB cannot use index intersection for the following query with sort: 
db.orders.find( { qty: { $gt: 10 } } ).sort( { status: 1 } ) 
That is, MongoDB does not use the { qty: 1 } index for the query, and the separate { status: 1 } or the 
{ status: 1, ord_date: -1 } index for the sort. 
However, MongoDB can use index intersection for the following query with sort since the index { status: 1, 
ord_date: -1 } can fulfill part of the query predicate. 
db.orders.find( { qty: { $gt: 10 } , status: "A" } ).sort( { ord_date: -1 } ) 
8.3 Indexing Tutorials 
Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the 
documents in a collection. 
The documents in this section outline specific tasks related to building and maintaining indexes for data in MongoDB 
collections and discuss strategies and practical approaches. For a conceptual overview of MongoDB indexing, see 
the Index Concepts (page 436) document. 
Index Creation Tutorials (page 464) Create and configure different types of indexes for different purposes. 
Index Management Tutorials (page 472) Monitor and assess index performance and rebuild indexes as needed. 
Geospatial Index Tutorials (page 476) Create indexes that support data stored as GeoJSON objects and legacy 
coordinate pairs. 
Text Search Tutorials (page 486) Build and configure indexes that support full-text searches. 
Indexing Strategies (page 493) The factors that affect index performance and practical approaches to indexing in 
MongoDB 
8.3.1 Index Creation Tutorials 
Instructions for creating and configuring indexes in MongoDB and building indexes on replica sets and sharded clusters. 
Create an Index (page 465) Build an index for any field on a collection. 
Create a Compound Index (page 466) Build an index of multiple fields on a collection. 
Create a Unique Index (page 467) Build an index that enforces unique values for the indexed field or fields. 
Create a Sparse Index (page 467) Build an index that omits references to documents that do not include the indexed 
field. This saves space when indexing fields that are present in only some documents. 
Create a Hashed Index (page 468) Compute a hash of the value of a field in a collection and index the hashed value. 
These indexes permit equality queries and may be suitable shard keys for some collections. 
Build Indexes on Replica Sets (page 469) To build indexes on a replica set, you build the indexes separately on the 
primary and the secondaries, as described here. 
Build Indexes in the Background (page 470) Background index construction allows read and write operations to 
continue while building the index, but takes longer to complete and results in a larger index. 
Build Old Style Indexes (page 471) A {v : 0} index is necessary if you need to roll back from MongoDB version 
2.0 (or later) to MongoDB version 1.8. 
Create an Index 
Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the 
documents in a collection. Users can create indexes for any collection on any field in a document. By default, 
MongoDB creates an index on the _id field of every collection. 
This tutorial describes how to create an index on a single field. MongoDB also supports compound indexes (page 440), 
which are indexes on multiple fields. See Create a Compound Index (page 466) for instructions on building compound 
indexes. 
Create an Index on a Single Field 
To create an index, use ensureIndex() or a similar method from your driver10. The ensureIndex() method 
only creates an index if an index of the same specification does not already exist. 
For example, the following operation creates an index on the userid field of the records collection: 
db.records.ensureIndex( { userid: 1 } ) 
The value of the field in the index specification describes the kind of index for that field. For example, a value of 1 
specifies an index that orders items in ascending order. A value of -1 specifies an index that orders items in descending 
order. For additional index types, see Index Types (page 437). 
The created index will support queries that select on the field userid, such as the following: 
db.records.find( { userid: 2 } ) 
db.records.find( { userid: { $gt: 10 } } ) 
But the created index does not support the following query on the profile_url field: 
db.records.find( { profile_url: 2 } ) 
For queries that cannot use an index, MongoDB must scan all documents in a collection for documents that match the 
query. 
Additional Considerations 
Although indexes can improve query performance, indexes also present some operational considerations. See 
Operational Considerations for Indexes (page 137) for more information. 
If your collection holds a large amount of data, and your application needs to be able to access the data while building 
the index, consider building the index in the background, as described in Background Construction (page 460). To 
build indexes on replica sets, see the Build Indexes on Replica Sets (page 469) section for more information. 
10 http://api.mongodb.org/ 
Note: To build or rebuild indexes for a replica set see Build Indexes on Replica Sets (page 469). 
Some drivers may specify indexes using NumberLong(1) rather than 1 as the specification. This does not have any 
effect on the resulting index. 
See also: 
Create a Compound Index (page 466), Indexing Tutorials (page 464) and Index Concepts (page 436) for more information. 
Create a Compound Index 
Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the 
documents in a collection. MongoDB supports indexes that include content on a single field, as well as compound 
indexes (page 440) that include content from multiple fields. Continue reading for instructions and examples of 
building a compound index. 
Build a Compound Index 
To create a compound index (page 440) use an operation that resembles the following prototype: 
db.collection.ensureIndex( { a: 1, b: 1, c: 1 } ) 
The value of the field in the index specification describes the kind of index for that field. For example, a value of 1 
specifies an index that orders items in ascending order. A value of -1 specifies an index that orders items in descending 
order. For additional index types, see Index Types (page 437). 
Example 
The following operation will create an index on the item, category, and price fields of the products collection: 
db.products.ensureIndex( { item: 1, category: 1, price: 1 } ) 
Additional Considerations 
If your collection holds a large amount of data, and your application needs to be able to access the data while building 
the index, consider building the index in the background, as described in Background Construction (page 460). To 
build indexes on replica sets, see the Build Indexes on Replica Sets (page 469) section for more information. 
Note: To build or rebuild indexes for a replica set see Build Indexes on Replica Sets (page 469). 
Some drivers may specify indexes using NumberLong(1) rather than 1 as the specification. This does not have any 
effect on the resulting index. 
See also: 
Create an Index (page 465), Indexing Tutorials (page 464) and Index Concepts (page 436) for more information. 
Create a Unique Index 
MongoDB allows you to specify a unique constraint (page 457) on an index. These constraints prevent applications 
from inserting documents that have duplicate values for the inserted fields. Additionally, if you want to create an index 
on a collection that has existing data that might have duplicate values for the indexed field, you may choose to combine 
unique enforcement with duplicate dropping (page 461). 
Unique Indexes 
To create a unique index (page 457), consider the following prototype: 
db.collection.ensureIndex( { a: 1 }, { unique: true } ) 
For example, you may want to create a unique index on the "tax-id" field of the accounts collection to prevent 
storing multiple account records for the same legal entity: 
db.accounts.ensureIndex( { "tax-id": 1 }, { unique: true } ) 
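With this index in place, an insert that repeats an existing tax-id value fails with a duplicate key error. The documents below are illustrative: 

```javascript
db.accounts.insert( { "tax-id": "82-1234567", name: "Acme Holdings" } )
db.accounts.insert( { "tax-id": "82-1234567", name: "Acme Holdings LLC" } )
// The second insert is rejected with an E11000 duplicate key error.
```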
The _id index (page 439) is a unique index. In some situations you may consider using the _id field itself for this 
kind of data rather than using a unique index on another field. 
In many situations you will want to combine the unique constraint with the sparse option. When MongoDB 
indexes a field, if a document does not have a value for a field, the index entry for that item will be null. Since 
unique indexes cannot have duplicate values for a field, without the sparse option, MongoDB will reject the second 
document and all subsequent documents without the indexed field. Consider the following prototype. 
db.collection.ensureIndex( { a: 1 }, { unique: true, sparse: true } ) 
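The effect of combining unique and sparse can be seen with a hypothetical scores collection: documents that omit the indexed field are not indexed, so any number of them may coexist, while duplicate values of the field itself are still rejected. 

```javascript
db.scores.ensureIndex( { score: 1 }, { unique: true, sparse: true } )
db.scores.insert( { userid: "a" } )             // allowed: no score field, not indexed
db.scores.insert( { userid: "b" } )             // also allowed, for the same reason
db.scores.insert( { userid: "c", score: 90 } )  // allowed: first occurrence of 90
db.scores.insert( { userid: "d", score: 90 } )  // rejected: duplicate score value
```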
You can also enforce a unique constraint on compound indexes (page 440), as in the following prototype: 
db.collection.ensureIndex( { a: 1, b: 1 }, { unique: true } ) 
These indexes enforce uniqueness for the combination of index keys and not for either key individually. 
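For example, with a hypothetical users collection, the following unique compound index rejects only documents that repeat both values together: 

```javascript
db.users.ensureIndex( { first: 1, last: 1 }, { unique: true } )
db.users.insert( { first: "Ada", last: "Lovelace" } )  // allowed
db.users.insert( { first: "Ada", last: "Byron" } )     // allowed: combination differs
db.users.insert( { first: "Ada", last: "Lovelace" } )  // rejected: duplicate combination
```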
Drop Duplicates 
To force the creation of a unique index (page 457) on a collection with duplicate values in the field you are 
indexing, you can use the dropDups option. This will force MongoDB to create a unique index by deleting documents 
with duplicate values when building the index. Consider the following prototype invocation of ensureIndex(): 
db.collection.ensureIndex( { a: 1 }, { unique: true, dropDups: true } ) 
See the full documentation of duplicate dropping (page 461) for more information. 
Warning: Specifying { dropDups: true } may delete data from your database. Use with extreme caution. 
Refer to the ensureIndex() documentation for additional index creation options. 
Create a Sparse Index 
Sparse indexes are like non-sparse indexes, except that they omit references to documents that do not include the 
indexed field. For fields that are present in only some documents, sparse indexes may provide significant space 
savings. See Sparse Indexes (page 457) for more information about sparse indexes and their use. 
8.3. Indexing Tutorials 467
See also: 
Index Concepts (page 436) and Indexing Tutorials (page 464) for more information. 
Prototype 
To create a sparse index (page 457) on a field, use an operation that resembles the following prototype: 
db.collection.ensureIndex( { a: 1 }, { sparse: true } ) 
Example 
The following operation creates a sparse index on the users collection that includes a document in the index only if 
the twitter_name field exists in the document. 
db.users.ensureIndex( { twitter_name: 1 }, { sparse: true } ) 
The index excludes all documents that do not include the twitter_name field. 
Considerations 
Note: Sparse indexes can affect the results returned by the query, particularly with respect to sorts on fields not 
included in the index. See the sparse index (page 457) section for more information. 
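One way to observe this behavior is to force the query to use the sparse index with hint(); the field name below assumes the twitter_name example above: 

```javascript
// Returns only documents that have a twitter_name field,
// because documents without the field are absent from the sparse index.
db.users.find().hint( { twitter_name: 1 } )
```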
Create a Hashed Index 
New in version 2.4. 
Hashed indexes (page 455) compute a hash of the value of a field in a collection and index the hashed value. These 
indexes permit equality queries and may be suitable shard keys for some collections. 
Tip 
MongoDB automatically computes the hashes when resolving queries using hashed indexes. Applications do not need 
to compute hashes. 
See 
Hashed Shard Keys (page 621) for more information about hashed indexes in sharded clusters, as well as Index Concepts 
(page 436) and Indexing Tutorials (page 464) for more information about indexes. 
Procedure 
To create a hashed index (page 455), specify hashed as the value of the index key, as in the following example: 
Example 
Specify a hashed index on _id 
db.collection.ensureIndex( { _id: "hashed" } ) 
Considerations 
MongoDB supports hashed indexes of any single field. The hashing function collapses sub-documents and computes 
the hash for the entire value, but does not support multi-key indexes (i.e. indexes on array fields). 
You may not create compound indexes that have hashed index fields. 
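As a sketch of these constraints, consider a hypothetical users collection. Equality queries on the hashed field can use the index; range queries cannot: 

```javascript
db.users.ensureIndex( { username: "hashed" } )
db.users.find( { username: "ada" } )         // equality query: can use the hashed index
db.users.find( { username: { $gt: "a" } } )  // range query: cannot use the hashed index
```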
Build Indexes on Replica Sets 
For replica sets, secondaries will begin building indexes after the primary finishes building the index. In sharded 
clusters, the mongos will send ensureIndex() to the primary members of the replica set for each shard, which 
then replicate to the secondaries after the primary finishes building the index. 
To minimize the impact of building an index on your replica set, use the following procedure to build indexes: 
See 
Indexing Tutorials (page 464) and Index Concepts (page 436) for more information. 
Considerations 
• Ensure that your oplog is large enough to permit the indexing or re-indexing operation to complete without 
falling too far behind to catch up. See the oplog sizing (page 535) documentation for additional information. 
• This procedure takes one member out of the replica set at a time. However, it affects only one member of the 
set at a time rather than all secondaries at once. 
• Do not use this procedure when building a unique index (page 457) with the dropDups option. 
• Before version 2.6, Background index creation operations (page 460) became foreground indexing operations 
on secondary members of replica sets. Starting in 2.6, background index builds replicate as background index 
builds on the secondaries. 
Procedure 
Note: If you need to build an index in a sharded cluster, repeat the following procedure for each replica set that 
provides each shard. 
Stop One Secondary Stop the mongod process on one secondary. Restart the mongod process without the 
--replSet option and on a different port. 11 This instance is now in “standalone” mode. 
For example, if your mongod normally runs on the default port of 27017 with the --replSet option, you 
would use the following invocation: 
mongod --port 47017 
11 By running the mongod on a different port, you ensure that the other members of the replica set and all clients will not contact the member 
while you are building the index. 
Build the Index Create the new index using the ensureIndex() method in the mongo shell, or a comparable method in 
your driver. This operation will create or rebuild the index on this mongod instance. 
For example, to create an ascending index on the username field of the records collection, use the following 
mongo shell operation: 
db.records.ensureIndex( { username: 1 } ) 
See also: 
Create an Index (page 465) and Create a Compound Index (page 466) for more information. 
Restart the Program mongod When the index build completes, start the mongod instance with the --replSet 
option on its usual port: 
mongod --port 27017 --replSet rs0 
Modify the port number (e.g. 27017) or the replica set name (e.g. rs0) as needed. 
Allow replication to catch up on this member. 
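To verify that the member has caught up, you can inspect the replication state from the mongo shell; the fields below come from the rs.status() document: 

```javascript
// Run from any member of the replica set; a caught-up member reports
// state SECONDARY and an optimeDate close to the primary's.
rs.status().members.forEach( function ( m ) {
    print( m.name + " " + m.stateStr + " " + m.optimeDate )
} )
```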
Build Indexes on all Secondaries Changed in version 2.6: Secondary members can now build indexes in the background 
(page 470). Previously all index builds on secondaries were in the foreground. 
For each secondary in the set, build an index according to the following steps: 
1. Stop One Secondary (page 469) 
2. Build the Index (page 470) 
3. Restart the Program mongod (page 470) 
Build the Index on the Primary To build an index on the primary you can either: 
1. Build the index in the background (page 470) on the primary. 
2. Step down the primary using the rs.stepDown() method in the mongo shell to cause the current primary to 
become a secondary gracefully and allow the set to elect another member as primary. 
Then repeat the index building procedure, listed below, to build the index on the primary: 
(a) Stop One Secondary (page 469) 
(b) Build the Index (page 470) 
(c) Restart the Program mongod (page 470) 
Building the index in the background takes longer than a foreground index build and results in a less compact index 
structure. Additionally, the background index build may impact write performance on the primary. However, building 
the index in the background allows the set to remain available for write operations while MongoDB builds 
the index. 
Build Indexes in the Background 
By default, MongoDB builds indexes in the foreground, which prevents all read and write operations to the database 
while the index builds. Also, no operation that requires a read or write lock on all databases (e.g. listDatabases) can 
occur during a foreground index build. 
Background index construction (page 460) allows read and write operations to continue while building the index. 
See also: 
Index Concepts (page 436) and Indexing Tutorials (page 464) for more information. 
Considerations 
Background index builds take longer to complete and result in an index that is initially larger, or less compact, than an 
index built in the foreground. Over time, the compactness of indexes built in the background will approach that of 
foreground-built indexes. 
After MongoDB finishes building the index, background-built indexes are functionally identical to any other index. 
Procedure 
To create an index in the background, add the background argument to the ensureIndex() operation, as in the 
following example: 
db.collection.ensureIndex( { a: 1 }, { background: true } ) 
Consider the section on background index construction (page 460) for more information about these indexes and their 
implications. 
Build Old Style Indexes 
Important: Use this procedure only if you must have indexes that are compatible with a version of MongoDB earlier 
than 2.0. 
MongoDB version 2.0 introduced the {v:1} index format. MongoDB versions 2.0 and later support both the {v:1} 
format and the earlier {v:0} format. 
MongoDB versions prior to 2.0, however, support only the {v:0} format. If you need to roll back MongoDB to a 
version prior to 2.0, you must drop and re-create your indexes. 
To build pre-2.0 indexes, use the dropIndexes() and ensureIndex() methods. You cannot simply reindex the 
collection. When you reindex on versions that only support {v:0} indexes, the v fields in the index definition still 
hold values of 1, even though the indexes would now use the {v:0} format. If you were to upgrade again to version 
2.0 or later, these indexes would not work. 
Example 
Suppose you rolled back from MongoDB 2.0 to MongoDB 1.8, and suppose you had the following index on the 
items collection: 
{ "v" : 1, "key" : { "name" : 1 }, "ns" : "mydb.items", "name" : "name_1" } 
The v field tells you the index is a {v:1} index, which is incompatible with version 1.8. 
To drop the index, issue the following command: 
db.items.dropIndex( { name : 1 } ) 
To recreate the index as a {v:0} index, issue the following command: 
db.items.ensureIndex( { name : 1 } , { v : 0 } ) 
See also: 
Index Performance Enhancements (page 794). 
8.3.2 Index Management Tutorials 
Instructions for managing indexes and assessing index performance and use. 
Remove Indexes (page 472) Drop an index from a collection. 
Modify an Index (page 472) Modify an existing index. 
Rebuild Indexes (page 474) In a single operation, drop all indexes on a collection and then rebuild them. 
Manage In-Progress Index Creation (page 474) Check the status of indexing progress, or terminate an ongoing index 
build. 
Return a List of All Indexes (page 475) Obtain a list of all indexes on a collection or of all indexes on all collections 
in a database. 
Measure Index Use (page 475) Study query operations and observe index use for your database. 
Remove Indexes 
To remove an index from a collection, use the dropIndex() method and the following procedure. If you simply 
need to rebuild indexes, you can use the process described in the Rebuild Indexes (page 474) document. 
See also: 
Indexing Tutorials (page 464) and Index Concepts (page 436) for more information about indexes and indexing operations 
in MongoDB. 
Remove a Specific Index 
To remove an index, use the db.collection.dropIndex() method. 
For example, the following operation removes an ascending index on the tax-id field in the accounts collection: 
db.accounts.dropIndex( { "tax-id": 1 } ) 
The operation returns a document with the status of the operation: 
{ "nIndexesWas" : 3, "ok" : 1 } 
The value of nIndexesWas reflects the number of indexes before removing this index. 
For text (page 454) indexes, pass the index name to the db.collection.dropIndex() method. See Use the 
Index Name to Drop a text Index (page 489) for details. 
Remove All Indexes 
You can also use the db.collection.dropIndexes() method to remove all indexes, except for the _id index (page 439), 
from a collection. 
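For example, the following operation removes all non-_id indexes from the accounts collection; the status document shown is illustrative: 

```javascript
db.accounts.dropIndexes()
// Returns a status document similar to:
// { "nIndexesWas" : 3, "msg" : "non-_id indexes dropped for collection", "ok" : 1 }
```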
These shell helpers provide wrappers around the dropIndexes database command. Your client library may 
have a different or additional interface for these operations. 
Modify an Index 
To modify an existing index, you need to drop and recreate the index. 
Step 1: Create a unique index. 
Use the ensureIndex() method to create a unique index. 
db.orders.ensureIndex( 
{ "cust_id" : 1, "ord_date" : -1, "items" : 1 }, 
{ unique: true } 
) 
The method returns a document with the status of the results. The method only creates an index if the index does 
not already exist. See Create an Index (page 465) and Index Creation Tutorials (page 464) for more information on 
creating indexes. 
Step 2: Attempt to modify the index. 
To modify an existing index, you cannot just re-issue the ensureIndex() method with the updated specification 
of the index. 
For example, the following operation attempts to remove the unique constraint from the previously created index by 
using the ensureIndex() method. 
db.orders.ensureIndex( 
{ "cust_id" : 1, "ord_date" : -1, "items" : 1 } 
) 
The status document returned by the operation shows an error. 
Step 3: Drop the index. 
To modify the index, you must drop the index first. 
db.orders.dropIndex( 
{ "cust_id" : 1, "ord_date" : -1, "items" : 1 } 
) 
The method returns a document with the status of the operation. Upon successful operation, the ok field in the returned 
document should specify a 1. See Remove Indexes (page 472) for more information about dropping indexes. 
Step 4: Recreate the index without the unique constraint. 
Recreate the index without the unique constraint. 
db.orders.ensureIndex( 
{ "cust_id" : 1, "ord_date" : -1, "items" : 1 } 
) 
The method returns a document with the status of the results. Upon successful operation, the returned document 
should show the numIndexesAfter to be greater than numIndexesBefore by one. 
See also: 
Index Introduction (page 431), Index Concepts (page 436). 
Rebuild Indexes 
If you need to rebuild indexes for a collection you can use the db.collection.reIndex() method to rebuild all 
indexes on a collection in a single operation. This operation drops all indexes, including the _id index (page 439), and 
then rebuilds all indexes. 
See also: 
Index Concepts (page 436) and Indexing Tutorials (page 464). 
Process 
The operation takes the following form: 
db.accounts.reIndex() 
MongoDB will return the following document when the operation completes: 
{ 
"nIndexesWas" : 2, 
"msg" : "indexes dropped for collection", 
"nIndexes" : 2, 
"indexes" : [ 
{ 
"key" : { 
"_id" : 1, 
"tax-id" : 1 
}, 
"ns" : "records.accounts", 
"name" : "_id_" 
} 
], 
"ok" : 1 
} 
This shell helper provides a wrapper around the reIndex database command. Your client library may have 
a different or additional interface for this operation. 
Additional Considerations 
Note: To build or rebuild indexes for a replica set see Build Indexes on Replica Sets (page 469). 
Manage In-Progress Index Creation 
To see the status of the indexing processes, you can use the db.currentOp() method in the mongo shell. The 
value of the query field and the msg field will indicate if the operation is an index build. The msg field also indicates 
the percent of the build that is complete. 
To terminate an ongoing index build, use the db.killOp() method in the mongo shell. 
For more information see db.currentOp(). 
Changed in version 2.4: Before MongoDB 2.4, you could only terminate background index builds. After 2.4, you can 
terminate any index build, including foreground index builds. 
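A sketch of this workflow in the mongo shell: list in-progress operations, identify the index build by its msg field, and pass its opid to db.killOp(). The opid shown is a placeholder: 

```javascript
// Print any in-progress index builds with their operation ids.
db.currentOp().inprog.forEach( function ( op ) {
    if ( op.msg && /Index Build/.test( op.msg ) ) {
        print( "opid: " + op.opid + "  " + op.msg )
    }
} )

db.killOp( 12345 )   // substitute the opid reported above
```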
Return a List of All Indexes 
When performing maintenance you may want to check which indexes exist on a collection. Every index on a collection 
has a corresponding document in the system.indexes (page 271) collection, and you can use standard queries (i.e. 
find()) to list the indexes, or in the mongo shell, the getIndexes() method to return a list of the indexes on a 
collection, as in the following examples. 
See also: 
Index Concepts (page 436) and Indexing Tutorials (page 464) for more information about indexes in MongoDB and 
common index management operations. 
List all Indexes on a Collection 
To return a list of all indexes on a collection, use the db.collection.getIndexes() method or a similar 
method for your driver12. 
For example, to view all indexes on the people collection: 
db.people.getIndexes() 
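The method returns an array of index documents. For a people collection with one user-created index, the output resembles the following (the ns and index names will vary): 

```javascript
[
    { "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.people" },
    { "v" : 1, "key" : { "zipcode" : 1 }, "name" : "zipcode_1", "ns" : "test.people" }
]
```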
List all Indexes for a Database 
To return a list of all indexes on all collections in a database, use the following operation in the mongo shell: 
db.system.indexes.find() 
See system.indexes (page 271) for more information about these documents. 
Measure Index Use 
Synopsis 
Query performance is a good general indicator of index use; however, for more precise insight into index use, MongoDB 
provides a number of tools that allow you to study query operations and observe index use for your database. 
See also: 
Index Concepts (page 436) and Indexing Tutorials (page 464) for more information. 
Operations 
Return Query Plan with explain() Append the explain() method to any cursor (e.g. query) to return a 
document with statistics about the query process, including the index used, the number of documents scanned, and the 
time the query takes to process in milliseconds. 
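For example, assuming a zipcode index exists on the people collection, the cursor field of the output names the index used: 

```javascript
db.people.find( { zipcode: "63000" } ).explain()
// Relevant fields in the output include:
//   "cursor"   : e.g. "BtreeCursor zipcode_1" (the index used)
//   "nscanned" : number of index entries scanned
//   "millis"   : time the query took, in milliseconds
```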
Control Index Use with hint() Append the hint() method to any cursor (e.g. query) with the index as the argument to 
force MongoDB to use a specific index to fulfill the query. Consider the following example: 
db.people.find( { name: "John Doe", zipcode: { $gt: "63000" } } ).hint( { zipcode: 1 } ) 
12http://api.mongodb.org/ 
You can use hint() and explain() in conjunction with each other to compare the effectiveness of a specific 
index. Specify the $natural operator to the hint() method to prevent MongoDB from using any index: 
db.people.find( { name: "John Doe", zipcode: { $gt: "63000" } } ).hint( { $natural: 1 } ) 
Instance Index Use Reporting MongoDB provides a number of metrics of index use and operation that you may 
want to consider when analyzing index use for your database: 
• In the output of serverStatus: 
– indexCounters 
– scanned 
– scanAndOrder 
• In the output of collStats: 
– totalIndexSize 
– indexSizes 
• In the output of dbStats: 
– dbStats.indexes 
– dbStats.indexSize 
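These metrics are available from the mongo shell; the collection name below is illustrative: 

```javascript
db.serverStatus().indexCounters     // index access counters for the instance
db.people.stats().totalIndexSize    // total size of this collection's indexes, in bytes
db.people.stats().indexSizes        // size of each index on the collection
db.stats().indexSize                // total index size for the current database
```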
8.3.3 Geospatial Index Tutorials 
Instructions for creating and querying 2d, 2dsphere, and haystack indexes. 
Create a 2dsphere Index (page 476) A 2dsphere index supports data stored as both GeoJSON objects and as 
legacy coordinate pairs. 
Query a 2dsphere Index (page 478) Search for locations within, near, or intersected by a GeoJSON shape, or within 
a circle as defined by coordinate points on a sphere. 
Create a 2d Index (page 480) Create a 2d index to support queries on data stored as legacy coordinate pairs. 
Query a 2d Index (page 481) Search for locations using legacy coordinate pairs. 
Create a Haystack Index (page 482) A haystack index is optimized to return results over small areas. For queries 
that use spherical geometry, a 2dsphere index is a better option. 
Query a Haystack Index (page 483) Search based on location and non-location data within a small area. 
Calculate Distance Using Spherical Geometry (page 483) Convert distances to radians and back again. 
Create a 2dsphere Index 
To create a geospatial index for Geo
714 11.8 FAQ: Indexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 718 11.9 FAQ: MongoDB Diagnostics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 720 12 Release Notes 725 12.1 Current Stable Release . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 725 12.2 Previous Stable Releases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 762 12.3 Other MongoDB Release Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 808 12.4 MongoDB Version Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 808 13 About MongoDB Documentation 811 13.1 License . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 811 13.2 Editions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 811 13.3 Version and Revisions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 812 13.4 Report an Issue or Make a Change Request . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 812 13.5 Contribute to the Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 812 Index 829 ii
See About MongoDB Documentation (page 811) for more information about the MongoDB Documentation project, this Manual, and additional editions of this text.

Note: This version of the PDF does not include the reference section; see the MongoDB Reference Manual1 for a PDF edition of all MongoDB reference material.

1http://docs.mongodb.org/master/MongoDB-reference-manual.pdf
CHAPTER 1

Introduction to MongoDB

Welcome to MongoDB. This document provides a brief introduction to MongoDB and some key concepts. See the installation guides (page 5) for information on downloading and installing MongoDB.

1.1 What is MongoDB

MongoDB is an open-source document database that provides high performance, high availability, and automatic scaling.

1.1.1 Document Database

A record in MongoDB is a document, which is a data structure composed of field and value pairs. MongoDB documents are similar to JSON objects. The values of fields may include other documents, arrays, and arrays of documents.

Figure 1.1: A MongoDB document.

The advantages of using documents are:
• Documents (i.e. objects) correspond to native data types in many programming languages.
• Embedded documents and arrays reduce the need for expensive joins.
• Dynamic schema supports fluent polymorphism.
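For illustration, a document of the kind described above might look like the following in the mongo shell's language; the field names and values here are invented for this sketch, not taken from the manual:

```javascript
// An invented example document: field/value pairs, an array value,
// and an embedded document.
const doc = {
  name: "Ada Lovelace",                       // field: value pair
  contribs: ["Analytical Engine", "notes"],   // array value
  address: { city: "London", country: "UK" }  // embedded document
};
console.log(JSON.stringify(doc));
```

Because the embedded `address` document travels with the record, retrieving it requires no join.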
1.1.2 Key Features

High Performance

MongoDB provides high performance data persistence. In particular:
• Support for embedded data models reduces I/O activity on the database system.
• Indexes support faster queries and can include keys from embedded documents and arrays.

High Availability

To provide high availability, MongoDB's replication facility, called replica sets, provides:
• automatic failover.
• data redundancy.

A replica set (page 503) is a group of MongoDB servers that maintain the same data set, providing redundancy and increasing data availability.

Automatic Scaling

MongoDB provides horizontal scalability as part of its core functionality.
• Automatic sharding (page 607) distributes data across a cluster of machines.
• Replica sets can provide eventually-consistent reads for low-latency, high-throughput deployments.
CHAPTER 2

Install MongoDB

MongoDB runs on most platforms and supports both 32-bit and 64-bit architectures.

2.1 Installation Guides

See the Release Notes (page 725) for information about specific releases of MongoDB.

Install on Linux (page 6) Documentation for installing the official MongoDB distribution on Linux-based systems.
Install on Red Hat (page 6) Install MongoDB on Red Hat Enterprise, CentOS, Fedora, and related Linux systems using .rpm packages.
Install on Ubuntu (page 9) Install MongoDB on Ubuntu Linux systems using .deb packages.
Install on Debian (page 12) Install MongoDB on Debian systems using .deb packages.
Install on Other Linux Systems (page 14) Install the official build of MongoDB on other Linux systems from MongoDB archives.
Install on OS X (page 16) Install the official build of MongoDB on OS X systems from Homebrew packages or from MongoDB archives.
Install on Windows (page 19) Install MongoDB on Windows systems and optionally start MongoDB as a Windows service.
Install MongoDB Enterprise (page 24) MongoDB Enterprise is available for MongoDB Enterprise subscribers and includes several additional features, including support for SNMP monitoring, LDAP authentication, Kerberos authentication, and System Event Auditing.
Install MongoDB Enterprise on Red Hat (page 24) Install the MongoDB Enterprise build and required dependencies on Red Hat Enterprise or CentOS systems using packages.
Install MongoDB Enterprise on Ubuntu (page 27) Install the MongoDB Enterprise build and required dependencies on Ubuntu Linux systems using packages.
Install MongoDB Enterprise on Debian (page 30) Install the MongoDB Enterprise build and required dependencies on Debian Linux systems using packages.
Install MongoDB Enterprise on SUSE (page 32) Install the MongoDB Enterprise build and required dependencies on SUSE Enterprise Linux.
Install MongoDB Enterprise on Amazon AMI (page 34) Install the MongoDB Enterprise build and required dependencies on Amazon Linux AMI.
Install MongoDB Enterprise on Windows (page 36) Install the MongoDB Enterprise build and required dependencies using the .msi installer.
2.1.1 Install on Linux

These documents provide instructions to install MongoDB for various Linux systems.

Recommended

For easy installation, MongoDB provides packages for popular Linux distributions. The following guides detail the installation process for these systems:

Install on Red Hat (page 6) Install MongoDB on Red Hat Enterprise, CentOS, Fedora, and related Linux systems using .rpm packages.
Install on Ubuntu (page 9) Install MongoDB on Ubuntu Linux systems using .deb packages.
Install on Debian (page 12) Install MongoDB on Debian systems using .deb packages.

For systems without supported packages, refer to the Manual Installation tutorial.

Manual Installation

Although packages are the preferred installation method, for Linux systems without supported packages, see the following guide:

Install on Other Linux Systems (page 14) Install the official build of MongoDB on other Linux systems from MongoDB archives.

Install MongoDB on Red Hat Enterprise, CentOS, Fedora, or Amazon Linux

Overview

Use this tutorial to install MongoDB on Red Hat Enterprise Linux, CentOS Linux, Fedora Linux, or a related system from .rpm packages. While some of these distributions include their own MongoDB packages, the official MongoDB packages are generally more up to date.

Packages

MongoDB provides packages of the officially supported MongoDB builds in its own repository. This repository provides the MongoDB distribution in the following packages:
• mongodb-org This package is a metapackage that will automatically install the four component packages listed below.
• mongodb-org-server This package contains the mongod daemon and associated configuration and init scripts.
• mongodb-org-mongos This package contains the mongos daemon.
• mongodb-org-shell This package contains the mongo shell.
• mongodb-org-tools This package contains the following MongoDB tools: bsondump, mongodump, mongoexport, mongofiles, mongoimport, mongooplog, mongoperf, mongorestore, mongostat, and mongotop.
Control Scripts

The mongodb-org package includes various control scripts, including the init script /etc/rc.d/init.d/mongod. These scripts are used to stop, start, and restart daemon processes.

The package configures MongoDB using the /etc/mongod.conf file in conjunction with the control scripts. See http://docs.mongodb.org/manual/reference/configuration-options for documentation of the configuration file.

As of version 2.6.4, there are no control scripts for mongos. The mongos process is used only in sharding (page 613). You can use the mongod init script to derive your own mongos control script for use in such environments. See the mongos reference for configuration details.

Warning: With the introduction of systemd in Fedora 15, the control scripts included in the packages available in the MongoDB downloads repository are not compatible with Fedora systems. A correction is forthcoming; see SERVER-7285a for more information. In the meantime, use your own control scripts or install using the procedure outlined in Install MongoDB on Linux Systems (page 14).
ahttps://jira.mongodb.org/browse/SERVER-7285

Considerations

For production deployments, always run MongoDB on 64-bit systems.

The default /etc/mongodb.conf configuration file supplied by the 2.6 series packages has bind_ip set to 127.0.0.1 by default. Modify this setting as needed for your environment before initializing a replica set.

Changed in version 2.6: The package structure and names have changed as of version 2.6. For instructions on installation of an older release, please refer to the documentation for the appropriate version.

Install MongoDB

Step 1: Configure the package management system (YUM).
Create a /etc/yum.repos.d/mongodb.repo file to hold the following configuration information for the MongoDB repository:

If you are running a 64-bit system, use the following configuration:

[mongodb]
name=MongoDB Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/
gpgcheck=0
enabled=1

If you are running a 32-bit system, which is not recommended for production deployments, use the following configuration:

[mongodb]
name=MongoDB Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/i686/
gpgcheck=0
enabled=1

Step 2: Install the MongoDB packages and associated tools.

When you install the packages, you choose whether to install the current release or a previous one. This step provides the commands for both.

To install the latest stable version of MongoDB, issue the following command:

sudo yum install -y mongodb-org
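The repository definition from Step 1 can be staged in a scratch directory and reviewed before copying it into /etc/yum.repos.d/ with root privileges; a sketch, with illustrative paths:

```shell
# Stage the repo definition in a scratch directory first; on a real system
# the file belongs at /etc/yum.repos.d/mongodb.repo (root required).
staging=$(mktemp -d)
cat > "$staging/mongodb.repo" <<'EOF'
[mongodb]
name=MongoDB Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/
gpgcheck=0
enabled=1
EOF
cat "$staging/mongodb.repo"
# Then, as root:  cp "$staging/mongodb.repo" /etc/yum.repos.d/mongodb.repo
```

Staging first makes it easy to diff the file against an existing definition before yum ever sees it.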
To install a specific release of MongoDB, specify each component package individually and append the version number to the package name, as in the following example that installs the 2.6.1 release of MongoDB:

sudo yum install -y mongodb-org-2.6.1 mongodb-org-server-2.6.1 mongodb-org-shell-2.6.1 mongodb-org-mongos-2.6.1 mongodb-org-tools-2.6.1

You can specify any available version of MongoDB. However, yum will upgrade the packages when a newer version becomes available. To prevent unintended upgrades, pin the package. To pin a package, add the following exclude directive to your /etc/yum.conf file:

exclude=mongodb-org,mongodb-org-server,mongodb-org-shell,mongodb-org-mongos,mongodb-org-tools

Previous versions of MongoDB packages use different naming conventions. See the 2.4 version of documentation for more information1.

Run MongoDB

Important: You must configure SELinux to allow MongoDB to start on Red Hat Linux-based systems (Red Hat Enterprise Linux, CentOS, Fedora). Administrators have three options:
• enable access to the relevant ports (e.g. 27017) for SELinux. See Default MongoDB Port (page 380) for more information on MongoDB's default ports. For default settings, this can be accomplished by running

semanage port -a -t mongodb_port_t -p tcp 27017

• set SELinux to permissive mode in /etc/selinux.conf. The line SELINUX=enforcing should be changed to SELINUX=permissive
• disable SELinux entirely; as above but set SELINUX=disabled

All three options require root privileges. The latter two options each require a system reboot and may have larger implications for your deployment. You may alternatively choose not to install the SELinux packages when you are installing your Linux operating system, or choose to remove the relevant packages. This option is the most invasive and is not recommended.

The MongoDB instance stores its data files in /var/lib/mongo and its log files in /var/log/mongodb by default, and runs using the mongod user account.
You can specify alternate log and data file directories in /etc/mongodb.conf. See systemLog.path and storage.dbPath for additional information.

If you change the user that runs the MongoDB process, you must modify the access control rights to the /var/lib/mongo and /var/log/mongodb directories to give this user access to these directories.

1http://docs.mongodb.org/v2.4/tutorial/install-mongodb-on-linux

Step 1: Start MongoDB. You can start the mongod process by issuing the following command:

sudo service mongod start

Step 2: Verify that MongoDB has started successfully. You can verify that the mongod process has started successfully by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading
[initandlisten] waiting for connections on port <port>

where <port> is the port configured in /etc/mongod.conf, 27017 by default.

You can optionally ensure that MongoDB will start following a system reboot by issuing the following command:

sudo chkconfig mongod on

Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command:

sudo service mongod stop

Step 4: Restart MongoDB. You can restart the mongod process by issuing the following command:

sudo service mongod restart

You can follow the state of the process for errors or important messages by watching the output in the /var/log/mongodb/mongod.log file.

Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also consider the Production Notes (page 188) document before deploying MongoDB in a production environment.

Install MongoDB on Ubuntu

Overview

Use this tutorial to install MongoDB on Ubuntu Linux systems from .deb packages. While Ubuntu includes its own MongoDB packages, the official MongoDB packages are generally more up-to-date.

Note: If you use an older Ubuntu that does not use Upstart (i.e. any version before 9.10 "Karmic"), please follow the instructions in the Install MongoDB on Debian (page 12) tutorial.

Packages

MongoDB provides packages of the officially supported MongoDB builds in its own repository. This repository provides the MongoDB distribution in the following packages:
• mongodb-org This package is a metapackage that will automatically install the four component packages listed below.
• mongodb-org-server This package contains the mongod daemon and associated configuration and init scripts.
• mongodb-org-mongos This package contains the mongos daemon.
• mongodb-org-shell This package contains the mongo shell.
• mongodb-org-tools This package contains the following MongoDB tools: bsondump, mongodump, mongoexport, mongofiles, mongoimport, mongooplog, mongoperf, mongorestore, mongostat, and mongotop.
Control Scripts

The mongodb-org package includes various control scripts, including the init script /etc/init.d/mongod. These scripts are used to stop, start, and restart daemon processes.

The package configures MongoDB using the /etc/mongod.conf file in conjunction with the control scripts. See http://docs.mongodb.org/manual/reference/configuration-options for documentation of the configuration file.

As of version 2.6.4, there are no control scripts for mongos. The mongos process is used only in sharding (page 613). You can use the mongod init script to derive your own mongos control script for use in such environments. See the mongos reference for configuration details.

Considerations

For production deployments, always run MongoDB on 64-bit systems.

You cannot install this package concurrently with the mongodb, mongodb-server, or mongodb-clients packages provided by Ubuntu.

The default /etc/mongodb.conf configuration file supplied by the 2.6 series packages has bind_ip set to 127.0.0.1 by default. Modify this setting as needed for your environment before initializing a replica set.

Changed in version 2.6: The package structure and names have changed as of version 2.6. For instructions on installation of an older release, please refer to the documentation for the appropriate version.

Install MongoDB

Step 1: Import the public key used by the package management system. The Ubuntu package management tools (i.e. dpkg and apt) ensure package consistency and authenticity by requiring that distributors sign packages with GPG keys. Issue the following command to import the MongoDB public GPG Key2:

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10

Step 2: Create a list file for MongoDB.
Create the /etc/apt/sources.list.d/mongodb.list list file using the following command:

echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list

2http://docs.mongodb.org/10gen-gpg-key.asc

Step 3: Reload local package database. Issue the following command to reload the local package database:

sudo apt-get update

Step 4: Install the MongoDB packages. You can install either the latest stable version of MongoDB or a specific version of MongoDB.

Install the latest stable version of MongoDB. Issue the following command:

sudo apt-get install -y mongodb-org

Install a specific release of MongoDB. Specify each component package individually and append the version number to the package name, as in the following example that installs the 2.6.1 release of MongoDB:
sudo apt-get install -y mongodb-org=2.6.1 mongodb-org-server=2.6.1 mongodb-org-shell=2.6.1 mongodb-org-mongos=2.6.1 mongodb-org-tools=2.6.1

Pin a specific version of MongoDB. Although you can specify any available version of MongoDB, apt-get will upgrade the packages when a newer version becomes available. To prevent unintended upgrades, pin the package. To pin the version of MongoDB at the currently installed version, issue the following command sequence:

echo "mongodb-org hold" | sudo dpkg --set-selections
echo "mongodb-org-server hold" | sudo dpkg --set-selections
echo "mongodb-org-shell hold" | sudo dpkg --set-selections
echo "mongodb-org-mongos hold" | sudo dpkg --set-selections
echo "mongodb-org-tools hold" | sudo dpkg --set-selections

Previous versions of MongoDB packages use different naming conventions. See the 2.4 version of documentation for more information3.

Run MongoDB

The MongoDB instance stores its data files in /var/lib/mongodb and its log files in /var/log/mongodb by default, and runs using the mongodb user account. You can specify alternate log and data file directories in /etc/mongodb.conf. See systemLog.path and storage.dbPath for additional information.

If you change the user that runs the MongoDB process, you must modify the access control rights to the /var/lib/mongodb and /var/log/mongodb directories to give this user access to these directories.

Step 1: Start MongoDB. Issue the following command to start mongod:

sudo service mongod start

Step 2: Verify that MongoDB has started successfully. Verify that the mongod process has started successfully by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading

[initandlisten] waiting for connections on port <port>

where <port> is the port configured in /etc/mongod.conf, 27017 by default.

Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command:

sudo service mongod stop

Step 4: Restart MongoDB.
Issue the following command to restart mongod:

sudo service mongod restart

Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also consider the Production Notes (page 188) document before deploying MongoDB in a production environment.

3http://docs.mongodb.org/v2.4/tutorial/install-mongodb-on-ubuntu
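The systemLog.path and storage.dbPath settings referenced in the Run MongoDB sections use the YAML configuration format that MongoDB 2.6 supports. A minimal sketch of a configuration file that makes the packaged Ubuntu/Debian defaults explicit; the values are assumed from this tutorial, not authoritative:

```yaml
# Sketch of an /etc/mongod.conf in 2.6 YAML format (values illustrative).
systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log   # log file location
  logAppend: true
storage:
  dbPath: /var/lib/mongodb            # data file location
net:
  bindIp: 127.0.0.1                   # listen on localhost only
```

Changing dbPath or path here is the supported way to relocate the data and log directories; remember to grant the mongodb user access to the new locations.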
Install MongoDB on Debian

Overview

Use this tutorial to install MongoDB on Debian systems from .deb packages. While some Debian distributions include their own MongoDB packages, the official MongoDB packages are generally more up to date.

Note: This tutorial applies to both Debian systems and versions of Ubuntu Linux prior to 9.10 "Karmic" which do not use Upstart. Other Ubuntu users will want to follow the Install MongoDB on Ubuntu (page 9) tutorial.

Packages

MongoDB provides packages of the officially supported MongoDB builds in its own repository. This repository provides the MongoDB distribution in the following packages:
• mongodb-org This package is a metapackage that will automatically install the four component packages listed below.
• mongodb-org-server This package contains the mongod daemon and associated configuration and init scripts.
• mongodb-org-mongos This package contains the mongos daemon.
• mongodb-org-shell This package contains the mongo shell.
• mongodb-org-tools This package contains the following MongoDB tools: bsondump, mongodump, mongoexport, mongofiles, mongoimport, mongooplog, mongoperf, mongorestore, mongostat, and mongotop.

Control Scripts

The mongodb-org package includes various control scripts, including the init script /etc/init.d/mongod. These scripts are used to stop, start, and restart daemon processes.

The package configures MongoDB using the /etc/mongod.conf file in conjunction with the control scripts. See http://docs.mongodb.org/manual/reference/configuration-options for documentation of the configuration file.

As of version 2.6.4, there are no control scripts for mongos. The mongos process is used only in sharding (page 613). You can use the mongod init script to derive your own mongos control script for use in such environments. See the mongos reference for configuration details.
Considerations

For production deployments, always run MongoDB on 64-bit systems.

You cannot install this package concurrently with the mongodb, mongodb-server, or mongodb-clients packages that your release of Debian may include.

The default /etc/mongodb.conf configuration file supplied by the 2.6 series packages has bind_ip set to 127.0.0.1 by default. Modify this setting as needed for your environment before initializing a replica set.

Changed in version 2.6: The package structure and names have changed as of version 2.6. For instructions on installation of an older release, please refer to the documentation for the appropriate version.

Install MongoDB

The Debian package management tools (i.e. dpkg and apt) ensure package consistency and authenticity by requiring that distributors sign packages with GPG keys.
Step 1: Import the public key used by the package management system. Issue the following command to add the MongoDB public GPG Key4 to the system key ring.

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv 7F0CEB10

Step 2: Create a /etc/apt/sources.list.d/mongodb.list file for MongoDB. Create the list file using the following command:

echo 'deb http://downloads-distro.mongodb.org/repo/debian-sysvinit dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list

Step 3: Reload local package database. Issue the following command to reload the local package database:

sudo apt-get update

Step 4: Install the MongoDB packages. You can install either the latest stable version of MongoDB or a specific version of MongoDB.

Install the latest stable version of MongoDB. Issue the following command:

sudo apt-get install -y mongodb-org

Install a specific release of MongoDB. Specify each component package individually and append the version number to the package name, as in the following example that installs the 2.6.1 release of MongoDB:

sudo apt-get install -y mongodb-org=2.6.1 mongodb-org-server=2.6.1 mongodb-org-shell=2.6.1 mongodb-org-mongos=2.6.1 mongodb-org-tools=2.6.1

Pin a specific version of MongoDB. Although you can specify any available version of MongoDB, apt-get will upgrade the packages when a newer version becomes available. To prevent unintended upgrades, pin the package. To pin the version of MongoDB at the currently installed version, issue the following command sequence:

echo "mongodb-org hold" | sudo dpkg --set-selections
echo "mongodb-org-server hold" | sudo dpkg --set-selections
echo "mongodb-org-shell hold" | sudo dpkg --set-selections
echo "mongodb-org-mongos hold" | sudo dpkg --set-selections
echo "mongodb-org-tools hold" | sudo dpkg --set-selections

Previous versions of MongoDB packages use different naming conventions. See the 2.4 version of documentation for more information5.
Run MongoDB

The MongoDB instance stores its data files in /var/lib/mongodb and its log files in /var/log/mongodb by default, and runs using the mongodb user account. You can specify alternate log and data file directories in /etc/mongodb.conf. See systemLog.path and storage.dbPath for additional information.

If you change the user that runs the MongoDB process, you must modify the access control rights to the /var/lib/mongodb and /var/log/mongodb directories to give this user access to these directories.

4http://docs.mongodb.org/10gen-gpg-key.asc
5http://docs.mongodb.org/v2.4/tutorial/install-mongodb-on-ubuntu
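The startup-verification check used throughout these tutorials, searching the log for the "waiting for connections" line, can be scripted. A sketch run here against a sample log line so it works anywhere; on a real host, LOG would point at /var/log/mongodb/mongod.log:

```shell
# Grep the mongod log for the line that signals a successful start.
# LOG points at a sample file for illustration; on a real host use
# /var/log/mongodb/mongod.log.
LOG=$(mktemp)
echo 'Tue Sep 16 10:00:00.000 [initandlisten] waiting for connections on port 27017' > "$LOG"
if grep -q 'waiting for connections on port' "$LOG"; then
  echo "mongod is accepting connections"
else
  echo "mongod has not finished starting" >&2
fi
```

The same grep can sit in a monitoring or provisioning script to gate later steps on mongod being ready.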
Step 1: Start MongoDB. Issue the following command to start mongod:

sudo service mongod start

Step 2: Verify that MongoDB has started successfully. Verify that the mongod process has started successfully by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading

[initandlisten] waiting for connections on port <port>

where <port> is the port configured in /etc/mongod.conf, 27017 by default.

Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command:

sudo service mongod stop

Step 4: Restart MongoDB. Issue the following command to restart mongod:

sudo service mongod restart

Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also consider the Production Notes (page 188) document before deploying MongoDB in a production environment.

Install MongoDB on Linux Systems

Overview

Compiled versions of MongoDB for Linux provide a simple option for installing MongoDB for other Linux systems without supported packages.

Considerations

For production deployments, always run MongoDB on 64-bit systems.

Install MongoDB

MongoDB provides archives for both 64-bit and 32-bit Linux. Follow the installation procedure appropriate for your system.

Install for 64-bit Linux

Step 1: Download the binary files for the desired release of MongoDB. Download the binaries from https://www.mongodb.org/downloads. For example, to download the latest release through the shell, issue the following:

curl -O http://downloads.mongodb.org/linux/mongodb-linux-x86_64-2.6.4.tgz

Step 2: Extract the files from the downloaded archive. For example, from a system shell, you can extract through the tar command:

tar -zxvf mongodb-linux-x86_64-2.6.4.tgz
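The extract, copy, and PATH steps of this procedure can be rehearsed offline. The sketch below builds a stand-in archive first (the stub mongod is invented and nothing is downloaded), then applies the same tar, cp, and PATH commands the tutorial uses:

```shell
# Build a stand-in archive so the remaining steps can be tried without
# downloading anything; the mongod here is a stub, not the real binary.
work=$(mktemp -d)
cd "$work"
mkdir -p mongodb-linux-x86_64-2.6.4/bin
printf '#!/bin/sh\necho stub-mongod\n' > mongodb-linux-x86_64-2.6.4/bin/mongod
chmod +x mongodb-linux-x86_64-2.6.4/bin/mongod
tar -czf mongodb-linux-x86_64-2.6.4.tgz mongodb-linux-x86_64-2.6.4
rm -r mongodb-linux-x86_64-2.6.4

# The tutorial's steps:
tar -zxf mongodb-linux-x86_64-2.6.4.tgz           # extract the archive
mkdir -p mongodb                                   # copy into place
cp -R -n mongodb-linux-x86_64-2.6.4/ mongodb
export PATH="$work/mongodb/mongodb-linux-x86_64-2.6.4/bin:$PATH"  # add bin/ to PATH
command -v mongod                                  # resolves to the staged binary
```

With the real archive, the only changes are dropping the stand-in setup and pointing the export at the actual install directory.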
Step 3: Copy the extracted archive to the target directory. Copy the extracted folder to the location from which MongoDB will run.

mkdir -p mongodb
cp -R -n mongodb-linux-x86_64-2.6.4/ mongodb

Step 4: Ensure the location of the binaries is in the PATH variable. The MongoDB binaries are in the bin/ directory of the archive. To ensure that the binaries are in your PATH, you can modify your PATH. For example, you can add the following line to your shell's rc file (e.g. ~/.bashrc):

export PATH=<mongodb-install-directory>/bin:$PATH

Replace <mongodb-install-directory> with the path to the extracted MongoDB archive.

Install for 32-bit Linux

Step 1: Download the binary files for the desired release of MongoDB. Download the binaries from https://www.mongodb.org/downloads. For example, to download the latest release through the shell, issue the following:

curl -O http://downloads.mongodb.org/linux/mongodb-linux-i686-2.6.4.tgz

Step 2: Extract the files from the downloaded archive. For example, from a system shell, you can extract through the tar command:

tar -zxvf mongodb-linux-i686-2.6.4.tgz

Step 3: Copy the extracted archive to the target directory. Copy the extracted folder to the location from which MongoDB will run.

mkdir -p mongodb
cp -R -n mongodb-linux-i686-2.6.4/ mongodb

Step 4: Ensure the location of the binaries is in the PATH variable. The MongoDB binaries are in the bin/ directory of the archive. To ensure that the binaries are in your PATH, you can modify your PATH. For example, you can add the following line to your shell's rc file (e.g. ~/.bashrc):

export PATH=<mongodb-install-directory>/bin:$PATH

Replace <mongodb-install-directory> with the path to the extracted MongoDB archive.

Run MongoDB

Step 1: Create the data directory. Before you start MongoDB for the first time, create the directory to which the mongod process will write data.
By default, the mongod process uses the /data/db directory. If you create a directory other than this one, you must specify that directory in the dbpath option when starting the mongod process later in this procedure. The following example command creates the default /data/db directory:
mkdir -p /data/db

Step 2: Set permissions for the data directory. Before running mongod for the first time, ensure that the user account running mongod has read and write permissions for the directory.

Step 3: Run MongoDB. To run MongoDB, run the mongod process at the system prompt. If necessary, specify the path of the mongod or the data directory. See the following examples.

Run without specifying paths If your system PATH variable includes the location of the mongod binary and if you use the default data directory (i.e., /data/db), simply enter mongod at the system prompt:

mongod

Specify the path of the mongod If your PATH does not include the location of the mongod binary, enter the full path to the mongod binary at the system prompt:

<path to binary>/mongod

Specify the path of the data directory If you do not use the default data directory (i.e., /data/db), specify the path to the data directory using the --dbpath option:

mongod --dbpath <path to data directory>

Step 4: Stop MongoDB as needed. To stop MongoDB, press Control+C in the terminal where the mongod instance is running.

Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also consider the Production Notes (page 188) document before deploying MongoDB in a production environment.

2.1.2 Install MongoDB on OS X

Overview

Use this tutorial to install MongoDB on OS X systems.

Platform Support Starting in version 2.4, MongoDB only supports OS X versions 10.6 (Snow Leopard) on Intel x86-64 and later.

MongoDB is available through the popular OS X package manager Homebrew (http://brew.sh/) or through the MongoDB Download site (http://www.mongodb.org/downloads).

Install MongoDB

You can install MongoDB with Homebrew or manually. This section describes both.
Install MongoDB with Homebrew

Homebrew installs binary packages based on published "formulae." This section describes how to update brew to the latest packages and install MongoDB. Homebrew requires some initial setup and configuration, which is beyond the scope of this document.

Step 1: Update Homebrew's package database. In a system shell, issue the following command:

brew update

Step 2: Install MongoDB. You can install MongoDB via brew with several different options. Use one of the following operations:

Install the MongoDB Binaries To install the MongoDB binaries, issue the following command in a system shell:

brew install mongodb

Build MongoDB from Source with SSL Support To build MongoDB from the source files and include SSL support, issue the following from a system shell:

brew install mongodb --with-openssl

Install the Latest Development Release of MongoDB To install the latest development release for use in testing and development, issue the following command in a system shell:

brew install mongodb --devel

Install MongoDB Manually

Only install MongoDB using this procedure if you cannot use Homebrew (page 17).

Step 1: Download the binary files for the desired release of MongoDB. Download the binaries from https://www.mongodb.org/downloads. For example, to download the latest release through the shell, issue the following:

curl -O http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.6.4.tgz

Step 2: Extract the files from the downloaded archive. For example, from a system shell, you can extract through the tar command:
tar -zxvf mongodb-osx-x86_64-2.6.4.tgz

Step 3: Copy the extracted archive to the target directory. Copy the extracted folder to the location from which MongoDB will run.

mkdir -p mongodb
cp -R -n mongodb-osx-x86_64-2.6.4/ mongodb

Step 4: Ensure the location of the binaries is in the PATH variable. The MongoDB binaries are in the bin/ directory of the archive. To ensure that the binaries are in your PATH, you can modify your PATH. For example, you can add the following line to your shell's rc file (e.g. ~/.bashrc):

export PATH=<mongodb-install-directory>/bin:$PATH

Replace <mongodb-install-directory> with the path to the extracted MongoDB archive.

Run MongoDB

Step 1: Create the data directory. Before you start MongoDB for the first time, create the directory to which the mongod process will write data. By default, the mongod process uses the /data/db directory. If you create a directory other than this one, you must specify that directory in the dbpath option when starting the mongod process later in this procedure.

The following example command creates the default /data/db directory:

mkdir -p /data/db

Step 2: Set permissions for the data directory. Before running mongod for the first time, ensure that the user account running mongod has read and write permissions for the directory.

Step 3: Run MongoDB. To run MongoDB, run the mongod process at the system prompt. If necessary, specify the path of the mongod or the data directory. See the following examples.

Run without specifying paths If your system PATH variable includes the location of the mongod binary and if you use the default data directory (i.e., /data/db), simply enter mongod at the system prompt:

mongod
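If the shell instead reports that mongod is not found, the install directory is probably missing from your PATH. A small bash sketch that appends it only when absent ($HOME/mongodb is a placeholder for your actual extract location, not a path mandated by this guide):

```shell
# Append the MongoDB bin/ directory to PATH only if it is not already present.
MONGODB_HOME="$HOME/mongodb"   # placeholder: substitute your extract location
case ":$PATH:" in
  *":$MONGODB_HOME/bin:"*) ;;                  # already on PATH; nothing to do
  *) PATH="$MONGODB_HOME/bin:$PATH" ;;
esac
command -v mongod >/dev/null 2>&1 || echo "mongod not found under $MONGODB_HOME/bin"
```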
Specify the path of the mongod If your PATH does not include the location of the mongod binary, enter the full path to the mongod binary at the system prompt:

<path to binary>/mongod

Specify the path of the data directory If you do not use the default data directory (i.e., /data/db), specify the path to the data directory using the --dbpath option:

mongod --dbpath <path to data directory>

Step 4: Stop MongoDB as needed. To stop MongoDB, press Control+C in the terminal where the mongod instance is running.

Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also consider the Production Notes (page 188) document before deploying MongoDB in a production environment.

2.1.3 Install MongoDB on Windows

Overview

Use this tutorial to install MongoDB on Windows systems.

Platform Support Starting in version 2.2, MongoDB does not support Windows XP. Please use a more recent version of Windows to use more recent releases of MongoDB.

Important: If you are running any edition of Windows Server 2008 R2 or Windows 7, please install a hotfix to resolve an issue with memory mapped files on Windows (http://support.microsoft.com/kb/2731284).

Install MongoDB

Step 1: Determine which MongoDB build you need. There are three builds of MongoDB for Windows:

MongoDB for Windows Server 2008 R2 edition (i.e. 2008R2) runs only on Windows Server 2008 R2, Windows 7 64-bit, and newer versions of Windows. This build takes advantage of recent enhancements to the Windows Platform and cannot operate on older versions of Windows.

MongoDB for Windows 64-bit runs on any 64-bit version of Windows newer than Windows XP, including Windows Server 2008 R2 and Windows 7 64-bit.

MongoDB for Windows 32-bit runs on any 32-bit version of Windows newer than Windows XP. 32-bit versions of MongoDB are only intended for older systems and for use in testing and development systems. 32-bit versions of MongoDB only support databases smaller than 2GB.
To find which version of Windows you are running, enter the following command in the Command Prompt:

wmic os get osarchitecture

Step 2: Download MongoDB for Windows. Download the latest production release of MongoDB from the MongoDB downloads page (http://www.mongodb.org/downloads). Ensure you download the correct version of MongoDB for your Windows system. The 64-bit versions of MongoDB do not work with 32-bit Windows.

Step 3: Install the downloaded file. In Windows Explorer, locate the downloaded MongoDB msi file, which typically is located in the default Downloads folder. Double-click the msi file. A set of screens will appear to guide you through the installation process.

Step 4: Move the MongoDB folder to another location (optional). To move the MongoDB folder, you must issue the move command as an Administrator. For example, to move the folder to C:\mongodb: Select Start Menu > All Programs > Accessories. Right-click Command Prompt and select Run as Administrator from the popup menu. Issue the following commands:

cd \
move C:\mongodb-win32-* C:\mongodb

MongoDB is self-contained and does not have any other system dependencies. You can run MongoDB from any folder you choose. You may install MongoDB in any folder (e.g. D:\test\mongodb).

Run MongoDB

Warning: Do not make mongod.exe visible on public networks without running in "Secure Mode" with the auth setting. MongoDB is designed to be run in trusted environments, and the database does not enable "Secure Mode" by default.

Step 1: Set up the MongoDB environment. MongoDB requires a data directory to store all data. MongoDB's default data directory path is \data\db.
Create this folder using the following commands from a Command Prompt:

md \data\db

You can specify an alternate path for data files using the --dbpath option to mongod.exe, for example:

C:\mongodb\bin\mongod.exe --dbpath d:\test\mongodb\data

If your path includes spaces, enclose the entire path in double quotes, for example:
C:\mongodb\bin\mongod.exe --dbpath "d:\test\mongo db data"

Step 2: Start MongoDB. To start MongoDB, run mongod.exe. For example, from the Command Prompt:

C:\Program Files\MongoDB\bin\mongod.exe

This starts the main MongoDB database process. The waiting for connections message in the console output indicates that the mongod.exe process is running successfully.

Depending on the security level of your system, Windows may pop up a Security Alert dialog box about blocking "some features" of C:\Program Files\MongoDB\bin\mongod.exe from communicating on networks. All users should select Private Networks, such as my home or work network and click Allow access. For additional information on security and MongoDB, please see the Security Documentation (page 281).

Step 3: Connect to MongoDB. To connect to MongoDB through the mongo.exe shell, open another Command Prompt. When connecting, specify the data directory if necessary. This step provides several example connection commands.

If your MongoDB installation uses the default data directory, connect without specifying the data directory:

C:\mongodb\bin\mongo.exe

If your installation uses a different data directory, specify the directory when connecting, as in this example:

C:\mongodb\bin\mongod.exe --dbpath d:\test\mongodb\data

If your path includes spaces, enclose the entire path in double quotes. For example:

C:\mongodb\bin\mongod.exe --dbpath "d:\test\mongo db data"

If you want to develop applications using .NET, see the documentation of C# and MongoDB (http://docs.mongodb.org/ecosystem/drivers/csharp) for more information.

Step 4: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also consider the Production Notes (page 188) document before deploying MongoDB in a production environment.

Configure a Windows Service for MongoDB

Note: There is a known issue for MongoDB 2.6.0, SERVER-13515 (https://jira.mongodb.org/browse/SERVER-13515), which prevents the use of the instructions in this section.
For MongoDB 2.6.0, use Manually Create a Windows Service for MongoDB (page 22) to create a Windows Service for MongoDB instead.
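Because the issue above affects only 2.6.0, it can help to check which version you are running before choosing a procedure. A sketch that parses a `mongod --version`-style line, shown here on a sample string rather than live output (POSIX shell syntax used for brevity):

```shell
# The service-install issue applies only to 2.6.0; branch on the parsed version.
line="db version v2.6.4"       # sample; in practice: line=$(mongod --version | head -n1)
ver=${line##*v}
if [ "$ver" = "2.6.0" ]; then
  echo "use the manual Windows service procedure"
else
  echo "version $ver: the --install procedure applies"
fi
```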
Step 1: Configure directories and files. Create a configuration file and a directory path for MongoDB log output (logpath):

Create a specific directory for MongoDB log files:

md "C:\Program Files\MongoDB\log"

In the Command Prompt, create a configuration file for the logpath option for MongoDB:

echo logpath="C:\Program Files\MongoDB\log\mongo.log" > "C:\Program Files\MongoDB\mongod.cfg"

Step 2: Run the MongoDB service. Run all of the following commands in Command Prompt with "Administrative Privileges:"

Install the MongoDB service. For --install to succeed, you must specify the logpath run-time option.

"C:\Program Files\MongoDB\bin\mongod.exe" --config "C:\Program Files\MongoDB\mongod.cfg" --install

Modify the path to the mongod.cfg file as needed.

To use an alternate dbpath, specify the path in the configuration file (e.g. C:\Program Files\MongoDB\mongod.cfg) or on the command line with the --dbpath option. If the dbpath directory does not exist, mongod.exe will not start. The default value for dbpath is \data\db.

If needed, you can install services for multiple instances of mongod.exe or mongos.exe. Install each service with a unique --serviceName and --serviceDisplayName. Use multiple instances only when sufficient system resources exist and your system design requires it.

Step 3: Stop or remove the MongoDB service as needed. To stop the MongoDB service use the following command:

net stop MongoDB

To remove the MongoDB service use the following command:

"C:\Program Files\MongoDB\bin\mongod.exe" --remove

Manually Create a Windows Service for MongoDB

The following procedure assumes you have installed MongoDB using the MSI installer, with the default path C:\Program Files\MongoDB 2.6 Standard. If you have installed in an alternative directory, you will need to adjust the paths as appropriate.

Step 1: Open an Administrator command prompt.

Windows 7 / Vista / Server 2008 (and R2) Press Win + R, then type cmd, then press Ctrl + Shift + Enter.
Windows 8 Press Win + X, then press A.

Execute the remaining steps from the Administrator command prompt.

Step 2: Create directories. Create directories for your database and log files:

mkdir c:\data\db
mkdir c:\data\log

Step 3: Create a configuration file. Create a configuration file. This file can include any of the configuration options for mongod, but must include a valid setting for logpath:

The following creates a configuration file, specifying both the logpath and the dbpath settings in the configuration file:

echo logpath=c:\data\log\mongod.log> "C:\Program Files\MongoDB 2.6 Standard\mongod.cfg"
echo dbpath=c:\data\db>> "C:\Program Files\MongoDB 2.6 Standard\mongod.cfg"

Step 4: Create the MongoDB service. Create the MongoDB service.

sc.exe create MongoDB binPath= "\"C:\Program Files\MongoDB 2.6 Standard\bin\mongod.exe\" --service --config=\"C:\Program Files\MongoDB 2.6 Standard\mongod.cfg\""

sc.exe requires a space between "=" and the configuration values (eg "binPath= "), and a "\" to escape double quotes.

If successfully created, the following log message will display:

[SC] CreateService SUCCESS

Step 5: Start the MongoDB service.

net start MongoDB

Step 6: Stop or remove the MongoDB service as needed. To stop the MongoDB service, use the following command:

net stop MongoDB

To remove the MongoDB service, first stop the service and then run the following command:

sc.exe delete MongoDB
2.1.4 Install MongoDB Enterprise

These documents provide instructions to install MongoDB Enterprise for Linux and Windows Systems.

Install MongoDB Enterprise on Red Hat (page 24) Install the MongoDB Enterprise build and required dependencies on Red Hat Enterprise or CentOS Systems using packages.
Install MongoDB Enterprise on Ubuntu (page 27) Install the MongoDB Enterprise build and required dependencies on Ubuntu Linux Systems using packages.
Install MongoDB Enterprise on Debian (page 30) Install the MongoDB Enterprise build and required dependencies on Debian Linux Systems using packages.
Install MongoDB Enterprise on SUSE (page 32) Install the MongoDB Enterprise build and required dependencies on SUSE Enterprise Linux.
Install MongoDB Enterprise on Amazon AMI (page 34) Install the MongoDB Enterprise build and required dependencies on Amazon Linux AMI.
Install MongoDB Enterprise on Windows (page 36) Install the MongoDB Enterprise build and required dependencies using the .msi installer.

Install MongoDB Enterprise on Red Hat Enterprise or CentOS

Overview

Use this tutorial to install MongoDB Enterprise on Red Hat Enterprise Linux or CentOS Linux from .rpm packages.

Packages

MongoDB provides packages of the officially supported MongoDB Enterprise builds in its own repository. This repository provides the MongoDB Enterprise distribution in the following packages:

• mongodb-enterprise This package is a metapackage that will automatically install the four component packages listed below.
• mongodb-enterprise-server This package contains the mongod daemon and associated configuration and init scripts.
• mongodb-enterprise-mongos This package contains the mongos daemon.
• mongodb-enterprise-shell This package contains the mongo shell.
• mongodb-enterprise-tools This package contains the following MongoDB tools: bsondump, mongodump, mongoexport, mongofiles, mongoimport, mongooplog, mongoperf, mongorestore, mongostat, and mongotop.
Control Scripts

The mongodb-enterprise package includes various control scripts, including the init script /etc/rc.d/init.d/mongod. The package configures MongoDB using the /etc/mongod.conf file in conjunction with the control scripts. As of version 2.6.4, there are no control scripts for mongos. The mongos process is used only in sharding (page 613). You can use the mongod init script to derive your own mongos control script.

Considerations

MongoDB only provides Enterprise packages for Red Hat Enterprise Linux and CentOS Linux versions 5 and 6, 64-bit.

The default /etc/mongod.conf configuration file supplied by the 2.6 series packages has bind_ip set to 127.0.0.1 by default. Modify this setting as needed for your environment before initializing a replica set.

Changed in version 2.6: The package structure and names have changed as of version 2.6. For instructions on installation of an older release, please refer to the documentation for the appropriate version.

Install MongoDB Enterprise

When you install the packages for MongoDB Enterprise, you choose whether to install the current release or a previous one. This procedure describes how to do both.

Step 1: Configure repository. Create an /etc/yum.repos.d/mongodb-enterprise.repo file so that you can install MongoDB Enterprise directly, using yum.

Use the following repository file to specify the latest stable release of MongoDB Enterprise.

[mongodb-enterprise]
name=MongoDB Enterprise Repository
baseurl=https://repo.mongodb.com/yum/redhat/$releasever/mongodb-enterprise/stable/$basearch/
gpgcheck=0
enabled=1

Use the following repository to install only versions of MongoDB for the 2.6 release. If you'd like to install MongoDB Enterprise packages from a particular release series (page 808), such as 2.4 or 2.6, you can specify the release series in the repository configuration.
For example, to restrict your system to the 2.6 release series, create a /etc/yum.repos.d/mongodb-enterprise-2.6.repo file to hold the following configuration information for the MongoDB Enterprise 2.6 repository:

[mongodb-enterprise-2.6]
name=MongoDB Enterprise 2.6 Repository
baseurl=https://repo.mongodb.com/yum/redhat/$releasever/mongodb-enterprise/2.6/$basearch/
gpgcheck=0
enabled=1

.repo files for each release can also be found in the repository itself (https://repo.mongodb.com/yum/redhat/). Remember that odd-numbered minor release versions (e.g. 2.5) are development versions and are unsuitable for production deployment.

Step 2: Install the MongoDB Enterprise packages and associated tools. You can install either the latest stable version of MongoDB Enterprise or a specific version of MongoDB Enterprise.
Install the latest stable version of MongoDB Enterprise. Issue the following command:

sudo yum install -y mongodb-enterprise

Optional: Manage Installed Version.

Install a specific release of MongoDB Enterprise. Specify each component package individually and append the version number to the package name, as in the following example that installs the 2.6.1 release of MongoDB:

sudo yum install -y mongodb-enterprise-2.6.1 mongodb-enterprise-server-2.6.1 mongodb-enterprise-shell-2.6.1 mongodb-enterprise-mongos-2.6.1 mongodb-enterprise-tools-2.6.1

Pin a specific version of MongoDB Enterprise. Although you can specify any available version of MongoDB Enterprise, yum will upgrade the packages when a newer version becomes available. To prevent unintended upgrades, pin the package. To pin a package, add the following exclude directive to your /etc/yum.conf file:

exclude=mongodb-enterprise,mongodb-enterprise-server,mongodb-enterprise-shell,mongodb-enterprise-mongos,mongodb-enterprise-tools

Previous versions of MongoDB packages use different naming conventions. See the 2.4 version of documentation for more information (http://docs.mongodb.org/v2.4/tutorial/install-mongodb-on-linux).

Step 3: When the install completes, you can run MongoDB.

Run MongoDB Enterprise

Important: You must configure SELinux to allow MongoDB to start on Red Hat Linux-based systems (Red Hat Enterprise Linux, CentOS, Fedora). Administrators have three options:

• enable access to the relevant ports (e.g. 27017) for SELinux. See Default MongoDB Port (page 380) for more information on MongoDB's default ports. For default settings, this can be accomplished by running

semanage port -a -t mongodb_port_t -p tcp 27017

• set SELinux to permissive mode in /etc/selinux.conf. The line SELINUX=enforcing should be changed to SELINUX=permissive
• disable SELinux entirely; as above but set SELINUX=disabled

All three options require root privileges. The latter two options each require a system reboot and may have larger implications for your deployment.
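The first option above (opening the port) is usually the least invasive. A sketch of what it might look like in practice; semanage is provided by the SELinux management tools and may not be present on every system, so the sketch checks first and requires root to take effect:

```shell
# Option 1 above: register MongoDB's default port with the SELinux policy.
PORT=27017
cmd="semanage port -a -t mongodb_port_t -p tcp $PORT"
if command -v semanage >/dev/null 2>&1; then
  $cmd && semanage port -l | grep mongodb_port_t   # confirm the rule exists
else
  echo "would run: $cmd (semanage not installed here)"
fi
```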
You may alternatively choose not to install the SELinux packages when you are installing your Linux operating system, or choose to remove the relevant packages. This option is the most invasive and is not recommended.

The MongoDB instance stores its data files in /var/lib/mongo and its log files in /var/log/mongodb by default, and runs using the mongod user account. You can specify alternate log and data file directories in /etc/mongod.conf. See systemLog.path and storage.dbPath for additional information.
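The paths above are controlled by the config file, and the 2.6 packages ship an ini-style (key=value) configuration. The effective settings can be read back with a little shell; a sketch (get_opt is a helper defined here for illustration, not part of MongoDB):

```shell
# Read dbpath/logpath back from a 2.6-style key=value config file.
CONF=/etc/mongod.conf
get_opt() { sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" 2>/dev/null | head -n1; }
echo "dbpath:  $(get_opt dbpath  "$CONF")"
echo "logpath: $(get_opt logpath "$CONF")"
```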
If you change the user that runs the MongoDB process, you must modify the access control rights to the /var/lib/mongo and /var/log/mongodb directories to give this user access to these directories.

Step 1: Start MongoDB. You can start the mongod process by issuing the following command:

sudo service mongod start

Step 2: Verify that MongoDB has started successfully. You can verify that the mongod process has started successfully by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading

[initandlisten] waiting for connections on port <port>

where <port> is the port configured in /etc/mongod.conf, 27017 by default.

You can optionally ensure that MongoDB will start following a system reboot by issuing the following command:

sudo chkconfig mongod on

Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command:

sudo service mongod stop

Step 4: Restart MongoDB. You can restart the mongod process by issuing the following command:

sudo service mongod restart

You can follow the state of the process for errors or important messages by watching the output in the /var/log/mongodb/mongod.log file.

Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also consider the Production Notes (page 188) document before deploying MongoDB in a production environment.

Install MongoDB Enterprise on Ubuntu

Overview

Use this tutorial to install MongoDB Enterprise on Ubuntu Linux systems from .deb packages.

Packages

MongoDB provides packages of the officially supported MongoDB Enterprise builds in its own repository. This repository provides the MongoDB Enterprise distribution in the following packages:

• mongodb-enterprise This package is a metapackage that will automatically install the four component packages listed below.
• mongodb-enterprise-server This package contains the mongod daemon and associated configuration and init scripts.
• mongodb-enterprise-mongos This package contains the mongos daemon.
• mongodb-enterprise-shell This package contains the mongo shell.
• mongodb-enterprise-tools This package contains the following MongoDB tools: bsondump, mongodump, mongoexport, mongofiles, mongoimport, mongooplog, mongoperf, mongorestore, mongostat, and mongotop.

Control Scripts

The mongodb-enterprise package includes various control scripts, including the init script /etc/rc.d/init.d/mongod. The package configures MongoDB using the /etc/mongod.conf file in conjunction with the control scripts. As of version 2.6.4, there are no control scripts for mongos. The mongos process is used only in sharding (page 613). You can use the mongod init script to derive your own mongos control script.

Considerations

MongoDB only provides Enterprise packages for Ubuntu 12.04 LTS (Precise Pangolin).

Changed in version 2.6: The package structure and names have changed as of version 2.6. For instructions on installation of an older release, please refer to the documentation for the appropriate version.

Install MongoDB Enterprise

Step 1: Import the public key used by the package management system. The Ubuntu package management tools (i.e. dpkg and apt) ensure package consistency and authenticity by requiring that distributors sign packages with GPG keys. Issue the following command to import the MongoDB public GPG Key (http://docs.mongodb.org/10gen-gpg-key.asc):

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10

Step 2: Create a /etc/apt/sources.list.d/mongodb-enterprise.list file for MongoDB. Create the list file using the following command:

echo 'deb http://repo.mongodb.com/apt/ubuntu precise/mongodb-enterprise/stable multiverse' | sudo tee /etc/apt/sources.list.d/mongodb-enterprise.list

If you'd like to install MongoDB Enterprise packages from a particular release series (page 808), such as 2.4 or 2.6, you can specify the release series in the repository configuration.
For example, to restrict your system to the 2.6 release series, add the following repository:

echo 'deb http://repo.mongodb.com/apt/ubuntu precise/mongodb-enterprise/2.6 multiverse' | sudo tee /etc/apt/sources.list.d/mongodb-enterprise-2.6.list

Step 3: Reload local package database. Issue the following command to reload the local package database:

sudo apt-get update
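Between reloading and installing, you can confirm that the list file from Step 2 was written correctly; a quick sketch (the grep pattern simply looks for the repository host):

```shell
# Confirm the sources list entry exists before installing the packages.
LIST=/etc/apt/sources.list.d/mongodb-enterprise.list
if grep -q 'repo\.mongodb\.com/apt/ubuntu' "$LIST" 2>/dev/null; then
  echo "repository entry present in $LIST"
else
  echo "missing or malformed: $LIST"
fi
```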
Step 4: Install the MongoDB Enterprise packages. When you install the packages, you choose whether to install the current release or a previous one. This step provides instructions for both.

To install the latest stable version of MongoDB Enterprise, issue the following command:

sudo apt-get install mongodb-enterprise

To install a specific release of MongoDB Enterprise, specify each component package individually and append the version number to the package name, as in the following example that installs the 2.6.1 release of MongoDB Enterprise:

sudo apt-get install mongodb-enterprise=2.6.1 mongodb-enterprise-server=2.6.1 mongodb-enterprise-shell=2.6.1 mongodb-enterprise-mongos=2.6.1 mongodb-enterprise-tools=2.6.1

You can specify any available version of MongoDB Enterprise. However, apt-get will upgrade the packages when a newer version becomes available. To prevent unintended upgrades, pin the package. To pin the version of MongoDB Enterprise at the currently installed version, issue the following command sequence:

echo "mongodb-enterprise hold" | sudo dpkg --set-selections
echo "mongodb-enterprise-server hold" | sudo dpkg --set-selections
echo "mongodb-enterprise-shell hold" | sudo dpkg --set-selections
echo "mongodb-enterprise-mongos hold" | sudo dpkg --set-selections
echo "mongodb-enterprise-tools hold" | sudo dpkg --set-selections

Previous versions of MongoDB Enterprise packages use different naming conventions. See the 2.4 version of documentation (http://docs.mongodb.org/v2.4/tutorial/install-mongodb-enterprise) for more information.

Run MongoDB Enterprise

The MongoDB instance stores its data files in /var/lib/mongodb and its log files in /var/log/mongodb by default, and runs using the mongodb user account. You can specify alternate log and data file directories in /etc/mongod.conf. See systemLog.path and storage.dbPath for additional information.
If you change the user that runs the MongoDB process, you must modify the access control rights to the /var/lib/mongodb and /var/log/mongodb directories to give this user access to these directories.

Step 1: Start MongoDB. Issue the following command to start mongod:

sudo service mongod start

Step 2: Verify that MongoDB has started successfully. Verify that the mongod process has started successfully by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading

[initandlisten] waiting for connections on port <port>

where <port> is the port configured in /etc/mongod.conf, 27017 by default.

Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command:

sudo service mongod stop

Step 4: Restart MongoDB. Issue the following command to restart mongod:
sudo service mongod restart

Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also consider the Production Notes (page 188) document before deploying MongoDB in a production environment.

Install MongoDB Enterprise on Debian

Overview

Use this tutorial to install MongoDB Enterprise on Debian Linux systems from .deb packages.

Packages

MongoDB provides packages of the officially supported MongoDB Enterprise builds in its own repository. This repository provides the MongoDB Enterprise distribution in the following packages:

• mongodb-enterprise This package is a metapackage that will automatically install the four component packages listed below.
• mongodb-enterprise-server This package contains the mongod daemon and associated configuration and init scripts.
• mongodb-enterprise-mongos This package contains the mongos daemon.
• mongodb-enterprise-shell This package contains the mongo shell.
• mongodb-enterprise-tools This package contains the following MongoDB tools: bsondump, mongodump, mongoexport, mongofiles, mongoimport, mongooplog, mongoperf, mongorestore, mongostat, and mongotop.

Control Scripts

The mongodb-enterprise package includes various control scripts, including the init script /etc/rc.d/init.d/mongod. The package configures MongoDB using the /etc/mongod.conf file in conjunction with the control scripts. As of version 2.6.4, there are no control scripts for mongos. The mongos process is used only in sharding (page 613). You can use the mongod init script to derive your own mongos control script.

Considerations

Changed in version 2.6: The package structure and names have changed as of version 2.6. For instructions on installation of an older release, please refer to the documentation for the appropriate version.

MongoDB only provides Enterprise packages for 64-bit versions of Debian Wheezy.
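Since these packages are 64-bit only, a quick architecture check before proceeding can save a failed install; a small sketch:

```shell
# Enterprise packages for Debian are 64-bit only; verify the host architecture.
arch=$(uname -m)
if [ "$arch" = "x86_64" ]; then
  echo "64-bit ($arch): supported"
else
  echo "$arch: not supported by these packages"
fi
```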
Install MongoDB Enterprise

Step 1: Import the public key used by the package management system. Issue the following command to add the MongoDB public GPG Key (http://docs.mongodb.org/10gen-gpg-key.asc) to the system key ring.

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv 7F0CEB10

Step 2: Create a /etc/apt/sources.list.d/mongodb-enterprise.list file for MongoDB. Create the list file using the following command:

echo 'deb http://repo.mongodb.com/apt/debian wheezy/mongodb-enterprise/stable main' | sudo tee /etc/apt/sources.list.d/mongodb-enterprise.list

If you'd like to install MongoDB Enterprise packages from a particular release series (page 808), such as 2.6, you can specify the release series in the repository configuration. For example, to restrict your system to the 2.6 release series, add the following repository:

echo 'deb http://repo.mongodb.com/apt/debian wheezy/mongodb-enterprise/2.6 main' | sudo tee /etc/apt/sources.list.d/mongodb-enterprise-2.6.list

Step 3: Reload local package database. Issue the following command to reload the local package database:

sudo apt-get update

Step 4: Install the MongoDB Enterprise packages. When you install the packages, you choose whether to install the current release or a previous one. This step provides instructions for both.

To install the latest stable version of MongoDB Enterprise, issue the following command:

sudo apt-get install mongodb-enterprise

To install a specific release of MongoDB Enterprise, specify each component package individually and append the version number to the package name, as in the following example that installs the 2.6.1 release of MongoDB Enterprise:

sudo apt-get install mongodb-enterprise=2.6.1 mongodb-enterprise-server=2.6.1 mongodb-enterprise-shell=2.6.1 mongodb-enterprise-mongos=2.6.1 mongodb-enterprise-tools=2.6.1

You can specify any available version of MongoDB Enterprise. However, apt-get will upgrade the packages when a newer version becomes available. To prevent unintended upgrades, pin the package.
To pin the version of MongoDB Enterprise at the currently installed version, issue the following command sequence:

echo "mongodb-enterprise hold" | sudo dpkg --set-selections
echo "mongodb-enterprise-server hold" | sudo dpkg --set-selections
echo "mongodb-enterprise-shell hold" | sudo dpkg --set-selections
echo "mongodb-enterprise-mongos hold" | sudo dpkg --set-selections
echo "mongodb-enterprise-tools hold" | sudo dpkg --set-selections

Run MongoDB Enterprise

The MongoDB instance stores its data files in /var/lib/mongodb and its log files in /var/log/mongodb by default, and runs using the mongodb user account. You can specify alternate log and data file directories in /etc/mongod.conf. See systemLog.path and storage.dbPath for additional information.

If you change the user that runs the MongoDB process, you must modify the access control rights to the /var/lib/mongodb and /var/log/mongodb directories to give this user access to these directories.

18. http://docs.mongodb.org/10gen-gpg-key.asc
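The dpkg holds shown above are one way to prevent upgrades. As an alternative sketch (not taken from this manual; the file path and version shown are examples only), apt can also pin a package to a version through a preferences file such as /etc/apt/preferences.d/mongodb-enterprise:

```
Package: mongodb-enterprise-server
Pin: version 2.6.1
Pin-Priority: 1001
```

A Pin-Priority greater than 1000 causes apt to prefer the pinned version even when a newer one is available. Repeat the stanza for each component package you want to pin.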
Step 1: Start MongoDB. Issue the following command to start mongod:

sudo service mongod start

Step 2: Verify that MongoDB has started successfully. Verify that the mongod process has started successfully by checking the contents of the log file at /var/log/mongodb/mongod.log for a line reading

[initandlisten] waiting for connections on port <port>

where <port> is the port configured in /etc/mongod.conf, 27017 by default.

Step 3: Stop MongoDB. As needed, you can stop the mongod process by issuing the following command:

sudo service mongod stop

Step 4: Restart MongoDB. Issue the following command to restart mongod:

sudo service mongod restart

Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also consider the Production Notes (page 188) document before deploying MongoDB in a production environment.

Install MongoDB Enterprise on SUSE

Overview

Use this tutorial to install MongoDB Enterprise on SUSE Linux. MongoDB Enterprise is available on select platforms and contains support for several features related to security and monitoring.

Prerequisites

To use MongoDB Enterprise on SUSE Enterprise Linux, you must install several prerequisite packages:

• libopenssl0_9_8
• libsnmp15
• net-snmp
• snmp-mibs
• cyrus-sasl
• cyrus-sasl-gssapi

To install these packages, you can issue the following command:

sudo zypper install libopenssl0_9_8 net-snmp libsnmp15 snmp-mibs cyrus-sasl cyrus-sasl-gssapi
Install MongoDB Enterprise

Note: The Enterprise packages include an example SNMP configuration file named mongod.conf. This file is not a MongoDB configuration file.

Step 1: Download and install the MongoDB Enterprise packages. After you have installed the required prerequisite packages, download and install the MongoDB Enterprise packages from http://www.mongodb.com/subscription/downloads. The MongoDB binaries are located in the bin/ directory of the archive. To download and install, use the following sequence of commands.

curl -O http://downloads.10gen.com/linux/mongodb-linux-x86_64-subscription-suse11-2.6.4.tgz
tar -zxvf mongodb-linux-x86_64-subscription-suse11-2.6.4.tgz
cp -R -n mongodb-linux-x86_64-subscription-suse11-2.6.4/ mongodb

Step 2: Ensure the location of the MongoDB binaries is included in the PATH variable. Once you have copied the MongoDB binaries to their target location, ensure that the location is included in your PATH variable. If it is not, either include it or create symbolic links from the binaries to a directory that is included.

Run MongoDB Enterprise

Step 1: Create the data directory. Before you start MongoDB for the first time, create the directory to which the mongod process will write data. By default, the mongod process uses the /data/db directory. If you create a directory other than this one, you must specify that directory in the dbpath option when starting the mongod process later in this procedure.

The following example command creates the default /data/db directory:

mkdir -p /data/db

Step 2: Set permissions for the data directory. Before running mongod for the first time, ensure that the user account running mongod has read and write permissions for the directory.

Step 3: Run MongoDB. To run MongoDB, run the mongod process at the system prompt.
If necessary, specify the path of the mongod or the data directory. See the following examples.

Run without specifying paths If your system PATH variable includes the location of the mongod binary and if you use the default data directory (i.e., /data/db), simply enter mongod at the system prompt:

mongod

Specify the path of the mongod If your PATH does not include the location of the mongod binary, enter the full path to the mongod binary at the system prompt:

<path to binary>/mongod
Specify the path of the data directory If you do not use the default data directory (i.e., /data/db), specify the path to the data directory using the --dbpath option:

mongod --dbpath <path to data directory>

Step 4: Stop MongoDB as needed. To stop MongoDB, press Control+C in the terminal where the mongod instance is running.

Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also consider the Production Notes (page 188) document before deploying MongoDB in a production environment.

Install MongoDB Enterprise on Amazon Linux AMI

Overview

Use this tutorial to install MongoDB Enterprise on Amazon Linux AMI. MongoDB Enterprise is available on select platforms and contains support for several features related to security and monitoring.

Prerequisites

To use MongoDB Enterprise on Amazon Linux AMI, you must install several prerequisite packages:

• net-snmp
• net-snmp-libs
• openssl
• net-snmp-utils
• cyrus-sasl
• cyrus-sasl-lib
• cyrus-sasl-devel
• cyrus-sasl-gssapi

To install these packages, you can issue the following command:

sudo yum install openssl net-snmp net-snmp-libs net-snmp-utils cyrus-sasl cyrus-sasl-lib cyrus-sasl-devel cyrus-sasl-gssapi

Install MongoDB Enterprise

Note: The Enterprise packages include an example SNMP configuration file named mongod.conf. This file is not a MongoDB configuration file.

Step 1: Download and install the MongoDB Enterprise packages. After you have installed the required prerequisite packages, download and install the MongoDB Enterprise packages from http://www.mongodb.com/subscription/downloads. The MongoDB binaries are located in the bin/ directory of the archive. To download and install, use the following sequence of commands.
curl -O http://downloads.10gen.com/linux/mongodb-linux-x86_64-subscription-amzn64-2.6.4.tgz
tar -zxvf mongodb-linux-x86_64-subscription-amzn64-2.6.4.tgz
cp -R -n mongodb-linux-x86_64-subscription-amzn64-2.6.4/ mongodb

Step 2: Ensure the location of the MongoDB binaries is included in the PATH variable. Once you have copied the MongoDB binaries to their target location, ensure that the location is included in your PATH variable. If it is not, either include it or create symbolic links from the binaries to a directory that is included.

Run MongoDB Enterprise

The MongoDB instance stores its data files in /var/lib/mongo and its log files in /var/log/mongodb by default, and runs using the mongod user account. You can specify alternate log and data file directories in /etc/mongod.conf. See systemLog.path and storage.dbPath for additional information.

If you change the user that runs the MongoDB process, you must modify the access control rights to the /var/lib/mongo and /var/log/mongodb directories to give this user access to these directories.

Step 1: Create the data directory. Before you start MongoDB for the first time, create the directory to which the mongod process will write data. By default, the mongod process uses the /data/db directory. If you create a directory other than this one, you must specify that directory in the dbpath option when starting the mongod process later in this procedure.

The following example command creates the default /data/db directory:

mkdir -p /data/db

Step 2: Set permissions for the data directory. Before running mongod for the first time, ensure that the user account running mongod has read and write permissions for the directory.

Step 3: Run MongoDB. To run MongoDB, run the mongod process at the system prompt. If necessary, specify the path of the mongod or the data directory. See the following examples.
Run without specifying paths If your system PATH variable includes the location of the mongod binary and if you use the default data directory (i.e., /data/db), simply enter mongod at the system prompt:

mongod

Specify the path of the mongod If your PATH does not include the location of the mongod binary, enter the full path to the mongod binary at the system prompt:

<path to binary>/mongod

Specify the path of the data directory If you do not use the default data directory (i.e., /data/db), specify the path to the data directory using the --dbpath option:

mongod --dbpath <path to data directory>

Step 4: Stop MongoDB as needed. To stop MongoDB, press Control+C in the terminal where the mongod instance is running.
Step 5: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also consider the Production Notes (page 188) document before deploying MongoDB in a production environment.

Install MongoDB Enterprise on Windows

New in version 2.6.

Overview

Use this tutorial to install MongoDB Enterprise on Windows systems. MongoDB Enterprise is available on select platforms and contains support for several features related to security and monitoring.

Prerequisites

MongoDB Enterprise Server for Windows requires Windows Server 2008 R2 or later. The MSI installer includes all other software dependencies.

Install MongoDB Enterprise

Step 1: Download MongoDB Enterprise for Windows. Download the latest production release of MongoDB Enterprise19.

Step 2: Install MongoDB Enterprise for Windows. Run the downloaded MSI installer. Make configuration choices as prompted. MongoDB is self-contained and does not have any other system dependencies. You can install MongoDB into any folder (e.g. D:\test\mongodb) and run it from there. The installation wizard includes an option to select an installation directory.

Run MongoDB Enterprise

Warning: Do not make mongod.exe visible on public networks without running in “Secure Mode” with the auth setting. MongoDB is designed to be run in trusted environments, and the database does not enable “Secure Mode” by default.

Step 1: Set up the MongoDB environment. MongoDB requires a data directory to store all data. MongoDB’s default data directory path is \data\db. Create this folder using the following commands from a Command Prompt:

md \data\db

You can specify an alternate path for data files using the --dbpath option to mongod.exe, for example:

C:\mongodb\bin\mongod.exe --dbpath d:\test\mongodb\data

If your path includes spaces, enclose the entire path in double quotes, for example:

19. http://www.mongodb.com/products/mongodb-enterprise
C:\mongodb\bin\mongod.exe --dbpath "d:\test\mongo db data"

Step 2: Start MongoDB. To start MongoDB, run mongod.exe. For example, from the Command Prompt:

C:\Program Files\MongoDB\bin\mongod.exe

This starts the main MongoDB database process. The waiting for connections message in the console output indicates that the mongod.exe process is running successfully.

Depending on the security level of your system, Windows may pop up a Security Alert dialog box about blocking “some features” of C:\Program Files\MongoDB\bin\mongod.exe from communicating on networks. All users should select Private Networks, such as my home or work network and click Allow access. For additional information on security and MongoDB, please see the Security Documentation (page 281).

Step 3: Connect to MongoDB. To connect to MongoDB through the mongo.exe shell, open another Command Prompt. When connecting, specify the data directory if necessary. This step provides several example connection commands.

If your MongoDB installation uses the default data directory, connect without specifying the data directory:

C:\mongodb\bin\mongo.exe

If your installation uses a different data directory, specify the directory when connecting, as in this example:

C:\mongodb\bin\mongod.exe --dbpath d:\test\mongodb\data

If your path includes spaces, enclose the entire path in double quotes. For example:

C:\mongodb\bin\mongod.exe --dbpath "d:\test\mongo db data"

If you want to develop applications using .NET, see the documentation of C# and MongoDB20 for more information.

Step 4: Begin using MongoDB. To begin using MongoDB, see Getting Started with MongoDB (page 43). Also consider the Production Notes (page 188) document before deploying MongoDB in a production environment.

20. http://docs.mongodb.org/ecosystem/drivers/csharp

Configure a Windows Service for MongoDB Enterprise

Note: There is a known issue for MongoDB 2.6.0, SERVER-1351521, which prevents the use of the instructions in this section. For MongoDB 2.6.0, use Manually Create a Windows Service for MongoDB Enterprise (page 39) to create a Windows Service for MongoDB.

You can set up the MongoDB server as a Windows Service that starts automatically at boot time.

Step 1: Configure directories and files. Create a configuration file and a directory path for MongoDB log output (logpath):

Create a specific directory for MongoDB log files:

md "C:\Program Files\MongoDB\log"

In the Command Prompt, create a configuration file for the logpath option for MongoDB:

echo logpath="C:\Program Files\MongoDB\log\mongo.log" > "C:\Program Files\MongoDB\mongod.cfg"

Step 2: Run the MongoDB service.
Run all of the following commands in Command Prompt with “Administrative Privileges:”

Install the MongoDB service. For --install to succeed, you must specify the logpath run-time option.

"C:\Program Files\MongoDB\bin\mongod.exe" --config "C:\Program Files\MongoDB\mongod.cfg" --install

21. https://jira.mongodb.org/browse/SERVER-13515
Modify the path to the mongod.cfg file as needed.

To use an alternate dbpath, specify the path in the configuration file (e.g. C:\Program Files\MongoDB\mongod.cfg) or on the command line with the --dbpath option. If the dbpath directory does not exist, mongod.exe will not start. The default value for dbpath is \data\db.

If needed, you can install services for multiple instances of mongod.exe or mongos.exe. Install each service with a unique --serviceName and --serviceDisplayName. Use multiple instances only when sufficient system resources exist and your system design requires it.

Step 3: Stop or remove the MongoDB service as needed. To stop the MongoDB service use the following command:

net stop MongoDB

To remove the MongoDB service use the following command:

"C:\Program Files\MongoDB\bin\mongod.exe" --remove

Manually Create a Windows Service for MongoDB Enterprise

The following procedure assumes you have installed MongoDB using the MSI installer, with the default path C:\Program Files\MongoDB 2.6 Enterprise. If you have installed in an alternative directory, you will need to adjust the paths as appropriate.

Step 1: Open an Administrator command prompt. Press Win + R, then type cmd, then press Ctrl + Shift + Enter. Execute the remaining steps from the Administrator command prompt.

Step 2: Create directories. Create directories for your database and log files:

mkdir c:\data\db
mkdir c:\data\log

Step 3: Create a configuration file. Create a configuration file. This file can include any of the configuration options for mongod, but must include a valid setting for logpath:

The following creates a configuration file, specifying both the logpath and the dbpath settings in the configuration file:

echo logpath=c:\data\log\mongod.log> "C:\Program Files\MongoDB 2.6 Enterprise\mongod.cfg"
echo dbpath=c:\data\db>> "C:\Program Files\MongoDB 2.6 Enterprise\mongod.cfg"

Step 4: Create the MongoDB service. Create the MongoDB service.
sc.exe create MongoDB binPath= "\"C:\Program Files\MongoDB 2.6 Enterprise\bin\mongod.exe\" --service --config=\"C:\Program Files\MongoDB 2.6 Enterprise\mongod.cfg\"" DisplayName= "MongoDB 2.6 Enterprise" start= "auto"

sc.exe requires a space between “=” and the configuration values (eg “binPath= ”), and a “\” to escape double quotes.

If successfully created, the following log message will display:
[SC] CreateService SUCCESS

Step 5: Start the MongoDB service.

net start MongoDB

Step 6: Stop or remove the MongoDB service as needed. To stop the MongoDB service, use the following command:

net stop MongoDB

To remove the MongoDB service, first stop the service and then run the following command:

sc.exe delete MongoDB

2.1.5 Verify Integrity of MongoDB Packages

Overview

The MongoDB release team digitally signs all software packages to certify that a particular MongoDB package is a valid and unaltered MongoDB release. Before installing MongoDB, you can validate packages using either a PGP signature or with MD5 and SHA checksums of the MongoDB packages.

The PGP signatures store an encrypted hash of the software package that you can validate to ensure that the package you have is consistent with the official package release. MongoDB also publishes MD5 and SHA hashes of the official packages that you can use to confirm that you have a valid package.

Considerations

MongoDB signs each release branch with a different PGP key. The public .asc and .pub key files for each branch are available for download. For example, the 2.2 keys are available at the following URLs:

https://www.mongodb.org/static/pgp/server-2.2.asc
https://www.mongodb.org/static/pgp/server-2.2.pub

Replace 2.2 with the appropriate release number to download the public key. Keys are available for all MongoDB releases beginning with 2.2.

Procedures

Use PGP/GPG

Step 1: Download the MongoDB installation file. Download the binaries from https://www.mongodb.org/downloads based on your environment. For example, to download the 2.6.0 release for OS X through the shell, type this command:

curl -LO http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.6.0.tgz
Step 2: Download the public signature file.

curl -LO http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.6.0.tgz.sig

Step 3: Download then import the key file. If you have not downloaded and imported the key file, enter these commands:

curl -LO https://www.mongodb.org/static/pgp/server-2.6.asc
gpg --import server-2.6.asc

You should receive this message:

gpg: key AAB2461C: public key "MongoDB 2.6 Release Signing Key <packaging@mongodb.com>" imported
gpg: Total number processed: 1
gpg: imported: 1 (RSA: 1)

Step 4: Verify the MongoDB installation file. Type this command:

gpg --verify mongodb-osx-x86_64-2.6.0.tgz.sig mongodb-osx-x86_64-2.6.0.tgz

You should receive this message:

gpg: Signature made Thu Mar 6 15:11:28 2014 EST using RSA key ID AAB2461C
gpg: Good signature from "MongoDB 2.6 Release Signing Key <packaging@mongodb.com>"

Download and import the key file, as described above, if you receive a message like this one:

gpg: Signature made Thu Mar 6 15:11:28 2014 EST using RSA key ID AAB2461C
gpg: Can't check signature: public key not found

gpg will return the following message if the package is properly signed, but you do not currently trust the signing key in your local trustdb:

gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: DFFA 3DCF 326E 302C 4787 673A 01C4 E7FA AAB2 461C

Use SHA

MongoDB provides checksums using both the SHA-1 and SHA-256 hash functions. You can use either, as you like.

Step 1: Download the MongoDB installation file. Download the binaries from https://www.mongodb.org/downloads based on your environment. For example, to download the 2.6.0 release for OS X through the shell, type this command:

curl -LO http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.6.0.tgz

Step 2: Download the SHA1 and SHA256 file.
curl -LO http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.6.3.tgz.sha1
curl -LO http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.6.3.tgz.sha256
Step 3: Use the SHA-1 checksum to verify the MongoDB package file. Compute the checksum of the package file:

shasum mongodb-osx-x86_64-2.6.3.tgz

which will generate this result:

fe511ee40428edda3a507f70d2b91d16b0483674 mongodb-osx-x86_64-2.6.3.tgz

Enter this command:

cat mongodb-osx-x86_64-2.6.3.tgz.sha1

which will generate this result:

fe511ee40428edda3a507f70d2b91d16b0483674 mongodb-osx-x86_64-2.6.3.tgz

The output of the shasum and cat commands should be identical.

Step 4: Use the SHA-256 checksum to verify the MongoDB package file. Compute the checksum of the package file:

shasum -a 256 mongodb-osx-x86_64-2.6.3.tgz

which will generate this result:

be3a5e9f4e9c8e954e9af7053776732387d2841a019185eaf2e52086d4d207a3 mongodb-osx-x86_64-2.6.3.tgz

Enter this command:

cat mongodb-osx-x86_64-2.6.3.tgz.sha256

which will generate this result:

be3a5e9f4e9c8e954e9af7053776732387d2841a019185eaf2e52086d4d207a3 mongodb-osx-x86_64-2.6.3.tgz

The output of the shasum and cat commands should be identical.

Use MD5

Step 1: Download the MongoDB installation file. Download the binaries from https://www.mongodb.org/downloads based on your environment. For example, to download the 2.6.0 release for OS X through the shell, type this command:

curl -LO http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.6.0.tgz

Step 2: Download the MD5 file.

curl -LO http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.6.0.tgz.md5

Step 3: Verify the checksum values for the MongoDB package file (Linux). Compute the checksum of the package file:

md5 mongodb-linux-x86_64-2.6.0.tgz

which will generate this result:
MD5 (mongodb-linux-x86_64-2.6.0.tgz) = a937d49881f90e1a024b58d642011dc4

Enter this command:

cat mongodb-linux-x86_64-2.6.0.tgz.md5

which will generate this result:

a937d49881f90e1a024b58d642011dc4

The output of the md5 and cat commands should be identical.

Step 4: Verify the MongoDB installation file (OS X). Compute the checksum of the package file:

md5sum -c mongodb-osx-x86_64-2.6.0.tgz.md5 mongodb-osx-x86_64-2.6.0.tgz

which will generate this result:

mongodb-osx-x86_64-2.6.0.tgz ok

2.2 First Steps with MongoDB

After you have installed MongoDB, consider the following documents as you begin to learn about MongoDB:

Getting Started with MongoDB (page 43) An introduction to the basic operation and use of MongoDB.
Generate Test Data (page 47) To support initial exploration, generate test data to facilitate testing.

2.2.1 Getting Started with MongoDB

This tutorial provides an introduction to basic database operations using the mongo shell. mongo is a part of the standard MongoDB distribution and provides a full JavaScript environment with complete access to the JavaScript language and all standard functions as well as a full database interface for MongoDB. See the mongo JavaScript API22 documentation and the mongo shell JavaScript Method Reference.

The tutorial assumes that you’re running MongoDB on a Linux or OS X operating system and that you have a running database server; MongoDB does support Windows and provides a Windows distribution with identical operation. For instructions on installing MongoDB and starting the database server, see the appropriate installation (page 5) document.

Connect to a Database

In this section, you connect to the database server, which runs as mongod, and begin using the mongo shell to select a logical database within the database instance and access the help text in the mongo shell.
Connect to a mongod

From a system prompt, start mongo by issuing the mongo command, as follows:

mongo

22. http://api.mongodb.org/js
By default, mongo looks for a database server listening on port 27017 on the localhost interface. To connect to a server on a different port or interface, use the --port and --host options.

Select a Database

After starting the mongo shell, your session will use the test database by default. At any time, issue the following operation at the mongo shell to report the name of the current database:

db

1. From the mongo shell, display the list of databases, with the following operation:

show dbs

2. Switch to a new database named mydb, with the following operation:

use mydb

3. Confirm that your session has the mydb database as context, by checking the value of the db object, which returns the name of the current database, as follows:

db

At this point, if you issue the show dbs operation again, it will not include the mydb database. MongoDB will not permanently create a database until you insert data into that database. The Create a Collection and Insert Documents (page 44) section describes the process for inserting data.

New in version 2.4: show databases also returns a list of databases.

Display mongo Help

At any point, you can access help for the mongo shell using the following operation:

help

Furthermore, you can append the .help() method to some JavaScript methods, any cursor object, as well as the db and db.collection objects to return additional help information.

Create a Collection and Insert Documents

In this section, you insert documents into a new collection named testData within the new database named mydb.

MongoDB will create a collection implicitly upon its first use. You do not need to create a collection before inserting data. Furthermore, because MongoDB uses dynamic schemas (page 688), you also need not specify the structure of your documents before inserting them into the collection.

1. From the mongo shell, confirm you are in the mydb database by issuing the following:

db

2.
If mongo does not return mydb for the previous operation, set the context to the mydb database, with the following operation:

use mydb

3. Create two documents named j and k by using the following sequence of JavaScript operations:
j = { name : "mongo" }
k = { x : 3 }

4. Insert the j and k documents into the testData collection with the following sequence of operations:

db.testData.insert( j )
db.testData.insert( k )

When you insert the first document, the mongod will create both the mydb database and the testData collection.

5. Confirm that the testData collection exists. Issue the following operation:

show collections

The mongo shell will return the list of the collections in the current (i.e. mydb) database. At this point, the only collection is testData. All mongod databases also have a system.indexes (page 271) collection.

6. Confirm that the documents exist in the testData collection by issuing a query on the collection using the find() method:

db.testData.find()

This operation returns the following results. The ObjectId (page 165) values will be unique:

{ "_id" : ObjectId("4c2209f9f3924d31102bd84a"), "name" : "mongo" }
{ "_id" : ObjectId("4c2209fef3924d31102bd84b"), "x" : 3 }

All MongoDB documents must have an _id field with a unique value. These operations do not explicitly specify a value for the _id field, so mongo creates a unique ObjectId (page 165) value for the field before inserting it into the collection.

Insert Documents using a For Loop or a JavaScript Function

To perform the remaining procedures in this tutorial, first add more documents to your database using one or both of the procedures described in Generate Test Data (page 47).

Working with the Cursor

When you query a collection, MongoDB returns a “cursor” object that contains the results of the query. The mongo shell then iterates over the cursor to display the results. Rather than returning all results at once, the shell iterates over the cursor 20 times to display the first 20 results and then waits for a request to iterate over the remaining results. In the shell, type it to iterate over the next set of results.
The procedures in this section show other ways to work with a cursor. For comprehensive documentation on cursors, see crud-read-cursor.

Iterate over the Cursor with a Loop

Before using this procedure, add documents to a collection using one of the procedures in Generate Test Data (page 47). You can name your database and collections anything you choose, but this procedure will assume the database named test and a collection named testData.

1. In the MongoDB JavaScript shell, query the testData collection and assign the resulting cursor object to the c variable:
var c = db.testData.find()

2. Print the full result set by using a while loop to iterate over the c variable:

while ( c.hasNext() ) printjson( c.next() )

The hasNext() function returns true if the cursor has documents. The next() method returns the next document. The printjson() method renders the document in a JSON-like format.

The operation displays all documents:

{ "_id" : ObjectId("51a7dc7b2cacf40b79990be6"), "x" : 1 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be7"), "x" : 2 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be8"), "x" : 3 }
...

Use Array Operations with the Cursor

The following procedure lets you manipulate a cursor object as if it were an array:

1. In the mongo shell, query the testData collection and assign the resulting cursor object to the c variable:

var c = db.testData.find()

2. To find the document at the array index 4, use the following operation:

printjson( c [ 4 ] )

MongoDB returns the following:

{ "_id" : ObjectId("51a7dc7b2cacf40b79990bea"), "x" : 5 }

When you access documents in a cursor using the array index notation, mongo first calls the cursor.toArray() method and loads into RAM all documents returned by the cursor. The index is then applied to the resulting array. This operation iterates the cursor completely and exhausts the cursor. For very large result sets, mongo may run out of available memory.

For more information on the cursor, see crud-read-cursor.

Query for Specific Documents

MongoDB has a rich query system that allows you to select and filter the documents in a collection along specific fields and values. See Query Documents (page 87) and Read Operations (page 55) for a full account of queries in MongoDB.

In this procedure, you query for specific documents in the testData collection by passing a “query document” as a parameter to the find() method. A query document specifies the criteria the query must match to return a document.
In the mongo shell, query for all documents where the x field has a value of 18 by passing the { x : 18 } query document as a parameter to the find() method:

db.testData.find( { x : 18 } )

MongoDB returns the one document that matches this query:

{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf7"), "x" : 18 }
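The matching rule for a simple query document like { x : 18 } can be sketched as "every field in the query document must equal the corresponding field in the candidate document". The matches() helper below is illustrative only; it is not how the server evaluates queries:

```javascript
// Sketch of equality matching for a flat query document: a document
// matches when every query field is present with an equal value.
function matches(queryDoc, doc) {
  return Object.keys(queryDoc).every(function (k) {
    return doc[k] === queryDoc[k];
  });
}

// Stand-in for the testData collection populated earlier ({ x: 1 } .. { x: 25 }).
const testData = [];
for (let i = 1; i <= 25; i++) testData.push({ x: i });

const results = testData.filter(function (doc) {
  return matches({ x: 18 }, doc);
});
console.log(results.length); // 1
```

Real query documents may also contain operators such as $gt or $lt, which this equality-only sketch does not handle.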
Return a Single Document from a Collection

With the findOne() method you can return a single document from a MongoDB collection. The findOne() method takes the same parameters as find(), but returns a document rather than a cursor.

To retrieve one document from the testData collection, issue the following command:

db.testData.findOne()

For more information on querying for documents, see the Query Documents (page 87) and Read Operations (page 55) documentation.

Limit the Number of Documents in the Result Set

To increase performance, you can constrain the size of the result set by limiting the amount of data your application must receive over the network. To specify the maximum number of documents in the result set, call the limit() method on a cursor, as in the following command:

db.testData.find().limit(3)

MongoDB will return the following result, with different ObjectId (page 165) values:

{ "_id" : ObjectId("51a7dc7b2cacf40b79990be6"), "x" : 1 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be7"), "x" : 2 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be8"), "x" : 3 }

Next Steps with MongoDB

For more information on manipulating the documents in a database as you continue to learn MongoDB, consider the following resources:

• MongoDB CRUD Operations (page 51)
• SQL to MongoDB Mapping Chart (page 120)
• http://docs.mongodb.org/manual/applications/drivers

2.2.2 Generate Test Data

This tutorial describes how to quickly generate test data you can use to test basic MongoDB operations.

Insert Multiple Documents Using a For Loop

You can add documents to a new or existing collection by using a JavaScript for loop run from the mongo shell.

1. From the mongo shell, insert new documents into the testData collection using the following for loop. If the testData collection does not exist, MongoDB creates the collection implicitly.

for (var i = 1; i <= 25; i++) db.testData.insert( { x : i } )

2.
Use find() to query the collection:

db.testData.find()
The mongo shell displays the first 20 documents in the collection. Your ObjectId (page 165) values will be different:

{ "_id" : ObjectId("51a7dc7b2cacf40b79990be6"), "x" : 1 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be7"), "x" : 2 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be8"), "x" : 3 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be9"), "x" : 4 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bea"), "x" : 5 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990beb"), "x" : 6 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bec"), "x" : 7 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bed"), "x" : 8 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bee"), "x" : 9 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bef"), "x" : 10 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf0"), "x" : 11 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf1"), "x" : 12 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf2"), "x" : 13 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf3"), "x" : 14 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf4"), "x" : 15 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf5"), "x" : 16 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf6"), "x" : 17 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf7"), "x" : 18 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf8"), "x" : 19 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bf9"), "x" : 20 }

3. The find() method returns a cursor. To iterate the cursor and return more documents, use the it operation in the mongo shell. The mongo shell will exhaust the cursor and return the following documents:

{ "_id" : ObjectId("51a7dce92cacf40b79990bfc"), "x" : 21 }
{ "_id" : ObjectId("51a7dce92cacf40b79990bfd"), "x" : 22 }
{ "_id" : ObjectId("51a7dce92cacf40b79990bfe"), "x" : 23 }
{ "_id" : ObjectId("51a7dce92cacf40b79990bff"), "x" : 24 }
{ "_id" : ObjectId("51a7dce92cacf40b79990c00"), "x" : 25 }

Insert Multiple Documents with a mongo Shell Function

You can create a JavaScript function in your shell session to generate the above data.
The insertData() JavaScript function, shown here, creates new data for use in testing or training by either creating a new collection or appending data to an existing collection:

function insertData(dbName, colName, num) {
  var col = db.getSiblingDB(dbName).getCollection(colName);
  for (var i = 0; i < num; i++) {
    col.insert({ x: i });
  }
  print(col.count());
}

The insertData() function takes three parameters: a database name, a new or existing collection name, and the number of documents to create. The function creates documents with an x field set to an incremented integer, as in the following example documents:

{ "_id" : ObjectId("51a4da9b292904caffcff6eb"), "x" : 0 }
{ "_id" : ObjectId("51a4da9b292904caffcff6ec"), "x" : 1 }
{ "_id" : ObjectId("51a4da9b292904caffcff6ed"), "x" : 2 }
Store the function in your .mongorc.js file. The mongo shell loads the function for you every time you start a session.

Example

Specify the database name, collection name, and the number of documents to insert as arguments to insertData():

insertData("test", "testData", 400)

This operation inserts 400 documents into the testData collection in the test database. If the collection and database do not exist, MongoDB creates them implicitly before inserting documents.

See also:

MongoDB CRUD Concepts (page 53) and Data Models (page 131).
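The insertData() pattern can be sketched without a running server: an array stands in for the collection, and the function appends num documents of the form { x: i } and returns the resulting count. This is an illustration of the data-generation logic only, not the shell helper itself:

```javascript
// Sketch of insertData(): append `num` documents with an incrementing
// x field to an array acting as the collection, then report the count.
function insertData(collection, num) {
  for (let i = 0; i < num; i++) {
    collection.push({ x: i });
  }
  return collection.length; // the shell version prints col.count()
}

const testData = [];
const count = insertData(testData, 400);
console.log(count); // 400
```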
CHAPTER 3

MongoDB CRUD Operations

MongoDB provides rich semantics for reading and manipulating data. CRUD stands for create, read, update, and delete. These terms are the foundation for all interactions with the database.

MongoDB CRUD Introduction (page 51) An introduction to the MongoDB data model as well as queries and data manipulations.
MongoDB CRUD Concepts (page 53) The core documentation of query and data manipulation.
MongoDB CRUD Tutorials (page 84) Examples of basic query and data modification operations.
MongoDB CRUD Reference (page 117) Reference material for the query and data manipulation interfaces.

3.1 MongoDB CRUD Introduction

MongoDB stores data in the form of documents, which are JSON-like field and value pairs. Documents are analogous to structures in programming languages that associate keys with values (e.g. dictionaries, hashes, maps, and associative arrays). Formally, MongoDB documents are BSON documents. BSON is a binary representation of JSON with additional type information. In documents, the value of a field can be any of the BSON data types, including other documents, arrays, and arrays of documents. For more information, see Documents (page 158).

Figure 3.1: A MongoDB document.

MongoDB stores all documents in collections. A collection is a group of related documents that have a set of shared common indexes. Collections are analogous to tables in relational databases.
Figure 3.2: A collection of MongoDB documents.

3.1.1 Database Operations

Query

In MongoDB a query targets a specific collection of documents. Queries specify criteria, or conditions, that identify the documents that MongoDB returns to the clients. A query may include a projection that specifies the fields from the matching documents to return. You can optionally modify queries to impose limits, skips, and sort orders.

In the following diagram, the query process specifies a query criteria and a sort modifier. See Read Operations Overview (page 55) for more information.

Data Modification

Data modification refers to operations that create, update, or delete data. In MongoDB, these operations modify the data of a single collection. For the update and delete operations, you can specify the criteria to select the documents to update or remove.

In the following diagram, the insert operation adds a new document to the users collection. See Write Operations Overview (page 68) for more information.

3.1.2 Related Features

Indexes

To enhance the performance of common queries and updates, MongoDB has full support for secondary indexes. These indexes allow applications to store a view of a portion of the collection in an efficient data structure. Most indexes store an ordered representation of all values of a field or a group of fields. Indexes may also enforce uniqueness (page 457), store objects in a geospatial representation (page 444), and facilitate text search (page 454).
Figure 3.3: The stages of a MongoDB query with a query criteria and a sort modifier.

Replica Set Read Preference

For replica sets and sharded clusters with replica set components, applications specify read preferences (page 530). A read preference determines how the client directs read operations to the set.

Write Concern

Applications can also control the behavior of write operations using write concern (page 72). Particularly useful for deployments with replica sets, the write concern semantics allow clients to specify the assurance that MongoDB provides when reporting on the success of a write operation.

Aggregation

In addition to the basic queries, MongoDB provides several data aggregation features. For example, MongoDB can return counts of the number of documents that match a query, return the number of distinct values for a field, or process a collection of documents using a versatile stage-based data processing pipeline or map-reduce operations.

3.2 MongoDB CRUD Concepts

The Read Operations (page 55) and Write Operations (page 67) documents introduce the behavior and operations of read and write operations for MongoDB deployments.

Read Operations (page 55) Introduces all operations that select and return documents to clients, including the query specifications.
Cursors (page 59) Queries return iterable objects, called cursors, that hold the full result set.
Query Optimization (page 60) Analyze and improve query performance.
Figure 3.4: The stages of a MongoDB insert operation.
Distributed Queries (page 63) Describes how sharded clusters and replica sets affect the performance of read operations.
Write Operations (page 67) Introduces data create and modify operations, their behavior, and their performance.
Write Concern (page 72) Describes the kind of guarantee MongoDB provides when reporting on the success of a write operation.
Distributed Write Operations (page 76) Describes how MongoDB directs write operations on sharded clusters and replica sets and the performance characteristics of these operations.

Continue reading from Write Operations (page 67) for additional background on the behavior of data modification operations in MongoDB.

3.2.1 Read Operations

The following documents describe read operations:

Read Operations Overview (page 55) A high level overview of queries and projections in MongoDB, including a discussion of syntax and behavior.
Cursors (page 59) Queries return iterable objects, called cursors, that hold the full result set.
Query Optimization (page 60) Analyze and improve query performance.
Query Plans (page 61) MongoDB executes queries using optimal plans.
Distributed Queries (page 63) Describes how sharded clusters and replica sets affect the performance of read operations.

Read Operations Overview

Read operations, or queries, retrieve data stored in the database. In MongoDB, queries select documents from a single collection.

Queries specify criteria, or conditions, that identify the documents that MongoDB returns to the clients. A query may include a projection that specifies the fields from the matching documents to return. The projection limits the amount of data that MongoDB returns to the client over the network.

Query Interface

For query operations, MongoDB provides a db.collection.find() method. The method accepts both the query criteria and projections and returns a cursor (page 59) to the matching documents.
You can optionally modify the query to impose limits, skips, and sort orders.

The following diagram highlights the components of a MongoDB query operation:

Figure 3.5: The components of a MongoDB find operation.
The next diagram shows the same query in SQL:

Figure 3.6: The components of a SQL SELECT statement.

Example

db.users.find( { age: { $gt: 18 } }, { name: 1, address: 1 } ).limit(5)

This query selects the documents in the users collection that match the condition age is greater than 18. To specify the greater-than condition, the query criteria uses the greater than (i.e. $gt) query selection operator. The query returns at most 5 matching documents (or more precisely, a cursor to those documents). The matching documents will return with only the _id, name, and address fields. See Projections (page 57) for details.

See SQL to MongoDB Mapping Chart (page 120) for additional examples of MongoDB queries and the corresponding SQL statements.

Query Behavior

MongoDB queries exhibit the following behavior:

• All queries in MongoDB address a single collection.
• You can modify the query to impose limits, skips, and sort orders.
• The order of documents returned by a query is not defined unless you specify a sort().
• Operations that modify existing documents (page 98) (i.e. updates) use the same query syntax as queries to select documents to update.
• In the aggregation (page 391) pipeline, the $match pipeline stage provides access to MongoDB queries.

MongoDB provides a db.collection.findOne() method as a special case of find() that returns a single document.

Query Statements

Consider the following diagram of the query process that specifies a query criteria and a sort modifier:

In the diagram, the query selects documents from the users collection. Using a query selection operator to define the conditions for matching documents, the query selects documents that have age greater than (i.e. $gt) 18. Then the sort() modifier sorts the results by age in ascending order.

For additional examples of queries, see Query Documents (page 87).
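The criteria, sort, and limit stages above can be sketched over a plain array. The find() helper below hand-codes a single $gt comparison; it is a simplified illustration of the stages, not the server's query engine:

```javascript
// Sketch of find(criteria).sort().limit() as three array stages:
// filter by a greater-than condition, sort ascending, then cap the count.
function find(docs, field, gtValue) {
  return docs.filter(function (d) { return d[field] > gtValue; });
}

const users = [
  { name: "al", age: 17 },
  { name: "bo", age: 30 },
  { name: "cy", age: 22 },
];

const matched = find(users, "age", 18);                        // like { age: { $gt: 18 } }
const sorted = matched.slice().sort(function (a, b) {          // like sort( { age: 1 } )
  return a.age - b.age;
});
const limited = sorted.slice(0, 5);                            // like limit(5)
console.log(limited.map(function (u) { return u.name; }));     // cy first, then bo
```

In MongoDB the server applies these stages itself and returns a cursor, so the client never needs to materialize the full collection.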
Figure 3.7: The stages of a MongoDB query with a query criteria and a sort modifier.

Projections

Queries in MongoDB return all fields in all matching documents by default. To limit the amount of data that MongoDB sends to applications, include a projection in the queries. By projecting results with a subset of fields, applications reduce their network overhead and processing requirements.

Projections, which are the second argument to the find() method, may either specify a list of fields to return or list fields to exclude in the result documents.

Important: Except for excluding the _id field in inclusive projections, you cannot mix exclusive and inclusive projections.

Consider the following diagram of the query process that specifies a query criteria and a projection:

In the diagram, the query selects from the users collection. The criteria matches the documents that have age equal to 18. Then the projection specifies that only the name field should return in the matching documents.

Projection Examples

Exclude One Field From a Result Set

db.records.find( { "user_id": { $lt: 42 } }, { "history": 0 } )

This query selects documents in the records collection that match the condition { "user_id": { $lt: 42 } }, and uses the projection { "history": 0 } to exclude the history field from the documents in the result set.

Return Two Fields and the _id Field

db.records.find( { "user_id": { $lt: 42 } }, { "name": 1, "email": 1 } )
Figure 3.8: The stages of a MongoDB query with a query criteria and projection. MongoDB only transmits the projected data to the clients.

This query selects documents in the records collection that match the query { "user_id": { $lt: 42 } } and uses the projection { "name": 1, "email": 1 } to return just the _id field (implicitly included), the name field, and the email field in the documents in the result set.

Return Two Fields and Exclude _id

db.records.find( { "user_id": { $lt: 42 } }, { "_id": 0, "name": 1, "email": 1 } )

This query selects documents in the records collection that match the query { "user_id": { $lt: 42 } }, and returns only the name and email fields in the documents in the result set.

See Limit Fields to Return from a Query (page 94) for more examples of queries with projection statements.

Projection Behavior

MongoDB projections have the following properties:

• By default, the _id field is included in the results. To suppress the _id field from the result set, specify _id: 0 in the projection document.
• For fields that contain arrays, MongoDB provides the following projection operators: $elemMatch, $slice, and $.
• For related projection functionality in the aggregation framework (page 391) pipeline, use the $project pipeline stage.
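The inclusive-projection rules above (listed fields are kept, _id is kept unless suppressed with _id: 0) can be sketched as a small function. project() is a hypothetical helper for illustration; it handles only inclusive projections, not exclusion or array operators:

```javascript
// Sketch of inclusive projection: copy each field whose spec value is 1,
// and include _id by default unless the spec says "_id": 0.
function project(doc, spec) {
  const out = {};
  if (spec._id !== 0 && "_id" in doc) out._id = doc._id; // _id included by default
  for (const k of Object.keys(spec)) {
    if (k !== "_id" && spec[k] === 1 && k in doc) out[k] = doc[k];
  }
  return out;
}

const doc = { _id: 7, name: "sue", email: "s@x.io", history: [1, 2] };
const withId = project(doc, { name: 1, email: 1 });          // keeps _id, name, email
const withoutId = project(doc, { _id: 0, name: 1, email: 1 }); // drops _id
console.log(JSON.stringify(withId));
console.log(JSON.stringify(withoutId));
```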
Cursors

In the mongo shell, the primary method for the read operation is the db.collection.find() method. This method queries a collection and returns a cursor to the matching documents.

To access the documents, you need to iterate the cursor. However, in the mongo shell, if the returned cursor is not assigned to a variable using the var keyword, then the cursor is automatically iterated up to 20 times [1] to print up to the first 20 documents in the results.

For example, in the mongo shell, the following read operation queries the inventory collection for documents that have type equal to 'food' and automatically prints up to the first 20 matching documents:

db.inventory.find( { type: 'food' } );

To manually iterate the cursor to access the documents, see Iterate a Cursor in the mongo Shell (page 95).

Cursor Behaviors

Closure of Inactive Cursors

By default, the server will automatically close the cursor after 10 minutes of inactivity or if the client has exhausted the cursor. To override this behavior, you can specify the noTimeout wire protocol flag [2] in your query; however, you should either close the cursor manually or exhaust the cursor. In the mongo shell, you can set the noTimeout flag:

var myCursor = db.inventory.find().addOption(DBQuery.Option.noTimeout);

See your driver documentation for information on setting the noTimeout flag. For the mongo shell, see cursor.addOption() for a complete list of available cursor flags.

Cursor Isolation

Because the cursor is not isolated during its lifetime, intervening write operations on a document may result in a cursor that returns a document more than once if that document has changed. To handle this situation, see the information on snapshot mode (page 698).

Cursor Batches

The MongoDB server returns the query results in batches. Batch size will not exceed the maximum BSON document size.
For most queries, the first batch returns 101 documents or just enough documents to exceed 1 megabyte. The subsequent batch size is 4 megabytes. To override the default size of the batch, see batchSize() and limit().

For queries that include a sort operation without an index, the server must load all the documents in memory to perform the sort and will return all documents in the first batch.

As you iterate through the cursor and reach the end of the returned batch, if there are more results, cursor.next() will perform a getmore operation to retrieve the next batch. To see how many documents remain in the batch as you iterate the cursor, you can use the objsLeftInBatch() method, as in the following example:

var myCursor = db.inventory.find();
var myFirstDocument = myCursor.hasNext() ? myCursor.next() : null;
myCursor.objsLeftInBatch();

[1] You can use DBQuery.shellBatchSize to change the number of iterations from the default value 20. See Executing Queries (page 256) for more information.
[2] http://docs.mongodb.org/meta-driver/latest/legacy/mongodb-wire-protocol
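The batch mechanics described above can be sketched in plain JavaScript: the cursor holds the current batch, next() triggers a "getmore" when the batch is empty, and objsLeftInBatch() reports what remains. Batch sizes here are arbitrary small numbers for illustration; a real server sizes batches by document count and byte limits:

```javascript
// Sketch of batched cursor delivery: documents arrive in fixed-size
// batches, and a new batch is fetched (a "getmore") only when the
// current batch is exhausted.
function batchedCursor(docs, batchSize) {
  let pos = 0;       // next document to hand out
  let batchEnd = 0;  // one past the last document in the current batch
  return {
    next: function () {
      if (pos === batchEnd) {
        batchEnd = Math.min(pos + batchSize, docs.length); // simulate getmore
      }
      return docs[pos++];
    },
    objsLeftInBatch: function () { return batchEnd - pos; },
  };
}

const docs = Array.from({ length: 5 }, function (_, i) { return { x: i }; });
const cur = batchedCursor(docs, 3);
cur.next();
console.log(cur.objsLeftInBatch()); // 2 left in the first batch of 3
```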
Cursor Information

The db.serverStatus() method returns a document that includes a metrics field. The metrics field contains a cursor field with the following information:

• number of timed out cursors since the last server restart
• number of open cursors with the option DBQuery.Option.noTimeout set to prevent timeout after a period of inactivity
• number of "pinned" open cursors
• total number of open cursors

Consider the following example, which calls the db.serverStatus() method and accesses the metrics field from the results, and then the cursor field from the metrics field:

db.serverStatus().metrics.cursor

The result is the following document:

{
  "timedOut" : <number>,
  "open" : {
    "noTimeout" : <number>,
    "pinned" : <number>,
    "total" : <number>
  }
}

See also:

db.serverStatus()

Query Optimization

Indexes improve the efficiency of read operations by reducing the amount of data that query operations need to process. This simplifies the work associated with fulfilling queries within MongoDB.

Create an Index to Support Read Operations

If your application queries a collection on a particular field or fields, then an index on the queried field or fields can prevent the query from scanning the whole collection to find and return the query results. For more information about indexes, see the complete documentation of indexes in MongoDB (page 436).

Example

An application queries the inventory collection on the type field. The value of the type field is user-driven.

var typeValue = <someUserInput>;
db.inventory.find( { type: typeValue } );

To improve the performance of this query, add an ascending or a descending index to the inventory collection on the type field. [3] In the mongo shell, you can create indexes using the db.collection.ensureIndex() method:

[3] For single-field indexes, the selection between ascending and descending order is immaterial. For compound indexes, the selection is important.
See indexing order (page 441) for more details.
db.inventory.ensureIndex( { type: 1 } )

This index can prevent the above query on type from scanning the whole collection to return the results. To analyze the performance of the query with an index, see Analyze Query Performance (page 97).

In addition to optimizing read operations, indexes can support sort operations and allow for more efficient storage utilization. See db.collection.ensureIndex() and Indexing Tutorials (page 464) for more information about index creation.

Query Selectivity

Some query operations are not selective. These operations cannot use indexes effectively or cannot use indexes at all.

The inequality operators $nin and $ne are not very selective, as they often match a large portion of the index. As a result, in most cases, a $nin or $ne query with an index may perform no better than a $nin or $ne query that must scan all documents in a collection.

Queries that specify regular expressions, with inline JavaScript regular expressions or $regex operator expressions, cannot use an index, with one exception: queries that specify a regular expression with an anchor at the beginning of a string can use an index.

Covering a Query

An index covers (page 495) a query, a covered query, when:

• all the fields in the query (page 87) are part of that index, and
• all the fields returned in the documents that match the query are in the same index.

For these queries, MongoDB does not need to inspect documents outside of the index. This is often more efficient than inspecting entire documents.
Example

Given a collection inventory with the following index on the type and item fields:

{ type: 1, item: 1 }

This index will cover the following query on the type and item fields, which returns only the item field:

db.inventory.find( { type: "food", item: /^c/ }, { item: 1, _id: 0 } )

However, the index will not cover the following query, which returns the item field and the _id field:

db.inventory.find( { type: "food", item: /^c/ }, { item: 1 } )

See Create Indexes that Support Covered Queries (page 495) for more information on the behavior and use of covered queries.

Query Plans

The MongoDB query optimizer processes queries and chooses the most efficient query plan for a query given the available indexes. The query system then uses this query plan each time the query runs.
The query optimizer only caches the plans for those query shapes that can have more than one viable plan. The query optimizer occasionally reevaluates query plans as the content of the collection changes to ensure optimal query plans. You can also specify which indexes the optimizer evaluates with Index Filters (page 63).

You can use the explain() method to view statistics about the query plan for a given query. This information can help as you develop indexing strategies (page 493).

Query Optimization

To create a new query plan, the query optimizer:

1. runs the query against several candidate indexes in parallel.
2. records the matches in a common results buffer or buffers.
   • If the candidate plans include only ordered query plans, there is a single common results buffer.
   • If the candidate plans include only unordered query plans, there is a single common results buffer.
   • If the candidate plans include both ordered query plans and unordered query plans, there are two common results buffers, one for the ordered plans and the other for the unordered plans.
   If an index returns a result already returned by another index, the optimizer skips the duplicate match. In the case of the two buffers, both buffers are de-duped.
3. stops the testing of candidate plans and selects an index when one of the following events occurs:
   • An unordered query plan has returned all the matching results; or
   • An ordered query plan has returned all the matching results; or
   • An ordered query plan has returned a threshold number of matching results:
     – Version 2.0: Threshold is the query batch size. The default batch size is 101.
     – Version 2.2: Threshold is 101.

The selected index becomes the index specified in the query plan; future iterations of this query, or queries with the same query pattern, will use this index.
Query pattern refers to query select conditions that differ only in the values, as in the following two queries with the same query pattern:

db.inventory.find( { type: 'food' } )
db.inventory.find( { type: 'utensil' } )

Query Plan Revision

As collections change over time, the query optimizer deletes the query plan and re-evaluates it after any of the following events:

• The collection receives 1,000 write operations.
• The reIndex command rebuilds the index.
• You add or drop an index.
• The mongod process restarts.
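The "query pattern" idea above can be sketched as a normalization step: replace every value in the query document with a placeholder, and two queries that normalize to the same string share a pattern (and hence a cached plan). queryShape() is an illustrative helper, not the server's actual normalization:

```javascript
// Sketch of query-pattern normalization: strip values, keep field names,
// so queries differing only in values map to the same cache key.
function queryShape(queryDoc) {
  const shape = {};
  for (const k of Object.keys(queryDoc).sort()) {
    shape[k] = "<value>";
  }
  return JSON.stringify(shape);
}

const a = queryShape({ type: "food" });
const b = queryShape({ type: "utensil" });
console.log(a === b); // true: same pattern, so one cached plan serves both
```

A plan cache keyed on such a shape lets the optimizer reuse the winning index across all queries with that pattern until one of the revision events listed above invalidates it.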
Cached Query Plan Interface

New in version 2.6.

MongoDB provides http://docs.mongodb.org/manual/reference/method/js-plan-cache to view and modify the cached query plans.

Index Filters

New in version 2.6.

Index filters determine which indexes the optimizer evaluates for a query shape. A query shape consists of a combination of query, sort, and projection specifications. If an index filter exists for a given query shape, the optimizer only considers those indexes specified in the filter.

When an index filter exists for the query shape, MongoDB ignores the hint(). To see whether MongoDB applied an index filter for a query, check the explain.filterSet field of the explain() output.

Index filters only affect which indexes the optimizer evaluates; the optimizer may still select the collection scan as the winning plan for a given query shape.

Index filters exist for the duration of the server process and do not persist after shutdown. MongoDB also provides a command to manually remove filters. Because index filters override the expected behavior of the optimizer as well as the hint() method, use index filters sparingly.

See planCacheListFilters, planCacheClearFilters, and planCacheSetFilter.

Distributed Queries

Read Operations to Sharded Clusters

Sharded clusters allow you to partition a data set among a cluster of mongod instances in a way that is nearly transparent to the application. For an overview of sharded clusters, see the Sharding (page 607) section of this manual.

For a sharded cluster, applications issue operations to one of the mongos instances associated with the cluster. Read operations on sharded clusters are most efficient when directed to a specific shard. Queries to sharded collections should include the collection's shard key (page 620). When a query includes a shard key, the mongos can use cluster metadata from the config database (page 616) to route the queries to shards.
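The routing decision can be sketched as follows: if the query document contains the shard key, mongos can target the shard (or shards) that own the matching key range; otherwise it must broadcast to every shard. The shard list and the toy "hash" below are hypothetical simplifications for illustration only:

```javascript
// Sketch of mongos routing: targeted when the shard key is present,
// scatter-gather (all shards) when it is not.
const shards = ["shard0", "shard1", "shard2"];

function route(queryDoc, shardKey) {
  if (!(shardKey in queryDoc)) {
    return shards; // no shard key: broadcast to every shard
  }
  // Toy stand-in for shard-key hashing / chunk lookup via config metadata.
  const bucket = String(queryDoc[shardKey]).length % shards.length;
  return [shards[bucket]];
}

const targeted = route({ zipcode: "10001" }, "zipcode"); // one shard
const broadcast = route({ name: "sue" }, "zipcode");     // all shards
console.log(targeted.length, broadcast.length);
```

The targeted case corresponds to Figure 3.10 and the broadcast case to Figure 3.11 below.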
If a query does not include the shard key, the mongos must direct the query to all shards in the cluster. These scatter-gather queries can be inefficient. On larger clusters, scatter-gather queries are infeasible for routine operations.

For more information on read operations in sharded clusters, see the Sharded Cluster Query Routing (page 624) and Shard Keys (page 620) sections.

Read Operations to Replica Sets

Replica sets use read preferences to determine where and how to route read operations to members of the replica set. By default, MongoDB always reads data from a replica set's primary. You can modify that behavior by changing the read preference mode (page 603).

You can configure the read preference mode (page 603) on a per-connection or per-operation basis to allow reads from secondaries to:
Figure 3.9: Diagram of a sharded cluster.
Figure 3.10: Read operations to a sharded cluster. Query criteria includes the shard key. The query router mongos can target the query to the appropriate shard or shards.
Figure 3.11: Read operations to a sharded cluster. Query criteria does not include the shard key. The query router mongos must broadcast the query to all shards for the collection.
• reduce latency in multi-data-center deployments,
• improve read throughput by distributing high read volumes (relative to write volume),
• perform backup operations, and/or
• allow reads during failover (page 523) situations.

Figure 3.12: Read operations to a replica set. Default read preference routes the read to the primary. Read preference of nearest routes the read to the nearest member.

Read operations from secondary members of replica sets are not guaranteed to reflect the current state of the primary, and the state of secondaries will trail the primary by some amount of time. Often, applications don't rely on this kind of strict consistency, but application developers should always consider the needs of their application before setting read preference.

For more information on read preference or on the read preference modes, see Read Preference (page 530) and Read Preference Modes (page 603).

3.2.2 Write Operations

The following documents describe write operations:

Write Operations Overview (page 68) Provides an overview of MongoDB's data insertion and modification operations, including aspects of the syntax and behavior.
Write Concern (page 72) Describes the kind of guarantee MongoDB provides when reporting on the success of a write operation.
Distributed Write Operations (page 76) Describes how MongoDB directs write operations on sharded clusters and replica sets and the performance characteristics of these operations.
Write Operation Performance (page 77) Introduces the performance constraints and factors for writing data to MongoDB deployments.

Bulk Inserts in MongoDB (page 81) Describes behaviors associated with inserting an array of documents.

Storage (page 82) Introduces the storage allocation strategies available for MongoDB collections.

Write Operations Overview

A write operation is any operation that creates or modifies data in the MongoDB instance. In MongoDB, write operations target a single collection. All write operations in MongoDB are atomic on the level of a single document.

There are three classes of write operations in MongoDB: insert (page 68), update (page 69), and remove (page 70). Insert operations add new data to a collection. Update operations modify existing data, and remove operations delete data from a collection. No insert, update, or remove can affect more than one document atomically.

For the update and remove operations, you can specify criteria, or conditions, that identify the documents to update or remove. These operations use the same query syntax to specify the criteria as read operations (page 55).

MongoDB allows applications to determine the acceptable level of acknowledgement required of write operations. See Write Concern (page 72) for more information.

Insert

In MongoDB, the db.collection.insert() method adds new documents to a collection. The following diagram highlights the components of a MongoDB insert operation:

Figure 3.13: The components of a MongoDB insert operation.

The following diagram shows the same query in SQL:

Example

The following operation inserts a new document into the users collection. The new document has four fields: name, age, status, and an _id field. MongoDB always adds the _id field to the new document if that field does not exist.

db.users.insert(
   {
      name: "sue",
      age: 26,
Figure 3.14: The components of a SQL INSERT statement.

      status: "A"
   }
)

For more information and examples, see db.collection.insert().

Insert Behavior

If you add a new document without the _id field, the client library or the mongod instance adds an _id field and populates the field with a unique ObjectId. If you specify the _id field, the value must be unique within the collection. For operations with write concern (page 72), if you try to create a document with a duplicate _id value, mongod returns a duplicate key exception.

Other Methods to Add Documents

You can also add new documents to a collection using methods that have an upsert (page 70) option. If the option is set to true, these methods will either modify existing documents or add a new document when no matching documents exist for the query. For more information, see Update Behavior with the upsert Option (page 70).

Update

In MongoDB, the db.collection.update() method modifies existing documents in a collection. The db.collection.update() method can accept query criteria to determine which documents to update as well as an options document that affects its behavior, such as the multi option to update multiple documents.

The following diagram highlights the components of a MongoDB update operation:

Figure 3.15: The components of a MongoDB update operation.

The following diagram shows the same query in SQL:

Example
Figure 3.16: The components of a SQL UPDATE statement.

db.users.update(
   { age: { $gt: 18 } },
   { $set: { status: "A" } },
   { multi: true }
)

This update operation on the users collection sets the status field to A for the documents that match the criteria of age greater than 18.

For more information, see db.collection.update() and update() Examples.

Default Update Behavior

By default, the db.collection.update() method updates a single document. However, with the multi option, update() can update all documents in a collection that match a query.

The db.collection.update() method either updates specific fields in the existing document or replaces the document. See db.collection.update() for details as well as examples.

When performing update operations that increase the document size beyond the allocated space for that document, the update operation relocates the document on disk.

MongoDB preserves the order of the document fields following write operations except for the following cases:

• The _id field is always the first field in the document.
• Updates that include renaming of field names may result in the reordering of fields in the document.

Changed in version 2.6: Starting in version 2.6, MongoDB actively attempts to preserve the field order in a document. Before version 2.6, MongoDB did not actively preserve the order of the fields in a document.

Update Behavior with the upsert Option

If the update() method includes upsert: true and no documents match the query portion of the update operation, then the update operation creates a new document. If there are matching documents, then the update operation with upsert: true modifies the matching document or documents.

By specifying upsert: true, applications can indicate, in a single operation, that if no matching documents are found for the update, an insert should be performed. See update() for details on performing an upsert.
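The multi and upsert semantics described above can be sketched against a plain in-memory array. This is an illustrative model only, not the server's implementation: the matcher below supports just equality and $gt, and the result field names mirror, but do not reproduce, the shell's WriteResult.

```javascript
// Minimal matcher: equality conditions plus the $gt operator only.
function matches(doc, query) {
  return Object.keys(query).every(function (field) {
    var cond = query[field];
    if (cond !== null && typeof cond === "object" && "$gt" in cond) {
      return doc[field] > cond.$gt;
    }
    return doc[field] === cond; // plain equality
  });
}

// update() with { multi, upsert } flags, loosely mirroring
// db.collection.update(): modify matching documents, or insert a new
// document when upsert is true and nothing matches.
function update(collection, query, setFields, options) {
  options = options || {};
  var matched = collection.filter(function (doc) { return matches(doc, query); });
  if (matched.length === 0) {
    if (options.upsert) {
      // The new document combines the equality conditions with the $set fields.
      var fresh = {};
      Object.keys(query).forEach(function (f) {
        if (typeof query[f] !== "object") fresh[f] = query[f];
      });
      Object.assign(fresh, setFields);
      collection.push(fresh);
      return { nMatched: 0, nUpserted: 1, nModified: 0 };
    }
    return { nMatched: 0, nUpserted: 0, nModified: 0 };
  }
  // Without multi, only the first matching document is updated.
  var targets = options.multi ? matched : matched.slice(0, 1);
  targets.forEach(function (doc) { Object.assign(doc, setFields); });
  return { nMatched: targets.length, nUpserted: 0, nModified: targets.length };
}
```

For example, update(users, { age: { $gt: 18 } }, { status: "A" }, { multi: true }) modifies every matching document, while the same call with { upsert: true } and a non-matching query inserts a new document instead.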
Changed in version 2.6: In 2.6, the new Bulk() methods and the underlying update command allow you to perform many updates with upsert: true operations in a single call.

Remove

In MongoDB, the db.collection.remove() method deletes documents from a collection. The db.collection.remove() method accepts query criteria to determine which documents to remove.

The following diagram highlights the components of a MongoDB remove operation:
Figure 3.17: The components of a MongoDB remove operation.

The following diagram shows the same query in SQL:

Figure 3.18: The components of a SQL DELETE statement.

Example

db.users.remove(
   { status: "D" }
)

This delete operation on the users collection removes all documents that match the criteria of status equal to D.

For more information, see the db.collection.remove() method and Remove Documents (page 101).

Remove Behavior

By default, the db.collection.remove() method removes all documents that match its query. However, the method can accept a flag to limit the delete operation to a single document.

Isolation of Write Operations

The modification of a single document is always atomic, even if the write operation modifies multiple sub-documents within that document. For write operations that modify multiple documents, the operation as a whole is not atomic, and other operations may interleave. No other operations are atomic. You can, however, attempt to isolate a write operation that affects multiple documents using the $isolated operator.

To isolate a sequence of write operations from other read and write operations, see Perform Two Phase Commits (page 102).

Additional Methods

The db.collection.save() method can either update an existing document or insert a document if the document cannot be found by the _id field. See db.collection.save() for more information and examples.

MongoDB also provides methods to perform write operations in bulk. See Bulk() for more information.
Write Concern

Write concern describes the guarantee that MongoDB provides when reporting on the success of a write operation. The strength of the write concerns determines the level of guarantee. When inserts, updates and deletes have a weak write concern, write operations return quickly. In some failure cases, write operations issued with weak write concerns may not persist. With stronger write concerns, clients wait after sending a write operation for MongoDB to confirm the write operations.

MongoDB provides different levels of write concern to better address the specific needs of applications. Clients may adjust write concern to ensure that the most important operations persist successfully to an entire MongoDB deployment. For other less critical operations, clients can adjust the write concern to ensure faster performance rather than ensure persistence to the entire deployment.

Changed in version 2.6: A new protocol for write operations (page 737) integrates write concern with the write operations.

For details on write concern configurations, see Write Concern Reference (page 118).

Considerations

Default Write Concern

The mongo shell and the MongoDB drivers use Acknowledged (page 73) as the default write concern. See Acknowledged (page 73) for more information, including when this write concern became the default.

Read Isolation

MongoDB allows clients to read documents inserted or modified before it commits these modifications to disk, regardless of write concern level or journaling configuration. As a result, applications may observe two classes of behaviors:

• For systems with multiple concurrent readers and writers, MongoDB will allow clients to read the results of a write operation before the write operation returns.
• If the mongod terminates before the journal commits, even if a write returns successfully, queries may have read data that will not exist after the mongod restarts.
Other database systems refer to these isolation semantics as read uncommitted. For all inserts and updates, MongoDB modifies each document in isolation: clients never see documents in intermediate states. For multi-document operations, MongoDB does not provide any multi-document transactions or isolation.

When mongod returns a successful journaled write concern, the data is fully committed to disk and will be available after mongod restarts.

For replica sets, write operations are durable only after a write replicates and commits to the journal of a majority of the members of the set. MongoDB regularly commits data to the journal regardless of journaled write concern: use the commitIntervalMs setting to control how often a mongod commits the journal.

Timeouts

Clients can set a wtimeout (page 119) value as part of a replica acknowledged (page 75) write concern. If the write concern is not satisfied in the specified interval, the operation returns an error, even if the write concern will eventually succeed. MongoDB does not "rollback" or undo modifications made before the wtimeout interval expired.

Write Concern Levels

MongoDB has the following levels of conceptual write concern, listed from weakest to strongest:
Unacknowledged

With an unacknowledged write concern, MongoDB does not acknowledge the receipt of write operations. Unacknowledged is similar to errors ignored; however, drivers will attempt to receive and handle network errors when possible. The driver's ability to detect network errors depends on the system's networking configuration.

Before the releases outlined in Default Write Concern Change (page 808), this was the default write concern.

Figure 3.19: Write operation to a mongod instance with write concern of unacknowledged. The client does not wait for any acknowledgment.

Acknowledged

With a receipt acknowledged write concern, the mongod confirms the receipt of the write operation. Acknowledged write concern allows clients to catch network, duplicate key, and other errors. MongoDB uses the acknowledged write concern by default starting in the driver releases outlined in Releases (page 808).

Changed in version 2.6: The mongo shell write methods now incorporate the write concern (page 72) in the write methods and provide the default write concern whether run interactively or in a script. See Write Method Acknowledgements (page 743) for details.

Journaled

With a journaled write concern, MongoDB acknowledges the write operation only after committing the data to the journal. This write concern ensures that MongoDB can recover the data following a shutdown or power interruption.

You must have journaling enabled to use this write concern. With a journaled write concern, write operations must wait for the next journal commit. To reduce latency for these operations, MongoDB also increases the frequency that it commits operations to the journal. See commitIntervalMs for more information.

Note: Requiring journaled write concern in a replica set only requires a journal commit of the write operation to the primary of the set regardless of the level of replica acknowledged write concern.
Figure 3.20: Write operation to a mongod instance with write concern of acknowledged. The client waits for acknowledgment of success or exception.

Figure 3.21: Write operation to a mongod instance with write concern of journaled. The mongod sends acknowledgment after it commits the write operation to the journal.
Replica Acknowledged

Replica sets present additional considerations with regard to write concern. The default write concern only requires acknowledgement from the primary. With replica acknowledged write concern, you can guarantee that the write operation propagates to additional members of the replica set. See Write Concern for Replica Sets (page 528) for more information.

Figure 3.22: Write operation to a replica set with write concern level of w:2 or write to the primary and at least one secondary.

Note: Requiring journaled write concern in a replica set only requires a journal commit of the write operation to the primary of the set regardless of the level of replica acknowledged write concern.

See also:

Write Concern Reference (page 118)
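The interaction between w and wtimeout described above can be illustrated with a small model. This is a sketch under stated assumptions, not driver behavior: ackDelaysMs holds hypothetical per-member replication delays for a single write (index 0 standing in for the primary), and the function decides whether a { w, wtimeout } write concern would have been satisfied in time.

```javascript
// Decide whether a write with { w: <n>, wtimeout: <ms> } write concern
// would be acknowledged in time, given hypothetical per-member ack delays.
function writeConcernResult(ackDelaysMs, w, wtimeoutMs) {
  // The write concern is satisfied once the w-th fastest member has
  // acknowledged the write.
  var sorted = ackDelaysMs.slice().sort(function (a, b) { return a - b; });
  if (w > sorted.length) {
    return { ok: false, err: "not enough members" };
  }
  var satisfiedAt = sorted[w - 1];
  if (wtimeoutMs && satisfiedAt > wtimeoutMs) {
    // The operation returns an error, but MongoDB does not roll back the
    // write: replication may still complete after the timeout.
    return { ok: false, err: "wtimeout", writePersisted: true };
  }
  return { ok: true, satisfiedAtMs: satisfiedAt };
}
```

For example, with delays of [0, 40, 900] ms, a w:2 write concern is satisfied after 40 ms; with delays of [0, 700, 900] ms and wtimeout: 500, the same write concern reports a wtimeout error even though the write itself is not undone.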
Distributed Write Operations

Write Operations on Sharded Clusters

For sharded collections in a sharded cluster, the mongos directs write operations from applications to the shards that are responsible for the specific portion of the data set. The mongos uses the cluster metadata from the config database (page 616) to route the write operation to the appropriate shards.

Figure 3.23: Diagram of a sharded cluster.

MongoDB partitions data in a sharded collection into ranges based on the values of the shard key. Then, MongoDB distributes these chunks to shards. The shard key determines the distribution of chunks to shards. This can affect the performance of write operations in the cluster.

Important: Update operations that affect a single document must include the shard key or the _id field. Updates that affect multiple documents are more efficient in some situations if they have the shard key, but can be broadcast to all shards.

If the value of the shard key increases or decreases with every insert, all insert operations target a single shard. As a result, the capacity of a single shard becomes the limit for the insert capacity of the sharded cluster.
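The range-based routing described above can be sketched as a lookup over chunk metadata. The chunk layout and field names here are illustrative, not the real config database schema: each chunk owns a half-open shard-key range [min, max) and lives on one shard.

```javascript
// Illustrative chunk metadata: half-open ranges over a numeric shard key.
var chunks = [
  { min: -Infinity, max: 100, shard: "shardA" },
  { min: 100, max: 200, shard: "shardB" },
  { min: 200, max: Infinity, shard: "shardC" }
];

// A mongos-like router targets the single chunk whose range contains the
// shard key value of the document being written.
function targetShard(chunks, shardKeyValue) {
  var chunk = chunks.find(function (c) {
    return shardKeyValue >= c.min && shardKeyValue < c.max;
  });
  return chunk.shard;
}
```

Note how a monotonically increasing shard key defeats the distribution: values 1000, 1001, 1002, ... all fall in the last chunk, so every insert targets shardC, which becomes the cluster's insert bottleneck.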
Figure 3.24: Diagram of the shard key value space segmented into smaller ranges or chunks.

For more information, see Sharded Cluster Tutorials (page 634) and Bulk Inserts in MongoDB (page 81).

Write Operations on Replica Sets

In replica sets, all write operations go to the set's primary, which applies the write operation then records the operations on the primary's operation log or oplog. The oplog is a reproducible sequence of operations to the data set. Secondary members of the set are continuously replicating the oplog and applying the operations to themselves in an asynchronous process.

Large volumes of write operations, particularly bulk operations, may create situations where the secondary members have difficulty applying the replicating operations from the primary at a sufficient rate: this can cause the secondary's state to fall behind that of the primary. Secondaries that are significantly behind the primary present problems for normal operation of the replica set, particularly failover (page 523) in the form of rollbacks (page 527) as well as general read consistency (page 528).

To help avoid this issue, you can customize the write concern (page 72) to return confirmation of the write operation to another member [4] of the replica set every 100 or 1,000 operations. This provides an opportunity for secondaries to catch up with the primary. Write concern can slow the overall progress of write operations but ensure that the secondaries can maintain a largely current state with respect to the primary.

For more information on replica sets and write operations, see Replica Acknowledged (page 75), Oplog Size (page 535), and Change the Size of the Oplog (page 570).

Write Operation Performance

Indexes

After every insert, update, or delete operation, MongoDB must update every index associated with the collection in addition to the data itself.
Therefore, every index on a collection adds some amount of overhead for the performance of write operations. [5]

[4] Intermittently issuing a write concern with a w value of 2 or majority will slow the throughput of write traffic; however, this practice will allow the secondaries to remain current with the state of the primary. Changed in version 2.6: In Master/Slave (page 538) deployments, MongoDB treats w: "majority" as equivalent to w: 1. In earlier versions of MongoDB, w: "majority" produces an error in master/slave (page 538) deployments.

[5] For inserts and updates to un-indexed fields, the overhead for sparse indexes (page 457) is less than for non-sparse indexes. Also for non-sparse indexes, updates that do not change the record size have less indexing overhead.
Figure 3.25: Diagram of default routing of reads and writes to the primary.
Figure 3.26: Write operation to a replica set with write concern level of w:2 or write to the primary and at least one secondary.
In general, the performance gains that indexes provide for read operations are worth the insertion penalty. However, in order to optimize write performance when possible, be careful when creating new indexes and evaluate the existing indexes to ensure that your queries actually use these indexes.

For indexes and queries, see Query Optimization (page 60). For more information on indexes, see Indexes (page 431) and Indexing Strategies (page 493).

Document Growth

If an update operation causes a document to exceed the currently allocated record size, MongoDB relocates the document on disk with enough contiguous space to hold the document. These relocations take longer than in-place updates, particularly if the collection has indexes. If a collection has indexes, MongoDB must update all index entries. Thus, for a collection with many indexes, the move will impact the write throughput.

Some update operations, such as the $inc operation, do not cause an increase in document size. For these update operations, MongoDB can apply the updates in-place. Other update operations, such as the $push operation, change the size of the document.

In-place updates are significantly more efficient than updates that cause document growth. When possible, use data models (page 133) that minimize the need for document growth.

See Storage (page 82) for more information.

Storage Performance

Hardware

The capability of the storage system creates some important physical limits for the performance of MongoDB's write operations. Many unique factors related to the storage system of the drive affect write performance, including random access patterns, disk caches, disk readahead and RAID configurations.

Solid state drives (SSDs) can outperform spinning hard disks (HDDs) by 100 times or more for random workloads.

See Production Notes (page 188) for recommendations regarding additional hardware and configuration options.
Journaling

MongoDB uses write ahead logging to an on-disk journal to guarantee write operation (page 67) durability and to provide crash resiliency. Before applying a change to the data files, MongoDB writes the change operation to the journal.

While the durability assurance provided by the journal typically outweighs the performance costs of the additional write operations, consider the following interactions between the journal and performance:

• if the journal and the data files reside on the same block device, the data files and the journal may have to contend for a finite number of available write operations. Moving the journal to a separate device may increase the capacity for write operations.
• if applications specify write concern (page 72) that includes journaled (page 73), mongod will decrease the duration between journal commits, which can increase the overall write load.
• the duration between journal commits is configurable using the commitIntervalMs run-time option. Decreasing the period between journal commits will increase the number of write operations, which can limit MongoDB's capacity for write operations. Increasing the amount of time between commits may decrease the total number of write operations, but also increases the chance that the journal will not record a write operation in the event of a failure.

For additional information on journaling, see Journaling Mechanics (page 275).
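The write-ahead idea above, record the change in the journal before touching the data files so that a crash between the two steps is recoverable, can be sketched with an in-memory model. This is a conceptual illustration, not MongoDB's journal format; the crashBeforeDataWrite flag simulates a failure between the journal write and the data-file write.

```javascript
// In-memory stand-ins for the journal and the data files.
function makeStore() {
  return { journal: [], data: {} };
}

// Write-ahead: the change reaches the journal before the data files.
function applyWrite(store, key, value, crashBeforeDataWrite) {
  store.journal.push({ key: key, value: value }); // 1. journal first
  if (crashBeforeDataWrite) return;               // simulated crash
  store.data[key] = value;                        // 2. then the data files
}

// Recovery replays the journal, reapplying operations that never
// reached the data files. Replaying already-applied operations is
// harmless because each entry just sets a key to a value.
function recover(store) {
  store.journal.forEach(function (op) { store.data[op.key] = op.value; });
}
```

Because every change is journaled first, a write that "crashed" before reaching the data files is restored by recover(), which is the property that lets mongod recover after exiting without flushing all changes.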
Bulk Inserts in MongoDB

In some situations you may need to insert or ingest a large amount of data into a MongoDB database. These bulk inserts have some special considerations that are different from other write operations.

Use the insert() Method

The insert() method, when passed an array of documents, performs a bulk insert, and inserts each document atomically. Bulk inserts can significantly increase performance by amortizing write concern (page 72) costs.

New in version 2.2: insert() in the mongo shell gained support for bulk inserts in version 2.2.

In the drivers, you can configure write concern for batches rather than on a per-document level. Drivers have a ContinueOnError option in their insert operation, so that the bulk operation will continue to insert remaining documents in a batch even if an insert fails.

Note: If multiple errors occur during a bulk insert, clients only receive the last error generated.

See also:

Driver documentation for details on performing bulk inserts in your application. Also see Import and Export MongoDB Data (page 186).

Bulk Inserts on Sharded Clusters

While ContinueOnError is optional on unsharded clusters, all bulk operations to a sharded collection run with ContinueOnError, which cannot be disabled.

Large bulk insert operations, including initial data inserts or routine data import, can affect sharded cluster performance. For bulk inserts, consider the following strategies:

Pre-Split the Collection

If the sharded collection is empty, then the collection has only one initial chunk, which resides on a single shard. MongoDB must then take time to receive data, create splits, and distribute the split chunks to the available shards. To avoid this performance cost, you can pre-split the collection, as described in Split Chunks in a Sharded Cluster (page 666).

Insert to Multiple mongos

To parallelize import processes, send insert operations to more than one mongos instance.
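The ContinueOnError semantics described above, keep inserting after a failure and surface only the last error, can be sketched with a small model. This is illustrative only: duplicate _id values are the sole failure mode modeled, and the result shape merely echoes the "last error" behavior, not any driver's actual API.

```javascript
// Bulk insert against an in-memory collection, with ContinueOnError:
// keep inserting remaining documents after a failure, and report only
// the last error generated.
function bulkInsert(collection, docs, continueOnError) {
  var seen = {}, lastError = null, nInserted = 0;
  collection.forEach(function (d) { seen[d._id] = true; });
  for (var i = 0; i < docs.length; i++) {
    if (seen[docs[i]._id]) {
      lastError = "duplicate key: " + docs[i]._id;
      if (!continueOnError) break; // without the option, stop at the first failure
      continue;                    // with it, skip the bad document and go on
    }
    seen[docs[i]._id] = true;
    collection.push(docs[i]);
    nInserted++;
  }
  return { nInserted: nInserted, lastError: lastError };
}
```

With ContinueOnError enabled, a batch containing two duplicates still inserts every valid document, but the caller sees only one error at the end, which is why per-document error reporting requires smaller batches or the 2.6 Bulk() API.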
Pre-split empty collections first as described in Split Chunks in a Sharded Cluster (page 666).

Avoid Monotonic Throttling

If your shard key increases monotonically during an insert, then all inserted data goes to the last chunk in the collection, which will always end up on a single shard. Therefore, the insert capacity of the cluster will never exceed the insert capacity of that single shard.

If your insert volume is larger than what a single shard can process, and if you cannot avoid a monotonically increasing shard key, then consider the following modifications to your application:

• Reverse the binary bits of the shard key. This preserves the information and avoids correlating insertion order with increasing sequence of values.
• Swap the first and last 16-bit words to "shuffle" the inserts.
Example

The following example, in C++, swaps the leading and trailing 16-bit words of BSON ObjectIds generated so that they are no longer monotonically increasing.

using namespace mongo;

OID make_an_id() {
  OID x = OID::gen();
  const unsigned char *p = x.getData();
  swap( (unsigned short&) p[0], (unsigned short&) p[10] );
  return x;
}

void foo() {
  // create an object
  BSONObj o = BSON( "_id" << make_an_id() << "x" << 3 << "name" << "jane" );
  // now we may insert o into a sharded collection
}

See also:

Shard Keys (page 620) for information on choosing a sharded key. Also see Shard Key Internals (page 620) (in particular, Choosing a Shard Key (page 639)).

Storage

Data Model

MongoDB stores data in the form of BSON documents, which are rich mappings of keys, or field names, to values. BSON supports a rich collection of types, and fields in BSON documents may hold arrays of values or embedded documents. All documents in MongoDB must be less than 16MB, which is the maximum BSON document size.

Every document in MongoDB is stored in a record which contains the document itself and extra space, or padding, which allows the document to grow as the result of updates.

All records are contiguously located on disk, and when a document becomes larger than the allocated record, MongoDB must allocate a new record. New allocations require MongoDB to move a document and update all indexes that refer to the document, which takes more time than in-place updates and leads to storage fragmentation.

All records are part of a collection, which is a logical grouping of documents in a MongoDB database. The documents in a collection share a set of indexes, and typically these documents share common fields and structure.

In MongoDB the database construct is a group of related collections. Each database has a distinct set of data files and can contain a large number of collections.
Also, each database has one distinct write lock that blocks operations to the database during write operations. A single MongoDB deployment may have many databases.

Journal

In order to ensure that all modifications to a MongoDB data set are durably written to disk, MongoDB records all modifications to a journal that it writes to disk more frequently than it writes the data files. The journal allows MongoDB to successfully recover data from data files after a mongod instance exits without flushing all changes.

See Journaling Mechanics (page 275) for more information about the journal in MongoDB.
Record Allocation Strategies

MongoDB supports multiple record allocation strategies that determine how mongod adds padding to a document when creating a record. Because documents in MongoDB may grow after insertion and all records are contiguous on disk, the padding can reduce the need to relocate documents on disk following updates. Relocations are less efficient than in-place updates, and can lead to storage fragmentation. As a result, all padding strategies trade additional space for increased efficiency and decreased fragmentation.

Different allocation strategies support different kinds of workloads: the power of 2 allocations (page 83) are more efficient for insert/update/delete workloads, while exact fit allocations (page 83) are ideal for collections without update and delete workloads.

Power of 2 Sized Allocations

Changed in version 2.6: For all new collections, usePowerOf2Sizes became the default allocation strategy. To change the default allocation strategy, use the newCollectionsUsePowerOf2Sizes parameter.

mongod uses an allocation strategy called usePowerOf2Sizes where each record has a size in bytes that is a power of 2 (e.g. 32, 64, 128, 256, 512 ... 16777216). The smallest allocation for a document is 32 bytes.

The power of 2 sizes allocation strategy has two key properties:

• there are a limited number of record allocation sizes, which makes it easier for mongod to reuse existing allocations, which will reduce fragmentation in some cases.
• in many cases, the record allocations are significantly larger than the documents they hold. This allows documents to grow while minimizing or eliminating the chance that the mongod will need to allocate a new record if the document grows.

The usePowerOf2Sizes strategy does not eliminate document reallocation as a result of document growth, but it minimizes its occurrence in many common operations.
Exact Fit Allocation

The exact fit allocation strategy allocates record sizes based on the size of the document and an additional padding factor. Each collection has its own padding factor, which defaults to 1 when you insert the first document in a collection. MongoDB dynamically adjusts the padding factor up to 2 depending on the rate of growth of the documents over the life of the collection.

To estimate total record size, compute the product of the padding factor and the size of the document. That is:

record size = paddingFactor * <document size>

The size of each record in a collection reflects the size of the padding factor at the time of allocation. See the paddingFactor field in the output of db.collection.stats() to see the current padding factor for a collection.

On average, this exact fit allocation strategy uses less storage space than the usePowerOf2Sizes strategy but will result in higher levels of storage fragmentation if documents grow beyond the size of their initial allocation.

The compact and repairDatabase operations remove padding by default, as do mongodump and mongorestore. compact does allow you to specify a padding for records during compaction.

Capped Collections

Capped collections are fixed-size collections that support high-throughput operations that store records in insertion order. Capped collections work like circular buffers: once a collection fills its allocated space, it makes room for new documents by overwriting the oldest documents in the collection.

See Capped Collections (page 196) for more information.
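The two record allocation strategies described above can be sketched as simple size calculations. This is a minimal model of the stated rules only (power-of-2 rounding with a 32-byte floor, and record size = paddingFactor * document size); the server's actual allocator handles many more details.

```javascript
// usePowerOf2Sizes: round the record up to the next power of 2,
// with a 32-byte minimum allocation.
function powerOf2RecordSize(documentSizeBytes) {
  var size = 32;
  while (size < documentSizeBytes) size *= 2;
  return size;
}

// Exact fit: record size is the document size scaled by the collection's
// padding factor, which starts at 1 and is adjusted up to 2 by the server.
function exactFitRecordSize(documentSizeBytes, paddingFactor) {
  return Math.ceil(paddingFactor * documentSizeBytes);
}
```

A 100-byte document gets a 128-byte record under usePowerOf2Sizes, leaving 28 bytes of headroom for growth, while exact fit with a padding factor of 1 allocates exactly 100 bytes, cheaper on disk but guaranteeing a relocation on any growth.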
3.3 MongoDB CRUD Tutorials

The following tutorials provide instructions for querying and modifying data. For a higher-level overview of these operations, see MongoDB CRUD Operations (page 51).

Insert Documents (page 84) Insert new documents into a collection.

Query Documents (page 87) Find documents in a collection using search criteria.

Limit Fields to Return from a Query (page 94) Limit which fields are returned by a query.

Iterate a Cursor in the mongo Shell (page 95) Access documents returned by a find query by iterating the cursor, either manually or using the iterator index.

Analyze Query Performance (page 97) Analyze the efficiency of queries and determine how a query uses available indexes.

Modify Documents (page 98) Modify documents in a collection.

Remove Documents (page 101) Remove documents from a collection.

Perform Two Phase Commits (page 102) Use two-phase commits when writing data to multiple documents.

Create Tailable Cursor (page 109) Create tailable cursors for use in capped collections with high numbers of write operations for which an index would be too expensive.

Isolate Sequence of Operations (page 111) Use the $isolated operator to isolate a single write operation that affects multiple documents, preventing other operations from interrupting the sequence of write operations.

Create an Auto-Incrementing Sequence Field (page 113) Describes how to create an incrementing sequence number for the _id field using a Counters Collection or an Optimistic Loop.

Limit Number of Elements in an Array after an Update (page 116) Use $push with various modifiers to sort and maintain an array of fixed size after an update.

3.3.1 Insert Documents

In MongoDB, the db.collection.insert() method adds new documents into a collection.

Insert a Document

Step 1: Insert a document into a collection.

Insert a document into a collection named inventory.
The operation will create the collection if the collection does not currently exist.

db.inventory.insert(
   {
     item: "ABC1",
     details: {
       model: "14Q3",
       manufacturer: "XYZ Company"
     },
     stock: [ { size: "S", qty: 25 }, { size: "M", qty: 50 } ],
     category: "clothing"
   }
)
The operation returns a WriteResult object with the status of the operation. A successful insert of the document returns the following object:

WriteResult({ "nInserted" : 1 })

The nInserted field specifies the number of documents inserted. If the operation encounters an error, the WriteResult object will contain the error information.

Step 2: Review the inserted document.

If the insert operation is successful, verify the insertion by querying the collection.

db.inventory.find()

The document you inserted should return.

{ "_id" : ObjectId("53d98f133bb604791249ca99"), "item" : "ABC1", "details" : { "model" : "14Q3", "manufacturer" : "XYZ Company" }, "stock" : [ { "size" : "S", "qty" : 25 }, { "size" : "M", "qty" : 50 } ], "category" : "clothing" }

The returned document shows that MongoDB added an _id field to the document. If a client inserts a document that does not contain the _id field, MongoDB adds the field with the value set to a generated ObjectId6. The ObjectId7 values in your documents will differ from the ones shown.

Insert an Array of Documents

You can pass an array of documents to the db.collection.insert() method to insert multiple documents.

Step 1: Create an array of documents.

Define a variable mydocuments that holds an array of documents to insert.

var mydocuments = [
  {
    item: "ABC2",
    details: { model: "14Q3", manufacturer: "M1 Corporation" },
    stock: [ { size: "M", qty: 50 } ],
    category: "clothing"
  },
  {
    item: "MNO2",
    details: { model: "14Q3", manufacturer: "ABC Company" },
    stock: [ { size: "S", qty: 5 }, { size: "M", qty: 5 }, { size: "L", qty: 1 } ],
    category: "clothing"
  },
  {
    item: "IJK2",
    details: { model: "14Q2", manufacturer: "M5 Corporation" },
    stock: [ { size: "S", qty: 5 }, { size: "L", qty: 1 } ],
    category: "houseware"
  }
];

6 http://docs.mongodb.org/manual/reference/object-id
7 http://docs.mongodb.org/manual/reference/object-id
Step 2: Insert the documents.

Pass the mydocuments array to the db.collection.insert() method to perform a bulk insert.

db.inventory.insert( mydocuments );

The method returns a BulkWriteResult object with the status of the operation. A successful insert of the documents returns the following object:

BulkWriteResult({
   "writeErrors" : [ ],
   "writeConcernErrors" : [ ],
   "nInserted" : 3,
   "nUpserted" : 0,
   "nMatched" : 0,
   "nModified" : 0,
   "nRemoved" : 0,
   "upserted" : [ ]
})

The nInserted field specifies the number of documents inserted. If the operation encounters an error, the BulkWriteResult object will contain information regarding the error. The inserted documents will each have an _id field added by MongoDB.

Insert Multiple Documents with Bulk

New in version 2.6.

MongoDB provides a Bulk() API that you can use to perform multiple write operations in bulk. The following sequence of operations describes how you would use the Bulk() API to insert a group of documents into a MongoDB collection.

Step 1: Initialize a Bulk operations builder.

Initialize a Bulk operations builder for the collection inventory.

var bulk = db.inventory.initializeUnorderedBulkOp();

The operation returns an unordered operations builder, which maintains a list of operations to perform. Unordered operations mean that MongoDB can execute them in parallel as well as in nondeterministic order. If an error occurs during the processing of one of the write operations, MongoDB will continue to process remaining write operations in the list. You can also initialize an ordered operations builder; see db.collection.initializeOrderedBulkOp() for details.

Step 2: Add insert operations to the bulk object.

Add two insert operations to the bulk object using the Bulk.insert() method.

bulk.insert( { item: "BE10", details: { model: "14Q2", manufacturer: "XYZ Company" }, stock: [ { size: "L", qty: 5 } ],
category: "clothing" } );

bulk.insert( { item: "ZYT1", details: { model: "14Q1", manufacturer: "ABC Company" }, stock: [ { size: "S", qty: 5 }, { size: "M", qty: 5 } ], category: "houseware" } );

Step 3: Execute the bulk operation.

Call the execute() method on the bulk object to execute the operations in its list.

bulk.execute();

The method returns a BulkWriteResult object with the status of the operation. A successful insert of the documents returns the following object:

BulkWriteResult({
   "writeErrors" : [ ],
   "writeConcernErrors" : [ ],
   "nInserted" : 2,
   "nUpserted" : 0,
   "nMatched" : 0,
   "nModified" : 0,
   "nRemoved" : 0,
   "upserted" : [ ]
})

The nInserted field specifies the number of documents inserted. If the operation encounters an error, the BulkWriteResult object will contain information regarding the error.

Additional Examples and Methods

For more examples, see db.collection.insert(). The db.collection.update() method, the db.collection.findAndModify() method, and the db.collection.save() method can also add new documents. See the individual reference pages for the methods for more information and examples.

3.3.2 Query Documents

In MongoDB, the db.collection.find() method retrieves documents from a collection. 8 The db.collection.find() method returns a cursor (page 59) to the retrieved documents.

This tutorial provides examples of read operations using the db.collection.find() method in the mongo shell. In these examples, the retrieved documents contain all their fields. To restrict the fields to return in the retrieved documents, see Limit Fields to Return from a Query (page 94).

8 The db.collection.findOne() method also performs a read operation to return a single document. Internally, the db.collection.findOne() method is the db.collection.find() method with a limit of 1.
Select All Documents in a Collection

An empty query document ({}) selects all documents in the collection:

db.inventory.find( {} )

Not specifying a query document to find() is equivalent to specifying an empty query document. Therefore the following operation is equivalent to the previous operation:

db.inventory.find()

Specify Equality Condition

To specify an equality condition, use the query document { <field>: <value> } to select all documents that contain the <field> with the specified <value>. The following example retrieves from the inventory collection all documents where the type field has the value snacks:

db.inventory.find( { type: "snacks" } )

Specify Conditions Using Query Operators

A query document can use the query operators to specify conditions in a MongoDB query. The following example selects all documents in the inventory collection where the value of the type field is either 'food' or 'snacks':

db.inventory.find( { type: { $in: [ 'food', 'snacks' ] } } )

Although you can express this query using the $or operator, use the $in operator rather than the $or operator when performing equality checks on the same field.

Refer to http://docs.mongodb.org/manual/reference/operator for the complete list of query operators.

Specify AND Conditions

A compound query can specify conditions for more than one field in the collection's documents. Implicitly, a logical AND conjunction connects the clauses of a compound query so that the query selects the documents in the collection that match all the conditions.

In the following example, the query document specifies an equality match on the field type and a less than ($lt) comparison match on the field price:

db.inventory.find( { type: 'food', price: { $lt: 9.95 } } )

This query selects all documents where the type field has the value 'food' and the value of the price field is less than 9.95.
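These selection rules can be approximated outside the database. The following plain-JavaScript sketch (an illustration written for this tutorial, not the mongo shell or a MongoDB API) shows the implicit-AND semantics and a minimal reading of the $in and $lt operators:

```javascript
// Plain-JavaScript sketch (not the mongo shell): every top-level clause in
// the query document must match (implicit AND); $in matches any listed
// value; $lt is a less-than comparison; anything else is an equality check.
function matches(doc, query) {
  return Object.keys(query).every(function (field) {
    var cond = query[field];
    if (cond !== null && typeof cond === "object" && "$in" in cond) {
      return cond.$in.indexOf(doc[field]) !== -1;
    }
    if (cond !== null && typeof cond === "object" && "$lt" in cond) {
      return doc[field] < cond.$lt;
    }
    return doc[field] === cond; // equality condition
  });
}

// Hypothetical sample data for illustration only.
var inventory = [
  { type: "food",   price: 9.50  },
  { type: "snacks", price: 12.00 },
  { type: "paper",  price: 1.25  }
];

// { type: { $in: [ 'food', 'snacks' ] } } matches the first two documents.
console.log(inventory.filter(function (d) {
  return matches(d, { type: { $in: ["food", "snacks"] } });
}).length); // 2

// { type: 'food', price: { $lt: 9.95 } } -- both clauses must hold.
console.log(inventory.filter(function (d) {
  return matches(d, { type: "food", price: { $lt: 9.95 } });
}).length); // 1
```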
See the comparison query operators for other comparison operators.

Specify OR Conditions

Using the $or operator, you can specify a compound query that joins each clause with a logical OR conjunction so that the query selects the documents in the collection that match at least one condition.
In the following example, the query document selects all documents in the collection where the field qty has a value greater than ($gt) 100 or the value of the price field is less than ($lt) 9.95:

db.inventory.find(
   {
     $or: [ { qty: { $gt: 100 } }, { price: { $lt: 9.95 } } ]
   }
)

Specify AND as well as OR Conditions

With additional clauses, you can specify precise conditions for matching documents. In the following example, the compound query document selects all documents in the collection where the value of the type field is 'food' and either the qty has a value greater than ($gt) 100 or the value of the price field is less than ($lt) 9.95:

db.inventory.find(
   {
     type: 'food',
     $or: [ { qty: { $gt: 100 } }, { price: { $lt: 9.95 } } ]
   }
)

Embedded Documents

When the field holds an embedded document, a query can either specify an exact match on the embedded document or specify a match by individual fields in the embedded document using the dot notation.

Exact Match on the Embedded Document

To specify an equality match on the whole embedded document, use the query document { <field>: <value> } where <value> is the document to match. Equality matches on an embedded document require an exact match of the specified <value>, including the field order.

In the following example, the query matches all documents where the value of the field producer is an embedded document that contains only the field company with the value 'ABC123' and the field address with the value '123 Street', in the exact order:

db.inventory.find(
   {
     producer: { company: 'ABC123', address: '123 Street' }
   }
)

Equality Match on Fields within an Embedded Document

Use the dot notation to match by specific fields in an embedded document. Equality matches for specific fields in an embedded document will select documents in the collection where the embedded document contains the specified fields with the specified values.
The embedded document can contain additional fields.
In the following example, the query uses the dot notation to match all documents where the value of the field producer is an embedded document that contains a field company with the value 'ABC123' and may contain other fields:

db.inventory.find( { 'producer.company': 'ABC123' } )

Arrays

When the field holds an array, you can query for an exact array match or for specific values in the array. If the array holds embedded documents, you can query for specific fields in the embedded documents using dot notation.

If you specify multiple conditions using the $elemMatch operator, the array must contain at least one element that satisfies all the conditions. See Single Element Satisfies the Criteria (page 91).

If you specify multiple conditions without using the $elemMatch operator, then some combination of the array elements, not necessarily a single element, must satisfy all the conditions; i.e. different elements in the array can satisfy different parts of the conditions. See Combination of Elements Satisfies the Criteria (page 91).

Consider an inventory collection that contains the following documents:

{ _id: 5, type: "food", item: "aaa", ratings: [ 5, 8, 9 ] }
{ _id: 6, type: "food", item: "bbb", ratings: [ 5, 9 ] }
{ _id: 7, type: "food", item: "ccc", ratings: [ 9, 5, 8 ] }

Exact Match on an Array

To specify an equality match on an array, use the query document { <field>: <value> } where <value> is the array to match. Equality matches on the array require that the array field match exactly the specified <value>, including the element order.
The following example queries for all documents where the field ratings is an array that holds exactly three elements, 5, 8, and 9, in this order:

db.inventory.find( { ratings: [ 5, 8, 9 ] } )

The operation returns the following document:

{ "_id" : 5, "type" : "food", "item" : "aaa", "ratings" : [ 5, 8, 9 ] }

Match an Array Element

Equality matches can specify a single element in the array to match. These specifications match if the array contains at least one element with the specified value. The following example queries for all documents where ratings is an array that contains 5 as one of its elements:

db.inventory.find( { ratings: 5 } )

The operation returns the following documents:

{ "_id" : 5, "type" : "food", "item" : "aaa", "ratings" : [ 5, 8, 9 ] }
{ "_id" : 6, "type" : "food", "item" : "bbb", "ratings" : [ 5, 9 ] }
{ "_id" : 7, "type" : "food", "item" : "ccc", "ratings" : [ 9, 5, 8 ] }
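The two array-matching rules above can be contrasted in a short plain-JavaScript sketch (written for illustration, not the mongo shell): an array value in the query demands an exact, order-sensitive match, while a scalar value matches any document whose array contains that element.

```javascript
// Sketch (plain JavaScript, not the mongo shell) of the array-matching rules:
// an array condition is an exact, order-sensitive match; a scalar condition
// matches if the array contains that element.
function matchesRatings(doc, condition) {
  if (Array.isArray(condition)) {
    // exact match: same length, same elements, same order
    return doc.ratings.length === condition.length &&
           condition.every(function (v, i) { return doc.ratings[i] === v; });
  }
  return doc.ratings.indexOf(condition) !== -1; // array contains the element
}

var docs = [
  { _id: 5, ratings: [5, 8, 9] },
  { _id: 6, ratings: [5, 9] },
  { _id: 7, ratings: [9, 5, 8] }
];

// { ratings: [ 5, 8, 9 ] } matches only _id 5 -- order matters, so _id 7 fails.
console.log(docs.filter(function (d) { return matchesRatings(d, [5, 8, 9]); })
                .map(function (d) { return d._id; })); // [ 5 ]

// { ratings: 5 } matches every document that contains a 5.
console.log(docs.filter(function (d) { return matchesRatings(d, 5); })
                .map(function (d) { return d._id; })); // [ 5, 6, 7 ]
```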
Match a Specific Element of an Array

A query can specify an equality match for an element at a particular index or position of the array using the dot notation. In the following example, the query uses the dot notation to match all documents where the ratings array contains 5 as the first element:

db.inventory.find( { 'ratings.0': 5 } )

The operation returns the following documents:

{ "_id" : 5, "type" : "food", "item" : "aaa", "ratings" : [ 5, 8, 9 ] }
{ "_id" : 6, "type" : "food", "item" : "bbb", "ratings" : [ 5, 9 ] }

Specify Multiple Criteria for Array Elements

Single Element Satisfies the Criteria

Use the $elemMatch operator to specify multiple criteria on the elements of an array such that at least one array element satisfies all the specified criteria. The following example queries for documents where the ratings array contains at least one element that is greater than ($gt) 5 and less than ($lt) 9:

db.inventory.find( { ratings: { $elemMatch: { $gt: 5, $lt: 9 } } } )

The operation returns the following documents, whose ratings array contains the element 8, which meets the criteria:

{ "_id" : 5, "type" : "food", "item" : "aaa", "ratings" : [ 5, 8, 9 ] }
{ "_id" : 7, "type" : "food", "item" : "ccc", "ratings" : [ 9, 5, 8 ] }

Combination of Elements Satisfies the Criteria

The following example queries for documents where the ratings array contains elements that in some combination satisfy the query conditions; e.g., one element can satisfy the greater than 5 condition and another element can satisfy the less than 9 condition, or a single element can satisfy both:

db.inventory.find( { ratings: { $gt: 5, $lt: 9 } } )

The operation returns the following documents:

{ "_id" : 5, "type" : "food", "item" : "aaa", "ratings" : [ 5, 8, 9 ] }
{ "_id" : 6, "type" : "food", "item" : "bbb", "ratings" : [ 5, 9 ] }
{ "_id" : 7, "type" : "food", "item" : "ccc", "ratings" : [ 9, 5, 8 ] }

The document with the "ratings" : [ 5, 9 ]
matches the query since the element 9 is greater than 5 (the first condition) and the element 5 is less than 9 (the second condition).

Array of Embedded Documents

Consider that the inventory collection includes the following documents:

{ _id: 100, type: "food", item: "xyz", qty: 25,
price: 2.5, ratings: [ 5, 8, 9 ], memos: [ { memo: "on time", by: "shipping" }, { memo: "approved", by: "billing" } ] }

{ _id: 101, type: "fruit", item: "jkl", qty: 10, price: 4.25, ratings: [ 5, 9 ], memos: [ { memo: "on time", by: "payment" }, { memo: "delayed", by: "shipping" } ] }

Match a Field in the Embedded Document Using the Array Index

If you know the array index of the embedded document, you can specify the document using the subdocument's position using the dot notation. The following example selects all documents where the memos field contains an array whose first element (i.e. index 0) is a document that contains the field by whose value is 'shipping':

db.inventory.find( { 'memos.0.by': 'shipping' } )

The operation returns the following document:

{ _id: 100, type: "food", item: "xyz", qty: 25, price: 2.5, ratings: [ 5, 8, 9 ], memos: [ { memo: "on time", by: "shipping" }, { memo: "approved", by: "billing" } ] }

Match a Field Without Specifying Array Index

If you do not know the index position of the document in the array, concatenate the name of the field that contains the array with a dot (.) and the name of the field in the subdocument. The following example selects all documents where the memos field contains an array that contains at least one embedded document that contains the field by with the value 'shipping':

db.inventory.find( { 'memos.by': 'shipping' } )

The operation returns the following documents:

{ _id: 100, type: "food", item: "xyz", qty: 25, price: 2.5, ratings: [ 5, 8, 9 ], memos: [ { memo: "on time", by: "shipping" }, { memo: "approved", by: "billing" } ] }

{ _id: 101, type: "fruit",
item: "jkl", qty: 10, price: 4.25, ratings: [ 5, 9 ], memos: [ { memo: "on time", by: "payment" }, { memo: "delayed", by: "shipping" } ] }

Specify Multiple Criteria for Array of Documents

Single Element Satisfies the Criteria

Use the $elemMatch operator to specify multiple criteria on an array of embedded documents such that at least one embedded document satisfies all the specified criteria. The following example queries for documents where the memos array has at least one embedded document that contains both the field memo equal to 'on time' and the field by equal to 'shipping':

db.inventory.find(
   {
     memos: { $elemMatch: { memo: 'on time', by: 'shipping' } }
   }
)

The operation returns the following document:

{ _id: 100, type: "food", item: "xyz", qty: 25, price: 2.5, ratings: [ 5, 8, 9 ], memos: [ { memo: "on time", by: "shipping" }, { memo: "approved", by: "billing" } ] }

Combination of Elements Satisfies the Criteria

The following example queries for documents where the memos array contains elements that in some combination satisfy the query conditions; e.g. one element satisfies the field memo equal to 'on time' condition and another element satisfies the field by equal to 'shipping' condition, or a single element can satisfy both criteria:

db.inventory.find( { 'memos.memo': 'on time', 'memos.by': 'shipping' } )

The query returns the following documents:

{ _id: 100,
type: "food", item: "xyz", qty: 25, price: 2.5, ratings: [ 5, 8, 9 ], memos: [ { memo: "on time", by: "shipping" }, { memo: "approved", by: "billing" } ] }

{ _id: 101, type: "fruit", item: "jkl", qty: 10, price: 4.25, ratings: [ 5, 9 ], memos: [ { memo: "on time", by: "payment" }, { memo: "delayed", by: "shipping" } ] }

3.3.3 Limit Fields to Return from a Query

The projection document limits the fields to return for all matching documents. The projection document can specify the inclusion of fields or the exclusion of fields. The specifications have the following forms:

Syntax                    Description
<field>: <1 or true>      Specify the inclusion of a field.
<field>: <0 or false>     Specify the suppression of the field.

Important: The _id field is, by default, included in the result set. To suppress the _id field from the result set, specify _id: 0 in the projection document.

You cannot combine inclusion and exclusion semantics in a single projection, with the exception of the _id field.

This tutorial offers various query examples that limit the fields to return for all matching documents. The examples in this tutorial use a collection inventory and use the db.collection.find() method in the mongo shell. The db.collection.find() method returns a cursor (page 59) to the retrieved documents. For examples on query selection criteria, see Query Documents (page 87).

Return All Fields in Matching Documents

If you specify no projection, the find() method returns all fields of all documents that match the query.

db.inventory.find( { type: 'food' } )

This operation will return all documents in the inventory collection where the value of the type field is 'food'. The returned documents contain all their fields.

Return the Specified Fields and the _id Field Only

A projection can explicitly include several fields. In the following operation, the find() method returns all documents that match the query.
In the result set, only the item and qty fields and, by default, the _id field return in the matching documents.
db.inventory.find( { type: 'food' }, { item: 1, qty: 1 } )

Return Specified Fields Only

You can remove the _id field from the results by specifying its exclusion in the projection, as in the following example:

db.inventory.find( { type: 'food' }, { item: 1, qty: 1, _id: 0 } )

This operation returns all documents that match the query. In the result set, only the item and qty fields return in the matching documents.

Return All But the Excluded Field

To exclude a single field or group of fields, you can use a projection in the following form:

db.inventory.find( { type: 'food' }, { type: 0 } )

This operation returns all documents where the value of the type field is food. In the result set, the type field does not return in the matching documents. With the exception of the _id field, you cannot combine inclusion and exclusion statements in projection documents.

Projection for Array Fields

For fields that contain arrays, MongoDB provides the following projection operators: $elemMatch, $slice, and $. For example, the inventory collection contains the following document:

{ "_id" : 5, "type" : "food", "item" : "aaa", "ratings" : [ 5, 8, 9 ] }

Then the following operation uses the $slice projection operator to return just the first two elements in the ratings array.

db.inventory.find( { _id: 5 }, { ratings: { $slice: 2 } } )

$elemMatch, $slice, and $ are the only ways to project portions of an array. For instance, you cannot project a portion of an array using the array index; e.g., the { "ratings.0": 1 } projection will not project the array with the first element.

3.3.4 Iterate a Cursor in the mongo Shell

The db.collection.find() method returns a cursor. To access the documents, you need to iterate the cursor. However, in the mongo shell, if the returned cursor is not assigned to a variable using the var keyword, then the cursor is automatically iterated up to 20 times to print up to the first 20 documents in the results.
The following describes ways to manually iterate the cursor to access the documents or to use the iterator index. Manually Iterate the Cursor In the mongo shell, when you assign the cursor returned from the find() method to a variable using the var keyword, the cursor does not automatically iterate. 3.3. MongoDB CRUD Tutorials 95
You can call the cursor variable in the shell to iterate up to 20 times 9 and print the matching documents, as in the following example:

var myCursor = db.inventory.find( { type: 'food' } );
myCursor

You can also use the cursor method next() to access the documents, as in the following example:

var myCursor = db.inventory.find( { type: 'food' } );
while (myCursor.hasNext()) {
   print(tojson(myCursor.next()));
}

As an alternative print operation, consider the printjson() helper method to replace print(tojson()):

var myCursor = db.inventory.find( { type: 'food' } );
while (myCursor.hasNext()) {
   printjson(myCursor.next());
}

You can use the cursor method forEach() to iterate the cursor and access the documents, as in the following example:

var myCursor = db.inventory.find( { type: 'food' } );
myCursor.forEach(printjson);

See JavaScript cursor methods and your driver documentation for more information on cursor methods.

Iterator Index

In the mongo shell, you can use the toArray() method to iterate the cursor and return the documents in an array, as in the following:

var myCursor = db.inventory.find( { type: 'food' } );
var documentArray = myCursor.toArray();
var myDocument = documentArray[3];

The toArray() method loads into RAM all documents returned by the cursor; the toArray() method exhausts the cursor.

Additionally, some drivers provide access to the documents by using an index on the cursor (i.e. cursor[index]). This is a shortcut for first calling the toArray() method and then using an index on the resulting array. Consider the following example:

var myCursor = db.inventory.find( { type: 'food' } );
var myDocument = myCursor[3];

The myCursor[3] is equivalent to the following example:

myCursor.toArray() [3];

9 You can use DBQuery.shellBatchSize to change the number of iterations from the default value 20. See Executing Queries (page 256) for more information.
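The access patterns above (hasNext()/next(), forEach(), and the exhausting behavior of toArray()) can be mimicked with a minimal cursor-like object in plain JavaScript. This is a sketch written for this tutorial, not the real mongo shell cursor:

```javascript
// Minimal cursor-like object (a sketch, not the real mongo shell cursor)
// illustrating the hasNext()/next(), forEach(), and toArray() patterns.
function makeCursor(documents) {
  var i = 0;
  return {
    hasNext: function () { return i < documents.length; },
    next: function () {
      if (i >= documents.length) throw new Error("cursor exhausted");
      return documents[i++];
    },
    forEach: function (fn) { while (this.hasNext()) fn(this.next()); },
    toArray: function () {
      var out = [];
      this.forEach(function (d) { out.push(d); }); // exhausts the cursor
      return out;
    }
  };
}

// Manual iteration, as in the while loop above:
var myCursor = makeCursor([{ item: "ABC1" }, { item: "MNO2" }]);
while (myCursor.hasNext()) {
  console.log(JSON.stringify(myCursor.next()));
}

// toArray() loads every remaining document and exhausts the cursor:
var other = makeCursor([{ item: "ABC1" }, { item: "MNO2" }]);
console.log(other.toArray().length); // 2
console.log(other.hasNext());        // false
```

Note how, exactly as with the real toArray(), the second cursor has nothing left to iterate once the array has been materialized.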
3.3.5 Analyze Query Performance

The explain() cursor method allows you to inspect the operation of the query system. This method is useful for analyzing the efficiency of queries and for determining how the query uses the index. The explain() method tests the query operation, not the timing of query performance. Because explain() attempts multiple query plans, it does not reflect an accurate timing of query performance.

Evaluate the Performance of a Query

To use the explain() method, call the method on a cursor returned by find().

Example

Evaluate a query on the type field on the collection inventory that has an index on the type field.

db.inventory.find( { type: 'food' } ).explain()

Consider the results:

{
  "cursor" : "BtreeCursor type_1",
  "isMultiKey" : false,
  "n" : 5,
  "nscannedObjects" : 5,
  "nscanned" : 5,
  "nscannedObjectsAllPlans" : 5,
  "nscannedAllPlans" : 5,
  "scanAndOrder" : false,
  "indexOnly" : false,
  "nYields" : 0,
  "nChunkSkips" : 0,
  "millis" : 0,
  "indexBounds" : { "type" : [ [ "food", "food" ] ] },
  "server" : "mongodbo0.example.net:27017"
}

The BtreeCursor value of the cursor field indicates that the query used an index. This query returned 5 documents, as indicated by the n field. To return these 5 documents, the query scanned 5 documents from the index, as indicated by the nscanned field, and then read 5 full documents from the collection, as indicated by the nscannedObjects field. Without the index, the query would have scanned the whole collection to return the 5 documents.

See explain-results for full details on the output.

Compare Performance of Indexes

To manually compare the performance of a query using more than one index, you can use the hint() and explain() methods in conjunction.

Example

Evaluate a query using different indexes:
db.inventory.find( { type: 'food' } ).hint( { type: 1 } ).explain()
db.inventory.find( { type: 'food' } ).hint( { type: 1, name: 1 } ).explain()

These return the statistics regarding the execution of the query using the respective index.

Note: If you run explain() without including hint(), the query optimizer reevaluates the query and runs against multiple indexes before returning the query statistics.

For more detail on the explain output, see explain-results.

3.3.6 Modify Documents

MongoDB provides the update() method to update the documents of a collection. The method accepts as its parameters:

• an update conditions document to match the documents to update,
• an update operations document to specify the modification to perform, and
• an options document.

To specify the update condition, use the same structure and syntax as the query conditions. By default, update() updates a single document. To update multiple documents, use the multi option.

Update Specific Fields in a Document

To change a field value, MongoDB provides update operators10, such as $set, to modify values. Some update operators, such as $set, will create the field if the field does not exist. See the individual update operator11 reference.

Step 1: Use update operators to change field values.

For the document with item equal to "MNO2", use the $set operator to update the category field and the details field to the specified values, and the $currentDate operator to update the field lastModified with the current date.

db.inventory.update(
   { item: "MNO2" },
   {
     $set: {
       category: "apparel",
       details: { model: "14Q3", manufacturer: "XYZ Company" }
     },
     $currentDate: { lastModified: true }
   }
)

The update operation returns a WriteResult object which contains the status of the operation.
A successful update of the document returns the following object:

10 http://docs.mongodb.org/manual/reference/operator/update
11 http://docs.mongodb.org/manual/reference/operator/update
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

The nMatched field specifies the number of existing documents matched for the update, and nModified specifies the number of existing documents modified.

Step 2: Update an embedded field.

To update a field within an embedded document, use the dot notation. When using the dot notation, enclose the whole dotted field name in quotes. The following updates the model field within the embedded details document.

db.inventory.update(
   { item: "ABC1" },
   { $set: { "details.model": "14Q2" } }
)

The update operation returns a WriteResult object which contains the status of the operation. A successful update of the document returns the following object:

WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

Step 3: Update multiple documents.

By default, the update() method updates a single document. To update multiple documents, use the multi option in the update() method.

Update the category field to "apparel" and update the lastModified field to the current date for all documents that have a category field equal to "clothing".

db.inventory.update(
   { category: "clothing" },
   {
     $set: { category: "apparel" },
     $currentDate: { lastModified: true }
   },
   { multi: true }
)

The update operation returns a WriteResult object which contains the status of the operation. A successful update of the document returns the following object:

WriteResult({ "nMatched" : 3, "nUpserted" : 0, "nModified" : 3 })

Replace the Document

To replace the entire content of a document except for the _id field, pass an entirely new document as the second argument to update(). The replacement document can have different fields from the original document. In the replacement document, you can omit the _id field since the _id field is immutable. If you do include the _id field, it must be the same value as the existing value.
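The $set, dot-notation, and multi semantics shown in the steps above can be approximated in a few lines of plain JavaScript. This is a simplified sketch for illustration (it handles only top-level equality conditions and $set), not the server's update logic:

```javascript
// Sketch (plain JavaScript, not the MongoDB server) of $set with dot
// notation, and of the multi option: by default only the first matching
// document is modified; with multi, every match is.
function applySet(doc, setSpec) {
  Object.keys(setSpec).forEach(function (path) {
    var parts = path.split(".");
    var target = doc;
    for (var i = 0; i < parts.length - 1; i++) {
      // $set creates intermediate embedded documents if they do not exist
      if (typeof target[parts[i]] !== "object") target[parts[i]] = {};
      target = target[parts[i]];
    }
    target[parts[parts.length - 1]] = setSpec[path];
  });
}

function update(collection, query, setSpec, multi) {
  var nModified = 0;
  var matchesQuery = function (doc) {
    return Object.keys(query).every(function (k) { return doc[k] === query[k]; });
  };
  for (var i = 0; i < collection.length; i++) {
    if (matchesQuery(collection[i])) {
      applySet(collection[i], setSpec);
      nModified++;
      if (!multi) break; // default: stop after the first match
    }
  }
  return { nModified: nModified };
}

var inventory = [
  { item: "ABC1", category: "clothing", details: { model: "14Q3" } },
  { item: "MNO2", category: "clothing", details: { model: "14Q3" } }
];

// Dot notation reaches into the embedded details document.
applySet(inventory[0], { "details.model": "14Q2" });
console.log(inventory[0].details.model); // 14Q2

// With multi: true, every matching document is modified.
console.log(update(inventory, { category: "clothing" },
                   { category: "apparel" }, true)); // { nModified: 2 }
```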
Step 1: Replace a document.

The following operation replaces the document with item equal to "BE10". The newly replaced document will only contain the _id field and the fields in the replacement document.

db.inventory.update(
   { item: "BE10" },
   {
     item: "BE05",
     stock: [ { size: "S", qty: 20 }, { size: "M", qty: 5 } ],
     category: "apparel"
   }
)

The update operation returns a WriteResult object which contains the status of the operation. A successful update of the document returns the following object:

WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

upsert Option

By default, if no document matches the update query, the update() method does nothing. However, by specifying upsert: true, the update() method either updates the matching document or documents, or inserts a new document using the update specification if no matching document exists.

Step 1: Specify upsert: true for the update replacement operation.

When you specify upsert: true for an update operation to replace a document and no matching documents are found, MongoDB creates a new document using the equality conditions in the update conditions document, and replaces this document, except for the _id field if specified, with the update document. The following operation either updates a matching document by replacing it with a new document or adds a new document if no matching document exists.

db.inventory.update(
   { item: "TBD1" },
   {
     item: "TBD1",
     details: { "model" : "14Q4", "manufacturer" : "ABC Company" },
     stock: [ { "size" : "S", "qty" : 25 } ],
     category: "houseware"
   },
   { upsert: true }
)

The update operation returns a WriteResult object which contains the status of the operation, including whether the db.collection.update() method modified an existing document or added a new document.

WriteResult({
   "nMatched" : 0,
   "nUpserted" : 1,
   "nModified" : 0,
   "_id" : ObjectId("53dbd684babeaec6342ed6c7")
})
The nMatched field shows that the operation matched 0 documents. The nUpserted of 1 shows that the update added a document. The nModified of 0 specifies that no existing documents were updated. The _id field shows the generated _id field for the added document.

Step 2: Specify an upsert: true for the update specific fields operation.

When you specify an upsert: true for an update operation that modifies specific fields and no matching documents are found, MongoDB creates a new document using the equality conditions in the update conditions document, and applies the modification as specified in the update document.

The following update operation either updates specific fields of a matching document or adds a new document if no matching document exists.

db.inventory.update(
   { item: "TBD2" },
   {
     $set: {
       details: { "model" : "14Q3", "manufacturer" : "IJK Co." },
       category: "houseware"
     }
   },
   { upsert: true }
)

The update operation returns a WriteResult object which contains the status of the operation, including whether the db.collection.update() method modified an existing document or added a new document.

WriteResult({
   "nMatched" : 0,
   "nUpserted" : 1,
   "nModified" : 0,
   "_id" : ObjectId("53dbd7c8babeaec6342ed6c8")
})

The nMatched field shows that the operation matched 0 documents. The nUpserted of 1 shows that the update added a document. The nModified of 0 specifies that no existing documents were updated. The _id field shows the generated _id field for the added document.

Additional Examples and Methods

For more examples, see Update examples in the db.collection.update() reference page. The db.collection.findAndModify() and db.collection.save() methods can also modify existing documents or insert a new one. See the individual reference pages for the methods for more information and examples.

3.3.7 Remove Documents

In MongoDB, the db.collection.remove() method removes documents from a collection.
You can remove all documents from a collection, remove all documents that match a condition, or limit the operation to remove just a single document.

This tutorial provides examples of remove operations using the db.collection.remove() method in the mongo shell.

Remove All Documents

To remove all documents from a collection, pass an empty query document {} to the remove() method. The remove() method does not remove the indexes.

The following example removes all documents from the inventory collection:

db.inventory.remove({})

To remove all documents from a collection, it may be more efficient to use the drop() method to drop the entire collection, including the indexes, and then recreate the collection and rebuild the indexes.

Remove Documents that Match a Condition

To remove the documents that match a deletion condition, call the remove() method with the <query> parameter.

The following example removes all documents from the inventory collection where the type field equals food:

db.inventory.remove( { type : "food" } )

For large deletion operations, it may be more efficient to copy the documents that you want to keep to a new collection and then use drop() on the original collection.

Remove a Single Document that Matches a Condition

To remove a single document, call the remove() method with the justOne parameter set to true or 1.

The following example removes one document from the inventory collection where the type field equals food:

db.inventory.remove( { type : "food" }, 1 )

To delete a single document sorted by some specified order, use the findAndModify() method.

3.3.8 Perform Two Phase Commits

Synopsis

This document provides a pattern for doing multi-document updates or “multi-document transactions” using a two-phase commit approach for writing data to multiple documents. Additionally, you can extend this process to provide a rollback-like (page 106) functionality.
Background

Operations on a single document are always atomic with MongoDB databases; however, operations that involve multiple documents, which are often referred to as “multi-document transactions”, are not atomic. Since documents can be fairly complex and contain multiple “nested” documents, single-document atomicity provides necessary support for many practical use cases.

Despite the power of single-document atomic operations, there are cases that require multi-document transactions. When executing a transaction composed of sequential operations, certain issues arise, such as:
• Atomicity: if one operation fails, the previous operation within the transaction must “rollback” to the previous state (i.e. the “nothing,” in “all or nothing”).

• Consistency: if a major failure (i.e. network, hardware) interrupts the transaction, the database must be able to recover a consistent state.

For situations that require multi-document transactions, you can implement two-phase commit in your application to provide support for these kinds of multi-document updates. Using two-phase commit ensures that data is consistent and, in case of an error, the state that preceded the transaction is recoverable (page 106). During the procedure, however, documents can represent pending data and states.

Note: Because only single-document operations are atomic with MongoDB, two-phase commits can only offer transaction-like semantics. It is possible for applications to return intermediate data at intermediate points during the two-phase commit or rollback.

Pattern

Overview

Consider a scenario where you want to transfer funds from account A to account B. In a relational database system, you can subtract the funds from A and add the funds to B in a single multi-statement transaction. In MongoDB, you can emulate a two-phase commit to achieve a comparable result.

The examples in this tutorial use the following two collections:

1. A collection named accounts to store account information.

2. A collection named transactions to store information on the fund transfer transactions.

Initialize Source and Destination Accounts

Insert into the accounts collection a document for account A and a document for account B.

db.accounts.insert(
   [
     { _id: "A", balance: 1000, pendingTransactions: [] },
     { _id: "B", balance: 1000, pendingTransactions: [] }
   ]
)

The operation returns a BulkWriteResult() object with the status of the operation. Upon successful insert, the BulkWriteResult() has nInserted set to 2.
Initialize Transfer Record

For each fund transfer to perform, insert into the transactions collection a document with the transfer information. The document contains the following fields:

• source and destination fields, which refer to the _id fields from the accounts collection,

• value field, which specifies the amount of transfer affecting the balance of the source and destination accounts,

• state field, which reflects the current state of the transfer. The state field can have the value of initial, pending, applied, done, canceling, and canceled.
• lastModified field, which reflects the last modification date.

To initialize the transfer of 100 from account A to account B, insert into the transactions collection a document with the transfer information, the transaction state of "initial", and the lastModified field set to the current date:

db.transactions.insert(
   { _id: 1, source: "A", destination: "B", value: 100, state: "initial", lastModified: new Date() }
)

The operation returns a WriteResult() object with the status of the operation. Upon successful insert, the WriteResult() object has nInserted set to 1.

Transfer Funds Between Accounts Using Two-Phase Commit

Step 1: Retrieve the transaction to start.

From the transactions collection, find a transaction in the initial state. Currently the transactions collection has only one document, namely the one added in the Initialize Transfer Record (page 103) step. If the collection contains additional documents, the query will return any transaction with an initial state unless you specify additional query conditions.

var t = db.transactions.findOne( { state: "initial" } )

Type the variable t in the mongo shell to print the contents of the variable. The operation should print a document similar to the following except the lastModified field should reflect the date of your insert operation:

{ "_id" : 1, "source" : "A", "destination" : "B", "value" : 100, "state" : "initial", "lastModified" : ISODate(...) }

Step 2: Update transaction state to pending.

Set the transaction state from initial to pending and use the $currentDate operator to set the lastModified field to the current date.

db.transactions.update(
   { _id: t._id, state: "initial" },
   {
     $set: { state: "pending" },
     $currentDate: { lastModified: true }
   }
)

The operation returns a WriteResult() object with the status of the operation. Upon successful update, the nMatched and nModified fields each display 1.
In the update statement, the state: "initial" condition ensures that no other process has already updated this record. If nMatched and nModified are 0, go back to the first step to get a different transaction and restart the procedure.

Step 3: Apply the transaction to both accounts.

Apply the transaction t to both accounts using the update() method if the transaction has not been applied to the accounts. In the update condition, include the condition pendingTransactions: { $ne: t._id } in order to avoid re-applying the transaction if the step is run more than once.

To apply the transaction to the account, update both the balance field and the pendingTransactions field.

Update the source account, subtracting from its balance the transaction value and adding to its pendingTransactions array the transaction _id.
db.accounts.update(
   { _id: t.source, pendingTransactions: { $ne: t._id } },
   { $inc: { balance: -t.value }, $push: { pendingTransactions: t._id } }
)

Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1.

Update the destination account, adding to its balance the transaction value and adding to its pendingTransactions array the transaction _id.

db.accounts.update(
   { _id: t.destination, pendingTransactions: { $ne: t._id } },
   { $inc: { balance: t.value }, $push: { pendingTransactions: t._id } }
)

Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1.

Step 4: Update transaction state to applied.

Use the following update() operation to set the transaction’s state to applied and update the lastModified field:

db.transactions.update(
   { _id: t._id, state: "pending" },
   {
     $set: { state: "applied" },
     $currentDate: { lastModified: true }
   }
)

Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1.

Step 5: Update both accounts’ list of pending transactions.

Remove the applied transaction _id from the pendingTransactions array for both accounts.

Update the source account.

db.accounts.update(
   { _id: t.source, pendingTransactions: t._id },
   { $pull: { pendingTransactions: t._id } }
)

Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1.

Update the destination account.

db.accounts.update(
   { _id: t.destination, pendingTransactions: t._id },
   { $pull: { pendingTransactions: t._id } }
)

Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1.

Step 6: Update transaction state to done.

Complete the transaction by setting the state of the transaction to done and updating the lastModified field:

db.transactions.update(
   { _id: t._id, state: "applied" },
   {
     $set: { state: "done" },
     $currentDate: { lastModified: true }
   }
)

Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1.

Recovering from Failure Scenarios

The most important part of the transaction procedure is not the prototypical example above, but rather the possibility for recovering from the various failure scenarios when transactions do not complete successfully. This section presents an overview of possible failures and provides steps to recover from these kinds of events.

Recovery Operations

The two-phase commit pattern allows applications running the sequence to resume the transaction and arrive at a consistent state. Run the recovery operations at application startup, and possibly at regular intervals, to catch any unfinished transactions. The time required to reach a consistent state depends on how long the application needs to recover each transaction.

The following recovery procedures use the lastModified date as an indicator of whether the pending transaction requires recovery; specifically, if the pending or applied transaction has not been updated in the last 30 minutes, the procedures determine that these transactions require recovery. You can use different conditions to make this determination.

Transactions in Pending State

To recover from failures that occur after the “Update transaction state to pending. (page ??)” step but before the “Update transaction state to applied. (page ??)” step, retrieve from the transactions collection a pending transaction for recovery:

var dateThreshold = new Date();
dateThreshold.setMinutes(dateThreshold.getMinutes() - 30);

var t = db.transactions.findOne( { state: "pending", lastModified: { $lt: dateThreshold } } );

And resume from step “Apply the transaction to both accounts. (page ??)“

Transactions in Applied State

To recover from failures that occur after step “Update transaction state to applied.
(page ??)” but before the “Update transaction state to done. (page ??)” step, retrieve from the transactions collection an applied transaction for recovery:

var dateThreshold = new Date();
dateThreshold.setMinutes(dateThreshold.getMinutes() - 30);

var t = db.transactions.findOne( { state: "applied", lastModified: { $lt: dateThreshold } } );

And resume from “Update both accounts’ list of pending transactions. (page ??)“

Rollback Operations

In some cases, you may need to “roll back” or undo a transaction; e.g., if the application needs to “cancel” the transaction or if one of the accounts does not exist or stops existing during the transaction.
Transactions in Applied State

After the “Update transaction state to applied. (page ??)” step, you should not roll back the transaction. Instead, complete that transaction and create a new transaction to reverse the transaction by switching the values in the source and the destination fields.

Transactions in Pending State

After the “Update transaction state to pending. (page ??)” step, but before the “Update transaction state to applied. (page ??)” step, you can roll back the transaction using the following procedure:

Step 1: Update transaction state to canceling.

Update the transaction state from pending to canceling.

db.transactions.update(
   { _id: t._id, state: "pending" },
   {
     $set: { state: "canceling" },
     $currentDate: { lastModified: true }
   }
)

Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1.

Step 2: Undo the transaction on both accounts.

To undo the transaction on both accounts, reverse the transaction t if the transaction has been applied. In the update condition, include the condition pendingTransactions: t._id in order to update the account only if the pending transaction has been applied.

Update the destination account, subtracting from its balance the transaction value and removing the transaction _id from the pendingTransactions array.

db.accounts.update(
   { _id: t.destination, pendingTransactions: t._id },
   { $inc: { balance: -t.value }, $pull: { pendingTransactions: t._id } }
)

Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1. If the pending transaction has not been previously applied to this account, no document will match the update condition and nMatched and nModified will be 0.

Update the source account, adding to its balance the transaction value and removing the transaction _id from the pendingTransactions array.
db.accounts.update(
   { _id: t.source, pendingTransactions: t._id },
   { $inc: { balance: t.value }, $pull: { pendingTransactions: t._id } }
)

Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1. If the pending transaction has not been previously applied to this account, no document will match the update condition and nMatched and nModified will be 0.

Step 3: Update transaction state to canceled.

To finish the rollback, update the transaction state from canceling to canceled.
db.transactions.update(
   { _id: t._id, state: "canceling" },
   {
     $set: { state: "canceled" },
     $currentDate: { lastModified: true }
   }
)

Upon successful update, the method returns a WriteResult() object with nMatched and nModified set to 1.

Multiple Applications

Transactions exist, in part, so that multiple applications can create and run operations concurrently without causing data inconsistency or conflicts. In our procedure, to update or retrieve the transaction document, the update conditions include a condition on the state field to prevent reapplication of the transaction by multiple applications.

For example, applications App1 and App2 both grab the same transaction, which is in the initial state. App1 applies the whole transaction before App2 starts. When App2 attempts to perform the “Update transaction state to pending. (page ??)” step, the update condition, which includes the state: "initial" criterion, will not match any document, and the nMatched and nModified will be 0. This should signal to App2 to go back to the first step to restart the procedure with a different transaction.

When multiple applications are running, it is crucial that only one application can handle a given transaction at any point in time. As such, in addition to including the expected state of the transaction in the update condition, you can also create a marker in the transaction document itself to identify the application that is handling the transaction. Use the findAndModify() method to modify the transaction and get it back in one step:

t = db.transactions.findAndModify(
   {
     query: { state: "initial", application: { $exists: false } },
     update: {
       $set: { state: "pending", application: "App1" },
       $currentDate: { lastModified: true }
     },
     new: true
   }
)

Amend the transaction operations to ensure that only applications that match the identifier in the application field apply the transaction.
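The claim step above — take one unowned initial transaction, mark it pending, and stamp the owner — can be sketched in plain JavaScript over an in-memory array of transaction objects. This is only an illustration of the ownership rule (claimTransaction is a hypothetical name); the real shell version relies on findAndModify() for atomicity:

```javascript
// Hypothetical single-threaded sketch of claiming a transaction.
// Returns the claimed transaction (like findAndModify with new: true),
// or null when no unclaimed initial transaction exists.
function claimTransaction(transactions, appName) {
  for (var i = 0; i < transactions.length; i++) {
    var t = transactions[i];
    if (t.state === "initial" && t.application === undefined) {
      t.state = "pending";        // initial -> pending
      t.application = appName;    // marker identifying the owner
      t.lastModified = new Date();
      return t;
    }
  }
  return null;                    // nothing to claim
}
```

Once App1 has claimed a transaction this way, a second application finds no matching document, which mirrors the nMatched of 0 behavior described above.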
If the application App1 fails during transaction execution, you can use the recovery procedures (page 106), but applications should ensure that they “own” the transaction before applying the transaction. For example, to find and resume the pending job, use a query that resembles the following:

var dateThreshold = new Date();
dateThreshold.setMinutes(dateThreshold.getMinutes() - 30);

db.transactions.find(
   {
     application: "App1",
     state: "pending",
     lastModified: { $lt: dateThreshold }
   }
)
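Putting the six steps together, the whole transfer sequence can be simulated in plain JavaScript over in-memory objects. This is a single-threaded sketch of the pattern's bookkeeping only (transferFunds and the object shapes are hypothetical), not a substitute for the real queries above — but it shows why the pendingTransactions guard makes the apply step safe to repeat:

```javascript
// Hypothetical in-memory simulation of the two-phase commit transfer.
function transferFunds(accounts, txn) {
  var src = accounts[txn.source];
  var dst = accounts[txn.destination];

  // Step 2: initial -> pending (skip transactions already taken)
  if (txn.state !== "initial") return false;
  txn.state = "pending";

  // Step 3: apply to both accounts; the pendingTransactions guard
  // makes this step idempotent if it is re-run after a crash
  if (src.pendingTransactions.indexOf(txn._id) === -1) {
    src.balance -= txn.value;
    src.pendingTransactions.push(txn._id);
  }
  if (dst.pendingTransactions.indexOf(txn._id) === -1) {
    dst.balance += txn.value;
    dst.pendingTransactions.push(txn._id);
  }

  // Step 4: pending -> applied
  txn.state = "applied";

  // Step 5: clear both pending lists
  src.pendingTransactions = src.pendingTransactions.filter(function (id) {
    return id !== txn._id;
  });
  dst.pendingTransactions = dst.pendingTransactions.filter(function (id) {
    return id !== txn._id;
  });

  // Step 6: applied -> done
  txn.state = "done";
  return true;
}
```

Running the sketch on two accounts with a balance of 1000 each and a transfer of 100 leaves the source at 900, the destination at 1100, both pending lists empty, and the transaction in the done state; re-running it on the same transaction is a no-op.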
Using Two-Phase Commits in Production Applications

The example transaction above is intentionally simple. For example, it assumes that it is always possible to roll back operations to an account and that account balances can hold negative values.

Production implementations would likely be more complex. Typically, accounts need information about current balance, pending credits, and pending debits.

For all transactions, ensure that you use a level of write concern appropriate for your deployment.

3.3.9 Create Tailable Cursor

Overview

By default, MongoDB will automatically close a cursor when the client has exhausted all results in the cursor. However, for capped collections (page 196) you may use a Tailable Cursor that remains open after the client exhausts the results in the initial cursor. Tailable cursors are conceptually equivalent to the tail Unix command with the -f option (i.e. with “follow” mode). After clients insert new additional documents into a capped collection, the tailable cursor will continue to retrieve documents.

Use tailable cursors on capped collections that have high write volumes where indexes aren’t practical. For instance, MongoDB replication (page 503) uses tailable cursors to tail the primary’s oplog.

Note: If your query is on an indexed field, do not use tailable cursors, but instead, use a regular cursor. Keep track of the last value of the indexed field returned by the query. To retrieve the newly added documents, query the collection again using the last value of the indexed field in the query criteria, as in the following example:

db.<collection>.find( { indexedField: { $gt: <lastvalue> } } )

Consider the following behaviors related to tailable cursors:

• Tailable cursors do not use indexes and return documents in natural order.
• Because tailable cursors do not use indexes, the initial scan for the query may be expensive; but, after initially exhausting the cursor, subsequent retrievals of the newly added documents are inexpensive.

• Tailable cursors may become dead, or invalid, if either:

  – the query returns no match.

  – the cursor returns the document at the “end” of the collection and then the application deletes those documents.

A dead cursor has an id of 0.

See your driver documentation for the driver-specific method to specify the tailable cursor. For more information on the details of specifying a tailable cursor, see MongoDB wire protocol12 documentation.

C++ Example

The tail function uses a tailable cursor to output the results from a query to a capped collection:

• The function handles the case of the dead cursor by having the query be inside a loop.

• To periodically check for new data, the cursor->more() statement is also inside a loop.

12http://docs.mongodb.org/meta-driver/latest/legacy/mongodb-wire-protocol
#include "client/dbclient.h"

using namespace mongo;

/*
 * Example of a tailable cursor.
 * The function "tails" the capped collection (ns) and output elements as they are added.
 * The function also handles the possibility of a dead cursor by tracking the field 'insertDate'.
 * New documents are added with increasing values of 'insertDate'.
 */

void tail(DBClientBase& conn, const char *ns) {

    BSONElement lastValue = minKey.firstElement();

    Query query = Query().hint( BSON( "$natural" << 1 ) );

    while ( 1 ) {
        auto_ptr<DBClientCursor> c =
            conn.query(ns, query, 0, 0, 0,
                       QueryOption_CursorTailable | QueryOption_AwaitData );

        while ( 1 ) {
            if ( !c->more() ) {
                if ( c->isDead() ) {
                    break;
                }
                continue;
            }

            BSONObj o = c->next();
            lastValue = o["insertDate"];
            cout << o.toString() << endl;
        }

        query = QUERY( "insertDate" << GT << lastValue ).hint( BSON( "$natural" << 1 ) );
    }
}

The tail function performs the following actions:

• Initialize the lastValue variable, which tracks the last accessed value. The function will use the lastValue if the cursor becomes invalid and tail needs to restart the query. Use hint() to ensure that the query uses the $natural order.

• In an outer while(1) loop,

  – Query the capped collection and return a tailable cursor that blocks for several seconds waiting for new documents

    auto_ptr<DBClientCursor> c =
        conn.query(ns, query, 0, 0, 0,
                   QueryOption_CursorTailable | QueryOption_AwaitData );

    * Specify the capped collection using ns as an argument to the function.

    * Set the QueryOption_CursorTailable option to create a tailable cursor.
    * Set the QueryOption_AwaitData option so that the returned cursor blocks for a few seconds to wait for data.

  – In an inner while (1) loop, read the documents from the cursor:

    * If the cursor has no more documents and is not invalid, loop the inner while loop to recheck for more documents.

    * If the cursor has no more documents and is dead, break the inner while loop.

    * If the cursor has documents:

      · output the document,
      · update the lastValue value,
      · and loop the inner while (1) loop to recheck for more documents.

  – If the logic breaks out of the inner while (1) loop and the cursor is invalid:

    * Use the lastValue value to create a new query condition that matches documents added after the lastValue. Explicitly ensure $natural order with the hint() method:

      query = QUERY( "insertDate" << GT << lastValue ).hint( BSON( "$natural" << 1 ) );

    * Loop through the outer while (1) loop to re-query with the new query condition and repeat.

See also: Detailed blog post on tailable cursor13

3.3.10 Isolate Sequence of Operations

Overview

Write operations are atomic on the level of a single document: no single write operation can atomically affect more than one document or more than one collection.

When a single write operation modifies multiple documents, the operation as a whole is not atomic, and other operations may interleave. The modification of a single document, or record, is always atomic, even if the write operation modifies multiple sub-documents within the single record. No other operations are atomic; however, you can isolate a single write operation that affects multiple documents using the isolation operator.

This document describes one method of updating documents only if the local copy of the document reflects the current state of the document in the database.
In addition, the following methods provide a way to manage isolated sequences of operations:

• the findAndModify() method provides an isolated update and return operation.

• Perform Two Phase Commits (page 102)

• Create a unique index (page 457), to ensure that a key doesn’t exist when you insert it.

13http://shtylman.com/post/the-tail-of-mongodb
Update if Current

In this pattern, you will:

• query for a document,

• modify the fields in that document,

• and update the fields of a document only if the fields have not changed in the collection since the query.

Consider the following example in JavaScript which attempts to update the qty field of a document in the products collection:

Changed in version 2.6: The db.collection.update() method now returns a WriteResult() object that contains the status of the operation. Previous versions required an extra db.getLastErrorObj() method call.

var myCollection = db.products;
var myDocument = myCollection.findOne( { sku: 'abc123' } );

if (myDocument) {

   var oldQty = myDocument.qty;

   if (myDocument.qty < 10) {
      myDocument.qty *= 4;
   } else if ( myDocument.qty < 20 ) {
      myDocument.qty *= 3;
   } else {
      myDocument.qty *= 2;
   }

   var results = myCollection.update(
      { _id: myDocument._id, qty: oldQty },
      { $set: { qty: myDocument.qty } }
   );

   if ( results.hasWriteError() ) {
      print("unexpected error updating document: " + tojson( results ));
   } else if ( results.nMatched == 0 ) {
      print("No update: no matching document for { _id: " + myDocument._id + ", qty: " + oldQty + " }");
   }
}

Your application may require some modifications of this pattern, such as:

• Use the entire document as the query in the update() operation, to generalize the operation and guarantee that the original document was not modified, rather than ensuring that a single field was not changed.

• Add a version variable to the document that applications increment upon each update operation to the documents. Use this version variable in the query expression. You must be able to ensure that all clients that connect to your database obey this constraint.

• Use $set in the update expression to modify only your fields and prevent overriding other fields.

• Use one of the methods described in Create an Auto-Incrementing Sequence Field (page 113).
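The "update if current" check reduces to a compare-and-set: the write succeeds only if the field still holds the value read earlier. A minimal in-memory sketch (updateIfCurrent is a hypothetical helper, not shell API) makes the success and failure cases explicit:

```javascript
// Hypothetical compare-and-set over a single in-memory document.
// The update applies only when qty still equals the previously read value,
// mirroring the { _id: ..., qty: oldQty } query condition above.
function updateIfCurrent(doc, expectedQty, newQty) {
  if (doc.qty !== expectedQty) {
    return { nMatched: 0, nModified: 0 };   // a concurrent writer got there first
  }
  doc.qty = newQty;
  return { nMatched: 1, nModified: 1 };
}
```

A first call with the value read earlier succeeds (nMatched of 1); a second call reusing the stale expected value fails with nMatched of 0, which is exactly the signal the shell example checks for with results.nMatched == 0.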
3.3.11 Create an Auto-Incrementing Sequence Field

Synopsis

MongoDB reserves the _id field in the top level of all documents as a primary key. _id must be unique, and always has an index with a unique constraint (page 457). However, except for the unique constraint you can use any value for the _id field in your collections.

This tutorial describes two methods for creating an incrementing sequence number for the _id field using the following:

• Use Counters Collection (page 113)

• Optimistic Loop (page 115)

Considerations

Generally in MongoDB, you would not use an auto-increment pattern for the _id field, or any field, because it does not scale for databases with large numbers of documents. Typically the default ObjectId value is a better choice for the _id.

Procedures

Use Counters Collection

Counter Collection Implementation

Use a separate counters collection to track the last number sequence used. The _id field contains the sequence name and the seq field contains the last value of the sequence.

1. Insert into the counters collection, the initial value for the userid:

db.counters.insert(
   {
      _id: "userid",
      seq: 0
   }
)

2. Create a getNextSequence function that accepts a name of the sequence. The function uses the findAndModify() method to atomically increment the seq value and return this new value:

function getNextSequence(name) {
   var ret = db.counters.findAndModify(
          {
            query: { _id: name },
            update: { $inc: { seq: 1 } },
            new: true
          }
   );

   return ret.seq;
}

3. Use this getNextSequence() function during insert().

db.users.insert(
   {
     _id: getNextSequence("userid"),
     name: "Sarah C."
   }
)

db.users.insert(
   {
     _id: getNextSequence("userid"),
     name: "Bob D."
   }
)

You can verify the results with find():

db.users.find()

The _id fields contain incrementing sequence values:

{ _id : 1, name : "Sarah C." }
{ _id : 2, name : "Bob D." }

findAndModify Behavior

When findAndModify() includes the upsert: true option and the query field(s) is not uniquely indexed, the method could insert a document multiple times in certain circumstances. For instance, if multiple clients each invoke the method with the same query condition and these methods complete the find phase before any of the methods perform the modify phase, these methods could insert the same document.

In the counters collection example, the query field is the _id field, which always has a unique index. Consider that the findAndModify() includes the upsert: true option, as in the following modified example:

function getNextSequence(name) {
   var ret = db.counters.findAndModify(
          {
            query: { _id: name },
            update: { $inc: { seq: 1 } },
            new: true,
            upsert: true
          }
   );

   return ret.seq;
}

If multiple clients were to invoke the getNextSequence() method with the same name parameter, then the methods would observe one of the following behaviors:

• Exactly one findAndModify() would successfully insert a new document.

• Zero or more findAndModify() methods would update the newly inserted document.

• Zero or more findAndModify() methods would fail when they attempted to insert a duplicate.

If the method fails due to a unique index constraint violation, retry the method. Absent a delete of the document, the retry should not fail.
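The counter pattern boils down to "increment, then read" on a per-name record. A plain-JavaScript analogue over an in-memory counters object (nextSequence and the object shape are hypothetical; in the real shell version, findAndModify() supplies the atomicity) shows the upsert-style initialization and the independent per-name sequences:

```javascript
// Hypothetical in-memory analogue of the counters-collection pattern.
// Single-threaded sketch only: a missing counter is created at 0 (like
// upsert: true), then incremented and the new value returned.
function nextSequence(counters, name) {
  if (!(name in counters)) {
    counters[name] = 0;        // first use of this sequence name
  }
  counters[name] += 1;         // like { $inc: { seq: 1 } } with new: true
  return counters[name];
}
```

Successive calls for the same name yield 1, 2, 3, ...; a different name starts its own sequence at 1, just as separate documents in the counters collection would.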
Optimistic Loop

In this pattern, an Optimistic Loop calculates the incremented _id value and attempts to insert a document with the calculated _id value. If the insert is successful, the loop ends. Otherwise, the loop will iterate through possible _id values until the insert is successful.

1. Create a function named insertDocument that performs the “insert if not present” loop. The function wraps the insert() method and takes doc and targetCollection arguments.

Changed in version 2.6: The db.collection.insert() method now returns a WriteResult() object that contains the status of the operation. Previous versions required an extra db.getLastErrorObj() method call.

function insertDocument(doc, targetCollection) {
    while (1) {
        var cursor = targetCollection.find( {}, { _id: 1 } ).sort( { _id: -1 } ).limit(1);

        var seq = cursor.hasNext() ? cursor.next()._id + 1 : 1;

        doc._id = seq;

        var results = targetCollection.insert(doc);

        if( results.hasWriteError() ) {
            if( results.writeError.code == 11000 /* dup key */ )
                continue;
            else
                print( "unexpected error inserting data: " + tojson( results ) );
        }

        break;
    }
}

The while (1) loop performs the following actions:

• Queries the targetCollection for the document with the maximum _id value.

• Determines the next sequence value for _id by:

  – adding 1 to the returned _id value if the returned cursor points to a document.

  – otherwise: it sets the next sequence value to 1 if the returned cursor points to no document.

• For the doc to insert, set its _id field to the calculated sequence value seq.

• Insert the doc into the targetCollection.

• If the insert operation errors with duplicate key, repeat the loop. Otherwise, if the insert operation encounters some other error or if the operation succeeds, break out of the loop.

2. Use the insertDocument() function to perform an insert:

var myCollection = db.users2;

insertDocument(
   {
     name: "Grace H."
   },
   myCollection
MongoDB CRUD Tutorials 115
);

insertDocument(
   { name: "Ted R." },
   myCollection
)

You can verify the results with find():

db.users2.find()

The _id fields contain incrementing sequence values:

{ _id : 1, name : "Grace H." }
{ _id : 2, name : "Ted R." }

The while loop may iterate many times in collections with larger insert volumes.

3.3.12 Limit Number of Elements in an Array after an Update

New in version 2.4.

Synopsis

Consider an application where users may submit many scores (e.g. for a test), but the application only needs to track the top three test scores. This pattern uses the $push operator with the $each, $sort, and $slice modifiers to sort and maintain an array of fixed size.

Important: The array elements must be documents in order to use the $sort modifier.

Pattern

Consider the following document in the collection students:

{
  _id: 1,
  scores: [
    { attempt: 1, score: 10 },
    { attempt: 2, score: 8 }
  ]
}

The following update uses the $push operator with:

• the $each modifier to append two new elements to the array,
• the $sort modifier to order the elements by ascending (1) score, and
• the $slice modifier to keep the last 3 elements of the ordered array.

db.students.update(
   { _id: 1 },
   {
     $push: {
       scores: {
         $each: [ { attempt: 3, score: 7 },
                  { attempt: 4, score: 4 } ],
         $sort: { score: 1 },
         $slice: -3
       }
     }
   }
)

Note: When using the $sort modifier on the array element, access the field in the subdocument element directly instead of using dot notation on the array field.

After the operation, the document contains only the top 3 scores in the scores array:

{
   "_id" : 1,
   "scores" : [
      { "attempt" : 3, "score" : 7 },
      { "attempt" : 2, "score" : 8 },
      { "attempt" : 1, "score" : 10 }
   ]
}

See also: the $push operator, and the $each, $sort, and $slice modifiers.

3.4 MongoDB CRUD Reference

3.4.1 Query Cursor Methods

Name                Description
cursor.count()      Returns a count of the documents in a cursor.
cursor.explain()    Reports on the query execution plan, including index use, for a cursor.
cursor.hint()       Forces MongoDB to use a specific index for a query.
cursor.limit()      Constrains the size of a cursor's result set.
cursor.next()       Returns the next document in a cursor.
cursor.skip()       Returns a cursor that begins returning results only after passing or skipping a number of documents.
cursor.sort()       Returns results ordered according to a sort specification.
cursor.toArray()    Returns an array that contains all documents returned by the cursor.
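The fixed-size array pattern from the preceding tutorial can be approximated in plain JavaScript (a hypothetical sketch, not a MongoDB API) to show how the final array is derived from $each, $sort, and $slice:

```javascript
// Sketch of the $push + $each/$sort/$slice update semantics:
// append the new score documents, sort ascending by score, keep the last 3.
function pushSortSlice(arr, newElems, sortKey, slice) {
  var merged = arr.concat(newElems);            // $each appends all new elements
  merged.sort(function (a, b) { return a[sortKey] - b[sortKey]; }); // $sort: { score: 1 }
  return slice < 0 ? merged.slice(slice)        // negative $slice keeps the tail
                   : merged.slice(0, slice);    // positive $slice keeps the head
}

var scores = [ { attempt: 1, score: 10 }, { attempt: 2, score: 8 } ];
var result = pushSortSlice(
  scores,
  [ { attempt: 3, score: 7 }, { attempt: 4, score: 4 } ],
  "score",
  -3
);
// result: [ { attempt: 3, score: 7 }, { attempt: 2, score: 8 }, { attempt: 1, score: 10 } ]
```

Because the array is sorted ascending before slicing, $slice: -3 retains the three highest scores, matching the document shown after the update.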
3.4.2 Query and Data Manipulation Collection Methods

Name                       Description
db.collection.count()      Wraps count to return a count of the number of documents in a collection or matching a query.
db.collection.distinct()   Returns an array of documents that have distinct values for the specified field.
db.collection.find()       Performs a query on a collection and returns a cursor object.
db.collection.findOne()    Performs a query and returns a single document.
db.collection.insert()     Creates a new document in a collection.
db.collection.remove()     Deletes documents from a collection.
db.collection.save()       Provides a wrapper around an insert() and update() to insert new documents.
db.collection.update()     Modifies a document in a collection.

3.4.3 MongoDB CRUD Reference Documentation

Write Concern Reference (page 118) Configuration options associated with the guarantee MongoDB provides when reporting on the success of a write operation.

SQL to MongoDB Mapping Chart (page 120) An overview of common database operations showing both the MongoDB operations and SQL statements.

The bios Example Collection (page 125) Sample data for experimenting with MongoDB. The insert(), update() and find() pages use the data for some of their examples.

Write Concern Reference

Write concern (page 72) describes the guarantee that MongoDB provides when reporting on the success of a write operation.

Changed in version 2.6: A new protocol for write operations (page 737) integrates write concerns with the write operations and eliminates the need to call the getLastError command. Previous versions required a getLastError command immediately after a write operation to specify the write concern.

Read Isolation Behavior

MongoDB allows clients to read documents inserted or modified before it commits these modifications to disk, regardless of write concern level or journaling configuration.
As a result, applications may observe two classes of behaviors:

• For systems with multiple concurrent readers and writers, MongoDB will allow clients to read the results of a write operation before the write operation returns.
• If the mongod terminates before the journal commits, even if a write returns successfully, queries may have read data that will not exist after the mongod restarts.

Other database systems refer to these isolation semantics as read uncommitted. For all inserts and updates, MongoDB modifies each document in isolation: clients never see documents in intermediate states. For multi-document operations, MongoDB does not provide any multi-document transactions or isolation.

When mongod returns a successful journaled write concern, the data is fully committed to disk and will be available after mongod restarts.

For replica sets, write operations are durable only after a write replicates and commits to the journal of a majority of the members of the set. MongoDB regularly commits data to the journal regardless of journaled write concern: use the commitIntervalMs setting to control how often a mongod commits the journal.
Available Write Concern

Write concern can include the w (page 119) option to specify the required number of acknowledgments before returning, the j (page 119) option to require writes to the journal before returning, and the wtimeout (page 119) option to specify a time limit to prevent write operations from blocking indefinitely.

In sharded clusters, mongos instances will pass the write concern on to the shard.

w Option

The w option provides the ability to disable write concern entirely as well as to specify the write concern for replica sets. MongoDB uses w: 1 as the default write concern. w: 1 provides basic receipt acknowledgment.

The w option accepts the following values:

1
    Provides acknowledgment of write operations on a standalone mongod or the primary in a replica set. This is the default write concern for MongoDB.

0
    Disables basic acknowledgment of write operations, but returns information about socket exceptions and networking errors to the application. If you disable basic write operation acknowledgment but require journal commit acknowledgment, the journal commit prevails, and the server will require that mongod acknowledge the write operation.

<Number greater than 1>
    Guarantees that write operations have propagated successfully to the specified number of replica set members including the primary. For example, w: 2 indicates acknowledgments from the primary and at least one secondary. If you set w to a number that is greater than the number of set members that hold data, MongoDB waits for the non-existent members to become available, which means MongoDB blocks indefinitely.

"majority"
    Confirms that write operations have propagated to the majority of the configured replica set: a majority of the set's configured members must acknowledge the write operation before it succeeds. This allows you to avoid hard-coding assumptions about the size of your replica set into your application.
    Changed in version 2.6: In Master/Slave (page 538) deployments, MongoDB treats w: "majority" as equivalent to w: 1. In earlier versions of MongoDB, w: "majority" produces an error in master/slave (page 538) deployments.

<tag set>
    By specifying a tag set (page 576), you can have fine-grained control over which replica set members must acknowledge a write operation to satisfy the required level of write concern.

j Option

The j option confirms that the mongod instance has written the data to the on-disk journal. This ensures that data is not lost if the mongod instance shuts down unexpectedly. Set to true to enable.

Changed in version 2.6: Specifying a write concern that includes j: true to a mongod or mongos running with the --nojournal option now errors. Previous versions would ignore j: true.

Note: Requiring journaled write concern in a replica set only requires a journal commit of the write operation to the primary of the set, regardless of the level of replica acknowledged write concern.

wtimeout

This option specifies a time limit, in milliseconds, for the write concern. wtimeout is only applicable for w values greater than 1.
wtimeout causes write operations to return with an error after the specified limit, even if the required write concern will eventually succeed. When these write operations return, MongoDB does not undo successful data modifications performed before the write concern exceeded the wtimeout time limit. If you do not specify the wtimeout option and the level of write concern is unachievable, the write operation will block indefinitely. Specifying a wtimeout value of 0 is equivalent to a write concern without the wtimeout option.

See also: Write Concern Introduction (page 72) and Write Concern for Replica Sets (page 75).

SQL to MongoDB Mapping Chart

In addition to the charts that follow, you might want to consider the Frequently Asked Questions (page 687) section for a selection of common questions about MongoDB.

Terminology and Concepts

The following table presents the various SQL terminology and concepts and the corresponding MongoDB terminology and concepts.

SQL Terms/Concepts                         MongoDB Terms/Concepts
database                                   database
table                                      collection
row                                        document or BSON document
column                                     field
index                                      index
table joins                                embedded documents and linking
primary key                                primary key
  (specify any unique column or column       (the primary key is automatically set
  combination as primary key)                to the _id field)
aggregation (e.g. group by)                aggregation pipeline
                                             (see the SQL to Aggregation Mapping
                                             Chart (page 426))

Executables

The following table presents some database executables and the corresponding MongoDB executables. This table is not meant to be exhaustive.

                   MongoDB    MySQL     Oracle    Informix     DB2
Database Server    mongod     mysqld    oracle    IDS          DB2 Server
Database Client    mongo      mysql     sqlplus   DB-Access    DB2 Client

Examples

The following table presents the various SQL statements and the corresponding MongoDB statements. The examples in the table assume the following conditions:

• The SQL examples assume a table named users.
• The MongoDB examples assume a collection named users that contains documents of the following prototype:
{
  _id: ObjectId("509a8fb2f3f4948bd2f983a0"),
  user_id: "abc123",
  age: 55,
  status: 'A'
}

Create and Alter

The following table presents the various SQL statements related to table-level actions and the corresponding MongoDB statements.
SQL Schema Statements and the corresponding MongoDB Schema Statements:

CREATE TABLE users (
    id MEDIUMINT NOT NULL AUTO_INCREMENT,
    user_id Varchar(30),
    age Number,
    status char(1),
    PRIMARY KEY (id)
)

    Implicitly created on first insert() operation. The primary key _id is automatically added if the _id field is not specified.

    db.users.insert( {
        user_id: "abc123",
        age: 55,
        status: "A"
    } )

    However, you can also explicitly create a collection:

    db.createCollection("users")

ALTER TABLE users ADD join_date DATETIME

    Collections do not describe or enforce the structure of their documents; i.e. there is no structural alteration at the collection level. However, at the document level, update() operations can add fields to existing documents using the $set operator.

    db.users.update(
        { },
        { $set: { join_date: new Date() } },
        { multi: true }
    )

ALTER TABLE users DROP COLUMN join_date

    Collections do not describe or enforce the structure of their documents; i.e. there is no structural alteration at the collection level. However, at the document level, update() operations can remove fields from documents using the $unset operator.

    db.users.update(
        { },
        { $unset: { join_date: "" } },
        { multi: true }
    )

CREATE INDEX idx_user_id_asc ON users(user_id)

    db.users.ensureIndex( { user_id: 1 } )

CREATE INDEX idx_user_id_asc_age_desc ON users(user_id, age DESC)

    db.users.ensureIndex( { user_id: 1, age: -1 } )

DROP TABLE users

    db.users.drop()

For more information, see db.collection.insert(), db.createCollection(), db.collection.update(), $set, $unset, db.collection.ensureIndex(), indexes (page 436), db.collection.drop(), and Data Modeling Concepts (page 133).

Insert

The following table presents the various SQL statements related to inserting records into tables and the corresponding MongoDB statements.
SQL INSERT Statements and the corresponding MongoDB insert() Statements:

INSERT INTO users(user_id, age, status)
VALUES ("bcd001", 45, "A")

    db.users.insert(
        { user_id: "bcd001", age: 45, status: "A" }
    )

For more information, see db.collection.insert().

Select

The following table presents the various SQL statements related to reading records from tables and the corresponding MongoDB statements.
SQL SELECT Statements and the corresponding MongoDB find() Statements:

SELECT *
FROM users

    db.users.find()

SELECT id, user_id, status
FROM users

    db.users.find(
        { },
        { user_id: 1, status: 1 }
    )

SELECT user_id, status
FROM users

    db.users.find(
        { },
        { user_id: 1, status: 1, _id: 0 }
    )

SELECT *
FROM users
WHERE status = "A"

    db.users.find( { status: "A" } )

SELECT user_id, status
FROM users
WHERE status = "A"

    db.users.find(
        { status: "A" },
        { user_id: 1, status: 1, _id: 0 }
    )

SELECT *
FROM users
WHERE status != "A"

    db.users.find( { status: { $ne: "A" } } )

SELECT *
FROM users
WHERE status = "A" AND age = 50

    db.users.find( { status: "A", age: 50 } )

SELECT *
FROM users
WHERE status = "A" OR age = 50

    db.users.find( { $or: [ { status: "A" }, { age: 50 } ] } )

SELECT *
FROM users
WHERE age > 25

    db.users.find( { age: { $gt: 25 } } )

SELECT *
FROM users
WHERE age < 25

    db.users.find( { age: { $lt: 25 } } )

SELECT *
FROM users
WHERE age > 25 AND age <= 50

    db.users.find( { age: { $gt: 25, $lte: 50 } } )

SELECT *
FROM users
WHERE user_id like "%bc%"

    db.users.find( { user_id: /bc/ } )
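To build intuition for how the query documents above select documents, here is a hypothetical plain-JavaScript matcher (a simplified sketch, not a MongoDB API) that models a few of the operators:

```javascript
// Tiny matcher modeling equality plus the $gt, $lt, $lte, and $ne operators.
function matches(doc, query) {
  return Object.keys(query).every(function (field) {
    var value = doc[field], spec = query[field];
    if (spec !== null && typeof spec === "object") {
      // Operator document, e.g. { $gt: 25, $lte: 50 } — all conditions must hold.
      return Object.keys(spec).every(function (op) {
        if (op === "$gt")  return value >  spec[op];
        if (op === "$lt")  return value <  spec[op];
        if (op === "$lte") return value <= spec[op];
        if (op === "$ne")  return value !== spec[op];
        throw new Error("operator not modeled: " + op);
      });
    }
    return value === spec;  // bare value means equality, as in { status: "A" }
  });
}

var users = [
  { user_id: "abc123", age: 55, status: "A" },
  { user_id: "bcd001", age: 45, status: "B" }
];

// WHERE age > 25 AND age <= 50
var found = users.filter(function (u) {
  return matches(u, { age: { $gt: 25, $lte: 50 } });
});
// found contains only the bcd001 document
```

As in MongoDB, multiple fields in one query document and multiple operators in one operator document combine as a logical AND.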
For more information, see db.collection.find(), db.collection.distinct(), db.collection.findOne(), $ne, $and, $or, $gt, $lt, $exists, $lte, $regex, limit(), skip(), explain(), sort(), and count().

Update Records

The following table presents the various SQL statements related to updating existing records in tables and the corresponding MongoDB statements.

UPDATE users
SET status = "C"
WHERE age > 25

    db.users.update(
        { age: { $gt: 25 } },
        { $set: { status: "C" } },
        { multi: true }
    )

UPDATE users
SET age = age + 3
WHERE status = "A"

    db.users.update(
        { status: "A" },
        { $inc: { age: 3 } },
        { multi: true }
    )

For more information, see db.collection.update(), $set, $inc, and $gt.

Delete Records

The following table presents the various SQL statements related to deleting records from tables and the corresponding MongoDB statements.

DELETE FROM users
WHERE status = "D"

    db.users.remove( { status: "D" } )

DELETE FROM users

    db.users.remove({})

For more information, see db.collection.remove().

The bios Example Collection

The bios collection provides example data for experimenting with MongoDB. Many of this guide's examples on insert, update and read operations create or query data from the bios collection.

The following documents comprise the bios collection. In the examples, the data might be different, as the examples themselves make changes to the data.

{
    "_id" : 1,
    "name" : { "first" : "John", "last" : "Backus" },
    "birth" : ISODate("1924-12-03T05:00:00Z"),
    "death" : ISODate("2007-03-17T04:00:00Z"),
    "contribs" : [
        "Fortran",
        "ALGOL",
        "Backus-Naur Form",
        "FP"
    ],
    "awards" : [
        { "award" : "W.W. McDowell Award", "year" : 1967, "by" : "IEEE Computer Society" },
        { "award" : "National Medal of Science", "year" : 1975, "by" : "National Science Foundation" },
        { "award" : "Turing Award", "year" : 1977, "by" : "ACM" },
        { "award" : "Draper Prize", "year" : 1993, "by" : "National Academy of Engineering" }
    ]
}

{
    "_id" : ObjectId("51df07b094c6acd67e492f41"),
    "name" : { "first" : "John", "last" : "McCarthy" },
    "birth" : ISODate("1927-09-04T04:00:00Z"),
    "death" : ISODate("2011-12-24T05:00:00Z"),
    "contribs" : [ "Lisp", "Artificial Intelligence", "ALGOL" ],
    "awards" : [
        { "award" : "Turing Award", "year" : 1971, "by" : "ACM" },
        { "award" : "Kyoto Prize", "year" : 1988, "by" : "Inamori Foundation" },
        { "award" : "National Medal of Science", "year" : 1990, "by" : "National Science Foundation" }
    ]
}

{
    "_id" : 3,
    "name" : { "first" : "Grace", "last" : "Hopper" },
    "title" : "Rear Admiral",
    "birth" : ISODate("1906-12-09T05:00:00Z"),
    "death" : ISODate("1992-01-01T05:00:00Z"),
    "contribs" : [ "UNIVAC", "compiler", "FLOW-MATIC", "COBOL" ],
    "awards" : [
        { "award" : "Computer Sciences Man of the Year", "year" : 1969, "by" : "Data Processing Management Association" },
        { "award" : "Distinguished Fellow", "year" : 1973, "by" : "British Computer Society" },
        { "award" : "W. W. McDowell Award", "year" : 1976, "by" : "IEEE Computer Society" },
        { "award" : "National Medal of Technology", "year" : 1991, "by" : "United States" }
    ]
}

{
    "_id" : 4,
    "name" : { "first" : "Kristen", "last" : "Nygaard" },
    "birth" : ISODate("1926-08-27T04:00:00Z"),
    "death" : ISODate("2002-08-10T04:00:00Z"),
    "contribs" : [ "OOP", "Simula" ],
    "awards" : [
        { "award" : "Rosing Prize", "year" : 1999, "by" : "Norwegian Data Association"
        },
        { "award" : "Turing Award", "year" : 2001, "by" : "ACM" },
        { "award" : "IEEE John von Neumann Medal", "year" : 2001, "by" : "IEEE" }
    ]
}

{
    "_id" : 5,
    "name" : { "first" : "Ole-Johan", "last" : "Dahl" },
    "birth" : ISODate("1931-10-12T04:00:00Z"),
    "death" : ISODate("2002-06-29T04:00:00Z"),
    "contribs" : [ "OOP", "Simula" ],
    "awards" : [
        { "award" : "Rosing Prize", "year" : 1999, "by" : "Norwegian Data Association" },
        { "award" : "Turing Award", "year" : 2001, "by" : "ACM" },
        { "award" : "IEEE John von Neumann Medal", "year" : 2001, "by" : "IEEE" }
    ]
}

{
    "_id" : 6,
    "name" : { "first" : "Guido", "last" : "van Rossum" },
    "birth" : ISODate("1956-01-31T05:00:00Z"),
    "contribs" : [ "Python" ],
    "awards" : [
        {
            "award" : "Award for the Advancement of Free Software",
            "year" : 2001,
            "by" : "Free Software Foundation"
        },
        { "award" : "NLUUG Award", "year" : 2003, "by" : "NLUUG" }
    ]
}

{
    "_id" : ObjectId("51e062189c6ae665454e301d"),
    "name" : { "first" : "Dennis", "last" : "Ritchie" },
    "birth" : ISODate("1941-09-09T04:00:00Z"),
    "death" : ISODate("2011-10-12T04:00:00Z"),
    "contribs" : [ "UNIX", "C" ],
    "awards" : [
        { "award" : "Turing Award", "year" : 1983, "by" : "ACM" },
        { "award" : "National Medal of Technology", "year" : 1998, "by" : "United States" },
        { "award" : "Japan Prize", "year" : 2011, "by" : "The Japan Prize Foundation" }
    ]
}

{
    "_id" : 8,
    "name" : { "first" : "Yukihiro", "aka" : "Matz", "last" : "Matsumoto" },
    "birth" : ISODate("1965-04-14T04:00:00Z"),
    "contribs" : [ "Ruby" ],
    "awards" : [
        {
            "award" : "Award for the Advancement of Free Software",
            "year" : "2011",
            "by" : "Free Software Foundation"
        }
    ]
}

{
    "_id" : 9,
    "name" : { "first" : "James", "last" : "Gosling" },
    "birth" : ISODate("1955-05-19T04:00:00Z"),
    "contribs" : [ "Java" ],
    "awards" : [
        { "award" : "The Economist Innovation Award", "year" : 2002, "by" : "The Economist" },
        { "award" : "Officer of the Order of Canada", "year" : 2007, "by" : "Canada" }
    ]
}

{
    "_id" : 10,
    "name" : { "first" : "Martin", "last" : "Odersky" },
    "contribs" : [ "Scala" ]
}
CHAPTER 4

Data Models

Data in MongoDB has a flexible schema. Collections do not enforce document structure. This flexibility gives you data-modeling choices to match your application and its performance requirements.

Read the Data Modeling Introduction (page 131) document for a high level introduction to data modeling, and proceed to the documents in the Data Modeling Concepts (page 133) section for additional documentation of the data model design process. The Data Model Examples and Patterns (page 140) documents provide examples of different data models. In addition, the MongoDB Use Case Studies1 provide overviews of application design and include example data models with MongoDB.

Data Modeling Introduction (page 131) An introduction to data modeling in MongoDB.

Data Modeling Concepts (page 133) The core documentation detailing the decisions you must make when determining a data model, and discussing considerations that should be taken into account.

Data Model Examples and Patterns (page 140) Examples of possible data models that you can use to structure your MongoDB documents.

Data Model Reference (page 158) Reference material for data modeling for developers of MongoDB applications.

4.1 Data Modeling Introduction

Data in MongoDB has a flexible schema. Unlike SQL databases, where you must determine and declare a table's schema before inserting data, MongoDB's collections do not enforce document structure. This flexibility facilitates the mapping of documents to an entity or an object. Each document can match the data fields of the represented entity, even if the data has substantial variation. In practice, however, the documents in a collection share a similar structure.

The key challenge in data modeling is balancing the needs of the application, the performance characteristics of the database engine, and the data retrieval patterns. When designing data models, always consider the application usage of the data (i.e.
queries, updates, and processing of the data) as well as the inherent structure of the data itself.

4.1.1 Document Structure

The key decision in designing data models for MongoDB applications revolves around the structure of documents and how the application represents relationships between data. There are two tools that allow applications to represent these relationships: references and embedded documents.

1 http://docs.mongodb.org/ecosystem/use-cases
References

References store the relationships between data by including links or references from one document to another. Applications can resolve these references (page 161) to access the related data. Broadly, these are normalized data models.

Figure 4.1: Data model using references to link documents. Both the contact document and the access document contain a reference to the user document.

See Normalized Data Models (page 135) for the strengths and weaknesses of using references.

Embedded Data

Embedded documents capture relationships between data by storing related data in a single document structure. MongoDB documents make it possible to embed document structures as sub-documents in a field or array within a document. These denormalized data models allow applications to retrieve and manipulate related data in a single database operation.

See Embedded Data Models (page 134) for the strengths and weaknesses of embedding sub-documents.

4.1.2 Atomicity of Write Operations

In MongoDB, write operations are atomic at the document level, and no single write operation can atomically affect more than one document or more than one collection.

A denormalized data model with embedded data combines all related data for a represented entity in a single document. This facilitates atomic write operations since a single write operation can insert or update the data for an entity. Normalizing the data would split the data across multiple collections and would require multiple write operations that are not atomic collectively.
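The difference can be sketched in plain JavaScript (hypothetical data; ordinary assignments stand in for MongoDB write operations):

```javascript
// Embedded model: one document carries the related data, so one document-level
// write updates everything together.
var embedded = { _id: 1, name: "Sarah C.", contact: { email: "old@example.com" } };

function updateEmbedded(doc) {
  doc.name = "Sarah D.";
  doc.contact.email = "new@example.com";  // same document, same atomic write
}

// Normalized model: the same change spans two documents, i.e. two separate
// writes that MongoDB does not make atomic as a pair.
var user = { _id: 1, name: "Sarah C." };
var contact = { user_id: 1, email: "old@example.com" };

function updateNormalized(u, c) {
  u.name = "Sarah D.";                    // write 1
  c.email = "new@example.com";            // write 2 (could fail independently)
}

updateEmbedded(embedded);
updateNormalized(user, contact);
// Both approaches reach the same end state; only the embedded update is a
// single document-level operation.
```

The sketch makes the trade-off concrete: embedding keeps related changes inside one atomic write, while normalizing requires the application to coordinate multiple writes.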
Figure 4.2: Data model with embedded fields that contain all related information.

However, schemas that facilitate atomic writes may limit ways that applications can use the data or may limit ways to modify applications. The Atomicity Considerations (page 136) documentation describes the challenge of designing a schema that balances flexibility and atomicity.

4.1.3 Document Growth

Some updates, such as pushing elements to an array or adding new fields, increase a document's size. If the document size exceeds the allocated space for that document, MongoDB relocates the document on disk. The growth consideration can affect the decision to normalize or denormalize data. See Document Growth Considerations (page 136) for more about planning for and managing document growth in MongoDB.

4.1.4 Data Use and Performance

When designing a data model, consider how applications will use your database. For instance, if your application only uses recently inserted documents, consider using Capped Collections (page 196). Or if your application needs are mainly read operations to a collection, adding indexes to support common queries can improve performance.

See Operational Factors and Data Models (page 136) for more information on these and other operational considerations that affect data model designs.

4.2 Data Modeling Concepts

When constructing a data model for your MongoDB collection, there are various options you can choose from, each of which has its strengths and weaknesses. The following sections guide you through key design decisions and detail various considerations for choosing the best data model for your application needs.
For a general introduction to data modeling in MongoDB, see the Data Modeling Introduction (page 131). For example data models, see Data Model Examples and Patterns (page 140).

Data Model Design (page 134) Presents the different strategies that you can choose from when determining your data model, their strengths and their weaknesses.

Operational Factors and Data Models (page 136) Details features you should keep in mind when designing your data model, such as lifecycle management, indexing, horizontal scalability, and document growth.

GridFS (page 138) GridFS is a specification for storing documents that exceed the BSON-document size limit of 16MB.

4.2.1 Data Model Design

Effective data models support your application needs. The key consideration for the structure of your documents is the decision to embed (page 134) or to use references (page 135).

Embedded Data Models

With MongoDB, you may embed related data in a single structure or document. These schemas are generally known as "denormalized" models, and take advantage of MongoDB's rich documents. Consider the following diagram:

Figure 4.3: Data model with embedded fields that contain all related information.

Embedded data models allow applications to store related pieces of information in the same database record. As a result, applications may need to issue fewer queries and updates to complete common operations.

In general, use embedded data models when:

• you have "contains" relationships between entities. See Model One-to-One Relationships with Embedded Documents (page 140).
• you have one-to-many relationships between entities. In these relationships the "many" or child documents always appear with or are viewed in the context of the "one" or parent documents. See Model One-to-Many Relationships with Embedded Documents (page 141).

In general, embedding provides better performance for read operations, as well as the ability to request and retrieve related data in a single database operation. Embedded data models make it possible to update related data in a single atomic write operation.

However, embedding related data in documents may lead to situations where documents grow after creation. Document growth can impact write performance and lead to data fragmentation. See Document Growth (page 136) for details. Furthermore, documents in MongoDB must be smaller than the maximum BSON document size. For bulk binary data, consider GridFS (page 138).

To interact with embedded documents, use dot notation to "reach into" embedded documents. See query for data in arrays (page 90) and query data in sub-documents (page 89) for more examples on accessing data in arrays and embedded documents.

Normalized Data Models

Normalized data models describe relationships using references (page 161) between documents.

Figure 4.4: Data model using references to link documents. Both the contact document and the access document contain a reference to the user document.

In general, use normalized data models:

• when embedding would result in duplication of data but would not provide sufficient read performance advantages to outweigh the implications of the duplication.
• to represent more complex many-to-many relationships.
• to model large hierarchical data sets.

References provide more flexibility than embedding. However, client-side applications must issue follow-up queries to resolve the references. In other words, normalized data models can require more round trips to the server.

See Model One-to-Many Relationships with Document References (page 143) for an example of referencing. For examples of various tree models using references, see Model Tree Structures (page 144).

4.2.2 Operational Factors and Data Models

Modeling application data for MongoDB depends on both the data itself, as well as the characteristics of MongoDB itself. For example, different data models may allow applications to use more efficient queries, increase the throughput of insert and update operations, or distribute activity to a sharded cluster more effectively.

These factors are operational or address requirements that arise outside of the application but impact the performance of MongoDB based applications. When developing a data model, analyze all of your application's read operations (page 55) and write operations (page 67) in conjunction with the following considerations.

Document Growth

Some updates to documents can increase the size of documents. These updates include pushing elements to an array (i.e. $push) and adding new fields to a document. If the document size exceeds the allocated space for that document, MongoDB will relocate the document on disk. Relocating documents takes longer than in-place updates and can lead to fragmented storage. Although MongoDB automatically adds padding to document allocations (page 83) to minimize the likelihood of relocation, data models should avoid document growth when possible.

For instance, if your applications require updates that will cause document growth, you may want to refactor your data model to use references between data in distinct documents rather than a denormalized data model.
MongoDB adaptively adjusts the amount of automatic padding to reduce occurrences of relocation. You may also use a pre-allocation strategy to explicitly avoid document growth. Refer to the Pre-Aggregated Reports Use Case2 for an example of the pre-allocation approach to handling document growth.

See Storage (page 82) for more information on MongoDB's storage model and record allocation strategies.

Atomicity

In MongoDB, operations are atomic at the document level. No single write operation can change more than one document. Operations that modify more than a single document in a collection still operate on one document at a time.3 Ensure that your application stores all fields with atomic dependency requirements in the same document. If the application can tolerate non-atomic updates for two pieces of data, you can store these data in separate documents.

A data model that embeds related data in a single document facilitates these kinds of atomic operations. For data models that store references between related pieces of data, the application must issue separate read and write operations to retrieve and modify these related pieces of data.

See Model Data for Atomic Operations (page 154) for an example data model that provides atomic updates for a single document.

2 http://docs.mongodb.org/ecosystem/use-cases/pre-aggregated-reports
3 Document-level atomic operations include all operations within a single MongoDB document record: operations that affect multiple sub-documents within that single record are still atomic.
Sharding MongoDB uses sharding to provide horizontal scaling. These clusters support deployments with large data sets and high-throughput operations. Sharding allows users to partition a collection within a database to distribute the collection’s documents across a number of mongod instances or shards. To distribute data and application traffic in a sharded collection, MongoDB uses the shard key (page 620). Selecting the proper shard key (page 620) has significant implications for performance, and can enable or prevent query isolation and increased write capacity. It is important to consider carefully the field or fields to use as the shard key. See Sharding Introduction (page 607) and Shard Keys (page 620) for more information. Indexes Use indexes to improve performance for common queries. Build indexes on fields that appear often in queries and for all operations that return sorted results. MongoDB automatically creates a unique index on the _id field. As you create indexes, consider the following behaviors of indexes: • Each index requires at least 8KB of data space. • Adding an index has some negative performance impact for write operations. For collections with high write-to-read ratio, indexes are expensive since each insert must also update any indexes. • Collections with high read-to-write ratio often benefit from additional indexes. Indexes do not affect un-indexed read operations. • When active, each index consumes disk space and memory. This usage can be significant and should be tracked for capacity planning, especially for concerns over working set size. See Indexing Strategies (page 493) for more information on indexes as well as Analyze Query Performance (page 97). Additionally, the MongoDB database profiler (page 210) may help identify inefficient queries.
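The write-side cost of indexes can be illustrated with a small, hypothetical model: each insert writes the document once, plus one additional write per index. The following Python sketch is an in-memory stand-in for illustration only, not MongoDB code.

```python
# Sketch: every insert must also update each index, so write cost grows
# with the number of indexes. Class and method names are illustrative.
from collections import defaultdict

class Collection:
    def __init__(self, indexed_fields):
        self.docs = []
        self.indexes = {f: defaultdict(list) for f in indexed_fields}

    def insert(self, doc):
        self.docs.append(doc)
        writes = 1                          # the document itself
        for field, index in self.indexes.items():
            index[doc.get(field)].append(doc)
            writes += 1                     # one extra write per index
        return writes

coll = Collection(indexed_fields=["_id", "author", "title"])
print(coll.insert({"_id": 1, "author": "a", "title": "t"}))  # 4
```

With three indexes, one logical insert costs four physical writes in this model, which is why write-heavy collections should carry only the indexes their queries need.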
Large Number of Collections In certain situations, you might choose to store related information in several collections rather than in a single collection. Consider a sample collection logs that stores log documents for various environments and applications. The logs collection contains documents of the following form: { log: "dev", ts: ..., info: ... } { log: "debug", ts: ..., info: ...} If the total number of documents is low, you may group documents into collections by type. For logs, consider maintaining distinct log collections, such as logs_dev and logs_debug. The logs_dev collection would contain only the documents related to the dev environment. Generally, having a large number of collections carries no significant performance penalty and can result in very good performance. Distinct collections are very important for high-throughput batch processing. When using models that have a large number of collections, consider the following behaviors: • Each collection has a certain minimum overhead of a few kilobytes. • Each index, including the index on _id, requires at least 8KB of data space. • For each database, a single namespace file (i.e. <database>.ns) stores all metadata for that database, and each index and collection has its own entry in the namespace file. MongoDB places limits on the size of namespace files. 4.2. Data Modeling Concepts 137
• MongoDB has limits on the number of namespaces. You may wish to know the current number of namespaces in order to determine how many additional namespaces the database can support. To get the current number of namespaces, run the following in the mongo shell: db.system.namespaces.count() The limit on the number of namespaces depends on the <database>.ns size. The namespace file defaults to 16 MB. To change the size of the new namespace file, start the server with the option --nssize <new size MB>. For existing databases, after starting up the server with --nssize, run the db.repairDatabase() command from the mongo shell. For impacts and considerations on running db.repairDatabase(), see repairDatabase. Data Lifecycle Management Data modeling decisions should take data lifecycle management into consideration. The Time to Live or TTL feature (page 198) of collections expires documents after a period of time. Consider using the TTL feature if your application requires some data to persist in the database for a limited period of time. Additionally, if your application only uses recently inserted documents, consider Capped Collections (page 196). Capped collections provide first-in-first-out (FIFO) management of inserted documents and efficiently support operations that insert and read documents based on insertion order. 4.2.3 GridFS GridFS is a specification for storing and retrieving files that exceed the BSON-document size limit of 16MB. Instead of storing a file in a single document, GridFS divides a file into parts, or chunks, 4 and stores each of those chunks as a separate document. By default GridFS limits chunk size to 255k. GridFS uses two collections to store files. One collection stores the file chunks, and the other stores file metadata. When you query a GridFS store for a file, the driver or client will reassemble the chunks as needed. You can perform range queries on files stored through GridFS.
You also can access information from arbitrary sections of files, which allows you to “skip” into the middle of a video or audio file. GridFS is useful not only for storing files that exceed 16MB but also for storing any files for which you want access without having to load the entire file into memory. For more information on the indications of GridFS, see When should I use GridFS? (page 693). Changed in version 2.4.10: The default chunk size changed from 256k to 255k. Implement GridFS To store and retrieve files using GridFS, use either of the following: • A MongoDB driver. See the drivers documentation for information on using GridFS with your driver. • The mongofiles command-line tool. See http://docs.mongodb.org/manual/reference/program/mongofiles. 4 The use of the term chunks in the context of GridFS is not related to the use of the term chunks in the context of sharding.
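To make the chunking model concrete, the following Python sketch (illustrative only, not a driver API) splits a byte string into 255k chunks keyed by (files_id, n) and reassembles them in n-order, mirroring how a driver reads a GridFS file.

```python
# Sketch of the GridFS chunking idea: a file is split into 255k pieces,
# each stored as its own "chunk" document keyed by (files_id, n), and
# reassembled in n-order on read. Plain Python, not driver code.
CHUNK_SIZE = 255 * 1024  # default chunk size since 2.4.10

def split_into_chunks(files_id, data, chunk_size=CHUNK_SIZE):
    return [
        {"files_id": files_id, "n": n, "data": data[i:i + chunk_size]}
        for n, i in enumerate(range(0, len(data), chunk_size))
    ]

def reassemble(chunks, files_id):
    # mirrors the (files_id, n) index order used to read chunks back
    mine = sorted((c for c in chunks if c["files_id"] == files_id),
                  key=lambda c: c["n"])
    return b"".join(c["data"] for c in mine)

data = bytes(600 * 1024)            # a 600k "file" -> 3 chunks
chunks = split_into_chunks("f1", data)
print(len(chunks), reassemble(chunks, "f1") == data)  # 3 True
```

Because each chunk carries its sequence number n, a reader can fetch an arbitrary range of a file by computing which n values cover the requested byte range.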
  • 143. MongoDB Documentation, Release 2.6.4 GridFS Collections GridFS stores files in two collections: • chunks stores the binary chunks. For details, see The chunks Collection (page 164). • files stores the file’s metadata. For details, see The files Collection (page 165). GridFS places the collections in a common bucket by prefixing each with the bucket name. By default, GridFS uses two collections with names prefixed by fs bucket: • fs.files • fs.chunks You can choose a different bucket name than fs, and create multiple buckets in a single database. Each document in the chunks collection represents a distinct chunk of a file as represented in the GridFS store. Each chunk is identified by its unique ObjectId stored in its _id field. For descriptions of all fields in the chunks and files collections, see GridFS Reference (page 164). GridFS Index GridFS uses a unique, compound index on the chunks collection for the files_id and n fields. The files_id field contains the _id of the chunk’s “parent” document. The n field contains the sequence number of the chunk. GridFS numbers all chunks, starting with 0. For descriptions of the documents and fields in the chunks collection, see GridFS Reference (page 164). The GridFS index allows efficient retrieval of chunks using the files_id and n values, as shown in the following example: cursor = db.fs.chunks.find({files_id: myFileID}).sort({n:1}); See the relevant driver documentation for the specific behavior of your GridFS application. If your driver does not create this index, issue the following operation using the mongo shell: db.fs.chunks.ensureIndex( { files_id: 1, n: 1 }, { unique: true } ); Example Interface The following is an example of the GridFS interface in Java. The example is for demonstration purposes only. For API specifics, see the relevant driver documentation. By default, the interface must support the default GridFS bucket, named fs, as in the following: // returns default GridFS bucket (i.e. 
"fs" collection) GridFS myFS = new GridFS(myDatabase); // saves the file to "fs" GridFS bucket myFS.createFile(new File("/tmp/largething.mpg")); Optionally, interfaces may support additional GridFS buckets as in the following example: // returns GridFS bucket named "contracts" GridFS myContracts = new GridFS(myDatabase, "contracts"); // retrieve GridFS object "smithco" GridFSDBFile file = myContracts.findOne("smithco");
// saves the GridFS file to the file system file.writeTo(new File("/tmp/smithco.pdf")); 4.3 Data Model Examples and Patterns The following documents provide overviews of various data modeling patterns and common schema design considerations: Model Relationships Between Documents (page 140) Examples for modeling relationships between documents. Model One-to-One Relationships with Embedded Documents (page 140) Presents a data model that uses embedded documents (page 134) to describe one-to-one relationships between connected data. Model One-to-Many Relationships with Embedded Documents (page 141) Presents a data model that uses embedded documents (page 134) to describe one-to-many relationships between connected data. Model One-to-Many Relationships with Document References (page 143) Presents a data model that uses references (page 135) to describe one-to-many relationships between documents. Model Tree Structures (page 144) Examples for modeling tree structures. Model Tree Structures with Parent References (page 146) Presents a data model that organizes documents in a tree-like structure by storing references (page 135) to “parent” nodes in “child” nodes. Model Tree Structures with Child References (page 148) Presents a data model that organizes documents in a tree-like structure by storing references (page 135) to “child” nodes in “parent” nodes. See Model Tree Structures (page 144) for additional examples of data models for tree structures. Model Specific Application Contexts (page 154) Examples for models for specific application contexts. Model Data for Atomic Operations (page 154) Illustrates how embedding fields related to an atomic update within the same document ensures that the fields are in sync. Model Data to Support Keyword Search (page 155) Describes one method for supporting keyword search by storing keywords in an array in the same document as the text field.
Combined with a multi-key index, this pattern can support an application’s keyword search operations. 4.3.1 Model Relationships Between Documents Model One-to-One Relationships with Embedded Documents (page 140) Presents a data model that uses embedded documents (page 134) to describe one-to-one relationships between connected data. Model One-to-Many Relationships with Embedded Documents (page 141) Presents a data model that uses embedded documents (page 134) to describe one-to-many relationships between connected data. Model One-to-Many Relationships with Document References (page 143) Presents a data model that uses references (page 135) to describe one-to-many relationships between documents. Model One-to-One Relationships with Embedded Documents Overview Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how you model data can affect application performance and database capacity. See Data Modeling Concepts (page 133) for a full high-level overview of data modeling in MongoDB.
  • 145. MongoDB Documentation, Release 2.6.4 This document describes a data model that uses embedded (page 134) documents to describe relationships between connected data. Pattern Consider the following example that maps patron and address relationships. The example illustrates the advantage of embedding over referencing if you need to view one data entity in context of the other. In this one-to-one relationship between patron and address data, the address belongs to the patron. In the normalized data model, the address document contains a reference to the patron document. { _id: "joe", name: "Joe Bookreader" } { patron_id: "joe", street: "123 Fake Street", city: "Faketon", state: "MA", zip: "12345" } If the address data is frequently retrieved with the name information, then with referencing, your application needs to issue multiple queries to resolve the reference. The better data model would be to embed the address data in the patron data, as in the following document: { _id: "joe", name: "Joe Bookreader", address: { street: "123 Fake Street", city: "Faketon", state: "MA", zip: "12345" } } With the embedded data model, your application can retrieve the complete patron information with one query. Model One-to-Many Relationships with Embedded Documents Overview Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how you model data can affect application performance and database capacity. See Data Modeling Concepts (page 133) for a full high level overview of data modeling in MongoDB. This document describes a data model that uses embedded (page 134) documents to describe relationships between connected data. 4.3. Data Model Examples and Patterns 141
Pattern Consider the following example that maps patron and multiple address relationships. The example illustrates the advantage of embedding over referencing if you need to view many data entities in context of another. In this one-to-many relationship between patron and address data, the patron has multiple address entities. In the normalized data model, the address documents contain a reference to the patron document. { _id: "joe", name: "Joe Bookreader" } { patron_id: "joe", street: "123 Fake Street", city: "Faketon", state: "MA", zip: "12345" } { patron_id: "joe", street: "1 Some Other Street", city: "Boston", state: "MA", zip: "12345" } If your application frequently retrieves the address data with the name information, then your application needs to issue multiple queries to resolve the references. A better schema would be to embed the address data entities in the patron data, as in the following document: { _id: "joe", name: "Joe Bookreader", addresses: [ { street: "123 Fake Street", city: "Faketon", state: "MA", zip: "12345" }, { street: "1 Some Other Street", city: "Boston", state: "MA", zip: "12345" } ] } With the embedded data model, your application can retrieve the complete patron information with one query.
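The difference between the two models’ read paths can be sketched in plain Python (an in-memory stand-in, not MongoDB syntax): the embedded model returns the complete patron in a single lookup, while the normalized model needs a follow-up query to resolve the address references.

```python
# Sketch: how many lookups each model needs to show a patron with addresses.
# Dicts and lists stand in for collections; not MongoDB code.
embedded = {
    "joe": {"name": "Joe Bookreader",
            "addresses": [{"city": "Faketon"}, {"city": "Boston"}]},
}
patrons = {"joe": {"name": "Joe Bookreader"}}
addresses = [
    {"patron_id": "joe", "city": "Faketon"},
    {"patron_id": "joe", "city": "Boston"},
]

# Embedded model: one lookup returns everything.
doc = embedded["joe"]
lookups_embedded = 1

# Normalized model: one lookup for the patron, plus a second query to
# resolve the address references.
patron = patrons["joe"]
addr = [a for a in addresses if a["patron_id"] == "joe"]
lookups_referenced = 2

print(lookups_embedded, lookups_referenced, len(doc["addresses"]) == len(addr))
# 1 2 True
```

The same trade-off scales with the number of referenced entities: each additional related collection adds another round trip in the normalized model.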
Model One-to-Many Relationships with Document References Overview Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how you model data can affect application performance and database capacity. See Data Modeling Concepts (page 133) for a full high-level overview of data modeling in MongoDB. This document describes a data model that uses references (page 135) between documents to describe relationships between connected data. Pattern Consider the following example that maps publisher and book relationships. The example illustrates the advantage of referencing over embedding to avoid repetition of the publisher information. Embedding the publisher document inside the book document would lead to repetition of the publisher data, as the following documents show: { title: "MongoDB: The Definitive Guide", author: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English", publisher: { name: "O'Reilly Media", founded: 1980, location: "CA" } } { title: "50 Tips and Tricks for MongoDB Developers", author: "Kristina Chodorow", published_date: ISODate("2011-05-06"), pages: 68, language: "English", publisher: { name: "O'Reilly Media", founded: 1980, location: "CA" } } To avoid repetition of the publisher data, use references and keep the publisher information in a separate collection from the book collection. When using references, the growth of the relationships determines where to store the reference. If the number of books per publisher is small with limited growth, storing the book reference inside the publisher document may sometimes be useful. Otherwise, if the number of books per publisher is unbounded, this data model would lead to mutable, growing arrays, as in the following example: { name: "O'Reilly Media", founded: 1980, location: "CA",
books: [123456789, 234567890, ...] } { _id: 123456789, title: "MongoDB: The Definitive Guide", author: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English" } { _id: 234567890, title: "50 Tips and Tricks for MongoDB Developers", author: "Kristina Chodorow", published_date: ISODate("2011-05-06"), pages: 68, language: "English" } To avoid mutable, growing arrays, store the publisher reference inside the book document: { _id: "oreilly", name: "O'Reilly Media", founded: 1980, location: "CA" } { _id: 123456789, title: "MongoDB: The Definitive Guide", author: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English", publisher_id: "oreilly" } { _id: 234567890, title: "50 Tips and Tricks for MongoDB Developers", author: "Kristina Chodorow", published_date: ISODate("2011-05-06"), pages: 68, language: "English", publisher_id: "oreilly" } 4.3.2 Model Tree Structures MongoDB allows various ways to use tree data structures to model large hierarchical or nested data relationships. Model Tree Structures with Parent References (page 146) Presents a data model that organizes documents in a tree-like structure by storing references (page 135) to “parent” nodes in “child” nodes.
Figure 4.5: Tree data model for a sample hierarchy of categories.
Model Tree Structures with Child References (page 148) Presents a data model that organizes documents in a tree-like structure by storing references (page 135) to “child” nodes in “parent” nodes. Model Tree Structures with an Array of Ancestors (page 149) Presents a data model that organizes documents in a tree-like structure by storing references (page 135) to “parent” nodes and an array that stores all ancestors. Model Tree Structures with Materialized Paths (page 151) Presents a data model that organizes documents in a tree-like structure by storing full relationship paths between documents. In addition to the tree node, each document stores the _id of the node’s ancestors or path as a string. Model Tree Structures with Nested Sets (page 153) Presents a data model that organizes documents in a tree-like structure using the Nested Sets pattern. This optimizes discovering subtrees at the expense of tree mutability. Model Tree Structures with Parent References Overview Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how you model data can affect application performance and database capacity. See Data Modeling Concepts (page 133) for a full high-level overview of data modeling in MongoDB. This document describes a data model that describes a tree-like structure in MongoDB documents by storing references (page 135) to “parent” nodes in child nodes. Pattern The Parent References pattern stores each tree node in a document; in addition to the tree node, the document stores the id of the node’s parent.
Consider the following hierarchy of categories: The following example models the tree using Parent References, storing the reference to the parent category in the field parent: db.categories.insert( { _id: "MongoDB", parent: "Databases" } ) db.categories.insert( { _id: "dbm", parent: "Databases" } ) db.categories.insert( { _id: "Databases", parent: "Programming" } ) db.categories.insert( { _id: "Languages", parent: "Programming" } ) db.categories.insert( { _id: "Programming", parent: "Books" } ) db.categories.insert( { _id: "Books", parent: null } ) • The query to retrieve the parent of a node is fast and straightforward: db.categories.findOne( { _id: "MongoDB" } ).parent • You can create an index on the field parent to enable fast search by the parent node: db.categories.ensureIndex( { parent: 1 } ) • You can query by the parent field to find its immediate child nodes: db.categories.find( { parent: "Databases" } ) The Parent References pattern provides a simple solution to tree storage but requires multiple queries to retrieve subtrees.
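The cost of subtree retrieval under Parent References can be illustrated outside of MongoDB. The following Python sketch is an in-memory stand-in (helper names are hypothetical, not driver calls): children_of plays the role of a db.categories.find( { parent: ... } ) query, and the traversal counts how many such queries a full subtree requires.

```python
# Sketch: with Parent References, collecting a whole subtree takes one
# child-lookup query per visited node. Plain Python, not MongoDB code.
categories = [
    {"_id": "MongoDB", "parent": "Databases"},
    {"_id": "dbm", "parent": "Databases"},
    {"_id": "Databases", "parent": "Programming"},
    {"_id": "Languages", "parent": "Programming"},
    {"_id": "Programming", "parent": "Books"},
    {"_id": "Books", "parent": None},
]

def children_of(node):  # stands in for db.categories.find({parent: node})
    return [c["_id"] for c in categories if c["parent"] == node]

def subtree(root):
    result, frontier, queries = [], [root], 0
    while frontier:
        node = frontier.pop(0)
        frontier.extend(children_of(node))
        queries += 1                       # one "find" per visited node
        result.append(node)
    return result, queries

nodes, queries = subtree("Programming")
print(sorted(nodes), queries)
# ['Databases', 'Languages', 'MongoDB', 'Programming', 'dbm'] 5
```

Five queries for a five-node subtree is exactly the drawback the text describes; the Array of Ancestors and Materialized Paths patterns below retrieve the same subtree with a single indexed query.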
Figure 4.6: Tree data model for a sample hierarchy of categories.
Model Tree Structures with Child References Overview Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how you model data can affect application performance and database capacity. See Data Modeling Concepts (page 133) for a full high-level overview of data modeling in MongoDB. This document describes a data model that describes a tree-like structure in MongoDB documents by storing references (page 135) to child nodes in parent nodes. Pattern The Child References pattern stores each tree node in a document; in addition to the tree node, the document stores in an array the id(s) of the node’s children. Consider the following hierarchy of categories: Figure 4.7: Tree data model for a sample hierarchy of categories. The following example models the tree using Child References, storing the reference to the node’s children in the field children: db.categories.insert( { _id: "MongoDB", children: [] } ) db.categories.insert( { _id: "dbm", children: [] } ) db.categories.insert( { _id: "Databases", children: [ "MongoDB", "dbm" ] } )
  • 153. MongoDB Documentation, Release 2.6.4 db.categories.insert( { _id: "Languages", children: [] } ) db.categories.insert( { _id: "Programming", children: [ "Databases", "Languages" ] } ) db.categories.insert( { _id: "Books", children: [ "Programming" ] } ) • The query to retrieve the immediate children of a node is fast and straightforward: db.categories.findOne( { _id: "Databases" } ).children • You can create an index on the field children to enable fast search by the child nodes: db.categories.ensureIndex( { children: 1 } ) • You can query for a node in the children field to find its parent node as well as its siblings: db.categories.find( { children: "MongoDB" } ) The Child References pattern provides a suitable solution to tree storage as long as no operations on subtrees are necessary. This pattern may also provide a suitable solution for storing graphs where a node may have multiple parents. Model Tree Structures with an Array of Ancestors Overview Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how you model data can affect application performance and database capacity. See Data Modeling Concepts (page 133) for a full high level overview of data modeling in MongoDB. This document describes a data model that describes a tree-like structure in MongoDB documents using references (page 135) to parent nodes and an array that stores all ancestors. Pattern The Array of Ancestors pattern stores each tree node in a document; in addition to the tree node, document stores in an array the id(s) of the node’s ancestors or path. Consider the following hierarchy of categories: The following example models the tree using Array of Ancestors. 
In addition to the ancestors field, these documents also store the reference to the immediate parent category in the parent field: db.categories.insert( { _id: "MongoDB", ancestors: [ "Books", "Programming", "Databases" ], parent: "Databases" } ) db.categories.insert( { _id: "dbm", ancestors: [ "Books", "Programming", "Databases" ], parent: "Databases" } ) db.categories.insert( { _id: "Databases", ancestors: [ "Books", "Programming" ], parent: "Programming" } ) db.categories.insert( { _id: "Languages", ancestors: [ "Books", "Programming" ], parent: "Programming" } ) db.categories.insert( { _id: "Programming", ancestors: [ "Books" ], parent: "Books" } ) db.categories.insert( { _id: "Books", ancestors: [ ], parent: null } ) • The query to retrieve the ancestors or path of a node is fast and straightforward: db.categories.findOne( { _id: "MongoDB" } ).ancestors • You can create an index on the field ancestors to enable fast search by the ancestors nodes: db.categories.ensureIndex( { ancestors: 1 } ) • You can query by the field ancestors to find all its descendants:
Figure 4.8: Tree data model for a sample hierarchy of categories.
db.categories.find( { ancestors: "Programming" } ) The Array of Ancestors pattern provides a fast and efficient solution to find the descendants and the ancestors of a node by creating an index on the elements of the ancestors field. This makes Array of Ancestors a good choice for working with subtrees. The Array of Ancestors pattern is slightly slower than the Materialized Paths (page 151) pattern but is more straightforward to use. Model Tree Structures with Materialized Paths Overview Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how you model data can affect application performance and database capacity. See Data Modeling Concepts (page 133) for a full high-level overview of data modeling in MongoDB. This document describes a data model that describes a tree-like structure in MongoDB documents by storing full relationship paths between documents. Pattern The Materialized Paths pattern stores each tree node in a document; in addition to the tree node, the document stores as a string the id(s) of the node’s ancestors or path. Although the Materialized Paths pattern requires additional steps of working with strings and regular expressions, the pattern also provides more flexibility in working with the path, such as finding nodes by partial paths.
Consider the following hierarchy of categories: The following example models the tree using Materialized Paths, storing the path in the field path; the path string uses the comma , as a delimiter: db.categories.insert( { _id: "Books", path: null } ) db.categories.insert( { _id: "Programming", path: ",Books," } ) db.categories.insert( { _id: "Databases", path: ",Books,Programming," } ) db.categories.insert( { _id: "Languages", path: ",Books,Programming," } ) db.categories.insert( { _id: "MongoDB", path: ",Books,Programming,Databases," } ) db.categories.insert( { _id: "dbm", path: ",Books,Programming,Databases," } ) • You can query to retrieve the whole tree, sorting by the field path: db.categories.find().sort( { path: 1 } ) • You can use regular expressions on the path field to find the descendants of Programming: db.categories.find( { path: /,Programming,/ } ) • You can also retrieve the descendants of Books where the Books is also at the topmost level of the hierarchy: db.categories.find( { path: /^,Books,/ } ) • To create an index on the field path use the following invocation: db.categories.ensureIndex( { path: 1 } ) This index may improve performance depending on the query: 4.3. Data Model Examples and Patterns 151
Figure 4.9: Tree data model for a sample hierarchy of categories.
– For queries of the Books subtree (e.g. /^,Books,/) an index on the path field improves the query performance significantly. – For queries of the Programming subtree (e.g. /,Programming,/), or similar queries of subtrees where the node might be in the middle of the indexed string, the query must inspect the entire index. For these queries an index may provide some performance improvement if the index is significantly smaller than the entire collection. Model Tree Structures with Nested Sets Overview Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how you model data can affect application performance and database capacity. See Data Modeling Concepts (page 133) for a full high-level overview of data modeling in MongoDB. This document describes a data model that describes a tree-like structure that optimizes discovering subtrees at the expense of tree mutability. Pattern The Nested Sets pattern identifies each node in the tree as stops in a round-trip traversal of the tree. The application visits each node in the tree twice; first during the initial trip, and second during the return trip. The Nested Sets pattern stores each tree node in a document; in addition to the tree node, the document stores the id of the node’s parent, the node’s initial stop in the left field, and its return stop in the right field. Consider the following hierarchy of categories: Figure 4.10: Example of hierarchical data. The numbers identify the stops at nodes during a round-trip traversal of a tree.
The following example models the tree using Nested Sets: db.categories.insert( { _id: "Books", parent: 0, left: 1, right: 12 } ) db.categories.insert( { _id: "Programming", parent: "Books", left: 2, right: 11 } ) db.categories.insert( { _id: "Languages", parent: "Programming", left: 3, right: 4 } ) db.categories.insert( { _id: "Databases", parent: "Programming", left: 5, right: 10 } ) db.categories.insert( { _id: "MongoDB", parent: "Databases", left: 6, right: 7 } ) db.categories.insert( { _id: "dbm", parent: "Databases", left: 8, right: 9 } ) You can query to retrieve the descendants of a node: var databaseCategory = db.categories.findOne( { _id: "Databases" } ); db.categories.find( { left: { $gt: databaseCategory.left }, right: { $lt: databaseCategory.right } } ) The Nested Sets pattern provides a fast and efficient solution for finding subtrees but is inefficient for modifying the tree structure. As such, this pattern is best for static trees that do not change. 4.3.3 Model Specific Application Contexts Model Data for Atomic Operations (page 154) Illustrates how embedding fields related to an atomic update within the same document ensures that the fields are in sync. Model Data to Support Keyword Search (page 155) Describes one method for supporting keyword search by storing keywords in an array in the same document as the text field. Combined with a multi-key index, this pattern can support an application’s keyword search operations. Model Monetary Data (page 156) Describes two methods to model monetary data in MongoDB. Model Data for Atomic Operations Pattern In MongoDB, write operations, e.g. db.collection.update(), db.collection.findAndModify(), db.collection.remove(), are atomic on the level of a single document. For fields that must be updated together, embedding the fields within the same document ensures that the fields can be updated atomically.
For example, consider a situation where you need to maintain information on books, including the number of copies available for checkout as well as the current checkout information. The available copies of the book and the checkout information should be in sync. As such, embedding the available field and the checkout field within the same document ensures that you can update the two fields atomically.

    {
      _id: 123456789,
      title: "MongoDB: The Definitive Guide",
      author: [ "Kristina Chodorow", "Mike Dirolf" ],
      published_date: ISODate("2010-09-24"),
      pages: 216,
      language: "English",
      publisher_id: "oreilly",
      available: 3,
      checkout: [ { by: "joe", date: ISODate("2012-10-15") } ]
    }

Then, to update with new checkout information, you can use the db.collection.update() method to atomically update both the available field and the checkout field:

    db.books.update(
      { _id: 123456789, available: { $gt: 0 } },
      {
        $inc: { available: -1 },
        $push: { checkout: { by: "abc", date: new Date() } }
      }
    )

The operation returns a WriteResult() object that contains information on the status of the operation:

    WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

The nMatched field shows that 1 document matched the update condition, and nModified shows that the operation updated 1 document. If no document matched the update condition, then nMatched and nModified would be 0, which would indicate that you could not check out the book.

Model Data to Support Keyword Search

Note: Keyword search is not the same as text search or full text search, and does not provide stemming or other text-processing features. See the Limitations of Keyword Indexes (page 156) section for more information. As of version 2.4, MongoDB provides a text search feature. See Text Indexes (page 454) for more information.

If your application needs to perform queries on the content of a field that holds text, you can perform exact matches on the text or use $regex for regular expression pattern matches. However, for many operations on text, these methods do not satisfy application requirements.

This pattern describes one method for supporting keyword search in MongoDB, using keywords stored in an array in the same document as the text field. Combined with a multi-key index (page 442), this pattern can support an application's keyword search operations.

Pattern

To add structure to your documents to support keyword-based queries, create an array field in your documents and add the keywords as strings in the array. You can then create a multi-key index (page 442) on the array and create queries that select values from the array.

Example

Consider a collection of library volumes for which you want to provide topic-based search.
For each volume, you add the array topics, and you add as many keywords as needed for a given volume. For the Moby-Dick volume you might have the following document:

    {
      title : "Moby-Dick",
      author : "Herman Melville",
      published : 1851,
      ISBN : 0451526996,
      topics : [ "whaling", "allegory", "revenge", "American",
                 "novel", "nautical", "voyage", "Cape Cod" ]
    }

You then create a multi-key index on the topics array:

    db.volumes.ensureIndex( { topics: 1 } )

The multi-key index creates separate index entries for each keyword in the topics array. For example, the index contains one entry for whaling and another for allegory.

You then query based on the keywords. For example:

    db.volumes.findOne( { topics : "voyage" }, { title: 1 } )

Note: An array with a large number of elements, such as one with several hundred or thousand keywords, will incur greater indexing costs on insertion.

Limitations of Keyword Indexes

MongoDB can support keyword searches using specific data models and multi-key indexes (page 442); however, these keyword indexes are not sufficient or comparable to full-text products in the following respects:

• Stemming. Keyword queries in MongoDB cannot parse keywords for root or related words.
• Synonyms. Keyword-based search features must provide support for synonym or related-word queries in the application layer.
• Ranking. The keyword lookups described in this document do not provide a way to weight results.
• Asynchronous Indexing. MongoDB builds indexes synchronously, which means that the indexes used for keyword indexes are always current and can operate in real time. However, asynchronous bulk indexes may be more efficient for some kinds of content and workloads.

Model Monetary Data

Overview

MongoDB stores numeric data as either IEEE 754 standard 64-bit floating point numbers or as 32-bit or 64-bit signed integers. Applications that handle monetary data often require capturing fractional units of currency. However, arithmetic on floating point numbers, as implemented in modern hardware, often does not conform to requirements for monetary arithmetic. In addition, some fractional numeric quantities, such as one third and one tenth, have no exact representation in binary floating point numbers.
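As a quick illustration of the representation problem described above (an added sketch, not from the manual; the function name sumCents is hypothetical):

```javascript
// Demonstrates why binary floating point is unsuitable for money:
// 0.1 has no exact binary representation, so repeated sums accumulate error.
function sumCents(times) {
  let total = 0;
  for (let i = 0; i < times; i++) total += 0.1; // add "ten cents" repeatedly
  return total;
}

const drift = sumCents(3);
console.log(drift === 0.3);          // false: the sum is 0.30000000000000004
console.log(Math.abs(drift - 0.3));  // tiny but nonzero error
```

The error is small, but for monetary data even a small error makes exact equality matches and audits unreliable, which motivates the two models that follow.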
Note: Arithmetic mentioned on this page refers to server-side arithmetic performed by mongod or mongos, and not to client-side arithmetic.

This document describes two ways to model monetary data in MongoDB:

• Exact Precision (page 157), which multiplies the monetary value by a power of 10.
• Arbitrary Precision (page 157), which uses two fields for the value: one field to store the exact monetary value as a non-numeric type and another field to store a floating point approximation of the value.

Use Cases for Exact Precision Model

If you regularly need to perform server-side arithmetic on monetary data, the exact precision model may be appropriate. For instance:

• If you need to query the database for exact, mathematically valid matches, use Exact Precision (page 157).
• If you need to be able to do server-side arithmetic, e.g., $inc, $mul, and aggregation framework arithmetic, use Exact Precision (page 157).

Use Cases for Arbitrary Precision Model

If there is no need to perform server-side arithmetic on monetary data, the arbitrary precision model may be suitable. For instance:

• If you need to handle an arbitrary or unforeseen number of digits of precision, see Arbitrary Precision (page 157).
• If server-side approximations are sufficient, possibly with client-side post-processing, see Arbitrary Precision (page 157).

Exact Precision

To model monetary data using the exact precision model:

1. Determine the maximum precision needed for the monetary value. For example, your application may require precision down to the tenth of one cent for monetary values in USD currency.

2. Convert the monetary value into an integer by multiplying the value by a power of 10 that ensures the maximum precision needed becomes the least significant digit of the integer. For example, if the required maximum precision is the tenth of one cent, multiply the monetary value by 1000.

3. Store the converted monetary value.

For example, the following scales 9.99 USD by 1000 to preserve precision up to one tenth of a cent:

    { price: 9990, currency: "USD" }

The model assumes that for a given currency value:

• The scale factor is consistent for a currency; i.e., the same scaling factor applies to every value in a given currency.
• The scale factor is a constant and known property of the currency; i.e., applications can determine the scale factor from the currency.

When using this model, applications must be consistent in performing the appropriate scaling of the values. For use cases of this model, see Use Cases for Exact Precision Model (page 156).
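The scaling step above can be sketched client-side as follows (the function names and the fixed factor of 1000 are illustrative assumptions, not part of the manual):

```javascript
// Scale factor of 1000 gives precision to a tenth of a cent (an assumed policy;
// per the model, applications must apply this factor consistently per currency).
const SCALE = 1000;

// Convert a decimal monetary amount to its scaled integer representation.
// Math.round guards against float artifacts such as 9.99 * 1000 !== 9990 exactly.
function toScaled(amount) {
  return Math.round(amount * SCALE);
}

// Convert a stored scaled integer back to a decimal amount for display.
function fromScaled(stored) {
  return stored / SCALE;
}

console.log(toScaled(9.99));   // 9990, as in { price: 9990, currency: "USD" }
console.log(fromScaled(9990)); // 9.99
```

Because the stored value is an integer, server-side operators such as $inc work exactly; e.g., a one-dollar discount is $inc: { price: -toScaled(1.00) }.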
Arbitrary Precision

To model monetary data using the arbitrary precision model, store the value in two fields:

1. In one field, encode the exact monetary value as a non-numeric data type; e.g., BinData or a string.

2. In the second field, store a double-precision floating point approximation of the exact value.

The following example uses the arbitrary precision model to store 9.99 USD for the price and 0.25 USD for the fee:

    {
      price: { display: "9.99", approx: 9.9900000000000002, currency: "USD" },
      fee: { display: "0.25", approx: 0.2499999999999999, currency: "USD" }
    }
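One way to sketch the client-side post-processing this model requires (an illustrative assumption; the document shape follows the example above, and the helper names are hypothetical):

```javascript
// Documents shaped as in the example: an exact string plus a float approximation.
const docs = [
  { price: { display: "9.99", approx: 9.9900000000000002, currency: "USD" } },
  { price: { display: "10.00", approx: 10.0, currency: "USD" } },
];

// Server side, a range query would use the numeric approximation, e.g.
// db.items.find({ "price.approx": { $lt: 10 } }). The approximation may admit
// borderline documents, so the client re-checks the exact (string) value.
function exactlyUnder(docs, limitDisplay) {
  // Compare as scaled integers parsed from the decimal strings, avoiding
  // float error; assumes two decimal places in the display strings.
  const toInt = (s) => Math.round(parseFloat(s) * 100);
  return docs.filter((d) => toInt(d.price.display) < toInt(limitDisplay));
}

console.log(exactlyUnder(docs, "10.00").length); // 1: only the 9.99 document
```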
With some care, applications can perform range and sort queries on the field with the numeric approximation. However, the use of the approximation field for query and sort operations requires that applications perform client-side post-processing to decode the non-numeric representation of the exact value and then filter the returned documents based on the exact monetary value.

For use cases of this model, see Use Cases for Arbitrary Precision Model (page 157).

4.4 Data Model Reference

Documents (page 158) MongoDB stores all data in documents, which are JSON-style data structures composed of field-and-value pairs.

Database References (page 161) Discusses manual references and DBRefs, which MongoDB can use to represent relationships between documents.

GridFS Reference (page 164) Convention for storing large files in a MongoDB database.

ObjectId (page 165) A 12-byte BSON type that MongoDB uses as the default value for its documents' _id field if the _id field is not specified.

BSON Types (page 167) Outlines the unique BSON types used by MongoDB. See bsonspec.org (http://bsonspec.org/) for the complete BSON specification.

4.4.1 Documents

MongoDB stores all data in documents, which are JSON-style data structures composed of field-and-value pairs:

    { "item": "pencil", "qty": 500, "type": "no.2" }

Most user-accessible data structures in MongoDB are documents, including:

• All database records.
• Query selectors (page 55), which define what records to select for read, update, and delete operations.
• Update definitions (page 67), which define what fields to modify during an update.
• Index specifications (page 436), which define what fields to index.
• Data output by MongoDB for reporting and configuration, such as the output of serverStatus and the replica set configuration document (page 594).

Document Format

MongoDB stores documents on disk in the BSON serialization format. BSON is a binary representation of JSON documents, though it contains more data types than JSON. For the BSON spec, see bsonspec.org (http://bsonspec.org/). See also BSON Types (page 167).

The mongo JavaScript shell and the MongoDB language drivers translate between BSON and the language-specific document representation.
Document Structure

MongoDB documents are composed of field-and-value pairs and have the following structure:

    {
      field1: value1,
      field2: value2,
      field3: value3,
      ...
      fieldN: valueN
    }

The value of a field can be any of the BSON data types (page 167), including other documents, arrays, and arrays of documents. The following document contains values of varying types:

    var mydoc = {
      _id: ObjectId("5099803df3f4948bd2f98391"),
      name: { first: "Alan", last: "Turing" },
      birth: new Date('Jun 23, 1912'),
      death: new Date('Jun 07, 1954'),
      contribs: [ "Turing machine", "Turing test", "Turingery" ],
      views : NumberLong(1250000)
    }

The above fields have the following data types:

• _id holds an ObjectId.
• name holds a subdocument that contains the fields first and last.
• birth and death hold values of the Date type.
• contribs holds an array of strings.
• views holds a value of the NumberLong type.

Field Names

Field names are strings. Documents (page 158) have the following restrictions on field names:

• The field name _id is reserved for use as a primary key; its value must be unique in the collection, is immutable, and may be of any type other than an array.
• Field names cannot start with the dollar sign ($) character.
• Field names cannot contain the dot (.) character.
• Field names cannot contain the null character.

BSON documents may have more than one field with the same name. Most MongoDB interfaces, however, represent MongoDB documents with a structure (e.g. a hash table) that does not support duplicate field names. If you need to manipulate documents that have more than one field with the same name, see the documentation for your driver. Some documents created by internal MongoDB processes may have duplicate fields, but no MongoDB process will ever add duplicate fields to an existing user document.
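The field-name restrictions above can be captured in a small client-side check (the helper name is an assumption; drivers and the server perform their own validation):

```javascript
// Checks a field name against the documented restrictions:
// no leading '$', no '.', and no null character.
function isValidFieldName(name) {
  if (typeof name !== "string" || name.length === 0) return false;
  if (name.startsWith("$")) return false;     // cannot start with dollar sign
  if (name.includes(".")) return false;       // cannot contain a dot
  if (name.includes("\u0000")) return false;  // cannot contain the null character
  return true;
}

console.log(isValidFieldName("qty"));          // true
console.log(isValidFieldName("$inc"));         // false: leading dollar sign
console.log(isValidFieldName("name.first"));   // false: contains a dot
```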
Field Value Limit

For indexed collections (page 431), the values for the indexed fields have a Maximum Index Key Length limit. See Maximum Index Key Length for details.

Document Limitations

Documents have the following attributes:

Document Size Limit

The maximum BSON document size is 16 megabytes. The maximum document size helps ensure that a single document cannot use an excessive amount of RAM or, during transmission, an excessive amount of bandwidth. To store documents larger than the maximum size, MongoDB provides the GridFS API. See mongofiles and the documentation for your driver for more information about GridFS.

Document Field Order

MongoDB preserves the order of the document fields following write operations except for the following cases:

• The _id field is always the first field in the document.
• Updates that include renaming of field names may result in the reordering of fields in the document.

Changed in version 2.6: Starting in version 2.6, MongoDB actively attempts to preserve the field order in a document. Before version 2.6, MongoDB did not actively preserve the order of the fields in a document.

The _id Field

The _id field has the following behavior and constraints:

• By default, MongoDB creates a unique index on the _id field during the creation of a collection.
• The _id field is always the first field in the documents. If the server receives a document that does not have the _id field first, then the server will move the field to the beginning.
• The _id field may contain values of any BSON data type (page 167), other than an array.

Warning: To ensure functioning replication, do not store values that are of the BSON regular expression type in the _id field.

The following are common options for storing values for _id:

• Use an ObjectId (page 165).
• Use a natural unique identifier, if available. This saves space and avoids an additional index.
• Generate an auto-incrementing number. See Create an Auto-Incrementing Sequence Field (page 113).
• Generate a UUID in your application code. For more efficient storage of the UUID values in the collection and in the _id index, store the UUID as a value of the BSON BinData type. Index keys of the BinData type are more efficiently stored in the index if:
  – the binary subtype value is in the range of 0-7 or 128-135, and
  – the length of the byte array is: 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, or 32.
• Use your driver's BSON UUID facility to generate UUIDs. Be aware that driver implementations may implement UUID serialization and deserialization logic differently, which may not be fully compatible with other drivers. See your driver documentation (http://api.mongodb.org/) for information concerning UUID interoperability.

Note: Most MongoDB driver clients will include the _id field and generate an ObjectId before sending the insert operation to MongoDB; however, if the client sends a document without an _id field, the mongod will add the _id field and generate the ObjectId.

Dot Notation

MongoDB uses dot notation to access the elements of an array and to access the fields of a subdocument.

To access an element of an array by the zero-based index position, concatenate the array name with the dot (.) and the zero-based index position, and enclose the result in quotes:

    '<array>.<index>'

To access a field of a subdocument with dot notation, concatenate the subdocument name with the dot (.) and the field name, and enclose the result in quotes:

    '<subdocument>.<field>'

See also:

• Embedded Documents (page 89) for dot notation examples with subdocuments.
• Arrays (page 90) for dot notation examples with arrays.

4.4.2 Database References

MongoDB does not support joins. In MongoDB some data is denormalized, or stored with related data in documents, to remove the need for joins. However, in some cases it makes sense to store related information in separate documents, typically in different collections or databases.

MongoDB applications use one of two methods for relating documents:

1. Manual references (page 162), where you save the _id field of one document in another document as a reference. Then your application can run a second query to return the related data. These references are simple and sufficient for most use cases.

2. DBRefs (page 162), which are references from one document to another using the value of the first document's _id field, collection name, and, optionally, its database name. By including these names, DBRefs allow documents located in multiple collections to be more easily linked with documents from a single collection. To resolve DBRefs, your application must perform additional queries to return the referenced documents. Many drivers have helper methods that form the query for the DBRef automatically, but the drivers do not automatically resolve DBRefs into documents. (Some community-supported drivers may have alternate behavior and may resolve a DBRef into a document automatically.)

DBRefs provide a common format and type to represent relationships among documents. The DBRef format also provides common semantics for representing links between documents if your database must interact with multiple frameworks and tools.

Unless you have a compelling reason to use DBRefs, use manual references instead.
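Resolving a manual reference (method 1 above) amounts to two lookups; a client-side sketch, where the in-memory arrays stand in for the places and people collections and the helper name is illustrative:

```javascript
// In-memory stand-ins for two collections related by a manual reference.
const places = [
  { _id: "oid-1", name: "Broadway Center", url: "bc.example.net" },
];
const people = [
  { name: "Erin", places_id: "oid-1", url: "bc.example.net/Erin" },
];

// First lookup: fetch the person; second lookup: follow the stored _id.
// Against a real server these would be two find() calls.
function findPlaceForPerson(name) {
  const person = people.find((p) => p.name === name);
  if (!person) return null;
  return places.find((pl) => pl._id === person.places_id) || null;
}

console.log(findPlaceForPerson("Erin").name); // "Broadway Center"
```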
Manual References

Background

Using manual references is the practice of including one document's _id field in another document. The application can then issue a second query to resolve the referenced fields as needed.

Process

Consider the following operation to insert two documents, using the _id field of the first document as a reference in the second document:

    original_id = ObjectId()

    db.places.insert({
      "_id": original_id,
      "name": "Broadway Center",
      "url": "bc.example.net"
    })

    db.people.insert({
      "name": "Erin",
      "places_id": original_id,
      "url": "bc.example.net/Erin"
    })

Then, when a query returns the document from the people collection, you can, if needed, make a second query for the document referenced by the places_id field in the places collection.

Use

For nearly every case where you want to store a relationship between two documents, use manual references (page 162). The references are simple to create and your application can resolve references as needed.

The only limitation of manual linking is that these references do not convey the database and collection names. If you have documents in a single collection that relate to documents in more than one collection, you may need to consider using DBRefs (page 162).

DBRefs

Background

DBRefs are a convention for representing a document, rather than a specific reference type. They include the name of the collection, and in some cases the database name, in addition to the value from the _id field.

Format

DBRefs have the following fields:

$ref The $ref field holds the name of the collection where the referenced document resides.

$id The $id field contains the value of the _id field in the referenced document.

$db Optional. Contains the name of the database where the referenced document resides. Only some drivers support $db references.

Example

DBRef documents resemble the following document:

    { "$ref" : <value>, "$id" : <value>, "$db" : <value> }

Consider a document from a collection that stores a DBRef in a creator field:

    {
      "_id" : ObjectId("5126bbf64aed4daf9e2ab771"),
      // .. application fields
      "creator" : {
        "$ref" : "creators",
        "$id" : ObjectId("5126bc054aed4daf9e2ab772"),
        "$db" : "users"
      }
    }

The DBRef in this example points to a document in the creators collection of the users database that has ObjectId("5126bc054aed4daf9e2ab772") in its _id field.

Note: The order of fields in the DBRef matters, and you must use the above sequence when using a DBRef.

Support

C++ The C++ driver contains no support for DBRefs. You can traverse references manually.

C# The C# driver provides access to DBRef objects with the MongoDBRef class and supplies the FetchDBRef method for accessing these objects (http://api.mongodb.org/csharp/).

Java The DBRef class (http://api.mongodb.org/java/current/com/mongodb/DBRef.html) provides support for DBRefs from Java.

JavaScript The mongo shell's JavaScript interface provides a DBRef.

Perl The Perl driver contains no support for DBRefs. You can traverse references manually or use the MongoDBx::AutoDeref CPAN module (http://search.cpan.org/dist/MongoDBx-AutoDeref/).

PHP The PHP driver supports DBRefs, including the optional $db reference, through the MongoDBRef class (http://www.php.net/manual/en/class.mongodbref.php/).

Python The Python driver provides the DBRef class and the dereference method for interacting with DBRefs (http://api.mongodb.org/python/current/api/bson/dbref.html).
Ruby The Ruby driver supports DBRefs using the DBRef class and the dereference method (http://api.mongodb.org/ruby/current/).

Use

In most cases you should use the manual reference (page 162) method for connecting two or more related documents. However, if you need to reference documents from multiple collections, consider using DBRefs.

4.4.3 GridFS Reference

GridFS stores files in two collections:

• chunks stores the binary chunks. For details, see The chunks Collection (page 164).
• files stores the file's metadata. For details, see The files Collection (page 165).

GridFS places the collections in a common bucket by prefixing each with the bucket name. By default, GridFS uses two collections with names prefixed by the fs bucket:

• fs.files
• fs.chunks

You can choose a different bucket name than fs, and create multiple buckets in a single database.

See also: GridFS (page 138) for more information about GridFS.

The chunks Collection

Each document in the chunks collection represents a distinct chunk of a file as represented in the GridFS store. The following is a prototype document from the chunks collection:

    {
      "_id" : <ObjectId>,
      "files_id" : <ObjectId>,
      "n" : <num>,
      "data" : <binary>
    }

A document from the chunks collection contains the following fields:

chunks._id The unique ObjectId of the chunk.

chunks.files_id The _id of the "parent" document, as specified in the files collection.

chunks.n The sequence number of the chunk. GridFS numbers all chunks, starting with 0.

chunks.data The chunk's payload as a BSON binary type.

The chunks collection uses a compound index on files_id and n, as described in GridFS Index (page 139).
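As an illustration of how a file maps onto chunk documents (the helper and the zero-length behavior are assumptions for this sketch; the 255-kilobyte default chunk size comes from the files collection reference):

```javascript
// Number of chunk documents needed for a file of a given length, using the
// default 255 KB chunk size (255 * 1024 bytes).
const DEFAULT_CHUNK_SIZE = 255 * 1024;

function chunkCount(fileLength, chunkSize = DEFAULT_CHUNK_SIZE) {
  if (fileLength === 0) return 0; // assumption: an empty file stores no chunks
  return Math.ceil(fileLength / chunkSize);
}

// A 1 MB file (1,048,576 bytes) spans 5 chunks: four full and one partial.
console.log(chunkCount(1048576)); // 5
```

The compound index on files_id and n lets GridFS retrieve these chunks in order with a single index scan.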
The files Collection

Each document in the files collection represents a file in the GridFS store. Consider the following prototype of a document in the files collection:

    {
      "_id" : <ObjectId>,
      "length" : <num>,
      "chunkSize" : <num>,
      "uploadDate" : <timestamp>,
      "md5" : <hash>,
      "filename" : <string>,
      "contentType" : <string>,
      "aliases" : <string array>,
      "metadata" : <dataObject>
    }

Documents in the files collection contain some or all of the following fields. Applications may create additional arbitrary fields:

files._id The unique ID for this document. The _id is of the data type you chose for the original document. The default type for MongoDB documents is BSON ObjectId.

files.length The size of the document in bytes.

files.chunkSize The size of each chunk. GridFS divides the document into chunks of the size specified here. The default size is 255 kilobytes. Changed in version 2.4.10: The default chunk size changed from 256k to 255k.

files.uploadDate The date the document was first stored by GridFS. This value has the Date type.

files.md5 An MD5 hash returned by the filemd5 command. This value has the String type.

files.filename Optional. A human-readable name for the document.

files.contentType Optional. A valid MIME type for the document.

files.aliases Optional. An array of alias strings.

files.metadata Optional. Any additional information you want to store.

4.4.4 ObjectId

Overview

ObjectId is a 12-byte BSON type, constructed using:

• a 4-byte value representing the seconds since the Unix epoch,
• a 3-byte machine identifier,
• a 2-byte process id, and
• a 3-byte counter, starting with a random value.

In MongoDB, documents stored in a collection require a unique _id field that acts as a primary key. Because ObjectIds are small, most likely unique, and fast to generate, MongoDB uses ObjectIds as the default value for the _id field if the _id field is not specified. MongoDB clients should add an _id field with a unique ObjectId. However, if a client does not add an _id field, mongod will add an _id field that holds an ObjectId.

Using ObjectIds for the _id field provides the following additional benefits:

• In the mongo shell, you can access the creation time of the ObjectId using the getTimestamp() method.
• Sorting on an _id field that stores ObjectId values is roughly equivalent to sorting by creation time.

Important: The relationship between the order of ObjectId values and generation time is not strict within a single second. If multiple systems, or multiple processes or threads on a single system, generate values within a single second, ObjectId values do not represent a strict insertion order. Clock skew between clients can also result in non-strict ordering of values, because client drivers, not the mongod process, generate ObjectId values.

Also consider the Documents (page 158) section for related information on MongoDB's document orientation.

ObjectId()

The mongo shell provides the ObjectId() wrapper class to generate a new ObjectId, and provides the following helper attribute and methods:

• str The hexadecimal string representation of the object.

• getTimestamp() Returns the timestamp portion of the object as a Date.

• toString() Returns the JavaScript representation in the form of a string literal "ObjectId(...)". Changed in version 2.2: In previous versions, toString() returned the hexadecimal string representation, which as of version 2.2 can be retrieved via the str property.

• valueOf() Returns the representation of the object as a hexadecimal string. The returned string is the str attribute. Changed in version 2.2: In previous versions, valueOf() returned the object.

Examples

Consider the following uses of the ObjectId() class in the mongo shell:

Generate a new ObjectId

To generate a new ObjectId, use the ObjectId() constructor with no argument:
    x = ObjectId()

In this example, the value of x would be:

    ObjectId("507f1f77bcf86cd799439011")

To generate a new ObjectId using the ObjectId() constructor with a unique hexadecimal string:

    y = ObjectId("507f191e810c19729de860ea")

In this example, the value of y would be:

    ObjectId("507f191e810c19729de860ea")

Convert an ObjectId into a Timestamp

To return the timestamp of an ObjectId() object, use the getTimestamp() method as follows:

    ObjectId("507f191e810c19729de860ea").getTimestamp()

This operation will return the following Date object:

    ISODate("2012-10-17T20:46:22Z")

Convert ObjectIds into Strings

Access the str attribute of an ObjectId() object as follows:

    ObjectId("507f191e810c19729de860ea").str

This operation will return the following hexadecimal string:

    507f191e810c19729de860ea

To return the hexadecimal string representation of an ObjectId(), use the valueOf() method as follows:

    ObjectId("507f191e810c19729de860ea").valueOf()

This operation returns the following output:

    507f191e810c19729de860ea

To return the string representation of an ObjectId() object, use the toString() method as follows:

    ObjectId("507f191e810c19729de860ea").toString()

This operation will return the following output:

    ObjectId("507f191e810c19729de860ea")

4.4.5 BSON Types

BSON is a binary serialization format used to store documents and make remote procedure calls in MongoDB. The BSON specification is located at bsonspec.org (http://bsonspec.org/).

BSON supports the following data types as values in documents. Each data type has a corresponding number that can be used with the $type operator to query documents by BSON type.

    Type                       Number
    Double                     1
    String                     2
    Object                     3
    Array                      4
    Binary data                5
    Undefined                  6
    Object id                  7
    Boolean                    8
    Date                       9
    Null                       10
    Regular Expression         11
    JavaScript                 13
    Symbol                     14
    JavaScript (with scope)    15
    32-bit integer             16
    Timestamp                  17
    64-bit integer             18
    Min key                    255
    Max key                    127

To determine a field's type, see Check Types in the mongo Shell (page 252). If you convert BSON to JSON, see the Extended JSON reference (http://docs.mongodb.org/manual/reference/mongodb-extended-json).

Comparison/Sort Order

When comparing values of different BSON types, MongoDB uses the following comparison order, from lowest to highest:

1. MinKey (internal type)
2. Null
3. Numbers (ints, longs, doubles)
4. Symbol, String
5. Object
6. Array
7. BinData
8. ObjectId
9. Boolean
10. Date, Timestamp
11. Regular Expression
12. MaxKey (internal type)

MongoDB treats some types as equivalent for comparison purposes. For instance, numeric types undergo conversion before comparison.

The comparison treats a non-existent field as it would an empty BSON Object. As such, a sort on the a field in the documents { } and { a: null } would treat the documents as equivalent in sort order.

With arrays, a less-than comparison or an ascending sort compares the smallest element of the arrays, and a greater-than comparison or a descending sort compares the largest element of the arrays. As such, when comparing a field whose value is a single-element array (e.g. [ 1 ]) with non-array fields (e.g. 2), the comparison is between 1 and 2. A comparison of an empty array (e.g. [ ]) treats the empty array as less than null or a missing field.

MongoDB sorts BinData in the following order:

1. First, by the length or size of the data.
2. Then, by the BSON one-byte subtype.
3. Finally, by the data, performing a byte-by-byte comparison.

The following sections describe special considerations for particular BSON types.

ObjectId

ObjectIds are small, likely unique, fast to generate, and ordered. These values consist of 12 bytes, where the first four bytes are a timestamp that reflects the ObjectId's creation. Refer to the ObjectId (page 165) documentation for more information.

String

BSON strings are UTF-8. In general, drivers for each programming language convert from the language's string format to UTF-8 when serializing and deserializing BSON. This makes it possible to store most international characters in BSON strings with ease. (Given strings using UTF-8 character sets, using sort() on strings will be reasonably correct. However, because internally sort() uses the C++ strcmp API, the sort order may handle some characters incorrectly.) In addition, MongoDB $regex queries support UTF-8 in the regex string.

Timestamps

BSON has a special timestamp type for internal MongoDB use that is not associated with the regular Date (page 170) type. Timestamp values are a 64-bit value where:

• the first 32 bits are a time_t value (seconds since the Unix epoch), and
• the second 32 bits are an incrementing ordinal for operations within a given second.

Within a single mongod instance, timestamp values are always unique.

In replication, the oplog has a ts field. The values in this field reflect the operation time, which uses a BSON timestamp value.

Note: The BSON Timestamp type is for internal MongoDB use. For most cases, in application development, you will want to use the BSON Date type. See Date (page 170) for more information.

If you create a BSON Timestamp using the empty constructor (e.g. new Timestamp()), MongoDB will only generate a timestamp if you use the constructor in the first field of the document. (If the first field in the document is _id, then you can generate a timestamp in the second field of a document.) Otherwise, MongoDB will generate an empty timestamp value (i.e. Timestamp(0, 0)).

Changed in version 2.1: The mongo shell displays the Timestamp value with the wrapper:

    Timestamp(<time_t>, <ordinal>)

Prior to version 2.1, the mongo shell displayed the Timestamp value as a document:
{ t : <time_t>, i : <ordinal> }

Date

BSON Date is a 64-bit integer that represents the number of milliseconds since the Unix epoch (Jan 1, 1970). This results in a representable date range of about 290 million years into the past and future. The official BSON specification (http://bsonspec.org/#/specification) refers to the BSON Date type as the UTC datetime.

Changed in version 2.0: BSON Date type is signed. [22] Negative values represent dates before 1970.

Example
Construct a Date using the new Date() constructor in the mongo shell:

var mydate1 = new Date()

Example
Construct a Date using the ISODate() constructor in the mongo shell:

var mydate2 = ISODate()

Example
Return the Date value as a string:

mydate1.toString()

Example
Return the month portion of the Date value; months are zero-indexed, so that January is month 0:

mydate1.getMonth()

[22] Prior to version 2.0, Date values were incorrectly interpreted as unsigned integers, which affected sorts, range queries, and indexes on Date fields. Because indexes are not recreated when upgrading, please re-index if you created an index on Date values with an earlier version, and dates before 1970 are relevant to your application.
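The two time types above are easy to confuse. A small Python sketch (illustrative only, not MongoDB's own implementation) shows the 64-bit Timestamp layout described earlier (seconds in the high 32 bits, an operation ordinal in the low 32 bits) and checks the "about 290 million years" range that follows from a signed 64-bit millisecond Date:

```python
def pack_bson_timestamp(seconds, ordinal):
    # BSON Timestamp: seconds since the Unix epoch in the high 32 bits,
    # an incrementing ordinal for ops within that second in the low 32 bits.
    return (seconds & 0xFFFFFFFF) << 32 | (ordinal & 0xFFFFFFFF)

def unpack_bson_timestamp(value):
    return value >> 32, value & 0xFFFFFFFF

ts = pack_bson_timestamp(1410912000, 7)
assert unpack_bson_timestamp(ts) == (1410912000, 7)

# BSON Date: signed 64-bit milliseconds since the Unix epoch, so the
# representable range extends 2**63 ms in each direction.
MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365.25
print(round(2**63 / MS_PER_YEAR / 1e6))  # roughly 292 million years each way
```

This is why sorting and range queries on pre-1970 dates required the signed interpretation introduced in version 2.0.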
  • 175. CHAPTER 5 Administration The administration documentation addresses the ongoing operation and maintenance of MongoDB instances and de-ployments. This documentation includes both high level overviews of these concerns as well as tutorials that cover specific procedures and processes for operating MongoDB. Administration Concepts (page 171) Core conceptual documentation of operational practices for managing Mon-goDB deployments and systems. MongoDB Backup Methods (page 172) Describes approaches and considerations for backing up a MongoDB database. Monitoring for MongoDB (page 175) An overview of monitoring tools, diagnostic strategies, and approaches to monitoring replica sets and sharded clusters. Production Notes (page 188) A collection of notes that describe best practices and considerations for the oper-ations of MongoDB instances and deployments. Continue reading from Administration Concepts (page 171) for additional documentation of MongoDB admin-istration. Administration Tutorials (page 205) Tutorials that describe common administrative procedures and practices for op-erations for MongoDB instances and deployments. Configuration, Maintenance, and Analysis (page 205) Describes routine management operations, including configuration and performance analysis. Backup and Recovery (page 229) Outlines procedures for data backup and restoration with mongod instances and deployments. Continue reading from Administration Tutorials (page 205) for more tutorials of common MongoDB mainte-nance operations. Administration Reference (page 266) Reference and documentation of internal mechanics of administrative features, systems and functions and operations. See also: The MongoDB Manual contains administrative documentation and tutorials though out several sections. See Replica Set Tutorials (page 543) and Sharded Cluster Tutorials (page 634) for additional tutorials and information. 
5.1 Administration Concepts

The core administration documents address strategies and practices used in the operation of MongoDB systems and deployments.
Operational Strategies (page 172) Higher-level documentation of key concepts for the operation and maintenance of MongoDB deployments, including backup, maintenance, and configuration.
   MongoDB Backup Methods (page 172) Describes approaches and considerations for backing up a MongoDB database.
   Monitoring for MongoDB (page 175) An overview of monitoring tools, diagnostic strategies, and approaches to monitoring replica sets and sharded clusters.
   Run-time Database Configuration (page 182) Outlines common MongoDB configurations and examples of best-practice configurations for common use cases.
Data Management (page 194) Core documentation that addresses issues in data management, organization, maintenance, and lifecycle management.
   Data Center Awareness (page 194) Presents the MongoDB features that allow application developers and database administrators to configure their deployments to be more data center aware or allow operational and location-based separation.
   Expire Data from Collections by Setting TTL (page 198) TTL collections make it possible to automatically remove data from a collection based on the value of a timestamp and are useful for managing data like machine-generated event data that are only useful for a limited period of time.
   Capped Collections (page 196) Capped collections provide a special type of size-constrained collection that preserves insertion order and can support high-volume inserts.
Optimization Strategies for MongoDB (page 200) Techniques for optimizing application performance with MongoDB.

5.1.1 Operational Strategies

These documents address higher-level strategies for common administrative tasks and requirements with respect to MongoDB deployments.

MongoDB Backup Methods (page 172) Describes approaches and considerations for backing up a MongoDB database.
Monitoring for MongoDB (page 175) An overview of monitoring tools, diagnostic strategies, and approaches to monitoring replica sets and sharded clusters.
Run-time Database Configuration (page 182) Outlines common MongoDB configurations and examples of best-practice configurations for common use cases.
Import and Export MongoDB Data (page 186) Provides an overview of mongoimport and mongoexport, the tools MongoDB includes for importing and exporting data.
Production Notes (page 188) A collection of notes that describe best practices and considerations for the operations of MongoDB instances and deployments.

MongoDB Backup Methods

When deploying MongoDB in production, you should have a strategy for capturing and restoring backups in the case of data loss events. There are several ways to back up MongoDB clusters:

• Backup by Copying Underlying Data Files (page 173)
• Backup with mongodump (page 173)
• MongoDB Management Service (MMS) Cloud Backup (page 174)
• MongoDB Management Service (MMS) On Prem Backup Software (page 174)
Backup by Copying Underlying Data Files

You can create a backup by copying MongoDB's underlying data files. If the volume where MongoDB stores data files supports point-in-time snapshots, you can use these snapshots to create backups of a MongoDB system at an exact moment in time. File system snapshots are an operating system volume manager feature, and are not specific to MongoDB. The mechanics of snapshots depend on the underlying storage system. For example, Amazon's EBS storage system for EC2 supports snapshots. On Linux, the LVM manager can create a snapshot.

To get a correct snapshot of a running mongod process, you must have journaling enabled and the journal must reside on the same logical volume as the other MongoDB data files. Without journaling enabled, there is no guarantee that the snapshot will be consistent or valid. To get a consistent snapshot of a sharded system, you must disable the balancer and capture a snapshot from every shard and a config server at approximately the same moment in time.

If your storage system does not support snapshots, you can copy the files directly using cp, rsync, or a similar tool. Since copying multiple files is not an atomic operation, you must stop all writes to the mongod before copying the files. Otherwise, you will copy the files in an invalid state.

Backups produced by copying the underlying data do not support point-in-time recovery for replica sets and are difficult to manage for larger sharded clusters. Additionally, these backups are larger because they include the indexes and duplicate underlying storage padding and fragmentation. mongodump, by contrast, creates smaller backups.

For more information, see Backup and Restore with Filesystem Snapshots (page 229) and Backup a Sharded Cluster with Filesystem Snapshots (page 239) for complete instructions on using LVM to create snapshots. Also see Back up and Restore Processes for MongoDB on Amazon EC2. [1]
Backup with mongodump

The mongodump tool reads data from a MongoDB database and creates high-fidelity BSON files. The mongorestore tool can populate a MongoDB database with the data from these BSON files. These tools are simple and efficient for backing up small MongoDB deployments, but are not ideal for capturing backups of larger systems.

mongodump and mongorestore can operate against a running mongod process, and can manipulate the underlying data files directly. By default, mongodump does not capture the contents of the local database (page 598).

mongodump only captures the documents in the database. The resulting backup is space efficient, but mongorestore or mongod must rebuild the indexes after restoring data.

When connected to a MongoDB instance, mongodump can adversely affect mongod performance. If your data is larger than system memory, the queries will push the working set out of memory.

To mitigate the impact of mongodump on the performance of the replica set, use mongodump to capture backups from a secondary (page 508) member of a replica set. Alternatively, you can shut down a secondary and use mongodump with the data files directly. If you shut down a secondary to capture data with mongodump, ensure that the operation can complete before its oplog becomes too stale to continue replicating.

For replica sets, mongodump also supports a point-in-time feature with the --oplog option. Applications may continue modifying data while mongodump captures the output. To restore a point-in-time backup created with --oplog, use mongorestore with the --oplogReplay option. If applications modify data while mongodump is creating a backup, mongodump will compete for resources with those applications.

[1] http://docs.mongodb.org/ecosystem/tutorial/backup-and-restore-mongodb-on-amazon-ec2
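The --oplog / --oplogReplay pairing above can be sketched as a small wrapper that assembles the two command lines. The host, port, and backup paths here are hypothetical; this only builds the argument lists and makes no claims about mongodump's behavior beyond the flags documented above:

```python
def dump_cmd(out_dir, host="localhost", port=27017):
    # Point-in-time dump of a replica set member: --oplog also captures
    # oplog entries written while the dump is running.
    return ["mongodump", "--host", host, "--port", str(port),
            "--oplog", "--out", out_dir]

def restore_cmd(dump_dir, host="localhost", port=27017):
    # Replay the captured oplog on top of the restored data to reach
    # the moment the dump finished.
    return ["mongorestore", "--host", host, "--port", str(port),
            "--oplogReplay", dump_dir]

# e.g. subprocess.run(dump_cmd("/backup/2014-09-16"), check=True)
print(" ".join(dump_cmd("/backup/2014-09-16")))
```

In practice you would point dump_cmd at a secondary member, as recommended above, to keep the load off the primary.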
See Back Up and Restore with MongoDB Tools (page 234), Backup a Small Sharded Cluster with mongodump (page 238), and Backup a Sharded Cluster with Database Dumps (page 241) for more information.

MongoDB Management Service (MMS) Cloud Backup

The MongoDB Management Service [2] supports backup and restore for MongoDB deployments. MMS continually backs up MongoDB replica sets and sharded systems by reading the oplog data from your MongoDB cluster. MMS Backup offers point-in-time recovery of MongoDB replica sets and a consistent snapshot of sharded systems.

MMS achieves point-in-time recovery by storing oplog data so that it can create a restore for any moment in time in the last 24 hours for a particular replica set.

For sharded systems, MMS does not provide restores for arbitrary moments in time. MMS does provide periodic consistent snapshots of the entire sharded cluster. Sharded cluster snapshots are difficult to achieve with other MongoDB backup methods.

To restore a MongoDB cluster from an MMS Backup snapshot, you download a compressed archive of your MongoDB data files and distribute those files before restarting the mongod processes.

To get started with MMS Backup, sign up for MMS [3]. For the complete documentation of MMS, see the MMS Manual [4].

MongoDB Management Service (MMS) On Prem Backup Software

MongoDB subscribers can install and run the same core software that powers MongoDB Management Service (MMS) Cloud Backup (page 174) on their own infrastructure. The On Prem version of MMS has similar functionality to the cloud version and is available with Standard and Enterprise subscriptions.

For more information about On Prem MMS, see the MongoDB subscription [5] page and the MMS On Prem Manual [6].

Further Reading

Backup and Restore with Filesystem Snapshots (page 229) An outline of procedures for creating MongoDB data set backups using system-level file snapshot tools, such as LVM or native storage appliance tools.
Restore a Replica Set from MongoDB Backups (page 232) Describes the procedure for restoring a replica set from an archived backup such as a mongodump or MMS Backup [7] file.
Back Up and Restore with MongoDB Tools (page 234) The procedure for writing the contents of a database to a BSON (i.e. binary) dump file for backing up MongoDB databases.
Backup and Restore Sharded Clusters (page 238) Detailed procedures and considerations for backing up sharded clusters and single shards.
Recover Data after an Unexpected Shutdown (page 246) Recover data from MongoDB data files that were not properly closed or have an invalid state.

[2] https://mms.10gen.com/?pk_campaign=MongoDB-Org&pk_kwd=Backup-Docs
[3] http://mms.mongodb.com
[4] https://mms.mongodb.com/help/
[5] https://www.mongodb.com/products/subscriptions
[6] https://mms.mongodb.com/help-hosted/current/
[7] https://mms.mongodb.com/?pk_campaign=mongodb-docs-admin-tutorials
Monitoring for MongoDB

Monitoring is a critical component of all database administration. A firm grasp of MongoDB's reporting will allow you to assess the state of your database and maintain your deployment without crisis. Additionally, a sense of MongoDB's normal operational parameters will allow you to diagnose problems before they escalate to failures.

This document presents an overview of the available monitoring utilities and the reporting statistics available in MongoDB. It also introduces diagnostic strategies and suggestions for monitoring replica sets and sharded clusters.

Note: MongoDB Management Service (MMS) [8] is a hosted monitoring service which collects and aggregates data to provide insight into the performance and operation of MongoDB deployments. See the MMS documentation [9] for more information.

Monitoring Strategies

There are three methods for collecting data about the state of a running MongoDB instance:

• First, there is a set of utilities distributed with MongoDB that provides real-time reporting of database activities.
• Second, database commands return statistics regarding the current database state with greater fidelity.
• Third, MMS Monitoring Service [10] collects data from running MongoDB deployments and provides visualization and alerts based on that data. MMS is a free service provided by MongoDB.

Each strategy can help answer different questions and is useful in different contexts. These methods are complementary.

MongoDB Reporting Tools

This section provides an overview of the reporting methods distributed with MongoDB. It also offers examples of the kinds of questions that each method is best suited to help you address.

Utilities

The MongoDB distribution includes a number of utilities that quickly return statistics about instances' performance and activity. Typically, these are most useful for diagnosing issues and assessing normal operation.
mongostat

mongostat captures and returns the counts of database operations by type (e.g. insert, query, update, delete, etc.). These counts report on the load distribution on the server. Use mongostat to understand the distribution of operation types and to inform capacity planning. See the mongostat manual for details.

mongotop

mongotop tracks and reports the current read and write activity of a MongoDB instance, and reports these statistics on a per-collection basis. Use mongotop to check if your database activity and use match your expectations. See the mongotop manual for details.

[8] https://mms.mongodb.com/?pk_campaign=mongodb-org&pk_kwd=monitoring
[9] http://mms.mongodb.com/help/
[10] https://mms.mongodb.com/?pk_campaign=mongodb-org&pk_kwd=monitoring
REST Interface

MongoDB provides a simple REST interface that can be useful for configuring monitoring and alert scripts, and for other administrative tasks. To enable the interface, configure mongod to use REST, either by starting mongod with the --rest option, or by setting the net.http.RESTInterfaceEnabled setting to true in a configuration file. For more information on using the REST Interface, see the Simple REST Interface [11] documentation.

HTTP Console

MongoDB provides a web interface that exposes diagnostic and monitoring information in a simple web page. The web interface is accessible at localhost:<port>, where the <port> number is 1000 more than the mongod port. For example, if a locally running mongod is using the default port 27017, access the HTTP console at http://localhost:28017.

Commands

MongoDB includes a number of commands that report on the state of the database. These data may provide a finer level of granularity than the utilities discussed above. Consider using their output in scripts and programs to develop custom alerts, or to modify the behavior of your application in response to the activity of your instance. The db.currentOp method is another useful tool for identifying the database instance's in-progress operations.

serverStatus

The serverStatus command, or db.serverStatus() from the shell, returns a general overview of the status of the database, detailing disk usage, memory use, connection, journaling, and index access. The command returns quickly and does not impact MongoDB performance.

serverStatus outputs an account of the state of a MongoDB instance. This command is rarely run directly. In most cases, the data is more meaningful when aggregated, as one would see with monitoring tools including MMS [12]. Nevertheless, all administrators should be familiar with the data provided by serverStatus.
dbStats

The dbStats command, or db.stats() from the shell, returns a document that addresses storage use and data volumes. The dbStats reflect the amount of storage used, the quantity of data contained in the database, and object, collection, and index counters. Use this data to monitor the state and storage capacity of a specific database. This output also allows you to compare use between databases and to determine the average document size in a database.

collStats

The collStats provides statistics that resemble dbStats on the collection level, including a count of the objects in the collection, the size of the collection, the amount of disk space used by the collection, and information about its indexes.

replSetGetStatus

The replSetGetStatus command (rs.status() from the shell) returns an overview of your replica set's status. The replSetGetStatus document details the state and configuration of the replica set and statistics about its members. Use this data to ensure that replication is properly configured, and to check the connections between the current host and the other members of the replica set.

Third Party Tools

A number of third party monitoring tools have support for MongoDB, either directly, or through their own plugins.

[11] http://docs.mongodb.org/ecosystem/tools/http-interfaces
[12] http://mms.mongodb.com
Self Hosted Monitoring Tools

These are monitoring tools that you must install, configure and maintain on your own servers. Most are open source.

• Ganglia [26], plugin mongodb-ganglia [27]: Python script to report operations per second, memory usage, btree statistics, master/slave status and current connections.
• Ganglia, plugin gmond_python_modules [28]: Parses output from the serverStatus and replSetGetStatus commands.
• Motop [29], no plugin: Realtime monitoring tool for MongoDB servers. Shows current operations ordered by durations every second.
• mtop [30], no plugin: A top-like tool.
• Munin [31], plugin mongo-munin [32]: Retrieves server statistics.
• Munin, plugin mongomon [33]: Retrieves collection statistics (sizes, index sizes, and each (configured) collection count for one DB).
• Munin, plugin munin-plugins Ubuntu PPA [34]: Some additional munin plugins not in the main distribution.
• Nagios [35], plugin nagios-plugin-mongodb [36]: A simple Nagios check script, written in Python.
• Zabbix [37], plugin mikoomi-mongodb [38]: Monitors availability, resource utilization, health, performance and other important metrics.

Also consider dex [39], an index and query analyzing tool for MongoDB that compares MongoDB log files and indexes to make indexing recommendations.

As part of MongoDB Enterprise [40], you can run MMS On-Prem [41], which offers the features of MMS in a package that runs within your infrastructure.

Hosted (SaaS) Monitoring Tools

These are monitoring tools provided as a hosted service, usually through a paid subscription.
[26] http://sourceforge.net/apps/trac/ganglia/wiki
[27] https://github.com/quiiver/mongodb-ganglia
[28] https://github.com/ganglia/gmond_python_modules
[29] https://github.com/tart/motop
[30] https://github.com/beaufour/mtop
[31] http://munin-monitoring.org/
[32] https://github.com/erh/mongo-munin
[33] https://github.com/pcdummy/mongomon
[34] https://launchpad.net/~chris-lea/+archive/munin-plugins
[35] http://www.nagios.org/
[36] https://github.com/mzupan/nagios-plugin-mongodb
[37] http://www.zabbix.com/
[38] https://code.google.com/p/mikoomi/wiki/03
[39] https://github.com/mongolab/dex
[40] http://www.mongodb.com/products/mongodb-enterprise
[41] http://mms.mongodb.com
• MongoDB Management Service [50]: MMS is a cloud-based suite of services for managing MongoDB deployments. MMS provides monitoring and backup functionality.
• Scout [51]: Several plugins, including MongoDB Monitoring [52], MongoDB Slow Queries [53], and MongoDB Replica Set Monitoring [54].
• Server Density [55]: Dashboard for MongoDB [56], MongoDB-specific alerts, replication failover timeline and iPhone, iPad and Android mobile apps.
• Application Performance Management [57]: IBM has an Application Performance Management SaaS offering that includes monitoring for MongoDB and other applications and middleware.

Process Logging

During normal operation, mongod and mongos instances report a live account of all server activity and operations to either standard output or a log file. The following runtime settings control these options:

• quiet. Limits the amount of information written to the log or output.
• verbosity. Increases the amount of information written to the log or output.
• path. Enables logging to a file, rather than the standard output. You must specify the full path to the log file when adjusting this setting.
• logAppend. Adds information to a log file instead of overwriting the file.

Note: You can specify these configuration operations as the command line arguments to mongod or mongos. For example:

mongod -v --logpath /var/log/mongodb/server1.log --logappend

This starts a mongod instance in verbose mode, appending data to the log file at /var/log/mongodb/server1.log.

The following database commands also affect logging:

• getLog. Displays recent messages from the mongod process log.
• logRotate. Rotates the log files for mongod processes only. See Rotate Log Files (page 214).
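The runtime logging settings above map onto command-line flags in a regular way; a minimal sketch of assembling them, assuming the flags shown in the example command above:

```python
def mongod_log_args(logpath=None, verbose=False, quiet=False, append=True):
    """Build mongod logging flags from the runtime settings described above."""
    args = ["mongod"]
    if quiet:
        args.append("--quiet")
    if verbose:
        args.append("-v")          # increase log verbosity
    if logpath:
        args += ["--logpath", logpath]
        if append:
            args.append("--logappend")  # append rather than overwrite
    return args

print(" ".join(mongod_log_args("/var/log/mongodb/server1.log", verbose=True)))
```

With these arguments the result matches the example invocation given in the note above.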
[50] https://mms.mongodb.com/?pk_campaign=mongodb-org&pk_kwd=monitoring
[51] http://scoutapp.com
[52] https://scoutapp.com/plugin_urls/391-mongodb-monitoring
[53] http://scoutapp.com/plugin_urls/291-mongodb-slow-queries
[54] http://scoutapp.com/plugin_urls/2251-mongodb-replica-set-monitoring
[55] http://www.serverdensity.com
[56] http://www.serverdensity.com/mongodb-monitoring/
[57] http://ibmserviceengage.com
Diagnosing Performance Issues

Degraded performance in MongoDB is typically a function of the relationship between the quantity of data stored in the database, the amount of system RAM, the number of connections to the database, and the amount of time the database spends in a locked state.

In some cases performance issues may be transient and related to traffic load, data access patterns, or the availability of hardware on the host system for virtualized environments. Some users also experience performance limitations as a result of inadequate or inappropriate indexing strategies, or as a consequence of poor schema design patterns. In other situations, performance issues may indicate that the database may be operating at capacity and that it is time to add additional capacity to the database.

The following are some causes of degraded performance in MongoDB.

Locks

MongoDB uses a locking system to ensure data set consistency. However, if certain operations are long-running, or a queue forms, performance will slow as requests and operations wait for the lock. Lock-related slowdowns can be intermittent. To see if the lock has been affecting your performance, look to the data in the globalLock section of the serverStatus output. If globalLock.currentQueue.total is consistently high, then there is a chance that a large number of requests are waiting for a lock. This indicates a possible concurrency issue that may be affecting performance.

If globalLock.totalTime is high relative to uptime, the database has existed in a lock state for a significant amount of time. If globalLock.ratio is also high, MongoDB has likely been processing a large number of long running queries. Long queries are often the result of a number of factors: ineffective use of indexes, non-optimal schema design, poor query structure, system architecture issues, or insufficient RAM resulting in page faults (page 179) and disk reads.
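The two lock checks above can be sketched as a small helper over a serverStatus-style document. The field names follow the serverStatus output described here; the input and the thresholds are hypothetical, chosen only for illustration:

```python
def lock_pressure(server_status, queue_threshold=5, lock_share_threshold=0.5):
    """Flag possible lock contention from serverStatus-style fields."""
    gl = server_status["globalLock"]
    warnings = []
    if gl["currentQueue"]["total"] > queue_threshold:
        warnings.append("many requests queued for a lock")
    # globalLock.totalTime is in microseconds; uptime is in seconds.
    if gl["totalTime"] / (server_status["uptime"] * 1e6) > lock_share_threshold:
        warnings.append("locked for a large share of uptime")
    return warnings

status = {  # hypothetical excerpt, not real server output
    "uptime": 10,
    "globalLock": {"totalTime": 8_000_000, "currentQueue": {"total": 12}},
}
print(lock_pressure(status))
```

In a live deployment the document would come from db.serverStatus() via a driver rather than a literal, and the thresholds would be tuned against your baseline.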
Memory Usage

MongoDB uses memory-mapped files to store data. Given a data set of sufficient size, the MongoDB process will allocate all available memory on the system for its use. While this is part of the design, and affords MongoDB superior performance, the memory-mapped files make it difficult to determine if the amount of RAM is sufficient for the data set.

The memory usage metrics of the serverStatus output can provide insight into MongoDB's memory use. Check the resident memory use (i.e. mem.resident): if this exceeds the amount of system memory and there is a significant amount of data on disk that isn't in RAM, you may have exceeded the capacity of your system.

You should also check the amount of mapped memory (i.e. mem.mapped). If this value is greater than the amount of system memory, some operations will require disk access page faults to read data from virtual memory and negatively affect performance.

Page Faults

Page faults can occur as MongoDB reads from or writes data to parts of its data files that are not currently located in physical memory. In contrast, operating system page faults happen when physical memory is exhausted and pages of physical memory are swapped to disk.

Page faults triggered by MongoDB are reported as the total number of page faults in one second. To check for page faults, see the extra_info.page_faults value in the serverStatus output. MongoDB on Windows counts both hard and soft page faults.

The MongoDB page fault counter may increase dramatically in moments of poor performance and may correlate with limited physical memory environments. Page faults also can increase while accessing much larger data sets, for example, scanning an entire collection. Limited and sporadic MongoDB page faults do not necessarily indicate a problem or a need to tune the database. A single page fault completes quickly and is not problematic.
However, in aggregate, large volumes of page faults typically indicate that MongoDB is reading too much data from disk. In many situations, MongoDB's read locks will "yield" after a page fault to allow other processes to read and avoid blocking while waiting for the next page to read into memory. This approach improves concurrency, and also improves overall throughput in high volume systems.

Increasing the amount of RAM accessible to MongoDB may help reduce the frequency of page faults. If this is not possible, you may want to consider deploying a sharded cluster or adding shards to your deployment to distribute load among mongod instances. See What are page faults? (page 715) for more information.

Number of Connections

In some cases, the number of connections between the application layer (i.e. clients) and the database can overwhelm the ability of the server to handle requests. This can produce performance irregularities. The following fields in the serverStatus document can provide insight:

• globalLock.activeClients contains a counter of the total number of clients with active operations in progress or queued.
• connections is a container for the following two fields:
   – current the total number of current clients that connect to the database instance.
   – available the total number of unused connections available for new clients.

If requests are high because there are numerous concurrent application requests, the database may have trouble keeping up with demand. If this is the case, then you will need to increase the capacity of your deployment. For read-heavy applications, increase the size of your replica set and distribute read operations to secondary members. For write-heavy applications, deploy sharding and add one or more shards to a sharded cluster to distribute load among mongod instances.

Spikes in the number of connections can also be the result of application or driver errors. All of the officially supported MongoDB drivers implement connection pooling, which allows clients to use and reuse connections more efficiently.
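The connections.current and connections.available fields above can be turned into a simple headroom check; the numbers below are hypothetical, and in practice the document would come from db.serverStatus():

```python
def connection_headroom(server_status):
    """Fraction of connection capacity still unused, from the serverStatus
    connections.current and connections.available fields."""
    conn = server_status["connections"]
    capacity = conn["current"] + conn["available"]
    return conn["available"] / capacity

status = {"connections": {"current": 820, "available": 180}}  # hypothetical
print(round(connection_headroom(status), 2))  # 0.18
```

A headroom that trends toward zero without a matching rise in workload would be the kind of driver or configuration error described below.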
Extremely high numbers of connections, particularly without corresponding workload, are often indicative of a driver or other configuration error.

Unless constrained by system-wide limits, MongoDB has no limit on incoming connections. You can modify system limits using the ulimit command, or by editing your system's /etc/sysctl file. See UNIX ulimit Settings (page 266) for more information.

Database Profiling

MongoDB's "Profiler" is a database profiling system that can help identify inefficient queries and operations.

The following profiling levels are available:

Level   Setting
0       Off. No profiling
1       On. Only includes "slow" operations
2       On. Includes all operations

Enable the profiler by setting the profile value using the following command in the mongo shell:

db.setProfilingLevel(1)

The slowOpThresholdMs setting defines what constitutes a "slow" operation. To set the threshold above which the profiler considers operations "slow" (and thus, included in the level 1 profiling data), you can configure slowOpThresholdMs at runtime as an argument to the db.setProfilingLevel() operation. See the documentation of db.setProfilingLevel() for more information about this command.

By default, mongod records all "slow" queries to its log, as defined by slowOpThresholdMs.
Note: Because the database profiler can negatively impact performance, only enable profiling for strategic intervals and as minimally as possible on production systems. You may enable profiling on a per-mongod basis. This setting will not propagate across a replica set or sharded cluster.

You can view the output of the profiler in the system.profile collection of your database by issuing the show profile command in the mongo shell, or with the following operation:

db.system.profile.find( { millis : { $gt : 100 } } )

This returns all operations that lasted longer than 100 milliseconds. Ensure that the value specified here (100, in this example) is above the slowOpThresholdMs threshold.

See also:
Optimization Strategies for MongoDB (page 200) addresses strategies that may improve the performance of your database queries and operations.

Replication and Monitoring

Beyond the basic monitoring requirements for any MongoDB instance, for replica sets, administrators must monitor replication lag. "Replication lag" refers to the amount of time that it takes to copy (i.e. replicate) a write operation on the primary to a secondary. Some small delay period may be acceptable, but two significant problems emerge as replication lag grows:

• First, operations that occurred during the period of lag are not replicated to one or more secondaries. If you're using replication to ensure data persistence, exceptionally long delays may impact the integrity of your data set.
• Second, if the replication lag exceeds the length of the operation log (oplog) then MongoDB will have to perform an initial sync on the secondary, copying all data from the primary and rebuilding all indexes. This is uncommon under normal circumstances, but if you configure the oplog to be smaller than the default, the issue can arise.
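The "smaller than the default" caveat above is easy to quantify: on 64-bit systems the default oplog is 5 percent of available disk space. A sketch of that sizing rule follows; the 990 MB floor and 50 GB ceiling used here are assumptions for illustration, not authoritative values:

```python
def default_oplog_size_mb(free_disk_mb, floor_mb=990, ceiling_mb=50 * 1024):
    # 5% of available disk space, clamped to an assumed floor/ceiling.
    size = free_disk_mb * 0.05
    return min(max(size, floor_mb), ceiling_mb)

# A hypothetical host with 500 GB of available disk space:
print(default_oplog_size_mb(500 * 1024))  # 25600.0 MB, i.e. 25 GB
```

If a secondary's outage can outlast the window of writes this much oplog can hold, the initial-sync scenario in the second bullet above becomes a real risk.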
Note: The size of the oplog is only configurable during the first run, using the --oplogSize argument to the mongod command or, preferably, the oplogSizeMB setting in the MongoDB configuration file. If you do not specify this on the command line before running with the --replSet option, mongod will create a default-sized oplog. By default, the oplog is 5 percent of total available disk space on 64-bit systems. For more information about changing the oplog size, see Change the Size of the Oplog (page 570).

For causes of replication lag, see Replication Lag (page 589). Replication issues are most often the result of network connectivity issues between members, or the result of a primary that does not have the resources to support application and replication traffic. To check the status of a replica set, use the replSetGetStatus command or the following helper in the shell:

rs.status()

The http://docs.mongodb.org/manual/reference/command/replSetGetStatus document provides a more in-depth overview of this output. In general, watch the value of optimeDate, and pay particular attention to the time difference between the primary and the secondary members.
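The comparison described above can be sketched in Python. The timestamps below are invented; in practice you would read the optimeDate values for the primary and each secondary from rs.status() output:

```python
# Sketch: replication lag is the difference between the primary's and a
# secondary's last-applied operation time (optimeDate in rs.status()).
from datetime import datetime

def replication_lag_seconds(primary_optime, secondary_optime):
    """Seconds by which the secondary trails the primary."""
    return (primary_optime - secondary_optime).total_seconds()

primary = datetime(2014, 9, 16, 12, 0, 30)
secondary = datetime(2014, 9, 16, 12, 0, 5)
print(replication_lag_seconds(primary, secondary))  # 25.0
```

If this difference grows toward the time span covered by the oplog, the secondary is at risk of requiring a full initial sync.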
  • 186. MongoDB Documentation, Release 2.6.4 Sharding and Monitoring In most cases, the components of sharded clusters benefit from the same monitoring and analysis as all other MongoDB instances. In addition, clusters require further monitoring to ensure that data is effectively distributed among nodes and that sharding operations are functioning appropriately. See also: See the Sharding Concepts (page 613) documentation for more information. Config Servers The config database maintains a map identifying which documents are on which shards. The cluster updates this map as chunks move between shards. When a configuration server becomes inaccessible, certain sharding operations become unavailable, such as moving chunks and starting mongos instances. However, clusters remain accessible from already-running mongos instances. Because inaccessible configuration servers can seriously impact the availability of a sharded cluster, you should mon-itor your configuration servers to ensure that the cluster remains well balanced and that mongos instances can restart. MMS Monitoring58 monitors config servers and can create notifications if a config server becomes inaccessible. Balancing and Chunk Distribution The most effective sharded cluster deployments evenly balance chunks among the shards. To facilitate this, MongoDB has a background balancer process that distributes data to ensure that chunks are always optimally distributed among the shards. Issue the db.printShardingStatus() or sh.status() command to the mongos by way of the mongo shell. This returns an overview of the entire cluster including the database name, and a list of the chunks. Stale Locks In nearly every case, all locks used by the balancer are automatically released when they become stale. However, because any long lasting lock can block future balancing, it’s important to ensure that all locks are legitimate. To check the lock status of the database, connect to a mongos instance using the mongo shell. 
Issue the following command sequence to switch to the config database and display all outstanding locks on the shard database: use config db.locks.find() For active deployments, the above query can provide insights. The balancing process, which originates on a randomly selected mongos, takes a special “balancer” lock that prevents other balancing activity from transpiring. Use the following command, also to the config database, to check the status of the “balancer” lock. db.locks.find( { _id : "balancer" } ) If this lock exists, make sure that the balancer process is actively using this lock. Run-time Database Configuration The command line and configuration file interfaces provide MongoDB administrators with a large num-ber of options and settings for controlling the operation of the database system. This document provides an overview of common configurations and examples of best-practice configurations for common use cases. While both interfaces provide access to the same collection of options and settings, this document primarily uses the configuration file interface. If you run MongoDB using a control script or installed from a package for your operating system, you likely already have a configuration file located at /etc/mongodb.conf. Confirm this by checking the contents of the /etc/init.d/mongod or /etc/rc.d/mongod script to ensure that the control scripts start the mongod with the appropriate configuration file (see below.) 58http://mms.mongodb.com 182 Chapter 5. Administration
To start a MongoDB instance using this configuration file, issue a command in one of the following forms:

mongod --config /etc/mongodb.conf
mongod -f /etc/mongodb.conf

Modify the values in the /etc/mongodb.conf file on your system to control the configuration of your database instance.

Configure the Database

Consider the following basic configuration:

fork = true
bind_ip = 127.0.0.1
port = 27017
quiet = true
dbpath = /srv/mongodb
logpath = /var/log/mongodb/mongod.log
logappend = true
journal = true

For most standalone servers, this is a sufficient base configuration. It makes several assumptions, but consider the following explanation:

• fork is true, which enables a daemon mode for mongod that detaches (i.e. "forks") the MongoDB process from the current session and allows you to run the database as a conventional server.
• bindIp is 127.0.0.1, which forces the server to only listen for requests on the localhost IP. Only bind to secure interfaces that the application-level systems can access, with access control provided by system network filtering (i.e. a "firewall"). New in version 2.6: mongod instances installed from the official .deb (page 12) and .rpm (page 6) packages have the bind_ip configuration set to 127.0.0.1 by default.
• port is 27017, which is the default MongoDB port for database instances. MongoDB can bind to any port. You can also filter access based on port using network filtering tools. Note: UNIX-like systems require superuser privileges to attach processes to ports lower than 1024.
• quiet is true, which disables all but the most critical entries in the output/log file. In normal operation this is preferable, to avoid log noise. In diagnostic or testing situations, set this value to false. Use setParameter to modify this setting during run time.
• dbPath is /srv/mongodb, which specifies where MongoDB will store its data files. /srv/mongodb and /var/lib/mongodb are popular locations.
The user account that mongod runs under will need read and write access to this directory. • systemLog.path is /var/log/mongodb/mongod.log which is where mongod will write its output. If you do not set this value, mongod writes all output to standard output (e.g. stdout.) • logAppend is true, which ensures that mongod does not overwrite an existing log file following the server start operation. • storage.journal.enabled is true, which enables journaling. Journaling ensures single instance write-durability. 64-bit builds of mongod enable journaling by default. Thus, this setting may be redundant. Given the default configuration, some of these values may be redundant. However, in many situations explicitly stating the configuration increases overall system intelligibility. 5.1. Administration Concepts 183
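The key = value format shown above is simple enough to parse mechanically, which is useful when auditing a fleet of instances. A minimal Python sketch follows; parse_conf is an illustrative helper, not a MongoDB tool, and it handles only the flat 2.6-era format, not the later YAML format:

```python
# Sketch: parse a 2.6-era "key = value" mongodb.conf into a dict.
# Comments and blank lines are ignored; true/false become booleans.

def parse_conf(text):
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        conf[key.strip()] = {"true": True, "false": False}.get(value, value)
    return conf

sample = """fork = true
bind_ip = 127.0.0.1
port = 27017
dbpath = /srv/mongodb"""
print(parse_conf(sample))
```

Numeric values such as port are left as strings here; mongod itself interprets them, so a sketch like this only needs to recover the key/value pairs.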
Security Considerations

The following collection of configuration options is useful for limiting access to a mongod instance. Consider the following:

bind_ip = 127.0.0.1,10.8.0.10,192.168.4.24
auth = true

Consider the following explanation for these configuration decisions:

• bindIp has three values: 127.0.0.1, the localhost interface; 10.8.0.10, a private IP address typically used for local networks and VPN interfaces; and 192.168.4.24, a private network interface typically used for local networks. Because production MongoDB instances need to be accessible from multiple database servers, it is important to bind MongoDB to multiple interfaces that are accessible from your application servers. At the same time it's important to limit these interfaces to interfaces controlled and protected at the network layer.
• Setting nounixsocket to true disables the UNIX socket, which is otherwise enabled by default. This limits access on the local system. This is desirable when running MongoDB on systems with shared access, but in most situations has minimal impact.
• auth is true, which enables the authentication system within MongoDB. If enabled, you will need to connect over the localhost interface for the first time to create user credentials.

See also: Security Concepts (page 281)

Replication and Sharding Configuration

Replication Configuration

Replica set configuration is straightforward, and only requires that the replSetName have a value that is consistent among all members of the set. Consider the following:

replSet = set0

Use descriptive names for sets. Once configured, use the mongo shell to add hosts to the replica set. See also: Replica set reconfiguration. To enable authentication for the replica set, add the following option:

keyFile = /srv/mongodb/keyfile

New in version 1.8: keyFile support for replica sets, and 1.9.1 for sharded replica sets.
Setting keyFile enables authentication and specifies a key file for the replica set member use to when authenticating to each other. The content of the key file is arbitrary, but must be the same on all members of the replica set and mongos instances that connect to the set. The keyfile must be less than one kilobyte in size and may only contain characters in the base64 set and the file must not have group or “world” permissions on UNIX systems. See also: The Replica set Reconfiguration section for information regarding the process for changing replica set during opera-tion. Additionally, consider the Replica Set Security section for information on configuring authentication with replica sets. Finally, see the Replication (page 503) document for more information on replication in MongoDB and replica set configuration in general. 184 Chapter 5. Administration
Sharding Configuration

Sharding requires a number of mongod instances with different configurations. The config servers store the cluster's metadata, while the cluster distributes data among one or more shard servers.

Note: Config servers are not replica sets.

Set up one or three "config server" instances as normal (page 183) mongod instances, and then add the following configuration option:

configsvr = true
bind_ip = 10.8.0.12
port = 27001

This creates a config server running on the private IP address 10.8.0.12 on port 27001. Make sure that there are no port conflicts, and that your config server is accessible from all of your mongos and mongod instances. To set up shards, configure two or more mongod instances using your base configuration (page 183), with the shardsvr value for the clusterRole setting:

shardsvr = true

Finally, to establish the cluster, configure at least one mongos process with the following settings:

configdb = 10.8.0.12:27001
chunkSize = 64

You can specify multiple configDB instances by specifying hostnames and ports in the form of a comma-separated list. In general, avoid modifying the chunkSize from the default value of 64,59 and ensure this setting is consistent among all mongos instances. See also: The Sharding (page 607) section of the manual for more information on sharding and cluster configuration.

Run Multiple Database Instances on the Same System

In many cases running multiple instances of mongod on a single system is not recommended. On some types of deployments60 and for testing purposes you may need to run more than one mongod on a single system. In these cases, use a base configuration (page 183) for each instance, but consider the following configuration values:

dbpath = /srv/mongodb/db0/
pidfilepath = /srv/mongodb/db0.pid

The dbPath value controls the location of the mongod instance's data directory.
Ensure that each database has a distinct and well labeled data directory. The pidFilePath controls where mongod process places it’s process id file. As this tracks the specific mongod file, it is crucial that file be unique and well labeled to make it easy to start and stop these processes. Create additional control scripts and/or adjust your existing MongoDB configuration and control script as needed to control these processes. 59 Chunk size is 64 megabytes by default, which provides the ideal balance between the most even distribution of data, for which smaller chunk sizes are best, and minimizing chunk migration, for which larger chunk sizes are optimal. 60 Single-tenant systems with SSD or other high performance disks may provide acceptable performance levels for multiple mongod instances. Additionally, you may find that multiple databases with small working sets may function acceptably on a single system. 5.1. Administration Concepts 185
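The chunk-size tradeoff described in the footnote above (smaller chunks distribute data more evenly; larger chunks migrate less often) can be illustrated with back-of-the-envelope arithmetic. The helper below is hypothetical and ignores split points and key distribution, which determine actual chunk counts:

```python
# Sketch: rough chunk-count estimate for a given data size and chunk size.
# Real chunk counts depend on the shard key distribution; this is only
# capacity-planning arithmetic, not how the balancer counts chunks.
import math

def estimated_chunks(data_size_mb, chunk_size_mb=64):
    """Approximate number of chunks needed to hold data_size_mb."""
    return math.ceil(data_size_mb / chunk_size_mb)

print(estimated_chunks(10240))  # ~160 chunks for 10 GB at the 64 MB default
```

Halving the chunk size roughly doubles the chunk count, giving the balancer finer-grained units to move at the cost of more migrations.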
Diagnostic Configurations

The following configuration options control various mongod behaviors for diagnostic purposes. These settings depart from their default values, which are tuned for general production use:

slowms = 50
profile = 2
verbose = true
objcheck = true

Use the base configuration (page 183) and add these options as needed if you are experiencing some unknown issue or performance problem:

• slowOpThresholdMs configures the threshold for mongod to consider a query "slow," for the purposes of the logging system and the database profiler. The default value is 100 milliseconds. Set a lower value if the database profiler does not return useful results, or a higher value to only log the longest-running queries. See Optimization Strategies for MongoDB (page 200) for more information on optimizing operations in MongoDB.
• mode sets the database profiler level. The profiler is not active by default because of the possible impact of the profiler itself on performance. Unless this setting has a value, queries are not profiled.
• verbosity controls the amount of logging output that mongod writes to the log. Only use this option if you are experiencing an issue that is not reflected in the normal logging level.
• wireObjectCheck forces mongod to validate all requests from clients upon receipt. Use this option to ensure that invalid requests are not causing errors, particularly when running a database with untrusted clients. This option may affect database performance.

Import and Export MongoDB Data

This document provides an overview of the import and export programs included in the MongoDB distribution. These tools are useful when you want to back up or export a portion of your data without capturing the state of the entire database, or for simple data ingestion cases. For more complex data migration tasks, you may want to write your own import and export scripts using a client driver to interact with the database itself.
For disaster recovery protection and routine database backup operation, use full database instance backups (page 172).

Warning: Because these tools primarily operate by interacting with a running mongod instance, they can impact the performance of your running database. Not only do these processes create traffic for a running database instance, they also force the database to read all data through memory. When MongoDB reads infrequently used data, it can supplant more frequently accessed data, causing a deterioration in performance for the database's regular workload.

See also: MongoDB Backup Methods (page 172) or the MMS Backup Manual61 for more information on backing up MongoDB instances. Additionally, consider the following references for the MongoDB import/export tools:

• http://docs.mongodb.org/manual/reference/program/mongoimport
• http://docs.mongodb.org/manual/reference/program/mongoexport
• http://docs.mongodb.org/manual/reference/program/mongorestore
• http://docs.mongodb.org/manual/reference/program/mongodump

61https://mms.mongodb.com/help/backup
  • 191. MongoDB Documentation, Release 2.6.4 Data Import, Export, and Backup Operations For resilient and non-disruptive backups, use a file system or block-level disk snapshot function, such as the meth-ods described in the MongoDB Backup Methods (page 172) document. The tools and operations discussed provide functionality that is useful in the context of providing some kinds of backups. In contrast, use import and export tools to backup a small subset of your data or to move data to or from a third party system. These backups may capture a small crucial set of data or a frequently modified section of data for extra insurance, or for ease of access. Warning: mongoimport and mongoexport do not reliably preserve all rich BSON data types because JSON can only represent a subset of the types supported by BSON. As a re-sult, data exported or imported with these tools may lose some measure of fidelity. See http://docs.mongodb.org/manualreference/mongodb-extended-json for more infor-mation. No matter how you decide to import or export your data, consider the following guidelines: • Label files so that you can identify the contents of the export or backup as well as the point in time the ex-port/ backup reflect. • Do not create or apply exports if the backup process itself will have an adverse effect on a production system. • Make sure that they reflect a consistent data state. Export or backup processes can impact data integrity (i.e. type fidelity) and consistency if updates continue during the backup process. • Test backups and exports by restoring and importing to ensure that the backups are useful. Human Intelligible Import/Export Formats This section describes a process to import/export a collection to a file in a JSON or CSV format. The examples in this section use the MongoDB tools http://docs.mongodb.org/manualreference/program/mongoimport and http://docs.mongodb.org/manualreference/program/mongoexport. 
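Returning to the first guideline above (label files with their contents and point in time), one simple way to encode both in an export filename is sketched below. The naming convention and the export_filename helper are hypothetical, not a MongoDB convention:

```python
# Sketch: build a self-describing export filename that records the
# database, collection, and UTC timestamp of the export.
from datetime import datetime

def export_filename(db, collection, when):
    """Return '<db>.<collection>.<UTC timestamp>.json'."""
    return "{}.{}.{}.json".format(db, collection, when.strftime("%Y%m%dT%H%M%SZ"))

print(export_filename("sales", "contacts", datetime(2014, 9, 16, 8, 30, 0)))
# sales.contacts.20140916T083000Z.json
```

A name like this makes it obvious, months later, which collection a file holds and which point in time it reflects, which is exactly what the guideline asks for.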
These tools may also be useful for importing data into a MongoDB database from third-party applications. If you want to simply copy a database or collection from one instance to another, consider using the copydb, clone, or cloneCollection commands, which may be more suited to this task. The mongo shell provides the db.copyDatabase() method.

Warning: mongoimport and mongoexport do not reliably preserve all rich BSON data types because JSON can only represent a subset of the types supported by BSON. As a result, data exported or imported with these tools may lose some measure of fidelity. See http://docs.mongodb.org/manual/reference/mongodb-extended-json for more information.

Collection Export with mongoexport

With the mongoexport utility you can create a backup file. In the most simple invocation, the command takes the following form:

mongoexport --collection collection --out collection.json

This will export all documents in the collection named collection into the file collection.json. Without the output specification (i.e. "--out collection.json"), mongoexport writes output to standard output (i.e. "stdout"). You can further narrow the results by supplying a query filter using the "--query" option and limit results to a single database using the "--db" option. For instance:
mongoexport --db sales --collection contacts --query '{"field": 1}'

This command returns all documents in the sales database's contacts collection with a field named field with a value of 1. Enclose the query in single quotes (e.g. ') to ensure that it does not interact with your shell environment. The resulting documents will return on standard output. By default, mongoexport returns one JSON document per MongoDB document. Specify the "--jsonArray" argument to return the export as a single JSON array. Use the "--csv" option to return the result in CSV (comma separated values) format. If your mongod instance is not running, you can use the "--dbpath" option to specify the location of your MongoDB instance's database files. See the following example:

mongoexport --db sales --collection contacts --dbpath /srv/MongoDB/

This reads the data files directly and locks the data directory to prevent conflicting writes. The mongod process must not be running or attached to these data files when you run mongoexport in this configuration. The "--host" and "--port" options allow you to specify a non-local host to connect to capture the export. Consider the following example:

mongoexport --host mongodb1.example.net --port 37017 --username user --password pass --collection contacts

On any mongoexport command you may, as above, specify username and password credentials.

Warning: mongoimport and mongoexport do not reliably preserve all rich BSON data types because JSON can only represent a subset of the types supported by BSON. As a result, data exported or imported with these tools may lose some measure of fidelity. See http://docs.mongodb.org/manual/reference/mongodb-extended-json for more information.

Collection Import with mongoimport

To restore a backup taken with mongoexport, use mongoimport. Most of the arguments to mongoexport also exist for mongoimport.
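One mongoimport behavior worth understanding for CSV and TSV imports is blank-field handling: with --ignoreBlanks, empty fields are omitted from the resulting documents rather than stored as empty strings. A Python sketch of that idea (csv_to_docs is an illustrative helper, not part of the MongoDB tools):

```python
# Sketch: mimic the effect of mongoimport --ignoreBlanks on CSV rows —
# blank fields are dropped from each document instead of kept as "".
import csv
import io

def csv_to_docs(text, ignore_blanks=True):
    """Convert CSV text to a list of document-like dicts."""
    docs = []
    for row in csv.DictReader(io.StringIO(text)):
        if ignore_blanks:
            row = {k: v for k, v in row.items() if v != ""}
        docs.append(row)
    return docs

sample = "name,phone\nAlice,555-0100\nBob,\n"
print(csv_to_docs(sample))  # Bob's document has no 'phone' field
```

Dropping blank fields matters because a stored empty string and a missing field behave differently in queries against the imported collection.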
Consider the following command: mongoimport --collection collection --file collection.json This imports the contents of the file collection.json into the collection named collection. If you do not specify a file with the “--file” option, mongoimport accepts input over standard input (e.g. “stdin.”) If you specify the “--upsert” option, all of mongoimport operations will attempt to update existing documents in the database and insert other documents. This option will cause some performance impact depending on your configuration. You can specify the database option --db to import these documents to a particular database. If your MongoDB instance is not running, use the “--dbpath” option to specify the location of your MongoDB instance’s database files. Consider using the “--journal” option to ensure that mongoimport records its operations in the jour-nal. The mongod process must not be running or attached to these data files when you run mongoimport in this configuration. Use the “--ignoreBlanks” option to ignore blank fields. For CSV and TSV imports, this option provides the desired functionality in most cases: it avoids inserting blank fields in MongoDB documents. Production Notes This page details system configurations that affect MongoDB, especially in production. Note: MongoDB Management Service (MMS)62 is a hosted monitoring service which collects and aggregates diag- 188 Chapter 5. Administration
  • 193. MongoDB Documentation, Release 2.6.4 nostic data to provide insight into the performance and operation of MongoDB deployments. See the MMS Website63 and the MMS documentation64 for more information. Packages MongoDB Be sure you have the latest stable release. All releases are available on the Downloads65 page. This is a good place to verify what is current, even if you then choose to install via a package manager. Always use 64-bit builds for production. The 32-bit build MongoDB offers for test and development environments is not suitable for production deployments as it can store no more than 2GB of data. See the 32-bit limitations page (page 690) for more information. 32-bit builds exist to support use on development machines. Operating Systems MongoDB distributions are currently available for Mac OS X, Linux,Windows Server 2008 R2 64bit, Windows 7 (32 bit and 64 bit), Windows Vista, and Solaris platforms. Note: MongoDB uses the GNU C Library66 (glibc) if available on a system. MongoDB requires version at least glibc-2.12-1.2.el6 to avoid a known bug with earlier versions. For best results use at least version 2.13. Concurrency In earlier versions of MongoDB, all write operations contended for a single readers-writer lock on the MongoDB instance. As of version 2.2, each database has a readers-writer lock that allows concurrent reads access to a database, but gives exclusive access to a single write operation per database. See the Concurrency (page 702) page for more information. Journaling MongoDB uses write ahead logging to an on-disk journal to guarantee that MongoDB is able to quickly recover the write operations (page 67) following a crash or other serious failure. In order to ensure that mongod will be able to recover its data files and keep the data files in a valid state following a crash, leave journaling enabled. See Journaling (page 275) for more information. 
Networking Use Trusted Networking Environments Always run MongoDB in a trusted environment, with network rules that prevent access from all unknown machines, systems, and networks. As with any sensitive system dependent on network access, your MongoDB deployment should only be accessible to specific systems that require access, such as application servers, monitoring services, and other MongoDB components. Note: By default, authorization is not enabled and mongod assumes a trusted environment. You can enable security/auth (page 281) mode if you need it. 62http://mms.mongodb.com 63http://mms.mongodb.com/ 64http://mms.mongodb.com/help/ 65http://www.mongodb.org/downloads 66http://www.gnu.org/software/libc/ 5.1. Administration Concepts 189
  • 194. MongoDB Documentation, Release 2.6.4 See documents in the Security Section (page 279) for additional information, specifically: • Configuration Options (page 288) • Firewalls (page 289) • Network Security Tutorials (page 297) ForWindows users, consider theWindows Server Technet Article on TCP Configuration67 when deploying MongoDB on Windows. Connection Pools To avoid overloading the connection resources of a single mongod or mongos instance, ensure that clients maintain reasonable connection pool sizes. The connPoolStats database command returns information regarding the number of open connections to the current database for mongos instances and mongod instances in sharded clusters. Hardware Considerations MongoDB is designed specifically with commodity hardware in mind and has few hardware requirements or limita-tions. MongoDB’s core components run on little-endian hardware, primarily x86/x86_64 processors. Client libraries (i.e. drivers) can run on big or little endian systems. Hardware Requirements and Limitations The hardware for the most effective MongoDB deployments have the following properties: Allocate Sufficient RAM and CPU As with all software, more RAM and a faster CPU clock speed are important for performance. In general, databases are not CPU bound. As such, increasing the number of cores can help, but does not provide significant marginal return. Use Solid State Disks (SSDs) MongoDB has good results and a good price-performance ratio with SATA SSD (Solid State Disk). Use SSD if available and economical. Spinning disks can be performant, but SSDs’ capacity for random I/O operations works well with the update model of mongod. Commodity (SATA) spinning drives are often a good option, as the random I/O performance increase with more expensive spinning drives is not that dramatic (only on the order of 2x). Using SSDs or increasing RAM may be more effective in increasing I/O throughput. 
Avoid Remote File Systems • Remote file storage can create performance problems in MongoDB. See Remote Filesystems (page 191) for more information about storage and MongoDB. 67http://technet.microsoft.com/en-us/library/dd349797.aspx 190 Chapter 5. Administration
  • 195. MongoDB Documentation, Release 2.6.4 MongoDB andNUMAHardware Important: The discussion of NUMA in this section only applies to Linux systems with multiple physical processors, and therefore does not affect deployments where mongod instances run on other UNIX-like systems, on Windows, or on a Linux system with only one physical processor. Running MongoDB on a system with Non-Uniform Access Memory (NUMA) can cause a number of operational problems, including slow performance for periods of time or high system process usage. When running MongoDB on NUMA hardware, you should disable NUMA for MongoDB and instead set an interleave memory policy. Note: MongoDB version 2.0 and greater checks these settings on start up when deployed on a Linux-based system, and prints a warning if the system is NUMA-based. To disable NUMA for MongoDB and set an interleave memory policy, use the numactl command and start mongod in the following manner: numactl --interleave=all /usr/bin/local/mongod Then, disable zone reclaim in the proc settings using the following command: echo 0 > /proc/sys/vm/zone_reclaim_mode To fully disable NUMA, you must perform both operations. For more information, see the Documentation for /proc/sys/vm/*68. See The MySQL “swap insanity” problem and the effects of NUMA69 post, which describes the effects of NUMA on databases. This blog post addresses the impact of NUMA for MySQL, but the issues for MongoDB are similar. The post introduces NUMA and its goals, and illustrates how these goals are not compatible with production databases. Disk and Storage Systems Swap Assign swap space for your systems. Allocating swap space can avoid issues with memory contention and can prevent the OOM Killer on Linux systems from killing mongod. The method mongod uses to map memory files to memory ensures that the operating system will never store Mon-goDB data in swap space. RAID Most MongoDB deployments should use disks backed by RAID-10. 
RAID-5 and RAID-6 do not typically provide sufficient performance to support a MongoDB deployment. Avoid RAID-0 with MongoDB deployments. While RAID-0 provides good write performance, it also provides limited availability and can lead to reduced performance on read operations, particularly when using Amazon’s EBS volumes. Remote Filesystems The Network File System protocol (NFS) is not recommended for use with MongoDB as some versions perform poorly. Performance problems arise when both the data files and the journal files are hosted on NFS. You may experience better performance if you place the journal on local or iscsi volumes. If you must use NFS, add the following NFS options to your /etc/fstab file: bg, nolock, and noatime. 68http://www.kernel.org/doc/Documentation/sysctl/vm.txt 69http://jcole.us/blog/archives/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/ 5.1. Administration Concepts 191
  • 196. MongoDB Documentation, Release 2.6.4 Separate Components onto Different Storage Devices For improved performance, consider separating your database’s data, journal, and logs onto different storage devices, based on your application’s access and write pat-tern. Note: This will affect your ability to create snapshot-style backups of your data, since the files will be on different devices and volumes. Scheduling for Virtual Devices Local block devices attached to virtual machine instances via the hypervisor should use a noop scheduler for best performance. The noop scheduler allows the operating system to defer I/O scheduling to the underlying hypervisor. Architecture Write Concern Write concern describes the guarantee that MongoDB provides when reporting on the success of a write operation. The strength of the write concerns determine the level of guarantee. When inserts, updates and deletes have a weak write concern, write operations return quickly. In some failure cases, write operations issued with weak write concerns may not persist. With stronger write concerns, clients wait after sending a write operation for MongoDB to confirm the write operations. MongoDB provides different levels of write concern to better address the specific needs of applications. Clients may adjust write concern to ensure that the most important operations persist successfully to an entire MongoDB deployment. For other less critical operations, clients can adjust the write concern to ensure faster performance rather than ensure persistence to the entire deployment. See the Write Concern (page 72) document for more information about choosing an appropriate write concern level for your deployment. Replica Sets See the Replica Set Architectures (page 516) document for an overview of architectural considerations for replica set deployments. 
Sharded Clusters

See the Sharded Cluster Production Architecture (page 618) document for an overview of recommended sharded cluster architectures for production deployments.

Platforms

MongoDB on Linux

Important: The following discussion only applies to Linux, and therefore does not affect deployments where mongod instances run on other UNIX-like systems or on Windows.

Kernel and File Systems

When running MongoDB in production on Linux, it is recommended that you use Linux kernel version 2.6.36 or later. MongoDB preallocates its database files before using them and often creates large files. As such, you should use the Ext4 or XFS file systems:

• In general, if you use the Ext4 file system, use at least version 2.6.23 of the Linux Kernel.
• In general, if you use the XFS file system, use at least version 2.6.25 of the Linux Kernel.
• Some Linux distributions require different versions of the kernel to support using ext4 and/or xfs:
Linux Distribution                  Filesystem   Kernel Version
CentOS 5.5                          ext4, xfs    2.6.18-194.el5
CentOS 5.6                          ext4, xfs    2.6.18-238.el5
CentOS 5.8                          ext4, xfs    2.6.18-308.8.2.el5
CentOS 6.1                          ext4, xfs    2.6.32-131.0.15.el6.x86_64
RHEL 5.6                            ext4         2.6.18-238
RHEL 6.0                            xfs          2.6.32-71
Ubuntu 10.04.4 LTS                  ext4, xfs    2.6.32-38-server
Amazon Linux AMI release 2012.03    ext4         3.2.12-3.2.4.amzn1.x86_64

Important: MongoDB requires a filesystem that supports fsync() on directories. For example, HGFS and VirtualBox's shared folders do not support this operation.

Recommended Configuration

• Turn off atime for the storage volume containing the database files.
• Set the file descriptor limit, -n, and the user process limit (ulimit), -u, above 20,000, according to the suggestions in the ulimit (page 266) document. A low ulimit will affect MongoDB when under heavy use and can produce errors, lead to failed connections to MongoDB processes, and cause loss of service.
• Disable transparent huge pages, as MongoDB performs better with normal (4096-byte) virtual memory pages.
• Disable NUMA in your BIOS. If that is not possible, see MongoDB on NUMA Hardware (page 191).
• Ensure that readahead settings for the block devices that store the database files are appropriate. For random access use patterns, set low readahead values. A readahead of 32 (16 KB) often works well. For a standard block device, you can run sudo blockdev --report to get the readahead settings and sudo blockdev --setra <value> <device> to change the readahead settings. Refer to your specific operating system manual for more information.
• Use the Network Time Protocol (NTP) to synchronize time among your hosts. This is especially important in sharded clusters.

MongoDB on Virtual Environments

This section describes considerations for running MongoDB in some of the more common virtual environments. For all platforms, consider Scheduling for Virtual Devices (page 192).
EC2

MongoDB is compatible with EC2 and requires no configuration changes specific to the environment. You may alternately choose to obtain a set of Amazon Machine Images (AMIs) that bundle together MongoDB and Amazon's Provisioned IOPS storage volumes. Provisioned IOPS can greatly increase MongoDB's performance and ease of use. For more information, see this blog post [70].

VMware

MongoDB is compatible with VMware. Because some users have run into issues with VMware's memory overcommit feature, disabling the feature is recommended.

It is possible to clone a virtual machine running MongoDB. You might use this function to spin up a new virtual host to add as a member of a replica set. If you clone a VM with journaling enabled, the clone snapshot will be valid. If not using journaling, first stop mongod, then clone the VM, and finally restart mongod.

[70] http://www.mongodb.com/blog/post/provisioned-iops-aws-marketplace-significantly-boosts-mongodb-performance-ease-use
OpenVZ

Some users have had issues when running MongoDB on some older versions of OpenVZ due to its handling of virtual memory, as with VMware. This issue seems to have been resolved in more recent versions of OpenVZ.

Performance Monitoring

iostat

On Linux, use the iostat command to check if disk I/O is a bottleneck for your database. Specify a number of seconds when running iostat to avoid displaying stats covering the time since server boot. For example, the following command will display extended statistics and the time for each displayed report, with traffic in MB/s, at one-second intervals:

iostat -xmt 1

Key fields from iostat:

• %util: this is the most useful field for a quick check; it indicates what percent of the time the device/drive is in use.
• avgrq-sz: average request size. Smaller numbers for this value reflect more random I/O operations.

bwm-ng

bwm-ng [71] is a command-line tool for monitoring network use. If you suspect a network-based bottleneck, you may use bwm-ng to begin your diagnostic process.

Backups

To make backups of your MongoDB database, refer to MongoDB Backup Methods Overview (page 172).

5.1.2 Data Management

These documents introduce data management practices and strategies for MongoDB deployments, including strategies for managing multi-data center deployments, managing larger file stores, and data lifecycle tools.

Data Center Awareness (page 194) Presents the MongoDB features that allow application developers and database administrators to configure their deployments to be more data center aware or allow operational and location-based separation.

Capped Collections (page 196) Capped collections provide a special type of size-constrained collection that preserves insertion order and can support high-volume inserts.
Expire Data from Collections by Setting TTL (page 198) TTL collections make it possible to automatically remove data from a collection based on the value of a timestamp, and are useful for managing data, such as machine-generated event data, that is only useful for a limited period of time.

Data Center Awareness

MongoDB provides a number of features that allow application developers and database administrators to customize the behavior of a sharded cluster or replica set deployment so that MongoDB may be more "data center aware," or allow operational and location-based separation.

[71] http://www.gropp.org/?id=projects&sub=bwm-ng
MongoDB also supports segregation based on functional parameters, to ensure that certain mongod instances are only used for reporting workloads or that certain high-frequency portions of a sharded collection only exist on specific shards.

The following documents, found either in this section or in other sections of this manual, provide information on customizing a deployment for operation- and location-based separation:

Operational Segregation in MongoDB Deployments (page 195) MongoDB lets you specify that certain application operations use certain mongod instances.

Tag Aware Sharding (page 671) Tags associate specific ranges of shard key values with specific shards for use in managing deployment patterns.

Manage Shard Tags (page 672) Use tags to associate specific ranges of shard key values with specific shards.

Operational Segregation in MongoDB Deployments

Operational Overview

MongoDB includes a number of features that allow database administrators and developers to segregate application operations to MongoDB deployments by functional or geographical groupings. This capability provides "data center awareness," which allows applications to target MongoDB deployments with consideration of the physical location of the mongod instances. MongoDB supports segmentation of operations across different dimensions, which may include multiple data centers and geographical regions in multi-data center deployments, or racks, networks, and power circuits in single data center deployments.

MongoDB also supports segregation of database operations based on functional or operational parameters, to ensure that certain mongod instances are only used for reporting workloads or that certain high-frequency portions of a sharded collection only exist on specific shards.

Specifically, with MongoDB, you can:

• ensure write operations propagate to specific members of a replica set, or to specific members of replica sets.
• ensure that specific members of a replica set respond to queries.
• ensure that specific ranges of your shard key balance onto and reside on specific shards.
• combine the above features in a single distributed deployment, on a per-operation basis (for read and write operations) and a per-collection basis (for chunk distribution in sharded clusters).

For full documentation of these features, see the following documentation in the MongoDB Manual:

• Read Preferences (page 530), which controls how drivers help applications target read operations to members of a replica set.
• Write Concerns (page 72), which controls how MongoDB ensures that write operations propagate to members of a replica set.
• Replica Set Tags (page 576), which control how applications create and interact with custom groupings of replica set members to create custom application-specific read preferences and write concerns.
• Tag Aware Sharding (page 671), which allows MongoDB administrators to define an application-specific balancing policy, to control how documents belonging to specific ranges of a shard key distribute to shards in the sharded cluster.

See also:

Before adding operational segregation features to your application and MongoDB deployment, become familiar with all documentation of replication (page 503) and sharding (page 607).
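As a concrete illustration of targeting reads, a driver connection string can direct queries to secondary members via the standard readPreference URI option. The host names and replica set name below are hypothetical placeholders, not values from this manual:

```
mongodb://db0.example.net:27017,db1.example.net:27017/?replicaSet=rs0&readPreference=secondaryPreferred
```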
Further Reading

• The Write Concern (page 72) and Read Preference (page 530) documents, which address capabilities related to data center awareness.
• Deploy a Geographically Redundant Replica Set (page 550).

Capped Collections

Capped collections are fixed-size collections that support high-throughput operations that insert, retrieve, and delete documents based on insertion order. Capped collections work in a way similar to circular buffers: once a collection fills its allocated space, it makes room for new documents by overwriting the oldest documents in the collection.

See createCollection() or create for more information on creating capped collections.

Capped collections have the following behaviors:

• Capped collections guarantee preservation of the insertion order. As a result, queries do not need an index to return documents in insertion order. Without this indexing overhead, they can support higher insertion throughput.
• Capped collections guarantee that insertion order is identical to the order on disk (natural order) and do so by prohibiting updates that increase document size. Capped collections only allow updates that fit the original document size, which ensures a document does not change its location on disk.
• Capped collections automatically remove the oldest documents in the collection without requiring scripts or explicit remove operations.

For example, the oplog.rs collection that stores a log of the operations in a replica set uses a capped collection.

Consider the following potential use cases for capped collections:

• Store log information generated by high-volume systems. Inserting documents in a capped collection without an index is close to the speed of writing log information directly to a file system. Furthermore, the built-in first-in-first-out property maintains the order of events, while managing storage use.
• Cache small amounts of data in a capped collection.
Since caches are read-heavy rather than write-heavy, you would either need to ensure that this collection always remains in the working set (i.e. in RAM) or accept some write penalty for the required index or indexes.

Recommendations and Restrictions

• You can only make in-place updates of documents. If an update operation causes a document to grow beyond its original size, the update operation will fail. If you plan to update documents in a capped collection, create an index so that these update operations do not require a table scan.
• If you update a document in a capped collection to a size smaller than its original size, and then a secondary resyncs from the primary, the secondary will replicate and allocate space based on the current smaller document size. If the primary then receives an update which increases the document back to its original size, the primary will accept the update but the secondary will fail with a failing update: objects in a capped ns cannot grow error message.
To prevent this error, create your secondary from a snapshot of one of the other up-to-date members of the replica set. Follow our tutorial on filesystem snapshots (page 229) to seed your new secondary. Seeding the secondary with a filesystem snapshot is the only way to guarantee the primary and secondary binary files are compatible. MMS Backup snapshots are insufficient in this situation, since you need more than the content of the secondary to match the primary.
• You cannot delete documents from a capped collection. To remove all records from a capped collection, use the emptycapped command. To remove the collection entirely, use the drop() method.
• You cannot shard a capped collection.
• Capped collections created after 2.2 have an _id field and an index on the _id field by default. Capped collections created before 2.2 do not have an index on the _id field by default. If you are using capped collections with replication prior to 2.2, you should explicitly create an index on the _id field.

Warning: If you have a capped collection in a replica set outside of the local database, before 2.2, you should create a unique index on _id. Ensure uniqueness using the unique: true option to the ensureIndex() method or by using an ObjectId for the _id field. Alternately, you can pass the autoIndexId option to the create command when creating the capped collection, as in the Query a Capped Collection (page 197) procedure.

• Use natural ordering to retrieve the most recently inserted elements from the collection efficiently. This is (somewhat) analogous to tail on a log file.
• The aggregation pipeline operator $out cannot write results to a capped collection.

Procedures

Create a Capped Collection

You must create capped collections explicitly using the createCollection() method, which is a helper in the mongo shell for the create command. When creating a capped collection you must specify the maximum size of the collection in bytes, which MongoDB will pre-allocate for the collection. The size of the capped collection includes a small amount of space for internal overhead.
db.createCollection( "log", { capped: true, size: 100000 } )

Additionally, you may also specify a maximum number of documents for the collection using the max field, as in the following document:

db.createCollection( "log", { capped: true, size: 5242880, max: 5000 } )

Important: The size argument is always required, even when you specify the max number of documents. MongoDB will remove older documents if a collection reaches the maximum size limit before it reaches the maximum document count.

See createCollection() and create.

Query a Capped Collection

If you perform a find() on a capped collection with no ordering specified, MongoDB guarantees that the ordering of results is the same as the insertion order.

To retrieve documents in reverse insertion order, issue find() along with the sort() method with the $natural parameter set to -1, as shown in the following example:

db.cappedCollection.find().sort( { $natural: -1 } )

Check if a Collection is Capped

Use the isCapped() method to determine if a collection is capped, as follows:

db.collection.isCapped()

5.1. Administration Concepts 197
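The circular-buffer semantics described above (insertion order preserved, oldest documents evicted once the cap is reached, reverse retrieval via $natural: -1) can be sketched in standalone JavaScript. This is an illustration of the behavior only, not MongoDB's implementation; the class name and the use of a document-count cap rather than a byte-size cap are simplifications of this sketch:

```javascript
// Sketch of capped-collection semantics: documents are kept in
// insertion (natural) order, and the oldest are evicted once the
// collection reaches its cap.
class CappedCollectionSketch {
  constructor(maxDocs) {
    this.maxDocs = maxDocs; // analogous to the `max` option above
    this.docs = [];         // stored in insertion order
  }
  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.maxDocs) {
      this.docs.shift();    // evict the oldest document, like a circular buffer
    }
  }
  // find() with no ordering returns insertion order;
  // { natural: -1 } mirrors sort( { $natural: -1 } ).
  find({ natural = 1 } = {}) {
    return natural === -1 ? [...this.docs].reverse() : [...this.docs];
  }
}

const log = new CappedCollectionSketch(3);
[1, 2, 3, 4, 5].forEach(n => log.insert({ event: n }));
console.log(log.find().map(d => d.event));                // [ 3, 4, 5 ]
console.log(log.find({ natural: -1 }).map(d => d.event)); // [ 5, 4, 3 ]
```

Note that, as in a real capped collection, the two oldest inserts are gone once the cap of three is exceeded, and no index is needed to recover insertion order.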
Convert a Collection to Capped

You can convert a non-capped collection to a capped collection with the convertToCapped command:

db.runCommand( { convertToCapped: "mycoll", size: 100000 } )

The size parameter specifies the size of the capped collection in bytes.

Warning: This command obtains a global write lock and will block other operations until it has completed.

Changed in version 2.2: Before 2.2, capped collections did not have an index on _id unless you specified autoIndexId to the create command; after 2.2, this became the default.

Automatically Remove Data After a Specified Period of Time

For additional flexibility when expiring data, consider MongoDB's TTL indexes, as described in Expire Data from Collections by Setting TTL (page 198). These indexes allow you to expire and remove data from normal collections using a special index type, based on the value of a date-typed field and a TTL value for the index.

TTL collections (page 198) are not compatible with capped collections.

Tailable Cursor

You can use a tailable cursor with capped collections. Similar to the Unix tail -f command, the tailable cursor "tails" the end of a capped collection. As new documents are inserted into the capped collection, you can use the tailable cursor to continue retrieving documents.

See Create Tailable Cursor (page 109) for information on creating a tailable cursor.

Expire Data from Collections by Setting TTL

New in version 2.2.

This document provides an introduction to MongoDB's "time to live" or "TTL" collection feature. TTL collections make it possible to store data in MongoDB and have the mongod automatically remove data after a specified number of seconds or at a specific clock time. Data expiration is useful for some classes of information, including machine-generated event data, logs, and session information that only need to persist for a limited period of time.

A special index type supports the implementation of TTL collections.
TTL relies on a background thread in mongod that reads the date-typed values in the index and removes expired documents from the collection.

Considerations

• The _id field does not support TTL indexes.
• You cannot create a TTL index on a field that already has an index.
• A document will not expire if the indexed field does not exist.
• A document will not expire if the indexed field is not a date BSON type or an array of date BSON types.
• The TTL index may not be compound (may not have multiple fields).
• If the TTL field holds an array, and there are multiple date-typed values in the index, the document will expire when the lowest (i.e. earliest) date matches the expiration threshold.
• You cannot create a TTL index on a capped collection, because MongoDB cannot remove documents from a capped collection.
• You cannot use ensureIndex() to change the value of expireAfterSeconds. Instead, use the collMod database command in conjunction with the index collection flag.
• When you build a TTL index in the background (page 460), the TTL thread can begin deleting documents while the index is building. If you build a TTL index in the foreground, MongoDB begins removing expired documents as soon as the index finishes building.

When the TTL thread is active, you will see delete (page 67) operations in the output of db.currentOp() or in the data collected by the database profiler (page 210).

When using TTL indexes on replica sets, the TTL background thread only deletes documents on primary members. However, the TTL background thread does run on secondaries. Secondary members replicate deletion operations from the primary.

The TTL index does not guarantee that expired data will be deleted immediately. There may be a delay between the time a document expires and the time that MongoDB removes the document from the database. The background task that removes expired documents runs every 60 seconds. As a result, documents may remain in a collection after they expire but before the background task runs or completes. The duration of the removal operation depends on the workload of your mongod instance. Therefore, expired data may exist for some time beyond the 60-second period between runs of the background task.

All collections with an index using the expireAfterSeconds option have usePowerOf2Sizes enabled. Users cannot modify this setting. As a result of enabling usePowerOf2Sizes, MongoDB must allocate more disk space relative to data size. This approach helps mitigate the possibility of storage fragmentation caused by frequent delete operations and leads to more predictable storage use patterns.

Procedures

To enable TTL for a collection, use the ensureIndex() method to create a TTL index, as shown in the examples below.
With the exception of the background thread, a TTL index supports queries in the same way normal indexes do. You can use TTL indexes to expire documents in one of two ways:

• Remove documents a certain number of seconds after creation. The index will support queries for the creation time of the documents.
• Alternately, specify an explicit expiration time. The index will support queries for the expiration time of the document.

Expire Documents after a Certain Number of Seconds

To expire data after a certain number of seconds, create a TTL index on a field that holds values of BSON date type or an array of BSON date-typed objects, and specify a positive non-zero value in the expireAfterSeconds field. A document will expire when the number of seconds in the expireAfterSeconds field has passed since the time specified in its indexed field. [72]

For example, the following operation creates an index on the log_events collection's createdAt field and specifies an expireAfterSeconds value of 3600 to set the expiration time to be one hour after the time specified by createdAt:

db.log_events.ensureIndex( { "createdAt": 1 }, { expireAfterSeconds: 3600 } )

When adding documents to the log_events collection, set the createdAt field to the current time:

[72] If the field contains an array of BSON date-typed objects, data expires if at least one of the BSON date-typed objects is older than the number of seconds specified in expireAfterSeconds.
db.log_events.insert( { "createdAt": new Date(), "logEvent": 2, "logMessage": "Success!" } )

MongoDB will automatically delete documents from the log_events collection when a document's createdAt value is older than the number of seconds specified in expireAfterSeconds.

See also:
$currentDate operator

Expire Documents at a Certain Clock Time

To expire documents at a certain clock time, begin by creating a TTL index on a field that holds values of BSON date type or an array of BSON date-typed objects, and specify an expireAfterSeconds value of 0. For each document in the collection, set the indexed date field to a value corresponding to the time the document should expire. If the indexed date field contains a date in the past, MongoDB considers the document expired.

For example, the following operation creates an index on the log_events collection's expireAt field and specifies an expireAfterSeconds value of 0:

db.log_events.ensureIndex( { "expireAt": 1 }, { expireAfterSeconds: 0 } )

For each document, set the value of expireAt to correspond to the time the document should expire. For instance, the following insert() operation adds a document that should expire at July 22, 2013 14:00:00:

db.log_events.insert( { "expireAt": new Date('July 22, 2013 14:00:00'), "logEvent": 2, "logMessage": "Success!" } )

MongoDB will automatically delete documents from the log_events collection when the documents' expireAt value is older than the number of seconds specified in expireAfterSeconds, i.e. 0 seconds older in this case. As such, the data expires at the specified expireAt value.

5.1.3 Optimization Strategies for MongoDB

There are many factors that can affect database performance and responsiveness, including index use, query structure, data models, and application design, as well as operational factors such as architecture and system configuration.
This section describes techniques for optimizing application performance with MongoDB.

Evaluate Performance of Current Operations (page 201) MongoDB provides introspection tools that describe the query execution process, to allow users to test queries and build more efficient queries.

Use Capped Collections for Fast Writes and Reads (page 201) Outlines a use case for Capped Collections (page 196) to optimize certain data ingestion work flows.

Optimize Query Performance (page 202) Introduces the use of projections (page 57) to reduce the amount of data MongoDB must send to clients.

Design Notes (page 203) A collection of notes related to the architecture, design, and administration of MongoDB-based applications.
Evaluate Performance of Current Operations

The following sections describe techniques for evaluating operational performance.

Use the Database Profiler to Evaluate Operations Against the Database

MongoDB provides a database profiler that shows performance characteristics of each operation against the database. Use the profiler to locate any queries or write operations that are running slowly. You can use this information, for example, to determine what indexes to create. For more information, see Database Profiling (page 180).

Use db.currentOp() to Evaluate mongod Operations

The db.currentOp() method reports on current operations running on a mongod instance.

Use $explain to Evaluate Query Performance

The explain() method returns statistics on a query, and reports the index MongoDB selected to fulfill the query, as well as information about the internal operation of the query.

Example
To use explain() on a query for documents matching the expression { a: 1 } in the collection named records, use an operation that resembles the following in the mongo shell:

db.records.find( { a: 1 } ).explain()

Use Capped Collections for Fast Writes and Reads

Use Capped Collections for Fast Writes

Capped Collections (page 196) are circular, fixed-size collections that keep documents well-ordered, even without the use of an index. This means that capped collections can receive very high-speed writes and sequential reads. These collections are particularly useful for keeping log files but are not limited to that purpose. Use capped collections where appropriate.

Use Natural Order for Fast Reads

To return documents in the order they exist on disk, return sorted operations using the $natural operator. On a capped collection, this also returns the documents in the order in which they were written. Natural order does not use indexes but can be fast for operations when you want to select the first or last items on disk.
See also:
sort() and limit().
Optimize Query Performance

Create Indexes to Support Queries

For commonly issued queries, create indexes (page 431). If a query searches multiple fields, create a compound index (page 440). Scanning an index is much faster than scanning a collection. Index structures are smaller than the documents they reference, and they store references in order.

Example
If you have a posts collection containing blog posts, and you regularly issue a query that sorts on the author_name field, then you can optimize the query by creating an index on the author_name field:

db.posts.ensureIndex( { author_name : 1 } )

Indexes also improve efficiency on queries that routinely sort on a given field.

Example
If you regularly issue a query that sorts on the timestamp field, then you can optimize the query by creating an index on the timestamp field. Creating this index:

db.posts.ensureIndex( { timestamp : 1 } )

optimizes this query:

db.posts.find().sort( { timestamp : -1 } )

Because MongoDB can read indexes in both ascending and descending order, the direction of a single-key index does not matter.

Indexes support queries, update operations, and some phases of the aggregation pipeline (page 393).

Index keys that are of the BinData type are more efficiently stored in the index if:

• the binary subtype value is in the range of 0-7 or 128-135, and
• the length of the byte array is: 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, or 32.

Limit the Number of Query Results to Reduce Network Demand

MongoDB cursors return results in groups of multiple documents. If you know the number of results you want, you can reduce the demand on network resources by issuing the limit() method. This is typically used in conjunction with sort operations.
For example, if you need only 10 results from your query to the posts collection, you would issue the following command:

db.posts.find().sort( { timestamp : -1 } ).limit(10)

For more information on limiting results, see limit().

Use Projections to Return Only Necessary Data

When you need only a subset of fields from documents, you can achieve better performance by returning only the fields you need:
For example, if in your query to the posts collection you need only the timestamp, title, author, and abstract fields, you would issue the following command:

db.posts.find( {}, { timestamp : 1 , title : 1 , author : 1 , abstract : 1 } ).sort( { timestamp : -1 } )

For more information on using projections, see Limit Fields to Return from a Query (page 94).

Use $hint to Select a Particular Index

In most cases the query optimizer (page 61) selects the optimal index for a specific operation; however, you can force MongoDB to use a specific index using the hint() method. Use hint() to support performance testing, or on some queries where you must select a field or a field included in several indexes.

Use the Increment Operator to Perform Operations Server-Side

Use MongoDB's $inc operator to increment or decrement values in documents. The operator increments the value of the field on the server side, as an alternative to selecting a document, making simple modifications in the client, and then writing the entire document to the server. The $inc operator can also help avoid race conditions, which would result when two application instances queried for a document, manually incremented a field, and saved the entire document back at the same time.

Design Notes

This page details features of MongoDB that may be important to bear in mind when designing your applications.

Schema Considerations

Dynamic Schema

Data in MongoDB has a dynamic schema. Collections do not enforce document structure. This facilitates iterative development and polymorphism. Nevertheless, collections often hold documents with highly homogeneous structures. See Data Modeling Concepts (page 133) for more information.
Some operational considerations include:

• the exact set of collections to be used;
• the indexes to be used: with the exception of the _id index, all indexes must be created explicitly;
• shard key declarations: choosing a good shard key is very important, as the shard key cannot be changed once set.

Avoid importing unmodified data directly from a relational database. In general, you will want to "roll up" certain data into richer documents that take advantage of MongoDB's support for sub-documents and nested arrays.

Case Sensitive Strings

MongoDB strings are case sensitive. So a search for "joe" will not find "Joe". Consider:

• storing data in a normalized case format, or
• using case-insensitive regular expressions (ending with the i option), and/or
• using $toLower or $toUpper in the aggregation framework (page 391).
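The case-sensitivity behavior and the suggested workarounds can be demonstrated with plain JavaScript string matching, which mirrors the byte-wise way strings compare; this is a standalone illustration, not driver or server code:

```javascript
// MongoDB string matching is case sensitive: a query for "joe" does
// not match a stored value of "Joe". The usual workarounds are a
// case-insensitive regular expression or comparing normalized forms.
const stored = 'Joe';

console.log(stored === 'joe');                             // false: exact match fails
console.log(/^joe$/i.test(stored));                        // true:  /i regex matches
console.log(stored.toLowerCase() === 'joe'.toLowerCase()); // true:  normalized case
```

Note that normalizing case at write time (the first bullet above) lets you keep exact-match queries, which can use an index, whereas a case-insensitive regular expression generally cannot use an index efficiently.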
Type Sensitive Fields

MongoDB data is stored in the BSON [73] format, a binary-encoded serialization of JSON-like documents. BSON encodes additional type information. See bsonspec.org [74] for more information.

Consider the following document, which has a field x with the string value "123":

{ x : "123" }

Then the following query, which looks for a number value 123, will not return that document:

db.mycollection.find( { x : 123 } )

General Considerations

By Default, Updates Affect One Document

To update multiple documents that meet your query criteria, set the update multi option to true or 1. See: Update Multiple Documents (page 70).

Prior to MongoDB 2.2, you would specify the upsert and multi options in the update method as positional boolean options. See: the update method reference documentation.

BSON Document Size Limit

The BSON Document Size limit is currently set at 16 MB per document. If you require larger documents, use GridFS (page 138).

No Fully Generalized Transactions

MongoDB does not have fully generalized transactions (page 111). If you model your data using rich documents that closely resemble your application's objects, each logical object will be in one MongoDB document. MongoDB allows you to modify a document in a single atomic operation. This kind of data modification pattern covers most common uses of transactions in other systems.

Replica Set Considerations

Use an Odd Number of Replica Set Members

Replica sets (page 503) perform consensus elections. To ensure that elections will proceed successfully, either use an odd number of members, typically three, or else use an arbiter to ensure an odd number of votes.

Keep Replica Set Members Up-to-Date

MongoDB replica sets support automatic failover (page 523). It is important for your secondaries to be up-to-date. There are various strategies for assessing consistency:

1. Use monitoring tools to alert you to lag events.
See Monitoring for MongoDB (page 175) for a detailed discussion of MongoDB’s monitoring options. 2. Specify appropriate write concern. 3. If your application requires manual failover, you can configure your secondaries as priority 0 (page 512). Priority 0 secondaries require manual action for a failover. This may be practical for a small replica set, but large deployments should fail over automatically. See also: replica set rollbacks (page 527). [73] http://docs.mongodb.org/meta-driver/latest/legacy/bson/ [74] http://bsonspec.org/#/specification 204 Chapter 5. Administration
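The type-sensitive matching described under “Type Sensitive Fields” above can be modeled in a few lines. This toy matcher is an illustration, not the server's implementation: plain JavaScript strict equality stands in for BSON's type-aware comparison.

```javascript
// BSON equality is type-sensitive: the string "123" and the number 123
// are different values. Strict equality (===) never coerces types, so it
// mimics that behavior for this illustration.
function matchesQuery(doc, query) {
  return Object.keys(query).every((field) => doc[field] === query[field]);
}

const doc = { x: "123" };
matchesQuery(doc, { x: 123 });   // false: number does not match string
matchesQuery(doc, { x: "123" }); // true: same type and value
```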
Sharding Considerations • Pick your shard keys carefully. You cannot choose a new shard key for a collection that is already sharded. • Shard key values are immutable. • When enabling sharding on an existing collection, MongoDB imposes a maximum size on those collections to ensure that it is possible to create chunks. For a detailed explanation of this limit, see: <sharding-existing-collection-data-size>. To shard large amounts of data, create a new empty sharded collection, and ingest the data from the source collection using an application-level import operation. • Unique indexes are not enforced across shards except for the shard key itself. See Enforce Unique Keys for Sharded Collections (page 674). • Consider pre-splitting (page 634) a sharded collection before a massive bulk import. 5.2 Administration Tutorials The administration tutorials provide specific step-by-step instructions for performing common MongoDB setup, maintenance, and configuration operations. Configuration, Maintenance, and Analysis (page 205) Describes routine management operations, including configuration and performance analysis. Manage mongod Processes (page 207) Start, configure, and manage running mongod processes. Rotate Log Files (page 214) Archive the current log files and start new ones. Continue reading from Configuration, Maintenance, and Analysis (page 205) for additional tutorials on fundamental MongoDB maintenance procedures. Backup and Recovery (page 229) Outlines procedures for data backup and restoration with mongod instances and deployments. Backup and Restore with Filesystem Snapshots (page 229) An outline of procedures for creating MongoDB data set backups using system-level file snapshot tools, such as LVM or native storage appliance tools. Backup and Restore Sharded Clusters (page 238) Detailed procedures and considerations for backing up sharded clusters and single shards.
Recover Data after an Unexpected Shutdown (page 246) Recover data from MongoDB data files that were not properly closed or have an invalid state. Continue reading from Backup and Recovery (page 229) for additional tutorials on MongoDB backup and recovery procedures. MongoDB Scripting (page 248) An introduction to the scripting capabilities of the mongo shell and the scripting capabilities embedded in MongoDB instances. MongoDB Tutorials (page 225) A complete list of tutorials in the MongoDB Manual that address MongoDB operation and use. 5.2.1 Configuration, Maintenance, and Analysis The following tutorials describe routine management operations, including configuration and performance analysis: Use Database Commands (page 206) The process for running database commands that provide basic database operations.
Manage mongod Processes (page 207) Start, configure, and manage running mongod processes. Terminate Running Operations (page 209) Stop in-progress MongoDB client operations using db.killOp() and maxTimeMS(). Analyze Performance of Database Operations (page 210) Collect data that introspects the performance of query and update operations on a mongod instance. Rotate Log Files (page 214) Archive the current log files and start new ones. Manage Journaling (page 215) Describes the procedures for configuring and managing MongoDB’s journaling system, which allows MongoDB to provide crash resiliency and durability. Store a JavaScript Function on the Server (page 217) Describes how to store JavaScript functions on a MongoDB server. Upgrade to the Latest Revision of MongoDB (page 218) Introduces the basic process for upgrading a MongoDB deployment between different minor release versions. Monitor MongoDB With SNMP on Linux (page 221) The SNMP extension, available in MongoDB Enterprise, allows MongoDB to report data into SNMP traps. Monitor MongoDB Windows with SNMP (page 223) The SNMP extension, available in the Windows build of MongoDB Enterprise, allows MongoDB to report data into SNMP traps. Troubleshoot SNMP (page 224) Outlines common errors and diagnostic processes useful for deploying MongoDB Enterprise with SNMP support. MongoDB Tutorials (page 225) A complete list of tutorials in the MongoDB Manual that address MongoDB operation and use. Use Database Commands The MongoDB command interface provides access to all non-CRUD database operations. Fetching server stats, initializing a replica set, and running a map-reduce job are all accomplished with commands. See http://docs.mongodb.org/manual/reference/command for a list of all commands sorted by function, and a list of all commands sorted alphabetically.
Database Command Form You specify a command by constructing a standard BSON document whose first key is the name of the command. For example, specify the isMaster command using the following BSON document: { isMaster: 1 } Issue Commands The mongo shell provides a helper method for running commands called db.runCommand(). The following operation in mongo runs the above command: db.runCommand( { isMaster: 1 } ) Many drivers provide an equivalent for the db.runCommand() method. Internally, running commands with db.runCommand() is equivalent to a special query against the $cmd collection.
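The convention above can be made concrete: a command is an ordinary document whose first key names the command, and any further keys are its arguments. The helper below is an illustrative sketch, not a MongoDB API.

```javascript
// Extract the command name from a command document: by convention it is
// the document's *first* key; remaining keys are arguments.
function commandName(commandDoc) {
  return Object.keys(commandDoc)[0];
}

commandName({ isMaster: 1 });                      // "isMaster"
commandName({ distinct: "records", key: "city" }); // "distinct"
```

JavaScript preserves insertion order for string keys, which is why this works here; drivers typically use explicitly ordered document types (BSON builders) to guarantee the command name comes first.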
Many common commands have their own shell helpers or wrappers in the mongo shell and drivers, such as the db.isMaster() method in the mongo JavaScript shell. You can use the maxTimeMS option to specify a time limit for the execution of a command; see Terminate a Command (page 210) for more information on operation termination. admin Database Commands You must run some commands on the admin database. Normally, these operations resemble the following: use admin db.runCommand( {buildInfo: 1} ) However, there’s also a command helper that automatically runs the command in the context of the admin database: db._adminCommand( {buildInfo: 1} ) Command Responses All commands return, at minimum, a document with an ok field indicating whether the command succeeded: { 'ok': 1 } Failed commands return the ok field with a value of 0. Manage mongod Processes MongoDB runs as a standard program. You can start MongoDB from a command line by issuing the mongod command and specifying options. For a list of options, see http://docs.mongodb.org/manual/reference/program/mongod. MongoDB can also run as a Windows service. For details, see Configure a Windows Service for MongoDB (page 21). To install MongoDB, see Install MongoDB (page 5). The following examples assume the directory containing the mongod process is in your system path. The mongod process is the primary database process that runs on an individual server. mongos provides a coherent MongoDB interface equivalent to a mongod from the perspective of a client. The mongo binary provides the administrative shell. This page discusses the mongod process; however, some portions of this document may be applicable to mongos instances.
See also: Run-time Database Configuration (page 182), http://docs.mongodb.org/manual/reference/program/mongod, http://docs.mongodb.org/manual/reference/program/mongos, and http://docs.mongodb.org/manual/reference/configuration-options. Start mongod Processes By default, MongoDB stores data in the /data/db directory. On Windows, MongoDB stores data in C:\data\db. On all platforms, MongoDB listens for connections from clients on port 27017. To start MongoDB using all defaults, issue the following command at the system shell:
mongod Specify a Data Directory If you want mongod to store data files at a path other than /data/db you can specify a dbPath. The dbPath must exist before you start mongod. If it does not exist, create the directory and set its permissions so that mongod can read and write data to this path. For more information on permissions, see the security operations documentation. To specify a dbPath for mongod to use as a data directory, use the --dbpath option. The following invocation will start a mongod instance and store data in the /srv/mongodb path: mongod --dbpath /srv/mongodb/ Specify a TCP Port Only a single process can listen for connections on a network interface at a time. If you run multiple mongod processes on a single machine, or have other processes that must use this port, you must assign each a different port to listen on for client connections. To specify a port to mongod, use the --port option on the command line. The following command starts mongod listening on port 12345: mongod --port 12345 Use the default port number when possible, to avoid confusion. Start mongod as a Daemon To run a mongod process as a daemon (i.e. fork), and write its output to a log file, use the --fork and --logpath options. You must create the log directory; however, mongod will create the log file if it does not exist. The following command starts mongod as a daemon and records log output to /var/log/mongodb.log: mongod --fork --logpath /var/log/mongodb.log Additional Configuration Options For an overview of common configurations and examples of configurations for common use cases, see Run-time Database Configuration (page 182). Stop mongod Processes In a clean shutdown a mongod completes all pending operations, flushes all data to data files, and closes all data files. Other shutdowns are unclean and can compromise the validity of the data files.
To ensure a clean shutdown, always shut down mongod instances using one of the following methods: Use shutdownServer() Shut down the mongod from the mongo shell using the db.shutdownServer() method as follows: use admin db.shutdownServer() Calling the same method from a control script accomplishes the same result. For systems with authorization enabled, users may only issue db.shutdownServer() when authenticated to the admin database or via the localhost interface on systems without authentication enabled.
Use --shutdown From the Linux command line, shut down the mongod using the --shutdown option in the following command: mongod --shutdown Use CTRL-C When running the mongod instance in interactive mode (i.e. without --fork), issue Control-C to perform a clean shutdown. Use kill From the Linux command line, shut down a specific mongod instance using the following command: kill <mongod process ID> Warning: Never use kill -9 (i.e. SIGKILL) to terminate a mongod instance. Stop a Replica Set Procedure If the mongod is the primary in a replica set, the shutdown process for these mongod instances has the following steps: 1. Check how up-to-date the secondaries are. 2. If no secondary is within 10 seconds of the primary, mongod will return a message that it will not shut down. You can pass the shutdown command a timeoutSecs argument to wait for a secondary to catch up. 3. If there is a secondary within 10 seconds of the primary, the primary will step down and wait for the secondary to catch up. 4. After 60 seconds or once the secondary has caught up, the primary will shut down. Force Replica Set Shutdown If there is no up-to-date secondary and you want the primary to shut down, issue the shutdown command with the force argument, as in the following mongo shell operation: db.adminCommand({shutdown : 1, force : true}) To keep checking the secondaries for a specified number of seconds if none are immediately up-to-date, issue shutdown with the timeoutSecs argument. If any of the secondaries catch up within the allotted time, the primary will shut down; if no secondaries catch up, it will not shut down.
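The primary's shutdown decision described in the steps above can be sketched as a small function. The 10-second lag window and the force/timeoutSecs behavior come from the text; the function name, the lag-list representation, and the treatment of "could catch up within timeoutSecs" are simplifying assumptions for illustration only.

```javascript
// Illustrative model of whether a replica set primary will shut down:
// - force: true shuts down regardless of secondary lag;
// - otherwise, shut down only if a secondary is within 10 seconds,
//   or (simplified here) could catch up within the timeoutSecs window.
function primaryWillShutDown(secondaryLagsSecs, opts) {
  opts = opts || {};
  if (opts.force) return true;
  if (secondaryLagsSecs.some((lag) => lag <= 10)) return true;
  if (opts.timeoutSecs && secondaryLagsSecs.some((lag) => lag <= opts.timeoutSecs)) {
    return true; // a secondary caught up within the allotted time
  }
  return false;  // mongod instead returns a "will not shut down" message
}

primaryWillShutDown([5, 30]);                    // true: one secondary is close
primaryWillShutDown([30]);                       // false: none within 10 seconds
primaryWillShutDown([30], { force: true });      // true: forced
primaryWillShutDown([30], { timeoutSecs: 60 });  // true: catches up in time
```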
The following command issues shutdown with timeoutSecs set to 5: db.adminCommand({shutdown : 1, timeoutSecs : 5}) Alternatively, you can use the timeoutSecs argument with the db.shutdownServer() method: db.shutdownServer({timeoutSecs : 5}) Terminate Running Operations Overview MongoDB provides two facilities to terminate running operations: maxTimeMS() and db.killOp(). Use these operations as needed to control the behavior of operations in a MongoDB deployment.
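Both facilities interrupt a running operation at its next interrupt point rather than instantaneously. A toy model of that semantics, with an injected clock so the behavior is deterministic (the function and its shape are illustrative, not the server's implementation):

```javascript
// Toy model of maxTimeMS semantics: elapsed time is checked only at
// interrupt points, so an operation is cut off at the *next* such point
// after the limit passes, and partial work before that point completes.
function runWithMaxTimeMS(steps, maxTimeMS, clock) {
  const start = clock();
  const completed = [];
  for (const step of steps) {
    if (clock() - start > maxTimeMS) { // an interrupt point
      return { ok: 1, n: completed.length, err: "operation exceeded time limit" };
    }
    completed.push(step());
  }
  return { ok: 1, n: completed.length };
}

// A fake clock that advances 20ms per call: checks happen at 20, 40, ...
let t = 0;
const clock = () => { const now = t; t += 20; return now; };
runWithMaxTimeMS([() => 1, () => 2, () => 3], 30, clock);
// → { ok: 1, n: 1, err: "operation exceeded time limit" }
```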
Available Procedures maxTimeMS New in version 2.6. The maxTimeMS() method sets a time limit for an operation. When the operation reaches the specified time limit, MongoDB interrupts the operation at the next interrupt point. Terminate a Query From the mongo shell, use the following method to set a time limit of 30 milliseconds for this query: db.location.find( { "town": { "$regex": "(Pine Lumber)", "$options": 'i' } } ).maxTimeMS(30) Terminate a Command Consider a potentially long-running operation using distinct to return each distinct value of the city field in the collection named collection: db.runCommand( { distinct: "collection", key: "city" } ) You can add the maxTimeMS field to the command document to set a time limit of 45 milliseconds for the operation: db.runCommand( { distinct: "collection", key: "city", maxTimeMS: 45 } ) db.getLastError() and db.getLastErrorObj() will return errors for interrupted operations: { "n" : 0, "connectionId" : 1, "err" : "operation exceeded time limit", "ok" : 1 } killOp The db.killOp() method interrupts a running operation at the next interrupt point. db.killOp() identifies the target operation by operation ID. db.killOp(<opId>) Related To return a list of running operations see db.currentOp(). Analyze Performance of Database Operations The database profiler collects fine-grained data about MongoDB write operations, cursors, and database commands on a running mongod instance. You can enable profiling on a per-database or per-instance basis. The profiling level (page 211) is also configurable when enabling profiling. The database profiler writes all the data it collects to the system.profile (page 271) collection, which is a capped collection (page 196). See Database Profiler Output (page 271) for an overview of the data in the system.profile (page 271) documents created by the profiler. This document outlines a number of key administration options for the database profiler.
For additional related information, consider the following resources: • Database Profiler Output (page 271)
• Profile Command • http://docs.mongodb.org/manual/reference/method/db.currentOp Profiling Levels The following profiling levels are available: • 0 - the profiler is off and does not collect any data. mongod always writes operations longer than the slowOpThresholdMs threshold to its log. • 1 - collects profiling data for slow operations only. By default, slow operations are those slower than 100 milliseconds. You can modify the threshold for “slow” operations with the slowOpThresholdMs runtime option or the setParameter command. See the Specify the Threshold for Slow Operations (page 211) section for more information. • 2 - collects profiling data for all database operations. Enable Database Profiling and Set the Profiling Level You can enable database profiling from the mongo shell or through a driver using the profile command. This section describes how to do so from the mongo shell. See your driver documentation if you want to control the profiler from within your application. When you enable profiling, you also set the profiling level (page 211). The profiler records data in the system.profile (page 271) collection. MongoDB creates the system.profile (page 271) collection in a database after you enable profiling for that database. To enable profiling and set the profiling level, use the db.setProfilingLevel() helper in the mongo shell, passing the profiling level as a parameter. For example, to enable profiling for all database operations, consider the following operation in the mongo shell: db.setProfilingLevel(2) The shell returns a document showing the previous level of profiling. The "ok" : 1 key-value pair indicates the operation succeeded: { "was" : 0, "slowms" : 100, "ok" : 1 } To verify the new setting, see the Check Profiling Level (page 212) section. Specify the Threshold for Slow Operations The threshold for slow operations applies to the entire mongod instance.
When you change the threshold, you change it for all databases on the instance. Important: Changing the slow operation threshold for the database profiler also affects the profiling subsystem’s slow operation threshold for the entire mongod instance. Always set the threshold to the highest useful value. By default the slow operation threshold is 100 milliseconds. Databases with a profiling level of 1 will log operations slower than 100 milliseconds. To change the threshold, pass two parameters to the db.setProfilingLevel() helper in the mongo shell. The first parameter sets the profiling level for the current database, and the second sets the default slow operation threshold for the entire mongod instance.
For example, the following command sets the profiling level for the current database to 0, which disables profiling, and sets the slow-operation threshold for the mongod instance to 20 milliseconds. Any database on the instance with a profiling level of 1 will use this threshold: db.setProfilingLevel(0,20) Check Profiling Level To view the profiling level (page 211), issue the following from the mongo shell: db.getProfilingStatus() The shell returns a document similar to the following: { "was" : 0, "slowms" : 100 } The was field indicates the current level of profiling. The slowms field indicates how long, in milliseconds, an operation must run to pass the “slow” threshold. MongoDB will log operations that take longer than the threshold if the profiling level is 1. For an explanation of profiling levels, see Profiling Levels (page 211). To return only the profiling level, use the db.getProfilingLevel() helper in the mongo shell, as in the following: db.getProfilingLevel() Disable Profiling To disable profiling, use the following helper in the mongo shell: db.setProfilingLevel(0) Enable Profiling for an Entire mongod Instance For development purposes in testing environments, you can enable database profiling for an entire mongod instance. The profiling level applies to all databases provided by the mongod instance. To enable profiling for a mongod instance, pass the following parameters to mongod at startup or within the configuration file: mongod --profile=1 --slowms=15 This sets the profiling level to 1, which collects profiling data for slow operations only, and defines slow operations as those that last longer than 15 milliseconds. See also: mode and slowOpThresholdMs. Database Profiling and Sharding You cannot enable profiling on a mongos instance.
To enable profiling in a sharded cluster, you must enable profiling for each mongod instance in the cluster. View Profiler Data The database profiler logs information about database operations in the system.profile (page 271) collection. To view profiling information, query the system.profile (page 271) collection. For example queries, see Example Profiler Data Queries (page 213). For an explanation of the output data, see Database Profiler Output (page 271).
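The interaction between the profiling level and the slow-operation threshold described above reduces to a simple predicate. This is a summary sketch of the documented behavior, not server code; the function name is invented for illustration.

```javascript
// Whether an operation that took `millis` ms is written to system.profile,
// given the profiling level (0, 1, or 2) and the slowms threshold.
function isProfiled(level, millis, slowms) {
  if (level === 2) return true;            // profile all operations
  if (level === 1) return millis > slowms; // profile slow operations only
  return false;                            // level 0: profiler is off
}

isProfiled(1, 250, 100);  // true: slower than the threshold
isProfiled(1, 50, 100);   // false: fast enough to skip
isProfiled(0, 9999, 100); // false (though mongod still logs slow ops)
```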
Example Profiler Data Queries This section displays example queries to the system.profile (page 271) collection. For an explanation of the query output, see Database Profiler Output (page 271). To return the most recent 10 log entries in the system.profile (page 271) collection, run a query similar to the following: db.system.profile.find().limit(10).sort( { ts : -1 } ).pretty() To return all operations except command operations ($cmd), run a query similar to the following: db.system.profile.find( { op: { $ne : 'command' } } ).pretty() To return operations for a particular collection, run a query similar to the following. This example returns operations in the mydb database’s test collection: db.system.profile.find( { ns : 'mydb.test' } ).pretty() To return operations slower than 5 milliseconds, run a query similar to the following: db.system.profile.find( { millis : { $gt : 5 } } ).pretty() To return information from a certain time range, run a query similar to the following: db.system.profile.find( { ts : { $gt : new ISODate("2012-12-09T03:00:00Z") , $lt : new ISODate("2012-12-09T03:40:00Z") } } ).pretty() The following example looks at the time range, suppresses the user field from the output to make it easier to read, and sorts the results by how long each operation took to run: db.system.profile.find( { ts : { $gt : new ISODate("2011-07-12T03:00:00Z") , $lt : new ISODate("2011-07-12T03:40:00Z") } }, { user : 0 } ).sort( { millis : -1 } ) Show the Five Most Recent Events On a database that has profiling enabled, the show profile helper in the mongo shell displays the 5 most recent operations that took at least 1 millisecond to execute. Issue show profile from the mongo shell, as follows: show profile Profiler Overhead When enabled, profiling has a minor effect on performance. The system.profile (page 271) collection is a capped collection with a default size of 1 megabyte.
A collection of this size can typically store several thousand profile documents, but some applications may use more or less profiling data per operation. To change the size of the system.profile (page 271) collection, you must:
1. Disable profiling. 2. Drop the system.profile (page 271) collection. 3. Create a new system.profile (page 271) collection. 4. Re-enable profiling. For example, to create a new system.profile (page 271) collection that is 4000000 bytes, use the following sequence of operations in the mongo shell: db.setProfilingLevel(0) db.system.profile.drop() db.createCollection( "system.profile", { capped: true, size:4000000 } ) db.setProfilingLevel(1) Change Size of system.profile Collection To change the size of the system.profile (page 271) collection on a secondary, you must stop the secondary, run it as a standalone, and then perform the steps above. When done, restart the standalone as a member of the replica set. For more information, see Perform Maintenance on Replica Set Members (page 572). Rotate Log Files Overview Log rotation using MongoDB’s standard approach archives the current log file and starts a new one. To do this, the mongod or mongos instance renames the current log file by appending a UTC (GMT) timestamp to the filename, in ISODate format. It then opens a new log file, closes the old log file, and sends all new log entries to the new log file. MongoDB’s standard approach to log rotation only rotates logs in response to the logRotate command, or when the mongod or mongos process receives a SIGUSR1 signal from the operating system. Alternatively, you may configure mongod to send log data to syslog, in which case you can take advantage of alternate log-rotation tools. See also: For information on logging, see the Process Logging (page 178) section. Log Rotation With MongoDB The following steps create and rotate a log file: 1. Start a mongod with verbose logging, with appending enabled, and with the following log file: mongod -v --logpath /var/log/mongodb/server1.log --logappend 2. In a separate terminal, list the matching files: ls /var/log/mongodb/server1.log* The command returns:
server1.log 3. Rotate the log file using one of the following methods. • From the mongo shell, issue the logRotate command from the admin database: use admin db.runCommand( { logRotate : 1 } ) This is the only available method to rotate log files on Windows systems. • For Linux systems, rotate logs for a single process by issuing the following command: kill -SIGUSR1 <mongod process id> 4. List the matching files again: ls /var/log/mongodb/server1.log* The output is similar to the following; the timestamps will be different: server1.log server1.log.2011-11-24T23-30-00 The example results indicate a log rotation performed at exactly 11:30 pm on November 24th, 2011 UTC, which is the local time offset by the local time zone. The original log file is the one with the timestamp; the new log is the server1.log file. If you issue a second logRotate command an hour later, an additional file would appear when listing matching files, as in the following example: server1.log server1.log.2011-11-24T23-30-00 server1.log.2011-11-25T00-30-00 This operation does not modify the server1.log.2011-11-24T23-30-00 file created earlier, while server1.log.2011-11-25T00-30-00 is the previous server1.log file, renamed. server1.log is a new, empty file that receives all new log output. Syslog Log Rotation New in version 2.2. To configure mongod to send log data to syslog rather than writing log data to a file, use the following procedure: 1. Start a mongod with the syslogFacility option. 2. Store and rotate the log output using your system’s default log rotation mechanism. Important: You cannot use syslogFacility with systemLog.path. Manage Journaling MongoDB uses write-ahead logging to an on-disk journal to guarantee write operation (page 67) durability and to provide crash resiliency. Before applying a change to the data files, MongoDB writes the change operation to the journal.
If MongoDB should terminate or encounter an error before it can write the changes from the journal to the data files, MongoDB can re-apply the write operation and maintain a consistent state. Without a journal, if mongod exits unexpectedly, you must assume your data is in an inconsistent state, and you must run either repair (page 246) or, preferably, resync (page 575) from a clean member of the replica set.
With journaling enabled, if mongod stops unexpectedly, the program can recover everything written to the journal, and the data remains in a consistent state. By default, the greatest extent of lost writes, i.e., those not made to the journal, are those made in the last 100 milliseconds. See commitIntervalMs for more information on the default. With journaling, if you want a data set to reside entirely in RAM, you need enough RAM to hold the data set plus the “write working set.” The “write working set” is the amount of unique data you expect to see written between re-mappings of the private view. For information on views, see Storage Views used in Journaling (page 275). Important: Changed in version 2.0: For 64-bit builds of mongod, journaling is enabled by default. For other platforms, see storage.journal.enabled. Procedures Enable Journaling Changed in version 2.0: For 64-bit builds of mongod, journaling is enabled by default. To enable journaling, start mongod with the --journal command line option. If no journal files exist, when mongod starts, it must preallocate new journal files. During this operation, the mongod is not listening for connections until preallocation completes; for some systems this may take several minutes. During this period your applications and the mongo shell are not available. Disable Journaling Warning: Do not disable journaling on production systems. If your mongod instance stops unexpectedly without shutting down cleanly for any reason (e.g. power failure) and you are not running with journaling, then you must recover from an unaffected replica set member or backup, as described in repair (page 246). To disable journaling, start mongod with the --nojournal command line option. Get Commit Acknowledgment You can get commit acknowledgment with the Write Concern (page 72) and the j option. For details, see Write Concern Reference (page 118).
Avoid Preallocation Lag To avoid preallocation lag (page 275), you can preallocate files in the journal directory by copying them from another instance of mongod. Preallocated files do not contain data. It is safe to later remove them, but if you restart mongod with journaling, mongod will create them again. Example The following sequence preallocates journal files for an instance of mongod running on port 27017 with a database path of /data/db. For demonstration purposes, the sequence starts by creating a set of journal files in the usual way. 1. Create a temporary directory in which to create a set of journal files: mkdir ~/tmpDbpath 2. Create a set of journal files by starting a mongod instance that uses the temporary directory: mongod --port 10000 --dbpath ~/tmpDbpath --journal 3. When you see the following log output, indicating mongod has created the files, press CONTROL+C to stop the mongod instance:
[initandlisten] waiting for connections on port 10000 4. Preallocate journal files for the new instance of mongod by moving the journal files from the data directory of the existing instance to the data directory of the new instance: mv ~/tmpDbpath/journal /data/db/ 5. Start the new mongod instance: mongod --port 27017 --dbpath /data/db --journal Monitor Journal Status Use the following commands and methods to monitor journal status: • serverStatus The serverStatus command returns database status information that is useful for assessing performance. • journalLatencyTest Use journalLatencyTest to measure how long it takes on your volume to write to the disk in an append-only fashion. You can run this command on an idle system to get a baseline sync time for journaling, or on a busy system to see the sync time under load, which may be higher if the journal directory is on the same volume as the data files. The journalLatencyTest command also provides a way to check if your disk drive is buffering writes in its local cache. If the number is very low (i.e., less than 2 milliseconds) and the drive is non-SSD, the drive is probably buffering writes. In that case, enable cache write-through for the device in your operating system, unless you have a disk controller card with battery-backed RAM. Change the Group Commit Interval Changed in version 2.0. You can set the group commit interval using the --journalCommitInterval command line option. The allowed range is 2 to 300 milliseconds. Lower values increase the durability of the journal at the expense of disk performance. Recover Data After Unexpected Shutdown On a restart after a crash, MongoDB replays all journal files in the journal directory before the server becomes available. If MongoDB must replay journal files, mongod notes these events in the log output. There is no reason to run repairDatabase in these situations.
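The 2-300 millisecond range for --journalCommitInterval can be captured as a small validation sketch. The function is illustrative only; mongod performs its own validation of the option, and the integer requirement here is an assumption.

```javascript
// Validate a proposed journal group-commit interval against the
// documented range of 2-300 milliseconds.
function validCommitInterval(ms) {
  return Number.isInteger(ms) && ms >= 2 && ms <= 300;
}

validCommitInterval(100); // true: within range (near the ~100ms default window)
validCommitInterval(1);   // false: below the 2ms minimum
validCommitInterval(301); // false: above the 300ms maximum
```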
Store a JavaScript Function on the Server Note: Do not store application logic in the database. There are performance limitations to running JavaScript inside of MongoDB. Application code is also typically most effective when it shares version control with the application itself. There is a special system collection named system.js that can store JavaScript functions for reuse. To store a function, you can use db.collection.save(), as in the following example:
db.system.js.save( { _id : "myAddFunction" , value : function (x, y){ return x + y; } } );
• The _id field holds the name of the function and is unique per database.
• The value field holds the function definition.
Once you save a function in the system.js collection, you can use the function from any JavaScript context (e.g. the eval command, the mongo shell method db.eval(), the $where operator, mapReduce, or the mongo shell method db.collection.mapReduce()). Consider the following example from the mongo shell that first saves a function named echoFunction to the system.js collection and then calls the function using the db.eval() method:
db.system.js.save( { _id: "echoFunction", value : function(x) { return x; } } )
db.eval( "echoFunction( 'test' )" )
See http://github.com/mongodb/mongo/tree/master/jstests/core/storefunc.js for a full example.
New in version 2.1: In the mongo shell, you can use db.loadServerScripts() to load all the scripts saved in the system.js collection for the current database. Once loaded, you can invoke the functions directly in the shell, as in the following example:
db.loadServerScripts();
echoFunction(3);
myAddFunction(3, 5);
Upgrade to the Latest Revision of MongoDB Revisions provide security patches, bug fixes, and new or changed features that do not contain any backward-breaking changes. Always upgrade to the latest revision in your release series. The third number in the MongoDB version number (page 808) indicates the revision.
Before Upgrading
• Ensure you have an up-to-date backup of your data set. See MongoDB Backup Methods (page 172).
• Consult the following documents for any special considerations or compatibility issues specific to your MongoDB release:
– The release notes, located at Release Notes (page 725).
– The documentation for your driver. See http://docs.mongodb.org/manual/applications/drivers.
• If your installation includes replica sets, plan the upgrade during a predefined maintenance window. 218 Chapter 5. Administration
• Before you upgrade a production environment, use the procedures in this document to upgrade a staging environment that reproduces your production environment, to ensure that your production configuration is compatible with all changes.
Upgrade Procedure Important: Always back up all of your data before upgrading MongoDB.
Upgrade each mongod and mongos binary separately, using the procedure described here. When upgrading a binary, use the procedure Upgrade a MongoDB Instance (page 219). Follow this upgrade procedure:
1. For deployments that use authentication, first upgrade all of your MongoDB drivers. To upgrade, see the documentation for your driver.
2. Upgrade sharded clusters, as described in Upgrade Sharded Clusters (page 220).
3. Upgrade any standalone instances. See Upgrade a MongoDB Instance (page 219).
4. Upgrade any replica sets that are not part of a sharded cluster, as described in Upgrade Replica Sets (page 220).
Upgrade a MongoDB Instance To upgrade a mongod or mongos instance, use one of the following approaches:
• Upgrade the instance using the operating system’s package management tool and the official MongoDB packages. This is the preferred approach. See Install MongoDB (page 5).
• Upgrade the instance by replacing the existing binaries with new binaries. See Replace the Existing Binaries (page 219).
Replace the Existing Binaries Important: Always back up all of your data before upgrading MongoDB.
This section describes how to upgrade MongoDB by replacing the existing binaries. The preferred approach to an upgrade is to use the operating system’s package management tool and the official MongoDB packages, as described in Install MongoDB (page 5). To upgrade a mongod or mongos instance by replacing the existing binaries:
1. Download the binaries for the latest MongoDB revision from the MongoDB Download Page75 and store the binaries in a temporary location.
The binaries download as compressed files that uncompress to the directory structure used by the MongoDB installation.
2. Shut down the instance.
3. Replace the existing MongoDB binaries with the downloaded binaries.
4. Restart the instance.
75 http://downloads.mongodb.org/
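The replacement steps above can be sketched as a shell sequence. Placeholder files in a temporary directory stand in for the live installation and the downloaded release; all paths and version strings are hypothetical:

```shell
# Sketch of the binary-replacement upgrade. Placeholder files in a temporary
# directory stand in for the live installation and the downloaded release;
# all paths and version strings are hypothetical.
WORKDIR=$(mktemp -d)
mkdir -p "$WORKDIR/opt/mongodb/bin" "$WORKDIR/download/mongodb-new/bin"
echo "mongod 2.6.3" > "$WORKDIR/opt/mongodb/bin/mongod"          # existing binary
echo "mongod 2.6.4" > "$WORKDIR/download/mongodb-new/bin/mongod" # latest revision
# 1. (Download and uncompress the release -- simulated above.)
# 2. Shut down the instance (omitted here).
# 3. Replace the existing binaries with the downloaded binaries:
cp "$WORKDIR/download/mongodb-new/bin/mongod" "$WORKDIR/opt/mongodb/bin/mongod"
# 4. Restart the instance (omitted here).
cat "$WORKDIR/opt/mongodb/bin/mongod"
```

In a real upgrade, step 2 and step 4 act on the running mongod or mongos process; only the copy in step 3 is shown literally here.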
  • 224. MongoDB Documentation, Release 2.6.4 Upgrade Sharded Clusters To upgrade a sharded cluster: 1. Disable the cluster’s balancer, as described in Disable the Balancer (page 661). 2. Upgrade each mongos instance by following the instructions below in Upgrade a MongoDB Instance (page 219). You can upgrade the mongos instances in any order. 3. Upgrade each mongod config server (page 616) individually starting with the last config server listed in your mongos --configdb string and working backward. To keep the cluster online, make sure at least one config server is always running. For each config server upgrade, follow the instructions below in Upgrade a MongoDB Instance (page 219) Example Given the following config string: mongos --configdb cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019 You would upgrade the config servers in the following order: (a) cfg2.example.net (b) cfg1.example.net (c) cfg0.example.net 4. Upgrade each shard. • If a shard is a replica set, upgrade the shard using the procedure below titled Upgrade Replica Sets (page 220). • If a shard is a standalone instance, upgrade the shard using the procedure below titled Upgrade a MongoDB Instance (page 219). 5. Re-enable the balancer, as described in Enable the Balancer (page 661). Upgrade Replica Sets To upgrade a replica set, upgrade each member individually, starting with the secondaries and finishing with the primary. Plan the upgrade during a predefined maintenance window. Upgrade Secondaries Upgrade each secondary separately as follows: 1. Upgrade the secondary’s mongod binary by following the instructions below in Upgrade a MongoDB Instance (page 219). 2. After upgrading a secondary, wait for the secondary to recover to the SECONDARY state before upgrading the next instance. To check the member’s state, issue rs.status() in the mongo shell. The secondary may briefly go into STARTUP2 or RECOVERING. This is normal. 
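The config server upgrade order shown in the example above (the last server listed in the --configdb string first, working backward) can be derived mechanically from the connection string. A minimal sketch, using the example hostnames from this section:

```shell
# Derive the config server upgrade order (last server listed first) from a
# mongos --configdb string. Hostnames are the example's; the logic just
# reverses the comma-separated list.
CONFIGDB="cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019"
UPGRADE_ORDER=$(echo "$CONFIGDB" | tr ',' '\n' | cut -d: -f1 |
  awk '{ a[NR] = $0 } END { for (i = NR; i >= 1; i--) print a[i] }')
echo "$UPGRADE_ORDER"
```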
Make sure to wait for the secondary to fully recover to SECONDARY before you continue the upgrade. Upgrade the Primary 1. Step down the primary to initiate the normal failover (page 523) procedure, using one of the following: • The rs.stepDown() helper in the mongo shell.
  • 225. MongoDB Documentation, Release 2.6.4 • The replSetStepDown database command. During failover, the set cannot accept writes. Typically this takes 10-20 seconds. Plan the upgrade during a predefined maintenance window. Note: Stepping down the primary is preferable to directly shutting down the primary. Stepping down expedites the failover procedure. 2. Once the primary has stepped down, call the rs.status() method from the mongo shell until you see that another member has assumed the PRIMARY state. 3. Shut down the original primary and upgrade its instance by following the instructions below in Upgrade a MongoDB Instance (page 219). Monitor MongoDB With SNMP on Linux New in version 2.2. Enterprise Feature SNMP is only available in MongoDB Enterprise76. Overview MongoDB Enterprise can report system information into SNMP traps, to support centralized data collection and aggregation. This procedure explains the setup and configuration of a mongod instance as an SNMP subagent, as well as initializing and testing of SNMP support with MongoDB Enterprise. See also: Troubleshoot SNMP (page 224) and Monitor MongoDB Windows with SNMP (page 223) for complete instructions on using MongoDB with SNMP on Windows systems. Considerations Only mongod instances provide SNMP support. mongos and the other MongoDB binaries do not support SNMP. Configuration Files Changed in version 2.6. MongoDB Enterprise contains the following configuration files to support SNMP: • MONGOD-MIB.txt: The management information base (MIB) file that defines MongoDB’s SNMP output. • mongod.conf.subagent: The configuration file to run mongod as the SNMP subagent. This file sets SNMP run-time configuration options, including the AgentX socket to connect to the SNMP master. 76http://www.mongodb.com/products/mongodb-enterprise 5.2. Administration Tutorials 221
• mongod.conf.master: The configuration file to run mongod as the SNMP master. This file sets SNMP run-time configuration options.
Procedure Step 1: Copy configuration files. Use the following sequence of commands to move the SNMP configuration files to the SNMP service configuration directory. First, create the SNMP configuration directory if needed and then, from the installation directory, copy the configuration files to the SNMP service configuration directory:
mkdir -p /etc/snmp/
cp MONGOD-MIB.txt /usr/share/snmp/mibs/MONGOD-MIB.txt
cp mongod.conf.subagent /etc/snmp/mongod.conf
The configuration filename is tool-dependent. For example, when using net-snmp the configuration file is snmpd.conf. By default SNMP uses UNIX domain sockets for communication between the agent (i.e. snmpd or the master) and subagent (i.e. MongoDB). Ensure that the agentXAddress specified in the SNMP configuration file for MongoDB matches the agentXAddress in the SNMP master configuration file.
Step 2: Start MongoDB. Start mongod with the snmp-subagent to send data to the SNMP master:
mongod --snmp-subagent
Step 3: Confirm SNMP data retrieval. Use snmpwalk to collect data from mongod: Connect an SNMP client to verify the ability to collect SNMP data from MongoDB. Install the net-snmp77 package to access the snmpwalk client. net-snmp provides the snmpwalk SNMP client.
snmpwalk -m /usr/share/snmp/mibs/MONGOD-MIB.txt -v 2c -c mongodb 127.0.0.1:<port> 1.3.6.1.4.1.34601
<port> refers to the port defined by the SNMP master, not the primary port used by mongod for client communication.
Optional: Run MongoDB as SNMP Master You can run mongod with the snmp-master option for testing purposes. To do this, use the SNMP master configuration file instead of the subagent configuration file.
From the directory containing the unpacked MongoDB installation files: cp mongod.conf.master /etc/snmp/mongod.conf Additionally, start mongod with the snmp-master option, as in the following: mongod --snmp-master 77http://www.net-snmp.org/ 222 Chapter 5. Administration
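The procedure above requires that the agentXAddress in MongoDB's SNMP configuration match the one in the master's configuration. A minimal sketch of checking that the two files agree; the filenames and contents here are hypothetical, using the net-snmp-style agentXAddress directive:

```shell
# Sketch: confirm the subagent and master configuration files agree on the
# AgentX address. Filenames and contents are hypothetical placeholders.
WORKDIR=$(mktemp -d)
printf 'agentXAddress tcp:localhost:705\n' > "$WORKDIR/mongod.conf"  # subagent config
printf 'agentXAddress tcp:localhost:705\n' > "$WORKDIR/snmpd.conf"   # master config
SUB=$(awk '/^agentXAddress/ { print $2 }' "$WORKDIR/mongod.conf")
MASTER=$(awk '/^agentXAddress/ { print $2 }' "$WORKDIR/snmpd.conf")
if [ "$SUB" = "$MASTER" ]; then
  echo "agentXAddress matches: $SUB"
else
  echo "agentXAddress mismatch: $SUB vs $MASTER" >&2
fi
```

A mismatch here is the most common cause of the "Failed to connect to the agentx master agent" warning covered in the troubleshooting section.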
Monitor MongoDB Windows with SNMP New in version 2.6.
Enterprise Feature SNMP is only available in MongoDB Enterprise78.
Overview MongoDB Enterprise can report system information into SNMP traps, to support centralized data collection and aggregation. This procedure explains the setup and configuration of a mongod.exe instance as an SNMP subagent, as well as initializing and testing of SNMP support with MongoDB Enterprise.
See also: Monitor MongoDB With SNMP on Linux (page 221) and Troubleshoot SNMP (page 224) for more information.
Considerations Only mongod.exe instances provide SNMP support. mongos.exe and the other MongoDB binaries do not support SNMP.
Configuration Files Changed in version 2.6. MongoDB Enterprise contains the following configuration files to support SNMP:
• MONGOD-MIB.txt: The management information base (MIB) file that defines MongoDB’s SNMP output.
• mongod.conf.subagent: The configuration file to run mongod.exe as the SNMP subagent. This file sets SNMP run-time configuration options, including the AgentX socket to connect to the SNMP master.
• mongod.conf.master: The configuration file to run mongod.exe as the SNMP master. This file sets SNMP run-time configuration options.
Procedure Step 1: Copy configuration files. Use the following sequence of commands to move the SNMP configuration files to the SNMP service configuration directory. First, create the SNMP configuration directory if needed and then, from the installation directory, copy the configuration files to the SNMP service configuration directory:
md C:\snmp\etc\config
copy MONGOD-MIB.txt C:\snmp\etc\config\MONGOD-MIB.txt
copy mongod.conf.subagent C:\snmp\etc\config\mongod.conf
78 http://www.mongodb.com/products/mongodb-enterprise
The configuration filename is tool-dependent. For example, when using net-snmp the configuration file is snmpd.conf. Edit the configuration file to ensure that the communication between the agent (i.e. snmpd or the master) and subagent (i.e. MongoDB) uses TCP. Ensure that the agentXAddress specified in the SNMP configuration file for MongoDB matches the agentXAddress in the SNMP master configuration file.
Step 2: Start MongoDB. Start mongod.exe with the snmp-subagent to send data to the SNMP master:
mongod.exe --snmp-subagent
Step 3: Confirm SNMP data retrieval. Use snmpwalk to collect data from mongod.exe: Connect an SNMP client to verify the ability to collect SNMP data from MongoDB. Install the net-snmp79 package to access the snmpwalk client. net-snmp provides the snmpwalk SNMP client.
snmpwalk -m C:\snmp\etc\config\MONGOD-MIB.txt -v 2c -c mongodb 127.0.0.1:<port> 1.3.6.1.4.1.34601
<port> refers to the port defined by the SNMP master, not the primary port used by mongod.exe for client communication.
Optional: Run MongoDB as SNMP Master You can run mongod.exe with the snmp-master option for testing purposes. To do this, use the SNMP master configuration file instead of the subagent configuration file. From the directory containing the unpacked MongoDB installation files:
copy mongod.conf.master C:\snmp\etc\config\mongod.conf
Additionally, start mongod.exe with the snmp-master option, as in the following:
mongod.exe --snmp-master
Troubleshoot SNMP New in version 2.6.
Enterprise Feature SNMP is only available in MongoDB Enterprise.
Overview MongoDB Enterprise can report system information into SNMP traps, to support centralized data collection and aggregation. This document identifies common problems you may encounter when deploying MongoDB Enterprise with SNMP as well as possible solutions for these issues.
See Monitor MongoDB With SNMP on Linux (page 221) and Monitor MongoDB Windows with SNMP (page 223) for complete installation instructions. 79http://www.net-snmp.org/ 224 Chapter 5. Administration
Issues Failed to Connect The following in the mongod logfile:
Warning: Failed to connect to the agentx master agent
AgentX is the SNMP agent extensibility protocol defined in Internet RFC 2741 80. It explains how to define additional data to monitor over SNMP. When MongoDB fails to connect to the agentx master agent, use the following procedure to ensure that the SNMP subagent can connect properly to the SNMP master.
1. Make sure the master agent is running.
2. Compare the SNMP master’s configuration file with the subagent configuration file. Ensure that the agentx socket definition is the same between the two.
3. Check the SNMP configuration files to see if they specify using UNIX Domain Sockets. If so, confirm that the mongod has appropriate permissions to open a UNIX domain socket.
Error Parsing Command Line One of the following errors at the command line:
Error parsing command line: unknown option snmp-master try 'mongod --help' for more information
Error parsing command line: unknown option snmp-subagent try 'mongod --help' for more information
mongod binaries that are not part of the Enterprise Edition produce this error. Install the Enterprise Edition (page 24) and attempt to start mongod again. Other MongoDB binaries, including mongos, will produce this error if you attempt to start them with snmp-master or snmp-subagent. Only mongod supports SNMP.
Error Starting SNMPAgent The following line in the log file indicates that mongod cannot read the mongod.conf file:
[SNMPAgent] warning: error starting SNMPAgent as master err:1
If running on Linux, ensure mongod.conf exists in the /etc/snmp directory, and ensure that the mongod UNIX user has permission to read the mongod.conf file. If running on Windows, ensure mongod.conf exists in C:\snmp\etc\config.
MongoDB Tutorials This page lists the tutorials available as part of the MongoDB Manual.
In addition to these documents, you can refer to the introductory MongoDB Tutorial (page 43). If there is a process or pattern that you would like to see included here, please open a Jira Case81. Getting Started • Install MongoDB on Linux Systems (page 14) • Install MongoDB on Red Hat Enterprise, CentOS, Fedora, or Amazon Linux (page 6) 80http://www.ietf.org/rfc/rfc2741.txt 81https://jira.mongodb.org/browse/DOCS 5.2. Administration Tutorials 225
  • 230. MongoDB Documentation, Release 2.6.4 • Install MongoDB on Debian (page 12) • Install MongoDB on Ubuntu (page 9) • Install MongoDB on OS X (page 16) • Install MongoDB on Windows (page 19) • Getting Started with MongoDB (page 43) • Generate Test Data (page 47) Administration Replica Sets • Deploy a Replica Set (page 545) • Deploy Replica Set and Configure Authentication and Authorization (page 313) • Convert a Standalone to a Replica Set (page 556) • Add Members to a Replica Set (page 557) • Remove Members from Replica Set (page 560) • Replace a Replica Set Member (page 561) • Adjust Priority for Replica Set Member (page 562) • Resync a Member of a Replica Set (page 575) • Deploy a Geographically Redundant Replica Set (page 550) • Change the Size of the Oplog (page 570) • Force a Member to Become Primary (page 573) • Change Hostnames in a Replica Set (page 584) • Add an Arbiter to Replica Set (page 555) • Convert a Secondary to an Arbiter (page 568) • Configure a Secondary’s Sync Target (page 587) • Configure a Delayed Replica Set Member (page 566) • Configure a Hidden Replica Set Member (page 565) • Configure Non-Voting Replica Set Member (page 567) • Prevent Secondary from Becoming Primary (page 563) • Configure Replica Set Tag Sets (page 576) • Manage Chained Replication (page 583) • Reconfigure a Replica Set with Unavailable Members (page 580) • Recover Data after an Unexpected Shutdown (page 246) • Troubleshoot Replica Sets (page 588) 226 Chapter 5. Administration
  • 231. MongoDB Documentation, Release 2.6.4 Sharding • Deploy a Sharded Cluster (page 635) • Convert a Replica Set to a Replicated Sharded Cluster (page 643) • Add Shards to a Cluster (page 642) • Remove Shards from an Existing Sharded Cluster (page 663) • Deploy Three Config Servers for Production Deployments (page 643) • Migrate Config Servers with the Same Hostname (page 652) • Migrate Config Servers with Different Hostnames (page 652) • Replace Disabled Config Server (page 653) • Migrate a Sharded Cluster to Different Hardware (page 654) • Backup Cluster Metadata (page 657) • Backup a Small Sharded Cluster with mongodump (page 238) • Backup a Sharded Cluster with Filesystem Snapshots (page 239) • Backup a Sharded Cluster with Database Dumps (page 241) • Restore a Single Shard (page 243) • Restore a Sharded Cluster (page 244) • Schedule Backup Window for Sharded Clusters (page 243) • Manage Shard Tags (page 672) Basic Operations • Use Database Commands (page 206) • Recover Data after an Unexpected Shutdown (page 246) • Expire Data from Collections by Setting TTL (page 198) • Analyze Performance of Database Operations (page 210) • Rotate Log Files (page 214) • Build Old Style Indexes (page 471) • Manage mongod Processes (page 207) • Back Up and Restore with MongoDB Tools (page 234) • Backup and Restore with Filesystem Snapshots (page 229) Security • Configure Linux iptables Firewall for MongoDB (page 297) • Configure Windows netsh Firewall for MongoDB (page 300) • Enable Client Access Control (page 317) • Create a User Administrator (page 343) • Add a User to a Database (page 344) • Create a Role (page 347) 5.2. Administration Tutorials 227
  • 232. MongoDB Documentation, Release 2.6.4 • Modify a User’s Access (page 352) • View Roles (page 353) • Generate a Key File (page 338) • Configure MongoDB with Kerberos Authentication on Linux (page 331) • Create a Vulnerability Report (page 359) Development Patterns • Perform Two Phase Commits (page 102) • Isolate Sequence of Operations (page 111) • Create an Auto-Incrementing Sequence Field (page 113) • Enforce Unique Keys for Sharded Collections (page 674) • Aggregation Examples (page 403) • Model Data to Support Keyword Search (page 155) • Limit Number of Elements in an Array after an Update (page 116) • Perform Incremental Map-Reduce (page 413) • Troubleshoot the Map Function (page 415) • Troubleshoot the Reduce Function (page 416) • Store a JavaScript Function on the Server (page 217) Text Search Patterns • Create a text Index (page 486) • Specify a Language for Text Index (page 487) • Specify Name for text Index (page 489) • Control Search Results with Weights (page 490) • Limit the Number of Entries Scanned (page 491) Data Modeling Patterns • Model One-to-One Relationships with Embedded Documents (page 140) • Model One-to-Many Relationships with Embedded Documents (page 141) • Model One-to-Many Relationships with Document References (page 143) • Model Data for Atomic Operations (page 154) • Model Tree Structures with Parent References (page 146) • Model Tree Structures with Child References (page 148) • Model Tree Structures with Materialized Paths (page 151) • Model Tree Structures with Nested Sets (page 153) 228 Chapter 5. Administration
5.2.2 Backup and Recovery The following tutorials describe backup and restoration for a mongod instance:
Backup and Restore with Filesystem Snapshots (page 229) An outline of procedures for creating MongoDB data set backups using system-level file snapshot tools, such as LVM or native storage appliance tools.
Restore a Replica Set from MongoDB Backups (page 232) Describes the procedure for restoring a replica set from an archived backup such as a mongodump or MMS Backup82 file.
Back Up and Restore with MongoDB Tools (page 234) The procedure for writing the contents of a database to a BSON (i.e. binary) dump file for backing up MongoDB databases.
Backup and Restore Sharded Clusters (page 238) Detailed procedures and considerations for backing up sharded clusters and single shards.
Recover Data after an Unexpected Shutdown (page 246) Recover data from MongoDB data files that were not properly closed or have an invalid state.
Backup and Restore with Filesystem Snapshots This document describes a procedure for creating backups of MongoDB systems using system-level tools, such as LVM or storage appliances, as well as the corresponding restoration strategies. These filesystem snapshots, or “block-level” backup methods, use system-level tools to create copies of the device that holds MongoDB’s data files. These methods complete quickly and work reliably, but require more system configuration outside of MongoDB.
See also: MongoDB Backup Methods (page 172) and Back Up and Restore with MongoDB Tools (page 234).
Snapshots Overview Snapshots work by creating pointers between the live data and a special snapshot volume. These pointers are theoretically equivalent to “hard links.” As the working data diverges from the snapshot, the snapshot process uses a copy-on-write strategy. As a result the snapshot only stores modified data. After making the snapshot, you mount the snapshot image on your file system and copy data from the snapshot.
The resulting backup contains a full copy of all data. Snapshots have the following limitations: • The database must be valid when the snapshot takes place. This means that all writes accepted by the database need to be fully written to disk: either to the journal or to data files. If all writes are not on disk when the backup occurs, the backup will not reflect these changes. If writes are in progress when the backup occurs, the data files will reflect an inconsistent state. With journaling all data-file states resulting from in-progress writes are recoverable; without journaling you must flush all pending writes to disk before running the backup operation and must ensure that no writes occur during the entire backup procedure. If you do use journaling, the journal must reside on the same volume as the data. • Snapshots create an image of an entire disk image. Unless you need to back up your entire system, consider isolating your MongoDB data files, journal (if applicable), and configuration on one logical disk that doesn’t contain any other data. 82https://mms.mongodb.com/?pk_campaign=mongodb-docs-admin-tutorials 5.2. Administration Tutorials 229
Alternately, store all MongoDB data files on a dedicated device so that you can make backups without duplicating extraneous data.
• Ensure that you copy data from snapshots onto other systems, so that data is safe from site failures.
• Although different snapshot methods provide different capabilities, the LVM method outlined below does not provide any capacity for capturing incremental backups.
Snapshots With Journaling If your mongod instance has journaling enabled, then you can use any kind of file system or volume/block-level snapshot tool to create backups. If you manage your own infrastructure on a Linux-based system, configure your system with LVM to provide your disk packages and provide snapshot capability. You can also use LVM-based setups within a cloud/virtualized environment.
Note: Running LVM provides additional flexibility and enables the possibility of using snapshots to back up MongoDB.
Snapshots with Amazon EBS in a RAID 10 Configuration If your deployment depends on Amazon’s Elastic Block Storage (EBS) with RAID configured within your instance, it is impossible to get a consistent state across all disks using the platform’s snapshot tool. As an alternative, you can do one of the following:
• Flush all writes to disk and create a write lock to ensure a consistent state during the backup process. If you choose this option, see Create Backups on Instances that do not have Journaling Enabled (page 232).
• Configure LVM to run and hold your MongoDB data files on top of the RAID within your system. If you choose this option, perform the LVM backup operation described in Create a Snapshot (page 230).
Backup and Restore Using LVM on a Linux System This section provides an overview of a simple backup process using LVM on a Linux system. While the tools, commands, and paths may be (slightly) different on your system, the following steps provide a high-level overview of the backup operation.
Note: Only use the following procedure as a guideline for a backup system and infrastructure. Production backup systems must consider a number of application-specific requirements and factors unique to specific environments.
Create a Snapshot To create a snapshot with LVM, issue a command as root in the following format:
lvcreate --size 100M --snapshot --name mdb-snap01 /dev/vg0/mongodb
This command creates an LVM snapshot (with the --snapshot option) named mdb-snap01 of the mongodb volume in the vg0 volume group. This example creates a snapshot named mdb-snap01 located at /dev/vg0/mdb-snap01. The location and paths to your system’s volume groups and devices may vary slightly depending on your operating system’s LVM configuration.
The snapshot is capped at 100 megabytes because of the parameter --size 100M. This size does not reflect the total amount of the data on the disk, but rather the quantity of differences between the current state of /dev/vg0/mongodb and the creation of the snapshot (i.e. /dev/vg0/mdb-snap01).
Warning: Ensure that you create snapshots with enough space to account for data growth, particularly for the period of time that it takes to copy data out of the system or to a temporary image. If your snapshot runs out of space, the snapshot image becomes unusable. Discard this logical volume and create another.
The snapshot will exist when the command returns. You can restore directly from the snapshot at any time, or by creating a new logical volume and restoring from this snapshot to the alternate image. While snapshots are great for creating high quality backups very quickly, they are not ideal as a format for storing backup data. Snapshots typically depend and reside on the same storage infrastructure as the original disk images. Therefore, it’s crucial that you archive these snapshots and store them elsewhere.
Archive a Snapshot After creating a snapshot, mount the snapshot and copy the data to separate storage. Your system might try to compress the backup images as you move them offline. Alternatively, take a block-level copy of the snapshot image, such as with the following procedure:
umount /dev/vg0/mdb-snap01
dd if=/dev/vg0/mdb-snap01 | gzip > mdb-snap01.gz
The above command sequence does the following:
• Ensures that the /dev/vg0/mdb-snap01 device is not mounted. Never take a block-level copy of a filesystem or filesystem snapshot that is mounted.
• Performs a block-level copy of the entire snapshot image using the dd command and compresses the result in a gzipped file in the current working directory.
Warning: This command will create a large gz file in your current working directory. Make sure that you run this command in a file system that has enough free space.
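The dd and gzip pipeline in the archive procedure above can be rehearsed safely with an ordinary file standing in for the snapshot device; the paths and contents in this sketch are placeholders:

```shell
# Rehearse the dd | gzip archive step with an ordinary file standing in for
# the /dev/vg0/mdb-snap01 snapshot device; paths and contents are placeholders.
WORKDIR=$(mktemp -d)
echo "snapshot-contents" > "$WORKDIR/mdb-snap01"             # stands in for the device
dd if="$WORKDIR/mdb-snap01" 2>/dev/null | gzip > "$WORKDIR/mdb-snap01.gz"
gzip -d -c "$WORKDIR/mdb-snap01.gz"                          # round-trip check
```

Against a real snapshot, dd reads the raw block device, so the unmount step beforehand is essential; this rehearsal only demonstrates the pipeline itself.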
Restore a Snapshot To restore a snapshot created with the above method, issue the following sequence of commands:
lvcreate --size 1G --name mdb-new vg0
gzip -d -c mdb-snap01.gz | dd of=/dev/vg0/mdb-new
mount /dev/vg0/mdb-new /srv/mongodb
The above sequence does the following:
• Creates a new logical volume named mdb-new, in the /dev/vg0 volume group. The path to the new device will be /dev/vg0/mdb-new.
Warning: This volume will have a maximum size of 1 gigabyte. The original file system must have had a total size of 1 gigabyte or smaller, or else the restoration will fail. Change 1G to your desired volume size.
• Uncompresses and unarchives the mdb-snap01.gz into the mdb-new disk image.
• Mounts the mdb-new disk image to the /srv/mongodb directory. Modify the mount point to correspond to your MongoDB data file location, or other location as needed.
Note: The restored snapshot will have a stale mongod.lock file. If you do not remove this file from the snapshot, MongoDB may assume that the stale lock file indicates an unclean shutdown. If you’re running with storage.journal.enabled and you do not use db.fsyncLock(), you do not need to remove the mongod.lock file. If you use db.fsyncLock(), you will need to remove the lock.
  • 236. MongoDB Documentation, Release 2.6.4 Restore Directly from a Snapshot To restore a backup without writing to a compressed gz file, use the following sequence of commands: umount /dev/vg0/mdb-snap01 lvcreate --size 1G --name mdb-new vg0 dd if=/dev/vg0/mdb-snap01 of=/dev/vg0/mdb-new mount /dev/vg0/mdb-new /srv/mongodb Remote Backup Storage You can implement off-system backups using the combined process (page 232) and SSH. This sequence is identical to procedures explained above, except that it archives and compresses the backup on a remote system using SSH. Consider the following procedure: umount /dev/vg0/mdb-snap01 dd if=/dev/vg0/mdb-snap01 | ssh username@example.com gzip > /opt/backup/mdb-snap01.gz lvcreate --size 1G --name mdb-new vg0 ssh username@example.com gzip -d -c /opt/backup/mdb-snap01.gz | dd of=/dev/vg0/mdb-new mount /dev/vg0/mdb-new /srv/mongodb Create Backups on Instances that do not have Journaling Enabled If your mongod instance does not run with journaling enabled, or if your journal is on a separate volume, obtaining a functional backup of a consistent state is more complicated. As described in this section, you must flush all writes to disk and lock the database to prevent writes during the backup process. If you have a replica set configuration, then for your backup use a secondary which is not receiving reads (i.e. hidden member). Step 1: Flush writes to disk and lock the database to prevent further writes. To flush writes to disk and to “lock” the database, issue the db.fsyncLock() method in the mongo shell: db.fsyncLock(); Step 2: Perform the backup operation described in Create a Snapshot. Step 3: After the snapshot completes, unlock the database. 
To unlock the database after the snapshot has completed, use the following command in the mongo shell:

db.fsyncUnlock();

Changed in version 2.2: When used in combination with fsync or db.fsyncLock(), mongod may block some reads, including those from mongodump, when queued write operations wait behind the fsync lock.
Restore a Replica Set from MongoDB Backups This procedure outlines the process for taking MongoDB data and restoring that data into a new replica set. Use this approach for seeding test deployments from production backups as well as part of disaster recovery. You cannot restore a single data set to three new mongod instances and then create a replica set; in this situation MongoDB will force the secondaries to perform an initial sync. The procedures in this document describe the correct and efficient ways to deploy a replica set.
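The flush-lock, snapshot, and unlock steps above can be driven from a script with mongo --eval. A sketch, with DRY_RUN on by default so it only prints the commands; the host name and the snapshot command are illustrative, not required values.

```shell
#!/usr/bin/env bash
# Sketch of the lock / snapshot / unlock sequence for an instance without
# journaling. DRY_RUN defaults to 1, so the commands are printed, not run.
set -euo pipefail
DRY_RUN="${DRY_RUN:-1}"
run() { if [ -n "$DRY_RUN" ]; then echo "+ $*"; else eval "$*"; fi; }

backup_with_lock() {
  local host="$1" snapshot_cmd="$2"
  # Step 1: flush writes to disk and block further writes.
  run "mongo --host $host --eval 'db.fsyncLock()'"
  # Step 2: take the snapshot while writes are locked out.
  run "$snapshot_cmd"
  # Step 3: unlock the database once the snapshot completes.
  run "mongo --host $host --eval 'db.fsyncUnlock()'"
}

backup_with_lock localhost \
  "lvcreate --size 100M --snapshot --name mdb-snap01 /dev/vg0/mongodb"
```

Because the unlock step only runs after the snapshot command returns, the lock is held for the whole snapshot, which is the invariant the procedure above depends on.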
Restore Database into a Single Node Replica Set
Step 1: Obtain backup MongoDB Database files. The backup files may come from a file system snapshot (page 229). The MongoDB Management Service (MMS)83 produces MongoDB database files for stored snapshots84 and point-in-time snapshots85. You can also use mongorestore to restore database files using data created with mongodump. See Back Up and Restore with MongoDB Tools (page 234) for more information.
Step 2: Start a mongod using data files from the backup as the data path. The following example uses /data/db as the data path, as specified in the dbpath setting:

mongod --dbpath /data/db

Step 3: Convert the standalone mongod to a single-node replica set. Convert the standalone mongod process to a single-node replica set by shutting down the mongod instance and restarting it with the --replSet option, as in the following example:

mongod --dbpath /data/db --replSet <replName>

Optionally, you can explicitly set the oplogSizeMB option to control the size of the oplog created for this replica set member.
Step 4: Connect to the mongod instance. For example, use the following command to connect to a mongod instance running on the localhost interface:

mongo

Step 5: Initiate the new replica set. Use rs.initiate() to initiate the new replica set, as in the following example:

rs.initiate()

Add Members to the Replica Set MongoDB provides two options for restoring secondary members of a replica set:
• Manually copy the database files to each data directory.
• Allow initial sync (page 537) to distribute data automatically.
The following sections outline both approaches.
Note: If your database is large, initial sync can take a long time to complete. For large databases, it might be preferable to copy the database files onto each host.
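Steps 2 through 5 of the single-node restore above can be sketched as one sequence. DRY_RUN defaults to on so the commands are only printed; the data path and replica set name are the illustrative values from the steps above, and the intermediate shutdown between the two mongod invocations is elided.

```shell
#!/usr/bin/env bash
# Sketch of restoring backup data files into a single-node replica set.
# DRY_RUN defaults to 1, so the commands are printed rather than executed.
set -euo pipefail
DRY_RUN="${DRY_RUN:-1}"
run() { if [ -n "$DRY_RUN" ]; then echo "+ $*"; else eval "$*"; fi; }

restore_single_node_rs() {
  local dbpath="$1" replset="$2"
  # Step 2: start a mongod on the restored data files.
  run "mongod --dbpath $dbpath"
  # Step 3: shut down, then restart with --replSet (shutdown elided here).
  run "mongod --dbpath $dbpath --replSet $replset"
  # Steps 4-5: connect and initiate the new replica set.
  run "mongo --eval 'rs.initiate()'"
}

restore_single_node_rs /data/db rs0
```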
Copy Database Files and Restart mongod Instance Use the following sequence of operations to “seed” additional members of the replica set with the restored data by copying MongoDB data files directly.
Step 1: Shut down the mongod instance that you restored. Use --shutdown or db.shutdownServer() to ensure a clean shut down.
83 https://mms.mongodb.com/?pk_campaign=mongodb-docs-restore-rs-tutorial
84 https://mms.mongodb.com/help/backup/tutorial/restore-from-snapshot/
85 https://mms.mongodb.com/help/backup/tutorial/restore-from-point-in-time-snapshot/
Step 2: Copy the primary’s data directory to each secondary. Copy the primary’s data directory into the dbPath of the other members of the replica set. The dbPath is /data/db by default.
Step 3: Start the mongod instance that you restored.
Step 4: Add the secondaries to the replica set. In a mongo shell connected to the primary, add the secondaries to the replica set using rs.add(). See Deploy a Replica Set (page 545) for more information about deploying a replica set.
Update Secondaries using Initial Sync Use the following sequence of operations to “seed” additional members of the replica set with the restored data using the default initial sync operation.
Step 1: Ensure that the data directories on the prospective replica set members are empty.
Step 2: Add each prospective member to the replica set. When you add a member to the replica set, Initial Sync (page 537) copies the data from the primary to the new member.
Back Up and Restore with MongoDB Tools This document describes the process for writing and restoring backups to files in binary format with the mongodump and mongorestore tools. Use these tools for backups if other backup methods, such as the MMS Backup Service86 or file system snapshots (page 229), are unavailable.
See also: MongoDB Backup Methods (page 172), http://docs.mongodb.org/manual/reference/program/mongodump, and http://docs.mongodb.org/manual/reference/program/mongorestore.
Backup a Database with mongodump mongodump does not dump the content of the local database. To back up all the databases in a cluster via mongodump, you should have the backup (page 367) role. The backup (page 367) role provides all the needed privileges for backing up all databases. The role confers no additional access, in keeping with the policy of least privilege. To back up a given database, you must have read access on the database.
Several roles provide this access, including the backup (page 367) role. To back up the system.profile collection in a database, you must have read access on certain system collections in the database. Several roles provide this access, including the clusterAdmin (page 364) and dbAdmin (page 363) roles.
Changed in version 2.6. To back up users and user-defined roles (page 286) for a given database, you must have access to the admin database. MongoDB stores the user data and role definitions for all databases in the admin database.
86 https://mms.mongodb.com/?pk_campaign=mongodb-docs-tools
Specifically, to back up a given database’s users, you must have the find (page 375) action (page 375) on the admin database’s admin.system.users (page 271) collection. The backup (page 367) and userAdminAnyDatabase (page 368) roles both provide this privilege. To back up the user-defined roles on a database, you must have the find (page 375) action on the admin database’s admin.system.roles (page 270) collection. Both the backup (page 367) and userAdminAnyDatabase (page 368) roles provide this privilege.
Basic mongodump Operations The mongodump utility can back up data by either:
• connecting to a running mongod or mongos instance, or
• accessing data files without an active instance.
The utility can create a backup for an entire server, database or collection, or can use a query to back up just part of a collection. When you run mongodump without any arguments, the command connects to the MongoDB instance on the local system (e.g. 127.0.0.1 or localhost) on port 27017 and creates a database backup named dump/ in the current directory. To back up data from a mongod or mongos instance running on the same machine and on the default port of 27017, use the following command:

mongodump

The data format used by mongodump from version 2.2 or later is incompatible with earlier versions of mongod. Do not use recent versions of mongodump to back up older data stores. You can also specify the --host and --port of the MongoDB instance that mongodump should connect to. For example:

mongodump --host mongodb.example.net --port 27017

mongodump will write BSON files that hold a copy of data accessible via the mongod listening on port 27017 of the mongodb.example.net host. See Create Backups from Non-Local mongod Instances (page 236) for more information. To use mongodump without a running MongoDB instance, specify the --dbpath option to read directly from MongoDB data files.
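Because mongodump writes into the same dump/ directory each time by default, a backup wrapper may want to move any previous output aside before running it again. A small, illustrative helper; the timestamped naming scheme is this sketch's own convention, not part of mongodump.

```shell
#!/usr/bin/env bash
# Move an existing mongodump output directory aside, appending a timestamp,
# so a subsequent `mongodump` run cannot overwrite the earlier backup.
set -euo pipefail

rotate_dump_dir() {
  local dir="${1:-dump}"
  if [ -d "$dir" ]; then
    local stamp
    stamp=$(date +%Y-%m-%d-%H%M%S)
    mv "$dir" "${dir}-${stamp}"
    echo "${dir}-${stamp}"   # report where the old dump went
  fi
}

# Typical use before a new backup:
#   rotate_dump_dir dump
#   mongodump
```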
See Create Backups Without a Running mongod Instance (page 236) for details. To specify a different output directory, you can use the --out or -o option:

mongodump --out /data/backup/

To limit the amount of data included in the database dump, you can specify --db and --collection as options to mongodump. For example:

mongodump --collection myCollection --db test

This operation creates a dump of the collection named myCollection from the database test in a dump/ subdirectory of the current working directory. mongodump overwrites output files if they exist in the backup data folder. Before running the mongodump command multiple times, either ensure that you no longer need the files in the output folder (the default is the dump/ folder) or rename the folders or files.
Point in Time Operation Using Oplogs Use the --oplog option with mongodump to collect the oplog entries to build a point-in-time snapshot of a database within a replica set. With --oplog, mongodump copies all the data from the source database as well as all of the oplog entries from the beginning of the backup procedure until the backup
procedure completes. This backup procedure, in conjunction with mongorestore --oplogReplay, allows you to restore a backup that reflects the specific moment in time that corresponds to when mongodump completed creating the dump file.
Create Backups Without a Running mongod Instance If your MongoDB instance is not running, you can use the --dbpath option to specify the location of your MongoDB instance’s database files. mongodump reads from the data files directly with this operation. This locks the data directory to prevent conflicting writes. The mongod process must not be running or attached to these data files when you run mongodump in this configuration. Consider the following example: Given a MongoDB instance that contains the customers, products, and suppliers databases, the following mongodump operation backs up the databases using the --dbpath option, which specifies the location of the database files on the host:

mongodump --dbpath /data -o dataout

The --out or -o option allows you to specify the directory where mongodump will save the backup. mongodump creates a separate backup directory for each of the backed up databases: dataout/customers, dataout/products, and dataout/suppliers.
Create Backups from Non-Local mongod Instances The --host and --port options for mongodump allow you to connect to and back up from a remote host. Consider the following example:

mongodump --host mongodb1.example.net --port 3017 --username user --password pass --out /opt/backup/mongodump-

On any mongodump command you may, as above, specify username and password credentials to specify database authentication.
Restore a Database with mongorestore Changed in version 2.6. To restore users and user-defined roles (page 286) on a given database, you must have access to the admin database. MongoDB stores the user data and role definitions for all databases in the admin database.
Specifically, to restore users to a given database, you must have the insert (page 375) action (page 375) on the admin database’s admin.system.users (page 271) collection. The restore (page 367) role provides this privilege. To restore user-defined roles to a database, you must have the insert (page 375) action on the admin database’s admin.system.roles (page 270) collection. The restore (page 367) role provides this privilege.
Basic mongorestore Operations The mongorestore utility restores a binary backup created by mongodump. By default, mongorestore looks for a database backup in the dump/ directory. The mongorestore utility can restore data either by:
• connecting to a running mongod or mongos directly, or
• writing to a set of MongoDB data files without use of a running mongod.
mongorestore can restore either an entire database backup or a subset of the backup. To use mongorestore to connect to an active mongod or mongos, use a command with the following prototype form:
mongorestore --port <port number> <path to the backup>

To use mongorestore to write to data files without using a running mongod, use a command with the following prototype form:

mongorestore --dbpath <database path> <path to the backup>

Consider the following example:

mongorestore dump-2013-10-25/

Here, mongorestore imports the database backup in the dump-2013-10-25 directory to the mongod instance running on the localhost interface.
Restore Point in Time Oplog Backup If you created your database dump using the --oplog option to ensure a point-in-time snapshot, call mongorestore with the --oplogReplay option, as in the following example:

mongorestore --oplogReplay

You may also consider using the mongorestore --objcheck option to check the integrity of objects while inserting them into the database, or you may consider the mongorestore --drop option to drop each collection from the database before restoring from backups.
Restore a Subset of data from a Binary Database Dump mongorestore also includes the ability to apply a filter to all input before inserting it into the new database. Consider the following example:

mongorestore --filter '{"field": 1}'

Here, mongorestore only adds documents to the database from the dump located in the dump/ folder if the documents have a field named field that holds a value of 1. Enclose the filter in single quotes (e.g. ') to prevent the filter from interacting with your shell environment.
Restore Without a Running mongod mongorestore can write data to MongoDB data files without needing to connect to a mongod directly.
Example: Restore a Database Without a Running mongod Given a set of backed up databases in the /data/backup/ directory:
• /data/backup/customers,
• /data/backup/products, and
• /data/backup/suppliers
The following mongorestore command restores the products database.
The command uses the --dbpath option to specify the path to the MongoDB data files:

mongorestore --dbpath /data/db --journal /data/backup/products

mongorestore imports the database backup in the /data/backup/products directory to the mongod instance that runs on the localhost interface. The mongorestore operation imports the backup even if the mongod is not running. The --journal option ensures that mongorestore records all operations in the durability journal. The journal prevents data file corruption if anything (e.g. power failure, disk failure, etc.) interrupts the restore operation.
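The mongorestore variants discussed above can be collected into one sketch for comparison. DRY_RUN defaults to on so each command is printed rather than executed; the paths and dump names are the illustrative values from the examples above.

```shell
#!/usr/bin/env bash
# Sketch of the mongorestore variants discussed above. DRY_RUN defaults to 1,
# so each command is printed rather than executed; paths are illustrative.
set -euo pipefail
DRY_RUN="${DRY_RUN:-1}"
run() { if [ -n "$DRY_RUN" ]; then echo "+ $*"; else eval "$*"; fi; }

restore_examples() {
  # Point-in-time restore of a dump taken with --oplog.
  run "mongorestore --oplogReplay dump-2013-10-25/"
  # Restore only documents matching a filter (quote it for the shell).
  run "mongorestore --filter '{\"field\": 1}'"
  # Write directly to data files, journaled, without a running mongod.
  run "mongorestore --dbpath /data/db --journal /data/backup/products"
}

restore_examples
```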
See also: http://docs.mongodb.org/manual/reference/program/mongodump and http://docs.mongodb.org/manual/reference/program/mongorestore.
Restore Backups to Non-Local mongod Instances By default, mongorestore connects to a MongoDB instance running on the localhost interface (e.g. 127.0.0.1) and on the default port (27017). If you want to restore to a different host or port, use the --host and --port options. Consider the following example:

mongorestore --host mongodb1.example.net --port 3017 --username user --password pass /opt/backup/mongodump-

As above, you may specify username and password credentials if your mongod requires authentication.
Backup and Restore Sharded Clusters The following tutorials describe backup and restoration for sharded clusters:
Backup a Small Sharded Cluster with mongodump (page 238) If your sharded cluster holds a small data set, you can use mongodump to capture the entire backup in a reasonable amount of time.
Backup a Sharded Cluster with Filesystem Snapshots (page 239) Use file system snapshots to back up each component in the sharded cluster individually. The procedure involves stopping the cluster balancer. If your system configuration allows file system backups, this might be more efficient than using MongoDB tools.
Backup a Sharded Cluster with Database Dumps (page 241) Create backups using mongodump to back up each component in the cluster individually.
Schedule Backup Window for Sharded Clusters (page 243) Limit the operation of the cluster balancer to provide a window for regular backup operations.
Restore a Single Shard (page 243) An outline of the procedure and considerations for restoring a single shard from a backup.
Restore a Sharded Cluster (page 244) An outline of the procedure and considerations for restoring an entire sharded cluster from backup.
Backup a Small Sharded Cluster with mongodump
Overview If your sharded cluster holds a small data set, you can connect to a mongos using mongodump. You can create backups of your MongoDB cluster if your backup infrastructure can capture the entire backup in a reasonable amount of time and if you have a storage system that can hold the complete MongoDB data set.
See MongoDB Backup Methods (page 172) and Backup and Restore Sharded Clusters (page 238) for complete information on backups in MongoDB and backups of sharded clusters in particular.
Important: By default mongodump issues its queries to the non-primary nodes. To back up all the databases in a cluster via mongodump, you should have the backup (page 367) role. The backup (page 367) role provides all the needed privileges for backing up all databases. The role confers no additional access, in keeping with the policy of least privilege. To back up a given database, you must have read access on the database. Several roles provide this access, including the backup (page 367) role.
To back up the system.profile collection in a database, you must have read access on certain system collections in the database. Several roles provide this access, including the clusterAdmin (page 364) and dbAdmin (page 363) roles.
Changed in version 2.6. To back up users and user-defined roles (page 286) for a given database, you must have access to the admin database. MongoDB stores the user data and role definitions for all databases in the admin database. Specifically, to back up a given database’s users, you must have the find (page 375) action (page 375) on the admin database’s admin.system.users (page 271) collection. The backup (page 367) and userAdminAnyDatabase (page 368) roles both provide this privilege. To back up the user-defined roles on a database, you must have the find (page 375) action on the admin database’s admin.system.roles (page 270) collection. Both the backup (page 367) and userAdminAnyDatabase (page 368) roles provide this privilege.
Considerations If you use mongodump without specifying a database or collection, mongodump will capture collection data and the cluster metadata from the config servers (page 616). You cannot use the --oplog option for mongodump when capturing data from mongos. As a result, if you need to capture a backup that reflects a single moment in time, you must stop all writes to the cluster for the duration of the backup operation.
Procedure
Capture Data You can perform a backup of a sharded cluster by connecting mongodump to a mongos. Use the following operation at your system’s prompt:

mongodump --host mongos3.example.net --port 27017

mongodump will write BSON files that hold a copy of data stored in the sharded cluster accessible via the mongos listening on port 27017 of the mongos3.example.net host.
Restore Data Backups created with mongodump do not reflect the chunks or the distribution of data in the sharded collection or collections.
Like all mongodump output, these backups contain separate directories for each database and BSON files for each collection in that database. You can restore mongodump output to any MongoDB instance, including a standalone, a replica set, or a new sharded cluster. When restoring data to a sharded cluster, you must deploy and configure sharding before restoring data from the backup. See Deploy a Sharded Cluster (page 635) for more information.
Backup a Sharded Cluster with Filesystem Snapshots
Overview This document describes a procedure for taking a backup of all components of a sharded cluster. This procedure uses file system snapshots to capture a copy of the mongod instance. An alternate procedure uses mongodump to create binary database dumps when file-system snapshots are not available. See Backup a Sharded Cluster with Database Dumps (page 241) for the alternate procedure.
See MongoDB Backup Methods (page 172) and Backup and Restore Sharded Clusters (page 238) for complete information on backups in MongoDB and backups of sharded clusters in particular.
Important: To capture a point-in-time backup from a sharded cluster you must stop all writes to the cluster. On a running production system, you can only capture an approximation of a point-in-time snapshot.
Considerations
Balancing It is essential that you stop the balancer before capturing a backup. If the balancer is active while you capture backups, the backup artifacts may be incomplete and/or have duplicate data, as chunks may migrate while recording backups.
Precision In this procedure, you will stop the cluster balancer and take a backup of the config database, and then take backups of each shard in the cluster using a file-system snapshot tool. If you need an exact moment-in-time snapshot of the system, you will need to stop all application writes before taking the filesystem snapshots; otherwise the snapshot will only approximate a moment in time. For approximate point-in-time snapshots, you can improve the quality of the backup while minimizing impact on the cluster by taking the backup from a secondary member of the replica set that provides each shard.
Procedure
Step 1: Disable the balancer. Disable the balancer process that equalizes the distribution of data among the shards. To disable the balancer, use the sh.stopBalancer() method in the mongo shell. For example:

use config
sh.stopBalancer()

For more information, see the Disable the Balancer (page 661) procedure.
Step 2: Lock one secondary member of each replica set in each shard. Lock one secondary member of each replica set in each shard so that your backups reflect the state of your database at the nearest possible approximation of a single moment in time. Lock these mongod instances in as short of an interval as possible. To lock a secondary, connect through the mongo shell to the secondary member’s mongod instance and issue the db.fsyncLock() method.
Step 3: Back up one of the config servers. Backing up a config server (page 616) backs up the sharded cluster’s metadata. You need to back up only one config server, as they all hold the same data.
Do one of the following to back up one of the config servers:
Create a file-system snapshot of the config server. Do this only if the config server has journaling enabled. Use the procedure in Backup and Restore with Filesystem Snapshots (page 229). Never use db.fsyncLock() on config databases.
Create a database dump to back up the config server. Issue mongodump against one of the config mongod instances or via the mongos. If you are running MongoDB 2.4 or later with the --configsvr option, then include the --oplog option to ensure that the dump includes a partial oplog containing operations from the duration of the mongodump operation. For example:

mongodump --oplog --db config

Step 4: Back up the replica set members of the shards that you locked. You may back up the shards in parallel. For each shard, create a snapshot. Use the procedure in Backup and Restore with Filesystem Snapshots (page 229).
Step 5: Unlock locked replica set members. Unlock all locked replica set members of each shard using the db.fsyncUnlock() method in the mongo shell.
Step 6: Enable the balancer. Re-enable the balancer with the sh.setBalancerState() method. Use the following command sequence when connected to the mongos with the mongo shell:

use config
sh.setBalancerState(true)

Backup a Sharded Cluster with Database Dumps
Overview This document describes a procedure for taking a backup of all components of a sharded cluster. This procedure uses mongodump to create dumps of the mongod instance. An alternate procedure uses file system snapshots to capture the backup data, and may be more efficient in some situations if your system configuration allows file system backups. See Backup and Restore Sharded Clusters (page 238) for more information.
See MongoDB Backup Methods (page 172) and Backup and Restore Sharded Clusters (page 238) for complete information on backups in MongoDB and backups of sharded clusters in particular.
Prerequisites
Important: To capture a point-in-time backup from a sharded cluster you must stop all writes to the cluster. On a running production system, you can only capture an approximation of a point-in-time snapshot.
To back up all the databases in a cluster via mongodump, you should have the backup (page 367) role. The backup (page 367) role provides all the needed privileges for backing up all databases. The role confers no additional access, in keeping with the policy of least privilege. To back up a given database, you must have read access on the database. Several roles provide this access, including the backup (page 367) role.
To back up the system.profile collection in a database, you must have read access on certain system collections in the database. Several roles provide this access, including the clusterAdmin (page 364) and dbAdmin (page 363) roles.
Changed in version 2.6.
To back up users and user-defined roles (page 286) for a given database, you must have access to the admin database. MongoDB stores the user data and role definitions for all databases in the admin database. Specifically, to back up a given database’s users, you must have the find (page 375) action (page 375) on the admin database’s admin.system.users (page 271) collection. The backup (page 367) and userAdminAnyDatabase (page 368) roles both provide this privilege. To back up the user-defined roles on a database, you must have the find (page 375) action on the admin database’s admin.system.roles (page 270) collection. Both the backup (page 367) and userAdminAnyDatabase (page 368) roles provide this privilege.
Consideration To create these backups of a sharded cluster, you will stop the cluster balancer and take a backup of the config database, and then take backups of each shard in the cluster using mongodump to capture the backup data. To capture a more exact moment-in-time snapshot of the system, you will need to stop all application writes before taking the filesystem snapshots; otherwise the snapshot will only approximate a moment in time. For approximate point-in-time snapshots, taking the backup from a single offline secondary member of the replica set that provides each shard can improve the quality of the backup while minimizing impact on the cluster.
Procedure
Step 1: Disable the balancer process. Disable the balancer process that equalizes the distribution of data among the shards. To disable the balancer, use the sh.stopBalancer() method in the mongo shell. For example:

use config
sh.setBalancerState(false)

For more information, see the Disable the Balancer (page 661) procedure. If you do not stop the balancer, the backup could have duplicate data or omit data as chunks migrate while recording backups.
Step 2: Lock replica set members. Lock one member of each replica set in each shard so that your backups reflect the state of your database at the nearest possible approximation of a single moment in time. Lock these mongod instances in as short of an interval as possible. To lock or freeze a sharded cluster, you shut down one member of each replica set. Ensure that the oplog has sufficient capacity to allow these secondaries to catch up to the state of the primaries after finishing the backup procedure. See Oplog Size (page 535) for more information.
Step 3: Backup one config server. Use mongodump to back up one of the config servers (page 616). This backs up the cluster’s metadata. You only need to back up one config server, as they all hold the same data. Use the mongodump tool to capture the content of the config mongod instances. Your config servers must run MongoDB 2.4 or later with the --configsvr option, and the mongodump options must include --oplog to capture a consistent copy of the config database:

mongodump --oplog --db config

Step 4: Backup replica set members. Back up the replica set members of the shards that you shut down using mongodump and specifying the --dbpath option. You may back up the shards in parallel. Consider the following invocation:

mongodump --journal --dbpath /data/db/ --out /data/backup/

You must run mongodump on the same system where the mongod ran.
This operation will create a dump of all the data managed by the mongod instances that used the dbPath /data/db/. mongodump writes the output of this dump to the /data/backup/ directory.
Step 5: Restart replica set members. Restart all stopped replica set members of each shard as normal and allow them to catch up with the state of the primary.
Step 6: Re-enable the balancer process. Re-enable the balancer with the sh.setBalancerState() method. Use the following command sequence when connected to the mongos with the mongo shell:

use config
sh.setBalancerState(true)
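Taken end to end, the six steps above look roughly like the following. This is a sketch only: DRY_RUN defaults to on so the script prints the commands it would run, all host names and paths are invented for illustration, and the shutdown/restart of the chosen secondaries (Steps 2 and 5) is elided.

```shell
#!/usr/bin/env bash
# Sketch of the sharded-cluster dump procedure above (Steps 1-6). DRY_RUN
# defaults to 1, so the script prints the commands it would run. All host
# names and paths here are illustrative, not values required by MongoDB.
set -euo pipefail
DRY_RUN="${DRY_RUN:-1}"
run() { if [ -n "$DRY_RUN" ]; then echo "+ $*"; else eval "$*"; fi; }

sharded_dump_backup() {
  local mongos="$1" configsvr="$2"; shift 2
  local secondaries="$*"          # one stopped member per shard
  # Step 1: disable the balancer via the mongos.
  run "mongo --host $mongos --eval 'sh.setBalancerState(false)'"
  # Step 2 (elided): cleanly shut down one member of each replica set.
  # Step 3: dump one config server with a partial oplog.
  run "mongodump --host $configsvr --oplog --db config"
  # Step 4: dump each stopped member's data files directly.
  for host in $secondaries; do
    run "ssh $host mongodump --journal --dbpath /data/db/ --out /data/backup/"
  done
  # Steps 5-6: restart the stopped members (elided), re-enable the balancer.
  run "mongo --host $mongos --eval 'sh.setBalancerState(true)'"
}

sharded_dump_backup mongos.example.net cfg1.example.net \
  shard1b.example.net shard2b.example.net
```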
Schedule Backup Window for Sharded Clusters
Overview In a sharded cluster, the balancer process is responsible for distributing sharded data around the cluster, so that each shard has roughly the same amount of data. However, when creating backups from a sharded cluster it is important that you disable the balancer while taking backups to ensure that no chunk migrations affect the content of the backup captured by the backup procedure. Using the procedure outlined in the section Disable the Balancer (page 661) you can manually stop the balancer process temporarily. As an alternative you can use this procedure to define a balancing window so that the balancer is always disabled during your automated backup operation.
Procedure If you have an automated backup schedule, you can disable all balancing operations for a period of time. For instance, consider the following command:

use config
db.settings.update( { _id : "balancer" }, { $set : { activeWindow : { start : "6:00", stop : "23:00" } } }, true )

This operation configures the balancer to run between 6:00am and 11:00pm, server time. Schedule your backup operation to run and complete outside of this time. Ensure that the backup can complete outside the window when the balancer is running and that the balancer can effectively balance the collection among the shards in the window allotted to each.
Restore a Single Shard
Overview Restoring a single shard from backup with other unaffected shards requires a number of special considerations and practices. This document outlines the additional tasks you must perform when restoring a single shard. Consider the following resources on backups in general as well as backup and restoration of sharded clusters specifically:
• Backup and Restore Sharded Clusters (page 238)
• Restore a Sharded Cluster (page 244)
• MongoDB Backup Methods (page 172)
Procedure Always restore sharded clusters as a whole.
When you restore a single shard, keep in mind that the balancer process might have moved chunks to or from this shard since the last backup. If that is the case, you must manually move those chunks, as described in this procedure.
Step 1: Restore the shard as you would any other mongod instance. See MongoDB Backup Methods (page 172) for overviews of these procedures.
Step 2: Manage the chunks. For all chunks that migrate away from this shard, you do not need to do anything at this time. You do not need to delete these documents from the shard because the chunks are automatically filtered out from queries by mongos. You can remove these documents from the shard, if you like, at your leisure. For chunks that migrate to this shard after the most recent backup, you must manually recover the chunks using backups of other shards, or some other source. To determine what chunks have moved, view the changelog collection in the Config Database (page 679).
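One way to inspect the changelog for migrations is a query from the shell. A sketch (DRY_RUN on by default, so the command is only printed): the assumption that migration entries carry a `what` field with values beginning with "moveChunk" is this sketch's own, so verify it against the documents in your config database before relying on it.

```shell
#!/usr/bin/env bash
# Sketch: list chunk-migration entries from the config database changelog.
# DRY_RUN defaults to 1, so the command is printed rather than executed.
set -euo pipefail
DRY_RUN="${DRY_RUN:-1}"
run() { if [ -n "$DRY_RUN" ]; then echo "+ $*"; else eval "$*"; fi; }

list_chunk_moves() {
  local host="$1"
  # The `what` field and its "moveChunk..." values are an assumption about
  # the changelog schema; check a few documents in your own cluster first.
  run "mongo --host $host config --eval 'db.changelog.find({ what: /^moveChunk/ }).forEach(printjson)'"
}

list_chunk_moves mongos.example.net
```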
Restore a Sharded Cluster

Overview
You can restore a sharded cluster either from snapshots (page 229) or from BSON database dumps (page 241) created by the mongodump tool. This document provides procedures for both:
• Restore a Sharded Cluster with Filesystem Snapshots (page 244)
• Restore a Sharded Cluster with Database Dumps (page 245)

Related Documents
For an overview of backups in MongoDB, see MongoDB Backup Methods (page 172). For complete information on backups and backups of sharded clusters in particular, see Backup and Restore Sharded Clusters (page 238).
For backup procedures, see:
• Backup a Sharded Cluster with Filesystem Snapshots (page 239)
• Backup a Sharded Cluster with Database Dumps (page 241)

Procedures
Use the procedure for the type of backup files to restore.

Restore a Sharded Cluster with Filesystem Snapshots

Step 1: Shut down the entire cluster.
Stop all mongos and mongod processes, including all shards and all config servers. Connect to each member and use the following operation:

use admin
db.shutdownServer()

For version 2.4 or earlier, use db.shutdownServer({force:true}).

Step 2: Restore the data files.
On each server, extract the data files to the location where the mongod instance will access them. Restore the following:
Data files for each server in each shard. Because replica sets provide each production shard, restore all the members of the replica set or use the other standard approaches for restoring a replica set from backup. See the Restore a Snapshot (page 231) and Restore a Database with mongorestore (page 236) sections for details on these procedures.
Data files for each config server.

Step 3: Restart the config servers.
Restart each config server (page 616) mongod instance by issuing a command similar to the following for each, using values appropriate to your configuration:

mongod --configsvr --dbpath /data/configdb --port 27019

Step 4: If shard hostnames have changed, update the config string and config database.
If shard hostnames have changed, start one mongos instance using the updated config string with the new configdb hostnames and ports. Then update the shards collection in the Config Database (page 679) to reflect the new hostnames. Finally, stop the mongos instance.
Step 5: Restart all the shard mongod instances.

Step 6: Restart all the mongos instances.
If shard hostnames have changed, make sure to use the updated config string.

Step 7: Connect to a mongos to ensure the cluster is operational.
Connect to a mongos instance from a mongo shell and use the db.printShardingStatus() method to ensure that the cluster is operational, as follows:

db.printShardingStatus()
show collections

Restore a Sharded Cluster with Database Dumps

Step 1: Shut down the entire cluster.
Stop all mongos and mongod processes, including all shards and all config servers. Connect to each member and use the following operation:

use admin
db.shutdownServer()

For version 2.4 or earlier, use db.shutdownServer({force:true}).

Step 2: Restore the data files.
On each server, use mongorestore to restore the database dump to the location where the mongod instance will access the data. The following example restores a database dump located at /opt/backup to the /data/ directory. This requires that there are no active mongod instances attached to the /data directory.

mongorestore --dbpath /data /opt/backup

Step 3: Restart the config servers.
Restart each config server (page 616) mongod instance by issuing a command similar to the following for each, using values appropriate to your configuration:

mongod --configsvr --dbpath /data/configdb --port 27019

Step 4: If shard hostnames have changed, update the config string and config database.
If shard hostnames have changed, start one mongos instance using the updated config string with the new configdb hostnames and ports. Then update the shards collection in the Config Database (page 679) to reflect the new hostnames. Finally, stop the mongos instance.

Step 5: Restart all the shard mongod instances.

Step 6: Restart all the mongos instances.
If shard hostnames have changed, make sure to use the updated config string.
Step 7: Connect to a mongos to ensure the cluster is operational.
Connect to a mongos instance from a mongo shell and use the db.printShardingStatus() method to ensure that the cluster is operational, as follows:

db.printShardingStatus()
show collections

Recover Data after an Unexpected Shutdown

If MongoDB does not shut down cleanly 87 the on-disk representation of the data files will likely reflect an inconsistent state which could lead to data corruption. 88
To prevent data inconsistency and corruption, always shut down the database cleanly and use durability journaling. MongoDB writes data to the journal, by default, every 100 milliseconds, such that MongoDB can always recover to a consistent state even in the case of an unclean shutdown due to power loss or other system failure.
If you are not running as part of a replica set and do not have journaling enabled, use the following procedure to recover data that may be in an inconsistent state. If you are running as part of a replica set, you should always restore from a backup or restart the mongod instance with an empty dbPath and allow MongoDB to perform an initial sync to restore the data.
See also:
The Administration (page 171) documents, including Replica Set Syncing (page 535), and the documentation on the --repair option and the repairPath and storage.journal.enabled settings.

Process
Indications
When you are aware of a mongod instance running without journaling that stops unexpectedly and you’re not running with replication, you should always run the repair operation before starting MongoDB again. If you’re using replication, then restore from a backup and allow replication to perform an initial sync (page 535) to restore data.
If the mongod.lock file in the data directory specified by dbPath, /data/db by default, is not a zero-byte file, then mongod will refuse to start, and you will find a message that contains the following line in your MongoDB log or output:

Unclean shutdown detected.

This indicates that you need to run mongod with the --repair option. If you run repair when the mongod.lock file exists in your dbPath, or the optional --repairpath, you will see a message that contains the following line:

old lock file: /data/db/mongod.lock. probably means unclean shutdown

If you see this message, as a last resort you may remove the lockfile and run the repair operation before starting the database normally, as in the following procedure:

87 To ensure a clean shut down, use db.shutdownServer() from the mongo shell, your control script, the mongod --shutdown option on Linux systems, “Control-C” when running mongod in interactive mode, or kill $(pidof mongod) or kill -2 $(pidof mongod).
88 You can also use the db.collection.validate() method to test the integrity of a single collection. However, this process is time consuming, and without journaling you can safely assume that the data is in an invalid state and you should either run the repair operation or resync from an intact member of the replica set.
Overview

Warning: Recovering a member of a replica set. Do not use this procedure to recover a member of a replica set. Instead you should either restore from a backup (page 172) or perform an initial sync using data from an intact member of the set, as described in Resync a Member of a Replica Set (page 575).

There are two processes to repair data files that result from an unexpected shutdown:
• Use the --repair option in conjunction with the --repairpath option. mongod will read the existing data files, and write the existing data to new data files. This does not modify or alter the existing data files. You do not need to remove the mongod.lock file before using this procedure.
• Use the --repair option. mongod will read the existing data files, write the existing data to new files and replace the existing, possibly corrupt, files with new files. You must remove the mongod.lock file before using this procedure.

Note: --repair functionality is also available in the shell with the db.repairDatabase() helper for the repairDatabase command.

Procedures

Important: Always run mongod as the same user to avoid changing the permissions of the MongoDB data files.

Repair Data Files and Preserve Original Files
To repair your data files while preserving the original data files unmodified, use the --repairpath option, as in the first procedure below.

Repair Data Files without Preserving Original Files
To repair your data files without preserving the original files, do not use the --repairpath option, as in the second procedure below:

Warning: After you remove the mongod.lock file you must run the --repair process before using your database.

Step 1: Start mongod using the options to write the repaired files to a new location.
Start the mongod instance using the --repair option and the --repairpath option.
Issue a command similar to the following:

mongod --dbpath /data/db --repair --repairpath /data/db0

When this completes, the new repaired data files will be in the /data/db0 directory.

Step 2: Start mongod with the new data directory.
Start mongod using the following invocation to point the dbPath at /data/db0:

mongod --dbpath /data/db0

Once you confirm that the data files are operational you may delete or archive the old data files in the /data/db directory. You may also wish to move the repaired files to the old database location or update the dbPath to indicate the new location.

To repair the data files without preserving the original files, instead use the following steps:

Step 1: Remove the stale lock file.
For example:
rm /data/db/mongod.lock

Replace /data/db with your dbPath where your MongoDB instance’s data files reside.

Step 2: Start mongod using the option to replace the original files with the repaired files.
Start the mongod instance using the --repair option, which replaces the original data files with the repaired data files. Issue a command similar to the following:

mongod --dbpath /data/db --repair

When this completes, the repaired data files will replace the original data files in the /data/db directory.

Step 3: Start mongod as usual.
Start mongod using the following invocation to point the dbPath at /data/db:

mongod --dbpath /data/db

mongod.lock
In normal operation, you should never remove the mongod.lock file and start mongod. Instead, consider one of the above methods to recover the database and remove the lock files. In dire situations you can remove the lockfile, start the database using the possibly corrupt files, and attempt to recover data from the database; however, it’s impossible to predict the state of the database in these situations.
If you are not running with journaling, and your database shuts down unexpectedly for any reason, you should always proceed as if your database is in an inconsistent and likely corrupt state. If at all possible restore from backup (page 172) or, if running as a replica set, restore by performing an initial sync using data from an intact member of the set, as described in Resync a Member of a Replica Set (page 575).

5.2.3 MongoDB Scripting

The mongo shell is an interactive JavaScript shell for MongoDB, and is part of all MongoDB distributions 89. This section provides an introduction to the shell, and outlines key functions, operations, and use of the mongo shell. Also consider FAQ: The mongo Shell (page 700) and the shell method and other relevant reference material.
Note: Most examples in the MongoDB Manual use the mongo shell; however, many drivers provide similar interfaces to MongoDB.

Server-side JavaScript (page 249) Details MongoDB’s support for executing JavaScript code for server-side operations.
Data Types in the mongo Shell (page 250) Describes the super-set of JSON available for use in the mongo shell.
Write Scripts for the mongo Shell (page 253) An introduction to the mongo shell for writing scripts to manipulate data and administer MongoDB.
Getting Started with the mongo Shell (page 255) Introduces the use and operation of the MongoDB shell.
Access the mongo Shell Help Information (page 259) Describes the available methods for accessing online help for the operation of the mongo interactive shell.
mongo Shell Quick Reference (page 261) A high level reference to the use and operation of the mongo shell.

89 http://www.mongodb.org/downloads
Server-side JavaScript

Changed in version 2.4: The V8 JavaScript engine, which became the default in 2.4, allows multiple JavaScript operations to execute at the same time. Prior to 2.4, MongoDB operations that required the JavaScript interpreter had to acquire a lock, and a single mongod could only run a single JavaScript operation at a time.

Overview
MongoDB supports the execution of JavaScript code for the following server-side operations:
• mapReduce and the corresponding mongo shell method db.collection.mapReduce(). See Map-Reduce (page 394) for more information.
• eval command, and the corresponding mongo shell method db.eval()
• $where operator
• Running .js files via a mongo shell Instance on the Server (page 249)

JavaScript in MongoDB
Although the above operations use JavaScript, most interactions with MongoDB do not use JavaScript but use an idiomatic driver in the language of the interacting application.
See also:
Store a JavaScript Function on the Server (page 217)
You can disable all server-side execution of JavaScript by passing the --noscripting option on the command line or setting security.javascriptEnabled in a configuration file.

Running .js files via a mongo shell Instance on the Server
You can run a JavaScript (.js) file using a mongo shell instance on the server. This is a good technique for performing batch administrative work. When you run the mongo shell on the server, connecting via the localhost interface, the connection is fast with low latency.
The command helpers (page 261) provided in the mongo shell are not available in JavaScript files because they are not valid JavaScript. The following table maps the most common mongo shell helpers to their JavaScript equivalents.
Shell Helpers                     JavaScript Equivalents
show dbs, show databases          db.adminCommand('listDatabases')
use <db>                          db = db.getSiblingDB('<db>')
show collections                  db.getCollectionNames()
show users                        db.getUsers()
show roles                        db.getRoles({showBuiltinRoles: true})
show log <logname>                db.adminCommand({ 'getLog' : '<logname>' })
show logs                         db.adminCommand({ 'getLog' : '*' })
it                                cursor = db.collection.find()
                                  if ( cursor.hasNext() ){ cursor.next(); }

Concurrency
Refer to the individual method or operator documentation for any concurrency information. See also the concurrency table (page 703).

Data Types in the mongo Shell

MongoDB BSON provides support for more data types than JSON. Drivers provide native support for these data types in host languages, and the mongo shell also provides several helper classes to support the use of these data types in the mongo JavaScript shell. See http://docs.mongodb.org/manual/reference/mongodb-extended-json for additional information.

Types

Date
The mongo shell provides various methods to return the date, either as a string or as a Date object:
• Date() method which returns the current date as a string.
• new Date() constructor which returns a Date object using the ISODate() wrapper.
• ISODate() constructor which returns a Date object using the ISODate() wrapper.
Internally, Date objects are stored as a 64 bit integer representing the number of milliseconds since the Unix epoch (Jan 1, 1970), which results in a representable date range of about 290 million years into the past and future.
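The string-versus-object distinction above follows standard JavaScript Date semantics, which the mongo shell inherits. The following plain-JavaScript sketch can be run in any JS engine, with no MongoDB server required (the ISODate() and other shell-only wrappers are omitted):

```javascript
// Calling Date() as a plain function returns the current date as a string.
var myDateString = Date();
console.log(typeof myDateString); // "string"

// Calling it as a constructor returns a Date object.
var myDate = new Date();
console.log(myDate instanceof Date); // true

// Internally a Date is milliseconds since the Unix epoch (Jan 1, 1970).
var epoch = new Date(0);
console.log(epoch.getTime());      // 0
console.log(epoch.toISOString()); // "1970-01-01T00:00:00.000Z"
```

In the mongo shell the same object would print wrapped in the ISODate() helper, but the underlying value is the same millisecond count.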
Return Date as a String
To return the date as a string, use the Date() method, as in the following example:

var myDateString = Date();

To print the value of the variable, type the variable name in the shell, as in the following:

myDateString

The result is the value of myDateString:

Wed Dec 19 2012 01:03:25 GMT-0500 (EST)

To verify the type, use the typeof operator, as in the following:

typeof myDateString

The operation returns string.

Return Date
The mongo shell wraps objects of Date type with the ISODate helper; however, the objects remain of type Date.
The following example uses both the new Date() constructor and the ISODate() constructor to return Date objects.

var myDate = new Date();
var myDateInitUsingISODateWrapper = ISODate();

You can use the new operator with the ISODate() constructor as well.
To print the value of the variable, type the variable name in the shell, as in the following:

myDate

The result is the Date value of myDate wrapped in the ISODate() helper:

ISODate("2012-12-19T06:01:17.171Z")

To verify the type, use the instanceof operator, as in the following:

myDate instanceof Date
myDateInitUsingISODateWrapper instanceof Date

The operation returns true for both.

ObjectId
The mongo shell provides the ObjectId() wrapper class around the ObjectId data type. To generate a new ObjectId, use the following operation in the mongo shell:

new ObjectId

See ObjectId (page 165) for full documentation of ObjectIds in MongoDB.

NumberLong
By default, the mongo shell treats all numbers as floating-point values. The mongo shell provides the NumberLong() wrapper to handle 64-bit integers. The NumberLong() wrapper accepts the long as a string:
NumberLong("2090845886852")

The following examples use the NumberLong() wrapper to write to the collection:

db.collection.insert( { _id: 10, calc: NumberLong("2090845886852") } )
db.collection.update( { _id: 10 }, { $set: { calc: NumberLong("2555555000000") } } )
db.collection.update( { _id: 10 }, { $inc: { calc: NumberLong(5) } } )

Retrieve the document to verify:

db.collection.findOne( { _id: 10 } )

In the returned document, the calc field contains a NumberLong object:

{ "_id" : 10, "calc" : NumberLong("2555555000005") }

If you use $inc to increment the value of a field that contains a NumberLong object by a float, the data type changes to a floating point value, as in the following example:
1. Use $inc to increment the calc field by 5, which the mongo shell treats as a float:

db.collection.update( { _id: 10 }, { $inc: { calc: 5 } } )

2. Retrieve the updated document:

db.collection.findOne( { _id: 10 } )

In the updated document, the calc field contains a floating point value:

{ "_id" : 10, "calc" : 2555555000010 }

NumberInt
By default, the mongo shell treats all numbers as floating-point values. The mongo shell provides the NumberInt() constructor to explicitly specify 32-bit integers.

Check Types in the mongo Shell
To determine the type of fields, the mongo shell provides the instanceof and typeof operators.

instanceof
instanceof returns a boolean to test if a value is an instance of some type. For example, the following operation tests whether the _id field is an instance of type ObjectId:

mydoc._id instanceof ObjectId

The operation returns true.

typeof
typeof returns the type of a field. For example, the following operation returns the type of the _id field:

typeof mydoc._id

In this case typeof will return the more generic object type rather than ObjectId type.
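The reason a NumberLong wrapper is needed at all is that JavaScript numbers are 64-bit floats, which represent integers exactly only up to 2^53. A plain-JavaScript sketch of the underlying problem (standard JS semantics, no mongo required):

```javascript
// JavaScript numbers lose integer precision above 2^53 - 1.
var maxSafe = Number.MAX_SAFE_INTEGER; // 9007199254740991

// Adding 1 and adding 2 give the same (rounded) result:
console.log(maxSafe + 1 === maxSafe + 2); // true -- precision lost

// The value used in the NumberLong examples above still fits exactly
// in a float, but larger 64-bit integers would not:
console.log(Number("2090845886852") === 2090845886852); // true
```

This is why the NumberLong() wrapper accepts the long as a string: a sufficiently large 64-bit integer could not survive a round-trip through a JavaScript number literal.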
Write Scripts for the mongo Shell

You can write scripts for the mongo shell in JavaScript that manipulate data in MongoDB or perform administrative operations. For more information about the mongo shell see MongoDB Scripting (page 248), and see the Running .js files via a mongo shell Instance on the Server (page 249) section for more information about using these mongo scripts.
This tutorial provides an introduction to writing JavaScript that uses the mongo shell to access MongoDB.

Opening New Connections
From the mongo shell or from a JavaScript file, you can instantiate database connections using the Mongo() constructor:

new Mongo()
new Mongo(<host>)
new Mongo(<host:port>)

Consider the following example that instantiates a new connection to the MongoDB instance running on localhost on the default port and sets the global db variable to myDatabase using the getDB() method:

conn = new Mongo();
db = conn.getDB("myDatabase");

Additionally, you can use the connect() method to connect to the MongoDB instance. The following example connects to the MongoDB instance that is running on localhost with the non-default port 27020 and sets the global db variable:

db = connect("localhost:27020/myDatabase");

Differences Between Interactive and Scripted mongo
When writing scripts for the mongo shell, consider the following:
• To set the db global variable, use the getDB() method or the connect() method. You can assign the database reference to a variable other than db.
• Write operations in the mongo shell use “safe writes” by default. If performing bulk operations, use the Bulk() methods. See Write Method Acknowledgements (page 743) for more information. Changed in version 2.6: Before MongoDB 2.6, call db.getLastError() explicitly to wait for the result of write operations (page 67).
• You cannot use any shell helper (e.g. use <dbname>, show dbs, etc.) inside the JavaScript file because they are not valid JavaScript.
The following table maps the most common mongo shell helpers to their JavaScript equivalents.
Shell Helpers                     JavaScript Equivalents
show dbs, show databases          db.adminCommand('listDatabases')
use <db>                          db = db.getSiblingDB('<db>')
show collections                  db.getCollectionNames()
show users                        db.getUsers()
show roles                        db.getRoles({showBuiltinRoles: true})
show log <logname>                db.adminCommand({ 'getLog' : '<logname>' })
show logs                         db.adminCommand({ 'getLog' : '*' })
it                                cursor = db.collection.find()
                                  if ( cursor.hasNext() ){ cursor.next(); }

• In interactive mode, mongo prints the results of operations including the content of all cursors. In scripts, either use the JavaScript print() function or the mongo-specific printjson() function, which prints formatted JSON.

Example
To print all items in a result cursor in mongo shell scripts, use the following idiom:

cursor = db.collection.find();
while ( cursor.hasNext() ) {
    printjson( cursor.next() );
}

Scripting
From the system prompt, use mongo to evaluate JavaScript.

--eval option
Use the --eval option to mongo to pass the shell a JavaScript fragment, as in the following:

mongo test --eval "printjson(db.getCollectionNames())"

This returns the output of db.getCollectionNames() using the mongo shell connected to the mongod or mongos instance running on port 27017 on the localhost interface.

Execute a JavaScript file
You can specify a .js file to the mongo shell, and mongo will execute the JavaScript directly. Consider the following example:
mongo localhost:27017/test myjsfile.js

This operation executes the myjsfile.js script in a mongo shell that connects to the test database on the mongod instance accessible via the localhost interface on port 27017.
Alternatively, you can specify the mongodb connection parameters inside the JavaScript file using the Mongo() constructor. See Opening New Connections (page 253) for more information.
You can execute a .js file from within the mongo shell, using the load() function, as in the following:

load("myjstest.js")

This function loads and executes the myjstest.js file.
The load() method accepts relative and absolute paths. If the current working directory of the mongo shell is /data/db, and the myjstest.js resides in the /data/db/scripts directory, then the following calls within the mongo shell would be equivalent:

load("scripts/myjstest.js")
load("/data/db/scripts/myjstest.js")

Note: There is no search path for the load() function. If the desired script is not in the current working directory or the full specified path, mongo will not be able to access the file.

Getting Started with the mongo Shell

This document provides a basic introduction to using the mongo shell. See Install MongoDB (page 5) for instructions on installing MongoDB for your system.

Start the mongo Shell
To start the mongo shell and connect to your MongoDB instance running on localhost with default port:
1. Go to your <mongodb installation dir>:

cd <mongodb installation dir>

2. Type ./bin/mongo to start mongo:

./bin/mongo

If you have added the <mongodb installation dir>/bin to the PATH environment variable, you can just type mongo instead of ./bin/mongo.
3. To display the database you are using, type db:

db

The operation should return test, which is the default database. To switch databases, issue the use <db> helper, as in the following example:

use <database>

To list the available databases, use the helper show dbs.
See also How can I access different databases temporarily? (page 700) to access a different database from the current database without switching your current database context (i.e. db).
To start the mongo shell with other options, see examples of starting up mongo and the mongo reference, which provides details on the available options.

Note: When starting, mongo checks the user’s HOME directory for a JavaScript file named .mongorc.js. If found, mongo interprets the content of .mongorc.js before displaying the prompt for the first time. If you use the shell to evaluate a JavaScript file or expression, either by using the --eval option on the command line or by specifying a .js file to mongo, mongo will read the .mongorc.js file after the JavaScript has finished processing. You can prevent .mongorc.js from being loaded by using the --norc option.

Executing Queries
From the mongo shell, you can use the shell methods to run queries, as in the following example:

db.<collection>.find()

• The db refers to the current database.
• The <collection> is the name of the collection to query. See Collection Help (page 259) to list the available collections.
If the mongo shell does not accept the name of the collection, for instance if the name contains a space, hyphen, or starts with a number, you can use an alternate syntax to refer to the collection, as in the following:

db["3test"].find()
db.getCollection("3test").find()

• The find() method is the JavaScript method to retrieve documents from <collection>. The find() method returns a cursor to the results; however, in the mongo shell, if the returned cursor is not assigned to a variable using the var keyword, then the cursor is automatically iterated up to 20 times to print up to the first 20 documents that match the query. The mongo shell will prompt Type it to iterate another 20 times.
You can set the DBQuery.shellBatchSize attribute to change the number of iterations from the default value 20, as in the following example which sets it to 10:

DBQuery.shellBatchSize = 10;

For more information and examples on cursor handling in the mongo shell, see Cursors (page 59).
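The batching behavior described above can be sketched in plain JavaScript with a hypothetical in-memory cursor standing in for a real find() result (the real shell cursor exposes the same hasNext()/next() interface; no server is required to try this):

```javascript
// Hypothetical stand-in cursor with the mongo shell's hasNext()/next() API.
function makeCursor(docs) {
  var i = 0;
  return {
    hasNext: function () { return i < docs.length; },
    next: function () { return docs[i++]; }
  };
}

// Mimics one round of the shell's automatic iteration: print (here, collect)
// up to batchSize documents, then stop; the shell would prompt "Type it"
// if the cursor still has more results.
function printBatch(cursor, batchSize) {
  var printed = [];
  for (var n = 0; n < batchSize && cursor.hasNext(); n++) {
    printed.push(cursor.next());
  }
  return printed;
}

var docs = [];
for (var k = 0; k < 25; k++) docs.push({ _id: k });

var cursor = makeCursor(docs);
var firstBatch = printBatch(cursor, 20); // default DBQuery.shellBatchSize
console.log(firstBatch.length); // 20
console.log(cursor.hasNext());  // true -- shell would prompt "Type it"
```

Typing `it` in the shell corresponds to calling printBatch again on the same cursor, which here would yield the remaining 5 documents.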
See also Cursor Help (page 260) for a list of cursor help in the mongo shell.
For more documentation of basic MongoDB operations in the mongo shell, see:
• Getting Started with MongoDB (page 43)
• mongo Shell Quick Reference (page 261)
• Read Operations (page 55)
• Write Operations (page 67)
• Indexing Tutorials (page 464)

Print
The mongo shell automatically prints the results of the find() method if the returned cursor is not assigned to a variable using the var keyword. To format the result, you can add the .pretty() to the operation, as in the following:
db.<collection>.find().pretty()

In addition, you can use the following explicit print methods in the mongo shell:
• print() to print without formatting
• print(tojson(<obj>)) to print with JSON formatting, equivalent to printjson()
• printjson() to print with JSON formatting, equivalent to print(tojson(<obj>))

Evaluate a JavaScript File
You can execute a .js file from within the mongo shell, using the load() function, as in the following:

load("myjstest.js")

This function loads and executes the myjstest.js file.
The load() method accepts relative and absolute paths. If the current working directory of the mongo shell is /data/db, and the myjstest.js resides in the /data/db/scripts directory, then the following calls within the mongo shell would be equivalent:

load("scripts/myjstest.js")
load("/data/db/scripts/myjstest.js")

Note: There is no search path for the load() function. If the desired script is not in the current working directory or the full specified path, mongo will not be able to access the file.

Use a Custom Prompt
You may modify the content of the prompt by creating the variable prompt in the shell. The prompt variable can hold strings as well as any arbitrary JavaScript. If prompt holds a function that returns a string, mongo can display dynamic information in each prompt. Consider the following examples:

Example
To create a prompt displaying the number of operations issued in the current session, define the following variables:

cmdCount = 1;
prompt = function() {
    return (cmdCount++) + "> ";
}

The prompt would then resemble the following:

1> db.collection.find()
2> show collections
3>

Example
To create a mongo shell prompt in the form of <database>@<hostname>$ define the following variables:

host = db.serverStatus().host;
prompt = function() {
    return db+"@"+host+"$ ";
}

The prompt would then resemble the following:

<database>@<hostname>$ use records
switched to db records
records@<hostname>$

Example
To create a mongo shell prompt that contains the system up time and the number of documents in the current database, define the following prompt variable:

prompt = function() {
    return "Uptime:"+db.serverStatus().uptime+" Documents:"+db.stats().objects+" > ";
}

The prompt would then resemble the following:

Uptime:5897 Documents:6 > db.people.save({name : "James"});
Uptime:5948 Documents:7 >

Use an External Editor in the mongo Shell
New in version 2.2.
In the mongo shell you can use the edit operation to edit a function or variable in an external editor. The edit operation uses the value of your environment’s EDITOR variable.
At your system prompt you can define the EDITOR variable and start mongo with the following two operations:

export EDITOR=vim
mongo

Then, consider the following example shell session:

MongoDB shell version: 2.2.0
> function f() {}
> edit f
> f
function f() { print("this really works"); }
> f()
this really works
> o = {}
{ }
> edit o
> o
{ "soDoes" : "this" }
>

Note: As the mongo shell interprets code edited in an external editor, it may modify code in functions, depending on the JavaScript compiler. For example, mongo may convert 1+1 to 2 or remove comments. The actual changes affect only the appearance of the code and will vary based on the version of JavaScript used, but will not affect the semantics of the code.
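The first prompt example above works because the prompt function reads and updates state (cmdCount) that persists between prompts. The same counting behavior can be checked in any JavaScript engine by wrapping the counter in a closure (a sketch; the global-variable form shown above is what the shell itself uses):

```javascript
// A prompt function that numbers each command, as in the cmdCount example.
// The closure keeps the counter private instead of using a global variable.
function makePrompt() {
  var cmdCount = 1;
  return function () {
    return (cmdCount++) + "> ";
  };
}

var prompt = makePrompt();
console.log(prompt()); // "1> "
console.log(prompt()); // "2> "
```

In the mongo shell itself you would assign the inner function directly to the global prompt variable, which the shell calls before displaying each prompt.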
Exit the Shell
To exit the shell, type quit() or use the <Ctrl-c> shortcut.

Access the mongo Shell Help Information

In addition to the documentation in the MongoDB Manual, the mongo shell provides some additional information in its “online” help system. This document provides an overview of accessing this help information.
See also:
• mongo Manual Page
• MongoDB Scripting (page 248), and
• mongo Shell Quick Reference (page 261).

Command Line Help
To see the list of options and help for starting the mongo shell, use the --help option from the command line:

mongo --help

Shell Help
To see the list of help, in the mongo shell, type help:

help

Database Help
• To see the list of databases on the server, use the show dbs command:

show dbs

New in version 2.4: show databases is now an alias for show dbs.
• To see the list of help for methods you can use on the db object, call the db.help() method:

db.help()

• To see the implementation of a method in the shell, type the db.<method name> without the parentheses (()), as in the following example which will return the implementation of the method db.addUser():

db.addUser

Collection Help
• To see the list of collections in the current database, use the show collections command:
show collections

• To see the help for methods available on the collection objects (e.g. db.<collection>), use the db.<collection>.help() method:

db.collection.help()

<collection> can be the name of a collection that exists, although you may specify a collection that doesn't exist.
• To see the collection method implementation, type the db.<collection>.<method> name without the parentheses (()), as in the following example, which will return the implementation of the save() method:

db.collection.save

Cursor Help
When you perform read operations (page 55) with the find() method in the mongo shell, you can use various cursor methods to modify the find() behavior and various JavaScript methods to handle the cursor returned from the find() method.
• To list the available modifier and cursor handling methods, use the db.collection.find().help() command:

db.collection.find().help()

<collection> can be the name of a collection that exists, although you may specify a collection that doesn't exist.
• To see the implementation of a cursor method, type the db.<collection>.find().<method> name without the parentheses (()), as in the following example, which will return the implementation of the toArray() method:

db.collection.find().toArray

Some useful methods for handling cursors are:
• hasNext(), which checks whether the cursor has more documents to return.
• next(), which returns the next document and advances the cursor position forward by one.
• forEach(<function>), which iterates the whole cursor and applies the <function> to each document returned by the cursor. The <function> expects a single argument which corresponds to the document from each iteration.
For examples on iterating a cursor and retrieving the documents from the cursor, see cursor handling (page 59). See also js-query-cursor-methods for all available cursor methods.
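Outside a live mongod, the hasNext()/next()/forEach() pattern can be illustrated with a minimal stand-in cursor over an in-memory array (the MiniCursor class and the sample documents are illustrative, not part of the shell API):

```javascript
// Minimal stand-in for a mongo shell cursor, to illustrate the
// hasNext()/next()/forEach() iteration pattern described above.
class MiniCursor {
  constructor(docs) { this.docs = docs; this.pos = 0; }
  hasNext() { return this.pos < this.docs.length; }   // more documents?
  next() { return this.docs[this.pos++]; }            // return and advance
  forEach(fn) { while (this.hasNext()) fn(this.next()); }
}

const cursor = new MiniCursor([{ name: "Ada" }, { name: "Grace" }]);

// Manual iteration, as you might do interactively in the shell:
while (cursor.hasNext()) {
  console.log(cursor.next().name);
}
// prints "Ada" then "Grace"
```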
Type Help
To get a list of the wrapper classes available in the mongo shell, such as BinData(), type help misc in the mongo shell:

help misc
mongo Shell Quick Reference

mongo Shell Command History
You can retrieve previous commands issued in the mongo shell with the up and down arrow keys. Command history is stored in the ~/.dbshell file. See .dbshell for more information.

Command Line Options
The mongo executable can be started with numerous options. See the mongo executable page for details on all available options. The following table displays some common options for mongo:

--help
    Show command line options.
--nodb
    Start mongo shell without connecting to a database. To connect later, see Opening New Connections (page 253).
--shell
    Used in conjunction with a JavaScript file (i.e. <file.js>) to continue in the mongo shell after running the JavaScript file. See JavaScript file (page 254) for an example.

Command Helpers
The mongo shell provides various help. The following table displays some common help methods and commands:

help
    Show help.
db.help()
    Show help for database methods.
db.<collection>.help()
    Show help on collection methods. The <collection> can be the name of an existing collection or a non-existing collection.
show dbs
    Print a list of all databases on the server.
use <db>
    Switch current database to <db>. The mongo shell variable db is set to the current database.
show collections
    Print a list of all collections for the current database.
show users
    Print a list of users for the current database.
show roles
    Print a list of all roles, both user-defined and built-in, for the current database.
show profile
    Print the five most recent operations that took 1 millisecond or more. See documentation on the database profiler (page 210) for more information.
show databases
    New in version 2.4: Print a list of all available databases.
load()
    Execute a JavaScript file. See Getting Started with the mongo Shell (page 255) for more information.
Basic Shell JavaScript Operations
The mongo shell provides numerous http://docs.mongodb.org/manual/reference/method methods for database operations.
In the mongo shell, db is the variable that references the current database. The variable is automatically set to the default database test, or is set when you use use <db> to switch the current database.
The following table displays some common JavaScript operations:

db.auth()
    If running in secure mode, authenticate the user.
coll = db.<collection>
    Set a specific collection in the current database to a variable coll, as in the following example:
    coll = db.myCollection;
    You can perform operations on myCollection using the variable, as in the following example:
    coll.find();
find()
    Find all documents in the collection and return a cursor.
    See db.collection.find() and Query Documents (page 87) for more information and examples.
    See Cursors (page 59) for additional information on cursor handling in the mongo shell.
insert()
    Insert a new document into the collection.
update()
    Update an existing document in the collection. See Write Operations (page 67) for more information.
save()
    Insert either a new document or update an existing document in the collection. See Write Operations (page 67) for more information.
remove()
    Delete documents from the collection. See Write Operations (page 67) for more information.
drop()
    Drops or removes completely the collection.
ensureIndex()
    Create a new index on the collection if the index does not exist; otherwise, the operation has no effect.
db.getSiblingDB()
    Return a reference to another database using this same connection without explicitly switching the current database. This allows for cross-database queries. See How can I access different databases temporarily? (page 700) for more information.

For more information on performing operations in the shell, see:
• MongoDB CRUD Concepts (page 53)
• Read Operations (page 55)
• Write Operations (page 67)
• http://docs.mongodb.org/manual/reference/method

Keyboard Shortcuts
Changed in version 2.2.
The mongo shell provides most keyboard shortcuts similar to those found in the bash shell or in Emacs.
For some functions mongo provides multiple key bindings, to accommodate several familiar paradigms.
The following table enumerates the keystrokes supported by the mongo shell:

Keystroke               Function
Up-arrow                previous-history
Down-arrow              next-history
Home                    beginning-of-line
End                     end-of-line
Tab                     autocomplete
Left-arrow              backward-character
Right-arrow             forward-character
Ctrl-left-arrow         backward-word
Ctrl-right-arrow        forward-word
Meta-left-arrow         backward-word
Meta-right-arrow        forward-word
Ctrl-A                  beginning-of-line
Ctrl-B                  backward-char
Ctrl-C                  exit-shell
Ctrl-D                  delete-char (or exit shell)
Ctrl-E                  end-of-line
Ctrl-F                  forward-char
Ctrl-G                  abort
Ctrl-J                  accept-line
Ctrl-K                  kill-line
Ctrl-L                  clear-screen
Ctrl-M                  accept-line
Ctrl-N                  next-history
Ctrl-P                  previous-history
Ctrl-R                  reverse-search-history
Ctrl-S                  forward-search-history
Ctrl-T                  transpose-chars
Ctrl-U                  unix-line-discard
Ctrl-W                  unix-word-rubout
Ctrl-Y                  yank
Ctrl-Z                  Suspend (job control works in Linux)
Ctrl-H (i.e. Backspace) backward-delete-char
Ctrl-I (i.e. Tab)       complete
Meta-B                  backward-word
Meta-C                  capitalize-word
Meta-D                  kill-word
Meta-F                  forward-word
Meta-L                  downcase-word
Meta-U                  upcase-word
Meta-Y                  yank-pop
Meta-[Backspace]        backward-kill-word
Meta-<                  beginning-of-history
Meta->                  end-of-history

Queries
In the mongo shell, perform read operations using the find() and findOne() methods.
The find() method returns a cursor object which the mongo shell iterates to print documents on screen. By default, mongo prints the first 20. The mongo shell will prompt the user to "Type it" to continue iterating the next 20 results.
The following table provides some common read operations in the mongo shell:
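The shell's "print 20, then type it" behavior is ordinary batched cursor iteration. A stand-in sketch in plain JavaScript (the batch size of 20 matches the shell's default; the data is invented):

```javascript
// Stand-in for the shell's paging: drain a result set in batches of
// `size` documents, as the shell does with 20 documents per "it".
function* batches(docs, size) {
  for (let i = 0; i < docs.length; i += size) {
    yield docs.slice(i, i + size);
  }
}

const docs = Array.from({ length: 45 }, (_, i) => ({ _id: i }));
const pages = [...batches(docs, 20)];
console.log(pages.map(p => p.length)); // → [ 20, 20, 5 ]
```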
db.collection.find(<query>)
    Find the documents matching the <query> criteria in the collection. If the <query> criteria is not specified or is empty (i.e. {}), the read operation selects all documents in the collection.
    The following example selects the documents in the users collection with the name field equal to "Joe":
    coll = db.users;
    coll.find( { name: "Joe" } );
    For more information on specifying the <query> criteria, see Query Documents (page 87).
db.collection.find( <query>, <projection> )
    Find documents matching the <query> criteria and return just the specific fields in the <projection>.
    The following example selects all documents from the collection but returns only the name field and the _id field. The _id is always returned unless explicitly specified to not return.
    coll = db.users;
    coll.find( { }, { name: true } );
    For more information on specifying the <projection>, see Limit Fields to Return from a Query (page 94).
db.collection.find().sort( <sort order> )
    Return results in the specified <sort order>.
    The following example selects all documents from the collection and returns the results sorted by the name field in ascending order (1). Use -1 for descending order:
    coll = db.users;
    coll.find().sort( { name: 1 } );
db.collection.find( <query> ).sort( <sort order> )
    Return the documents matching the <query> criteria in the specified <sort order>.
db.collection.find( ... ).limit( <n> )
    Limit the result to <n> rows. Highly recommended if you need only a certain number of rows, for best performance.
db.collection.find( ... ).skip( <n> )
    Skip <n> results.
count()
    Returns the total number of documents in the collection.
db.collection.find( <query> ).count()
    Returns the total number of documents that match the query.
    count() ignores limit() and skip(). For example, if 100 records match but the limit is 10, count() will return 100.
    This will be faster than iterating yourself, but will still take time.
db.collection.findOne( <query> )
    Find and return a single document. Returns null if not found.
    The following example selects a single document in the users collection where the name field matches "Joe":
    coll = db.users;
    coll.findOne( { name: "Joe" } );
    Internally, the findOne() method is the find() method with a limit(1).
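The query, projection, sort, limit, and skip stages above compose in a fixed order. As a rough illustration of their semantics only (not the server's implementation — field names and data are invented), the pipeline can be emulated over an in-memory array:

```javascript
// Rough emulation of find(query).sort(order).skip(n).limit(n) semantics
// over a plain array, for equality-match queries only.
const users = [
  { _id: 1, name: "Joe", age: 40 },
  { _id: 2, name: "Ann", age: 25 },
  { _id: 3, name: "Joe", age: 31 },
];

function find(docs, query, { sort, skip = 0, limit = Infinity } = {}) {
  // 1. filter: every field in the query document must match exactly
  let out = docs.filter(d =>
    Object.entries(query).every(([k, v]) => d[k] === v));
  // 2. sort: single-field { field: 1 | -1 } sort specification
  if (sort) {
    const [field, dir] = Object.entries(sort)[0];
    out = out.slice().sort((a, b) =>
      a[field] < b[field] ? -dir : a[field] > b[field] ? dir : 0);
  }
  // 3. skip, then 4. limit
  return out.slice(skip, limit === Infinity ? undefined : skip + limit);
}

console.log(find(users, { name: "Joe" }, { sort: { age: 1 }, limit: 1 }));
// → [ { _id: 3, name: 'Joe', age: 31 } ]
```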
See Query Documents (page 87) and Read Operations (page 55) documentation for more information and examples. See http://docs.mongodb.org/manual/reference/operator to specify other query operators.

Error Checking Methods
Changed in version 2.6.
The mongo shell write methods now integrate the Write Concern (page 72) directly into the method execution rather than with a separate db.getLastError() method. As such, the write methods now return a WriteResult() object that contains the results of the operation, including any write errors and write concern errors.
Previous versions used the db.getLastError() and db.getLastErrorObj() methods to return error information.

Administrative Command Helpers
The following table lists some common methods to support database administration:

db.cloneDatabase(<host>)
    Clone the current database from the <host> specified. The <host> database instance must be in noauth mode.
db.copyDatabase(<from>, <to>, <host>)
    Copy the <from> database from the <host> to the <to> database on the current server. The <host> database instance must be in noauth mode.
db.fromColl.renameCollection(<toColl>)
    Rename collection from fromColl to <toColl>.
db.repairDatabase()
    Repair and compact the current database. This operation can be very slow on large databases.
db.addUser( <user>, <pwd> )
    Add user to the current database.
db.getCollectionNames()
    Get the list of all collections in the current database.
db.dropDatabase()
    Drops the current database.

See also administrative database methods for a full list of methods.

Opening Additional Connections
You can create new connections within the mongo shell. The following table displays the methods to create the connections:

db = connect("<host><:port>/<dbname>")
    Open a new database connection.
conn = new Mongo()
db = conn.getDB("dbname")
    Open a connection to a new server using new Mongo(). Use the getDB() method of the connection to select a database.

See also Opening New Connections (page 253) for more information on opening new connections from the mongo shell.
Miscellaneous
The following table displays some miscellaneous methods:

Object.bsonsize(<document>)
    Prints the BSON size of a <document> in bytes.

See the MongoDB JavaScript API Documentation90 for a full list of JavaScript methods.

Additional Resources
Consider the following reference material that addresses the mongo shell and its interface:
• http://docs.mongodb.org/manual/reference/program/mongo
• http://docs.mongodb.org/manual/reference/method
• http://docs.mongodb.org/manual/reference/operator
• http://docs.mongodb.org/manual/reference/command
• Aggregation Reference (page 419)
Additionally, the MongoDB source code repository includes a jstests directory91 which contains numerous mongo shell scripts.
See also:
The MongoDB Manual contains administrative documentation and tutorials throughout several sections. See Replica Set Tutorials (page 543) and Sharded Cluster Tutorials (page 634) for additional tutorials and information.

5.3 Administration Reference

UNIX ulimit Settings (page 266) Describes user resource limits (i.e. ulimit) and introduces the considerations and optimal configurations for systems that run MongoDB deployments.
System Collections (page 270) Introduces the internal collections that MongoDB uses to track per-database metadata, including indexes, collections, and authentication credentials.
Database Profiler Output (page 271) Describes the data collected by MongoDB's operation profiler, which introspects operations and reports data for analysis on performance and behavior.
Journaling Mechanics (page 275) Describes the internal operation of MongoDB's journaling facility and outlines how the journal allows MongoDB to provide durability and crash resiliency.
Exit Codes and Statuses (page 276) Lists the unique codes returned by mongos and mongod processes upon exit.
5.3.1 UNIX ulimit Settings
Most UNIX-like operating systems, including Linux and OS X, provide ways to limit and control the usage of system resources such as threads, files, and network connections on a per-process and per-user basis. These "ulimits" prevent single users from using too many system resources. Sometimes, these limits have low default values that can cause a number of issues in the course of normal MongoDB operation.

Note: Red Hat Enterprise Linux and CentOS 6 place a max process limitation of 1024 which overrides ulimit settings. Create a file named /etc/security/limits.d/99-mongodb-nproc.conf with new soft nproc and hard nproc values to increase the process limit. See the /etc/security/limits.d/90-nproc.conf file as an example.

90http://api.mongodb.org/js/index.html
91https://github.com/mongodb/mongo/tree/master/jstests/

Resource Utilization
mongod and mongos each use threads and file descriptors to track connections and manage internal operations. This section outlines the general resource utilization patterns for MongoDB. Use these figures in combination with the actual information about your deployment and its use to determine ideal ulimit settings.
Generally, all mongod and mongos instances:
• track each incoming connection with a file descriptor and a thread.
• track each internal thread or pthread as a system process.

mongod
• 1 file descriptor for each data file in use by the mongod instance.
• 1 file descriptor for each journal file used by the mongod instance when storage.journal.enabled is true.
• In replica sets, each mongod maintains a connection to all other members of the set.
mongod uses background threads for a number of internal processes, including TTL collections (page 198), replication, and replica set health checks, which may require a small number of additional resources.

mongos
In addition to the threads and file descriptors for client connections, mongos must maintain connections to all config servers and all shards, which includes all members of all replica sets.
For mongos, consider the following behaviors:
• mongos instances maintain a connection pool to each shard so that the mongos can reuse connections and quickly fulfill requests without needing to create new connections.
• You can limit the number of incoming connections using the maxIncomingConnections run-time option. By restricting the number of incoming connections you can prevent a cascade effect where the mongos creates too many connections on the mongod instances.
Note: Changed in version 2.6: MongoDB removed the upward limit on the maxIncomingConnections setting.

Review and Set Resource Limits

ulimit
Note: Both the "hard" and the "soft" ulimit affect MongoDB's performance. The "hard" ulimit refers to the maximum number of processes that a user can have active at any time. This is the ceiling: no non-root process can increase the "hard" ulimit. In contrast, the "soft" ulimit is the limit that is actually enforced for a session or process, but any process can increase it up to the "hard" ulimit maximum.

5.3. Administration Reference 267
A low "soft" ulimit can cause can't create new thread, closing connection errors if the number of connections grows too high. For this reason, it is extremely important to set both ulimit values to the recommended values.
ulimit will modify both "hard" and "soft" values unless the -H or -S modifiers are specified when modifying limit values.
You can use the ulimit command at the system prompt to check system limits, as in the following example:

$ ulimit -a
-t: cpu time (seconds) unlimited
-f: file size (blocks) unlimited
-d: data seg size (kbytes) unlimited
-s: stack size (kbytes) 8192
-c: core file size (blocks) 0
-m: resident set size (kbytes) unlimited
-u: processes 192276
-n: file descriptors 21000
-l: locked-in-memory size (kb) 40000
-v: address space (kb) unlimited
-x: file locks unlimited
-i: pending signals 192276
-q: bytes in POSIX msg queues 819200
-e: max nice 30
-r: max rt priority 65
-N 15: unlimited

ulimit refers to the per-user limitations for various resources. Therefore, if your mongod instance executes as a user that is also running multiple processes, or multiple mongod processes, you might see contention for these resources. Also, be aware that the processes value (i.e. -u) refers to the combined number of distinct processes and sub-process threads.
You can change ulimit settings by issuing a command in the following form:

ulimit -n <value>

For many distributions of Linux you can change values by substituting the -n option for any possible value in the output of ulimit -a. On OS X, use the launchctl limit command. See your operating system documentation for the precise procedure for changing system limits on running systems.

Note: After changing the ulimit settings, you must restart the process to take advantage of the modified settings. You can use the /proc file system to see the current limitations on a running process.
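A quick way to see the hard/soft distinction in practice is to query each limit explicitly; the commands below only read limits and raise the soft limit within the existing hard ceiling, so they are safe to run in any session (the numbers will differ per system):

```shell
# Print the soft and hard limits for open file descriptors separately.
# -S selects the soft (enforced) limit, -H the hard (ceiling) limit;
# a plain `ulimit -n` reports the soft value.
echo "soft open-files limit: $(ulimit -Sn)"
echo "hard open-files limit: $(ulimit -Hn)"

# Raising the soft limit up to the hard limit needs no privileges:
ulimit -Sn "$(ulimit -Hn)"
echo "soft limit now: $(ulimit -Sn)"
```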
Depending on your system's configuration and default settings, any change to system limits made using ulimit may revert following a system restart. Check your distribution and operating system documentation for more information.

Recommended ulimit Settings
Every deployment may have unique requirements and settings; however, the following thresholds and settings are particularly important for mongod and mongos deployments:
• -f (file size): unlimited
• -t (cpu time): unlimited
• -v (virtual memory): unlimited 92
• -n (open files): 64000
• -m (memory size): unlimited 92 93
• -u (processes/threads): 64000
Always remember to restart your mongod and mongos instances after changing the ulimit settings to ensure that the changes take effect.

Linux distributions using Upstart
For Linux distributions that use Upstart, you can specify limits within service scripts if you start mongod and/or mongos instances as Upstart services. You can do this by using limit stanzas94.
Specify the Recommended ulimit Settings (page 268), as in the following example:

limit fsize unlimited unlimited    # (file size)
limit cpu unlimited unlimited      # (cpu time)
limit as unlimited unlimited       # (virtual memory size)
limit nofile 64000 64000           # (open files)
limit nproc 64000 64000            # (processes/threads)

Each limit stanza sets the "soft" limit to the first value specified and the "hard" limit to the second.
After changing limit stanzas, ensure that the changes take effect by restarting the application services, using the following form:

restart <service name>

Linux distributions using systemd
For Linux distributions that use systemd, you can specify limits within the [Service] sections of service scripts if you start mongod and/or mongos instances as systemd services. You can do this by using resource limit directives95.
Specify the Recommended ulimit Settings (page 268), as in the following example:

[Service]
# Other directives omitted
# (file size)
LimitFSIZE=infinity
# (cpu time)
LimitCPU=infinity
# (virtual memory size)
LimitAS=infinity
# (open files)
LimitNOFILE=64000
# (processes/threads)
LimitNPROC=64000

92 If you limit virtual or resident memory size on a system running MongoDB, the operating system will refuse to honor additional allocation requests.
93 The -m parameter to ulimit has no effect on Linux systems with kernel versions more recent than 2.4.30. You may omit -m if you wish.
94http://upstart.ubuntu.com/wiki/Stanzas#limit
95http://www.freedesktop.org/software/systemd/man/systemd.exec.html#LimitCPU=
Each systemd limit directive sets both the "hard" and "soft" limits to the value specified.
After changing limit directives, ensure that the changes take effect by restarting the application services, using the following form:

systemctl restart <service name>

/proc File System
Note: This section applies only to Linux operating systems.
The /proc file system stores the per-process limits in the file system object located at /proc/<pid>/limits, where <pid> is the process's PID or process identifier.
You can use the following bash function to return the content of the limits object for a process or processes with a given name:

return-limits(){
    for process in $@; do
        process_pids=`ps -C $process -o pid --no-headers | cut -d " " -f 2`
        if [ -z "$process_pids" ]; then
            # no matching process found
            echo "[no $process running]"
        else
            for pid in $process_pids; do
                echo "[$process #$pid -- limits]"
                cat /proc/$pid/limits
            done
        fi
    done
}

You can copy and paste this function into a current shell session or load it as part of a script. Call the function with one of the following invocations:

return-limits mongod
return-limits mongos
return-limits mongod mongos

5.3.2 System Collections

Synopsis
MongoDB stores system information in collections that use the <database>.system.* namespace, which MongoDB reserves for internal use. Do not create collections that begin with system.
MongoDB also stores some additional instance-local metadata in the local database (page 598), specifically for replication purposes.

Collections
System collections include these collections stored in the admin database:
admin.system.roles
    New in version 2.6.
    The admin.system.roles (page 270) collection stores custom roles that administrators create and assign to users to provide access to specific resources.
admin.system.users
    Changed in version 2.6.
    The admin.system.users (page 271) collection stores the user's authentication credentials as well as any roles assigned to the user. Users may define authorization roles in the admin.system.roles (page 270) collection.
admin.system.version
    New in version 2.6.
    Stores the schema version of the user credential documents.

System collections also include these collections stored directly in each database:

<database>.system.namespaces
    The <database>.system.namespaces (page 271) collection contains information about all of the database's collections. Additional namespace metadata exists in the database.ns files and is opaque to database users.
<database>.system.indexes
    The <database>.system.indexes (page 271) collection lists all the indexes in the database. Add and remove data from this collection via the ensureIndex() and dropIndex() methods.
<database>.system.profile
    The <database>.system.profile (page 271) collection stores database profiling information. For information on profiling, see Database Profiling (page 180).
<database>.system.js
    The <database>.system.js (page 271) collection holds special JavaScript code for use in server-side JavaScript (page 249). See Store a JavaScript Function on the Server (page 217) for more information.

5.3.3 Database Profiler Output
The database profiler captures data information about read and write operations, cursor operations, and database commands. To configure the database profiler and set the thresholds for capturing profile data, see the Analyze Performance of Database Operations (page 210) section.
The database profiler writes data in the system.profile (page 271) collection, which is a capped collection.
To view the profiler's output, use normal MongoDB queries on the system.profile (page 271) collection.
Note: Because the database profiler writes data to the system.profile (page 271) collection in a database, the profiler will profile some write activity, even for databases that are otherwise read-only.

Example system.profile Document
The documents in the system.profile (page 271) collection have the following form. This example document reflects an update operation:

{
    "ts" : ISODate("2012-12-10T19:31:28.977Z"),
    "op" : "update",
    "ns" : "social.users",
    "query" : { "name" : "j.r." },
    "updateobj" : { "$set" : { "likes" : [ "basketball", "trekking" ] } },
    "nscanned" : 8,
    "scanAndOrder" : true,
    "moved" : true,
    "nmoved" : 1,
    "nupdated" : 1,
    "keyUpdates" : 0,
    "numYield" : 0,
    "lockStats" : {
        "timeLockedMicros" : { "r" : NumberLong(0), "w" : NumberLong(258) },
        "timeAcquiringMicros" : { "r" : NumberLong(0), "w" : NumberLong(7) }
    },
    "millis" : 0,
    "client" : "127.0.0.1",
    "user" : ""
}

Output Reference
For any single operation, the documents created by the database profiler will include a subset of the following fields. The precise selection of fields in these documents depends on the type of operation.

system.profile.ts
    The timestamp of the operation.
system.profile.op
    The type of operation. The possible values are:
    • insert
    • query
    • update
    • remove
    • getmore
    • command
system.profile.ns
    The namespace the operation targets. Namespaces in MongoDB take the form of the database, followed by a dot (.), followed by the name of the collection.
system.profile.query
    The query document (page 87) used.
system.profile.command
    The command operation.
system.profile.updateobj
    The <update> document passed in during an update (page 67) operation.
system.profile.cursorid
    The ID of the cursor accessed by a getmore operation.
system.profile.ntoreturn
    Changed in version 2.2: In 2.0, MongoDB includes this field for query and command operations. In 2.2, MongoDB also includes this field for getmore operations.
    The number of documents the operation specified to return. For example, the profile command would return one document (a results document) so the ntoreturn (page 273) value would be 1. The limit(5) command would return five documents so the ntoreturn (page 273) value would be 5.
    If the ntoreturn (page 273) value is 0, the command did not specify a number of documents to return, as would be the case with a simple find() command with no limit specified.
system.profile.ntoskip
    New in version 2.2.
    The number of documents the skip() method specified to skip.
system.profile.nscanned
    The number of documents that MongoDB scans in the index (page 431) in order to carry out the operation.
    In general, if nscanned (page 273) is much higher than nreturned (page 274), the database is scanning many objects to find the target objects. Consider creating an index to improve this.
system.profile.scanAndOrder
    scanAndOrder (page 273) is a boolean that is true when a query cannot use the order of documents in the index for returning sorted results: MongoDB must sort the documents after it receives the documents from a cursor.
    If scanAndOrder (page 273) is false, MongoDB can use the order of the documents in an index to return sorted results.
system.profile.moved
    This field appears with a value of true when an update operation moved one or more documents to a new location on disk. If the operation did not result in a move, this field does not appear.
    Operations that result in a move take more time than in-place updates and typically occur as a result of document growth.
system.profile.nmoved
    New in version 2.2.
    The number of documents the operation moved on disk. This field appears only if the operation resulted in a move. The field's implicit value is zero, and the field is present only when non-zero.
system.profile.nupdated
    New in version 2.2.
    The number of documents updated by the operation.
system.profile.keyUpdates
    New in version 2.2.
    The number of index (page 431) keys the update changed in the operation. Changing an index key carries a small performance cost because the database must remove the old key and insert a new key into the B-tree index.
system.profile.numYield
    New in version 2.2.
    The number of times the operation yielded to allow other operations to complete. Typically, operations yield when they need access to data that MongoDB has not yet fully read into memory. This allows other operations that have data in memory to complete while MongoDB reads in data for the yielding operation. For more information, see the FAQ on when operations yield (page 703).
system.profile.lockStats
    New in version 2.2.
    The time in microseconds the operation spent acquiring and holding locks. This field reports data for the following lock types:
    • R - global read lock
    • W - global write lock
    • r - database-specific read lock
    • w - database-specific write lock
system.profile.lockStats.timeLockedMicros
    The time in microseconds the operation held a specific lock. For operations that require more than one lock, like those that lock the local database to update the oplog, this value may be longer than the total length of the operation (i.e. millis (page 274)).
system.profile.lockStats.timeAcquiringMicros
    The time in microseconds the operation spent waiting to acquire a specific lock.
system.profile.nreturned
    The number of documents returned by the operation.
system.profile.responseLength
    The length in bytes of the operation's result document. A large responseLength (page 274) can affect performance. To limit the size of the result document for a query operation, you can use any of the following:
    • Projections (page 94)
    • The limit() method
    • The batchSize() method
    Note: When MongoDB writes query profile information to the log, the responseLength (page 274) value is in a field named reslen.
system.profile.millis
The time in milliseconds from the perspective of the mongod from the beginning of the operation to the end of the operation.

system.profile.client
The IP address or hostname of the client connection where the operation originates. For some operations, such as db.eval(), the client is 0.0.0.0:0 instead of an actual client.

system.profile.user
The authenticated user who ran the operation.
5.3.4 Journaling Mechanics

When running with journaling, MongoDB stores and applies write operations (page 67) in memory and in the on-disk journal before the changes are present in the data files on disk. This document discusses the implementation and mechanics of journaling in MongoDB systems. See Manage Journaling (page 215) for information on configuring, tuning, and managing journaling.

Journal Files

With journaling enabled, MongoDB creates a journal subdirectory within the directory defined by dbPath, which is /data/db by default. The journal directory holds journal files, which contain write-ahead redo logs. The directory also holds a last-sequence-number file. A clean shutdown removes all the files in the journal directory. A dirty shutdown (crash) leaves files in the journal directory; these are used to automatically recover the database to a consistent state when the mongod process is restarted.

Journal files are append-only files and have file names prefixed with j._. When a journal file holds 1 gigabyte of data, MongoDB creates a new journal file. Once MongoDB applies all the write operations in a particular journal file to the database data files, it deletes the file, as it is no longer needed for recovery purposes. Unless you write many bytes of data per second, the journal directory should contain only two or three journal files.

You can use the storage.smallFiles runtime option when starting mongod to limit the size of each journal file to 128 megabytes, if you prefer.

To speed the frequent sequential writes that occur to the current journal file, you can ensure that the journal directory is on a different filesystem from the database data files.

Important: If you place the journal on a different filesystem from your data files you cannot use a filesystem snapshot alone to capture valid backups of a dbPath directory.
In this case, use fsyncLock() to ensure that database files are consistent before the snapshot and fsyncUnlock() once the snapshot is complete.

Note: Depending on your filesystem, you might experience a preallocation lag the first time you start a mongod instance with journaling enabled. MongoDB may preallocate journal files if the mongod process determines that it is more efficient to preallocate journal files than to create new journal files as needed. This preallocation lag might last several minutes, during which you will not be able to connect to the database. This is a one-time preallocation and does not occur with future invocations. To avoid preallocation lag, see Avoid Preallocation Lag (page 216).

Storage Views used in Journaling

Journaling adds three internal storage views to MongoDB.

The shared view stores modified data for upload to the MongoDB data files. The shared view is the only view with direct access to the MongoDB data files. When running with journaling, mongod asks the operating system to map your existing on-disk data files to the shared view virtual memory view. The operating system maps the files but does not load them. MongoDB later loads data files into the shared view as needed.

The private view stores data for use with read operations (page 55). The private view is the first place MongoDB applies new write operations (page 67). Upon a journal commit, MongoDB copies the changes made in the private view to the shared view, where they are then available for uploading to the database data files.

The journal is an on-disk view that stores new write operations after MongoDB applies the operation to the private view but before applying them to the data files. The journal provides durability. If the mongod instance were to
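The fsyncLock()/fsyncUnlock() sequence for snapshot-based backups can be sketched in the mongo shell; the snapshot itself is taken outside MongoDB and appears here only as a comment:

```javascript
// Flush pending writes and block further writes so the data files
// are in a consistent state on disk.
db.fsyncLock()

// ... take the filesystem snapshot of the dbPath directory here ...

// Release the lock once the snapshot is complete.
db.fsyncUnlock()
```

While the lock is held, write operations queue; keep the locked window as short as possible.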
crash without having applied the writes to the data files, the journal could replay the writes to the shared view for eventual upload to the data files.

How Journaling Records Write Operations

MongoDB copies the write operations to the journal in batches called group commits. These "group commits" help minimize the performance impact of journaling, since a group commit must block all writers during the commit. See commitIntervalMs for information on the default commit interval.

Journaling stores raw operations that allow MongoDB to reconstruct the following:
• document insertion/updates
• index modifications
• metadata changes to the namespace files
• creation and dropping of databases and their associated data files

As write operations (page 67) occur, MongoDB writes the data to the private view in RAM and then copies the write operations in batches to the journal. The journal stores the operations on disk to ensure durability. Each journal entry describes the bytes the write operation changed in the data files.

MongoDB next applies the journal's write operations to the shared view. At this point, the shared view becomes inconsistent with the data files.

At default intervals of 60 seconds, MongoDB asks the operating system to flush the shared view to disk. This brings the data files up-to-date with the latest write operations. The operating system may choose to flush the shared view to disk at a higher frequency than 60 seconds, particularly if the system is low on free memory.

When MongoDB flushes write operations to the data files, MongoDB notes which journal writes have been flushed. Once a journal file contains only flushed writes, it is no longer needed for recovery, and MongoDB either deletes it or recycles it for a new journal file.

As part of journaling, MongoDB routinely asks the operating system to remap the shared view to the private view, in order to save physical RAM.
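You can trigger the flush of in-memory changes to the data files manually with the fsync administrative command, rather than waiting for the periodic 60-second sync; a sketch in the mongo shell:

```javascript
// Ask mongod to flush all pending writes to the data files immediately.
db.adminCommand( { fsync: 1 } )
```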
Upon a new remapping, the operating system knows that physical memory pages can be shared between the shared view and the private view mappings.

Note: The interaction between the shared view and the on-disk data files is similar to how MongoDB works without journaling, in that MongoDB asks the operating system to flush in-memory changes back to the data files every 60 seconds.

5.3.5 Exit Codes and Statuses

MongoDB will return one of the following codes and statuses when exiting. Use this guide to interpret logs and when troubleshooting issues with mongod and mongos instances.

0
Returned by MongoDB applications upon successful exit.

2
The specified options are in error or are incompatible with other options.

3
Returned by mongod if there is a mismatch between hostnames specified on the command line and in the local.sources (page 600) collection. mongod may also return this status if the oplog collection in the local database is not readable.
4
The version of the database is different from the version supported by the mongod (or mongod.exe) instance. The instance exits cleanly. Restart mongod with the --upgrade option to upgrade the database to the version supported by this mongod instance.

5
Returned by mongod if a moveChunk operation fails to confirm a commit.

12
Returned by the mongod.exe process on Windows when it receives a Control-C, Close, Break or Shutdown event.

14
Returned by MongoDB applications which encounter an unrecoverable error, an uncaught exception or uncaught signal. The system exits without performing a clean shutdown.

20
Message: ERROR: wsastartup failed <reason>
Returned by MongoDB applications on Windows following an error in the WSAStartup function.
Message: NT Service Error
Returned by MongoDB applications for Windows due to failures installing, starting or removing the NT Service for the application.

45
Returned when a MongoDB application cannot open a file or cannot obtain a lock on a file.

47
MongoDB applications exit cleanly following a large clock skew (32768 milliseconds) event.

48
mongod exits cleanly if the server socket closes. The server socket is on port 27017 by default, or as specified to the --port run-time option.

49
Returned by mongod.exe or mongos.exe on Windows when either receives a shutdown message from the Windows Service Control Manager.

100
Returned by mongod when the process throws an uncaught exception.
CHAPTER 6

Security

This section outlines basic security and risk management strategies and access control. The included tutorials outline specific tasks for configuring firewalls, authentication, and system privileges.

Security Introduction (page 279) A high-level introduction to security and MongoDB deployments.

Security Concepts (page 281) The core documentation of security.

Authentication (page 282) Mechanisms for verifying user and instance access to MongoDB.

Authorization (page 285) Control access to MongoDB instances using authorization.

Network Exposure and Security (page 288) Discusses potential security risks related to the network and strategies for decreasing possible network-based attack vectors for MongoDB.

Continue reading from Security Concepts (page 281) for additional documentation of MongoDB's security features and operation.

Security Tutorials (page 294) Tutorials for enabling and configuring security features for MongoDB.

Security Checklist (page 295) A high-level overview of global security considerations for administrators of MongoDB deployments. Use this checklist if you are new to deploying MongoDB in production and want to implement high-quality security practices.

Network Security Tutorials (page 297) Ensure that the underlying network configuration supports a secure operating environment for MongoDB deployments, and appropriately limits access to MongoDB deployments.

Access Control Tutorials (page 316) These tutorials describe procedures relevant for the configuration, operation, and maintenance of MongoDB's access control system.

User and Role Management Tutorials (page 342) MongoDB's access control system provides a flexible role-based access control system that you can use to limit access to MongoDB deployments. The tutorials in this section describe the configuration and setup of the authorization system.
Continue reading from Security Tutorials (page 294) for additional tutorials that address the use and management of secure MongoDB deployments.

Create a Vulnerability Report (page 359) Report a vulnerability in MongoDB.

Security Reference (page 360) Reference for security-related functions.

6.1 Security Introduction

Maintaining a secure MongoDB deployment requires administrators to implement controls to ensure that users and applications have access to only the data that they require. MongoDB provides features that allow administrators to
implement these controls and restrictions for any MongoDB deployment.

If you are already familiar with security and MongoDB security practices, consider the Security Checklist (page 295) for a collection of recommended actions to protect a MongoDB deployment.

6.1.1 Authentication

Before gaining access to a system all clients should identify themselves to MongoDB. This ensures that no client can access the data stored in MongoDB without being explicitly allowed.

MongoDB supports a number of authentication mechanisms (page 282) that clients can use to verify their identity. MongoDB supports two mechanisms: a password-based challenge and response protocol and x.509 certificates. Additionally, MongoDB Enterprise1 also provides support for LDAP proxy authentication (page 283) and Kerberos authentication (page 283).

See Authentication (page 282) for more information.

6.1.2 Role Based Access Control

Access control, i.e. authorization (page 285), determines a user's access to resources and operations. Clients should only be able to perform the operations required to fulfill their approved functions. This is the "principle of least privilege" and limits the potential risk of a compromised application.

MongoDB's role-based access control system allows administrators to control all access and ensure that all granted access applies as narrowly as possible. MongoDB does not enable authorization by default. When you enable authorization (page 285), MongoDB will require authentication for all connections.

When authorization is enabled, MongoDB controls a user's access through the roles assigned to the user. A role consists of a set of privileges, where a privilege consists of actions, or a set of operations, and a resource upon which the actions are allowed. Users may have one or more roles that describe their access.
MongoDB provides several built-in roles (page 361) and users can construct specific roles tailored to clients' actual requirements. See Authorization (page 285) for more information.

6.1.3 Auditing

Auditing provides administrators with the ability to verify that the implemented security policies are controlling activity in the system. Retaining audit information ensures that administrators have enough information to perform forensic investigations and comply with regulations and policies that require audit data.

See Auditing (page 290) for more information.

6.1.4 Encryption

Transport Encryption

You can use SSL to encrypt all of MongoDB's network traffic. SSL ensures that MongoDB network traffic is only readable by the intended client.

See Configure mongod and mongos for SSL (page 304) for more information.

1http://www.mongodb.com/products/mongodb-enterprise
Encryption at Rest

There are two broad classes of approaches to encrypting data at rest with MongoDB. You can use these solutions together or independently:

Application Level Encryption Provide encryption on a per-field or per-document basis within the application layer. To encrypt document or field level data, write custom encryption and decryption routines or use a commercial solution such as the Vormetric Data Security Platform2.

Storage Encryption Encrypt all MongoDB data on the storage or operating system to ensure that only authorized processes can access protected data. A number of third-party libraries can integrate with the operating system to provide transparent disk-level encryption. For example:

Linux Unified Key Setup (LUKS) LUKS is available for most Linux distributions. For configuration explanation, see the LUKS documentation from Red Hat3.

IBM Guardium Data Encryption IBM Guardium Data Encryption4 provides support for disk-level encryption for Linux and Windows operating systems.

Vormetric Data Security Platform The Vormetric Data Security Platform5 provides disk and file-level encryption in addition to application level encryption.

Bitlocker Drive Encryption Bitlocker Drive Encryption6 is a feature available on Windows Server 2008 and 2012 that provides disk encryption.

Properly configured disk encryption, when used alongside good security policies that protect relevant accounts, passwords, and encryption keys, can help ensure compliance with standards, including HIPAA, PCI-DSS, and FERPA.

6.1.5 Hardening Deployments and Environments

In addition to implementing controls within MongoDB, you should also place controls around MongoDB to reduce the risk exposure of the entire MongoDB system. This is a defense-in-depth strategy.

Hardening MongoDB extends the ideas of least privilege, auditing, and encryption outside of MongoDB.
Reducing risk includes: configuring the network rules to ensure that only trusted hosts have access to MongoDB, and that the MongoDB processes only have access to the parts of the filesystem required for operation.

6.2 Security Concepts

These documents introduce and address concepts and strategies related to security practices in MongoDB deployments.

Authentication (page 282) Mechanisms for verifying user and instance access to MongoDB.

Authorization (page 285) Control access to MongoDB instances using authorization.

2http://www.vormetric.com/sites/default/files/sb-MongoDB-Letter-2014-0611.pdf
3https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Security_Guide/sect-Security_Guide-LUKS_Disk_Encryption.html
4http://www-03.ibm.com/software/products/en/infosphere-guardium-data-encryption
5http://www.vormetric.com/sites/default/files/sb-MongoDB-Letter-2014-0611.pdf
6http://technet.microsoft.com/en-us/library/hh831713.aspx
Collection-Level Access Control (page 287) Scope privileges to specific collections.

Network Exposure and Security (page 288) Discusses potential security risks related to the network and strategies for decreasing possible network-based attack vectors for MongoDB.

Security and MongoDB API Interfaces (page 289) Discusses potential risks related to MongoDB's JavaScript, HTTP and REST interfaces, including strategies to control those risks.

Auditing (page 290) Audit server and client activity for mongod and mongos instances.

Kerberos Authentication (page 291) Kerberos authentication and MongoDB.

6.2.1 Authentication

Authentication is the process of verifying the identity of a client. When access control, i.e. authorization (page 285), is enabled, MongoDB requires all clients to authenticate themselves first in order to determine the access for the client.

Although authentication and authorization (page 285) are closely connected, authentication is distinct from authorization. Authentication verifies the identity of a user; authorization determines the verified user's access to resources and operations.

MongoDB supports a number of authentication mechanisms (page 282) that clients can use to verify their identity. These mechanisms allow MongoDB to integrate into your existing authentication system. See Authentication Mechanisms (page 282) for details.

In addition to verifying the identity of a client, MongoDB can require members of replica sets and sharded clusters to authenticate their membership (page 284) to their respective replica set or sharded cluster. See Authentication Between MongoDB Instances (page 284) for more information.

Client Users

To authenticate a client in MongoDB, you must add a corresponding user to MongoDB. When adding a user, you create the user in a specific database. Together, the user's name and database serve as a unique identifier for that user.
That is, if two users have the same name but are created in different databases, they are two separate users. To authenticate, the client must authenticate the user against the user's database. For instance, if using the mongo shell as a client, you can specify the database for the user with the --authenticationDatabase option.

To add and manage user information, MongoDB provides the db.createUser() method as well as other user management methods. For an example of adding a user to MongoDB, see Add a User to a Database (page 344).

MongoDB stores all user information, including name (page 372), password (page 372), and the user's database (page 372), in the system.users (page 372) collection in the admin database.

Authentication Mechanisms

MongoDB supports multiple authentication mechanisms. MongoDB's default authentication method is a challenge and response mechanism (MONGODB-CR) (page 283). MongoDB also supports x.509 certificate authentication (page 283), LDAP proxy authentication (page 283), and Kerberos authentication (page 283).

This section introduces the mechanisms available in MongoDB. To specify the authentication mechanism to use, see authenticationMechanisms.
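Adding a user and then authenticating against the user's database can be sketched in the mongo shell; the user name, password, and role shown here are illustrative:

```javascript
// Create a user in the "records" database; the user's name and this
// database together form the user's unique identifier.
use records
db.createUser(
  {
    user: "reportingApp",             // illustrative user name
    pwd: "choose-a-strong-password",  // illustrative password
    roles: [ { role: "read", db: "records" } ]
  }
)

// Later, authenticate against the user's database:
db.auth( "reportingApp", "choose-a-strong-password" )
```

The equivalent from the command line passes the database with `--authenticationDatabase records` when starting the mongo shell.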
MONGODB-CR Authentication

MONGODB-CR is a challenge-response mechanism that authenticates users through passwords. MONGODB-CR is the default mechanism.

When you use MONGODB-CR authentication, MONGODB-CR verifies the user against the user's name (page 372), password (page 372) and database (page 372). The user's database is the database where the user was created, and the user's database and the user's name together serve to identify the user.

Using key files, you can also use MONGODB-CR authentication for the internal member authentication (page 284) of replica set members and sharded cluster members. The contents of the key files serve as the shared password for the members. You must store the key file on each mongod or mongos instance for that replica set or sharded cluster.

The content of the key file is arbitrary but must be the same on all mongod and mongos instances that connect to each other. See Generate a Key File (page 338) for instructions on generating a key file and turning on key file authentication for members.

x.509 Certificate Authentication

New in version 2.6.

MongoDB supports x.509 certificate authentication for use with a secure SSL connection (page 304).

To authenticate to servers, clients can use x.509 certificates instead of usernames and passwords. See Client x.509 Certificate (page 321) for more information.

For membership authentication, members of sharded clusters and replica sets can use x.509 certificates instead of key files. See Use x.509 Certificate for Membership Authentication (page 323) for more information.

Kerberos Authentication

MongoDB Enterprise7 supports authentication using a Kerberos service. Kerberos is an industry standard authentication protocol for large client/server systems.
To use MongoDB with Kerberos, you must have a properly configured Kerberos deployment, configured Kerberos service principals (page 292) for MongoDB, and added Kerberos user principals (page 292) to MongoDB.

See Kerberos Authentication (page 291) for more information on Kerberos and MongoDB. To configure MongoDB to use Kerberos authentication, see Configure MongoDB with Kerberos Authentication on Linux (page 331) and Configure MongoDB with Kerberos Authentication on Windows (page 334).

LDAP Proxy Authority Authentication

MongoDB Enterprise8 supports proxy authentication through a Lightweight Directory Access Protocol (LDAP) service. See Authenticate Using SASL and LDAP with OpenLDAP (page 329) and Authenticate Using SASL and LDAP with ActiveDirectory (page 326).

MongoDB Enterprise for Windows does not include LDAP support for authentication. However, MongoDB Enterprise for Linux supports using LDAP authentication with an ActiveDirectory server.

MongoDB does not support LDAP authentication in mixed sharded cluster deployments that contain both version 2.4 and version 2.6 shards.

7http://www.mongodb.com/products/mongodb-enterprise
8http://www.mongodb.com/products/mongodb-enterprise
Authentication Behavior

Client Authentication

Clients can authenticate using the challenge and response (page 283), x.509 (page 283), LDAP Proxy (page 283) and Kerberos (page 283) mechanisms.

Each client connection should authenticate as exactly one user. If a client authenticates to a database as one user and later authenticates to the same database as a different user, the second authentication invalidates the first. While clients can authenticate as multiple users if the users are defined on different databases, we recommend authenticating as one user at a time, providing the user with appropriate privileges on the databases required by the user. See Authenticate to a MongoDB Instance or Cluster (page 336) for more information.

Authentication Between MongoDB Instances

You can authenticate members of replica sets and sharded clusters. To authenticate members of a single MongoDB deployment to each other, MongoDB can use the keyFile and x.509 (page 283) mechanisms. Using keyFile authentication for members also enables authorization.

Always run replica sets and sharded clusters in a trusted networking environment. Ensure that the network permits only trusted traffic to reach each mongod and mongos instance.

Use your environment's firewall and network routing to ensure that traffic only from clients and other members can reach your mongod and mongos instances. If needed, use virtual private networks (VPNs) to ensure secure connections over wide area networks (WANs).

Always ensure that:
• Your network configuration will allow every member of the replica set or sharded cluster to contact every other member.
• If you use MongoDB's authentication system to limit access to your infrastructure, ensure that you configure a keyFile on all members to permit authentication.

See Generate a Key File (page 338) for instructions on generating a key file and turning on key file authentication for members.
For an example of using key files for sharded cluster authentication, see Enable Authentication in a Sharded Cluster (page 318).

Authentication on Sharded Clusters

In sharded clusters, applications authenticate directly to mongos instances, using credentials stored in the admin database of the config servers. The shards in the sharded cluster also have credentials, and clients can authenticate directly to the shards to perform maintenance directly on the shards. In general, applications and clients should connect to the sharded cluster through the mongos.

Changed in version 2.6: Previously, the credentials for authenticating to a database on a cluster resided on the primary shard (page 615) for that database.

Some maintenance operations, such as cleanupOrphaned, compact, rs.reconfig(), require direct connections to specific shards in a sharded cluster. To perform these operations with authentication enabled, you must connect directly to the shard and authenticate as a shard local administrative user. To create a shard local administrative user, connect directly to the shard and create the user. MongoDB stores shard local users in the admin database of the shard itself. These shard local users are completely independent from the users added to the sharded cluster via mongos. Shard local users are local to the shard and are inaccessible by mongos. Direct connections to a shard should only be for shard-specific maintenance and configuration.
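Creating a shard local administrative user follows the same pattern as creating any other user, except that you connect to the shard's mongod directly rather than through mongos. A sketch, with illustrative host, credentials, and role choice:

```javascript
// Connect directly to the shard (not through mongos), for example:
//   mongo --host shard0-primary.example.net --port 27018
// then create the administrative user in the shard's own admin database:
use admin
db.createUser(
  {
    user: "shardAdmin",               // illustrative name
    pwd: "choose-a-strong-password",  // illustrative password
    roles: [ { role: "dbAdminAnyDatabase", db: "admin" } ]
  }
)
```

This user exists only on the shard and is invisible to mongos; use it solely for shard-local maintenance such as compact.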
Localhost Exception

The localhost exception allows you to enable authorization before creating the first user in the system. When active, the localhost exception allows all connections from the localhost interface to have full access to that instance. The exception applies only when there are no users created in the MongoDB instance.

If you use the localhost exception when deploying a new MongoDB system, the first user you create must be in the admin database with privileges to create other users, such as a user with the userAdmin (page 363) or userAdminAnyDatabase (page 368) role. See Enable Client Access Control (page 317) and Create a User Administrator (page 343) for more information.

In the case of a sharded cluster, the localhost exception can apply to the cluster as a whole or separately to each shard. The localhost exception can apply to the cluster as a whole if there is no user information stored on the config servers and clients access via mongos instances. The localhost exception can apply separately to each shard if there is no user information stored on the shard itself and clients connect to the shard directly.

To prevent unauthorized access to a cluster's shards, you must either create an administrator on each shard or disable the localhost exception. To disable the localhost exception, use setParameter to set the enableLocalhostAuthBypass parameter to 0 during startup.

6.2.2 Authorization

MongoDB employs Role-Based Access Control (RBAC) to govern access to a MongoDB system. A user is granted one or more roles (page 285) that determine the user's access to database resources and operations. Outside of role assignments, the user has no access to the system. MongoDB does not enable authorization by default. You can enable authorization using the --auth or the --keyFile options, or if using a configuration file, with the security.authorization or the security.keyFile settings.
MongoDB provides built-in roles (page 361), each with a dedicated purpose for a common use case. Examples include the read (page 362), readWrite (page 362), dbAdmin (page 363), and root (page 368) roles.

Administrators also can create new roles and privileges to cater to operational needs. Administrators can assign privileges scoped as granularly as the collection level.

When granted a role, a user receives all the privileges of that role. A user can have several roles concurrently, in which case the user receives the union of all the privileges of the respective roles.

Roles

A role consists of privileges that pair resources with allowed operations. Each privilege is defined directly in the role or inherited from another role.

A role's privileges apply to the database where the role is created. A role created on the admin database can include privileges that apply to all databases or to the cluster (page 374).

A user assigned a role receives all the privileges of that role. The user can have multiple roles and can have different roles on different databases.

Roles always grant privileges and never limit access. For example, if a user has both read (page 362) and readWriteAnyDatabase (page 368) roles on a database, the greater access prevails.
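Granting several roles to an existing user, so that the user holds the union of their privileges, can be sketched in the mongo shell; the user and database names here are illustrative:

```javascript
// Grant two roles to an existing user. A bare string grants the role
// on the current database; the document form names another database.
use reporting
db.grantRolesToUser(
  "reportingApp",                                  // illustrative user
  [ "readWrite", { role: "read", db: "accounts" } ]
)
```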
Privileges

A privilege consists of a specified resource and the actions permitted on the resource.

A privilege resource (page 373) is either a database, collection, set of collections, or the cluster. If the resource is the cluster, the affiliated actions affect the state of the system rather than a specific database or collection.

An action (page 375) is a command or method the user is allowed to perform on the resource. A resource can have multiple allowed actions. For available actions see Privilege Actions (page 375).

For example, a privilege that includes the update (page 375) action allows a user to modify existing documents on the resource. To additionally grant the user permission to create documents on the resource, the administrator would add the insert (page 375) action to the privilege.

For privilege syntax, see admin.system.roles.privileges (page 370).

Inherited Privileges

A role can include one or more existing roles in its definition, in which case the role inherits all the privileges of the included roles.

A role can inherit privileges from other roles in its database. A role created on the admin database can inherit privileges from roles in any database.

User-Defined Roles

New in version 2.6.

User administrators can create custom roles to ensure collection-level and command-level granularity and to adhere to the policy of least privilege. Administrators create and edit roles using the role management commands.

MongoDB scopes a user-defined role to the database in which it is created and uniquely identifies the role by the pairing of its name and its database. MongoDB stores the roles in the admin database's system.roles (page 369) collection. Do not access this collection directly but instead use the role management commands to view and edit custom roles.
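A user-defined role that declares its own privileges and also inherits an existing role can be sketched with db.createRole(); the role name and scoping shown are illustrative:

```javascript
// Define a custom role in the current database. It inherits the
// built-in "read" role and adds one explicit privilege of its own.
use products
db.createRole(
  {
    role: "inventoryWriter",       // illustrative role name
    privileges: [
      { resource: { db: "products", collection: "inventory" },
        actions: [ "insert", "update" ] }
    ],
    roles: [ { role: "read", db: "products" } ]
  }
)
```

A user granted inventoryWriter can read everything in products but can only insert and update documents in the inventory collection.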
Collection-Level Access Control

By creating a role with privileges (page 286) that are scoped to a specific collection in a particular database, administrators can implement collection-level access control.

See Collection-Level Access Control (page 287) for more information.

Users

MongoDB stores user credentials in the protected admin.system.users (page 271) collection. Use the user management methods to view and edit user credentials.

Role Assignment to Users

User administrators create the users that access the system's databases. MongoDB's user management commands let administrators create users and assign them roles.
  • 291. MongoDB Documentation, Release 2.6.4 MongoDB scopes a user to the database in which the user is created. MongoDB stores all user definitions in the admin database, no matter which database the user is scoped to. MongoDB stores users in the admin database’s system.users collection (page 372). Do not access this collection directly but instead use the user management commands. The first role assigned in a database should be either userAdmin (page 363) or userAdminAnyDatabase (page 368). This user can then create all other users in the system. See Create a User Administrator (page 343). Protect the User and Role Collections MongoDB stores role and user data in the protected admin.system.roles (page 270) and admin.system.users (page 271) collections, which are only accessible using the user management meth-ods. If you disable access control, do not modify the admin.system.roles (page 270) and admin.system.users (page 271) collections using normal insert() and update() operations. Additional Information See the reference section for documentation of all built-in-roles (page 361) and all available privilege actions (page 375). Also consider the reference for the form of the resource documents (page 373). To create users see the Create a User Administrator (page 343) and Add a User to a Database (page 344) tutorials. 6.2.3 Collection-Level Access Control Collection-level access control allows administrators to grant users privileges that are scoped to specific collections. Administrators can implement collection-level access control through user-defined roles (page 286). By creating a role with privileges (page 286) that are scoped to a specific collection in a particular database, administrators can provision users with roles that grant privileges on a collection level. Privileges and Scope A privilege consists of actions (page 375) and the resources (page 373) upon which the actions are permissible; i.e. 
the resources define the scope of the actions for that privilege. By specifying both the database and the collection in the resource document (page 373) for a privilege, administrators can limit the privilege actions just to a specific collection in a specific database. Each privilege action in a role can be scoped to a different collection. For example, a user-defined role can contain the following privileges:

privileges: [
  { resource: { db: "products", collection: "inventory" }, actions: [ "find", "update", "insert" ] },
  { resource: { db: "products", collection: "orders" }, actions: [ "find" ] }
]

The first privilege scopes its actions to the inventory collection of the products database. The second privilege scopes its actions to the orders collection of the products database.

Additional Information For more information on user-defined roles and the MongoDB authorization model, see Authorization (page 285). For a tutorial on creating user-defined roles, see Create a Role (page 347).
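As an illustration (not MongoDB's implementation), the scoping rule can be sketched as a small check over the example privileges: an action is permitted only when some privilege names both the matching resource and the action.

```javascript
// Sketch of collection-level scoping using the example privileges above.
// This models the check conceptually; it is not MongoDB's actual code.
const privileges = [
  { resource: { db: "products", collection: "inventory" },
    actions: [ "find", "update", "insert" ] },
  { resource: { db: "products", collection: "orders" },
    actions: [ "find" ] }
];

function allows(privileges, db, collection, action) {
  return privileges.some(p =>
    p.resource.db === db &&
    p.resource.collection === collection &&
    p.actions.includes(action));
}

console.log(allows(privileges, "products", "orders", "find"));   // true
console.log(allows(privileges, "products", "orders", "update")); // false
```

A holder of this role can thus read either collection but modify only inventory.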
6.2.4 Network Exposure and Security By default, MongoDB programs (i.e. mongos and mongod) will bind to all available network interfaces (i.e. IP addresses) on a system. This page outlines various runtime options that allow you to limit access to MongoDB programs.

Configuration Options You can limit the network exposure with the following mongod and mongos configuration options: net.http.enabled, net.http.RESTInterfaceEnabled, bindIp, and port. You can use a configuration file to specify these settings.

nohttpinterface The net.http.enabled setting for mongod and mongos instances controls the "home" status page; setting it to false disables the page. Changed in version 2.6: The mongod and mongos instances run with the http interface disabled by default. The status interface is read-only by default, and the default port for the status page is 28017. Authentication does not control or affect access to this interface. Important: Disable this interface for production deployments. If you enable this interface, you should only allow trusted clients to access this port. See Firewalls (page 289).

rest The net.http.RESTInterfaceEnabled setting for mongod enables a fully interactive administrative REST interface, which is disabled by default. The net.http.RESTInterfaceEnabled configuration makes the http status interface, which is read-only by default, fully interactive. Use the net.http.RESTInterfaceEnabled setting together with the net.http.enabled setting. The REST interface does not support any authentication and you should always restrict access to this interface to only allow trusted clients to connect to this port. You may also enable this interface on the command line as mongod --rest --httpinterface. Important: Disable this option for production deployments. If you do leave this interface enabled, you should only allow trusted clients to access this port.
bind_ip The bindIp setting for mongod and mongos instances limits the network interfaces on which MongoDB programs will listen for incoming connections. You can also specify a number of interfaces by passing bindIp a comma separated list of IP addresses. You can use the mongod --bind_ip and mongos --bind_ip option on the command line at run time to limit the network accessibility of a MongoDB program. Important: Make sure that your mongod and mongos instances are only accessible on trusted networks. If your system has more than one network interface, bind MongoDB programs to the private or internal network interface. 9 Starting in version 2.6, http interface is disabled by default. 288 Chapter 6. Security
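For example, to restrict a mongod to the loopback interface and a single private address, one might start it as follows; the addresses are illustrative only.

```shell
# Listen only on localhost and one internal interface; addresses are examples.
mongod --bind_ip 127.0.0.1,10.8.0.10 --port 27017
```

The equivalent configuration-file setting is net.bindIp, which accepts the same comma-separated list.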
  • 293. MongoDB Documentation, Release 2.6.4 port The port setting for mongod and mongos instances changes the main port on which the mongod or mongos instance listens for connections. The default port is 27017. Changing the port does not meaningfully reduce risk or limit exposure. You may also specify this option on the command line as mongod --port or mongos --port. Setting port also indirectly sets the port for the HTTP status interface, which is always available on the port numbered 1000 greater than the primary mongod port. Only allow trusted clients to connect to the port for the mongod and mongos instances. See Firewalls (page 289). See also Security Considerations (page 184) and Default MongoDB Port (page 380). Firewalls Firewalls allow administrators to filter and control access to a system by providing granular control over what network communications. For administrators of MongoDB, the following capabilities are important: limiting incoming traffic on a specific port to specific systems, and limiting incoming traffic from untrusted hosts. On Linux systems, the iptables interface provides access to the underlying netfilter firewall. On Windows systems, netsh command line interface provides access to the underlying Windows Firewall. For additional infor-mation about firewall configuration, see Configure Linux iptables Firewall for MongoDB (page 297) and Configure Windows netsh Firewall for MongoDB (page 300). For best results and to minimize overall exposure, ensure that only traffic from trusted sources can reach mongod and mongos instances and that the mongod and mongos instances can only connect to trusted outputs. See also: For MongoDB deployments on Amazon’s web services, see the Amazon EC210 page, which addresses Amazon’s Security Groups and other EC2-specific security features. Virtual Private Networks Virtual private networks, or VPNs, make it possible to link two networks over an encrypted and limited-access trusted network. 
Typically, MongoDB users who use VPNs choose SSL rather than IPSEC VPNs for performance reasons. Depending on configuration and implementation, VPNs provide for certificate validation and a choice of encryption protocols, which can require a rigorous level of authentication and identification of all clients. Furthermore, because VPNs provide a secure tunnel, by using a VPN connection to control access to your MongoDB instance, you can prevent tampering and "man-in-the-middle" attacks.

6.2.5 Security and MongoDB API Interfaces The following section contains strategies to limit risks related to MongoDB's available interfaces, including the JavaScript, HTTP, and REST interfaces.

JavaScript and the Security of the mongo Shell The following JavaScript evaluation behaviors of the mongo shell represent risk exposures.

10 http://docs.mongodb.org/ecosystem/platforms/amazon-ec2
  • 294. MongoDB Documentation, Release 2.6.4 JavaScript Expression or JavaScript File The mongo program can evaluate JavaScript expressions using the command line --eval option. Also, the mongo program can evaluate a JavaScript file (.js) passed directly to it (e.g. mongo someFile.js). Because the mongo program evaluates the JavaScript directly, inputs should only come from trusted sources. .mongorc.js File If a .mongorc.js file exists 11, the mongo shell will evaluate a .mongorc.js file before starting. You can disable this behavior by passing the mongo --norc option. HTTP Status Interface The HTTP status interface provides a web-based interface that includes a variety of operational data, logs, and status reports regarding the mongod or mongos instance. The HTTP interface is always available on the port numbered 1000 greater than the primary mongod port. By default, the HTTP interface port is 28017, but is indirectly set using the port option which allows you to configure the primary mongod port. Without the net.http.RESTInterfaceEnabled setting, this interface is entirely read-only, and limited in scope; nevertheless, this interface may represent an exposure. To disable the HTTP interface, set the enabled run time option or the --nohttpinterface command line option. See also Configuration Options (page 288). REST API The REST API to MongoDB provides additional information and write access on top of the HTTP Status interface. While the REST API does not provide any support for insert, update, or remove operations, it does provide adminis-trative access, and its accessibility represents a vulnerability in a secure environment. The REST interface is disabled by default, and is not recommended for production use. If you must use the REST API, please control and limit access to the REST API. The REST API does not include any support for authentication, even when running with authorization enabled. 
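The shell behaviors described above correspond to the following mongo invocations; the file name is illustrative.

```shell
mongo --eval 'db.version()'   # evaluate a JavaScript expression directly
mongo someFile.js             # evaluate a JavaScript file passed to the shell
mongo --norc                  # start without evaluating ~/.mongorc.js
```

Because the shell evaluates these inputs directly, run them only on expressions and files from trusted sources.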
See the following documents for instructions on restricting access to the REST API interface: • Configure Linux iptables Firewall for MongoDB (page 297) • Configure Windows netsh Firewall for MongoDB (page 300) 6.2.6 Auditing New in version 2.6. MongoDB Enterprise includes an auditing capability for mongod and mongos instances. The auditing facility allows administrators and users to track system activity for deployments with multiple users and applications. The auditing facility can write audit events to the console, the syslog, a JSON file, or a BSON file. For details on the audit log messages, see System Event Audit Messages (page 380). 11 On Linux and Unix systems, mongo reads the .mongorc.js file from $HOME/.mongorc.js (i.e. ~/.mongorc.js). On Windows, mongo.exe reads the .mongorc.js file from %HOME%.mongorc.js or %HOMEDRIVE%%HOMEPATH%.mongorc.js. 290 Chapter 6. Security
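The audit destinations listed above are selected at startup. One plausible invocation that writes audit events to a JSON file on a MongoDB Enterprise mongod is sketched below; the dbpath and log path are illustrative.

```shell
# MongoDB Enterprise only; the dbpath and audit log path are examples.
mongod --dbpath /data/db \
       --auditDestination file \
       --auditFormat JSON \
       --auditPath /var/log/mongodb/audit.json
```

Use --auditFilter, as described under Audit Events and Filter, to restrict which events are captured.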
  • 295. MongoDB Documentation, Release 2.6.4 Audit Events and Filter The auditing system can record the following operations: • schema (DDL), • replica set, • authentication and authorization, and • general operations. See Event Actions, Details, and Results (page 381) for the specific actions recorded. By default, the auditing system records all these operations; however, you can configure the --auditFilter option to restrict the events captured. See Configure System Events Auditing (page 356) to enable and configure auditing for MongoDB Enterprise. To set up filters, see Filter Events (page 358). Audit Guarantee The auditing system writes every audit event 12 to an in-memory buffer of audit events. MongoDB writes this buffer to disk periodically. For events collected from any single connection, the events have a total order: if MongoDB writes one event to disk, the system guarantees that it has written all prior events for that connection to disk. If an audit event entry corresponds to an operation that affects the durable state of the database, such as a modification to data, MongoDB will always write the audit event to disk before writing to the journal for that entry. That is, before adding an operation to the journal, MongoDB writes all audit events on the connection that triggered the operation, up to and including the entry for the operation. These auditing guarantees require that MongoDB runs with the journaling enabled. Warning: MongoDB may lose events if the server terminates before it commits the events to the audit log. The client may receive confirmation of the event before MongoDB commits to the audit log. For example, while auditing an aggregation operation, the server might crash after returning the result but before the audit log flushes. 6.2.7 Kerberos Authentication New in version 2.4. Overview MongoDB Enterprise provides support for Kerberos authentication of MongoDB clients to mongod and mongos. 
Kerberos is an industry standard authentication protocol for large client/server systems. Kerberos allows MongoDB and applications to take advantage of existing authentication infrastructure and processes. Kerberos Components and MongoDB Principals In a Kerberos-based system, every participant in the authenticated communication is known as a “principal”, and every principal must have a unique name. 12 Audit configuration can include a filter (page 358) to limit events to audit. 6.2. Security Concepts 291
Principals belong to administrative units called realms. For each realm, the Kerberos Key Distribution Center (KDC) maintains a database of the realm's principals and the principals' associated "secret keys". For client-server authentication, the client requests from the KDC a "ticket" for access to a specific asset. The KDC uses the client's secret and the server's secret to construct the ticket, which allows the client and server to mutually authenticate each other while keeping the secrets hidden.

For the configuration of MongoDB for Kerberos support, two kinds of principal names are of interest: user principals (page 292) and service principals (page 292).

User Principal To authenticate using Kerberos, you must add the Kerberos user principals to MongoDB in the $external database. User principal names have the form: <username>@<KERBEROS REALM> For every user you want to authenticate using Kerberos, you must create a corresponding user in MongoDB in the $external database. For examples of adding a user to MongoDB as well as authenticating as that user, see Configure MongoDB with Kerberos Authentication on Linux (page 331) and Configure MongoDB with Kerberos Authentication on Windows (page 334). See also: http://docs.mongodb.org/manual/reference/command/nav-user-management for general information regarding creating and managing users in MongoDB.

Service Principal Every MongoDB mongod and mongos instance (or mongod.exe or mongos.exe on Windows) must have an associated service principal. Service principal names have the form: <service>/<fully qualified domain name>@<KERBEROS REALM> For MongoDB, the <service> defaults to mongodb. For example, if m1.example.com is a MongoDB server, and example.com maintains the EXAMPLE.COM Kerberos realm, then m1 should have the service principal name mongodb/m1.example.com@EXAMPLE.COM.
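Adding a user principal as a MongoDB user in the $external database can be sketched as follows; this assumes a Kerberos-enabled deployment, and the principal, role, and database names are illustrative only.

```javascript
// Run in the mongo shell against a Kerberos-enabled deployment (sketch only).
// "alice@EXAMPLE.COM" and the "records" database are hypothetical names.
db.getSiblingDB("$external").createUser({
  user: "alice@EXAMPLE.COM",                 // Kerberos user principal
  roles: [ { role: "read", db: "records" } ] // example role assignment
});
```

The user name must match the Kerberos user principal exactly, including the realm.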
To specify a different value for <service>, use serviceName during the startup of mongod or mongos (or mongod.exe or mongos.exe). The mongo shell and other clients may also specify a different service principal name using serviceName. The host component of a service principal name must be reachable over the network using the fully qualified domain name (FQDN) in the principal. By default, Kerberos attempts to identify hosts using the /etc/krb5.conf file before using DNS to resolve hosts. On Windows, if running MongoDB as a service, see Assign Service Principal Name to MongoDB Windows Service (page 336).

Linux Keytab Files Linux systems can store Kerberos authentication keys for a service principal (page 292) in keytab files. Each Kerberized mongod and mongos instance running on Linux must have access to a keytab file containing keys for its service principal (page 292). To keep keytab files secure, use file permissions that restrict access to only the user that runs the mongod or mongos process.
  • 297. MongoDB Documentation, Release 2.6.4 Tickets On Linux, MongoDB clients can use Kerberos’s kinit program to initialize a credential cache for authenticating the user principal to servers. Windows Active Directory Unlike on Linux systems, mongod and mongos instances running on Windows do not require access to keytab files. Instead, the mongod and mongos instances read their server credentials from a credential store specific to the operating system. However, from the Windows Active Directory, you can export a keytab file for use on Linux systems. See Ktpass13 for more information. Authenticate With Kerberos To configure MongoDB for Kerberos support and authenticate, see Configure MongoDB with Kerberos Authentication on Linux (page 331) and Configure MongoDB with Kerberos Authentication on Windows (page 334). Operational Considerations The HTTP Console The MongoDB HTTP Console14 interface does not support Kerberos authentication. DNS Each host that runs a mongod or mongos instance must have both A and PTR DNS records to provide forward and reverse lookup. Without A and PTR DNS records, the host cannot resolve the components of the Kerberos domain or the Key Distri-bution Center (KDC). System Time Synchronization To successfully authenticate, the system time for each mongod and mongos instance must be within 5 minutes of the system time of the other hosts in the Kerberos infrastructure. Kerberized MongoDB Environments Driver Support The following MongoDB drivers support Kerberos authentication: • Java15 13http://technet.microsoft.com/en-us/library/cc753771.aspx 14http://docs.mongodb.org/ecosystem/tools/http-interfaces/#http-console 15http://docs.mongodb.org/ecosystem/tutorial/authenticate-with-java-driver/ 6.2. Security Concepts 293
  • 298. MongoDB Documentation, Release 2.6.4 • C#16 • C++17 • Python18 Use with Additional MongoDB Authentication Mechanism Although MongoDB supports the use of Kerberos authentication with other authentication mechanisms, only add the other mechanisms as necessary. See the Incorporate Additional Authentication Mechanisms section in Configure MongoDB with Kerberos Authentication on Linux (page 331) and Configure MongoDB with Kerberos Authentication on Windows (page 334) for details. 6.3 Security Tutorials The following tutorials provide instructions for enabling and using the security features available in MongoDB. Security Checklist (page 295) A high level overview of global security consideration for administrators of MongoDB deployments. Use this checklist if you are new to deploying MongoDB in production and want to implement high quality security practices. Network Security Tutorials (page 297) Ensure that the underlying network configuration supports a secure operating environment for MongoDB deployments, and appropriately limits access to MongoDB deployments. Configure Linux iptables Firewall for MongoDB (page 297) Basic firewall configuration patterns and exam-ples for iptables on Linux systems. Configure Windows netsh Firewall for MongoDB (page 300) Basic firewall configuration patterns and exam-ples for netsh on Windows systems. Configure mongod and mongos for SSL (page 304) SSL allows MongoDB clients to support encrypted con-nections to mongod instances. Continue reading from Network Security Tutorials (page 297) for more information on running MongoDB in secure environments. Security Deployment Tutorials (page 313) These tutorials describe procedures for deploying MongoDB using au-thentication and authorization. Access Control Tutorials (page 316) These tutorials describe procedures relevant for the configuration, operation, and maintenance of MongoDB’s access control system. 
Enable Client Access Control (page 317) Describes the process for enabling authentication for MongoDB de-ployments. Use x.509 Certificates to Authenticate Clients (page 320) Use x.509 for client authentication. Use x.509 Certificate for Membership Authentication (page 323) Use x.509 for internal member authentica-tion for replica sets and sharded clusters. Configure MongoDB with Kerberos Authentication on Linux (page 331) For MongoDB Enterprise Linux, describes the process to enable Kerberos-based authentication for MongoDB deployments. Continue reading from Access Control Tutorials (page 316) for additional tutorials on configuring MongoDB’s authentication systems. 16http://docs.mongodb.org/ecosystem/tutorial/authenticate-with-csharp-driver/ 17http://docs.mongodb.org/ecosystem/tutorial/authenticate-with-cpp-driver/ 18http://api.mongodb.org/python/current/examples/authentication.html 294 Chapter 6. Security
Enable Authentication after Creating the User Administrator (page 319) Describes an alternative process for enabling authentication for MongoDB deployments.

User and Role Management Tutorials (page 342) MongoDB's access control system provides a flexible role-based access control system that you can use to limit access to MongoDB deployments. The tutorials in this section describe the configuration and setup of the authorization system.

Add a User to a Database (page 344) Create non-administrator users using MongoDB's role-based authentication system.
Create a Role (page 347) Create a custom role.
Modify a User's Access (page 352) Modify the actions available to a user on specific database resources.
View Roles (page 353) View a role's privileges.

Continue reading from User and Role Management Tutorials (page 342) for additional tutorials on managing users and privileges in MongoDB's authorization system.

Configure System Events Auditing (page 356) Enable and configure the MongoDB Enterprise system event auditing feature.
Create a Vulnerability Report (page 359) Report a vulnerability in MongoDB.

6.3.1 Security Checklist This document provides a list of security measures that you should implement to protect your MongoDB installation.

Require Authentication Enable MongoDB authentication and specify the authentication mechanism. You can use the MongoDB authentication mechanism or an existing external framework. Authentication requires that all clients and servers provide valid credentials before they can connect to the system. In clustered deployments, enable authentication for each MongoDB server. See Authentication (page 282), Enable Client Access Control (page 317), and Enable Authentication in a Sharded Cluster (page 318).

Configure Role-Based Access Control Create roles that define the exact access a set of users needs. Follow a principle of least privilege.
Then create users and assign them only the roles they need to perform their operations. A user can be a person or a client application. Create a user administrator first, then create additional users. Create a unique MongoDB user for each person and application that accesses the system. See Authorization (page 285), Create a Role (page 347), Create a User Administrator (page 343), and Add a User to a Database (page 344). Encrypt Communication Configure MongoDB to use SSL for all incoming and outgoing connections. Use SSL to encrypt communication between mongod and mongos components of a MongoDB client, as well as between all applications and MongoDB. See Configure mongod and mongos for SSL (page 304). 6.3. Security Tutorials 295
  • 300. MongoDB Documentation, Release 2.6.4 Limit Network Exposure Ensure that MongoDB runs in a trusted network environment and limit the interfaces on which MongoDB instances listen for incoming connections. Allow only trusted clients to access the network interfaces and ports on which MongoDB instances are available. See the bindIp setting, and see Configure Linux iptables Firewall for MongoDB (page 297) and Configure Windows netsh Firewall for MongoDB (page 300). Audit System Activity Track access and changes to database configurations and data. MongoDB Enterprise19 includes a system auditing facility that can record system events (e.g. user operations, connection events) on a MongoDB instance. These audit records permit forensic analysis and allow administrators to verify proper controls. See Auditing (page 290) and Configure System Events Auditing (page 356). Encrypt and Protect Data Encrypt MongoDB data on each host using file-system, device, or physical encryption. Protect MongoDB data using file-system permissions. MongoDB data includes data files, configuration files, auditing logs, and key files. Run MongoDB with a Dedicated User Run MongoDB processes with a dedicated operating system user account. Ensure that the account has permissions to access data but no unnecessary permissions. See Install MongoDB (page 5) for more information on running MongoDB. Run MongoDB with Secure Configuration Options MongoDB supports the execution of JavaScript code for certain server-side operations: mapReduce, group, eval, and $where. If you do not use these operations, disable server-side scripting by using the --noscripting option on the command line. Use only the MongoDB wire protocol on production deployments. Do not enable the following, all of which enable the web server interface: enabled, net.http.JSONPEnabled, and net.http.RESTInterfaceEnabled. Leave these disabled, unless required for backwards compatibility. Keep input validation enabled. 
MongoDB enables input validation by default through the wireObjectCheck setting. This ensures that all documents stored by the mongod instance are valid BSON. Consider Security Standards Compliance For applications requiring HIPAA or PCI-DSS compliance, please refer to the MongoDB Security Reference Architec-ture20 to learn more about how you can use the key security capabilities to build compliant application infrastructure. 19http://www.mongodb.com/products/mongodb-enterprise 20http://info.mongodb.com/rs/mongodb/images/MongoDB_Security_Architecture_WP.pdf 296 Chapter 6. Security
  • 301. MongoDB Documentation, Release 2.6.4 Contact MongoDB for Further Guidance MongoDB Inc. provides a Security Technical Implementation Guide (STIG) upon request. Please request a copy21 for more information. 6.3.2 Network Security Tutorials The following tutorials provide information on handling network security for MongoDB. Configure Linux iptables Firewall for MongoDB (page 297) Basic firewall configuration patterns and examples for iptables on Linux systems. Configure Windows netsh Firewall for MongoDB (page 300) Basic firewall configuration patterns and examples for netsh on Windows systems. Configure mongod and mongos for SSL (page 304) SSL allows MongoDB clients to support encrypted connections to mongod instances. SSL Configuration for Clients (page 307) Configure clients to connect to MongoDB instances that use SSL. Upgrade a Cluster to Use SSL (page 311) Rolling upgrade process to use SSL. Configure MongoDB for FIPS (page 311) Configure for Federal Information Processing Standard (FIPS). Configure Linux iptables Firewall for MongoDB On contemporary Linux systems, the iptables program provides methods for managing the Linux Kernel’s netfilter or network packet filtering capabilities. These firewall rules make it possible for administrators to control what hosts can connect to the system, and limit risk exposure by limiting the hosts that can connect to a system. This document outlines basic firewall configurations for iptables firewalls on Linux. Use these approaches as a starting point for your larger networking organization. For a detailed overview of security practices and risk manage-ment for MongoDB, see Security Concepts (page 281). See also: For MongoDB deployments on Amazon’s web services, see the Amazon EC222 page, which addresses Amazon’s Security Groups and other EC2-specific security features. Overview Rules in iptables configurations fall into chains, which describe the process for filtering and processing specific streams of traffic. 
Chains have an order, and packets must pass through earlier rules in a chain to reach later rules. This document addresses only the following two chains: INPUT Controls all incoming traffic. OUTPUT Controls all outgoing traffic. Given the default ports (page 288) of all MongoDB processes, you must configure networking rules that permit only required communication between your application and the appropriate mongod and mongos instances. Be aware that, by default, the default policy of iptables is to allow all connections and traffic unless explicitly disabled. The configuration changes outlined in this document will create rules that explicitly allow traffic from specific addresses and on specific ports, using a default policy that drops all traffic that is not explicitly allowed. When 21http://www.mongodb.com/lp/contact/stig-requests 22http://docs.mongodb.org/ecosystem/platforms/amazon-ec2 6.3. Security Tutorials 297
you have properly configured your iptables rules to allow only the traffic that you want to permit, you can Change Default Policy to DROP (page 300).

Patterns This section contains a number of patterns and examples for configuring iptables for use with MongoDB deployments. If you have configured different ports using the port configuration setting, you will need to modify the rules accordingly.

Traffic to and from mongod Instances This pattern is applicable to all mongod instances running as standalone instances or as part of a replica set. The goal of this pattern is to explicitly allow traffic to the mongod instance from the application server. In the following examples, replace <ip-address> with the IP address of the application server:

iptables -A INPUT -s <ip-address> -p tcp --destination-port 27017 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -d <ip-address> -p tcp --source-port 27017 -m state --state ESTABLISHED -j ACCEPT

The first rule allows all incoming traffic from <ip-address> on port 27017, which allows the application server to connect to the mongod instance. The second rule allows outgoing traffic from the mongod to reach the application server.

Optional If you have only one application server, you can replace <ip-address> with the IP address itself, such as 198.51.100.55. You can also express this using CIDR notation as 198.51.100.55/32. If you want to permit a larger block of possible IP addresses, you can allow traffic from a /24 block using one of the following specifications for the <ip-address>:

10.10.10.10/24
10.10.10.10/255.255.255.0

Traffic to and from mongos Instances mongos instances provide query routing for sharded clusters. Clients connect to mongos instances, which behave from the client's perspective as mongod instances. In turn, the mongos connects to all mongod instances that are components of the sharded cluster.
Use the same iptables command to allow traffic to and from these instances as you would from the mongod instances that are members of the replica set. Take the configuration outlined in the Traffic to and from mongod Instances (page 298) section as an example.

Traffic to and from a MongoDB Config Server Config servers host the config database that stores metadata for sharded clusters. Each production cluster has three config servers, initiated using the mongod --configsvr option. 23 Config servers listen for connections on port 27019. As a result, add the following iptables rules to the config server to allow incoming and outgoing connections on port 27019, for connections to the other config servers:

iptables -A INPUT -s <ip-address> -p tcp --destination-port 27019 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -d <ip-address> -p tcp --source-port 27019 -m state --state ESTABLISHED -j ACCEPT

Replace <ip-address> with the address or address space of all the mongod instances that provide config servers. Additionally, config servers need to allow incoming connections from all of the mongos instances in the cluster and all mongod instances in the cluster. Add rules that resemble the following:

23 You also can run a config server by using the configsvr value for the clusterRole setting in a configuration file.
iptables -A INPUT -s <ip-address> -p tcp --destination-port 27019 -m state --state NEW,ESTABLISHED -j ACCEPT

Replace <ip-address> with the address of the mongos instances and the shard mongod instances.

Traffic to and from a MongoDB Shard Server

Shard servers run as mongod --shardsvr. 24 Because the default port number is 27018 when running with the shardsvr value for the clusterRole setting, you must configure the following iptables rules to allow traffic to and from each shard:

iptables -A INPUT -s <ip-address> -p tcp --destination-port 27018 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -d <ip-address> -p tcp --source-port 27018 -m state --state ESTABLISHED -j ACCEPT

Replace the <ip-address> specification with the IP address of all mongod instances. This allows you to permit incoming and outgoing traffic between all shards, including constituent replica set members, to:

• all mongod instances in the shard's replica sets.
• all mongod instances in other shards. 25

Furthermore, shards need to be able to make outgoing connections to:

• all mongos instances.
• all mongod instances in the config servers.

Create a rule that resembles the following, and replace the <ip-address> with the address of the config servers and the mongos instances:

iptables -A OUTPUT -d <ip-address> -p tcp --source-port 27018 -m state --state ESTABLISHED -j ACCEPT

Provide Access For Monitoring Systems

1. The mongostat diagnostic tool, when running with the --discover option, needs to be able to reach all components of a cluster, including the config servers, the shard servers, and the mongos instances.

2. If your monitoring system needs access to the HTTP interface, insert the following rule to the chain:

iptables -A INPUT -s <ip-address> -p tcp --destination-port 28017 -m state --state NEW,ESTABLISHED -j ACCEPT

Replace <ip-address> with the address of the instance that needs access to the HTTP or REST interface.
For all deployments, you should restrict access to this port to only the monitoring instance.

Optional

For shard server mongod instances running with the shardsvr value for the clusterRole setting, the rule would resemble the following:

iptables -A INPUT -s <ip-address> -p tcp --destination-port 28018 -m state --state NEW,ESTABLISHED -j ACCEPT

For config server mongod instances running with the configsvr value for the clusterRole setting, the rule would resemble the following:

iptables -A INPUT -s <ip-address> -p tcp --destination-port 28019 -m state --state NEW,ESTABLISHED -j ACCEPT

24 You can also specify the shard server option with the shardsvr value for the clusterRole setting in the configuration file. Shard members are also often conventional replica sets using the default port.
25 All shards in a cluster need to be able to communicate with all other shards to facilitate chunk and balancing operations.

6.3. Security Tutorials 299
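A pattern is implicit in the monitoring ports above: the legacy HTTP status interface listens 1000 ports above the instance's main port. A quick illustrative sketch:

```python
def http_status_port(mongod_port: int) -> int:
    # The HTTP/REST status interface listens 1000 ports above
    # the instance's primary port.
    return mongod_port + 1000

assert http_status_port(27017) == 28017  # standalone / replica set member
assert http_status_port(27018) == 28018  # shardsvr
assert http_status_port(27019) == 28019  # configsvr
```

If you move an instance off its default port, shift the corresponding monitoring rule by the same offset.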
Change Default Policy to DROP

The default policy for iptables chains is to allow all traffic. After completing all iptables configuration changes, you must change the default policy to DROP so that all traffic that isn't explicitly allowed as above will not be able to reach components of the MongoDB deployment. Issue the following commands to change this policy:

iptables -P INPUT DROP
iptables -P OUTPUT DROP

Manage and Maintain iptables Configuration

This section contains a number of basic operations for managing and using iptables. There are various front-end tools that automate some aspects of iptables configuration, but at the core all iptables front ends provide the same basic functionality:

Make all iptables Rules Persistent

By default all iptables rules are only stored in memory. When your system restarts, your firewall rules will revert to their defaults. When you have tested a rule set and have guaranteed that it effectively controls traffic, you should make the rule set persistent.

On Red Hat Enterprise Linux, Fedora Linux, and related distributions you can issue the following command:

service iptables save

On Debian, Ubuntu, and related distributions, you can use the following command to dump the iptables rules to the /etc/iptables.conf file:

iptables-save > /etc/iptables.conf

Run the following operation to restore the network rules:

iptables-restore < /etc/iptables.conf

Place this command in your rc.local file, or in the /etc/network/if-up.d/iptables file with other similar operations.

List all iptables Rules

To list all currently applied iptables rules, use the following operation at the system shell.
iptables -L

Flush all iptables Rules

If you make a configuration mistake when entering iptables rules or simply need to revert to the default rule set, you can use the following operation at the system shell to flush all rules:

iptables -F

If you've already made your iptables rules persistent, you will need to repeat the appropriate procedure in the Make all iptables Rules Persistent (page 300) section.

Configure Windows netsh Firewall for MongoDB

On Windows Server systems, the netsh program provides methods for managing the Windows Firewall. These firewall rules make it possible for administrators to control what hosts can connect to the system, limiting risk exposure.
This document outlines basic Windows Firewall configurations. Use these approaches as a starting point for your larger networking organization. For a detailed overview of security practices and risk management for MongoDB, see Security Concepts (page 281).

See also:
Windows Firewall26 documentation from Microsoft.

Overview

Windows Firewall processes rules in an order determined by rule type, parsed in the following order:

1. Windows Service Hardening
2. Connection security rules
3. Authenticated Bypass Rules
4. Block Rules
5. Allow Rules
6. Default Rules

By default, the policy in Windows Firewall allows all outbound connections and blocks all incoming connections.

Given the default ports (page 288) of all MongoDB processes, you must configure networking rules that permit only required communication between your application and the appropriate mongod.exe and mongos.exe instances.

The configuration changes outlined in this document will create rules which explicitly allow traffic from specific addresses and on specific ports, using a default policy that drops all traffic that is not explicitly allowed.

You can configure the Windows Firewall using the netsh command line tool or through a graphical application. On Windows Server 2008 this application is Windows Firewall With Advanced Security in Administrative Tools. On previous versions of Windows Server, access the Windows Firewall application in the System and Security control panel.

The procedures in this document use the netsh command line tool.

Patterns

This section contains a number of patterns and examples for configuring Windows Firewall for use with MongoDB deployments. If you have configured different ports using the port configuration setting, you will need to modify the rules accordingly.

Traffic to and from mongod.exe Instances

This pattern is applicable to all mongod.exe instances running as standalone instances or as part of a replica set.
The goal of this pattern is to explicitly allow traffic to the mongod.exe instance from the application server.

netsh advfirewall firewall add rule name="Open mongod port 27017" dir=in action=allow protocol=TCP localport=27017

This rule allows all incoming traffic to port 27017, which allows the application server to connect to the mongod.exe instance.

Windows Firewall also allows enabling network access for an entire application rather than to a specific port, as in the following example:

26http://technet.microsoft.com/en-us/network/bb545423.aspx
netsh advfirewall firewall add rule name="Allowing mongod" dir=in action=allow program="C:\mongodb\bin\mongod.exe"

You can allow all access for a mongos.exe server, with the following invocation:

netsh advfirewall firewall add rule name="Allowing mongos" dir=in action=allow program="C:\mongodb\bin\mongos.exe"

Traffic to and from mongos.exe Instances

mongos.exe instances provide query routing for sharded clusters. Clients connect to mongos.exe instances, which behave from the client's perspective as mongod.exe instances. In turn, the mongos.exe connects to all mongod.exe instances that are components of the sharded cluster.

Use the same Windows Firewall command to allow traffic to and from these instances as you would from the mongod.exe instances that are members of the replica set.

netsh advfirewall firewall add rule name="Open mongod shard port 27018" dir=in action=allow protocol=TCP localport=27018

Traffic to and from a MongoDB Config Server

Configuration servers host the config database that stores metadata for sharded clusters. Each production cluster has three configuration servers, initiated using the mongod --configsvr option. 27 Configuration servers listen for connections on port 27019. As a result, add the following Windows Firewall rules to the config server to allow incoming and outgoing connections on port 27019, for connections to the other config servers:

netsh advfirewall firewall add rule name="Open mongod config svr port 27019" dir=in action=allow protocol=TCP localport=27019

Additionally, config servers need to allow incoming connections from all of the mongos.exe instances in the cluster and all mongod.exe instances in the cluster. Add rules that resemble the following:

netsh advfirewall firewall add rule name="Open mongod config svr inbound" dir=in action=allow protocol=TCP remoteip=<ip-address> localport=27019

Replace <ip-address> with the addresses of the mongos.exe instances and the shard mongod.exe instances.
Traffic to and from a MongoDB Shard Server

Shard servers run as mongod --shardsvr. 28 Because the default port number is 27018 when running with the shardsvr value for the clusterRole setting, you must configure the following Windows Firewall rules to allow traffic to and from each shard:

netsh advfirewall firewall add rule name="Open mongod shardsvr inbound" dir=in action=allow protocol=TCP remoteip=<ip-address> localport=27018
netsh advfirewall firewall add rule name="Open mongod shardsvr outbound" dir=out action=allow protocol=TCP remoteip=<ip-address> localport=27018

Replace the <ip-address> specification with the IP address of all mongod.exe instances. This allows you to permit incoming and outgoing traffic between all shards, including constituent replica set members, to:

• all mongod.exe instances in the shard's replica sets.
• all mongod.exe instances in other shards. 29

Furthermore, shards need to be able to make outgoing connections to:

• all mongos.exe instances.
• all mongod.exe instances in the config servers.

Create a rule that resembles the following, and replace the <ip-address> with the address of the config servers and the mongos.exe instances:

27 You can also run a config server by using the configsvr value for the clusterRole setting in a configuration file.
28 You can also specify the shard server option with the shardsvr value for the clusterRole setting in the configuration file. Shard members are also often conventional replica sets using the default port.
29 All shards in a cluster need to be able to communicate with all other shards to facilitate chunk and balancing operations.
netsh advfirewall firewall add rule name="Open mongod config svr outbound" dir=out action=allow protocol=TCP remoteip=<ip-address> localport=27019

Provide Access For Monitoring Systems

1. The mongostat diagnostic tool, when running with the --discover option, needs to be able to reach all components of a cluster, including the config servers, the shard servers, and the mongos.exe instances.

2. If your monitoring system needs access to the HTTP interface, insert the following rule to the chain:

netsh advfirewall firewall add rule name="Open mongod HTTP monitoring inbound" dir=in action=allow protocol=TCP remoteip=<ip-address> localport=28017

Replace <ip-address> with the address of the instance that needs access to the HTTP or REST interface. For all deployments, you should restrict access to this port to only the monitoring instance.

Optional

For shard server mongod instances running with the shardsvr value for the clusterRole setting, the rule would resemble the following:

netsh advfirewall firewall add rule name="Open mongod shardsvr HTTP monitoring inbound" dir=in action=allow protocol=TCP remoteip=<ip-address> localport=28018

For config server mongod instances running with the configsvr value for the clusterRole setting, the rule would resemble the following:

netsh advfirewall firewall add rule name="Open mongod configsvr HTTP monitoring inbound" dir=in action=allow protocol=TCP remoteip=<ip-address> localport=28019

Manage and Maintain Windows Firewall Configurations

This section contains a number of basic operations for managing and using netsh. While you can use the GUI front ends to manage the Windows Firewall, all core functionality is accessible from netsh.
Delete all Windows Firewall Rules

To delete the firewall rule allowing mongod.exe traffic:

netsh advfirewall firewall delete rule name="Open mongod port 27017" protocol=tcp localport=27017
netsh advfirewall firewall delete rule name="Open mongod shard port 27018" protocol=tcp localport=27018

List All Windows Firewall Rules

To return a list of all Windows Firewall rules:

netsh advfirewall firewall show rule name=all

Reset Windows Firewall

To reset the Windows Firewall rules:

netsh advfirewall reset

Backup and Restore Windows Firewall Rules

To simplify administration of larger collections of systems, you can export and import firewall rules (to and from different servers) very easily on Windows:

Export all firewall rules with the following command:

netsh advfirewall export "C:\temp\MongoDBfw.wfw"
Replace "C:\temp\MongoDBfw.wfw" with a path of your choosing. You can use a command in the following form to import a file created using this operation:

netsh advfirewall import "C:\temp\MongoDBfw.wfw"

Configure mongod and mongos for SSL

This document helps you to configure MongoDB to support SSL. MongoDB clients can use SSL to encrypt connections to mongod and mongos instances.

Note: The default distribution of MongoDB30 does not contain support for SSL. To use SSL, you must either build MongoDB locally passing the --ssl option to scons or use MongoDB Enterprise31. These instructions assume that you have already installed a build of MongoDB that includes SSL support and that your client driver supports SSL. For instructions on upgrading a cluster currently not using SSL to using SSL, see Upgrade a Cluster to Use SSL (page 311).

Changed in version 2.6: MongoDB's SSL encryption only allows use of strong SSL ciphers with a minimum of 128-bit key length for all connections.

MongoDB Enterprise for Windows includes support for SSL.

See also:
SSL Configuration for Clients (page 307) to learn about SSL support for Python, Java, Ruby, and other clients.

.pem File

Before you can use SSL, you must have a .pem file containing a public key certificate and its associated private key.

MongoDB can use any valid SSL certificate issued by a certificate authority, or a self-signed certificate. If you use a self-signed certificate, although the communications channel will be encrypted, there will be no validation of server identity. Although such a situation will prevent eavesdropping on the connection, it leaves you vulnerable to a man-in-the-middle attack. Using a certificate signed by a trusted certificate authority will permit MongoDB drivers to verify the server's identity. In general, avoid using self-signed certificates unless the network is trusted.
Additionally, with regards to authentication among replica set/sharded cluster members (page 284), in order to minimize exposure of the private key and allow hostname validation, it is advisable to use different certificates on different servers.

For testing purposes, you can generate a self-signed certificate and private key on a Unix system with commands that resemble the following:

cd /etc/ssl/
openssl req -newkey rsa:2048 -new -x509 -days 365 -nodes -out mongodb-cert.crt -keyout mongodb-cert.key

This operation generates a new, self-signed certificate with no passphrase that is valid for 365 days. Once you have the certificate, concatenate the certificate and private key to a .pem file, as in the following example:

cat mongodb-cert.key mongodb-cert.crt > mongodb.pem

See also:
Use x.509 Certificates to Authenticate Clients (page 320)

30http://www.mongodb.org/downloads
31http://www.mongodb.com/products/mongodb-enterprise
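The trade-off between self-signed and CA-signed certificates described above is visible in how any TLS client configures its verification settings. The following sketch is illustrative only, using Python's standard ssl module rather than a MongoDB driver:

```python
import ssl

# A CA-validating client context: verifies the server's certificate chain
# and hostname, which is what defeats man-in-the-middle attacks.
verifying = ssl.create_default_context()
assert verifying.verify_mode == ssl.CERT_REQUIRED
assert verifying.check_hostname is True

# Accepting a self-signed certificate typically means disabling both checks.
# The channel is still encrypted, but the server's identity is not validated.
trusting = ssl.create_default_context()
trusting.check_hostname = False
trusting.verify_mode = ssl.CERT_NONE
```

The second configuration mirrors the risk described in the paragraph above: traffic is unreadable to an eavesdropper, but an active attacker can impersonate the server.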
Set Up mongod and mongos with SSL Certificate and Key

To use SSL in your MongoDB deployment, include the following run-time options with mongod and mongos:

• net.ssl.mode set to requireSSL. This setting restricts each server to use only SSL encrypted connections. You can also specify either the value allowSSL or preferSSL to set up the use of mixed SSL modes on a port. See net.ssl.mode for details.
• PEMKeyFile with the .pem file that contains the SSL certificate and key.

Consider the following syntax for mongod:

mongod --sslMode requireSSL --sslPEMKeyFile <pem>

For example, given an SSL certificate located at /etc/ssl/mongodb.pem, configure mongod to use SSL encryption for all connections with the following command:

mongod --sslMode requireSSL --sslPEMKeyFile /etc/ssl/mongodb.pem

Note:
• Specify <pem> with the full path name to the certificate.
• If the private key portion of the <pem> is encrypted, specify the passphrase. See SSL Certificate Passphrase (page 306).
• You may also specify these options in the configuration file, as in the following example:

sslMode = requireSSL
sslPEMKeyFile = /etc/ssl/mongodb.pem

To connect to mongod and mongos instances using SSL, the mongo shell and MongoDB tools must include the --ssl option. See SSL Configuration for Clients (page 307) for more information on connecting to mongod and mongos running with SSL.

See also:
Upgrade a Cluster to Use SSL (page 311)

Set Up mongod and mongos with Certificate Validation

To set up mongod or mongos for SSL encryption using an SSL certificate signed by a certificate authority, include the following run-time options during startup:

• net.ssl.mode set to requireSSL. This setting restricts each server to use only SSL encrypted connections. You can also specify either the value allowSSL or preferSSL to set up the use of mixed SSL modes on a port. See net.ssl.mode for details.
• PEMKeyFile with the name of the .pem file that contains the signed SSL certificate and key.
• CAFile with the name of the .pem file that contains the root certificate chain from the Certificate Authority.

Consider the following syntax for mongod:

mongod --sslMode requireSSL --sslPEMKeyFile <pem> --sslCAFile <ca>

For example, given a signed SSL certificate located at /etc/ssl/mongodb.pem and the certificate authority file at /etc/ssl/ca.pem, you can configure mongod for SSL encryption as follows:
mongod --sslMode requireSSL --sslPEMKeyFile /etc/ssl/mongodb.pem --sslCAFile /etc/ssl/ca.pem

Note:
• Specify the <pem> file and the <ca> file with either the full path name or the relative path name.
• If the <pem> is encrypted, specify the passphrase. See SSL Certificate Passphrase (page 306).
• You may also specify these options in the configuration file, as in the following example:

sslMode = requireSSL
sslPEMKeyFile = /etc/ssl/mongodb.pem
sslCAFile = /etc/ssl/ca.pem

To connect to mongod and mongos instances using SSL, the mongo tools must include both the --ssl and --sslPEMKeyFile options. See SSL Configuration for Clients (page 307) for more information on connecting to mongod and mongos running with SSL.

See also:
Upgrade a Cluster to Use SSL (page 311)

Block Revoked Certificates for Clients

To prevent clients with revoked certificates from connecting, include the sslCRLFile to specify a .pem file that contains revoked certificates. For example, the following mongod with SSL configuration includes the sslCRLFile setting:

mongod --sslMode requireSSL --sslCRLFile /etc/ssl/ca-crl.pem --sslPEMKeyFile /etc/ssl/mongodb.pem --sslCAFile /etc/ssl/ca.pem

Clients with revoked certificates in the /etc/ssl/ca-crl.pem will not be able to connect to this mongod instance.

Validate Only if a Client Presents a Certificate

In most cases it is important to ensure that clients present valid certificates. However, if you have clients that cannot present a client certificate, or are transitioning to using a certificate authority, you may want to validate certificates only from clients that present a certificate.

If you want to bypass validation for clients that don't present certificates, include the weakCertificateValidation run-time option with mongod and mongos. If the client does not present a certificate, no validation occurs. These connections, though not validated, are still encrypted using SSL.
For example, consider the following mongod with an SSL configuration that includes the weakCertificateValidation setting:

mongod --sslMode requireSSL --sslWeakCertificateValidation --sslPEMKeyFile /etc/ssl/mongodb.pem --sslCAFile /etc/ssl/ca.pem

Then, clients can connect either with the option --ssl and no certificate or with the option --ssl and a valid certificate. See SSL Configuration for Clients (page 307) for more information on SSL connections for clients.

Note: If the client presents a certificate, the certificate must be a valid certificate. All connections, including those that have not presented certificates, are encrypted using SSL.

SSL Certificate Passphrase

The PEM files for PEMKeyFile and ClusterFile may be encrypted. With encrypted PEM files, you must specify the passphrase at startup with a command-line or a configuration file option, or enter the passphrase when prompted.
Changed in version 2.6: In previous versions, you can only specify the passphrase with a command-line or a configuration file option.

To specify the passphrase in clear text on the command line or in a configuration file, use the PEMKeyPassword and/or the ClusterPassword option.

To have MongoDB prompt for the passphrase at the start of mongod or mongos and avoid specifying the passphrase in clear text, omit the PEMKeyPassword and/or the ClusterPassword option. MongoDB will prompt for each passphrase as necessary.

Important: The passphrase prompt option is available only if you run the MongoDB instance in the foreground with a connected terminal. If you run mongod or mongos in a non-interactive session (e.g. without a terminal or as a service on Windows), you cannot use the passphrase prompt option.

Run in FIPS Mode

See Configure MongoDB for FIPS (page 311) for more details.

SSL Configuration for Clients

Clients must have support for SSL to work with a mongod or a mongos instance that has SSL support enabled. The current versions of the Python, Java, Ruby, Node.js, .NET, and C++ drivers have support for SSL, with full support coming in future releases of other drivers.

See also:
Configure mongod and mongos for SSL (page 304).

mongo Shell SSL Configuration

For SSL connections, you must use the mongo shell built with SSL support or distributed with MongoDB Enterprise. To support SSL, mongo has the following settings:

• --ssl
• --sslPEMKeyFile with the name of the .pem file that contains the SSL certificate and key.
• --sslCAFile with the name of the .pem file that contains the certificate from the Certificate Authority (CA).

Warning: If the mongo shell or any other tool that connects to mongos or mongod is run without --sslCAFile, it will not attempt to validate server certificates.
This results in vulnerability to expired mongod and mongos certificates as well as to foreign processes posing as valid mongod or mongos instances. Ensure that you always specify the CA file against which server certificates should be validated in cases where intrusion is a possibility.

• --sslPEMKeyPassword option if the client certificate-key file is encrypted.

Connect to MongoDB Instance with SSL Encryption

To connect to a mongod or mongos instance that requires only an SSL encryption mode (page 305), start the mongo shell with --ssl, as in the following:

mongo --ssl
Connect to MongoDB Instance that Requires Client Certificates

To connect to a mongod or mongos that requires CA-signed client certificates (page 305), start the mongo shell with --ssl and the --sslPEMKeyFile option to specify the signed certificate-key file, as in the following:

mongo --ssl --sslPEMKeyFile /etc/ssl/client.pem

Connect to MongoDB Instance that Validates when Presented with a Certificate

To connect to a mongod or mongos instance that only requires valid certificates when the client presents a certificate (page 306), start the mongo shell either with the --ssl option and no certificate or with the --ssl option and a valid signed certificate. For example, if mongod is running with weak certificate validation, both of the following mongo shell clients can connect to that mongod:

mongo --ssl
mongo --ssl --sslPEMKeyFile /etc/ssl/client.pem

Important: If the client presents a certificate, the certificate must be valid.

MMS Monitoring Agent

The Monitoring agent will also have to connect via SSL in order to gather its stats. Because the agent already utilizes SSL for its communications to the MMS servers, this is just a matter of enabling SSL support in MMS itself on a per-host basis.

Use the "Edit" host button (i.e. the pencil) on the Hosts page in the MMS console to enable SSL. Please see the MMS documentation32 for more information about MMS configuration.
PyMongo

Add the "ssl=True" parameter to a PyMongo MongoClient33 to create a MongoDB connection to an SSL MongoDB instance:

from pymongo import MongoClient
c = MongoClient(host="mongodb.example.net", port=27017, ssl=True)

To connect to a replica set, use the following operation:

from pymongo import MongoReplicaSetClient
c = MongoReplicaSetClient("mongodb.example.net:27017", replicaSet="mysetname", ssl=True)

PyMongo also supports an "ssl=true" option for the MongoDB URI:

mongodb://mongodb.example.net:27017/?ssl=true

For more details, see the Python MongoDB Driver page34.

32http://mms.mongodb.com/help
33http://api.mongodb.org/python/current/api/pymongo/mongo_client.html#pymongo.mongo_client.MongoClient
34http://docs.mongodb.org/ecosystem/drivers/python
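The ssl=true URI option above is an ordinary query-string parameter, which you can confirm with the standard library's URL tools. This is an illustrative sketch; real applications would simply pass the URI to MongoClient:

```python
from urllib.parse import urlsplit, parse_qs

uri = "mongodb://mongodb.example.net:27017/?ssl=true"

# Connection options such as ssl are carried in the URI's query string.
options = parse_qs(urlsplit(uri).query)
assert options == {"ssl": ["true"]}
```

Keeping SSL in the URI rather than in keyword arguments lets the same connection string be shared across drivers and tools that accept MongoDB URIs.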
Java

Consider the following example "SSLApp.java" class file:

import com.mongodb.*;
import javax.net.ssl.SSLSocketFactory;

public class SSLApp {
    public static void main(String args[]) throws Exception {
        MongoClientOptions o = new MongoClientOptions.Builder()
                .socketFactory(SSLSocketFactory.getDefault())
                .build();
        MongoClient m = new MongoClient("localhost", o);
        DB db = m.getDB("test");
        DBCollection c = db.getCollection("foo");
        System.out.println(c.findOne());
    }
}

For more details, see the Java MongoDB Driver page35.

Ruby

Recent versions of the Ruby driver have support for connections to SSL servers. Install the latest version of the driver with the following command:

gem install mongo

Then connect to a standalone instance, using the following form:

require 'rubygems'
require 'mongo'

connection = MongoClient.new('localhost', 27017, :ssl => true)

Replace connection with the following if you're connecting to a replica set:

connection = MongoReplicaSetClient.new(['localhost:27017', 'localhost:27018'], :ssl => true)

Here, the mongod instances run on "localhost:27017" and "localhost:27018".

For more details, see the Ruby MongoDB Driver page36.

Node.JS (node-mongodb-native)

In the node-mongodb-native37 driver, use the following invocation to connect to a mongod or mongos instance via SSL:

35http://docs.mongodb.org/ecosystem/drivers/java
36http://docs.mongodb.org/ecosystem/drivers/ruby
37https://github.com/mongodb/node-mongodb-native
var db1 = new Db(MONGODB, new Server("127.0.0.1", 27017,
  { auto_reconnect: false, poolSize: 4, ssl: true }
));

To connect to a replica set via SSL, use the following form:

var replSet = new ReplSetServers( [
    new Server( RS.host, RS.ports[1], { auto_reconnect: true } ),
    new Server( RS.host, RS.ports[0], { auto_reconnect: true } ),
  ],
  { rs_name: RS.name, ssl: true }
);

For more details, see the Node.JS MongoDB Driver page38.

.NET

As of release 1.6, the .NET driver supports SSL connections with mongod and mongos instances. To connect using SSL, you must add an option to the connection string, specifying ssl=true as follows:

var connectionString = "mongodb://localhost/?ssl=true";
var server = MongoServer.Create(connectionString);

The .NET driver will validate the certificate against the local trusted certificate store, in addition to providing encryption of the connection. This behavior may produce issues during testing if the server uses a self-signed certificate. If you encounter this issue, add the sslverifycertificate=false option to the connection string to prevent the .NET driver from validating the certificate, as follows:

var connectionString = "mongodb://localhost/?ssl=true&sslverifycertificate=false";
var server = MongoServer.Create(connectionString);

For more details, see the .NET MongoDB Driver page39.

MongoDB Tools

Changed in version 2.6.

Various MongoDB utility programs support SSL. These tools include:

• mongodump
• mongoexport
• mongofiles
• mongoimport
• mongooplog
• mongorestore
• mongostat
• mongotop

To use SSL connections with these tools, use the same SSL options as the mongo shell. See mongo Shell SSL Configuration (page 307).

38http://docs.mongodb.org/ecosystem/drivers/node-js
39http://docs.mongodb.org/ecosystem/drivers/csharp
Upgrade a Cluster to Use SSL

Note: The default distribution of MongoDB40 does not contain support for SSL. To use SSL you can either compile MongoDB with SSL support or use MongoDB Enterprise. See Configure mongod and mongos for SSL (page 304) for more information about SSL and MongoDB.

Changed in version 2.6.

The MongoDB server supports listening for both SSL encrypted and unencrypted connections on the same TCP port. This allows upgrades of MongoDB clusters to use SSL encrypted connections.

To upgrade from a MongoDB cluster using no SSL encryption to one using only SSL encryption, use the following rolling upgrade process:

1. For each node of a cluster, start the node with the option --sslMode set to allowSSL. The --sslMode allowSSL setting allows the node to accept both SSL and non-SSL incoming connections. Its connections to other servers do not use SSL. Include other SSL options (page 304) as well as any other options that are required for your specific configuration. For example:

mongod --replSet <name> --sslMode allowSSL --sslPEMKeyFile <path to SSL Certificate and key PEM file>

Upgrade all nodes of the cluster to these settings.

Note: You may also specify these options in the configuration file, as in the following example:

sslMode = <disabled|allowSSL|preferSSL|requireSSL>
sslPEMKeyFile = <path to SSL certificate and key PEM file>
sslCAFile = <path to root CA PEM file>

2. Switch all clients to use SSL. See SSL Configuration for Clients (page 307).

3. For each node of a cluster, use the setParameter command to update the sslMode to preferSSL. 41 With preferSSL as its net.ssl.mode, the node accepts both SSL and non-SSL incoming connections, and its connections to other servers use SSL. For example:

db.getSiblingDB('admin').runCommand( { setParameter: 1, sslMode: "preferSSL" } )

Upgrade all nodes of the cluster to these settings. At this point, all connections should be using SSL.

4.
For each node of the cluster, use the setParameter command to update the sslMode to requireSSL. 41 With requireSSL as its net.ssl.mode, the node will reject any non-SSL connections. For example:

db.getSiblingDB('admin').runCommand( { setParameter: 1, sslMode: "requireSSL" } )

5. After the upgrade of all nodes, edit the configuration file with the appropriate SSL settings to ensure that upon subsequent restarts, the cluster uses SSL.

Configure MongoDB for FIPS

New in version 2.6.

40http://www.mongodb.org/downloads
41 As an alternative to using the setParameter command, you can also restart the nodes with the appropriate SSL options and values.
Overview

The Federal Information Processing Standard (FIPS) is a U.S. government computer security standard used to certify software modules and libraries that encrypt and decrypt data securely. You can configure MongoDB to run with a FIPS 140-2 certified OpenSSL library. Configure FIPS to run by default or as needed from the command line.

Prerequisites

Only MongoDB Enterprise (http://www.mongodb.com/products/mongodb-enterprise) supports FIPS mode. Download and install MongoDB Enterprise to use FIPS mode.

Your system must have an OpenSSL library configured with the FIPS 140-2 module. At the command line, type openssl version to confirm your OpenSSL software includes FIPS support.

For Red Hat Enterprise Linux 6.x (RHEL 6.x) or its derivatives such as CentOS 6.x, the OpenSSL toolkit must be at least openssl-1.0.1e-16.el6_5 to use FIPS mode. To upgrade the toolkit for these platforms, issue the following command:

   sudo yum update openssl

Some versions of Linux periodically execute a process to prelink dynamic libraries with pre-assigned addresses. This process modifies the OpenSSL libraries, specifically libcrypto. The OpenSSL FIPS mode will subsequently fail the signature check performed upon startup to ensure libcrypto has not been modified since compilation. To configure the Linux prelink process to not prelink libcrypto:

   sudo bash -c "echo '-b /usr/lib64/libcrypto.so.*' >> /etc/prelink.conf.d/openssl-prelink.conf"

Procedure

Configure MongoDB to use SSL. See Configure mongod and mongos for SSL (page 304) for details about configuring OpenSSL.

Run the mongod or mongos instance in FIPS mode. Perform these steps after you Configure mongod and mongos for SSL (page 304).

Step 1: Change the configuration file. To configure your mongod or mongos instance to use FIPS mode, shut down the instance and update the configuration file with the following setting:

   net:
      ssl:
         FIPSMode: true

Step 2: Start the mongod or mongos instance with the configuration file.
For example, run this command to start the mongod instance with its configuration file:

   mongod --config /etc/mongodb.conf

For more information about configuration files, see http://docs.mongodb.org/manual/reference/configuration-options.
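Once the instance restarts, you can confirm the setting took effect by checking the server log for the FIPS activation message, as the next section describes. A minimal sketch (illustrative only; the log path is a hypothetical example, while the message text is the one mongod prints):

```javascript
// Sketch: check mongod log text for the FIPS activation message.
// The path shown in the usage comment is hypothetical.
function fipsActive(logText) {
  return logText.includes("FIPS 140-2 mode activated");
}

// Hypothetical usage:
// const fs = require("fs");
// fipsActive(fs.readFileSync("/var/log/mongodb/mongod.log", "utf8"));
```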
Confirm FIPS mode is running. Check the server log file for a message that FIPS is active:

   FIPS 140-2 mode activated

6.3.3 Security Deployment Tutorials

The following tutorials provide information on deploying MongoDB with authentication and authorization.

Deploy Replica Set and Configure Authentication and Authorization (page 313) Configure a replica set that has authentication enabled.

Deploy Replica Set and Configure Authentication and Authorization

Overview

With authentication (page 282) enabled, MongoDB forces all clients to identify themselves before granting access to the server. Authorization (page 285), in turn, allows administrators to define and limit the resources and operations that a user can access. Using authentication and authorization is a key part of a complete security strategy.

All MongoDB deployments support authentication. By default, MongoDB does not require authorization checking. You can enforce authorization checking when deploying MongoDB, or on an existing deployment; however, you cannot enable authorization checking on a running deployment without downtime.

This tutorial provides a procedure for creating a MongoDB replica set (page 503) that uses the challenge-response authentication mechanism. The tutorial includes creation of a minimal authorization system to support basic operations.

Considerations

Authentication

In this procedure, you will configure MongoDB using the default challenge-response authentication mechanism, using the keyFile to supply the password for inter-process authentication (page 284). The content of the key file is the shared secret used for all internal authentication.

All deployments that enforce authorization checking should have one user administrator who can create new users and modify existing users. During this procedure you will create a user administrator that you will use to administer this deployment.
Architecture

In production, deploy each member of the replica set to its own machine and, if possible, bind to the standard MongoDB port of 27017. Use the bind_ip option to ensure that MongoDB listens for connections from applications on configured addresses.

For geographically distributed replica sets, ensure that the majority of the set's mongod instances reside in the primary site.

See Replica Set Deployment Architectures (page 516) for more information.

Connectivity

Ensure that network traffic can pass securely and efficiently between all members of the set and all clients in the network. Consider the following:

• Establish a virtual private network. Ensure that your network topology routes all traffic between members within a single site over the local area network.
• Configure access control to prevent connections from unknown clients to the replica set.
• Configure networking and firewall rules so that incoming and outgoing packets are permitted only on the default MongoDB port and only from within your deployment.
Finally, ensure that each member of the replica set is accessible by way of resolvable DNS or hostnames. Either configure your DNS names appropriately or set up your systems' /etc/hosts files to reflect this configuration.

Configuration

Specify the run-time configuration for each system in a configuration file stored in /etc/mongodb.conf or a related location. Create the directory where MongoDB stores data files before deploying MongoDB.

For more information about the run-time options used above and other configuration options, see http://docs.mongodb.org/manual/reference/configuration-options.

Procedure

This procedure deploys a replica set in which all members use the same key file.

Step 1: Start one member of the replica set. This mongod should not enable auth.

Step 2: Create administrative users. The following operations create two users: a user administrator that will be able to create and modify users (siteUserAdmin), and a root (page 368) user (siteRootAdmin) that you will use to complete the remainder of the tutorial:

   use admin
   db.createUser( {
       user: "siteUserAdmin",
       pwd: "<password>",
       roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
   });
   db.createUser( {
       user: "siteRootAdmin",
       pwd: "<password>",
       roles: [ { role: "root", db: "admin" } ]
   });

Step 3: Stop the mongod instance.

Step 4: Create the key file to be used by each member of the replica set. Create the key file your deployment will use to authenticate servers to each other. To generate pseudo-random data to use for a keyfile, issue the following openssl command:

   openssl rand -base64 741 > mongodb-keyfile
   chmod 600 mongodb-keyfile

You may generate a key file using any method you choose. Always ensure that the password stored in the key file is both long and contains a high amount of entropy. Using openssl in this manner helps generate such a key.

Step 5: Copy the key file to each member of the replica set.
Copy the mongodb-keyfile to all hosts where components of the MongoDB deployment run. Set the permissions of these files to 600 so that only the owner of the file can read or write it, to prevent other users on the system from accessing the shared secret.
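The openssl rand -base64 741 command above produces content satisfying mongod's key-file rules, which (per the MongoDB key-file documentation, stated here as an assumption) require 6 to 1024 characters from the base64 character set, with whitespace ignored. A small illustrative check of those constraints:

```javascript
// Sketch of mongod's key-file content rules (6-1024 base64 characters,
// whitespace stripped). Illustrative only -- mongod validates at startup.
function isPlausibleKeyFile(text) {
  const stripped = text.replace(/\s+/g, ""); // whitespace is ignored
  if (stripped.length < 6 || stripped.length > 1024) return false;
  return /^[A-Za-z0-9+/=]+$/.test(stripped);
}
```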
Step 6: Start each member of the replica set with the appropriate options. For each member, start a mongod and specify the key file and the name of the replica set. Also specify other parameters as needed for your deployment; for replication-specific parameters, see the mongod replica set command-line options.

If your application connects to more than one replica set, each set should have a distinct name. Some drivers group replica set connections by replica set name.

The following example specifies parameters through the --keyFile and --replSet command-line options:

   mongod --keyFile /mysecretdirectory/mongodb-keyfile --replSet "rs0"

The following example specifies parameters through a configuration file:

   mongod --config $HOME/.mongodb/config

In production deployments, you can configure a control script to manage this process. Control scripts are beyond the scope of this document.

Step 7: Connect to the member of the replica set where you created the administrative users. Connect to the replica set member you started and authenticate as the siteRootAdmin user. From the mongo shell, use the following operation to authenticate:

   use admin
   db.auth("siteRootAdmin", "<password>");

Step 8: Initiate the replica set. Use rs.initiate():

   rs.initiate()

MongoDB initiates a set that consists of the current member and that uses the default replica set configuration.

Step 9: Verify the initial replica set configuration. Use rs.conf() to display the replica set configuration object (page 594):

   rs.conf()

The replica set configuration object resembles the following:

   {
      "_id" : "rs0",
      "version" : 1,
      "members" : [
         {
            "_id" : 1,
            "host" : "mongodb0.example.net:27017"
         }
      ]
   }

Step 10: Add the remaining members to the replica set. Add the remaining members with the rs.add() method. The following example adds two members:

   rs.add("mongodb1.example.net")
   rs.add("mongodb2.example.net")

When complete, you have a fully functional replica set.
The new replica set will elect a primary.
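The way the configuration object grows across steps 8-10 can be modeled with a small sketch (illustrative only; the real rs.add() helper reconfigures the set server-side, and the hostnames are this tutorial's examples; the assumption that each reconfiguration increments the version field reflects normal replica set reconfiguration behavior):

```javascript
// Model of the replica set configuration document evolving as members
// are added. Not MongoDB code -- a sketch of the resulting documents.
function initiate(selfHost) {
  return { _id: "rs0", version: 1, members: [ { _id: 1, host: selfHost } ] };
}

function addMember(conf, host) {
  const nextId = Math.max(...conf.members.map(m => m._id)) + 1;
  return {
    ...conf,
    version: conf.version + 1, // each reconfiguration bumps the version
    members: [...conf.members, { _id: nextId, host: host }],
  };
}
```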
Step 11: Check the status of the replica set. Use the rs.status() operation:

   rs.status()

Step 12: Create additional users to address operational requirements. You can use built-in roles (page 361) to create common types of database users, such as the dbOwner (page 363) role to create a database administrator, the readWrite (page 362) role to create a user who can update data, or the read (page 362) role to create a user who can read data but not modify it. You can also define custom roles (page 286).

For example, the following creates a database administrator for the products database:

   use products
   db.createUser( {
       user: "productsDBAdmin",
       pwd: "password",
       roles: [ { role: "dbOwner", db: "products" } ]
   } )

For an overview of roles and privileges, see Authorization (page 285). For more information on adding users, see Add a User to a Database (page 344).

6.3.4 Access Control Tutorials

The following tutorials provide instructions for MongoDB's authentication and authorization related features.

Enable Client Access Control (page 317) Describes the process for enabling authentication for MongoDB deployments.

Enable Authentication in a Sharded Cluster (page 318) Control access to a sharded cluster through a key file and the keyFile setting on each of the cluster's components.

Enable Authentication after Creating the User Administrator (page 319) Describes an alternative process for enabling authentication for MongoDB deployments.

Use x.509 Certificates to Authenticate Clients (page 320) Use x.509 for client authentication.

Use x.509 Certificate for Membership Authentication (page 323) Use x.509 for internal member authentication for replica sets and sharded clusters.

Authenticate Using SASL and LDAP with ActiveDirectory (page 326) Describes the process for authentication using SASL/LDAP with ActiveDirectory.
Authenticate Using SASL and LDAP with OpenLDAP (page 329) Describes the process for authentication using SASL/LDAP with OpenLDAP.

Configure MongoDB with Kerberos Authentication on Linux (page 331) For MongoDB Enterprise on Linux, describes the process to enable Kerberos-based authentication for MongoDB deployments.

Configure MongoDB with Kerberos Authentication on Windows (page 334) For MongoDB Enterprise for Windows, describes the process to enable Kerberos-based authentication for MongoDB deployments.
Authenticate to a MongoDB Instance or Cluster (page 336) Describes the process for authenticating to MongoDB systems using the mongo shell.

Generate a Key File (page 338) Use a key file to allow the components of a MongoDB sharded cluster or replica set to mutually authenticate.

Troubleshoot Kerberos Authentication on Linux (page 338) Steps to troubleshoot Kerberos-based authentication for MongoDB deployments.

Implement Field Level Redaction (page 340) Describes the process to set up and access document content that can have different access levels for the same data.

Enable Client Access Control

Overview

Enabling access control on a MongoDB instance restricts access to the instance by requiring that users identify themselves when connecting. In this procedure, you enable access control and then create the instance's first user, which must be a user administrator. The user administrator grants further access to the instance by creating additional users.

Considerations

If you create the user administrator before enabling access control, MongoDB disables the localhost exception (page 285). In that case, you must use the "Enable Authentication after Creating the User Administrator (page 319)" procedure to enable access control.

This procedure uses the localhost exception (page 285) to allow you to create the first user after enabling authentication. See Localhost Exception (page 285) and Authentication (page 282) for more information.

Procedure

Step 1: Start the MongoDB instance with authentication enabled. Start the mongod or mongos instance with the authorization or keyFile setting. Use authorization on a standalone instance. Use keyFile on an instance in a replica set or sharded cluster.
For example, to start a mongod with authentication enabled and a key file stored in /private/var, first set the following option in the mongod's configuration file:

   security:
      keyFile: /private/var/key.pem

Then start the mongod and specify the config file. For example:

   mongod --config /etc/mongodb/mongodb.conf

After you enable authentication, only the user administrator can connect to the MongoDB instance. The user administrator must log in and grant further access to the instance by creating additional users.

Step 2: Connect to the MongoDB instance via the localhost exception. Connect to the MongoDB instance from a client running on the same system. This access is made possible by the localhost exception (page 285).
Step 3: Create the system user administrator. Add the user with the userAdminAnyDatabase (page 368) role, and only that role. The following example creates the user siteUserAdmin on the admin database:

   use admin
   db.createUser( {
       user: "siteUserAdmin",
       pwd: "password",
       roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
   } )

After you create the user administrator, the localhost exception (page 285) is no longer available.

Step 4: Create additional users. Log in with the user administrator's credentials and create additional users. See Add a User to a Database (page 344).

Next Steps

If you need to disable access control for any reason, restart the process without the authorization or keyFile setting.

Enable Authentication in a Sharded Cluster

New in version 2.0: Support for authentication with sharded clusters.

Overview

When authentication is enabled on a sharded cluster, every client that accesses the cluster must provide credentials. This includes MongoDB instances that access each other within the cluster.

To enable authentication on a sharded cluster, you must enable authentication individually on each component of the cluster. This means enabling authentication on each mongos and each mongod, including each config server and all members of a shard's replica set.

Authentication requires an authentication mechanism and, in most cases, a key file. The content of the key file must be the same on all cluster members.

Procedure

Step 1: Create a key file. Create the key file your deployment will use to authenticate servers to each other. To generate pseudo-random data to use for a keyfile, issue the following openssl command:

   openssl rand -base64 741 > mongodb-keyfile
   chmod 600 mongodb-keyfile
You may generate a key file using any method you choose. Always ensure that the password stored in the key file is both long and contains a high amount of entropy. Using openssl in this manner helps generate such a key.

Step 2: Enable authentication on each component in the cluster. On each mongos and mongod in the cluster, including all config servers and shards, specify the key file using one of the following approaches:

Specify the key file in the configuration file. In the configuration file, set the keyFile option to the key file's path and then start the component, as in the following example:

   security:
      keyFile: /srv/mongodb/keyfile

Specify the key file at runtime. When starting the component, set the --keyFile option, which is an option for both mongos instances and mongod instances. Set --keyFile to the key file's path.

The keyFile setting implies the authorization setting, which means in most cases you do not need to set authorization explicitly.

Step 3: Add users. While connected to a mongos, add the first administrative user and then add subsequent users. See Create a User Administrator (page 343).

Related Documents

• Authentication (page 282)
• Security (page 279)
• Use x.509 Certificate for Membership Authentication (page 323)

Enable Authentication after Creating the User Administrator

Overview

Enabling authentication on a MongoDB instance restricts access to the instance by requiring that users identify themselves when connecting. In this procedure, you create the instance's first user, which must be a user administrator, and then enable authentication. You can then authenticate as the user administrator to create additional users and grant additional access to the instance.

This procedure outlines how to enable authentication after creating the user administrator. The approach requires a restart. To enable authentication without restarting, see Enable Client Access Control (page 317).
Considerations

This document outlines a procedure for enabling authentication on a MongoDB instance where you create the first user on an existing MongoDB system that does not require authentication, before restarting the instance and requiring authentication. You can instead use the localhost exception (page 285) to gain access to a system with no users and authentication enabled. See Enable Client Access Control (page 317) for the description of that procedure.
Procedure

Step 1: Start the MongoDB instance without authentication. Start the mongod or mongos instance without the authorization or keyFile setting. For example:

   mongod --port 27017 --dbpath /data/db1

For details on starting a mongod or mongos, see Manage mongod Processes (page 207) or Deploy a Sharded Cluster (page 635).

Step 2: Create the system user administrator. Add the user with the userAdminAnyDatabase (page 368) role, and only that role. The following example creates the user siteUserAdmin on the admin database:

   use admin
   db.createUser( {
       user: "siteUserAdmin",
       pwd: "password",
       roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
   } )

Step 3: Re-start the MongoDB instance with authentication enabled. Re-start the mongod or mongos instance with the authorization or keyFile setting. Use authorization on a standalone instance. Use keyFile on an instance in a replica set or sharded cluster.

The following example enables authentication on a standalone mongod using the authorization command-line option:

   mongod --auth --config /etc/mongodb/mongodb.conf

Step 4: Create additional users. Log in with the user administrator's credentials and create additional users. See Add a User to a Database (page 344).

Next Steps

If you need to disable authentication for any reason, restart the process without the authorization or keyFile option.

Use x.509 Certificates to Authenticate Clients

New in version 2.6.

MongoDB supports x.509 certificate authentication for use with a secure SSL connection (page 304). x.509 client authentication allows clients to authenticate to servers with certificates (page 321) rather than with a username and password.
To use x.509 authentication for the internal authentication of replica set or sharded cluster members, see Use x.509 Certificate for Membership Authentication (page 323).

Client x.509 Certificate

The client certificate must have the following properties:

• A single Certificate Authority (CA) must issue the certificates for both the client and the server.

• Client certificates must contain the following fields:

   keyUsage = digitalSignature
   extendedKeyUsage = clientAuth

• A client x.509 certificate's subject, which contains the Distinguished Name (DN), must differ from that of a Member x.509 Certificate (page 323) to prevent client certificates from identifying the client as a cluster member and granting full permission on the system. Specifically, the subjects must differ with regard to at least one of the following attributes: Organization (O), Organizational Unit (OU), or Domain Component (DC).

• Each unique MongoDB user must have a unique certificate.

Configure MongoDB Server

Use Command-line Options

You can configure the MongoDB server from the command line, e.g.:

   mongod --sslMode requireSSL --sslPEMKeyFile <path to SSL certificate and key PEM file> --sslCAFile <path to root CA PEM file>

Warning: If the --sslCAFile option and its target file are not specified, x.509 client and member authentication will not function. mongod, and mongos in sharded systems, will not be able to verify the certificates of processes connecting to it against the trusted certificate authority (CA) that issued them, breaking the certificate chain. As of version 2.6.4, mongod will not start with x.509 authentication enabled if the CA file is not specified.

Use Configuration File

You may also specify these options in the configuration file.
Starting in MongoDB 2.6, you can specify the configuration for MongoDB in YAML format, e.g.:

   net:
      ssl:
         mode: requireSSL
         PEMKeyFile: <path to SSL certificate and key PEM file>
         CAFile: <path to root CA PEM file>

For backwards compatibility, you can also specify the configuration using the older configuration file format (http://docs.mongodb.org/v2.4/reference/configuration), e.g.:

   sslMode = requireSSL
   sslPEMKeyFile = <path to SSL certificate and key PEM file>
   sslCAFile = <path to the root CA PEM file>

Include any additional options, SSL or otherwise, that are required for your specific configuration.
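The requirement above that a client certificate's subject differ from member certificates in at least one of O, OU, or DC can be sketched as a check (illustrative only; mongod enforces this internally, and the DNs below are hypothetical examples):

```javascript
// Sketch: a client DN whose O, OU, and DC multisets all equal a member
// DN's would be treated as a cluster member -- a misconfiguration.
function attrValues(dn, key) {
  return dn.split(",").map(s => s.trim())
           .filter(s => s.startsWith(key + "=")).sort().join("|");
}

function looksLikeMember(clientDN, memberDN) {
  // identical O, OU, and DC values => indistinguishable from a member subject
  return ["O", "OU", "DC"].every(k => attrValues(clientDN, k) === attrValues(memberDN, k));
}
```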
Add x.509 Certificate subject as a User

To authenticate with a client certificate, you must first add the value of the subject from the client certificate as a MongoDB user. Each unique x.509 client certificate corresponds to a single MongoDB user; i.e. you cannot use a single client certificate to authenticate more than one MongoDB user.

1. You can retrieve the subject from the client certificate with the following command:

      openssl x509 -in <pathToClient PEM> -inform PEM -subject -nameopt RFC2253

   The command returns the subject string as well as the certificate:

      subject= CN=myName,OU=myOrgUnit,O=myOrg,L=myLocality,ST=myState,C=myCountry
      -----BEGIN CERTIFICATE-----
      # ...
      -----END CERTIFICATE-----

2. Add the value of the subject, omitting the spaces, from the certificate as a user. For example, in the mongo shell, to add the user with both the readWrite role in the test database and the userAdminAnyDatabase role, which is defined only in the admin database:

      db.getSiblingDB("$external").runCommand( {
          createUser: "CN=myName,OU=myOrgUnit,O=myOrg,L=myLocality,ST=myState,C=myCountry",
          roles: [
              { role: 'readWrite', db: 'test' },
              { role: 'userAdminAnyDatabase', db: 'admin' }
          ],
          writeConcern: { w: "majority", wtimeout: 5000 }
      } )

   In the above example, to add the user with the readWrite role in the test database, the role specification document specified 'test' in the db field. To add the userAdminAnyDatabase role for the user, the above example specified 'admin' in the db field.

   Note: Some roles are defined only in the admin database, including: clusterAdmin, readAnyDatabase, readWriteAnyDatabase, dbAdminAnyDatabase, and userAdminAnyDatabase. To add a user with these roles, specify 'admin' in the db. See Add a User to a Database (page 344) for details on adding a user with roles.

Authenticate with an x.509 Certificate

To authenticate with a client certificate, you must first add a MongoDB user that corresponds to the client certificate.
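The subject-to-username transformation described above ("add the value of the subject, omitting the spaces") can be sketched as a small helper (hypothetical helper, not part of MongoDB; the DN is the tutorial's example):

```javascript
// Sketch: turn an openssl "-subject" output line into the string stored
// as the MongoDB user name.
function subjectToUsername(opensslSubjectLine) {
  return opensslSubjectLine
    .replace(/^subject=\s*/, "") // drop the "subject= " prefix openssl prints
    .replace(/,\s+/g, ",");      // omit the spaces between RDNs
}
```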
See Add x.509 Certificate subject as a User (page 322).

To authenticate, use the db.auth() method in the $external database, specifying "MONGODB-X509" for the mechanism field, and the user that corresponds to the client certificate (page 322) for the user field.

For example, if using the mongo shell:

1. Connect the mongo shell to the mongod set up for SSL:

      mongo --ssl --sslPEMKeyFile <path to CA signed client PEM file> --sslCAFile <path to root CA PEM file>
2. To perform the authentication, use the db.auth() method in the $external database. For the mechanism field, specify "MONGODB-X509", and for the user field, specify the user, or the subject, that corresponds to the client certificate:

      db.getSiblingDB("$external").auth(
         {
            mechanism: "MONGODB-X509",
            user: "CN=myName,OU=myOrgUnit,O=myOrg,L=myLocality,ST=myState,C=myCountry"
         }
      )

Use x.509 Certificate for Membership Authentication

New in version 2.6.

MongoDB supports x.509 certificate authentication for use with a secure SSL connection (page 304). Sharded cluster members and replica set members can use x.509 certificates to verify their membership to the cluster or the replica set instead of using keyfiles (page 282). Membership authentication is an internal process.

For client authentication with x.509, see Use x.509 Certificates to Authenticate Clients (page 320).

Member x.509 Certificate

The member certificate, used for internal authentication to verify membership to the sharded cluster or a replica set, must have the following properties:

• A single Certificate Authority (CA) must issue all the x.509 certificates for the members of a sharded cluster or a replica set.

• The Distinguished Name (DN), found in the member certificate's subject, must specify a non-empty value for at least one of the following attributes: Organization (O), Organizational Unit (OU), or Domain Component (DC).

• The Organization attributes (O's), the Organizational Unit attributes (OU's), and the Domain Components (DC's) must match those from the certificates for the other cluster members. To match, the certificate must match all specifications of these attributes, including the non-specification of these attributes. The order of the attributes does not matter.

  In the following example, the two DN's contain matching specifications for O and OU, as well as the non-specification of the DC attribute:
      CN=host1,OU=Dept1,O=MongoDB,ST=NY,C=US
      C=US, ST=CA, O=MongoDB, OU=Dept1, CN=host2

  However, the following two DN's contain a mismatch for the OU attribute, since one contains two OU specifications and the other only one:

      CN=host1,OU=Dept1,OU=Sales,O=MongoDB
      CN=host2,OU=Dept1,O=MongoDB

• Either the Common Name (CN) or one of the Subject Alternative Name (SAN) entries must match the hostname of the server used by the other members of the cluster. For example, the certificates for a cluster could have the following subjects:

      subject= CN=<myhostname1>,OU=Dept1,O=MongoDB,ST=NY,C=US
      subject= CN=<myhostname2>,OU=Dept1,O=MongoDB,ST=NY,C=US
      subject= CN=<myhostname3>,OU=Dept1,O=MongoDB,ST=NY,C=US
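The matching rule above (O, OU, and DC values must agree as multisets, in any order, including the count of repeated attributes) can be sketched as follows, using the example DNs from this section. Illustrative only; mongod performs this check internally:

```javascript
// Sketch of member-certificate DN matching: compare the O, OU, and DC
// attributes of two DNs as sorted multisets.
function clusterAttrs(dn) {
  return dn.split(",")
    .map(s => s.trim())
    .filter(s => /^(O|OU|DC)=/.test(s))
    .sort();
}

function membersMatch(dn1, dn2) {
  const a = clusterAttrs(dn1), b = clusterAttrs(dn2);
  return a.length === b.length && a.every((v, i) => v === b[i]);
}
```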
It is possible to use a single x.509 certificate for both member authentication and x.509 client authentication (page 320). To do so, obtain a certificate with both clientAuth and serverAuth (i.e. "TLS Web Client Authentication" and "TLS Web Server Authentication") specified as Extended Key Usage (EKU) values, or simply do not specify any EKU values. Provide this file as the --sslPEMKeyFile and omit the --sslClusterFile option described below.

Configure Replica Set/Sharded Cluster

Use Command-line Options

To specify the x.509 certificate for internal cluster member authentication, append the additional SSL options --clusterAuthMode and --sslClusterFile, as in the following example for a member of a replica set:

   mongod --replSet <name> --sslMode requireSSL --clusterAuthMode x509 --sslClusterFile <path to membership certificate and key PEM file>

Include any additional options, SSL or otherwise, that are required for your specific configuration. For instance, if the membership key is encrypted, set the --sslClusterPassword to the passphrase to decrypt the key or have MongoDB prompt for the passphrase. See SSL Certificate Passphrase (page 306) for details.

Warning: If the --sslCAFile option and its target file are not specified, x.509 client and member authentication will not function. mongod, and mongos in sharded systems, will not be able to verify the certificates of processes connecting to it against the trusted certificate authority (CA) that issued them, breaking the certificate chain. As of version 2.6.4, mongod will not start with x.509 authentication enabled if the CA file is not specified.

Use Configuration File

You may also specify these options in the configuration file.
YAML Formatted Configuration File. Starting in MongoDB 2.6, you can specify the configuration for MongoDB in YAML format, as in the following example:

   security:
      clusterAuthMode: x509
   net:
      ssl:
         mode: requireSSL
         PEMKeyFile: <path to SSL certificate and key PEM file>
         CAFile: <path to root CA PEM file>
         clusterFile: <path to x.509 membership certificate and key PEM file>

See security.clusterAuthMode, net.ssl.mode, net.ssl.PEMKeyFile, net.ssl.CAFile, and net.ssl.clusterFile for more information on the settings.

v2.4 Configuration File. For backwards compatibility, you can also specify the configuration using the v2.4 configuration file format (http://docs.mongodb.org/v2.4/reference/configuration), as in the following example:

   sslMode = requireSSL
   sslPEMKeyFile = <path to SSL certificate and key PEM file>
   sslCAFile = <path to root CA PEM file>
   clusterAuthMode = x509
   sslClusterFile = <path to membership certificate and key PEM file>
Upgrade from Keyfile Authentication to x.509 Authentication

To upgrade clusters that are currently using keyfile authentication to x.509 authentication, use a rolling upgrade process.

Clusters Currently Using SSL

For clusters using SSL and keyfile authentication, to upgrade to x.509 cluster authentication, use the following rolling upgrade process:

1. For each node of the cluster, start the node with the option --clusterAuthMode set to sendKeyFile and the option --sslClusterFile set to the appropriate path of the node's certificate. Include other SSL options (page 304) as well as any other options that are required for your specific configuration. For example:

      mongod --replSet <name> --sslMode requireSSL --clusterAuthMode sendKeyFile --sslClusterFile <path to membership certificate and key PEM file>

   With this setting, each node continues to use its keyfile to authenticate itself as a member. However, each node can now accept either a keyfile or an x.509 certificate from other members to authenticate those members. Upgrade all nodes of the cluster to this setting.

2. Then, for each node of the cluster, connect to the node and use the setParameter command to update the clusterAuthMode to sendX509. (As an alternative to using the setParameter command, you can also restart the nodes with the appropriate SSL and x509 options and values.) For example:

      db.getSiblingDB('admin').runCommand( { setParameter: 1, clusterAuthMode: "sendX509" } )

   With this setting, each node uses its x.509 certificate, specified with the --sslClusterFile option in the previous step, to authenticate itself as a member. However, each node continues to accept either a keyfile or an x.509 certificate from other members to authenticate those members. Upgrade all nodes of the cluster to this setting.

3. Optional but recommended. Finally, for each node of the cluster, connect to the node and use the setParameter command to update the clusterAuthMode to x509 to use only the x.509 certificate for authentication. For example:

      db.getSiblingDB('admin').runCommand( { setParameter: 1, clusterAuthMode: "x509" } )
After the upgrade of all nodes, edit the configuration file with the appropriate x.509 settings to ensure that upon subsequent restarts, the cluster uses x.509 authentication.

See --clusterAuthMode for the various modes and their descriptions.

Clusters Currently Not Using SSL

For clusters using keyfile authentication but not SSL, to upgrade to x.509 authentication, use the following rolling upgrade process:

1. For each node of a cluster, start the node with the option --sslMode set to allowSSL, the option --clusterAuthMode set to sendKeyFile and the option --sslClusterFile set to the appropriate path of the node’s certificate. Include other SSL options (page 304) as well as any other options that are required for your specific configuration. For example:

mongod --replSet <name> --sslMode allowSSL --clusterAuthMode sendKeyFile --sslClusterFile <path to membership certificate and key PEM file>

The --sslMode allowSSL setting allows the node to accept both SSL and non-SSL incoming connections. Its outgoing connections do not use SSL.

The --clusterAuthMode sendKeyFile setting allows each node to continue to use its keyfile to authenticate itself as a member. However, each node can now accept either a keyfile or an x.509 certificate from other members to authenticate those members.

46 As an alternative to using the setParameter command, you can also restart the nodes with the appropriate SSL and x509 options and values.

6.3. Security Tutorials 325
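Both rolling procedures work because each transitional mode accepts the old credential or transport as well as the new one, so a cluster mixing consecutive modes never partitions. The following Python sketch is an illustrative model (not MongoDB code) of the send/accept rules described in these steps; it checks that every documented step keeps a mixed cluster compatible, and that skipping the intermediate phases would not:

```python
# Illustrative model (not MongoDB code): check that each rolling-upgrade
# step leaves every pair of cluster members able to connect and
# authenticate, because the transitional modes accept both old and new.

CLUSTER_AUTH = {  # what each clusterAuthMode sends / accepts from peers
    "sendKeyFile": {"sends": "keyfile", "accepts": {"keyfile", "x509"}},
    "sendX509":    {"sends": "x509",    "accepts": {"keyfile", "x509"}},
    "x509":        {"sends": "x509",    "accepts": {"x509"}},
}

SSL = {  # transport each sslMode uses outgoing / accepts incoming
    "allowSSL":   {"sends": "plain", "accepts": {"ssl", "plain"}},
    "preferSSL":  {"sends": "ssl",   "accepts": {"ssl", "plain"}},
    "requireSSL": {"sends": "ssl",   "accepts": {"ssl"}},
}

def pair_ok(table, sender, receiver):
    """True if the receiver accepts what the sender presents."""
    return table[sender]["sends"] in table[receiver]["accepts"]

def step_ok(table, old, new):
    """All ordered pairs in a mixed old/new cluster must be compatible."""
    modes = [old, new]
    return all(pair_ok(table, s, r) for s in modes for r in modes)

# The documented step order keeps every mixed phase compatible:
assert step_ok(CLUSTER_AUTH, "sendKeyFile", "sendX509")
assert step_ok(CLUSTER_AUTH, "sendX509", "x509")
assert step_ok(SSL, "allowSSL", "preferSSL")
assert step_ok(SSL, "preferSSL", "requireSSL")

# Skipping the intermediate phases would break the cluster mid-upgrade:
assert not step_ok(CLUSTER_AUTH, "sendKeyFile", "x509")
assert not step_ok(SSL, "allowSSL", "requireSSL")
```

The model makes the ordering constraint explicit: a node still sending a keyfile cannot talk to a node that only accepts x.509, and a node with plain outgoing connections cannot talk to a requireSSL node, which is why each procedure passes through the send*/preferSSL phases first.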
Upgrade all nodes of the cluster to these settings.

2. Then, for each node of a cluster, connect to the node and use the setParameter command to update the sslMode to preferSSL and the clusterAuthMode to sendX509. For example:

db.getSiblingDB('admin').runCommand( { setParameter: 1, sslMode: "preferSSL", clusterAuthMode: "sendX509" } )

With the sslMode set to preferSSL, the node accepts both SSL and non-SSL incoming connections, and its outgoing connections use SSL.

With the clusterAuthMode set to sendX509, each node uses its x.509 certificate, specified with the --sslClusterFile option in the previous step, to authenticate itself as a member. However, each node continues to accept either a keyfile or an x.509 certificate from other members to authenticate those members.

Upgrade all nodes of the cluster to these settings.

3. Optional but recommended. Finally, for each node of the cluster, connect to the node and use the setParameter command to update the sslMode to requireSSL and the clusterAuthMode to x509. For example:

db.getSiblingDB('admin').runCommand( { setParameter: 1, sslMode: "requireSSL", clusterAuthMode: "x509" } )

With the sslMode set to requireSSL, the node only uses SSL connections. With the clusterAuthMode set to x509, the node only uses the x.509 certificate for authentication.

4. After the upgrade of all nodes, edit the configuration file with the appropriate SSL and x.509 settings to ensure that upon subsequent restarts, the cluster uses x.509 authentication.

See --clusterAuthMode for the various modes and their descriptions.

Authenticate Using SASL and LDAP with ActiveDirectory

MongoDB Enterprise provides support for proxy authentication of users. This allows administrators to configure a MongoDB cluster to authenticate users by proxying authentication requests to a specified Lightweight Directory Access Protocol (LDAP) service.

Considerations

MongoDB Enterprise for Windows does not include LDAP support for authentication.
However, MongoDB Enterprise for Linux supports using LDAP authentication with an ActiveDirectory server.

MongoDB does not support LDAP authentication in mixed sharded cluster deployments that contain both version 2.4 and version 2.6 shards. See Upgrade MongoDB to 2.6 (page 751) for upgrade instructions.

Use secure encrypted or trusted connections between clients and the server, as well as between saslauthd and the LDAP server. The LDAP server uses the SASL PLAIN mechanism, sending and receiving data in plain text. You should use only a trusted channel such as a VPN, a connection encrypted with SSL, or a trusted wired network.

Configure saslauthd

LDAP support for user authentication requires proper configuration of the saslauthd daemon process as well as the MongoDB server.
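The saslauthd configuration assembled in the steps below is a short list of `name: value` options. As an aid, the following Python sketch renders the ActiveDirectory option set used in this tutorial; the helper is hypothetical (not part of MongoDB or Cyrus SASL) and simply produces the file content shown in Step 3:

```python
# Hypothetical helper (illustration only): render the saslauthd.conf
# options this tutorial uses for an ActiveDirectory server.

def render_saslauthd_conf(ldap_uri):
    """Return saslauthd.conf content for the ActiveDirectory setup."""
    options = [
        ("ldap_servers", ldap_uri),      # URI of the LDAP server
        ("ldap_use_sasl", "yes"),
        ("ldap_mech", "DIGEST-MD5"),
        ("ldap_auth_method", "fastbind"),
    ]
    return "\n".join("%s: %s" % (name, value) for name, value in options) + "\n"

print(render_saslauthd_conf("ldaps://ad.example.net"))
```

Writing the returned string to /etc/saslauthd.conf (or the path given with saslauthd's -O option) reproduces the configuration shown in Step 3 below.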
Step 1: Specify the mechanism. On systems that configure saslauthd with the /etc/sysconfig/saslauthd file, such as Red Hat Enterprise Linux, Fedora, CentOS, and Amazon Linux AMI, set the mechanism MECH to ldap:

MECH=ldap

On systems that configure saslauthd with the /etc/default/saslauthd file, such as Ubuntu, set the MECHANISMS option to ldap:

MECHANISMS="ldap"

Step 2: Adjust caching behavior. On certain Linux distributions, saslauthd starts with the caching of authentication credentials enabled. Until restarted or until the cache expires, saslauthd will not contact the LDAP server to re-authenticate users in its authentication cache. This allows saslauthd to successfully authenticate users in its cache, even if the LDAP server is down or if the cached users’ credentials are revoked.

To set the expiration time (in seconds) for the authentication cache, see the -t option47 of saslauthd.

Step 3: Configure LDAP Options with ActiveDirectory. If the saslauthd.conf file does not exist, create it. The saslauthd.conf file usually resides in the /etc folder. If specifying a different file path, see the -O option48 of saslauthd.

To use with ActiveDirectory, start saslauthd with the following configuration options set in the saslauthd.conf file:

ldap_servers: <ldap uri>
ldap_use_sasl: yes
ldap_mech: DIGEST-MD5
ldap_auth_method: fastbind

For the <ldap uri>, specify the URI of the LDAP server. For example, ldap_servers: ldaps://ad.example.net.

For more information on saslauthd configuration, see http://www.openldap.org/doc/admin24/guide.html#Configuringsaslauthd.

Step 4: Test the saslauthd configuration. Use the testsaslauthd utility to test the saslauthd configuration. For example:

testsaslauthd -u testuser -p testpassword -f /var/run/saslauthd/mux

Configure MongoDB

Step 1: Add user to MongoDB for authentication. Add the user to the $external database in MongoDB.
To specify the user’s privileges, assign roles (page 285) to the user. For example, the following adds a user with read-only access to the records database.

db.getSiblingDB("$external").createUser(
    {
      user : <username>,
      roles: [ { role: "read", db: "records" } ]
    }
)

47 http://www.linuxcommand.org/man_pages/saslauthd8.html
48 http://www.linuxcommand.org/man_pages/saslauthd8.html
Add additional principals as needed. For more information about creating and managing users, see http://docs.mongodb.org/manual/reference/command/nav-user-management.

Step 2: Configure MongoDB server. To configure the MongoDB server to use the saslauthd instance for proxy authentication, start the mongod with the following options:

• --auth,
• authenticationMechanisms parameter set to PLAIN, and
• saslauthdPath parameter set to the path to the Unix-domain Socket of the saslauthd instance.

Configure the MongoDB server using either the command line option --setParameter or the configuration file. Specify additional configurations as appropriate for your configuration.

If you use the authorization option to enforce authentication, you will need privileges to create a user.

Use specific saslauthd socket path. For socket path of /<some>/<path>/saslauthd, set the saslauthdPath to /<some>/<path>/saslauthd/mux, as in the following command line example:

mongod --auth --setParameter saslauthdPath=/<some>/<path>/saslauthd/mux --setParameter authenticationMechanisms=PLAIN

Or if using a configuration file, specify the following parameters in the file:

auth=true
setParameter=saslauthdPath=/<some>/<path>/saslauthd/mux
setParameter=authenticationMechanisms=PLAIN

Use default Unix-domain socket path. To use the default Unix-domain socket path, set the saslauthdPath to the empty string "", as in the following command line example:

mongod --auth --setParameter saslauthdPath="" --setParameter authenticationMechanisms=PLAIN

Or if using a configuration file, specify the following parameters in the file:

auth=true
setParameter=saslauthdPath=""
setParameter=authenticationMechanisms=PLAIN

Step 3: Authenticate the user in the mongo shell. To perform the authentication in the mongo shell, use the db.auth() method in the $external database.
Specify the value "PLAIN" in the mechanism field, the user and password in the user and pwd fields respectively, and the value false in the digestPassword field. You must specify false for digestPassword since the server must receive an undigested password to forward on to saslauthd, as in the following example:

db.getSiblingDB("$external").auth(
    {
      mechanism: "PLAIN",
      user: <username>,
      pwd: <cleartext password>,
      digestPassword: false
    }
)

The server forwards the password in plain text. In general, use only on a trusted channel (VPN, SSL, trusted wired network). See Considerations.
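The auth document above has a fixed shape; the Python sketch below builds it as a plain dict so the digestPassword requirement can be checked programmatically. The helper and its argument values are hypothetical, for illustration only:

```python
# Sketch (illustration only): the shape of the db.auth() document for
# SASL PLAIN proxy authentication, built as a plain dict. The username
# and password values are placeholders.

def plain_auth_document(user, password):
    """Return the auth document described above. digestPassword must be
    False so the server forwards the undigested password to saslauthd."""
    return {
        "mechanism": "PLAIN",
        "user": user,
        "pwd": password,
        "digestPassword": False,
    }

doc = plain_auth_document("appuser", "cleartext password")
assert doc["digestPassword"] is False
print(sorted(doc))
```

A driver would pass an equivalent document to its authenticate call against the $external database.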
Authenticate Using SASL and LDAP with OpenLDAP

MongoDB Enterprise provides support for proxy authentication of users. This allows administrators to configure a MongoDB cluster to authenticate users by proxying authentication requests to a specified Lightweight Directory Access Protocol (LDAP) service.

Considerations

MongoDB Enterprise for Windows does not include LDAP support for authentication. However, MongoDB Enterprise for Linux supports using LDAP authentication with an ActiveDirectory server.

MongoDB does not support LDAP authentication in mixed sharded cluster deployments that contain both version 2.4 and version 2.6 shards. See Upgrade MongoDB to 2.6 (page 751) for upgrade instructions.

Use secure encrypted or trusted connections between clients and the server, as well as between saslauthd and the LDAP server. The LDAP server uses the SASL PLAIN mechanism, sending and receiving data in plain text. You should use only a trusted channel such as a VPN, a connection encrypted with SSL, or a trusted wired network.

Configure saslauthd

LDAP support for user authentication requires proper configuration of the saslauthd daemon process as well as the MongoDB server.

Step 1: Specify the mechanism. On systems that configure saslauthd with the /etc/sysconfig/saslauthd file, such as Red Hat Enterprise Linux, Fedora, CentOS, and Amazon Linux AMI, set the mechanism MECH to ldap:

MECH=ldap

On systems that configure saslauthd with the /etc/default/saslauthd file, such as Ubuntu, set the MECHANISMS option to ldap:

MECHANISMS="ldap"

Step 2: Adjust caching behavior. On certain Linux distributions, saslauthd starts with the caching of authentication credentials enabled. Until restarted or until the cache expires, saslauthd will not contact the LDAP server to re-authenticate users in its authentication cache.
This allows saslauthd to successfully authenticate users in its cache, even if the LDAP server is down or if the cached users’ credentials are revoked.

To set the expiration time (in seconds) for the authentication cache, see the -t option49 of saslauthd.

Step 3: Configure LDAP Options with OpenLDAP. If the saslauthd.conf file does not exist, create it. The saslauthd.conf file usually resides in the /etc folder. If specifying a different file path, see the -O option50 of saslauthd.

To connect to an OpenLDAP server, update the saslauthd.conf file with the following configuration options:

ldap_servers: <ldap uri>
ldap_search_base: <search base>
ldap_filter: <filter>

49 http://www.linuxcommand.org/man_pages/saslauthd8.html
50 http://www.linuxcommand.org/man_pages/saslauthd8.html
The ldap_servers specifies the URI of the LDAP server used for authentication. In general, for OpenLDAP installed on the local machine, you can specify the value ldap://localhost:389 or, if using LDAP over SSL, you can specify the value ldaps://localhost:636.

The ldap_search_base specifies the distinguished name to which the search is relative. The search includes the base or objects below.

The ldap_filter specifies the search filter.

The values for these configuration options should correspond to the values specific for your deployment. For example, to filter on email, specify ldap_filter: (mail=%n) instead.

OpenLDAP Example A sample saslauthd.conf file for OpenLDAP includes the following content:

ldap_servers: ldaps://ad.example.net
ldap_search_base: ou=Users,dc=example,dc=com
ldap_filter: (uid=%u)

To use this sample OpenLDAP configuration, create users with a uid attribute (login name) and place them under the Users organizational unit (ou) under the domain components (dc) example and com.

For more information on saslauthd configuration, see http://www.openldap.org/doc/admin24/guide.html#Configuringsaslauthd.

Step 4: Test the saslauthd configuration. Use the testsaslauthd utility to test the saslauthd configuration. For example:

testsaslauthd -u testuser -p testpassword -f /var/run/saslauthd/mux

Configure MongoDB

Step 1: Add user to MongoDB for authentication. Add the user to the $external database in MongoDB. To specify the user’s privileges, assign roles (page 285) to the user.

For example, the following adds a user with read-only access to the records database.

db.getSiblingDB("$external").createUser(
    {
      user : <username>,
      roles: [ { role: "read", db: "records" } ]
    }
)

Add additional principals as needed. For more information about creating and managing users, see http://docs.mongodb.org/manual/reference/command/nav-user-management.

Step 2: Configure MongoDB server.
To configure the MongoDB server to use the saslauthd instance for proxy authentication, start the mongod with the following options:

• --auth,
• authenticationMechanisms parameter set to PLAIN, and
• saslauthdPath parameter set to the path to the Unix-domain Socket of the saslauthd instance.

Configure the MongoDB server using either the command line option --setParameter or the configuration file. Specify additional configurations as appropriate for your configuration.

If you use the authorization option to enforce authentication, you will need privileges to create a user.
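The options above compose into a single mongod command line. The following Python sketch builds it with a hypothetical helper (not a MongoDB tool): given the saslauthd run directory it appends the conventional mux socket name, and with no directory it emits the empty string that selects the default socket path, as the next examples describe:

```python
# Hypothetical helper (illustration only): build the mongod command
# line for saslauthd proxy authentication. saslauthd listens on a
# 'mux' socket inside its run directory; an empty saslauthdPath tells
# mongod to use the default socket path.

def mongod_plain_auth_args(saslauthd_dir=None):
    """Return mongod arguments for saslauthd proxy authentication."""
    path = saslauthd_dir.rstrip("/") + "/mux" if saslauthd_dir else ""
    return [
        "mongod", "--auth",
        "--setParameter", "saslauthdPath=%s" % path,
        "--setParameter", "authenticationMechanisms=PLAIN",
    ]

print(" ".join(mongod_plain_auth_args("/var/run/saslauthd")))
```

The two calls, with and without a directory argument, correspond to the "specific socket path" and "default Unix-domain socket path" variants shown below.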
Use specific saslauthd socket path. For socket path of /<some>/<path>/saslauthd, set the saslauthdPath to /<some>/<path>/saslauthd/mux, as in the following command line example:

mongod --auth --setParameter saslauthdPath=/<some>/<path>/saslauthd/mux --setParameter authenticationMechanisms=PLAIN

Or if using a configuration file, specify the following parameters in the file:

auth=true
setParameter=saslauthdPath=/<some>/<path>/saslauthd/mux
setParameter=authenticationMechanisms=PLAIN

Use default Unix-domain socket path. To use the default Unix-domain socket path, set the saslauthdPath to the empty string "", as in the following command line example:

mongod --auth --setParameter saslauthdPath="" --setParameter authenticationMechanisms=PLAIN

Or if using a configuration file, specify the following parameters in the file:

auth=true
setParameter=saslauthdPath=""
setParameter=authenticationMechanisms=PLAIN

Step 3: Authenticate the user in the mongo shell. To perform the authentication in the mongo shell, use the db.auth() method in the $external database. Specify the value "PLAIN" in the mechanism field, the user and password in the user and pwd fields respectively, and the value false in the digestPassword field. You must specify false for digestPassword since the server must receive an undigested password to forward on to saslauthd, as in the following example:

db.getSiblingDB("$external").auth(
    {
      mechanism: "PLAIN",
      user: <username>,
      pwd: <cleartext password>,
      digestPassword: false
    }
)

The server forwards the password in plain text. In general, use only on a trusted channel (VPN, SSL, trusted wired network). See Considerations.

Configure MongoDB with Kerberos Authentication on Linux

New in version 2.4.

Overview

MongoDB Enterprise supports authentication using a Kerberos service (page 291). Kerberos is an industry standard authentication protocol for large client/server systems.
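The procedures that follow add Kerberos principals of the form <username>@<KERBEROS REALM> or <username>/<instance>@<KERBEROS REALM>, with the realm in all uppercase. As an illustration of that naming convention only (this helper is hypothetical, not part of MongoDB or Kerberos), a short Python sketch parses a principal and checks the uppercase-realm requirement:

```python
# Sketch (illustration only): parse the Kerberos principal forms used
# in the procedures below, <username>[/<instance>]@<KERBEROS REALM>,
# and check that the realm is uppercase as MongoDB requires.

def parse_principal(principal):
    """Split a Kerberos principal into (username, instance, realm)."""
    name, _, realm = principal.rpartition("@")
    user, _, instance = name.partition("/")
    return user, instance or None, realm

def realm_is_uppercase(principal):
    """True if the realm portion is entirely uppercase."""
    return parse_principal(principal)[2].isupper()

user, instance, realm = parse_principal("application/reporting@EXAMPLE.NET")
assert (user, instance, realm) == ("application", "reporting", "EXAMPLE.NET")
assert realm_is_uppercase("application/reporting@EXAMPLE.NET")
```

The same check applies to the simpler reportingapp@EXAMPLE.NET form used in the Windows procedure later in this chapter.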
Prerequisites

Setting up and configuring a Kerberos deployment is beyond the scope of this document. This tutorial assumes you have configured a Kerberos service principal (page 292) for each mongod and mongos instance in your
MongoDB deployment, and you have a valid keytab file (page 292) for each mongod and mongos instance.

To verify MongoDB Enterprise binaries:

mongod --version

In the output from this command, look for the string modules: subscription or modules: enterprise to confirm your system has MongoDB Enterprise.

Procedure

The following procedure outlines the steps to add a Kerberos user principal to MongoDB, configure a standalone mongod instance for Kerberos support, and connect using the mongo shell and authenticate the user principal.

Step 1: Start mongod without Kerberos. For the initial addition of Kerberos users, start mongod without Kerberos support. If a Kerberos user is already in MongoDB and has the privileges required to create a user, you can start mongod with Kerberos support.

Step 2: Connect to mongod. Connect via the mongo shell to the mongod instance. If mongod has --auth enabled, ensure you connect with the privileges required to create a user.

Step 3: Add Kerberos Principal(s) to MongoDB. Add a Kerberos principal, <username>@<KERBEROS REALM> or <username>/<instance>@<KERBEROS REALM>, to MongoDB in the $external database. Specify the Kerberos realm in all uppercase. The $external database allows mongod to consult an external source (e.g. Kerberos) to authenticate. To specify the user’s privileges, assign roles (page 285) to the user.

The following example adds the Kerberos principal application/reporting@EXAMPLE.NET with read-only access to the records database:

use $external
db.createUser(
    {
      user: "application/reporting@EXAMPLE.NET",
      roles: [ { role: "read", db: "records" } ]
    }
)

Add additional principals as needed. For every user you want to authenticate using Kerberos, you must create a corresponding user in MongoDB. For more information about creating and managing users, see http://docs.mongodb.org/manual/reference/command/nav-user-management.

Step 4: Start mongod with Kerberos support.
To start mongod with Kerberos support, set the environmental variable KRB5_KTNAME to the path of the keytab file and the mongod parameter authenticationMechanisms to GSSAPI in the following form:

env KRB5_KTNAME=<path to keytab file> mongod --setParameter authenticationMechanisms=GSSAPI <additional mongod options>

For example, the following starts a standalone mongod instance with Kerberos support:
env KRB5_KTNAME=/opt/mongodb/mongod.keytab /opt/mongodb/bin/mongod --auth --setParameter authenticationMechanisms=GSSAPI --dbpath /opt/mongodb/data

The path to your mongod as well as your keytab file (page 292) may differ. Modify or include additional mongod options as required for your configuration. The keytab file (page 292) must be only accessible to the owner of the mongod process.

With the official .deb or .rpm packages, you can set the KRB5_KTNAME in an environment settings file. See KRB5_KTNAME (page 333) for details.

Step 5: Connect mongo shell to mongod and authenticate. Connect the mongo shell client as the Kerberos principal application/reporting@EXAMPLE.NET. Before connecting, you must have used Kerberos’s kinit program to get credentials for application/reporting@EXAMPLE.NET.

You can connect and authenticate from the command line.

mongo --authenticationMechanism=GSSAPI --authenticationDatabase='$external' --username application/reporting@EXAMPLE.NET

Or, alternatively, you can first connect mongo to the mongod, and then from the mongo shell, use the db.auth() method to authenticate in the $external database.

use $external
db.auth( { mechanism: "GSSAPI", user: "application/reporting@EXAMPLE.NET" } )

Additional Considerations

KRB5_KTNAME

If you installed MongoDB Enterprise using one of the official .deb or .rpm packages, and you use the included init/upstart scripts to control the mongod instance, you can set the KRB5_KTNAME variable in the default environment settings file instead of setting the variable each time.

For .rpm packages, the default environment settings file is /etc/sysconfig/mongod. For .deb packages, the file is /etc/default/mongodb.
Set the KRB5_KTNAME value in a line that resembles the following:

export KRB5_KTNAME="<path to keytab>"

Configure mongos for Kerberos To start mongos with Kerberos support, set the environmental variable KRB5_KTNAME to the path of its keytab file (page 292) and the mongos parameter authenticationMechanisms to GSSAPI in the following form:

env KRB5_KTNAME=<path to keytab file> mongos --setParameter authenticationMechanisms=GSSAPI <additional mongos options>

For example, the following starts a mongos instance with Kerberos support:

env KRB5_KTNAME=/opt/mongodb/mongos.keytab mongos --setParameter authenticationMechanisms=GSSAPI --configdb shard0.example.net,shard1.example.net,shard2.example.net --keyFile /opt/mongodb/mongos.keyfile
The path to your mongos as well as your keytab file (page 292) may differ. The keytab file (page 292) must be only accessible to the owner of the mongos process.

Modify or include any additional mongos options as required for your configuration. For example, instead of using --keyFile for internal authentication of sharded cluster members, you can use x.509 member authentication (page 323).

Use a Config File To configure mongod or mongos for Kerberos support using a configuration file, specify the authenticationMechanisms setting in the configuration file:

setParameter=authenticationMechanisms=GSSAPI

Modify or include any additional mongod options as required for your configuration.

For example, if /opt/mongodb/mongod.conf contains the following configuration settings for a standalone mongod:

auth = true
setParameter=authenticationMechanisms=GSSAPI
dbpath=/opt/mongodb/data

To start mongod with Kerberos support, use the following form:

env KRB5_KTNAME=/opt/mongodb/mongod.keytab /opt/mongodb/bin/mongod --config /opt/mongodb/mongod.conf

The path to your mongod, keytab file (page 292), and configuration file may differ. The keytab file (page 292) must be only accessible to the owner of the mongod process.

Troubleshoot Kerberos Setup for MongoDB If you encounter problems when starting mongod or mongos with Kerberos authentication, see Troubleshoot Kerberos Authentication on Linux (page 338).

Incorporate Additional Authentication Mechanisms Kerberos authentication (GSSAPI) can work alongside MongoDB’s challenge/response authentication mechanism (MONGODB-CR), MongoDB’s authentication mechanism for LDAP (PLAIN), and MongoDB’s authentication mechanism for x.509 (MONGODB-X509). Specify the mechanisms, as follows:

--setParameter authenticationMechanisms=GSSAPI,MONGODB-CR

Only add the other mechanisms if in use.
This parameter setting does not affect MongoDB’s internal authentication of cluster members.

Configure MongoDB with Kerberos Authentication on Windows

New in version 2.6.

Overview

MongoDB Enterprise supports authentication using a Kerberos service (page 291). Kerberos is an industry standard authentication protocol for large client/server systems. Kerberos allows MongoDB and applications to take advantage of existing authentication infrastructure and processes.
Prerequisites

Setting up and configuring a Kerberos deployment is beyond the scope of this document. This tutorial assumes you have configured a Kerberos service principal (page 292) for each mongod.exe and mongos.exe instance.

Procedures

Step 1: Start mongod.exe without Kerberos. For the initial addition of Kerberos users, start mongod.exe without Kerberos support. If a Kerberos user is already in MongoDB and has the privileges required to create a user, you can start mongod.exe with Kerberos support.

Step 2: Connect to mongod. Connect via the mongo.exe shell to the mongod.exe instance. If mongod.exe has --auth enabled, ensure you connect with the privileges required to create a user.

Step 3: Add Kerberos Principal(s) to MongoDB. Add a Kerberos principal, <username>@<KERBEROS REALM>, to MongoDB in the $external database. Specify the Kerberos realm in all uppercase. The $external database allows mongod.exe to consult an external source (e.g. Kerberos) to authenticate. To specify the user’s privileges, assign roles (page 285) to the user.

The following example adds the Kerberos principal reportingapp@EXAMPLE.NET with read-only access to the records database:

use $external
db.createUser(
    {
      user: "reportingapp@EXAMPLE.NET",
      roles: [ { role: "read", db: "records" } ]
    }
)

Add additional principals as needed. For every user you want to authenticate using Kerberos, you must create a corresponding user in MongoDB. For more information about creating and managing users, see http://docs.mongodb.org/manual/reference/command/nav-user-management.

Step 4: Start mongod.exe with Kerberos support. You must start mongod.exe as the service principal account (page 336).
To start mongod.exe with Kerberos support, set the mongod.exe parameter authenticationMechanisms to GSSAPI:

mongod.exe --setParameter authenticationMechanisms=GSSAPI <additional mongod.exe options>

For example, the following starts a standalone mongod.exe instance with Kerberos support:

mongod.exe --auth --setParameter authenticationMechanisms=GSSAPI

Modify or include additional mongod.exe options as required for your configuration.

Step 5: Connect mongo.exe shell to mongod.exe and authenticate. Connect the mongo.exe shell client as the Kerberos principal reportingapp@EXAMPLE.NET. You can connect and authenticate from the command line.
mongo.exe --authenticationMechanism=GSSAPI --authenticationDatabase='$external' --username reportingapp@EXAMPLE.NET

Or, alternatively, you can first connect mongo.exe to the mongod.exe, and then from the mongo.exe shell, use the db.auth() method to authenticate in the $external database.

use $external
db.auth( { mechanism: "GSSAPI", user: "reportingapp@EXAMPLE.NET" } )

Additional Considerations

Configure mongos.exe for Kerberos To start mongos.exe with Kerberos support, set the mongos.exe parameter authenticationMechanisms to GSSAPI. You must start mongos.exe as the service principal account (page 336):

mongos.exe --setParameter authenticationMechanisms=GSSAPI <additional mongos options>

For example, the following starts a mongos instance with Kerberos support:

mongos.exe --setParameter authenticationMechanisms=GSSAPI --configdb shard0.example.net,shard1.example.net,shard2.example.net

Modify or include any additional mongos.exe options as required for your configuration. For example, instead of using --keyFile for internal authentication of sharded cluster members, you can use x.509 member authentication (page 323).

Assign Service Principal Name to MongoDB Windows Service Use setspn.exe to assign the service principal name (SPN) to the account running the mongod.exe and the mongos.exe service:

setspn.exe -A <service>/<fully qualified domain name> <service account name>

For example, if mongod.exe runs as a service named mongodb on testserver.mongodb.com with the service account name mongodtest, assign the SPN as follows:

setspn.exe -A mongodb/testserver.mongodb.com mongodtest

Incorporate Additional Authentication Mechanisms Kerberos authentication (GSSAPI) can work alongside MongoDB’s challenge/response authentication mechanism (MONGODB-CR), MongoDB’s authentication mechanism for LDAP (PLAIN), and MongoDB’s authentication mechanism for x.509 (MONGODB-X509).
Specify the mechanisms as follows:

--setParameter authenticationMechanisms=GSSAPI,MONGODB-CR

Only add the other mechanisms if in use. This parameter setting does not affect MongoDB’s internal authentication of cluster members.

Authenticate to a MongoDB Instance or Cluster

Overview

To authenticate to a running mongod or mongos instance, you must have user credentials for a resource on that instance. When you authenticate to MongoDB, you authenticate either to a database or to a cluster. Your user privileges determine the resource you can authenticate to.

You authenticate to a resource either by:
• using the authentication options when connecting to the mongod or mongos instance, or
• connecting first and then authenticating to the resource with the authenticate command or the db.auth() method.

This section describes both approaches.

In general, always use a trusted channel (VPN, SSL, trusted wired network) for connecting to a MongoDB instance.

Prerequisites

You must have user credentials on the database or cluster to which you are authenticating.

Procedures

Authenticate When First Connecting to MongoDB

Step 1: Specify your credentials when starting the mongo instance. When using mongo to connect to a mongod or mongos, enter your username, password, and authenticationDatabase. For example:

mongo --username "prodManager" --password "cleartextPassword" --authenticationDatabase "products"

Step 2: Close the session when your work is complete. To close an authenticated session, use the logout command:

db.runCommand( { logout: 1 } )

Authenticate After Connecting to MongoDB

Step 1: Connect to a MongoDB instance. Connect to a mongod or mongos instance.

Step 2: Switch to the database to which to authenticate.

use <database>

Step 3: Authenticate. Use either the authenticate command or the db.auth() method to provide your username and password to the database. For example:

db.auth( "prodManager", "cleartextPassword" )

Step 4: Close the session when your work is complete. To close an authenticated session, use the logout command:

db.runCommand( { logout: 1 } )
Generate a Key File

Overview

This section describes how to generate a key file to store authentication information. After generating a key file, specify the key file using the keyFile option when starting a mongod or mongos instance.

A key’s length must be between 6 and 1024 characters and may only contain characters in the base64 set. The key file must not have group or world permissions on UNIX systems. Key file permissions are not checked on Windows systems.

MongoDB strips whitespace characters (e.g. \x0d, \x09, and \x20) for cross-platform convenience. As a result, the following operations produce identical keys:

echo -e "my secret key" > key1
echo -e "my secret key\n" > key2
echo -e "my secret key" > key3
echo -e "my\r\nsecret\r\nkey\r\n" > key4

Procedure

Step 1: Create a key file. Create the key file your deployment will use to authenticate servers to each other.

To generate pseudo-random data to use for a keyfile, issue the following openssl command:

openssl rand -base64 741 > mongodb-keyfile
chmod 600 mongodb-keyfile

You may generate a key file using any method you choose. Always ensure that the password stored in the key file is both long and contains a high amount of entropy. Using openssl in this manner helps generate such a key.

Step 2: Specify the key file when starting a MongoDB instance. Specify the path to the key file with the keyFile option.

Troubleshoot Kerberos Authentication on Linux

New in version 2.4.

Kerberos Configuration Checklist

If you have difficulty starting mongod or mongos with Kerberos (page 291) on Linux systems, ensure that:

• The mongod and the mongos binaries are from MongoDB Enterprise. To verify MongoDB Enterprise binaries:

mongod --version

In the output from this command, look for the string modules: subscription or modules: enterprise to confirm your system has MongoDB Enterprise.

• You are not using the HTTP Console51.
MongoDB Enterprise does not support Kerberos authentication over the HTTP Console interface. 51http://docs.mongodb.org/ecosystem/tools/http-interface/#http-console 338 Chapter 6. Security
• Either the service principal name (SPN) in the keytab file (page 292) matches the SPN for the mongod or mongos instance, or the mongod or the mongos instance uses the --setParameter saslHostName=<host name> to match the name in the keytab file.
• The canonical system hostname of the system that runs the mongod or mongos instance is a resolvable, fully qualified domain name for this host. You can test the system hostname resolution with the hostname -f command at the system prompt.
• Each host that runs a mongod or mongos instance has both the A and PTR DNS records to provide forward and reverse lookup. The records allow the host to resolve the components of the Kerberos infrastructure.
• Both the Kerberos Key Distribution Center (KDC) and the system running the mongod or mongos instance must be able to resolve each other using DNS. By default, Kerberos attempts to resolve hosts using the content of /etc/krb5.conf before using DNS to resolve hosts.
• The time synchronization of the systems running the mongod or mongos instances and the Kerberos infrastructure are within the maximum time skew (default is 5 minutes) of each other. Time differences greater than the maximum time skew will prevent successful authentication.
Debug with More Verbose Logs If you still encounter problems with Kerberos on Linux, you can start both mongod and mongo (or another client) with the environment variable KRB5_TRACE set to different files to produce more verbose logging of the Kerberos process to aid further troubleshooting.
For example, the following starts a standalone mongod with KRB5_TRACE set:
env KRB5_KTNAME=/opt/mongodb/mongod.keytab KRB5_TRACE=/opt/mongodb/log/mongodb-kerberos.log /opt/mongodb/bin/mongod --dbpath /opt/mongodb/data --fork --logpath /opt/mongodb/log/mongod.log --auth --setParameter authenticationMechanisms=GSSAPI
Common Error Messages In some situations, MongoDB will return error messages from the GSSAPI interface if there is a problem with the Kerberos service. Some common error messages are:
GSSAPI error in client while negotiating security context. This error occurs on the client and reflects insufficient credentials or a malicious attempt to authenticate. If you receive this error, ensure that you are using the correct credentials and the correct fully qualified domain name when connecting to the host.
GSSAPI error acquiring credentials. This error occurs during the start of the mongod or mongos and reflects improper configuration of the system hostname or a missing or incorrectly configured keytab file. If you encounter this problem, consider the items in the Kerberos Configuration Checklist (page 338), in particular, whether the SPN in the keytab file (page 292) matches the SPN for the mongod or mongos instance.
To determine whether the SPNs match:
1. Examine the keytab file, with the following command:
klist -k <keytab>
Replace <keytab> with the path to your keytab file.
2. Check the configured hostname for your system, with the following command:
hostname -f
Ensure that this name matches the name in the keytab file, or start mongod or mongos with the --setParameter saslHostName=<hostname>.
See also:
• Kerberos Authentication (page 291)
• Configure MongoDB with Kerberos Authentication on Linux (page 331)
• Configure MongoDB with Kerberos Authentication on Windows (page 334)
Implement Field Level Redaction
The $redact pipeline operator restricts the contents of the documents based on information stored in the documents themselves.
Figure 6.1: Diagram of security architecture with middleware and redaction.
To store the access criteria data, add a field to the documents and subdocuments. To allow for multiple combinations of access levels for the same data, consider setting the access field to an array of arrays. Each array element contains a required set that allows a user with that set to access the data.
Then, include the $redact stage in the db.collection.aggregate() operation to restrict contents of the result set based on the access required to view the data.
For more information on the $redact pipeline operator, including its syntax and associated system variables as well as additional examples, see $redact.
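The per-level test that such a $redact stage performs can be modeled in plain JavaScript: a level stays visible when any one of its required tag sets is fully contained in the user's access set (mirroring $anyElementTrue applied over a $map of $setIsSubset). This is an illustrative sketch of the logic only, not the server's implementation, and the function name is hypothetical.

```javascript
// Sketch of the per-level $redact access test: keep a level when ANY
// of its required tag sets is a subset of the user's access set.
// Mirrors $anyElementTrue over $map of $setIsSubset; illustration only.
function levelVisible(requiredSets, userAccess) {
  const access = new Set(userAccess);
  return requiredSets.some(set => set.every(tag => access.has(tag)));
}

// A user holding [ "FDW", "TGE" ] can see a level tagged
// [ [ "G" ], [ "FDW" ] ] (the [ "FDW" ] set matches),
// but not one tagged [ [ "STLW" ] ].
console.log(levelVisible([ [ "G" ], [ "FDW" ] ], [ "FDW", "TGE" ])); // true
console.log(levelVisible([ [ "STLW" ] ], [ "FDW", "TGE" ]));         // false
```

The procedure that follows shows the same subset test expressed in the aggregation pipeline itself.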
Procedure For example, a forecasts collection contains documents of the following form where the tags field determines the access levels required to view the data:
{
  _id: 1,
  title: "123 Department Report",
  tags: [ [ "G" ], [ "FDW" ] ],
  year: 2014,
  subsections: [
    {
      subtitle: "Section 1: Overview",
      tags: [ [ "SI", "G" ], [ "FDW" ] ],
      content: "Section 1: This is the content of section 1."
    },
    {
      subtitle: "Section 2: Analysis",
      tags: [ [ "STLW" ] ],
      content: "Section 2: This is the content of section 2."
    },
    {
      subtitle: "Section 3: Budgeting",
      tags: [ [ "TK" ], [ "FDW", "TGE" ] ],
      content: {
        text: "Section 3: This is the content of section 3.",
        tags: [ [ "HCS" ], [ "FDW", "TGE", "BX" ] ]
      }
    }
  ]
}
For each document, the tags field contains various access groupings necessary to view the data. For example, the value [ [ "G" ], [ "FDW", "TGE" ] ] can specify that a user requires either access level [ "G" ] or both [ "FDW", "TGE" ] to view the data.
Consider a user who only has access to view information tagged with either "FDW" or "TGE". To run a query on all documents with year 2014 for this user, include a $redact stage as in the following:
var userAccess = [ "FDW", "TGE" ];
db.forecasts.aggregate(
  [
    { $match: { year: 2014 } },
    { $redact:
      {
        $cond: {
          if: {
            $anyElementTrue: {
              $map: {
                input: "$tags",
                as: "fieldTag",
                in: { $setIsSubset: [ "$$fieldTag", userAccess ] }
              }
            }
          },
          then: "$$DESCEND",
          else: "$$PRUNE"
        }
      }
    }
  ]
)
The aggregation operation returns the following "redacted" document for the user:
{
  "_id" : 1,
  "title" : "123 Department Report",
  "tags" : [ [ "G" ], [ "FDW" ] ],
  "year" : 2014,
  "subsections" : [
    {
      "subtitle" : "Section 1: Overview",
      "tags" : [ [ "SI", "G" ], [ "FDW" ] ],
      "content" : "Section 1: This is the content of section 1."
    },
    {
      "subtitle" : "Section 3: Budgeting",
      "tags" : [ [ "TK" ], [ "FDW", "TGE" ] ]
    }
  ]
}
See also: $map, $setIsSubset, $anyElementTrue
6.3.5 User and Role Management Tutorials
The following tutorials provide instructions on how to enable authentication and limit access for users with privilege roles.
Create a User Administrator (page 343) Create users with special permissions to create, modify, and remove other users, as well as administer authentication credentials (e.g. passwords).
Add a User to a Database (page 344) Create non-administrator users using MongoDB's role-based authentication system.
Create an Administrative User with Unrestricted Access (page 346) Create a user with unrestricted access. Create such a user only in unique situations. In general, all users in the system should have no more access than needed to perform their required operations.
Create a Role (page 347) Create a custom role.
Assign a User a Role (page 349) Assign a user a role. A role grants the user a defined set of privileges. A user can have multiple roles.
Verify User Privileges (page 350) View a user's current privileges.
Modify a User's Access (page 352) Modify the actions available to a user on specific database resources.
View Roles (page 353) View a role's privileges.
Change a User's Password (page 354) Only user administrators can edit credentials. This tutorial describes the process for editing an existing user's password.
Change Your Password and Custom Data (page 355) Users with sufficient access can change their own passwords and modify the optional custom data associated with their user credential.
Create a User Administrator
Overview User administrators create users and create and assign roles. A user administrator can grant any privilege in the database and can create new ones. In a MongoDB deployment, create the user administrator as the first user. Then let this user create all other users.
To provide user administrators, MongoDB has the userAdmin (page 363) and userAdminAnyDatabase (page 368) roles, which grant access to actions (page 375) that support user and role management. Following the policy of least privilege, userAdmin (page 363) and userAdminAnyDatabase (page 368) confer no additional privileges.
Carefully control access to these roles. A user with either of these roles can grant itself unlimited additional privileges. Specifically, a user with the userAdmin (page 363) role can grant itself any privilege in the database. A user assigned either the userAdmin (page 363) role on the admin database or the userAdminAnyDatabase (page 368) role can grant itself any privilege in the system.
Prerequisites
Required Access You must have the createUser (page 376) action (page 375) on a database to create a new user on that database. You must have the grantRole (page 376) action (page 375) on a role's database to grant the role to another user.
If you have the userAdmin (page 363) or userAdminAnyDatabase (page 368) role, you have those actions.
First User Restrictions If your MongoDB deployment has no users, you must connect to mongod using the localhost exception (page 285) or use the --noauth option when starting mongod to gain full access to the system. Once you have access, you can skip to Creating the system user administrator in this procedure.
If users exist in the MongoDB database, but none of them has the appropriate prerequisites to create a new user or you do not have access to them, you must restart mongod with the --noauth option.
Procedure
Step 1: Connect to MongoDB with the appropriate privileges.
Connect to the mongod or mongos as a user with the privileges required in the Prerequisites (page 343) section. The following example operation connects to MongoDB as an authenticated user named manager:
mongo --port 27017 -u manager -p 12345678 --authenticationDatabase admin
Step 2: Verify your privileges. Use the usersInfo command with the showPrivileges option. The following example operation checks privileges for a user connected as manager:
db.runCommand( { usersInfo:"manager", showPrivileges:true } )
The resulting users document displays the privileges granted to manager.
Step 3: Create the system user administrator. Add the user with the userAdminAnyDatabase (page 368) role, and only that role. The following example creates the user siteUserAdmin on the admin database:
use admin
db.createUser(
  {
    user: "siteUserAdmin",
    pwd: "password",
    roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
  }
)
Step 4: Create a user administrator for a single database. Optionally, you may want to create user administrators that only have access to administer users in a specific database by way of the userAdmin (page 363) role. The following example creates the user recordsUserAdmin on the records database:
use records
db.createUser(
  {
    user: "recordsUserAdmin",
    pwd: "password",
    roles: [ { role: "userAdmin", db: "records" } ]
  }
)
Related Documents
• Authentication (page 282)
• Security Introduction (page 279)
• Enable Client Access Control (page 317)
• Access Control Tutorials (page 316)
Add a User to a Database
Changed in version 2.6.
Overview Each application and user of a MongoDB system should map to a distinct application or administrator. This access isolation facilitates access revocation and ongoing user maintenance. At the same time, users should have only the minimal set of privileges required, to ensure a system of least privilege.
To create a user, you must define the user's credentials and assign that user roles (page 285). Credentials verify the user's identity to a database, and roles determine the user's access to database resources and operations. For an overview of credentials and roles in MongoDB see Security Introduction (page 279).
Considerations For users that authenticate using external mechanisms,52 you do not need to provide credentials when creating users.
For all users, select the roles that have the exact required privileges (page 286). If the correct roles do not exist, create roles (page 347).
You can create a user without assigning roles, choosing instead to assign the roles later. To do so, create the user with an empty roles (page 372) array.
When adding a user to multiple databases, use unique username-and-password combinations for each database; see Password Hashing Insecurity (page 385) for more information.
Prerequisites To create a user on a system that uses authentication (page 282), you must authenticate as a user administrator. If you have not yet created a user administrator, do so as described in Create a User Administrator (page 343).
Required Access You must have the createUser (page 376) action (page 375) on a database to create a new user on that database. You must have the grantRole (page 376) action (page 375) on a role's database to grant the role to another user.
If you have the userAdmin (page 363) or userAdminAnyDatabase (page 368) role, you have those actions.
First User Restrictions If your MongoDB deployment has no users, you must connect to mongod using the localhost exception (page 285) or use the --noauth option when starting mongod to gain full access to the system. Once you have access, you can skip to Creating the system user administrator in this procedure.
If users exist in the MongoDB database, but none of them has the appropriate prerequisites to create a new user or you do not have access to them, you must restart mongod with the --noauth option.
Procedures
Step 1: Connect to MongoDB with the appropriate privileges. Connect to the mongod or mongos with the privileges required in the Prerequisites (page 345) section. The following example operation connects to MongoDB as an authenticated user named manager:
mongo --port 27017 -u manager -p 12345678 --authenticationDatabase admin
52 Configure MongoDB with Kerberos Authentication on Linux (page 331), Authenticate Using SASL and LDAP with OpenLDAP (page 329), Authenticate Using SASL and LDAP with ActiveDirectory (page 326), and x.509 certificates provide external authentication mechanisms.
Step 2: Verify your privileges. Use the usersInfo command with the showPrivileges option. The following example operation checks privileges for a user connected as manager:
db.runCommand( { usersInfo:"manager", showPrivileges:true } )
The resulting users document displays the privileges granted to manager.
Step 3: Create the new user. Create the user in the database to which the user will belong. Pass a well-formed user document to the db.createUser() method.
The following operation creates a user in the reporting database with the specified name, password, and roles.
use reporting
db.createUser(
  {
    user: "reportsUser",
    pwd: "12345678",
    roles: [
      { role: "read", db: "reporting" },
      { role: "read", db: "products" },
      { role: "read", db: "sales" }
    ]
  }
)
To authenticate the reportsUser, you must authenticate the user in the reporting database.
Create an Administrative User with Unrestricted Access
Overview Most users should have only the minimal set of privileges required for their operations, in keeping with the policy of least privilege. However, some authorization architectures may require a user with unrestricted access. To support these super users, you can create users with access to all database resources (page 373) and actions (page 375).
For many deployments, you may be able to avoid having any users with unrestricted access by having an administrative user with the createUser (page 376) and grantRole (page 376) actions granted as needed to support operations.
If users truly need unrestricted access to a MongoDB deployment, MongoDB provides a built-in role (page 361) named root (page 368) that grants the combined privileges of all built-in roles. This document describes how to create an administrative user with the root (page 368) role. For descriptions of the access each built-in role provides, see the section on built-in roles (page 361).
Prerequisites Required Access You must have the createUser (page 376) action (page 375) on a database to create a new user on that database. You must have the grantRole (page 376) action (page 375) on a role's database to grant the role to another user. If you have the userAdmin (page 363) or userAdminAnyDatabase (page 368) role, you have those actions.
First User Restrictions If your MongoDB deployment has no users, you must connect to mongod using the localhost exception (page 285) or use the --noauth option when starting mongod to gain full access to the system. Once you have access, you can skip to Creating the system user administrator in this procedure.
If users exist in the MongoDB database, but none of them has the appropriate prerequisites to create a new user or you do not have access to them, you must restart mongod with the --noauth option.
Procedure
Step 1: Connect to MongoDB with the appropriate privileges. Connect to the mongod or mongos as a user with the privileges required in the Prerequisites (page 346) section. The following example operation connects to MongoDB as an authenticated user named manager:
mongo --port 27017 -u manager -p 12345678 --authenticationDatabase admin
Step 2: Verify your privileges. Use the usersInfo command with the showPrivileges option. The following example operation checks privileges for a user connected as manager:
db.runCommand( { usersInfo:"manager", showPrivileges:true } )
The resulting users document displays the privileges granted to manager.
Step 3: Create the administrative user. In the admin database, create a new user using the db.createUser() method. Give the user the built-in root (page 368) role. For example:
use admin
db.createUser(
  {
    user: "superuser",
    pwd: "12345678",
    roles: [ "root" ]
  }
)
Authenticate against the admin database to test the new user account. Use db.auth() while using the admin database or use the mongo shell with the --authenticationDatabase option.
Create a Role
Overview Roles grant users access to MongoDB resources. By default, MongoDB provides a number of built-in roles (page 361) that administrators may use to control access to a MongoDB system.
However, if these roles cannot describe the desired privilege set of a particular user type in a deployment, you can define a new, customized role.
A role's privileges apply to the database where the role is created. The role can inherit privileges from other roles in its database. A role created on the admin database can include privileges that apply to all databases or to the cluster (page 374) and can inherit privileges from roles in other databases. The combination of the database name and the role name uniquely defines a role in MongoDB.
Prerequisites You must have the createRole (page 375) action (page 375) on a database to create a role on that database.
You must have the grantRole (page 376) action (page 375) on the database that a privilege targets in order to grant that privilege to a role. If the privilege targets multiple databases or the cluster resource, you must have the grantRole (page 376) action on the admin database.
You must have the grantRole (page 376) action (page 375) on a role's database to grant the role to another role.
To view a role's information, you must be explicitly granted the role or must have the viewRole (page 376) action (page 375) on the role's database.
Procedure
Step 1: Connect to MongoDB with the appropriate privileges. Connect to the mongod or mongos with the privileges required in the Prerequisites (page 348) section. The following example operation connects to MongoDB as an authenticated user named manager:
mongo --port 27017 -u manager -p 12345678 --authenticationDatabase admin
Step 2: Verify your privileges. Use the usersInfo command with the showPrivileges option. The following example operation checks privileges for a user connected as manager:
db.runCommand( { usersInfo:"manager", showPrivileges:true } )
The resulting users document displays the privileges granted to manager.
Step 3: Define the privileges to grant to the role. Decide which resources (page 373) to grant access to and which actions (page 375) to grant on each resource.
When creating the role, you will enter the resource-action pairings as documents in the privileges array, as in the following example:
{ resource: { db: "products", collection: "electronics" }, actions: [ "find" ] }
Step 4: Check whether an existing role provides the privileges. If an existing role contains the exact set of privileges (page 286), the new role can inherit (page 286) those privileges. To view the privileges provided by existing roles, use the rolesInfo command, as in the following:
db.runCommand( { rolesInfo: 1, showPrivileges: 1 } )
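Before passing a privileges array to createRole, it can be useful to sanity-check its shape on the client side. The following sketch applies only the structural rules visible in the examples here (a resource document that is either the cluster or a db/collection pair, plus a non-empty actions array); the server enforces the full validation, and the helper name is hypothetical.

```javascript
// Sketch: a client-side structural check of a privileges array before
// passing it to createRole. Mirrors only the shape shown in the
// examples; the server performs the authoritative validation.
function looksLikePrivilege(p) {
  const hasResource = p.resource &&
    (p.resource.cluster === true ||
     (typeof p.resource.db === "string" && typeof p.resource.collection === "string"));
  const hasActions = Array.isArray(p.actions) && p.actions.length > 0;
  return Boolean(hasResource && hasActions);
}

const privileges = [
  { resource: { cluster: true }, actions: [ "addShard" ] },
  { resource: { db: "config", collection: "" }, actions: [ "find", "update", "insert" ] }
];
console.log(privileges.every(looksLikePrivilege)); // true
```

A resource document missing its collection field, or an empty actions array, fails the check.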
Step 5: Create the role. To create the role, use the createRole command. Specify privileges in the privileges array and inherited roles in the roles array. The following example creates the myClusterwideAdmin role in the admin database:
use admin
db.createRole(
  {
    role: "myClusterwideAdmin",
    privileges: [
      { resource: { cluster: true }, actions: [ "addShard" ] },
      { resource: { db: "config", collection: "" }, actions: [ "find", "update", "insert" ] },
      { resource: { db: "users", collection: "usersCollection" }, actions: [ "update" ] },
      { resource: { db: "", collection: "" }, actions: [ "find" ] }
    ],
    roles: [
      { role: "read", db: "admin" }
    ],
    writeConcern: { w: "majority", wtimeout: 5000 }
  }
)
The operation defines the myClusterwideAdmin role's privileges in the privileges array. In the roles array, myClusterwideAdmin inherits privileges from the admin database's read role.
Assign a User a Role
Changed in version 2.6.
Overview A role provides a user privileges to perform a set of actions (page 375) on a resource (page 373). A user can have multiple roles.
In MongoDB systems with authorization enforced, you must grant a user a role for the user to access a database resource. To assign a role, first determine the privileges the user needs and then determine the role that grants those privileges.
For an overview of roles and privileges, see Authorization (page 285). For descriptions of the access each built-in role provides, see the section on built-in roles (page 361).
Prerequisites You must have the grantRole (page 376) action (page 375) on a database to grant a role on that database.
To view a role's information, you must be explicitly granted the role or must have the viewRole (page 376) action (page 375) on the role's database.
Procedure
Step 1: Connect with the privilege to grant roles.
Connect to the mongod or mongos either through the localhost exception (page 285) or as a user with the privileges required in the Prerequisites (page 349) section. 6.3. Security Tutorials 349
The following example operation connects to the MongoDB instance as a user named roleManager:
mongo --port 27017 -u roleManager -p 12345678 --authenticationDatabase admin
Step 2: Verify your privileges. Use the usersInfo command with the showPrivileges option. The following example operation checks privileges for a user connected as manager:
db.runCommand( { usersInfo:"manager", showPrivileges:true } )
The resulting users document displays the privileges granted to manager.
Step 3: Identify the user's roles and privileges. To display the roles and privileges of the user to be modified, use the db.getUser() and db.getRole() methods, as described in Verify User Privileges (page 350). To display the privileges granted by siteRole01 on the current database, issue:
db.getRole( "siteRole01", { showPrivileges: true } )
Step 4: Identify the privileges to grant or revoke. Determine which role contains the privileges and only those privileges. If such a role does not exist, then to grant the privileges will require creating a new role (page 347) with the specific set of privileges. To revoke a subset of privileges provided by an existing role: revoke the original role, create a new role (page 347) that contains the privileges to keep, and then grant that role to the user.
Step 5: Grant a role to a user. Grant the user the role using the db.grantRolesToUser() method. For example:
use admin
db.grantRolesToUser(
  "accountAdmin01",
  [
    { role: "readWrite", db: "products" },
    { role: "readAnyDatabase", db: "admin" }
  ]
)
Verify User Privileges
Overview A user's privileges determine the access the user has to MongoDB resources (page 373) and the actions (page 375) that user can perform. Users receive privileges through role assignments. A user can have multiple roles, and each role can have multiple privileges. For an overview of roles and privileges, see Authorization (page 285).
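Conceptually, a user's effective privileges are the union of the privileges of every assigned role, following role inheritance. The sketch below models that resolution in plain JavaScript using simplified, hypothetical role definitions (the real read role grants more than shown); rolesInfo and db.getRole() with showPrivileges are the actual tools for inspecting this.

```javascript
// Toy model of effective-privilege resolution: a role's privileges
// plus everything inherited from the roles it includes, recursively.
// Role definitions here are simplified and hypothetical.
const roles = {
  "records.siteRole01": {
    privileges: [ { resource: { db: "records", collection: "" }, actions: [ "find", "insert", "update" ] } ],
    roles: [ "corporate.read" ]
  },
  "corporate.read": {
    privileges: [ { resource: { db: "corporate", collection: "" }, actions: [ "find" ] } ],
    roles: []
  }
};

function effectivePrivileges(roleName, seen = new Set()) {
  if (seen.has(roleName) || !roles[roleName]) return [];
  seen.add(roleName); // guard against inheritance cycles
  const role = roles[roleName];
  return role.privileges.concat(
    ...role.roles.map(r => effectivePrivileges(r, seen))
  );
}

console.log(effectivePrivileges("records.siteRole01").length); // 2
```

The two resulting privilege documents correspond to the direct grant on records and the inherited find on corporate.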
Prerequisites To view a role's information, you must be explicitly granted the role or must have the viewRole (page 376) action (page 375) on the role's database.
Procedure
Step 1: Identify the user's roles. Use the usersInfo command or db.getUser() method to display user information. The roles (page 372) array specifies the user's roles. For example, to view roles for accountUser01 on the accounts database, issue the following:
use accounts
db.getUser("accountUser01")
The roles (page 372) array displays all roles for accountUser01:
"roles" : [
  {
    "role" : "readWrite",
    "db" : "accounts"
  },
  {
    "role" : "siteRole01",
    "db" : "records"
  }
]
Step 2: Identify the privileges granted by the roles. For a given role, use the rolesInfo command or db.getRole() method, and include the showPrivileges parameter. The resulting role document displays both privileges granted directly and roles from which this role inherits privileges.
For example, to view the privileges granted by siteRole01 on the records database, use the following operation, which returns a document with a privileges (page 370) array:
use records
db.getRole( "siteRole01", { showPrivileges: true } )
The returned document includes the roles (page 370) and privileges (page 370) arrays:
"roles" : [
  {
    "role" : "read",
    "db" : "corporate"
  }
],
"privileges" : [
  {
    "resource" : { "db" : "records", "collection" : "" },
    "actions" : [ "find", "insert", "update" ]
  }
]
To view the privileges granted by the read (page 362) role, use db.getRole() again with the appropriate parameters.
Modify a User's Access
Overview When a user's responsibilities change, modify the user's access to include only those roles the user requires. This follows the policy of least privilege.
To change a user's access, first determine the privileges the user needs and then determine the roles that grant those privileges. Grant and revoke roles using the db.grantRolesToUser() and db.revokeRolesFromUser() methods. For an overview of roles and privileges, see Authorization (page 285). For descriptions of the access each built-in role provides, see the section on built-in roles (page 361).
Prerequisites You must have the grantRole (page 376) action (page 375) on a database to grant a role on that database.
You must have the revokeRole (page 376) action (page 375) on a database to revoke a role on that database.
To view a role's information, you must be explicitly granted the role or must have the viewRole (page 376) action (page 375) on the role's database.
Procedure
Step 1: Connect to MongoDB with the appropriate privileges. Connect to the mongod or mongos either through the localhost exception (page 285) or as a user with the privileges required in the Prerequisites (page 352) section. The following example operation connects to MongoDB as an authenticated user named manager:
mongo --port 27017 -u manager -p 12345678 --authenticationDatabase admin
Step 2: Verify your privileges. Use the usersInfo command with the showPrivileges option. The following example operation checks privileges for a user connected as manager:
db.runCommand( { usersInfo:"manager", showPrivileges:true } )
The resulting users document displays the privileges granted to manager.
Step 3: Identify the user's roles and privileges. To display the roles and privileges of the user to be modified, use the db.getUser() and db.getRole() methods, as described in Verify User Privileges (page 350). To display the privileges granted by siteRole01 on the current database, issue:
db.getRole( "siteRole01", { showPrivileges: true } )
Step 4: Identify the privileges to grant or revoke. Determine which role contains the privileges and only those privileges. If such a role does not exist, then to grant the privileges will require creating a new role (page 347) with the specific set of privileges. To revoke a subset of privileges provided by an existing role: revoke the original role, create a new role (page 347) that contains the privileges to keep, and then grant that role to the user.
Step 5: Modify the user's access.
Revoke a Role Revoke a role with the db.revokeRolesFromUser() method. Access revocations apply as soon as the user tries to run a command. On a mongos revocations are instant on the mongos on which the command ran, but there is up to a 10-minute delay before the user cache is updated on the other mongos instances in the cluster.
The following example operation removes the readWrite (page 362) role on the accounts database from the accountUser01 user's existing roles:
use accounts
db.revokeRolesFromUser(
  "accountUser01",
  [ { role: "readWrite", db: "accounts" } ]
)
Grant a Role Grant a role using the db.grantRolesToUser() method. For example, the following operation grants the accountUser01 user the read (page 362) role on the records database:
use accounts
db.grantRolesToUser(
  "accountUser01",
  [ { role: "read", db: "records" } ]
)
View Roles
Overview A role (page 285) grants privileges to the users who are assigned the role. Each role is scoped to a particular database, but MongoDB stores all role information in the admin.system.roles (page 270) collection in the admin database.
Prerequisites To view a role’s information, you must be explicitly granted the role or must have the viewRole (page 376) action (page 375) on the role’s database. 6.3. Security Tutorials 353
Procedures The following procedures use the rolesInfo command. You also can use the methods db.getRole() (singular) and db.getRoles().
View a Role in the Current Database If the role is in the current database, you can refer to the role by name, as for the role dataEntry on the current database:
db.runCommand({ rolesInfo: "dataEntry" })
View a Role in a Different Database If the role is in a different database, specify the role as a document. Use the following form:
{ role: "<role name>", db: "<role db>" }
To view the custom appWriter role in the orders database, issue the following command from the mongo shell:
db.runCommand({ rolesInfo: { role: "appWriter", db: "orders" } })
View Multiple Roles To view information for multiple roles, specify each role as a document or string in an array. To view the custom appWriter and clientWriter roles in the orders database, as well as the dataEntry role on the current database, use the following command from the mongo shell:
db.runCommand( { rolesInfo: [ { role: "appWriter", db: "orders" }, { role: "clientWriter", db: "orders" }, "dataEntry" ] } )
View All Custom Roles To view all custom roles, query the admin.system.roles (page 369) collection directly, for example:
db = db.getSiblingDB('admin')
db.system.roles.find()
Change a User's Password
Changed in version 2.6.
Overview Strong passwords help prevent unauthorized access, and all users should have strong passwords. You can use the openssl program to generate unique strings for use in passwords, as in the following command:
openssl rand -base64 48
Prerequisites You must have the changeAnyPassword action (page 375) on a database to modify the password of any user on that database.
  • 359. MongoDB Documentation, Release 2.6.4 You must have the changeOwnPassword (page 375) action (page 375) on your database to change your own password. Procedure Step 1: Connect to MongoDB with the appropriate privileges. Connect to the mongod or mongos with the privileges required in the Prerequisites (page 354) section. The following example operation connects to MongoDB as an authenticated user named manager: mongo --port 27017 -u manager -p 12345678 --authenticationDatabase admin Step 2: Verify your privileges. Use the usersInfo command with the showPrivileges option. The following example operation checks privileges for a user connected as manager: db.runCommand( { usersInfo:"manager", showPrivileges:true } ) The resulting users document displays the privileges granted to manager. Step 3: Change the password. Pass the user’s username and the new password to the db.changeUserPassword() method. The following operation changes the reporting user’s password to SOh3TbYhxuLiW8ypJPxmt1oOfL: db.changeUserPassword("reporting", "SOh3TbYhxuLiW8ypJPxmt1oOfL") Change Your Password and Custom Data Changed in version 2.6. Overview Users with appropriate privileges can change their own passwords and custom data. Custom data (page 373) stores optional user information. Considerations To generate a strong password for use in this procedure, you can use the openssl utility’s rand command. For example, issue openssl rand with the following options to create a base64-encoded string of 48 pseudo-random bytes: openssl rand -base64 48 6.3. Security Tutorials 355
Prerequisites To modify your own password or custom data, you must have the changeOwnPassword (page 375) and changeOwnCustomData (page 375) actions (page 375), respectively, on the cluster resource.

Procedure

Step 1: Connect with the appropriate privileges. Connect to the mongod or mongos with your username and current password. For example, the following operation connects to MongoDB as an authenticated user named manager. mongo --port 27017 -u manager -p 12345678 --authenticationDatabase admin

Step 2: Verify your privileges. To check that you have the privileges specified in the Prerequisites (page 356) section, use the usersInfo command with the showPrivileges option. The following example operation checks privileges for a user connected as manager: db.runCommand( { usersInfo:"manager", showPrivileges:true } ) The resulting users document displays the privileges granted to manager.

Step 3: View your custom data. For example, the following operation returns information, including any custom data, for the manager user: db.runCommand( { usersInfo: "manager" } )

Step 4: Change your password and custom data. Pass your username, new password, and new custom data to the updateUser command. For example, the following operation changes a user’s password to KNlZmiaNUp0B and custom data to { title: "Senior Manager" }: db.runCommand( { updateUser: "manager", pwd: "KNlZmiaNUp0B", customData: { title: "Senior Manager" } } )

6.3.6 Configure System Events Auditing New in version 2.6. MongoDB Enterprise supports auditing (page 290) of various operations. A complete auditing solution must involve all mongod server and mongos router processes.
The audit facility can write audit events to the console, the syslog (unavailable on Windows), a JSON file, or a BSON file. For details on the audited operations and the audit log messages, see System Event Audit Messages (page 380).

Enable and Configure Audit Output Use the --auditDestination option to enable auditing and specify where to output the audit events.

Output to Syslog To enable auditing and print audit events to the syslog (unavailable on Windows) in JSON format, specify syslog for the --auditDestination setting. For example: mongod --dbpath data/db --auditDestination syslog Warning: The syslog message limit can result in the truncation of the audit messages. The auditing system will neither detect the truncation nor error upon its occurrence. You may also specify these options in the configuration file: dbpath=data/db auditDestination=syslog

Output to Console To enable auditing and print the audit events to standard output (i.e. stdout), specify console for the --auditDestination setting. For example: mongod --dbpath data/db --auditDestination console You may also specify these options in the configuration file: dbpath=data/db auditDestination=console

Output to JSON File To enable auditing and print audit events to a file in JSON format, specify file for the --auditDestination setting, JSON for the --auditFormat setting, and the output filename for the --auditPath. The --auditPath option accepts either a full or relative path name. For example, the following enables auditing and records audit events to a file with the relative path name of data/db/auditLog.json: mongod --dbpath data/db --auditDestination file --auditFormat JSON --auditPath data/db/auditLog.json The audit file rotates at the same time as the server log file. You may also specify these options in the configuration file: dbpath=data/db auditDestination=file auditFormat=JSON auditPath=data/db/auditLog.json
Note: Printing audit events to a file in JSON format degrades server performance more than printing to a file in BSON format.

Output to BSON File To enable auditing and print audit events to a file in BSON binary format, specify file for the --auditDestination setting, BSON for the --auditFormat setting, and the output filename for the --auditPath. The --auditPath option accepts either a full or relative path name. For example, the following enables auditing and records audit events to a BSON file with the relative path name of data/db/auditLog.bson: mongod --dbpath data/db --auditDestination file --auditFormat BSON --auditPath data/db/auditLog.bson The audit file rotates at the same time as the server log file. You may also specify these options in the configuration file: dbpath=data/db auditDestination=file auditFormat=BSON auditPath=data/db/auditLog.bson To view the contents of the file, pass the file to the MongoDB utility bsondump. For example, the following converts the audit log into a human-readable form and outputs it to the terminal: bsondump data/db/auditLog.bson

Filter Events By default, the audit facility records all auditable operations (page 381). The audit feature has an --auditFilter option to determine which events to record. The --auditFilter option takes a document of the form: { atype: <expression> } The <expression> is a query condition expression to match on various actions (page 381).

Filter for a Single Operation Type For example, to audit only the createCollection (page 375) action, use the filter { atype: "createCollection" }:

Tip To specify the filter as a command-line option, enclose the filter document in single quotes to pass the document as a string.
mongod --dbpath data/db --auditDestination file --auditFilter '{ atype: "createCollection" }' --auditFormat JSON --auditPath data/db/auditLog.json

Filter for Multiple Operation Types To match on multiple operations, use the $in operator in the <expression> as in the following:
Tip To specify the filter as a command-line option, enclose the filter document in single quotes to pass the document as a string.

mongod --dbpath data/db --auditDestination file --auditFilter '{ atype: { $in: [ "createCollection", "dropCollection" ] } }'

Filter on Authentication Operations on a Single Database For authentication operations, you can also specify a specific database with the param.db field: { atype: <expression>, "param.db": <database> } For example, to audit only authenticate operations that occur against the test database, use the filter { atype: "authenticate", "param.db": "test" }:

Tip To specify the filter as a command-line option, enclose the filter document in single quotes to pass the document as a string.

mongod --dbpath data/db --auth --auditDestination file --auditFilter '{ atype: "authenticate", "param.db": "test" }'

To filter on all authenticate operations across databases, use the filter { atype: "authenticate" }.

6.3.7 Create a Vulnerability Report If you believe you have discovered a vulnerability in MongoDB or have experienced a security incident related to MongoDB, please report the issue to aid in its resolution. To report an issue, we strongly suggest filing a ticket in the SECURITY53 project in JIRA. MongoDB, Inc. responds to vulnerability notifications within 48 hours.

Create the Report in JIRA Submit a ticket in the Security54 project at: <https://jira.mongodb.org/browse>. The ticket number will become the reference identification for the issue for its lifetime. You can use this identifier for tracking purposes.

Information to Provide All vulnerability reports should contain as much information as possible so MongoDB’s developers can move quickly to resolve the issue. In particular, please include the following: • The name of the product. • Common Vulnerability information, if applicable, including: • CVSS (Common Vulnerability Scoring System) Score. • CVE (Common Vulnerability and Exposures) Identifier.
• Contact information, including an email address and/or phone number, if applicable. 53https://jira.mongodb.org/browse/SECURITY 54https://jira.mongodb.org/browse/SECURITY 6.3. Security Tutorials 359
Send the Report via Email While JIRA is the preferred reporting method, you may also report vulnerabilities via email to security@mongodb.com55. You may encrypt email using MongoDB’s public key at http://docs.mongodb.org/10gen-security-gpg-key.asc. MongoDB, Inc. responds to vulnerability reports sent via email with a response email that contains a reference number for a JIRA ticket posted to the SECURITY56 project.

Evaluation of a Vulnerability Report MongoDB, Inc. validates all submitted vulnerabilities and uses JIRA to track all communications regarding a vulnerability, including requests for clarification or additional information. If needed, MongoDB representatives set up a conference call to exchange information regarding the vulnerability.

Disclosure MongoDB, Inc. requests that you do not publicly disclose any information regarding the vulnerability or exploit the issue until it has had the opportunity to analyze the vulnerability, to respond to the notification, and to notify key users, customers, and partners. The amount of time required to validate a reported vulnerability depends on the complexity and severity of the issue. MongoDB, Inc. takes all reported vulnerabilities very seriously and will always ensure that there is a clear and open channel of communication with the reporter. After validating an issue, MongoDB, Inc. coordinates public disclosure of the issue with the reporter in a mutually agreed timeframe and format. If required or requested, the reporter of a vulnerability will receive credit in the published security bulletin.

6.4 Security Reference

6.4.1 Security Methods in the mongo Shell

Name Description
db.auth() Authenticates a user to a database.

55security@mongodb.com 56https://jira.mongodb.org/browse/SECURITY
User Management Methods

Name Description
db.addUser() Deprecated. Adds a user to a database, and allows administrators to configure the user’s privileges.
db.changeUserPassword() Changes an existing user’s password.
db.createUser() Creates a new user.
db.dropAllUsers() Deletes all users associated with a database.
db.dropUser() Removes a single user.
db.getUser() Returns information about the specified user.
db.getUsers() Returns information about all users associated with a database.
db.grantRolesToUser() Grants a role and its privileges to a user.
db.removeUser() Deprecated. Removes a user from a database.
db.revokeRolesFromUser() Removes a role from a user.
db.updateUser() Updates user data.

Role Management Methods

Name Description
db.createRole() Creates a role and specifies its privileges.
db.dropAllRoles() Deletes all user-defined roles associated with a database.
db.dropRole() Deletes a user-defined role.
db.getRole() Returns information for the specified role.
db.getRoles() Returns information for all the user-defined roles in a database.
db.grantPrivilegesToRole() Assigns privileges to a user-defined role.
db.grantRolesToRole() Specifies roles from which a user-defined role inherits privileges.
db.revokePrivilegesFromRole() Removes the specified privileges from a user-defined role.
db.revokeRolesFromRole() Removes inherited roles from a role.
db.updateRole() Updates a user-defined role.

6.4.2 Security Reference Documentation Built-In Roles (page 361) Reference on MongoDB provided roles and corresponding access. system.roles Collection (page 369) Describes the content of the collection that stores user-defined roles. system.users Collection (page 372) Describes the content of the collection that stores users’ credentials and role assignments. Resource Document (page 373) Describes the resource document for roles. Privilege Actions (page 375) List of the actions available for privileges.
Default MongoDB Port (page 380) List of default ports used by MongoDB. System Event Audit Messages (page 380) Reference on system event audit messages. Built-In Roles MongoDB grants access to data and commands through role-based authorization (page 285) and provides built-in roles that provide the different levels of access commonly needed in a database system. You can additionally create user-defined roles (page 286). A role grants privileges to perform sets of actions (page 375) on defined resources (page 373). A given role applies to the database on which it is defined and can grant access down to a collection level of granularity. 6.4. Security Reference 361
Each of MongoDB’s built-in roles defines access at the database level for all non-system collections in the role’s database and at the collection level for all system collections (page 270). MongoDB provides the built-in database user (page 362) and database administration (page 363) roles on every database. MongoDB provides all other built-in roles only on the admin database. This section describes the privileges for each built-in role. You can also view the privileges for a built-in role at any time by issuing the rolesInfo command with the showPrivileges and showBuiltinRoles fields both set to true.

Database User Roles Every database includes the following client roles:

read Provides the ability to read data on all non-system collections and on the system.indexes (page 271), system.js (page 271), and system.namespaces (page 271) system collections. The role provides read access by granting the following actions (page 375):
•collStats (page 379)
•dbHash (page 379)
•dbStats (page 379)
•find (page 375)
•killCursors (page 376)

readWrite Provides all the privileges of the read (page 362) role plus the ability to modify data on all non-system collections and the system.js (page 271) collection. The role provides the following actions on those collections:
•collStats (page 379)
•convertToCapped (page 378)
•createCollection (page 375)
•dbHash (page 379)
•dbStats (page 379)
•dropCollection (page 376)
•createIndex (page 375)
•dropIndex (page 378)
•emptycapped (page 376)
•find (page 375)
•insert (page 375)
•killCursors (page 376)
•remove (page 375)
•renameCollectionSameDB (page 378)
•update (page 375)
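The relationship between read and readWrite described above is a simple set containment; a minimal sketch using the action lists just given (plain JavaScript, for illustration only):

```javascript
// The actions granted by the read role, as listed above.
const readActions = ["collStats", "dbHash", "dbStats", "find", "killCursors"];

// The additional write-side actions that readWrite grants.
const writeActions = [
  "convertToCapped", "createCollection", "dropCollection", "createIndex",
  "dropIndex", "emptycapped", "insert", "remove",
  "renameCollectionSameDB", "update"
];

// readWrite provides all the privileges of read plus the write actions,
// i.e. the union of the two sets.
const readWriteActions = [...new Set([...readActions, ...writeActions])];

// Every action granted by read is also granted by readWrite.
const covers = readActions.every(a => readWriteActions.includes(a));
```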
  • 367. MongoDB Documentation, Release 2.6.4 Database Administration Roles Every database includes the following database administration roles: dbAdmin Provides the following actions (page 375) on the database’s system.indexes (page 271), system.namespaces (page 271), and system.profile (page 271) collections: •collStats (page 379) •dbHash (page 379) •dbStats (page 379) •find (page 375) •killCursors (page 376) •dropCollection (page 376) on system.profile (page 271) only Provides the following actions on all non-system collections. This role does not include full read access on non-system collections: •collMod (page 378) •collStats (page 379) •compact (page 378) •convertToCapped (page 378) •createCollection (page 375) •createIndex (page 375) •dbStats (page 379) •dropCollection (page 376) •dropDatabase (page 378) •dropIndex (page 378) •enableProfiler (page 376) •indexStats (page 379) •reIndex (page 378) •renameCollectionSameDB (page 378) •repairDatabase (page 378) •storageDetails (page 377) •validate (page 379) dbOwner The database owner can perform any administrative action on the database. This role combines the privileges granted by the readWrite (page 362), dbAdmin (page 363) and userAdmin (page 363) roles. userAdmin Provides the ability to create and modify roles and users on the current database. This role also indirectly provides superuser (page 368) access to either the database or, if scoped to the admin database, the cluster. The userAdmin (page 363) role allows users to grant any user any privilege, including themselves. The userAdmin (page 363) role explicitly provides the following actions: 6.4. Security Reference 363
  • 368. MongoDB Documentation, Release 2.6.4 •changeCustomData (page 375) •changePassword (page 375) •createRole (page 375) •createUser (page 376) •dropRole (page 376) •dropUser (page 376) •grantRole (page 376) •revokeRole (page 376) •viewRole (page 376) •viewUser (page 376) Cluster Administration Roles The admin database includes the following roles for administering the whole system rather than just a single database. These roles include but are not limited to replica set and sharded cluster administrative functions. clusterAdmin Provides the greatest cluster-management access. This role combines the privileges granted by the clusterManager (page 364), clusterMonitor (page 365), and hostManager (page 366) roles. Ad-ditionally, the role provides the dropDatabase (page 378) action. clusterManager Provides management and monitoring actions on the cluster. A user with this role can access the config and local databases, which are used in sharding and replication, respectively. Provides the following actions on the cluster as a whole: •addShard (page 377) •applicationMessage (page 378) •cleanupOrphaned (page 376) •flushRouterConfig (page 377) •listShards (page 377) •removeShard (page 377) •replSetConfigure (page 377) •replSetGetStatus (page 377) •replSetStateChange (page 377) •resync (page 377) Provides the following actions on all databases in the cluster: •enableSharding (page 377) •moveChunk (page 377) •splitChunk (page 378) •splitVector (page 378) On the config database, provides the following actions on the settings (page 683) collection: 364 Chapter 6. Security
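As with dbOwner above, which combines the privileges of the readWrite, dbAdmin, and userAdmin roles, a combined role can be sketched as a union of action sets. The sets below are illustrative subsets only, not the full action lists:

```javascript
// Illustrative subsets of each component role's actions (not the
// complete lists documented above).
const readWrite = new Set(["find", "insert", "update", "remove"]);
const dbAdmin = new Set(["collMod", "compact", "dropDatabase", "reIndex"]);
const userAdmin = new Set(["createUser", "grantRole", "viewUser"]);

// dbOwner combines the privileges of all three roles: the union.
const dbOwner = new Set([...readWrite, ...dbAdmin, ...userAdmin]);

// dbOwner can perform any action any of the component roles can.
const combinesAll = [readWrite, dbAdmin, userAdmin]
  .every(role => [...role].every(a => dbOwner.has(a)));
```

The grantRole action in the userAdmin component is what makes a dbOwner on the admin database an effective superuser, as the Superuser Roles discussion later in this reference explains.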
  • 369. MongoDB Documentation, Release 2.6.4 •insert (page 375) •remove (page 375) •update (page 375) On the config database, provides the following actions on all configuration collections and on the system.indexes (page 271), system.js (page 271), and system.namespaces (page 271) collec-tions: •collStats (page 379) •dbHash (page 379) •dbStats (page 379) •find (page 375) •killCursors (page 376) On the local database, provides the following actions on the replset (page 600) collection: •collStats (page 379) •dbHash (page 379) •dbStats (page 379) •find (page 375) •killCursors (page 376) clusterMonitor Provides read-only access to monitoring tools, such as the MongoDB Management Service (MMS)57 monitoring agent. Provides the following actions on the cluster as a whole: •connPoolStats (page 379) •cursorInfo (page 379) •getCmdLineOpts (page 379) •getLog (page 379) •getParameter (page 378) •getShardMap (page 377) •hostInfo (page 378) •inprog (page 376) •listDatabases (page 379) •listShards (page 377) •netstat (page 379) •replSetGetStatus (page 377) •serverStatus (page 379) •shardingState (page 377) •top (page 379) 57http://mms.mongodb.com/help/ 6.4. Security Reference 365
  • 370. MongoDB Documentation, Release 2.6.4 Provides the following actions on all databases in the cluster: •collStats (page 379) •dbStats (page 379) •getShardVersion (page 377) Provides the find (page 375) action on all system.profile (page 271) collections in the cluster. Provides the following actions on the config database’s configuration collections and system.indexes (page 271), system.js (page 271), and system.namespaces (page 271) collections: •collStats (page 379) •dbHash (page 379) •dbStats (page 379) •find (page 375) •killCursors (page 376) hostManager Provides the ability to monitor and manage servers. Provides the following actions on the cluster as a whole: •applicationMessage (page 378) •closeAllDatabases (page 378) •connPoolSync (page 378) •cpuProfiler (page 376) •diagLogging (page 379) •flushRouterConfig (page 377) •fsync (page 378) •invalidateUserCache (page 376) •killop (page 376) •logRotate (page 378) •resync (page 377) •setParameter (page 379) •shutdown (page 379) •touch (page 379) •unlock (page 376) Provides the following actions on all databases in the cluster: •killCursors (page 376) •repairDatabase (page 378) 366 Chapter 6. Security
  • 371. MongoDB Documentation, Release 2.6.4 Backup and Restoration Roles The admin database includes the following roles for backing up and restoring data: backup Provides minimal privileges needed for backing up data. This role provides sufficient privileges to use the MongoDB Management Service (MMS)58 backup agent, or to use mongodump to back up an entire mongod instance. Provides the following actions (page 375) on the mms.backup collection in the admin database: •insert (page 375) •update (page 375) Provides the listDatabases (page 379) action on the cluster as a whole. Provides the find (page 375) action on the following: •all non-system collections in the cluster •all the following system collections in the cluster: system.indexes (page 271), system.namespaces (page 271), and system.js (page 271) •the admin.system.users (page 271) and admin.system.roles (page 270) collections •legacy system.users collections from versions of MongoDB prior to 2.6 restore Provides minimal privileges needed for restoring data from backups. This role provides sufficient privileges to use the mongorestore tool to restore an entire mongod instance. Provides the following actions on all non-system collections and system.js (page 271) collections in the cluster; on the admin.system.users (page 271) and admin.system.roles (page 270) collections in the admin database; and on legacy system.users collections from versions of MongoDB prior to 2.6: •collMod (page 378) •createCollection (page 375) •createIndex (page 375) •dropCollection (page 376) •insert (page 375) Provides the following additional actions on admin.system.users (page 271) and legacy system.users collections: •find (page 375) •remove (page 375) •update (page 375) Provides the find (page 375) action on all the system.namespaces (page 271) collections in the cluster. 
Although restore (page 367) includes the ability to modify the documents in the admin.system.users (page 271) collection using normal modification operations, modify these data only through the user management methods. 58http://mms.mongodb.com/help/
  • 372. MongoDB Documentation, Release 2.6.4 All-Database Roles The admin database provides the following roles that apply to all databases in a mongod instance and are roughly equivalent to their single-database equivalents: readAnyDatabase Provides the same read-only permissions as read (page 362), except it applies to all databases in the cluster. The role also provides the listDatabases (page 379) action on the cluster as a whole. readWriteAnyDatabase Provides the same read and write permissions as readWrite (page 362), except it applies to all databases in the cluster. The role also provides the listDatabases (page 379) action on the cluster as a whole. userAdminAnyDatabase Provides the same access to user administration operations as userAdmin (page 363), except it applies to all databases in the cluster. The role also provides the following actions on the cluster as a whole: •authSchemaUpgrade (page 376) •invalidateUserCache (page 376) •listDatabases (page 379) The role also provides the following actions on the admin.system.users (page 271) and admin.system.roles (page 270) collections on the admin database, and on legacy system.users collections from versions of MongoDB prior to 2.6: •collStats (page 379) •dbHash (page 379) •dbStats (page 379) •find (page 375) •killCursors (page 376) •planCacheRead (page 376) The userAdminAnyDatabase (page 368) role does not restrict the permissions that a user can grant. As a result, userAdminAnyDatabase (page 368) users can grant themselves privileges in excess of their cur-rent privileges and even can grant themselves all privileges, even though the role does not explicitly authorize privileges beyond user administration. This role is effectively a MongoDB system superuser (page 368). dbAdminAnyDatabase Provides the same access to database administration operations as dbAdmin (page 363), except it applies to all databases in the cluster. 
The role also provides the listDatabases (page 379) action on the cluster as a whole. Superuser Roles Several roles provide either indirect or direct system-wide superuser access. The following roles provide the ability to assign any user any privilege on any database, which means that users with one of these roles can assign themselves any privilege on any database: • dbOwner (page 363) role, when scoped to the admin database • userAdmin (page 363) role, when scoped to the admin database • userAdminAnyDatabase (page 368) role The following role provides full privileges on all resources: 368 Chapter 6. Security
  • 373. MongoDB Documentation, Release 2.6.4 root Provides access to the operations and all the resources of the readWriteAnyDatabase (page 368), dbAdminAnyDatabase (page 368), userAdminAnyDatabase (page 368) and clusterAdmin (page 364) roles combined. root (page 368) does not include the ability to insert data directly into the system.users (page 271) and system.roles (page 270) collections in the admin database. Therefore, root (page 368) is not suitable for restoring data that have these collections with mongorestore. To perform these kinds of restore operations, provision users with the restore (page 367) role. Internal Role __system MongoDB assigns this role to user objects that represent cluster members, such as replica set members and mongos instances. The role entitles its holder to take any action against any object in the database. Do not assign this role to user objects representing applications or human administrators, other than in excep-tional circumstances. If you need access to all actions on all resources, for example to run the eval or applyOps commands, do not assign this role. Instead, create a user-defined role that grants anyAction (page 380) on anyResource (page 375) and ensure that only the users who needs access to these operations has this access. system.roles Collection New in version 2.6. The system.roles collection in the admin database stores the user-defined roles. To create and manage these user-defined roles, MongoDB provides role management commands. system.roles Schema The documents in the system.roles collection have the following schema: { _id: <system-defined id>, role: "<role name>", db: "<database>", privileges: [ { resource: { <resource> }, actions: [ "<action>", ... ] }, ... ], roles: [ { role: "<role name>", db: "<database>" }, ... ] } A system.roles document has the following fields: 6.4. Security Reference 369
admin.system.roles.role The role (page 369) field is a string that specifies the name of the role.

admin.system.roles.db The db (page 370) field is a string that specifies the database to which the role belongs. MongoDB uniquely identifies each role by the pairing of its name (i.e. role (page 369)) and its database.

admin.system.roles.privileges The privileges (page 370) array contains the privilege documents that define the privileges (page 286) for the role. A privilege document has the following syntax: { resource: { <resource> }, actions: [ "<action>", ... ] } Each privilege document has the following fields:

admin.system.roles.privileges[n].resource A document that specifies the resources upon which the privilege actions (page 370) apply. The document has one of the following forms: { db: <database>, collection: <collection> } or { cluster : true } See Resource Document (page 373) for more details.

admin.system.roles.privileges[n].actions An array of actions permitted on the resource. For a list of actions, see Privilege Actions (page 375).

admin.system.roles.roles The roles (page 370) array contains role documents that specify the roles from which this role inherits (page 286) privileges. A role document has the following syntax: { role: "<role name>", db: "<database>" } A role document has the following fields:

admin.system.roles.roles[n].role The name of the role. A role can be a built-in role (page 361) provided by MongoDB or a user-defined role (page 286).

admin.system.roles.roles[n].db The name of the database where the role is defined.

Examples Consider the following sample documents found in the system.roles collection of the admin database.

A User-Defined Role Specifies Privileges The following is a sample document for a user-defined role appUser defined for the myApp database:
{ _id: "myApp.appUser", role: "appUser", db: "myApp", privileges: [ { resource: { db: "myApp", collection: "" }, actions: [ "find", "createCollection", "dbStats", "collStats" ] }, { resource: { db: "myApp", collection: "logs" }, actions: [ "insert" ] }, { resource: { db: "myApp", collection: "data" }, actions: [ "insert", "update", "remove", "compact" ] }, { resource: { db: "myApp", collection: "system.indexes" }, actions: [ "find" ] }, { resource: { db: "myApp", collection: "system.namespaces" }, actions: [ "find" ] }, ], roles: [] }

The privileges array lists the five privileges that the appUser role specifies:

• The first privilege permits its actions ("find", "createCollection", "dbStats", "collStats") on all the collections in the myApp database excluding its system collections. See Specify a Database as Resource (page 373).

• The next two privileges permit additional actions on specific collections, logs and data, in the myApp database. See Specify a Collection of a Database as Resource (page 373).

• The last two privileges permit actions on two system collections (page 270) in the myApp database. While the first privilege gives database-wide permission for the find action, the action does not apply to myApp’s system collections. To give access to a system collection, a privilege must explicitly specify the collection. See Resource Document (page 373).

As indicated by the empty roles array, appUser inherits no additional privileges from other roles.
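The matching rules just described can be sketched offline; the helpers below are hypothetical and not a MongoDB API, but they model how a resource with an empty collection string covers every non-system collection in its database, while a system collection must be named explicitly:

```javascript
// A resource with collection: "" matches every non-system collection
// in its database; a named collection (including a system collection)
// must match exactly.
function resourceMatches(resource, ns) {
  if (resource.db !== ns.db) return false;
  if (resource.collection === "") {
    return !ns.collection.startsWith("system.");
  }
  return resource.collection === ns.collection;
}

// An action is authorized if some privilege covers the namespace and
// lists the action.
function isAuthorized(privileges, ns, action) {
  return privileges.some(p =>
    resourceMatches(p.resource, ns) && p.actions.includes(action));
}

// The five privileges of the appUser role shown above.
const appUserPrivileges = [
  { resource: { db: "myApp", collection: "" },
    actions: ["find", "createCollection", "dbStats", "collStats"] },
  { resource: { db: "myApp", collection: "logs" }, actions: ["insert"] },
  { resource: { db: "myApp", collection: "data" },
    actions: ["insert", "update", "remove", "compact"] },
  { resource: { db: "myApp", collection: "system.indexes" },
    actions: ["find"] },
  { resource: { db: "myApp", collection: "system.namespaces" },
    actions: ["find"] }
];
```

Under this sketch, appUser can find on myApp.data (via the database-wide privilege) and on myApp.system.indexes (via the explicit privilege), but not on an unlisted system collection such as myApp.system.js.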
User-Defined Role Inherits from Other Roles The following is a sample document for a user-defined role appAdmin defined for the myApp database: The document shows that the appAdmin role specifies privileges as well as inherits privileges from other roles: { _id: "myApp.appAdmin", role: "appAdmin", db: "myApp", privileges: [ { resource: { db: "myApp", collection: "" }, actions: [ "insert", "dbStats", "collStats", "compact", "repairDatabase" ] } ], roles: [ { role: "appUser", db: "myApp" } ] } The privileges array lists the privileges that the appAdmin role specifies. This role has a single privilege that permits its actions ( "insert", "dbStats", "collStats", "compact", "repairDatabase") on all the collections in the myApp database excluding its system collections. See Specify a Database as Resource (page 373). 6.4. Security Reference 371
The roles array lists the roles, identified by the role names and databases, from which the role appAdmin inherits privileges.

system.users Collection

Changed in version 2.6.

The system.users collection in the admin database stores user authentication (page 282) and authorization (page 285) information. To manage data in this collection, MongoDB provides user management commands.

system.users Schema

The documents in the system.users collection have the following schema:

{ _id: <system defined id>,
  user: "<name>",
  db: "<database>",
  credentials: { <authentication credentials> },
  roles: [ { role: "<role name>", db: "<database>" }, ... ],
  customData: <custom information>
}

Each system.users document has the following fields:

admin.system.users.user
The user (page 372) field is a string that identifies the user. A user exists in the context of a single logical database but can have access to other databases through roles specified in the roles (page 372) array.

admin.system.users.db
The db (page 372) field specifies the database associated with the user. The user's privileges are not necessarily limited to this database. The user can have privileges in additional databases through the roles (page 372) array.

admin.system.users.credentials
The credentials (page 372) field contains the user's authentication information. For users with externally stored authentication credentials, such as users that use Kerberos (page 331) or x.509 certificates for authentication, the system.users document for that user does not contain the credentials (page 372) field.

admin.system.users.roles
The roles (page 372) array contains role documents that specify the roles granted to the user. The array can contain both built-in roles (page 361) and user-defined roles (page 286).
A role document has the following syntax:

{ role: "<role name>", db: "<database>" }

A role document has the following fields:

admin.system.users.roles[n].role
The name of a role. A role can be a built-in role (page 361) provided by MongoDB or a custom user-defined role (page 286).

admin.system.users.roles[n].db
The name of the database where role is defined.
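Because a user is created in one logical database but can hold roles on others, the roles array is what determines cross-database access. A small plain-JavaScript sketch of reading that out of a system.users document (illustrative only; databasesWithRoles is a hypothetical helper):

```javascript
// Hypothetical example document following the system.users schema above.
const userDoc = {
  _id: "home.Kari",
  user: "Kari",
  db: "home",
  credentials: { "MONGODB-CR": "<hashed password>" },
  roles: [
    { role: "read", db: "home" },
    { role: "readWrite", db: "test" }
  ],
  customData: { zipCode: "64157" }
};

// Collect the distinct databases on which the user holds at least one role;
// these may differ from the database the user was created in (userDoc.db).
function databasesWithRoles(doc) {
  return [...new Set(doc.roles.map(r => r.db))].sort();
}

databasesWithRoles(userDoc); // → ["home", "test"]
```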
When specifying a role using the role management or user management commands, you can specify the role name alone (e.g. "readWrite") if the role exists on the database on which the command is run.

admin.system.users.customData
The customData (page 373) field contains optional custom information about the user.

Example

Consider the following document in the system.users collection:

{ _id: "home.Kari",
  user: "Kari",
  db: "home",
  credentials: { "MONGODB-CR": "<hashed password>" },
  roles: [ { role: "read", db: "home" },
           { role: "readWrite", db: "test" },
           { role: "appUser", db: "myApp" } ],
  customData: { zipCode: "64157" }
}

The document shows that a user Kari is associated with the home database. Kari has the read (page 362) role in the home database, the readWrite (page 362) role in the test database, and the appUser role in the myApp database.

Resource Document

The resource document specifies the resources upon which a privilege permits actions.

Database and/or Collection Resource

To specify databases and/or collections, use the following syntax:

{ db: <database>, collection: <collection> }

Specify a Collection of a Database as Resource

If the resource document specifies both the db and collection fields as non-empty strings, the resource is the specified collection in the specified database. For example, the following document specifies a resource of the inventory collection in the products database:

{ db: "products", collection: "inventory" }

For a user-defined role scoped for a non-admin database, the resource specification for its privileges must specify the same database as the role. User-defined roles scoped for the admin database can specify other databases.

Specify a Database as Resource

If only the collection field is an empty string (""), the resource is the specified database, excluding the system collections (page 270).
For example, the following resource document specifies the resource of the test database, excluding the system collections:

{ db: "test", collection: "" }
For a user-defined role scoped for a non-admin database, the resource specification for its privileges must specify the same database as the role. User-defined roles scoped for the admin database can specify other databases.

Note: When you specify a database as the resource, the system collections are excluded, unless you name them explicitly, as in the following:

{ db: "test", collection: "system.namespaces" }

System collections include but are not limited to the following:
• <database>.system.profile (page 271)
• <database>.system.namespaces (page 271)
• <database>.system.indexes (page 271)
• <database>.system.js (page 271)
• local.system.replset (page 600)
• system.users Collection (page 372) in the admin database
• system.roles Collection (page 369) in the admin database

Specify Collections Across Databases as Resource

If only the db field is an empty string (""), the resource is all collections with the specified name across all databases. For example, the following document specifies the resource of all the accounts collections across all the databases:

{ db: "", collection: "accounts" }

For user-defined roles, only roles scoped for the admin database can have this resource specification for their privileges.

Specify All Non-System Collections in All Databases

If both the db and collection fields are empty strings (""), the resource is all collections, excluding the system collections (page 270), in all the databases:

{ db: "", collection: "" }

For user-defined roles, only roles scoped for the admin database can have this resource specification for their privileges.

Cluster Resource

To specify the cluster as the resource, use the following syntax:

{ cluster: true }

Use the cluster resource for actions that affect the state of the system rather than act on a specific set of databases or collections. Examples of such actions are shutdown, replSetReconfig, and addShard.
For example, the following document grants the action shutdown on the cluster:

{ resource: { cluster: true }, actions: [ "shutdown" ] }

For user-defined roles, only roles scoped for the admin database can have this resource specification for their privileges.
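The four database/collection resource forms can be summarized in a small matcher. This is a plain-JavaScript illustration of the rules above, not part of any MongoDB API, and resourceCovers is a hypothetical name:

```javascript
// Does a { db, collection } resource document cover a given namespace?
// Empty strings act as wildcards, and a database-wide resource
// (collection: "") excludes system collections unless named explicitly.
function resourceCovers(resource, db, collection) {
  const isSystem = collection.startsWith("system.");
  const dbMatches = resource.db === "" || resource.db === db;
  if (!dbMatches) return false;
  if (resource.collection === "") return !isSystem; // whole database(s), minus system.*
  return resource.collection === collection;        // a named collection, system.* included
}

resourceCovers({ db: "test", collection: "" }, "test", "records");            // → true
resourceCovers({ db: "test", collection: "" }, "test", "system.namespaces"); // → false
resourceCovers({ db: "", collection: "accounts" }, "billing", "accounts");   // → true
resourceCovers({ db: "", collection: "" }, "any", "things");                 // → true
```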
anyResource

The internal resource anyResource gives access to every resource in the system and is intended for internal use. Do not use this resource, other than in exceptional circumstances. The syntax for this resource is { anyResource: true }.

Privilege Actions

New in version 2.6.

Privilege actions define the operations a user can perform on a resource (page 373). A MongoDB privilege (page 286) comprises a resource (page 373) and the permitted actions. This page lists available actions grouped by common purpose.

MongoDB provides built-in roles with pre-defined pairings of resources and permitted actions. For lists of the actions granted, see Built-In Roles (page 361). To define custom roles, see Create a Role (page 347).

Query and Write Actions

find
User can perform the db.collection.find() method. Apply this action to database or collection resources.

insert
User can perform the insert command. Apply this action to database or collection resources.

remove
User can perform the db.collection.remove() method. Apply this action to database or collection resources.

update
User can perform the update command. Apply this action to database or collection resources.

Database Management Actions

changeCustomData
User can change the custom information of any user in the given database. Apply this action to database resources.

changeOwnCustomData
Users can change their own custom information. Apply this action to database resources.

changeOwnPassword
Users can change their own passwords. Apply this action to database resources.

changePassword
User can change the password of any user in the given database. Apply this action to database resources.

createCollection
User can perform the db.createCollection() method. Apply this action to database or collection resources.

createIndex
Provides access to the db.collection.createIndex() method and the createIndexes command. Apply this action to database or collection resources.
createRole
User can create new roles in the given database. Apply this action to database resources.

createUser
User can create new users in the given database. Apply this action to database resources.

dropCollection
User can perform the db.collection.drop() method. Apply this action to database or collection resources.

dropRole
User can delete any role from the given database. Apply this action to database resources.

dropUser
User can remove any user from the given database. Apply this action to database resources.

emptycapped
User can perform the emptycapped command. Apply this action to database or collection resources.

enableProfiler
User can perform the db.setProfilingLevel() method. Apply this action to database resources.

grantRole
User can grant any role in the database to any user from any database in the system. Apply this action to database resources.

killCursors
User can kill cursors on the target collection.

revokeRole
User can remove any role from any user from any database in the system. Apply this action to database resources.

unlock
User can perform the db.fsyncUnlock() method. Apply this action to the cluster resource.

viewRole
User can view information about any role in the given database. Apply this action to database resources.

viewUser
User can view the information of any user in the given database. Apply this action to database resources.

Deployment Management Actions

authSchemaUpgrade
User can perform the authSchemaUpgrade command. Apply this action to the cluster resource.

cleanupOrphaned
User can perform the cleanupOrphaned command. Apply this action to the cluster resource.

cpuProfiler
User can enable and use the CPU profiler. Apply this action to the cluster resource.

inprog
User can use the db.currentOp() method to return pending and active operations. Apply this action to the cluster resource.

invalidateUserCache
Provides access to the invalidateUserCache command.
Apply this action to the cluster resource. killop User can perform the db.killOp() method. Apply this action to the cluster resource. 376 Chapter 6. Security
planCacheRead
User can perform the planCacheListPlans and planCacheListQueryShapes commands and the PlanCache.getPlansByQuery() and PlanCache.listQueryShapes() methods. Apply this action to database or collection resources.

planCacheWrite
User can perform the planCacheClear command and the PlanCache.clear() and PlanCache.clearPlansByQuery() methods. Apply this action to database or collection resources.

storageDetails
User can perform the storageDetails command. Apply this action to database or collection resources.

Replication Actions

appendOplogNote
User can append notes to the oplog. Apply this action to the cluster resource.

replSetConfigure
User can configure a replica set. Apply this action to the cluster resource.

replSetGetStatus
User can perform the replSetGetStatus command. Apply this action to the cluster resource.

replSetHeartbeat
User can perform the replSetHeartbeat command. Apply this action to the cluster resource.

replSetStateChange
User can change the state of a replica set through the replSetFreeze, replSetMaintenance, replSetStepDown, and replSetSyncFrom commands. Apply this action to the cluster resource.

resync
User can perform the resync command. Apply this action to the cluster resource.

Sharding Actions

addShard
User can perform the addShard command. Apply this action to the cluster resource.

enableSharding
User can enable sharding on a database using the enableSharding command and can shard a collection using the shardCollection command. Apply this action to database or collection resources.

flushRouterConfig
User can perform the flushRouterConfig command. Apply this action to the cluster resource.

getShardMap
User can perform the getShardMap command. Apply this action to the cluster resource.

getShardVersion
User can perform the getShardVersion command. Apply this action to database resources.

listShards
User can perform the listShards command. Apply this action to the cluster resource.
moveChunk User can perform the moveChunk command. Apply this action to the cluster resource. removeShard User can perform the removeShard command. Apply this action to the cluster resource. 6.4. Security Reference 377
  • 382. MongoDB Documentation, Release 2.6.4 shardingState User can perform the shardingState command. Apply this action to the cluster resource. splitChunk User can perform the splitChunk command. Apply this action to database or collection resources. splitVector User can perform the splitVector command. Apply this action to database or collection resources. Server Administration Actions applicationMessage User can perform the logApplicationMessage command. Apply this action to the cluster resource. closeAllDatabases User can perform the closeAllDatabases command. Apply this action to the cluster resource. collMod User can perform the collMod command. Apply this action to database or collection resources. compact User can perform the compact command. Apply this action to database or collection resources. connPoolSync User can perform the connPoolSync command. Apply this action to the cluster resource. convertToCapped User can perform the convertToCapped command. Apply this action to database or collection resources. dropDatabase User can perform the dropDatabase command. Apply this action to database resources. dropIndex User can perform the dropIndexes command. Apply this action to database or collection resources. fsync User can perform the fsync command. Apply this action to the cluster resource. getParameter User can perform the getParameter command. Apply this action to the cluster resource. hostInfo Provides information about the server the MongoDB instance runs on. Apply this action to the cluster resource. logRotate User can perform the logRotate command. Apply this action to the cluster resource. reIndex User can perform the reIndex command. Apply this action to database or collection resources. renameCollectionSameDB Allows the user to rename collections on the current database using the renameCollection command. Apply this action to database resources. 
Additionally, the user must either have find (page 375) on the source collection or not have find (page 375) on the destination collection. If a collection with the new name already exists, the user must also have the dropCollection (page 376) action on the destination collection. 378 Chapter 6. Security
  • 383. MongoDB Documentation, Release 2.6.4 repairDatabase User can perform the repairDatabase command. Apply this action to database resources. setParameter User can perform the setParameter command. Apply this action to the cluster resource. shutdown User can perform the shutdown command. Apply this action to the cluster resource. touch User can perform the touch command. Apply this action to the cluster resource. Diagnostic Actions collStats User can perform the collStats command. Apply this action to database or collection resources. connPoolStats User can perform the connPoolStats and shardConnPoolStats commands. Apply this action to the cluster resource. cursorInfo User can perform the cursorInfo command. Apply this action to the cluster resource. dbHash User can perform the dbHash command. Apply this action to database or collection resources. dbStats User can perform the dbStats command. Apply this action to database resources. diagLogging User can perform the diagLogging command. Apply this action to the cluster resource. getCmdLineOpts User can perform the getCmdLineOpts command. Apply this action to the cluster resource. getLog User can perform the getLog command. Apply this action to the cluster resource. indexStats User can perform the indexStats command. Apply this action to database or collection resources. listDatabases User can perform the listDatabases command. Apply this action to the cluster resource. netstat User can perform the netstat command. Apply this action to the cluster resource. serverStatus User can perform the serverStatus command. Apply this action to the cluster resource. validate User can perform the validate command. Apply this action to database or collection resources. top User can perform the top command. Apply this action to the cluster resource. 6.4. Security Reference 379
Internal Actions

anyAction
Allows any action on a resource. Do not assign this action except for exceptional circumstances.

internal
Allows internal actions. Do not assign this action except for exceptional circumstances.

Default MongoDB Port

The following table lists the default ports used by MongoDB:

Default Port  Description
27017         The default port for mongod and mongos instances. You can change this port with port or --port.
27018         The default port when running with --shardsvr runtime operation or the shardsvr value for the clusterRole setting in a configuration file.
27019         The default port when running with --configsvr runtime operation or the configsvr value for the clusterRole setting in a configuration file.
28017         The default port for the web status page. The web status page is always accessible at a port number that is 1000 greater than the port determined by port.

System Event Audit Messages

Note: The audit system (page 290) is available only in MongoDB Enterprise59.

The event auditing feature (page 290) can record events in JSON format. The recorded JSON messages have the following syntax:

{ atype: <String>,
  ts: { "$date": <timestamp> },
  local: { ip: <String>, port: <int> },
  remote: { ip: <String>, port: <int> },
  users: [ { user: <String>, db: <String> }, ... ],
  params: <document>,
  result: <int>
}

field String atype Action type. See Event Actions, Details, and Results (page 381).
field document ts Document that contains the date and UTC time of the event, in ISO 8601 format.
field document local Document that contains the local ip address and the port number of the running instance.
field document remote Document that contains the remote ip address and the port number of the incoming connection associated with the event.
field array users Array of user identification documents. Because MongoDB allows a session to log in with a different user per database, this array can have more than one user.
Each document contains a user field for the username and a db field for the authentication database for that user.

59 http://www.mongodb.com/products/mongodb-enterprise
field document params Specific details for the event. See Event Actions, Details, and Results (page 381).
field integer result Error code. See Event Actions, Details, and Results (page 381).

Event Actions, Details, and Results

The following table lists, for each atype or action type, the associated params details and the result values, if any.

authenticate
  params: { user: <user name>, db: <database>, mechanism: <mechanism> }
  result: 0 - Success; 18 - Authentication Failed

authCheck
  params: { command: <name>, ns: <database>.<collection>, args: <command object> }
  result: 0 - Success; 13 - Unauthorized to perform the operation.
  notes: The auditing system logs only authorization failures. The ns field is optional. The args field may be redacted.

createCollection (page 375)
  params: { ns: <database>.<collection> }
  result: 0 - Success

createDatabase
  params: { ns: <database> }
  result: 0 - Success

createIndex (page 375)
  params: { ns: <database>.<collection>, indexName: <index name>, indexSpec: <full index specification> }
  result: 0 - Success

renameCollection
  params: { old: <database>.<collection>, new: <database>.<collection> }
  result: 0 - Success

dropCollection (page 376)
  params: { ns: <database>.<collection> }
  result: 0 - Success

dropDatabase (page 378)
  params: { ns: <database> }
  result: 0 - Success

dropIndex (page 378)
  params: { ns: <database>.<collection>, indexName: <index name> }
  result: 0 - Success

createUser (page 376)
  params: { user: <user name>, db: <database>, customData: <document>, roles: [ <role1>, ... ] }
  result: 0 - Success
  notes: The customData field is optional.

dropUser (page 376)
  params: { user: <user name>, db: <database> }
  result: 0 - Success

dropAllUsersFromDatabase
  params: { db: <database> }
  result: 0 - Success

updateUser
  params: { user: <user name>, db: <database>, passwordChanged: <boolean>, customData: <document>, roles: [ <role1>, ... ] }
  result: 0 - Success
  notes: The customData field is optional.

grantRolesToUser
  params: { user: <user name>, db: <database>, roles: [ <role1>, ... ] }
  result: 0 - Success
  notes: The roles array contains role documents. See role Document (page 384).

revokeRolesFromUser
  params: { user: <user name>, db: <database>, roles: [ <role1>, ... ] }
  result: 0 - Success
  notes: The roles array contains role documents. See role Document (page 384).

createRole (page 375)
  params: { role: <role name>, db: <database>, roles: [ <role1>, ... ], privileges: [ <privilege1>, ... ] }
  result: 0 - Success
  notes: Either the roles or the privileges field can be optional. The roles array contains role documents; see role Document (page 384). The privileges array contains privilege documents; see privilege Document (page 384).

updateRole
  params: { role: <role name>, db: <database>, roles: [ <role1>, ... ], privileges: [ <privilege1>, ... ] }
  result: 0 - Success
  notes: Either the roles or the privileges field can be optional. The roles array contains role documents; see role Document (page 384). The privileges array contains privilege documents; see privilege Document (page 384).

dropRole (page 376)
  params: { role: <role name>, db: <database> }
  result: 0 - Success

dropAllRolesFromDatabase
  params: { db: <database> }
  result: 0 - Success

grantRolesToRole
  params: { role: <role name>, db: <database>, roles: [ <role1>, ... ] }
  result: 0 - Success
  notes: The roles array contains role documents. See role Document (page 384).

revokeRolesFromRole
  params: { role: <role name>, db: <database>, roles: [ <role1>, ... ] }
  result: 0 - Success
  notes: The roles array contains role documents. See role Document (page 384).

grantPrivilegesToRole
  params: { role: <role name>, db: <database>, privileges: [ <privilege1>, ... ] }
  result: 0 - Success
  notes: The privileges array contains privilege documents. See privilege Document (page 384).

revokePrivilegesFromRole
  params: { role: <role name>, db: <database name>, privileges: [ <privilege1>, ... ] }
  result: 0 - Success
  notes: The privileges array contains privilege documents. See privilege Document (page 384).

replSetReconfig
  params: { old: <configuration>, new: <configuration> }
  result: 0 - Success

enableSharding (page 377)
  params: { ns: <database> }
  result: 0 - Success

shardCollection
  params: { ns: <database>.<collection>, key: <shard key pattern>, options: { unique: <boolean> } }
  result: 0 - Success

addShard (page 377)
  params: { shard: <shard name>, connectionString: <hostname>:<port>, maxSize: <maxSize> }
  result: 0 - Success
  notes: When a shard is a replica set, the connectionString includes the replica set name and can include other members of the replica set.

removeShard (page 377)
  params: { shard: <shard name> }
  result: 0 - Success

shutdown (page 379)
  params: { }
  result: 0 - Success
  notes: Indicates commencement of database shutdown.

applicationMessage (page 378)
  params: { msg: <custom message string> }
  result: 0 - Success
  notes: See logApplicationMessage.

Additional Information

role Document

The <role> document in the roles array has the following form:

{ role: <role name>, db: <database> }

privilege Document

The <privilege> document in the privilege array has the following form:
{ resource: <resource document>, actions: [ <action>, ... ] }

See Resource Document (page 373) for details on the resource document. For a list of actions, see Privilege Actions (page 375).

6.4.3 Security Release Notes

Alerts

Security Release Notes (page 385) Security vulnerability for password.

Security Release Notes

Access to system.users Collection

Changed in version 2.4.

In 2.4, only users with the userAdmin role have access to the system.users collection. In version 2.2 and earlier, the read-write users of a database all have access to the system.users collection, which contains the user names and user password hashes. 60

Password Hashing Insecurity

If a user has the same password for multiple databases, the hash will be the same. A malicious user could exploit this to gain access to a second database using a different user's credentials. As a result, always use unique username and password combinations for each database.

Thanks to Will Urbanski, from Dell SecureWorks, for identifying this issue.

60 Read-only users do not have access to the system.users collection.
CHAPTER 7

Aggregation

Aggregation operations process data records and return computed results. Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result. MongoDB provides three ways to perform aggregation: the aggregation pipeline (page 391), the map-reduce function (page 394), and single purpose aggregation methods and commands (page 395).

Aggregation Introduction (page 387) A high-level introduction to aggregation.

Aggregation Concepts (page 391) Introduces the use and operation of the data aggregation modalities available in MongoDB.

  Aggregation Pipeline (page 391) The aggregation pipeline is a framework for performing aggregation tasks, modeled on the concept of data processing pipelines. Using this framework, MongoDB passes the documents of a single collection through a pipeline. The pipeline transforms the documents into aggregated results, and is accessed through the aggregate database command.

  Map-Reduce (page 394) Map-reduce is a generic multi-phase data aggregation modality for processing quantities of data. MongoDB provides map-reduce with the mapReduce database command.

  Single Purpose Aggregation Operations (page 395) MongoDB provides a collection of specific data aggregation operations to support a number of common data aggregation functions. These operations include returning counts of documents, distinct values of a field, and simple grouping operations.

  Aggregation Mechanics (page 398) Details internal optimization operations, limits, support for sharded collections, and concurrency concerns.

Aggregation Examples (page 403) Examples and tutorials for data aggregation operations in MongoDB.

Aggregation Reference (page 419) References for all aggregation operations material for all data aggregation methods in MongoDB.

7.1 Aggregation Introduction

Aggregations are operations that process data records and return computed results.
MongoDB provides a rich set of aggregation operations that examine and perform calculations on the data sets. Running data aggregation on the mongod instance simplifies application code and limits resource requirements.

Like queries, aggregation operations in MongoDB use collections of documents as an input and return results in the form of one or more documents.
7.1.1 Aggregation Modalities

Aggregation Pipelines

MongoDB 2.2 introduced a new aggregation framework (page 391), modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms the documents into an aggregated result. The most basic pipeline stages provide filters that operate like queries and document transformations that modify the form of the output document.

Other pipeline operations provide tools for grouping and sorting documents by specific field or fields as well as tools for aggregating the contents of arrays, including arrays of documents. In addition, pipeline stages can use operators for tasks such as calculating the average or concatenating a string.

The pipeline provides efficient data aggregation using native operations within MongoDB, and is the preferred method for data aggregation in MongoDB.

Figure 7.1: Diagram of the annotated aggregation pipeline operation. The aggregation pipeline has two stages: $match and $group.

Map-Reduce

MongoDB also provides map-reduce (page 394) operations to perform aggregation. In general, map-reduce operations have two phases: a map stage that processes each document and emits one or more objects for each input document,
and a reduce phase that combines the output of the map operation. Optionally, map-reduce can have a finalize stage to make final modifications to the result. Like other aggregation operations, map-reduce can specify a query condition to select the input documents as well as sort and limit the results.

Map-reduce uses custom JavaScript functions to perform the map and reduce operations, as well as the optional finalize operation. While the custom JavaScript provides great flexibility compared to the aggregation pipeline, in general, map-reduce is less efficient and more complex than the aggregation pipeline.

Note: Starting in MongoDB 2.4, certain mongo shell functions and properties are inaccessible in map-reduce operations. MongoDB 2.4 also provides support for multiple JavaScript operations to run at the same time. Before MongoDB 2.4, JavaScript code executed in a single thread, raising concurrency issues for map-reduce.

Figure 7.2: Diagram of the annotated map-reduce operation.

Single Purpose Aggregation Operations

For a number of common single purpose aggregation operations (page 395), MongoDB provides special purpose database commands. These common aggregation operations are: returning a count of matching documents, returning the distinct values for a field, and grouping data based on the values of a field. All of these operations aggregate documents from a single collection. While these operations provide simple access to common aggregation processes, they lack the flexibility and capabilities of the aggregation pipeline and map-reduce.
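The map, reduce, and optional finalize phases described above can be mimicked in plain JavaScript to see the data flow. This sketch only models the semantics; it is not the mapReduce command, and miniMapReduce is an invented name:

```javascript
// Model of map-reduce: map emits (key, value) pairs per document,
// reduce folds all values emitted for a key, finalize post-processes.
function miniMapReduce(docs, map, reduce, finalize) {
  const emitted = new Map();
  for (const doc of docs) {
    map(doc, (key, value) => {
      if (!emitted.has(key)) emitted.set(key, []);
      emitted.get(key).push(value);
    });
  }
  const out = {};
  for (const [key, values] of emitted) {
    out[key] = finalize(key, reduce(key, values));
  }
  return out;
}

const orders = [
  { cust_id: "A123", amount: 500 },
  { cust_id: "A123", amount: 250 },
  { cust_id: "B212", amount: 200 }
];

const totals = miniMapReduce(
  orders,
  (doc, emit) => emit(doc.cust_id, doc.amount),       // map
  (key, values) => values.reduce((a, b) => a + b, 0), // reduce
  (key, total) => total                               // finalize (identity here)
);
// totals → { A123: 750, B212: 200 }
```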
Figure 7.3: Diagram of the annotated distinct operation.
7.1.2 Additional Features and Behaviors

Both the aggregation pipeline and map-reduce can operate on a sharded collection (page 607). Map-reduce operations can also output to a sharded collection. See Aggregation Pipeline and Sharded Collections (page 401) and Map-Reduce and Sharded Collections (page 402) for details.

The aggregation pipeline can use indexes to improve its performance during some of its stages. In addition, the aggregation pipeline has an internal optimization phase. See Pipeline Operators and Indexes (page 393) and Aggregation Pipeline Optimization (page 398) for details.

For a feature comparison of the aggregation pipeline, map-reduce, and the special group functionality, see Aggregation Commands Comparison (page 424).

7.2 Aggregation Concepts

MongoDB provides three approaches to aggregation, each with its own strengths and purposes for a given situation. This section describes these approaches and also describes behaviors and limitations specific to each approach. See also the chart (page 424) that compares the approaches.

Aggregation Pipeline (page 391) The aggregation pipeline is a framework for performing aggregation tasks, modeled on the concept of data processing pipelines. Using this framework, MongoDB passes the documents of a single collection through a pipeline. The pipeline transforms the documents into aggregated results, and is accessed through the aggregate database command.

Map-Reduce (page 394) Map-reduce is a generic multi-phase data aggregation modality for processing quantities of data. MongoDB provides map-reduce with the mapReduce database command.

Single Purpose Aggregation Operations (page 395) MongoDB provides a collection of specific data aggregation operations to support a number of common data aggregation functions. These operations include returning counts of documents, distinct values of a field, and simple grouping operations.
Aggregation Mechanics (page 398) Details internal optimization operations, limits, support for sharded collections, and concurrency concerns.

7.2.1 Aggregation Pipeline

New in version 2.2.

The aggregation pipeline is a framework for data aggregation modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms the documents into aggregated results.

The aggregation pipeline provides an alternative to map-reduce and may be the preferred solution for aggregation tasks where the complexity of map-reduce may be unwarranted.

The aggregation pipeline has some limitations on value types and result size. See Aggregation Pipeline Limits (page 401) for details on limits and restrictions on the aggregation pipeline.

Pipeline

The MongoDB aggregation pipeline consists of stages. Each stage transforms the documents as they pass through the pipeline. Pipeline stages do not need to produce one output document for every input document; e.g., some stages may generate new documents or filter out documents. Pipeline stages can appear multiple times in the pipeline.

MongoDB provides the db.collection.aggregate() method in the mongo shell and the aggregate command for the aggregation pipeline. See aggregation-pipeline-operator-reference for the available stages.
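The stage-by-stage model described above can be sketched in plain JavaScript, outside MongoDB: each stage is a function from an array of documents to a new array, and the pipeline is just function composition. The collection contents and field names here are illustrative only.

```javascript
// Minimal in-memory sketch of the pipeline concept.
const docs = [
  { status: "A", amount: 50 },
  { status: "A", amount: 100 },
  { status: "B", amount: 25 }
];

// Stage 1: analogous to $match -- filter out non-matching documents.
const matchStage = (input) => input.filter((d) => d.status === "A");

// Stage 2: analogous to $group with $sum -- one output document per key.
const groupStage = (input) => {
  const totals = {};
  for (const d of input) {
    totals[d.status] = (totals[d.status] || 0) + d.amount;
  }
  return Object.entries(totals).map(([k, v]) => ({ _id: k, total: v }));
};

// Run the documents through the stages in order.
const result = [matchStage, groupStage].reduce((acc, stage) => stage(acc), docs);
console.log(result); // [ { _id: 'A', total: 150 } ]
```

Note how the group stage emits fewer documents than it receives, illustrating that stages need not produce one output document per input document.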
Figure 7.4: Diagram of the annotated aggregation pipeline operation. The aggregation pipeline has two stages: $match and $group.
For example usage of the aggregation pipeline, consider Aggregation with User Preference Data (page 407) and Aggregation with the Zip Code Data Set (page 404).

Pipeline Expressions

Some pipeline stages take a pipeline expression as their operand. Pipeline expressions specify the transformation to apply to the input documents. Expressions have a document (page 158) structure and can contain other expressions (page 420).

Pipeline expressions can only operate on the current document in the pipeline and cannot refer to data from other documents: expression operations provide in-memory transformation of documents.

Generally, expressions are stateless and are only evaluated when seen by the aggregation process, with one exception: accumulator expressions. The accumulators, used with the $group pipeline operator, maintain their state (e.g. totals, maximums, minimums, and related data) as documents progress through the pipeline.

For more information on expressions, see Expressions (page 420).

Aggregation Pipeline Behavior

In MongoDB, the aggregate command operates on a single collection, logically passing the entire collection into the aggregation pipeline. To optimize the operation, wherever possible, use the following strategies to avoid scanning the entire collection.

Pipeline Operators and Indexes

The $match and $sort pipeline operators can take advantage of an index when they occur at the beginning of the pipeline.

New in version 2.4: The $geoNear pipeline operator takes advantage of a geospatial index. When using $geoNear, the $geoNear pipeline operation must appear as the first stage in an aggregation pipeline.

Even when the pipeline uses an index, aggregation still requires access to the actual documents; i.e. indexes cannot fully cover an aggregation pipeline.

Changed in version 2.6: In previous versions, for very select use cases, an index could cover a pipeline.
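The stateless-versus-accumulator distinction drawn in the Pipeline Expressions discussion above can be sketched in plain JavaScript. The documents and field names are illustrative; the point is only that a stateless expression is evaluated per document, while an accumulator carries running state across documents.

```javascript
const docs = [{ qty: 2 }, { qty: 3 }, { qty: 5 }];

// Stateless, per-document expression (analogous to { $multiply: ["$qty", 10] }):
// each evaluation sees only the current document.
const scaled = docs.map((d) => ({ scaled: d.qty * 10 }));

// Accumulator expression (analogous to { $sum: "$qty" } inside $group):
// its running state survives from one document to the next.
let state = 0; // the accumulator's state
for (const d of docs) state += d.qty;

console.log(scaled); // [ { scaled: 20 }, { scaled: 30 }, { scaled: 50 } ]
console.log(state);  // 10
```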
Early Filtering

If your aggregation operation requires only a subset of the data in a collection, use the $match, $limit, and $skip stages to restrict the documents that enter at the beginning of the pipeline. When placed at the beginning of a pipeline, $match operations use suitable indexes to scan only the matching documents in a collection.

Placing a $match pipeline stage followed by a $sort stage at the start of the pipeline is logically equivalent to a single query with a sort and can use an index. When possible, place $match operators at the beginning of the pipeline.

Additional Features

The aggregation pipeline has an internal optimization phase that provides improved performance for certain sequences of operators. For details, see Aggregation Pipeline Optimization (page 398).

The aggregation pipeline supports operations on sharded collections. See Aggregation Pipeline and Sharded Collections (page 401).
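The payoff of early filtering can be made concrete with a small plain-JavaScript sketch: moving the restrictive filter to the front means later stages process fewer documents. The counters below (not part of any MongoDB API) just make the reduction in work visible.

```javascript
// 100 synthetic documents; one in ten has status "A".
const docs = Array.from({ length: 100 }, (_, i) => ({
  n: i,
  status: i % 10 === 0 ? "A" : "B"
}));

// Filtering last: the transform stage touches every document.
let processedLate = 0;
docs
  .map((d) => { processedLate++; return { n: d.n * 2, status: d.status }; })
  .filter((d) => d.status === "A");

// Filtering first (the "early filtering" strategy): the transform stage
// only sees the matching subset.
let processedEarly = 0;
docs
  .filter((d) => d.status === "A")
  .map((d) => { processedEarly++; return { n: d.n * 2, status: d.status }; });

console.log(processedLate, processedEarly); // 100 10
```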
7.2.2 Map-Reduce

Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. For map-reduce operations, MongoDB provides the mapReduce database command.

Consider the following map-reduce operation:

Figure 7.5: Diagram of the annotated map-reduce operation.

In this map-reduce operation, MongoDB applies the map phase to each input document (i.e. the documents in the collection that match the query condition). The map function emits key-value pairs. For those keys that have multiple values, MongoDB applies the reduce phase, which collects and condenses the aggregated data. MongoDB then stores the results in a collection. Optionally, the output of the reduce function may pass through a finalize function to further condense or process the results of the aggregation.

All map-reduce functions in MongoDB are JavaScript and run within the mongod process. Map-reduce operations take the documents of a single collection as the input and can perform any arbitrary sorting and limiting before beginning the map stage. mapReduce can return the results of a map-reduce operation as a document, or may write the results to collections. The input and the output collections may be sharded.

Note: For most aggregation operations, the Aggregation Pipeline (page 391) provides better performance and a more coherent interface. However, map-reduce operations provide some flexibility that is not presently available in the aggregation pipeline.
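The map and reduce phases described above can be sketched in plain JavaScript (this runs outside the mongod JavaScript environment; the orders collection and its fields are illustrative): map emits key-value pairs, values are collected per key, and reduce is applied only to keys with multiple values, mirroring the flow in the diagram.

```javascript
const orders = [
  { cust_id: "a", amount: 10 },
  { cust_id: "a", amount: 15 },
  { cust_id: "b", amount: 5 }
];

const emitted = {}; // key -> array of emitted values
const emit = (key, value) => (emitted[key] = emitted[key] || []).push(value);

// Map phase: invoked once per input document; emits key-value pairs.
orders.forEach(function map(doc) { emit(doc.cust_id, doc.amount); });

// Reduce phase: condenses the values for a key into a single result.
const reduce = (key, values) => values.reduce((a, b) => a + b, 0);

// Only keys with multiple values pass through reduce.
const results = Object.entries(emitted).map(([key, values]) => ({
  _id: key,
  value: values.length > 1 ? reduce(key, values) : values[0]
}));
console.log(results); // [ { _id: 'a', value: 25 }, { _id: 'b', value: 5 } ]
```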
Map-Reduce JavaScript Functions

In MongoDB, map-reduce operations use custom JavaScript functions to map, or associate, values to a key. If a key has multiple values mapped to it, the operation reduces the values for the key to a single object.

The use of custom JavaScript functions provides flexibility to map-reduce operations. For instance, when processing a document, the map function can create more than one key and value mapping, or no mapping at all. Map-reduce operations can also use a custom JavaScript function to make final modifications to the results at the end of the map and reduce operation, such as performing additional calculations.

Map-Reduce Behavior

In MongoDB, the map-reduce operation can write results to a collection or return the results inline. If you write map-reduce output to a collection, you can perform subsequent map-reduce operations on the same input collection that replace, merge, or reduce new results with previous results. See mapReduce and Perform Incremental Map-Reduce (page 413) for details and examples. When returning the results of a map-reduce operation inline, the result documents must be within the BSON Document Size limit, which is currently 16 megabytes. For additional information on limits and restrictions on map-reduce operations, see the http://docs.mongodb.org/manual/reference/command/mapReduce reference page.

MongoDB supports map-reduce operations on sharded collections (page 607). Map-reduce operations can also output the results to a sharded collection. See Map-Reduce and Sharded Collections (page 402).

7.2.3 Single Purpose Aggregation Operations

Aggregation refers to a broad class of data manipulation operations that compute a result based on an input and a specific procedure. MongoDB provides a number of aggregation operations that perform specific aggregation operations on a set of data.
Although limited in scope, particularly compared to the aggregation pipeline (page 391) and map-reduce (page 394), these operations provide straightforward semantics for common data processing operations.

Count

MongoDB can return a count of the number of documents that match a query. The count command, as well as the count() and cursor.count() methods, provides access to counts in the mongo shell.

Example

Given a collection named records with only the following documents:

{ a: 1, b: 0 }
{ a: 1, b: 1 }
{ a: 1, b: 4 }
{ a: 2, b: 2 }

The following operation would count all documents in the collection and return the number 4:

db.records.count()

The following operation will count only the documents where the value of the field a is 1 and return 3:

db.records.count( { a: 1 } )
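The two counts above can be simulated in plain JavaScript over the same four documents, which makes the distinction between an unqualified count and a query-qualified count explicit:

```javascript
// The same records documents as in the example above.
const records = [
  { a: 1, b: 0 }, { a: 1, b: 1 }, { a: 1, b: 4 }, { a: 2, b: 2 }
];

// Analogous to db.records.count(): count every document.
const total = records.length;

// Analogous to db.records.count( { a: 1 } ): count only matches.
const matching = records.filter((d) => d.a === 1).length;

console.log(total);    // 4
console.log(matching); // 3
```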
Distinct

The distinct operation takes a number of documents that match a query and returns all of the unique values for a field in the matching documents. The distinct command and db.collection.distinct() method provide this operation in the mongo shell. Consider the following examples of a distinct operation:

Figure 7.6: Diagram of the annotated distinct operation.

Example

Given a collection named records with only the following documents:

{ a: 1, b: 0 }
{ a: 1, b: 1 }
{ a: 1, b: 1 }
{ a: 1, b: 4 }
{ a: 2, b: 2 }
{ a: 2, b: 2 }

Consider the following db.collection.distinct() operation which returns the distinct values of the field b:

db.records.distinct( "b" )

The results of this operation would resemble:

[ 0, 1, 4, 2 ]

Group

The group operation takes a number of documents that match a query, and then collects groups of documents based on the value of a field or fields. It returns an array of documents with computed results for each group of documents. Access the grouping functionality via the group command or the db.collection.group() method in the mongo shell.

Warning: group does not support data in sharded collections. In addition, the results of the group operation must be no larger than 16 megabytes.

Consider the following group operation:

Example

Given a collection named records with the following documents:

{ a: 1, count: 4 }
{ a: 1, count: 2 }
{ a: 1, count: 4 }
{ a: 2, count: 3 }
{ a: 2, count: 1 }
{ a: 1, count: 5 }
{ a: 4, count: 4 }

Consider the following group operation which groups documents by the field a, where a is less than 3, and sums the field count for each group:

db.records.group( {
  key: { a: 1 },
  cond: { a: { $lt: 3 } },
  reduce: function(cur, result) { result.count += cur.count },
  initial: { count: 0 }
} )

The results of this group operation would resemble the following:

[ { a: 1, count: 15 }, { a: 2, count: 4 } ]

See also:

$group for related functionality in the aggregation pipeline (page 391).
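The mechanics of the group operation above (filter by cond, bucket by the key field, fold each bucket with reduce starting from initial) can be sketched in plain JavaScript over the same documents:

```javascript
const records = [
  { a: 1, count: 4 }, { a: 1, count: 2 }, { a: 1, count: 4 },
  { a: 2, count: 3 }, { a: 2, count: 1 }, { a: 1, count: 5 },
  { a: 4, count: 4 }
];

const cond = (doc) => doc.a < 3;                                // cond: { a: { $lt: 3 } }
const reduceFn = (cur, result) => { result.count += cur.count; }; // the reduce function

const groups = {};
for (const doc of records.filter(cond)) {
  if (!groups[doc.a]) groups[doc.a] = { a: doc.a, count: 0 };   // initial: { count: 0 }
  reduceFn(doc, groups[doc.a]);
}
const out = Object.values(groups);
console.log(out); // [ { a: 1, count: 15 }, { a: 2, count: 4 } ]
```

The document with a: 4 never reaches a bucket because cond excludes it, matching the result shown above.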
7.2.4 Aggregation Mechanics

This section describes behaviors and limitations for the various aggregation modalities.

Aggregation Pipeline Optimization (page 398) Details the internal optimization of certain pipeline sequences.
Aggregation Pipeline Limits (page 401) Presents limitations on aggregation pipeline operations.
Aggregation Pipeline and Sharded Collections (page 401) Mechanics of aggregation pipeline operations on sharded collections.
Map-Reduce and Sharded Collections (page 402) Mechanics of map-reduce operations with sharded collections.
Map Reduce Concurrency (page 403) Details the locks taken during map-reduce operations.

Aggregation Pipeline Optimization

Aggregation pipeline operations have an optimization phase which attempts to reshape the pipeline for improved performance. To see how the optimizer transforms a particular aggregation pipeline, include the explain option in the db.collection.aggregate() method. Optimizations are subject to change between releases.

Projection Optimization

The aggregation pipeline can determine if it requires only a subset of the fields in the documents to obtain the results. If so, the pipeline will only use those required fields, reducing the amount of data passing through the pipeline.

Pipeline Sequence Optimization

$sort + $match Sequence Optimization

When you have a sequence with $sort followed by a $match, the $match moves before the $sort to minimize the number of objects to sort. For example, if the pipeline consists of the following stages:

{ $sort: { age : -1 } },
{ $match: { status: 'A' } }

During the optimization phase, the optimizer transforms the sequence to the following:

{ $match: { status: 'A' } },
{ $sort: { age : -1 } }

$skip + $limit Sequence Optimization

When you have a sequence with $skip followed by a $limit, the $limit moves before the $skip. With the reordering, the $limit value increases by the $skip amount.
For example, if the pipeline consists of the following stages:

{ $skip: 10 },
{ $limit: 5 }

During the optimization phase, the optimizer transforms the sequence to the following:

{ $limit: 15 },
{ $skip: 10 }
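The reordering rule above can be expressed as a small plain-JavaScript function (a sketch, not MongoDB's implementation): swap the stages and grow the limit by the skip amount, so the same window of documents survives.

```javascript
// Stages are modeled as simplified { skip } / { limit } objects.
function reorderSkipLimit(skipStage, limitStage) {
  return [
    { limit: limitStage.limit + skipStage.skip }, // $limit moves first, enlarged by the skip
    { skip: skipStage.skip }                      // $skip keeps its original amount
  ];
}

const reordered = reorderSkipLimit({ skip: 10 }, { limit: 5 });
console.log(reordered); // [ { limit: 15 }, { skip: 10 } ]
```

Taking 15 documents and then skipping 10 yields the same 5 documents as skipping 10 and taking 5, which is why the transformation is safe.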
This optimization allows for more opportunities for $sort + $limit Coalescence (page 399), such as with $sort + $skip + $limit sequences. See $sort + $limit Coalescence (page 399) for details on the coalescence and $sort + $skip + $limit Sequence (page 400) for an example.

For aggregation operations on sharded collections (page 401), this optimization reduces the results returned from each shard.

$redact + $match Sequence Optimization

When possible, when the pipeline has the $redact stage immediately followed by the $match stage, the aggregation can sometimes add a portion of the $match stage before the $redact stage. If the added $match stage is at the start of a pipeline, the aggregation can use an index as well as query the collection to limit the number of documents that enter the pipeline. See Pipeline Operators and Indexes (page 393) for more information.

For example, if the pipeline consists of the following stages:

{ $redact: { $cond: { if: { $eq: [ "$level", 5 ] }, then: "$$PRUNE", else: "$$DESCEND" } } },
{ $match: { year: 2014, category: { $ne: "Z" } } }

The optimizer can add the same $match stage before the $redact stage:

{ $match: { year: 2014 } },
{ $redact: { $cond: { if: { $eq: [ "$level", 5 ] }, then: "$$PRUNE", else: "$$DESCEND" } } },
{ $match: { year: 2014, category: { $ne: "Z" } } }

Pipeline Coalescence Optimization

When possible, the optimization phase coalesces a pipeline stage into its predecessor. Generally, coalescence occurs after any sequence reordering optimization.

$sort + $limit Coalescence

When a $sort immediately precedes a $limit, the optimizer can coalesce the $limit into the $sort. This allows the sort operation to only maintain the top n results as it progresses, where n is the specified limit, and MongoDB only needs to store n items in memory 1. See sort-and-memory for more information.
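Why the coalesced $sort + $limit only needs n items in memory can be sketched in plain JavaScript: instead of sorting the whole input and then truncating, maintain a bounded top-n list as documents stream through. (This is an illustration of the idea, not MongoDB's actual sort implementation.)

```javascript
// Keep only the top n documents under the given comparator.
function topN(docs, n, cmp) {
  const kept = []; // never grows past n entries
  for (const d of docs) {
    kept.push(d);
    kept.sort(cmp);
    if (kept.length > n) kept.pop(); // discard anything beyond the limit
  }
  return kept;
}

const ages = [{ age: 31 }, { age: 44 }, { age: 27 }, { age: 39 }];
// Equivalent of { $sort: { age: -1 } }, { $limit: 2 } coalesced together.
const top2 = topN(ages, 2, (x, y) => y.age - x.age);
console.log(top2); // [ { age: 44 }, { age: 39 } ]
```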
$limit + $limit Coalescence

When a $limit immediately follows another $limit, the two stages can coalesce into a single $limit where the limit amount is the smaller of the two initial limit amounts. For example, a pipeline contains the following sequence:

{ $limit: 100 },
{ $limit: 10 }

Then the second $limit stage can coalesce into the first $limit stage and result in a single $limit stage where the limit amount 10 is the minimum of the two initial limits 100 and 10.

{ $limit: 10 }

$skip + $skip Coalescence

When a $skip immediately follows another $skip, the two stages can coalesce into a single $skip where the skip amount is the sum of the two initial skip amounts. For example, a pipeline contains the following sequence:

{ $skip: 5 },
{ $skip: 2 }

1 The optimization will still apply when allowDiskUse is true and the n items exceed the aggregation memory limit (page 401).
Then the second $skip stage can coalesce into the first $skip stage and result in a single $skip stage where the skip amount 7 is the sum of the two initial skip amounts 5 and 2.

{ $skip: 7 }

$match + $match Coalescence

When a $match immediately follows another $match, the two stages can coalesce into a single $match combining the conditions with an $and. For example, a pipeline contains the following sequence:

{ $match: { year: 2014 } },
{ $match: { status: "A" } }

Then the second $match stage can coalesce into the first $match stage and result in a single $match stage:

{ $match: { $and: [ { "year" : 2014 }, { "status" : "A" } ] } }

Examples

The following examples are some sequences that can take advantage of both sequence reordering and coalescence. Generally, coalescence occurs after any sequence reordering optimization.

$sort + $skip + $limit Sequence

A pipeline contains a sequence of $sort followed by a $skip followed by a $limit:

{ $sort: { age : -1 } },
{ $skip: 10 },
{ $limit: 5 }

First, the optimizer performs the $skip + $limit Sequence Optimization (page 398) to transform the sequence to the following:

{ $sort: { age : -1 } },
{ $limit: 15 },
{ $skip: 10 }

The $skip + $limit Sequence Optimization (page 398) increases the $limit amount with the reordering. See $skip + $limit Sequence Optimization (page 398) for details.

The reordered sequence now has $sort immediately preceding the $limit, and the pipeline can coalesce the two stages to decrease memory usage during the sort operation. See $sort + $limit Coalescence (page 399) for more information.

$limit + $skip + $limit + $skip Sequence

A pipeline contains a sequence of alternating $limit and $skip stages:

{ $limit: 100 },
{ $skip: 5 },
{ $limit: 10 },
{ $skip: 2 }

The $skip + $limit Sequence Optimization (page 398) reverses the position of the { $skip: 5 } and { $limit: 10 } stages and increases the limit amount:
{ $limit: 100 },
{ $limit: 15 },
{ $skip: 5 },
{ $skip: 2 }

The optimizer then coalesces the two $limit stages into a single $limit stage and the two $skip stages into a single $skip stage. The resulting sequence is the following:

{ $limit: 15 },
{ $skip: 7 }

See $limit + $limit Coalescence (page 399) and $skip + $skip Coalescence (page 399) for details.

See also:

explain option in the db.collection.aggregate() method.

Aggregation Pipeline Limits

Aggregation operations with the aggregate command have the following limitations.

Result Size Restrictions

If the aggregate command returns a single document that contains the complete result set, the command will produce an error if the result set exceeds the BSON Document Size limit, which is currently 16 megabytes. To manage result sets that exceed this limit, the aggregate command can return result sets of any size if the command returns a cursor or stores the results in a collection.

Changed in version 2.6: The aggregate command can return results as a cursor or store the results in a collection, which are not subject to the size limit. The db.collection.aggregate() method returns a cursor and can return result sets of any size.

Memory Restrictions

Changed in version 2.6.

Pipeline stages have a limit of 100 megabytes of RAM. If a stage exceeds this limit, MongoDB will produce an error. To allow for the handling of large datasets, use the allowDiskUse option to enable aggregation pipeline stages to write data to temporary files.

See also:

sort-memory-limit and group-memory-limit.

Aggregation Pipeline and Sharded Collections

The aggregation pipeline supports operations on sharded collections. This section describes behaviors specific to the aggregation pipeline (page 391) and sharded collections.

Behavior

Changed in version 2.6.
When operating on a sharded collection, the aggregation pipeline is split into two parts. The first pipeline runs on each shard, or, if an early $match can exclude shards through the use of the shard key in the predicate, the pipeline runs on only the relevant shards.

The second pipeline consists of the remaining pipeline stages and runs on the primary shard (page 615). The primary shard merges the cursors from the other shards and runs the second pipeline on these results. The primary shard forwards the final results to the mongos. In previous versions, the second pipeline would run on the mongos. 2

Optimization

When splitting the aggregation pipeline into two parts, the pipeline is split to ensure that the shards perform as many stages as possible with consideration for optimization. To see how the pipeline was split, include the explain option in the db.collection.aggregate() method. Optimizations are subject to change between releases.

Map-Reduce and Sharded Collections

Map-reduce supports operations on sharded collections, both as an input and as an output. This section describes the behaviors of mapReduce specific to sharded collections.

Sharded Collection as Input

When using a sharded collection as the input for a map-reduce operation, mongos will automatically dispatch the map-reduce job to each shard in parallel. There is no special option required. mongos will wait for jobs on all shards to finish.

Sharded Collection as Output

Changed in version 2.2.

If the out field for mapReduce has the sharded value, MongoDB shards the output collection using the _id field as the shard key.

To output to a sharded collection:

• If the output collection does not exist, MongoDB creates and shards the collection on the _id field.
• For a new or an empty sharded collection, MongoDB uses the results of the first stage of the map-reduce operation to create the initial chunks distributed among the shards.
• mongos dispatches, in parallel, a map-reduce post-processing job to every shard that owns a chunk. During the post-processing, each shard will pull the results for its own chunks from the other shards, run the final reduce/finalize, and write locally to the output collection.

Note:
• During later map-reduce jobs, MongoDB splits chunks as needed.
• Balancing of chunks for the output collection is automatically prevented during post-processing to avoid concurrency issues.

In MongoDB 2.0:

2 Until all shards upgrade to v2.6, the second pipeline runs on the mongos if any shards are still running v2.4.
• mongos retrieves the results from each shard, performs a merge sort to order the results, and proceeds to the reduce/finalize phase as needed. mongos then writes the result to the output collection in sharded mode.
• This model requires only a small amount of memory, even for large data sets.
• Shard chunks are not automatically split during insertion. This requires manual intervention until the chunks are granular and balanced.

Important: For best results, only use the sharded output options for mapReduce in version 2.2 or later.

Map Reduce Concurrency

The map-reduce operation is composed of many tasks, including reads from the input collection, executions of the map function, executions of the reduce function, writes to a temporary collection during processing, and writes to the output collection. During the operation, map-reduce takes the following locks:

• The read phase takes a read lock. It yields every 100 documents.
• The insert into the temporary collection takes a write lock for a single write.
• If the output collection does not exist, the creation of the output collection takes a write lock.
• If the output collection exists, then the output actions (i.e. merge, replace, reduce) take a write lock. This write lock is global, and blocks all operations on the mongod instance.

Changed in version 2.4: The V8 JavaScript engine, which became the default in 2.4, allows multiple JavaScript operations to execute at the same time. Prior to 2.4, JavaScript code (i.e. map, reduce, finalize functions) executed in a single thread.

Note: The final write lock during post-processing makes the results appear atomically. However, output actions merge and reduce may take minutes to process. For the merge and reduce, the nonAtomic flag is available, which releases the lock between writing each output document. See the db.collection.mapReduce() reference for more information.
7.3 Aggregation Examples

This document provides practical examples that display the capabilities of aggregation (page 391).

Aggregation with the Zip Code Data Set (page 404) Use the aggregation pipeline to group values and to calculate aggregated sums and averages for a collection of United States zip codes.
Aggregation with User Preference Data (page 407) Use the pipeline to sort, normalize, and sum data on a collection of user data.
Map-Reduce Examples (page 411) Define map-reduce operations that select ranges, group data, and calculate sums and averages.
Perform Incremental Map-Reduce (page 413) Run a map-reduce operation over one collection and output results to another collection.
Troubleshoot the Map Function (page 415) Steps to troubleshoot the map function.
Troubleshoot the Reduce Function (page 416) Steps to troubleshoot the reduce function.
7.3.1 Aggregation with the Zip Code Data Set

The examples in this document use the zipcode collection. This collection is available at: media.mongodb.org/zips.json3. Use mongoimport to load this data set into your mongod instance.

Data Model

Each document in the zipcode collection has the following form:

{
  "_id": "10280",
  "city": "NEW YORK",
  "state": "NY",
  "pop": 5574,
  "loc": [ -74.016323, 40.710537 ]
}

The _id field holds the zip code as a string. The city field holds the city name. A city can have more than one zip code associated with it as different sections of the city can each have a different zip code. The state field holds the two letter state abbreviation. The pop field holds the population. The loc field holds the location as a latitude longitude pair.

All of the following examples use the aggregate() helper in the mongo shell. aggregate() provides a wrapper around the aggregate database command. See the documentation for your driver for a more idiomatic interface for data aggregation operations.

Return States with Populations above 10 Million

To return all states with a population greater than 10 million, use the following aggregation operation:

db.zipcodes.aggregate(
  { $group : { _id : "$state", totalPop : { $sum : "$pop" } } },
  { $match : { totalPop : { $gte : 10*1000*1000 } } }
)

Aggregation operations using the aggregate() helper process all documents in the zipcodes collection. aggregate() connects a number of pipeline (page 391) operators, which define the aggregation process. In this example, the pipeline passes all documents in the zipcodes collection through the following steps:

• the $group operator collects all documents and creates documents for each state. These new per-state documents have one field in addition to the _id field: totalPop, which is a generated field that uses the $sum operator to calculate the total value of all pop fields in the source documents.
After the $group operation the documents in the pipeline resemble the following:

3 http://media.mongodb.org/zips.json
{ "_id" : "AK", "totalPop" : 550043 }

• the $match operation filters these documents so that the only documents that remain are those where the value of totalPop is greater than or equal to 10 million. The $match operation does not alter the documents, which have the same format as the documents output by $group.

The equivalent SQL for this operation is:

SELECT state, SUM(pop) AS totalPop
FROM zipcodes
GROUP BY state
HAVING totalPop >= (10*1000*1000)

Return Average City Population by State

To return the average populations for cities in each state, use the following aggregation operation:

db.zipcodes.aggregate( [
  { $group : { _id : { state : "$state", city : "$city" }, pop : { $sum : "$pop" } } },
  { $group : { _id : "$_id.state", avgCityPop : { $avg : "$pop" } } }
] )

Aggregation operations using the aggregate() helper process all documents in the zipcodes collection. aggregate() connects a number of pipeline (page 391) operators that define the aggregation process. In this example, the pipeline passes all documents in the zipcodes collection through the following steps:

• the $group operator collects all documents and creates new documents for every combination of the city and state fields in the source document. A city can have more than one zip code associated with it as different sections of the city can each have a different zip code. After this stage in the pipeline, the documents resemble the following:

{ "_id" : { "state" : "CO", "city" : "EDGEWATER" }, "pop" : 13154 }

• the second $group operator collects documents by the state field and uses the $avg expression to compute a value for the avgCityPop field.

The final output of this aggregation operation is:

{ "_id" : "MN", "avgCityPop" : 5335 },

Return Largest and Smallest Cities by State

To return the smallest and largest cities by population for each state, use the following aggregation operation:
db.zipcodes.aggregate(
  { $group: {
      _id: { state: "$state", city: "$city" },
      pop: { $sum: "$pop" }
  } },
  { $sort: { pop: 1 } },
  { $group: {
      _id : "$_id.state",
      biggestCity: { $last: "$_id.city" },
      biggestPop: { $last: "$pop" },
      smallestCity: { $first: "$_id.city" },
      smallestPop: { $first: "$pop" }
  } },
  // the following $project is optional, and
  // modifies the output format.
  { $project: {
      _id: 0,
      state: "$_id",
      biggestCity: { name: "$biggestCity", pop: "$biggestPop" },
      smallestCity: { name: "$smallestCity", pop: "$smallestPop" }
  } }
)

Aggregation operations using the aggregate() helper process all documents in the zipcodes collection. aggregate() combines a number of pipeline (page 391) operators that define the aggregation process. All documents from the zipcodes collection pass into the pipeline, which consists of the following steps:

• the $group operator collects all documents and creates new documents for every combination of the city and state fields in the source documents. By specifying the value of _id as a sub-document that contains both fields, the operation preserves the state field for use later in the pipeline. The documents produced by this stage of the pipeline have a second field, pop, which uses the $sum operator to provide the total of the pop fields in the source document.

At this stage in the pipeline, the documents resemble the following:

{ "_id" : { "state" : "CO", "city" : "EDGEWATER" }, "pop" : 13154 }

• the $sort operator orders the documents in the pipeline based on the value of the pop field, from smallest to largest. This operation does not alter the documents.

• the second $group operator collects the documents in the pipeline by the state field, which is a field inside the nested _id document.
Within each per-state document this $group operator specifies four fields: Using the $last expression, the $group operator creates the biggestCity and biggestPop fields that store the city with the largest population and that population. Using the $first expression, the $group operator creates the smallestCity and smallestPop fields that store the city with the smallest population and that population.

The documents at this stage in the pipeline resemble the following:

{
  "_id" : "WA",
  "biggestCity" : "SEATTLE",
  "biggestPop" : 520096,
  "smallestCity" : "BENGE",
  "smallestPop" : 2
}

• The final operation is $project, which renames the _id field to state and moves the biggestCity, biggestPop, smallestCity, and smallestPop fields into biggestCity and smallestCity sub-documents.

The output of this aggregation operation is:

{
  "state" : "RI",
  "biggestCity" : { "name" : "CRANSTON", "pop" : 176404 },
  "smallestCity" : { "name" : "CLAYVILLE", "pop" : 45 }
}

7.3.2 Aggregation with User Preference Data

Data Model

Consider a hypothetical sports club with a database that contains a users collection that tracks user join dates and sport preferences, storing these data in documents that resemble the following:

{
  _id : "jane",
  joined : ISODate("2011-03-02"),
  likes : ["golf", "racquetball"]
}
{
  _id : "joe",
  joined : ISODate("2012-07-02"),
  likes : ["tennis", "golf", "swimming"]
}

Normalize and Sort Documents

The following operation returns user names in upper case and in alphabetical order. The aggregation includes user names for all documents in the users collection. You might do this to normalize user names for processing.

db.users.aggregate( [
  { $project : { name : { $toUpper : "$_id" }, _id : 0 } },
  { $sort : { name : 1 } }
] )

All documents from the users collection pass through the pipeline, which consists of the following operations:

• The $project operator:
  – creates a new field called name.
  – converts the value of the _id field to upper case with the $toUpper operator and stores the result in the new name field.

  – suppresses the _id field. $project passes the _id field by default, unless explicitly suppressed.

• The $sort operator orders the results by the name field.

The results of the aggregation would resemble the following:

{ "name" : "JANE" },
{ "name" : "JILL" },
{ "name" : "JOE" }

Return Usernames Ordered by Join Month

The following aggregation operation returns user names sorted by the month they joined. This kind of aggregation could help generate membership renewal notices.

db.users.aggregate(
  [
    { $project : { month_joined : { $month : "$joined" }, name : "$_id", _id : 0 } },
    { $sort : { month_joined : 1 } }
  ]
)

The pipeline passes all documents in the users collection through the following operations:

• The $project operator:

  – Creates two new fields: month_joined and name.

  – Suppresses the _id field from the results. The aggregate() method includes the _id field, unless explicitly suppressed.

• The $month operator converts the values of the joined field to integer representations of the month. Then the $project operator assigns those values to the month_joined field.

• The $sort operator sorts the results by the month_joined field.

The operation returns results that resemble the following:

{ "month_joined" : 1, "name" : "ruth" },
{ "month_joined" : 1, "name" : "harold" },
{ "month_joined" : 1, "name" : "kate" },
{ "month_joined" : 2, "name" : "jill" }

Return Total Number of Joins per Month

The following operation shows how many people joined each month of the year. You might use this aggregated data for recruiting and marketing strategies.

db.users.aggregate(
  [
    { $project : { month_joined : { $month : "$joined" } } },
    { $group : { _id : { month_joined : "$month_joined" }, number : { $sum : 1 } } },
    { $sort : { "_id.month_joined" : 1 } }
  ]
)

The pipeline passes all documents in the users collection through the following operations:

• The $project operator creates a new field called month_joined.

• The $month operator converts the values of the joined field to integer representations of the month. Then the $project operator assigns the values to the month_joined field.

• The $group operator collects all documents with a given month_joined value and counts how many documents there are for that value. Specifically, for each unique value, $group creates a new "per-month" document with two fields:

  – _id, which contains a nested document with the month_joined field and its value.

  – number, which is a generated field. The $sum operator increments this field by 1 for every document containing the given month_joined value.

• The $sort operator sorts the documents created by $group according to the contents of the month_joined field.

The result of this aggregation operation would resemble the following:

{ "_id" : { "month_joined" : 1 }, "number" : 3 },
{ "_id" : { "month_joined" : 2 }, "number" : 9 },
{ "_id" : { "month_joined" : 3 },
  "number" : 5 }

Return the Five Most Common "Likes"

The following aggregation collects the top five most "liked" activities in the data set. This type of analysis could help inform planning and future development.

db.users.aggregate(
  [
    { $unwind : "$likes" },
    { $group : { _id : "$likes", number : { $sum : 1 } } },
    { $sort : { number : -1 } },
    { $limit : 5 }
  ]
)

The pipeline begins with all documents in the users collection, and passes these documents through the following operations:

• The $unwind operator separates each value in the likes array, and creates a new version of the source document for every element in the array.

Example
Given the following document from the users collection:

{ _id : "jane", joined : ISODate("2011-03-02"), likes : ["golf", "racquetball"] }

The $unwind operator would create the following documents:

{ _id : "jane", joined : ISODate("2011-03-02"), likes : "golf" }

{ _id : "jane", joined : ISODate("2011-03-02"), likes : "racquetball" }

• The $group operator collects all documents with the same value for the likes field and counts each grouping. With this information, $group creates a new document with two fields:

  – _id, which contains the likes value.

  – number, which is a generated field. The $sum operator increments this field by 1 for every document containing the given likes value.

• The $sort operator sorts these documents by the number field in reverse order.

• The $limit operator includes only the first 5 result documents.

The results of the aggregation would resemble the following:
{ "_id" : "golf", "number" : 33 },
{ "_id" : "racquetball", "number" : 31 },
{ "_id" : "swimming", "number" : 24 },
{ "_id" : "handball", "number" : 19 },
{ "_id" : "tennis", "number" : 18 }

7.3.3 Map-Reduce Examples

In the mongo shell, the db.collection.mapReduce() method is a wrapper around the mapReduce command. The following examples use the db.collection.mapReduce() method.

Consider the following map-reduce operations on a collection orders that contains documents of the following prototype:

{
  _id: ObjectId("50a8240b927d5d8b5891743c"),
  cust_id: "abc123",
  ord_date: new Date("Oct 04, 2012"),
  status: 'A',
  price: 25,
  items: [ { sku: "mmm", qty: 5, price: 2.5 },
           { sku: "nnn", qty: 5, price: 2.5 } ]
}

Return the Total Price Per Customer

Perform the map-reduce operation on the orders collection to group by the cust_id, and calculate the sum of the price for each cust_id:

1. Define the map function to process each input document:

   • In the function, this refers to the document that the map-reduce operation is processing.

   • The function maps the price to the cust_id for each document and emits the cust_id and price pair.

   var mapFunction1 = function() {
     emit(this.cust_id, this.price);
   };

2. Define the corresponding reduce function with two arguments keyCustId and valuesPrices:
   • The valuesPrices is an array whose elements are the price values emitted by the map function and grouped by keyCustId.

   • The function reduces the valuesPrices array to the sum of its elements.

   var reduceFunction1 = function(keyCustId, valuesPrices) {
     return Array.sum(valuesPrices);
   };

3. Perform the map-reduce on all documents in the orders collection using the mapFunction1 map function and the reduceFunction1 reduce function.

   db.orders.mapReduce(
     mapFunction1,
     reduceFunction1,
     { out: "map_reduce_example" }
   )

   This operation outputs the results to a collection named map_reduce_example. If the map_reduce_example collection already exists, the operation will replace the contents with the results of this map-reduce operation.

Calculate Order and Total Quantity with Average Quantity Per Item

In this example, you will perform a map-reduce operation on the orders collection for all documents that have an ord_date value greater than 01/01/2012. The operation groups by the item.sku field, and calculates the number of orders and the total quantity ordered for each sku. The operation concludes by calculating the average quantity per order for each sku value:

1. Define the map function to process each input document:

   • In the function, this refers to the document that the map-reduce operation is processing.

   • For each item, the function associates the sku with a new object value that contains the count of 1 and the item qty for the order, and emits the sku and value pair.

   var mapFunction2 = function() {
     for (var idx = 0; idx < this.items.length; idx++) {
       var key = this.items[idx].sku;
       var value = { count: 1, qty: this.items[idx].qty };
       emit(key, value);
     }
   };

2. Define the corresponding reduce function with two arguments keySKU and countObjVals:

   • countObjVals is an array whose elements are the objects mapped to the grouped keySKU values passed by the map function to the reducer function.
   • The function reduces the countObjVals array to a single object reducedVal that contains the count and the qty fields.

   • In reducedVal, the count field contains the sum of the count fields from the individual array elements, and the qty field contains the sum of the qty fields from the individual array elements.

   var reduceFunction2 = function(keySKU, countObjVals) {
     reducedVal = { count: 0, qty: 0 };
     for (var idx = 0; idx < countObjVals.length; idx++) {
       reducedVal.count += countObjVals[idx].count;
       reducedVal.qty += countObjVals[idx].qty;
     }

     return reducedVal;
   };

3. Define a finalize function with two arguments key and reducedVal. The function modifies the reducedVal object to add a computed field named avg and returns the modified object:

   var finalizeFunction2 = function (key, reducedVal) {
     reducedVal.avg = reducedVal.qty/reducedVal.count;
     return reducedVal;
   };

4. Perform the map-reduce operation on the orders collection using the mapFunction2, reduceFunction2, and finalizeFunction2 functions.

   db.orders.mapReduce(
     mapFunction2,
     reduceFunction2,
     {
       out: { merge: "map_reduce_example" },
       query: { ord_date: { $gt: new Date('01/01/2012') } },
       finalize: finalizeFunction2
     }
   )

   This operation uses the query field to select only those documents with ord_date greater than new Date('01/01/2012'). Then it outputs the results to a collection map_reduce_example. If the map_reduce_example collection already exists, the operation will merge the existing contents with the results of this map-reduce operation.

7.3.4 Perform Incremental Map-Reduce

Map-reduce operations can handle complex aggregation tasks. To perform map-reduce operations, MongoDB provides the mapReduce command and, in the mongo shell, the db.collection.mapReduce() wrapper method.

If the map-reduce data set is constantly growing, you may want to perform an incremental map-reduce rather than performing the map-reduce operation over the entire data set each time. To perform incremental map-reduce:

1. Run a map-reduce job over the current collection and output the result to a separate collection.

2. When you have more data to process, run a subsequent map-reduce job with:

   • the query parameter that specifies conditions that match only the new documents.

   • the out parameter that specifies the reduce action to merge the new results into the existing output collection.
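The effect of the reduce output action can be sketched in plain JavaScript. This is an illustration of the behavior, not MongoDB's implementation, and it runs outside the mongo shell; the simulated collections here are assumptions for the example. When a key from the incremental run already exists in the output collection, MongoDB applies the reduce function to the stored value and the newly reduced value, which is why the reduce function must accept its own prior output.

```javascript
// Sketch (plain Node.js, not the mongo shell) of how out: { reduce: ... }
// folds incremental results into an existing output collection.
function reduceFunction(key, values) {
  return values.reduce(function (sum, v) { return sum + v; }, 0);
}

// Simulated output collection from the first (full) map-reduce run.
var outputCollection = { a: 205, b: 230 };

// Newly reduced values from the incremental run over new documents only.
var newResults = { a: 100, b: 115 };

// For each key, re-reduce the stored value together with the new value.
Object.keys(newResults).forEach(function (key) {
  if (key in outputCollection) {
    outputCollection[key] =
      reduceFunction(key, [ outputCollection[key], newResults[key] ]);
  } else {
    outputCollection[key] = newResults[key];
  }
});

console.log(outputCollection); // { a: 305, b: 345 }
```

Because the stored value re-enters the reduce function, this only produces correct totals when the reduce function is idempotent, which is the requirement verified in "Troubleshoot the Reduce Function" below.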
Consider the following example where you schedule a map-reduce operation on a sessions collection to run at the end of each day.
Data Setup

The sessions collection contains documents that log users' sessions each day, for example:

db.sessions.save( { userid: "a", ts: ISODate('2011-11-03 14:17:00'), length: 95 } );
db.sessions.save( { userid: "b", ts: ISODate('2011-11-03 14:23:00'), length: 110 } );
db.sessions.save( { userid: "c", ts: ISODate('2011-11-03 15:02:00'), length: 120 } );
db.sessions.save( { userid: "d", ts: ISODate('2011-11-03 16:45:00'), length: 45 } );

db.sessions.save( { userid: "a", ts: ISODate('2011-11-04 11:05:00'), length: 105 } );
db.sessions.save( { userid: "b", ts: ISODate('2011-11-04 13:14:00'), length: 120 } );
db.sessions.save( { userid: "c", ts: ISODate('2011-11-04 17:00:00'), length: 130 } );
db.sessions.save( { userid: "d", ts: ISODate('2011-11-04 15:37:00'), length: 65 } );

Initial Map-Reduce of Current Collection

Run the first map-reduce operation as follows:

1. Define the map function that maps the userid to an object that contains the fields userid, total_time, count, and avg_time:

   var mapFunction = function() {
     var key = this.userid;
     var value = {
       userid: this.userid,
       total_time: this.length,
       count: 1,
       avg_time: 0
     };
     emit( key, value );
   };

2. Define the corresponding reduce function with two arguments key and values to calculate the total time and the count. The key corresponds to the userid, and the values is an array whose elements correspond to the individual objects mapped to the userid in the mapFunction.

   var reduceFunction = function(key, values) {
     var reducedObject = {
       userid: key,
       total_time: 0,
       count: 0,
       avg_time: 0
     };

     values.forEach( function(value) {
       reducedObject.total_time += value.total_time;
       reducedObject.count += value.count;
     } );

     return reducedObject;
   };

3. Define the finalize function with two arguments key and reducedValue. The function modifies the reducedValue document to add another field avg_time and returns the modified document.
   var finalizeFunction = function (key, reducedValue) {
     if (reducedValue.count > 0)
       reducedValue.avg_time = reducedValue.total_time / reducedValue.count;

     return reducedValue;
   };

4. Perform map-reduce on the sessions collection using the mapFunction, the reduceFunction, and the finalizeFunction functions. Output the results to a collection session_stat. If the session_stat collection already exists, the operation will replace the contents:

   db.sessions.mapReduce(
     mapFunction,
     reduceFunction,
     { out: "session_stat", finalize: finalizeFunction }
   )

Subsequent Incremental Map-Reduce

Later, as the sessions collection grows, you can run additional map-reduce operations. For example, add new documents to the sessions collection:

db.sessions.save( { userid: "a", ts: ISODate('2011-11-05 14:17:00'), length: 100 } );
db.sessions.save( { userid: "b", ts: ISODate('2011-11-05 14:23:00'), length: 115 } );
db.sessions.save( { userid: "c", ts: ISODate('2011-11-05 15:02:00'), length: 125 } );
db.sessions.save( { userid: "d", ts: ISODate('2011-11-05 16:45:00'), length: 55 } );

At the end of the day, perform incremental map-reduce on the sessions collection, but use the query field to select only the new documents. Output the results to the collection session_stat, but reduce the contents with the results of the incremental map-reduce:

db.sessions.mapReduce(
  mapFunction,
  reduceFunction,
  {
    query: { ts: { $gt: ISODate('2011-11-05 00:00:00') } },
    out: { reduce: "session_stat" },
    finalize: finalizeFunction
  }
);

7.3.5 Troubleshoot the Map Function

The map function is a JavaScript function that associates or "maps" a value with a key and emits the key and value pair during a map-reduce (page 394) operation. To verify the key and value pairs emitted by the map function, write your own emit function.
Consider a collection orders that contains documents of the following prototype:

{
  _id: ObjectId("50a8240b927d5d8b5891743c"),
  cust_id: "abc123",
  ord_date: new Date("Oct 04, 2012"),
  status: 'A',
  price: 250,
  items: [ { sku: "mmm", qty: 5, price: 2.5 },
           { sku: "nnn", qty: 5, price: 2.5 } ]
}

1. Define the map function that maps the price to the cust_id for each document and emits the cust_id and price pair:

   var map = function() {
     emit(this.cust_id, this.price);
   };

2. Define the emit function to print the key and value:

   var emit = function(key, value) {
     print("emit");
     print("key: " + key + " value: " + tojson(value));
   }

3. Invoke the map function with a single document from the orders collection:

   var myDoc = db.orders.findOne( { _id: ObjectId("50a8240b927d5d8b5891743c") } );
   map.apply(myDoc);

4. Verify the key and value pair is as you expected.

   emit
   key: abc123 value: 250

5. Invoke the map function with multiple documents from the orders collection:

   var myCursor = db.orders.find( { cust_id: "abc123" } );

   while (myCursor.hasNext()) {
     var doc = myCursor.next();
     print("document _id = " + tojson(doc._id));
     map.apply(doc);
     print();
   }

6. Verify the key and value pairs are as you expected.

See also:
The map function must meet various requirements. For a list of all the requirements for the map function, see mapReduce, or the mongo shell helper method db.collection.mapReduce().

7.3.6 Troubleshoot the Reduce Function

The reduce function is a JavaScript function that "reduces" to a single object all the values associated with a particular key during a map-reduce (page 394) operation. The reduce function must meet various requirements. This tutorial helps verify that the reduce function meets the following criteria:

• The reduce function must return an object whose type must be identical to the type of the value emitted by the map function.

• The order of the elements in the valuesArray should not affect the output of the reduce function.

• The reduce function must be idempotent.
For a list of all the requirements for the reduce function, see mapReduce, or the mongo shell helper method db.collection.mapReduce().

Confirm Output Type

You can test that the reduce function returns a value that is the same type as the value emitted from the map function.

1. Define a reduceFunction1 function that takes the arguments keyCustId and valuesPrices. valuesPrices is an array of integers:

   var reduceFunction1 = function(keyCustId, valuesPrices) {
     return Array.sum(valuesPrices);
   };

2. Define a sample array of integers:

   var myTestValues = [ 5, 5, 10 ];

3. Invoke the reduceFunction1 with myTestValues:

   reduceFunction1('myKey', myTestValues);

4. Verify the reduceFunction1 returned an integer:

   20

5. Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects. valuesCountObjects is an array of documents that contain two fields count and qty:

   var reduceFunction2 = function(keySKU, valuesCountObjects) {
     reducedValue = { count: 0, qty: 0 };

     for (var idx = 0; idx < valuesCountObjects.length; idx++) {
       reducedValue.count += valuesCountObjects[idx].count;
       reducedValue.qty += valuesCountObjects[idx].qty;
     }

     return reducedValue;
   };

6. Define a sample array of documents:

   var myTestObjects = [
     { count: 1, qty: 5 },
     { count: 2, qty: 10 },
     { count: 3, qty: 15 }
   ];

7. Invoke the reduceFunction2 with myTestObjects:

   reduceFunction2('myKey', myTestObjects);

8. Verify the reduceFunction2 returned a document with exactly the count and the qty fields:

   { "count" : 6, "qty" : 30 }
Ensure Insensitivity to the Order of Mapped Values

The reduce function takes a key and a values array as its argument. You can test that the result of the reduce function does not depend on the order of the elements in the values array.

1. Define a sample values1 array and a sample values2 array that only differ in the order of the array elements:

   var values1 = [
     { count: 1, qty: 5 },
     { count: 2, qty: 10 },
     { count: 3, qty: 15 }
   ];

   var values2 = [
     { count: 3, qty: 15 },
     { count: 1, qty: 5 },
     { count: 2, qty: 10 }
   ];

2. Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects. valuesCountObjects is an array of documents that contain two fields count and qty:

   var reduceFunction2 = function(keySKU, valuesCountObjects) {
     reducedValue = { count: 0, qty: 0 };

     for (var idx = 0; idx < valuesCountObjects.length; idx++) {
       reducedValue.count += valuesCountObjects[idx].count;
       reducedValue.qty += valuesCountObjects[idx].qty;
     }

     return reducedValue;
   };

3. Invoke the reduceFunction2 first with values1 and then with values2:

   reduceFunction2('myKey', values1);
   reduceFunction2('myKey', values2);

4. Verify the reduceFunction2 returned the same result:

   { "count" : 6, "qty" : 30 }

Ensure Reduce Function Idempotence

Because the map-reduce operation may call a reduce function multiple times for the same key, and won't call a reduce function for single instances of a key in the working set, the reduce function must be able to process values that it has already "reduced"; that is, the reduce function must be idempotent. You can test that the reduce function can process "reduced" values without affecting the final value.

1. Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects.
valuesCountObjects is an array of documents that contain two fields count and qty:

   var reduceFunction2 = function(keySKU, valuesCountObjects) {
     reducedValue = { count: 0, qty: 0 };

     for (var idx = 0; idx < valuesCountObjects.length; idx++) {
       reducedValue.count += valuesCountObjects[idx].count;
       reducedValue.qty += valuesCountObjects[idx].qty;
     }
     return reducedValue;
   };

2. Define a sample key:

   var myKey = 'myKey';

3. Define a sample valuesIdempotent array that contains an element that is a call to the reduceFunction2 function:

   var valuesIdempotent = [
     { count: 1, qty: 5 },
     { count: 2, qty: 10 },
     reduceFunction2(myKey, [ { count: 3, qty: 15 } ] )
   ];

4. Define a sample values1 array that combines the values passed to reduceFunction2:

   var values1 = [
     { count: 1, qty: 5 },
     { count: 2, qty: 10 },
     { count: 3, qty: 15 }
   ];

5. Invoke the reduceFunction2 first with myKey and valuesIdempotent and then with myKey and values1:

   reduceFunction2(myKey, valuesIdempotent);
   reduceFunction2(myKey, values1);

6. Verify the reduceFunction2 returned the same result:

   { "count" : 6, "qty" : 30 }

7.4 Aggregation Reference

Aggregation Pipeline Quick Reference (page 420) Quick reference card for aggregation pipeline.

http://docs.mongodb.org/manual/reference/operator/aggregation Aggregation pipeline operations have a collection of operators available to define and manipulate documents in pipeline stages.

Aggregation Commands Comparison (page 424) A comparison of group, mapReduce and aggregate that explores the strengths and limitations of each aggregation modality.

SQL to Aggregation Mapping Chart (page 426) An overview of common aggregation operations in SQL and MongoDB using the aggregation pipeline and operators in MongoDB and common SQL statements.

Aggregation Interfaces (page 428) The data aggregation interfaces document the invocation format and output for MongoDB's aggregation commands and methods.

Variables in Aggregation Expressions (page 428) Use of variables in aggregation pipeline expressions.
7.4.1 Aggregation Pipeline Quick Reference

Stages

Pipeline stages appear in an array. Documents pass through the stages in sequence. All except the $out and $geoNear stages can appear multiple times in a pipeline.

db.collection.aggregate( [ { <stage> }, ... ] )

Name      Description

$geoNear  Returns an ordered stream of documents based on the proximity to a geospatial point. Incorporates the functionality of $match, $sort, and $limit for geospatial data. The output documents include an additional distance field and can include a location identifier field.

$group    Groups input documents by a specified identifier expression and applies the accumulator expression(s), if specified, to each group. Consumes all input documents and outputs one document per each distinct group. The output documents only contain the identifier field and, if specified, accumulated fields.

$limit    Passes the first n documents unmodified to the pipeline where n is the specified limit. For each input document, outputs either one document (for the first n documents) or zero documents (after the first n documents).

$match    Filters the document stream to allow only matching documents to pass unmodified into the next pipeline stage. $match uses standard MongoDB queries. For each input document, outputs either one document (a match) or zero documents (no match).

$out      Writes the resulting documents of the aggregation pipeline to a collection. To use the $out stage, it must be the last stage in the pipeline.

$project  Reshapes each document in the stream, such as by adding new fields or removing existing fields. For each input document, outputs one document.

$redact   Reshapes each document in the stream by restricting the content for each document based on information stored in the documents themselves. Incorporates the functionality of $project and $match. Can be used to implement field level redaction. For each input document, outputs either one or zero documents.
$skip     Skips the first n documents where n is the specified skip number and passes the remaining documents unmodified to the pipeline. For each input document, outputs either zero documents (for the first n documents) or one document (after the first n documents).

$sort     Reorders the document stream by a specified sort key. Only the order changes; the documents remain unmodified. For each input document, outputs one document.

$unwind   Deconstructs an array field from the input documents to output a document for each element. Each output document replaces the array with an element value. For each input document, outputs n documents where n is the number of array elements and can be zero for an empty array.

Expressions

Expressions can include field paths and system variables (page 420), literals (page 421), expression objects (page 421), and expression operators (page 421). Expressions can be nested.

Field Path and System Variables

Aggregation expressions use field paths to access fields in the input documents. To specify a field path, use a string that prefixes the field name, or the dotted field name if the field is in an embedded document, with a dollar sign $. For example, "$user" specifies the field path for the user field, and "$user.name" specifies the field path to the "user.name" field.

"$<field>" is equivalent to "$$CURRENT.<field>", where CURRENT (page 429) is a system variable that defaults to the root of the current object in most stages, unless stated otherwise in specific stages. CURRENT
(page 429) can be rebound. Along with the CURRENT (page 429) system variable, other system variables (page 428) are also available for use in expressions. To use user-defined variables, use $let and $map expressions. To access variables in expressions, use a string that prefixes the variable name with $$.

Literals

Literals can be of any type. However, MongoDB parses string literals that start with a dollar sign $ as a path to a field, and numeric/boolean literals in expression objects (page 421) as projection flags. To avoid parsing literals, use the $literal expression.

Expression Objects

Expression objects have the following form:

{ <field1>: <expression1>, ... }

If the expressions are numeric or boolean literals, MongoDB treats the literals as projection flags (e.g. 1 or true to include the field), valid only in the $project stage. To avoid treating numeric or boolean literals as projection flags, use the $literal expression to wrap the numeric or boolean literals.

Operator Expressions

Operator expressions are similar to functions that take arguments. In general, these expressions take an array of arguments and have the following form:

{ <operator>: [ <argument1>, <argument2> ... ] }

If the operator accepts a single argument, you can omit the outer array designating the argument list:

{ <operator>: <argument> }

To avoid parsing ambiguity if the argument is a literal array, you must wrap the literal array in a $literal expression or keep the outer array that designates the argument list.

Boolean Expressions

Boolean expressions evaluate their argument expressions as booleans and return a boolean as the result. In addition to the false boolean value, Boolean expressions evaluate the following as false: null, 0, and undefined values. Boolean expressions evaluate all other values as true, including non-zero numeric values and arrays.

Name Description

$and Returns true only when all its expressions evaluate to true.
     Accepts any number of argument expressions.

$not Returns the boolean value that is the opposite of its argument expression. Accepts a single argument expression.

$or  Returns true when any of its expressions evaluates to true. Accepts any number of argument expressions.
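The truthiness rule that $and, $or, and $not apply can be sketched in plain JavaScript. This is an illustration of the documented behavior, not MongoDB's implementation, and the helper names are assumptions for the example:

```javascript
// Sketch (plain Node.js) of aggregation Boolean semantics: only null, 0,
// undefined, and the boolean false evaluate as false; everything else —
// including arrays and non-zero numbers — evaluates as true.
function aggTruthy(value) {
  return !(value === null || value === undefined ||
           value === 0 || value === false);
}

function aggAnd(values) { return values.every(aggTruthy); }
function aggOr(values)  { return values.some(aggTruthy); }
function aggNot(value)  { return !aggTruthy(value); }

console.log(aggAnd([1, "a", []]));  // true: arrays count as true
console.log(aggOr([null, 0]));      // false: both operands are false
console.log(aggNot(undefined));     // true
```

Note how this differs from ordinary JavaScript truthiness, where an empty string is also falsy; in aggregation expressions, only the four values listed above evaluate as false.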
Set Expressions

Set expressions perform set operations on arrays, treating arrays as sets. Set expressions ignore the duplicate entries in each input array and the order of the elements.

If the set operation returns a set, the operation filters out duplicates in the result to output an array that contains only unique entries. The order of the elements in the output array is unspecified. If a set contains a nested array element, the set expression does not descend into the nested array but evaluates the array at top-level.

Name             Description

$allElementsTrue Returns true if no element of a set evaluates to false; otherwise, returns false. Accepts a single argument expression.

$anyElementTrue  Returns true if any elements of a set evaluate to true; otherwise, returns false. Accepts a single argument expression.

$setDifference   Returns a set with elements that appear in the first set but not in the second set; i.e. performs a relative complement of the second set relative to the first. Accepts exactly two argument expressions.

$setEquals       Returns true if the input sets have the same distinct elements. Accepts two or more argument expressions.

$setIntersection Returns a set with elements that appear in all of the input sets. Accepts any number of argument expressions.

$setIsSubset     Returns true if all elements of the first set appear in the second set, including when the first set equals the second set; i.e. not a strict subset. Accepts exactly two argument expressions.

$setUnion        Returns a set with elements that appear in any of the input sets. Accepts any number of argument expressions.

Comparison Expressions

Comparison expressions return a boolean except for $cmp, which returns a number. The comparison expressions take two argument expressions and compare both value and type, using the specified BSON comparison order (page 168) for values of different types.
Name Description

$cmp  Returns: 0 if the two values are equivalent, 1 if the first value is greater than the second, and -1 if the first value is less than the second.

$eq   Returns true if the values are equivalent.

$gt   Returns true if the first value is greater than the second.

$gte  Returns true if the first value is greater than or equal to the second.

$lt   Returns true if the first value is less than the second.

$lte  Returns true if the first value is less than or equal to the second.

$ne   Returns true if the values are not equivalent.

Arithmetic Expressions

Arithmetic expressions perform mathematic operations on numbers. Some arithmetic expressions can also support date arithmetic.
Name Description

$add      Adds numbers to return the sum, or adds numbers and a date to return a new date. If adding numbers and a date, treats the numbers as milliseconds. Accepts any number of argument expressions, but at most, one expression can resolve to a date.

$divide   Returns the result of dividing the first number by the second. Accepts two argument expressions.

$mod      Returns the remainder of the first number divided by the second. Accepts two argument expressions.

$multiply Multiplies numbers to return the product. Accepts any number of argument expressions.

$subtract Returns the result of subtracting the second value from the first. If the two values are numbers, returns the difference. If the two values are dates, returns the difference in milliseconds. If the two values are a date and a number in milliseconds, returns the resulting date. Accepts two argument expressions. If the two values are a date and a number, specify the date argument first as it is not meaningful to subtract a date from a number.

String Expressions

String expressions, with the exception of $concat, only have a well-defined behavior for strings of ASCII characters. $concat behavior is well-defined regardless of the characters used.

Name Description

$concat     Concatenates any number of strings.

$strcasecmp Performs case-insensitive string comparison and returns: 0 if two strings are equivalent, 1 if the first string is greater than the second, and -1 if the first string is less than the second.

$substr     Returns a substring of a string, starting at a specified index position up to a specified length. Accepts three expressions as arguments: the first argument must resolve to a string, and the second and third arguments must resolve to integers.

$toLower    Converts a string to lowercase. Accepts a single argument expression.

$toUpper    Converts a string to uppercase. Accepts a single argument expression.
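The comparison rule for $strcasecmp can be sketched in plain JavaScript, assuming ASCII input as the table above requires. This is an illustration of the documented return values, not MongoDB's implementation:

```javascript
// Sketch (plain Node.js, ASCII strings only) of $strcasecmp semantics:
// compare two strings case-insensitively and return -1, 0, or 1.
function strcasecmp(a, b) {
  var x = a.toUpperCase();
  var y = b.toUpperCase();
  if (x < y) return -1;
  if (x > y) return 1;
  return 0;
}

console.log(strcasecmp("golf", "GOLF"));      // 0: equivalent ignoring case
console.log(strcasecmp("tennis", "golf"));    // 1: first string is greater
console.log(strcasecmp("Benge", "Cranston")); // -1: first string is less
```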
Text Search Expressions
Name Description
$meta Access text search metadata.
Array Expressions
Name Description
$size Returns the number of elements in the array. Accepts a single expression as argument.
Variable Expressions
Name Description
$let Defines variables for use within the scope of a subexpression and returns the result of the subexpression. Accepts named parameters.
$map Applies a subexpression to each element of an array and returns the array of resulting values in order. Accepts named parameters.
Literal Expressions
Name Description
$literal Returns a value without parsing. Use for values that the aggregation pipeline may interpret as an expression. For example, use a $literal expression for a string that starts with a $ to avoid parsing it as a field path.
7.4. Aggregation Reference 423
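$map's per-element behavior parallels an ordinary array transform; a plain-JavaScript sketch (illustrative data and names, not MongoDB code):

```javascript
// Like { $map: { input: "$items", as: "item", in: <expr> } }:
// apply a subexpression to each array element and collect results in order.
var items = [ { qty: 25, price: 1 }, { qty: 10, price: 2 } ];
var lineTotals = items.map(function (item) {
  return item.qty * item.price; // the "in" subexpression, with $$item bound
});
console.log(lineTotals); // [ 25, 20 ]
```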
Date Expressions
Name Description
$dayOfMonth Returns the day of the month for a date as a number between 1 and 31.
$dayOfWeek Returns the day of the week for a date as a number between 1 (Sunday) and 7 (Saturday).
$dayOfYear Returns the day of the year for a date as a number between 1 and 366 (leap year).
$hour Returns the hour for a date as a number between 0 and 23.
$millisecond Returns the milliseconds of a date as a number between 0 and 999.
$minute Returns the minute for a date as a number between 0 and 59.
$month Returns the month for a date as a number between 1 (January) and 12 (December).
$second Returns the seconds for a date as a number between 0 and 60 (to allow for leap seconds).
$week Returns the week number for a date as a number between 0 (the partial week that precedes the first Sunday of the year) and 53 (leap year).
$year Returns the year for a date as a number (e.g. 2014).
Conditional Expressions
Name Description
$cond A ternary operator that evaluates one expression and, depending on the result, returns the value of one of the other two expressions. Accepts either three expressions in an ordered list or three named parameters.
$ifNull Returns either the non-null result of the first expression, or the result of the second expression if the first expression results in a null result. A null result encompasses instances of undefined values or missing fields. Accepts two expressions as arguments. The result of the second expression can be null.
Accumulators Accumulators, available only for the $group stage, compute values by combining documents that share the same group key. Accumulators take as input a single expression, evaluating the expression once for each input document, and maintain their state for the group of documents.
Name Description
$addToSet Returns an array of unique expression values for each group. Order of the array elements is undefined.
$avg Returns an average for each group. Ignores non-numeric values.
$first Returns a value from the first document for each group. Order is only defined if the documents are in a defined order.
$last Returns a value from the last document for each group. Order is only defined if the documents are in a defined order.
$max Returns the highest expression value for each group.
$min Returns the lowest expression value for each group.
$push Returns an array of expression values for each group.
$sum Returns a sum for each group. Ignores non-numeric values.
7.4.2 Aggregation Commands Comparison
The following table provides a brief overview of the features of the MongoDB aggregation commands.
424 Chapter 7. Aggregation
Description
aggregate: New in version 2.2. Designed with specific goals of improving performance and usability for aggregation tasks. Uses a "pipeline" approach where objects are transformed as they pass through a series of pipeline operators such as $group, $match, and $sort. See http://docs.mongodb.org/manual/reference/operator/aggregation for more information on the pipeline operators.
mapReduce: Implements the Map-Reduce aggregation for processing large data sets.
group: Provides grouping functionality. Is slower than the aggregate command and has less functionality than the mapReduce command.
Key Features
aggregate: Pipeline operators can be repeated as needed. Pipeline operators need not produce one output document for every input document. Can also generate new documents or filter out documents.
mapReduce: In addition to grouping operations, can perform complex aggregation tasks as well as perform incremental aggregation on continuously growing datasets. See Map-Reduce Examples (page 411) and Perform Incremental Map-Reduce (page 413).
group: Can either group by existing fields or, with a custom keyf JavaScript function, group by calculated fields. See group for information and an example using the keyf function.
Flexibility
aggregate: Limited to the operators and expressions supported by the aggregation pipeline. However, can add computed fields, create new virtual sub-objects, and extract sub-fields into the top-level of results by using the $project pipeline operator. See $project for more information, as well as http://docs.mongodb.org/manual/reference/operator/aggregation for more information on all the available pipeline operators.
mapReduce: Custom map, reduce, and finalize JavaScript functions offer flexibility to aggregation logic. See mapReduce for details and restrictions on the functions.
group: Custom reduce and finalize JavaScript functions offer flexibility to grouping logic. See group for details and restrictions on these functions.
Output Results
aggregate: Returns results in various options (inline as a document that contains the result set, or a cursor to the result set) or stores the results in a collection. The result is subject to the BSON document size limit if returned inline as a document that contains the result set. Changed in version 2.6: Can return results as a cursor or store the results to a collection.
mapReduce: Returns results in various options (inline, new collection, merge, replace, reduce). See mapReduce for details on the output options. Changed in version 2.2: Provides much better support for sharded map-reduce output than previous versions.
group: Returns results inline as an array of grouped items. The result set must fit within the maximum BSON document size limit. Changed in version 2.2: The returned array can contain at most 20,000 elements; i.e. at most 20,000 unique groupings. Previous versions had a limit of 10,000 elements.
Sharding
aggregate: Supports non-sharded and sharded input collections.
mapReduce: Supports non-sharded and sharded input collections.
group: Does not support sharded collections.
Notes
mapReduce: Prior to 2.4, JavaScript code executed in a single thread.
group: Prior to 2.4, JavaScript code executed in a single thread.
More Information
aggregate: See Aggregation Pipeline (page 391) and aggregate.
mapReduce: See Map-Reduce (page 394) and mapReduce.
group: See group.
7.4. Aggregation Reference 425
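The $group accumulators summarized earlier ($sum, $addToSet, and so on) combine documents that share a group key while maintaining per-group state; a rough in-memory sketch in plain JavaScript (illustrative data and field names, not MongoDB code):

```javascript
var docs = [
  { cust_id: "abc", price: 10 },
  { cust_id: "abc", price: 15 },
  { cust_id: "xyz", price: 5 }
];

// Group by cust_id, folding each document into its group's state.
var groups = {};
docs.forEach(function (doc) {
  var key = doc.cust_id;
  if (!groups[key]) groups[key] = { _id: key, total: 0, prices: [] };
  groups[key].total += doc.price;                  // like $sum
  if (groups[key].prices.indexOf(doc.price) < 0) { // like $addToSet
    groups[key].prices.push(doc.price);
  }
});

console.log(groups["abc"].total); // 25
console.log(groups["xyz"].total); // 5
```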
7.4.3 SQL to Aggregation Mapping Chart
The aggregation pipeline (page 391) allows MongoDB to provide native aggregation capabilities that correspond to many common data aggregation operations in SQL.
The following table provides an overview of common SQL aggregation terms, functions, and concepts and the corresponding MongoDB aggregation operators:
SQL Terms, Functions, and Concepts MongoDB Aggregation Operators
WHERE $match
GROUP BY $group
HAVING $match
SELECT $project
ORDER BY $sort
LIMIT $limit
SUM() $sum
COUNT() $sum
join No direct corresponding operator; however, the $unwind operator allows for somewhat similar functionality, but with fields embedded within the document.
Examples
The following table presents a quick reference of SQL aggregation statements and the corresponding MongoDB statements. The examples in the table assume the following conditions:
• The SQL examples assume two tables, orders and order_lineitem, that join by the order_lineitem.order_id and the orders.id columns.
• The MongoDB examples assume one collection orders that contains documents of the following prototype:
{
  cust_id: "abc123",
  ord_date: ISODate("2012-11-02T17:04:11.102Z"),
  status: 'A',
  price: 50,
  items: [ { sku: "xxx", qty: 25, price: 1 },
           { sku: "yyy", qty: 25, price: 1 } ]
}
426 Chapter 7. Aggregation
SQL Example:
SELECT COUNT(*) AS count
FROM orders
MongoDB Example:
db.orders.aggregate( [
   { $group: { _id: null, count: { $sum: 1 } } }
] )
Description: Count all records from orders.

SQL Example:
SELECT SUM(price) AS total
FROM orders
MongoDB Example:
db.orders.aggregate( [
   { $group: { _id: null, total: { $sum: "$price" } } }
] )
Description: Sum the price field from orders.

SQL Example:
SELECT cust_id, SUM(price) AS total
FROM orders
GROUP BY cust_id
MongoDB Example:
db.orders.aggregate( [
   { $group: { _id: "$cust_id", total: { $sum: "$price" } } }
] )
Description: For each unique cust_id, sum the price field.

SQL Example:
SELECT cust_id, SUM(price) AS total
FROM orders
GROUP BY cust_id
ORDER BY total
MongoDB Example:
db.orders.aggregate( [
   { $group: { _id: "$cust_id", total: { $sum: "$price" } } },
   { $sort: { total: 1 } }
] )
Description: For each unique cust_id, sum the price field, results sorted by sum.

SQL Example:
SELECT cust_id, ord_date, SUM(price) AS total
FROM orders
GROUP BY cust_id, ord_date
MongoDB Example:
db.orders.aggregate( [
   { $group: { _id: { cust_id: "$cust_id",
                      ord_date: { month: { $month: "$ord_date" },
                                  day: { $dayOfMonth: "$ord_date" },
                                  year: { $year: "$ord_date" } } },
               total: { $sum: "$price" } } }
] )
Description: For each unique cust_id, ord_date grouping, sum the price field. Excludes the time portion of the date.

SQL Example:
SELECT cust_id, count(*)
FROM orders
GROUP BY cust_id
HAVING count(*) > 1
MongoDB Example:
db.orders.aggregate( [
   { $group: { _id: "$cust_id", count: { $sum: 1 } } },
   { $match: { count: { $gt: 1 } } }
] )
Description: For cust_id with multiple records, return the cust_id and the corresponding record count.
7.4. Aggregation Reference 427
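The GROUP BY / HAVING pattern in the table maps to a $group stage followed by a $match stage on the grouped output. An in-memory JavaScript sketch of that two-stage shape (illustrative data, not MongoDB code):

```javascript
var orders = [
  { cust_id: "abc123", price: 50 },
  { cust_id: "abc123", price: 25 },
  { cust_id: "xyz789", price: 10 }
];

// Stage 1: like { $group: { _id: "$cust_id", count: { $sum: 1 } } }
var counts = {};
orders.forEach(function (o) {
  counts[o.cust_id] = (counts[o.cust_id] || 0) + 1;
});
var grouped = Object.keys(counts).map(function (k) {
  return { _id: k, count: counts[k] };
});

// Stage 2: like { $match: { count: { $gt: 1 } } } (SQL HAVING count(*) > 1)
var result = grouped.filter(function (g) { return g.count > 1; });

console.log(result); // [ { _id: 'abc123', count: 2 } ]
```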
7.4.4 Aggregation Interfaces
Aggregation Commands
Name Description
aggregate Performs aggregation tasks (page 391) such as group using the aggregation framework.
count Counts the number of documents in a collection.
distinct Displays the distinct values found for a specified key in a collection.
group Groups documents in a collection by the specified key and performs simple aggregation.
mapReduce Performs map-reduce (page 394) aggregation for large data sets.
Aggregation Methods
Name Description
db.collection.aggregate() Provides access to the aggregation pipeline (page 391).
db.collection.group() Groups documents in a collection by the specified key and performs simple aggregation.
db.collection.mapReduce() Performs map-reduce (page 394) aggregation for large data sets.
7.4.5 Variables in Aggregation Expressions
Aggregation expressions (page 420) can use both user-defined and system variables. Variables can hold any BSON type data (page 167). To access the value of the variable, use a string with the variable name prefixed with double dollar signs ($$). If the variable references an object, to access a specific field in the object, use the dot notation; i.e. "$$<variable>.<field>".
User Variables
User variable names can contain the ASCII characters [_a-zA-Z0-9] and any non-ASCII character. User variable names must begin with a lowercase ASCII letter [a-z] or a non-ASCII character.
System Variables
MongoDB offers the following system variables:
428 Chapter 7. Aggregation
Variable Description
ROOT References the root document, i.e. the top-level document, currently being processed in the aggregation pipeline stage.
CURRENT References the start of the field path being processed in the aggregation pipeline stage. Unless documented otherwise, all stages start with CURRENT (page 429) the same as ROOT (page 429). CURRENT (page 429) is modifiable. However, since $<field> is equivalent to $$CURRENT.<field>, rebinding CURRENT (page 429) changes the meaning of $ accesses.
DESCEND One of the allowed results of a $redact expression.
PRUNE One of the allowed results of a $redact expression.
KEEP One of the allowed results of a $redact expression.
See also:
$let, $redact
7.4. Aggregation Reference 429
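The "$$&lt;variable&gt;.&lt;field&gt;" dot notation described above can be sketched as a small resolver in plain JavaScript (a hypothetical helper, not part of MongoDB):

```javascript
// vars maps variable names to values; path is e.g. "item.qty",
// standing in for the "$$item.qty" form (with the $$ prefix stripped).
function resolveVariable(vars, path) {
  var parts = path.split(".");
  var value = vars[parts[0]];
  for (var i = 1; i < parts.length; i++) {
    if (value == null) return undefined;
    value = value[parts[i]];
  }
  return value;
}

var vars = { item: { sku: "xxx", qty: 25 } };
console.log(resolveVariable(vars, "item.qty")); // 25
console.log(resolveVariable(vars, "item.sku")); // xxx
```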
CHAPTER 8
Indexes
Indexes provide high performance read operations for frequently used queries. This section introduces indexes in MongoDB, describes the types and configuration options for indexes, and describes the special types of indexing MongoDB supports. The section also provides tutorials detailing procedures and operational concerns, and information on how applications may use indexes.
Index Introduction (page 431) An introduction to indexes in MongoDB.
Index Concepts (page 436) The core documentation of indexes in MongoDB, including geospatial and text indexes.
Index Types (page 437) MongoDB provides different types of indexes for different purposes and different types of content.
Index Properties (page 456) The properties you can specify when building indexes.
Index Creation (page 460) The options available when creating indexes.
Index Intersection (page 462) The use of index intersection to fulfill a query.
Indexing Tutorials (page 464) Examples of operations involving indexes, including index creation and querying indexes.
Indexing Reference (page 500) Reference material for indexes in MongoDB.
8.1 Index Introduction
Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must scan every document in a collection to select those documents that match the query statement. These collection scans are inefficient because they require mongod to process a larger volume of data than an index for each operation.
Indexes are special data structures 1 that store a small portion of the collection's data set in an easy-to-traverse form. The index stores the value of a specific field or set of fields, ordered by the value of the field. Fundamentally, indexes in MongoDB are similar to indexes in other database systems. MongoDB defines indexes at the collection level and supports indexes on any field or sub-field of the documents in a MongoDB collection.
If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect. In some cases, MongoDB can use the data from the index to determine which documents match a query. The following diagram illustrates a query that selects documents using an index. 1 MongoDB indexes use a B-tree data structure. 431
  • 436. MongoDB Documentation, Release 2.6.4 Figure 8.1: Diagram of a query selecting documents using an index. MongoDB narrows the query by scanning the range of documents with values of score less than 30. 8.1.1 Optimization Consider the documentation of the query optimizer (page 61) for more information on the relationship between queries and indexes. Create indexes to support common and user-facing queries. Having these indexes will ensure that MongoDB only scans the smallest possible number of documents. Indexes can also optimize the performance of other operations in specific situations: Sorted Results MongoDB can use indexes to return documents sorted by the index key directly from the index without requiring an additional sort phase. Covered Results When the query criteria and the projection of a query include only the indexed fields, MongoDB will return results directly from the index without scanning any documents or bringing documents into memory. These covered queries can be very efficient. 8.1.2 Index Types MongoDB provides a number of different index types to support specific types of data and queries. 432 Chapter 8. Indexes
  • 437. MongoDB Documentation, Release 2.6.4 Figure 8.2: Diagram of a query that uses an index to select and return sorted results. The index stores score values in ascending order. MongoDB can traverse the index in either ascending or descending order to return sorted results. Figure 8.3: Diagram of a query that uses only the index to match the query criteria and return the results. MongoDB does not need to inspect data outside of the index to fulfill the query. 8.1. Index Introduction 433
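The index-assisted range query illustrated in Figure 8.1 can be sketched in plain JavaScript, with a sorted array of (score, id) entries standing in for the index (illustrative only, not MongoDB internals):

```javascript
// Index entries kept sorted by score, as a B-tree leaf sequence would be.
var index = [
  { score: 10, id: 1 }, { score: 25, id: 4 },
  { score: 30, id: 2 }, { score: 45, id: 3 }, { score: 75, id: 5 }
];

// For { score: { $lt: 30 } }, scan only the leading range of the index
// and stop early, instead of inspecting every document in the collection.
function idsWithScoreLessThan(limit) {
  var ids = [];
  for (var i = 0; i < index.length && index[i].score < limit; i++) {
    ids.push(index[i].id);
  }
  return ids;
}

console.log(idsWithScoreLessThan(30)); // [ 1, 4 ]
```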
Default _id
All MongoDB collections have an index on the _id field that exists by default. If applications do not specify a value for _id, the driver or the mongod will create an _id field with an ObjectId value.
The _id index is unique, and prevents clients from inserting two documents with the same value for the _id field.
Single Field
In addition to the MongoDB-defined _id index, MongoDB supports user-defined indexes on a single field of a document (page 438). Consider the following illustration of a single-field index:
Figure 8.4: Diagram of an index on the score field (ascending).
Compound Index
MongoDB also supports user-defined indexes on multiple fields. These compound indexes (page 440) behave like single-field indexes; however, the query can select documents based on additional fields. The order of fields listed in a compound index has significance. For instance, if a compound index consists of { userid: 1, score: -1 }, the index sorts first by userid and then, within each userid value, sorts by score. Consider the following illustration of this compound index:
Multikey Index
MongoDB uses multikey indexes (page 442) to index the content stored in arrays. If you index a field that holds an array value, MongoDB creates separate index entries for every element of the array. These multikey indexes (page 442) allow queries to select documents that contain arrays by matching on element or elements of the arrays. MongoDB automatically determines whether to create a multikey index if the indexed field contains an array value; you do not need to explicitly specify the multikey type. Consider the following illustration of a multikey index:
Geospatial Index
To support efficient queries of geospatial coordinate data, MongoDB provides two special indexes: 2d indexes (page 451) that use planar geometry when returning results and 2dsphere indexes (page 447) that use spherical geometry to return results.
434 Chapter 8. Indexes
  • 439. MongoDB Documentation, Release 2.6.4 Figure 8.5: Diagram of a compound index on the userid field (ascending) and the score field (descending). The index sorts first by the userid field and then by the score field. Figure 8.6: Diagram of a multikey index on the addr.zip field. The addr field contains an array of address documents. The address documents contain the zip field. 8.1. Index Introduction 435
See 2d Index Internals (page 452) for a high-level introduction to geospatial indexes.
Text Indexes
MongoDB provides a text index type that supports searching for string content in a collection. These text indexes do not store language-specific stop words (e.g. "the", "a", "or") and stem the words in a collection to only store root words.
See Text Indexes (page 454) for more information on text indexes and search.
Hashed Indexes
To support hash-based sharding (page 621), MongoDB provides a hashed index (page 455) type, which indexes the hash of the value of a field. These indexes have a more random distribution of values along their range, but only support equality matches and cannot support range-based queries.
8.1.3 Index Properties
Unique Indexes
The unique (page 457) property for an index causes MongoDB to reject duplicate values for the indexed field. To create a unique index (page 457) on a field that already has duplicate values, see Drop Duplicates (page 461) for index creation options. Other than the unique constraint, unique indexes are functionally interchangeable with other MongoDB indexes.
Sparse Indexes
The sparse (page 457) property of an index ensures that the index only contains entries for documents that have the indexed field. The index skips documents that do not have the indexed field.
You can combine the sparse index option with the unique index option to reject documents that have duplicate values for a field but ignore documents that do not have the indexed key.
8.1.4 Index Intersection
New in version 2.6.
MongoDB can use the intersection of indexes (page 462) to fulfill queries. For queries that specify compound query conditions, if one index can fulfill a part of a query condition, and another index can fulfill another part of the query condition, then MongoDB can use the intersection of the two indexes to fulfill the query.
Whether the use of a compound index or the use of an index intersection is more efficient depends on the particular query and the system. For details on index intersection, see Index Intersection (page 462).
8.2 Index Concepts
These documents describe and provide examples of the types, configuration options, and behavior of indexes in MongoDB. For an overview of indexing, see Index Introduction (page 431). For operational instructions, see Indexing Tutorials (page 464). The Indexing Reference (page 500) documents the commands and operations specific to index construction, maintenance, and querying in MongoDB, including index types and creation options.
436 Chapter 8. Indexes
Index Types (page 437) MongoDB provides different types of indexes for different purposes and different types of content.
Single Field Indexes (page 438) A single field index only includes data from a single field of the documents in a collection. MongoDB supports single field indexes on fields at the top level of a document and on fields in sub-documents.
Compound Indexes (page 440) A compound index includes more than one field of the documents in a collection.
Multikey Indexes (page 442) A multikey index references an array and records a match if a query includes any value in the array.
Geospatial Indexes and Queries (page 444) Geospatial indexes support location-based searches on data that is stored as either GeoJSON objects or legacy coordinate pairs.
Text Indexes (page 454) Text indexes support search of string content in documents.
Hashed Index (page 455) Hashed indexes maintain entries with hashes of the values of the indexed field.
Index Properties (page 456) The properties you can specify when building indexes.
TTL Indexes (page 456) The TTL index is used for TTL collections, which expire data after a period of time.
Unique Indexes (page 457) A unique index causes MongoDB to reject all documents that contain a duplicate value for the indexed field.
Sparse Indexes (page 457) A sparse index does not index documents that do not have the indexed field.
Index Creation (page 460) The options available when creating indexes.
Index Intersection (page 462) The use of index intersection to fulfill a query.
8.2.1 Index Types
MongoDB provides a number of different index types. You can create indexes on any field or embedded field within a document or sub-document. You can create single field indexes (page 438) or compound indexes (page 440). MongoDB also supports indexes of arrays, called multi-key indexes (page 442), as well as indexes on geospatial data (page 444).
For a list of the supported index types, see Index Type Documentation (page 438).
In general, you should create indexes that support your common and user-facing queries. Having these indexes will ensure that MongoDB scans the smallest possible number of documents.
In the mongo shell, you can create an index by calling the ensureIndex() method. For more detailed instructions about building indexes, see the Indexing Tutorials (page 464) page.
Behavior of Indexes
All indexes in MongoDB are B-tree indexes, which can efficiently support equality matches and range queries. The index stores items internally in order, sorted by the value of the index field. The ordering of index entries supports efficient range-based operations and allows MongoDB to return sorted results using the order of documents in the index.
Ordering of Indexes
MongoDB indexes may be ascending (i.e. 1) or descending (i.e. -1) in their ordering. However, MongoDB may also traverse the index in either direction. As a result, for single-field indexes, ascending and descending indexes are
8.2. Index Concepts 437
  • 442. MongoDB Documentation, Release 2.6.4 interchangeable. This is not the case for compound indexes: in compound indexes, the direction of the sort order can have a greater impact on the results. See Sort Order (page 441) for more information on the impact of index order on results in compound indexes. Index Intersection MongoDB can use the intersection of indexes to fulfill queries with compound conditions. See Index Intersection (page 462) for details. Limits Certain restrictions apply to indexes, such as the length of the index keys or the number of indexes per collection. See Index Limitations for details. Index Type Documentation Single Field Indexes (page 438) A single field index only includes data from a single field of the documents in a collection. MongoDB supports single field indexes on fields at the top level of a document and on fields in sub-documents. Compound Indexes (page 440) A compound index includes more than one field of the documents in a collection. Multikey Indexes (page 442) A multikey index references an array and records a match if a query includes any value in the array. Geospatial Indexes and Queries (page 444) Geospatial indexes support location-based searches on data that is stored as either GeoJSON objects or legacy coordinate pairs. Text Indexes (page 454) Text indexes supports search of string content in documents. Hashed Index (page 455) Hashed indexes maintain entries with hashes of the values of the indexed field. Single Field Indexes MongoDB provides complete support for indexes on any field in a collection of documents. By default, all collections have an index on the _id field (page 439), and applications and users may add additional indexes to support important queries and operations. MongoDB supports indexes that contain either a single field or multiple fields depending on the operations that this index-type supports. This document describes indexes that contain a single field. 
Consider the following illustration of a single field index.
See also:
Compound Indexes (page 440) for information about indexes that include multiple fields, and Index Introduction (page 431) for a higher level introduction to indexing in MongoDB.
Example Given the following document in the friends collection:
{ "_id" : ObjectId(...), "name" : "Alice", "age" : 27 }
438 Chapter 8. Indexes
Figure 8.7: Diagram of an index on the score field (ascending).
The following command creates an index on the name field:
db.friends.ensureIndex( { "name" : 1 } )
Cases
_id Field Index
MongoDB creates the _id index, which is an ascending unique index (page 457) on the _id field, for all collections when the collection is created. You cannot remove the index on the _id field.
Think of the _id field as the primary key for a collection. Every document must have a unique _id field. You may store any unique value in the _id field. The default value of _id is an ObjectId which is generated when the client inserts the document. An ObjectId is a 12-byte unique identifier suitable for use as the value of an _id field.
Note: In sharded clusters, if you do not use the _id field as the shard key, then your application must ensure the uniqueness of the values in the _id field to prevent errors. This is most often done by using a standard auto-generated ObjectId.
Before version 2.2, capped collections did not have an _id field. In version 2.2 and newer, capped collections do have an _id field, except those in the local database. See Capped Collections Recommendations and Restrictions (page 196) for more information.
Indexes on Embedded Fields
You can create indexes on fields embedded in sub-documents, just as you can index top-level fields in documents. Indexes on embedded fields differ from indexes on sub-documents (page 440), which include the full content up to the maximum index size of the sub-document in the index. Instead, indexes on embedded fields allow you to use "dot notation" to introspect into sub-documents.
Consider a collection named people that holds documents that resemble the following example document:
{ "_id": ObjectId(...),
  "name": "John Doe",
  "address": {
    "street": "Main",
    "zipcode": "53511",
    "state": "WI"
  }
}
8.2. Index Concepts 439
  • 444. MongoDB Documentation, Release 2.6.4 You can create an index on the address.zipcode field, using the following specification: db.people.ensureIndex( { "address.zipcode": 1 } ) Indexes on Subdocuments You can also create indexes on subdocuments. For example, the factories collection contains documents that contain a metro field, such as: { _id: ObjectId(...), metro: { city: "New York", state: "NY" }, name: "Giant Factory" } The metro field is a subdocument, containing the embedded fields city and state. The following command creates an index on the metro field as a whole: db.factories.ensureIndex( { metro: 1 } ) The following query can use the index on the metro field: db.factories.find( { metro: { city: "New York", state: "NY" } } ) This query returns the above document. When performing equality matches on subdocuments, field order matters and the subdocuments must match exactly. For example, the following query does not match the above document: db.factories.find( { metro: { state: "NY", city: "New York" } } ) See query-subdocuments for more information regarding querying on subdocuments. Compound Indexes MongoDB supports compound indexes, where a single index structure holds references to multiple fields 2 within a collection’s documents. The following diagram illustrates an example of a compound index on two fields: Compound indexes can support queries that match on multiple fields. Example Consider a collection named products that holds documents that resemble the following document: { "_id": ObjectId(...), "item": "Banana", "category": ["food", "produce", "grocery"], "location": "4th Street Store", "stock": 4, "type": "cases", "arrival": Date(...) } If applications query on the item field as well as query on both the item field and the stock field, you can specify a single compound index to support both of these queries: 2 MongoDB imposes a limit of 31 fields for any compound index. 440 Chapter 8. Indexes
  • 445. MongoDB Documentation, Release 2.6.4 Figure 8.8: Diagram of a compound index on the userid field (ascending) and the score field (descending). The index sorts first by the userid field and then by the score field. db.products.ensureIndex( { "item": 1, "stock": 1 } ) Important: You may not create compound indexes that have hashed index fields. You will receive an error if you attempt to create a compound index that includes a hashed index (page 455). The order of the fields in a compound index is very important. In the previous example, the index will contain references to documents sorted first by the values of the item field and, within each value of the item field, sorted by values of the stock field. See Sort Order (page 441) for more information. In addition to supporting queries that match on all the index fields, compound indexes can support queries that match on the prefix of the index fields. For details, see Prefixes (page 442). Sort Order Indexes store references to fields in either ascending (1) or descending (-1) sort order. For single-field indexes, the sort order of keys doesn’t matter because MongoDB can traverse the index in either direction. However, for compound indexes (page 440), sort order can matter in determining whether the index can support a sort operation. Consider a collection events that contains documents with the fields username and date. Applications can issue queries that return results sorted first by ascending username values and then by descending (i.e. 
more recent to last) date values, such as: db.events.find().sort( { username: 1, date: -1 } ) or queries that return results sorted first by descending username values and then by ascending date values, such as: db.events.find().sort( { username: -1, date: 1 } ) The following index can support both these sort operations: db.events.ensureIndex( { "username" : 1, "date" : -1 } ) However, the above index cannot support sorting by ascending username values and then by ascending date values, such as the following: db.events.find().sort( { username: 1, date: 1 } ) 8.2. Index Concepts 441
Prefixes

Compound indexes support queries on any prefix of the index fields. Index prefixes are the beginning subsets of indexed fields. For example, given the index { a: 1, b: 1, c: 1 }, both { a: 1 } and { a: 1, b: 1 } are prefixes of the index.

If you have a collection that has a compound index on { a: 1, b: 1 }, as well as an index that consists of the prefix of that index, i.e. { a: 1 }, then, assuming neither index has a sparse or unique constraint, you can drop the { a: 1 } index. MongoDB will be able to use the compound index in all of the situations that it would have used the { a: 1 } index.

For example, given the following index:

{ "item": 1, "location": 1, "stock": 1 }

MongoDB can use this index to support queries that include:

• the item field,
• the item field and the location field,
• the item field and the location field and the stock field, or
• only the item and stock fields; however, this index would be less efficient than an index on only item and stock.

MongoDB cannot use this index to support queries that include:

• only the location field,
• only the stock field, or
• only the location and stock fields.

Index Intersection

Starting in version 2.6, MongoDB can use index intersection (page 462) to fulfill queries. The choice between creating compound indexes that support your queries or relying on index intersection depends on the specifics of your system. See Index Intersection and Compound Indexes (page 463) for more details.

Multikey Indexes

To index a field that holds an array value, MongoDB creates an index entry for each element in the array. These multikey indexes allow MongoDB to return documents from queries using the value of an array. MongoDB automatically determines whether to create a multikey index if the indexed field contains an array value; you do not need to explicitly specify the multikey type.
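The compound-index prefix rules described above can be sketched as a predicate that tests whether a query's fields form a prefix of the index's fields (the order of fields within the query itself does not matter). Illustrative only; the helper name `isPrefixSupported` is hypothetical, and the sketch deliberately ignores the less-efficient non-prefix uses noted above.

```javascript
// Sketch: a compound index fully supports a query whose fields are
// exactly a prefix of the index fields. Illustrative; not MongoDB's
// planner, and non-prefix combinations the index can still serve
// less efficiently are reported as false here.
function isPrefixSupported(indexFields, queryFields) {
  const q = new Set(queryFields);
  return q.size <= indexFields.length &&
         indexFields.slice(0, q.size).every(f => q.has(f));
}

const idx = ["item", "location", "stock"];
isPrefixSupported(idx, ["item"]);             // true
isPrefixSupported(idx, ["location", "item"]); // true (query field order is irrelevant)
isPrefixSupported(idx, ["location"]);         // false
isPrefixSupported(idx, ["item", "stock"]);    // false (usable, but not via a prefix)
```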
Consider the following illustration of a multikey index:

Multikey indexes support all operations supported by other MongoDB indexes; however, applications may use multikey indexes to select documents based on ranges of values for the value of an array. Multikey indexes support arrays that hold both values (e.g. strings, numbers) and nested documents.

Limitations

Interactions between Compound and Multikey Indexes

While you can create multikey compound indexes (page 440), at most one field in a compound index may hold an array. For example, given an index on { a: 1, b: 1 }, the following documents are permissible:

{ a: [1, 2], b: 1 }

{ a: 1, b: [1, 2] }
Figure 8.9: Diagram of a multikey index on the addr.zip field. The addr field contains an array of address documents. The address documents contain the zip field.

However, the following document is impermissible, and MongoDB cannot insert such a document into a collection with the { a: 1, b: 1 } index:

{ a: [1, 2], b: [1, 2] }

If you attempt to insert such a document, MongoDB will reject the insertion and produce an error that says cannot index parallel arrays. MongoDB does not index parallel arrays because they require the index to include each value in the Cartesian product of the compound keys, which could quickly result in incredibly large and difficult to maintain indexes.

Shard Keys

Important: The index of a shard key cannot be a multikey index.

Hashed Indexes

hashed indexes are not compatible with multikey indexes. To compute the hash for a hashed index, MongoDB collapses sub-documents and computes the hash for the entire value. For fields that hold arrays or sub-documents, you cannot use the index to support queries that introspect the sub-document.

Examples

Index Basic Arrays

Given the following document:

{ "_id" : ObjectId("..."),
"name" : "Warm Weather",
"author" : "Steve",
"tags" : [ "weather", "hot", "record", "april" ] }

An index on the tags field, { tags: 1 }, would be a multikey index and would include these four separate entries for that document:

• "weather",
• "hot",
• "record", and
• "april".

Queries could use the multikey index to select documents that match any of the above values.

Index Arrays with Embedded Documents

You can create multikey indexes on fields in objects embedded in arrays, as in the following example:

Consider a feedback collection with documents in the following form:

{
"_id": ObjectId(...),
"title": "Grocery Quality",
"comments": [
{ author_id: ObjectId(...), date: Date(...), text: "Please expand the cheddar selection." },
{ author_id: ObjectId(...), date: Date(...), text: "Please expand the mustard selection." },
{ author_id: ObjectId(...), date: Date(...), text: "Please expand the olive selection." }
]
}

An index on the comments.text field would be a multikey index and would add items to the index for all embedded documents in the array. With the index { "comments.text": 1 } on the feedback collection, consider the following query:

db.feedback.find( { "comments.text": "Please expand the olive selection." } )

The query would select the documents in the collection that contain the following embedded document in the comments array:

{ author_id: ObjectId(...), date: Date(...), text: "Please expand the olive selection." }

Geospatial Indexes and Queries

MongoDB offers a number of indexes and query mechanisms to handle geospatial information. This section introduces MongoDB’s geospatial features. For complete examples of geospatial queries in MongoDB, see Geospatial Index Tutorials (page 476).
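Returning to the multikey examples above: conceptually, multikey indexing amounts to emitting one index entry per array element (or a single entry for a non-array value). A minimal sketch; the helper `multikeyEntries` is hypothetical, not a MongoDB API, and MongoDB performs this expansion internally.

```javascript
// Sketch: a multikey index creates one index entry per array element;
// a non-array value yields a single entry. Illustrative only.
function multikeyEntries(doc, field) {
  const value = doc[field];
  const elements = Array.isArray(value) ? value : [value];
  return elements.map(element => ({ key: element, docId: doc._id }));
}

const doc = { _id: 1, name: "Warm Weather", tags: ["weather", "hot", "record", "april"] };
multikeyEntries(doc, "tags");
// four entries: one each for "weather", "hot", "record", and "april"
```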
Surfaces

Before storing your location data and writing queries, you must decide the type of surface to use to perform calculations. The type you choose affects how you store data, what type of index to build, and the syntax of your queries.

MongoDB offers two surface types:

Spherical To calculate geometry over an Earth-like sphere, store your location data on a spherical surface and use a 2dsphere (page 447) index. Store your location data as GeoJSON objects with this coordinate-axis order: longitude, latitude. The coordinate reference system for GeoJSON uses the WGS84 datum.

Flat To calculate distances on a Euclidean plane, store your location data as legacy coordinate pairs and use a 2d (page 451) index.

Location Data

If you choose spherical surface calculations, you store location data as either:

GeoJSON Objects Queries on GeoJSON objects always calculate on a sphere. The default coordinate reference system for GeoJSON uses the WGS84 datum.

New in version 2.4: Support for GeoJSON storage and queries is new in version 2.4. Prior to version 2.4, all geospatial data used coordinate pairs.

Changed in version 2.6: Support for additional GeoJSON types: MultiPoint, MultiLineString, MultiPolygon, GeometryCollection.

MongoDB supports the following GeoJSON objects:

• Point
• LineString
• Polygon
• MultiPoint
• MultiLineString
• MultiPolygon
• GeometryCollection

Legacy Coordinate Pairs MongoDB supports spherical surface calculations on legacy coordinate pairs using a 2dsphere index by converting the data to the GeoJSON Point type.

If you choose flat surface calculations and use a 2d index, you can store data only as legacy coordinate pairs.

Query Operations

MongoDB’s geospatial query operators let you query for:

Inclusion MongoDB can query for locations contained entirely within a specified polygon. Inclusion queries use the $geoWithin operator. Both 2d and 2dsphere indexes can support inclusion queries.
MongoDB does not require an index for inclusion queries after 2.2.3; however, these indexes will improve query performance. 8.2. Index Concepts 445
Intersection MongoDB can query for locations that intersect with a specified geometry. These queries apply only to data on a spherical surface. These queries use the $geoIntersects operator. Only 2dsphere indexes support intersection.

Proximity MongoDB can query for the points nearest to another point. Proximity queries use the $near operator. The $near operator requires a 2d or 2dsphere index.

Geospatial Indexes

MongoDB provides the following geospatial index types to support geospatial queries.

2dsphere

2dsphere (page 447) indexes support:

• Calculations on a sphere
• GeoJSON objects, with backwards compatibility for legacy coordinate pairs
• A compound index with scalar index fields (i.e. ascending or descending) as a prefix or suffix of the 2dsphere index field

New in version 2.4: 2dsphere indexes are not available before version 2.4.

See also:

Query a 2dsphere Index (page 478)

2d

2d (page 451) indexes support:

• Calculations using flat geometry
• Legacy coordinate pairs (i.e., geospatial points on a flat coordinate system)
• A compound index with only one additional field, as a suffix of the 2d index field

See also:

Query a 2d Index (page 481)

Geospatial Indexes and Sharding

You cannot use a geospatial index as the shard key index. You can create and maintain a geospatial index on a sharded collection if it uses a field other than the shard key.

For sharded collections, queries using $near are not supported. You can instead use either the geoNear command or the $geoNear aggregation stage. You also can query for geospatial data using $geoWithin.

Additional Resources

The following pages provide complete documentation for geospatial indexes and queries:

2dsphere Indexes (page 447) A 2dsphere index supports queries that calculate geometries on an earth-like sphere. The index supports data stored as both GeoJSON objects and as legacy coordinate pairs.
2d Indexes (page 451) The 2d index supports data stored as legacy coordinate pairs and is intended for use in MongoDB 2.2 and earlier.

geoHaystack Indexes (page 452) A haystack index is a special index optimized to return results over small areas. For queries that use spherical geometry, a 2dsphere index is a better option than a haystack index.

2d Index Internals (page 452) Provides a more in-depth explanation of the internals of geospatial indexes. This material is not necessary for normal operations but may be useful for troubleshooting and for further understanding.
2dsphere Indexes

New in version 2.4.

A 2dsphere index supports queries that calculate geometries on an earth-like sphere. The index supports data stored as both GeoJSON objects and as legacy coordinate pairs. The index supports legacy coordinate pairs by converting the data to the GeoJSON Point type. The default datum for an earth-like sphere in MongoDB 2.4 is WGS84. Coordinate-axis order is longitude, latitude.

The 2dsphere index supports all MongoDB geospatial queries: queries for inclusion, intersection and proximity. See http://docs.mongodb.org/manual/reference/operator/query-geospatial for the query operators that support geospatial queries.

To create a 2dsphere index, use the db.collection.ensureIndex method. A compound (page 440) 2dsphere index can reference multiple location and non-location fields within a collection’s documents. See Create a 2dsphere Index (page 476) for more information.

2dsphere Version 2

Changed in version 2.6.

MongoDB 2.6 introduces a version 2 of 2dsphere indexes. Version 2 is the default version of 2dsphere indexes created in MongoDB 2.6. To create a 2dsphere index as a version 1, include the option { "2dsphereIndexVersion": 1 } when creating the index.

Additional GeoJSON Objects

Changed in version 2.6.

Version 2 adds support for additional GeoJSON objects: MultiPoint (page 449), MultiLineString (page 450), MultiPolygon (page 450), and GeometryCollection (page 450).

sparse Property

Changed in version 2.6.

Version 2 2dsphere indexes are sparse (page 457) by default and ignore the sparse: true (page 457) option. If a document lacks a 2dsphere index field (or the field is null or an empty array), MongoDB does not add an entry for the document to the 2dsphere index. For inserts, MongoDB inserts the document but does not add it to the 2dsphere index.
For a compound index that includes a 2dsphere index key along with keys of other types, only the 2dsphere index field determines whether the index references a document.

Earlier versions of MongoDB only support version 1 2dsphere indexes. Version 1 2dsphere indexes are not sparse by default and will reject documents with null location fields.

Considerations

geoNear and $geoNear Restrictions The geoNear command and the $geoNear pipeline stage require that a collection have at most one 2dsphere index and at most one 2d (page 451) index, whereas geospatial query operators (e.g. $near and $geoWithin) permit collections to have multiple geospatial indexes.

The geospatial index restriction for the geoNear command and the $geoNear pipeline stage exists because neither the geoNear command nor the $geoNear pipeline stage syntax includes the location field. As such, index selection among multiple 2d indexes or 2dsphere indexes is ambiguous.

No such restriction applies for geospatial query operators since these operators take a location field, eliminating the ambiguity.

Shard Key Restrictions You cannot use a 2dsphere index as a shard key when sharding a collection. However, you can create and maintain a geospatial index on a sharded collection by using a different field as the shard key.
GeoJSON Objects

MongoDB supports the following GeoJSON objects:

• Point (page 448)
• LineString (page 448)
• Polygon (page 448)
• MultiPoint (page 449)
• MultiLineString (page 450)
• MultiPolygon (page 450)
• GeometryCollection (page 450)

The MultiPoint (page 449), MultiLineString (page 450), MultiPolygon (page 450), and GeometryCollection (page 450) types require 2dsphere index version 2.

In order to index GeoJSON data, you must store the data in a location field that you name. The location field contains a subdocument with a type field specifying the GeoJSON object type and a coordinates field specifying the object’s coordinates. Always store coordinates in longitude, latitude order.

Use the following syntax:

{ <location field>: { type: "<GeoJSON type>" , coordinates: <coordinates> } }

Point

New in version 2.4.

The following example stores a GeoJSON Point:

{ loc: { type: "Point", coordinates: [ 40, 5 ] } }

LineString

New in version 2.4.

The following example stores a GeoJSON LineString:

{ loc: { type: "LineString", coordinates: [ [ 40, 5 ], [ 41, 6 ] ] } }

Polygon

New in version 2.4.

Polygons consist of an array of GeoJSON LinearRing coordinate arrays. These LinearRings are closed LineStrings. Closed LineStrings have at least four coordinate pairs and specify the same position as the first and last coordinates.

The line that joins two points on a curved surface may or may not contain the same set of coordinates that joins those two points on a flat surface. The line that joins two points on a curved surface will be a geodesic. Carefully check points to avoid errors with shared edges, as well as overlaps and other types of intersections.

Polygons with a Single Ring The following example stores a GeoJSON Polygon with an exterior ring and no interior rings (or holes).
Note the first and last coordinate pair with the [ 0 , 0 ] coordinate:

{ loc : { type: "Polygon", coordinates: [ [ [ 0 , 0 ] , [ 3 , 6 ] , [ 6 , 1 ] , [ 0 , 0 ] ] ] } }
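The LinearRing rules stated above (at least four coordinate pairs, with the first and last positions identical) can be checked with a small sketch. The helper `isClosedRing` is hypothetical, not a MongoDB API; MongoDB validates geometries itself on insert.

```javascript
// Sketch: validate a GeoJSON LinearRing per the rules above — at
// least four positions, and the first position must equal the last.
// Hypothetical helper for illustration only.
function isClosedRing(ring) {
  if (ring.length < 4) return false;
  const [x0, y0] = ring[0];
  const [xn, yn] = ring[ring.length - 1];
  return x0 === xn && y0 === yn;
}

isClosedRing([[0, 0], [3, 6], [6, 1], [0, 0]]);  // true: the example ring above
isClosedRing([[0, 0], [3, 6], [6, 1]]);          // false: only three positions
```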
For Polygons with a single ring, the ring cannot self-intersect.

Polygons with Multiple Rings For Polygons with multiple rings:

• The first described ring must be the exterior ring.
• The exterior ring cannot self-intersect.
• Any interior ring must be entirely contained by the outer ring.
• Interior rings cannot intersect or overlap each other. Interior rings cannot share an edge.

The following document represents a polygon with an interior ring as GeoJSON:

{ loc : { type : "Polygon",
coordinates : [
[ [ 0 , 0 ] , [ 3 , 6 ] , [ 6 , 1 ] , [ 0 , 0 ] ],
[ [ 2 , 2 ] , [ 3 , 3 ] , [ 4 , 2 ] , [ 2 , 2 ] ]
] } }

Figure 8.10: Diagram of a Polygon with internal ring.

MultiPoint

New in version 2.6: Requires 2dsphere index version 2.

The following example stores coordinates of GeoJSON type MultiPoint3:

{ loc: { type: "MultiPoint",
coordinates: [
[ -73.9580, 40.8003 ],

3http://geojson.org/geojson-spec.html#id5
[ -73.9498, 40.7968 ],
[ -73.9737, 40.7648 ],
[ -73.9814, 40.7681 ]
] } }

MultiLineString

New in version 2.6: Requires 2dsphere index version 2.

The following example stores coordinates of GeoJSON type MultiLineString4:

{ loc: { type: "MultiLineString",
coordinates: [
[ [ -73.96943, 40.78519 ], [ -73.96082, 40.78095 ] ],
[ [ -73.96415, 40.79229 ], [ -73.95544, 40.78854 ] ],
[ [ -73.97162, 40.78205 ], [ -73.96374, 40.77715 ] ],
[ [ -73.97880, 40.77247 ], [ -73.97036, 40.76811 ] ]
] } }

MultiPolygon

New in version 2.6: Requires 2dsphere index version 2.

The following example stores coordinates of GeoJSON type MultiPolygon5:

{ loc: { type: "MultiPolygon",
coordinates: [
[ [ [ -73.958, 40.8003 ], [ -73.9498, 40.7968 ], [ -73.9737, 40.7648 ], [ -73.9814, 40.7681 ], [ -73.958, 40.8003 ] ] ],
[ [ [ -73.958, 40.8003 ], [ -73.9498, 40.7968 ], [ -73.9737, 40.7648 ], [ -73.958, 40.8003 ] ] ]
] } }

GeometryCollection

New in version 2.6: Requires 2dsphere index version 2.

The following example stores coordinates of GeoJSON type GeometryCollection6:

{ loc: {
type: "GeometryCollection",
geometries: [
{
type: "MultiPoint",
coordinates: [
[ -73.9580, 40.8003 ],
[ -73.9498, 40.7968 ],
[ -73.9737, 40.7648 ],
[ -73.9814, 40.7681 ]
]

4http://geojson.org/geojson-spec.html#id6
5http://geojson.org/geojson-spec.html#id7
6http://geojson.org/geojson-spec.html#geometrycollection
},
{
type: "MultiLineString",
coordinates: [
[ [ -73.96943, 40.78519 ], [ -73.96082, 40.78095 ] ],
[ [ -73.96415, 40.79229 ], [ -73.95544, 40.78854 ] ],
[ [ -73.97162, 40.78205 ], [ -73.96374, 40.77715 ] ],
[ [ -73.97880, 40.77247 ], [ -73.97036, 40.76811 ] ]
]
}
] } }

2d Indexes

Use a 2d index for data stored as points on a two-dimensional plane. The 2d index is intended for legacy coordinate pairs used in MongoDB 2.2 and earlier.

Use a 2d index if:

• your database has legacy location data from MongoDB 2.2 or earlier, and
• you do not intend to store any location data as GeoJSON objects.

See http://docs.mongodb.org/manual/reference/operator/query-geospatial for the query operators that support geospatial queries.

Considerations

The geoNear command and the $geoNear pipeline stage require that a collection have at most one 2d index and at most one 2dsphere index (page 447), whereas geospatial query operators (e.g. $near and $geoWithin) permit collections to have multiple geospatial indexes.

The geospatial index restriction for the geoNear command and the $geoNear pipeline stage exists because neither the geoNear command nor the $geoNear pipeline stage syntax includes the location field. As such, index selection among multiple 2d indexes or 2dsphere indexes is ambiguous.

No such restriction applies for geospatial query operators since these operators take a location field, eliminating the ambiguity.

Do not use a 2d index if your location data includes GeoJSON objects. To index on both legacy coordinate pairs and GeoJSON objects, use a 2dsphere (page 447) index.

You cannot use a 2d index as a shard key when sharding a collection. However, you can create and maintain a geospatial index on a sharded collection by using a different field as the shard key.

Behavior

The 2d index supports calculations on a flat, Euclidean plane.
The 2d index also supports distance-only calculations on a sphere, but for geometric calculations (e.g. $geoWithin) on a sphere, store data as GeoJSON objects and use the 2dsphere index type. A 2d index can reference two fields. The first must be the location field. A 2d compound index constructs queries that select first on the location field, and then filters those results by the additional criteria. A compound 2d index can cover queries. Points on a 2D Plane To store location data as legacy coordinate pairs, use an array or an embedded document. When possible, use the array format: loc : [ <longitude> , <latitude> ] Consider the embedded document form: 8.2. Index Concepts 451
loc : { lng : <longitude> , lat : <latitude> }

Arrays are preferred as certain languages do not guarantee associative map ordering.

For all points, if you use longitude and latitude, store coordinates in longitude, latitude order.

sparse Property

2d indexes are sparse (page 457) by default and ignore the sparse: true (page 457) option. If a document lacks a 2d index field (or the field is null or an empty array), MongoDB does not add an entry for the document to the 2d index. For inserts, MongoDB inserts the document but does not add it to the 2d index.

For a compound index that includes a 2d index key along with keys of other types, only the 2d index field determines whether the index references a document.

geoHaystack Indexes

A geoHaystack index is a special index that is optimized to return results over small areas. geoHaystack indexes improve performance on queries that use flat geometry.

For queries that use spherical geometry, a 2dsphere index is a better option than a haystack index. 2dsphere indexes (page 447) allow field reordering; geoHaystack indexes require the first field to be the location field. Also, geoHaystack indexes are only usable via commands and so always return all results at once.

Behavior

geoHaystack indexes create “buckets” of documents from the same geographic area in order to improve performance for queries limited to that area. Each bucket in a geoHaystack index contains all the documents within a specified proximity to a given longitude and latitude.

sparse Property

geoHaystack indexes are sparse (page 457) by default and ignore the sparse: true (page 457) option. If a document lacks a geoHaystack index field (or the field is null or an empty array), MongoDB does not add an entry for the document to the geoHaystack index. For inserts, MongoDB inserts the document but does not add it to the geoHaystack index.
geoHaystack indexes include one geoHaystack index key and one non-geospatial index key; however, only the geoHaystack index field determines whether the index references a document.

Create geoHaystack Index

To create a geoHaystack index, see Create a Haystack Index (page 482). For information and example on querying a haystack index, see Query a Haystack Index (page 483).

2d Index Internals

This document provides a more in-depth explanation of the internals of MongoDB’s 2d geospatial indexes. This material is not necessary for normal operations or application development but may be useful for troubleshooting and for further understanding.

Calculation of Geohash Values for 2d Indexes

When you create a geospatial index on legacy coordinate pairs, MongoDB computes geohash values for the coordinate pairs within the specified location range (page 480) and then indexes the geohash values.

To calculate a geohash value, recursively divide a two-dimensional map into quadrants. Then assign each quadrant a two-bit value. For example, a two-bit representation of four quadrants would be:

01 11
00 10
These two-bit values (00, 01, 10, and 11) represent each of the quadrants and all points within each quadrant. For a geohash with two bits of resolution, all points in the bottom left quadrant would have a geohash of 00. The top left quadrant would have the geohash of 01. The bottom right and top right would have a geohash of 10 and 11, respectively.

To provide additional precision, continue dividing each quadrant into sub-quadrants. Each sub-quadrant would have the geohash value of the containing quadrant concatenated with the value of the sub-quadrant. The geohash for the upper-right quadrant is 11, and the geohash for the sub-quadrants would be (clockwise from the top left): 1101, 1111, 1110, and 1100, respectively.

Multi-location Documents for 2d Indexes

New in version 2.0: Support for multiple locations in a document.

While 2d geospatial indexes do not support more than one set of coordinates in a document, you can use a multikey index (page 442) to index multiple coordinate pairs in a single document. In the simplest example you may have a field (e.g. locs) that holds an array of coordinates, as in the following example:

{ _id : ObjectId(...),
locs : [ [ 55.5 , 42.3 ] ,
[ -74 , 44.74 ] ,
{ lng : 55.5 , lat : 42.3 } ] }

The values of the array may be either arrays, as in [ 55.5, 42.3 ], or embedded documents, as in { lng : 55.5 , lat : 42.3 }.

You could then create a geospatial index on the locs field, as in the following:

db.places.ensureIndex( { "locs": "2d" } )

You may also model the location data as a field inside of a sub-document. In this case, the document would contain a field (e.g. addresses) that holds an array of documents where each document has a field (e.g. loc:) that holds location coordinates.
For example:

{ _id : ObjectId(...),
name : "...",
addresses : [
{ context : "home" , loc : [ 55.5, 42.3 ] } ,
{ context : "home", loc : [ -74 , 44.74 ] }
] }

You could then create the geospatial index on the addresses.loc field as in the following example:

db.records.ensureIndex( { "addresses.loc": "2d" } )

To include the location field with the distance field in multi-location document queries, specify includeLocs: true in the geoNear command.

See also:

geospatial-query-compatibility-chart
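The quadrant subdivision described in this section can be sketched as follows, using the two-bit layout above (00 bottom-left, 01 top-left, 10 bottom-right, 11 top-right). This is illustrative only, not MongoDB's internal geohash implementation; the function name and the default [0, 1] location ranges are assumptions.

```javascript
// Sketch of geohash computation by recursive quadrant division.
// First bit of each pair: right half (1) or left half (0);
// second bit: top half (1) or bottom half (0) — matching the
// 00/01/10/11 quadrant values described above.
function geohash(x, y, levels, xMin = 0, xMax = 1, yMin = 0, yMax = 1) {
  let bits = "";
  for (let i = 0; i < levels; i++) {
    const xMid = (xMin + xMax) / 2;
    const yMid = (yMin + yMax) / 2;
    const right = x >= xMid;
    const top = y >= yMid;
    bits += (right ? "1" : "0") + (top ? "1" : "0");
    if (right) xMin = xMid; else xMax = xMid;   // narrow to the chosen
    if (top) yMin = yMid; else yMax = yMid;     // sub-quadrant
  }
  return bits;
}

geohash(0.1, 0.1, 1);  // "00"   (bottom-left quadrant)
geohash(0.6, 0.9, 2);  // "1101" (top-left sub-quadrant of the top-right quadrant)
```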
Text Indexes

New in version 2.4.

MongoDB provides text indexes to support text search of string content in documents of a collection. text indexes can include any field whose value is a string or an array of string elements. To perform queries that access the text index, use the $text query operator.

Changed in version 2.6: MongoDB enables the text search feature by default. In MongoDB 2.4, you need to enable the text search feature manually to create text indexes and perform text search (page 455).

Create Text Index

To create a text index, use the db.collection.ensureIndex() method. To index a field that contains a string or an array of string elements, include the field and specify the string literal "text" in the index document, as in the following example:

db.reviews.ensureIndex( { comments: "text" } )

A collection can have at most one text index. For examples of creating text indexes on multiple fields, see Create a text Index (page 486).

Supported Languages and Stop Words

MongoDB supports text search for various languages. text indexes drop language-specific stop words (e.g. in English, “the”, “an”, “a”, “and”, etc.) and use simple language-specific suffix stemming. For a list of the supported languages, see Text Search Languages (page 501). If you specify a language value of "none", then the text index uses simple tokenization with no list of stop words and no stemming.

If the index language is English, text indexes are case-insensitive for non-diacritics; i.e. case insensitive for [A-z].

To specify a language for the text index, see Specify a Language for Text Index (page 487).

sparse Property

text indexes are sparse (page 457) by default and ignore the sparse: true (page 457) option. If a document lacks a text index field (or the field is null or an empty array), MongoDB does not add an entry for the document to the text index. For inserts, MongoDB inserts the document but does not add it to the text index.
For a compound index that includes a text index key along with keys of other types, only the text index field determines whether the index references a document. The other keys do not determine whether the index references the documents or not.

Restrictions

Text Search and Hints You cannot use hint() if the query includes a $text query expression.

Compound Index A compound index (page 440) can include a text index key in combination with ascending/descending index keys. However, these compound indexes have the following restrictions:

A compound text index cannot include any other special index types, such as multikey (page 442) or geospatial (page 446) index fields.

If the compound text index includes keys preceding the text index key, to perform a $text search, the query predicate must include equality match conditions on the preceding keys.

See Limit the Number of Entries Scanned (page 491).
Drop a Text Index

To drop a text index, pass the name of the index to the db.collection.dropIndex() method. To get the name of the index, run the getIndexes() method.

For information on the default naming scheme for text indexes as well as overriding the default name, see Specify Name for text Index (page 489).

Storage Requirements and Performance Costs

text indexes have the following storage requirements and performance costs:

• text indexes change the space allocation method for all future record allocations in a collection to usePowerOf2Sizes.
• text indexes can be large. They contain one index entry for each unique post-stemmed word in each indexed field for each document inserted.
• Building a text index is very similar to building a large multi-key index and will take longer than building a simple ordered (scalar) index on the same data.
• When building a large text index on an existing collection, ensure that you have a sufficiently high limit on open file descriptors. See the recommended settings (page 266).
• text indexes will impact insertion throughput because MongoDB must add an index entry for each unique post-stemmed word in each indexed field of each new source document.
• Additionally, text indexes do not store phrases or information about the proximity of words in the documents. As a result, phrase queries will run much more effectively when the entire collection fits in RAM.

Text Search

Text search supports the search of string content in documents of a collection. MongoDB provides the $text operator to perform text search in queries and in aggregation pipelines (page 491).

The text search process:

• tokenizes and stems the search term(s) during both the index creation and the text command execution.
• assigns a score to each document that contains the search term in the indexed fields. The score determines the relevance of a document to a given search query.
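The tokenize–stem–score process above can be sketched naively. The stop-word list and the one-rule stemmer here are stand-ins for MongoDB's real language-specific lists and stemmer, and the scoring is a crude match count rather than MongoDB's relevance formula.

```javascript
// Naive sketch of the text-search pipeline described above:
// tokenize, drop stop words, apply suffix stemming, then score a
// document by how many query terms it contains. Illustrative only.
const STOP_WORDS = new Set(["the", "an", "a", "and"]);
const stem = w => w.replace(/ies$/, "y").replace(/s$/, "");  // crude suffix stemming

function tokenize(text) {
  return text.toLowerCase()
    .split(/\W+/)
    .filter(w => w && !STOP_WORDS.has(w))
    .map(stem);
}

function score(docText, queryText) {
  const docTerms = new Set(tokenize(docText));
  return tokenize(queryText).filter(t => docTerms.has(t)).length;
}

score("the blueberry and the blueberries", "blueberries");  // 1 (stems match)
score("the blueberry", "blue");                             // 0 (no partial-word match)
```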
The $text operator can search for words and phrases. The query matches on the complete stemmed words. For example, if a document field contains the word blueberry, a search on the term blue will not match the document. However, a search on either blueberry or blueberries will match. For information and examples on various text search patterns, see the $text query operator. For examples of text search in aggregation pipeline, see Text Search in the Aggregation Pipeline (page 491). Hashed Index New in version 2.4. Hashed indexes maintain entries with hashes of the values of the indexed field. The hashing function collapses sub-documents and computes the hash for the entire value but does not support multi-key (i.e. arrays) indexes. Hashed indexes support sharding (page 607) a collection using a hashed shard key (page 621). Using a hashed shard key to shard a collection ensures a more even distribution of data. See Shard a Collection Using a Hashed Shard Key (page 641) for more details. MongoDB can use the hashed index to support equality queries, but hashed indexes do not support range queries. You may not create compound indexes that have hashed index fields or specify a unique constraint on a hashed index; however, you can create both a hashed index and an ascending/descending (i.e. non-hashed) index on the same field: MongoDB will use the scalar index for range queries. 8.2. Index Concepts 455
Warning: MongoDB hashed indexes truncate floating point numbers to 64-bit integers before hashing. For example, a hashed index would store the same value for a field that held a value of 2.3, 2.2, and 2.9. To prevent collisions, do not use a hashed index for floating point numbers that cannot be reliably converted to 64-bit integers (and then back to floating point). MongoDB hashed indexes do not support floating point values larger than 2^53.
Create a hashed index using an operation that resembles the following: db.active.ensureIndex( { a: "hashed" } ) This operation creates a hashed index for the active collection on the a field.
8.2.2 Index Properties In addition to the numerous index types (page 437) MongoDB supports, indexes can also have various properties. The following documents detail the index properties that you can select when building an index. TTL Indexes (page 456) The TTL index is used for TTL collections, which expire data after a period of time. Unique Indexes (page 457) A unique index causes MongoDB to reject all documents that contain a duplicate value for the indexed field. Sparse Indexes (page 457) A sparse index does not index documents that do not have the indexed field.
TTL Indexes TTL indexes are special indexes that MongoDB can use to automatically remove documents from a collection after a certain amount of time. This is ideal for some types of information like machine-generated event data, logs, and session information that only need to persist in a database for a limited amount of time.
Considerations TTL indexes have the following limitations: • Compound indexes (page 440) are not supported. • The indexed field must be a date type. • If the field holds an array, and there are multiple date-typed values in the index, the document will expire when the lowest (i.e. earliest) value matches the expiration threshold. The TTL index does not guarantee that expired data will be deleted immediately.
There may be a delay between the time a document expires and the time that MongoDB removes the document from the database. The background task that removes expired documents runs every 60 seconds. As a result, documents may remain in a collection after they expire but before the background task runs or completes. The duration of the removal operation depends on the workload of your mongod instance. Therefore, expired data may exist for some time beyond the 60 second period between runs of the background task. In all other respects, TTL indexes are normal indexes, and if appropriate, MongoDB can use these indexes to fulfill arbitrary queries. Additional Information Expire Data from Collections by Setting TTL (page 198) 456 Chapter 8. Indexes
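The selection logic of the TTL background task described above can be sketched as follows. This is an illustrative simulation, not MongoDB internals; the function and field names are hypothetical. Note how a document whose indexed field holds an array expires based on the earliest date in the array.

```python
from datetime import datetime, timedelta

def expired_docs(docs, field, expire_after_seconds, now):
    # Sketch of the TTL background task's selection step: a document is
    # eligible for removal once its date field is older than the threshold.
    cutoff = now - timedelta(seconds=expire_after_seconds)
    out = []
    for d in docs:
        value = d.get(field)
        if value is None:
            continue  # documents without the indexed date field never expire
        # If the field holds an array of dates, the lowest (earliest) controls expiry.
        earliest = min(value) if isinstance(value, list) else value
        if earliest <= cutoff:
            out.append(d)
    return out

now = datetime(2014, 9, 16, 12, 0, 0)
docs = [
    {"_id": 1, "createdAt": now - timedelta(seconds=7200)},
    {"_id": 2, "createdAt": now - timedelta(seconds=10)},
    {"_id": 3, "createdAt": [now - timedelta(seconds=7200), now]},
]
# With expireAfterSeconds = 3600, documents 1 and 3 are past the threshold.
assert [d["_id"] for d in expired_docs(docs, "createdAt", 3600, now)] == [1, 3]
```

In the real system this selection runs every 60 seconds, which is why expired documents can linger briefly, as described above.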
Unique Indexes A unique index causes MongoDB to reject all documents that contain a duplicate value for the indexed field. To create a unique index, use the db.collection.ensureIndex() method with the unique option set to true. For example, to create a unique index on the user_id field of the members collection, use the following operation in the mongo shell: db.members.ensureIndex( { "user_id": 1 }, { unique: true } ) By default, unique is false on MongoDB indexes. If you use the unique constraint on a compound index (page 440), then MongoDB will enforce uniqueness on the combination of values rather than the individual value for any or all values of the key.
Behavior Unique Constraint Across Separate Documents The unique constraint applies to separate documents in the collection. That is, the unique index prevents separate documents from having the same value for the indexed key, but the index does not prevent a document from having multiple elements or embedded documents in an indexed array from having the same value. In the case of a single document with repeating values, the repeated value is inserted into the index only once. For example, a collection has a unique index on a.b: db.collection.ensureIndex( { "a.b": 1 }, { unique: true } ) The unique index permits the insertion of the following document into the collection if no other document in the collection has the a.b value of 5: db.collection.insert( { a: [ { b: 5 }, { b: 5 } ] } )
Unique Index and Missing Field If a document does not have a value for the indexed field in a unique index, the index will store a null value for this document. Because of the unique constraint, MongoDB will only permit one document that lacks the indexed field. If there is more than one document without a value for the indexed field, the index build will fail with a duplicate key error.
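The missing-field behavior described above can be sketched in Python. This is a simulation of the constraint check, not MongoDB internals; treating an absent field as None mirrors how a unique index stores a single null entry for documents that lack the indexed field.

```python
def unique_index_insert(index, doc, field):
    # Sketch of a unique index's insert-time check. A missing field is
    # indexed as None (null), so only ONE document may omit the field.
    key = doc.get(field)  # None when the field is absent
    if key in index:
        raise ValueError("duplicate key error")
    index.add(key)

idx = set()
unique_index_insert(idx, {"user_id": 1}, "user_id")
unique_index_insert(idx, {"name": "a"}, "user_id")      # indexed as null
try:
    unique_index_insert(idx, {"name": "b"}, "user_id")  # a second null: rejected
    rejected = False
except ValueError:
    rejected = True
assert rejected
```

This is exactly the situation the next paragraph addresses: combining unique with sparse removes the null entries from the index and avoids the error.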
You can combine the unique constraint with the sparse index (page 457) to filter these null values from the unique index and avoid the error. Restrictions You may not specify a unique constraint on a hashed index (page 455). See also: Create a Unique Index (page 467) Sparse Indexes Sparse indexes only contain entries for documents that have the indexed field, even if the index field contains a null value. The index skips over any document that is missing the indexed field. The index is “sparse” because it does not include all documents of a collection. By contrast, non-sparse indexes contain all documents in a collection, storing null values for those documents that do not contain the indexed field. To create a sparse index, use the db.collection.ensureIndex() method with the sparse option set to true. For example, the following operation in the mongo shell creates a sparse index on the xmpp_id field of the addresses collection: 8.2. Index Concepts 457
db.addresses.ensureIndex( { "xmpp_id": 1 }, { sparse: true } ) Note: Do not confuse sparse indexes in MongoDB with block-level7 indexes in other databases. Think of them as dense indexes with a specific filter.
Behavior sparse Index and Incomplete Results Changed in version 2.6. If a sparse index would result in an incomplete result set for queries and sort operations, MongoDB will not use that index unless a hint() explicitly specifies the index. For example, the query { x: { $exists: false } } will not use a sparse index on the x field unless explicitly hinted. See Sparse Index On A Collection Cannot Return Complete Results (page 459) for an example that details the behavior.
Indexes that are sparse by Default 2dsphere (version 2) (page 447), 2d (page 451), geoHaystack (page 452), and text (page 454) indexes are always sparse.
sparse Compound Indexes Sparse compound indexes (page 440) that only contain ascending/descending index keys will index a document as long as the document contains at least one of the keys. For sparse compound indexes that contain a geospatial key (i.e. 2dsphere (page 447), 2d (page 451), or geoHaystack (page 452) index keys) along with ascending/descending index key(s), only the existence of the geospatial field(s) in a document determines whether the index references the document. For sparse compound indexes that contain text (page 454) index keys along with ascending/descending index keys, only the existence of the text index field(s) determines whether the index references a document.
sparse and unique Properties An index that is both sparse and unique (page 457) prevents a collection from having documents with duplicate values for a field but allows multiple documents that omit the key.
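The sparse-plus-unique combination described above can be sketched in Python. This is an illustrative simulation, not MongoDB internals: because a sparse index simply skips documents that lack the field, any number of documents may omit the key, while indexed values must still be unique.

```python
def sparse_unique_insert(index, doc, field):
    # Sparse: documents without the field are not indexed at all,
    # so any number of them may omit the key without conflict.
    if field not in doc:
        return
    key = doc[field]
    # Unique: indexed values must not repeat across documents.
    if key in index:
        raise ValueError("duplicate key error")
    index.add(key)

idx = set()
sparse_unique_insert(idx, {"userid": "AAAAAAA", "score": 43}, "score")
sparse_unique_insert(idx, {"userid": "CCCCCCC"}, "score")  # not indexed
sparse_unique_insert(idx, {"userid": "DDDDDDD"}, "score")  # also fine
try:
    sparse_unique_insert(idx, {"userid": "EEEEEEE", "score": 43}, "score")
    dup_rejected = False
except ValueError:
    dup_rejected = True
assert dup_rejected and idx == {43}
```

The worked scores examples that follow show the same behavior using actual mongo shell operations.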
Examples Create a Sparse Index On A Collection Consider a collection scores that contains the following documents: { "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" } { "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 } { "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 } The collection has a sparse index on the field score: db.scores.ensureIndex( { score: 1 } , { sparse: true } ) Then, the following query on the scores collection uses the sparse index to return the documents that have the score field less than ($lt) 90: db.scores.find( { score: { $lt: 90 } } ) Because the document for the userid "newbie" does not contain the score field and thus does not meet the query criteria, the query can use the sparse index to return the results: 7http://en.wikipedia.org/wiki/Database_index#Sparse_index 458 Chapter 8. Indexes
  • 463. MongoDB Documentation, Release 2.6.4 { "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 } Sparse Index On A Collection Cannot Return Complete Results Consider a collection scores that contains the following documents: { "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" } { "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 } { "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 } The collection has a sparse index on the field score: db.scores.ensureIndex( { score: 1 } , { sparse: true } ) Because the document for the userid "newbie" does not contain the score field, the sparse index does not contain an entry for that document. Consider the following query to return all documents in the scores collection, sorted by the score field: db.scores.find().sort( { score: -1 } ) Even though the sort is by the indexed field, MongoDB will not select the sparse index to fulfill the query in order to return complete results: { "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 } { "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 } { "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" } To use the sparse index, explicitly specify the index with hint(): db.scores.find().sort( { score: -1 } ).hint( { score: 1 } ) The use of the index results in the return of only those documents with the score field: { "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 } { "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 } See also: explain() and Analyze Query Performance (page 97) Sparse Index with Unique Constraint Consider a collection scores that contains the following documents: { "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" } { "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 } { "_id" : 
ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 } You could create an index with a unique constraint (page 457) and sparse filter on the score field using the following operation: db.scores.ensureIndex( { score: 1 } , { sparse: true, unique: true } ) This index would permit the insertion of documents that had unique values for the score field or did not include a score field. Consider the following insert operation (page 84): db.scores.insert( { "userid": "AAAAAAA", "score": 43 } ) db.scores.insert( { "userid": "BBBBBBB", "score": 34 } ) db.scores.insert( { "userid": "CCCCCCC" } ) db.scores.insert( { "userid": "DDDDDDD" } ) 8.2. Index Concepts 459
However, the index would not permit the addition of the following documents since documents already exist with score values of 82 and 90: db.scores.insert( { "userid": "AAAAAAA", "score": 82 } ) db.scores.insert( { "userid": "BBBBBBB", "score": 90 } )
8.2.3 Index Creation MongoDB provides several options that only affect the creation of the index. Specify these options in a document as the second argument to the db.collection.ensureIndex() method. This section describes the uses of these creation options and their behavior. Related Some options that you can specify to ensureIndex() control the properties of the index (page 456), which are not index creation options. For example, the unique (page 457) option affects the behavior of the index after creation. For a detailed description of MongoDB’s index types, see Index Types (page 437) and Index Properties (page 456) for related documentation.
Background Construction By default, creating an index blocks all other operations on a database. When building an index on a collection, the database that holds the collection is unavailable for read or write operations until the index build completes. Any operation that requires a read or write lock on all databases (e.g. listDatabases) will wait for the foreground index build to complete. For potentially long-running index building operations, consider the background operation so that the MongoDB database remains available during the index building operation. For example, to create an index in the background of the zipcode field of the people collection, issue the following: db.people.ensureIndex( { zipcode: 1 }, { background: true } ) By default, background is false for building MongoDB indexes.
You can combine the background option with other options, as in the following: db.people.ensureIndex( { zipcode: 1 }, { background: true, sparse: true } )
Behavior As of MongoDB version 2.4, a mongod instance can build more than one index in the background concurrently. Changed in version 2.4: Before 2.4, a mongod instance could only build one background index per database at a time. Changed in version 2.2: Before 2.2, a single mongod instance could only build one index at a time. Background indexing operations run in the background so that other database operations can run while creating the index. However, the mongo shell session or connection where you are creating the index will block until the index build is complete. To continue issuing commands to the database, open another connection or mongo instance. Queries will not use partially-built indexes: the index will only be usable once the index build is complete. Note: If MongoDB is building an index in the background, you cannot perform other administrative operations involving that collection, including running repairDatabase, dropping the collection (i.e.
db.collection.drop()), and running compact. These operations will return an error during background index builds.
Performance The background index operation uses an incremental approach that is slower than the normal “foreground” index builds. If the index is larger than the available RAM, then the incremental process can take much longer than the foreground build. If your application includes ensureIndex() operations, and an index doesn’t exist for other operational concerns, building the index can have a severe impact on the performance of the database. To avoid performance issues, make sure that your application checks for the indexes at start up using the getIndexes() method or the equivalent method for your driver8 and terminates if the proper indexes do not exist. Always build indexes in production instances using separate application code, during designated maintenance windows.
Building Indexes on Secondaries Changed in version 2.6: Secondary members can now build indexes in the background. Previously all index builds on secondaries were in the foreground. Background index operations on replica set secondaries begin after the primary completes building the index. If MongoDB builds an index in the background on the primary, the secondaries will then build that index in the background. To build large indexes on secondaries the best approach is to restart one secondary at a time in standalone mode and build the index. After building the index, restart it as a member of the replica set, allow it to catch up with the other members of the set, and then build the index on the next secondary. When all the secondaries have the new index, step down the primary, restart it as a standalone, and build the index on the former primary. The amount of time required to build the index on a secondary must be within the window of the oplog, so that the secondary can catch up with the primary.
Indexes on secondary members in “recovering” mode are always built in the foreground to allow them to catch up as soon as possible. See Build Indexes on Replica Sets (page 469) for a complete procedure for building indexes on secondaries.
Drop Duplicates MongoDB cannot create a unique index (page 457) on a field that has duplicate values. To force the creation of a unique index, you can specify the dropDups option, which will only index the first occurrence of a value for the key, and delete all subsequent values. Important: As in all unique indexes, if a document does not have the indexed field, MongoDB will include it in the index with a “null” value. If subsequent documents do not have the indexed field, and you have set {dropDups: true}, MongoDB will remove these documents from the collection when creating the index. If you combine dropDups with the sparse (page 457) option, this index will only include documents in the index that have the value, and the documents without the field will remain in the database. 8http://api.mongodb.org/
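The dropDups selection rule described above can be sketched in Python. This is an illustrative simulation, not MongoDB internals; note in particular how all but one of the documents that lack the indexed field are dropped, because they all index as the same null value.

```python
def build_unique_index_drop_dups(docs, field):
    # dropDups keeps the first document seen for each indexed value and
    # drops every later document with the same value -- including all but
    # one of the documents that lack the field entirely (they index as null).
    seen, kept = set(), []
    for d in docs:
        key = d.get(field)  # None when the field is missing
        if key in seen:
            continue  # this document would be DELETED from the collection
        seen.add(key)
        kept.append(d)
    return kept

docs = [
    {"username": "ada"},
    {"username": "ada", "extra": 1},  # duplicate value: dropped
    {"other": True},                  # no username: kept, indexed as null
    {"other2": True},                 # second null: dropped
]
kept = build_unique_index_drop_dups(docs, "username")
assert kept == [{"username": "ada"}, {"other": True}]
```

With the sparse option added, the two documents lacking username would both survive, since neither would be indexed at all.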
To create a unique index that drops duplicates on the username field of the accounts collection, use a command in the following form: db.accounts.ensureIndex( { username: 1 }, { unique: true, dropDups: true } ) Warning: Specifying { dropDups: true } will delete data from your database. Use with extreme caution. By default, dropDups is false.
Index Names The default name for an index is the concatenation of the indexed keys and each key’s direction in the index, 1 or -1. Example Issue the following command to create an index on item and quantity: db.products.ensureIndex( { item: 1, quantity: -1 } ) The resulting index is named: item_1_quantity_-1. Optionally, you can specify a name for an index instead of using the default name. Example Issue the following command to create an index on item and quantity and specify inventory as the index name: db.products.ensureIndex( { item: 1, quantity: -1 } , { name: "inventory" } ) The resulting index has the name inventory. To view the name of an index, use the getIndexes() method.
8.2.4 Index Intersection New in version 2.6. MongoDB can use the intersection of multiple indexes to fulfill queries. 9 In general, each index intersection involves two indexes; however, MongoDB can employ multiple/nested index intersections to resolve a query. To illustrate index intersection, consider a collection orders that has the following indexes: { qty: 1 } { item: 1 } MongoDB can use the intersection of the two indexes to support the following query: db.orders.find( { item: "abc123", qty: { $gt: 15 } } ) For query plans that use index intersection, explain() returns the value Complex Plan in the cursor field. 9 In previous versions, MongoDB could use only a single index to fulfill most queries. The exception to this is queries with $or clauses, which could use a single index for each $or clause.
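The default index-naming rule described earlier (keys concatenated with their directions, e.g. item_1_quantity_-1) can be sketched in Python. The function name here is illustrative, not a MongoDB API.

```python
def default_index_name(keys):
    # keys: ordered list of (field, direction) pairs, where direction
    # is 1 (ascending) or -1 (descending), mirroring the index spec.
    return "_".join(f"{field}_{direction}" for field, direction in keys)

# db.products.ensureIndex( { item: 1, quantity: -1 } )
assert default_index_name([("item", 1), ("quantity", -1)]) == "item_1_quantity_-1"
```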
  • 467. MongoDB Documentation, Release 2.6.4 Index Prefix Intersection With index intersection, MongoDB can use an intersection of either the entire index or the index prefix. An index prefix is a subset of a compound index, consisting of one or more keys starting from the beginning of the index. Consider a collection orders with the following indexes: { qty: 1 } { status: 1, ord_date: -1 } To fulfill the following query which specifies a condition on both the qty field and the status field, MongoDB can use the intersection of the two indexes: db.orders.find( { qty: { $gt: 10 } , status: "A" } ) Index Intersection and Compound Indexes Index intersection does not eliminate the need for creating compound indexes (page 440). However, because both the list order (i.e. the order in which the keys are listed in the index) and the sort order (i.e. ascending or descending), matter in compound indexes (page 440), a compound index may not support a query condition that does not include the index prefix keys (page 442) or that specifies a different sort order. For example, if a collection orders has the following compound index, with the status field listed before the ord_date field: { status: 1, ord_date: -1 } The compound index can support the following queries: db.orders.find( { status: { $in: ["A", "P" ] } } ) db.orders.find( { ord_date: { $gt: new Date("2014-02-01") }, status: {$in:[ "P", "A" ] } } ) But not the following two queries: db.orders.find( { ord_date: { $gt: new Date("2014-02-01") } } ) db.orders.find( { } ).sort( { ord_date: 1 } ) However, if the collection has two separate indexes: { status: 1 } { ord_date: -1 } The two indexes can, either individually or through index intersection, support all four aforementioned queries. The choice between creating compound indexes that support your queries or relying on index intersection depends on the specifics of your system. 
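The prefix rule and the compound-index-versus-intersection trade-off above can be sketched in Python. This is a deliberately simplified model (it ignores sort direction and range-versus-equality distinctions, and both function names are illustrative), but it captures why { status: 1, ord_date: -1 } cannot serve a query on ord_date alone while two separate indexes can.

```python
def prefix_supports(index_keys, query_fields):
    # A compound index can serve a query only if the queried fields form
    # a prefix of the index's key list (simplified: ignores sort order).
    fields = [f for f, _ in index_keys]
    return any(set(query_fields) == set(fields[:i])
               for i in range(1, len(fields) + 1))

def intersection_supports(indexes, query_fields):
    # Index intersection (sketch): the query is answerable if every
    # queried field is the leading key of some available index.
    leading = {idx[0][0] for idx in indexes}
    return all(f in leading for f in query_fields)

compound = [("status", 1), ("ord_date", -1)]
assert prefix_supports(compound, {"status"})          # prefix: supported
assert not prefix_supports(compound, {"ord_date"})    # not a prefix

separate = [[("qty", 1)], [("status", 1), ("ord_date", -1)]]
assert intersection_supports(separate, {"qty", "status"})
```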
See also: compound indexes (page 440), Create Compound Indexes to Support Several Different Queries (page 494) 8.2. Index Concepts 463
  • 468. MongoDB Documentation, Release 2.6.4 Index Intersection and Sort Index intersection does not apply when the sort() operation requires an index completely separate from the query predicate. For example, the orders collection has the following indexes: { qty: 1 } { status: 1, ord_date: -1 } { status: 1 } { ord_date: -1 } MongoDB cannot use index intersection for the following query with sort: db.orders.find( { qty: { $gt: 10 } } ).sort( { status: 1 } ) That is, MongoDB does not use the { qty: 1 } index for the query, and the separate { status: 1 } or the { status: 1, ord_date: -1 } index for the sort. However, MongoDB can use index intersection for the following query with sort since the index { status: 1, ord_date: -1 } can fulfill part of the query predicate. db.orders.find( { qty: { $gt: 10 } , status: "A" } ).sort( { ord_date: -1 } ) 8.3 Indexing Tutorials Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the documents in a collection. The documents in this section outline specific tasks related to building and maintaining indexes for data in MongoDB collections and discusses strategies and practical approaches. For a conceptual overview of MongoDB indexing, see the Index Concepts (page 436) document. Index Creation Tutorials (page 464) Create and configure different types of indexes for different purposes. Index Management Tutorials (page 472) Monitor and assess index performance and rebuild indexes as needed. Geospatial Index Tutorials (page 476) Create indexes that support data stored as GeoJSON objects and legacy coor-dinate pairs. Text Search Tutorials (page 486) Build and configure indexes that support full-text searches. 
Indexing Strategies (page 493) The factors that affect index performance and practical approaches to indexing in MongoDB 8.3.1 Index Creation Tutorials Instructions for creating and configuring indexes in MongoDB and building indexes on replica sets and sharded clus-ters. Create an Index (page 465) Build an index for any field on a collection. Create a Compound Index (page 466) Build an index of multiple fields on a collection. Create a Unique Index (page 467) Build an index that enforces unique values for the indexed field or fields. Create a Sparse Index (page 467) Build an index that omits references to documents that do not include the indexed field. This saves space when indexing fields that are present in only some documents. 464 Chapter 8. Indexes
  • 469. MongoDB Documentation, Release 2.6.4 Create a Hashed Index (page 468) Compute a hash of the value of a field in a collection and index the hashed value. These indexes permit equality queries and may be suitable shard keys for some collections. Build Indexes on Replica Sets (page 469) To build indexes on a replica set, you build the indexes separately on the primary and the secondaries, as described here. Build Indexes in the Background (page 470) Background index construction allows read and write operations to continue while building the index, but take longer to complete and result in a larger index. Build Old Style Indexes (page 471) A {v : 0} index is necessary if you need to roll back from MongoDB version 2.0 (or later) to MongoDB version 1.8. Create an Index Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the documents in a collection. Users can create indexes for any collection on any field in a document. By default, MongoDB creates an index on the _id field of every collection. This tutorial describes how to create an index on a single field. MongoDB also supports compound indexes (page 440), which are indexes on multiple fields. See Create a Compound Index (page 466) for instructions on building compound indexes. Create an Index on a Single Field To create an index, use ensureIndex() or a similar method from your driver10. The ensureIndex() method only creates an index if an index of the same specification does not already exist. For example, the following operation creates an index on the userid field of the records collection: db.records.ensureIndex( { userid: 1 } ) The value of the field in the index specification describes the kind of index for that field. For example, a value of 1 specifies an index that orders items in ascending order. A value of -1 specifies an index that orders items in descending order. For additional index types, see Index Types (page 437). 
The created index will support queries that select on the field userid, such as the following: db.records.find( { userid: 2 } ) db.records.find( { userid: { $gt: 10 } } ) But the created index does not support the following query on the profile_url field: db.records.find( { profile_url: 2 } ) For queries that cannot use an index, MongoDB must scan all documents in a collection for documents that match the query. Additional Considerations Although indexes can improve query performances, indexes also present some operational considerations. See Oper-ational Considerations for Indexes (page 137) for more information. If your collection holds a large amount of data, and your application needs to be able to access the data while building the index, consider building the index in the background, as described in Background Construction (page 460). To build indexes on replica sets, see the Build Indexes on Replica Sets (page 469) section for more information. 10http://api.mongodb.org/ 8.3. Indexing Tutorials 465
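To see why the ascending index on userid supports both equality and range queries, a single-field index can be modeled as a sorted list of (value, document id) pairs searched by binary search. This is a conceptual sketch, not MongoDB's B-tree implementation; find_gt is an illustrative name.

```python
import bisect

# Model of db.records.ensureIndex( { userid: 1 } ): a sorted list of
# (indexed value, document id) entries.
docs = {1: {"userid": 2}, 2: {"userid": 7}, 3: {"userid": 11}}
index = sorted((d["userid"], _id) for _id, d in docs.items())

def find_gt(index, value):
    # Range scan for { userid: { $gt: value } }: binary-search to the first
    # entry greater than `value`, then walk the tail of the sorted entries.
    pos = bisect.bisect_right(index, (value, float("inf")))
    return [doc_id for _, doc_id in index[pos:]]

assert find_gt(index, 10) == [3]        # like db.records.find({ userid: { $gt: 10 } })
assert find_gt(index, 1) == [1, 2, 3]
```

A query on profile_url cannot use this structure at all, which is why MongoDB falls back to scanning every document.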
Note: To build or rebuild indexes for a replica set see Build Indexes on Replica Sets (page 469). Some drivers may specify indexes using NumberLong(1) rather than 1 as the specification. This does not have any effect on the resulting index. See also: Create a Compound Index (page 466), Indexing Tutorials (page 464) and Index Concepts (page 436) for more information.
Create a Compound Index Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the documents in a collection. MongoDB supports indexes that include content on a single field, as well as compound indexes (page 440) that include content from multiple fields. Continue reading for instructions and examples of building a compound index.
Build a Compound Index To create a compound index (page 440) use an operation that resembles the following prototype: db.collection.ensureIndex( { a: 1, b: 1, c: 1 } ) The value of the field in the index specification describes the kind of index for that field. For example, a value of 1 specifies an index that orders items in ascending order. A value of -1 specifies an index that orders items in descending order. For additional index types, see Index Types (page 437). Example The following operation will create an index on the item, category, and price fields of the products collection: db.products.ensureIndex( { item: 1, category: 1, price: 1 } )
Additional Considerations If your collection holds a large amount of data, and your application needs to be able to access the data while building the index, consider building the index in the background, as described in Background Construction (page 460). To build indexes on replica sets, see the Build Indexes on Replica Sets (page 469) section for more information. Note: To build or rebuild indexes for a replica set see Build Indexes on Replica Sets (page 469).
Some drivers may specify indexes using NumberLong(1) rather than 1 as the specification. This does not have any effect on the resulting index. See also: Create an Index (page 465), Indexing Tutorials (page 464) and Index Concepts (page 436) for more information.
Create a Unique Index MongoDB allows you to specify a unique constraint (page 457) on an index. These constraints prevent applications from inserting documents that have duplicate values for the inserted fields. Additionally, if you want to create an index on a collection that has existing data that might have duplicate values for the indexed field, you may choose to combine unique enforcement with duplicate dropping (page 461).
Unique Indexes To create a unique index (page 457), consider the following prototype: db.collection.ensureIndex( { a: 1 }, { unique: true } ) For example, you may want to create a unique index on the "tax-id" field of the accounts collection to prevent storing multiple account records for the same legal entity: db.accounts.ensureIndex( { "tax-id": 1 }, { unique: true } ) The _id index (page 439) is a unique index. In some situations you may consider using the _id field itself for this kind of data rather than using a unique index on another field. In many situations you will want to combine the unique constraint with the sparse option. When MongoDB indexes a field, if a document does not have a value for a field, the index entry for that item will be null. Since unique indexes cannot have duplicate values for a field, without the sparse option, MongoDB will reject the second document and all subsequent documents without the indexed field. Consider the following prototype. db.collection.ensureIndex( { a: 1 }, { unique: true, sparse: true } ) You can also enforce a unique constraint on compound indexes (page 440), as in the following prototype: db.collection.ensureIndex( { a: 1, b: 1 }, { unique: true } ) These indexes enforce uniqueness for the combination of index keys and not for either key individually.
Drop Duplicates To force the creation of a unique index (page 457) on a collection with duplicate values in the field you are indexing you can use the dropDups option.
This will force MongoDB to create a unique index by deleting documents with duplicate values when building the index. Consider the following prototype invocation of ensureIndex(): db.collection.ensureIndex( { a: 1 }, { unique: true, dropDups: true } ) See the full documentation of duplicate dropping (page 461) for more information. Warning: Specifying { dropDups: true } may delete data from your database. Use with extreme caution. Refer to the ensureIndex() documentation for additional index creation options.
Create a Sparse Index Sparse indexes are like non-sparse indexes, except that they omit references to documents that do not include the indexed field. For fields that are only present in some documents, sparse indexes may provide a significant space savings. See Sparse Indexes (page 457) for more information about sparse indexes and their use.
See also: Index Concepts (page 436) and Indexing Tutorials (page 464) for more information.
Prototype To create a sparse index (page 457) on a field, use an operation that resembles the following prototype: db.collection.ensureIndex( { a: 1 }, { sparse: true } ) Example The following operation creates a sparse index on the users collection that only includes a document in the index if the twitter_name field exists in a document. db.users.ensureIndex( { twitter_name: 1 }, { sparse: true } ) The index excludes all documents that do not include the twitter_name field. Considerations Note: Sparse indexes can affect the results returned by the query, particularly with respect to sorts on fields not included in the index. See the sparse index (page 457) section for more information.
Create a Hashed Index New in version 2.4. Hashed indexes (page 455) compute a hash of the value of a field in a collection and index the hashed value. These indexes permit equality queries and may be suitable shard keys for some collections. Tip MongoDB automatically computes the hashes when resolving queries using hashed indexes. Applications do not need to compute hashes. See Hashed Shard Keys (page 621) for more information about hashed indexes in sharded clusters, as well as Index Concepts (page 436) and Indexing Tutorials (page 464) for more information about indexes.
Procedure To create a hashed index (page 455), specify hashed as the value of the index key, as in the following example: Example Specify a hashed index on _id db.collection.ensureIndex( { _id: "hashed" } )
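The reason hashed values make suitable shard keys can be sketched in Python. This is not MongoDB's hashing scheme (MongoDB range-partitions a 64-bit hash space; the MD5-based bucketing below is purely illustrative), but it shows why hashing spreads monotonically increasing keys, such as ObjectIds, evenly across shards.

```python
import hashlib

def hashed_shard(key, n_shards):
    # Illustrative only: derive a 64-bit hash of the key and bucket by
    # modulo. Real hashed sharding partitions the hash range into chunks.
    h = int.from_bytes(hashlib.md5(str(key).encode()).digest()[:8], "big")
    return h % n_shards

# Sequential keys would pile onto one shard under range sharding,
# but hashing distributes them roughly evenly.
buckets = [0, 0, 0]
for i in range(3000):
    buckets[hashed_shard(i, 3)] += 1
assert max(buckets) - min(buckets) < 300  # roughly even split
```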
Considerations

MongoDB supports hashed indexes of any single field. The hashing function collapses sub-documents and computes the hash for the entire value, but does not support multi-key (i.e. array) indexes. You may not create compound indexes that have hashed index fields.

Build Indexes on Replica Sets

For replica sets, secondaries will begin building indexes after the primary finishes building the index. In sharded clusters, the mongos will send ensureIndex() to the primary members of the replica set for each shard, which then replicate to the secondaries after the primary finishes building the index.

To minimize the impact of building an index on your replica set, use the following procedure to build indexes. See Indexing Tutorials (page 464) and Index Concepts (page 436) for more information.

Considerations

• Ensure that your oplog is large enough to permit the indexing or re-indexing operation to complete without falling so far behind that it cannot catch up. See the oplog sizing (page 535) documentation for additional information.

• This procedure takes one member out of the replica set at a time, so it affects only one member of the set at a time rather than all secondaries at once.

• Do not use this procedure when building a unique index (page 457) with the dropDups option.

• Before version 2.6, background index creation operations (page 460) became foreground indexing operations on secondary members of replica sets. After 2.6, background index builds replicate as background index builds on the secondaries.

Procedure

Note: If you need to build an index in a sharded cluster, repeat the following procedure for each replica set that provides each shard.

Stop One Secondary

Stop the mongod process on one secondary. Restart the mongod process without the --replSet option and running on a different port. This instance is now in "standalone" mode.
For example, if your mongod normally runs on the default port of 27017 with the --replSet option, you would use the following invocation:

mongod --port 47017

Note: By running the mongod on a different port, you ensure that the other members of the replica set and all clients will not contact the member while you are building the index.
Build the Index

Create the new index using ensureIndex() in the mongo shell, or the comparable method in your driver. This operation will create or rebuild the index on this mongod instance.

For example, to create an ascending index on the username field of the records collection, use the following mongo shell operation:

db.records.ensureIndex( { username: 1 } )

See also: Create an Index (page 465) and Create a Compound Index (page 466) for more information.

Restart the Program mongod

When the index build completes, start the mongod instance with the --replSet option on its usual port:

mongod --port 27017 --replSet rs0

Modify the port number (e.g. 27017) or the replica set name (e.g. rs0) as needed. Allow replication to catch up on this member.

Build Indexes on all Secondaries

Changed in version 2.6: Secondary members can now build indexes in the background (page 470). Previously all index builds on secondaries were in the foreground.

For each secondary in the set, build an index according to the following steps:

1. Stop One Secondary (page 469)
2. Build the Index (page 470)
3. Restart the Program mongod (page 470)

Build the Index on the Primary

To build an index on the primary you can either:

1. Build the index in the background (page 470) on the primary.
2. Step down the primary using the rs.stepDown() method in the mongo shell to cause the current primary to become a secondary gracefully and allow the set to elect another member as primary. Then repeat the index building procedure, listed below, to build the index on the primary:
   (a) Stop One Secondary (page 469)
   (b) Build the Index (page 470)
   (c) Restart the Program mongod (page 470)

Building the index in the background takes longer than a foreground index build and results in a less compact index structure. Additionally, the background index build may impact write performance on the primary.
However, building the index in the background allows the set to remain available for write operations while MongoDB builds the index.

Build Indexes in the Background

By default, MongoDB builds indexes in the foreground, which prevents all read and write operations to the database while the index builds. Also, no operation that requires a read or write lock on all databases (e.g. listDatabases) can occur during a foreground index build. Background index construction (page 460) allows read and write operations to continue while building the index.

See also:
Index Concepts (page 436) and Indexing Tutorials (page 464) for more information.

Considerations

Background index builds take longer to complete and result in an index that is initially larger, or less compact, than an index built in the foreground. Over time, the compactness of indexes built in the background will approach that of foreground-built indexes. After MongoDB finishes building the index, background-built indexes are functionally identical to any other index.

Procedure

To create an index in the background, add the background argument to the ensureIndex() operation, as in the following example:

db.collection.ensureIndex( { a: 1 }, { background: true } )

Consider the section on background index construction (page 460) for more information about these indexes and their implications.

Build Old Style Indexes

Important: Use this procedure only if you must have indexes that are compatible with a version of MongoDB earlier than 2.0.

MongoDB version 2.0 introduced the {v:1} index format. MongoDB versions 2.0 and later support both the {v:1} format and the earlier {v:0} format. MongoDB versions prior to 2.0, however, support only the {v:0} format. If you need to roll back MongoDB to a version prior to 2.0, you must drop and re-create your indexes.

To build pre-2.0 indexes, use the dropIndexes() and ensureIndex() methods. You cannot simply reindex the collection. When you reindex on versions that only support {v:0} indexes, the v fields in the index definition still hold values of 1, even though the indexes would now use the {v:0} format. If you were to upgrade again to version 2.0 or later, these indexes would not work.

Example

Suppose you rolled back from MongoDB 2.0 to MongoDB 1.8, and suppose you had the following index on the items collection:

{ "v" : 1, "key" : { "name" : 1 }, "ns" : "mydb.items", "name" : "name_1" }

The v field tells you the index is a {v:1} index, which is incompatible with version 1.8.
To drop the index, issue the following command:

db.items.dropIndex( { name : 1 } )

To recreate the index as a {v:0} index, issue the following command:

db.items.ensureIndex( { name : 1 } , { v : 0 } )

See also: Index Performance Enhancements (page 794).
8.3.2 Index Management Tutorials

Instructions for managing indexes and assessing index performance and use.

Remove Indexes (page 472) Drop an index from a collection.
Modify an Index (page 472) Modify an existing index.
Rebuild Indexes (page 474) In a single operation, drop all indexes on a collection and then rebuild them.
Manage In-Progress Index Creation (page 474) Check the status of indexing progress, or terminate an ongoing index build.
Return a List of All Indexes (page 475) Obtain a list of all indexes on a collection or of all indexes on all collections in a database.
Measure Index Use (page 475) Study query operations and observe index use for your database.

Remove Indexes

To remove an index from a collection, use the dropIndex() method and the following procedure. If you simply need to rebuild indexes, you can use the process described in the Rebuild Indexes (page 474) document.

See also: Indexing Tutorials (page 464) and Index Concepts (page 436) for more information about indexes and indexing operations in MongoDB.

Remove a Specific Index

To remove an index, use the db.collection.dropIndex() method. For example, the following operation removes an ascending index on the tax-id field in the accounts collection:

db.accounts.dropIndex( { "tax-id": 1 } )

The operation returns a document with the status of the operation:

{ "nIndexesWas" : 3, "ok" : 1 }

where the value of nIndexesWas reflects the number of indexes before removing this index.

For text (page 454) indexes, pass the index name to the db.collection.dropIndex() method. See Use the Index Name to Drop a text Index (page 489) for details.

Remove All Indexes

You can also use db.collection.dropIndexes() to remove all indexes, except for the _id index (page 439), from a collection. These shell helpers provide wrappers around the dropIndexes database command. Your client library may have a different or additional interface for these operations.
Modify an Index

To modify an existing index, you need to drop and recreate the index.
Step 1: Create a unique index.

Use the ensureIndex() method to create a unique index:

db.orders.ensureIndex( { "cust_id" : 1, "ord_date" : -1, "items" : 1 }, { unique: true } )

The method returns a document with the status of the results. The method only creates an index if the index does not already exist. See Create an Index (page 465) and Index Creation Tutorials (page 464) for more information on creating indexes.

Step 2: Attempt to modify the index.

To modify an existing index, you cannot just re-issue the ensureIndex() method with the updated specification of the index. For example, the following operation attempts to remove the unique constraint from the previously created index by using the ensureIndex() method:

db.orders.ensureIndex( { "cust_id" : 1, "ord_date" : -1, "items" : 1 } )

The status document returned by the operation shows an error.

Step 3: Drop the index.

To modify the index, you must drop the index first:

db.orders.dropIndex( { "cust_id" : 1, "ord_date" : -1, "items" : 1 } )

The method returns a document with the status of the operation. Upon success, the ok field in the returned document should be 1. See Remove Indexes (page 472) for more information about dropping indexes.

Step 4: Recreate the index without the unique constraint.

Recreate the index without the unique constraint:

db.orders.ensureIndex( { "cust_id" : 1, "ord_date" : -1, "items" : 1 } )

The method returns a document with the status of the results. Upon success, the returned document should show numIndexesAfter to be greater than numIndexesBefore by one.

See also: Index Introduction (page 431), Index Concepts (page 436).
Rebuild Indexes

If you need to rebuild indexes for a collection, you can use the db.collection.reIndex() method to rebuild all indexes on a collection in a single operation. This operation drops all indexes, including the _id index (page 439), and then rebuilds all indexes.

See also: Index Concepts (page 436) and Indexing Tutorials (page 464).

Process

The operation takes the following form:

db.accounts.reIndex()

MongoDB will return the following document when the operation completes:

{
    "nIndexesWas" : 2,
    "msg" : "indexes dropped for collection",
    "nIndexes" : 2,
    "indexes" : [
        {
            "key" : { "_id" : 1 },
            "ns" : "records.accounts",
            "name" : "_id_"
        },
        {
            "key" : { "tax-id" : 1 },
            "ns" : "records.accounts",
            "name" : "tax-id_1"
        }
    ],
    "ok" : 1
}

This shell helper provides a wrapper around the reIndex database command. Your client library may have a different or additional interface for this operation.

Additional Considerations

Note: To build or rebuild indexes for a replica set, see Build Indexes on Replica Sets (page 469).

Manage In-Progress Index Creation

To see the status of indexing processes, you can use the db.currentOp() method in the mongo shell. The value of the query field and the msg field will indicate if the operation is an index build. The msg field also indicates the percent of the build that is complete.

To terminate an ongoing index build, use the db.killOp() method in the mongo shell. For more information see db.currentOp().

Changed in version 2.4: Before MongoDB 2.4, you could only terminate background index builds. After 2.4, you can terminate any index build, including foreground index builds.
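A common pattern is to scan the inprog array of the db.currentOp() result for index builds and collect their opids before passing each to db.killOp(). The filtering step can be exercised in plain JavaScript; note that the sample inprog shape below is an assumption for illustration, since the exact fields of the currentOp output vary by version.

```javascript
// Sketch: find in-progress index builds in a currentOp()-style document.
// In the mongo shell you would pass each returned opid to db.killOp().
function indexBuildOpIds(currentOp) {
  return currentOp.inprog
    .filter(function (op) { return op.msg && /index/i.test(op.msg); })
    .map(function (op) { return op.opid; });
}

// Hypothetical currentOp() output for illustration only.
var sample = {
  inprog: [
    { opid: 11, op: "query",  msg: "" },
    { opid: 12, op: "insert", msg: "Index Build Progress: 4500/10000 45%" },
    { opid: 13, op: "update", msg: "" }
  ]
};

console.log(indexBuildOpIds(sample)); // [ 12 ]
```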
Return a List of All Indexes

When performing maintenance you may want to check which indexes exist on a collection. Every index on a collection has a corresponding document in the system.indexes (page 271) collection, and you can use standard queries (i.e. find()) to list the indexes, or, in the mongo shell, the getIndexes() method to return a list of the indexes on a collection, as in the following examples.

See also: Index Concepts (page 436) and Indexing Tutorials (page 464) for more information about indexes in MongoDB and common index management operations.

List all Indexes on a Collection

To return a list of all indexes on a collection, use the db.collection.getIndexes() method or a similar method for your driver (see http://api.mongodb.org/). For example, to view all indexes on the people collection:

db.people.getIndexes()

List all Indexes for a Database

To return a list of all indexes on all collections in a database, use the following operation in the mongo shell:

db.system.indexes.find()

See system.indexes (page 271) for more information about these documents.

Measure Index Use

Synopsis

Query performance is a good general indicator of index use; however, for more precise insight into index use, MongoDB provides a number of tools that allow you to study query operations and observe index use for your database.

See also: Index Concepts (page 436) and Indexing Tutorials (page 464) for more information.

Operations

Return Query Plan with explain()

Append the explain() method to any cursor (e.g. query) to return a document with statistics about the query process, including the index used, the number of documents scanned, and the time the query takes to process in milliseconds.

Control Index Use with hint()

Append hint() to any cursor (e.g. query) with the index as the argument to force MongoDB to use a specific index to fulfill the query.
Consider the following example:

db.people.find( { name: "John Doe", zipcode: { $gt: "63000" } } ).hint( { zipcode: 1 } )
You can use hint() and explain() in conjunction with each other to compare the effectiveness of a specific index. Specify the $natural operator to the hint() method to prevent MongoDB from using any index:

db.people.find( { name: "John Doe", zipcode: { $gt: "63000" } } ).hint( { $natural: 1 } )

Instance Index Use Reporting

MongoDB provides a number of metrics of index use and operation that you may want to consider when analyzing index use for your database:

• In the output of serverStatus:
  – indexCounters
  – scanned
  – scanAndOrder
• In the output of collStats:
  – totalIndexSize
  – indexSizes
• In the output of dbStats:
  – dbStats.indexes
  – dbStats.indexSize

8.3.3 Geospatial Index Tutorials

Instructions for creating and querying 2d, 2dsphere, and haystack indexes.

Create a 2dsphere Index (page 476) A 2dsphere index supports data stored as both GeoJSON objects and as legacy coordinate pairs.
Query a 2dsphere Index (page 478) Search for locations within, near, or intersected by a GeoJSON shape, or within a circle as defined by coordinate points on a sphere.
Create a 2d Index (page 480) Create a 2d index to support queries on data stored as legacy coordinate pairs.
Query a 2d Index (page 481) Search for locations using legacy coordinate pairs.
Create a Haystack Index (page 482) A haystack index is optimized to return results over small areas. For queries that use spherical geometry, a 2dsphere index is a better option.
Query a Haystack Index (page 483) Search based on location and non-location data within a small area.
Calculate Distance Using Spherical Geometry (page 483) Convert distances to radians and back again.

Create a 2dsphere Index

To create a geospatial index for Geo