MongoDB is a NoSQL document database. Instead of storing data in rows and columns like a relational database, MongoDB stores data as documents inside collections. These documents look like JSON when you write them, but MongoDB stores them internally in a binary format called BSON, which supports richer data types. MongoDB is available as Community Edition and Enterprise Edition, and the official documentation provides installation paths for Linux, macOS, Windows, and Docker. (MongoDB)
This tutorial covers MongoDB from beginner to advanced level: what it is, why it is useful, how it works, architecture, installation, CRUD, indexing, aggregation, schema design, security, replication, sharding, backup, administration, and real workflows.
1. What Is MongoDB?
MongoDB is a document-oriented database designed for modern applications that need flexible, scalable, high-performance data storage.
In a traditional SQL database, you might store user data like this:
| id | name | email | city |
|---|---|---|---|
| 1 | Alice | alice@example.com | Tokyo |
In MongoDB, the same user is stored as a document:
{
_id: ObjectId("..."),
name: "Alice",
email: "alice@example.com",
address: {
city: "Tokyo",
country: "Japan"
},
skills: ["Linux", "Docker", "MongoDB"]
}
A MongoDB document can contain:
- Strings
- Numbers
- Dates
- Arrays
- Nested objects
- Booleans
- ObjectIds
- Binary data
- Geospatial data
MongoDB is popular because application data is often already object-like. For example, a user profile, product catalog item, order, blog post, or IoT event naturally fits as a document.
2. Advantages of MongoDB
MongoDB has several major advantages.
2.1 Flexible Schema
MongoDB does not force every document in a collection to have the exact same fields. That makes it useful when application requirements change quickly.
For example:
{
name: "Laptop",
price: 1200,
category: "Electronics"
}
Another document in the same collection can have extra fields:
{
name: "Phone",
price: 800,
category: "Electronics",
warranty: "2 years",
colors: ["black", "white"]
}
However, flexible schema does not mean “no design.” MongoDB also supports schema validation when you want to enforce data rules such as required fields, data types, or value ranges. (MongoDB)
2.2 Natural Data Model for Applications
Most modern applications use objects, JSON, APIs, and nested data. MongoDB documents map naturally to those structures.
For example, an order can store customer information and ordered items together:
{
orderId: "ORD1001",
customer: {
name: "Alice",
email: "alice@example.com"
},
items: [
{ product: "Keyboard", qty: 1, price: 50 },
{ product: "Mouse", qty: 2, price: 25 }
],
status: "paid"
}
This can reduce the need for joins in many use cases.
2.3 High Availability
MongoDB supports replication through replica sets. A replica set is a group of mongod processes that maintain the same dataset. Replica sets provide redundancy and high availability, and MongoDB’s documentation describes them as the basis for production deployments. (MongoDB)
2.4 Horizontal Scalability
MongoDB supports sharding, which distributes data across multiple machines. A sharded cluster contains shards, mongos query routers, and config servers. Each shard stores a subset of the data and must be deployed as a replica set. (MongoDB)
2.5 Powerful Query and Aggregation System
MongoDB supports filtering, sorting, indexing, text search, geospatial queries, aggregation pipelines, and multi-document transactions. Aggregation pipelines process documents through stages such as filtering, grouping, projecting, sorting, joining, and calculating values. (MongoDB)
2.6 Transactions
MongoDB supports multi-document transactions when you need all-or-nothing changes across multiple documents or collections. Transactions either commit all changes or roll them back. (MongoDB)
3. How MongoDB Works
At a high level, MongoDB works like this:
Application
|
MongoDB Driver
|
mongod or mongos
|
Storage Engine
|
Data Files + Journal
3.1 Application Layer
Your application can be written in Node.js, Python, Java, Go, PHP, C#, Ruby, or another language. It uses a MongoDB driver to connect to the database.
Example connection string:
mongodb://localhost:27017
3.2 Driver Layer
The driver converts your application objects into MongoDB-compatible BSON documents and sends commands to the server.
Example in application logic:
await db.collection("users").insertOne({
name: "Alice",
email: "alice@example.com"
});
3.3 Server Layer: mongod
mongod is the main MongoDB database server process. It handles:
- Client connections
- Reads
- Writes
- Indexes
- Replication
- Storage
- Query execution
- Authentication and authorization
3.4 Storage Layer
MongoDB writes data to disk through its storage engine. In modern MongoDB deployments, WiredTiger is the default storage engine. The storage layer manages data files, indexes, compression, concurrency, and journaling.
3.5 Query Execution
When you run a query like:
db.users.find({ email: "alice@example.com" })
MongoDB checks whether an index can help. If an appropriate index exists, MongoDB uses it to limit the number of documents scanned. Without a useful index, MongoDB may scan every document in the collection. (MongoDB)
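As a rough mental model, the difference between a collection scan and an index lookup can be sketched in plain JavaScript. This is only an illustration: a `Map` stands in for MongoDB's real index structure, and the `users` array, `collectionScan`, and `indexLookup` are hypothetical names that do not exist in MongoDB.

```javascript
// Illustration only: an in-memory array stands in for a collection.
const users = [];
for (let i = 0; i < 1000; i++) {
  users.push({ _id: i, email: `user${i}@example.com` });
}

// Collection scan: examine documents one by one until a match is found.
function collectionScan(docs, email) {
  let examined = 0;
  for (const doc of docs) {
    examined++;
    if (doc.email === email) return { doc, examined };
  }
  return { doc: null, examined };
}

// Hypothetical single-field index: email -> position in the array.
const emailIndex = new Map(users.map((doc, pos) => [doc.email, pos]));

// Index lookup: jump straight to the matching document.
function indexLookup(docs, index, email) {
  const pos = index.get(email);
  if (pos === undefined) return { doc: null, examined: 0 };
  return { doc: docs[pos], examined: 1 };
}

const scan = collectionScan(users, "user999@example.com");
const lookup = indexLookup(users, emailIndex, "user999@example.com");
console.log(scan.examined, lookup.examined); // 1000 1
```

Both calls return the same document, but the scan had to examine every document to reach it, which is the cost an index is meant to avoid.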
4. MongoDB Architecture
MongoDB can run in three main deployment architectures.
4.1 Standalone Architecture
This is the simplest architecture.
Application → mongod → Data files
Use it for:
- Learning
- Local development
- Testing
- Small experiments
Avoid standalone MongoDB for production because it has no automatic failover.
4.2 Replica Set Architecture
A replica set contains multiple MongoDB servers that hold the same data.
                 ┌──────────────┐
Application  →   │ Primary Node │
                 └──────┬───────┘
                        │ replication
           ┌────────────┴────────────┐
           ↓                         ↓
    ┌──────────────┐          ┌──────────────┐
    │  Secondary   │          │  Secondary   │
    └──────────────┘          └──────────────┘
The primary receives writes. The secondary nodes replicate data from the primary. If the primary fails, the replica set can elect a new primary.
MongoDB replica sets improve:
- Availability
- Fault tolerance
- Data redundancy
- Disaster recovery
- Read scaling in selected cases
MongoDB documentation notes that the primary records changes in the operation log, or oplog, which secondaries use to replicate changes. (MongoDB)
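Electing a new primary requires a majority of the voting members, which is why replica sets are usually sized with an odd number of members. The arithmetic can be sketched as follows; `majority` and `faultTolerance` are hypothetical helper names for illustration, not MongoDB functions.

```javascript
// Votes required to elect a primary: a strict majority of voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many voting members can be unavailable while an election can still succeed.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [3, 4, 5]) {
  console.log(n, majority(n), faultTolerance(n));
}
// 3 2 1
// 4 3 1
// 5 3 2
```

Note that going from three to four members does not improve fault tolerance; it takes a fifth member to survive two failures.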
4.3 Sharded Cluster Architecture
Sharding distributes data across multiple shards.
Application
|
mongos
|
Config Servers
|
┌─────────┬─────────┬─────────┐
│ Shard 1 │ Shard 2 │ Shard 3 │
└─────────┴─────────┴─────────┘
A sharded cluster contains:
| Component | Purpose |
|---|---|
| Shard | Stores a subset of data |
| mongos | Query router between application and shards |
| Config servers | Store cluster metadata and configuration |
| Shard key | Field used to distribute documents |
| Balancer | Moves chunks to balance data |
Use sharding when:
- One server cannot store all data
- One server cannot handle all reads/writes
- You need horizontal scaling
- You need data distribution by region or workload
5. MongoDB Components
5.1 mongod
The main database server process.
Responsibilities:
- Stores data
- Handles queries
- Maintains indexes
- Performs replication
- Applies access control
- Manages storage
5.2 mongosh
mongosh is the MongoDB Shell. You use it to connect to MongoDB, run commands, query data, create users, inspect collections, and perform administration tasks. MongoDB provides separate installation guidance for mongosh. (MongoDB)
Example:
mongosh
5.3 mongos
mongos is the query router used in sharded clusters. Applications connect to mongos, and mongos routes operations to the correct shards.
5.4 Config Servers
Config servers store metadata for sharded clusters, including shard information and chunk distribution. In modern MongoDB sharded clusters, config servers must be deployed as a replica set. (MongoDB)
5.5 MongoDB Compass
MongoDB Compass is a GUI tool for browsing databases, collections, documents, indexes, and aggregation pipelines.
Use Compass when you want a visual interface instead of shell commands.
5.6 MongoDB Database Tools
MongoDB Database Tools include utilities such as:
| Tool | Purpose |
|---|---|
| mongodump | Create binary database backups |
| mongorestore | Restore binary backups |
| mongoexport | Export data as JSON or CSV |
| mongoimport | Import JSON, CSV, or TSV |
| bsondump | Inspect BSON files |
mongodump and mongorestore are useful for small deployments, but MongoDB documentation recommends snapshots or Atlas cloud backups for more resilient, non-disruptive backup strategies. (MongoDB)
6. MongoDB Terminology
| Relational Database | MongoDB |
|---|---|
| Database | Database |
| Table | Collection |
| Row | Document |
| Column | Field |
| Primary key | _id |
| Index | Index |
| Join | $lookup, embedding, or application-side join |
| View | View |
| Transaction | Transaction |
| SQL | MongoDB Query Language |
| Schema | Flexible schema / schema validation |
| Server | mongod |
| Cluster router | mongos |
| Replication log | Oplog |
| Horizontal partitioning | Sharding |
7. Installing MongoDB
For beginners, the easiest installation method is Docker. For production, use official packages for your operating system or MongoDB Atlas.
MongoDB’s official installation documentation covers Community and Enterprise editions across Linux, macOS, Windows, and Docker. (MongoDB)
7.1 Install MongoDB Using Docker
Create a MongoDB container:
docker run -d \
--name mongodb \
-p 27017:27017 \
-v mongodb_data:/data/db \
mongo:8.0
Check container status:
docker ps
Connect using mongosh inside the container:
docker exec -it mongodb mongosh
Stop MongoDB:
docker stop mongodb
Start it again:
docker start mongodb
Remove container:
docker rm -f mongodb
Remove volume:
docker volume rm mongodb_data
For a beginner lab, Docker is clean because it avoids OS package conflicts.
7.2 Install MongoDB on Ubuntu
The official Ubuntu installation uses the mongodb-org package maintained by MongoDB Inc.; MongoDB warns that Ubuntu’s separate mongodb package is not maintained by MongoDB Inc. and conflicts with the official package. (MongoDB)
Install prerequisites:
sudo apt-get update
sudo apt-get install -y gnupg curl
Import MongoDB public key:
curl -fsSL https://pgp.mongodb.com/server-8.0.asc | \
sudo gpg -o /usr/share/keyrings/mongodb-server-8.0.gpg \
--dearmor
For Ubuntu 24.04:
echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-8.0.gpg ] https://repo.mongodb.org/apt/ubuntu noble/mongodb-org/8.0 multiverse" | \
sudo tee /etc/apt/sources.list.d/mongodb-org-8.0.list
Install MongoDB:
sudo apt-get update
sudo apt-get install -y mongodb-org
Start MongoDB:
sudo systemctl start mongod
Enable MongoDB at boot:
sudo systemctl enable mongod
Check status:
sudo systemctl status mongod
Connect:
mongosh
7.3 Install MongoDB on macOS
MongoDB’s macOS Community Edition documentation is based on Homebrew and covers MongoDB 8.0 Community Edition. (MongoDB)
Install Homebrew if needed, then run:
brew tap mongodb/brew
brew install mongodb-community@8.0
Start MongoDB:
brew services start mongodb-community@8.0
Stop MongoDB:
brew services stop mongodb-community@8.0
Connect:
mongosh
7.4 Install MongoDB on Windows
MongoDB’s Windows Community Edition documentation uses the MSI installer by default, and notes that mongosh is not installed with MongoDB Server. (MongoDB)
Typical Windows workflow:
- Download MongoDB Community Server MSI.
- Run the installer.
- Choose “Complete” setup.
- Install MongoDB as a Windows service.
- Install MongoDB Compass if desired.
- Install mongosh separately if it is not included.
- Open PowerShell or Command Prompt.
- Run:
mongosh
8. Getting Started with MongoDB
Start mongosh:
mongosh
Show databases:
show dbs
Create or switch to a database:
use bookstore
MongoDB creates the database only when you first store data in it, for example by inserting a document or creating a collection.
Create a collection:
db.createCollection("books")
Show collections:
show collections
Insert one document:
db.books.insertOne({
title: "MongoDB Basics",
author: "Alice Tanaka",
price: 29.99,
category: "Database",
publishedYear: 2026,
tags: ["mongodb", "nosql", "database"],
stock: 50,
createdAt: new Date()
})
Insert many documents:
db.books.insertMany([
{
title: "Linux for DevOps",
author: "Ken Sato",
price: 34.99,
category: "DevOps",
publishedYear: 2025,
tags: ["linux", "devops"],
stock: 30,
createdAt: new Date()
},
{
title: "Docker Practical Guide",
author: "Maria Silva",
price: 39.99,
category: "Containers",
publishedYear: 2026,
tags: ["docker", "containers"],
stock: 20,
createdAt: new Date()
}
])
Find all documents:
db.books.find()
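Pretty print (mongosh already formats results by default, so .pretty() mainly matters in the legacy mongo shell):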
db.books.find().pretty()
Find one document:
db.books.findOne({ title: "MongoDB Basics" })
Filter documents:
db.books.find({ category: "Database" })
Use comparison operators:
db.books.find({ price: { $gt: 30 } })
Use logical operators:
db.books.find({
$and: [
{ price: { $gt: 20 } },
{ stock: { $gt: 10 } }
]
})
Find by array value:
db.books.find({ tags: "docker" })
Projection: return selected fields only:
db.books.find(
{ category: "Database" },
{ title: 1, author: 1, price: 1, _id: 0 }
)
Sort results:
db.books.find().sort({ price: 1 })
Limit results:
db.books.find().limit(2)
Skip results:
db.books.find().skip(2).limit(2)
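The skip/limit pair above is the usual building block for page-based pagination. A minimal sketch of the arithmetic, assuming 1-based page numbers (`pageToSkipLimit` is a hypothetical helper, not part of mongosh):

```javascript
// Translate a 1-based page number into skip/limit values.
function pageToSkipLimit(page, pageSize) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be an integer >= 1");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be an integer >= 1");
  }
  return { skip: (page - 1) * pageSize, limit: pageSize };
}

const { skip, limit } = pageToSkipLimit(2, 2);
console.log(skip, limit); // 2 2
// In mongosh this corresponds to: db.books.find().skip(skip).limit(limit)
```

Pages are only predictable when the query also has a sort, so in practice a `.sort()` would normally come before `.skip()` and `.limit()`.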
Update one document:
db.books.updateOne(
{ title: "MongoDB Basics" },
{ $set: { price: 24.99 } }
)
Update many documents:
db.books.updateMany(
{ category: "Database" },
{ $inc: { stock: 10 } }
)
Add value to array:
db.books.updateOne(
{ title: "MongoDB Basics" },
{ $addToSet: { tags: "beginner" } }
)
Delete one document:
db.books.deleteOne({ title: "Docker Practical Guide" })
Delete many documents:
db.books.deleteMany({ stock: { $lte: 0 } })
Drop collection:
db.books.drop()
Drop database:
db.dropDatabase()
MongoDB CRUD operations are create, read, update, and delete operations on documents. (MongoDB)
9. MongoDB Data Modeling
MongoDB data modeling is one of the most important skills.
You usually choose between:
- Embedding
- Referencing
9.1 Embedding
Embedding means storing related data inside the same document.
Example:
{
title: "MongoDB Basics",
author: "Alice Tanaka",
reviews: [
{
user: "Ravi",
rating: 5,
comment: "Very helpful"
},
{
user: "Yuki",
rating: 4,
comment: "Good beginner guide"
}
]
}
Use embedding when:
- Data is frequently read together
- Child data belongs strongly to parent data
- Child data is limited in size
- You want fast reads
Good examples:
- Blog post with comments
- Order with order items
- User profile with address
- Product with attributes
9.2 Referencing
Referencing means storing related data in separate collections and linking by _id.
Users:
{
_id: ObjectId("64..."),
name: "Alice"
}
Orders:
{
_id: ObjectId("65..."),
userId: ObjectId("64..."),
total: 150
}
Use referencing when:
- Related data is large
- Related data changes frequently
- Many documents share the same related data
- You need many-to-many relationships
Good examples:
- Users and roles
- Products and categories
- Students and courses
- Authors and books
9.3 Rule of Thumb
Use embedding when data is “owned by” the parent and read together.
Use referencing when data is independent, large, shared, or frequently updated.
Tiny hot take: MongoDB schema design is not “no schema.” It is “put the schema where it actually helps.” Sometimes that is the application. Sometimes that is validation. Sometimes it is both.
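The practical difference between the two approaches shows up at read time. The sketch below uses in-memory arrays as stand-ins for collections; the string `_id` values and the `ordersWithUser` helper are made up for illustration.

```javascript
// Hypothetical in-memory stand-ins for the users and orders collections.
const users = [
  { _id: "u1", name: "Alice" },
  { _id: "u2", name: "Bob" },
];
const orders = [
  { _id: "o1", userId: "u1", total: 150 },
  { _id: "o2", userId: "u1", total: 40 },
];

// Referencing: resolve userId with a second lookup (an application-side join).
function ordersWithUser(orderDocs, userDocs) {
  const usersById = new Map(userDocs.map((u) => [u._id, u]));
  return orderDocs.map((o) => ({ ...o, user: usersById.get(o.userId) ?? null }));
}

// Embedding: the related data already lives inside the parent document.
const embeddedOrder = { _id: "o1", customer: { name: "Alice" }, total: 150 };

const joined = ordersWithUser(orders, users);
console.log(joined[0].user.name);         // Alice (needed a second lookup)
console.log(embeddedOrder.customer.name); // Alice (available after a single read)
```

The referenced version keeps one copy of each user, so a name change touches one document; the embedded version reads faster but duplicates the customer data in every order.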
10. Schema Validation
MongoDB allows flexible documents by default, but you can enforce rules using schema validation. Schema validation can check required fields, data types, value ranges, and document shape. (MongoDB)
Create a collection with validation:
db.createCollection("students", {
validator: {
$jsonSchema: {
bsonType: "object",
required: ["name", "email", "age"],
properties: {
name: {
bsonType: "string",
description: "Name must be a string"
},
email: {
bsonType: "string",
pattern: "^.+@.+\\..+$",
description: "Email must be valid"
},
age: {
bsonType: "int",
minimum: 18,
maximum: 100,
description: "Age must be between 18 and 100"
}
}
}
}
})
Valid insert:
db.students.insertOne({
name: "Aiko",
email: "aiko@example.com",
age: 22
})
Invalid insert:
db.students.insertOne({
name: "Aiko",
age: 15
})
The invalid insert fails because email is missing and age is below the allowed minimum.
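To make the two failures concrete, the same rules can be approximated in plain JavaScript. This is not how MongoDB evaluates `$jsonSchema`; `validateStudent` is a hypothetical function that only mirrors the required fields, the email pattern, and the age range from the validator above.

```javascript
// Plain-JavaScript approximation of the "students" validator rules.
function validateStudent(doc) {
  const errors = [];
  for (const field of ["name", "email", "age"]) {
    if (!(field in doc)) errors.push(`missing required field: ${field}`);
  }
  if ("name" in doc && typeof doc.name !== "string") {
    errors.push("name must be a string");
  }
  if ("email" in doc && !(typeof doc.email === "string" && /^.+@.+\..+$/.test(doc.email))) {
    errors.push("email must be valid");
  }
  if ("age" in doc && !(Number.isInteger(doc.age) && doc.age >= 18 && doc.age <= 100)) {
    errors.push("age must be an integer between 18 and 100");
  }
  return errors;
}

console.log(validateStudent({ name: "Aiko", email: "aiko@example.com", age: 22 }));
// []
console.log(validateStudent({ name: "Aiko", age: 15 }));
// [ 'missing required field: email', 'age must be an integer between 18 and 100' ]
```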
11. Indexes in MongoDB
Indexes make queries faster. Without an index, MongoDB may scan every document in a collection. With a suitable index, MongoDB can scan fewer documents. Indexes improve query performance but add write overhead because inserts and updates must also update indexes. (MongoDB)
11.1 Create a Single-Field Index
db.books.createIndex({ title: 1 })
1 means ascending order. -1 means descending order.
11.2 Create a Unique Index
db.users.createIndex({ email: 1 }, { unique: true })
Now duplicate emails are rejected.
11.3 Create a Compound Index
db.books.createIndex({ category: 1, price: -1 })
Useful query:
db.books.find({ category: "Database" }).sort({ price: -1 })
11.4 List Indexes
db.books.getIndexes()
11.5 Drop Index
db.books.dropIndex({ title: 1 })
11.6 Use explain()
db.books.find({ title: "MongoDB Basics" }).explain("executionStats")
Look for:
- COLLSCAN: collection scan, usually bad for large collections
- IXSCAN: index scan, usually better
- totalDocsExamined
- totalKeysExamined
- executionTimeMillis
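Reading those fields by eye gets tedious, so a small helper can summarize them. The sketch below works on a hand-written, heavily trimmed example object; real `explain("executionStats")` output contains many more fields, and its exact nesting can differ between server versions. `collectStages` and `summarizeExplain` are hypothetical helpers.

```javascript
// Hand-written, trimmed stand-in for explain("executionStats") output.
const sampleExplain = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "title_1" },
    },
  },
  executionStats: {
    nReturned: 1,
    executionTimeMillis: 0,
    totalKeysExamined: 1,
    totalDocsExamined: 1,
  },
};

// Walk the winning plan and collect every stage name.
function collectStages(plan, stages = []) {
  if (!plan) return stages;
  stages.push(plan.stage);
  collectStages(plan.inputStage, stages);
  for (const child of plan.inputStages ?? []) collectStages(child, stages);
  return stages;
}

// Reduce the explain document to the handful of values worth checking first.
function summarizeExplain(explain) {
  const stages = collectStages(explain.queryPlanner.winningPlan);
  const stats = explain.executionStats;
  return {
    stages,
    usesCollectionScan: stages.includes("COLLSCAN"),
    docsExaminedPerReturned:
      stats.nReturned === 0 ? null : stats.totalDocsExamined / stats.nReturned,
    executionTimeMillis: stats.executionTimeMillis,
  };
}

const summary = summarizeExplain(sampleExplain);
console.log(summary.stages, summary.usesCollectionScan);
```

A ratio of documents examined to documents returned that is far above 1 is usually a hint that the query is missing a suitable index.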
11.7 Index Best Practices
Create indexes for:
- Frequent filters
- Frequent sorts
- Unique fields
- Foreign-key-like reference fields
- Common search patterns
Avoid:
- Too many indexes
- Indexing low-value fields unnecessarily
- Creating indexes without checking query patterns
- Ignoring write overhead
12. Aggregation Pipeline
Aggregation is MongoDB’s data processing framework. It works like a pipeline: documents pass through stages, and each stage transforms, filters, groups, sorts, joins, or calculates data. MongoDB documentation notes that aggregation stages pass output documents to the next stage, and aggregation can return grouped results. (MongoDB)
Basic structure:
db.collection.aggregate([
{ stage1 },
{ stage2 },
{ stage3 }
])
12.1 Sample Data
db.orders.insertMany([
{
customer: "Alice",
status: "paid",
total: 120,
items: [
{ product: "Keyboard", qty: 1, price: 50 },
{ product: "Mouse", qty: 2, price: 35 }
],
createdAt: new Date("2026-01-10")
},
{
customer: "Bob",
status: "pending",
total: 80,
items: [
{ product: "Mouse", qty: 2, price: 40 }
],
createdAt: new Date("2026-01-11")
},
{
customer: "Alice",
status: "paid",
total: 200,
items: [
{ product: "Monitor", qty: 1, price: 200 }
],
createdAt: new Date("2026-01-12")
}
])
12.2 $match
Filters documents.
db.orders.aggregate([
{ $match: { status: "paid" } }
])
12.3 $group
Groups documents and calculates values.
db.orders.aggregate([
{ $match: { status: "paid" } },
{
$group: {
_id: "$customer",
totalSpent: { $sum: "$total" },
orderCount: { $sum: 1 }
}
}
])
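For readers more comfortable with array methods, the same `$match` and `$group` logic can be reproduced in plain JavaScript against the sample orders from section 12.1 (reduced here to the fields the pipeline uses). This is only an equivalent calculation, not how the server executes the pipeline.

```javascript
// The three sample orders, reduced to the fields used by the pipeline.
const orders = [
  { customer: "Alice", status: "paid", total: 120 },
  { customer: "Bob", status: "pending", total: 80 },
  { customer: "Alice", status: "paid", total: 200 },
];

// $match stage: keep only paid orders.
const paid = orders.filter((o) => o.status === "paid");

// $group stage: one accumulator object per distinct customer.
const groups = new Map();
for (const o of paid) {
  const g = groups.get(o.customer) ?? { _id: o.customer, totalSpent: 0, orderCount: 0 };
  g.totalSpent += o.total; // { $sum: "$total" }
  g.orderCount += 1;       // { $sum: 1 }
  groups.set(o.customer, g);
}

const result = [...groups.values()];
console.log(result);
// [ { _id: 'Alice', totalSpent: 320, orderCount: 2 } ]
```

Bob does not appear in the result because his only order is still pending and was removed by the `$match` step.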
12.4 $project
Controls output fields.
db.orders.aggregate([
{
$project: {
customer: 1,
total: 1,
tax: { $multiply: ["$total", 0.1] },
grandTotal: { $multiply: ["$total", 1.1] }
}
}
])
12.5 $sort
db.orders.aggregate([
{ $sort: { total: -1 } }
])
12.6 $unwind
Breaks array items into separate documents.
db.orders.aggregate([
{ $unwind: "$items" },
{
$group: {
_id: "$items.product",
totalQuantity: { $sum: "$items.qty" },
totalRevenue: {
$sum: { $multiply: ["$items.qty", "$items.price"] }
}
}
}
])
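The effect of `$unwind` is easiest to see as a `flatMap`: each order turns into one document per item, and the grouping then runs over those per-item documents. A plain-JavaScript equivalent using the sample orders from section 12.1 (trimmed to the relevant fields) might look like this:

```javascript
const orders = [
  { customer: "Alice", items: [
    { product: "Keyboard", qty: 1, price: 50 },
    { product: "Mouse", qty: 2, price: 35 },
  ] },
  { customer: "Bob", items: [{ product: "Mouse", qty: 2, price: 40 }] },
  { customer: "Alice", items: [{ product: "Monitor", qty: 1, price: 200 }] },
];

// $unwind: emit one document per array element, with "items" now a single object.
const unwound = orders.flatMap((o) => o.items.map((item) => ({ ...o, items: item })));

// $group by product name, summing quantity and revenue.
const byProduct = new Map();
for (const doc of unwound) {
  const { product, qty, price } = doc.items;
  const g = byProduct.get(product) ?? { _id: product, totalQuantity: 0, totalRevenue: 0 };
  g.totalQuantity += qty;
  g.totalRevenue += qty * price;
  byProduct.set(product, g);
}

console.log(unwound.length);         // 4
console.log(byProduct.get("Mouse")); // { _id: 'Mouse', totalQuantity: 4, totalRevenue: 150 }
```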
12.7 $lookup
Performs a left outer join between collections.
Products:
db.products.insertMany([
{ _id: 1, name: "Keyboard", category: "Accessories" },
{ _id: 2, name: "Mouse", category: "Accessories" }
])
Order items:
db.orderItems.insertMany([
{ orderId: 101, productId: 1, qty: 1 },
{ orderId: 102, productId: 2, qty: 2 }
])
Join:
db.orderItems.aggregate([
{
$lookup: {
from: "products",
localField: "productId",
foreignField: "_id",
as: "productDetails"
}
}
])
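"Left outer" means that every document from the input collection is kept, and the `as` field is always an array, possibly empty. A simplified plain-JavaScript emulation is sketched below; the `lookup` function is hypothetical, and the third order item (product 99) is an invented row added only to show the no-match case.

```javascript
const products = [
  { _id: 1, name: "Keyboard", category: "Accessories" },
  { _id: 2, name: "Mouse", category: "Accessories" },
];
const orderItems = [
  { orderId: 101, productId: 1, qty: 1 },
  { orderId: 102, productId: 2, qty: 2 },
  { orderId: 103, productId: 99, qty: 1 }, // invented: no matching product
];

// Simplified equality-match emulation of $lookup (left outer join).
function lookup(local, foreign, localField, foreignField, as) {
  return local.map((doc) => ({
    ...doc,
    [as]: foreign.filter((f) => f[foreignField] === doc[localField]),
  }));
}

const joined = lookup(orderItems, products, "productId", "_id", "productDetails");
console.log(joined[0].productDetails[0].name); // Keyboard
console.log(joined[2].productDetails);         // []
```

Because the result is an array, pipelines often follow `$lookup` with `$unwind` when exactly one match is expected.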
13. Transactions
MongoDB supports multi-document transactions for use cases where multiple changes must succeed or fail together. The official docs describe distributed transactions as atomic: changes are applied together or rolled back. (MongoDB)
Example use case:
- Deduct money from one account
- Add money to another account
- Record transfer history
If one step fails, all changes should be rolled back.
Example structure in JavaScript-style pseudocode:
const session = client.startSession();
try {
session.startTransaction();
await accounts.updateOne(
{ accountNo: "A100" },
{ $inc: { balance: -100 } },
{ session }
);
await accounts.updateOne(
{ accountNo: "B200" },
{ $inc: { balance: 100 } },
{ session }
);
await transfers.insertOne(
{
from: "A100",
to: "B200",
amount: 100,
createdAt: new Date()
},
{ session }
);
await session.commitTransaction();
} catch (error) {
// Roll back every change made in this transaction, then surface the failure.
await session.abortTransaction();
throw error;
} finally {
await session.endSession();
}
Use transactions when you truly need them. Do not use transactions to hide poor schema design. In MongoDB, good document design often reduces the need for multi-document transactions.
14. Security: Authentication and Authorization
MongoDB security has two major parts:
| Concept | Meaning |
|---|---|
| Authentication | Proves who the user is |
| Authorization | Controls what the user can do |
MongoDB documentation explains that authentication verifies identity, while authorization determines access to resources and operations. (MongoDB)
MongoDB uses Role-Based Access Control. A user is granted one or more roles, and outside those roles, the user has no access. Access control is not enabled by default in self-managed deployments. (MongoDB)
14.1 Create Admin User
Connect locally before enabling auth:
mongosh
Switch to admin database:
use admin
Create admin user:
db.createUser({
user: "adminUser",
pwd: passwordPrompt(),
roles: [
{ role: "userAdminAnyDatabase", db: "admin" },
{ role: "readWriteAnyDatabase", db: "admin" },
{ role: "dbAdminAnyDatabase", db: "admin" }
]
})
14.2 Enable Authorization
Edit MongoDB config file:
sudo nano /etc/mongod.conf
Add:
security:
  authorization: enabled
Restart MongoDB:
sudo systemctl restart mongod
Connect with authentication:
mongosh -u adminUser -p --authenticationDatabase admin
14.3 Create Application User
use bookstore
db.createUser({
user: "bookAppUser",
pwd: passwordPrompt(),
roles: [
{ role: "readWrite", db: "bookstore" }
]
})
Connect as app user:
mongosh -u bookAppUser -p --authenticationDatabase bookstore
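Applications typically pass the same credentials through a connection string instead of command-line flags. Special characters in the user name or password must be percent-encoded, which the sketch below handles with `encodeURIComponent`. The `buildMongoUri` helper and the password are made up for illustration; in real code the password would come from a secret store, not a literal.

```javascript
// Build a connection URI; user name and password must be percent-encoded.
function buildMongoUri({ user, password, host, port, database, authSource }) {
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  const query = new URLSearchParams({ authSource }).toString();
  return `mongodb://${credentials}@${host}:${port}/${database}?${query}`;
}

const uri = buildMongoUri({
  user: "bookAppUser",
  password: "p@ss/word", // made-up password for illustration only
  host: "localhost",
  port: 27017,
  database: "bookstore",
  authSource: "bookstore",
});

console.log(uri);
// mongodb://bookAppUser:p%40ss%2Fword@localhost:27017/bookstore?authSource=bookstore
```

The `authSource` option plays the same role as `--authenticationDatabase` in the mongosh command above.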
14.4 Common Built-In Roles
| Role | Purpose |
|---|---|
| read | Read-only access |
| readWrite | Read and write access |
| dbAdmin | Database administration |
| userAdmin | User management for one database |
| userAdminAnyDatabase | User management across databases |
| readWriteAnyDatabase | Read/write across databases |
| clusterAdmin | Cluster administration |
| backup | Backup privileges |
| restore | Restore privileges |
| root | Superuser role |
15. Replication: High Availability
A replica set is the standard production pattern for MongoDB. It contains multiple mongod instances with the same data. MongoDB’s documentation states that replica sets provide redundancy and high availability. (MongoDB)
15.1 Replica Set Components
| Component | Meaning |
|---|---|
| Primary | Accepts writes |
| Secondary | Replicates from primary |
| Arbiter | Votes in elections but does not store data |
| Oplog | Operation log used for replication |
| Election | Process of choosing a new primary |
15.2 Local Replica Set with Docker Compose
Create docker-compose.yml:
services:
  mongo1:
    image: mongo:8.0
    container_name: mongo1
    command: ["mongod", "--replSet", "rs0", "--bind_ip_all"]
    ports:
      - "27017:27017"
    volumes:
      - mongo1_data:/data/db
  mongo2:
    image: mongo:8.0
    container_name: mongo2
    command: ["mongod", "--replSet", "rs0", "--bind_ip_all"]
    ports:
      - "27018:27017"
    volumes:
      - mongo2_data:/data/db
  mongo3:
    image: mongo:8.0
    container_name: mongo3
    command: ["mongod", "--replSet", "rs0", "--bind_ip_all"]
    ports:
      - "27019:27017"
    volumes:
      - mongo3_data:/data/db
volumes:
  mongo1_data:
  mongo2_data:
  mongo3_data:
Start:
docker compose up -d
Connect to first node:
docker exec -it mongo1 mongosh
Initialize replica set:
rs.initiate({
_id: "rs0",
members: [
{ _id: 0, host: "mongo1:27017" },
{ _id: 1, host: "mongo2:27017" },
{ _id: 2, host: "mongo3:27017" }
]
})
Check status:
rs.status()
Check primary:
db.hello()
Insert data:
use replab
db.test.insertOne({ message: "replication works", createdAt: new Date() })
16. Sharding: Horizontal Scaling
Sharding is MongoDB’s horizontal scaling system. It distributes collection data across multiple shards. MongoDB sharded clusters include shards, mongos routers, and config servers. (MongoDB)
16.1 When to Use Sharding
Use sharding when:
- Data is too large for one server
- Write throughput is too high for one server
- Working set does not fit on one machine
- You need regional data distribution
- You need very large-scale growth
Do not start with sharding unless you need it. Most applications should start with a replica set.
16.2 Shard Key
A shard key determines how MongoDB distributes data.
Good shard key characteristics:
- High cardinality
- Even distribution
- Supports common queries
- Avoids hot spots
- Does not constantly increase in one direction unless handled carefully
Poor shard key examples:
{ status: 1 }
Bad because status may only have a few values.
Potentially risky:
{ createdAt: 1 }
Can create hot spots if inserts always go to the newest range.
Better example:
{ customerId: 1, orderId: 1 }
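Why cardinality matters can be shown with a toy simulation. Documents that share a shard key value always end up together, so a key with three distinct values can never occupy more than three places, however many shards exist. The hash function, the `distribute` helper, and the generated orders below are all invented for this demo; MongoDB's actual chunk and hashed-key logic works differently.

```javascript
// Toy hash (djb2-style), used only to spread key values across shards here.
function toyHash(str) {
  let h = 5381;
  for (const ch of str) h = (h * 33 + ch.codePointAt(0)) >>> 0;
  return h;
}

// Count how many documents land on each shard for a given key function.
function distribute(docs, keyOf, shardCount) {
  const counts = new Array(shardCount).fill(0);
  for (const doc of docs) {
    counts[toyHash(String(keyOf(doc))) % shardCount]++;
  }
  return counts;
}

// 3000 made-up orders: 500 customers, 3 possible statuses.
const statuses = ["paid", "pending", "cancelled"];
const orders = Array.from({ length: 3000 }, (_, i) => ({
  customerId: i % 500,
  orderId: i,
  status: statuses[i % 3],
}));

// Low cardinality: at most 3 of the 4 shards can ever receive data.
const byStatus = distribute(orders, (o) => o.status, 4);
// High cardinality: 3000 distinct key values can spread over all shards.
const byCustomerAndOrder = distribute(orders, (o) => `${o.customerId}:${o.orderId}`, 4);

console.log(byStatus);
console.log(byCustomerAndOrder);
```

With the status key at least one shard stays empty while the others each carry a full third of the data; the compound key has enough distinct values to reach every shard.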
16.3 Sharding Commands: Conceptual Example
Enable sharding on database:
sh.enableSharding("shop")
Create index on shard key:
db.orders.createIndex({ customerId: 1, orderId: 1 })
Shard collection:
sh.shardCollection("shop.orders", { customerId: 1, orderId: 1 })
17. Backup and Restore
Backups are non-negotiable. A database without tested backups is basically a suspense movie with invoices.
MongoDB’s backup and restore tools include mongodump and mongorestore. These create and restore BSON data dumps. Official documentation says they are useful for small deployments, but they can affect performance because they read data through a running MongoDB instance. MongoDB recommends verifying backups by restoring them to a test deployment. (MongoDB)
17.1 Backup a Database
mongodump --db bookstore --out /backup/mongodb
Backup with authentication:
mongodump \
--username adminUser \
--password \
--authenticationDatabase admin \
--db bookstore \
--out /backup/mongodb
17.2 Restore a Database
mongorestore --db bookstore /backup/mongodb/bookstore
Restore with authentication:
mongorestore \
--username adminUser \
--password \
--authenticationDatabase admin \
--db bookstore \
/backup/mongodb/bookstore
17.3 Backup Best Practices
- Automate backups
- Store backups off-server
- Encrypt backups
- Test restore regularly
- Label backups with date and database name
- Monitor backup success and failure
- For production, prefer snapshot-based or managed cloud backup where appropriate
18. MongoDB Administration Workflow
A MongoDB administrator is responsible for keeping the database secure, healthy, backed up, observable, and performant.
18.1 Daily Administration Checklist
Run these checks daily. Start with overall server status:
db.adminCommand({ serverStatus: 1 })
Check database sizes:
db.stats()
Check collection stats:
db.books.stats()
Check current operations:
db.currentOp()
Check replica set status:
rs.status()
Check replication lag:
rs.printSecondaryReplicationInfo()
Review logs:
sudo journalctl -u mongod
or:
sudo tail -f /var/log/mongodb/mongod.log
18.2 Weekly Administration Checklist
- Review slow queries
- Review index usage
- Check disk growth
- Check backup restore process
- Review users and roles
- Check expired or unused accounts
- Confirm monitoring alerts
- Check OS patching plan
- Check MongoDB patch version
- Review schema growth and document sizes
18.3 Monthly Administration Checklist
- Test disaster recovery
- Review capacity planning
- Audit access control
- Review TLS certificates
- Review shard balance if sharded
- Review replica set elections
- Confirm backup retention
- Check deprecated features before upgrades
18.4 Common Admin Commands
Show databases:
show dbs
Show users:
show users
Show roles:
show roles
Check server status:
db.serverStatus()
Check build info:
db.version()
db.adminCommand({ buildInfo: 1 })
Check database stats:
db.stats()
Check collection stats:
db.collection.stats()
Kill operation:
db.killOp(opid)
19. MongoDB User Workflow
A MongoDB user workflow depends on role. Here are the main workflows.
19.1 Developer Workflow
- Understand application data.
- Design document model.
- Choose embedding or referencing.
- Create collections.
- Insert sample data.
- Write CRUD queries.
- Add indexes.
- Test with realistic data volume.
- Use explain() to inspect performance.
- Add schema validation where needed.
- Connect application using driver.
- Deploy with authentication and TLS.
- Monitor slow queries.
19.2 Application Workflow
A typical application flow:
User request
↓
Application route/controller
↓
Validate request
↓
MongoDB driver query
↓
MongoDB server
↓
Return document/result
↓
Application response
Example: create a user.
app.post("/users", async (req, res) => {
const result = await db.collection("users").insertOne({
name: req.body.name,
email: req.body.email,
createdAt: new Date()
});
res.json({ id: result.insertedId });
});
19.3 Analyst Workflow
- Connect as a read-only user.
- Explore collections.
- Use aggregation pipeline.
- Export selected results.
- Build reports.
- Avoid running heavy, unindexed queries against production.
- Coordinate large analytics jobs with admins.
19.4 Admin Workflow
- Provision deployment.
- Configure storage and networking.
- Enable authentication.
- Create users and roles.
- Configure backups.
- Configure monitoring.
- Review indexes.
- Manage replica sets.
- Plan upgrades.
- Respond to incidents.
20. Performance Optimization
MongoDB performance depends on data model, indexes, working set, hardware, queries, and deployment architecture.
20.1 Use Proper Indexes
Bad:
db.orders.find({ customerId: 12345 }).sort({ createdAt: -1 })
Without an index, this may scan and sort many documents.
Better:
db.orders.createIndex({ customerId: 1, createdAt: -1 })
20.2 Avoid Returning Too Much Data
Bad:
db.users.find()
Better:
db.users.find(
{ status: "active" },
{ name: 1, email: 1, _id: 0 }
).limit(100)
20.3 Avoid Unbounded Arrays
Bad design:
{
userId: 1,
events: [
thousands_or_millions_of_events
]
}
Better:
{
userId: 1,
eventType: "login",
createdAt: ISODate("2026-01-01T10:00:00Z")
}
Store large event streams as separate documents.
20.4 Use explain()
db.orders.find({ customerId: 1001 }).explain("executionStats")
Check:
- Was an index used?
- How many documents were scanned?
- How long did it take?
- Is there a collection scan?
20.5 Design for Query Patterns
Do not design MongoDB documents only by “normalization rules.” Design for the way your application reads and writes data.
Ask:
- What are the top 10 queries?
- What data is read together?
- What data changes frequently?
- What data grows forever?
- What fields need indexes?
- What fields need uniqueness?
- What operations must be atomic?
21. Advanced MongoDB Features
21.1 TTL Indexes
TTL indexes automatically delete documents after a period of time.
Example: delete logs after 30 days.
db.logs.createIndex(
{ createdAt: 1 },
{ expireAfterSeconds: 2592000 }
)
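The value 2592000 is simply 30 days expressed in seconds. A small sketch of that conversion, plus the expiry condition a TTL index applies to the indexed date field (`daysToSeconds` and `isExpired` are hypothetical helpers, not MongoDB functions):

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60;

// Convert a retention period in days to an expireAfterSeconds value.
function daysToSeconds(days) {
  return days * SECONDS_PER_DAY;
}

// A document is eligible for deletion once createdAt + expireAfterSeconds has passed.
function isExpired(createdAt, expireAfterSeconds, now = new Date()) {
  return now.getTime() >= createdAt.getTime() + expireAfterSeconds * 1000;
}

const ttl = daysToSeconds(30);
console.log(ttl); // 2592000

const createdAt = new Date("2026-01-01T00:00:00Z");
console.log(isExpired(createdAt, ttl, new Date("2026-01-30T23:59:59Z"))); // false
console.log(isExpired(createdAt, ttl, new Date("2026-01-31T00:00:00Z"))); // true
```

Deletion is handled by a background task that runs periodically, so expired documents can remain visible for a short time after they become eligible.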
21.2 Text Indexes
db.articles.createIndex({ title: "text", body: "text" })
Search:
db.articles.find({ $text: { $search: "mongodb tutorial" } })
21.3 Geospatial Indexes
db.places.createIndex({ location: "2dsphere" })
Example document:
{
name: "Tokyo Station",
location: {
type: "Point",
coordinates: [139.7671, 35.6812]
}
}
Find nearby:
db.places.find({
location: {
$near: {
$geometry: {
type: "Point",
coordinates: [139.7671, 35.6812]
},
$maxDistance: 1000
}
}
})
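Two details in that query are easy to get wrong: GeoJSON coordinates are ordered `[longitude, latitude]`, and `$maxDistance` for a `2dsphere` index is in meters. To get a feel for the distances involved, here is a haversine approximation in plain JavaScript. It assumes a spherical Earth with a mean radius of 6,371 km, the second coordinate pair is only an approximate location for Shinjuku Station, and the server's own distance calculation may differ slightly.

```javascript
const EARTH_RADIUS_M = 6371000; // mean Earth radius in meters (approximation)

// Great-circle distance between two [longitude, latitude] pairs, in meters.
function haversineMeters([lng1, lat1], [lng2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLng = toRad(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

const tokyoStation = [139.7671, 35.6812];
const shinjuku = [139.7005, 35.6896]; // approximate, for illustration only

const distance = haversineMeters(tokyoStation, shinjuku);
console.log(distance > 1000); // true: outside a 1000 m $maxDistance
```

A place at the second location would therefore not be returned by the `$near` query above, because it lies several kilometers beyond the 1000 meter limit.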
21.4 Change Streams
Change streams allow applications to listen for database changes.
Example:
const changeStream = db.collection("orders").watch();
changeStream.on("change", change => {
console.log(change);
});
Use cases:
- Real-time notifications
- Cache invalidation
- Event-driven systems
- Audit pipelines
21.5 Capped Collections
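Capped collections have a fixed maximum size and preserve insertion order. Once the size limit is reached, the oldest documents are overwritten to make room for new ones.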
db.createCollection("logs", {
capped: true,
size: 100000
})
Use for:
- Logs
- Event buffers
- Temporary streams
22. Complete Mini Project: Bookstore Database
Now let’s build a small bookstore database.
22.1 Create Database
use bookstore
22.2 Create Collections
db.createCollection("users")
db.createCollection("books")
db.createCollection("orders")
22.3 Insert Users
db.users.insertMany([
{
name: "Alice",
email: "alice@example.com",
role: "customer",
createdAt: new Date()
},
{
name: "Bob",
email: "bob@example.com",
role: "customer",
createdAt: new Date()
}
])
22.4 Insert Books
db.books.insertMany([
{
title: "MongoDB Basics",
author: "Alice Tanaka",
category: "Database",
price: 29.99,
stock: 50,
tags: ["mongodb", "nosql"]
},
{
title: "Docker Practical Guide",
author: "Maria Silva",
category: "DevOps",
price: 39.99,
stock: 20,
tags: ["docker", "containers"]
},
{
title: "Linux for Beginners",
author: "Ken Sato",
category: "Linux",
price: 24.99,
stock: 100,
tags: ["linux", "server"]
}
])
22.5 Create Indexes
db.users.createIndex({ email: 1 }, { unique: true })
db.books.createIndex({ title: 1 })
db.books.createIndex({ category: 1, price: -1 })
db.orders.createIndex({ userId: 1, createdAt: -1 })
22.6 Create an Order
Find user:
const user = db.users.findOne({ email: "alice@example.com" })
Find books:
const book1 = db.books.findOne({ title: "MongoDB Basics" })
const book2 = db.books.findOne({ title: "Docker Practical Guide" })
Insert order:
db.orders.insertOne({
userId: user._id,
customerEmail: user.email,
items: [
{
bookId: book1._id,
title: book1.title,
qty: 1,
price: book1.price
},
{
bookId: book2._id,
title: book2.title,
qty: 1,
price: book2.price
}
],
total: book1.price + book2.price,
status: "paid",
createdAt: new Date()
})
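The order above computes `total` as `book1.price + book2.price`, which is correct only because each quantity is 1. A more general sketch multiplies each price by its quantity and sums in integer cents to avoid floating-point drift. The helper name `orderTotal` and the sample quantities are assumptions for illustration.

```javascript
// Hypothetical helper: sum qty * price over the order items.
// Working in integer cents avoids floating-point rounding drift.
function orderTotal(items) {
  const cents = items.reduce(
    (sum, item) => sum + item.qty * Math.round(item.price * 100),
    0
  );
  return cents / 100;
}

// Sample items using the book prices from this project.
const sampleItems = [
  { title: "MongoDB Basics", qty: 2, price: 29.99 },
  { title: "Docker Practical Guide", qty: 1, price: 39.99 },
];

const sampleTotal = orderTotal(sampleItems);
console.log(sampleTotal); // 99.97
```

In the order document, `total: orderTotal(items)` would then stay correct when a customer buys more than one copy of a book.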
22.7 Reduce Stock
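db.books.updateOne(
{ _id: book1._id, stock: { $gte: 1 } }, // only decrement while stock remains
{ $inc: { stock: -1 } }
)
db.books.updateOne(
{ _id: book2._id, stock: { $gte: 1 } }, // only decrement while stock remains
{ $inc: { stock: -1 } }
)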
22.8 Sales Report
db.orders.aggregate([
{ $match: { status: "paid" } },
{ $unwind: "$items" },
{
$group: {
_id: "$items.title",
totalSold: { $sum: "$items.qty" },
revenue: {
$sum: { $multiply: ["$items.qty", "$items.price"] }
}
}
},
{ $sort: { revenue: -1 } }
])
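To make the stages easier to follow, here is a rough plain-JavaScript equivalent of the pipeline, run on made-up sample orders. The function name `salesReport` and the sample data are assumptions; whole-number prices are used so the arithmetic is easy to check by hand.

```javascript
// Made-up sample orders; only paid orders should count toward the report.
const sampleOrders = [
  {
    status: "paid",
    items: [
      { title: "MongoDB Basics", qty: 1, price: 30 },
      { title: "Docker Practical Guide", qty: 1, price: 40 },
    ],
  },
  { status: "paid", items: [{ title: "MongoDB Basics", qty: 2, price: 30 }] },
  { status: "cancelled", items: [{ title: "Linux for Beginners", qty: 5, price: 25 }] },
];

function salesReport(orders) {
  const groups = new Map();
  orders
    .filter((order) => order.status === "paid") // $match
    .flatMap((order) => order.items)            // $unwind (items only)
    .forEach((item) => {                        // $group by title
      const row = groups.get(item.title) || {
        _id: item.title,
        totalSold: 0,
        revenue: 0,
      };
      row.totalSold += item.qty;
      row.revenue += item.qty * item.price;
      groups.set(item.title, row);
    });
  // $sort by revenue, highest first
  return [...groups.values()].sort((a, b) => b.revenue - a.revenue);
}

const report = salesReport(sampleOrders);
console.log(report);
```

With this sample, "MongoDB Basics" comes first with 3 copies sold and 90 in revenue, followed by "Docker Practical Guide" with 1 copy and 40. The cancelled order is excluded by the first step, just as `$match` excludes it in the pipeline.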
23. MongoDB Best Practices
23.1 Data Modeling Best Practices
- Model data around application queries.
- Embed data that is read together.
- Reference data that is large, shared, or independent.
- Avoid unbounded arrays.
- Keep document size reasonable.
- Use schema validation for critical collections.
- Store duplicate read-optimized fields when useful, but keep them consistent.
23.2 Query Best Practices
- Use indexes for frequent queries.
- Use projection to return only needed fields.
- Use pagination for large result sets.
- Avoid regex queries without a proper index strategy; unanchored or case-insensitive patterns generally cannot use an index efficiently.
- Avoid collection scans on large collections.
- Use aggregation carefully on large datasets.
23.3 Index Best Practices
- Create indexes based on real queries.
- Use compound indexes for filter + sort patterns.
- Avoid too many indexes.
- Remove unused indexes.
- Use unique indexes for unique fields.
- Test indexes with explain().
23.4 Security Best Practices
- Enable authentication.
- Use least-privilege roles.
- Do not use admin users in applications.
- Use TLS in production.
- Rotate passwords and secrets.
- Keep MongoDB patched.
- Restrict network access.
- Audit users and roles regularly.
23.5 Backup Best Practices
- Automate backups.
- Test restores.
- Store backups securely.
- Use snapshots or managed backups for production.
- Monitor backup jobs.
- Document recovery procedures.
23.6 Production Best Practices
- Use replica sets.
- Monitor disk, CPU, memory, connections, locks, and replication lag.
- Use proper indexes.
- Plan capacity.
- Avoid running without authentication.
- Avoid public internet exposure.
- Use sharding only when required.
- Test upgrades before production.
- Keep disaster recovery plans current.
24. Common Mistakes Beginners Make
Mistake 1: Thinking MongoDB Has No Schema
MongoDB has a flexible schema, but your application still needs a clear data model.
Mistake 2: Creating Too Many Indexes
Indexes speed up reads but slow down writes and use storage.
Mistake 3: Using MongoDB Like a Relational Database
If every query requires many joins ($lookup stages), your model may not be document-oriented enough.
Mistake 4: Ignoring Backups
Backups are only useful if restores are tested.
Mistake 5: Running Without Authentication
Self-managed MongoDB access control is not enabled by default, so you must enable it for real deployments. (MongoDB)
Mistake 6: Bad Shard Key
A poor shard key can create uneven data distribution and performance hot spots.
Mistake 7: Not Using explain()
You cannot guess query performance reliably. Use explain().
25. MongoDB Cheat Sheet
Database Commands
show dbs
use mydb
db
db.dropDatabase()
Collection Commands
show collections
db.createCollection("users")
db.users.drop()
Insert
db.users.insertOne({ name: "Alice" })
db.users.insertMany([{ name: "Bob" }, { name: "Carol" }])
Read
db.users.find()
db.users.findOne({ name: "Alice" })
db.users.find({ age: { $gt: 18 } })
Projection
db.users.find({}, { name: 1, email: 1, _id: 0 })
Update
db.users.updateOne(
{ name: "Alice" },
{ $set: { city: "Tokyo" } }
)
db.users.updateMany(
{ active: true },
{ $set: { verified: true } }
)
Delete
db.users.deleteOne({ name: "Alice" })
db.users.deleteMany({ active: false })
Index
db.users.createIndex({ email: 1 }, { unique: true })
db.users.getIndexes()
db.users.dropIndex({ email: 1 })
Aggregation
db.orders.aggregate([
{ $match: { status: "paid" } },
{ $group: { _id: "$customer", total: { $sum: "$total" } } },
{ $sort: { total: -1 } }
])
Users
use admin
db.createUser({
user: "adminUser",
pwd: passwordPrompt(),
roles: ["root"]
})
Backup
mongodump --db mydb --out /backup
Restore
mongorestore --nsInclude "mydb.*" /backup
26. Suggested Learning Path
Follow this order:
- Understand documents, collections, and databases.
- Install MongoDB locally with Docker.
- Learn mongosh.
- Practice CRUD.
- Learn filters, projections, sorting, and pagination.
- Learn data modeling.
- Learn indexes.
- Learn aggregation.
- Add schema validation.
- Add authentication and users.
- Learn backup and restore.
- Learn replica sets.
- Learn transactions.
- Learn sharding concepts.
- Learn monitoring and production administration.
Final Summary
MongoDB is a flexible, document-oriented database designed for modern applications. Its biggest strengths are flexible schema design, document-based storage, powerful queries, indexes, aggregation, replication, transactions, and horizontal scaling through sharding.
For beginners, start with Docker, mongosh, CRUD, and indexes. For intermediate users, focus on schema design, aggregation, validation, and security. For advanced users, learn replica sets, backup strategies, performance tuning, transactions, and sharding.
A strong MongoDB workflow is:
Design data model
↓
Create collections
↓
Insert documents
↓
Query and update data
↓
Add indexes
↓
Validate schema
↓
Secure users and roles
↓
Back up data
↓
Monitor performance
↓
Scale with replica sets and sharding
That is the practical end-to-end MongoDB journey: from first document to production-ready database.