MongoDB – Complete End-to-End MongoDB Tutorial Blog: From Basics to Advanced

MongoDB is a NoSQL document database. Instead of storing data in rows and columns like a relational database, MongoDB stores data as documents inside collections. These documents look like JSON when you write them, but MongoDB stores them internally in a binary format called BSON, which supports richer data types. MongoDB is available as Community Edition and Enterprise Edition, and the official documentation provides installation paths for Linux, macOS, Windows, and Docker. (MongoDB)

This tutorial covers MongoDB from beginner to advanced level: what it is, why it is useful, how it works, architecture, installation, CRUD, indexing, aggregation, schema design, security, replication, sharding, backup, administration, and real workflows.

1. What Is MongoDB?

MongoDB is a document-oriented database designed for modern applications that need flexible, scalable, high-performance data storage.

In a traditional SQL database, you might store user data like this:

id	name	email	city
1	Alice	alice@example.com	Tokyo

In MongoDB, the same user is stored as a document:

{
  _id: ObjectId("..."),
  name: "Alice",
  email: "alice@example.com",
  address: {
    city: "Tokyo",
    country: "Japan"
  },
  skills: ["Linux", "Docker", "MongoDB"]
}

A MongoDB document can contain:

Strings
Numbers
Dates
Arrays
Nested objects
Booleans
ObjectIds
Binary data
Geospatial data

MongoDB is popular because application data is often already object-like. For example, a user profile, product catalog item, order, blog post, or IoT event naturally fits as a document.

2. Advantages of MongoDB

MongoDB has several major advantages.

2.1 Flexible Schema

MongoDB does not force every document in a collection to have the exact same fields. That makes it useful when application requirements change quickly.

For example:

{
  name: "Laptop",
  price: 1200,
  category: "Electronics"
}

Another document in the same collection can have extra fields:

{
  name: "Phone",
  price: 800,
  category: "Electronics",
  warranty: "2 years",
  colors: ["black", "white"]
}

However, flexible schema does not mean “no design.” MongoDB also supports schema validation when you want to enforce data rules such as required fields, data types, or value ranges. (MongoDB)

2.2 Natural Data Model for Applications

Most modern applications use objects, JSON, APIs, and nested data. MongoDB documents map naturally to those structures.

For example, an order can store customer information and ordered items together:

{
  orderId: "ORD1001",
  customer: {
    name: "Alice",
    email: "alice@example.com"
  },
  items: [
    { product: "Keyboard", qty: 1, price: 50 },
    { product: "Mouse", qty: 2, price: 25 }
  ],
  status: "paid"
}

This can reduce the need for joins in many use cases.

2.3 High Availability

MongoDB supports replication through replica sets. A replica set is a group of mongod processes that maintain the same dataset. Replica sets provide redundancy and high availability, and MongoDB’s documentation describes them as the basis for production deployments. (MongoDB)

2.4 Horizontal Scalability

MongoDB supports sharding, which distributes data across multiple machines. A sharded cluster contains shards, mongos query routers, and config servers. Each shard stores a subset of the data and must be deployed as a replica set. (MongoDB)

2.5 Powerful Query and Aggregation System

MongoDB supports filtering, sorting, indexing, text search, geospatial queries, aggregation pipelines, and multi-document transactions. Aggregation pipelines process documents through stages such as filtering, grouping, projecting, sorting, joining, and calculating values. (MongoDB)

2.6 Transactions

MongoDB supports multi-document transactions when you need all-or-nothing changes across multiple documents or collections. Transactions either commit all changes or roll them back. (MongoDB)

3. How MongoDB Works

At a high level, MongoDB works like this:

Application
   |
MongoDB Driver
   |
mongod or mongos
   |
Storage Engine
   |
Data Files + Journal

3.1 Application Layer

Your application can be written in Node.js, Python, Java, Go, PHP, C#, Ruby, or another language. It uses a MongoDB driver to connect to the database.

Example connection string:

mongodb://localhost:27017

3.2 Driver Layer

The driver converts your application objects into MongoDB-compatible BSON documents and sends commands to the server.

Example in application logic:

await db.collection("users").insertOne({
  name: "Alice",
  email: "alice@example.com"
});

3.3 Server Layer: `mongod`

mongod is the main MongoDB database server process. It handles:

Client connections
Reads
Writes
Indexes
Replication
Storage
Query execution
Authentication and authorization

3.4 Storage Layer

MongoDB writes data to disk through its storage engine. In modern MongoDB deployments, WiredTiger is the default storage engine. The storage layer manages data files, indexes, compression, concurrency, and journaling.

3.5 Query Execution

When you run a query like:

db.users.find({ email: "alice@example.com" })

MongoDB checks whether an index can help. If an appropriate index exists, MongoDB uses it to limit the number of documents scanned. Without a useful index, MongoDB may scan every document in the collection. (MongoDB)
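
To make the difference concrete, here is a small plain JavaScript sketch of a collection scan versus an index lookup. It is only a toy model: real MongoDB indexes are B-tree structures managed by the storage engine, and the `Map`-based index and the function names below are assumptions made for illustration.

```javascript
// Toy model only: a Map stands in for an index; this is not how MongoDB stores indexes.
const users = [
  { _id: 1, email: "alice@example.com" },
  { _id: 2, email: "bob@example.com" },
  { _id: 3, email: "carol@example.com" },
];

// Collection scan: every document is examined, whether or not it matches.
function collectionScan(docs, email) {
  let examined = 0;
  const matches = [];
  for (const doc of docs) {
    examined += 1;
    if (doc.email === email) matches.push(doc);
  }
  return { docs: matches, examined };
}

// Hypothetical index: maps each email to the positions of the documents that hold it.
function buildEmailIndex(docs) {
  const index = new Map();
  docs.forEach((doc, position) => {
    const positions = index.get(doc.email) ?? [];
    positions.push(position);
    index.set(doc.email, positions);
  });
  return index;
}

// Index lookup: only the documents the index points at are examined.
function indexLookup(docs, index, email) {
  const positions = index.get(email) ?? [];
  return { docs: positions.map((p) => docs[p]), examined: positions.length };
}

const emailIndex = buildEmailIndex(users);
const scanResult = collectionScan(users, "carol@example.com");
const indexResult = indexLookup(users, emailIndex, "carol@example.com");
console.log(scanResult.examined, indexResult.examined); // 3 1
```

Both approaches return the same document, but the scan touches all three documents while the lookup touches one. On a collection with millions of documents, that gap is what an index buys you.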

4. MongoDB Architecture

MongoDB can run in three main deployment architectures.

4.1 Standalone Architecture

This is the simplest architecture.

Application → mongod → Data files

Use it for:

Learning
Local development
Testing
Small experiments

Avoid standalone MongoDB for production because it has no automatic failover.

4.2 Replica Set Architecture

A replica set contains multiple MongoDB servers that hold the same data.

              ┌──────────────┐
Application → │ Primary Node │
              └──────┬───────┘
                     │ replication
        ┌────────────┴────────────┐
        ↓                         ↓
┌──────────────┐          ┌──────────────┐
│ Secondary    │          │ Secondary    │
└──────────────┘          └──────────────┘

The primary receives writes. The secondary nodes replicate data from the primary. If the primary fails, the replica set can elect a new primary.

MongoDB replica sets improve:

Availability
Fault tolerance
Data redundancy
Disaster recovery
Read scaling in selected cases

MongoDB documentation notes that the primary records changes in the operation log, or oplog, which secondaries use to replicate changes. (MongoDB)
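
The oplog idea can be sketched in a few lines of plain JavaScript. This is a simplified model, not the real oplog format: the entry shape (`type`, `id`, `doc`, `set`) and the function names are hypothetical, chosen only to show a secondary catching up by replaying recorded operations.

```javascript
// Simplified model of oplog-based replication (hypothetical entry format).
const oplog = [];
const primaryData = new Map();
const secondaryData = new Map();

// Apply one logged operation to a data set.
function applyOp(data, op) {
  if (op.type === "insert") data.set(op.id, { ...op.doc });
  else if (op.type === "update") data.set(op.id, { ...data.get(op.id), ...op.set });
  else if (op.type === "delete") data.delete(op.id);
}

// The primary applies the write and records it in the oplog.
function writeToPrimary(op) {
  applyOp(primaryData, op);
  oplog.push(op);
}

// A secondary replays every entry it has not seen yet and returns its new position.
function replicate(data, fromPosition) {
  for (let i = fromPosition; i < oplog.length; i += 1) applyOp(data, oplog[i]);
  return oplog.length;
}

writeToPrimary({ type: "insert", id: 1, doc: { name: "Alice", city: "Tokyo" } });
writeToPrimary({ type: "update", id: 1, set: { city: "Osaka" } });
writeToPrimary({ type: "insert", id: 2, doc: { name: "Bob" } });
writeToPrimary({ type: "delete", id: 2 });

const secondaryPosition = replicate(secondaryData, 0);
console.log(secondaryData.get(1)); // { name: 'Alice', city: 'Osaka' }
```

After the replay, the secondary holds the same data as the primary. The distance between a secondary's position and the end of the log is a rough picture of replication lag.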

4.3 Sharded Cluster Architecture

Sharding distributes data across multiple shards.

Application
    |
  mongos ──── Config Servers (cluster metadata)
    |
 ┌─────────┬─────────┬─────────┐
 │ Shard 1 │ Shard 2 │ Shard 3 │
 └─────────┴─────────┴─────────┘

A sharded cluster contains:

Component	Purpose
Shard	Stores a subset of data
`mongos`	Query router between application and shards
Config servers	Store cluster metadata and configuration
Shard key	Field used to distribute documents
Balancer	Moves chunks to balance data

Use sharding when:

One server cannot store all data
One server cannot handle all reads/writes
You need horizontal scaling
You need data distribution by region or workload

5. MongoDB Components

5.1 `mongod`

The main database server process.

Responsibilities:

Stores data
Handles queries
Maintains indexes
Performs replication
Applies access control
Manages storage

5.2 `mongosh`

mongosh is the MongoDB Shell. You use it to connect to MongoDB, run commands, query data, create users, inspect collections, and perform administration tasks. MongoDB provides separate installation guidance for mongosh. (MongoDB)

Example:

mongosh

5.3 `mongos`

mongos is the query router used in sharded clusters. Applications connect to mongos, and mongos routes operations to the correct shards.

5.4 Config Servers

Config servers store metadata for sharded clusters, including shard information and chunk distribution. In modern MongoDB sharded clusters, config servers must be deployed as a replica set. (MongoDB)

5.5 MongoDB Compass

MongoDB Compass is a GUI tool for browsing databases, collections, documents, indexes, and aggregation pipelines.

Use Compass when you want a visual interface instead of shell commands.

5.6 MongoDB Database Tools

MongoDB Database Tools include utilities such as:

Tool	Purpose
`mongodump`	Create binary database backups
`mongorestore`	Restore binary backups
`mongoexport`	Export data as JSON or CSV
`mongoimport`	Import JSON, CSV, or TSV
`bsondump`	Inspect BSON files

mongodump and mongorestore are useful for small deployments, but MongoDB documentation recommends snapshots or Atlas cloud backups for more resilient, non-disruptive backup strategies. (MongoDB)

6. MongoDB Terminology

Relational Database	MongoDB
Database	Database
Table	Collection
Row	Document
Column	Field
Primary key	`_id`
Index	Index
Join	`$lookup`, embedding, or application-side join
View	View
Transaction	Transaction
SQL	MongoDB Query Language
Schema	Flexible schema / schema validation
Server	`mongod`
Cluster router	`mongos`
Replication log	Oplog
Horizontal partitioning	Sharding

7. Installing MongoDB

For beginners, the easiest installation method is Docker. For production, use official packages for your operating system or MongoDB Atlas.

MongoDB’s official installation documentation covers Community and Enterprise editions across Linux, macOS, Windows, and Docker. (MongoDB)

7.1 Install MongoDB Using Docker

Create a MongoDB container:

docker run -d \
  --name mongodb \
  -p 27017:27017 \
  -v mongodb_data:/data/db \
  mongo:8.0

Check container status:

docker ps

Connect using mongosh inside the container:

docker exec -it mongodb mongosh

Stop MongoDB:

docker stop mongodb

Start it again:

docker start mongodb

Remove container:

docker rm -f mongodb

Remove volume:

docker volume rm mongodb_data

For a beginner lab, Docker is clean because it avoids OS package conflicts.

7.2 Install MongoDB on Ubuntu

The official Ubuntu installation uses the mongodb-org package maintained by MongoDB Inc.; MongoDB warns that Ubuntu’s separate mongodb package is not maintained by MongoDB Inc. and conflicts with the official package. (MongoDB)

Install prerequisites:

sudo apt-get update
sudo apt-get install -y gnupg curl

Import MongoDB public key:

curl -fsSL https://pgp.mongodb.com/server-8.0.asc | \
  sudo gpg -o /usr/share/keyrings/mongodb-server-8.0.gpg \
  --dearmor

For Ubuntu 24.04:

echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-8.0.gpg ] https://repo.mongodb.org/apt/ubuntu noble/mongodb-org/8.0 multiverse" | \
  sudo tee /etc/apt/sources.list.d/mongodb-org-8.0.list

Install MongoDB:

sudo apt-get update
sudo apt-get install -y mongodb-org

Start MongoDB:

sudo systemctl start mongod

Enable MongoDB at boot:

sudo systemctl enable mongod

Check status:

sudo systemctl status mongod

Connect:

mongosh

7.3 Install MongoDB on macOS

MongoDB’s macOS Community Edition documentation is based on Homebrew and covers MongoDB 8.0 Community Edition. (MongoDB)

Install Homebrew if needed, then run:

brew tap mongodb/brew
brew install mongodb-community@8.0

Start MongoDB:

brew services start mongodb-community@8.0

Stop MongoDB:

brew services stop mongodb-community@8.0

Connect:

mongosh

7.4 Install MongoDB on Windows

MongoDB’s Windows Community Edition documentation uses the MSI installer by default, and notes that mongosh is not installed with MongoDB Server. (MongoDB)

Typical Windows workflow:

Download MongoDB Community Server MSI.
Run the installer.
Choose “Complete” setup.
Install MongoDB as a Windows service.
Install MongoDB Compass if desired.
Install mongosh separately; the server MSI does not include it.
Open PowerShell or Command Prompt.
Run:

mongosh

8. Getting Started with MongoDB

Start mongosh:

mongosh

Show databases:

show dbs

Create or switch to a database:

use bookstore

MongoDB creates the database only when you first store data in it, for example by inserting a document or creating a collection.

Create a collection:

db.createCollection("books")

Show collections:

show collections

Insert one document:

db.books.insertOne({
  title: "MongoDB Basics",
  author: "Alice Tanaka",
  price: 29.99,
  category: "Database",
  publishedYear: 2026,
  tags: ["mongodb", "nosql", "database"],
  stock: 50,
  createdAt: new Date()
})

Insert many documents:

db.books.insertMany([
  {
    title: "Linux for DevOps",
    author: "Ken Sato",
    price: 34.99,
    category: "DevOps",
    publishedYear: 2025,
    tags: ["linux", "devops"],
    stock: 30,
    createdAt: new Date()
  },
  {
    title: "Docker Practical Guide",
    author: "Maria Silva",
    price: 39.99,
    category: "Containers",
    publishedYear: 2026,
    tags: ["docker", "containers"],
    stock: 20,
    createdAt: new Date()
  }
])

Find all documents:

db.books.find()

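Pretty print (mongosh already formats results, so .pretty() is optional and mainly kept for compatibility with the legacy mongo shell):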

db.books.find().pretty()

Find one document:

db.books.findOne({ title: "MongoDB Basics" })

Filter documents:

db.books.find({ category: "Database" })

Use comparison operators:

db.books.find({ price: { $gt: 30 } })

Use logical operators:

db.books.find({
  $and: [
    { price: { $gt: 20 } },
    { stock: { $gt: 10 } }
  ]
})

Find by array value:

db.books.find({ tags: "docker" })

Projection: return selected fields only:

db.books.find(
  { category: "Database" },
  { title: 1, author: 1, price: 1, _id: 0 }
)

Sort results:

db.books.find().sort({ price: 1 })

Limit results:

db.books.find().limit(2)

Skip results:

db.books.find().skip(2).limit(2)
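
A common use of skip and limit is page-based pagination. The helper below is a minimal sketch; `pageToSkipLimit` is a hypothetical name, not part of mongosh or any driver.

```javascript
// Convert a 1-based page number and a page size into skip/limit values.
function pageToSkipLimit(page, pageSize) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be an integer >= 1");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be an integer >= 1");
  }
  return { skip: (page - 1) * pageSize, limit: pageSize };
}

const secondPage = pageToSkipLimit(2, 2);
console.log(secondPage); // { skip: 2, limit: 2 }
```

In mongosh you could then run db.books.find().sort({ _id: 1 }).skip(secondPage.skip).limit(secondPage.limit). The sort matters: without a fixed order, pages are not guaranteed to be stable. Large skip values also get slower, because the server still has to walk past the skipped documents.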

Update one document:

db.books.updateOne(
  { title: "MongoDB Basics" },
  { $set: { price: 24.99 } }
)

Update many documents:

db.books.updateMany(
  { category: "Database" },
  { $inc: { stock: 10 } }
)

Add value to array:

db.books.updateOne(
  { title: "MongoDB Basics" },
  { $addToSet: { tags: "beginner" } }
)

Delete one document:

db.books.deleteOne({ title: "Docker Practical Guide" })

Delete many documents:

db.books.deleteMany({ stock: { $lte: 0 } })

Drop collection:

db.books.drop()

Drop database:

db.dropDatabase()

MongoDB CRUD operations are create, read, update, and delete operations on documents. (MongoDB)

9. MongoDB Data Modeling

MongoDB data modeling is one of the most important skills.

You usually choose between:

Embedding
Referencing

9.1 Embedding

Embedding means storing related data inside the same document.

Example:

{
  title: "MongoDB Basics",
  author: "Alice Tanaka",
  reviews: [
    {
      user: "Ravi",
      rating: 5,
      comment: "Very helpful"
    },
    {
      user: "Yuki",
      rating: 4,
      comment: "Good beginner guide"
    }
  ]
}

Use embedding when:

Data is frequently read together
Child data belongs strongly to parent data
Child data is limited in size
You want fast reads

Good examples:

Blog post with a bounded number of comments
Order with order items
User profile with address
Product with attributes

9.2 Referencing

Referencing means storing related data in separate collections and linking by _id.

Users:

{
  _id: ObjectId("64..."),
  name: "Alice"
}

Orders:

{
  _id: ObjectId("65..."),
  userId: ObjectId("64..."),
  total: 150
}

Use referencing when:

Related data is large
Related data changes frequently
Many documents share the same related data
You need many-to-many relationships

Good examples:

Users and roles
Products and categories
Students and courses
Authors and books

9.3 Rule of Thumb

Use embedding when data is “owned by” the parent and read together.

Use referencing when data is independent, large, shared, or frequently updated.

Tiny hot take: MongoDB schema design is not “no schema.” It is “put the schema where it actually helps.” Sometimes that is the application. Sometimes that is validation. Sometimes it is both.

10. Schema Validation

MongoDB allows flexible documents by default, but you can enforce rules using schema validation. Schema validation can check required fields, data types, value ranges, and document shape. (MongoDB)

Create a collection with validation:

db.createCollection("students", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "email", "age"],
      properties: {
        name: {
          bsonType: "string",
          description: "Name must be a string"
        },
        email: {
          bsonType: "string",
          pattern: "^.+@.+\\..+$",
          description: "Email must be valid"
        },
        age: {
          bsonType: "int",
          minimum: 18,
          maximum: 100,
          description: "Age must be between 18 and 100"
        }
      }
    }
  }
})

Valid insert:

db.students.insertOne({
  name: "Aiko",
  email: "aiko@example.com",
  age: 22
})

Invalid insert:

db.students.insertOne({
  name: "Aiko",
  age: 15
})

The invalid insert fails because email is missing and age is below the allowed minimum.
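
If you also want to catch these problems in the application before the insert reaches the server, the same rules can be mirrored in plain JavaScript. This is a hand-written sketch, not MongoDB's validator: `validateStudent` is a hypothetical helper, and it only approximates the "int" check with Number.isInteger.

```javascript
// Application-side mirror of the $jsonSchema rules above (approximation, not MongoDB's validator).
function validateStudent(doc) {
  const errors = [];

  // required: ["name", "email", "age"]
  for (const field of ["name", "email", "age"]) {
    if (!(field in doc)) errors.push(`${field} is required`);
  }

  // name: bsonType "string"
  if ("name" in doc && typeof doc.name !== "string") {
    errors.push("name must be a string");
  }

  // email: bsonType "string" with the same pattern as the validator
  if ("email" in doc) {
    if (typeof doc.email !== "string" || !/^.+@.+\..+$/.test(doc.email)) {
      errors.push("email must be valid");
    }
  }

  // age: integer between 18 and 100
  if ("age" in doc) {
    if (!Number.isInteger(doc.age) || doc.age < 18 || doc.age > 100) {
      errors.push("age must be an integer between 18 and 100");
    }
  }

  return errors;
}

const validErrors = validateStudent({ name: "Aiko", email: "aiko@example.com", age: 22 });
const invalidErrors = validateStudent({ name: "Aiko", age: 15 });
console.log(validErrors.length);   // 0
console.log(invalidErrors.length); // 2 (missing email, age out of range)
```

Application checks like this give friendlier error messages, while the collection validator remains the safety net for every client that writes to the database.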

11. Indexes in MongoDB

Indexes make queries faster. Without an index, MongoDB may scan every document in a collection. With a suitable index, MongoDB can scan fewer documents. Indexes improve query performance but add write overhead because inserts and updates must also update indexes. (MongoDB)

11.1 Create a Single-Field Index

db.books.createIndex({ title: 1 })

1 means ascending order. -1 means descending order.

11.2 Create a Unique Index

db.users.createIndex({ email: 1 }, { unique: true })

Now duplicate emails are rejected.

11.3 Create a Compound Index

db.books.createIndex({ category: 1, price: -1 })

Useful query:

db.books.find({ category: "Database" }).sort({ price: -1 })

11.4 List Indexes

db.books.getIndexes()

11.5 Drop Index

db.books.dropIndex({ title: 1 })

11.6 Use `explain()`

db.books.find({ title: "MongoDB Basics" }).explain("executionStats")

Look for:

COLLSCAN: collection scan, usually bad for large collections
IXSCAN: index scan, usually better
totalDocsExamined
totalKeysExamined
executionTimeMillis
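
Reading explain output by eye gets tedious, so a small helper can pull out the interesting parts. The sketch below works on a hand-written, trimmed sample object; real explain output contains many more fields, and the plan shape can differ between MongoDB versions and query engines, so treat `summarizeExplain` and the sample as assumptions.

```javascript
// Hand-written, trimmed sample of explain("executionStats") output (illustrative only).
const sampleExplain = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "title_1" },
    },
  },
  executionStats: {
    nReturned: 1,
    totalKeysExamined: 1,
    totalDocsExamined: 1,
    executionTimeMillis: 0,
  },
};

// Walk the chain of inputStage links and collect the stage names.
function collectStages(plan) {
  const stages = [];
  let node = plan;
  while (node) {
    stages.push(node.stage);
    node = node.inputStage;
  }
  return stages;
}

// Hypothetical helper: condense the plan into a few health indicators.
function summarizeExplain(explainOutput) {
  const stages = collectStages(explainOutput.queryPlanner.winningPlan);
  const stats = explainOutput.executionStats;
  return {
    stages,
    usesCollectionScan: stages.includes("COLLSCAN"),
    docsExaminedPerReturned:
      stats.nReturned === 0 ? null : stats.totalDocsExamined / stats.nReturned,
  };
}

const summary = summarizeExplain(sampleExplain);
console.log(summary.stages, summary.usesCollectionScan); // [ 'FETCH', 'IXSCAN' ] false
```

A ratio of documents examined to documents returned that is close to 1 usually means the index fits the query well. A much higher ratio suggests the query is reading many documents it then throws away.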

11.7 Index Best Practices

Create indexes for:

Frequent filters
Frequent sorts
Unique fields
Foreign-key-like reference fields
Common search patterns

Avoid:

Too many indexes
Indexing fields that are rarely queried or have very few distinct values
Creating indexes without checking query patterns
Ignoring write overhead

12. Aggregation Pipeline

Aggregation is MongoDB’s data processing framework. It works like a pipeline: documents pass through stages, and each stage transforms, filters, groups, sorts, joins, or calculates data. MongoDB documentation notes that aggregation stages pass output documents to the next stage, and aggregation can return grouped results. (MongoDB)

Basic structure:

db.collection.aggregate([
  { stage1 },
  { stage2 },
  { stage3 }
])

12.1 Sample Data

db.orders.insertMany([
  {
    customer: "Alice",
    status: "paid",
    total: 120,
    items: [
      { product: "Keyboard", qty: 1, price: 50 },
      { product: "Mouse", qty: 2, price: 35 }
    ],
    createdAt: new Date("2026-01-10")
  },
  {
    customer: "Bob",
    status: "pending",
    total: 80,
    items: [
      { product: "Mouse", qty: 2, price: 40 }
    ],
    createdAt: new Date("2026-01-11")
  },
  {
    customer: "Alice",
    status: "paid",
    total: 200,
    items: [
      { product: "Monitor", qty: 1, price: 200 }
    ],
    createdAt: new Date("2026-01-12")
  }
])

12.2 `$match`

Filters documents.

db.orders.aggregate([
  { $match: { status: "paid" } }
])

12.3 `$group`

Groups documents and calculates values.

db.orders.aggregate([
  { $match: { status: "paid" } },
  {
    $group: {
      _id: "$customer",
      totalSpent: { $sum: "$total" },
      orderCount: { $sum: 1 }
    }
  }
])
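
If the pipeline above feels abstract, it may help to see roughly the same logic in plain JavaScript. This is only an in-memory analogy run against a trimmed copy of the sample data; `totalsByCustomer` is a hypothetical helper, and the server does not execute pipelines this way.

```javascript
// Trimmed copy of the sample orders from 12.1 (items and dates omitted).
const orders = [
  { customer: "Alice", status: "paid", total: 120 },
  { customer: "Bob", status: "pending", total: 80 },
  { customer: "Alice", status: "paid", total: 200 },
];

function totalsByCustomer(docs) {
  const groups = new Map();
  for (const order of docs) {
    if (order.status !== "paid") continue; // $match: { status: "paid" }
    const group =
      groups.get(order.customer) ??
      { _id: order.customer, totalSpent: 0, orderCount: 0 };
    group.totalSpent += order.total; // totalSpent: { $sum: "$total" }
    group.orderCount += 1;           // orderCount: { $sum: 1 }
    groups.set(order.customer, group);
  }
  return [...groups.values()];
}

const groupedTotals = totalsByCustomer(orders);
console.log(groupedTotals); // [ { _id: 'Alice', totalSpent: 320, orderCount: 2 } ]
```

Bob's pending order is filtered out before grouping, which is why only Alice appears in the result.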

12.4 `$project`

Controls output fields.

db.orders.aggregate([
  {
    $project: {
      customer: 1,
      total: 1,
      tax: { $multiply: ["$total", 0.1] },
      grandTotal: { $multiply: ["$total", 1.1] }
    }
  }
])

12.5 `$sort`

db.orders.aggregate([
  { $sort: { total: -1 } }
])

12.6 `$unwind`

Breaks array items into separate documents.

db.orders.aggregate([
  { $unwind: "$items" },
  {
    $group: {
      _id: "$items.product",
      totalQuantity: { $sum: "$items.qty" },
      totalRevenue: {
        $sum: { $multiply: ["$items.qty", "$items.price"] }
      }
    }
  }
])
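
As a rough mental model, $unwind behaves like a flatMap over the array, with the grouping applied afterwards. The plain JavaScript below reproduces that on a trimmed copy of the sample orders; it is an analogy for understanding the stages, not how the server runs them.

```javascript
// Trimmed copy of the sample orders from 12.1 (only customer and items kept).
const orders = [
  {
    customer: "Alice",
    items: [
      { product: "Keyboard", qty: 1, price: 50 },
      { product: "Mouse", qty: 2, price: 35 },
    ],
  },
  { customer: "Bob", items: [{ product: "Mouse", qty: 2, price: 40 }] },
  { customer: "Alice", items: [{ product: "Monitor", qty: 1, price: 200 }] },
];

// $unwind: "$items" -> one document per array element, with items replaced by that element.
const unwound = orders.flatMap((order) =>
  order.items.map((item) => ({ ...order, items: item }))
);

// $group by "$items.product", summing quantity and qty * price.
const byProduct = new Map();
for (const doc of unwound) {
  const { product, qty, price } = doc.items;
  const group =
    byProduct.get(product) ?? { _id: product, totalQuantity: 0, totalRevenue: 0 };
  group.totalQuantity += qty;
  group.totalRevenue += qty * price;
  byProduct.set(product, group);
}

const productTotals = [...byProduct.values()];
console.log(unwound.length); // 4
console.log(productTotals);
```

Three orders become four unwound documents, and the two Mouse lines from different orders end up in the same group.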

12.7 `$lookup`

Performs a left outer join between collections.

Products:

db.products.insertMany([
  { _id: 1, name: "Keyboard", category: "Accessories" },
  { _id: 2, name: "Mouse", category: "Accessories" }
])

Order items:

db.orderItems.insertMany([
  { orderId: 101, productId: 1, qty: 1 },
  { orderId: 102, productId: 2, qty: 2 }
])

Join:

db.orderItems.aggregate([
  {
    $lookup: {
      from: "products",
      localField: "productId",
      foreignField: "_id",
      as: "productDetails"
    }
  }
])
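
The "left outer" part is easy to miss, so here is the same join sketched in plain JavaScript. The `lookup` function is a hypothetical stand-in for the stage, and the third order item (productId 3) is extra data added only to show what happens when nothing matches.

```javascript
const products = [
  { _id: 1, name: "Keyboard", category: "Accessories" },
  { _id: 2, name: "Mouse", category: "Accessories" },
];

const orderItems = [
  { orderId: 101, productId: 1, qty: 1 },
  { orderId: 102, productId: 2, qty: 2 },
  { orderId: 103, productId: 3, qty: 1 }, // hypothetical item with no matching product
];

// Rough equivalent of $lookup: attach all matching foreign documents as an array.
function lookup(localDocs, foreignDocs, localField, foreignField, as) {
  return localDocs.map((doc) => ({
    ...doc,
    [as]: foreignDocs.filter((f) => f[foreignField] === doc[localField]),
  }));
}

const joined = lookup(orderItems, products, "productId", "_id", "productDetails");
console.log(joined[0].productDetails[0].name); // Keyboard
console.log(joined[2].productDetails);         // []
```

Every order item survives the join. Items without a matching product simply get an empty productDetails array, which is the left outer join behavior.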

13. Transactions

MongoDB supports multi-document transactions for use cases where multiple changes must succeed or fail together. The official docs describe distributed transactions as atomic: changes are applied together or rolled back. (MongoDB)

Example use case:

Deduct money from one account
Add money to another account
Record transfer history

If one step fails, all changes should be rolled back.

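Example structure in JavaScript-style pseudocode, assuming the Node.js driver and existing client, accounts, and transfers handles. Multi-document transactions require a replica set or sharded cluster; they are not available on a standalone mongod: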

const session = client.startSession();

try {
  session.startTransaction();

  await accounts.updateOne(
    { accountNo: "A100" },
    { $inc: { balance: -100 } },
    { session }
  );

  await accounts.updateOne(
    { accountNo: "B200" },
    { $inc: { balance: 100 } },
    { session }
  );

  await transfers.insertOne(
    {
      from: "A100",
      to: "B200",
      amount: 100,
      createdAt: new Date()
    },
    { session }
  );

  await session.commitTransaction();
} catch (error) {
  // Roll back every change made in the transaction, then surface the error.
  await session.abortTransaction();
  throw error;
} finally {
  await session.endSession();
}

Use transactions when you truly need them. Do not use transactions to hide poor schema design. In MongoDB, good document design often reduces the need for multi-document transactions.
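
To see what "all or nothing" means without a running cluster, here is an in-memory sketch in plain JavaScript. It does not use the MongoDB API at all: the `transfer` function and the Map of accounts are assumptions that only imitate the commit-or-roll-back behavior.

```javascript
// In-memory imitation of an atomic transfer (not the MongoDB transaction API).
function transfer(accounts, fromNo, toNo, amount) {
  // Work on a copy so a failure leaves the original data untouched ("rollback").
  const draft = new Map([...accounts].map(([no, acct]) => [no, { ...acct }]));

  const from = draft.get(fromNo);
  const to = draft.get(toNo);
  if (!from || !to) throw new Error("unknown account");

  from.balance -= amount;
  if (from.balance < 0) throw new Error("insufficient funds");
  to.balance += amount;

  return draft; // "commit": the caller swaps in the new state
}

let accounts = new Map([
  ["A100", { balance: 150 }],
  ["B200", { balance: 0 }],
]);

// First transfer succeeds and is committed.
accounts = transfer(accounts, "A100", "B200", 100);

// Second transfer fails halfway through, so nothing changes.
let secondTransferFailed = false;
try {
  accounts = transfer(accounts, "A100", "B200", 100);
} catch (error) {
  secondTransferFailed = true;
}

console.log(accounts.get("A100").balance, accounts.get("B200").balance); // 50 100
```

The failed transfer had already debited the draft copy when it threw, yet the visible balances stay at 50 and 100. A real transaction gives you the same guarantee across documents and collections on the server.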

14. Security: Authentication and Authorization

MongoDB security has two major parts:

Concept	Meaning
Authentication	Proves who the user is
Authorization	Controls what the user can do

MongoDB documentation explains that authentication verifies identity, while authorization determines access to resources and operations. (MongoDB)

MongoDB uses Role-Based Access Control. A user is granted one or more roles, and outside those roles, the user has no access. Access control is not enabled by default in self-managed deployments. (MongoDB)

14.1 Create Admin User

Connect locally before enabling auth:

mongosh

Switch to admin database:

use admin

Create admin user:

db.createUser({
  user: "adminUser",
  pwd: passwordPrompt(),
  roles: [
    { role: "userAdminAnyDatabase", db: "admin" },
    { role: "readWriteAnyDatabase", db: "admin" },
    { role: "dbAdminAnyDatabase", db: "admin" }
  ]
})

14.2 Enable Authorization

Edit MongoDB config file:

sudo nano /etc/mongod.conf

Add:

security:
  authorization: enabled

Restart MongoDB:

sudo systemctl restart mongod

Connect with authentication:

mongosh -u adminUser -p --authenticationDatabase admin

14.3 Create Application User

use bookstore

db.createUser({
  user: "bookAppUser",
  pwd: passwordPrompt(),
  roles: [
    { role: "readWrite", db: "bookstore" }
  ]
})

Connect as app user:

mongosh -u bookAppUser -p --authenticationDatabase bookstore

14.4 Common Built-In Roles

Role	Purpose
`read`	Read-only access
`readWrite`	Read and write access
`dbAdmin`	Database administration
`userAdmin`	User management for one database
`userAdminAnyDatabase`	User management across databases
`readWriteAnyDatabase`	Read/write across databases
`clusterAdmin`	Cluster administration
`backup`	Backup privileges
`restore`	Restore privileges
`root`	Superuser role

15. Replication: High Availability

A replica set is the standard production pattern for MongoDB. It contains multiple mongod instances with the same data. MongoDB’s documentation states that replica sets provide redundancy and high availability. (MongoDB)

15.1 Replica Set Components

Component	Meaning
Primary	Accepts writes
Secondary	Replicates from primary
Arbiter	Votes in elections but does not store data
Oplog	Operation log used for replication
Election	Process of choosing a new primary
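
Elections need a majority of voting members, which is why replica sets usually have an odd number of them. The arithmetic is small enough to sketch; the function names below are just illustrative.

```javascript
// Votes needed to elect a primary: more than half of the voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many voting members can be lost while a primary can still be elected.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [3, 4, 5]) {
  console.log(n, majority(n), faultTolerance(n));
}
// 3 2 1
// 4 3 1
// 5 3 2
```

Going from three to four members adds cost but no extra fault tolerance, while five members can survive two failures.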

15.2 Local Replica Set with Docker Compose

Create docker-compose.yml:

services:
  mongo1:
    image: mongo:8.0
    container_name: mongo1
    command: ["mongod", "--replSet", "rs0", "--bind_ip_all"]
    ports:
      - "27017:27017"
    volumes:
      - mongo1_data:/data/db

  mongo2:
    image: mongo:8.0
    container_name: mongo2
    command: ["mongod", "--replSet", "rs0", "--bind_ip_all"]
    ports:
      - "27018:27017"
    volumes:
      - mongo2_data:/data/db

  mongo3:
    image: mongo:8.0
    container_name: mongo3
    command: ["mongod", "--replSet", "rs0", "--bind_ip_all"]
    ports:
      - "27019:27017"
    volumes:
      - mongo3_data:/data/db

volumes:
  mongo1_data:
  mongo2_data:
  mongo3_data:

Start:

docker compose up -d

Connect to first node:

docker exec -it mongo1 mongosh

Initialize replica set:

rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "mongo1:27017" },
    { _id: 1, host: "mongo2:27017" },
    { _id: 2, host: "mongo3:27017" }
  ]
})

Check status:

rs.status()

Check primary (db.hello() replaces the deprecated rs.isMaster()):

db.hello()

Insert data:

use replab
db.test.insertOne({ message: "replication works", createdAt: new Date() })

16. Sharding: Horizontal Scaling

Sharding is MongoDB’s horizontal scaling system. It distributes collection data across multiple shards. MongoDB sharded clusters include shards, mongos routers, and config servers. (MongoDB)

16.1 When to Use Sharding

Use sharding when:

Data is too large for one server
Write throughput is too high for one server
Working set does not fit on one machine
You need regional data distribution
You need very large-scale growth

Do not start with sharding unless you need it. Most applications should start with a replica set.

16.2 Shard Key

A shard key determines how MongoDB distributes data.

Good shard key characteristics:

High cardinality
Even distribution
Supports common queries
Avoids hot spots
Does not constantly increase in one direction unless handled carefully

Poor shard key examples:

{ status: 1 }

Bad because status may only have a few values.

Potentially risky:

{ createdAt: 1 }

Can create hot spots if inserts always go to the newest range.

Better example:

{ customerId: 1, orderId: 1 }
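
Cardinality is easier to appreciate with numbers. The sketch below spreads generated orders across four imaginary shards using a toy hash. MongoDB's real placement is based on chunks of shard key ranges (or hashed shard keys), so `simpleHash`, `distribution`, and the generated data are all assumptions for illustration.

```javascript
// Toy placement model: hash the key and take it modulo the shard count.
// This is NOT MongoDB's chunk-based placement; it only illustrates cardinality.
function simpleHash(text) {
  let h = 0;
  for (const ch of String(text)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep the value in unsigned 32-bit range
  }
  return h;
}

function distribution(docs, keyOf, shardCount) {
  const counts = new Array(shardCount).fill(0);
  for (const doc of docs) {
    counts[simpleHash(keyOf(doc)) % shardCount] += 1;
  }
  return counts;
}

// Generated sample: 1000 orders, 1000 distinct customers, only 3 distinct statuses.
const statuses = ["paid", "pending", "shipped"];
const sampleOrders = Array.from({ length: 1000 }, (_, i) => ({
  customerId: i + 1,
  status: statuses[i % statuses.length],
}));

const byStatus = distribution(sampleOrders, (o) => o.status, 4);
const byCustomer = distribution(sampleOrders, (o) => o.customerId, 4);
console.log("status key:", byStatus);
console.log("customerId key:", byCustomer);
```

With only three distinct status values, at most three of the four shards can ever receive data, no matter how many orders you insert. The customerId key has enough distinct values to reach every shard.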

16.3 Sharding Commands: Conceptual Example

Enable sharding on database (optional starting in MongoDB 6.0, where sharding a collection no longer requires this step):

sh.enableSharding("shop")

Create index on shard key:

db.orders.createIndex({ customerId: 1, orderId: 1 })

Shard collection:

sh.shardCollection("shop.orders", { customerId: 1, orderId: 1 })

17. Backup and Restore

Backups are non-negotiable. A database without tested backups is basically a suspense movie with invoices.

MongoDB’s backup and restore tools include mongodump and mongorestore. These create and restore BSON data dumps. Official documentation says they are useful for small deployments, but they can affect performance because they read data through a running MongoDB instance. MongoDB recommends verifying backups by restoring them to a test deployment. (MongoDB)

17.1 Backup a Database

mongodump --db bookstore --out /backup/mongodb

Backup with authentication (when --password is omitted, mongodump prompts for it):

mongodump \
  --username adminUser \
  --authenticationDatabase admin \
  --db bookstore \
  --out /backup/mongodb

17.2 Restore a Database

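The --db option is deprecated when restoring from a dump directory, so select the database with --nsInclude instead:

mongorestore --nsInclude "bookstore.*" /backup/mongodb

Restore with authentication (when --password is omitted, mongorestore prompts for it):

mongorestore \
  --username adminUser \
  --authenticationDatabase admin \
  --nsInclude "bookstore.*" \
  /backup/mongodb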

17.3 Backup Best Practices

Automate backups
Store backups off-server
Encrypt backups
Test restore regularly
Label backups with date and database name
Monitor backup success and failure
For production, prefer snapshot-based or managed cloud backup where appropriate

18. MongoDB Administration Workflow

A MongoDB administrator is responsible for keeping the database secure, healthy, backed up, observable, and performant.

18.1 Daily Administration Checklist

Run these checks daily. Start with overall server status:

db.adminCommand({ serverStatus: 1 })

Check database sizes:

db.stats()

Check collection stats:

db.books.stats()

Check current operations:

db.currentOp()

Check replica set status:

rs.status()

Check replication lag:

rs.printSecondaryReplicationInfo()

Review logs:

sudo journalctl -u mongod

or:

sudo tail -f /var/log/mongodb/mongod.log

18.2 Weekly Administration Checklist

Review slow queries
Review index usage
Check disk growth
Check backup restore process
Review users and roles
Check expired or unused accounts
Confirm monitoring alerts
Check OS patching plan
Check MongoDB patch version
Review schema growth and document sizes

18.3 Monthly Administration Checklist

Test disaster recovery
Review capacity planning
Audit access control
Review TLS certificates
Review shard balance if sharded
Review replica set elections
Confirm backup retention
Check deprecated features before upgrades

18.4 Common Admin Commands

Show databases:

show dbs

Show users:

show users

Show roles:

show roles

Check server status:

db.serverStatus()

Check build info:

db.version()
db.adminCommand({ buildInfo: 1 })

Check database stats:

db.stats()

Check collection stats:

db.collection.stats()

Kill operation:

db.killOp(opid)

19. MongoDB User Workflow

The day-to-day MongoDB workflow depends on your role. Here are the main workflows.

19.1 Developer Workflow

Understand application data.
Design document model.
Choose embedding or referencing.
Create collections.
Insert sample data.
Write CRUD queries.
Add indexes.
Test with realistic data volume.
Use explain() to inspect performance.
Add schema validation where needed.
Connect application using driver.
Deploy with authentication and TLS.
Monitor slow queries.

19.2 Application Workflow

A typical application flow:

User request
  ↓
Application route/controller
  ↓
Validate request
  ↓
MongoDB driver query
  ↓
MongoDB server
  ↓
Return document/result
  ↓
Application response

Example: create a user.

app.post("/users", async (req, res) => {
  const result = await db.collection("users").insertOne({
    name: req.body.name,
    email: req.body.email,
    createdAt: new Date()
  });

  res.json({ id: result.insertedId });
});
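
The flow above includes a "Validate request" step that the handler skips. A minimal sketch of that step could look like this; `validateNewUser` is a hypothetical helper and the email pattern is deliberately loose.

```javascript
// Hypothetical request validation for the POST /users handler.
function validateNewUser(body) {
  const errors = [];

  if (typeof body?.name !== "string" || body.name.trim() === "") {
    errors.push("name is required");
  }

  // Loose shape check only: something@something.something, no spaces.
  if (
    typeof body?.email !== "string" ||
    !/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(body.email)
  ) {
    errors.push("email must look like an email address");
  }

  return errors;
}

const okErrors = validateNewUser({ name: "Alice", email: "alice@example.com" });
const badErrors = validateNewUser({ name: "  ", email: "not-an-email" });
console.log(okErrors.length, badErrors.length); // 0 2
```

Inside the route, you would call validateNewUser(req.body) first and, if it returns any errors, respond with res.status(400).json({ errors }) instead of calling insertOne.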

19.3 Analyst Workflow

Connect as a read-only user.
Explore collections.
Use aggregation pipeline.
Export selected results.
Build reports.
Avoid running heavy, unindexed queries against production.
Coordinate large analytics jobs with admins.

19.4 Admin Workflow

Provision deployment.
Configure storage and networking.
Enable authentication.
Create users and roles.
Configure backups.
Configure monitoring.
Review indexes.
Manage replica sets.
Plan upgrades.
Respond to incidents.

20. Performance Optimization

MongoDB performance depends on data model, indexes, working set, hardware, queries, and deployment architecture.

20.1 Use Proper Indexes

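Slow without a supporting index:

db.orders.find({ customerId: 12345 }).sort({ createdAt: -1 })

Without an index on these fields, MongoDB may scan the whole collection and sort the results in memory.

Better: create a compound index that matches both the filter and the sort, so the same query can read documents in index order:

db.orders.createIndex({ customerId: 1, createdAt: -1 })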

20.2 Avoid Returning Too Much Data

Bad:

db.users.find()

Better:

db.users.find(
  { status: "active" },
  { name: 1, email: 1, _id: 0 }
).limit(100)

20.3 Avoid Unbounded Arrays

Bad design:

{
  userId: 1,
  events: [
    // thousands or millions of events
  ]
}

Better:

{
  userId: 1,
  eventType: "login",
  createdAt: ISODate("2026-01-01T10:00:00Z")
}

Store large event streams as separate documents.

20.4 Use `explain()`

db.orders.find({ customerId: 1001 }).explain("executionStats")

Check:

Was an index used?
How many documents were scanned?
How long did it take?
Is there a collection scan?

20.5 Design for Query Patterns

Do not design MongoDB documents only by “normalization rules.” Design for the way your application reads and writes data.

Ask:

What are the top 10 queries?
What data is read together?
What data changes frequently?
What data grows forever?
What fields need indexes?
What fields need uniqueness?
What operations must be atomic?

21. Advanced MongoDB Features

21.1 TTL Indexes

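TTL indexes automatically delete documents a set number of seconds after the date stored in the indexed field. The indexed field must hold a date (or an array of dates). Deletion runs as a background task roughly once a minute, so a document may outlive its expiry time briefly.

Example: delete logs after 30 days (30 × 24 × 60 × 60 = 2,592,000 seconds).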

db.logs.createIndex(
  { createdAt: 1 },
  { expireAfterSeconds: 2592000 }
)
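Hard-coded second counts are easy to get wrong, so it can help to derive them. The following sketch computes the expireAfterSeconds value and the earliest time a given document becomes eligible for deletion. Both helper names are hypothetical.

```javascript
// Converts a retention period in days to seconds.
function daysToSeconds(days) {
  return days * 24 * 60 * 60;
}

// Returns the earliest time a document becomes eligible for TTL deletion.
function expiresAt(createdAt, expireAfterSeconds) {
  return new Date(createdAt.getTime() + expireAfterSeconds * 1000);
}

const ttlOptions = { expireAfterSeconds: daysToSeconds(30) };
console.log(ttlOptions.expireAfterSeconds); // 2592000

const created = new Date("2026-01-01T00:00:00Z");
console.log(expiresAt(created, ttlOptions.expireAfterSeconds).toISOString());
// In mongosh: db.logs.createIndex({ createdAt: 1 }, ttlOptions)
```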

21.2 Text Indexes

db.articles.createIndex({ title: "text", body: "text" })

Search (space-separated terms are combined with OR, so this matches documents that contain either word):

db.articles.find({ $text: { $search: "mongodb tutorial" } })

21.3 Geospatial Indexes

db.places.createIndex({ location: "2dsphere" })

Example document:

{
  name: "Tokyo Station",
  location: {
    type: "Point",
    coordinates: [139.7671, 35.6812]
  }
}

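Find places within 1,000 meters of a point. GeoJSON coordinates are written as [longitude, latitude], and $maxDistance is in meters when used with a 2dsphere index: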

db.places.find({
  location: {
    $near: {
      $geometry: {
        type: "Point",
        coordinates: [139.7671, 35.6812]
      },
      $maxDistance: 1000
    }
  }
})
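To build intuition for what a 1,000 meter radius means, the sketch below estimates the great-circle distance between two [longitude, latitude] points with the haversine formula. It is a client-side approximation for sanity checks, not the server's own calculation, and the second point is a made-up location a short distance north of Tokyo Station.

```javascript
const EARTH_RADIUS_METERS = 6371008.8; // mean Earth radius, an approximation

function toRadians(degrees) {
  return (degrees * Math.PI) / 180;
}

// Both points are [longitude, latitude], matching GeoJSON order.
function haversineMeters([lon1, lat1], [lon2, lat2]) {
  const dLat = toRadians(lat2 - lat1);
  const dLon = toRadians(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) *
      Math.cos(toRadians(lat2)) *
      Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(a));
}

const tokyoStation = [139.7671, 35.6812];
const nearbyPoint = [139.7671, 35.6862]; // hypothetical point 0.005 degrees north

const distance = haversineMeters(tokyoStation, nearbyPoint);
console.log(Math.round(distance), distance <= 1000);
```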

21.4 Change Streams

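Change streams allow applications to listen for database changes as they happen. They require a replica set or a sharded cluster; a standalone mongod does not support them.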

Example:

const changeStream = db.collection("orders").watch();

changeStream.on("change", change => {
  console.log(change);
});

Use cases:

Real-time notifications
Cache invalidation
Event-driven systems
Audit pipelines
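As an illustration of the cache invalidation use case, the sketch below applies change events to an in-memory Map. It relies on the operationType, documentKey, and fullDocument fields of change events; the events themselves are hand-written and abridged, and the Map stands in for a real cache.

```javascript
// In-memory cache keyed by order id; stands in for a real cache.
const cache = new Map();

// Updates the cache based on one change event.
function applyChange(change) {
  const key = String(change.documentKey._id);
  switch (change.operationType) {
    case "insert":
    case "replace":
      // These events carry the full new document.
      cache.set(key, change.fullDocument);
      break;
    case "update":
    case "delete":
      // Drop the entry so the next read reloads it from the database.
      cache.delete(key);
      break;
    default:
      break;
  }
}

// Hand-written sample events, abridged.
applyChange({
  operationType: "insert",
  documentKey: { _id: 1 },
  fullDocument: { _id: 1, status: "paid" }
});
applyChange({
  operationType: "insert",
  documentKey: { _id: 2 },
  fullDocument: { _id: 2, status: "new" }
});
applyChange({
  operationType: "update",
  documentKey: { _id: 2 },
  updateDescription: { updatedFields: { status: "paid" } }
});

console.log([...cache.keys()]);
```

With the Node.js driver, a handler like this could be attached with changeStream.on("change", applyChange).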

21.5 Capped Collections

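Capped collections have a fixed maximum size, given in bytes, and preserve insertion order. Once the collection is full, the oldest documents are overwritten by new ones.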

db.createCollection("logs", {
  capped: true,
  size: 100000
})

Use for:

Logs
Event buffers
Temporary streams

22. Complete Mini Project: Bookstore Database

Now let’s build a small bookstore database.

22.1 Create Database

use bookstore

22.2 Create Collections

db.createCollection("users")
db.createCollection("books")
db.createCollection("orders")

22.3 Insert Users

db.users.insertMany([
  {
    name: "Alice",
    email: "alice@example.com",
    role: "customer",
    createdAt: new Date()
  },
  {
    name: "Bob",
    email: "bob@example.com",
    role: "customer",
    createdAt: new Date()
  }
])

22.4 Insert Books

db.books.insertMany([
  {
    title: "MongoDB Basics",
    author: "Alice Tanaka",
    category: "Database",
    price: 29.99,
    stock: 50,
    tags: ["mongodb", "nosql"]
  },
  {
    title: "Docker Practical Guide",
    author: "Maria Silva",
    category: "DevOps",
    price: 39.99,
    stock: 20,
    tags: ["docker", "containers"]
  },
  {
    title: "Linux for Beginners",
    author: "Ken Sato",
    category: "Linux",
    price: 24.99,
    stock: 100,
    tags: ["linux", "server"]
  }
])

22.5 Create Indexes

db.users.createIndex({ email: 1 }, { unique: true })
db.books.createIndex({ title: 1 })
db.books.createIndex({ category: 1, price: -1 })
db.orders.createIndex({ userId: 1, createdAt: -1 })

22.6 Create an Order

Find user:

const user = db.users.findOne({ email: "alice@example.com" })

Find books:

const book1 = db.books.findOne({ title: "MongoDB Basics" })
const book2 = db.books.findOne({ title: "Docker Practical Guide" })

Insert order:

db.orders.insertOne({
  userId: user._id,
  customerEmail: user.email,
  items: [
    {
      bookId: book1._id,
      title: book1.title,
      qty: 1,
      price: book1.price
    },
    {
      bookId: book2._id,
      title: book2.title,
      qty: 1,
      price: book2.price
    }
  ],
  total: book1.price + book2.price,
  status: "paid",
  createdAt: new Date()
})
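The order above adds two prices by hand, which only works when every quantity is 1. A more general approach is to compute the total from the items array. The sketch below also works in integer cents to avoid floating-point rounding surprises. That is a design assumption of this example, not something the tutorial's schema requires, and the helper names are hypothetical.

```javascript
// Converts a decimal price to integer cents.
function toCents(price) {
  return Math.round(price * 100);
}

// Sums qty * price over all items, in cents.
function orderTotalCents(items) {
  return items.reduce(
    (sum, item) => sum + item.qty * toCents(item.price),
    0
  );
}

const items = [
  { title: "MongoDB Basics", qty: 1, price: 29.99 },
  { title: "Docker Practical Guide", qty: 2, price: 39.99 }
];

const totalCents = orderTotalCents(items);
console.log(totalCents);
console.log((totalCents / 100).toFixed(2));
```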

22.7 Reduce Stock

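db.books.updateOne(
  { _id: book1._id, stock: { $gte: 1 } },
  { $inc: { stock: -1 } }
)

db.books.updateOne(
  { _id: book2._id, stock: { $gte: 1 } },
  { $inc: { stock: -1 } }
)

The stock: { $gte: 1 } condition keeps stock from going negative. If modifiedCount is 0 in the result, the book was out of stock and the order should not proceed.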
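The two updates in this section can also be generated from the order's items array and sent in one round trip with bulkWrite. The sketch below only builds the operations list. buildStockUpdates is a hypothetical helper, and the ids are placeholders where real documents would carry ObjectIds.

```javascript
// Builds one updateOne operation per order item. Each filter requires
// enough stock, so the decrement cannot take stock below zero.
function buildStockUpdates(items) {
  return items.map(item => ({
    updateOne: {
      filter: { _id: item.bookId, stock: { $gte: item.qty } },
      update: { $inc: { stock: -item.qty } }
    }
  }));
}

const ops = buildStockUpdates([
  { bookId: "book-1", qty: 1 }, // placeholder ids for illustration
  { bookId: "book-2", qty: 2 }
]);

console.log(JSON.stringify(ops[1]));
// In mongosh: db.books.bulkWrite(ops)
```

The operations in a bulkWrite are not atomic as a group. If the order insert and all stock changes must succeed or fail together, a transaction is the safer choice.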

22.8 Sales Report

db.orders.aggregate([
  { $match: { status: "paid" } },
  { $unwind: "$items" },
  {
    $group: {
      _id: "$items.title",
      totalSold: { $sum: "$items.qty" },
      revenue: {
        $sum: { $multiply: ["$items.qty", "$items.price"] }
      }
    }
  },
  { $sort: { revenue: -1 } }
])
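To make each stage of this pipeline concrete, here is a rough plain JavaScript equivalent run against a few made-up orders. It is meant only to show what $match, $unwind, $group, and $sort each contribute; the real pipeline runs inside the server.

```javascript
// Plain JavaScript version of the sales report pipeline.
function salesReport(orders) {
  const byTitle = new Map();
  for (const order of orders) {
    if (order.status !== "paid") continue; // $match
    for (const item of order.items) { // $unwind
      const row = byTitle.get(item.title) ||
        { _id: item.title, totalSold: 0, revenue: 0 };
      row.totalSold += item.qty; // $group with $sum
      row.revenue += item.qty * item.price; // $sum of $multiply
      byTitle.set(item.title, row);
    }
  }
  // $sort by revenue, highest first
  return [...byTitle.values()].sort((a, b) => b.revenue - a.revenue);
}

// Made-up sample data with round prices.
const sampleOrders = [
  {
    status: "paid",
    items: [
      { title: "A", qty: 2, price: 10 },
      { title: "B", qty: 1, price: 40 }
    ]
  },
  { status: "paid", items: [{ title: "A", qty: 1, price: 10 }] },
  { status: "cancelled", items: [{ title: "B", qty: 5, price: 40 }] }
];

const report = salesReport(sampleOrders);
console.log(report);
```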

23. MongoDB Best Practices

23.1 Data Modeling Best Practices

Model data around application queries.
Embed data that is read together.
Reference data that is large, shared, or independent.
Avoid unbounded arrays.
Keep document size reasonable.
Use schema validation for critical collections.
Store duplicate read-optimized fields when useful, but keep them consistent.

23.2 Query Best Practices

Use indexes for frequent queries.
Use projection to return only needed fields.
Use pagination for large result sets.
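Avoid unanchored or case-insensitive regex queries on large collections; only case-sensitive prefix patterns such as /^abc/ can use an index efficiently.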
Avoid collection scans on large collections.
Use aggregation carefully on large datasets.

23.3 Index Best Practices

Create indexes based on real queries.
Use compound indexes for filter + sort patterns.
Avoid too many indexes.
Remove unused indexes.
Use unique indexes for unique fields.
Test indexes with explain().

23.4 Security Best Practices

Enable authentication.
Use least-privilege roles.
Do not use admin users in applications.
Use TLS in production.
Rotate passwords and secrets.
Keep MongoDB patched.
Restrict network access.
Audit users and roles regularly.

23.5 Backup Best Practices

Automate backups.
Test restores.
Store backups securely.
Use snapshots or managed backups for production.
Monitor backup jobs.
Document recovery procedures.

23.6 Production Best Practices

Use replica sets.
Monitor disk, CPU, memory, connections, locks, and replication lag.
Use proper indexes.
Plan capacity.
Avoid running without authentication.
Avoid public internet exposure.
Use sharding only when required.
Test upgrades before production.
Keep disaster recovery plans current.

24. Common Mistakes Beginners Make

Mistake 1: Thinking MongoDB Has No Schema

MongoDB has a flexible schema, but your application still needs a clear data model.

Mistake 2: Creating Too Many Indexes

Indexes speed up reads but slow down writes and use storage.

Mistake 3: Using MongoDB Like a Relational Database

If every query requires many joins, your model may not be document-oriented enough.

Mistake 4: Ignoring Backups

Backups are only useful if restores are tested.

Mistake 5: Running Without Authentication

Self-managed MongoDB access control is not enabled by default, so you must enable it for real deployments. (MongoDB)

Mistake 6: Bad Shard Key

A poor shard key can create uneven data distribution and performance hot spots.

Mistake 7: Not Using `explain()`

You cannot guess query performance reliably. Use explain().

25. MongoDB Cheat Sheet

Database Commands

show dbs
use mydb
db
db.dropDatabase()

Collection Commands

show collections
db.createCollection("users")
db.users.drop()

Insert

db.users.insertOne({ name: "Alice" })
db.users.insertMany([{ name: "Bob" }, { name: "Carol" }])

Read

db.users.find()
db.users.findOne({ name: "Alice" })
db.users.find({ age: { $gt: 18 } })

Projection

db.users.find({}, { name: 1, email: 1, _id: 0 })

Update

db.users.updateOne(
  { name: "Alice" },
  { $set: { city: "Tokyo" } }
)

db.users.updateMany(
  { active: true },
  { $set: { verified: true } }
)

Delete

db.users.deleteOne({ name: "Alice" })
db.users.deleteMany({ active: false })

Index

db.users.createIndex({ email: 1 }, { unique: true })
db.users.getIndexes()
db.users.dropIndex({ email: 1 })

Aggregation

db.orders.aggregate([
  { $match: { status: "paid" } },
  { $group: { _id: "$customer", total: { $sum: "$total" } } },
  { $sort: { total: -1 } }
])

Users

use admin

db.createUser({
  user: "adminUser",
  pwd: passwordPrompt(),
  roles: ["root"]
})

Backup

mongodump --db mydb --out /backup

Restore

mongorestore --nsInclude "mydb.*" /backup

26. Suggested Learning Path

Follow this order:

Understand documents, collections, and databases.
Install MongoDB locally with Docker.
Learn mongosh.
Practice CRUD.
Learn filters, projections, sorting, and pagination.
Learn data modeling.
Learn indexes.
Learn aggregation.
Add schema validation.
Add authentication and users.
Learn backup and restore.
Learn replica sets.
Learn transactions.
Learn sharding concepts.
Learn monitoring and production administration.

Final Summary

MongoDB is a flexible, document-oriented database designed for modern applications. Its biggest strengths are flexible schema design, document-based storage, powerful queries, indexes, aggregation, replication, transactions, and horizontal scaling through sharding.

For beginners, start with Docker, mongosh, CRUD, and indexes. For intermediate users, focus on schema design, aggregation, validation, and security. For advanced users, learn replica sets, backup strategies, performance tuning, transactions, and sharding.

A strong MongoDB workflow is:

Design data model
  ↓
Create collections
  ↓
Insert documents
  ↓
Query and update data
  ↓
Add indexes
  ↓
Validate schema
  ↓
Secure users and roles
  ↓
Back up data
  ↓
Monitor performance
  ↓
Scale with replica sets and sharding

That is the practical end-to-end MongoDB journey: from first document to production-ready database.

rajeshkumar
