MongoDB – Complete End-to-End MongoDB Tutorial Blog: From Basics to Advanced

MongoDB is a NoSQL document database. Instead of storing data in rows and columns like a relational database, MongoDB stores data as documents inside collections. These documents look like JSON when you write them, but MongoDB stores them internally in a binary format called BSON, which supports richer data types. MongoDB is available as Community Edition and Enterprise Edition, and the official documentation provides installation paths for Linux, macOS, Windows, and Docker. (MongoDB)

This tutorial covers MongoDB from beginner to advanced level: what it is, why it is useful, how it works, architecture, installation, CRUD, indexing, aggregation, schema design, security, replication, sharding, backup, administration, and real workflows.


1. What Is MongoDB?

MongoDB is a document-oriented database designed for modern applications that need flexible, scalable, high-performance data storage.

In a traditional SQL database, you might store user data like this:

id | name  | email             | city
1  | Alice | alice@example.com | Tokyo

In MongoDB, the same user is stored as a document:

{
  _id: ObjectId("..."),
  name: "Alice",
  email: "alice@example.com",
  address: {
    city: "Tokyo",
    country: "Japan"
  },
  skills: ["Linux", "Docker", "MongoDB"]
}

A MongoDB document can contain:

  • Strings
  • Numbers
  • Dates
  • Arrays
  • Nested objects
  • Booleans
  • ObjectIds
  • Binary data
  • Geospatial data

MongoDB is popular because application data is often already object-like. For example, a user profile, product catalog item, order, blog post, or IoT event naturally fits as a document.


2. Advantages of MongoDB

MongoDB has several major advantages.

2.1 Flexible Schema

MongoDB does not force every document in a collection to have the exact same fields. That makes it useful when application requirements change quickly.

For example:

{
  name: "Laptop",
  price: 1200,
  category: "Electronics"
}

Another document in the same collection can have extra fields:

{
  name: "Phone",
  price: 800,
  category: "Electronics",
  warranty: "2 years",
  colors: ["black", "white"]
}

However, flexible schema does not mean “no design.” MongoDB also supports schema validation when you want to enforce data rules such as required fields, data types, or value ranges. (MongoDB)

2.2 Natural Data Model for Applications

Most modern applications use objects, JSON, APIs, and nested data. MongoDB documents map naturally to those structures.

For example, an order can store customer information and ordered items together:

{
  orderId: "ORD1001",
  customer: {
    name: "Alice",
    email: "alice@example.com"
  },
  items: [
    { product: "Keyboard", qty: 1, price: 50 },
    { product: "Mouse", qty: 2, price: 25 }
  ],
  status: "paid"
}

This can reduce the need for joins in many use cases.

2.3 High Availability

MongoDB supports replication through replica sets. A replica set is a group of mongod processes that maintain the same dataset. Replica sets provide redundancy and high availability, and MongoDB’s documentation describes them as the basis for production deployments. (MongoDB)

2.4 Horizontal Scalability

MongoDB supports sharding, which distributes data across multiple machines. A sharded cluster contains shards, mongos query routers, and config servers. Each shard stores a subset of the data and must be deployed as a replica set. (MongoDB)

2.5 Powerful Query and Aggregation System

MongoDB supports filtering, sorting, indexing, text search, geospatial queries, aggregation pipelines, and multi-document transactions. Aggregation pipelines process documents through stages such as filtering, grouping, projecting, sorting, joining, and calculating values. (MongoDB)

2.6 Transactions

MongoDB supports multi-document transactions when you need all-or-nothing changes across multiple documents or collections. Transactions either commit all changes or roll them back. (MongoDB)


3. How MongoDB Works

At a high level, MongoDB works like this:

Application
   |
MongoDB Driver
   |
mongod or mongos
   |
Storage Engine
   |
Data Files + Journal

3.1 Application Layer

Your application can be written in Node.js, Python, Java, Go, PHP, C#, Ruby, or another language. It uses a MongoDB driver to connect to the database.

Example connection string:

mongodb://localhost:27017

3.2 Driver Layer

The driver converts your application objects into MongoDB-compatible BSON documents and sends commands to the server.

Example in application logic:

await db.collection("users").insertOne({
  name: "Alice",
  email: "alice@example.com"
});

3.3 Server Layer: mongod

mongod is the main MongoDB database server process. It handles:

  • Client connections
  • Reads
  • Writes
  • Indexes
  • Replication
  • Storage
  • Query execution
  • Authentication and authorization

3.4 Storage Layer

MongoDB writes data to disk through its storage engine. In modern MongoDB deployments, WiredTiger is the default storage engine. The storage layer manages data files, indexes, compression, concurrency, and journaling.

3.5 Query Execution

When you run a query like:

db.users.find({ email: "alice@example.com" })

MongoDB checks whether an index can help. If an appropriate index exists, MongoDB uses it to limit the number of documents scanned. Without a useful index, MongoDB may scan every document in the collection. (MongoDB)
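
To make the difference concrete, here is a small in-memory sketch in plain JavaScript. It is not how MongoDB implements indexes; the users array, the Map-based emailIndex, and both helper functions are hypothetical stand-ins that only count how many documents each strategy has to examine before it finds the first match.

```javascript
// Hypothetical in-memory "collection" of 10,000 user documents.
const users = Array.from({ length: 10000 }, (_, i) => ({
  _id: i,
  email: `user${i}@example.com`,
}));

// Collection scan: examine documents one by one until a match is found.
function collectionScan(docs, email) {
  let examined = 0;
  for (const doc of docs) {
    examined += 1;
    if (doc.email === email) return { doc, examined };
  }
  return { doc: null, examined };
}

// Simplified "index": a Map from email to the document's array position.
const emailIndex = new Map(users.map((doc, pos) => [doc.email, pos]));

// Index lookup: jump straight to the matching document.
function indexLookup(docs, index, email) {
  const pos = index.get(email);
  if (pos === undefined) return { doc: null, examined: 0 };
  return { doc: docs[pos], examined: 1 };
}

const target = "user9999@example.com";
const scanResult = collectionScan(users, target);
const indexResult = indexLookup(users, emailIndex, target);
console.log(scanResult.examined, indexResult.examined); // 10000 1
```

Both strategies return the same document; only the amount of work differs, which is the gap explain() reports on a real deployment.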


4. MongoDB Architecture

MongoDB can run in three main deployment architectures.


4.1 Standalone Architecture

This is the simplest architecture.

Application → mongod → Data files

Use it for:

  • Learning
  • Local development
  • Testing
  • Small experiments

Avoid standalone MongoDB for production because it has no automatic failover.


4.2 Replica Set Architecture

A replica set contains multiple MongoDB servers that hold the same data.

              ┌──────────────┐
Application → │ Primary Node │
              └──────┬───────┘
                     │ replication
        ┌────────────┴────────────┐
        ↓                         ↓
┌──────────────┐          ┌──────────────┐
│ Secondary    │          │ Secondary    │
└──────────────┘          └──────────────┘

The primary receives writes. The secondary nodes replicate data from the primary. If the primary fails, the replica set can elect a new primary.

MongoDB replica sets improve:

  • Availability
  • Fault tolerance
  • Data redundancy
  • Disaster recovery
  • Read scaling in selected cases

MongoDB documentation notes that the primary records changes in the operation log, or oplog, which secondaries use to replicate changes. (MongoDB)
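
Electing a new primary requires a majority of the voting members, which is why member counts matter. The arithmetic can be sketched in a few lines of plain JavaScript; the function names majorityNeeded and faultTolerance are made up for this illustration.

```javascript
// Votes needed to elect a primary: a strict majority of voting members.
function majorityNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many voting members can be unavailable while an election can still succeed.
function faultTolerance(votingMembers) {
  return votingMembers - majorityNeeded(votingMembers);
}

for (const n of [3, 4, 5]) {
  console.log(
    `${n} members: majority ${majorityNeeded(n)}, tolerates ${faultTolerance(n)} down`
  );
}
```

A four-member set tolerates the same single failure as a three-member set, which is one reason odd member counts are the usual choice.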


4.3 Sharded Cluster Architecture

Sharding distributes data across multiple shards.

Application
    |
  mongos ──── Config Servers (cluster metadata)
    |
 ┌─────────┬─────────┬─────────┐
 │ Shard 1 │ Shard 2 │ Shard 3 │
 └─────────┴─────────┴─────────┘

A sharded cluster contains:

Component      | Purpose
Shard          | Stores a subset of data
mongos         | Query router between application and shards
Config servers | Store cluster metadata and configuration
Shard key      | Field used to distribute documents
Balancer       | Moves chunks to balance data

Use sharding when:

  • One server cannot store all data
  • One server cannot handle all reads/writes
  • You need horizontal scaling
  • You need data distribution by region or workload

5. MongoDB Components

5.1 mongod

The main database server process.

Responsibilities:

  • Stores data
  • Handles queries
  • Maintains indexes
  • Performs replication
  • Applies access control
  • Manages storage

5.2 mongosh

mongosh is the MongoDB Shell. You use it to connect to MongoDB, run commands, query data, create users, inspect collections, and perform administration tasks. MongoDB provides separate installation guidance for mongosh. (MongoDB)

Example:

mongosh

5.3 mongos

mongos is the query router used in sharded clusters. Applications connect to mongos, and mongos routes operations to the correct shards.


5.4 Config Servers

Config servers store metadata for sharded clusters, including shard information and chunk distribution. In modern MongoDB sharded clusters, config servers must be deployed as a replica set. (MongoDB)


5.5 MongoDB Compass

MongoDB Compass is a GUI tool for browsing databases, collections, documents, indexes, and aggregation pipelines.

Use Compass when you want a visual interface instead of shell commands.


5.6 MongoDB Database Tools

MongoDB Database Tools include utilities such as:

Tool         | Purpose
mongodump    | Create binary database backups
mongorestore | Restore binary backups
mongoexport  | Export data as JSON or CSV
mongoimport  | Import JSON, CSV, or TSV
bsondump     | Inspect BSON files

mongodump and mongorestore are useful for small deployments, but MongoDB documentation recommends snapshots or Atlas cloud backups for more resilient, non-disruptive backup strategies. (MongoDB)


6. MongoDB Terminology

Relational Database     | MongoDB
Database                | Database
Table                   | Collection
Row                     | Document
Column                  | Field
Primary key             | _id
Index                   | Index
Join                    | $lookup, embedding, or application-side join
View                    | View
Transaction             | Transaction
SQL                     | MongoDB Query Language
Schema                  | Flexible schema / schema validation
Server                  | mongod
Cluster router          | mongos
Replication log         | Oplog
Horizontal partitioning | Sharding

7. Installing MongoDB

For beginners, the easiest installation method is Docker. For production, use official packages for your operating system or MongoDB Atlas.

MongoDB’s official installation documentation covers Community and Enterprise editions across Linux, macOS, Windows, and Docker. (MongoDB)


7.1 Install MongoDB Using Docker

Create a MongoDB container:

docker run -d \
  --name mongodb \
  -p 27017:27017 \
  -v mongodb_data:/data/db \
  mongo:8.0

Check container status:

docker ps

Connect using mongosh inside the container:

docker exec -it mongodb mongosh

Stop MongoDB:

docker stop mongodb

Start it again:

docker start mongodb

Remove container:

docker rm -f mongodb

Remove volume:

docker volume rm mongodb_data

For a beginner lab, Docker is clean because it avoids OS package conflicts.


7.2 Install MongoDB on Ubuntu

The official Ubuntu installation uses the mongodb-org package maintained by MongoDB Inc.; MongoDB warns that Ubuntu’s separate mongodb package is not maintained by MongoDB Inc. and conflicts with the official package. (MongoDB)

Install prerequisites:

sudo apt-get update
sudo apt-get install -y gnupg curl

Import MongoDB public key:

curl -fsSL https://pgp.mongodb.com/server-8.0.asc | \
  sudo gpg -o /usr/share/keyrings/mongodb-server-8.0.gpg \
  --dearmor

For Ubuntu 24.04:

echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-8.0.gpg ] https://repo.mongodb.org/apt/ubuntu noble/mongodb-org/8.0 multiverse" | \
  sudo tee /etc/apt/sources.list.d/mongodb-org-8.0.list

Install MongoDB:

sudo apt-get update
sudo apt-get install -y mongodb-org

Start MongoDB:

sudo systemctl start mongod

Enable MongoDB at boot:

sudo systemctl enable mongod

Check status:

sudo systemctl status mongod

Connect:

mongosh

7.3 Install MongoDB on macOS

MongoDB’s macOS Community Edition documentation is based on Homebrew and covers MongoDB 8.0 Community Edition. (MongoDB)

Install Homebrew if needed, then run:

brew tap mongodb/brew
brew install mongodb-community@8.0

Start MongoDB:

brew services start mongodb-community@8.0

Stop MongoDB:

brew services stop mongodb-community@8.0

Connect:

mongosh

7.4 Install MongoDB on Windows

MongoDB’s Windows Community Edition documentation uses the MSI installer by default, and notes that mongosh is not installed with MongoDB Server. (MongoDB)

Typical Windows workflow:

  1. Download MongoDB Community Server MSI.
  2. Run the installer.
  3. Choose “Complete” setup.
  4. Install MongoDB as a Windows service.
  5. Install MongoDB Compass if desired.
  6. Install mongosh separately, because the server MSI does not include it.
  7. Open PowerShell or Command Prompt.
  8. Run:
mongosh

8. Getting Started with MongoDB

Start mongosh:

mongosh

Show databases:

show dbs

Create or switch to a database:

use bookstore

MongoDB creates the database only when you first store something in it, for example by creating a collection or inserting a document.

Create a collection:

db.createCollection("books")

Show collections:

show collections

Insert one document:

db.books.insertOne({
  title: "MongoDB Basics",
  author: "Alice Tanaka",
  price: 29.99,
  category: "Database",
  publishedYear: 2026,
  tags: ["mongodb", "nosql", "database"],
  stock: 50,
  createdAt: new Date()
})

Insert many documents:

db.books.insertMany([
  {
    title: "Linux for DevOps",
    author: "Ken Sato",
    price: 34.99,
    category: "DevOps",
    publishedYear: 2025,
    tags: ["linux", "devops"],
    stock: 30,
    createdAt: new Date()
  },
  {
    title: "Docker Practical Guide",
    author: "Maria Silva",
    price: 39.99,
    category: "Containers",
    publishedYear: 2026,
    tags: ["docker", "containers"],
    stock: 20,
    createdAt: new Date()
  }
])

Find all documents:

db.books.find()

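Pretty print (mongosh already formats results, so pretty() mainly matters for compatibility with the legacy mongo shell):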

db.books.find().pretty()

Find one document:

db.books.findOne({ title: "MongoDB Basics" })

Filter documents:

db.books.find({ category: "Database" })

Use comparison operators:

db.books.find({ price: { $gt: 30 } })

Use logical operators:

db.books.find({
  $and: [
    { price: { $gt: 20 } },
    { stock: { $gt: 10 } }
  ]
})

Find by array value:

db.books.find({ tags: "docker" })

Projection: return selected fields only:

db.books.find(
  { category: "Database" },
  { title: 1, author: 1, price: 1, _id: 0 }
)

Sort results:

db.books.find().sort({ price: 1 })

Limit results:

db.books.find().limit(2)

Skip results (sort first so that pages come back in a stable order):

db.books.find().sort({ _id: 1 }).skip(2).limit(2)

Update one document:

db.books.updateOne(
  { title: "MongoDB Basics" },
  { $set: { price: 24.99 } }
)

Update many documents:

db.books.updateMany(
  { category: "Database" },
  { $inc: { stock: 10 } }
)

Add value to array:

db.books.updateOne(
  { title: "MongoDB Basics" },
  { $addToSet: { tags: "beginner" } }
)

Delete one document:

db.books.deleteOne({ title: "Docker Practical Guide" })

Delete many documents:

db.books.deleteMany({ stock: { $lte: 0 } })

Drop collection:

db.books.drop()

Drop database:

db.dropDatabase()

MongoDB CRUD operations are create, read, update, and delete operations on documents. (MongoDB)


9. MongoDB Data Modeling

MongoDB data modeling is one of the most important skills.

You usually choose between:

  1. Embedding
  2. Referencing

9.1 Embedding

Embedding means storing related data inside the same document.

Example:

{
  title: "MongoDB Basics",
  author: "Alice Tanaka",
  reviews: [
    {
      user: "Ravi",
      rating: 5,
      comment: "Very helpful"
    },
    {
      user: "Yuki",
      rating: 4,
      comment: "Good beginner guide"
    }
  ]
}

Use embedding when:

  • Data is frequently read together
  • Child data belongs strongly to parent data
  • Child data is limited in size
  • You want fast reads

Good examples:

  • Blog post with comments
  • Order with order items
  • User profile with address
  • Product with attributes

9.2 Referencing

Referencing means storing related data in separate collections and linking by _id.

Users:

{
  _id: ObjectId("64..."),
  name: "Alice"
}

Orders:

{
  _id: ObjectId("65..."),
  userId: ObjectId("64..."),
  total: 150
}

Use referencing when:

  • Related data is large
  • Related data changes frequently
  • Many documents share the same related data
  • You need many-to-many relationships

Good examples:

  • Users and roles
  • Products and categories
  • Students and courses
  • Authors and books

9.3 Rule of Thumb

Use embedding when data is “owned by” the parent and read together.

Use referencing when data is independent, large, shared, or frequently updated.
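
The trade-off can be sketched with plain JavaScript objects, no database required. The data, the string ids standing in for ObjectId values, and the helpers attachUsers and embedOrders are all hypothetical; the point is only to show what the application has to do in each model.

```javascript
// Hypothetical data; string ids stand in for ObjectId values.
const userDocs = [
  { _id: "u1", name: "Alice" },
  { _id: "u2", name: "Bob" },
];
const orderDocs = [
  { _id: "o1", userId: "u1", total: 150 },
  { _id: "o2", userId: "u1", total: 40 },
  { _id: "o3", userId: "u2", total: 75 },
];

// Referencing: resolve each order's userId with a lookup table built once
// (an application-side join).
function attachUsers(orders, users) {
  const byId = new Map(users.map((u) => [u._id, u]));
  return orders.map((o) => ({ ...o, user: byId.get(o.userId) || null }));
}

// Embedding: the same information, already shaped for a single read.
function embedOrders(users, orders) {
  return users.map((u) => ({
    ...u,
    orders: orders
      .filter((o) => o.userId === u._id)
      .map(({ _id, total }) => ({ _id, total })),
  }));
}

const joined = attachUsers(orderDocs, userDocs);
const embedded = embedOrders(userDocs, orderDocs);
console.log(joined[0].user.name); // Alice
console.log(embedded[0].orders.length); // 2
```

With references, renaming a user touches one document but every read needs the extra resolution step; with embedding, the read is a single document but the orders array grows with the user.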

Tiny hot take: MongoDB schema design is not “no schema.” It is “put the schema where it actually helps.” Sometimes that is the application. Sometimes that is validation. Sometimes it is both.


10. Schema Validation

MongoDB allows flexible documents by default, but you can enforce rules using schema validation. Schema validation can check required fields, data types, value ranges, and document shape. (MongoDB)

Create a collection with validation:

db.createCollection("students", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "email", "age"],
      properties: {
        name: {
          bsonType: "string",
          description: "Name must be a string"
        },
        email: {
          bsonType: "string",
          pattern: "^.+@.+\\..+$",
          description: "Email must be valid"
        },
        age: {
          bsonType: "int",
          minimum: 18,
          maximum: 100,
          description: "Age must be between 18 and 100"
        }
      }
    }
  }
})

Valid insert:

db.students.insertOne({
  name: "Aiko",
  email: "aiko@example.com",
  age: 22
})

Invalid insert:

db.students.insertOne({
  name: "Aiko",
  age: 15
})

The invalid insert fails because email is missing and age is below the allowed minimum.
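
Many applications also check documents before sending them to the database, so users get friendlier error messages. The function below is a hand-written sketch that mirrors the three rules above in plain JavaScript; it is not MongoDB's validator, and the name validateStudent and its error strings are invented for this example.

```javascript
// Application-side check that mirrors the rules in the $jsonSchema validator above.
function validateStudent(doc) {
  const errors = [];

  // required: ["name", "email", "age"]
  for (const field of ["name", "email", "age"]) {
    if (!(field in doc)) errors.push(`${field} is required`);
  }

  if ("name" in doc && typeof doc.name !== "string") {
    errors.push("name must be a string");
  }

  // Same pattern as the validator: something@something.something
  if ("email" in doc && (typeof doc.email !== "string" || !/^.+@.+\..+$/.test(doc.email))) {
    errors.push("email must be valid");
  }

  // bsonType "int" with minimum 18 and maximum 100
  if ("age" in doc && (!Number.isInteger(doc.age) || doc.age < 18 || doc.age > 100)) {
    errors.push("age must be an integer between 18 and 100");
  }

  return errors;
}

console.log(validateStudent({ name: "Aiko", email: "aiko@example.com", age: 22 }));
console.log(validateStudent({ name: "Aiko", age: 15 }));
```

The first call returns an empty array; the second reports the missing email and the out-of-range age, the same two problems that make the invalid insert fail.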


11. Indexes in MongoDB

Indexes make queries faster. Without an index, MongoDB may scan every document in a collection. With a suitable index, MongoDB can scan fewer documents. Indexes improve query performance but add write overhead because inserts and updates must also update indexes. (MongoDB)


11.1 Create a Single-Field Index

db.books.createIndex({ title: 1 })

1 means ascending order. -1 means descending order.


11.2 Create a Unique Index

db.users.createIndex({ email: 1 }, { unique: true })

Now duplicate emails are rejected.


11.3 Create a Compound Index

db.books.createIndex({ category: 1, price: -1 })

Useful query:

db.books.find({ category: "Database" }).sort({ price: -1 })

11.4 List Indexes

db.books.getIndexes()

11.5 Drop Index

db.books.dropIndex({ title: 1 })

11.6 Use explain()

db.books.find({ title: "MongoDB Basics" }).explain("executionStats")

Look for:

  • COLLSCAN: collection scan, usually bad for large collections
  • IXSCAN: index scan, usually better
  • totalDocsExamined
  • totalKeysExamined
  • executionTimeMillis
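
If you inspect explain output often, a small helper can pull out these values. The sketch below is plain JavaScript run against a hand-written sample object, not real server output; the exact shape of explain results varies by server version and query engine, so treat summarizeExplain and planStages as hypothetical starting points.

```javascript
// Collect stage names by walking winningPlan.stage and nested inputStage fields.
function planStages(plan) {
  const stages = [];
  for (let node = plan; node; node = node.inputStage) {
    if (node.stage) stages.push(node.stage);
  }
  return stages;
}

function summarizeExplain(explain) {
  const stats = explain.executionStats;
  const stages = planStages(explain.queryPlanner.winningPlan);
  return {
    stages,
    usesCollectionScan: stages.includes("COLLSCAN"),
    // Documents examined per document returned; close to 1 is ideal.
    docsPerResult:
      stats.nReturned > 0
        ? stats.totalDocsExamined / stats.nReturned
        : stats.totalDocsExamined,
    millis: stats.executionTimeMillis,
  };
}

// Hand-written sample, trimmed to the fields used here (not real server output).
const sampleExplain = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "title_1" },
    },
  },
  executionStats: {
    nReturned: 1,
    executionTimeMillis: 0,
    totalKeysExamined: 1,
    totalDocsExamined: 1,
  },
};

console.log(summarizeExplain(sampleExplain));
```

A docsPerResult value far above 1 usually means the query examines many documents it then throws away, which is a hint that a better index may exist.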

11.7 Index Best Practices

Create indexes for:

  • Frequent filters
  • Frequent sorts
  • Unique fields
  • Foreign-key-like reference fields
  • Common search patterns

Avoid:

  • Too many indexes
  • Indexing low-value fields unnecessarily
  • Creating indexes without checking query patterns
  • Ignoring write overhead

12. Aggregation Pipeline

Aggregation is MongoDB’s data processing framework. It works like a pipeline: documents pass through stages, and each stage transforms, filters, groups, sorts, joins, or calculates data. MongoDB documentation notes that aggregation stages pass output documents to the next stage, and aggregation can return grouped results. (MongoDB)

Basic structure:

db.collection.aggregate([
  { stage1 },
  { stage2 },
  { stage3 }
])

12.1 Sample Data

db.orders.insertMany([
  {
    customer: "Alice",
    status: "paid",
    total: 120,
    items: [
      { product: "Keyboard", qty: 1, price: 50 },
      { product: "Mouse", qty: 2, price: 35 }
    ],
    createdAt: new Date("2026-01-10")
  },
  {
    customer: "Bob",
    status: "pending",
    total: 80,
    items: [
      { product: "Mouse", qty: 2, price: 40 }
    ],
    createdAt: new Date("2026-01-11")
  },
  {
    customer: "Alice",
    status: "paid",
    total: 200,
    items: [
      { product: "Monitor", qty: 1, price: 200 }
    ],
    createdAt: new Date("2026-01-12")
  }
])

12.2 $match

Filters documents.

db.orders.aggregate([
  { $match: { status: "paid" } }
])

12.3 $group

Groups documents and calculates values.

db.orders.aggregate([
  { $match: { status: "paid" } },
  {
    $group: {
      _id: "$customer",
      totalSpent: { $sum: "$total" },
      orderCount: { $sum: 1 }
    }
  }
])
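
If the stage syntax feels unfamiliar, it can help to see the same logic in plain JavaScript. This sketch reproduces the $match and $group steps over an in-memory copy of the sample orders (items omitted); it illustrates the semantics only and says nothing about how the server executes a pipeline.

```javascript
const orders = [
  { customer: "Alice", status: "paid", total: 120 },
  { customer: "Bob", status: "pending", total: 80 },
  { customer: "Alice", status: "paid", total: 200 },
];

// Equivalent of { $match: { status: "paid" } }
const paid = orders.filter((o) => o.status === "paid");

// Equivalent of $group keyed on "$customer" with two $sum accumulators.
const groups = new Map();
for (const o of paid) {
  const g = groups.get(o.customer) || { _id: o.customer, totalSpent: 0, orderCount: 0 };
  g.totalSpent += o.total; // { $sum: "$total" }
  g.orderCount += 1; // { $sum: 1 }
  groups.set(o.customer, g);
}

const grouped = [...groups.values()];
console.log(grouped); // [ { _id: 'Alice', totalSpent: 320, orderCount: 2 } ]
```

Bob's pending order is removed by the filter, so only Alice appears in the grouped result.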

12.4 $project

Controls output fields.

db.orders.aggregate([
  {
    $project: {
      customer: 1,
      total: 1,
      tax: { $multiply: ["$total", 0.1] },
      grandTotal: { $multiply: ["$total", 1.1] }
    }
  }
])

12.5 $sort

db.orders.aggregate([
  { $sort: { total: -1 } }
])

12.6 $unwind

Breaks array items into separate documents.

db.orders.aggregate([
  { $unwind: "$items" },
  {
    $group: {
      _id: "$items.product",
      totalQuantity: { $sum: "$items.qty" },
      totalRevenue: {
        $sum: { $multiply: ["$items.qty", "$items.price"] }
      }
    }
  }
])
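
The effect of $unwind is easiest to see in plain JavaScript. The sketch below applies the same two steps to an in-memory copy of the sample orders; the variable names are arbitrary, and it models the semantics only.

```javascript
const ordersWithItems = [
  {
    customer: "Alice",
    items: [
      { product: "Keyboard", qty: 1, price: 50 },
      { product: "Mouse", qty: 2, price: 35 },
    ],
  },
  { customer: "Bob", items: [{ product: "Mouse", qty: 2, price: 40 }] },
  { customer: "Alice", items: [{ product: "Monitor", qty: 1, price: 200 }] },
];

// Equivalent of { $unwind: "$items" }: one output document per array element,
// with items replaced by that single element.
const unwound = ordersWithItems.flatMap((o) =>
  o.items.map((item) => ({ ...o, items: item }))
);

// Equivalent of the $group stage keyed on "$items.product".
const byProduct = {};
for (const doc of unwound) {
  const { product, qty, price } = doc.items;
  if (!byProduct[product]) {
    byProduct[product] = { _id: product, totalQuantity: 0, totalRevenue: 0 };
  }
  byProduct[product].totalQuantity += qty;
  byProduct[product].totalRevenue += qty * price;
}

console.log(unwound.length); // 4
console.log(byProduct.Mouse); // { _id: 'Mouse', totalQuantity: 4, totalRevenue: 150 }
```

Three orders become four documents after unwinding, and the two Mouse lines (2 × 35 and 2 × 40) are combined into a single product total.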

12.7 $lookup

Performs a left outer join between collections.

Products:

db.products.insertMany([
  { _id: 1, name: "Keyboard", category: "Accessories" },
  { _id: 2, name: "Mouse", category: "Accessories" }
])

Order items:

db.orderItems.insertMany([
  { orderId: 101, productId: 1, qty: 1 },
  { orderId: 102, productId: 2, qty: 2 }
])

Join:

db.orderItems.aggregate([
  {
    $lookup: {
      from: "products",
      localField: "productId",
      foreignField: "_id",
      as: "productDetails"
    }
  }
])
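
"Left outer" means every document from orderItems is kept, even when no product matches, and the joined field is always an array. A plain JavaScript sketch of that behavior follows; the lookup helper and the extra order 103 with an unknown productId are additions made only for this illustration.

```javascript
const products = [
  { _id: 1, name: "Keyboard", category: "Accessories" },
  { _id: 2, name: "Mouse", category: "Accessories" },
];
const orderItems = [
  { orderId: 101, productId: 1, qty: 1 },
  { orderId: 102, productId: 2, qty: 2 },
  { orderId: 103, productId: 99, qty: 1 }, // hypothetical row with no matching product
];

// Left outer join: every local document is kept, and the joined field is
// an array of matches (empty when nothing matches).
function lookup(localDocs, foreignDocs, localField, foreignField, as) {
  return localDocs.map((doc) => ({
    ...doc,
    [as]: foreignDocs.filter((f) => f[foreignField] === doc[localField]),
  }));
}

const joinedItems = lookup(orderItems, products, "productId", "_id", "productDetails");
console.log(JSON.stringify(joinedItems, null, 2));
```

Order 103 still appears in the output, with productDetails set to an empty array.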

13. Transactions

MongoDB supports multi-document transactions for use cases where multiple changes must succeed or fail together. The official docs describe distributed transactions as atomic: changes are applied together or rolled back. (MongoDB)

Example use case:

  • Deduct money from one account
  • Add money to another account
  • Record transfer history

If one step fails, all changes should be rolled back.

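Example structure using the Node.js driver (a sketch that assumes client is a connected MongoClient, accounts and transfers are collection handles, and the code runs inside an async function):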

const session = client.startSession();

try {
  session.startTransaction();

  await accounts.updateOne(
    { accountNo: "A100" },
    { $inc: { balance: -100 } },
    { session }
  );

  await accounts.updateOne(
    { accountNo: "B200" },
    { $inc: { balance: 100 } },
    { session }
  );

  await transfers.insertOne(
    {
      from: "A100",
      to: "B200",
      amount: 100,
      createdAt: new Date()
    },
    { session }
  );

  await session.commitTransaction();
} catch (error) {
  await session.abortTransaction();
  throw error; // surface the failure to the caller instead of swallowing it
} finally {
  await session.endSession();
}

Multi-document transactions require a replica set or sharded cluster; they are not available on a standalone mongod. Use transactions when you truly need them, not to hide poor schema design. In MongoDB, good document design often reduces the need for multi-document transactions.
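
The all-or-nothing idea itself can be shown without a database. The sketch below is an in-memory analogy in plain JavaScript, not the MongoDB transaction API: the hypothetical transfer function applies every step to a copy of the state and hands the copy back only if all steps succeed.

```javascript
// Apply a transfer to a copy of the state; return the copy only if every step succeeds.
function transfer(state, from, to, amount) {
  const draft = {
    balances: { ...state.balances },
    transfers: [...state.transfers],
  };

  if (!(from in draft.balances) || !(to in draft.balances)) {
    throw new Error("unknown account");
  }

  draft.balances[from] -= amount;
  if (draft.balances[from] < 0) {
    // The draft is discarded, so the original state is untouched ("rollback").
    throw new Error("insufficient funds");
  }
  draft.balances[to] += amount;
  draft.transfers.push({ from, to, amount });

  return draft; // "commit": the caller swaps in the new state
}

let state = { balances: { A100: 150, B200: 0 }, transfers: [] };

state = transfer(state, "A100", "B200", 100); // succeeds

try {
  state = transfer(state, "A100", "B200", 100); // fails: only 50 left
} catch (error) {
  console.log(`rolled back: ${error.message}`);
}

console.log(state.balances); // { A100: 50, B200: 100 }
```

After the failed second transfer, neither balance has moved and no transfer record was added, which is the guarantee a real transaction gives you across documents and collections.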


14. Security: Authentication and Authorization

MongoDB security has two major parts:

Concept        | Meaning
Authentication | Proves who the user is
Authorization  | Controls what the user can do

MongoDB documentation explains that authentication verifies identity, while authorization determines access to resources and operations. (MongoDB)

MongoDB uses Role-Based Access Control. A user is granted one or more roles, and outside those roles, the user has no access. Access control is not enabled by default in self-managed deployments. (MongoDB)


14.1 Create Admin User

Connect locally before enabling auth:

mongosh

Switch to admin database:

use admin

Create admin user:

db.createUser({
  user: "adminUser",
  pwd: passwordPrompt(),
  roles: [
    { role: "userAdminAnyDatabase", db: "admin" },
    { role: "readWriteAnyDatabase", db: "admin" },
    { role: "dbAdminAnyDatabase", db: "admin" }
  ]
})

14.2 Enable Authorization

Edit MongoDB config file:

sudo nano /etc/mongod.conf

Add:

security:
  authorization: enabled

Restart MongoDB:

sudo systemctl restart mongod

Connect with authentication:

mongosh -u adminUser -p --authenticationDatabase admin

14.3 Create Application User

use bookstore

db.createUser({
  user: "bookAppUser",
  pwd: passwordPrompt(),
  roles: [
    { role: "readWrite", db: "bookstore" }
  ]
})

Connect as app user:

mongosh -u bookAppUser -p --authenticationDatabase bookstore

14.4 Common Built-In Roles

Role                 | Purpose
read                 | Read-only access
readWrite            | Read and write access
dbAdmin              | Database administration
userAdmin            | User management for one database
userAdminAnyDatabase | User management across databases
readWriteAnyDatabase | Read/write across databases
clusterAdmin         | Cluster administration
backup               | Backup privileges
restore              | Restore privileges
root                 | Superuser role
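
The "no access outside granted roles" rule can be modeled in a few lines of plain JavaScript. This is a deliberately simplified sketch: the roleActions table lists only a small subset of the actions the real read and readWrite roles grant, and isAllowed and reportUser are hypothetical names used for illustration.

```javascript
// Simplified role definitions: a small subset of the actions the real roles grant.
const roleActions = {
  read: ["find"],
  readWrite: ["find", "insert", "update", "remove"],
};

// Allowed only if some granted role on that database includes the action.
function isAllowed(user, db, action) {
  return user.roles.some(
    (r) => r.db === db && (roleActions[r.role] || []).includes(action)
  );
}

const bookAppUser = {
  user: "bookAppUser",
  roles: [{ role: "readWrite", db: "bookstore" }],
};
// Hypothetical read-only user for reporting.
const reportUser = {
  user: "reportUser",
  roles: [{ role: "read", db: "bookstore" }],
};

console.log(isAllowed(bookAppUser, "bookstore", "insert")); // true
console.log(isAllowed(reportUser, "bookstore", "insert")); // false
console.log(isAllowed(bookAppUser, "shop", "find")); // false
```

The last check fails because the role was granted on bookstore only; a role on one database gives no access to another.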

15. Replication: High Availability

A replica set is the standard production pattern for MongoDB. It contains multiple mongod instances with the same data. MongoDB’s documentation states that replica sets provide redundancy and high availability. (MongoDB)


15.1 Replica Set Components

Component | Meaning
Primary   | Accepts writes
Secondary | Replicates from primary
Arbiter   | Votes in elections but does not store data
Oplog     | Operation log used for replication
Election  | Process of choosing a new primary

15.2 Local Replica Set with Docker Compose

Create docker-compose.yml:

services:
  mongo1:
    image: mongo:8.0
    container_name: mongo1
    command: ["mongod", "--replSet", "rs0", "--bind_ip_all"]
    ports:
      - "27017:27017"
    volumes:
      - mongo1_data:/data/db

  mongo2:
    image: mongo:8.0
    container_name: mongo2
    command: ["mongod", "--replSet", "rs0", "--bind_ip_all"]
    ports:
      - "27018:27017"
    volumes:
      - mongo2_data:/data/db

  mongo3:
    image: mongo:8.0
    container_name: mongo3
    command: ["mongod", "--replSet", "rs0", "--bind_ip_all"]
    ports:
      - "27019:27017"
    volumes:
      - mongo3_data:/data/db

volumes:
  mongo1_data:
  mongo2_data:
  mongo3_data:

Start:

docker compose up -d

Connect to first node:

docker exec -it mongo1 mongosh

Initialize replica set:

rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "mongo1:27017" },
    { _id: 1, host: "mongo2:27017" },
    { _id: 2, host: "mongo3:27017" }
  ]
})

Check status:

rs.status()

Check which member is primary (db.hello() replaces the deprecated isMaster helper):

db.hello()

Insert data:

use replab
db.test.insertOne({ message: "replication works", createdAt: new Date() })

16. Sharding: Horizontal Scaling

Sharding is MongoDB’s horizontal scaling system. It distributes collection data across multiple shards. MongoDB sharded clusters include shards, mongos routers, and config servers. (MongoDB)


16.1 When to Use Sharding

Use sharding when:

  • Data is too large for one server
  • Write throughput is too high for one server
  • Working set does not fit on one machine
  • You need regional data distribution
  • You need very large-scale growth

Do not start with sharding unless you need it. Most applications should start with a replica set.


16.2 Shard Key

A shard key determines how MongoDB distributes data.

Good shard key characteristics:

  • High cardinality
  • Even distribution
  • Supports common queries
  • Avoids hot spots
  • Does not constantly increase in one direction unless handled carefully

Poor shard key examples:

{ status: 1 }

Bad because status may only have a few values.

Potentially risky:

{ createdAt: 1 }

Can create hot spots if inserts always go to the newest range.

Better example:

{ customerId: 1, orderId: 1 }
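
Why cardinality matters can be shown with a toy model in plain JavaScript. The sketch assigns each distinct key value to one of three shards by hashing it, and it ignores chunks, ranges, and the balancer entirely. The hash function is an illustrative FNV-1a style hash, not the one MongoDB uses, and all data is generated.

```javascript
// Illustrative string hash (FNV-1a style); MongoDB's hashed sharding uses its own function.
function hashString(s) {
  let h = 2166136261;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 16777619) >>> 0;
  }
  return h;
}

// Count how many documents land on each of shardCount shards for a given key.
function distribution(docs, keyOf, shardCount) {
  const counts = new Array(shardCount).fill(0);
  for (const doc of docs) {
    counts[hashString(String(keyOf(doc))) % shardCount] += 1;
  }
  return counts;
}

// Generated sample data: 9,000 orders, 1,500 customers, 2 status values.
const statuses = ["paid", "pending"];
const sampleOrders = Array.from({ length: 9000 }, (_, i) => ({
  orderId: i,
  customerId: i % 1500,
  status: statuses[i % 2],
}));

console.log("status key:        ", distribution(sampleOrders, (d) => d.status, 3));
console.log("customerId+orderId:", distribution(sampleOrders, (d) => `${d.customerId}:${d.orderId}`, 3));
```

With only two distinct status values, at most two of the three shards can ever receive data, no matter how many documents exist. The high-cardinality key gives the toy model enough distinct values to use every shard.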

16.3 Sharding Commands: Conceptual Example

Enable sharding on database:

sh.enableSharding("shop")

Create an index on the shard key (in the shop database):

db.getSiblingDB("shop").orders.createIndex({ customerId: 1, orderId: 1 })

Shard collection:

sh.shardCollection("shop.orders", { customerId: 1, orderId: 1 })

17. Backup and Restore

Backups are non-negotiable. A database without tested backups is basically a suspense movie with invoices.

MongoDB’s backup and restore tools include mongodump and mongorestore. These create and restore BSON data dumps. Official documentation says they are useful for small deployments, but they can affect performance because they read data through a running MongoDB instance. MongoDB recommends verifying backups by restoring them to a test deployment. (MongoDB)


17.1 Backup a Database

mongodump --db bookstore --out /backup/mongodb

Backup with authentication:

mongodump \
  --username adminUser \
  --password \
  --authenticationDatabase admin \
  --db bookstore \
  --out /backup/mongodb

17.2 Restore a Database

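mongorestore --nsInclude "bookstore.*" /backup/mongodb

The --db option is deprecated when restoring from a dump directory, so the namespace filter --nsInclude is used instead.

Restore with authentication:

mongorestore \
  --username adminUser \
  --password \
  --authenticationDatabase admin \
  --nsInclude "bookstore.*" \
  /backup/mongodb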

17.3 Backup Best Practices

  • Automate backups
  • Store backups off-server
  • Encrypt backups
  • Test restore regularly
  • Label backups with date and database name
  • Monitor backup success and failure
  • For production, prefer snapshot-based or managed cloud backup where appropriate
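
Labeling is easy to automate. As a minimal sketch in plain JavaScript, the hypothetical backupPlan helper below builds a dated output directory and the matching mongodump arguments (--db and --out, as used above). It only builds strings; running the command and handling credentials are left out, and the base directory is the one assumed earlier in this section.

```javascript
// Build a dated output directory and the mongodump arguments for one database.
function backupPlan(dbName, date, baseDir = "/backup/mongodb") {
  const stamp = date.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  const outDir = `${baseDir}/${dbName}-${stamp}`;
  return { outDir, args: ["--db", dbName, "--out", outDir] };
}

const plan = backupPlan("bookstore", new Date("2026-01-10T02:00:00Z"));
console.log(plan.outDir);
console.log(["mongodump", ...plan.args].join(" "));
```

For the date shown, the output directory is /backup/mongodb/bookstore-2026-01-10, so each day's dump lands in its own folder and retention scripts can work from the name alone.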

18. MongoDB Administration Workflow

A MongoDB administrator is responsible for keeping the database secure, healthy, backed up, observable, and performant.


18.1 Daily Administration Checklist

Run these checks daily.

Check server status:

db.adminCommand({ serverStatus: 1 })

Check database sizes:

db.stats()

Check collection stats:

db.books.stats()

Check current operations:

db.currentOp()

Check replica set status:

rs.status()

Check replication lag:

rs.printSecondaryReplicationInfo()

Review logs:

sudo journalctl -u mongod

or:

sudo tail -f /var/log/mongodb/mongod.log

18.2 Weekly Administration Checklist

  • Review slow queries
  • Review index usage
  • Check disk growth
  • Check backup restore process
  • Review users and roles
  • Check expired or unused accounts
  • Confirm monitoring alerts
  • Check OS patching plan
  • Check MongoDB patch version
  • Review schema growth and document sizes

18.3 Monthly Administration Checklist

  • Test disaster recovery
  • Review capacity planning
  • Audit access control
  • Review TLS certificates
  • Review shard balance if sharded
  • Review replica set elections
  • Confirm backup retention
  • Check deprecated features before upgrades

18.4 Common Admin Commands

Show databases:

show dbs

Show users:

show users

Show roles:

show roles

Check server status:

db.serverStatus()

Check build info:

db.version()
db.adminCommand({ buildInfo: 1 })

Check database stats:

db.stats()

Check collection stats:

db.collection.stats()

Kill operation:

db.killOp(opid)

19. MongoDB User Workflow

A MongoDB user workflow depends on role. Here are the main workflows.


19.1 Developer Workflow

  1. Understand application data.
  2. Design document model.
  3. Choose embedding or referencing.
  4. Create collections.
  5. Insert sample data.
  6. Write CRUD queries.
  7. Add indexes.
  8. Test with realistic data volume.
  9. Use explain() to inspect performance.
  10. Add schema validation where needed.
  11. Connect application using driver.
  12. Deploy with authentication and TLS.
  13. Monitor slow queries.

19.2 Application Workflow

A typical application flow:

User request
  ↓
Application route/controller
  ↓
Validate request
  ↓
MongoDB driver query
  ↓
MongoDB server
  ↓
Return document/result
  ↓
Application response

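Example: create a user (assumes an Express app with JSON body parsing enabled and db set to a connected database handle).

app.post("/users", async (req, res) => {
  const { name, email } = req.body;
  // Validate the request before touching the database.
  if (typeof name !== "string" || typeof email !== "string") {
    return res.status(400).json({ error: "name and email are required" });
  }
  try {
    const result = await db.collection("users").insertOne({
      name,
      email,
      createdAt: new Date()
    });
    res.status(201).json({ id: result.insertedId });
  } catch (error) {
    res.status(500).json({ error: "could not create user" });
  }
});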

19.3 Analyst Workflow

  1. Connect as a read-only user.
  2. Explore collections.
  3. Use aggregation pipeline.
  4. Export selected results.
  5. Build reports.
  6. Avoid running heavy, unindexed queries against production.
  7. Coordinate large analytics jobs with admins.

19.4 Admin Workflow

  1. Provision deployment.
  2. Configure storage and networking.
  3. Enable authentication.
  4. Create users and roles.
  5. Configure backups.
  6. Configure monitoring.
  7. Review indexes.
  8. Manage replica sets.
  9. Plan upgrades.
  10. Respond to incidents.

20. Performance Optimization

MongoDB performance depends on data model, indexes, working set, hardware, queries, and deployment architecture.


20.1 Use Proper Indexes

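Slow without a supporting index:

db.orders.find({ customerId: 12345 }).sort({ createdAt: -1 })

Without an index, this query may scan the whole collection and then sort the matching documents in memory.

Better: create a compound index that covers both the filter and the sort: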

db.orders.createIndex({ customerId: 1, createdAt: -1 })

20.2 Avoid Returning Too Much Data

Bad:

db.users.find()

Better:

db.users.find(
  { status: "active" },
  { name: 1, email: 1, _id: 0 }
).limit(100)

20.3 Avoid Unbounded Arrays

Bad design:

{
  userId: 1,
  events: [
    // ...thousands or millions of event objects
  ]
}

Better:

{
  userId: 1,
  eventType: "login",
  createdAt: ISODate("2026-01-01T10:00:00Z")
}

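Store large event streams as separate documents, one per event. An array that grows forever makes each update more expensive and will eventually hit the 16 MB document size limit.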
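
If you already have documents with a large embedded array, the conversion is mechanical. A minimal sketch, assuming a hypothetical helper `splitEvents` and made-up sample data:

```javascript
// Hypothetical migration helper: turns one document with an embedded
// events array into one small document per event.
function splitEvents(userDoc) {
  return userDoc.events.map((event) => ({
    userId: userDoc.userId,
    eventType: event.eventType,
    createdAt: event.createdAt
  }));
}

// Made-up sample document using the "bad design" shape.
const embedded = {
  userId: 1,
  events: [
    { eventType: "login", createdAt: new Date("2026-01-01T10:00:00Z") },
    { eventType: "logout", createdAt: new Date("2026-01-01T11:30:00Z") }
  ]
};

const eventDocs = splitEvents(embedded);
// In mongosh, eventDocs could then be written with insertMany()
// into a separate collection (the collection name is up to you).
```

Each resulting document stays small, and an index on userId and createdAt keeps per-user lookups fast.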


20.4 Use explain()

db.orders.find({ customerId: 1001 }).explain("executionStats")

Check:

  • Was an index used?
  • How many documents were scanned?
  • How long did it take?
  • Is there a collection scan?
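
These checks map onto a handful of fields in the explain output. The sketch below is one possible way to summarize them; `collectStages` and `summarizeExplain` are hypothetical helpers, the sample object is hand-written rather than real server output, and the helper only looks for the IXSCAN and COLLSCAN stage names.

```javascript
// Walks the winning plan and collects every stage name it finds.
function collectStages(plan, stages = []) {
  if (!plan || typeof plan !== "object") return stages;
  if (plan.stage) stages.push(plan.stage);
  if (plan.queryPlan) collectStages(plan.queryPlan, stages); // some servers nest the plan here
  if (plan.inputStage) collectStages(plan.inputStage, stages);
  (plan.inputStages || []).forEach((child) => collectStages(child, stages));
  return stages;
}

// Answers the four questions from the checklist.
function summarizeExplain(explainOutput) {
  const stats = explainOutput.executionStats;
  const stages = collectStages(explainOutput.queryPlanner.winningPlan);
  return {
    usedIndex: stages.includes("IXSCAN"),
    collectionScan: stages.includes("COLLSCAN"),
    docsExamined: stats.totalDocsExamined,
    keysExamined: stats.totalKeysExamined,
    returned: stats.nReturned,
    millis: stats.executionTimeMillis
  };
}

// Hand-written sample shaped like explain("executionStats") output.
const sample = {
  queryPlanner: {
    winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "customerId_1" } }
  },
  executionStats: { nReturned: 3, totalKeysExamined: 3, totalDocsExamined: 3, executionTimeMillis: 1 }
};

const summary = summarizeExplain(sample);
```

A healthy query usually examines about as many documents as it returns. If docsExamined is far larger than returned, or collectionScan is true on a big collection, an index is probably missing.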

20.5 Design for Query Patterns

Do not design MongoDB documents only by “normalization rules.” Design for the way your application reads and writes data.

Ask:

  • What are the top 10 queries?
  • What data is read together?
  • What data changes frequently?
  • What data grows forever?
  • What fields need indexes?
  • What fields need uniqueness?
  • What operations must be atomic?

21. Advanced MongoDB Features

21.1 TTL Indexes

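TTL indexes automatically delete documents after a period of time. The indexed field must hold a date, and a background task removes expired documents roughly once a minute, so deletion is not instantaneous.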

Example: delete logs after 30 days.

db.logs.createIndex(
  { createdAt: 1 },
  { expireAfterSeconds: 2592000 }
)
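
The value 2592000 is simply 30 days expressed in seconds. A small sketch of that arithmetic, using hypothetical helper names `daysToSeconds` and `expiresAt`:

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60; // 86,400

// Converts a retention period in days to an expireAfterSeconds value.
function daysToSeconds(days) {
  return days * SECONDS_PER_DAY;
}

// Earliest moment a document becomes eligible for TTL deletion.
function expiresAt(createdAt, expireAfterSeconds) {
  return new Date(createdAt.getTime() + expireAfterSeconds * 1000);
}

const ttlSeconds = daysToSeconds(30); // 2592000
const eligibleAt = expiresAt(new Date("2026-01-01T00:00:00Z"), ttlSeconds);
// eligibleAt.toISOString() is "2026-01-31T00:00:00.000Z"
```

Computing the value this way makes the retention period obvious to the next person who reads the index definition.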

21.2 Text Indexes

db.articles.createIndex({ title: "text", body: "text" })

Search (matches documents that contain either term):

db.articles.find({ $text: { $search: "mongodb tutorial" } })

21.3 Geospatial Indexes

db.places.createIndex({ location: "2dsphere" })

Example document:

{
  name: "Tokyo Station",
  location: {
    type: "Point",
    coordinates: [139.7671, 35.6812]
  }
}

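Find places within 1,000 meters. GeoJSON coordinates are written as [longitude, latitude], and $maxDistance is in meters: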

db.places.find({
  location: {
    $near: {
      $geometry: {
        type: "Point",
        coordinates: [139.7671, 35.6812]
      },
      $maxDistance: 1000
    }
  }
})
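
To get a feel for what a 1,000 meter radius means, you can approximate the distance check in plain JavaScript. This is only a sketch: it uses the haversine formula with a mean Earth radius, which is an approximation and not necessarily the exact computation the server performs, and the two test points are hypothetical.

```javascript
const EARTH_RADIUS_METERS = 6371000; // mean radius; an approximation

function toRadians(degrees) {
  return (degrees * Math.PI) / 180;
}

// Points are GeoJSON-style [longitude, latitude] pairs.
function haversineMeters([lng1, lat1], [lng2, lat2]) {
  const dLat = toRadians(lat2 - lat1);
  const dLng = toRadians(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(a));
}

const tokyoStation = [139.7671, 35.6812];
const nearPoint = [139.7771, 35.6812]; // hypothetical point 0.01 degrees east
const farPoint = [139.7871, 35.6812];  // hypothetical point 0.02 degrees east

// Mirrors the idea of $maxDistance: 1000.
const withinRadius = (point) => haversineMeters(tokyoStation, point) <= 1000;
```

With these inputs, nearPoint is roughly 900 meters away and falls inside the radius, while farPoint is roughly 1,800 meters away and falls outside it.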

21.4 Change Streams

Change streams allow applications to listen for database changes. They require a replica set or sharded cluster; a standalone server does not support them.

Example:

const changeStream = db.collection("orders").watch();

changeStream.on("change", change => {
  console.log(change);
});

Use cases:

  • Real-time notifications
  • Cache invalidation
  • Event-driven systems
  • Audit pipelines
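
For the cache invalidation use case, the handler mostly needs to branch on the event's operationType. A minimal sketch, assuming a hypothetical `applyChangeToCache` function and a plain Map as the cache; the events in the usage note are hand-written, not captured from a server.

```javascript
// Hypothetical cache-invalidation handler for change stream events.
function applyChangeToCache(cache, change) {
  // Events such as "drop" or "invalidate" carry no documentKey.
  if (!change.documentKey) return cache;

  const key = String(change.documentKey._id);
  switch (change.operationType) {
    case "insert":
    case "replace":
      // These events include the full document.
      cache.set(key, change.fullDocument);
      break;
    case "update":
    case "delete":
      // By default an update event lists only the changed fields,
      // so the safest move is to drop the stale entry.
      cache.delete(key);
      break;
    default:
      break;
  }
  return cache;
}

const cache = new Map();
applyChangeToCache(cache, {
  operationType: "insert",
  documentKey: { _id: 1 },
  fullDocument: { _id: 1, status: "paid" }
});
```

In the listener shown earlier, you would call `applyChangeToCache(cache, change)` inside the "change" callback instead of logging the event.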

21.5 Capped Collections

Capped collections have a fixed size and preserve insertion order. The size option is in bytes; once the collection is full, the oldest documents are overwritten.

db.createCollection("logs", {
  capped: true,
  size: 100000
})

Use for:

  • Logs
  • Event buffers
  • Temporary streams
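
The overwrite behavior is easiest to picture as a ring buffer. The class below is an in-memory analogy only: `CappedBuffer` is a hypothetical name, and it limits by document count (similar in spirit to the optional max setting) rather than by bytes.

```javascript
// In-memory analogy of a capped collection limited by document count.
class CappedBuffer {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  // Appends a document and evicts the oldest one when the buffer is full.
  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.maxDocs) {
      this.docs.shift();
    }
  }

  // Returns documents in insertion order, like a natural-order read.
  find() {
    return [...this.docs];
  }
}

const buffer = new CappedBuffer(3);
for (let i = 1; i <= 5; i += 1) {
  buffer.insert({ seq: i });
}
const remaining = buffer.find().map((doc) => doc.seq);
```

After five inserts into a buffer of three, only the three newest entries remain, still in the order they were written.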

22. Complete Mini Project: Bookstore Database

Now let’s build a small bookstore database.


22.1 Create Database

use bookstore

22.2 Create Collections

db.createCollection("users")
db.createCollection("books")
db.createCollection("orders")

22.3 Insert Users

db.users.insertMany([
  {
    name: "Alice",
    email: "alice@example.com",
    role: "customer",
    createdAt: new Date()
  },
  {
    name: "Bob",
    email: "bob@example.com",
    role: "customer",
    createdAt: new Date()
  }
])

22.4 Insert Books

db.books.insertMany([
  {
    title: "MongoDB Basics",
    author: "Alice Tanaka",
    category: "Database",
    price: 29.99,
    stock: 50,
    tags: ["mongodb", "nosql"]
  },
  {
    title: "Docker Practical Guide",
    author: "Maria Silva",
    category: "DevOps",
    price: 39.99,
    stock: 20,
    tags: ["docker", "containers"]
  },
  {
    title: "Linux for Beginners",
    author: "Ken Sato",
    category: "Linux",
    price: 24.99,
    stock: 100,
    tags: ["linux", "server"]
  }
])

22.5 Create Indexes

db.users.createIndex({ email: 1 }, { unique: true })
db.books.createIndex({ title: 1 })
db.books.createIndex({ category: 1, price: -1 })
db.orders.createIndex({ userId: 1, createdAt: -1 })

22.6 Create an Order

Find user:

const user = db.users.findOne({ email: "alice@example.com" })

Find books:

const book1 = db.books.findOne({ title: "MongoDB Basics" })
const book2 = db.books.findOne({ title: "Docker Practical Guide" })

Insert order:

db.orders.insertOne({
  userId: user._id,
  customerEmail: user.email,
  items: [
    {
      bookId: book1._id,
      title: book1.title,
      qty: 1,
      price: book1.price
    },
    {
      bookId: book2._id,
      title: book2.title,
      qty: 1,
      price: book2.price
    }
  ],
  total: book1.price + book2.price,
  status: "paid",
  createdAt: new Date()
})
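
The total above is written out by hand and assumes a quantity of 1 for each book. A slightly more general sketch computes it from the items array; `orderTotal` is a hypothetical helper, and rounding to cents with Math.round is a simplification. For real money handling, many teams store integer cents or use the Decimal128 type instead of floating-point numbers.

```javascript
// Sums qty * price per line item and rounds the result to cents.
function orderTotal(items) {
  const raw = items.reduce((sum, item) => sum + item.qty * item.price, 0);
  return Math.round(raw * 100) / 100;
}

// Sample items using the prices from the books inserted earlier.
const sampleItems = [
  { title: "MongoDB Basics", qty: 1, price: 29.99 },
  { title: "Docker Practical Guide", qty: 2, price: 39.99 }
];

const sampleTotal = orderTotal(sampleItems); // 109.97
```

In mongosh, the same function can be pasted in and used as `total: orderTotal(items)` when building the order document.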

22.7 Reduce Stock

db.books.updateOne(
  { _id: book1._id },
  { $inc: { stock: -1 } }
)

db.books.updateOne(
  { _id: book2._id },
  { $inc: { stock: -1 } }
)
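
As written, these updates will happily push stock below zero. One common safeguard is to put the stock check into the filter so the decrement only happens when enough copies remain. A minimal sketch, assuming a hypothetical `buildStockDecrement` helper and a placeholder id:

```javascript
// Builds the filter and update for a guarded stock decrement.
function buildStockDecrement(bookId, qty) {
  if (!Number.isInteger(qty) || qty <= 0) {
    throw new RangeError("qty must be a positive integer");
  }
  return {
    // Only match the book if enough stock remains.
    filter: { _id: bookId, stock: { $gte: qty } },
    update: { $inc: { stock: -qty } }
  };
}

// "book-1" is a placeholder; in mongosh you would pass book1._id.
const decrement = buildStockDecrement("book-1", 1);
// In mongosh:
//   const res = db.books.updateOne(decrement.filter, decrement.update)
//   res.modifiedCount === 0 means the book was missing or out of stock.
```

Because the check and the decrement happen in a single updateOne on one document, they are applied atomically. Keeping the order insert and the stock update consistent with each other is a separate concern that transactions address.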

22.8 Sales Report

db.orders.aggregate([
  { $match: { status: "paid" } },
  { $unwind: "$items" },
  {
    $group: {
      _id: "$items.title",
      totalSold: { $sum: "$items.qty" },
      revenue: {
        $sum: { $multiply: ["$items.qty", "$items.price"] }
      }
    }
  },
  { $sort: { revenue: -1 } }
])
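
If the pipeline stages feel abstract, it can help to see the same logic as ordinary JavaScript over an in-memory array. This is a teaching sketch, not how the server executes the pipeline; `salesReport` is a hypothetical function and the sample orders are made up.

```javascript
// Plain JavaScript equivalent of the $match, $unwind, $group, $sort stages.
function salesReport(orders) {
  const byTitle = new Map();
  for (const order of orders) {
    if (order.status !== "paid") continue;        // $match
    for (const item of order.items) {             // $unwind
      const row = byTitle.get(item.title) ||      // $group by title
        { _id: item.title, totalSold: 0, revenue: 0 };
      row.totalSold += item.qty;                  // $sum of qty
      row.revenue += item.qty * item.price;       // $sum of qty * price
      byTitle.set(item.title, row);
    }
  }
  // $sort by revenue, highest first
  return [...byTitle.values()].sort((a, b) => b.revenue - a.revenue);
}

// Made-up sample data with round prices to keep the arithmetic obvious.
const sampleOrders = [
  { status: "paid", items: [{ title: "A", qty: 1, price: 30 }, { title: "B", qty: 2, price: 40 }] },
  { status: "paid", items: [{ title: "A", qty: 3, price: 30 }] },
  { status: "cancelled", items: [{ title: "B", qty: 10, price: 40 }] }
];

const report = salesReport(sampleOrders);
```

For this sample, title A sold 4 copies for 120 in revenue and title B sold 2 copies for 80, so A is listed first. The cancelled order is ignored.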

23. MongoDB Best Practices

23.1 Data Modeling Best Practices

  • Model data around application queries.
  • Embed data that is read together.
  • Reference data that is large, shared, or independent.
  • Avoid unbounded arrays.
  • Keep document size reasonable.
  • Use schema validation for critical collections.
  • Store duplicate read-optimized fields when useful, but keep them consistent.

23.2 Query Best Practices

  • Use indexes for frequent queries.
  • Use projection to return only needed fields.
  • Use pagination for large result sets.
  • Avoid regex queries without a proper index strategy.
  • Avoid collection scans on large collections.
  • Use aggregation carefully on large datasets.
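
For the pagination point, one widely used approach is range-based pagination: instead of skipping rows, each page filters on the last _id seen. A minimal sketch, assuming a hypothetical `nextPageFilter` helper and placeholder id values:

```javascript
// Range-based pagination: filter on the last _id seen instead of using skip().
function nextPageFilter(baseFilter, lastSeenId) {
  if (lastSeenId === undefined || lastSeenId === null) {
    return { ...baseFilter }; // first page
  }
  return { ...baseFilter, _id: { $gt: lastSeenId } };
}

const firstPage = nextPageFilter({ status: "active" }, null);
const secondPage = nextPageFilter({ status: "active" }, 100); // 100 is a placeholder id
// In mongosh:
//   db.users.find(secondPage).sort({ _id: 1 }).limit(20)
```

The sort on _id must match the direction of the comparison, and the last _id of each page is passed back in to fetch the next one.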

23.3 Index Best Practices

  • Create indexes based on real queries.
  • Use compound indexes for filter + sort patterns.
  • Avoid too many indexes.
  • Remove unused indexes.
  • Use unique indexes for unique fields.
  • Test indexes with explain().

23.4 Security Best Practices

  • Enable authentication.
  • Use least-privilege roles.
  • Do not use admin users in applications.
  • Use TLS in production.
  • Rotate passwords and secrets.
  • Keep MongoDB patched.
  • Restrict network access.
  • Audit users and roles regularly.
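
Several of these points show up directly in the connection string an application uses. The sketch below builds one with authentication and TLS options; `buildMongoUri` is a hypothetical helper, and the user, password, and host are placeholders. In practice the password should come from a secret store or environment variable, never from source code.

```javascript
// Builds a connection string with percent-encoded credentials and TLS enabled.
function buildMongoUri({ user, password, host, port = 27017, database, authSource = "admin" }) {
  // Special characters in credentials must be percent-encoded.
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  const options = new URLSearchParams({ authSource, tls: "true" });
  return `mongodb://${credentials}@${host}:${port}/${database}?${options.toString()}`;
}

// Placeholder values for illustration only.
const uri = buildMongoUri({
  user: "appUser",
  password: "p@ss:word/1",
  host: "db.example.internal",
  database: "bookstore"
});
```

The application user here should be a least-privilege account for the bookstore database, not an administrative one.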

23.5 Backup Best Practices

  • Automate backups.
  • Test restores.
  • Store backups securely.
  • Use snapshots or managed backups for production.
  • Monitor backup jobs.
  • Document recovery procedures.

23.6 Production Best Practices

  • Use replica sets.
  • Monitor disk, CPU, memory, connections, locks, and replication lag.
  • Use proper indexes.
  • Plan capacity.
  • Avoid running without authentication.
  • Avoid public internet exposure.
  • Use sharding only when required.
  • Test upgrades before production.
  • Keep disaster recovery plans current.

24. Common Mistakes Beginners Make

Mistake 1: Thinking MongoDB Has No Schema

MongoDB has a flexible schema, but your application still needs a clear data model.

Mistake 2: Creating Too Many Indexes

Indexes speed up reads but slow down writes and use storage.

Mistake 3: Using MongoDB Like a Relational Database

If every query requires many joins, your model may not be document-oriented enough.

Mistake 4: Ignoring Backups

Backups are only useful if restores are tested.

Mistake 5: Running Without Authentication

Self-managed MongoDB access control is not enabled by default, so you must enable it for real deployments. (MongoDB)

Mistake 6: Bad Shard Key

A poor shard key can create uneven data distribution and performance hot spots.

Mistake 7: Not Using explain()

You cannot guess query performance reliably. Use explain().


25. MongoDB Cheat Sheet

Database Commands

show dbs
use mydb
db
db.dropDatabase()

Collection Commands

show collections
db.createCollection("users")
db.users.drop()

Insert

db.users.insertOne({ name: "Alice" })
db.users.insertMany([{ name: "Bob" }, { name: "Carol" }])

Read

db.users.find()
db.users.findOne({ name: "Alice" })
db.users.find({ age: { $gt: 18 } })

Projection

db.users.find({}, { name: 1, email: 1, _id: 0 })

Update

db.users.updateOne(
  { name: "Alice" },
  { $set: { city: "Tokyo" } }
)

db.users.updateMany(
  { active: true },
  { $set: { verified: true } }
)

Delete

db.users.deleteOne({ name: "Alice" })
db.users.deleteMany({ active: false })

Index

db.users.createIndex({ email: 1 }, { unique: true })
db.users.getIndexes()
db.users.dropIndex({ email: 1 })

Aggregation

db.orders.aggregate([
  { $match: { status: "paid" } },
  { $group: { _id: "$customer", total: { $sum: "$total" } } },
  { $sort: { total: -1 } }
])

Users

use admin

db.createUser({
  user: "adminUser",
  pwd: passwordPrompt(),
  roles: ["root"]
})

Backup

mongodump --db mydb --out /backup

Restore

mongorestore --nsInclude "mydb.*" /backup

26. Suggested Learning Path

Follow this order:

  1. Understand documents, collections, and databases.
  2. Install MongoDB locally with Docker.
  3. Learn mongosh.
  4. Practice CRUD.
  5. Learn filters, projections, sorting, and pagination.
  6. Learn data modeling.
  7. Learn indexes.
  8. Learn aggregation.
  9. Add schema validation.
  10. Add authentication and users.
  11. Learn backup and restore.
  12. Learn replica sets.
  13. Learn transactions.
  14. Learn sharding concepts.
  15. Learn monitoring and production administration.

Final Summary

MongoDB is a flexible, document-oriented database designed for modern applications. Its biggest strengths are flexible schema design, document-based storage, powerful queries, indexes, aggregation, replication, transactions, and horizontal scaling through sharding.

For beginners, start with Docker, mongosh, CRUD, and indexes. For intermediate users, focus on schema design, aggregation, validation, and security. For advanced users, learn replica sets, backup strategies, performance tuning, transactions, and sharding.

A strong MongoDB workflow is:

Design data model
  ↓
Create collections
  ↓
Insert documents
  ↓
Query and update data
  ↓
Add indexes
  ↓
Validate schema
  ↓
Secure users and roles
  ↓
Back up data
  ↓
Monitor performance
  ↓
Scale with replica sets and sharding

That is the practical end-to-end MongoDB journey: from first document to production-ready database.