Essential mongosh: show dbs (list databases), show collections (list collections), db.coll.find() (query documents), db.coll.insertOne() (insert). Key operations: db.coll.createIndex() (speed up queries), db.coll.aggregate() (data processing), mongodump (backup), mongorestore (restore). Always create indexes on fields you query often. Use the aggregation pipeline for anything more complex than a simple find.
Key Takeaways
- mongosh is the modern MongoDB shell. Use it instead of the legacy mongo shell
- Always create indexes on fields you filter, sort, or join on. Without them, MongoDB scans every document
- The aggregation pipeline is MongoDB's most powerful feature. Learn $match, $group, $lookup, and $project first
- Use mongodump and mongorestore for backups. Automate them with cron
- Schema validation exists in MongoDB. Use it from day one to prevent document drift
- Use explain('executionStats') to understand why a query is slow before adding indexes
I have been using MongoDB across different projects for years. Content management systems, product catalogs, event tracking pipelines, user profile stores. Every time I spin up a new collection, debug a slow query, or help someone get started, I end up looking up the same commands.
This cheat sheet is everything I keep coming back to. It covers the MongoDB commands and patterns you will actually use as a developer, with real examples. Not a documentation dump. Just the practical stuff.
Bookmark this page. You will come back to it.
Table of Contents
- Connecting to MongoDB
- Database and Collection Management
- Inserting Documents
- Querying Documents
- Query Operators
- Projection and Sorting
- Updating Documents
- Deleting Documents
- Array Operations
- Aggregation Pipeline
- Indexes
- Schema Validation
- Data Types
- Backup and Restore
- Import and Export
- Users, Roles, and Authentication
- Replication
- Sharding
- Performance and Troubleshooting
- Useful Monitoring Commands
- Quick Reference Table
Connecting to MongoDB
The mongosh shell is the modern way to interact with MongoDB. It became the default shell in MongoDB 6.0, and the legacy mongo shell was removed entirely. If you are still using the old shell, switch now.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
# Connect to local MongoDB (default localhost:27017)
mongosh
# Connect to a specific database
mongosh "mongodb://localhost:27017/mydb"
# Connect with username and password
mongosh "mongodb://username:password@localhost:27017/mydb"
# Connect to MongoDB Atlas
mongosh "mongodb+srv://username:password@cluster0.abc123.mongodb.net/mydb"
# Connect with authentication database
mongosh --host localhost --port 27017 -u admin -p password --authenticationDatabase admin
# Connect with TLS/SSL
mongosh "mongodb://localhost:27017/mydb" --tls --tlsCertificateKeyFile /path/to/cert.pem
If you are using MongoDB Atlas, the connection string is on your cluster dashboard under “Connect”. It looks like mongodb+srv://.... The +srv part handles DNS-based discovery so you do not need to list every node manually.
Quick environment setup for repeated connections:
1
2
3
4
5
# Set as environment variable
export MONGODB_URI="mongodb+srv://user:pass@cluster0.abc123.mongodb.net/mydb"
# Then just run
mongosh $MONGODB_URI
Database and Collection Management
MongoDB creates databases and collections on the fly when you first write data to them. There is no CREATE DATABASE or CREATE TABLE like in SQL. But you can still manage them explicitly.
Database Commands
1
2
3
4
5
6
7
8
9
10
11
12
13
14
// Show all databases (only shows databases with data)
show dbs
// Switch to a database (creates it on first write if it does not exist)
use mydb
// Show current database
db
// Drop the current database (careful, this is permanent)
db.dropDatabase()
// Get database stats
db.stats()
Collection Commands
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
// Show all collections in current database
show collections
// Create a collection explicitly
db.createCollection("users")
// Create a collection with options
db.createCollection("logs", {
capped: true,
size: 104857600, // 100 MB max size
max: 100000 // 100k documents max
})
// Get collection stats
db.users.stats()
// Rename a collection
db.users.renameCollection("customers")
// Drop a collection
db.users.drop()
// List collections with details
db.getCollectionInfos()
Capped collections are worth knowing about. They are fixed-size collections that automatically overwrite the oldest documents when they hit the size limit. They are useful for logs, metrics, and any data where you only care about the most recent entries.
Note: Unlike relational databases like PostgreSQL where you explicitly define tables and schemas upfront, MongoDB creates collections implicitly on first insert. This is convenient for development but can lead to typo-based collection names in production. Use schema validation to enforce structure.
Inserting Documents
MongoDB stores data as documents (JSON-like objects called BSON internally). Every document gets a unique _id field automatically if you do not provide one.
insertOne
1
2
3
4
5
6
7
8
9
10
11
// Insert a single document
db.users.insertOne({
name: "Ajit Singh",
email: "ajit@example.com",
age: 30,
role: "developer",
createdAt: new Date()
})
// The response includes the generated _id
// { acknowledged: true, insertedId: ObjectId("...") }
insertMany
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
// Insert multiple documents at once
db.users.insertMany([
{ name: "Alice", email: "alice@example.com", age: 28, role: "designer" },
{ name: "Bob", email: "bob@example.com", age: 35, role: "developer" },
{ name: "Charlie", email: "charlie@example.com", age: 42, role: "manager" }
])
// With ordered: false, MongoDB continues inserting even if one fails
db.users.insertMany(
[
{ _id: 1, name: "Alice" },
{ _id: 1, name: "Duplicate" }, // This will fail
{ _id: 2, name: "Bob" } // This still gets inserted
],
{ ordered: false }
)
insertMany is significantly faster than calling insertOne in a loop. MongoDB batches the writes internally. For bulk loading thousands of documents, always use insertMany or the bulk write API.
Querying Documents
The find method is what you will use most. It is the equivalent of SELECT in SQL.
Basic Queries
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
// Find all documents
db.users.find()
// Find with a condition (WHERE equivalent)
db.users.find({ role: "developer" })
// Find one document
db.users.findOne({ email: "ajit@example.com" })
// Find by _id
db.users.findOne({ _id: ObjectId("507f1f77bcf86cd799439011") })
// Count documents
db.users.countDocuments({ role: "developer" })
// Count all documents (faster, uses metadata)
db.users.estimatedDocumentCount()
// Check if any document matches
db.users.findOne({ email: "ajit@example.com" }) !== null
// Get distinct values
db.users.distinct("role")
// Returns: ["developer", "designer", "manager"]
Pretty Print and Formatting
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
// Format output for readability
db.users.find().pretty()
// Limit output
db.users.find().limit(5)
// Skip documents (pagination)
db.users.find().skip(10).limit(5)
// Sort results (1 = ascending, -1 = descending)
db.users.find().sort({ createdAt: -1 })
// Chain them together for pagination
db.users.find({ role: "developer" })
.sort({ name: 1 })
.skip(20)
.limit(10)
Query Operators
MongoDB query operators let you build complex filters. These are the ones you will use every day.
Comparison Operators
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
// Greater than
db.users.find({ age: { $gt: 30 } })
// Greater than or equal
db.users.find({ age: { $gte: 30 } })
// Less than
db.users.find({ age: { $lt: 30 } })
// Less than or equal
db.users.find({ age: { $lte: 30 } })
// Not equal
db.users.find({ role: { $ne: "admin" } })
// In a set of values
db.users.find({ role: { $in: ["developer", "designer"] } })
// Not in a set
db.users.find({ role: { $nin: ["admin", "banned"] } })
Logical Operators
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
// AND (implicit, just add fields)
db.users.find({ role: "developer", age: { $gt: 25 } })
// AND (explicit)
db.users.find({
$and: [
{ role: "developer" },
{ age: { $gt: 25 } }
]
})
// OR
db.users.find({
$or: [
{ role: "admin" },
{ role: "moderator" }
]
})
// NOT
db.users.find({ age: { $not: { $gt: 30 } } })
// NOR (none of the conditions match)
db.users.find({
$nor: [
{ role: "banned" },
{ age: { $lt: 18 } }
]
})
Element Operators
1
2
3
4
5
6
7
8
// Field exists
db.users.find({ phone: { $exists: true } })
// Field does not exist
db.users.find({ phone: { $exists: false } })
// Field is a specific BSON type
db.users.find({ age: { $type: "int" } })
String Matching
1
2
3
4
5
6
7
8
9
10
11
12
13
14
// Regex match
db.users.find({ email: { $regex: /@gmail\.com$/ } })
// Case-insensitive regex
db.users.find({ name: { $regex: /^ajit/i } })
// Text search (requires text index)
db.articles.find({ $text: { $search: "mongodb tutorial" } })
// Text search with score
db.articles.find(
{ $text: { $search: "mongodb aggregation" } },
{ score: { $meta: "textScore" } }
).sort({ score: { $meta: "textScore" } })
Query Operators Quick Reference
| Operator | Description | Example |
|---|---|---|
$eq |
Equals | { age: { $eq: 30 } } |
$ne |
Not equals | { role: { $ne: "admin" } } |
$gt / $gte |
Greater than (or equal) | { age: { $gt: 25 } } |
$lt / $lte |
Less than (or equal) | { age: { $lt: 50 } } |
$in |
Matches any in array | { role: { $in: ["a", "b"] } } |
$nin |
Matches none in array | { role: { $nin: ["x"] } } |
$and |
All conditions match | { $and: [{...}, {...}] } |
$or |
Any condition matches | { $or: [{...}, {...}] } |
$not |
Negates a condition | { age: { $not: { $gt: 30 } } } |
$exists |
Field exists | { phone: { $exists: true } } |
$regex |
Pattern match | { name: { $regex: /^A/ } } |
$text |
Full text search | { $text: { $search: "term" } } |
$elemMatch |
Array element match | { scores: { $elemMatch: { $gt: 80 } } } |
Projection and Sorting
Projection controls which fields come back in your query results. This is like SELECT column1, column2 in SQL instead of SELECT *.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
// Return only name and email (1 = include)
db.users.find({}, { name: 1, email: 1 })
// Exclude specific fields (0 = exclude)
db.users.find({}, { password: 0, internalNotes: 0 })
// You cannot mix include and exclude (except for _id)
// This is valid: include fields but exclude _id
db.users.find({}, { name: 1, email: 1, _id: 0 })
// Sort ascending by name
db.users.find().sort({ name: 1 })
// Sort descending by creation date
db.users.find().sort({ createdAt: -1 })
// Multi-field sort
db.users.find().sort({ role: 1, name: 1 })
// Combine projection, sort, skip, limit
db.users.find(
{ role: "developer" },
{ name: 1, email: 1, _id: 0 }
).sort({ name: 1 }).skip(0).limit(10)
Warning: Using skip for pagination gets slow on large collections because MongoDB still has to scan through the skipped documents. For large datasets, use range-based pagination with a cursor:
1
2
3
4
5
// Instead of skip-based pagination
db.users.find().sort({ _id: 1 }).skip(100000).limit(10) // Slow
// Use cursor-based pagination
db.users.find({ _id: { $gt: lastSeenId } }).sort({ _id: 1 }).limit(10) // Fast
Updating Documents
MongoDB gives you fine-grained control over updates. You can modify specific fields without replacing the entire document.
updateOne and updateMany
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
// Update a single field
db.users.updateOne(
{ email: "ajit@example.com" },
{ $set: { role: "admin" } }
)
// Update multiple fields
db.users.updateOne(
{ email: "ajit@example.com" },
{ $set: { role: "admin", updatedAt: new Date() } }
)
// Update all matching documents
db.users.updateMany(
{ role: "intern" },
{ $set: { role: "junior_developer" } }
)
// Increment a numeric field
db.products.updateOne(
{ _id: productId },
{ $inc: { views: 1 } }
)
// Multiply a field
db.products.updateOne(
{ _id: productId },
{ $mul: { price: 1.10 } } // 10% price increase
)
// Set field only if the new value is less than current (useful for tracking minimums)
db.scores.updateOne(
{ playerId: 1 },
{ $min: { bestTime: 45.2 } }
)
// Set field only if the new value is greater (useful for tracking maximums)
db.scores.updateOne(
{ playerId: 1 },
{ $max: { highScore: 9500 } }
)
// Remove a field
db.users.updateOne(
{ email: "ajit@example.com" },
{ $unset: { temporaryField: "" } }
)
// Rename a field
db.users.updateMany(
{},
{ $rename: { "name": "fullName" } }
)
Upsert
Upsert inserts a document if no match is found, or updates it if a match exists. This is one of the most useful operations in MongoDB.
1
2
3
4
5
6
7
8
9
// Upsert: update if exists, insert if not
db.users.updateOne(
{ email: "newuser@example.com" },
{
$set: { name: "New User", role: "developer" },
$setOnInsert: { createdAt: new Date() }
},
{ upsert: true }
)
The $setOnInsert operator only applies when a new document is inserted, not on updates. This is useful for setting creation timestamps.
replaceOne
1
2
3
4
5
6
7
8
9
10
// Replace entire document (except _id)
db.users.replaceOne(
{ email: "ajit@example.com" },
{
name: "Ajit Singh",
email: "ajit@example.com",
role: "admin",
updatedAt: new Date()
}
)
Careful: replaceOne replaces the entire document. Any fields not in the replacement document are gone. Use updateOne with $set when you only want to change specific fields.
findOneAndUpdate
1
2
3
4
5
6
// Update and return the modified document in one operation
db.users.findOneAndUpdate(
{ email: "ajit@example.com" },
{ $set: { lastLogin: new Date() } },
{ returnDocument: "after" } // Returns the document AFTER the update
)
This is the MongoDB equivalent of PostgreSQL’s UPDATE ... RETURNING *. It is atomic, so no other operation can modify the document between the find and the update.
Deleting Documents
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
// Delete one document
db.users.deleteOne({ email: "old@example.com" })
// Delete all matching documents
db.users.deleteMany({ role: "banned" })
// Delete all documents in a collection (keeps the collection and indexes)
db.users.deleteMany({})
// Drop the entire collection (faster than deleteMany for clearing everything)
db.users.drop()
// Find and delete in one atomic operation
db.queue.findOneAndDelete(
{ status: "pending" },
{ sort: { createdAt: 1 } } // Delete the oldest pending item
)
findOneAndDelete is useful for implementing job queues. You atomically grab and remove the next item, so no two workers process the same job.
Array Operations
Arrays are a first-class feature in MongoDB. You can query inside them, update individual elements, and use them in aggregations without any extra setup.
Querying Arrays
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
// Match if array contains a value
db.users.find({ tags: "javascript" })
// Match if array contains ALL of these values
db.users.find({ tags: { $all: ["javascript", "mongodb"] } })
// Match by array size
db.users.find({ tags: { $size: 3 } })
// Match array element by condition
db.users.find({
scores: { $elemMatch: { $gt: 80, $lt: 100 } }
})
// Query nested documents in arrays
db.orders.find({
"items.productId": "prod123"
})
Modifying Arrays
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
// Add to array
db.users.updateOne(
{ _id: userId },
{ $push: { tags: "python" } }
)
// Add multiple values
db.users.updateOne(
{ _id: userId },
{ $push: { tags: { $each: ["python", "rust", "go"] } } }
)
// Add to array only if value does not already exist
db.users.updateOne(
{ _id: userId },
{ $addToSet: { tags: "javascript" } }
)
// Remove from array by value
db.users.updateOne(
{ _id: userId },
{ $pull: { tags: "php" } }
)
// Remove from array by condition
db.users.updateOne(
{ _id: userId },
{ $pull: { scores: { $lt: 50 } } }
)
// Remove last element from array
db.users.updateOne(
{ _id: userId },
{ $pop: { tags: 1 } } // 1 = last, -1 = first
)
// Update a specific array element by position
db.users.updateOne(
{ _id: userId },
{ $set: { "tags.0": "typescript" } } // Update first element
)
// Update first matching array element
db.users.updateOne(
{ _id: userId, "scores.subject": "math" },
{ $set: { "scores.$.grade": "A" } }
)
Aggregation Pipeline
The aggregation pipeline is the most powerful feature in MongoDB. It lets you process documents through a sequence of stages, transforming and computing results step by step. Think of it as piping data through a series of functions.
flowchart LR
A["fa:fa-database Collection"] --> B["$match\nFilter docs"]
B --> C["$group\nAggregate"]
C --> D["$sort\nOrder results"]
D --> E["$project\nReshape output"]
E --> F["fa:fa-file-alt Results"]
style A fill:#dbeafe,stroke:#3b82f6,color:#1e3a5f
style B fill:#e0f2fe,stroke:#0284c7,color:#0c4a6e
style C fill:#dcfce7,stroke:#16a34a,color:#14532d
style D fill:#fef3c7,stroke:#d97706,color:#78350f
style E fill:#fce7f3,stroke:#db2777,color:#831843
style F fill:#dbeafe,stroke:#3b82f6,color:#1e3a5f
$match (Filter)
Always put $match as early as possible. It reduces the number of documents flowing through the rest of the pipeline.
1
2
3
4
// Filter orders from 2025
db.orders.aggregate([
{ $match: { createdAt: { $gte: ISODate("2025-01-01") } } }
])
$group (Aggregate)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
// Total revenue by status
db.orders.aggregate([
{
$group: {
_id: "$status",
totalRevenue: { $sum: "$total" },
orderCount: { $count: {} },
avgOrder: { $avg: "$total" },
maxOrder: { $max: "$total" }
}
}
])
// Group by month
db.orders.aggregate([
{
$group: {
_id: {
year: { $year: "$createdAt" },
month: { $month: "$createdAt" }
},
revenue: { $sum: "$total" },
orders: { $count: {} }
}
},
{ $sort: { "_id.year": -1, "_id.month": -1 } }
])
$project (Reshape)
1
2
3
4
5
6
7
8
9
10
11
// Reshape output with computed fields
db.users.aggregate([
{
$project: {
fullName: { $concat: ["$firstName", " ", "$lastName"] },
email: 1,
ageInMonths: { $multiply: ["$age", 12] },
_id: 0
}
}
])
$lookup (Join Collections)
$lookup is the MongoDB equivalent of a SQL JOIN. Use it to pull in data from another collection.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
// Join orders with user details
db.orders.aggregate([
{
$lookup: {
from: "users",
localField: "userId",
foreignField: "_id",
as: "user"
}
},
{ $unwind: "$user" }, // Flatten the array to a single object
{
$project: {
total: 1,
status: 1,
"user.name": 1,
"user.email": 1
}
}
])
$unwind (Flatten Arrays)
1
2
3
4
5
6
7
8
9
10
11
12
// Flatten an array field to create one document per array element
db.orders.aggregate([
{ $unwind: "$items" },
{
$group: {
_id: "$items.productId",
totalSold: { $sum: "$items.quantity" },
revenue: { $sum: { $multiply: ["$items.price", "$items.quantity"] } }
}
},
{ $sort: { revenue: -1 } }
])
$facet (Multiple Pipelines)
Run multiple aggregation pipelines on the same input in a single pass. Great for dashboards.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
// Get stats and top items in one query
db.orders.aggregate([
{
$facet: {
summary: [
{
$group: {
_id: null,
totalOrders: { $count: {} },
totalRevenue: { $sum: "$total" },
avgOrder: { $avg: "$total" }
}
}
],
byStatus: [
{ $group: { _id: "$status", count: { $count: {} } } },
{ $sort: { count: -1 } }
],
recentOrders: [
{ $sort: { createdAt: -1 } },
{ $limit: 5 },
{ $project: { total: 1, status: 1, createdAt: 1 } }
]
}
}
])
Full Pipeline Example
Here is a real-world aggregation that computes monthly revenue breakdown:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
db.orders.aggregate([
// Stage 1: Filter completed orders
{ $match: { status: "completed" } },
// Stage 2: Group by month
{
$group: {
_id: {
year: { $year: "$createdAt" },
month: { $month: "$createdAt" }
},
revenue: { $sum: "$total" },
orders: { $count: {} },
avgOrderValue: { $avg: "$total" }
}
},
// Stage 3: Sort chronologically
{ $sort: { "_id.year": 1, "_id.month": 1 } },
// Stage 4: Reshape for readability
{
$project: {
_id: 0,
year: "$_id.year",
month: "$_id.month",
revenue: { $round: ["$revenue", 2] },
orders: 1,
avgOrderValue: { $round: ["$avgOrderValue", 2] }
}
}
])
Aggregation Pipeline Best Practices
- $match first. Filter early to reduce the number of documents flowing through the pipeline. This is the single biggest performance win
- $project early. Drop fields you do not need before expensive stages like
$groupand$lookup - $lookup late. Join collections after grouping and filtering, not before. Joining 10,000 documents before filtering is wasteful
- Index your $match fields. Only
$matchat the beginning of a pipeline (or immediately after$sort) can use indexes - Watch memory. Aggregation stages that accumulate data ($group, $sort on large datasets) have a 100MB memory limit by default. Use
{ allowDiskUse: true }for large pipelines
1
2
// Allow disk use for large aggregations
db.orders.aggregate([...], { allowDiskUse: true })
Indexes
Indexes are the difference between a query that takes 5ms and one that takes 5 seconds. Without an index, MongoDB does a collection scan (reads every document). With an index, it jumps directly to matching documents.
For a deeper dive into how indexes work at the data structure level, see How Database Indexing Works.
flowchart LR
subgraph Without["Without Index"]
direction TB
W1["Query: find age > 30"] --> W2["Scan all 1M documents"]
W2 --> W3["Return 50k matches"]
W4["Time: 800ms"]
end
subgraph With["With Index"]
direction TB
I1["Query: find age > 30"] --> I2["B-tree lookup"]
I2 --> I3["Return 50k matches"]
I4["Time: 12ms"]
end
style Without fill:#fee2e2,stroke:#dc2626,color:#7f1d1d
style With fill:#dcfce7,stroke:#16a34a,color:#14532d
Creating Indexes
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
// Single field index (ascending)
db.users.createIndex({ email: 1 })
// Single field index (descending)
db.users.createIndex({ createdAt: -1 })
// Unique index (enforces uniqueness like a SQL UNIQUE constraint)
db.users.createIndex({ email: 1 }, { unique: true })
// Compound index (multi-field)
db.orders.createIndex({ userId: 1, createdAt: -1 })
// Text index (for full-text search)
db.articles.createIndex({ title: "text", body: "text" })
// TTL index (auto-delete documents after a time period)
db.sessions.createIndex(
{ createdAt: 1 },
{ expireAfterSeconds: 3600 } // Delete after 1 hour
)
// Partial index (only index documents matching a condition)
db.orders.createIndex(
{ createdAt: -1 },
{ partialFilterExpression: { status: "pending" } }
)
// Sparse index (only index documents that have the field)
db.users.createIndex(
{ phone: 1 },
{ sparse: true }
)
// Hashed index (for hash-based sharding)
db.users.createIndex({ userId: "hashed" })
// Wildcard index (index all fields in a subdocument)
db.products.createIndex({ "attributes.$**": 1 })
Managing Indexes
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
// List all indexes on a collection
db.users.getIndexes()
// Get index usage statistics
db.users.aggregate([{ $indexStats: {} }])
// Drop an index by name
db.users.dropIndex("email_1")
// Drop all indexes except _id
db.users.dropIndexes()
// Hide an index (MongoDB 4.4+, test impact without dropping)
db.users.hideIndex("email_1")
// Unhide it
db.users.unhideIndex("email_1")
Index Design Rules
- Index fields you query on. If a field appears in
find(),$match, orsort(), it likely needs an index - Compound index order matters. For
{ userId: 1, createdAt: -1 }, queries onuserIdalone use the index. Queries on onlycreatedAtdo not. This is the leftmost prefix rule - Use TTL indexes for expiring data. Sessions, OTPs, temporary tokens. Let MongoDB clean them up automatically
- Do not over-index. Every index slows down writes because MongoDB updates all indexes on every insert, update, and delete
- Do not index low-cardinality fields alone. A boolean field like
isActivehas only two values. An index on it barely helps
Audit your indexes regularly. If $indexStats shows an index with zero usage, drop it. It is just slowing down writes and wasting memory.
Schema Validation
MongoDB is schema-flexible, not schema-less. In production, you should enforce structure using JSON Schema validation. Without it, different parts of your application will write different shapes of documents and you will spend months cleaning up the mess.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
// Create a collection with validation
db.createCollection("users", {
validator: {
$jsonSchema: {
bsonType: "object",
required: ["name", "email", "role"],
properties: {
name: {
bsonType: "string",
description: "must be a string and is required"
},
email: {
bsonType: "string",
pattern: "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$",
description: "must be a valid email"
},
age: {
bsonType: "int",
minimum: 0,
maximum: 150,
description: "must be an integer between 0 and 150"
},
role: {
bsonType: "string",
enum: ["developer", "designer", "manager", "admin"],
description: "must be one of the allowed roles"
}
}
}
},
validationLevel: "strict", // strict or moderate
validationAction: "error" // error or warn
})
// Add validation to an existing collection
db.runCommand({
collMod: "users",
validator: {
$jsonSchema: {
bsonType: "object",
required: ["name", "email"],
properties: {
name: { bsonType: "string" },
email: { bsonType: "string" }
}
}
},
validationLevel: "moderate" // Only validates new inserts and updates to valid docs
})
validationLevel options:
strict: validates all inserts and updatesmoderate: only validates inserts and updates to documents that already match the schema. Existing invalid documents are left alone
validationAction options:
error: rejects writes that violate the schemawarn: allows the write but logs a warning
If you are coming from PostgreSQL where the schema is enforced by CREATE TABLE, think of MongoDB schema validation as the equivalent. It is opt-in, but in production, you should always opt in.
Data Types
MongoDB uses BSON (Binary JSON), which supports more types than plain JSON.
| Type | Usage | Example |
|---|---|---|
String |
Text data | "hello" |
Int32 |
32-bit integer | NumberInt(42) |
Int64 / Long |
64-bit integer | NumberLong(9007199254740993) |
Double |
Floating point | 3.14 |
Decimal128 |
High-precision decimal | NumberDecimal("19.99") |
Boolean |
True/false | true |
Date |
UTC datetime | new Date() or ISODate() |
ObjectId |
12-byte unique ID | ObjectId("507f1f77bcf86cd799439011") |
Array |
Ordered list | [1, 2, 3] |
Object |
Embedded document | { key: "value" } |
Null |
Null value | null |
Binary |
Binary data | BinData(0, "base64data") |
Regex |
Regular expression | /pattern/i |
Timestamp |
Internal MongoDB type | Timestamp(1234567890, 1) |
Practical advice from experience:
- Use
NumberDecimalfor money. Just like in PostgreSQL where you use NUMERIC, never use floating point for financial data.0.1 + 0.2 !== 0.3in floating point - Use
ISODate()for dates, not strings. Dates stored as strings cannot be sorted or compared correctly and cannot use date operators in aggregation - Let MongoDB generate
_idunless you have a reason not to. ObjectId gives you uniqueness, a timestamp (embedded in the first 4 bytes), and natural chronological ordering - Use
NumberLongfor large integers. JavaScript numbers lose precision beyond 2^53. If your IDs or counters are large, useNumberLongexplicitly
For a deeper look at how databases store these types on disk, see How Databases Store Data Internally.
Backup and Restore
Regular backups are not optional. Here is how to do them properly with MongoDB tools.
mongodump (Binary Backup)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
# Dump entire server
mongodump --out=/backups/full_backup/
# Dump a specific database
mongodump --db=mydb --out=/backups/
# Dump a specific collection
mongodump --db=mydb --collection=users --out=/backups/
# Dump with compression (much smaller files)
mongodump --db=mydb --gzip --out=/backups/
# Dump from a remote server
mongodump --uri="mongodb+srv://user:pass@cluster0.abc123.mongodb.net/mydb" --out=/backups/
# Dump with query filter
mongodump --db=mydb --collection=orders --query='{"createdAt":{"$gte":{"$date":"2025-01-01T00:00:00Z"}}}'
mongorestore (Restore from Backup)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
# Restore entire backup
mongorestore /backups/full_backup/
# Restore a specific database
mongorestore --db=mydb /backups/mydb/
# Restore a specific collection
mongorestore --db=mydb --collection=users /backups/mydb/users.bson
# Restore compressed backup
mongorestore --gzip /backups/
# Drop existing data before restoring
mongorestore --drop --db=mydb /backups/mydb/
# Restore to a different database
mongorestore --db=mydb_restored /backups/mydb/
Automate backups with cron:
1
2
3
# Add to crontab: crontab -e
# Run nightly at 2 AM
0 2 * * * mongodump --db=mydb --gzip --out=/backups/$(date +\%Y\%m\%d)/
For MongoDB Atlas, you get automatic backups and point-in-time recovery built into the managed service. Use the Atlas UI or API to configure backup schedules and retention policies.
Import and Export
mongoexport and mongoimport work with JSON and CSV files. Use them for moving data in and out of MongoDB in human-readable formats.
Export
1
2
3
4
5
6
7
8
9
10
11
# Export collection to JSON
mongoexport --db=mydb --collection=users --out=users.json
# Export to CSV
mongoexport --db=mydb --collection=users --type=csv --fields=name,email,role --out=users.csv
# Export with a query filter
mongoexport --db=mydb --collection=orders --query='{"status":"completed"}' --out=completed_orders.json
# Export pretty-printed JSON
mongoexport --db=mydb --collection=users --out=users.json --pretty
Import
1
2
3
4
5
6
7
8
9
10
11
12
13
14
# Import from JSON
mongoimport --db=mydb --collection=users --file=users.json
# Import from CSV with header row
mongoimport --db=mydb --collection=users --type=csv --headerline --file=users.csv
# Import and drop existing collection first
mongoimport --db=mydb --collection=users --drop --file=users.json
# Import JSON array
mongoimport --db=mydb --collection=users --jsonArray --file=users_array.json
# Upsert during import (update if exists, insert if not)
mongoimport --db=mydb --collection=users --mode=upsert --upsertFields=email --file=users.json
When to use what:
mongodump/mongorestore: for full backups and restores. Binary format, preserves all BSON typesmongoexport/mongoimport: for moving data between systems, sharing with other tools, or loading CSV data
Users, Roles, and Authentication
MongoDB uses a role-based access control system. Enable authentication in production. Running MongoDB without auth is one of the most common security mistakes.
Managing Users
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
// Create an admin user
use admin
db.createUser({
user: "admin",
pwd: "secure_password",
roles: [{ role: "userAdminAnyDatabase", db: "admin" }]
})
// Create an application user with specific database access
use mydb
db.createUser({
user: "appuser",
pwd: "app_password",
roles: [{ role: "readWrite", db: "mydb" }]
})
// Create a read-only user
db.createUser({
user: "analyst",
pwd: "analyst_password",
roles: [{ role: "read", db: "mydb" }]
})
// View all users in current database
db.getUsers()
// Change a user's password
db.changeUserPassword("appuser", "new_secure_password")
// Grant additional roles
db.grantRolesToUser("appuser", [{ role: "read", db: "analytics" }])
// Revoke roles
db.revokeRolesFromUser("appuser", [{ role: "read", db: "analytics" }])
// Drop a user
db.dropUser("olduser")
Built-in Roles
| Role | Access | Use Case |
|---|---|---|
read |
Read-only on a database | Analytics, reporting |
readWrite |
Read and write on a database | Application user |
dbAdmin |
Schema and index management | DBA tasks |
userAdmin |
Manage users on a database | User management |
clusterAdmin |
Manage replica sets and shards | Cluster operations |
readAnyDatabase |
Read on all databases | Cross-database analytics |
readWriteAnyDatabase |
Read/write on all databases | Super app user |
userAdminAnyDatabase |
Manage users on all databases | Admin |
root |
Full access to everything | Superuser (use sparingly) |
A practical setup for a web application:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
// Application user: can read/write data but cannot manage schema
use mydb
db.createUser({
user: "webapp",
pwd: "app_password",
roles: [{ role: "readWrite", db: "mydb" }]
})
// Read-only user for analytics dashboards
db.createUser({
user: "dashboard",
pwd: "dashboard_password",
roles: [{ role: "read", db: "mydb" }]
})
// Admin for database management
use admin
db.createUser({
user: "dba",
pwd: "dba_password",
roles: [
{ role: "dbAdmin", db: "mydb" },
{ role: "userAdmin", db: "mydb" }
]
})
Replication
Replication gives you high availability and data redundancy. A MongoDB replica set is a group of nodes that maintain the same data. One node is the primary (handles writes), and the others are secondaries (replicate data and handle reads if configured).
flowchart TB
App["fa:fa-desktop Application"] --> Primary["fa:fa-server Primary\nReads + Writes"]
Primary -->|"replication"| S1["fa:fa-server Secondary 1\nReads"]
Primary -->|"replication"| S2["fa:fa-server Secondary 2\nReads"]
Primary -.->|"if primary fails"| Election["fa:fa-vote-yea Election"]
Election -.->|"promotes one\nsecondary"| Primary
style Primary fill:#dbeafe,stroke:#3b82f6,color:#1e3a5f
style S1 fill:#e0f2fe,stroke:#0284c7,color:#0c4a6e
style S2 fill:#e0f2fe,stroke:#0284c7,color:#0c4a6e
style App fill:#f8fafc,stroke:#64748b,color:#1e293b
style Election fill:#fef3c7,stroke:#d97706,color:#78350f
Replica Set Commands
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
// Check replica set status
rs.status()
// Check who is primary (MongoDB 5.0+)
db.hello()
// Legacy equivalent (deprecated in 5.0)
// rs.isMaster()
// Initiate a replica set
rs.initiate({
_id: "myReplicaSet",
members: [
{ _id: 0, host: "mongo1:27017" },
{ _id: 1, host: "mongo2:27017" },
{ _id: 2, host: "mongo3:27017" }
]
})
// Add a new member
rs.add("mongo4:27017")
// Remove a member
rs.remove("mongo4:27017")
// Add an arbiter (votes in elections but holds no data)
rs.addArb("arbiter:27017")
// Get replica set configuration
rs.conf()
// Step down primary (triggers election)
rs.stepDown()
// Set read preference to read from secondaries
db.getMongo().setReadPref("secondaryPreferred")
Read preferences:
primary: all reads go to primary (default)primaryPreferred: read from primary, fall back to secondarysecondary: all reads go to secondariessecondaryPreferred: read from secondary, fall back to primarynearest: read from the node with the lowest network latency
Always use an odd number of voting members (3, 5, 7) to avoid split-brain scenarios during elections. For more on why odd numbers matter in distributed systems, see Majority Quorum in Distributed Systems.
Sharding
Sharding distributes data across multiple servers. Use it when a single replica set cannot handle your data volume or throughput.
flowchart TB
App["fa:fa-desktop Application"] --> Router["fa:fa-random mongos\nQuery Router"]
Router --> S1["fa:fa-database Shard 1\nUsers A-M"]
Router --> S2["fa:fa-database Shard 2\nUsers N-Z"]
Config["fa:fa-cog Config Servers\nMetadata"]
Router --> Config
style Router fill:#fef3c7,stroke:#d97706,color:#78350f
style S1 fill:#dbeafe,stroke:#3b82f6,color:#1e3a5f
style S2 fill:#dbeafe,stroke:#3b82f6,color:#1e3a5f
style Config fill:#f1f5f9,stroke:#64748b,color:#1e293b
style App fill:#f8fafc,stroke:#64748b,color:#1e293b
Sharding Commands
1
2
3
4
5
6
7
8
9
10
11
12
13
14
// Enable sharding on a database
sh.enableSharding("mydb")
// Shard a collection with a hashed shard key
sh.shardCollection("mydb.users", { userId: "hashed" })
// Shard a collection with a range-based shard key
sh.shardCollection("mydb.orders", { createdAt: 1 })
// Check sharding status
sh.status()
// Check data distribution
db.users.getShardDistribution()
Choosing a Shard Key
The shard key determines how MongoDB distributes documents across shards. Getting it wrong is expensive to fix because you cannot change it without migrating data.
- High cardinality. The key should have many distinct values. A boolean field is a terrible shard key
- Even distribution. Data should spread evenly across shards. Monotonically increasing keys (like timestamps) create hot spots
- Query isolation. Queries should be able to target a single shard using the shard key. Scatter-gather queries (hitting all shards) are slow
- Avoid monotonically increasing keys. Using
_id(ObjectId) orcreatedAtas a shard key sends all new writes to the same shard
For a deeper look at how data gets distributed evenly, see Consistent Hashing Explained.
Performance and Troubleshooting
Knowing how to diagnose slow queries is one of the most valuable MongoDB skills you can have. The process is similar to what you would do in PostgreSQL with EXPLAIN ANALYZE, but the tools are different.
explain()
explain() shows you what MongoDB is doing with your query. Always use it when debugging performance.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
// Basic execution stats
db.users.find({ email: "ajit@example.com" }).explain("executionStats")
// Key things to look for in the output:
// - queryPlanner.winningPlan.stage: "COLLSCAN" = bad (no index), "IXSCAN" = good
// - executionStats.nReturned: how many documents were returned
// - executionStats.totalDocsExamined: how many documents MongoDB read
// - executionStats.executionTimeMillis: how long the query took
// Explain an aggregation pipeline
db.orders.explain("executionStats").aggregate([
{ $match: { status: "pending" } },
{ $group: { _id: "$userId", total: { $sum: "$amount" } } }
])
Red flags in explain output:
| What You See | What It Means | What To Do |
|---|---|---|
stage: "COLLSCAN" |
No index used, scanning every document | Add an index on the query fields |
totalDocsExamined >> nReturned |
MongoDB reads way more docs than it returns | Refine your index or query |
executionTimeMillis > 100 |
Query is slow for most use cases | Check indexing, reduce data scanned |
stage: "SORT" with no SORT_KEY_GENERATOR |
In-memory sort, not using an index | Add an index that covers the sort |
Profiler
The profiler captures slow queries so you can find bottlenecks.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
// Enable profiler for queries slower than 100ms
db.setProfilingLevel(1, { slowms: 100 })
// Enable profiler for all queries (careful in production)
db.setProfilingLevel(2)
// Check current profiling level
db.getProfilingStatus()
// View slow queries
db.system.profile.find().sort({ ts: -1 }).limit(5).pretty()
// Find the slowest queries
db.system.profile.find().sort({ millis: -1 }).limit(10)
// Disable profiler
db.setProfilingLevel(0)
Current Operations
1
2
3
4
5
6
7
8
9
10
11
// See all running operations
db.currentOp()
// Find long-running operations
db.currentOp({
"active": true,
"secs_running": { $gt: 5 }
})
// Kill a specific operation
db.killOp(opId)
Useful Monitoring Commands
These are the commands I run first when something seems off with the database.
Server Status
1
2
3
4
5
6
7
8
9
10
11
// Overall server status
db.serverStatus()
// Connection count
db.serverStatus().connections
// Memory usage
db.serverStatus().mem
// Operation counters
db.serverStatus().opcounters
Collection and Index Sizes
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
// Collection stats including size and index details
db.users.stats()
// Total data size for a collection (in bytes)
db.users.totalSize()
// Total index size
db.users.totalIndexSize()
// Quick overview of all collections with sizes
db.getCollectionNames().forEach(function(coll) {
var stats = db[coll].stats();
print(coll + ": " + (stats.size / 1024 / 1024).toFixed(2) + " MB, " +
stats.count + " documents, " +
Object.keys(stats.indexSizes || {}).length + " indexes");
})
Index Usage
1
2
3
4
5
6
7
// Check which indexes are being used and how often
db.users.aggregate([{ $indexStats: {} }])
// Find collections with the most documents
db.getCollectionNames().map(function(coll) {
return { collection: coll, count: db[coll].estimatedDocumentCount() };
}).sort(function(a, b) { return b.count - a.count; })
If an index shows zero queries since the last server restart, it is just overhead. Drop it. For a deeper understanding of why unused indexes hurt write performance, see How Database Indexing Works.
Quick Reference Table
| Task | Command |
|---|---|
| Connect to MongoDB | mongosh "mongodb://host:27017/db" |
| List databases | show dbs |
| Switch database | use dbname |
| List collections | show collections |
| Create collection | db.createCollection("name") |
| Drop collection | db.coll.drop() |
| Drop database | db.dropDatabase() |
| Insert one | db.coll.insertOne({...}) |
| Insert many | db.coll.insertMany([{...}, {...}]) |
| Find all | db.coll.find() |
| Find with filter | db.coll.find({ field: "value" }) |
| Find one | db.coll.findOne({ field: "value" }) |
| Count documents | db.coll.countDocuments({ filter }) |
| Update one | db.coll.updateOne({ filter }, { $set: {...} }) |
| Update many | db.coll.updateMany({ filter }, { $set: {...} }) |
| Upsert | db.coll.updateOne({...}, {$set:{...}}, {upsert:true}) |
| Delete one | db.coll.deleteOne({ filter }) |
| Delete many | db.coll.deleteMany({ filter }) |
| Sort | db.coll.find().sort({ field: -1 }) |
| Limit | db.coll.find().limit(10) |
| Skip (offset) | db.coll.find().skip(20).limit(10) |
| Distinct values | db.coll.distinct("field") |
| Create index | db.coll.createIndex({ field: 1 }) |
| List indexes | db.coll.getIndexes() |
| Drop index | db.coll.dropIndex("indexName") |
| Aggregate | db.coll.aggregate([{ $match: {...} }]) |
| Explain query | db.coll.find({...}).explain("executionStats") |
| Backup database | mongodump --db=dbname --out=/path/ |
| Restore database | mongorestore --db=dbname /path/dbname/ |
| Export JSON | mongoexport --db=db --collection=c --out=f.json |
| Import JSON | mongoimport --db=db --collection=c --file=f.json |
| Create user | db.createUser({user:"u",pwd:"p",roles:[...]}) |
| Replica set status | rs.status() |
| Shard status | sh.status() |
| Current operations | db.currentOp() |
| Kill operation | db.killOp(opId) |
| Server status | db.serverStatus() |
Related Posts
These posts go deeper into specific database topics:
Database Deep Dives:
- PostgreSQL vs MongoDB vs DynamoDB: Which Should You Use?
- How Database Indexing Works
- How Databases Store Data Internally
- How OpenAI Scales PostgreSQL to 800 Million Users
System Design:
Related Cheat Sheets:
- PostgreSQL Cheat Sheet
- Docker Cheat Sheet - Run MongoDB in containers
- Linux Commands Cheat Sheet
- Kubernetes Cheat Sheet
External Resources
- MongoDB Documentation - the official reference
- mongosh Reference - the modern shell documentation
- MongoDB University - free official courses
- MongoDB Compass - GUI tool for exploring and querying your data
- MongoDB Atlas - managed cloud hosting
Every developer working with NoSQL databases will use MongoDB at some point. The commands in this cheat sheet cover 90% of what you will do day to day. Start with the CRUD basics, learn the aggregation pipeline for anything complex, and always use explain() when things are slow.
What MongoDB commands or patterns do you use most? Drop them in the comments below.