MongoDB Basics for Developers Coming From SQL

Intermediate · 9 min read

What you'll learn

✓ How documents and collections map to rows and tables
✓ How to query, project, and update documents
✓ How aggregation pipelines replace complex SQL
✓ How to model relationships without JOINs
✓ When MongoDB is a better fit than a relational DB

Prerequisites

• Basic SQL: see [SQL SELECT Basics](/blog/sql-select-basics)
• Familiarity with JOINs: see [SQL Joins](/blog/sql-joins)

MongoDB stores JSON-like documents in collections. If you come from SQL, the mental model shifts in three places: schema is flexible, related data is often embedded, and complex queries are written as a pipeline of stages rather than a single statement.

The mapping

SQL	MongoDB
database	database
table	collection
row	document
column	field
JOIN	embed or $lookup
primary key	_id

A document looks like this:

{
  "_id": "u_123",
  "email": "ada@example.com",
  "name": "Ada",
  "addresses": [
    { "city": "Pune", "primary": true }
  ]
}

MongoDB does not enforce a fixed schema by default: two documents in the same collection can have different fields. In practice, teams use schema validation (for example, $jsonSchema validators) or libraries like Mongoose to keep things consistent.
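
As a rough sketch of what such a validator can look like, the object below uses $jsonSchema to require the email and name fields from the example document. The specific rules are assumptions chosen for illustration, and the createCollection call is shown as a comment because it needs a live mongosh session.

```javascript
// Hypothetical validation rules for the users collection (illustrative only).
const userValidator = {
  $jsonSchema: {
    bsonType: "object",
    required: ["email", "name"],
    properties: {
      email: { bsonType: "string", description: "required, must be a string" },
      name: { bsonType: "string" },
      addresses: {
        bsonType: "array",
        items: {
          bsonType: "object",
          required: ["city"],
          properties: {
            city: { bsonType: "string" },
            primary: { bsonType: "bool" }
          }
        }
      }
    }
  }
};

// In mongosh, the validator would be attached when creating the collection:
// db.createCollection("users", { validator: userValidator });
```

Fields not listed under properties are still allowed here, so the collection stays flexible while the core fields are checked.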

Inserts and reads

db.users.insertOne({
  email: "ada@example.com",
  name: "Ada",
  createdAt: new Date()
});

db.users.findOne({ email: "ada@example.com" });

Filters are documents themselves. Operators start with a dollar sign:

db.orders.find({
  total: { $gte: 500 },
  status: { $in: ["paid", "shipped"] }
});
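
Multiple fields in one filter document are combined with an implicit AND. As a point of comparison, the sketch below restates the filter above as SQL (in a comment) and as a plain JavaScript predicate over an in-memory array. The sample orders and the names sampleOrders and matchesFilter are invented for illustration.

```javascript
// Roughly equivalent SQL:
//   SELECT * FROM orders
//   WHERE total >= 500 AND status IN ('paid', 'shipped');

// Invented sample data, standing in for the orders collection.
const sampleOrders = [
  { _id: 1, total: 750, status: "paid" },
  { _id: 2, total: 120, status: "paid" },
  { _id: 3, total: 900, status: "cancelled" },
  { _id: 4, total: 500, status: "shipped" }
];

// Plain-JavaScript reading of
// { total: { $gte: 500 }, status: { $in: ["paid", "shipped"] } }.
const matchesFilter = (order) =>
  order.total >= 500 && ["paid", "shipped"].includes(order.status);

const matchedIds = sampleOrders.filter(matchesFilter).map((o) => o._id);
console.log(matchedIds); // orders 1 and 4 satisfy both conditions
```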

Projections pick fields:

db.users.find({}, { email: 1, name: 1, _id: 0 });

Updates

Updates are partial by default: only the fields named in the update operators change, and the rest of the document is left alone.

db.users.updateOne(
  { _id: "u_123" },
  { $set: { name: "Ada L." }, $inc: { loginCount: 1 } }
);

For arrays, you have $push, $pull, and positional operators:

db.users.updateOne(
  { _id: "u_123", "addresses.city": "Pune" },
  { $set: { "addresses.$.primary": false } }
);

Passing { upsert: true } as an option inserts a new document when no document matches the filter. This is useful for idempotent writes.
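
To make the idempotency point concrete, here is a toy in-memory model of an upsert keyed by _id. It is not driver code: the store, the upsertById helper, and the sample values are all hypothetical, and the matching mongosh call is shown in a comment.

```javascript
// Toy model of updateOne({ _id }, { $set: fields }, { upsert: true }).
const store = new Map(); // documents keyed by _id

function upsertById(id, fields) {
  const existing = store.get(id);
  if (existing) {
    Object.assign(existing, fields); // matched: apply the $set fields
    return { matchedCount: 1, upsertedId: null };
  }
  store.set(id, { _id: id, ...fields }); // no match: insert a new document
  return { matchedCount: 0, upsertedId: id };
}

// Running the same write twice leaves exactly one document behind.
const first = upsertById("u_456", { email: "grace@example.com" });
const second = upsertById("u_456", { email: "grace@example.com" });
console.log(first.upsertedId, second.matchedCount, store.size);

// The mongosh equivalent would be:
// db.users.updateOne(
//   { _id: "u_456" },
//   { $set: { email: "grace@example.com" } },
//   { upsert: true }
// );
```

Note that this only holds for operators like $set. An upsert that uses $inc changes the document on every retry, so it is not idempotent.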

Indexes

MongoDB has B-tree indexes much like SQL. The first rule is the same: index what you filter and sort by.

db.users.createIndex({ email: 1 }, { unique: true });
db.orders.createIndex({ customerId: 1, createdAt: -1 });

Compound index order matters. Put equality fields first, then sort fields, then range fields (MongoDB's documentation calls this the ESR rule: Equality, Sort, Range). Check usage with:

db.orders.find({ customerId: "c_1" }).sort({ createdAt: -1 }).explain("executionStats");

Look for IXSCAN instead of COLLSCAN. The principles map cleanly onto what you already know from SQL Indexes and Performance.
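
One consequence of key order is that a compound index helps only queries that constrain a leading prefix of its keys. The function below is a deliberately simplified mental model of that rule, not the real query planner; usablePrefixLength is a hypothetical helper name.

```javascript
// Simplified model: count how many leading index keys the filter constrains.
// As a rule of thumb, zero means the index cannot narrow the scan.
function usablePrefixLength(indexSpec, filter) {
  let length = 0;
  for (const key of Object.keys(indexSpec)) {
    if (!(key in filter)) break; // the prefix ends at the first missing key
    length += 1;
  }
  return length;
}

const orderIndex = { customerId: 1, createdAt: -1 };

// Filtering on customerId uses the first key of the index.
console.log(usablePrefixLength(orderIndex, { customerId: "c_1" })); // 1

// Filtering on createdAt alone skips the leading key.
console.log(usablePrefixLength(orderIndex, { createdAt: { $gte: new Date(0) } })); // 0
```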

Aggregation pipelines

Joins, grouping, and reshaping live in the aggregation framework. A pipeline is an array of stages:

db.orders.aggregate([
  { $match: { status: "paid" } },
  { $group: {
      _id: "$customerId",
      total: { $sum: "$total" },
      count: { $sum: 1 }
  }},
  { $sort: { total: -1 } },
  { $limit: 10 }
]);
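
For readers mapping this back to SQL, the sketch below shows a roughly equivalent statement in a comment and then walks the same four stages over an in-memory array in plain JavaScript. The sample orders are invented, and the code only mirrors the semantics of the stages; it does not talk to a database.

```javascript
// Roughly equivalent SQL:
//   SELECT customer_id, SUM(total) AS total, COUNT(*) AS count
//   FROM orders WHERE status = 'paid'
//   GROUP BY customer_id ORDER BY total DESC LIMIT 10;

// Invented sample data.
const orders = [
  { customerId: "c_1", total: 300, status: "paid" },
  { customerId: "c_2", total: 900, status: "paid" },
  { customerId: "c_1", total: 200, status: "paid" },
  { customerId: "c_3", total: 999, status: "refunded" }
];

// $match: keep only paid orders.
const paid = orders.filter((o) => o.status === "paid");

// $group: one accumulator object per distinct customerId.
const groups = new Map();
for (const o of paid) {
  const g = groups.get(o.customerId) ?? { _id: o.customerId, total: 0, count: 0 };
  g.total += o.total; // { $sum: "$total" }
  g.count += 1;       // { $sum: 1 }
  groups.set(o.customerId, g);
}

// $sort (descending by total), then $limit.
const topCustomers = [...groups.values()]
  .sort((a, b) => b.total - a.total)
  .slice(0, 10);

console.log(topCustomers);
```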

For joins, use $lookup:

db.orders.aggregate([
  { $lookup: {
      from: "customers",
      localField: "customerId",
      foreignField: "_id",
      as: "customer"
  }},
  { $unwind: "$customer" }
]);

$lookup works, but it is often slower than a join in a relational database, especially when foreignField is not indexed. Heavy use is a signal that you should consider embedding instead.
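
It also helps to know which kind of join this pipeline produces. $lookup behaves like a left outer join that stores its matches in an array, and a plain $unwind then drops documents whose array is empty. The plain JavaScript below mirrors those two stages on invented sample data.

```javascript
// Invented sample data.
const customers = [{ _id: "c_1", name: "Ada" }];
const orders = [
  { _id: "o_1", customerId: "c_1" },
  { _id: "o_2", customerId: "c_404" } // no matching customer
];

// $lookup: every order is kept, and `customer` is an array of matches
// (possibly empty), which makes it a left outer join.
const looked = orders.map((o) => ({
  ...o,
  customer: customers.filter((c) => c._id === o.customerId)
}));

// $unwind: one output document per array element, so orders with an empty
// `customer` array disappear unless preserveNullAndEmptyArrays is set.
const unwound = looked.flatMap((o) =>
  o.customer.map((c) => ({ ...o, customer: c }))
);

console.log(looked.length, unwound.length); // 2 1
```

If unmatched orders must survive, the $unwind stage can be written as { $unwind: { path: "$customer", preserveNullAndEmptyArrays: true } }.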

Modeling: embed or reference

The biggest shift from SQL is asking, for every relationship: embed or reference?

Embed when:

• The child belongs to one parent and is read with it.
• The child is small and bounded.
• You want atomic updates across the pair.

Reference when:

• The child is shared by many parents.
• The child grows unbounded.
• You query the child independently.

A blog post with a handful of tags embeds them. A user with millions of orders references them.
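
Sketched as document shapes, those two cases might look like the following. The field names (tags, userId) and sample values are assumptions for illustration, not a prescribed schema.

```javascript
// Embedded: tags live inside the post and are read and written with it.
const post = {
  _id: "p_1",
  title: "Hello, documents",
  tags: ["mongodb", "modeling"]
};

// Referenced: each order is its own document and points back at the user,
// so the user document stays small no matter how many orders exist.
const user = { _id: "u_123", name: "Ada" };
const order = { _id: "o_1", userId: "u_123", total: 750 };

// Reading a user's orders is then a second query, for example:
// db.orders.find({ userId: "u_123" });
```

With the referenced shape, an index on userId keeps that second query cheap.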

A common trap is treating MongoDB like SQL with no joins. If every read needs three $lookup stages, you modeled it wrong.

Transactions

MongoDB supports multi-document ACID transactions on replica sets (since version 4.0) and on sharded clusters (since 4.2):

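// Node.js driver. Assumes `client` is a connected MongoClient, `accounts` is a
// collection obtained from it, and this code runs inside an async function.
const session = client.startSession();
try {
  session.startTransaction();
  // Pass the session to every operation that belongs to the transaction.
  await accounts.updateOne({ _id: "a" }, { $inc: { balance: -100 } }, { session });
  await accounts.updateOne({ _id: "b" }, { $inc: { balance: 100 } }, { session });
  await session.commitTransaction();
} catch (e) {
  // Roll back both updates if either one fails.
  await session.abortTransaction();
  throw e;
} finally {
  await session.endSession();
}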

Use them when you need to, but embedding often removes the need. Updating a single document is already atomic.
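
When a transaction is needed, the Node.js driver also offers session.withTransaction, which commits for you and retries on transient transaction errors. The sketch below is one way to wrap the transfer; the transfer function and the "bank" and "accounts" names are hypothetical, and client is assumed to be a connected MongoClient.

```javascript
// Sketch only: `transfer`, "bank", and "accounts" are hypothetical names.
async function transfer(client, fromId, toId, amount) {
  const accounts = client.db("bank").collection("accounts");
  const session = client.startSession();
  try {
    // withTransaction starts the transaction, runs the callback, commits,
    // and retries the callback on transient transaction errors.
    await session.withTransaction(async () => {
      await accounts.updateOne(
        { _id: fromId },
        { $inc: { balance: -amount } },
        { session }
      );
      await accounts.updateOne(
        { _id: toId },
        { $inc: { balance: amount } },
        { session }
      );
    });
  } finally {
    await session.endSession();
  }
}
```

Because the callback may run more than once, it should contain only the database operations that belong to the transaction and no other side effects.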

Reading plans

db.orders.find({ status: "paid" }).explain("executionStats");

Check totalDocsExamined versus nReturned. A healthy index keeps them close. A bad plan examines thousands of docs to return ten.
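
A small helper can turn that comparison into a single number. The sketch below reads the executionStats section of an explain result; docsExaminedPerReturned is a hypothetical helper name, and the two sample objects contain made-up numbers shaped like real explain output.

```javascript
// Hypothetical helper: documents examined per document returned.
function docsExaminedPerReturned(explainOutput) {
  const { totalDocsExamined, nReturned } = explainOutput.executionStats;
  // Avoid dividing by zero when the query returns nothing.
  return nReturned === 0 ? totalDocsExamined : totalDocsExamined / nReturned;
}

// Made-up numbers shaped like explain("executionStats") output.
const withIndex = {
  executionStats: { nReturned: 10, totalKeysExamined: 10, totalDocsExamined: 10 }
};
const withoutIndex = {
  executionStats: { nReturned: 10, totalKeysExamined: 0, totalDocsExamined: 48000 }
};

console.log(docsExaminedPerReturned(withIndex));    // 1
console.log(docsExaminedPerReturned(withoutIndex)); // 4800
```

A ratio near 1 suggests the index is doing the filtering. A ratio in the hundreds or thousands suggests the query is scanning.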

When to pick MongoDB

MongoDB shines when documents are the natural unit: product catalogs, user profiles, event logs, content with varying shape. It is also strong for fast iteration when the schema is still moving.

Reach for a relational database when your data is highly relational, when reporting and ad hoc analytics matter, or when strict typed schemas are a feature, not a chore. Both are perfectly acceptable backends behind a REST or RPC layer; see What is REST?

Wrap up

MongoDB rewards developers who think in documents and pipelines. Map your read patterns to your model, index what you filter and sort, and prefer embedding over $lookup when the data shape allows. Most performance problems are modeling problems in disguise.