Schema Patterns

Embedding vs. Referencing

Embed when the data is read together with its parent and is bounded in size. Reference when the data is large or unbounded, shared across documents, or queried on its own.

// Embedding (one-to-few, data always read together)
{
  _id: ObjectId("..."),
  name: "Alice",
  address: {
    street: "123 Main St",
    city: "Anytown",
    zip: "12345"
  },
  phones: [
    { type: "mobile", number: "555-1234" },
    { type: "work",   number: "555-5678" }
  ]
}

// Referencing (one-to-many, independently queried)
// orders collection
{ _id: ObjectId("order1"), userId: ObjectId("user1"), total: 99.99 }

// users collection
{ _id: ObjectId("user1"), name: "Alice" }

// Fetch with $lookup (matching users arrive as an array in the "user" field)
db.orders.aggregate([
  { $lookup: { from: "users", localField: "userId", foreignField: "_id", as: "user" } }
]);

Bucket Pattern

Group time-series or sequential data into buckets to reduce document count and improve range query performance.

// Instead of one document per sensor reading (millions of docs):
// Group readings into hourly buckets
{
  _id: ObjectId("..."),
  sensorId: "sensor_42",
  bucketStart: ISODate("2024-01-15T10:00:00Z"),
  count: 60,
  readings: [
    { ts: ISODate("2024-01-15T10:00:05Z"), temp: 22.1 },
    { ts: ISODate("2024-01-15T10:01:03Z"), temp: 22.3 },
    // ... up to 60 readings per bucket
  ],
  stats: { min: 21.8, max: 23.1, sum: 1335.4 }
}

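// Add a reading to the current hourly bucket, or create a bucket if none has room
const ts = new Date();
const bucketStart = new Date(ts);
bucketStart.setUTCMinutes(0, 0, 0);  // truncate to the start of the hour

db.sensor_data.updateOne(
  // Match this hour's bucket only while it is below the 60-reading cap
  { sensorId: "sensor_42", bucketStart: bucketStart, count: { $lt: 60 } },
  {
    $push: { readings: { ts: ts, temp: 22.5 } },
    $inc:  { count: 1, "stats.sum": 22.5 },  // keep stats.sum in step with count
    $min:  { "stats.min": 22.5 },
    $max:  { "stats.max": 22.5 }
  },
  { upsert: true }
);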
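
On the read side, buckets are selected by bucketStart and summary values come from the stored stats rather than from the readings array. The following is a minimal sketch, assuming buckets are aligned to UTC hours as in the sample document; the helper names hourBucketStart, bucketRangeFilter, and bucketAverage are hypothetical and not part of any MongoDB API.

```javascript
// Hypothetical helper: truncate a Date to the start of its UTC hour.
function hourBucketStart(date) {
  const start = new Date(date.getTime());
  start.setUTCMinutes(0, 0, 0);
  return start;
}

// Hypothetical helper: filter for every bucket that can hold readings in [from, to).
function bucketRangeFilter(sensorId, from, to) {
  return {
    sensorId: sensorId,
    bucketStart: { $gte: hourBucketStart(from), $lt: to }
  };
}

// Average temperature from the pre-aggregated stats; readings are not scanned.
function bucketAverage(bucket) {
  return bucket.count > 0 ? bucket.stats.sum / bucket.count : null;
}

// Usage in mongosh (not executed here):
// db.sensor_data.find(bucketRangeFilter("sensor_42", from, to));
```

The first and last buckets returned may contain readings outside the requested range, so filter readings by ts afterward if exact bounds matter.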

Polymorphic Pattern

Store different document shapes in one collection when they share common query patterns.

// Products collection with different shapes
{ _id: 1, type: "book",     title: "MongoDB Guide", author: "Joe", pages: 320 }
{ _id: 2, type: "dvd",      title: "MongoDB in Action", duration: 120, region: 1 }
{ _id: 3, type: "clothing", title: "Dev Hoodie", size: "L", color: "black" }

// Query all types by common field
db.products.find({ title: /MongoDB/ });

// Type-specific query
db.products.find({ type: "book", pages: { $gt: 200 } });

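// Compound index: serves queries on type, or on type and title together.
// The title-only query above would need a separate index on { title: 1 }.
db.products.createIndex({ type: 1, title: 1 });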
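
Because documents in a polymorphic collection differ in shape, application code usually branches on the type field. The following is a minimal sketch of a client-side shape check; the REQUIRED_FIELDS table and the missingFields helper are hypothetical, with field names taken from the three sample products.

```javascript
// Hypothetical per-type required fields, based on the sample documents.
const REQUIRED_FIELDS = {
  book: ["author", "pages"],
  dvd: ["duration", "region"],
  clothing: ["size", "color"]
};

// Return the names of required fields that a product document lacks.
function missingFields(product) {
  const typeSpecific = REQUIRED_FIELDS[product.type];
  if (!typeSpecific) return ["type"]; // unknown or absent type
  // title is the field shared by every shape
  return ["title", ...typeSpecific].filter((f) => product[f] === undefined);
}
```

A check like this can run before an insert so that each shape stays consistent within its type.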

Tree Structures

// Pattern 1: Parent Reference
{ _id: "mongodb", parent: "databases", name: "MongoDB" }
{ _id: "databases", parent: "tech", name: "Databases" }
{ _id: "tech", parent: null, name: "Technology" }

// Find children of "databases"
db.categories.find({ parent: "databases" });

// Pattern 2: Array of Ancestors (fast ancestor queries)
{ _id: "mongodb", name: "MongoDB", ancestors: ["tech", "databases"] }

// Find all descendants of "tech"
db.categories.find({ ancestors: "tech" });

// Pattern 3: Materialized Path
{ _id: "mongodb", path: ",tech,databases,mongodb,", name: "MongoDB" }

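// Find "databases" and all its descendants.
// An unanchored regex scans every path; when the full prefix is known,
// anchor it (e.g. /^,tech,databases,/) so an index on path can be used.
db.categories.find({ path: /,databases,/ });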
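
The ancestors array and the materialized path can both be derived from parent references. The following is a minimal sketch, assuming the tree has no cycles; the parentById map mirrors the Pattern 1 documents, and the helper names ancestorsOf and materializedPath are hypothetical.

```javascript
// Parent references from Pattern 1, keyed by _id.
const parentById = { mongodb: "databases", databases: "tech", tech: null };

// Walk parent references upward and return the ancestors ordered root-first.
function ancestorsOf(id, parents) {
  const ancestors = [];
  let current = parents[id];
  while (current != null) {
    ancestors.unshift(current);
    current = parents[current];
  }
  return ancestors;
}

// Build the comma-delimited path, including the node itself.
function materializedPath(id, parents) {
  return "," + [...ancestorsOf(id, parents), id].join(",") + ",";
}

// ancestorsOf("mongodb", parentById)      → ["tech", "databases"]
// materializedPath("mongodb", parentById) → ",tech,databases,mongodb,"
```

Both derived fields must be recomputed for a node and all of its descendants whenever the node moves to a different parent.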

// Pattern 4: Nested Sets (fast reads, slow writes)
{ _id: "mongodb", name: "MongoDB", lft: 5, rgt: 6 }
// Find all ancestors of "mongodb": nodes whose interval encloses (5, 6)
db.categories.find({ lft: { $lt: 5 }, rgt: { $gt: 6 } });
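
The lft and rgt values come from a depth-first walk: lft is assigned when a node is entered and rgt when it is left. The following is a minimal sketch of that numbering; the "languages" node is a hypothetical sibling added so that "mongodb" receives the same (5, 6) interval as the sample document, and the helper names are illustrative.

```javascript
// Children lists; "languages" is a hypothetical sibling added for illustration.
const children = {
  tech: ["languages", "databases"],
  languages: [],
  databases: ["mongodb"],
  mongodb: []
};

// Depth-first numbering: lft is assigned on entry, rgt on exit.
function numberNestedSets(rootId, childrenById) {
  const bounds = {};
  let counter = 0;
  function visit(id) {
    counter += 1;
    bounds[id] = { lft: counter, rgt: 0 };
    for (const childId of childrenById[id] || []) visit(childId);
    counter += 1;
    bounds[id].rgt = counter;
  }
  visit(rootId);
  return bounds;
}

// Same condition as the ancestor query: lft < node.lft AND rgt > node.rgt.
function ancestorIds(id, allBounds) {
  const node = allBounds[id];
  return Object.keys(allBounds).filter(
    (other) => allBounds[other].lft < node.lft && allBounds[other].rgt > node.rgt
  );
}

const nestedBounds = numberNestedSets("tech", children);
// nestedBounds.mongodb → { lft: 5, rgt: 6 }
// ancestorIds("mongodb", nestedBounds) → ["tech", "databases"]
```

Inserting or moving a node shifts the numbers of every node visited after it, which is why writes are slow with this pattern.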

Computed Pattern & Extended Reference

// Computed Pattern: pre-compute expensive aggregates
// Instead of computing total on every read, maintain it
{
  _id: ObjectId("order1"),
  userId: ObjectId("user1"),
  items: [...],
  subtotal: 89.97,    // pre-computed
  tax:      7.20,
  total:   97.17      // pre-computed
}
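
The stored totals are computed once, at write time. The following is a minimal sketch of that step; the item shape (priceCents, qty), the 8% tax rate, and the computeOrderTotals helper are all hypothetical, chosen so the result matches the sample order. Amounts are kept in integer cents during the calculation to avoid floating-point drift.

```javascript
// Hypothetical line items; prices are integer cents.
const sampleItems = [{ sku: "hoodie", priceCents: 2999, qty: 3 }];

// Hypothetical flat tax rate, chosen to reproduce the sample order's values.
const TAX_RATE = 0.08;

// Compute the fields to store on the order document at write time.
function computeOrderTotals(orderItems, taxRate) {
  const subtotalCents = orderItems.reduce(
    (sum, item) => sum + item.priceCents * item.qty,
    0
  );
  const taxCents = Math.round(subtotalCents * taxRate);
  const totalCents = subtotalCents + taxCents;
  return {
    subtotal: subtotalCents / 100,
    tax: taxCents / 100,
    total: totalCents / 100
  };
}

const totals = computeOrderTotals(sampleItems, TAX_RATE);
// totals → { subtotal: 89.97, tax: 7.2, total: 97.17 }
```

The result would be written alongside items in the same insert or update, so reads never recompute it.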

// Extended Reference: duplicate frequently-read fields
// orders collection
{
  _id: ObjectId("order1"),
  userId: ObjectId("user1"),
  // Duplicated from users to avoid $lookup on every read
  customerName: "Alice",
  customerEmail: "alice@example.com",
  total: 97.17
}
// Trade-off: when a customer's name or email changes in users,
// every order that carries the copy must be updated as well
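
Keeping the duplicated fields current means fanning a user change out to that user's orders. The following is a minimal sketch that builds the arguments for such an update; the DUPLICATED_FIELDS mapping, the orderSyncUpdate helper, and the email field on users are assumptions for illustration.

```javascript
// Maps a field in users to its duplicated copy in orders (assumed mapping).
const DUPLICATED_FIELDS = { name: "customerName", email: "customerEmail" };

// Build the filter and update that propagate a user change to orders.
// Returns null when none of the changed fields are duplicated.
function orderSyncUpdate(userId, changedUserFields) {
  const set = {};
  for (const [field, value] of Object.entries(changedUserFields)) {
    const copy = DUPLICATED_FIELDS[field];
    if (copy) set[copy] = value;
  }
  if (Object.keys(set).length === 0) return null;
  return { filter: { userId: userId }, update: { $set: set } };
}

// Usage in mongosh (not executed here):
// const sync = orderSyncUpdate(userId, { name: "Alice B." });
// if (sync) db.orders.updateMany(sync.filter, sync.update);
```

Whether historical orders should be rewritten at all is a design choice; some applications deliberately keep the name as it was when the order was placed.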