Aggregation in MongoDB


Aggregation in MongoDB

Aggregation in MongoDB is a powerful feature used to process and transform data in collections. It allows for operations like filtering, grouping, sorting, and reshaping documents into an organized result set. MongoDB provides a variety of aggregation methods, with the Aggregation Framework being the most versatile.

The Aggregation Framework operates on a collection of documents and processes them in multiple stages, each transforming the documents in some way before passing them on to the next stage.

1. The Aggregation Pipeline

The aggregation pipeline consists of stages, where each stage performs a specific operation on the data. The output of one stage is passed as input to the next stage, allowing for complex data transformations.

db.collection.aggregate([ { stage1 }, { stage2 }, ... ])

Each stage uses a specific operator to perform a particular task (e.g., filtering, grouping, projecting).


Key Stages in Aggregation

  1. $match: Filters documents by a specified condition, similar to the find() method.
    • Example:
    db.orders.aggregate([ { $match: { status: "shipped" } } ])
    This stage returns only documents where the status is "shipped".

  1. $group: Groups documents by a specified key and performs aggregations like counting, summing, averaging, etc.
    • Example:
    db.orders.aggregate([ { $group: { _id: "$customerId", totalOrders: { $sum: 1 } } } ])
    This groups the documents by customerId and counts the number of orders per customer.

  1. $project: Reshapes documents by specifying which fields to include or exclude, often used for transforming data.
    • Example:
    db.users.aggregate([ { $project: { fullName: { $concat: ["$firstName", " ", "$lastName"] }, email: 1 } } ])
    This creates a new fullName field by concatenating firstName and lastName while also including the email field.

  1. $sort: Sorts documents by a specified field in ascending (1) or descending (-1) order.
    • Example:
    db.orders.aggregate([ { $sort: { orderDate: -1 } } ])
    This sorts the orders by orderDate in descending order (latest first).

  1. $limit and $skip: Control the number of documents returned and allow for pagination.

    • Example:
    db.orders.aggregate([ { $limit: 5 } ])

    This limits the result to the first 5 documents.

    db.orders.aggregate([ { $skip: 5 }, { $limit: 5 } ])

    This skips the first 5 documents and returns the next 5.


  1. $lookup: Performs a left outer join with another collection and merges the data from both collections into a single document.
    • Example:
    db.orders.aggregate([ { $lookup: { from: "customers", localField: "customerId", foreignField: "_id", as: "customerDetails" }} ])
    This performs a join between the orders and customers collections, matching the customerId in orders with the _id in customers.

  1. $unwind: Deconstructs an array field from a document into multiple documents, one for each element in the array.
    • Example:
    db.orders.aggregate([ { $unwind: "$items" } ])
    If an order has multiple items, this stage will create separate documents for each item in the order.

  1. $addFields: Adds new fields or modifies existing fields in a document.
    • Example:
    db.users.aggregate([ { $addFields: { fullName: { $concat: ["$firstName", " ", "$lastName"] } } } ])
    This adds a new fullName field to each document by concatenating firstName and lastName.

  1. $out: Writes the result of the aggregation to a new collection.
    • Example:
    db.orders.aggregate([ { $match: { status: "shipped" } }, { $out: "shippedOrders" } ])
    This writes all the shipped orders into a new collection called shippedOrders.

Aggregation Example

Here’s a complete example using multiple stages. Let’s say we want to analyze the total sales per customer from an orders collection. Each order contains a customerId and an array of items with prices.

db.orders.aggregate([ { $unwind: "$items" }, // Unwind the items array { $group: { _id: "$customerId", totalSales: { $sum: "$items.price" } } }, // Group by customer and sum the item prices { $lookup: { from: "customers", // Join with the customers collection localField: "_id", foreignField: "_id", as: "customerInfo" } }, { $sort: { totalSales: -1 } }, // Sort customers by total sales in descending order { $project: { customerInfo: { $arrayElemAt: ["$customerInfo", 0] }, totalSales: 1 } } // Include customer info and total sales ])

In this example:

  • We unwind the items array, so each order can be processed item by item.
  • We group the data by customerId and calculate the total sales for each customer.
  • We join the customers collection to add customer information.
  • Finally, we sort by total sales and project the required fields.

MongoDB Aggregation Operators

  • $sum: Sums the values.
  • $avg: Averages the values.
  • $min: Returns the minimum value.
  • $max: Returns the maximum value.
  • $first and $last: Returns the first and last values based on sorting.
  • $push: Appends a value to an array of the grouped documents.