Aggregation in MongoDB
Aggregation in MongoDB
Aggregation in MongoDB is a powerful feature used to process and transform data in collections. It allows for operations like filtering, grouping, sorting, and reshaping documents into an organized result set. MongoDB provides a variety of aggregation methods, with the Aggregation Framework being the most versatile.
The Aggregation Framework operates on a collection of documents and processes them in multiple stages, each transforming the documents in some way before passing them on to the next stage.
1. The Aggregation Pipeline
The aggregation pipeline consists of stages, where each stage performs a specific operation on the data. The output of one stage is passed as input to the next stage, allowing for complex data transformations.
db.collection.aggregate([ { stage1 }, { stage2 }, ... ])
Each stage uses a specific operator to perform a particular task (e.g., filtering, grouping, projecting).
Key Stages in Aggregation
- $match: Filters documents by a specified condition, similar to the
find()
method.- Example:
This stage returns only documents where thedb.orders.aggregate([ { $match: { status: "shipped" } } ])
status
is"shipped"
.
- $group: Groups documents by a specified key and performs aggregations like counting, summing, averaging, etc.
- Example:
This groups the documents bydb.orders.aggregate([ { $group: { _id: "$customerId", totalOrders: { $sum: 1 } } } ])
customerId
and counts the number of orders per customer.
- $project: Reshapes documents by specifying which fields to include or exclude, often used for transforming data.
- Example:
This creates a newdb.users.aggregate([ { $project: { fullName: { $concat: ["$firstName", " ", "$lastName"] }, email: 1 } } ])
fullName
field by concatenatingfirstName
andlastName
while also including theemail
field.
- $sort: Sorts documents by a specified field in ascending (
1
) or descending (-1
) order.- Example:
This sorts the orders bydb.orders.aggregate([ { $sort: { orderDate: -1 } } ])
orderDate
in descending order (latest first).
$limit and $skip: Control the number of documents returned and allow for pagination.
- Example:
db.orders.aggregate([ { $limit: 5 } ])
This limits the result to the first 5 documents.
db.orders.aggregate([ { $skip: 5 }, { $limit: 5 } ])
This skips the first 5 documents and returns the next 5.
- $lookup: Performs a left outer join with another collection and merges the data from both collections into a single document.
- Example:
This performs a join between thedb.orders.aggregate([ { $lookup: { from: "customers", localField: "customerId", foreignField: "_id", as: "customerDetails" }} ])
orders
andcustomers
collections, matching thecustomerId
inorders
with the_id
incustomers
.
- $unwind: Deconstructs an array field from a document into multiple documents, one for each element in the array.
- Example:
If an order has multiple items, this stage will create separate documents for each item in the order.db.orders.aggregate([ { $unwind: "$items" } ])
- $addFields: Adds new fields or modifies existing fields in a document.
- Example:
This adds a newdb.users.aggregate([ { $addFields: { fullName: { $concat: ["$firstName", " ", "$lastName"] } } } ])
fullName
field to each document by concatenatingfirstName
andlastName
.
- $out: Writes the result of the aggregation to a new collection.
- Example:
This writes all the shipped orders into a new collection calleddb.orders.aggregate([ { $match: { status: "shipped" } }, { $out: "shippedOrders" } ])
shippedOrders
.
Aggregation Example
Here’s a complete example using multiple stages. Let’s say we want to analyze the total sales per customer from an orders
collection. Each order contains a customerId
and an array of items with prices.
db.orders.aggregate([
{ $unwind: "$items" }, // Unwind the items array
{ $group: { _id: "$customerId", totalSales: { $sum: "$items.price" } } }, // Group by customer and sum the item prices
{ $lookup: {
from: "customers", // Join with the customers collection
localField: "_id",
foreignField: "_id",
as: "customerInfo"
}
},
{ $sort: { totalSales: -1 } }, // Sort customers by total sales in descending order
{ $project: { customerInfo: { $arrayElemAt: ["$customerInfo", 0] }, totalSales: 1 } } // Include customer info and total sales
])
In this example:
- We unwind the
items
array, so each order can be processed item by item. - We group the data by
customerId
and calculate the total sales for each customer. - We join the
customers
collection to add customer information. - Finally, we sort by total sales and project the required fields.
MongoDB Aggregation Operators
- $sum: Sums the values.
- $avg: Averages the values.
- $min: Returns the minimum value.
- $max: Returns the maximum value.
- $first and $last: Returns the first and last values based on sorting.
- $push: Appends a value to an array of the grouped documents.