en programming language Web related css3 非公開: How to use $lookup with MongoDB

How to use $lookup with MongoDB

MongoDB is a popular NoSQL database that stores data in collections. A MongoDB collection consists of one or more documents containing the actual data in JSON format. Documents are equivalent to rows in a traditional relational SQL database, and collections are equivalent to tables.

An important feature of a database is the ability to query the data stored in it. Querying data enables retrieval of specific information, data analysis, data reporting, and even data integration.

To be able to effectively query a database, it is important to be able to combine data from multiple tables (for SQL databases or multiple collections in a NOSQL database) into a single result set.

With MongoDB $lookup, users combine information from two collections at query time. This performs the equivalent of a left outer join in SQL databases.

Use and purpose of $lookup

An important function of databases is data processing to obtain meaningful information from raw data.

For example, if you run a restaurant business, you might want to analyze your restaurant data to find out your daily revenue, find out which foods are in demand on weekends, or find out how many cups of coffee you sold in your store. You can also. each hour of the day.

Simple database queries are not sufficient for these needs. I need to perform advanced queries on the data I have stored. To address these needs, MongoDB has a feature called aggregation pipelines.

An aggregation pipeline is a system of configurable operations called stages that are used to process data and produce final aggregation results. Examples of aggregation pipeline stages include $sort, $match, $group, $merge, $count, and $lookup.

These stages can be applied in any order within the aggregation pipeline. Each stage of the aggregation pipeline performs different operations on the data passing through the aggregation pipeline.

left outer join
left outer join

So $lookup is one of the stages in the MongoDB aggregation pipeline. $Lookup is used to perform a left outer join between two collections in a MongoDB database. A left outer join joins all documents or entries on the left with matching documents or entries on the right.

For example, consider the following two collections. These are presented in tabular form for ease of understanding.

Order collection :

Order ID Customer ID order date total amount
1 100 2022-05-01 50.00
2 101 2022-05-02 75.00
3 102 2022-05-03 100.00

Customer collection :

customer_num customer name customer email address customer phone number
100 john doe john@example.com 555-1234
102 jane smith jane@example.com 555-5678

If we perform a left outer join on the above collection using the customer_id field appearing in order_collection , then order_collection will be the left collection and Customers_collection will be the right collection. The results include all documents and documents in the Orders Collection. A Customers collection whose customer_num matches the customer_id of one of the records in the Orders collection.

The final result of a left outer join operation on the orders and customers collection, expressed in tabular form, is as follows:

Note that for the customer with customer_id 101 in the Orders collection, there is no matching customer_num value in the Customers collection, and the corresponding value missing from the Customer table is filled with null.

$lookup performs a strict equality comparison between fields and retrieves the entire matched document, not just the matched fields.

$lookup syntax

The syntax for $lookup is:

 {
   $lookup:
     {
       from: <collection to join>,
       localField: <field from the input documents>,
       foreignField: <field from the documents of the "from" collection>,
       as: <output array field>
     }
}

$lookup has four parameters.

  • from – Represents the collection to search for documents. In the previous example using order_collection and Customers_collection , we would place customers_collection as the from of the collection.
  • localField – This is the field in the working or primary collection that you want to use to compare with the field in the from collection ( customer_collection in this case). In the example above, localField would be customer_id within orders_collection .
  • externalField – This is the field to compare within the collection specified by from . In this example, this will be customer_num found within customer_collection . Use this as the from value.
  • as – This is the new field name you specify to represent the field as it appears in the document. Document contains the document resulting from the match between localField and externalField. All these matches will be put into an array in this field. If there are no matches, this field contains an empty array.

From the previous two collections, use the following code to perform a $lookup operation on the two collections with orders_collection as the working or primary collection.

 {
    $lookup: {
      from: "customers_collection",
      localField: "customer_id",
      foreignField: "customer_num",
      as: "customer_info"
 }

Note that the as field can be any string value. However, if you specify a name that already exists in the working document, that field will be overwritten.

Combine data from multiple collections

MongoDB $lookup is a useful stage in MongoDB’s aggregation pipeline. Although there is no requirement for a $lookup stage in a MongoDB aggregation pipeline, this stage is very important when performing complex queries that require joining data across multiple collections.

The $lookup stage performs a left outer join on the two collections. As a result, a new field is created or the value of an existing field is overwritten with an array containing documents from another collection.

These documents are selected based on whether they have values ​​that match the values ​​of the fields being compared. The final result is a field containing an array of documents if a match is found, or an empty array if no match is found.

Consider the collection of employees and projects shown below.

Screenshot of 2023-05-11-11-10-07
Screenshot of 2023-05-11-11-10-07

You can combine two collections using the following code.

 db.projects.aggregate([
   {
      $lookup: {
         from: "employees",
         localField: "employees",
         foreignField: "_id",
         as: "assigned_employees"
      }
   }
])

The result of this operation is a combination of two collections. The result is the projects and all employees assigned to each project. Employees are represented by an array.

Screenshot of 2023-05-11-11-12-53
Screenshot of 2023-05-11-11-12-53

Pipeline stages that can be used with $lookup

As mentioned earlier, $lookup is a stage in the MongoDB aggregation pipeline and can be used with other aggregation pipeline stages. To demonstrate how these stages can be used in conjunction with $lookup , we will use the following two collections for illustration purposes:

collection
collection

MongoDB stores it in JSON format. The above collection would look like this in MongoDB:

Screenshot from 2023-05-11-10-14-11
Screenshot from 2023-05-11-10-14-11

Examples of aggregation pipeline stages that can be used with $lookup include:

$match

$match is an aggregation pipeline stage used to filter the document stream to allow only documents that meet the specified conditions to proceed to the next stage of the aggregation pipeline. This stage is best used to optimize the aggregation pipeline by removing unnecessary documents early in the pipeline.

Using the previous two collections, you can combine $match and $lookup like this:

 db.users.aggregate([
   {
      $match: {
         country: "USA"
      }
   },
   {
      $lookup: {
         from: "orders",
         localField: "_id",
         foreignField: "user_id",
         as: "orders"
      }
   }
])

$match is used to filter users from the US. The results of $match are combined with $lookup to retrieve order details for US users. The result of the above operation is shown below.

Screenshot from 2023-05-11-10-24-04
Screenshot from 2023-05-11-10-24-04

$project

$project is a stage used to restructure the document by specifying fields to include, exclude, or add to the document. For example, if you are processing documents with 10 fields each, but only four fields in the document contain the data you need for data processing, use $project to move the unnecessary fields outside. Can be filtered.

This avoids sending unnecessary data to the next stage of the aggregation pipeline.

You can combine $lookup and $project like this:

 db.users.aggregate([
   {
      $lookup: {
         from: "orders",
         localField: "_id",
         foreignField: "user_id",
         as: "orders"
      }
   },
   {
      $project: {
         name: 1,
         _id: 0,
         total_spent: { $sum: "$orders.price" }
      }
   }
])

Above, we use $lookup to join the users and orders collection, and use $project to display only the name of each user and the amount spent by each user. $project is also used to remove the _id field from the results. The result of the above operation is shown below.

Screenshot of 2023-05-11-10-34-57
Screenshot of 2023-05-11-10-34-57

$unwind

$unwind is an aggregation stage used to explode or unwind array fields to create a new document for each element in the array. This is useful when performing aggregations on array field values.

For example, in the example below, if you want to perform an aggregation on the hobby field, you cannot do it because it is an array. However, you can use $unwind to perform aggregations on the resulting documents.

relax
relax

With user and order collections, you can use $lookup and $unwind together like this:

 db.users.aggregate([
   {
      $lookup: {
         from: "orders",
         localField: "_id",
         foreignField: "user_id",
         as: "orders"
      }
   },
   {
      $unwind: "$orders"
   }
])

In the code above, $lookup returns an array field called orders. Then use $unwind to unwind the array field. The result of this operation is shown below. Notice that it appears twice because Alice has two orders.

Screenshot of 2023-05-11-10-41-52
Screenshot of 2023-05-11-10-41-52

Example of using $lookup

$lookup is a useful tool when performing data processing. For example, suppose you have two collections that you want to join based on a field in the collections that has similar data. You can do this using a simple $lookup stage to add a new field to the primary collection that contains documents retrieved from another collection.

Consider the collection of users and orders shown below.

Screenshot from 2023-05-11-10-14-11
Screenshot from 2023-05-11-10-14-11

Combining the two collections using $lookup yields the result shown below.

Screenshot of 2023-05-11-11-43-05
Screenshot of 2023-05-11-11-43-05

You can also perform more complex joins using $lookup . $lookup is not limited to performing joins between two collections. You can implement multiple $lookup stages to perform joins on more than two collections. Consider the three collections shown below.

Screenshot of 2023-05-11-11-55-37
Screenshot of 2023-05-11-11-55-37

You can use the code below to perform a more complex join between the three collections and retrieve details of all orders made and products ordered.

The code below allows you to do just that.

 db.orders.aggregate([
   {
      $lookup: {
         from: "order_items",
         localField: "_id",
         foreignField: "order_id",
         as: "order_items"
      }
   },
   {
      $unwind: "$order_items"
   },
   {
      $lookup: {
         from: "products",
         localField: "order_items.product_id",
         foreignField: "_id",
         as: "product_details"
      }
   },
   {
      $group: {
         _id: "$_id",
         customer: { $first: "$customer" },
         total: { $sum: "$order_items.price" },
         products: { $push: "$product_details" }
      }
   }
])

The result of the above operation is shown below.

Screenshot from 2023-05-11-12-02-17
Screenshot from 2023-05-11-12-02-17

conclusion

When performing data processing that involves multiple collections, $lookup is useful because it allows you to combine data and draw conclusions based on data stored in multiple collections. Data processing rarely relies on just one collection.

Combining data across multiple collections is a critical step in drawing meaningful conclusions from your data. Therefore, consider leveraging the $lookup stage in your MongoDB aggregation pipelines so you can better process your data and derive meaningful insights from the raw data stored across your collections.

You can also explore some MongoDB commands and queries.