MongoDB aggregation

Aggregation in MongoDB (aggregate) is mainly used to process data (for example, to compute averages or sums) and return the computed result. It is somewhat similar to count(*) in SQL.

aggregate() method

MongoDB performs aggregation with the aggregate() method.

Syntax

The basic syntax of the aggregate() method is as follows:

>db.COLLECTION_NAME.aggregate(AGGREGATE_OPERATION)

Example

The collection contains the following data:

{
   _id: ObjectId(7df78ad8902c),
   title: 'MongoDB Overview', 
   description: 'MongoDB is no sql database',
   by_user: 'w3cschool.cc',
   url: 'http://www.w3cschool.cc',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 100
},
{
   _id: ObjectId(7df78ad8902d),
   title: 'NoSQL Overview', 
   description: 'No sql database is very fast',
   by_user: 'w3cschool.cc',
   url: 'http://www.w3cschool.cc',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 10
},
{
   _id: ObjectId(7df78ad8902e),
   title: 'Neo4j Overview', 
   description: 'Neo4j is no sql database',
   by_user: 'Neo4j',
   url: 'http://www.neo4j.com',
   tags: ['neo4j', 'database', 'NoSQL'],
   likes: 750
},

Now we use aggregate() to count the number of articles written by each author in the collection above:

> db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : 1}}}])
{
   "result" : [
      {
         "_id" : "w3cschool.cc",
         "num_tutorial" : 2
      },
      {
         "_id" : "Neo4j",
         "num_tutorial" : 1
      }
   ],
   "ok" : 1
}
>

The example above is similar to the SQL statement: select by_user, count(*) from mycol group by by_user

In the example above, we group the documents by the by_user field and count how many documents share each by_user value.
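To make the grouping step concrete, here is a minimal plain-JavaScript sketch that mimics what $group with {$sum : 1} does to the three sample documents. It only illustrates the semantics and is not how MongoDB implements the stage; the helper name countBy is made up for this example.

```javascript
// Sample documents (only the fields needed for this sketch).
const sampleDocs = [
  { title: 'MongoDB Overview', by_user: 'w3cschool.cc', likes: 100 },
  { title: 'NoSQL Overview', by_user: 'w3cschool.cc', likes: 10 },
  { title: 'Neo4j Overview', by_user: 'Neo4j', likes: 750 }
];

// Hypothetical helper: groups by one field and adds 1 per document,
// like {$group : {_id : "$by_user", num_tutorial : {$sum : 1}}}.
function countBy(documents, field) {
  const counts = new Map();
  for (const doc of documents) {
    counts.set(doc[field], (counts.get(doc[field]) || 0) + 1);
  }
  // One output document per distinct value of the field.
  return Array.from(counts, ([key, n]) => ({ _id: key, num_tutorial: n }));
}

const grouped = countBy(sampleDocs, 'by_user');
console.log(grouped);
```

Every document adds 1 to the counter of its group, which is why {$sum : 1} behaves like count(*).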

The following table lists some aggregation expressions:

Expression	Description	Example
$sum	Calculates the sum.	db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : "$likes"}}}])
$avg	Calculates the average.	db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$avg : "$likes"}}}])
$min	Gets the minimum of the corresponding values across all documents in the group.	db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$min : "$likes"}}}])
$max	Gets the maximum of the corresponding values across all documents in the group.	db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$max : "$likes"}}}])
$push	Appends a value to an array in the resulting document.	db.mycol.aggregate([{$group : {_id : "$by_user", url : {$push : "$url"}}}])
$addToSet	Adds a value to an array in the resulting document, without creating duplicates.	db.mycol.aggregate([{$group : {_id : "$by_user", url : {$addToSet : "$url"}}}])
$first	Gets the value from the first document, according to the sort order of the source documents.	db.mycol.aggregate([{$group : {_id : "$by_user", first_url : {$first : "$url"}}}])
$last	Gets the value from the last document, according to the sort order of the source documents.	db.mycol.aggregate([{$group : {_id : "$by_user", last_url : {$last : "$url"}}}])
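The differences between these expressions are easier to see side by side. The following plain-JavaScript sketch applies a rough equivalent of each one to the sample data, grouped by by_user. It is only an illustration under the assumption that the documents arrive in the order shown; the variable names are invented for the example.

```javascript
const articles = [
  { by_user: 'w3cschool.cc', url: 'http://www.w3cschool.cc', likes: 100 },
  { by_user: 'w3cschool.cc', url: 'http://www.w3cschool.cc', likes: 10 },
  { by_user: 'Neo4j', url: 'http://www.neo4j.com', likes: 750 }
];

// Collect the documents of each group, keyed by by_user.
const groups = new Map();
for (const doc of articles) {
  if (!groups.has(doc.by_user)) groups.set(doc.by_user, []);
  groups.get(doc.by_user).push(doc);
}

// Apply a rough equivalent of each expression to every group.
const summary = Array.from(groups, ([user, members]) => {
  const likes = members.map(d => d.likes);
  const urls = members.map(d => d.url);
  const total = likes.reduce((a, b) => a + b, 0);
  return {
    _id: user,
    sum: total,                         // like $sum : "$likes"
    avg: total / likes.length,          // like $avg : "$likes"
    min: Math.min(...likes),            // like $min : "$likes"
    max: Math.max(...likes),            // like $max : "$likes"
    pushed: urls,                       // like $push : duplicates are kept
    unique: Array.from(new Set(urls)),  // like $addToSet : duplicates are dropped
    first: urls[0],                     // like $first : "$url"
    last: urls[urls.length - 1]         // like $last : "$url"
  };
});

console.log(summary);
```

For the w3cschool.cc group, pushed holds the same URL twice while unique holds it once, which is the practical difference between $push and $addToSet.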

The pipeline concept

In Unix and Linux, a pipe is generally used to pass the output of the current command as input to the next command.

MongoDB's aggregation pipeline passes documents through a sequence of stages: the result of one stage is handed to the next stage for processing. Pipeline operations can be repeated.

Expressions process input documents and produce output. Expressions are stateless: they can only operate on the document currently in the aggregation pipeline and cannot access other documents.

Here are several operations commonly used in the aggregation framework:

$project: Modifies the structure of the input documents. It can be used to rename, add, or remove fields, and also to create computed values and nested documents.
$match: Filters the data so that only documents meeting the conditions are output. $match uses MongoDB's standard query operators.
$limit: Limits the number of documents returned by the aggregation pipeline.
$skip: Skips the specified number of documents in the aggregation pipeline and returns the rest.
$unwind: Splits an array field of a document into multiple documents, each containing one value from the array.
$group: Groups the documents in the collection; it can be used to compute statistics.
$sort: Sorts the input documents and outputs them in order.
$geoNear: Outputs documents ordered by proximity to a geographic location.
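The idea that each stage hands its result to the next one can be sketched in plain JavaScript. In the example below every stage is a function from an array of documents to a new array, and runPipeline chains them. All function names here are hypothetical and only imitate the behavior of a few stages; this is not MongoDB's implementation.

```javascript
// Each hypothetical stage takes an array of documents and returns a new array.
const match = predicate => docs => docs.filter(predicate);
const sortBy = field => docs => [...docs].sort((a, b) => a[field] - b[field]);
const unwind = field => docs =>
  docs.flatMap(doc => doc[field].map(value => ({ ...doc, [field]: value })));
const skip = n => docs => docs.slice(n);
const limit = n => docs => docs.slice(0, n);

// Feed the output of each stage into the next one, like a Unix pipe.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

const input = [
  { title: 'MongoDB Overview', tags: ['mongodb', 'database', 'NoSQL'], likes: 100 },
  { title: 'NoSQL Overview', tags: ['mongodb', 'database', 'NoSQL'], likes: 10 },
  { title: 'Neo4j Overview', tags: ['neo4j', 'database', 'NoSQL'], likes: 750 }
];

const output = runPipeline(input, [
  match(doc => doc.likes >= 100), // keeps 2 documents
  sortBy('likes'),                // ascending by likes
  unwind('tags'),                 // one document per tag: 6 documents
  skip(1),                        // drops the first: 5 documents
  limit(2)                        // keeps the first 2 of those
]);

console.log(output);
```

Note how unwind turns one document with three tags into three documents that differ only in the tags field, and how the order of the stages changes what skip and limit see.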

Examples of pipeline operators

1. $project example

db.article.aggregate([
    { $project : {
        title : 1 ,
        author : 1
    }}
]);

In this case the result contains only three fields: _id, title, and author. The _id field is included by default; to exclude it, write:

db.article.aggregate([
    { $project : {
        _id : 0 ,
        title : 1 ,
        author : 1
    }}
]);
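The default handling of _id can be imitated with a small plain-JavaScript helper. This is only a sketch of an inclusion-style projection; the project function and the sample article are invented for illustration.

```javascript
// Hypothetical helper mimicking an inclusion-style $project.
function project(doc, spec) {
  const out = {};
  // _id is kept unless it is explicitly excluded with _id : 0.
  if (spec._id !== 0) out._id = doc._id;
  for (const [field, flag] of Object.entries(spec)) {
    if (field !== '_id' && flag === 1 && field in doc) out[field] = doc[field];
  }
  return out;
}

// Made-up document for the example.
const article = { _id: 1, title: 'MongoDB Overview', author: 'w3cschool.cc', likes: 100 };

const withId = project(article, { title: 1, author: 1 });
const withoutId = project(article, { _id: 0, title: 1, author: 1 });

console.log(Object.keys(withId));
console.log(Object.keys(withoutId));
```

The likes field is dropped in both cases because it is not listed, while _id disappears only when it is set to 0.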

2. $match example

db.articles.aggregate([
    { $match : { score : { $gt : 70, $lte : 90 } } },
    { $group : { _id : null, count : { $sum : 1 } } }
]);

$match selects the records with a score greater than 70 and less than or equal to 90, then passes the matching records to the next pipeline stage, $group, for processing.

3. $skip example
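As a rough check of the boundary conditions, the same filter and count can be written in plain JavaScript. The scores below are made up for the example, and the code only imitates the two stages.

```javascript
// Made-up scores for illustration.
const scored = [
  { score: 65 }, { score: 71 }, { score: 90 }, { score: 95 }, { score: 80 }
];

// Like { $match : { score : { $gt : 70, $lte : 90 } } }
const matched = scored.filter(doc => doc.score > 70 && doc.score <= 90);

// Like { $group : { _id : null, count : { $sum : 1 } } }:
// a single group for all matched documents, or no output if nothing matched.
const counted = matched.length > 0 ? [{ _id: null, count: matched.length }] : [];

console.log(counted);
```

The score 90 is counted because $lte includes the upper bound, while a score of exactly 70 would be excluded by $gt.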

db.article.aggregate([
    { $skip : 5 }
]);

After the $skip pipeline operator runs, the first five documents are "filtered" out.
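Which documents count as "the first five" depends on the order in which they reach $skip, so $skip is usually placed after a $sort. A common use is paging with $skip and $limit together. The sketch below builds such a list of stages as a plain JavaScript array; the helper name pageStages and the choice of sorting by _id are assumptions made for the example.

```javascript
// Hypothetical helper: builds the stages for one page of results.
function pageStages(pageNumber, pageSize) {
  return [
    { $sort: { _id: 1 } },                   // fix the order before skipping
    { $skip: (pageNumber - 1) * pageSize },  // skip the earlier pages
    { $limit: pageSize }                     // keep one page
  ];
}

const secondPage = pageStages(2, 5);
console.log(JSON.stringify(secondPage));

// In the shell, the stages could then be passed to aggregate(), for example:
// db.article.aggregate(secondPage)
```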
