實際使用mongoDB時,遇到無法簡單下query得到結果的情況(例如計算不同科目上課學生的平均分數),我們會使用aggregation。
先來看一段aggregation的例子:
1 2 3 4 5 6 7 8 9 10
| db.companies.aggregate( [ { $match: { "relationships.person": { $ne: null } } }, { $project: { relationships: 1, _id: 0 } }, { $unwind: "$relationships" }, { $group: { _id: "$relationships.person", count: { $sum: 1 } } }, { $sort: { count: -1 } } ] )
|
aggregation的語法是以[]封裝所有的流程,裡面每一個{}可視為stage,類似linux command line可以將shell script的結果傳給下一個;以上例來說,先找出companies裡relationships的person不為空值的結果,將其傳給下一階段。
而常見的stage有$match, $project, $unwind和$group
$match能夠找出指定條件的document
$project和query的使用方式一樣,選擇結果想要出現的欄位
$unwind則是將document的陣列元素打散產生數個document,使新的document只出現一個該陣列元素
$group能將collection裡依條件做分群的動作,之後還能對新document做數值運算
再回到上例,$match的結果到$project stage時,選擇只要relationships這個陣列的資料,在$unwind根據relationships產生數個document,每個document就是一筆陣列元素。接下來$group把relationships的person當作新document的_id,以person分群並計算同一個人參與多少公司。最後根據count由大排到小。
一般建議的作法是先使用$match或$unwind限縮document的數量,讓之後的處理能加快。
下一個例子是找出學校科目中學生平均最高的例子,document內容:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
| { "_id" : ObjectId("50b59cd75bed76f46522c392"), "student_id" : 10, "class_id" : 5, "scores" : [ { "type" : "exam", "score" : 69.17634380939022 }, { "type" : "quiz", "score" : 61.20182926719762 }, { "type" : "homework", "score" : 73.3293624199466 }, { "type" : "homework", "score" : 15.206314042622903 }, { "type" : "homework", "score" : 36.75297723087603 }, { "type" : "homework", "score" : 64.42913107330241 } ] }
|
我們要找的目標只包含homework和exam,不包括quiz,所以第一階階使用$match和$unwind
1 2 3 4
| > db.grades.aggregate( [ { $unwind: "$scores" }, { $match: { "scores.type": { $ne: "quiz" } } } ])
|
結果:
1 2 3 4 5 6
| { "_id" : ObjectId("50b59cd75bed76f46522c34e"), "student_id" : 0, "class_id" : 2, "scores" : { "type" : "exam", "score" : 57.92947112575566 } } { "_id" : ObjectId("50b59cd75bed76f46522c34e"), "student_id" : 0, "class_id" : 2, "scores" : { "type" : "homework", "score" : 68.1956781058743 } } { "_id" : ObjectId("50b59cd75bed76f46522c34e"), "student_id" : 0, "class_id" : 2, "scores" : { "type" : "homework", "score" : 67.95019716560351 } } { "_id" : ObjectId("50b59cd75bed76f46522c34e"), "student_id" : 0, "class_id" : 2, "scores" : { "type" : "homework", "score" : 18.81037253352722 } } { "_id" : ObjectId("50b59cd75bed76f46522c351"), "student_id" : 0, "class_id" : 16, "scores" : { "type" : "exam", "score" : 59.1805667559299 } } { "_id" : ObjectId("50b59cd75bed76f46522c351"), "student_id" : 0, "class_id" : 16, "scores" : { "type" : "homework", "score" : 6.48470951607214 } }
|
再來做group,以’class_id’為新collection的_id並計算該科的平均分數,最後再做排序:
1 2 3 4 5 6
| > db.grades.aggregate( [ { $unwind: "$scores" }, { $match: { "scores.type": { $ne: "quiz" } } }, { $group: { _id: { "class": "$class_id" }, avg: { $avg: "$scores.score" } } }, { $sort: { avg: -1 }} ])
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
| { "_id" : { "class" : 1 }, "avg" : 64.89462395242741 } { "_id" : { "class" : 9 }, "avg" : 58.32770509408073 } { "_id" : { "class" : 5 }, "avg" : 56.80408386154663 } { "_id" : { "class" : 26 }, "avg" : 55.73775030281308 } { "_id" : { "class" : 4 }, "avg" : 55.51344573971435 } { "_id" : { "class" : 20 }, "avg" : 54.06379996612098 } { "_id" : { "class" : 24 }, "avg" : 53.72872750965091 } { "_id" : { "class" : 16 }, "avg" : 53.36304538085671 } { "_id" : { "class" : 13 }, "avg" : 52.10852066426234 } { "_id" : { "class" : 14 }, "avg" : 51.843459264371795 } { "_id" : { "class" : 23 }, "avg" : 51.55076295987544 } { "_id" : { "class" : 11 }, "avg" : 51.52594302618334 } { "_id" : { "class" : 3 }, "avg" : 51.415324716021125 } { "_id" : { "class" : 29 }, "avg" : 50.190997428907586 } { "_id" : { "class" : 0 }, "avg" : 49.79223626365225 } { "_id" : { "class" : 17 }, "avg" : 49.63426020975154 } { "_id" : { "class" : 6 }, "avg" : 49.51404572271715 } { "_id" : { "class" : 27 }, "avg" : 49.195554303080606 } { "_id" : { "class" : 19 }, "avg" : 48.92999492446235 } { "_id" : { "class" : 7 }, "avg" : 48.78360353370019 }
|
官方參考文件