mapreduce - Sum of Substrings in mongodb -
we have field(s) in mongodb has numbers in string form, values such "$123,00,89.00" or "1234$" etc
is possible customize $sum accumulators in mongodb, that, processing can done @ each field value while sum performed. such substring or reg-ex processing etc.
the .mapreduce()
method need here. cannot "cast" values in aggregation framework 1 "type" ( exception of "to string" or date
numeric ).
the javascript processing means can convert string value "summing". somthing ( bit more work on "safe" regex required "currency" values:
db.collection.mapreduce( function() { emit(null, this.amount.replace(/\$|,|\./g,"") / 100 ); }, function(key,values) { return array.sum(values); }, { "out": { "inline": 1 } } )
or .group()
uses javascript procesing, bit more restrcitive in it's requirements:
db.collection.group({ "key": null, "reduce": function( curr,result ) { result.total += curr.amount.replace(/\$|,|\./g,"") /100; }, "initial": { "total": 0 } });
so javascript processing option these sorts of operations not supported in aggregatation framework.
a number can string:
db.junk.aggregate([{ "$project": { "a": { "$substr": [ 1,0,1 ] } } }]) { "_id" : objectid("55a458c567446a4351c804e5"), "a" : "1" }
and date
can become number:
db.junk.aggregate([{ "$project": { "a": { "$subtract": [ new date(), new date(0) ] } } }]) { "_id" : objectid("55a458c567446a4351c804e5"), "a" : numberlong("1436835669446") }
but there no other operators "cast" "string" "numeric" or anthing regex replace shown above.
if want use .aggregate()
need fix data format support it, "numeric":
var bulk = db.collection.initializeorderedbulkop(), count = 0; db.collection.find({ "amount": /\$|,\./g }).foreach(function(doc) { doc.amount = doc.amount.replace(/\$|,|\./g,"") /100; bulk.find({ "_id": doc._id }).updateone({ "$set": { "amount": doc.amount } }); count++; // execute once in 1000 operations if ( count % 1000 == 0 ) { bulk.execute(); bulk = db.collection.initializeorderedbulkop(); } }); // clean queued operations if ( count % 1000 != 0 ) bulk.execute();
then can use .aggregate()
on "numeric" data:
db.collection.aggregate([ { "$group": { "_id": null, "total": { "$sum": "$amount" } } } ])
Comments
Post a Comment