mongoAdjust: Use a function to update documents in a mongo-db dynamically

View source: R/Mongo_extensions.R

Description

A regular query to a mongo-db, such as the one used in the update method, sets fields statically or with limited calculations (increment or multiply). More extensive modifications are not possible this way. If, for example, you want to take two text fields, concatenate them, and store the result in the DB, you need to retrieve the document(s), adapt their values, and write the new values back to the database. Doing this in one go can require a lot of memory, so this function does it sequentially, one page of documents at a time. Its use is comparable to that of the apply family. The following steps are taken:

  1. The findqry is executed over the mongo-db, which returns a pointer to the result-list (which is still stored on the server)

  2. A page of pagesize documents is retrieved, and identifier information is stored

  3. The resulting documents are passed on to FUN. Then there are a few possibilities, based on how many fields you want to update, and whether these are array fields or single values:

    • If setfield is a length-one character, and FUN returns a vector, this is interpreted as one value for each document.

    • If setfield is a length-one character, and FUN returns an unnamed list, this is interpreted as one element for each document. Note that these elements are coerced to arrays, even if the elements themselves are length-one. If you want to prevent this (e.g. to mix values and arrays), you can use unbox.

    • If setfield is a character of length>1, FUN is expected to return a list with names equal to the values of setfield. The elements of this list are treated the same as in the two cases above.

    • If setfield is of length 1, FUN is also allowed to return a named list of length one, with the name of setfield. This is useful if you don't know the length of setfield beforehand.

  4. Updates are done based on unique values, so if the results take only a few possible values, updating will be faster. For passing on values to mongo-db, toJSON is used, which influences some details (NAs are converted to null, NULLs to empty arrays, etc.). Use the jsonargs parameter to pass on extra arguments to toJSON.
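
The paging in step 2 and the grouping by unique values in step 4 can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the in-memory list docs stands in for a collection, and page_update, id and group are hypothetical names that do not appear in the package.

```r
# Hypothetical stand-in for a collection: a list of documents with an id field
docs <- lapply(1:7, function(i) {
  list(id = i, group = if (i %% 2 == 0) "even" else "odd")
})

# Hypothetical helper: walk over docs in pages, call FUN on each page,
# and group the document ids by the distinct values FUN returned
page_update <- function(docs, FUN, pagesize = 3) {
  matched <- 0
  updates <- list()
  if (length(docs) == 0) {
    return(list(matchedCount = matched, updates = updates))
  }
  for (s in seq(1, length(docs), by = pagesize)) {
    page <- docs[s:min(s + pagesize - 1, length(docs))]
    ids <- vapply(page, function(d) d$id, integer(1))
    values <- FUN(page)
    # One entry per distinct value, so one update per value rather than per document
    updates <- c(updates, list(split(ids, values)))
    matched <- matched + length(page)
  }
  list(matchedCount = matched, updates = updates)
}

res <- page_update(docs, function(x) {
  vapply(x, function(d) toupper(d$group), character(1))
})
```

Here each page yields at most two groups of ids, however many documents it holds, which is why few distinct result values mean fewer update calls.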

Usage

mongoAdjust(moncol, findqry = "{}", infields = c("All"),
  setfield = "extraInfo_from_R", FUN, ..., jsonargs = list(), skip = 0,
  limit = 0, pagesize = 1000, verbose = FALSE)

Arguments

moncol

Used when directly calling mongoAdjust, pointer to the mongo-collection. If you use monPlus()$adjust, this is retrieved from monPlus

findqry

Query to find matching documents

infields

Fields to retrieve and pass on to FUN, as a character-vector. Use 'All' (default) to get all possible fields, or a first element of 'Not' to list all fields that should be discarded.

setfield

Field(s) to set, as a character-vector. May be overlapping with infields.

FUN

Function that handles the documents and gives back the values to set.
The input to FUN is a list of retrieved documents (normally of length pagesize, unless the end is reached), plus any extra arguments passed on with ...
If setfield is of length>1, FUN should give back a named list, with length(FUN())==length(setfield) && names(FUN())==setfield, containing as elements unnamed lists or vectors of length equal to the number of input documents.
If setfield is of length one, FUN can either give back a similar named list of length one, or a single such element (an unnamed list or vector of the same length as the input).
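
The three return shapes described above can be illustrated with plain R functions. This is a sketch under assumptions: the exact structure of the documents handed to FUN is not specified here, so docs is modelled as a list of nested R lists, and full_name, name_parts and both are hypothetical names.

```r
# Assumed shape of one page of documents as FUN might receive it
docs <- list(
  list(Author = list(FirstName = "John", LastName = "Smith")),
  list(Author = list(FirstName = "James", LastName = "Brown"))
)

# Shape 1: atomic vector, one value per document (setfield of length one)
full_name <- function(x) {
  vapply(x, function(d) paste(d$Author$FirstName, d$Author$LastName),
         character(1))
}

# Shape 2: unnamed list, one element per document (setfield of length one);
# each element would be stored as an array
name_parts <- function(x) {
  lapply(x, function(d) c(d$Author$FirstName, d$Author$LastName))
}

# Shape 3: named list with one entry per name in setfield,
# here for a hypothetical setfield = c("FullName", "Initials")
both <- function(x) {
  initials <- vapply(x, function(d) {
    paste0(substr(d$Author$FirstName, 1, 1), substr(d$Author$LastName, 1, 1))
  }, character(1))
  list(FullName = full_name(x), Initials = initials)
}
```

In every shape the per-document part has the same length as the input page, which is the condition stated above.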

...

Extra arguments passed on to FUN

jsonargs

List of extra arguments passed on to toJSON, see there for details. Useful to specify encoding of dates, NA, NULLs, etc. A number of arguments have differing defaults: POSIXt and raw default to 'mongo' if not specified.
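
The effect of such arguments can be seen by calling toJSON directly. The sketch below assumes that toJSON and unbox are the functions from the jsonlite package (this page does not name the package); it shows why length-one values end up as arrays by default and two ways to avoid that.

```r
library(jsonlite)  # assumption: toJSON and unbox come from jsonlite

# Default: a length-one vector is written as an array
boxed <- toJSON(list(x = 1))

# unbox marks a single value as a scalar
scalar <- toJSON(list(x = unbox(1)))

# auto_unbox = TRUE does the same for every length-one vector
auto <- toJSON(list(x = 1, y = c(1, 2)), auto_unbox = TRUE)
```

Under the same assumption, something like jsonargs = list(auto_unbox = TRUE) would be the way to request the last behaviour from mongoAdjust.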

skip

Number of documents to skip, useful for stopping and resuming later

limit

Limit on number of documents. 0 for unlimited.

pagesize

Number of documents to use for one page. Smaller uses less memory, but is slower.

verbose

Emit extra output (counter after a page has been processed). Takes over the default from a monPlus-object if provided.

Value

A list with elements modifiedCount and matchedCount, each the total over all documents processed.

Examples

# Assumed: we can establish a connection to mongodb://localhost:27017,
# with documents containing the fields Author.FirstName and Author.LastName
MyMongo <- monPlus('MyCol', 'MyMon')
MyMongo$insert(c(
  '{"OwnID":"Doc1","Author": {"FirstName": "John", "LastName": "Smith"}}',
  '{"OwnID":"Doc2","Author": {"FirstName": "James", "LastName": "Brown"}}',
  '{"OwnID":"Doc3","Author": {"FirstName": "George", "LastName": "Watson"}}'))

# Build Author.FullName from the retrieved name fields, one value per document
mongoAdjust(MyMongo$col,
  infields = c('Author.FirstName', 'Author.LastName'),
  setfield = 'Author.FullName',
  FUN = function(x) {unname(sapply(x, paste, collapse = ' '))})

# Cleaning up
MyMongo$remove('{"OwnID": {"$in": ["Doc1","Doc2","Doc3"]}}')
if(MyMongo$count()==0) MyMongo$drop()

Dans-labs/MongoPlussed documentation built on July 23, 2018, 2:35 p.m.