mongo: MongoDB client
In mongolite: Fast and Simple 'MongoDB' Client for R

Description

Connect to a MongoDB collection. Returns a mongo connection object with the methods listed below. Connections to the same database are pooled automatically across collection and gridfs objects.

Usage

mongo(
  collection = "test",
  db = "test",
  url = "mongodb://localhost",
  verbose = FALSE,
  options = ssl_options()
)

Arguments

`collection`	name of the collection
`db`	name of the database
`url`	address of the MongoDB server, in MongoDB connection string URI format
`verbose`	if TRUE, emit additional output
`options`	additional connection options, such as SSL keys and certificates

Details

This manual page is deliberately minimal; see the mongolite user manual for more details and worked examples.

Value

On success, returns a pointer to a collection on the server. The collection can be accessed through the methods described below.

Methods

aggregate(pipeline = '{}', handler = NULL, pagesize = 1000, iterate = FALSE): Execute a pipeline using the MongoDB aggregation framework. Set iterate = TRUE to return an iterator instead of a data frame.
count(query = '{}'): Count the number of records matching a given query. By default, counts all records in the collection.
disconnect(gc = TRUE): Disconnect the collection. The underlying connection is closed once the client is no longer used by any collection in the pool.
distinct(key, query = '{}'): List unique values of a field given a particular query.
drop(): Delete entire collection with all data and metadata.
export(con = stdout(), bson = FALSE, query = '{}', fields = '{}', sort = '{"_id":1}'): Stream all data from the collection to a connection in jsonlines format (similar to mongoexport). Alternatively, when bson = TRUE, output the binary bson format (similar to mongodump).
find(query = '{}', fields = '{"_id" : 0}', sort = '{}', skip = 0, limit = 0, handler = NULL, pagesize = 1000): Retrieve fields from records matching query. The default handler returns all data as a single data frame.
import(con, bson = FALSE): Stream import data in jsonlines format from a connection, similar to the mongoimport utility. Alternatively when bson = TRUE it assumes the binary bson format (similar to mongorestore).
index(add = NULL, remove = NULL): List, add, or remove indexes from the collection. The add and remove arguments can each be either a field name or a json object. Returns a data frame with the current indexes.
info(): Returns collection statistics and server info (if available).
insert(data, pagesize = 1000, stop_on_error = TRUE, ...): Insert rows into the collection. Argument 'data' must be a data frame, a named list (for a single record), or a character vector of json strings (one string for each row). For lists and data frames, arguments in ... are passed to jsonlite::toJSON.
iterate(query = '{}', fields = '{"_id":0}', sort = '{}', skip = 0, limit = 0): Run the query and return an iterator that reads records one by one.
mapreduce(map, reduce, query = '{}', sort = '{}', limit = 0, out = NULL, scope = NULL): Perform a map-reduce query. The map and reduce arguments are strings, each containing a JavaScript function. Set out to a string to store the results in a collection instead of returning them.
remove(query = "{}", just_one = FALSE): Remove record(s) matching query from the collection.
rename(name, db = NULL): Change the name or database of a collection. Changing the name is cheap; changing the database is expensive.
replace(query, update = '{}', upsert = FALSE): Replace matching record(s) with value of the update argument.
run(command = '{"ping": 1}', simplify = TRUE): Run a raw MongoDB command on the database. If the command returns data, the output is simplified by default; set simplify = FALSE to disable this.
update(query, update = '{"$set":{}}', upsert = FALSE, multiple = FALSE): Modify fields of matching record(s) with value of the update argument.
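Most of the methods above take their query, fields and sort arguments as json strings. As a minimal sketch, such strings can be built from R lists with jsonlite::toJSON rather than typed by hand; this is one possible approach, not something the package requires. The variable names and the connection `m` in the final comment are hypothetical.

```r
library(jsonlite)

# Build a query from an R list instead of hand-writing the json string.
# auto_unbox = TRUE writes length-one vectors as scalars (1 rather than [1]).
query <- toJSON(list(month = 1, day = 1), auto_unbox = TRUE)

# Nested lists become nested objects, which is how operators such as
# "$gt" are expressed.
long_haul <- toJSON(list(distance = list("$gt" = 3000)), auto_unbox = TRUE)

print(query)      # {"month":1,"day":1}
print(long_haul)  # {"distance":{"$gt":3000}}

# Hypothetical use with a connection `m` (needs a running server):
# m$find(as.character(query))
```

Calling as.character() drops the "json" class, so the method receives a plain character string.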

References

Mongolite User Manual

Jeroen Ooms (2014). The jsonlite Package: A Practical and Consistent Mapping Between JSON Data and R Objects. arXiv:1403.2805. https://arxiv.org/abs/1403.2805

Examples

# Connect to demo server
con <- mongo("mtcars", url =
  "mongodb+srv://readwrite:test@cluster0-84vdt.mongodb.net/test")
if(con$count() > 0) con$drop()
con$insert(mtcars)
stopifnot(con$count() == nrow(mtcars))

# Query data
mydata <- con$find()
stopifnot(all.equal(mydata, mtcars))
con$drop()

# Automatically disconnect when connection is removed
rm(con)
gc()

## Not run: 
# dplyr example
library(nycflights13)

# Insert some data
m <- mongo(collection = "nycflights")
m$drop()
m$insert(flights)

# Basic queries
m$count('{"month":1, "day":1}')
jan1 <- m$find('{"month":1, "day":1}')

# Sorting
jan1 <- m$find('{"month":1,"day":1}', sort='{"distance":-1}')
head(jan1)

# Sorting on large data requires an index
m$index(add = "distance")
allflights <- m$find(sort='{"distance":-1}')

# Select columns
jan1 <- m$find('{"month":1,"day":1}', fields = '{"_id":0, "distance":1, "carrier":1}')

# List unique values
m$distinct("carrier")
m$distinct("carrier", '{"distance":{"$gt":3000}}')

# Tabulate
m$aggregate('[{"$group":{"_id":"$carrier", "count": {"$sum":1}, "average":{"$avg":"$distance"}}}]')

# Map-reduce (binning)
hist <- m$mapreduce(
  map = "function(){emit(Math.floor(this.distance/100)*100, 1)}",
  reduce = "function(id, counts){return Array.sum(counts)}"
)

# Stream jsonlines into a connection
tmp <- tempfile()
m$export(file(tmp))

# Remove the collection
m$drop()

# Import a jsonlines stream from a connection
dmd <- mongo("diamonds")
dmd$import(url("http://jeroen.github.io/data/diamonds.json"))
dmd$count()

# Remove the collection
dmd$drop()

## End(Not run)
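The export and import examples above exchange data in jsonlines format, that is, one json object per line. The sketch below illustrates that layout without a server, using jsonlite's stream_out and stream_in. It assumes a file written by export follows the same one-object-per-line layout; the `records` data frame is made up for illustration.

```r
library(jsonlite)

tmp <- tempfile(fileext = ".json")

# Illustrative data, not taken from any real collection.
records <- data.frame(carrier = c("AA", "UA"), distance = c(1089, 2475))

# Write one json object per line.
stream_out(records, file(tmp), verbose = FALSE)
lines <- readLines(tmp)

# Read the lines back into a data frame.
restored <- stream_in(file(tmp), verbose = FALSE)
unlink(tmp)
```

Here `lines` holds one string per record, and `restored` has the same rows and columns as `records`.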

mongolite documentation built on Oct. 7, 2024, 1:11 a.m.