Dataset: Create, update, delete and certify MicroStrategy datasets

Dataset R Documentation

Create, update, delete and certify MicroStrategy datasets

Description

When creating a new dataset, provide a dataset name and an optional description. When updating a pre-existing dataset, provide the dataset identifier. Tables are added to the dataset in an iterative manner using 'add_table()'.

Public fields

connection

MicroStrategy connection object

name

Name of the dataset

description

Description of the dataset. Must be less than or equal to 250 characters

folder_id

If specified, the dataset will be saved in this folder.

dataset_id

Identifier of a pre-existing dataset. Used when updating a pre-existing dataset

owner_id

Owner ID

path

Cube path

modification_time

Last modification time, "yyyy-MM-dd HH:mm:ss" in UTC

size

Cube size

cube_state

Cube status, for example, 0=unpublished, 1=publishing, 64=ready

verbose

If TRUE (default), displays additional messages.
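The 'cube_state' and 'modification_time' fields are easier to work with once decoded. The following is a minimal sketch in base R, assuming only the three status codes listed above and the documented timestamp format. The helper names 'describe_cube_state' and 'parse_modification_time' and the sample timestamp are hypothetical and not part of mstrio.

```r
# Labels taken from the examples listed for 'cube_state'; other codes are not covered here.
state_labels <- c("0" = "unpublished", "1" = "publishing", "64" = "ready")

# Hypothetical helper: map a status code to a label, or "unknown" if it is not listed.
describe_cube_state <- function(state) {
  label <- state_labels[as.character(state)]
  if (is.na(label)) "unknown" else unname(label)
}

# Hypothetical helper: parse a "yyyy-MM-dd HH:mm:ss" string as a UTC timestamp.
parse_modification_time <- function(x) {
  as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
}

describe_cube_state(64)  # "ready"
describe_cube_state(2)   # "unknown"

# Made-up example value in the documented format
parsed <- parse_modification_time("2022-01-15 09:30:00")
format(parsed, "%Y-%m-%d %H:%M:%S %Z")
```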

Methods

Public methods


Method new()

Interface for creating, updating, and deleting MicroStrategy in-memory datasets.

Usage
Dataset$new(
  connection,
  name = NULL,
  description = NULL,
  dataset_id = NULL,
  verbose = TRUE
)
Arguments
connection

MicroStrategy connection object returned by 'Connection$new()'.

name

(character): Name of the dataset.

description

(character, optional): Description of the dataset. Must be less than or equal to 250 characters.

dataset_id

(character, optional): Identifier of a pre-existing dataset. Used when updating a pre-existing dataset.

verbose

Setting to control the amount of feedback from the I-Server.

Details

When creating a new dataset, provide a dataset name and an optional description. When updating a pre-existing dataset, provide the dataset identifier. Tables are added to the dataset in an iterative manner using 'add_table()'.

Returns

A new 'Dataset' object.
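Because 'description' is limited to 250 characters, it can be convenient to check the length locally before constructing the object. The sketch below is one possible pre-flight check in base R. 'validate_description' is a hypothetical helper, not an mstrio function, and whether mstrio itself performs this check is not stated here.

```r
# Hypothetical pre-flight check, not part of mstrio.
validate_description <- function(description, max_chars = 250) {
  # A missing description is allowed because the argument is optional.
  if (is.null(description)) return(invisible(TRUE))
  stopifnot(is.character(description), length(description) == 1)
  if (nchar(description) > max_chars) {
    stop(sprintf("Description has %d characters; the limit is %d.",
                 nchar(description), max_chars))
  }
  invisible(TRUE)
}

validate_description("Quarterly HR metrics")

# A 251-character description is rejected; capture the message instead of stopping.
too_long <- strrep("x", 251)
result <- tryCatch(validate_description(too_long),
                   error = function(e) conditionMessage(e))
result
```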


Method add_table()

Add a data.frame to a collection of tables which are later used to update the MicroStrategy dataset

Usage
Dataset$add_table(
  name,
  data_frame,
  update_policy,
  to_metric = NULL,
  to_attribute = NULL
)
Arguments
name

(character): Logical name of the table that is visible to users of the dataset in MicroStrategy.

data_frame

('data.frame'): R data.frame to add or update.

update_policy

(character): Update operation to perform. One of 'add' (inserts new, unique rows), 'update' (updates data in existing rows and columns), 'upsert' (updates existing data and inserts new rows), or 'replace' (replaces the existing data with new data).

to_metric

(optional, vector): By default, R numeric data types are treated as metrics in the MicroStrategy dataset while character and date types are treated as attributes. For example, a column of integer-like strings ("1", "2", "3") would, by default, be an attribute in the newly created dataset. If the intent is to format this data as a metric, provide the respective column name as a character vector in the 'to_metric' parameter.

to_attribute

(optional, vector): Logical opposite of 'to_metric'. Helpful for formatting an integer-based row identifier as a primary key in the dataset.
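The default typing rule and the two overrides can be previewed locally before any data is sent. The following is a rough sketch in base R of the rule as described for 'to_metric' and 'to_attribute'. 'classify_columns' is a hypothetical helper and only approximates the behavior; the actual mapping is performed by mstrio and the I-Server.

```r
# Hypothetical helper, not part of mstrio: mirrors the default rule described above
# (numeric columns become metrics, everything else becomes an attribute).
classify_columns <- function(df, to_metric = NULL, to_attribute = NULL) {
  roles <- vapply(
    df,
    function(col) if (is.numeric(col)) "metric" else "attribute",
    character(1)
  )
  # Explicit overrides take precedence over the default rule.
  roles[names(roles) %in% to_metric] <- "metric"
  roles[names(roles) %in% to_attribute] <- "attribute"
  roles
}

df <- data.frame(id = c(1, 2, 3),
                 code = c("1", "2", "3"),
                 stringsAsFactors = FALSE)

classify_columns(df)                                           # id: metric, code: attribute
classify_columns(df, to_metric = "code", to_attribute = "id")  # id: attribute, code: metric
```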

Details

Add tables to the dataset in an iterative manner using 'add_table()'.
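The four 'update_policy' values are easier to compare side by side. The sketch below simulates them locally on small data frames, assuming rows are matched on a single key column named 'id'. 'apply_policy' is a hypothetical helper for illustration only; the I-Server's own matching rules may differ.

```r
# Hypothetical local simulation of the update policies; not how mstrio applies them.
apply_policy <- function(existing, incoming, policy, key = "id") {
  policy <- match.arg(policy, c("add", "update", "upsert", "replace"))
  if (policy == "replace") return(incoming)

  is_new <- !(incoming[[key]] %in% existing[[key]])
  result <- existing

  if (policy %in% c("update", "upsert")) {
    # Overwrite rows whose key already exists in the current table.
    matched <- incoming[!is_new, , drop = FALSE]
    if (nrow(matched) > 0) {
      idx <- match(matched[[key]], result[[key]])
      result[idx, names(matched)] <- matched
    }
  }
  if (policy %in% c("add", "upsert")) {
    # Append rows whose key is not present yet.
    result <- rbind(result, incoming[is_new, , drop = FALSE])
  }
  rownames(result) <- NULL
  result
}

existing <- data.frame(id = c(1, 2), salary = c(50000, 100000))
incoming <- data.frame(id = c(2, 3), salary = c(120000, 75000))

apply_policy(existing, incoming, "add")      # ids 1, 2, 3; id 2 keeps its old salary
apply_policy(existing, incoming, "update")   # ids 1, 2; id 2 gets the new salary
apply_policy(existing, incoming, "upsert")   # ids 1, 2, 3; id 2 gets the new salary
apply_policy(existing, incoming, "replace")  # ids 2, 3 only
```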


Method create()

Create a new dataset.

Usage
Dataset$create(
  folder_id = NULL,
  auto_upload = TRUE,
  auto_publish = TRUE,
  chunksize = 1e+05
)
Arguments
folder_id

ID of the shared folder that the dataset should be created within. If 'NULL', defaults to the user's My Reports folder.

auto_upload

(default TRUE) If TRUE, automatically uploads the data to the I-Server. If FALSE, simply creates the dataset definition but does not upload data to it.

auto_publish

(default TRUE) If TRUE, automatically publishes the data used to create the dataset definition. If FALSE, simply creates the dataset but does not publish it. To publish the dataset, data has to be uploaded first.

chunksize

(int, optional) Number of rows to transmit to the I-Server with each request when uploading.


Method update()

Updates an existing dataset with new data.

Usage
Dataset$update(chunksize = 1e+05, auto_publish = TRUE)
Arguments
chunksize

(int, optional): Number of rows to transmit to the I-Server with each request when uploading.

auto_publish

(default TRUE) If TRUE, automatically publishes the data. If FALSE, data will be uploaded but the cube will not be published.
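To get a feel for what 'chunksize' controls, the sketch below splits a data frame into consecutive blocks of at most 'chunksize' rows, which corresponds to one block per upload request. 'chunk_rows' is a hypothetical helper written in base R; it illustrates the idea only and is not how mstrio implements the upload.

```r
# Hypothetical helper, not part of mstrio: split a data frame into row blocks.
chunk_rows <- function(df, chunksize = 1e5) {
  n <- nrow(df)
  if (n == 0) return(list())
  # Rows 1..chunksize go to group 1, the next chunksize rows to group 2, and so on.
  groups <- ceiling(seq_len(n) / chunksize)
  unname(split(df, groups))
}

df <- data.frame(id = seq_len(25), value = seq_len(25) * 2)
chunks <- chunk_rows(df, chunksize = 10)

length(chunks)                    # 3 requests would be needed
vapply(chunks, nrow, integer(1))  # 10 10 5
```

A larger 'chunksize' means fewer requests, each carrying more rows.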


Method publish()

Publish the uploaded data to the selected dataset. A dataset can be published just once.

Usage
Dataset$publish()

Method publish_status()

Check the status of data that was uploaded to a dataset.

Usage
Dataset$publish_status()
Returns

Response status code


Method delete()

Delete a dataset that was previously created using the REST API.

Usage
Dataset$delete()
Returns

Response object from the Intelligence Server acknowledging the deletion process.


Method certify()

Certify a dataset that was previously created using the REST API.

Usage
Dataset$certify()
Returns

Response object from the Intelligence Server acknowledging the certification process.


Method clone()

The objects of this class are cloneable with this method.

Usage
Dataset$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Examples

## Not run: 
# Create data frames
df1 <- data.frame("id" = c(1, 2, 3, 4, 5),
                  "first_name" = c("Jason", "Molly", "Tina", "Jake", "Amy"),
                  "last_name" = c("Miller", "Jacobson", "Turner", "Milner", "Cooze"))

df2 <- data.frame("id" = c(1, 2, 3, 4, 5),
                  "age" = c(42, 52, 36, 24, 73),
                  "state" = c("VA", "NC", "WY", "CA", "CA"),
                  "salary" = c(50000, 100000, 75000, 85000, 250000))

# Create a list of tables containing one or more tables and their names
my_dataset <- Dataset$new(connection=conn, name="HR Analysis")
my_dataset$add_table("Employees", df1, "add")
my_dataset$add_table("Salaries", df2, "add")
my_dataset$create()

# By default, Dataset$create() uploads the data to the Intelligence Server and publishes
# the dataset.
# If you just want to create the dataset but not upload the row-level data, use
my_dataset$create(auto_upload = FALSE)

# followed by
my_dataset$update()
my_dataset$publish()

# When the source data changes and users need the latest data for analysis and reporting in
# MicroStrategy, mstrio allows you to update the previously created dataset.

ds <- Dataset$new(connection=conn, dataset_id="...")
ds$add_table(name = "Stores", data_frame = stores_df, update_policy = 'update')
ds$add_table(name = "Sales", data_frame = sales_df, update_policy = 'upsert')
ds$update(auto_publish=TRUE)

# By default, Dataset$update() uploads the data to the Intelligence Server and publishes
# the dataset.
# If you just want to update the dataset but not publish the row-level data, use
ds$update(auto_publish = FALSE)

# By default, the raw data is transmitted to the server in increments of 100,000 rows. On very
# large datasets (>1 GB), it is beneficial to increase the number of rows transmitted to the
# Intelligence Server with each request. Do this with the chunksize parameter:

ds$update(chunksize = 500000)

# If you want to certify an existing dataset, use
ds$certify()

## End(Not run)

mstrio documentation built on April 13, 2022, 5:07 p.m.