ingest: Ingestion functions for Kusto

ingest_local    R Documentation

Ingestion functions for Kusto

Description

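Functions for ingesting data into a Kusto table from a local source (a data frame or file), a URL, or Azure storage (blob storage, ADLS Gen1 or ADLS Gen2).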

Usage

ingest_local(
  database,
  src,
  dest_table,
  method = NULL,
  staging_container = NULL,
  ingestion_token = database$token,
  http_status_handler = "stop",
  ...
)

ingest_url(database, src, dest_table, async = FALSE, ...)

ingest_blob(
  database,
  src,
  dest_table,
  async = FALSE,
  key = NULL,
  token = NULL,
  sas = NULL,
  ...
)

ingest_adls2(
  database,
  src,
  dest_table,
  async = FALSE,
  key = NULL,
  token = NULL,
  sas = NULL,
  ...
)

ingest_adls1(
  database,
  src,
  dest_table,
  async = FALSE,
  key = NULL,
  token = NULL,
  sas = NULL,
  ...
)

Arguments

database

A Kusto database endpoint object, created with kusto_database_endpoint.

src

The source data. This can be a data frame, a local filename, or a URL.

dest_table

The name of the destination table.

method

For local ingestion, the method to use. See 'Details' below.

staging_container

For local ingestion, an Azure storage container to use for staging the dataset. This can be an object of class AzureStor::blob_container or AzureStor::adls_filesystem. Only used if method="indirect".

ingestion_token

For local ingestion, an Azure Active Directory authentication token for the cluster ingestion endpoint. Only used if method="streaming".

http_status_handler

For local ingestion, how to handle HTTP status codes >= 300. Defaults to "stop"; alternatives are "warn", "message" and "pass". The last option passes through the raw response object from the server unchanged, regardless of the status code. This is mostly useful for debugging, or if you want to see what the Kusto REST API returns. Only used if method="streaming".

...

Named arguments to be treated as ingestion parameters.
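To illustrate what "ingestion parameters" can look like, the sketch below renders named arguments as a Kusto `with (...)` property clause. The helper `render_ingestion_properties` is hypothetical and is not part of AzureKusto; the property names `format` and `ignoreFirstRecord` are used only as examples, and how the package itself formats the clause is an assumption.

```r
# Hypothetical helper (not part of AzureKusto): render named arguments
# as a Kusto "with (...)" ingestion property clause.
render_ingestion_properties <- function(...) {
    props <- list(...)
    if (length(props) == 0) return("")
    if (is.null(names(props)) || any(names(props) == ""))
        stop("all ingestion parameters must be named")
    vals <- vapply(props, function(x) {
        if (is.logical(x)) {
            tolower(as.character(x))    # TRUE becomes true
        } else if (is.numeric(x)) {
            as.character(x)             # numbers are left unquoted
        } else {
            paste0("'", x, "'")         # strings are single-quoted
        }
    }, character(1))
    paste0("with (", paste(names(props), vals, sep = "=", collapse = ", "), ")")
}

clause <- render_ingestion_properties(format = "csv", ignoreFirstRecord = TRUE)
print(clause)
```

With the package itself, the same names would simply be passed as extra arguments, for example `ingest_url(db, src, "table", format="csv", ignoreFirstRecord=TRUE)`.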

async

For the URL ingestion functions, whether to do the ingestion asynchronously. If TRUE, the function will return immediately while the server handles the operation in the background.

key, token, sas

Authentication arguments for the Azure storage ingestion methods. If multiple arguments are supplied, a key takes priority over a token, which takes priority over a SAS. Note that these arguments are for authenticating with the Azure storage account, as opposed to Kusto itself.
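The priority rule above (key, then token, then SAS) can be sketched as a small standalone function. `pick_storage_credential` is a hypothetical name used for illustration only; it is not the package's internal implementation.

```r
# Hypothetical helper: report which storage credential would be used,
# following the documented priority key > token > SAS.
pick_storage_credential <- function(key = NULL, token = NULL, sas = NULL) {
    if (!is.null(key)) return(list(type = "key", value = key))
    if (!is.null(token)) return(list(type = "token", value = token))
    if (!is.null(sas)) return(list(type = "sas", value = sas))
    list(type = "none", value = NULL)
}

# Both a key and a SAS are supplied: the key wins.
chosen <- pick_storage_credential(key = "mykey", sas = "mysas")
print(chosen$type)
```

In practice it is clearer to supply only the one credential you intend to use, so that the priority rule never comes into play.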

Details

A local dataset can be ingested in up to three ways, selected with the method argument.

  • method="indirect": The data is uploaded to blob storage, and then ingested from there. This is the default if the AzureStor package is present.

  • method="streaming": The data is uploaded to the cluster ingestion endpoint. This is the default if the AzureStor package is not present. Be aware, however, that as of February 2019 streaming ingestion is in beta and has to be enabled for a cluster by filing a support ticket.

  • method="inline": The data is embedded into the command text itself. This is only recommended for testing purposes, or small datasets.
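The default selection described in the list above can be sketched as follows. This is a minimal illustration assuming the choice depends only on whether AzureStor is installed; `default_ingest_method` is a hypothetical name, not a function exported by the package.

```r
# Hypothetical helper: choose the default local ingestion method.
# "indirect" needs AzureStor for staging; otherwise fall back to "streaming".
default_ingest_method <- function(
    azurestor_available = requireNamespace("AzureStor", quietly = TRUE)) {
    if (azurestor_available) "indirect" else "streaming"
}

# An explicit method can be validated against the three documented values.
check_ingest_method <- function(method = c("indirect", "streaming", "inline")) {
    match.arg(method)
}

print(default_ingest_method())
print(check_ingest_method("inline"))
```

Passing method explicitly to ingest_local avoids depending on which packages happen to be installed on a given machine.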

Note that the destination table must be created ahead of time for the ingestion to proceed.
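Since the destination table has to exist first, one possible way to prepare it is to build a `.create table` command from the data frame being ingested. The sketch below is illustrative only: `kusto_type` and `create_table_command` are hypothetical helpers, and the R-to-Kusto type mapping is an assumption that may need adjusting for your data.

```r
# Assumed mapping from R column classes to Kusto column types (illustrative).
kusto_type <- function(x) {
    if (inherits(x, "POSIXct") || inherits(x, "Date")) {
        "datetime"
    } else if (is.factor(x) || is.character(x)) {
        "string"
    } else if (is.logical(x)) {
        "bool"
    } else if (is.integer(x)) {
        "int"
    } else if (is.numeric(x)) {
        "real"
    } else {
        "string"
    }
}

# Hypothetical helper: build the text of a ".create table" command.
create_table_command <- function(df, table) {
    types <- vapply(df, kusto_type, character(1))
    cols <- paste(names(df), types, sep = ":", collapse = ", ")
    paste0(".create table ", table, " (", cols, ")")
}

df <- data.frame(id = 1:2, score = c(0.5, 1.5), label = c("a", "b"),
                 stringsAsFactors = FALSE)
cmd <- create_table_command(df, "mytable")
print(cmd)

# With a live endpoint object `db` (not created here), the command could
# then be submitted before ingesting, e.g. with AzureKusto's run_query:
# run_query(db, cmd)
# ingest_local(db, df, "mytable")
```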

Examples

## Not run: 

# ingesting from local:

# ingest via Azure storage
cont <- AzureStor::storage_container("https://mystorage.blob.core.windows.net/container",
    sas="mysas")
ingest_local(db, "file.csv", "table",
    method="indirect", staging_container=cont)

# ingest via the cluster streaming ingestion endpoint
ingest_local(db, "file.csv", "table", method="streaming")

# ingest by inlining data into query
ingest_local(db, "file.csv", "table", method="inline")

# ingesting online data:

# a public dataset: Microsoft web data from UCI machine learning repository
ingest_url(db,
    "https://archive.ics.uci.edu/ml/machine-learning-databases/anonymous/anonymous-msweb.data",
    "table")

# from blob storage:
ingest_blob(db,
    "https://mystorage.blob.core.windows.net/container/myblob",
    "table",
    sas="mysas")

# from ADLSGen2:
token <- AzureRMR::get_azure_token("https://storage.azure.com", "mytenant", "myapp", "password")
ingest_adls2(db,
    "abfss://filesystem@myadls2.dfs.core.windows.net/data/myfile",
    "table",
    token=token)


## End(Not run)

Azure/AzureKusto documentation built on Oct. 16, 2023, 7:04 p.m.