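The package allows you to query time-series data and statistics from Axibase Time-Series Database (ATSD) and to save time-series data in ATSD. List of package functions:
query() - retrieves historical time-series data or forecasts
to_zoo() - builds a zoo object from a data frame
get_metrics() - fetches a list of metrics and their tags
get_entities() - fetches a list of entities and their tags
get_series_tags() - lists the time series and tags collected for a metric
save_series() - saves time-series from a data frame into ATSD
show_connection(), set_connection(), save_connection() - manage the connection parameters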
Execute library(atsd) to start working with the atsd package.
The connection parameters are loaded from the package configuration file, atsd/connection.config, which is located in the atsd package folder.
The command installed.packages()["atsd", "LibPath"] shows you where the atsd package folder is. Open a text editor and modify the configuration file. It should look as follows:
# the url of ATSD including port number
url=http://host_name:port_number
# the user name
user=atsd_user_name
# the user's password
password=atsd_user_password
# validate ATSD SSL certificate: yes, no
verify=no
# cryptographic protocol used by ATSD https server:
# default, ssl2, ssl3, tls1
encryption=ssl3
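To sanity-check an edited configuration file before reloading it, the key=value lines can be parsed in base R. This is only a sketch: parse_config() is a hypothetical helper written for this illustration, not a function of the atsd package, and it is not necessarily how the package itself reads the file.

```r
# Sample content in the format shown above (placeholder values).
config_text <- c(
  "# the url of ATSD including port number",
  "url=http://host_name:port_number",
  "# the user name",
  "user=atsd_user_name",
  "# the user's password",
  "password=atsd_user_password",
  "# validate ATSD SSL certificate: yes, no",
  "verify=no",
  "# cryptographic protocol used by ATSD https server:",
  "# default, ssl2, ssl3, tls1",
  "encryption=ssl3"
)

# Hypothetical helper: turn key=value lines into a named list.
parse_config <- function(lines) {
  lines <- trimws(lines)
  # Drop blank lines and comment lines.
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  # Split each line at the first "=" only, so values may contain "=".
  eq <- regexpr("=", lines, fixed = TRUE)
  keep <- eq > 0
  keys <- trimws(substr(lines[keep], 1, eq[keep] - 1))
  values <- trimws(substr(lines[keep], eq[keep] + 1, nchar(lines[keep])))
  stats::setNames(as.list(values), keys)
}

params <- parse_config(config_text)
print(names(params))
```

With a real file, readLines() on the path of connection.config could supply the lines instead of config_text.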
Reload the modified connection parameters from the configuration file:
set_connection()
Check that parameters are correct:
show_connection()
Refer to Chapter 9 for more options on managing ATSD connection parameters.
Function name: query()
Description: The function retrieves historical time-series data or forecasts from ATSD.
Returns object: data frame
Arguments:
metric (required, string)
The name of the metric you want to get data for,
for example, "disk_used_percent".
To obtain a list of metrics collected by ATSD use the
get_metrics() function.
selection_interval (required, string)
This is the time interval for which the data will be selected.
Specify it as "n-unit", where
unit is a Second, Minute, Hour, Day, Week, Month, Quarter, or Year
and n is the number of units,
for example, "3-Week" or "12-Hour".
entity (optional, string)
The name of the entity you want to get data for.
If not provided, then data for all entities will be fetched
for the specified metric.
Obtain the list of entities with the
get_entities() function.
entity_group (optional, string)
The name of entity group, for example, "HP Servers".
Extracts data for all entities belonging to this group.
tags (optional, string vector)
List of user-defined series tags to filter the fetched time-series data,
for example, c("disk_name=sda1", "mount_point=/") .
end_time (optional, string)
The end time of the selection interval, for example, end_time = "date('2014-12-27')".
If not provided, the current time will be used.
Specify the date and time, or use one of the expressions supported by the end time syntax.
For example, 'current_day' would set the end of the selection interval
to 00:00:00 of the current day.
aggregate_interval (optional, string)
The length of the aggregation interval.
The period of the produced time-series will be equal to the aggregate_interval.
The value for each period is computed by the aggregate_statistics
function applied to all samples of the original time-series within the period.
The format of the aggregate_interval is the same as for the selection_interval
argument, for example, "1-Minute".
aggregate_statistics (optional, string vector)
The statistic functions used for aggregation.
Multiple values are supported, for example, c("Min", "Avg", "StDev").
The default value is "Avg".
interpolation (optional, string)
If aggregation is enabled, then the values for the periods without data
will be computed by one of the following interpolation functions:
"None", "Linear", "Step". The default value is "None".
export_type (optional, string)
Supported options: "History" or "Forecast". The default value is "History".
verbose (optional, logical)
If verbose = FALSE, then all console output will be suppressed.
By default, verbose = TRUE.
Examples:
# get historic data for the given entity, metric, and selection_interval
dfr <- query(entity = "nurswgvml007", metric = "cpu_busy", selection_interval = "1-Hour")
# end_time usage example
query(entity = "host-383", metric = "cpu_usage", selection_interval = "1-Day",
      end_time = "date('2015-02-10 10:15:03')")
# get forecasts
query(metric = "cpu_busy", selection_interval = "30-Minute",
      export_type = "Forecast", verbose = FALSE)
# use aggregation
query(metric = "disk_used_percent", entity_group = "Linux",
      tags = c("mount_point=/boot", "file_system=/dev/sda1"),
      selection_interval = "1-Week", aggregate_interval = "1-Minute",
      aggregate_statistics = c("Avg", "Min", "Max"),
      interpolation = "Linear", export_type = "Forecast")
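The effect of aggregate_interval and aggregate_statistics can be pictured locally with base R. The sketch below groups made-up samples into 1-minute periods and computes the average, minimum, and maximum per period. It only illustrates the idea: the aggregation itself is performed by ATSD, and the sample values and the output column names (Avg, Min, Max) are assumptions, not the documented layout of the data frame returned by query().

```r
# Six made-up samples, 20 seconds apart, starting on a minute boundary (GMT).
start <- as.POSIXct("2015-02-10 10:00:00", tz = "GMT")
raw <- data.frame(
  Timestamp = start + seq(0, 100, by = 20),
  Value = c(10, 20, 30, 40, 60, 80)
)

# Key each sample by the start of its 1-minute period (epoch seconds).
period_key <- floor(as.numeric(raw$Timestamp) / 60) * 60

# One row per period, one column per statistic.
agg <- data.frame(
  Timestamp = as.POSIXct(sort(unique(period_key)),
                         origin = "1970-01-01", tz = "GMT"),
  Avg = as.vector(tapply(raw$Value, period_key, mean)),
  Min = as.vector(tapply(raw$Value, period_key, min)),
  Max = as.vector(tapply(raw$Value, period_key, max))
)
print(agg)
```

The first period covers the samples 10, 20, 30 and the second covers 40, 60, 80, so each output row summarizes three raw samples.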
Function name: to_zoo()
Description: The function builds a zoo object from the given data frame. The timestamp argument names the column of the data frame that is used as the index of the zoo object. The value argument indicates the series to be saved in the zoo object. If several columns are listed in the value argument, they are all saved in a multivariate zoo object. Information from other columns is ignored. To use this function, the 'zoo' package must be installed.
Returns object: zoo object
Arguments:
dfr (required, data frame)
The data frame.
timestamp (optional, character or numeric)
Name or number of the column with timestamps. By default, timestamp = "Timestamp".
value (optional, character vector or numeric vector)
Names or numbers of columns with series values.
By default, value = "Value".
Examples:
# query ATSD for data and transform it to zoo object
dfr <- query(entity = "nurswgvml007", metric = "cpu_busy", selection_interval = "1-Hour")
z <- to_zoo(dfr)
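For readers without a running ATSD instance, the conversion can be pictured with a hand-made data frame that uses the default column names "Timestamp" and "Value". The sketch below calls zoo::zoo() directly. That to_zoo() produces an equivalent object is an assumption, and the sample data (including the extra Entity column) is invented for illustration.

```r
# A made-up data frame shaped like a query result with default column names.
dfr <- data.frame(
  Timestamp = as.POSIXct("2015-02-10 10:00:00", tz = "GMT") + c(0, 60, 120),
  Value = c(11.5, 12.0, 9.75),
  Entity = "nurswgvml007",
  stringsAsFactors = FALSE
)

# Build the zoo object only if the zoo package is installed.
z <- NULL
if (requireNamespace("zoo", quietly = TRUE)) {
  # The timestamp column becomes the index and the value column the data;
  # other columns (here Entity) are left out.
  z <- zoo::zoo(dfr[["Value"]], order.by = dfr[["Timestamp"]])
  print(z)
}
```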
Function name: get_metrics()
Description: This function fetches a list of metrics and their tags from ATSD, and converts it to a data frame.
Returns object: data frame
Each row of the data frame corresponds to a metric and its tags:
name
Metric name (unique)
counter
Counters are metrics with continuously incrementing value
lastInsertTime
Last time value was received by ATSD for this metric
tags
User-defined tags (as requested by the "tags" argument)
Arguments:
expression (optional, string)
Select metrics matching particular name pattern and/or user-defined metric tags.
For examples refer to "Expression syntax" chapter.
active (optional, one of strings: "true" or "false")
Filter metrics by lastInsertTime attribute.
If active = "true", only metrics with positive lastInsertTime are included in the response.
tags (optional, string vector)
User-defined metric tags to be included in the response.
By default, all the tags will be included.
limit (optional, integer)
If limit > 0, the response shows the top-N metrics ordered by name.
verbose (optional, logical)
If verbose = FALSE, then all console output will be suppressed.
Examples:
# get all metrics and include all their tags in the data frame
metrics <- get_metrics()
# get the first 100 active metrics which have the tag "table",
# include this tag in the response and exclude other user-defined metric tags
metrics <- get_metrics(expression = "tags.table != ''", active = "true",
                       tags = "table", limit = 100)
Function name: get_entities()
Description: This function fetches a list of entities and their tags from ATSD, and converts it to a data frame.
Returns object: data frame
Each row of the data frame corresponds to an entity and its tags:
name
Entity name (unique)
enabled
Enabled status, incoming data is discarded for disabled entities
lastInsertTime
Last time value was received by ATSD for this entity
tags
User-defined tags (as requested by the "tags" argument)
Arguments:
expression (optional, string)
Select entities matching particular name pattern and/or user-defined entity tags.
For examples refer to "Expression syntax" chapter.
active (optional, one of strings: "true" or "false")
Filter entities by lastInsertTime attribute.
If active = "true", only entities with positive lastInsertTime are included in the response.
tags (optional, string vector)
User-defined entity tags to be included in the response.
By default, all the tags will be included.
limit (optional, integer)
If limit > 0, the response shows the top-N entities ordered by name.
verbose (optional, logical)
If verbose = FALSE, then all console output will be suppressed.
Examples:
# get all entities
entities <- get_entities()
# select entities by name and user-defined tag "app"
entities <- get_entities(expression = "name like 'nur*' and lower(tags.app) like '*hbase*'")
Function name: get_series_tags()
Description: The function determines which time series ATSD has collected for a given metric. For each time series it lists the tags associated with the series and the last time the series was updated. The list of fetched time series is based on data stored on disk for the last 24 hours.
Returns object: data frame
Each row of the data frame corresponds to a time series and its tags:
entity
Name of the entity that generates the time series.
lastInsertTime
Last time value was received by ATSD for this time series.
tags
Tags of the series.
Arguments:
metric (required, string)
The name of the metric you want to get time series for,
for example, "disk_used_percent".
To obtain a list of metrics collected by ATSD use the
get_metrics() function.
entity (optional, string)
The name of the entity you want to get time series for.
If not provided, then data for all entities will be fetched
for the specified metric.
Obtain the list of entities with the
get_entities() function.
verbose (optional, logical)
If verbose = FALSE, then all console output will be suppressed.
Examples:
# get all time series and their tags collected by ATSD for the "disk_used_percent" metric
tags <- get_series_tags(metric = "disk_used_percent")
# get all time series and their tags for the "disk_used_percent" metric
# and "nurswgvml007" entity
get_series_tags(metric = "disk_used_percent", entity = "nurswgvml007")
Function name: save_series()
Description: The function saves time-series from the data frame into ATSD. The data frame should have a column with timestamps and at least one numeric column with values of a metric.
Returns object: NULL
Arguments:
dfr (required, data frame)
The data frame should have a column with timestamps and
at least one numeric column with values of a metric.
time_col (optional, numeric or character)
Number or name of the column with the timestamps. Default value is 1.
For example, time_col = 1, or time_col = "Timestamp".
Read the "Timestamp format" section below for supported timestamp classes and formats.
time_format (optional, string)
Indicates the format of the timestamps.
This argument is used when the timestamp format is not clear from their class.
The value of this argument can be one of the following: "ms" (for epoch milliseconds), "sec" (for epoch seconds), or a format string, for example "%Y-%m-%d %H:%M:%S".
This format string will be used to convert the provided timestamps
to epoch milliseconds before storing the timestamps in ATSD.
Read the "Timestamp format" section for details.
tz (optional, string)
By default, tz = "GMT". Specify the time zone when timestamps
are strings formatted as described in the time_format argument.
For example, tz = "Australia/Darwin".
View the "TZ" column of the time zones table for
a list of possible values.
metric_col (required, numeric or character vector)
Specifies numbers or names of the columns
where metric values are stored.
For example, metric_col = c(2, 3, 4), or metric_col = c("Value", "Avg").
If the metric_name argument is not given, then names of columns, in lower case,
are used as metric names when saving them in ATSD.
metric_name (optional, character vector)
Specifies metric names. The series indicated by the metric_col argument
are saved in ATSD along with the metric names provided by metric_name.
So the number and order of names in metric_name
should match the columns in metric_col.
If the metric_name argument is not provided, then names of columns, in lower case,
are used as metric names when saving them in ATSD.
entity_col (optional, numeric or character)
Should be provided if the entity argument is not given.
Number or name of a column with entities.
Several entities in the column are allowed.
For example, entity_col = 4, or entity_col = "server001".
entity (optional, character)
Should be provided if the entity_col argument is not given.
Name of the entity.
tags_col (optional, numeric or character vector)
Lists numbers or names of the columns containing tag values.
So the name of a column is a tag name, and values in the column
are the tag values.
tags (optional, character vector)
Lists tags and their values in "tag=value" format.
Each indicated tag will be saved with each series.
verbose (optional, logical)
If verbose = FALSE, then all console output will be suppressed.
Timestamp format
The list of allowed timestamp types:
Numeric, in epoch milliseconds or epoch seconds. In that case time_format = "ms" or time_format = "sec" should be used, and the time zone argument tz is ignored.
An object of one of the types Date, POSIXct, POSIXlt, chron from the chron package, or timeDate from the timeDate package. In that case the arguments time_format and tz are ignored.
String, for example, "2015-01-03 10:07:15". In this case the time_format argument should specify which format string is used for the timestamps. For example, time_format = "%Y-%m-%d %H:%M:%S". Type ?strptime to see the list of format symbols. This format string will be used to convert the provided timestamps to epoch milliseconds before storing the timestamps in ATSD. The time zone given in the tz argument and the standard origin "1970-01-01 00:00:00" are used for the conversion. The conversion is done with the command:
as.POSIXct(time_stamp, format = time_format, origin="1970-01-01", tz = tz)
Note that timestamps will be stored in epoch milliseconds. So if you put some data into ATSD and then retrieve it back, the timestamps will refer to the same time but in the GMT time zone.
For example, if you save the timestamp "2015-02-15 10:00:00" with tz = "Australia/Darwin" in ATSD, and then retrieve it back, you will get the timestamp "2015-02-15 00:30:00" because the Australia/Darwin time zone has a +09:30 shift relative to the GMT zone.
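The Australia/Darwin example can be reproduced in base R without an ATSD connection. The sketch below parses the string timestamp in its time zone, converts it to epoch milliseconds, and prints the same instant in GMT. The variable names are chosen for this illustration, and the sketch assumes that the time zone database on your system includes Australia/Darwin.

```r
time_stamp  <- "2015-02-15 10:00:00"
time_format <- "%Y-%m-%d %H:%M:%S"
tz          <- "Australia/Darwin"

# Parse the string in the given time zone.
parsed <- as.POSIXct(time_stamp, format = time_format, tz = tz)

# Epoch milliseconds: the unit in which the timestamp is stored.
epoch_ms <- as.numeric(parsed) * 1000

# The same instant rendered in GMT, as it would appear after retrieval.
in_gmt <- format(parsed, format = time_format, tz = "GMT")

print(format(epoch_ms, scientific = FALSE))
print(in_gmt)  # "2015-02-15 00:30:00"
```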
Entity specification
You can provide entity name in one of entity or entity_col arguments. In the first case all series will have the same entity. In the second case, entities specified in entity_col column will be saved along with corresponding series.
Tags specification
The tags_col argument indicates which columns of the data frame keep the time-series tags. The name of each column specified by the tags_col argument is a tag name, and the values in the column are tag values.
Before storing the series in ATSD, the data frame is split into several data frames, each of which has a unique entity and a unique list of tag values. The entity and tags are stored in ATSD along with the time-series from the data frame. NA's and missing values in time-series are ignored.
In the tags argument you can specify tags which are the same for all rows (records) of the data frame. Each series value saved in ATSD will then have the tags provided in the tags argument.
Examples:
# Save time-series from columns 3, 4, 5 of data frame dfr.
# Timestamps are saved as strings in 2nd column
# and their format string and time zone are provided.
# Entities and tags are in columns 1, 6, 7.
# All saved series will have tag "os_type" with value "linux".
save_series(dfr, time_col = 2, time_format = "%Y/%m/%d %H:%M:%S",
            tz = "Australia/Darwin", metric_col = c(3, 4, 5),
            entity_col = 1, tags_col = c(6, 7), tags = "os_type = linux")
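The splitting by entity and tag values described above can be pictured with base R. The sketch below builds a small made-up data frame and groups its rows by an entity column and one tag column, then drops rows with NA metric values. It illustrates the described behavior only and is not the package's actual implementation; the column names and values are invented.

```r
# A made-up data frame: one entity column, string timestamps,
# one metric column, and one tag column.
dfr <- data.frame(
  Entity = c("server001", "server001", "server002", "server002"),
  Timestamp = c("2015/02/15 10:00:00", "2015/02/15 10:01:00",
                "2015/02/15 10:00:00", "2015/02/15 10:01:00"),
  cpu_busy = c(12.5, NA, 40.0, 42.5),
  disk_name = c("sda1", "sda1", "sda1", "sdb1"),
  stringsAsFactors = FALSE
)

# One group per unique (entity, tag value) combination that occurs in the data.
groups <- split(dfr, list(dfr$Entity, dfr$disk_name), drop = TRUE)

# Drop rows whose metric value is NA, mirroring "NA's ... are ignored".
groups <- lapply(groups, function(g) g[!is.na(g$cpu_busy), , drop = FALSE])

print(names(groups))

# With a reachable ATSD instance, the same data frame could be passed on,
# using the arguments documented above:
# save_series(dfr, time_col = "Timestamp", time_format = "%Y/%m/%d %H:%M:%S",
#             metric_col = "cpu_busy", entity_col = "Entity",
#             tags_col = "disk_name")
```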
In this section, we explain the syntax of the expression argument of the functions get_metrics() and get_entities().
The expression filters the result: only the metrics or entities for which the expression evaluates to TRUE are returned.
The variable name is used to select metrics/entities by names:
# get metric with name 'cpu_busy'
metrics <- get_metrics(expression = "name = 'cpu_busy'", verbose = FALSE)
Metrics and entities have user-defined tags.
Each of these tags is a pair ("tag_name" : "tag_value").
The variable tags.tag_name in an expression refers to the tag_value for the given metric/entity.
If a metric/entity does not have this tag, the tag_value will be an empty string.
# get metrics without 'source' tag, and include all tags of fetched metrics in output
get_metrics(expression = "tags.source = ''", tags = "*")
To get metrics with a user-defined tag 'table' equal to 'System':
# get metrics whose tag 'table' is equal to 'System'
metrics <- get_metrics(expression = "tags.table = 'System'", tags = "*")
To build more complex expressions, use brackets (, ), and the logical operators and, or, not, as well as &&, ||, !.
entities <- get_entities(expression = "tags.app != '' and (tags.os != '' or tags.ip != '')")
To test if a string is in a collection, use the in operator:
get_entities(expression = "name in ('derby-test', 'atom.axibase.com')")
Use the like operator to match values with expressions containing wildcards: expression = "name like 'disk*'".
The wildcard * means zero or more characters.
The wildcard . means any one character.
metrics <- get_metrics(expression = "name like '*cpu*' and tags.table = 'System'")
# get metrics with names consisting of 3 letters
metrics <- get_metrics(expression = "name like '...'")
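The wildcard rules can be tried out locally by translating a pattern into an R regular expression. The like() helper below is hypothetical and written for this illustration only; the real matching is done by ATSD. It assumes patterns contain only letters, digits, underscores, and the two wildcards, and the metric names are made up.

```r
# Hypothetical local helper mimicking the documented wildcard semantics:
# "*" matches zero or more characters, "." matches exactly one character.
like <- function(x, pattern) {
  regex <- paste0("^", gsub("*", ".*", pattern, fixed = TRUE), "$")
  grepl(regex, x)
}

metric_names <- c("cpu", "cpu_busy", "disk_used_percent", "nmon.cpu.idle")

disk_match  <- like(metric_names, "disk*")  # names starting with "disk"
cpu_match   <- like(metric_names, "*cpu*")  # names containing "cpu"
three_match <- like(metric_names, "...")    # names of exactly 3 characters

print(metric_names[disk_match])
print(metric_names[cpu_match])
print(metric_names[three_match])
```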
There are additional functions you can use in an expression:
list(string, delimiter)
Splits the string by delimiter. The default delimiter is a comma.
upper(string)
Converts the string argument to upper case.
lower(string)
Converts the string argument to lower case.
collection(name)
Refers to a named collection of strings created in ATSD.
likeAll(string, collection of patterns)
Returns true if every element in the collection of patterns matches the given string.
likeAny(string, collection of patterns)
Returns true if at least one element in the collection of patterns matches the given string.
get_metrics(expression = "likeAll(lower(name), list('cpu*,*use*'))")
get_metrics(expression = "likeAny(lower(name), list('cpu*,*use*'))")
get_metrics(expression = "name in collection('fs_ignore')")
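The difference between likeAll and likeAny in the first two calls can be checked locally. The helpers below (like, split_list, like_all, like_any) are hypothetical R stand-ins written for this sketch; the real functions are evaluated by ATSD. The wildcard translation assumes simple patterns without other regular-expression metacharacters, and the metric names are made up.

```r
# Hypothetical stand-in for the like operator ("*" = any run of characters).
like <- function(x, pattern) {
  grepl(paste0("^", gsub("*", ".*", pattern, fixed = TRUE), "$"), x)
}

# Stand-in for list(string, delimiter): split by a delimiter, comma by default.
split_list <- function(string, delimiter = ",") {
  strsplit(string, delimiter, fixed = TRUE)[[1]]
}

# TRUE if every pattern matches the string.
like_all <- function(x, patterns) {
  all(vapply(patterns, function(p) like(x, p), logical(1)))
}

# TRUE if at least one pattern matches the string.
like_any <- function(x, patterns) {
  any(vapply(patterns, function(p) like(x, p), logical(1)))
}

patterns <- split_list("cpu*,*use*")

print(like_all(tolower("CPU_Used"), patterns))  # matches both patterns
print(like_all("cpu_busy", patterns))           # matches only "cpu*"
print(like_any("cpu_busy", patterns))
print(like_any("memfree", patterns))            # matches neither pattern
```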
The atsd package uses connection parameters to connect with ATSD. These parameters are:
url - the url of ATSD including port number
user - the user name
password - the user's password
verify - should ATSD SSL certificate be validated
encryption - cryptographic protocol used by ATSD https server
The configuration parameters are loaded from the package configuration file when you load the atsd package into R. (See Section 2.)
The functions show_connection(), set_connection(), and save_connection() show the configuration parameters, change them, and store them in the configuration file.
Function name: show_connection()
Returns object: NULL
Description: The function prints current values of the connection parameters. (They may be different from the values in the configuration file.)
Arguments: none
Examples:
show_connection()
Function name: set_connection()
Returns object: NULL
Description: The function overrides the connection parameters for the duration of the current R session without changing the configuration file. If called without arguments, the function sets the connection parameters from the configuration file. If the file argument is provided, the function reads the parameters from that file instead. In both cases the current values of the parameters become the same as in the file. If the file argument is not provided but some of the other arguments are specified, only the specified parameters are changed.
Arguments:
url (optional, string)
The url of ATSD including port number.
user (optional, string)
The user name.
password (optional, string)
The user's password.
verify (optional, string)
String - "yes" or "no".
verify = "yes" ensures validation of the ATSD SSL certificate
and verify = "no" suppresses the validation
(applicable in the case of 'https' protocol).
encryption (optional, string)
Cryptographic protocol used by ATSD https server.
Possible values are: "default", "ssl2", "ssl3", and "tls1".
(In most cases, use "ssl3" or "tls1".)
file (optional, string)
The absolute path to the file from which
the connection parameters are read.
The file should be formatted like the package configuration file,
see Section 2.
Examples:
# Modify the user
set_connection(user = "user001")
# Modify the cryptographic protocol
set_connection(encryption = "tls1")
# Set the parameters of the https connection: url, user name, password,
# whether the certificate of the server should be verified,
# and which cryptographic protocol is used for communication
set_connection(url = "https://my.company.com:8443", user = "user001",
               password = "123456", verify = "no", encryption = "ssl3")
# Set up the connection parameters from the file:
set_connection(file = "/home/user001/atsd_https_connection.txt")
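The rule that only the specified parameters change, while the configuration file stays untouched, can be modeled with plain R lists. This is a conceptual sketch only: how the atsd package stores its session parameters internally is not documented here, so the list representation and the names from_file, current, and overrides are assumptions.

```r
# Parameters as they might have been read from the configuration file.
from_file <- list(
  url = "http://host_name:port_number",
  user = "atsd_user_name",
  password = "atsd_user_password",
  verify = "no",
  encryption = "ssl3"
)

# Session values start out equal to the file values.
current <- from_file

# Overriding two parameters leaves the remaining ones as they were.
overrides <- list(user = "user001", encryption = "tls1")
current <- utils::modifyList(current, overrides)

str(current)
```

The from_file list is unchanged after the update, which mirrors the statement that set_connection() does not modify the configuration file.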
Function name: save_connection()
Returns object: NULL
Description: The function writes the connection parameters into the configuration file. If called without arguments, the function uses the current values of the connection parameters (including NAs). Otherwise only the provided arguments are written to the configuration file. If the configuration file is absent, it is created in the atsd package folder.
Arguments:
url (optional, string)
The url of ATSD including port number.
user (optional, string)
The user name.
password (optional, string)
The user's password.
verify (optional, string)
String - "yes" or "no".
verify = "yes" ensures validation of the ATSD SSL certificate
and verify = "no" suppresses the validation
(applicable in the case of 'https' protocol).
encryption (optional, string)
Cryptographic protocol used by ATSD https server.
Possible values are: "default", "ssl2", "ssl3", and "tls1".
(In most cases, use "ssl3" or "tls1".)
Examples:
# Write the current values of the connection parameters to the configuration file.
save_connection()
# Write the user name and password in the configuration file.
save_connection(user = "user00", password = "123456")
# Write all parameters needed for the https connection to the configuration file.
save_connection(url = "https://my.company.com:8443", user = "user001",
                password = "123456", verify = "no", encryption = "ssl3")