PrepareData: PrepareData

View source: R/preparedata.R

PrepareData    R Documentation

PrepareData

Description

Prepares input data for charting.

Usage

PrepareData(
  chart.type,
  subset = TRUE,
  weights = NULL,
  input.data.table = NULL,
  input.data.tables = NULL,
  input.data.raw = NULL,
  input.data.pasted = NULL,
  input.data.other = NULL,
  data.source = NULL,
  signif.append = FALSE,
  signif.symbol = "Arrow",
  signif.symbol.size = 12,
  signif.p.cutoffs = c(0.5, 0.2, 0.1, 0.05, 0.01, 0.005, 0.001, 1e-04, 1e-05, 1e-06),
  signif.colors.pos = rep("#0000FF", 10),
  signif.colors.neg = rep("#FF0000", 10),
  signif.colors.on.font = FALSE,
  first.aggregate = NULL,
  scatter.input.columns.order = NULL,
  scatter.mult.yvals = FALSE,
  group.by.last = FALSE,
  tidy = TRUE,
  tidy.labels = FALSE,
  transpose = FALSE,
  select.rows = NULL,
  first.k.rows = NA,
  last.k.rows = NA,
  select.columns = NULL,
  first.k.columns = NA,
  last.k.columns = NA,
  auto.order.rows = FALSE,
  sort.rows = FALSE,
  sort.rows.exclude = c("NET", "SUM", "Total"),
  sort.rows.column = NULL,
  sort.rows.decreasing = FALSE,
  auto.order.columns = FALSE,
  sort.columns = FALSE,
  sort.columns.exclude = c("NET", "SUM", "Total"),
  sort.columns.row = NULL,
  sort.columns.decreasing = FALSE,
  hide.output.threshold = 0,
  hide.values.threshold = 0,
  hide.rows.threshold = 0,
  hide.columns.threshold = 0,
  reverse.rows = FALSE,
  reverse.columns = FALSE,
  row.names.to.remove = c("NET", "SUM", "Total"),
  column.names.to.remove = c("NET", "SUM", "Total"),
  split = "[;,]",
  hide.empty.rows.and.columns = TRUE,
  hide.empty.rows = hide.empty.rows.and.columns,
  hide.empty.columns = hide.empty.rows.and.columns,
  hide.percent.symbol = FALSE,
  as.percentages = FALSE,
  categorical.as.binary = NULL,
  date.format = "Automatic",
  show.labels = TRUE,
  column.labels = "",
  row.labels = "",
  values.title = ""
)

Arguments

chart.type

Character; chart type to be plotted.

subset

An optional vector specifying a subset of observations to be used, or the name of a variable in data. It may not be an expression.

weights

An optional vector of sampling weights, or the name of a variable in data. It may not be an expression.

input.data.table

Array; typically a table of some kind, which is then processed using AsTidyTabularData.

input.data.tables

List of arrays; each component is assumed to be a Qtable and will be processed using AsTidyTabularData.

input.data.raw

List, containing variables or data.frames or Regression outputs from flipRegression. In the case of multiple Regression outputs, the labels default to the R name of the Regression output.

input.data.pasted

List of length six; the first component of which is assumed to be from a user-entered/pasted table; will be processed by ParseUserEnteredTable.

input.data.other

A PickAny Multi Q variable.

data.source

Where multiple data inputs are provided, a text string can be provided to disambiguate. Refer to the source code for a precise understanding of how this works (it is not obvious and is not likely to be of any use for most cases, so should usually be left as a NULL).

signif.append

Append attributes used to show statistical test for significance.

signif.symbol

Character; symbol used on the chart to indicate significance. This can be "Arrow" or "Caret".

signif.symbol.size

Numeric; size of symbol in pixels.

signif.p.cutoffs

Numeric; vector of p-values used to determine color of symbols. These values should be supplied in decreasing order. The colors used will correspond to the smallest cutoff larger than the p-value of that cell.
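
The cutoff rule above can be sketched in base R. This is only an illustration of "the smallest cutoff larger than the p-value"; pick_signif_color is a hypothetical helper, not a flipChart function, and the three cutoffs and colors are made-up values.

```r
# Hypothetical helper (not part of flipChart): choose the color for one p-value.
pick_signif_color <- function(p, cutoffs, colors) {
    stopifnot(length(cutoffs) == length(colors))
    # Cutoffs are supplied in decreasing order, so the last cutoff that
    # exceeds p is the smallest cutoff larger than p.
    idx <- which(cutoffs > p)
    if (length(idx) == 0) return(NA_character_)  # p is not below any cutoff
    colors[max(idx)]
}

cutoffs <- c(0.1, 0.05, 0.01)                   # illustrative, decreasing order
colors  <- c("#9999FF", "#4444FF", "#0000FF")   # one color per cutoff
col.mid  <- pick_signif_color(0.03, cutoffs, colors)
col.none <- pick_signif_color(0.50, cutoffs, colors)
```

Here a p-value of 0.03 falls under the 0.05 cutoff, so the second color is chosen; a p-value of 0.5 is above every cutoff and gets no color in this sketch.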

signif.colors.pos

Character; vector of colors, of the same length as signif.p.cutoffs.

signif.colors.neg

Character; vector of colors, of the same length as signif.p.cutoffs.

signif.colors.on.font

Boolean; whether signif colors should also affect data label font colors.

first.aggregate

Logical; whether or not the input data needs to be aggregated in this function. A single variable is tabulated, 2 variables are crosstabbed if group.by.last is selected, and otherwise the mean is computed. If input.data.raw contains an 'X' variable and a 'Y' variable in the first two elements of the list, the data is automatically aggregated and crosstabbed.
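
As a rough base R analogue of the three kinds of aggregation described above (tabulation, crosstab, mean), and not the code PrepareData itself runs, consider the following sketch. All variable names and values are invented for illustration.

```r
# Illustrative raw data (assumed, not from the package).
brand  <- factor(c("Coke", "Pepsi", "Coke", "Coke"))
gender <- factor(c("M", "F", "F", "M"))
age    <- c(20, 30, 40, 50)
income <- c(1, 2, 3, 4)

one.var   <- table(brand)                        # single variable: tabulated
cross.tab <- table(brand, gender)                # two variables: crosstabbed
means     <- colMeans(data.frame(age, income))   # otherwise: mean of each variable
```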

scatter.input.columns.order

(deprecated) Use scatter.mult.yvals instead.

scatter.mult.yvals

Logical; when chart.type is "Scatter", a TRUE value indicates that columns of input.data.table or input.data.pasted should be considered multiple series instead of different attributes (default).

group.by.last

Logical; if TRUE and first.aggregate is TRUE and there is data in either of input.data.table or input.data.pasted, the data is aggregated using the last variable.

tidy

Logical; whether or not the input data needs to be aggregated in this function (e.g., if an x and y variable have been provided, a contingency table is used to aggregate). This defaults to TRUE. It aggressively seeks to turn the data into a named vector or a matrix using TidyTabularData. This is not applied when input.data.tables are provided, or when the chart type is any of "Scatter", "Bean", "Histogram", "Density", "Box", or "Violin".

tidy.labels

Logical; whether to remove common prefixes from the labels of the input data.

transpose

Logical; should the resulting matrix (if created) be transposed?

select.rows

String; comma-separated list of rows, by name or index, to select from the input table. If blank (default), then all rows are selected.

first.k.rows

Integer; Number of rows to select from the top of the input table. This occurs after select and sort.

last.k.rows

Integer; Number of rows to select from the bottom of the input table. This occurs after select and sort.

select.columns

String; comma-separated list of columns, by name or index, to select from the input table. If blank (default), then all columns are selected.

first.k.columns

Integer; Number of columns to select from the left of the input table. This occurs after select and sort.

last.k.columns

Integer; Number of columns to select from the right of the input table. This occurs after select and sort.
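
The selection arguments above can be pictured with a small base R sketch for rows; columns would work the same way. select_rows is a hypothetical helper, and trying each entry as a name before trying it as an index is an assumption of this sketch, not documented behaviour.

```r
# Hypothetical helper (not part of flipChart): select rows by name or index,
# then keep the first k rows of what remains.
select_rows <- function(x, select = NULL, first.k = NA) {
    if (!is.null(select) && nzchar(select)) {
        items <- trimws(strsplit(select, ",")[[1]])
        idx <- vapply(items, function(item) {
            pos <- match(item, rownames(x))   # assumption: try as a row name first
            if (is.na(pos))                   # otherwise treat it as a row index
                pos <- suppressWarnings(as.integer(item))
            pos
        }, integer(1))
        x <- x[idx[!is.na(idx)], , drop = FALSE]
    }
    if (!is.na(first.k)) x <- head(x, first.k)  # applied after selection
    x
}

m <- matrix(1:8, nrow = 4,
            dimnames = list(c("A", "B", "C", "D"), c("x", "y")))
res <- select_rows(m, "D, 1, B", first.k = 2)
```

With this input the selection yields rows D, A, B in that order, and first.k = 2 then keeps D and A.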

auto.order.rows

Logical; Automatically order rows by correspondence analysis.

sort.rows

Logical; whether to sort the rows of the table. This operation is performed after row selection. (Ignored if auto.order.rows is true).

sort.rows.exclude

String; If sort.rows is TRUE, then rows in sort.rows.exclude will be excluded from sorting and appended at the bottom of the table.

sort.rows.column

String; If sort.rows is true, this column (specified by name or index) is used for sorting the rows. If not specified, the column with the largest Column n or the right-most column will be used for sorting.

sort.rows.decreasing

Logical; Whether rows should be sorted in decreasing order.
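
A minimal base R sketch of the row-sorting behaviour described above: rows named in the exclusion list are left out of the sort and appended at the bottom. sort_rows is a hypothetical helper; it simply defaults to the right-most column and does not attempt the 'Column n' lookup mentioned for sort.rows.column.

```r
# Hypothetical helper (not part of flipChart).
sort_rows <- function(x, column = ncol(x),
                      exclude = c("NET", "SUM", "Total"),
                      decreasing = FALSE) {
    is.excluded <- rownames(x) %in% exclude
    to.sort <- which(!is.excluded)
    # Order only the non-excluded rows by the chosen column.
    ord <- to.sort[order(x[to.sort, column], decreasing = decreasing)]
    # Excluded rows keep their relative order and go to the bottom.
    x[c(ord, which(is.excluded)), , drop = FALSE]
}

m <- matrix(c(30, 10, 60, 20, 3, 1, 6, 2), nrow = 4,
            dimnames = list(c("Coke", "Pepsi", "NET", "Fanta"), c("a", "b")))
sorted <- sort_rows(m)
```

Sorting on column "b" in increasing order gives Pepsi, Fanta, Coke, with NET appended last.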

auto.order.columns

Logical; Automatically order columns by correspondence analysis.

sort.columns

Logical; whether to sort the columns of the table. This operation is performed after column selection. (Ignored if auto.order.columns is true).

sort.columns.exclude

String; If sort.columns is TRUE, then columns in sort.columns.exclude will be excluded from sorting and appended at the right of the table.

sort.columns.row

String; If sort.columns is true, this row (specified by name or index) is used for sorting the columns. If not specified, the row with the largest n or the bottom row will be used for sorting.

sort.columns.decreasing

Logical; Whether columns should be sorted in decreasing order.

hide.output.threshold

Integer; If sample size ('Column n' or 'n') is provided then each cell in the input table will be checked to ensure 'n' or 'Column n' is larger than specified threshold, otherwise an error message is given.

hide.values.threshold

Integer; If sample size ('Column n' or 'n') is provided then each cell in the input table will be checked to ensure 'n' or 'Column n' is larger than specified threshold, otherwise the cell will be set to NA.

hide.rows.threshold

Integer; If sample size ('Column n' or 'n') is provided, then rows with sample sizes smaller than threshold will be removed from table. Vectors will be treated as 1-d matrices.

hide.columns.threshold

Integer; If sample size ('Column n' or 'n') is provided, then columns with sample sizes smaller than threshold will be removed from table. Vectors will not be affected.
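
To make the cell-level threshold concrete, here is a small base R sketch of the hide.values.threshold idea. It assumes the sample sizes are available as a matrix n with the same shape as the values, which is a simplification of how 'n' or 'Column n' is attached to real tables. Following the wording above, a cell is kept only when its sample size is larger than the threshold.

```r
# Illustrative values and matching sample sizes (assumed data).
values <- matrix(c(0.5, 0.2, 0.3, 0.4), nrow = 2,
                 dimnames = list(c("r1", "r2"), c("c1", "c2")))
n <- matrix(c(120, 8, 95, 40), nrow = 2)

threshold <- 30
hidden <- values
hidden[n <= threshold] <- NA   # cells whose n is not larger than the threshold
```

Only the cell with n = 8 is blanked out; the remaining cells are unchanged.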

reverse.rows

Logical; Whether to reverse order of rows. This operation is performed after row selection and sorting.

reverse.columns

Logical; Whether to reverse order of columns. This operation is performed after column selection and sorting.

row.names.to.remove

Character vector or delimited string of row labels specifying rows to remove from the returned table; default is c("NET", "SUM", "Total").

column.names.to.remove

Character vector or delimited string of column labels specifying columns to remove from the returned table; default is c("NET", "SUM", "Total").

split

Character delimiter to split row.names.to.remove and column.names.to.remove on. Default is to split on either of "," or ";". Assumed to be a regular expression; see strsplit.
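
The default pattern can be tried directly with strsplit. The sketch below splits a delimited string of labels and drops the matching names from a small named vector; trimming whitespace around each label is an assumption of this example rather than documented behaviour.

```r
to.remove <- "NET; SUM,Total"
# "[;,]" is a regular expression matching either a semicolon or a comma.
labels <- trimws(strsplit(to.remove, split = "[;,]")[[1]])

tab  <- c(Coke = 30, Pepsi = 10, NET = 40)   # illustrative data
kept <- tab[!names(tab) %in% labels]
```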

hide.empty.rows.and.columns

Logical; if TRUE empty rows and columns will be removed from the data. Empty here meaning that a row or column contains all NA values, or in the case of percentages, that a row or column contains only 0's. Retained for backwards-compatibility but is superseded by hide.empty.rows and hide.empty.columns.

hide.empty.rows

Logical; hide rows with only NAs or 0's (percentages).

hide.empty.columns

Logical; hide columns with only NAs or 0's (percentages).
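
One possible reading of "empty" in the three arguments above, sketched in base R. drop_empty is a hypothetical helper, and its percentages flag stands in for however the function detects percentage data, which this page does not spell out.

```r
# Hypothetical helper (not part of flipChart).
drop_empty <- function(x, percentages = FALSE) {
    # A row or column is empty if it is all NA, or (for percentages) all zero.
    is.empty <- function(v)
        all(is.na(v)) || (percentages && all(!is.na(v) & v == 0))
    keep.rows <- !apply(x, 1, is.empty)
    keep.cols <- !apply(x, 2, is.empty)
    x[keep.rows, keep.cols, drop = FALSE]
}

m <- matrix(c(1, NA, 3, NA, NA, NA), nrow = 3,
            dimnames = list(c("a", "b", "c"), c("x", "y")))
trimmed <- drop_empty(m)
```

Row "b" and column "y" contain only NA, so the result keeps rows "a" and "c" of column "x".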

hide.percent.symbol

Percentage data is shown without percentage symbols and the symbol is also removed from the statistic attribute.

as.percentages

Logical; If TRUE, aggregate values in the output table are given as percentages summing to 100. If FALSE, column sums are given.

categorical.as.binary

If data is aggregated and this is true, then categorical variables will be converted into indicator variables for each level in the factor.

date.format

One of "Automatic", "US", "International" or "No date formatting". This is used to determine whether strings which are interpreted as dates in the (row)names will be read in the US (month-day-year) or the International (day-month-year) format. By default US format is used if it cannot be deduced from the input data.
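
The difference between the two conventions is easy to see with base R's as.Date. This only illustrates why the format matters and how it can sometimes be deduced; it is not the parsing code the package uses.

```r
label <- "03/04/2024"
us   <- as.Date(label, format = "%m/%d/%Y")   # US reading: 4 March 2024
intl <- as.Date(label, format = "%d/%m/%Y")   # International reading: 3 April 2024

# Some labels are only valid under one convention, which is how a format
# can be deduced from the data: there is no month 25.
bad.us <- as.Date("25/12/2024", format = "%m/%d/%Y")   # NA
ok.int <- as.Date("25/12/2024", format = "%d/%m/%Y")
```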

show.labels

Logical; If TRUE, labels are used for names in the data output if raw data is supplied.

column.labels

A comma-separated list of names to replace the default column names of pd$data. This is applied after all other data manipulations.

row.labels

A comma-separated list of names to replace the default row names of pd$data. This is applied after all other data manipulations.

values.title

The title for the values axis of a chart (e.g., the y-axis of a column chart or the x-axis of a bar chart).

Details

It is assumed that only one of input.data.pasted, input.data.table, input.data.tables, input.data.other, input.data.raw is non-NULL. They are checked for nullity in that order.
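
The checking order stated above can be sketched as a first-non-NULL lookup. first_non_null is a hypothetical helper used only to illustrate the order; the actual function may resolve its inputs differently, particularly when data.source is supplied.

```r
# Hypothetical helper (not part of flipChart): return the name of the
# first non-NULL input, checking in the documented order.
first_non_null <- function(inputs) {
    for (nm in names(inputs))
        if (!is.null(inputs[[nm]])) return(nm)
    NULL
}

inputs <- list(input.data.pasted = NULL,
               input.data.table  = matrix(1:4, nrow = 2),
               input.data.tables = NULL,
               input.data.other  = NULL,
               input.data.raw    = list(x = 1:3))
chosen <- first_non_null(inputs)
```

Although input.data.raw is also supplied here, input.data.table comes earlier in the order and is the one picked.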

Value

A list with components

  • data - If possible, a named vector or matrix, or if that is not possible or a data.frame is requested, a data.frame.

  • weights - Numeric vector of user-supplied weights.

  • values.title - Character string to be used for the y-axis title; will only be a non-empty string if some aggregation has been performed on data.

  • scatter.variable.indices - A named vector indicating which columns in data should be plotted in a scatterplot as x, y, sizes, and colors. Is NULL if chart.type does not contain "Scatter" or "Bubble". NA is used when the data does not exist.
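
A sketch of how a caller might consume scatter.variable.indices, using a hand-built stand-in for the returned list. The data frame and the index values are invented for illustration and are not the output of a real PrepareData call.

```r
# Stand-in for a returned list (assumed shape, not real output).
pd <- list(
    data = data.frame(height = c(1.6, 1.8), weight = c(60, 80), age = c(30, 40)),
    scatter.variable.indices = c(x = 1, y = 2, sizes = 3, colors = NA)
)

ind <- pd$scatter.variable.indices
x.vals <- pd$data[[ind[["x"]]]]          # column plotted on the x-axis
y.vals <- pd$data[[ind[["y"]]]]          # column plotted on the y-axis
has.colors <- !is.na(ind[["colors"]])    # NA means no such column exists
```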

See Also

AsTidyTabularData, TidyRawData, ParseUserEnteredTable


Displayr/flipChart documentation built on Sept. 20, 2024, 10:56 a.m.