check_data | R Documentation |
Run the data check pipeline to look for potential problems with the data.
check_data(
data,
y = NULL,
time = NULL,
status = NULL,
type = "auto",
verbose = TRUE,
check_correlation = TRUE
)
data |
A data source in one of the major R formats, such as data.table, data.frame, or matrix. |
y |
A string giving the name of the target column for regression or classification. Use either y or the pair time and status. Default is NULL. |
time |
A string giving the name of the time column for a survival analysis task. Use either y or the pair time and status. Default is NULL. |
status |
A string giving the name of the status column for a survival analysis task. Use either y or the pair time and status. Default is NULL. |
type |
A character string, one of 'binary_clf', 'regression', 'survival', 'auto', or 'multiclass', that sets the type of the task. If 'auto' (the default), the function infers 'type' from the number of unique values in the 'y' variable or from the presence of the time and status columns. |
verbose |
A logical value. If TRUE, all information about the process is printed; if FALSE, none is. Default is TRUE. |
check_correlation |
A logical value. If TRUE, the report includes information about the correlations between pairs of numeric variables and between pairs of categorical variables. Available only when verbose is TRUE. Default is TRUE. |
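The 'auto' behaviour of type can be pictured with a small base R sketch. The helper guess_task_type() and the toy data frame are hypothetical and written only for illustration; the exact rules and thresholds the function applies are not stated in this page, so the ones below (two unique values means binary classification, a numeric target with more values means regression, anything else means multiclass) are assumptions.

```r
# Hypothetical helper for illustration only; it is NOT the package's code.
guess_task_type <- function(data, y = NULL, time = NULL, status = NULL) {
  data <- as.data.frame(data)

  # A time/status pair signals a survival task.
  if (!is.null(time) && !is.null(status)) {
    stopifnot(time %in% names(data), status %in% names(data))
    return("survival")
  }

  if (is.null(y)) {
    stop("Provide either 'y' or both 'time' and 'status'.")
  }
  stopifnot(y %in% names(data))

  target   <- data[[y]]
  n_unique <- length(unique(target))

  # Assumed rules: exactly two unique values -> binary classification,
  # numeric target otherwise -> regression, anything else -> multiclass.
  if (n_unique == 2) {
    "binary_clf"
  } else if (is.numeric(target)) {
    "regression"
  } else {
    "multiclass"
  }
}

# Toy data invented for this example.
toy <- data.frame(
  price = c(10.5, 12.0, 9.8, 14.2),
  sold  = c("yes", "no", "yes", "no"),
  area  = c("north", "south", "east", "north"),
  days  = c(5, 8, 3, 12),
  event = c(1, 0, 1, 1)
)

guess_task_type(toy, y = "price")
guess_task_type(toy, y = "sold")
guess_task_type(toy, y = "area")
guess_task_type(toy, time = "days", status = "event")
```

Passing type explicitly avoids relying on any such inference when the target is ambiguous, for example an integer column with only a few distinct values.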
A list with two vectors: lines of the report (str) and the outliers (outliers).
check_data(lisbon, 'Price')
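A slightly longer sketch of the same call that keeps the returned list and inspects its two elements, str and outliers. It assumes that check_data() and the lisbon dataset are provided by the forester package, which this page does not state; the guard lets the snippet run even when that package is not installed.

```r
# Assumption: check_data() and the lisbon dataset come from the forester package.
if (requireNamespace("forester", quietly = TRUE)) {
  data("lisbon", package = "forester")

  # Same call as the example above, with the result stored.
  report <- forester::check_data(lisbon, y = "Price")

  # Top-level structure of the result: the report lines and the outliers.
  str(report, max.level = 1)
  head(report$str)
  report$outliers
} else {
  report <- NULL
  message("Package 'forester' is not installed; skipping the example.")
}
```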