OpenStatsList: Method "OpenStatsList"

View source: R/OpenStatsList.R

Description

The driver function that creates an 'OpenStatsList' object from a data frame.

- The mandatory variable for creating a 'standard' OpenStatsList object is 'Genotype', which must have exactly two levels. The function further checks for the optional variables 'Sex' with two levels (Male/Female), 'LifeStage' with two levels (Early/Late), 'Batch' (defined as date_of_experiment in the IMPC) and 'Weight' (defined as animal body weight in the IMPC), and reports any abnormality in the data.

- For advanced applications, the function can create an 'OpenStatsList' object without performing any checks. To do this, set clean.dataset to FALSE.
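
The two-level requirement described above can be checked on a data frame before it is passed to the function. The following is a minimal base R sketch, not the package's internal check: the toy data and the helper `has_n_levels` are hypothetical, and the column names simply follow the defaults listed under Usage.

```r
# Toy data; column names follow the defaults of OpenStatsList()
df <- data.frame(
  biological_sample_group = c("control", "control", "experimental", "experimental"),
  sex = c("male", "female", "male", "female"),
  data_point = c(10.1, 9.8, 12.3, 11.7),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of OpenStats):
# TRUE when x has exactly n distinct non-missing values
has_n_levels <- function(x, n = 2) {
  length(unique(x[!is.na(x)])) == n
}

genotype_ok <- has_n_levels(df$biological_sample_group)
sex_ok <- has_n_levels(df$sex)

if (!genotype_ok) {
  warning("The genotype column does not have exactly two levels")
}
```

A check of this kind is most useful when clean.dataset is set to FALSE, because in that case the function performs no checks itself.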

Usage

OpenStatsList(
	dataset                                                 ,
	testGenotype                = 'experimental'            ,
	refGenotype                 = 'control'                 ,
	hemiGenotype                = NULL                      ,
	clean.dataset               = TRUE                      ,
	dataset.colname.genotype    = 'biological_sample_group' ,
	dataset.colname.sex         = 'sex'                     ,
	dataset.colname.batch       = 'date_of_experiment'      ,
	dataset.colname.lifestage   = 'LifeStage'               ,
	dataset.colname.weight      = 'weight'                  ,
	dataset.values.missingValue = c(' ','')                 ,
	dataset.values.male         = NULL                      ,
	dataset.values.female       = NULL                      ,
	dataset.values.early        = NULL                      ,
	dataset.values.late         = NULL                      ,
	debug                       = TRUE
)

Arguments

dataset

Mandatory argument. A data frame created from a file or from another source. See the Note section for more details

testGenotype

mandatory argument. Defines the test genotype to be compared to the reference genotype. Default 'experimental'

refGenotype

Defines the reference genotype. Default 'control'

hemiGenotype

Optional argument. Defines the genotype value used for hemizygous samples; this value will be changed to the test genotype value

clean.dataset

Logical flag. 'TRUE' activates all checks and modifications on the input data. The checks cover the existence of the variables, their levels, missing values and relabelling

dataset.colname.genotype

mandatory argument. Column name within dataset for the genotype. Default 'biological_sample_group'

dataset.colname.sex

optional argument. column name within dataset for the sex. Default 'sex'

dataset.colname.batch

optional argument. column name within dataset for the batch effect. Default 'date_of_experiment'

dataset.colname.lifestage

optional argument. column name within dataset for the life stage. Default 'LifeStage'

dataset.colname.weight

optional argument. column name within dataset for the body weight. Default 'weight'

dataset.values.missingValue

Value(s) treated as missing in the dataset. Default c(' ', ''), that is, a single space and the empty string.

dataset.values.male

value used to label "males" in the dataset

dataset.values.female

value used to label "females" in the dataset

dataset.values.early

value used to label "early life stage" in the dataset

dataset.values.late

value used to label "late life stage" in the dataset

debug

A logical flag. Set to TRUE to see more details about the progress of the function. Default TRUE
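
The effect of dataset.values.missingValue and hemiGenotype, as described above, can be pictured with plain base R. This is only an illustration of the described behaviour under assumed toy data, not the code OpenStats runs internally; the label "hemizygous" and the variable names are hypothetical.

```r
# Toy data with two kinds of missing markers and a hemizygous label
df <- data.frame(
  biological_sample_group = c("control", "experimental", "hemizygous", "control"),
  sex = c("male", " ", "female", ""),
  stringsAsFactors = FALSE
)

missing_values <- c(" ", "")      # same values as the documented default
test_genotype <- "experimental"
hemi_genotype <- "hemizygous"     # hypothetical label, for illustration only

# Recode every missing-value marker to NA
df$sex[df$sex %in% missing_values] <- NA

# Relabel hemizygous rows with the test genotype value
is_hemi <- df$biological_sample_group == hemi_genotype
df$biological_sample_group[is_hemi] <- test_genotype

print(df)
```

After these two steps the genotype column holds only the reference and test values, and the missing markers in the sex column appear as NA.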

Value

an instance of the OpenStatsList class. The S4 object contains:

1. raw data: 'OpenStatsListObject@datasetUNF'
2. polished data: 'OpenStatsListObject@datasetPL'
3. the input arguments to the 'OpenStatsList' function
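
The two data slots listed above can be read with the S4 slot operator `@`. The sketch below assumes the OpenStats package is installed and reuses the example file shipped with it; the object names `osl`, `raw_data` and `polished_data` are illustrative only. The code is skipped when the package is not available.

```r
# Run the example only when OpenStats is installed
has_openstats <- requireNamespace("OpenStats", quietly = TRUE)

if (has_openstats) {
  df <- read.csv(
    system.file("extdata", "test_continuous.csv", package = "OpenStats")
  )
  osl <- OpenStats::OpenStatsList(
    dataset = df,
    testGenotype = "experimental",
    refGenotype = "control",
    debug = FALSE
  )

  raw_data <- osl@datasetUNF      # raw data, as supplied
  polished_data <- osl@datasetPL  # polished data, after checks and relabelling

  print(dim(raw_data))
  print(dim(polished_data))
}
```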

Note

OpenStats accepts a 'data.frame' as input data. This data.frame can be built from csv, tsv, txt and similar files and is organised with samples in rows and features in columns. This allows integration with a wide range of other Bioconductor/CRAN packages. For instance, the output of the Bioconductor 'SummarizedExperiment' package can be transformed and fed into OpenStats (note that SummarizedExperiment stores samples in columns and features in rows, so at least a transpose operation is required). Additionally, the 'PhenList' function of the Bioconductor 'PhenStat' package produces output very similar to that of 'OpenStatsList', which allows the 'PhenList' object to be processed directly by the downstream OpenStats functions.
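
The transpose step mentioned above can be sketched in base R. The example below does not use SummarizedExperiment itself; it assumes a hypothetical features-by-samples matrix (`assay_mat`) and a hypothetical sample annotation table (`sample_info`) and combines them into a samples-by-features data.frame of the kind OpenStats expects.

```r
# Hypothetical assay matrix: features in rows, samples in columns
assay_mat <- matrix(
  c(1.2, 3.4, 2.2, 4.1, 1.9, 3.8),
  nrow = 2,
  dimnames = list(c("feature_a", "feature_b"), c("s1", "s2", "s3"))
)

# Hypothetical sample annotation, one row per sample
sample_info <- data.frame(
  sample_id = c("s1", "s2", "s3"),
  biological_sample_group = c("control", "experimental", "control"),
  sex = c("male", "female", "male"),
  stringsAsFactors = FALSE
)

# Transpose so that samples are in rows and features in columns
features_by_sample <- as.data.frame(t(assay_mat))
features_by_sample$sample_id <- rownames(features_by_sample)

# Attach the annotation columns to the measurements
dataset <- merge(sample_info, features_by_sample, by = "sample_id")
print(dataset)
```

The resulting `dataset` has one row per sample, with the genotype and sex columns named after the function's defaults, so it could be passed as the dataset argument.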

Author(s)

Hamed Haseli Mashhadi <hamedhm@ebi.ac.uk>

See Also

OpenStatsAnalysis, plot.OpenStatsList, summary.OpenStatsList

Examples

####################################################################
df <- read.csv(system.file("extdata", "test_continuous.csv", package = "OpenStats"))
####################################################################
# OpenStatsList object
####################################################################
OpenStatsList <- OpenStatsList(
  dataset = df,
  testGenotype = "experimental",
  refGenotype = "control",
  dataset.colname.batch = "date_of_experiment",
  dataset.colname.genotype = "biological_sample_group",
  dataset.colname.sex = "sex",
  dataset.colname.weight = "weight"
)
p <- plot(OpenStatsList,
  vars = c(
    "Genotype",
    "Sex",
    "data_point",
    "age_in_days"
  )
)
p$Continuous
p$Categorical
summary(OpenStatsList, style = "grid")
class(OpenStatsList)
rm(OpenStatsList)

OpenStats documentation built on Nov. 8, 2020, 5:20 p.m.