tcplSubsetChid: Subset level 5 data to a single sample per chemical


tcplSubsetChid    R Documentation

Subset level 5 data to a single sample per chemical

Description

tcplSubsetChid subsets level 5 data to a single tested sample per chemical. In other words, if a chemical is tested more than once (a chid has more than one spid) for a given assay endpoint, the function applies a series of rules to select a single "representative" sample.

Usage

tcplSubsetChid(dat, flag = TRUE, type = "mc", export_ready = FALSE)

Arguments

dat

data.table, a data.table with level 5 data

flag

Integer or logical, the mc6_mthd_id values to include in the flag count, or TRUE/FALSE to use all or none of the flags; see Details for more information

type

Character of length 1, the data type, "sc" or "mc"

export_ready

Logical, default FALSE; should only records with an export ready value of 1 be included in the calculation?

Details

tcplSubsetChid is intended to work with level 5 data that has chemical and assay information mapped with tcplPrepOtpt.

To select a single sample, first a "consensus hit-call" is made by majority rule, with ties defaulting to active. After the chemical-wise hit call is made, the samples corresponding to the chemical-wise hit call are logically ordered using the fit category, the number of flags, and AC50 (or modl_ga), then the first sample for every chemical is selected.
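
The selection logic described above can be sketched in base R on toy data. This is an illustration only, not the package's implementation: the helper pick_one and the columns fitc_rank (a stand-in for however fit categories are ranked) and nflg (a flag count) are hypothetical, and the sketch assumes lower values sort first for all three ordering keys.

```r
# Toy data: one chemical (chid) tested as three samples (spid) on one endpoint.
# fitc_rank and nflg are hypothetical stand-ins, not tcpl column names.
toy <- data.frame(
  aeid      = c(1, 1, 1),
  chid      = c(10, 10, 10),
  spid      = c("A", "B", "C"),
  hitc      = c(1, 0, 1),
  fitc_rank = c(1, 1, 1),
  nflg      = c(2, 0, 0),
  modl_ga   = c(0.5, 1.2, 0.9),
  stringsAsFactors = FALSE
)

# Hypothetical helper: reduce one chemical-endpoint group to a single row.
pick_one <- function(d) {
  # Majority rule on hit calls; a tie (mean of exactly 0.5) defaults to active.
  consensus <- as.integer(mean(d$hitc) >= 0.5)
  # Keep only the samples that agree with the consensus hit call.
  d <- d[d$hitc == consensus, , drop = FALSE]
  # Order by fit-category rank, then flag count, then modl_ga; keep the first.
  d <- d[order(d$fitc_rank, d$nflg, d$modl_ga), , drop = FALSE]
  d[1, , drop = FALSE]
}

# Apply the helper to every endpoint-chemical group and recombine.
groups   <- split(toy, list(toy$aeid, toy$chid), drop = TRUE)
selected <- do.call(rbind, lapply(groups, pick_one))
```

In this toy case two of the three samples are active, so the consensus is active; samples A and C remain, their fit-category ranks tie, and C is chosen because it has fewer flags, even though A has the lower modl_ga.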

The flag parameter can be used to specify a subset of flags to be used in the flag count. Leaving flag as TRUE uses all the available flags. Setting flag to FALSE does the subsetting without considering any flags.
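
One way to read the three modes of flag is as a filter applied before the flags are counted per sample. The base R sketch below uses hypothetical flag records and a hypothetical helper count_flags; the mc6_mthd_id values are made up for illustration and the code does not reflect how tcpl stores or counts flags internally.

```r
# Hypothetical flag records: one row per (sample fit, flag method) pair.
flags <- data.frame(
  m4id        = c(101, 101, 102, 103, 103, 103),
  mc6_mthd_id = c(6, 11, 6, 7, 11, 17)
)
ids <- c(101, 102, 103)

# Hypothetical helper: count flags per m4id under the three modes of `flag`.
count_flags <- function(flags, m4ids, flag = TRUE) {
  if (is.logical(flag)) {
    # flag = FALSE: ignore flags entirely, so every sample counts zero.
    if (!flag) return(setNames(integer(length(m4ids)), m4ids))
    # flag = TRUE: every available flag contributes to the count.
    keep <- flags
  } else {
    # Integer vector: only the listed mc6_mthd_id values contribute.
    keep <- flags[flags$mc6_mthd_id %in% flag, , drop = FALSE]
  }
  # Tabulate against all requested ids so samples without flags get zero.
  counts <- table(factor(keep$m4id, levels = m4ids))
  setNames(as.integer(counts), m4ids)
}

all_flags  <- count_flags(flags, ids, flag = TRUE)
no_flags   <- count_flags(flags, ids, flag = FALSE)
some_flags <- count_flags(flags, ids, flag = c(6, 11))
```

With these toy records, restricting the count to methods 6 and 11 lowers the count for sample 103 from three to one, which could change its position in the ordering relative to the other samples.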

Value

A data.table with a single sample for every given chemical-assay pair.

See Also

tcplPrepOtpt

Examples

## Not run: 
## Load the example level 5 data
d1 <- tcplLoadData(lvl = 5, fld = "aeid", val = 797)
d1 <- tcplPrepOtpt(d1)

## Subset to an example of a duplicated chid
d2 <- d1[chid == 20182]
d2[, list(m4id, hitc, fitc, modl_ga)]

## Here the consensus hit-call is 1 (active), and the fit categories are
## all equal. Therefore, if the flags are ignored, the selected sample will
## be the sample with the lowest modl_ga.
tcplSubsetChid(dat = d2, flag = FALSE)[, list(m4id, modl_ga)]

## End(Not run)


tcpl documentation built on Oct. 10, 2024, 1:07 a.m.