Parses a PubChem Bioassay experimental result from two required files (a CSV file and an XML description) into a bioassay object.
parsePubChemBioassay(aid, csvFile, xmlFile, duplicates = "drop",
    missingCid = "drop", scoreRegex = "inhibition|ic50|ki|gi50|ec50|ed50|lc50")
aid |
The assay identifier (aid) for the assay to be parsed. |
csvFile |
A CSV file for a given assay, as downloaded from PubChem Bioassay. |
xmlFile |
An XML description file for a given assay, as downloaded from PubChem Bioassay. |
duplicates |
Specifies how duplicate cids within the same assay are treated. If 'drop' is specified, only the first occurrence of each duplicated cid is kept and a warning is issued. If FALSE, processing stops with an error when duplicates are present. If TRUE, duplicates are included without warning, which may cause erroneous results in other bioassayR functions that assume a unique cid list for each assay. |
missingCid |
Either 'drop' or the logical value FALSE. If FALSE, processing stops with an error when any input compound has an empty cid string. If 'drop' is specified, a warning is issued and these compounds are skipped. |
scoreRegex |
A regular expression (Perl compatible, case insensitive) that is matched against the column names in the CSV header to identify relevant score columns. If any columns match this regex, the first matching column is used in place of 'PUBCHEM_ACTIVITY_SCORE' and its name is stored as the assay's scoring method. The default identifies most PubChem Bioassays that contain protein target inhibition data. If a matching column contains only empty or non-numeric results, the next matching column is used automatically. |
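The behavior described for duplicates, missingCid and scoreRegex can be illustrated with a small base R sketch. This is not the bioassayR implementation; it only mimics the documented defaults on a made-up table. The data values, the 'Percent Inhibition' and 'IC50' column names, and the helper hasNumeric are assumptions introduced for illustration.

```r
# Hypothetical activity table; values and score column names are invented.
scores <- data.frame(
  PUBCHEM_CID = c("101", "102", "102", "", "103"),
  PUBCHEM_ACTIVITY_SCORE = c(10, 50, 55, 70, 90),
  "Percent Inhibition" = c("", "", "", "", ""),
  IC50 = c("1.5", "", "3.0", "2.2", "0.2"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Mimic missingCid = "drop": skip rows with an empty cid and warn.
emptyCid <- scores$PUBCHEM_CID == ""
if (any(emptyCid)) {
  warning(sum(emptyCid), " row(s) with an empty cid were skipped", call. = FALSE)
  scores <- scores[!emptyCid, ]
}

# Mimic duplicates = "drop": keep the first occurrence of each cid and warn.
dup <- duplicated(scores$PUBCHEM_CID)
if (any(dup)) {
  warning(sum(dup), " duplicated cid row(s) were dropped", call. = FALSE)
  scores <- scores[!dup, ]
}

# Mimic scoreRegex: Perl-compatible, case-insensitive match on column names.
scoreRegex <- "inhibition|ic50|ki|gi50|ec50|ed50|lc50"
candidates <- grep(scoreRegex, names(scores), perl = TRUE,
                   ignore.case = TRUE, value = TRUE)

# Hypothetical helper: TRUE if a column holds at least one numeric value.
hasNumeric <- function(x) any(!is.na(suppressWarnings(as.numeric(x))))

# Take the first matching column with numeric data; otherwise fall back
# to the default PubChem activity score column.
usable <- Filter(function(col) hasNumeric(scores[[col]]), candidates)
scoringColumn <- if (length(usable) > 0) usable[1] else "PUBCHEM_ACTIVITY_SCORE"
```

In this sketch the 'Percent Inhibition' column matches the regex first but is entirely empty, so the next matching column is selected instead, which mirrors the fallback rule described for scoreRegex.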
A bioassay object containing the loaded data.
Tyler Backman
http://pubchem.ncbi.nlm.nih.gov NCBI PubChem
## load the package
library(bioassayR)

## get sample data locations
extdata_dir <- system.file("extdata", package="bioassayR")
assayDescriptionFile <- file.path(extdata_dir, "exampleAssay.xml")
activityScoresFile <- file.path(extdata_dir, "exampleScores.csv")

## parse files
myAssay <- parsePubChemBioassay("1000", activityScoresFile, assayDescriptionFile)

## print a summary of the parsed bioassay object
myAssay
class: bioassay
aid: 1000
source_id: PubChem BioAssay
assay_type: confirmatory
organism: NA
scoring: IC50
targets: 116516899
target_types: protein
total scores: 57