Remove proteins with unavailable IDs, ambiguous expression ratios, and duplicated IDs.
dat: data frame, protein expression data
IDcol: character, name of column that has the UniProt IDs
up2: logical, TRUE for up-regulated proteins, FALSE for down-regulated proteins
cleanup is used in the pdat_ functions to clean up the dataset given in dat. IDcol is the name of the column that has the UniProt IDs, and up2 indicates the expression change for each protein.
The function removes proteins with unavailable (NA or "") or duplicated IDs. If up2 is provided, the function also removes unquantified proteins (those that have NA values of up2) and those with ambiguous expression ratios (up and down for the same ID). For each operation, a message is printed describing the number of proteins that are unavailable, unquantified, ambiguous, or duplicated.
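The removal steps described above can be sketched in base R. This is a hypothetical re-implementation for illustration only, not the canprot source: the function name cleanup_sketch, the toy data, the order of the steps, and the choice to keep the first occurrence of a duplicated ID are all assumptions.

```r
# Hypothetical sketch of the filtering steps; NOT the canprot implementation.
cleanup_sketch <- function(dat, IDcol, up2 = NULL) {
  ids <- as.character(dat[[IDcol]])

  # Unavailable IDs: NA or empty string
  unav <- is.na(ids) | ids == ""
  message(sum(unav), " unavailable")
  keep <- !unav

  if (!is.null(up2)) {
    # Unquantified proteins: NA values of up2
    unq <- keep & is.na(up2)
    message(sum(unq), " unquantified")
    keep <- keep & !unq

    # Ambiguous proteins: the same ID is both up- and down-regulated
    n_dir <- tapply(up2[keep], ids[keep], function(x) length(unique(x)))
    amb <- keep & ids %in% names(n_dir)[n_dir > 1]
    message(sum(amb), " ambiguous")
    keep <- keep & !amb
  }

  # Duplicated IDs among the remaining rows (assumption: keep the first one)
  dup <- rep(FALSE, nrow(dat))
  dup[keep] <- duplicated(ids[keep])
  message(sum(dup), " duplicated")

  dat[keep & !dup, , drop = FALSE]
}

# Toy data (made up for this example)
toy <- data.frame(
  Entry = c("P1", "P2", NA, "", "P3", "P3", "P4", "P4", "P5"),
  stringsAsFactors = FALSE
)
toy_up2 <- c(TRUE, NA, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE)
cleaned <- cleanup_sketch(toy, "Entry", toy_up2)
```

In this toy case the rows with NA and "" are unavailable, P2 is unquantified, P3 is ambiguous, and the second P4 row is a duplicate, so the rows for P1, the first P4, and P5 remain.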
Alternatively, if IDcol is a logical vector (one value per row of dat), the proteins for which it is TRUE are unconditionally removed.
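As a rough illustration of logical selection, the following base R lines drop the rows flagged TRUE by a condition. The data frame toy_ratio and its column names are made up for this example, and the subsetting is only assumed to mirror what cleanup does when given a logical IDcol.

```r
# Toy data (hypothetical); ratio is an expression ratio
toy_ratio <- data.frame(
  Entry = c("P1", "P2", "P3"),
  ratio = c(4, 1.5, 0.2),
  stringsAsFactors = FALSE
)

# TRUE marks proteins with less than 2-fold change in either direction
remove <- abs(log2(toy_ratio$ratio)) < 1

# Assumed equivalent of cleanup(toy_ratio, remove): drop the flagged rows
kept <- toy_ratio[!remove, , drop = FALSE]
```

Here only P2 (ratio 1.5) falls below the 2-fold threshold, so P1 and P3 are kept.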
This function is used extensively by the pdat_ functions, where it is called after check_IDs (if needed).
# Set up a simple workflow
extdatadir <- system.file("extdata", package="canprot")
datadir <- paste0(extdatadir, "/expression/pancreatic/")
dataset <- "CYD+05"
dat <- read.csv(paste0(datadir, dataset, ".csv.xz"), as.is = TRUE)
up2 <- dat$Ratio..cancer.normal. > 1
# Remove two unavailable and one duplicated proteins
dat <- cleanup(dat, "Entry", up2)
# Now we can retrieve the amino acid compositions
pcomp <- protcomp(dat$Entry)
# Read another data file
datadir <- paste0(system.file("extdata", package="canprot"), "/expression/colorectal/")
dataset <- "STK+15"
dat <- read.csv(paste0(datadir, dataset, ".csv.xz"), as.is = TRUE)
# Remove unavailable proteins
dat <- cleanup(dat, "uniprot")
# Remove proteins that have less than 2-fold expression ratio
dat <- cleanup(dat, abs(log2(dat$invratio)) < 1)