PROVEANDb-class | R Documentation |
The PROVEANDb class is a container for storing a connection to a PROVEAN sqlite database.
The SIFT tool is no longer actively maintained. A few of the original authors have started the PROVEAN (Protein Variation Effect Analyzer) project. PROVEAN is a software tool that predicts whether an amino acid substitution or indel affects the biological function of a protein. It is useful for filtering sequence variants to identify nonsynonymous or indel variants that are predicted to be functionally important.
See the web pages for a complete description of the methods.
PROVEAN Home: http://provean.jcvi.org/index.php/
SIFT Home: http://sift.jcvi.org/
Though SIFT is not under active development, the PROVEAN team still provides the SIFT scores in the pre-computed downloads. This package, SIFT.Hsapiens.dbSNP137, contains both SIFT and PROVEAN scores.
One notable difference between this and the previous SIFT database package is that keys in SIFT.Hsapiens.dbSNP132 are rs IDs whereas in SIFT.Hsapiens.dbSNP137 they are NCBI dbSNP IDs.
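The change in key type matters when reusing identifiers that were collected for SIFT.Hsapiens.dbSNP132. The following is only a minimal sketch: it assumes, which this page does not confirm, that a dbSNP137 key is the rs ID with its "rs" prefix removed. The helper rs_to_dbsnp() is hypothetical and the IDs are made up for illustration.

```r
## Hypothetical helper (not part of the package): drop a leading "rs".
## ASSUMPTION: a SIFT.Hsapiens.dbSNP137 key is the rs ID without its prefix.
rs_to_dbsnp <- function(ids) {
  sub("^rs", "", as.character(ids), ignore.case = TRUE)
}

## Made-up identifiers; the last one has no prefix and is left unchanged.
old_keys <- c("rs123", "rs4567", "890")
new_keys <- rs_to_dbsnp(old_keys)
new_keys
```

Before relying on converted IDs, it may be worth checking them against keys(SIFT.Hsapiens.dbSNP137).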
In the code below, x is a PROVEANDb object.

metadata(x): Returns x's metadata in a data frame.

columns(x): Returns the names of the columns that can be used to subset the data columns.

keys(x, keytype="DBSNPID", ...): Returns the names of the keys that can be used to subset the data rows. For SIFT.Hsapiens.dbSNP137 the keys are NCBI dbSNP ids.

keytypes(x): Returns the names of the columns that can be used as keys. For SIFT.Hsapiens.dbSNP137 the NCBI dbSNP ids are the only keytype.

select(x, keys = NULL, columns = NULL, keytype = "DBSNPID", ...): Returns a subset of data defined by the character vectors keys and columns. If no keys are supplied, all rows are returned. If no columns are supplied, all columns are returned.
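As a hedged illustration of how keys() and select() can be combined, the sketch below filters a requested set of IDs down to those present in the database before querying. This page does not say how select() treats unknown keys, so filtering first sidesteps the question. keep_known_keys() is a hypothetical helper, not part of the package, and "not_a_real_id" is a made-up value.

```r
## Hypothetical helper (not part of any package): keep only the requested
## keys that also occur among the available keys, compared as character.
keep_known_keys <- function(requested, available) {
  requested <- as.character(requested)
  requested[requested %in% as.character(available)]
}

## Query the database only if the annotation package is installed.
if (requireNamespace("SIFT.Hsapiens.dbSNP137", quietly = TRUE)) {
  library(SIFT.Hsapiens.dbSNP137)
  db <- SIFT.Hsapiens.dbSNP137
  wanted <- c(head(keys(db), 3), "not_a_real_id")  # last entry is made up
  known <- keep_known_keys(wanted, keys(db))
  select(db, keys = known, columns = c("PROVEANPRED", "SIFTPRED"),
         keytype = "DBSNPID")
}
```

Because DBSNPID is the only keytype for this database, the keytype argument is written out only for clarity.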
Valerie Obenchain
The PROVEAN tool has replaced SIFT: http://provean.jcvi.org/about.php
Choi Y, Sims GE, Murphy S, Miller JR, Chan AP (2012) Predicting the Functional Effect of Amino Acid Substitutions and Indels. PLoS ONE 7(10): e46688.
Choi Y (2012) A Fast Computation of Pairwise Sequence Alignment Scores Between a Protein and a Set of Single-Locus Variants of Another Protein. In Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine (BCB '12). ACM, New York, NY, USA, 414-417.
Kumar P, Henikoff S, Ng PC. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc. 2009;4(7):1073-81.
Ng PC, Henikoff S. Predicting the Effects of Amino Acid Substitutions on Protein Function. Annu Rev Genomics Hum Genet. 2006;7:61-80.
Ng PC, Henikoff S. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003 Jul 1;31(13):3812-4.
if (require(SIFT.Hsapiens.dbSNP137)) {
    ## metadata
    metadata(SIFT.Hsapiens.dbSNP137)

    ## keys are the DBSNPID (NCBI dbSNP ID)
    dbsnp <- keys(SIFT.Hsapiens.dbSNP137)
    head(dbsnp)

    ## names of the columns available for subsetting
    columns(SIFT.Hsapiens.dbSNP137)

    ## Return all columns. Note that the key, DBSNPID,
    ## is always returned.
    select(SIFT.Hsapiens.dbSNP137, dbsnp[10])

    ## subset on keys and cols
    cols <- c("VARIANT", "PROVEANPRED", "SIFTPRED")
    select(SIFT.Hsapiens.dbSNP137, dbsnp[20:23], cols)
}