PXDataset2 | R Documentation |
The rpx package provides the infrastructure to access, store and
retrieve information for ProteomeXchange (PX) data sets. This can
be achieved with PXDataset2 objects, which are created with the
PXDataset2() constructor that takes the unique ProteomeXchange
project identifier as input.
The new PXDataset2 class supersedes the previous and now
deprecated PXDataset version.
PXDataset2(id, cache = rpxCache())
PXDataset(id, cache = rpxCache())
## S4 method for signature 'PXDataset2'
pxid(object)
## S4 method for signature 'PXDataset2'
pxurl(object)
## S4 method for signature 'PXDataset2'
pxtax(object)
## S4 method for signature 'PXDataset2'
pxref(object)
pxtitle(object)
pxinstruments(object)
pxSubmissionDate(object)
pxPublicationDate(object)
pxptms(object)
pxprotocols(object, which = c("project", "samples", "data"))
## S4 method for signature 'PXDataset2'
pxfiles(object, n = 10, as.vector = TRUE)
## S4 method for signature 'PXDataset2'
pxCacheInfo(object)
## S4 method for signature 'PXDataset2'
pxget(object, list, cache = rpxCache())
id |
|
cache |
Object of class |
object |
An instance of class |
which |
|
n |
|
as.vector |
|
list |
|
The rpx package uses caching to store ProteomeXchange projects
and project files. When creating an object with PXDataset2(),
the cache is first queried for the project's identifier. If a
unique hit is found, the project is retrieved and returned. If no
matching project identifier is found, then the remote resource is
accessed to first create the new PXDataset2 object, then
cache it before returning it to the user. The same mechanism is
applied when project files are requested.
Caching is supported by the BiocFileCache package. The PXDataset2()
constructor and the pxget() function can be passed an instance
of class BiocFileCache that defines the cache. The default is to
use the package-wide cache defined in rpxCache(). For more
details on how to manage the cache (for example if some files need
to be deleted), please refer to the BiocFileCache package
vignette and documentation. See also rpxCache() for additional
details.
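The caching behaviour described above can be sketched with a dedicated cache. This is a minimal illustration, not the package's reference workflow: it assumes network access to ProteomeXchange, and the temporary cache location and the object names my_cache, px and px2 are hypothetical choices made for the example.

```r
library("rpx")
library("BiocFileCache")

# A throwaway cache in a temporary directory (assumed location; any
# writable path would do). ask = FALSE skips the interactive prompt.
my_cache <- BiocFileCache(tempfile(), ask = FALSE)

# First call: the identifier is not in my_cache yet, so the remote
# resource is accessed and the new object is cached.
px <- PXDataset2("PXD000001", cache = my_cache)

# Second call: the identifier is now found in my_cache.
px2 <- PXDataset2("PXD000001", cache = my_cache)

pxCacheInfo(px)    # resource identifier and cache location
bfcinfo(my_cache)  # BiocFileCache's own listing of cached resources
```

Using a separate cache like this keeps experiments out of the package-wide cache returned by rpxCache().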
The PXDataset2() constructor returns a cached PXDataset2
object. It thus also modifies the cache used for project
caching, as defined by the cache argument.
px_id: character(1) containing the dataset's unique
ProteomeXchange identifier, as used to create the object.
px_rid: character(1) storing the cached resource name in
the BiocFileCache instance stored in cachepath.
px_title: character(1) with the project's title.
px_url: character(1) with the project's URL.
px_doi: character(1) with the project's DOI.
px_ref: character containing the project's reference(s).
px_ref_doi: character containing the project's reference DOIs.
px_pubmed: character containing the project's reference
PubMed identifier.
px_files: data.frame containing information about the
project files, including file names, URIs and types. The files
are retrieved from the project's README.txt file.
px_tax: character (typically of length 1) containing the
taxonomy of the sample.
px_metadata: list containing the project's metadata, as
downloaded from the ProteomeXchange site. All slots but
px_files are populated from this one.
cachepath: character(1) storing the path to the cache the
project object is stored in.
pxfiles(object, n = 10, as.vector = TRUE): by default,
invisibly returns all the project file names. The function
prints the first n files, specifying whether they are local or
remote (based on the cache the object is stored in). The
printing can be suppressed by wrapping the call in
suppressMessages(). If as.vector is set to FALSE, it
returns a data.frame with variables ID, NAME, URI, TYPE,
MAPPINGS and PXID. Note that the variables and their content
will depend on the rpx version that was installed when these
objects were created and cached.
pxget(object, list, cache): list is a vector defining the
files to be downloaded. If list = "all", all files are
downloaded. The file names, as returned by pxfiles(), can also
be used. Alternatively, a logical or numeric index can be
used. If missing, the file to be downloaded can be selected
from a menu.
The argument cache can be passed to define the path to the
cache. The default cache is the package's default as returned
by rpxCache().
pxtax(object): returns the taxonomic name of object.
pxurl(object): returns the base URL on the ProteomeXchange
server where the project files reside.
pxCacheInfo(object, cache): prints and invisibly returns
object's caching information from cache (default is
rpxCache()). The return value is a named vector of length two
containing the resource identifier and the cache location.
pxtitle(object): returns the project's title.
pxref(object): returns the project's bibliographic
reference(s).
pxinstruments(object): returns the instrument(s) used to
acquire the data.
pxptms(object): returns the PTMs searched for in the
experiment.
pxprotocols(object, which): returns a list with the project
description, sample processing and/or data processing
protocols.
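As a hedged illustration of the pxfiles() and pxget() entries above, the sketch below selects a file by name and by numeric index rather than through the interactive menu. It assumes network access and the default rpxCache(); the fasta file name pattern and the variable names f, fasta, idx and fas are choices made for this example only.

```r
library("rpx")

px <- PXDataset2("PXD000001")

# File names as a character vector, without the printed listing.
f <- suppressMessages(pxfiles(px))

# Select by name: keep only the file names ending in ".fasta".
fasta <- grep("\\.fasta$", f, value = TRUE)
fas <- pxget(px, fasta)

# Select by position: the same files, given as a numeric index into f.
# A logical index such as grepl("\\.fasta$", f) is the third option.
idx <- grep("\\.fasta$", f)
fas_by_index <- pxget(px, idx)

# pxget() returns the local path(s) of the cached file(s).
file.exists(fas)
```

Filtering the names before calling pxget() avoids downloading large raw files when only a small file is needed.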
Laurent Gatto
Vizcaino J.A. et al. 'ProteomeXchange: globally co-ordinated proteomics data submission and dissemination', Nature Biotechnology 2014, 32, 223 – 226, doi:10.1038/nbt.2839.
Source repository for the ProteomeXchange project: https://code.google.com/p/proteomexchange/
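# Create the project object (or retrieve it from the cache)
px <- PXDataset2("PXD000001")
px
# Project metadata
pxtax(px)
pxurl(px)
pxref(px)
# Project files, as a character vector or as a data.frame
pxfiles(px)
pxfiles(px, as.vector = FALSE)
pxCacheInfo(px)
# Download (and cache) the fasta file, then read it
fas <- pxget(px, "erwinia_carotovora.fasta")
fas
library("Biostrings")
readAAStringSet(fas)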