knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%" )
library(newprideR)
newprideR lets you explore the Pride Archive from R. It provides classes and functions for Pride projects, files, peptides, and proteins, and supports most of the search functionality featured on the Pride Archive website, plus a bit more. This makes it possible to search and download systematically without leaving R.
All metadata is retrieved through the Pride Archive web service API, whose Swagger UI documentation can be found here: https://www.ebi.ac.uk/pride/ws/archive/v2/swagger-ui.html
First install the devtools package if it is not installed already, then use install_github().
# Install devtools from CRAN (skip if already installed)
install.packages("devtools")
library(devtools)
# Install newprideR from GitHub
install_github("booker-tremayne/new-prideR")
Project Summaries store the metadata for projects.
search.ProjectSummary() searches for projects across Pride.
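Among the arguments it accepts are keywords, page.size, page.number, instrument, and disease, all of which appear in the examples in this document.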
For example:
project.list <- search.ProjectSummary(keywords = "imaging", page.size = 2, page.number = 4, instrument = "LTQ Orbitrap")
project.list
get.ProjectSummary() retrieves more detailed information about a single project. It accepts a project accession.
first.project <- get.ProjectSummary("PXD000001")
first.project
list.ProjectSummary() retrieves projects from newest to oldest. It accepts only a page size and a page number.
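Because results arrive one page at a time, walking through several pages takes a loop. The following is a minimal sketch and not part of newprideR: collect.pages() is a hypothetical helper, and fetch.page stands in for any function that takes page.size and page.number and returns a list. Whether list.ProjectSummary() uses exactly these argument names, and whether its pages start at 0, are assumptions to check against the package documentation. The stand-in fake.page below only simulates a paged source so the sketch runs without network access.

```r
# Hypothetical helper (not part of newprideR): gather results from a paged source.
# fetch.page must accept page.size and page.number and return a list.
collect.pages <- function(fetch.page, page.size = 10, max.pages = 5, first.page = 0) {
  results <- list()
  for (page in seq(first.page, length.out = max.pages)) {
    batch <- fetch.page(page.size = page.size, page.number = page)
    # Stop when a page comes back empty
    if (length(batch) == 0) break
    results <- c(results, batch)
    # A short page means there is nothing left to fetch
    if (length(batch) < page.size) break
  }
  results
}

# Stand-in for a real paged call: serves seven items, page.size at a time.
items <- as.list(1:7)
fake.page <- function(page.size, page.number) {
  start <- page.number * page.size + 1
  if (start > length(items)) return(list())
  items[start:min(start + page.size - 1, length(items))]
}

collected <- collect.pages(fake.page, page.size = 3)
```

If the assumed argument names hold, a real call might look like collect.pages(list.ProjectSummary, page.size = 50, max.pages = 3).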
get.similar.projects() accepts a project accession and returns the projects Pride deems similar.
File Details store metadata for the files that projects contain.
A File Detail List holds File Details together with their associated projects. It is intended for lists that contain multiple Project Summaries and their respective File Details.
get.FileDetail() gets the FileDetail data for a project. It accepts a project accession.
get.FileDetailList() gets all the FileDetails for a list of Project Summaries given to it.
search.FileDetail() is the search function for File Details.
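It can accept a list of Project Summaries or a FileDetailList, a vector of keywords such as file extensions, and the all argument shown in the example.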
Note that this returns a list of FileDetailList objects.
For example:
project.list <- search.ProjectSummary(keywords = "imaging", page.size = 2, page.number = 19, instrument = "LTQ Orbitrap")
file.list <- search.FileDetail(project.list, keywords = c(".tif", ".ibd", ".imzml"), all = TRUE)
file.list
download.by.accession() downloads all files from a project. It accepts a project accession string.
download.project.list() downloads all files from all projects in a project list. It accepts a list of Project Summaries. Folder names are derived from the project titles.
download.by.name() downloads a single file according to its file name.
Note that get.FileDetailList() will be slow, and search.FileDetail() will take a few minutes if given a Project Summary list rather than a FileDetailList.
Peptide Details store metadata for peptides within Pride.
get.PeptideDetail.accession() accepts a project accession and returns a list of peptides associated with the given project.
peptide.list <- get.PeptideDetail.accession("PXD019134", page.size = 2)
peptide.list
get.list.PeptideDetail() returns a list of all peptides from Pride. It accepts a page size and a page number.
Protein Details store metadata for proteins within Pride, and are similar to Peptide Details.
get.ProteinDetail.accession() accepts a project accession and returns a list of proteins associated with the given project.
protein.list <- get.ProteinDetail.accession("PXD019134", page.size = 2)
protein.list
get.list.ProteinDetail() returns a list of all proteins from Pride. It accepts a page size and a page number.
We want to find the projects that can be analyzed with Cardinal. For this, we need projects that contain ".ibd" and ".imzml" files. First, we obtain all projects that match the keywords "imaging" or "msi". Narrowing the list this way is necessary because the file search is slow when run across many projects. We also restrict the results to breast cancer.
imaging.list <- search.ProjectSummary(c("imaging", "msi"), page.size = 100, disease = "Breast cancer")
Now, we must retrieve only projects containing the file extensions ".imzml" and ".ibd".
successful.list <- search.FileDetail(imaging.list, c(".imzml", ".ibd"), all = TRUE)
successful.list
Finally, we also want an optical image in our projects to analyze. To find one, we search for projects containing files with the extension ".jpg", ".jpeg", ".tif", or ".png". We can feed our previous successful list into search.FileDetail() to do so.
final.list <- search.FileDetail(successful.list, c(".jpg", ".jpeg", ".tif", ".png"))
final.list
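The two searches above differ in how keywords are combined: the first requires every extension to be present (all = TRUE), while the second is satisfied by any one of them. The sketch below illustrates that distinction on plain file names using base R only. It is not how newprideR implements the search: has.keywords() is a hypothetical helper, the file names are made up, and the case-insensitive substring matching is an assumption.

```r
# Hypothetical helper (not part of newprideR): does a set of file names
# satisfy a keyword search? Matching is a case-insensitive substring test.
has.keywords <- function(file.names, keywords, all = FALSE) {
  lower.names <- tolower(file.names)
  # For each keyword, check whether at least one file name contains it
  hits <- vapply(tolower(keywords),
                 function(k) any(grepl(k, lower.names, fixed = TRUE)),
                 logical(1))
  # all = TRUE requires every keyword to match; otherwise one match is enough
  if (all) base::all(hits) else base::any(hits)
}

# Made-up file names for one project
files <- c("sample1.imzML", "sample1.ibd", "overview.tif")

both.present <- has.keywords(files, c(".imzml", ".ibd"), all = TRUE)
any.image    <- has.keywords(files, c(".jpg", ".jpeg", ".tif", ".png"))
```

Under these assumptions, a project with the files listed would pass both searches, while a project with only an ".imzML" file would fail the first.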
Now we may download the resulting project or projects.
download.project.list(final.list, "/users/UserName/exampleDownload")