hpar: Human Protein Atlas in R


Description

This package provides a simple interface to the Human Protein Atlas. From the Human Protein Atlas Project page: The Swedish Human Protein Atlas project, funded by the Knut and Alice Wallenberg Foundation, has been set up to allow for a systematic exploration of the human proteome using Antibody-Based Proteomics. This is accomplished by combining high-throughput generation of affinity-purified antibodies with protein profiling in a multitude of tissues and cells assembled in tissue microarrays. Confocal microscopy analysis using human cell lines is performed for more detailed protein localization. The program hosts the Human Protein Atlas portal with expression profiles of human proteins in tissues and cells.

Details

Several flat files are distributed by the HPA project and are available within the package as data.frames; other datasets are available through a search query on the HPA website. The descriptions below are taken from the HPA site:

hpaNormalTissue

Normal tissue data: Expression profiles for proteins in human tissues based on immunohistochemistry using tissue microarrays. The tab-separated file includes Ensembl gene identifier ("Gene"), tissue name ("Tissue"), annotated cell type ("Cell type"), expression value ("Level"), and the gene reliability of the expression value ("Reliability").
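As a minimal sketch of how a table with this layout can be queried in base R: the rows below are invented for illustration and are not real HPA measurements. The column name Cell.type is an assumption about how "Cell type" appears once the file is read into a data.frame.

```r
# Toy rows invented for illustration only; NOT real HPA measurements.
# Column names follow the description above; "Cell.type" is an assumed
# data.frame-safe spelling of the "Cell type" column.
toy <- data.frame(
  Gene        = c("ENSG00000000003", "ENSG00000000003", "ENSG00000000419"),
  Tissue      = c("liver", "kidney", "liver"),
  Cell.type   = c("hepatocytes", "cells in tubules", "hepatocytes"),
  Level       = c("Low", "High", "Medium"),
  Reliability = c("Approved", "Approved", "Supported"),
  stringsAsFactors = FALSE
)

# Keep only the rows for one gene of interest.
one_gene <- toy[toy$Gene == "ENSG00000000003", ]

# Tissues in which that gene is annotated with a "High" expression level.
high_tissues <- one_gene[one_gene$Level == "High", "Tissue"]

print(one_gene)
print(high_tissues)
```

The same subsetting pattern should carry over to the packaged data.frame once its actual column names have been checked with names().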

hpaNormalTissue16.1

Same as above, for version 16.1

hpaCancer

Pathology data: Staining profiles for proteins in human tumor tissue based on immunohistochemistry using tissue microarrays, and log-rank P value for Kaplan-Meier analysis of correlation between mRNA expression level and patient survival. The tab-separated file includes Ensembl gene identifier ("Gene"), gene name ("Gene name"), tumor name ("Cancer"), the number of patients annotated for different staining levels ("High", "Medium", "Low" & "Not detected") and log-rank P values for patient survival and mRNA correlation ("prognostic - favourable", "unprognostic - favourable", "prognostic - unfavourable", "unprognostic - unfavourable").

hpaCancer16.1

Same as above, for version 16.1

rnaGeneTissue

RNA HPA tissue gene data: Transcript expression levels summarized per gene in 37 tissues based on RNA-seq. The tab-separated file includes Ensembl gene identifier ("Gene"), analysed sample ("Tissue"), transcripts per million ("TPM"), protein-coding transcripts per million ("pTPM") and normalized expression ("NX").

rnaGeneCellLine

RNA HPA cell line gene data: Transcript expression levels summarized per gene in 64 cell lines. The tab-separated file includes Ensembl gene identifier ("Gene"), analysed sample ("Cell line"), transcripts per million ("TPM"), protein-coding transcripts per million ("pTPM") and normalized expression ("NX").

rnaGeneCellLine16.1

Same as above, for version 16.1

hpaSubcellularLoc

Subcellular location data: Subcellular location of proteins based on immunofluorescently stained cells. The tab-separated file includes the following columns: Ensembl gene identifier ("Gene"), name of gene ("Gene name"), gene reliability score ("Reliability"), enhanced locations ("Enhanced"), supported locations ("Supported"), Approved locations ("Approved"), uncertain locations ("Uncertain"), locations with single-cell variation in intensity ("Single-cell variation intensity"), locations with spatial single-cell variation ("Single-cell variation spatial"), locations with observed cell cycle dependency (type can be one or more of biological definition, custom data or correlation) ("Cell cycle dependency"), Gene Ontology Cellular Component term identifier ("GO id").

hpaSubcellularLoc16.1

Same as above, for version 16.1

hpaSubcellularLoc14

Same as above, for version 14.

hpaSecretome

Secretome data: The human secretome is here defined as all Ensembl genes with at least one predicted secreted transcript according to HPA predictions. The complete information about the HPA Secretome data is given on https://www.proteinatlas.org/humanproteome/blood/secretome. This dataset has 230 columns and includes the Ensembl gene identifier ("Gene"). Information about the additional variables can be found on that page by clicking on Show/hide columns.

Detailed descriptions of gene entries and images are not included in the package, but can be accessed from within the R environment through a web browser while online.

The full data sets can be individually loaded using the data function (see example below). Data about individual genes of interest can be retrieved with the getHpa function.

HPA data usage policy: The use of data and images from this site in publications and presentations is permitted provided that the following conditions are met:

  1. The publication and/or presentation are solely for informational and non-commercial purposes.

  2. The source of the data and/or image is referred to this site (www.proteinatlas.org) and/or one or more of our publications are cited.

Author(s)

Laurent Gatto <laurent.gatto@uclouvain.be>

References

See the Human Protein Atlas Project page http://www.proteinatlas.org/ and http://www.proteinatlas.org/about/download for more details and documentation.

Uhlen et al (2015). Tissue-based map of the human proteome. Science. 347(6220):1260419.

Uhlen et al (2010). Towards a knowledge-based Human Protein Atlas. Nat Biotechnol. 28(12):1248-50.

Berglund et al (2008). A gene-centric Human Protein Atlas for expression profiles based on antibodies. Mol Cell Proteomics. 7(10):2019-27.

Uhlen et al (2005). A human protein atlas for normal and cancer tissues based on antibody proteomics. Mol Cell Proteomics. 4(12):1920-1932.

Ponten et al (2008). The Human Protein Atlas - a tool for pathology. J Pathology. 216(4):387-93.

See Also

getHpaDate for release information. Gene-specific information should be accessed using the getHpa function.

The package vignette can be accessed with vignette("hpar").

Examples
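
A minimal sketch of the workflow described in Details, assuming the hpar release documented here, in which the data sets are loaded with data() and getHpa() takes an hpadata argument naming the data set to query. The Ensembl identifier is only an illustrative choice, and the requireNamespace() guard is an addition so that the snippet does nothing when hpar is not installed.

```r
# Illustrative Ensembl gene identifier (an assumption; any valid ID can be used).
id <- "ENSG00000000003"

# Guard added for this sketch: only run the hpar calls if the package is installed.
if (requireNamespace("hpar", quietly = TRUE)) {
  library("hpar")

  # Load one full data set as a data.frame and inspect its first rows.
  data(hpaNormalTissue)
  print(head(hpaNormalTissue))

  # Retrieve the rows for a single gene; the hpadata argument is assumed
  # from the release documented here and may differ in other versions.
  print(getHpa(id, hpadata = "hpaNormalTissue"))

  # Release information for the bundled data.
  print(getHpaDate())
}
```

Other data sets listed in Details, such as hpaSubcellularLoc, can presumably be loaded and queried the same way by substituting their names.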


hpar documentation built on Nov. 8, 2020, 8:32 p.m.