supp_figure2d: ENCODE RNAseq data for the cellular localization analysis.


Description

Quantified RNAseq data for 11 cell lines, based on the GRCh38 assembly, were downloaded from ENCODE, and quantifications for miR34a asRNA (ENSG00000234546), ACTB (ENSG00000075624), GAPDH (ENSG00000111640), and MALAT1 (ENSG00000251562) were extracted. The cell lines for which data were downloaded are: A549, GM12878, HeLa-S3, HepG2, HT1080, K562, MCF-7, NCI-H460, SK-MEL-5, SK-N-DZ, SK-N-SH.

Initial exploratory analysis revealed that several cell lines should not be included in the final figure, for the following reasons: SK-N-SH has a larger proportion of GAPDH in the nucleus than in the cytoplasm. The variation in miR34a asRNA expression is too large for SK-MEL-5. K562, HT1080, SK-N-DZ, and NCI-H460 have no or low miR34a asRNA expression.

In addition, both cytoplasmic markers, ACTB and GAPDH, were evaluated for their ability to differentiate between the nuclear and cytoplasmic fractions, and GAPDH was chosen for the final analysis due to its superior performance. Furthermore, only polyadenylated libraries were used in the final analysis, because cellular compartment enrichment was improved in these samples and all analyzed genes are reported to be polyadenylated (MALAT1: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2722846/). Only samples with 2 biological replicates were retained.

For each cell type, gene, and biological replicate, the fraction of transcripts per million (TPM) in each cellular compartment was calculated as the TPM in that compartment divided by the total TPM. The mean and standard deviation of the fraction were subsequently calculated for each cell type and cellular compartment, and this information is represented in the final figure.
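The fraction calculation described above can be illustrated with a minimal base R sketch. It is not the package's own code: the column names (cell, gene, replicate, compartment, TPM) and the TPM values are hypothetical, and "total TPM" is assumed to mean the sum over the nuclear and cytoplasmic compartments for a given cell line, gene, and replicate.

```r
# Toy data: hypothetical column names and made-up TPM values (not ENCODE data).
tpm <- data.frame(
  cell        = "HeLa-S3",
  gene        = rep(c("GAPDH", "MALAT1"), each = 4),
  replicate   = rep(c(1, 1, 2, 2), times = 2),
  compartment = rep(c("cytoplasm", "nucleus"), times = 4),
  TPM         = c(90, 10, 80, 20, 5, 95, 15, 85)
)

# Total TPM per cell line, gene, and replicate, summed over compartments
# (assumption: "total TPM" means cytoplasm + nucleus).
tpm$total <- ave(tpm$TPM, tpm$cell, tpm$gene, tpm$replicate, FUN = sum)

# Fraction of TPM found in each compartment.
tpm$fraction <- tpm$TPM / tpm$total

# Mean and standard deviation of the fraction across biological replicates.
means <- aggregate(fraction ~ cell + gene + compartment, data = tpm, FUN = mean)
sds   <- aggregate(fraction ~ cell + gene + compartment, data = tpm, FUN = sd)
names(means)[4] <- "mean_fraction"
names(sds)[4]   <- "sd_fraction"

# One row per cell line, gene, and compartment.
fraction_summary <- merge(means, sds)
print(fraction_summary)
```

With two compartments, the fractions for each gene and replicate sum to 1, so a cytoplasmic marker such as GAPDH and a nuclear marker such as MALAT1 would be expected to show opposite patterns in the summary.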

Format

A tibble with 228 rows and 66 variables.

Source

https://www.encodeproject.org

Examples

getData('Supplementary Figure 2d')

GranderLab/miR34a_asRNA_project documentation built on May 26, 2019, 7:26 a.m.