README.md

tinyarray

Introduction

Hi, I'm Xiao Jie. I wrote this R package to meet my own data analysis needs, and I'm glad you found it. I post updates on useful functions, along with other material, on the WeChat official account "bioinfoplanet".

Installation

1.online

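# Install the release version from CRAN if the package is not available yet
if (!require(tinyarray)) install.packages("tinyarray")
# If that did not work, fall back to the development version on GitHub
if (!require(devtools)) install.packages("devtools")
if (!require(tinyarray)) devtools::install_github("xjsun1221/tinyarray", upgrade = FALSE, dependencies = TRUE)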

2.local

Click the green "Code" button on this page, then click "Download ZIP" and save the file to your working directory. Install it with devtools::install_local("tinyarray-master.zip", upgrade = FALSE, dependencies = TRUE).

Functions

1.basic

draw_heatmap(), draw_volcano(), draw_venn(), draw_boxplot(), draw_KM(), risk_plot()

ggheat() draws heatmaps with ggplot2. It is still relatively immature and is mainly useful for aligning combined plots and collecting their legends.

More about ggheat(): https://mp.weixin.qq.com/s/WhsBf6QAhVXeXeScM59cSA

2.Downstream Analysis of Gene Expression Array Data from GEO Database

geo_download(): Given a GEO accession number, return the expression matrix, the clinical information table, and the platform number.

find_anno(): Look up the annotation of the array platform.

get_deg(): Given the array expression matrix, the grouping information, and the probe annotation, return the differential analysis results.

multi_deg(): Differential analysis for multiple groups (up to 5).

If you want to run the differential analysis and get the common figures in one step, use get_deg_all() or multi_deg_all(). This part mainly integrates and simplifies the differential analysis workflow built on GEOquery, AnnoProbe, and limma.

quick_enrich(): Simple and intuitive enrichment analysis.

double_enrich(): Separate enrichment of up- and down-regulated genes, combined with plotting.

https://mp.weixin.qq.com/s/YQQoDsE5JaKpgFGlbEfQNg

https://mp.weixin.qq.com/s/j5IB_MQ0zeOCe1j_ahwtdQ

3.Exploring Expression Matrices

make_tcga_group(): Quickly get the grouping according to the TCGA sample naming rules.
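
The naming rule in question is the TCGA barcode convention: characters 14 and 15 hold the sample-type code, with 01 to 09 marking tumor samples and 10 to 19 marking normal samples. A minimal base-R sketch of that rule follows. The helper name tcga_group_sketch() and the barcodes are made up for illustration, and this is not necessarily how make_tcga_group() is implemented.

```r
# Hypothetical helper illustrating the TCGA barcode rule; not tinyarray's own code.
tcga_group_sketch <- function(barcodes) {
  # Characters 14-15 of a TCGA barcode hold the sample-type code.
  code <- as.integer(substr(barcodes, 14, 15))
  # Codes 01-09 denote tumor samples; 10-19 denote normal samples.
  group <- ifelse(code < 10, "tumor", "normal")
  factor(group, levels = c("normal", "tumor"))
}

# Made-up barcodes; in practice these would be the column names of the matrix.
barcodes <- c("TCGA-XX-0001-01A-11R-0000-07",
              "TCGA-XX-0001-11A-11R-0000-07",
              "TCGA-XX-0002-01A-11R-0000-07")
groups <- tcga_group_sketch(barcodes)
table(groups)
```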

sam_filter(): Remove duplicate tumor samples in TCGA.

match_exp_cl(): Match TCGA expression matrix with clinical information.

trans_array(): Replace the row names of the matrix, such as replacing the probe names of the expression matrix with gene names.
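
As a rough illustration of what replacing probe names with gene names involves, here is a base-R sketch on toy data. The column names probe_id and symbol, and the choice to keep the first probe per gene, are assumptions made for this example rather than a description of trans_array() itself.

```r
# Toy expression matrix: rows are probes, columns are samples (made-up values).
exp <- matrix(c(5.1, 6.3, 7.2, 4.8,
                5.4, 6.0, 7.5, 4.9),
              nrow = 4,
              dimnames = list(c("p1", "p2", "p3", "p4"), c("s1", "s2")))

# Toy annotation table; probe p4 has no annotation, and p1 and p3 share a gene.
ids <- data.frame(probe_id = c("p1", "p2", "p3"),
                  symbol   = c("GENE_A", "GENE_B", "GENE_A"),
                  stringsAsFactors = FALSE)

# Keep one probe per gene (here simply the first one listed).
ids <- ids[!duplicated(ids$symbol), ]
# Keep only the annotated probes, then swap the row names for gene symbols.
exp_gene <- exp[ids$probe_id, , drop = FALSE]
rownames(exp_gene) <- ids$symbol
exp_gene
```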

trans_exp(): Convert TCGA or TCGA+GTEx data to gene IDs (old version, GENCODE v22 or v23).

trans_exp_new(): Convert TCGA or TCGA+GTEx data to gene IDs (new versions).

t_choose(): Run t-tests for individual genes in batch.
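
The idea of a batch t-test can be sketched in base R as one two-sample test per row of the expression matrix. The data below are simulated and the 0.05 cutoff is an arbitrary example value; this shows the general technique, not the internals of t_choose().

```r
set.seed(1)
# Simulated expression matrix: 3 genes x 10 samples (made-up data).
exp <- matrix(rnorm(30), nrow = 3,
              dimnames = list(c("g1", "g2", "g3"), paste0("s", 1:10)))
group_list <- factor(rep(c("control", "treat"), each = 5))
# Shift g1 upward in the treated group so that it should stand out.
exp["g1", group_list == "treat"] <- exp["g1", group_list == "treat"] + 5

# Run one two-sample t-test per gene (row) and collect the p-values.
p_values <- apply(exp, 1, function(x) t.test(x ~ group_list)$p.value)
# Names of the genes that pass the chosen cutoff.
names(p_values)[p_values < 0.05]
```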

cor.full() and cor.one(): Calculate correlations between genes in batches.
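
For orientation, batch correlation between genes can be sketched with base R's cor() on a transposed expression matrix. The data are simulated, the object names are made up, and the sketch reports only correlation coefficients; it is an illustration of the idea, not the output format of cor.full() or cor.one().

```r
set.seed(1)
# Simulated expression matrix: 4 genes x 10 samples (made-up data).
exp <- matrix(rnorm(40), nrow = 4,
              dimnames = list(paste0("g", 1:4), paste0("s", 1:10)))

# cor() correlates columns, so transpose to put genes in columns.
cor_mat <- cor(t(exp))

# All gene pairs: flatten the upper triangle into a long table.
idx <- which(upper.tri(cor_mat), arr.ind = TRUE)
pair_table <- data.frame(gene1 = rownames(cor_mat)[idx[, 1]],
                         gene2 = colnames(cor_mat)[idx[, 2]],
                         r     = cor_mat[idx],
                         stringsAsFactors = FALSE)

# One gene against all the others.
cor_g1 <- cor_mat["g1", setdiff(rownames(cor_mat), "g1")]
```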

4.Survival Analysis and Visualization

point_cut(): Calculate the best cutoff point for survival analysis in batches.

surv_KM(): Do KM survival analysis in batches, supporting grouping with the best cutoff point.
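
To make the batch idea concrete, the sketch below runs one log-rank test per gene with the survival package, splitting samples at the median expression. The data are simulated, the column names time and event are assumptions, and the median split is a stand-in for whichever grouping the package applies.

```r
library(survival)  # recommended package shipped with standard R installations

set.seed(1)
n <- 40
# Simulated clinical table and expression matrix (made-up values).
meta <- data.frame(time = rexp(n, rate = 0.1), event = rbinom(n, 1, 0.7))
exp <- matrix(rnorm(3 * n), nrow = 3,
              dimnames = list(c("g1", "g2", "g3"), NULL))

# For each gene: split samples at the median, then run a log-rank test.
km_p <- apply(exp, 1, function(x) {
  group <- ifelse(x > median(x), "high", "low")
  fit <- survdiff(Surv(time, event) ~ group, data = meta)
  # Convert the chi-square statistic to a p-value (groups - 1 degrees of freedom).
  pchisq(fit$chisq, df = length(fit$n) - 1, lower.tail = FALSE)
})
km_p
```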

surv_cox(): Run univariate Cox regression in batch, with optional grouping by the best cutoff point.

risk_plot(): Draw the three linked risk-factor plots.

https://mp.weixin.qq.com/s/WYBhGxfGg6QFUPHFBashaA

exp_boxplot(): Draw tumor-vs-normal (T-N) boxplots for genes of interest.

exp_surv(): Draw KM plots for genes of interest.

box_surv(): Draw both boxplots and KM plots for genes of interest.

5.Network Graphs

hypertest(): Run hypergeometric tests for mRNA and lncRNA in batch.
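
As a hedged illustration of a hypergeometric test for a single lncRNA-mRNA pair, the sketch below asks whether two sets of miRNAs overlap more than expected by chance. Testing on shared miRNAs, the set sizes, and all object names are assumptions made for this example; the inputs hypertest() actually expects may differ.

```r
# Made-up miRNA sets for one lncRNA and one mRNA.
all_mirs  <- paste0("miR-", 1:100)  # background: every miRNA considered
lnc_mirs  <- all_mirs[1:20]         # miRNAs associated with the lncRNA
mrna_mirs <- all_mirs[11:40]        # miRNAs associated with the mRNA
shared    <- intersect(lnc_mirs, mrna_mirs)

# P(overlap >= observed): draw length(mrna_mirs) miRNAs from a pool in which
# length(lnc_mirs) count as "successes" and the rest as "failures".
p_value <- phyper(length(shared) - 1,
                  length(lnc_mirs),
                  length(all_mirs) - length(lnc_mirs),
                  length(mrna_mirs),
                  lower.tail = FALSE)
p_value
```

Repeating this over every lncRNA-mRNA pair, for example with a nested sapply(), gives the batch version.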

plcortest(): Run correlation tests for mRNA and lncRNA in batch.

https://www.yuque.com/xiaojiewanglezenmofenshen/bsgk2d/dt0isp

interaction_to_edges(): Generate the connection table for the network graph based on the relationship table.

edges_to_nodes(): Generate the node table based on the connection table.
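
The two steps can be pictured with a small base-R sketch: split a relationship table into one row per connection, then list the distinct nodes. The input layout (one regulator per row with comma-separated targets) and all column names are assumptions for illustration, not the exact formats these functions use.

```r
# Made-up relationship table: one regulator per row, targets separated by commas.
df <- data.frame(regulator = c("miR-1", "miR-2"),
                 targets   = c("geneA,geneB", "geneB"),
                 stringsAsFactors = FALSE)

# Connection table: split the target strings and repeat each regulator per target.
target_list <- strsplit(df$targets, ",")
edges <- data.frame(from = rep(df$regulator, lengths(target_list)),
                    to   = unlist(target_list),
                    stringsAsFactors = FALSE)

# Node table: the distinct names, tagged by the column they came from.
nodes <- unique(data.frame(name = c(edges$from, edges$to),
                           type = rep(c("from", "to"), each = nrow(edges)),
                           stringsAsFactors = FALSE))
```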

6.Tricks

dumd(): Count the number of distinct values in each column of a data frame.
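
In base R, a per-column count of distinct values can be sketched as follows. The data frame is made up, and this is an equivalent idea rather than the package's own code.

```r
# Made-up data frame with a different number of distinct values per column.
df <- data.frame(group = c("a", "a", "b", "b"),
                 batch = c(1, 1, 1, 2),
                 id    = 1:4)

# Count the distinct values in each column.
n_distinct <- sapply(df, function(x) length(unique(x)))
n_distinct  # group: 2, batch: 2, id: 4
```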

intersect_all(): Take the intersection of any number of vectors.

union_all(): Take the union of any number of vectors.
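
For comparison, the same set operations over several vectors can be sketched in base R with Reduce(). The vectors are made up, and this shows the underlying idea rather than how intersect_all() and union_all() are written.

```r
x1 <- c("a", "b", "c", "d")
x2 <- c("b", "c", "d", "e")
x3 <- c("c", "d", "f")

# Fold intersect() and union() over the list of vectors.
common    <- Reduce(intersect, list(x1, x2, x3))  # "c" "d"
all_items <- Reduce(union, list(x1, x2, x3))      # "a" "b" "c" "d" "e" "f"
```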




tinyarray documentation built on Aug. 18, 2023, 9:07 a.m.