This package provides a set of helper tools that automate common tasks encountered during routine genome analysis.
It uses the data.table package to keep these functions fast.
- Load from a file
- Write to a file
- Read .fasta/fastq to a Genome object
- Write a Genome object to file
- Read .fasta to a Proteome object
- Write a Proteome object to file
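
As a rough illustration of the FASTA-reading step, the sketch below parses FASTA-formatted lines into a named character vector using base R only. The function name `read_fasta_lines` is hypothetical, and the named vector is only a stand-in for the package's Genome object, whose actual structure is not described here.

```r
# Hypothetical helper (not the package's API): parse FASTA lines into a
# named character vector, one element per record.
read_fasta_lines <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]                 # drop blank lines
  is_header <- startsWith(lines, ">")
  stopifnot(length(lines) > 0L, is_header[1])   # input must start with a header
  # Record ID = first whitespace-delimited token after ">"
  ids <- sub("^>\\s*(\\S+).*$", "\\1", lines[is_header])
  # Assign every sequence line to the record it belongs to; keeping all
  # levels means a header with no sequence lines yields an empty string.
  record <- factor(cumsum(is_header)[!is_header], levels = seq_along(ids))
  seqs <- vapply(split(lines[!is_header], record), paste, character(1),
                 collapse = "")
  setNames(unname(seqs), ids)
}

fasta_text <- c(">seq1 example record", "ATGC", "GGTA", ">seq2", "TTAA")
genome <- read_fasta_lines(fasta_text)
```

To read from disk under the same assumptions, the lines could come from `readLines()` on a FASTA file path instead of the in-memory vector used here.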
- Fetch the gene length (coding exons) for a list of genes
- Fetch the GC content of the sequence for a list of genes
- Perform Gene Ontology Enrichment analysis for a set of interesting genes
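
GC content is the fraction of G and C bases in a sequence. A minimal base R sketch follows; `gc_content` is a hypothetical name, and the choice to ignore ambiguous bases such as N is an assumption rather than the package's documented behavior.

```r
# Hypothetical helper (not the package's API): GC fraction per sequence,
# computed over unambiguous bases (A, C, G, T) only.
gc_content <- function(seqs) {
  vapply(seqs, function(s) {
    bases <- strsplit(toupper(s), "")[[1]]
    bases <- bases[bases %in% c("A", "C", "G", "T")]
    if (length(bases) == 0L) return(NA_real_)   # nothing countable
    mean(bases %in% c("G", "C"))
  }, numeric(1), USE.NAMES = FALSE)
}

gc <- gc_content(c("ATGC", "GGCC", "ATAT", "NNNN"))
```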
- Min max normalization
- Row median/deviation normalization
- Sample specific normalization
- Upper Quartile Normalization
- Gene counts to expression in 'Counts per million' (CPM)
- Gene counts to expression in 'Transcripts per million' (TPM)
- Gene counts to expression in 'Relative Log Expression' (RLE) as used in DESeq
- Invertibility of matrix
- Principal Component Analysis + 2D plots
- Multi-Dimensional Scaling + 2D plots
- Singular Value Decomposition + 2D plots
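
The CPM and TPM conversions listed above can be sketched in a few lines of base R. The function names `counts_to_cpm` and `counts_to_tpm` are hypothetical, and the input layout (genes in rows, samples in columns, gene lengths in base pairs) is an assumption.

```r
# Hypothetical helpers (not the package's API).
# counts: numeric matrix, genes in rows, samples in columns.

# CPM: scale each sample (column) so its counts sum to one million.
counts_to_cpm <- function(counts) {
  sweep(counts, 2, colSums(counts), "/") * 1e6
}

# TPM: first divide each gene (row) by its length in kilobases,
# then scale each sample so the length-normalized rates sum to one million.
counts_to_tpm <- function(counts, gene_length_bp) {
  stopifnot(length(gene_length_bp) == nrow(counts))
  rate <- sweep(counts, 1, gene_length_bp / 1000, "/")
  sweep(rate, 2, colSums(rate), "/") * 1e6
}

counts <- matrix(c(10, 20, 70, 5, 5, 90), nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))
cpm_mat <- counts_to_cpm(counts)
tpm_mat <- counts_to_tpm(counts, gene_length_bp = c(1000, 2000, 1000))
```

Both results have columns that sum to one million; TPM differs from CPM only in that longer genes are down-weighted before scaling.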
- Modify hierarchical clustering to produce a plot colored by groups
- Produce an MA plot (to identify differentially expressed genes, for instance)
- Produce fancy heatmaps
- Produce a smooth histogram by modifying base R plotting parameters
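
An MA plot shows, for each gene, the log ratio between two conditions (M) against the average log expression (A). The sketch below computes both quantities and wraps a base R plot; `ma_values` and `plot_ma` are hypothetical names, and the pseudocount of 1 is an assumption made to avoid taking the log of zero.

```r
# Hypothetical helpers (not the package's API).
# x, y: expression values for the same genes in two conditions.
ma_values <- function(x, y, pseudocount = 1) {
  stopifnot(length(x) == length(y))
  lx <- log2(x + pseudocount)
  ly <- log2(y + pseudocount)
  data.frame(A = (lx + ly) / 2,   # average log2 expression
             M = ly - lx)         # log2 fold change, y relative to x
}

# Scatter plot of M against A with a reference line at M = 0.
plot_ma <- function(ma) {
  plot(ma$A, ma$M, pch = 16, cex = 0.6,
       xlab = "A (mean log2 expression)", ylab = "M (log2 fold change)")
  abline(h = 0, lty = 2)
}

ma <- ma_values(x = c(3, 7, 15), y = c(7, 3, 15))
```

Calling `plot_ma(ma)` draws the plot; genes far from the dashed line are candidates for differential expression.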
Includes ordering comparisons by significance
- T-tests
- F-tests
- Significance of difference of means test
- Significance of difference of variances test
- Wilcoxon tests
- Kolmogorov–Smirnov test
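
One way to read "ordering comparisons by significance" is to run a test per gene and sort the results by p-value. The sketch below does this with Welch t-tests from base R; the function name `rank_by_ttest` and the input layout (genes in rows, samples in columns, two groups) are assumptions.

```r
# Hypothetical helper (not the package's API): one two-sample Welch t-test
# per gene, with results ordered from most to least significant.
rank_by_ttest <- function(expr, group) {
  group <- factor(group)
  stopifnot(nlevels(group) == 2L, length(group) == ncol(expr))
  in_first <- group == levels(group)[1]
  p <- apply(expr, 1, function(v) t.test(v[in_first], v[!in_first])$p.value)
  genes <- rownames(expr)
  if (is.null(genes)) genes <- as.character(seq_len(nrow(expr)))
  res <- data.frame(gene = genes, p_value = unname(p))
  res[order(res$p_value), ]
}

expr <- rbind(geneA = c(1, 2, 3, 1.5, 2.5, 3.5),
              geneB = c(1, 2, 3, 11, 12, 13),
              geneC = c(5, 6, 7, 7, 8, 9))
group <- c("ctrl", "ctrl", "ctrl", "treat", "treat", "treat")
ranked <- rank_by_ttest(expr, group)
```

The same pattern applies to the other tests in the list by swapping `t.test` for `wilcox.test`, `var.test`, or `ks.test`.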
- Maximum likelihood estimation of Gaussian distribution parameters + AIC
- Maximum likelihood estimation of Weibull distribution parameters + AIC
- Implementation of binomial generalized linear model + AIC + BIC
- Bayesian posterior estimation for a mixture of betas
- Bayesian posterior estimation for binomial beta distribution
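
For the Gaussian case the maximum likelihood estimates have a closed form, so the fit and its AIC can be sketched directly. The function name `fit_gaussian_mle` is hypothetical; the AIC is computed as 2k - 2 log L with k = 2 parameters.

```r
# Hypothetical helper (not the package's API): closed-form Gaussian MLE
# with log-likelihood and AIC.
fit_gaussian_mle <- function(x) {
  mu <- mean(x)
  sigma <- sqrt(mean((x - mu)^2))   # MLE divides by n, not n - 1
  loglik <- sum(dnorm(x, mean = mu, sd = sigma, log = TRUE))
  k <- 2                            # estimated parameters: mu and sigma
  list(mu = mu, sigma = sigma, loglik = loglik, aic = 2 * k - 2 * loglik)
}

fit <- fit_gaussian_mle(c(1, 2, 3, 4, 5))
```

Distributions without closed-form estimates, such as the Weibull, would typically need numerical optimization of the log-likelihood (for example with `optim`) before applying the same AIC formula.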
- Support Vector Machine classifier
- Naive Bayes Classifier
- Random Forest Classifier
- Linear Discriminant Analysis
- Limma linear model for differential gene expression and computation of Residual sum of squares for downstream analysis
Carries out genome-wide association analysis using parallelized code for speed. Given genotype and phenotype data, it fits a generalized linear model per SNP and returns a significance measure for each.
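
The per-SNP model fitting described above might look roughly like the following sketch. The function name `gwas_glm`, the binary phenotype with a logistic model, and the genotype coding (allele counts 0, 1, 2 in a samples-by-SNPs matrix) are all assumptions; the package's actual interface and model may differ. Parallelism here uses `parallel::mclapply`, which ships with R.

```r
# Hypothetical helper (not the package's API): one logistic regression per SNP.
# genotypes: samples x SNPs matrix of allele counts (0, 1, 2)
# phenotype: 0/1 vector, one entry per sample
# cores:     worker processes; values above 1 are not supported on Windows
gwas_glm <- function(genotypes, phenotype, cores = 1L) {
  stopifnot(nrow(genotypes) == length(phenotype))
  fit_one <- function(j) {
    g <- genotypes[, j]
    coefs <- coef(summary(glm(phenotype ~ g, family = binomial())))
    # A monomorphic SNP has no estimable slope, so report NA for it.
    if (nrow(coefs) < 2L) NA_real_ else coefs[2, "Pr(>|z|)"]
  }
  p <- unlist(parallel::mclapply(seq_len(ncol(genotypes)), fit_one,
                                 mc.cores = cores))
  snp_ids <- colnames(genotypes)
  if (is.null(snp_ids)) snp_ids <- paste0("snp", seq_len(ncol(genotypes)))
  res <- data.frame(snp = snp_ids, p_value = p)
  res[order(res$p_value), ]   # most significant first, NA last
}

phenotype <- c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1)
genotypes <- cbind(snp1 = c(0, 0, 0, 0, 1, 2, 0, 1, 2, 2, 2, 2),
                   snp2 = c(0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2))
hits <- gwas_glm(genotypes, phenotype)
```

In this toy input `snp2` has the same genotype distribution in both phenotype groups, so it ranks below `snp1`. A real analysis would also correct the p-values for multiple testing.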