ssarda/genomeutils: Utilities for Genome Analysis

genomeutils

This package provides helper tools that automate common tasks encountered during routine genome analysis.

These functions use the data.table package for speed.

- Load from a file
- Write to a file
- Read .fasta/fastq to a Genome object
- Write a Genome object to file
- Read .fasta to a Proteome object
- Write a Proteome object to file

- Fetch the gene length (coding exons) for a list of genes
- Fetch the GC content of the sequence for a list of genes
- Perform Gene Ontology Enrichment analysis for a set of interesting genes
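As an illustration of the GC content item above, here is a minimal sketch of the computation. The package itself is written in R; this Python snippet is not its API, and the function name `gc_content` and the choice to ignore ambiguous bases such as `N` are assumptions made for the example.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G/C bases among the A/C/G/T bases of a sequence.

    Hypothetical helper for illustration only, not part of genomeutils.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at  # assumption: ambiguous bases (e.g. N) are ignored
    # An empty or fully ambiguous sequence has no defined GC content.
    return gc / total if total else float("nan")


if __name__ == "__main__":
    print(gc_content("ATGCGC"))
```

Whether ambiguous bases count toward the denominator changes the result for low-quality sequence, so it is worth checking how any given tool treats them.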

- Min max normalization
- Row median/deviation normalization
- Sample specific normalization
- Upper Quartile Normalization
- Gene counts to expression in 'Counts per million' (CPM)
- Gene counts to expression in 'Transcripts per million' (TPM)
- Gene counts to expression in 'Relative Log Expression' (RLE) as used in DESeq
- Invertibility of matrix
- Principal Component Analysis + 2D plots
- Multi-Dimensional Scaling + 2D plots
- Singular Value Decomposition + 2D plots
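The CPM and TPM conversions listed above follow standard definitions, which can be sketched for a single sample as below. This is an illustrative Python sketch, not the package's R code; the function names and the use of gene lengths in base pairs are assumptions for the example.

```python
def counts_to_cpm(counts):
    """Scale raw gene counts so that they sum to one million."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]


def counts_to_tpm(counts, lengths_bp):
    """Convert raw gene counts to transcripts per million.

    Counts are first divided by gene length (reads per base), then the
    length-normalized rates are rescaled to sum to one million.
    """
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    scale = sum(rates)
    return [r * 1e6 / scale for r in rates]


if __name__ == "__main__":
    counts = [10, 20, 70]           # made-up counts for three genes
    lengths = [1000, 2000, 1000]    # made-up gene lengths in bp
    print(counts_to_cpm(counts))
    print(counts_to_tpm(counts, lengths))
```

CPM corrects only for sequencing depth, while TPM also corrects for gene length, which is why TPM needs the gene lengths as a second input.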

- Modify hierarchical clustering to produce a plot colored by groups
- Produce an MA plot (to identify differentially expressed genes, for instance)
- Produce fancy heatmaps
- Produce a smooth histogram by modifying base R plotting parameters

Includes ordering comparisons by significance

- T-tests
- F-tests
- Significance of difference of means test
- Significance of difference of variances test
- Wilcoxon tests
- Kolmogorov–Smirnov test

- Maximum likelihood estimation of Gaussian distribution parameters + AIC
- Maximum likelihood estimation of Weibull distribution parameters + AIC
- Implementation of binomial generalized linear model + AIC + BIC
- Bayesian posterior estimation for a mixture of betas
- Bayesian posterior estimation for binomial beta distribution
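To make the first item above concrete, the sketch below computes the closed-form maximum likelihood estimates for a Gaussian and the corresponding AIC. It is a Python illustration under stated assumptions, not the package's R implementation; the function name is hypothetical, and the data are assumed to contain at least two distinct values so the variance is positive.

```python
import math


def gaussian_mle_aic(data):
    """Return (mean, standard deviation, AIC) for a Gaussian fitted by MLE."""
    n = len(data)
    mu = sum(data) / n
    # The MLE of the variance divides by n, not n - 1.
    var = sum((x - mu) ** 2 for x in data) / n
    # Log-likelihood evaluated at the MLE simplifies to this closed form.
    loglik = -0.5 * n * (math.log(2 * math.pi * var) + 1)
    k = 2  # two fitted parameters: mean and variance
    aic = 2 * k - 2 * loglik
    return mu, math.sqrt(var), aic


if __name__ == "__main__":
    print(gaussian_mle_aic([1.0, 2.0, 3.0, 4.0, 5.0]))
```

A lower AIC indicates a better trade-off between fit and parameter count, so the same data could be fitted with a Weibull model and the two AIC values compared.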

- Support Vector Machine classifier
- Naive Bayes Classifier
- Random Forest Classifier
- Linear Discriminant Analysis
- Limma linear model for differential gene expression and computation of Residual sum of squares for downstream analysis

Carries out genome-wide association analysis using parallelized code for speed. Given genotype and phenotype data, it fits a generalized linear model per SNP and returns significance measures.
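The per-SNP fitting described above can be sketched as follows. This is a simplified Python illustration, not the package's R code: it assumes a quantitative phenotype and a Gaussian model with identity link (ordinary least squares, one special case of a generalized linear model), genotypes coded as 0/1/2 allele counts, and a normal approximation for the p-value. All function names are hypothetical, and the loop runs serially; because each SNP is fitted independently, the loop is the natural place to parallelize.

```python
import math
from statistics import NormalDist, mean


def snp_association(genotypes, phenotype):
    """Regress the phenotype on one SNP; return (slope, t statistic, p-value)."""
    n = len(phenotype)
    gx, py = mean(genotypes), mean(phenotype)
    sxx = sum((g - gx) ** 2 for g in genotypes)
    if sxx == 0:
        # Monomorphic SNP: no genotype variation, so no slope can be estimated.
        return float("nan"), float("nan"), float("nan")
    sxy = sum((g - gx) * (y - py) for g, y in zip(genotypes, phenotype))
    slope = sxy / sxx
    intercept = py - slope * gx
    rss = sum((y - intercept - slope * g) ** 2
              for g, y in zip(genotypes, phenotype))
    if rss == 0:
        # Perfect fit: the standard error is zero and t is undefined.
        return slope, float("nan"), float("nan")
    se = math.sqrt(rss / (n - 2) / sxx)
    t = slope / se
    # Assumption: normal approximation instead of the exact t distribution.
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return slope, t, p


def association_scan(snp_matrix, phenotype):
    """Fit one model per SNP; snp_matrix maps SNP id -> list of genotypes."""
    return {snp: snp_association(g, phenotype) for snp, g in snp_matrix.items()}


if __name__ == "__main__":
    pheno = [1.0, 2.1, 2.9, 1.2, 1.9, 3.1]          # made-up phenotype values
    snps = {"snp1": [0, 1, 2, 0, 1, 2],             # made-up genotypes
            "snp2": [1, 1, 1, 1, 1, 1]}
    for snp, result in association_scan(snps, pheno).items():
        print(snp, result)
```

For a binary phenotype, a binomial model with a logit link would replace the least-squares fit, but the per-SNP structure of the scan stays the same.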

ssarda/genomeutils documentation built on May 30, 2019, 8:42 a.m.
