snikumbh/seqArchR: Identify Different Architectures of Sequence Elements

seqArchR enables unsupervised, _de novo_ discovery of sequence clusters with characteristic architectures, defined by position-specific motifs or by the composition of stretches of nucleotides, e.g., CG-richness. seqArchR does _not_ require specifying the number of clusters, the length of any individual motif, or the distance between motifs if and when they occur in pairs/groups; it detects these directly from the data. seqArchR uses non-negative matrix factorization (NMF) as its backbone and employs a chunking-based iterative procedure that enables efficient processing of large sequence collections. Wrapper functions are provided for visualizing cluster architectures as sequence logos.
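To make the NMF idea concrete, here is a minimal base-R sketch, assuming a simple one-hot encoding of equal-length DNA sequences and the standard multiplicative-update rules for NMF. It is not seqArchR's implementation and does not use its API; the helper names `one_hot` and `nmf_mu` and the toy sequences are hypothetical and chosen only for illustration. It also leaves out the chunking and iteration that seqArchR applies to large collections.

```r
# Toy illustration only (not seqArchR's code): factorize one-hot encoded
# DNA sequences with plain multiplicative-update NMF in base R.

# Hypothetical helper: encode equal-length sequences as a non-negative matrix
# with one column per sequence and 4 rows (A, C, G, T) per position.
one_hot <- function(seqs) {
  alphabet <- c("A", "C", "G", "T")
  sapply(seqs, function(s) {
    chars <- strsplit(s, "")[[1]]
    as.numeric(vapply(chars, function(ch) alphabet == ch, logical(4)))
  }, USE.NAMES = FALSE)
}

# Hypothetical helper: approximate V by W %*% H with non-negative factors.
# The updates never change the sign of an entry, so W and H stay non-negative.
nmf_mu <- function(V, k, n_iter = 200, eps = 1e-9) {
  W <- matrix(runif(nrow(V) * k), nrow(V), k)
  H <- matrix(runif(k * ncol(V)), k, ncol(V))
  for (i in seq_len(n_iter)) {
    H <- H * (t(W) %*% V) / (t(W) %*% W %*% H + eps)
    W <- W * (V %*% t(H)) / (W %*% H %*% t(H) + eps)
  }
  list(W = W, H = H)
}

set.seed(1)
seqs <- c("ACGTAC", "ACGTAA", "TTGCAT", "TTGCAA")  # made-up toy sequences
V <- one_hot(seqs)        # 24 x 4 matrix (6 positions x 4 nucleotides)
fit <- nmf_mu(V, k = 2)   # k is fixed by hand here, purely for the sketch

# Columns of W are position-specific basis vectors; each sequence is then
# assigned to the factor with the largest coefficient in H.
clusters <- apply(fit$H, 2, which.max)
```

In this sketch the number of factors `k` is supplied by hand, whereas seqArchR is described above as determining the number of clusters from the data.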

Package details

Bioconductor views: Clustering, DNASeq, DimensionReduction, FeatureExtraction, GeneRegulation, Genetics, MathematicalBiology, MotifDiscovery, SystemsBiology, Transcriptomics
License: GPL-3 | file LICENSE
Version: 1.1.3
URL: https://snikumbh.github.io/seqArchR/ https://github.com/snikumbh/seqArchR
Package repository: GitHub
Installation

Install the latest version of this package by entering the following in R:
install.packages("remotes")
remotes::install_github("snikumbh/seqArchR")
snikumbh/seqArchR documentation built on March 11, 2024, 7:06 p.m.