msc.uc: Cluster Analyses

Description

The function msc.uc reads the output of the clustering analyses (UC files), one file per specified minimum percent identity (MPI), into a single list. The list is then analyzed automatically to calculate and visualize, per MPI, the number of minicircle sequence classes (MSCs), the proportion of perfect alignments (i.e. alignments without any insertion or deletion, but allowing point mutations) and the number of alignment gaps. Gaps are characterized by i) the number of insertions/deletions and ii) the length in base pairs of each individual insertion/deletion.

The function also issues a warning when large gaps (>500 bp) are found, which points the user to anomalous alignments caused by, for example, artificial dimers introduced by the assembly process. Together, these results allow the user to make an informed decision about the MPI (or MPIs) that best captures minicircle sequence richness within a sample or group of samples while minimizing the number and length of alignment gaps.
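
As a rough illustration of the gap definitions above, the sketch below parses compressed alignment strings of the kind stored in the eighth column of a UC file (CIGAR-like: M for an aligned column, I for an insertion, D for a deletion, a run length of 1 omitted, and "=" for a hit identical to its centroid). The helper gap_lengths and the example strings are hypothetical, and this is not necessarily how msc.uc processes the files internally.

```r
# Hypothetical helper: extract insertion/deletion run lengths from one
# compressed alignment string such as "120M3I400MD50M".
gap_lengths <- function(aln) {
  if (aln == "=") {
    # Identical to the centroid: no gaps at all.
    return(list(insertions = integer(0), deletions = integer(0)))
  }
  # Split into runs: an optional count followed by one operation letter.
  runs <- regmatches(aln, gregexpr("[0-9]*[MID]", aln))[[1]]
  ops <- substring(runs, nchar(runs))
  counts <- substring(runs, 1, nchar(runs) - 1)
  counts[counts == ""] <- "1"  # a missing count means a run of length 1
  lens <- as.integer(counts)
  list(insertions = lens[ops == "I"], deletions = lens[ops == "D"])
}

# Made-up alignment strings, for illustration only.
alns <- c("=", "850M", "120M3I400MD50M", "200M600D200M")
gaps <- lapply(alns, gap_lengths)

# Perfect alignment: no insertion or deletion (mismatches still count as M).
perfect <- vapply(
  gaps,
  function(g) length(g$insertions) + length(g$deletions) == 0,
  logical(1)
)
prop_perfect <- mean(perfect)

# Flag alignments containing a gap longer than 500 bp.
large <- vapply(
  gaps,
  function(g) any(c(g$insertions, g$deletions) > 500),
  logical(1)
)
if (any(large)) {
  warning("large gaps (>500 bp) found in ", sum(large), " alignment(s)")
}
```

Keeping insertions and deletions as separate length vectors mirrors the two ways gaps are described above: the number of gaps is the vector length, and the size of each gap is its value.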

Usage

msc.uc(files)

Arguments

files

a character vector of UC file names (produced by USEARCH or VSEARCH), such as all.minicircles.circ.id70.uc, all.minicircles.circ.id80.uc, and so on. File names must end with 'idxx.uc', where xx is the MPI used for that clustering run (as in id70.uc and id80.uc); otherwise the function will not work properly.
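
A minimal sketch of how the naming requirement could be checked before calling msc.uc. The file names are the examples given above; mpi_from_name is a hypothetical helper, and the assumption that the digits after "id" encode the MPI is inferred from those examples rather than taken from the package source.

```r
files <- c("all.minicircles.circ.id70.uc", "all.minicircles.circ.id80.uc")

# Hypothetical helper: validate the 'id<digits>.uc' suffix and return the
# digits as integers, one per file name.
mpi_from_name <- function(f) {
  ok <- grepl("id[0-9]+\\.uc$", f)
  if (!all(ok)) {
    stop("file names must end with 'idxx.uc': ",
         paste(f[!ok], collapse = ", "))
  }
  as.integer(sub(".*id([0-9]+)\\.uc$", "\\1", f))
}

mpis <- mpi_from_name(files)
```

Failing early on a malformed name gives a clearer message than an error raised later, during parsing or plotting.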

Value

MSCs

a numeric vector containing the number of MSCs per MPI.

perfect alignments

a numeric vector containing the proportion of perfect alignments per MPI.

insertions

a list showing the insertion lengths per MPI. Each element in the list corresponds to a specific MPI, and it provides the lengths of identified insertions.

deletions

a list showing the deletion lengths per MPI. Each element in the list corresponds to a specific MPI, and it provides the lengths of identified deletions.

insertions summary

a table showing the length and the number of insertions across different MPIs.

deletions summary

a table showing the length and the number of deletions across different MPIs.

plots

plots visualizing the results listed above.
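
To make the relationship between the per-MPI length lists and the summary tables concrete, the sketch below tabulates a toy list of insertion lengths by length and MPI. The input values are made up, and the exact layout of the tables returned by msc.uc may differ; this only illustrates the idea of counting gaps of each length across MPIs.

```r
# Toy input with made-up values: one vector of insertion lengths per MPI.
insertions <- list("95" = c(1L, 1L, 2L), "97" = c(1L, 3L), "100" = integer(0))

# Flatten to long format: one row per observed insertion.
long <- data.frame(
  MPI = rep(names(insertions), lengths(insertions)),
  length = unlist(insertions, use.names = FALSE)
)
# Keep every MPI as a factor level so that MPIs without insertions
# still appear as a column of zeros.
long$MPI <- factor(long$MPI, levels = names(insertions))

# Rows are insertion lengths (bp), columns are MPIs, cells are counts.
ins_summary <- table(length = long$length, MPI = long$MPI)
```

The same steps apply unchanged to a list of deletion lengths.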

Examples

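library(rKOMICS)
data(exData)

### run function
# exData$ucs holds the names of the example UC files in the package's extdata directory
ucs <- msc.uc(files = system.file("extdata", exData$ucs, package = "rKOMICS"))

### results
# number of MSCs at a minimum percent identity of 100 and 97
ucs$MSCs["100"]
ucs$MSCs["97"]

# plots summarizing the results
ucs$plots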



rKOMICS documentation built on July 9, 2023, 7:46 p.m.