msc.length: Length of minicircles

msc.length    R Documentation

Length of minicircles

Description

The msc.length function reads the minicircle sequences stored in a single FASTA file and reports the length of each sequence, together with a histogram of the resulting size distribution.
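
As a rough illustration of what such a length check involves, the sketch below computes per-sequence lengths from FASTA-formatted lines in base R. It is not the rKOMICS implementation: the helper name fasta_lengths and the toy records are hypothetical, and it assumes every record has at least one sequence line.

```r
# Hypothetical helper (not the rKOMICS code): sequence lengths from FASTA lines.
# Assumes each header line is followed by at least one sequence line.
fasta_lengths <- function(lines) {
  lines <- lines[nzchar(lines)]          # drop empty lines
  is_header <- startsWith(lines, ">")    # FASTA headers start with ">"
  record <- cumsum(is_header)            # record index for every line
  # add up the characters of the sequence lines belonging to each record
  lens <- tapply(nchar(lines[!is_header]), record[!is_header], sum)
  setNames(as.numeric(lens), sub("^>", "", lines[is_header]))
}

# Toy input; for a real file the lines would come from readLines(file)
toy <- c(">contig1", "ACGTACGT", "ACG", ">contig2", "TTTT")
lens <- fasta_lengths(toy)
print(lens)
```

Sequences that wrap over several lines are handled by summing the line lengths per record.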

Usage

msc.length(file, samples, groups)

Arguments

file

the name of the FASTA file that contains all minicircle sequences, for example "all.minicircles.circ.fasta".

samples

a character vector containing the sample names.

groups

a vector of the same length as the samples, specifying the groups (e.g., subspecies) to which the samples belong.

Value

length

a numeric vector containing the lengths of the minicircle sequences, one element per sequence.

plot

a histogram showing the frequency distribution of minicircle sequence lengths.
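
Because the length element is a plain numeric vector, it can be summarised with base R before looking at the plot. The sketch below uses made-up lengths standing in for the returned vector; the 800 and 1400 cut-offs mirror the ones used in the Examples, and the bin edges are an arbitrary choice for illustration.

```r
# Made-up lengths standing in for the `length` element of the result
len <- c(650, 870, 872, 869, 871, 1740, 868)

# Count sequences outside an illustrative 800-1400 size window
n_short <- sum(len < 800)
n_long  <- sum(len > 1400)

# Bin the lengths the way a histogram would, without drawing it
h <- hist(len, breaks = seq(600, 1800, by = 200), plot = FALSE)
print(h$counts)
```

Counting with sum() on a logical vector gives the same result as length(which(...)) when the vector has no missing values.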

Examples

library(rKOMICS)  # provides msc.length and the exData example data
library(ggplot2)  # labs()
library(ggpubr)   # ggarrange()

### run function
bf <- msc.length(file = system.file("extdata", "all.minicircles.fasta", package="rKOMICS"),
                 samples = exData$samples, groups = exData$subspecies)
af <- msc.length(file = system.file("extdata", "all.minicircles.circ.fasta", package="rKOMICS"),
                 samples = exData$samples, groups = exData$subspecies)

### count sequences shorter than 800 or longer than 1400 before filtering
sum(bf$length < 800)
sum(bf$length > 1400)

### visualize results
hist(af$length, breaks=50)

### add captions and stack both histograms in one figure
ggarrange(bf$plot + labs(caption = "Before filtering"), 
          af$plot + labs(caption = "After filtering"), nrow=2)



rKOMICS documentation built on July 9, 2023, 7:46 p.m.