Assembling minicircle sequences with KOMICS generates individual fasta files (one per sample). The preprocess function filters these minicircle sequences by sequence length (the size of minicircular kDNA is species-specific and variable) and by circularization success. Unless writeDNA is set to FALSE, the function writes one filtered fasta file per sample to the current working directory.
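The filtering logic described above can be sketched in base R. This is only an illustration, not the package's implementation: the helper `filter_minicircles` is hypothetical, the length bounds are assumed to be inclusive, and circularized contigs are assumed to be marked with the word "circularized" in their fasta header.

```r
# Hypothetical helper, not part of rKOMICS: filter a named character vector
# of sequences by length and, optionally, by circularization status.
filter_minicircles <- function(seqs, circ = TRUE, min = 500, max = 1500) {
  len <- nchar(seqs)
  # Assumption: both bounds are inclusive
  keep <- len >= min & len <= max
  if (circ) {
    # Assumption: circularized contigs carry "circularized" in the header
    keep <- keep & grepl("circularized", names(seqs), fixed = TRUE)
  }
  seqs[keep]
}

# Toy sequences (runs of "A"), not real minicircle data
seqs <- c(
  contig1_circularized = strrep("A", 900),
  contig2              = strrep("A", 900),
  contig3_circularized = strrep("A", 300),
  contig4_circularized = strrep("A", 1600)
)

kept <- filter_minicircles(seqs)
names(kept)
```

Calling the helper with `circ = FALSE` would additionally keep `contig2`, which passes the length filter but is not marked as circularized.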
files |
a character vector containing the fasta file names in the format sampleA.minicircles.fasta, sampleB.minicircles.fasta,... (output of KOMICS). |
groups |
a factor specifying the group (e.g. species) to which each sample belongs. It should have the same length as the vector of files. |
circ |
a logical parameter. By default, non-circularized minicircle sequences are excluded. To keep non-circularized sequences as well, set the parameter to FALSE. |
min |
the minimum minicircle sequence length. The default value is 500. |
max |
the maximum minicircle sequence length. The default value is 1500. |
writeDNA |
a logical parameter. By default, filtered minicircle sequences are written in fasta format to the current working directory. Set to FALSE if you are only interested in the other output values, such as the plot and summary. |
samples |
the sample names (based on the input files). |
N_MC |
a table containing the sample name, the group it belongs to, and the number of minicircle sequences (N_MC) before and after filtering. |
plot |
a barplot visualizing the number of minicircle sequences per sample before and after filtering. |
summary |
the total number of minicircle sequences before and after filtering. |
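To make the relationship between N_MC and summary concrete, the sketch below builds a mock table with the same kind of columns and derives totals from it. The values are invented, and the column name `afterfiltering` is an assumption (only `beforefiltering` appears in the examples); the real function may compute its summary differently.

```r
# Mock table mimicking the shape of N_MC; all values are invented
N_MC <- data.frame(
  sample          = c("sampleA", "sampleB", "sampleC"),
  group           = factor(c("sp1", "sp1", "sp2")),
  beforefiltering = c(120L, 95L, 140L),
  afterfiltering  = c(80L, 60L, 101L)  # assumed column name
)

# Total number of minicircle sequences before and after filtering
totals <- colSums(N_MC[, c("beforefiltering", "afterfiltering")])

# The same totals broken down per group
by_group <- aggregate(cbind(beforefiltering, afterfiltering) ~ group,
                      data = N_MC, FUN = sum)

totals
by_group
```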
require(ggplot2)
data(exData)
### setwd("")
### run function
table(exData$species)
pre <- preprocess(files = system.file("extdata", exData$fastafiles, package="rKOMICS"),
groups = exData$species,
circ = TRUE, min = 500, max = 1200, writeDNA = FALSE)
pre$summary
### visualize results
barplot(pre$N_MC[,"beforefiltering"],
names.arg = pre$N_MC[,1], las=2, cex.names=0.4)
### alter plot
pre$plot + labs(caption = paste0('N of MC sequences before and after filtering, ', Sys.Date()))