| ovlp.res | R Documentation |
This function resolves overlapping repeats assigned to the same transcript and returns a data frame of repeats with no overlaps. The user can choose the criterion used to resolve the overlaps: higher score (HS), longer length (LS) or lower Kimura distance (LD).
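As a rough illustration of the three criteria, the base R sketch below picks which of two overlapping repeats would keep the overlapping bases. The data frame, its column names and the helper `pick_winner()` are hypothetical and made up for this example; they are not the function's internals or its real input format.

```r
# Toy data: two overlapping repeats on the same transcript (made-up values).
reps <- data.frame(
  transcript = c("tr1", "tr1"),
  repeat_id  = c("L1_A", "Alu_B"),
  start      = c(100, 150),
  end        = c(300, 260),
  score      = c(450, 900),
  divergence = c(12.5, 20.1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: row index of the element that keeps the overlapping bases.
pick_winner <- function(df, over.res = c("HS", "LS", "LD")) {
  over.res <- match.arg(over.res)
  key <- switch(over.res,
    HS = -df$score,                  # higher score wins
    LS = -(df$end - df$start + 1),   # longer element wins
    LD = df$divergence               # lower divergence wins
  )
  which.min(key)  # on ties, which.min() returns the first element
}

winners <- sapply(c("HS", "LS", "LD"),
                  function(m) reps$repeat_id[pick_winner(reps, m)])
winners
# HS -> "Alu_B", LS -> "L1_A", LD -> "L1_A"
```

Using `which.min()` means a tie goes to the first element, which mirrors the tie rule described for `over.res`.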
ovlp.res(
RepMask,
anot,
gff3,
stranded = T,
outdir,
rm.cotrans = F,
trpt.length = NULL,
align,
threads = 1,
ignore.aln.pos = T,
over.res = c("HS", "LS", "LD"),
by = "classRep",
...
)
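A possible call, given only as a sketch: the file paths and transcript lengths are placeholders, and it is assumed here that `RepMask` and `align` accept file paths and that `anot` and `gff3` can be omitted when `rm.cotrans = FALSE`. The call is guarded so that it only runs when `ovlp.res` is available and the input files exist.

```r
# Hypothetical input paths; replace with your own files.
rm_out   <- "path/to/transcripts.fa.out"    # RepeatMasker output
rm_align <- "path/to/transcripts.fa.align"  # RepeatMasker alignments

# Optional two-column table: transcript name, then transcript length
# (made-up values for illustration).
trpt_len <- data.frame(
  transcript = c("tr1", "tr2"),
  length     = c(1520L, 873L),
  stringsAsFactors = FALSE
)

# Only run when the function is loaded and both inputs exist.
if (exists("ovlp.res", mode = "function") &&
    all(file.exists(rm_out, rm_align))) {
  resolved <- ovlp.res(
    RepMask        = rm_out,
    align          = rm_align,
    outdir         = tempdir(),
    rm.cotrans     = FALSE,     # input assumed to lack co-transcribed repeats
    trpt.length    = trpt_len,
    threads        = 1,
    ignore.aln.pos = TRUE,
    over.res       = "LD"       # resolve overlaps by lower divergence
  )
}
```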
RepMask |
RepeatMasker output file. If rm.cotrans = F, then you must enter a RepeatMasker output file without co-transcribed repeats. |
anot |
Annotation file in outfmt6 format. Required when rm.cotrans = T |
gff3 |
gff3 file. Required when rm.cotrans = T |
stranded |
logical vector indicating whether the library is strand-specific |
outdir |
Output directory |
trpt.length |
A data.frame with two columns: the first column must contain the transcript names, and the second the length of each transcript. With the default, trpt.length = NULL, the transcript lengths are taken from the RepeatMasker file. |
align |
RepeatMasker alignments file (.align) |
threads |
Number of cores to use in the processing. By default threads = 1 |
ignore.aln.pos |
The RepeatMasker alignments file may show discrepancies in repeat positions with respect to the output file. If you select over.res = "LD", you can choose whether to take into account the positions in the alignments file or to take the average per repeat class (default). |
over.res |
Indicates the method used to resolve repeat overlaps ("HS" by default). HS: higher score, bases are assigned to the element with the highest score; LS: longer element, bases are assigned to the longest element; LD: lower divergence, bases are assigned to the element with the least divergence. In all cases, if both elements have the same characteristics, the bases are assigned to the first element. |
rm.cotrans |
logical vector indicating whether co-transcribed repeats should be removed |
cleanTEsProt |
logical vector indicating whether a search for TE-related proteins should be carried out (e.g. transposases, integrases, env, reverse transcriptase, etc.). We recommend using a curated annotations file from which these genes have been excluded; therefore the default is F. When T is selected, a search is performed against a database obtained from UniProt, so we recommend that the subject sequence ids in the annotations file follow this format (e.g. "CO1A2_MOUSE"/"sp|Q01149|CO1A2_MOUSE"/"tr|H9GLU4|H9GLU4_ANOCA") |
featureSum |
Returns statistics related to the characteristics of the transcripts. Requires a gff3 file. If TRUE, returns a list of the |