repbase.filter: Filter the Repbase query output

View source: R/repbase.filter.R

repbase.filter R Documentation

Filter the Repbase query output

Description

Filter the output of the repbase.query function: quantify the number of hits for each query LTR transposon (duplicates) and retain only those hits whose alignment spans at least a given fraction (scope) of the annotated sequence in Repbase.
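The two steps described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the toy data.frame, the column names query_id, scope and n_hits, and the helper filter_hits_sketch are all assumptions made for illustration, and the real output of repbase.query may be laid out differently.

```r
# Toy stand-in for a repbase.query result. Column names and values are
# illustrative assumptions, not the actual LTRpred output format.
hits <- data.frame(
  query_id   = c("ltr_1", "ltr_1", "ltr_2", "ltr_3"),
  subject_id = c("TE_A", "TE_B", "TE_A", "TE_C"),
  scope      = c(0.95, 0.40, 0.75, 0.65),
  stringsAsFactors = FALSE
)

# Hypothetical helper that mimics the two steps from the description.
filter_hits_sketch <- function(hits, scope.value = 0.7) {
  stopifnot(scope.value >= 0, scope.value <= 1)
  # Step 1: count the hits per query and attach the count to each row.
  hits$n_hits <- as.integer(table(hits$query_id)[hits$query_id])
  # Step 2: keep only hits whose scope reaches the threshold.
  hits[hits$scope >= scope.value, , drop = FALSE]
}

filtered <- filter_hits_sketch(hits, scope.value = 0.7)
print(filtered)
```

In this sketch the hit count is computed before filtering, so a retained row still reports how many hits its query had in total. Whether repbase.filter counts before or after filtering is not stated here.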

Usage

repbase.filter(query.output, scope.value = 0.7, verbose = TRUE)

Arguments

query.output

a data.frame returned by the repbase.query function.

scope.value

a value in the interval [0, 1] quantifying the minimum sequence similarity (as a fraction) between the LTR transposon and the corresponding annotated sequence found in Repbase.

verbose

a logical value indicating whether or not additional information shall be printed to the console while executing this function.

Value

A data.frame storing the filtered output returned by repbase.query.

Author(s)

Hajk-Georg Drost

See Also

repbase.query, repbase.clean

Examples

## Not run: 
# Preprocess Repbase: A. thaliana
# and save the output to the file "Athaliana_repbase.ref"
repbase.clean(repbase.file = "athrep.ref",
              output.file  = "Athaliana_repbase.ref")
             
# perform a blastn search against the A. thaliana Repbase annotation
AthalianaRepBaseAnnotation <- repbase.query(ltr.seqs     = "TAIR10_chr_all-ltrdigest_complete.fas", 
                                            repbase.path = "Athaliana_repbase.ref", 
                                            cores        = 1)
# filter the annotation query output, keeping hits with a scope of at least 0.9
AthalianaAnnot.HighMatches <- repbase.filter(AthalianaRepBaseAnnotation, 
                                             scope.value = 0.9)
# count the matched Repbase entries per family (fields 2 and 3 of the subject ID)
Ath.TE.Matches.Families <- sort(
  table(unlist(lapply(
    stringr::str_split(names(table(AthalianaAnnot.HighMatches$subject_id)), "_"),
    function(x) paste0(x[2:3], collapse = ".")
  ))),
  decreasing = TRUE
)
 
# visualize the hits found to have a scope of at least 90%
barplot(Ath.TE.Matches.Families,
        las       = 3, 
        cex.names = 0.8,
        col       = bcolor(length(Ath.TE.Matches.Families)), 
        main      = "RepBase Annotation: A. thaliana")


## End(Not run)
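The family tabulation in the example above is densely nested, so here is a self-contained base R sketch of the same idea that runs without Repbase data. The subject IDs are hypothetical; the real ID format depends on the Repbase file and on repbase.clean. The sketch only assumes, as the example does, that the second and third "_"-separated fields identify the family.

```r
# Hypothetical subject IDs. The real format depends on the Repbase file.
subject_ids <- c("ATCOPIA1_LTR_Copia", "ATCOPIA2_LTR_Copia",
                 "ATGP1_LTR_Gypsy", "ATCOPIA1_LTR_Copia")

# Work on the distinct IDs, as names(table(...)) does in the example above.
parts <- strsplit(unique(subject_ids), "_", fixed = TRUE)

# Keep fields 2 and 3 of each ID and join them with ".".
families <- vapply(parts,
                   function(x) paste0(x[2:3], collapse = "."),
                   character(1))

# Count the distinct matched entries per family, most frequent first.
family_counts <- sort(table(families), decreasing = TRUE)
print(family_counts)
```

Because the distinct IDs are used, each Repbase entry contributes once per family no matter how many LTR transposons matched it.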

HajkD/LTRpred documentation built on April 22, 2022, 4:35 p.m.