mk.reference: Makes a reference file for Salmon
In FemeniasM/ExplorATEproject: Explore Active Transposable Elements from RNAseq data

mk.reference

R Documentation

Makes a reference file for Salmon

Description

This function creates the decoys and the transcriptome that Salmon will use. It also creates a reference file for importing the estimates after the Salmon run. The user can supply a RepeatMasker file that still contains co-transcribed and overlapping repeats with the RepMask argument, a file without co-transcribed repeats but with overlapping repeats with the RepMask.clean argument, or a file free of both co-transcribed and overlapping repeats with the RepMask.ovlp.clean argument. When the file contains co-transcribed repeats, set rm.cotrans = T; when it contains overlapping repeats, set overlapping = T.

Usage

mk.reference(
  RepMask,
  overlapping = T,
  by = "classRep",
  trme,
  threads = 1,
  annot_by = "transcripts",
  rule = c(80, 80, 80),
  best.by = "total_repeat_length",
  outdir,
  over.res = "HS",
  trpt.length = NULL,
  ...
)
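
A minimal sketch of a call, using only arguments listed on this page. All file paths are hypothetical placeholders, and the package name used for loading ("ExplorATE") is an assumption; the call is guarded so that it runs only when the package and the input files are actually present.

```r
# Hypothetical input paths; replace them with your own files.
args <- list(
  RepMask     = "path/to/RepeatMasker.out",  # RepeatMasker output file
  overlapping = TRUE,                        # the file still contains overlapping repeats
  by          = "classRep",
  trme        = "path/to/transcriptome.fa",  # transcriptome in FASTA format
  threads     = 1,
  annot_by    = "transcripts",
  rule        = c(80, 60, 100),
  best.by     = "total_repeat_length",
  outdir      = "path/to/outdir",
  over.res    = "HS"                         # resolve overlaps by higher score
)

# Assumption: the package built from FemeniasM/ExplorATEproject loads as "ExplorATE".
# The call is skipped when the package or the input files are missing.
if (requireNamespace("ExplorATE", quietly = TRUE) &&
    file.exists(args$RepMask) && file.exists(args$trme)) {
  ref <- do.call(ExplorATE::mk.reference, args)
}
```

Building the argument list first keeps the settings in one place, so they can be inspected or reused before the potentially long-running call is made.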

Arguments

`RepMask`	RepeatMasker output file. If rm.cotrans = F, the file is assumed not to contain co-transcribed repeats. If overlapping = F, the file is assumed not to contain overlapping repeats.
`overlapping`	Indicates whether the RepMask file contains overlapping repeats (TRUE) or not (FALSE). When it does, the ovlp.res() function is used to resolve them, and the resolution criterion must be given with over.res: higher score (HS), longer element (LS), or lower Kimura distance (LD).
`by`	The column by which the repeats will be classified: 'classRep' (default) or 'namRep'.
`trme`	Transcriptome in FASTA format
`threads`	Number of cores to use for processing. Default is threads = 1
`annot_by`	A character vector indicating whether the annotations should be made by "transcripts" or by "fragments". When annot_by = "transcripts", the proportion of each transposon class/family in each transcript is calculated and the transcript is annotated with the class/family with the highest coverage.
`rule`	A numeric vector giving, in order, the minimum percentage of identity, the minimum percentage of the transcript length covered by the class/family repeat, and the minimum length (in base pairs) of the repeat to be analyzed. Example: c(80, 60, 100) indicates that repeats with 80% identity or more, covering at least 60% of the transcript, and at least 100 bp in length will be annotated as target TEs. Default is c(80, 80, 80).
`best.by`	Defines whether only the best match for each transcript/sequence id is returned. The user can choose the longest repeat length ('total_repeat_length') or the highest percent identity ('per_divergence'). With best.by = NULL, all matches for the sequence are shown. The mk.reference() function uses the best.by argument when references are annotated by transcripts (annot_by = 'transcripts'); its default in mk.reference() is 'total_repeat_length', as shown under Usage.
`outdir`	Output directory
`over.res`	Indicates the method used to resolve repeat overlaps. HS: higher score, bases are assigned to the element with the highest score. LS: longer element, bases are assigned to the longest element. LD: lower divergence, bases are assigned to the element with the least divergence. In all cases, if both elements have the same characteristics, the bases are assigned to the first element.
`trpt.length`	A data.frame with two columns: the first column must contain the name of the transcripts, and the second the length corresponding to each transcript. The default is trpt.length=NULL, and the lengths for each transcript are taken from the RepeatMasker file.
`rm.cotrans`	Logical vector indicating whether co-transcribed repeats should be removed
`align`	.align file
`anot`	Annotation file in outfmt6 format. Required when rm.cotrans = T
`gff3`	gff3 file. Required when rm.cotrans = T
`stranded`	logical vector indicating if the library is strand specific
`cleanTEsProt`	Logical vector indicating whether the search for TE-related proteins should be carried out (e.g. transposases, integrases, env, reverse transcriptase, etc.). We recommend that users use a curated annotations file in which these genes have been excluded; therefore the default option is F. When T is selected, a search is performed against a database obtained from UniProt, so we recommend that the annotations file have this format for the subject sequence id (e.g. "CO1A2_MOUSE"/"sp|Q01149|CO1A2_MOUSE"/"tr|H9GLU4|H9GLU4_ANOCA")
`featureSum`	Returns statistics related to the characteristics of the transcripts. Requires a gff3 file. If TRUE, returns a list of the
`ignore.aln.pos`	The RepeatMasker alignments file may show discrepancies in repeat positions with respect to the output file. If you selected over.res = "LD", you can choose whether to take into account the positions in the alignment file or to take the average per repeat class (default).
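
The thresholds in the rule argument can be illustrated with a small sketch. This is not the package's internal code: the toy table and every column name in it are hypothetical, and the filter simply applies the three thresholds as they are described for rule = c(80, 60, 100).

```r
# Toy per-transcript repeat summary. All column names are hypothetical and
# do not reflect the package's internal tables.
repeats <- data.frame(
  transcript    = c("tx1", "tx2", "tx3", "tx4"),
  classRep      = c("LINE/L1", "LTR/ERVK", "SINE/Alu", "DNA/hAT"),
  perc_identity = c(92, 85, 75, 88),         # percent identity of the repeat
  repeat_length = c(700, 90, 400, 300),      # bp covered by the repeat class
  trpt_length   = c(1000, 120, 500, 900),    # transcript length in bp
  stringsAsFactors = FALSE
)

# rule = c(min % identity, min % of transcript covered, min repeat length in bp)
rule <- c(80, 60, 100)

# Percentage of each transcript covered by the repeat class
perc_of_transcript <- 100 * repeats$repeat_length / repeats$trpt_length

# A repeat is kept only if it meets all three thresholds
passes <- repeats$perc_identity >= rule[1] &
  perc_of_transcript >= rule[2] &
  repeats$repeat_length >= rule[3]

target_TEs <- repeats[passes, ]
print(target_TEs$transcript)
```

In this toy table only tx1 meets all three thresholds: tx2 is shorter than 100 bp, tx3 is below 80% identity, and the repeat in tx4 covers about 33% of the transcript.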

FemeniasM/ExplorATEproject documentation built on Nov. 30, 2022, 5:26 p.m.
