readUc: Convert .uc Files to Dataframe
In jackgisby/packFinder: de novo Annotation of Pack-TYPE Transposable Elements

readUc

R Documentation

Convert .uc Files to Dataframe

Description

Reads .uc files (USEARCH Cluster Format) generated by the VSEARCH clustering and alignment algorithms.

Usage

readUc(file, output = "cluster")

Arguments

file

The file path of the .uc file.

output

The type of analysis that was carried out to produce the .uc file.

Use "cluster" (the default) if the .uc file was produced by VSEARCH clustering.
Use "alignment" if the .uc file was produced by VSEARCH pairwise global alignment.

Note that clustering produces one "H" record for each sequence, and one "C" record for each cluster, while an alignment produces an "H" record for each alignment (see details).

Details

USEARCH cluster format is a tab-separated text file that contains clustering and/or alignment information for a set of sequences. Each record begins with a record type, "H", "C" or "N", which describes the kind of "hit" that the row of the dataframe represents. These refer to:

H - Hit - for alignments, indicates an identified alignment of two supplied sequences. For clustering, indicates the cluster assignment for a query.
C - Cluster record - a record for each cluster generated.
N - No hit - indicates that no cluster was assigned or no alignment was found with a target sequence. For clustering, a query with no hits becomes the centroid of a new cluster.
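As a rough illustration of the tab-separated layout, the sketch below reads two made-up records with base R and keeps only the "H" records. It assumes the ten-column layout described in the USEARCH documentation (two columns of which are unused); the sequence IDs, values and column names are invented for the example and are not the names that readUc returns.

```r
# Two made-up records in a ten-column, tab-separated .uc layout (assumed).
ucLines <- c(
    "H\t0\t120\t98.3\t+\t0\t0\t120M\tseqB\tseqA",
    "C\t0\t2\t*\t*\t*\t*\t*\tseqA\t*"
)

# Column names are illustrative only, not those produced by readUc.
uc <- read.table(
    text = ucLines,
    sep = "\t",
    header = FALSE,
    stringsAsFactors = FALSE,
    col.names = c(
        "type", "cluster", "size", "identity", "strand",
        "unused1", "unused2", "alignment", "query", "target"
    )
)

# Keep only the hit records and list their query IDs.
hits <- uc[uc$type == "H", ]
hits$query
```

For real files, readUc should be preferred, since it selects and names the relevant fields according to the output argument.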

Additionally, each record carries a "compressed alignment". This is the alignment written in a compact, run-length encoded form using the letters "M", "D" and "I". Each letter is preceded by the number of consecutive alignment columns of that type. The letter types are as follows:

"M" - Match - A column in which a query base is aligned to a target base (in VSEARCH output this covers mismatched as well as identical bases)
"D" - Deletion - A gap in the target sequence
"I" - Insertion - A gap in the query sequence

An example of this would be "13M", referring to 13 consecutive matches between the query and target sequence.
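The encoding can be unpacked with a regular expression. The following is a minimal sketch in base R; parseCompressedAlignment is a hypothetical helper, not a packFinder function, and it assumes that a letter with no preceding number stands for a single column.

```r
# Hypothetical helper (not part of packFinder): split a compressed
# alignment such as "13M2D5M" into run lengths and letter types.
parseCompressedAlignment <- function(aln) {
    # Each token is an optional count followed by one of M, D or I.
    tokens <- regmatches(aln, gregexpr("[0-9]*[MDI]", aln))[[1]]
    counts <- sub("[MDI]$", "", tokens)
    # Assumption: a missing count stands for a single column.
    counts[counts == ""] <- "1"
    data.frame(
        count = as.integer(counts),
        type = sub("^[0-9]*", "", tokens),
        stringsAsFactors = FALSE
    )
}

runs <- parseCompressedAlignment("13M2D5M")
print(runs)

# Total number of alignment columns (13 + 2 + 5).
sum(runs$count)
```

Strings that contain none of the three letters yield a dataframe with zero rows.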

Value

A dataframe containing the converted .uc file. The fields contained within are as follows:

Record type - "H, C or N", see details for further information.
Cluster designation (output = "cluster" only)
Sequence length, or cluster size
Percent identity to target
The nucleotide strand (output = "cluster" only)
A compressed alignment - see details for further information.
ID of query sequence
ID of target sequence ("H" records only)

Author(s)

Jack Gisby

References

VSEARCH may be downloaded from https://github.com/torognes/vsearch. See https://www.ncbi.nlm.nih.gov/pubmed/27781170 for further information.

Examples

# Read the example clustering output shipped with packFinder.
readUc(system.file(
    "extdata",
    "packMatches.uc",
    package = "packFinder"
))

jackgisby/packFinder documentation built on July 19, 2022, 2:25 a.m.
