import_usearch_uc: Import usearch table format ('.uc') to OTU table


View source: R/IO-methods.R

Description

UPARSE is an algorithm for OTU clustering implemented within usearch. At last check, the UPARSE algorithm was accessed via the -cluster_otus option flag. For details about installing and running usearch, please refer to the usearch website. For details about the output format, please refer to the uc format definition. This importer reads one particular table format generated by usearch, its so-called “cluster format”, which is often given the .uc extension in usearch documentation.

Usage

import_usearch_uc(ucfile, colRead = 9, colOTU = 10,
  readDelimiter = "_", verbose = TRUE)

Arguments

ucfile

(Required). A file location character string or connection corresponding to the file that contains the usearch output table. This is passed directly to read.table. Please see its file argument documentation for further links and details.

colRead

(Optional). Numeric. The column index in the uc-table file that holds the read IDs. The default column index is 9.

colOTU

(Optional). Numeric. The column index in the uc-table file that holds OTU IDs. The default column index is 10.

readDelimiter

(Optional). An R regex as a character string. This should be the delimiter that separates the sample ID from the original ID in the demultiplexed read ID of your sequence file. The default is plain underscore, which in this regex context is "_".

verbose

(Optional). A logical. Default is TRUE. Should progress messages be printed to standard output?
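To make the role of readDelimiter concrete, the following base-R sketch shows one way a regex delimiter can separate the sample ID from the rest of a demultiplexed read label. The read labels are hypothetical, and the use of sub() here is an illustrative assumption, not necessarily how import_usearch_uc() handles the labels internally.

```r
# Hypothetical demultiplexed read labels: <sampleID><delimiter><readID>
reads <- c("D1.393095_101", "D1.393095_102", "D2.393107_7")
readDelimiter <- "_"

# Drop the first delimiter and everything after it to recover the sample ID.
samples <- sub(paste0(readDelimiter, ".*$"), "", reads)
print(samples)

# Because the delimiter is a regex, metacharacters must be escaped.
# Here a literal pipe is written as "\\|" (hypothetical labels again).
pipe_reads <- c("S1|101", "S2|7")
pipe_samples <- sub("\\|.*$", "", pipe_reads)
print(pipe_samples)
```

Note that an unescaped "." or "|" would be interpreted as a regex operator rather than as a literal character, so the sample IDs would not be recovered as intended.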

Details

Because usearch is an external (non-R) application, there is no direct way to continuously check that these suggested arguments and file formats will remain in their current state. If there is a problem, please verify your version of usearch, create a small reproducible example of the problem, and post it as an issue on the phyloseq issues tracker. This import function was developed against usearch version 7.0.109. Later versions of usearch will hopefully keep this function and format, but the phyloseq team has no way to guarantee this, so any feedback will help maintain future functionality. For instance, it is currently assumed that the 9th and 10th columns of the .uc table hold the read label and OTU ID, respectively, and that the delimiter between sample name and read in the read-name entries is a single "_". If either assumption does not hold, you may have to change the corresponding arguments (colRead, colOTU, readDelimiter), or even modify the current implementation of this function.
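The assumptions described above (read label in column 9, OTU ID in column 10, "_" between sample name and read) can be checked by hand on a small table. The sketch below uses only base R and a tiny, made-up uc-like table built in memory; the helper make_row() and all row contents are hypothetical, and this is an illustration of the column and delimiter assumptions rather than the actual implementation of import_usearch_uc().

```r
# Build hypothetical uc-like rows: 10 tab-separated columns, with the
# read label in column 9 and the OTU ID in column 10 ("*" = unassigned).
make_row <- function(read, otu) {
  paste(c("H", rep("0", 7), read, otu), collapse = "\t")
}
uc_lines <- c(
  make_row("S1_r1", "otuA"),
  make_row("S1_r2", "otuA"),
  make_row("S2_r1", "otuB"),
  make_row("S2_r2", "*")
)

# Read the table as character data, treating "*" as missing.
uc <- read.table(text = uc_lines, sep = "\t", header = FALSE,
                 na.strings = c("*", "NA"), colClasses = "character",
                 comment.char = "", quote = "")

colRead <- 9
colOTU <- 10
readDelimiter <- "_"

# Keep only the two columns of interest and drop unassigned reads.
x <- data.frame(read = uc[[colRead]], OTU = uc[[colOTU]],
                stringsAsFactors = FALSE)
x <- x[!is.na(x$OTU), ]

# Recover the sample ID from each read label, then count reads
# per sample and OTU (samples as rows, OTUs as columns).
x$sample <- sub(paste0(readDelimiter, ".*$"), "", x$read)
otu_counts <- table(sample = x$sample, OTU = x$OTU)
print(otu_counts)
```

If the counts from a check like this look wrong for your own file, the column indices or the delimiter are the first things to revisit. After conversion to a plain matrix, a table of this shape could in principle be passed to phyloseq's otu_table() with taxa_are_rows = FALSE.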

Also note that there is now a UPARSE-specific output file format, uparseout, and it might make more sense to create and import that file for use in phyloseq. If so, you'll want to import using the import_uparse() function.

See Also

import

import_biom

import_qiime

Examples

usearchfile <- system.file("extdata", "usearch.uc", package="phyloseq")
import_usearch_uc(usearchfile)

Example output

Reading `ucfile` into memory and parsing into table 
Initially read 100 entries. 
... Now removing unassigned OTUs (* or NA)... 
Removed 7 entries that had no OTU assignment. 
A total of 93 will be assigned to the OTU table.
OTU Table:          [33 taxa and 37 samples]
                     taxa are columns
               
                174337 175279 215097 2232355 269386 3154070 4226619 4308637
  D1.393095          0      0      0       0      0       0       0       0
  D10.5n.393082      0      0      0       0      0       0       0       0
  D10.a.393084       0      0      0       0      0       0       0       0
  D10.b.393074       0      0      0       0      0       0       0       0
  D11.a.392963       0      0      0       0      1       0       0       0
  D11.b.392956       0      0      0       0      0       0       0       0
  D12.392988         0      0      0       0      0       0       0       0
  D13.5n.393036      0      0      0       0      0       0       1       0
  D13.a.393151       0      0      0       0      0       0       0       0
  D13.b.393109       0      0      0       0      0       0       0       0
  D14.393072         0      0      0       0      0       0       0       0
  D15.5n.393148      0      0      0       0      0       0       0       0
  D15.a.392972       0      0      0       0      0       0       0       0
  D15.b.393025       0      0      0       0      0       0       0       0
  D16.a.393131       0      0      0       0      0       0       0       0
  D16.b.393030       0      0      0       0      0       0       0       0
  D17.392970         0      0      0       0      0       0       0       0
  D18.6m.393070      0      0      1       0      0       0       0       0
  D19.5n.393000      0      0      0       0      0       0       0       0
  D19.a.393146       0      0      0       0      0       0       0       1
  D19.b.393086       0      0      0       0      0       0       0       0
  D2.393107          1      0      0       1      0       0       0       0
  D20.393127         0      1      0       0      0       0       0       0
  D21.5n.392974      0      0      0       0      0       1       0       0
  D21.a.393001       1      0      0       0      0       0       0       0
  D21.b.393031       0      0      0       0      0       0       0       0
  D22.393118         1      0      0       0      0       0       0       0
  D22.5n.393054      0      0      0       0      0       0       0       0
  D23.392960         0      0      0       0      0       0       0       0
  D24.393019         0      0      0       0      0       0       0       0
  D26.392975         1      0      1       0      0       0       0       0
  D27.393075         1      0      0       0      0       0       0       0
  D28.393022         0      0      0       0      0       0       0       0
  D29.393071         0      0      0       0      0       0       0       0
  D3.393129          0      0      0       0      0       0       0       0
  D30.392996         0      0      0       0      0       0       0       0
  D31.393093         1      0      0       0      0       0       0       0
               
                4331364 4358723 4381430 4381553 4401375 4412540 4416570 4416951
  D1.393095           0       0       0       0       0       0       0       0
  D10.5n.393082       0       0       0       0       0       0       0       0
  D10.a.393084        0       0       0       1       0       0       0       0
  D10.b.393074        0       0       0       1       0       0       0       0
  D11.a.392963        0       0       0       0       0       0       0       0
  D11.b.392956        0       0       0       0       0       0       0       0
  D12.392988          0       0       0       0       0       0       0       0
  D13.5n.393036       0       0       0       0       0       0       0       0
  D13.a.393151        0       0       1       0       0       0       0       0
  D13.b.393109        0       0       0       0       0       0       0       0
  D14.393072          0       0       0       0       0       0       0       0
  D15.5n.393148       0       0       0       0       0       1       0       0
  D15.a.392972        0       0       0       0       1       0       0       0
  D15.b.393025        0       0       0       0       0       0       0       0
  D16.a.393131        0       0       0       0       0       0       0       0
  D16.b.393030        0       0       0       0       0       0       0       0
  D17.392970          0       0       0       0       0       0       0       0
  D18.6m.393070       0       0       0       0       0       0       0       0
  D19.5n.393000       0       0       0       0       0       0       0       0
  D19.a.393146        0       0       0       1       0       0       0       1
  D19.b.393086        0       0       0       0       0       0       0       0
  D2.393107           0       0       0       0       0       0       0       0
  D20.393127          0       0       0       0       0       0       0       0
  D21.5n.392974       0       0       0       0       0       0       1       0
  D21.a.393001        1       0       0       0       0       0       1       0
  D21.b.393031        0       0       0       0       0       0       0       0
  D22.393118          0       1       0       0       0       0       0       0
  D22.5n.393054       0       0       0       0       0       0       0       0
  D23.392960          0       0       0       0       0       0       0       0
  D24.393019          0       0       0       0       0       0       0       0
  D26.392975          0       0       0       1       0       0       0       0
  D27.393075          0       0       0       0       0       0       0       0
  D28.393022          0       0       0       0       0       0       0       0
  D29.393071          0       0       0       0       0       0       0       0
  D3.393129           0       0       0       0       0       0       0       0
  D30.392996          0       1       0       0       0       0       0       0
  D31.393093          0       0       0       0       0       0       0       0
               
                4437368 4446898 4447950 4448492 4449518 4463709 4465746 4468234
  D1.393095           0       0       0       0       0       0       1       0
  D10.5n.393082       0       0       1       0       0       0       0       0
  D10.a.393084        0       0       0       0       0       0       0       0
  D10.b.393074        0       0       0       0       1       0       0       0
  D11.a.392963        0       0       0       0       0       0       0       3
  D11.b.392956        0       0       0       0       0       0       0       0
  D12.392988          0       0       0       0       0       0       1       0
  D13.5n.393036       0       0       0       0       1       0       0       0
  D13.a.393151        0       0       1       0       0       0       0       0
  D13.b.393109        0       0       2       0       0       0       0       0
  D14.393072          0       1       0       0       0       0       0       0
  D15.5n.393148       0       0       0       0       0       0       0       0
  D15.a.392972        0       0       0       0       0       0       0       0
  D15.b.393025        0       0       1       0       0       0       0       0
  D16.a.393131        0       0       0       0       0       0       0       0
  D16.b.393030        0       0       0       0       0       0       0       0
  D17.392970          0       0       1       0       0       0       0       1
  D18.6m.393070       0       0       0       0       1       0       0       1
  D19.5n.393000       0       0       0       0       1       0       0       0
  D19.a.393146        0       0       0       0       1       0       0       0
  D19.b.393086        0       0       0       0       0       0       0       1
  D2.393107           0       0       0       0       0       1       0       0
  D20.393127          0       0       0       0       0       0       0       1
  D21.5n.392974       0       0       0       0       0       0       0       2
  D21.a.393001        0       0       0       0       0       0       0       0
  D21.b.393031        0       0       0       0       1       0       0       0
  D22.393118          0       0       0       0       1       0       0       0
  D22.5n.393054       0       0       1       0       0       0       0       0
  D23.392960          0       0       0       0       0       0       0       0
  D24.393019          1       0       0       0       0       0       0       0
  D26.392975          0       0       0       0       0       0       0       0
  D27.393075          0       0       0       0       0       0       0       0
  D28.393022          0       0       0       0       0       0       0       1
  D29.393071          0       0       0       1       0       0       0       0
  D3.393129           0       0       0       0       0       0       0       1
  D30.392996          0       0       0       0       0       0       0       0
  D31.393093          0       0       0       0       0       0       0       0
               
                4475642 4480244 4480359 4481131 4481359 4481719 4483037 4484111
  D1.393095           0       0       0       2       0       0       0       0
  D10.5n.393082       0       0       1       0       0       1       0       0
  D10.a.393084        0       0       0       0       0       0       0       0
  D10.b.393074        0       0       0       0       0       0       0       0
  D11.a.392963        0       0       0       0       0       0       0       0
  D11.b.392956        0       0       0       0       0       0       1       0
  D12.392988          0       0       0       0       0       0       0       0
  D13.5n.393036       0       0       0       0       0       0       0       1
  D13.a.393151        0       0       0       1       0       0       0       0
  D13.b.393109        0       0       0       1       1       0       0       0
  D14.393072          0       0       0       0       0       3       0       0
  D15.5n.393148       0       0       0       0       0       0       0       1
  D15.a.392972        0       0       0       0       0       0       0       0
  D15.b.393025        0       0       0       0       0       0       0       0
  D16.a.393131        0       0       0       0       0       0       0       1
  D16.b.393030        0       0       0       0       0       1       1       0
  D17.392970          0       0       0       0       0       0       0       0
  D18.6m.393070       0       0       0       0       0       0       0       0
  D19.5n.393000       0       0       0       0       0       0       0       0
  D19.a.393146        0       0       1       0       0       0       1       0
  D19.b.393086        0       0       0       0       0       0       1       0
  D2.393107           0       0       0       2       0       0       0       0
  D20.393127          0       0       0       0       1       0       0       0
  D21.5n.392974       0       1       0       0       0       0       0       0
  D21.a.393001        0       0       0       1       0       1       0       0
  D21.b.393031        0       0       0       0       0       0       0       0
  D22.393118          0       0       0       0       0       0       0       0
  D22.5n.393054       0       0       0       1       0       1       0       0
  D23.392960          0       0       0       0       0       1       0       0
  D24.393019          0       0       0       0       0       1       0       0
  D26.392975          0       0       0       0       0       0       0       0
  D27.393075          0       0       0       0       0       0       0       1
  D28.393022          0       0       0       0       0       1       0       0
  D29.393071          0       0       0       0       0       0       0       0
  D3.393129           0       0       0       1       0       0       0       0
  D30.392996          0       0       0       1       0       0       0       0
  D31.393093          1       0       0       0       0       0       0       0
               
                619817
  D1.393095          0
  D10.5n.393082      0
  D10.a.393084       0
  D10.b.393074       0
  D11.a.392963       0
  D11.b.392956       0
  D12.392988         0
  D13.5n.393036      0
  D13.a.393151       0
  D13.b.393109       0
  D14.393072         0
  D15.5n.393148      0
  D15.a.392972       0
  D15.b.393025       0
  D16.a.393131       1
  D16.b.393030       0
  D17.392970         0
  D18.6m.393070      0
  D19.5n.393000      0
  D19.a.393146       0
  D19.b.393086       0
  D2.393107          0
  D20.393127         0
  D21.5n.392974      0
  D21.a.393001       0
  D21.b.393031       0
  D22.393118         0
  D22.5n.393054      0
  D23.392960         0
  D24.393019         0
  D26.392975         0
  D27.393075         0
  D28.393022         0
  D29.393071         0
  D3.393129          0
  D30.392996         0
  D31.393093         0

phyloseq documentation built on Nov. 8, 2020, 6:41 p.m.