import | R Documentation |
Import data into R
import(con, ...)
## S4 method for signature 'character'
import(con, format = NULL, ...)
## S4 method for signature 'textConnection'
import(
  con,
  format = c("csv", "tsv", "json", "yaml"),
  colnames = TRUE,
  quote = "\"",
  naStrings = pipette::naStrings,
  quiet = FALSE
)
## S4 method for signature 'PipetteRdsFile'
import(con, quiet = FALSE)
## S4 method for signature 'PipetteRDataFile'
import(con, quiet = FALSE)
## S4 method for signature 'PipetteDelimFile'
import(
  con,
  rownames = TRUE,
  rownameCol = NULL,
  colnames = TRUE,
  quote = "\"",
  naStrings = pipette::naStrings,
  comment = "",
  skip = 0L,
  nMax = Inf,
  engine = c("base", "data.table", "readr"),
  makeNames = syntactic::makeNames,
  metadata = FALSE,
  quiet = FALSE
)
## S4 method for signature 'PipetteLinesFile'
import(
  con,
  comment = "",
  skip = 0L,
  nMax = Inf,
  stripWhitespace = FALSE,
  removeBlank = FALSE,
  metadata = FALSE,
  engine = c("base", "data.table", "readr"),
  quiet = FALSE
)
## S4 method for signature 'PipetteExcelFile'
import(
  con,
  sheet = 1L,
  rownames = TRUE,
  rownameCol = NULL,
  colnames = TRUE,
  skip = 0L,
  nMax = Inf,
  naStrings = pipette::naStrings,
  makeNames = syntactic::makeNames,
  metadata = FALSE,
  quiet = FALSE
)
## S4 method for signature 'PipetteBamFile'
import(con, quiet = FALSE)
## S4 method for signature 'PipetteBcfFile'
import(con, quiet = FALSE)
## S4 method for signature 'PipetteCramFile'
import(con, quiet = FALSE)
## S4 method for signature 'PipetteFastaFile'
import(
  con,
  moleculeType = c("DNA", "RNA", "AA"),
  metadata = FALSE,
  quiet = FALSE
)
## S4 method for signature 'PipetteFastqFile'
import(con, moleculeType = c("DNA", "RNA"), metadata = FALSE, quiet = FALSE)
## S4 method for signature 'PipetteGafFile'
import(con, metadata = FALSE, quiet = FALSE)
## S4 method for signature 'PipetteGctFile'
import(
  con,
  metadata = FALSE,
  quiet = FALSE,
  return = c("matrix", "data.frame")
)
## S4 method for signature 'PipetteGmtFile'
import(con, quiet = FALSE)
## S4 method for signature 'PipetteGmxFile'
import(con, quiet = FALSE)
## S4 method for signature 'PipetteGrpFile'
import(con, quiet = FALSE)
## S4 method for signature 'PipetteJsonFile'
import(con, metadata = FALSE, quiet = FALSE)
## S4 method for signature 'PipetteMafFile'
import(con, quiet = FALSE)
## S4 method for signature 'PipetteMtxFile'
import(con, rownamesFile, colnamesFile, metadata = FALSE, quiet = FALSE)
## S4 method for signature 'PipetteOboFile'
import(con, quiet = FALSE)
## S4 method for signature 'PipettePzfxFile'
import(
  con,
  sheet = 1L,
  makeNames = syntactic::makeNames,
  metadata = FALSE,
  quiet = FALSE
)
## S4 method for signature 'PipetteSamFile'
import(con, quiet = FALSE)
## S4 method for signature 'PipetteVcfFile'
import(con, quiet = FALSE)
## S4 method for signature 'PipetteYamlFile'
import(con, metadata = FALSE, quiet = FALSE)
## S4 method for signature 'PipetteBcbioCountsFile'
import(con, metadata = FALSE, quiet = FALSE)
## S4 method for signature 'PipetteRioFile'
import(
  con,
  rownames = TRUE,
  rownameCol = NULL,
  colnames = TRUE,
  makeNames = syntactic::makeNames,
  metadata = FALSE,
  quiet = FALSE,
  ...
)
## S4 method for signature 'PipetteRtracklayerFile'
import(con, metadata = FALSE, quiet = FALSE, ...)
con | |
format | |
... | Additional arguments. |
colnames | |
quote | |
naStrings | |
quiet | |
rownames | |
rownameCol | |
comment | |
skip | |
nMax | |
engine | |
makeNames | |
metadata | |
stripWhitespace | |
removeBlank | |
sheet | |
moleculeType | |
return | |
rownamesFile, colnamesFile | |
import() supports automatic loading of common file types by wrapping
popular importer functions. It is intentionally designed to be simple, with few
arguments. Remote URLs and compressed files are supported. If you need more
complex import settings, call the wrapped importer directly instead.
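As a rough illustration of calling a wrapped importer directly, the sketch below reads a semicolon-delimited file with decimal commas using utils::read.table(), one of the importers listed on this page. The file contents are made up for the example, and the sep and dec settings are shown only because they do not appear among the import() arguments above.

```r
## Write a small semicolon-delimited file with decimal commas
## (hypothetical data, for illustration only).
file <- tempfile(fileext = ".txt")
writeLines(
    text = c("gene;score", "a;1,5", "b;2,25"),
    con = file
)
## Call the wrapped importer directly to control the separator and the
## decimal mark.
x <- utils::read.table(
    file = file,
    header = TRUE,
    sep = ";",
    dec = ",",
    stringsAsFactors = FALSE
)
print(x)
unlink(file)
```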
Varies, depending on the file type (format):
R data serialized (RDS): variable. Currently recommended over RDA, if possible. Imported by readRDS().
R data (RDA, RDATA): variable. Must contain a single object. Doesn't require internal object name to match, unlike loadData(). Imported by load().
Plain text delimited (CSV, TSV, TXT): data.frame. Data separated by commas, tabs, or visual spaces. Note that TXT structure is ambiguous and actively discouraged. Refer to the Data frame return section for details on how to change the default return type to DFrame, tbl_df or data.table. Imported by readr::read_delim() by default.
Excel workbook (XLSB, XLSX): data.frame. Resave in plain text delimited format instead, if possible. Imported by readxl::read_excel().
Legacy Excel workbook (pre-2007) (XLS): data.frame. Resave in plain text delimited format instead, if possible. Note that import of files in this format is slow. Imported by readxl::read_excel().
GraphPad Prism project (PZFX): data.frame. Experimental. Consider resaving in CSV format instead. Imported by pzfx::read_pzfx().
General feature format (GFF, GFF1, GFF2, GFF3, GTF): GRanges. Imported by rtracklayer::import().
Gene Ontology (GO) annotation file (GAF): data.frame with 17 columns. Imported by utils::read.table().
MatrixMarket exchange sparse matrix (MTX): sparseMatrix. Imported by Matrix::readMM().
Sequence alignment/map format (SAM, BAM, CRAM): list. Imported by Rsamtools::scanBam().
Mutation annotation format (MAF): MAF. Imported by maftools::read.maf().
Variant call format (VCF, BCF): list. Imported by Rsamtools::scanBcf().
Gene cluster text (GCT): matrix or data.frame. Imported by readr::read_delim().
Gene sets (for GSEA) (GMT, GMX): character.
Browser extensible data (BED, BED15, BEDGRAPH, BEDPE): GRanges. Imported by rtracklayer::import().
ChIP-seq peaks (BROADPEAK, NARROWPEAK): GRanges. Imported by rtracklayer::import().
Wiggle track format (BIGWIG, BW, WIG): GRanges. Imported by rtracklayer::import().
JSON serialization data (JSON): list. Imported by jsonlite::read_json().
YAML serialization data (YAML, YML): list. Imported by yaml::yaml.load_file().
Lines (LOG, MD, PY, R, RMD, SH): character. Source code or log files. Imported by readr::read_delim() by default.
Infrequently used rio-compatible formats (ARFF, DBF, DIF, DTA, MAT, MTP, ODS, POR, SAS7BDAT, SAV, SYD, REC, XPT): variable. Imported by rio::import().
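The difference between the RDS and RDA entries above can be sketched with base R alone. This is an illustration of the underlying readRDS() and load() behavior, not necessarily how import() handles these files internally; the object and file names are made up.

```r
## Hypothetical object saved in both formats.
object <- data.frame(x = 1:3)
rdsFile <- tempfile(fileext = ".rds")
rdaFile <- tempfile(fileext = ".rda")
saveRDS(object, file = rdsFile)
save(object, file = rdaFile)
## RDS: the object is returned directly and can be assigned to any name.
fromRds <- readRDS(rdsFile)
## RDA: `load()` restores objects under their saved names, so load into a
## scratch environment and extract the single object it contains.
envir <- new.env()
objNames <- load(rdaFile, envir = envir)
stopifnot(length(objNames) == 1L)
fromRda <- get(objNames, envir = envir)
unlink(c(rdsFile, rdaFile))
```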
Row names. Row name handling has become an inconsistent mess in R because
of differential support in base R, tidyverse, data.table, and Bioconductor.
To maintain sanity, import() attempts to handle row names automatically.
The function checks for a rowname column in delimited data, and moves these
values into the object's row names, if supported by the return type (e.g.
data.frame, DFrame). Note that tbl_df (tibble) and data.table
intentionally do not support row names. When returning in this format, no
attempt to assign the rowname column into the return object's row names is
made. Note that import() is strict about this matching and only checks for
a rowname column, similar to the default syntax recommended in
tibble::rownames_to_column(). To disable this behavior, set rownames = FALSE,
and no attempt will be made to set the row names.
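The row name handling described above amounts to something like the following base R sketch. It is an approximation for illustration, not the package's actual implementation, and the data frame is made up.

```r
## Hypothetical delimited data containing a "rowname" column.
df <- data.frame(
    rowname = c("gene1", "gene2"),
    sample1 = c(1L, 2L),
    sample2 = c(3L, 4L),
    stringsAsFactors = FALSE
)
## Strict match on the "rowname" column only: move its values into the
## row names and drop the column.
if ("rowname" %in% colnames(df)) {
    rownames(df) <- df[["rowname"]]
    df[["rowname"]] <- NULL
}
print(df)
```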
Column names. import() assumes that delimited files always contain
column names. If you are working with a file that doesn't contain column
names, either set colnames = FALSE or pass the names in as a character
vector. It's strongly recommended to always define column names in a
supported file type.
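The two options for a file without a header row have a loose base R analogue, sketched here with utils::read.csv(). The file contents and the column names "id" and "value" are assumptions for the example; import() itself is not called.

```r
## Hypothetical CSV file without a header row.
file <- tempfile(fileext = ".csv")
writeLines(text = c("a,1", "b,2"), con = file)
## Analogue of `colnames = FALSE`: base R assigns placeholder names.
x <- utils::read.csv(file = file, header = FALSE)
## Analogue of passing a character vector: names are defined explicitly.
y <- utils::read.csv(
    file = file,
    header = FALSE,
    col.names = c("id", "value")
)
unlink(file)
```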
FASTA and FASTQ files are currently managed internally by the Biostrings
package. Refer to readDNAStringSet and readRNAStringSet for details.
Import of these files will return DNAStringSet or RNAStringSet depending
on the input, defined by the moleculeType argument.
The GFF (General Feature Format) format consists of one line per feature, each containing 9 columns of data, plus optional track definition lines. The GTF (General Transfer Format) is identical to GFF version 2.
See also:
Refer to the IGV website for details.
Refer to the Broad Institute GSEA wiki for details.
Reading a Matrix Market Exchange file requires ROWNAMES and COLNAMES
sidecar files containing the corresponding row and column names of the sparse
matrix.
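A minimal sketch of how sidecar files pair with an MTX file, using Matrix::readMM() and base readLines(). The sidecar file names and matrix values here are hypothetical, and the dimnames assignment is done by hand rather than through import().

```r
library(Matrix)
## Hypothetical 2 x 3 sparse matrix written with sidecar name files.
mat <- Matrix::sparseMatrix(
    i = c(1L, 2L),
    j = c(1L, 3L),
    x = c(5, 7),
    dims = c(2L, 3L)
)
mtxFile <- tempfile(fileext = ".mtx")
rownamesFile <- paste0(mtxFile, ".rownames")
colnamesFile <- paste0(mtxFile, ".colnames")
Matrix::writeMM(obj = mat, file = mtxFile)
writeLines(text = c("gene1", "gene2"), con = rownamesFile)
writeLines(text = c("sample1", "sample2", "sample3"), con = colnamesFile)
## The MTX file stores values only, so names come from the sidecar files.
x <- Matrix::readMM(file = mtxFile)
dimnames(x) <- list(readLines(rownamesFile), readLines(colnamesFile))
print(x)
unlink(c(mtxFile, rownamesFile, colnamesFile))
```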
bcbio count matrix (e.g. generated from featureCounts) and related sidecar files are natively supported.
COUNTS: Counts table (e.g. RNA-seq aligned counts).
COLNAMES: Sidecar file containing column names.
ROWNAMES: Sidecar file containing row names.
These file formats are intentionally not supported: DOC, DOCX, PDF, PPT, PPTX.
GMTFile and OBOFile are also supported by the BiocSet package.
Updated 2023-12-15.
Packages:
rio.
Import functions:
BiocIO::import().
Rsamtools::scanBam().
Rsamtools::scanBcf().
data.table::fread().
maftools::read.maf().
readr::read_delim().
rio::import().
rtracklayer::import().
utils::read.table().
vroom::vroom().
con <- system.file("extdata", "example.csv", package = "pipette")
## Row and column names enabled.
x <- import(con = con)
print(head(x))
## Row and column names disabled.
x <- import(con = con, rownames = FALSE, colnames = FALSE)
print(head(x))