load_tsv_counts: Load counts from multiple tab-delimited reports

View source: R/load_tsv_counts.R

load_tsv_counts    R Documentation

Load counts from multiple tab-delimited reports

Description

Returns a data frame in which rows are genes/loci and columns are samples. If files is a named vector, its names are used as sample names; otherwise sample names can be supplied through colnames. If neither is given, a unique string is extracted from each report filename.
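To control sample names explicitly rather than rely on the strings extracted from filenames, the files vector can be named before it is passed in. The following is a minimal sketch in base R; the report paths and the STAR-style suffix are hypothetical and used only for illustration.

```r
# Hypothetical report paths (assumed for illustration only).
files <- c(
  "results/sampleA.ReadsPerGene.out.tab",
  "results/sampleB.ReadsPerGene.out.tab"
)

# Derive sample names by dropping the directory and the shared suffix.
sample_names <- sub("\\.ReadsPerGene\\.out\\.tab$", "", basename(files))

# Build a named vector; per the description above, the names would be
# used as the sample (column) names.
named_files <- setNames(files, sample_names)
print(names(named_files))

# The named vector would then be passed as the files argument, e.g.
# gene_counts <- load_tsv_counts(named_files)
```

Naming the vector up front keeps the column names stable even if the report filenames change later.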

Usage

load_tsv_counts(
  files,
  colnames = names(files),
  all_locs = NULL,
  header = FALSE,
  gene_id_column = 1,
  count_column = 2,
  exclude_features = c()
)

Arguments

files

A character vector with paths to the reports.

colnames

Output column names. Defaults to names(files), so if files is a named vector its names are used as column names. If no names are available, a unique string is extracted from each report filename.

all_locs

A character vector with all genes/loci to be included in the output; typically this is a list of all loci in the annotation. If not provided, the loci will be the union of loci in all reports, in an arbitrary order.

header

Logical; whether the first line of each input file is a header.

gene_id_column

Report column number containing the gene ID.

count_column

Report column number containing the count.

exclude_features

Gene/feature IDs to be excluded from the output data frame.

Value

A data frame in which rows are genes/loci and columns are samples (when each report corresponds to one sample).
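The shape of the returned value can be illustrated with a small base R sketch. This is not the scopetools implementation: merge_counts_sketch is a hypothetical helper that only mirrors the documented arguments, and filling loci missing from a report with 0 is an assumption made for the sketch.

```r
# Sketch only: NOT the scopetools implementation.
# merge_counts_sketch is a hypothetical helper mirroring the documented arguments.
merge_counts_sketch <- function(files, all_locs = NULL, header = FALSE,
                                gene_id_column = 1, count_column = 2,
                                exclude_features = c()) {
  # Read each report into a vector of counts named by gene ID.
  per_sample <- lapply(files, function(f) {
    tab <- read.delim(f, header = header, stringsAsFactors = FALSE)
    setNames(tab[[count_column]], tab[[gene_id_column]])
  })
  # Use the names of files as sample names, falling back to the basenames.
  names(per_sample) <- if (is.null(names(files))) basename(files) else names(files)

  # Without all_locs, take the union of loci seen in any report.
  if (is.null(all_locs)) {
    all_locs <- unique(unlist(lapply(per_sample, names)))
  }
  all_locs <- setdiff(all_locs, exclude_features)

  # Look up every locus in every sample. Loci absent from a report come back
  # as NA and are set to 0 here (an assumption of this sketch).
  cols <- lapply(per_sample, function(x) {
    v <- unname(x[all_locs])
    v[is.na(v)] <- 0
    v
  })
  data.frame(cols, row.names = all_locs, check.names = FALSE)
}

# Two small temporary reports with gene ID in column 1 and count in column 2.
f1 <- tempfile(fileext = ".tsv")
f2 <- tempfile(fileext = ".tsv")
writeLines(c("geneA\t5", "geneB\t3"), f1)
writeLines(c("geneA\t2", "geneC\t7"), f2)

res <- merge_counts_sketch(c(s1 = f1, s2 = f2))
print(res)
```

Supplying all_locs fixes both the set and the order of rows, which is useful when several count tables need to line up against the same annotation.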

Examples

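# Locate the example reports shipped with the package
ddir <- system.file("extdata", package="scopetools")
tsv_files <- Sys.glob(file.path(ddir, '*.ReadsPerGene.out.tab'))

# Combine the reports into one table: rows are genes/loci, columns are samples
gene_counts <- load_tsv_counts(tsv_files)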


nixonlab/scopetools documentation built on Sept. 30, 2022, 11:15 a.m.