loadCohortDefinition: Read in the cohort definition file


Description

Creates a data frame from the cohort description file. The file must be tab-delimited, with a header and one line per sample. Lines beginning with a '#' are ignored. The file must include the columns 'sample' and 'exonExpressionFile' (case sensitive). Optionally, only a subset of the samples is loaded. All columns from the file are loaded.
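To make the expected file layout concrete, here is a minimal sketch that writes a small cohort file and reads it with base R. The sample names, file paths, and the extra 'group' column are made up for illustration, and read.delim() only approximates the behavior described above; it is not the function's actual implementation.

```r
# Hypothetical cohort file: a comment line, a header, one line per sample.
cohortLines <- c(
  "# Example cohort definition",
  "sample\texonExpressionFile\tgroup",
  "S1\tdata/S1.exon.txt\ttumor",
  "S2\tdata/S2.exon.txt\tnormal"
)
cohortFile <- tempfile(fileext = ".tsv")
writeLines(cohortLines, cohortFile)

# Approximate the described read with base R: tab-delimited, with header,
# '#' lines skipped, text columns kept as character vectors.
cohort <- read.delim(cohortFile, comment.char = "#", stringsAsFactors = FALSE)
print(cohort)
```

Extra columns such as 'group' are carried along unchanged, which matches the statement that all columns from the file are loaded.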

Usage

loadCohortDefinition(file, samples = NULL, comment.char = "#",
  dataRoot = NULL)

Arguments

file

The name of the cohort file to read in (tab-delimited with header)

samples

The list of samples to read in, as a character vector. The rows matching these are the only ones read in. By default, this is set to NULL, meaning it reads all the samples. If any sample in the provided list is not found, a warning will be generated. Must match exactly (case sensitive).
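The filtering described here can be sketched in base R as follows. filterSamples() is a hypothetical helper written for this illustration, not part of the package; the warning texts are taken from the Warnings section of this page.

```r
# Small in-memory cohort standing in for a file that was already read.
cohort <- data.frame(
  sample = c("S1", "S2", "S3"),
  exonExpressionFile = c("S1.txt", "S2.txt", "S3.txt"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only the requested samples (exact, case-sensitive
# match), warning about duplicated or unknown names in the request.
filterSamples <- function(cohort, samples = NULL) {
  if (is.null(samples)) {
    return(cohort)  # NULL means "keep every sample"
  }
  if (anyDuplicated(samples) > 0) {
    warning('Duplicates in "samples" parameter filter list ignored')
    samples <- unique(samples)
  }
  notFound <- setdiff(samples, cohort$sample)
  if (length(notFound) > 0) {
    warning('Ignoring missing samples specified in the "samples" parameter list: ',
            paste(notFound, collapse = ", "))
  }
  cohort[cohort$sample %in% samples, , drop = FALSE]
}

# "s2" differs from "S2" in case, so it counts as missing.
# suppressWarnings() only keeps this example quiet; drop it to see the warnings.
kept <- suppressWarnings(filterSamples(cohort, c("S1", "S1", "s2")))
```

Only the row for S1 survives in this example, because matching is exact.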

comment.char

The character starting comment lines, by default '#'. Must be a single character. Any line beginning with this character is considered to be a comment line and is ignored. Must be the first character in a line (no leading white space is allowed). Set to the empty text, "", to skip comment line filtering.
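As an illustration of the first-character rule, the sketch below drops comment lines from a character vector of file lines. dropComments() is a hypothetical helper, and treating an indented '#' line as data is this sketch's reading of "no leading white space is allowed".

```r
# Hypothetical helper: remove lines whose very first character is comment.char.
dropComments <- function(lines, comment.char = "#") {
  if (identical(comment.char, "")) {
    return(lines)  # empty text disables comment filtering
  }
  stopifnot(nchar(comment.char) == 1)  # must be a single character
  lines[!startsWith(lines, comment.char)]
}

lines <- c(
  "# a comment line",
  "sample\texonExpressionFile",
  "S1\tS1.txt",
  " # indented, so not treated as a comment here",
  "S2\tS2#a.txt"
)

kept <- dropComments(lines)        # drops only the first line
unfiltered <- dropComments(lines, "")  # keeps everything
```

Note that a '#' later in a line, as in the last entry, does not start a comment under this rule.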

dataRoot

A directory to prepend to the exonExpressionFile filenames. By default this is NULL and nothing is prepended. NA and empty text "" are likewise ignored. It is ok if dataRoot does not exist at this time, as the file locations are not verified at this point.
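A minimal sketch of the prepending logic, assuming the directory and file name are joined with file.path(); the page does not say how the join is done, and prependRoot() is a hypothetical helper.

```r
# Hypothetical helper: prepend dataRoot unless it is NULL, NA, or "".
prependRoot <- function(files, dataRoot = NULL) {
  if (is.null(dataRoot) || is.na(dataRoot) || dataRoot == "") {
    return(files)  # nothing to prepend
  }
  # No check that the directory exists; paths are only built, not verified.
  file.path(dataRoot, files)
}

withRoot <- prependRoot(c("S1.txt", "S2.txt"), "/data/cohort")
noRoot <- prependRoot(c("S1.txt", "S2.txt"), NA)
```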

Value

A data frame representing the sample cohort with at least two columns:

sample

The sample names read from the input file's sample column

exonExpressionFile

The exon expression file names read from the input file's exonExpressionFile column

Errors

Can't find the cohort definition file: file

The file you are trying to read in can't be found - case, permission, relative directory, and misspellings are all possible reasons.

The cohort file has no sample column sample

The cohort file must have a column sample with the sample names

The cohort file has no expression file column exonExpressionFile

The cohort file must have a column exonExpressionFile giving the file names of the exon expression data for the sample in the same row.

The sample and exon expression file columns must contain text

One or both of these columns were read in as something other than a character vector.

The cohort file can not contains duplicate samples

The cohort file contains multiple lines with the same sample name, even after filtering out any samples you didn't want considered. That's not ok.

The cohort file can not contain duplicate cohort expression files

The cohort file contains multiple lines with the same cohort expression file name, even after filtering out any samples you didn't want considered. That's not ok. If two different samples really have the same expression file name, you'll have to put them in different directories.

All sample and exon expression file entries must be non-empty text

Sample names can not be missing or empty text, as the sample is the primary key for later work. Exon expression file names can not be missing or empty text (after filtering), as such samples can not then be part of the cohort.
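The column checks listed in this section can be sketched as a single validation function. checkCohort() is a hypothetical helper; its messages are shortened paraphrases of the errors above, and the real function may test these conditions differently or in a different order.

```r
# Hypothetical validator mirroring the documented error conditions.
checkCohort <- function(cohort) {
  if (!"sample" %in% names(cohort)) {
    stop("The cohort file has no sample column")
  }
  if (!"exonExpressionFile" %in% names(cohort)) {
    stop("The cohort file has no expression file column")
  }
  if (!is.character(cohort$sample) ||
      !is.character(cohort$exonExpressionFile)) {
    stop("The sample and exon expression file columns must contain text")
  }
  if (anyDuplicated(cohort$sample) > 0) {
    stop("The cohort file can not contain duplicate samples")
  }
  if (anyDuplicated(cohort$exonExpressionFile) > 0) {
    stop("The cohort file can not contain duplicate expression files")
  }
  values <- c(cohort$sample, cohort$exonExpressionFile)
  if (any(is.na(values) | values == "")) {
    stop("All sample and exon expression file entries must be non-empty text")
  }
  invisible(cohort)
}

# A cohort with a repeated sample name fails the duplicate check.
bad <- data.frame(
  sample = c("S1", "S1"),
  exonExpressionFile = c("a.txt", "b.txt"),
  stringsAsFactors = FALSE
)
problem <- tryCatch(checkCohort(bad), error = function(e) conditionMessage(e))
```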

Warnings

Duplicates in "samples" parameter filter list ignored

The list of samples you provided to filter the cohort data frame by contained the same sample name more than once. You can only include a sample once, so the duplicates are just ignored. The warning lets you fix the list for future use (if you care).

Ignoring missing samples specified in the "samples" parameter list: sample, sample,...

The list of samples you provided to filter the cohort data frame contained sample names that were not actually in the cohort file. Perhaps you should double check your cohort file?
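Because these conditions are warnings rather than errors, the call still returns a result. A caller who wants to record the warnings instead of printing them could use base R's withCallingHandlers(), as sketched below. noisyFilter() is a hypothetical stand-in that only mimics the duplicate warning; it is not the package function.

```r
# Hypothetical stand-in for a call that warns but still returns a value.
noisyFilter <- function(samples) {
  if (anyDuplicated(samples) > 0) {
    warning('Duplicates in "samples" parameter filter list ignored')
  }
  unique(samples)
}

# Collect warning messages while letting the call finish normally.
caught <- character(0)
result <- withCallingHandlers(
  noisyFilter(c("S1", "S1", "S2")),
  warning = function(w) {
    caught <<- c(caught, conditionMessage(w))
    invokeRestart("muffleWarning")  # continue without printing the warning
  }
)
```

After the call, result holds the de-duplicated names and caught holds the warning text for logging or inspection.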

Todo


jefferys/FusionExpressionPlot documentation built on May 19, 2019, 3:59 a.m.