readFullDataTable: convert a GenomeStudio FullDataTable file to the import...

View source: R/fitPolyTools.R

readFullDataTable R Documentation

Convert a GenomeStudio FullDataTable file to the import format for fitPoly

Description

A GenomeStudio file in wide format (samples side by side) is converted to a fitPoly input file in long format.

Usage

readFullDataTable(filename, rawXY=FALSE,
markergroups=list(), out, filetype=c("dat","RData")[2])
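
A minimal call might look like the sketch below. The file name "FullDataTable.txt" is hypothetical, and the call is guarded so that it only runs if the fitPoly package is installed and the file exists. Setting out to "" asks for the converted data to be returned without writing any file, as described under Arguments.

```r
# Hypothetical input file name; replace with the path of your own export
fdt_file <- "FullDataTable.txt"
dat <- NULL

# Only attempt the conversion if fitPoly is installed and the file is present
if (requireNamespace("fitPoly", quietly = TRUE) && file.exists(fdt_file)) {
  # out = "" : no output file, the long-format data frame is returned
  dat <- fitPoly::readFullDataTable(fdt_file, rawXY = FALSE, out = "")
  str(dat)
}
```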

Arguments

filename

name of a FullDataTable tab-separated text file exported from Illumina's GenomeStudio. The file must contain a column "Name" with the marker names, and for each sample a pair of columns "sample.X" and "sample.Y" if rawXY is FALSE, or "sample.X raw" and "sample.Y raw" (note the space) if rawXY is TRUE. Further columns may be present but are not read.
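
To make the expected column layout concrete, the following sketch builds a tiny tab-separated file by hand and checks its header. The marker names, sample names (S1, S2) and intensity values are invented for illustration; a real GenomeStudio export would normally contain many more columns.

```r
# Hypothetical toy data: 3 markers, 2 samples (S1, S2), normalized X and Y
wide <- data.frame(
  Name = c("mk1", "mk2", "mk3"),
  S1.X = c(0.90, 0.10, 0.50),
  S1.Y = c(0.10, 0.85, 0.45),
  S2.X = c(0.05, 0.40, 0.95),
  S2.Y = c(0.95, 0.60, 0.02),
  check.names = FALSE
)

# Write as tab-separated text to a temporary file (illustration only)
fdt_file <- tempfile(fileext = ".txt")
write.table(wide, fdt_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the header back and list the samples that have both an .X and a .Y column
hdr <- names(read.delim(fdt_file, check.names = FALSE, nrows = 1))
x_samples <- sub("\\.X$", "", grep("\\.X$", hdr, value = TRUE))
y_samples <- sub("\\.Y$", "", grep("\\.Y$", hdr, value = TRUE))
samples <- intersect(x_samples, y_samples)
```

For rawXY = TRUE the corresponding columns would be named "S1.X raw", "S1.Y raw" and so on, so the patterns would need to match the " raw" suffix instead.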

rawXY

if FALSE (default) the normalized .X and .Y columns are read; if TRUE the "raw" columns (.X raw and .Y raw) are read instead.

markergroups

a list with character vectors of marker names, or integer vectors of marker numbers in file order. If the data set is large, the conversion to long format may exceed memory limits. In these cases the data can be split into marker groups that are converted separately and each saved to a separate file. If the list is empty (default) all markers are converted as one block.
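
One way to build such a list is to cut the markers into consecutive blocks of a fixed size. The sketch below assumes ten hypothetical marker names and a block size of four, both chosen only for illustration; it shows the two accepted forms (marker names and marker numbers).

```r
# Hypothetical marker names; in practice these come from the "Name" column
marker_names <- paste0("mk", 1:10)
group_size <- 4  # assumed block size, chosen only for illustration

# Block index for each marker: 1 1 1 1 2 2 2 2 3 3
grp_index <- ceiling(seq_along(marker_names) / group_size)

# Form 1: a list of character vectors with marker names
markergroups_by_name <- unname(split(marker_names, grp_index))

# Form 2: a list of integer vectors with marker numbers in file order
markergroups_by_number <- unname(split(seq_along(marker_names), grp_index))
```

Either list could then be passed as the markergroups argument; a suitable block size depends on the number of samples and the available memory.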

out

the name of an output file, without extension. If a list of markergroups is given, out must be a valid file name; in that case multiple output files are created, with the list element numbers appended to out in their file names. If no markergroups are specified, out may also be set to "" or NA; in that case no file is created and the converted data are only returned as the function result.
If out is not "" or NA, a file <out>_meanR.dat is also saved, containing for each sample its mean R value and its number of missing data points (over all markers).
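
The kind of per-sample summary described for the _meanR file can be sketched in base R as follows. This is not the package's own code: the column names meanR and nMissing are hypothetical, and counting a data point as missing when R is NA is an assumption.

```r
# Hypothetical long-format data: 3 markers, 2 samples, one missing value for S1
long <- data.frame(
  MarkerName = rep(c("mk1", "mk2", "mk3"), times = 2),
  SampleName = rep(c("S1", "S2"), each = 3),
  X = c(0.9, 0.1, NA, 0.2, 0.5, 0.4),
  Y = c(0.1, 0.3, NA, 0.6, 0.5, 0.4)
)
long$R <- long$X + long$Y

# Mean R per sample over all markers, ignoring missing values
mean_r <- tapply(long$R, long$SampleName, mean, na.rm = TRUE)

# Number of markers with missing R per sample (assumed definition of "missing")
n_missing <- tapply(is.na(long$R), long$SampleName, sum)

# Hypothetical column names; the actual file layout may differ
summary_tab <- data.frame(
  SampleName = names(mean_r),
  meanR = as.vector(mean_r),
  nMissing = as.vector(n_missing)
)
```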

filetype

either "dat" or "RData" (default): the former produces tab-separated text files, the latter saves RData files with the converted data in a data frame with name "dat".

Details

The wide-format input is converted to long format with columns MarkerName, SampleName, X, Y, R (= X + Y) and ratio (= Y / R). The X and Y signal intensities are obtained from the <sample>.X and <sample>.Y columns in the input data (or from the <sample>.X raw and <sample>.Y raw columns if rawXY is TRUE). R and ratio are calculated from these values and are not read from the input data.
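
The conversion described above can be illustrated with a small base R sketch. It is not the function's actual implementation: the toy data are invented, and the row order of the result (all markers of the first sample, then the next sample) is an assumption that may differ from what readFullDataTable produces.

```r
# Hypothetical wide-format data: 2 markers, 2 samples
wide <- data.frame(
  Name = c("mk1", "mk2"),
  S1.X = c(0.9, 0.1), S1.Y = c(0.1, 0.3),
  S2.X = c(0.2, 0.5), S2.Y = c(0.6, 0.5),
  check.names = FALSE
)

# Sample names are taken from the columns ending in ".X"
samples <- sub("\\.X$", "", grep("\\.X$", names(wide), value = TRUE))

# Stack one block of rows per sample to get the long format
long <- do.call(rbind, lapply(samples, function(s) {
  data.frame(
    MarkerName = wide$Name,
    SampleName = s,
    X = wide[[paste0(s, ".X")]],
    Y = wide[[paste0(s, ".Y")]]
  )
}))

# R and ratio are derived from X and Y, not read from the input
long$R <- long$X + long$Y
long$ratio <- long$Y / long$R
```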

Value

If no markergroups are specified, a data.frame is returned with columns MarkerName, SampleName, X, Y, R (= X + Y) and ratio (= Y / (X + Y)). The intensity columns are named X and Y even if the raw data are read.
If a list of markergroups is specified, the function result is NULL and the converted data.frames are only saved as files.
If the saved files are RData files, they all contain one data.frame named "dat".
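
Because every saved RData file holds an object with the same name, loading several of them into the global environment would overwrite "dat" each time. A possible way to read one block back safely is to load it into a separate environment, as sketched below; the file here is a temporary stand-in created for illustration, not an actual output of readFullDataTable.

```r
# Create a stand-in RData file containing a data.frame named "dat"
dat <- data.frame(MarkerName = "mk1", SampleName = "S1", X = 0.9, Y = 0.1)
dat$R <- dat$X + dat$Y
dat$ratio <- dat$Y / dat$R
rdata_file <- tempfile(fileext = ".RData")
save(dat, file = rdata_file)
rm(dat)

# Load into a private environment so an existing "dat" is not overwritten
env <- new.env()
loaded <- load(rdata_file, envir = env)  # names of the objects that were loaded
block <- get("dat", envir = env)
```

With several marker groups, the same steps could be repeated per file and the resulting data frames combined with rbind if memory allows.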


fitPoly documentation built on April 3, 2025, 8:58 p.m.