genDataRead | R Documentation |
Reads genotype data from a PED- or haplin-formatted file.
genDataRead(
file.in = stop("Filename must be given!", call. = FALSE),
file.out = NULL,
dir.out = ".",
format = stop("Format parameter is required!"),
header = FALSE,
n.vars,
cov.file.in,
cov.header,
map.file,
map.header = FALSE,
allele.sep = ";",
na.strings = "NA",
col.sep = "",
overwrite = NULL
)
file.in |
The name of the main input file with genotype information. |
file.out |
The base for the output filename (by default, constructed from the input file name). |
dir.out |
The path to the directory where the output files will be saved. |
format |
Format of the data (determines how the data is processed); choose from "haplin" or "ped". |
header |
Whether the first line of the main input file contains column names; default: FALSE; NB: this is useful only for 'haplin'-formatted files! |
n.vars |
The number of columns with covariate data (if any) in the main file; NB: if the main file is in PED format, it is assumed that the first 6 columns contain the standard PED-covariates (i.e., family ID, ID of the child, father and mother, sex and case-control status), so in this case setting 'n.vars' is useful only if the PED file contains more than 6 covariate columns. |
cov.file.in |
Name of the file containing additional covariate data, if any. Caution: unless the 'cov.header' argument is used, it is assumed that the first line of this file contains the header (i.e., the column names of the additional data). |
cov.header |
A character vector containing the names of the covariate columns (in the file with additional covariate data, if given by the 'cov.file.in' argument; or in the main file, if it is a "haplin"-formatted file). |
map.file |
Filename (with path if the file is not in current directory) of the .map file holding the SNP names, if available (see Details). |
map.header |
Logical: does the map.file contain a header in the first row? Default: FALSE. |
allele.sep |
Character: separator between two alleles (default: ";"). |
na.strings |
Character or NA: how the missing data is coded (default: "NA"). |
col.sep |
Character: separator between the columns (i.e., markers; default: any whitespace character). |
overwrite |
Whether to overwrite existing output files: if NULL (default), the user is prompted for an answer; if TRUE, any existing files are overwritten automatically; if FALSE, the function stops if the output files exist. |
The function reads in all the data in the file, creates ff objects to store the genetic information and a data.frame to store the covariate data (if any). These objects are saved in .RData and .ffData files, which can later be loaded into R (with genDataLoad) and re-used.
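The save-then-reload workflow described above might look like the following sketch. It assumes that genDataLoad accepts the base file name and the directory as 'filename' and 'dir.in'; these argument names are an assumption and should be checked against ?genDataLoad. The code runs only if the Haplin package is installed.

```r
# Assumption: genDataLoad() takes the base file name and the directory the
# files were saved in as 'filename' and 'dir.in'; verify with ?genDataLoad.
out.dir <- tempdir(check = TRUE)
base.name <- "exmpl_ped_data"  # same value as 'file.out' in the PED example

if (requireNamespace("Haplin", quietly = TRUE)) {
  example.file <- file.path(
    system.file("extdata", package = "Haplin"), "exmpl_data.ped"
  )
  # First pass: convert the PED file and save the .RData / .ffData files.
  Haplin::genDataRead(example.file, file.out = base.name, dir.out = out.dir,
                      format = "ped", overwrite = TRUE)
  # Later on: reload the saved objects instead of re-reading the PED file.
  ped.data <- Haplin::genDataLoad(filename = base.name, dir.in = out.dir)
  print(ped.data)
}
```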
A list object with three elements:
cov.data - a data.frame with covariate data (if available in the input file)
gen.data - a list with chunks of the genetic data; the data is divided column-wise, using 10,000 columns per chunk; each element of this list is an ff matrix
aux - a list with meta-data and important parameters.
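To illustrate the column-wise chunking of gen.data, here is a minimal base R sketch that splits a set of column indices into chunks of 10,000. The column count is made up, and the code only mirrors the splitting rule stated above; it does not use ff and is not the package's internal implementation.

```r
# Hypothetical number of genotype columns; not taken from any real data set.
n.cols <- 25000
chunk.size <- 10000  # chunk width stated in the documentation above

# Assign each column index to a chunk number (1, 2, 3, ...).
chunk.id <- ceiling(seq_len(n.cols) / chunk.size)
col.chunks <- split(seq_len(n.cols), chunk.id)

# Number of columns that end up in each chunk.
chunk.sizes <- lengths(col.chunks)
print(chunk.sizes)
```

With these numbers the last chunk is smaller than the others, since 25,000 is not a multiple of 10,000.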
The .map file should contain at least two columns, where the second one contains SNP names. Any additional columns should be separated by a whitespace character, but will be ignored. The file should contain a header.
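As an illustration of this layout, the sketch below writes a small made-up .map file with a header line and extracts the SNP names from its second column using base R. The file contents and column names are hypothetical, and read.table is used here only to show the structure; it is not necessarily how genDataRead parses the file.

```r
# Write a small, made-up .map file (hypothetical SNP names and positions).
map.path <- tempfile(fileext = ".map")
writeLines(c(
  "chrom snp pos",
  "1 rs0001 1000",
  "1 rs0002 2000",
  "2 rs0003 1500"
), map.path)

# Read it as whitespace-separated text; the first row is treated as a header.
map.data <- read.table(map.path, header = TRUE, stringsAsFactors = FALSE)

# SNP names are taken from the second column; other columns are ignored.
snp.names <- map.data[[2]]
print(snp.names)
```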
When reading in a covariate file together with the genotype information, it is advisable to include a header in the file, so that there is no doubt about the naming of the data columns.
# The argument 'overwrite' is set to TRUE!
examples.dir <- system.file( "extdata", package = "Haplin" )
# ped format:
example.file2 <- file.path( examples.dir, "exmpl_data.ped" )
ped.data.read <- genDataRead( example.file2, file.out = "exmpl_ped_data",
  dir.out = tempdir( check = TRUE ), format = "ped", overwrite = TRUE )
ped.data.read
# haplin format:
example.file1 <- file.path( examples.dir, "HAPLIN.trialdata2.txt" )
haplin.data.read <- genDataRead( file.in = example.file1,
  file.out = "exmpl_haplin_data", format = "haplin", allele.sep = "", n.vars = 2,
  cov.header = c( "smoking", "sex" ), overwrite = TRUE,
  dir.out = tempdir( check = TRUE ) )
haplin.data.read