setGenData assumes that the plaintext file (fileIn) contains
records of individuals in rows, and phenotypes, covariates and markers in
columns. The first nColSkip columns (1:nColSkip) are used to
populate the slot @pheno of a genData
object, and the remaining columns are used to fill the slot
@geno. If the first row contains a header
(header=TRUE), data in this row is used to determine variable names
for @pheno and marker names for @map and @geno.
Genotypes are stored in a distributed matrix (dMatrix). By default a
column-distributed matrix (cDMatrix) is used for @geno,
but the user can modify this using the distributed.by argument. The
number of chunks is either specified by the user (use nChunks when
calling setGenData) or determined internally so that each
ff_matrix object has a number of cells that is smaller than
.Machine$integer.max/1.2. setGenData creates a folder
(folderOut) that contains the binary flat files (geno_*.bin)
and the genData object (typically named
genData.RData). Optionally (if returnData is TRUE) it returns
the genData object to the environment. The filenames of
the ff_matrix objects are saved as relative names. Therefore, to be
able to access the content of the data included in @geno the working
directory must either be the folder where these files are saved
(folderOut) or the object must be loaded using the loadGenData
function included in the package.
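The chunk-size rule described above can be illustrated with a little arithmetic. The sketch below is only one way to satisfy the stated constraint for a column-distributed layout (each chunk holds all n rows and a subset of columns); `min_chunks` is a hypothetical helper, not a function of the package, and the package's internal rule may differ.

```r
# Hypothetical helper: smallest number of column chunks such that every
# chunk has fewer than .Machine$integer.max / 1.2 cells.
min_chunks <- function(n, p, max_cells = .Machine$integer.max / 1.2) {
  # Largest whole number of columns with n * cols strictly below max_cells
  max_cols <- ceiling(max_cells / n) - 1
  if (max_cols < 1) {
    stop("a single column already exceeds the cell limit")
  }
  ceiling(p / max_cols)
}

small <- min_chunks(n = 100, p = 1000)   # fits comfortably in one chunk
large <- min_chunks(n = 1000, p = 5e6)   # 5e9 cells must be split
```

Passing nChunks explicitly overrides any automatic choice, which may be useful when smaller chunks are preferred for memory reasons.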
setGenData(fileIn, header, dataType, distributed.by = "columns", n = NULL,
  p = NULL, folderOut = paste("genData_", sub("\\.[[:alnum:]]+$", "",
  basename(fileIn)), sep = ""), returnData = TRUE, na.strings = "NA",
  nColSkip = 6, idCol = 2, verbose = FALSE, nChunks = NULL,
  dimorder = if (distributed.by == "rows") 2:1 else 1:2)
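The default values of folderOut and dimorder in the signature are expressions, so it may help to see what they evaluate to. The sketch below re-evaluates those two expressions in base R outside the package; the input paths are made-up examples and the wrapper functions are hypothetical.

```r
# Re-evaluate the default folderOut expression for a given input path
# (wrapper name is hypothetical; the expression is copied from the usage).
default_folder <- function(fileIn) {
  paste("genData_", sub("\\.[[:alnum:]]+$", "", basename(fileIn)), sep = "")
}

# Re-evaluate the default dimorder expression for a given layout
default_dimorder <- function(distributed.by = "columns") {
  if (distributed.by == "rows") 2:1 else 1:2
}

folder_a <- default_folder("data/wheat.raw")     # directory and extension dropped
folder_b <- default_folder("mice.geno.txt")      # only the last extension dropped
order_cols <- default_dimorder("columns")
order_rows <- default_dimorder("rows")
```

Only the final extension is removed, so a file name with several dots keeps the earlier ones in the folder name.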
fileIn | The path to the plaintext file.
header | If TRUE, the file contains a header.
dataType | The coding of genotypes. Use 'character' for A/C/G/T or 'integer' for numeric coding.
distributed.by | If "columns", a column-distributed matrix (cDMatrix) is created; if "rows", a row-distributed matrix is created.
n | The number of individuals.
p | The number of markers.
folderOut | The path to the folder where the binary files are saved.
returnData | If TRUE, the function returns a genData object.
na.strings | The character string used to denote missing values.
nColSkip | The number of columns to be skipped to reach the genotype information in the file.
idCol | The index of the ID column.
verbose | If TRUE, progress updates will be posted.
nChunks | The number of chunks to create.
dimorder | The physical layout of the chunks.
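To make the roles of header, nColSkip, idCol and na.strings concrete, the sketch below builds a toy input file and splits it the way the description suggests. It uses only base R (read.table), not the package's own reader; the column names and values are invented, and using the ID column as row names of the genotype matrix is an assumption made for illustration.

```r
# Toy input: 3 individuals, 6 leading columns (nColSkip = 6), 4 markers.
toy <- data.frame(
  FID = c("f1", "f1", "f2"),
  IID = c("id1", "id2", "id3"),      # column 2 holds the IDs (idCol = 2)
  PAT = 0, MAT = 0, SEX = c(1, 2, 1),
  y = c(1.2, NA, 0.7),
  snp1 = c(0L, 1L, 2L), snp2 = c(1L, NA, 0L),
  snp3 = c(2L, 2L, 1L), snp4 = c(0L, 0L, 1L)
)
fileIn <- tempfile(fileext = ".raw")
write.table(toy, fileIn, quote = FALSE, row.names = FALSE)  # missing values written as "NA"

nColSkip <- 6
idCol <- 2
raw <- read.table(fileIn, header = TRUE, na.strings = "NA",
                  stringsAsFactors = FALSE)

# Leading columns correspond to what would go into @pheno,
# the remaining columns to what would go into @geno.
pheno <- raw[, seq_len(nColSkip)]
geno <- as.matrix(raw[, -seq_len(nColSkip)])
rownames(geno) <- raw[[idCol]]     # assumption: IDs label the rows
```

With a file like this, the corresponding call would pass header = TRUE, dataType = 'integer', and the default nColSkip and idCol.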
If returnData is TRUE, a genData object
is returned.
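Because the binary files are referenced by relative names, a saved object only resolves them from the right working directory. The base R sketch below illustrates that point with a stand-in file; the folder name and `geno_1.bin` are illustrative, and no package function is called.

```r
# Stand-in for an output folder holding one binary file
folderOut <- file.path(tempdir(), "genData_toy")   # hypothetical folder
dir.create(folderOut, showWarnings = FALSE)
rel_name <- "geno_1.bin"                           # a relative file name
writeBin(as.raw(0:9), file.path(folderOut, rel_name))

# From another directory the relative name does not resolve ...
old_wd <- setwd(tempdir())
found_outside <- file.exists(rel_name)

# ... but it does once the working directory is folderOut
setwd(folderOut)
found_inside <- file.exists(rel_name)

setwd(old_wd)   # restore the original working directory
```

This is the situation the loadGenData function mentioned in the description is meant to handle, so that the working directory does not have to be changed by hand.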