setGenData assumes that the plaintext file (fileIn) contains records of individuals in rows, and phenotypes, covariates and markers in columns. Columns 1:nColSkip are used to populate the @pheno slot of a genData object, and the remaining columns are used to fill the @geno slot. If the first row contains a header (header=TRUE), the data in this row is used to determine variable names for @pheno and marker names for @map and @geno.
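
The file layout described above can be illustrated with base R alone. The following is a minimal sketch, not the package's implementation: the toy data, the column names, and the use of the ID column as row names are assumptions made for illustration, and the whole file is read into memory, which setGenData is designed to avoid.

```r
# Hypothetical toy file: 6 leading columns (the default nColSkip)
# followed by 3 marker columns, one individual per row.
toy <- data.frame(
  FID = c("f1", "f2", "f3"),
  IID = c("id1", "id2", "id3"),
  PAT = 0, MAT = 0, SEX = c(1, 2, 1),
  y = c(1.2, NA, 0.7),
  snp1 = c(0L, 1L, 2L),
  snp2 = c(1L, NA, 0L),
  snp3 = c(2L, 2L, 1L)
)
fileIn <- tempfile(fileext = ".txt")
write.table(toy, fileIn, quote = FALSE, row.names = FALSE, na = "NA")

nColSkip <- 6  # columns that describe the individual, not the markers
idCol <- 2     # index of the ID column

raw <- read.table(fileIn, header = TRUE, na.strings = "NA",
                  stringsAsFactors = FALSE)

# Columns 1:nColSkip correspond to what would populate @pheno ...
pheno <- raw[, seq_len(nColSkip)]
# ... and the remaining columns to what would populate @geno.
geno <- as.matrix(raw[, -seq_len(nColSkip)])
rownames(geno) <- raw[[idCol]]  # assumption: IDs used as row names
```

With header=TRUE, the marker names (snp1, snp2, snp3) come from the first row of the file, which mirrors how the header row is described as the source of names for @map and @geno.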
Genotypes are stored in a distributed matrix (dMatrix). By default a column-distributed matrix (cDMatrix) is used for @geno, but the user can modify this using the distributed.by argument. The number of chunks is either specified by the user (use nChunks when calling setGenData) or determined internally so that each ff_matrix object has a number of cells that is smaller than .Machine$integer.max/1.2.

setGenData creates a folder (folderOut) that contains the binary flat files (geno_*.bin) and the genData object (typically named genData.RData). Optionally (if returnData is TRUE) it returns the genData object to the environment. The filenames of the ff_matrix objects are saved as relative paths. Therefore, to be able to access the data in @geno, either the working directory must be the folder where these files are saved (folderOut) or the object must be loaded using the loadGenData function included in the package.
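
To make the chunk-size constraint concrete, here is a rough sketch of how a chunk count could be derived for a column-distributed matrix. The helper chunksNeeded is hypothetical and is not part of the package; the formula setGenData uses internally may differ. The sketch only enforces the stated bound that each chunk holds fewer than .Machine$integer.max/1.2 cells.

```r
# Hypothetical helper (not part of the package): smallest number of
# column chunks such that every chunk has fewer than maxCells cells.
chunksNeeded <- function(n, p, maxCells = .Machine$integer.max / 1.2) {
  # Work in doubles: n * p can overflow R's integer type.
  n <- as.numeric(n)
  p <- as.numeric(p)
  if (n >= maxCells) {
    stop("a single column already exceeds the cell limit")
  }
  # Lower bound from the total cell count ...
  nChunks <- max(1, ceiling(n * p / maxCells))
  # ... then grow until the widest chunk fits under the limit.
  while (n * ceiling(p / nChunks) >= maxCells) {
    nChunks <- nChunks + 1
  }
  nChunks
}

# Example: 1,000 individuals and 5 million markers.
chunksNeeded(1000, 5e6)
```

Passing nChunks explicitly to setGenData overrides any such internal calculation, which can be useful when smaller chunks are preferred for memory reasons.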
setGenData(fileIn, header, dataType, distributed.by = "columns", n = NULL,
  p = NULL, folderOut = paste("genData_", sub("\\.[[:alnum:]]+$", "",
  basename(fileIn)), sep = ""), returnData = TRUE, na.strings = "NA",
  nColSkip = 6, idCol = 2, verbose = FALSE, nChunks = NULL,
  dimorder = if (distributed.by == "rows") 2:1 else 1:2)
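
The two computed defaults in the signature above can be evaluated on their own in base R to see what they produce. In this small sketch the input path is hypothetical; the expressions are copied from the usage.

```r
fileIn <- "data/chr1.raw"  # hypothetical input path

# Default folderOut: "genData_" plus the file's base name with its
# last extension removed.
folderOut <- paste("genData_", sub("\\.[[:alnum:]]+$", "",
                                   basename(fileIn)), sep = "")
print(folderOut)

# Default dimorder: depends on how the matrix is distributed.
distributed.by <- "rows"
dimorder <- if (distributed.by == "rows") 2:1 else 1:2
print(dimorder)
```

Only the last extension is stripped, so a name with two extensions keeps the first one in the folder name.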
fileIn | The path to the plaintext file.
header | If TRUE, the file contains a header.
dataType | The coding of genotypes. Use 'character' for A/C/G/T or 'integer' for numeric coding.
distributed.by | If "columns", a column-distributed matrix (cDMatrix) is used for @geno; "rows" selects a row-distributed matrix.
n | The number of individuals.
p | The number of markers.
folderOut | The path to the folder in which to save the binary files.
returnData | If TRUE, the function returns a genData object.
na.strings | The character string used to denote missing values.
nColSkip | The number of columns to skip to reach the genotype information in the file.
idCol | The index of the ID column.
verbose | If TRUE, progress updates are printed.
nChunks | The number of chunks to create.
dimorder | The physical layout of the chunks.
If returnData is TRUE, a genData object is returned.