pre5.genos2numeric: Categorize genotype data into 3 levels
In genMOSSplus: Application of MOSS algorithm to genome-wide association study (GWAS)


pre5.genos2numeric(file.ped, dir.ped, file.dat, dir.dat = dir.ped, dir.out, 
num.nonsnp.col = 2, num.nonsnp.last.col = 1, letter.encoding = TRUE, 
ped.has.ext = TRUE, dat.has.ext = TRUE, remove.bad.genos = FALSE, 
save.ids.name = "")

`file.ped`	The name of file with genotypes, after imputation.
`dir.ped`	The name of directory where `file.ped` can be found.
`file.dat`	The .dat file; it should be tab-separated and have no header.
`dir.dat`	The name of directory where `file.dat` can be found. Defaults to `dir.ped`.
`dir.out`	The name of output directory to which resulting file should be saved. The file will be named "Num.<file.ped>".
`num.nonsnp.col`	The number of leading columns in the .ped file that do not contain SNP values (patient ID, gender, etc.). For the MaCH1 input format, set `num.nonsnp.col=5`; for PLINK, set it to 6 (because of the extra disease status column).
`num.nonsnp.last.col`	The number of trailing columns that do not contain geno values. For example, if the last column is the disease status (0s and 1s), set this to 1; if the last 2 columns are confounding variables, set it to 2.
`letter.encoding`	Flag indicating whether Alleles are encoded as letters (A, C, T, G). If TRUE, the function additionally checks that every Allele is one of these letters and prints warning messages if other symbols appear.
`ped.has.ext`	Flag indicating whether the `file.ped` name has a filename extension (e.g. ".ped", ".txt"). This is needed for naming the output file.
`dat.has.ext`	Flag indicating whether the `file.dat` name has a filename extension (e.g. ".dat", ".txt").
`remove.bad.genos`	Flag indicating whether to remove a geno if at least one of its values is not valid (e.g. "2" when only letters are expected, or "NA"). Warning: set this to TRUE only if CASE and CONTROL have already been merged into `file.ped`; otherwise SNPs could be removed from CASE but not from CONTROL, producing two different .dat files.
`save.ids.name`	The name of the file to which patient IDs should be saved. If not empty, patient IDs are saved to a separate file with this name. Since a dataset is generally split across many files, one per chromosome, and the patient IDs should be the same across these files, it is enough to extract the patient IDs once, when running this code on the smallest chromosome. For runs on all other chromosomes, leave `save.ids.name=""` to save time and avoid redundant work. A possible output file name is "patients.fam".

Categorizes genotype data into 3 levels: 1, 2, 3. Genos with two different Alleles are encoded as "2". Genos with two identical Alleles are encoded as "1" or "3", where the most frequent geno is "1". No missing values are allowed, so this step must be run after imputation. Geno values should use the letters A, T, C, G if letter.encoding=TRUE. The function can also serve as a check for unexpected imputed values. For example, MaCH1 may predict an Allele with value "2" (instead of A, T, C, or G); it is best to remove SNPs that contain such values.
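
The encoding rule described above can be illustrated with a minimal sketch. It is not the package's implementation: `encode_genos` is a hypothetical helper, and it assumes each genotype is a string of two Alleles joined by "/" (e.g. "A/G") with no invalid values.

```r
# Hypothetical helper, NOT part of genMOSSplus.
# Assumes each genotype is written as "<allele>/<allele>", e.g. "A/G".
encode_genos <- function(genos, sep = "/") {
  alleles <- strsplit(genos, sep, fixed = TRUE)
  a1 <- vapply(alleles, `[`, character(1), 1)
  a2 <- vapply(alleles, `[`, character(1), 2)

  # Two different Alleles -> "2".
  is_het <- a1 != a2
  codes <- integer(length(genos))
  codes[is_het] <- 2L

  # Two identical Alleles -> "1" for the most frequent geno, "3" otherwise.
  hom <- a1[!is_het]
  if (length(hom) > 0) {
    counts <- sort(table(hom), decreasing = TRUE)
    most_frequent <- names(counts)[1]
    codes[!is_het] <- ifelse(hom == most_frequent, 1L, 3L)
  }
  codes
}

# One SNP column for six patients.
snp <- c("A/A", "A/G", "G/G", "A/A", "G/A", "A/A")
encoded <- encode_genos(snp)
print(encoded)
```

Here "A/A" is the most frequent geno with identical Alleles, so it maps to 1, "G/G" maps to 3, and both "A/G" and "G/A" map to 2.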

The following files will be produced:

 - <file.ped>_num<ending.ped> - in the dir.out directory, the resultant 
    binary file: the SNP columns + last columns (no user IDs are 
    recorded), where <ending.ped> is the filename extension of file.ped.
 - <file.dat>_num.dat - in the dir.out directory, the corresponding .dat file; it 
    may differ from the original <file.dat> if remove.bad.genos=TRUE.
 - <save.ids.name> - the patient IDs, if save.ids.name is not empty ("").
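
As a rough illustration of the <file.ped>_num<ending.ped> pattern listed above, the sketch below builds such a name from an input file name. `num_file_name` is a hypothetical helper, not a genMOSSplus function, and the exact naming used by the package is assumed from the list above rather than checked against its source.

```r
# Hypothetical helper, NOT part of genMOSSplus.
# Builds "<file.ped>_num<ending.ped>" from an input file name.
num_file_name <- function(file.ped, ped.has.ext = TRUE) {
  ext <- tools::file_ext(file.ped)
  if (!ped.has.ext || ext == "") {
    # No extension to preserve: just append the suffix.
    return(paste0(file.ped, "_num"))
  }
  paste0(tools::file_path_sans_ext(file.ped), "_num.", ext)
}

with_ext <- num_file_name("chr22.ped")
without_ext <- num_file_name("chr22", ped.has.ext = FALSE)
print(c(with_ext, without_ext))
```

The `ped.has.ext` argument mirrors the flag of the same name in the usage above: the extension has to be known so that the "_num" suffix can be placed before it.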

The name of the output file, <file.ped>_num<ending.ped>.

Note: if file.ped contains any bad values (e.g. "NA", "0/0", "0", "1 1"), the output file Num_<file.ped> is still produced. If remove.bad.genos=FALSE, bad input values are encoded as '2' by default. Warning messages will be printed. If remove.bad.genos=TRUE, the affected SNPs are removed entirely, along with their names in the .dat file.
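
To make the handling of bad values more concrete, here is a small sketch of how SNPs with invalid genos could be detected and dropped. It is an assumption-laden illustration, not the package's code: `is_valid_geno` is a hypothetical helper, genotypes are assumed to be strings like "A/G", and the data frame stands in for the SNP columns of a .ped file.

```r
# Hypothetical helper, NOT part of genMOSSplus.
# A geno is treated as valid only if it is two letters from A, C, G, T
# joined by "/"; NA and anything else counts as bad.
is_valid_geno <- function(genos) grepl("^[ACGT]/[ACGT]$", genos)

# Toy SNP columns for three patients; "rs2" contains a bad value.
genos <- data.frame(
  rs1 = c("A/A", "A/G", "G/G"),
  rs2 = c("C/C", "0/0", "C/T"),
  stringsAsFactors = FALSE
)

# Flag every SNP (column) with at least one invalid value.
bad_snp <- vapply(genos, function(col) any(!is_valid_geno(col)), logical(1))

# Analogue of remove.bad.genos = TRUE: drop the whole SNP.
kept <- genos[, !bad_snp, drop = FALSE]
print(names(kept))
```

In this sketch the names of the kept columns play the role of the SNP names that would remain in the .dat file.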

Olia Vesselova

pre4.combine.case.control, pre4.combine.case.control.batch, pre5.genos2numeric.batch

print("See the demo 'gendemo'.")

[1] "See the demo 'gendemo'."

genMOSSplus documentation built on May 1, 2019, 10:31 p.m.
