pcoc: PCoC for correcting for population stratification



Description

Identify the clustered and continuous patterns of genetic variation using PCoC, which calculates the principal coordinates and the clustering of the subjects in order to correct for population stratification (PS).

Usage

pcoc(
  genoFile,
  outFile.txt = "pcoc.result.txt",
  n.MonteCarlo = 1000,
  num.splits = 10,
  miss.val = 9
)

Arguments

genoFile

a text file containing the genotypes (0, 1, 2, or 9). The entry in row i and column j is the genotype of the jth subject at the ith marker. 0, 1, and 2 denote the number of risk alleles, and 9 (the default) denotes a missing genotype.

outFile.txt

the name of the text file in which the result of this function is saved. The default is "pcoc.result.txt".

n.MonteCarlo

the number of repetitions of the Monte Carlo procedure. The default is 1000.

num.splits

the number of groups into which the markers are split. The default is 10.

miss.val

the number that represents missing data in the input file. The default is 9. Missing genotypes in genoFile must be coded with the value given by miss.val.
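As a minimal sketch of the input layout described above, the following base R code builds a small marker-by-subject matrix, marks a few genotypes as missing with the code 9, and writes it to a text file. The object names (geno, miss.code, geno.file) and the simulated data are illustrative assumptions, and the choice of sep = "" simply mirrors the call used in the Examples section rather than a documented requirement.

```r
set.seed(1)

n.markers  <- 100
n.subjects <- 40

# Rows are markers and columns are subjects, as described for genoFile.
geno <- matrix(rbinom(n.markers * n.subjects, 2, 0.5),
               nrow = n.markers, ncol = n.subjects)

# Mark 20 randomly chosen genotypes as missing using the code 9
# (the value that would be passed as miss.val).
miss.code <- 9
geno[sample(length(geno), 20)] <- miss.code

# Write one line per marker with no separator, row names, or column names,
# mirroring the write.table() call in the Examples section.
geno.file <- tempfile(fileext = ".txt")
write.table(geno, file = geno.file, quote = FALSE,
            sep = "", row.names = FALSE, col.names = FALSE)

# Read the file back to inspect its layout: one string of digits per marker.
geno.lines <- readLines(geno.file)
head(geno.lines, 3)
```

If a different missing-value code were used in the file, the same value would be supplied through miss.val.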

Details

Hidden population structure is a possible confounder in large-scale genome-wide association studies: cases and controls may differ systematically because of unrecognized population structure. The PCoC procedure uses techniques from multidimensional scaling and clustering to correct for population stratification. PCoC can be seen as an extension of EIGENSTRAT.
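The two ingredients named above, multidimensional scaling and clustering, can be illustrated with base R. The sketch below is not the PCoC algorithm itself, whose details are not given on this page; it only applies stats::cmdscale and stats::hclust to simulated genotypes from two hypothetical subpopulations. The Euclidean distance, average linkage, and the choice of two coordinates and two clusters are assumptions made for illustration.

```r
set.seed(42)

# Simulate two hypothetical subpopulations with different allele frequencies.
n.markers <- 200
p1 <- runif(n.markers, 0.1, 0.4)
p2 <- runif(n.markers, 0.6, 0.9)
pop1 <- sapply(1:20, function(j) rbinom(n.markers, 2, p1))
pop2 <- sapply(1:20, function(j) rbinom(n.markers, 2, p2))
geno <- cbind(pop1, pop2)  # markers in rows, subjects in columns

# Classical multidimensional scaling on between-subject Euclidean distances.
subj.dist <- dist(t(geno))
coords <- cmdscale(subj.dist, k = 2)

# Hierarchical clustering of the subjects, cut into two groups.
groups <- cutree(hclust(subj.dist, method = "average"), k = 2)

# Cross-tabulate the recovered groups against the simulated labels.
table(groups, rep(c("pop1", "pop2"), each = 20))
```

Here coords plays the role of the continuous pattern and groups the role of the clustered pattern; PCoC determines both from the data rather than taking the number of clusters as given.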

Value

A list with the components principal.coordinates and cluster. principal.coordinates contains the principal coordinates, and cluster contains the clustering of the subjects. If only one cluster is found, cluster is omitted.
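Because cluster may be absent from the returned list, code that consumes the result should check for it before use. The sketch below shows one way to do this with a hypothetical helper, pcoc_covariates, which is not part of AssocTests. It assumes that principal.coordinates is a numeric matrix with one row per subject, which this page does not confirm, and it runs on simulated stand-in lists rather than on real pcoc() output.

```r
# Hypothetical helper (not part of AssocTests): turn a pcoc()-style result
# into a data frame of covariates. Assumes principal.coordinates has one
# row per subject, which is an assumption made for this sketch.
pcoc_covariates <- function(res) {
  covs <- as.data.frame(res$principal.coordinates)
  names(covs) <- paste0("PC", seq_len(ncol(covs)))
  # cluster is omitted when only one cluster is found, so test for it first.
  if (!is.null(res$cluster)) {
    covs$cluster <- factor(res$cluster)
  }
  covs
}

# Stand-in results (simulated, not produced by pcoc()).
set.seed(7)
res.two <- list(principal.coordinates = matrix(rnorm(20), ncol = 2),
                cluster = rep(1:2, each = 5))
res.one <- list(principal.coordinates = matrix(rnorm(20), ncol = 2))

covs.two <- pcoc_covariates(res.two)  # includes a cluster column
covs.one <- pcoc_covariates(res.one)  # principal coordinates only
str(covs.two)
str(covs.one)
```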

Author(s)

Lin Wang, Wei Zhang, and Qizhai Li.

References

Lin Wang, Wei Zhang, and Qizhai Li. AssocTests: An R Package for Genetic Association Studies. Journal of Statistical Software. 2020; 94(5): 1-26.

Q Li and K Yu. Improved Correction for Population Stratification in Genome-Wide Association Studies by Identifying Hidden Population Structures. Genetic Epidemiology. 2008; 32(3): 215-226.

KV Mardia, JT Kent, and JM Bibby. Multivariate Analysis. New York: Academic Press. 1976.

Examples

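# Simulate genotypes for 100 markers (rows) and 40 subjects (columns).
pcocG.eg <- matrix(rbinom(4000, 2, 0.5), ncol = 40)
# Write the matrix with no separator, row names, or column names.
write.table(pcocG.eg, file = "pcocG.eg.txt", quote = FALSE,
       sep = "", row.names = FALSE, col.names = FALSE)
# Run PCoC with fewer Monte Carlo repetitions than the default of 1000.
pcoc(genoFile = "pcocG.eg.txt", outFile.txt = "pcoc.result.txt",
       n.MonteCarlo = 50, num.splits = 10, miss.val = 9)
# Remove the files created by the example.
file.remove("pcocG.eg.txt", "pcoc.result.txt")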

AssocTests documentation built on Jan. 13, 2021, 5:09 a.m.
