glue.chr.segment.par: Splicing chromosomal segments

Description

This function splices the triad chromosomal segments into "complete" trios. The spliced trio sets are written to separate PLINK files, one chromosome at a time. The function is parallelized; if no no_cores value is given, the ceiling of half of the total number of available CPUs is used.
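
The default core count described above can be sketched in base R. This is an illustration of the stated rule rather than the package's actual source: default_cores is a hypothetical helper, and the fallback to a single core when the CPU count cannot be determined is an assumption.

```r
library(parallel)

# Hypothetical helper illustrating the documented default:
# the ceiling of half of the CPUs detected on the machine.
default_cores <- function(total = detectCores()) {
  # detectCores() can return NA on some platforms; falling back to
  # one core in that case is an assumption, not documented behaviour.
  if (is.na(total)) return(1L)
  as.integer(ceiling(total / 2))
}

default_cores(8)  # 4
default_cores(7)  # 4
```

Passing no_cores explicitly, as in the Examples section, avoids relying on this default.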

Usage

glue.chr.segment.par(input.plink.file, out.put.file, brks, sel.fam.all,
  snp.all2, pathway.all, target.snp, pop.vec = NA, no_cores = NA,
  flip = TRUE)

Arguments

input.plink.file

For simulations of a homogeneous population, this is a vector of three character strings giving the base filenames of the mothers', fathers', and children's PLINK files. The PLINK files are in bed format, and three files with the extensions .bed, .bim, and .fam are expected in the same folder for each set of genotypes. The mothers, fathers, and children must come from the same set of trio families, although the ordering of the families may differ among the three data sets. For simulations under population stratification, this is a list of two vectors, one per subpopulation, each a vector of three character strings giving the base filenames as described above.
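
As a rough sketch, the two accepted shapes of input.plink.file might be built as follows. All base filenames here are hypothetical placeholders, not files shipped with the package.

```r
# Hypothetical base filenames (no extension); each one stands for a
# .bed/.bim/.fam triplet located in the same folder.
pop1 <- c("pop1_mom", "pop1_dad", "pop1_kid")
pop2 <- c("pop2_mom", "pop2_dad", "pop2_kid")

# Homogeneous population: one character vector of length three,
# ordered mother, father, child.
input.homogeneous <- pop1

# Population stratification: a list of two such vectors,
# one per subpopulation.
input.stratified <- list(pop1, pop2)

# The three files expected for the first subpopulation's mothers.
expected.files <- paste0(pop1[1], c(".bed", ".bim", ".fam"))
```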

out.put.file

is a character string giving the base file name for the output files. Genotypes on different chromosomes are written to different files, and the final file name also includes the chromosome number. E.g., for a base filename "trio" and for chromosome 1 the final file name is "trio1sim".
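
The naming pattern in the example above can be reproduced with a one-line helper. chr.file.name is a hypothetical name used only for illustration, and applying the same pattern to chromosomes other than 1 is an assumption extrapolated from the single documented case.

```r
# Hypothetical helper reproducing the documented pattern:
# base name, then chromosome number, then the suffix "sim".
chr.file.name <- function(base, chr) paste0(base, chr, "sim")

chr.file.name("trio", 1)  # "trio1sim"

# Assumed names for a four-chromosome run written to a temporary folder.
chr.file.name(file.path(tempdir(), "trio"), 1:4)
```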

brks

is a matrix of integers showing where the chromosomal breaks are to take place for each individual in the simulated trios.

sel.fam.all

is a matrix of integers giving the families (by row number) selected for each chromosomal segment and each simulated trio.

snp.all2

is a data frame containing the list of SNPs in PLINK .bim format. Two columns of the data frame are used: column 1, named "V1", containing the chromosome number, and column 2, named "V2", containing the rs number of each SNP.
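
A data frame with this layout can be obtained by reading a .bim file with read.table and no header, which yields the default column names V1, V2, and so on. The sketch below writes a tiny made-up .bim file first so that it is self-contained; the SNP names and positions are invented for illustration.

```r
# Write a small, made-up .bim file (tab-separated, six columns:
# chromosome, SNP id, genetic distance, position, allele 1, allele 2).
bim.path <- tempfile(fileext = ".bim")
writeLines(c("1\trs0001\t0\t1000\tA\tG",
             "1\trs0002\t0\t2000\tC\tT",
             "2\trs0003\t0\t1500\tA\tC"), bim.path)

# With header = FALSE, read.table names the columns V1, V2, ...
snp.example <- read.table(bim.path, header = FALSE, stringsAsFactors = FALSE)

names(snp.example)[1:2]  # "V1" "V2"
table(snp.example$V1)    # number of SNPs per chromosome
```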

pathway.all

is a matrix giving the genotypes on the pathway SNPs in the simulated trios.

target.snp

is a vector of integers showing the row number of the target SNPs in the .bim file.

pop.vec

is a vector of 1's and 2's giving the subpopulation group of each simulated trio. This parameter is relevant only for stratified scenarios.

no_cores

is an integer specifying the number of CPU cores to use for parallelization.

flip

is a Boolean indicating whether the mother's and the father's genotypes will be swapped to wipe out potential maternal effects in the original data.

Value

This function does not return a value. Instead, it writes PLINK files into the designated directory. Each set of PLINK files contains genotype data for one chromosome for all trios. The first third of the rows are the mothers' genotypes, the second third are the fathers', and the last third are the children's.
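
The row layout described above translates into simple index arithmetic. In this sketch, n.trio is a hypothetical number of simulated trios, and trio.rows additionally assumes that the i-th mother, father, and child rows belong to the same trio, which the text above does not state explicitly.

```r
n.trio <- 1000  # hypothetical number of simulated trios

# Row blocks in each output file: mothers, then fathers, then children.
mother.rows <- seq_len(n.trio)
father.rows <- n.trio + seq_len(n.trio)
child.rows  <- 2 * n.trio + seq_len(n.trio)

# Rows of the i-th trio, ASSUMING the three blocks share the same
# trio ordering (not confirmed by the documentation).
trio.rows <- function(i, n = n.trio) {
  c(mother = i, father = n + i, child = 2 * n + i)
}

trio.rows(1)
```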

Examples

# Row numbers of the target SNPs in the .bim file.
tar.snp <- c(21, 118, 121, 140, 155, 168, 218, 383)
# The preloaded data frame snp.all2 contains the data read from the corresponding .bim file.
found.brks <- get.brks(N.brk = 3, n.ped = 1000, snp.all2, tar.snp, rcmb.rate = NA)
breaks <- found.brks[[1]]
family.position <- found.brks[[2]]
betas <- c(-6.4, 3.2, 5.8)
pwy <- list(1:4, 5:8)
# Base filenames of the mothers', fathers', and children's PLINK files.
m.file <- file.path(system.file(package = "TriadSim"), 'extdata/pop1_4chr_mom')
f.file <- file.path(system.file(package = "TriadSim"), 'extdata/pop1_4chr_dad')
k.file <- file.path(system.file(package = "TriadSim"), 'extdata/pop1_4chr_kid')
target.geno <- get.target.geno(c(m.file, f.file, k.file), tar.snp, snp.all2)
mom.target <- target.geno[[1]]
dad.target <- target.geno[[2]]
kid.target <- target.geno[[3]]
fitted.model <- fit.risk.model.par(n.ped = 1000, brks = breaks, target.snp = tar.snp,
  fam.pos = family.position, mom.tar = mom.target, dad.tar = dad.target,
  kid.tar = kid.target, pathways = pwy, betas, e.fr = NA, betas,
  pop1.frac = NA, rate.beta = NA, no_cores = 2)
sel.fam <- fitted.model[[1]]
sim.pathway.geno <- fitted.model[[2]]
## Not run: 
# Splice the segments and write one set of PLINK files per chromosome.
glue.chr.segment.par(c(m.file, f.file, k.file), file.path(tempdir(), 'trio'), breaks, sel.fam,
                     snp.all2, sim.pathway.geno, tar.snp, pop.vec = NA, no_cores = 1, flip = TRUE)

## End(Not run)

bbms09/TrioSim documentation built on May 11, 2019, 9:27 p.m.