Generate_Meta_Files: Generate summary statistics files
In MetaSKAT: Meta Analysis for SNP-Set (Sequence) Kernel Association Test

View source: R/Meta_SSD_Write.R

Generate_Meta_Files

R Documentation

Generate summary statistics files

Description

Generate Meta SSD (MSSD) and Meta Info (MInfo) files. Both files are needed to run MetaSKAT with summary statistics.

Usage


	Generate_Meta_Files(obj, File.Bed, File.Bim
	, File.SetID, File.MSSD, File.MInfo, N.Sample
	, File.Permu = NULL, data=NULL, impute.method="fixed")
	
	Generate_Meta_Files_FromDosage(obj, File.Dosage
	, File.SetID, File.MSSD, File.MInfo, N.Sample
	, File.Permu=NULL, data=NULL, impute.method="fixed")

Arguments

`obj`	returned object from SKAT_Null_Model.
`File.Bed`	name of the binary ped file (BED).
`File.Bim`	name of the binary bim file (BIM).
`File.SetID`	name of the SNP set ID file. The first column must be Set ID, and the second column must be SNP ID. There should be no header!!
`File.MSSD`	name of the MSSD file that will be generated.
`File.MInfo`	name of the MInfo file that will be generated.
`N.Sample`	number of samples.
`File.Permu`	name of a file that will have score statistics from permuted phenotypes (currently internal use only).
`data`	an optional data frame containing the variables in the model (default=NULL). If it is NULL, the variables are taken from environment(formula).
`impute.method`	a method to impute missing genotypes (default= "fixed"). "bestguess" imputes missing genotypes as the most likely values(0,1,2), "random" imputes missing genotypes by generating binomial(2,p) random variables (p = MAF), and "fixed" imputes missing genotypes by assigning the mean genotype values (2p).
`File.Dosage`	name of the dosage file. The dosage file must not have a header.

Details

These functions generate summary statistic files (MSSD and MInfo files) from plink binary files. To run meta analysis, each study should provide both MSSD and MInfo files. The MSSD is a binary file with between-SNP information matrices, and MInfo is a text file with information on study cohorts and SNPsets.

If users want to use dosages instead of hard call genotypes, Generate_Meta_Files_FromDosage should be used instead of Generate_Meta_Files. The dosage file should follow the plink dosage file format with a single dosage value per each SNP (Format=1 in plink). The first three columns should be SNP ID, allele type1 (a1) and allele type2 (a2). After the first three columns, there should be N.Sample columns of dosage data. Each column represents each sample, and the order of samples should be matched with the order in phenotypes and covariates used in SKAT_Null_Model.

ex)

rs0001 A T 0.1 0.2

rs0002 C G 1.2 0

Dosage value is the expected number of a2 copies, and 0 .. 2 scale. So the value 0.1 indicates that the expected number of copy of a2 is 0.1.