match_ped_summary: match_ped_summary.R

Description

Set-based tests can only use SNPs that appear in both the ped file (needed to estimate correlation) and the summary statistics file (needed for the summary statistic itself). This function identifies the SNPs present in both. Use it on one region (a contiguous stretch of one chromosome) at a time.
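
The matching idea can be sketched with two toy tables. This is a minimal illustration, not the package's implementation: the data frames `map_df` and `ss_df`, their contents, and the region bounds are invented here, and only the column names 'CHR', 'BP', 'RS' and 'P-value' come from the argument descriptions.

```r
# Toy stand-in for a PLINK .map file (chromosome, SNP ID, genetic distance, BP).
map_df <- data.frame(
  CHR = c(1, 1, 1, 1),
  RS  = c("rs1", "rs2", "rs3", "rs4"),
  cM  = 0,
  BP  = c(100, 200, 300, 900),
  stringsAsFactors = FALSE
)

# Toy stand-in for a summary statistics table.
ss_df <- data.frame(
  CHR = c(1, 1, 1),
  BP  = c(200, 300, 400),
  RS  = c("rs2", "rs3", "rs5"),
  `P-value` = c(0.01, 0.20, 0.50),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical region: chromosome 1, positions 150 to 500.
chr <- 1
start_bp <- 150
end_bp <- 500

# Keep reference SNPs inside the region, then intersect on SNP ID.
in_region <- map_df$CHR == chr & map_df$BP >= start_bp & map_df$BP <= end_bp
shared_rs <- intersect(map_df$RS[in_region], ss_df$RS[ss_df$CHR == chr])
shared_rs
```

Here rs1 and rs4 fall outside the region and rs5 is absent from the reference panel, so only the SNPs found in both sources remain.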

Usage

match_ped_summary(SS_fname_root, fname_root, ped_file, map_file, CHR, start_bp,
  end_bp, gene_name, threshold_1000G, checkpoint)

Arguments

SS_fname_root

Root of the summary statistics filename. The full filename should be [SS_fname_root][CHR].txt. The file should have column headers named 'CHR', 'P-value', 'BP', and 'RS'.
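
Before calling the function, it may help to confirm that the summary statistics file has the expected name and headers. The sketch below is an optional pre-check, not part of the package: the root "toy_ss_chr", the tab delimiter, and the file contents are assumptions made for illustration. Note that `read.table` rewrites 'P-value' to 'P.value' unless `check.names = FALSE` is set.

```r
# Hypothetical root and chromosome; the pattern root + CHR + ".txt" is from the text.
SS_fname_root <- file.path(tempdir(), "toy_ss_chr")
CHR <- 7
ss_fname <- paste0(SS_fname_root, CHR, ".txt")

# Write a tiny toy file so the sketch is self-contained (tab-delimited by assumption).
writeLines(c("CHR\tBP\tRS\tP-value",
             "7\t1000\trs10\t0.03",
             "7\t2000\trs11\t0.60"), ss_fname)

# check.names = FALSE keeps the header 'P-value' from becoming 'P.value'.
ss <- read.table(ss_fname, header = TRUE, sep = "\t",
                 check.names = FALSE, stringsAsFactors = FALSE)

# Stop early if any documented column header is absent.
required_cols <- c("CHR", "P-value", "BP", "RS")
missing_cols <- setdiff(required_cols, colnames(ss))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}
```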

fname_root

Root of the downloaded 1000G filenames, used only to delete those files. Leave as NULL to keep the downloaded files.

ped_file

A standard PLINK .ped file, ideally the cleaned output of clean_1000G_raw.

map_file

The standard .map file downloaded from 1000G, ideally the cleaned output of clean_1000G_raw.

CHR

The chromosome of the region.

start_bp

The starting base-pair (BP) position of the region.

end_bp

The ending base-pair (BP) position of the region.

gene_name

A name given to the region (often a gene); used for printing error messages.

threshold_1000G

Minor allele frequency (MAF) threshold. Only 1000G SNPs that pass this threshold are used, because rare alleles may be too unstable for estimating correlations.
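
A MAF filter of this kind can be sketched as follows. This is an illustration under stated assumptions, not the package's code: the genotype matrix `G`, the threshold value 0.1, and the rule that a SNP passes when its MAF is at or above the threshold are all assumed here.

```r
# Toy genotype matrix: rows are subjects, columns are SNPs, entries are
# counts (0, 1 or 2) of one allele. Values are invented for illustration.
G <- cbind(
  rs1 = c(0, 1, 2, 1, 0, 1),
  rs2 = c(0, 0, 0, 0, 0, 1),
  rs3 = c(2, 2, 2, 1, 2, 2)
)

threshold <- 0.1                  # hypothetical threshold value

# Frequency of the counted allele, then fold to the minor allele.
allele_freq <- colMeans(G) / 2
maf <- pmin(allele_freq, 1 - allele_freq)

# Assumption: a SNP "passes" when its MAF is at or above the threshold.
keep <- maf >= threshold
G_kept <- G[, keep, drop = FALSE]
colnames(G_kept)
```

Folding with `pmin` matters because the counted allele may be the major one, as with rs3 in the toy data, where the counted allele is common but the other allele is rare.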

checkpoint

A boolean; if TRUE, diagnostic/error messages are printed.

Value

A list with the elements temp_Gmat (the genotypes at each qualifying SNP) and temp_Gmat_record (information on those SNPs), or the value 1 if there is nothing to return.
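
Because the return value is either a list or the number 1, calling code needs to check which one it received before indexing into it. A minimal sketch of such a check is below; the helper `has_matched_snps` and the stand-in results are hypothetical, and the contents and columns of the toy `temp_Gmat_record` are invented rather than taken from the package.

```r
# Hypothetical helper: TRUE when the result is the documented list,
# FALSE otherwise (including the documented fallback value 1).
has_matched_snps <- function(res) {
  is.list(res) && all(c("temp_Gmat", "temp_Gmat_record") %in% names(res))
}

# Stand-ins for the two documented outcomes (contents are made up).
res_ok <- list(
  temp_Gmat = matrix(c(0, 1, 2, 1), nrow = 2,
                     dimnames = list(NULL, c("rs2", "rs3"))),
  temp_Gmat_record = data.frame(RS = c("rs2", "rs3"), BP = c(200, 300))
)
res_empty <- 1

has_matched_snps(res_ok)
has_matched_snps(res_empty)
```

Checking `is.list` first avoids calling `names` logic on the bare number and makes the "nothing to return" case explicit in downstream code.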


ryanrsun/LungCancerAssoc documentation built on May 24, 2019, 7:26 p.m.