#' Fit Cox survival to all variants from a standard IMPUTE2 output after
#' genotype imputation
#'
#' Performs survival analysis using Cox proportional hazard models on imputed
#' genetic data from GDS files.
#'
#' @param gdsfile path to .gds file. Location of the .gds file should also contain
#' .snp.rdata and .scan.rdata files.
#' @param covariate.file data.frame comprising phenotype information, all
#' covariates to be added in the model must be numeric.
#' @param id.column character giving the name of the ID column
#' in covariate.file.
#' @param sample.ids character vector of sample IDs to keep in
#' survival analysis
#' @param time.to.event character of column name in covariate.file that
#' represents the time interval of interest in the analysis
#' @param event character of column name in covariate.file that represents
#' the event of interest to be included in the analysis
#' @param covariates character vector with exact names of columns in
#' covariate.file to include in analysis
#' @param inter.term character string giving the column name of the covariate
#' that will be added to the interaction term with
#' SNP (e.g. \code{term*SNP}). See details.
#' @param print.covs character string of either \code{"only"}, \code{"all"}
#' or \code{"some"}, defining which covariate statistics should be printed to
#' the output. See details.
#' @param out.file character of output file name (do not include extension)
#' @param chunk.size integer number of variants to process per thread
#' @param maf.filter numeric to filter minor allele frequency
#' (i.e. choosing 0.05 means filtering MAF>0.05). User can set this to
#' \code{NULL} if no filtering is preffered. Default is 0.05.
#' @param flip.dosage logical to flip which allele the dosage was
#' calculated on, default \code{flip.dosage=TRUE}
#' @param verbose logical for messages that describe which part of the
#' analysis is currently being run
#' @param clusterObj A cluster object that can be used with the
#' \code{parApply} function. See details.
#'
#' @details
#' Testing for SNP-covariate interactions:
#' User can define the column name of the covariate that will be included in
#' the interaction term.
#' For example, for given covariates \code{a} and \code{b}, where \code{c} is
#' defined as the \code{inter.term} the model will be:
#' \code{~ a + b + c + SNP + c*SNP}.
#'
#' Printing results of other covariates:
#' \code{print.covs} argument controls the number of covariates will be printed
#' as output. The function is set to \code{only}
#' by default and will only print the SNP or if an interaction term is given,
#' the results of the interaction
#' term (e.g. \code{SNP*covariate}). Whereas, \code{all} will print results
#' (coef, se.coef, p.value etc) of all covariates included in the model.
#' \code{some} is only applicable if an interaction term is given and will
#' print the results for SNP, covariate tested for interaction and the
#' interaction term. User should be mindful about using the \code{all} option,
#' as it will likely slow down the analysis and will increase
#' the output file size.
#'
#' User defined parallelization:
#' This function uses \code{parApply} from \code{parallel} package to fit
#' models to SNPs in parallel.
#' User is not required to set any options for the parallelization.
#' However, advanced users who wish to optimize it, can provide a
#' cluster object generated by \code{makeCluster} family of functions that
#' suits their need and platform.
#'
#' @return
#' Saves two text files directly to disk:
#' \code{.coxph} extension containing CoxPH survival analysis results.
#' \code{.snps_removed} extension containing SNPs that were removed
#' due to low variance or user-defined thresholds.
#'
#' @examples
#' gdsfile <- system.file(package="gwasurvivr",
#' "extdata",
#' "gds_example.gds")
#' covariate.file <- system.file(package="gwasurvivr",
#' "extdata",
#' "simulated_pheno.txt")
#' covariate.file <- read.table(covariate.file,
#' sep=" ",
#' header=TRUE,
#' stringsAsFactors = FALSE)
#' covariate.file$SexFemale <- ifelse(covariate.file$sex=="female", 1L, 0L)
#' sample.ids <- covariate.file[covariate.file$group=="experimental",]$ID_2
#' gdsCoxSurv(gdsfile=gdsfile,
#' covariate.file=covariate.file,
#' id.column="ID_2",
#' sample.ids=sample.ids,
#' time.to.event="time",
#' event="event",
#' covariates=c("age", "SexFemale", "DrugTxYes"),
#' inter.term=NULL,
#' print.covs="only",
#' out.file="impute_example",
#' chunk.size=50,
#' maf.filter=0.005,
#' flip.dosage=TRUE,
#' verbose=TRUE,
#' clusterObj=NULL)
#'
#' @importFrom survival Surv coxph.fit
#' @importFrom matrixStats rowMeans2 rowVars rowSds
#' @importFrom SummarizedExperiment rowRanges
#' @importFrom utils write.table
#' @importFrom stats pnorm rnorm setNames complete.cases
#' @import parallel
#' @import GWASTools
#'
#' @export
gdsCoxSurv <- function(gdsfile,
covariate.file,
id.column,
sample.ids = NULL,
time.to.event,
event,
covariates,
inter.term = NULL,
print.covs = "only",
out.file,
chunk.size = 5000,
maf.filter = 0.05,
exclude.snps=NULL,
flip.dosage = TRUE,
verbose = TRUE,
clusterObj = NULL)
{
coxSurv(createGdsCoxSurv(gdsfile,
covariate.file,
id.column,
sample.ids,
time.to.event,
event,
covariates,
inter.term,
print.covs,
out.file,
chunk.size,
maf.filter,
exclude.snps,
flip.dosage,
verbose,
clusterObj))
}
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.