ctwas | R Documentation |
Causal inference for TWAS
ctwas(
pgenfs,
exprfs,
Y,
ld_regions = c("EUR", "ASN", "AFR"),
ld_regions_version = c("b37", "b38"),
ld_regions_custom = NULL,
thin = 1,
prob_single = 0.8,
max_snp_region = Inf,
rerun_gene_PIP = 0.8,
niter1 = 3,
niter2 = 30,
L = 5,
group_prior = NULL,
group_prior_var = NULL,
estimate_group_prior = T,
estimate_group_prior_var = T,
use_null_weight = T,
coverage = 0.95,
standardize = T,
ncore = 1,
outputdir = getwd(),
outname = NULL,
logfile = NULL
)
pgenfs |
A character vector of .pgen or .bed files, one file per chromosome, ordered from chromosome 1 to 22; the vector must therefore have length 22. If .pgen files are given, the corresponding .pvar and .psam files are assumed to be present in the same directory. If .bed files are given, the corresponding .bim and .fam files are assumed to be present in the same directory. |
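As a minimal sketch of how such a vector might be assembled: the directory name `genotypes` and the naming scheme `chr<N>.pgen` are hypothetical, and the assumption that companion files share the same prefix follows the usual PLINK 2 convention rather than anything stated here.

```r
# Hypothetical layout: one .pgen file per chromosome, chr1.pgen ... chr22.pgen
geno_dir <- "genotypes"  # assumed directory name
pgenfs <- file.path(geno_dir, sprintf("chr%d.pgen", 1:22))

# The vector must have exactly 22 entries, ordered chromosome 1 to 22
stopifnot(length(pgenfs) == 22)

# Companion files expected next to each .pgen file (same prefix is assumed)
pvarfs <- sub("\\.pgen$", ".pvar", pgenfs)
psamfs <- sub("\\.pgen$", ".psam", pgenfs)

# Report any genotype or companion files that are missing on disk
missing_files <- Filter(function(f) !file.exists(f), c(pgenfs, pvarfs, psamfs))
if (length(missing_files) > 0) {
  message(length(missing_files), " expected file(s) not found")
}
```

Checking for the companion files up front gives a clearer error than a failure partway through a long run.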
exprfs |
A character vector of '.expr' or '.expr.gz' files, one file per chromosome, ordered from chromosome 1 to 22; the vector must therefore have length 22. A '.expr.gz' file is a gzip-compressed '.expr' file. A '.expr' file is a matrix of imputed expression values, with one row per sample and one column per gene. Its sample order is the same as in the files provided by pgenfs. The corresponding '.exprvar' files are also assumed to be present in the same directory. '.exprvar' files are tab-delimited text files, with columns:
The rows of each '.exprvar' file should be in the same order as the columns of the corresponding '.expr' file. |
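The expression file vector can be sketched the same way. The directory `expression` and the file names are hypothetical, and deriving the '.exprvar' name by replacing the '.expr' or '.expr.gz' suffix is an assumption about the naming convention, not something this page confirms.

```r
# Hypothetical layout: one gzip-compressed expression file per chromosome
expr_dir <- "expression"  # assumed directory name
exprfs <- file.path(expr_dir, sprintf("chr%d.expr.gz", 1:22))
stopifnot(length(exprfs) == 22)

# Assumed naming: swap the '.expr' or '.expr.gz' suffix for '.exprvar'
exprvarfs <- sub("\\.expr(\\.gz)?$", ".exprvar", exprfs)
```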
Y |
A vector of length n (one value per sample) containing the phenotype, in the same sample order as the files provided by pgenfs (as defined in the .psam or .fam files). |
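Because the phenotype must follow the genotype sample order, it may help to reorder it explicitly before calling the function. The sketch below uses toy data; `geno_ids` stands in for the sample IDs read from a .psam or .fam file, and the `pheno` table with its column names is hypothetical.

```r
# Toy stand-in for the sample IDs of a .psam/.fam file, in genotype order
geno_ids <- c("S3", "S1", "S2")

# Toy phenotype table in a different order (hypothetical columns: id, pheno)
pheno <- data.frame(id = c("S1", "S2", "S3"), pheno = c(0.5, -1.2, 2.0))

# Reorder phenotypes so that Y[i] belongs to the i-th genotype sample
idx <- match(geno_ids, pheno$id)
stopifnot(!anyNA(idx))  # every genotyped sample needs a phenotype
Y <- pheno$pheno[idx]
```

Here `match()` returns, for each genotype sample, the row of `pheno` holding its value, so `Y` ends up in genotype order regardless of how the phenotype table was sorted.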
ld_regions |
A string representing the population to use for defining
LD regions. These LD regions were previously defined by ldetect. The user can also
provide custom LD regions matching genotype data, see
|
ld_regions_version |
A string representing the genome reference build ("b37", "b38") to use for defining
LD regions. See |
ld_regions_custom |
A bed format file defining LD regions. The default
is |
thin |
The proportion of SNPs to be used for the parameter estimation and initial fine
mapping steps. Smaller |
prob_single |
Blocks with probability greater than |
max_snp_region |
Inf or integer. Maximum number of SNPs in a region. The default is Inf (no limit). This can be useful if a region contains many SNPs and there is not enough memory to run the program. This applies only to the final rerun step, in which susie is rerun with the full set of SNPs for regions with strong gene signals. |
rerun_gene_PIP |
if thin <1, will rerun blocks with the max gene PIP
> |
niter1 |
the number of iterations of the E-M algorithm to perform during the initial parameter estimation step |
niter2 |
the number of iterations of the E-M algorithm to perform during the complete parameter estimation step |
L |
the number of effects for susie during the fine mapping steps |
group_prior |
a vector of two prior inclusion probabilities for SNPs and genes. This is ignored
if |
group_prior_var |
a vector of two prior variances for SNPs and gene effects. This is ignored
if |
estimate_group_prior |
TRUE/FALSE. If TRUE, the prior inclusion probabilities for SNPs and genes are estimated
using the data. If FALSE, |
estimate_group_prior_var |
TRUE/FALSE. If TRUE, the prior variances for SNPs and genes are estimated
using the data. If FALSE, |
use_null_weight |
TRUE/FALSE. If TRUE, allow for a probability of no effect in susie |
coverage |
A number between 0 and 1 specifying the “coverage” of the estimated confidence sets |
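To illustrate what coverage means in general terms, the toy sketch below builds the smallest set of variables whose posterior probabilities sum to at least the requested coverage (such sets are usually called credible sets in the fine-mapping literature). The probabilities are made up, and this is a simplified illustration of the idea, not the procedure the package or susie actually runs.

```r
# Toy posterior probabilities for one effect over five variables (sum to 1)
alpha <- c(A = 0.04, B = 0.60, C = 0.01, D = 0.33, E = 0.02)
coverage <- 0.95

# Rank variables by decreasing probability and keep the smallest prefix
# whose cumulative probability reaches the requested coverage
ord <- order(alpha, decreasing = TRUE)
n_keep <- which(cumsum(alpha[ord]) >= coverage)[1]
cs <- names(alpha)[ord[seq_len(n_keep)]]
```

Raising `coverage` can only add variables to the set, so higher coverage trades resolution for a greater chance that the set contains the causal variable.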
standardize |
TRUE/FALSE. If TRUE, all variables are standardized to unit variance |
ncore |
The number of cores used to parallelize susie over regions |
outputdir |
a string, the directory to store output |
outname |
a string, the output name |
logfile |
The log file. If NULL, log information is printed to the screen. |
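Putting the arguments together, a call might look like the sketch below. All paths, the phenotype file `pheno.txt`, and the chosen values for `thin`, `ncore`, `outputdir` and `outname` are hypothetical; the sketch also assumes the function is exported from a package named ctwas. The call is guarded so that the code does nothing when the package or the input files are unavailable.

```r
# Hypothetical inputs; paths and naming scheme are assumptions
pgenfs <- file.path("genotypes", sprintf("chr%d.pgen", 1:22))
exprfs <- file.path("expression", sprintf("chr%d.expr.gz", 1:22))

# Only attempt the run if the package and all input files are available
inputs_ready <- requireNamespace("ctwas", quietly = TRUE) &&
  all(file.exists(c(pgenfs, exprfs, "pheno.txt")))

if (inputs_ready) {
  # Hypothetical phenotype file: one numeric value per line,
  # already in the genotype sample order
  Y <- scan("pheno.txt", quiet = TRUE)

  ctwas::ctwas(
    pgenfs = pgenfs,
    exprfs = exprfs,
    Y = Y,
    ld_regions = "EUR",          # population used to define LD regions
    ld_regions_version = "b37",  # genome reference build
    thin = 0.1,                  # use 10% of SNPs in the initial steps
    ncore = 2,
    outputdir = "ctwas_out",
    outname = "example"
  )
}
```

Arguments not listed keep the defaults shown in the usage section.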