This function estimates gene-wise multi-SNP ACME models.
It requires the output of multithreadACME
to know all the local SNPs for each gene.
It then performs forward step-wise selection of the local SNPs,
based on the adjusted-R-squared at each step.
The arguments closely mirror those of multithreadACME;
their values must correspond to a set of output files from that function
(as well as the input files which originally produced the output).
The function saves its results in filematrix format,
similar to the output of multithreadACME.
Note that each multi-SNP model will contain at least one SNP, even if that initial SNP was not significant under the single-SNP models. This initial SNP will be the one with the highest adjusted-R-squared value among the single-SNP models. However, after the initial SNP, further SNPs are added only if the combined model's adjusted-R-squared is greater than that from the previous combined model.
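The selection rule described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: it uses ordinary least squares via lm() as a stand-in for the ACME model, the data are simulated, and the helper names adj_r2 and forward_select are hypothetical and not part of the package.

```r
# Illustrative stand-in: ordinary least squares via lm(), NOT the ACME model.
set.seed(1)
n <- 50
snps <- matrix(rbinom(n * 5, size = 2, prob = 0.3), nrow = n,
               dimnames = list(NULL, paste0("snp", 1:5)))
expr <- 1 + 0.8 * snps[, 2] + 0.5 * snps[, 4] + rnorm(n)

# Adjusted R-squared of the model containing the SNP columns in `cols`
adj_r2 <- function(cols) {
  fit <- lm(expr ~ snps[, cols, drop = FALSE])
  summary(fit)$adj.r.squared
}

# Hypothetical helper: forward step-wise selection by adjusted R-squared
forward_select <- function(candidates) {
  chosen <- integer(0)
  best <- -Inf
  path <- numeric(0)
  repeat {
    remaining <- setdiff(candidates, chosen)
    if (length(remaining) == 0) break
    scores <- vapply(remaining, function(j) adj_r2(c(chosen, j)), numeric(1))
    # The first SNP is always accepted; later SNPs must improve the score
    if (length(chosen) > 0 && max(scores) <= best) break
    best <- max(scores)
    chosen <- c(chosen, remaining[which.max(scores)])
    path <- c(path, best)
  }
  list(snps = chosen, forward_adjR2 = path)
}

result <- forward_select(seq_len(ncol(snps)))
print(result)
```

In this sketch the recorded adjusted R-squared values are strictly increasing after the first step, which mirrors the stopping rule stated in the paragraph above.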
multisnpACME(
    genefm = "gene",
    snpsfm = "snps",
    glocfm = "gene_loc",
    slocfm = "snps_loc",
    cvrtfm = "cvrt",
    acmefm = "ACME",
    workdir = ".",
    genecap = Inf,
    verbose = TRUE)
genefm
    Name of the filematrix with gene expression data. One column per gene and one row per sample.
snpsfm
    Name of the filematrix with SNP data. One column per SNP and one row per sample.
glocfm
    Name of the filematrix with gene location information.
    Must contain two columns,
    the first with the gene start location and the second with the gene end.
    The locations must be stored as numbers,
    and the locations for different chromosomes must differ greatly.
    We suggest the encoding
    (location = 1e9 * chromosome + position_on_chromosome).
    The rows must match the columns of the
slocfm
    Name of the filematrix with SNP locations.
    Must have one column and rows matching columns of
cvrtfm
    Name of the filematrix with covariates. Must not include a constant (it is added automatically). One column per covariate and one row per sample.
acmefm
    Name of the filematrix in which the ACME estimates are stored.
    A new filematrix with the name
workdir
    Directory where the input filematrices are located.
genecap
    Number of genes to estimate multi-SNP models for.
verbose
    Set to
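The location encoding suggested for glocfm can be illustrated as follows. This is a minimal sketch with made-up chromosome numbers and positions; the variable names are hypothetical. The values are kept as doubles because 1e9 * chromosome exceeds the range of R integers from chromosome 3 onward.

```r
# Hypothetical chromosome numbers and within-chromosome positions
chromosome <- c(1, 1, 2, 2, 10)
position   <- c(15000, 2500000, 800, 1200000, 42)

# Suggested encoding: offsets of 1e9 keep different chromosomes far apart
location <- 1e9 * chromosome + position

# Locations are stored as numbers and are in increasing order
stopifnot(is.numeric(location), !is.unsorted(location))

# The encoding can be reversed with integer division and remainder
decoded_chr <- location %/% 1e9
decoded_pos <- location %% 1e9
print(data.frame(location, decoded_chr, decoded_pos))
```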
The function creates a filematrix named
paste0(acmefm, "_multiSNP")
with 4 rows
and one column for each SNP included in a multi-SNP model.
If a SNP is included in more than one multi-SNP model,
it will appear multiple times in the matrix
(but with different beta estimates, corresponding to the particular models).
The rows contain gene-SNP ids,
step-wise adjusted-R-squared statistics,
and beta estimates:
geneid
    The gene id - the column number for the gene
    in the
snp_id
    The SNP id - the column number for the SNP
    in the
beta0
    The beta0 estimate in the full model.
beta
    The beta estimate for the SNP in the full model (after all chosen SNPs have been added).
forward_adjR2
    The step-wise adjusted-R-squared, computed for the full model when the SNP was added.
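To make the layout concrete, the sketch below builds a small ordinary matrix with the row names listed above and groups its columns by gene. All values are invented for illustration; it assumes the result has already been loaded into a regular R matrix and does not touch any real output files.

```r
# Hypothetical values laid out like the result described above:
# one column per (gene, SNP) pair included in a multi-SNP model
res <- rbind(
  geneid        = c(1, 1, 2, 3, 3),
  snp_id        = c(4, 7, 7, 12, 15),
  beta0         = c(2.1, 2.1, 0.9, 1.5, 1.5),
  beta          = c(0.30, -0.12, 0.25, 0.40, 0.08),
  forward_adjR2 = c(0.21, 0.26, 0.15, 0.33, 0.35))

# One data frame per gene, with one row per selected SNP
models <- split(as.data.frame(t(res)), res["geneid", ])

# Number of SNPs selected for each gene
snps_per_gene <- vapply(models, nrow, integer(1))

# SNPs that appear in more than one multi-SNP model
shared <- unique(res["snp_id", duplicated(res["snp_id", ])])

print(snps_per_gene)
print(shared)
```

Here SNP 7 appears in the models for genes 1 and 2, each time with its own beta estimate.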
The rows of the genefm, snpsfm, and cvrtfm filematrices must match.
The SNPs must have increasing locations.
Andrey A Shabalin andrey.shabalin@gmail.com, John Palowitch
The manuscript is available at: http://onlinelibrary.wiley.com/doi/10.1111/biom.12810/full
For package overview and code examples see the package vignette via:
browseVignettes("ACMEeqtl")
or
RShowDoc("doc/ACMEeqtl.html", "html", "ACMEeqtl")
# Load the package (filematrix provides fm.open)
library(ACMEeqtl)
library(filematrix)

# First we generate an eQTL dataset in filematrix format
tempdirectory = tempdir()
z = create_artificial_data(
    nsample = 50,
    ngene = 11,
    nsnp = 51,
    ncvrt = 1,
    minMAF = 0.2,
    saveDir = tempdirectory,
    returnData = FALSE,
    savefmat = TRUE,
    savetxt = FALSE,
    verbose = FALSE)

# Then we run multithreadACME to obtain single-SNP estimates.
# In this example, we use a single CPU core (thread)
# for testing of all gene-SNP pairs within 10,000,000 bp.
multithreadACME(
    genefm = "gene",
    snpsfm = "snps",
    glocfm = "gene_loc",
    slocfm = "snps_loc",
    cvrtfm = "cvrt",
    acmefm = "ACME",
    cisdist = 10e+06,
    threads = 1, # Use more for a faster run
    workdir = file.path(tempdirectory, "filematrices"),
    verbose = FALSE)

# Now the filematrix `ACME` holds estimates for all local gene-SNP pairs.
# Show the first ten of them, one gene-SNP pair per row.
fm = fm.open(file.path(tempdirectory, "filematrices", "ACME"))
TenResults = fm[,1:10]
rownames(TenResults) = rownames(fm)
close(fm)
show(t(TenResults))

# Now we can estimate multi-SNP ACME models for each gene:
multisnpACME(
    genefm = "gene",
    snpsfm = "snps",
    glocfm = "gene_loc",
    slocfm = "snps_loc",
    cvrtfm = "cvrt",
    acmefm = "ACME",
    workdir = file.path(tempdirectory, "filematrices"),
    genecap = Inf,
    verbose = TRUE)