View source: R/optimize_functions.R
This function searches for the optimal parameters used in SINTER. The goal is to find the parameters that maximize the average of the p-values from the neighbor_test function.
predict_opt(atac_data, expr_data, DNase_train, RNA_train,
  num_predictor = c(25, 25, 30), cluster_scale = c(10, 20, 50),
  k_range = c(20:29), sigma_range = c(0.01, 1), k_in = 20,
  sigma_in = 0.1, dim = 3, dist_scale_in = 10, subsample = FALSE,
  MNN_opt = TRUE, fast = FALSE, MNN_ref = "scATAC", tol_er = 0.001,
  ncore = 10, seed = 12345)
atac_data | scATAC-seq data for matching.
expr_data | scRNA-seq data for matching.
DNase_train | ENCODE cluster features from DNase-seq data for building the regression model.
RNA_train | Gene expression from ENCODE RNA-seq data for building the regression model.
num_predictor | Search space for the number of predictors used in the regression model.
cluster_scale | Search space for the scale that determines the number of gene clusters.
k_range | Search space for k, the number of mutual nearest neighbors in MNN. Used when MNN_opt == TRUE.
sigma_range | Search space for sigma, the bandwidth of the Gaussian smoothing kernel used to compute the correction vector. Used when MNN_opt == TRUE.
k_in | Fixed value of k, the number of mutual nearest neighbors in MNN. Used when MNN_opt != TRUE.
sigma_in | Fixed value of sigma, the bandwidth of the Gaussian smoothing kernel used to compute the correction vector. Used when MNN_opt != TRUE.
dim | Number of dimensions used for matching the single cells, for example the number of principal components.
dist_scale_in | Scale used to define the radius of the region for testing.
subsample | A percentage value that controls whether the parameter search is run on a subset of cells instead of all cells. Set subsample = FALSE to use all cells.
MNN_opt | A flag that determines whether the parameter search is performed for MNN.
fast | A flag indicating whether to use a fast neighbor_test.
MNN_ref | Which data type is used as the reference in MNN. Select from "scATAC" and "scRNA".
tol_er | The desired accuracy for the optimize function.
ncore | Number of CPU cores used for parallel processing. Use ncore = 1 to run the function without parallel processing.
seed | The seed used for subsampling when subsample != FALSE.
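The arguments above describe a discrete search space for k (k_range), a continuous interval for sigma (sigma_range), and a tolerance (tol_er) for optimize. The sketch below shows one plausible way such a search could be arranged with base R's stats::optimize. It is not taken from the SINTER source: toy_objective is a hypothetical stand-in for the average neighbor_test p-value, and the nesting of the two searches is an assumption.

```r
# Hypothetical stand-in for the average neighbor_test p-value.
# This is NOT the SINTER objective; it only gives the search something to maximize.
toy_objective <- function(k, sigma) {
  -(sigma - 0.3)^2 - ((k - 24) / 10)^2
}

k_range     <- 20:29        # discrete candidates for k
sigma_range <- c(0.01, 1)   # continuous interval for sigma
tol_er      <- 0.001        # accuracy passed to optimize() as `tol`

# For each candidate k, search sigma over the interval with stats::optimize.
fits <- lapply(k_range, function(k) {
  opt <- optimize(function(s) toy_objective(k, s),
                  interval = sigma_range, maximum = TRUE, tol = tol_er)
  data.frame(k = k, sigma = opt$maximum, objective = opt$objective)
})
fits <- do.call(rbind, fits)

# Keep the (k, sigma) pair with the largest objective value.
best      <- fits[which.max(fits$objective), ]
k_opt     <- best$k
sigma_opt <- best$sigma
```

With a real objective, each evaluation would be far more expensive than this toy function, which is presumably why the function exposes subsample, fast, and ncore.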
num_predictor_opt | The optimal value for the number of predictors.
cluster_scale_opt | The optimal value for the cluster scale.
k_opt | The optimal value for k.
sigma_opt | The optimal value for sigma.
max_obj | The average p-value obtained with the optimal parameters.
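As a rough illustration of how the returned values might be consumed, the sketch below assumes the result is a named list with the elements listed above. That structure is an assumption, and the numbers in the mock object are made up for illustration, not output from SINTER.

```r
# Mock of the documented return values (assumed to be a named list).
# All numbers here are invented for illustration only.
result_opt <- list(num_predictor_opt = 25, cluster_scale_opt = 20,
                   k_opt = 24, sigma_opt = 0.3, max_obj = 0.42)

# Reuse the tuned MNN values as fixed settings for a later call
# that skips the MNN search (MNN_opt = FALSE uses k_in and sigma_in).
fixed_args <- list(k_in     = result_opt$k_opt,
                   sigma_in = result_opt$sigma_opt,
                   MNN_opt  = FALSE)
str(fixed_args)
```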
## Not run:
result_opt <- predict_opt(atac_data, expr_data, DNase_train, RNA_train,
  num_predictor = c(25, 25, 30), cluster_scale = c(10, 20, 50),
  k_range = c(20:29), sigma_range = c(0.01, 1),
  k_in = 20, sigma_in = 0.1, dim = 3, dist_scale_in = 10, subsample = FALSE,
  MNN_opt = TRUE, MNN_ref = "scATAC", tol_er = 0.001, ncore = 10, seed = 12345)
## End(Not run)