v.pudms: test a PU fit on a test data set
In RomeroLab/pudms: Positive-Unlabeled Learning for the Analysis of Deep Mutational Scanning Datasets

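Test a PU fit on a test data set. The data are split into nfolds subsamples, models are fitted on the training splits for each candidate py1, and py1 is selected by predictive performance (AUC) on the held-out split.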

v.pudms(
  protein_dat,
  py1 = NULL,
  nhyperparam = 10,
  nfolds = 5,
  test_idx = 1:nfolds,
  seed = round(runif(1, min = 1, max = 1000)),
  order = 1,
  refstate = NULL,
  verbose = T,
  nobs_thresh = 10,
  lambda = 0,
  pvalue = FALSE,
  n_eff_prop = 1,
  intercept = F,
  maxit = 1000,
  eps = 0.001,
  inner_eps = 0.01,
  initial_coef = NULL,
  p.adjust.method = "BH",
  tol = 1e-05,
  nCores = 1,
  full.fit = FALSE,
  full.fit.pvalue = FALSE,
  outfile = NULL
)
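
Because the default seed is drawn at random by `runif`, two calls with default arguments will generally not use the same seed. The following is a minimal sketch of a wrapper that fixes it. It assumes that the pudms package is installed, that `v.pudms` is exported from it, and that `protein_dat` is already in the format the function expects; `run_cv` is a hypothetical name, not part of the package.

```r
# Sketch only: run_cv is a hypothetical wrapper, not part of pudms.
# Assumes protein_dat is already in the format v.pudms expects.
run_cv <- function(protein_dat, seed = 1) {
  if (!requireNamespace("pudms", quietly = TRUE)) {
    stop("the pudms package is required for this sketch")
  }
  pudms::v.pudms(
    protein_dat = protein_dat,
    py1 = NULL,        # NULL: search a sequence of candidate py1 values
    nhyperparam = 10,  # length of that py1 sequence
    nfolds = 5,        # 5 subsamples: 4/5 for training, 1/5 for testing
    seed = seed,       # fixed seed for reproducibility
    nCores = 1
  )
}
```

All other arguments are left at the defaults listed in the usage above.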

`protein_dat`	input data. A data table containing (sequence, labeled, unlabeled, seqId)
`py1`	a numeric value, a numeric vector or NULL; the prevalence of positives in the unlabeled data. If length(py1) > 1, the optimal py1 is chosen based on AUC values on a test data set. If NULL (default), a sequence of nhyperparam py1 values, ranging from 0.001 to 0.5 and interpolated on a log scale, is considered.
`nhyperparam`	an integer; the length of the py1 sequence when py1 is NULL
`nfolds`	the number of subsamples (folds). For each fold, a fraction (nfolds - 1)/nfolds of the data is used for training and the remaining 1/nfolds for testing.
`test_idx`	a vector of indices of the cross-validation models to fit. The default is to fit a model for every cross-validation fold.
`seed`	a seed number for reproducibility
`order`	an integer; 1 = main effects, 2 = main effects + pairwise effects
`refstate`	a character used as the common reference state; the default is to use the most frequent amino acid at each position as the reference state for that position.
`verbose`	a logical value. The default is TRUE
`nobs_thresh`	the minimum number of mutations required per position
`lambda`	the L1 penalty
`pvalue`	a logical value; if TRUE, p-values based on the asymptotic distribution are obtained
`n_eff_prop`	proportion of an effective sample size
`intercept`	a logical value; if TRUE, an estimated intercept is reported together with other coefficients
`maxit`	maximum number of iterations
`eps`	convergence threshold for the outer loop
`inner_eps`	convergence threshold for the inner loop
`initial_coef`	a vector giving the initial point from which the PUlasso algorithm starts.
`p.adjust.method`	the method used to adjust for multiple comparisons
`tol`	NULL or a numeric value; if the estimated ROC curve <= y + tol, the estimated ROC curve is judged to be contained by the maximal curve. The usage signature sets tol = 1e-05; if NULL, tol is set to one standard deviation of the length(test_idx) ROC curves at each x value of the estimated ROC curve.
`nCores`	the number of threads for computing.
`full.fit`	a logical value; if TRUE, the model is also fitted on the full data set at the chosen py1.
`full.fit.pvalue`	a logical value; if TRUE, p-values for the full fit are returned
`outfile`	NULL or a string; if a string is provided, the output is exported to the working directory under that name.
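
The `py1`, `nfolds` and `tol` descriptions above can be illustrated in base R. This is a sketch of one reading of those descriptions, not the package's internal code: the helper names (`py1_grid`, `assign_folds`, `is_contained`) are hypothetical, and the package may construct its grid, folds and containment check differently.

```r
# Hypothetical helper: nhyperparam values from 0.001 to 0.5,
# equally spaced on a log scale.
py1_grid <- function(nhyperparam = 10, lower = 0.001, upper = 0.5) {
  exp(seq(log(lower), log(upper), length.out = nhyperparam))
}

# Hypothetical helper: randomly assign each of n rows to one of nfolds folds.
assign_folds <- function(n, nfolds = 5, seed = 1) {
  set.seed(seed)
  sample(rep(seq_len(nfolds), length.out = n))
}

# Hypothetical helper: TRUE if the estimated curve lies on or below the
# maximal curve at every point, up to the tolerance tol.
is_contained <- function(est_tpr, max_tpr, tol = 1e-05) {
  all(est_tpr <= max_tpr + tol)
}

grid <- py1_grid(10)
folds <- assign_folds(100, nfolds = 5)

# For fold 1, rows assigned to fold 1 are the test set; the rest train.
test_rows <- which(folds == 1)
train_rows <- which(folds != 1)
```

With 100 rows and 5 folds, each fold holds 20 test rows and 80 training rows, which matches the (nfolds - 1)/nfolds training fraction described for `nfolds`.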

A list containing:
`v.dmsfit`	all fits using training/test splits
`roc_curves`	the average ROC curve at each py1
`dmsfit`	the pudms.fit using the full data set at the selected py1
`folds`	test/training split information
`py1`	the sequence of py1 values used for searching
`py1.opt`	the selected py1 value, based on the predictive performance of the models
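
As a sketch of how the returned list might be inspected, the helper below pulls out the candidate py1 values and the selected one. Only the element names `py1` and `py1.opt` come from the description above; `summarize_cv` is a hypothetical name, and the code assumes `py1` is a numeric vector and `py1.opt` a single number.

```r
# Hypothetical helper: summarise the py1 search from a v.pudms result.
# Assumes fit$py1 is a numeric vector and fit$py1.opt a single number.
summarize_cv <- function(fit) {
  list(
    py1_candidates = fit$py1,
    py1_selected = fit$py1.opt,
    # position of the candidate closest to the selected value
    selected_index = which.min(abs(fit$py1 - fit$py1.opt))
  )
}

# Usage, given a result from v.pudms:
# fit <- v.pudms(protein_dat, seed = 1)
# summarize_cv(fit)
```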

RomeroLab/pudms documentation built on Jan. 2, 2021, 5:10 a.m.
