UPC_Generic: Generic function to apply Universal exPression Codes (UPC)...

Description Usage Arguments Value Note Author(s) References

View source: R/UPC_Models.R


This function can be used to derive UPC values to any type of gene-expression data. It requires the user to specify expression values for many genes (or transcripts, exons, or probes). And optionally, the user can specify the length and/or GC content (proportion of G or C bases) for the corresponding genomic region (e.g., gene). If these values are specified, the UPC algorithm corrects for biases resulting from length or GC content.


UPC_Generic(expressionValues, lengths = NULL, gcContent = NULL, modelType = "nn",
  convThreshold = 0.001, higherValuesIndicateHigherExpression = TRUE, verbose = TRUE)



A numeric vector that indicates expression levels. This is the only required parameter.


An integer vector that indicates the length of the genomic region that corresponds to each expression value. The order of these values must be identical to the order of the expression values. This parameter is optional.


A numeric vector that indicates the proportion of bases in the genomic region that are a G or C nucleotide. The order of these values must be identical to the order of the expression values. This parameter is optional.


Various models can be used for the mixture model to differentiate between active and inactive probes. The default is the normal-normal model (“nn”), which uses the normal distribution. Other available options are log-normal (“ln”), negative-binomial (“nb”), and normal-normal Bayes (“nn_bayes”).


Convergence threshold that determines at what point the mixture-model parameters have stabilized. The default value should be suitable in most cases. However, if the model fails to converge (or converges too quickly), it may be useful to adjust this value. Optional.


In most cases, higher expression values indicate relatively high expression. However, if higher values indicate relatively low expresssion, this parameter can be used to indicate such. Accordingly, UPC values closer to one will indicate higher expression, and UPC values closer to zero will indicate the opposite. Optional.


Whether to output more detailed status information as processing occurs. Default is TRUE.


An numeric vector with UPC values whose order corresponds with the order of the input values.


The modelType parameter indicates which type of mixture model to use for UPC transformation. The "nn_bayes" model type is an experimental new approach intended for experiments where a subset of genes are expressed at extreme levels.


Stephen R. Piccolo


Piccolo SR, Withers MR, Francis OE, Bild AH and Johnson WE. Multi-platform single-sample estimates of transcriptional activation. Proceedings of the National Academy of Sciences of the United States of America, 2013, 110:44 17778-17783.

SCAN.UPC documentation built on May 2, 2018, 2:03 a.m.