# Normalizes the empirical distribution of one or more samples to a target distribution

### Description

Normalizes the empirical distribution of one or more samples to a target distribution. After normalization, all samples have the same average empirical density distribution.

### Usage

```
## S3 method for class 'numeric'
normalizeQuantileSpline(x, w=NULL, xTarget, sortTarget=TRUE, robust=TRUE, ...)
## S3 method for class 'matrix'
normalizeQuantileSpline(X, w=NULL, xTarget=NULL, sortTarget=TRUE, robust=TRUE, ...)
## S3 method for class 'list'
normalizeQuantileSpline(X, w=NULL, xTarget=NULL, sortTarget=TRUE, robust=TRUE, ...)
```

### Arguments

`x, X`
A single (

`w`
An optional

`xTarget`
The target empirical distribution as a

`sortTarget`
If

`robust`
If

`...`
Arguments passed to (

### Value

Returns an object of the same type and dimensions as the input.

### Missing values

Both arguments `X` and `xTarget` may contain non-finite values.
These values do not affect the estimation of the normalization function.
Missing values and other non-finite values in `X` remain in the output as is.
No new missing values are introduced.
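
The behaviour described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: it assumes a plain `stats::smooth.spline()` fit of target quantiles against the sorted finite sample values, and the variable names (`ok`, `xs`, `tq`) are purely illustrative.

```r
set.seed(2)

# One sample with 20% missing values, and a sorted target distribution
x <- rnorm(500, mean = 4, sd = 2)
x[sample(length(x), size = 100)] <- NA
target <- sort(rnorm(500, mean = 3, sd = 1))

# Only finite values take part in estimating the normalization function
ok <- is.finite(x)
xs <- sort(x[ok])

# Target quantiles at the same probabilities as the sorted finite values
tq <- quantile(target, probs = ppoints(length(xs)), names = FALSE)

# Smooth mapping from sample quantiles to target quantiles
fit <- smooth.spline(x = xs, y = tq)

# Start from the input so that non-finite values stay where they are
xn <- x
xn[ok] <- predict(fit, x = x[ok])$y
```

In this sketch the positions of the missing values in `xn` are identical to those in `x`, and no finite input value becomes missing.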

### Author(s)

Henrik Bengtsson

### References

[1] H. Bengtsson, R. Irizarry, B. Carvalho, and T. Speed, *Estimation and assessment of raw copy numbers at the single locus level*, Bioinformatics, 2008.

### See Also

The target distribution can be calculated as the average
using `averageQuantile()`.
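
As a rough illustration of what "the average" means here, the following base-R sketch averages the sorted values of several samples rank by rank. It is a simplification under stated assumptions: all samples have the same length and contain no missing values. It does not call `averageQuantile()` and does not reproduce how that function is implemented.

```r
set.seed(3)

# Three samples of equal length, without missing values
X <- cbind(rnorm(200, mean = 3, sd = 1),
           rnorm(200, mean = 4, sd = 2),
           rgamma(200, shape = 2, rate = 1))

# Sort each column, then average across samples at each rank
Xsorted <- apply(X, MARGIN = 2, FUN = sort)
xTarget <- rowMeans(Xsorted)
```

The resulting `xTarget` is already sorted, since it is an average of sorted columns.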

Internally either `robustSmoothSpline` (`robust=TRUE`) or
`smooth.spline` (`robust=FALSE`) is used.

An alternative normalization method that also normalizes the
empirical densities of samples is `normalizeQuantileRank()`.
Unlike this method, it requires that all samples are
based on the exact same set of data points, and it is also more likely
to over-correct in the tails of the distributions.
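
To make the contrast concrete, the following base-R sketch shows the general rank-based idea. It is not the code of `normalizeQuantileRank()`, and it assumes equal-length samples without missing values. Every value is replaced by the target value of the same rank, which is why this kind of approach needs all samples to have the same number of data points.

```r
set.seed(4)

# Two samples of equal length, without missing values
X <- cbind(rnorm(200, mean = 3, sd = 1),
           rgamma(200, shape = 2, rate = 1))

# Target: rank-wise average of the sorted samples
xTarget <- rowMeans(apply(X, MARGIN = 2, FUN = sort))

# Replace each value by the target value that has the same rank
Xn <- apply(X, MARGIN = 2, FUN = function(x) {
  xTarget[rank(x, ties.method = "first")]
})
```

After this step every column of `Xn` contains exactly the values of `xTarget`, only in a different order. Extreme values are therefore forced onto the target's extremes, whereas a smooth-spline mapping is a fitted function rather than an exact substitution.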

### Examples

```
# normalizeQuantileSpline() and the plot helpers are provided by aroma.light
library(aroma.light)

# Simulate three samples with on average 20% missing values
N <- 10000
X <- cbind(rnorm(N, mean=3, sd=1),
           rnorm(N, mean=4, sd=2),
           rgamma(N, shape=2, rate=1))
X[sample(3*N, size=0.20*3*N)] <- NA

# Plot the data
layout(matrix(c(1,0,2:5), ncol=2, byrow=TRUE))
xlim <- range(X, na.rm=TRUE)
plotDensity(X, lwd=2, xlim=xlim, main="The three original distributions")

# Rank-based quantile normalization, for comparison
Xn <- normalizeQuantile(X)
plotDensity(Xn, lwd=2, xlim=xlim, main="The three normalized distributions")
plotXYCurve(X, Xn, xlim=xlim, main="The three normalized distributions")

# Spline-based normalization towards the first normalized sample;
# 'spar' is passed on to the smoothing-spline fit via '...'
Xn2 <- normalizeQuantileSpline(X, xTarget=Xn[,1], spar=0.99)
plotDensity(Xn2, lwd=2, xlim=xlim, main="The three normalized distributions")
plotXYCurve(X, Xn2, xlim=xlim, main="The three normalized distributions")
```