pcv: PCA on automatically selected attributes in high dimensional...


Description

Conducts PCA on the variables with the highest variance in a high-dimensional data matrix.

Usage

pcv(x, cols=5, sites=5000)

Arguments

x

numeric data matrix in variable-major format (rows are variables, columns are observations)

cols

number of principal components to extract

sites

number of attributes to consider

Details

pcv assumes the data are in a numeric matrix in variable-major format, i.e. every row corresponds to a variable, while the columns correspond to the individual observations. This is commonly the case for data from high-throughput experiments, where the number of data points per individual is high (> 10,000), while the batch size is comparatively small (dozens to hundreds). Variables with missing values are disregarded in the selection.

Use t() to transpose individual major data sets beforehand.
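As a small illustration of the expected orientation, the following sketch transposes the numeric columns of the built-in iris data set, which is stored with observations in rows. The object names obs_major and var_major are only illustrative.

```r
# iris stores one observation per row (observation-major layout).
obs_major <- as.matrix(iris[1:4])   # 150 observations x 4 variables

# Transpose so that every row is a variable and every column an observation.
var_major <- t(obs_major)           # 4 variables x 150 observations

dim(var_major)
rownames(var_major)
```

After the transpose, the row names are the variable names, which is a quick way to confirm the layout before passing the matrix on.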

pcv selects the attributes with the highest variance, up to the numbers provided, but limits these numbers to the actual size of the data set at hand.
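The selection described above can be sketched in base R roughly as follows. This is not the preputils source: the function name pcv_sketch is hypothetical, and details such as centering, scaling and the exact way the limits are applied are assumptions made for illustration.

```r
# Hypothetical re-implementation for illustration only; the real pcv may differ.
pcv_sketch <- function(x, cols = 5, sites = 5000) {
  x <- as.matrix(x)

  # Disregard variables (rows) that contain missing values.
  complete <- rowSums(is.na(x)) == 0
  x <- x[complete, , drop = FALSE]

  # Variance of every variable across the observations.
  v <- apply(x, 1, var)

  # Limit the requested numbers to what the data set can actually provide.
  sites <- min(sites, nrow(x))
  cols <- min(cols, sites, ncol(x))

  # Indices of the 'sites' variables with the highest variance.
  top <- order(v, decreasing = TRUE)[seq_len(sites)]

  # prcomp expects observations in rows, so transpose the selected block.
  pca <- prcomp(t(x[top, , drop = FALSE]))

  # Scores: one row per observation, one column per extracted component.
  pca$x[, seq_len(cols), drop = FALSE]
}

scores <- pcv_sketch(t(iris[1:4]), cols = 2)
dim(scores)
```

With only four variables available, the default sites = 5000 is reduced to 4 in this sketch, so all variables enter the PCA.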

This is often used as a first step in the analysis of high-throughput measurements to detect global effects of known batch variables.
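A minimal sketch of such a batch check on simulated data is given below. It uses prcomp from base R as a stand-in for pcv so that it runs without the package; the simulated matrix, the batch factor and the size of the shift are all assumptions chosen for illustration.

```r
set.seed(1)

n_vars <- 200   # variables (rows)
n_obs  <- 40    # observations (columns)

# Known batch variable: first half of the observations in batch A, rest in B.
batch <- factor(rep(c("A", "B"), each = n_obs / 2))

# Variable-major matrix of pure noise ...
x <- matrix(rnorm(n_vars * n_obs), nrow = n_vars)

# ... plus a simulated batch effect on the first 50 variables of batch B.
x[1:50, batch == "B"] <- x[1:50, batch == "B"] + 2

# Component scores with observations in rows (stand-in for the pcv result).
scores <- prcomp(t(x))$x[, 1:2]

# Share of the variance of the first component explained by the batch variable.
r2 <- summary(lm(scores[, 1] ~ batch))$r.squared
r2
```

A value of r2 close to 1 would indicate that the first component mainly separates the batches, which is the kind of global effect this check is meant to reveal.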

Value

matrix with rows corresponding to observations and columns to extracted components. Values denote the scores on the extracted components for the respective observations.

Examples

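    # iris is observation-major, so transpose its four numeric columns first
    pcs <- pcv(t(iris[1:4]), cols = 2)
    # correlate the component scores with the original measurements
    cor(pcs, iris[-5])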

preputils documentation built on July 1, 2020, 5:35 p.m.
