signal: Compute Localized Admixture Signals


View source: R/signal.R

Description

Produces estimates of localized ancestry for each individual.

Usage

signal(table, who = colnames(table), populations, popA = NA, popB = NA, 
	normalize = FALSE, n.pca = 5, PCAonly = FALSE, verbose = TRUE, tol = 0.001,
	n.signal = NULL, window.size = NULL, genmap = NULL)

Arguments

table

matrix of genotype calls, with one row per polymorphism (T rows) and one column per individual (n columns).

who

individuals to include in the analysis.

populations

list containing a vector of IDs for each population in the analysis.

popA

name of ancestral population 1 (used for forming the axes of variation). Must match one of the names in populations.

popB

name of ancestral population 2 (used for forming the axes of variation). Must match one of the names in populations.

normalize

if TRUE, normalize the data matrix. Default is FALSE.

n.pca

number of PCA axes to compute (only the first principal component is used for forming the signals, but additional components may be desired for visualization). Default is 5.

PCAonly

if TRUE, only compute the PCA, do not compute the signals. Default is FALSE.

verbose

if TRUE, print summary to screen. Default is TRUE.

tol

tolerance for normalization of admixture signals (ε in accompanying paper). Default is 0.001.

n.signal

(optional) number of data points in the windowed signal.

window.size

(optional) size of window specified as a proportion of total length; e.g., window.size = 0.01 with a signal of length T = 3000 SNPs generates windows of 0.01 x 3000 = 30 polymorphisms. The value need not be a round number.

genmap

(optional) genetic distance of each genotype call, supplied as a vector of length T. If specified, signals are formulated in terms of genetic distance along the chromosome (rather than physical position).

Details

Applies PCA to genome-wide data using ancestral reference populations. The first eigenvector reflects the population structure. All individuals are then projected onto this axis to form the SNP-level admixture signals. PCA scores are used to estimate the proportion of admixture at the level of individuals (indP) and populations (popP). There is no restriction on the length of the data (number of SNPs), and by default an estimate of localized ancestry is provided at each SNP.
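The projection step can be illustrated with a minimal base R sketch. It is not the adwave implementation: the genotypes are simulated, every object name is hypothetical, prcomp stands in for the package's own PCA, and the rescaling of PC1 scores to a proportion is one simple assumption about how scores might be turned into an admixture estimate.

```r
# Illustrative only: simulated data, not the adwave implementation.
set.seed(1)
n_snps <- 200
freq_a <- runif(n_snps, 0.1, 0.9)  # allele frequencies, hypothetical population A
freq_b <- runif(n_snps, 0.1, 0.9)  # allele frequencies, hypothetical population B

# Simulate genotype calls (0, 1 or 2 copies); SNPs in rows, individuals in columns.
sim_geno <- function(n, freq) {
  matrix(rbinom(n_snps * n, size = 2, prob = freq), nrow = n_snps)
}
geno_a   <- sim_geno(20, freq_a)
geno_b   <- sim_geno(20, freq_b)
geno_mix <- sim_geno(10, (freq_a + freq_b) / 2)  # crude stand-in for admixed individuals

# PCA on the two reference populations only (individuals as observations).
ref <- t(cbind(geno_a, geno_b))
pca <- prcomp(ref, center = TRUE, scale. = FALSE)

# Project all individuals, including the admixed ones, onto the first axis.
all_ind <- t(cbind(geno_a, geno_b, geno_mix))
pc1 <- as.vector(sweep(all_ind, 2, pca$center) %*% pca$rotation[, 1])

# Assumed rescaling: 1 at the population A mean, 0 at the population B mean.
mean_a <- mean(pc1[1:20])
mean_b <- mean(pc1[21:40])
prop_a <- (pc1 - mean_b) / (mean_a - mean_b)

mean(prop_a[41:50])  # average position of the simulated admixed individuals
```

Fitting the axis on the reference populations alone, and only then projecting the remaining individuals, keeps the admixed samples from influencing the direction of PC1.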

Optionally, the signals can be windowed, producing processed signals of length n.signal. The windows may be overlapping or disjoint, with width specified through the window.size option (see examples). If genmap is specified, the signals are formulated in terms of genetic distance along the chromosome (note: this option is not described in the accompanying paper).
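The relationship between n.signal and window.size can be checked with a little arithmetic. The sketch below assumes evenly spaced windows and a plain windowed mean; window_means and coverage are hypothetical helpers written for illustration, and the package's actual window placement and smoothing may differ.

```r
# Illustrative only: evenly spaced windows are an assumption, not adwave's documented behaviour.
n_snps      <- 3000
window_size <- 0.01
width       <- window_size * n_snps  # SNPs per window

# Total window length relative to the sequence length:
# above 1 the windows must overlap, at 1 they can tile the sequence exactly.
coverage <- function(n_signal) n_signal * width / n_snps
coverage(1000)
coverage(100)

# Hypothetical windowed mean at n_signal evenly spaced window starts.
window_means <- function(x, n_signal, window_size) {
  n      <- length(x)
  width  <- max(1, round(window_size * n))
  starts <- round(seq(1, n - width + 1, length.out = n_signal))
  vapply(starts, function(s) mean(x[s:(s + width - 1)]), numeric(1))
}

set.seed(2)
x <- rnorm(n_snps)
smoothed <- window_means(x, n_signal = 100, window_size = 0.01)
length(smoothed)
```

Under these assumptions, n.signal = 100 with window.size = 0.01 gives windows that start 30 SNPs apart and are 30 SNPs wide, so they do not overlap, whereas n.signal = 1000 with the same width gives heavily overlapping windows.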

Value

Returns an object of class adsig, a list with the following components:

call

function call.

date

date of function call.

individuals

individuals for whom projections on the first principal component are calculated.

n.snps

number of polymorphisms in the table.

signals

The admixture signals, output as a T x n data matrix, where n is the number of individuals and T is the number of data points (either the number of polymorphisms if n.signal = NULL or n.signal otherwise).

n.tol

the number of entries replaced by zero in the normalization procedure. This depends on the tolerance, tol.

popP

estimated proportion of admixture for each population.

indP

estimated proportion of admixture for each individual.

pa.ind

columns are principal axes in individual coordinates (n_A + n_B rows, n.pca columns).

pa.snp

columns are principal axes in polymorphism coordinates (T rows, n.pca columns).

G

matrix of quadratic form in individual coordinates.

ev

vector of eigenvalues.

gendist

(only if genmap is specified in input) Vector of genetic distances along the chromosome, length n.signal.

Author(s)

Jean Sanderson

References

Sanderson J, H Sudoyo, TM Karafet, MF Hammer and MP Cox. 2015. Reconstructing past admixture processes from local genomic ancestry using wavelet transformation. Genetics 200:469-481. https://doi.org/10.1534/genetics.115.176842

See Also

wavesum, plotsignal

Examples

data(admix)

# EXAMPLE 1
# Generate the admixture signal 
AdexPCA <- signal(admix$data,popA="popA",popB="popB",populations=admix$populations,tol=0.001,
		n.signal=NULL)

# Plot the resulting PCA
plot(AdexPCA$pa.ind[,1],AdexPCA$pa.ind[,2],col=admix$colplot,xlab="PC1",ylab="PC2",pch=16)
legend("bottomright",c("popA","popB","popAB"),col=c(3,4,2),pch=16)


# EXAMPLE 2
# Generate the admixture signal with windowing
AdexPCA2 <- signal(admix$data,popA="popA",popB="popB",populations=admix$populations,tol=0.001,
		n.signal=1000,window.size=0.01)

# Plot resulting admixture signal for one individual
plotsignal(AdexPCA2,ind="AD00001",popA=AdexPCA2$popA,popB=AdexPCA2$popB)


# EXAMPLE 3
# Generate the admixture signal with windowing
# As in EXAMPLE 2 but with n.signal reduced to 100 to provide disjoint windows
AdexPCA3 <- signal(admix$data,popA="popA",popB="popB",populations=admix$populations,tol=0.001,
		n.signal=100,window.size=0.01)

# Plot resulting admixture signal for one individual
plotsignal(AdexPCA3,ind="AD00001",popA=AdexPCA3$popA,popB=AdexPCA3$popB)


# EXAMPLE 4
# Generate the admixture signal in terms of genetic distance
# As in EXAMPLE 2 but with genmap specified so that signals are formulated using genetic distances
AdexPCA4 <- signal(admix$data,popA="popA",popB="popB",populations=admix$populations,tol=0.001,
	n.signal=1000,window.size=0.01,genmap=admix$map[,2])

# Plot resulting admixture signal for one individual
plotsignal(AdexPCA4,ind="AD00001",popA=AdexPCA4$popA,popB=AdexPCA4$popB)

adwave documentation built on May 1, 2019, 8 p.m.
