# Supervised scatter matrix based on quantiles

### Description

Function for a supervised scatter matrix that is the weighted
covariance matrix of `x`

with weights 1/(`q2-q1`

) if `y`

is between the
lower (`q1`

) and upper (`q2`

) quantile and 0 otherwise (or vice versa).

### Usage

1 2 3 |

### Arguments

`x` |
numeric data matrix with at least two columns. |

`y` |
numerical vector specifying the dependent variable. |

`q1` |
percentage for lower quantile of |

`q2` |
percentage for upper quantile of |

`pos` |
logical. If TRUE then the weights are 1/( |

`type` |
passed on to function |

`method` |
passed on to function |

`na.action` |
a function which indicates what should happen when the data contain 'NA's. Default is to fail. |

`check` |
logical. Checks if the input should be checked for consistency. If not needed setting it to FALSE might save some time. |

### Details

The weights for this supervised scatter matrix for `pos=TRUE`

are
*w(y) = I(q1-quantile < y < q2-quantile)/(q2-q1)*. Then `scovq`

is calculated as

*
scovq = ∑ w(y) (x-x_w_bar)'(x-x_w_bar).*

where * x_w_bar = sum w(y)x*.

To see how this function can be used in the context of supervised invariant coordinate selection see the example below.

### Value

a matrix.

### Author(s)

Klaus Nordhausen

### References

Liski, E., Nordhausen, K. and Oja, H. (2013), Supervised invariant coordinate selection, *To appear in Statistics*, **???**, ???–???.

### See Also

`cov.wt`

and `ics`

### Examples

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 | ```
# Creating some data
# The number of explaining variables
p <- 10
# The number of observations
n <- 400
# The error variance
sigma <- 0.5
# The explaining variables
X <- matrix(rnorm(p*n),n,p)
# The error term
epsilon <- rnorm(n, sd = sigma)
# The response
y <- X[,1]^2 + X[,2]^2*epsilon
# SICS with ics
X.centered <- sweep(X,2,colMeans(X),"-")
SICS <- ics(X.centered, S1=cov, S2=scovq, S2args=list(y=y, q1=0.25,
q2=0.75, pos=FALSE), stdKurt=FALSE, stdB="Z")
# Assuming it is known that k=2, then the two directions
# of interest are choosen as:
k <- 2
KURTS <- SICS@gKurt
KURTS.max <- ifelse(KURTS >= 1, KURTS, 1/KURTS)
ordKM <- order(KURTS.max, decreasing = TRUE)
indKM <- ordKM[1:k]
# The two variables of interest
Zk <- ics.components(SICS)[,indKM]
# The correspondings transformation matrix
Bk <- coef(SICS)[indKM,]
# The corresponding projection matrix
Pk <- t(Bk) %*% solve(Bk %*% t(Bk)) %*% Bk
# Visualization
pairs(cbind(y,Zk))
# checking the subspace difference
# true projection
B0 <- rbind(rep(c(1,0),c(1,p-1)),rep(c(0,1,0),c(1,1,p-2)))
P0 <- t(B0) %*% solve(B0 %*% t(B0)) %*% B0
# crone and crosby subspace distance measure, should be small
k - sum(diag(P0 %*% Pk))
``` |

Want to suggest features or report bugs for rdrr.io? Use the GitHub issue tracker. Vote for new features on Trello.