Function to assess the significance of group assignmetn from TwoHC_assign
Description
Function to evaluate the significance of the group assignments generated by TwoHC_assign
.
Usage
1  TwoHC_perm(TwoHC, nperm = 1000)

Arguments
TwoHC 
Output from the 'TwoHC_assign' function 
nperm 
The number of permutations. 
Details
Significance of group assignment for each patient is calculated as follows: for a given patient, examine the previously found optimal partition in each HC tree and identify two clusters to which this patient is most similar. Say these two competing clusters are cluster1 and cluster2 of size n1 and n2, respectively. Create a binary vector (call it x) of size n1 + n2 which has n2 ones and n1 zeros . Construct a Cox model using followup information of samples in cluster1 and cluster2 as response, and x as covariate. The absolute value of the estimated group parameter ('beta_obs') in the Cox model that compares the survival times of the other patients in the two competing clusters expresses the predicted gain in survival from the assignment by 'TwoHC_assign' with respect to random. The beta_obs will be transformed to
r^i_{obs} = exp≤ft(\hatβ^i_{obs}\right),
which quantifies the gain in relative risk in the Cox model. The problem is that this is biased, because 'TwoHC_assign' already used 'beta_obs'. Hence, even when the two groups would be equally good for the molecular profiles in the two competing clusters, we obtain 'r_obs' < 1. To correct for this bias this function uses a permutation argument. For each new patient it applies 'nperm' permutations of the surival data among the two competing clusters. As above we compute 'r_perm' for each permutation which contains the same bias as 'r_obs'. Let Z_i = median(r_perm(1), ... , r_perm(nperm)), then riskratio rr_obs(i) = r_obs(i) / Z_i quantifies the biasedcorrected reduction in relative risk. The permuted version 'rr_perm(i)' is defined analogously (see vignette). Finally it defines a test statistic:
{T}_{obs} = \frac{\frac{1}{n}∑_{i=1}^{n}log≤ft(rr^i_{obs}\right)}{stdev≤ft(log≤ft(rr^1_{obs},…,rr^n_{obs}\right)\right)}
'T_obs' compared with the background of its nulldistribution as obtained by permutation to calculate pvalue.
Value
A list object contains following objects:
Obs.betas 
A numeric vector contains the coefficient from the Cox model corresponding to each test sample. 
Perm.betas 
A matrix contains the coefficient from the Cox model trained with permuted followup data. Columns represent test samples, rows represent permutations 
Ranks 
A numeric vector contains the rank of each observed coefficient among the nperm coefficients generated by permutations. 
RiskRatios 
A numeric vector contains the ratio of relative risk for the test set. 
Pvalue 
pvalue of the overall group assignment. 
Author(s)
Askar Obulkasim
References
Obulkasim,A. et al., (2013). "Semisupervised adaptiveheight snipping of the Hierarchical Clustering tree", submitted.
Harrel,E.F. et al., (1982). "Evaluating the yield of medical tests", JAMA, 247, 25432546.
Obulkasim,A. et al., (2011). "Stepwise classification of cancer samples using clinical and molecular data", BMC Bioinformatics, 12, 422.
See Also
See also TwoHC_assign
, cluster_pred
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18  data(TcgaGBM)
attach(TcgaGBM)
id1 < which(drugs == "Avastin")
id2 < which(drugs == "Temodar")
twoHC < TwoHC_assign(X = em[ ,c(id1[1:50], id2[1:50])], index1 = 1:50, index2 = 51:100,
new.X = em[, c(id1[51:60], id2[51:60])], minclus = 4,
surv.time = surv.time[c(id1[1:50], id2[1:50])],
status = status[c(id1[1:50], id2[1:50])])
result < TwoHC_perm(twoHC, nperm = 100)
## Not run:
### Examples with a larger number of permutations (not run).
result < TwoHC_perm(twoHC, nperm = 10000)
par(mfrow = c(1, 2))
plot(density(result$Ranks), xlab = "Ranks")
plot(density(result$RiskRatios), xlab = "Observed relative riskratios")
## End(Not run)
