Function to evaluate the significance of the group assignments generated by TwoHC_assign.
TwoHC_perm(TwoHC, nperm = 1000)

TwoHC 
Output from the 'TwoHC_assign' function 
nperm 
The number of permutations. 
The significance of the group assignment for each patient is calculated as follows. For a given patient, examine the previously found optimal partition in each HC tree and identify the two clusters to which this patient is most similar. Say these two competing clusters are cluster1 and cluster2, of size n1 and n2, respectively. Create a binary vector x of length n1 + n2 with n1 zeros and n2 ones. Fit a Cox model with the follow-up information of the samples in cluster1 and cluster2 as response and x as covariate. The absolute value of the estimated group parameter ('beta_obs') in this Cox model, which compares the survival times of the other patients in the two competing clusters, expresses the predicted gain in survival from the assignment by 'TwoHC_assign' relative to a random assignment. 'beta_obs' is then transformed to
r^i_{obs} = \exp\left(\hat{\beta}^i_{obs}\right),
which quantifies the gain in relative risk in the Cox model. This quantity is biased, however, because 'TwoHC_assign' already used 'beta_obs'. Hence, even when the two groups would be equally good for the molecular profiles in the two competing clusters, we obtain 'r_obs' < 1. To correct for this bias, the function uses a permutation argument. For each new patient it applies 'nperm' permutations of the survival data among the two competing clusters. As above, 'r_perm' is computed for each permutation; it contains the same bias as 'r_obs'. Let Z_i = median(r_perm(1), ..., r_perm(nperm)); then the risk ratio rr_obs(i) = r_obs(i) / Z_i quantifies the bias-corrected reduction in relative risk. The permuted version 'rr_perm(i)' is defined analogously (see vignette). Finally, the function defines a test statistic:
T_{obs} = \frac{\frac{1}{n}\sum_{i=1}^{n}\log\left(rr^i_{obs}\right)}{\mathrm{stdev}\left(\log\left(rr^1_{obs},\ldots,rr^n_{obs}\right)\right)}
'T_obs' is compared with its null distribution, obtained by permutation, to calculate the p-value.
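The bias correction and test statistic described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the coefficients are simulated instead of being estimated from Cox models, the helper name `t_stat` is hypothetical, 'beta_obs' is taken to be the absolute coefficient as the text describes, 'rr_perm' is assumed to be 'r_perm' divided by the same median Z_i, and the p-value is assumed to be the fraction of permuted statistics at least as large as 'T_obs'. See the vignette for the exact definitions.

```r
set.seed(1)
n     <- 20    # number of new patients (hypothetical)
nperm <- 200   # number of permutations

# Simulated stand-ins for the Cox coefficients (assumption: not real data).
beta_obs  <- rnorm(n, mean = 0.8, sd = 0.3)
beta_perm <- matrix(rnorm(nperm * n, sd = 0.3), nrow = nperm, ncol = n)

# Relative risks; rows of r_perm are permutations, columns are patients.
r_obs  <- exp(abs(beta_obs))
r_perm <- exp(abs(beta_perm))

# Z_i: median of the permuted relative risks for patient i.
Z <- apply(r_perm, 2, median)

# Bias-corrected risk ratios (rr_perm defined analogously, by assumption).
rr_obs  <- r_obs / Z
rr_perm <- sweep(r_perm, 2, Z, "/")

# Test statistic: mean of log risk ratios divided by their standard deviation.
t_stat <- function(rr) mean(log(rr)) / sd(log(rr))

T_obs  <- t_stat(rr_obs)
T_perm <- apply(rr_perm, 1, t_stat)   # one statistic per permutation

# Assumed one-sided permutation p-value.
p_value <- mean(T_perm >= T_obs)
```

Dividing by the per-patient median Z_i centres the permuted risk ratios around 1 for every patient, so an observed 'rr_obs(i)' well above 1 indicates a gain beyond what the selection bias alone would produce.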
A list containing the following components:
Obs.betas 
A numeric vector containing the coefficient from the Cox model for each test sample. 
Perm.betas 
A matrix containing the coefficients from the Cox models fitted with permuted follow-up data. Columns represent test samples; rows represent permutations. 
Ranks 
A numeric vector containing the rank of each observed coefficient among the nperm coefficients generated by permutation. 
RiskRatios 
A numeric vector containing the ratio of relative risks for each sample in the test set. 
Pvalue 
The p-value of the overall group assignment. 
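To make the layout of the returned components concrete, the following sketch builds a mock list with the documented shapes and recomputes a rank for each test sample. The mock values are simulated, the object name `res` is hypothetical, and the ranking rule used here (one plus the number of permuted coefficients smaller than the observed one) is an assumption; the function may handle ties or ordering differently.

```r
set.seed(2)
nperm  <- 100   # number of permutations
n_test <- 5     # number of test samples

# Mock object mimicking two documented components (simulated values).
res <- list(
  Obs.betas  = abs(rnorm(n_test, mean = 1)),
  Perm.betas = matrix(abs(rnorm(nperm * n_test)),
                      nrow = nperm, ncol = n_test)  # rows: permutations
)

# Assumed rank of each observed coefficient among its permuted counterparts.
ranks <- vapply(
  seq_len(n_test),
  function(i) sum(res$Perm.betas[, i] < res$Obs.betas[i]) + 1,
  numeric(1)
)
```

Under this rule, a rank close to nperm + 1 means the observed coefficient exceeds nearly all permuted coefficients for that sample.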
Askar Obulkasim
Obulkasim, A. et al. (2013). "Semi-supervised adaptive-height snipping of the Hierarchical Clustering tree", submitted.
Harrell, F.E. et al. (1982). "Evaluating the yield of medical tests", JAMA, 247, 2543-2546.
Obulkasim, A. et al. (2011). "Stepwise classification of cancer samples using clinical and molecular data", BMC Bioinformatics, 12, 422.
See also TwoHC_assign, cluster_pred
data(TcgaGBM)
attach(TcgaGBM)
# Column indices of patients treated with each drug
id1 <- which(drugs == "Avastin")
id2 <- which(drugs == "Temodar")
# Train on the first 50 patients per drug; assign the next 10 per drug
twoHC <- TwoHC_assign(X = em[, c(id1[1:50], id2[1:50])], index1 = 1:50, index2 = 51:100,
                      new.X = em[, c(id1[51:60], id2[51:60])], minclus = 4,
                      surv.time = surv.time[c(id1[1:50], id2[1:50])],
                      status = status[c(id1[1:50], id2[1:50])])
result <- TwoHC_perm(twoHC, nperm = 100)
## Not run:
### Examples with a larger number of permutations (not run).
result <- TwoHC_perm(twoHC, nperm = 10000)
par(mfrow = c(1, 2))
plot(density(result$Ranks), xlab = "Ranks")
plot(density(result$RiskRatios), xlab = "Observed relative risk ratios")
## End(Not run)
