%0 Journal Article
%T A Normalized Tree Index for identification of correlated clinical parameters in microarray experiments
%A Christian W Martin
%A Anika Tauchen
%A Anke Becker
%A Tim W Nattkemper
%J BioData Mining
%D 2011
%I BioMed Central
%R 10.1186/1756-0381-4-2
%X We propose a novel index, the Normalized Tree Index (NTI), to compute a correlation coefficient between the clustering result of high-dimensional microarray data and nominal clinical parameters. The NTI detects correlations between hierarchically clustered microarray data and nominal clinical parameters (labels) and gives a measurement of significance in terms of an empirical p-value of the identified correlations. To this end, the microarray data is first clustered by hierarchical agglomerative clustering using standard settings. In a second step, the computed cluster tree is evaluated. For each label, an NTI is computed measuring the correlation between that label and the clustered microarray data. The NTI successfully identifies correlated clinical parameters at different levels of significance when applied to two real-world microarray breast cancer data sets. Some of the identified highly correlated labels confirm the current state of knowledge, whereas others help to identify new risk factors and provide a good basis to formulate new hypotheses. The NTI is a valuable tool in the domain of biomedical data analysis. It allows the identification of correlations between high-dimensional data and nominal labels, while at the same time a p-value measures the level of significance of the detected correlations. Hierarchical agglomerative clustering is the basis for most visual data mining tasks in microarray applications [1-3]. Compared to non-hierarchical cluster algorithms, it has the advantage that the number of clusters does not have to be specified in advance.
This property is of utmost importance since the number of clusters is usually unknown, making a precise a priori prediction of the number of clusters impossible. A second reason for the frequent application of hierarchical agglomerative clustering is its visualization ability [4]. The intrinsic hierarchical cluster structure of the data becomes visually accessible at once in the computed cluster tree. The visualization abili
%U http://www.biodatamining.org/content/4/1/2