|
CLUSTER ANALYSIS OF MICROARRAY DATA BASED ON SIMILARITY MEASUREMENT

Keywords: DNA microarray data, Clustering technique, Cluster validity index, Similarity measurement, Splitting and merging, Cancer data analysis

Abstract: DNA microarray technology is a fundamental tool in gene expression data analysis. The datasets collected with this technology have underscored the need for quantitative analytical tools to examine such data. Because of the large number of genes and the complexity of gene regulation networks, clustering is a useful exploratory technique for analyzing these data. Many clustering algorithms have been proposed to analyze microarray gene expression data, but very few of them evaluate the quality of the clusters. In this paper, a novel cluster analysis technique is proposed that does not require the number of clusters to be specified a priori. The method computes a similarity measurement function, on the basis of which clusters are merged, and subsequently splits a cluster by computing its degree of separation. Splitting and merging are repeated iteratively until the cluster validity index (the Davies-Bouldin, or DB, index) degrades. The experimental results show that the proposed cluster analysis technique gives results comparable to those of existing methods on a cancer gene expression dataset. This study may help raise relevant issues in the extraction of meaningful biological information from microarray expression data.
|
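The stopping rule in the abstract relies on the DB index, for which lower values indicate more compact and better separated clusters, so "degrades" corresponds to an increase. The abstract does not give the paper's similarity function or its splitting criterion, so the sketch below covers only the standard Davies-Bouldin computation, assuming Euclidean distance and centroid-based scatter. The function name `davies_bouldin` and the toy data are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def davies_bouldin(X, labels):
    """Standard Davies-Bouldin index (lower is better).

    Assumes Euclidean distance and mean distance to the centroid as the
    within-cluster scatter. This is the textbook definition, not
    necessarily the exact variant used in the paper.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    if len(ids) < 2:
        raise ValueError("the DB index needs at least two clusters")

    # Centroid of each cluster.
    centroids = np.array([X[labels == k].mean(axis=0) for k in ids])

    # Scatter: mean distance of a cluster's points to its centroid.
    scatter = np.array([
        np.linalg.norm(X[labels == k] - c, axis=1).mean()
        for k, c in zip(ids, centroids)
    ])

    # Pairwise centroid distances; the diagonal is set to infinity so
    # that a cluster is never compared with itself.
    dist = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)

    # For each cluster take its worst (largest) similarity ratio, then average.
    ratios = (scatter[:, None] + scatter[None, :]) / dist
    return ratios.max(axis=1).mean()


# Toy data (hypothetical): two tight groups that lie far apart.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
good = davies_bouldin(X, [0, 0, 1, 1])  # labels follow the two groups
bad = davies_bouldin(X, [0, 1, 0, 1])   # labels cut across the groups
print(good, bad)
```

In a split-and-merge loop of the kind the abstract describes, a value like this would be recomputed after each iteration and the loop would stop once it rises above the previous value.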