%0 Journal Article
%T HAMSTER: visualizing microarray experiments as a set of minimum spanning trees
%A Raymond Wan
%A Larisa Kiseleva
%A Hajime Harada
%A Hiroshi Mamitsuka
%A Paul Horton
%J Source Code for Biology and Medicine
%D 2009
%I BioMed Central
%R 10.1186/1751-0473-4-8
%X HAMSTER (Helpful Abstraction using Minimum Spanning Trees for Expression Relations) is an open source system for generating a set of MSTs from the experiments of a microarray data set. While previous work has generated a single MST from a data set for data clustering, we recursively merge experiments and repeat this process to obtain a set of MSTs for data visualization. Depending on the parameters chosen, each tree is analogous to a snapshot of one step of the hierarchical clustering process. We scored and ranked these trees using one of three proposed schemes. HAMSTER is implemented in C++ and makes use of Graphviz for laying out each MST. We report on the running time of HAMSTER and demonstrate, using data sets from the NCBI Gene Expression Omnibus (GEO), that the images created by HAMSTER offer insights that differ from the dendrograms of hierarchical clustering. In addition to the C++ program, which is available as open source, we also provide a web-based version (HAMSTER+) that allows users to apply our system through a web browser without any computer programming knowledge. Researchers may find it helpful to include HAMSTER in their microarray analysis workflow, as it can offer insights that differ from hierarchical clustering. We believe that HAMSTER would be useful for certain types of gradient data sets (e.g., time-series data) and data that indicate relationships between cells/tissues. Both the source and the web server variant of HAMSTER are available from http://hamster.cbrc.jp/. The high dimensionality and exploratory nature of microarray data analysis have led to the application of several unsupervised data clustering techniques to aid in the visualization of gene expression data. Three popular methods are hierarchical clustering (HC) [1], k-means [2], and self-organizing maps (SOM) [3] (others have previously compared these systems [4]).
Implementations of these methods can be found in TreeView [1], Cluster [5], and GENECLUSTER [3]; in more general
%U http://www.scfbm.org/content/4/1/8