%0 Journal Article
%T Clustering Methodologies for Software Engineering
%A Mark Shtern
%A Vassilios Tzerpos
%J Advances in Software Engineering
%D 2012
%I Hindawi Publishing Corporation
%R 10.1155/2012/792024
%X The size and complexity of industrial strength software systems are constantly increasing. This means that the task of managing a large software project is becoming even more challenging, especially in light of high turnover of experienced personnel. Software clustering approaches can help with the task of understanding large, complex software systems by automatically decomposing them into smaller, easier-to-manage subsystems. The main objective of this paper is to identify important research directions in the area of software clustering that require further attention in order to develop more effective and efficient clustering methodologies for software engineering. To that end, we first present the state of the art in software clustering research. We discuss the clustering methods that have received the most attention from the research community and outline their strengths and weaknesses. Our paper describes each phase of a clustering algorithm separately. We also present the most important approaches for evaluating the effectiveness of software clustering.
1. Introduction
Software clustering methodologies group entities of a software system, such as classes or source files, into meaningful subsystems in order to help with the process of understanding the high-level structure of a large and complex software system. A software clustering approach that is successful in accomplishing this task can have significant practical value for software engineers, particularly those working on legacy systems with obsolete or nonexistent documentation. Research in software clustering has been actively carried out for more than twenty years. During this time, several software clustering algorithms have been published in the literature [1–8].
Most of these algorithms have been applied to particular software systems with considerable success. There is consensus between software clustering researchers that a software clustering approach can never hope to cluster a software system as well as an expert who is knowledgeable about the system [9]. Therefore, it is important to understand how good a solution created by a software clustering algorithm is. The research community has developed several methods to assess the quality of software clustering algorithms [10–14]. In this paper, we present a review of the most important software clustering methodologies that have been presented in the literature. We also outline directions for further research in software clustering, such as the development of better software clustering algorithms or the improvement and evaluation of
%U http://www.hindawi.com/journals/ase/2012/792024/