%0 Journal Article %T Interlinking Developer Identities within and across Open Source Projects: The Linked Data Approach %A Aftab Iqbal %A Michael Hausenblas %J ISRN Software Engineering %D 2013 %R 10.1155/2013/584731 %X Software developers use various software repositories in order to interact with each other or to solve related problems. These repositories provide a rich source of information for a wide range of tasks. However, one issue to overcome in order to make this information useful is the identification and interlinking of multiple identities of developers. In this paper, we propose a Linked Data-based methodology to interlink and integrate multiple identities of a developer found in different software repositories of a project as well as across repositories of multiple projects. Providing such interlinking will enable us to keep track of a developer’s activity not only within a single project but also across multiple projects. The methodology will be presented in general and applied to 5 Apache projects as a case study. Further, we show that the few methods suggested so far are not always appropriate to overcome the developer identification problem. 1. Introduction and Motivation In software engineering, many tools with underlying repositories have been introduced to support collaboration and coordination in distributed software development. Research has shown that these software repositories contain a rich amount of information about software projects. By mining the information contained in these software repositories, practitioners can depend less on their experience and more on the historical data [1]. However, software repositories are commonly used only as record-keeping repositories and rarely for design decision processes [2]. Examples of software repositories include source control repositories, bug repositories, archived communication, and so forth [3].
Developers (we will use the term “developer” to represent the core developers, contributors, bug reporters, and users of an open source project) use these repositories to interact with each other or to solve software-related problems. By extracting rich information from these repositories, one can guide decision processes in modern software development. For example, source code and bugs are quite often discussed in bug repositories and on project mailing lists. Data in these software repositories could be analyzed to extract discussions related to bugs and source code, which could then be linked to the actual bug description and source code. This would allow keeping track of developers’ discussions related to a bug or source code across different software repositories. Developers are required to adopt an identity for each software repository they want to use. For example, they are required to adopt an email address in %U http://www.hindawi.com/journals/isrn.software.engineering/2013/584731/