International Journal On Advances in Software, volume 4, numbers 3 and 4, 2011
Turning Large Software Component Repositories into Small Index Files
Authors:
Marcos Paixão
Leila Silva
Talles Brito
Gledson Elias
Keywords: Component repositories; clustering techniques; indexing
Abstract:
Software component repositories have adopted semistructured data models to represent the syntactic and semantic features of the assets they handle. Such models pose key challenges for search engines, particularly in the design of indexing techniques that must be efficient in terms of storage space. In this context, this paper proposes an approach that applies clustering techniques before indexing component repositories, reducing the number of assets in the repository and, consequently, the size of the index files. Results on an illustrative repository indicate a significant reduction in the number of assets to be indexed and, as a consequence, significant savings in storage requirements. The approach has also been assessed with two different clustering evaluation methods, which indicate that it can be considered a good clustering algorithm because it produces compact and well-separated clusters.
Pages: 412 to 421
Copyright: Copyright (c) to authors, 2011. Used with permission.
Publication date: April 30, 2012
Published in: journal
ISSN: 1942-2628