International Journal On Advances in Software, volume 4, numbers 3 and 4, 2011


Turning Large Software Component Repositories into Small Index Files

Authors:
Marcos Paixão
Leila Silva
Talles Brito
Gledson Elias

Keywords: Component repositories; clustering techniques; indexing

Abstract:
Software component repositories have adopted semistructured data models for representing the syntactic and semantic features of the assets they handle. Such models pose key challenges for search engines, chiefly the design of indexing techniques that are efficient in terms of storage space. In this context, this paper proposes applying clustering techniques before indexing component repositories, which reduces the number of assets in the repository and, consequently, the size of the index files. On an illustrative repository, the outcomes indicate a significant reduction in the number of assets to be indexed and, as a consequence, significant gains in storage requirements. The approach has also been assessed with two different clustering evaluation methods, which show that it can be considered a good clustering algorithm because it produces compact and well-separated clusters.

Pages: 412 to 421

Copyright: Copyright (c) to authors, 2011. Used with permission.

Publication date: April 30, 2012

Published in: journal

ISSN: 1942-2628