SEMAPRO 2010, The Fourth International Conference on Advances in Semantic Processing


Using WordNet for Concept-Based Document Indexing in Information Retrieval

Authors:
Fatiha Boubekeur
Mohand Boughanem
Lynda Tamine
Mariam Daoud

Keywords: Information retrieval; conceptual indexing; concept weighting; WordNet

Abstract:
Concept-based document indexing represents documents by means of semantic entities, the concepts, rather than lexical entities, the keywords. In this paper we propose an approach for concept-based document representation and weighting. In particular, we propose (1) an approach for concept identification and (2) a novel concept weighting scheme. The concepts are first extracted from WordNet and then weighted by means of a new measure of their importance in the document. Our conceptual indexing approach outperforms classical keyword-based approaches, and preliminary tests with the weighting scheme give better results than the classical tf-idf approach.
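To make the contrast between keyword and concept indexing concrete, the following is a minimal sketch and not the authors' method. It assumes a tiny hand-written lexicon (`TOY_LEXICON`, with made-up concept identifiers) in place of WordNet, and it weights concepts with the classical tf-idf baseline named in the abstract, since the paper's own weighting measure is not described here. The function names `to_concepts` and `tf_idf_index` are likewise hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy lexicon standing in for WordNet: synonymous terms
# share one (made-up) concept identifier.
TOY_LEXICON = {
    "car": "C_AUTOMOBILE",
    "automobile": "C_AUTOMOBILE",
    "doctor": "C_PHYSICIAN",
    "physician": "C_PHYSICIAN",
}


def to_concepts(text):
    """Map each known term to its concept id; keep unknown terms as keywords."""
    return [TOY_LEXICON.get(token, token) for token in text.lower().split()]


def tf_idf_index(docs):
    """Build one {concept: weight} map per document using plain tf-idf.

    This is the classical baseline, not the paper's weighting scheme.
    """
    counts_per_doc = [Counter(to_concepts(doc)) for doc in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents each concept occurs.
    doc_freq = Counter()
    for counts in counts_per_doc:
        doc_freq.update(counts.keys())
    return [
        {c: tf * math.log(n_docs / doc_freq[c]) for c, tf in counts.items()}
        for counts in counts_per_doc
    ]


docs = [
    "the car stopped",
    "an automobile and a physician",
    "the doctor arrived",
]
index = tf_idf_index(docs)
```

In this sketch the first two documents share the index entry `C_AUTOMOBILE` even though they have no keyword in common, which is the effect concept-based indexing aims for. A real system would also need word sense disambiguation, which the toy lexicon sidesteps.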

Pages: 151 to 157

Copyright: Copyright (c) IARIA, 2010

Publication date: October 25, 2010

Published in: conference

ISSN: 2308-4510

ISBN: 978-1-61208-104-5

Location: Florence, Italy

Dates: from October 25, 2010 to October 30, 2010