SEMAPRO 2010, The Fourth International Conference on Advances in Semantic Processing
Using WordNet for Concept-Based Document Indexing in Information Retrieval
Authors:
Fatiha Boubekeur
Mohand Boughanem
Lynda Tamine
Mariam Daoud
Keywords: Information retrieval; conceptual indexing; concept weighting; WordNet
Abstract:
Concept-based document indexing represents documents by means of semantic entities (concepts) rather than lexical entities (keywords). In this paper we propose an approach for concept-based document representation and weighting. Specifically, we propose (1) an approach for concept identification and (2) a novel concept weighting scheme. The concepts are first extracted from WordNet and then weighted by means of a new measure of their importance in the document. Our conceptual indexing approach outperforms classical keyword-based approaches, and preliminary tests with the weighting scheme give better results than the classical tf-idf approach.
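As a rough illustration of the general idea only, the sketch below maps terms to concept identifiers and then weights the concepts with classical tf-idf, the baseline the abstract compares against. It does not reproduce the paper's identification method or its new weighting measure, neither of which is specified here. The lexicon `TOY_LEXICON`, the concept identifiers, and the function names `to_concepts` and `tf_idf_index` are hypothetical stand-ins introduced for this example; a real system would look concepts up in WordNet.

```python
import math
from collections import Counter

# Hypothetical toy lexicon standing in for WordNet: term -> concept id.
# The ids are made up for this sketch and are not real WordNet synsets.
TOY_LEXICON = {
    "car": "C_CAR",
    "automobile": "C_CAR",
    "auto": "C_CAR",
    "dog": "C_DOG",
    "canine": "C_DOG",
}


def to_concepts(text):
    """Map each known term to its concept id; unknown terms are dropped."""
    return [TOY_LEXICON[t] for t in text.lower().split() if t in TOY_LEXICON]


def tf_idf_index(docs):
    """Classical tf-idf computed over concepts instead of keywords.

    This is the baseline weighting, not the scheme proposed in the paper.
    """
    concept_docs = [Counter(to_concepts(d)) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each concept.
    df = Counter(c for counts in concept_docs for c in counts)
    return [
        {c: tf * math.log(n / df[c]) for c, tf in counts.items()}
        for counts in concept_docs
    ]


docs = ["car automobile dog", "auto car", "canine"]
index = tf_idf_index(docs)
```

Because "car", "automobile" and "auto" collapse into one concept, the first two documents share an index entry even though they have only one keyword in common, which is the effect concept-based indexing aims for.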
Pages: 151 to 157
Copyright: Copyright (c) IARIA, 2010
Publication date: October 25, 2010
Published in: conference
ISSN: 2308-4510
ISBN: 978-1-61208-104-5
Location: Florence, Italy
Dates: from October 25, 2010 to October 30, 2010