International Journal On Advances in Intelligent Systems, volume 14, numbers 1 and 2, 2021
Authors:
Tim vor der Brück
Michael Kaufmann
Keywords: OdeNet; fuzzy sets; targeted marketing; histogram equalization
Abstract:
Estimating the semantic similarity between texts is important for a wide range of application scenarios in natural language processing. With the increasing availability of large text corpora, data-driven approaches such as Word2Vec have become quite successful. In contrast, semantic methods that employ manually designed knowledge bases, such as ontologies, have lost some of their former popularity. However, manually designed expert knowledge can still be a valuable resource, since it can be leveraged to boost the performance of data-driven approaches. In this paper, we introduce a novel hybrid similarity estimate based on fuzzy sets that exploits both word embeddings and a lexical ontology. As the ontology, we use OdeNet, a freely available resource developed by the Darmstadt University of Applied Sciences. Our application scenario is targeted marketing, in which we aim to match people to the best-fitting marketing target group based on short German text snippets. The evaluation showed that using the ontology did indeed improve the overall result compared with a data-driven baseline estimate.
Pages: 114 to 120
Copyright: Copyright (c) to authors, 2021. Used with permission.
Publication date: December 31, 2021
Published in: journal
ISSN: 1942-2679