ALLDATA 2019, The Fifth International Conference on Big Data, Small Data, Linked Data and Open Data
Creating Data-Driven Ontologies
Authors:
Maaike H.T. de Boer
Jack P.C. Verhoosel
Keywords: Knowledge engineering; Machine Learning; Agriculture
Abstract:
The manual creation of an ontology is a tedious task. In the field of ontology learning, Natural Language Processing (NLP) techniques are used to create ontologies automatically. In this paper, we present a methodology that uses data-driven techniques to create ontologies from unstructured documents in the agriculture domain. We use state-of-the-art NLP techniques based on Stanford OpenIE, Hearst patterns and co-occurrences to create ontologies. We add an NLP method that uses dependency parsing and transformation rules based on linguistic patterns. In addition, we use keyword-driven techniques from the query expansion field, based on Word2vec, WordNet and ConceptNet, to create ontologies. We add a method that takes the union of the ontologies produced by the keyword-based methods. The semantic quality of the different ontologies is calculated using automatically extracted keywords. We define recall, precision and F1-score based on the concepts and relations in which the keywords are present. The results show that 1) the method based on co-occurrences has the best F1-score when more than 100 keywords are used in the evaluation; 2) the keyword-based methods have a higher F1-score than the NLP-based methods when fewer than 100 keywords are used in the evaluation; and 3) the combined keyword-based method always has a higher F1-score than each single method. In our future work, we will focus on improving the dependency parsing algorithm, improving the combination of different ontologies, and improving our quality evaluation methodology.
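The union step and the keyword-based scores mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so the code rests on assumptions: an ontology is modelled as a set of (subject, relation, object) triples, precision is taken as the share of ontology concepts that are keywords, and recall as the share of keywords that appear as concepts. The function names, relation labels and example data are hypothetical and are not taken from the paper.

```python
from typing import Set, Tuple

# Assumption: an ontology is a set of (subject, relation, object) triples.
Triple = Tuple[str, str, str]


def union_ontology(*ontologies: Set[Triple]) -> Set[Triple]:
    """Combine several ontologies by taking the set union of their triples."""
    merged: Set[Triple] = set()
    for ontology in ontologies:
        merged |= ontology
    return merged


def concepts(ontology: Set[Triple]) -> Set[str]:
    """Collect every concept that occurs as subject or object of a triple."""
    return {s for s, _, _ in ontology} | {o for _, _, o in ontology}


def keyword_scores(ontology: Set[Triple], keywords: Set[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under the assumed definitions:
    precision = matched concepts / all concepts in the ontology,
    recall    = matched concepts / all keywords."""
    found = concepts(ontology)
    if not found or not keywords:
        return 0.0, 0.0, 0.0
    hits = found & keywords
    precision = len(hits) / len(found)
    recall = len(hits) / len(keywords)
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data standing in for the output of two keyword-based methods.
word2vec_onto = {("crop", "related_to", "wheat"), ("crop", "related_to", "harvest")}
wordnet_onto = {("wheat", "is_a", "cereal")}
keywords = {"crop", "wheat", "cereal", "soil"}

combined = union_ontology(word2vec_onto, wordnet_onto)
precision, recall, f1 = keyword_scores(combined, keywords)
```

In this toy example the combined ontology covers three of the four keywords with four concepts, so all three scores are 0.75. This only illustrates the mechanics and says nothing about the results reported in the paper.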
Pages: 52 to 57
Copyright: Copyright (c) IARIA, 2019
Publication date: March 24, 2019
Published in: conference
ISSN: 2519-8386
ISBN: 978-1-61208-700-9
Location: Valencia, Spain
Dates: from March 24, 2019 to March 28, 2019