DBKDA 2019, The Eleventh International Conference on Advances in Databases, Knowledge, and Data Applications
Utilizing Citation Context in a Two-Level Topic Model for Knowledge Discovery
Authors:
Lixue Zou
Li Wang
Xiwen Liu
Keywords: Topic model; Citation context; Knowledge discovery; XML data
Abstract:
Knowledge discovery from academic articles has received increasing attention since digital databases made full text widely available. In a corpus of scientific articles, documents are connected by citations, and each document has two different parts in the corpus: citation context and autonomous text. We believe that the topic distributions of these two parts are different but related in a certain way. Existing topic models make little effort to incorporate the citation context. In this paper, we propose a citation context topic model that considers the corpus at two levels, the cited topic level and the citing topic level, utilizing citation context extracted from the full text. Each document has two different representations in the latent topic space. We apply our model to a dataset from PubMed Central, where the full text is available as XML data. The results show that the citation context can help to discover the latent two-level topics and demonstrate a promising knowledge discovery capability.
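As an illustration of the split between citation context and autonomous text described in the abstract, the following is a minimal sketch and not the authors' extraction procedure. It assumes JATS-style XML of the kind PubMed Central distributes, in which in-text citations appear as `<xref ref-type="bibr" rid="...">` elements, and it assumes that the citation context is simply the sentence containing the citation; the abstract does not specify the window actually used. The function names `flatten` and `split_citation_context`, the naive sentence splitter, and the sample document are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample in JATS-like markup; not taken from the paper's dataset.
SAMPLE = """
<article><body>
  <p>LDA is a common topic model <xref ref-type="bibr" rid="B1">1</xref>.
     We extend it with citation context.</p>
</body></article>
"""

MARKER = re.compile(r"\[\[([^\]]*)\]\]")       # placeholder left where a citation was
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")    # naive sentence splitter (assumption)


def flatten(elem):
    """Return the text of elem, replacing bibliographic xrefs with [[rid]] markers."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "xref" and child.get("ref-type") == "bibr":
            parts.append("[[{}]]".format(child.get("rid", "")))
        else:
            parts.append(flatten(child))
        parts.append(child.tail or "")
    return "".join(parts)


def split_citation_context(xml_text):
    """Split paragraph sentences into (rid, sentence) pairs and uncited sentences."""
    root = ET.fromstring(xml_text)
    contexts = []    # sentences that cite something, keyed by the cited reference id
    autonomous = []  # sentences that contain no citation
    for p in root.iter("p"):
        for sentence in SENTENCE_END.split(flatten(p).strip()):
            rids = MARKER.findall(sentence)
            clean = " ".join(MARKER.sub("", sentence).split())
            clean = re.sub(r"\s+([.,;:])", r"\1", clean)  # tidy space left by the marker
            if not clean:
                continue
            if rids:
                contexts.extend((rid, clean) for rid in rids)
            else:
                autonomous.append(clean)
    return contexts, autonomous


contexts, autonomous = split_citation_context(SAMPLE)
print(contexts)
print(autonomous)
```

Under these assumptions, the sentences grouped by cited reference id would feed the cited topic level, while the remaining sentences of each document would feed the citing topic level.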
Pages: 1 to 3
Copyright: Copyright (c) IARIA, 2019
Publication date: June 2, 2019
Published in: conference
ISSN: 2308-4332
ISBN: 978-1-61208-715-3
Location: Athens, Greece
Dates: from June 2, 2019 to June 6, 2019