INTELLI 2013, The Second International Conference on Intelligent Systems and Applications
Automated Annotation of Text Using the Classification-based Annotation Workbench (CLAW)
Authors:
Roy George
Hema Nair
Khalil Shujaee
David Krooks
Chandler Armstrong
Keywords: Text Annotation, Multi-label Classification, Bayes Theorem, Annotation Workbench.
Abstract:
Text annotation marks up text with highlights, comments, footnotes, tags, and links. Manual annotation is a human-intensive process and is not feasible for a large corpus of text. Classification is a technique that may be used to automate the annotation process. This paper develops a Classification-based Text Annotation Workbench (CLAW), an annotation assistance tool that incorporates automated classification to reduce the difficulty of manual annotation. The practical nature of the text corpus and the annotation methodology poses several technical challenges. The text corpus is large and consists of numerous reports, lessons learnt, and best practices. Complexity arises from the size of the documents, the variety of formats, and the range of subject matter. The annotation taxonomy is extensive and unstructured and may be applied to the text body without constraints. Consequently, the search space for the label(s) becomes prohibitively large, and it becomes necessary to adopt strategies that reduce the complexity of the classification process. We introduce a simplification technique to reduce the large classification search space. We improve precision by supplementing the predictive algorithms with similarity-based measures and evaluate CLAW for performance using both prediction-based metrics and ranking-based metrics. It is shown that CLAW performs better than a competing algorithm on all evaluation metrics.
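The abstract distinguishes prediction-based from ranking-based evaluation metrics for multi-label output but does not name the specific metrics used. As a minimal sketch of the distinction, the Python below computes one common example of each family: precision over a thresholded label set, and one-error over the ranked label scores. The choice of these two metrics, the threshold, the label names, and the scores are all illustrative assumptions and are not taken from the paper.

```python
# Illustrative only: the metric choices (precision, one-error), the threshold,
# and all names and values below are assumptions, not taken from the CLAW paper.

def predict(scores, threshold=0.5):
    """Turn per-label scores into a predicted label set by thresholding."""
    return {label for label, score in scores.items() if score >= threshold}

def precision(predicted, relevant):
    """Prediction-based metric: fraction of predicted labels that are relevant."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    return len(predicted & set(relevant)) / len(predicted)

def one_error(scores, relevant):
    """Ranking-based metric: 1 if the top-ranked label is not relevant, else 0."""
    top_label = max(scores, key=scores.get)
    return 0 if top_label in relevant else 1

# Hypothetical per-label scores for a single text passage.
scores = {"logistics": 0.9, "training": 0.6, "medical": 0.2}
relevant = {"logistics", "medical"}

predicted = predict(scores)              # {"logistics", "training"}
print(precision(predicted, relevant))    # 0.5
print(one_error(scores, relevant))       # 0
```

A prediction-based metric judges only the final label set, so it depends on the threshold; a ranking-based metric judges the order of the scores and is unaffected by it.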
Pages: 6 to 11
Copyright: Copyright (c) IARIA, 2013
Publication date: April 21, 2013
Published in: conference
ISSN: 2308-4065
ISBN: 978-1-61208-269-1
Location: Venice, Italy
Dates: from April 21, 2013 to April 26, 2013