ICIMP 2020, The Fifteenth International Conference on Internet Monitoring and Protection


Modeling Natural Language Policies into Controlled Natural Language: A Twitter Case Study

Authors:
Irfan Tanoli
Sebastião Pais

Keywords: Natural Language; Controlled Natural Languages; Social Networks; Natural Language Processing; Data Policies

Abstract:
Social network providers usually describe their terms of data storage, usage, and sharing in natural language. To evaluate such terms of use automatically, and to understand, analyse, and enforce rights and obligations over users' data, it is of utmost importance to translate them into a machine-readable format. Natural Languages (NLs) are the most prominent form of knowledge representation for humans. However, due to the complexity of NLs, it is quite burdensome for machines to process their sentences in a seamless and standardised way. Controlled Natural Languages (CNLs) are subsets of NLs obtained by restricting the grammar and vocabulary in order to minimize, or even eliminate, the ambiguity and complexity of NL. These languages have two major characteristics: they look informal and are easy for humans to read and write, much like natural languages, yet they can be easily translated into machine-readable forms. In this paper, we study some policy-oriented CNLs and use them to translate sample Twitter policies. We then assess the value of the different languages according to the difficulty of the translation, the readability of the result, and other relevant properties, to find which CNL is most suitable for NL translation.

Pages: 15 to 21

Copyright: Copyright (c) IARIA, 2020

Publication date: September 27, 2020

Published in: conference

ISSN: 2308-3980

ISBN: 978-1-61208-804-4

Location: Lisbon, Portugal

Dates: from September 27, 2020 to October 1, 2020