DATA ANALYTICS 2018, The Seventh International Conference on Data Analytics
Towards a Scalable Data-Intensive Text Processing Architecture with Python and Cassandra
Authors:
Gregor-Patrick Heine
Thomas Woltron
Alexander Wöhrer
Keywords: Cassandra; Streaming; Python; Multiprocessing; Twitter; Sentiment Analysis
Abstract:
Canonical sentiment analysis implementations hinge on synchronous Hypertext Transfer Protocol (HTTP) calls. This paper introduces an asynchronous streaming approach and proposes a method for public opinion surveillance via stream subscriptions. It presents a prototype that combines Twitter streams, Python text processing, and Cassandra storage, and elaborates on three major points: 1) a performance comparison of writing methods; 2) multiprocessing procedures that employ data parallelization and asynchronous concurrent database writes; 3) public opinion surveillance via noun-phrase extraction.
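The processing pattern named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' prototype: `extract_phrases` is a naive stand-in that treats runs of capitalized words as noun phrases, the in-memory `store` list is a hypothetical substitute for a Cassandra table, and all function names are assumptions made for illustration. The sketch only shows the general shape: texts are distributed across worker processes (data parallelization), and the resulting rows are then written concurrently.

```python
import re
import threading
from concurrent.futures import ThreadPoolExecutor
from multiprocessing import Pool

# Naive stand-in for a real noun-phrase extractor (assumption):
# a "phrase" is a run of capitalized words separated by single spaces.
_CAPITALIZED_RUN = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*")


def extract_phrases(text):
    """Return the capitalized word runs found in one text."""
    return _CAPITALIZED_RUN.findall(text)


def extract_all(texts, workers=2):
    """Data parallelization: spread the texts over worker processes."""
    with Pool(processes=workers) as pool:
        return pool.map(extract_phrases, texts)


def write_all(rows, store, max_threads=4):
    """Write rows concurrently; `store` is a hypothetical stand-in for a table."""
    lock = threading.Lock()

    def write_row(row):
        # A real database client would issue an insert here instead.
        with lock:
            store.append(row)

    with ThreadPoolExecutor(max_workers=max_threads) as executor:
        # Consume the iterator so every write finishes before returning.
        list(executor.map(write_row, rows))
    return len(store)


if __name__ == "__main__":
    tweets = [
        "Athens hosts Data Analytics today",
        "nothing capitalized in this one",
    ]
    phrases = extract_all(tweets)
    store = []
    write_all(zip(tweets, phrases), store)
    for text, found in store:
        print(text, "->", found)
```

The `if __name__ == "__main__"` guard matters because worker processes may re-import the module on platforms that do not fork. Because the writes run concurrently, the order of rows in `store` is not guaranteed to match the input order.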
Pages: 15 to 18
Copyright: Copyright (c) IARIA, 2018
Publication date: November 18, 2018
Published in: conference
ISSN: 2308-4464
ISBN: 978-1-61208-681-1
Location: Athens, Greece
Dates: from November 18, 2018 to November 22, 2018