ICIW 2014, The Ninth International Conference on Internet and Web Applications and Services
Scalable Web Content Understanding Framework
Authors:
Yang Sun
Hyungsik Shin
Sayandev Mukherjee
Ronald Sujithan
Hongfeng Yin
Yoshikazu Akinaga
Pero Subasic
Keywords: contextual tagging; advertising; content understanding engine.
Abstract:
The contextualization of an unknown web page is a fundamental need in many online applications. We propose a new framework, the Content Understanding Engine (CUE), that allows computational stages built on different technologies to be composed into a pipeline that contextualizes an unknown URL. We describe how this computation pipeline interfaces with our Big Data infrastructure and how this approach simplifies deployment to private or public cloud environments. The implementation details of this framework are provided along with a use case to demonstrate the value of the CUE. We report the results of our evaluation of this pipelined architecture on a wide range of URLs covering different topics.
Pages: 105 to 110
Copyright: Copyright (c) IARIA, 2014
Publication date: July 20, 2014
Published in: conference
ISSN: 2308-3972
ISBN: 978-1-61208-361-2
Location: Paris, France
Dates: from July 20, 2014 to July 24, 2014