CONTENT 2010, The Second International Conference on Creative Content Technologies
Internet Business Intelligence
Authors:
Hao Tan
Parisa Ghodous
Jacky Montiel
Keywords: schema matching; web-database integration
Abstract:
Business Intelligence (BI) refers to computer-based techniques for spotting, extracting, and analyzing business data. Our focus is mainly on how to extract business data. The data in question reside in online web databases that can be searched through their web query interfaces. The Deep Web (often called the hidden web or invisible web) is composed of all such web databases. As the Deep Web has evolved, more and more researchers have turned their attention to the integration of web databases. Achieving this goal, however, requires a complex system in which many applications work together. We are interested in an automatic extraction system that retrieves the forms or the result lists from websites in a specific domain, government procurement. To tackle this challenge, we propose a solution that creates a unified interface for querying resources in a predefined domain. In this paper, we discuss the automatic extraction system in several steps. First, a web query interface crawler that can execute JavaScript ensures coverage of the web databases. Second, a query interface extractor and an interface integrator allow us to query all the discovered web databases through a global query interface. Third, a result page extractor and a result integrator give a unified presentation of the results. Lastly, a feedback method is developed to gather information on result accuracy. A statistical model is built to improve the performance of the second and third steps. We assume that our system is dynamic: the more it is used, the more precise its results become.
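To make the interface integration step more concrete, the following is a minimal sketch of label-based schema matching, under stated assumptions. It is not the authors' method: the global field names, the `match_interface` helper, the normalization rules, and the 0.6 threshold are all hypothetical, and plain string similarity from Python's standard `difflib` module stands in for whatever matcher the paper actually uses.

```python
from difflib import SequenceMatcher

# Hypothetical global query interface for a procurement domain (illustrative only).
GLOBAL_FIELDS = ["keyword", "publication date", "contracting authority", "region"]


def normalize(label):
    """Lowercase a form label and collapse underscores, colons, and extra spaces."""
    cleaned = label.lower().replace("_", " ").replace(":", " ")
    return " ".join(cleaned.split())


def similarity(a, b):
    """Return a string similarity score between 0.0 and 1.0 for two labels."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_interface(local_labels, global_fields=GLOBAL_FIELDS, threshold=0.6):
    """Map each local form label to its most similar global field.

    Labels whose best score falls below the (assumed) threshold map to None,
    which marks them as candidates for manual review or user feedback.
    """
    mapping = {}
    for label in local_labels:
        best = max(global_fields, key=lambda field: similarity(label, field))
        mapping[label] = best if similarity(label, best) >= threshold else None
    return mapping


if __name__ == "__main__":
    # Labels as they might appear on one site's search form (made up).
    print(match_interface(["Keywords:", "Publication_Date", "xyz"]))
```

In a system like the one described, the unmatched labels and any user corrections could feed the feedback step, for example by adjusting the threshold or per-field weights over time; that part is left out of this sketch.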
Pages: 19 to 26
Copyright: Copyright (c) IARIA, 2010
Publication date: November 21, 2010
Published in: conference
ISSN: 2308-4162
ISBN: 978-1-61208-110-6
Location: Lisbon, Portugal
Dates: from November 21, 2010 to November 26, 2010