DBKDA 2012, The Fourth International Conference on Advances in Databases, Knowledge, and Data Applications


A Multidimensional Data Modeling of the SEER Database from the USA National Cancer Institute

Authors:
Heidy M. Marin-Castro
Jose Torres-Jimenez
Diana I. Escalona-Vargas

Keywords: data warehouse; OLAP; drill-down; roll-up; cancer database.

Abstract:
Nowadays, one of the main challenges in computer science is processing the large amount of data available in diverse data sources, such as databases or files, in order to find useful information. This requires specialized tools that process raw data intelligently to discover knowledge. In this paper, we present the design of a data warehouse and a tool called TDR (Tool Drill-Roll) that together make it possible to discover knowledge from the SEER (Surveillance, Epidemiology, and End Results) database of the National Cancer Institute in the United States of America, which contains more than five million records. The data warehouse is designed using a multidimensional approach, and the TDR tool exposes interesting information from SEER through drill-down and roll-up, two Online Analytical Processing (OLAP) operators. The data warehouse can be viewed at many levels of granularity. The TDR tool reports statistics on the incidence, mortality, and survival of cancer patients over the years, and extracts useful information about the disease that could be used to establish relations among characteristics of patients who have a specific type of cancer. The knowledge discovered with the TDR tool could be of interest to government, health care institutes, or the research community for decision making. The main contribution of this paper is the discovery of new knowledge from the SEER database. The methodology used to design the data warehouse and the TDR tool could be applied to other domains with minimal changes.
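
To make the two OLAP operators named in the abstract concrete, the following is a minimal sketch of roll-up and drill-down over a tiny in-memory fact table. It is not the TDR tool and does not reflect the authors' schema: the records are invented toy values (not SEER data), and the field names `year`, `site`, `sex`, and `cases`, as well as the helper `aggregate`, are assumptions chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy fact table. These rows are invented for illustration
# and are NOT taken from SEER; the field names are assumptions.
RECORDS = [
    {"year": 2000, "site": "Breast", "sex": "F", "cases": 3},
    {"year": 2000, "site": "Lung", "sex": "M", "cases": 2},
    {"year": 2000, "site": "Lung", "sex": "F", "cases": 1},
    {"year": 2001, "site": "Breast", "sex": "F", "cases": 4},
    {"year": 2001, "site": "Lung", "sex": "M", "cases": 5},
]


def aggregate(records, dims, measure="cases"):
    """Sum `measure` over all rows, grouped by the dimensions in `dims`."""
    totals = defaultdict(int)
    for row in records:
        key = tuple(row[d] for d in dims)
        totals[key] += row[measure]
    return dict(totals)


# Roll-up: drop dimensions to view the data at a coarser granularity.
by_year = aggregate(RECORDS, ["year"])

# Drill-down: add a dimension back to view the same totals in more detail.
by_year_site = aggregate(RECORDS, ["year", "site"])

print(by_year)
print(by_year_site)
```

In this sketch, moving from `by_year_site` to `by_year` is a roll-up and the reverse is a drill-down; both are the same grouping operation applied with a different set of dimensions, so the totals at every level agree.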

Pages: 81 to 85

Copyright: Copyright (c) IARIA, 2012

Publication date: February 29, 2012

Published in: conference

ISSN: 2308-4332

ISBN: 978-1-61208-185-4

Location: Saint Gilles, Reunion

Dates: from February 29, 2012 to March 5, 2012