DATA ANALYTICS 2020, The Ninth International Conference on Data Analytics


Breast Cancer Dataset Analytics

Authors:
Kevin Diami
Noha Hazzazi

Keywords: Classification; Breast Cancer; Neural Networks; Support Vector Machines; Random Forest; Python; Weka

Abstract:
Breast cancer is a disease in which the cells of the breast grow uncontrollably. It is the most common cancer in females worldwide. The type of breast cancer is determined by which breast cells turn cancerous. Breast cancer can begin in different parts of the breast, including the lobules, ducts, and connective tissue. The clinical prognostic stage (prognosis being the likely or expected development of a disease) depends on a number of factors, including tumor size, lymph node status, whether the cancer has spread to other parts of the body, the cancer grade, estrogen status, and progesterone status. In this paper, the clinical prognostic stage (referred to as 6th Stage in this study) will be predicted using both a Python program and the Weka tool. Three algorithms, Neural Networks, Support Vector Machines, and Random Forest, will be applied to the SEER Breast Cancer Dataset to classify the 6th Stage, which comprises five classes.

Pages: 13 to 20

Copyright: Copyright (c) IARIA, 2020

Publication date: October 25, 2020

Published in: conference

ISSN: 2308-4464

ISBN: 978-1-61208-816-7

Location: Nice, France

Dates: from October 25, 2020 to October 29, 2020