FUTURE COMPUTING 2021, The Thirteenth International Conference on Future Computational Technologies and Applications


Data Pre-processing and Clustering Algorithm for Epidemic Disease Diagnosis Data

Authors:
Yaoyao Sang
Lianjiang Zhu
Tao Du
Shouning Qu

Keywords: Tuberculosis clinic data; data normalization; location information; Density peak clustering algorithm

Abstract:
This paper addresses the problem of nonstandard tuberculosis data and performs cluster analysis on it. It is an evaluation of existing work. Current epidemic disease data are voluminous and have great research value, but they cannot easily be used directly to discover knowledge. In this paper, the tuberculosis clinic data are pre-processed first; the pre-processing mainly includes data cleaning, data integration, data transformation, and data normalization. For the location information in the data, an innovative data weighting method is proposed, which makes the complex medical data numerical and normalized. Then the density peak clustering algorithm, a novel unsupervised machine learning method, is used to cluster the data set and validate our method. This work produces clustering results from which knowledge can be discovered. The paper first reviews the relevant literature, then introduces the pre-processing method, then describes the application of the density peak clustering algorithm, and finally gives a summary and outlook.
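
The two generic steps named in the abstract, normalization followed by density peak clustering, can be illustrated with a minimal sketch. It assumes plain min-max scaling and the commonly described cutoff-kernel form of density peak clustering (local density, distance to the nearest denser point, centers chosen by their product). The abstract does not give the paper's location-weighting scheme or parameter choices, so none of that is reproduced here; the function names and the parameters `d_c` and `n_clusters` are hypothetical.

```python
import math


def min_max_normalize(rows):
    """Scale each column of `rows` to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]


def density_peak_clustering(points, d_c, n_clusters):
    """Return one cluster label per point (illustrative sketch only)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [0.0] * n    # distance to the nearest point ranked denser
    parent = [-1] * n    # index of that nearest denser point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbor; use its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Centers: the points with the largest rho * delta. The sort is stable,
    # so the densest point stays first and is always selected.
    centers = sorted(order, key=lambda i: -rho[i] * delta[i])[:n_clusters]

    labels = [-1] * n
    for label, i in enumerate(centers):
        labels[i] = label

    # Every other point inherits the label of its nearest denser neighbor,
    # which has already been labeled because of the visiting order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels
```

In this sketch the records would first be passed through `min_max_normalize`, and the scaled rows then given to `density_peak_clustering`. The pairwise distance matrix makes it quadratic in the number of records, so it is suited to small illustrative data sets only.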

Pages: 14 to 19

Copyright: Copyright (c) IARIA, 2021

Publication date: April 18, 2021

Published in: conference

ISSN: 2308-3735

ISBN: 978-1-61208-846-4

Location: Porto, Portugal

Dates: from April 18, 2021 to April 22, 2021