International Journal On Advances in Software, volume 16, numbers 1 and 2, 2023
Methodological Choices in Machine Learning Applications
Authors:
Kendall Nygard
Mostofa Ahsan
Aakanksha Rastogi
Rashmi Satyal
Keywords: Machine Learning; Data Management; Feature Engineering; Feature Selection; Self-Driving Cars; Intrusion Detection
Abstract:
Machine learning is a subset of artificial intelligence in which a machine has the ability to learn and employs complex algorithms to imitate human behavior. Development of a machine learning model involves careful preparation and management of data and selection of features to produce meaningful results. The data issues are often challenging due to availability, characteristics, properties, categorization, and balance. We report on relevant literature, case studies, and experiments surrounding the data issues. We describe alternative machine learning methodologies and emphasize supervised learning, including treatment of experimental procedures. Procedures and challenges in the collection, quantity, distribution, quality, sampling, and relevancy of data are included. Applications of machine learning models are presented, including classification models for self-driving cars. These models introduce anti-autonomy trust modeling. We also describe intrusion detection models that can detect malicious activity in computing systems. These applications also provide insight into overfitting and underfitting training data. Feature engineering and feature selection issues are presented, including approaches to identifying, combining, and eliminating attributes and features to determine which are needed and their significance. Approaches for treating class imbalances in data management are discussed. Comparisons among categorical encoding techniques are presented. The work provides perspective and insight into resolving multiple issues that must be addressed in utilizing machine learning models in practice.
Pages: 82 to 96
Copyright: Copyright (c) to authors, 2023. Used with permission.
Publication date: June 30, 2023
Published in: journal
ISSN: 1942-2628