ICDS 2012, The Sixth International Conference on Digital Society
Identifying Potentially Useful Email Header Features for Email Spam Filtering
Authors:
Omar Al-Jarrah
Ismail Khater
Basheer Al-Duwairi
Keywords: Email spam, Machine Learning
Abstract:
Email spam continues to be a major problem on the Internet. With the spread of malware combined with the power of botnets, spammers are now able to launch large-scale spam campaigns, causing major traffic increases and leading to enormous economic loss. In this paper, we identify potentially useful email header features for email spam filtering by analyzing publicly available datasets. Then, we use these features as input to several machine learning-based classifiers and compare their performance in filtering email spam. These classifiers are: C4.5 Decision Tree (DT), Support Vector Machine (SVM), Multilayer Perceptron (MP), Naïve Bayes (NB), Bayesian Network (BN), and Random Forest (RF). Experimental studies based on publicly available datasets show that the RF classifier has the best performance, with an average accuracy, precision, recall, F-Measure, ROC area of 98.5%, 98.4%, 98.5%, and 98.5%, respectively.
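For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how accuracy, precision, recall, and F-Measure can be computed for a binary spam filter. It is an illustration only, not the authors' evaluation code: the function name `evaluate`, the `Metrics` container, and the choice of "spam" as the positive class are assumptions, and the averaged figures reported in the paper may have been computed differently (for example, averaged over both classes).

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f_measure: float


def evaluate(y_true, y_pred, positive="spam"):
    """Compute binary classification metrics (hypothetical helper).

    Assumes `positive` is the label treated as the positive class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-Measure is the harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return Metrics(accuracy, precision, recall, f_measure)


if __name__ == "__main__":
    # Toy labels, made up for illustration; not data from the paper.
    truth = ["spam", "spam", "ham", "ham"]
    predicted = ["spam", "ham", "ham", "ham"]
    print(evaluate(truth, predicted))
```

In this toy run one spam message is missed and no legitimate message is flagged, so precision is perfect while recall is halved; this is the kind of trade-off the separate measures are meant to expose.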
Pages: 140 to 145
Copyright: Copyright (c) IARIA, 2012
Publication date: January 30, 2012
Published in: conference
ISSN: 2308-3956
ISBN: 978-1-61208-176-2
Location: Valencia, Spain
Dates: from January 30, 2012 to February 4, 2012