PHISHING WEBSITE IDENTIFICATION: UNLEASHING THE POTENTIAL OF MACHINE LEARNING AND STACKING ENSEMBLES TECHNIQUES

Authors: Md. Ziaul Hassan*, Md. Humayun Kabir and Md. Amir Hamja
* Corresponding Author
Published on 2024-11-28
DOI: https://www.doi.org/10.59125/JST.22112
Abstract:

As the internet continues to play an integral role in daily life, the proliferation of phishing websites poses a significant threat to online security that needs to be addressed. This paper predicts phishing websites using Machine Learning (ML) and ensemble ML algorithms, utilizing an open dataset from a Kaggle competition. The study employs a varied array of ML algorithms, namely Decision Tree (DT), K-Nearest Neighbors (KNN), Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting Machine (GBM), to classify features indicative of phishing activity. Additionally, we propose two stacking ensemble models built by combining ML models. The first, ESM1, uses DT, KNN, and SVM as base learners and LR as the meta-learner; the second, ESM2, uses GBM, KNN, and SVM as base learners and LR as the meta-learner. The performance of the models is compared on several metrics, including accuracy, precision, recall, F1 score, and AUC. Among the trained models, ESM2 performed best, with a value of 0.98 for all performance measures, closely followed by the individual RF and GBM models, which are also reported at 0.98 for all metrics. This research contributes to ongoing efforts to enhance cybersecurity by leveraging machine learning and stacking ensemble learning for the proactive identification and mitigation of online phishing threats.
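The two stacking configurations described in the abstract could be sketched roughly as follows with scikit-learn. This is a minimal illustration and not the authors' implementation: the synthetic data from `make_classification` stands in for the Kaggle dataset, and the `build_stack` helper, the feature scaling for KNN and SVM, the 5-fold stacking, the train/test split, and all hyperparameters are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_stack(first_learner):
    """Hypothetical helper: stack one tree-based learner with KNN and SVM,
    using Logistic Regression as the meta-learner."""
    base_learners = [
        ("first", first_learner),
        # Scaling is an assumption; KNN and SVM are distance/margin based.
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
        ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
    ]
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,  # out-of-fold base predictions feed the meta-learner
    )


# Synthetic binary data as a placeholder for the phishing dataset (assumption).
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

models = {
    "ESM1": build_stack(DecisionTreeClassifier(random_state=0)),      # DT + KNN + SVM
    "ESM2": build_stack(GradientBoostingClassifier(random_state=0)),  # GBM + KNN + SVM
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]  # positive-class score for AUC
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
        "auc": roc_auc_score(y_test, proba),
    }
    print(name, {k: round(v, 3) for k, v in results[name].items()})
```

The scores printed here describe the synthetic data only and should not be compared with the values reported in the paper.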
