Advancing user classification models: A comparative analysis of machine learning approaches to enhance faculty password policies at the University of Buraimi

Boumedyen Shannaq, Oualid Ali, Said Al Maqbali, Afraa Al-Zeidi

Article ID: 9311
Vol 8, Issue 13, 2024


Abstract


In this paper, we compare several machine learning algorithms for data classification on the basis of accuracy, precision, recall, and F1-score. The Neural Network model produced the highest accuracy (0.129526) and the highest F1-score (0.118785), indicating that it strikes the best balance between precision and recall and can pick up important patterns in the dataset. Random Forest was close behind, with an accuracy of 0.128119 and the highest precision (0.118553), showing a strong ability to handle relationships in a large dataset, although with slightly lower recall than the Neural Network. The Decision Tree model ranked third with an accuracy of 0.111792; its recall shows that it identifies true positives better than the Support Vector Machine (SVM), although in most cases it predicts more positives than there actually are. SVM ranked fourth, with an accuracy of 0.095465 and an F1-score of 0.067861, reflecting difficulty in separating related classes. Finally, the K-Neighbors model ranked last, with an accuracy of 0.065531 and unsatisfactory precision and recall, indicating that this algorithm struggles with this classification task. We conclude that Neural Networks and Random Forests are the best algorithms for this task, while K-Neighbors is markedly inferior to the other classifiers.
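For readers unfamiliar with the four metrics compared above, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be computed for a multiclass task. It assumes macro averaging (an unweighted mean over classes); the abstract does not state which averaging scheme the authors used. The function name `classification_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall and F1.

    Macro averaging is an assumption made for illustration only.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        p = tp[label] / predicted if predicted else 0.0
        r = tp[label] / actual if actual else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy labels, not data from the study.
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With many classes, a model can score low on all four metrics while still ranking above its competitors, which is why the comparison above is relative rather than absolute.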


Keywords


password classification; machine learning; TF-IDF vectorization; random forest; K-Nearest Neighbors (KNN); decision tree; neural network; support vector machine

DOI: https://doi.org/10.24294/jipd9311


Copyright (c) 2024 Boumedyen Shannaq, Oualid Ali, Said Al Maqbali, Afraa Al-Zeidi

License URL: https://creativecommons.org/licenses/by/4.0/

This site is licensed under a Creative Commons Attribution 4.0 International License.