Policy-driven innovations in fraud prevention: Developing an ARFLGB-XGBoost early warning model to mitigate online romance scams in telecommunication networks

Guancheng Chen, Wenzhuo Du, Yichen Shan, Anqi Wang, Liping Wang

Article ID: 7153
Vol 8, Issue 12, 2024

Abstract


To address the escalating problem of online romance scams (ORS) within telecom fraud, we developed an Adaptive Random Forest Light Gradient Boosting (ARFLGB)-XGBoost early warning system. We compiled detailed ORS incident data into a categorized 24-variable dataset and analyzed feature importance with Random Forest and LightGBM models. An adaptive algorithm, ARFLGB, then optimizes these features for integration with XGBoost, enhancing early detection of ORS threats. Compared with traditional models, our model improved accuracy by 3.9%, precision by 12.5%, recall by 5%, F1 score by 5.6%, and Area Under the Curve (AUC) by 5.2%. This research highlights the essential role of advanced fraud detection in preserving the integrity of communication networks, which in turn supports economic stability and public safety, and it has implications for policymakers and industry in advancing secure communication infrastructure.
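For readers who want to see how the reported metrics relate to one another, the following is a minimal sketch, not the authors' evaluation code. It computes accuracy, precision, recall, and F1 from confusion-matrix counts and then compares two models. The abstract does not say whether the reported gains are absolute (percentage points) or relative, so the sketch prints both. The function names `classification_metrics` and `gains` and all counts are hypothetical and chosen only for illustration. AUC is left out because it requires ranked prediction scores rather than counts.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def gains(baseline, improved):
    """Absolute (percentage-point) and relative (percent) gain per metric."""
    result = {}
    for name, base_value in baseline.items():
        diff = improved[name] - base_value
        relative = diff / base_value * 100 if base_value else float("nan")
        result[name] = {"points": diff * 100, "relative_pct": relative}
    return result


# Hypothetical counts for a baseline model and an improved model
# evaluated on the same 1000 cases (not taken from the paper).
baseline = classification_metrics(tp=70, fp=30, fn=30, tn=870)
improved = classification_metrics(tp=80, fp=20, fn=20, tn=880)

for metric, g in gains(baseline, improved).items():
    print(f"{metric}: {g['points']:.1f} points, {g['relative_pct']:.1f}% relative")
```

With the same underlying counts, the two readings of "gain" give noticeably different numbers, which is why it helps to state which one is meant when comparing models.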


Keywords


telecom network fraud; online romance scams; early warning model; XGBoost; innovative adaptive algorithm


DOI: https://doi.org/10.24294/jipd.v8i12.7153

Copyright (c) 2024 Guancheng Chen, Wenzhuo Du, Yichen Shan, Anqi Wang, Liping Wang

License URL: https://creativecommons.org/licenses/by/4.0/