Survey data preprocessing for optimal modelling through ANNs applied to management environments

Joaquín Texeira-Quirós, Maria do Rosário Texeira Justino, António José Gonçalves, Marina Godinho Antunes, Pedro Ribeiro Mucharreira

Article ID: 7108
Vol 8, Issue 9, 2024


Abstract


Surveys are one of the most important ways of obtaining valuable information. A central problem is how to process data collected from many different respondents so that it yields reliable information about their environment. Modelling environments through Artificial Neural Networks (ANNs) is very common because ANNs are excellent at modelling predictable environments from a set of data. ANNs cope well with data sets that contain some noise, but they are fundamentally single-valued mathematical functions and cannot give different results for the same input. So, if an ANN is trained on data in which samples with the same input configuration have different outputs, as can happen with survey data, this can seriously compromise the success of modelling the environment. It is therefore necessary to adjust the data and to eliminate invalid and inconsistent samples, which maximizes the probability of success and the precision in modelling the desired environment. The environment used to demonstrate the study is a strategic one, in which the model predicts the impact of the applied strategies on an organization's financial results, but the conclusions are not limited to this type of environment. This study describes, demonstrates and evaluates each step of a process that prepares the data for use in order to improve the performance and precision of the ANNs used to obtain the model, that is, to improve the model quality. The studied process yields a significant improvement both in the possibility of building a model and in its accuracy.
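The inconsistency described in the abstract, identical input configurations paired with different outputs, can be illustrated with a short sketch. The abstract does not specify the authors' preprocessing steps, so the following is only an assumption about one possible approach: the function names, the sample values and the choice of averaging the conflicting outputs are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def group_by_input(samples):
    """Group output values by their input configuration."""
    groups = defaultdict(list)
    for inputs, output in samples:
        groups[tuple(inputs)].append(output)
    return groups


def find_conflicts(samples):
    """Return the input configurations that have more than one distinct output."""
    groups = group_by_input(samples)
    return {key: outs for key, outs in groups.items() if len(set(outs)) > 1}


def consolidate(samples):
    """Keep one sample per input configuration, using the mean of its outputs.

    Averaging is an assumption made for illustration; dropping the
    conflicting samples or taking a majority vote are alternatives.
    """
    groups = group_by_input(samples)
    return [(key, mean(outs)) for key, outs in groups.items()]


# Hypothetical survey answers: (strategy scores, reported financial result)
survey = [
    ((3, 4, 2), 0.10),
    ((3, 4, 2), 0.30),   # same inputs as the previous row, different output
    ((5, 1, 1), -0.05),
]

conflicts = find_conflicts(survey)
cleaned = consolidate(survey)
```

After this step each input configuration appears once in `cleaned`, so a network trained on it is no longer asked to map one input to several targets.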


Keywords


survey; data; processing; modelling; neural networks; ANN






DOI: https://doi.org/10.24294/jipd.v8i9.7108



Copyright (c) 2024 Joaquín Texeira-Quirós, Maria do Rosário Texeira Justino, António José Gonçalves, Marina Godinho Antunes, Pedro Ribeiro Mucharreira

License URL: https://creativecommons.org/licenses/by/4.0/
