Survey data preprocessing for optimal modelling through ANNs applied to management environments
Joaquín Texeira-Quirós
Maria do Rosário Texeira Justino
António José Gonçalves
Marina Godinho Antunes
Pedro Ribeiro Mucharreira
Journal of Infrastructure Policy and Development 2024, 8(9); https://doi.org/10.24294/jipd.v8i9.7108
Submitted: 12 Jun 2024
Accepted: 17 Jul 2024
Published: 06 Sept 2024
Abstract

Surveys are one of the most important means of obtaining valuable information. One of the main problems is how data collected from many different people can be processed to give reliable information about their environment. Modelling environments through Artificial Neural Networks (ANNs) is very common because ANNs are excellent at modelling predictable environments from a set of data. ANNs cope well with data sets that contain some noise, but they are fundamentally single-valued mathematical functions: they cannot give different results for the same input. So, if an ANN is trained on data in which samples with the same input configuration have different outputs, as can happen with survey data, modelling the environment successfully may become a major problem. It is therefore necessary to adjust or eliminate invalid and inconsistent data, which maximizes the probability of success and the precision in modelling the desired environment. The environment used to demonstrate the study is a strategic environment used to predict the impact of the applied strategies on an organization’s financial result, but the conclusions are not limited to this type of environment. This study describes, demonstrates and evaluates each step of a process for preparing data for use, in order to improve the performance and precision of the ANNs used to obtain the model, that is, to improve model quality. As a result of the studied process, a significant improvement can be seen both in the possibility of building a model and in its accuracy.
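
The problem of samples that share an input configuration but report different outputs can be illustrated with a minimal sketch. This is not the authors' procedure, which the abstract does not detail; the function names, the tuple-based sample layout and the survey values below are all hypothetical, and simply dropping every conflicting group is only one possible policy, assumed here for illustration.

```python
from collections import defaultdict


def find_conflicting_inputs(samples):
    """Return inputs that appear with more than one distinct output.

    `samples` is an iterable of (inputs, output) pairs; this layout is an
    assumption made for the sketch, not a format taken from the paper.
    """
    outputs_by_input = defaultdict(set)
    for inputs, output in samples:
        outputs_by_input[tuple(inputs)].add(output)
    # Keep only the input configurations whose answers disagree.
    return {
        inputs: sorted(outputs)
        for inputs, outputs in outputs_by_input.items()
        if len(outputs) > 1
    }


def drop_conflicting(samples):
    """Remove every sample whose input configuration has conflicting outputs.

    Dropping the whole group is an assumed policy; a majority vote or an
    averaged target would be alternatives.
    """
    conflicts = find_conflicting_inputs(samples)
    return [(i, o) for i, o in samples if tuple(i) not in conflicts]


# Hypothetical survey responses: three coded answers and a financial outcome.
survey = [
    ((1, 3, 2), "profit"),
    ((1, 3, 2), "loss"),    # same answers as above, different outcome
    ((2, 2, 5), "profit"),
    ((2, 2, 5), "profit"),  # exact duplicate, consistent
    ((4, 1, 1), "loss"),
]

conflicts = find_conflicting_inputs(survey)
cleaned = drop_conflicting(survey)
print(conflicts)
print(len(cleaned))
```

A network trained on the unfiltered list would be asked to map the input (1, 3, 2) to two different targets at once, which no single-valued function can satisfy; the filtered list no longer contains that contradiction.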

© 2025 by the EnPress Publisher, LLC. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.

