“random forest” - Search All - EnPress Publisher LLC

Aug 6, 2020

Study on the distribution pattern and influencing factors of shrinking cities in Northeast China based on the random forest model

Using population change data for 2005–2009, 2010–2014, 2015–2019 and 2005–2019, this study identifies the shrinking cities in Northeast China and analyzes their spatial distribution pattern. The influencing factors of shrinking cities and their effects are then explored using multiple linear regression and random forest regression. The results show that: 1) spatially, the shrinking cities in Northeast China are mainly distributed in the “land edge” areas represented by Changbai Mountain, Sanjiang Plain, Xiaoxing’an Mountain and Daxing’an Mountain; over time, the contraction center shows an obvious northward shift, while the expansion center shifts southward, and the shrinking cities cluster further; 2) both multiple linear regression and random forest regression indicate that socio-economic factors play a major role in the formation of shrinking cities; 3) the precision of random forest regression is higher than that of multiple linear regression. Per capita GDP has the greatest impact on contraction intensity, followed by the unemployment rate, science and education expenses and the average wage of on-the-job workers. Of these four factors, only the unemployment rate promotes contraction; the other three inhibit the formation of shrinking cities to varying degrees.

Jan 16, 2025

Spatial analysis and classification of land use patterns in Lucknow district, UP, India using GIS and random forest approach

Mapping land use and land cover (LULC) is essential for comprehending changes in the environment and promoting sustainable planning. To achieve accurate and effective LULC mapping, this work investigates the integration of Geographic Information Systems (GIS) with Machine Learning (ML) methodology. Different types of land cover in the Lucknow district were classified using the Random Forest (RF) algorithm and Landsat satellite images. Because the study area contains a variety of landforms, classification accuracy is a challenge. This challenge is met by incorporating supplementary data into the GIS framework and adjusting algorithm settings such as the selection of cloud-free images and homogeneous training samples. The results show a net increase of 484.59 km² in built-up areas, a net decrease of 75.44 km² in forest areas, and a drastic net decrease of 674.52 km² in wetlands. Most of the wastelands have been converted into urban areas and agricultural land, depending on their suitability for settlements or crops. The classifications achieved an overall accuracy near 90%. This strategy provides a reliable way to track changes in land cover, supporting resource management, urban planning, and environmental preservation. The results highlight how sophisticated computational methods can enhance the accuracy of LULC evaluations.

Aug 7, 2024

Advancing financial analytics: Integrating XGBoost, LSTM, and Random Forest Algorithms for precision forecasting of corporate financial distress

This study examined the use of different machine learning models to predict financial distress in Indonesian companies, using the Financial Ratio dataset collected from the Indonesia Stock Exchange (IDX), which includes financial indicators from companies across multiple industries spanning a decade. The data were partitioned into training and test sets, and class imbalance was handled with the SMOTE and RUS approaches to support reliable and unbiased model training and assessment. Initial models were built to establish a benchmark for performance measurement. Various models, including Decision Trees, XGBoost, Random Forest, LSTM, and Support Vector Machine (SVM), were assessed. The ensemble models, XGBoost and Random Forest, performed better when combined with SMOTE. The findings validate the efficacy of ensemble methods in forecasting financial distress: the XGBClassifier and RandomForestClassifier demonstrate reliable and robust performance. Feature importance analysis highlighted the significance of financial indicators; interest_coverage and operating_margin, for instance, were crucial to the predictive ability of the models. Companies and regulators can both use these findings: the XGB classifier and the Random Forest classifier could be employed to forecast financial distress, and the interest coverage ratio and operating margin ratio should be taken into account, as these financial ratios play a critical role in assessing performance.
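
As a rough illustration of the class-balancing step mentioned in this abstract, the sketch below shows random undersampling (RUS) using only the Python standard library. The function name `random_undersample` and the toy data are hypothetical and are not taken from the paper; SMOTE, which synthesizes new minority-class rows instead of dropping majority-class rows, is not shown.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows until every class has as many rows as the
    rarest class (random undersampling, RUS)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    # Size of the smallest class; every class is cut down to this size.
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: 8 healthy firms (label 0), 2 distressed firms (label 1).
rows = [[0.1 * i, 1.0 - 0.1 * i] for i in range(10)]
labels = [0] * 8 + [1] * 2

balanced_rows, balanced_labels = random_undersample(rows, labels)
print(Counter(balanced_labels))
```

Resampling of this kind is usually applied to the training split only, so that the test split keeps the original class proportions.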

Nov 10, 2023

Comparison of Ridge Regression and GA-RF Models for Boston House Price Prediction

This paper explores and compares the performance of ridge regression and a random forest model improved by a genetic algorithm in predicting the Boston house price data set. To this end, the data are divided into a training set and a test set at a 70-30 ratio. RidgeCV is used to select the best regularization parameter for the ridge regression model, and a genetic algorithm is used to optimize the hyperparameters of the random forest model. The results show that, compared with ridge regression, the random forest model improved by the genetic algorithm performs better on the Boston house price regression problem.
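
The workflow in this abstract (a 70-30 split, then choosing the ridge regularization parameter by cross-validation) can be sketched roughly as follows. This is a standard-library stand-in under stated assumptions, not the paper's code: it uses hypothetical synthetic one-feature data rather than the Boston data set, a hand-written closed-form ridge fit rather than scikit-learn's RidgeCV, and a hypothetical candidate grid `alphas`. The genetic-algorithm tuning of the random forest is not shown.

```python
import random


def fit_ridge_1d(xs, ys, alpha):
    """Closed-form ridge fit for one feature; the intercept is not penalized."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # alpha shrinks the slope toward zero
    return slope, y_mean - slope * x_mean


def mse(model, xs, ys):
    """Mean squared error of a (slope, intercept) model."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cv_mse(xs, ys, alpha, folds=5):
    """Mean validation MSE over interleaved folds."""
    total = 0.0
    for k in range(folds):
        train = [i for i in range(len(xs)) if i % folds != k]
        valid = [i for i in range(len(xs)) if i % folds == k]
        model = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], alpha)
        total += mse(model, [xs[i] for i in valid], [ys[i] for i in valid])
    return total / folds


# Hypothetical synthetic data: y = 3x + 2 plus Gaussian noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [3 * x + 2 + rng.gauss(0, 1) for x in xs]

# 70-30 split after shuffling the row indices.
order = list(range(len(xs)))
rng.shuffle(order)
train_idx, test_idx = order[:70], order[70:]
x_train = [xs[i] for i in train_idx]
y_train = [ys[i] for i in train_idx]
x_test = [xs[i] for i in test_idx]
y_test = [ys[i] for i in test_idx]

# Pick the candidate alpha with the lowest cross-validated error on the training set.
alphas = [0.01, 0.1, 1.0, 10.0, 100.0]
best_alpha = min(alphas, key=lambda a: cv_mse(x_train, y_train, a))

# Refit on the full training set and score once on the held-out test set.
slope, intercept = fit_ridge_1d(x_train, y_train, best_alpha)
test_mse = mse((slope, intercept), x_test, y_test)
print(best_alpha, slope, intercept, test_mse)
```

The test set is touched only once, after the regularization parameter has been chosen, so the reported error is not biased by the selection step.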

Aug 8, 2024

Deciphering the complexity of COVID-19 transmission: Unveiling precision through robust vaccination policies and advanced predictive modeling with random forest regression

In the realm of COVID-19 transmission data, scientists are scrutinizing policies to identify the ideal vaccination rate for halting the virus. This study aimed to pinpoint the minimal vaccinated percentage needed to break the virus cycle within communities. The underlying motivation stems from the urgent need to contain COVID-19’s spread and reduce the strain on healthcare systems worldwide. With fluctuating infection rates and the emergence of new variants, understanding the optimal vaccination rate has become a cornerstone in public health planning and pandemic response. Using diverse machine learning methods, this study analyzed infection peaks and hospitalization rates during vaccination campaigns across countries. The goal was to find the vaccination threshold necessary to prevent virus resurgence, even with new variants. This critical milestone is crucial for health systems to combat the pandemic effectively. The study’s analysis revealed the correlation between vaccination rates and hospitalizations, highlighting immunization’s pivotal role. Employing Random Forest regression, the study successfully predicted new cases and hospitalization rates, offering valuable insights into pandemic management strategies. For future research, we recommend exploring the impact of vaccination on the evolution of virus variants and the potential influence of socio-economic factors on vaccination uptake. Moreover, a broader analysis across different geographical regions can further validate the study’s findings and enhance global pandemic preparedness.

Nov 11, 2024

Advancing user classification models: A comparative analysis of machine learning approaches to enhance faculty password policies at the University of Buraimi

In this paper, we assess the results of experiments with different machine learning algorithms for data classification on the basis of accuracy, precision, recall and F1-score. The Neural Network model produced the highest accuracy (0.129526) and the highest F1-score (0.118785), showing that it balances precision and recall well and can pick up important patterns in the dataset. Random Forest was not far behind, with an accuracy of 0.128119 and the highest precision score (0.118553), indicating a strong ability to handle relationships in a large dataset, though with slightly lower recall than the Neural Network. The Decision Tree model ranked third, with an accuracy of 0.111792; its recall showed that it predicts true positives better than the Support Vector Machine (SVM), although it over-predicts the positive class most of the time. SVM ranked fourth, with an accuracy of 0.095465 and an F1-score of 0.067861, indicating difficulty in separating related classes. Finally, the K-Neighbors model ranked last, with an accuracy of 0.065531 and unsatisfactory precision and recall, pointing to this algorithm's problems with the classification task. We found that Neural Networks and Random Forests are the best algorithms for this classification task, while K-Neighbors is far inferior to the other classifiers.
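
For readers unfamiliar with the four metrics compared in this abstract, the sketch below computes them for a multi-class problem using only the Python standard library. The abstract does not say how per-class scores were averaged, so macro averaging is an assumption here; the function name `macro_metrics` and the toy labels are hypothetical.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1 (assumed averaging)."""
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)  # true positives for class c
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy labels for three classes.
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]
scores = macro_metrics(y_true, y_pred)
print(scores)
```

Precision penalizes false positives and recall penalizes false negatives, which is why a model can lead on one and trail on the other, as reported for Random Forest and the Neural Network above.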
