Impacts of Human Development Index and Percentage of Total Population on Poverty using OLS and GWR models in Central Java, Indonesia

Central Java is one of the provinces with the highest number of poor people on the island of Java; in 2020 the number of poor people increased by 0.44 million from the previous year. Poverty is influenced by several factors, including the Human Development Index (HDI) and the size of the population. Each region has characteristics that differ from those of other regions, and these differences give rise to a specific spatial effect, namely spatial heterogeneity. Geographically Weighted Regression (GWR) is a statistical method that can analyze spatial heterogeneity by assigning different weights and fitting a different model at each observation location. This study aims to determine whether HDI and the percentage of total population significantly affect the number of poor people in Central Java Province in 2020 while retaining the spatial effect. The significant variables fall into three groups for GWR with the Adaptive Kernel Bisquare weighting function and into four groups for the Adaptive Kernel Tricube weighting function. The Key Performance Indicators (KPIs) used are $R^2$, the Akaike Information Criterion (AIC), Mean Absolute Error (MAE), Mean Square Error (MSE), and Mean Absolute Percentage Error (MAPE). Based on these KPIs, the GWR model with the Adaptive Kernel Bisquare weighting function gives better results than the OLS model.


Introduction
Poverty is a fundamental economic problem for a country and a focus of government attention; it is complex and must be handled appropriately. In 2020 the number of poor people increased by 2.76 million [1]. Several criteria are used to characterize poverty, one of which is the number of poor people. Central Java Province is one of the provinces with the highest number of poor people on the island of Java. The number of poor people in Central Java Province in 2020 increased by 0.44 million from the previous year [2]. Poverty is influenced by several factors, including the Human Development Index (HDI) and the population of an area.
Each region has characteristics that are different from other regions. These differences in characteristics cause spatial effects. The use of spatial effects in research will cause problems in the form of spatial dependence (spatial autocorrelation) and the emergence of spatial heterogeneity [3], [4].
Spatial heterogeneity occurs because of random location effects; in other words, the same variable produces a different response from one location to another [5]. It arises from the differences in character that are specific to each location. When spatial heterogeneity is present, the parameter estimator in the model is no longer the Best Linear Unbiased Estimator (BLUE). Moreover, heterogeneity can make the conclusions drawn from the model misleading, because the variance tends to increase, which inflates the standard errors [6].
One statistical method that can be used to analyze spatial heterogeneity is Geographically Weighted Regression (GWR). GWR differs from linear regression (OLS): linear regression estimates its parameters with the least-squares method [7], whereas GWR uses the weighted least squares method [8]. Several previous studies are related to these methods or to this case. Andrietya et al. [9] analyzed the determinants of poverty in Central Java Province using linear regression. Haryanto & Andriani [10] applied GWR with a fixed exponential kernel weighting function to predict the number of poor people in Central Java in 2018, using the district/city minimum wage, the Human Development Index (HDI), and the Open Unemployment Rate as predictors. Fotheringham et al. [11] evaluated the impacts of air pollution in China using a recently proposed model, Multi-scale Geographically Weighted Regression (MGWR). Koh et al. [12] examined the relationships between NO3-N concentration and various parameters (topography, hydrology, and land use) across Jeju Island, South Korea, using OLS regression and GWR. Zhao et al. [13] used principal component analysis (PCA) and the GWR model to predict the spatial distribution of frozen ground temperature. Ma et al. [14] discussed a Bayesian approach to GWR. Muche et al. [15] used GWR to analyze the clustering of under-nutrition and its predictors among under-five children in Ethiopia, based on demographic and health survey data.
Based on this, the present study aims to measure the impact of HDI and the Percentage of Total Population on poverty in Central Java Province while accounting for spatial effects, because each region differs in nature and character.

Data
The data used in this study are secondary data from the website of the Central Java Central Bureau of Statistics (BPS), which can be accessed at https://jateng.bps.go.id/. The dependent variable is the Number of Poor Population (NPP), and the independent variables are the Human Development Index (HDI) and the Percentage of Total Population (PP) in Central Java Province in 2020. The variables are described in Table 1.

[Table 1: description of the research variables. Recoverable entries: unit "Percent"; definition "Number of populations in an area".]

Linear Regression (OLS) and Geographically Weighted Regression (GWR)
Linear regression is an analytical method for examining the linear relationship between variables, in which one variable is treated as the affected variable and the others as the influencing variables; the relationship is quantified by the regression parameters, or coefficients. The influencing variables are called independent or predictor variables, and the affected variable is called the dependent or response variable [16]. One of the most widely used approaches to linear regression is Ordinary Least Squares (OLS). OLS estimates the regression coefficients by minimizing the sum of squared errors [17].
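The least-squares idea described above can be written compactly in matrix form. The following is the standard textbook formulation rather than an equation taken from this study; the symbols $\mathbf{X}$ (design matrix with a leading column of ones), $\mathbf{y}$ (vector of responses), $\mathbf{x}_i$ (row $i$ of $\mathbf{X}$), and $n$ (number of observations) are notation assumed here for illustration.

```latex
\hat{\boldsymbol\beta}_{\mathrm{OLS}}
  = \arg\min_{\boldsymbol\beta} \sum_{i=1}^{n} \left( y_i - \mathbf{x}_i^{\top}\boldsymbol\beta \right)^2
  = \left( \mathbf{X}^{\top}\mathbf{X} \right)^{-1} \mathbf{X}^{\top}\mathbf{y}
```

The closed form on the right holds when $\mathbf{X}^{\top}\mathbf{X}$ is invertible, and it yields a single global coefficient vector for all locations.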
The GWR method is a development of the regression method that produces a model with its own parameter equation for each observation location [18]. Spatial heterogeneity is what causes the model equations to differ between locations, because the variance differs from one observation to another [8]. Spatial heterogeneity is the condition in which an independent variable produces a different response at different locations [5]. Because of these different responses, different regression parameter estimates are obtained at each observation location [19]. The presence of spatial heterogeneity can be tested using the Breusch-Pagan method, with the null hypothesis $H_0$ that there is no spatial heterogeneity. The number of parameter equations obtained equals the number of observation locations used in the study [20], and the model obtained at one observation location cannot be used to estimate parameters at other observation locations [21].
The parameters of the GWR model are estimated with the Weighted Least Squares (WLS) method, that is, least squares with observation weights [22]. WLS is known to be able to neutralize the consequences of violating the homogeneity-of-variance assumption. GWR is a local analysis method, while linear regression is a global analysis method. In addition, GWR takes spatial or location effects into account. The GWR model is presented in Equation 1.
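$$y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i \qquad (1)$$
where $y_i$ is the value of the dependent variable at the $i$-th observation location, $x_{ik}$ is the value of the $k$-th independent variable at the $i$-th observation location, $p$ is the number of independent variables, $(u_i, v_i)$ are the coordinates of the $i$-th observation location, $\beta_0(u_i, v_i)$ is the GWR intercept, $\beta_k(u_i, v_i)$ is the $k$-th regression coefficient at the $i$-th observation location, and $\varepsilon_i$ is the error at the $i$-th observation location.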
Neighborhood relationships that express proximity between locations are encoded in a spatial weighting matrix [23]. The weights play a central role in the GWR model because they represent the position of each observation relative to the others. The spatial weighting of the GWR model takes the form of a diagonal matrix whose elements are the values of a weighting function for each observation location, based on a point location approach. Several methods are available for determining the weight of each location in the GWR model, one of which is the kernel function.
The kernel function, $K(u)$, is a continuous, symmetric, and bounded function. It is used to estimate the parameters in the GWR model when the distance function is continuous and monotonically decreasing. The kernel weighting of the GWR model can be formed from the distance function, as presented in Equation 2.
Here $d_{ij}$ is the Euclidean distance between the $i$-th observation point and the $j$-th observation point [22], and $h$ is the bandwidth. The value of $d_{ij}$ is obtained by Equation 3.
$$d_{ij} = \sqrt{(u_i - u_j)^2 + (v_i - v_j)^2} \qquad (3)$$
The bandwidth ($h$) controls the reach of the weighting function; it states how far the influence of one location extends to other locations [24]. The weighting matrix $W(u_i, v_i)$ can be determined using a kernel function, which assigns weights according to the optimum bandwidth. Selecting the optimum bandwidth is essential in the GWR method because it affects the accuracy of the model. There are two types of kernel functions, Adaptive and Fixed. An Adaptive kernel function uses a different bandwidth value at each observation location, adjusted to that location. In contrast, a Fixed kernel function uses a single bandwidth value that applies to all observation locations.
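As a minimal sketch of how distances and bandwidths turn into weights, the Python snippet below implements the Euclidean distance together with the commonly used textbook forms of the bisquare, tricube, and Gaussian kernels. These forms are assumptions for illustration and are not copied from the paper's Equation 2. The adaptive bandwidth is taken to be the distance to the k-th nearest neighbour, which is one common convention; the function names and the coordinates are hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two (u, v) coordinate pairs."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def bisquare(d, h):
    """Bisquare kernel: weight falls to zero at distance h and beyond."""
    return (1.0 - (d / h) ** 2) ** 2 if d < h else 0.0


def tricube(d, h):
    """Tricube kernel: also zero at distance h and beyond."""
    return (1.0 - (d / h) ** 3) ** 3 if d < h else 0.0


def gaussian(d, h):
    """Gaussian kernel: weight decays smoothly and never reaches zero."""
    return math.exp(-0.5 * (d / h) ** 2)


def adaptive_bandwidths(coords, k):
    """One bandwidth per location: distance to its k-th nearest other location.

    This k-nearest-neighbour rule is an assumed convention for adaptive kernels.
    """
    bandwidths = []
    for i, p in enumerate(coords):
        dists = sorted(euclidean(p, q) for j, q in enumerate(coords) if j != i)
        bandwidths.append(dists[k - 1])
    return bandwidths


def weight_row(coords, i, h, kernel):
    """Weights of all observations as seen from location i."""
    return [kernel(euclidean(coords[i], q), h) for q in coords]


# Hypothetical coordinates, for illustration only.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
h_adaptive = adaptive_bandwidths(coords, k=2)
w0 = weight_row(coords, 0, h_adaptive[0], bisquare)
print(h_adaptive[0])  # 2.0
print(w0)             # [1.0, 0.5625, 0.0, 0.0]
```

With a fixed kernel, the same `h` would be passed for every location instead of `h_adaptive[i]`. Note that under this convention the k-th neighbour itself sits exactly on the bandwidth and receives weight zero from the bisquare and tricube kernels.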
Adaptive kernel functions include (1) Adaptive Kernel Bisquare, (2) Adaptive Kernel Gaussian, and (3) Adaptive Kernel Tricube, while Fixed kernel functions include (1) Fixed Kernel Bisquare, (2) Fixed Kernel Gaussian, and (3) Fixed Kernel Tricube [25]. The parameter $\beta(u_i, v_i)$ in the GWR model is estimated with the Weighted Least Squares (WLS) method, that is, by giving different spatial weights at each observation location [5]. The spatial weights are obtained from the distances between observation locations: the closer two observation locations are, the greater the spatial weight.
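The local WLS step described above can be sketched as follows, assuming the weights for one location have already been computed. The code solves the standard weighted normal equations $(X^{\top} W X)\,\beta = X^{\top} W y$ in plain Python; the helper names `solve` and `local_wls`, the toy data, and the weights are hypothetical and are not taken from the study.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def local_wls(X, y, w):
    """Weighted least squares for one location: solve (X'WX) beta = X'Wy.

    X: rows [1, x1, x2, ...]; y: responses; w: one non-negative weight per row.
    """
    p = len(X[0])
    xtwx = [[sum(wi * xi[a] * xi[b] for xi, wi in zip(X, w)) for b in range(p)]
            for a in range(p)]
    xtwy = [sum(wi * xi[a] * yi for xi, yi, wi in zip(X, y, w)) for a in range(p)]
    return solve(xtwx, xtwy)


# Toy data generated exactly from y = 2 + 3*x1 - 1*x2 (illustrative only).
X = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 2, 1]]
y = [2, 5, 1, 4, 7]
w = [1.0, 0.5, 0.25, 0.8, 0.1]  # hypothetical kernel weights for one location
beta = local_wls(X, y, w)
print([round(b, 6) for b in beta])
```

Repeating this fit once per location, each time with that location's own weight vector, gives the set of local coefficient estimates. Setting all weights to 1 reduces the fit to ordinary least squares. If too few observations receive a non-zero weight, the system becomes singular and `solve` fails with a division by zero.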
After the GWR model parameters have been estimated, GWR hypothesis testing is carried out: a test of the model's suitability and a test of the model parameters. The goodness of fit test determines the significance of the geographical factors, which are the core of the GWR model, with $H_0$ stating that there is no significant difference between the classical regression model and the GWR model. The parameter test of the GWR model is a partial test that determines which parameters have a significant effect on the dependent variable at each observation location, with $H_0$ stating that the independent variable has no effect on the dependent variable. This test is carried out by looking at the p-value of each parameter. The significance level (α) is the threshold used to decide significance: if the p-value is less than or equal to the significance level, the result is considered statistically significant. The confidence level used is 95%, that is, α = 0.05.
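The partial-test decision rule above (significant when the p-value is at most α) can be sketched as a small helper that lists, for each location, which variables are significant. The location names and p-values below are made up for illustration and are not results from this study.

```python
ALPHA = 0.05  # significance level used in the text


def significant_variables(p_values, alpha=ALPHA):
    """Return the names of variables whose p-value is <= alpha."""
    return [name for name, p in p_values.items() if p <= alpha]


# Hypothetical local p-values per location (not from the paper).
local_p_values = {
    "Location A": {"HDI": 0.003, "PP": 0.210},
    "Location B": {"HDI": 0.400, "PP": 0.010},
    "Location C": {"HDI": 0.020, "PP": 0.050},
    "Location D": {"HDI": 0.300, "PP": 0.700},
}

groups = {loc: significant_variables(p) for loc, p in local_p_values.items()}
for loc, names in groups.items():
    print(loc, names if names else "no significant variable")
```

Locations that share the same list of significant variables form one group, which is how a grouping of districts by significant predictors can be derived from local test results.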

Key Performance Indicator
The Key Performance Indicators (KPIs) used are $R^2$, the Akaike Information Criterion (AIC) [26], Mean Absolute Error (MAE) [27], Mean Square Error (MSE) [28], and Mean Absolute Percentage Error (MAPE) [29]. These KPIs serve as the comparison criteria for choosing the best model, and they are presented sequentially in Equations 4-7.
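In these equations, $k$ is the number of parameters estimated in the regression model, $n$ is the number of observations, $e$ is Euler's number (approximately 2.718), and the residual is the difference between the actual and the predicted value. $y_i$ is the actual value of the dependent variable at location $i$, while $\hat{y}_i$ is the predicted value of the dependent variable at location $i$.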
A model is better if it has a larger coefficient of determination ($R^2$), because this means that the predictor variables explain more of the variation in the response. AIC is one method that can be used to select the best regression model [30]; the best regression model is the one with the smallest AIC value [31]. Likewise, smaller MAPE, MAE, and MSE values indicate a better model.
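As an illustrative sketch of the error-based KPIs discussed above, the snippet below computes $R^2$, MAE, MSE, and MAPE from actual and predicted values using their usual definitions. AIC is left out on purpose, because its exact form depends on the paper's own equation. The function name `kpis` and the numbers are hypothetical.

```python
def kpis(actual, predicted):
    """Compute R^2, MAE, MSE, and MAPE (in percent) from paired values."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    mean_y = sum(actual) / n
    sse = sum(r * r for r in residuals)              # sum of squared errors
    sst = sum((a - mean_y) ** 2 for a in actual)     # total sum of squares
    return {
        "R2": 1.0 - sse / sst,
        "MAE": sum(abs(r) for r in residuals) / n,
        "MSE": sse / n,
        "MAPE": 100.0 * sum(abs(r / a) for r, a in zip(residuals, actual)) / n,
    }


# Made-up values for illustration only.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 380.0]
scores = kpis(actual, predicted)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

MAPE is undefined when an actual value is zero, so this sketch assumes strictly non-zero actual values, which holds for counts of poor people per district.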

Methods
This study starts by fitting a linear regression and then compares it with the GWR method. The first step is to collect data on the Number of Poor Population (NPP), the Human Development Index (HDI), and the Percentage of Total Population (PP) in Central Java Province. A descriptive analysis is then carried out, using tables or graphs, to give a general description of the dependent and independent variables. Spatial visualization is produced with the Quantum Geographic Information System (QGIS), which makes it easier to obtain an initial spatial picture of the data. QGIS is used because it is free, powerful GIS software whose functions are almost the same as those of paid GIS software [32]. R is used for the inferential analysis because of its convenience, power, and available resources [33].
The second step is a linear regression analysis to establish whether the independent variables significantly influence the dependent variable, followed by the assumption tests. A test of spatial heterogeneity is then carried out, because the GWR method is intended for data that exhibit spatial heterogeneity. If spatial heterogeneity is present, the next step is to determine the kernel weighting function and to find the optimum bandwidth using the cross-validation method.
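The cross-validation criterion commonly used for GWR bandwidth selection can be written as below. This is the usual leave-one-out form from the GWR literature, given here as an assumption because the study does not spell out its criterion; $\hat{y}_{\neq i}(h)$ denotes the fitted value at location $i$ obtained with bandwidth $h$ when observation $i$ is left out of the local fit.

```latex
\mathrm{CV}(h) = \sum_{i=1}^{n} \left( y_i - \hat{y}_{\neq i}(h) \right)^2
```

The optimum bandwidth is the value of $h$ (or, for an adaptive kernel, the number of nearest neighbours) that minimizes $\mathrm{CV}(h)$.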
The third step is to obtain the parameter estimates for the GWR model and then carry out the goodness of fit test, or model suitability test, to see whether the GWR model differs significantly from the linear regression model. If the GWR model with the chosen weighting function does not differ significantly from the classical regression model, the bandwidth search and parameter estimation are repeated with other weighting functions until the best weighting function is found.
The fourth step is to find the estimated parameters of the GWR model, compute the t-count of each parameter estimate, and determine the best model for each observation location by looking at the t-count. The last step is to calculate the AIC, $R^2$, MAPE, MSE, and SSE values of the linear regression and GWR models, compare them to identify the best model for the case study, and draw conclusions.

Results and Discussions
Descriptive Statistics
In 2020, the average Number of Poor Population (NPP) across Central Java was 113.74 thousand people, the average HDI was 72.51%, and the average PP was 2.86%. Visualizations of NPP, HDI, and PP are presented in Figures 1(a), 1(b), and 1(c). Figure 1 is a thematic map of NPP and the factors that influence it; the darker the color of an area, the higher the value of the variable. The area with the highest NPP in Central Java Province is Brebes Regency, with 308.8 thousand people, while the lowest NPP is in Magelang City, with 9.3 thousand people. The area with the lowest HDI is Brebes Regency, with an HDI of 66.11%, and the highest is Salatiga City, with 83.14%. Figure 1(c) shows that Salatiga City is the area with the lowest PP, at 0.33%, and Brebes Regency is the area with the highest PP, at 5.42%.

Analysis of Linear Regression and GWR
Linear regression analysis aims to determine the effect of the independent variables on the Number of Poor Population (NPP) without considering the spatial effect. With α = 0.05, the results of the linear regression are presented in Table 2. Based on Table 2, the F test gives F-count = 62.57 > F-table = 1.78, so the model is significant. The coefficient of determination $R^2$ is 79.64%, which means that the independent variables explain 79.64% of the variance of the dependent variable, while the remaining 20.36% is explained by variables outside the model. The analysis continues with the partial test, or t-test, to find out whether each independent variable ($X$) has a significant effect on the response variable ($Y$). In absolute value, the t-count of each of the two independent variables exceeds the t-table value, so both independent variables significantly affect the dependent variable. The results of the t-test are shown in Table 3.
Based on Table 4, the residuals are normally distributed and no autocorrelation occurs. However, the assumption of homoscedasticity is not fulfilled; that is, heteroscedasticity is present. The multicollinearity test shows no multicollinearity between the two variables. The Breusch-Pagan test gives p-value = 001889 < α = 0.05, from which it can be concluded that there is heterogeneity, so the analysis can be continued using the GWR method.
The first step in the GWR method is the selection of a weighting function. The spatial weighting for the GWR model in this study was determined by performing a goodness of fit test on all combinations of bandwidth type and kernel function. As explained earlier, the bandwidth types are adaptive and fixed, and the kernel functions are Bisquare, Gaussian, and Tricube. The weighting function chosen is one for which the results show a difference between linear regression and GWR, as evidenced by p-value < α. The goodness of fit test results for all weighting functions are presented in Table 5. Based on Table 5, two weighting functions produce GWR models that differ significantly from linear regression, namely Adaptive Kernel Bisquare and Adaptive Kernel Tricube. GWR modeling is therefore carried out using these two weighting functions. The bandwidth value of each weighting function is presented in Table 6, and the parameter estimates are summarized in Table 7.
To obtain the weights for the Adaptive Kernel Bisquare function, the bandwidth and the Euclidean distance from each location are substituted into the Adaptive Kernel Bisquare equation. Table 7 shows that the parameter estimate of the HDI variable has a minimum value of -9.38 and a maximum value of 3.78; that is, HDI affects NPP in Central Java Province with estimated coefficients ranging from -9.38 to 3.78. The 1st quartile of the estimated GWR parameter for the HDI predictor is -6.68, the 3rd quartile is -1.67, and the median is -4.01. The Global column in Table 7 gives the parameter estimate from the linear regression model, which is -4.79 for the HDI variable. The other predictor variable is read in the same way.
The goodness of fit test gives F-count = 4.22 > F-table = 2.35, so $H_0$ is rejected, which means that there is a significant difference between the linear regression and the GWR model. In other words, the GWR model with the Adaptive Kernel Bisquare function can be used. GWR modeling with the Adaptive Kernel Bisquare weighting function yields parameter estimates for each district/municipality in Central Java Province. A partial test on each variable, based on the t-count value of each variable at each location, is carried out to obtain the best model for each district/municipality. Table 8 shows the parameter estimates obtained at every location and the corresponding t-count values. A variable is significant in the model if |t-count| > t(0.025;31). The modeling yields 35 local models for each weighting function, shown in Table 9.
From the models obtained, there are three groups of significant variables across the observation locations. The first group consists of the districts/municipalities where only the HDI variable ($X_1$) significantly affects NPP. The second group consists of those where only the PP variable ($X_2$) significantly affects NPP, and the third group consists of those where both HDI ($X_1$) and PP ($X_2$) have significant effects. Table 10 lists which districts/municipalities belong to each group, and Figure 2 visualizes the distribution of the groups of variables that are significant for NPP under GWR with the Adaptive Kernel Bisquare function in Central Java Province. As an illustration based on Table 10 and Figure 2, the GWR model obtained for Semarang municipality is presented in Equation 9.
$$\hat{Y} = 649.6 - 9.33 X_1 + 46.26 X_2 \qquad (9)$$
The model above shows that, in Semarang municipality, if the HDI ($X_1$) increases by 1 unit, the NPP ($Y$) decreases by 9.33, and if PP ($X_2$) increases by 1 unit, the NPP increases by 46.26.
Table 7 also shows that, for the Adaptive Kernel Tricube function, the parameter estimate of the HDI variable has a minimum value of -9.39 and a maximum of 4.32; that is, HDI affects NPP in Central Java Province with estimated coefficients ranging from -9.39 to 4.32. The 1st quartile of the estimated GWR parameter for the HDI predictor is -6.86, the 3rd quartile is -1.78, and the median is -4.06. The goodness of fit test gives F-count = 3.85. With F-table = F(α, 32, 14.57) = 2.26, F-count exceeds F-table, so $H_0$ is rejected, which means that there is a significant difference between the linear regression and the GWR model with the Adaptive Kernel Tricube weighting function.
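As a worked illustration of how the Semarang coefficients in Equation 9 are read, the sketch below evaluates the fitted equation at a baseline and after a one-unit change in each predictor. The coefficients are those quoted in the text; the baseline input values (HDI of 80, PP of 4) are made up and serve only to show the arithmetic.

```python
# Coefficients of the Semarang local model as quoted in Equation 9.
INTERCEPT = 649.6
B_HDI = -9.33
B_PP = 46.26


def predict_npp_semarang(hdi, pp):
    """Evaluate the fitted local model: NPP = intercept + b1*HDI + b2*PP."""
    return INTERCEPT + B_HDI * hdi + B_PP * pp


# Hypothetical baseline inputs, not actual Semarang data.
base = predict_npp_semarang(hdi=80.0, pp=4.0)
plus_hdi = predict_npp_semarang(hdi=81.0, pp=4.0)
plus_pp = predict_npp_semarang(hdi=80.0, pp=5.0)

print(round(plus_hdi - base, 2))  # -9.33
print(round(plus_pp - base, 2))   # 46.26
```

The differences do not depend on the chosen baseline, which is what interpreting a coefficient as the effect of a one-unit change, holding the other predictor fixed, amounts to.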
In contrast to the Adaptive Kernel Bisquare function, the Adaptive Kernel Tricube function yields four groups of significant variables across the observation locations. In the first group, only the HDI variable ($X_1$) has a significant effect on NPP; in the second group, only the PP variable ($X_2$) has a significant effect on NPP. In the third group, both HDI ($X_1$) and PP ($X_2$) have a significant effect on NPP. In the fourth group, neither HDI ($X_1$) nor PP ($X_2$) has a significant effect on NPP ($Y$) in the area. The details of the four groups are presented in Table 11 and Figure 3.
The best model is selected using the KPI values, namely $R^2$, AIC, MAPE, MSE, and SSE, of the linear regression and GWR models, which are presented in Table 12. Based on Table 12, the Adaptive Kernel Bisquare GWR model has a coefficient of determination $R^2$ of 95.17%, which is greater than that of the linear regression model and of the Adaptive Kernel Tricube GWR model. Thus, judged by $R^2$, the GWR Adaptive Kernel Bisquare model is preferable to the linear regression model. Furthermore, the model with the smallest AIC, MAPE, MSE, and SSE values is also the GWR Adaptive Kernel Bisquare model, with an AIC of 392.43, an MSE of 197.08, an SSE of 6515.16, and a MAPE of 15.52%, which falls in the good category [35].
Based on this, the GWR model with the Adaptive Kernel Bisquare weighting function is the best model for the Number of Poor Population (NPP) in Central Java Province in 2020. The actual data and the predictions from the linear regression and GWR models are visualized in Figure 4. A prediction is good if it has a small error, that is, if it lies close to the original data. Visually, the best predictions are those whose points lie closest to the points representing the original data. Figure 4 shows that the green points, which represent GWR with Adaptive Kernel Bisquare, lie closer to the blue points, which represent the original data, than the other points do.

Conclusion
Based on the problems and objectives of this study, the area with the lowest NPP is Magelang municipality, with 9.3 thousand people, and the highest is Brebes Regency, with 308.8 thousand people. GWR modeling is suitable for this study because of the differences in the nature or character of each location. Based on the values of $R^2$, AIC, MAPE, MSE, and SSE, the GWR model with the Adaptive Kernel Bisquare weighting function is the best model for predicting poverty in Central Java Province when compared with the OLS model. As an example of the results, the model for Semarang municipality shows that if the HDI ($X_1$) increases by 1 unit, the NPP ($Y$) decreases by 9.33, and if PP ($X_2$) increases by 1 unit, the NPP increases by 46.26.