ECONOMETRIC ANALYSIS OF FACTORS INFLUENCING THE DEVELOPMENT OF SMALL AND MEDIUM ENTERPRISES: THE CASE OF AZERBAIJAN

The development of small and medium businesses is the main driving force in the sustainable economic development of the country. The main reasons for this are that the development of small and medium enterprises is of exceptional importance in terms of diversifying the economy, increasing its competitiveness, ensuring employment. To achieve these goals, a Strategic Roadmap has been developed to ensure the competitiveness, inclusion, and sustainability of the economy in the Republic of Azerbaijan. Research shows that small and medium enterprises should be considered as a socio-economic system with the characteristics of an economic-cybernetic system. Therefore, a mathematical modeling mechanism of quantitative analysis should be used to establish optimal behavioral and development strategies for this economic-cybernetic system. We will use the econometric research mechanism of mathematical modeling, especially the correlation-regression analysis mechanism, to quantify the impact of environmental factors on small and medium enterprises. In the process of research, data characterizing the activities of small and medium enterprises in the country will be used as a statistical basis.


Introduction
Azerbaijan has begun the transition from an administrative-command economic system to a system of innovative economic relations based on an independent market economy. Activity in the free and independent economic sphere, which belongs to the free market economy, occurs only in the process of entrepreneurial activity. The emergence of various forms of property in a market economy strengthens people's sense of entrepreneurship, changes attitudes to property, and creates interest in increasing production. Entrepreneurship is the basis of this process in a society where market relations are developed and a market economy is formed as the driving force of the country. Entrepreneurship development is one of the important conditions for the development of the non-oil sector in Azerbaijan. The Strategic Roadmap sets out not only the goals and objectives until 2025, but also the sources and mechanisms for achieving these goals. These strategic goals include: 1. Further improvement of the business environment and legal framework to increase the impact of small and medium enterprises on GDP in the country by 2025 and beyond; 2. Ensuring the creation of a sustainable network of small and medium enterprises by ensuring their efficient and effective access to financing resources; 3. Internationalization of small and medium enterprises of the country and increase of access to foreign markets; 4. Increasing the supply of quality products and services on regional bases; 5. Promoting the role of innovation in increasing the competitiveness of small and medium enterprises.
Achieving these strategic goals is estimated to require an investment of about 700 million manats, which should provide 1.26 billion in value added and 34.2 thousand new jobs in the economic system under study.
Small and medium enterprises play an important role in ensuring economic growth and employment in many developed countries. According to official statistics released by the World Bank, employees of small and medium enterprises make up 60-70% of the employed population in these countries. These trends are also evident in developing countries, which are taking systematic measures to develop small and medium-sized enterprises in order to increase economic sustainability and competitiveness and to ensure economic activity. Thus, both developed and developing countries are trying to adapt their economies to economic and financial crises through small and medium enterprises, and significant progress has been made in this area. In the United States and in most EU countries, significant economic growth has been achieved through the development of small and medium enterprises. About 99% of enterprises in the European Union are small and medium enterprises, and they account for more than 60% of the employed population there.
Research shows that it is possible to increase the share of small and medium enterprises in the Azerbaijani economy, especially in macroeconomic indicators such as GDP, employment, and foreign exchange inflows. This means that small and medium enterprises will play an extremely important role in the future development of Azerbaijan. This goal is quite realistic and achievable because the country has great potential for the development of not only small and medium enterprises but also the economy as a whole.
These opportunities include improving the business environment in the country, expanding access to financial resources, providing access to domestic and foreign markets for small and medium enterprises, and training qualified personnel.
Small business entities were identified according to the criteria approved by the Cabinet of Ministers of the Republic of Azerbaijan on December 18, 2009 (No. 192); small and medium business entities according to the criteria approved on June 5, 2015 (No. 215); and micro, small and medium-sized business entities according to the criteria approved on December 21, 2018 (No. 556).
Analysis of statistical data on small and medium enterprises in our country shows dynamic growth in the number of such enterprises and in the number of their employees. In 1999 there were 19,063 small enterprises in the country; by 2019 this figure had increased more than 14 times and reached 271,304. During this period, the number of employees in these enterprises increased from 150,229 to 332,255 (an increase of 2.2 times). One of the important indicators characterizing economic entities is the volume of their output. In 1999, small and medium enterprises in the country produced products worth 1,380.9 million manats; in 2019 this figure was 14 times higher, at 19,386.7 million manats (see Table 1). The following graph, based on the data in Table 1, shows the growth dynamics of the total number of business entities (small and medium enterprises) in the country (see Graph 1).
Graph 1. Dynamics of growth in the number of small and medium enterprises in the country.
As can be seen from Graph 1, although the number of small and medium enterprises in the country was small in 2000-2002, there has been a significant annual increase since 2003. The purpose of econometric modeling in the form of correlation-regression analysis is to determine the relationship between the result indicator (Y) and the factors affecting it ($X_i$), that is, to form either a paired (double) regression model
$$Y = b_0 + b_1 X_1 + \varepsilon$$
or a multivariate regression model
$$Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_m X_m + \varepsilon,$$
and to determine its quantitative characteristics. Therefore, let us examine some important theoretical issues in the construction of linear regression models.
Suppose that the state of the economic process at moment t is expressed by $Y_t$. Let us call this quantity endogenous. The value of the endogenous indicator $Y_t$ is formed under the influence of various factors. These factors are conventionally divided into systematic factors $x_{1t}, x_{2t}, \dots, x_{mt}$ and random factors $\varepsilon_t$. Systematic factors are customarily called exogenous indicators.
Different methods are used to find point estimates of the parameters (coefficients) of the multiple regression model. The most commonly used are the method of least absolute deviations, the method of least squares, the method of moments, and the method of maximum likelihood. The choice of a specific method for estimating the parameters depends on the a priori information available about the model variables. The basic method for estimating the coefficients $b_i$ of linear regression models is the method of least squares, in which the sum of the squares of the deviations of the actual values of the endogenous variable Y from its calculated values is minimized, i.e.
$$\sum_{t} (Y_t - \hat{Y}_t)^2 \to \min.$$
For the estimates obtained by the least-squares method to have optimal properties, the following conditions, called the Gauss-Markov conditions, must be satisfied:
1. The regression model must be linear in the parameters.
2. There must be no systematic errors in the observations, i.e., the mathematical expectation of the random deviations must be zero for all observations.
3. Observations must be made with the same accuracy, i.e., the variance of the random deviations must be constant.
4. Observations must be made in such a way that the random deviations are not correlated with each other, i.e., the random deviations $\varepsilon_i$ and $\varepsilon_j$ do not depend on each other.
5. Random deviations must not depend on the exogenous variables.
6. There must be no significant linear relationship between the exogenous variables.
7. Random deviations must have a normal distribution.
If the Gauss-Markov conditions (1-7) are satisfied, the estimates of the parameters $b_i$ ($i = \overline{0, m}$) obtained by the least-squares method for the linear regression model will be efficient.
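The least-squares criterion described above can be illustrated with a minimal sketch. It is not the EViews procedure used in this study: the function name `ols_fit`, the synthetic data, and the coefficient values 1.0, 2.0, and -0.5 are assumptions chosen only for illustration.

```python
import numpy as np

def ols_fit(X, y):
    """Return least-squares coefficients (intercept first) for y = b0 + X b + eps."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])   # add the constant term b0
    b, *_ = np.linalg.lstsq(A, y, rcond=None)   # minimizes the sum of squared residuals
    return b

# Synthetic (hypothetical) data: two exogenous variables and a small random deviation
rng = np.random.default_rng(0)
n = 50
X = rng.normal(size=(n, 2))
eps = rng.normal(scale=0.1, size=n)
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + eps

b = ols_fit(X, y)
residuals = y - np.column_stack([np.ones(n), X]) @ b
print("estimated coefficients:", b)
print("sum of residuals:", residuals.sum())
```

With an intercept in the model, the fitted residuals sum to zero and are orthogonal to each regressor, which is the sample counterpart of conditions 2 and 5.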

Data analysis. Dependent and independent variables
As mentioned above, econometric analysis requires the indicators of the studied economic-cybernetic system to be identified as endogenous and exogenous variables. Therefore, let us identify the indicators of the studied system of small and medium enterprises reflected in Table 1 as follows:
• endogenous variable:
- Y: the creation of new jobs in the country;
• exogenous variables:
- X1: total number of business entities (small and medium enterprises);
- X2: number of employees of business entities (small and medium enterprises);
- X3: output of business entities (small and medium enterprises).
Thus, the subject of our econometric research is to establish relationship equations (paired and multivariate linear regression models) that characterize the impact of the number of small and medium enterprises, the number of their employees, and their output on the creation of new jobs in the country, and to determine the quantitative characteristics of these equations.
In the first stage of the econometric study, linear paired regression models were constructed in the EViews software package. They reflect the separate dependence of the endogenous variable Y (creation of new jobs in the country) on the exogenous variable X1 (number of small and medium enterprises), on the exogenous variable X2 (number of employees in small and medium enterprises), and on the exogenous variable X3 (output of small and medium enterprises).
In the second stage, linear multivariate regression models were constructed in the EViews software package, reflecting the combined dependence of the endogenous variable Y on the exogenous variables X1 and X2, on X1 and X3, and on X2 and X3.
In the third stage, a linear multivariate regression model was constructed that reflects the combined dependence of the endogenous variable Y on all indicators involved in the study, i.e., on the exogenous variables X1, X2, and X3.
Let us present the relationship equations (paired and multivariate regression models) and their quantitative characteristics for all three stages in the following table (Table 2). Let us evaluate the quality of the constructed regression models. The evaluation criteria used for this purpose include the coefficient of determination $R^2$, the Fisher F-statistic, the Durbin-Watson (DW) statistic, and others. The most widely used of these criteria is the coefficient of determination $R^2$.
Let us explain the essence of this coefficient. As is well known, the main purpose of regression analysis is to explain the behavior of the dependent variable Y. In any sample, the value of the dependent variable Y is relatively low in some observations and relatively high in others, and the researcher wants to know why. The scattering of the values of Y in a sample can be expressed as the total sum of squared deviations, $\sum (Y_i - \bar{Y})^2$. It is easy to see that this sum satisfies
$$\sum (Y_i - \bar{Y})^2 = \sum (\hat{Y}_i - \bar{Y})^2 + \sum e_i^2,$$
or, in short, TSS = ESS + RSS, where TSS is the total sum of squares of the deviations, $\sum (Y_i - \bar{Y})^2$; ESS is the explained sum of squares of the deviations, $\sum (\hat{Y}_i - \bar{Y})^2$; and RSS is the residual sum of squares of the deviations, $\sum e_i^2$. Then the ratio $\sum (\hat{Y}_i - \bar{Y})^2 / \sum (Y_i - \bar{Y})^2$ represents the part of the total sum of squares explained by the regression equation. This ratio is called the coefficient of determination and is denoted $R^2$:
$$R^2 = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2}.$$
The value of the determination coefficient lies between 0 and 1, that is, $0 \le R^2 \le 1$.
Note that the determination coefficient $R^2$ is equal to 1 if the condition $Y_i = \hat{Y}_i$ is satisfied for every observation. This means that all points of the sample lie on the regression line. For $R^2 = 0$, the part of the variation of the dependent variable Y explained by the regression must be zero, i.e., $\hat{Y}_i = \bar{Y}$ for every observation. This means that the fitted regression line is the horizontal line $Y = \bar{Y}$. In this case, the variation of the values of the regressor X has no effect on the value of the endogenous variable Y, and there is no linear correlation between them.
The closer the value of the determination coefficient $R^2$ is to 1, the better the regression $\hat{Y}$ explains the behavior of Y. That is why $R^2$ is regarded as a measure of how adequately the econometric model reflects the real conditions under study.
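The decomposition TSS = ESS + RSS and the resulting $R^2$ can be checked numerically. The following is a minimal sketch; the function name `r_squared_parts` and the five data points are hypothetical and are not taken from Table 1.

```python
import numpy as np

def r_squared_parts(X, y):
    """Fit y on X (with intercept) by least squares and return TSS, ESS, RSS, R^2."""
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    y_hat = A @ b
    tss = np.sum((y - y.mean()) ** 2)       # total sum of squares
    ess = np.sum((y_hat - y.mean()) ** 2)   # explained sum of squares
    rss = np.sum((y - y_hat) ** 2)          # residual sum of squares
    return tss, ess, rss, ess / tss

# Hypothetical sample that lies close to a straight line
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

tss, ess, rss, r2 = r_squared_parts(x, y)
print(f"TSS={tss:.4f}  ESS={ess:.4f}  RSS={rss:.4f}  R^2={r2:.4f}")
```

Because the points lie almost exactly on a line, the explained part dominates and $R^2$ is close to 1.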
Assessing the quality of the paired and multivariate regression models for small and medium enterprises by the value of the determination coefficient $R^2$ allows us to conclude that the quality of regression models 1-6 in Table 2 is very low. Management decisions based on these models would therefore not be optimal.
Note that our conclusion on the quality of regression models 1-6 is also confirmed by the Fisher statistics and the Durbin-Watson statistics. Let us focus on model (7), which has a somewhat higher quality than these models.
The following table shows the statistics for constructing this linear multivariate regression model in the EViews software package (see Table 3). The regression line reflecting the combined dependence of job creation on the regressors X1, X2 and X3,
$$\hat{Y} = -33.59 + 0.01 X_1 + 0.01 X_2 + 0.01 X_3 \qquad (7)$$
with P values (0.41), (0.03), (0.04) and (0.52) for the respective coefficients, can be considered a better and more realistic model than models 1-6. Table 3 shows the probabilities P for each coefficient of the model. If $P_i \le 0.01$, the hypothesis $H_0$ that the coefficient is equal to 0 is rejected at the 1% significance level. If $0.01 < P_i \le 0.05$, the hypothesis $H_0$ is not rejected at the 1% significance level but is rejected at the 5% level. If $P_i > 0.05$, the hypothesis $H_0$ is not rejected at the 5% significance level.
The P statistics of regression model (7) show that the coefficients of the exogenous variable X1 (number of small and medium enterprises) and of the exogenous variable X2 (number of employees in small and medium enterprises) are significant at the 5% level. The coefficient of the regressor X3 (output of small and medium enterprises) is insignificant at all of these levels.
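The decision rule for the P values can be written as a short sketch. The function name `significance` and the labels `b0`, `X1`, `X2`, `X3` are illustrative; the P values are those reported for model (7).

```python
def significance(p):
    """Classify the test of H0: coefficient = 0 from its P value."""
    if p <= 0.01:
        return "H0 rejected at the 1% level"
    if p <= 0.05:
        return "H0 rejected at the 5% level, not at the 1% level"
    return "H0 not rejected at the 5% level"

# P values reported for the coefficients of model (7)
p_values = {"b0": 0.41, "X1": 0.03, "X2": 0.04, "X3": 0.52}
decisions = {name: significance(p) for name, p in p_values.items()}

for name, decision in decisions.items():
    print(f"{name}: P = {p_values[name]:.2f} -> {decision}")
```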
For regression model (7), the determination coefficient is $R^2 = 0.55$. Therefore, the regressors X1, X2 and X3 included in the model explain 55% of the variance of the dependent variable Y, which reflects the creation of new jobs. On the other hand, Table 3 gives $F_{fact} = 6.59 > F_{crit} = 3.06$ for model (7), i.e., the actual value of the Fisher criterion for this model is greater than its theoretical value. Therefore, the multivariate regression model (7) can be considered statistically significant.
One of the factors that negatively affects the quality of regression models is multicollinearity. One of the conditions of the classical regression model is that the explanatory variables are not linearly dependent. Violation of this condition leads to multicollinearity. Multicollinearity in a model can occur for a variety of reasons. For example, several independent variables may have a common time trend. The characteristic features of multicollinearity can be grouped as follows:
• Minor changes in the initial data, for example an increase in the number of observations, lead to significant changes in the estimates of the model's coefficients;
• Although the model as a whole is significant, the standard errors of the estimates are large and their significance is low;
• The signs of the coefficient estimates contradict theory, or the estimates are implausibly large.
It is impossible to give unambiguous answers to questions about multicollinearity. Some econometricians even advise not to worry about it at all. In practice, different approaches are used to eliminate multicollinearity. Some researchers suggest removing from the study the independent variable that creates multicollinearity and is therefore considered "excess". However, additional difficulties may arise in this case. First, multicollinearity indicates an approximate linear dependence between regressors but does not determine which regressor is "excess". Second, excluding one or another independent variable from the study can significantly change the substantive meaning of the model. Finally, omitting a significant regressor, that is, an independent variable that actually affects the dependent variable Y, can bias the estimates obtained by the least squares method.
Various methods are used to detect whether regression models are "infected" with multicollinearity. One of these is the variance inflation factor (VIF) method. The statistics of testing the linear multivariate regression model (7), which characterizes the dependencies in the system of small and medium enterprises, for multicollinearity by the variance inflation factor method are shown in the table below (Table 4). As can be seen from the test statistics in Table 4, the $VIF_j$ values of all explanatory variables are less than 5, which is the threshold value of this criterion (VIF < 5). Thus, there is no multicollinearity in the multivariate regression model (7).
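The variance inflation factor of regressor j is $VIF_j = 1 / (1 - R_j^2)$, where $R_j^2$ is the determination coefficient of the auxiliary regression of that regressor on the remaining ones. The sketch below illustrates this calculation; the function name `vif` and the synthetic regressors are assumptions, not the data behind Table 4.

```python
import numpy as np

def vif(X):
    """Variance inflation factor 1 / (1 - R_j^2) for every column of X."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])          # auxiliary regression with intercept
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        r2 = 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

# Synthetic (hypothetical) regressors: x3 is almost a copy of x1
rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.1 * rng.normal(size=200)

vif_independent = vif(np.column_stack([x1, x2]))
vif_collinear = vif(np.column_stack([x1, x2, x3]))
print("independent regressors:", vif_independent)
print("with a near-duplicate :", vif_collinear)
```

In the second case the factors of x1 and x3 far exceed the threshold of 5, while the factor of x2 stays close to 1.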
As is known, the subject of linear regression analysis is the assessment of linear relationships between indicators of socio-economic systems. However, at the level of real socio-economic systems, such dependencies do not exist a priori. Therefore, in econometric studies, it is necessary to reduce the nonlinear dependencies between the indicators to linear ones. This approach is possible in many cases and allows management decisions to be made that are sufficiently adequate to the real situation.
The simplest example of linearizing a nonlinear regression model is taking logarithms, that is, passing from the base form of the regression equation
$$Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_m X_m + \varepsilon$$
to the logarithmic equation
$$\log(Y) = b_0 + b_1 \log(X_1) + b_2 \log(X_2) + \dots + b_m \log(X_m) + \varepsilon.$$
If the study is based on this approach, the statistics for the construction of the logarithmic counterpart of the linear model (7) are obtained as in the following table (Table 5). According to the statistics of Table 5, the logarithmic regression model of the studied economic system is obtained as follows:
$$\log(Y) = -7.20 + 0.55 \log(X_1) + 0.39 \log(X_2) + 0.07 \log(X_3) \qquad (8)$$
with P values (0.00), (0.00), (0.02) and (0.61) for the respective coefficients. The P statistics of the coefficients of model (8) show that the coefficients of the explanatory variables X1 (total number of business entities) and X2 (number of employees in the business entities) are significant at the 5% level, whereas the coefficient of X3 (output of business entities, million manats) is not significant at any of these levels.
A coefficient of determination of 0.64 indicates that the explanatory variables included in the model explain 64% of the variation in job creation. On the other hand, $F_{fact} = 9.66 > F_{crit} = 3.06$, i.e., the actual value of the Fisher criterion is greater than its theoretical value. Therefore, the multivariate regression model (8) is statistically significant.
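In a logarithmic model each slope coefficient is an elasticity: the percentage change in Y associated with a 1% change in the corresponding regressor. The sketch below is an illustration on synthetic power-law data; the elasticities 0.5 and 0.3 and the scale factor 3 are assumptions and are unrelated to model (8).

```python
import numpy as np

# Synthetic (hypothetical) power-law data: Y = 3 * X1^0.5 * X2^0.3 * noise
rng = np.random.default_rng(2)
n = 200
x1 = rng.uniform(1.0, 10.0, size=n)
x2 = rng.uniform(1.0, 10.0, size=n)
y = 3.0 * x1 ** 0.5 * x2 ** 0.3 * np.exp(rng.normal(scale=0.05, size=n))

# After taking logarithms the model is linear in the parameters
A = np.column_stack([np.ones(n), np.log(x1), np.log(x2)])
b, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)

print("intercept (estimate of log 3):", b[0])
print("elasticity with respect to X1:", b[1])
print("elasticity with respect to X2:", b[2])
```

Least squares applied to the logarithms recovers the exponents of the power law as ordinary regression coefficients.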
According to the cobweb (spider web) method, the selected explanatory variables affect the economic process with a certain delay, in the form
$$Y_t = b_0 + b_1 X_{1(t-1)} + \dots + b_m X_{m(t-1)} + \varepsilon_t,$$
and according to the method of inertia, the dependent variable itself enters with a certain delay, in the form
$$Y_t = b_0 + b_1 X_{1t} + \dots + b_m X_{mt} + b_{m+1} Y_{t-1} + \varepsilon_t.$$
Based on these methods, the logarithmic regression model (8) is respecified as follows.
According to the lagged-model statistics (Table 6), we obtain the following multivariate regression model:
$$\log(Y_t) = -7.67 + 0.41 \log(X_{1t}) + 0.35 \log(X_{2(t-1)}) + 0.40 \log(X_{3(t-2)}) \qquad (10)$$
with P values (0.02), (0.04), (0.02) and (0.01) for the respective coefficients. All coefficients of model (10) are significant at the 5% level. A coefficient of determination of 0.64 indicates that the explanatory variables included in the model explain 64% of the variance of job creation. On the other hand, $F_{fact} = 8.02 > F_{crit} = 2.52$, i.e., the actual value of the Fisher criterion is greater than its theoretical value. Therefore, we conclude that the lagged multivariate regression model (10) is significant and its quality is quite high. The economic interpretation of regression model (10) is as follows. A 1% increase in the number of small and medium enterprises leads to a 0.41% increase in new jobs. A 1% increase in the number of employees in small and medium enterprises leads to a 0.35% increase in the number of new jobs; in this case, the explanatory variable affects the dependent variable with a lag of one period. A 1% increase in output in small and medium enterprises affects the creation of new jobs with a lag of two periods and leads to an increase in their number by 0.40%.
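The lag structure of model (10) means that each row of the design matrix combines X1 from period t, X2 from period t-1 and X3 from period t-2, so the first two observations are lost. The sketch below shows one way to build such a matrix. The function name `lagged_design` and the random series are assumptions; the coefficients of model (10) are used only as data-generating values for a noise-free check, not as a re-estimation of the study.

```python
import numpy as np

def lagged_design(x1, x2, x3):
    """Rows [1, log X1_t, log X2_(t-1), log X3_(t-2)] for t = 2, ..., n-1."""
    x1, x2, x3 = (np.asarray(v, dtype=float) for v in (x1, x2, x3))
    return np.column_stack([
        np.ones(len(x1) - 2),
        np.log(x1[2:]),     # X1 in the current period t
        np.log(x2[1:-1]),   # X2 lagged by one period
        np.log(x3[:-2]),    # X3 lagged by two periods
    ])

# Synthetic (hypothetical) positive series of length 30
rng = np.random.default_rng(5)
x1, x2, x3 = (rng.uniform(1.0, 10.0, size=30) for _ in range(3))

A = lagged_design(x1, x2, x3)
beta = np.array([-7.67, 0.41, 0.35, 0.40])   # used here only to generate log Y
log_y = A @ beta                             # noise-free dependent variable

b_hat, *_ = np.linalg.lstsq(A, log_y, rcond=None)
print("rows available after lagging:", A.shape[0])
print("recovered coefficients:", b_hat)
```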
As mentioned above, certain conditions determined by the Gauss-Markov theorem must be met for the regression coefficients obtained with the least squares method to be efficient. If these conditions are not met, special adjustments must be used to obtain efficient estimates of the regression coefficients.
According to the Gauss-Markov condition of constant variance, the variance of the random term must be the same in all observations. This condition is called homoscedasticity. Failure of this condition is called heteroskedasticity. In the case of heteroskedasticity, the estimates of the coefficients $b_j$ obtained by the least-squares method are not efficient. In addition, the standard errors of the regression coefficients are biased downwards. This can lead to erroneous conclusions about the variation of the dependent variable.
Different tests are used to detect heteroskedasticity in regression models. One such test is the White test. White's test tests the null hypothesis that there is no heteroskedasticity in the multiple regression equation. The test statistic is calculated from an auxiliary regression in which the squared residuals are regressed on all possible cross-products of the regressors.
Assume that the regression
$$Y = b_0 + b_1 X_1 + b_2 X_2 + e$$
has been estimated. In this case, the White test statistic is based on the auxiliary regression
$$e^2 = \alpha_0 + \alpha_1 X_1 + \alpha_2 X_2 + \alpha_3 X_1^2 + \alpha_4 X_2^2 + \alpha_5 X_1 X_2 + v.$$
If the number of parameters in the initial model is large enough, the "no cross" option of the White test, which does not take the cross-products into account, can be used.
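The auxiliary regression above can be sketched as follows. The statistic computed here is the Obs*R-squared form, i.e., the number of observations times the $R^2$ of the auxiliary regression, which is compared with a chi-square critical value (about 11.07 at the 5% level for the 5 auxiliary regressors). The function name `white_lm` and the synthetic data are assumptions, and the sketch is not the EViews routine itself.

```python
import numpy as np

def white_lm(x1, x2, y):
    """Obs*R-squared statistic of White's test for a model with two regressors."""
    n = len(y)
    A = np.column_stack([np.ones(n), x1, x2])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    e2 = (y - A @ b) ** 2                                   # squared residuals
    Z = np.column_stack([np.ones(n), x1, x2, x1 ** 2, x2 ** 2, x1 * x2])
    g, *_ = np.linalg.lstsq(Z, e2, rcond=None)              # auxiliary regression
    u = e2 - Z @ g
    r2 = 1.0 - u @ u / np.sum((e2 - e2.mean()) ** 2)
    return n * r2

CHI2_CRIT_5PCT_DF5 = 11.07   # 5% critical value of chi-square with 5 degrees of freedom

# Synthetic (hypothetical) data with constant and with growing error variance
rng = np.random.default_rng(3)
n = 500
x1 = rng.uniform(1, 5, n)
x2 = rng.uniform(1, 5, n)
y_homo = 1 + 2 * x1 - x2 + rng.normal(size=n)
y_hetero = 1 + 2 * x1 - x2 + x1 ** 2 * rng.normal(size=n)

lm_homo = white_lm(x1, x2, y_homo)
lm_hetero = white_lm(x1, x2, y_hetero)
print("homoscedastic errors  :", lm_homo)
print("heteroskedastic errors:", lm_hetero, "critical value:", CHI2_CRIT_5PCT_DF5)
```

A statistic above the critical value leads to rejection of the hypothesis of homoscedasticity, as happens for the second data set.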
The following table shows a fragment of the statistics of testing the multivariate regression model (10) for heteroskedasticity by the White test (Table 7). According to the statistics shown in the table, the value Prob(F-statistic) = 0.2332 is greater than α = 0.05, which we take as the significance level. Hence, the hypothesis of homoscedasticity is accepted. In addition, the condition Prob(Obs*R-squared) = 0.2032 > α = 0.05 is satisfied according to the statistics of the White test shown in the table.
Thus, the conclusion that the multivariate regression model (10) is homoscedastic is also confirmed by the Prob(Obs*R-squared) value of that statistic.
One of the important conditions for constructing a good regression model by the least squares method is that the values of the random deviations $\varepsilon_i$ do not depend on the values of the deviations in all other observations. Compliance with this condition ensures that there is no correlation between any deviations, including neighboring ones.
Correlation between indicators ordered in time or space is called autocorrelation (serial correlation). In regression analysis, autocorrelation of residuals is usually encountered when building models based on time series.
The causes of autocorrelation include an incorrect choice of the analytical form of the dependence (specification errors), the inertia of changes in indicators, the delayed response of indicators to changes in the economic situation (the cobweb effect), and data smoothing.
The negative consequences of autocorrelation include the following:
• The estimates of the parameters $b_i$ of the regression model remain linear and unbiased but lose their efficiency;
• The estimates of the variance of the random deviations are biased, mostly downwards;
• The variances of the coefficient estimates are biased, mostly downwards. As a result, the values of the t-statistics are inflated and the forecast quality of the model deteriorates.
In regression models, the Breusch-Godfrey test is used to check for autocorrelation of residuals. The Breusch-Godfrey test is used when the number of observations is large and to detect high-order autocorrelation. The test is based on the following idea: if there is a connection between neighboring observations, it is natural to expect it to appear in the equation
$$e_t = \rho_1 e_{t-1} + \rho_2 e_{t-2} + \dots + \rho_k e_{t-k} + v_t, \quad t = \overline{1, n},$$
where $e_t$ is the random deviation of the initial regression model tested for autocorrelation, and the coefficients $\rho_k$ will then differ significantly from zero. Thus, the hypotheses are formulated as follows:
$H_0$: $\rho_1 = \rho_2 = \dots = \rho_k = 0$ (no autocorrelation);
$H_1$: at least one of $\rho_1, \rho_2, \dots, \rho_k$ differs from zero (autocorrelation of order k is present).
The following table shows a fragment of the statistics of testing the multivariate regression model (10) for the presence of autocorrelation using the Breusch-Godfrey test (Table 8). According to the statistics in the table, the value Prob(F-statistic) = 0.2668 is greater than the significance level α = 0.05 that we use: Prob(F-statistic) = 0.2668 > α = 0.05. Therefore, the hypothesis $H_1$ about the existence of autocorrelation in model (10) is not accepted (there is no autocorrelation). This result is also confirmed in the table by Prob(Obs*R-squared) = 0.1688 > α = 0.05.
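The idea of the Breusch-Godfrey test can be sketched as follows. Note one difference from the simplified equation given above: the standard form of the test, used in this sketch, regresses the residuals on the original regressors together with the k lagged residuals, and the statistic is the number of observations times the $R^2$ of that auxiliary regression, compared with a chi-square critical value with k degrees of freedom (about 5.99 at the 5% level for k = 2). The function name `breusch_godfrey_lm`, the zero filling of pre-sample residuals, and the synthetic data are assumptions.

```python
import numpy as np

def breusch_godfrey_lm(X, y, k):
    """Obs*R-squared statistic of the Breusch-Godfrey test with k lagged residuals."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    e = y - A @ b
    # Lagged residuals e_(t-1), ..., e_(t-k); missing pre-sample values are set to zero
    lags = np.column_stack(
        [np.concatenate([np.zeros(j), e[:-j]]) for j in range(1, k + 1)]
    )
    Z = np.column_stack([A, lags])
    g, *_ = np.linalg.lstsq(Z, e, rcond=None)
    u = e - Z @ g
    r2 = 1.0 - u @ u / np.sum((e - e.mean()) ** 2)
    return n * r2

CHI2_CRIT_5PCT_DF2 = 5.99   # 5% critical value of chi-square with 2 degrees of freedom

# Synthetic (hypothetical) data: independent errors versus AR(1) errors with rho = 0.8
rng = np.random.default_rng(4)
n = 300
x = rng.normal(size=n)
shocks = rng.normal(size=n)
ar = np.zeros(n)
for t in range(1, n):
    ar[t] = 0.8 * ar[t - 1] + shocks[t]

lm_iid = breusch_godfrey_lm(x, 1 + 2 * x + shocks, k=2)
lm_auto = breusch_godfrey_lm(x, 1 + 2 * x + ar, k=2)
print("independent errors   :", lm_iid)
print("autocorrelated errors:", lm_auto, "critical value:", CHI2_CRIT_5PCT_DF2)
```

A statistic above the critical value leads to rejection of the hypothesis of no autocorrelation, as happens for the series with AR(1) errors.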
Thus, the multivariate regression model (10), being free of multicollinearity, homoscedastic, and free of residual autocorrelation, adequately reflects the real situation and can be used as an effective mechanism for making and forecasting optimal management decisions at the system level of small and medium enterprises.

Conclusion
As is known, the subject of linear regression analysis is the assessment of linear relationships between indicators of socio-economic systems. However, at the level of real socio-economic systems, such dependencies do not exist a priori. Therefore, in econometric studies, it is necessary to reduce the nonlinear dependencies between the indicators to linear ones. This approach is possible in many cases and allows management decisions to be made that are sufficiently adequate to the real situation. The simplest example of linearizing a nonlinear regression model is taking logarithms, that is, passing from the base form of the regression equation
$$Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_m X_m + \varepsilon$$
to the logarithmic equation
$$\log(Y) = b_0 + b_1 \log(X_1) + b_2 \log(X_2) + \dots + b_m \log(X_m) + \varepsilon.$$