Regression Analysis Basics: ArcMap Documentation (Esri)

Leo Migdal

The Spatial Statistics toolbox provides effective tools for quantifying spatial patterns. Using the Hot Spot Analysis tool, for example, you can ask questions about where features or values cluster. Questions of this kind ask "where?" The next logical question involves "why?" Tools in the Modeling Spatial Relationships toolset help you answer this second set of questions. These tools include Ordinary Least Squares (OLS) regression and Geographically Weighted Regression (GWR). Regression analysis allows you to model, examine, and explore spatial relationships and can help explain the factors behind observed spatial patterns.

You may want to understand why people are persistently dying young in certain regions of the country or what factors contribute to higher than expected rates of diabetes. Beyond explanation, regression analysis can also be used for prediction. Modeling the factors that contribute to college graduation rates, for example, enables you to make predictions about upcoming workforce skills and resources. You might also use regression to predict rainfall or air quality in cases where interpolation is insufficient due to a scarcity of monitoring stations (for example, rain gauges are often lacking along mountain ridges...). OLS is the best known of all regression techniques. It is also the proper starting point for all spatial regression analyses.

OLS provides a global model of the variable or process you are trying to understand or predict (early death or rainfall, for example); it creates a single regression equation to represent that process. Geographically Weighted Regression (GWR) is one of several spatial regression techniques increasingly used in geography and other disciplines. GWR provides a local model of the variable or process you are trying to understand or predict by fitting a regression equation to every feature in the dataset. When used properly, these methods provide powerful and reliable statistics for examining and estimating linear relationships.

When you run the Exploratory Regression tool, the primary output is a report. The report can be seen in the geoprocessing messages window when you run the tool in the foreground, or it can be accessed from the Results window.
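The difference between a global OLS model and a local GWR-style model can be illustrated with a minimal pure-Python sketch. It is not Esri's implementation: the Gaussian distance kernel, the fixed `bandwidth`, the single explanatory variable, and all function names are assumptions chosen for illustration.

```python
import math

def weighted_fit(xs, ys, ws):
    """Weighted least-squares line y = intercept + slope * x."""
    sw = sum(ws)
    mean_x = sum(w * x for w, x in zip(ws, xs)) / sw
    mean_y = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def global_fit(xs, ys):
    """Global model: one equation for the whole dataset (equal weights)."""
    return weighted_fit(xs, ys, [1.0] * len(xs))

def local_fits(coords, xs, ys, bandwidth):
    """Local model: one equation per feature, weighting nearby features more."""
    fits = []
    for cx, cy in coords:
        # Gaussian kernel weight based on distance to the target feature (assumed kernel).
        ws = [math.exp(-0.5 * (math.hypot(cx - px, cy - py) / bandwidth) ** 2)
              for px, py in coords]
        fits.append(weighted_fit(xs, ys, ws))
    return fits

# Made-up data: the x-y relationship is weaker in the west than in the east.
coords = [(i, 0) for i in range(10)]
xs = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
ys = [1, 2, 3, 4, 5, 3, 6, 9, 12, 15]

print("global (intercept, slope):", global_fit(xs, ys))
for location, (intercept, slope) in zip(coords, local_fits(coords, xs, ys, 1.5)):
    print(location, round(slope, 2))
```

The global fit averages the two regimes into one slope, while the local slopes change from west to east, which is the kind of spatial variation a local model is meant to expose.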

Optionally, the Exploratory Regression tool also creates a table that can help you further investigate the models that have been tested. One purpose of the report is to help you determine whether the candidate explanatory variables you are considering yield any properly specified OLS models. In the event that there are no passing models (models that meet all of the criteria you specified when you launched the Exploratory Regression tool), however, the output will also show you which variables... Strategies for addressing problems associated with each of the diagnostics are given in the Regression Analysis Basics document (see Common regression problems, consequences, and solutions) and in What they don't tell you about regression... For more information about how to determine whether you have a properly specified OLS model, see Regression Analysis Basics and Interpreting OLS results. The Exploratory Regression report has five distinct sections.

Each section is described below. The first set of summaries in the output report is grouped by the number of explanatory variables in the models tested. If you specify 1 for the Minimum Number of Explanatory Variables parameter and 5 for the Maximum Number of Explanatory Variables parameter, you will have 5 summary sections. Each section lists the three models with the highest adjusted R² values and all passing models. Each summary section also includes the diagnostic values for each model listed: corrected Akaike Information Criterion (AICc), Jarque-Bera p-value (JB), Koenker’s studentized Breusch-Pagan p-value (K(BP)), the largest Variance Inflation Factor... These summaries give you an idea of how well your models are predicting (Adj R²) and whether any models pass all of the diagnostic criteria you specified.
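The grouping by number of explanatory variables can be sketched as follows. This is only an illustration of how candidate models multiply, not the tool's internals; the variable names are hypothetical, and `adjusted_r2` uses the standard textbook formula.

```python
from itertools import combinations

def candidate_models(variables, min_vars, max_vars):
    """Group every combination of candidate variables by model size."""
    return {k: list(combinations(variables, k))
            for k in range(min_vars, max_vars + 1)}

def adjusted_r2(r2, n_obs, n_vars):
    """Adjusted R-squared: penalizes R-squared for the number of explanatory variables."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_vars - 1)

# Hypothetical candidate explanatory variables.
variables = ["income", "education", "age", "density", "distance"]
groups = candidate_models(variables, 1, 5)

for size, models in groups.items():
    print(f"{size} variable(s): {len(models)} models")

# An R-squared of 0.80 from 50 observations and 3 explanatory variables.
print(round(adjusted_r2(0.80, 50, 3), 4))
```

With a minimum of 1 and a maximum of 5 there are five groups, one per summary section, and the number of models to test grows quickly as candidate variables are added.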

If you accepted all of the default Search Criteria (Minimum Acceptable Adj R Squared, Maximum Coefficient p-value Cutoff, Maximum VIF Value Cutoff, Minimum Acceptable Jarque Bera p-value, and Minimum Acceptable Spatial Autocorrelation p-value parameters),... If there aren’t any passing models, the rest of the output report still provides useful information about variable relationships and can help you decide how to move forward. The Exploratory Regression Global Summary section is an important place to start, especially if you haven't found any passing models, because it shows you why none of the models are passing. This section lists the five diagnostic tests and the percentage of models that passed each of those tests. If you don’t have any passing models, this summary will help you identify which diagnostic test is giving you trouble.

Regression analysis is a technique that calculates the estimated relationship between a dependent variable and one or more explanatory variables. With regression analysis, you can model the relationship between the chosen variables as well as predict values based on the model. Regression analysis uses a specified estimation method, a dependent variable, and one or more explanatory variables to create an equation that estimates values for the dependent variable. The regression model includes outputs, such as R² and p-values, to provide information on how well the model estimates the dependent variable. Charts, such as scatter plot matrices, histograms, and point charts, can also be used in regression analysis to analyze relationships and test assumptions.
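The idea behind the Global Summary, the percentage of tested models passing each of the five criteria, can be sketched as below. The diagnostic values are made up, the threshold values are illustrative placeholders rather than the tool's documented defaults, and the dictionary keys are hypothetical names.

```python
def check_model(model, criteria):
    """Return pass/fail for each of the five search criteria."""
    return {
        "adj_r2": model["adj_r2"] >= criteria["min_adj_r2"],
        "coef_p": model["max_coef_p"] <= criteria["max_coef_p"],
        "vif": model["max_vif"] <= criteria["max_vif"],
        "jb": model["jb_p"] >= criteria["min_jb_p"],
        "sa": model["sa_p"] >= criteria["min_sa_p"],
    }

def global_summary(models, criteria):
    """Percentage of models that pass each test."""
    counts = {}
    for model in models:
        for test, ok in check_model(model, criteria).items():
            counts[test] = counts.get(test, 0) + int(ok)
    return {test: 100.0 * n / len(models) for test, n in counts.items()}

def passing_models(models, criteria):
    """Models that meet all criteria at once."""
    return [m for m in models if all(check_model(m, criteria).values())]

# Illustrative thresholds (assumed values, not necessarily the tool's defaults).
criteria = {"min_adj_r2": 0.5, "max_coef_p": 0.05, "max_vif": 7.5,
            "min_jb_p": 0.1, "min_sa_p": 0.1}

# Made-up diagnostics for four hypothetical models.
models = [
    {"adj_r2": 0.62, "max_coef_p": 0.01, "max_vif": 2.1, "jb_p": 0.30, "sa_p": 0.40},
    {"adj_r2": 0.55, "max_coef_p": 0.20, "max_vif": 3.0, "jb_p": 0.02, "sa_p": 0.50},
    {"adj_r2": 0.30, "max_coef_p": 0.03, "max_vif": 9.0, "jb_p": 0.25, "sa_p": 0.01},
    {"adj_r2": 0.70, "max_coef_p": 0.04, "max_vif": 1.5, "jb_p": 0.05, "sa_p": 0.03},
]

print(global_summary(models, criteria))
print(len(passing_models(models, criteria)), "passing model(s)")
```

A low percentage for one test, such as the spatial autocorrelation check, points to the diagnostic that most models fail and therefore where to focus next.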
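As a minimal sketch of how a regression equation is estimated and evaluated, the following pure-Python example fits a line by ordinary least squares with one explanatory variable, computes R², and predicts a new value. The data are made up, and the function names are hypothetical helpers rather than part of any Esri product.

```python
def fit_ols(xs, ys):
    """Estimate intercept and slope of y = intercept + slope * x by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def predict(intercept, slope, x):
    """Estimate the dependent variable for a given explanatory value."""
    return intercept + slope * x

def r_squared(xs, ys, intercept, slope):
    """Share of the variation in y that the fitted equation explains."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - predict(intercept, slope, x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Made-up observations of one explanatory variable and one dependent variable.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

intercept, slope = fit_ols(xs, ys)
print("equation: y = %.2f + %.2f * x" % (intercept, slope))
print("R-squared: %.2f" % r_squared(xs, ys, intercept, slope))
print("prediction at x = 6: %.2f" % predict(intercept, slope, 6))
```

For these values the fitted equation is y = 2.20 + 0.60x with an R² of 0.60, meaning the line accounts for 60 percent of the variation in the dependent variable.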

Regression analysis may be the most commonly used statistic in the social sciences. Regression is used to evaluate relationships between two or more feature attributes. Identifying and measuring relationships allows you to better understand what's going on in a place, predict where something is likely to occur, or examine causes of why things occur where they do. Ordinary Least Squares (OLS) is the best known of the regression techniques. It is also a starting point for all spatial regression analyses. It provides a global model of the variable or process you are trying to understand or predict; it creates a single regression equation to represent that process.

There are a number of resources to help you learn more about both OLS regression and Geographically Weighted Regression. Start with Regression analysis basics. Next, work through the Regression Analysis tutorial. This topic will cover the results of your analysis to help you understand the output and diagnostics of OLS. To run the OLS tool, provide an Input Feature Class with a Unique ID Field, the Dependent Variable you want to model, explain, or predict, and a list of Explanatory Variables. You will also need to provide a path for the Output Feature Class and, optionally, paths for the Output Report File, Coefficient Output Table, and Diagnostic Output Table.
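A scripted run of the OLS tool might look like the sketch below. It assumes an ArcGIS installation that provides `arcpy` and assumes the tool is exposed as `arcpy.OrdinaryLeastSquares_stats` with parameters in the order listed above; all paths and field names are hypothetical, and the exact signature should be confirmed against the tool reference for your version.

```python
def build_ols_args(in_features, id_field, out_features, dependent, explanatory,
                   coef_table=None, diag_table=None, report_file=None):
    """Assemble the OLS tool arguments in the assumed parameter order."""
    return [
        in_features,             # Input Feature Class
        id_field,                # Unique ID Field
        out_features,            # Output Feature Class
        dependent,               # Dependent Variable
        ";".join(explanatory),   # Explanatory Variables as a semicolon-delimited list
        coef_table,              # optional Coefficient Output Table
        diag_table,              # optional Diagnostic Output Table
        report_file,             # optional Output Report File
    ]

def run_ols(args):
    """Run the tool; arcpy is imported here so the helper above works without ArcGIS."""
    import arcpy  # available only with an ArcGIS installation
    return arcpy.OrdinaryLeastSquares_stats(*args)

# Hypothetical dataset and field names.
args = build_ols_args(
    "tracts.shp", "TRACT_ID", "tracts_ols.shp",
    "DEATH_RATE", ["INCOME", "SMOKING", "UNINSURED"],
    report_file="ols_report.pdf",
)
print(args)
# run_ols(args)  # uncomment on a machine with ArcGIS installed
```

Keeping the argument assembly separate from the call makes it possible to review the inputs before the tool runs.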

Output generated from the OLS tool includes an output feature class symbolized using the OLS residuals, statistical results, and diagnostics in the Messages window as well as several optional outputs such as a PDF... Each of these outputs is described below as a series of checks when running OLS regression and interpreting OLS results. The tool parameters include the feature class containing the dependent and independent variables, the numeric field containing the observed values to be modeled, the type of data that will be modeled, the new feature class that will contain the dependent variable estimates and residuals, and a list of fields representing independent explanatory variables in the regression model.

Generalized Linear Regression creates a model of the variable or process you are trying to understand or predict that can be used to examine and quantify relationships among features. This tool is new in ArcGIS Pro 2.3 and includes the functionality of Ordinary Least Squares (OLS).

This tool includes the additional models of Count (Poisson) and Binary (Logistic), which allow the tool to be applied to a wider range of problems. To run the Generalized Linear Regression tool, provide Input Features with a field representing the Dependent Variable and one or more fields representing the Explanatory Variable(s) or, optionally, Distance Features. These fields must be numeric and have a range of values. Features that contain missing values in the dependent or explanatory variables will be excluded from the analysis; however, you can use the Fill Missing Values tool to complete the dataset before running the Generalized... Next, you must choose a Model Type based on the data you are analyzing.

It is important to use an appropriate model for your data. Descriptions of the model types and how to determine the appropriate one for your data are below. Generalized Linear Regression provides three types of regression models: Continuous, Binary and Count. These types of regressions are known in statistical literature as Gaussian, Logistic, and Poisson, respectively. The Model Type for your analysis should be chosen based on how your Dependent Variable was measured or summarized as well as the range of values it contains.
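The choice of Model Type can be approximated with a simple heuristic over the dependent variable's values, sketched below. This is an assumption-laden starting point, not the tool's logic: the function name is hypothetical, and whole-number measurements such as elevations in meters would be labeled Count even though Continuous is the appropriate choice.

```python
def suggest_model_type(values):
    """Suggest Continuous, Binary, or Count from the dependent variable's values."""
    vals = list(values)
    if not vals:
        raise ValueError("dependent variable has no values")
    # Only 0 and 1 present: presence/absence style data (Logistic).
    if all(v in (0, 1) for v in vals):
        return "Binary"
    # Non-negative whole numbers: counts of events (Poisson).
    if all(v >= 0 and float(v).is_integer() for v in vals):
        return "Count"
    # Anything else is treated as a continuous measurement (Gaussian).
    return "Continuous"

print(suggest_model_type([0, 1, 1, 0, 1]))       # presence/absence
print(suggest_model_type([0, 3, 12, 7, 1]))      # number of events per feature
print(suggest_model_type([1.5, -2.0, 3.25]))     # measured quantity
```

How the variable was measured or summarized should still drive the final decision, as described above.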
