Basic Geographically Weighted Regression (GWR)
Mathematical Foundation
For a dataset of \(n\) samples and \(p\) independent variables, the basic GWR model at sample \(i\) is defined as:
\[ y_i = \beta_{i0} + \sum_{k=1}^{p} \beta_{ik} x_{ik} + \varepsilon_i \]
where:
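\(y_i\) is the value of the dependent variable at sample \(i\),
\(x_{ik}\) is the value of the \(k\)-th independent variable at sample \(i\),
\(\beta_{ik}\) is the coefficient of the \(k\)-th independent variable at sample \(i\),
\(\beta_{i0}\) is the intercept at sample \(i\),
\(\varepsilon_i \sim \mathcal{N}(0, \sigma^2)\) is the random error.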
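The locally weighted least-squares estimator of the coefficients at sample \(i\) is:
\[ \hat{\boldsymbol{\beta}}_i = (\mathbf{X}^\top \mathbf{W}_i \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{W}_i \mathbf{y} \]
where \(\mathbf{X}\) is the \(n \times (p+1)\) design matrix (including the intercept column), \(\mathbf{y}\) is the vector of dependent-variable values, and \(\mathbf{W}_i = \operatorname{diag}(w_{i1}, w_{i2}, \dots, w_{in})\) is the spatial weighting matrix. Each \(w_{ij}\) is computed by a kernel function \(k(d_{ij}; b)\) of the distance \(d_{ij}\) from sample \(i\) to sample \(j\) and the bandwidth \(b\).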
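The estimator above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the library's implementation: the Gaussian kernel, the Euclidean distance, and the helper names `gaussian_kernel` and `local_coefficients` are all hypothetical choices made for the example.

```python
import numpy as np

def gaussian_kernel(d, b):
    """Gaussian kernel (assumed here): weight decays smoothly with distance d."""
    return np.exp(-0.5 * (d / b) ** 2)

def local_coefficients(X, y, coords, i, b):
    """Estimate beta_i = (X^T W_i X)^{-1} X^T W_i y at sample i."""
    d = np.linalg.norm(coords - coords[i], axis=1)   # distances d_ij (Euclidean, assumed)
    w = gaussian_kernel(d, b)                        # diagonal of W_i
    XtW = X.T * w                                    # X^T W_i without building the n x n matrix
    return np.linalg.solve(XtW @ X, XtW @ y)

rng = np.random.default_rng(0)
n = 50
coords = rng.uniform(0, 10, size=(n, 2))
x1 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1])                # first column is the intercept
y = 2.0 + 3.0 * x1                                   # noise-free, spatially constant relationship

beta_0 = local_coefficients(X, y, coords, 0, b=2.0)
print(beta_0)                                        # approximately [2. 3.]
```

Because the synthetic relationship is exact and does not vary in space, every local fit recovers the same intercept and slope; with real data the coefficients would differ from sample to sample.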
Diagnostic Information
After fitting, the algorithm computes the following diagnostics:
| Metric | Meaning | Key |
|---|---|---|
| RSS | Residual sum of squares \(\sum_i (y_i - \hat{y}_i)^2\) | |
| AICc | Corrected Akaike information criterion (smaller is better) | `AICc` |
| ENP | Effective number of parameters | |
| EDF | Effective degrees of freedom | |
| R² | Coefficient of determination | `RSquare` |
| Adjusted R² | Adjusted coefficient of determination | |
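To make the metrics concrete, the sketch below computes them from the hat matrix \(\mathbf{S}\) (the matrix satisfying \(\hat{\mathbf{y}} = \mathbf{S}\mathbf{y}\)). It assumes one common set of conventions: ENP is the trace of \(\mathbf{S}\), EDF is \(n\) minus ENP, and AICc follows the form given by Fotheringham et al. (2002). The library may use different conventions, and the dictionary keys and function names here are illustrative only.

```python
import numpy as np

def gwr_hat_matrix(X, coords, b):
    """Row i of S is x_i^T (X^T W_i X)^{-1} X^T W_i, so that y_hat = S y."""
    n = X.shape[0]
    S = np.empty((n, n))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / b) ** 2)              # Gaussian kernel (assumed)
        XtW = X.T * w
        S[i] = X[i] @ np.linalg.solve(XtW @ X, XtW)
    return S

def diagnostics(y, S):
    n = len(y)
    resid = y - S @ y
    rss = float(resid @ resid)
    enp = float(np.trace(S))                         # assumed convention: ENP = tr(S)
    edf = n - enp                                    # assumed convention: EDF = n - ENP
    r2 = 1.0 - rss / float(np.sum((y - y.mean()) ** 2))
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (edf - 1)  # one common adjustment
    aicc = (n * np.log(rss / n) + n * np.log(2 * np.pi)
            + n * (n + enp) / (n - 2 - enp))
    return {"RSS": rss, "AICc": aicc, "ENP": enp, "EDF": edf,
            "R2": r2, "AdjR2": adj_r2}

rng = np.random.default_rng(1)
n = 100
coords = rng.uniform(0, 10, size=(n, 2))
x1 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1])
y = 2.0 + 3.0 * x1 + rng.normal(scale=0.5, size=n)

S = gwr_hat_matrix(X, coords, b=3.0)
diag = diagnostics(y, S)
for name, value in diag.items():
    print(f"{name}: {value:.3f}")
```

A smaller bandwidth makes each local fit more flexible, which raises ENP and lowers RSS; AICc balances the two, which is why it is a common criterion for comparing bandwidths.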
Key Parameters
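| Parameter | Description | Default |
|---|---|---|
| `weight` | A single bandwidth weight shared by all variables | Required |
| `distance` | The distance metric to use | |
| | Whether to include an intercept term | |
| `optimize_bw` | Auto-select the bandwidth using the given criterion, e.g. `GWRBasic.BandwidthSelectionCriterionType.CV` (passed to `fit()`) | |
| `optimize_var` | Auto-select variables via forward selection; the value is the AIC change threshold (passed to `fit()`) | |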
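The difference between a fixed and an adaptive bandwidth can be illustrated with a short sketch. It assumes the usual GWR convention that a fixed bandwidth is a distance while an adaptive bandwidth is a number of nearest neighbours; the bi-square kernel, the Euclidean metric, and the function names are assumptions for illustration and are not taken from `BandwidthWeight` or `CRSDistance`.

```python
import numpy as np

def bisquare(d, b):
    """Bi-square kernel: (1 - (d/b)^2)^2 inside the bandwidth, 0 outside."""
    u = d / b
    return np.where(u < 1.0, (1.0 - u ** 2) ** 2, 0.0)

def weights_for_sample(coords, i, bandwidth, adaptive):
    d = np.linalg.norm(coords - coords[i], axis=1)   # Euclidean distance (assumed metric)
    if adaptive:
        # Adaptive: bandwidth is a neighbour count (k >= 2 here); the local
        # kernel width is the distance to the k-th nearest sample, counting
        # sample i itself as the first.
        k = int(bandwidth)
        b = np.sort(d)[k - 1]
    else:
        # Fixed: bandwidth is a distance in the units of the coordinates.
        b = float(bandwidth)
    return bisquare(d, b)

rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(60, 2))

w_fixed = weights_for_sample(coords, 0, bandwidth=2.0, adaptive=False)
w_adaptive = weights_for_sample(coords, 0, bandwidth=10, adaptive=True)
print(np.count_nonzero(w_fixed), np.count_nonzero(w_adaptive))
```

With a fixed bandwidth the number of contributing samples depends on local point density, whereas an adaptive bandwidth keeps that number roughly constant and lets the kernel width vary instead.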
Code Examples
Basic Usage
```python
from pygwmodel import GWRBasic, BandwidthWeight, CRSDistance

algorithm = GWRBasic(
    data,
    depen_var="PURCHASE",
    indep_vars=["FLOORSZ", "UNEMPLOY", "PROF"],
    weight=BandwidthWeight(36.0, adaptive=True),
    distance=CRSDistance()
).fit()

# View diagnostic information
print(algorithm.diagnostic['RSquare'])  # 0.708
print(algorithm.diagnostic['AICc'])     # 2448.27

# Get the result layer (GeoDataFrame)
result = algorithm.result_layer
print(result.columns)  # Intercept, FLOORSZ, ..., Intercept_SE, ..., fitted
```
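Bandwidth Selection
```python
# y and x stand for the dependent variable name and the list of
# independent variable names, as in Basic Usage.
algorithm = GWRBasic(
    data, y, x,
    BandwidthWeight(adaptive=True),
    distance=CRSDistance()
).fit(optimize_bw=GWRBasic.BandwidthSelectionCriterionType.CV)

print(algorithm.weight.bandwidth)  # Optimised bandwidth: 67
```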
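As a rough illustration of what a cross-validation (CV) criterion measures, the sketch below scores candidate bandwidths by leave-one-out prediction error and keeps the best one. It is a simplified stand-in, not the library's search: it uses a fixed Gaussian bandwidth and a plain grid of candidates, and the names `cv_score`, `candidates`, and `best_b` are hypothetical.

```python
import numpy as np

def cv_score(X, y, coords, b):
    """Leave-one-out CV: predict each y_i from a local fit that excludes sample i."""
    score = 0.0
    for i in range(len(y)):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / b) ** 2)              # Gaussian kernel (assumed)
        w[i] = 0.0                                   # leave sample i out
        XtW = X.T * w
        beta_i = np.linalg.solve(XtW @ X, XtW @ y)
        score += (y[i] - X[i] @ beta_i) ** 2
    return score

rng = np.random.default_rng(3)
n = 80
coords = rng.uniform(0, 10, size=(n, 2))
x1 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1])
slope = 1.0 + 0.3 * coords[:, 0]                     # coefficient drifts from west to east
y = 2.0 + slope * x1 + rng.normal(scale=0.2, size=n)

candidates = np.array([1.0, 2.0, 4.0, 8.0, 16.0])    # grid search for simplicity
scores = np.array([cv_score(X, y, coords, b) for b in candidates])
best_b = candidates[np.argmin(scores)]
print(dict(zip(candidates.tolist(), scores.round(2).tolist())))
print("selected bandwidth:", best_b)
```

Because the slope varies across space in this synthetic data, very large bandwidths behave like a global regression and score poorly, while very small ones overfit local noise.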
Independent Variable Selection
```python
algorithm = GWRBasic(
    data, y, x,
    BandwidthWeight(36.0, adaptive=True),
    distance=CRSDistance()
).fit(optimize_var=3.0)

# Variable combinations and their AICc values
for var_subset, aicc in algorithm.indep_var_select_criterions:
    print(f"{var_subset}: {aicc:.2f}")

print(algorithm.indep_vars)  # ['FLOORSZ', 'PROF'] (optimal subset)
```
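The idea of threshold-based forward selection can be sketched as a greedy loop: at each step, add the variable that lowers AICc the most, and stop once the improvement falls below the threshold. This is a generic version written for illustration; the library's exact procedure may differ, and `gwr_aicc` and `forward_select` are hypothetical helpers that assume a fixed Gaussian bandwidth and ENP = tr(S).

```python
import numpy as np

def gwr_aicc(X, y, coords, b):
    """AICc of a GWR fit, with ENP taken as the trace of the hat matrix (assumed)."""
    n = len(y)
    fitted, trace_s = np.empty(n), 0.0
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / b) ** 2)              # Gaussian kernel (assumed)
        XtW = X.T * w
        s_i = X[i] @ np.linalg.solve(XtW @ X, XtW)   # row i of the hat matrix
        fitted[i] = s_i @ y
        trace_s += s_i[i]
    rss = np.sum((y - fitted) ** 2)
    return (n * np.log(rss / n) + n * np.log(2 * np.pi)
            + n * (n + trace_s) / (n - 2 - trace_s))

def forward_select(columns, y, coords, b, threshold):
    n = len(y)
    selected, remaining, history = [], list(columns), []
    best = gwr_aicc(np.ones((n, 1)), y, coords, b)   # intercept-only baseline
    while remaining:
        trials = []
        for name in remaining:
            cols = [np.ones(n)] + [columns[v] for v in selected + [name]]
            trials.append((gwr_aicc(np.column_stack(cols), y, coords, b), name))
        aicc, name = min(trials)                     # best candidate at this step
        history.append((selected + [name], aicc))
        if best - aicc < threshold:                  # improvement too small: stop
            break
        selected.append(name)
        remaining.remove(name)
        best = aicc
    return selected, history

rng = np.random.default_rng(2)
n = 100
coords = rng.uniform(0, 10, size=(n, 2))
columns = {name: rng.normal(size=n) for name in ("a", "b", "noise")}
y = 1.0 + 2.0 * columns["a"] - 1.5 * columns["b"] + rng.normal(scale=0.3, size=n)

selected, history = forward_select(columns, y, coords, b=5.0, threshold=3.0)
for subset, aicc in history:
    print(f"{subset}: {aicc:.2f}")
print(selected)
```

A larger threshold makes the procedure more conservative, since each additional variable must buy a bigger drop in AICc to be kept.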
Prediction
```python
prediction = algorithm.predict(new_data)
print(prediction.columns)  # Intercept, FLOORSZ, ..., y_hat, residual
```
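Conceptually, predicting at a new location means fitting a local model centred on that location, using the training samples weighted by their distance to it, and then applying the local coefficients to the new covariates. The sketch below shows this under the same assumptions as before (fixed Gaussian bandwidth, Euclidean distance); `gwr_predict` is a hypothetical helper, not the library's `predict`.

```python
import numpy as np

def gwr_predict(X_train, y_train, coords_train, X_new, coords_new, b):
    """Local coefficients and predictions at new locations."""
    betas = np.empty((len(X_new), X_train.shape[1]))
    preds = np.empty(len(X_new))
    for j, (x_j, c_j) in enumerate(zip(X_new, coords_new)):
        d = np.linalg.norm(coords_train - c_j, axis=1)   # distance to each training sample
        w = np.exp(-0.5 * (d / b) ** 2)                  # Gaussian kernel (assumed)
        XtW = X_train.T * w
        betas[j] = np.linalg.solve(XtW @ X_train, XtW @ y_train)
        preds[j] = x_j @ betas[j]                        # y_hat at the new location
    return betas, preds

rng = np.random.default_rng(4)
n = 60
coords_train = rng.uniform(0, 10, size=(n, 2))
x1 = rng.normal(size=n)
X_train = np.column_stack([np.ones(n), x1])
y_train = 2.0 + 3.0 * x1                                 # noise-free for easy checking

coords_new = rng.uniform(0, 10, size=(5, 2))
X_new = np.column_stack([np.ones(5), rng.normal(size=5)])

betas, preds = gwr_predict(X_train, y_train, coords_train, X_new, coords_new, b=2.0)
print(preds)
```

When observed values of the dependent variable are available for the new samples, residuals follow as the difference between those values and the predictions.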
References
Brunsdon, C., Fotheringham, A. S., & Charlton, M. E. (1996). Geographically weighted regression: a method for exploring spatial nonstationarity. Geographical Analysis, 28(4), 281-298.
Fotheringham, A. S., Brunsdon, C., & Charlton, M. (2002). Geographically weighted regression: the analysis of spatially varying relationships. John Wiley & Sons.