Australian Rainfall Dynamics
Two-part Zero-Inflated Gamma GLMM across 49 stations and 142,193 observations. Markovian persistence, thermodynamic gating, and spatial random effects validated against four independent metrics.
Whether it rained yesterday is the single strongest predictor of today's rainfall. Dry states are 85% self-reinforcing; wet states are transient at 47%. No meteorological predictor comes close.
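The persistence figures above are first-order Markov transition probabilities. Below is a minimal sketch of how they could be estimated from a single station's daily wet/dry sequence. The function name and toy data are hypothetical, and reading the reported numbers as P(dry today | dry yesterday) and P(wet today | wet yesterday) is an assumption.

```python
import numpy as np

def transition_matrix(wet):
    """Estimate a 2x2 first-order transition matrix from a 0/1 wet-day sequence.

    Rows index yesterday's state, columns index today's (0 = dry, 1 = wet).
    Hypothetical helper for illustration, not the project's code.
    """
    wet = np.asarray(wet, dtype=int)
    prev, curr = wet[:-1], wet[1:]
    counts = np.zeros((2, 2))
    # Count each observed (yesterday, today) pair.
    np.add.at(counts, (prev, curr), 1)
    row_totals = counts.sum(axis=1, keepdims=True)
    # Rows with no observations stay NaN instead of dividing by zero.
    return np.divide(counts, row_totals,
                     out=np.full_like(counts, np.nan),
                     where=row_totals > 0)

# Toy sequence (made up): 1 = rain recorded, 0 = dry.
toy = [0, 0, 0, 1, 1, 0, 0, 1, 0, 0]
P = transition_matrix(toy)
dry_persistence = P[0, 0]   # share of dry days followed by a dry day
wet_persistence = P[1, 1]   # share of wet days followed by a wet day
```

With several stations, the sequence would presumably need to be split by station and at gaps in the record, so that no transition is counted across a boundary.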
What governs daily rainfall across 49 Australian weather stations? 142,193 observations over 10 years, 64% structural zeros, and a positive tail with kurtosis of 181. Three simultaneous violations of OLS assumptions that cannot be remediated by transformation. A Gaussian baseline achieves an AUC of 0.61 and predicts negative rainfall. These are distributional failures, not tuning failures.
Two-part ZIG-GLMM: a logistic hurdle submodel estimates dry-day probability; a Gamma component estimates conditional rainfall intensity, with separate linear predictors for each process. Station-level random slopes (glmmTMB) pool information across locations while respecting spatial heterogeneity. All models are fitted by maximum likelihood (ML), with Rubin's Rules pooling across five multiply-imputed datasets (42 to 48% missingness in key predictors). Wind compass bearings are decomposed into orthogonal U/V components. A centred humidity × sunshine interaction captures the thermodynamic Rain Corner. A dispersion submodel addresses residual heteroscedasticity.
Seven nested models (M0 to M6); moisture and pressure dynamics alone yield ΔAIC of -35,938. Final model: AUC = 0.813, Brier Score = 0.165, Brier Skill Score = 0.282, MAE = 2.76 mm (all days) / 5.61 mm (rain-days only). Durbin-Watson approximately 2.0 across most stations. Zero-inflation calibration ratio = 1.00. Scope is explicit: the Gamma tail underestimates exceedances above the 95th percentile and extreme events require a GPD extension.
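For readers unfamiliar with the occurrence metrics, here is a minimal sketch of the Brier Score and Brier Skill Score. The function names and toy values are hypothetical. Using a climatological forecast that always predicts the observed wet-day rate as the reference is an assumption, since the summary does not say which reference was used.

```python
import numpy as np

def brier_score(y, p):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.mean((p - y) ** 2))

def brier_skill_score(y, p):
    """Skill relative to a constant forecast equal to the observed base rate.

    The climatological reference is an assumption for illustration.
    """
    y = np.asarray(y, dtype=float)
    reference = brier_score(y, np.full_like(y, y.mean()))
    return 1.0 - brier_score(y, p) / reference

# Toy outcomes and forecast probabilities (made up).
y_toy = [0, 0, 1, 1]
p_toy = [0.1, 0.2, 0.8, 0.9]
bs = brier_score(y_toy, p_toy)
bss = brier_skill_score(y_toy, p_toy)
```

Under that assumption, a wet-day rate of about 36% gives a reference score near 0.36 × 0.64 ≈ 0.23, and 1 − 0.165 / 0.23 ≈ 0.28, which is in line with the reported skill score.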