Chapter Context. This chapter constructs the statistical model incrementally, beginning from a null baseline and introducing successive layers of physical complexity. Each model corresponds to a coherent hypothesis about the mechanisms driving Australian rainfall. The Zero-Inflated Gamma (ZIG) model family was selected in direct response to the distributional findings from the EDA: the 64% zero-inflation rate and extreme positive skew of the rainfall distribution make a single-component Gaussian model fundamentally inappropriate.


6.1 Model Architecture

A Zero-Inflated Gamma model is a two-component mixture. The first component, the zero-inflation submodel, is a logistic regression that estimates the probability of a structurally zero outcome: a dry day on which no precipitation occurs. The second component, the conditional intensity submodel, is a Gamma regression with log link that models the expected rainfall given that rain does fall. Together they address the two questions identified in Chapter 4: whether rain will occur, and if so, how much.

The mathematical specification is given on the project index. The two linear predictors for occurrence and intensity can include different sets of covariates, allowing the drivers of dry-day probability to differ from the drivers of rainfall volume; this separability is the structural distinction between the ZIG framework and a Tweedie model. All models are fitted using glmmTMB. Progressive model construction provides a transparent audit trail connecting each modelling decision to its empirical motivation and enables incremental comparison to isolate the contribution of each newly added component.
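
As a reading aid, the two-component structure described in this section can be written out as follows. This is a sketch in generic notation rather than a reproduction of the specification on the project index: the symbols \(\pi_i\) (dry-day probability), \(\mu_i\) (conditional mean), \(\boldsymbol{\gamma}\), \(\boldsymbol{\beta}\), and the covariate vectors \(\mathbf{z}_i\), \(\mathbf{x}_i\) are labels assumed here, and \(\phi_i\) is taken to be the Gamma shape parameter, matching its use in Section 6.8.1.

\[
f(y_i) =
\begin{cases}
\pi_i, & y_i = 0,\\[4pt]
(1-\pi_i)\,\dfrac{(\phi_i/\mu_i)^{\phi_i}}{\Gamma(\phi_i)}\, y_i^{\phi_i-1}\, e^{-\phi_i y_i/\mu_i}, & y_i > 0,
\end{cases}
\qquad
\operatorname{logit}(\pi_i) = \mathbf{z}_i^{\top}\boldsymbol{\gamma},
\quad
\log(\mu_i) = \mathbf{x}_i^{\top}\boldsymbol{\beta}.
\]

Under this parameterisation the unconditional mean is \((1-\pi_i)\,\mu_i\) and the conditional variance is \(\mu_i^2/\phi_i\).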


6.2 Model 0: Null Baseline

m0_null <- fit_and_pool(
  cond_formula = rainfall ~ 1,
  zi_formula = ~1,
  datasets = engineered_list
)
saveRDS(m0_null, here::here("models", "m0_null.rds"))

Table 6.1: M0: Null Baseline. Pooled via Rubin’s rules.

| Term | Estimate | exp(β) | 95% CI | SE | t | df | p |
|---|---|---|---|---|---|---|---|
| Conditional (log link) | | | | | | | |
| Intercept | 1.883 | 6.570 | [NaN, NaN] | 0.006 | 310.01 | NaN | NA |
| Zero-inflation (logit link) | | | | | | | |
| Intercept | 0.577*** | 1.780 | [0.566, 0.588] | 0.006 | 104.27 | 141851.0 | <0.001 |

Note: † p<0.1 * p<0.05 ** p<0.01 *** p<0.001
AIC: 461669.7 BIC: 461699.3 log-Lik: -230831.8 (averaged across imputed datasets)

The intercept-only model verifies that the ZIG structure correctly recovers the fundamental statistical properties of the dataset (Table 6.1). The zero-inflation intercept \(\hat{\beta}_{zi} = 0.577\) (SE = 0.006, \(p < 0.001\)) back-transforms to an implied dry-day probability of:

\[ \hat{P}(\text{Dry}) = \frac{e^{0.577}}{1 + e^{0.577}} \approx 64.03\% \]

This matches the empirically observed zero-inflation rate of 64.05% to within rounding error, confirming that the hurdle mechanism is correctly calibrated. The conditional intensity intercept \(\hat{\beta}_{cond} = 1.883\) (SE = 0.006) recovers a mean rainy-day intensity of \(e^{1.883} \approx 6.57\) mm, consistent with the non-zero conditional mean established in Section 4.2.
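
The back-transformations for Model 0 can be reproduced in a few lines of base R. This is a minimal check using the rounded coefficients printed in Table 6.1, so the last digit may differ slightly from values computed on the unrounded estimates; the variable names are illustrative only.

```r
# Rounded pooled intercepts copied from Table 6.1
beta_zi   <- 0.577  # zero-inflation intercept (logit scale)
beta_cond <- 1.883  # conditional intensity intercept (log scale)

# Inverse logit gives the implied dry-day probability
p_dry <- plogis(beta_zi)

# Inverse log link gives the mean rainfall on rainy days (mm)
mean_wet <- exp(beta_cond)

# Unconditional mean implied by the two components together
mean_all <- (1 - p_dry) * mean_wet

round(c(p_dry = p_dry, mean_wet = mean_wet, mean_all = mean_all), 3)
```

The last quantity combines both submodels: the expected rainfall on an arbitrary day is the rainy-day mean scaled by the probability that the day is wet.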


6.3 Model 1: Moisture and Pressure Dynamics

m1_moisture <- fit_and_pool(
  cond_formula = m1_cond,
  zi_formula = zi_m1,
  datasets = engineered_list
)
saveRDS(m1_moisture, here::here("models", "m1_moisture.rds"))

Table 6.2: Model 1: Moisture and Pressure Dynamics. Pooled via Rubin’s rules.

| Term | Estimate | exp(β) | 95% CI | SE | t | df | p |
|---|---|---|---|---|---|---|---|
| Conditional (log link) | | | | | | | |
| Intercept | 1.380*** | 3.974 | [1.350, 1.409] | 0.014 | 98.10 | 17.5 | <0.001 |
| Humidity3pm | 0.403*** | 1.496 | [0.377, 0.428] | 0.012 | 32.91 | 18.0 | <0.001 |
| Dewpoint 9am | 0.300*** | 1.350 | [0.261, 0.339] | 0.018 | 16.86 | 11.0 | <0.001 |
| Dewpoint Change | -0.155*** | 0.857 | [-0.216, -0.093] | 0.028 | -5.58 | 10.1 | <0.001 |
| Pressure Change | 0.116** | 1.123 | [0.041, 0.191] | 0.033 | 3.48 | 9.4 | 0.007 |
| Zero-inflation (logit link) | | | | | | | |
| Intercept | 0.707*** | 2.029 | [0.694, 0.721] | 0.007 | 103.24 | 395.9 | <0.001 |
| Humidity3pm | -1.038*** | 0.354 | [-1.065, -1.012] | 0.013 | -82.25 | 20.4 | <0.001 |
| Dewpoint 9am | 0.002 | 1.002 | [-0.014, 0.017] | 0.008 | 0.20 | 59.0 | 0.840 |

Note: † p<0.1 * p<0.05 ** p<0.01 *** p<0.001
AIC: 425731.4 BIC: 425799.4 log-Lik: -212856.7 (averaged across imputed datasets)

The correlation analysis in Section 4.3 identified afternoon relative humidity and morning dewpoint temperature as among the strongest individual predictors of rainfall. Model 1 formalises these findings by introducing the four most physically direct atmospheric moisture and pressure drivers into the conditional intensity submodel, and a subset into the zero-inflation submodel (Table 6.2). All four conditional predictors are significant: humidity3pm, dewpoint_9am, and dewpoint change at \(p < 0.001\), and pressure change at \(p = 0.007\).

The positive coefficients for humidity3pm (\(\hat{\beta} = 0.403\), \(e^{0.403} \approx 1.50\)) and dewpoint_9am (\(\hat{\beta} = 0.300\), \(e^{0.300} \approx 1.35\)) establish that both the relative saturation of the afternoon boundary layer and the absolute moisture content of the morning atmosphere independently amplify rainfall intensity. These two predictors operate at different timescales: afternoon humidity reflects the near-instantaneous condition of the lower troposphere as convective processes become active, while morning dewpoint captures the ambient moisture available before any daytime heating has taken effect.

The dewpoint change coefficient (\(\hat{\beta} = -0.155\), \(p < 0.001\)) is negative: a diurnal drop in dewpoint from morning to afternoon, consistent with dry-air entrainment under subsiding high-pressure systems, is associated with reduced rainfall intensity. A rise in pressure over the observation window (\(\hat{\beta} = 0.116\), \(p = 0.007\)) is positively associated with intensity, a result that likely reflects synoptic-scale convergence patterns preceding frontal rainfall.

The zero-inflation submodel reveals a meaningful asymmetry between the two moisture variables. The humidity3pm coefficient of \(-1.038\) (\(p < 0.001\)) is strongly negative: since the outcome modelled is the dry-day indicator, higher afternoon humidity substantially reduces the probability of a dry day. dewpoint_9am, by contrast, is not significant in the zero-inflation component (\(\hat{\beta} = 0.002\), \(p = 0.840\)). This dissociation suggests that relative saturation of the afternoon atmosphere determines whether the thermodynamic threshold for precipitation is crossed, while the absolute morning moisture content shapes intensity once that threshold has been exceeded.


6.4 Model 2: Seasonality and Day-to-Day Persistence

m2_temporal <- fit_and_pool(
  cond_formula = m2_cond,
  zi_formula = zi_m2,
  datasets = engineered_list
)
saveRDS(m2_temporal, here::here("models", "m2_temporal.rds"))

Table 6.3: Model 2: Seasonality and Persistence Effects. Pooled via Rubin’s rules.

| Term | Estimate | exp(β) | 95% CI | SE | t | df | p |
|---|---|---|---|---|---|---|---|
| Conditional (log link) | | | | | | | |
| Intercept | 1.367*** | 3.925 | [1.337, 1.398] | 0.015 | 94.08 | 16.8 | <0.001 |
| Humidity3pm | 0.432*** | 1.540 | [0.395, 0.468] | 0.017 | 25.70 | 13.3 | <0.001 |
| Dewpoint 9am | 0.260*** | 1.297 | [0.197, 0.323] | 0.029 | 9.09 | 10.3 | <0.001 |
| Dewpoint Change | -0.176*** | 0.839 | [-0.248, -0.104] | 0.032 | -5.44 | 9.9 | <0.001 |
| Pressure Change | 0.113** | 1.120 | [0.041, 0.185] | 0.032 | 3.54 | 9.4 | 0.006 |
| Day Cos | 0.083*** | 1.087 | [0.043, 0.123] | 0.019 | 4.48 | 12.5 | <0.001 |
| Day Sin | -0.017† | 0.983 | [-0.036, 0.001] | 0.009 | -1.91 | 33.6 | 0.065 |
| Zero-inflation (logit link) | | | | | | | |
| Intercept | 1.075*** | 2.929 | [1.058, 1.091] | 0.009 | 125.46 | 207.0 | <0.001 |
| Humidity3pm | -0.915*** | 0.401 | [-0.952, -0.878] | 0.017 | -52.96 | 13.9 | <0.001 |
| Dewpoint 9am | -0.022† | 0.978 | [-0.049, 0.005] | 0.013 | -1.75 | 16.8 | 0.099 |
| Rain Yesterday (Yes) | -1.456*** | 0.233 | [-1.489, -1.422] | 0.017 | -85.41 | 164.8 | <0.001 |
| Cloud Development | 0.119*** | 1.127 | [0.095, 0.144] | 0.012 | 10.12 | 19.1 | <0.001 |
| Pressure Change | -0.283*** | 0.753 | [-0.399, -0.167] | 0.051 | -5.51 | 9.3 | <0.001 |

Note: † p<0.1 * p<0.05 ** p<0.01 *** p<0.001
AIC: 412920.9 BIC: 413026.8 log-Lik: -206446.5 (averaged across imputed datasets)

Model 2 incorporates the temporal structure identified in Chapter 4 (Table 6.3). The Markov Chain analysis in Section 4.4.2 demonstrated that the previous day’s rain state carries a moderate but practically significant effect (Cramér’s \(V \approx 0.31\)), and the seasonal decomposition confirmed that both frequency and intensity follow a smooth annual cycle. Circular sine and cosine encodings of the day-of-year enter the conditional intensity submodel, capturing this periodicity without imposing a discontinuity at the calendar year boundary. The binary indicator rain_yesterday, the cloud_development index, and pressure_change are added to the zero-inflation component to account for daily persistence and synoptic-scale dynamics.

The cosine term (\(\hat{\beta} = 0.083\), \(p < 0.001\)) captures the primary annual amplitude of the seasonal intensity cycle. The sine term (\(\hat{\beta} = -0.017\), \(p = 0.065\)) adjusts its phase but does not reach conventional significance, indicating its contribution is marginal relative to the cosine component.
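
The two harmonic coefficients can be combined into a single amplitude and peak date, which makes the seasonal effect easier to read. The sketch below assumes the encoding is the sine and cosine of \(2\pi \cdot \text{day} / 365.25\) with day 1 on 1 January (the exact period and day origin used in the feature engineering are not stated in this chapter), and it uses the rounded Model 2 estimates from Table 6.3. The function name `encode_day` is illustrative.

```r
# Assumed circular encoding of day-of-year (period and origin are assumptions)
encode_day <- function(doy, period = 365.25) {
  angle <- 2 * pi * doy / period
  data.frame(day_sin = sin(angle), day_cos = cos(angle))
}

# Rounded Model 2 conditional coefficients from Table 6.3
b_cos <- 0.083
b_sin <- -0.017

# Identity: b_cos*cos(a) + b_sin*sin(a) = amplitude * cos(a - phase)
amplitude <- sqrt(b_cos^2 + b_sin^2)
phase     <- atan2(b_sin, b_cos)                   # radians
peak_doy  <- (phase / (2 * pi) * 365.25) %% 365.25 # day on which the term peaks

# Seasonal contribution to log intensity over one year
enc  <- encode_day(1:365)
seas <- b_cos * enc$day_cos + b_sin * enc$day_sin

c(amplitude = amplitude, peak_doy = peak_doy, max_ratio = exp(max(seas)))
```

Because the sine coefficient is small relative to the cosine coefficient, the combined amplitude is almost identical to the cosine coefficient alone and the peak shifts only slightly from the cosine maximum.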

The persistence coefficient in the zero-inflation submodel is the dominant result. With \(\hat{\beta}_{zi}(\text{rain\_yesterday}) = -1.456\) (SE = 0.017, \(p < 0.001\)), it is the strongest single predictor in the hurdle component across the entire model sequence. Exponentiating gives \(e^{-1.456} \approx 0.233\): having rained the previous day reduces the odds of today being classified as dry by approximately 76.7%. This directly quantifies the asymmetric persistence documented in the Markov transition matrix of Section 4.4.2, wherein wet states are substantially more self-sustaining than dry states. Physically, this reflects the tendency for synoptic weather systems (fronts, troughs, and tropical lows) to persist over multiple days.
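
To translate the odds ratio into probabilities, one can evaluate the zero-inflation predictor with and without rain on the previous day. The sketch assumes all other zero-inflation covariates sit at zero (which would correspond to their means if the predictors were centred, an assumption not confirmed in this chapter) and uses rounded Table 6.3 values; the variable names are illustrative.

```r
# Rounded Model 2 zero-inflation coefficients from Table 6.3
b0          <- 1.075   # intercept (log-odds of a dry day)
b_rain_yday <- -1.456  # rain_yesterday = Yes

odds_ratio <- exp(b_rain_yday)        # multiplicative change in dry-day odds
pct_drop   <- 100 * (1 - odds_ratio)  # percentage reduction in odds

# Implied dry-day probabilities, other covariates held at 0 (assumption)
p_dry_after_dry <- plogis(b0)
p_dry_after_wet <- plogis(b0 + b_rain_yday)

round(c(odds_ratio = odds_ratio, pct_drop = pct_drop,
        p_dry_after_dry = p_dry_after_dry,
        p_dry_after_wet = p_dry_after_wet), 3)
```

A 76.7% reduction in odds is not a 76.7% reduction in probability: the change on the probability scale depends on the baseline at which it is evaluated.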

The cloud_development term in the zero-inflation submodel (\(\hat{\beta} = 0.119\), \(p < 0.001\)) is positive, implying that greater cloud development is associated with a higher probability of a dry day. In the presence of the strong humidity predictors already in the model, incremental cloud growth may serve as a proxy for convective inhibition under stable, non-precipitating high-pressure systems rather than precipitation potential.


6.5 Model 3: Accumulated Weather History

m3_history <- fit_and_pool(
  cond_formula = m3_cond,
  zi_formula = stable_zi,
  datasets = engineered_list
)
saveRDS(m3_history, here::here("models", "m3_history.rds"))

Table 6.4: Model 3: Accumulated Weather History. Pooled via Rubin’s rules.

| Term | Estimate | exp(β) | 95% CI | SE | t | df | p |
|---|---|---|---|---|---|---|---|
| Conditional (log link) | | | | | | | |
| Intercept | 1.211*** | 3.357 | [1.182, 1.239] | 0.014 | 87.36 | 24.3 | <0.001 |
| Humidity3pm | 0.422*** | 1.526 | [0.387, 0.458] | 0.016 | 25.67 | 14.7 | <0.001 |
| Dewpoint 9am | 0.221*** | 1.247 | [0.159, 0.283] | 0.028 | 7.88 | 10.4 | <0.001 |
| Dewpoint Change | -0.186*** | 0.830 | [-0.251, -0.121] | 0.029 | -6.37 | 10.1 | <0.001 |
| Pressure Change | 0.110** | 1.116 | [0.039, 0.180] | 0.031 | 3.50 | 9.4 | 0.006 |
| Day Cos | 0.079*** | 1.082 | [0.040, 0.118] | 0.018 | 4.39 | 12.8 | <0.001 |
| Day Sin | -0.028** | 0.972 | [-0.046, -0.011] | 0.009 | -3.30 | 40.8 | 0.002 |
| Rainfall Ma7 | 0.127*** | 1.135 | [0.115, 0.138] | 0.006 | 21.33 | 2369.4 | <0.001 |
| Days Since Rain | 0.003 | 1.003 | [-0.010, 0.016] | 0.007 | 0.39 | 133.2 | 0.694 |
| Humidity Ma7 | -0.078*** | 0.925 | [-0.097, -0.059] | 0.010 | -8.09 | 109.1 | <0.001 |
| Rain Yesterday (Yes) | 0.322*** | 1.379 | [0.297, 0.347] | 0.013 | 25.08 | 991.7 | <0.001 |
| Zero-inflation (logit link) | | | | | | | |
| Intercept | 1.089*** | 2.972 | [1.071, 1.107] | 0.009 | 120.17 | 122.5 | <0.001 |
| Humidity3pm | -0.630*** | 0.533 | [-0.680, -0.580] | 0.023 | -26.99 | 13.4 | <0.001 |
| Dewpoint 9am | -0.183*** | 0.833 | [-0.222, -0.145] | 0.018 | -10.29 | 13.4 | <0.001 |
| Rain Yesterday (Yes) | -1.414*** | 0.243 | [-1.449, -1.379] | 0.018 | -79.94 | 124.1 | <0.001 |
| Cloud Development | 0.099*** | 1.104 | [0.077, 0.120] | 0.010 | 9.52 | 25.8 | <0.001 |
| Pressure Change | -0.299*** | 0.742 | [-0.423, -0.174] | 0.055 | -5.42 | 9.3 | <0.001 |
| Sunshine | 0.258*** | 1.294 | [0.233, 0.283] | 0.012 | 20.95 | 31.2 | <0.001 |
| Evaporation | 0.324*** | 1.383 | [0.284, 0.364] | 0.019 | 17.09 | 17.2 | <0.001 |

Note: † p<0.1 * p<0.05 ** p<0.01 *** p<0.001
AIC: 408915.1 BIC: 409066.3 log-Lik: -204437.6 (averaged across imputed datasets)

Model 3 introduces medium-horizon antecedent features (rainfall_ma7, humidity_ma7, days_since_rain, and the binary rain_yesterday) into the conditional intensity submodel, and adds sunshine and evaporation to the zero-inflation submodel (Table 6.4). The motivation follows from the dry spell analysis in Section 4.4.3, which showed that rainfall probability decays as dry spells lengthen, and from the moving average analysis in Section 5.1, which showed that 7-day smoothed signals substantially reduce stochastic variance.

The rain_yesterday coefficient in the conditional intensity component (\(\hat{\beta} = 0.322\), \(e^{0.322} \approx 1.38\), \(p < 0.001\)) indicates that rainfall on the preceding day is associated with approximately 38% greater intensity, consistent with a persisting frontal or trough system delivering consecutive events of comparable character. The rainfall_ma7 coefficient (\(\hat{\beta} = 0.127\), \(p < 0.001\)) extends this logic to the weekly horizon: a wetter preceding week is associated with heavier current rainfall, suggesting that prolonged wet spells reflect deeper or more extensive atmospheric systems rather than sequences of meteorologically independent events.

The negative coefficient for humidity_ma7 (\(\hat{\beta} = -0.078\), \(p < 0.001\)) warrants careful interpretation. Conditional on the full predictor set, a persistently humid preceding week is associated with slightly lower rainfall intensity. This likely reflects a distinction between two meteorological regimes: stratiform systems, which sustain elevated background humidity across extended periods without generating intense precipitation, and convective systems, which produce intense episodic rainfall under conditions of rapid moisture change. Once both instantaneous humidity and its 7-day mean are in the model, the latter carries the residual signal of these differing system types.

days_since_rain is not significant in the conditional intensity component (\(\hat{\beta} = 0.003\), \(p = 0.694\)). The duration of an antecedent dry spell predicts whether rain will occur, but once the model conditions on rainfall occurring, the length of the preceding drought provides no additional information about volume. This reinforces the conceptual separation between the occurrence and intensity sub-processes that motivates the ZIG framework.

Imputation provenance flags for the four ghost-sensor variables (sunshine, evaporation, cloud_3pm, and cloud_9am) were tested in the zero-inflation component across all model specifications from Model 3 onward. None achieved statistical significance (all \(p > 0.05\)), so there is no evidence that the predictive mean matching imputation introduced systematic bias at stations lacking direct instrumentation for these variables. The flags were subsequently excluded from all further model specifications.


6.6 Model 4: Thermodynamic Energy and the Rain Corner Interaction

m4_energy <- fit_and_pool(
  cond_formula = m4_cond,
  zi_formula = zi_m3_to_m5,
  datasets = engineered_list
)
saveRDS(m4_energy, here::here("models", "m4_energy.rds"))

Table 6.5: Model 4: Thermodynamic Energy and Interactions. Pooled via Rubin’s rules.

| Term | Estimate | exp(β) | 95% CI | SE | t | df | p |
|---|---|---|---|---|---|---|---|
| Conditional (log link) | | | | | | | |
| Intercept | 1.239*** | 3.452 | [1.211, 1.267] | 0.014 | 90.61 | 27.9 | <0.001 |
| Humidity3pm | 0.229*** | 1.258 | [0.151, 0.308] | 0.036 | 6.42 | 11.4 | <0.001 |
| Dewpoint 9am | 0.202*** | 1.224 | [0.135, 0.270] | 0.030 | 6.67 | 10.2 | <0.001 |
| Dewpoint Change | -0.183*** | 0.833 | [-0.260, -0.106] | 0.034 | -5.32 | 9.8 | <0.001 |
| Pressure Change | 0.130** | 1.139 | [0.057, 0.203] | 0.033 | 3.98 | 9.4 | 0.003 |
| Day Cos | 0.020 | 1.020 | [-0.018, 0.058] | 0.018 | 1.14 | 13.5 | 0.274 |
| Day Sin | -0.026* | 0.974 | [-0.047, -0.006] | 0.010 | -2.63 | 24.1 | 0.015 |
| Rainfall Ma7 | 0.114*** | 1.121 | [0.102, 0.126] | 0.006 | 18.72 | 569.9 | <0.001 |
| Days Since Rain | -0.000 | 1.000 | [-0.013, 0.013] | 0.007 | -0.04 | 134.4 | 0.965 |
| Humidity Ma7 | -0.044*** | 0.957 | [-0.063, -0.025] | 0.010 | -4.58 | 135.0 | <0.001 |
| Rain Yesterday (Yes) | 0.342*** | 1.407 | [0.316, 0.367] | 0.013 | 26.59 | 749.1 | <0.001 |
| Sunshine | -0.037* | 0.964 | [-0.072, -0.001] | 0.017 | -2.18 | 16.8 | 0.043 |
| Evaporation | 0.154*** | 1.167 | [0.129, 0.179] | 0.012 | 12.55 | 35.8 | <0.001 |
| Instability Index | 0.144*** | 1.155 | [0.104, 0.184] | 0.018 | 7.86 | 12.1 | <0.001 |
| Sun Humid Interaction | -0.076*** | 0.926 | [-0.105, -0.048] | 0.014 | -5.62 | 17.2 | <0.001 |
| Cloud Development | -0.029** | 0.971 | [-0.047, -0.010] | 0.009 | -3.20 | 30.2 | 0.003 |
| Zero-inflation (logit link) | | | | | | | |
| Intercept | 1.089*** | 2.972 | [1.071, 1.107] | 0.009 | 120.13 | 122.0 | <0.001 |
| Humidity3pm | -0.630*** | 0.533 | [-0.680, -0.580] | 0.023 | -26.99 | 13.4 | <0.001 |
| Dewpoint 9am | -0.183*** | 0.833 | [-0.222, -0.145] | 0.018 | -10.29 | 13.4 | <0.001 |
| Rain Yesterday (Yes) | -1.414*** | 0.243 | [-1.449, -1.379] | 0.018 | -79.99 | 125.0 | <0.001 |
| Cloud Development | 0.099*** | 1.104 | [0.077, 0.120] | 0.010 | 9.52 | 25.8 | <0.001 |
| Pressure Change | -0.299*** | 0.742 | [-0.423, -0.174] | 0.055 | -5.42 | 9.3 | <0.001 |
| Sunshine | 0.258*** | 1.294 | [0.233, 0.283] | 0.012 | 20.95 | 31.2 | <0.001 |
| Evaporation | 0.324*** | 1.383 | [0.284, 0.364] | 0.019 | 17.09 | 17.2 | <0.001 |

Note: † p<0.1 * p<0.05 ** p<0.01 *** p<0.001
AIC: 407848.8 BIC: 408037.8 log-Lik: -203899.4 (averaged across imputed datasets)

The bivariate density analysis in Section 4.7 established that intense rainfall occurs almost exclusively where afternoon humidity is high and sunshine hours are simultaneously low. An additive model is structurally incapable of representing this conditional geometry. Model 4 introduces the multiplicative sun_humid_interaction term alongside the thermodynamic energy predictors sunshine, evaporation, instability_index, and cloud_development (Table 6.5).

The sun_humid_interaction coefficient (\(\hat{\beta} = -0.076\), SE = 0.014, \(p < 0.001\)) is highly significant and physically coherent. After mean-centring both component variables, the product term takes large negative values precisely when humidity is above average and sunshine hours are below average. The negative coefficient therefore assigns greater expected rainfall intensity to this region of the covariate space, directly encoding the Rain Corner identified in Section 4.7: an overcast, moisture-saturated atmosphere in which solar insolation is suppressed by deep cloud cover.
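
A short worked example makes the sign logic concrete. It assumes that both variables are standardised, so that \(h\) and \(s\) denote deviations of humidity3pm and sunshine from their means in standard-deviation units (the scaling is an assumption here), and it uses the rounded Model 4 coefficients from Table 6.5. The symbol \(\eta_{\text{partial}}\) is introduced only for this illustration and collects the three terms involving humidity and sunshine.

\[
\eta_{\text{partial}}(h, s) = 0.229\,h - 0.037\,s - 0.076\,h s
\]

\[
\eta_{\text{partial}}(+1, -1) = 0.229 + 0.037 + 0.076 = 0.342, \qquad e^{0.342} \approx 1.41
\]

\[
\eta_{\text{partial}}(+1, +1) = 0.229 - 0.037 - 0.076 = 0.116, \qquad e^{0.116} \approx 1.12
\]

Under these assumptions, and holding all other covariates fixed, a humid and overcast day carries roughly 41% higher expected intensity than an average day, against about 12% for an equally humid but sunny day.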

The instability_index (\(\hat{\beta} = 0.144\), \(p < 0.001\)) operates on related physical grounds: higher atmospheric instability promotes convective development and augments rainfall intensity. The evaporation coefficient (\(\hat{\beta} = 0.154\), \(p < 0.001\)) is positive, suggesting that higher potential evaporation on rainy days, as a proxy for energy availability, is associated with greater intensity, consistent with convectively driven systems drawing on sensible and latent heat fluxes.

The entry of the interaction term and thermodynamic covariates produces a marked reduction in the humidity3pm coefficient, from \(0.422\) in Model 3 to \(0.229\) in Model 4 (Table 6.4, Table 6.5). This is not indicative of instability in the humidity signal. It reflects a correct partitioning of explained variance: the portion of humidity’s predictive contribution that operates through its interaction with sunshine and through the atmospheric instability it enables is now attributed to those specific mechanistic pathways rather than being absorbed into the humidity main effect alone.


6.7 Model 5: Wind Vector Dynamics

m5_wind <- fit_and_pool(
  cond_formula = m5_cond,
  zi_formula = zi_m3_to_m5,
  datasets = engineered_list
)
saveRDS(m5_wind, here::here("models", "m5_wind.rds"))

Table 6.6: Model 5: Wind Vector Dynamics. Pooled via Rubin’s rules.

| Term | Estimate | exp(β) | 95% CI | SE | t | df | p |
|---|---|---|---|---|---|---|---|
| Conditional (log link) | | | | | | | |
| Intercept | 1.210*** | 3.355 | [1.183, 1.238] | 0.013 | 90.48 | 29.6 | <0.001 |
| Humidity3pm | 0.203*** | 1.225 | [0.126, 0.280] | 0.035 | 5.77 | 11.4 | <0.001 |
| Dewpoint 9am | 0.249*** | 1.283 | [0.191, 0.307] | 0.026 | 9.44 | 10.8 | <0.001 |
| Dewpoint Change | -0.169*** | 0.845 | [-0.244, -0.093] | 0.034 | -4.96 | 9.8 | <0.001 |
| Pressure Change | 0.107** | 1.113 | [0.044, 0.171] | 0.028 | 3.79 | 9.6 | 0.004 |
| Day Cos | -0.000 | 1.000 | [-0.034, 0.033] | 0.016 | -0.03 | 15.1 | 0.976 |
| Day Sin | -0.033** | 0.967 | [-0.053, -0.014] | 0.010 | -3.51 | 26.7 | 0.002 |
| Rainfall Ma7 | 0.108*** | 1.114 | [0.096, 0.120] | 0.006 | 17.25 | 274.1 | <0.001 |
| Days Since Rain | -0.009 | 0.991 | [-0.021, 0.004] | 0.006 | -1.36 | 231.0 | 0.176 |
| Humidity Ma7 | -0.047*** | 0.954 | [-0.067, -0.026] | 0.010 | -4.61 | 73.7 | <0.001 |
| Rain Yesterday (Yes) | 0.329*** | 1.389 | [0.304, 0.353] | 0.013 | 26.22 | 1551.9 | <0.001 |
| Sunshine | -0.041* | 0.960 | [-0.077, -0.005] | 0.017 | -2.41 | 16.4 | 0.028 |
| Evaporation | 0.148*** | 1.159 | [0.122, 0.174] | 0.013 | 11.75 | 31.1 | <0.001 |
| Instability Index | 0.158*** | 1.172 | [0.115, 0.201] | 0.020 | 8.05 | 11.6 | <0.001 |
| Sun Humid Interaction | -0.084*** | 0.919 | [-0.113, -0.055] | 0.014 | -6.12 | 16.7 | <0.001 |
| Cloud Development | -0.024** | 0.976 | [-0.042, -0.006] | 0.009 | -2.78 | 34.5 | 0.009 |
| Gust U EW | -0.057*** | 0.945 | [-0.071, -0.042] | 0.007 | -7.67 | 71.8 | <0.001 |
| Gust V NS | -0.008 | 0.992 | [-0.021, 0.005] | 0.006 | -1.27 | 90.2 | 0.208 |
| Wind9am V NS | -0.007 | 0.993 | [-0.018, 0.003] | 0.005 | -1.35 | 1213.8 | 0.177 |
| Wind9am U EW | -0.118*** | 0.889 | [-0.132, -0.104] | 0.007 | -17.02 | 87.4 | <0.001 |
| Zero-inflation (logit link) | | | | | | | |
| Intercept | 1.089*** | 2.972 | [1.071, 1.107] | 0.009 | 120.18 | 122.5 | <0.001 |
| Humidity3pm | -0.630*** | 0.533 | [-0.680, -0.580] | 0.023 | -26.99 | 13.4 | <0.001 |
| Dewpoint 9am | -0.183*** | 0.833 | [-0.222, -0.145] | 0.018 | -10.29 | 13.4 | <0.001 |
| Rain Yesterday (Yes) | -1.414*** | 0.243 | [-1.449, -1.379] | 0.018 | -79.93 | 123.9 | <0.001 |
| Cloud Development | 0.099*** | 1.104 | [0.077, 0.120] | 0.010 | 9.52 | 25.8 | <0.001 |
| Pressure Change | -0.299*** | 0.742 | [-0.423, -0.174] | 0.055 | -5.42 | 9.3 | <0.001 |
| Sunshine | 0.258*** | 1.294 | [0.233, 0.283] | 0.012 | 20.95 | 31.2 | <0.001 |
| Evaporation | 0.324*** | 1.383 | [0.284, 0.364] | 0.019 | 17.09 | 17.2 | <0.001 |

Note: † p<0.1 * p<0.05 ** p<0.01 *** p<0.001
AIC: 407066.5 BIC: 407285.7 log-Lik: -203504.2 (averaged across imputed datasets)

Model 5 tests whether the directional provenance of air masses provides additional information about rainfall intensity beyond the instantaneous thermodynamic state (Table 6.6). As developed in Section 5.2.4, wind direction is encoded as orthogonal zonal (\(U\), east–west) and meridional (\(V\), north–south) vector components, preserving the circular geometry of directional data. Both peak gust components and the 9:00 AM surface wind vectors are included, allowing the model to distinguish between the background synoptic circulation at day onset and the turbulent dynamics associated with active rainfall systems.
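
For readers unfamiliar with the encoding, the sketch below shows one common way to turn a compass direction and speed into zonal and meridional components. The sign convention is an assumption: here components are positive for winds arriving from the east and from the north. The function name `wind_components` and the 16-point compass lookup are illustrative; the actual construction is the one described in Section 5.2.4.

```r
# Convert compass direction (the direction the wind blows FROM) and speed
# into orthogonal components. The sign convention is an assumption.
wind_components <- function(direction, speed) {
  compass <- c("N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
               "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW")
  idx   <- match(direction, compass)   # NA for unrecognised labels
  theta <- (idx - 1) * 2 * pi / 16     # radians clockwise from north
  data.frame(
    u_ew = speed * sin(theta),  # > 0 for winds from the east
    v_ns = speed * cos(theta)   # > 0 for winds from the north
  )
}

# A westerly and a north-westerly wind both give a negative zonal component
wc <- wind_components(c("W", "NW", "E", "N"), speed = c(30, 20, 10, 15))
round(wc, 2)
```

Unlike a raw bearing in degrees, the two components vary smoothly across north, so directions such as NNW and N map to nearby values. Under the opposite sign convention both components flip sign, which would reverse the reading of any coefficient attached to them.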

The most informative wind coefficients are those for the morning surface vectors. The wind9am_U_EW coefficient (\(\hat{\beta} = -0.118\), SE = 0.007, \(p < 0.001\)) is negative. Taking positive \(U\) to denote winds arriving from the east (the sign convention under which this reading holds), a negative coefficient indicates that westerly airflow at 9:00 AM increases expected rainfall intensity. This is physically consistent with the Roaring Forties, the band of persistent westerly circulation in the mid-latitudes that advects maritime air masses from the Indian Ocean onto the southern and southwestern regions of the continent. wind9am_V_NS is not significant at conventional levels (\(\hat{\beta} = -0.007\), \(p = 0.177\)) when the full covariate set is included.

For the peak gust vectors, gust_U_EW is significant (\(\hat{\beta} = -0.057\), \(p < 0.001\)), while gust_V_NS is not (\(\hat{\beta} = -0.008\), \(p = 0.208\)). The morning wind component has a coefficient approximately twice the magnitude of the corresponding gust component. This contrast supports the interpretation that the prevailing direction of the synoptic-scale air mass at day onset is more informative about event intensity than the locally generated turbulence accompanying active storm activity.

Coefficients for the moisture, thermodynamic, and interaction terms from prior models remain stable in direction and magnitude (Table 6.5, Table 6.6), suggesting that the wind predictors explain variance not already accounted for by thermodynamic state rather than displacing existing effects through collinearity.


6.8 Model 6: Spatial Heterogeneity via Mixed Effects

m6_mixed <- fit_and_pool(
  cond_formula = m6_cond,
  zi_formula = zi_m6,
  dispformula = disp_formula,
  datasets = engineered_list,
  control = glmmTMBControl(
    optimizer = nlminb,
    optCtrl = list(iter.max = 1200, eval.max = 1500, rel.tol = 1e-8)
  ),
  preflight_n = 0,
  fail_fast = TRUE,
  parallel = TRUE,
  workers = 4L
)
saveRDS(m6_mixed, here::here("models", "m6_mixed.rds"))

Table 6.7: M6: Mixed Effects Model (fixed-effect estimates). Pooled via Rubin’s rules.

| Term | Estimate | exp(β) | 95% CI | SE | t | df | p |
|---|---|---|---|---|---|---|---|
| Conditional (log link) | | | | | | | |
| Intercept | 6.443*** | 628.029 | [4.985, 7.900] | 0.692 | 9.32 | 17.3 | <0.001 |
| Humidity3pm | 0.249*** | 1.283 | [0.179, 0.319] | 0.035 | 7.16 | 44.0 | <0.001 |
| Dewpoint 9am | 0.188*** | 1.207 | [0.155, 0.222] | 0.016 | 11.72 | 24.4 | <0.001 |
| Day Sin | -0.014* | 0.986 | [-0.028, -0.000] | 0.007 | -2.00 | 150.0 | 0.048 |
| Rainfall Ma7 | 0.066*** | 1.069 | [0.054, 0.079] | 0.007 | 10.20 | 279.6 | <0.001 |
| Humidity Ma7 | -0.029* | 0.972 | [-0.053, -0.005] | 0.012 | -2.38 | 66.7 | 0.020 |
| Dewpoint Change (Spline) | -6.364*** | 0.002 | [-7.924, -4.805] | 0.765 | -8.32 | 30.9 | <0.001 |
| Pressure Change (Spline) | 2.197** | 8.998 | [1.100, 3.294] | 0.489 | 4.49 | 9.5 | 0.001 |
| Rain Yesterday (Yes) | 0.313*** | 1.367 | [0.273, 0.352] | 0.020 | 15.58 | 3422.4 | <0.001 |
| Sunshine | -0.009 | 0.991 | [-0.038, 0.021] | 0.014 | -0.61 | 25.6 | 0.550 |
| Evaporation (Spline) | 4.909*** | 135.439 | [3.857, 5.960] | 0.524 | 9.37 | 52.8 | <0.001 |
| Instability Index (Spline, Df=3)1 | 0.365† | 1.441 | [-0.046, 0.776] | 0.202 | 1.81 | 32.7 | 0.080 |
| Instability Index (Spline, Df=3)2 | -2.105** | 0.122 | [-3.664, -0.546] | 0.772 | -2.73 | 41.1 | 0.009 |
| Instability Index (Spline, Df=3)3 | 0.743* | 2.102 | [0.181, 1.305] | 0.283 | 2.62 | 96.4 | 0.010 |
| Sun Humid Interaction | -0.051*** | 0.950 | [-0.075, -0.028] | 0.012 | -4.41 | 32.6 | <0.001 |
| Gust U EW (Spline, Df=2)1 | -1.843*** | 0.158 | [-2.063, -1.623] | 0.111 | -16.55 | 135.4 | <0.001 |
| Gust U EW (Spline, Df=2)2 | 0.205** | 1.227 | [0.066, 0.344] | 0.070 | 2.93 | 85.3 | 0.004 |
| Wind9am U EW (Spline, Df=2)1 | -2.583*** | 0.076 | [-2.994, -2.172] | 0.200 | -12.89 | 27.2 | <0.001 |
| Wind9am U EW (Spline, Df=2)2 | -0.308*** | 0.735 | [-0.481, -0.135] | 0.088 | -3.50 | 1822.3 | <0.001 |
| Cloud Development | -0.022** | 0.978 | [-0.039, -0.006] | 0.008 | -2.75 | 43.6 | 0.009 |
| Zero-inflation (logit link) | | | | | | | |
| Intercept | 1.100*** | 3.005 | [0.977, 1.223] | 0.063 | 17.54 | 102114.3 | <0.001 |
| Humidity3pm | -0.530*** | 0.588 | [-0.591, -0.470] | 0.028 | -18.94 | 12.9 | <0.001 |
| Dewpoint 9am | -0.512*** | 0.599 | [-0.565, -0.459] | 0.025 | -20.43 | 17.1 | <0.001 |
| Rain Yesterday (Yes) | -1.367*** | 0.255 | [-1.405, -1.329] | 0.019 | -71.41 | 81.8 | <0.001 |
| Cloud Development | 0.087*** | 1.091 | [0.068, 0.106] | 0.009 | 9.33 | 40.9 | <0.001 |
| Pressure Change | -0.303** | 0.739 | [-0.457, -0.149] | 0.068 | -4.43 | 9.1 | 0.002 |
| Sunshine | 0.258*** | 1.294 | [0.228, 0.287] | 0.014 | 17.96 | 24.9 | <0.001 |
| Evaporation | 0.239*** | 1.269 | [0.205, 0.272] | 0.017 | 14.42 | 31.9 | <0.001 |
| Humidity Ma7 | -0.254*** | 0.776 | [-0.290, -0.217] | 0.018 | -14.34 | 22.3 | <0.001 |
| Day Cos | 0.139*** | 1.150 | [0.097, 0.182] | 0.020 | 6.96 | 17.6 | <0.001 |
| Day Sin | 0.202*** | 1.224 | [0.181, 0.224] | 0.011 | 18.88 | 31.2 | <0.001 |
| Dispersion (log link) | | | | | | | |
| Intercept | -0.025 | 0.975 | [-0.140, 0.090] | 0.058 | -0.43 | 330.6 | 0.666 |
| Humidity3pm | 0.052*** | 1.054 | [0.037, 0.068] | 0.008 | 6.55 | 97.9 | <0.001 |
| Rainfall Ma7 | -0.014** | 0.986 | [-0.023, -0.005] | 0.004 | -3.14 | 3273.6 | 0.002 |
| Day Sin | -0.045*** | 0.956 | [-0.056, -0.034] | 0.006 | -8.21 | 38064.6 | <0.001 |
| Day Cos | -0.031*** | 0.969 | [-0.045, -0.018] | 0.007 | -4.51 | 361.6 | <0.001 |
| Pressure Change | 0.031 | 1.031 | [-0.008, 0.069] | 0.017 | 1.76 | 10.9 | 0.107 |
| Dewpoint Change | -0.050*** | 0.951 | [-0.073, -0.027] | 0.011 | -4.57 | 18.7 | <0.001 |
| Evaporation (Spline) | -3.101*** | 0.045 | [-4.007, -2.194] | 0.453 | -6.85 | 57.2 | <0.001 |
| Gust U EW (Spline, Df=2)1 | -0.643*** | 0.526 | [-0.853, -0.433] | 0.106 | -6.05 | 133.0 | <0.001 |
| Gust U EW (Spline, Df=2)2 | 0.116* | 1.122 | [0.001, 0.230] | 0.058 | 1.99 | 189.7 | 0.048 |

Note: † p<0.1 * p<0.05 ** p<0.01 *** p<0.001
AIC: 400151.2 BIC: 400498.9 log-Lik: -200029.6 (averaged across imputed datasets)

The preceding models (Models 0 to 5) share a common assumption: that the relationship between atmospheric predictors and rainfall is spatially uniform across a continent spanning tropical monsoon, subtropical semi-arid, temperate oceanic, and Mediterranean climate regimes. The EDA in Chapter 4 confirmed that both mean rainfall levels and seasonal patterns vary substantially across the station network, and a pooled fixed-effects structure cannot represent that heterogeneity without absorbing it as residual noise.

Model 6 addresses this through a mixed-effects structure (Table 6.7). Location-specific random effects allow the intercept and three slopes (humidity3pm, rain_yesterday, and dewpoint_change) to vary across weather stations. An uncorrelated diagonal covariance structure is specified via diag() rather than a full unstructured matrix, which would be overparameterised given the number of stations. The zero-inflation submodel is simultaneously expanded to include humidity_ma7 and day-of-year harmonic terms alongside the sunshine and evaporation terms carried over from Models 3 to 5, reflecting the hypothesis that occurrence probability is governed by the same broad range of atmospheric drivers as intensity. The days_since_rain term is replaced by a four-degree-of-freedom natural spline, implementing the non-linear dry spell decay identified in Section 4.4.3: rather than imposing a constant log-odds decrement, the spline allows precipitation probability to decline steeply through the first ten dry days before stabilising.
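
The random-effects structure described here can be sketched in glmmTMB formula syntax. The formula below is illustrative rather than the actual `m6_cond` object: the grouping variable name `location`, the abbreviated fixed-effect list, and the object name `m6_cond_sketch` are assumptions, and only the `diag()` covariance term and the use of natural splines mirror what the text describes.

```r
library(splines)  # ns() for natural cubic splines (ships with R)

# Illustrative conditional formula: fixed effects abbreviated,
# `location` is an assumed name for the station identifier.
m6_cond_sketch <- rainfall ~ humidity3pm + dewpoint_9am + rain_yesterday +
  ns(instability_index, df = 3) +
  # diag(): independent random intercept and slopes per station
  diag(1 + humidity3pm + rain_yesterday + dewpoint_change | location)

# A diagonal structure estimates one variance per random term (4 here),
# whereas an unstructured matrix also estimates 4 * 3 / 2 = 6 covariances.
n_terms       <- 4
n_par_diag    <- n_terms
n_par_unstruc <- n_terms * (n_terms + 1) / 2

c(diag = n_par_diag, unstructured = n_par_unstruc)
```

The parameter count illustrates the trade-off: the diagonal structure gives up the ability to estimate correlations between station-level intercepts and slopes in exchange for a smaller, more stable variance model.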

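For predictors that retain a linear specification, the fixed-effect estimates remain directionally consistent with Model 5, with some shifts in magnitude: humidity3pm moves from 0.203 to 0.249, rain_yesterday from 0.329 to 0.313, and rainfall_ma7 attenuates from 0.108 to 0.066. This pattern suggests that the random effects are absorbing genuine between-location variation rather than distorting population-average coefficients. Two exceptions should be noted: the sunshine main effect is no longer significant (\(p = 0.550\)), and the intercept (6.443) is not directly comparable with earlier models, presumably because of the spline basis terms now in the predictor.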

Two predictors in the expanded zero-inflation submodel warrant attention. The dewpoint_9am coefficient strengthens markedly (\(\hat{\beta}_{zi} = -0.512\), \(p < 0.001\)), compared with \(-0.183\) in Models 3 to 5 and non-significance in Models 1 and 2. Once sunshine, evaporation, humidity_ma7, and seasonal harmonics are all controlled for, morning dewpoint carries a distinct contribution to precipitation probability that was previously absorbed by correlated terms. The humidity_ma7 coefficient (\(\hat{\beta}_{zi} = -0.254\), \(p < 0.001\)) is negative and substantial: a persistently humid antecedent week substantially reduces the probability of a dry day, consistent with extended moisture advection maintaining conditions favourable to continued precipitation.

The natural spline terms for days_since_rain (these basis terms are not listed in Table 6.7) are collectively non-significant, with only the second basis function marginally approaching significance (\(p = 0.075\)). The dry spell decay function, while visually non-linear in Section 4.4.3, does not depart substantially from linearity once humidity, seasonal, and persistence predictors are included. The spline is retained for robustness, but the practical difference from a linear specification appears limited under the current covariate set.

6.8.1 Dispersion Modelling

Model 6 additionally estimates the Gamma shape parameter \(\phi_i\) as a function of covariates rather than treating it as a global constant. The conditional variance of rainfall given occurrence, \(\text{Var}(Y \mid Y > 0) = \mu_i^2 / \phi_i\), is unlikely to be uniform across atmospheric conditions: deep convective events produce more erratic intensity distributions than steady frontal rainfall under stable westerly flow, and a fixed dispersion parameter cannot represent this.
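
The link between a dispersion coefficient and conditional variance can be illustrated numerically. The sketch assumes, as stated in the paragraph above, that \(\phi_i\) is the Gamma shape with a log link on the dispersion predictor, and it treats a one-unit increase in a single covariate in isolation. The coefficient is the rounded humidity3pm value from Table 6.7; the conditional mean and the baseline shape are arbitrary illustrative values.

```r
set.seed(1)

mu      <- 6.5          # arbitrary conditional mean (mm)
phi_0   <- exp(-0.025)  # illustrative baseline shape (dispersion intercept only)
b_humid <- 0.052        # humidity3pm dispersion coefficient (Table 6.7)

# A one-unit increase in humidity3pm multiplies phi by exp(b_humid)
phi_1 <- phi_0 * exp(b_humid)

# Var(Y | Y > 0) = mu^2 / phi, so the variance scales by exp(-b_humid)
var_0     <- mu^2 / phi_0
var_1     <- mu^2 / phi_1
var_ratio <- var_1 / var_0

# Simulation check of the variance formula (shape = phi, scale = mu / phi)
y <- rgamma(2e5, shape = phi_0, scale = mu / phi_0)

c(theory = var_0, simulated = var(y), var_ratio = var_ratio)
```

A positive dispersion coefficient therefore narrows the conditional distribution at a fixed mean, while a negative one widens it.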

The dominant signal comes from the evaporation spline (\(\hat{\beta} = -3.101\), \(p < 0.001\)). Higher surface evaporative demand is associated with substantially lower \(\phi_i\), meaning greater conditional variance in rainfall volume. dewpoint_change follows the same direction (\(\hat{\beta} = -0.050\), \(p < 0.001\)): a larger diurnal drop in dewpoint, indicative of dry-air entrainment under unstable conditions, reduces \(\phi_i\) further and increases conditional dispersion. humidity3pm moves in the opposite direction (\(\hat{\beta} = 0.052\), \(p < 0.001\)), with higher afternoon humidity increasing \(\phi_i\) and narrowing conditional variance. Moister atmospheric states produce more sustained, consistent rainfall rather than isolated high-intensity events. The day-of-year harmonics (day_sin: \(\hat{\beta} = -0.045\); day_cos: \(\hat{\beta} = -0.031\); both \(p < 0.001\)) are negative and highly significant, capturing seasonal modulation of rainfall variability that operates independently of mean-level effects across the annual cycle.

Taken together, the dispersion coefficients reflect a physically interpretable pattern: moisture and atmospheric stability reduce conditional variance, while instability, dry-air intrusion, and high evaporative demand increase it.