Spatiotemporal Analysis of Tuberculosis using Hausdorff–Gaussian Processes

Available at lcgodoy.me/slides/2024-cobal/

Lucas da Cunha Godoy

ldcgodoy@ucsc.com

EEB Department, UCSC

VII COBAL & XVII EBEB – UFMG

2024-12-05

Introduction

Too many slides…

Tuberculosis in context

Mortality: Tuberculosis (TB) remains a major global health threat, second in infectious disease mortality only to COVID-19.
Rio Grande do Sul (RS) reported significantly higher incidence than the national average in 2021, with the eastern region even more affected.
Dependence: Studies demonstrate strong spatial dependence of TB infections in Brazil, but temporal and spatiotemporal structures have been largely overlooked.
Risk Factors: TB risk factors include densely populated areas, poverty, substance abuse, and incarceration (Cortez et al. 2021).

Spatiotemporal (SPT) models for areal data

Spatial models: CAR (Besag 1974), ICAR, BYM (Besag et al. 1991), DAGAR (Datta et al. 2019), RENeGe (Cruz-Reyes et al. 2023).
Nonseparable SPT models are more complex as they consider that the spatial and temporal correlations might be intertwined (Cressie and Wikle 2015, pg. 309–321).
Separable models: one way to look at these models is as multivariate spatial processes (MacNab 2022).
Advantages of separable models: Computational efficiency & positive-definiteness of the covariance function.

Proposed methodology & Objectives

Hausdorff–Gaussian Process (HGP): we propose using the newly developed HGP for the spatial portion of the model (Godoy et al. 2024).
Reliable incidence estimates:
- Smaller municipalities benefit from borrowed strength from neighbors, improving estimate reliability.
- Results enable the calculation of standardized incidence ratios to pinpoint high-risk areas.
Forecasting: Predicted TB incidence rates one year ahead offer crucial insights for proactive public health planning.

Hausdorff–Gaussian Process (HGP)

Preliminaries

Areal spatial units are (closed and bounded) sets.
We need to generalize distance between points to distance between sets.
Ideally, this distance should:
1. Take into account the shape, size, and orientation of spatial sample units.
2. Be “spatially interpretable”.

Distances between sets

Distance between a point and a set: \(d(x, A) = \inf_{a \in A} d(x, a)\), where \(d(x, y)\) is the distance between any two elements \(x, y \in D\)
Directed Hausdorff & Hausdorff distance: \[{\vec h}(A, B) = \sup_{a \in A} d(a, B) \quad \text{and} \quad h(A, B) = \max \left \{ \vec{h}(A, B), \vec{h}(B, A) \right \}\]

Symmetric Hausdorff distance: the greater of the two directed Hausdorff distances.
Note that if \(A\) and \(B\) are both singletons, then \(h(A, B) = d(A, B)\).
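The definitions above can be sketched in a few lines of Python. This is only an illustration for finite point sets, where the infimum and supremum reduce to `min` and `max`; real areal units are polygons, and applying this to their vertices alone is an approximation. The function names and the example coordinates are hypothetical.

```python
from math import dist  # Euclidean distance between two points (Python >= 3.8)


def point_to_set(x, A):
    """d(x, A): smallest distance from point x to any point of the finite set A."""
    return min(dist(x, a) for a in A)


def directed_hausdorff(A, B):
    """Directed Hausdorff distance: largest d(a, B) over all a in A."""
    return max(point_to_set(a, B) for a in A)


def hausdorff(A, B):
    """Symmetric Hausdorff distance: the greater of the two directed distances."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))


# Hypothetical example: A is nested in B, so the infimum distance is zero,
# yet the Hausdorff distance is not.
A = [(0.0, 0.0)]
B = [(0.0, 0.0), (3.0, 4.0)]

print(directed_hausdorff(A, B))  # every point of A is also in B
print(directed_hausdorff(B, A))  # the point (3, 4) is far from A
print(hausdorff(A, B))
```

The two directed distances differ here, which is why the symmetric version takes their maximum.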

Metric properties: (1) Symmetry: \(d(x, y) = d(y, x)\); (2) Nonnegativeness: \(d(x, y) \geq 0\) and \(d(x, x) = 0\); (3) Positiveness: \(d(x, y) = 0 \implies x = y\); (4) Triangle inequality: \(d(x, y) \leq d(x, z) + d(z, y)\)

Metric on \(D \setminus \varnothing\)
The Hausdorff distance's ability to account for the shapes, sizes, and orientations of spatial units (Min et al. 2007) makes it a natural tool for our goals.
Moreover, it can distinguish between overlapping, nested, and disjoint regions.
In the figure: (1) The dashed lines denote the Hausdorff distances; (2) The infimum distance between the two sets is zero in all three cases; (3) The distance between centroids is the same for the first two figures.

The HGP

General spatial model: \(\{ Z(\mathbf{s}) \; : \; \mathbf{s} \in \mathcal{B}(D) \}\).
Index set: \(\mathcal{B}(D)\) represents the closed and bounded subsets of \(D \subset \mathbb{R}^2\).
Assumption: The HGP assumes \(Z(\mathbf{s})\) to be an isotropic Gaussian Process such that its spatial correlation function depends on the Hausdorff distance.
Powered Exponential Correlation (PEC) function: \(r(h) = \exp\left \{ - \frac{h^{\nu}}{\phi^{\nu}}\right \},\) where \(h\) denotes the Hausdorff distance between \(\mathbf{s}_1, \mathbf{s}_2 \in \mathcal{B}(D)\).
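As a rough sketch, the PEC function can be applied entrywise to a matrix of pairwise Hausdorff distances to obtain a spatial correlation matrix. The distance matrix `H` and the parameter values below are made up for illustration, and the code does not check positive definiteness.

```python
from math import exp


def pec(h, phi, nu):
    """Powered exponential correlation r(h) = exp{-(h / phi)^nu}."""
    if h < 0 or phi <= 0 or nu <= 0:
        raise ValueError("need h >= 0, phi > 0 and nu > 0")
    return exp(-((h / phi) ** nu))


def correlation_matrix(H, phi, nu):
    """Apply the PEC function to every entry of a distance matrix H."""
    return [[pec(h, phi, nu) for h in row] for row in H]


# Hypothetical Hausdorff distances between three areal units
H = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.5],
    [2.0, 1.5, 0.0],
]
R = correlation_matrix(H, phi=2.0, nu=0.75)
for row in R:
    print([round(r, 3) for r in row])
```

At \(h = \phi\) the correlation equals \(e^{-1}\) for any \(\nu\), which gives \(\phi\) a direct interpretation as a range-like parameter.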

Ideally, we want bounded, compact, and non-empty sets for the Hausdorff distance to be a metric.
In \(\mathbb{R}^2\), a compact subset is bounded.
The empty set is a subset of every set, so it is bounded: a subset of a bounded set is itself bounded.
Mean and SD functions can be informed by covariates.
To ensure the validity of the process, the function \(r(\cdot)\) must be positive definite.
Complex functional parametric space.
Inference is simplified (or made possible) by assuming parametric forms for those functions.
The SD function allows the HGP to accommodate both homoscedastic and heteroscedastic scenarios.

Tuberculosis spatiotemporal modeling

Data & Model

Sample units: 54 municipalities, across 11 years (2011 to 2021). We use 2022 to assess the quality of predictions.
Number of TB cases: \(Y_t(\mathbf{s}_i)\) at location \(\mathbf{s}_i\) and time \(t\).
Population: \(P_t(\mathbf{s}_i)\).
Five covariates and their two-way interactions with the presence of a prison (except IDESE).

\[\begin{aligned} & (Y_t(\mathbf{s}_i) \mid \mathbf{X}_{t}(\mathbf{s}_i), Z(\mathbf{s}_i, t)) \overset{{\rm ind}}{\sim} \text{Poisson}(P_t(\mathbf{s}_i) \mu_{it}) \\ & \log(\mu_{it}) = \alpha + \mathbf{X}^\top_{t}(\mathbf{s}_i) \beta + Z(\mathbf{s}_i, t) \end{aligned}\]
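To make the observation model concrete, the sketch below evaluates the Poisson log-likelihood contribution of a single municipality-year. The function names and the numeric values are illustrative assumptions, not taken from the analysis.

```python
from math import exp, lgamma, log


def log_rate(alpha, beta, x, z):
    """log(mu_it) = alpha + x' beta + z, the log incidence rate per person."""
    return alpha + sum(b * xi for b, xi in zip(beta, x)) + z


def poisson_loglik(y, pop, alpha, beta, x, z):
    """Log-density of y ~ Poisson(pop * mu) for one location and time."""
    mean = pop * exp(log_rate(alpha, beta, x, z))
    return y * log(mean) - mean - lgamma(y + 1)


# Hypothetical municipality: 50,000 inhabitants, one covariate, 30 cases
ll = poisson_loglik(y=30, pop=50_000, alpha=log(5e-4), beta=[0.3], x=[1.0], z=-0.1)
print(ll)
```

The population \(P_t(\mathbf{s}_i)\) acts as an offset, so \(\mu_{it}\) is interpreted as a per-person rate.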

Priors

We assume \(Z(\mathbf{s}, t)\) is a separable zero-mean Gaussian model such that its SPT covariance matrix is the Kronecker product between a spatial covariance (HGP, BYM, or DAGAR) and a temporal correlation (\(\mathrm{AR}(1)\)).
HGP spatial dependence: \(\rho \sim \mathrm{Exp}(a_\rho)\), where \(a_{\rho} = - \log(p_{\rho}) / \rho_0\). \(a_\rho\) is chosen such that \(\mathbb{P}(\rho > \rho_0) = p_\rho\).
Smoothness & marginal SD: \(\nu \sim \mathrm{Beta}(2.5, 1.5)\) (mode at \(0.75\)) & \(\sigma \sim t_{+}(3)\).
Temporal dependence: PC prior (Sørbye and Rue 2017) where \(\mathbb{P}(\lvert \psi \rvert > 0.8) = 0.1\).
Intercept & regression coefficients: flat prior on \(\alpha\) (i.e., \(\pi(\alpha) \propto 1\)) & \(\boldsymbol{\beta} \sim \mathcal{N}(\mathbf{0}, 10 \mathbf{I})\)
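The rate of the exponential prior on \(\rho\) follows from the tail probability \(\mathbb{P}(\rho > \rho_0) = \exp(-a_\rho \rho_0)\). A small numerical check, with hypothetical values for \(\rho_0\) and \(p_\rho\):

```python
from math import exp, log


def exp_rate(rho0, p):
    """Rate a such that P(rho > rho0) = p when rho ~ Exp(a)."""
    if rho0 <= 0 or not 0 < p < 1:
        raise ValueError("need rho0 > 0 and 0 < p < 1")
    return -log(p) / rho0


def tail_prob(a, x):
    """P(rho > x) for rho ~ Exp(a)."""
    return exp(-a * x)


# Hypothetical choice: 5% prior probability that rho exceeds 100 distance units
a_rho = exp_rate(rho0=100.0, p=0.05)
print(a_rho, tail_prob(a_rho, 100.0))
```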

Computational considerations

Super effortful: Evaluating \(\mathrm{vec}(\mathbf{Z}) \sim \mathcal{N}(\mathbf{0}, \sigma^2 \mathrm{R}_s \otimes \mathrm{R}_t)\) naively requires \(\mathcal{O}(N^3 T^3)\) flops and \(\mathcal{O}(N^2 T^2)\) storage.
Effortful: With linear algebra, we can reduce the computational cost to \(\approx \mathcal{O}(N^3 + T^3)\) flops and the storage to \(\mathcal{O}(N^2 + T^2)\).
Neutral: More linear algebra can be used to evaluate the quadratic form with fewer operations.
Clever: For an AR(1) process, \(R^{-1}_t\) is tridiagonal, so its Cholesky factor is bidiagonal.
Super clever: The cost of obtaining \(\mathrm{chol}(R^{-1}_s)\) decreases dramatically with nearest-neighbor approximations (Finley et al. 2019).
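The following sketch illustrates the Kronecker identities behind these savings; it is not the implementation used in the analysis. It assumes NumPy, stores \(\mathbf{Z}\) as an \(N \times T\) matrix whose row-major vectorization has covariance \(\sigma^2 \mathrm{R}_s \otimes \mathrm{R}_t\), and uses made-up correlation matrices. The dense version is included only to check the result.

```python
import numpy as np


def ar1_corr(T, psi):
    """AR(1) correlation matrix with entries psi^|t - u|."""
    idx = np.arange(T)
    return psi ** np.abs(idx[:, None] - idx[None, :])


def kron_logpdf(Z, Rs, Rt, sigma):
    """Log-density of vec(Z) ~ N(0, sigma^2 Rs (x) Rt) without forming the Kronecker product."""
    N, T = Z.shape
    _, logdet_s = np.linalg.slogdet(Rs)
    _, logdet_t = np.linalg.slogdet(Rt)
    # log|sigma^2 Rs (x) Rt| = N T log(sigma^2) + T log|Rs| + N log|Rt|
    logdet = 2 * N * T * np.log(sigma) + T * logdet_s + N * logdet_t
    # vec(Z)' (Rs (x) Rt)^{-1} vec(Z) = tr(Rs^{-1} Z Rt^{-1} Z')
    A = np.linalg.solve(Rs, Z)    # Rs^{-1} Z, shape (N, T)
    B = np.linalg.solve(Rt, Z.T)  # Rt^{-1} Z', shape (T, N)
    quad = np.sum(A * B.T) / sigma**2
    return -0.5 * (N * T * np.log(2 * np.pi) + logdet + quad)


def dense_logpdf(Z, Rs, Rt, sigma):
    """Same log-density using the full NT x NT covariance (for checking only)."""
    cov = sigma**2 * np.kron(Rs, Rt)
    z = Z.ravel()
    _, logdet = np.linalg.slogdet(cov)
    quad = z @ np.linalg.solve(cov, z)
    return -0.5 * (z.size * np.log(2 * np.pi) + logdet + quad)


# Hypothetical sizes and parameters
N, T = 4, 3
pts = np.arange(N)
Rs = np.exp(-np.abs(pts[:, None] - pts[None, :]) / 2.0)  # stand-in spatial correlation
Rt = ar1_corr(T, psi=0.6)
Z = np.random.default_rng(0).normal(size=(N, T))

print(kron_logpdf(Z, Rs, Rt, sigma=1.5), dense_logpdf(Z, Rs, Rt, sigma=1.5))
```

Only the \(N \times N\) and \(T \times T\) matrices are ever factorized in `kron_logpdf`, which is where the \(\mathcal{O}(N^3 + T^3)\) cost comes from.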

Bayesian Inference & Model Assessment

Posterior: \(\pi(\boldsymbol{\theta} \mid \mathbf{y}, \mathbf{z}) \propto p(\mathbf{y} \mid \mathbf{z}, \boldsymbol{\theta}) p(\mathbf{z} \mid \boldsymbol{\theta}) \pi(\boldsymbol{\theta})\)
MCMC sampler: No-U-Turn (Homan and Gelman 2014).
Convergence assessment: traceplots and split-\({\hat{R}}\) (Vehtari et al. 2021).
Goodness-of-fit criteria: LOOIC (lower values indicate better fit)
Posterior predictive distributions: \(p(\mathbf{y}^{\ast} \mid \mathbf{y})\)
Prediction assessment: Interval Score (IS) and RMSP (lower values indicate better predictions)
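The slides do not spell out the two prediction criteria. The sketch below assumes that IS is the usual interval score for a central \((1 - \alpha)\) prediction interval (interval width plus a penalty of \(2/\alpha\) times the amount by which the observation falls outside) and that RMSP is the root mean squared prediction error. The numbers are invented.

```python
from math import sqrt


def interval_score(lower, upper, y, alpha=0.05):
    """Interval score for a central (1 - alpha) prediction interval [lower, upper]."""
    width = upper - lower
    below = (2.0 / alpha) * max(lower - y, 0.0)  # penalty if y falls below the interval
    above = (2.0 / alpha) * max(y - upper, 0.0)  # penalty if y falls above the interval
    return width + below + above


def rmsp(predictions, observations):
    """Root mean squared prediction error."""
    sq = [(p - o) ** 2 for p, o in zip(predictions, observations)]
    return sqrt(sum(sq) / len(sq))


# Hypothetical 95% intervals and point predictions for two municipalities
print(interval_score(10.0, 20.0, 15.0))  # observation inside: score is the width
print(interval_score(10.0, 20.0, 22.0))  # observation above: width plus penalty
print(rmsp([12.0, 30.0], [15.0, 26.0]))
```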

Point and interval estimates: median and percentiles (\(0.025\) and \(0.975\)) of the marginal MCMC samples.
Parameters initialized by random sampling from their priors.
Random effects initialized from a standard normal distribution.
\(a_\rho\) is chosen so that \(\mathbb{P}(\rho > \rho_0) = p_\rho\). In particular, \(a_\rho = - \log(p_\rho) / \rho_0\).
\(\nu\) is hard to estimate.

Predictions:

1. Use the properties of GPs and the multivariate Normal distribution to obtain the closed-form distribution of the vector of spatial random effects \(\mathbf{Z}^\ast\) (Diggle et al. 1998).
2. Sample \(\mathbf{z}^\ast_{(b)}\) from the distribution derived in the previous step.
3. Sample \(\mathbf{y}^{\ast}_{(b)}\) from \(p(\mathbf{y}^{\ast} \mid \boldsymbol{\theta}_{(b)}, \mathbf{z}^{\ast}_{(b)})\), where \(\boldsymbol{\theta}_{(b)}\) is the \(b\)-th MCMC sample of \(\boldsymbol{\theta}\).
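The three prediction steps can be sketched for a single MCMC draw as follows. This is a generic illustration using the standard conditional distribution of a multivariate Normal, not the code used for the results. It assumes NumPy, and the covariance blocks, the `theta` dictionary layout, and all numeric values are hypothetical.

```python
import numpy as np


def conditional_mvn(z, cov_oo, cov_po, cov_pp):
    """Mean and covariance of Z* | Z = z for a zero-mean joint Gaussian.

    cov_oo: covariance of the fitted effects, cov_pp: covariance of the new
    effects, cov_po: cross-covariance between new and fitted effects.
    """
    w = np.linalg.solve(cov_oo, cov_po.T)  # cov_oo^{-1} cov_op
    mean = w.T @ z
    cov = cov_pp - cov_po @ w
    return mean, cov


def predict_draw(rng, z, theta, cov_oo, cov_po, cov_pp, X_new, pop_new):
    """One posterior predictive draw (z*, y*) given one MCMC draw (theta, z)."""
    mean, cov = conditional_mvn(z, cov_oo, cov_po, cov_pp)     # step 1
    z_new = rng.multivariate_normal(mean, cov)                 # step 2
    log_mu = theta["alpha"] + X_new @ theta["beta"] + z_new
    y_new = rng.poisson(pop_new * np.exp(log_mu))              # step 3
    return z_new, y_new


# Hypothetical setup: one location, two fitted years, one forecast year, AR(1) with psi = 0.5
R = 0.5 ** np.abs(np.subtract.outer(np.arange(3), np.arange(3)))
z_fit = np.array([0.3, -1.2])
theta = {"alpha": np.log(1e-3), "beta": np.array([0.2])}
rng = np.random.default_rng(1)

z_new, y_new = predict_draw(
    rng, z_fit, theta, R[:2, :2], R[2:, :2], R[2:, 2:],
    X_new=np.array([[1.0]]), pop_new=np.array([50_000.0]),
)
print(z_new, y_new)
```

Repeating `predict_draw` over all MCMC draws yields samples from \(p(\mathbf{y}^{\ast} \mid \mathbf{y})\), from which point forecasts and prediction intervals can be summarized.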

Spatiotemporal Trend

Explanatory Variables

Results: GOF and Predictive Performance

Model	LOOIC	RMSP	IS
HGP	3516.1	21.1	87.8
BYM	3606.1	123.3	176.6
DAGAR	3520.9	22.4	88.8

Results: Relative Risks

Parameter	Description	Estimate
\(\exp(\beta_1)\)	Prison	2.34 (1.70, 3.19)
\(\exp(\beta_2)\)	Pop / km²	1.33 (1.15, 1.56)
\(\exp(\beta_2 + \beta_{21})\)		1.75 (1.18, 2.52)
\(\exp(\beta_3)\)	HS dropout %	1.03 (0.99, 1.07)
\(\exp(\beta_3 + \beta_{31})\)		2.25 (1.63, 3.09)
\(\exp(\beta_4)\)	Homicide rate	0.97 (0.93, 1.00)
\(\exp(\beta_4 + \beta_{41})\)		2.51 (1.83, 3.46)
\(\exp(\beta_5)\)	IDESE	0.99 (0.92, 1.07)

Spatiotemporal Dependence

Small Municipalities

Forecast

Closing remarks

Tailored an HGP extension for spatiotemporal disease mapping.
Competitive with specialized models
It helps to gain insights into spatiotemporal disease mapping through spatiotemporal correlation functions.
More reliable estimates of risk factors
Out-of-sample predictions to inform public policies

References

Besag, J. (1974), “Spatial interaction and the statistical analysis of lattice systems,” Journal of the Royal Statistical Society. Series B (Methodological), JSTOR, 192–236.

Besag, J., York, J., and Mollié, A. (1991), “Bayesian image restoration, with two applications in spatial statistics,” Annals of the Institute of Statistical Mathematics, 43, 1–20.

Cortez, A. O., Melo, A. C. de, Neves, L. de O., Resende, K. A., and Camargos, P. (2021), “Tuberculosis in Brazil: One country, multiple realities,” Jornal Brasileiro de Pneumologia, Sociedade Brasileira de Pneumologia e Tisiologia, 47, e20200119. https://doi.org/10.36416/1806-3756/e20200119.

Cressie, N., and Wikle, C. K. (2015), Statistics for spatio-temporal data, Wiley.

Cruz-Reyes, D. L., Assunção, R. M., and Loschi, R. H. (2023), “Inducing high spatial correlation with randomly edge-weighted neighborhood graphs,” Bayesian Analysis, International Society for Bayesian Analysis, 1, 1–35.

Datta, A., Banerjee, S., Hodges, J. S., and Gao, L. (2019), “Spatial disease mapping using directed acyclic graph auto-regressive (DAGAR) models,” Bayesian Analysis, NIH Public Access, 14, 1221.

Diggle, P. J., Tawn, J. A., and Moyeed, R. A. (1998), “Model-based geostatistics,” Journal of the Royal Statistical Society Series C: Applied Statistics, Oxford University Press, 47, 299–350.

Finley, A. O., Datta, A., Cook, B. D., Morton, D. C., Andersen, H. E., and Banerjee, S. (2019), “Efficient algorithms for Bayesian nearest neighbor Gaussian processes,” Journal of Computational and Graphical Statistics, ASA Website, 28, 401–414. https://doi.org/10.1080/10618600.2018.1537924.

Godoy, L. da C., Prates, M. O., and Yan, J. (2024), “Statistical inferences and predictions for areal data and spatial data fusion with Hausdorff–Gaussian processes,” Journal of the American Statistical Association (under review). https://doi.org/10.48550/arXiv.2208.07900.

Homan, M. D., and Gelman, A. (2014), “The No-U-turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo,” Journal of Machine Learning Research, JMLR.org, 15, 1593–1623.

MacNab, Y. C. (2022), “Bayesian disease mapping: Past, present, and future,” Spatial Statistics, Elsevier, 50, 100593.

Min, D., Zhilin, L., and Xiaoyong, C. (2007), “Extended Hausdorff distance for spatial objects in GIS,” International Journal of Geographical Information Science, Taylor & Francis, 21, 459–475.

Sørbye, S. H., and Rue, H. (2017), “Penalised complexity priors for stationary autoregressive processes,” Journal of Time Series Analysis, Wiley Online Library, 38, 923–935.

Vehtari, A., Gelman, A., Simpson, D., Carpenter, B., and Bürkner, P.-C. (2021), “Rank-normalization, folding, and localization: An improved \(\hat{R}\) for assessing convergence of MCMC (with discussion),” Bayesian Analysis, International Society for Bayesian Analysis, 16, 667–718.

Thank you!

Appendix

Sensitivity analysis