### Routine surveillance data

The primary dataset consists of annual malaria case counts from 2017 to 2021 for 1243 health facilities in Laos. Because 83 healthcare facilities did not have spatial coordinates, the analysis was performed on 1160 geolocated healthcare facility points (care facilities with latitude and longitude). Figure 1 shows observed malaria cases in health facilities in Laos from 2017 to 2021. The number of reported malaria cases in Laos has fallen significantly over the past five years, from approximately 9300 cases in 2017 to less than 4000 cases in both 2020 and 2021. with further reduction to 2305 cases in 2022 (8). The 2019 to 2021 data includes a breakdown by *P. falciparum* And *P. vivax* in health facilities in Laos (see Figure 1 in Supplementary Information (SI)).

### Treatment-seeking tendency

Persistent propensity to seek treatment was estimated using a distance decay-based model ( 18 , 19 ) for any healthcare setting. Here, distance was defined as travel time using a road- and terrain-informed friction surface from the Malaria Atlas Project ( 20 ). The decay model was compared with survey data from the 2017 Lao PDR Social Indicator Survey II (21) (see Figure 2 in SI). According to the survey, the national percentage seeking treatment was 58%, with province-specific percentages ranging from 28 to 77.8%. The minimum and maximum values of the decay curve were then calibrated to better match the national average.

### High-resolution environmental covariates and model selection

The incidence model used a set of environmental covariates at a spatial resolution of 1 km \(\time\) 1 km, which are known to influence the effects of malaria. The selection of covariates was based on an extensive review that identified robust associations with malaria (see (22)). The covariates are described in detail in Table 1. A range of static and dynamic covariates were used.

To evaluate the performance of the modeling approach, an extensive cross-validation analysis was performed. This involved testing different model combinations, including a full model including both covariates and spatial random effects, a model with only covariates, and a model with only spatial random effects.

In addition, the performance of two different parasite models was assessed using the data from 2019 to 2021, including a breakdown by *P. falciparum* And *P. vivax*The first model fitted at the same time *P. falciparum* And *P. vivax*, while the second model mounted these two parasites separately. By comparing the performance of these different models, the most effective approach for predicting and understanding malaria transmission dynamics was determined.

Overall, the modeling approach used here provided valuable insights into the environmental factors that contributed to malaria transmission and helped identify areas at high risk of malaria. Furthermore, the comparison of different model combinations and parasite models helped improve the accuracy and reliability of predictions and ultimately led to more effective malaria control strategies.

### Estimating the watershed population

To calculate appropriate incidence rates for cases observed in healthcare facilities, the specific population in the catchment area for each healthcare facility was estimated. The catchment area population in this case was defined as the number of people likely to seek treatment at each facility. The population data is derived from the High Resolution Settlement Layer (HRSL) which represents the population in Laos in 2018. District-level population estimates for \(2019-2021\) were used to adjust the HRSL population estimates using rakes. In this analysis, a modified gravity model was used to estimate these watershed populations based on travel time to health care facilities, using the latest global maps of travel time to health care facilities ( 20 ).

For the river basin model, this is the probability that an individual in the *i*-the pixel sought treatment at the *J*-the healthcare institution, \(\bar{p}(\text {pixel}_i \right arrow \text {hf}_j)\)was modeled as the relative attractiveness of that facility, \(p(\text {pixel}_i\right arrow \text {hf}_j)\)normalized in relation to the relative attractiveness of all accessible facilities, \(\sigma _i\)which means,

$$\begin{aligned} \bar{p}(\text {pixel}_i \rightarrow \text {hf}_j) = \frac{ p(\text {pixel}_i\rightarrow \text {hf}_j)} {\sum _{k \in \sigma _i} p(\text {pixel}_i\rightarrow \text {hf}_k)}. \end{aligned}$$

This relative attractiveness was modeled as the combination of an inherent decrease proportional to the square of the travel time to that healthcare institution, \(tt(\text {pixel}_i \right arrow \text {hf}_j)\)and a catchment area weighting, \(W_j\)intended to capture the variation in the services available at each facility that would likely attract an individual seeking care, i.e.:

$$\begin{aligned} p(\text {pixel}_i \rightarrow \text {hf}_j) = \frac{W_j}{tt(\text {pixel}_i \rightarrow \text {hf}_j)^2 }. \end{aligned}$$

The watershed weights were treated as a model parameter to be derived together with the clinical incidence during fitting, and a Bayesian prior of form \(\text {log }W_j \sim \text {Normal}(0, 0.25^2)\) was used to regularize their variation. Combining this attractiveness model with the treatment seeking and population areas – referred to here as \(\text {TS}_i\) And \(\text {population}_i\)respectively – allowed the catchment population of the *J*-the healthcare facility to be defined as

$$\begin{aligned} \text {CP}_{j} = \sum \limits _{i=1}^{N_{\text {pixel}}} \text {TS}_i \times \text {population }_i \times \bar{p}(\text {pixel}_i \rightarrow \text {hf}_j). \end{aligned}$$

In the *i*-the pixel limited resident access to facilities to those within a 3-hour (180-minute) travel time. It was assumed that individuals would generally not bypass numerous nearby health facilities to visit a more distant health center. This assumption was reflected in the model, especially in areas where healthcare facilities are sparsely distributed, where the accessible set, \(\sigma _i\), included only the five closest facilities. In contrast, in areas where several health facilities are within a 20-minute travel distance, \(\sigma _i\) consisted of these facilities plus the next four nearest. The watersheds were modeled as static and did not vary on a monthly or annual basis, mainly due to a lack of information on the dynamic activities of health facilities and the number of treatment requests.

### Incidence model

A Bayesian geostatistical framework was used to display the underlying incidence rate at each pixel, *i*for every year, *j*where the above river basin model connects this latent risk surface, \(I_{i,y}\)to the available case data.

For each year, the matrix of covariates, \(\textbf{X}_y\), was formed by merging the set of static covariates with the time-varying covariates appropriate to that year. The linear predictor for the *i*-the pixel in year *j* was then formed by multiplying the *i*-inserting \(\textrm{X}_y\) against an annual slope vector, \(\beta_y\). A fixed annual interception period, \(c_y\) and an annual spatial compensation period, \(f_{\textrm{offset},i,y}\)filled in the formula for \(I_{i,y}\) as

$$\begin{aligned} \text {log }I_{i,y} = c_y + (\textbf{X}_y)^\prime _i\beta _y + f_{\textrm{offset},i,y}. \end{aligned}$$

The annual spatial offset term was taken from a spatial Gaussian random field axis

$$\begin{aligned} f_{\textrm{offset},\cdot ,y} \sim \textrm{GP}(\text {range}_{\text {offset}}, \text {scale}_{\ text {offset}}) \end{aligned}$$

where the hyperparameters are assigned hyperpriors, \(\textrm{range}_{\text {offset}} \sim \textrm{Normal}(1,0,5^2)\) And \(\textrm{scale}_{\text {offset}} \sim \textrm{Normal}(-1,0,5^2)\). Through the previous choice, a modest shrinkage penalty was placed on the annual slope vector, \(\beta _y \sim \textrm{Normal}(0,1^2)\).

The expected cases at the *J*-the facility in years *j*, were calculated by summing over all pixels the product of the latent incident surface with the watershed model; namely,

$$\begin{aligned} \mathrm {expected\cases}_{j,y} = \sum _i I_{i,y} \times \text {TS}_i \times \text {population}_i \times \bar {p}(\text {pixel}_i \right arrow \text {hf}_j). \end{aligned}$$

To allow for over-dispersion relative to the nominal Poisson sampling distribution for case data, the annual case totals observed at each facility were taken from a negative binomial distribution, parameterized as

$$\begin{aligned} \text {cases}_{i,y} \sim \text {NegBinom}\left( \text {mean} = \mathrm {expected\ cases}_{i,y}, \text {variance} = \text {mean}\times (1+\sigma ) \right) \end{aligned}$$

of \(\sigma\) assigned a hyperprior, \(\log \sigma \sim \text {Normal}(-1, 1^2)\).

Model fitting was performed in the Template Model Builder (TMB) and Integrated Nested Laplace Approximation (INLA) packages (32, 33) for R using a Laplace approximation over the random field components and the (logarithm of) the catchment attractiveness weights, with a multivariate Normal approximation of the remaining hyperparameters, centered on the empirical Bayes estimator. The model uncertainty was derived from 300 samples drawn from a posterior Laplace approximation and quantified using the standard deviation of the posterior distribution, as well as the probability of threshold crossing and non-crossing.

Data for the years 2019 to 2021 provided information on the distribution of species *P. falciparum* And *P. vivax* cases for all ages. The methodology described above was applied independently to each species, generating incidence maps tailored to the individual species and averaged per year. To investigate overlapping risk profiles, these species-specific incidence maps were overlaid, allowing a comprehensive assessment of risk to both individual species and their collective impact. The model validation results, tables of goodness-of-fit measures, and diagnostic figures are included in the supplementary material. These resources were intended to help readers evaluate the effectiveness of the geospatial model.