The background of bayesnec is covered in the Single model usage vignette. Here we explain multi-model usage with bayesnec. In bayesnec it is possible to fit a custom model set, a specific model set, or all of the available models. When multiple models are specified, the bnec function returns a model-weighted estimate of predicted posterior values, based on "pseudobma" weights obtained using the Bayesian bootstrap through loo_model_weights (Vehtari et al. 2020; Vehtari, Gelman, and Gabry 2017). These are reasonably analogous to the way model weights are generated using AIC or AICc (Burnham and Anderson 2002). It is also possible to obtain all of the individual model fits from the fitted bayesmanecfit model object, if required, using the pull_out function, and to update an existing model fit with additional models, or to drop models, using the function amend.
Multi-model inference can be useful where there is a range of plausible models that could be used (Burnham and Anderson 2002) and has recently been adopted in ecotoxicology for Species Sensitivity Distribution (SSD) model inference (Thorley and Schwarz 2018). The approach may have considerable value in concentration-response modelling because there is often no a priori knowledge of the functional form that the response relationship should take. In this case model averaging can be a useful way of allowing the data to drive the model selection process, with weights proportional to how well the individual models fit the data. Well-fitting models will have high weights, dominating the model-averaged outcome. Conversely, poorly fitting models will have very low model weights and will therefore have little influence on the outcome. Where multiple models fit the data equally well, these can equally influence the outcome, and the resultant posterior predictions reflect that model uncertainty. It is possible to specify the "stacking" method (Yao et al. 2018) for model weights if desired (through the argument loo_controls), which aims to minimise prediction error. We do not currently recommend using stacking weights given the typical sample sizes associated with most concentration-response experiments, and because the main motivation for model averaging within the bayesnec package is to properly capture model uncertainty rather than to reduce prediction error.
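If stacking weights are nonetheless required, they can be requested when calling bnec. The sketch below assumes the simulated data df_ created later in this vignette, and the nested list structure currently accepted by the loo_controls argument; check ?bnec for the exact specification in your installed version.

# Hypothetical sketch: request "stacking" model weights via loo_controls
# (list structure assumed; see ?bnec for your installed bayesnec version)
fit_stack <- bnec(y ~ crf(x, model = "decline"), data = df_,
                  loo_controls = list(weights = list(method = "stacking")))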
Fitting multiple models with the bnec function

So far we have explored how to fit individual models via the function bnec. The bayesnec package also has the capacity to fit a custom selection of models, pre-specified sets of models, or even all of the available models in the package. Note that because these are Bayesian methods requiring multiple Hamiltonian Monte Carlo (HMC) chains, using bnec can be very slow when specifying models = "all" within a bayesnecformula. See details under ?bnec for more information on the models and model sets that can be specified, as well as the Model details vignette, which contains considerable information on the available models in bnec and their appropriate usage. In general it is safe to call models = "all", because by default bnec will discard invalid models, and the model averaging approach should result in an overall fit that reflects the models that best fit the data. However, because the HMC can be slow for models = "all", we do recommend testing your fit using a single (likely) model in the first instance, to make sure there are no issues with dispersion, that an appropriate distribution has been selected, and that model fitting appears robust (see the Single model usage vignette for more details).
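As a rough illustration of that workflow (a hypothetical sketch only; your_data is a placeholder data frame with a concentration column x and a response column y):

# Placeholder data: replace "your_data" with your own data.frame of x and y
fit_one <- bnec(y ~ crf(x, model = "nec4param"), data = your_data)  # quick initial check
fit_all <- bnec(y ~ crf(x, model = "all"), data = your_data)        # full model set (slow)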
To run this vignette, we will need the bayesnec and ggplot2 packages:
library(bayesnec)
library(ggplot2)

# function which generates data following an "ecx4param"
# (four-parameter logistic) model
make_ecx_data <- function(top, bot, ec50, beta, x) {
  top + (bot - top) / (1 + exp((ec50 - x) * exp(beta)))
}

x <- seq(0, 10, length = 12)
y <- make_ecx_data(x = x, bot = 0, top = 1, beta = 0.5, ec50 = 5)
set.seed(333)
df_ <- data.frame(x = rep(x, 15), y = rnorm(15 * length(x), y, 0.2))

exp_5 <- bnec(y ~ crf(x, model = "decline"), data = df_, iter = 2e3,
              open_progress = FALSE)
Here we run bnec with model = "decline", using a simulated data example for a Gaussian-distributed response variable. We use the "decline" set here because we are not going to consider hormesis models (these allow an initial increase in the response), largely to save time in fitting this example. Moreover, you might want to consider saving the output as an .RData file; doing so can be a useful way of fitting large model sets (i.e. model = "all" or model = "decline") at a convenient time (fitting can be very slow and may be run overnight, for example), so that you can reload them later to explore, plot, extract values, and amend the model set as required.
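For example (the file name here is illustrative only):

# Save the fitted model set so it can be reloaded in a later session
save(exp_5, file = "exp_5.RData")
# ...later, in a fresh session...
load("exp_5.RData")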
Whenever the model argument of crf is a model set, or a concatenation of specific models, bnec will return an object of class bayesmanecfit.
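For instance, a custom pair of models could be requested as follows (a minimal sketch; the object name exp_custom is arbitrary):

# A custom concatenation of specific models also returns a bayesmanecfit
exp_custom <- bnec(y ~ crf(x, model = c("nec4param", "ecx4param")),
                   data = df_, iter = 2e3, open_progress = FALSE)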
The bayesmanecfit model

We have created some plotting method functions for both the bayesnecfit and bayesmanecfit model fits, so we can plot a bayesmanecfit model object simply with autoplot.
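For the fit created above, that is simply:

autoplot(exp_5)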
The default plot looks exactly the same as our regular bayesnecfit plot, but the output is based on a weighted average of all the model fits in the model = "decline" model set that were able to fit successfully using the default bnec settings. Because the model set contains a mix of NEC and ECx models, the no-effect toxicity estimate reported by default on this plot (and in the summary output below) is reported as a N(S)EC (see Fisher et al. (2023)).
The fitted bayesmanecfit object contains different elements to the bayesnecfit. In particular, mod_stats contains the table of model fit statistics for all the fitted models. This includes the model name, the WAIC (as returned from brms), wi (the model weight, currently defaulting to "pseudobma" with Bayesian bootstrap from loo), and the dispersion estimates (only reported if the response is modelled with a Poisson or Binomial distribution, otherwise NA).
exp_5$mod_stats
#> model waic wi dispersion_Estimate dispersion_Q2.5 dispersion_Q97.5
#> nec4param nec4param -84.71235 0.3266459430 NA NA NA
#> neclin neclin -52.72181 0.0004911001 NA NA NA
#> ecxlin ecxlin -27.81568 0.0006015808 NA NA NA
#> ecx4param ecx4param -81.67824 0.0972614372 NA NA NA
#> ecxwb1 ecxwb1 -75.70936 0.0183140713 NA NA NA
#> ecxwb2 ecxwb2 -84.77620 0.2454512947 NA NA NA
#> ecxll5 ecxll5 -84.61205 0.2204366725 NA NA NA
#> ecxll4 ecxll4 -81.55602 0.0907979003 NA NA NA
We can obtain a neater summary of the model fit by using the summary method for a bayesmanecfit object. A list of fitted models and their model weights is provided. In addition, the model-averaged no-effect estimate is reported, in this case as an N(S)EC. For this example the nec4param and ecxwb2 models have the highest weights. All of these model fits are satisfactory despite the relatively low number of iterations set in our example, but the summary would also include a warning if there were fits with divergent transitions.
summary(exp_5)
#> Object of class bayesmanecfit
#>
#> Family: gaussian
#> Links: mu = identity; sigma = identity
#>
#> Number of posterior draws per model: 1600
#>
#> Model weights (Method: pseudobma_bb_weights):
#> waic wi
#> nec4param -84.71 0.33
#> neclin -52.72 0.00
#> ecxlin -27.82 0.00
#> ecx4param -81.68 0.10
#> ecxwb1 -75.71 0.02
#> ecxwb2 -84.78 0.25
#> ecxll5 -84.61 0.22
#> ecxll4 -81.56 0.09
#>
#>
#> Summary of weighted N(S)EC posterior estimates:
#> NB: Model set contains a combination of ECx and NEC
#> models, and is therefore a model averaged
#> combination of NEC and NSEC estimates.
#> Estimate Q2.5 Q97.5
#> N(S)EC 3.37 2.11 3.98
#>
#>
#> Bayesian R2 estimates:
#> Estimate Est.Error Q2.5 Q97.5
#> nec4param 0.86 0.01 0.84 0.87
#> neclin 0.83 0.01 0.80 0.84
#> ecxlin 0.80 0.01 0.78 0.82
#> ecx4param 0.85 0.01 0.83 0.87
#> ecxwb1 0.85 0.01 0.83 0.86
#> ecxwb2 0.86 0.01 0.84 0.87
#> ecxll5 0.86 0.01 0.84 0.87
#> ecxll4 0.85 0.01 0.83 0.87
The bayesmanecfit object also contains all of the individual brms model fits, which can be extracted using the pull_out function. For example, we can pull out and plot the model ecx4param.
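A minimal sketch (the object name exp_5_ecx4param is arbitrary):

# Extract a single model fit from the multi-model object and plot it
exp_5_ecx4param <- pull_out(exp_5, model = "ecx4param")
autoplot(exp_5_ecx4param)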
This extracts the ecx4param model from the bayesmanecfit and creates a new object of class bayesnecfit, which contains just a single fit. It is identical to fitting the ecx4param as a single model using bnec. All of the models in the bayesmanecfit can be simultaneously plotted using the argument all_models = TRUE.
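For example:

# Plot every model in the set as a separate panel
autoplot(exp_5, all_models = TRUE)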
You can see that some of these models represent very bad fits, and accordingly have extremely low model weights, such as the ecxlin and neclin models in this example. There is no harm in leaving poor models with low weight in the set, precisely because their low weight means they will have little influence on posterior predictions. However, it is important to assess the adequacy of all of the model fits, because an apparently poor fit may have more to do with a model that has failed to converge than with the model form itself.
We can assess the chains for one of the higher weighted models to make sure they show good convergence. It is good practice to do this for all models with a high weight, as these are the models most influencing the multi-model inference.
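As a sketch, the standard brms chain plots can be produced for a single high-weight model. This assumes the helper pull_brmsfit() is available in your bayesnec version for retrieving the underlying brmsfit object:

# Trace and density plots for the underlying brmsfit of a high-weight model
# (pull_brmsfit() assumed available; see the bayesnec reference index)
plot(pull_brmsfit(exp_5, model = "nec4param"))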
Assessing chains for all the models in a bayesmanecfit does not work as well using the default brms plotting method. Instead, use check_chains and make sure to pass a filename argument, which means plots are automatically saved to pdf with a message.
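For example (the file name here is illustrative only):

# Save chain plots for every model in the set to a pdf in the working directory
check_chains(exp_5, filename = "exp_5_chains")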
We can also make a plot to compare the posterior probability density to that of the prior using the check_priors function, either for an individual model fit or saving plots for all fits to a file in the working directory.
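A sketch of both usages (file name illustrative; the single-model call assumes check_priors also accepts a bayesnecfit):

# Prior versus posterior density for a single extracted fit
check_priors(pull_out(exp_5, model = "nec4param"))

# Save prior-versus-posterior plots for all models in the set to a pdf
check_priors(exp_5, filename = "exp_5_priors")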
Where a large number of models are failing to converge, it would obviously be better to adjust iter and warmup in the bnec call, as well as some of the other arguments to brms, such as adapt_delta. See the ?brm documentation for more details. It is possible to investigate whether models from a bayesmanecfit object achieved convergence according to Rhat:
rhat(exp_5) |>
  sapply("[[", "failed")
#> nec4param neclin ecxlin ecx4param ecxwb1 ecxwb2 ecxll5 ecxll4
#> FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
Here we get a message because none of our models failed the default Rhat criterion. A more conservative cut-off of 1.01 can also be used by changing the default argument to the desired value. In that case one model fails, although we note that this is a very stringent criterion, and we have also used fewer iterations than the default bayesnec value of iter (10,000).
The models prefixed with ecx are all models that do not have the NEC as a parameter in the model. That is, they are smooth curves as a function of concentration and have no break point (no true No-Effect-Concentration, NEC). Thus the reported no-effect toxicity estimate is the NSEC (Fisher and Fox 2023) (see the Model details vignette for more details). If a true model-averaged estimate of the NEC based only on threshold models is desired, this can be obtained with model = "nec". We can use the helper functions pull_out and amend to alter the model set as required. pull_out has a model argument and can be used to pull a single model out (as above) or to pull out a specific set of models.
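The amend function works in a similar way; a sketch, assuming the drop (and add) arguments as described in ?amend:

# Drop a poorly fitting model from the existing set (argument name assumed);
# models can likewise be added via the add argument
exp_5_trimmed <- amend(exp_5, drop = "ecxlin")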
We can use this to obtain, first, a set of NEC-only models from the existing set, as sketched below.
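A minimal sketch (the object name exp_5_nec is reused in the plotting code further below):

# Extract only the NEC (threshold) models as a new bayesmanecfit
exp_5_nec <- pull_out(exp_5, model = "nec")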
In this case, because we have already fitted the "decline" models, we can ignore the message regarding the missing NEC models; these are all models that are either not appropriate for a Gaussian response with an identity link function, or that allow hormesis, which we did not consider in this example.
Now we have two model sets: an NEC set, and a mixed NEC and ECx set. Of course, before we use either model set for any inference, we would need to check the chain mixing and acf plot for each of the input models. For the full set, the model with the highest weight is nec4param.
Now we can use the ecx function to get EC10 and EC50 values. We can do this using our full model set, because it is valid to use NEC models for estimating ECx (see more information in the Model details vignette).
ECx10 <- ecx(exp_5, ecx_val = 10)
ECx50 <- ecx(exp_5, ecx_val = 50)
ECx10
#> Q50 Q2.5 Q97.5
#> 3.641915 2.699609 4.177541
#> attr(,"resolution")
#> [1] 1000
#> attr(,"ecx_val")
#> [1] 10
#> attr(,"toxicity_estimate")
#> [1] "ecx"
ECx50
#> Q50 Q2.5 Q97.5
#> 5.100009 4.842115 5.357903
#> attr(,"resolution")
#> [1] 1000
#> attr(,"ecx_val")
#> [1] 50
#> attr(,"toxicity_estimate")
#> [1] "ecx"
The weighted NEC estimates can be extracted directly from the NEC model set object, as they are an explicit parameter in these models.
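A sketch of that extraction, assuming the nec() extraction function (the object NECvals is used in the plotting code below):

# Model-averaged NEC from the NEC-only model set (nec() assumed available)
NECvals <- nec(exp_5_nec)
NECvals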
Note that the new NEC estimates from the NEC-containing model fits are slightly higher than those reported in the summary output of all the fitted models. This can happen for smooth curves, which is what was used as the underlying data-generation model in the simulations here, and is explored in more detail in the Compare posteriors vignette.
Now we can make a combined plot of our output, showing the model-averaged fit from the NEC model set and the averaged fit from the full model set, along with the relevant toxicity thresholds.
# Weighted predictions from the NEC-only model set
preds <- exp_5_nec$w_pred_vals$data

autoplot(exp_5, nec = FALSE, all = FALSE) +
  geom_vline(mapping = aes(xintercept = ECx10, colour = "ECx 10"),
             linetype = c(1, 3, 3), key_glyph = "path") +
  geom_vline(mapping = aes(xintercept = ECx50, colour = "ECx 50"),
             linetype = c(1, 3, 3), key_glyph = "path") +
  geom_vline(mapping = aes(xintercept = NECvals, colour = "NEC"),
             linetype = c(1, 3, 3), key_glyph = "path") +
  scale_color_manual(values = c("ECx 10" = "orange", "ECx 50" = "blue",
                                "NEC" = "darkgrey"), name = "") +
  # overlay the NEC-set averaged curve and its credible interval
  geom_line(data = preds, mapping = aes(x = x, y = Estimate),
            colour = "tomato", linetype = 2) +
  geom_line(data = preds, mapping = aes(x = x, y = Q2.5),
            colour = "tomato", linetype = 2) +
  geom_line(data = preds, mapping = aes(x = x, y = Q97.5),
            colour = "tomato", linetype = 2) +
  theme(legend.position = c(0.8, 0.8),
        axis.title = element_text(size = 16),
        axis.text = element_text(size = 12),
        strip.text.x = element_text(size = 16))