Parametric bootstrap testing to compare two models by analysis of variance (deviance)
anovaPB.Rd
Compute analysis of variance (or deviance) tables for two fitted model objects, with the P-value estimated using a parametric bootstrap -- by repeatedly simulating new responses from the fitted model under the null hypothesis.
Arguments
- objectNull
is the fitted model under the null hypothesis. This can be any object that responds to
simulate
,update
andanova
.- object
is the fitted model under the alternative hypothesis. This can be any object that responds to
update
,anova
andmodel.frame
. It must be larger thanobjectNull
and its model frame should contain all the terms required to fitobjectNull
.- n.sim
the number of simulated datasets to generate for parametric bootstrap testing. Defaults to 999.
- colRef
the column number where the test statistic of interest can be found in the standard
anova
output when callinganova(objectNull,object)
. Default choices have been set for common models (colRef=5
forlm
objects,colRef=6
forlmer
objects andcolRef=4
otherwise, which is correct forglm
andgam
objects).- rowRef
the row number where the test statistic of interest can be found in the standard
anova
output when callinganova(objectNull,object)
. Defaults to2
, because it is on the second row for most common models.- ...
further arguments sent through to
anova
.
Value
A matrix which has the appearance and behaviour of an object returned by
anova(objectNull,object)
, except that the P-value has been computed
by parametric bootstrap, and the matrix now inherits class anovaPB
.
Details
The anova
function typically relies on classical asymptotic results
which sometimes don't apply well, e.g. when using penalised likelihood to fit a
generalised additive model, or when testing whether a random effect should be
included in a mixed model. In such cases we can get a more accurate test by
using simulation to estimate the P-value -- repeatedly simulating
new sets of simulated responses, refitting the null and alternative models, and
recomputing the test statistic. This process allows us to estimate the
distribution of the test statistic when the null hypothesis is true, hence
draw a conclusion about whether the observed test statistic is large relative
to what would be expected under the null hypothesis. The process is often
referred to as a parametric bootstrap (Davison & Hinkley 1997), which
the PB in the function name (anovaPB
) is referring to.
This function will take any pair of fitted models, a null (objectNull
)
and an alternative (object
), and use simulation to estimate the
P-value of the test of whether there is evidence against the model in
objectNull
in favour of the model in object
. This function works
by repeatedly performing an anova
to compare objectNull
to
object
, where the responses in the model.frame
have been replaced
with values simulated by applying simulate
to objectNull
. Hence
for this function to work, the objects being compared need to respond
to the anova
, simulate
and model.frame
functions.
This function needs to be able to find the test statistic in the anova
output, but it is stored in different places for different types of objects.
It is assumed that anova(objectNull,object)
returns a matrix and that the
test statistic is stored in the (rowRef,colRef)
th element, where both
rowRef
and colRef
have been chosen to be correct for common
model types (glm
, gam
, lmer
).
References
Davison A.C. & Hinkley D. V. (1997) Bootstrap methods and their application, Cambridge University Press. Warton DI (2022) Eco-Stats - Data Analysis in Ecology, from t-tests to multivariate abundances. Springer, ISBN 978-3-030-88442-0
Examples
# fit a Poisson regression to random data:
y = rpois(50,lambda=1)
x = 1:50
rpois_glm = glm(y~x,family=poisson())
rpois_int = glm(y~1,family=poisson())
anovaPB(rpois_int,rpois_glm,n.sim=99)
#> Analysis of Deviance Table
#>
#> Model 1: y ~ 1
#> Model 2: y ~ x
#> Resid. Df Resid. Dev Df Deviance
#> 1 49 57.182
#> 2 48 56.980 1 0.20181 0.63