Fitting Linear Models for Multivariate Abundance Data
manylm is used to fit multivariate linear models to high-dimensional data, such as multivariate abundance data in ecology.
This is the base model-fitting function; see plot.manylm for assumption checking, and anova.manylm or summary.manylm for significance testing.
Usage
manylm(
formula, data=NULL, subset=NULL, weights=NULL,
na.action=options("na.action"), method="qr", model=FALSE,
x=TRUE, y=TRUE, qr=TRUE, singular.ok=TRUE, contrasts=NULL,
offset, test="LR", cor.type="I", shrink.param=NULL,
tol=1.0e-5, ...)
Arguments
- formula
an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under Details.
- data
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which manylm is called.
- subset
an optional vector specifying a subset of observations to be used in the fitting process.
- weights
an optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, weighted least squares is used with weights weights (that is, minimizing sum(weights*e^2)); otherwise ordinary least squares is used.
- na.action
a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The ‘factory-fresh’ default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful.
- method
the method to be used; for fitting, currently only method = "qr" is supported; method = "model.frame" returns the model frame (the same as with model = TRUE, see below).
- model, x, y, qr
logicals. If TRUE the corresponding components of the fit (the model frame, the model matrix, the response, the QR decomposition) are returned.
- singular.ok
logical. If FALSE (the default in S but not in R) a singular fit is an error.
- contrasts
an optional list. See the contrasts.arg of model.matrix.default.
- offset
this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length either one or equal to the number of cases. One or more offset terms can be included in the formula instead or as well, and if both are specified their sum is used. See model.offset.
- test
choice of test statistic. Can be one of:
"LR" (default) = likelihood ratio statistic
"F" = Lawley-Hotelling trace statistic
NULL = no test
This parameter is merely stored in manylm, and will be used as the default value of test in subsequent functions for inference.
- cor.type
structure imposed on the estimated correlation matrix under the fitted model. Can be "I" (default), "shrink", or "R". See anova.manylm for details of its usage. This parameter will be used as the default value of cor.type in subsequent functions for inference.
- shrink.param
shrinkage parameter to be used if cor.type="shrink". This parameter will be used as the default value of shrink.param in subsequent functions for inference.
- tol
the tolerance used in estimations.
- ...
additional arguments to be passed to the low level regression fitting functions (see below).
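As a minimal sketch of how test and cor.type are stored on the fit, the code below sets both at fitting time and then reads them back from the returned object. It assumes the mvabund package (which provides manylm and the spider data) is installed; the two-predictor formula is chosen only for illustration.

```r
library(mvabund)

data(spider)
spiddat <- log(spider$abund + 1)

# Set the test statistic and correlation structure at fitting time.
# See anova.manylm for what each cor.type value means.
fit <- manylm(spiddat ~ soil.dry + bare.sand, data = spider$x,
              test = "F", cor.type = "R")

# Both choices are stored on the fitted object for later inference functions.
fit$test
fit$cor.type
```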
Details
Models for manylm are specified symbolically. For details on this compare the details section of lm and formula. If the formula includes an offset, this is evaluated and subtracted from the response.
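To illustrate what "subtracted from the response" means, here is a small sketch using base R's lm as a stand-in, on the assumption that manylm handles formula offsets the same way, as the text describes. The simulated data and the names x, o and Y are hypothetical.

```r
set.seed(1)
n <- 20
x <- rnorm(n)
o <- runif(n)  # hypothetical known offset, one value per observation

# Two simulated responses that share the same offset
Y <- cbind(sp1 = 1 + 2 * x + o + rnorm(n),
           sp2 = -1 + 0.5 * x + o + rnorm(n))

# Fit with an offset term in the formula ...
fit_offset <- lm(Y ~ x + offset(o))

# ... and fit after subtracting the offset from the response by hand
Y_shifted <- Y - o
fit_shifted <- lm(Y_shifted ~ x)

# The coefficient estimates agree
all.equal(coef(fit_offset), coef(fit_shifted))
```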
See model.matrix for some further details. The terms in the formula will be re-ordered so that main effects come first, followed by the interactions, all second-order, all third-order and so on: to avoid this pass a terms object as the formula (see aov and demo(glm.vr) for an example).
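The re-ordering can be seen directly with base R's terms function, without fitting anything. In this sketch the formula and the variable names y, a and b are hypothetical; keep.order = TRUE is the terms.formula argument that preserves the order as written.

```r
f <- y ~ a:b + a + b

# Default: main effects are moved ahead of the interaction
reordered <- attr(terms(f), "term.labels")
reordered

# keep.order = TRUE keeps the terms in the order they were written
kept <- attr(terms(f, keep.order = TRUE), "term.labels")
kept
```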
A formula has an implied intercept term. To remove this use either y ~ x - 1 or y ~ 0 + x. See formula for more details of allowed formulae.
manylm calls the lower level function manylm.fit or manylm.wfit for the actual numerical computations. For programming only, you may consider doing likewise.
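The two ways of removing the intercept give the same design matrix. The sketch below checks this with base R's model.matrix on a small hypothetical data frame d.

```r
d <- data.frame(x = c(1.5, 2.0, 3.5))

# Implied intercept: the design matrix gains an "(Intercept)" column
with_intercept <- colnames(model.matrix(~ x, d))

# Both spellings drop the intercept column
minus_one <- model.matrix(~ x - 1, d)
zero_plus <- model.matrix(~ 0 + x, d)

with_intercept
colnames(minus_one)
all.equal(minus_one, zero_plus, check.attributes = FALSE)
```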
All of weights, subset and offset are evaluated in the same way as variables in formula, that is first in data and then in the environment of formula.
For details on arguments related to hypothesis testing (such as cor.type and resample) see summary.manylm or anova.manylm.
Value
manylm returns an object of class c("manylm", "mlm", "lm") for a multivariate formula response and of class c("lm") for a univariate response.
A manylm object is a list containing at least the following components:
- coefficients
a named matrix of coefficients
- residuals
the residuals matrix, that is response minus fitted values.
- fitted.values
the matrix of the fitted mean values.
- rank
the numeric rank of the fitted linear model.
- weights
(only for weighted fits) the specified weights.
- df.residual
the residual degrees of freedom.
- hat.X
the hat matrix.
- txX
the matrix (t(x)%*%x).
- test
the test argument supplied.
- cor.type
the cor.type argument supplied.
- resample
the resample argument supplied.
- nBoot
the nBoot argument supplied.
- call
the matched call.
- terms
the terms object used.
- xlevels
(only where relevant) a record of the levels of the factors used in fitting.
- model
if requested (not the default, since model=FALSE in the usage above), the model frame used.
- offset
the offset used (missing if none were used).
- y
if requested, the response matrix used.
- x
if requested, the model matrix used.
In addition, non-null fits will have components assign and (unless not requested) qr relating to the linear fit, for use by extractor functions such as summary.manylm.
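As a rough sketch of how these components can be inspected, the code below fits the spider model and looks at a few of the listed elements. It assumes mvabund is installed and relies on the default y=TRUE so that the response matrix is stored on the fit.

```r
library(mvabund)

data(spider)
spiddat <- log(spider$abund + 1)
fit <- manylm(spiddat ~ ., data = spider$x)

# One row per model term, one column per response (species)
dim(fit$coefficients)
fit$rank
fit$df.residual

# Residuals are the response minus the fitted values
resid_check <- all.equal(
  unname(as.matrix(fit$residuals)),
  unname(as.matrix(fit$y) - as.matrix(fit$fitted.values))
)
resid_check
```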
Examples
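# Load the spider abundance data (shipped with mvabund)
data(spider)
# log(y+1)-transform the abundances
spiddat <- log(spider$abund+1)
# Regress every species on all variables in spider$x
lm.spider <- manylm(spiddat~.,data=spider$x)
lm.spider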
#>
#> Call:
#> manylm(formula = spiddat ~ ., data = spider$x)
#>
#> Coefficients:
#> Alopacce Alopcune Alopfabr Arctlute Arctperi Auloalbi
#> (Intercept) 1.231267 -1.380086 2.393197 -1.141160 2.027234 -1.103879
#> soil.dry -0.794355 0.749110 -0.879432 0.730493 -0.594573 0.571479
#> bare.sand -0.134612 -0.169884 0.255664 0.159457 0.089462 0.042262
#> fallen.leaves 0.099633 -0.051408 0.043434 -0.236105 0.047585 -0.228302
#> moss 0.098816 -0.211830 -0.018714 -0.042750 -0.150472 -0.231740
#> herb.layer 0.214865 0.147603 0.039480 0.050233 -0.166885 0.467675
#> reflection 0.459053 0.388217 0.046706 -0.096538 0.217172 -0.045856
#> Pardlugu Pardmont Pardnigr Pardpull Trocterr Zoraspin
#> (Intercept) 3.734407 -3.407564 -0.469123 -1.859186 1.441566 0.077075
#> soil.dry -0.761781 1.256562 1.299666 1.562122 0.891209 1.034457
#> bare.sand -0.229075 0.009099 0.127535 -0.110245 -0.155567 0.283460
#> fallen.leaves 0.232496 -0.308228 -0.700012 -0.691232 -0.241817 -0.479958
#> moss -0.200612 0.589305 -0.393534 -0.197432 -0.385430 -0.134735
#> herb.layer 0.050538 0.290093 0.281785 0.352392 0.291045 0.263206
#> reflection -0.319179 0.153976 -0.239496 -0.026756 -0.199446 -0.656028
#>
#Then use the plot function for diagnostic plots, and use anova or summary to
#evaluate significance of different model terms.
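A possible continuation of the example, following the comment above, might look like the sketch below. It assumes mvabund is installed; nBoot = 99 is an arbitrary small number of resamples chosen here to keep the run short, not a recommended setting.

```r
library(mvabund)

data(spider)
spiddat <- log(spider$abund + 1)
lm.spider <- manylm(spiddat ~ ., data = spider$x)

# Diagnostic plots for assumption checking
plot(lm.spider)

# Resampling-based significance tests (small nBoot only for speed)
anova_res <- anova(lm.spider, nBoot = 99)
summary_res <- summary(lm.spider, nBoot = 99)

anova_res
summary_res
```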