Title: | Double Constrained Correspondence Analysis for Trait-Environment Analysis in Ecology |
---|---|
Description: | Double constrained correspondence analysis (dc-CA) analyzes (multi-)trait (multi-)environment ecological data by using the 'vegan' package and native R code. Throughout the two step algorithm of ter Braak et al. (2018) is used. This algorithm combines and extends community- (sample-) and species-level analyses, i.e. the usual community weighted means (CWM)-based regression analysis and the species-level analysis of species-niche centroids (SNC)-based regression analysis. The two steps use canonical correspondence analysis to regress the abundance data on to the traits and (weighted) redundancy analysis to regress the CWM of the orthonormalized traits on to the environmental predictors. The function dc_CA() has an option to divide the abundance data of a site by the site total, giving equal site weights. This division has the advantage that the multivariate analysis corresponds with an unweighted (multi-trait) community-level analysis, instead of being weighted. The first step of the algorithm uses vegan::cca(). The second step uses wrda() but vegan::rda() if the site weights are equal. This version has a predict() function. For details see ter Braak et al. 2018 <doi:10.1007/s10651-017-0395-x>. |
Authors: | Cajo J.F ter Braak [aut] , Bart-Jan van Rossum [aut, cre] |
Maintainer: | Bart-Jan van Rossum <[email protected]> |
License: | GPL-3 |
Version: | 1.2.1 |
Built: | 2024-11-21 05:51:16 UTC |
Source: | https://github.com/biometris/douconca |
anova_sites
performs the community-level permutation test of dc-CA
when site weights vary.
The test uses residual predictor permutation (ter Braak 2022), which is
robust against differences in site total abundance in the response
in dc_CA
(ter Braak & te Beest, 2022).
The arguments of the function are similar to those of
anova.cca
, but more restricted. With equal site-totals
as in dc_CA
, anova(object$RDAonEnv)
is much faster.
anova_sites(object, permutations = 999, by = NULL)
anova_sites(object, permutations = 999, by = NULL)
object |
an object from |
permutations |
a list of control values for the permutations as
returned by the function |
by |
character |
The algorithm is analogous to that of anova.wrda
. The function
is used in anova.dcca
.
A list with two elements with names table
and eigenvalues
.
The table
is as from anova.cca
and
eigenvalues
gives the dc-CA eigenvalues.
ter Braak, C.J.F. & te Beest, D.E. 2022. Testing environmental effects on taxonomic composition with canonical correspondence analysis: alternative permutation tests are not equal. Environmental and Ecological Statistics. 29 (4), 849-868. doi:10.1007/s10651-022-00545-4
ter Braak, C.J.F. (2022) Predictor versus response permutation for significance testing in weighted regression and redundancy analysis. Journal of statistical computation and simulation, 92, 2041-2059. doi:10.1080/00949655.2021.2019256
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites mod <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = TRUE) anova(mod) a_species <- anova_species(mod) a_species # anova_species can be used for manual forward selection of # trait variables, as done for environmental variables in the demo # dune_FS_dcCA.r, based on the first eigenvalue and its significance # and adding the first axis of the constrained species scores from mod to # the Condition of a new mod. (eig1_and_pval <- c(eig = a_species$eig[1], pval = a_species$table$`Pr(>F)`[1])) a_species$eig anova_sites(mod)
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites mod <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = TRUE) anova(mod) a_species <- anova_species(mod) a_species # anova_species can be used for manual forward selection of # trait variables, as done for environmental variables in the demo # dune_FS_dcCA.r, based on the first eigenvalue and its significance # and adding the first axis of the constrained species scores from mod to # the Condition of a new mod. (eig1_and_pval <- c(eig = a_species$eig[1], pval = a_species$table$`Pr(>F)`[1])) a_species$eig anova_sites(mod)
anova_species
performs the species-level permutation test of dc-CA
which is part of anova.dcca
.
The test uses residual predictor permutation (ter Braak 2022), which is
robust against differences in species total abundance in the response
in dc_CA
(ter Braak & te Beest, 2022). The arguments of the
function are similar to those of anova.cca
, but more
restrictive.
anova_species(object, permutations = 999, by = NULL)
anova_species(object, permutations = 999, by = NULL)
object |
an object from |
permutations |
a list of control values for the permutations as
returned by the function |
by |
character |
In anova_species
, the first step extracts the species-niche
centroids (SNC) with respect to all dc-CA ordination axes from an existing
dc-CA analysis. The second step, applies a weighted redundancy analysis of
these SNCs with the traits as predictors. The second step is thus a
species-level analysis, but the final results (eigenvalues/ordination axes)
are identical to those of the analyses steps in dc_CA
.The
second step uses R-code that is analogous to that of
anova.wrda
.
A list with two elements with names table
and eigenvalues
.
The table
is as from anova.cca
and
eigenvalues
gives the dc-CA eigenvalues. This output can be used
for scripting forward selection of traits, similar to the forward selection
of environmental variables in the demo dune_select.r
.
ter Braak, C.J.F. & te Beest, D.E. 2022. Testing environmental effects on taxonomic composition with canonical correspondence analysis: alternative permutation tests are not equal. Environmental and Ecological Statistics. 29 (4), 849-868. doi:10.1007/s10651-022-00545-4
ter Braak, C.J.F. (2022) Predictor versus response permutation for significance testing in weighted regression and redundancy analysis. Journal of statistical computation and simulation, 92, 2041-2059. doi:10.1080/00949655.2021.2019256
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites mod <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = TRUE) anova(mod) a_species <- anova_species(mod) a_species # anova_species can be used for manual forward selection of # trait variables, as done for environmental variables in the demo # dune_FS_dcCA.r, based on the first eigenvalue and its significance # and adding the first axis of the constrained species scores from mod to # the Condition of a new mod. (eig1_and_pval <- c(eig = a_species$eig[1], pval = a_species$table$`Pr(>F)`[1])) a_species$eig anova_sites(mod)
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites mod <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = TRUE) anova(mod) a_species <- anova_species(mod) a_species # anova_species can be used for manual forward selection of # trait variables, as done for environmental variables in the demo # dune_FS_dcCA.r, based on the first eigenvalue and its significance # and adding the first axis of the constrained species scores from mod to # the Condition of a new mod. (eig1_and_pval <- c(eig = a_species$eig[1], pval = a_species$table$`Pr(>F)`[1])) a_species$eig anova_sites(mod)
anova.dcca
performs the community- and species-level permutation tests
of dc-CA and combines these with the 'max test', which takes the maximum of
the P-values. The function arguments are similar to (but more
restrictive than) those of anova.cca
.
## S3 method for class 'dcca' anova(object, ..., permutations = 999, by = c("omnibus", "axis"))
## S3 method for class 'dcca' anova(object, ..., permutations = 999, by = c("omnibus", "axis"))
object |
an object from |
... |
unused. |
permutations |
a list of control values for the permutations for
species and sites (species first, sites second, for traits and environment)
as returned by the function |
by |
character The interpretation of this inertia is, at the species-level, the
environmentally constrained inertia explained by the traits (without trait
covariates) and, at the community-level, the trait-constrained inertia
explained by the environmental predictors (without covariates). The
default ( |
In the general case of varying site abundance totals
(divideBySiteTotals = FALSE
) both the community-level test and the
species-level test use residualized predictor permutation (ter Braak 2022),
so as to ensure that anova.dcca
is robust against differences
in species and site total abundance in the response
(ter Braak & te
Beest, 2022). The community-level test uses anova_sites
. For
the species-level test, anova_species
is used.
With equal site weights, obtained with divide.by.site.total = TRUE
,
the community-level test is obtained by applying anova
to
object$RDAonEnv
using anova.cca
.
This performs residualized response permutation which performs about equal
to residualized predictor permutation in the equi-weight case.
The function anova.cca
is implemented in C and therefore
a factor of 20 or so quicker than the native R-code used in
anova_sites
.
A list of 3 of structures as from anova.cca
. The
elements are c("species", "sites", "max")
ter Braak, C.J.F. & te Beest, D.E. 2022. Testing environmental effects on taxonomic composition with canonical correspondence analysis: alternative permutation tests are not equal. Environmental and Ecological Statistics. 29 (4), 849-868. doi:10.1007/s10651-022-00545-4
ter Braak, C.J.F. (2022) Predictor versus response permutation for significance testing in weighted regression and redundancy analysis. Journal of statistical computation and simulation, 92, 2041-2059. doi:10.1080/00949655.2021.2019256
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites mod <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = TRUE) anova(mod) a_species <- anova_species(mod) a_species # anova_species can be used for manual forward selection of # trait variables, as done for environmental variables in the demo # dune_FS_dcCA.r, based on the first eigenvalue and its significance # and adding the first axis of the constrained species scores from mod to # the Condition of a new mod. (eig1_and_pval <- c(eig = a_species$eig[1], pval = a_species$table$`Pr(>F)`[1])) a_species$eig anova_sites(mod)
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites mod <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = TRUE) anova(mod) a_species <- anova_species(mod) a_species # anova_species can be used for manual forward selection of # trait variables, as done for environmental variables in the demo # dune_FS_dcCA.r, based on the first eigenvalue and its significance # and adding the first axis of the constrained species scores from mod to # the Condition of a new mod. (eig1_and_pval <- c(eig = a_species$eig[1], pval = a_species$table$`Pr(>F)`[1])) a_species$eig anova_sites(mod)
anova.wrda
performs residual predictor permutation for weighted
redundancy analysis (wRDA), which is robust against differences in the
weights (ter Braak, 2022). The arguments of the function are similar to
those of anova.cca
, but more restricted.
## S3 method for class 'wrda' anova(object, ..., permutations = 999, by = c("omnibus", "axis"))
## S3 method for class 'wrda' anova(object, ..., permutations = 999, by = c("omnibus", "axis"))
object |
an object from |
... |
unused. |
permutations |
a list of control values for the permutations as
returned by the function |
by |
character |
The algorithm is based on published R-code for residual predictor permutation in weighted redundancy analysis (ter Braak, 2022), but using QR-decomposition instead of ad-hoc least-squares functions.
A list with two elements with names table
and eigenvalues
.
The table
is as from anova.cca
and
eigenvalues
gives the wrda eigenvalues.
ter Braak, C.J.F. (2022) Predictor versus response permutation for significance testing in weighted regression and redundancy analysis. Journal of statistical computation and simulation, 92, 2041-2059. doi:10.1080/00949655.2021.2019256
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites response <- dune_trait_env$comm[, -1] # must delete "Sites" w <- rep(1, 20) w[1:10] <- 8 w[17:20] <- 0.5 object <- wrda(formula = ~ A1 + Moist + Mag + Use + Condition(Manure), response = response, data = dune_trait_env$envir, weights = w) object # Proportions equal to those Canoco 5.15 mod_scores <- scores(object, display = "all") scores(object, which_cor = c("A1", "X_lot"), display = "cor") anova(object)
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites response <- dune_trait_env$comm[, -1] # must delete "Sites" w <- rep(1, 20) w[1:10] <- 8 w[17:20] <- 0.5 object <- wrda(formula = ~ A1 + Moist + Mag + Use + Condition(Manure), response = response, data = dune_trait_env$envir, weights = w) object # Proportions equal to those Canoco 5.15 mod_scores <- scores(object, display = "all") scores(object, which_cor = c("A1", "X_lot"), display = "cor") anova(object)
Double constrained correspondence analysis (dc-CA) for analyzing
(multi-)trait (multi-)environment ecological data using library vegan
and native R code. It has a formula
interface which allows to assess,
for example, the importance of trait interactions in shaping ecological
communities. The function dc_CA
has an option to divide the abundance
data of a site by the site total, giving equal site weights. This division
has the advantage that the multivariate analysis corresponds with an
unweighted (multi-trait) community-level analysis, instead of being weighted
(Kleyer et al. 2012).
dc_CA( formulaEnv = NULL, formulaTraits = NULL, response = NULL, dataEnv = NULL, dataTraits = NULL, divideBySiteTotals = TRUE, dc_CA_object = NULL, verbose = TRUE )
dc_CA( formulaEnv = NULL, formulaTraits = NULL, response = NULL, dataEnv = NULL, dataTraits = NULL, divideBySiteTotals = TRUE, dc_CA_object = NULL, verbose = TRUE )
formulaEnv |
formula or one-sided formula for the rows (samples) with
row predictors in |
formulaTraits |
formula or one-sided formula for the columns (species)
with column predictors in |
response |
matrix, data frame of the abundance data
(dimension n x m) or list with community weighted means (CWMs)
from |
dataEnv |
matrix or data frame of the row predictors, with rows
corresponding to those in |
dataTraits |
matrix or data frame of the column predictors, with rows
corresponding to the columns in |
divideBySiteTotals |
logical; default |
dc_CA_object |
optional object from an earlier run of this function.
Useful if the same formula for the columns ( |
verbose |
logical for printing a simple summary (default: TRUE) |
Empty (all zero) rows and columns in response
are removed from the
response
and the corresponding rows from dataEnv
and
dataTraits
. Subsequently, any columns with missing values are
removed from dataEnv
and dataTraits
. It gives an error
('name_of_variable' not found), if variables with missing entries are
specified in formulaEnv
and formulaTraits
.
Computationally, dc-CA can be carried out by a single singular value
decomposition (ter Braak et al. 2018), but it is here computed in two steps.
In the first step, the transpose of the response
is regressed on to
the traits (the column predictors) using cca
with
formulaTraits
. The column scores of this analysis (in scaling 1) are
community weighted means (CWM) of the orthonormalized traits. These are then
regressed on the environmental (row) predictors using wrda
with formulaEnv
or using rda
, if site weights
are equal.
A dc-CA can be carried out on, what statisticians call, the sufficient
statistics of the method. This is useful, when the abundance data are not
available or could not be made public in a paper attempting reproducible
research. In this case, response
should be a list
with as first element community weighted means (CWMs) with respect to the
traits, and the trait data, and, optionally, further elements, for functions
related to dc_CA
. The minimum is a
list(CWM, weight = list(columns = species_weights))
with CWM a matrix
or data.frame, but then formulaEnv
, formulaTraits
,
dataEnv
, dataTraits
must be specified in the call to
dc_CA
. The function fCWM_SNC
and its example
show how to set the
response
for this and helps to create the response
from
abundance data in these non-standard applications of dc-CA. Species and site
weights, if not set in response$weights
can be set by a variable
weight
in the data frames dataTraits
and dataEnv
,
respectively, but formulas should then not be ~.
.
The statistics and scores in the example dune_dcCA.r
, have been
checked against the results in Canoco 5.15 (ter Braak & Šmilauer, 2018).
A list of class
dcca
; that is a list with elements
a cca.object
from the
cca
analysis of the transpose of the closed
response
using formula formulaTraits
.
the argument formulaTraits
. If the formula was
~.
, it was changed to explicit trait names.
a list of Y
, dataEnv
and dataTraits
,
after removing empty rows and columns in response
and after closure if
divideBySiteTotals = TRUE
and with the corresponding rows in
dataEnv
and dataTraits
removed.
a list of unit-sum weights of row and columns. The names of
the list are c("row", "columns")
, in that order.
number of sites (rows).
Community weighted means w.r.t. orthonormalized traits.
a wrda
object or
cca.object
from the
wrda
or, if with equal row weights,
rda
analysis, respectively of the column scores of the
cca
, which are the CWMs of orthonormalized traits, using formula
formulaEnv
.
the argument formulaEnv
. If the formula was
~.
, it was changed to explicit environmental variable names.
the dc-CA eigenvalues (same as those of the
rda
analysis).
mean, sd, VIF and (regression) coefficients of
the traits that define the dc-CA axes in terms of the
traits with t-ratios missing indicated by NA
s for 'tval1'.
a one-column matrix with four inertias (weighted variances):
total: the total inertia.
conditionT: the inertia explained by the condition in
formulaTraits
if present (neglecting row constraints).
traits_explain: the inertia explained by the traits (neglecting the
row predictors and any condition in formulaTraits
). This is the
maximum that the row predictors could explain in dc-CA (the sum of the
following two items is thus less than this value).
conditionE: the trait-constrained inertia explained by the condition
in formulaEnv
.
constraintsTE: the trait-constrained inertia explained by the predictors (without the row covariates).
If verbose
is TRUE
(or after out <- print(out)
is
invoked) there are three more items.
c_traits_normed
: mean, sd, VIF and (regression) coefficients of
the traits that define the dc-CA trait axes (composite traits), and their
optimistic t-ratio.
c_env_normed
: mean, sd, VIF and (regression) coefficients of
the environmental variables that define the dc-CA axes in terms of the
environmental variables (composite gradients), and their optimistic t-ratio.
species_axes
: a list with four items
species_scores
: a list with names
c("species_scores_unconstrained", "lc_traits_scores")
with the
matrix with species niche centroids along the dc-CA axes (composite
gradients) and the matrix with linear combinations of traits.
correlation
: a matrix with inter-set correlations of the
traits with their SNCs.
b_se
: a matrix with (unstandardized) regression coefficients
for traits and their optimistic standard errors.
R2_traits
: a vector with coefficient of determination (R2)
of the SNCs on to the traits. The square-root thereof could be called
the species-trait correlation in analogy with the species-environment
correlation in CCA.
sites_axes
: a list with four items
site_scores
: a list with names
c("site_scores_unconstrained", "lc_env_scores")
with the matrix
with community weighted means (CWMs) along the dc-CA axes (composite
gradients) and the matrix with linear combinations of environmental
variables.
correlation
: a matrix with inter-set correlations of the
environmental variables with their CWMs.
b_se
: a matrix with (unstandardized) regression coefficients
for environmental variables and their optimistic standard errors.
R2_env
: a vector with coefficient of determination (R2) of
the CWMs on to the environmental variables. The square-root thereof
has been called the species-environmental correlation in CCA.
All scores in the dcca
object are in scaling "sites"
(1):
the scaling with Focus on Case distances .
Kleyer, M., Dray, S., Bello, F., Lepš, J., Pakeman, R.J., Strauss, B., Thuiller, W. & Lavorel, S. (2012) Assessing species and community functional responses to environmental gradients: which multivariate methods? Journal of Vegetation Science, 23, 805-821. doi:10.1111/j.1654-1103.2012.01402.x
ter Braak, CJF, Šmilauer P, and Dray S. 2018. Algorithms and biplots for double constrained correspondence analysis. Environmental and Ecological Statistics, 25(2), 171-197. doi:10.1007/s10651-017-0395-x
ter Braak C.J.F. and P. Šmilauer (2018). Canoco reference manual and user's guide: software for ordination (version 5.1x). Microcomputer Power, Ithaca, USA, 536 pp.
Oksanen, J., et al. (2024) vegan: Community Ecology Package. R package version 2.6-6.1. https://CRAN.R-project.org/package=vegan.
plot.dcca
, scores.dcca
,
print.dcca
and anova.dcca
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites mod <- dc_CA(formulaEnv = ~A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) anova(mod, by = "axis") # For more demo on testing, see demo dune_test.r mod_scores <- scores(mod) # correlation of axes with a variable that is not in the model scores(mod, display = "cor", scaling = "sym", which_cor = list(NULL, "X_lot")) cat("head of unconstrained site scores, with meaning\n") print(head(mod_scores$sites)) mod_scores_tidy <- scores(mod, tidy = TRUE) print("names of the tidy scores") print(names(mod_scores_tidy)) cat("\nThe levels of the tidy scores\n") print(levels(mod_scores_tidy$score)) cat("\nFor illustration: a dc-CA model with a trait covariate\n") mod2 <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass), response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) cat("\nFor illustration: a dc-CA model with both environmental and trait covariates\n") mod3 <- dc_CA(formulaEnv = ~A1 + Moist + Use + Manure + Condition(Mag), formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass), response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = FALSE) cat("\nFor illustration: same model but using dc_CA_object = mod2 for speed, ", "as the trait model and data did not change\n") mod3B <- dc_CA(formulaEnv = ~A1 + Moist + Use + Manure + Condition(Mag), dataEnv = dune_trait_env$envir, dc_CA_object = mod2, verbose= FALSE) cat("\ncheck on equality of mod3 (from data) and mod3B (from a dc_CA_object)\n", "the expected difference is in the component 'call'\n ") print(all.equal(mod3, mod3B)) # only the component call differs
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites mod <- dc_CA(formulaEnv = ~A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) anova(mod, by = "axis") # For more demo on testing, see demo dune_test.r mod_scores <- scores(mod) # correlation of axes with a variable that is not in the model scores(mod, display = "cor", scaling = "sym", which_cor = list(NULL, "X_lot")) cat("head of unconstrained site scores, with meaning\n") print(head(mod_scores$sites)) mod_scores_tidy <- scores(mod, tidy = TRUE) print("names of the tidy scores") print(names(mod_scores_tidy)) cat("\nThe levels of the tidy scores\n") print(levels(mod_scores_tidy$score)) cat("\nFor illustration: a dc-CA model with a trait covariate\n") mod2 <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass), response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) cat("\nFor illustration: a dc-CA model with both environmental and trait covariates\n") mod3 <- dc_CA(formulaEnv = ~A1 + Moist + Use + Manure + Condition(Mag), formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass), response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = FALSE) cat("\nFor illustration: same model but using dc_CA_object = mod2 for speed, ", "as the trait model and data did not change\n") mod3B <- dc_CA(formulaEnv = ~A1 + Moist + Use + Manure + Condition(Mag), dataEnv = dune_trait_env$envir, dc_CA_object = mod2, verbose= FALSE) cat("\ncheck on equality of mod3 (from data) and mod3B (from a dc_CA_object)\n", "the expected difference is in the component 'call'\n ") print(all.equal(mod3, mod3B)) # only the component call differs
The data dune_trait_env
contains three data frames with abundance
data of 28 plant species in 20 samples (relevés), trait data (9 traits:
of which 5 morphological and 4 ecological (Ellenberg indicator values) and
environmental data (9 environmental variables, four of which are geographic
coordinates). Compared to the data in Jongman et al. (1987, 1995), the two
moss species are lacking, and the traits of plant species and the geographic
coordinates of the samples are added. The data and the following description
are an edited version of the DataKey in the Jamil2013_AJ data set in the
CESTES database (Jeliazkov et al. 2020).
The Dune Meadow Data originate from a MSc thesis report of Batterink & Wijffels (1983). It consisted of 80 relevés in about 68 dune meadows (lots) on the island Terschelling in the Netherlands. A subset of their data was selected by Caspar Looman as an example data set for the edited book Jongman, ter Braak and van Tongeren (1987, 1995). The subset consists of 20 relevés, 28 species (and 2 mosses, excluded here) and 5 environmental variables.
The trait data were taken from the LEDA database by Jamil et al 2013. The spatial coordinates were retrieved by geo-referencing in GIS of the maps in Batterink & Wijffels (1993) by Ruut Wegman and Cajo ter Braak. The X, Y coordinates are by geo-referencing the relevé locations on Kaart 3a, 3b and 3c; the X_lot, Y_lot coordinates are from Kaart 2a, 2b and 2c.
For the Ellenberg indicator values see Ellenberg (1992).
The data dune_trait_env
is a list with elements that are data frames
each
comm
: community data; vegetation data.
traits
: trait data, taken from the LEDA database.
envir
: environmental data, taken from Jongman et al. (1987,1995).
The community data collection was done by the Braun-Blanquet method; the data are recorded according to the ordinal scale of van der Maarel (1979, Vegetatio, 39, 97-114); see pages XVII-XVIII and 18 in Jongman, ter Braak & van Tongeren 1995. Nomenclature follows Heukels-Van der Meijden (1983) Flora van Nederland, 20th ed.
The morphological traits
are
SLA
: Specific Leaf Area
Height
: Canopy height of a shoot
LDMC
: Lead dry matter content
Seedmass
: Seed mass
Lifespan
: Life span. Nominal; annual vs. perennial
The ecological traits
(habitat requirements) are the Ellenberg values
F
: Moisture (ranging [1 to 12] (low to high))
R
: Soil acidity, ranging [1 to 9] (acidic to alkaline)
N
: Nitrogen requirement, ranging [1 to 9] (low to high)
L
: Light requirement, ranging [1 to 9] (low to high)
The data frame envir
contains the environmental variables
A1
: horizon thickness
Moist
: Moisture content of the soil (a five point scale)
Mag
: Grassland management type
Use
: type of use (Agricultural grassland use (1) hay production
(2) intermediate (3) grazing)
Manure
: Quantity of manure applied based on N and P manuring
(N/P class in B&W 1983)
X
: longitude geographical coordinates (m) of the 2x2 m2
sample (relevé)
Y
: latitude geographical coordinates (m) of the 2x2 m2
sample (relevé)
X_lot
: longitude geographical coordinates (m) of the lot center
Y_lot
: latitude geographical coordinates (m) of the lot center
The management types are standard farming (SF), biological farming (BF), hobby farming (HF), nature conservation management (NM). The coordinates are Rijksdriehoekscoordinaten in meters. https://nl.wikipedia.org/wiki/Rijksdriehoekscoordinaten
Batterink, M. & Wijffels, G. (1983) Een vergelijkend vegetatiekundig onderzoek naar de typologie en invloeden van het beheer van 1973 tot 1982 in de duinweilanden op Terschelling. Landbouwhogeschool. ISN 215909.01. WUR library stacks 704B58.
Ellenberg, H. (1992) Indicator values of plants in Central Europe. Scripta Geobotanica, 18, 258 pp.
Jamil, T., Ozinga, W.A., Kleyer, M. & ter Braak, C.J.F. (2013) Selecting traits that explain species–environment relationships: a generalized linear mixed model approach. Journal of Vegetation Science, 24, 988-1000. doi:10.1111/j.1654-1103.2012.12036.x.
Jeliazkov, A., Mijatovic, D., and 78 others. (2020) A global database for metacommunity ecology, integrating species, traits, environment and space. Scientific Data, 7. doi:10.1038/s41597-019-0344-7.
Jongman, R.H.G., ter Braak, C.J.F. & van Tongeren, O.F.R. (1987) Data analysis in community and landscape ecology. Pudoc, Wageningen. ISBN 90-220-0908-4.
Jongman, R.H.G., ter Braak, C.J.F. & van Tongeren, O.F.R. (1995) Data analysis in community and landscape ecology. Cambridge University Press, Cambridge. ISBN 0-521-47574-0. https://edepot.wur.nl/248017.
Kleyer, M., and 33 others (2008) The LEDA Traitbase: a database of life-history traits of the Northwest European flora. Journal of Ecology, 96, 1266-1274. doi:10.1111/j.1365-2745.2008.01430.x.
Double constrained correspondence analysis (dc-CA) can be calculated directly
from community weighted means (CWMs), with the trait data from which the
CWMs are calculated, and the environmental data and weights for species
and sites (the abundance totals for species and sites). Statistical testing
at the species level requires also the species niche centroids (SNCs).
The function fCWM_SNC
calculates the CWMs and SNCs from the trait
and environmental data, respectively, using a formula interface, so as to
allow categorical traits and environmental variables. The resulting object
can be set as the response
argument in dc_CA
so as to
give the same output as a call to dc_CA
with the abundance
data as response
, at least up to sign changes of the axes.
fCWM_SNC( response = NULL, dataEnv = NULL, dataTraits = NULL, formulaEnv = NULL, formulaTraits = NULL, divideBySiteTotals = TRUE, dc_CA_object = NULL, minimal_output = TRUE, verbose = TRUE )
fCWM_SNC( response = NULL, dataEnv = NULL, dataTraits = NULL, formulaEnv = NULL, formulaTraits = NULL, divideBySiteTotals = TRUE, dc_CA_object = NULL, minimal_output = TRUE, verbose = TRUE )
response |
matrix, data frame of the abundance data
(dimension n x m) or list with community weighted means (CWMs)
from |
dataEnv |
matrix or data frame of the row predictors, with rows
corresponding to those in |
dataTraits |
matrix or data frame of the column predictors, with rows
corresponding to the columns in |
formulaEnv |
formula or one-sided formula for the rows (samples) with
row predictors in |
formulaTraits |
formula or one-sided formula for the columns (species)
with column predictors in |
divideBySiteTotals |
logical; default |
dc_CA_object |
optional object from an earlier run of this function.
Useful if the same formula for the columns ( |
minimal_output |
logical. Default |
verbose |
logical for printing a simple summary (default: TRUE) |
The argument formulaTraits
determines which CWMs are calculated.
The CWMs are calculated from the trait data (non-centered, non-standardized).
With trait covariates, the other predictor traits are adjusted for the trait
covariates by weighted regression, after which the overall weighted mean
trait is added. This has the advantage that each CWM has the scale of the
original trait.
The SNCs are calculated analogously from environmental data.
Empty (all zero) rows and columns in response
are removed from
the response
and the corresponding rows from dataEnv
and
dataTraits
. Subsequently, any columns with missing values are
removed from dataEnv
and dataTraits
. It gives an error
(object 'name_of_variable' not found), if variables with missing entries
are specified in formulaEnv
and formulaTraits
.
In the current implementation, formulaEnv
and formulaTraits
should contain variable names as is, i.e. transformations of
variables in the formulas gives an error ('undefined columns selected')
when the scores
function is applied.
The default returns a list of CWMs, SNCs, weights, formulaTraits
and
a list of data with elements dataEnv
and dataTraits
. When
minimal_output = FALSE
, many more statistics are given that are
mainly technical or recomputed when the return value is used as
response
in a call to dc_CA
.
Kleyer, M., Dray, S., Bello, F., Lepš, J., Pakeman, R.J., Strauss, B., Thuiller, W. & Lavorel, S. (2012) Assessing species and community functional responses to environmental gradients: which multivariate methods? Journal of Vegetation Science, 23, 805-821. doi:10.1111/j.1654-1103.2012.01402.x
ter Braak, CJF, Šmilauer P, and Dray S. 2018. Algorithms and biplots for double constrained correspondence analysis. Environmental and Ecological Statistics, 25(2), 171-197. doi:10.1007/s10651-017-0395-x
ter Braak C.J.F. and P. Šmilauer (2018). Canoco reference manual and user's guide: software for ordination (version 5.1x). Microcomputer Power, Ithaca, USA, 536 pp.
Oksanen, J., et al. (2022) vegan: Community Ecology Package. R package version 2.6-4. https://CRAN.R-project.org/package=vegan.
dc_CA
, plot.dcca
,
scores.dcca
, print.dcca
and
anova.dcca
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites CWMSNC <- fCWM_SNC(formulaEnv = ~ A1 + Moist + Manure + Use + Condition(Mag), formulaTraits = ~ SLA + Height + LDMC + Condition(Seedmass) + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) names(CWMSNC) #CWMSNC$SNC <- NULL # would give correct dc-CA but no species-level t-values or test mod <- dc_CA(response = CWMSNC) # formulas and data are in CWMSNC! # note that output also gives the environment-constrained inertia, # which is a bonus compare to the usual way to carry out a dcCA. anova(mod)
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites CWMSNC <- fCWM_SNC(formulaEnv = ~ A1 + Moist + Manure + Use + Condition(Mag), formulaTraits = ~ SLA + Height + LDMC + Condition(Seedmass) + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) names(CWMSNC) #CWMSNC$SNC <- NULL # would give correct dc-CA but no species-level t-values or test mod <- dc_CA(response = CWMSNC) # formulas and data are in CWMSNC! # note that output also gives the environment-constrained inertia, # which is a bonus compare to the usual way to carry out a dcCA. anova(mod)
dc_CA
object for
plotting a single axis by your own code or plot.dcca
.getPlotdata
extracts data from a dc_CA
object for
plotting the CWMs and SNCs of a single axis.
getPlotdata( x, axis = 1, envfactor = NULL, traitfactor = NULL, newnames = NULL, facet = TRUE, remove_centroids = FALSE )
getPlotdata( x, axis = 1, envfactor = NULL, traitfactor = NULL, newnames = NULL, facet = TRUE, remove_centroids = FALSE )
x |
results from |
axis |
the axis number to get (default 1). |
envfactor |
name of row factor to display as color and lines in the CWM
plot (default |
traitfactor |
name of column factor to display as color and lines in
the SNC plot (default |
newnames |
a list with two elements: names for traits and for
environmental variables, default |
facet |
logical. Default |
remove_centroids |
logical to remove any centroids from the plot data
(default |
A list with three components
a data.frame containing plot data
a vector of scores per trait/environment
a vector of new names to be used in the plot
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites # must delete "Sites" from response matrix or data frame Y <- dune_trait_env$comm[, -1] # must delete "Sites" out <- dc_CA(formulaEnv = ~ A1 + Moist + Use + Manure + Mag, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = Y, dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = FALSE) dat <- getPlotdata(out) names(dat) names(dat$CWM_SNC) levels(dat$CWM_SNC$groups) plot(out)
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites # must delete "Sites" from response matrix or data frame Y <- dune_trait_env$comm[, -1] # must delete "Sites" out <- dc_CA(formulaEnv = ~ A1 + Moist + Use + Manure + Mag, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = Y, dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = FALSE) dat <- getPlotdata(out) names(dat) names(dat$CWM_SNC) levels(dat$CWM_SNC$groups) plot(out)
plot_dcCA_CWM_SNC
plots the CWMs and SNCs of a dc-CA axis against
this axis, with optional centroids and colors for groups of sites and/or
species if available in the data.
plot_dcCA_CWM_SNC( x, axis = 1, envfactor = NULL, traitfactor = NULL, facet = TRUE, newnames = NULL, remove_centroids = FALSE, with_lines = TRUE, getPlotdata2plotdCCA = NULL )
plot_dcCA_CWM_SNC( x, axis = 1, envfactor = NULL, traitfactor = NULL, facet = TRUE, newnames = NULL, remove_centroids = FALSE, with_lines = TRUE, getPlotdata2plotdCCA = NULL )
x |
results from |
axis |
the axis number to get (default 1). |
envfactor |
name of row factor to display as color and lines in the CWM
plot (default |
traitfactor |
name of column factor to display as color and lines in
the SNC plot (default |
facet |
logical. Default |
newnames |
a list with two elements: names for traits and for
environmental variables, default |
remove_centroids |
logical to remove any centroids from the plot data
(default |
with_lines |
logical. Default |
getPlotdata2plotdCCA |
the results of an |
The argument getPlotdata2plotdCCA
is to allow some modifications of
the data frame resulting from getPlotdata
. The variable names
and score levels should remain untouched. plot_dcCA_CWM_SNC
uses the
variables: dcCA
k with axis number k and
"CWM-SNC", "groups", "points", "sizeweight"
for the y-axis, coloring,
shape and size of items, respectively.
The function is used in plot.dcca
.
a ggplot object
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites # must delete "Sites" from response matrix or data frame Y <- dune_trait_env$comm[, -1] # must delete "Sites" out <- dc_CA(formulaEnv = ~ A1 + Moist + Use + Manure + Condition(Mag), formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = Y, dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = FALSE) plot_dcCA_CWM_SNC(out, facet = FALSE)
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites # must delete "Sites" from response matrix or data frame Y <- dune_trait_env$comm[, -1] # must delete "Sites" out <- dc_CA(formulaEnv = ~ A1 + Moist + Use + Manure + Condition(Mag), formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = Y, dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = FALSE) plot_dcCA_CWM_SNC(out, facet = FALSE)
plot_species_scores_bk
creates a vertical line plot of ordination
scores with selection criterion for which scores to plot with names.
plot_species_scores_bk( species_scores, ylab = "scores", threshold = 7, y_lab_interval = 0.5, speciesname = NULL, scoresname = "RDA1", selectname = "Fratio1", speciesgroup = NULL, expand = 0.2, verbose = TRUE )
plot_species_scores_bk( species_scores, ylab = "scores", threshold = 7, y_lab_interval = 0.5, speciesname = NULL, scoresname = "RDA1", selectname = "Fratio1", speciesgroup = NULL, expand = 0.2, verbose = TRUE )
species_scores |
a species-by-scores matrix, a data frame with
row names (species names) or a tibble with variable with name
|
ylab |
y-axis label. Default: $b_k$. |
threshold |
species with criterion (specified by |
y_lab_interval |
interval of the y-axis ticks. A tick at no effect (0) is always included; default: 0.5. |
speciesname |
name of the variable containing the species names
(default |
scoresname |
name of the column or variable containing the species
scores to be plotted (default |
selectname |
name of the column or variable containing the criterion
for the selection of species to be displayed. Default: |
speciesgroup |
name of the factor, the levels of which receive different colors in the vertical plot. |
expand |
amount of extension of the line plot (default 0.2). |
verbose |
logical for printing the number of species with names out of
the total number (default: |
The absolute value of the criterion values is taken before selection, so
that also the species scores of the ordination can serve as a criterion
(e.g. selectname = "RDA1"
). The function has been copied from
the PRC
package at https://github.com/CajoterBraak/PRC.
The function is used in plot.dcca
.
a ggplot object
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites mod <- dc_CA(formulaEnv = ~A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = FALSE) env_scores <- scores(mod, display = "tval") env_scores <- data.frame(env_scores) env_scores$group <- c("quantitative", "category")[c(1, 1, 2, 2, 2, 1, 1)] plot_species_scores_bk( species_scores = env_scores, ylab = "optimistic t-values", threshold = 0, y_lab_interval = 1, scoresname = "dcCA1", speciesgroup = "group", verbose = FALSE )
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites mod <- dc_CA(formulaEnv = ~A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = FALSE) env_scores <- scores(mod, display = "tval") env_scores <- data.frame(env_scores) env_scores$group <- c("quantitative", "category")[c(1, 1, 2, 2, 2, 1, 1)] plot_species_scores_bk( species_scores = env_scores, ylab = "optimistic t-values", threshold = 0, y_lab_interval = 1, scoresname = "dcCA1", speciesgroup = "group", verbose = FALSE )
plot.dcca
plots the CWMs and SNCs of a dc-CA axis against this axis,
with optional centroids and colors for groups of sites and/or species if
available in the data.
## S3 method for class 'dcca' plot( x, ..., axis = 1, gradient_description = "correlation", envfactor = NULL, traitfactor = NULL, nspecies = 20, species_groups = NULL, widths = c(5, 1, 1), newnames = NULL, facet = TRUE, remove_centroids = FALSE, with_lines = TRUE, verbose = TRUE )
## S3 method for class 'dcca' plot( x, ..., axis = 1, gradient_description = "correlation", envfactor = NULL, traitfactor = NULL, nspecies = 20, species_groups = NULL, widths = c(5, 1, 1), newnames = NULL, facet = TRUE, remove_centroids = FALSE, with_lines = TRUE, verbose = TRUE )
x |
results from |
... |
unused. |
axis |
the axis number to get (default 1). |
gradient_description |
character or 2-character vector for the trait
and environmental gradient, respectively specifying what to plot in the
vertical line plots to describe the dc-CA axis (trait and environmental
gradients). Default: |
envfactor |
name of row factor to display as color and lines in the CWM
plot (default |
traitfactor |
name of column factor to display as color and lines in
the SNC plot (default |
nspecies |
integer. Default |
species_groups |
name of a variable in |
widths |
relative widths of the CWM-SNC plot, the correlation/weight
plot and the species plot. (see |
newnames |
a list with two elements: names for traits and for
environmental variables, default |
facet |
logical. Default |
remove_centroids |
logical to remove any centroids from the plot data
(default |
with_lines |
logical. Default |
verbose |
logical. Default |
If you want to set new names, look at the names with all arguments default,
i.e. myplot <- plot(x)
, and then consult
myplot$nameList$newnames
for the order of the names of traits and
environmental variables. Note that covariates should not be in the list of
names. Contribution (in the definition of species selection in
nspecies
) is defined (as in CA) as the total species abundance in
the (possibly, closed) data multiplied by the square of the score on
the axis.
If the plot.dcca
returns the error "Error in grid.Call"
,
enlarge the plotting area or use verbose = FALSE
and assign the
result.
a ggplot object
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites # must delete "Sites" from response matrix or data frame Y <- dune_trait_env$comm[, -1] # must delete "Sites" out <- dc_CA(formulaEnv = ~ A1 + Moist + Use + Manure + Mag, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = Y, dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = FALSE) dat <- getPlotdata(out) names(dat) names(dat$CWM_SNC) levels(dat$CWM_SNC$groups) plot(out)
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites # must delete "Sites" from response matrix or data frame Y <- dune_trait_env$comm[, -1] # must delete "Sites" out <- dc_CA(formulaEnv = ~ A1 + Moist + Use + Manure + Mag, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = Y, dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = FALSE) dat <- getPlotdata(out) names(dat) names(dat$CWM_SNC) levels(dat$CWM_SNC$groups) plot(out)
Prediction of traits from environment, environment from traits and response from trait and environment data.
With type = "traits"
and newdata = NULL
, predict gives the
fitted mean traits, i.e. the fitted community weighted means.
With type = "env"
and newdata = NULL
, predict gives the
fitted mean environment, i.e. the fitted species niche centroids.
## S3 method for class 'dcca' predict( object, ..., type = c("env", "traits", "response", "reg_env", "reg_traits"), rank = "full", newdata = NULL )
## S3 method for class 'dcca' predict( object, ..., type = c("env", "traits", "response", "reg_env", "reg_traits"), rank = "full", newdata = NULL )
object |
return value of |
... |
Other arguments passed to the function (currently ignored). |
type |
type of prediction, |
rank |
rank or number of axes to use. Default "full" for all axes (no rank-reduction). |
newdata |
Data in which to look for variables with which to predict.
For |
Variables that are in the model but not in newdata
are set to their
weighted means in the training data. Predictions are thus at the (weighted)
mean of the quantitative variables not included. Predictions with
not-included factors are at the weighted mean (none of the factor effects
are included).
For type = "response"
and non-null newdata, the species weights of
the training are used; the site weights are taken equal. Many of the
predicted values may be negative, indicating expected absences (0) or small
expected response values.
Regression coefficients obtained with type = "reg_env"
or
type = "reg_traits"
are for standardized traits and environmental
variables.
a matrix with the predictions. The exact content of the matrix
depends on the type
of predictions that are being made.
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites mod <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Condition(Manure), formulaTraits = ~ SLA + Height + LDMC + Condition(Seedmass) + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) # regression coefficients predict(mod, type = "reg_env") predict(mod, type = "reg_traits") # fit the mean traits at each site (20x6), # that is CWM at each site pred.traits <- predict(mod, type = "traits") head(pred.traits) # fit the mean environment for each species (28x8) # that is SNC of each species pred.env <- predict(mod, type = "env") head(pred.env) pred.resp <- predict(mod, type = "response") # pred has negative values and dc_CA cannot have negatives in the response # so, modify pred.resp, #whichgives about similar eigenvalues as the original data pred.resp[pred.resp < 0] <- 0 mod3 <- dc_CA(formulaEnv = mod$formulaEnv, formulaTraits = mod$formulaTraits, response = pred.resp, dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) mod3$eigenvalues / mod$eigenvalues
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites mod <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Condition(Manure), formulaTraits = ~ SLA + Height + LDMC + Condition(Seedmass) + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) # regression coefficients predict(mod, type = "reg_env") predict(mod, type = "reg_traits") # fit the mean traits at each site (20x6), # that is CWM at each site pred.traits <- predict(mod, type = "traits") head(pred.traits) # fit the mean environment for each species (28x8) # that is SNC of each species pred.env <- predict(mod, type = "env") head(pred.env) pred.resp <- predict(mod, type = "response") # pred has negative values and dc_CA cannot have negatives in the response # so, modify pred.resp, #whichgives about similar eigenvalues as the original data pred.resp[pred.resp < 0] <- 0 mod3 <- dc_CA(formulaEnv = mod$formulaEnv, formulaTraits = mod$formulaTraits, response = pred.resp, dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) mod3$eigenvalues / mod$eigenvalues
Print a summary of a dc-CA object.
## S3 method for class 'dcca' print(x, ...)
## S3 method for class 'dcca' print(x, ...)
x |
a dc-CA object from |
... |
Other arguments passed to the function (currently ignored). |
x <- print(x)
is more efficient for scores.dcca
than
just print(x)
if dc_CA
is called with
verbose = FALSE
).
No return value, results are printed to console.
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites mod <- dc_CA(formulaEnv = ~A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) anova(mod, by = "axis") # For more demo on testing, see demo dune_test.r mod_scores <- scores(mod) # correlation of axes with a variable that is not in the model scores(mod, display = "cor", scaling = "sym", which_cor = list(NULL, "X_lot")) cat("head of unconstrained site scores, with meaning\n") print(head(mod_scores$sites)) mod_scores_tidy <- scores(mod, tidy = TRUE) print("names of the tidy scores") print(names(mod_scores_tidy)) cat("\nThe levels of the tidy scores\n") print(levels(mod_scores_tidy$score)) cat("\nFor illustration: a dc-CA model with a trait covariate\n") mod2 <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass), response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) cat("\nFor illustration: a dc-CA model with both environmental and trait covariates\n") mod3 <- dc_CA(formulaEnv = ~A1 + Moist + Use + Manure + Condition(Mag), formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass), response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = FALSE) cat("\nFor illustration: same model but using dc_CA_object = mod2 for speed, ", "as the trait model and data did not change\n") mod3B <- dc_CA(formulaEnv = ~A1 + Moist + Use + Manure + Condition(Mag), dataEnv = dune_trait_env$envir, dc_CA_object = mod2, verbose= FALSE) cat("\ncheck on equality of mod3 (from data) and mod3B (from a dc_CA_object)\n", "the expected difference is in the component 'call'\n ") print(all.equal(mod3, mod3B)) # only the component call differs
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites mod <- dc_CA(formulaEnv = ~A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) anova(mod, by = "axis") # For more demo on testing, see demo dune_test.r mod_scores <- scores(mod) # correlation of axes with a variable that is not in the model scores(mod, display = "cor", scaling = "sym", which_cor = list(NULL, "X_lot")) cat("head of unconstrained site scores, with meaning\n") print(head(mod_scores$sites)) mod_scores_tidy <- scores(mod, tidy = TRUE) print("names of the tidy scores") print(names(mod_scores_tidy)) cat("\nThe levels of the tidy scores\n") print(levels(mod_scores_tidy$score)) cat("\nFor illustration: a dc-CA model with a trait covariate\n") mod2 <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass), response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) cat("\nFor illustration: a dc-CA model with both environmental and trait covariates\n") mod3 <- dc_CA(formulaEnv = ~A1 + Moist + Use + Manure + Condition(Mag), formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass), response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = FALSE) cat("\nFor illustration: same model but using dc_CA_object = mod2 for speed, ", "as the trait model and data did not change\n") mod3B <- dc_CA(formulaEnv = ~A1 + Moist + Use + Manure + Condition(Mag), dataEnv = dune_trait_env$envir, dc_CA_object = mod2, verbose= FALSE) cat("\ncheck on equality of mod3 (from data) and mod3B (from a dc_CA_object)\n", "the expected difference is in the component 'call'\n ") print(all.equal(mod3, mod3B)) # only the component call differs
Print a summary of a wrda object
## S3 method for class 'wrda' print(x, ...)
## S3 method for class 'wrda' print(x, ...)
x |
a wrda object from |
... |
Other arguments passed to the function (currently ignored). |
No return value, results are printed to console.
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites response <- dune_trait_env$comm[, -1] # must delete "Sites" w <- rep(1, 20) w[1:10] <- 8 w[17:20] <- 0.5 object <- wrda(formula = ~ A1 + Moist + Mag + Use + Condition(Manure), response = response, data = dune_trait_env$envir, weights = w) object # Proportions equal to those Canoco 5.15 mod_scores <- scores(object, display = "all") scores(object, which_cor = c("A1", "X_lot"), display = "cor") anova(object)
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites response <- dune_trait_env$comm[, -1] # must delete "Sites" w <- rep(1, 20) w[1:10] <- 8 w[17:20] <- 0.5 object <- wrda(formula = ~ A1 + Moist + Mag + Use + Condition(Manure), response = response, data = dune_trait_env$envir, weights = w) object # Proportions equal to those Canoco 5.15 mod_scores <- scores(object, display = "all") scores(object, which_cor = c("A1", "X_lot"), display = "cor") anova(object)
This function works very much like the vegan
scores
function, in particular
scores.cca
, with the additional results such as
regression coefficients and linear combinations of traits
('reg_traits', 'lc_traits')
. All scores from CA obey the so called
transition formulas and so do the scores of CCA and dc-CA. The differences
are, for CCA, that the linear combinations of environmental variables (the
constrained site scores) replace the usual (unconstrained)
site scores, and for dc-CA, that the linear combinations of traits (the
constrained species scores) also replace the usual
(unconstrained) species scores in the transition formulas.
## S3 method for class 'dcca' scores( x, ..., choices = 1:2, display = "all", scaling = "sym", which_cor = "in model", tidy = FALSE )
## S3 method for class 'dcca' scores( x, ..., choices = 1:2, display = "all", scaling = "sym", which_cor = "in model", tidy = FALSE )
x |
object of class |
... |
Other arguments passed to the function (currently ignored). |
choices |
integer vector of which axes to obtain. Default: all dc-CA axes. |
display |
a character vector, one or more of |
scaling |
numeric (1,2 or 3) or character |
which_cor |
character or list of trait and environmental variables names (in this order) in the data frames for which inter-set correlations must calculated. Default: a character ("in_model") for all traits and variables in the model, including collinear variables and levels. |
tidy |
Return scores that are compatible with |
The function is modeled after scores.cca
.
The t-ratios are taken from a multiple regression of the unconstrained species (or site) scores on to the traits (or environmental variables).
An example of which_cor
is: which_cor = list(traits = "SLA",
env = c("acidity", "humidity"))
.
A data frame if tidy = TRUE
. Otherwise, a matrix if a single
item is asked for and a named list of matrices if more than one item is
asked for. The following names can be included:
c("sites", "constraints_sites", "centroids", "regression", "t_values",
"correlation", "intra_set_correlation", "biplot", "species",
"constraints_species", "regression_traits", "t_values_traits",
"correlation_traits", "intra_set_correlation_traits", "biplot_traits",
"centroids_traits")
. Each matrix has an attribute "meaning"
explaining its meaning. With tidy = TRUE
, the resulting data frame
has attributes "scaling"
and "meaning"
; the latter has two
columns: (1) name of score type and (2) its meaning, usage and
interpretation.
An example of the meaning of scores in scaling "symmetric"
with
display ="all"
:
CMWs of the trait axes (constraints species) in scaling 'symmetric' optimal for biplots and, almost so, for inter-site distances.
linear combination of the environmental predictors and the covariates (making the ordination axes orthogonal to the covariates) in scaling 'symmetric' optimal for biplots and, almost so, for inter-site distances.
mean, sd, VIF, standardized regression coefficients and their optimistic t-ratio in scaling 'symmetric'.
t-values of the coefficients of the regression of the CWMs of the trait composite on to the environmental variables
inter set correlation, correlation between environmental variables and the sites scores (CWMs)
intra set correlation, correlation between environmental variables and the dc-ca axis (constrained sites scores)
biplot scores of environmental variables for display with biplot-traits for fourth-corner correlations in scaling 'symmetric'.
environmental category means of the site scores in scaling 'symmetric' optimal for biplots and, almost so, for inter-environmental category distances.
SNC on the environmental axes (constraints sites) in scaling 'symmetric' optimal for biplots and, almost so, for inter-species distances.
linear combination of the traits and the trait covariates (making the ordination axes orthogonal to the covariates) in scaling 'symmetric' optimal for biplots and, almost so, for inter-species distances.
mean, sd, VIF, standardized regression coefficients and their optimistic t-ratio in scaling 'symmetric'.
t-values of the coefficients of the regression of the SNCs along a dc-CA axis on to the traits
inter set correlation, correlation between traits and the species scores (SNCs)
intra set correlation, correlation between traits and the dc-ca axis (constrained species scores)
biplot scores of traits for display with biplot scores for fourth-corner correlation in scaling 'symmetric'.
trait category means of the species scores in scaling 'symmetric' optimal for biplots and, almost so, for inter-trait category distances.
The statements on optimality for distance interpretations are based on the
scaling
and the relative magnitude of the dc-CA eigenvalues of the
chosen axes.
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites mod <- dc_CA(formulaEnv = ~A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) anova(mod, by = "axis") # For more demo on testing, see demo dune_test.r mod_scores <- scores(mod) # correlation of axes with a variable that is not in the model scores(mod, display = "cor", scaling = "sym", which_cor = list(NULL, "X_lot")) cat("head of unconstrained site scores, with meaning\n") print(head(mod_scores$sites)) mod_scores_tidy <- scores(mod, tidy = TRUE) print("names of the tidy scores") print(names(mod_scores_tidy)) cat("\nThe levels of the tidy scores\n") print(levels(mod_scores_tidy$score)) cat("\nFor illustration: a dc-CA model with a trait covariate\n") mod2 <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass), response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) cat("\nFor illustration: a dc-CA model with both environmental and trait covariates\n") mod3 <- dc_CA(formulaEnv = ~A1 + Moist + Use + Manure + Condition(Mag), formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass), response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = FALSE) cat("\nFor illustration: same model but using dc_CA_object = mod2 for speed, ", "as the trait model and data did not change\n") mod3B <- dc_CA(formulaEnv = ~A1 + Moist + Use + Manure + Condition(Mag), dataEnv = dune_trait_env$envir, dc_CA_object = mod2, verbose= FALSE) cat("\ncheck on equality of mod3 (from data) and mod3B (from a dc_CA_object)\n", "the expected difference is in the component 'call'\n ") print(all.equal(mod3, mod3B)) # only the component call differs
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites mod <- dc_CA(formulaEnv = ~A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan, response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) anova(mod, by = "axis") # For more demo on testing, see demo dune_test.r mod_scores <- scores(mod) # correlation of axes with a variable that is not in the model scores(mod, display = "cor", scaling = "sym", which_cor = list(NULL, "X_lot")) cat("head of unconstrained site scores, with meaning\n") print(head(mod_scores$sites)) mod_scores_tidy <- scores(mod, tidy = TRUE) print("names of the tidy scores") print(names(mod_scores_tidy)) cat("\nThe levels of the tidy scores\n") print(levels(mod_scores_tidy$score)) cat("\nFor illustration: a dc-CA model with a trait covariate\n") mod2 <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Manure, formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass), response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits) cat("\nFor illustration: a dc-CA model with both environmental and trait covariates\n") mod3 <- dc_CA(formulaEnv = ~A1 + Moist + Use + Manure + Condition(Mag), formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass), response = dune_trait_env$comm[, -1], # must delete "Sites" dataEnv = dune_trait_env$envir, dataTraits = dune_trait_env$traits, verbose = FALSE) cat("\nFor illustration: same model but using dc_CA_object = mod2 for speed, ", "as the trait model and data did not change\n") mod3B <- dc_CA(formulaEnv = ~A1 + Moist + Use + Manure + Condition(Mag), dataEnv = dune_trait_env$envir, dc_CA_object = mod2, verbose= FALSE) cat("\ncheck on equality of mod3 (from data) and mod3B (from a dc_CA_object)\n", "the expected difference is in the component 'call'\n ") print(all.equal(mod3, mod3B)) # only the component call differs
This function works very much like the vegan
scores
function, in particular
scores.cca
, but with regression coefficients for
predictors.
## S3 method for class 'wrda' scores( x, ..., choices = 1:2, display = "all", scaling = "sym", which_cor = "in model", tidy = FALSE )
## S3 method for class 'wrda' scores( x, ..., choices = 1:2, display = "all", scaling = "sym", which_cor = "in model", tidy = FALSE )
x |
object of class |
... |
Other arguments passed to the function (currently ignored). |
choices |
integer vector of which axes to obtain. Default: all wrda axes. |
display |
a character vector, one or more of |
scaling |
numeric (1,2 or 3) or character |
which_cor |
character vector environmental variables names in the data frames for which inter-set correlations must calculated. Default: a character ("in_model") for all predictors in the model, including collinear variables and levels. |
tidy |
Return scores that are compatible with |
The function is modeled after scores.cca
.
An example of which_cor is: which_cor = c("acidity", "humidity")
A data frame if tidy = TRUE
. Otherwise, a matrix if a single
item is asked for and a named list of matrices if more than one item is
asked for. The following names can be included: c("sites",
"constraints_sites", "centroids", "regression", "t_values", "correlation",
"intra_set_correlation", "biplot", "species")
. Each matrix has an
attribute "meaning"
explaining its meaning. With tidy = TRUE
,
the resulting data frame has attributes "scaling"
and
"meaning"
; the latter has two columns: (1) name of score type and (2)
its meaning, usage and interpretation.
An example of the meaning of scores in scaling "symmetric"
with
display = "all"
:
CMWs of the trait axes (constraints species) in scaling 'symmetric' optimal for biplots and, almost so, for inter-site distances.
linear combination of the environmental predictors and the covariates (making the ordination axes orthogonal to the covariates) in scaling 'symmetric' optimal for biplots and, almost so, for inter-site distances.
mean, sd, VIF, standardized regression coefficients and their optimistic t-ratio in scaling 'symmetric'.
t-values of the coefficients of the regression of the CWMs of the trait composite on to the environmental variables
inter set correlation, correlation between environmental variables and the sites scores (CWMs)
intra set correlation, correlation between environmental variables and the dc-ca axis (constrained sites scores)
biplot scores of environmental variables for display with biplot-traits for fourth-corner correlations in scaling 'symmetric'.
environmental category means of the site scores in scaling 'symmetric' optimal for biplots and, almost so, for inter-environmental category distances.
SNC on the environmental axes (constraints sites) in scaling 'symmetric' optimal for biplots and, almost so, for inter-species distances.
The statements on optimality for distance interpretations are based on the
scaling
and the relative magnitude of the dc-CA eigenvalues of the
chosen axes.
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites response <- dune_trait_env$comm[, -1] # must delete "Sites" w <- rep(1, 20) w[1:10] <- 8 w[17:20] <- 0.5 object <- wrda(formula = ~ A1 + Moist + Mag + Use + Condition(Manure), response = response, data = dune_trait_env$envir, weights = w) object # Proportions equal to those Canoco 5.15 mod_scores <- scores(object, display = "all") scores(object, which_cor = c("A1", "X_lot"), display = "cor") anova(object)
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites response <- dune_trait_env$comm[, -1] # must delete "Sites" w <- rep(1, 20) w[1:10] <- 8 w[17:20] <- 0.5 object <- wrda(formula = ~ A1 + Moist + Mag + Use + Condition(Manure), response = response, data = dune_trait_env$envir, weights = w) object # Proportions equal to those Canoco 5.15 mod_scores <- scores(object, display = "all") scores(object, which_cor = c("A1", "X_lot"), display = "cor") anova(object)
wrda
is formula-based implementation of weighted redundancy analysis.
wrda(formula, response, data, weights = rep(1, nrow(data)), verbose = TRUE)
wrda(formula, response, data, weights = rep(1, nrow(data)), verbose = TRUE)
formula |
one or two-sided formula for the rows (samples) with row
predictors in |
response |
matrix or data frame of the abundance data (dimension
n x m). Rownames of |
data |
matrix or data frame of the row predictors, with rows
corresponding to those in |
weights |
row weights (a vector). If not specified unit weights are used. |
verbose |
logical for printing a simple summary (default: TRUE) |
The algorithm is a modified version of published R-code for weighted redundancy analysis (ter Braak, 2022).
In the current implementation, formula
should contain variable names
as is, i.e. transformations of variables in the formulas gives
an error ('undefined columns selected') when the scores
function is applied.
Compared to rda
, wrda
does not have residual
axes, i.e. no SVD or PCA of the residuals is performed.
All scores in the dcca
object are in scaling "sites"
(1):
the scaling with Focus on Case distances.
ter Braak C.J.F. and P. Šmilauer (2018). Canoco reference manual and user's guide: software for ordination (version 5.1x). Microcomputer Power, Ithaca, USA, 536 pp.
Oksanen, J., et al. (2022) vegan: Community Ecology Package. R package version 2.6-4. https://CRAN.R-project.org/package=vegan.
scores.wrda
, anova.wrda
,
print.wrda
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites response <- dune_trait_env$comm[, -1] # must delete "Sites" w <- rep(1, 20) w[1:10] <- 8 w[17:20] <- 0.5 object <- wrda(formula = ~ A1 + Moist + Mag + Use + Condition(Manure), response = response, data = dune_trait_env$envir, weights = w) object # Proportions equal to those Canoco 5.15 mod_scores <- scores(object, display = "all") scores(object, which_cor = c("A1", "X_lot"), display = "cor") anova(object)
data("dune_trait_env") # rownames are carried forward in results rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites response <- dune_trait_env$comm[, -1] # must delete "Sites" w <- rep(1, 20) w[1:10] <- 8 w[17:20] <- 0.5 object <- wrda(formula = ~ A1 + Moist + Mag + Use + Condition(Manure), response = response, data = dune_trait_env$envir, weights = w) object # Proportions equal to those Canoco 5.15 mod_scores <- scores(object, display = "all") scores(object, which_cor = c("A1", "X_lot"), display = "cor") anova(object)