Title: | High Throughput Phenotyping (HTP) Data Analysis |
---|---|
Description: | Phenotypic analysis of data coming from high throughput phenotyping (HTP) platforms, including different types of outlier detection, spatial analysis, and parameter estimation. The package is being developed within the EPPN2020 project (<https://eppn2020.plant-phenotyping.eu/>). Some functions have been created to be used in conjunction with the R package 'asreml' for the 'ASReml' software, which can be obtained upon purchase from 'VSN' international (<https://vsni.co.uk/software/asreml-r/>). |
Authors: | Emilie J Millet [aut] , Maria Xose Rodriguez Alvarez [aut] , Diana Marcela Perez Valencia [aut] , Isabelle Sanchez [aut], Nadine Hilgert [aut], Bart-Jan van Rossum [aut, cre] , Fred van Eeuwijk [aut] , Martin Boer [aut] |
Maintainer: | Bart-Jan van Rossum <[email protected]> |
License: | GPL |
Version: | 1.0.7 |
Built: | 2024-11-13 06:01:57 UTC |
Source: | https://github.com/biometris/statgenhtp |
Function for converting an object of class TP to a data.frame.
## S3 method for class 'TP' as.data.frame(x, ...)
## S3 method for class 'TP' as.data.frame(x, ...)
x |
An object of class TP. |
... |
Ignored. |
A data.frame containing the data.frames for all time points in the TP object bound together.
Other functions for data preparation:
createTimePoints()
,
getTimePoints()
,
plot.TP()
,
removeTimePoints()
,
summary.TP()
## Create a TP object containing the data from the Phenovator. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Convert phenoTP to data.frame. phenoDat <- as.data.frame(phenoTP)
## Create a TP object containing the data from the Phenovator. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Convert phenoTP to data.frame. phenoDat <- as.data.frame(phenoTP)
Count valid observations per time point for a given trait.
countValid(TP, trait)
countValid(TP, trait)
TP |
An object of class TP. |
trait |
A character string indicating the trait for which valid observations should be counted. |
A named numerical vector with he number of valid observations per time point .
## Create a TP object containing the data from the Phenovator. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Count valid observations for EffpsII per time point. validPheno <- countValid(phenoTP, trait = "EffpsII") head(validPheno)
## Create a TP object containing the data from the Phenovator. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Count valid observations for EffpsII per time point. validPheno <- countValid(phenoTP, trait = "EffpsII") head(validPheno)
Count valid observations per plotId for a given trait.
countValidPlot(TP, trait, plotIds = NULL)
countValidPlot(TP, trait, plotIds = NULL)
TP |
An object of class TP. |
trait |
A character string indicating the trait for which valid observations should be counted. |
plotIds |
A character vector indicating the plotIds for which valid
observations should be checked. If |
A named numerical vector with he number of valid observations per plotId.
## Create a TP object containing the data from the Phenovator. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Count valid observations for EffpsII for a subset of plots. countValidPlot(phenoTP, trait = "EffpsII", plotIds = c("c12r22", "c24r41", "c14r32"))
## Create a TP object containing the data from the Phenovator. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Count valid observations for EffpsII for a subset of plots. countValidPlot(phenoTP, trait = "EffpsII", plotIds = c("c12r22", "c24r41", "c14r32"))
Convert a data.frame to an object of class TP (Time Points). The function converts a data.frame to an object of class TP in the following steps:
Quality control on the input data. For example, warnings will be given when more than 50% of observations are missing for a plant.
Rename columns to default column names used by the functions in the statgenHTP package. For example, the column in the data containing variety/accession/genotype is renamed to “genotype.” Original column names are stored as an attribute of the individual data.frames in the TP object.
Convert column types to the default column types. For example, the column “genotype” is converted to a factor and “rowNum” to a numeric column.
Convert the column containing time into time format. If needed, the
time format can be provided in timeFormat
. For example, with a
date/time input of the form “day/month/year hour:minute”, use
"%d/%m/%Y %H:%M". For a full list of abbreviations see the R package
strptime. When the input time is a numeric value, the function will
convert it to time from 01-01-1970.
If addCheck = TRUE
, the genotypes listed in
checkGenotypes
are reference genotypes (or check). It will add a
column check with a value "noCheck" for the genotypes that are not in
checkGenotypes
and the name of the genotypes for the
checkGenotypes
. A column genoCheck is also added with the names of
the genotypes that are not in checkGenotypes
and NA
for the
checkGenotypes
. These columns are necessary for fitting models on
data that includes check genotypes, e.g. reference genotypes that are
highly replicated or in case of augmented design.
Split the data into separate data.frames by time point. A TP object is a list of data.frames where each data.frame contains the data for a single time point. If there is only one time point the output will be a list with only one item.
Add a data.frame with columns timeNumber and timePoint as attribute “timePoints” to the TP object. This data.frame can be used for referencing time points by a unique number.
Note that plotId
needs to be a unique identifier for a plot or a
plant. It cannot occur more than once per time point.
createTimePoints( dat, experimentName, genotype, timePoint, timeFormat = NULL, plotId, repId = NULL, rowNum = NULL, colNum = NULL, addCheck = FALSE, checkGenotypes = NULL )
createTimePoints( dat, experimentName, genotype, timePoint, timeFormat = NULL, plotId, repId = NULL, rowNum = NULL, colNum = NULL, addCheck = FALSE, checkGenotypes = NULL )
dat |
A data.frame. |
experimentName |
A character string, the name of the experiment. Stored with the data and used in default plot titles. |
genotype |
A character string indicating the column in dat containing the genotypes. |
timePoint |
A character string indicating the column in dat containing the time points. |
timeFormat |
A character string indicating the input format of the time
points. E.g. for a date/time input of the form day/month/year hour:minute,
use "%d/%m/%Y %H:%M". For a full list of abbreviations see
|
plotId |
A character string indicating the column in dat containing the plotId. This has to be a unique identifier per plot/plant per time point. |
repId |
A character string indicating the column in dat containing the replicates. |
rowNum |
A character string indicating the column in dat containing the row number of the plot. |
colNum |
A character string indicating the column in dat containing the column number of the plot. |
addCheck |
Should a column check be added to the output? If |
checkGenotypes |
A character vector containing the genotypes used as checks in the experiment. |
An object of class TP
. A list with, per time point in the
input, a data.frame containing the data for that time point. A data.frame
with columns timeNumber and timePoint is added as attribute timePoints to
the data. This data.frame can be used for referencing timePoints by their
number.
Other functions for data preparation:
as.data.frame.TP()
,
getTimePoints()
,
plot.TP()
,
removeTimePoints()
,
summary.TP()
## Create a TP object containing the data from the Phenovator. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3","check4")) summary(phenoTP)
## Create a TP object containing the data from the Phenovator. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3","check4")) summary(phenoTP)
Function for detecting strange time courses. The function uses the estimates
for the spline coefficients per time course (typically per plant).
Correlations between those coefficient vectors are calculated to identify
outlying time courses, i.e., plants. An outlying time course will have low
correlation to the majority of time courses. To support the analysis by
correlations, a principal component analysis is done on the plant
(time course) by spline coefficient matrix. A PCA plot of the plant scores
will show the outlying plants. Finally the pairwise-ratios of the slopes of
a linear model fitted through the spline coefficients are computed. Plants
are tagged when the average pairwise-ratio is lower the a given threshold
(thrSlope
).
detectSerieOut( corrDat, predDat, coefDat, trait, genotypes = NULL, geno.decomp = NULL, thrCor = 0.9, thrPca = 30, thrSlope = 0.7 )
detectSerieOut( corrDat, predDat, coefDat, trait, genotypes = NULL, geno.decomp = NULL, thrCor = 0.9, thrPca = 30, thrSlope = 0.7 )
corrDat |
A data.frame with corrected spatial data. |
predDat |
A data.frame with predicted data from a fitted spline. |
coefDat |
A data.frame with coefficients from a fitted spline. |
trait |
A character string indicating the trait for which to detect outliers. |
genotypes |
A character vector indicating the genotypes for which to
detect outliers. If |
geno.decomp |
A character vector indicating the variables to use to group the genotypic variance in the model. |
thrCor |
A numerical value used as threshold for determining outliers based on correlation between plots. |
thrPca |
A numerical value used as threshold for determining outliers based on angles (in degrees) between PCA scores. |
thrSlope |
A numerical value used as threshold for determining outliers based on slopes. |
An object of class serieOut
, a data.frame
with outlying
series of observations.
Other functions for detecting outliers for series of observations:
plot.serieOut()
,
removeSerieOut()
## The data from the Phenovator platform have been corrected for spatial ## trends and outliers for single observations have been removed. ## Fit P-splines on a subset of genotypes subGenoVator <- c("G160", "G151") fit.spline <- fitSpline(inDat = spatCorrectedVator, trait = "EffpsII_corr", genotypes = subGenoVator, knots = 50) ## Extract the data.frames with predicted values and P-Spline coefficients. predDat <- fit.spline$predDat coefDat <- fit.spline$coefDat ## The coefficients are then used to tag suspect time courses. outVator <- detectSerieOut(corrDat = spatCorrectedVator, predDat = predDat, coefDat = coefDat, trait = "EffpsII_corr", genotypes = subGenoVator, thrCor = 0.9, thrPca = 30, thrSlope = 0.7) ## The `outVator` can be visualized for selected genotypes. plot(outVator, genotypes = "G151")
## The data from the Phenovator platform have been corrected for spatial ## trends and outliers for single observations have been removed. ## Fit P-splines on a subset of genotypes subGenoVator <- c("G160", "G151") fit.spline <- fitSpline(inDat = spatCorrectedVator, trait = "EffpsII_corr", genotypes = subGenoVator, knots = 50) ## Extract the data.frames with predicted values and P-Spline coefficients. predDat <- fit.spline$predDat coefDat <- fit.spline$coefDat ## The coefficients are then used to tag suspect time courses. outVator <- detectSerieOut(corrDat = spatCorrectedVator, predDat = predDat, coefDat = coefDat, trait = "EffpsII_corr", genotypes = subGenoVator, thrCor = 0.9, thrPca = 30, thrSlope = 0.7) ## The `outVator` can be visualized for selected genotypes. plot(outVator, genotypes = "G151")
Detect outlying observations in a time series by modeling each plotId using a local regression.
detectSingleOut( TP, trait, plotIds = NULL, checkEdges = TRUE, confIntSize = 5, nnLocfit = 0.5 )
detectSingleOut( TP, trait, plotIds = NULL, checkEdges = TRUE, confIntSize = 5, nnLocfit = 0.5 )
TP |
An object of class |
trait |
A character vector indicating the trait to model in |
plotIds |
A character vector of plotIds for which the outliers should be
detected. If |
checkEdges |
Before fitting the local regression should a check be done if the first and last time point for a plot are outlying observations? |
confIntSize |
A numeric value defining the confidence interval (see Details). |
nnLocfit |
A numeric value defining the constant component of the smoothing parameter nn (see Details). |
See locfit() help function from the locfit R library. The user can act on:
the constant of the smoothing parameter. Increase nnLocfit to have a very smooth curve
the level to calculate the confidence interval. Increase confIntSize to exclude less outliers
An object of class singleOut, a data.frame
with the following
columns.
plotId
time point
modeled trait
prediction from the local regression
standard deviation of the prediction
lower bound of the confidence interval
upper bound of the confidence interval
flag for detected outlier (a value of 1 indicates the observation is an outlier)
Other functions for detecting outliers for single observations:
detectSingleOutMaize()
,
plot.singleOut()
,
removeSingleOut()
## Create a TP object containing the data from the Phenovator. PhenovatorDat1 <- PhenovatorDat1[!PhenovatorDat1$pos %in% c("c24r41", "c7r18", "c7r49"), ] phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## First select a subset of plants, for example here 9 plants plantSel <- phenoTP[[1]]$plotId[1:9] # Then run on the subset resuVatorHTP <- detectSingleOut(TP = phenoTP, trait = "EffpsII", plotIds = plantSel, confIntSize = 3, nnLocfit = 0.1)
## Create a TP object containing the data from the Phenovator. PhenovatorDat1 <- PhenovatorDat1[!PhenovatorDat1$pos %in% c("c24r41", "c7r18", "c7r49"), ] phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## First select a subset of plants, for example here 9 plants plantSel <- phenoTP[[1]]$plotId[1:9] # Then run on the subset resuVatorHTP <- detectSingleOut(TP = phenoTP, trait = "EffpsII", plotIds = plantSel, confIntSize = 3, nnLocfit = 0.1)
Function to detect plant outliers in a temporal lattice experiment on Maize which can be extended to other experiment types. The criteria needs three phenotypes (ex for maize: the estimated biomass, plant height and phyllocron)
if for biomass AND
phyllocron
if for biomass AND
plant height
detectSingleOutMaize( TP, timeBeforeTrt, trait1 = "Biomass", trait2 = "PlantHeight", trait3 = "phyllocron", thr = 0.95 )
detectSingleOutMaize( TP, timeBeforeTrt, trait1 = "Biomass", trait2 = "PlantHeight", trait3 = "phyllocron", thr = 0.95 )
TP |
An object of class TP. |
timeBeforeTrt |
A character or numeric value indicating the date just before treatment in the experiment. When using a character string to reference a time point, the value has to be an exact match to one of the existing timePoints. When using a number it will be matched by its number ("timeNumber") in the timePoints attribute of the TP object. |
trait1 |
A character vector indicating the first trait to model in TP. |
trait2 |
A character vector indicating the second trait to model in TP. |
trait3 |
A character vector indicating the third trait to model in TP. |
thr |
A numeric value indicating the threshold. |
A list with three data.frames, modDat
containing the
data used for fitting the models, smallPlants
containing the plants
identified as small plants and bigPlants
containing the plants
identified as big plants.
Other functions for detecting outliers for single observations:
detectSingleOut()
,
plot.singleOut()
,
removeSingleOut()
## Create a TP object containing the data from the PhenoArch. phenoTParch <- createTimePoints(dat = PhenoarchDat1, experimentName = "Phenoarch", genotype = "Genotype", timePoint = "Date", plotId = "pos", rowNum = "Row", colNum = "Col") singleOutMaize <- detectSingleOutMaize(phenoTParch, timeBeforeTrt = "2017-04-27", trait1 = "Biomass", trait2 = "PlantHeight", trait3 = "phyllocron", thr = 0.95)
## Create a TP object containing the data from the PhenoArch. phenoTParch <- createTimePoints(dat = PhenoarchDat1, experimentName = "Phenoarch", genotype = "Genotype", timePoint = "Date", plotId = "pos", rowNum = "Row", colNum = "Col") singleOutMaize <- detectSingleOutMaize(phenoTParch, timeBeforeTrt = "2017-04-27", trait1 = "Biomass", trait2 = "PlantHeight", trait3 = "phyllocron", thr = 0.95)
Function for extracting parameter estimates from fitted splines on a specified interval.
estimateSplineParameters( x, estimate = c("predictions", "derivatives", "derivatives2"), what = c("min", "max", "mean", "AUC", "p"), AUCScale = c("min", "hour", "day"), timeMin = NULL, timeMax = NULL, genotypes = NULL, plotIds = NULL, fitLevel = c("geno", "plot", "genoDev", "plotDev") )
estimateSplineParameters( x, estimate = c("predictions", "derivatives", "derivatives2"), what = c("min", "max", "mean", "AUC", "p"), AUCScale = c("min", "hour", "day"), timeMin = NULL, timeMax = NULL, genotypes = NULL, plotIds = NULL, fitLevel = c("geno", "plot", "genoDev", "plotDev") )
x |
An object of class HTPSpline, the output of the
|
estimate |
The P-Spline component for which the estimate should be extracted, the predictions, the first derivatives or the second derivatives ("derivatives2") |
what |
The types of estimate that should be extracted. Either minimum ("min"), maximum ("max"), mean, area under the curve ("AUC") or a percentile. Percentiles should be given as p + percentile. E.g. for the 10th percentile specify what = "p10". Multiple types of estimate can be extracted at once. |
AUCScale |
The area under the curve is dependent on the scale used on the x-axis. By default the area is computed assuming a scale in minutes. This can be changed to either hours or days. |
timeMin |
The lower bound of the time interval from which the
estimates should be extracted. If |
timeMax |
The upper bound of the time interval from which the
estimates should be extracted. If |
genotypes |
A character vector indicating the genotypes for which
estimates should be extracted. If |
plotIds |
A character vector indicating the plotIds for which
estimates should be extracted. If |
fitLevel |
A character string indicating at which level of the data
the parameter estimates should be made. Only used for splines fitted using
|
An object of class splineEst, a data.frame containing the estimated parameters.
Other functions for spline parameter estimation:
plot.splineEst()
### Estimate parameters for fitted P-splines. ## Run the function to fit P-splines on a subset of genotypes. subGeno <- c("G160", "G151") fit.spline <- fitSpline(inDat = spatCorrectedVator, trait = "EffpsII_corr", genotypes = subGeno, knots = 50) ## Estimate the maximum value of the predictions at the beginning of the time course. paramVator <- estimateSplineParameters(x = fit.spline, estimate = "predictions", what = "max", timeMin = 1527784620, timeMax = 1528500000, genotypes = subGeno) head(paramVator) ## Create a boxplot of the estimates. plot(paramVator, plotType = "box") ## Estimate the minimum and maximum value of the predictions. paramVator2 <- estimateSplineParameters(x = fit.spline, estimate = "predictions", what = c("min", "max"), genotypes = subGeno) head(paramVator2) ### Estimate parameters for fitted HDM-splines. ## The data from the Phenovator platform have been corrected for spatial ## trends and outliers for single observations have been removed. ## We need to specify the genotype-by-treatment interaction. ## Treatment: water regime (WW, WD). spatCorrectedArch[["treat"]] <- substr(spatCorrectedArch[["geno.decomp"]], start = 1, stop = 2) spatCorrectedArch[["genoTreat"]] <- interaction(spatCorrectedArch[["genotype"]], spatCorrectedArch[["treat"]], sep = "_") ## Fit P-Splines Hierarchical Curve Data Model for selection of genotypes. fit.psHDM <- fitSplineHDM(inDat = spatCorrectedArch, trait = "LeafArea_corr", genotypes = c("GenoA14_WD", "GenoA51_WD", "GenoB11_WW", "GenoB02_WD", "GenoB02_WW"), time = "timeNumber", pop = "geno.decomp", genotype = "genoTreat", plotId = "plotId", difVar = list(geno = FALSE, plot = FALSE), smoothPop = list(nseg = 4, bdeg = 3, pord = 2), smoothGeno = list(nseg = 4, bdeg = 3, pord = 2), smoothPlot = list(nseg = 4, bdeg = 3, pord = 2), weights = "wt", trace = FALSE) ## Estimate minimum, maximum, and mean for predictions at the genotype level. paramArch <- estimateSplineParameters(x = fit.psHDM, what = c("min", "max", "mean"), fitLevel = "geno", estimate = "predictions", timeMax = 28) head(paramArch) ## Create a boxplot of the estimates. plot(paramArch, plotType = "box") ## Estimate area under the curve for predictions at the plot level. paramArch2 <- estimateSplineParameters(x = fit.psHDM, what = "AUC", fitLevel = "plot", estimate = "predictions") head(paramArch2)
### Estimate parameters for fitted P-splines. ## Run the function to fit P-splines on a subset of genotypes. subGeno <- c("G160", "G151") fit.spline <- fitSpline(inDat = spatCorrectedVator, trait = "EffpsII_corr", genotypes = subGeno, knots = 50) ## Estimate the maximum value of the predictions at the beginning of the time course. paramVator <- estimateSplineParameters(x = fit.spline, estimate = "predictions", what = "max", timeMin = 1527784620, timeMax = 1528500000, genotypes = subGeno) head(paramVator) ## Create a boxplot of the estimates. plot(paramVator, plotType = "box") ## Estimate the minimum and maximum value of the predictions. paramVator2 <- estimateSplineParameters(x = fit.spline, estimate = "predictions", what = c("min", "max"), genotypes = subGeno) head(paramVator2) ### Estimate parameters for fitted HDM-splines. ## The data from the Phenovator platform have been corrected for spatial ## trends and outliers for single observations have been removed. ## We need to specify the genotype-by-treatment interaction. ## Treatment: water regime (WW, WD). spatCorrectedArch[["treat"]] <- substr(spatCorrectedArch[["geno.decomp"]], start = 1, stop = 2) spatCorrectedArch[["genoTreat"]] <- interaction(spatCorrectedArch[["genotype"]], spatCorrectedArch[["treat"]], sep = "_") ## Fit P-Splines Hierarchical Curve Data Model for selection of genotypes. fit.psHDM <- fitSplineHDM(inDat = spatCorrectedArch, trait = "LeafArea_corr", genotypes = c("GenoA14_WD", "GenoA51_WD", "GenoB11_WW", "GenoB02_WD", "GenoB02_WW"), time = "timeNumber", pop = "geno.decomp", genotype = "genoTreat", plotId = "plotId", difVar = list(geno = FALSE, plot = FALSE), smoothPop = list(nseg = 4, bdeg = 3, pord = 2), smoothGeno = list(nseg = 4, bdeg = 3, pord = 2), smoothPlot = list(nseg = 4, bdeg = 3, pord = 2), weights = "wt", trace = FALSE) ## Estimate minimum, maximum, and mean for predictions at the genotype level. paramArch <- estimateSplineParameters(x = fit.psHDM, what = c("min", "max", "mean"), fitLevel = "geno", estimate = "predictions", timeMax = 28) head(paramArch) ## Create a boxplot of the estimates. plot(paramArch, plotType = "box") ## Estimate area under the curve for predictions at the plot level. paramArch2 <- estimateSplineParameters(x = fit.psHDM, what = "AUC", fitLevel = "plot", estimate = "predictions") head(paramArch2)
Perform REML analysis at each time point using either SpATS or asreml. The idea is to is to accurately separate the genetic effects from the spatial effects at each time point. SpATS is used as a default method. See details for the exact models fitted.
fitModels( TP, trait, timePoints = names(TP), extraFixedFactors = NULL, geno.decomp = NULL, what = c("random", "fixed"), useCheck = FALSE, useRepId = FALSE, engine = c("SpATS", "asreml"), spatial = FALSE, quiet = FALSE )
fitModels( TP, trait, timePoints = names(TP), extraFixedFactors = NULL, geno.decomp = NULL, what = c("random", "fixed"), useCheck = FALSE, useRepId = FALSE, engine = c("SpATS", "asreml"), spatial = FALSE, quiet = FALSE )
TP |
An object of class |
trait |
A character string indicating the trait used as response variable in the model. |
timePoints |
A character or numeric vector indicating the time points
to be modeled. When using a character string to reference a time point, the
value has to be an exact match to one of the existing time points. When using
a number it will be matched by its number ("timeNumber") in the timePoints
attribute of the |
extraFixedFactors |
A character vector indicating the variables to use as extra fixed effects in the model. |
geno.decomp |
A character vector indicating the variables to use to group the genotypic variance in the model. |
what |
A character vector specifying whether "genotype" should
be fitted as "random" or "fixed" effect. Note that when using
|
useCheck |
Should check genotypes be used as an extra factor in the model? |
useRepId |
Should repId be used as a fixed effect in the model? When fitting a spatial model rowId and colId are also nested within repId in the random part of the model. |
engine |
A character string indicating the engine used to fit the models. |
spatial |
Should a spatial model be fitted for asreml? |
quiet |
Should printed progress messages be suppressed? |
The actual model fitted depends on the function parameters specified. The
basic model is the following:
trait = genotype + e
In case useCheck = TRUE
, instead of genotype, genoCheck is used as
genotype and check is used as an extra fixed effect. So then the model
becomes:
trait = check + genoCheck + e
Variables in extraFixedFactors
are fitted as extra fixed
effects.
When SpATS
is used for modeling, an extra spatial term is always
included in the model. This term is constructed using the function
PSANOVA
from the SpATS package asPSANOVA(colNum, rowNum, nseg = nSeg, nest.div = 2)
wherenSeg = c(number of columns, number of rows)
.
When asreml
is used for modeling and spatial = TRUE
,
four models are fitted with different covariance structures.
The best model is determined based on a goodness-of-fit criterion, AIC,
on 20% of the time points or at least 10 time points. The best model is then
run on all time points.
The following combinations of random and spatial terms are fitted
random = repId:rowId + repId:colId, spatial = NULL
random = repId:rowId + repId:colId, spatial = ar1(rowId):colId
random = repId:colId + repId:colId, spatial = rowId:ar1(colId)
random = repId:rowId + repId:colId, spatial = ar1(rowId):ar1(colId)
If there are no replicates in the model, repId is left out from the random
parts above.
When geno.decomp
is specified, the genotypic variance is decomposed
following the variable(s) chosen. For example, when a treatment is used in
geno.decomp
, the initial model becomes:
trait = treatment + treatment:genotype + e
An object of class fitMod
, a list of fitted models.
Maria Xose Rodriguez-Alvarez, Martin P. Boer, Fred A. van Eeuwijk, Paul H.C. Eilers (2017). Correcting for spatial heterogeneity in plant breeding experiments with P-splines. Spatial Statistics doi:10.1016/j.spasta.2017.10.003 Butler, D. G., et al. (2018). ASReml-R Reference Manual Version 4. VSN International Ltd, http://asreml.org
Other functions for spatial modeling:
getCorrected()
,
getEffDims()
,
getGenoPred()
,
getHerit()
,
getVar()
,
plot.fitMod()
,
summary.fitMod()
## Using the first example dataset (PhenovatorDat1): ## Fit a model using SpATS on few time points: ## Create an object of class TP. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Fit a model with SpATS for three time points. modPhenoSp <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(3, 6, 20)) summary(modPhenoSp) ## Fit a model with SpATS for a single time point with extra fixed factors ## and check genotypes: modPhenoSpCheck <- fitModels(TP = phenoTP, trait = "EffpsII", extraFixedFactors = c("repId", "Image_pos"), useCheck = TRUE, timePoints = 3) ## Fit a model with asreml on few time points with a spatial function: if (requireNamespace("asreml", quietly = TRUE)) { modPhenoSpAs <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(1, 6, 20), engine = "asreml", spatial = TRUE) } ## Using the second example dataset (PhenoarchDat1): ## Fit a model with SpATS on one time points with two variables for ## geno.decomp: data("PhenoarchDat1") phenoTParch <- createTimePoints(dat = PhenoarchDat1, experimentName = "Phenoarch", genotype = "Genotype", timePoint = "Date", plotId = "pos", rowNum = "Row", colNum = "Col") modPhenoSpGD <- fitModels(TP = phenoTParch, trait = "LeafArea", geno.decomp = c("Scenario", "population"), timePoints = 16)
## Using the first example dataset (PhenovatorDat1): ## Fit a model using SpATS on few time points: ## Create an object of class TP. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Fit a model with SpATS for three time points. modPhenoSp <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(3, 6, 20)) summary(modPhenoSp) ## Fit a model with SpATS for a single time point with extra fixed factors ## and check genotypes: modPhenoSpCheck <- fitModels(TP = phenoTP, trait = "EffpsII", extraFixedFactors = c("repId", "Image_pos"), useCheck = TRUE, timePoints = 3) ## Fit a model with asreml on few time points with a spatial function: if (requireNamespace("asreml", quietly = TRUE)) { modPhenoSpAs <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(1, 6, 20), engine = "asreml", spatial = TRUE) } ## Using the second example dataset (PhenoarchDat1): ## Fit a model with SpATS on one time points with two variables for ## geno.decomp: data("PhenoarchDat1") phenoTParch <- createTimePoints(dat = PhenoarchDat1, experimentName = "Phenoarch", genotype = "Genotype", timePoint = "Date", plotId = "pos", rowNum = "Row", colNum = "Col") modPhenoSpGD <- fitModels(TP = phenoTParch, trait = "LeafArea", geno.decomp = c("Scenario", "population"), timePoints = 16)
Fit P-Splines on corrected or raw data. The number of
knots is chosen by the user. The function outputs are predicted P-Spline
values and their first and second derivatives on a dense grid. The
outputs can then be used for outlier detection for time series
(see detectSerieOut
) and to estimate relevant parameters from
the curve for further analysis (see estimateSplineParameters
).
fitSpline( inDat, trait, genotypes = NULL, plotIds = NULL, knots = 50, useTimeNumber = FALSE, timeNumber = NULL, minNoTP = NULL )
fitSpline( inDat, trait, genotypes = NULL, plotIds = NULL, knots = 50, useTimeNumber = FALSE, timeNumber = NULL, minNoTP = NULL )
inDat |
A data.frame with corrected spatial data. |
trait |
A character string indicating the trait for which the spline should be fitted. |
genotypes |
A character vector indicating the genotypes for which
splines should be fitted. If |
plotIds |
A character vector indicating the plotIds for which splines
should be fitted. If |
knots |
The number of knots to use when fitting the spline. |
useTimeNumber |
Should the timeNumber be used instead of the timePoint? |
timeNumber |
If |
minNoTP |
The minimum number of time points for which data should be available for a plant. Defaults to 80% of all time points present in the TP object. No splines are fitted for plants with less than the minimum number of timepoints. |
An object of class HTPSpline
, a list with two
data.frames
, predDat
with predicted values and coefDat
with P-Spline coefficients on a dense grid.
Other functions for fitting splines:
plot.HTPSpline()
## The data from the Phenovator platform have been corrected for spatial ## trends and outliers for single observations have been removed. ## Fit P-Splines on a subset of genotypes subGeno <- c("G070", "G160") fit.spline <- fitSpline(inDat = spatCorrectedVator, trait = "EffpsII_corr", genotypes = subGeno, knots = 50) ## Extract the data.frames with predicted values and P-Spline coefficients. predDat <- fit.spline$predDat head(predDat) coefDat <- fit.spline$coefDat head(coefDat) ## Visualize the P-Spline predictions for one genotype. plot(fit.spline, genotypes = "G160") ## Visualize the P-Spline predictions and first derivatives for one plant. plot(fit.spline, plotIds = "c10r29", plotType = "predictions") plot(fit.spline, plotIds = "c10r29", plotType = "derivatives")
## The data from the Phenovator platform have been corrected for spatial ## trends and outliers for single observations have been removed. ## Fit P-Splines on a subset of genotypes subGeno <- c("G070", "G160") fit.spline <- fitSpline(inDat = spatCorrectedVator, trait = "EffpsII_corr", genotypes = subGeno, knots = 50) ## Extract the data.frames with predicted values and P-Spline coefficients. predDat <- fit.spline$predDat head(predDat) coefDat <- fit.spline$coefDat head(coefDat) ## Visualize the P-Spline predictions for one genotype. plot(fit.spline, genotypes = "G160") ## Visualize the P-Spline predictions and first derivatives for one plant. plot(fit.spline, plotIds = "c10r29", plotType = "predictions") plot(fit.spline, plotIds = "c10r29", plotType = "derivatives")
Fit the P-spline Hierarchical Curve Data Model used in the second stage of
the two-stage approach proposed by Pérez-Valencia et al. (2022). This model
assumes a three-level hierarchical structure in the data, with plants nested
in genotypes, genotypes nested in populations. The input for this function
is the spatially corrected data, as obtained from the first stage of the
approach (see fitModels
and getCorrected
).
The number of segments is chosen by the user, as well as the B-spline degree,
and the penalty order for the three-levels of the hierarchy. The user can
also decide if different variances for random effects at genotype (separately
for each population) and plant (separately for each genotype) levels are
desired. The function outputs are estimated curves (time series of trajectories
and deviations) and their first and second derivatives for the three-levels
of the hierarchy. The outputs can then be used to estimate relevant parameters
from the curves for further analysis (see estimateSplineParameters
).
fitSplineHDM( inDat, genotypes = NULL, plotIds = NULL, trait, useTimeNumber = FALSE, timeNumber = NULL, pop = "pop", genotype = "genotype", plotId = "plotId", weights = NULL, difVar = list(geno = FALSE, plot = FALSE), smoothPop = list(nseg = 10, bdeg = 3, pord = 2), smoothGeno = list(nseg = 10, bdeg = 3, pord = 2), smoothPlot = list(nseg = 10, bdeg = 3, pord = 2), offset = NULL, family = gaussian(), maxit = 200, trace = TRUE, thr = 0.001, minNoTP = NULL )
fitSplineHDM( inDat, genotypes = NULL, plotIds = NULL, trait, useTimeNumber = FALSE, timeNumber = NULL, pop = "pop", genotype = "genotype", plotId = "plotId", weights = NULL, difVar = list(geno = FALSE, plot = FALSE), smoothPop = list(nseg = 10, bdeg = 3, pord = 2), smoothGeno = list(nseg = 10, bdeg = 3, pord = 2), smoothPlot = list(nseg = 10, bdeg = 3, pord = 2), offset = NULL, family = gaussian(), maxit = 200, trace = TRUE, thr = 0.001, minNoTP = NULL )
inDat |
A data.frame with corrected spatial data. |
genotypes |
A character vector indicating the genotypes for which
hierarchical models should be fitted. If |
plotIds |
A character vector indicating the plotIds for which
hierarchical models should be fitted. If |
trait |
A character string indicating the trait for which the spline should be fitted. |
useTimeNumber |
Should the timeNumber be used instead of the timePoint?.
If |
timeNumber |
If |
pop |
A character string indicating the the populations to which each genotype/variety belongs. This variable must be a factor in the data frame. |
genotype |
A character string indicating the populations to which each genotype/variety belongs. This variable must be a factor in the data frame. |
plotId |
A character string indicating the genotypes/varieties to which each plant/plot/individual belongs. This variable must be a factor in the data frame. |
weights |
A character string indicating the column in the data containing
the weights to be used in the fitting process (for error propagation from
first stage to second stage). By default, when |
difVar |
Should different variances for random effects at genotype (separately for each population) and plant level (separately for each genotype) be considered?. |
smoothPop |
A list specifying the P-Spline model at the population level (nseg: number of segments; bdeg: degree of the B-spline basis; pord: penalty order). |
smoothGeno |
A list specifying the P-Spline model at the genotype level. |
smoothPlot |
A list specifying the P-Spline model at the plant level. |
offset |
A character string indicating the column in the data with
an a priori known component to be included in the linear predictor during
fitting. By default, when |
family |
An object of class |
maxit |
An optional value that controls the maximum number of iterations of the algorithm. The default is 200. |
trace |
An optional value that controls the function trace.
The default is |
thr |
An optional value that controls the convergence threshold of the algorithm. The default is 1.e-03. |
minNoTP |
The minimum number of time points for which data should be available for a plant. Defaults to 60% of all time points present in the TP object. No splines are fitted for plants with less than the minimum number of timepoints. |
An object of class psHDM
, a list with the following outputs:
time
, a numeric vector with the timepoints.
popLevs
, a data.frame with the names of the populations
genoLevs
, a factor with the names of the genotypes.
plotLevs
, a factor with the names of the plants
nPlotPop
, a numeric vector with the number of plants per
population.
nGenoPop
, a numeric vector with the number of genotypes per
population.
nPlotGeno
, a numeric vector with the number of plants per
genotype.
MM
, a list with the design matrices at plant, genotype and
population levels.
ed
, a numeric vector with the estimated effective dimension
(or effective degrees of freedom) for each random component of the
model (intercept, slope and non-linear trend) at each level of the
hierarchy (population, genotype and plant)
tot_ed
, a numeric value with the sum of the effective
dimensions for all components of the model.
vc
, a numeric vector with the (REML) variance component
estimates for each random component of the model (intercept,
slope and non-linear trend) at each level of the hierarchy
(population, genotype and plant)
phi
, a numeric value with the error variance estimate.
coeff
, a numeric vector with the estimated fixed and random
effect coefficients.
popLevel
, a data.frame with the estimated population trajectories
and first and second order derivatives.
genoLevel
, a data.frame with the estimated genotype-specific
deviations and trajectories, and their respective first and second
order derivatives.
plotLevel
, a data.frame with the estimated plant-specific
deviations and trajectories, and their respective first and second
order derivatives.
deviance
, the (REML) deviance at convergence.
convergence
, a logical value indicating whether the algorithm
managed to converge before the given number of iterations.
dim
, a numeric vector with the (model) dimension of each
model component (fixed and/or random) at each level of the
hierarchy (population, genotype, and plant).
These values correspond to the number of parameters to be estimated.
family
, an object of class family specifying the distribution
and link function.
cholHn
, the inverse of the variance-covariance matrix for the
coefficients.
smooth
, a list with the information about number of segments
(nseg), degree of the B-spline basis (bdeg) and penalty order (pord)
used for the three levels of the hierarchy.
Pérez-Valencia, D.M., Rodríguez-Álvarez, M.X., Boer, M.P. et al. A two-stage approach for the spatio-temporal analysis of high-throughput phenotyping data. Sci Rep 12, 3177 (2022). doi:10.1038/s41598-022-06935-9
Other functions for fitting hierarchical curve data models:
plot.psHDM()
,
predict.psHDM()
## The data from the Phenovator platform have been corrected for spatial ## trends and outliers for single observations have been removed. head(spatCorrectedArch) ggplot2::ggplot(data = spatCorrectedArch, ggplot2::aes(x= timeNumber, y = LeafArea_corr, group = plotId)) + ggplot2::geom_line(na.rm = TRUE) + ggplot2::facet_grid(~geno.decomp) ## We need to specify the genotype-by-treatment interaction. ## Treatment: water regime (WW, WD). spatCorrectedArch[["treat"]] <- substr(spatCorrectedArch[["geno.decomp"]], start = 1, stop = 2) spatCorrectedArch[["genoTreat"]] <- interaction(spatCorrectedArch[["genotype"]], spatCorrectedArch[["treat"]], sep = "_") ## Fit P-Splines Hierarchical Curve Data Model for selection of genotypes. fit.psHDM <- fitSplineHDM(inDat = spatCorrectedArch, trait = "LeafArea_corr", useTimeNumber = TRUE, timeNumber = "timeNumber", genotypes = c("GenoA14_WD", "GenoA51_WD", "GenoB11_WW", "GenoB02_WD", "GenoB02_WW"), pop = "geno.decomp", genotype = "genoTreat", plotId = "plotId", weights = "wt", difVar = list(geno = FALSE, plot = FALSE), smoothPop = list(nseg = 4, bdeg = 3, pord = 2), smoothGeno = list(nseg = 4, bdeg = 3, pord = 2), smoothPlot = list(nseg = 4, bdeg = 3, pord = 2), trace = FALSE) ## Visualize the data.frames with predicted values at the three levels of ## the hierarchy. # Population level head(fit.psHDM$popLevel) # Genotype level head(fit.psHDM$genoLevel) # Plot level head(fit.psHDM$plotLevel)
## The data from the Phenovator platform have been corrected for spatial ## trends and outliers for single observations have been removed. head(spatCorrectedArch) ggplot2::ggplot(data = spatCorrectedArch, ggplot2::aes(x= timeNumber, y = LeafArea_corr, group = plotId)) + ggplot2::geom_line(na.rm = TRUE) + ggplot2::facet_grid(~geno.decomp) ## We need to specify the genotype-by-treatment interaction. ## Treatment: water regime (WW, WD). spatCorrectedArch[["treat"]] <- substr(spatCorrectedArch[["geno.decomp"]], start = 1, stop = 2) spatCorrectedArch[["genoTreat"]] <- interaction(spatCorrectedArch[["genotype"]], spatCorrectedArch[["treat"]], sep = "_") ## Fit P-Splines Hierarchical Curve Data Model for selection of genotypes. fit.psHDM <- fitSplineHDM(inDat = spatCorrectedArch, trait = "LeafArea_corr", useTimeNumber = TRUE, timeNumber = "timeNumber", genotypes = c("GenoA14_WD", "GenoA51_WD", "GenoB11_WW", "GenoB02_WD", "GenoB02_WW"), pop = "geno.decomp", genotype = "genoTreat", plotId = "plotId", weights = "wt", difVar = list(geno = FALSE, plot = FALSE), smoothPop = list(nseg = 4, bdeg = 3, pord = 2), smoothGeno = list(nseg = 4, bdeg = 3, pord = 2), smoothPlot = list(nseg = 4, bdeg = 3, pord = 2), trace = FALSE) ## Visualize the data.frames with predicted values at the three levels of ## the hierarchy. # Population level head(fit.psHDM$popLevel) # Genotype level head(fit.psHDM$genoLevel) # Plot level head(fit.psHDM$plotLevel)
Extract corrected phenotypic values from an object of class fitMod. After fitting a spatial model at each time point, the raw phenotypic data is corrected by subtracting the (estimated) sources of variation (environmental, design effect) that are of no interest (nuisances). This allows keeping the data resolution at the plot/plant level.
getCorrected(fitMod, timePoints = names(fitMod), outFile = NULL)
getCorrected(fitMod, timePoints = names(fitMod), outFile = NULL)
fitMod |
An object of class |
timePoints |
A character or numeric vector indicating the time point(s) for which the corrected values should be extracted. When using a character string to reference a time point, the value has to be an exact match to one of the existing time points. When using a number it will be matched by its number ("timeNumber") in the timePoints attribute of the TP object. |
outFile |
A character string indicating the .csv file to which the
results should be written. If |
A data.frame with spatially corrected values per time point.
Other functions for spatial modeling:
fitModels()
,
getEffDims()
,
getGenoPred()
,
getHerit()
,
getVar()
,
plot.fitMod()
,
summary.fitMod()
## Using the first example dataset (PhenovatorDat1). phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Fit a SpATS model on few time points. modPhenoSp <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(1, 6, 20)) ## Extract the corrected values for one time point: spatCorrSp <- getCorrected(modPhenoSp, timePoints = 6) head(spatCorrSp)
## Using the first example dataset (PhenovatorDat1). phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Fit a SpATS model on few time points. modPhenoSp <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(1, 6, 20)) ## Extract the corrected values for one time point: spatCorrSp <- getCorrected(modPhenoSp, timePoints = 6) head(spatCorrSp)
Extract effective dimensions from an object of class fitMod. The table below gives an overview of the effective dimensions and an explanation of their meaning.
Effective Dimension | Explanation |
colId | Linear trend along columns |
rowId | Linear trend along rows |
fCol | Smooth trend along columns |
fRow | Smooth trend along rows |
fColRow | Linear trend in rows changing smoothly along cols |
colfRow | Linear trend in cols changing smoothly along rows |
fColfRow | Smooth-by-smooth interaction trend over rows and cols |
surface | Sum of smooth trends |
getEffDims( fitMod, timePoints = names(fitMod), EDType = c("dimension", "ratio"), outFile = NULL )
getEffDims( fitMod, timePoints = names(fitMod), EDType = c("dimension", "ratio"), outFile = NULL )
fitMod |
An object of class |
timePoints |
A character or numeric vector indicating the time point(s) for which the effective dimension should be extracted. When using a character string to reference a time point, the value has to be an exact match to one of the existing time points. When using a number it will be matched by its number ("timeNumber") in the timePoints attribute of the TP object. |
EDType |
A character string specifying if the effective dimension ("dimension") or the ratio of effective dimensions ("ratio") should be returned. |
outFile |
A character string indicating the .csv file to which the
results should be written. If |
A data.frame with effective dimensions per time point.
Other functions for spatial modeling:
fitModels()
,
getCorrected()
,
getGenoPred()
,
getHerit()
,
getVar()
,
plot.fitMod()
,
summary.fitMod()
## Using the first example dataset (PhenovatorDat1): data("PhenovatorDat1") phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Fit a SpATS model on few time points: modPhenoSp <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(1, 6, 20)) ## Extract the effective dimensions for all available time points in the ## model object: effDimSp <- getEffDims(modPhenoSp)
## Using the first example dataset (PhenovatorDat1): data("PhenovatorDat1") phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Fit a SpATS model on few time points: modPhenoSp <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(1, 6, 20)) ## Extract the effective dimensions for all available time points in the ## model object: effDimSp <- getEffDims(modPhenoSp)
Extract predictions of the genotypic value from an object of class
fitMod
.
getGenoPred( fitMod, timePoints = names(fitMod), predictChecks = FALSE, outFile = NULL )
getGenoPred( fitMod, timePoints = names(fitMod), predictChecks = FALSE, outFile = NULL )
fitMod |
An object of class |
timePoints |
A character or numeric vector indicating the time point(s) for which the predictions should be extracted. When using a character string to reference a time point, the value has to be an exact match to one of the existing time points. When using a number it will be matched by its number ("timeNumber") in the timePoints attribute of the TP object. |
predictChecks |
Should predictions of the check genotypes be included
in the ouptut. If |
outFile |
A character string indicating the .csv file to which the
results should be written. If |
A list of two data.frames with predicted genotypic values per time
point. genoPred
with the predicted values for the genotypes and
checkPred
with the predicted values for the checks. If
predictChecks = FALSE
the latter will be NULL
.
Other functions for spatial modeling:
fitModels()
,
getCorrected()
,
getEffDims()
,
getHerit()
,
getVar()
,
plot.fitMod()
,
summary.fitMod()
## Using the first example dataset (PhenovatorDat1). phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Fit a SpATS model on few time points. modPhenoSp <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(1, 6, 20)) ## Extract the genotypic predictions for one time point: genoPredSp <- getGenoPred(modPhenoSp, timePoints = 6) head(genoPredSp)
## Using the first example dataset (PhenovatorDat1). phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Fit a SpATS model on few time points. modPhenoSp <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(1, 6, 20)) ## Extract the genotypic predictions for one time point: genoPredSp <- getGenoPred(modPhenoSp, timePoints = 6) head(genoPredSp)
Extract heritabilities from an object of class fitMod. When
geno.decomp
is used, the heritabilities of each level
of geno.decomp are stored in separate columns.
getHerit(fitMod, timePoints = names(fitMod), outFile = NULL)
getHerit(fitMod, timePoints = names(fitMod), outFile = NULL)
fitMod |
An object of class |
timePoints |
A character or numeric vector indicating the time point(s) for which the heritabilities should be extracted. When using a character string to reference a time point, the value has to be an exact match to one of the existing time points. When using a number it will be matched by its number ("timeNumber") in the timePoints attribute of the TP object. |
outFile |
A character string indicating the .csv file to which the
results should be written. If |
A data.frame with heritabilities per time point.
Other functions for spatial modeling:
fitModels()
,
getCorrected()
,
getEffDims()
,
getGenoPred()
,
getVar()
,
plot.fitMod()
,
summary.fitMod()
## Using the first example dataset (PhenovatorDat1): phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Fit a SpATS model on few time points. modPhenoSp <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(1, 6, 20)) ## Extract the heritabilities for all available time points.#' getHerit(modPhenoSp)
## Using the first example dataset (PhenovatorDat1): phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Fit a SpATS model on few time points. modPhenoSp <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(1, 6, 20)) ## Extract the heritabilities for all available time points.#' getHerit(modPhenoSp)
Function for extracting a data.frame with timeNumbers and timePoints from an object of class TP or fitMod.
getTimePoints(x)
getTimePoints(x)
x |
An object of class TP or fitMod |
A data.frame with columns timeNumber and timePoint listing the time points in x
Other functions for data preparation:
as.data.frame.TP()
,
createTimePoints()
,
plot.TP()
,
removeTimePoints()
,
summary.TP()
## Create an object of class TP. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Extract the time points from the object. head(getTimePoints(phenoTP))
## Create an object of class TP. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Extract the time points from the object. head(getTimePoints(phenoTP))
Extract variances from an object of class fitMod.
getVar(fitMod, timePoints = names(fitMod), outFile = NULL)
getVar(fitMod, timePoints = names(fitMod), outFile = NULL)
fitMod |
An object of class |
timePoints |
A character or numeric vector indicating the time point(s) for which the variances should be extracted. When using a character string to reference a time point, the value has to be an exact match to one of the existing time points. When using a number it will be matched by its number ("timeNumber") in the timePoints attribute of the TP object. |
outFile |
A character string indicating the .csv file to which the
results should be written. If |
A data.frame with variances per time point.
Other functions for spatial modeling:
fitModels()
,
getCorrected()
,
getEffDims()
,
getGenoPred()
,
getHerit()
,
plot.fitMod()
,
summary.fitMod()
## Using the first example dataset (PhenovatorDat1): phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Fit a SpATS model on few time points. modPhenoSp <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(1, 6, 20)) ## Extract the variances for all available time points. getVar(modPhenoSp)
## Using the first example dataset (PhenovatorDat1): phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Fit a SpATS model on few time points. modPhenoSp <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(1, 6, 20)) ## Extract the variances for all available time points. getVar(modPhenoSp)
This dataset contains the corrected data obtained by removing outliers for single observations on the RootDat1 dataset. See the vignettes for details.
noCorrectedRoot
noCorrectedRoot
A data.frame with 15,934 rows and 8 columns:
Original time points, date and time
Date
Thermal time cumulated
Experiment number
Genotypes
Tank in the greenhouse
Unique pot ID using rowcol coordinates
Row coordinate
Column coordinate
Position of the root tip in y axis
A dataset containing data from a greenhouse experiment with maize in the Phenoarch platform (INRAE, France, Cabrera-Bosquet et al. 2016). It consists of one experiment with 1,671 plants grown in a greenhouse under two water scenarios, well-watered (WW) and water deficit (WD). There are two populations of genotypes, panel 1 and panel 2. Panel 1 contains 60 genotypes with 14 replicates: 7 in WW and 7 in WD. Panel 2 contains 30 genotypes with 8 replicates, 4 in WW and 4 in WD. The studied trait is the leaf area extracted from the pictures over time (LeafArea). Plants were pictured every day for 33 days. This dataset was kindly provided by Llorenç Cabrera-Bosquet and Claude Welcker.
PhenoarchDat1
PhenoarchDat1
A data.frame with 42,536 rows and 14 columns:
Date of measurement
Unique pot using rowcol coordinate
Genotype
Water regime, WW or WD
Panel 1 or 2
Pot position on the conveyor belt (i.e. row coordinate)
Line of conveyor belt (i.e. column coordinate)
Biomass from the picture
Leaf area from the picture
Plant height from the picture
Number of leaves manually scored
Leaf emission rate
Cabrera-Bosquet, Llorenç, Fournier Christian, Brichet Nicolas, Welcker Claude, Suard Benoît, and Tardieu François. 2016. “High-throughput estimation of incident light, light interception and radiation-use efficiency of thousands of plants in a phenotyping platform.” New Phytologist 212 (1): 269-81. doi:10.1111/nph.14027
A dataset containing data from a growth chamber experiment with Arabidopsis in the Phenovator platform (WUR, Netherlands, Flood et al. 2016). It consists of one experiment with 1,440 plants grown in a growth chamber. The number of tested genotypes is 192 with 6 to 7 replicates per genotype. Four reference genotypes were also tested with 15 or 30 replicates. The studied trait is the photosystem II efficiency (EffpsII) extracted from the pictures over time (van Rooijen et al. 2017). This dataset was kindly provided by René Boesten and Mark Aarts.
PhenovatorDat1
PhenovatorDat1
A data.frame with 103,839 rows and 10 columns:
Genotypes
Table of experiment
Block define after sowing for post-blocking. They are not full-resolvable blocks.
Position of the camera
Row coordinate
Column coordinate
Unique pot ID
time of picture
Efficiency of the photosystem II
Unique pot ID using rowcol coordinates
Rooijen, Roxanne van, Willem Kruijer, René Boesten, Fred A. van Eeuwijk, Jeremy Harbinson, and Mark G. M. Aarts. 2017. “Natural Variation of YELLOW SEEDLING1 Affects Photosynthetic Acclimation of Arabidopsis Thaliana.” Nature Communications 8 (1). doi:10.1038/s41467-017-01576-3
Flood, Pádraic J., Willem Kruijer, Sabine K. Schnabel, Rob van der Schoor, Henk Jalink, Jan F. H. Snel, Jeremy Harbinson, and Mark G. M. Aarts. 2016. “Phenomics for Photosynthesis, Growth and Reflectance in Arabidopsis Thaliana Reveals Circadian and Long-Term Fluctuations in Heritability.” Plant Methods 12 (1): 14. doi:10.1186/s13007-016-0113-y
Plotting function for objects of class fitMod
. Seven different types
of plots can be made for an object of class fitMod
. A detailed
description and optional extra parameters for the different plots is given
in the sections below.
## S3 method for class 'fitMod' plot( x, ..., plotType = c("rawPred", "corrPred", "herit", "effDim", "variance", "timeLapse", "spatial"), timePoints = names(x), title = NULL, output = TRUE, outFile = NULL, outFileOpts = NULL )
## S3 method for class 'fitMod' plot( x, ..., plotType = c("rawPred", "corrPred", "herit", "effDim", "variance", "timeLapse", "spatial"), timePoints = names(x), title = NULL, output = TRUE, outFile = NULL, outFileOpts = NULL )
x |
An object of class fitMod. |
... |
Extra plot options. Described per plotType in their respective section. |
plotType |
A single character string indicating which plot should be made. See the sections below for a detailed explanation of the plots. |
timePoints |
A character or numeric vector indicating the time points to be plotted. When using a character string to reference a time point, the value has to be an exact match to one of the existing timePoints. When using a number it will be matched by its number ("timeNumber") in the timePoints attribute of the TP object. |
title |
A character string used as title for the plot. If |
output |
Should the plot be output to the current device? If
|
outFile |
A character string indicating the .pdf file or .gif file
(for |
outFileOpts |
A named list of extra options for the pdf outfile, e.g.
width and height. See |
Depending on the plot type either a ggplot object or a list of ggplot objects is invisibly returned.
Plots the raw data (colored dots) overlayed with the predicted values from
the fitted model (black dots). For each genotype a plot is made per
plot/plant over time. These plots are put together in a 5x5 grid. By using
the parameter genotypes
a selection of genotypes can be plotted.
Extra parameter options:
A character vector indicating the genotypes to be plotted.
Should the check genotypes be included in the plot?
Should the data be displayed as lines? Default is
FALSE
.
Plots the spatially corrected data (colored dots) overlayed with the
predicted values from the fitted model (black dors). For each genotype a plot
is made per plot/plant over time. These plots are put together in a 5x5 grid.
By using the parameter genotypes
a selection of genotypes can be
plotted. Extra parameter options:
A character vector indicating the genotypes to be plotted.
Should the check genotypes be included in the plot?
Should the data be displayed as lines? Default is
FALSE
.
Plots the heritability over time. This plot is only available when genotype
is fitted as random factor in the model. If geno.decomp
is used when
fitting the model, heritabilities are plotted for each level of geno.decomp
in a single plot. Extra parameter options:
A numerical vector of length two, used for setting the limits of the y-axis of the plot. If values outside of the plotting range are given, then these are ignored.
Plots the effective dimension over time for models fitted using SpATS. Extra parameter options:
A character vector indicating which effective dimensions
should be plotted. This should be a subset of "colId", "rowId", "fCol",
"fRow", "fColRow", "colfRow", "fColfRow" and "surface". When
useRepId = TRUE
, the effective dimensions of "colId" and "rowId"
become "RepId:colId" and "RepId:rowId". Default all effective dimensions
are plotted.
A character string specifying if the effective dimension ("dimension") or the ratio of effective dimensions ("ratio") should be plotted. Default the dimensions are plotted.
A numerical vector of length two, used for setting the limits of the y-axis of the plot. If values outside of the plotting range are given, then these are ignored.
Plots the residual, column and row variances over time for the fitted models. Extra parameter options:
A numerical vector of length two, used for setting the limits of the y-axis of the plot. If values outside of the plotting range are given, then these are ignored.
Creates a time lapse of the spatial trends of models fitted using SpATS over time.
Creates five plots per time point, spatial plots of the raw data, fitted values, residuals and either BLUEs or BLUPs, and a histogram of the BLUEs or BLUPs. When SpATS was used for modeling an extra plot with the fitted spatial trend is included Extra parameter options:
A character string indicating how the spatial trend should be displayed. Either "raw" for raw values, or "percentage" for displaying as a percentage of the original phenotypic values.
Other functions for spatial modeling:
fitModels()
,
getCorrected()
,
getEffDims()
,
getGenoPred()
,
getHerit()
,
getVar()
,
summary.fitMod()
## Using the first example dataset (PhenovatorDat1): ## Create an object of class TP. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Fit a SpATS model on three points: modPhenoSp <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(1, 6, 36)) ## Plot the spatial trends for one time point: plot(modPhenoSp, timePoints = 36, plotType = "spatial", spaTrend = "percentage") ## Not run: ## Create a time lapse of all available time points: plot(modPhenoSp, plotType = "timeLapse", outFile = "TimeLapse_modPhenoSp.gif") ## End(Not run) ## Plot the corrected values for a subset of four genotypes: plot(modPhenoSp, plotType = "corrPred", genotypes = c("check1", "check2", "G007", "G058") ) ## Plot the effective dimensions of all available time points in the model ## for a subset of effective dimensions: plot(modPhenoSp, plotType = "effDim", whichED = c("colId", "rowId", "fColRow","colfRow"), EDType = "ratio")
## Using the first example dataset (PhenovatorDat1): ## Create an object of class TP. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Fit a SpATS model on three points: modPhenoSp <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(1, 6, 36)) ## Plot the spatial trends for one time point: plot(modPhenoSp, timePoints = 36, plotType = "spatial", spaTrend = "percentage") ## Not run: ## Create a time lapse of all available time points: plot(modPhenoSp, plotType = "timeLapse", outFile = "TimeLapse_modPhenoSp.gif") ## End(Not run) ## Plot the corrected values for a subset of four genotypes: plot(modPhenoSp, plotType = "corrPred", genotypes = c("check1", "check2", "G007", "G058") ) ## Plot the effective dimensions of all available time points in the model ## for a subset of effective dimensions: plot(modPhenoSp, plotType = "effDim", whichED = c("colId", "rowId", "fColRow","colfRow"), EDType = "ratio")
Plot the results of a fitted spline.
## S3 method for class 'HTPSpline' plot( x, ..., plotType = c("predictions", "derivatives", "derivatives2"), genotypes = NULL, plotIds = NULL, title = NULL, output = TRUE, outFile = NULL, outFileOpts = NULL )
## S3 method for class 'HTPSpline' plot( x, ..., plotType = c("predictions", "derivatives", "derivatives2"), genotypes = NULL, plotIds = NULL, title = NULL, output = TRUE, outFile = NULL, outFileOpts = NULL )
x |
An object of class |
... |
Ignored. |
plotType |
A character string indicating which spline component should be plotted, either predictions, derivatives or second derivatives ("derivatives2"). |
genotypes |
A character vector indicating the genotypes for which spline components should be plotted. |
plotIds |
A character vector indicating the plotIds for which spline components should be plotted. |
title |
A character string used as title for the plot. If |
output |
Should the plot be output to the current device? If
|
outFile |
A character string indicating the .pdf file to which the
plots should be written. If |
outFileOpts |
A named list of extra options for the pdf outfile, e.g.
width and height. See |
A list of object of class ggplot is invisibly returned.
Other functions for fitting splines:
fitSpline()
## The data from the Phenovator platform have been corrected for spatial ## trends and outliers for single observations have been removed. ## Fit P-Splines on a subset of genotypes subGeno <- c("G070", "G160") fit.spline <- fitSpline(inDat = spatCorrectedVator, trait = "EffpsII_corr", genotypes = subGeno, knots = 50) ## Visualize the P-Spline predictions for one genotype. plot(fit.spline, genotypes = "G160") ## Visualize the first and second derivatives of the predictions for one plant. plot(fit.spline, plotIds = "c10r29", plotType = "derivatives") plot(fit.spline, plotIds = "c10r29", plotType = "derivatives2")
## The data from the Phenovator platform have been corrected for spatial ## trends and outliers for single observations have been removed. ## Fit P-Splines on a subset of genotypes subGeno <- c("G070", "G160") fit.spline <- fitSpline(inDat = spatCorrectedVator, trait = "EffpsII_corr", genotypes = subGeno, knots = 50) ## Visualize the P-Spline predictions for one genotype. plot(fit.spline, genotypes = "G160") ## Visualize the first and second derivatives of the predictions for one plant. plot(fit.spline, plotIds = "c10r29", plotType = "derivatives") plot(fit.spline, plotIds = "c10r29", plotType = "derivatives2")
This plot function provides five plots for objects of the class psHDM
after fitting (fitSplineHDM
) or predicting
(predict.psHDM
): (1) Population-specific growth curves (popTra),
(2) Population and genotype-specific growth curves (for all genotypes, popGenoTra),
(3) First-order derivative of the population and genotype-specific growth curves
(for all genotypes, popGenoDeriv), (4) Genotype-specific deviations (for all
genotypes, genoDev), and (5) Genotype- and plot-specific growth curves (for a
selection of genotypes, genoPlotTra). If standard errors are available, 95% point
wise confidence intervals are depicted.
## S3 method for class 'psHDM' plot( x, ..., plotType = c("popTra", "popGenoTra", "popGenoDeriv", "genoDev", "genoPlotTra"), genotypes = NULL, genotypeNames = NULL, genotypeOrder = NULL, xlab = "Time", ylab = expression(tilde(y)[pgi](t)), title = NULL, themeSizeHDM = 15, output = TRUE, outFile = NULL, outFileOpts = NULL )
## S3 method for class 'psHDM' plot( x, ..., plotType = c("popTra", "popGenoTra", "popGenoDeriv", "genoDev", "genoPlotTra"), genotypes = NULL, genotypeNames = NULL, genotypeOrder = NULL, xlab = "Time", ylab = expression(tilde(y)[pgi](t)), title = NULL, themeSizeHDM = 15, output = TRUE, outFile = NULL, outFileOpts = NULL )
x |
An object of class "psHDM" as obtained after fitting
( |
... |
Not used. |
plotType |
A character string indicating which plot should be made. |
genotypes |
A character vector with the genotypes for which plots at
plot level are desired. Only used when |
genotypeNames |
A character vector with alternative names for the
plotted genotypes (genotypes). If |
genotypeOrder |
A vector with the order of the selected genotypes
(genotypes). If |
xlab |
The x-axis label of the plot. |
ylab |
The y-axis label of the plot. |
title |
A character string used as title for the plot. If |
themeSizeHDM |
Reference size for the theme |
output |
Should the plot be output to the current device? If
|
outFile |
A character string indicating the .pdf file to which the
plots should be written. If |
outFileOpts |
A named list of extra options for the pdf outfile, e.g.
width and height. See |
Pérez-Valencia, D.M., Rodríguez-Álvarez, M.X., Boer, M.P. et al. A two-stage approach for the spatio-temporal analysis of high-throughput phenotyping data. Sci Rep 12, 3177 (2022). doi:10.1038/s41598-022-06935-9
Other functions for fitting hierarchical curve data models:
fitSplineHDM()
,
predict.psHDM()
## The data from the Phenovator platform have been corrected for spatial ## trends and outliers for single observations have been removed. ## We need to specify the genotype-by-treatment interaction. ## Treatment: water regime (WW, WD). spatCorrectedArch[["treat"]] <- substr(spatCorrectedArch[["geno.decomp"]], start = 1, stop = 2) spatCorrectedArch[["genoTreat"]] <- interaction(spatCorrectedArch[["genotype"]], spatCorrectedArch[["treat"]], sep = "_") ## Fit P-Splines Hierarchical Curve Data Model for selection of genotypes. fit.psHDM <- fitSplineHDM(inDat = spatCorrectedArch, trait = "LeafArea_corr", genotypes = c("GenoA14_WD", "GenoA51_WD", "GenoB11_WW", "GenoB02_WD", "GenoB02_WW"), time = "timeNumber", pop = "geno.decomp", genotype = "genoTreat", plotId = "plotId", difVar = list(geno = FALSE, plot = FALSE), smoothPop = list(nseg = 4, bdeg = 3, pord = 2), smoothGeno = list(nseg = 4, bdeg = 3, pord = 2), smoothPlot = list(nseg = 4, bdeg = 3, pord = 2), weights = "wt", trace = FALSE) ## Plot the P-Spline predictions at the three levels of the hierarchy ## Population-specific growth curves. plot(fit.psHDM, plotType = "popTra") ## Population and genotype-specific growth curves. plot(fit.psHDM, plotType = "popGenoTra") ## First-order derivative of the population- and genotype-specific growth curves. plot(fit.psHDM, plotType = "popGenoDeriv") ## Genotype-specific deviations. plot(fit.psHDM, plotType = "genoDev") ## Genotype- and plot-specific growth curves. plot(fit.psHDM, plotType = "genoPlotTra")
## The data from the Phenovator platform have been corrected for spatial ## trends and outliers for single observations have been removed. ## We need to specify the genotype-by-treatment interaction. ## Treatment: water regime (WW, WD). spatCorrectedArch[["treat"]] <- substr(spatCorrectedArch[["geno.decomp"]], start = 1, stop = 2) spatCorrectedArch[["genoTreat"]] <- interaction(spatCorrectedArch[["genotype"]], spatCorrectedArch[["treat"]], sep = "_") ## Fit P-Splines Hierarchical Curve Data Model for selection of genotypes. fit.psHDM <- fitSplineHDM(inDat = spatCorrectedArch, trait = "LeafArea_corr", genotypes = c("GenoA14_WD", "GenoA51_WD", "GenoB11_WW", "GenoB02_WD", "GenoB02_WW"), time = "timeNumber", pop = "geno.decomp", genotype = "genoTreat", plotId = "plotId", difVar = list(geno = FALSE, plot = FALSE), smoothPop = list(nseg = 4, bdeg = 3, pord = 2), smoothGeno = list(nseg = 4, bdeg = 3, pord = 2), smoothPlot = list(nseg = 4, bdeg = 3, pord = 2), weights = "wt", trace = FALSE) ## Plot the P-Spline predictions at the three levels of the hierarchy ## Population-specific growth curves. plot(fit.psHDM, plotType = "popTra") ## Population and genotype-specific growth curves. plot(fit.psHDM, plotType = "popGenoTra") ## First-order derivative of the population- and genotype-specific growth curves. plot(fit.psHDM, plotType = "popGenoDeriv") ## Genotype-specific deviations. plot(fit.psHDM, plotType = "genoDev") ## Genotype- and plot-specific growth curves. plot(fit.psHDM, plotType = "genoPlotTra")
Plot the fitted spline, correlation matrix and PCA biplot for each of the genotypes. Outlying series of observations are shown as filled dots in the fitted spline plot, other observations are shown as open dots.
## S3 method for class 'serieOut' plot( x, ..., reason = c("mean corr", "angle", "slope"), genotypes = NULL, geno.decomp = NULL, useTimeNumber = FALSE, timeNumber = NULL, title = NULL, output = TRUE )
## S3 method for class 'serieOut' plot( x, ..., reason = c("mean corr", "angle", "slope"), genotypes = NULL, geno.decomp = NULL, useTimeNumber = FALSE, timeNumber = NULL, title = NULL, output = TRUE )
x |
An object of class |
... |
Ignored. |
reason |
A character vector indicating which types of outliers should be plotted. |
genotypes |
A character vector indicating which genotypes should be
plotted. If |
geno.decomp |
A character vector indicating which levels of
|
useTimeNumber |
Should the timeNumber be used instead of the timePoint in the labels on the x-axis? |
timeNumber |
If |
title |
A character string used as title for the plot. If |
output |
Should the plot be output to the current device? If
|
A list of ggplot objects is invisibly returned.
Other functions for detecting outliers for series of observations:
detectSerieOut()
,
removeSerieOut()
## The data from the Phenovator platform have been corrected for spatial ## trends and outliers for single observations have been removed. ## Fit P-splines on a subset of genotypes subGenoVator <- c("G160", "G151") fit.spline <- fitSpline(inDat = spatCorrectedVator, trait = "EffpsII_corr", genotypes = subGenoVator, knots = 50) ## Extract the data.frames with predicted values and P-Spline coefficients. predDat <- fit.spline$predDat coefDat <- fit.spline$coefDat ## The coefficients are then used to tag suspect time courses. outVator <- detectSerieOut(corrDat = spatCorrectedVator, predDat = predDat, coefDat = coefDat, trait = "EffpsII_corr", genotypes = subGenoVator, thrCor = 0.9, thrPca = 30, thrSlope = 0.7) ## The `outVator` can be visualized for selected genotypes. plot(outVator, genotypes = "G151") ## Only visualize outliers tagged because of low correlation ## between slopes of the regression. plot(outVator, genotypes = "G151", reason = "slope")
## The data from the Phenovator platform have been corrected for spatial ## trends and outliers for single observations have been removed. ## Fit P-splines on a subset of genotypes subGenoVator <- c("G160", "G151") fit.spline <- fitSpline(inDat = spatCorrectedVator, trait = "EffpsII_corr", genotypes = subGenoVator, knots = 50) ## Extract the data.frames with predicted values and P-Spline coefficients. predDat <- fit.spline$predDat coefDat <- fit.spline$coefDat ## The coefficients are then used to tag suspect time courses. outVator <- detectSerieOut(corrDat = spatCorrectedVator, predDat = predDat, coefDat = coefDat, trait = "EffpsII_corr", genotypes = subGenoVator, thrCor = 0.9, thrPca = 30, thrSlope = 0.7) ## The `outVator` can be visualized for selected genotypes. plot(outVator, genotypes = "G151") ## Only visualize outliers tagged because of low correlation ## between slopes of the regression. plot(outVator, genotypes = "G151", reason = "slope")
Plot the fitted local regression, confidence intervals and detected outliers for each plotId.
## S3 method for class 'singleOut' plot(x, ..., plotIds = NULL, outOnly = TRUE, output = TRUE)
## S3 method for class 'singleOut' plot(x, ..., plotIds = NULL, outOnly = TRUE, output = TRUE)
x |
An object of class singleOut. |
... |
Ignored. |
plotIds |
A character vector of plotIds for which the outliers should be
detected. If |
outOnly |
Should only plots containing outliers be plotted? |
output |
Should the plot be output to the current device? If
|
A list of ggplot objects is invisibly returned.
Other functions for detecting outliers for single observations:
detectSingleOut()
,
detectSingleOutMaize()
,
removeSingleOut()
## Create a TP object containing the data from the Phenovator. PhenovatorDat1 <- PhenovatorDat1[!PhenovatorDat1$pos %in% c("c24r41", "c7r18", "c7r49"), ] phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Select a subset of plants, for example here 9 plants. plantSel <- phenoTP[[1]]$plotId[1:9] # Then run on the subset. resuVatorHTP <- detectSingleOut(TP = phenoTP, trait = "EffpsII", plotIds = plantSel, confIntSize = 3, nnLocfit = 0.1) ## Visualize the prediction by choosing a single plant... plot(resuVatorHTP, plotIds = "c21r24", outOnly = FALSE) ## ...or a subset of plants. plot(resuVatorHTP, plotIds = plantSel, outOnly = FALSE)
## Create a TP object containing the data from the Phenovator. PhenovatorDat1 <- PhenovatorDat1[!PhenovatorDat1$pos %in% c("c24r41", "c7r18", "c7r49"), ] phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Select a subset of plants, for example here 9 plants. plantSel <- phenoTP[[1]]$plotId[1:9] # Then run on the subset. resuVatorHTP <- detectSingleOut(TP = phenoTP, trait = "EffpsII", plotIds = plantSel, confIntSize = 3, nnLocfit = 0.1) ## Visualize the prediction by choosing a single plant... plot(resuVatorHTP, plotIds = "c21r24", outOnly = FALSE) ## ...or a subset of plants. plot(resuVatorHTP, plotIds = plantSel, outOnly = FALSE)
Plot the results of estimated spline parameters.
## S3 method for class 'splineEst' plot( x, ..., plotType = c("box", "hist"), what = attr(x, "what"), title = NULL, output = TRUE, outFile = NULL, outFileOpts = NULL )
## S3 method for class 'splineEst' plot( x, ..., plotType = c("box", "hist"), what = attr(x, "what"), title = NULL, output = TRUE, outFile = NULL, outFileOpts = NULL )
x |
An object of class |
... |
Ignored. |
plotType |
A character string indicating the type of plot to be made. |
what |
The types of estimate that should be plotted. |
title |
A character string used as title for the plot. If |
output |
Should the plot be output to the current device? If
|
outFile |
A character string indicating the .pdf file to which the
plots should be written. If |
outFileOpts |
A named list of extra options for the pdf outfile, e.g.
width and height. See |
A list of objects of class ggplot is invisibly returned.
Other functions for spline parameter estimation:
estimateSplineParameters()
Plotting function for objects of class TP. Plots the layout of the platform for different time points within the TP object. Also a boxplot can be made for selected traits and time points and a plot of correlations between time points. Finally the raw data can be displayed per genotype. A detailed description and optional extra parameters for the different plots are given in the sections below.
## S3 method for class 'TP' plot( x, ..., plotType = c("layout", "box", "cor", "raw"), timePoints = names(x), title = NULL, traits = NULL, output = TRUE, outFile = NULL, outFileOpts = NULL )
## S3 method for class 'TP' plot( x, ..., plotType = c("layout", "box", "cor", "raw"), timePoints = names(x), title = NULL, traits = NULL, output = TRUE, outFile = NULL, outFileOpts = NULL )
x |
An object of class TP. |
... |
Extra plot options. Described per plotType in their respective section. |
plotType |
A single character string indicating which plot should be made. See the sections below for a detailed explanation of the plots. |
timePoints |
A character or numeric vector indicating the time points to be plotted. When using a character string to reference a time point, the value has to be an exact match to one of the existing timePoints. When using a number it will be matched by its number ("timeNumber") in the timePoints attribute of the TP object. |
title |
A character string used as title for the plot. If |
traits |
A character vector indicating the traits to be plotted.
If |
output |
Should the plot be output to the current device? If
|
outFile |
A character string indicating the .pdf file to which the
plots should be written. If |
outFileOpts |
A named list of extra options for the pdf outfile, e.g.
width and height. See |
Depending on the plot type, either a ggplot object or a list of ggplot objects is invisibly returned.
Plots the layout of the platform for selected time points (all available time
points by default). This plot can only be made for time points that contain
both row (rowNum
) and column (colNum
) information. If either
one of those is missing the timePoint is skipped with a warning.
If replicates (repId
) are available, a black line is plotted between
diffent replicates. Missing plots are indicated in white. This can either be
single plots in a time point or complete missing columns or rows.
Extra parameter options:
Should individual genotypes be labeled in the plot?
Defaults to FALSE
A character vector of genotypes to be highlighted in the plot.
Creates a boxplot per selected trait grouped by time point (all available time points by default). Extra parameter options:
A character string indicating a column in TP
by which
the boxes in the plot should be grouped.
By default the boxes are grouped per time point.
A character string indicating a column in TP
by which
the boxes are colored. Coloring will be done within the groups indicated by
the groupBy
parameter, e.g. per replicate within each time point
using repId
.
A character string indicating the way the boxes should be ordered. Either "alphabetic" for alphabetical ordering of the groups, "ascending" for ordering by ascending mean, or "descending" for ordering by descending mean. By default boxes are ordered alphabetically.
Draws a heatmap of correlations of raw data between time points per selected trait for selected time points (all available time points by default).
Create a plot of the raw data of the selected trait over time for selected
time points (all available time points by default). Plots are grouped by
genotype, or by genotype x treatment when the geno.decomp
option is
specified. By default, all the genotypes will be plotted which might take
time and memory when the output is not saved in a file (see parameter
outFile
). Extra parameter options:
A character vector indicating the genotypes to be plotted.
A character vector indicating the grouping of the genotypes to be plotted.
Should the data be displayed as lines? Default is FALSE.
Other functions for data preparation:
as.data.frame.TP()
,
createTimePoints()
,
getTimePoints()
,
removeTimePoints()
,
summary.TP()
## Create a TP object containing the data from the Phenovator. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Plot the layout for the third time point with the check genotypes ## highlighted plot(phenoTP, plotType = "layout", timePoints = 3, highlight = c("check1", "check2", "check3", "check4")) ## Create a boxplot for "EffpsII" with 5 time points and boxes colored ## by "repId" within time point. plot(phenoTP, plotType = "box", traits = "EffpsII", timePoints = 1:5, colorBy = "repId") ## Create a correlation plot for "EffpsII" for a selection of time points. plot(phenoTP, plotType = "cor", traits = "EffpsII", timePoints = seq(from=1, to=73, by=5)) ## Plot the raw data of four genotypes for the trait "EffpsII": plot(phenoTP, traits = "EffpsII", plotType = "raw", genotypes = c("G001","G002","check1","check2"))
## Create a TP object containing the data from the Phenovator. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Plot the layout for the third time point with the check genotypes ## highlighted plot(phenoTP, plotType = "layout", timePoints = 3, highlight = c("check1", "check2", "check3", "check4")) ## Create a boxplot for "EffpsII" with 5 time points and boxes colored ## by "repId" within time point. plot(phenoTP, plotType = "box", traits = "EffpsII", timePoints = 1:5, colorBy = "repId") ## Create a correlation plot for "EffpsII" for a selection of time points. plot(phenoTP, plotType = "cor", traits = "EffpsII", timePoints = seq(from=1, to=73, by=5)) ## Plot the raw data of four genotypes for the trait "EffpsII": plot(phenoTP, traits = "EffpsII", plotType = "raw", genotypes = c("G001","G002","check1","check2"))
Function that predicts the P-spline Hierarchical Curve Data Model (see
fitSplineHDM
) on a dense grid. It provides standard errors
for curves at each level of the hierarchy. User has to be aware that
standard errors at the plot level demand large memory. We suggest set
that option at the FALSE
level
## S3 method for class 'psHDM' predict( object, newtimes, pred = list(pop = TRUE, geno = TRUE, plot = TRUE), se = list(pop = TRUE, geno = TRUE, plot = FALSE), trace = TRUE, ... )
## S3 method for class 'psHDM' predict( object, newtimes, pred = list(pop = TRUE, geno = TRUE, plot = TRUE), se = list(pop = TRUE, geno = TRUE, plot = FALSE), trace = TRUE, ... )
object |
An object of class "psHDM" as obtained after fitting
( |
newtimes |
A numeric vector with timepoints at which predictions are desired |
pred |
A list that controls the hierarchical levels at which
predictions are desired (population/genotypes/plots). The default is
|
se |
A list that controls the hierarchical levels at which standard
errors are desired (population/genotypes/plots). The default is
|
trace |
An optional value that controls the function trace.
The default is |
... |
Not used. |
An object of class psHDM
, a list with the following outputs:
predict.psHDM
newtimes
A numeric vector with the timepoints at which predictions
and/or standard errors have been obtained.
popLevel
A data.frame with the estimated population trajectories
and first and second order derivatives, and if required their respective
standard errors, at the newtimes
.
genoLevel
A data.frame with the estimated genotype-specific
deviations and trajectories and their respective first and second order
derivatives, and if required their respective standard errors,
at the newtimes
.
plotLevel
A data.frame with the estimated plot-specific
deviations and trajectories and their respective first and second order
derivatives, and if required their respective standard errors,
at the newtimes
.
plotObs
A data.frame with the raw data at the original timepoints.
Pérez-Valencia, D.M., Rodríguez-Álvarez, M.X., Boer, M.P. et al. A two-stage approach for the spatio-temporal analysis of high-throughput phenotyping data. Sci Rep 12, 3177 (2022). doi:10.1038/s41598-022-06935-9
Other functions for fitting hierarchical curve data models:
fitSplineHDM()
,
plot.psHDM()
## The data from the Phenovator platform have been corrected for spatial ## trends and outliers for single observations have been removed. ## We need to specify the genotype-by-treatment interaction. ## Treatment: water regime (WW, WD). spatCorrectedArch[["treat"]] <- substr(spatCorrectedArch[["geno.decomp"]], start = 1, stop = 2) spatCorrectedArch[["genoTreat"]] <- interaction(spatCorrectedArch[["genotype"]], spatCorrectedArch[["treat"]], sep = "_") ## Fit P-Splines Hierarchical Curve Data Model for selection of genotypes. fit.psHDM <- fitSplineHDM(inDat = spatCorrectedArch, trait = "LeafArea_corr", genotypes = c("GenoA14_WD", "GenoA51_WD", "GenoB11_WW", "GenoB02_WD", "GenoB02_WW"), time = "timeNumber", pop = "geno.decomp", genotype = "genoTreat", plotId = "plotId", difVar = list(geno = FALSE, plot = FALSE), smoothPop = list(nseg = 4, bdeg = 3, pord = 2), smoothGeno = list(nseg = 4, bdeg = 3, pord = 2), smoothPlot = list(nseg = 4, bdeg = 3, pord = 2), weights = "wt", trace = FALSE) ## Predict the P-Splines Hierarchical Curve Data Model on a dense grid ## with standard errors at the population and genotype levels pred.psHDM <- predict(object = fit.psHDM, newtimes = seq(min(fit.psHDM$time[["timeNumber"]]), max(fit.psHDM$time[["timeNumber"]]), length.out = 100), pred = list(pop = TRUE, geno = TRUE, plot = TRUE), se = list(pop = TRUE, geno = TRUE, plot = FALSE)) ## Plot the P-Spline predictions at the three levels of the hierarchy ## Plots at population level. plot(pred.psHDM, plotType = "popTra") ## Plots at genotype level. plot(pred.psHDM, plotType = "popGenoTra") ## Plots of derivatives at genotype level. plot(pred.psHDM, plotType = "popGenoDeriv") ## Plots of deviations at genotype level. plot(pred.psHDM, plotType = "genoDev") ## Plots at plot level. plot(pred.psHDM, plotType = "genoPlotTra")
## The data from the Phenovator platform have been corrected for spatial ## trends and outliers for single observations have been removed. ## We need to specify the genotype-by-treatment interaction. ## Treatment: water regime (WW, WD). spatCorrectedArch[["treat"]] <- substr(spatCorrectedArch[["geno.decomp"]], start = 1, stop = 2) spatCorrectedArch[["genoTreat"]] <- interaction(spatCorrectedArch[["genotype"]], spatCorrectedArch[["treat"]], sep = "_") ## Fit P-Splines Hierarchical Curve Data Model for selection of genotypes. fit.psHDM <- fitSplineHDM(inDat = spatCorrectedArch, trait = "LeafArea_corr", genotypes = c("GenoA14_WD", "GenoA51_WD", "GenoB11_WW", "GenoB02_WD", "GenoB02_WW"), time = "timeNumber", pop = "geno.decomp", genotype = "genoTreat", plotId = "plotId", difVar = list(geno = FALSE, plot = FALSE), smoothPop = list(nseg = 4, bdeg = 3, pord = 2), smoothGeno = list(nseg = 4, bdeg = 3, pord = 2), smoothPlot = list(nseg = 4, bdeg = 3, pord = 2), weights = "wt", trace = FALSE) ## Predict the P-Splines Hierarchical Curve Data Model on a dense grid ## with standard errors at the population and genotype levels pred.psHDM <- predict(object = fit.psHDM, newtimes = seq(min(fit.psHDM$time[["timeNumber"]]), max(fit.psHDM$time[["timeNumber"]]), length.out = 100), pred = list(pop = TRUE, geno = TRUE, plot = TRUE), se = list(pop = TRUE, geno = TRUE, plot = FALSE)) ## Plot the P-Spline predictions at the three levels of the hierarchy ## Plots at population level. plot(pred.psHDM, plotType = "popTra") ## Plots at genotype level. plot(pred.psHDM, plotType = "popGenoTra") ## Plots of derivatives at genotype level. plot(pred.psHDM, plotType = "popGenoDeriv") ## Plots of deviations at genotype level. plot(pred.psHDM, plotType = "genoDev") ## Plots at plot level. plot(pred.psHDM, plotType = "genoPlotTra")
Function for replacing outliers for series of observations in the data by NA.
The input can either be a data.frame, specified in dat
, or the output
of the fitSpline
function, specified in fitSpline
. Exactly one
of these should be provided as input for the function.
removeSerieOut( dat = NULL, fitSpline = NULL, serieOut, reason = c("mean corr", "angle", "slope"), traits = attr(x = serieOut, which = "trait") )
removeSerieOut( dat = NULL, fitSpline = NULL, serieOut, reason = c("mean corr", "angle", "slope"), traits = attr(x = serieOut, which = "trait") )
dat |
A |
fitSpline |
An object of class |
serieOut |
A data.frame with at least the column plotId with values corresponding to those in dat/fitSpline. |
reason |
A character vector indicating which types of outliers should be replaced by NA. |
traits |
The traits that should be replaced by NA. When using the
output of |
Depending on the input either a data.frame
or an object of
class HTPSpline
for which the outliers specified in serieOut
are replaced by NA.
Other functions for detecting outliers for series of observations:
detectSerieOut()
,
plot.serieOut()
## Run the function to fit P-splines on a subset of genotypes. subGenoVator <- c("G160", "G151") fit.spline <- fitSpline(inDat = spatCorrectedVator, trait = "EffpsII_corr", genotypes = subGenoVator, knots = 50) ## Extract the tables of predicted values and P-spline coefficients. predDat <- fit.spline$predDat coefDat <- fit.spline$coefDat ## The coefficients are then used to tag suspect time courses outVator <- detectSerieOut(corrDat = spatCorrectedVator, predDat = predDat, coefDat = coefDat, trait = "EffpsII_corr", genotypes = subGenoVator, thrCor = 0.9, thrPca = 30, thrSlope = 0.7) ## Replace the outliers by NA in the corrected data. spatCorrectedVatorOut <- removeSerieOut(dat = spatCorrectedVator, serieOut = outVator) ## Only replace the slope outliers by NA in the corrected data. spatCorrectedVatorOut2 <- removeSerieOut(dat = spatCorrectedVator, serieOut = outVator, reason = "slope") ## Replace the outliers by NA in the corrected data. ## Replace both the corrected value and the raw trait value by NA. spatCorrectedVatorOut3 <- removeSerieOut(dat = spatCorrectedVator, serieOut = outVator, traits = c("EffpsII", "EffpsII_corr"))
## Run the function to fit P-splines on a subset of genotypes. subGenoVator <- c("G160", "G151") fit.spline <- fitSpline(inDat = spatCorrectedVator, trait = "EffpsII_corr", genotypes = subGenoVator, knots = 50) ## Extract the tables of predicted values and P-spline coefficients. predDat <- fit.spline$predDat coefDat <- fit.spline$coefDat ## The coefficients are then used to tag suspect time courses outVator <- detectSerieOut(corrDat = spatCorrectedVator, predDat = predDat, coefDat = coefDat, trait = "EffpsII_corr", genotypes = subGenoVator, thrCor = 0.9, thrPca = 30, thrSlope = 0.7) ## Replace the outliers by NA in the corrected data. spatCorrectedVatorOut <- removeSerieOut(dat = spatCorrectedVator, serieOut = outVator) ## Only replace the slope outliers by NA in the corrected data. spatCorrectedVatorOut2 <- removeSerieOut(dat = spatCorrectedVator, serieOut = outVator, reason = "slope") ## Replace the outliers by NA in the corrected data. ## Replace both the corrected value and the raw trait value by NA. spatCorrectedVatorOut3 <- removeSerieOut(dat = spatCorrectedVator, serieOut = outVator, traits = c("EffpsII", "EffpsII_corr"))
Function for replacing outliers for single observations by NA.
removeSingleOut(TP, singleOut, trait = attr(x = singleOut, which = "trait"))
removeSingleOut(TP, singleOut, trait = attr(x = singleOut, which = "trait"))
TP |
An object of class TP. |
singleOut |
A data.frame with at least the columns plotId and
timePoint with values corresponding to those in TP. If a column outlier is
present, as in the output of |
trait |
The trait that should be set to NA. Can be ignored when using
the output of |
An object of class TP, the input with the outlier replaced by NA.
Other functions for detecting outliers for single observations:
detectSingleOut()
,
detectSingleOutMaize()
,
plot.singleOut()
## Create a TP object containing the data from the Phenovator. PhenovatorDat1 <- PhenovatorDat1[!PhenovatorDat1$pos %in% c("c24r41", "c7r18", "c7r49"), ] phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## First select a subset of plants, for example here 9 plants. plantSel <- phenoTP[[1]]$plotId[1:9] # Then run on the subset resuVatorHTP <- detectSingleOut(TP = phenoTP, trait = "EffpsII", plotIds = plantSel, confIntSize = 3, nnLocfit = 0.1) ## Replace the studied trait by NA for the plants marked as outliers. phenoTPOut <- removeSingleOut(phenoTP, resuVatorHTP)
## Create a TP object containing the data from the Phenovator. PhenovatorDat1 <- PhenovatorDat1[!PhenovatorDat1$pos %in% c("c24r41", "c7r18", "c7r49"), ] phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## First select a subset of plants, for example here 9 plants. plantSel <- phenoTP[[1]]$plotId[1:9] # Then run on the subset resuVatorHTP <- detectSingleOut(TP = phenoTP, trait = "EffpsII", plotIds = plantSel, confIntSize = 3, nnLocfit = 0.1) ## Replace the studied trait by NA for the plants marked as outliers. phenoTPOut <- removeSingleOut(phenoTP, resuVatorHTP)
Function for removing selected time points from an object of class TP.
removeTimePoints(TP, timePoints)
removeTimePoints(TP, timePoints)
TP |
An object of class TP. |
timePoints |
A character or numeric vector indicating the time points to be removed. When using a character string to reference a time point, the value has to be an exact match to one of the existing timePoints. When using a number it will be matched by its number ("timeNumber") in the timePoints attribute of the TP object. |
An object of class TP, the input with the selected time points removed.
Other functions for data preparation:
as.data.frame.TP()
,
createTimePoints()
,
getTimePoints()
,
plot.TP()
,
summary.TP()
## Create a TP object containing the data from the Phenovator. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3","check4")) ## Remove the first and last time point from the TP object. phenoTPNew <- removeTimePoints(phenoTP, timePoints = c(1, 73)) ## Compare by looking at summaries. summary(phenoTP) summary(phenoTPNew)
## Create a TP object containing the data from the Phenovator. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3","check4")) ## Remove the first and last time point from the TP object. phenoTPNew <- removeTimePoints(phenoTP, timePoints = c(1, 73)) ## Compare by looking at summaries. summary(phenoTP) summary(phenoTPNew)
A dataset containing greenhouse data from the RootPhAir platform (UCLouvain, Belgium). It consists of one experiment with one aeroponic tanks with 340 maize plants. The studied traits are the root tip coordinates in y and x axis, extracted from the pictures over time. Plants were pictured every 2 hours for 10 days. This dataset was kindly provided by Xavier Draye.
RootDat1
RootDat1
A data.frame with 16,275 rows and 10 columns:
Experiment number
Thermal time cumulated
Genotype
Unique pot using tank and rowcol coordinate
Tank A or B
Number of strip of five plants (i.e. row coordinate)
Position within th strip (i.e. column coordinate)
Position of the root tip in x axis
Position of the root tip in y axis
Time of measurement
This dataset contains the corrected data obtained by (1) removing outliers for single observations and (2) running a spatial model on the PhenoarchDat1 dataset. See the vignettes for details.
spatCorrectedArch
spatCorrectedArch
A data.frame with 37,038 rows and 9 columns:
Time number obtained after formatting the original dataset with the function createTP.
Original time point.
Leaf area, corrected data
Leaf area from the picture, raw data
Weight factor
Genotypes
Combination of treatment levels to decompose the genotypic variance (see vignettes)
Column coordinate
Row coordinate
Unique pot ID using rowcol coordinates
This dataset contains the corrected data obtained by (1) removing outliers for single observations and (2) running a spatial model on the PhenovatorDat1 dataset. See the vignettes for details.
spatCorrectedVator
spatCorrectedVator
A data.frame with 103,801 rows and 11 columns:
Time number obtained after formatting the original dataset with the function createTP.
Original time point.
Efficiency of the photosystem II, corrected data
Efficiency of the photosystem II, raw data
Genotypes
Block define after sowing for post-blocking.
Position of the camera
Status of the genotypes: check for the reference genotypes, noCheck for the others.
Column coordinate
Row coordinate
Unique pot ID using rowcol coordinates
This dataset contains the genotypic predictions obtained by (1) removing outliers for single observations and (2) running a spatial model on the PhenoarchDat1 dataset. See the vignettes for details.
spatPredArch
spatPredArch
A data.frame with 6,120 rows and 6 columns:
Time number obtained after formatting the original dataset with the function createTP.
Original time point.
Combination of treatment levels to decompose the genotypic variance (see vignettes)
Genotypes
Biomass, predicted values
Standard errors associated with the prediction
Function for creating a short summary of the contents of a TP object. The summary consists of the name of the experiment, the number of time points, the engine used to fit the models and, in case spatial models where fitted using asreml, the selected spatial model.
## S3 method for class 'fitMod' summary(object, ...)
## S3 method for class 'fitMod' summary(object, ...)
object |
An object of class fitMod. |
... |
Ignored. |
No return value, a summary is printed.
Other functions for spatial modeling:
fitModels()
,
getCorrected()
,
getEffDims()
,
getGenoPred()
,
getHerit()
,
getVar()
,
plot.fitMod()
## Using the first example dataset (PhenovatorDat1): ## Create an object of class TP. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Fit a SpATS model on few time points: modPhenoSp <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(1, 6, 36)) ## Create a summary. summary(modPhenoSp)
## Using the first example dataset (PhenovatorDat1): ## Create an object of class TP. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3", "check4")) ## Fit a SpATS model on few time points: modPhenoSp <- fitModels(TP = phenoTP, trait = "EffpsII", timePoints = c(1, 6, 36)) ## Create a summary. summary(modPhenoSp)
Function for creating a short summary of the contents of a TP
object.
The summary consists of the name of the experiment, the number of time
points, the first and last time point and the genotypes defined as checks.
## S3 method for class 'TP' summary(object, ...)
## S3 method for class 'TP' summary(object, ...)
object |
An object of class TP. |
... |
Ignored. |
No return value, a summary is printed.
Other functions for data preparation:
as.data.frame.TP()
,
createTimePoints()
,
getTimePoints()
,
plot.TP()
,
removeTimePoints()
## Create a TP object containing the data from the Phenovator. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3","check4")) ## Create a summary. summary(phenoTP)
## Create a TP object containing the data from the Phenovator. phenoTP <- createTimePoints(dat = PhenovatorDat1, experimentName = "Phenovator", genotype = "Genotype", timePoint = "timepoints", repId = "Replicate", plotId = "pos", rowNum = "y", colNum = "x", addCheck = TRUE, checkGenotypes = c("check1", "check2", "check3","check4")) ## Create a summary. summary(phenoTP)