Intro
In the previous article you learned how to prepare the data for the analysis using the virtualSp dataset and the WorldClim environmental variables. Now it's time to train your first model. Let's do it!
SDMtune supports four methods for model training:
- Artificial Neural Networks (ANN), using the nnet package (Venables and Ripley 2002);
- Boosted Regression Trees (BRT), using the gbm package (Greenwell et al. 2019);
- Maximum Entropy, with two implementations:
  - Maxent, using the dismo package (Hijmans et al. 2017);
  - Maxnet, using the maxnet package (Phillips 2017);
- Random Forest (RF), using the randomForest package (Liaw and Wiener 2002).
The code necessary to train a model is the same for all the implementations. We will show how to train a Maxent model; you can adapt the code for the other methods or check the dedicated article.
Train a model with default settings
First we load the SDMtune package:
library(SDMtune)
#>
#> _____ ____ __ ___ __
#> / ___/ / __ \ / |/ // /_ __ __ ____ ___
#> \__ \ / / / // /|_/ // __// / / // __ \ / _ \
#> ___/ // /_/ // / / // /_ / /_/ // / / // __/
#> /____//_____//_/ /_/ \__/ \__,_//_/ /_/ \___/ version 1.3.1
#>
#> To cite this package in publications type: citation("SDMtune").
We use the function train() to train a Maxent model. We need to provide two arguments:
- method: "Maxent" in our case;
- data: the SWD object with the presence and background locations that we created in the previous article.
default_model <- train(method = "Maxent",
                       data = data)
The function trains the model using the default settings, which are:
- linear, quadratic, product and hinge feature class combinations;
- regularization multiplier equal to 1;
- 500 algorithm iterations.
We will see later how to change the default settings. For the moment, let's have a look at the default_model object.
Explore an SDMmodel object
The output of the function train() is an object of class SDMmodel. Let's print it:
default_model
#>
#> ── Object of class: <SDMmodel> ──
#>
#> Method: Maxent
#>
#> ── Hyperparameters
#> • fc: "lqph"
#> • reg: 1
#> • iter: 500
#>
#> ── Info
#> • Species: Virtual species
#> • Presence locations: 400
#> • Absence locations: 5000
#>
#> ── Variables
#> • Continuous: "bio1", "bio12", "bio16", "bio17", "bio5", "bio6", "bio7", and
#> "bio8"
#> • Categorical: "biome"
When we print an SDMmodel object we get the following information:
- the name of the class;
- the method used to train the model;
- the name of the species;
- the number of presence locations;
- the number of absence/background locations;
- the model configurations:
  - fc: the feature class combinations;
  - reg: the regularization multiplier;
  - iter: the number of iterations;
- the environmental variables used to train the model:
  - the names of the continuous environmental variables, if any;
  - the names of the categorical environmental variables, if any.
An SDMmodel object has two slots:
slotNames(default_model)
#> [1] "data" "model"
- data: an SWD object with the presence and absence/background locations used to train the model;
- model: a Maxent object, in our case, with all the model configurations.
The slot model contains the configurations of the model plus other information used to make predictions.
slotNames(default_model@model)
#> [1] "results" "reg" "fc" "iter" "extra_args"
#> [6] "lambdas" "coeff" "formula" "lpn" "dn"
#> [11] "entropy" "min_max"
For the moment, the most important slots are fc, reg and iter, which contain the values of the model configuration. We will explore the others in another article.
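As a minimal sketch of how these values can be read back, nested slots can be reached by chaining R's @ operator. The data preparation lines are an assumption about what the previous article did (virtualSp coordinates plus the rasters shipped with dismo); if default_model is already in your workspace, only the last three lines are needed. Training a Maxent model requires a working Java setup.

```r
library(SDMtune)

# Assumed setup, mirroring the previous article: rasters shipped with dismo
files <- list.files(path = file.path(system.file(package = "dismo"), "ex"),
                    pattern = "grd", full.names = TRUE)
predictors <- terra::rast(files)
data <- prepareSWD(species = "Virtual species", p = virtualSp$presence,
                   a = virtualSp$background, env = predictors,
                   categorical = "biome")

default_model <- train(method = "Maxent", data = data)

# Nested slots are reached by chaining the @ operator
default_model@model@fc    # feature class combination
default_model@model@reg   # regularization multiplier
default_model@model@iter  # number of iterations
```

The returned values should match the hyperparameters listed when the object is printed.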
Train a model changing the default settings
The function train() accepts optional arguments that can be used to change the default model settings. In our previous example we could have trained the same model using:
default_model <- train(method = "Maxent",
                       data = data,
                       fc = "lqph",
                       reg = 1,
                       iter = 500)
Try yourself
Try to change the default settings and train a model using linear and hinge as feature class combination, 0.5 as regularization multiplier and 700 iterations. To see the solution highlight the next cell:
model <- train(method = "Maxent",
               data = data,
               fc = "lh",
               reg = 0.5,
               iter = 700)
By default, Maxent models are trained using the arguments "removeduplicates=false" and "addsamplestobackground=false". The user should have full control of the data used to train the model, so it is expected that duplicated locations have already been removed and that the presence locations are already included in the background locations, when desired. You can use the function thinData() to remove duplicated locations and the function addSamplesToBg() to add the presence locations to the background locations.
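The two helper functions could be combined as in the sketch below. It assumes the presence coordinates are stored in columns named x and y (as in virtualSp) and that the rasters shipped with dismo are the ones used in the previous article; the object names p_coords and data_bg are only illustrative.

```r
library(SDMtune)

# Assumed setup, mirroring the previous article: rasters shipped with dismo
files <- list.files(path = file.path(system.file(package = "dismo"), "ex"),
                    pattern = "grd", full.names = TRUE)
predictors <- terra::rast(files)

# Thin the presence coordinates so that locations falling in the same
# raster cell are not duplicated
p_coords <- thinData(virtualSp$presence, env = predictors)

# Build the SWD object from the thinned presence locations
data <- prepareSWD(species = "Virtual species", p = p_coords,
                   a = virtualSp$background, env = predictors,
                   categorical = "biome")

# Add the presence locations to the background locations
data_bg <- addSamplesToBg(data)
```

The resulting data_bg object can then be passed to train() through the data argument.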
Train a Maxnet model
Training a model using the Maxnet method is as simple as changing the name of the method in the train() function. The only difference here is that we cannot set the number of iterations.
Try yourself
Try to train a model using the Maxnet method. To see the solution highlight the following cell:
maxnet_model <- train("Maxnet",
                      data = data)
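Since only the number of iterations is unavailable for Maxnet, the feature classes and the regularization multiplier can still be changed. The sketch below illustrates this; the setup lines are an assumption about the previous article, and the name maxnet_custom is only illustrative.

```r
library(SDMtune)

# Assumed setup, mirroring the previous article: rasters shipped with dismo
files <- list.files(path = file.path(system.file(package = "dismo"), "ex"),
                    pattern = "grd", full.names = TRUE)
predictors <- terra::rast(files)
data <- prepareSWD(species = "Virtual species", p = virtualSp$presence,
                   a = virtualSp$background, env = predictors,
                   categorical = "biome")

# Maxnet model with linear and hinge features and a lower regularization;
# note that iter is not passed here
maxnet_custom <- train(method = "Maxnet", data = data, fc = "lh", reg = 0.5)

# Check the settings stored in the trained model
maxnet_custom@model@fc
maxnet_custom@model@reg
```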
Conclusion
In this article you have learned:
- how to train a Maxent model using default settings;
- how to explore an SDMmodel object;
- how to train a model changing the default settings;
- how to train a model using the Maxnet method.
In the next article you will learn how to use the model that you have just trained to get the predicted values for new locations.