The function uses a genetic algorithm to optimize the model's hyperparameter configuration according to the chosen metric.
optimizeModel(
  model,
  hypers,
  metric,
  test = NULL,
  pop = 20,
  gen = 5,
  env = NULL,
  keep_best = 0.4,
  keep_random = 0.2,
  mutation_chance = 0.4,
  seed = NULL
)
Argument | Description |
---|---|
model | SDMmodel or SDMmodelCV object. |
hypers | named list containing the values of the hyperparameters that should be tuned, see details. |
metric | character. The metric used to evaluate the models, possible values are: "auc", "tss" and "aicc". |
test | SWD object. Testing dataset used to evaluate the model, not used with "aicc" or with SDMmodelCV objects, default is NULL. |
pop | numeric. Size of the population, default is 20. |
gen | numeric. Number of generations, default is 5. |
env | stack containing the environmental variables, used only with "aicc", default is NULL. |
keep_best | numeric. Percentage of the best models in the population to be retained during each iteration, expressed as decimal number. Default is 0.4. |
keep_random | numeric. Probability of retaining the excluded models during each iteration, expressed as decimal number. Default is 0.2. |
mutation_chance | numeric. Probability of mutation of the child models, expressed as decimal number. Default is 0.4. |
seed | numeric. The value used to set the seed to have consistent results, default is NULL. |
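To make the roles of pop, gen, keep_best, keep_random and mutation_chance more concrete, here is a minimal base R sketch of a genetic algorithm over a grid of hyperparameter values. It is an illustration only and not the SDMtune implementation: `toy_optimize` and `toy_score` are hypothetical names, the uniform crossover and single-value mutation are assumptions, and `toy_score` stands in for training a model and computing the metric.

```r
# Toy genetic algorithm over a grid of hyperparameter values.
# Illustration only: this is NOT the SDMtune implementation.
# `score` is a hypothetical stand-in for "train a model, compute the metric".
toy_optimize <- function(hypers, score, pop = 20, gen = 5, keep_best = 0.4,
                         keep_random = 0.2, mutation_chance = 0.4,
                         seed = NULL) {
  stopifnot(pop >= 2)
  if (!is.null(seed)) set.seed(seed)

  # Pick one value at random for hyperparameter k
  pick <- function(k) hypers[[k]][[sample.int(length(hypers[[k]]), 1)]]
  # Build a random configuration (one value per hyperparameter)
  draw <- function() {
    config <- lapply(seq_along(hypers), pick)
    names(config) <- names(hypers)
    config
  }
  # Score every configuration in a population
  fitness <- function(population) {
    vapply(population, function(config) do.call(score, config), numeric(1))
  }

  population <- replicate(pop, draw(), simplify = FALSE)
  for (g in seq_len(gen)) {
    ranked <- population[order(fitness(population), decreasing = TRUE)]
    # keep_best: share of the top-ranked configurations that always survive
    n_best <- min(pop, max(2, round(keep_best * pop)))
    best <- ranked[seq_len(n_best)]
    # keep_random: each excluded configuration survives with this probability
    rest <- ranked[-seq_len(n_best)]
    lucky <- rest[runif(length(rest)) < keep_random]
    parents <- c(best, lucky)

    # Refill the population with children of two randomly chosen parents
    children <- list()
    while (length(parents) + length(children) < pop) {
      pair <- sample(parents, 2)
      child <- pair[[1]]
      swap <- runif(length(hypers)) < 0.5  # assumed uniform crossover
      child[swap] <- pair[[2]][swap]
      # mutation_chance: probability that one hyperparameter is redrawn
      if (runif(1) < mutation_chance) {
        k <- sample.int(length(hypers), 1)
        child[[k]] <- pick(k)
      }
      children[[length(children) + 1]] <- child
    }
    population <- c(parents, children)
  }
  scores <- fitness(population)
  list(best = population[[which.max(scores)]], value = max(scores))
}

# Usage with a made-up scoring function (higher is better)
h <- list(reg = seq(0.2, 5, 0.2), fc = c("l", "lq", "lh"))
toy_score <- function(reg, fc) -abs(reg - 1) + 0.1 * nchar(fc)
result <- toy_optimize(h, toy_score, pop = 15, gen = 2, seed = 798)
str(result)
```

Because the top-ranked configurations are carried over unchanged in this sketch, the best score in the population cannot decrease from one generation to the next, and setting the seed makes repeated runs return the same result.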
Returns an SDMtune object.
To know which hyperparameters can be tuned you can use the output of the function getTunableArgs. Hyperparameters not included in the hypers argument take the value that they have in the passed model.
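The fallback rule for hyperparameters left out of hypers can be pictured with plain lists. The sketch below uses base R only; `model_args`, `candidate` and the values in them are hypothetical and simply mimic a trained model's current settings and one configuration drawn from hypers.

```r
# Hypothetical current settings of a trained model (illustrative values)
model_args <- list(fc = "lqph", reg = 1, iter = 500)

# Values to explore: only `reg` and `fc` are tuned, `iter` is left out
hypers <- list(reg = c(0.5, 1, 2), fc = c("l", "lq"))

# One candidate configuration taken from `hypers`
candidate <- list(reg = hypers$reg[3], fc = hypers$fc[2])

# Tuned entries are overwritten, everything else keeps the model's value
config <- modifyList(model_args, candidate)
str(config)
```

Here `iter` stays at the model's value because it does not appear in `hypers`, while `reg` and `fc` take the candidate's values.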
Part of the code is inspired by this post.
See also gridSearch and randomSearch.
Sergio Vignali
library(SDMtune)

# Acquire environmental variables
files <- list.files(path = file.path(system.file(package = "dismo"), "ex"),
                    pattern = "grd", full.names = TRUE)
predictors <- raster::stack(files)

# Prepare presence and background locations
p_coords <- virtualSp$presence
bg_coords <- virtualSp$background

# Create SWD object
data <- prepareSWD(species = "Virtual species", p = p_coords, a = bg_coords,
                   env = predictors, categorical = "biome")

# Split presence locations into training (60%), validation (20%) and
# testing (20%) datasets
datasets <- trainValTest(data, val = 0.2, test = 0.2, only_presence = TRUE,
                         seed = 61516)
train <- datasets[[1]]
val <- datasets[[2]]

# Train a model
model <- train("Maxnet", data = train)

# Define the hyperparameters to test
h <- list(reg = seq(0.2, 5, 0.2),
          fc = c("l", "lq", "lh", "lp", "lqp", "lqph"))

# Run the function using as metric the AUC
if (FALSE) {
  output <- optimizeModel(model, hypers = h, metric = "auc", test = val,
                          pop = 15, gen = 2, seed = 798)
  output@results
  output@models
  output@models[[1]]  # Best model
}
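As a rough sense of scale for the example above, the full grid defined by `h` can be counted with base R. This is only a back-of-the-envelope sketch; it says nothing about how many models optimizeModel actually trains.

```r
# Same hyperparameter values as in the example
h <- list(reg = seq(0.2, 5, 0.2),
          fc = c("l", "lq", "lh", "lp", "lqp", "lqph"))

# Number of values per hyperparameter and size of the full grid
lengths(h)
n_combinations <- prod(lengths(h))
n_combinations  # 25 values of reg x 6 values of fc = 150
```

With pop = 15, the initial population covers at most a tenth of these 150 combinations, which is the situation where a guided search is attractive compared with training every combination.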