Train, Validation and Test datasets — trainValTest • SDMtune

Randomly split a dataset into training and testing datasets, or into training, validation and testing datasets.

Usage

trainValTest(x, test, val = 0, only_presence = FALSE, seed = NULL)

Arguments

x: SWD object containing the data to be split into training, validation and testing datasets.
test: numeric. The proportion of data withheld for testing, given as a fraction (e.g. 0.2 for 20%).
val: numeric. The proportion of data withheld for validation, given as a fraction, default is 0.
only_presence: logical. If TRUE, only the presence locations are split and all the background locations are included in each partition; used mainly for presence-only methods, default is FALSE.
seed: numeric. The value used to set the seed in order to have consistent results, default is NULL.
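
The effect of only_presence = TRUE can be pictured with a few lines of base R. This is a minimal sketch and not SDMtune's own code: the vector pa and the row-index variables are hypothetical stand-ins for the contents of an SWD object, and the rounding rule is an assumption.

```r
# Hypothetical illustration, not SDMtune code: pa is 1 for presence rows
# and 0 for background rows.
pa <- c(rep(1, 50), rep(0, 200))

set.seed(25)
p_idx  <- which(pa == 1)
bg_idx <- which(pa == 0)

# Hold out 20% of the presence rows for testing
n_test  <- round(length(p_idx) * 0.2)
p_test  <- p_idx[sample.int(length(p_idx), n_test)]
p_train <- setdiff(p_idx, p_test)

# Every background row is reused in both partitions
train_rows <- c(p_train, bg_idx)
test_rows  <- c(p_test, bg_idx)

length(train_rows)  # 40 presence rows plus 200 background rows
length(test_rows)   # 10 presence rows plus 200 background rows
```

Only the presence rows differ between the two partitions; the background rows are shared.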

Value

A list of SWD objects: training and testing, or training, validation and testing when val is greater than 0.

Details

When only_presence = FALSE, the proportion of presence and absence locations is preserved in each partition.
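
To make the idea of a proportion-preserving split concrete, here is a minimal base R sketch of a stratified random split. It is an illustration under stated assumptions, not the package's implementation: stratified_split() is a hypothetical helper, pa is a hypothetical 0/1 presence/absence vector, and the per-class rounding is an assumption.

```r
# Hypothetical helper, not part of SDMtune: stratified random split of a
# 0/1 presence/absence vector into training and testing row indices.
stratified_split <- function(pa, test = 0.2, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  test_idx <- integer(0)
  for (cls in unique(pa)) {
    idx <- which(pa == cls)
    n_test <- round(length(idx) * test)
    # Sample positions within idx, which avoids sample()'s special
    # handling of length-one numeric vectors
    test_idx <- c(test_idx, idx[sample.int(length(idx), n_test)])
  }
  list(train = setdiff(seq_along(pa), test_idx), test = sort(test_idx))
}

pa <- c(rep(1, 100), rep(0, 400))  # 100 presences, 400 absences
parts <- stratified_split(pa, test = 0.2, seed = 25)

mean(pa)               # presence share in the full data
mean(pa[parts$train])  # presence share in the training partition
mean(pa[parts$test])   # presence share in the testing partition
```

Because each class is sampled separately, the three shares printed at the end are equal for this input.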

Author

Sergio Vignali

Examples

# Acquire environmental variables
files <- list.files(path = file.path(system.file(package = "dismo"), "ex"),
                    pattern = "grd",
                    full.names = TRUE)

predictors <- terra::rast(files)

# Prepare presence and background locations
p_coords <- virtualSp$presence
bg_coords <- virtualSp$background

# Create SWD object
data <- prepareSWD(species = "Virtual species",
                   p = p_coords,
                   a = bg_coords,
                   env = predictors,
                   categorical = "biome")
#> ℹ Extracting predictor information for presence locations
#> ✔ Extracting predictor information for presence locations [21ms]
#> 
#> ℹ Extracting predictor information for absence/background locations
#> ✔ Extracting predictor information for absence/background locations [47ms]
#> 

# Split only the presence locations into training (80%) and testing (20%)
# datasets
datasets <- trainValTest(data,
                         test = 0.2,
                         only_presence = TRUE)
train <- datasets[[1]]
test <- datasets[[2]]

# Split both the presence and the absence locations into training (60%),
# validation (20%) and testing (20%) datasets
datasets <- trainValTest(data,
                         val = 0.2,
                         test = 0.2)
train <- datasets[[1]]
val <- datasets[[2]]
test <- datasets[[3]]