Introduction to catsr

catsr is a light wrapper around the CATS modelling framework. CATS is a standalone program, usually run on the command line. This R package was written to make CATS more accessible to R users. It can be used to create configuration files for CATS and run them. CATS itself is not included in this package, but helper functions are included to download example data (platform independent) and binaries for the Windows operating system. For other operating systems, the preferred way is to download and compile CATS from source.

About CATS

CATS is a modelling framework designed to simulate the effects of environmental changes on the spatio-temporal distribution of plant species, accounting for their population dynamics and dispersal processes. CATS is discrete in space (grid-based) and time (using annual cycles).

CATS facilitates simulations at large spatio-temporal extents using high-resolution grids and allows fine-grained control of demographic and dispersal processes. An MPI version, which allows a single simulation to be distributed across multiple computers, is also provided.

CATS was developed at the Vienna Institute for Nature Conservation & Analyses (V.I.N.C.A.) and the University of Vienna’s Biodiversity Dynamics and Conservation Group at the Department of Botany and Biodiversity Research. For more information about CATS see the CATS webpage.

About catsr

catsr is used to create configuration files (referencing external data), which are then used as input for CATS. You can use catsr to start CATS from an R session, and process the results. catsr currently only provides a subset of CATS’ functionality when creating configuration files. Alternatively, you can manually create configuration files, and then run them using catsr.

The main purposes of catsr are creating CATS configuration files, running CATS from an R session, and processing the results.

A CATS simulation requires a configuration file, which references other input data. For examples see the CATS quick start guide and the CATS manual.

Configurations are represented in catsr as a data frame. Each row of the data frame corresponds to a single configuration. Each configuration is uniquely identified by its run name (column general.run_name). Each column represents a configuration parameter. The column name relates to the configuration parameter name in the configuration file as outlined in Configuration parameter naming scheme below. The rationale behind this is to make it easy to run CATS simulations with varying parameter configurations.
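
As a minimal sketch of this layout, the following base R snippet builds a fragment of a configuration data frame by hand. It uses only three of the column names listed by cats_default_config(); the run names and values are made up for illustration, and a real configuration needs all required columns.

```r
# Two configurations that differ only in their germination rate.
# Column names follow the catsr naming scheme; the values are invented.
config <- data.frame(
  general.run_name = c("low_germination", "high_germination"),
  general.time_steps = c(10, 10),
  species.germination_rate_maximum = c(0.1, 0.7),
  stringsAsFactors = FALSE
)

# Run names identify configurations, so they must not repeat.
stopifnot(!anyDuplicated(config$general.run_name))

print(config)
```

Each row stands for one configuration, so varying a parameter means adding a row with a new run name and a different value in the relevant column.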

Creating or loading a configuration data frame

To create such a configuration data frame, you can either load an existing file

library(catsr)
filename <- file.choose()
config <- load_cats_configuration(filename)

or create an empty data frame with all required (and some optional) columns using cats_default_config()

library(catsr)
config <- cats_default_config()
print(config)
#>   general.output_interval general.run_name general.starting_year
#> 1                      NA             <NA>                    NA
#>   general.time_steps species.adult_survival_rate_maximum
#> 1                 NA                                  NA
#>   species.carrying_capacity_maximum species.clonal_growth_rate_maximum
#> 1                                NA                                 NA
#>   species.flowering_frequency_maximum species.germination_rate_maximum
#> 1                                  NA                               NA
#>   species.germination_to_adult_survival_rate_maximum
#> 1                                                 NA
#>   species.initial_population_filename species.maximum_age_of_maturity
#> 1                                <NA>                              NA
#>   species.minimum_age_of_maturity species.name species.OT
#> 1                              NA         <NA>         NA
#>   species.seed_persistence species.seed_survival_rate_maximum
#> 1                       NA                                 NA
#>   species.seed_yield_maximum species.ZT suitability.filename_pattern
#> 1                         NA         NA                         <NA>
#>   suitability.interpolation_interval dispersal.kernel_filenames
#> 1                                 NA                       <NA>
#>   dispersal.maximum_weights dispersal.minimum_weights
#> 1                        NA                        NA

To add additional configuration parameters, add columns to this data frame.

Note: For a list of available fields, and whether they are required or optional, see get_cats_hybrid_fields().

Note: All optional configuration parameters which are set to NA in the data frame will be ignored when creating the configuration files.
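
The effect of the second note can be pictured with a small base R sketch. It does not call catsr and is not how catsr is implemented; it only shows which columns of a row carry a value. Treating species.seed_persistence as optional is an assumption made for this illustration; check get_cats_hybrid_fields() for the actual status of each field.

```r
# One configuration row; NA marks a parameter that has not been set.
# species.seed_persistence is assumed to be optional for this illustration.
config_row <- data.frame(
  general.run_name = "Test",
  general.time_steps = 10,
  species.seed_persistence = NA,
  stringsAsFactors = FALSE
)

# Keep only the columns that carry a value, as a stand-in for the
# parameters that would end up in the configuration file.
is_set <- !vapply(config_row, function(column) is.na(column[1]), logical(1))
written <- config_row[, is_set, drop = FALSE]

print(names(written))
```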

Limitations

When creating configurations, catsr currently only supports the hybrid parametrisation mode of CATS and no overlays. This limitation does not apply when external CATS configuration files are used to run CATS in catsr.

Configuration parameter naming scheme

CATS configuration files are INI files: human-readable text files structured into sections that contain key-value pairs. They can be read and modified with, for example, the ini package. For a specification of the CATS configuration file format and a full list of available configuration parameters see the CATS manual, sections Configuration files and Configuration parameters.

The name of a configuration parameter in catsr has the form section.key_name, which corresponds to this entry in a configuration file:

[section]
key name = value

Spaces in the configuration file parameter are replaced with underscores in catsr, and vice versa. Note: Because catsr currently only supports the hybrid parametrisation mode, the environment section with type = suitability is represented in catsr by the shorthand suitability, i.e. suitability.interpolation_interval = 10 corresponds to

[environment]
type = suitability
interpolation interval = 10
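
The naming scheme can be expressed as a short base R sketch. The helper format_ini_entry() is hypothetical and not part of catsr; it only mirrors the two rules described above (underscores become spaces, and the suitability shorthand expands to the environment section with type = suitability).

```r
# Hypothetical helper for illustration; not a catsr function.
# Turns a catsr-style parameter name and a value into INI lines.
format_ini_entry <- function(name, value) {
  section <- sub("\\..*$", "", name)       # text before the first dot
  key <- sub("^[^.]*\\.", "", name)        # text after the first dot
  key <- gsub("_", " ", key, fixed = TRUE) # underscores become spaces

  if (section == "suitability") {
    # Shorthand for the environment section in hybrid mode.
    header <- c("[environment]", "type = suitability")
  } else {
    header <- paste0("[", section, "]")
  }
  c(header, paste(key, "=", value))
}

cat(format_ini_entry("suitability.interpolation_interval", 10), sep = "\n")
cat(format_ini_entry("general.run_name", "Test"), sep = "\n")
```

The first call reproduces the three lines of the environment example; the second shows the ordinary case, where the part before the first dot becomes the section header.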

Running a CATS simulation

Before you can run CATS, it needs to be installed.

Downloading and installing CATS

On Windows, the newest version of CATS can be downloaded and installed using a helper function:

install.dir <- "path/to/a/directory"
cats_path <- download_and_install_cats(install.dir)

This needs to be done only once; afterwards, the path stored in cats_path can be reused. For other operating systems, please download and install CATS from source.

Running CATS

After installing CATS and creating the configuration data frame (see above), CATS can be run with the run_cats() function. The working directory for CATS needs to be passed to this function. The working directory is where the input data for CATS can be found. All relative paths referenced in the configuration are relative to this working directory.

wd <- "path/to/data"
result <-  run_cats(config, working_directory=wd, path_to_cats=cats_path)

This will run a single simulation for each row in the configuration data frame.
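
Because relative paths are resolved against the working directory, it can help to check them before starting a run. The following base R sketch illustrates the idea with a temporary directory and a placeholder file; the file name initial_population.tif is hypothetical, and the check itself is not something catsr is known to perform.

```r
# Temporary directory standing in for the CATS working directory.
wd <- file.path(tempdir(), "cats-wd")
dir.create(wd, showWarnings = FALSE)

# Placeholder input file; the file name is hypothetical.
invisible(file.create(file.path(wd, "initial_population.tif")))

# Fragment of a configuration that references the file by relative path.
config <- data.frame(
  general.run_name = "Test",
  species.initial_population_filename = "initial_population.tif",
  stringsAsFactors = FALSE
)

# Resolve the relative path against the working directory and check it.
resolved <- file.path(wd, config$species.initial_population_filename)
print(file.exists(resolved))
```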

Alternatively, you can also use an existing CATS configuration file instead of the configuration data frame.

wd <- "path/to/data"
config_file <- "path/to/some.conf"
result <-  run_cats(config_file, working_directory=wd, path_to_cats=cats_path)

To run multiple simulations for each configuration entry, see the replicate_numbers parameter of run_cats().

Processing results

After the CATS simulation has finished, the output files that were created are listed in the data frame returned by run_cats(). These contain summary statistics in CSV format, the log output of CATS, and population distribution ranges in GeoTIFF format. You can either process them manually or create an overview plot with plot_cats_run().

Examples

Running an existing CATS configuration file

In the first example we use an already existing CATS configuration file as-is.

library(catsr)
install.dir <- "catsr-test"
dir.create(install.dir)
cats_path <- download_and_install_cats(install.dir)
data_path <- download_cats_quickstart_data(install.dir)
config_file <- file.path(data_path, "Astmons-quick.conf")
result <-  run_cats(config_file, working_directory=data_path, path_to_cats=cats_path)
plot_cats_run(result, config_file)

Modifying an existing CATS configuration file

In the second example we load and modify an already existing CATS configuration file.

library(catsr)
install.dir <- "catsr-test"
dir.create(install.dir, showWarnings = FALSE) # The directory may already exist
# Don't redownload or reextract CATS and the data if they already exist
cats_path <- download_and_install_cats(install.dir, redownload=FALSE, reextract=FALSE)
data_path <- download_cats_quickstart_data(install.dir, redownload=FALSE, reextract=FALSE)
config_file <- file.path(data_path, "Astmons-quick.conf")
config <- load_cats_configuration(config_file) # We load the configuration file
config$general.run_name <- "Test" # and modify the data frame before we run it
result <-  run_cats(config, working_directory=data_path, path_to_cats=cats_path)
plot_cats_run(result, config)

Sensitivity analysis

In the third example we load an existing CATS configuration file and vary two of its parameters.

library(catsr)
install.dir <- "catsr-test"
dir.create(install.dir, showWarnings = FALSE) # The directory may already exist

# Don't redownload or reextract CATS and the data if they already exist
cats_path <- download_and_install_cats(install.dir, redownload=FALSE, reextract=FALSE)
data_path <- download_cats_quickstart_data(install.dir, redownload=FALSE, reextract=FALSE)
config_file <- file.path(data_path, "Astmons-quick.conf")
config <- load_cats_configuration(config_file) # We load the configuration file

# Create new configurations from all combinations of the specified values for
# flowering frequency and germination rate, while leaving all other parameters untouched
new_configs <- catsr::create_sensitivity_configs(config, species.flowering_frequency_maximum=c(0.1, 0.7),
                                                 species.germination_rate_maximum=c(0.1, 0.7))

# Run all configuration files and plot the change in distribution ranges
output_files <- run_cats_sensitivity(new_configs, path_to_cats=cats_path, working_directory=data_path)
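
To see which parameter combinations the example above asks for, the same values can be crossed with base R. This sketch does not call catsr and makes no claim about how create_sensitivity_configs() works internally or how it names the resulting runs; it only enumerates the combinations of the two value vectors.

```r
# Values to vary, named after the corresponding configuration columns.
values <- list(
  species.flowering_frequency_maximum = c(0.1, 0.7),
  species.germination_rate_maximum = c(0.1, 0.7)
)

# expand.grid() returns one row per combination of the supplied values.
combinations <- expand.grid(values)

print(combinations)
```

Two values for each of two parameters give four combinations, so four configurations would be expected from the call in the example.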