Section 2 Model setup with SWATBuildR

# run the buildR?
run_buildr = FALSE
if (run_buildr) {
  print(paste("Code executed @", Sys.time()))
} else{
  datemod <-
    file.info("model_data/cs10_setup/optain-cs10/optain-cs10.sqlite")
  print(paste("BuildR last run @", datemod$mtime))
}

## [1] "BuildR last run @ 2024-02-27 09:03:29.950109"

The OPTAIN project is using the COCOA approach (Schürz et al. 2022) and as such, needs to use the SWATBuildR package to build the SWAT+ model setup and calculate the connectivity between HRUs in the catchment. This chapter covers this process.

2.1 BuildR Setup

We will need the following packages for this chapter:

# not loading most packages because they interfere with the `install_load` 
# functions of BuildR. this means I need to prefix everything with :: until
# Christoph turns BuildR into a proper package with proper scoping. 

# require(dplyr)
# require(ggplot2)
# require(mapview)
# require(sf)
# require(raster)
# require(terra)
# require(raster)

# Common ggplot theme for simple features
sf_theme <- ggplot2::theme(axis.text.x=ggplot2::element_blank(),
        axis.ticks.x=ggplot2::element_blank(), 
        axis.text.y=ggplot2::element_blank(), 
        axis.ticks.y=ggplot2::element_blank()
        )

We will define the location of our project here, and give it a name:

project_path <- 'model_data/cs10_setup'
project_name <- 'optain-cs10'
project_layer <- TRUE

# load the BuildR
source('model_data/code/swat_buildR/init.R')
# Note the many "maskings" occurring here, dangerous!

2.2 Processing input data

SWATBuildR reads input data, performs some checks on it, and saves it in a compatible format for subsequent calculations. Little documentation exists on the creation of these data sets. IF you have any information to add, feel free to do so!

2.2.1 High-Resolution Digital Elevation Model (DEM)

The high-resolution DEM is the basis for calculation of water connectivity, among other things. Most of the documentation of the creation of the DEM for CS10 has been lost to the sands of time. All we know is that it is located here:

dem_path <- "model_data/input/elevation/dtm3_ns_v5.tif"

2.2.2 Processing the Basin Boundary

The basin boundary has presumably been created using the defined outlet point of the catchment and the DEM. No more is currently known about this file other than that it is located here:

bound_path <- "model_data/input/shape/cs10_basin.shp"

bound_sf <- sf::read_sf(bound_path)
basin <- ggplot2::ggplot(bound_sf) + ggplot2::geom_sf() + sf_theme
print(basin)

Figure 2.1: CS10 Basin, caclulated from the DEM.

The following code and commentary is from BuildR Script version 1.5.14, written by Christoph Schuerz.

BuildR recommends all layers to be in the same CRS, if we set project_layer to FALSE, it will throw an error when this is not the case:

The input layers might be in different coordinate reference systems (CRS). It is recommended to project all layers to the same CRS and check them before using them as model inputs. The model setup process checks if the layer CRSs differ from the one of the basin boundary. By setting ‘proj_layer <- TRUE’ the layer is projected if the CRS is different. If FALSE different CRS trigger an error.

project_layer <- TRUE

We read in and check the basin boundary and run some checks

bound <- read_sf(bound_path) %>% select()
set_proj_crs(bound, data_path)
check_polygon_topology(layer = bound, data_path =  data_path, label = 'basin', 
                       n_feat = 1, checks = c(F,T,T,T,F,F,F,F))

2.2.3 Processing the Land layer

Our land layer is located here:

land_path <- "model_data/input/land/CS10_LU_V2.shp"

Documentation on its creation does not exist.

lu_shp <- sf::read_sf(land_path)
lu_map <- ggplot2::ggplot(lu_shp) + ggplot2::geom_sf(ggplot2::aes(fill = type)) + sf_theme +
  ggplot2::theme(legend.position = "none")
print(lu_map)

Figure 2.2: Land use map of CS10 by Farm ID

We had an issue with the classification of the land uses. For the OPTAIN project, all agricultural fields must have a unique ID, and our land uses only had the ID of the given farm KGB (which had many different fields). To remedy this, new IDs were generated with the format a_###f_# where a_ represents the farm, and f_ represents the respective field of that farm. The farm names needed to be shortened because the SWAT+ model often cannot handle long ID names (longer than 16 characters)

Note, the potential measures polygons were not counted as individual fields, which is why SP_ID does not match up with field_ID – This is by design.

readr::read_csv("model_data/input/crop/a_f_id.csv", show_col_types = F) %>% head()

## # A tibble: 6 × 4
##   KGB      type_fr sp_id field_ID
##   <chr>    <chr>   <dbl> <chr>   
## 1 213/8/1  a_084       1 a_084f_1
## 2 213/7/6  a_083       2 a_083f_1
## 3 213/7/6  a_083       3 a_083f_2
## 4 213/7/4  a_082       4 a_082f_1
## 5 213/65/1 a_081       5 a_081f_1
## 6 213/65/1 a_081       6 a_081f_2

This was done in a simple QGIS workflow of dissolving by farm, splitting from single part to multipart, and then adding an iterating ID per farm field. This workflow could be replicated in R, and then shown here. It is under consideration…

BuildR will now run some checks on our land layer.

land <- read_sf(land_path) %>% 
  check_layer_attributes(., type_to_lower = FALSE) %>%
  check_project_crs(layer = ., data_path =  data_path, proj_layer = project_layer, 
                    label = 'land', type = 'vector')

check_polygon_topology(layer = land, data_path =  data_path, label = 'land', 
                       area_fct =  0.00, cvrg_frc = 99.9,
                       checks = c(T,F,T,T,T,T,T,T))

BuildR splits the land layer into HRU (land) and reservoir (water) objects

split_land_layer(data_path)

2.2.4 Processing the Channels

No documentation exists on the creation of the channels layer, all we know is that it is located here:

channel_path <- 'model_data/input/line/cs10_channels_utm33n.shp'
channels <- read_sf(channel_path) %>% dplyr::select("type")
channel_map <- ggplot2::ggplot() + ggplot2::geom_sf(data = bound_sf) + 
  ggplot2::geom_sf(data = channels, mapping = ggplot2::aes(color = type)) + sf_theme
channel_map

Figure 2.3: CS10 Channels including surface and subsurface channels

BuildR runs some checks:

 channel <- read_sf(channel_path) %>% 
    check_layer_attributes(., type_to_lower = TRUE) %>% 
    check_project_crs(layer = ., data_path =  data_path,
                      proj_layer = project_layer, 
                      label = 'channel', type = 'vector')

check_line_topology(layer = channel, data_path = data_path,
                    label = 'channel', length_fct = 0, can_cross = FALSE)

BuildR then checks the connectivity between the channels and reservoirs. For this we need to define our id_cha_out and id_res_out.

“Variable id_cha_out sets the outlet point of the catchment. Either define a channel OR a reservoir as the final outlet. If channel then assign id_cha_out with the respective id from the channel layer. If reservoir then assign the respective id from the land layer to id_res_out, otherwise leave as NULL”

id_cha_out <- 37
id_res_out <- NULL

Running connectivity checks between channels and reservoirs:

check_cha_res_connectivity(data_path, id_cha_out, id_res_out)

Checking if any defined channel ids for drainage from land objects do not exist

check_land_drain_ids(data_path)

2.2.5 Processing the DEM

BuildR loads and checks the DEM, and saves it.

dem <- rast(dem_path) %>% 
  check_project_crs(layer = ., data_path =  data_path, proj_layer = project_layer, 
                    label = 'dem', type = 'raster')
check_raster_coverage(rst = dem, vct_layer = 'land', data_path = data_path, 
                      label = 'dem', cov_frc = 0.95)
save_dem_slope_raster(dem = dem, data_path = data_path)

Here is the result:

map <- raster::raster("model_data/cs10_setup/optain-cs10/data/raster/dem.tif")
raster::plot(map)

Figure 2.4: CS10 DEM cropped to basin

2.2.6 Processing Soil Data

Our soil map is located here:

soil_layer_path  <- 'model_data/input/soil/soil_raster.tif'

No documentation exists on its creation.

map <- raster::raster(soil_layer_path)
raster::plot(map)

Figure 2.5: CS10 Soil map

The soil data and look-up path are located here:

soil_lookup_path <- 'model_data/input/soil/soil_lookup_26.csv'
soil_data_path   <- 'model_data/input/soil/usersoil_26_v2.csv'

Not much documentation exists here either. Some can be found in the excel sheet:

~/model_data/input/soil/swatsoil2.xlsx

BuildR reads in the soil data, performs checks, processes, and saves.

soil <- rast(soil_layer_path) %>% 
  check_project_crs(layer = ., data_path =  data_path, proj_layer = project_layer, 
                    label = 'soil', type = 'raster')
check_raster_coverage(rst = soil, vct_layer = 'hru', data_path = data_path, 
                      label = 'soil', cov_frc = 0.75)
save_soil_raster(soil, data_path)

BuildR then generates a table with aggregated elevation, slope, soil for HRU units.

aggregate_hru_dem_soil(data_path)

Read and prepare the soil input tables and a soil/hru id table and write them into data_path/tables.sqlite

build_soil_data(soil_lookup_path, soil_data_path, data_path)

2.3 Calculating Contiguous Object Connectivity

SWATBuildR follows COCOA. This section contains the calculations. You can read more about it in the protocol.

2.3.1 Calculating land unit connectivity

Preparing raster layers based on the DEM and the surface channel objects that will be used in the calculation of the land object connectivity.

prepare_terrain_land(data_path)

The connection of each land object to neighboring land and water objects is calculated based on the flow accumulation and the D8 flow pointer along the object edge

calculate_land_connectivity(data_path)

2.3.1.1 Eliminate land object connections with small flow fractions:

For each land object the flow fractions are compared to connection with the largest flow fraction of that land object. Connections are removed if their fraction is smaller than frc_thres relative to the largest one.

This is necessary to:

Simplify the connectivity network
To reduce the risk of circuit routing between land objects. Circuit routing will be checked.

If an error due to circuit routing is triggered, then ‘frc_thres’ must be increased to remove connectivities that may cause this issue.

frc_thres <- 0.3

The remaining land object connections are analyzed for infinite loop routing. For each land unit the connections are propagated and checked if the end up again in the same unit.

# reduce_land_connections(data_path, frc_thres) %>% 
#  check_infinite_loops(., data_path, 'Land')

# Analyzing land objects for infinite loop routing: Completed 6185 Land objects
# in 10M 44S
#
# ✘  150 Land objects identified where water is routed in loops.
#
# You can resolve this issue in the following ways:
#
#  - Use the layer 'land_infinite_loops.gpkg' that was written to
# model_data/cs10_setup/optain-cs10/data/vector to identify land polygons that
# cause the issue and split them to break the loops This would require to
# restart the entire model setup procedure!
#
#   - Increase the value of 'frc_thres'.
# This reduces the number of connections of each land unit (maybe undesired!)
# and can remove the connections that route the water in loops.
#
#   - Continue with the model setup (only recommended for small number of identified units!).
# The function 'resolve_loop_issues()' will then eliminate a certain number of
# connections.

What do we do here ? – I think in the past we just removed all the loops, but it says here it is not recommended when there are a lot of them. Should we increase the threshold? Should we try to break up the loops? You can see the problem polygons on the map below:

land_inifnite_loops_shp <- sf::read_sf("model_data/cs10_setup/optain-cs10/data/vector/land_infinite_loops.gpkg")
mapview::mapview(land_inifnite_loops_shp)

To me it seems to complicated to try to break up the loops, maybe increase the threshold a bit and see what happens? and delete the rest?

Increasing threshold and re-running leads to this:

frc_thres <- 0.1 #     loops (   mins)
frc_thres <- 0.2 # 238 loops (30 mins)
frc_thres <- 0.3 # 150 loops (12 mins)
frc_thres <- 0.4 # 115 loops (   mins)
frc_thres <- 0.5 # 94  loops (   mins)
frc_thres <- 0.6 # 57  loops (   mins)

We have decided that at a threshold of 0.3 and 150, we are within the “not that many” loops territory, considering out of 6000+ HRUs, it is only around 2%. Therefore we will let BuildR continue and remove them. This has been discussed in Issue #55.

frc_thres <- 0.3 # 150 loops
reduce_land_connections(data_path, frc_thres) %>% 
  check_infinite_loops(., data_path, 'Land')

2.3.1.2 Resolve infinite loops

If infinite loops were identified this routine tries to resolve the issues by selectively removing connections between land units in order to get rid of all infinite loops.

 resolve_loop_issues(data_path)

2.3.2 Calculating channel/reservoir connectivity

2.3.2.1 Calculating the water object connectivity

the function returns the cha and res con_out tables in SWAT+ database format and writes them into data_path/tables.sqlite

build_water_object_connectivity(data_path)

2.3.3 Checking the water objects for infinite loops

From the cha_res_con_out tables id_from/id_to links are generated and checked for infinite loop routing.

prepare_water_links(data_path) %>% 
  check_infinite_loops(., data_path, 'Water', Inf)

2.3.4 Terrain properties

Calculate terrain properties such as elevation, slope, catchment area, channel width/depth for channel and reservoir objects and write them into data_path/tables.sqlite

prepare_terrain_water(data_path)

2.4 Generate SWAT+ input

2.4.1 Generate land object SWAT+ input tables

Build the landuse.lum and a landuse/hru id table and write them into data_path/tables.sqlite

build_landuse(data_path)

Build the HRU SWAT+ input files and write them into data_path/tables.sqlite

build_hru_input(data_path)

Add wetlands to the HRUs and build the wetland input files and write them into data_path/tables.sqlite

We only use the wetf land use .

wetland_landuse <- c('wehb', 'wetf', 'wetl', 'wetn')

add_wetlands(data_path, wetland_landuse)

2.4.2 Generate water object SWAT+ input tables

Build the SWAT+ cha input files and write them into data_path/tables.sqlite

build_cha_input(data_path)

Build the SWAT+ res input files and write them into data_path/tables.sqlite

build_res_input(data_path)

Build SWAT+ routing unit con_out based on ‘land_connect_fraction’.

build_rout_con_out(data_path)

Build the SWAT+ rout_unit input files and write them into data_path/tables.sqlite

build_rout_input(data_path)

Build the SWAT+ LSU input files and write them into data_path/tables.sqlite

build_ls_unit_input(data_path)

2.4.3 Build aquifer input

Build the SWAT+ aquifer input files for a single aquifer for the entire catchment. The connectivity to the channels with geomorphic flow must be added after writing the txt input files. This is not implemented in the script yet.

build_single_aquifer_files(data_path)

2.4.4 Add point source inputs

Point sources have been prepared in Section REF

The point source locations are provided with a point vector layer in the path ‘point_path’.

point_path <- 'model_data/input/pointsource/cs10-ps-data.shp'

Maximum distance of a point source to a channel or a reservoir to be included as a point source object (recall) in the model setup:

max_point_dist <- 500 # meters

Adding the point sources with BuildR:

add_point_sources(point_path, data_path, max_point_dist)

2.4.5 Create SWAT+ sqlite database

2.4.5.1 Write the SWAT+Editor project database

The database will be located the ‘project_path’.

stat <- file.remove("model_data/cs10_setup/optain-cs10/optain-cs10.sqlite")

if(stat == FALSE){stop("could not delete database")}

create_swatplus_database(project_path, project_name)

References

Schürz, Christoph, Natalja Čerkasova, Csilla Farkas, Attila Nemes, Plunge, Michael Strauch, Brigitta Szabó, and Mikołaj Piniewski. 2022. “SWAT+ Modeling Protocol for the Assessment of Water and Nutrient Retention Measures in Small Agricultural Catchments,” December. https://doi.org/10.5281/ZENODO.7463395.