Virtual Ecosystem example data

Example data is included with Virtual Ecosystem to introduce the file formats and configuration. The usage documentation describes how to run a simulation with this data; this page describes the structure and contents of the example data folder.

It might be useful to install the ve_example directory to a location of your choice when reading these notes, using the command shown below, but the contents of the key files are also linked on this page.

ve_run --install-example /location/path/

Example data directory structure

The ve_example directory contains the following sub-directories:

The config directory contains four configuration files that combine to provide a basic but complete configuration for the example data.
The data directory contains NetCDF format files containing the variables required to initialise the model and then iterate over a time series.
The empty out directory is simply used as a location to store model outputs.
The generation_scripts directory contains Python scripts that are used to generate the contents of the data directory. You won’t typically need to look at these, but they provide simple recipes for creating or editing the input data files.
The source directory contains files used to create the input data files.

Configuration files

The example configuration files are:

The ve_run.toml file configures the models to be used in the simulation and the order in which they are initialised and updated.

config/ve_run.toml

[core]
[hydrology.depends]
init = ['plants']
update = ['plants', 'abiotic_simple']
[abiotic_simple]
[animals]
[litter.depends]
update = ['hydrology', 'abiotic_simple']
[soil.depends]
init = ['hydrology', 'abiotic_simple']
update = ['hydrology', 'abiotic_simple']

The data_config.toml file configures the initial variables to be loaded and sets the paths to the source files providing those variables.

config/data_config.toml

[core.data_output_options]
save_initial_state = false

# Climate data
[[core.data.variable]]
file = "../data/example_climate_data.nc"
var_name = "air_temperature_ref"
[[core.data.variable]]
file = "../data/example_climate_data.nc"
var_name = "relative_humidity_ref"
[[core.data.variable]]
file = "../data/example_climate_data.nc"
var_name = "atmospheric_pressure_ref"
[[core.data.variable]]
file = "../data/example_climate_data.nc"
var_name = "precipitation"
[[core.data.variable]]
file = "../data/example_climate_data.nc"
var_name = "atmospheric_co2_ref"
[[core.data.variable]]
file = "../data/example_climate_data.nc"
var_name = "mean_annual_temperature"
[[core.data.variable]]
file = "../data/example_climate_data.nc"
var_name = "wind_speed_ref"

# Elevation
[[core.data.variable]]
file = "../data/example_elevation_data.nc"
var_name = "elevation"

# Hydrology
[[core.data.variable]]
file = "../data/example_surface_runoff_data.nc"
var_name = "surface_runoff"

# Soil
[[core.data.variable]]
file = "../data/example_soil_data.nc"
var_name = "pH"
[[core.data.variable]]
file = "../data/example_soil_data.nc"
var_name = "bulk_density"
[[core.data.variable]]
file = "../data/example_soil_data.nc"
var_name = "clay_fraction"
[[core.data.variable]]
file = "../data/example_soil_data.nc"
var_name = "soil_c_pool_lmwc"
[[core.data.variable]]
file = "../data/example_soil_data.nc"
var_name = "soil_c_pool_maom"
[[core.data.variable]]
file = "../data/example_soil_data.nc"
var_name = "soil_c_pool_microbe"
[[core.data.variable]]
file = "../data/example_soil_data.nc"
var_name = "soil_c_pool_pom"
[[core.data.variable]]
file = "../data/example_soil_data.nc"
var_name = "soil_enzyme_pom"
[[core.data.variable]]
file = "../data/example_soil_data.nc"
var_name = "soil_enzyme_maom"

# Litter 
[[core.data.variable]]
file = "../data/example_litter_data.nc"
var_name = "litter_pool_above_metabolic"
[[core.data.variable]]
file = "../data/example_litter_data.nc"
var_name = "litter_pool_above_structural"
[[core.data.variable]]
file = "../data/example_litter_data.nc"
var_name = "litter_pool_woody"
[[core.data.variable]]
file = "../data/example_litter_data.nc"
var_name = "litter_pool_below_metabolic"
[[core.data.variable]]
file = "../data/example_litter_data.nc"
var_name = "litter_pool_below_structural"
[[core.data.variable]]
file = "../data/example_litter_data.nc"
var_name = "lignin_above_structural"
[[core.data.variable]]
file = "../data/example_litter_data.nc"
var_name = "lignin_woody"
[[core.data.variable]]
file = "../data/example_litter_data.nc"
var_name = "lignin_below_structural"

# Plants
[[core.data.variable]]
file = "../data/example_plant_data.nc"
var_name = "plant_cohorts_n"
[[core.data.variable]]
file = "../data/example_plant_data.nc"
var_name = "plant_cohorts_pft"
[[core.data.variable]]
file = "../data/example_plant_data.nc"
var_name = "plant_cohorts_cell_id"
[[core.data.variable]]
file = "../data/example_plant_data.nc"
var_name = "plant_cohorts_dbh"
[[core.data.variable]]
file = "../data/example_plant_data.nc"
var_name = "photosynthetic_photon_flux_density"

The animal_functional_groups.toml file provides basic configuration for the animals model to set functional group definitions.

config/animal_functional_groups.toml

# animal_functional_groups.toml

[[animals.functional_groups]]
name = "carnivorous_bird"
taxa = "bird"
diet = "carnivore"
metabolic_type = "endothermic"
birth_mass = 0.1
adult_mass = 1.0

[[animals.functional_groups]]
name = "herbivorous_bird"
taxa = "bird"
diet = "herbivore"
metabolic_type = "endothermic"
birth_mass = 0.05
adult_mass = 0.5

[[animals.functional_groups]]
name = "carnivorous_mammal"
taxa = "mammal"
diet = "carnivore"
metabolic_type = "endothermic"
birth_mass = 4.0
adult_mass = 40.0

[[animals.functional_groups]]
name = "herbivorous_mammal"
taxa = "mammal"
diet = "herbivore"
metabolic_type = "endothermic"
birth_mass = 1.0
adult_mass = 10.0

[[animals.functional_groups]]
name = "carnivorous_insect"
taxa = "insect"
diet = "carnivore"
metabolic_type = "ectothermic"
birth_mass = 0.001
adult_mass = 0.01

[[animals.functional_groups]]
name = "herbivorous_insect"
taxa = "insect"
diet = "herbivore"
metabolic_type = "ectothermic"
birth_mass = 0.0005
adult_mass = 0.005

The plant_config.toml file provides basic configuration for the plants model to set functional group definitions.

config/plant_config.toml

[plants]
a_plant_integer = 12

[[plants.ftypes]]
pft_name = "shrub"
max_height = 1.0

[[plants.ftypes]]
pft_name = "broadleaf"
max_height = 50.0

Example data files

The data configuration file links the variables used in the models to the data files in which they are stored. The NetCDF files in the data directory are all generated by the Python scripts in the generation_scripts directory, which provide simple recipes for creating various kinds of input data and saving them in the required format.

Warning

All of these data files currently contain artificial data to test the program flow and data handling of the Virtual Ecosystem simulation. Although some values are taken from real source data, this is not yet a meaningful real world example dataset.

The common.py script file defines some common elements that are used across the data generation scripts, primarily the spatial grid to be used and the dates for time series data.

common.py

"""Common components for example data generation.

This file defines some simple shared components used for generating the example datasets
and which will be imported by the other code in this module.
"""

import numpy as np

# Generate range of cell numbers in the x and y directions. Here we have a 9x9 grid,
# so cells are numbered from 0 to 8 in each direction.
nx = 9
ny = 9
x_cell_ids = np.arange(nx)
y_cell_ids = np.arange(ny)

# How far the center of each cell is from the origin. This applies to both the x and y
# direction independently, so cell (0,0) is at the origin, whereas cell (2,3) is 180m
# from the origin in the x direction and 270m in the y direction.
cell_displacements = np.arange(0, 721, 90)

# Cell id codes
n_cells = nx * ny
cell_id = np.arange(n_cells)

# Time dimension - a time series of 24 months.
time = np.arange(np.datetime64("2013-01"), np.datetime64("2015-01")).astype(
    "datetime64[D]"
)
n_dates = len(time)
time_index = np.arange(n_dates)
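The grid and time definitions can be checked interactively. The sketch below rebuilds the same arrays and expands the one-dimensional displacements into centre coordinates for every cell; the names `x_centres` and `y_centres` are introduced here for illustration and do not appear in common.py.

```python
import numpy as np

nx, ny = 9, 9
cell_displacements = np.arange(0, 721, 90)  # 0, 90, ..., 720 m

# Centre coordinates of every cell, indexed as [x_index, y_index]
x_centres, y_centres = np.meshgrid(
    cell_displacements, cell_displacements, indexing="ij"
)
print(x_centres[2, 3], y_centres[2, 3])

# 24 monthly time steps, stored as the first day of each month
time = np.arange(np.datetime64("2013-01"), np.datetime64("2015-01")).astype(
    "datetime64[D]"
)
print(len(time), time[0], time[-1])
```

Cell (2, 3) comes out at 180 m and 270 m, matching the comment in the script, and the time axis runs from 2013-01-01 to 2014-12-01.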

Elevation data

The data/example_elevation_data.nc file provides:

Variable	Name	Unit	Dims
elevation	`elevation`	m	XY

elevation_example_data.py

"""Elevation data for `ve_run` example.

This code creates an example elevation map from a digital elevation model
([SRTM](https://www2.jpl.nasa.gov/srtm/)) which is required to run the example hydrology
model.

The commented code is used to download an existing processed SRTM dataset for the SAFE
Project area, covering the region 4°N 116°E to 5°N 117°E, see [SAFE
wiki](https://safeproject.net/dokuwiki/safe_gis/srtm) for reference. The processed
datafile can be downloaded from its [Zenodo record](https://zenodo.org/records/3490488).
The dataset is then downscaled to match the required target resolution of 90m. At the
moment this does not _actually_ align with the climate data, it is simply forced to the
same coordinates and resolution.

To save processing and to avoid adding requirements to the package, the resulting data
is simply stored here and written to an appropriate file format.
"""

import numpy as np
from xarray import DataArray

from virtual_ecosystem.example_data.generation_scripts.common import cell_displacements

# # Load DEM in 30m resolution
# original_data = requests.get(
#     "https://zenodo.org/records/3490488/files/SRTM_UTM50N_processed.tif"
# )

# data_path = Path("SRTM_UTM50N_processed.tif")
# with open(data_path, "wb") as f:
#     f.write(original_data.content)

# dem = rioxarray.open_rasterio("SRTM_UTM50N_processed.tif")

# # Specify the original grid coordinates
# x = dem.coords["x"]  # type: ignore  # noqa
# y = dem.coords["y"]  # type: ignore  # noqa

# # Create a new grid of longitude and latitude coordinates with coarser resolution
# new_resolution = 26000
# new_x = np.arange(x.min(), x.max(), new_resolution)  # type: ignore  # noqa
# new_y = np.arange(y.min(), y.max(), new_resolution)  # type: ignore  # noqa

# # Project DEM to new mesh
# dem_9x9 = dem.interp(x=new_x, y=new_y)  # type: ignore  # noqa

# # Reduce the data to required information for netcdf
# dem_cleaned = (
#     dem_9x9.drop_vars(
#       ["band", "spatial_ref"]
#     ).squeeze("band").drop_indexes(["x", "y"])
# )

dem_data = np.array(
    [
        [1353.0, 583.0, 248.333, 118.0, 24.0, 35.0, 11.0, 46.333, 0.0],
        [1122.667, 446.111, 404.0, 462.667, 65.444, 52.667, 40.667, 0.0, 11.222],
        [928.667, 284.778, 277.222, 552.667, 655.111, 671.667, 54.667, 42.222, 831.778],
        [1008.0, 992.333, 440.0, 582.0, 523.0, 338.333, 596.0, 548.0, 314.0],
        [619.0, 580.778, 471.222, 271.333, 293.667, 169.0, 609.333, 301.444, 175.667],
        [374.0, 415.111, 500.111, 318.667, 138.556, 91.444, 88.0, 81.0, 152.778],
        [1262.0, 316.667, 606.333, 401.0, 116.0, 110.667, 107.0, 16.0, 11.667],
        [159.333, 1121.778, 1207.222, 524.333, 253.889, 77.444, 76.667, 34.333, 9.889],
        [0.0, 820.222, 1154.889, 850.333, 299.222, 183.556, 7.333, 8.111, 17.889],
    ]
)
dem_cleaned = DataArray(name="elevation", data=dem_data, dims=("x", "y"))

# Change coordinates to match example data grid
dem_placed = dem_cleaned.assign_coords(
    {"x": cell_displacements, "y": cell_displacements}
)

# Save to netcdf
dem_placed.to_netcdf("../data/example_elevation_data.nc")

This code creates a dummy elevation map from a digital elevation model (SRTM), which is required to run the hydrology_model, amongst others. The source data covers the region 4°N 116°E to 5°N 117°E; see the SAFE wiki for reference and download. The original 30 m spatial resolution is reduced to match the 9 x 9 grid of the example simulation, while covering an area similar to the dummy climate data.
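For readers who want to coarsen their own elevation raster, one simple alternative to the `interp` call in the commented code is a block mean. The sketch below is an assumption-laden illustration, not the method used to produce the stored values: it uses a made-up 27 x 27 surface standing in for 30 m data and averages 3 x 3 blocks to reach a 9 x 9 grid of 90 m cells.

```python
import numpy as np

# Synthetic 27 x 27 "30 m" elevation surface (made-up values, not SRTM data)
fine = np.add.outer(np.arange(27.0), np.arange(27.0))

factor = 3  # 3 x 30 m = 90 m
n_coarse = fine.shape[0] // factor

# Group the fine cells into 3 x 3 blocks and average each block
coarse = fine.reshape(n_coarse, factor, n_coarse, factor).mean(axis=(1, 3))

# The coarse grid has one value per example cell displacement
cell_displacements = np.arange(0, 721, 90)
print(coarse.shape, cell_displacements.shape)
```

A block mean preserves the average elevation of each coarse cell, whereas interpolation samples the surface at the new cell centres; which is preferable depends on how the elevation is used downstream.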

Climate data

The example_climate_data.nc file provides:

Variable	Name	Unit	Dims
air temperature	`air_temperature_ref`	°C	XYT
relative humidity	`relative_humidity_ref`	unitless	XYT
atmospheric pressure	`atmospheric_pressure_ref`	kPa	XYT
precipitation	`precipitation`	mm \(\textrm{month}^{-1}\)	XYT
atmospheric \(\ce{CO_{2}}\) concentration	`atmospheric_co2_ref`	ppm	XYT
mean annual temperature	`mean_annual_temperature`	°C	XY

climate_example_data.py

"""Simple climate data pre-processing example for `ve_run` example data.

This section illustrates how to perform simple manipulations to adjust ERA5-Land data to
use in the Virtual Ecosystem. This includes reading climate data from netcdf,
converting the data into an input format that is suitable for the abiotic module (e.g.
Kelvin to Celsius conversion), adding further required variables, and writing the output
in a new netcdf file. This does not include spatially interpolating coarser resolution
climate data and including the effects of local topography.

Input file: ERA5_land.nc

Metadata:

* Muñoz-Sabater,J. et al: ERA5-Land: A state-of-the-art global reanalysis dataset for
  land applications, Earth Syst. Sci. Data,13, 4349-4383, 2021.
  [https://doi.org/10.5194/essd-13-4349-2021](https://doi.org/10.5194/essd-13-4349-2021)
* Product type: Monthly averaged reanalysis
* Variable: 2m dewpoint temperature, 2m temperature, Surface pressure, Total
  precipitation
* Year: 2013, 2014
* Month: January, February, March, April, May, June, July, August, September, October,
  November, December
* Time: 00:00
* Sub-region extraction: North 6°, West 116°, South 4°, East 118°
* Format: NetCDF3

Once the new netcdf file is created, the final step is to add the grid information to
the grid config `TOML` to load this data correctly when setting up a Virtual Ecosystem
Simulation. Here, we can also add the 45 m offset to position the coordinates at the
centre of the grid cell.

[core.grid]
cell_nx = 9
cell_ny = 9
cell_area = 8100
xoff = -45.0
yoff = -45.0
"""

import numpy as np
import xarray as xr
from xarray import DataArray

from virtual_ecosystem.example_data.generation_scripts.common import (
    time_index,
    x_cell_ids,
    y_cell_ids,
)

# 1. Load ERA5_Land data in low resolution

dataset = xr.open_dataset("../source/ERA5_land.nc")

# 2. Convert temperature units
# The standard output unit of ERA5-Land temperatures is Kelvin which we need to convert
# to degree Celsius for the Virtual Ecosystem. This includes 2m air temperature and
# 2m dewpoint temperature, which are used to calculate relative humidity in the next
# step.

dataset["t2m_C"] = dataset["t2m"] - 273.15  # 2m air temperature
dataset["d2m_C"] = dataset["d2m"] - 273.15  # 2m dewpoint temperature

# 3. Calculate relative humidity
# Relative humidity (RH) is not a standard output from ERA5-Land but can be calculated
# from 2m dewpoint temperature (DPT) and 2m air temperature (T)

dataset["rh2m"] = 100.0 * (
    np.exp(17.625 * dataset["d2m_C"] / (243.04 + dataset["d2m_C"]))
    / np.exp(17.625 * dataset["t2m_C"] / (243.04 + dataset["t2m_C"]))
)

# 4. Convert precipitation units
# The standard output unit for total precipitation in ERA5-Land is meters which we need
# to convert to millimeters. Further, the data represents mean daily accumulated
# precipitation for the 9x9km grid box, so the value has to be scaled to monthly (here
# 30 days). TODO handle daily inputs

dataset["tp_mm"] = dataset["tp"] * 1000 * 30

# 5. Convert surface pressure units
# The standard output unit for surface pressure in ERA5-Land is Pascal (Pa) which we
# need to convert to Kilopascal (kPa).

dataset["sp_kPa"] = dataset["sp"] / 1000

# 6. Clean dataset and rename variables
# In this step, we delete the initial temperature variables (K), precipitation (m), and
# surface pressure (Pa) and rename the remaining variables according to the Virtual
# Ecosystem naming convention.

dataset_cleaned = dataset.drop_vars(["d2m", "d2m_C", "t2m", "tp", "sp"])
dataset_renamed = dataset_cleaned.rename(
    {
        "sp_kPa": "atmospheric_pressure_ref",
        "tp_mm": "precipitation",
        "t2m_C": "air_temperature_ref",
        "rh2m": "relative_humidity_ref",
    }
)

# 7. Add further required variables
# In addition to the variables from the ERA5-Land dataset, a time series of atmospheric
# CO2 is needed. We add this here as a constant field, together with a constant
# reference wind speed. Mean annual temperature is calculated from the full time series
# of air temperatures; in the future, this should be done for each year.

dataset_renamed["atmospheric_co2_ref"] = DataArray(
    np.full_like(dataset_renamed["air_temperature_ref"], 400),
    dims=["time", "latitude", "longitude"],
)
dataset_renamed["wind_speed_ref"] = DataArray(
    np.full_like(dataset_renamed["air_temperature_ref"], 0.1),
    dims=["time", "latitude", "longitude"],
)
dataset_renamed["mean_annual_temperature"] = dataset_renamed[
    "air_temperature_ref"
].mean(dim="time")


# 8. Change coordinates to x-y in meters
# The following code segment changes the coordinate names from `longitude/latitude` to
# `x/y` and the units from `minutes` to `meters`. The ERA5-Land coordinates are treated
# as the centre points of the grid cells which means that when setting up the grid, an
# offset of 4.5 km has to be added.

dataset_xy = (
    dataset_renamed.rename_dims({"longitude": "x", "latitude": "y"})
    .assign_coords({"x": np.arange(0, 180000, 9000), "y": np.arange(0, 180000, 9000)})
    .drop_vars(["longitude", "latitude"])
)

# 9. Scale to 90 m resolution
# The Virtual Ecosystem example data is run on a 90 x 90 m grid. This means that some
# form of spatial downscaling has to be applied to the dataset, for example by spatially
# interpolating coarser resolution climate data and including the effects of local
# topography. This is not yet implemented!

# For the purposes of example data in the development stage, the coordinates can be
# overwritten to match the Virtual Ecosystem grid and we can select a smaller area.
# Note that the resulting dataset no longer matches a digital elevation model for the
# area!

dataset_xy_100 = (
    dataset_renamed.rename_dims({"longitude": "x", "latitude": "y"})
    .assign_coords({"x": np.arange(0, 1800, 90), "y": np.arange(0, 1800, 90)})
    .drop_vars(["longitude", "latitude"])
)
dataset_xy_example = dataset_xy_100.isel(x=x_cell_ids, y=y_cell_ids)

# 10. Add time_index
# At the moment, the example model iterates over time indices rather than real datetime.
# Therefore, we add a `time_index` coordinate to the dataset:

dataset_xy_timeindex = (
    dataset_xy_example.rename_dims({"time": "time_index"})
    .assign_coords({"time_index": time_index})
    .drop_vars("time")
)

# 11. Save netcdf
# Once we have confirmed that our dataset is complete and our calculations are correct,
# we save it as a new netcdf file. This can then be fed into the core data loading
# system here {mod}`~virtual_ecosystem.core.data`.

dataset_xy_timeindex.to_netcdf("../data/example_climate_data.nc")
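The unit conversions in steps 2 to 5 of the script can be tried on single values without any input file. The sketch below applies the same formulas with the standard library only; the input values are made up for illustration and the function names are hypothetical.

```python
import math


def kelvin_to_celsius(temp_k: float) -> float:
    """Convert a temperature from Kelvin to degrees Celsius."""
    return temp_k - 273.15


def relative_humidity(temp_c: float, dewpoint_c: float) -> float:
    """Relative humidity (%) from air and dewpoint temperature in °C."""
    a, b = 17.625, 243.04  # coefficients used in the script
    return 100.0 * math.exp(a * dewpoint_c / (b + dewpoint_c)) / math.exp(
        a * temp_c / (b + temp_c)
    )


# Made-up single-cell values in ERA5-Land units
t2m, d2m = 298.15, 293.15  # K
tp = 0.005  # m, mean daily accumulated precipitation
sp = 101325.0  # Pa

air_temperature = kelvin_to_celsius(t2m)
rh = relative_humidity(air_temperature, kelvin_to_celsius(d2m))
precipitation = tp * 1000 * 30  # m per day -> mm per 30-day month
pressure = sp / 1000  # Pa -> kPa
print(round(air_temperature, 2), round(rh, 1), precipitation, pressure)
```

With an air temperature of 25 °C and a dewpoint of 20 °C the formula gives a relative humidity of roughly 74 %, and it returns 100 % whenever the dewpoint equals the air temperature.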

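The dummy climate data for the example simulation is based on monthly ERA5-Land data, which can be downloaded from the Copernicus Climate Data Store. See the climate data recipes page for more details. The generation script also writes a constant wind_speed_ref field, which data_config.toml loads alongside the variables listed above.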
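The grid settings quoted in the script docstring (cell_area = 8100 with xoff and yoff of -45.0) are consistent with the 90 m cell displacements used by the data files. The arithmetic can be sketched as follows, assuming square cells and assuming that the offset gives the position of the outer edge of the first cell; neither assumption is stated on this page.

```python
import math

cell_nx, cell_area, xoff = 9, 8100, -45.0

side = math.sqrt(cell_area)  # 90 m for a square cell (assumed geometry)
left_edges = [xoff + i * side for i in range(cell_nx)]
centres = [edge + side / 2 for edge in left_edges]
print(centres)
```

Under these assumptions the cell centres fall at 0, 90, ..., 720 m, which are the coordinate values stored in the example NetCDF files.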

Hydrology data

The example_surface_runoff_data.nc file provides:

Variable	Name	Unit	Dims
surface runoff	`surface_runoff`	mm	XY

runoff_example_data.py

"""Example runoff data for `ve_run`.

This code randomly generates normally distributed surface runoff data to run in the
`ve_run` example data without the SPLASH implementation.
"""

import numpy as np
from xarray import DataArray

from virtual_ecosystem.example_data.generation_scripts.common import cell_displacements

# Randomly generate surface runoff with normal distribution
mu, sigma = 10, 2  # mean and standard deviation
s = np.random.default_rng(seed=42).normal(
    mu, sigma, (len(cell_displacements), len(cell_displacements))
)

runoff = DataArray(
    s,
    dims=["x", "y"],
    coords={"x": cell_displacements, "y": cell_displacements},
    name="surface_runoff",
)

# Save to netcdf
runoff.to_netcdf("../data/example_surface_runoff_data.nc")

The hydrology model requires an initial surface runoff field to calculate accumulated surface runoff. The field is currently drawn from a normal distribution and adjusted to Virtual Ecosystem conventions; in the future it will be estimated from rainfall data using the SPLASH model.
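Because the script seeds the random number generator, the runoff field is the same every time it is regenerated. The sketch below demonstrates this with the same mean, standard deviation and seed; it only inspects the array and does not write a file.

```python
import numpy as np

cell_displacements = np.arange(0, 721, 90)
shape = (len(cell_displacements), len(cell_displacements))

# Two generators with the same seed produce identical fields
first = np.random.default_rng(seed=42).normal(10, 2, shape)
second = np.random.default_rng(seed=42).normal(10, 2, shape)

print(first.shape, np.array_equal(first, second))
print(round(first.mean(), 2), round(first.std(), 2))
```

A normal distribution can in principle produce negative values; with a mean of 10 and a standard deviation of 2 this is very unlikely, but a different parameter choice might need clipping at zero.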

Soil data

The example_soil_data.nc file provides:

Variable	Name	Unit	Dims
pH	`pH`	unitless	XY
Bulk soil density	`bulk_density`	kg \(\textrm{m}^{-3}\)	XY
Soil clay fraction	`clay_fraction`	unitless	XY
Soil low molecular weight carbon pool	`soil_c_pool_lmwc`	kg C \(\textrm{m}^{-3}\)	XY
Soil mineral associated organic matter carbon pool	`soil_c_pool_maom`	kg C \(\textrm{m}^{-3}\)	XY
Soil microbial carbon pool	`soil_c_pool_microbe`	kg C \(\textrm{m}^{-3}\)	XY
Soil particulate organic matter carbon pool	`soil_c_pool_pom`	kg C \(\textrm{m}^{-3}\)	XY
Soil particulate organic matter enzyme pool	`soil_enzyme_pom`	kg C \(\textrm{m}^{-3}\)	XY
Soil mineral associated organic matter enzyme pool	`soil_enzyme_maom`	kg C \(\textrm{m}^{-3}\)	XY

The generation script creates a set of plausible values for which the soil_model absolutely has to function sensibly. Descriptions of the soil pools can be found here.

soil_example_data.py

"""Example soil data for `ve_run`.

This script generates the data required to run the soil component of the example
dataset. **It is important to note that none of this data is real data**. Instead, this
data is a set of plausible values that the soil model absolutely has to function
sensibly for.
"""

import numpy as np
from xarray import Dataset

from virtual_ecosystem.example_data.generation_scripts.common import cell_displacements

gradient = np.outer(cell_displacements / 90, cell_displacements / 90)

# Generate a range of plausible values (3.5-4.5) for the soil pH [unitless].
pH_values = 3.5 + 1.00 * (gradient) / (64)

# Generate a range of plausible values (1200-1800) for the bulk density [kg m^-3].
bulk_density_values = 1200.0 + 600.0 * (gradient) / (64)

# Generate a range of plausible values (0.27-0.40) for the clay fraction [fraction].
clay_fraction_values = 0.27 + 0.13 * (gradient) / (64)

# Generate a range of plausible values (0.005-0.01) for the lmwc pool [kg C m^-3].
lmwc_values = 0.005 + 0.005 * (gradient) / (64)

# Generate a range of plausible values (1.0-3.0) for the maom pool [kg C m^-3].
maom_values = 1.0 + 2.0 * (gradient) / (64)

# Generate a range of plausible values (0.0015-0.005) for the microbial C pool
# [kg C m^-3].
microbial_C_values = 0.0015 + 0.0035 * (gradient) / (64)

# Generate a range of plausible values (0.1-1.0) for the POM pool [kg C m^-3].
pom_values = 0.1 + 0.9 * (gradient) / (64)

# Generate a range of plausible values (0.01-0.5) for the POM enzyme pool [kg C m^-3].
pom_enzyme_values = 0.01 + 0.49 * (gradient) / (64)

# Generate a range of plausible values (0.01-0.5) for the MAOM enzyme pool [kg C m^-3].
maom_enzyme_values = 0.01 + 0.49 * (gradient) / (64)

# Make example soil dataset
example_soil_data = Dataset(
    data_vars=dict(
        pH=(["x", "y"], pH_values),
        bulk_density=(["x", "y"], bulk_density_values),
        clay_fraction=(["x", "y"], clay_fraction_values),
        soil_c_pool_lmwc=(["x", "y"], lmwc_values),
        soil_c_pool_maom=(["x", "y"], maom_values),
        soil_c_pool_microbe=(["x", "y"], microbial_C_values),
        soil_c_pool_pom=(["x", "y"], pom_values),
        soil_enzyme_pom=(["x", "y"], pom_enzyme_values),
        soil_enzyme_maom=(["x", "y"], maom_enzyme_values),
    ),
    coords=dict(
        x=(["x"], cell_displacements),
        y=(["y"], cell_displacements),
    ),
    attrs=dict(description="Soil data for dummy Virtual Ecosystem model."),
)

# Save the example soil data file as netcdf
example_soil_data.to_netcdf("../data/example_soil_data.nc")
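Every variable in the script follows the same pattern: a lower bound plus a span multiplied by the gradient and divided by 64, the largest gradient value. A small helper makes that pattern explicit; `scale_to_range` is a hypothetical name used only in this sketch.

```python
import numpy as np

cell_displacements = np.arange(0, 721, 90)
gradient = np.outer(cell_displacements / 90, cell_displacements / 90)


def scale_to_range(gradient: np.ndarray, low: float, high: float) -> np.ndarray:
    """Map the gradient linearly from [0, gradient.max()] onto [low, high]."""
    return low + (high - low) * gradient / gradient.max()


pH_values = scale_to_range(gradient, 3.5, 4.5)
bulk_density_values = scale_to_range(gradient, 1200.0, 1800.0)
print(gradient.max(), pH_values.min(), pH_values.max())
```

Stating the bounds directly avoids having to keep the comment ("3.5-4.5") and the coefficients ("3.5 + 1.00 * ...") in sync by hand.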

Litter data

The example_litter_data.nc file provides:

Variable	Name	Unit	Dims
above ground metabolic litter pools	`litter_pool_above_metabolic`	kg C \(\textrm{m}^{-2}\)	XY
above ground structural litter pools	`litter_pool_above_structural`	kg C \(\textrm{m}^{-2}\)	XY
woody litter pools	`litter_pool_woody`	kg C \(\textrm{m}^{-2}\)	XY
below ground metabolic litter pools	`litter_pool_below_metabolic`	kg C \(\textrm{m}^{-2}\)	XY
below ground structural litter pools	`litter_pool_below_structural`	kg C \(\textrm{m}^{-2}\)	XY
lignin proportion of above ground structural litter	`lignin_above_structural`	unitless	XY
lignin proportion of woody litter	`lignin_woody`	unitless	XY
lignin proportion of below ground structural litter	`lignin_below_structural`	unitless	XY

The generation script creates a set of plausible values for which the litter_model absolutely has to function sensibly. Descriptions of the litter pools can be found here.

litter_example_data.py

"""Necessary litter data for `ve_run` example.

This script generates the data required to run the litter component in the example
dataset. It is important to note that none of this data is real data. Instead, the code
below creates some typical values for the required input data and generates a simple
spatial pattern. Descriptions of the relevant litter pools can be found here:
/virtual_ecosystem/docs/source/virtual_ecosystem/soil/soil_details.md.
"""

import numpy as np
from xarray import Dataset

from virtual_ecosystem.example_data.generation_scripts.common import cell_displacements

# Calculate a gradient
gradient = np.multiply.outer(cell_displacements / 90, cell_displacements / 90)

# Generate a range of plausible values (0.05-0.5) for the above ground metabolic litter
# pools [kg C m^-2].
above_metabolic_values = 0.05 + 0.45 * (gradient) / (64)

# Generate a range of plausible values (0.05-0.5) for the above ground structural litter
# pools [kg C m^-2].
above_structural_values = 0.05 + 0.45 * (gradient) / (64)

# Generate range of plausible values (4.75-12.0) for the woody litter pools [kg C m^-2].
woody_values = 4.75 + 7.25 * (gradient) / (64)

# Generate a range of plausible values (0.03-0.08) for the below ground metabolic litter
# pools [kg C m^-2].
below_metabolic_values = 0.03 + 0.05 * (gradient) / (64)

# Generate range of plausible values (0.05-0.125) for the below ground structural litter
# pools [kg C m^-2].
below_structural_values = 0.05 + 0.075 * (gradient) / (64)

# Generate a range of plausible values (0.01-0.9) for lignin proportions of the pools.
lignin_values = 0.01 + 0.89 * (gradient) / (64)

# Make example litter dataset
example_litter_data = Dataset(
    data_vars=dict(
        litter_pool_above_metabolic=(["x", "y"], above_metabolic_values),
        litter_pool_above_structural=(["x", "y"], above_structural_values),
        litter_pool_woody=(["x", "y"], woody_values),
        litter_pool_below_metabolic=(["x", "y"], below_metabolic_values),
        litter_pool_below_structural=(["x", "y"], below_structural_values),
        lignin_above_structural=(["x", "y"], lignin_values),
        lignin_woody=(["x", "y"], lignin_values),
        lignin_below_structural=(["x", "y"], lignin_values),
    ),
    coords=dict(
        x=(["x"], cell_displacements),
        y=(["y"], cell_displacements),
    ),
    attrs=dict(description="Litter data for example Virtual Ecosystem model."),
)

# Save the dummy litter data file as netcdf
example_litter_data.to_netcdf("../data/example_litter_data.nc")
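The litter script builds its gradient with `np.multiply.outer`, whereas the soil script uses `np.outer`. For one-dimensional inputs the two calls give the same array, as the short check below shows; it also shows how the resulting spatial pattern is distributed across the grid.

```python
import numpy as np

steps = np.arange(0, 721, 90) / 90  # 0, 1, ..., 8

soil_style = np.outer(steps, steps)  # as in soil_example_data.py
litter_style = np.multiply.outer(steps, steps)  # as in litter_example_data.py

print(np.array_equal(soil_style, litter_style))

# Row 0 and column 0 are all zero, so the minimum of every variable is shared
# by 17 of the 81 cells, while the maximum occurs only in cell (8, 8).
print(int((soil_style == 0).sum()), soil_style[8, 8])
```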

Plant data

The example_plant_data.nc file provides the following variables. Note that the plant data introduces a new dimension for the plant cohorts (C). In this example data, a single cohort of each of the two configured plant functional types is added to each of the 81 grid cells, giving 162 entries along the cohort axis. The generation script also stores the number of individuals (plant_cohorts_n) and the cell id (plant_cohorts_cell_id) of each cohort along the same axis.

Variable	Name	Unit	Dims
Cohort plant functional type	`plant_cohorts_pft`	string	C
Cohort diameter at breast height	`plant_cohorts_dbh`	m	C
Photosynthetic photon flux density	`photosynthetic_photon_flux_density`	µmol \(\textrm{m}^{-2}\) \(\textrm{s}^{-1}\)	XYT

plant_example_data.py

"""Script to generate example data to initialise the plants model.

This script exports a NetCDF file containing a simple plant community setup for the 9 by
9 example grid. Each cell contains a single cohort of each of two different plant
functional types.

"""

import numpy as np
from xarray import DataArray, Dataset

from virtual_ecosystem.example_data.generation_scripts.common import (
    cell_id,
    n_cells,
    n_dates,
    time,
    time_index,
)

data = Dataset()

# Plant cohort dimensions
n_cohorts = n_cells * 2
cohort_index = np.arange(n_cohorts)


# Add cohort configurations
data["plant_cohorts_n"] = DataArray(
    np.array([5, 10] * n_cells), coords={"cohort_index": cohort_index}
)
data["plant_cohorts_pft"] = DataArray(
    np.array(["broadleaf", "shrub"] * n_cells), coords={"cohort_index": cohort_index}
)
data["plant_cohorts_cell_id"] = DataArray(
    np.repeat(cell_id, 2), coords={"cohort_index": cohort_index}
)
data["plant_cohorts_dbh"] = DataArray(
    np.array([0.1, 0.05] * n_cells), coords={"cohort_index": cohort_index}
)

# Spatio-temporal data
data["photosynthetic_photon_flux_density"] = DataArray(
    data=np.full((n_cells, n_dates), fill_value=1000),
    coords={"cell_id": cell_id, "time_index": time_index},
)


data["time"] = DataArray(time, coords={"time_index": time_index})

data.to_netcdf("../data/example_plant_data.nc", format="NETCDF3_64BIT")
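The cohort arrays in the script are aligned by position: entry i of each array describes the same cohort. The sketch below rebuilds three of them and uses the cell ids to total the individuals in each cell; the variable names are shortened for illustration.

```python
import numpy as np

n_cells = 81
cell_id = np.arange(n_cells)

cohort_cell_id = np.repeat(cell_id, 2)  # 0, 0, 1, 1, 2, 2, ...
cohort_pft = np.array(["broadleaf", "shrub"] * n_cells)  # alternates per cohort
cohort_n = np.array([5, 10] * n_cells)

# Individuals per cell, summed over the two cohorts in that cell
individuals_per_cell = np.bincount(cohort_cell_id, weights=cohort_n)

print(len(cohort_cell_id), cohort_cell_id[:4], cohort_pft[:4])
print(individuals_per_cell[:3])
```

Each cell therefore holds one broadleaf cohort of 5 individuals and one shrub cohort of 10, giving 15 individuals per cell and 162 cohorts in total.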