Region processing using model mappings

The nomenclature package supports automated region aggregation as part of a scenario processing workflow. The instructions for region aggregation are provided as a model mapping.

Region processing supports multiple aggregation methods, including summation and (weighted) average across regions. The method can be specified for each variable via the codelist; see Attributes for region aggregation.
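To make the two methods named above concrete, here is a minimal sketch in plain Python. It does not use the nomenclature API; the function names, region values, and weights are hypothetical and serve only to show the arithmetic of a sum versus a weighted average across regions.

```python
def aggregate_sum(values):
    """Sum a variable over regions. `values` maps region name -> value."""
    return sum(values.values())


def aggregate_weighted_average(values, weights):
    """Average a variable over regions, weighted by another quantity.

    `weights` maps region name -> weight (hypothetical weighting data).
    """
    total_weight = sum(weights[region] for region in values)
    weighted_sum = sum(values[region] * weights[region] for region in values)
    return weighted_sum / total_weight


# Hypothetical data for two regions
emissions = {"region_a": 10.0, "region_b": 30.0}
price = {"region_a": 2.0, "region_b": 4.0}
weights = {"region_a": 1.0, "region_b": 3.0}

print(aggregate_sum(emissions))                    # 40.0
print(aggregate_weighted_average(price, weights))  # 3.5
```

An extensive quantity such as emissions is typically summed, while an intensive quantity such as a price needs a weighted average so that larger regions contribute proportionally more.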

Model mapping format specification

This example illustrates a model mapping:

model: Model A v1.0
native_regions:
  - region_a: alternative_name_a
  - region_b
common_regions:
  - common_region_1:
    - region_a
    - region_b
  - common_region_2:
    - ...
exclude_regions:
  - region_c
  - ...

The properties model and (at least) one of native_regions and common_regions are required in a valid model mapping. See The region codelist for more information.

  • model (str or list of str): the model name(s) for which the mapping applies.

  • native_regions (list): a list of model native regions that selects which regions to keep.

    • In the above example region_a is to be renamed to alternative_name_a. This is done by defining a key-value pair of model_native_name: new_name.

    • region_b is selected but the name is not changed.

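    • Assuming the model also defines a third region, region_c: because it is not listed under native_regions, it is dropped from the data. In the example above it is named under exclude_regions, so its presence does not raise an error.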

  • common_regions (list): list of common regions which will be computed as aggregates. They are defined as list entries which themselves have a list of constituent regions. These constituent regions must be model native regions.

    The names of the constituent regions must refer to the original model native region names, i.e., region_a and region_b, not alternative_name_a in the example shown above.

  • exclude_regions (list of str, optional): If the input data for region processing contains regions that are mentioned neither in native_regions nor in common_regions (as the name of a common region or as a constituent region), an error is raised. This is a safeguard against silently dropping regions that are not named in native_regions or common_regions.

    If regions are to be excluded, they can be explicitly named in the exclude_regions section which causes their presence to no longer raise an error.
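The rules above can be sketched in plain Python. This is an illustration only, not nomenclature's implementation: the mapping is written as the dictionary a YAML parser would typically produce for the example (with the elided entries left out), and the helper functions rename_map, mentioned_regions, and check_regions are hypothetical names.

```python
# The example mapping as a plain Python structure (illustrative only).
mapping = {
    "model": "Model A v1.0",
    "native_regions": [{"region_a": "alternative_name_a"}, "region_b"],
    "common_regions": [{"common_region_1": ["region_a", "region_b"]}],
    "exclude_regions": ["region_c"],
}


def rename_map(mapping):
    """Return {model_native_name: output_name} for all native regions."""
    renames = {}
    for entry in mapping.get("native_regions", []):
        if isinstance(entry, dict):
            renames.update(entry)    # key-value pair: rename the region
        else:
            renames[entry] = entry   # plain entry: keep the name
    return renames


def mentioned_regions(mapping):
    """Collect every region name that the mapping refers to."""
    mentioned = set(rename_map(mapping))
    for entry in mapping.get("common_regions", []):
        for common_name, constituents in entry.items():
            mentioned.add(common_name)
            mentioned.update(constituents)
    return mentioned


def check_regions(data_regions, mapping):
    """Raise if the data holds regions that are neither mentioned nor excluded."""
    allowed = mentioned_regions(mapping) | set(mapping.get("exclude_regions", []))
    unexpected = sorted(set(data_regions) - allowed)
    if unexpected:
        raise ValueError(f"Regions not covered by the mapping: {unexpected}")


# region_c is in the data but explicitly excluded, so no error is raised.
check_regions(["region_a", "region_b", "region_c"], mapping)
print(rename_map(mapping))
# {'region_a': 'alternative_name_a', 'region_b': 'region_b'}
```

Note that the constituents of common_region_1 are looked up under their original model native names, which is why the rename of region_a does not affect the aggregation.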

Region aggregation

In order to illustrate how region aggregation is performed, consider the following model mapping:

model: model_a
common_regions:
  - common_region_1:
    - region_a
    - region_b

If the data provided for region aggregation contains results for common_region_1, they are compared and combined according to the following logic:

  1. If a variable is not reported for common_region_1, it is calculated through region aggregation of regions region_a and region_b.

  2. If a variable is reported only for common_region_1, it is used directly.

  3. If a variable is reported for common_region_1 as well as for region_a and region_b, the provided results take precedence over the aggregated ones. Additionally, the aggregation is computed and compared to the provided results. If there are discrepancies, a warning is written to the logs.

    Note

    Please note that in case of differences no error is raised. Therefore it is necessary to check the logs to find out if there were any differences. This is intentional since some differences might be expected.
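The three cases can be sketched for a single variable as follows. This is a simplified illustration under stated assumptions, not the package's code: the function process_variable is hypothetical, summation is assumed as the aggregation method, aggregation is only attempted when all constituent regions are reported, and math.isclose stands in for whatever tolerance check the package applies.

```python
import logging
import math

logger = logging.getLogger("region_aggregation_sketch")


def process_variable(values, common_region, constituents, rtol=0.01):
    """Return the value to use for `common_region` for one variable.

    `values` maps region name -> reported value.
    """
    provided = values.get(common_region)
    if all(region in values for region in constituents):
        aggregated = sum(values[region] for region in constituents)
    else:
        aggregated = None

    if provided is None:
        return aggregated  # case 1: not reported, use the aggregate
    if aggregated is None:
        return provided    # case 2: only reported at the common-region level
    # case 3: both available, provided value wins, discrepancies are logged
    if not math.isclose(provided, aggregated, rel_tol=rtol):
        logger.warning(
            "Difference for %s: provided=%s, aggregated=%s",
            common_region, provided, aggregated,
        )
    return provided


regions = ["region_a", "region_b"]
print(process_variable({"region_a": 1.0, "region_b": 2.0}, "common_region_1", regions))  # 3.0
print(process_variable({"common_region_1": 5.0}, "common_region_1", regions))            # 5.0
print(process_variable(
    {"common_region_1": 5.0, "region_a": 1.0, "region_b": 2.0}, "common_region_1", regions
))  # 5.0, and a warning is logged
```

In the last call the provided value 5.0 is returned even though the aggregate is 3.0; the mismatch only shows up in the log, which mirrors the note above.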

Computing differences between original and aggregated data

In order to get the differences between the original data (e.g., results reported by the model) and the data aggregated according to the region mapping, perform the following steps:

  1. Make sure you have pyam-iamc>=1.7.0 and nomenclature-iamc>=0.10.0 installed.

  2. Clone the workflow directory of your project.

  3. Navigate to the workflow directory.

  4. Using a Jupyter notebook or Python script, run the following:

from pyam import IamDataFrame
from nomenclature import DataStructureDefinition, RegionProcessor

data = IamDataFrame("/path/to/your/input/data.xlsx")

dsd = DataStructureDefinition("definitions")
processor = RegionProcessor.from_directory("mappings", dsd)

# get the differences as a pandas dataframe
# the value for the relative tolerances can be adjusted, defaults to 0.01
processed_data, differences = processor.check_region_aggregation(data, rtol_difference=0.01)
# save the result of the region processing
processed_data.to_excel("results.xlsx")
# and the differences
differences.to_excel("differences.xlsx")

Please refer to RegionProcessor.check_region_aggregation() for details.

Alternatively, you can use the nomenclature CLI:

$ nomenclature check-region-aggregation /path/to/your/input/data.xlsx \
    -w workflow_directory --processed_data results.xlsx --differences differences.xlsx

For CLI details, please refer to Command line interface.