RequiredDataValidator
Required data validation checks if certain models, variables, regions and/or periods of time are covered in the timeseries data.
A configuration file specifies the model(s) and the dimensions expected
in the dataset: variable, region and/or year.
Instead of variable, it is possible to declare measurands,
each of which jointly specifies a variable and its unit.
description: Required variables for running MAGICC
model: model_a
required_data:
  - measurand:
      Emissions|CO2:
        unit: Mt CO2/yr
    region: World
    year: [2020, 2030, 2040, 2050]
In the example above, for model_a, the dataset must include datapoints of the variable Emissions|CO2 (measured in Mt CO2/yr), in the region World, for the years 2020, 2030, 2040 and 2050.
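To make the coverage rule concrete, here is a minimal standard-library sketch of the check for a single model. It is an illustration only, not the validator's implementation: the `requirement` dictionary layout and the `missing_combinations` helper are hypothetical names chosen for this example, and datapoints are reduced to plain tuples rather than a pyam.IamDataFrame.

```python
from itertools import product

# Requirement from the example above, written as a plain dict.
# This layout is an assumption for illustration, not the library's data model.
requirement = {
    # A measurand couples a variable with its unit.
    "measurand": [("Emissions|CO2", "Mt CO2/yr")],
    "region": ["World"],
    "year": [2020, 2030, 2040, 2050],
}


def missing_combinations(datapoints, requirement):
    """Return required (variable, unit, region, year) tuples absent from datapoints."""
    present = set(datapoints)
    required = (
        (variable, unit, region, year)
        for (variable, unit), region, year in product(
            requirement["measurand"], requirement["region"], requirement["year"]
        )
    )
    return [combo for combo in required if combo not in present]


# Toy data for model_a: the year 2040 is not reported.
datapoints = [
    ("Emissions|CO2", "Mt CO2/yr", "World", 2020),
    ("Emissions|CO2", "Mt CO2/yr", "World", 2030),
    ("Emissions|CO2", "Mt CO2/yr", "World", 2050),
]

missing = missing_combinations(datapoints, requirement)
print(missing)  # [('Emissions|CO2', 'Mt CO2/yr', 'World', 2040)]
```

Every combination of measurand, region and year must be present, so a single unreported year is enough for the requirement to count as unfulfilled.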
Standard usage
from nomenclature import RequiredDataValidator
# ...setting directory/file paths and loading dataset
RequiredDataValidator.from_file(yaml_file_containing_required_data).apply(df)
- class nomenclature.RequiredDataValidator(*, input_data=None, input_meta=None, output_data=None, output_meta=None, fail_ok=False, description=None, model=None, required_data, file)
Processor for validating required dimensions in IAMC datapoints
Methods
apply(df): Validates data in IAMC format according to required models and dimensions.
check_required_data_per_model(df, model): Check which required data is missing for a single model.
from_file(file): Create a RequiredDataValidator from a YAML file.
validate_with_definition(dsd): Validate the required data specification against a DataStructureDefinition.
- apply(df)
Validates data in IAMC format according to required models and dimensions.
- Parameters:
- df : pyam.IamDataFrame
Data in IAMC format to be validated
- Returns:
- pyam.IamDataFrame
- Raises:
- ValueError
If any required dimension is not found in the data
- check_required_data_per_model(df, model)
Check which required data is missing for a single model.
- Parameters:
- df : pyam.IamDataFrame
Data in IAMC format to check.
- model : str
Model name to filter the data for.
- Returns:
- list of pandas.DataFrame
List of DataFrames describing missing data, one per unfulfilled requirement. Empty if all requirements are satisfied.
- classmethod from_file(file)
Create a RequiredDataValidator from a YAML file.
- Parameters:
- file : pathlib.Path or str
Path to the YAML file containing the required data specification.
- Returns:
- RequiredDataValidator
- validate_with_definition(dsd)
Validate the required data specification against a DataStructureDefinition.
Checks that all variables, regions, and units referenced in the required data exist in the provided definition.
- Parameters:
- dsd : DataStructureDefinition
Data structure definition to validate against.
- Raises:
- ExceptionGroup
If any required data item references unknown variables, regions, or units.