Variable naming and metadata configuration#
This page describes how variable names and metadata are automatically managed in the DIAL LIDAR Ozone ACTRIS software. All variable names, indices, delimiters, and metadata (e.g., NDACC or AERIS vocabularies) are defined in an external XML file: `SignalCreation/name_variables.xml`.
This mechanism ensures:
- consistent and traceable variable names across all modules,
- standardized metadata for NetCDF export (e.g., NDACC conventions),
- automatic indexing of repeated variables (e.g., multiple channels or filtered datasets),
- consistent naming of uncertainty variables.
Overview#
The XML file name_variables.xml defines:
1. Global rules for variable naming (<general> section)
2. Individual variable definitions and their metadata (<var> section)
Example structure:
<varname>
  <general>
    <nb_digit type="int">1</nb_digit>
    <delimiter>_</delimiter>
    <index_min type="int">0</index_min>
    <index_step type="int">1</index_step>
    <uncertainty_suffix>_Unc</uncertainty_suffix>
  </general>
  <var>
    <altitude>altitude</altitude>
    <pressure>Pressure
      <standard_name>air_pressure</standard_name>
      <long_name>Auxiliary atmospheric pressure</long_name>
      <aeris_vocabulary>Atmospheric pressure measurements</aeris_vocabulary>
      <aeris_uri>https://vocab.aeris-data.fr/parameter/cf3d4167-9218-e92a-6451-36092ee1f502</aeris_uri>
      <ndacc_vocabulary>PRESSURE_INDEPENDENT</ndacc_vocabulary>
      <ndacc_uri>https://avdc.gsfc.nasa.gov/index.php?site=701636862</ndacc_uri>
    </pressure>
    <temperature>Temperature
      <standard_name>air_temperature</standard_name>
      <long_name>Auxiliary atmospheric temperature</long_name>
    </temperature>
    <ozone>Ozone
      <ndacc_vocabulary flag_unc="False">O3.NUMBER.DENSITY_ABSORPTION.DIFFERENTIAL</ndacc_vocabulary>
      <ndacc_vocabulary flag_unc="True">O3.NUMBER.DENSITY_ABSORPTION.DIFFERENTIAL_UNCERTAINTY.ORIGINATOR</ndacc_vocabulary>
    </ozone>
  </var>
</varname>
General configuration#
The <general> block defines the global naming rules that apply to all variables. These parameters control how names are generated, indexed, and extended for uncertainties.
| Field | Description |
|---|---|
| nb_digit | Number of digits used for the index when there are multiple similar variables. |
| delimiter | Character used to separate the base variable name from its index (`_` in the example above). |
| index_min | Minimum index value for numbered variables (typically 0 or 1). |
| index_step | Step increment between consecutive indices (default: 1). |
| uncertainty_suffix | String appended to a variable name to designate its uncertainty version (`_Unc` in the example above). |
These parameters ensure that all variable names are consistent, traceable, and automatically generated according to a predictable structure.
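As an illustration of how these settings can be read, here is a minimal sketch that parses a `<general>` block with the standard library and casts the fields marked `type="int"`. The helper name `read_general` is hypothetical and is not part of SignalCreation.Utils.NameVariables; the actual module may load the file differently.

```python
import xml.etree.ElementTree as ET

# Same <general> block as in the example structure above.
XML_TEXT = """
<varname>
  <general>
    <nb_digit type="int">1</nb_digit>
    <delimiter>_</delimiter>
    <index_min type="int">0</index_min>
    <index_step type="int">1</index_step>
    <uncertainty_suffix>_Unc</uncertainty_suffix>
  </general>
</varname>
"""


def read_general(root):
    """Hypothetical helper: return the <general> settings as a dict."""
    settings = {}
    for elem in root.find("general"):
        text = (elem.text or "").strip()
        # Fields tagged type="int" become integers; the rest stay strings.
        settings[elem.tag] = int(text) if elem.get("type") == "int" else text
    return settings


general = read_general(ET.fromstring(XML_TEXT))
print(general)
```

Keeping the type information in the XML, as the `type="int"` attribute does, means the reader does not need a hard-coded list of numeric fields.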
Variable definitions#
Each <var> entry in the XML file defines an individual variable name and its associated metadata.
- The XML tag name (e.g., <pressure>, <temperature>, <ozone>) corresponds to the internal name used in the Python code.
- The text value of the element (e.g., Pressure, Temperature, Ozone) defines the base name stored in the NetCDF file, or the prefix used when there are multiple indexed variables (e.g., Raw_Lidar_Data_1, Raw_Lidar_Data_2, etc.). The number of digits for the index is determined by the global nb_digit parameter from the <general> section.
- The child elements (sub-tags) define standardized attributes, following the CF conventions and extended vocabularies when applicable:
  - standard_name and long_name → CF-compliant attributes
  - ndacc_vocabulary and ndacc_uri → NDACC-specific metadata
  - aeris_vocabulary and aeris_uri → AERIS vocabulary references
  - gcmd_vocabulary and gcmd_uri → GCMD keyword mappings (optional)
Each vocabulary tag helps ensure interoperability between datasets and compliance with data sharing standards.
Example:
<ozone>Ozone
  <standard_name>mole_concentration_of_ozone_in_air</standard_name>
  <long_name>Ozone number density derived from DIAL</long_name>
  <ndacc_vocabulary>O3.NUMBER.DENSITY_ABSORPTION.DIFFERENTIAL</ndacc_vocabulary>
  <ndacc_uri>https://avdc.gsfc.nasa.gov/index.php?site=701636862</ndacc_uri>
  <aeris_vocabulary>Ozone profile from lidar</aeris_vocabulary>
  <aeris_uri>https://vocab.aeris-data.fr/parameter/12345678-abcd-efgh-ijkl-987654321000</aeris_uri>
</ozone>
This variable definition ensures that:
- The NetCDF variable will be named “Ozone”.
- All standardized metadata will be automatically attached during dataset export.
- If multiple ozone profiles are produced, they will be named sequentially using the rules from the <general> section (e.g., “Ozone_1”, “Ozone_2”, etc. when index_min is 1).
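The mapping from one XML entry to a base name plus a metadata dictionary can be sketched as follows. This is only an illustration built on `xml.etree.ElementTree`; the helper `read_var` is hypothetical, and the XML is a shortened copy of the example above.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the <ozone> example, wrapped in a <var> block.
XML_TEXT = """
<var>
  <ozone>Ozone
    <standard_name>mole_concentration_of_ozone_in_air</standard_name>
    <long_name>Ozone number density derived from DIAL</long_name>
    <ndacc_vocabulary>O3.NUMBER.DENSITY_ABSORPTION.DIFFERENTIAL</ndacc_vocabulary>
  </ozone>
</var>
"""


def read_var(var_block, internal_name):
    """Hypothetical helper: return (base_name, metadata) for one <var> entry."""
    elem = var_block.find(internal_name)
    if elem is None:
        raise KeyError(f"no <{internal_name}> entry in <var>")
    # elem.text holds the text before the first child tag, i.e. the base name.
    base_name = (elem.text or "").strip()
    # Each child tag becomes one attribute of the exported variable.
    metadata = {child.tag: (child.text or "").strip() for child in elem}
    return base_name, metadata


base_name, metadata = read_var(ET.fromstring(XML_TEXT), "ozone")
print(base_name)
print(metadata)
```

Stripping whitespace matters here because the base name and the first child tag sit on different lines, so the raw element text ends with a newline and indentation.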
Functions overview#
The following utility functions are defined in the module
SignalCreation.Utils.NameVariables.
They read and interpret name_variables.xml and apply consistent naming and metadata handling.
get_var_name()#
get_var_name(name, index=None, flag_unc=False, ...)
Builds a variable name from its XML definition, optionally adding an index or uncertainty suffix.
Examples:
get_var_name("raw_lidar_data") # → "Raw_Lidar_Data"
get_var_name("raw_lidar_data", 1) # → "Raw_Lidar_Data_1"
get_var_name("raw_lidar_data", flag_unc=True) # → "Raw_Lidar_Data_Unc"
If an xarray.Dataset is provided as index, the function searches for existing variables
and appends the next available index automatically.
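The naming rules can be sketched in a few lines of plain Python. The functions `build_name` and `next_free_index` below are hypothetical stand-ins, not the real get_var_name(); the keyword defaults mirror the example <general> block, and placing the uncertainty suffix after the index is an assumption, since the behaviour for both options together is not described here.

```python
def build_name(base, index=None, flag_unc=False, *,
               nb_digit=1, delimiter="_", unc_suffix="_Unc"):
    """Hypothetical sketch: assemble a variable name from its parts."""
    name = base
    if index is not None:
        # Zero-pad the index to nb_digit digits, e.g. 1 -> "01" for nb_digit=2.
        name += f"{delimiter}{index:0{nb_digit}d}"
    if flag_unc:
        # Assumption: the suffix comes last when an index is also given.
        name += unc_suffix
    return name


def next_free_index(existing_names, base, *,
                    index_min=0, index_step=1, **naming):
    """Hypothetical sketch: first index whose name is not already used."""
    if index_step <= 0:
        raise ValueError("index_step must be positive")
    index = index_min
    while build_name(base, index, **naming) in existing_names:
        index += index_step
    return index


print(build_name("Raw_Lidar_Data"))
print(build_name("Raw_Lidar_Data", 1))
print(build_name("Raw_Lidar_Data", flag_unc=True))
print(next_free_index({"Temperature_0", "Temperature_1"}, "Temperature"))
```

In this sketch a plain set of names plays the role of the dataset; with an xarray.Dataset the same membership test would run against its variable names.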
get_attrs()#
get_attrs(name, flag_unc=False) -> dict
Returns a dictionary of metadata attributes defined in the XML for a variable.
Includes vocabularies such as standard_name, long_name, or ndacc_vocabulary.
Example:
attrs = get_attrs("ozone")
print(attrs)
# {'ndacc_vocabulary': 'O3.NUMBER.DENSITY_ABSORPTION.DIFFERENTIAL'}
add_lidar_da()#
add_lidar_da(ds_init, name, da, da_unc=None, attrs=None, ...)
Adds a new xarray.DataArray (and optionally its uncertainty) to an existing dataset,
using variable names and metadata defined in the XML configuration.
Key features:
- Automatically generates unique indexed names (Temperature_0, Temperature_1, …)
- Appends _Unc suffix to uncertainty variables
- Copies all XML metadata into the variable attributes
- Stores both magnitude and units if a pint.UnitRegistry is provided
Example:
ds = add_lidar_da(ds, "pressure", da_pressure, ureg=unit_registry)
ds = add_lidar_da(ds, "ozone", da_ozone, da_unc=da_ozone_unc, ureg=unit_registry)
get_lidar_da()#
get_lidar_da(ds, name, index=None, flag_unc=False, ...)
Retrieves a variable from an xarray.Dataset by matching its XML-defined base name,
optional index, and optional uncertainty flag.
Example:
da_temp = get_lidar_da(ds, "temperature")
da_ozone_unc = get_lidar_da(ds, "ozone", flag_unc=True)
Uncertainty variable handling#
The flag flag_unc=True is used in both naming and attribute retrieval functions to
distinguish uncertainty variables.
- A suffix (defined in <uncertainty_suffix>) is appended to the base variable name (e.g., _Unc → Ozone_Unc).
- When retrieving attributes from the XML, entries with flag_unc="True" are selected for the uncertainty variable, while those without (or with flag_unc="False") apply to the main variable.
Example XML:
<ozone>
  <ndacc_vocabulary flag_unc="False">
    O3.NUMBER.DENSITY_ABSORPTION.DIFFERENTIAL
  </ndacc_vocabulary>
  <ndacc_vocabulary flag_unc="True">
    O3.NUMBER.DENSITY_ABSORPTION.DIFFERENTIAL_UNCERTAINTY.ORIGINATOR
  </ndacc_vocabulary>
</ozone>
API usage:
Naming with suffix:
get_var_name("ozone", flag_unc=True) # → "Ozone_Unc" (assuming <uncertainty_suffix>_Unc</uncertainty_suffix>)
Fetching uncertainty attributes:
attrs_unc = get_attrs("ozone", flag_unc=True)
Adding data + uncertainty to the dataset:
ds = add_lidar_da(ds, "ozone", da_ozone, da_unc=da_ozone_unc, ureg=unit_registry)
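The attribute selection rule for flag_unc can be sketched as follows. The function `select_attrs` is a hypothetical illustration, not the real get_attrs(); it applies the rule exactly as stated in this section, so entries without a flag_unc attribute go to the main variable only.

```python
import xml.etree.ElementTree as ET

# Example entry with one untagged attribute and two flagged variants.
XML_TEXT = """
<ozone>Ozone
  <long_name>Ozone number density derived from DIAL</long_name>
  <ndacc_vocabulary flag_unc="False">
    O3.NUMBER.DENSITY_ABSORPTION.DIFFERENTIAL
  </ndacc_vocabulary>
  <ndacc_vocabulary flag_unc="True">
    O3.NUMBER.DENSITY_ABSORPTION.DIFFERENTIAL_UNCERTAINTY.ORIGINATOR
  </ndacc_vocabulary>
</ozone>
"""


def select_attrs(elem, flag_unc=False):
    """Hypothetical sketch: pick the child entries matching flag_unc."""
    attrs = {}
    for child in elem:
        # A missing flag_unc attribute counts as "False" (main variable).
        child_is_unc = child.get("flag_unc", "False") == "True"
        if child_is_unc == flag_unc:
            # Strip the newlines and indentation around multi-line values.
            attrs[child.tag] = (child.text or "").strip()
    return attrs


ozone = ET.fromstring(XML_TEXT)
attrs_main = select_attrs(ozone)
attrs_unc = select_attrs(ozone, flag_unc=True)
print(attrs_main)
print(attrs_unc)
```

Under this reading the uncertainty variable receives only the entries explicitly marked flag_unc="True"; whether the real implementation also inherits untagged entries such as long_name is not stated here.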