Importing atmospheric data#

This page explains how to use SignalCreation.Lidar.Lidar.import_atmospheric_component() to ingest atmospheric fields (pressure, temperature, density) and wavelength-dependent molecular properties into the SignalCreation.Lidar.Lidar object.

This step complements the raw LIDAR signals by providing the thermodynamic state and molecular optical properties required for physical retrievals.

Overview #

############################
# 2. Import Atmospheric    #
############################
lidar.import_atmospheric_component()

This command reads one or more atmospheric models (e.g., radiosonde, NCEP, MAP85), stitches them vertically, and adds the resulting variables to the internal attribute _lidar of the Lidar object.

The attribute ``_lidar`` is an xarray.Dataset that already contains the coordinates ``altitude`` and ``time`` from raw LIDAR import. This method enriches it with:

- temperature(z) (K), pressure(z) (Pa), number density(z) (m⁻³)
- per-wavelength molecular diffusion coefficients (that is, molecular scattering coefficients) and cross-sections
- a model flag profile indicating which model contributes at each altitude
- an estimate of tropopause_altitude

Method signature #

def import_atmospheric_component(
    self,
    main_parameters_tag: str = "Read/Atmosphere_file",
) -> None:
    """Read temperature, pressure and density of the components of the atmosphere."""

Parameter #

| Parameter | Description |
| --- | --- |
| `main_parameters_tag` | Base XML tag that defines the list of models to read. Default: `"Read/Atmosphere_file"`. |

XML configuration #

Atmospheric sources are defined under the ``Atmosphere_file`` section. You can declare multiple models (modele_0, modele_1, …) which are vertically stitched according to their order and altitude ranges.

Example:

<Dial>
  <Read>
    <Atmosphere_file>
      <modele_0>
        <filenames type="relativepath">
          <directory check="directory" type="relativepath">
            Configuration_files/Main_files/Atmosphere
          </directory>
          <directory_date_format></directory_date_format>
          <data_name_prefixe delimiter=",">ht, te</data_name_prefixe>
          <data_name_suffixe>.nmc</data_name_suffixe>
          <data_name_date_format>%y%m%d</data_name_date_format>
          <time_shift type="float">0</time_shift>
        </filenames>
        <format>ncep</format>
        <order type="int">1</order>
        <fill_value>nan</fill_value>
        <station_name>OHP</station_name>
      </modele_0>

      <modele_1>
        <filenames type="relativepath">
          Configuration_files/Main_files/Atmosphere/MAP85_PRE.DAT,
          Configuration_files/Main_files/Atmosphere/MAP85_TEMn.DAT
        </filenames>
        <format>MAP85</format>
        <order type="int">2</order>
        <fill_value>nan</fill_value>
      </modele_1>
    </Atmosphere_file>
  </Read>
</Dial>
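To make the role of `order` concrete, here is a minimal sketch that parses a trimmed copy of the configuration above with the standard library and lists the declared models in stitching order. The helper name `list_models` is hypothetical and is not part of SignalCreation; the package's own XML reader may behave differently.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example configuration (only the fields used here).
CONFIG = """
<Dial>
  <Read>
    <Atmosphere_file>
      <modele_0>
        <format>ncep</format>
        <order type="int">1</order>
      </modele_0>
      <modele_1>
        <format>MAP85</format>
        <order type="int">2</order>
      </modele_1>
    </Atmosphere_file>
  </Read>
</Dial>
"""

def list_models(xml_text, main_parameters_tag="Read/Atmosphere_file"):
    """Hypothetical helper: return the declared models sorted by <order>."""
    root = ET.fromstring(xml_text)           # root element is <Dial>
    section = root.find(main_parameters_tag)  # path relative to the root
    if section is None:
        raise KeyError(f"tag not found: {main_parameters_tag}")
    models = []
    for node in section:
        if not node.tag.startswith("modele_"):
            continue  # ignore anything that is not a model declaration
        models.append({
            "name": node.tag,
            "format": node.findtext("format", default="").strip(),
            "order": int(node.findtext("order", default="0")),
        })
    return sorted(models, key=lambda m: m["order"])

models = list_models(CONFIG)
for model in models:
    print(model["order"], model["name"], model["format"])
```

Running a check like this before the import is a quick way to confirm that the tag path and the `order` values are what you expect.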

Fields and options #

- `filenames`: Path(s) to the model file(s). Supports the same pattern logic as LIDAR files (base directory, date formatting, prefixes/suffixes, custom delimiter).
- `format`: Model format identifier (e.g., radiosonde, ncep, MAP85, arletty). Used internally to select the proper reader.
- `order`: Priority/order for vertical stitching (lower numbers are read first). Use it to define which model fills which altitude range.
- `fill_value`: Strategy or numeric value to fill gaps (e.g., nan or custom policy like min_max, depending on your implementation).
- `station_name`: Optional station ID/name (used by certain formats).
- `time_shift`: Offset applied to the selection datetime when picking model files (e.g., -1 to use the previous day’s radiosonde).

Tip

The selection datetime is derived from the mean of the LIDAR time coordinate; latitude is taken from the LIDAR dataset when available. Ensure these two fields are present after raw import.
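The following sketch illustrates how a selection datetime and dated file names could be derived from the fields above. It rests on two assumptions that this page does not confirm: `time_shift` is expressed in days (inferred from the "-1 for the previous day" example), and file names are built as prefix + formatted date + suffix. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def mean_datetime(times):
    """Arithmetic mean of a list of timezone-aware datetimes."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    mean_seconds = sum((t - epoch).total_seconds() for t in times) / len(times)
    return epoch + timedelta(seconds=mean_seconds)

def candidate_filenames(times, prefixes, suffix, date_format, time_shift_days=0.0):
    """Assumed naming scheme: <prefix><date><suffix>, one name per prefix."""
    selection = mean_datetime(times) + timedelta(days=time_shift_days)
    stamp = selection.strftime(date_format)
    return [f"{prefix.strip()}{stamp}{suffix}" for prefix in prefixes]

# Two LIDAR time stamps on the night of 2024-03-28 (illustrative values).
lidar_times = [
    datetime(2024, 3, 28, 20, 0, tzinfo=timezone.utc),
    datetime(2024, 3, 28, 22, 0, tzinfo=timezone.utc),
]
names = candidate_filenames(
    lidar_times,
    prefixes="ht, te".split(","),  # same delimiter as data_name_prefixe
    suffix=".nmc",
    date_format="%y%m%d",
    time_shift_days=-1,            # use the previous day's files
)
print(names)  # ['ht240327.nmc', 'te240327.nmc']
```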

What the method does (internals)#

- Parses models declared under main_parameters_tag (modele_0, modele_1, …).
- Reads each model using its reader (path, format, order, station, fill_value, time_shift).
- Builds vertical profiles of temperature, pressure, and density on the LIDAR altitude grid.
- Computes per-wavelength molecular diffusion coefficients and cross-sections for each wavelength present in the LIDAR channels (from attributes such as received and laser wavelengths).
- Adds a Flag_models profile and a tropopause_altitude estimate to the dataset.
- Stores everything in the internal xarray.Dataset (_lidar), preserving units and metadata.
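The stitching step can be pictured with the sketch below. It is not the package's implementation: it assumes that each model is linearly interpolated onto the LIDAR grid, that the lowest `order` covering an altitude wins, and that the flag stores that `order` value (with -1 where no model applies). All of these choices are illustrative.

```python
import math
from bisect import bisect_right

def interpolate(z, levels, values):
    """Linear interpolation; NaN outside the model's altitude range."""
    if z < levels[0] or z > levels[-1]:
        return math.nan
    i = min(bisect_right(levels, z), len(levels) - 1)
    z0, z1 = levels[i - 1], levels[i]
    weight = (z - z0) / (z1 - z0)
    return values[i - 1] + weight * (values[i] - values[i - 1])

def stitch(grid, models):
    """Return (profile, flags): first model in `order` that covers each level."""
    ordered = sorted(models, key=lambda m: m["order"])
    profile, flags = [], []
    for z in grid:
        value, flag = math.nan, -1  # -1: no model covers this altitude
        for model in ordered:
            candidate = interpolate(z, model["altitude"], model["temperature"])
            if not math.isnan(candidate):
                value, flag = candidate, model["order"]
                break
        profile.append(value)
        flags.append(flag)
    return profile, flags

# Two made-up models with overlapping altitude ranges (km, K).
low = {"order": 1, "altitude": [0.0, 10.0, 30.0], "temperature": [288.0, 223.0, 227.0]}
high = {"order": 2, "altitude": [20.0, 50.0, 90.0], "temperature": [217.0, 271.0, 187.0]}

grid = [0.0, 5.0, 25.0, 40.0, 100.0]
temperature, flag_models = stitch(grid, [high, low])
print(flag_models)  # [1, 1, 1, 2, -1]
```

In the overlap (25 km here) the order-1 model is used, which is why inspecting the flag profile is a direct way to see where the transition occurs.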

Outputs in the dataset #

Typical variables added to _lidar:

- temperature (K)
- pressure (Pa)
- density (m⁻³)
- molecular_diffusion (per wavelength; carries a wavelength attribute)
- atmospheric_cross_section_<λ> (per wavelength; carries a wavelength attribute)
- Flag_models (categorical/int code of contributing model per altitude)
- tropopause_altitude (km) with _FillValue = -999 if undefined

Note

If needed, dimension names are normalized to the LIDAR dataset convention (e.g., renaming Altitude → altitude, Time → time).
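As a sanity check on the units listed above, pressure, temperature and number density are linked by the ideal-gas law, n = P / (k_B T). The sketch below applies it to a few illustrative levels; whether import_atmospheric_component() derives density this way or reads it from the model files is not stated on this page, so treat this only as a consistency check.

```python
BOLTZMANN = 1.380649e-23  # J/K (exact SI value)

def number_density(pressure_pa, temperature_k):
    """Ideal-gas number density n = P / (k_B * T), in molecules per m^3."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return pressure_pa / (BOLTZMANN * temperature_k)

# Illustrative values roughly representative of 0, 10 and 20 km.
pressure = [101325.0, 26500.0, 5529.0]   # Pa
temperature = [288.15, 223.25, 216.65]   # K
density = [number_density(p, t) for p, t in zip(pressure, temperature)]

for n in density:
    print(f"{n:.3e} m^-3")
```

A surface value near 2.5e25 m⁻³ is the expected order of magnitude; values far from it usually indicate a unit mismatch (hPa instead of Pa, or °C instead of K).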

Examples #

All from XML (recommended):

from SignalCreation.Lidar import Lidar

lidar = Lidar("Configuration_files/Parameters/example_station.xml", "2024-03-28")
lidar.import_lidar_data()             # defines altitude/time
lidar.import_atmospheric_component()  # adds T, P, density, molecular terms

print(lidar._lidar[["temperature", "pressure", "density"]])

Custom XML tag (rare, advanced setups):

lidar.import_atmospheric_component(main_parameters_tag="Read/Atmosphere_custom")

Quality and consistency notes #

Warning

Ensure altitude units and reference levels are consistent across models before stitching. Inconsistent vertical references can create artificial steps.

Tip

Check the Flag_models profile to verify the stitching order is applied as intended, and visually compare temperature/pressure transitions near model boundaries.

Note

Wavelength-dependent terms are computed for all wavelengths found in the LIDAR raw variables (laser and received). Make sure channel metadata carry these attributes.

Computing the diffusion coefficient (in development)#

The computation of the atmospheric molecular diffusion coefficient is being progressively implemented in the codebase.

Current implementation #

The diffusion coefficient over altitude is computed through a cascade of intermediate methods:

get_diffusion_coefficient_over_altitude()
    ↳ get_absorption_coefficient_over_altitude()
        ↳ get_cross_section_over_altitude()
            ↳ get_cross_section()

Each method calls the one nested beneath it: get_cross_section() evaluates an empirical or analytical model of molecular scattering, and each level above combines the result with the available atmospheric parameters.

get_cross_section() computes the molecular cross-section using one of several classical formulations:
- Nicolet (1984) — Space Science Reviews, 37, 295–316
- Bucholtz (1995) — Applied Optics, 34(15):2765–2773
- Eberhard (2010) — Applied Optics, 49(6):1116–1130
The method parameter allows you to select the computation approach:
```
atm.get_cross_section(wavelength=355, method="Bucholtz")
atm.get_cross_section(wavelength=532, method="Eberhard")
```
Wavelengths can be given as floats (in nanometers) or pint.Quantity objects with explicit units.
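For orientation, the sketch below evaluates a textbook Rayleigh cross-section of the kind these references discuss. It is not the code behind get_cross_section(). It assumes the two-term dispersion formula for standard air attributed to Peck and Reeder (1972), a standard-air number density of 2.54743e25 m⁻³, and a constant depolarization factor of 0.0284, whereas the published treatments tabulate a wavelength-dependent factor. Check all constants against the papers before relying on them.

```python
import math

N_STANDARD = 2.54743e25  # m^-3, standard air (288.15 K, 1013.25 hPa); assumed

def refractivity(wavelength_nm):
    """(n - 1) for standard air, two-term formula, intended for > 230 nm."""
    inv_um2 = (1.0e3 / wavelength_nm) ** 2  # (1 / wavelength in micrometres)^2
    return 1.0e-8 * (
        5791817.0 / (238.0185 - inv_um2) + 167909.0 / (57.362 - inv_um2)
    )

def rayleigh_cross_section(wavelength_nm, depolarization=0.0284):
    """Total Rayleigh cross-section per molecule, in m^2 (illustrative)."""
    n = 1.0 + refractivity(wavelength_nm)
    wavelength_m = wavelength_nm * 1.0e-9
    lorentz = ((n**2 - 1.0) / (n**2 + 2.0)) ** 2
    # Correction for molecular anisotropy (constant factor: a simplification).
    king = (6.0 + 3.0 * depolarization) / (6.0 - 7.0 * depolarization)
    return 24.0 * math.pi**3 * lorentz / (wavelength_m**4 * N_STANDARD**2) * king

sigma_355 = rayleigh_cross_section(355.0)
sigma_532 = rayleigh_cross_section(532.0)
print(f"355 nm: {sigma_355:.3e} m^2")
print(f"532 nm: {sigma_532:.3e} m^2")
print(f"ratio:  {sigma_355 / sigma_532:.2f}")
```

The ratio between the two wavelengths comes out slightly above the pure λ⁻⁴ scaling because the refractive index of air also increases toward the ultraviolet.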

Note

The current default method used in production is “Bucholtz (1995)”, which offers a good compromise between physical accuracy and computational efficiency. Other methods (Nicolet, Eberhard) are implemented but still under validation.

get_absorption_coefficient_over_altitude() multiplies the computed cross-section by the molecular density profile to obtain the absorption coefficient (m⁻¹) over altitude.

get_diffusion_coefficient_over_altitude() derives the diffusion coefficient from the absorption coefficient. The result is returned as an xarray.DataArray, aligned on the LIDAR altitude grid.
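The product step described above (cross-section times number density) can be sketched as follows. The exponential density profile, the 8 km scale height and the cross-section value are illustrative assumptions, and the function names are hypothetical. The further step from this coefficient to the diffusion coefficient is not specified on this page and is therefore not shown.

```python
import math

def density_profile(altitudes_m, surface_density=2.55e25, scale_height_m=8000.0):
    """Illustrative exponential number-density profile (m^-3), not a real model."""
    return [surface_density * math.exp(-z / scale_height_m) for z in altitudes_m]

def coefficient_over_altitude(cross_section_m2, densities_m3):
    """Per-molecule cross-section (m^2) times number density (m^-3) gives m^-1."""
    return [cross_section_m2 * n for n in densities_m3]

altitudes = [0.0, 8000.0, 16000.0]  # m
sigma_532 = 5.17e-31                # m^2, order of magnitude near 532 nm (assumed)
coefficients = coefficient_over_altitude(sigma_532, density_profile(altitudes))

for z, c in zip(altitudes, coefficients):
    print(f"{z / 1000:5.1f} km  {c:.3e} m^-1")
```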

Example workflow #

from SignalCreation.Atmosphere import Atmosphere

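# Assumes `lidar` was created and populated as in the examples above,
# so that lidar._lidar["altitude"] is available.
atm = Atmosphere()
diffusion = atm.get_diffusion_coefficient_over_altitude(
    altitude=lidar._lidar["altitude"],
    wavelength=355,  # nm
    method="Bucholtz",
)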

print(diffusion)

This returns a profile of diffusion coefficients as a function of altitude for the requested wavelength (355 nm here). Repeat the call for each wavelength used in the LIDAR configuration.

Status and roadmap #

Warning

These methods are currently being deployed and tested within the production pipeline. Their API and numerical behavior may evolve in upcoming releases.

Planned improvements include:

- Caching computed cross-sections to avoid redundant recalculations
- Validation of results across methods (Nicolet, Bucholtz, Eberhard)
- Optional wavelength-dependent refractive index tables
- Integration of uncertainty propagation in diffusion profiles
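As an illustration of the caching item, one possible approach is to memoize on the (wavelength, method) pair with the standard library. This is only a sketch of the idea, not the planned design: the function below is a stand-in with a placeholder λ⁻⁴ scaling, and its name is hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def cached_cross_section(wavelength_nm: float, method: str = "Bucholtz") -> float:
    """Stand-in for an expensive cross-section computation (placeholder formula)."""
    reference_532 = 5.17e-31  # m^2, illustrative value only
    return reference_532 * (532.0 / wavelength_nm) ** 4

first = cached_cross_section(355.0, "Bucholtz")   # computed
second = cached_cross_section(355.0, "Bucholtz")  # served from the cache
info = cached_cross_section.cache_info()
print(info.hits, info.misses)  # 1 1
```

Note that lru_cache requires hashable arguments, so array-like wavelengths would need to be converted to plain floats before the call.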

Once stabilized, these routines will be fully integrated into SignalCreation.Lidar.Lidar.import_atmospheric_component() to automatically populate diffusion coefficients for each LIDAR wavelength.