Importing LIDAR raw data

This section explains how to use the method SignalCreation.Lidar.Lidar.import_lidar_data() to load raw LIDAR signals into the main SignalCreation.Lidar.Lidar object.

This step is the foundation of all subsequent processing: it defines the altitude and time coordinates of the dataset, as well as the channel-dependent raw photon-count signals.

Overview

#############################
# 1. Reading Raw Lidar Data #
#############################
lidar.import_lidar_data()

This command reads one or more raw LIDAR files, applies optional preprocessing (e.g. channel selection, unit conversion, resampling), and populates the internal attribute _lidar of the SignalCreation.Lidar.Lidar object.

The attribute _lidar is an xarray.Dataset that stores all imported raw data with their associated coordinates and metadata. It typically contains:

  • Coordinates: altitude (vertical grid), time (acquisition timestamps)

  • Data variables: one variable per channel (e.g. Raw_Lidar_Data_102)

  • Attributes: physical units, acquisition start/finish times, laser shots, etc.

This dataset is the basis of all subsequent processing steps, from background subtraction to ozone retrieval. It follows NetCDF-style conventions, so it can be used with standard scientific tools such as xarray, netCDF4, or Panoply.

All import options can be set in the XML parameter file or overridden by arguments passed to import_lidar_data().

Method signature

def import_lidar_data(
    self,
    lidar_files: str | None = None,
    channel_to_remove: int | list | None = None,
    lidar_format: str | None = None,
    force_photocounting_units: str | None = None,
    xml_tag: str = "Read/Lidar_files/",
) -> None:
    """Import raw LIDAR data into the Lidar object."""

Parameters

Parameter | Description
lidar_files | Path(s) to raw file(s). If None, uses the XML tag Read/Lidar_files/filenames.
channel_to_remove | Channel(s) (by Signal_type) to ignore. Optional.
lidar_format | Input file format (e.g., Teslas, Licel).
force_photocounting_units | Force input units: "MHz" or "photocounting".
xml_tag | XML base path (default: Read/Lidar_files/).

Example usage

Simple example where all configuration is defined in the XML file:

from SignalCreation.Lidar import Lidar

lidar = Lidar("Configuration_files/Parameters/example_station.xml", "2024-03-28")
lidar.import_lidar_data()

print(lidar._lidar)  # shows altitude, time, and raw photon counts per channel

If you want to override parameters manually (e.g., specify files directly):

lidar.import_lidar_data(
    lidar_files="data/raw/sho240328_01.s1",
    lidar_format="Teslas",
    force_photocounting_units="MHz"
)

XML configuration

A minimal configuration for this step looks like this:

<Dial>
  <Read>
    <Lidar_files>
      <filenames type="relativepath">pathname_lidar</filenames>
      <format>Teslas</format>
      <Signal_type type="int">
        102, 103, 100, 101, 104, 105
      </Signal_type>
    </Lidar_files>
  </Read>
</Dial>

The Signal_type values identify each channel (see Signal types (Signal_Type) for details).

For multiple files, you can list them using a custom delimiter:

<filenames type="relativepath" delimiter=";">
    pathname_lidar_1;pathname_lidar_2
</filenames>
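As an illustration of how such a delimited list can be interpreted, here is a minimal sketch using only the Python standard library. The helper name split_filenames and the fallback delimiter are assumptions made for this example; the package's own XML reader may behave differently.

```python
import xml.etree.ElementTree as ET

XML_SNIPPET = """
<filenames type="relativepath" delimiter=";">
    pathname_lidar_1;pathname_lidar_2
</filenames>
"""


def split_filenames(element, default_delimiter=","):
    """Split the text of a <filenames> element into a list of paths.

    Hypothetical helper: the default delimiter is an assumption made
    for this sketch, not documented behavior of the package.
    """
    delimiter = element.get("delimiter", default_delimiter)
    text = element.text or ""
    # Strip surrounding whitespace and drop empty entries.
    return [part.strip() for part in text.split(delimiter) if part.strip()]


paths = split_filenames(ET.fromstring(XML_SNIPPET.strip()))
print(paths)  # ['pathname_lidar_1', 'pathname_lidar_2']
```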

Handling date-dependent file paths

In many setups, data are organized by date. You can define subfields to dynamically construct file names and directories based on the processing date:

<Dial>
  <Read>
    <date_to_process>
      <date type="date" format_date="iso">2024-03-28</date>
    </date_to_process>
    <Lidar_files>
      <filenames type="relativepath">
        <directory check="directory" type="relativepath">
          Configuration_files/Main_files/Lidar
        </directory>
        <data_name_prefixe>sho</data_name_prefixe>
        <data_name_suffixe>*.s*</data_name_suffixe>
        <data_name_date_format>%y%m%d</data_name_date_format>
      </filenames>
    </Lidar_files>
  </Read>
</Dial>
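The sketch below shows one plausible way these subfields combine into a search pattern. It assumes the pattern is assembled as directory / prefix + formatted date + suffix, which is consistent with the example file name sho240328_01.s1 used earlier but is not confirmed by the package documentation. The function name build_file_pattern is hypothetical.

```python
from datetime import date
from fnmatch import fnmatchcase


def build_file_pattern(directory, prefix, suffix, date_format, day):
    """Assemble <directory>/<prefix><formatted date><suffix>.

    Assumed ordering of the subfields; the package may combine them differently.
    """
    return f"{directory}/{prefix}{day.strftime(date_format)}{suffix}"


pattern = build_file_pattern(
    directory="Configuration_files/Main_files/Lidar",
    prefix="sho",
    suffix="*.s*",
    date_format="%y%m%d",
    day=date(2024, 3, 28),
)
print(pattern)  # Configuration_files/Main_files/Lidar/sho240328*.s*

# Check that a file name from the earlier example matches the wildcard part.
file_pattern = pattern.rsplit("/", 1)[1]
print(fnmatchcase("sho240328_01.s1", file_pattern))  # True
```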

Optional XML fields

You can customize how the raw LIDAR files are interpreted and standardized before being stored into the internal dataset.

  • Input units

    Defines the unit of the input photon-counting signals. By default, raw data are assumed to be photon counts (photocounting mode); set this field to MHz if the files store count rates instead. This setting applies only to photon-counting channels: analog channels must always be expressed in millivolts (mV).

    <unit_photocounting>MHz</unit_photocounting>

  • Remove channels

    Ignores specific channels during import. Provide the Signal_type value(s) of the channels that should not be read from the raw file.

    <Remove_channel type="int">101</Remove_channel>
    

    Multiple values can be provided, separated by commas (e.g. 101, 104, 105). This field is optional: if omitted, all channels present in the file are imported.

  • Force resolution

    Forces all channels to be resampled to a common altitude resolution. This is useful when the raw data have different range samplings between channels (e.g., 7.5 m and 15 m). The value is expressed in meters.

    <ForceResolution type="float" units="m">15</ForceResolution>
    

    Important

    All channels must share the same final resolution after resampling, otherwise later processing steps may fail.

  • Electronic delay

    Specifies an electronic delay (in nanoseconds) to be added to each channel during altitude vector computation. This compensates for differences in cable lengths or timing offsets between detection chains.

    <electronic_delay type="float" units="ns">
      100, 100, 100, 100, 100, 100
    </electronic_delay>
    

    The number of delay values should match the number of channels in the file.
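To get a feel for the magnitude of these settings, the standard lidar ranging relation z = c·t/2 converts a round-trip time into a range. The sketch below applies it to a delay value. How the package actually applies the delay (sign convention, any pointing-angle correction) is not stated here, so treat this only as an order-of-magnitude check. The function name is hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def delay_to_range_offset_m(delay_ns):
    """Convert a round-trip time delay in nanoseconds to a range in metres.

    The factor 1/2 accounts for the light travelling to the scatterer and back.
    """
    return SPEED_OF_LIGHT_M_PER_S * delay_ns * 1e-9 / 2.0


for delay_ns in (50.0, 100.0):
    print(f"{delay_ns:.0f} ns -> {delay_to_range_offset_m(delay_ns):.2f} m")
# 50 ns -> 7.49 m
# 100 ns -> 14.99 m
```

A 100 ns delay therefore shifts the altitude vector by roughly one 15 m range bin, and the 7.5 m and 15 m resolutions mentioned under Force resolution correspond to sampling intervals of about 50 ns and 100 ns.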

Signal types

Each measurement channel in the raw file must be associated with a unique Signal_type value, which identifies its physical meaning (e.g., wavelength, detection mode, polarization).

These values are declared in the XML configuration, separated by commas:

<Signal_type type="int">
    102, 103, 100, 101, 104, 105
</Signal_type>

They are essential for proper identification and processing of each LIDAR channel. See Signal types (Signal_Type) for the complete list and their definitions.
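The following sketch illustrates how the declared Signal_type values relate to the per-channel variable names (Raw_Lidar_Data_XXX) and to the Remove_channel option. It uses only the standard library and is not the package's parser. Placing Remove_channel directly under Lidar_files and the helper name parse_int_list are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumed layout: Remove_channel is placed next to Signal_type for illustration.
XML_SNIPPET = """
<Lidar_files>
  <Signal_type type="int">
    102, 103, 100, 101, 104, 105
  </Signal_type>
  <Remove_channel type="int">101</Remove_channel>
</Lidar_files>
"""


def parse_int_list(text):
    """Parse a comma-separated string of integers, ignoring blank entries."""
    return [int(item) for item in text.split(",") if item.strip()]


root = ET.fromstring(XML_SNIPPET.strip())
signal_types = parse_int_list(root.findtext("Signal_type", default=""))
removed = set(parse_int_list(root.findtext("Remove_channel", default="")))

# Keep the declared order, skipping channels listed for removal.
kept = [value for value in signal_types if value not in removed]
variable_names = [f"Raw_Lidar_Data_{value}" for value in kept]
print(variable_names)
```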

What happens internally

When called, import_lidar_data() performs the following steps:

  1. Reads file paths and parameters from the XML (or overrides from arguments).

  2. Loads the raw LIDAR data using the ReadLidar class.

  3. Converts these into an internal xarray.Dataset (attribute _lidar).

  4. Optionally computes statistical uncertainties on photon counts.

  5. Extracts the acquisition start/finish times and duration, and writes them to the XML tags Read/date_to_process/hours and Read/date_to_process/acquisition_time.

Note

If no valid files are found, the method simply returns without raising an error.
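Because a missing file does not raise an error, a calling script may want to check for input files itself before importing. The sketch below is one way to do that with the standard library. The helper find_raw_files is hypothetical and not part of the package, and the demonstration uses a temporary directory with empty placeholder files.

```python
import glob
import os
import tempfile


def find_raw_files(pattern):
    """Return the sorted files matching a glob pattern, or raise if none exist."""
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise FileNotFoundError(f"No raw LIDAR files match {pattern!r}")
    return matches


# Demonstration with a throwaway directory and empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("sho240328_01.s1", "sho240328_02.s1"):
        open(os.path.join(tmp, name), "w").close()
    found = find_raw_files(os.path.join(tmp, "sho240328*.s*"))
    found_names = [os.path.basename(path) for path in found]

print(found_names)  # ['sho240328_01.s1', 'sho240328_02.s1']
```

Calling such a check before lidar.import_lidar_data() turns a silent no-op into an explicit failure, which is usually easier to diagnose in batch processing.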

Outputs

After successful import, you can inspect the data:

print(lidar._lidar)
print(lidar._lidar.dims)
print(lidar._lidar.data_vars)

Typical contents (data variables and coordinates):

  • Raw_Lidar_Data_XXX (per channel)

  • time (datetime64)

  • altitude (float, m)

  • nb_shots (number of laser shots)

  • optional uncertainty arrays

See also