Retrieval
major modules
mwr_l12l2.retrieval.retrieval
This is the main module orchestrating the retrieval including pre- and post-processing
- class mwr_l12l2.retrieval.retrieval.Retrieval(conf, selected_instrument=None, node=0)[source]
Bases: object
Class for gathering and preparing all necessary information to run the retrieval
- Parameters:
conf – configuration file or dictionary
node – identifier for different parallel TROPoe runs. Defaults to 0.
- choose_model_files()[source]
choose the most recent model forecast run covering the time range of the MWR data, and the corresponding zg file
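The selection rule can be sketched as follows. This is only an illustration under stated assumptions: the `ForecastRun` tuple, its fields, and the file names are hypothetical stand-ins and not the package's own data structures.

```python
from datetime import datetime
from typing import NamedTuple, Optional, Sequence


class ForecastRun(NamedTuple):
    """Hypothetical description of one model forecast file."""
    filename: str
    init_time: datetime    # time at which the forecast run was initialised
    valid_start: datetime  # first time step contained in the file
    valid_end: datetime    # last time step contained in the file


def latest_covering_run(runs: Sequence[ForecastRun],
                        obs_start: datetime,
                        obs_end: datetime) -> Optional[ForecastRun]:
    """Return the most recently initialised run covering [obs_start, obs_end]."""
    covering = [r for r in runs
                if r.valid_start <= obs_start and r.valid_end >= obs_end]
    if not covering:
        return None  # no forecast run spans the observation period
    return max(covering, key=lambda r: r.init_time)


runs = [
    ForecastRun('fc_2024010100.grb', datetime(2024, 1, 1, 0),
                datetime(2024, 1, 1, 0), datetime(2024, 1, 2, 0)),
    ForecastRun('fc_2024010112.grb', datetime(2024, 1, 1, 12),
                datetime(2024, 1, 1, 12), datetime(2024, 1, 2, 12)),
]
best = latest_covering_run(runs, datetime(2024, 1, 1, 13), datetime(2024, 1, 1, 18))
print(best.filename)  # fc_2024010112.grb
```

Both runs cover the requested period here, so the one initialised later is preferred.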
- list_obs_files()[source]
get file lists for the selected station
Note
this method shall list all (MWR) files, not just the ones matching the time settings. That way, old (obsolete) files are removed when prepare_obs() is run with delete_mwr_in=True
- postprocess_tropoe()[source]
post-process the outputs of TROPoe and write to NetCDF file matching the E-PROFILE format
- prepare_model()[source]
extract reference profile and uncertainties as well as surface data from ECMWF to files readable by TROPoe
- prepare_obs(start_time=None, end_time=None, delete_mwr_in=False)[source]
Function to prepare E-PROFILE MWR and ALC inputs.
- Parameters:
start_time (datetime64) – The start time for selecting the data.
end_time (datetime64) – The end time for selecting the data.
delete_mwr_in (bool) – Flag indicating whether to delete the MWR files after processing.
- Raises:
MissingDataError – If none of the MWR files contain data between the required time limits.
MissingDataError – If there is not enough data to run the retrieval.
Finally, it sets the necessary attributes for further processing.
- prepare_paths(datestamp='', netcdf_ext='.nc')[source]
prepare input and output paths and filenames from config
- prepare_tropoe_dir()[source]
set up an empty tropoe tmp file directory for the current node (remove old one if existing)
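A minimal sketch of the "remove the old directory, then create an empty one" pattern is given below. The `node<N>` naming scheme and the function name are assumptions made for this example; the package may organise its tmp directories differently.

```python
import shutil
import tempfile
from pathlib import Path


def reset_node_tmp_dir(base_dir, node=0):
    """Create an empty tmp directory for one node, removing any old one first."""
    # 'node<N>' is a hypothetical naming scheme chosen for this sketch
    node_dir = Path(base_dir) / f'node{node}'
    if node_dir.exists():
        shutil.rmtree(node_dir)  # drop leftovers from a previous run
    node_dir.mkdir(parents=True)
    return node_dir


# demonstrate inside a throw-away directory
with tempfile.TemporaryDirectory() as base:
    first = reset_node_tmp_dir(base, node=1)
    (first / 'leftover.txt').write_text('old content')
    second = reset_node_tmp_dir(base, node=1)
    remaining = list(second.iterdir())
    print(remaining)  # []
```

Keeping one directory per node means parallel runs cannot overwrite each other's temporary files.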
- run(start_time=None, end_time=None)[source]
run the entire retrieval chain
- Parameters:
start_time (optional) – earliest time from which to consider data. If not specified, all data younger than ‘max_age’ specified in retrieval config will be used or, if ‘max_age’ is None, age of data is unlimited.
end_time (optional) – latest time from which to consider data. If not specified, all data received by now is processed.
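The defaults described for start_time and end_time can be sketched like this. It is an illustration only: the function name is hypothetical, and `max_age` is assumed to be a `datetime.timedelta`, which may differ from how the retrieval config stores it.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def resolve_time_window(start_time: Optional[datetime] = None,
                        end_time: Optional[datetime] = None,
                        max_age: Optional[timedelta] = None,
                        now: Optional[datetime] = None
                        ) -> Tuple[Optional[datetime], datetime]:
    """Fill in missing time limits; a start of None means 'no lower limit'."""
    now = now if now is not None else datetime.now()
    if end_time is None:
        end_time = now  # process everything received so far
    if start_time is None and max_age is not None:
        start_time = now - max_age  # only data younger than max_age
    return start_time, end_time


now = datetime(2024, 1, 1, 12)
limited = resolve_time_window(max_age=timedelta(hours=6), now=now)
unlimited = resolve_time_window(now=now)
print(limited)
print(unlimited)
```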
- select_instrument()[source]
Selects the instrument which has the oldest MWR file in the input directory (based on its filename).
Note that this method is not called in operational processing or when the WIGOS number is already set up.
This method finds the oldest file in the folder and extracts the station ID from the filename or file content. It then retrieves the instrument configuration based on the station ID and instrument ID. Finally, it sets the necessary attributes for further processing.
- Raises:
MissingDataError – If no MWR data is found in the specified directory.
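Finding the oldest file from its filename might look like the sketch below. The naming scheme `MWR_<station_id>_<yyyymmddHHMM>.nc` is an assumption made for this example, and `MissingDataError` is redefined locally as a stand-in for the package's exception.

```python
import re
from datetime import datetime

# Hypothetical naming scheme for this sketch: MWR_<station_id>_<yyyymmddHHMM>.nc
FILENAME_PATTERN = re.compile(r'^MWR_(?P<station>[^_]+)_(?P<stamp>\d{12})\.nc$')


class MissingDataError(Exception):
    """Local stand-in for the package's exception of the same name."""


def oldest_station(filenames):
    """Return (station_id, filename) of the file with the earliest timestamp."""
    candidates = []
    for name in filenames:
        match = FILENAME_PATTERN.match(name)
        if match:  # files not following the naming scheme are skipped
            stamp = datetime.strptime(match['stamp'], '%Y%m%d%H%M')
            candidates.append((stamp, match['station'], name))
    if not candidates:
        raise MissingDataError('no MWR data found')
    _, station, name = min(candidates)  # tuples sort by timestamp first
    return station, name


files = ['MWR_0-20000-0-10393_202401011200.nc',
         'MWR_0-20000-0-06610_202401010600.nc',
         'notes.txt']
result = oldest_station(files)
print(result)  # ('0-20000-0-06610', 'MWR_0-20000-0-06610_202401010600.nc')
```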
helper modules
mwr_l12l2.retrieval.tropoe_helpers
This module contains utilities for running the TROPoe container
- mwr_l12l2.retrieval.tropoe_helpers.add_flags(data)[source]
Add dummy quality flags to the given data. TODO: define the quality flags.
Parameters: data (xarray.Dataset): The input data.
Returns: xarray.Dataset: The data with quality flags added.
- mwr_l12l2.retrieval.tropoe_helpers.add_variables_attrs(data, derived_product_list)[source]
Add variable attributes linked to retrieval_type, retrieval_elevation_angles and retrieval_frequency
- mwr_l12l2.retrieval.tropoe_helpers.extract_attrs(data)[source]
Extracts some attributes from the TROPoe outputs and renames them.
- Parameters:
data (xr.Dataset) – The input data containing the variables to extract the attributes from.
- Returns:
The input data with the added attributes
- Return type:
data (xr.Dataset)
- mwr_l12l2.retrieval.tropoe_helpers.extract_avk(data, tropoe_out_config)[source]
Extracts prior information from the given data based on the TROPoe output configuration. #TODO: this function could be made more generic, e.g. by passing in a list of variables to extract prior information from.
- Parameters:
data (xr.Dataset) – The input data containing the variables to extract prior information from.
tropoe_out_config (dict) – The TROPoe output configuration dictionary.
- Returns:
The input data with the prior information variables added.
- Return type:
xr.Dataset
- Raises:
FileExistsError – If the tropoe_out_config argument is not a dictionary.
- mwr_l12l2.retrieval.tropoe_helpers.extract_prior(data, tropoe_out_config)[source]
Extracts prior information from the given data based on the TROPoe output configuration. #TODO: this function could be made more generic, e.g. by passing in a list of variables to extract prior information from.
- Parameters:
data (xr.Dataset) – The input data containing the variables to extract prior information from.
tropoe_out_config (dict) – The TROPoe output configuration dictionary.
- Returns:
The input data with the prior information variables added.
- Return type:
xr.Dataset
- Raises:
FileExistsError – If the tropoe_out_config argument is not a dictionary.
- mwr_l12l2.retrieval.tropoe_helpers.height_to_altitude(data, station_altitude)[source]
transform height above ground level to altitude above mean sea level and add as dataarray and coordinate
- Parameters:
data – xarray.Dataset in which to change height to altitude
station_altitude – single value or array defining station altitude (in an array the first entry is considered)
- Returns:
updated dataset with altitude variable added (height also kept) and coordinate swapped from height to altitude
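The arithmetic behind this conversion is shown below on plain Python lists. This is a simplified sketch: the documented function operates on an xarray.Dataset and also swaps the coordinate, which is not reproduced here, and the function name is hypothetical.

```python
from collections.abc import Sequence


def heights_to_altitudes(heights_agl, station_altitude):
    """Convert heights above ground level to altitudes above mean sea level."""
    # accept a single value or a sequence; only the first entry of a sequence is used
    if isinstance(station_altitude, Sequence) and not isinstance(station_altitude, str):
        station_altitude = station_altitude[0]
    return [h + station_altitude for h in heights_agl]


from_scalar = heights_to_altitudes([0.0, 50.0, 100.0], 491.0)
from_list = heights_to_altitudes([0.0, 50.0, 100.0], [491.0, 500.0])
print(from_scalar)  # [491.0, 541.0, 591.0]
print(from_list)    # [491.0, 541.0, 591.0]
```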
- mwr_l12l2.retrieval.tropoe_helpers.model_to_tropoe(model, station_altitude)[source]
extract reference profile and uncertainties as well as surface data from ECMWF to files readable by TROPoe
- Parameters:
model – instance of mwr_l12l2.model.ecmwf.interpret_ecmwf.ModelInterpreter on which run() has been executed
- Returns:
prof_data: xarray.Dataset containing model profile data in a form writable to an input nc for TROPoe
sfc_data: xarray.Dataset containing model surface data in a form writable to an input nc for TROPoe
- mwr_l12l2.retrieval.tropoe_helpers.run_tropoe(data_path, date, start_hour, end_hour, vip_file, apriori_file, data_mountpoint='/data', tropoe_img='davidturner53/tropoe', tmp_path='mwr_l12l2/retrieval/tmp', verbosity=1)[source]
Run TROPoe container using podman for one specific retrieval
- Parameters:
data_path – path that will be mounted to /data inside the container. Absolute path or relative to project dir
date – date for which retrieval shall be executed. For now retrievals cannot encompass more than one day. Make sure that it is of type datetime.datetime or a string of type ‘yyyymmdd’. Alternatively you can pass 0 or ‘0’ to let TROPoe print back the vip-file parameter options.
start_hour – hour of the day defining the start time of the retrieval period. Can be a float, int or string.
end_hour – hour of the day defining the end time of the retrieval period. Can be a float, int or string.
vip_file – path to vip file relative to data_path or packaged inside container if matching ‘prior.*’
apriori_file – path to a-priori file relative to data_path
data_mountpoint (optional) – where the data path will be mounted
tropoe_img (optional) – reference of TROPoe container image to use. Will take latest available by default
tmp_path (optional) – tmp path that will be mounted to /tmp inside the container. Uses a dummy folder by default
verbosity (optional) – verbosity level of TROPoe. Defaults to 1
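To show how these parameters could translate into a container call, the sketch below normalises the date argument and assembles a `podman run` argument list without executing it. The function names and the environment variable names (`RETRIEVAL_DATE`, `START_HOUR` and so on) are placeholders invented for this example; the variables actually expected by the TROPoe image should be taken from the TROPoe documentation. The special handling of files matching ‘prior.*’ is not reproduced.

```python
import os
from datetime import datetime


def normalise_date(date):
    """Return the date as a 'yyyymmdd' string, or '0' to request the vip options."""
    if isinstance(date, datetime):
        return date.strftime('%Y%m%d')
    date = str(date)
    if date == '0':
        return date
    if len(date) != 8 or not date.isdigit():
        raise ValueError(f"expected 'yyyymmdd', got {date!r}")
    datetime.strptime(date, '%Y%m%d')  # rejects impossible dates such as 20241340
    return date


def build_podman_command(data_path, date, start_hour, end_hour, vip_file, apriori_file,
                         data_mountpoint='/data', tropoe_img='davidturner53/tropoe',
                         tmp_path='mwr_l12l2/retrieval/tmp', verbosity=1):
    """Assemble an argument list for 'podman run' (illustrative only)."""
    # ASSUMPTION: these environment variable names are placeholders, not TROPoe's own
    env = {
        'RETRIEVAL_DATE': normalise_date(date),
        'START_HOUR': str(start_hour),
        'END_HOUR': str(end_hour),
        'VIP_FILE': f'{data_mountpoint}/{vip_file}',
        'APRIORI_FILE': f'{data_mountpoint}/{apriori_file}',
        'VERBOSITY': str(verbosity),
    }
    cmd = ['podman', 'run', '--rm',
           '-v', f'{os.path.abspath(data_path)}:{data_mountpoint}',  # data mount
           '-v', f'{os.path.abspath(tmp_path)}:/tmp']                # tmp mount
    for key, value in env.items():
        cmd += ['-e', f'{key}={value}']
    cmd.append(tropoe_img)
    return cmd


cmd = build_podman_command('/srv/mwr', datetime(2024, 1, 1), 0, 24, 'vip.txt', 'prior.nc')
print(' '.join(cmd))
```

Building the command as a list rather than a single string allows it to be passed to `subprocess.run` without shell quoting issues.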