Utils

mwr_l12l2.utils.config_utils

mwr_l12l2.utils.config_utils.check_conf(conf, mandatory_keys, miss_description)[source]

check for mandatory keys of conf dictionary

If a key is missing, raises MissingConfig(‘xxx is a mandatory key ‘ + miss_description), where xxx is the name of the missing key.
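
The check described above can be sketched in a few lines. This is a minimal stand-alone illustration, not the package's code: `MissingConfig` is re-declared locally as a stand-in, `check_conf_sketch` is a hypothetical name, and the exact message format is assumed from the description.

```python
class MissingConfig(Exception):
    """Local stand-in for the exception raised on a missing mandatory key."""


def check_conf_sketch(conf, mandatory_keys, miss_description):
    """Raise MissingConfig for the first mandatory key that is not in conf."""
    for key in mandatory_keys:
        if key not in conf:
            raise MissingConfig(key + ' is a mandatory key ' + miss_description)


# illustrative config with 'outfile' missing
conf = {'request': {}, 'grid': '0.25/0.25'}
try:
    check_conf_sketch(conf, ['request', 'grid', 'outfile'], 'of the mars config file')
except MissingConfig as err:
    message = str(err)
    print(message)
```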

mwr_l12l2.utils.config_utils.get_conf(file)[source]

get conf dictionary from a yaml file. Does not perform any checks on the contents

mwr_l12l2.utils.config_utils.get_inst_config(file)[source]

get configuration for each instrument and check for completeness of config file

mwr_l12l2.utils.config_utils.get_log_config(file)[source]

get configuration for logger and check for completeness of config file

mwr_l12l2.utils.config_utils.get_mars_config(file, mandatory_keys=None, mandatory_keys_request=None)[source]

get configuration for mars request to obtain ECMWF data and check for completeness of config file

Parameters:
  • file – configuration file in yaml format to read in

  • mandatory_keys (optional) – mandatory primary keys. Default is [‘request’, ‘grid’, ‘outfile’] for full request. This list can be reduced for subsequent requests inheriting from the primary one

  • mandatory_keys_request (optional) – mandatory keys in request section. Default is [‘class’, ‘expver’, ‘type’, ‘stream’, ‘levtype’, ‘levelist’, ‘param’, ‘date’, ‘time’, ‘step’]. This list can be reduced for subsequent requests inheriting from the primary one
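
To illustrate how the two key lists interact, the sketch below checks a config dictionary against the documented defaults and against reduced lists for an inheriting request. It assumes the request keys sit in a nested `request` dictionary; `missing_mars_keys` and the example values are hypothetical and only serve the illustration.

```python
DEFAULT_KEYS = ['request', 'grid', 'outfile']
DEFAULT_REQUEST_KEYS = ['class', 'expver', 'type', 'stream', 'levtype',
                        'levelist', 'param', 'date', 'time', 'step']


def missing_mars_keys(conf, mandatory_keys=None, mandatory_keys_request=None):
    """Return missing keys; keys of the request section are prefixed with 'request.'."""
    if mandatory_keys is None:
        mandatory_keys = DEFAULT_KEYS
    if mandatory_keys_request is None:
        mandatory_keys_request = DEFAULT_REQUEST_KEYS
    missing = [k for k in mandatory_keys if k not in conf]
    request = conf.get('request', {})
    missing += ['request.' + k for k in mandatory_keys_request if k not in request]
    return missing


# a secondary request that inherits everything else from the primary one
secondary = {'request': {'levtype': 'sfc', 'param': '134'}, 'outfile': 'sfc.grb'}
print(missing_mars_keys(secondary, ['request', 'outfile'], ['levtype', 'param']))
print(missing_mars_keys(secondary))  # checked against the full default lists
```

With the reduced lists nothing is reported as missing, whereas the default lists flag `grid` and most of the request keys.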

mwr_l12l2.utils.config_utils.get_nc_format_config(file)[source]

get configuration for output NetCDF format and check for completeness of config file

mwr_l12l2.utils.config_utils.get_retrieval_config(file)[source]

get configuration for running the retrieval, check for completeness of config file and ensure absolute paths

mwr_l12l2.utils.config_utils.interpret_loglevel(conf)[source]

helper function to replace log level strings in conf by the corresponding log levels of the logging library
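
A conversion of this kind can be sketched with the standard `logging` module. The key name `loglevel` and the function name are assumptions made for the example; the keys the package actually inspects are not stated here.

```python
import logging


def interpret_loglevel_sketch(conf, key='loglevel'):
    """Replace a level name such as 'INFO' under conf[key] by the logging constant."""
    level = getattr(logging, str(conf[key]).upper(), None)
    if not isinstance(level, int):
        raise ValueError('unknown log level: {}'.format(conf[key]))
    conf[key] = level
    return conf


conf = interpret_loglevel_sketch({'loglevel': 'info'})
print(conf['loglevel'] == logging.INFO)
```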

mwr_l12l2.utils.config_utils.merge_mars_inst_config(mars_conf, inst_conf)[source]

merge mars config with the definitions in the instrument config for the model request, giving the instrument config precedence
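
The precedence rule can be pictured as a dictionary merge in which the second argument wins. The sketch below merges nested dictionaries recursively; whether the package merges recursively, and which instrument keys it considers, is not stated here, so both the helper `merge_with_precedence` and the example values are assumptions.

```python
def merge_with_precedence(base, override):
    """Recursively merge two dicts; values from override win on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_with_precedence(merged[key], value)
        else:
            merged[key] = value
    return merged


mars_conf = {'request': {'class': 'od', 'step': '0/to/24'}, 'grid': '0.25/0.25'}
inst_conf = {'request': {'step': '0/to/48'}}
merged = merge_with_precedence(mars_conf, inst_conf)
print(merged)
```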

mwr_l12l2.utils.config_utils.to_abspath(conf, keys)[source]

transform paths corresponding to keys in conf dictionary to absolute paths and return conf dict
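
As a rough illustration, the transformation could look as follows. The `base_dir` argument is hypothetical and added only to make the example explicit about what relative paths are resolved against; the sketch falls back to the current working directory.

```python
import os


def to_abspath_sketch(conf, keys, base_dir=None):
    """Make conf[key] absolute for each key in keys; absolute paths stay unchanged."""
    base_dir = os.getcwd() if base_dir is None else base_dir
    for key in keys:
        if not os.path.isabs(conf[key]):
            conf[key] = os.path.abspath(os.path.join(base_dir, conf[key]))
    return conf


conf = to_abspath_sketch({'data_dir': os.path.join('data', 'in'), 'name': 'demo'}, ['data_dir'])
print(conf['data_dir'])
```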

mwr_l12l2.utils.file_utils

mwr_l12l2.utils.file_utils.abs_file_path(*file_path)[source]

Make a relative file_path absolute with respect to the mwr_l12l2 project directory. Absolute paths will not be changed

mwr_l12l2.utils.file_utils.concat_filename(prefix, wigos, inst_id='', suffix='', ext='.nc')[source]

concatenate a filename according to E-PROFILE standards.

Parameters:
  • prefix – prefix of filename (including trailing _ if needed)

  • wigos – WIGOS-ID (or any other ID) of the station

  • inst_id – instrument ID. Will be appended to wigos using _ if not empty. Defaults to ‘’.

  • suffix – suffix part after the station and instrument ids (incl. leading _ if needed). Defaults to ‘’.

  • ext – extension. Defaults to ‘.nc’. Explicitly specify ext=‘’ for no extension
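
The parameter descriptions above translate into a simple concatenation. The following is a sketch that follows those descriptions, not the package source; the prefix and station ID in the example are purely illustrative.

```python
def concat_filename_sketch(prefix, wigos, inst_id='', suffix='', ext='.nc'):
    """Join prefix, station ID, optional instrument ID, suffix and extension."""
    station = wigos if inst_id == '' else wigos + '_' + inst_id
    return prefix + station + suffix + ext


print(concat_filename_sketch('MWR_1C01_', '0-20000-0-06610', 'A'))
print(concat_filename_sketch('MWR_1C01_', '0-20000-0-06610', suffix='_202304251200', ext=''))
```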

mwr_l12l2.utils.file_utils.datestr_from_filename(filename, suffix='')[source]

return date string from filename, assuming it to be the last date-like block (separated by _) before suffix + ext

Accepted dates are in form ‘yyyymmddHHMM’, ‘yyyymmddHHMMSS’, ‘yyyymmdd’, ‘yymm’ etc. but not separated by -, _ or :

Parameters:
  • filename – filename as str. Can contain path and extension.

  • suffix (optional) – suffix of the filename coming after the date and before the extension. Defaults to ‘’.

Returns:

string containing the date in same representation as in the filename
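
One way to implement the "last date-like block" rule is to strip path, extension and suffix, split the remainder at `_` and scan the blocks from the right. The sketch below treats any block of 4 to 14 digits as date-like, which is an assumption derived from the accepted forms listed above; the function name and the example filenames are hypothetical.

```python
import os
import re


def datestr_from_filename_sketch(filename, suffix=''):
    """Return the last all-digit block (4 to 14 digits) before suffix + ext."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    if suffix and stem.endswith(suffix):
        stem = stem[:-len(suffix)]
    for block in reversed(stem.split('_')):
        if re.fullmatch(r'\d{4,14}', block):
            return block
    raise ValueError('no date-like block found in ' + filename)


print(datestr_from_filename_sketch('/data/MWR_1C01_0-20000-0-06610_A_202304251200.nc'))
print(datestr_from_filename_sketch('L2_0-20000-0-06610_A_20230425_v1.nc', suffix='_v1'))
```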

mwr_l12l2.utils.file_utils.datetime64_from_filename(filename, *args, **kwargs)[source]

get numpy.datetime64 object from filename. Called in the same way as datestr_from_filename()

mwr_l12l2.utils.file_utils.dict_to_file(data, file, sep, header=None, remove_brackets=False, remove_parentheses=False, remove_braces=False)[source]

write dictionary contents to a file. One item per line matching keys and values using ‘sep’.

Parameters:
  • data – dictionary to write to file in question. Numpy 1d-arrays as values are ok, matrices not

  • file – output file incl. path and extension

  • sep – separator sign between key and value as string. Can include whitespaces around separator.

  • header – header string to write to the head of the file before the first dictionary item. Defaults to None

  • remove_brackets (optional) – Remove square brackets [ and ], e.g. from lists, while printing to file. Defaults to False

  • remove_parentheses (optional) – Remove parentheses ( and ), e.g. from tuples, while printing to file. Defaults to False

  • remove_braces (optional) – Remove curly braces { and } while printing to file. Defaults to False
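
The line-per-item layout can be sketched as below. Only the `remove_brackets` option is reproduced to keep the example short, values are rendered with `str()`, and the function name and demo data are made up for the illustration.

```python
import os
import tempfile


def dict_to_file_sketch(data, file, sep, header=None, remove_brackets=False):
    """Write one 'key<sep>value' line per dictionary item, after an optional header."""
    with open(file, 'w') as f:
        if header is not None:
            f.write(header + '\n')
        for key, value in data.items():
            text = str(value)
            if remove_brackets:
                text = text.replace('[', '').replace(']', '')
            f.write('{}{}{}\n'.format(key, sep, text))


out_file = os.path.join(tempfile.mkdtemp(), 'example.txt')
dict_to_file_sketch({'station': 'PAY', 'channels': [1, 2, 3]}, out_file, ' = ',
                    header='# demo', remove_brackets=True)
with open(out_file) as f:
    lines = f.read().splitlines()
print(lines)
```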

mwr_l12l2.utils.file_utils.generate_output_filename(basename, timestamp_src, files_in=None, time=None, ext='nc')[source]

generate filename in form {basename}{timestamp}.{ext} where timestamp comes from input files or time vector

Parameters:
  • basename – the first part of the filename without the date

  • timestamp_src – source of output file timestamp. Can be ‘instamp_min’/‘instamp_max’ for using the smallest/largest timestamp of the input filenames (needs ‘files_in’) or ‘time_min’/‘time_max’ for the smallest/largest time in the data in format yyyymmddHHMM (needs ‘time’).

  • files_in – list of input filenames to processing as a basis for timestamp selection

  • time – xarray.DataArray time vector of the data in numpy.datetime64 format. Assumed to be sorted

  • ext (optional) – filename extension. Defaults to ‘nc’. Empty not permitted.
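
The selection logic behind timestamp_src can be sketched without xarray. In this simplified illustration `file_stamps` is a hypothetical list of timestamp strings of equal length, already extracted from the input filenames, and `time` is a sorted list of `datetime` objects standing in for the time vector.

```python
from datetime import datetime


def output_filename_sketch(basename, timestamp_src, file_stamps=None, time=None, ext='nc'):
    """Build '{basename}{timestamp}.{ext}' from filename stamps or from the data time."""
    if ext == '':
        raise ValueError('empty extension is not permitted')
    if timestamp_src in ('instamp_min', 'instamp_max'):
        pick = min if timestamp_src == 'instamp_min' else max
        timestamp = pick(file_stamps)  # lexicographic order works for equal-length stamps
    elif timestamp_src in ('time_min', 'time_max'):
        t = time[0] if timestamp_src == 'time_min' else time[-1]  # time assumed sorted
        timestamp = t.strftime('%Y%m%d%H%M')
    else:
        raise ValueError('unknown timestamp_src: ' + str(timestamp_src))
    return '{}{}.{}'.format(basename, timestamp, ext)


times = [datetime(2023, 4, 25, 0, 0), datetime(2023, 4, 25, 23, 50)]
print(output_filename_sketch('L2_', 'time_max', time=times))
print(output_filename_sketch('L2_', 'instamp_min', file_stamps=['202304251200', '202304250000']))
```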

mwr_l12l2.utils.file_utils.replace_path(path, part_to_replace, replace_by)[source]

replace parts of a path with another path (e.g. useful when mounting a path to another location)
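
For the mounting use case mentioned above, a plain string replacement is often enough. The sketch replaces only the first occurrence, which is an assumption; the paths are illustrative.

```python
def replace_path_sketch(path, part_to_replace, replace_by):
    """Replace the first occurrence of part_to_replace in path by replace_by."""
    return path.replace(part_to_replace, replace_by, 1)


print(replace_path_sketch('/data/mwr/in/file.nc', '/data/mwr', '/mnt/share'))
```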

mwr_l12l2.utils.file_utils.timestamp_to_float(timestamp)[source]

transform timestamp string to a float between 0 and 1 (integer of timestamp normalised by its length)
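
One reading of "normalised by its length" is a division of the integer value by 10 to the power of the number of digits, which maps any digit string into the interval from 0 to 1 and preserves the order of timestamps of equal length. The sketch below implements that reading; it is an interpretation, not the confirmed formula.

```python
def timestamp_to_float_sketch(timestamp):
    """Map a digit string to [0, 1) by dividing its integer value by 10**len."""
    return int(timestamp) / 10 ** len(timestamp)


early = timestamp_to_float_sketch('202304250000')
late = timestamp_to_float_sketch('202304251200')
print(early < late)
```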

mwr_l12l2.utils.data_utils

mwr_l12l2.utils.data_utils.datetime64_to_hour(x)[source]

transform numpy.datetime64 to a float representing time of day in hours

mwr_l12l2.utils.data_utils.datetime64_to_str(x, date_format)[source]

transform numpy.datetime64 to a datestring corresponding to ‘date_format’

mwr_l12l2.utils.data_utils.drop_duplicates(ds, dim)[source]

drop data from all variables in ds at positions where the dimension vector dim contains duplicate values

Returns:

ds with unique dimension vector

mwr_l12l2.utils.data_utils.get_from_nc_files(files_in, concat_dim='time')[source]

read (several) NetCDF input files to a xarray.Dataset and fix time encoding for correct nc output

mwr_l12l2.utils.data_utils.get_nearest(data, find_vals)[source]

find the values in data that are nearest to the values given in find_vals
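
Reading the signature as "for each entry of find_vals, return the closest value in data", the lookup can be sketched in plain Python. The package presumably operates on arrays; this list-based version with a hypothetical name only illustrates the idea.

```python
def get_nearest_sketch(data, find_vals):
    """For each value in find_vals return the element of data closest to it."""
    return [min(data, key=lambda d: abs(d - v)) for v in find_vals]


nearest = get_nearest_sketch([0, 10, 20, 30], [12, 27])
print(nearest)
```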

mwr_l12l2.utils.data_utils.has_data(ds, var)[source]

check if a variable in a xarray.Dataset exists and contains non-NaN data

mwr_l12l2.utils.data_utils.lists_to_np(indict)[source]

transform all values of a dict with type list to a numpy.ndarray

mwr_l12l2.utils.data_utils.scalars_to_time(ds, variables, time_dim='time')[source]

expand scalar variables onto time dimension to form an array of len(time) containing identical values

Parameters:
  • ds – xarray.Dataset containing all requested scalar variables and the time dimension to transform to

  • variables – list of variables to expand onto the time dimension. These will be replaced in-place

  • time_dim (optional) – name of the time dimension. Defaults to ‘time’.

mwr_l12l2.utils.data_utils.set_encoding(ds, vars, enc)[source]

(re-)set encoding of variables in a dataset

Parameters:
  • ds – xarray.Dataset containing the data

  • vars – list of variables for which encoding is to be adapted

  • enc – encoding dictionary (containing e.g. units) that the encoding of the respective variables shall be set to.

Returns:

ds with updated encoding for var in vars

mwr_l12l2.utils.data_utils.vectors_to_time(ds, variables, time_dim='time')[source]

expand constant vector variables onto time dimension to form an array of len(time) containing identical values. TODO: merge with scalars_to_time

Parameters:
  • ds – xarray.Dataset containing all requested vector variables and the time dimension to transform to

  • variables – list of variables to expand onto the time dimension. These will be replaced in-place

  • time_dim (optional) – name of the time dimension. Defaults to ‘time’.