Tools & utilities

Generate toy data

roger.tools.make_toy_data.make_toy_forcing(base_path, ndays=10, nrows=1, ncols=1, event_type='rain', enable_groundwater_boundary=False, enable_crop_phenology=False, float_type='float32')[source]

Make toy forcing with synthetically generated data.

roger.tools.make_toy_data.make_toy_forcing_event(base_path, ta=10, nhours=5, dt=10, nrows=1, ncols=1, event_type='rain', rain_sum=10, heavyrain_sum=60, float_type='float32')[source]

Make toy forcing for a single event with synthetically generated data.

roger.tools.make_toy_data.make_toy_forcing_tracer(base_path, tracer='Br', start_date='1/10/2010', ndays=10, nrows=1, ncols=1, float_type='float32')[source]

Make toy tracer forcing with synthetically generated data.

Setup tools

roger.tools.setup.write_crop_rotation(input_dir, nrows=1, ncols=1, float_type='float32')[source]

Writes crop rotation data from CSV to NetCDF

Parameters:
  • input_dir (Path) – path to directory with input data

  • nrows (int, optional) – number of rows

  • ncols (int, optional) – number of columns

roger.tools.setup.write_forcing_event(input_dir, nrows=1, ncols=1, uniform=True, prec_correction=False, float_type='float32')[source]

Writes forcing data for a single event from TXT to NetCDF

Parameters:
  • input_dir (Path) – path to directory with input data

  • nrows (int, optional) – number of rows

  • ncols (int, optional) – number of columns

  • uniform (bool, optional) – True if time series are used as input data

  • prec_correction (bool, optional) – if True, precipitation is corrected according to Richter (1995)

roger.tools.setup.write_forcing(input_dir, nrows=1, ncols=1, uniform=True, enable_crop_phenology=False, enable_groundwater_boundary=False, enable_film_flow=False, end_event=21600, prec_correction=None, float_type='float32')[source]

Writes forcing data to NetCDF

Parameters:
  • input_dir (Path) – path to directory with input data

  • nrows (int, optional) – number of rows

  • ncols (int, optional) – number of columns

  • uniform (bool, optional) – True if time series are used as input data

  • enable_crop_phenology (bool, optional) – if True, daily minimum and maximum air temperature are required

  • enable_groundwater_boundary (bool, optional) – if True, groundwater head is required

  • enable_film_flow (bool, optional) – if True, number of events is provided

  • end_event (int, optional) – seconds with no precipitation after which an event is considered to be finished

  • prec_correction (str, optional) – if set, precipitation is corrected according to Richter (1995)

roger.tools.setup.write_forcing_tracer(input_dir, tracer, nrows=1, ncols=1, uniform=True, float_type='float32')[source]

Writes tracer forcing data from TXT to NetCDF

Parameters:
  • input_dir (Path) – path to directory with input data

  • tracer (str) – name of tracer (e.g. d18O)

  • nrows (int, optional) – number of rows

  • ncols (int, optional) – number of columns

  • uniform (bool, optional) – True if time series are used as input data

roger.tools.setup.precipitation_correction(prec, ta, month, horizontal_shielding='b1')[source]

Correction of precipitation according to Richter (1995).

Parameters:
  • prec (np.ndarray) – precipitation at time step t (in mm)

  • ta (np.ndarray) – air temperature at time step t (in degrees Celsius)

  • month (int) – month at time step t

  • horizontal_shielding (str) – b1 = open location, b2 = slightly protected, b3 = moderately protected, b4 = strongly protected

Returns:

prec_corr – corrected precipitation at time step t (in mm)

Return type:

onp.ndarray

Notes

Richter, D.: Ergebnisse methodischer Untersuchungen zur Korrektur des systematischen Meßfehlers des Hellmann-Niederschlagsmessers, Berichte des Deutschen Wetterdienstes, Selbstverlag des Deutschen Wetterdienstes, Offenbach am Main, 1995.

roger.tools.setup.validate(data)[source]

Check if DataFrame has correct type and is numerical.

This function checks if the input is a pd.DataFrame and throws an error in case of incorrect data.

Parameters:

data (pd.DataFrame) – model input data

Raises:

ValueError – In case non-numerical data is passed

Evaluation of simulations

roger.tools.evaluation.join_obs_on_sim(idx, sim_vals, df_obs, rm_na=False)[source]

Join observed values on simulated values.

Parameters:
  • idx (pd.DatetimeIndex) – time index

  • sim_vals (onp.ndarray) – simulated values

  • df_obs (pd.DataFrame) – observed values

  • rm_na (boolean, optional) – whether NaNs are removed. default is False.

Returns:

df – DataFrame containing simulated and observed values.

Return type:

pd.DataFrame

roger.tools.evaluation.plot_sim(df, y_lab='', ls_obs='line', x_lab='Time', ylim=None)[source]

Plot simulated values.

Parameters:
  • df (pd.DataFrame) – Dataframe with simulated and observed values

  • y_lab (str) – label of y-axis

  • ls_obs (str, optional) – linestyle of observations

  • x_lab (str, optional) – label of x-axis

  • ylim (tuple, optional) – y-axis limits

Returns:

fig – Plot for observed and simulated values

Return type:

Figure

roger.tools.evaluation.plot_sim_cum(df, y_lab='', ls_obs='line', x_lab='Time', ylim=None)[source]

Plot cumulated simulated values.

Parameters:
  • df (pd.DataFrame) – Dataframe with simulated and observed values

  • y_lab (str) – label of y-axis

  • ls_obs (str, optional) – linestyle of observations

  • x_lab (str, optional) – label of x-axis

  • ylim (tuple, optional) – y-axis limits

Returns:

fig – Plot for observed and simulated values

Return type:

Figure

roger.tools.evaluation.plot_obs_sim(df, y_lab='', ls_obs='line', x_lab='Time', ylim=None)[source]

Plot observed and simulated values.

Parameters:
  • df (pd.DataFrame) – Dataframe with simulated and observed values

  • y_lab (str) – label of y-axis

  • ls_obs (str, optional) – linestyle of observations

  • x_lab (str, optional) – label of x-axis

  • ylim (tuple, optional) – y-axis limits

Returns:

fig – Plot for observed and simulated values

Return type:

Figure

roger.tools.evaluation.plot_obs_sim_year(df, y_lab, start_month_hyd_year=10, ls_obs='line', x_lab='Time', ylim=None)[source]

Plot observed and simulated values for each hydrologic year.

Parameters:
  • df (pd.DataFrame) – Dataframe with simulated and observed values

  • y_lab (str) – label of y-axis

  • start_month_hyd_year (int, optional) – starting month of hydrologic year

  • ls_obs (str, optional) – linestyle of observations

  • x_lab (str, optional) – label of x-axis

  • ylim (tuple, optional) – y-axis limits

Returns:

figs – list with figures

Return type:

list

roger.tools.evaluation.plot_obs_sim_cum(df, y_lab, x_lab='Time')[source]

Plot cumulated observed and simulated values.

Parameters:
  • df (pd.DataFrame) – Dataframe with simulated and observed values

  • y_lab (str) – label of y-axis

  • x_lab (str, optional) – label of x-axis

Returns:

fig – Plot for observed and simulated values

Return type:

Figure

roger.tools.evaluation.plot_obs_sim_cum_year(df, y_lab, start_month_hyd_year=10, x_lab='Time')[source]

Plot cumulated observed and simulated values for each hydrologic year.

Parameters:
  • df (pd.DataFrame) – Dataframe with simulated and observed values

  • y_lab (str) – label of y-axis

  • start_month_hyd_year (int, optional) – starting month of hydrologic year

  • x_lab (str, optional) – label of x-axis

Returns:

figs – list with figures

Return type:

list

roger.tools.evaluation.plot_obs_sim_cum_year_facet(df, y_lab, start_month_hyd_year=10, x_lab='Time')[source]

Plot cumulated observed and simulated values for each hydrologic year.

Parameters:
  • df (pd.DataFrame) – Dataframe with simulated and observed values

  • y_lab (str) – label of y-axis

  • start_month_hyd_year (int, optional) – starting month of hydrologic year

  • x_lab (str, optional) – label of x-axis

Returns:

fig

Return type:

Figure

roger.tools.evaluation.plot_sim_cum_year_facet(df, y_lab, start_month_hyd_year=10, x_lab='Time')[source]

Plot cumulated simulated values for each hydrologic year.

Parameters:
  • df (pd.DataFrame) – Dataframe with simulated values

  • y_lab (str) – label of y-axis

  • start_month_hyd_year (int, optional) – starting month of hydrologic year

  • x_lab (str, optional) – label of x-axis

Returns:

fig

Return type:

Figure

roger.tools.evaluation.time_to_num(idx, time='days')[source]

Convert DatetimeIndex to numeric range. Conversion is based either on days or hours.

Parameters:
  • idx (pd.DatetimeIndex) – variable time index

  • time (str, optional) – time unit

Returns:

idx_num – numerical date range

Return type:

onp.array
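The conversion can be sketched in plain Python. This is only an illustration, assuming the numeric range counts elapsed days or hours since the first time stamp; the helper name `elapsed_units` is hypothetical, and the roger implementation may use a different origin or return a NumPy array.

```python
from datetime import datetime


def elapsed_units(times, time="days"):
    """Return elapsed time since the first entry, in days or hours (assumed convention)."""
    seconds_per_unit = {"days": 86400.0, "hours": 3600.0}[time]
    start = times[0]
    return [(t - start).total_seconds() / seconds_per_unit for t in times]


times = [datetime(2010, 10, 1), datetime(2010, 10, 2), datetime(2010, 10, 3, 12)]
days = elapsed_units(times, "days")    # elapsed days since the first time stamp
hours = elapsed_units(times, "hours")  # elapsed hours since the first time stamp
```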

roger.tools.evaluation.assign_hyd_year(df, start_month_hyd_year=10)[source]

Assign hydrologic year.

Parameters:
  • df (DataFrame) – contains hydrologic values

  • start_month_hyd_year (int, optional) – starting month of hydrologic year

Returns:

contains hydrologic values and a column with the assigned hydrologic year

Return type:

DataFrame
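As an illustration of the idea, the sketch below labels single dates with a hydrologic year. It assumes the common convention that a hydrologic year is named after the calendar year in which it ends; the function name `hydrologic_year` is hypothetical and the labelling used by roger may differ.

```python
from datetime import date


def hydrologic_year(day, start_month_hyd_year=10):
    """Label a date with the calendar year in which its hydrologic year ends.

    Assumed convention: with a start month of 10, 1 October 2010 belongs
    to hydrologic year 2011.
    """
    if start_month_hyd_year == 1:
        # Hydrologic year coincides with the calendar year.
        return day.year
    if day.month >= start_month_hyd_year:
        return day.year + 1
    return day.year


labels = [hydrologic_year(d) for d in (date(2010, 9, 30), date(2010, 10, 1), date(2011, 3, 1))]
```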

roger.tools.evaluation.assign_seasons(df)[source]

Assign seasons.

Parameters:

df (DataFrame) – contains hydrologic values

Returns:

contains hydrologic values and a column with the assigned seasons

Return type:

DataFrame
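A season column of this kind can be derived from the month of each time stamp. The sketch below assumes meteorological seasons of the northern hemisphere (December to February is winter, and so on); both the mapping and the name `season_of` are assumptions for illustration and not taken from roger.

```python
from datetime import date

# Assumed mapping: meteorological seasons, northern hemisphere.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def season_of(day):
    """Return the season label for a date based on its month."""
    return SEASON_BY_MONTH[day.month]


seasons = [season_of(d) for d in (date(2010, 1, 15), date(2010, 4, 1), date(2010, 7, 31), date(2010, 11, 30))]
```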

roger.tools.evaluation.calc_api(prec, w, k)[source]

Calculate antecedent precipitation index (API).

Parameters:
  • prec ((N,)array_like) – precipitation values

  • w (int) – window width

  • k (float) – decay constant, ranging between 0.8 and 0.98

Returns:

api – antecedent precipitation index

Return type:

(N,)array_like
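Several formulations of the API exist. The sketch below uses one common windowed form, in which precipitation of the `w` preceding time steps is summed with weight `k` raised to the lag. Whether roger includes the current time step or uses a different weighting is not stated here, so treat the formula and the function name `antecedent_precipitation_index` as assumptions.

```python
def antecedent_precipitation_index(prec, w, k):
    """Windowed API: sum of prec[t - lag] * k**lag for lag = 1..w (assumed form)."""
    api = []
    for t in range(len(prec)):
        total = 0.0
        for lag in range(1, w + 1):
            if t - lag >= 0:  # skip lags that reach before the start of the series
                total += prec[t - lag] * k ** lag
        api.append(total)
    return api


prec = [10.0, 0.0, 5.0, 0.0]
api = antecedent_precipitation_index(prec, w=2, k=0.9)
```

With `k` close to 1, earlier precipitation keeps most of its weight; with smaller `k`, the index forgets past events faster.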

roger.tools.evaluation.calc_napi(prec, w, k)[source]

Calculate normalized antecedent precipitation index (NAPI).

Parameters:
  • prec ((N,)array_like) – precipitation values

  • w (int) – window width

  • k (float) – decay constant, ranging between 0.8 and 0.98

Returns:

api – normalized antecedent precipitation index

Return type:

(N,)array_like

roger.tools.evaluation.calc_rmse(obs, sim)[source]

Root mean square error (RMSE)

Parameters:
  • obs ((N,)array_like) – observed time series

  • sim ((N,)array_like) – simulated time series

Returns:

eff – Root mean square error (RMSE)

Return type:

float

roger.tools.evaluation.calc_mae(obs, sim)[source]

Mean absolute error (MAE)

Parameters:
  • obs ((N,)array_like) – observed time series

  • sim ((N,)array_like) – simulated time series

Returns:

eff – Mean absolute error (MAE)

Return type:

float
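For reference, RMSE and MAE follow their standard definitions. The plain-Python sketch below is not the roger implementation (which operates on arrays); the function names `rmse` and `mae` are chosen for illustration only.

```python
import math


def rmse(obs, sim):
    """Root mean square error: sqrt(mean((sim - obs)^2))."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs, sim):
    """Mean absolute error: mean(|sim - obs|)."""
    return sum(abs(s - o) for o, s in zip(obs, sim)) / len(obs)


obs = [1.5, 1, 0.8, 0.85, 1.5, 2]
sim = [1.6, 1.3, 1, 0.8, 1.2, 2.5]
rmse_val = rmse(obs, sim)
mae_val = mae(obs, sim)
```

Both measures are in the units of the input series and equal zero for a perfect simulation; RMSE penalizes large deviations more strongly than MAE.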

roger.tools.evaluation.calc_mre(obs, sim)[source]

Mean relative error (MRE)

Parameters:
  • obs ((N,)array_like) – observed time series

  • sim ((N,)array_like) – simulated time series

Returns:

eff – Mean relative error (MRE)

Return type:

float

roger.tools.evaluation.calc_mare(obs, sim)[source]

Mean absolute relative error (MARE)

Parameters:
  • obs ((N,)array_like) – observed time series

  • sim ((N,)array_like) – simulated time series

Returns:

eff – Mean absolute relative error (MARE)

Return type:

float

roger.tools.evaluation.calc_ve(obs, sim)[source]

Volumetric efficiency (VE)

Parameters:
  • obs ((N,)array_like) – observed time series

  • sim ((N,)array_like) – simulated time series

Returns:

eff – Volumetric efficiency (VE)

Return type:

float

roger.tools.evaluation.calc_rbs(obs, sim)[source]

Relative bias of sums (RBS)

Parameters:
  • obs ((N,)array_like) – observed time series

  • sim ((N,)array_like) – simulated time series

Returns:

eff – relative bias of sums (RBS)

Return type:

float
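The two volume-based measures can be sketched as follows. Volumetric efficiency is commonly defined as one minus the sum of absolute errors divided by the sum of observations. For the relative bias of sums, the sketch assumes the form (sum of simulated minus sum of observed) divided by the sum of observed; the sign convention in roger may differ, and both function names are hypothetical.

```python
def volumetric_efficiency(obs, sim):
    """VE = 1 - sum(|sim - obs|) / sum(obs)."""
    return 1.0 - sum(abs(s - o) for o, s in zip(obs, sim)) / sum(obs)


def relative_bias_of_sums(obs, sim):
    """Assumed form: (sum(sim) - sum(obs)) / sum(obs)."""
    return (sum(sim) - sum(obs)) / sum(obs)


obs = [1.5, 1, 0.8, 0.85, 1.5, 2]
sim = [1.6, 1.3, 1, 0.8, 1.2, 2.5]
ve = volumetric_efficiency(obs, sim)
rbs = relative_bias_of_sums(obs, sim)
```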

roger.tools.evaluation.calc_temp_cor(obs, sim, r='pearson')[source]

Calculate temporal correlation between observed and simulated time series.

Parameters:
  • obs ((N,)array_like) – Observed time series as 1-D array

  • sim ((N,)array_like) – Simulated time series

  • r (str, optional) – Either Spearman correlation coefficient (‘spearman’) or Pearson correlation coefficient (‘pearson’) can be used to describe the temporal correlation. The default is to calculate the Pearson correlation.

Returns:

temp_cor – correlation between observed and simulated time series

Return type:

float

Examples

Provide arrays with equal length

>>> from roger.tools import evaluation
>>> import numpy as onp
>>> obs = onp.array([1.5, 1, 0.8, 0.85, 1.5, 2])
>>> sim = onp.array([1.6, 1.3, 1, 0.8, 1.2, 2.5])
>>> evaluation.calc_temp_cor(obs, sim)
0.8940281850583509

roger.tools.evaluation.calc_kge_beta(obs, sim)[source]

Calculate the beta term of Kling-Gupta-Efficiency (KGE).

Parameters:
  • obs ((N,)array_like) – Observed time series as 1-D array

  • sim ((N,)array_like) – Simulated time series as 1-D array

Returns:

kge_beta – beta value

Return type:

float

Notes

\[\beta = \frac{\mu_{sim}}{\mu_{obs}}\]

Examples

Provide arrays with equal length

>>> from roger.tools import evaluation
>>> import numpy as onp
>>> obs = onp.array([1.5, 1, 0.8, 0.85, 1.5, 2])
>>> sim = onp.array([1.6, 1.3, 1, 0.8, 1.2, 2.5])
>>> evaluation.calc_kge_beta(obs, sim)
1.0980392156862746

References

Gupta, H. V., Kling, H., Yilmaz, K. K., and Martinez, G. F.: Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling, Journal of Hydrology, 377, 80-91, 10.1016/j.jhydrol.2009.08.003, 2009.

Kling, H., Fuchs, M., and Paulin, M.: Runoff conditions in the upper Danube basin under an ensemble of climate change scenarios, Journal of Hydrology, 424-425, 264-277, 10.1016/j.jhydrol.2012.01.011, 2012.

Pool, S., Vis, M., and Seibert, J.: Evaluating model performance: towards a non-parametric variant of the Kling-Gupta efficiency, Hydrological Sciences Journal, 63, 1941-1953, 10.1080/02626667.2018.1552002, 2018.

roger.tools.evaluation.calc_kge_alpha(obs, sim)[source]

Calculate the alpha term of the Kling-Gupta-Efficiency (KGE).

Parameters:
  • obs ((N,)array_like) – Observed time series as 1-D array

  • sim ((N,)array_like) – Simulated time series

Returns:

kge_alpha – alpha value

Return type:

float

Notes

\[\alpha = \frac{\sigma_{sim}}{\sigma_{obs}}\]

Examples

Provide arrays with equal length

>>> from roger.tools import evaluation
>>> import numpy as onp
>>> obs = onp.array([1.5, 1, 0.8, 0.85, 1.5, 2])
>>> sim = onp.array([1.6, 1.3, 1, 0.8, 1.2, 2.5])
>>> evaluation.calc_kge_alpha(obs, sim)
1.2812057455166919

References

Gupta, H. V., Kling, H., Yilmaz, K. K., and Martinez, G. F.: Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling, Journal of Hydrology, 377, 80-91, 10.1016/j.jhydrol.2009.08.003, 2009.

Kling, H., Fuchs, M., and Paulin, M.: Runoff conditions in the upper Danube basin under an ensemble of climate change scenarios, Journal of Hydrology, 424-425, 264-277, 10.1016/j.jhydrol.2012.01.011, 2012.

Pool, S., Vis, M., and Seibert, J.: Evaluating model performance: towards a non-parametric variant of the Kling-Gupta efficiency, Hydrological Sciences Journal, 63, 1941-1953, 10.1080/02626667.2018.1552002, 2018.

roger.tools.evaluation.calc_kge_gamma(obs, sim)[source]

Calculate the gamma term of Kling-Gupta-Efficiency (KGE).

Parameters:
  • obs ((N,)array_like) – Observed time series as 1-D array

  • sim ((N,)array_like) – Simulated time series as 1-D array

Returns:

kge_gamma – gamma value

Return type:

float

Notes

\[\gamma = \frac{CV_{sim}}{CV_{obs}}\]

Examples

Provide arrays with equal length

>>> from roger.tools import evaluation
>>> import numpy as onp
>>> obs = onp.array([1.5, 1, 0.8, 0.85, 1.5, 2])
>>> sim = onp.array([1.6, 1.3, 1, 0.8, 1.2, 2.5])
>>> evaluation.calc_kge_gamma(obs, sim)
1.166812375381273

References

Gupta, H. V., Kling, H., Yilmaz, K. K., and Martinez, G. F.: Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling, Journal of Hydrology, 377, 80-91, 10.1016/j.jhydrol.2009.08.003, 2009.

Kling, H., Fuchs, M., and Paulin, M.: Runoff conditions in the upper Danube basin under an ensemble of climate change scenarios, Journal of Hydrology, 424-425, 264-277, 10.1016/j.jhydrol.2012.01.011, 2012.

Pool, S., Vis, M., and Seibert, J.: Evaluating model performance: towards a non-parametric variant of the Kling-Gupta efficiency, Hydrological Sciences Journal, 63, 1941-1953, 10.1080/02626667.2018.1552002, 2018.

roger.tools.evaluation.calc_kge(obs, sim, r='pearson', var='std')[source]

Calculate Kling-Gupta-Efficiency (KGE).

Parameters:
  • obs ((N,)array_like) – Observed time series as 1-D array

  • sim ((N,)array_like) – Simulated time series as 1-D array

  • r (str, optional) – Either Spearman correlation coefficient (‘spearman’; Pool et al. 2018) or Pearson correlation coefficient (‘pearson’; Gupta et al. 2009) can be used to describe the temporal correlation. The default is to calculate the Pearson correlation.

  • var (str, optional) – Either coefficient of variation (‘cv’; Kling et al. 2012) or standard deviation (‘std’; Gupta et al. 2009, Pool et al. 2018) can be used to describe the variability term (gamma or alpha, respectively). The default is to calculate the standard deviation.

Returns:

eff – Kling-Gupta-Efficiency

Return type:

float

Examples

Provide arrays with equal length

>>> from roger.tools import evaluation
>>> import numpy as onp
>>> obs = onp.array([1.5, 1, 0.8, 0.85, 1.5, 2])
>>> sim = onp.array([1.6, 1.3, 1, 0.8, 1.2, 2.5])
>>> evaluation.calc_kge(obs, sim)
0.683901305466148

Notes

With var='std' (Gupta et al. 2009):

\[KGE = 1 - \sqrt{(\beta - 1)^2 + (\alpha - 1)^2 + (r - 1)^2} = 1 - \sqrt{\left(\frac{\mu_{sim}}{\mu_{obs}} - 1\right)^2 + \left(\frac{\sigma_{sim}}{\sigma_{obs}} - 1\right)^2 + (r - 1)^2}\]

With var='cv' (Kling et al. 2012):

\[KGE = 1 - \sqrt{(\beta - 1)^2 + (\gamma - 1)^2 + (r - 1)^2} = 1 - \sqrt{\left(\frac{\mu_{sim}}{\mu_{obs}} - 1\right)^2 + \left(\frac{CV_{sim}}{CV_{obs}} - 1\right)^2 + (r - 1)^2}\]

References

Gupta, H. V., Kling, H., Yilmaz, K. K., and Martinez, G. F.: Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling, Journal of Hydrology, 377, 80-91, 10.1016/j.jhydrol.2009.08.003, 2009.

Kling, H., Fuchs, M., and Paulin, M.: Runoff conditions in the upper Danube basin under an ensemble of climate change scenarios, Journal of Hydrology, 424-425, 264-277, 10.1016/j.jhydrol.2012.01.011, 2012.

Pool, S., Vis, M., and Seibert, J.: Evaluating model performance: towards a non-parametric variant of the Kling-Gupta efficiency, Hydrological Sciences Journal, 63, 1941-1953, 10.1080/02626667.2018.1552002, 2018.
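The formulas above can be traced step by step in plain Python. The sketch below is an illustration and not the roger implementation: it covers only the Pearson correlation, uses the population standard deviation (the choice cancels in the ratios), and the names `kge` and `pearson_r` are hypothetical.

```python
import math


def _mean(x):
    return sum(x) / len(x)


def _std(x):
    # Population standard deviation.
    m = _mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))


def pearson_r(obs, sim):
    """Pearson correlation coefficient between obs and sim."""
    mo, ms = _mean(obs), _mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / len(obs)
    return cov / (_std(obs) * _std(sim))


def kge(obs, sim, var="std"):
    """Kling-Gupta efficiency with alpha (var='std') or gamma (var='cv')."""
    beta = _mean(sim) / _mean(obs)    # bias term
    alpha = _std(sim) / _std(obs)     # ratio of standard deviations
    r = pearson_r(obs, sim)           # temporal correlation
    # gamma = CV_sim / CV_obs, which equals alpha / beta
    variability = alpha / beta if var == "cv" else alpha
    return 1.0 - math.sqrt((beta - 1) ** 2 + (variability - 1) ** 2 + (r - 1) ** 2)


obs = [1.5, 1, 0.8, 0.85, 1.5, 2]
sim = [1.6, 1.3, 1, 0.8, 1.2, 2.5]
kge_std = kge(obs, sim)            # matches the documented example value
kge_cv = kge(obs, sim, var="cv")
```

For this data the `'cv'` variant yields a higher efficiency than the `'std'` variant, because dividing alpha by beta removes the part of the variability error that is due to the bias.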

roger.tools.evaluation.calc_nse(obs, sim)[source]

Calculate Nash-Sutcliffe-Efficiency (NSE).

Parameters:
  • obs ((N,)array_like) – Observed time series as 1-D array

  • sim ((N,)array_like) – Simulated time series as 1-D array

Returns:

eff – Nash-Sutcliffe-Efficiency

Return type:

float

Examples

Provide arrays with equal length

>>> from roger.tools import evaluation
>>> import numpy as onp
>>> obs = onp.array([1.5, 1, 0.8, 0.85, 1.5, 2])
>>> sim = onp.array([1.6, 1.3, 1, 0.8, 1.2, 2.5])
>>> evaluation.calc_nse(obs, sim)
0.5648252536640361

Notes

\[NSE = 1 - \frac{\sum_{t=1}^{t=T} (Q_{sim}(t) - Q_{obs}(t))^2}{\sum_{t=1}^{t=T} (Q_{obs}(t) - \overline{Q_{obs}})^2}\]

References

Nash, J. E., and Sutcliffe, J. V.: River flow forecasting through conceptual models part I - A discussion of principles, Journal of Hydrology, 10, 282-290, 10.1016/0022-1694(70)90255-6, 1970.
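The NSE formula translates directly into a few lines of plain Python. This sketch is for illustration only (the function name `nse` is hypothetical, and roger operates on arrays rather than lists).

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations of obs from its mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((s - o) ** 2 for o, s in zip(obs, sim))
    ss_obs = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / ss_obs


obs = [1.5, 1, 0.8, 0.85, 1.5, 2]
sim = [1.6, 1.3, 1, 0.8, 1.2, 2.5]
nse_val = nse(obs, sim)  # matches the documented example value

# A simulation equal to the observed mean serves as the benchmark with NSE = 0.
mean_obs = sum(obs) / len(obs)
nse_benchmark = nse(obs, [mean_obs] * len(obs))
```

A perfect simulation gives NSE = 1, and values below 0 indicate that the observed mean is a better predictor than the simulation.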