rxnfit.expdata_fit_sci module
Fit ODE parameters to experimental time-course data.
This module provides functions and classes for fitting symbolic rate constants in ODE systems to experimental data using scipy.optimize.minimize.
- class rxnfit.expdata_fit_sci.ExpDataFitSci(builded_rxnode, df_list, t_range, method='RK45', rtol=1e-06, df_names=None)
Bases: object
Multi-dataset fitting of symbolic rate constants to experimental data.
Fits ODE rate constants using experimental time course data. Provides methods to run fitting and to prepare arguments for solv_ode (RxnODEsolver) for re-analysis with fitted parameters.
- get_fitted_rate_const_dict(result=None)
Get a dict of fitted rate constants for builded_rxnode.rate_consts_dict.
Use to update builded_rxnode.rate_consts_dict before passing to RxnODEsolver.
- Parameters:
result (scipy.optimize.OptimizeResult, optional) – OptimizeResult from run_fit. If None, uses internal result from last run_fit. Defaults to None.
- Returns:
Fitted rate constants {key: value} to merge into builded_rxnode.rate_consts_dict.
- Return type:
dict
- Raises:
RuntimeError – If run_fit not executed.
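The merge step described above can be sketched with plain dicts. This is only an illustration: the dict below stands in for builded_rxnode.rate_consts_dict, the None placeholder for a symbolic entry is an assumption about how unfitted constants might be stored, and the fitted value is invented rather than taken from a real fit.

```python
# Stand-in for builded_rxnode.rate_consts_dict (hypothetical contents):
# "k1" is still symbolic (placeholder None), "k2" is a fixed numeric value.
rate_consts_dict = {"k1": None, "k2": 0.5}

# Stand-in for the dict returned by get_fitted_rate_const_dict()
# (illustrative number, not real fit output).
fitted = {"k1": 0.0123}

# Merge: fitted values overwrite the symbolic entries, other keys are kept.
rate_consts_dict.update(fitted)

print(rate_consts_dict)
```

After the update, every rate constant has a numeric value, which is the state the solver step expects.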
- get_solver_config_args(dataset_index=0)
Get kwargs for SolverConfig for use with RxnODEsolver.
Use after run_fit. Provides y0, t_span, method, rtol for the specified dataset. When the model has time-dependent rate constants k(t), also adds rate_const_values (callable) and symbolic_rate_const_keys; in that case run_fit() must have been called first.
- Parameters:
dataset_index (int, optional) – Index of dataset for y0. Defaults to 0 (first dataset).
- Returns:
Keyword args for SolverConfig: y0, t_span, method, rtol, and, when k(t) is present, rate_const_values and symbolic_rate_const_keys.
- Return type:
dict
- Raises:
RuntimeError – If run_fit not executed.
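To show what the four basic keys carry, the sketch below passes a hand-written dict with the same key names straight to scipy.integrate.solve_ivp, which happens to accept those names as keyword arguments. SolverConfig and RxnODEsolver are not used here; the dict contents, the rate constant K1, and the A -> B right-hand side are all illustrative assumptions.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical stand-in for the dict returned by get_solver_config_args().
solver_kwargs = {
    "y0": [1.0, 0.0],        # initial concentrations of A and B
    "t_span": (0.0, 10.0),
    "method": "RK45",
    "rtol": 1e-6,
}

K1 = 0.3  # illustrative fitted rate constant for A -> B


def rhs(t, y):
    """First-order reaction A -> B."""
    a, _b = y
    return [-K1 * a, K1 * a]


# The dict keys map directly onto solve_ivp keyword arguments.
sol = solve_ivp(rhs, **solver_kwargs)
a_end, b_end = sol.y[:, -1]
print(sol.success, a_end, b_end)
```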
- plot_fitted_solution(expdata_df: DataFrame | List[DataFrame] | None = None, plot_datasets: List[str] | None = None, species: List[str] | None = None, subplot_layout: tuple | None = None)
Plot fitted time-course with per-dataset y0.
Use after run_fit. Each subplot uses the y0 from the corresponding DataFrame’s first row. When expdata_df is None, uses self.df_list for experimental overlay. When plot_datasets is given, only those datasets are plotted (by df_names). Subplot titles use dataset names (df_names) when available; otherwise “Dataset 1”, “Dataset 2”, …
- Parameters:
expdata_df – Experimental data for overlay. Single DataFrame or list. If None, uses self.df_list (fit data). When plot_datasets is given, length must match len(plot_datasets); otherwise must match n_datasets.
plot_datasets – List of dataset names (from df_names) to plot. If None, all datasets are plotted.
species – Species to plot. If None, all.
subplot_layout – (n_rows, n_cols) for subplot grid.
- Returns:
None
- Raises:
RuntimeError – If run_fit not executed.
ValueError – If length/column mismatch or unknown plot_datasets name.
- run_fit(p0, opt_method='L-BFGS-B', bounds=None, verbose=True, use_log_fit=False, lower_bound=None)
Run fitting and return optimized rate constants.
- Parameters:
p0 (list, tuple, or dict) – Initial guess for symbolic rate constants.
- list or tuple: Values in the order of symbolic_rate_const_keys. The order you give is assumed correct. Length must match the number of symbolic rate constants.
- dict: Keys are symbolic rate constant names (strings, e.g. "k1", "k2"). Values are initial guesses (numeric). Keys must match get_symbolic_rate_const_keys() exactly (no extra keys, no missing keys). Example: {"k1": 0.001, "k2": 0.002}.
When use_log_fit=True, all values must be positive.
opt_method (str, optional) – scipy.optimize.minimize method. Defaults to ‘L-BFGS-B’.
bounds (list, optional) – Bounds for each parameter (linear fit only), in symbolic_rate_const_keys order. If None, uses [(lower_bound or 1e-10, None)] * n_params. When given, lower_bound is ignored. When use_log_fit=True, bounds is ignored; lower_bound (or default 1e-6) is used instead. Defaults to None.
verbose (bool, optional) – Print optimization result. Defaults to True.
use_log_fit (bool, optional) – If True, optimize in log(k) space for numerical stability with small rate constants. result.x is always returned in linear scale (k). Defaults to False.
lower_bound (float, optional) – Common lower bound for all parameters (must be positive). When None: linear fit uses 1e-10, log fit uses 1e-6. Defaults to None.
- Returns:
- (result, param_info, fit_metrics)
result: Object with .x (linear-scale k), .success, .fun, .tss, .r2. (RSS is .fun; do not use .rss.)
param_info: Dict with symbolic_rate_consts, etc.
fit_metrics: Dict with keys ‘rss’, ‘tss’, ‘r2’.
- Return type:
tuple
- Raises:
ValueError – If p0 has wrong length (sequence), invalid keys or non-numeric values (dict), lower_bound <= 0, or use_log_fit=True with non-positive p0 values.
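The idea behind use_log_fit can be illustrated without rxnfit: optimize theta = log(k) with scipy.optimize.minimize, then convert back to linear scale. The sketch below is not the library's implementation. It uses synthetic first-order decay data with a closed-form solution, and the names rss, rss_log, and K_TRUE are invented for the example; only the L-BFGS-B method and the 1e-6 lower bound are taken from the documentation above.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic data for first-order decay, A(t) = A0 * exp(-k t), with A0 = 1.
K_TRUE = 2.5e-4
t = np.linspace(0.0, 10000.0, 25)
a_obs = np.exp(-K_TRUE * t)


def rss(k):
    """Residual sum of squares for a candidate linear-scale k."""
    return float(np.sum((a_obs - np.exp(-k * t)) ** 2))


def rss_log(theta):
    """Same objective, parameterized by theta = log(k)."""
    return rss(np.exp(theta[0]))


lower_bound = 1e-6  # the documented log-fit default
p0 = 1e-3           # initial guess in linear scale; must be positive

res = minimize(
    rss_log,
    x0=[np.log(p0)],
    method="L-BFGS-B",
    bounds=[(np.log(lower_bound), None)],  # bound is applied in log space
)

k_fit = float(np.exp(res.x[0]))  # convert back to linear scale
print(k_fit, res.fun)
```

Working in log space keeps the optimizer's steps proportional to the size of k, which is why positive initial values and a positive lower bound are required.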
- to_dataframe_list(time_column_name='time')
Return fitted solutions as list of DataFrames.
Use after run_fit. One DataFrame per dataset; failed integrations yield None at that index.
- Returns:
Length = n_datasets. Each element is pd.DataFrame or None.
- Return type:
list
- Raises:
RuntimeError – If run_fit not executed.
- rxnfit.expdata_fit_sci.run_fit_multi(builded_rxnode, df_list, p0, t_range=None, method='RK45', rtol=1e-06, df_names=None, opt_method='L-BFGS-B', bounds=None, verbose=True, use_log_fit=False, lower_bound=None)
Convenience wrapper around ExpDataFitSci.run_fit.
If t_range is None, it is derived from the first DataFrame. use_log_fit and lower_bound are passed through to run_fit.
- Parameters:
builded_rxnode (RxnODEbuild) – Instance containing the reaction system definition.
df_list (list[pandas.DataFrame]) – List of experimental DataFrames. All must have same structure (time + species columns).
p0 (list, tuple, or dict) – Initial guess for symbolic rate constants. Same semantics as ExpDataFitSci.run_fit(p0=…): list/tuple in symbolic_rate_const_keys order, or dict with string keys (e.g. {"k1": 0.001, "k2": 0.002}).
t_range (tuple[float, float], optional) – Integration time span (t_start, t_end). If None, derives from first DataFrame. Defaults to None.
df_names (list[str], optional) – Names for each DataFrame. If None, uses df.attrs.get('name') when present; else str(i).
method (str, optional) – Integration method for solve_ivp. Defaults to “RK45”.
rtol (float, optional) – Relative tolerance for solve_ivp. Defaults to 1e-6.
opt_method (str, optional) – scipy.optimize.minimize method. Defaults to ‘L-BFGS-B’.
bounds (list, optional) – Bounds for each parameter (linear fit only). If None, uses lower_bound or default. Defaults to None.
verbose (bool, optional) – Print optimization result. Defaults to True.
use_log_fit (bool, optional) – If True, optimize in log(k) space. Passed to run_fit. Defaults to False.
lower_bound (float, optional) – Common lower bound for all parameters. Passed to run_fit. Defaults to None.
- Returns:
- (result, param_info, fit_metrics)
result: Object with .x (linear-scale k), .success, .fun, .tss, .r2. (RSS is .fun.)
param_info: Dict with symbolic_rate_consts, function_names, n_params, n_datasets, y0_list.
fit_metrics: Dict with keys ‘rss’, ‘tss’, ‘r2’.
- Return type:
tuple
- Raises:
ValueError – If df_list is empty, column structure is invalid, p0 has wrong length or invalid keys/values (see run_fit), lower_bound <= 0, or use_log_fit=True with non-positive p0.
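The p0 and bounds conventions shared by run_fit_multi and run_fit can be sketched in a few lines of plain Python. The helpers normalize_p0 and default_bounds are hypothetical names written for this example, not part of rxnfit; they only mirror the rules stated in the parameter descriptions (exact key match for dicts, length match for sequences, and the [(lower_bound or 1e-10, None)] * n_params fallback for a linear fit).

```python
def normalize_p0(p0, keys):
    """Return p0 as a list ordered like `keys` (hypothetical helper)."""
    if isinstance(p0, dict):
        # Dict keys must match exactly: no extra keys, no missing keys.
        if set(p0) != set(keys):
            raise ValueError(f"p0 keys must be exactly {keys}")
        return [float(p0[k]) for k in keys]
    # Sequences are taken as already ordered; only the length is checked.
    if len(p0) != len(keys):
        raise ValueError(f"p0 must have {len(keys)} values")
    return [float(v) for v in p0]


def default_bounds(n_params, bounds=None, lower_bound=None):
    """Linear-fit bounds following the documented fallback rule."""
    if bounds is not None:
        return list(bounds)  # explicit bounds win; lower_bound is ignored
    if lower_bound is not None and lower_bound <= 0:
        raise ValueError("lower_bound must be positive")
    return [(lower_bound or 1e-10, None)] * n_params


keys = ["k1", "k2"]
p0_list = normalize_p0({"k2": 0.002, "k1": 0.001}, keys)
print(p0_list)
print(default_bounds(len(keys)))
```

A dict p0 is reordered to match the key order, so the caller does not need to know symbolic_rate_const_keys order in advance.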
- rxnfit.expdata_fit_sci.solve_fit_model(builded_rxnode, fixed_initial_values, t_span, method='RK45', rtol=1e-06)
Create a model function for fitting only symbolic rate constants.
The returned function solves the ODE system and returns predicted concentrations. Only symbolic rate constants are varied; initial values and other numeric values are fixed.
- Parameters:
builded_rxnode (RxnODEbuild) – Instance containing the reaction system definition.
fixed_initial_values (list[float]) – Initial concentrations for each species (in function_names order). Typically from experimental data at t=0. These values are never optimized.
t_span (tuple[float, float]) – Integration time span (t_start, t_end). Must encompass the experimental time points. Required.
method (str, optional) – Integration method for solve_ivp. Defaults to “RK45”.
rtol (float, optional) – Relative tolerance for solve_ivp. Defaults to 1e-6.
- Returns:
A function f(t, *params) where params are symbolic rate constant values in symbolic_rate_const_keys order. Returns numpy.ndarray of shape (len(function_names), len(t)). Has a param_info attribute with: symbolic_rate_consts, function_names, n_params, fixed_initial_values.
- Return type:
callable
- Raises:
ValueError – If len(fixed_initial_values) != number of chemical species.
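The shape of such a model function can be sketched as follows. This is a stand-alone illustration built on scipy.integrate.solve_ivp, not the rxnfit implementation: make_model is a hypothetical factory, and the A -> B -> C system with rate constants k1 and k2 is hard-coded instead of being read from a RxnODEbuild instance.

```python
import numpy as np
from scipy.integrate import solve_ivp


def make_model(fixed_initial_values, t_span, method="RK45", rtol=1e-6):
    """Hypothetical factory for a fixed-y0 model of A -> B -> C."""
    function_names = ["A", "B", "C"]
    if len(fixed_initial_values) != len(function_names):
        raise ValueError("one initial value per species is required")

    def rhs(t, y, k1, k2):
        a, b, _c = y
        return [-k1 * a, k1 * a - k2 * b, k2 * b]

    def model(t, *params):
        # Only the rate constants vary between calls; y0 stays fixed.
        sol = solve_ivp(
            rhs, t_span, fixed_initial_values,
            t_eval=np.asarray(t, dtype=float),
            args=params, method=method, rtol=rtol,
        )
        return sol.y  # shape (n_species, len(t))

    model.param_info = {
        "symbolic_rate_consts": ["k1", "k2"],
        "function_names": function_names,
        "n_params": 2,
        "fixed_initial_values": list(fixed_initial_values),
    }
    return model


f = make_model([1.0, 0.0, 0.0], t_span=(0.0, 20.0))
pred = f(np.linspace(0.0, 20.0, 11), 0.4, 0.1)
print(pred.shape)
```

Attaching param_info to the function keeps the parameter order next to the callable, so downstream code does not have to track it separately.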
- rxnfit.expdata_fit_sci.solve_fit_model_multi(builded_rxnode, df_list, t_span, method='RK45', rtol=1e-06, df_names=None)
Create residual function for multi-dataset fitting (varying y0).
y0 and the initial time (t0) for each dataset are taken from the first row of each DataFrame. Integration runs from that t0 to t_span[1]. Only rate constants are optimized; the objective is the sum of squared residuals across all datasets.
- Parameters:
builded_rxnode (RxnODEbuild) – Instance containing the reaction system definition.
df_list (list[pandas.DataFrame]) – List of experimental DataFrames. All must have same structure (time + species columns).
t_span (tuple[float, float]) – Integration time span (t_start, t_end). Must encompass all experimental time points. Required.
method (str, optional) – Integration method for solve_ivp. Defaults to “RK45”.
rtol (float, optional) – Relative tolerance for solve_ivp. Defaults to 1e-6.
df_names (list[str], optional) – Names for each DataFrame. If None, uses df.attrs.get('name') for each when present and non-empty; otherwise str(i). Length must match len(df_list) when provided.
- Returns:
residual_func: Callable that takes params (list of symbolic rate constant values in symbolic_rate_const_keys order) and returns scalar residual (sum of squared residuals).
param_info: Dict with symbolic_rate_consts, function_names, n_params, n_datasets, y0_list.
- Return type:
tuple
- Raises:
ValueError – If df_list is empty or column structure is invalid.
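A multi-dataset residual of this kind can be sketched as below. It is an illustration under stated assumptions, not the library code: make_residual and synthetic are hypothetical helpers, the reaction is a hard-coded A -> B with one shared rate constant, the time column is assumed to be named "time", and returning infinity for a failed integration is a choice made for this example only.

```python
import numpy as np
import pandas as pd
from scipy.integrate import solve_ivp


def rhs(t, y, k1):
    """First-order reaction A -> B."""
    a, _b = y
    return [-k1 * a, k1 * a]


def make_residual(df_list, t_end, species=("A", "B")):
    """Hypothetical residual factory: one shared k1, per-dataset y0 and t0."""
    cols = list(species)

    def residual(params):
        total = 0.0
        for df in df_list:
            t = df["time"].to_numpy(dtype=float)
            y0 = df[cols].iloc[0].to_numpy(dtype=float)  # first row -> y0
            sol = solve_ivp(rhs, (t[0], t_end), y0, t_eval=t,
                            args=tuple(params), rtol=1e-6)
            if not sol.success:
                return np.inf  # assumption: penalize failed integrations
            obs = df[cols].to_numpy(dtype=float).T
            total += float(np.sum((obs - sol.y) ** 2))
        return total

    return residual


def synthetic(a0, k, t):
    """Noise-free A -> B data with A(0) = a0, B(0) = 0."""
    a = a0 * np.exp(-k * t)
    return pd.DataFrame({"time": t, "A": a, "B": a0 - a})


t = np.linspace(0.0, 10.0, 6)
df_list = [synthetic(1.0, 0.3, t), synthetic(2.0, 0.3, t)]
residual = make_residual(df_list, t_end=10.0)

r_true = residual([0.3])  # rate constant used to generate the data
r_off = residual([0.6])   # deliberately wrong rate constant
print(r_true, r_off)
```

Because both datasets share k1 but start from different initial concentrations, a single scalar objective constrains the rate constant with all the data at once.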