smipoly.smip.monc module
Monomer categorization system for a list of compounds given as SMILES.
Classifies monomers based on functional groups and other criteria. This function processes a DataFrame containing SMILES strings, extracts and classifies monomers, and appends the results to the DataFrame. It also supports optional display of classification results.
- smipoly.smip.monc.moncls(df, smiColn, minFG=None, maxFG=None, dsp_rsl=None)
Select monomers from a given dataset of small-molecule compounds and categorize them into monomer classes.
- Parameters:
df (pd.DataFrame) – Input DataFrame containing chemical data.
smiColn (str) – Column name in the DataFrame containing SMILES strings.
minFG (int, optional) – Minimum number of functional groups for poly-functionalized monomers. Defaults to 2.
maxFG (int, optional) – Maximum number of functional groups for poly-functionalized monomers. Defaults to 4.
dsp_rsl (bool, optional) – Whether to display classification results. Defaults to False.
- Returns:
A modified DataFrame with classification results appended.
- Return type:
pd.DataFrame
- Raises:
ValueError – If the specified SMILES column name is invalid.
Notes
The function appends additional rows for carbonate structures.
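Example
A minimal usage sketch, assuming pandas and smipoly are installed. The helper name classify_monomers, the column name "SMILES", and the sample molecules are illustrative choices, not part of the library. The keyword arguments follow the signature documented above, with minFG and maxFG set explicitly to the documented defaults.

```python
import importlib.util


def classify_monomers(smiles, column="SMILES"):
    """Classify SMILES strings with moncls; return None if a dependency is missing."""
    # pandas and smipoly are third-party packages, so check for them first.
    for package in ("pandas", "smipoly"):
        if importlib.util.find_spec(package) is None:
            return None

    import pandas as pd
    from smipoly.smip import monc

    # Build a one-column DataFrame; the column name is passed on as smiColn.
    df = pd.DataFrame({column: list(smiles)})
    # Explicit values matching the documented defaults for moncls.
    return monc.moncls(df=df, smiColn=column, minFG=2, maxFG=4, dsp_rsl=False)


# Styrene, ethylene glycol, and terephthalic acid as sample inputs.
result = classify_monomers(["C=Cc1ccccc1", "OCCO", "OC(=O)c1ccc(cc1)C(O)=O"])
if result is not None:
    print(result.head())
```

Because the function can append rows for carbonate structures, the returned DataFrame may have more rows than the input. Passing a smiColn that is not a column of df should raise the ValueError listed under Raises.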
- smipoly.smip.monc.olecls(df, smiColn, minFG=None, maxFG=None, dsp_rsl=None)
Select olefinic monomers from a given dataset of small-molecule compounds and categorize them into an olefinic monomer class.
- Parameters:
df (pd.DataFrame) – The input DataFrame containing chemical data. Must include the structure of a compound written in SMILES.
smiColn (str) – The column name in the DataFrame containing SMILES strings.
minFG (int, optional) – Minimum number of functional groups to consider. Defaults to 1.
maxFG (int, optional) – Maximum number of functional groups to consider. Defaults to 4.
dsp_rsl (bool, optional) – Whether to display results during processing. Defaults to False.
- Returns:
The updated DataFrame with olefin classification results.
- Return type:
pd.DataFrame
Notes
The function assumes the existence of several global variables such as monLg, exclLg, mon_vals, mon_dic_inv, and Ps_rxnL.
The function modifies the input DataFrame by adding new columns for olefin classification.
The genmol, genc_smi, ole_sel_cru, update_nested_dict and diene_14 functions are defined in ‘funclib.py’.
The ole_cls column is refined for conjugated diene classification using a specific reaction.
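Example
Since the notes state that the input DataFrame is modified, a caller who needs the original data unchanged can pass a copy. The sketch below shows that pattern with a hypothetical helper, classify_on_copy, which is not part of smipoly. A plain dict and a stub classifier stand in for the DataFrame and for olecls so that the snippet runs without any third-party packages.

```python
def classify_on_copy(classifier, df, smi_col, **kwargs):
    """Call a classifier on a copy of df so the caller's object is left untouched."""
    # pandas DataFrames and plain dicts both provide a .copy() method.
    working = df.copy()
    return classifier(working, smi_col, **kwargs)


def stub_classifier(df, smi_col, **kwargs):
    """Stand-in for a real classifier: adds one result column in place."""
    df["ole_cls"] = ["unclassified" for _ in df[smi_col]]
    return df


original = {"SMILES": ["C=C", "C=CC=C"]}
classified = classify_on_copy(stub_classifier, original, "SMILES", minFG=1, maxFG=4)

print(sorted(classified))  # column names after classification
print(sorted(original))    # the original keeps only its own column
```

With smipoly installed, the same helper could presumably be called as classify_on_copy(monc.olecls, df, "SMILES", minFG=1, maxFG=4, dsp_rsl=False). Whether olecls accepts a raw DataFrame directly, rather than one prepared by moncls, is an assumption not confirmed by this page.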