smipoly.smip.funclib module
Functions used by MonomerClassifier (monc.py) and PolymerGenerator (polyg.py).
- smipoly.smip.funclib.bipolymA(reactant, targ_rxn, monL, Ps_rxnL, P_class)
Generates the polymer constitutional repeating unit (CRU) formed from two monomers by applying the target reaction repeatedly until no further reaction is possible.
- Parameters:
reactant (tuple) – A tuple of reactant molecules to be used in the reaction.
targ_rxn (rdkit.Chem.rdChemReactions.ChemicalReaction) – The target chemical reaction to apply.
monL (list) – A list of monomer SMARTS patterns indexed by integers.
Ps_rxnL (dict) – A dictionary of polymerization reaction objects indexed by integers.
P_class (str) – The polymer class of the product, which determines the reaction sequence.
- Returns:
A list of SMILES strings representing the generated polymer products.
- Return type:
list
- smipoly.smip.funclib.coord_polym(smi, targ_rxn)
Generates a list of CRUs for olefin copolymers from the input SMILES string by applying a target reaction.
- Parameters:
smi (str) – The SMILES string of the input molecule.
targ_rxn (rdkit.Chem.rdChemReactions.ChemicalReaction) – The target reaction to apply to the input molecule.
- Returns:
A list of unique SMILES strings representing the products of the reaction.
- Return type:
list
- smipoly.smip.funclib.count_fg(m, patt)
Counts the number of functional groups (FG) in a molecule based on a given pattern.
- Parameters:
m (rdkit.Chem.Mol) – The molecule object to search for substructure matches.
patt (rdkit.Chem.Mol) – The pattern molecule used to identify substructure matches.
- Returns:
The number of functional groups identified in the molecule.
- Return type:
int
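A minimal sketch of how such a count can be obtained with RDKit, assuming the count equals the number of unique substructure matches. The helper name count_fg_sketch and the hydroxyl SMARTS are illustrative and are not taken from smipoly.

```python
from rdkit import Chem


def count_fg_sketch(m, patt):
    """Hypothetical stand-in for count_fg: number of unique matches of patt in m."""
    return len(m.GetSubstructMatches(patt))


mol = Chem.MolFromSmiles("OCCO")         # ethylene glycol
hydroxyl = Chem.MolFromSmarts("[OX2H]")  # illustrative hydroxyl pattern
n_oh = count_fg_sketch(mol, hydroxyl)
print(n_oh)
```

Both arguments are rdkit.Chem.Mol objects, so a SMARTS string has to be converted with Chem.MolFromSmarts before it is passed in.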
- smipoly.smip.funclib.diene_12to14(smi, rxn)
Converts a 1,2-addition CRU into the corresponding 1,4-addition CRU. This function is called from diene_14, so it is defined immediately before diene_14 in the source.
- Parameters:
smi (str) – The input SMILES string containing asterisks (*) as placeholders.
rxn (rdkit.Chem.rdChemReactions.ChemicalReaction) – The reaction to apply; Ps_rxnL[209] is used.
- Returns:
The resulting SMILES string after the reaction, with placeholders replaced back to asterisks (*).
- Return type:
str
- Raises:
rdkit.Chem.rdchem.KekulizeException – If the molecule sanitization fails.
IndexError – If the reaction does not produce any products.
- smipoly.smip.funclib.diene_14(x, rxn)
Generate 1,4-addition CRU from a conjugated diene monomer.
- Parameters:
x (dict) – The results of olefin classification and the chemical structure of these CRU generated by ole_sel_cru.
rxn (rdkit.Chem.rdChemReactions.ChemicalReaction) – The reaction to apply; Ps_rxnL[209] is used.
- Returns:
The modified dictionary x with the transformed SMILES string in x['conjdiene'][2], if applicable. If 'conjdiene' is not present or is empty, the dictionary is returned unchanged.
- Return type:
dict
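The dictionary handling described above can be sketched in plain Python. This is only an illustration under stated assumptions: the entry layout [flag, count, SMILES], the helper name diene_14_sketch, and the convert callable (standing in for diene_12to14 with its reaction) are hypothetical.

```python
def diene_14_sketch(x, convert):
    """Hypothetical stand-in for diene_14; convert replaces diene_12to14 plus rxn."""
    entry = x.get("conjdiene")
    if entry:  # skip when the key is missing or the entry is empty
        entry[2] = convert(entry[2])
    return x


# Illustrative lookup standing in for the real 1,2 -> 1,4 conversion (butadiene CRU).
lookup = {"*CC(*)C=C": "*CC=CC*"}


def convert(smi):
    return lookup.get(smi, smi)


out = diene_14_sketch({"conjdiene": [True, 1, "*CC(*)C=C"]}, convert)
unchanged = diene_14_sketch({"vinyl": [True, 1, "*CC(*)C"]}, convert)
print(out["conjdiene"][2])
```

A dictionary without a 'conjdiene' entry passes through untouched, which matches the documented return behavior.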
- smipoly.smip.funclib.genc_smi(m)
Generates an RDKit canonical SMILES string from a molecule object.
- Parameters:
m (rdkit.Chem.Mol) – A molecule object, from the RDKit library.
- Returns:
The SMILES string representation of the molecule if successful, otherwise returns np.nan.
- Return type:
str or np.nan
- smipoly.smip.funclib.genmol(s)
Generates a molecular object from a SMILES string.
- Parameters:
s (str) – A SMILES (Simplified Molecular Input Line Entry System) string representing the molecular structure.
- Returns:
A molecular object if the SMILES string is valid, otherwise returns numpy.nan.
- Return type:
rdkit.Chem.Mol or numpy.nan
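The sketch below shows the documented round trip between genmol and genc_smi using standard RDKit calls. The function names genmol_sketch and genc_smi_sketch are hypothetical, and the error handling is an assumption based on the documented return values, not the library's actual code.

```python
import numpy as np
from rdkit import Chem


def genmol_sketch(s):
    """Hypothetical stand-in for genmol: Mol for a valid SMILES, np.nan otherwise."""
    try:
        m = Chem.MolFromSmiles(s)
    except Exception:  # e.g. a non-string input such as np.nan
        return np.nan
    return m if m is not None else np.nan


def genc_smi_sketch(m):
    """Hypothetical stand-in for genc_smi: canonical SMILES, or np.nan on failure."""
    try:
        return Chem.MolToSmiles(m)
    except Exception:  # e.g. m is np.nan instead of a Mol
        return np.nan


good = genc_smi_sketch(genmol_sketch("OCC"))   # ethanol, written non-canonically
bad = genc_smi_sketch(genmol_sketch("C1CC"))   # unclosed ring, invalid SMILES
print(good, bad)
```

Returning np.nan instead of raising lets both helpers be applied over a whole table column, with failed entries filtered out afterwards.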
- smipoly.smip.funclib.homopolymA(mon1, mons, excls, targ_mon1, Ps_rxnL, mon_dic, monL)
Generates the polymer CRU formed from a single monomer by applying the polymerization reaction repeatedly until no further reaction is possible.
- Parameters:
mon1 (rdkit.Chem.Mol) – The initial monomer to start the polymerization process.
mons (list) – A list of SMARTS strings representing monomer patterns to match against the molecule.
excls (list) – A list of SMARTS strings representing exclusion patterns to check against the molecule.
targ_mon1 (object) – The target monomer class for the polymerization process.
Ps_rxnL (dict) – A dictionary of polymerization reaction objects indexed by integers.
mon_dic (dict) – A dictionary of monomer classes.
monL (list) – A list of monomer SMARTS patterns indexed by integers.
- Returns:
A list of SMILES strings representing the generated homopolymers.
- Return type:
list
- smipoly.smip.funclib.monomer_sel_mfg(m, mons, excls)
Determines whether the given small-molecule compound qualifies as a self-polymerizable monomer and categorizes it into a monomer class.
- Parameters:
m (rdkit.Chem.Mol) – The molecule to be analyzed. If None or NaN, the function returns default values.
mons (list of str) – A list of SMARTS strings representing monomer patterns to match against the molecule.
excls (list of str) – A list of SMARTS strings representing exclusion patterns to check against the molecule.
- Returns:
- A list containing:
fchk (bool): True if the molecule matches any monomer pattern and does not match any exclusion pattern, otherwise False.
fchk_c (int): The total count of substructure matches for all monomer patterns.
- Return type:
list
- smipoly.smip.funclib.monomer_sel_pfg(m, mons, excls, minFG, maxFG)
Determines whether the given small-molecule compound qualifies as a monomer. If so, counts the number of polymerizable functional groups and categorizes it into a monomer class.
- Parameters:
m (rdkit.Chem.Mol) – The monomer molecule to evaluate.
mons (list of str) – A list of SMARTS patterns representing the functional groups to count in the monomer.
excls (list of str) – A list of SMARTS patterns representing the exclusion patterns to check against the monomer.
minFG (int) – The minimum number of functional groups required.
maxFG (int) – The maximum number of functional groups allowed.
- Returns:
- A list containing:
fchk (bool): True if the monomer satisfies the conditions, False otherwise.
fchk_c (int): The total count of functional groups found in the monomer.
- Return type:
list
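A minimal sketch of the selection logic described above, assuming that the functional-group count is the sum of unique matches over all monomer patterns, that any exclusion match disqualifies the molecule, and that minFG and maxFG are inclusive bounds. The helper name and the SMARTS patterns are illustrative and are not taken from smipoly.

```python
from rdkit import Chem


def monomer_sel_pfg_sketch(m, mons, excls, minFG, maxFG):
    """Hypothetical stand-in for monomer_sel_pfg; returns [fchk, fchk_c]."""
    if not isinstance(m, Chem.Mol):  # None or NaN input
        return [False, 0]
    # Total number of functional groups over all monomer patterns.
    fchk_c = sum(len(m.GetSubstructMatches(Chem.MolFromSmarts(p))) for p in mons)
    # Any match to an exclusion pattern disqualifies the molecule.
    excluded = any(m.HasSubstructMatch(Chem.MolFromSmarts(p)) for p in excls)
    fchk = (minFG <= fchk_c <= maxFG) and not excluded
    return [fchk, fchk_c]


mons = ["[NX3H2][CX4]"]        # illustrative primary aliphatic amine pattern
excls = ["[CX3](=O)[OX2H1]"]   # illustrative exclusion: carboxylic acid

res_diamine = monomer_sel_pfg_sketch(Chem.MolFromSmiles("NCCCCCCN"), mons, excls, 2, 2)
res_mono = monomer_sel_pfg_sketch(Chem.MolFromSmiles("CCN"), mons, excls, 2, 2)
print(res_diamine, res_mono)
```

With minFG = maxFG = 2, only difunctional molecules pass, which is the usual requirement for a linear step-growth monomer.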
- smipoly.smip.funclib.ole_cru_gen(m, mon)
Generates a CRU from an olefinic monomer by applying the reaction repeatedly until no further reaction is possible.
- Parameters:
m (rdkit.Chem.Mol) – The input molecule to which the reaction will be applied.
mon (str) – A SMARTS string representing the monomer pattern.
- Returns:
- A list containing:
rdkit.Chem.Mol: The final CRU after all reactions.
list of str: A list of SMILES strings for CRUs.
- Return type:
list
- Raises:
Exception – If there is an issue with sanitizing the molecule during reaction processing.
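The idea of applying a reaction repeatedly until nothing matches can be sketched with RDKit as follows. The reaction SMARTS, the match pattern, the step limit, and the function name are assumptions made for illustration; they are not the rules used inside ole_cru_gen.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Illustrative rule: open a C=C double bond and cap both carbons with '*'.
rxn = AllChem.ReactionFromSmarts("[CX3:1]=[CX3:2]>>*-[C:1]-[C:2]-*")
patt = Chem.MolFromSmarts("[CX3]=[CX3]")


def iterate_rxn_sketch(mol, rxn, patt, max_steps=10):
    """Apply rxn while patt still matches; return [final Mol, list of SMILES]."""
    smiles_steps = []
    for _ in range(max_steps):  # guard against endless loops
        if not mol.HasSubstructMatch(patt):
            break
        products = rxn.RunReactants((mol,))
        if not products:
            break
        mol = products[0][0]    # take the first product set
        Chem.SanitizeMol(mol)   # may raise if the product is invalid
        smiles_steps.append(Chem.MolToSmiles(mol))
    return [mol, smiles_steps]


final, crus = iterate_rxn_sketch(Chem.MolFromSmiles("C=Cc1ccccc1"), rxn, patt)
print(crus)
```

For styrene the loop stops after one step, because the aromatic ring carbons do not match the aliphatic C=C pattern.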
- smipoly.smip.funclib.ole_rxnsmarts_gen(reactant)
Generates a polymerization reaction SMARTS string for a given olefinic monomer. This function is called from ole_cru_gen, so it is defined immediately before ole_cru_gen in the source.
- Parameters:
reactant (str) – The input reactant string in SMARTS format.
- Returns:
The reaction SMARTS string representing the transformation from the reactant to the product.
- Return type:
str
- smipoly.smip.funclib.ole_sel_cru(m, mons, excls, minFG, maxFG)
Checks a molecule against the olefinic monomer criteria using monomer_sel_pfg and generates the SMILES representation of the resulting CRU.
- Parameters:
m (rdkit.Chem.Mol) – The molecule to be processed.
mons (list of str) – A list of SMARTS patterns representing the functional groups to count in the monomer.
excls (list of str) – A list of SMARTS patterns representing the exclusion patterns to check against the monomer.
minFG (int) – The minimum number of olefinic polymerizable sites required.
maxFG (int) – The maximum number of olefinic polymerizable sites allowed.
- Returns:
- A list containing:
The result of the monomer_sel_pfg function (a list of fchk and fchk_c).
The SMILES representation of the processed molecule (str).
- Return type:
list
- smipoly.smip.funclib.seq_chain(prod_P, targ_mon1, Ps_rxnL, mon_dic, monL)
Applied to multifunctional monomers in chain polymerization, except for polyolefins. Processes a molecular structure by applying sequential reactions based on specific substructure matches.
- Parameters:
prod_P (rdkit.Chem.Mol) – The input molecule to be processed.
targ_mon1 (str) – Target monomer type, used to determine processing logic.
Ps_rxnL (dict) – A dictionary of polymerization reaction objects indexed by integers.
mon_dic (dict) – A dictionary containing monomer class (not used in this function).
monL (list) – A list of monomer SMARTS patterns indexed by integers.
- Returns:
The processed molecule after applying the reactions.
- Return type:
rdkit.Chem.Mol
- smipoly.smip.funclib.seq_successive(prod_P, targ_rxn, monL, Ps_rxnL, P_class)
Applied to multifunctional monomers in successive polymerization. Processes a molecular structure by applying sequential reactions based on specific substructure matches.
- Parameters:
prod_P (rdkit.Chem.Mol) – The product molecule to be processed.
targ_rxn (Any) – Target reaction (not used in the current implementation).
monL (list) – A list containing SMARTS patterns for functional groups.
Ps_rxnL (dict) – A dictionary of polymerization reaction objects, indexed by integers, to be applied to the product molecule.
P_class (str) – The polymer class of the product molecule, which determines the reaction sequence.
- Returns:
The processed product molecule after applying the reaction sequence.
- Return type:
rdkit.Chem.Mol
Notes
The function uses substructure matching to determine which reactions to apply.
The behavior of the function depends on the P_class of the molecule.
Specific reaction sequences are applied for classes such as ‘polyolefin’, ‘polyoxazolidone’, ‘polyimide’, and ‘polyester’.
If the P_class is not recognized, the product molecule is returned unchanged.
- smipoly.smip.funclib.update_nested_dict(row, dict_col, new_val, updated_k)
Used in the function ‘olecls’. If the classification result for the olefin class is True, writes the number of functional groups and the SMILES notation of the CRU.
- Parameters:
row (dict) – The dictionary representing a row of data.
dict_col (str) – The key in the row that contains the nested dictionary to be updated.
new_val (str) – The key in the row whose value will be assigned to the nested dictionary.
updated_k (str) – The key in the nested dictionary to be updated.
- Returns:
The updated row with the modified nested dictionary.
- Return type:
dict
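Based only on the parameter descriptions above, the update can be sketched in plain Python. The function name, the guard on the nested dictionary, and the example keys (olecls_res, vinyl_res, vinyl) are hypothetical and chosen for illustration.

```python
def update_nested_dict_sketch(row, dict_col, new_val, updated_k):
    """Hypothetical stand-in: copy row[new_val] into row[dict_col][updated_k]."""
    if isinstance(row.get(dict_col), dict):
        row[dict_col][updated_k] = row[new_val]
    return row


# Illustrative row: a nested classification dict plus a freshly computed result.
row = {
    "olecls_res": {"vinyl": [False, 0, ""]},
    "vinyl_res": [True, 1, "*CC(*)c1ccccc1"],
}
row = update_nested_dict_sketch(row, "olecls_res", "vinyl_res", "vinyl")
print(row["olecls_res"]["vinyl"])
```

Because the function takes and returns a row, it fits a row-wise apply over a table of classification results.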