2. Getting Start
This guide will walk you through the steps to prepare input files, classify monomers, and generate polymers using the SMiPoly framework.
When using Jupyter Notebook, also refer to “sample_script/sample_smip_demo4.ipynb”.
1. Quick start
For quick start, the following sampl script and dataset are available.
Sample script
Download ‘./sample_script/sample_smip_demo4.ipynb’ from SMiPoly repository.
To run this demo script, Jupyter Notebook is required.
Sample data
The sample dataset ‘./sample_data/202207_smip_monset.csv’ includes common 1,083 monomers collected from published documents such as scientific articles, catalogues and so on.
2. Prepare and load Input File
To begin, you need to prepare an input file containing the chemical data for your compounds. The input file should be in a tabular format (e.g., CSV) with at least one column containing SMILES strings.
Steps:
Ensure the file is saved in a format readable by pandas (e.g.,
.csv
or.xlsx
).The column containing SMILES strings should be clearly labeled (e.g.,
SMILES
).
Example Input File
Compound_ID |
SMILES |
---|---|
CID174 |
OCCO |
CID7489 |
C1=CC(=CC=C1C(=O)O)C(=O)O |
CID6658 |
O=C(OC)C(C)=C |
CID7501 |
C=CC1=CC=CC=C1 |
CID1140 |
CC1=CC=CC=C1 |
CID702 |
CCO |
CID7896 |
CC(CCO)O |
CID7837 |
CC(=C)C(=O)OCC1CO1 |
CID10352 |
C1CC2CC1C=C2 |
import pandas as pd
df = pd.read_csv("input_file.csv")
Using Sample dataset
Also ‘sample_data/202207_smip_monset.csv’ is avalable under the Internet-connected environment.
import pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/PEJpOhno/SMiPoly/main/sample_data/202207_smip_monset.csv")
3. Monomer Classification
The monomer classification step identifies and categorizes monomers based on their functional groups or olefinic properties.
Monomer Extraction and Classification
Use the moncls
function to extract and classify monomers from small molecule compounds.
Example:
from smipoly.smip import monc
# Classify monomers
classified_monomer_df = monc.moncls(df, smiColn="SMILES", minFG=2, maxFG=4, dsp_rsl=True)
# Save results
classified_monomer_df.to_csv("classified_monomers.csv", index=False)
Olefinic Monomer Classification
To classify olefinic monomers in more detail, use the olecls
function.
Example:
from smipoly.smip import monc
# Classify olefinic monomers
classified_olefins_df = monc.olecls(df, smiColn="SMILES", minFG=1, maxFG=4, dsp_rsl=True)
# Save results
classified_olefins_df.to_csv("classified_olefins.csv", index=False)
4. Generate Polymers
The polymer generation step involves creating polymers from the classified monomers.
Polymer Generation
Use the biplym
function to generate polymers.
Example:
from smipoly.smip import polg
# Generate polymer CRU
generated_polymers_df = polg.biplym(classified_monomer_df, targ=["polyester"], dsp_rsl=True)
# Save results
generated_polymers_df.to_csv("generated_polymers_df.csv", index=False)
Olefinic (Co)polymer Generation
Once you have applied the olecls
function to classify olefin monomers, use the ole_copolym
function to generate olefin (co)polymers.
Example:
from smipoly.smip import polg
# Generate olefinic (co)polymers
olefinic_polymers_df = polg.ole_copolym(classified_olefins_df, targ=["ROMP"], ncomp=1, dsp_rsl=True)
# Save results
olefinic_polymers_df.to_csv("olefinic_polymers_df.csv", index=False)
Notes
When installed with pip or conda, the following will be executed automatically.
Ensure all required dependencies (e.g., RDKit, pandas) are installed in your environment.
The rules directory must contain the necessary configuration files (e.g., mon_vals.json, ps_rxn.pkl).