naenarao/MolCycleGAN-Optimator

MolCycleGAN-Optimator

Overview

MolCycleGAN-Optimator is an open-source generative model for molecular optimization. It combines a CycleGAN architecture with the Junction Tree Variational Autoencoder (JT-VAE): the model works with JT-VAE encodings of molecules, and the translated encodings are decoded back into molecules with JT-VAE.
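As a rough illustration of the CycleGAN idea mentioned above, the sketch below computes a cycle-consistency loss on toy latent vectors. It is a generic illustration of the technique, not this repository's implementation: the function names `l1_distance` and `cycle_consistency_loss` and the toy generators `shift_up` and `shift_down` are assumptions made for the example, and plain Python lists stand in for JT-VAE encodings.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    g: Callable[[Vector], Vector],
    f: Callable[[Vector], Vector],
    xs: Sequence[Vector],
    ys: Sequence[Vector],
) -> float:
    """Average L1 error of the round trips x -> g -> f and y -> f -> g."""
    forward = sum(l1_distance(f(g(x)), x) for x in xs) / len(xs)
    backward = sum(l1_distance(g(f(y)), y) for y in ys) / len(ys)
    return forward + backward


# Hypothetical toy generators: g shifts every coordinate up by 1,
# f shifts it back down, so each round trip reproduces its input.
def shift_up(v: Vector) -> Vector:
    return [c + 1.0 for c in v]


def shift_down(v: Vector) -> Vector:
    return [c - 1.0 for c in v]


if __name__ == "__main__":
    loss = cycle_consistency_loss(shift_up, shift_down, [[0.0, 2.0]], [[1.0, 3.0]])
    print(loss)  # 0.0, because the two toy generators are exact inverses
```

A loss of zero means that translating an encoding to the other domain and back returns the original encoding; in a CycleGAN this term is minimized alongside the adversarial losses.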

Requirements

We highly recommend using conda for package management; an environment.yml file is provided.

The environment can be created by running:

conda env create -f environment.yml

Setup

After cloning this repository, execute the following script to initialize submodules:

./scripts/init_repo.sh

Datasets

All datasets needed for the aromatic rings experiment are provided.

Download the input data (ZINC 250k dataset and JT-VAE encodings):

./scripts/download_input_data.sh

Download the aromatic rings experiment data (train/test splits, returned molecules, SMILES data):

./scripts/download_ar_data.sh
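The environment, setup, and download commands above can be chained in a small driver. The following is a minimal sketch, assuming it is run from the repository root; the commands are the ones listed in this README, while `SETUP_STEPS` and `run_steps` are hypothetical names introduced for illustration. It defaults to a dry run that only prints the commands, and it does not activate the conda environment, which still has to be done in your shell.

```python
import subprocess
from typing import List, Sequence

# Commands taken from this README, in the order they appear.
SETUP_STEPS: List[List[str]] = [
    ["conda", "env", "create", "-f", "environment.yml"],
    ["./scripts/init_repo.sh"],
    ["./scripts/download_input_data.sh"],
    ["./scripts/download_ar_data.sh"],
]


def run_steps(steps: Sequence[Sequence[str]], dry_run: bool = True) -> List[str]:
    """Run each command in order and return the command lines that were handled.

    With dry_run=True nothing is executed; the commands are only printed.
    With dry_run=False the first failing command raises CalledProcessError.
    """
    handled: List[str] = []
    for cmd in steps:
        line = " ".join(cmd)
        handled.append(line)
        if dry_run:
            print(f"[dry run] {line}")
        else:
            subprocess.run(list(cmd), check=True)
    return handled


if __name__ == "__main__":
    run_steps(SETUP_STEPS)  # pass dry_run=False to actually execute
```

Stopping at the first failure (`check=True`) avoids downloading data into a repository whose submodules were never initialized.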

Usage

Training

Train the model by running:

python train.py

Specify appropriate training parameters for the selected dataset.

Decoding

Once the model is trained and translations for the test set have been generated, decode them with JT-VAE:

python decode.py

Specify decoding parameters appropriate for the selected dataset.

Experiments

The repository includes all data and code required to reproduce the aromatic rings experiment.

  1. Dataset Creation: Use the notebook data/input_data/aromatic_rings/datasets_generator_aromatic_rings.ipynb to create train/test sets for experiments.

  2. Model Training: Perform training using:

    ./scripts/run_aromatic_rings_training.sh
    

    This runs train.py with predefined parameters for aromatic rings data.

  3. Decoding Molecules: Decode molecules using:

    ./scripts/run_aromatic_rings_decoding.sh
    

    This executes decode.py with base parameters preconfigured for aromatic rings.

  4. Analysis: Analyze output data using experiments/aromatic_rings.ipynb.
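Before starting the four steps above, it can help to confirm that every file they refer to is present, for example after a fresh clone. The sketch below is one way to do that, assuming the paths are relative to the repository root; the paths come from this README, while `EXPERIMENT_PATHS` and `missing_paths` are hypothetical names used only for this example.

```python
from pathlib import Path
from typing import List, Sequence, Union

# Files referenced by the aromatic rings experiment steps in this README.
EXPERIMENT_PATHS: List[str] = [
    "data/input_data/aromatic_rings/datasets_generator_aromatic_rings.ipynb",
    "scripts/run_aromatic_rings_training.sh",
    "scripts/run_aromatic_rings_decoding.sh",
    "experiments/aromatic_rings.ipynb",
    "train.py",
    "decode.py",
]


def missing_paths(
    repo_root: Union[str, Path],
    paths: Sequence[str] = EXPERIMENT_PATHS,
) -> List[str]:
    """Return the entries of `paths` that do not exist under `repo_root`."""
    root = Path(repo_root)
    return [p for p in paths if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing files:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All experiment files found.")
```

An empty result only shows that the files exist; it says nothing about whether the datasets have been downloaded or the environment is set up correctly.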

Disclaimer

The MolCycleGAN-Optimator code is written in Python 3, while the JT-VAE package relies on Python 2. To simplify usage, the provided environment uses downgraded library versions so that both work within a single environment. Please create the environment strictly from the environment.yml file to avoid compatibility issues.
