Protein Generated by CCC-ProGAN
Proteins are critical components of life and have shown promise as synthetic medications. However, developing therapeutic proteins requires a great deal of time for testing and validation. Machine learning (ML) has proven to be a powerful tool for modeling complex protein sequences, and recent research has applied it to protein design. Still, few such methods exist, and to the best of our knowledge, existing models do not ensure that output proteins are both feasible enough to avoid unnecessary testing and diverse enough to cover a wide variety of output proteins. We therefore propose the Cycle-Consistent Conditional Protein Generative Adversarial Network (CCC-ProGAN), which uses secondary-structure and primary-structure design objectives to produce peptide-based therapeutics for specific applications. We mathematically define key losses for optimization in protein generation. After conditioning, we evaluate CCC-ProGAN on a test dataset of 65 samples and 15 randomly generated proteins; the results indicate that CCC-ProGAN is a promising candidate for protein generation.
- This repository contains OUTDATED V1 code for the original experiments conducted during the 2023-2024 year.
- This code does NOT include the Large Language Model-based protein generation processes.
Algorithmic code implementation of "Adversarially-Driven Generation of De Novo Proteins for Therapeutic Drug Design."
- Run the following commands in a shell from the repository root directory:
chmod +x ./gan_protein_structural_requirements/scripts/setup_script.sh
./gan_protein_structural_requirements/scripts/setup_script.sh
Run the following commands separately for a basic installation of the prerequisites.
- Conda is required; see the conda installation documentation for instructions for your system.
First, install DSSP:
conda install -c salilab dssp
Once DSSP is installed, create and activate the project environment:
conda env create -f environment.yml
conda activate protein_proj
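After the steps above, a quick check that the expected command-line tools are reachable can save debugging time later. The snippet below is only a minimal sketch and is not part of this repository: the helper name `find_missing_tools` is hypothetical, and it assumes that the `dssp` package from the `salilab` channel installs an executable named `mkdssp` and that `conda` is available on `PATH` as an executable. Adjust the tool names if your installation differs.

```python
import shutil


def find_missing_tools(tools=("conda", "mkdssp")):
    """Return the names in `tools` that cannot be found on PATH.

    The default names are assumptions: `mkdssp` is the executable the
    salilab `dssp` package is expected to provide, and `conda` is assumed
    to be exposed as an executable rather than only as a shell function.
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All expected tools were found on PATH.")
```

Run it with the `protein_proj` environment active, so that tools installed into that environment are visible on `PATH`.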
Distributed under the MIT License. See LICENSE.txt for more information.
