Diffusion models for protein generation
A novel binder design task using RFDiffusion.

Hi there,
Not sure where to get started in the new AI protein era? We provide a sample binder design task targeting the human ACE2 protein.
Last year, Watson et al. rocked the protein science world with the release of RFDiffusion, a generative AI model capable of designing proteins with specified functional properties. This is a big idea - the ability to generate unseen proteins on-demand expands the potential for engineering biology beyond what was previously thought possible.
At Superbio, our passion lies in making models like RFDiffusion more user-friendly and accessible, thereby expanding the tool’s impact. Read on for a thorough walk-through of a binder design task on Superbio 👇
Overview
A classic problem in protein science is the development of new binders specific to a target protein of interest. This problem has a wide-ranging impact, from the discovery of new monoclonal antibodies for a given drug target, to the development of better diagnostic assays, to analyte detection in the context of biomedical R&D.
We propose a design task for generating novel binders to the human ACE2 protein, best known as the host receptor bound by the Spike protein of the viral pathogen SARS-CoV-2. Our high-level goal is to disrupt binding at the ACE2:RBD interface (RBD being the receptor-binding domain of Spike), thereby neutralizing the ability of SARS-CoV-2 to infect human tissue.
🔬 Select a target protein and prep design task
RFDiffusion needs two pieces of information to generate new binders: 1) the structure of the target protein, and 2) instructive design rules to constrain model output.
First, let’s navigate to the Protein Data Bank and search for human ACE2. The Protein Data Bank lets us look up any deposited protein structure and download it locally as a .pdb file.
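If you would rather script the download than click through the website, here is a minimal Python sketch. It assumes the RCSB file server's `https://files.rcsb.org/download/<ID>.pdb` URL pattern. The entry 6M0J (a deposited ACE2:RBD complex) is used purely as an illustration, since this post does not say which entry to pick, and the helper names are hypothetical.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumed URL pattern of the RCSB file download service.
RCSB_DOWNLOAD_URL = "https://files.rcsb.org/download/{pdb_id}.pdb"


def rcsb_pdb_url(pdb_id: str) -> str:
    """Build the download URL for a four-character PDB entry ID."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"Not a four-character PDB ID: {pdb_id!r}")
    return RCSB_DOWNLOAD_URL.format(pdb_id=pdb_id)


def download_pdb(pdb_id: str, out_dir: str = ".") -> Path:
    """Download one entry to <out_dir>/<ID>.pdb and return the local path."""
    out_path = Path(out_dir) / f"{pdb_id.strip().upper()}.pdb"
    urlretrieve(rcsb_pdb_url(pdb_id), out_path)  # needs network access
    return out_path


# Example (illustrative entry, not necessarily the one used in this post):
# download_pdb("6m0j")
```

The resulting file is what you would later upload to Superbio as the target structure.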

ACE2:RBD interface with key residues shown.
Second, to give ourselves the best chance of disrupting the ACE2:RBD interface, we want to give RFDiffusion some key information: a) the ACE2 binding region of interest and b) ‘hotspot’ residues around which to design our binder.
In the figure above, Hattori et al. (2021) depict the contact regions within the ACE2:RBD interface, highlighting the ACE2 residues Q24, E35, Y41, and Y83. On closer examination, these amino acids all fall within a stretch of roughly 80 amino acids on Chain A.
Using the above information, we can design a binder of roughly 80 aa in length targeting this region of ACE2. Simply navigate to RFDiffusion on Superbio and fill in the parameters as shown below.
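Before committing to hotspots, it can be worth confirming that the residue numbers in your downloaded file really correspond to Q24, E35, Y41, and Y83, since numbering can differ between deposited structures. The sketch below reads the fixed-column ATOM records of a .pdb file and compares residue identities. The function names are hypothetical, and it assumes ACE2 is Chain A in your file.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def residues_in_chain(pdb_text: str, chain_id: str) -> dict:
    """Map residue number -> one-letter code for one chain of a PDB file."""
    residues = {}
    for line in pdb_text.splitlines():
        # ATOM records use fixed columns: residue name in 18-20,
        # chain ID in 22, residue number in 23-26 (1-based).
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        if line[21] != chain_id:
            continue
        res_num = int(line[22:26])
        res_name = line[17:20].strip()
        residues.setdefault(res_num, THREE_TO_ONE.get(res_name, "X"))
    return residues


def check_hotspots(pdb_text: str, chain_id: str, expected: dict) -> dict:
    """Return residue number -> True/False for whether the identity matches."""
    found = residues_in_chain(pdb_text, chain_id)
    return {num: found.get(num) == aa for num, aa in expected.items()}


# Example usage with a local file (path is a placeholder):
# text = open("ace2.pdb").read()
# print(check_hotspots(text, "A", {24: "Q", 35: "E", 41: "Y", 83: "Y"}))
```

Any `False` in the result suggests the file uses a different numbering or chain label, in which case the hotspot inputs would need adjusting.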
👩‍💻 Define workflow constraints and generate
RFDiffusion can generate protein backbones of widely varying length and shape, which makes its design space effectively unbounded. Because of this, any design task must be highly targeted in order to obtain proteins with relevant structure and function.

RFDiffusion on Superbio
Using the information from above, we can restrict the output of RFDiffusion. First, users must upload their target .pdb file by clicking ‘Local’ under ‘Upload PDB File’. Let’s do this with the ACE2 .pdb that we downloaded earlier.
Second, using our knowledge of the ACE2:RBD interface, we can define workflow parameters on Superbio that constrain our design task as follows:
RFDIFFUSION TASK: Binder Design
CONTIG MAP INPUT: A20-100/0 70-90
HOTSPOTS POINT INPUT: A24, A35, A41, A83
SYMMETRY OPTIONS: NaN
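To make these inputs concrete: in RFDiffusion's contig syntax, `A20-100` keeps residues 20 to 100 of Chain A as the fixed target, `/0 ` marks a chain break, and `70-90` asks for a new binder chain whose length is sampled between 70 and 90 residues. The Python sketch below parses only this simple two-part form and checks that every hotspot lies inside the target segment. The helper names are hypothetical. The last function shows the `contigmap.contigs` and `ppi.hotspot_res` overrides used by the open-source RFDiffusion command line; that Superbio forwards the form fields in this way is an assumption.

```python
import re
from typing import List, NamedTuple, Tuple


class TargetSegment(NamedTuple):
    chain: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1  # residue range is inclusive


class BinderLength(NamedTuple):
    min_len: int
    max_len: int


def parse_binder_contig(contig: str) -> Tuple[TargetSegment, BinderLength]:
    """Parse the simple '<chain><start>-<end>/0 <min>-<max>' form only."""
    m = re.fullmatch(r"([A-Za-z])(\d+)-(\d+)/0 (\d+)-(\d+)", contig.strip())
    if m is None:
        raise ValueError(f"Unsupported contig string: {contig!r}")
    chain, start, end, lo, hi = m.groups()
    return TargetSegment(chain, int(start), int(end)), BinderLength(int(lo), int(hi))


def parse_hotspots(text: str) -> List[Tuple[str, int]]:
    """Parse 'A24, A35' into [('A', 24), ('A', 35)]."""
    hotspots = []
    for token in text.split(","):
        m = re.fullmatch(r"([A-Za-z])(\d+)", token.strip())
        if m is None:
            raise ValueError(f"Bad hotspot: {token!r}")
        hotspots.append((m.group(1), int(m.group(2))))
    return hotspots


def hotspots_outside_target(target, hotspots):
    """Return the hotspots that do not fall inside the target segment."""
    return [(c, n) for c, n in hotspots
            if c != target.chain or not target.start <= n <= target.end]


def rfdiffusion_overrides(contig: str, hotspots) -> List[str]:
    """Overrides in the style of the open-source RFDiffusion CLI."""
    spots = ",".join(f"{c}{n}" for c, n in hotspots)
    return [f"contigmap.contigs=[{contig}]", f"ppi.hotspot_res=[{spots}]"]


target, binder = parse_binder_contig("A20-100/0 70-90")
hotspots = parse_hotspots("A24, A35, A41, A83")
print(target.length, binder, hotspots_outside_target(target, hotspots))
```

With the values from this post, the target segment spans 81 residues and none of the four hotspots falls outside it.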
You’re now ready to generate new protein binders on Superbio! Simply click Submit Job → Run on GPU.
🧑‍🔬 Explore results and experiment downstream
Voilà! The GIF below shows our first binder designed with RFDiffusion.

RFDiffusion generates structural outputs (.pdb files), metadata (.trb files), and trajectory files for use in PyMOL. We preview up to 10 .pdb files on Superbio for in-app exploration - users can navigate through these by clicking the ‘Model-0’ dropdown in the upper left of the viewing pane.
Please note - RFDiffusion generates protein backbones alone. Backbone generation is a very difficult problem in its own right, but a backbone is not yet a functional protein. We recommend running ProteinMPNN with FastRelax to design an amino acid sequence for each backbone and obtain a full-atom structure.
We hope the above can kickstart your journey with protein design on Superbio. Our community actively makes tutorial requests, so let us know if you’d like to see another design task!
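If you download the outputs, the .trb metadata can be inspected locally. In the open-source RFDiffusion release these files are Python pickles holding a dictionary (with entries such as the run config and the contig index mappings). The sketch below assumes the files from Superbio follow the same format, and the helper names are hypothetical. Real .trb files may contain NumPy arrays, so NumPy may need to be installed to load them. Only unpickle files you generated yourself, since pickle can execute arbitrary code.

```python
import pickle


def load_trb(path: str) -> dict:
    """Load an RFDiffusion .trb metadata file (assumed to be a pickled dict)."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def summarize_trb(trb: dict) -> dict:
    """Return key -> type name, a quick overview of what the file holds."""
    return {key: type(value).__name__ for key, value in trb.items()}


# Example usage (file name is a placeholder):
# trb = load_trb("design_0.trb")
# for key, type_name in summarize_trb(trb).items():
#     print(key, type_name)
```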
Stay curious,
Berke from Superbio
P.S. - Want to learn about more of Superbio’s features?
Find our detailed RFDiffusion tutorial here.