Try datamol.io, the open-source toolkit that simplifies molecular processing and featurization workflows for machine learning scientists working in drug discovery: [ Link ]
Never miss another M2D2 talk. Add the schedule to your calendar: [ Link ]
Also consider joining the M2D2 Slack: [ Link ]
Abstract: The ability to modulate pathogenic proteins represents a powerful treatment strategy for diseases. Unfortunately, many proteins are considered “undruggable” by small molecules and are often intrinsically disordered, which precludes the use of structure-based tools for binder design. To address these challenges, we have developed a suite of algorithms that enable the design of target-specific peptides via protein language model embeddings, without requiring 3D structures. First, we train a model that leverages ESM-2 embeddings to efficiently select high-affinity peptides from natural protein interaction interfaces. We experimentally fuse model-derived peptides to E3 ubiquitin ligases and identify candidates exhibiting robust degradation of undruggable targets in human cells. Next, we develop a high-accuracy discriminator, based on the CLIP architecture, to prioritize and screen peptides with selectivity for a specified target protein. As input to the discriminator, we create a Gaussian diffusion generator to sample an ESM-2-based latent space, fine-tuned on experimentally validated peptide sequences. Finally, to enable de novo generation of binding peptides, we train an instance of GPT-2 on protein-interacting sequences so that peptides can be generated conditioned on a target sequence. Our model demonstrates low perplexities across both existing and generated peptide sequences. Together, our work lays the foundation for programmable protein targeting and editing applications.
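For viewers who want a concrete picture of the CLIP-style discriminator mentioned in the abstract, here is a minimal sketch of a symmetric contrastive loss over paired peptide and target embeddings, plus a cosine-similarity ranking of candidates. It is an illustration under stated assumptions, not the speaker's implementation: the function names, the temperature of 0.07, and the toy 2-D vectors (standing in for ESM-2 embeddings) are all hypothetical.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (assumes the vector is non-zero).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity_logits(peptide_embs, target_embs, temperature=0.07):
    # Cosine similarity between every peptide and every target,
    # divided by a temperature (0.07 is an assumed value).
    peptides = [l2_normalize(p) for p in peptide_embs]
    targets = [l2_normalize(t) for t in target_embs]
    return [[sum(a * b for a, b in zip(p, t)) / temperature for t in targets]
            for p in peptides]

def diagonal_cross_entropy(logits):
    # Mean cross-entropy where row i's correct class is column i,
    # i.e. peptide i is assumed to bind target i.
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def clip_loss(peptide_embs, target_embs, temperature=0.07):
    # Symmetric loss: peptide-to-target and target-to-peptide directions.
    logits = similarity_logits(peptide_embs, target_embs, temperature)
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (diagonal_cross_entropy(logits)
                  + diagonal_cross_entropy(transposed))

def rank_peptides(candidate_embs, target_emb):
    # Return candidate indices sorted from most to least similar to the target.
    t = l2_normalize(target_emb)
    scores = [sum(a * b for a, b in zip(l2_normalize(c), t))
              for c in candidate_embs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Toy data: peptide i is aligned with target i.
toy_peptides = [[1.0, 0.0], [0.0, 1.0]]
toy_targets = [[1.0, 0.1], [0.1, 1.0]]
matched_loss = clip_loss(toy_peptides, toy_targets)
swapped_loss = clip_loss(toy_peptides, toy_targets[::-1])
ranking = rank_peptides([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]], [1.0, 0.0])
```

In this toy setup the correctly paired batch gives a much lower loss than the batch with the targets swapped, which is the signal a contrastive discriminator is trained on. The same similarity score can then be used to prioritize candidate peptides for a given target.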
Speaker:
Pranam Chatterjee - [ Link ]
Twitter Prudencio: [ Link ]
Twitter Jonny: [ Link ]
Twitter datamol.io: [ Link ]
~
Chapters:
00:00 - Intro
09:40 - Contrastive Learning for Cas9-PAM Engineering
11:26 - Transcription Factors as Tools for Cell Engineering
15:47 - Transcription Factor Mediated Generation of Human Ovary
17:28 - COVID-19
20:48 - Peptide Beacons as a COVID Diagnostic
26:19 - CLIP for Guide Protein Design
35:10 - Peptide Prioritization with CLIP (PepPrCLIP)
38:12 - GPT-2 for Proteins
45:14 - Q&A
![](https://i.ytimg.com/vi/Mt6VMDG8NUA/maxresdefault.jpg)