Solving Inverse Problems in Protein Space Using Diffusion-Based Priors (2024)

Axel Levy, Stanford University
Eric R. Chan, Stanford University
Sara Fridovich-Keil, Stanford University
Frédéric Poitevin, SLAC National Accelerator Laboratory
Ellen D. Zhong, Princeton University
Gordon Wetzstein, Stanford University

Correspondence to: axlevy@stanford.edu, gordon.wetzstein@stanford.edu

Abstract

The interaction of a protein with its environment can be understood and controlled via its 3D structure. Experimental methods for protein structure determination, such as X-ray crystallography or cryogenic electron microscopy, shed light on biological processes but introduce challenging inverse problems. Learning-based approaches have emerged as accurate and efficient methods to solve these inverse problems for 3D structure determination, but are specialized for a predefined type of measurement. Here, we introduce a versatile framework to turn raw biophysical measurements of varying types into 3D atomic models. Our method combines a physics-based forward model of the measurement process with a pretrained generative model providing a task-agnostic, data-driven prior. Our method outperforms posterior sampling baselines on both linear and non-linear inverse problems. In particular, it is the first diffusion-based method for refining atomic models from cryo-EM density maps.

1 Introduction

Our ability to understand the molecular machinery of living organisms and design therapeutic compounds depends on our capability to observe the three-dimensional structures of proteins and other biomolecular complexes. Experimental methods in structural biology such as X-ray crystallography, cryogenic electron microscopy (cryo-EM) and nuclear magnetic resonance (NMR) spectroscopy provide noisy and partial measurements from which the 3D structure of these biomolecules can be inferred. However, turning experimental observations into reliable 3D structural models is a challenging computational task. For many years, reconstruction algorithms were based on Maximum-A-Posteriori (MAP) estimation and often resorted to hand-crafted priors to compensate for the ill-posedness of the problem. State-of-the-art algorithms for cryo-EM reconstruction [56, 52] are instances of such "white-box" algorithms. These approaches can estimate the uncertainty of their answers and provide theoretical guarantees of correctness, but can only leverage explicitly defined regularizers and do not cope well with complex noise sources or missing data. Very recently, methods like Blush regularization in cryo-EM reconstruction [37] have used a data-driven prior based on the noise2noise framework [42] to bypass the need for heuristic regularizers. However, the denoiser does not explicitly leverage the knowledge that proteins are made of atoms and cannot take advantage of a known amino acid sequence.

Recently, supervised-learning approaches have emerged as an alternative to the MAP framework, and some of them have established a new empirical state of the art for certain tasks. The algorithm ModelAngelo [31], for example, can convert cryo-EM density maps into 3D atomic models with unprecedented accuracy. Typically, these supervised learning methods view the reconstruction problem as a regression task where a mapping between experimental measurements and atomic models needs to be learned. Some of them, like ModelAngelo, can even combine experimental data with sequence information by leveraging a pretrained protein language model [53]. However, these methods can only cope with a predefined type of input data. If additional information is available in a format the model was not trained on (e.g., structural information about a fragment of the protein), or if the distribution of input data shifts at inference time (e.g., if the noise level changes due to modifications in the experimental protocol), a new model needs to be trained to properly cope with the new data.

In the field of imaging, scenarios where an image or a 3D model must be inferred from corrupted and partial observations are known as "inverse problems". To overcome the ill-posedness of these problems, regularizers were heuristically defined to inject hand-crafted priors and turn Maximum Likelihood Estimation (MLE) problems into MAP problems. In a similar fashion, machine learning-based methods were recently shown to outperform hand-crafted algorithms for a wide variety of tasks: denoising [74], inpainting [72], super-resolution [45], deblurring [49], monocular depth estimation [20], and camera calibration [36], among others. Recently, generative models were shown to be an effective way to inject data-driven priors into MAP problems, making inverse problems well-posed while replacing heuristic priors [6]. Among these generative methods, diffusion models gained popularity due to their powerful capabilities in the unconditional generation of images [19], videos [26], and 3D assets [51], and were recently leveraged to solve inverse problems in image space [59, 10].

The field of structural biology has also witnessed the application of diffusion models to protein structure modeling tasks [70, 1]. Building on recent progress in protein structure prediction [33] (sequence to structure) and fixed-backbone design [3] (structure to sequence), these diffusion models have opened the doors to de novo protein design [28, 70, 2, 73] (joint generation of structure and sequence). In particular, the recently released generative model Chroma [29] stands out in part thanks to its "programmable" framework, i.e., its ability to be conditioned on external hard or soft constraints. However, despite experimentally validated state-of-the-art results at unconditional generation and its programmable framework, Chroma has not yet been applied to inverse problems like atomic model building.

Here, we introduce ADP-3D (Atomic Denoising Prior for 3D reconstruction), a framework to condition a diffusion model in protein space on any observations for which the measurement process can be physically modeled. Instead of using unadjusted Langevin dynamics for posterior sampling, our approach performs MAP estimation and leverages the data-driven prior learned by a diffusion model using the plug-and-play framework [67]. We demonstrate that our method handles a variety of external information: simulated cryo-EM density maps, amino acid sequence, partial 3D structure, and pairwise distances between amino acid residues, to refine a complete 3D atomic model of the protein. We show that our method outperforms posterior sampling baselines and, given a cryo-EM density map, can accurately refine incomplete atomic models provided by ModelAngelo. In all our experiments, we use Chroma [29] as our pretrained generative model and highlight the importance of the diffusion-based prior. We therefore make the following contributions:

  • We introduce a versatile framework, inspired by plug-and-play, to solve inverse problems in protein space with a pretrained diffusion model as a learned prior;

  • We outperform existing posterior sampling methods at reconstructing full protein structures from partial structures;

  • We show that a protein diffusion model can be guided to perform atomic model refinement in simulated cryo-EM density maps;

  • We show that a protein diffusion model can be conditioned on a sparse distance matrix.

2 Related Work

Protein Diffusion Models.

Considerable progress has been made in leveraging diffusion models for protein structure generation. Lee et al. [41] demonstrated de novo protein generation using diffusion models over 2D distance matrices, requiring a post-processing step to produce 3D structures. Conditioned on secondary structure adjacency matrices, Anand and Achim [2] used a 3D point cloud representation to generate new structures. Trippe et al. [66] circumvented the need for conditioning information, demonstrated unconditional generation of proteins larger than 20 amino acid residues, and introduced a novel conditional sampling algorithm to generate structures respecting a target motif. Wu et al. [71] built a generative model that directly operates on backbone internal coordinates (i.e., dihedral angles), thereby ensuring SE(3)-equivariance. Based on theoretical works extending diffusion models to Riemannian manifolds [16, 40], Yim et al. [73] introduced a principled formulation of diffusion models in SE(3) and represented protein backbones as elements of $\text{SE}(3)^N$. In RFdiffusion, Watson et al. [70] experimentally designed the generated proteins and structurally validated them with cryo-EM. Recently, AlphaFold3 [1] showed that a diffusion model operating on raw atom coordinates could be used as a tool to improve protein structure prediction. Diffusion models have also been used to address specific tasks in protein space, like the generation of complementarity-determining region (CDR) loops conditioned on non-CDR regions of antibody-antigen complexes [47], or sequence-to-structure prediction [32].

Out of the existing generative models for proteins, Chroma [29] has reported state-of-the-art "designability" metrics with a model that can be conditioned to generate proteins with desired properties (e.g., substructure motifs, symmetries). Despite its modular interface in which users can define their own conditioners, Chroma has not yet been used to guide the generative model with experimental measurements and solve real-world inverse problems. Here, we introduce a framework to efficiently condition a pretrained protein diffusion model and demonstrate the possibility of using cryo-EM maps as conditioning information.

Diffusion-Based Posterior Sampling in Image Space.

An inverse problem in image space can be defined by $\mathbf{y} = \Gamma(\mathbf{x}) + \eta$, where $\mathbf{x}$ is an unknown image, $\mathbf{y}$ a measurement, $\Gamma$ a known operator, and $\eta$ a noise vector of known distribution, potentially signal-dependent. The goal of posterior sampling is to sample $\mathbf{x}$ from the posterior $p(\mathbf{x}|\mathbf{y})$, the normalized product of the prior $p(\mathbf{x})$ and the likelihood $p(\mathbf{y}|\mathbf{x})$. Bora et al. [6] showed that generative models could be leveraged to implicitly represent a data-learned prior and solve compressed sensing problems in image space. Motivated by the success of diffusion models at unconditional generation [19], several works showed that score-based and denoising models could be used to solve linear inverse problems like super-resolution, deblurring, inpainting and colorization [43, 8, 55, 35, 46, 76], leading to results of unprecedented quality. Other methods leveraged the score learned by a diffusion model to solve inverse problems in medical imaging [60, 30, 9, 12, 11] and astronomy [61]. Finally, recent methods went beyond the scope of linear problems and used diffusion-based posterior sampling on nonlinear problems like JPEG restoration [59], phase retrieval and non-uniform deblurring [10]. Taking inspiration from these methods, we propose to leverage protein diffusion models to solve nonlinear inverse problems in protein space.

Model Building Methods.

Cryogenic electron microscopy (cryo-EM) provides an estimate of the 3D electron scattering potential (or density map) of a protein. The task of fitting an atomic model $\mathbf{x}$ into this 3D map $\mathbf{y}$ is called model building and can be seen as a nonlinear inverse problem in protein space (see Section 4.3).

Model building methods were first developed in X-ray crystallography [13], and automated methods like Rosetta de-novo [68], PHENIX [44, 65] and MAINMAST [64] were later implemented for cryo-EM data. Although they constituted a milestone towards the automation of model building, the obtained structures were often incomplete and needed refinement [58]. Supervised learning techniques relying on CNN- and U-Net-based architectures were then built for model building [57, 75, 50]. EMBuild [25] was the first method to make use of sequence information, and ModelAngelo [31] established a new state of the art for automated de novo model building. Trained on 3,715 experimental paired datapoints, ModelAngelo uses a GNN-based architecture and processes the sequence information with a pretrained language model [53]. Although fully-supervised methods outperform previous approaches, they still provide incomplete atomic models and cannot use types of input data they were not trained on as additional information.

Here, we propose a versatile framework to solve inverse problems in protein space, including atomic model refinement. Our approach can cope with auxiliary measurements for which the measurement process is known. Our framework allows any pretrained diffusion model to be plugged in as a prior and can therefore take advantage of future developments in generative models without any task-specific retraining step.

3 Background

[Figure 1]

3.1 Diffusion in Protein Space

We describe the atomic structure of a protein of $N$ amino acid residues by the 3D Cartesian coordinates $\mathbf{x} \in \mathbb{R}^{4N \times 3}$ of the four backbone heavy atoms (N, C$\alpha$, C, O) in each residue, the amino acid sequence $\mathbf{s} \in \{1, \ldots, 20\}^N$, and the side chain torsion angles for each amino acid $\chi \in (-\pi, \pi]^{4N}$ (the conformation of a side chain can be factorized as up to four sequential rotations).

In Chroma [29], the joint distribution over all-atom structures is factorized as

$$p(\mathbf{x}, \mathbf{s}, \chi) = p(\mathbf{x})\, p(\mathbf{s}|\mathbf{x})\, p(\chi|\mathbf{x}, \mathbf{s}). \tag{1}$$

The first factor on the right-hand side, $p(\mathbf{x})$, is modeled as a diffusion process operating in the space of backbone structures $\mathbf{x}$. Given a structure $\mathbf{x}$ at diffusion time $t$, Chroma models the conditional distribution of the sequence $p_\theta(\mathbf{s}|\mathbf{x}, t)$ as a conditional random field and the conditional distribution of the side chain conformations $p_\theta(\chi|\mathbf{x}, \mathbf{s}, t)$ with an autoregressive model.

Adding isotropic Gaussian noise to a backbone structure would rapidly destroy simple biophysical patterns that proteins always follow (e.g., the scaling law of the radius of gyration with the number of residues). Instead, Chroma uses a non-isotropic noising process as an inductive bias to alleviate the need for the model to learn these patterns from the data. The correlation of the noise is defined in such a way that a few structural properties are statistically preserved throughout the noising process. Specifically, the forward diffusion process is defined by the variance preserving stochastic process

$$\mathrm{d}\mathbf{x} = -\frac{1}{2}\mathbf{x}\,\beta_t\,\mathrm{d}t + \sqrt{\beta_t}\,\mathbf{R}\,\mathrm{d}\mathbf{w}, \tag{2}$$

where $\beta_t$ is a time-dependent noising schedule and $\mathrm{d}\mathbf{w}$ is a standard Wiener process in $\mathbb{R}^{4N \times 3}$. The matrix $\mathbf{R} \in \mathbb{R}^{4N \times 4N}$ is fixed and defined explicitly based on statistical considerations regarding the structure of proteins (see [29] and the supplementary material). Starting from $\mathbf{x}_0$ at $t = 0$, a solution to this stochastic differential equation (SDE) at time $t$ is given by

$$\mathbf{x}_t \sim \mathcal{N}(\mathbf{x};\, \alpha_t \mathbf{x}_0,\, \sigma_t^2 \mathbf{R}\mathbf{R}^T), \tag{3}$$

where $\alpha_t = \exp\left(-\frac{1}{2}\int_0^t \beta_s\,\mathrm{d}s\right)$ and $\sigma_t = \sqrt{1 - \alpha_t^2}$.
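For concreteness, the marginal (3) can be sampled directly by correlating standard Gaussian noise with $\mathbf{R}$. A minimal PyTorch sketch, treating $\mathbf{R}$ as a given dense matrix (in Chroma it is defined analytically):

```python
import torch

def sample_xt(x0: torch.Tensor, R: torch.Tensor, alpha_t: float) -> torch.Tensor:
    """Sample x_t ~ N(alpha_t x0, sigma_t^2 R R^T), the closed-form marginal (3).

    x0: clean backbone coordinates, shape (4N, 3).
    R:  fixed correlation matrix, shape (4N, 4N).
    """
    sigma_t = (1.0 - alpha_t**2) ** 0.5
    eps = torch.randn_like(x0)                 # eps ~ N(0, I), shape (4N, 3)
    return alpha_t * x0 + sigma_t * (R @ eps)  # R eps has covariance R R^T
```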

New protein samples can be generated by sampling $\mathbf{x}_T$ from $\mathcal{N}(0, \mathbf{R}\mathbf{R}^T)$ and integrating the following reverse-time SDE over $t \in [T, 0]$ [4]:

$$\mathrm{d}\mathbf{x} = \Bigl[-\frac{1}{2}\mathbf{x} - \mathbf{R}\mathbf{R}^T \nabla_{\mathbf{x}} \log p_t(\mathbf{x})\Bigr]\beta_t\,\mathrm{d}t + \sqrt{\beta_t}\,\mathbf{R}\,\mathrm{d}\bar{\mathbf{w}}, \tag{4}$$

where $\mathrm{d}\bar{\mathbf{w}}$ is a reverse-time Wiener process. Following Tweedie's formula [54], the score $\nabla_{\mathbf{x}} \log p_t(\mathbf{x})$ is an affine function of the time-dependent optimal denoiser, approximated by $\hat{\mathbf{x}}_\theta(\mathbf{x}, t)$:

$$\nabla_{\mathbf{x}} \log p_t(\mathbf{x}) = \frac{(\mathbf{R}\mathbf{R}^T)^{-1}}{1 - \alpha_t^2}\bigl(\alpha_t\,\mathbb{E}[\mathbf{x}_0|\mathbf{x}_t = \mathbf{x}] - \mathbf{x}\bigr), \qquad \hat{\mathbf{x}}_\theta(\mathbf{x}, t) \approx \mathbb{E}[\mathbf{x}_0|\mathbf{x}_t = \mathbf{x}]. \tag{5}$$

In practice, the SDE (4) is modified with a low-temperature sampling strategy in order to promote high-likelihood states. The reverse-time low-temperature diffusion process is equilibrated with Langevin dynamics and solved numerically with the Euler–Maruyama method.
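To make the link between (4) and (5) concrete, the sketch below shows one reverse-time Euler–Maruyama step with the score obtained from the denoiser via Tweedie's formula. This is a simplified illustration that omits Chroma's low-temperature modification; `denoiser` stands in for $\hat{\mathbf{x}}_\theta$ and the matrices are assumed precomputed:

```python
import torch

def score_from_denoiser(x, t, alpha_t, denoiser, RRT_inv):
    """Tweedie's formula (5): score = (R R^T)^{-1} (alpha_t x0_hat - x) / (1 - alpha_t^2)."""
    x0_hat = denoiser(x, t)                    # approximates E[x0 | x_t = x]
    return (RRT_inv @ (alpha_t * x0_hat - x)) / (1.0 - alpha_t**2)

def reverse_step(x, t, dt, beta_t, alpha_t, denoiser, R, RRT_inv):
    """One Euler-Maruyama step of the reverse SDE (4); dt < 0 when integrating T -> 0."""
    s = score_from_denoiser(x, t, alpha_t, denoiser, RRT_inv)
    drift = (-0.5 * x - (R @ R.T) @ s) * beta_t
    diffusion = (beta_t * abs(dt)) ** 0.5 * (R @ torch.randn_like(x))
    return x + drift * dt + diffusion
```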

3.2 Half-Quadratic Splitting and the Plug-and-Play Framework

An objective function of the form $f(\mathbf{x}) + g(\mathbf{x})$ can be efficiently minimized over $\mathbf{x}$ using a variable splitting algorithm like Half-Quadratic Splitting (HQS) [23]. By introducing an auxiliary variable $\tilde{\mathbf{x}}$, the HQS method relies on iteratively solving two subproblems:

$$\tilde{\mathbf{x}}_k = \operatorname{prox}_{g,\rho}(\mathbf{x}_k) = \operatorname*{argmin}_{\tilde{\mathbf{x}}}\ g(\tilde{\mathbf{x}}) + \frac{\rho}{2}\|\tilde{\mathbf{x}} - \mathbf{x}_k\|_2^2, \tag{6}$$
$$\mathbf{x}_{k+1} = \operatorname{prox}_{f,\rho}(\tilde{\mathbf{x}}_k) = \operatorname*{argmin}_{\mathbf{x}}\ f(\mathbf{x}) + \frac{\rho}{2}\|\mathbf{x} - \tilde{\mathbf{x}}_k\|_2^2,$$

where $\operatorname{prox}$ denotes a "proximal operator" and $\rho > 0$ is a user-defined proximal parameter.

If $f$ represents a negative log-likelihood over $\mathbf{x}$ and $g$ a negative log-prior, the above problem defines a Maximum-A-Posteriori (MAP) problem. The key idea of the plug-and-play framework [67] is to notice that the first minimization problem in (6) is exactly a Gaussian denoising problem at noise level $\sigma = \sqrt{1/\rho}$ with prior $\exp(-g(\mathbf{x}))$ in $\mathbf{x}$-space. This means that any Gaussian denoiser can be used to "plug" a prior into a MAP problem.

Once a diffusion model has been trained, it provides a deterministic Gaussian denoiser for various noise levels, as described in (5). As recently shown by Zhu et al. [76], this optimal denoiser can be used in the plug-and-play framework to solve MAP problems in image space. Here, we propose to apply this idea to inverse problems in protein space, leveraging a pretrained diffusion model.
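For illustration, one plug-and-play HQS round might look as follows; `denoiser` and `data_nll` are placeholders for any Gaussian denoiser and any differentiable negative log-likelihood, and the likelihood prox is approximated by a few gradient steps:

```python
import torch

def hqs_round(x, y, denoiser, data_nll, rho, n_steps=50, lr=1e-2):
    """One plug-and-play HQS iteration for min_x f(x) + g(x), cf. (6)."""
    # Prox of the prior g: Gaussian denoising at noise level sigma = sqrt(1/rho).
    x_tilde = denoiser(x, sigma=(1.0 / rho) ** 0.5)

    # Prox of the likelihood f: approximate the argmin by gradient descent.
    x_new = x_tilde.clone().requires_grad_(True)
    opt = torch.optim.SGD([x_new], lr=lr)
    for _ in range(n_steps):
        opt.zero_grad()
        loss = data_nll(y, x_new) + 0.5 * rho * ((x_new - x_tilde) ** 2).sum()
        loss.backward()
        opt.step()
    return x_new.detach()
```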

4 Methods

In this section, we formulate our method, ADP-3D (Atomic Denoising Prior for 3D reconstruction), as a MAP estimation method in protein space and explain how the plug-and-play framework can be used to leverage the prior learned by a pretrained diffusion model. We then introduce our preconditioning strategy in the case of linear problems. Finally, we describe and model the measurement process in cryogenic electron microscopy. ADP-3D is described with pseudo-code in Algorithm 1.

4.1 General Approach

Given a set of independent measurements $\mathbf{Y} = \{\mathbf{y}_i\}_{i=1}^n$ made from the same unknown protein, our goal is to find a Maximum-A-Posteriori (MAP) estimate $\mathbf{x}^*$ of the backbone structure. Following Bayes' rule,

$$\mathbf{x}^* = \operatorname*{argmax}_{\mathbf{x}}\ p_0(\mathbf{x}|\mathbf{Y}) = \operatorname*{argmin}_{\mathbf{x}}\ \Bigl\{\underbrace{-\sum_{i=1}^{n} \log p_0(\mathbf{y}_i|\mathbf{x})}_{f(\mathbf{x})}\ \underbrace{-\log p_0(\mathbf{x})}_{g(\mathbf{x})}\Bigr\}. \tag{7}$$

While most previous works leveraging a diffusion model for inverse problems aim at sampling from the posterior distribution $p(\mathbf{x}|\mathbf{Y})$, we are interested here in scenarios where the measurements convey enough information to make the MAP estimate unique and well-defined. In the results section, we show that MAP estimation outperforms posterior sampling in such well-defined problems.

We take inspiration from the plug-and-play framework [67] to efficiently solve (7). We propose to use the optimal denoiser $\hat{\mathbf{x}}_\theta(\mathbf{x}, t)$ of a pretrained diffusion model to solve the first subproblem in (6). Framing the optimization loop in the whitened space $\mathbf{z} = \mathbf{R}^{-1}\mathbf{x}$, which provides more stable results, our general optimization algorithm can be summarized in three steps:

$$\tilde{\mathbf{z}}_0 = \mathbf{R}^{-1}\hat{\mathbf{x}}_\theta(\mathbf{R}\mathbf{z}_t, t) \qquad \text{(denoise at level } t\text{)},$$
$$\hat{\mathbf{z}}_0 = \operatorname*{argmin}_{\mathbf{z}}\ \frac{\rho}{2}\|\mathbf{z} - \tilde{\mathbf{z}}_0\|_2^2 - \sum_{i=1}^{n} \log p_0(\mathbf{y}_i|\mathbf{z}) \qquad \text{(maximize likelihood)},$$
$$\mathbf{z}_{t-1} \sim \mathcal{N}(\alpha_{t-1}\hat{\mathbf{z}}_0,\ \sigma_{t-1}^2) \qquad \text{(add noise at level } t-1\text{)}.$$

Here, no specific assumption has been made on the likelihood terms, and this framework can hypothetically be applied to any set of measurements for which we have a physics-based model of the measurement process. Since the second step is not tractable in most cases, we replace the explicit minimization with a gradient step with momentum from the iterate $\tilde{\mathbf{z}}_0$. This step can be implemented efficiently using automatic differentiation.

Algorithm 1: ADP-3D

Inputs: log-likelihood functions $\{f_i : (\mathbf{y}, \mathbf{z}) \mapsto \log p(\mathbf{y}_i = \mathbf{y}|\mathbf{z})\}_{i=1}^n$, measurements $\{\mathbf{y}_i\}$.

Diffusion model: correlation matrix $\mathbf{R}$, denoiser $\hat{\mathbf{x}}_\theta(\mathbf{x}, t)$, schedule $\{\alpha_t\}_{t=1}^T$.

Optimization parameters: learning rates $\{\lambda_i\}$, momenta $\{\rho_i\}$.

Initialization: $\mathbf{z}_T \sim \mathcal{N}(0, \mathbf{I})$; $\forall i,\ \mathbf{v}_i = 0$.

for $t = T, \ldots, 1$ do
  $\tilde{\mathbf{z}}_0 \leftarrow \mathbf{R}^{-1}\hat{\mathbf{x}}_\theta(\mathbf{R}\mathbf{z}_t, t)$ (denoise at level $t$)
  $\forall i,\ \mathbf{v}_i \leftarrow \rho_i\mathbf{v}_i + \lambda_i\nabla_{\mathbf{z}} f_i(\mathbf{y}_i, \mathbf{z})\big|_{\mathbf{z}=\tilde{\mathbf{z}}_0}$ (accumulate gradient of log-likelihood)
  $\hat{\mathbf{z}}_0 \leftarrow \tilde{\mathbf{z}}_0 + \sum_i \mathbf{v}_i$ (take a step to maximize likelihood)
  $\mathbf{z}_{t-1} \sim \mathcal{N}(\alpha_{t-1}\hat{\mathbf{z}}_0,\ \sigma_{t-1}^2)$ (add noise at level $t-1$)
end for

return $\mathbf{x}_0 = \mathbf{R}\mathbf{z}_0$
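Algorithm 1 translates almost line by line into a short PyTorch loop. The sketch below assumes a Chroma-style denoiser interface and dense $\mathbf{R}$, $\mathbf{R}^{-1}$ matrices; all names and signatures are illustrative, not Chroma's actual API:

```python
import torch

def adp3d(log_liks, ys, denoiser, R, R_inv, alphas, lams, rhos, T):
    """MAP estimation in the whitened space z = R^{-1} x (Algorithm 1, a sketch).

    log_liks: callables f_i(y_i, z) -> scalar log p(y_i | z).
    alphas:   noise schedule, indexed 0..T (alpha_0 = 1).
    lams, rhos: per-measurement learning rates and momenta.
    """
    z = torch.randn(R.shape[0], 3)                 # z_T ~ N(0, I)
    v = [torch.zeros_like(z) for _ in log_liks]
    for t in range(T, 0, -1):
        z0_tilde = R_inv @ denoiser(R @ z, t)      # denoise at level t
        for i, f in enumerate(log_liks):           # accumulate likelihood gradients
            zg = z0_tilde.detach().requires_grad_(True)
            (grad,) = torch.autograd.grad(f(ys[i], zg), zg)
            v[i] = rhos[i] * v[i] + lams[i] * grad
        z0_hat = z0_tilde + sum(v)                 # gradient step with momentum
        sigma = (1.0 - alphas[t - 1] ** 2) ** 0.5
        z = alphas[t - 1] * z0_hat + sigma * torch.randn_like(z)  # renoise at t-1
    return R @ z                                   # x_0 = R z_0
```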

4.2 Preconditioning for Linear Measurements

We consider the case where the measurement process is linear:

$$\mathbf{y} = \mathbf{A}\mathbf{x}_0 + \eta = \mathbf{A}\mathbf{R}\mathbf{z}_0 + \eta, \qquad \eta \sim \mathcal{N}(0, \mathbf{\Sigma}), \quad \mathbf{\Sigma} \in \mathbb{R}^{m \times m}, \tag{8}$$

with $\mathbf{y} \in \mathbb{R}^m$ and $\mathbf{A} \in \mathbb{R}^{m \times 4N}$ a known measurement matrix of rank $m$. In this case, the log-likelihood term is a quadratic function:

$$\log p_0(\mathbf{y}|\mathbf{z}) = -\frac{1}{2}\|\mathbf{A}\mathbf{R}\mathbf{z} - \mathbf{y}\|_{\mathbf{\Sigma}^{-1}}^2 + C, \qquad \text{where } \|\mathbf{x}\|_{\mathbf{\Sigma}^{-1}}^2 = \mathbf{x}^T\mathbf{\Sigma}^{-1}\mathbf{x}, \tag{9}$$
[Figure 2]

and $C$ does not depend on $\mathbf{z}$. As shown in Figure 2, the condition number of $\mathbf{R}$ (i.e., the ratio between its largest and smallest singular values) grows as a power function of the number of residues. For typical proteins ($N \geq 100$), this condition number exceeds 100, making the maximization of the above term an ill-conditioned problem. In order to make gradient-based optimization more efficient, we propose to precondition the problem by precomputing a singular value decomposition $\mathbf{A}\mathbf{R} = \mathbf{U}\mathbf{S}\mathbf{V}^T$ and setting $\mathbf{\Sigma} = \sigma^2\mathbf{U}\mathbf{S}\mathbf{S}^T\mathbf{U}^T$. Note that this is equivalent to modeling the measurement process as $\mathbf{y} = \mathbf{A}\mathbf{R}(\mathbf{z} + \tilde{\eta})$ with $\tilde{\eta} \sim \mathcal{N}(0, \sigma^2)$. In other words, we assume that the noise $\eta$ preserves the simple patterns in proteins, which is a reasonable hypothesis if, for example, $\mathbf{y}$ is an incomplete atomic model obtained by an upstream reconstruction algorithm that leverages prior knowledge of protein structures. The log-likelihood then becomes

$$\log p_0(\mathbf{y}|\mathbf{z}) = -\frac{1}{2\sigma^2}\left\|\begin{pmatrix}\mathbf{I}_m & 0\\ 0 & 0\end{pmatrix}\mathbf{V}^T\mathbf{z} - \mathbf{S}^+\mathbf{U}^T\mathbf{y}\right\|_2^2 + C. \tag{10}$$

The maximization of this term is a well-posed problem that gradient ascent with momentum solves efficiently (see supplementary analyses). In (10), $\mathbf{S}^+$ denotes the pseudo-inverse of $\mathbf{S}$.
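As an illustration of the preconditioning, the sketch below precomputes the SVD once and evaluates (10); it assumes dense `A` and `R` tensors (in practice $\mathbf{A}$ is a sparse mask):

```python
import torch

def precompute_svd(A: torch.Tensor, R: torch.Tensor):
    """One-time SVD AR = U S V^T used by the preconditioned likelihood (10)."""
    U, S, Vh = torch.linalg.svd(A @ R, full_matrices=True)
    return U, S, Vh

def precond_log_lik(y, z, U, S, Vh, sigma):
    """Preconditioned log-likelihood (10), up to an additive constant.

    y: measurements, shape (m, 3); z: whitened coordinates, shape (4N, 3).
    Only the first m rows of V^T z are constrained; the rest are left to the prior.
    """
    m = y.shape[0]
    target = (U.T @ y) / S[:m].unsqueeze(-1)   # S^+ U^T y (S has rank m)
    resid = (Vh @ z)[:m] - target
    return -0.5 / sigma**2 * (resid**2).sum()
```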

4.3 Application to Atomic Model Building

[Figure 3]

Measurement Model in Cryo-EM.

In single particle cryo-EM, a purified solution of a target protein is flash-frozen and imaged with a transmission electron microscope, providing thousands to millions of randomly oriented 2D projection images of the protein's electron scattering potential. Reconstruction algorithms process these images and infer a 3D density map of the protein. Given a protein $(\mathbf{x}, \mathbf{s}, \chi)$, its density map is well approximated by [17]

$$\mathbf{y} = \mathcal{B}(\Gamma(\mathbf{x}, \mathbf{s}, \chi)) + \eta \in \mathbb{R}^{D \times D \times D}, \tag{11}$$

where $\Gamma$ is an operator that places a sum of 5 isotropic Gaussians centered on each heavy atom. The amplitudes and standard deviations of these Gaussians, known as "form factors", are tabulated [24] and depend on the chemical element they are centered on. $\mathcal{B}$ represents the effect of "B-factors" [34] and can be viewed as a spatially-dependent blurring kernel modeling molecular motions and/or signal damping by the transfer function of the electron microscope. $\eta$ models isotropic Gaussian noise of variance $\sigma^2$. This measurement model leads to the following log-likelihood:

$$\log p_0(\mathbf{y}|\mathbf{x}, \mathbf{s}, \chi) = -\frac{1}{2\sigma^2}\|\mathbf{y} - \mathcal{B}(\Gamma(\mathbf{x}, \mathbf{s}, \chi))\|_2^2 + C. \tag{12}$$
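To make the forward model concrete, a simplified differentiable renderer can place a single isotropic Gaussian per heavy atom on a voxel grid. This sketch omits the 5 tabulated Gaussians per element and the B-factor blur $\mathcal{B}$ of the full model:

```python
import torch

def render_density(coords, grid, width=1.0):
    """Simplified Gamma: one isotropic Gaussian per heavy atom.

    coords: (n_atoms, 3) atom positions; grid: (D**3, 3) voxel centers.
    Returns a flattened density, differentiable w.r.t. coords.
    """
    d2 = ((grid[:, None, :] - coords[None, :, :]) ** 2).sum(-1)   # (D**3, n_atoms)
    return torch.exp(-d2 / (2.0 * width**2)).sum(-1)

def map_log_lik(y_map, coords, grid, sigma, width=1.0):
    """Gaussian log-likelihood of an observed map under the simplified model, cf. (12)."""
    resid = y_map.flatten() - render_density(coords, grid, width)
    return -0.5 / sigma**2 * (resid**2).sum()
```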

Likelihood Terms in Model Refinement.

We consider a 3D density map $\mathbf{y}$ provided by an upstream reconstruction method and an incomplete backbone structure $\bar{\mathbf{x}} \in \mathbb{R}^m$ ($m \leq 4N$) provided by an upstream model building algorithm (e.g., ModelAngelo [31]). Sequencing a protein is now a routine process [18] and we therefore consider the sequence $\mathbf{s}$ as an additional source of information. The side chain angles $\chi$ are, however, unknown.

The log-likelihood of our measurements for a given backbone structure $\mathbf{x}$ can be decomposed as

$$\log p_0(\mathbf{y}, \mathbf{s}, \bar{\mathbf{x}}|\mathbf{x}) = \log p_0(\mathbf{s}|\mathbf{x}) + \log p_0(\bar{\mathbf{x}}|\mathbf{x}) + \log p_0(\mathbf{y}|\mathbf{x}, \mathbf{s}). \tag{13}$$

On the right-hand side, the first term can be approximated using the learned conditional distribution $p_\theta(\mathbf{s}|\mathbf{x}, t=0)$. We model $\bar{\mathbf{x}} = \mathbf{M}\mathbf{x} + \eta$ so that the second term can be handled by the preconditioning procedure described in the previous section. Finally, the last term involves the marginalization of $p_0(\mathbf{y}|\mathbf{x}, \mathbf{s}, \chi)$ over $\chi$. This marginalization is not tractable, but (12) provides a lower bound:

$$\log p_0(\mathbf{y}|\mathbf{x}, \mathbf{s}) \geq \mathbb{E}_{\chi \sim p_0(\chi|\mathbf{x}, \mathbf{s})}\bigl[\log p_0(\mathbf{y}|\mathbf{x}, \mathbf{s}, \chi)\bigr] \approx \mathbb{E}_{\chi \sim p_\theta(\chi|\mathbf{x}, \mathbf{s}, t=0)}\bigl[\log p_0(\mathbf{y}|\mathbf{x}, \mathbf{s}, \chi)\bigr], \tag{14}$$

using Jensen's inequality. The expectation is approximated by Monte Carlo sampling, and gradients of $\chi$ with respect to $\mathbf{x}$ are computed by automatic differentiation through the autoregressive sampler of $\chi$.
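A sketch of the resulting Monte Carlo estimator is given below; `sample_chi` stands in for Chroma's autoregressive side-chain sampler and `map_log_lik` for the density log-likelihood of (12), both of which are assumed to be differentiable:

```python
def mc_lower_bound(y_map, x, s, sample_chi, map_log_lik, n_samples=4):
    """Monte Carlo estimate of the Jensen lower bound (14).

    sample_chi: draws chi ~ p_theta(chi | x, s, t=0), differentiable w.r.t. x.
    map_log_lik: evaluates log p_0(y | x, s, chi), cf. (12).
    """
    total = 0.0
    for _ in range(n_samples):
        chi = sample_chi(x, s, t=0)
        total = total + map_log_lik(y_map, x, s, chi)
    return total / n_samples
```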

5 Experiments

Experimental Setup.

Our method uses the publicly released version of Chroma (https://github.com/generatebio/chroma) [29]. We run all our experiments using structures of proteins downloaded from the Protein Data Bank (PDB) [7]. In order to select proteins that do not belong to the training dataset of Chroma, we only consider structures released after 2022-03-20 (Chroma was trained on a filtered version of the PDB queried on that date). We only consider single-chain structures for which all the residues have been spatially resolved, so that the Root Mean Square Deviation (RMSD) of the predicted Cartesian coordinates can be computed for all the heavy atoms in the backbone. In each experiment, we run 8 replicas in parallel on a single NVIDIA A100 GPU.

Structure Completion.

Given an incomplete atomic model of a protein, our first task is to predict the coordinates of all heavy atoms in the backbone. The forward measurement process can be modeled as $\mathbf{y} = \mathbf{M}\mathbf{x}$, where $\mathbf{M} \in \{0, 1\}^{(4N/k) \times 4N}$ is a masking matrix ($\mathbf{M}\mathbf{1} = \mathbf{1}$) and $k$ is the subsampling factor. We consider the case where, for each residue, the locations of all 4 heavy atoms on the backbone (N, C$\alpha$, C, O) are either known or unknown. Residues of known location are regularly spaced along the backbone, every $k$ residues. We compare our results to the baseline Chroma conditioned with a SubstructureConditioner [29]. This baseline samples from the posterior probability $p(\mathbf{x}|\mathbf{y})$ using unadjusted Langevin dynamics. We use 1000 diffusion steps for our method and the baseline.
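For reference, the masking matrix for subsampling factor $k$ can be built as in the sketch below (illustrative; each row selects one of the 4 backbone atoms of every $k$-th residue):

```python
import torch

def masking_matrix(n_residues: int, k: int) -> torch.Tensor:
    """M in {0,1}^{(4N/k) x 4N}: selects all 4 backbone atoms (N, CA, C, O)
    of every k-th residue; each row contains a single 1, so M 1 = 1."""
    rows = []
    for r in range(0, n_residues, k):          # residues with known coordinates
        for a in range(4):
            row = torch.zeros(4 * n_residues)
            row[4 * r + a] = 1.0
            rows.append(row)
    return torch.stack(rows)

# The measurement is then y = M @ x for backbone coordinates x of shape (4N, 3).
```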

In Figure 3, we show our results on ATAD2 (PDB: 7qum) [15, 5], a cancer-associated protein of 130 residues. The protein was resolved at a resolution of 1.5 Å using X-ray crystallography. Our method recovers the target structure without loss of information (RMSD < 1.5 Å) for subsampling factors of 2, 4 and 8. Fig. 3b shows that our method outperforms the baseline and highlights the importance of the diffusion-based prior. When the subsampling factor is large ($\geq 32$), the reconstruction accuracy decreases, but the method inpaints unknown regions with realistic secondary structures (see quantitative evaluation in the supplementary). Note that making the conditioning information sparser (increasing the subsampling factor) tends to close the gap between our method (MAP estimation) and the baseline (posterior sampling). We provide results on four other protein structures in the supplementary.

[Figure 4]

Atomic Model Refinement.

Next, we evaluate our method on the model refinement task. We use ChimeraX [48, 62] to simulate the 3D density maps of a single-chain protein at different resolutions. We run the state-of-the-art model building method ModelAngelo [31] on the simulated maps, using the known sequence. Then, we provide ModelAngelo's output (an incomplete model) to our method, along with the density map and the sequence. We use ModelAngelo's default parameters. To evaluate our method, we report the RMSD of the predicted structure for the $X$% most well-resolved alpha carbons (compared to the deposited structure), for $X \in [0, 100]$ ($X$ is called the "completeness").

In Figure 4, we show our results on a synthetic density map of the TecA bacterial toxin (PDB: 7pzt). This protein structure belongs to the CASP15 dataset, released in May 2022 to evaluate protein structure modeling methods. Proteins in this dataset were selected for not having close homologues in the PDB, which ensures that Chroma was not trained on similar structures. The protein was resolved at 1.84 Å resolution using X-ray crystallography. Note that we did not choose a structure resolved with cryo-EM because most cryo-EM-resolved models are incomplete. For both input resolutions (2.0 Å and 4.0 Å), our method improves on ModelAngelo's accuracy at a fixed completeness level (and reaches a higher completeness at the same accuracy). We explore the effect of removing some of the conditioning information in the supplementary materials.

Pairwise Distances to Structure.

Finally, we assume we are given a set of pairwise distances between alpha carbons of a protein, and we use our method to predict a full 3D structure. This task is a simplification of the reconstruction problem in paramagnetic NMR spectroscopy, where one can obtain information about the relative distances and orientations between pairs of atoms via the nuclear Overhauser effect and sparse paramagnetic restraints, and must deduce the Cartesian coordinates of every atom [38, 39]. Formally, our measurement model is $\mathbf{y} = \|\mathbf{D}\mathbf{x}\|_2 \in \mathbb{R}^m$, where $\mathbf{D} \in \{-1, 0, 1\}^{m \times 4N}$ is the distance matrix and the norm is taken row-wise (in $xyz$ space). $\mathbf{D}$ contains a single "1" and a single "-1" in each row and is not redundant (the distance between a given pair of atoms is measured at most once); $m$ is the number of measured distances.
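A sketch of this measurement operator and a corresponding Gaussian log-likelihood over distances (the pair indices are assumed given as an integer tensor, and the noise level `sigma` is illustrative):

```python
import torch

def pairwise_distances(x: torch.Tensor, pairs: torch.Tensor) -> torch.Tensor:
    """y = ||D x||_2 taken row-wise: each row of D x is the difference between
    one pair of alpha-carbon coordinates.

    x: (4N, 3) backbone coordinates; pairs: (m, 2) integer atom indices.
    """
    diff = x[pairs[:, 0]] - x[pairs[:, 1]]
    return diff.norm(dim=-1)                   # (m,) measured distances

def distance_log_lik(y, x, pairs, sigma=1.0):
    """Gaussian log-likelihood of observed pairwise distances, up to a constant."""
    return -0.5 / sigma**2 * ((pairwise_distances(x, pairs) - y) ** 2).sum()
```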

We evaluate our method on the bromodomain-containing protein 4 (BRD4, PDB: 7r5b [69]), a protein involved in the development of a specific type of cancer (NUT midline carcinoma) [22] and targeted by pharmaceutical drugs [14]. For a given number $m$, we randomly sample $m$ pairs of alpha carbons (without redundancy) between which we assume the distances to be known. Our results are shown in Figure 5. When 500 or more pairwise distances are known, our method recovers the structural information of the target structure (RMSD < 1.77 Å, the resolution of the deposited structure resolved with X-ray crystallography). We conduct the same experiment without the diffusion model and observe a drop in accuracy, highlighting the importance of the generative prior. Note that, when the diffusion model is removed, increasing the number of measurements increases the number of local minima in the objective function and can therefore hurt reconstruction quality (Fig. 5, orange curve, far right).

[Figure 5]

6 Discussion

This paper introduces ADP-3D, a method to leverage a pretrained protein diffusion model for protein structure determination. ADP-3D is not tied to a specific diffusion model and allows any data-driven denoiser to be plugged in as a prior. Our method can therefore continually benefit from the development of more powerful or more specialized generative models.

We focused here on cases where the measurement process is simulated and perfectly known. Considering real experimental measurements (e.g., cryo-EM or NMR data) would raise complex and exciting challenges. In particular, we hope ADP-3D can serve as a precursor for reconstructing all-atom models directly from cryo-EM images, a key step toward turning empirical conformational distributions into thermodynamic information [21]. In cases where the measurement process cannot be faithfully modeled due to complex nonidealities, or when the measurement process is not differentiable, our framework reaches its limits. At the cost of some versatility, finetuning a pretrained diffusion model on paired data for conditional generation constitutes another promising avenue for future work.

Acknowledgements.

The author(s) are pleased to acknowledge that the work reported on in this paper was substantially performed using the Princeton Research Computing resources at Princeton University, a consortium of groups led by the Princeton Institute for Computational Science and Engineering (PICSciE) and the Office of Information Technology's Research Computing. This material is based upon work supported by the National Science Foundation under award number 2303178 to SFK. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

We thank Mike Dunne and Jay Shenoy for their insightful feedback on this manuscript.

References

  • Abramson et al. [2024] Josh Abramson, Jonas Adler, Jack Dunger, Richard Evans, Tim Green, Alexander Pritzel, Olaf Ronneberger, Lindsay Willmore, Andrew J. Ballard, Joshua Bambrick, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, pages 1–3, 2024.
  • Anand and Achim [2022] Namrata Anand and Tudor Achim. Protein structure and sequence generation with equivariant denoising diffusion probabilistic models. arXiv preprint arXiv:2205.15019, 2022.
  • Anand et al. [2022] Namrata Anand, Raphael Eguchi, Irimpan I. Mathews, Carla P. Perez, Alexander Derry, Russ B. Altman, and Po-Ssu Huang. Protein sequence design with a learned potential. Nature Communications, 13(1):746, 2022.
  • Anderson [1982] Brian D. O. Anderson. Reverse-time diffusion equation models. Stochastic Processes and their Applications, 12(3):313–326, 1982.
  • Bamborough et al. [2016] Paul Bamborough, Chun-wa Chung, Emmanuel H. Demont, Rebecca C. Furze, Andrew J. Bannister, Ka Hing Che, Hawa Diallo, Clement Douault, Paola Grandi, Tony Kouzarides, et al. A chemical probe for the ATAD2 bromodomain. Angewandte Chemie, 128(38):11554–11558, 2016.
  • Bora et al. [2017] Ashish Bora, Ajil Jalal, Eric Price, and Alexandros G. Dimakis. Compressed sensing using generative models. In International Conference on Machine Learning, pages 537–546. PMLR, 2017.
  • Burley et al. [2021] Stephen K. Burley, Charmi Bhikadiya, Chunxiao Bi, Sebastian Bittrich, Li Chen, Gregg V. Crichlow, Cole H. Christie, Kenneth Dalenberg, Luigi Di Costanzo, Jose M. Duarte, et al. RCSB Protein Data Bank: powerful new tools for exploring 3D structures of biological macromolecules for basic and applied research and education in fundamental biology, biomedicine, biotechnology, bioengineering and energy sciences. Nucleic Acids Research, 49(D1):D437–D451, 2021.
  • Choi et al. [2021] Jooyoung Choi, Sungwon Kim, Yonghyun Jeong, Youngjune Gwon, and Sungroh Yoon. ILVR: Conditioning method for denoising diffusion probabilistic models. arXiv preprint arXiv:2108.02938, 2021.
  • Chung and Ye [2022] Hyungjin Chung and Jong Chul Ye. Score-based diffusion models for accelerated MRI. Medical Image Analysis, 80:102479, 2022.
  • Chung et al. [2022a] Hyungjin Chung, Jeongsol Kim, Michael T. McCann, Marc L. Klasky, and Jong Chul Ye. Diffusion posterior sampling for general noisy inverse problems. arXiv preprint arXiv:2209.14687, 2022a.
  • Chung et al. [2022b] Hyungjin Chung, Byeongsu Sim, Dohoon Ryu, and Jong Chul Ye. Improving diffusion models for inverse problems using manifold constraints. Advances in Neural Information Processing Systems, 35:25683–25696, 2022b.
  • Chung et al. [2022c] Hyungjin Chung, Byeongsu Sim, and Jong Chul Ye. Come-closer-diffuse-faster: Accelerating conditional diffusion models for inverse problems through stochastic contraction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12413–12422, 2022c.
  • Cowtan [2006] Kevin Cowtan. The Buccaneer software for automated model building. 1. Tracing protein chains. Acta Crystallographica Section D: Biological Crystallography, 62(9):1002–1011, 2006.
  • Da Costa et al. [2013] David Da Costa, Angelo Agathanggelou, Timothy Perry, Victoria Weston, Eva Petermann, Anastasia Zlatanou, Ceri Oldreive, Wenbin Wei, Grant Stewart, Joanna Longman, et al. BET inhibition as a single or combined therapeutic approach in primary paediatric B-precursor acute lymphoblastic leukaemia. Blood Cancer Journal, 3(7):e126–e126, 2013.
  • Davison et al. [2022] Gemma Davison, Mathew P. Martin, Shannon Turberville, Selma Dormen, Richard Heath, Amy B. Heptinstall, Marie Lawson, Duncan C. Miller, Yi Min Ng, James N. Sanderson, et al. Mapping ligand interactions of bromodomains BRD4 and ATAD2 with FragLites and PepLites—halogenated probes of druglike and peptide-like molecular interactions. Journal of Medicinal Chemistry, 65(22):15416–15432, 2022.
  • De Bortoli et al. [2022] Valentin De Bortoli, Emile Mathieu, Michael Hutchinson, James Thornton, Yee Whye Teh, and Arnaud Doucet. Riemannian score-based generative modelling. Advances in Neural Information Processing Systems, 35:2406–2422, 2022.
  • De Graef [2003] Marc De Graef. Introduction to Conventional Transmission Electron Microscopy. Cambridge University Press, 2003.
  • De Hoffmann and Stroobant [2007] Edmond De Hoffmann and Vincent Stroobant. Mass Spectrometry: Principles and Applications. John Wiley & Sons, 2007.
  • Dhariwal and Nichol [2021] Prafulla Dhariwal and Alexander Nichol. Diffusion models beat GANs on image synthesis. Advances in Neural Information Processing Systems, 34:8780–8794, 2021.
  • Eigen et al. [2014] David Eigen, Christian Puhrsch, and Rob Fergus. Depth map prediction from a single image using a multi-scale deep network. Advances in Neural Information Processing Systems, 27, 2014.
  • Frank and Ourmazd [2016] Joachim Frank and Abbas Ourmazd. Continuous changes in structure mapped by manifold embedding of single-particle data in cryo-EM. Methods, 100:61–67, 2016.
  • French [2010] Christopher A. French. Demystified molecular pathology of NUT midline carcinomas. Journal of Clinical Pathology, 63(6):492–496, 2010.
  • Geman and Yang [1995] Donald Geman and Chengda Yang. Nonlinear image recovery with half-quadratic regularization. IEEE Transactions on Image Processing, 4(7):932–946, 1995.
  • Hahn et al. [1983] Theo Hahn, Uri Shmueli, and Arthur J. C. Wilson. International Tables for Crystallography, volume 1. Reidel Dordrecht, 1983.
  • He et al. [2022] Jiahua He, Peicong Lin, Ji Chen, Hong Cao, and Sheng-You Huang. Model building of protein complexes from intermediate-resolution cryo-EM maps with deep learning-guided automatic assembly. Nature Communications, 13(1):4066, 2022.
  • Ho et al. [2022] Jonathan Ho, William Chan, Chitwan Saharia, Jay Whang, Ruiqi Gao, Alexey Gritsenko, Diederik P. Kingma, Ben Poole, Mohammad Norouzi, David J. Fleet, et al. Imagen Video: High definition video generation with diffusion models. arXiv preprint arXiv:2210.02303, 2022.
  • Hong and Lei [2009] Liu Hong and Jinzhi Lei. Scaling law for the radius of gyration of proteins and its dependence on hydrophobicity. Journal of Polymer Science Part B: Polymer Physics, 47(2):207–214, 2009.
  • Huang et al. [2016] Po-Ssu Huang, Scott E. Boyken, and David Baker. The coming of age of de novo protein design. Nature, 537(7620):320–327, 2016.
  • Ingraham et al. [2023] John B. Ingraham, Max Baranov, Zak Costello, Karl W. Barber, Wujie Wang, Ahmed Ismail, Vincent Frappier, Dana M. Lord, Christopher Ng-Thow-Hing, Erik R. Van Vlack, et al. Illuminating protein space with a programmable generative model. Nature, 623(7989):1070–1078, 2023.
  • Jalal et al. [2021] Ajil Jalal, Marius Arvinte, Giannis Daras, Eric Price, Alexandros G. Dimakis, and Jon Tamir. Robust compressed sensing MRI with deep generative priors. Advances in Neural Information Processing Systems, 34:14938–14954, 2021.
  • Jamali et al. [2024] Kiarash Jamali, Lukas Käll, Rui Zhang, Alan Brown, Dari Kimanius, and Sjors H. W. Scheres. Automated model building and protein identification in cryo-EM maps. Nature, pages 1–2, 2024.
  • Jing et al. [2023] Bowen Jing, Ezra Erives, Peter Pao-Huang, Gabriele Corso, Bonnie Berger, and Tommi Jaakkola. EigenFold: Generative protein structure prediction with diffusion models. arXiv preprint arXiv:2304.02198, 2023.
  • Jumper et al. [2021] John Jumper, Richard Evans, Alexander Pritzel, Tim Green, Michael Figurnov, Olaf Ronneberger, Kathryn Tunyasuvunakool, Russ Bates, Augustin Žídek, Anna Potapenko, et al. Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873):583–589, 2021.
  • Kaur et al. [2021] Satinder Kaur, Josue Gomez-Blanco, Ahmad A. Z. Khalifa, Swathi Adinarayanan, Ruben Sanchez-Garcia, Daniel Wrapp, Jason S. McLellan, Khanh Huy Bui, and Javier Vargas. Local computational methods to improve the interpretability and analysis of cryo-EM maps. Nature Communications, 12(1):1240, 2021.
  • Kawar et al. [2022] Bahjat Kawar, Michael Elad, Stefano Ermon, and Jiaming Song. Denoising diffusion restoration models. Advances in Neural Information Processing Systems, 35:23593–23606, 2022.
  • Kendall et al. [2015] Alex Kendall, Matthew Grimes, and Roberto Cipolla. PoseNet: A convolutional network for real-time 6-DOF camera relocalization. In Proceedings of the IEEE International Conference on Computer Vision, pages 2938–2946, 2015.
  • Kimanius et al. [2023] Dari Kimanius, Kiarash Jamali, Max E. Wilkinson, Sofia Lövestam, Vaithish Velazhahan, Takanori Nakane, and Sjors H. W. Scheres. Data-driven regularisation lowers the size barrier of cryo-EM structure determination. bioRxiv, pages 2023–10, 2023.
  • Koehler and Meiler [2011] Julia Koehler and Jens Meiler. Expanding the utility of NMR restraints with paramagnetic compounds: background and practical aspects. Progress in Nuclear Magnetic Resonance Spectroscopy, 59(4):360–389, 2011.
  • Kuenze et al. [2019] Georg Kuenze, Richard Bonneau, Julia Koehler Leman, and Jens Meiler. Integrative protein modeling in RosettaNMR from sparse paramagnetic restraints. Structure, 27(11):1721–1734, 2019.
  • Leach et al. [2022] Adam Leach, Sebastian M. Schmon, Matteo T. Degiacomi, and Chris G. Willcocks. Denoising diffusion probabilistic models on SO(3) for rotational alignment. In ICLR 2022 Workshop on Geometrical and Topological Representation Learning, 2022.
  • Lee et al. [2022] Jin Sub Lee, Jisun Kim, and Philip M. Kim. ProteinSGM: Score-based generative modeling for de novo protein design. bioRxiv, pages 2022–07, 2022.
  • Lehtinen et al. [2018] Jaakko Lehtinen, Jacob Munkberg, Jon Hasselgren, Samuli Laine, Tero Karras, Miika Aittala, and Timo Aila. Noise2Noise: Learning image restoration without clean data. arXiv preprint arXiv:1803.04189, 2018.
  • Li et al. [2022] Haoying Li, Yifan Yang, Meng Chang, Shiqi Chen, Huajun Feng, Zhihai Xu, Qi Li, and Yueting Chen. SRDiff: Single image super-resolution with diffusion probabilistic models. Neurocomputing, 479:47–59, 2022.
  • Liebschner et al. [2019] Dorothee Liebschner, Pavel V. Afonine, Matthew L. Baker, Gábor Bunkóczi, Vincent B. Chen, Tristan I. Croll, Bradley Hintze, L.-W. Hung, Swati Jain, Airlie J. McCoy, et al. Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix. Acta Crystallographica Section D: Structural Biology, 75(10):861–877, 2019.
  • Lim et al. [2017] Bee Lim, Sanghyun Son, Heewon Kim, Seungjun Nah, and Kyoung Mu Lee. Enhanced deep residual networks for single image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 136–144, 2017.
  • Lugmayr et al. [2022] Andreas Lugmayr, Martin Danelljan, Andres Romero, Fisher Yu, Radu Timofte, and Luc Van Gool. RePaint: Inpainting using denoising diffusion probabilistic models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11461–11471, 2022.
  • Luo et al. [2022] Shitong Luo, Yufeng Su, Xingang Peng, Sheng Wang, Jian Peng, and Jianzhu Ma. Antigen-specific antibody design and optimization with diffusion-based generative models for protein structures. Advances in Neural Information Processing Systems, 35:9754–9767, 2022.
  • Meng et al. [2023] Elaine C. Meng, Thomas D. Goddard, Eric F. Pettersen, Greg S. Couch, Zach J. Pearson, John H. Morris, and Thomas E. Ferrin. UCSF ChimeraX: Tools for structure building and analysis. Protein Science, 32(11):e4792, 2023.
  • Nah et al. [2017] Seungjun Nah, Tae Hyun Kim, and Kyoung Mu Lee. Deep multi-scale convolutional neural network for dynamic scene deblurring. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3883–3891, 2017.
  • Pfab et al. [2021] Jonas Pfab, Nhut Minh Phan, and Dong Si. DeepTracer for fast de novo cryo-EM protein structure modeling and special studies on CoV-related complexes. Proceedings of the National Academy of Sciences, 118(2):e2017525118, 2021.
  • Po et al. [2023] Ryan Po, Wang Yifan, Vladislav Golyanik, Kfir Aberman, Jonathan T. Barron, Amit H. Bermano, Eric Ryan Chan, Tali Dekel, Aleksander Holynski, Angjoo Kanazawa, et al. State of the art on diffusion models for visual computing. arXiv preprint arXiv:2310.07204, 2023.
  • Punjani et al. [2017] Ali Punjani, John L. Rubinstein, David J. Fleet, and Marcus A. Brubaker. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nature Methods, 14(3):290–296, 2017.
  • Rives et al. [2021] Alexander Rives, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin, Jason Liu, Demi Guo, Myle Ott, C. Lawrence Zitnick, Jerry Ma, et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proceedings of the National Academy of Sciences, 118(15):e2016239118, 2021.
  • Robbins [1992] Herbert E. Robbins. An empirical Bayes approach to statistics. In Breakthroughs in Statistics: Foundations and Basic Theory, pages 388–394. Springer, 1992.
  • Saharia et al. [2022] Chitwan Saharia, Jonathan Ho, William Chan, Tim Salimans, David J. Fleet, and Mohammad Norouzi. Image super-resolution via iterative refinement. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(4):4713–4726, 2022.
  • Scheres [2012] Sjors H. W. Scheres. RELION: implementation of a Bayesian approach to cryo-EM structure determination. Journal of Structural Biology, 180(3):519–530, 2012.
  • Si et al. [2020] Dong Si, Spencer A. Moritz, Jonas Pfab, Jie Hou, Renzhi Cao, Liguo Wang, Tianqi Wu, and Jianlin Cheng. Deep learning to predict protein backbone structure from high-resolution cryo-EM density maps. Scientific Reports, 10(1):4282, 2020.
  • Singharoy et al. [2016] Abhishek Singharoy, Ivan Teo, Ryan McGreevy, John E. Stone, Jianhua Zhao, and Klaus Schulten. Molecular dynamics-based refinement and validation for sub-5 Å cryo-electron microscopy maps. eLife, 5:e16105, 2016.
  • Song et al. [2022] Jiaming Song, Arash Vahdat, Morteza Mardani, and Jan Kautz. Pseudoinverse-guided diffusion models for inverse problems. In International Conference on Learning Representations, 2022.
  • Song et al. [2021] Yang Song, Liyue Shen, Lei Xing, and Stefano Ermon. Solving inverse problems in medical imaging with score-based generative models. arXiv preprint arXiv:2111.08005, 2021.
  • Sun et al. [2023] Yu Sun, Zihui Wu, Yifan Chen, Berthy T. Feng, and Katherine L. Bouman. Provable probabilistic imaging using score-based generative priors. arXiv preprint arXiv:2310.10835, 2023.
  • Tang et al. [2007] Guang Tang, Liwei Peng, Philip R. Baldwin, Deepinder S. Mann, Wen Jiang, Ian Rees, and Steven J. Ludtke. EMAN2: an extensible image processing suite for electron microscopy. Journal of Structural Biology, 157(1):38–46, 2007.
  • Tanner [2016] John J. Tanner. Empirical power laws for the radii of gyration of protein oligomers. Acta Crystallographica Section D: Structural Biology, 72(10):1119–1129, 2016.
  • Terashi and Kihara [2018] Genki Terashi and Daisuke Kihara. De novo main-chain modeling for EM maps using MAINMAST. Nature Communications, 9(1):1618, 2018.
  • Terwilliger et al. [2018] Thomas C. Terwilliger, Paul D. Adams, Pavel V. Afonine, and Oleg V. Sobolev. A fully automatic method yielding initial models from high-resolution cryo-electron microscopy maps. Nature Methods, 15(11):905–908, 2018.
  • Trippe et al. [2022] Brian L. Trippe, Jason Yim, Doug Tischer, David Baker, Tamara Broderick, Regina Barzilay, and Tommi Jaakkola. Diffusion probabilistic modeling of protein backbones in 3D for the motif-scaffolding problem. arXiv preprint arXiv:2206.04119, 2022.
  • Venkatakrishnan et al. [2013] Singanallur V. Venkatakrishnan, Charles A. Bouman, and Brendt Wohlberg. Plug-and-play priors for model based reconstruction. In 2013 IEEE Global Conference on Signal and Information Processing, pages 945–948. IEEE, 2013.
  • Wang et al. [2015] Ray Yu-Ruei Wang, Mikhail Kudryashev, Xueming Li, Edward H. Egelman, Marek Basler, Yifan Cheng, David Baker, and Frank DiMaio. De novo protein structure determination from near-atomic-resolution cryo-EM maps. Nature Methods, 12(4):335–338, 2015.
  • Warstat et al. [2023] Robin Warstat, Mehrosh Pervaiz, Pierre Regenass, Marius Amann, Karin Schmidtkunz, Oliver Einsle, Manfred Jung, Bernhard Breit, Martin Hügle, and Stefan Günther. A novel pan-selective bromodomain inhibitor for epigenetic drug design. European Journal of Medicinal Chemistry, 249:115139, 2023.
  • Watson et al. [2023] Joseph L. Watson, David Juergens, Nathaniel R. Bennett, Brian L. Trippe, Jason Yim, Helen E. Eisenach, Woody Ahern, Andrew J. Borst, Robert J. Ragotte, Lukas F. Milles, et al. De novo design of protein structure and function with RFdiffusion. Nature, 620(7976):1089–1100, 2023.
  • Wu et al. [2024] Kevin E. Wu, Kevin K. Yang, Rianne van den Berg, Sarah Alamdari, James Y. Zou, Alex X. Lu, and Ava P. Amini. Protein structure generation via folding diffusion. Nature Communications, 15(1):1059, 2024.
  • Xie et al. [2012] Junyuan Xie, Linli Xu, and Enhong Chen. Image denoising and inpainting with deep neural networks. Advances in Neural Information Processing Systems, 25, 2012.
  • Yim et al. [2023] Jason Yim, Brian L. Trippe, Valentin De Bortoli, Emile Mathieu, Arnaud Doucet, Regina Barzilay, and Tommi Jaakkola. SE(3) diffusion model with application to protein backbone generation. arXiv preprint arXiv:2302.02277, 2023.
  • Zhang et al. [2017] Kai Zhang, Wangmeng Zuo, Yunjin Chen, Deyu Meng, and Lei Zhang. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Transactions on Image Processing, 26(7):3142–3155, 2017.
  • Zhang et al. [2022] Xi Zhang, Biao Zhang, Peter L. Freddolino, and Yang Zhang. CR-I-TASSER: assemble protein structures from cryo-EM density maps using deep convolutional neural networks. Nature Methods, 19(2):195–204, 2022.
  • Zhu et al. [2023] Yuanzhi Zhu, Kai Zhang, Jingyun Liang, Jiezhang Cao, Bihan Wen, Radu Timofte, and Luc Van Gool. Denoising diffusion models for plug-and-play image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1219–1229, 2023.

Appendix

Appendix A Log-Likelihood Functions

We define here the log-likelihood functions, as mentioned in Algorithm 1.

Structure Completion.

For structure completion, we use the log-likelihood function defined in (10):

f(\mathbf{y},\mathbf{z}) = -\bigg\|\begin{pmatrix}\mathbf{I}_{m} & 0\\ 0 & 0\end{pmatrix}\mathbf{V}^{T}\mathbf{z} - \mathbf{S}^{+}\mathbf{U}^{T}\mathbf{y}\bigg\|_{2}^{2}. \qquad (15)
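As an illustration, (15) can be evaluated in a few lines once the composed measurement operator $\mathbf{M}\mathbf{R}$ has been factored as $\mathbf{U}\mathbf{S}\mathbf{V}^{T}$; the tensor shapes and variable names below are our own assumptions, not the released implementation.

```python
import torch

def structure_completion_loglik(y, z, U, S, V, m):
    """Evaluate f(y, z) = -|| diag(I_m, 0) V^T z - S^+ U^T y ||_2^2."""
    proj_z = (V.T @ z)[:m]             # diag(I_m, 0) V^T z: rows beyond m vanish
    target = (U.T @ y) / S[:m, None]   # S^+ U^T y, broadcast over 3D coordinates
    return -((proj_z - target) ** 2).sum()
```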

Model Refinement.

In the model refinement task, we combine three sources of information with the following three log-likelihood functions:

f_{m}(\bar{\mathbf{x}},\mathbf{z}) = -\bigg\|\begin{pmatrix}\mathbf{I}_{m} & 0\\ 0 & 0\end{pmatrix}\mathbf{V}^{T}\mathbf{z} - \mathbf{S}^{+}\mathbf{U}^{T}\bar{\mathbf{x}}\bigg\|_{2}^{2} \quad \text{(incomplete model)} \qquad (16)
f_{s}(\mathbf{s},\mathbf{z}) = \log p_{\theta}(\mathbf{s}\,|\,\mathbf{R}\mathbf{z},\,t=0) \quad \text{(sequence)}
f_{d}(\mathbf{y},\mathbf{z}) = -\|\mathbf{y} - \mathcal{B}(\Gamma(\mathbf{R}\mathbf{z},\mathbf{s},\chi))\|_{2}^{2},\ \text{where } \chi \sim p_{\theta}(\chi\,|\,\mathbf{R}\mathbf{z},\mathbf{s},\,t=0) \quad \text{(cryo-EM density)}

$\Gamma$ is an operator turning an all-atom model into a density map. Given $\mathbf{X}=(\mathbf{x},\mathbf{s},\chi)=[X_{1},\ldots,X_{A}]^{T}\in(\mathbb{R}^{3})^{A}$, the Cartesian coordinates of $A$ heavy atoms,

\Gamma(\mathbf{X}):\ x\in\mathbb{R}^{3} \mapsto \sum_{i=1}^{A}\sum_{j=1}^{5} 4\pi\frac{a_{i,j}}{b_{i,j}}\exp\left(-\frac{4\pi^{2}}{b_{i,j}}\|X_{i}-x\|_{2}^{2}\right)\in\mathbb{R}, \qquad (17)

where $a_{i,j}$ and $b_{i,j}$ are tabulated form factors [24]. The norm $\|\mathbf{y}-\Gamma(\mathbf{X})\|_{2}^{2}$ is computed directly in Fourier space using Parseval's theorem, up to a time-dependent resolution $r_{t}$. Here, we assume the B-factors are unknown and neglect their contribution to the log-likelihood function, since we consider high-resolution density maps ($\leq 4$ Å).
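To make the rendering operator concrete, a minimal dense evaluation of (17) is sketched below; it assumes the form factors have been loaded into arrays a and b of shape (A, 5) from the tables of [24], and evaluates the map on an arbitrary set of voxel centers.

```python
import math
import torch

def render_density(atom_xyz, a, b, grid_xyz):
    """Evaluate the Gaussian-mixture density of (17) at a set of voxel centers.

    atom_xyz: (A, 3) heavy-atom coordinates; a, b: (A, 5) tabulated form
    factors; grid_xyz: (V, 3) voxel centers. Returns a (V,) density vector.
    """
    # Squared distance between every voxel and every atom: (V, A)
    d2 = ((grid_xyz[:, None, :] - atom_xyz[None, :, :]) ** 2).sum(-1)
    # Five Gaussians per atom, summed over Gaussians and atoms: (V, A, 5) -> (V,)
    gauss = 4.0 * math.pi * (a / b)[None] * torch.exp(
        -4.0 * math.pi**2 * d2[..., None] / b[None]
    )
    return gauss.sum(dim=(1, 2))
```

A dense evaluation like this scales as O(V·A); a practical implementation would restrict each atom's contribution to a local neighborhood of voxels.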

Pairwise Distances to Structure.

We use the following log-likelihood function:

f(\mathbf{y},\mathbf{z}) = -\big\|\mathbf{y} - \|\mathbf{D}\mathbf{R}\mathbf{z}\|_{2}\big\|_{2}^{2}. \qquad (18)
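In the sketch below, the action of $\mathbf{D}$ on $\mathbf{R}\mathbf{z}$ is realized by gathering the two endpoints of each measured pair and differencing; the pairs tensor and the shapes are illustrative assumptions.

```python
import torch

def pairwise_distance_loglik(y, z, R, pairs):
    """f(y, z) = -|| y - ||D R z||_2 ||_2^2, with row-wise norms over pairs."""
    x = R @ z                                 # back to Cartesian coordinates, (N, 3)
    diffs = x[pairs[:, 0]] - x[pairs[:, 1]]   # D R z: (m, 3) displacement vectors
    pred = diffs.norm(dim=-1)                 # predicted pairwise distances, (m,)
    return -((y - pred) ** 2).sum()
```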

Appendix B Optimization Parameters

Structure Completion.

All experiments are run for 1,000 epochs, with a learning rate $\lambda$ of 0.3 and a momentum $\rho$ of 0.9. The time schedule is linear:

t = 1 - \text{epoch}/1000. \qquad (19)

Model Refinement.

All experiments are run for 4,000 epochs. The learning rates are

\lambda_{m} = 0.1, \quad \lambda_{s} = \begin{cases}0 & \text{if epoch} < 3{,}000\\ 1\times10^{-5} & \text{otherwise}\end{cases}, \quad \lambda_{d} = 3\times10^{-5}\cdot(r_{t}/r_{0})^{3}, \qquad (20)

and all the momenta are set to 0.9. $r_{t}$ is set to 15 Å for the first 3,000 epochs and linearly decreases to 5 Å during the last 1,000 epochs. The side chain angles ($\chi$) are sampled every 100 epochs. The time schedule is

t = 1 - \sqrt{\text{epoch}/4{,}000}. \qquad (21)

Pairwise Distances to Structure.

All experiments are run for 1,000 epochs, with a learning rate $\lambda = 200/n$, where $n$ is the number of known pairwise distances, and a momentum $\rho$ of 0.99. The time schedule is

t = 1 - \sqrt{\text{epoch}/1{,}000}. \qquad (22)
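For reference, the three time schedules (19), (21), and (22) transcribe directly into standalone functions; this is a plain restatement of the equations above, not the authors' code.

```python
def t_structure_completion(epoch: int) -> float:
    return 1.0 - epoch / 1000            # linear schedule, eq. (19)

def t_model_refinement(epoch: int) -> float:
    return 1.0 - (epoch / 4000) ** 0.5   # square-root schedule, eq. (21)

def t_pairwise_distances(epoch: int) -> float:
    return 1.0 - (epoch / 1000) ** 0.5   # square-root schedule, eq. (22)
```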

Appendix C Correlated Diffusion

We use the “$R_{g}$-confined globular polymer” correlation matrix, as defined in [29]:

\mathbf{R} = a\begin{bmatrix}
\nu b^{0} & & & & & \\
\nu b^{1} & b^{0} & & & & \\
\nu b^{2} & b^{1} & b^{0} & & & \\
\vdots & & \ddots & \ddots & & \\
\nu b^{N-2} & & & b^{1} & b^{0} & \\
\nu b^{N-1} & & \cdots & b^{2} & b^{1} & b^{0}
\end{bmatrix}, \qquad (23)

where $a$ is a free parameter, $\nu = \frac{1}{\sqrt{1-b^{2}}}$, and $b$ is chosen such that the radius of gyration $R_{g}$ scales with the number of residues as $R_{g} \sim rN^{\alpha}$ ($r \approx 2.0$ Å, $\alpha \approx 0.4$ [63, 27]).
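Since the matrix in (23) is fully specified by $a$ and $b$, it can be built explicitly; the sketch below is a direct transcription of the layout above, with $b$ assumed to be precomputed from the $R_g$ scaling law.

```python
import numpy as np

def rg_confined_correlation(n: int, a: float, b: float) -> np.ndarray:
    """Build the lower-triangular correlation matrix of equation (23)."""
    nu = 1.0 / np.sqrt(1.0 - b**2)
    R = np.zeros((n, n))
    for i in range(n):
        R[i, 0] = nu * b**i          # first column: nu * b^i
        for j in range(1, i + 1):
            R[i, j] = b ** (i - j)   # remaining lower triangle: b^(i-j)
    return a * R
```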

Appendix D Target Structures

Information on the proteins used as target structures is given in Table 1.

PDB    Nb. of residues   Resolution   Release date   Imaging method      ∈ CASP15
7r5b   127               1.77 Å       2023-02-08     X-ray diffraction   No
7qum   130               1.50 Å       2023-03-01     X-ray diffraction   No
1cfd   148               N/A          1995-12-07     NMR                 No
7pzt   160               1.84 Å       2022-11-02     X-ray diffraction   Yes
1a2f   291               2.10 Å       1998-11-25     X-ray diffraction   No

Appendix E Additional Results

E.1 Structure Completion

We provide additional results in Table 2. The Evidence Lower Bound (ELBO) is computed with Chroma and is a lower bound on the learned log-prior. Note that, as the subsampling factor increases (i.e., as the problem becomes less constrained), the gap between our approach (MAP estimation) and the baseline (low-temperature posterior sampling) decreases. For highly undersampled structures, the RMSD with respect to the target structure is high, but the generated structure remains plausible under the learned prior distribution.

PDB (# res.)   Metric        Method    Subsampling factor
                                       2      4      8      16     32     64     128
7r5b (127)     RMSD (Å, ↓)   Chroma*   0.41   1.34   3.73   9.34   15.95  16.27  —
                             ADP-3D    0.10   0.47   2.21   4.47   9.45   11.54  —
               ELBO (↑)      Chroma*   7.79   6.72   7.27   7.56   7.40   9.10   —
                             ADP-3D    7.38   7.87   7.48   6.65   7.64   7.98   —
7qum (130)     RMSD (Å, ↓)   Chroma*   0.28   0.81   2.14   3.46   7.70   11.49  14.45
                             ADP-3D    0.21   0.27   1.13   2.68   5.60   7.76   13.01
               ELBO (↑)      Chroma*   5.87   6.08   7.16   7.13   8.43   8.89   9.27
                             ADP-3D    5.64   6.47   7.87   7.56   7.99   8.76   8.98
1cfd (148)     RMSD (Å, ↓)   Chroma*   0.42   1.17   3.06   5.84   9.44   14.28  15.87
                             ADP-3D    0.20   0.59   1.83   4.62   12.76  12.51  15.29
               ELBO (↑)      Chroma*   5.65   5.97   6.49   5.46   8.87   9.22   8.94
                             ADP-3D    5.22   5.34   7.60   8.00   8.91   8.72   7.82
7pzt (160)     RMSD (Å, ↓)   Chroma*   0.50   1.61   4.82   10.36  16.15  16.81  16.93
                             ADP-3D    0.23   0.67   2.64   9.39   14.26  15.47  14.07
               ELBO (↑)      Chroma*   5.99   5.55   5.39   7.47   8.52   8.74   9.04
                             ADP-3D    5.79   6.44   6.92   7.79   8.29   7.81   8.64
1a2f (291)     RMSD (Å, ↓)   Chroma*   0.39   1.21   3.42   7.45   14.77  16.99  17.29
                             ADP-3D    0.13   0.68   1.91   5.48   12.48  17.78  19.68
               ELBO (↑)      Chroma*   5.39   5.33   6.83   7.09   8.74   9.28   9.48
                             ADP-3D    5.24   6.54   7.62   7.85   8.20   8.27   9.12
(— : no value reported for 7r5b at the largest subsampling factor.)

E.2 Gradient Descent for Linear Constraints

In Figure 6, we compare different gradient-based techniques for minimizing the following objective function over $\mathbf{z}$:

\mathcal{L}(\mathbf{z}) = \|\mathbf{M}\mathbf{x}^{*} - \mathbf{M}\mathbf{R}\mathbf{z}\|_{2}^{2}. \qquad (24)

$\mathbf{x}^{*}$ is the backbone structure of the ATAD2 protein (PDB: 7qum). $\mathbf{M}$ is a masking matrix with a subsampling factor of 2. The preconditioning strategy is described in Section 4.2. Gradient descent with preconditioning and momentum leads to the fastest convergence.
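A minimal sketch of this comparison follows; the precond matrix stands in for the preconditioner of Section 4.2 and is an assumption here, as are the step size and iteration count.

```python
import torch

def minimize_masked_objective(M, R, x_star, precond, lr=0.1, momentum=0.9, steps=1000):
    """Minimize ||M x* - M R z||_2^2 over z with momentum and preconditioning."""
    y = M @ x_star                                    # simulated partial measurement
    z = torch.zeros_like(x_star)
    velocity = torch.zeros_like(z)
    for _ in range(steps):
        grad = 2.0 * R.T @ (M.T @ (M @ (R @ z) - y))  # gradient of objective (24)
        velocity = momentum * velocity + precond @ grad
        z = z - lr * velocity
    return z
```

Setting precond to the identity and momentum to 0 recovers plain gradient descent, which makes the comparison of Figure 6 a one-line change per variant.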

[Figure 6]

E.3 Ablation Study for Model Refinement

In Figure 7, we analyze the importance of the different input measurements for atomic model refinement. Removing the partial atomic model leads to the largest drop in accuracy. The cryo-EM density map is the second most important measurement, followed by the generative prior and the sequence.

[Figure 7]

E.4 Additional video

We provide one additional video showing the predicted structure throughout the diffusion process, for the three tasks explored in this paper.
