new

Get trending papers in your email inbox!

Subscribe

Daily Papers

byAK and the research community

Dec 8

Lie Group Decompositions for Equivariant Neural Networks

Invariance and equivariance to geometrical transformations have proven to be very useful inductive biases when training (convolutional) neural network models, especially in the low-data regime. Much work has focused on the case where the symmetry group employed is compact or abelian, or both. Recent work has explored enlarging the class of transformations used to the case of Lie groups, principally through the use of their Lie algebra, as well as the group exponential and logarithm maps. The applicability of such methods to larger transformation groups is limited by the fact that depending on the group of interest G, the exponential map may not be surjective. Further limitations are encountered when G is neither compact nor abelian. Using the structure and geometry of Lie groups and their homogeneous spaces, we present a framework by which it is possible to work with such groups primarily focusing on the Lie groups G = GL^{+}(n, R) and G = SL(n, R), as well as their representation as affine transformations R^{n} rtimes G. Invariant integration as well as a global parametrization is realized by decomposing the `larger` groups into subgroups and submanifolds which can be handled individually. Under this framework, we show how convolution kernels can be parametrized to build models equivariant with respect to affine transformations. We evaluate the robustness and out-of-distribution generalisation capability of our model on the standard affine-invariant benchmark classification task, where we outperform all previous equivariant models as well as all Capsule Network proposals.

  • 2 authors
·
Oct 17, 2023

Towards A Universally Transferable Acceleration Method for Density Functional Theory

Recently, sophisticated deep learning-based approaches have been developed for generating efficient initial guesses to accelerate the convergence of density functional theory (DFT) calculations. While the actual initial guesses are often density matrices (DM), quantities that can convert into density matrices also qualify as alternative forms of initial guesses. Hence, existing works mostly rely on the prediction of the Hamiltonian matrix for obtaining high-quality initial guesses. However, the Hamiltonian matrix is both numerically difficult to predict and intrinsically non-transferable, hindering the application of such models in real scenarios. In light of this, we propose a method that constructs DFT initial guesses by predicting the electron density in a compact auxiliary basis representation using E(3)-equivariant neural networks. Trained on small molecules with up to 20 atoms, our model is able to achieve an average 33.3% self-consistent field (SCF) step reduction on systems up to 60 atoms, substantially outperforming Hamiltonian-centric and DM-centric models. Critically, this acceleration remains nearly constant with increasing system sizes and exhibits strong transferring behaviors across orbital basis sets and exchange-correlation (XC) functionals. To the best of our knowledge, this work represents the first and robust candidate for a universally transferable DFT acceleration method. We are also releasing the SCFbench dataset and its accompanying code to facilitate future research in this promising direction.

  • 6 authors
·
Sep 29

Regularizing Towards Soft Equivariance Under Mixed Symmetries

Datasets often have their intrinsic symmetries, and particular deep-learning models called equivariant or invariant models have been developed to exploit these symmetries. However, if some or all of these symmetries are only approximate, which frequently happens in practice, these models may be suboptimal due to the architectural restrictions imposed on them. We tackle this issue of approximate symmetries in a setup where symmetries are mixed, i.e., they are symmetries of not single but multiple different types and the degree of approximation varies across these types. Instead of proposing a new architectural restriction as in most of the previous approaches, we present a regularizer-based method for building a model for a dataset with mixed approximate symmetries. The key component of our method is what we call equivariance regularizer for a given type of symmetries, which measures how much a model is equivariant with respect to the symmetries of the type. Our method is trained with these regularizers, one per each symmetry type, and the strength of the regularizers is automatically tuned during training, leading to the discovery of the approximation levels of some candidate symmetry types without explicit supervision. Using synthetic function approximation and motion forecasting tasks, we demonstrate that our method achieves better accuracy than prior approaches while discovering the approximate symmetry levels correctly.

  • 4 authors
·
Jun 1, 2023

More on the Weak Gravity Conjecture via Convexity of Charged Operators

The Weak Gravity Conjecture has recently been re-formulated in terms of a particle with non-negative self-binding energy. Because of the dual conformal field theory (CFT) formulation in the anti-de Sitter space the conformal dimension Delta (Q) of the lowest-dimension operator with charge Q under some global U(1) symmetry must be a convex function of Q. This property has been conjectured to hold for any (unitary) conformal field theory and generalized to larger global symmetry groups. Here we refine and further test the convex charge conjecture via semiclassical computations for fixed charge sectors of different theories in different dimensions. We analyze the convexity properties of the leading and next-to-leading order terms stemming from the semiclassical computation, de facto, extending previous tests beyond the leading perturbative contributions and to arbitrary charges. In particular, the leading contribution is sufficient to test convexity in the semiclassical computations. We also consider intriguing cases in which the models feature a transition from real to complex conformal dimensions either as a function of the charge or number of matter fields. As a relevant example of the first kind, we investigate the O(N) model in 4+epsilon dimensions. As an example of the second type we consider the U(N)times U(M) model in 4-epsilon dimensions. Both models display a rich dynamics where, by changing the number of matter fields and/or charge, one can achieve dramatically different physical regimes. We discover that whenever a complex conformal dimension appears, the real part satisfies the convexity property.

  • 5 authors
·
Sep 10, 2021

Efficient and Scalable Density Functional Theory Hamiltonian Prediction through Adaptive Sparsity

Hamiltonian matrix prediction is pivotal in computational chemistry, serving as the foundation for determining a wide range of molecular properties. While SE(3) equivariant graph neural networks have achieved remarkable success in this domain, their substantial computational cost--driven by high-order tensor product (TP) operations--restricts their scalability to large molecular systems with extensive basis sets. To address this challenge, we introduce SPHNet, an efficient and scalable equivariant network, that incorporates adaptive SParsity into Hamiltonian prediction. SPHNet employs two innovative sparse gates to selectively constrain non-critical interaction combinations, significantly reducing tensor product computations while maintaining accuracy. To optimize the sparse representation, we develop a Three-phase Sparsity Scheduler, ensuring stable convergence and achieving high performance at sparsity rates of up to 70%. Extensive evaluations on QH9 and PubchemQH datasets demonstrate that SPHNet achieves state-of-the-art accuracy while providing up to a 7x speedup over existing models. Beyond Hamiltonian prediction, the proposed sparsification techniques also hold significant potential for improving the efficiency and scalability of other SE(3) equivariant networks, further broadening their applicability and impact. Our code can be found at https://github.com/microsoft/SPHNet.

  • 10 authors
·
Feb 3

Spin pumping by a moving domain wall at the interface of an antiferromagnetic insulator and a two-dimensional metal

A domain wall (DW) which moves parallel to a magnetically compensated interface between an antiferromagnetic insulator (AFMI) and a two-dimensional (2D) metal can pump spin polarization into the metal. It is assumed that localized spins of a collinear AFMI interact with itinerant electrons through their exchange interaction on the interface. We employed the formalism of Keldysh Green's functions for electrons which experience potential and spin-orbit scattering on random impurities. This formalism allows a unified analysis of spin pumping, spin diffusion and spin relaxation effects on a 2D electron gas. It is shown that the pumping of a nonstaggered magnetization into the metal film takes place in the second order with respect to the interface exchange interaction. At sufficiently weak spin relaxation this pumping effect can be much stronger than the first-order effect of the Pauli magnetism which is produced by the small nonstaggered exchange field of the DW. It is shown that the pumped polarization is sensitive to the geometry of the electron's Fermi surface and increases when the wave vector of the staggered magnetization approaches the nesting vector of the Fermi surface. In a disordered diffusive electron gas the induced spin polarization follows the motion of the domain wall. It is distributed asymmetrically around the DW over a distance which can be much larger than the DW width.

  • 1 authors
·
Nov 2, 2022

Enabling Efficient Equivariant Operations in the Fourier Basis via Gaunt Tensor Products

Developing equivariant neural networks for the E(3) group plays an important role in modeling 3D data across real-world applications. Enforcing this equivariance primarily involves the tensor products of irreducible representations (irreps). However, the computational complexity of such operations increases significantly as higher-order tensors are used. In this work, we propose a systematic approach to substantially accelerate the computation of the tensor products of irreps. We mathematically connect the commonly used Clebsch-Gordan coefficients to the Gaunt coefficients, which are integrals of products of three spherical harmonics. Through Gaunt coefficients, the tensor product of irreps becomes equivalent to the multiplication between spherical functions represented by spherical harmonics. This perspective further allows us to change the basis for the equivariant operations from spherical harmonics to a 2D Fourier basis. Consequently, the multiplication between spherical functions represented by a 2D Fourier basis can be efficiently computed via the convolution theorem and Fast Fourier Transforms. This transformation reduces the complexity of full tensor products of irreps from O(L^6) to O(L^3), where L is the max degree of irreps. Leveraging this approach, we introduce the Gaunt Tensor Product, which serves as a new method to construct efficient equivariant operations across different model architectures. Our experiments on the Open Catalyst Project and 3BPA datasets demonstrate both the increased efficiency and improved performance of our approach.

  • 3 authors
·
Jan 18, 2024

Geometric Trajectory Diffusion Models

Generative models have shown great promise in generating 3D geometric systems, which is a fundamental problem in many natural science domains such as molecule and protein design. However, existing approaches only operate on static structures, neglecting the fact that physical systems are always dynamic in nature. In this work, we propose geometric trajectory diffusion models (GeoTDM), the first diffusion model for modeling the temporal distribution of 3D geometric trajectories. Modeling such distribution is challenging as it requires capturing both the complex spatial interactions with physical symmetries and temporal correspondence encapsulated in the dynamics. We theoretically justify that diffusion models with equivariant temporal kernels can lead to density with desired symmetry, and develop a novel transition kernel leveraging SE(3)-equivariant spatial convolution and temporal attention. Furthermore, to induce an expressive trajectory distribution for conditional generation, we introduce a generalized learnable geometric prior into the forward diffusion process to enhance temporal conditioning. We conduct extensive experiments on both unconditional and conditional generation in various scenarios, including physical simulation, molecular dynamics, and pedestrian motion. Empirical results on a wide suite of metrics demonstrate that GeoTDM can generate realistic geometric trajectories with significantly higher quality.

  • 5 authors
·
Oct 16, 2024

Fast, Expressive SE(n) Equivariant Networks through Weight-Sharing in Position-Orientation Space

Based on the theory of homogeneous spaces we derive geometrically optimal edge attributes to be used within the flexible message-passing framework. We formalize the notion of weight sharing in convolutional networks as the sharing of message functions over point-pairs that should be treated equally. We define equivalence classes of point-pairs that are identical up to a transformation in the group and derive attributes that uniquely identify these classes. Weight sharing is then obtained by conditioning message functions on these attributes. As an application of the theory, we develop an efficient equivariant group convolutional network for processing 3D point clouds. The theory of homogeneous spaces tells us how to do group convolutions with feature maps over the homogeneous space of positions R^3, position and orientations R^3 {times} S^2, and the group SE(3) itself. Among these, R^3 {times} S^2 is an optimal choice due to the ability to represent directional information, which R^3 methods cannot, and it significantly enhances computational efficiency compared to indexing features on the full SE(3) group. We support this claim with state-of-the-art results -- in accuracy and speed -- on five different benchmarks in 2D and 3D, including interatomic potential energy prediction, trajectory forecasting in N-body systems, and generating molecules via equivariant diffusion models.

  • 5 authors
·
Oct 4, 2023

Faces of highest weight modules and the universal Weyl polyhedron

Let V be a highest weight module over a Kac-Moody algebra g, and let conv V denote the convex hull of its weights. We determine the combinatorial isomorphism type of conv V, i.e. we completely classify the faces and their inclusions. In the special case where g is semisimple, this brings closure to a question studied by Cellini-Marietti [IMRN 2015] for the adjoint representation, and by Khare [J. Algebra 2016; Trans. Amer. Math. Soc. 2017] for most modules. The determination of faces of finite-dimensional modules up to the Weyl group action and some of their inclusions also appears in previous work of Satake [Ann. of Math. 1960], Borel-Tits [IHES Publ. Math. 1965], Vinberg [Izv. Akad. Nauk 1990], and Casselman [Austral. Math. Soc. 1997]. For any subset of the simple roots, we introduce a remarkable convex cone which we call the universal Weyl polyhedron, which controls the convex hulls of all modules parabolically induced from the corresponding Levi factor. Namely, the combinatorial isomorphism type of the cone stores the classification of faces for all such highest weight modules, as well as how faces degenerate as the highest weight gets increasingly singular. To our knowledge, this cone is new in finite and infinite type. We further answer a question of Michel Brion, by showing that the localization of conv V along a face is always the convex hull of the weights of a parabolically induced module. Finally, as we determine the inclusion relations between faces representation-theoretically from the set of weights, without recourse to convexity, we answer a similar question for highest weight modules over symmetrizable quantum groups.

  • 2 authors
·
Oct 31, 2016

Navigating the Design Space of Equivariant Diffusion-Based Generative Models for De Novo 3D Molecule Generation

Deep generative diffusion models are a promising avenue for 3D de novo molecular design in materials science and drug discovery. However, their utility is still limited by suboptimal performance on large molecular structures and limited training data. To address this gap, we explore the design space of E(3)-equivariant diffusion models, focusing on previously unexplored areas. Our extensive comparative analysis evaluates the interplay between continuous and discrete state spaces. From this investigation, we present the EQGAT-diff model, which consistently outperforms established models for the QM9 and GEOM-Drugs datasets. Significantly, EQGAT-diff takes continuous atom positions, while chemical elements and bond types are categorical and uses time-dependent loss weighting, substantially increasing training convergence, the quality of generated samples, and inference time. We also showcase that including chemically motivated additional features like hybridization states in the diffusion process enhances the validity of generated molecules. To further strengthen the applicability of diffusion models to limited training data, we investigate the transferability of EQGAT-diff trained on the large PubChem3D dataset with implicit hydrogen atoms to target different data distributions. Fine-tuning EQGAT-diff for just a few iterations shows an efficient distribution shift, further improving performance throughout data sets. Finally, we test our model on the Crossdocked data set for structure-based de novo ligand generation, underlining the importance of our findings showing state-of-the-art performance on Vina docking scores.

  • 5 authors
·
Sep 29, 2023

Distinguishability and linear independence for H-chromatic symmetric functions

We study the H-chromatic symmetric functions X_G^H (introduced in (arXiv:2011.06063) as a generalization of the chromatic symmetric function (CSF) X_G), which track homomorphisms from the graph G to the graph H. We focus first on the case of self-chromatic symmetric functions (self-CSFs) X_G^G, making some progress toward a conjecture from (arXiv:2011.06063) that the self-CSF, like the normal CSF, is always different for different trees. In particular, we show that the self-CSF distinguishes trees from non-trees with just one exception, we check using Sage that it distinguishes all trees on up to 12 vertices, and we show that it determines the number of legs of a spider and the degree sequence of a caterpillar given its spine length. We also show that the self-CSF detects the number of connected components of a forest, again with just one exception. Then we prove some results about the power sum expansions for H-CSFs when H is a complete bipartite graph, in particular proving that the conjecture from (arXiv:2011.06063) about p-monotonicity of ω(X_G^H) for H a star holds as long as H is sufficiently large compared to G. We also show that the self-CSFs of complete multipartite graphs form a basis for the ring Λ of symmetric functions, and we give some construction of bases for the vector space Λ^n of degree n symmetric functions using H-CSFs X_G^H where H is a fixed graph that is not a complete graph, answering a question from (arXiv:2011.06063) about whether such bases exist. However, we show that there generally do not exist such bases with G fixed, even with loops, answering another question from (arXiv:2011.06063). We also define the H-chromatic polynomial as an analogue of the chromatic polynomial, and ask when it is the same for different graphs.

  • 2 authors
·
Nov 11

SE(3) Diffusion Model-based Point Cloud Registration for Robust 6D Object Pose Estimation

In this paper, we introduce an SE(3) diffusion model-based point cloud registration framework for 6D object pose estimation in real-world scenarios. Our approach formulates the 3D registration task as a denoising diffusion process, which progressively refines the pose of the source point cloud to obtain a precise alignment with the model point cloud. Training our framework involves two operations: An SE(3) diffusion process and an SE(3) reverse process. The SE(3) diffusion process gradually perturbs the optimal rigid transformation of a pair of point clouds by continuously injecting noise (perturbation transformation). By contrast, the SE(3) reverse process focuses on learning a denoising network that refines the noisy transformation step-by-step, bringing it closer to the optimal transformation for accurate pose estimation. Unlike standard diffusion models used in linear Euclidean spaces, our diffusion model operates on the SE(3) manifold. This requires exploiting the linear Lie algebra se(3) associated with SE(3) to constrain the transformation transitions during the diffusion and reverse processes. Additionally, to effectively train our denoising network, we derive a registration-specific variational lower bound as the optimization objective for model learning. Furthermore, we show that our denoising network can be constructed with a surrogate registration model, making our approach applicable to different deep registration networks. Extensive experiments demonstrate that our diffusion registration framework presents outstanding pose estimation performance on the real-world TUD-L, LINEMOD, and Occluded-LINEMOD datasets.

  • 5 authors
·
Oct 26, 2023

Ground State Preparation via Dynamical Cooling

Quantum algorithms for probing ground-state properties of quantum systems require good initial states. Projection-based methods such as eigenvalue filtering rely on inputs that have a significant overlap with the low-energy subspace, which can be challenging for large, strongly-correlated systems. This issue has motivated the study of physically-inspired dynamical approaches such as thermodynamic cooling. In this work, we introduce a ground-state preparation algorithm based on the simulation of quantum dynamics. Our main insight is to transform the Hamiltonian by a shifted sign function via quantum signal processing, effectively mapping eigenvalues into positive and negative subspaces separated by a large gap. This automatically ensures that all states within each subspace conserve energy with respect to the transformed Hamiltonian. Subsequent time-evolution with a perturbed Hamiltonian induces transitions to lower-energy states while preventing unwanted jumps to higher energy states. The approach does not rely on a priori knowledge of energy gaps and requires no additional qubits to model a bath. Furthermore, it makes mathcal{O}(d^{,3/2}/epsilon) queries to the time-evolution operator of the system and mathcal{O}(d^{,3/2}) queries to a block-encoding of the perturbation, for d cooling steps and an epsilon-accurate energy resolution. Our results provide a framework for combining quantum signal processing and Hamiltonian simulation to design heuristic quantum algorithms for ground-state preparation.

  • 4 authors
·
Apr 8, 2024

Subequivariant Graph Reinforcement Learning in 3D Environments

Learning a shared policy that guides the locomotion of different agents is of core interest in Reinforcement Learning (RL), which leads to the study of morphology-agnostic RL. However, existing benchmarks are highly restrictive in the choice of starting point and target point, constraining the movement of the agents within 2D space. In this work, we propose a novel setup for morphology-agnostic RL, dubbed Subequivariant Graph RL in 3D environments (3D-SGRL). Specifically, we first introduce a new set of more practical yet challenging benchmarks in 3D space that allows the agent to have full Degree-of-Freedoms to explore in arbitrary directions starting from arbitrary configurations. Moreover, to optimize the policy over the enlarged state-action space, we propose to inject geometric symmetry, i.e., subequivariance, into the modeling of the policy and Q-function such that the policy can generalize to all directions, improving exploration efficiency. This goal is achieved by a novel SubEquivariant Transformer (SET) that permits expressive message exchange. Finally, we evaluate the proposed method on the proposed benchmarks, where our method consistently and significantly outperforms existing approaches on single-task, multi-task, and zero-shot generalization scenarios. Extensive ablations are also conducted to verify our design. Code and videos are available on our project page: https://alpc91.github.io/SGRL/.

  • 4 authors
·
May 30, 2023

Frame Averaging for Invariant and Equivariant Network Design

Many machine learning tasks involve learning functions that are known to be invariant or equivariant to certain symmetries of the input data. However, it is often challenging to design neural network architectures that respect these symmetries while being expressive and computationally efficient. For example, Euclidean motion invariant/equivariant graph or point cloud neural networks. We introduce Frame Averaging (FA), a general purpose and systematic framework for adapting known (backbone) architectures to become invariant or equivariant to new symmetry types. Our framework builds on the well known group averaging operator that guarantees invariance or equivariance but is intractable. In contrast, we observe that for many important classes of symmetries, this operator can be replaced with an averaging operator over a small subset of the group elements, called a frame. We show that averaging over a frame guarantees exact invariance or equivariance while often being much simpler to compute than averaging over the entire group. Furthermore, we prove that FA-based models have maximal expressive power in a broad setting and in general preserve the expressive power of their backbone architectures. Using frame averaging, we propose a new class of universal Graph Neural Networks (GNNs), universal Euclidean motion invariant point cloud networks, and Euclidean motion invariant Message Passing (MP) GNNs. We demonstrate the practical effectiveness of FA on several applications including point cloud normal estimation, beyond 2-WL graph separation, and n-body dynamics prediction, achieving state-of-the-art results in all of these benchmarks.

  • 7 authors
·
Oct 7, 2021

Approximately Piecewise E(3) Equivariant Point Networks

Integrating a notion of symmetry into point cloud neural networks is a provably effective way to improve their generalization capability. Of particular interest are E(3) equivariant point cloud networks where Euclidean transformations applied to the inputs are preserved in the outputs. Recent efforts aim to extend networks that are E(3) equivariant, to accommodate inputs made of multiple parts, each of which exhibits local E(3) symmetry. In practical settings, however, the partitioning into individually transforming regions is unknown a priori. Errors in the partition prediction would unavoidably map to errors in respecting the true input symmetry. Past works have proposed different ways to predict the partition, which may exhibit uncontrolled errors in their ability to maintain equivariance to the actual partition. To this end, we introduce APEN: a general framework for constructing approximate piecewise-E(3) equivariant point networks. Our primary insight is that functions that are equivariant with respect to a finer partition will also maintain equivariance in relation to the true partition. Leveraging this observation, we propose a design where the equivariance approximation error at each layers can be bounded solely in terms of (i) uncertainty quantification of the partition prediction, and (ii) bounds on the probability of failing to suggest a proper subpartition of the ground truth one. We demonstrate the effectiveness of APEN using two data types exemplifying part-based symmetry: (i) real-world scans of room scenes containing multiple furniture-type objects; and, (ii) human motions, characterized by articulated parts exhibiting rigid movement. Our empirical results demonstrate the advantage of integrating piecewise E(3) symmetry into network design, showing a distinct improvement in generalization compared to prior works for both classification and segmentation tasks.

  • 4 authors
·
Feb 13, 2024

Accelerating the Search for Superconductors Using Machine Learning

Prediction of critical temperature (T_c) of a superconductor remains a significant challenge in condensed matter physics. While the BCS theory explains superconductivity in conventional superconductors, there is no framework to predict T_c of unconventional, higher T_{c} superconductors. Quantum Structure Diagrams (QSD) were successful in establishing structure-property relationship for superconductors, quasicrystals, and ferroelectric materials starting from chemical composition. Building on the QSD ideas, we demonstrate that the principal component analysis of superconductivity data uncovers the clustering of various classes of superconductors. We use machine learning analysis and cleaned databases of superconductors to develop predictive models of T_c of a superconductor using its chemical composition. Earlier studies relied on datasets with inconsistencies, leading to suboptimal predictions. To address this, we introduce a data-cleaning workflow to enhance the statistical quality of superconducting databases by eliminating redundancies and resolving inconsistencies. With this improvised database, we apply a supervised machine learning framework and develop a Random Forest model to predict superconductivity and T_c as a function of descriptors motivated from Quantum Structure Diagrams. We demonstrate that this model generalizes effectively in reasonably accurate prediction of T_{c} of compounds outside the database. We further employ our model to systematically screen materials across materials databases as well as various chemically plausible combinations of elements and predict Tl_{5}Ba_{6}Ca_{6}Cu_{9}O_{29} to exhibit superconductivity with a T_{c} sim 105 K. Being based on the descriptors used in QSD's, our model bypasses structural information and predicts T_{c} merely from the chemical composition.

  • 2 authors
·
May 17

Multi-property directed generative design of inorganic materials through Wyckoff-augmented transfer learning

Accelerated materials discovery is an urgent demand to drive advancements in fields such as energy conversion, storage, and catalysis. Property-directed generative design has emerged as a transformative approach for rapidly discovering new functional inorganic materials with multiple desired properties within vast and complex search spaces. However, this approach faces two primary challenges: data scarcity for functional properties and the multi-objective optimization required to balance competing tasks. Here, we present a multi-property-directed generative framework designed to overcome these limitations and enhance site symmetry-compliant crystal generation beyond P1 (translational) symmetry. By incorporating Wyckoff-position-based data augmentation and transfer learning, our framework effectively handles sparse and small functional datasets, enabling the generation of new stable materials simultaneously conditioned on targeted space group, band gap, and formation energy. Using this approach, we identified previously unknown thermodynamically and lattice-dynamically stable semiconductors in tetragonal, trigonal, and cubic systems, with bandgaps ranging from 0.13 to 2.20 eV, as validated by density functional theory (DFT) calculations. Additionally, we assessed their thermoelectric descriptors using DFT, indicating their potential suitability for thermoelectric applications. We believe our integrated framework represents a significant step forward in generative design of inorganic materials.

  • 6 authors
·
Mar 20

Grad DFT: a software library for machine learning enhanced density functional theory

Density functional theory (DFT) stands as a cornerstone method in computational quantum chemistry and materials science due to its remarkable versatility and scalability. Yet, it suffers from limitations in accuracy, particularly when dealing with strongly correlated systems. To address these shortcomings, recent work has begun to explore how machine learning can expand the capabilities of DFT; an endeavor with many open questions and technical challenges. In this work, we present Grad DFT: a fully differentiable JAX-based DFT library, enabling quick prototyping and experimentation with machine learning-enhanced exchange-correlation energy functionals. Grad DFT employs a pioneering parametrization of exchange-correlation functionals constructed using a weighted sum of energy densities, where the weights are determined using neural networks. Moreover, Grad DFT encompasses a comprehensive suite of auxiliary functions, notably featuring a just-in-time compilable and fully differentiable self-consistent iterative procedure. To support training and benchmarking efforts, we additionally compile a curated dataset of experimental dissociation energies of dimers, half of which contain transition metal atoms characterized by strong electronic correlations. The software library is tested against experimental results to study the generalization capabilities of a neural functional across potential energy surfaces and atomic species, as well as the effect of training data noise on the resulting model accuracy.

  • 5 authors
·
Sep 22, 2023

Symmetries and Asymptotically Flat Space

The construction of a theory of quantum gravity is an outstanding problem that can benefit from better understanding the laws of nature that are expected to hold in regimes currently inaccessible to experiment. Such fundamental laws can be found by considering the classical counterparts of a quantum theory. For example, conservation laws in a quantum theory often stem from conservation laws of the corresponding classical theory. In order to construct such laws, this thesis is concerned with the interplay between symmetries and conservation laws of classical field theories and their application to asymptotically flat spacetimes. This work begins with an explanation of symmetries in field theories with a focus on variational symmetries and their associated conservation laws. Boundary conditions for general relativity are then formulated on three-dimensional asymptotically flat spacetimes at null infinity using the method of conformal completion. Conserved quantities related to asymptotic symmetry transformations are derived and their properties are studied. This is done in a manifestly coordinate independent manner. In a separate step a coordinate system is introduced, such that the results can be compared to existing literature. Next, asymptotically flat spacetimes which contain both future as well as past null infinity are considered. Asymptotic symmetries occurring at these disjoint regions of three-dimensional asymptotically flat spacetimes are linked and the corresponding conserved quantities are matched. Finally, it is shown how asymptotic symmetries lead to the notion of distinct Minkowski spaces that can be differentiated by conserved quantities.

  • 1 authors
·
Mar 16, 2020

On Neural Differential Equations

The conjoining of dynamical systems and deep learning has become a topic of great interest. In particular, neural differential equations (NDEs) demonstrate that neural networks and differential equation are two sides of the same coin. Traditional parameterised differential equations are a special case. Many popular neural network architectures, such as residual networks and recurrent networks, are discretisations. NDEs are suitable for tackling generative problems, dynamical systems, and time series (particularly in physics, finance, ...) and are thus of interest to both modern machine learning and traditional mathematical modelling. NDEs offer high-capacity function approximation, strong priors on model space, the ability to handle irregular data, memory efficiency, and a wealth of available theory on both sides. This doctoral thesis provides an in-depth survey of the field. Topics include: neural ordinary differential equations (e.g. for hybrid neural/mechanistic modelling of physical systems); neural controlled differential equations (e.g. for learning functions of irregular time series); and neural stochastic differential equations (e.g. to produce generative models capable of representing complex stochastic dynamics, or sampling from complex high-dimensional distributions). Further topics include: numerical methods for NDEs (e.g. reversible differential equations solvers, backpropagation through differential equations, Brownian reconstruction); symbolic regression for dynamical systems (e.g. via regularised evolution); and deep implicit models (e.g. deep equilibrium models, differentiable optimisation). We anticipate this thesis will be of interest to anyone interested in the marriage of deep learning with dynamical systems, and hope it will provide a useful reference for the current state of the art.

  • 1 authors
·
Feb 4, 2022

Strong pairing and symmetric pseudogap metal in double Kondo lattice model: from nickelate superconductor to tetralayer optical lattice

In this work, we propose and study a double Kondo lattice model which hosts robust superconductivity. The system consists of two identical Kondo lattice model, each with Kondo coupling J_K within each layer, while the localized spin moments are coupled together via an inter-layer on-site antiferromagnetic spin coupling J_perp. We consider the strong J_perp limit, wherein the local moments tend to form rung singlets and are thus gapped. However, the Kondo coupling J_K transmits the inter-layer entanglement between the local moments to the itinerant electrons. Consequently, the itinerant electrons experience a strong inter-layer antiferromangetic spin coupling and form strong inter-layer pairing, which is confirmed through numerical simulation in one dimensional system. Experimentally, the J_K rightarrow -infty limits of the model describes the recently found bilayer nickelate La_3Ni_2O_7, while the J_K>0 side can be realized in tetralayer optical lattice of cold atoms. Two extreme limits, J_K rightarrow -infty and J_K rightarrow +infty limit are shown to be simplified to a bilayer type II t-J model and a bilayer one-orbital t-J model, respectively. Thus, our double Kondo lattice model offers a unified framework for nickelate superconductor and tetralayer optical lattice quantum simulator upon changing the sign of J_K. We highlight both the qualitative similarity and the quantitative difference in the two sides of J_K. Finally, we discuss the possibility of a symmetric Kondo breakdown transition in the model with a symmetric pseudogap metal corresponding to the usual heavy Fermi liquid.

  • 3 authors
·
Aug 2, 2024

Improving equilibrium propagation without weight symmetry through Jacobian homeostasis

Equilibrium propagation (EP) is a compelling alternative to the backpropagation of error algorithm (BP) for computing gradients of neural networks on biological or analog neuromorphic substrates. Still, the algorithm requires weight symmetry and infinitesimal equilibrium perturbations, i.e., nudges, to estimate unbiased gradients efficiently. Both requirements are challenging to implement in physical systems. Yet, whether and how weight asymmetry affects its applicability is unknown because, in practice, it may be masked by biases introduced through the finite nudge. To address this question, we study generalized EP, which can be formulated without weight symmetry, and analytically isolate the two sources of bias. For complex-differentiable non-symmetric networks, we show that the finite nudge does not pose a problem, as exact derivatives can still be estimated via a Cauchy integral. In contrast, weight asymmetry introduces bias resulting in low task performance due to poor alignment of EP's neuronal error vectors compared to BP. To mitigate this issue, we present a new homeostatic objective that directly penalizes functional asymmetries of the Jacobian at the network's fixed point. This homeostatic objective dramatically improves the network's ability to solve complex tasks such as ImageNet 32x32. Our results lay the theoretical groundwork for studying and mitigating the adverse effects of imperfections of physical networks on learning algorithms that rely on the substrate's relaxation dynamics.

  • 2 authors
·
Sep 5, 2023

PropMolFlow: Property-guided Molecule Generation with Geometry-Complete Flow Matching

Molecule generation is advancing rapidly in chemical discovery and drug design. Flow matching methods have recently set the state of the art (SOTA) in unconditional molecule generation, surpassing score-based diffusion models. However, diffusion models still lead in property-guided generation. In this work, we introduce PropMolFlow, a novel approach for property-guided molecule generation based on geometry-complete SE(3)-equivariant flow matching. Integrating five different property embedding methods with a Gaussian expansion of scalar properties, PropMolFlow outperforms previous SOTA diffusion models in conditional molecule generation across various properties while preserving the stability and validity of the generated molecules, consistent with its unconditional counterpart. Additionally, it enables faster inference with significantly fewer time steps compared to baseline models. We highlight the importance of validating the properties of generated molecules through DFT calculations performed at the same level of theory as the training data. Specifically, our analysis identifies properties that require DFT validation and others where a pretrained SE(3) geometric vector perceptron regressors provide sufficiently accurate predictions on generated molecules. Furthermore, we introduce a new property metric designed to assess the model's ability to propose molecules with underrepresented property values, assessing its capacity for out-of-distribution generalization. Our findings reveal shortcomings in existing structural metrics, which mistakenly validate open-shell molecules or molecules with invalid valence-charge configurations, underscoring the need for improved evaluation frameworks. Overall, this work paves the way for developing targeted property-guided generation methods, enhancing the design of molecular generative models for diverse applications.

  • 9 authors
·
May 27

Crystal Diffusion Variational Autoencoder for Periodic Material Generation

Generating the periodic structure of stable materials is a long-standing challenge for the material design community. This task is difficult because stable materials only exist in a low-dimensional subspace of all possible periodic arrangements of atoms: 1) the coordinates must lie in the local energy minimum defined by quantum mechanics, and 2) global stability also requires the structure to follow the complex, yet specific bonding preferences between different atom types. Existing methods fail to incorporate these factors and often lack proper invariances. We propose a Crystal Diffusion Variational Autoencoder (CDVAE) that captures the physical inductive bias of material stability. By learning from the data distribution of stable materials, the decoder generates materials in a diffusion process that moves atomic coordinates towards a lower energy state and updates atom types to satisfy bonding preferences between neighbors. Our model also explicitly encodes interactions across periodic boundaries and respects permutation, translation, rotation, and periodic invariances. We significantly outperform past methods in three tasks: 1) reconstructing the input structure, 2) generating valid, diverse, and realistic materials, and 3) generating materials that optimize a specific property. We also provide several standard datasets and evaluation metrics for the broader machine learning community.

  • 5 authors
·
Oct 12, 2021

PLaID++: A Preference Aligned Language Model for Targeted Inorganic Materials Design

Discovering novel materials is critical for technological advancements such as solar cells, batteries, and carbon capture. However, the development of new materials is constrained by a slow and expensive trial-and-error process. To accelerate this pipeline, we introduce PLaID++, a Large Language Model (LLM) fine-tuned for stable and property-guided crystal generation. We fine-tune Qwen-2.5 7B to generate crystal structures using a novel Wyckoff-based text representation. We show that generation can be effectively guided with a reinforcement learning technique based on Direct Preference Optimization (DPO), with sampled structures categorized by their stability, novelty, and space group. By encoding symmetry constraints directly into text and guiding model outputs towards desirable chemical space, PLaID++ generates structures that are thermodynamically stable, unique, and novel at a sim50\% greater rate than prior methods and conditionally generates structures with desired space group properties. Our experiments highlight the effectiveness of iterative DPO, achieving sim115\% and sim50\% improvements in unconditional and space group conditioned generation, respectively, compared to fine-tuning alone. Our work demonstrates the potential of adapting post-training techniques from natural language processing to materials design, paving the way for targeted and efficient discovery of novel materials.

  • 5 authors
·
Sep 8