Mathematics and Computer Science for Materials Innovation 2023:
4th MACSMIN focuses on continuous crystallography and geometry of proteins
 Dates : 22-25 May 2023 in a hybrid form at Liverpool Materials Innovation Factory, UK.
 Conference organizer : Vitaliy Kurlin. To participate for free, please email Vitaliy Kurlin.
 All talks will be aimed at a broad audience of scientists, see the current program.
 Travel information for Liverpool (UK): venue, accommodation, trains, flights.
 Past meetings of the MACSMIN conference series: 2022, 2021, 2020.
 Invited speakers who came in person
 Elspeth Garman (University of Oxford, UK)
 Nicholas Kotov (University of Michigan, US)
 Simon Billinge (Columbia University, US)
 Benedict Leimkuhler (Edinburgh University, UK)
 Invited speakers who gave zoom talks
 Stephen K. Burley (Director of the RCSB Protein Data Bank, Rutgers University and UCSD)
 Nikolai Dolbilin (Steklov Mathematical Institute)
 Marjorie Senechal (Smith College, US)
 Michele Ceriotti (EPFL, Switzerland)
 Henry Adams (Colorado State University, US)
 Daniel Schwalbe-Koda (Lawrence Livermore National Lab, US)
 Tutorials about Geometric Data Science
studying geometry on moduli spaces of finite and periodic structures:
 Vitaliy Kurlin. Introduction to Continuous Crystallography and Geometry of Proteins.
 Dan Widdowson. Continuous invariant-based maps of the crystal universe.
 Matt Bright. Continuous chiral distances for 2-dimensional lattices.
 We invited speakers to participate in person and had a modest budget to cover travel and accommodation of invited speakers. However, since many colleagues had other important commitments, invited talks over Zoom were also arranged.
 MACSMIN has the MIF++ scientific style and encourages rigorous results justified by proofs, not only by examples.
Travel information : venue, accommodation, trains, flights
 All talks on Monday 22nd May were in the ground floor boardroom in the Materials Innovation Factory (MIF), Liverpool. Address : 51 Oxford Street, building 807 in the grid cell F5 on the campus map. The building has a secure entrance, so we will let the reception know about MACSMIN participants. The MIF is 15 min on foot from the Liverpool Lime Street station.
 We usually book accommodation in the Liner hotel in a quiet street close to the Liverpool Lime Street main train station. There are many other good hotels and attractions: visit Liverpool. The conference dinner is planned in the Liner hotel.
 The city has the Liverpool John Lennon airport with convenient buses to the centre. The larger Manchester airport has a train station with direct 90-min trains to the Liverpool Lime Street station. Check flights to nearby airports at Skyscanner.
Program : Monday 22 May Tuesday 23 May Wednesday 24 May Thursday 25 May

Monday 22 May 2023 : all UK times (most talks in person)
Location: boardroom on the ground floor of the MIF, building 807 in the grid cell F5 on the campus map.
 8.50-9.00 Brief opening by Vitaliy Kurlin (Liverpool Materials Innovation Factory, UK).
 9.00-9.50
Elspeth Garman (University of Oxford, UK)
Video (50 min)
Title. Quantifying and comparing radiation damage in the Protein Data Bank.
Abstract (pdf). Structural biology relies on X-ray crystallography to provide much of the three-dimensional information on proteins and other macromolecules that informs biological function [1], but radiation damage to the samples remains one of the major bottlenecks to accurate structure determination. The radiation damage can manifest as 'global' changes resulting in the fading of the diffraction pattern with increasing dose, or as 'specific' structural and chemical changes in the protein structures obtained. It is hence an important consideration when assessing the quality and biological veracity of crystal structures in repositories such as the Protein Data Bank (PDB). However, detection of radiation damage artefacts has traditionally proved very challenging. To address this, we have introduced the Bnet metric. Bnet summarises in a single value the extent of damage suffered by a crystal structure by comparing the B-factor values of damage-prone and non-damage-prone atoms in a similar local environment. After validating that Bnet successfully detects damage in 23 different crystal structures previously characterised as damaged, we have calculated Bnet values for 93,978 PDB crystal structures. Our metric highlights a range of damage features, many of which would remain unidentified by the other summary statistics typically calculated for PDB structures [2].
[1] EF Garman (2014) Developments in X-ray Crystallographic Structure Determination of Biological Macromolecules. Science 343: 1102-1108
[2] KL Shelley & EF Garman (2022) Quantifying radiation damage in the Protein Data Bank. Nature Communications 13:1314 1325.
 10.00-10.50
Nicholas Kotov (University of Michigan, US)
Video (51 min)
Title. Graph Theoretical Engineering of Nanostructures.
 11.00-11.50
Simon J. L. Billinge (Columbia University, US)
Title. Do materials have a genome, and if they do, what can we do with it?
Abstract. The materials genome initiative (MGI) is a US government initiative from 2011 to speed up, and lower the cost of, the discovery of new advanced materials, and their deployment in technologies, by applying data analytic methods inspired by biological genomics. Of course, materials do not have a genome. However, if we generalize the concept of a genome as a 1D discrete quantity that codes for the 3-dimensional arrangement of atoms, then we do have quantities that could serve as this generalized genome, for example, the atomic pair distribution function, or PDF. This is a quantity that is experimentally accessible through diffraction measurements and is a powerful and widely used approach for characterizing material structure on the nanoscale, which is a particularly challenging but important problem in advanced materials. As well as being experimentally accessible, the PDF can be computed from a structure by making interatomic vectors between all the atoms in the structure and then making a histogram of their lengths, weighted by the average scattering power of each atom. This is closely related to the quantity you get by making a Čech filtration of a point cloud. The PDF does not guarantee a unique 3D structure, and as pointed out by Vitaliy Kurlin, higher order correlation functions are needed to do that. Nonetheless, the PDF is experimentally accessible and often sufficient to give unique 3D embeddings of the graph. In this talk I introduce the PDF and describe what is possible to do with it, and speculate on how this "genomic-like" quantity may be used in materials discovery.
 12.15-13.45 Lunch at the Victoria Gallery and Museum waterhouse cafe
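The histogram construction described in Simon Billinge's abstract above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name `pair_distance_histogram` is hypothetical, each pair is weighted by the product of two per-atom weights (one simple reading of "weighted by the scattering power"), and the normalisation, density subtraction and instrumental broadening used for experimental PDFs are omitted.

```python
import math
from itertools import combinations

def pair_distance_histogram(points, weights, r_max, bin_width):
    """Weighted histogram of interatomic distances below r_max.

    points  : list of 3D coordinates (tuples of floats)
    weights : per-atom weights standing in for scattering power (assumption)
    """
    n_bins = int(math.ceil(r_max / bin_width))
    hist = [0.0] * n_bins
    # Visit every unordered pair of atoms once.
    for (i, p), (j, q) in combinations(enumerate(points), 2):
        r = math.dist(p, q)  # length of the interatomic vector
        if r < r_max:
            idx = min(int(r / bin_width), n_bins - 1)
            hist[idx] += weights[i] * weights[j]
    return hist

# Toy structure: four equal atoms at the corners of a unit square.
square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
print(pair_distance_histogram(square, [1.0] * 4, r_max=2.0, bin_width=0.25))
```

For the unit square, the four sides of length 1 fall into the bin [1.00, 1.25) and the two diagonals of length about 1.414 fall into the bin [1.25, 1.50).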
 14.00-14.50
Mihaly Varadi
(EMBL's European Bioinformatics Institute, Cambridge, UK) Video (44 min)
Title. Navigating the protein universe: An insight into the PDBe-KB and its role in facilitating structural biology research.
Abstract. The recent advancements in artificial intelligence (AI) technology have played a pivotal role in revolutionising protein structure prediction, leading to an unprecedented influx of predicted models. Today, databases such as AlphaFold DB and ESM Atlas host over 800 million structures, a staggering increase from the 200,000 experimentally determined structures available in the Protein Data Bank. This monumental surge of data has mobilised an increasing number of researchers to leverage protein structures in addressing complex biological problems.
This presentation will centre around the Protein Data Bank in Europe - Knowledge Base (PDBe-KB), a community-oriented platform engineered to facilitate the scientific community's exploration and comprehension of macromolecular structures amid the enormous volume of data. We will deliver an expansive overview of PDBe-KB, focusing on its various data services to promote the utilisation and biological interpretation of protein structures.
Further, we will delve into the various functional and biophysical annotations provided by the PDBe-KB consortium members, enriching the value of these molecular structures. A key feature of our discussion will be the 3D-Beacons Network, established and upheld under the PDBe-KB umbrella. As a federated network of molecular structure providers, the 3D-Beacons Network enables users to conveniently locate both computationally predicted and experimentally determined protein structures with their functional annotations, enhancing the accessibility of this vast wealth of structural data.
 15.00-15.50
Vitaliy Kurlin (Liverpool MIF) Video (51 min)
Title. Continuous Crystallography and Geometry of Proteins.
Abstract. We overview advances in Geometric Data Science studying finite and periodic structures modulo rigid motion.
First, the distance-based invariants of periodic crystals based on our work at NeurIPS 2022 and ICML 2023 established a new branch of Continuous Crystallography. Dan Widdowson's talk on Tuesday demonstrates continuous maps of large crystal datasets such as the Cambridge Structural Database (CSD) and the Crystallography Open Database (COD).
Second, invariant-based maps of 2-dimensional lattices from Acta Cryst A are now complemented by chiral distances to continuously quantify deviations from higher symmetry, see Matt Bright's talk. Third, the CVPR 2023 paper defined complete and continuous invariants for finite clouds of unlabeled points in Euclidean spaces, which will be presented later.
Fourth, 325+ billion comparisons of complete invariants of 800+ thousand protein chains from the Protein Data Bank (PDB), completed in less than two days on a modest desktop computer, detected thousands of duplicate chains in which all alpha-carbon atoms have identical x,y,z coordinates to the last digit. These results will be presented later by Alexey Gorelov at MIF++.
 16.00-16.50
Matt Bright (Liverpool MIF) Video (51 min)
Title. Continuous chiral distances for 2-dimensional lattices.
Abstract. Recent discoveries in materials science have emphasized the connection between structural symmetry (or its absence) and the physical properties of materials [1]. From complete, continuous invariants of 2D lattices proved in [2] we develop chiral distances: real-valued quantities that measure a continuous distance between a general lattice and its nearest higher symmetry neighbour [3]. We apply these distances to real two-dimensional lattices arising from two publicly available databases of potential sources for monolayer materials, 2DMatPedia [4] and the 2D Materials Database [5], and show how the results provide a practical route to isolating stable asymmetric monolayers and illustrate the behaviour of monolayer materials when isolated from the bulk crystal.
References
[1] Ma, W. et al. Chiral inorganic nanostructures. Chemical Reviews (2017) 117, 8041.
[2] Kurlin, V. Mathematics of 2-dimensional lattices. Foundations of Computational Mathematics, 2022.
[3] Bright, M., Cooper, A., Kurlin, V. Geographic-style maps for 2-dimensional lattices. Acta Cryst A (2023) 79A, 1-13.
[4] Zhou, J. et al. 2DMatPedia, an open computational database of two-dimensional materials from top-down and bottom-up approaches. Scientific Data 6 (2019), 669.
[5] Mounet, N. et al. Two-dimensional materials from high-throughput computational exfoliation of experimentally known compounds. Materials Cloud Archive 2020.158 (2020).
 18.00-21.00
Dinner at Seven Seas Brasserie, the Liner hotel

Tuesday 23 May 2023 : all UK times (in person talks in the morning)
Location: VR suite (room 2/066) in the Chemistry department, building 213 in the grid cell G5 on the campus map.
 9.00-9.50
Benedict Leimkuhler (Edinburgh University, UK)
Video (65 min)
Title. Convergence of dissipative and stochastic algorithms and applications in molecular modelling and statistics.
Abstract. I will discuss the convergence of various schemes for optimization and sampling. Algorithms for these tasks are frequently developed as discretizations of stochastic or dissipative systems which exploit knowledge of gradients to improve convergence. In the first part of my talk, I will discuss the design of splitting methods for Langevin dynamics, including accuracy and contractivity properties as defined by stepsize and collision rate. I will also mention extensions of discretized Langevin dynamics to constrained systems and systems with reflecting boundaries. In the second part of the talk I will focus on dissipative dynamics and its use as a local (or global) optimization framework. I will show that we can dramatically improve the stability and efficiency of dissipated Hamiltonian dynamics by re-envisioning the dissipation mechanism.
 10.00-10.50
Dan Widdowson (Liverpool MIF)
Video (48 min)
Title. Continuous invariant-based maps of the crystal universe.
Abstract. The work in MATCH 2022 and NeurIPS 2022 developed continuous isometry invariants of periodic structures that have distinguished all periodic crystals in the Cambridge Structural Database (CSD), also finding a few pairs of geometric duplicates, where one atom was inexplicably replaced with a different one (Cd with Mn in the pair HIFCAB vs JEPLIA). The resulting Crystal Isometry Principle (CRISP) justified a continuous map of the Crystal Isometry Space (CRIS) including all known and not yet discovered periodic crystals parametrized by complete invariants. This talk will present interactive maps for large datasets including the CSD and Crystallography Open Database (COD) in invariant coordinates.
 11.00-11.50 Discussion of open problems
 12.15-13.45 Lunch at the Victoria Gallery and Museum waterhouse cafe
 15.00-15.50 Alexander Kozachinskiy (Millennium Institute Foundational Research on Data, Chile)
Title. Three iterations of (d-1)-WL test distinguish non-isometric clouds of d-dimensional points. Video (48 min)
Abstract. The Weisfeiler-Lehman (WL) test is a fundamental iterative algorithm for checking isomorphism of graphs. It has also been observed that it underlies the design of several graph neural network architectures, whose capabilities and performance can be understood in terms of the expressive power of this test. Motivated by recent developments in machine learning applications to datasets involving three-dimensional objects, we study when the WL test is complete for clouds of Euclidean points represented by complete distance graphs, i.e., when it can distinguish, up to isometry, any arbitrary such point cloud. Our main result states that the (d-1)-dimensional WL test is complete for point clouds in d-dimensional Euclidean space, for any d>1, and that only three iterations of the test suffice. Our result is tight for d = 2,3. We also observe that the d-dimensional WL test only requires one iteration to achieve completeness. Joint work with Valentino Delle Rose, Cristóbal Rojas, Mircea Petrache and Pablo Barceló.
 16.00-16.50
Daniel Schwalbe-Koda (Lawrence Livermore National Lab, US)
Video (51 min)
Title. Representation learning for materials transformations, synthesis, and simulations.
Abstract. Data-driven approaches are increasingly inseparable from materials design processes. Extracting knowledge from materials data requires connecting physical phenomena to mathematical representations that allow for explainable predictions. In this talk, I will discuss how representation learning can guide the understanding of materials synthesis, transformations, and simulations. Using nanoporous materials as an example, I will describe how graph theory and geometric features explain synthesis and phase transformation effects of materials with controlled catalytic behavior. Next, moving towards learnable representations, I will explore how differentiable sampling strategies enable more robust machine learning-driven materials simulations. Finally, I will explain this interplay between representation, model robustness, and efficiency by connecting deep learning concepts to thermodynamic metrics. Prepared by LLNL under Contract DE-AC52-07NA27344.
Bio: Daniel Schwalbe-Koda is an incoming Assistant Professor of Materials Science and Engineering at UCLA. Currently, he holds the Lawrence Fellowship at the Lawrence Livermore National Laboratory, California. He obtained a PhD in Materials Science and Engineering from MIT in 2022, and a Master’s in Physics and a Bachelor’s in Electrical Engineering from the Aeronautics Institute of Technology, Brazil, in 2017.

Wednesday 24 May 2023 : all UK times (all talks online)
 10.00-10.50
Milo Torda (Slovakia and University of Liverpool, UK) Video (52 min)
Title. Symmetries of maximally dense plane group packings of regular convex polygons.
Abstract. Geometric packings are extensively studied in discrete and computational geometry, with broad applications in solid-state physics modeling, materials science, and biophysics. The general packing problem asks how we can arrange identical copies of a compact subset of n-dimensional Euclidean space to maximize the ratio of filled space to the entire space. In this talk, we focus on a subproblem of the general packing problem by limiting the configuration space to isomorphism classes of two-dimensional Crystallographic Symmetry Groups (plane groups). These are discrete groups of isometries of the two-dimensional Euclidean space containing a lattice subgroup, leading to the associated plane group packing problem. We explore the symmetries of the densest plane group packing configurations for various regular convex polygons (n-gons) in all 17 isomorphism classes obtained experimentally and propose conjectures about the densest plane group packings for all n-gons. Using this information, we calculate the algebraic values of the densest packings of the disc based on the Archimedean tilings. This talk is based on the research published in Physical Review E 106 (2022), 054603 by M. Torda, J. Y. Goulermas, V. Kurlin, and G. M. Day.
 11.00-11.50 Artur Bille (University of Ulm, Germany) Video (49 min)
Title. Random eigenvalues of graphenes and the triangulation of the plane.
Abstract (pdf). We analyse the numbers of closed paths of length k on two important regular lattices: the hexagonal lattice (also called graphene in chemistry) and its dual triangular lattice. These numbers form a moment sequence of specific random variables connected to the distance of a position of a planar random flight (in three steps) to the origin. Here, we call such a random variable a random eigenvalue of the underlying lattice. Explicit formulae for the probability density and characteristic functions of these random eigenvalues are given for both the hexagonal and the triangular lattice. Furthermore, it is proven that both probability distributions can be approximated by a functional of the random variable uniformly distributed on increasing intervals [0,b] as b tends to infinity. This yields a simple way to simulate these random eigenvalues without generating graphene and triangular lattice graphs. To show that approximation, we first prove an interesting integral identity for a specific series containing the third powers of the modified Bessel functions of the nth order. Such series play a crucial role in many contexts, in particular, in analysis, combinatorics and theoretical physics.
 12.15-13.45 Lunch at the Victoria Gallery and Museum waterhouse cafe
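The counts of closed paths mentioned in Artur Bille's abstract above can be reproduced for small k by direct enumeration. The Python sketch below is a minimal illustration and not the method of the talk: it assumes that "closed paths" means closed walks (walks that may revisit vertices and end where they started), and the coordinate conventions and function names are hypothetical choices made for this example.

```python
from collections import defaultdict

# Triangular lattice in axial coordinates: each vertex has six neighbours.
TRIANGULAR_STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]

def neighbours_triangular(v):
    x, y = v
    return [(x + dx, y + dy) for dx, dy in TRIANGULAR_STEPS]

def neighbours_hexagonal(v):
    """Hexagonal lattice: unit cell (x, y) with two vertices, s = 0 or 1."""
    x, y, s = v
    if s == 0:
        return [(x, y, 1), (x - 1, y, 1), (x, y - 1, 1)]
    return [(x, y, 0), (x + 1, y, 0), (x, y + 1, 0)]

def closed_walk_counts(neighbours, origin, k_max):
    """Numbers of closed walks of length 0..k_max starting at origin."""
    counts = []
    current = {origin: 1}  # number of walks of the current length ending at each vertex
    for _ in range(k_max + 1):
        counts.append(current.get(origin, 0))
        nxt = defaultdict(int)
        for v, ways in current.items():
            for w in neighbours(v):
                nxt[w] += ways
        current = nxt
    return counts

print(closed_walk_counts(neighbours_triangular, (0, 0), 4))
print(closed_walk_counts(neighbours_hexagonal, (0, 0, 0), 6))
```

Under these assumptions the triangular lattice gives 1, 0, 6, 12, 90 for k = 0,...,4 and the hexagonal lattice gives 1, 0, 3, 0, 15, 0, 93 for k = 0,...,6. Odd lengths vanish on the hexagonal lattice because it is bipartite.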
 15.10-16.00 Henry Adams (Colorado State University, US) Video (33 min)
Title. Additive energy functions have predictable landscape topologies.
Abstract. Many properties of a chemical system are described by its energy landscape, a real-valued function defined on a high-dimensional domain. I will explain how topology, and in particular persistent homology, can be used in order to describe some of the pertinent features of an energy landscape. Whereas a merge tree encodes how connected components of an energy landscape evolve as the energy level increases, sublevelset persistent homology can also quantify the shape of these connected components. If the energy is an additive function over a product space, we use the Künneth formula to characterize the sublevelset persistent homology. As applications, we describe the sublevelset persistent homology of n-alkanes and of branched alkanes. Joint work with Aurora Clark, Biswajit Sadhu, Brittany Story.
References: https://chemrxiv.org/engage/chemrxiv/articledetails/63aa2507e9d0fd65aa2ab7f4 and https://pubs.aip.org/aip/jcp/article/154/11/114114/315565/Representationsofenergylandscapesby
 16.10-17.00 Marjorie Senechal (Smith College, US) Video (38 min)
Title. Are there five parallelohedra, or only one? Reflections on the newly discovered einstein.
Abstract. In their recent preprint [1], the authors of "An Aperiodic Monotile" show that their 13-sided "hat" polygon belongs to a continuous family of aperiodic tiles obtained from it by decreasing the lengths of its edges. This operation is a novel variant of edge-reduction, a standard tool in the theory of convex zonohedra. We show that from this point of view, Fedorov's five three-dimensional parallelohedra belong to another such continuum [2].
[1] David Smith, Joseph Myers, Craig Kaplan, and Chaim Goodman-Strauss, "An Aperiodic Monotile", posted on arxiv.org on March 20, 2023.
[2] Marjorie Senechal and Jean Taylor, "Parallelohedra, Old and New", to appear in Acta Cryst A.

Thursday 25 May 2023 : all UK times (all talks online)
 9.00-9.50 Nikolai Dolbilin (Steklov Mathematical Institute)
Title. Local Rules of Crystallinity and Groups in Delone Sets.
Abstract (pdf). Introduced by B. Delone (Delaunay) in 1934 as (r,R)-systems and called Delone sets today, these point sets are used to model atomic structures of periodic crystals, non-periodic quasicrystals, glasses, and so on.
Numerous applications of this concept are due to the simplicity and depth of its definition given by Delone. According to the definition, in a Delone set X,
(1) a real number r > 0 is such that every (open) ball of radius r contains at most one point of X;
(2) a real number R > 0 is such that every (closed) ball of radius R contains at least one point of X.
It is easy to see that Condition (1) can be reformulated as follows: the distance between any two points of the Delone set is not less than 2r. Condition (2) is equivalent to the following: the distance from an arbitrary point of space to the nearest point of X does not exceed R.
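The two reformulated conditions can be checked numerically on a finite patch of a point set. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, a finite patch of the square lattice stands in for an infinite Delone set, and the value of R is only a lower estimate because the maximum is taken over sampled query points rather than over all points of space.

```python
import math
from itertools import combinations

def packing_radius(points):
    """Largest r allowed by condition (1): half the smallest pairwise distance."""
    return min(math.dist(p, q) for p, q in combinations(points, 2)) / 2

def covering_radius_estimate(points, queries):
    """Lower estimate of the smallest R in condition (2), using sampled queries only."""
    return max(min(math.dist(y, p) for p in points) for y in queries)

# Finite patch of the square lattice Z^2 surrounding the unit cell [0,1]^2.
patch = [(float(i), float(j)) for i in range(-2, 4) for j in range(-2, 4)]
# Query points on a grid in the unit cell, including the centre (0.5, 0.5).
queries = [(i / 10, j / 10) for i in range(11) for j in range(11)]

r = packing_radius(patch)
R = covering_radius_estimate(patch, queries)
print(r, R)
```

For the square lattice the exact values are r = 1/2 and R = √2/2, attained at the centre of a unit cell, which the sampled grid contains.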
In the talk, we will discuss results on the foundations of geometric crystallography obtained in recent decades in the school of Delone. The focus of this study is the search for, and proof of, local rules that guarantee that a Delone set is a regular system (a set with a point-transitive group and therefore periodic). We will also discuss quite recent results and conjectures on local groups in arbitrary (without any additional conditions) Delone sets.
 10.00-10.50
Michele Ceriotti (EPFL, Switzerland)
Video (55 min)
Title. An equivariant representation to learn long-distance interactions.
Abstract. Machine learning models are proving to be extremely effective in predicting the properties of atomistic configurations of matter, circumventing the need for time-consuming electronic structure calculations. The most successful schemes achieve transferability by means of a local representation of structures, in which the problem of predicting a property is broken down into the prediction of local, atom-centred contributions. This approach is however not efficient in describing long-range interatomic forces, such as those arising due to electrostatics, polarization, or dispersive interactions. I will present a possible solution to this conundrum based on the long-distance equivariant (LODE) framework, that combines a local description of matter with the appropriate, long-range asymptotic behaviour of interactions.
 14.00-14.50 Stephen K. Burley (Director of the RCSB Protein Data Bank, Rutgers University and UCSD)
Title. Validation of 3D Biostructures in PDB. Video (60 min)
Abstract. In addition to exposing the various search, etc. capabilities that we support, I will spend some time explaining the technical architectural underpinnings of our recently revamped RCSB.org web site.