Dr Vitaliy Kurlin: mathematics & computer science

Data Science theory and applications. Everything is possible!

E-mail: vitaliy.kurlin(at)gmail.com, University of Liverpool, UK

The Doctoral Network in Artificial Intelligence for Future Digital Health is a doctoral training centre funded by the University of Liverpool from October 2019 to train the next generation of world-leading experts in Data Science and AI to solve data-intensive problems in healthcare.

Vision of the doctoral network

Leadership team of the network

Case studies of PhD students Daniel Widdowson and Jonathan Balasingham

Daniel Widdowson has a BSc in Mathematics from Warwick and an MSc in Computer Science from Liverpool.

Daniel's MSc thesis, supervised by Vitaliy Kurlin in summer 2020, led to the high-profile MATCH paper introducing ultra-fast isometry invariants (Average Minimum Distances) for mapping all periodic crystals.

Daniel's PhD has been supervised since October 2020 by Vitaliy Kurlin, Andy Cooper and Jason Cole.

Daniel's research in his own words: Crystal Structure Prediction (CSP) is a set of methods for predicting new crystalline materials given a molecule. The way crystals are stored by a computer is ambiguous, i.e., one crystal can be represented in many ways, so during CSP it is not possible to automatically detect and remove duplicates. Currently this is handled manually in a time-consuming filtering process.

Our work uses mathematical tools called isometry invariants to tackle this problem of ambiguity. Every crystal has an invariant which will not change if the crystal is represented differently, and similar crystals have similar invariants to account for atomic vibrations and measurement errors.
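To illustrate what "will not change if the crystal is represented differently" means, here is a minimal sketch of one invariant named on this page, the Average Minimum Distance. It assumes the k-th value is the average, over the points in a unit cell, of the distance to the k-th nearest neighbour in the whole periodic set. The function name `amd`, the `reps` parameter and the toy square lattice are illustrative assumptions, not the group's implementation, and `reps` is simply assumed to be large enough to contain the k nearest neighbours.

```python
import itertools
import math


def amd(basis, motif, k, reps=3):
    """Average distances from each motif point to its 1st..k-th nearest neighbours.

    basis: lattice vectors in Cartesian coordinates.
    motif: points of one unit cell in Cartesian coordinates.
    reps:  cell translations taken in each direction (assumed sufficient for k).
    """
    dim = len(basis)
    # Build a finite patch of the periodic set around the origin cell.
    cloud = []
    for coeffs in itertools.product(range(-reps, reps + 1), repeat=dim):
        shift = [sum(c * basis[j][a] for j, c in enumerate(coeffs)) for a in range(dim)]
        for p in motif:
            cloud.append(tuple(p[a] + shift[a] for a in range(dim)))
    rows = []
    for p in motif:
        dists = sorted(math.dist(p, q) for q in cloud)
        rows.append(dists[1:k + 1])  # skip the zero distance from p to itself
    # Average each column over the motif points.
    return [sum(row[i] for row in rows) / len(rows) for i in range(k)]


# The same square lattice stored in two different ways.
cell_a = amd([(1, 0), (0, 1)], [(0, 0)], k=6)
cell_b = amd([(2, 0), (0, 1)], [(0, 0), (1, 0)], k=6)
# A slightly perturbed version of the second representation.
shifted = amd([(2, 0), (0, 1)], [(0, 0), (1.02, 0)], k=6)
```

In this toy case `cell_a` and `cell_b` agree although the unit cells differ, while `shifted` differs from them only slightly, which mirrors the two properties described above.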

As part of the Materials Innovation Factory at the University of Liverpool, co-supervised by Professor Andy Cooper and in collaboration with the Cambridge Crystallographic Data Centre (CCDC), this work has shown impact and promise beyond crystal structure prediction. The CCDC curates the Cambridge Structural Database (CSD), a collection of over one million crystals gathered from research all over the world. Our tools searched the database for duplicates in over 200 billion comparisons, leading to five pairs of crystals that are currently being investigated.

invariants help discover materials

These comparisons demonstrated the Crystal Isometry Principle, which states that any crystal is determined uniquely by the geometry of its atomic centres. So all crystals live in a common landscape parameterised by invariants, the ‘Crystal Isometry Space’.

Recent work in JACS used invariants in a novel way to compare crystals whose molecules were different but superficially alike. The two molecules could form crystals that were similar by eye, but this was difficult to detect automatically. Our tools detected and quantified these similarities, and all given reference crystals had analogues in the other set.

Jonathan Balasingham holds three degrees: an MSc in Scientific and Data-Intensive Computing (University College London), an MSc in Industrial Engineering (San Jose State University) and a BSc in Computer Science (University of California, Santa Cruz).

Jonathan's PhD has been supervised since October 2021 by Viktor Zamaraev, Vitaliy Kurlin and Andy Cooper.

Jonathan's research in his own words: Working with molecular crystals is inherently difficult because of their periodic nature. Until recently, there were no rigorous ways to classify and compare crystal structures. Research from the Data Science Theory and Applications group has provided two ways to do this: Average Minimum Distances and Pointwise Distance Distributions.

These mathematical tools let us quickly and precisely compare large numbers of crystals. Because of this, we’ve been able to build tools such as a search engine and visualization software to explore crystal databases such as the Cambridge Structural Database provided by the CCDC.
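As a rough illustration of how a search over invariants could work, the sketch below ranks database entries by their distance to a query vector. It assumes each crystal has already been reduced to a fixed-length vector of invariant values and uses the Chebyshev (L-infinity) distance as one simple choice. The function names, the entry names and all numbers are hypothetical, and this is not the group's search engine.

```python
def chebyshev(u, v):
    """Largest coordinate-wise difference between two equal-length vectors."""
    return max(abs(a - b) for a, b in zip(u, v))


def nearest(database, query, top=3):
    """Return the `top` closest entries as (distance, name) pairs, nearest first."""
    ranked = sorted((chebyshev(vec, query), name) for name, vec in database.items())
    return ranked[:top]


# Hypothetical invariant vectors; names and values are made up for illustration.
toy_database = {
    "crystal_A": [1.00, 1.00, 1.41],
    "crystal_B": [0.98, 1.02, 1.40],
    "crystal_C": [1.50, 1.50, 2.12],
}
hits = nearest(toy_database, [1.00, 1.00, 1.41], top=2)
```

Here `hits` lists `crystal_A` first (an exact match) and `crystal_B` second (a near-duplicate), while the clearly different `crystal_C` is left out. Sorting every entry is fine for a toy example; a real database of this size would need an indexed nearest-neighbour structure.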

Work from the research group has also extended from pure mathematics to other domains such as machine learning, where an unambiguous representation of molecular crystals opens doors to new algorithms and allows existing methods to be improved. More generally, taking a geometric view of data science applications can help reduce data needs and make for more robust and effective models. The PhD project is successful because

Cohort-based training for PhD students

Advanced topics in Data Science in Spring 2022

Introductory topics in Data Science in Autumn 2021

Advanced topics in Data Science in Spring 2021

Introductory topics in Data Science in Autumn 2020

Advanced topics in Data Science in Spring 2020

Introductory topics in Data Science in Autumn 2019

PhD projects: first cohort from Autumn 2019 (seven students)

PhD projects: second cohort from Autumn 2020 (five students)

PhD projects: third cohort from Autumn 2021 (six students)
