Millions of calculated materials science data points are now available to the scientific community (see, for example, the NOMAD project). Our research focuses on developing and implementing scalable, efficient computational methods that automatically extract knowledge from these data.
Our research starts from state-of-the-art data science techniques, such as convolutional and Siamese neural networks, kernel methods, hierarchical clustering algorithms, and various dimensionality reduction methods. On top of this, we integrate our physical insight and domain knowledge into both descriptor identification (how the system is represented) and modeling. We strongly believe that applying data science to materials should not only yield transferable models with excellent performance but, more importantly, generate value through real physical and chemical insight.
More specifically, we work in the following two areas:
- Materials similarities: We develop methods to assess similarities between materials and to build similarity maps from them. The similarities are based on structural, mechanical, or chemical properties. Such materials maps can reveal which regions of the high-dimensional materials space have not yet been explored but may contain novel materials with unusual properties.
- Crystal-structure classification: We use low-dimensional representations of physical systems (descriptors) and supervised learning techniques – in particular neural networks and kernel methods – to automatically classify crystal structures.
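As a rough illustration of both areas, the sketch below computes a pairwise similarity matrix from descriptor vectors and uses the same similarity measure for a simple kernel-weighted classification. It is a toy example under stated assumptions, not the method used in our tools: the two-dimensional descriptors, the `fcc`/`bcc` labels, the Gaussian kernel, the `gamma` value, and all function names are hypothetical and chosen only for illustration.

```python
import math


def gaussian_kernel(a, b, gamma=1.0):
    """Similarity in (0, 1]: 1 for identical descriptors, decaying with distance."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def similarity_matrix(descriptors, gamma=1.0):
    """Pairwise similarities between all descriptors (the basis of a similarity map)."""
    return [[gaussian_kernel(a, b, gamma) for b in descriptors] for a in descriptors]


def classify(query, train_descriptors, train_labels, gamma=1.0):
    """Kernel-weighted vote: each labeled structure votes for its class
    with a weight equal to its similarity to the query descriptor."""
    scores = {}
    for descriptor, label in zip(train_descriptors, train_labels):
        weight = gaussian_kernel(query, descriptor, gamma)
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)


# Made-up descriptors and labels, for illustration only.
train = [(0.0, 0.1), (0.1, 0.0), (1.0, 0.9), (0.9, 1.0)]
labels = ["fcc", "fcc", "bcc", "bcc"]

sim = similarity_matrix(train)
predicted = classify((0.95, 0.95), train, labels)
print(predicted)
```

In practice the descriptors would be far higher-dimensional and physically motivated, and the similarity matrix would typically be passed to a clustering or dimensionality reduction method to produce the actual map.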
We are making the computational tools that stem from our research available to the scientific community, together with both easy-to-use and more advanced tutorials, in the context of the NOMAD Analytics Toolkit.