Lecture Series:

Similarity of materials and data-quality assessment by unsupervised learning

Wednesday, 19.01.2022 · 16:00

Speaker: Claudia Draxl, HU-Berlin

In recent years, data-analytics and machine-learning approaches have been applied to various problems in materials research, and high-throughput screening (HTS) has gone hand in hand with the establishment of small- and large-scale data collections. These resources allow us to find trends and patterns that cannot be obtained from individual investigations. Moreover, one can search for materials that share features with other materials but are superior with respect to other criteria. Besides finding materials that resemble each other, e.g. in their electronic properties, we can use the same tools to assess data quality. For instance, one can compare the performance of different methodologies for one and the same material, or the impact of approximations and computational parameters on calculated properties. We also make use of unsupervised learning to find trends in the data and rationalize their physical origin. The talk will also discuss the challenges of building a FAIR (findable, accessible, interoperable, reusable) data infrastructure and how we are currently expanding our efforts toward the inclusion of experimental data.