Imperial News

Machine learning, materials science and the new Imperial MOOC

by Caroline Detchenique

Machine learning is not new, but it may not be an obvious technique to use in Materials Science and Engineering. Why and how can it be used now?

We hear a lot about artificial intelligence (AI) and machine learning these days. Earlier this year, Imperial launched a new MOOC (Massive Open Online Course) with Coursera. The set of three online courses – Mathematics for Machine Learning – is fully accessible across the world and supports learners in developing their mathematical skills and intuition so they can understand the complex principles underpinning AI.

The courses are taught by Professor David Dye (Engineering Alloys theme) of the Department of Materials and Department alumnus Dr Sam Cooper, along with Dr Marc Deisenroth of the Department of Computing and Dr Freddie Page of the Dyson School of Design Engineering. The series covers fundamental skills such as linear algebra, vector calculus and analytic geometry, which are all key ingredients for many of the machine learning algorithms that power artificial intelligence.

Financial services, government, healthcare, and marketing and sales organisations all use machine learning to analyse data, whether to work more effectively or to create a competitive advantage. But could machine learning be used in areas such as Materials Science? We find out with Professor Aron Walsh, who recently published a paper in Nature on the subject: ‘Machine learning for molecular and materials science’.

 

Could you briefly describe what machine learning (ML) is?

Machine learning is a subfield of artificial intelligence that has evolved rapidly in recent years. It is based on the use of statistical algorithms whose performance improves with training (quite similar to our PhD students!). There are many classes of ML approaches, ranging from Bayesian analysis based on probability models to pattern recognition using artificial neural networks. It is not always apparent, but ML is now widely employed for tasks ranging from fraud detection for banks to online advertising and computer-aided healthcare diagnostics.
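As a rough illustration of what "performance improves with training" can mean, here is a minimal sketch that is our own and not taken from the interview or the Nature paper. It fits a straight line by ordinary least squares to noisy samples of a made-up relationship (y = 2x + 1, chosen purely for illustration) and measures how far the fitted line is from the true one as the number of training points grows. The function names, noise level and sample sizes are all assumptions.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def average_test_error(n_train, trials=200, noise=0.5, seed=0):
    """Mean squared error of the fitted line against the true line,
    averaged over many random training sets of size n_train."""
    rng = random.Random(seed)
    test_xs = [i / 10 for i in range(11)]  # fixed grid from 0.0 to 1.0
    total = 0.0
    for _ in range(trials):
        # Hypothetical "experiment": noisy measurements of y = 2x + 1.
        xs = [rng.random() for _ in range(n_train)]
        ys = [2.0 * x + 1.0 + rng.gauss(0.0, noise) for x in xs]
        a, b = fit_line(xs, ys)
        errors = [(a * x + b - (2.0 * x + 1.0)) ** 2 for x in test_xs]
        total += sum(errors) / len(errors)
    return total / trials


if __name__ == "__main__":
    for n in (5, 50, 500):
        print(n, "training points -> average error", average_test_error(n))
```

Running the script prints the average error for 5, 50 and 500 training points; under these assumptions the error should shrink as more data is used for training.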

 

ML is not new, but it may not be an obvious technique to use in Materials Science and Engineering (MSE). Why is it coming to the fore now, and how can ML be applied to Materials Science?

Data is the fuel of machine learning, and in MSE there is an abundance of data on the structures and properties of materials. For example, the phase space of some of the high-performance alloys now being studied in our Department is so large that conventional methods cannot fully describe it. There is a growing infrastructure of machine-learning tools for generating, testing and refining models. MSE is benefiting from these developments, with progress in predicting synthetic pathways, accelerating characterisation, and discovering previously unknown structure-property relationships.

 

What are the main advantages of using ML in the discovery of materials?

Machine learning is helping to reduce the barriers between materials design, synthesis, characterisation and modelling. To give one example, the structure of materials is usually deduced by a combination of experimental methods, such as X-ray or electron diffraction and vibrational spectroscopy. Each approach is limited in sensitivity and length scale, but the information they provide is complementary. ML represents a unifying framework that could assimilate all available data into a coherent description of structure.

 

Can any type of material be discovered using ML?

Any class of materials can be treated once sufficient data is available for training. Machine-learning approaches typically require large amounts of data for learning to be effective. There is research activity in all areas of MSE, including ceramics, alloys and even superconductors. However, applications in molecular science, and in particular drug design, are more advanced. This is in part because the structure of molecules is easier to describe and manipulate digitally, and in part because there has been significant investment in research by major pharmaceutical companies.

 

Does ML complement, or will it replace other methods of gathering and interpreting data?  

In our upcoming perspective in Nature, we argue that ML will complement and help to accelerate how we perform research in MSE. The rapid recent growth in the quantity of research results and data available from publications, patents, and databases makes it very difficult for a human researcher to make efficient use of them. The combination of machine learning and big data has been described as the “fourth paradigm of science”. Whether such a paradigm shift is realised will depend on how warmly the community embraces these new approaches and on the success that can be achieved in the near term. I am optimistic!