Studying Complex Phosphorus Systems with Machine Learning

12-11-2020 | By Liam Critchley

Machine learning and other artificial intelligence (AI) algorithms are becoming commonplace in modern society, and they are becoming a valuable tool for chemical research, at both the fundamental research and industrial-scale optimisation levels. This is primarily due to the rise of computational chemistry methods, which use simulations and advanced numerical algorithms to predict how molecules will behave (and, for complex systems, how they will look structurally).

While research is going into many different chemicals, the various allotropes of elemental phosphorus (i.e. materials made of only phosphorus but with different spatial and structural arrangements) are attracting a lot of interest. Elemental phosphorus is, however, relatively hard to simulate using conventional computational methods compared to other elements and molecules.

Materials of elemental phosphorus have been around for a long time, but they are now being touted for advanced technology applications, especially nanomaterial allotropes such as phosphorene. The known allotropes include the weakly bound white phosphorus (white P), the highly amorphous and covalently linked red phosphorus (red P), the many-layered black phosphorus (the phosphorus version of graphite), and phosphorene, the all-phosphorus 2D material that belongs to the same family of 2D materials as graphene.

Beyond these, there are also liquid phosphorus materials, phosphorus nanorods and nanowires, and cage-like molecules, as well as many theoretical allotropes. Because so many different structural forms are possible with phosphorus alone, computational approaches struggle to provide results as accurate as those for other elemental materials.

Current Issues with Simulating Phosphorus Systems

Currently, the go-to methods for studying pure phosphorus materials are density functional theory (DFT) and molecular dynamics (MD), as they are for many computational chemistry problems. The studies performed to date have provided valuable insights into the different phosphorus allotropes; however, they are extremely limited because the computational cost of these systems is very high. As a result, simpler and quicker computational runs typically need to be used, which limits the information that can be obtained.

One way to overcome this high computational cost has been to use empirically fitted ‘force fields’, which are interatomic potential models (mathematical functions used to calculate the energy of a molecular system). These force fields require much less computational power and are starting to be used for modelling all-phosphorus materials.
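To make the idea of an interatomic potential concrete, the short Python sketch below evaluates the Lennard-Jones pair potential, a generic textbook functional form. It is an illustration only: it is not a phosphorus force field and is not taken from the research discussed in this article, and the parameters epsilon and sigma are placeholder values in arbitrary units rather than fitted constants.

```python
import math
from itertools import combinations


def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Pair energy of two atoms a distance r apart (placeholder units)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def total_energy(positions, epsilon=1.0, sigma=1.0):
    """Sum the pair energy over every unique pair of atoms."""
    return sum(
        lennard_jones(math.dist(a, b), epsilon, sigma)
        for a, b in combinations(positions, 2)
    )


# Two atoms placed at the minimum of the pair potential, r = 2^(1/6) * sigma.
r_min = 2 ** (1 / 6)
dimer = [(0.0, 0.0, 0.0), (r_min, 0.0, 0.0)]
print(total_energy(dimer))  # close to -epsilon, the depth of the energy well
```

In an empirically fitted force field, parameters such as epsilon and sigma would be adjusted until the function reproduces reference data. Evaluating a simple closed-form function like this is what makes force fields so much cheaper than a full DFT calculation.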

A number of force fields have started to be used in the different computational approaches. Despite their lower computational cost, they still run into problems with complex phosphorus systems. One of the main issues is that a force field can only describe a narrow region of atomic configuration space, whereas these phosphorus systems are large and complex, with several different atomic configurations.

Because each of the different phosphorus materials is held together by different inter- and intra-molecular forces, it is inherently difficult for one model to capture all the different structural environments. This has meant that different force fields need to be designed specifically for each type of phosphorus, i.e. a force field that works well on black phosphorus will not work well on liquid phosphorus (and vice versa). Machine learning has since been integrated with these force fields to help tackle these key issues.

Using Machine Learning to Combat these Issues

The integration of machine learning approaches with these computational force fields has led to the development of ‘machine learning force fields’. Having already been used on other complex molecular systems, machine learning force fields are increasingly being used to tackle computational problems in the chemical simulation space.

The core idea of machine learning force fields is that a few thousand reference computations are carried out for different small structures. The force field is based on these DFT results but is fitted non-parametrically (i.e. the data is not assumed to come from a prescribed model), which is what makes it machine learning-based. This is done in conjunction with a suitable reference database that contains the relevant atomistic configurations of the element(s) being analysed. The database is typically not too large, which makes it easier to control. When fitted to a suitable database, these machine learning force fields can be used to deduce a wide range of properties of a material and to predict different material structures.
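The idea of a non-parametric fit to reference data can be sketched with a toy example. The Python code below uses kernel ridge regression with a Gaussian kernel, which is one common non-parametric technique; the article does not specify which method the researchers used, so this is an assumption made for illustration. The function reference_energy is a made-up curve standing in for expensive DFT calculations, the "structure" is reduced to a single interatomic distance, and every name and value here is hypothetical.

```python
import math


def reference_energy(r):
    """Toy stand-in for an expensive DFT calculation (not real DFT data)."""
    return (1.0 - math.exp(-(r - 1.0))) ** 2 - 1.0


def kernel(a, b, length=0.2):
    """Gaussian similarity between two descriptor values."""
    return math.exp(-((a - b) ** 2) / (2.0 * length ** 2))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit(train_r, train_e, noise=1e-8):
    """Solve (K + noise * I) w = E for the regression weights w."""
    gram = [
        [kernel(a, b) + (noise if i == j else 0.0) for j, b in enumerate(train_r)]
        for i, a in enumerate(train_r)
    ]
    return solve(gram, train_e)


def predict(r, train_r, weights):
    """Predicted energy: a weighted sum of similarities to the database."""
    return sum(w * kernel(r, t) for w, t in zip(weights, train_r))


# A small "reference database": nine distances and their reference energies.
train_r = [0.8 + 0.2 * i for i in range(9)]
train_e = [reference_energy(r) for r in train_r]
weights = fit(train_r, train_e)

# Compare the cheap prediction with the reference at an unseen distance.
print(predict(1.5, train_r, weights), reference_energy(1.5))
```

No functional form for the energy is prescribed in advance: the prediction is built entirely from similarities to the entries in the database. That is also why the choice of database matters so much, since the model can only be trusted for configurations that resemble the ones it was fitted to.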

A New General-Purpose Machine Learning Force Field

One of the most recent pieces of research in this space has come from the UK and Finland, where the research teams have developed a general-purpose machine learning force field for simulating both bulk and nanostructured forms of phosphorus. The motivation for the research is that most machine learning force fields, while able to identify and compute many properties of phosphorus allotropes, are still very specific in nature. Many of the machine learning force fields already developed are still better than other force fields, as purpose-specific force fields can be fitted on the fly as the research demands. Still, they struggle when random, unexpected situations outside their intended scope arise.

The recently developed general-purpose force field for elemental phosphorus, which is based on DFT, can be used to describe the characteristics of several phosphorus allotropes. The team created a database from a known model and added suitable 2D and 3D structures of elemental phosphorus. Such databases have tended to be disjointed. Combining them means the most relevant regions of atomic configuration space are described much more accurately than before, and the model can adapt to changing environments. This leads to more accurate representations of both 2D and 3D allotropes of phosphorus, including details of their structure and characteristics, as well as how the allotropes can change.


A reference database such as this, in conjunction with machine learning algorithms, has the potential to enable a wide range of phosphorus simulations that were not possible before, even with the fastest DFT algorithms in existence. These include an accurate representation of liquid phosphorus, as well as the transition of one phosphorus system into another (e.g. simulating the exfoliation of black phosphorus into phosphorene).

The development of these machine learning systems for DFT approaches has implications beyond phosphorus: they could be adapted and used for other complex systems, including 2D layered nanomaterials and complex nanostructures (provided the database entries are aligned with the relevant materials). This opens the possibility of more advanced and accurate simulations across the materials, chemical and nanotechnology research sectors.


By Liam Critchley

Liam is a science writer who specialises in chemistry and nanotechnology, and reports on the wide range of areas that cross over with these disciplines. As a writer, Liam has worked with companies, media sites and associations around the world and has published over 600 articles to date. Liam is also a member of the advisory board for the National Graphene Association and the Nanotechnology World Association and is a member of the Board of Trustees for the charity GlamSci. Before becoming a writer, Liam obtained two master's degrees in Chemistry with Nanotechnology and Chemical Engineering.
