Data analytics and the materials cycle
Data has the potential to revolutionise the materials industry, from extraction to processing to commercialisation. But it can only do so with the right mix of technology, skills and strategy. Dr Matt Jones explains.
Using data is nothing new, but the rapid increase in data sources and in the quantity of data collected means getting the most out of it is ever more important. Properly used, data can provide answers to questions such as: how deep is the metal reserve, under what conditions does fermentation yield the best results, what combination of materials is most aerodynamic, what factors could indicate a problem that needs addressing fast, how quickly will my waste decompose?
Sensors, once largely the preserve of technical applications, have become commonplace in all aspects of our professional and personal lives, collecting vast stores of data. There is a general consensus that all this data is hugely useful for both scientific and business insight, and this includes many opportunities throughout the materials lifecycle. And there are indeed many examples where data has led to new materials being developed, processes optimised, insight gained and millions saved.
However, despite clear potential, many data projects still fail to deliver. The reason for such failures is both simple and complicated. Simple, because it is often down to overlooking sensible planning procedures, such as understanding what problems need solving and identifying the right approach. Complicated, because getting it right requires some fairly serious mathematical expertise, usually alongside a deep understanding of at least one of the many scientific processes involved in the materials lifecycle.
Working with data
When we extract materials, we use geographic data to identify where to look, measure acoustics to optimise where to drill, and match measurements of sub-surface geography with models of material behaviour and fluid dynamics to know how best to extract them. As raw materials are processed, sensors collect data on temperature, humidity, and chemical levels to ensure nothing goes wrong and to allow continual optimisation from the feedstock.
Developing the materials themselves requires huge numbers of experiments. Using data to model those experiments and quickly home in on the right ones makes modern materials R&D viable. Data is also helping to demonstrate how materials innovation works in practice. For instance, a tool that uses data to predict the fuel and CO2 savings of different ship hull coatings helps shipping companies make big savings in operational costs and improve environmental performance.
Data is also widely used to check that material constructs continue to operate effectively. Jet engines generate multiple gigabytes of data per second during operation, which must be collected, harvested and sent to a central hub for processing. Applying analytics to this allows for predictive maintenance, spotting problems quickly and avoiding taking planes out of service for traditionally scheduled periodical checks.
Analysing and applying
Collecting data is the easy part. Unfortunately, many projects fail because of the assumption that collecting huge data sets will automatically yield insights. The best advice for anyone looking to perform analytics on their data is to start by getting the planning right, then worry about the technical stuff. Successful data projects begin by framing what they want to achieve, then asking if they have the right data, or if anything is missing.
You also need to understand what format your data is in and what this means for the analytics process. Data can be structured (that is, held in consistent formats) or unstructured. Structured data is traditionally easier to work with than unstructured data, but it still needs to be handled with care.
On a recent project, Tessella worked with a major oil company to help develop a visualisation platform to spot problems during drilling. Here, processes were developed to convert disparate measurements – such as hook load and mud pressure – into a consistent industry-recognised format, which allows diverse data to be processed and compared. We could then combine this data with a modelling and analytics layer to create a dashboard that showed the company's drilling experts the likelihood of costly errors. The company says this approach saved them around US$200 million by reducing non-productive time.
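As a rough illustration of what converting disparate measurements into one consistent format can involve, the Python sketch below maps raw readings reported in different units onto a single record type. It is not the format or code used on the project: the `Reading` type, the field names, the choice of target units and the sample values are all assumptions made up for this example.

```python
from dataclasses import dataclass

# Target unit and conversion factors per quantity.
# These are illustrative choices, not an industry standard.
CONVERSIONS = {
    "hook_load": ("kN", {"kN": 1.0, "klbf": 4.4482216}),
    "mud_pressure": ("kPa", {"kPa": 1.0, "psi": 6.894757, "bar": 100.0}),
}


@dataclass(frozen=True)
class Reading:
    """One measurement in the common format (hypothetical schema)."""
    timestamp: str
    quantity: str
    value: float
    unit: str


def normalise(raw: dict) -> Reading:
    """Convert one raw measurement into the common Reading format."""
    quantity = raw["quantity"]
    if quantity not in CONVERSIONS:
        raise ValueError(f"unknown quantity: {quantity}")
    target_unit, factors = CONVERSIONS[quantity]
    if raw["unit"] not in factors:
        raise ValueError(f"unknown unit for {quantity}: {raw['unit']}")
    value = raw["value"] * factors[raw["unit"]]
    return Reading(raw["timestamp"], quantity, value, target_unit)


# Made-up sample feed with mixed units.
raw_feed = [
    {"timestamp": "2020-01-01T00:00:00Z", "quantity": "hook_load",
     "value": 100.0, "unit": "klbf"},
    {"timestamp": "2020-01-01T00:00:00Z", "quantity": "mud_pressure",
     "value": 10.0, "unit": "bar"},
]
readings = [normalise(r) for r in raw_feed]
for reading in readings:
    print(reading)
```

Rejecting unknown quantities and units, rather than passing them through, is a deliberate choice here: a silent unit mismatch is exactly the kind of error that makes structured data misleading further down the line.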
Deriving something meaningful from unstructured data, such as a combination of audio, visual and thermodynamic data, is more difficult. However, machine learning and AI algorithms are capable of consuming unstructured data sets and spotting relationships that might otherwise be missed. They can build ontologies that correlate material properties to the scientific and engineering inputs used to create them.
For example, materials researchers can alter their materials' properties by changing a number of parameters during production, such as temperature, cofactor, pressure and solvent levels. Many use real-world experiments to do so, which are expensive and time-consuming. An alternative is to build experimental models where you enter your desired property, and the model tells you how to obtain it. For such models to work, they need to understand your experimental environment. Machine learning can process historical data collected from sensors in plants to build up an understanding of how experiments work and how altering parameters will affect the outcome. This allows such models to provide reliable insight into how to quickly home in on the process required for your desired material property.
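The idea of entering a desired property and getting back a process setting can be shown with a deliberately small Python sketch. It assumes a single input (temperature) with a straight-line effect on a single property, and the historical values are made up. A real model would typically handle many parameters and non-linear behaviour, so the least-squares line here only stands in for the machine-learned model described above.

```python
from statistics import mean

# Made-up historical runs: process temperature (deg C) and measured property.
temperatures = [100.0, 120.0, 140.0, 160.0]
hardness = [50.0, 60.0, 70.0, 80.0]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def setting_for(target, slope, intercept):
    """Invert the fitted line: which input is predicted to give the target?"""
    if slope == 0:
        raise ValueError("the property does not depend on this input")
    return (target - intercept) / slope


slope, intercept = fit_line(temperatures, hardness)
suggested_temperature = setting_for(72.0, slope, intercept)
print(f"Suggested temperature: {suggested_temperature:.1f} deg C")
```

The suggestion is only as trustworthy as the historical data behind it, and a target well outside the range of past runs should be treated as a prompt for a real experiment rather than an answer.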
Using machine learning effectively usually requires human expertise to train the program – someone who understands both data and the subject domain being investigated. This expert can help the program to recognise which measurements are likely to be good or bad, and which information is more or less valuable. Over time, the program will learn not just to identify errors, but also how much weight to assign to each piece of information, so that when future data is fed in it can automatically detect patterns that provide useful insights, and ignore those that don't.
Of course, it never hurts to apply some human intuition. If there is an obvious outlier, for example, check that the sensor is working before feeding the reading into a program.
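A simple automated check can support that intuition by flagging suspicious readings for a person to look at. The Python sketch below uses the modified z-score, based on the median absolute deviation, with the commonly quoted constant 0.6745 and cut-off of 3.5. The function name and the sample readings are made up, and the check only flags values; deciding whether the sensor is faulty remains a human job.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold."""
    centre = median(values)
    deviations = [abs(v - centre) for v in values]
    mad = median(deviations)  # median absolute deviation
    if mad == 0:
        # At least half the values equal the median, so the score is undefined.
        return []
    return [i for i, d in enumerate(deviations)
            if 0.6745 * d / mad > threshold]


# Made-up temperature readings with one suspicious spike.
readings = [20.1, 20.3, 19.9, 20.2, 85.0, 20.0]
suspect = flag_outliers(readings)
print("Check the sensor for readings at positions:", suspect)
```

The median is used instead of the mean because a single extreme reading can drag the mean and standard deviation far enough to hide itself.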
Bringing it all together
The biggest area of failure in data analytics is assuming data alone has all the answers. It is only by combining data with subject matter expertise, an understanding of the organisation's objectives, and analytics technology that you can develop statistical processes to spot patterns that provide valuable insights.
Framing the problem in the right way and feeding the right data into the right analytics has the potential to increase understanding, and to make things easier and cheaper across the materials cycle. Massive increases in our ability to capture and store that data are opening up a world of exciting possibilities.
As exciting as the potential of materials data analysis is, we must stay grounded. The data we are talking about is vast, complex and often inconsistent. Even the best data is full of uncertainties, and it exists in a world of almost infinite variables. Few people really understand how to use it properly, which has led to many failures. Getting the most out of your data requires an understanding of the data, its context, its limitations, and how to take advantage of it. This requires a plan, the right technology and data expertise, and an in-depth understanding of the area in which you are looking for insights. Only by bringing all these together do data projects deliver.
Dr Matt Jones is Senior Analytics Project Manager and Consultant at Tessella. Tessella is an international analytics and data science consulting services company that has worked with AkzoNobel, as well as a wide variety of companies and universities working across the materials lifecycle. It was recently acquired by high-tech engineering consulting firm Altran.