ORNL Uses AI and Big Data Research Tools to Enable Materials Science Discoveries

May 5, 2021 – At the Department of Energy’s Oak Ridge National Laboratory, scientists are using artificial intelligence, or AI, to accelerate the discovery and development of materials for energy and information technology.

“AI gives scientists the power to extract information from an ever-expanding volume of data,” said David Womble, director of the ORNL AI program. “New AI tools, along with world-class computing capabilities, are essential to sustaining scientific leadership.”

AI uses computers to mine mountains of data for scientific and technical insights. Starting with high-quality data is important. Well-characterized materials create a solid knowledge base for the design of new materials that launch technologies and grow economies. ORNL has a history of materials development dating back to WWII and a rich archive of data generated on world-class instruments by expert researchers. Increasingly, researchers are generating high-resolution materials data at a volume, variety, and speed they have never had to deal with before.

“Ten years ago, a Ph.D. student working on steels could analyze five precipitates per day,” said electron microscopist Chad Parish of ORNL. Such precipitates could weaken an alloy and cause it to fail. “We have now developed a technique that allows us to analyze a thousand precipitates in five hours. We are drowning in data.” AI may hold the key to making the most of it all.

Two types of AI help make sense of big data. Machine learning runs algorithms on high-performance computers to find correlations within large data sets and determine how well they match expectations. In doing so, it reveals characteristics that traditional data analytics may miss because they are subtle, infrequent, complex, or unexpected. One step further, deep learning models the functioning of the human brain (for example, by applying logic and expertise) to discern characteristics of data sets that enhance discovery, learning, and decision-making.

“We can now design machines to do the job that once required a human expert, but much faster and on a larger scale,” said Stephen Jesse, materials scientist at ORNL.

Coupling machines

David Womble, director of the artificial intelligence program at ORNL, relies on high-performance computing resources like Summit, America’s smartest supercomputer. Credit: Carlos Jones / ORNL, US Department of Energy

ORNL researchers have been at the forefront of efforts to harness machines to propel advances in materials science. From 1992, Bobby Sumpter worked on fundamental theory and chemical / materials science aspects of machine learning. Markus Eisenbach joined him to create the machine learning foundation for the integration of imaging instruments and high performance computers. They ran theoretical models on supercomputers and validated the results against experimental findings.

In 2001, when the Materials Research Society published a conference report on AI methods in materials science, ORNL researchers were well represented, offering methods to analyze, compress, and visualize multidimensional data.

At ORNL’s Center for Nanophase Materials Sciences, Sergei Kalinin, a founding member of the American Physical Society’s Topical Group on Data Science, works with colleagues to initiate automated analysis of growing data from high-resolution microscopy experiments. “We turned to machine learning methods because traditional approaches were not practical or sufficient,” Kalinin said.

Around 2008, ORNL researchers began publishing papers advancing machine learning and deep learning in the processing of microscopy big data and linking experimental results to theoretical models. This effort intensified over the next decade to include advances in AI such as:

  • Complex scanning probe microscope imaging and spectroscopy methods to reveal nanoscale properties in greater detail
  • Complete capture of large data streams from microscope detectors
  • Workflow for on-the-fly scanning transmission electron microscopy data analysis
  • Automated conversion of microscopy data into structure and defect libraries
  • Algorithms for learning physical laws from observation data
  • Assistance in setting up microscopes, choosing regions of interest in samples and controlling atom-by-atom assembly

“We are only scratching the surface with the use of deep learning for quantitative structural analysis of microscopy data,” said Albina Borisevich of ORNL. “If we can move from isolated issues to a more general approach, it can completely revolutionize the field.”

For example, ORNL researchers Wei-Ren Chen and Changwoo Do at the Spallation Neutron Source use machine learning to help characterize small-angle neutron scattering from a wide range of material structures. Machine learning methods can help them suggest models for data analysis.

ORNL researchers such as Suhas Somnath have also explored ways to share the data widely. He adapts codes to run on distributed computing architectures and develops data infrastructure solutions.

“Continued advancements in automation, computational power, resolution, and detector speed in instruments now translate into ever larger, more diverse and complex data sets from both simulations and experiments,” Somnath said. “DataFed and the CADES Data Gateway will soon facilitate the collaborative collection, retention, annotation and sharing of data.”

The Summit supercomputer at the Oak Ridge Leadership Computing Facility is ideally suited for training and deploying AI algorithms on large data sets with its 27,648 state-of-the-art graphics processing units, high-speed file system, and large memory. A recent materials microscopy application demonstrated AI scaling to use all of Summit while operating at 93% efficiency.

Quality in, quality out

“AI tends to focus primarily on data analysis, but we have to stress that the data itself is important,” said Dongwon Shin, materials scientist at ORNL, who runs thermodynamic models on supercomputers to design high performance alloys.

He said the ORNL advantage is akin to “grandmother’s knowledge.” You can follow a cookie recipe to the letter, but your grandmother, with her extensive knowledge of how the ingredients interact, will outdo you every time. Likewise, ORNL researchers who have worked on materials for decades have world-class data sets with detailed pedigrees.

Shin realized that most machine learning tools were developed by and for programming experts, not domain scientists. His team has developed an open source toolkit called ASCENDS, which allows scientists with little programming or data science knowledge to apply data analysis as easily as with Excel. ASCENDS analyzes the correlations between input characteristics and target properties to facilitate the generation and validation of hypotheses and the training of machine learning models that predict the behavior of materials.
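The kind of correlation analysis described above can be illustrated with a minimal sketch. This is not the ASCENDS interface or its method; the function names, feature names, and numbers below are all hypothetical and chosen only to show how input characteristics might be ranked by their linear (Pearson) correlation with a target property.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def rank_features(features, target):
    """Sort features by the absolute value of their correlation with the target."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up illustrative data, not real alloy measurements.
features = {
    "chromium_wt_pct": [1.0, 2.0, 3.0, 4.0, 5.0],
    "anneal_temp_c": [900.0, 850.0, 950.0, 870.0, 920.0],
}
hardness = [10.0, 20.0, 30.0, 40.0, 50.0]

ranking = rank_features(features, hardness)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

A ranking like this only suggests hypotheses to test. A strong correlation does not establish that a feature controls the property, and a weak linear correlation can hide a nonlinear relationship.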

Visualize material success

The visualization of big data is an additional challenge. Materials scientists often use software that comes with the instruments they purchase. “Most vendor software presents the data collected by the instruments poorly,” said Philip Edmondson of ORNL, who studies materials for nuclear fission and fusion applications.

The scientific community is clamoring for open source software to help turn big data into something the human mind can interpret. Edmondson and Parish have recommended good practices for improving data visualization.

Materials for advanced nuclear reactors are irradiated in ORNL’s High Flux Isotope Reactor. Then, scientists characterize the specimens in detail, and machine learning methods analyze the measurements to determine how irradiation alters microstructures and properties that can affect the lifespan of fission or fusion energy systems. “With nuclear material, there could be millions of dollars and five years or more of investment to put a three-millimeter sample into the electron microscope,” Parish explained. “You want to make sure that you get all the scientific information possible from that sample.”

“We invest a lot of money and time in collecting good data,” said Edmondson. “Let’s understand it.”

UT-Battelle manages ORNL for the Department of Energy’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. The Office of Science works to address some of the most pressing challenges of our time. For more information, please visit energy.gov/science.

Source: ORNL

Sean N. Ayres