Applications of Machine Learning
for Materials Discovery at NREL
Caleb Phillips, Ph.D.
Data Analysis and Visualization
Computational Sciences Center
National Renewable Energy Laboratory
NREL | 2
Modern “Full Stack” Materials Science
Synthesis
Characterization
Computation
Golden, Colorado
NREL | 3
Skeptics Allowed
“I’ll admit it, there may be something to
this ‘big data’ and ‘machine learning’
thing everyone keeps talking about.”
- Anonymous cynic (2017)
What changed?
• Computational power
• Deep Neural Networks
• Cheap storage, big data
• Increasing adoption/investment
NREL | 4
Setting Realistic Expectations
Machine learning &
Deep Learning
Image source: Gartner.com, Aug. 2017
NREL | 5
Overview of the talk
Compelling examples of materials-oriented machine learning at
work at NREL:
• Improving the throughput of experimentation (do work faster, with more insight):
  • Interpretation: Accelerate the data->knowledge path
  • Automation: Replace onerous manual tasks
  • Prediction: Predict properties not measured
• Augmenting or replacing DFT simulations in candidate screening (focus work, avoid some altogether):
  • Prediction: End-to-end deep learning on molecular and atomistic structures
Will cover applications at a high level – talk to or email me for more info
NREL | 6
But first, the Data
The Materials Project
http://materials.nrel.gov
http://organiceletronics.nrel.gov
http://htem.nrel.gov
Data types: Experimental, Theoretical, Both
NREL | 7
Example: Experimental Materials Discovery
Taylor et al. Adv. Funct. Mater. 18, 3169 (2008)
% In
Conductivity of Annealed InZnO
Goal: Make a PhD Thesis Amount of Analysis a Routine Activity
Composition
Structure
Property
Process
Slide credit: John Perkins
NREL | 8
Application Driven (High Throughput) Materials Discovery at NREL
Input:
Theoretical
calculations
Combinatorial
synthesis
Spatially resolved
characterization
Output:
Application driven
optimization
NREL | 9
Application Driven (High Throughput) Materials Discovery at NREL
Input:
Theoretical
calculations
Combinatorial
synthesis
Spatially resolved
characterization
Output:
Application driven
optimization
AI/ML
Opportunities
Improve fidelity
Guided search
Faster screening
Automation
Visualization
Property Prediction
NREL | 10
Initial motivating problem: Accelerate Slow Analysis Tasks
880 Unanalyzed XRD Patterns (Data)
1 Structure Phase Map (Knowledge)
Slow
NREL | 11
Clustering by structure and composition
Samples
Results
Extracted Data Set
~ 1000 XRD patterns
Spectral Clustering
NNLS Decomposition
Apply Machine Learning to Determine Clusters in XRD Patterns
Fast: ~ 30 seconds on a laptop
Calculated XRD Patterns
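The two techniques named on this slide can be sketched on synthetic data. This is a minimal illustration, not the NREL pipeline: the Gaussian "peaks", the two reference phases, the noise level, and the clustering settings are all assumptions. It uses `scipy.optimize.nnls` to decompose each pattern into non-negative weights on calculated reference patterns, and scikit-learn's `SpectralClustering` to group the patterns.

```python
import numpy as np
from scipy.optimize import nnls
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)
two_theta = np.linspace(20.0, 60.0, 200)  # diffraction angle grid (degrees)


def peak(center, width=0.6):
    """Gaussian stand-in for a diffraction peak (an assumption)."""
    return np.exp(-0.5 * ((two_theta - center) / width) ** 2)


# Two hypothetical calculated reference patterns, one column per phase.
references = np.column_stack([
    peak(30) + 0.5 * peak(45),  # phase A
    peak(35) + 0.7 * peak(52),  # phase B
])

# Synthetic "measured" patterns: 20 mostly phase A, then 20 mostly phase B.
true_weights = np.array([[0.9, 0.1]] * 20 + [[0.1, 0.9]] * 20)
patterns = true_weights @ references.T + 0.01 * rng.random((40, 200))

# NNLS decomposition: non-negative phase weights for every pattern.
phase_weights = np.array([nnls(references, p)[0] for p in patterns])

# Spectral clustering on an RBF similarity between whole patterns.
labels = SpectralClustering(
    n_clusters=2, affinity="rbf", gamma=1.0, random_state=0
).fit_predict(patterns)
```

The NNLS weights give a physically interpretable answer (how much of each known phase is present), while the clustering needs no reference patterns at all, so the two are complementary.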
NREL | 12
Automatic band gap calculation
Goal: replace a highly subjective manual
process with something scalable, automated,
and (more) accurate.
Combining experimental and theoretical data makes it possible to compare properties across a
wide landscape of materials systems and synthesis conditions.
Schwarting et al., Materials Discovery (2018)
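One common way to automate band gap extraction from an absorption spectrum is a Tauc-style linear extrapolation. The sketch below assumes a direct allowed transition, so (alpha * E)^2 is linear in E above the gap. It is not the algorithm from Schwarting et al., and the function name and window size are hypothetical.

```python
import numpy as np


def tauc_band_gap(energy_ev, alpha, window=15):
    """Estimate a direct band gap by extrapolating the steepest linear
    segment of (alpha * E)^2 to zero. Illustrative sketch only."""
    y = (alpha * energy_ev) ** 2
    best_slope, best_intercept = 0.0, 0.0
    for start in range(len(energy_ev) - window + 1):
        x_seg = energy_ev[start:start + window]
        y_seg = y[start:start + window]
        slope, intercept = np.polyfit(x_seg, y_seg, 1)
        if slope > best_slope:
            best_slope, best_intercept = slope, intercept
    if best_slope <= 0.0:
        raise ValueError("no rising absorption edge found")
    # The fitted line crosses zero at E = -intercept / slope.
    return -best_intercept / best_slope


# Synthetic spectrum with a known 2.5 eV gap.
energy = np.linspace(1.0, 4.0, 301)
alpha = np.sqrt(np.clip(energy - 2.5, 0.0, None)) / energy
estimated_gap = tauc_band_gap(energy, alpha)
```

Choosing the fit window automatically is exactly the subjective step that a manual analysis leaves to the analyst; here it is reduced to a single explicit rule (steepest segment), which is one of several defensible choices.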
NREL | 13
Application Driven (High Throughput) Materials Discovery at NREL
Input:
Theoretical
calculations
Combinatorial
synthesis
Spatially resolved
characterization
Output:
Application driven
optimization
AI/ML
Opportunities
Improve fidelity
Guided search
Faster screening
Automation
Visualization
Property Prediction
NREL | 14
High throughput screening using computational results
Constraints
Molecule
Generator
Predictive
(Machine Learning)
Model
Simulation on
Supercomputer
$$$
Results
Database
OR
Best candidates
All candidates
(sequentially)
Visualization &
Analysis
Materials
Synthesis
$$$$$
Measurement
and Validation
New
Materials
Theoretical
Experimental
Training
on
Past Results
Phillips et al. CoDA (2016)
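The workflow on this slide can be reduced to a few lines: a cheap predictive model ranks every generated candidate, and only the best ones are sent to the expensive simulation. The sketch below is a toy version under stated assumptions: `surrogate`, `simulate`, and the integer "candidates" are hypothetical stand-ins, not NREL code.

```python
def screen(candidates, surrogate, simulate, budget):
    """Rank candidates with the cheap surrogate, then run the expensive
    simulation only on the top `budget` of them."""
    ranked = sorted(candidates, key=surrogate, reverse=True)
    return {c: simulate(c) for c in ranked[:budget]}


simulation_calls = []


def simulate(x):
    """Stand-in for a costly supercomputer calculation."""
    simulation_calls.append(x)
    return -(x - 70) ** 2


def surrogate(x):
    """Stand-in for a trained model: cheap and slightly wrong."""
    return -(x - 68) ** 2


results = screen(range(100), surrogate, simulate, budget=10)
best_candidate = max(results, key=results.get)
```

Even though the surrogate's optimum is off by two, the true best candidate lands in the shortlist, and 90 of the 100 simulations are never run. The alternative "all candidates (sequentially)" path corresponds to `budget=len(candidates)`.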
NREL | 15
Predict opto-electric properties of molecules
Support Vector Regression (SVR) performance when predicting calculated band gap. Residual
error is linear and normally distributed. Median error is effectively zero, RMSE is 0.25 eV or
less for most scenarios.
First try: learn using
molecular descriptors
(traditional feature
engineering)
2 million candidates
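The descriptor-based approach can be sketched with scikit-learn. Everything about the data is an assumption: the five random "descriptors" and the synthetic band gap stand in for real molecular descriptors and DFT results, and the kernel and hyperparameters are illustrative rather than those used in the talk.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Hypothetical descriptor matrix and synthetic "calculated" band gap (eV).
descriptors = rng.normal(size=(300, 5))
band_gap = (2.0 + 0.5 * descriptors[:, 0] - 0.3 * descriptors[:, 1]
            + 0.05 * rng.normal(size=300))

X_train, X_test, y_train, y_test = train_test_split(
    descriptors, band_gap, test_size=0.25, random_state=0
)

# Scale the descriptors, then fit a support vector regressor.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.05))
model.fit(X_train, y_train)

# The same summary statistics quoted on the slide: RMSE and median error.
residuals = model.predict(X_test) - y_test
rmse = float(np.sqrt(np.mean(residuals ** 2)))
median_error = float(np.median(residuals))
```

Reporting the median residual alongside RMSE separates systematic bias (median far from zero) from scatter (large RMSE).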
NREL | 16
End-to-end Learning: Skip the feature extraction
Image Recognition: Convolutional
Neural Networks (CNNs)
O
Message
Passing
Blocks
Node Recurrent Units
Node
Embedding
Layer
Graph
Output
Layer(s)
Dense
Regression
Layers
Predictions
Input
Graph
(Molecule)
Molecular Graphs: Message Passing Neural
Networks (MPNNs)
Gilmer et al., CoRR (2017)
Key hypothesis: model
can learn which features
are important directly from
structure.
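The core of the architecture in the diagram (embed nodes, pass messages along bonds, pool to a graph-level output) can be sketched in NumPy. This is a deliberately simplified, untrained illustration: it replaces the recurrent node update with a plain tanh layer, ignores bond types, and uses random weights, so all names and shapes are assumptions.

```python
import numpy as np


def message_passing(adjacency, node_states, w_self, w_msg, steps=2):
    """Each step, every atom sums its neighbors' states (the messages)
    and updates its own state from the old state plus that sum."""
    h = node_states
    for _ in range(steps):
        messages = adjacency @ h
        h = np.tanh(h @ w_self + messages @ w_msg)
    return h


rng = np.random.default_rng(0)
hidden = 4
embedding = rng.normal(size=(2, hidden))  # one row per atom type: C, O
w_self = rng.normal(size=(hidden, hidden))
w_msg = rng.normal(size=(hidden, hidden))
w_out = rng.normal(size=hidden)


def predict(adjacency, atom_types):
    """Embed atoms, run message passing, sum-pool, apply a linear output."""
    h = message_passing(adjacency, embedding[atom_types], w_self, w_msg)
    return float(h.sum(axis=0) @ w_out)


# Heavy-atom graph of ethanol: C-C-O.
atom_types = np.array([0, 0, 1])
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)
prediction = predict(adjacency, atom_types)

# Renumbering the atoms must not change the prediction.
perm = np.array([2, 0, 1])
permuted = predict(adjacency[np.ix_(perm, perm)], atom_types[perm])
```

The permutation check is the property that makes graphs a natural input: unlike a fixed-length descriptor vector or a string, the output does not depend on how the atoms happen to be ordered.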
NREL | 17
End-to-end Learning: Skip the feature extraction
[Figure: Machine Learning Prediction vs. Duplicate 1 (DFT)]
3-5x improvement over manually engineered features.
Accuracy approaching repeated-measures accuracy of DFT.
RMSE(Machine Learning) / RMSE(DFT duplicates):
Gap 0.90
HOMO 1.05
LUMO 0.89
Spectral overlap 1.28
Polymer HOMO 1.24
Polymer LUMO 1.03
Polymer gap 1.19
Polymer optical LUMO 1.02
St. John et al. https://arxiv.org/abs/1807.10363 (2018)
NREL | 18
Transfer learning and training set size
St. John et al. https://arxiv.org/abs/1807.10363 (2018)
NREL | 19
End-to-end learning for crystalline materials
Represent crystal structure as a graph
to allow end-to-end learning.
Kamdar. 2018. NREL/US DOE CSGF.
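One way to turn a crystal into a graph is to connect every atom to all atoms within a cutoff radius, including periodic images in neighboring cells. The sketch below is an assumption about how such a graph could be built, not the method from the cited work. It only searches the 27 nearest cells, which is sufficient when the cutoff is no larger than the shortest cell dimension.

```python
import itertools

import numpy as np


def crystal_graph(lattice, frac_coords, cutoff):
    """Return directed edges (i, j, cell_shift, distance) between atoms
    closer than `cutoff`. Rows of `lattice` are the lattice vectors."""
    cart = frac_coords @ lattice
    edges = []
    for shift in itertools.product((-1, 0, 1), repeat=3):
        offset = np.array(shift) @ lattice
        for i, j in itertools.product(range(len(cart)), repeat=2):
            if i == j and shift == (0, 0, 0):
                continue  # skip an atom paired with itself
            distance = float(np.linalg.norm(cart[j] + offset - cart[i]))
            if distance <= cutoff:
                edges.append((i, j, shift, distance))
    return edges


# CsCl-type cell: cubic lattice (a = 4.0), atoms at the corner and body center.
lattice = 4.0 * np.eye(3)
frac_coords = np.array([[0.0, 0.0, 0.0],
                        [0.5, 0.5, 0.5]])
edges = crystal_graph(lattice, frac_coords, cutoff=3.6)
```

With this cutoff each atom gets its eight nearest neighbors of the other species, so the graph encodes the coordination environment directly and can be passed to the same kind of message-passing model used for molecules.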
NREL | 20
Thanks to Many Collaborators
(and many funding sources)
Theory
Stephan Lany
Vladan Stevanovic
Aaron Holder
@ LBNL
Gerd Ceder
Kristin Persson
Data
Robert White
Kristin Munch
Peter Graf
@ NIST
Zachary Trautt
Robert Hanisch
Experiment
Andriy Zakutayev
John Perkins
Philip Parilla
David Ginley
Bill Tumas
Sebastian Siol
Lauren Garten
Elisabetta Arca
Matthew Taylor
@ NIST
Martin Green
Jae Hattrick-Simpers
Nam Nguyen
@ SLAC
Apurva Mehta
@ ANL
Debbie Myers
AI/ML
Jacob Hinkle
Marcus Schwarting
Peter St. John
@ Harvard
Harshil Kamdar
Slide credit: John Perkins
NREL | 21
Selected Publications
Peter C. St. John, Caleb Phillips, Travis W. Kemper, A. Nolan Wilson,
Michael F. Crowley, Mark R. Nimlos, Ross E. Larsen.
Message-passing neural networks for high-throughput polymer screening.
In submission. ArXiv preprint: https://arxiv.org/abs/1807.10363
Marcus Schwarting, Sebastian Siol, Kevin Talley, Andriy Zakutayev, Caleb Phillips.
Automated algorithms for band gap analysis from optical absorption spectra.
Materials Discovery, April 18, 2018. https://doi.org/10.1016/j.md.2018.04.003
Andriy Zakutayev, Nick Wunder, Marcus Schwarting, John Perkins, Robert White,
Kristin Munch, William Tumas, and Caleb Phillips.
An open experimental database for exploring inorganic materials.
Nature Scientific Data, April 3, 2018. https://www.nature.com/articles/sdata201853
Caleb Phillips, Ross Larsen, Kristin Munch, Nikos Kopidakis.
Guided Search for Organic Photovoltaic Materials Using Predictive Data Modeling.
Conference on Data Analysis (CoDA) 2016. March 2-4, 2016. Santa Fe, New Mexico.
www.nrel.gov
Thank you
This work was authored by the National Renewable Energy Laboratory, operated by Alliance for Sustainable Energy,
LLC, for the U.S. Department of Energy (DOE) under Contract No. DE-AC36-08GO28308. Funding provided by U.S.
Department of Energy Office of Energy Efficiency and Renewable Energy. The views expressed in the article do not
necessarily represent the views of the DOE or the U.S. Government. The U.S. Government retains and the publisher,
by accepting the article for publication, acknowledges that the U.S. Government retains a nonexclusive, paid-up,
irrevocable, worldwide license to publish or reproduce the published form of this work, or allow others to do so, for
U.S. Government purposes.
caleb.phillips@nrel.gov
NREL | 23
Save work: predict not-measured properties
• Electrical conductivity prediction using random forest model
• Training variables: chemical composition, XRD peak count, deposition conditions
• Training process: 10-fold cross-validation, each fold withholding 25% of the sample libraries
• Training set: 16K data points varying by 9-10 orders of magnitude
Predicted vs Measured
Conductivity
Prediction accuracy for
Conductivity
Prediction accuracy of
1-2 orders of
magnitude, reasonable
for semiconductors
Zakutayev et al. Scientific Data 5 180053 (2018)
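The setup in the bullets above can be sketched with scikit-learn. The data here is synthetic and the feature names are hypothetical stand-ins for the three kinds of training variables. Reading the validation scheme as ten random splits that each hold out 25% of whole libraries is an assumption, implemented with `GroupShuffleSplit`. The model is trained on log10 conductivity because the values span many orders of magnitude.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)

# 40 hypothetical sample libraries with 20 measured points each.
library_id = np.repeat(np.arange(40), 20)
n = library_id.size
indium_fraction = rng.random(n)                 # chemical composition
xrd_peak_count = rng.integers(0, 6, size=n)     # structure
deposition_temp = rng.uniform(25, 500, size=n)  # deposition conditions
X = np.column_stack([indium_fraction, xrd_peak_count, deposition_temp])

# Synthetic log10(conductivity) spanning roughly 8 orders of magnitude.
log_sigma = -4.0 + 8.0 * indium_fraction + 0.3 * rng.normal(size=n)

# Hold out whole libraries so points from one library never appear in
# both the training and the test set.
splitter = GroupShuffleSplit(n_splits=10, test_size=0.25, random_state=0)
errors, leaks = [], []
for train_idx, test_idx in splitter.split(X, log_sigma, groups=library_id):
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    model.fit(X[train_idx], log_sigma[train_idx])
    predicted = model.predict(X[test_idx])
    # Mean absolute error in orders of magnitude.
    errors.append(float(np.mean(np.abs(predicted - log_sigma[test_idx]))))
    leaks.append(bool(set(library_id[train_idx]) & set(library_id[test_idx])))

mean_error = float(np.mean(errors))
```

Splitting by library rather than by individual point matters because neighboring points on one library are strongly correlated; a point-wise split would make the model look more accurate than it is on a genuinely new library.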
NREL | 24
What’s in my database?
t-SNE model can group
70K samples based on
similarity of their
chemical compositions
t-distributed stochastic neighbor embedding (t-SNE) dimensionality reduction model
Zakutayev et al. Scientific Data 5 180053 (2018)
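A small-scale version of this kind of map can be produced with scikit-learn's `TSNE`. The compositions below are synthetic three-element fractions drawn around three hypothetical composition families; the perplexity and other settings are illustrative assumptions.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Three hypothetical composition families in a three-element system;
# each row holds element fractions that sum to 1.
centers = np.array([[0.8, 0.1, 0.1],
                    [0.1, 0.8, 0.1],
                    [0.1, 0.1, 0.8]])
compositions = np.vstack([rng.dirichlet(50 * c, size=50) for c in centers])

# Project the composition vectors to 2D for plotting.
embedding = TSNE(
    n_components=2, perplexity=30, init="pca", random_state=0
).fit_transform(compositions)
```

Each row of `embedding` is a 2D point that can be scattered and colored by any property of interest. Only local neighborhoods are meaningful in a t-SNE plot; distances between well-separated groups should not be over-interpreted.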
