Updates from Overleaf
amkrajewski committed Jun 27, 2024
1 parent d2199c4 commit 055d880
Showing 20 changed files with 422 additions and 337 deletions.
14 changes: 9 additions & 5 deletions crystall.tex
@@ -27,7 +27,7 @@ \section{\texttt{crystALL} - Purely Data-Driven Structure Prediction for Unident

\begin{figure}[H]
\centering
\includegraphics[width=0.7\textwidth]{crystall/NdBi2_GraphicalAbstract_V4.png}
\includegraphics[width=0.75\textwidth]{crystall/NdBi2_GraphicalAbstract_V4.png}
\caption{Simplified \texttt{crystALL} core schematic of operation based around performing all permutations of elemental substitutions and energy predictions, exemplified by the case of the \ch{NdBi_2} intermetallic.}
\label{crystall:fig:crystallcompound}
\end{figure}
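
The substitution step named in the caption above can be sketched as follows. This is a minimal illustration under stated assumptions, not the \texttt{crystALL} implementation: the function name substitute, the dictionary representation of a formula unit, and the MgCu2 prototype input are all hypothetical choices made for this example.

```python
from itertools import permutations


def substitute(prototype, targets):
    """Yield every assignment of target elements onto the prototype's species.

    prototype: dict of species -> count per formula unit (hypothetical format),
               e.g. {"Mg": 1, "Cu": 2} for an AB2-type entry.
    targets:   tuple of elements to substitute in, e.g. ("Nd", "Bi").
    """
    species = list(prototype)
    if len(species) != len(targets):
        return  # prototype has a different number of species; nothing to yield
    for perm in permutations(targets):
        mapping = dict(zip(species, perm))
        yield {mapping[s]: n for s, n in prototype.items()}


# Keep only the substituted candidates that match the wanted stoichiometry.
wanted = {"Nd": 1, "Bi": 2}
candidates = [c for c in substitute({"Mg": 1, "Cu": 2}, ("Nd", "Bi")) if c == wanted]
```

For an AB2-type prototype and two target elements, two assignments are generated and only one matches the wanted stoichiometry. With more species per prototype, the number of permutations to screen grows factorially.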
@@ -37,9 +37,9 @@ \section{\texttt{crystALL} - Purely Data-Driven Structure Prediction for Unident

\section{Successfully Identifying \ch{NdBi_2} Structure} \label{crystall:ndbi2}

The first deployment of the \texttt{crystALL} method happened within \citet{Im2022ThermodynamicModeling} work which re-assessed thermodynamic models of Nd-Bi chemical system, which was an important step towards rational design of rare-earth alloys for clean energy technologies through, e.g., electrochemical recovery of rare-earth elements and prospects of novel energy storage devices like liquid metal batteries. As noted in the publication, the \ch{NdBi_2} compound has long been known to be thermodynamically stable \cite{Yoshihara1975Rare-earthBismuthides}; however its crystal structure has remained unknown presenting a significant obstacle in using ab initio methods to study it. Using \texttt{crystALL}'s data-mining approach deployed on \texttt{MPDD}, the possible configurations of \ch{NdBi_2} were determined and later validated by DFT-based methods, as depicted earlier in Figure \ref{crystall:fig:crystallcompound}.
The first deployment of the \texttt{crystALL} method took place within the work of \citet{Im2022ThermodynamicModeling}, which re-assessed thermodynamic models of the Nd-Bi chemical system, an essential step towards the rational design of rare-earth alloys for clean energy technologies through, e.g., electrochemical recovery of rare-earth elements and prospects of novel energy storage devices like liquid metal batteries. As noted in the publication, the \ch{NdBi_2} compound has long been known to be thermodynamically stable \cite{Yoshihara1975Rare-earthBismuthides}; however, its crystal structure has remained unknown, presenting a significant obstacle in using ab initio methods to study it. Using \texttt{crystALL}'s data-mining approach deployed on \texttt{MPDD}, the possible configurations of \ch{NdBi_2} were determined and later validated by DFT-based methods, as depicted earlier in Figure \ref{crystall:fig:crystallcompound}.

First, all of the 26,055 \ch{AB_2}-type configurations were extracted from the contemporary mid-2020 snapshot of \texttt{MPDD}, which at the time had approximately 1.3 million total configurations dataset of DFT-relaxed or experimental structures covering all materials contained in the Open Quantum Materials Database (OQMD), the Materials Project (MP), the Joint Automated Repository for Various Integrated Simulations (JARVIS), and the Crystallography Open Database (COD), described in Chapter \ref{chap:mpdd}. Following the extraction and substitution, all generated candidates have been featurized using \texttt{Ward2017} \cite{Ward2017IncludingTessellations} and their energies were predicted through \texttt{SIPFENN} Novel Materials Model (NN20) described in Chapter \ref{chap:sipfenn}. The 1,000 lowest energy candidates were selected and their feature-space representations were embedded into lower dimensional space (3D) using popular t-distributed stochastic neighbor embedding (t-SNE) \cite{HintonStochasticEmbedding}, and clustered using k-means approach, as depicted in Figure \ref{crystall:fig:ndbi2clusters}.
First, all of the 26,055 \ch{AB_2}-type configurations were extracted from the contemporary mid-2020 snapshot of \texttt{MPDD}, which at the time was a dataset of approximately 1.3 million total configurations, consisting of DFT-relaxed or experimental structures covering all materials contained in the Open Quantum Materials Database (OQMD), the Materials Project (MP), the Joint Automated Repository for Various Integrated Simulations (JARVIS), and the Crystallography Open Database (COD), described in Chapter \ref{chap:mpdd}. Following the extraction and substitution, all generated candidates were featurized using \texttt{Ward2017} \cite{Ward2017IncludingTessellations}, and their energies were predicted through the \texttt{SIPFENN} Novel Materials Model (NN20) described in Chapter \ref{chap:sipfenn}. The 1,000 lowest-energy candidates were selected, and their feature-space representations were embedded into a lower-dimensional (3D) space using the popular t-distributed stochastic neighbor embedding (t-SNE) \cite{HintonStochasticEmbedding} and clustered using the k-means approach, as depicted in Figure \ref{crystall:fig:ndbi2clusters}.

\begin{figure}[H]
\centering
@@ -78,12 +78,16 @@ \section{Predicting Compounds of Uncertain Compositions} \label{sec:crystallcomp

\begin{figure}[h]
\centering
\includegraphics[width=0.9\textwidth]{crystall/crystALL_composition_diagram_V3.png}
\includegraphics[width=0.95\textwidth]{crystall/crystALL_composition_diagram_V3.png}
\caption{\texttt{crystALL} schematic of operation in cases of ``compositional'' searches where a measured composition can be given alongside uncertainty bounds. Efficient handling of such queries is a unique feature of MPDD.}
\label{fig:crystallcomposition}
\end{figure}

Such approach generates output with additional characteristic of the compositional distance to the reported composition value, enabling researchers to make a better-informed decision on what to pass into the validation steps
Such an approach generates output with additional information on the compositional distance to the \textit{reported} composition value, enabling researchers to make a better-informed decision on what to pass into the validation steps, based on their belief as to whether the hypothetical compound is the one being observed.
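
As a hedged illustration of what such a compositional distance could look like, the sketch below assumes a Euclidean metric over atomic fractions and a single symmetric tolerance per element. Neither the metric nor the form of the uncertainty bounds is specified in the text, and all function names and numerical values are hypothetical.

```python
import math


def fractions(counts):
    """Convert formula-unit counts, e.g. {"Nd": 1, "Bi": 2}, to atomic fractions."""
    total = sum(counts.values())
    return {el: n / total for el, n in counts.items()}


def compositional_distance(a, b):
    """Euclidean distance between two compositions (assumed metric)."""
    elements = set(a) | set(b)
    return math.sqrt(sum((a.get(el, 0.0) - b.get(el, 0.0)) ** 2 for el in elements))


def within_bounds(candidate, reported, tolerance):
    """True if every atomic fraction lies within +/- tolerance of the reported value."""
    elements = set(candidate) | set(reported)
    return all(
        abs(candidate.get(el, 0.0) - reported.get(el, 0.0)) <= tolerance
        for el in elements
    )


# Hypothetical measurement: 35 at.% Nd, 65 at.% Bi, with +/- 5 at.% uncertainty.
reported = {"Nd": 0.35, "Bi": 0.65}
ndbi2 = fractions({"Nd": 1, "Bi": 2})
ndbi = fractions({"Nd": 1, "Bi": 1})

ndbi2_ok = within_bounds(ndbi2, reported, 0.05)  # inside the bounds
ndbi_ok = within_bounds(ndbi, reported, 0.05)    # outside the bounds
ndbi2_distance = compositional_distance(ndbi2, reported)
```

Reporting the distance alongside each candidate, rather than only a pass or fail flag, is what lets a researcher rank hypothetical compounds that all fall inside the uncertainty bounds.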

\section{Software Availability} \label{crystall:sec:softwareavaialbility}

The \texttt{crystALL} source code is currently developed as closed source; however, it is planned to be released as free and open-source software (FOSS) in the Fall of 2024 through outlets including a GitHub repository at \href{https://github.com/PhasesResearchLab/crystALL}{github.com/PhasesResearchLab/crystALL}, alongside a scientific publication describing it, high-quality documentation, and a workshop-style tutorial.

% \section{Connection to Zentropy} \label{sec:crystallzentropy}
%
20 changes: 10 additions & 10 deletions infeasibilitygliding.tex
@@ -6,47 +6,47 @@ \chapter{Infeasibility Gliding in Compositional Spaces} \label{chap:infeasibilit

\section{Introduction} \label{infglide:sec:intro}

As explored in Chapter~\ref{chap:nimplex}, exploration of high-dimensional compositional spaces, needed for many materials discovery tasks, is a challenging task both conceptually and computationally, due to several inherent complexities. Typically, this forces efforts like screening and path planning to include as much prior knowledge (i.e., assumptions) as possible to bring these complexities down as much as possible, what has been explored in detail in Section~\ref{nimplex:ssec:complexes} on three individual examples, including real-world one based on \citet{Bobbio2022DesignCompositions}.
As explored in Chapter~\ref{chap:nimplex}, exploration of high-dimensional compositional spaces, needed for many materials discovery tasks, is challenging, both conceptually and computationally, due to several inherent complexities. Typically, this forces efforts like screening and path planning to include as much prior knowledge (i.e., assumptions) as possible to reduce these complexities, which has been explored in detail in Section~\ref{nimplex:ssec:complexes} on three individual examples, including a real-world one based on \citet{Bobbio2022DesignCompositions}.

It is important, however, to note that the assumptions imposed on the design space to reject spaces unlikely to work, like "\textit{Boron cannot be added because it will precipitate borides}", by the same assumptions do not significantly increase the volume of feasible (or desired) space. Thus, an approach that would explore only such regions, while skipping the rest, could in principle consider such design space at a low additional cost, reducing the number of assumptions and possibly identifying high-performing materials that would otherwise be skipped.
It is essential, however, to note that the spaces rejected as unlikely to work through assumptions imposed on the design space, like ``\textit{Boron cannot be added because it will precipitate borides}'', would, by the same assumptions, not significantly increase the volume of feasible (or desired) space if they were included. Thus, an approach that would explore only the feasible regions while skipping the rest could, in principle, consider such a design space at a low additional cost, reducing the number of assumptions and possibly identifying high-performing materials that would otherwise be skipped.

\section{Exploiting Compositional Graph Representation} \label{infglide:sec:exploitgraph}

To set up an approach exploring only feasible or otherwise desirable spaces, one can begin by leveraging \texttt{nimplex}'s compositional graph representation, described in Section~\ref{nimplex:sec:simplexgraph}, which enable one to easily traverse all compositions based on their adjacency, starting from one or more points, akin to typical high-throughput screenings that exhaust the design space population \cite{Feng2021High-throughputAlloys, Wang2023SearchingExperiments, Yang2022AHardness, Maresca2020Mechanistic1900K}.
To set up an approach exploring only feasible or otherwise desirable spaces, one can begin by leveraging \texttt{nimplex}'s compositional graph representation, described in Section~\ref{nimplex:sec:simplexgraph}, which enables one to easily traverse all compositions based on their adjacency, starting from one or more points, akin to typical high-throughput screenings that exhaust the design space population \cite{Feng2021High-throughputAlloys, Wang2023SearchingExperiments, Yang2022AHardness, Maresca2020Mechanistic1900K}.

Figure~\ref{infeasibilitygliding:fig:fullcomputation} depicts an example result of such approach for a specific 4-component design space in a 7-component elemental space, with roughly half of the points being feasible (green), forming a complex concave shape, and half being infeasible (red).
Figure~\ref{infeasibilitygliding:fig:fullcomputation} depicts an example result of such an approach for a specific 4-component design space in a 7-component elemental space, with roughly half of the points being feasible (green), forming a complex concave shape, and half being infeasible (red).

\begin{figure}[H]
\centering
\includegraphics[width=0.7\textwidth]{infeasibilitygliding/InfeasibilityGliding_Full.png}
\caption{Feasibility map over compositional tetrahedron (3-simplex) formed by all combinations of Ti50 Zr50, Hf95 Ti5, Mo33 Nb33 Ta33, Mo80 Nb10 W10 discretized at 12 divisions per dimension. The positions in the 7-component elemental space obtained from \texttt{nimplex}, described in Chapter \ref{chap:nimplex}, were used to run \texttt{pycalphad} \cite{Otis2017Pycalphad:Python} evaluations and constrained by limiting phases present at equilibrium at 1000K to single or many solid solution phases. Roughly half of the compositions are infeasible with most of them forming a single large region.}
\caption{Feasibility map over compositional tetrahedron (3-simplex) formed by all combinations of Ti50 Zr50, Hf95 Ti5, Mo33 Nb33 Ta33, Mo80 Nb10 W10 discretized at 12 divisions per dimension. The positions in the 7-component elemental space obtained from \texttt{nimplex}, described in Chapter \ref{chap:nimplex}, were used to run \texttt{pycalphad} \cite{Otis2017Pycalphad:Python} evaluations and constrained by limiting phases present at equilibrium at 1000K to single or many solid solution phases. Roughly half of the compositions are infeasible, with most of them forming a single large region.}
\label{infeasibilitygliding:fig:fullcomputation}
\end{figure}

\section{Gliding on the Boundaries of Infeasibility} \label{infglide:sec:glide}

As shown in Figure~\ref{infeasibilitygliding:fig:fullcomputation}, the infeasible region of space is generally continuously bounded by a single smooth surface, with only a few other small infeasible points. Thus, in principle, the infeasible space could be efficiently navigated around, through only surface point calculations, without considering the bulk of internal points that cannot be accessed, accomplishing the goals set in the Section~\ref{infglide:sec:intro}. This work coins the term \emph{Infeasibility Gliding} to describe such approach.
As shown in Figure~\ref{infeasibilitygliding:fig:fullcomputation}, the infeasible region of space is generally continuously bounded by a single smooth surface, with only a few other small infeasible points. Thus, in principle, the infeasible space could be efficiently navigated around through only surface point calculations, without considering the bulk of internal points that cannot be accessed, accomplishing the goals set in Section~\ref{infglide:sec:intro}. This work coins the term \emph{Infeasibility Gliding} to describe such an approach.

\subsection{Underlying Assumptions} \label{infglide:ssec:assumptions}

One core assumption that needs to be considered in exploration based on the infeasibility gliding is that the bounding the infeasible space is surface is smooth in the compositional space, or in terms of phase stability, that the region where a given infeasible phase exists (often spanning multiple phase regions) is smoothly bound. While not possible to be proven to be valid in every system, it can be shown to be reasonable for exploration problems, as it (1) is \emph{not required} for the method to glide around the boundary, but only to argue for low computational cost, and (2) it is generally true for metallic systems of interest, as depicted in an example in Figure~\ref{infeasibilitygliding:fig:katesphasemap} from \citet{Elder2023ComputationalValidation}, as well as many other studies, like ones by \citet{Bobbio2022DesignCompositions}, \citet{Sun2024MaterialsMap:Ag-Al-Cu}, \citet{Gao2016SenaryHfNbTaTiVZr}, or \citet{Zhao2014ExperimentalSystem}.
One core assumption that needs to be considered in exploration based on infeasibility gliding is that the high-dimensional surface bounding the infeasible space is highly smooth, or in terms of phase stability, that the region where a given infeasible phase exists (often spanning multiple phase regions) is smoothly bound. While this cannot be proven valid for every system, it can be shown to be reasonable for exploration problems, as (1) it is \emph{not required} for the method to glide around the boundary, but only to argue for low computational cost, and (2) it is generally true for metallic systems of interest, as depicted in an example in Figure~\ref{infeasibilitygliding:fig:katesphasemap} from \citet{Elder2023ComputationalValidation}, as well as many other studies, like ones by \citet{Bobbio2022DesignCompositions}, \citet{Sun2024MaterialsMap:Ag-Al-Cu}, \citet{Gao2016SenaryHfNbTaTiVZr}, or \citet{Zhao2014ExperimentalSystem}.

\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{infeasibilitygliding/PhaseTernaryMap_Elder2023.png}
\caption{A view of ternary Cr-Nb-W phase diagram projected across a temperature range of phase classes, further augmented by imposing predicted property value constraints. It depicts the smoothness of the infeasible region boundary (red) and increasingly smooth boundary of property-constrained region. Taken from Figure 2b in \citet{Elder2023ComputationalValidation} under CC BY 4.0 license.}
\caption{A view of ternary Cr-Nb-W phase diagram projected across a temperature range of phase classes, further augmented by imposing predicted property value constraints. It depicts the smoothness of the infeasible region boundary (red) and the increasingly smooth boundary of the property-constrained region. Taken from Figure 2b in \citet{Elder2023ComputationalValidation} under CC BY 4.0 license.}
\label{infeasibilitygliding:fig:katesphasemap}
\end{figure}


\subsection{Unbiased Exploration Searches} \label{infglide:ssec:unbiasedexplore}

With the infeasibility gliding approach, one can now perform the same traversal over a graph as done in Section~\ref{infglide:sec:exploitgraph}; however, limited to exploring only the neighborhood of the feasible points and, thus, not going into the inside of the infeasible region. This is shown in Figure~\ref{infglide:sec:glide}, in contrast to the earlier Figure~\ref{infeasibilitygliding:fig:fullcomputation}. The presented results are taken from the second \texttt{nimplex} workshop, which has been adapted as Appendix~\ref{chap:nimplextutorial2} and can be consulted for step-by-step details of an example implementation.
With the infeasibility gliding approach, one can now perform the same traversal over a graph as done in Section~\ref{infglide:sec:exploitgraph}; however, it is limited to exploring only the neighborhood of the feasible points and, thus, does not go into the inside of the infeasible region. This highly desirable behavior, reducing computation by a factor of roughly 2, is shown in Figure~\ref{infeasibilitygliding:fig:glide}, in contrast to the earlier Figure~\ref{infeasibilitygliding:fig:fullcomputation}. The presented results are taken from the second \texttt{nimplex} workshop, which has been adapted as Appendix~\ref{chap:nimplextutorial2} and can be consulted for step-by-step details of an example implementation.

\begin{figure}[H]
\centering
\includegraphics[width=0.7\textwidth]{infeasibilitygliding/InfeasibilityGliding_Glide.png}
\caption{The same problem as in Figure \ref{infeasibilitygliding:fig:fullcomputation} solved by iteratively exploring all feasible paths in the compositional graph in depth-first approach, which can be started from one or multiple points, and terminated once goal is reached or once all of the feasible space is explored.}
\caption{The same problem as in Figure \ref{infeasibilitygliding:fig:fullcomputation} solved by iteratively exploring all feasible paths in the compositional graph in a depth-first approach, which can be started from one or multiple points, and terminated once the goal is reached or once all of the feasible space is explored.}
\label{infeasibilitygliding:fig:glide}
\end{figure}
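
The depth-first gliding traversal described in the caption above can be sketched on a toy problem. The example below is only an illustration under stated assumptions: it uses a small square grid in place of the simplex graph produced by \texttt{nimplex}, a hard-coded feasibility rule in place of \texttt{pycalphad} evaluations, and the hypothetical names glide, neighbors, and is\_feasible.

```python
def glide(neighbors, is_feasible, starts, goal=None):
    """Depth-first exploration that expands only feasible nodes.

    neighbors:   dict of node -> list of adjacent nodes (assumed graph format).
    is_feasible: callable evaluated lazily, once per visited node.
    Returns the set of feasible nodes found and a dict of all evaluations made.
    """
    feasible, evaluated = set(), {}
    stack = list(starts)
    while stack:
        node = stack.pop()
        if node in evaluated:
            continue
        evaluated[node] = is_feasible(node)
        if not evaluated[node]:
            continue  # boundary point: evaluated once, never expanded
        feasible.add(node)
        if goal is not None and node == goal:
            break  # terminate early once the goal is reached
        stack.extend(n for n in neighbors[node] if n not in evaluated)
    return feasible, evaluated


# Toy 5x5 grid with an infeasible 3x3 block in the middle.
size = 5
steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
neighbors = {
    (i, j): [(i + di, j + dj) for di, dj in steps
             if 0 <= i + di < size and 0 <= j + dj < size]
    for i in range(size) for j in range(size)
}


def is_feasible(point):
    return not (1 <= point[0] <= 3 and 1 <= point[1] <= 3)


feasible, evaluated = glide(neighbors, is_feasible, starts=[(0, 0)])
```

In this toy case, the center point of the infeasible block is never evaluated, because none of its neighbors is feasible. Only the surface of the infeasible region is touched, and the savings grow with the ratio of interior to surface points.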
