SMACT Intermetallics Enhancements #367

Open
wants to merge 16 commits into
base: develop
from

Conversation

ryannduma
Collaborator

@ryannduma ryannduma commented Jan 16, 2025

SMACT Intermetallics Enhancement

This document describes the enhancements made to SMACT to better handle intermetallic compounds and alloys. The changes introduce a scoring system for identifying and validating intermetallic compounds, moving beyond the simple binary classification previously used.

Key Changes

1. New Module: smact.intermetallics

A dedicated module for handling intermetallic compounds with several specialized functions:

get_metal_fraction(composition)

  • Calculates the fraction of metallic elements in a composition
  • Input: pymatgen Composition object
  • Output: float between 0 and 1
  • Example:
from pymatgen.core import Composition
from smact.intermetallics import get_metal_fraction

# Pure intermetallic - returns 1.0
print(get_metal_fraction(Composition("Fe3Al")))

# Mixed compound - returns fraction
print(get_metal_fraction(Composition("Fe2O3")))
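The calculation behind these values can be sketched without SMACT or pymatgen. This is a minimal illustration, not the PR's implementation: the `ILLUSTRATIVE_METALS` set and the plain `{element: amount}` mapping are assumptions made for the example, whereas the module itself works on a pymatgen Composition and SMACT's element data.

```python
# Hypothetical, deliberately incomplete metal set used only for this sketch;
# the PR's implementation relies on SMACT's own element data instead.
ILLUSTRATIVE_METALS = {"Fe", "Al", "Ni", "Ti", "Na", "Cu", "Au", "Nb"}


def metal_fraction(amounts: dict[str, float]) -> float:
    """Return the atomic fraction of metals in an {element: amount} mapping."""
    total = sum(amounts.values())
    if total == 0:
        raise ValueError("Empty composition")
    metal_total = sum(
        amt for el, amt in amounts.items() if el in ILLUSTRATIVE_METALS
    )
    return metal_total / total


print(metal_fraction({"Fe": 3, "Al": 1}))  # 1.0 (4 of 4 atoms are metals)
print(metal_fraction({"Fe": 2, "O": 3}))   # 0.4 (2 of 5 atoms are metals)
```

The fraction is weighted by amount, so Fe2O3 gives 0.4 rather than 0.5 even though one of its two elements is a metal.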

get_d_electron_fraction(composition)

  • Calculates the fraction of d-block elements
  • Important for transition metal intermetallics
  • Example:
from pymatgen.core import Composition
from smact.intermetallics import get_d_electron_fraction

# Pure transition metal compound - returns 1.0
print(get_d_electron_fraction(Composition("Fe2Nb")))

# Main group compound - returns 0.0
print(get_d_electron_fraction(Composition("Mg2Si")))

get_distinct_metal_count(composition)

  • Counts unique metallic elements
  • Useful for identifying complex intermetallics
  • Example:
from pymatgen.core import Composition
from smact.intermetallics import get_distinct_metal_count

# Binary intermetallic
print(get_distinct_metal_count(Composition("Fe3Al")))  # Returns 2

# Complex HEA-like composition
print(get_distinct_metal_count(Composition("NbTiAlCr")))  # Returns 4

get_pauling_test_mismatch(composition)

  • Calculates deviation from ideal electronegativity ordering
  • Handles both metal-metal and metal-nonmetal pairs
  • Lower scores indicate more intermetallic-like bonding
  • Example:
from pymatgen.core import Composition
from smact.intermetallics import get_pauling_test_mismatch

# Intermetallic - low mismatch
print(get_pauling_test_mismatch(Composition("Fe3Al")))

# Ionic compound - high mismatch
print(get_pauling_test_mismatch(Composition("NaCl")))

intermetallic_score(composition)

  • Main scoring function combining multiple metrics
  • Returns a score between 0 and 1
  • Higher scores indicate more intermetallic character
  • Example:
from smact.intermetallics import intermetallic_score

# Classic intermetallics - high scores
print(intermetallic_score("Fe3Al"))  # ~0.85
print(intermetallic_score("Ni3Ti"))  # ~0.82

# Non-intermetallics - low scores
print(intermetallic_score("NaCl"))  # ~0.20
print(intermetallic_score("Fe2O3"))  # ~0.45
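How the individual metrics might combine into one score can be sketched as a weighted sum. The weights and the normalisation to three metals are the ones quoted in the review discussion of this PR. The function name `combine` and the input values are made up for illustration and are not the computed metrics of any real compound.

```python
# Weights as quoted in the review discussion of this PR.
WEIGHTS = {
    "metal_fraction": 0.3,
    "d_electron": 0.2,
    "n_metals": 0.2,
    "vec": 0.15,
    "pauling": 0.15,
}


def combine(metal_fraction, d_electron_fraction, n_metals, vec_factor, pauling_mismatch):
    """Weighted sum of the five descriptors (hypothetical helper for this sketch)."""
    return (
        WEIGHTS["metal_fraction"] * metal_fraction
        + WEIGHTS["d_electron"] * d_electron_fraction
        + WEIGHTS["n_metals"] * min(1.0, n_metals / 3.0)  # saturates at 3 metals
        + WEIGHTS["vec"] * vec_factor
        + WEIGHTS["pauling"] * (1.0 - pauling_mismatch)  # low mismatch raises the score
    )


# Made-up inputs chosen only to show the arithmetic.
score = combine(
    metal_fraction=1.0,
    d_electron_fraction=0.75,
    n_metals=2,
    vec_factor=0.5,
    pauling_mismatch=0.2,
)
print(f"{score:.2f}")  # 0.78
```

Because the weights sum to 1 and each term is normally between 0 and 1, the result stays in the documented 0 to 1 range for typical inputs.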

2. Enhanced smact_validity

The existing smact_validity function in smact.screening has been enhanced:

  • New parameter intermetallic_threshold (default: 0.7). TODO: change the default to 0.5, which the classification task suggests is the best value for this parameter
  • Uses the scoring system when include_alloys=True
  • More nuanced than previous binary metal check
  • Example:
from smact.screening import smact_validity

# Check with intermetallic detection
print(smact_validity("Fe3Al", include_alloys=True))  # True
print(smact_validity("NaCl", include_alloys=True))  # False

# Adjust threshold for stricter filtering
print(smact_validity("Fe3Al", include_alloys=True, intermetallic_threshold=0.8))
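The effect of the threshold can be illustrated with the approximate scores quoted in the examples above. This sketch covers only the score comparison, not the other checks smact_validity performs. The helper name `passing` and the inclusive `>=` comparison are assumptions made for the example.

```python
# Approximate scores quoted in the examples above.
example_scores = {"Fe3Al": 0.85, "Ni3Ti": 0.82, "NaCl": 0.20, "Fe2O3": 0.45}


def passing(scores: dict[str, float], threshold: float = 0.7) -> list[str]:
    """Return the formulas whose score reaches the threshold.

    Assumption: the comparison is inclusive; the PR may use a strict one.
    """
    return sorted(formula for formula, s in scores.items() if s >= threshold)


for threshold in (0.4, 0.7, 0.84):
    print(threshold, passing(example_scores, threshold))
# 0.4 ['Fe2O3', 'Fe3Al', 'Ni3Ti']
# 0.7 ['Fe3Al', 'Ni3Ti']
# 0.84 ['Fe3Al']
```

Lowering the threshold admits compounds with partial metallic character such as Fe2O3, while raising it keeps only the highest-scoring candidates.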

Differences from Previous Version

Before

  • Simple binary check for all-metal compositions
  • No distinction between intermetallics and other metallic phases
  • Limited handling of mixed bonding character
  • Binary valid/invalid classification

After

  • Scoring system using multiple chemical descriptors
  • Better handling of partial metallic character
  • Consideration of d-electron contributions
  • Continuous scoring (0-1) for more nuanced classification
  • Adjustable threshold, and weightings for the combined rule, to suit different applications
    - The intermetallic_threshold in smact_validity() can be raised or lowered. The literature shows that, in real materials, the line between "intermetallic" and "ionic/metallic" can be fuzzy. A tunable threshold accommodates different research needs (e.g., searching for strongly metallic Heuslers vs. half-metallic systems).

Usage Examples

Basic Screening

from smact.screening import smact_validity
from smact.intermetallics import intermetallic_score

compounds = [
    "Fe3Al",  # Classic intermetallic
    "Ni3Ti",  # Superalloy
    "NaCl",  # Ionic
    "Fe2O3",  # Metal oxide
]

for compound in compounds:
    score = intermetallic_score(compound)
    is_valid = smact_validity(compound, include_alloys=True)
    print(f"{compound}: score={score:.2f}, valid={is_valid}")

Advanced Usage

from pymatgen.core import Composition
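from smact.intermetallics import (
    get_d_electron_fraction,
    get_distinct_metal_count,
    get_metal_fraction,
    get_pauling_test_mismatch,
    intermetallic_score,
)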

# Detailed analysis of a compound
comp = Composition("Fe3Al")
metrics = {
    "metal_fraction": get_metal_fraction(comp),
    "d_electron_fraction": get_d_electron_fraction(comp),
    "distinct_metals": get_distinct_metal_count(comp),
    "pauling_mismatch": get_pauling_test_mismatch(comp),
    "overall_score": intermetallic_score(comp),
}

Known Limitations and Pitfalls

  1. Electronegativity Data

    • Some elements may lack Pauling electronegativity data
    • Falls back to default behavior in these cases
  2. VEC Calculation

    • Assumes simple electron counting rules
    • May not capture complex electronic structures
  3. Threshold Selection

    • Default threshold (0.7) may need adjustment for specific applications
    • Consider domain-specific validation
  4. Complex Compositions

    • High-entropy alloys may need different weighting schemes
    • Current weights optimized for binary/ternary systems

Future Development Directions

  1. Additional Features

    • Incorporate atomic size factors
    • Add structure prediction capabilities
    • Include formation energy estimates - linking back to Miedema's model DOI Link
  2. Validation and Refinement

    • Benchmark against experimental databases
    • Refine scoring weights with more data
    • Add support for mixed-valence compounds
  3. Extended Functionality

    • Add support for partial occupancies
    • Include temperature-dependent properties
    • Integrate with phase diagram tools

References

  1. Original SMACT paper: SMACT: Semiconducting Materials by Analogy and Chemical Theory

  2. Intermetallics theory and classification literature sources:

    • D.G. Pettifor introduced the concept of a single "chemical scale" or "structure map" coordinate (Pettifor number) to systematically separate compound classes. The new intermetallic_score is a step in that direction, but customized to SMACT's internal data structures.

      • Reference: D.G. Pettifor, "A chemical scale for crystal-structure maps," Solid State Communications. 51 (1984) 31–34. DOI Link
    • Charge transfer and atomic size mismatch also play a pivotal role in stabilizing intermetallic phases. Miedema's framework quantifies these effects, making it useful for predicting alloying behavior and crystal structure. The parameters coded here are conceptually similar but do not implement Miedema's model directly.

      • Reference: A.R. Miedema, Cohesion in alloys - fundamentals of a semi-empirical model. DOI Link
  3. Electronegativity scales (Pauling electronegativity)

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Test A - Unit Tests
  • Test B - XGBoost Classification exercise

Test A - Unit Tests (smact/tests/test_intermetallics.py):

  • Tests individual feature functions
  • Validates edge cases
  • Key test cases include
  # Known intermetallics
  self.intermetallics = [
      "Fe3Al",  # Classic intermetallic
      "Ni3Ti",  # Superalloy component
      "Cu3Au",  # Ordered alloy
      "Fe2Nb",  # Laves phase
  ]

  # Known non-intermetallics
  self.non_intermetallics = [
      "NaCl",   # Ionic
      "SiO2",   # Covalent
      "Fe2O3",  # Metal oxide
      "CuSO4",  # Complex ionic
  ]

Test B: Simple XGB classification task with new features and parameter tuning on intermetallic_threshold (new_intermetallics_classification.ipynb):

  • Uses matbench_expt_is_metal dataset
  • Implements XGBoost classifier
  • Performs feature importance analysis

Classification Workflow Details:

  1. Data Loading & Preparation:
from matminer.datasets import load_dataset
df = load_dataset("matbench_expt_is_metal")  # 4921 entries
  2. Feature Extraction:
  • Metal fraction
  • d-electron fraction
  • Distinct metal count
  • Pauling mismatch
  • Overall intermetallic score
  3. Model Training:
  • XGBoost classifier with hyperparameter tuning
  • Cross-validation with threshold optimization
  • Feature importance analysis using SHAP

Classification results:

Test Accuracy: 0.852 with intermetallic_threshold=0.5

Classification Report:
              precision    recall  f1-score   support

       False       0.84      0.87      0.85       494
        True       0.86      0.83      0.85       491

    accuracy                           0.85       985
   macro avg       0.85      0.85      0.85       985
weighted avg       0.85      0.85      0.85       985
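As a quick consistency check, the overall accuracy can be recovered from the per-class recall and support in the report, since accuracy is the support-weighted mean of the recalls. The recalls are rounded to two decimals in the report, so the result only approximates the stated 0.852.

```python
# Per-class recall and support copied from the classification report above.
report = {
    "False": {"recall": 0.87, "support": 494},
    "True": {"recall": 0.83, "support": 491},
}

total = sum(row["support"] for row in report.values())
# recall * support estimates the number of correctly classified samples per class.
correct = sum(row["recall"] * row["support"] for row in report.values())
accuracy = correct / total

print(total)              # 985
print(f"{accuracy:.3f}")  # 0.850
```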

[Figure: confusion matrix]
[Figure: SHAP analysis]

Reproduction Instructions:

  1. Environment Setup:
# Create conda environment
conda create -n smact-env python=3.11
conda activate smact-env

# Install dependencies
pip install smact matminer xgboost shap scikit-learn pandas numpy matplotlib seaborn
  2. Run Unit Tests:
python -m pytest smact/tests/test_intermetallics.py -v
  3. Run Classification Notebook:
jupyter notebook new_intermetallics_classification.ipynb

Key Results

Unit Test Results:

  • All feature functions pass edge cases
  • Intermetallic scoring correctly differentiates between known compounds
  • Error handling works as expected

Classification Performance:

  • High accuracy in metal vs. non-metal classification
  • Feature importance ranking shows:
    1. Metal fraction
    2. d-electron fraction
    3. Intermetallic score
    4. Pauling mismatch
    5. Distinct metal count

Test Configuration:

  • Python version: 3.11
  • Operating System: Linux
  • Key package versions:
    • smact: latest
    • pymatgen: latest
    • xgboost: latest
    • scikit-learn: latest
    • pandas: latest
    • numpy: latest
    • shap: latest

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced advanced intermetallic compound analysis capabilities in the SMACT framework.
    • Added sophisticated scoring system for evaluating intermetallic character of chemical compositions.
    • Implemented new functions for calculating metal fraction, d-electron fraction, distinct metal count, and assessing electronegativity deviations.
  • Improvements

    • Enhanced compound screening with more nuanced validation methods, including intermetallic scoring.
    • Updated smact_validity function to support intermetallic validation with adjustable thresholds.
  • Documentation

    • Updated documentation with comprehensive usage examples and function descriptions.
  • Tests

    • Added a new test suite to validate the functionality of the intermetallics module, covering various edge cases and standard scenarios.

These updates provide researchers with more sophisticated tools for analysing and screening intermetallic compounds.

@ryannduma ryannduma requested a review from AntObi January 16, 2025 17:35
@ryannduma ryannduma self-assigned this Jan 16, 2025
Contributor

coderabbitai bot commented Jan 16, 2025

Important

Review skipped

Review was skipped as selected files did not have any reviewable changes.

💤 Files selected but had no reviewable changes (1)
  • docs/examples/intermetallics_classification.ipynb

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This pull request introduces a new module smact.intermetallics to enhance the SMACT framework's capabilities for analysing intermetallic compounds. The module provides several specialised functions to calculate metal fraction, d-electron fraction, distinct metal count, and an overall intermetallic score. The smact_validity function in the screening module has been updated to incorporate a new validation path for intermetallics, moving from a binary classification to a more sophisticated scoring system that considers multiple chemical descriptors.

Changes

| File | Change Summary |
| --- | --- |
| docs/intermetallics_readme.md | Added comprehensive documentation for new intermetallics module and its functions |
| smact/intermetallics.py | Introduced new utility functions for handling intermetallic compounds, including get_metal_fraction(), get_d_electron_fraction(), get_distinct_metal_count(), get_pauling_test_mismatch(), and intermetallic_score() |
| smact/screening.py | Updated smact_validity() function with new parameters check_intermetallic and intermetallic_threshold |
| smact/tests/test_intermetallics.py | Added comprehensive test suite for new intermetallics module functions |

Sequence Diagram

sequenceDiagram
    participant User
    participant SMACTValidity
    participant Intermetallics
    
    User->>SMACTValidity: Call smact_validity()
    SMACTValidity->>Intermetallics: Calculate intermetallic_score()
    Intermetallics-->>SMACTValidity: Return score
    SMACTValidity->>SMACTValidity: Compare score to threshold
    SMACTValidity-->>User: Return validity result

Possibly related PRs

Suggested labels

tests, docs

Poem

🐰 Intermetallics dance and sway,
Metals mingling in their own way,
Electrons twirling, fractions bright,
SMACT's new module takes flight!
A scientific rabbit's delight! 🔬


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (7)
smact/intermetallics.py (2)

68-68: Consider handling missing electronegativity values appropriately

Currently, if any electronegativity values are None, the function returns a mismatch score of 0.0, which may inaccurately imply a perfect match. It might be more appropriate to raise a warning or return a distinct value indicating that the score could not be computed.

Apply this diff to handle missing values:

 if None in electronegativities:
-    return 0.0
+    # Return a distinct value or raise an error
+    return float('nan')  # Alternatively, raise a ValueError
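One consequence of returning NaN is worth spelling out: every ordered comparison with NaN is False, so callers need an explicit check. The sketch below illustrates this. The function `mismatch_or_nan` and its placeholder metric (the spread of the values) are hypothetical and stand in for the PR's actual mismatch calculation.

```python
import math


def mismatch_or_nan(electronegativities: list) -> float:
    """Return NaN when any value is missing (hypothetical stand-in function)."""
    if any(en is None for en in electronegativities):
        return float("nan")
    # Placeholder metric for this sketch only: spread of the values.
    return max(electronegativities) - min(electronegativities)


result = mismatch_or_nan([1.83, None])

print(result < 0.5)         # False
print(result >= 0.5)        # False: both comparisons fail for NaN
print(math.isnan(result))   # True: the only reliable check

# NaN also propagates through arithmetic, so a weighted sum becomes NaN too.
print(math.isnan(0.15 * (1.0 - result)))  # True
```

If NaN is adopted, any threshold comparison downstream would silently evaluate to False unless the caller tests for NaN first, which is one argument for raising a ValueError instead.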

23-24: Reduce code duplication in similar functions

The functions get_metal_fraction and get_d_electron_fraction share similar structures. Consider refactoring to eliminate duplication and improve maintainability.

You could abstract the common logic into a helper function:

import smact
from pymatgen.core import Composition


def get_element_fraction(composition: Composition, element_set: set[str]) -> float:
    """Return the fraction of the composition made up of elements in element_set."""
    total_amt = sum(composition.values())
    target_amt = sum(
        amt for el, amt in composition.items() if el.symbol in element_set
    )
    return target_amt / total_amt

def get_metal_fraction(composition: Composition) -> float:
    return get_element_fraction(composition, smact.metals)

def get_d_electron_fraction(composition: Composition) -> float:
    return get_element_fraction(composition, smact.d_block)

Also applies to: 37-38

new_intermetallics_mod_usage_example.py (1)

35-38: Enhance output formatting for clarity

Consider improving the output formatting to make the results more readable and informative.

Apply this diff to enhance the print statements:

 print(f"\nCompound: {compound}")
-print(f"Standard validity: {is_valid_standard}")
-print(f"With intermetallic detection: {is_valid_intermetallic}")
-print(f"Intermetallic score: {score:.2f}")
+print(f"Standard Validity: {is_valid_standard}")
+print(f"Intermetallic Detection Validity: {is_valid_intermetallic}")
+print(f"Intermetallic Score: {score:.2f}")
smact/tests/test_intermetallics.py (2)

43-46: Use assertAlmostEqual for floating-point comparisons

When comparing floating-point numbers, especially results of calculations, it's safer to use assertAlmostEqual to account for potential precision errors.

Apply this diff to update the assertions:

-        self.assertEqual(get_metal_fraction(Composition("Fe3Al")), 1.0)
+        self.assertAlmostEqual(get_metal_fraction(Composition("Fe3Al")), 1.0, places=6)

-        self.assertEqual(get_metal_fraction(Composition("SiO2")), 0.0)
+        self.assertAlmostEqual(get_metal_fraction(Composition("SiO2")), 0.0, places=6)
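The motivation is easiest to see with fractions that are not exactly representable in binary floating point. A minimal, self-contained sketch using only the standard library, with a test class name invented for the example:

```python
import unittest


class FloatComparisonExample(unittest.TestCase):
    def test_fraction_sum(self):
        # Three parts of 0.1 do not sum to exactly 0.3 in binary floating point.
        total = 0.1 + 0.1 + 0.1
        self.assertNotEqual(total, 0.3)
        # assertAlmostEqual rounds the difference to `places` decimal places.
        self.assertAlmostEqual(total, 0.3, places=6)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(FloatComparisonExample)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Values such as 4/4 happen to be exact, so assertEqual would pass for Fe3Al, but the tolerant form keeps the tests robust if the calculation changes.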

79-86: Provide informative assertion messages

Including informative messages in your assertions can help identify failures more easily during testing.

Apply this diff to add messages to assertions:

         for formula in self.intermetallics:
             score = intermetallic_score(formula)
-            self.assertTrue(score > 0.7, f"Expected high score (>0.7) for {formula}, got {score}")

+            self.assertTrue(
+                score > 0.7,
+                msg=f"Expected high intermetallic score (>0.7) for {formula}, but got {score:.2f}",
+            )
 
         for formula in self.non_intermetallics:
             score = intermetallic_score(formula)
-            self.assertTrue(score < 0.5, f"Expected low score (<0.5) for {formula}, got {score}")

+            self.assertTrue(
+                score < 0.5,
+                msg=f"Expected low intermetallic score (<0.5) for {formula}, but got {score:.2f}",
+            )
docs/intermetallics_readme.md (2)

89-90: Add explanation for example scores.

The example scores would be more helpful if accompanied by brief explanations of why these compounds receive their respective scores. For instance, explain why Fe3Al scores ~0.85 in terms of its metallic fraction, d-electron contribution, etc.

Also applies to: 93-94


178-178: Fix grammatical issues.

Please apply these corrections:

  • Line 178: Add a period after "in these cases"
  • Line 231: Add a comma after "here" in "parameters coded here while conceptually similar"

Also applies to: 231-231

🧰 Tools
🪛 LanguageTool

[uncategorized] ~178-~178: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...

(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a92e123 and e4f4d84.

📒 Files selected for processing (5)
  • docs/intermetallics_readme.md (1 hunks)
  • new_intermetallics_mod_usage_example.py (1 hunks)
  • smact/intermetallics.py (1 hunks)
  • smact/screening.py (4 hunks)
  • smact/tests/test_intermetallics.py (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/intermetallics_readme.md

[uncategorized] ~80-~80: A determiner appears to be missing. Consider inserting it.
Context: ...# intermetallic_score(composition) - Main scoring function combining multiple met...

(AI_EN_LECTOR_MISSING_DETERMINER)



[uncategorized] ~187-~187: A determiner appears to be missing. Consider inserting it.
Context: ...tures 3. Threshold Selection - Default threshold (0.7) may need adjustment for...

(AI_EN_LECTOR_MISSING_DETERMINER)


[uncategorized] ~231-~231: A comma might be missing here.
Context: ...rameters coded here, while conceptually similar have not implemented Miedema directly. ...

(AI_EN_LECTOR_MISSING_PUNCTUATION_COMMA)

🔇 Additional comments (5)
smact/intermetallics.py (1)

119-119: Re-evaluate the default vec_factor value

When the valence electron count cannot be calculated, the vec_factor defaults to 0.5. Consider whether this default appropriately reflects the lack of information, or if a different value or approach would be more suitable.

Would you like to revisit the default value for vec_factor to ensure it aligns with the scoring intentions?
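To make the question concrete, the factor can be tabulated for a few valence electron counts using the formula quoted in the review of smact/intermetallics.py. In this sketch `None` stands in for the case where the count cannot be calculated, which is an assumption made for the example. In the PR that case is signalled by a ValueError.

```python
def vec_factor(vec, target=8.0, fallback=0.5):
    """1 - |vec - target| / target, with a fallback when vec is unavailable."""
    if vec is None:  # stand-in for the ValueError path in the PR
        return fallback
    return 1.0 - abs(vec - target) / target


for vec in (8, 4, 12, 16, 20, None):
    print(vec, vec_factor(vec))
# 8 1.0
# 4 0.5
# 12 0.5
# 16 0.0
# 20 -0.5
# None 0.5
```

Two things stand out: the fallback of 0.5 is indistinguishable from a real count of 4 or 12, and the unclamped formula goes negative above 16, which could push the overall score outside the documented 0 to 1 range.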

smact/screening.py (4)

21-21: LGTM!

The import statement follows Python best practices by importing only the required function.


441-441: LGTM!

The new parameter is well-documented and follows consistent naming conventions. The default threshold value of 0.7 aligns with the documentation's recommendations for intermetallic classification.

Also applies to: 447-447, 463-464


477-477: LGTM!

The added empty line improves code readability by separating logical blocks.


479-481: LGTM!

The implementation cleanly integrates the new intermetallic scoring system, providing a more sophisticated approach to classifying intermetallic compounds.

AntObi
AntObi previously requested changes Jan 16, 2025
@AntObi AntObi added the enhancement, python, and feature labels Jan 16, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (5)
smact/intermetallics.py (2)

1-11: Enhance module documentation to better describe its purpose and functionality.

The current docstring is quite minimal. Consider expanding it to include:

  • Overview of what intermetallic compounds are
  • Key features and functions provided by the module
  • Example usage
  • References to relevant literature or methodologies
-"""Utility functions for handling intermetallic compounds in SMACT."""
+"""Functions for analysing and scoring intermetallic compounds in SMACT.
+
+This module provides a comprehensive set of utilities for identifying and
+characterising intermetallic compounds, moving beyond simple binary classification
+to a more nuanced scoring system. Key features include:
+
+- Metal fraction analysis
+- D-electron contribution assessment
+- Pauling electronegativity ordering evaluation
+- Composite intermetallic scoring
+
+Example:
+    >>> from smact.intermetallics import intermetallic_score
+    >>> score = intermetallic_score("Fe3Al")
+    >>> print(f"Intermetallic character score: {score:.2f}")
+
+References:
+    1. Pauling, L. (1947). J. Am. Chem. Soc., 69(3), 542-553.
+    2. Villars, P. (1983). J. Less Common Met., 92(2), 215-238.
+"""

102-154: Make scoring configuration more flexible and enhance documentation.

Consider making the weights and thresholds configurable and documenting the scoring components in more detail.

+# Default weights for scoring components
+DEFAULT_WEIGHTS = {
+    "metal_fraction": 0.3,
+    "d_electron": 0.2,
+    "n_metals": 0.2,
+    "vec": 0.15,
+    "pauling": 0.15,
+}
+
-def intermetallic_score(composition: str | Composition) -> float:
+def intermetallic_score(
+    composition: str | Composition,
+    weights: dict[str, float] | None = None,
+    max_metals: int = 3,
+    target_vec: float = 8.0,
+) -> float:
     """Calculate a score indicating how likely a composition is to be an intermetallic compound.

     The score is based on several heuristics:
-    1. Fraction of metallic elements
-    2. Number of distinct metals
-    3. Presence of d-block elements
-    4. Electronegativity differences
-    5. Valence electron count
+    1. Fraction of metallic elements (weight: 0.3)
+       Higher fraction indicates stronger intermetallic character
+    2. Number of distinct metals (weight: 0.2)
+       Normalised to max_metals (default: 3)
+    3. Presence of d-block elements (weight: 0.2)
+       Higher d-electron fraction indicates stronger intermetallic character
+    4. Electronegativity differences (weight: 0.15)
+       Lower Pauling mismatch indicates stronger intermetallic character
+    5. Valence electron count (weight: 0.15)
+       Normalised around target_vec (default: 8.0)

     Args:
         composition: Chemical formula as string or pymatgen Composition
+        weights: Optional custom weights for scoring components
+        max_metals: Maximum number of metals for normalisation
+        target_vec: Target valence electron count for normalisation

     Returns:
         float: Score between 0 and 1, where higher values indicate more intermetallic character
     """
     if isinstance(composition, str):
         composition = Composition(composition)

+    # Use default weights if none provided
+    weights = weights or DEFAULT_WEIGHTS
+
     # 1. Basic metrics
     metal_fraction = get_metal_fraction(composition)
     d_electron_fraction = get_d_electron_fraction(composition)
     n_metals = get_distinct_metal_count(composition)

     # 2. Electronic structure indicators
     try:
         vec = valence_electron_count(composition.reduced_formula)
-        vec_factor = 1.0 - (abs(vec - 8.0) / 8.0)  # Normalized around VEC=8
+        vec_factor = 1.0 - (abs(vec - target_vec) / target_vec)
     except ValueError:
         vec_factor = 0.5  # Default if we can't calculate VEC

     # 3. Bonding character
     pauling_mismatch = get_pauling_test_mismatch(composition)

-    # 4. Calculate weighted score
-    # These weights can be tuned based on empirical testing
-    weights = {"metal_fraction": 0.3, "d_electron": 0.2, "n_metals": 0.2, "vec": 0.15, "pauling": 0.15}
-
     score = (
         weights["metal_fraction"] * metal_fraction
         + weights["d_electron"] * d_electron_fraction
-        + weights["n_metals"] * min(1.0, n_metals / 3.0)  # Normalized to max of 3 metals
+        + weights["n_metals"] * min(1.0, n_metals / max_metals)
         + weights["vec"] * vec_factor
         + weights["pauling"] * (1.0 - pauling_mismatch)  # Invert mismatch score
     )
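If custom weights are accepted, the documented 0 to 1 range only holds when the weights are non-negative and sum to 1. A small validation step could enforce that. The helper name `check_weights` and its error messages are hypothetical and not part of the suggested diff.

```python
import math

# Default weights as given in the suggestion above.
DEFAULT_WEIGHTS = {
    "metal_fraction": 0.3,
    "d_electron": 0.2,
    "n_metals": 0.2,
    "vec": 0.15,
    "pauling": 0.15,
}


def check_weights(weights: dict[str, float]) -> dict[str, float]:
    """Validate custom weights before they are used (hypothetical helper)."""
    missing = DEFAULT_WEIGHTS.keys() - weights.keys()
    if missing:
        raise ValueError(f"Missing weights: {sorted(missing)}")
    if any(w < 0 for w in weights.values()):
        raise ValueError("Weights must be non-negative")
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        raise ValueError("Weights must sum to 1")
    return weights


check_weights(DEFAULT_WEIGHTS)  # passes silently

try:
    check_weights({**DEFAULT_WEIGHTS, "pauling": 0.5})
except ValueError as err:
    print(err)  # Weights must sum to 1
```

An alternative design would be to renormalise the weights instead of rejecting them, at the cost of silently changing what the caller passed in.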
smact/tests/test_intermetallics.py (3)

5-41: Standardise testing framework usage.

The test file mixes unittest and pytest frameworks. Choose one framework consistently:

  1. If using unittest, replace pytest.raises with self.assertRaises
  2. If using pytest, remove unittest.TestCase inheritance and use pytest fixtures

The test data setup is well-organised and comprehensive.


42-67: Consider parametrizing test cases for better maintainability.

The test cases for get_element_fraction could be parametrized to make them more maintainable and easier to extend.

+    @pytest.mark.parametrize(
+        "composition,element_set,expected,message",
+        [
+            ("Fe3Al", smact.metals, 1.0, "Expected all elements in Fe3Al to be metals"),
+            ("Fe2Nb", smact.d_block, 1.0, "Expected all elements in Fe2Nb to be d-block"),
+            ("Fe3Al", set(), 0.0, "Expected zero fraction for empty element set"),
+        ],
+    )
+    def test_get_element_fraction(self, composition, element_set, expected, message):
+        """Test the helper function for element fraction calculations."""
+        self.assertAlmostEqual(
+            get_element_fraction(composition, element_set),
+            expected,
+            places=6,
+            msg=message,
+        )
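Note that `pytest.mark.parametrize` is not supported on methods of `unittest.TestCase` subclasses, so the snippet above only works once the `TestCase` inheritance is dropped. If the suite stays on unittest, `subTest` gives a similar table-driven layout. The sketch below assumes a hypothetical `element_fraction` that takes a plain `{symbol: amount}` dict in place of `get_element_fraction`, and small illustrative element sets rather than `smact.metals` and `smact.d_block`.

```python
import unittest

# Small illustrative sets; the real code would use smact.metals / smact.d_block.
METALS = {"Fe", "Al", "Nb"}
D_BLOCK = {"Fe", "Nb"}


def element_fraction(amounts, element_set):
    """Hypothetical stand-in for get_element_fraction, taking {symbol: amount}."""
    total = sum(amounts.values())
    return sum(a for el, a in amounts.items() if el in element_set) / total


class TestElementFraction(unittest.TestCase):
    CASES = [
        ({"Fe": 3, "Al": 1}, METALS, 1.0, "all elements in Fe3Al are metals"),
        ({"Fe": 2, "Nb": 1}, D_BLOCK, 1.0, "all elements in Fe2Nb are d-block"),
        ({"Fe": 3, "Al": 1}, set(), 0.0, "empty element set gives zero"),
    ]

    def test_element_fraction(self):
        for amounts, element_set, expected, message in self.CASES:
            # Each case is reported separately if it fails.
            with self.subTest(msg=message):
                self.assertAlmostEqual(
                    element_fraction(amounts, element_set), expected, places=6
                )


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestElementFraction)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```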

160-175: Expand edge case coverage and standardise exception testing.

  1. Add more edge cases
  2. Standardise exception testing with the chosen framework
     def test_edge_cases(self):
         """Test edge cases and error handling."""
         # Single element
         score = intermetallic_score("Fe")
         self.assertTrue(
             0.0 <= score <= 1.0,
             msg=f"Expected score between 0 and 1 for Fe, got {score:.2f}",
         )

+        # Test with very large composition
+        large_comp = "Fe" * 100 + "Al" * 100
+        score = intermetallic_score(large_comp)
+        self.assertTrue(
+            0.0 <= score <= 1.0,
+            msg=f"Expected score between 0 and 1 for large composition, got {score:.2f}",
+        )
+
         # Empty composition should raise ValueError rather than crash
-        with pytest.raises(ValueError, match="Empty composition"):
+        with self.assertRaisesRegex(ValueError, "Empty composition"):
             intermetallic_score("")

         # Invalid formula should raise error
-        with pytest.raises(ValueError, match="Invalid formula"):
+        with self.assertRaisesRegex(ValueError, "Invalid formula"):
             intermetallic_score("NotAnElement")
+
+        # Test with special characters
+        with self.assertRaisesRegex(ValueError, "Invalid formula"):
+            intermetallic_score("Fe@Al")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e4f4d84 and 1ff91b8.

📒 Files selected for processing (3)
  • new_intermetallics_mod_usage_example.py (1 hunks)
  • smact/intermetallics.py (1 hunks)
  • smact/tests/test_intermetallics.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • new_intermetallics_mod_usage_example.py
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: test (3.12, windows-latest)
  • GitHub Check: test (3.12, macos-latest)
  • GitHub Check: test (3.12, ubuntu-latest)
  • GitHub Check: test (3.11, windows-latest)
  • GitHub Check: test (3.11, macos-latest)
  • GitHub Check: test (3.11, ubuntu-latest)
  • GitHub Check: test (3.10, windows-latest)
  • GitHub Check: test (3.10, macos-latest)
  • GitHub Check: test (3.10, ubuntu-latest)
🔇 Additional comments (3)
smact/intermetallics.py (2)

71-100: ⚠️ Potential issue

Address Python version compatibility and clarify logic.

  1. Remove Python 3.10+ specific strict=False parameter
  2. Clarify the metal-nonmetal pair logic with better variable names and comments
  3. Consider using absolute values for consistent comparison
     # Calculate pairwise differences
     mismatches = []
-    for i, (el1, eneg1) in enumerate(zip(elements, electronegativities, strict=False)):
-        for el2, eneg2 in zip(elements[i + 1 :], electronegativities[i + 1 :], strict=False):
+    for i, (el1, eneg1) in enumerate(zip(elements, electronegativities)):
+        for el2, eneg2 in zip(elements[i + 1 :], electronegativities[i + 1 :]):
             # For metal pairs, we expect small electronegativity differences
             if el1.symbol in smact.metals and el2.symbol in smact.metals:
-                mismatches.append(abs(eneg1 - eneg2))
+                metal_pair_mismatch = abs(eneg1 - eneg2)
+                mismatches.append(metal_pair_mismatch)
             # For metal-nonmetal pairs, we expect larger differences
             elif (el1.symbol in smact.metals) != (el2.symbol in smact.metals):
-                mismatches.append(1.0 - abs(eneg1 - eneg2))
+                # Invert the difference for metal-nonmetal pairs as we expect
+                # larger differences (closer to 1.0) to indicate ionic rather
+                # than intermetallic character
+                ionic_character = abs(eneg1 - eneg2)
+                intermetallic_character = 1.0 - ionic_character
+                mismatches.append(intermetallic_character)

Likely invalid or redundant comment.
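For reference, `zip` without `strict` stops at the shortest input, which is also what `strict=False` does, so dropping the argument does not change behaviour. The nested index loop can alternatively be written with `itertools.combinations`. The sketch below is a simplification: `mean_pairwise_difference` is a hypothetical helper, the electronegativities are approximate Pauling values hard-coded for illustration, and the metal/non-metal branching from the diff is deliberately left out.

```python
from itertools import combinations

# Approximate Pauling electronegativities, hard-coded for illustration.
PAULING = {"Fe": 1.83, "Al": 1.61, "Na": 0.93, "Cl": 3.16}


def mean_pairwise_difference(symbols):
    """Hypothetical helper: mean |difference| over all unordered element pairs."""
    pairs = list(combinations(symbols, 2))
    if not pairs:
        return 0.0  # A single element has no pairs to compare.
    return sum(abs(PAULING[a] - PAULING[b]) for a, b in pairs) / len(pairs)


fe_al = mean_pairwise_difference(["Fe", "Al"])
na_cl = mean_pairwise_difference(["Na", "Cl"])
print(f"Fe-Al: {fe_al:.2f}, Na-Cl: {na_cl:.2f}")
```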


32-57: 🛠️ Refactor suggestion

Maintain consistency with input handling across functions.

Apply the same input flexibility pattern to these functions for consistency with get_element_fraction.

-def get_metal_fraction(composition: Composition) -> float:
+def get_metal_fraction(composition: str | Composition) -> float:
     """Calculate the fraction of metallic elements in a composition.
     Implemented using get_element_fraction helper with smact.metals set.

     Args:
-        composition: A pymatgen Composition object
+        composition: Chemical formula as string or pymatgen Composition object

     Returns:
         float: Fraction of the composition that consists of metallic elements (0-1)
+
+    Raises:
+        ValueError: If the composition is invalid or empty
     """
     return get_element_fraction(composition, smact.metals)

-def get_d_electron_fraction(composition: Composition) -> float:
+def get_d_electron_fraction(composition: str | Composition) -> float:
     """Calculate the fraction of d-block elements in a composition.
     Implemented using get_element_fraction helper with smact.d_block set.

     Args:
-        composition: A pymatgen Composition object
+        composition: Chemical formula as string or pymatgen Composition object

     Returns:
         float: Fraction of the composition that consists of d-block elements (0-1)
+
+    Raises:
+        ValueError: If the composition is invalid or empty
     """
     return get_element_fraction(composition, smact.d_block)

Likely invalid or redundant comment.

smact/tests/test_intermetallics.py (1)

129-141: 🛠️ Refactor suggestion

Enhance Pauling test assertions and documentation.

The test could be more explicit about the expected behaviour and use absolute values as suggested in past reviews.

     def test_get_pauling_test_mismatch(self):
-        """Test Pauling electronegativity mismatch calculation."""
+        """Test Pauling electronegativity mismatch calculation.
+
+        The mismatch score should be:
+        - Lower for intermetallic compounds (similar electronegativities)
+        - Higher for ionic compounds (large electronegativity differences)
+        """
         # Ionic compounds should have high mismatch
         nacl_mismatch = get_pauling_test_mismatch(Composition("NaCl"))

         # Intermetallics should have lower mismatch
         fe3al_mismatch = get_pauling_test_mismatch(Composition("Fe3Al"))

         self.assertTrue(
-            fe3al_mismatch < nacl_mismatch,
+            abs(fe3al_mismatch) < abs(nacl_mismatch),
             msg=f"Expected lower Pauling mismatch for Fe3Al ({fe3al_mismatch:.2f}) than NaCl ({nacl_mismatch:.2f})",
         )

Likely invalid or redundant comment.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
smact/intermetallics.py (1)

107-108: ⚠️ Potential issue

Remove Python 3.10+ specific parameter.

The strict parameter in zip is only available in Python 3.10 and above.

Apply this diff to maintain compatibility with earlier Python versions:

-for i, (el1, eneg1) in enumerate(zip(elements, electronegativities, strict=False)):
+for i, (el1, eneg1) in enumerate(zip(elements, electronegativities)):
-    for el2, eneg2 in zip(elements[i + 1 :], electronegativities[i + 1 :], strict=False):
+    for el2, eneg2 in zip(elements[i + 1 :], electronegativities[i + 1 :]):
🧹 Nitpick comments (1)
smact/intermetallics.py (1)

119-170: Consider making weights configurable.

The scoring weights are hardcoded and might need adjustment based on empirical testing. Consider making them configurable through function parameters or a configuration file.

Apply this diff to make weights configurable:

-def intermetallic_score(composition: str | Composition) -> float:
+def intermetallic_score(
+    composition: str | Composition,
+    weights: dict[str, float] | None = None,
+) -> float:
     """Calculate a score indicating how likely a composition is to be an intermetallic compound.
     
     Args:
         composition: Chemical formula as string or pymatgen Composition
+        weights: Optional dictionary of weights for each factor.
+                Default weights are used if not provided.
     """
     comp = _ensure_composition(composition)
 
     # 1. Basic metrics
     metal_fraction = get_metal_fraction(comp)
     d_electron_fraction = get_d_electron_fraction(comp)
     n_metals = get_distinct_metal_count(comp)
 
     # 2. Electronic structure indicators
     try:
         vec = valence_electron_count(comp.reduced_formula)
         vec_factor = 1.0 - (abs(vec - 8.0) / 8.0)  # Normalized around VEC=8
     except ValueError:
         vec_factor = 0.5  # Default if we can't calculate VEC
 
     # 3. Bonding character
     pauling_mismatch = get_pauling_test_mismatch(comp)
 
     # 4. Calculate weighted score
-    # These weights can be tuned based on empirical testing
-    weights = {"metal_fraction": 0.3, "d_electron": 0.2, "n_metals": 0.2, "vec": 0.15, "pauling": 0.15}
+    default_weights = {
+        "metal_fraction": 0.3,
+        "d_electron": 0.2,
+        "n_metals": 0.2,
+        "vec": 0.15,
+        "pauling": 0.15
+    }
+    weights = weights or default_weights
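One caveat with `weights or default_weights`: a caller who passes only some of the keys would hit a `KeyError` when the score is computed. A possible variant, sketched here with a hypothetical `resolve_weights` helper, overlays the supplied values on the defaults and rejects unknown keys.

```python
DEFAULT_WEIGHTS = {
    "metal_fraction": 0.3,
    "d_electron": 0.2,
    "n_metals": 0.2,
    "vec": 0.15,
    "pauling": 0.15,
}


def resolve_weights(weights=None):
    """Hypothetical helper: overlay user-supplied weights on the defaults."""
    if weights is None:
        return dict(DEFAULT_WEIGHTS)

    unknown = set(weights) - set(DEFAULT_WEIGHTS)
    if unknown:
        raise ValueError(f"Unknown weight keys: {sorted(unknown)}")

    # Keys absent from `weights` keep their default values.
    return {**DEFAULT_WEIGHTS, **weights}


custom = resolve_weights({"vec": 0.3})
print(custom)
```

Whether overridden weights should be renormalised to sum to 1 is a separate design choice that this sketch leaves open.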
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1ff91b8 and 56246d3.

📒 Files selected for processing (3)
  • new_intermetallics_mod_usage_example.py (1 hunks)
  • smact/intermetallics.py (1 hunks)
  • smact/screening.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • new_intermetallics_mod_usage_example.py
🔇 Additional comments (5)
smact/intermetallics.py (4)

13-24: LGTM! Well-structured helper function.

The function is well-documented, properly typed, and follows the single responsibility principle.


47-57: LGTM! Clean implementation.

Good use of the helper function to avoid code duplication.


61-71: LGTM! Clean implementation.

Good use of the helper function to avoid code duplication.


74-84: LGTM! Efficient implementation.

The function effectively uses a generator expression for counting unique metals.

smact/screening.py (1)

441-442: LGTM! Well-implemented intermetallic validation.

The implementation follows the suggested approach of keeping intermetallic validation separate from the existing alloy validation. The new parameters are well-documented and the logic is clear.

Also applies to: 481-485

Comment on lines +28 to +43
def get_element_fraction(composition: str | Composition, element_set: set[str]) -> float:
    """Calculate the fraction of elements from a given set in a composition.
    This helper function is used to avoid code duplication in functions that
    calculate fractions of specific element types (e.g., metals, d-block elements).

    Args:
        composition: Chemical formula as string or pymatgen Composition
        element_set: Set of element symbols to check for

    Returns:
        float: Fraction of the composition that consists of elements from the set (0-1)
    """
    comp = _ensure_composition(composition)
    total_amt = sum(comp.values())
    target_amt = sum(amt for el, amt in comp.items() if el.symbol in element_set)
    return target_amt / total_amt

🛠️ Refactor suggestion

Add error handling for edge cases.

The function should handle empty compositions and potential division by zero errors.

Apply this diff to add error handling:

 def get_element_fraction(composition: str | Composition, element_set: set[str]) -> float:
     comp = _ensure_composition(composition)
     total_amt = sum(comp.values())
+    if not total_amt:
+        raise ValueError("Empty composition")
     target_amt = sum(amt for el, amt in comp.items() if el.symbol in element_set)
     return target_amt / total_amt
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 def get_element_fraction(composition: str | Composition, element_set: set[str]) -> float:
     """Calculate the fraction of elements from a given set in a composition.
     This helper function is used to avoid code duplication in functions that
     calculate fractions of specific element types (e.g., metals, d-block elements).

     Args:
         composition: Chemical formula as string or pymatgen Composition
         element_set: Set of element symbols to check for

     Returns:
         float: Fraction of the composition that consists of elements from the set (0-1)
     """
     comp = _ensure_composition(composition)
     total_amt = sum(comp.values())
+    if not total_amt:
+        raise ValueError("Empty composition")
     target_amt = sum(amt for el, amt in comp.items() if el.symbol in element_set)
     return target_amt / total_amt
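The empty-composition guard can be exercised without pymatgen. The following sketch assumes a hypothetical `parse_formula` (a simple regex parser for flat formulas such as `Fe2O3`, with no bracket support or symbol validation) in place of `_ensure_composition`, and a small illustrative metal set rather than `smact.metals`.

```python
import re

# Illustrative subset only; the real code uses smact.metals.
METALS = {"Fe", "Al", "Na", "Ni", "Ti"}


def parse_formula(formula):
    """Hypothetical parser for flat formulas such as 'Fe2O3' (no brackets)."""
    amounts = {}
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*\.?\d*)", formula):
        amounts[symbol] = amounts.get(symbol, 0.0) + (float(count) if count else 1.0)
    return amounts


def element_fraction(formula, element_set):
    """Fraction of atoms in `formula` whose symbol is in `element_set`."""
    amounts = parse_formula(formula)
    total = sum(amounts.values())
    if not total:
        raise ValueError("Empty composition")
    return sum(a for el, a in amounts.items() if el in element_set) / total


print(element_fraction("Fe3Al", METALS))
print(element_fraction("Fe2O3", METALS))

try:
    element_fraction("", METALS)
except ValueError as err:
    print(f"Rejected: {err}")
```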

validated the approximate values in the docs with the new intermetallic scoring function

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
docs/intermetallics_readme.md (1)

21-23: 🛠️ Refactor suggestion

Inconsistent API usage examples.

The example shows both string and Composition input methods, but based on previous feedback, the API should be simplified to accept string formulas directly.

Update the example to show the preferred usage pattern:

-print(get_metal_fraction("Fe3Al"))  # Works with string formula
-print(get_metal_fraction(Composition("Fe3Al")))  # Works with Composition
+print(get_metal_fraction("Fe3Al"))  # Returns 1.0
🧹 Nitpick comments (2)
docs/intermetallics_readme.md (2)

179-179: Improve error handling display.

Using string 'nan' for NaN values is not a consistent way to handle numerical formatting.

Consider using a more Pythonic approach:

-print(f"Pauling mismatch: {'nan' if math.isnan(pauling) else f'{pauling:.2f}'}")
+print(f"Pauling mismatch: {f'{pauling:.2f}' if not math.isnan(pauling) else 'N/A'}")
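In an f-string, everything after the colon in a replacement field is parsed as the format spec, so a conditional expression cannot be placed there; the condition has to wrap the formatted value instead. A small helper keeps the call site readable. The name `format_metric` is hypothetical and not part of the PR.

```python
import math


def format_metric(value, missing="N/A"):
    """Hypothetical helper: two-decimal string, or a placeholder for NaN."""
    return missing if math.isnan(value) else f"{value:.2f}"


print(f"Pauling mismatch: {format_metric(0.2234)}")
print(f"Pauling mismatch: {format_metric(float('nan'))}")
```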

307-316: Enhance reference formatting.

The references section could be improved for better readability and accessibility.

Consider using a consistent format for all references:

1. Original SMACT paper:
   - Title: "SMACT: Semiconducting Materials by Analogy and Chemical Theory"
   - Journal: Journal of Open Source Software
   - DOI: 10.21105/joss.01361

2. Intermetallics Theory:
   - Author: D.G. Pettifor
   - Title: "A chemical scale for crystal-structure maps"
   - Journal: Solid State Communications, 51 (1984) 31-34
   - DOI: 10.1016/0038-1098(84)90765-8

3. Miedema's Framework:
   - Author: A.R. Miedema
   - Title: "Cohesion in alloys - fundamentals of a semi-empirical model"
   - DOI: 10.1016/0378-4363(80)90054-6
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 56246d3 and f5ec89a.

📒 Files selected for processing (1)
  • docs/intermetallics_readme.md (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: test (3.12, windows-latest)
  • GitHub Check: test (3.12, macos-latest)
  • GitHub Check: test (3.12, ubuntu-latest)
  • GitHub Check: test (3.11, windows-latest)
  • GitHub Check: test (3.11, macos-latest)
  • GitHub Check: test (3.11, ubuntu-latest)
  • GitHub Check: test (3.10, windows-latest)
  • GitHub Check: test (3.10, macos-latest)
  • GitHub Check: test (3.10, ubuntu-latest)
🔇 Additional comments (2)
docs/intermetallics_readme.md (2)

92-98: Verify numerical values in examples.

The example scores appear to be inconsistent with actual implementation results.

Let's verify these values by running the actual implementation:

#!/bin/bash
# Search for test files containing these specific compounds and their scores
rg -A 5 "Fe3Al|Ni3Ti|NaCl|Fe2O3|SiO2.*score.*=.*0\.[0-9]+" --type py

133-133: Clarify threshold value selection.

The examples use 0.7 as the threshold, but there's no explanation of why this value was chosen or how it relates to the classification performance.

Add a note explaining:

  • How the 0.7 threshold was determined
  • What classification performance it achieves
  • When users might want to adjust it

Also applies to: 142-142


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (2)
docs/intermetallics_readme.md (2)

150-151: 🛠️ Refactor suggestion

Replace wildcard import with explicit imports.

Using wildcard imports (from smact.intermetallics import *) is not a recommended practice as it can lead to namespace pollution.

Apply this diff:

-from smact.intermetallics import *
+from smact.intermetallics import (
+    get_metal_fraction,
+    get_d_electron_fraction,
+    get_distinct_metal_count,
+    get_pauling_test_mismatch,
+    intermetallic_score
+)

242-242: 🛠️ Refactor suggestion

Replace wildcard import in the advanced usage example.

Another instance of wildcard import that should be replaced with explicit imports.

Apply this diff:

-from smact.intermetallics import *
+from smact.intermetallics import (
+    get_metal_fraction,
+    get_d_electron_fraction,
+    get_distinct_metal_count,
+    get_pauling_test_mismatch,
+    intermetallic_score
+)
🧹 Nitpick comments (2)
docs/intermetallics_readme.md (2)

309-311: Add inline citation for the Pettifor number.

While the reference is provided, it would be helpful to add an inline citation to the specific page or section where the Pettifor number concept is introduced.

Apply this diff:

-   - D.G. Pettifor introduced the concept of a single "chemical scale" or "structure map" coordinate (Pettifor number) to systematically separate compound classes. The new intermetallicscore is a step in that direction but customized to SMACT's internal data structures.
+   - D.G. Pettifor introduced the concept of a single "chemical scale" or "structure map" coordinate (Pettifor number) to systematically separate compound classes [1, p. 31]. The new intermetallic_score is a step in that direction but customized to SMACT's internal data structures.

313-314: Fix grammatical issues in the Miedema reference.

The sentence about Miedema's framework needs grammatical corrections.

Apply this diff:

-   - Also, The role of charge transfer and atomic size mismatch is pivotal in stabilizing intermetallic phases. Miedema's framework quantifies these effects, making it useful for predicting alloying behaviors and crystal structure, the parameters coded here, while conceptually similar have not implemented Miedema directly.
+   - The role of charge transfer and atomic size mismatch is pivotal in stabilizing intermetallic phases. Miedema's framework quantifies these effects, making it useful for predicting alloying behaviors and crystal structure. The parameters coded here are conceptually similar but do not implement Miedema's model directly.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~313-~313: A comma might be missing here.
Context: ...rameters coded here, while conceptually similar have not implemented Miedema directly. ...

(AI_EN_LECTOR_MISSING_PUNCTUATION_COMMA)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f5ec89a and 92aeb6c.

📒 Files selected for processing (1)
  • docs/intermetallics_readme.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/intermetallics_readme.md

[uncategorized] ~82-~82: A determiner appears to be missing. Consider inserting it.
Context: ...# intermetallic_score(composition) - Main scoring function combining multiple met...

(AI_EN_LECTOR_MISSING_DETERMINER)


[uncategorized] ~260-~260: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...

(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)


[uncategorized] ~269-~269: A determiner appears to be missing. Consider inserting it.
Context: ...tures 3. Threshold Selection - Default threshold (0.7) may need adjustment for...

(AI_EN_LECTOR_MISSING_DETERMINER)


[uncategorized] ~313-~313: A comma might be missing here.
Context: ...rameters coded here, while conceptually similar have not implemented Miedema directly. ...

(AI_EN_LECTOR_MISSING_PUNCTUATION_COMMA)

🔇 Additional comments (3)
docs/intermetallics_readme.md (3)

91-97: Verify the intermetallic scores in the examples.

The example shows specific scores (e.g., ~0.83 for Fe3Al, ~0.87 for Ni3Ti) that need verification based on previous feedback about numerical discrepancies.

Run this script to verify the actual outputs:

✅ Verification successful

The intermetallic scores in the examples are accurate and consistent with the implementation.

The scores align with the implementation's scoring heuristics and test cases. The codebase confirms:

  • Fe3Al (~0.83) and Ni3Ti (~0.87) are classified as known intermetallics with expected high scores
  • NaCl (0.63), Fe2O3 (0.64), and SiO2 (0.25) are correctly classified as non-intermetallics with appropriately lower scores
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Create a minimal test script
echo 'from smact.intermetallics import intermetallic_score
compounds = ["Fe3Al", "Ni3Ti", "NaCl", "Fe2O3", "SiO2"]
for compound in compounds:
    print(f"{compound}: {intermetallic_score(compound)}")' > test_scores.py

# Execute the test script
python3 test_scores.py

Length of output: 325


Script:

#!/bin/bash
# Find the implementation of intermetallic_score
ast-grep --pattern 'def intermetallic_score($$$):'

# Search for test files containing these compounds
rg -l "Fe3Al|Ni3Ti|NaCl|Fe2O3|SiO2" --type py

Length of output: 312


Script:

#!/bin/bash
# Check the implementation
echo "=== Implementation in intermetallics.py ==="
rg -A 20 "def intermetallic_score" smact/intermetallics.py

echo -e "\n=== Test cases in test_intermetallics.py ==="
rg -A 5 -B 5 "Fe3Al|Ni3Ti|NaCl|Fe2O3|SiO2" smact/tests/test_intermetallics.py

Length of output: 5039


269-271: Clarify the default threshold value.

The documentation mentions a default threshold of 0.7, but there's inconsistency in the codebase regarding this value. Please verify and document:

  1. The current default value
  2. The rationale for this specific value
  3. Guidelines for when to adjust it

Run this script to verify the default threshold:

✅ Verification successful

The default threshold value of 0.7 is correctly documented and consistently implemented.

The threshold value:

  • Is defined as 0.7 in the source code (smact/screening.py)
  • Is validated by test cases expecting scores above 0.7 for known intermetallics
  • Aligns with the statistical distribution of ion pair similarities in the codebase

Guidelines for adjustment:

  • Consider lowering for more inclusive screening of potential intermetallics
  • Consider raising for stricter filtering in high-precision applications
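One way to examine the trade-off in these guidelines is to sweep the threshold over a small labelled set. The sketch below uses the approximate scores quoted earlier in this review for five compounds, with labels following the review's own classification; `accuracy_at` is a hypothetical helper, and the `>=` comparison is an assumption about how the threshold is applied. Five compounds are far too few to justify a default, so this only illustrates the procedure.

```python
# (approximate score, is_intermetallic) pairs as quoted in this review.
LABELLED = {
    "Fe3Al": (0.83, True),
    "Ni3Ti": (0.87, True),
    "NaCl": (0.63, False),
    "Fe2O3": (0.64, False),
    "SiO2": (0.25, False),
}


def accuracy_at(threshold):
    """Hypothetical helper: fraction of compounds classified correctly."""
    # Assumption: a score at or above the threshold counts as intermetallic.
    correct = sum(
        (score >= threshold) == label for score, label in LABELLED.values()
    )
    return correct / len(LABELLED)


for threshold in (0.5, 0.6, 0.7, 0.8, 0.9):
    print(f"threshold={threshold:.1f}  accuracy={accuracy_at(threshold):.2f}")
```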
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for the default threshold value in the codebase
ast-grep --pattern 'intermetallic_threshold = $_'
rg -i "default.*threshold.*=.*0\.[0-9]+"

Length of output: 94


Script:

#!/bin/bash
# Search with broader patterns
rg "0\.7" --type py --type yaml --type json -A 2 -B 2
rg "threshold" --type py --type yaml --type json -A 2 -B 2

# Look for configuration files
fd -e yaml -e json -e toml -e ini -e cfg --exec cat {}

# Search for threshold-related constants
ast-grep --pattern 'THRESHOLD = $_'
ast-grep --pattern 'threshold = $_'

Length of output: 71635

🧰 Tools
🪛 LanguageTool

[uncategorized] ~269-~269: A determiner appears to be missing. Consider inserting it.
Context: ...tures 3. Threshold Selection - Default threshold (0.7) may need adjustment for...

(AI_EN_LECTOR_MISSING_DETERMINER)


21-25: Verify the numerical output in the example.

The example shows get_metal_fraction("Fe2O3") returning 0.4, but this value needs verification as previous reviewers reported discrepancies in the numerical outputs.

Run this script to verify the actual output:

✅ Verification successful

Documentation example is accurate

The example showing get_metal_fraction("Fe2O3") returning 0.4 is mathematically correct. The test suite confirms that Fe2O3 should return a fractional value between 0 and 1, and the calculation (2 metal atoms / 5 total atoms = 0.4) is accurate.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Create a minimal test script
echo 'from smact.intermetallics import get_metal_fraction
print(get_metal_fraction("Fe2O3"))' > test_metal_fraction.py

# Execute the test script
python3 test_metal_fraction.py

Length of output: 226


Script:

#!/bin/bash
# Find the implementation of get_metal_fraction
ast-grep --pattern 'def get_metal_fraction($_):
    $$$'

# Also search with ripgrep for more context
rg "def get_metal_fraction" -A 10

Length of output: 813


Script:

#!/bin/bash
# Find get_element_fraction implementation
ast-grep --pattern 'def get_element_fraction($_):
    $$$'

# Look for metals set definition
rg "metals = " -A 5

# Also check if there's any direct usage of Fe2O3 in tests
rg "Fe2O3" --type py

Length of output: 1647


Script:

#!/bin/bash
# Get the full test case for Fe2O3
rg "Fe2O3" smact/tests/test_intermetallics.py -B 2 -A 2

# Get the get_element_fraction implementation
ast-grep --pattern 'def get_element_fraction($params):
    $$$'

Length of output: 522


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (4)
docs/intermetallics_readme.md (4)

132-133: Document the default threshold value.

The example shows a threshold of 0.7, but it would be helpful to explicitly mention this is the default value in the documentation.


244-252: Show both string and Composition usage in advanced example.

Consider showing both string and Composition usage to demonstrate the API's flexibility:

 # Detailed analysis of a compound
-comp = Composition("Fe3Al")
+# Using string formula
+metrics_str = {
+    "metal_fraction": get_metal_fraction("Fe3Al"),
+    "d_electron_fraction": get_d_electron_fraction("Fe3Al"),
+    "distinct_metals": get_distinct_metal_count("Fe3Al"),
+    "pauling_mismatch": get_pauling_test_mismatch("Fe3Al"),
+    "overall_score": intermetallic_score("Fe3Al"),
+}
+
+# Using Composition object
+comp = Composition("Fe3Al")
 metrics = {
     "metal_fraction": get_metal_fraction(comp),
     "d_electron_fraction": get_d_electron_fraction(comp),

260-260: Add missing period.

Add a period at the end of the sentence.

-   - Falls back to default behavior in these cases
+   - Falls back to default behavior in these cases.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~260-~260: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...

(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)


257-261: Add version compatibility information.

Consider adding information about minimum required versions of dependencies (pymatgen, etc.) in the limitations section.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~260-~260: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...

(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 92aeb6c and db78427.

📒 Files selected for processing (1)
  • docs/intermetallics_readme.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/intermetallics_readme.md

[uncategorized] ~82-~82: A determiner appears to be missing. Consider inserting it.
Context: ...# intermetallic_score(composition) - Main scoring function combining multiple met...

(AI_EN_LECTOR_MISSING_DETERMINER)


[uncategorized] ~260-~260: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...

(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)


[uncategorized] ~269-~269: A determiner appears to be missing. Consider inserting it.
Context: ...tures 3. Threshold Selection - Default threshold (0.7) may need adjustment for...

(AI_EN_LECTOR_MISSING_DETERMINER)

🔇 Additional comments (2)
docs/intermetallics_readme.md (2)

150-151: Replace wildcard import with explicit imports.

Using wildcard imports is not recommended as it can lead to namespace pollution and make it unclear which functions are being used.

-from smact.intermetallics import *
+from smact.intermetallics import (
+    get_metal_fraction,
+    get_d_electron_fraction,
+    get_distinct_metal_count,
+    get_pauling_test_mismatch,
+    intermetallic_score
+)

242-242: Replace wildcard import in advanced usage example.

-from smact.intermetallics import *
+from smact.intermetallics import (
+    get_metal_fraction,
+    get_d_electron_fraction,
+    get_distinct_metal_count,
+    get_pauling_test_mismatch,
+    intermetallic_score
+)

test_edge_cases, test_get_pauling_test_mismatch, test_intermetallic_score

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
docs/intermetallics_readme.md (2)

150-151: 🛠️ Refactor suggestion

Replace wildcard import with explicit imports.

Using wildcard imports can lead to namespace pollution and make it unclear which functions are being used.

-from smact.intermetallics import *
+from smact.intermetallics import (
+    get_metal_fraction,
+    get_d_electron_fraction,
+    get_distinct_metal_count,
+    get_pauling_test_mismatch,
+    intermetallic_score
+)

242-242: 🛠️ Refactor suggestion

Replace wildcard import in advanced usage example.

Another instance of wildcard import that should be replaced with explicit imports.

-from smact.intermetallics import *
+from smact.intermetallics import (
+    get_metal_fraction,
+    get_d_electron_fraction,
+    get_distinct_metal_count,
+    get_pauling_test_mismatch,
+    intermetallic_score
+)
🧹 Nitpick comments (4)
docs/intermetallics_readme.md (1)

22-25: Clarify example outputs and input types.

The examples should:

  1. Clarify that numerical outputs are approximate values that may vary.
  2. Show both string and Composition object usage examples since both are supported.
-print(get_metal_fraction("Fe3Al"))  # Works with string formula (& Composition object)
+# Using string input
+print(get_metal_fraction("Fe3Al"))  # Returns 1.0
+
+# Using Composition object
+from pymatgen.core import Composition
+print(get_metal_fraction(Composition("Fe3Al")))  # Returns 1.0

-print(get_metal_fraction("Fe2O3"))  # 0.4
+print(get_metal_fraction("Fe2O3"))  # Returns ~0.4 (exact value may vary)
smact/tests/test_intermetallics.py (3)

42-67: Consider adding negative test case

The test coverage is thorough for valid cases. Consider adding a test case with an invalid composition to verify error handling.


142-159: Define score thresholds as class constants

Consider defining the score thresholds (0.7 and 0.5) as class constants to improve maintainability and make it easier to adjust these values if needed.

Example:

class TestIntermetallics(unittest.TestCase):
    HIGH_SCORE_THRESHOLD = 0.7
    LOW_SCORE_THRESHOLD = 0.5

1-2: Add docstring coverage verification

Consider adding a test to verify that all public functions in the intermetallics module have proper docstrings.

Example:

def test_docstring_coverage(self):
    """Verify all public functions have docstrings."""
    for func in [get_metal_fraction, get_d_electron_fraction,
                 get_distinct_metal_count, get_pauling_test_mismatch,
                 intermetallic_score]:
        self.assertIsNotNone(func.__doc__, 
            f"{func.__name__} missing docstring")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between db78427 and 4727c9a.

📒 Files selected for processing (3)
  • docs/intermetallics_readme.md (1 hunks)
  • smact/intermetallics.py (1 hunks)
  • smact/tests/test_intermetallics.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • smact/intermetallics.py

⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: test (3.12, windows-latest)
  • GitHub Check: test (3.12, macos-latest)
  • GitHub Check: test (3.12, ubuntu-latest)
  • GitHub Check: test (3.11, windows-latest)
  • GitHub Check: test (3.11, macos-latest)
  • GitHub Check: test (3.11, ubuntu-latest)
  • GitHub Check: test (3.10, windows-latest)
  • GitHub Check: test (3.10, macos-latest)
  • GitHub Check: test (3.10, ubuntu-latest)
🔇 Additional comments (6)
docs/intermetallics_readme.md (1)

110-143: Well-structured examples demonstrating all validation paths!

The examples effectively showcase different validation methods with clear comments and consistent formatting.

smact/tests/test_intermetallics.py (5)

5-8: Standardise the testing framework

The test suite is mixing unittest and pytest frameworks. Since the class inherits from unittest.TestCase, we should maintain consistency by using unittest throughout.


24-40: Well-structured test data setup

Excellent selection of test cases covering various types of intermetallics and non-intermetallics. The setup provides a robust foundation for comprehensive testing.


68-92: Excellent boundary testing

Comprehensive test coverage with clear boundary cases (0%, 100%, and intermediate metal fractions) and descriptive error messages.


167-172: 🛠️ Refactor suggestion

Use unittest assertions consistently

Replace pytest.raises with unittest's assertRaises to maintain consistency with the chosen testing framework.

Apply this diff:

-        with pytest.raises(ValueError, match="Empty composition"):
+        with self.assertRaises(ValueError) as cm:
             intermetallic_score("")
+        self.assertEqual(str(cm.exception), "Empty composition")

-        with pytest.raises(ValueError, match="Invalid formula"):
+        with self.assertRaises(ValueError) as cm:
             intermetallic_score("NotAnElement")
+        self.assertEqual(str(cm.exception), "Invalid formula")

Likely invalid or redundant comment.
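One caveat with the diff above: `pytest.raises(..., match=...)` performs a regular-expression search on the message, whereas `assertEqual(str(cm.exception), ...)` demands an exact match, so the two are not equivalent if the message carries extra detail. `assertRaisesRegex` is the closer unittest counterpart. A minimal sketch, using a hypothetical `score_stub` in place of `intermetallic_score` (the stub and its messages are assumptions for illustration only):

```python
import unittest

KNOWN_FORMULAS = {"Fe3Al", "NaCl"}  # hypothetical whitelist for the stub


def score_stub(formula: str) -> float:
    """Hypothetical stand-in for intermetallic_score; mimics only its error paths."""
    if not formula:
        raise ValueError("Empty composition")
    if formula not in KNOWN_FORMULAS:
        raise ValueError(f"Invalid formula '{formula}'")
    return 0.0


class TestErrorPaths(unittest.TestCase):
    def test_empty(self):
        # assertRaisesRegex searches the message, like pytest.raises(match=...)
        with self.assertRaisesRegex(ValueError, "Empty composition"):
            score_stub("")

    def test_invalid(self):
        # Passes even though the message has extra text after the prefix
        with self.assertRaisesRegex(ValueError, "Invalid formula"):
            score_stub("NotAnElement")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestErrorPaths)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```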


129-141: 🛠️ Refactor suggestion

Update Pauling test comparison

The comparison should use absolute values to properly measure the deviation from zero, as per previous feedback. Additionally, consider adding edge cases with known electronegativity values.

Apply this diff:

-            fe3al_mismatch < nacl_mismatch,
+            abs(fe3al_mismatch) < abs(nacl_mismatch),

Likely invalid or redundant comment.
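To see why the absolute-value comparison matters, the sketch below plugs in the approximate mismatch values quoted in the README excerpt (0.22 for Fe3Al, -1.23 for NaCl). These numbers are copied from the docs for illustration and are not computed here.

```python
# Approximate mismatch values quoted in the README examples (illustrative only).
fe3al_mismatch = 0.22
nacl_mismatch = -1.23

# Signed comparison: the negative NaCl value sorts below the Fe3Al value,
# so "Fe3Al has the lower mismatch" evaluates to False.
signed_ok = fe3al_mismatch < nacl_mismatch

# Magnitude comparison: measures deviation from zero, as the review suggests.
abs_ok = abs(fe3al_mismatch) < abs(nacl_mismatch)

print(signed_ok, abs_ok)  # False True
```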

Comment on lines +74 to +77
print(get_pauling_test_mismatch("Fe3Al")) # 0.22

# Ionic compound - high mismatch
print(get_pauling_test_mismatch("NaCl")) # -1.23

🛠️ Refactor suggestion

Add disclaimers for numerical values.

The numerical values shown in the examples should be clearly marked as approximate values that may vary based on the implementation. Additionally, emphasise the relative relationships rather than specific values.

-print(get_pauling_test_mismatch("Fe3Al"))  # 0.22
-print(get_pauling_test_mismatch("NaCl"))  # -1.23
+print(get_pauling_test_mismatch("Fe3Al"))  # Returns ~0.22 (intermetallics have lower mismatch)
+print(get_pauling_test_mismatch("NaCl"))  # Returns ~-1.23 (ionic compounds have higher mismatch)

-print(intermetallic_score("Fe3Al"))  # ~0.83
-print(intermetallic_score("Ni3Ti"))  # ~0.87
+print(intermetallic_score("Fe3Al"))  # Returns ~0.83 (high score indicates strong intermetallic character)
+print(intermetallic_score("Ni3Ti"))  # Returns ~0.87 (high score indicates strong intermetallic character)

-print(intermetallic_score("NaCl"))  # 0.63
-print(intermetallic_score("Fe2O3"))  # 0.64
-print(intermetallic_score("SiO2"))  # 0.25
+print(intermetallic_score("NaCl"))  # Returns ~0.63 (ionic compounds have lower scores)
+print(intermetallic_score("Fe2O3"))  # Returns ~0.64 (metal oxides have lower scores)
+print(intermetallic_score("SiO2"))  # Returns ~0.25 (non-metallic compounds have lowest scores)

Also applies to: 91-97

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
smact/intermetallics.py (3)

13-35: Enhance error messages for better debugging

The error messages could be more specific to help users identify the exact issue with their input.

Apply this diff to improve error messages:

-            raise ValueError("Invalid formula") from exc
+            raise ValueError(f"Invalid formula '{composition}': {str(exc)}") from exc
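A small sketch of what the richer message buys in practice. The `parse_stub` helper below is hypothetical and simply wraps `float()`; it is not SMACT's parser. The point is that the offending input appears in the message while `from exc` keeps the original error reachable on `__cause__`.

```python
def parse_stub(text: str) -> float:
    """Hypothetical helper: re-raise a low-level error with the offending input attached."""
    try:
        return float(text)
    except ValueError as exc:
        # Include the input and the underlying reason; chain the original exception.
        raise ValueError(f"Invalid formula '{text}': {exc}") from exc


try:
    parse_stub("Fe3Al?")
except ValueError as err:
    message = str(err)     # starts with "Invalid formula 'Fe3Al?': ..."
    cause = err.__cause__  # the original ValueError raised by float()
```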

56-70: Enhance function documentation

Consider adding more detailed docstrings including:

  • Return value ranges
  • Example usage
  • Edge cases

Example enhancement for get_metal_fraction:

def get_metal_fraction(composition: str | Composition) -> float:
    """Calculate the fraction of metallic elements in a composition.

    Args:
        composition: Chemical formula as string or pymatgen Composition

    Returns:
        float: Fraction between 0 and 1, where 1 indicates all elements are metals

    Example:
        >>> get_metal_fraction("Fe2O3")  # 2/(2+3), as only Fe is metallic
        0.4
    """

107-157: Make scoring parameters configurable

Consider extracting magic numbers into configurable parameters to allow for different scoring strategies:

  • The scale factor of 3.0 for pauling mismatch
  • The target VEC value of 8.0
  • The weights dictionary
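For the weights dictionary in particular, one possible approach is to merge caller-supplied overrides into a set of defaults and normalise by the total weight. The sketch below is an assumption about how this could look: the component names and default values are placeholders, not the ones used in smact/intermetallics.py.

```python
from typing import Dict, Optional

# Placeholder defaults for illustration; NOT SMACT's actual weights.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "metal_fraction": 0.5,
    "d_electron_fraction": 0.25,
    "pauling_mismatch": 0.25,
}


def weighted_score(
    components: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Combine per-component scores (each in 0-1) into a weighted average."""
    merged = {**DEFAULT_WEIGHTS, **(weights or {})}
    missing = set(merged) - set(components)
    if missing:
        raise KeyError(f"No component score for: {sorted(missing)}")
    total = sum(merged.values())
    if total <= 0:
        raise ValueError("Weights must sum to a positive value")
    return sum(merged[k] * components[k] for k in merged) / total


components = {"metal_fraction": 1.0, "d_electron_fraction": 0.5, "pauling_mismatch": 0.0}
default_score = weighted_score(components)  # 0.625 with the placeholder defaults
custom_score = weighted_score(components, weights={"pauling_mismatch": 0.0})
```

Normalising by the total keeps the result in the 0-1 range even when a caller zeroes out or rescales individual weights.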

Example implementation:

def intermetallic_score(
    composition: str | Composition,
    *,
    pauling_scale: float = 3.0,
    target_vec: float = 8.0,
    weights: dict[str, float] | None = None,
) -> float:
    """Calculate a score (0-1) indicating how intermetallic a composition is.

    Args:
        composition: Chemical formula or pymatgen Composition
        pauling_scale: Scale factor for pauling mismatch penalty (default: 3.0)
        target_vec: Target valence electron count (default: 8.0)
        weights: Custom weights for each component (default: None, uses standard weights)
    """
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4727c9a and 96b4f30.

📒 Files selected for processing (1)
  • smact/intermetallics.py (1 hunks)
🔇 Additional comments (3)
smact/intermetallics.py (3)

1-11: Well-structured module setup!

The imports are properly organised, and the module's purpose is clearly documented.


38-54: Add error handling for edge cases

The function should handle empty compositions and potential division by zero errors.
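The function at lines 38-54 is not shown in this thread, so the following is only a generic sketch of the guard being asked for. `fraction_in` is a hypothetical helper operating on a plain element-to-amount mapping rather than a pymatgen Composition.

```python
from typing import Dict, Set


def fraction_in(amounts: Dict[str, float], selected: Set[str]) -> float:
    """Hypothetical helper: share of the total amount contributed by `selected` elements."""
    total = sum(amounts.values())
    if total <= 0:
        # Covers an empty mapping and all-zero amounts, avoiding ZeroDivisionError.
        raise ValueError("Empty composition")
    return sum(v for el, v in amounts.items() if el in selected) / total


print(fraction_in({"Fe": 2, "O": 3}, {"Fe"}))  # 0.4
```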


97-98: Ensure compatibility with Python versions earlier than 3.10

The strict parameter of the zip function is only available in Python 3.10 and above.
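Note that the CI checks listed in this thread cover Python 3.10 to 3.12, so `strict=` may already be safe for the supported versions. If older interpreters did need support, a fallback could look like the sketch below; `zip_strict` is a hypothetical helper name, not something in SMACT.

```python
import sys
from typing import Iterable, Iterator, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def zip_strict(xs: Iterable[A], ys: Iterable[B]) -> Iterator[Tuple[A, B]]:
    """Pairwise zip that raises ValueError when the inputs differ in length."""
    if sys.version_info >= (3, 10):
        # Built-in check; the error surfaces lazily, while iterating.
        return zip(xs, ys, strict=True)
    # Fallback for older versions: materialise both inputs and check eagerly.
    xs_list, ys_list = list(xs), list(ys)
    if len(xs_list) != len(ys_list):
        raise ValueError("zip_strict() arguments have different lengths")
    return zip(xs_list, ys_list)


pairs = list(zip_strict(["Fe", "Al"], [3, 1]))
```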


codecov bot commented Jan 17, 2025

Codecov Report

Attention: Patch coverage is 93.45794% with 7 lines in your changes missing coverage. Please review.

Project coverage is 76.17%. Comparing base (a92e123) to head (b6dd89a).

Files with missing lines   Patch %   Lines
smact/intermetallics.py    93.10%    4 Missing ⚠️
smact/screening.py         40.00%    3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #367      +/-   ##
==========================================
+ Coverage   75.47%   76.17%   +0.70%     
==========================================
  Files          31       33       +2     
  Lines        2642     2749     +107     
==========================================
+ Hits         1994     2094     +100     
- Misses        648      655       +7     
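The headline percentages can be re-derived from the hit and line counts in the diff above. This quick check treats the change in totals as the patch (an assumption that holds when lines are only added), which reproduces the reported figures.

```python
# Figures copied from the Codecov diff above.
base_hits, base_lines = 1994, 2642
head_hits, head_lines = 2094, 2749

patch_lines = head_lines - base_lines  # +107 lines
patch_hits = head_hits - base_hits     # +100 hits, so 7 lines are missed

patch_cov = 100 * patch_hits / patch_lines
base_cov = 100 * base_hits / base_lines
head_cov = 100 * head_hits / head_lines

print(f"patch {patch_cov:.5f}%  base {base_cov:.2f}%  head {head_cov:.2f}%")
# patch 93.45794%  base 75.47%  head 76.17%
```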


@ryannduma ryannduma dismissed AntObi’s stale review January 17, 2025 19:31

fix has been made, dismissing to request fresh review

@ryannduma ryannduma requested a review from AntObi January 17, 2025 19:32
@ryannduma
Collaborator Author

Hey Anthony, thanks for all the reviews and requested changes, which are hopefully resolved by the changes made today. Tests are passing, which is great; the only check not working is pre-commit, because there's a new Pyright version out for the pre-commit hook. Other than that, I hope this is a step in the right direction for this chemical filter, and I look forward to future discussions and contributions towards its enhancement. Have a lovely weekend and a great start at your new job next week! All the very best <3

@AntObi AntObi changed the base branch from master to develop January 19, 2025 15:47
@ryannduma
Collaborator Author

Hey Anthony, the latest commit just moves the Jupyter notebook example into the examples docs folder; there are no code changes. The tests are failing because of something to do with mp_api being none, which I think is a separate issue you're working on, so don't worry.

Labels
enhancement feature python Pull requests that update Python code