Updated oxidation.py and its tests #366

KingaMas · 2025-01-14T18:15:50Z

Pull Request Template

Description

Updated get_species_occurences_df by adding a consensus_threshold. Helps to avoid non-physical species. Default is 3.

Fixes # (issue)

Type of change

Please delete options that are not relevant.

[ x ] New feature (non-breaking change which adds functionality)

Reviewers

@AntObi

Checklist

[ x] My code follows the style guidelines of this project
[ x] I have performed a self-review of my own code
[ x] I have commented my code, particularly in hard-to-understand areas
[ x] I have made corresponding changes to the documentation
[ x] My changes generate no new warnings
[ x] I have added tests that prove my fix is effective or that my feature works
[ x] New and existing unit tests pass locally with my changes
[ x] Any dependent changes have been merged and published in downstream modules
[x ] I have checked my code and corrected any misspellings

Summary by CodeRabbit

Release Notes

Chores
- Updated pre-commit hook configuration to run codespell during pre-commit phase.
- Improved code formatting in composition generation function.
New Features
- Enhanced species occurrence filtering with a new consensus threshold parameter.
- Added element column to species occurrences DataFrame.
- Introduced a new commonality parameter for filtering species based on occurrence proportions.
- Added two new text files containing comprehensive lists of oxidation states for various chemical elements.

coderabbitai · 2025-01-14T18:16:01Z

Walkthrough

This pull request encompasses changes across several files, primarily focusing on the configuration of pre-commit hooks and modifications to the ICSD24OxStatesFilter class. The .pre-commit-config.yaml file is updated to run the codespell hook during the pre-commit phase. The smact/utils/oxidation.py file sees significant changes to method signatures and filtering logic, introducing new parameters for species filtering. Additionally, new text files are added to provide oxidation state data, and minor formatting improvements are made in the generate_composition_with_smact.py file.

Changes

File	Change Summary
`.pre-commit-config.yaml`	Modified `codespell` hook stages from `[commit, commit-msg]` to `[pre-commit, commit-msg]`
`smact/utils/oxidation.py`	- Updated `filter`, `get_species_list`, `get_species_occurrences_df`, and `write` methods to replace `threshold` with `consensus` and add `commonality` parameter. - Adjusted filtering logic and return DataFrame structure.
`smact/utils/crystal_space/generate_composition_with_smact.py`	Reformatted `multiprocessing.Pool` instantiation and `partial` function call for improved readability
`smact/tests/files/oxidation_states_icsd24_consensus.txt`	New file added with a list of oxidation states filtered for low commonality.
`smact/tests/files/oxidation_states_icsd24_consensus_w_0.txt`	New file added with a list of oxidation states including zero values.
`smact/tests/test_utils.py`	Updated test methods to reflect changes in method signatures and file paths for oxidation states.

Possibly related PRs

Add oxidation states option to SMACT validity function #282: The changes in this PR introduce a new feature allowing users to select specific oxidation states in the smact_validity function, which relates to the changes in the main PR that modify how oxidation states are handled in the codebase.
Develop branch updates in preparation for new release #308: This PR includes significant modifications to the smact_validity function, particularly in how oxidation states are selected, which aligns with the changes in the main PR regarding the handling of oxidation states.
Structure prediction fixes #336: This PR updates the default oxidation states used in various functions, which is directly relevant to the changes made in the main PR regarding the handling of oxidation states.

Suggested labels

tests, feature, bug

Poem

🐰 Codespell hops through lines of code,
Pre-commit hooks now take a new road.
Oxidation states refined with care,
Formatting dance beyond compare!
A rabbit's leap of coding flair! 🔍

✨ Finishing Touches

📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai or @coderabbitai title anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

smact/utils/oxidation.py (1)
100-101: Add validation for consensus_threshold parameter.

Consider adding validation to ensure the threshold is non-negative, as negative values wouldn't make sense in this context.
 species_occurrences_df = self.ox_states_df[
+    (consensus_threshold >= 0)
+    and
     (self.ox_states_df.results_count >= consensus_threshold)
 ].reset_index(drop=True)
get_species_occurences_df2.ipynb (1)
17-19: Add missing parameter documentation.

The docstring is missing documentation for the threshold parameter.
     Args:
+        threshold (int): Minimum number of occurrences required for inclusion. Default is 3.
         include_one_oxidation_state (bool): Include oxidation states +1 and -1 in the species or include as + and - signs. Default is False.
         sort_by_occurrences (bool): Sort the species list by occurrences. Default is True.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a92e123 and 78c3049.

📒 Files selected for processing (3)

.pre-commit-config.yaml (1 hunks)
get_species_occurences_df2.ipynb (1 hunks)
smact/utils/oxidation.py (2 hunks)

🧰 Additional context used

🪛 GitHub Actions: SMACT CI

get_species_occurences_df2.ipynb

[error] 2-7: Notebook contains execution count metadata that should be stripped. The nbstripout pre-commit hook failed and modified the file to remove execution counts.

🔇 Additional comments (3)

smact/utils/oxidation.py (2)

85-87: LGTM! Parameter addition looks good.

The new consensus_threshold parameter is well-documented and its default value of 3 seems reasonable for filtering out spurious data.

Also applies to: 94-94

Line range hint 85-101: Add tests for the new consensus_threshold parameter.

Please add unit tests to verify:

Default threshold behaviour

Custom threshold filtering

Edge cases (0, very large values)

Invalid inputs (negative values)

Would you like me to help generate comprehensive test cases for the new parameter?

.pre-commit-config.yaml (1)

55-55: LGTM! Good improvement to the pre-commit workflow.

Moving the codespell check to the pre-commit stage will help catch spelling issues earlier in the commit process.

get_species_occurences_df2.ipynb

AntObi

Tests need to be updated to reflect the changes.
See my additional review comments

smact/utils/oxidation.py

AntObi · 2025-01-15T12:52:32Z

smact/utils/oxidation.py

-        species_occurrences_df = species_occurrences_df[["species", "results_count"]]
+        species_occurrences_df = species_occurrences_df[["element", "species", "results_count"]]


Happy to accept the change, but can this change also be documented in the Pull Request text. Would also be helpful to know why the addition of the element column is needed.

Additionally, the tests need to be updated to reflect the changes made to the code to ensure that they pass.

SMACT/smact/tests/test_utils.py

Lines 211 to 220 in a92e123

def test_oxidation_states_filter_species_occurrences(self):

species_occurrences_df = self.ox_filter.get_species_occurrences_df()

self.assertIsInstance(species_occurrences_df, pd.DataFrame)

self.assertEqual(

species_occurrences_df.columns.tolist(),

["species", "results_count"],

)

self.assertEqual(species_occurrences_df.shape, (490, 2))

self.assertEqual(species_occurrences_df.iloc[0]["species"], "O2-")

self.assertEqual(species_occurrences_df.iloc[0]["results_count"], 116910)

I'd say that consensus_threshold should be relabeled to threshold since there are functions which also take a threshold argument in this Python file. So it's better to have consistency. On the point of consistency, all thresholds should either be >= or >, rather than having a mix of both.

for more information, see https://pre-commit.ci

coderabbitai

Actionable comments posted: 2

🔭 Outside diff range comments (1)

smact/utils/oxidation.py (1)

Line range hint 130-154: Reduce code duplication by reusing filter method logic.

The filtering logic and species proportion calculation are duplicated from the filter method. Consider reusing the existing implementation.

-if not include_zero:
-    species_occurrences_df = self.ox_states_df[
-        self.ox_states_df["oxidation_state"] != 0
-    ].reset_index(drop=True)
-else:
-    species_occurrences_df = self.ox_states_df
-species_occurrences_df = species_occurrences_df[
-    (species_occurrences_df.results_count >= consensus)
-].reset_index(drop=True)
+# Reuse filter method with commonality="low" to maintain all proportions
+species_occurrences_df = self.ox_states_df[
+    self.ox_states_df["results_count"].isin(
+        self.filter(consensus=consensus, include_zero=include_zero, commonality="low")["results_count"]
+    )
+].reset_index(drop=True)

🧹 Nitpick comments (1)

smact/utils/oxidation.py (1)

196-198: Enhance metadata in output file.

Consider adding more detailed metadata about the filtering parameters used.

 f.write(
-    f"#\n# Oxidation state set\n# Source: ICSD (2024), filtered for {commonality} commonality of reports\n#\n"
+    f"#\n# Oxidation state set\n"
+    f"# Source: ICSD (2024)\n"
+    f"# Filtering parameters:\n"
+    f"#   - Minimum consensus: {consensus} reports\n"
+    f"#   - Commonality threshold: {commonality} "
+    f"({commonality_map[commonality]}% minimum proportion)\n#\n"
 )

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ced5460 and 88f4c08.

📒 Files selected for processing (1)

smact/utils/oxidation.py (5 hunks)

🔇 Additional comments (3)

smact/utils/oxidation.py (3)

24-26: LGTM! Clean data loading implementation.

The DataFrame loading implementation is correct and follows pandas best practices.

79-83: LGTM! Consistent parameter updates.

The changes to get_species_list maintain consistency with the filter method updates.

Also applies to: 96-96

28-30: Consider renaming 'consensus' to 'threshold' for consistency.

As mentioned in previous reviews, other functions in this file use the term 'threshold'. For consistency, consider renaming the 'consensus' parameter to 'threshold'.

coderabbitai · 2025-01-16T16:29:43Z

smact/utils/oxidation.py

+        element_totals = filtered_df.groupby("element")["results_count"].transform(
+            "sum"
+        )
+        filtered_df["species_proportion (%)"] = (
+            filtered_df["results_count"] / element_totals * 100
+        )


🛠️ Refactor suggestion

Add error handling for zero element totals in proportion calculation.

The species proportion calculation could raise a division by zero error if all results are filtered out for an element.

element_totals = filtered_df.groupby("element")["results_count"].transform( "sum" ) +if (element_totals == 0).any(): + raise ValueError( + "No valid results remain for some elements after filtering. " + "Consider adjusting the consensus threshold." + ) filtered_df["species_proportion (%)"] = ( filtered_df["results_count"] / element_totals * 100 )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

element_totals = filtered_df.groupby("element")["results_count"].transform(

"sum"

)

filtered_df["species_proportion (%)"] = (

filtered_df["results_count"] / element_totals * 100

)

element_totals = filtered_df.groupby("element")["results_count"].transform(

"sum"

)

if (element_totals == 0).any():

raise ValueError(

"No valid results remain for some elements after filtering. "

"Consider adjusting the consensus threshold."

)

filtered_df["species_proportion (%)"] = (

filtered_df["results_count"] / element_totals * 100

)

coderabbitai · 2025-01-16T16:29:43Z

smact/utils/oxidation.py

+        commonality_map = {"low": 0, "medium": 10, "high": 50}
+        commonality_threshold = commonality_map.get(commonality)


🛠️ Refactor suggestion

Add input validation for commonality parameter.

The commonality parameter should be validated against allowed values to prevent silent failures with invalid inputs.

commonality_map = {"low": 0, "medium": 10, "high": 50} +if commonality not in commonality_map: + raise ValueError( + f"Invalid commonality value: {commonality}. " + f"Must be one of: {', '.join(commonality_map.keys())}" + ) commonality_threshold = commonality_map.get(commonality)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

commonality_map = {"low": 0, "medium": 10, "high": 50}

commonality_threshold = commonality_map.get(commonality)

commonality_map = {"low": 0, "medium": 10, "high": 50}

if commonality not in commonality_map:

raise ValueError(

f"Invalid commonality value: {commonality}. "

f"Must be one of: {', '.join(commonality_map.keys())}"

)

commonality_threshold = commonality_map.get(commonality)

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 88f4c08 and 3a3c1d0.

📒 Files selected for processing (1)

smact/utils/oxidation.py (5 hunks)

🔇 Additional comments (5)

smact/utils/oxidation.py (5)

58-75: LGTM! Clear documentation and proper parameter propagation

The method signature changes and documentation updates are well-implemented.

Line range hint 135-169: LGTM! Consistent parameter updates and clear documentation

The method changes maintain consistency with other methods and provide clear documentation of the new functionality.

179-179: LGTM! Consistent threshold comparison

The change from > to >= aligns with the consensus threshold usage throughout the class.

37-38: ⚠️ Potential issue

Add input validation for the commonality parameter

The commonality parameter should be validated to prevent silent failures with invalid inputs.
 commonality_map = {"low": 0, "medium": 10, "high": 50}
+if commonality not in commonality_map:
+    raise ValueError(
+        f"Invalid commonality value: {commonality}. "
+        f"Must be one of: {', '.join(commonality_map.keys())}"
+    )
 commonality_threshold = commonality_map.get(commonality)
Likely invalid or redundant comment.

46-48: ⚠️ Potential issue

Add error handling for zero element totals

The species proportion calculation could raise a division by zero error if all results are filtered out for an element.
 element_totals = filtered_df.groupby("element")["results_count"].transform("sum")
+if (element_totals == 0).any():
+    raise ValueError(
+        "No valid results remain for some elements after filtering. "
+        "Consider adjusting the consensus threshold."
+    )
 filtered_df["species_proportion (%)"] = filtered_df["results_count"] / element_totals * 100
Likely invalid or redundant comment.

coderabbitai · 2025-01-16T16:31:15Z

smact/utils/oxidation.py

        self,
+        consensus: int = 3,
        include_one_oxidation_state: bool = False,
        sort_by_occurrences: bool = True,
+        include_zero: bool = False,


🛠️ Refactor suggestion

Add commonality parameter for consistency

This method should include the commonality parameter for consistency with other methods in the class.

def get_species_occurrences_df( self, consensus: int = 3, include_one_oxidation_state: bool = False, sort_by_occurrences: bool = True, - include_zero: bool = False, + include_zero: bool = False, + commonality: str = "low", ):

Committable suggestion skipped: line range outside the PR's diff.

AntObi · 2025-01-16T18:01:32Z

smact/utils/oxidation.py

@@ -142,7 +164,9 @@ def write(
        if not filename.endswith(".txt"):
            filename += ".txt"
        with open(filename, "w") as f:
-            f.write(f"#\n# Oxidation state set\n# Source: ICSD (2024), filtered for > {threshold} reports\n#\n")


I would suggest that users should have the option to either write out lists based on either the threshold/consensus or commonality. This should preserve backwards compatibility, as well as provide flexibility to users to choose how to write the oxidation states lists

AntObi · 2025-01-16T18:03:07Z

smact/utils/oxidation.py


        Returns:
            pd.DataFrame: The filtered oxidation states list as a DataFrame.
        """
+        commonality_map = {"low": 0, "medium": 10, "high": 50}


I would suggest that there's additional argument to the filter function for commonality_map which can take your default arguments.
This provides flexibility to the user if they want to define their own thresholds

KingaMas · 2025-01-17T17:42:48Z

Thanks @AntObi ! I've introduced your suggestions and updated the tests accordingly

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

smact/utils/oxidation.py (1)

55-61: ⚠️ Potential issue

Add error handling for zero element totals.

The species proportion calculation could raise a division by zero error if all results are filtered out for an element.

         element_totals = filtered_df.groupby("element")["results_count"].transform("sum")
+        if (element_totals == 0).any():
+            raise ValueError(
+                "No valid results remain for some elements after filtering. "
+                "Consider adjusting the consensus threshold."
+            )
         filtered_df["species_proportion (%)"] = filtered_df["results_count"] / element_totals * 100

🧹 Nitpick comments (5)

smact/utils/oxidation.py (2)

26-41: LGTM! Consider enhancing the docstring.

The parameter renaming and addition of commonality filtering improve the API. The docstring is comprehensive but could be enhanced by documenting the return value's columns.

         Returns:
-            pd.DataFrame: The filtered oxidation states list as a DataFrame.
+            pd.DataFrame: The filtered oxidation states list as a DataFrame with columns:
+                - element (str): Chemical element symbol
+                - oxidation_state (str): Space-separated oxidation states

179-181: Enhance the file header with consensus threshold information.

The header should mention both the commonality level and consensus threshold for complete filtering information.

             f.write(
-                f"#\n# Oxidation state set\n# Source: ICSD (2024), filtered for {commonality} commonality of reports\n#\n"
+                f"#\n# Oxidation state set\n# Source: ICSD (2024), filtered for {commonality} commonality of reports (consensus ≥ {consensus})\n#\n"
             )

smact/tests/test_utils.py (3)

181-181: Enhance test coverage for commonality parameter.

The test should verify the behaviour with different commonality values.

         threshold = 10
-        filtered_df = self.ox_filter.filter(consensus=threshold)
+        filtered_df = self.ox_filter.filter(consensus=threshold)
+        filtered_df_medium = self.ox_filter.filter(consensus=threshold, commonality="medium")
+        filtered_df_high = self.ox_filter.filter(consensus=threshold, commonality="high")
 
         self.assertIsInstance(filtered_df, pd.DataFrame)
+        self.assertIsInstance(filtered_df_medium, pd.DataFrame)
+        self.assertIsInstance(filtered_df_high, pd.DataFrame)
+        self.assertGreater(len(filtered_df), len(filtered_df_medium))
+        self.assertGreater(len(filtered_df_medium), len(filtered_df_high))

187-227: Enhance file content verification.

The tests could be more thorough in verifying the structure and content of the written files.

         with open(f"{filename}.txt") as f:
             content = f.read()
         # Check if the comment is included in the file
         self.assertIn(comment, content)
+        # Verify file structure
+        lines = content.split("\n")
+        self.assertTrue(any(line.startswith("# Source: ICSD (2024)") for line in lines))
+        self.assertTrue(all(len(line.split()) <= 2 for line in lines if not line.startswith("#") and line))
+        # Verify oxidation states format
+        for line in lines:
+            if line and not line.startswith("#"):
+                parts = line.split()
+                if len(parts) == 2:
+                    self.assertTrue(all(x.isdigit() or x == "-" for x in parts[1].split()))

229-261: Add tests for custom commonality thresholds.

The tests should verify that custom commonality thresholds work correctly.

         # Test with a specific consensus threshold
         species_list_threshold = self.ox_filter.get_species_list(consensus=5)
         self.assertIsInstance(species_list_threshold, list)
         self.assertGreater(len(species_list_threshold), 0)
+
+        # Test with custom commonality threshold
+        species_list_custom = self.ox_filter.get_species_list(
+            consensus=3,
+            commonality_thresholds={"low": 0, "medium": 15, "high": 60}
+        )
+        self.assertIsInstance(species_list_custom, list)
+        self.assertGreater(len(species_list_custom), 0)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3a3c1d0 and 5844fbd.

📒 Files selected for processing (4)

smact/tests/files/oxidation_states_icsd24_consensus.txt (1 hunks)
smact/tests/files/oxidation_states_icsd24_consensus_w_0.txt (1 hunks)
smact/tests/test_utils.py (2 hunks)
smact/utils/oxidation.py (5 hunks)

✅ Files skipped from review due to trivial changes (2)

smact/tests/files/oxidation_states_icsd24_consensus_w_0.txt
smact/tests/files/oxidation_states_icsd24_consensus.txt

🔇 Additional comments (4)

smact/utils/oxidation.py (3)

42-50: Make commonality thresholds configurable.

As suggested in a previous review, the commonality thresholds should be configurable to provide more flexibility to users.
     def filter(
         self,
         consensus: int = 3,
         include_zero: bool = False,
         commonality: str | float = "low",
+        commonality_thresholds: dict[str, float] | None = None,
     ):
-        commonality_map = {"low": 0, "medium": 10, "high": 50}
+        default_map = {"low": 0, "medium": 10, "high": 50}
+        commonality_map = commonality_thresholds or default_map
Line range hint 105-139: Add commonality parameter for consistency.

This method should include the commonality parameter for consistency with other methods in the class.
     def get_species_occurrences_df(
         self,
         consensus: int = 3,
         include_one_oxidation_state: bool = False,
         sort_by_occurrences: bool = True,
         include_zero: bool = False,
+        commonality: str = "low",
     ):
191-191: LGTM! Consistent threshold comparison.

The change to use >= makes the filtering logic consistent across all methods.
smact/tests/test_utils.py (1)

166-167: LGTM! Test paths updated correctly.

The test file paths have been appropriately updated to reflect the consensus-based filtering approach.

codecov · 2025-01-23T16:10:43Z

Codecov Report

Attention: Patch coverage is 93.84615% with 4 lines in your changes missing coverage. Please review.

Project coverage is 75.58%. Comparing base (a92e123) to head (5844fbd).

Files with missing lines	Patch %	Lines
smact/utils/oxidation.py	81.81%	4 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #366      +/-   ##
==========================================
+ Coverage   75.47%   75.58%   +0.11%     
==========================================
  Files          31       31              
  Lines        2642     2679      +37     
==========================================
+ Hits         1994     2025      +31     
- Misses        648      654       +6

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Updated get_species_occurences_df by adding a consensus_threshold

78c3049

coderabbitai bot reviewed Jan 14, 2025

View reviewed changes

get_species_occurences_df2.ipynb Outdated Show resolved Hide resolved

get_species_occurences_df2.ipynb Outdated Show resolved Hide resolved

AntObi added the enhancement label Jan 14, 2025

AntObi reviewed Jan 14, 2025

View reviewed changes

get_species_occurences_df2.ipynb Outdated Show resolved Hide resolved

Kinga Oliwia Mastej and others added 2 commits January 14, 2025 19:45

Removed oxidation state 0

76653e1

Run pre-commit

5f0c162

AntObi requested changes Jan 15, 2025

View reviewed changes

Kinga Oliwia Mastej and others added 3 commits January 16, 2025 16:03

Introducing consensus and commonality thresholds

ced5460

Updated oxidation.py

88f4c08

[pre-commit.ci] auto fixes from pre-commit.com hooks

3a3c1d0

for more information, see https://pre-commit.ci

coderabbitai bot reviewed Jan 16, 2025

View reviewed changes

AntObi reviewed Jan 16, 2025

View reviewed changes

AntObi added the breaking label Jan 17, 2025

KingaMas changed the title ~~Updated get_species_occurences_df by adding a consensus_threshold~~ Updated oxidation.py and its tests Jan 17, 2025

Kinga Oliwia Mastej added 3 commits January 23, 2025 16:04

Update the tests to correctly assess new functions from oxidation.py.

5c86864

Improvements to oxidation.py and tests

b886e0b

Suggested changes implemented

5844fbd

coderabbitai bot reviewed Jan 23, 2025

View reviewed changes

KingaMas requested a review from AntObi January 23, 2025 17:55

AntObi changed the base branch from master to develop January 24, 2025 19:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Updated oxidation.py and its tests #366

Updated oxidation.py and its tests #366

KingaMas commented Jan 14, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 14, 2025 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

AntObi left a comment

AntObi Jan 15, 2025

coderabbitai bot left a comment

coderabbitai bot Jan 16, 2025

coderabbitai bot Jan 16, 2025

coderabbitai bot left a comment

coderabbitai bot Jan 16, 2025

AntObi Jan 16, 2025

AntObi Jan 16, 2025

KingaMas commented Jan 17, 2025

coderabbitai bot left a comment

codecov bot commented Jan 23, 2025 •

edited

Loading

		species_occurrences_df = species_occurrences_df[["species", "results_count"]]
		species_occurrences_df = species_occurrences_df[["element", "species", "results_count"]]

	def test_oxidation_states_filter_species_occurrences(self):
	species_occurrences_df = self.ox_filter.get_species_occurrences_df()
	self.assertIsInstance(species_occurrences_df, pd.DataFrame)
	self.assertEqual(
	species_occurrences_df.columns.tolist(),
	["species", "results_count"],
	)
	self.assertEqual(species_occurrences_df.shape, (490, 2))
	self.assertEqual(species_occurrences_df.iloc[0]["species"], "O2-")
	self.assertEqual(species_occurrences_df.iloc[0]["results_count"], 116910)

		commonality_map = {"low": 0, "medium": 10, "high": 50}
		commonality_threshold = commonality_map.get(commonality)

Updated oxidation.py and its tests #366

Are you sure you want to change the base?

Updated oxidation.py and its tests #366

Conversation

KingaMas commented Jan 14, 2025 • edited by coderabbitai bot Loading

Pull Request Template

Description

Type of change

Reviewers

Checklist

Summary by CodeRabbit

Release Notes

coderabbitai bot commented Jan 14, 2025 • edited Loading

Walkthrough

Changes

Possibly related PRs

Suggested labels

Poem

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

AntObi left a comment

Choose a reason for hiding this comment

AntObi Jan 15, 2025

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Jan 16, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 16, 2025

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Jan 16, 2025

Choose a reason for hiding this comment

AntObi Jan 16, 2025

Choose a reason for hiding this comment

AntObi Jan 16, 2025

Choose a reason for hiding this comment

KingaMas commented Jan 17, 2025

coderabbitai bot left a comment

Choose a reason for hiding this comment

codecov bot commented Jan 23, 2025 • edited Loading

Codecov Report

KingaMas commented Jan 14, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 14, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

codecov bot commented Jan 23, 2025 •

edited

Loading