Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Adding support for reading vaspwave.h5 in Wavecar and Chgcar class #3768

Open
wants to merge 114 commits into
base: master
Choose a base branch
from

Conversation

Shibu778
Copy link

@Shibu778 Shibu778 commented Apr 18, 2024

Dear Maintainers,
This is my first contribution to pymatgen. I am relatively new to opensource and would appreciate your feedback and guidance in improving the added features and this PR. In further commits, I will add read_from_hdf5 functionality to Chgcar and write all the tests.

Summary

Major changes:

  • feature 1: Added a method read_from_hdf5 in Wavecar class to read WAVECAR related information from vaspwave.h5 file (refer to vaspwiki).
  • fix 1: made a function to read the legacy WAVECAR format that was previously implemented in Wavecar class.
  • fix 2: added appropriate comments and doc-string
  • fix 3: Some part of the pymatgen/io/vasp/outputs.py file got formatted by black formatter

Todos

This is work in progress, following needs to be done

  • feature 2: Add the same feature in Chgcar class
  • fix 4: Write appropriate test to read the hdf5 files in using Wavecar and Chgcar class

Pre-commit result

pre-commit run --all-files passed all everything

~\Github\pymatgen> pre-commit run --all-files
ruff.....................................................................Passed
ruff-format..............................................................Passed
check yaml...............................................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
mypy.....................................................................Passed
codespell................................................................Passed
cython-lint..............................................................Passed
double-quote Cython strings..............................................Passed
blacken-docs.............................................................Passed
markdownlint.............................................................Passed
nbstripout...............................................................Passed

Checklist

  • Google format doc strings added. Check with ruff.
  • Type annotations included. Check with mypy.
  • The tests for fixes in Wavecar class remains unchanged to read legacy WAVECAR file
  • Tests added for new features.

Summary by CodeRabbit

  • Refactor

    • Improved code readability and formatting in various functions.
    • Updated function signatures for clarity.
  • Documentation

    • Enhanced function descriptions with new comments and docstrings.

Copy link

coderabbitai bot commented Apr 18, 2024

Walkthrough

The recent changes across the pymatgen/io/vasp/outputs.py and tests/io/vasp/test_outputs.py files focus on improving code readability and maintainability. These modifications include importing h5py, enhancing tuple unpacking, refining code formatting, updating function signatures with type annotations, and enriching documentation through comments and docstrings.

Changes

File Path Changes Summary
pymatgen/io/vasp/outputs.py Added h5py import, refactored tuple unpacking, improved formatting, updated function signatures, and enriched documentation.
tests/io/vasp/test_outputs.py Restructured code for readability and maintainability, improved code organization without altering core functionality.

🐰✨
A hop of joy for cleaner code,
With tuples unpacked along the road.
Type hints and docs, now clear as day,
In outputs.py, improvements stay.
Cheers to the devs, with ears so long,
For making the code both swift and strong.
🌟📄


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Comment on lines 371 to 375
(
self.kpoints,
self.actual_kpoints,
self.actual_kpoints_weights,
) = self._parse_kpoints(elem)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure proper error handling in the __init__ method for file operations.

Consider adding a try-except block around the file opening and parsing logic to handle potential I/O errors or XML parsing issues gracefully.

Comment on lines 1063 to 1072
[
max(cbm_spins[0] - vbm_spins[0], 0),
max(cbm_spins[1] - vbm_spins[1], 0),
],
[cbm_spins[0], cbm_spins[1]],
[vbm_spins[0], vbm_spins[1]],
[vbm_spins_kpoints[0] == cbm_spins_kpoints[0], vbm_spins_kpoints[1] == cbm_spins_kpoints[1]],
[
vbm_spins_kpoints[0] == cbm_spins_kpoints[0],
vbm_spins_kpoints[1] == cbm_spins_kpoints[1],
],
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refine the method for extracting eigenvalue band properties.

-    def eigenvalue_band_properties(self):
+    def get_eigenvalue_band_properties(self):

Rename the method to follow Python's naming conventions for methods that return data.


Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
[
max(cbm_spins[0] - vbm_spins[0], 0),
max(cbm_spins[1] - vbm_spins[1], 0),
],
[cbm_spins[0], cbm_spins[1]],
[vbm_spins[0], vbm_spins[1]],
[vbm_spins_kpoints[0] == cbm_spins_kpoints[0], vbm_spins_kpoints[1] == cbm_spins_kpoints[1]],
[
vbm_spins_kpoints[0] == cbm_spins_kpoints[0],
vbm_spins_kpoints[1] == cbm_spins_kpoints[1],
],
[
max(cbm_spins[0] - vbm_spins[0], 0),
max(cbm_spins[1] - vbm_spins[1], 0),
],
[cbm_spins[0], cbm_spins[1]],
[vbm_spins[0], vbm_spins[1]],
[
vbm_spins_kpoints[0] == cbm_spins_kpoints[0],
vbm_spins_kpoints[1] == cbm_spins_kpoints[1],
],

Ensure the data structure for WSWQ is correctly initialized.

Verify that the initialization of the data arrays in the WSWQ class handles all possible configurations correctly, especially with different spin and k-point counts.


Optimize the file writing process in the write_file method.

-    def write_file(self, filename):
+    def save_to_file(self, filename):

Rename the method to better reflect its functionality and align with common Python practices.


Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
[
max(cbm_spins[0] - vbm_spins[0], 0),
max(cbm_spins[1] - vbm_spins[1], 0),
],
[cbm_spins[0], cbm_spins[1]],
[vbm_spins[0], vbm_spins[1]],
[vbm_spins_kpoints[0] == cbm_spins_kpoints[0], vbm_spins_kpoints[1] == cbm_spins_kpoints[1]],
[
vbm_spins_kpoints[0] == cbm_spins_kpoints[0],
vbm_spins_kpoints[1] == cbm_spins_kpoints[1],
],
[
max(cbm_spins[0] - vbm_spins[0], 0),
max(cbm_spins[1] - vbm_spins[1], 0),
],
[cbm_spins[0], cbm_spins[1]],
[vbm_spins[0], vbm_spins[1]],
[
vbm_spins_kpoints[0] == cbm_spins_kpoints[0],
vbm_spins_kpoints[1] == cbm_spins_kpoints[1],
],

@@ -1736,6 +1819,32 @@ def test_standard(self):
finally:
sys.stdout = saved_stdout

def test_vaspwave(self):
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This functions checks whether the information read from legacy and hdf5 files are same or not.

for b in range(vaspwave.nb):
assert len(vaspwave.coeffs[k][b]) == len(wavecar.Gpoints[k])

with pytest.raises(ValueError, match="invalid filetype='json'"):
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proper filetype should be given.

with pytest.raises(ValueError, match="invalid filetype='json'"):
Wavecar(f"{VASP_OUT_DIR}/vaspwave.fccSi.h5.gz", filetype="json", vasp_type="std")

with pytest.raises(ValueError, match="Invalid rtag=.+, must be one of"):
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default filetype is legacy. It will result in Value error for in valid rtag while reading hdf5 file as it tries to read the hdf5 file as legacy WAVECAR file. So, hdf5 filetype should be provided while reading hdf5 file.

@mkhorton
Copy link
Member

Thanks for the PR @Shibu778! My sincere apologies there was no attention given to this PR before now.

Could you resolve the current merge conflicts and ensure the automated CI checks still pass? The pymatgen code has now been moved into /src directory, so the changes may just need to be moved.

Other than this, I notice a lot of code is duplicated between read_from_hdf5 and read_from_legacy. I wonder if maybe the duplicated code can be merged into a single function?

It should also be possible to detect the filetype automatically: use the existing logical if the filename is *WAVECAR*, and use the new logic if the filename is *h5*.

@janosh janosh added enhancement A new feature or improvement to an existing one io Input/output functionality vasp Vienna Ab initio Simulation Package labels Jul 25, 2024
janosh and others added 7 commits August 4, 2024 07:15
found with regex \b\s{2,}\b[^\d]
… returning nothing

fix brittle TestQuasiHarmonicDebyeApprox.test_gruneisen_parameter from tight assert_allclose abs tol
The pdfs list should be constructed from filenames only without other path components, i.e. use last entry from split()
kavanase and others added 27 commits September 9, 2024 18:01
* Fix `apply_operation(fractional=True)`

* Add tests for `apply_operation`

* pre-commit auto-fixes

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…config='cp39-*', skip_config='*musllinux*', requires_python=<SpecifierSet('>=3.10')>, prerelease_pythons=False)
* Use distinct names for artifacts

* Bump `upload-artifact`

* allow any v4 download-artifact

---------

Co-authored-by: Janosh Riebesell <[email protected]>
* Update surface and interface generating functions

1. fixing problem for the lll_reduce process when making slabs, doing mapping before updating the structure
2. allow to set ftol of the termination distances for hierarchical cluster so that some non-identical terminations close to each other can be identified
3. allow to add index for terminations so that terminations with the same space group can be distinguished

Interfaces made by identical slabs can be non-identical because the relative transformation of the misorientation and the termination variation do not ensure symmetry, especially when the film and substrate have different point groups. Therefore, the termination finding function should allow to generate all the possible terminations. This can help others to develop more robust algorithm to group the equivalent interfaces made by different terminations.

---------

Signed-off-by: Jason Xie <[email protected]>
Signed-off-by: Jason Xie <[email protected]>
Co-authored-by: Janosh Riebesell <[email protected]>
* push a random commit to get a baseline

* install native micromamba + native uv

* trigger uv to run a second time, hopefully cached

* get a benchmark of pip install uv

* remove manual install of build dependencies

* more verbose uv

* remove debug output
* fix all ruff DOC202

https://docs.astral.sh/ruff/rules/docstring-extraneous-returns/

* doc str format: "list[...]: List of ..." -> "list[...]: ..."
…rialsproject#4071)

* fixed electronic step check with algo = exact and nelm = 1

* pre-commit auto-fixes

* fixed the cases in which ALGO does not appear in incar

* pre-commit auto-fixes

---------

Co-authored-by: yanghan-microsoft <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Added DebyeCalculator as an external tool

Signed-off-by: Andy S. Anker <[email protected]>
* Update `slme` function so it doesn't overwrite the input data

* Basic plotting updates; remove hard-coded x-limit, add some labels
* Fix bug with species defaults

NEed to specify the default value properly

* pre-commit auto-fixes

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…ect#4084)

* revert to strict=False for some zip

* add some return types

* Revert "add some return types"

This reverts commit 0663667.

* make test names follow naming convention

* avoid module level variable

* separate tests for basis and potential
* use np.testing to check dict equality

* add unit tests

* perhaps use a helper function instead?

* add test for misc, thanks gpt

* fix typo

* add check if return type
* pip install BoltzTraP2

* remove windows exclusion pin and see what happens

* bump bt2 ver

* skip bt2 for windows

* bump bt2 ver again

* still skip win for now as cannot build

* give VS build tools a try

* revert to skip windows

* remove unnecessary venv activate
@Shibu778
Copy link
Author

Shibu778 commented Oct 5, 2024

Dear @janosh
Thanks for your suggestion. Currently, I am trying to fix the merge conflict. I tried different approaches, and it's becoming complicated day by day. I am 8 commit ahead of the original repository and 304 commits behind. My effort in resolving the conflict first is polluting this pull request. What do you suggest I should do? Should we close this pull request and start a new one?

@Shibu778
Copy link
Author

Shibu778 commented Oct 5, 2024

Now I have synced the forked repository properly. I will start adding the feature reading WAVECAR and CHGCAR in hdf5 format soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A new feature or improvement to an existing one io Input/output functionality vasp Vienna Ab initio Simulation Package
Projects
None yet
Development

Successfully merging this pull request may close these issues.