Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

POC: Refactor tests that checks the string output from the info module #3564

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 9 additions & 10 deletions pygmt/tests/test_clib_virtualfile_from_matrix.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
"""

import numpy as np
import numpy.testing as npt
import pytest
from pygmt import clib
from pygmt.helpers import GMTTempFile
Expand All @@ -27,11 +28,10 @@ def test_virtualfile_from_matrix(dtypes):
with clib.Session() as lib:
with lib.virtualfile_from_matrix(data) as vfile:
with GMTTempFile() as outfile:
lib.call_module("info", [vfile, f"->{outfile.name}"])
output = outfile.read(keep_tabs=True)
bounds = "\t".join([f"<{col.min():.0f}/{col.max():.0f}>" for col in data.T])
expected = f"<matrix memory>: N = {shape[0]}\t{bounds}\n"
assert output == expected
lib.call_module("info", [vfile, "-C", f"->{outfile.name}"])
output = outfile.loadtxt()
npt.assert_equal(output[::2], data.min(axis=0))
npt.assert_equal(output[1::2], data.max(axis=0))
Comment on lines 28 to +34
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we can change it to:

        with clib.Session() as lib:
            with (
                lib.virtualfile_from_matrix(data) as vintbl,
                lib.virtualfile_out(kind="dataset") as vouttbl,
            ):
                lib.call_module("read", [vintbl, vouttbl, "-Td"])
                output = lib.virtualfile_to_dataset(vfname=vouttbl, output_type="numpy")
                npt.assert_equal(output, data)

Pros:

  • No need to use GMTTempFile
  • Check the full data array rather than just the min/max values
  • This is exactly what we are doing (virtualfile_in/virtualfile_out/call_module/virtualfile_to_dataset) when wrapping modules

Cons:

  • Calls four Session functions in a single test, violating the rule that "unit test should test one thing"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems @weiji14 is against the changes in this comment. Are you OK with the changes in https://github.com/GenericMappingTools/pygmt/pull/3564/files, i.e., checking the numerical min/max values rather than checking the string output from info.

Copy link
Member Author

@seisman seisman Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For comparison, we sometimes use Session.write_data to write the dataset into a temporary file and load it to compare the full data array.

with GMTTempFile() as tmp_file:
lib.write_data(
"GMT_IS_VECTOR",
"GMT_IS_POINT",
"GMT_WRITE_SET",
wesn,
tmp_file.name,
dataset,
)
# Load the data and check that it's correct
newx, newy, newz = tmp_file.loadtxt(unpack=True, dtype=dtype)

Copy link
Member

@weiji14 weiji14 Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're missing the <matrix memory>: N part which checks that the array was loaded as a GMT matrix, rather than via a GMT vector <vector memory>. Also missing the count/length of the array.

I think it's best to keep this unit test simple, we're also using .loadtxt here instead of .read, which means it is going through numpy.loadtxt that introduces another layer or source of potential breakages.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're missing the <matrix memory>: N part which checks that the array was loaded as a GMT matrix, rather than via a GMT vector <vector memory>.

I actually don't care if it's a matrix or vector memory (it's just decided by the GMT_VIA_VECTOR/GMT_VIA_MATRIX modifier, as long as the data array is correct.

The current way in the main branch (checking string output from info) only checks the min/max values in each column, which is not 100% correct, since the actual data array that GMT API gets may differ from the array we pass without affecting the min/max values. That's why checking the full data array is more robust. Please see the Session.write_data way above.



def test_virtualfile_from_matrix_slice(dtypes):
Expand All @@ -47,8 +47,7 @@ def test_virtualfile_from_matrix_slice(dtypes):
with clib.Session() as lib:
with lib.virtualfile_from_matrix(data) as vfile:
with GMTTempFile() as outfile:
lib.call_module("info", [vfile, f"->{outfile.name}"])
output = outfile.read(keep_tabs=True)
bounds = "\t".join([f"<{col.min():.0f}/{col.max():.0f}>" for col in data.T])
expected = f"<matrix memory>: N = {rows}\t{bounds}\n"
assert output == expected
lib.call_module("info", [vfile, "-C", f"->{outfile.name}"])
output = outfile.loadtxt()
npt.assert_equal(output[::2], data.min(axis=0))
npt.assert_equal(output[1::2], data.max(axis=0))