POC: Refactor tests that checks the string output from the info module #3564

seisman · 2024-10-30T12:53:01Z

Instead of checking the string output from info, I feel that checking the numpy array is more accurate and more readable.

This is just a proof of concept. I'll work on other tests if you agree with the changes.

Please vote by 👍 or 👎 or leave comments.

seisman · 2024-10-30T15:34:31Z

pygmt/tests/test_clib_virtualfile_from_matrix.py

        with clib.Session() as lib:
            with lib.virtualfile_from_matrix(data) as vfile:
                with GMTTempFile() as outfile:
-                    lib.call_module("info", [vfile, f"->{outfile.name}"])
-                    output = outfile.read(keep_tabs=True)
-            bounds = "\t".join([f"<{col.min():.0f}/{col.max():.0f}>" for col in data.T])
-            expected = f"<matrix memory>: N = {shape[0]}\t{bounds}\n"
-            assert output == expected
+                    lib.call_module("info", [vfile, "-C", f"->{outfile.name}"])
+                    output = outfile.loadtxt()
+        npt.assert_equal(output[::2], data.min(axis=0))
+        npt.assert_equal(output[1::2], data.max(axis=0))
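
To illustrate why the slices `output[::2]` and `output[1::2]` line up with the column minima and maxima: with `-C`, `info` reports the extremes as a single row of numbers that alternates min and max for each column. The sketch below mimics that layout with plain NumPy. `fake_info_c` is a hypothetical stand-in for the GMT call, not a PyGMT function, and the sample matrix is an assumption made for the example.

```python
import numpy as np
import numpy.testing as npt


def fake_info_c(data):
    """Hypothetical stand-in for `info -C`: returns min0, max0, min1, max1, ..."""
    out = np.empty(2 * data.shape[1])
    out[::2] = data.min(axis=0)   # even positions hold the column minima
    out[1::2] = data.max(axis=0)  # odd positions hold the column maxima
    return out


# A small 4x3 matrix with distinct values in every column.
data = np.arange(12, dtype=float).reshape(4, 3)
output = fake_info_c(data)

# The same two assertions as in the refactored test.
npt.assert_equal(output[::2], data.min(axis=0))
npt.assert_equal(output[1::2], data.max(axis=0))
```

In the real test, `output` comes from `outfile.loadtxt()` rather than from a helper like this one.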


Or we can change it to:

    with clib.Session() as lib:
        with (
            lib.virtualfile_from_matrix(data) as vintbl,
            lib.virtualfile_out(kind="dataset") as vouttbl,
        ):
            lib.call_module("read", [vintbl, vouttbl, "-Td"])
            output = lib.virtualfile_to_dataset(vfname=vouttbl, output_type="numpy")
        npt.assert_equal(output, data)

Pros:

No need to use GMTTempFile

Check the full data array rather than just the min/max values

This is exactly what we are doing (virtualfile_in/virtualfile_out/call_module/virtualfile_to_dataset) when wrapping modules

Cons:

Calls four Session functions in a single test, violating the rule that "unit test should test one thing"

It seems @weiji14 is against the changes in this comment. Are you OK with the changes in https://github.com/GenericMappingTools/pygmt/pull/3564/files, i.e., checking the numerical min/max values rather than the string output from info?

For comparison, we sometimes use Session.write_data to write the dataset into a temporary file and load it to compare the full data array.

pygmt/pygmt/tests/test_clib_put_vector.py

Lines 46 to 56 in 82b0c73

    with GMTTempFile() as tmp_file:
        lib.write_data(
            "GMT_IS_VECTOR",
            "GMT_IS_POINT",
            "GMT_WRITE_SET",
            wesn,
            tmp_file.name,
            dataset,
        )
        # Load the data and check that it's correct
        newx, newy, newz = tmp_file.loadtxt(unpack=True, dtype=dtype)

We're missing the <matrix memory>: N part which checks that the array was loaded as a GMT matrix, rather than via a GMT vector <vector memory>. Also missing the count/length of the array.

I think it's best to keep this unit test simple. We're also using .loadtxt here instead of .read, which means the output goes through numpy.loadtxt, adding another layer and another potential source of breakage.
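
For reference, the string that the original test compares against can be rebuilt outside GMT. The sketch below reuses the two lines removed in the diff and applies them to a small sample matrix (the matrix itself is an assumption made for illustration). It shows that the string carries the `<matrix memory>` tag and the row count `N` in addition to the per-column min/max.

```python
import numpy as np

# Sample matrix for illustration only; the real test builds its own data.
data = np.arange(12, dtype=float).reshape(4, 3)
shape = data.shape

# Same construction as the lines removed in the diff.
bounds = "\t".join([f"<{col.min():.0f}/{col.max():.0f}>" for col in data.T])
expected = f"<matrix memory>: N = {shape[0]}\t{bounds}\n"

# The header tag and the row count are part of the compared string.
has_matrix_tag = expected.startswith("<matrix memory>")
has_row_count = f"N = {shape[0]}" in expected
```

The numerical check with `-C` keeps only the min/max part of this information.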

> We're missing the <matrix memory>: N part which checks that the array was loaded as a GMT matrix, rather than via a GMT vector <vector memory>.

I actually don't care whether it's matrix or vector memory (that is decided solely by the GMT_VIA_VECTOR/GMT_VIA_MATRIX modifier), as long as the data array is correct.

The current approach in the main branch (checking the string output from info) only checks the min/max values in each column. That is not a complete check, since the data array that the GMT API receives could differ from the array we pass without changing the min/max values. That's why checking the full data array is more robust. Please see the Session.write_data approach above.
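
A small NumPy sketch of this point: two arrays can share the same per-column min/max while differing in an interior row, so a min/max comparison passes where a full-array comparison fails. Both arrays are made up for illustration, and `received` stands in for whatever GMT would hypothetically hand back.

```python
import numpy as np

# Array passed in by the test.
original = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
# Hypothetical array on the GMT side: the middle row differs,
# but the per-column minima and maxima are unchanged.
received = np.array([[1.0, 10.0], [2.5, 15.0], [3.0, 30.0]])

# A min/max check cannot tell the two arrays apart...
same_extremes = np.array_equal(
    original.min(axis=0), received.min(axis=0)
) and np.array_equal(original.max(axis=0), received.max(axis=0))

# ...while a full-array comparison can.
same_data = np.array_equal(original, received)
```

Here `same_extremes` is true while `same_data` is false, which is the gap that comparing the full array closes.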

Refactor tests for test_clib_virtualfile_from_matrix

c925569

seisman added the maintenance and needs review labels on Oct 30, 2024


seisman removed the needs review label on Nov 5, 2024
