
INSTALL: All tests are not passing after update on Mac (Intel-based) #3800

Open
nhartney opened this issue Oct 16, 2024 · 11 comments

Comments

@nhartney

Describe the error
Three tests fail after a Firedrake update: test_dg_advection_icosahedral_sphere_parallel, test_dg_advection_cubed_sphere_parallel, and test_poisson_analytic_linear_parallel.

Steps to Reproduce
    firedrake-update --install gusto
    pip uninstall -y h5py
    pip uninstall -y mpi4py
    firedrake-update --install gusto
    firedrake-clean
    cd $VIRTUAL_ENV/src/firedrake
    pytest tests/regression/ -k "poisson_strong or stokes_mini or dg_advection"

Expected behavior
Expected all tests to pass.

Error message

============================= test session starts ==============================
platform darwin -- Python 3.10.8, pytest-8.3.3, pluggy-1.5.0
rootdir: /Users/Jemma/firedrake/src/firedrake
configfile: setup.cfg
plugins: anyio-4.3.0, mpi-0.1, xdist-3.5.0, nbval-0.11.0
collected 3596 items / 3569 deselected / 2 skipped / 27 selected               

tests/regression/test_dg_advection.py .F.F                               [ 14%]
tests/regression/test_poisson_strong_bcs.py ................F            [ 77%]
tests/regression/test_poisson_strong_bcs_nitsche.py ....                 [ 92%]
tests/regression/test_stokes_mini.py ..                                  [100%]

=================================== FAILURES ===================================
________________ test_dg_advection_icosahedral_sphere_parallel _________________

args = (), kwargs = {}

    def parallel_callback(*args, **kwargs):
>       subprocess.run(cmd, check=True)

../pytest-mpi/pytest_mpi.py:210: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

input = None, capture_output = False, timeout = None, check = True
popenargs = (['mpiexec', '-n', '1', '-genv', '_PYTEST_MPI_CHILD_PROCESS', '1', ...],)
kwargs = {}
process = <Popen: returncode: 6 args: ['mpiexec', '-n', '1', '-genv', '_PYTEST_MPI_CHI...>
stdout = None, stderr = None, retcode = 6

    def run(*popenargs,
            input=None, capture_output=False, timeout=None, check=False, **kwargs):
        """Run command with arguments and return a CompletedProcess instance.
    
        The returned instance will have attributes args, returncode, stdout and
        stderr. By default, stdout and stderr are not captured, and those attributes
        will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them,
        or pass capture_output=True to capture both.
    
        If check is True and the exit code was non-zero, it raises a
        CalledProcessError. The CalledProcessError object will have the return code
        in the returncode attribute, and output & stderr attributes if those streams
        were captured.
    
        If timeout is given, and the process takes too long, a TimeoutExpired
        exception will be raised.
    
        There is an optional argument "input", allowing you to
        pass bytes or a string to the subprocess's stdin.  If you use this argument
        you may not also use the Popen constructor's "stdin" argument, as
        it will be used internally.
    
        By default, all communication is in bytes, and therefore any "input" should
        be bytes, and the stdout and stderr will be bytes. If in text mode, any
        "input" should be a string, and stdout and stderr will be strings decoded
        according to locale encoding, or by "encoding" if set. Text mode is
        triggered by setting any of text, encoding, errors or universal_newlines.
    
        The other arguments are the same as for the Popen constructor.
        """
        if input is not None:
            if kwargs.get('stdin') is not None:
                raise ValueError('stdin and input arguments may not both be used.')
            kwargs['stdin'] = PIPE
    
        if capture_output:
            if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
                raise ValueError('stdout and stderr arguments may not be used '
                                 'with capture_output.')
            kwargs['stdout'] = PIPE
            kwargs['stderr'] = PIPE
    
        with Popen(*popenargs, **kwargs) as process:
            try:
                stdout, stderr = process.communicate(input, timeout=timeout)
            except TimeoutExpired as exc:
                process.kill()
                if _mswindows:
                    # Windows accumulates the output in a single blocking
                    # read() call run on child threads, with the timeout
                    # being done in a join() on those threads.  communicate()
                    # _after_ kill() is required to collect that and add it
                    # to the exception.
                    exc.stdout, exc.stderr = process.communicate()
                else:
                    # POSIX _communicate already populated the output so
                    # far into the TimeoutExpired exception.
                    process.wait()
                raise
            except:  # Including KeyboardInterrupt, communicate handled that.
                process.kill()
                # We don't call process.wait() as .__exit__ does that for us.
                raise
            retcode = process.poll()
            if check and retcode:
>               raise CalledProcessError(retcode, process.args,
                                         output=stdout, stderr=stderr)
E               subprocess.CalledProcessError: Command '['mpiexec', '-n', '1', '-genv', '_PYTEST_MPI_CHILD_PROCESS', '1', '/Users/Jemma/firedrake/bin/pytest', '--runxfail', '-s', '-q', '/Users/Jemma/firedrake/src/firedrake/tests/regression/test_dg_advection.py::test_dg_advection_icosahedral_sphere_parallel', ':', '-n', '2', '/Users/Jemma/firedrake/bin/pytest', '--runxfail', '-s', '-q', '/Users/Jemma/firedrake/src/firedrake/tests/regression/test_dg_advection.py::test_dg_advection_icosahedral_sphere_parallel', '--tb=no', '--no-summary', '--no-header', '--disable-warnings', '--show-capture=no']' returned non-zero exit status 6.

/usr/local/Cellar/[email protected]/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/subprocess.py:526: CalledProcessError
----------------------------- Captured stdout call -----------------------------

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 1688 RUNNING AT gbigpro.local
=   EXIT CODE: 6
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Abort trap: 6 (signal 6)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
----------------------------- Captured stderr call -----------------------------
firedrake:WARNING OMP_NUM_THREADS is not set or is set to a value greater than 1, we suggest setting OMP_NUM_THREADS=1 to improve performance
Fatal Python error: Aborted

Current thread 0x0000000110c88600 (most recent call first):
  File "/Users/Jemma/firedrake/src/firedrake/firedrake/variational_solver.py", line 324 in solve
  File "/Users/Jemma/firedrake/src/firedrake/firedrake/adjoint_utils/variational_solver.py", line 89 in wrapper
  File "/Users/Jemma/firedrake/src/firedrake/tests/regression/test_dg_advection.py", line 48 in run_test
  File "/Users/Jemma/firedrake/src/firedrake/tests/regression/test_dg_advection.py", line 74 in test_dg_advection_icosahedral_sphere_parallel
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/python.py", line 159 in pytest_pyfunc_call
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/python.py", line 1627 in runtest
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/runner.py", line 174 in pytest_runtest_call
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/runner.py", line 242 in <lambda>
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/runner.py", line 341 in from_call
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/runner.py", line 241 in call_and_report
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/runner.py", line 132 in runtestprotocol
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/runner.py", line 113 in pytest_runtest_protocol
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/main.py", line 362 in pytest_runtestloop
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/main.py", line 337 in _main
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/main.py", line 283 in wrap_session
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/main.py", line 330 in pytest_cmdline_main
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/config/__init__.py", line 175 in main
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/config/__init__.py", line 201 in console_main
  File "/Users/Jemma/firedrake/bin/pytest", line 8 in <module>

Extension modules: mpi4py.MPI, zmq.backend.cython.context, zmq.backend.cython.message, zmq.backend.cython.socket, zmq.backend.cython._device, zmq.backend.cython._poll, zmq.backend.cython._proxy_steerable, zmq.backend.cython._version, zmq.backend.cython.error, zmq.backend.cython.utils, tornado.speedups, petsc4py.PETSc, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, siphash24, pyop2.sparsity, scipy._lib._ccallback_c, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, scipy.linalg._cythonized_array_utils, scipy.linalg._solve_toeplitz, scipy.linalg._flinalg, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, scipy.special._ellip_harm_2, symengine.lib.symengine_wrapper, firedrake.cython.dmcommon, firedrake.cython.extrusion_numbering, firedrake.cython.spatialindex, firedrake.cython.patchimpl, h5py._errors, h5py.defs, h5py._objects, h5py.h5, h5py.utils, h5py.h5t, h5py.h5s, h5py.h5ac, h5py.h5p, h5py.h5r, h5py._proxy, h5py._conv, h5py.h5z, h5py.h5a, h5py.h5d, h5py.h5ds, h5py.h5g, h5py.h5i, h5py.h5o, h5py.h5f, h5py.h5fd, h5py.h5pl, h5py.h5l, h5py._selector, firedrake.cython.hdf5interface, firedrake.cython.mgimpl, vtkmodules.vtkCommonCore, vtkmodules.vtkCommonMath, vtkmodules.vtkCommonTransforms, vtkmodules.vtkCommonDataModel, matplotlib._c_internal_utils, PIL._imaging, matplotlib._path, kiwisolver._cext, matplotlib._image (total: 96)
___________________ test_dg_advection_cubed_sphere_parallel ____________________

args = (), kwargs = {}

    def parallel_callback(*args, **kwargs):
>       subprocess.run(cmd, check=True)

../pytest-mpi/pytest_mpi.py:210: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

input = None, capture_output = False, timeout = None, check = True
popenargs = (['mpiexec', '-n', '1', '-genv', '_PYTEST_MPI_CHILD_PROCESS', '1', ...],)
kwargs = {}
process = <Popen: returncode: 6 args: ['mpiexec', '-n', '1', '-genv', '_PYTEST_MPI_CHI...>
stdout = None, stderr = None, retcode = 6

    def run(*popenargs,
            input=None, capture_output=False, timeout=None, check=False, **kwargs):
        """Run command with arguments and return a CompletedProcess instance.
    
        The returned instance will have attributes args, returncode, stdout and
        stderr. By default, stdout and stderr are not captured, and those attributes
        will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them,
        or pass capture_output=True to capture both.
    
        If check is True and the exit code was non-zero, it raises a
        CalledProcessError. The CalledProcessError object will have the return code
        in the returncode attribute, and output & stderr attributes if those streams
        were captured.
    
        If timeout is given, and the process takes too long, a TimeoutExpired
        exception will be raised.
    
        There is an optional argument "input", allowing you to
        pass bytes or a string to the subprocess's stdin.  If you use this argument
        you may not also use the Popen constructor's "stdin" argument, as
        it will be used internally.
    
        By default, all communication is in bytes, and therefore any "input" should
        be bytes, and the stdout and stderr will be bytes. If in text mode, any
        "input" should be a string, and stdout and stderr will be strings decoded
        according to locale encoding, or by "encoding" if set. Text mode is
        triggered by setting any of text, encoding, errors or universal_newlines.
    
        The other arguments are the same as for the Popen constructor.
        """
        if input is not None:
            if kwargs.get('stdin') is not None:
                raise ValueError('stdin and input arguments may not both be used.')
            kwargs['stdin'] = PIPE
    
        if capture_output:
            if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
                raise ValueError('stdout and stderr arguments may not be used '
                                 'with capture_output.')
            kwargs['stdout'] = PIPE
            kwargs['stderr'] = PIPE
    
        with Popen(*popenargs, **kwargs) as process:
            try:
                stdout, stderr = process.communicate(input, timeout=timeout)
            except TimeoutExpired as exc:
                process.kill()
                if _mswindows:
                    # Windows accumulates the output in a single blocking
                    # read() call run on child threads, with the timeout
                    # being done in a join() on those threads.  communicate()
                    # _after_ kill() is required to collect that and add it
                    # to the exception.
                    exc.stdout, exc.stderr = process.communicate()
                else:
                    # POSIX _communicate already populated the output so
                    # far into the TimeoutExpired exception.
                    process.wait()
                raise
            except:  # Including KeyboardInterrupt, communicate handled that.
                process.kill()
                # We don't call process.wait() as .__exit__ does that for us.
                raise
            retcode = process.poll()
            if check and retcode:
>               raise CalledProcessError(retcode, process.args,
                                         output=stdout, stderr=stderr)
E               subprocess.CalledProcessError: Command '['mpiexec', '-n', '1', '-genv', '_PYTEST_MPI_CHILD_PROCESS', '1', '/Users/Jemma/firedrake/bin/pytest', '--runxfail', '-s', '-q', '/Users/Jemma/firedrake/src/firedrake/tests/regression/test_dg_advection.py::test_dg_advection_cubed_sphere_parallel', ':', '-n', '2', '/Users/Jemma/firedrake/bin/pytest', '--runxfail', '-s', '-q', '/Users/Jemma/firedrake/src/firedrake/tests/regression/test_dg_advection.py::test_dg_advection_cubed_sphere_parallel', '--tb=no', '--no-summary', '--no-header', '--disable-warnings', '--show-capture=no']' returned non-zero exit status 6.

/usr/local/Cellar/[email protected]/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/subprocess.py:526: CalledProcessError
----------------------------- Captured stdout call -----------------------------

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 2217 RUNNING AT gbigpro.local
=   EXIT CODE: 6
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Abort trap: 6 (signal 6)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
----------------------------- Captured stderr call -----------------------------
firedrake:WARNING OMP_NUM_THREADS is not set or is set to a value greater than 1, we suggest setting OMP_NUM_THREADS=1 to improve performance
Fatal Python error: Aborted

Current thread 0x0000000109f60600 (most recent call first):
  File "/Users/Jemma/firedrake/src/firedrake/firedrake/variational_solver.py", line 324 in solve
  File "/Users/Jemma/firedrake/src/firedrake/firedrake/adjoint_utils/variational_solver.py", line 89 in wrapper
  File "/Users/Jemma/firedrake/src/firedrake/tests/regression/test_dg_advection.py", line 51 in run_test
  File "/Users/Jemma/firedrake/src/firedrake/tests/regression/test_dg_advection.py", line 83 in test_dg_advection_cubed_sphere_parallel
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/python.py", line 159 in pytest_pyfunc_call
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/python.py", line 1627 in runtest
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/runner.py", line 174 in pytest_runtest_call
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/runner.py", line 242 in <lambda>
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/runner.py", line 341 in from_call
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/runner.py", line 241 in call_and_report
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/runner.py", line 132 in runtestprotocol
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/runner.py", line 113 in pytest_runtest_protocol
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/main.py", line 362 in pytest_runtestloop
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/main.py", line 337 in _main
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/main.py", line 283 in wrap_session
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/main.py", line 330 in pytest_cmdline_main
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/config/__init__.py", line 175 in main
  File "/Users/Jemma/firedrake/lib/python3.10/site-packages/_pytest/config/__init__.py", line 201 in console_main
  File "/Users/Jemma/firedrake/bin/pytest", line 8 in <module>

Extension modules: mpi4py.MPI, zmq.backend.cython.context, zmq.backend.cython.message, zmq.backend.cython.socket, zmq.backend.cython._device, zmq.backend.cython._poll, zmq.backend.cython._proxy_steerable, zmq.backend.cython._version, zmq.backend.cython.error, zmq.backend.cython.utils, tornado.speedups, petsc4py.PETSc, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, siphash24, pyop2.sparsity, scipy._lib._ccallback_c, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, scipy.linalg._cythonized_array_utils, scipy.linalg._solve_toeplitz, scipy.linalg._flinalg, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, scipy.special._ellip_harm_2, symengine.lib.symengine_wrapper, firedrake.cython.dmcommon, firedrake.cython.extrusion_numbering, firedrake.cython.spatialindex, firedrake.cython.patchimpl, h5py._errors, h5py.defs, h5py._objects, h5py.h5, h5py.utils, h5py.h5t, h5py.h5s, h5py.h5ac, h5py.h5p, h5py.h5r, h5py._proxy, h5py._conv, h5py.h5z, h5py.h5a, h5py.h5d, h5py.h5ds, h5py.h5g, h5py.h5i, h5py.h5o, h5py.h5f, h5py.h5fd, h5py.h5pl, h5py.h5l, h5py._selector, firedrake.cython.hdf5interface, firedrake.cython.mgimpl, vtkmodules.vtkCommonCore, vtkmodules.vtkCommonMath, vtkmodules.vtkCommonTransforms, vtkmodules.vtkCommonDataModel, matplotlib._c_internal_utils, PIL._imaging, matplotlib._path, kiwisolver._cext, matplotlib._image (total: 96)
____________________ test_poisson_analytic_linear_parallel _____________________

args = (), kwargs = {}

    def parallel_callback(*args, **kwargs):
>       subprocess.run(cmd, check=True)

../pytest-mpi/pytest_mpi.py:210: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

input = None, capture_output = False, timeout = None, check = True
popenargs = (['mpiexec', '-n', '1', '-genv', '_PYTEST_MPI_CHILD_PROCESS', '1', ...],)
kwargs = {}
process = <Popen: returncode: 1 args: ['mpiexec', '-n', '1', '-genv', '_PYTEST_MPI_CHI...>
stdout = None, stderr = None, retcode = 1

    def run(*popenargs,
            input=None, capture_output=False, timeout=None, check=False, **kwargs):
        """Run command with arguments and return a CompletedProcess instance.
    
        The returned instance will have attributes args, returncode, stdout and
        stderr. By default, stdout and stderr are not captured, and those attributes
        will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them,
        or pass capture_output=True to capture both.
    
        If check is True and the exit code was non-zero, it raises a
        CalledProcessError. The CalledProcessError object will have the return code
        in the returncode attribute, and output & stderr attributes if those streams
        were captured.
    
        If timeout is given, and the process takes too long, a TimeoutExpired
        exception will be raised.
    
        There is an optional argument "input", allowing you to
        pass bytes or a string to the subprocess's stdin.  If you use this argument
        you may not also use the Popen constructor's "stdin" argument, as
        it will be used internally.
    
        By default, all communication is in bytes, and therefore any "input" should
        be bytes, and the stdout and stderr will be bytes. If in text mode, any
        "input" should be a string, and stdout and stderr will be strings decoded
        according to locale encoding, or by "encoding" if set. Text mode is
        triggered by setting any of text, encoding, errors or universal_newlines.
    
        The other arguments are the same as for the Popen constructor.
        """
        if input is not None:
            if kwargs.get('stdin') is not None:
                raise ValueError('stdin and input arguments may not both be used.')
            kwargs['stdin'] = PIPE
    
        if capture_output:
            if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
                raise ValueError('stdout and stderr arguments may not be used '
                                 'with capture_output.')
            kwargs['stdout'] = PIPE
            kwargs['stderr'] = PIPE
    
        with Popen(*popenargs, **kwargs) as process:
            try:
                stdout, stderr = process.communicate(input, timeout=timeout)
            except TimeoutExpired as exc:
                process.kill()
                if _mswindows:
                    # Windows accumulates the output in a single blocking
                    # read() call run on child threads, with the timeout
                    # being done in a join() on those threads.  communicate()
                    # _after_ kill() is required to collect that and add it
                    # to the exception.
                    exc.stdout, exc.stderr = process.communicate()
                else:
                    # POSIX _communicate already populated the output so
                    # far into the TimeoutExpired exception.
                    process.wait()
                raise
            except:  # Including KeyboardInterrupt, communicate handled that.
                process.kill()
                # We don't call process.wait() as .__exit__ does that for us.
                raise
            retcode = process.poll()
            if check and retcode:
>               raise CalledProcessError(retcode, process.args,
                                         output=stdout, stderr=stderr)
E               subprocess.CalledProcessError: Command '['mpiexec', '-n', '1', '-genv', '_PYTEST_MPI_CHILD_PROCESS', '1', '/Users/Jemma/firedrake/bin/pytest', '--runxfail', '-s', '-q', '/Users/Jemma/firedrake/src/firedrake/tests/regression/test_poisson_strong_bcs.py::test_poisson_analytic_linear_parallel', ':', '-n', '1', '/Users/Jemma/firedrake/bin/pytest', '--runxfail', '-s', '-q', '/Users/Jemma/firedrake/src/firedrake/tests/regression/test_poisson_strong_bcs.py::test_poisson_analytic_linear_parallel', '--tb=no', '--no-summary', '--no-header', '--disable-warnings', '--show-capture=no']' returned non-zero exit status 1.

/usr/local/Cellar/[email protected]/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/subprocess.py:526: CalledProcessError
----------------------------- Captured stdout call -----------------------------
[1] error: 12.124355652982139
F[0] error: 12.124355652982139
F
=================================== FAILURES ===================================
____________________ test_poisson_analytic_linear_parallel _____________________

    @pytest.mark.parallel(nprocs=2)
    def test_poisson_analytic_linear_parallel():
        from mpi4py import MPI
        error = run_test_linear(1, 1)
        print('[%d]' % MPI.COMM_WORLD.rank, 'error:', error)
>       assert error < 5e-6
E       assert 12.124355652982139 < 5e-06

tests/regression/test_poisson_strong_bcs.py:92: AssertionError
=============================== warnings summary ===============================
../fiat/FIAT/__init__.py:5
  /Users/Jemma/firedrake/src/fiat/FIAT/__init__.py:5: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
    import pkg_resources

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/regression/test_poisson_strong_bcs.py::test_poisson_analytic_linear_parallel
1 failed, 1 warning in 1.33s

1 failed, 1 warning in 1.33s
WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
There are 3 unused database options. They are:
Option left: name:--runxfail (no value) source: command line
Option left: name:-q value: /Users/Jemma/firedrake/src/firedrake/tests/regression/test_poisson_strong_bcs.py::test_poisson_analytic_linear_parallel source: command line
Option left: name:-s (no value) source: command line
----------------------------- Captured stderr call -----------------------------
firedrake:WARNING OMP_NUM_THREADS is not set or is set to a value greater than 1, we suggest setting OMP_NUM_THREADS=1 to improve performance
=============================== warnings summary ===============================
../fiat/FIAT/__init__.py:5
  /Users/Jemma/firedrake/src/fiat/FIAT/__init__.py:5: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
    import pkg_resources

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/regression/test_dg_advection.py::test_dg_advection_icosahedral_sphere_parallel - subprocess.CalledProcessError: Command '['mpiexec', '-n', '1', '-genv', '_P...
FAILED tests/regression/test_dg_advection.py::test_dg_advection_cubed_sphere_parallel - subprocess.CalledProcessError: Command '['mpiexec', '-n', '1', '-genv', '_P...
FAILED tests/regression/test_poisson_strong_bcs.py::test_poisson_analytic_linear_parallel - subprocess.CalledProcessError: Command '['mpiexec', '-n', '1', '-genv', '_P...
= 3 failed, 24 passed, 2 skipped, 3569 deselected, 1 warning in 124.43s (0:02:04) =
WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
There is one unused database option. It is:
Option left: name:-k value: poisson_strong or stokes_mini or dg_advection source: command line

Please also include either firedrake-install.log, found in the directory where firedrake-install was run, or firedrake-update.log, found in the virtualenv directory.
firedrake-update.log

Additionally please include the PETSc configure log located in
$VIRTUAL_ENV/src/petsc/configure.log.
configure.log

Environment:

  • OS: macOS 12.7.6
  • Python version: 3.10.8
  • Any relevant environment variables or modifications: None

Additional context
I ran pip uninstall -y h5py and pip uninstall -y mpi4py after a first unsuccessful update, and firedrake-clean after the second (more successful) update, as suggested. After the second update I still have some failing tests. I tried the suggestion from Issue #3793, but it did not work for me.

@connorjward
Contributor

Unfortunately these are known issues with parallel MUMPS on Mac. This issue is the same as #3793. @Ig-dolci knows more.

In general it should still be possible to run your code even with these failing tests:

  • If you are not running in parallel then there should be no issue.
  • If you need a parallel direct solver (which is what MUMPS is) then we recommend using SuperLU_Dist instead, which is also bundled with Firedrake (see the sketch below).
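
A minimal sketch of what this looks like, assuming a generic Poisson-style problem (the setup below is hypothetical, not Firedrake's own test code; the solver options are the standard PETSc ones for selecting SuperLU_Dist):

    from firedrake import *

    mesh = UnitSquareMesh(8, 8)
    V = FunctionSpace(mesh, "CG", 1)
    u, v = TrialFunction(V), TestFunction(V)
    a = inner(grad(u), grad(v)) * dx
    L = Constant(1.0) * v * dx
    bc = DirichletBC(V, 0, "on_boundary")
    uh = Function(V)
    # Direct solve, with SuperLU_Dist as the parallel factorisation
    # backend instead of MUMPS
    solve(a == L, uh, bcs=bc,
          solver_parameters={"ksp_type": "preonly",
                             "pc_type": "lu",
                             "pc_factor_mat_solver_type": "superlu_dist"})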

This issue should probably be closed as a duplicate of #3793.

@Ig-dolci
Copy link
Contributor

Ig-dolci commented Oct 16, 2024

This failure generally happens only with pytest executions. Can you please run this test:

  1. Open the file firedrake/src/firedrake/tests/regression/test_poisson_strong_bcs.py.

  2. At the end of the file, add the following lines:

    if __name__ == '__main__':
        test_poisson_analytic_linear_parallel()
  3. Ensure that you have the Firedrake virtual environment activated, then run the following command:

    mpiexec -np 2 python test_poisson_strong_bcs.py
  4. After running the test, verify that the output errors are similar to the following:

    [1] error: 8.809531411201023e-16
    [0] error: 8.809531411201023e-16
    

If you obtain errors of this size, it indicates that parallel Python executions are working correctly on your machine. Again, this issue is related to the Firedrake pytest setup and does not affect your own Python code.

@nhartney Please confirm if you are obtaining such results.

@Ig-dolci
Contributor

Unfortunately these are known issues with parallel MUMPS on Mac. This issue is the same as #3793. @Ig-dolci knows more.

In general it should still be possible to run your code even with these failing tests:

  • If you are not running in parallel then there should be no issue.
  • If you need a parallel direct solver (which is what MUMPS is) then we recommend using SuperLU_Dist instead, which is also bundled with Firedrake.

This issue should probably be closed as a duplicate of #3793.

Yes. This issue is a duplicate of #3793.

@connorjward
Contributor

This failure generally happens only with pytest executions. Can you please run this test:

  1. Open the file firedrake/src/firedrake/tests/regression/test_poisson_strong_bcs.py.
  2. At the end of the file, add the following lines:
    if __name__ == '__main__':
        test_poisson_analytic_linear_parallel()
  3. Ensure that you have the Firedrake virtual environment activated, then run the following command:
    mpiexec -np 2 python test_poisson_strong_bcs.py
  4. After running the test, verify that the output errors are similar to the following:
    [1] error: 8.809531411201023e-16
    [0] error: 8.809531411201023e-16
    

If you obtain errors of this size, it indicates that parallel Python executions are working correctly on your machine. Again, this issue is related to the Firedrake pytest setup and does not affect your own Python code.

@nhartney Please confirm if you are obtaining such results.

Interesting. I didn't realise it was exclusively an issue with pytest (most likely pytest-mpi). In that case I think @JDBetteridge's work moving us to a different means of invoking parallel pytest will at least stop users from encountering this issue in the majority of cases.

My guess is that with the current parallel testing approach we initialise MPI twice and perhaps MUMPS is not set up for this.

@JDBetteridge
Member

JDBetteridge commented Oct 16, 2024

This is not a duplicate of issue 3793. The final part of Nell's message says that she tried the suggestions and it didn't work. @nhartney could you share exactly what you tried? I know you sent me the output:

[1] error: 12.124355652982139
[0] error: 12.124355652982139

But could you share the exact command you executed (which I assume is the method used in the aforementioned issue, something like mpiexec -np 2 python test_poisson_strong_bcs.py) please?

There aren't many Intel Macs kicking around any more so reproducing this issue might be quite difficult!

@Ig-dolci
Contributor

You are right, @JDBetteridge. My suggestion for trying to fix these Python errors is also to update PETSc with firedrake-update --rebuild.

@JDBetteridge
Member

Good idea, but firedrake-update --rebuild is what led to this error in the first place. Nell has just done a fresh update precisely because the PETSc on the Mac was out of date. We should check the configure.log for any issues there before rebuilding.

@nhartney
Author

This failure generally happens only with pytest executions. Can you please run this test:

  1. Open the file firedrake/src/firedrake/tests/regression/test_poisson_strong_bcs.py.
  2. At the end of the file, add the following lines:
    if __name__ == '__main__':
        test_poisson_analytic_linear_parallel()
  3. Ensure that you have the Firedrake virtual environment activated, then run the following command:
    mpiexec -np 2 python test_poisson_strong_bcs.py
  4. After running the test, verify that the output errors are similar to the following:
    [1] error: 8.809531411201023e-16
    [0] error: 8.809531411201023e-16
    

If you obtain errors of this size, it indicates that parallel Python executions are working correctly on your machine. Again, this issue is related to the Firedrake pytest setup and does not affect your own Python code.

@nhartney Please confirm if you are obtaining such results.

After trying this test, the output errors are not as given in step 4. Instead, the output I get is:

[1] error: 12.124355652982139
Traceback (most recent call last):
  File "/Users/Jemma/firedrake/src/firedrake/tests/regression/test_poisson_strong_bcs.py", line 95, in <module>
    test_poisson_analytic_linear_parallel()
  File "/Users/Jemma/firedrake/src/firedrake/tests/regression/test_poisson_strong_bcs.py", line 92, in test_poisson_analytic_linear_parallel
    assert error < 5e-6
AssertionError
application called MPI_Abort(PYOP2_COMM_WORLD, 1) - process 1

@Ig-dolci
Contributor

It seems that even after updating Firedrake/PETSc, you are still encountering issues with parallel executions. To sort this out, try setting solver_parameters={"pc_factor_mat_solver_type": "superlu_dist"} for the solver, as in the sketch below. This should allow the parallel executions to work correctly.
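
For example, with the variational solver interface this would look something like the following (a sketch only; the mesh and forms are hypothetical, not the test's actual code):

    from firedrake import *

    mesh = UnitSquareMesh(16, 16)
    V = FunctionSpace(mesh, "CG", 1)
    u, v = TrialFunction(V), TestFunction(V)
    a = inner(grad(u), grad(v)) * dx
    rhs = Constant(1.0) * v * dx
    bc = DirichletBC(V, 0, "on_boundary")
    uh = Function(V)
    problem = LinearVariationalProblem(a, rhs, uh, bcs=bc)
    # Select SuperLU_Dist as the factorisation package for the LU solve
    solver = LinearVariationalSolver(
        problem,
        solver_parameters={"ksp_type": "preonly",
                           "pc_type": "lu",
                           "pc_factor_mat_solver_type": "superlu_dist"})
    solver.solve()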

@nhartney
Author

This worked - thank you very much!

@connorjward
Contributor

To summarise, it seems like the current MacOS + MUMPS situation is as follows:

  1. On Intel Macs, MUMPS is completely broken in parallel.
  2. On M1/M2/M3 Macs, MUMPS only fails when running pytest in parallel (and probably only when pytest is run on the "outside").

In all cases the solution is to use superlu_dist instead.

For (1), as Intel Macs are phased out I think we are less and less likely to have users reporting this issue and so we probably don't need to do anything.

For (2) I also think that users will stop hitting this because:

  • @JDBetteridge is helpfully providing a make smoke_test (or similar) command that we can use instead of telling users to run pytest directly to test their installations. This will run pytest on the "inside" and so hopefully will pass fine.
  • I think that there are probably very few developers using pytest-mpi to write parallel tests. pytest on the "outside" is merely a convenience and pytest on the "inside" is an acceptable thing to do instead.
