test_vindex can sometimes fail with UnicodeEncodeError #2732

Open
QuLogic opened this issue Jan 19, 2025 · 0 comments
Labels
bug Potential issues with the zarr-python library

Contributor

QuLogic commented Jan 19, 2025

Zarr version

3.0.1

Numcodecs version

0.14.0

Python Version

3.13.1

Operating System

Fedora Rawhide

Installation

source

Description

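It appears that this test uses Hypothesis, but its input strategy isn't constrained enough to exclude invalid data. In the failure below, the generated data contains the lone surrogate '\ud800', which cannot be encoded as UTF-8.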
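For illustration, the underlying encoding failure can be shown with the standard library alone, without Zarr or Hypothesis. This is a minimal sketch; the helper names `is_utf8_encodable` and `surrogate_positions` are hypothetical and not part of any library mentioned in this report.

```python
def is_utf8_encodable(text: str) -> bool:
    """Return True if `text` can be encoded as strict UTF-8."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        # Raised for surrogate code points such as '\ud800'.
        return False
    return True


def surrogate_positions(text: str) -> list[int]:
    """Return indices of surrogate code points (U+D800..U+DFFF) in `text`."""
    return [i for i, ch in enumerate(text) if 0xD800 <= ord(ch) <= 0xDFFF]


print(is_utf8_encodable("zarr"))      # True
print(is_utf8_encodable("\ud800"))    # False
print(surrogate_positions("a\ud800b"))  # [1]
```

Any string for which `surrogate_positions` returns a non-empty list would presumably trigger the same error when written through a UTF-8 based string codec.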

Steps to reproduce

pytest -k test_vindex

Additional output

_________________________________ test_vindex __________________________________

    @given(data=st.data())
>   def test_vindex(data: st.DataObject) -> None:

tests/test_properties.py:36: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tests/test_properties.py:38: in test_vindex
    zarray = data.draw(arrays(shapes=npst.array_shapes(max_dims=4, min_side=1)))
/usr/lib/python3.13/site-packages/hypothesis/strategies/_internal/core.py:2151: in draw
    result = self.conjecture_data.draw(strategy, observe_as=f"generate:{desc}")
/usr/lib/python3.13/site-packages/hypothesis/internal/conjecture/data.py:2509: in draw
    v = strategy.do_draw(self)
/usr/lib/python3.13/site-packages/hypothesis/strategies/_internal/lazy.py:167: in do_draw
    return data.draw(self.wrapped_strategy)
/usr/lib/python3.13/site-packages/hypothesis/internal/conjecture/data.py:2503: in draw
    return strategy.do_draw(self)
/usr/lib/python3.13/site-packages/hypothesis/strategies/_internal/core.py:1775: in do_draw
    return self.definition(data.draw, *self.args, **self.kwargs)
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/testing/strategies.py:167: in arrays
    a[:] = nparray
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/core/array.py:2471: in __setitem__
    self.set_orthogonal_selection(pure_selection, value, fields=fields)
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/_compat.py:43: in inner_f
    return f(*args, **kwargs)
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/core/array.py:2927: in set_orthogonal_selection
    return sync(
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/core/sync.py:142: in sync
    raise return_result
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/core/sync.py:98: in _runner
    return await coro
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/core/array.py:1361: in _set_selection
    await self.codec_pipeline.write(
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/core/codec_pipeline.py:468: in write
    await concurrent_map(
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/core/common.py:68: in concurrent_map
    return await asyncio.gather(*[asyncio.ensure_future(run(item)) for item in items])
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/core/common.py:66: in run
    return await func(*item)
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/core/codec_pipeline.py:403: in write_batch
    chunk_bytes_batch = await self.encode_batch(
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/core/codec_pipeline.py:210: in encode_batch
    chunk_bytes_batch = await self.array_bytes_codec.encode(
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/abc/codec.py:152: in encode
    return await _batching_helper(self._encode_single, chunks_and_specs)
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/abc/codec.py:407: in _batching_helper
    return await concurrent_map(
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/core/common.py:68: in concurrent_map
    return await asyncio.gather(*[asyncio.ensure_future(run(item)) for item in items])
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/core/common.py:66: in run
    return await func(*item)
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/abc/codec.py:420: in wrap
    return await func(chunk, chunk_spec)
../BUILDROOT/usr/lib/python3.13/site-packages/zarr/codecs/_v2.py:88: in _encode_single
    chunk = await asyncio.to_thread(f.encode, chunk)
/usr/lib64/python3.13/asyncio/threads.py:25: in to_thread
    return await loop.run_in_executor(None, func_call)
/usr/lib64/python3.13/concurrent/futures/thread.py:59: in run
    result = self.fn(*self.args, **self.kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   UnicodeEncodeError: 'utf-8' codec can't encode character '\ud800' in position 0: surrogates not allowed
E   while generating 'Draw 1' from arrays(shapes=array_shapes(max_dims=4))
E   Falsifying example: test_vindex(
E       data=data(...),
E   )
E   Explanation:
E       These lines were always and only run by failing examples:
E           /builddir/build/BUILD/python-zarr-3.0.1-build/BUILDROOT/usr/lib/python3.13/site-packages/zarr/core/sync.py:142
E           /builddir/build/BUILD/python-zarr-3.0.1-build/BUILDROOT/usr/lib/python3.13/site-packages/zarr/core/sync.py:99
E           /usr/lib64/python3.13/asyncio/futures.py:356
E           /usr/lib64/python3.13/asyncio/tasks.py:841
E           /usr/lib64/python3.13/asyncio/tasks.py:842
E           (and 5 more with settings.verbosity >= verbose)

numcodecs/vlen.pyx:105: UnicodeEncodeError
@QuLogic QuLogic added the bug Potential issues with the zarr-python library label Jan 19, 2025