Expected PGCOPY signature of 11 bytes at beginning of stream but found -1 bytes of input #2276

Open
mcrumiller opened this issue Oct 24, 2024 · 5 comments
Labels
Type: bug Something isn't working


@mcrumiller

mcrumiller commented Oct 24, 2024

What happened?

This is a semi-duplicate of #2133. That issue was closed by a PR that apparently improves the error message, but it seems as though it doesn't resolve the issue itself in that instance.

In my case, I'd like to diagnose what's causing the error (if possible), since I'd like to actually get this working. If there's a workaround on my end, I'd like to pursue it before waiting for a new version of ADBC. I'm not sure where the ADBC log files are stored, and I can't find where to look in the documentation.

When I run my query (via polars), I get:

IO: [libpq] ReadHeader failed: Expected PGCOPY signature of 11 bytes at beginning of stream but found -1 bytes of input

If I add a LIMIT 10000 the query succeeds, so the issue is either in a later record in the data or something else that I can't think of. I expect 1,219,228 total records. Can someone help me diagnose the issue?

Stack Trace

  File "C:\Projects\project-cqn\.venv_cqn\Lib\site-packages\adbc_driver_manager\_reader.pyx", line 89, in adbc_driver_manager._reader.AdbcRecordBatchReader.read_all
    return self._reader.read_all()
  File "C:\Projects\project-cqn\.venv_cqn\Lib\site-packages\pyarrow\ipc.pxi", line 762, in pyarrow.lib.RecordBatchReader.read_all
    check_status(self.reader.get().ToTable().Value(&table))
  File "C:\Projects\project-cqn\.venv_cqn\Lib\site-packages\pyarrow\error.pxi", line 92, in pyarrow.lib.check_status
    raise convert_status(status)
OSError: [libpq] ReadHeader failed: Expected PGCOPY signature of 11 bytes at beginning of stream but found -1 bytes of input

During handling of the above exception, another exception occurred:

  File "C:\Projects\project-cqn\.venv_cqn\Lib\site-packages\adbc_driver_manager\_reader.pyx", line 41, in adbc_driver_manager._reader._AdbcErrorHelper.check_error
    raise exc from None
  File "C:\Projects\project-cqn\.venv_cqn\Lib\site-packages\adbc_driver_manager\_reader.pyx", line 91, in adbc_driver_manager._reader.AdbcRecordBatchReader.read_all
    self._helper.check_error(e)
  File "C:\Projects\project-cqn\.venv_cqn\Lib\site-packages\adbc_driver_manager\_lib.pyx", line 1590, in adbc_driver_manager._lib._blocking_call
    return func(*args, **kwargs)
  File "C:\Projects\project-cqn\.venv_cqn\Lib\site-packages\adbc_driver_manager\dbapi.py", line 1197, in fetch_arrow_table
    return _blocking_call(self._reader.read_all, (), {}, self._stmt.cancel)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Projects\project-cqn\.venv_cqn\Lib\site-packages\adbc_driver_manager\dbapi.py", line 1088, in fetch_arrow_table
    return self._results.fetch_arrow_table()

How can we reproduce the bug?

It's a fairly complex query. I could perhaps work to reproduce but since it works on 10k records, it may be difficult to make a repro.

Environment/Setup

greenplum/postgres PostgreSQL 9.4.26
(Greenplum Database 6.24.3 build commit:25d3498a400ca5230e81abb94861f23389315213)
on x86_64-unknown-linux-gnu,
compiled by gcc (GCC) 6.4.0,
64-bit compiled on May  3 2023 20:34:57
@mcrumiller mcrumiller added the Type: bug Something isn't working label Oct 24, 2024
@paleolimbot
Member

Does this still happen with the nightly Python builds? Since the last release we did a heavy refactor of some of those internals to make those types of errors report better/go away:

pip install \
      --pre \
      --index-url https://repo.fury.io/arrow-adbc-nightlies \
      adbc-driver-postgresql

@mcrumiller
Author

Sorry, I'm unable to build the nightlies at work. Unsure if you want me to re-open after the next release if I still see the issue, or if you want to just leave this one in limbo until then.

@mcrumiller
Author

Or actually, is there a pip release for the nightlies?

@paleolimbot
Member

I think there are wheels! For me it resolves to https://repo.fury.io/arrow-adbc-nightlies/-/ver_1QpKJK/adbc_driver_postgresql-1.3.0-py3-none-macosx_11_0_arm64.whl but I have no idea if that will work for you. Installing dev versions is not great right now, sorry!

If you're up for another approach, figuring out which row/column causes the issue by manipulating the LIMIT clause might help pinpoint the issue. The error that was obscured by the last report of this was one where a computation failed on a specific value.
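A minimal sketch of that LIMIT-based search, in case it helps. It assumes the query has a deterministic ORDER BY (so that LIMIT n always returns the same first n rows) and that once a LIMIT fails, every larger LIMIT fails too. The names `smallest_failing_limit` and `query_succeeds` are hypothetical and not part of ADBC or polars; `query_succeeds(n)` stands for whatever runs the query with `LIMIT n` and reports whether it completed.

```python
from typing import Callable, Optional


def smallest_failing_limit(
    query_succeeds: Callable[[int], bool], total_rows: int
) -> Optional[int]:
    """Binary-search for the smallest LIMIT at which the query fails.

    query_succeeds(n) is a hypothetical callback: it should run the query
    with "LIMIT n" and return True on success, False on error.
    Returns None if the query succeeds even with LIMIT total_rows.
    """
    if query_succeeds(total_rows):
        return None

    # Invariant: LIMIT lo succeeds (LIMIT 0 is assumed to), LIMIT hi fails.
    lo, hi = 0, total_rows
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if query_succeeds(mid):
            lo = mid
        else:
            hi = mid
    # Row number hi (1-based, in ORDER BY order) is the first one that breaks.
    return hi
```

With roughly 1.2 million rows this takes about 21 probes. In practice `query_succeeds` could wrap the cursor's `execute` and `fetch_arrow_table` calls in a `try`/`except` and return `False` when the error is raised. Once the row is found, selecting one column at a time with `OFFSET hi - 1 LIMIT 1` may narrow it down to a column.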

@mcrumiller
Author

Is that some anonymous third-party website providing builds? I'd rather not go through that route. I'll do more investigating on my side. Thank you though!
