Encoding problem with column names using Amazon Athena ODBC 2.x driver #10
Comments
…ement >> describeCols:` In some cases (e.g. with the Athena ODBC 2.x driver) it may be necessary to reallocate the column name buffer to a larger size. Resolves: pharo-rdbms#10
I identified the problem. In fact, it has nothing to do with the encoding system, but rather with the external call to the ODBC API function SQLDescribeCol. Checking the output parameters after the …
The signature of the SQLDescribeCol function is:
SQLRETURN SQLDescribeCol(
    SQLHSTMT StatementHandle,
    SQLUSMALLINT ColumnNumber,
    SQLCHAR * ColumnName,
    SQLSMALLINT BufferLength,
    SQLSMALLINT * NameLengthPtr,
    SQLSMALLINT * DataTypePtr,
    SQLULEN * ColumnSizePtr,
    SQLSMALLINT * DecimalDigitsPtr,
    SQLSMALLINT * NullablePtr);
The description of the …
I forked the project in my personal account and changed … If you think this could be a permanent solution, please let me know. I can submit a pull request for your evaluation.
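The buffer reallocation mentioned in the commit message can be illustrated with a small C sketch. It is not the code from the fork or from the Pharo library. `fake_describe_col` and `column_name` are hypothetical names, and `fake_describe_col` is only a stand-in that mimics the truncation rule documented for SQLDescribeCol (the reported name length excludes the terminator, and the name is truncated when that length is greater than or equal to the buffer length), so the sketch compiles without an ODBC driver. It uses narrow characters for brevity, whereas the Windows setup described in this issue uses UTF-16.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for SQLDescribeCol, reduced to the three
   name-related parameters. It is NOT the ODBC API: it only mimics the
   documented truncation behaviour so the retry logic can be exercised. */
static int fake_describe_col(const char *real_name, char *name_buf,
                             short buf_len, short *name_len)
{
    size_t full = strlen(real_name);
    size_t n;

    *name_len = (short)full;          /* full length, terminator excluded */
    if (buf_len <= 0) {
        return 1;
    }
    /* Copy at most buf_len - 1 characters and always terminate. */
    n = full < (size_t)(buf_len - 1) ? full : (size_t)(buf_len - 1);
    memcpy(name_buf, real_name, n);
    name_buf[n] = '\0';
    /* 1 plays the role of SQL_SUCCESS_WITH_INFO (name was truncated). */
    return full >= (size_t)buf_len ? 1 : 0;
}

/* Describe the column once; if the reported length does not fit in the
   buffer, grow the buffer and describe again. The caller frees the result. */
static char *column_name(const char *real_name, short initial_len)
{
    short buf_len = initial_len > 0 ? initial_len : 1;
    short name_len = 0;
    char *buf = malloc((size_t)buf_len);

    if (buf == NULL) {
        return NULL;
    }
    fake_describe_col(real_name, buf, buf_len, &name_len);
    if (name_len >= buf_len) {
        /* Room for the whole name plus its terminator. */
        char *bigger;
        buf_len = (short)(name_len + 1);
        bigger = realloc(buf, (size_t)buf_len);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        fake_describe_col(real_name, buf, buf_len, &name_len);
    }
    return buf;
}
```

With an initial buffer of 8 characters, `column_name("a_rather_long_column_name", 8)` gets a truncated name on the first call and the complete name after the retry. With a real driver the same check would compare the value written to NameLengthPtr against BufferLength.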
It would be good to provide a PR (ideally with a test case) so that users of the original project can benefit from your fix too.
Great! I will prepare that.
I'm trying to run a query on a database using the Amazon Athena ODBC 2.x driver (https://docs.aws.amazon.com/athena/latest/ug/odbc-v2-driver.html). I can connect to the Athena server (via DSN) and execute queries. The problem is that the returned column names appear to be in the wrong encoding. The contents of the columns themselves are correct; only the column names are affected.
The Athena driver requires forwardOnly cursors, so I used the query:forwardOnly: message from the connection object to execute the queries. I debugged the execution of this message and identified the following:
As I am using Windows, the ODBCConnection object is always instantiated with utf-16 as the string encoding system. This is done in ODBCConnection class >> determineStringEncoder, which always returns an instance of ODBCUTF16Encoder.
When executing the query, the problem occurs in the method ODBCAbstractStatement >> describeCols:
This is the part of the code where the problem occurs:
I can get the proper column names using other ODBC libraries (pyodbc for Python, for example).
I tried to understand how the ODBCUTF16Encoder >> decodeStringFrom:characterCount: method works, but it is very complicated, and I still haven't been able to understand what might be happening. I would appreciate it if you have any tips that could help me.
This is a screenshot of my playground inspecting an ODBCRow object returned by the query where the problem can be seen more clearly: