Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicate key values error is not thrown as exception and is getting ignored #255

Open
flyninjacode opened this issue Jan 23, 2019 · 9 comments
Labels
enhancement protocol requires Vertica protocol knowledge

Comments

@flyninjacode
Copy link

I'm trying to capture 'Duplicate key values error ' message if there is any insert statement that is violating the primary key constraint. In the following code I created a 'demo' table with 'demo_id' as the primary_key and tried to insert the same row twice.
Example:
`connection = v.connect()
cursor = connection.cursor()
cursor.execute("""
DROP TABLE IF EXISTS demo;
CREATE TABLE demo (
demo_id integer primary key enabled,
code varchar(20),
quantity smallint,
amount money,
product_id int,
demo_date datetime);
""")
cursor.execute("""
INSERT INTO demo
SELECT 1,'RT0132',1,100,10023,now();

""")
cursor.execute("""
INSERT INTO demo
SELECT 1,'RT0132',1,100,10023,now();

""")
connection.commit()`
For the above code I would expect it to throw an Exception like "Message: b"Duplicate key values: 'demo_id=1' -- violates constraint", but it doesn't. Is there an existing solution to handle this case or can this issue be resolved by doing some quick workaround ?

I'm using Python 3.5.2 and vertica_python 0.8.2 versions.

@sitingren
Copy link
Member

@flyninjacode Thanks for reporting this problem. I don't know if there is an existing solution, but the quick workaround I think is to call cursor.fetchall() after your second INSERT execution. I'll do further investigation and let you know if there is an update.

@sitingren
Copy link
Member

Here are protocol messages that are sending(=>) and receiving(<=) during those two INSERT executions:

=> Query: First "INSERT INTO demo SELECT 1,'RT0132',1,100,10023,now();"
<= RowDescription
<= ParameterStatus
<= ParameterStatus
<= ParameterStatus
<= DataRow: the number of accepted rows = 1  NOTE: cursor.execute() returned at here
<= CommandComplete
<= ReadyForQuery

=> Query: Second "INSERT INTO demo SELECT 1,'RT0132',1,100,10023,now();"
<= RowDescription
<= ParameterStatus
<= ParameterStatus
<= ParameterStatus
<= DataRow: the number of accepted rows = 1  NOTE: cursor.execute() returned at here
<= ErrorResponse: "Duplicate key values: 'demo_id=1' -- violates constraint 'public.demo.C_PRIMARY'"
<= ReadyForQuery

The logic of cursor.execute() makes the function stop reading remaining protocol messages if it receives a DataRow message. But since the client doesn't receive a CommandComplete message, that means the server is still running the query and errors can be thrown afterward.

When you run cursor.execute(), the client will read and ignore all remaining protocol messages of the previous command cycle, even an ErrorResponse. If you run cursor.fetchall(), you can iterate through all remaining protocol messages and catch those ErrorResponse.

So your expected behavior need a change of cursor.execute() to read a few more messages after a DataRow message is received. But this would also affect the logic of cursor.fetch*(). We'll see if it is easy to fix.

@roveo
Copy link

roveo commented Oct 12, 2020

You can do this

cursor.execute(query)
cursor.fetchall()
while cursor.nextset():
    cursor.fetchall()

It will raise an exception for each error associated with a result set (tested only with user-generated error).

EDIT: tested with .fetchall() instead of .fetchone() and it raises the expected exception. Updated the code.

@skryzh
Copy link

skryzh commented Jul 4, 2024

Hi, same problem w/ Airflow VerticaOperator

Some version info:

Python 3.10.10

Airflow version: 2.5.3

package_name                     | description                      | version
=================================+==================================+========
apache-airflow-providers-vertica | Vertica https://www.vertica.com/ | 3.0.0

import vertica_python as vp
print(vp.version_info)
(1, 3, 8)

Vertica Analytic Database v12.0.4-25

@sitingren
Copy link
Member

sitingren commented Jul 8, 2024

@skryzh apache-airflow-providers-vertica package is not managed by us, please report this issue to the correct maintainer. The package should use vertica-python in following way

cursor.execute("...")
while True:  # Fetch the result set for each statement
    rows = cursor.fetchall()  # raise an exception associated with each statement!
    print(rows)
    if not cursor.nextset():
        break

@skryzh
Copy link

skryzh commented Jul 8, 2024

@sitingren yeah it is not your project, but it uses default methods.
I saw your suggestion in this thread earlier, however it looks like workaround.

Is it possible to make in future versions default indications for PK problem?

@sitingren
Copy link
Member

@skryzh We don't have plan to implement this in the near future, that'll be a complex fix. This is not a problem of only duplicate key, the error would happen in any number of messages after cursor.execute() stops reading remaining messages. In order to prevent out-of-memory problem, cursor.execute() should not read all messages into memory. So the example code above is the suggested way.

@skryzh
Copy link

skryzh commented Jul 9, 2024

@sitingren Thank you for your explaining, now I get it and will try to check apache-airflow-providers-vertica for updates or connect developers to create issue for implementing your suggestion

@skryzh
Copy link

skryzh commented Aug 13, 2024

Hi, same problem w/ Airflow VerticaOperator

Some version info:

Python 3.10.10

Airflow version: 2.5.3

package_name                     | description                      | version
=================================+==================================+========
apache-airflow-providers-vertica | Vertica https://www.vertica.com/ | 3.0.0

import vertica_python as vp
print(vp.version_info)
(1, 3, 8)

Vertica Analytic Database v12.0.4-25

For those who have the same problem with airflow:
This problem fixed in apache-airflow-providers-vertica 3.5.2: https://airflow.apache.org/docs/apache-airflow-providers-vertica/3.8.2/changelog.html#id14
apache/airflow#34041

If you by any reason can not to upgrade apache-airflow-providers-vertica to higher version, my suggestion is to temporary, I repeat: temporary - replace your operator with small fix, here is my draft:

from typing import TYPE_CHECKING
from airflow.providers.vertica.hooks.vertica import VerticaHook
if TYPE_CHECKING:
    from airflow.utils.context import Context

from airflow.providers.common.sql.hooks.sql import fetch_all_handler
from airflow.providers.vertica.operators.vertica import VerticaOperator

class VerticaOperatorTMP(VerticaOperator):
    def execute(self, context: 'Context') -> None:
        self.log.info('Executing: %s', self.sql)
        hook = VerticaHook(vertica_conn_id=self.vertica_conn_id)
        hook.run(
            sql=self.sql,
            handler=fetch_all_handler,
        )

As @sitingren explained - the problem is in fetching response, you can see how my suggestion works by trying this:

-- create table
DROP TABLE IF EXISTS dev.vp_issues_255;
CREATE TABLE dev.vp_issues_255
(
    pk_1 int,
    ldm timestamp NOT NULL DEFAULT now(),
    CONSTRAINT pk PRIMARY KEY (pk_1) ENABLED
);
from airflow.providers.vertica.hooks.vertica import VerticaHook
from airflow.providers.common.sql.hooks.sql import fetch_all_handler

h = VerticaHook('conn_id')
sql = 'INSERT INTO dev.vp_issues_255 (pk_1) VALUES (1);'

for i in range(2):
    print(f'Attempt w/ silent error {i+1}')
    h.run(sql=sql)

h.run('TRUNCATE TABLE dev.vp_issues_255;')

for i in range(2):
    print(f'Attempt w/ error after first insert {i+1}')
    h.run(sql=sql, handler=fetch_all_handler)

I hope this will help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement protocol requires Vertica protocol knowledge
Projects
None yet
Development

No branches or pull requests

4 participants