Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LOAD DATA executions not returning affected rows value #283

Open
NoProblemJack opened this issue Oct 9, 2022 · 5 comments
Open

LOAD DATA executions not returning affected rows value #283

NoProblemJack opened this issue Oct 9, 2022 · 5 comments
Labels

Comments

@NoProblemJack
Copy link

Hi there,

I believe there is a bug in the returned 'affected rows' value that results after executing a LOAD DATA command. I am running on windows 10, and connecting to a cloud based MySQL database 5.7

Hopefully the following reprex is satisfactory. It uses a temporary table for reproducability, but the result is the same with permanent tables.

db_temp_table_name <- "temp_table123"
temp_file_name <- tempfile("temp123.csv")
temp_file_name <- stringr::str_replace_all(temp_file_name, "\\\\", "/")

local_temp_table <- data.frame(id = 1:10)
write.csv(local_temp_table, temp_file_name)

sql_create_table <-
  paste(
    "CREATE TEMPORARY TABLE ", db_temp_table_name, "
    (id INT)
   ENGINE=InnoDB"
  )

sql_load_data <-
  paste0(
    "LOAD DATA LOCAL INFILE '", temp_file_name, "'
    INTO TABLE ", db_temp_table_name, "
    FIELDS TERMINATED BY ',' ENCLOSED BY '\"' ESCAPED BY '\\\\'
    LINES TERMINATED BY '\\r\\n'
    IGNORE 1 LINES
    (`id`)"
  )

mariadb_con <-
  RMariaDB::dbConnect(
    RMariaDB::MariaDB(),
    host = ...,
    user = ...,
    password = ...,
    port = ...,
    dbname = ...,
    bigint = "numeric"
  )

RMariaDB::dbExecute(conn = mariadb_con, statement = sql_create_table)
RMariaDB::dbGetQuery(conn = mariadb_con, "select * from temp_table123")
affected_records <- RMariaDB::dbExecute(conn = mariadb_con, statement = sql_load_data)
affected_records ## has value 0, but should have value 10
RMariaDB::dbGetQuery(conn = mariadb_con, "select * from temp_table123") ## confirm that the insert did infact occur.
RMariaDB::dbDisconnect(mariadb_con)

For comparison reasons, it is worth noting that when using RMySQL the correct 'affected rows' value is returned, so the value is presumably being returned by the database. I am working on a project where confirmation that the correct number of inserts has occurred is necessary, and so this issue is preventing me from upgrading from RMySQL to RMariaDB.

Many thanks in advance. Appreciate the work you do.

@krlmlr
Copy link
Member

krlmlr commented Oct 10, 2022

Thanks, confirmed that the return value is 0, but I don't see records in the table:

db_temp_table_name <- "temp_table123"
temp_file_name <- tempfile("temp123.csv")
temp_file_name <- stringr::str_replace_all(temp_file_name, "\\\\", "/")

local_temp_table <- data.frame(id = 1:10)
write.csv(local_temp_table, temp_file_name)

sql_create_table <-
  paste(
    "CREATE TEMPORARY TABLE ", db_temp_table_name, "
    (id INT)
   ENGINE=InnoDB"
  )

sql_load_data <-
  paste0(
    "LOAD DATA LOCAL INFILE '", temp_file_name, "'
    INTO TABLE ", db_temp_table_name, "
    FIELDS TERMINATED BY ',' ENCLOSED BY '\"' ESCAPED BY '\\\\'
    LINES TERMINATED BY '\\r\\n'
    IGNORE 1 LINES
    (`id`)"
  )

mariadb_con <-
  RMariaDB::dbConnect(
    RMariaDB::MariaDB(),
    dbname = "test",
    bigint = "numeric"
  )

RMariaDB::dbExecute(conn = mariadb_con, statement = sql_create_table)
#> [1] 0
RMariaDB::dbGetQuery(conn = mariadb_con, "select * from temp_table123")
#> [1] id
#> <0 rows> (or 0-length row.names)
affected_records <- RMariaDB::dbExecute(conn = mariadb_con, statement = sql_load_data)
affected_records ## has value 0, but should have value 10
#> [1] 0
RMariaDB::dbGetQuery(conn = mariadb_con, "select * from temp_table123") ## confirm that the insert did infact occur.
#> [1] id
#> <0 rows> (or 0-length row.names)
RMariaDB::dbDisconnect(mariadb_con)

Created on 2022-10-10 with reprex v2.0.2

@NoProblemJack
Copy link
Author

Hi,

ah that's a pain. I can confirm the table is inserted correctly when I run the code, so there must be some issue preventing you from inserting into the table in the first place. I'm no database expert by any means but I can think of two possible reasons why the inserts are failing for you

  1. If you are using LINUX then I think you need to use LINES TERMINATED BY '\\n' (I have even had to use this on windows from time to time) I doubt that is the issue though as I would think in this case one record would still be inserted into the table.
  2. There may be security settings with the way your database is set up preventing local files from being loaded https://dev.mysql.com/doc/refman/5.7/en/load-data.html#load-data-local

One thing I would recommend trying is to run the above reprex to create the temp file and temporary table, but then to run the LOAD DATA statement on a database client, so in that case you will be able to see the warning/error message of why the inserts are failing, and then can hopefully figure out a fix and get the above reprex to work properly.

Note though that there is a bug with MySQL Workbench 8.0 (in case you are using it) that, would you believe, causes LOAD DATA statements to fail (I've had to previously roll back to v6.3 because of this, haven't tried v8 recently though) https://bugs.mysql.com/bug.php?id=91872

If you can get the inserts to work, I am sure you will see that the return value is still 0.

Hope that helps.

@krlmlr
Copy link
Member

krlmlr commented Oct 12, 2022

Thanks. On my macOS system, I had to use LINES TERMINATED BY '\\n' . Also, the write.csv() call needed row.names = FALSE . The number of rows affected seem to come out right here:

db_temp_table_name <- "temp_table123"
temp_file_name <- tempfile("temp123.csv")
temp_file_name <- stringr::str_replace_all(temp_file_name, "\\\\", "/")

local_temp_table <- data.frame(id = 1:10)
write.csv(local_temp_table, temp_file_name, row.names = FALSE)
readLines(temp_file_name)
#>  [1] "\"id\"" "1"      "2"      "3"      "4"      "5"      "6"      "7"     
#>  [9] "8"      "9"      "10"

sql_create_table <-
  paste(
    "CREATE TEMPORARY TABLE ", db_temp_table_name, "
    (id INT)
   ENGINE=InnoDB"
  )

sql_load_data <-
  paste0(
    "LOAD DATA LOCAL INFILE '", temp_file_name, "'
    INTO TABLE ", db_temp_table_name, "
    FIELDS TERMINATED BY ',' ENCLOSED BY '\"' ESCAPED BY '\\\\'
    LINES TERMINATED BY '\\n'
    IGNORE 1 LINES
    (`id`)"
  )

mariadb_con <-
  RMariaDB::dbConnect(
    RMariaDB::MariaDB(),
    dbname = "test",
    bigint = "numeric"
  )

RMariaDB::dbExecute(conn = mariadb_con, statement = sql_create_table)
#> [1] 0
RMariaDB::dbGetQuery(conn = mariadb_con, "select * from temp_table123")
#> [1] id
#> <0 rows> (or 0-length row.names)
affected_records <- RMariaDB::dbExecute(conn = mariadb_con, statement = sql_load_data)
affected_records
#> [1] 10
RMariaDB::dbGetQuery(conn = mariadb_con, "select * from temp_table123") ## confirm that the insert did infact occur.
#>    id
#> 1   1
#> 2   2
#> 3   3
#> 4   4
#> 5   5
#> 6   6
#> 7   7
#> 8   8
#> 9   9
#> 10 10
RMariaDB::dbDisconnect(mariadb_con)

Created on 2022-10-12 with reprex v2.0.2

@NoProblemJack
Copy link
Author

gah!

Just so you don't think i'm crazy, this is what I get:

> RMariaDB::dbExecute(conn = mariadb_con, statement = sql_create_table)
[1] 0
> RMariaDB::dbGetQuery(conn = mariadb_con, "select * from temp_table123")
[1] id
<0 rows> (or 0-length row.names)
> affected_records <- RMariaDB::dbExecute(conn = mariadb_con, statement = sql_load_data)
> affected_records ## has value 0, but should have value 10
[1] 0
> RMariaDB::dbGetQuery(conn = mariadb_con, "select * from temp_table123")
   id
1   1
2   2
3   3
4   4
5   5
6   6
7   7
8   8
9   9
10 10

I have tested on MySQL 5.7.32 and MariaDB 10.2.21 and I get 0 returned in both cases, so I doubt it is a database specific issue.

The only other thing is that you are using Mac and I am using Windows 10, so could be a Windows specific issue, which probably makes it harder to diagnose and fix :/

@krlmlr
Copy link
Member

krlmlr commented Apr 1, 2024

Is this still an issue with current package versions?

@krlmlr krlmlr added the reprex label Apr 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants