mw corrupts utf8mb4 multi-byte characters when dumping MySQL databases #818

Open
tehplague opened this issue Sep 24, 2024 · 0 comments

Comments

@tehplague

Describe the bug
Using mw database mysql dump to dump a database that contains 4-byte UTF-8 characters (e.g. emojis stored as utf8mb4) produces an SQL dump file in which those characters are lost. The file contains question marks (?) in place of the actual glyphs.

Note that these are literal ASCII question marks, not Unicode replacement characters (�), which you might expect when dealing with byte sequences that are not valid UTF-8.
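
The difference between the two symbols may help narrow down where the characters are lost. As a minimal Python sketch (an illustration only, not the code path that mw or MySQL actually uses): encoding text into a character set that cannot represent a character typically produces an ASCII ?, while decoding invalid bytes produces U+FFFD. The ASCII codec below is an assumption that merely stands in for any character set lacking the emoji.

```python
# Illustration only: the "ascii" codec stands in for any target character
# set that cannot represent the emoji. It is not what mw or MySQL uses.
emoji = "\U0001F60A"  # the emoji from this report

# Encoding: an unrepresentable character is replaced by an ASCII "?".
encoded = emoji.encode("ascii", errors="replace")

# Decoding: a byte that is not valid UTF-8 is replaced by U+FFFD.
decoded = b"Test: \xff".decode("utf-8", errors="replace")

print(encoded)        # b'?'
print(repr(decoded))  # 'Test: \ufffd'
```

Seen this way, the ? in the dump looks more like the result of a character set conversion than of a failed UTF-8 decode, although the report itself does not confirm the cause.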

To Reproduce
Steps to reproduce the behavior:

  1. Create a MySQL database and run the following SQL:
SET NAMES utf8mb4;
CREATE TABLE `test` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `test` varchar(255) COLLATE utf8mb4_general_ci NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
INSERT INTO `test` (`test`) VALUES (x'44696573206973742065696E20546573743A20F09F988A');

You may omit the SET NAMES utf8mb4; statement if you can otherwise ensure that your connection uses the utf8mb4 character set.
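
For readers who want to check what the hex literal in the INSERT statement contains, the following Python sketch decodes it. It only uses the literal from the SQL above; the variable names are arbitrary.

```python
# Hex literal copied from the INSERT statement in step 1.
hex_literal = "44696573206973742065696E20546573743A20F09F988A"

raw = bytes.fromhex(hex_literal)
text = raw.decode("utf-8")
last = text[-1]  # the final character is the emoji

print(text)                    # Dies ist ein Test: 😊
print(f"U+{ord(last):04X}")    # U+1F60A
print(len(last.encode("utf-8")))  # 4 (bytes needed in UTF-8)
```

The last character needs four bytes in UTF-8, which is why the column has to use utf8mb4.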

  2. Confirm that the test table has the expected contents:
select * from test;

Output should be something like this:

+----+-------------------------+
| id | test                    |
+----+-------------------------+
|  1 | Dies ist ein Test: 😊     |
+----+-------------------------+

Note that the MySQL client is unsure about the number of printable characters represented by the emoji, so the row overflows the ASCII art table. This is unrelated to the problem described above.

  3. Export the database with mw:
$ mw database mysql dump <database-id> -o export.sql
  4. The database dump in export.sql will contain the following:
INSERT INTO `test` VALUES (1,'Dies ist ein Test: ?');
  5. A database dump performed on the server with mysqldump still contains the correct characters.

Expected behavior
The database dump in export.sql should contain the following instead:

INSERT INTO `test` VALUES (1,'Dies ist ein Test: 😊');
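
To check a dump for this problem without inspecting it by eye, a small helper can list every character outside the Basic Multilingual Plane, i.e. every character that needs four bytes in UTF-8. This is a hedged sketch: the function name supplementary_chars is hypothetical, and the two strings are the actual and expected INSERT lines quoted in this report.

```python
def supplementary_chars(text: str) -> list[str]:
    """Return all characters above U+FFFF (4 bytes in UTF-8)."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


# Lines quoted in this report: the mw export and the expected result.
actual = "INSERT INTO `test` VALUES (1,'Dies ist ein Test: ?');"
expected = "INSERT INTO `test` VALUES (1,'Dies ist ein Test: \U0001F60A');"

print(supplementary_chars(actual))         # []
print(len(supplementary_chars(expected)))  # 1
```

To run it against a real file, read the dump with open("export.sql", encoding="utf-8") and pass the contents to the helper. An empty result only indicates a problem if the source database is known to contain such characters.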

Console logs
No errors were reported. The command output looked entirely normal.

Environment (please complete the following information):

  • OS: Arch Linux
  • Shell: fish
  • Terminal: WezTerm
  • Version (output of mw --version): @mittwald/cli/1.2.1 linux-x64 node-v20.17.0

Additional context
N/A

@tehplague added the bug ("Something isn't working") label on Sep 24, 2024
@martin-helmich removed the bug ("Something isn't working") label on Oct 2, 2024