Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PS-9238: Make MySQL 5.7 compatible with CREATE TABLE AS SELECT [...] START TRANSACTION to improve 8.0 -> 5.7 replication reliability #5362

Open
wants to merge 3 commits into
base: 8.0
Choose a base branch
from

Conversation

kamil-holubicki
Copy link
Contributor

https://perconadev.atlassian.net/browse/PS-9238

Problem:
5.7 does not understand the new syntax and complains about CREATE TABLE ... START TRANSACTION received in binlog.

Cause:
Commit 4fe3f2f (8.0.21) introduced changes that allow the execution of CREATE TABLE AS SELECT as a single transaction. Before that change, CREATE TABLE was the 1st transaction, then rows were inserted, which was unsafe. To achieve this we don't commit the transaction after CREATE TABLE. The commit is done after rows are inserted. For async replica to understand this protocol, START TRANSACTION clause was added to CREATE TABLE in binlog. When the replica sees this, it knows not to commit after CREATE TABLE.

Solution:
Introduced ctas_compatibility_mode global, read-only variable, OFF by default. If switched to ON, it enforces the behavior from before the above-mentioned commit.

Implementation details:
The fix contains 2 parts:

  1. Query_result_create::binlog_show_create_table() - When SE supports atomic DDL, CREATE TABLE query was binlogged with
    direct=false / is_trans=true flags. This caused the event to be cached in trx_cache instead of stmt_cache.
    MYSQL_BIN_LOG::commit() logic skips trx_cache commit, if this is not the transaction commit (it does only stmt_cache commit). Then, when rows are inserted, it was detected that there is ongoing transaction (trx_cache not empty), so no new BEGIN statement was generated in binlog.
    Effectively CREATE TABLE and rows insertion were done in one transaction.

The fix is to revert to the default behavior, i.e., Query_result_create::binlog_show_create_table() creates the event in stmt_cache, which is committed after table creation. When rows are being inserted, it is detected that trx_cache is empty, so BEGIN statement is added to binlog.

  1. store_create_info()—When executing CTAS, the START TRANSACTION clause was added to the rewritten query to inform the replica about what was happening.

The fix is not to add the START TRANSACTION clause.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)

Comment on lines 3042 to 3043
if ((get_default_handlerton(thd, thd->lex->create_info->db_type)->flags &
HTON_SUPPORTS_ATOMIC_DDL) && !opt_ctas_compatibility_mode) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ readability-implicit-bool-conversion ⚠️
implicit conversion unsigned int -> bool

Suggested change
if ((get_default_handlerton(thd, thd->lex->create_info->db_type)->flags &
HTON_SUPPORTS_ATOMIC_DDL) && !opt_ctas_compatibility_mode) {
if (((get_default_handlerton(thd, thd->lex->create_info->db_type)->flags &
HTON_SUPPORTS_ATOMIC_DDL) != 0u) && !opt_ctas_compatibility_mode) {

@@ -4292,6 +4292,13 @@ static Sys_var_int32 Sys_regexp_stack_limit(
GLOBAL_VAR(opt_regexp_stack_limit), CMD_LINE(REQUIRED_ARG),
VALID_RANGE(0, INT32_MAX), DEFAULT(8000000), BLOCK_SIZE(1));

static Sys_var_bool Sys_ctas_compatibility_mode(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ cppcoreguidelines-interfaces-global-init ⚠️
initializing non-local variable with non-const expression depending on uninitialized non-local variable opt_ctas_compatibility_mode

"ctas_compatibility_mode",
"Execute and binlog CTAS in pre 8.0.21 way, i.e. with intermediate commit "
"after the table creation.",
READ_ONLY GLOBAL_VAR(opt_ctas_compatibility_mode), CMD_LINE(OPT_ARG),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ cppcoreguidelines-pro-type-cstyle-cast ⚠️
do not use C-style cast to convert between unrelated types

@@ -2363,7 +2363,8 @@ bool store_create_info(THD *thd, Table_ref *table_list, String *packet,
This is done only while binlogging CREATE TABLE AS SELECT.
*/
if (!thd->lex->query_block->field_list_is_empty() &&
(create_info_arg->db_type->flags & HTON_SUPPORTS_ATOMIC_DDL)) {
(create_info_arg->db_type->flags & HTON_SUPPORTS_ATOMIC_DDL) &&

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ readability-implicit-bool-conversion ⚠️
implicit conversion unsigned int -> bool

Suggested change
(create_info_arg->db_type->flags & HTON_SUPPORTS_ATOMIC_DDL) &&
((create_info_arg->db_type->flags & HTON_SUPPORTS_ATOMIC_DDL) != 0u) &&

CREATE TABLE AS SELECT [...] START TRANSACTION to improve 8.0 -> 5.7
replication reliability

https://perconadev.atlassian.net/browse/PS-9238

Problem:
5.7 does not understand the new syntax and complains about
CREATE TABLE ... START TRANSACTION received in binlog.

Cause:
Commit 4fe3f2f (8.0.21) introduced changes that allow the execution
of CREATE TABLE AS SELECT as a single transaction. Before that change,
CREATE TABLE was the 1st transaction, then rows were inserted, which was
unsafe. To achieve this we don't commit the transaction after
CREATE TABLE. The commit is done after rows are inserted. For async
replica to understand this protocol, START TRANSACTION clause was added
to CREATE TABLE in binlog. When the replica sees this, it knows not to
commit after CREATE TABLE.

Solution:
Introduced ctas_compatibility_mode global, read-only variable, OFF by
default. If switched to ON, it enforces the behavior from before the
above-mentioned commit.

Implementation details:
The fix contains 2 parts:
1. Query_result_create::binlog_show_create_table() - When SE supports
atomic DDL, CREATE TABLE query was binlogged with
direct=false / is_trans=true flags. This caused the event to be cached
in trx_cache instead of stmt_cache.
MYSQL_BIN_LOG::commit() logic skips trx_cache commit, if this is not
the transaction commit (it does only stmt_cache commit). Then, when rows
are inserted, it was detected that there is ongoing transaction
(trx_cache not empty), so no new BEGIN statement was generated
in binlog.
Effectively CREATE TABLE and rows insertion were done in one
transaction.

The fix is to revert to the default behavior, i.e.,
Query_result_create::binlog_show_create_table() creates the event in
stmt_cache, which is committed after table creation. When rows are being
inserted, it is detected that trx_cache is empty, so BEGIN statement is
added to binlog.

2. store_create_info()—When executing CTAS, the START TRANSACTION clause
was added to the rewritten query to inform the replica about what was
happening.

The fix is not to add the START TRANSACTION clause.
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)

Comment on lines +3042 to +3043
if ((get_default_handlerton(thd, thd->lex->create_info->db_type)->flags &
HTON_SUPPORTS_ATOMIC_DDL) &&

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ readability-implicit-bool-conversion ⚠️
implicit conversion unsigned int -> bool

Suggested change
if ((get_default_handlerton(thd, thd->lex->create_info->db_type)->flags &
HTON_SUPPORTS_ATOMIC_DDL) &&
if (((get_default_handlerton(thd, thd->lex->create_info->db_type)->flags &
HTON_SUPPORTS_ATOMIC_DDL) != 0u) &&

@kamil-holubicki kamil-holubicki marked this pull request as ready for review July 25, 2024 15:13
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)

}
}
result =
write_bin_log(thd, true, thd->query().str, query_length, is_trans);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ readability-implicit-bool-conversion ⚠️
implicit conversion int -> bool

Suggested change
write_bin_log(thd, true, thd->query().str, query_length, is_trans);
(write_bin_log(thd, true, thd->query().str, query_length, is_trans) != 0);

@kamil-holubicki kamil-holubicki marked this pull request as draft July 25, 2024 16:56
CREATE TABLE AS SELECT [...] START TRANSACTION to improve 8.0 -> 5.7
replication reliability

https://perconadev.atlassian.net/browse/PS-9238

Part 2 of the fix.
In the scenario source -> replica1 -> replica2, if replica1 is
configured with ctas_compatibility_mode it should behave accordingly and
produce the binglog compatible with version 8.0.20 (intermediate commit
after CREATE TABLE and no START TRANSACTION clause)

This part is sensitive to create_info->m_transactional_ddl flag which is
set during statement parsing. If ctas_compatibility_mode is set, we
force this flag not to be set.

Additionally, replica side follows the path of regular CREATE TABLE
which binlogs the query as-is. if ctas_compatibility_mode is set,
CREATE TABLE is removed from the query.
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)

HA_CREATE_USED_START_TRANSACTION>(value),
value(value) {}

bool contextualize(Table_ddl_parse_context *pc) override {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ readability-make-member-function-const ⚠️
method contextualize can be made const

Suggested change
bool contextualize(Table_ddl_parse_context *pc) override {
bool contextualize(Table_ddl_parse_context *pc) const override {

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)

sql/log_event.h Outdated
@@ -1471,6 +1472,12 @@ class Query_log_event : public virtual binary_log::Query_event,
native_strncasecmp(query, STRING_WITH_LEN("ROLLBACK TO "))) ||
!strncmp(query, STRING_WITH_LEN("XA ROLLBACK"));
}

virtual bool is_ctas() const override {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ cppcoreguidelines-explicit-virtual-functions ⚠️
virtual is redundant since the function is already declared override

Suggested change
virtual bool is_ctas() const override {
bool is_ctas() const override {

CREATE TABLE AS SELECT [...] START TRANSACTION to improve 8.0 -> 5.7
replication reliability

https://perconadev.atlassian.net/browse/PS-9238

Part 2.1 of the fix.
When ctas_compatibility_mode=OFF the binlog sequence is:
BEGIN
CREATE TABLE ... START TRANSACTION
ROW EVENT
COMMIT

MTS sees the above as one group.

When the source produces binlog with CREATE TABLE ... START TRANSACTION
but the intermediate replica is configured with
ctas_compatibility_mode=ON it has to execute and binlog CTAS in legacy
mode (pre 8.0.20). It means the intermediate commit will be done after
CREATE TABLE. This commit will call pre_commit hook, which will call
Slave_worker::commit_positions() where we check the validity of
ptr_group->checkpoint_seqno. checkpoint_seqno however is set when group
end is detected (COMMIT event), but it is too late.

We detect CTAS case and set checkpoint_seqno to be prepared for
intermediate commit.
@kamil-holubicki kamil-holubicki marked this pull request as ready for review July 26, 2024 07:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant