MDEV-17243: "FSM: no such a transition ABORTING -> REPLICATING" #524

Open: wants to merge 1 commit into base: 3.x

janlindstrom

The TrxMap structure does not account for two trx objects that have
the same trx_id (2^64 - 1, which is the default trx_id) but belong
to two different connections.

As a result, the same trx object ends up shared between two
unrelated connections, which causes state inconsistency and
eventually a crash (race condition).

This problem could be solved by also keying on conn-id, but that
would require an interface change. To avoid that, we should
maintain a separate map of such trx objects keyed by gu_thread_id.

https://jira.mariadb.org/browse/MDEV-17243 and
https://jira.mariadb.org/browse/MDEV-17262
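
To make the collision described above concrete, here is a minimal sketch. It is not the Galera code: `NaiveTrxMap`, `SplitTrxMap`, `Trx`, `ThreadKey` and the helper functions are hypothetical names invented for illustration, and the thread key is passed as a plain integer standing in for gu_thread_id.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>
#include <memory>

// Hypothetical stand-ins for illustration; these are not Galera types.
using TrxId = std::uint64_t;
using ThreadKey = std::uint64_t;  // stands in for gu_thread_id
// 2^64 - 1, the same bit pattern as -1 converted to an unsigned 64-bit id.
constexpr TrxId kDefaultTrxId = std::numeric_limits<TrxId>::max();

struct Trx {
    int owner_conn;  // connection that created the object
};

// Keyed by trx_id only: every caller that passes the default id
// receives the same object, whichever connection it belongs to.
class NaiveTrxMap {
public:
    std::shared_ptr<Trx> get_or_create(TrxId id, int conn) {
        auto& slot = by_id_[id];
        if (!slot) slot = std::make_shared<Trx>(Trx{conn});
        return slot;
    }
private:
    std::map<TrxId, std::shared_ptr<Trx>> by_id_;
};

// Assumed shape of the proposed fix: default-id objects live in a
// second map keyed by the calling thread instead of by trx_id.
class SplitTrxMap {
public:
    std::shared_ptr<Trx> get_or_create(TrxId id, ThreadKey thread, int conn) {
        auto& slot = (id == kDefaultTrxId) ? by_thread_[thread] : by_id_[id];
        if (!slot) slot = std::make_shared<Trx>(Trx{conn});
        return slot;
    }
private:
    std::map<TrxId, std::shared_ptr<Trx>> by_id_;
    std::map<ThreadKey, std::shared_ptr<Trx>> by_thread_;
};

// True when two connections using the default id get the same object.
bool naive_shares_default_trx() {
    NaiveTrxMap m;
    return m.get_or_create(kDefaultTrxId, 1) == m.get_or_create(kDefaultTrxId, 2);
}

// Same scenario with the split map and two distinct thread keys.
bool split_shares_default_trx() {
    SplitTrxMap m;
    return m.get_or_create(kDefaultTrxId, 100, 1) ==
           m.get_or_create(kDefaultTrxId, 200, 2);
}

void check_sketch() {
    assert(naive_shares_default_trx());
    assert(!split_shares_default_trx());
}
```

Neither class is thread-safe; a real map would need locking around lookups and inserts. The sketch only shows which key decides whether two connections see the same object.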

temeo commented Oct 31, 2018

This looks like a suspicious patch for two reasons:

  • The application using the Galera library (MariaDB in this case) should guarantee that it assigns a unique trx_id to each transaction before calling wsrep hooks. A transaction with trx_id = -1 (the undefined trx id in MariaDB) should therefore never end up in the connection map.
  • gu_thread_t is used as a workaround for transactions with an invalid trx id, but this assumes that the same thread always handles the whole transaction life cycle, which may not hold in all cases.
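
The second concern can be illustrated with a small sketch. It assumes a map keyed only by a thread identifier; `ThreadKeyedTrxMap`, `Trx`, `ThreadKey` and `steps_seen_by_second_step` are hypothetical names and not part of Galera. If a later step of the same logical transaction runs on a different thread, the lookup silently creates a fresh object instead of finding the existing one.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>

// Hypothetical stand-ins for illustration; these are not Galera types.
using ThreadKey = std::uint64_t;  // stands in for a gu_thread_t value

struct Trx {
    int steps_seen = 0;  // number of transaction steps recorded on this object
};

// Map keyed only by the thread that happens to make the call.
class ThreadKeyedTrxMap {
public:
    Trx& get_or_create(ThreadKey thread) {
        auto& slot = by_thread_[thread];
        if (!slot) slot = std::make_unique<Trx>();
        return *slot;
    }
private:
    std::map<ThreadKey, std::unique_ptr<Trx>> by_thread_;
};

// Runs two steps of one logical transaction, each on the given thread key,
// and returns how many steps the object found by the second step has seen.
int steps_seen_by_second_step(ThreadKey first, ThreadKey second) {
    ThreadKeyedTrxMap m;
    m.get_or_create(first).steps_seen++;
    Trx& trx = m.get_or_create(second);
    trx.steps_seen++;
    return trx.steps_seen;
}

void check_sketch() {
    // Same thread for both steps: the second step finds the first step's state.
    assert(steps_seen_by_second_step(7, 7) == 2);
    // Different thread for the second step: the earlier state is not found.
    assert(steps_seen_by_second_step(7, 8) == 1);
}
```

Whether a transaction can actually migrate between threads depends on the application's threading model, which this sketch does not model.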
