Kill tablet on BS failures #13766

Open
wants to merge 4 commits into
base: main

@avevad (Member) commented Jan 23, 2025

Changelog entry

Kill tablet on BS failures instead of aborting node process.

Changelog category

  • Bugfix

Additional information

This resolves #7901

@avevad added the area/cs label Jan 23, 2025
@avevad self-assigned this Jan 23, 2025
@avevad requested a review from a team as a code owner January 23, 2025 15:11
@avevad requested a review from ivanmorozov333 January 23, 2025 15:11

github-actions bot commented Jan 23, 2025

2025-01-23 15:14:21 UTC Pre-commit check linux-x86_64-release-asan for 416f747 has started.
2025-01-23 15:14:34 UTC Artifacts will be uploaded here
2025-01-23 15:17:31 UTC ya make is running...
🟡 2025-01-23 16:20:51 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
11153 11084 0 30 11 28

2025-01-23 16:22:11 UTC ya make is running... (failed tests rerun, try 2)
🟡 2025-01-23 16:34:20 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
104 (only retried tests) 75 0 2 3 24

2025-01-23 16:34:31 UTC ya make is running... (failed tests rerun, try 3)
🟡 2025-01-23 16:46:02 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet

Test history | Ya make output | Test bloat | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
51 (only retried tests) 24 0 3 0 24

🟢 2025-01-23 16:46:09 UTC Build successful.
🟢 2025-01-23 16:46:38 UTC ydbd size 3.6 GiB changed* by -3.0 KiB, which is <= 0 Bytes vs main: OK

ydbd size dash main: 38d0188 merge: 416f747 diff diff %
ydbd size 3 858 369 408 Bytes 3 858 366 304 Bytes -3.0 KiB -0.000%
ydbd stripped size 1 349 602 288 Bytes 1 349 601 328 Bytes -960 Bytes -0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

github-actions bot commented Jan 23, 2025

2025-01-23 15:15:17 UTC Pre-commit check linux-x86_64-relwithdebinfo for 416f747 has started.
2025-01-23 15:17:20 UTC Artifacts will be uploaded here
2025-01-23 15:20:17 UTC ya make is running...
🟡 2025-01-23 16:05:36 UTC Some tests failed, follow the links below. Going to retry failed tests...

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
18396 17091 0 3 1180 122

2025-01-23 16:07:39 UTC ya make is running... (failed tests rerun, try 2)
🟡 2025-01-23 16:17:28 UTC Some tests failed, follow the links below. Going to retry failed tests...

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
182 (only retried tests) 64 0 1 0 117

2025-01-23 16:17:37 UTC ya make is running... (failed tests rerun, try 3)
🟢 2025-01-23 16:26:52 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
172 (only retried tests) 56 0 0 0 116

🟢 2025-01-23 16:26:59 UTC Build successful.
🟢 2025-01-23 16:27:22 UTC ydbd size 2.1 GiB changed* by -1.3 KiB, which is <= 0 Bytes vs main: OK

ydbd size dash main: 38d0188 merge: 416f747 diff diff %
ydbd size 2 220 477 584 Bytes 2 220 476 256 Bytes -1.3 KiB -0.000%
ydbd stripped size 469 681 648 Bytes 469 681 456 Bytes -192 Bytes -0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

@@ -14,7 +14,11 @@ void TGarbageCollectionActor::Handle(TEvBlobStorage::TEvCollectGarbageResult::TP
         CheckFinished();
     } else {
         ACFL_ERROR()("event", "GC_ERROR")("details", ev->Get()->Print(true));
-        SendToBSProxy(NActors::TActivationContext::AsActorContext(), ev->Cookie, GCTask->BuildRequest(TBlobAddress(ev->Cookie, ev->Get()->Channel)).release(), ev->Cookie);
+        if (auto gc_ev = GCTask->BuildRequest(TBlobAddress(ev->Cookie, ev->Get()->Channel))) {
Collaborator

style guide

Member Author

Fixed.

+        if (auto gc_ev = GCTask->BuildRequest(TBlobAddress(ev->Cookie, ev->Get()->Channel))) {
+            SendToBSProxy(NActors::TActivationContext::AsActorContext(), ev->Cookie, gc_ev.release(), ev->Cookie);
+        } else {
+            Send(TabletActorId, new TEvents::TEvPoison);
Collaborator

We need to add logic that waits for all responses from the groups we sent data to, and only then kills the tablet.
Also, the actor we are running in needs to be terminated somehow.

Member Author

Added the required logic.
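
For readers following the thread, here is a minimal sketch of the waiting logic requested in the review comment above. It is not the code from this PR. `TGcRequest`, `TGcReplyTracker` and `RunDemo` are hypothetical names, and a plain callback stands in for sending `TEvPoison` through the actor system. The idea, under those assumptions: once a retry request cannot be built, stop retrying, keep counting the replies still in flight, and kill the tablet only after the last one arrives.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for a garbage-collection request sent to one storage group.
struct TGcRequest {
    int GroupId = 0;
};

// Hypothetical bookkeeping helper; it does not use the real actor API.
class TGcReplyTracker {
public:
    explicit TGcReplyTracker(std::function<void()> killTablet)
        : KillTablet(std::move(killTablet))
    {}

    // Record that one request went out to a storage group.
    void OnRequestSent() {
        ++InFlight;
    }

    // Handle one reply. `retry` is the rebuilt request for a failed reply,
    // or nullptr when it could not be built. Returns the request to resend, if any.
    std::unique_ptr<TGcRequest> OnReply(bool ok, std::unique_ptr<TGcRequest> retry) {
        assert(InFlight > 0 && !Finished);
        if (!ok && !Failed && retry) {
            // The request is resent, so it stays in flight.
            return retry;
        }
        if (!ok) {
            // Either the retry could not be built or a failure was already seen:
            // stop retrying and only wait for the remaining replies.
            Failed = true;
        }
        --InFlight;
        if (InFlight == 0) {
            Finish();
        }
        return nullptr;
    }

    bool IsFinished() const { return Finished; }
    bool HasFailed() const { return Failed; }
    std::size_t GetInFlight() const { return InFlight; }

private:
    void Finish() {
        Finished = true;   // the caller would terminate its own actor here
        if (Failed) {
            KillTablet();  // runs once, only after the last reply
        }
    }

    std::function<void()> KillTablet;
    std::size_t InFlight = 0;
    bool Failed = false;
    bool Finished = false;
};

// Three groups: one replies OK, one fails with no rebuildable request,
// one replies OK afterwards. Returns how many times the kill callback ran.
int RunDemo() {
    int kills = 0;
    TGcReplyTracker tracker([&kills] { ++kills; });
    for (int i = 0; i < 3; ++i) {
        tracker.OnRequestSent();
    }
    tracker.OnReply(true, nullptr);
    tracker.OnReply(false, nullptr);  // failure: the kill is deferred, one reply is pending
    assert(kills == 0);
    tracker.OnReply(true, nullptr);   // last reply: the kill callback runs now
    return kills;
}
```

In an actor, `OnRequestSent` would be called next to each send, `OnReply` from the result handler, and `IsFinished()` would be the point at which the actor terminates itself; how the PR actually wires this up is not shown in the thread.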

github-actions bot commented Jan 23, 2025

2025-01-23 17:10:13 UTC Pre-commit check linux-x86_64-relwithdebinfo for a28a768 has started.
2025-01-23 17:10:25 UTC Artifacts will be uploaded here
2025-01-23 17:13:25 UTC ya make is running...
2025-01-23 17:24:19 UTC Check cancelled

github-actions bot commented Jan 23, 2025

2025-01-23 17:10:14 UTC Pre-commit check linux-x86_64-release-asan for a28a768 has started.
2025-01-23 17:10:25 UTC Artifacts will be uploaded here
2025-01-23 17:13:24 UTC ya make is running...
2025-01-23 17:24:13 UTC Check cancelled

@avevad requested a review from ivanmorozov333 January 23, 2025 17:26

github-actions bot commented Jan 23, 2025

2025-01-23 17:28:41 UTC Pre-commit check linux-x86_64-relwithdebinfo for 5f95c5b has started.
2025-01-23 17:28:53 UTC Artifacts will be uploaded here
2025-01-23 17:31:44 UTC ya make is running...
🟡 2025-01-23 18:22:56 UTC Some tests failed, follow the links below. Going to retry failed tests...

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
18396 17079 0 6 1190 121

2025-01-23 18:25:19 UTC ya make is running... (failed tests rerun, try 2)
🟢 2025-01-23 18:35:03 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
198 (only retried tests) 80 0 0 0 118

🟢 2025-01-23 18:35:11 UTC Build successful.
🟢 2025-01-23 18:35:29 UTC ydbd size 2.1 GiB changed* by +32.6 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: 38d0188 merge: 5f95c5b diff diff %
ydbd size 2 220 477 584 Bytes 2 220 510 960 Bytes +32.6 KiB +0.002%
ydbd stripped size 469 681 648 Bytes 469 681 712 Bytes +64 Bytes +0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

github-actions bot commented Jan 23, 2025

2025-01-23 17:28:43 UTC Pre-commit check linux-x86_64-release-asan for 5f95c5b has started.
2025-01-23 17:34:26 UTC Artifacts will be uploaded here
2025-01-23 17:37:14 UTC ya make is running...
🟡 2025-01-23 18:34:58 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
11153 11102 0 17 4 30

2025-01-23 18:36:12 UTC ya make is running... (failed tests rerun, try 2)
🟡 2025-01-23 18:48:33 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
81 (only retried tests) 50 0 1 1 29

2025-01-23 18:48:42 UTC ya make is running... (failed tests rerun, try 3)
🟢 2025-01-23 19:00:25 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
55 (only retried tests) 27 0 0 0 28

🟢 2025-01-23 19:00:35 UTC Build successful.
🟢 2025-01-23 19:01:01 UTC ydbd size 3.6 GiB changed* by +54.2 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: 38d0188 merge: 5f95c5b diff diff %
ydbd size 3 858 369 408 Bytes 3 858 424 960 Bytes +54.2 KiB +0.001%
ydbd stripped size 1 349 602 288 Bytes 1 349 606 832 Bytes +4.4 KiB +0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

Development

Successfully merging this pull request may close these issues.

Change VERIFY BuildRequest to KILL_TABLET
2 participants