During the reconfiguration process, some requests on the router fail #482

Open
Serpentian opened this issue Aug 2, 2024 · 2 comments

@Serpentian

We need to minimize the number of failed requests on the router during reconfiguration. Users run into the router returning errors when, for example, a new replicaset is added. Probably related to #481.
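Until the router itself handles this, one client-side mitigation is to retry a request that fails during reconfiguration. Below is a minimal sketch, not vshard code: `TransientRouterError` and `call_with_retry` are hypothetical names, and the caller is assumed to map whichever router errors it considers retryable onto that exception.

```python
import time


class TransientRouterError(Exception):
    """Hypothetical marker for errors expected while the cluster is reconfigured."""


def call_with_retry(call, attempts=5, base_delay=0.05, sleep=time.sleep):
    """Run call() and retry on TransientRouterError with exponential backoff.

    The last failure is re-raised once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except TransientRouterError:
            if attempt == attempts - 1:
                raise
            # Delay doubles on every failed attempt: 0.05, 0.1, 0.2, ...
            sleep(base_delay * 2 ** attempt)
```

Retrying is only safe for idempotent requests, and it hides the errors rather than removing their cause, so it does not replace a fix on the router side.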

Serpentian self-assigned this Aug 2, 2024

@Serpentian

2 SP for investigating whether we can do anything about this.

@Serpentian

Also fix the flakiness of the reconfiguration stress test within the scope of this issue:

rebalancer/stress_add_remove_several_rs.test.l> vinyl           [ fail ]

Test failed! Result content mismatch:
--- rebalancer/stress_add_remove_several_rs.result	Fri Apr 26 12:50:35 2024
+++ /__w/vshard-ee/vshard-ee/test/var/rejects/rebalancer/stress_add_remove_several_rs.reject	Mon Aug 12 04:25:42 2024
@@ -500,19 +500,19 @@
 ...
 #box.space._bucket.index.status:select{vshard.consts.BUCKET.ACTIVE}
 ---
-- 100
-...
-check_consistency()
----
-- true
-...
-test_run:switch('box_2_a')
----
-- true
-...
-#box.space._bucket.index.status:select{vshard.consts.BUCKET.ACTIVE}
----
-- 100
+- 66
+...
+check_consistency()
+---
+- true
+...
+test_run:switch('box_2_a')
+---
+- true
+...
+#box.space._bucket.index.status:select{vshard.consts.BUCKET.ACTIVE}
+---
+- 67
 ...
 check_consistency()
 ---
@@ -524,24 +524,24 @@
 ...
 #box.space._bucket.index.status:select{vshard.consts.BUCKET.ACTIVE}
 ---
+- 67
+...
+check_consistency()
+---
+- true
+...
+test_run:switch('box_4_a')
+---
+- true
+...
+#box.space._bucket.index.status:select{vshard.consts.BUCKET.ACTIVE}
+---
 - 0
 ...
 check_consistency()
 ---
 - true
 ...
-test_run:switch('box_4_a')
----
-- true
-...
-#box.space._bucket.index.status:select{vshard.consts.BUCKET.ACTIVE}
----
-- 0
-...
-check_consistency()
----
-- true
-...
 test_run:switch('default')
 ---
 - true

[test-run server "test"] Last 15 lines of the log file /__w/vshard-ee/vshard-ee/test/var/001_rebalancer/test.log:
2024-08-12 04:25:18.432 [1651] main/347/console/unix/: I> Slaves are connected to a master "box_1_a"
2024-08-12 04:25:18.433 [1651] main/347/console/unix/: I> Waiting until slaves are connected to a master
2024-08-12 04:25:18.539 [1651] main/347/console/unix/: I> Slaves are connected to a master "box_2_a"
2024-08-12 04:25:19.175 [1651] main/355/console/unix/: I> Waiting until slaves are connected to a master
2024-08-12 04:25:19.283 [1651] main/355/console/unix/: I> Slaves are connected to a master "box_3_a"
2024-08-12 04:25:21.722 [1651] main/361/console/unix/: I> Waiting until slaves are connected to a master
2024-08-12 04:25:21.829 [1651] main/361/console/unix/: I> Slaves are connected to a master "box_4_a"
2024-08-12 04:25:31.071 [1651] main/367/console/unix/: I> Waiting until slaves are connected to a master
2024-08-12 04:25:31.079 [1651] main/367/console/unix/: I> Slaves are connected to a master "box_1_a"
2024-08-12 04:25:31.079 [1651] main/367/console/unix/: I> Waiting until slaves are connected to a master
2024-08-12 04:25:31.291 [1651] main/367/console/unix/: I> Slaves are connected to a master "box_2_a"
2024-08-12 04:25:31.845 [1651] main/373/console/unix/: I> Waiting until slaves are connected to a master
2024-08-12 04:25:31.953 [1651] main/373/console/unix/: I> Slaves are connected to a master "box_3_a"
2024-08-12 04:25:34.402 [1651] main/379/console/unix/: I> Waiting until slaves are connected to a master
2024-08-12 04:25:34.509 [1651] main/379/console/unix/: I> Slaves are connected to a master "box_4_a"
Reproduce file /__w/vshard-ee/vshard-ee/test/var/reproduce/001_rebalancer.list.yaml
---
- [rebalancer/bucket_ref.test.lua, null]
- [rebalancer/errinj.test.lua, null]
- [rebalancer/parallel.test.lua, memtx]
- [rebalancer/parallel.test.lua, vinyl]
- [rebalancer/rebalancer.test.lua, memtx]
- [rebalancer/rebalancer.test.lua, vinyl]
- [rebalancer/rebalancer2.test.lua, null]
- [rebalancer/rebalancer_lock_and_pin.test.lua, null]
- [rebalancer/receiving_bucket.test.lua, null]
- [rebalancer/restart_during_rebalancing.test.lua, memtx]
- [rebalancer/restart_during_rebalancing.test.lua, vinyl]
- [rebalancer/stress_add_remove_rs.test.lua, memtx]
- [rebalancer/stress_add_remove_rs.test.lua, vinyl]
- [rebalancer/stress_add_remove_several_rs.test.lua, memtx]
- [rebalancer/stress_add_remove_several_rs.test.lua, vinyl]
...
---------------------------------------------------------------------------
[Instance test] Stopping the server...
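In the diff above, three consecutive checks report 66, 67, and 67 active buckets where the expected result has 100, 100, and 0. If the cause is that the counts were read before rebalancing converged (an assumption, not something established in this issue), the usual remedy is to poll for the expected state with a timeout instead of reading it once. A minimal sketch of such a helper, with `wait_until` and its parameters being hypothetical names rather than part of the test suite:

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.1,
               sleep=time.sleep, clock=time.monotonic):
    """Poll predicate() until it returns a truthy value or the timeout expires.

    Returns True if the predicate succeeded, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A test would then assert on something like `wait_until(lambda: active_bucket_count() == 100)`, where `active_bucket_count` is a placeholder for whatever reads the number of active buckets on the instance.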
