[enhancement](cloud) reconnect after the RPC request to the meta service fails #45668
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
run buildall
clang-tidy review says "All clean, LGTM! 👍"
TeamCity be ut coverage result:
}

private:
static Status get_pooled_client(std::shared_ptr<MetaService_Stub>* stub) {
static Status get_pooled_client(std::shared_ptr<MetaService_Stub>* stub,
MetaServiceProxy** proxy) {
Try to avoid double pointers (**). Is it possible to use a pointer to a unique_ptr instead?
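For illustration, here is a minimal sketch of what the suggestion could look like. `Proxy` and `make_proxy` are hypothetical stand-ins invented for this example, not the real MetaServiceProxy API, and the actual change in the PR may differ.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a proxy type; not the real MetaServiceProxy.
struct Proxy {
    std::string endpoint;
};

// The out-parameter is a pointer to a smart pointer rather than Proxy**,
// so ownership of the created object is explicit at the call site.
inline bool make_proxy(const std::string& endpoint, std::unique_ptr<Proxy>* out) {
    assert(out != nullptr);
    *out = std::make_unique<Proxy>(Proxy{endpoint});
    return true;
}
```

With this shape the caller declares `std::unique_ptr<Proxy> p;` and passes `&p`, and the object is released automatically when `p` goes out of scope.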
be/src/cloud/cloud_meta_mgr.cpp
Outdated
@@ -447,6 +481,7 @@ Status CloudMetaMgr::sync_tablet_rowsets(CloudTablet* tablet, bool warmup_delta_
.tag("partition_id", tablet->partition_id())
.tag("tried", tried)
.tag("sleep", duration_ms);
proxy->set_unhealthy();
Why not call set_unhealthy() right after the if (cntl.Failed()) check? It should not depend on retry_times.
> why not set_unhealthy() right after we get if (cntl.Failed())? it should not rely on retry_times
done
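A minimal sketch of the behavior requested in this thread: the proxy is marked unhealthy on every failed call, independent of the retry counter. `FakeProxy` and `call_with_retry` are hypothetical names for this example only; the real code uses brpc and MetaServiceProxy, which are not reproduced here.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical stand-in for the proxy's health flag.
struct FakeProxy {
    std::atomic<bool> unhealthy{false};
    void set_unhealthy() { unhealthy.store(true); }
};

// Calls rpc up to max_attempts times and returns true on the first success.
// rpc returns true when the call succeeded (the analogue of !cntl.Failed()).
inline bool call_with_retry(FakeProxy& proxy, int max_attempts,
                            const std::function<bool()>& rpc) {
    assert(max_attempts > 0);
    for (int tried = 0; tried < max_attempts; ++tried) {
        if (rpc()) {
            return true;
        }
        // Mark the proxy as soon as a call fails, no matter how many
        // retries remain.
        proxy.set_unhealthy();
    }
    return false;
}
```

In this sketch a single failed attempt is enough to flag the proxy, even if a later attempt succeeds.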
be/src/cloud/cloud_meta_mgr.cpp
Outdated
return Status::RpcError("failed to get delete bitmap: {}", cntl.ErrorText());

int retry_times = 0;
brpc::Controller cntl;
Put it inside the while loop.
start and end
done
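As a sketch of why the controller would be declared inside the loop: a controller created per attempt cannot carry a failure flag over from an earlier attempt. `FakeController` below is a hypothetical stand-in that only mimics the Failed()/SetFailed() shape; it is not brpc::Controller.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-in that only models the failure flag and error text.
struct FakeController {
    bool failed = false;
    std::string error_text;
    bool Failed() const { return failed; }
    void SetFailed(const std::string& msg) {
        failed = true;
        error_text = msg;
    }
};

// Returns the 1-based attempt number that succeeded, or -1 if all failed.
inline int attempts_until_success(
        int max_attempts,
        const std::function<void(FakeController&, int)>& rpc) {
    assert(max_attempts > 0);
    for (int tried = 1; tried <= max_attempts; ++tried) {
        FakeController cntl;  // fresh state for every attempt
        rpc(cntl, tried);
        if (!cntl.Failed()) {
            return tried;
        }
    }
    return -1;
}
```

If `cntl` were declared once before the loop in this sketch, the flag set by the first failure would still be set on later attempts, so a later success would be misread as a failure.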
std::shared_ptr<MetaService_Stub> stub;
RETURN_IF_ERROR(MetaServiceProxy::get_client(&stub));
MetaServiceProxy* proxy;
RETURN_IF_ERROR(MetaServiceProxy::get_proxy(&proxy));
Why not use a new proxy every time we retry?
}

private:
static Status get_pooled_client(std::shared_ptr<MetaService_Stub>* stub) {
static Status get_pooled_client(std::shared_ptr<MetaService_Stub>* stub,
Add a comment for this method that covers its behavior and parameters.
done
}

static Status get_proxy(MetaServiceProxy** proxy) {
std::shared_ptr<MetaService_Stub> stub;
Add a comment for this function, and note that the stub is a placeholder.
done
@@ -3307,4 +3307,6 @@ public static int metaServiceRpcRetryTimes() {
"For disabling certain SQL queries, the configuration item is a list of simple class names of AST"
+ "(for example CreateRepositoryStmt, CreatePolicyCommand), separated by commas."})
public static String block_sql_ast_names = "";

public static long ms_rpc_reconn_interval_ms = 20000;
Use the full name meta_service_rpc_reconnect_interval_ms and add a comment for it.
done
be/src/cloud/config.h
Outdated
@@ -111,5 +111,9 @@ DECLARE_mBool(enable_use_cloud_unique_id_from_fe);

DECLARE_Bool(enable_cloud_tablet_report);

DECLARE_mInt32(delete_bitmap_rpc_retry_times);

DECLARE_mInt64(ms_rpc_reconn_interval_ms);
Use the full name meta_service_rpc_reconnect_interval_ms and add a comment.
done
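The thread does not spell out how the interval is applied. One plausible reading, shown here purely as an assumption, is that an unhealthy connection is rebuilt only after the configured interval has passed since the last reconnect. `ReconnectGate` and its fields are hypothetical names for this sketch, not taken from the PR.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: decides whether a reconnect should be attempted.
// This is an assumed reading of the setting, not the PR's actual logic.
struct ReconnectGate {
    std::chrono::milliseconds interval;  // e.g. the reconnect interval config
    std::chrono::steady_clock::time_point last_reconnect;
    bool unhealthy;

    bool should_reconnect(std::chrono::steady_clock::time_point now) const {
        assert(interval.count() >= 0);
        // Reconnect only if the connection was flagged and enough time
        // has passed, which rate-limits reconnect attempts.
        return unhealthy && (now - last_reconnect) >= interval;
    }
};
```

With the 20000 ms default shown in the diff, a connection flagged unhealthy would be rebuilt at most once every 20 seconds under this reading.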
run buildall
TPC-H: Total hot run time: 32519 ms
TPC-DS: Total hot run time: 195756 ms
TeamCity be ut coverage result:
ClickBench: Total hot run time: 31.37 s
fe/fe-core/src/main/java/org/apache/doris/cloud/rpc/MetaServiceProxy.java
Outdated
run buildall
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
run buildall
run buildall
TPC-H: Total hot run time: 32626 ms
TPC-DS: Total hot run time: 196710 ms
ClickBench: Total hot run time: 30.86 s
TeamCity be ut coverage result:
LGTM
PR approved by at least one committer and no changes requested.
…ice fails (apache#45668) Co-authored-by: Gavin Chou <[email protected]>
…ice fails (#45668) (#46358) pick #45668 --------- Co-authored-by: Gavin Chou <[email protected]>