Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add source_stmt in Context for test logging. #1103

Merged
merged 12 commits into from
Aug 25, 2023
Merged

Conversation

ekexium
Copy link
Contributor

@ekexium ekexium commented Apr 19, 2023

@@ -916,6 +920,11 @@ enum APIVersion {
V2 = 2;
}

message TidbSource {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about QueryCtx or StmtCtx? Our goal is to establish a connection between the execution flow of transactions or statements.
connection_id may not be globally unique, to clarify which query the request belongs to something like stmt_id or uuid is needed but it looks a bit complex.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ctx is too general and it can cover almost everything... How about source_stmt?
stmt_id is a good idea. We can simply assign an incrementing id in each session, and print it together with conn_id in general log.

@ekexium ekexium changed the title Add TidbSource in Context for test logging. Add source_stmt in Context for test logging. Apr 24, 2023

// It tells us where the request comes from in TiDB. If it isn't from TiDB, leave it blank.
// This is for tests only and thus can be safely changed/removed without affecting compatibility.
SourceStmt source_stmt = 33;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this field is only used for logging with context info, I think a formated string is enough (just like request_source).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO string is less structured and less flexible.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I'm concerned about is: do we really have to introduce some sql concepts (stmt_id, conn_id) to the kv rpc?
Besides, I think string is more flexible, it's allow callers define/update the context info freely. The drawback of string is that it's not compacted.

Copy link
Contributor Author

@ekexium ekexium Apr 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really have to introduce some sql concepts (stmt_id, conn_id) to the kv rpc?

Yeah I agree. I hope we could use something like SpanContext, making it part of the tracing framework. However that's kinda infeasible for now.

it's allow callers define/update the context info freely

True. But I don't quite view it as a pure advantage. If it changes often, we will easily forget what is included and have to jump forth and back in code to find out. I'd prefer an explicit definition to avoid such effort.
By flexible I mean TiKV can't do anything except simply printing a string, like structured logging, filtering, etc.

@disksing
Copy link
Contributor

disksing commented Jun 7, 2023

@cfzjywxk @zyguan Please take another look.

uint64 start_ts = 1;
uint64 connection_id = 2;
uint64 stmt_id = 3;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the connection_id can be identical across different TiDB servers, it may be necessary to use a unique identifier such as uuid to indicate which statement and instance this query originates from.
Besides, how should the stmt_id be generated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how should the stmt_id be generated

Each connection generates its own stmt_id.
Merging them together into a UUID can make it harder for humans to read. Since it's only used in tests, the performance gain doesn't matter a lot.

Copy link

@zyguan zyguan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, is the source_stmt field only set for testing?

BTW, source_stmt.start_ts seems to be a little redundant since most of rpc requests already have a field that indicate to start-ts.

@ekexium
Copy link
Contributor Author

ekexium commented Jun 7, 2023

is the source_stmt field only set for testing?

Yes.

BTW, source_stmt.start_ts seems to be a little redundant since most of rpc requests already have a field that indicate to start-ts.

Makes sense. Since it's only used in tests, I still prefer more simplicity at the cost of redundancy.

@ekexium
Copy link
Contributor Author

ekexium commented Jun 7, 2023

We can wait until the corresponding TiKV PR completes its review before merging this PR

@ekexium ekexium merged commit 9e7ebd3 into pingcap:master Aug 25, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants