-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
e2e logging: Hard to follow workload events #1597
Comments
@raghavendra-talur what do you think? |
nirs
added a commit
to nirs/ramen
that referenced
this issue
Oct 30, 2024
We used to log everything using the global logger. This makes the log messy and hard to follow, since we run 5 dr flows concurrently. This will be worse as we add new workloads (e.g. flattened pvcs, cloned pvcs, volsync with block pvc). We use now per-test logger created using the name of the workload (e.g. disapp-deploy-rbd-busybox). Every log message from test flow start with this name. The rest of the log does not specify resources names since we use the same name for all resources (drpc, placement, subscription, etc). Clean up log messages, replacing noisy text with simple description of the event. More work is needed, we still log too much uninteresting details, and do no not log what we wait for, but this is already much easier to follow. Example log with this change: === RUN TestSuites/Exhaustive/Deploy-rbd/Subscr/Enable 2024-10-31T00:37:41.913+0200 INFO subscr-deploy-rbd-busybox Protecting workload 2024-10-31T00:37:41.913+0200 INFO subscr-deploy-cephfs-busybox Protecting workload 2024-10-31T00:37:41.915+0200 INFO appset-deploy-rbd-busybox Workload running on dr2 2024-10-31T00:37:41.915+0200 INFO appset-deploy-rbd-busybox Annotating placement 2024-10-31T00:37:41.917+0200 INFO subscr-deploy-rbd-busybox Workload running on dr2 2024-10-31T00:37:41.917+0200 INFO subscr-deploy-rbd-busybox Annotating placement 2024-10-31T00:37:41.918+0200 INFO subscr-deploy-cephfs-busybox Workload running on dr1 2024-10-31T00:37:41.918+0200 INFO subscr-deploy-cephfs-busybox Annotating placement 2024-10-31T00:37:41.926+0200 INFO appset-deploy-rbd-busybox Creating drpc 2024-10-31T00:37:41.930+0200 INFO subscr-deploy-rbd-busybox Creating drpc 2024-10-31T00:37:41.932+0200 INFO subscr-deploy-cephfs-busybox Creating drpc 2024-10-31T00:38:52.038+0200 INFO appset-deploy-cephfs-busybox drpc is ready Fixes: RamenDR#1597 Signed-off-by: Nir Soffer <[email protected]>
nirs
added a commit
to nirs/ramen
that referenced
this issue
Oct 30, 2024
We used to log everything using the global logger. This makes the log messy and hard to follow, since we run 5 dr flows concurrently. This will be worse as we add new workloads (e.g. flattened pvcs, cloned pvcs, volsync with block pvc). We use now per-test logger created using the name of the workload (e.g. disapp-deploy-rbd-busybox). Every log message from test flow start with this name. The rest of the log does not specify resources names since we use the same name for all resources (drpc, placement, subscription, etc). Clean up log messages, replacing noisy text with simple description of the event. More work is needed, we still log too much uninteresting details, and do no not log what we wait for, but this is already much easier to follow. Example log with this change: === RUN TestSuites/Exhaustive/Deploy-rbd/Subscr/Enable 2024-10-31T00:37:41.913+0200 INFO subscr-deploy-rbd-busybox Protecting workload 2024-10-31T00:37:41.913+0200 INFO subscr-deploy-cephfs-busybox Protecting workload 2024-10-31T00:37:41.915+0200 INFO appset-deploy-rbd-busybox Workload running on dr2 2024-10-31T00:37:41.915+0200 INFO appset-deploy-rbd-busybox Annotating placement 2024-10-31T00:37:41.917+0200 INFO subscr-deploy-rbd-busybox Workload running on dr2 2024-10-31T00:37:41.917+0200 INFO subscr-deploy-rbd-busybox Annotating placement 2024-10-31T00:37:41.918+0200 INFO subscr-deploy-cephfs-busybox Workload running on dr1 2024-10-31T00:37:41.918+0200 INFO subscr-deploy-cephfs-busybox Annotating placement 2024-10-31T00:37:41.926+0200 INFO appset-deploy-rbd-busybox Creating drpc 2024-10-31T00:37:41.930+0200 INFO subscr-deploy-rbd-busybox Creating drpc 2024-10-31T00:37:41.932+0200 INFO subscr-deploy-cephfs-busybox Creating drpc 2024-10-31T00:38:52.038+0200 INFO appset-deploy-cephfs-busybox drpc is ready Fixes: RamenDR#1597 Signed-off-by: Nir Soffer <[email protected]>
nirs
added a commit
to nirs/ramen
that referenced
this issue
Oct 31, 2024
This is the recommended way based on logr docs, probably to make it easier to use, since you don't need to check if the logger is nil, and zero value of a logger just discards data. This is also the way logr is used in ramen, so there is no reason to use it differently in the e2e code. Part-of: RamenDR#1597 Signed-off-by: Nir Soffer <[email protected]>
nirs
added a commit
to nirs/ramen
that referenced
this issue
Oct 31, 2024
This info is usually not helpful and harmful by taking spaces needed for important info in the logs. When logging error we have file:lineno info in the traceback. Example log with this change: === RUN TestSuites/Exhaustive/Deploy-cephfs#01/Appset/Enable 2024-10-30T16:46:14.331+0200 INFO enter EnableProtection appset-deploy-rbd-busybox 2024-10-30T16:46:14.331+0200 INFO enter EnableProtection appset-deploy-cephfs-busybox 2024-10-30T16:46:16.995+0200 INFO deployment busybox is ready 2024-10-30T16:46:16.995+0200 INFO workload Deploy-rbd is ready 2024-10-30T16:46:16.995+0200 INFO disapp-deploy-rbd-busybox is deployed Part-of: RamenDR#1597 Signed-off-by: Nir Soffer <[email protected]>
nirs
added a commit
to nirs/ramen
that referenced
this issue
Oct 31, 2024
We used to log everything using the global logger. This makes the log messy and hard to follow, since we run 5 dr flows concurrently. This will be worse as we add new workloads (e.g. flattened pvcs, cloned pvcs, volsync with block pvc). We use now per-test logger created using the name of the workload (e.g. disapp-deploy-rbd-busybox). Every log message from test flow start with this name. The rest of the log does not specify resources names since we use the same name for all resources (drpc, placement, subscription, etc). Clean up log messages, replacing noisy text with simple description of the event. More work is needed, we still log too much uninteresting details, and do no not log what we wait for, but this is already much easier to follow. Example log with this change: === RUN TestSuites/Exhaustive/Deploy-rbd/Subscr/Enable 2024-10-31T00:37:41.913+0200 INFO subscr-deploy-rbd-busybox Protecting workload 2024-10-31T00:37:41.913+0200 INFO subscr-deploy-cephfs-busybox Protecting workload 2024-10-31T00:37:41.915+0200 INFO appset-deploy-rbd-busybox Workload running on dr2 2024-10-31T00:37:41.915+0200 INFO appset-deploy-rbd-busybox Annotating placement 2024-10-31T00:37:41.917+0200 INFO subscr-deploy-rbd-busybox Workload running on dr2 2024-10-31T00:37:41.917+0200 INFO subscr-deploy-rbd-busybox Annotating placement 2024-10-31T00:37:41.918+0200 INFO subscr-deploy-cephfs-busybox Workload running on dr1 2024-10-31T00:37:41.918+0200 INFO subscr-deploy-cephfs-busybox Annotating placement 2024-10-31T00:37:41.926+0200 INFO appset-deploy-rbd-busybox Creating drpc 2024-10-31T00:37:41.930+0200 INFO subscr-deploy-rbd-busybox Creating drpc 2024-10-31T00:37:41.932+0200 INFO subscr-deploy-cephfs-busybox Creating drpc 2024-10-31T00:38:52.038+0200 INFO appset-deploy-cephfs-busybox drpc is ready Part-of: RamenDR#1597 Signed-off-by: Nir Soffer <[email protected]>
nirs
added a commit
to nirs/ramen
that referenced
this issue
Oct 31, 2024
Adding the same text to all workload is not helpful. Shorter names are important to make it easy to scan the logs. Currently with test everything with deployment using busybox image. When we will add new workload we can consider changing the names, but we need just one word to make the workload unique. I want to add vm based tests later, but these will run with a special environment, so they can use different test suite. Example logs with this change: 2024-10-31T19:39:11.391+0200 INFO subscr-rbd Protecting workload 2024-10-31T19:39:11.467+0200 INFO appset-rbd Workload running on dr1 2024-10-31T19:39:11.467+0200 INFO appset-rbd Annotating placement 2024-10-31T19:39:11.467+0200 INFO appset-cephfs Workload running on dr2 2024-10-31T19:39:11.467+0200 INFO appset-cephfs Annotating placement 2024-10-31T19:39:11.467+0200 INFO subscr-rbd Workload running on dr1 2024-10-31T19:39:11.467+0200 INFO subscr-rbd Annotating placement 2024-10-31T19:39:11.485+0200 INFO appset-cephfs Creating drpc 2024-10-31T19:39:11.571+0200 INFO appset-rbd Creating drpc 2024-10-31T19:39:11.767+0200 INFO subscr-rbd Creating drpc 2024-10-31T19:40:37.560+0200 INFO disapp-rbd drpc is ready Fixes: RamenDR#1597 Signed-off-by: Nir Soffer <[email protected]>
nirs
added a commit
to nirs/ramen
that referenced
this issue
Nov 4, 2024
This is the recommended way based on logr docs, probably to make it easier to use, since you don't need to check if the logger is nil, and zero value of a logger just discards data. This is also the way logr is used in ramen, so there is no reason to use it differently in the e2e code. Part-of: RamenDR#1597 Signed-off-by: Nir Soffer <[email protected]>
nirs
added a commit
to nirs/ramen
that referenced
this issue
Nov 4, 2024
This info is usually not helpful and harmful by taking spaces needed for important info in the logs. When logging error we have file:lineno info in the traceback. Example log with this change: === RUN TestSuites/Exhaustive/Deploy-cephfs#01/Appset/Enable 2024-10-30T16:46:14.331+0200 INFO enter EnableProtection appset-deploy-rbd-busybox 2024-10-30T16:46:14.331+0200 INFO enter EnableProtection appset-deploy-cephfs-busybox 2024-10-30T16:46:16.995+0200 INFO deployment busybox is ready 2024-10-30T16:46:16.995+0200 INFO workload Deploy-rbd is ready 2024-10-30T16:46:16.995+0200 INFO disapp-deploy-rbd-busybox is deployed Part-of: RamenDR#1597 Signed-off-by: Nir Soffer <[email protected]>
nirs
added a commit
to nirs/ramen
that referenced
this issue
Nov 4, 2024
We used to log everything using the global logger. This makes the log messy and hard to follow, since we run 5 dr flows concurrently. This will be worse as we add new workloads (e.g. flattened pvcs, cloned pvcs, volsync with block pvc). We use now per-test logger created using the name of the workload (e.g. disapp-deploy-rbd-busybox). Every log message from test flow start with this name. The rest of the log does not specify resources names since we use the same name for all resources (drpc, placement, subscription, etc). Clean up log messages, replacing noisy text with simple description of the event. More work is needed, we still log too much uninteresting details, and do no not log what we wait for, but this is already much easier to follow. Example log with this change: === RUN TestSuites/Exhaustive/Deploy-rbd/Subscr/Enable 2024-10-31T00:37:41.913+0200 INFO subscr-deploy-rbd-busybox Protecting workload 2024-10-31T00:37:41.913+0200 INFO subscr-deploy-cephfs-busybox Protecting workload 2024-10-31T00:37:41.915+0200 INFO appset-deploy-rbd-busybox Workload running on dr2 2024-10-31T00:37:41.915+0200 INFO appset-deploy-rbd-busybox Annotating placement 2024-10-31T00:37:41.917+0200 INFO subscr-deploy-rbd-busybox Workload running on dr2 2024-10-31T00:37:41.917+0200 INFO subscr-deploy-rbd-busybox Annotating placement 2024-10-31T00:37:41.918+0200 INFO subscr-deploy-cephfs-busybox Workload running on dr1 2024-10-31T00:37:41.918+0200 INFO subscr-deploy-cephfs-busybox Annotating placement 2024-10-31T00:37:41.926+0200 INFO appset-deploy-rbd-busybox Creating drpc 2024-10-31T00:37:41.930+0200 INFO subscr-deploy-rbd-busybox Creating drpc 2024-10-31T00:37:41.932+0200 INFO subscr-deploy-cephfs-busybox Creating drpc 2024-10-31T00:38:52.038+0200 INFO appset-deploy-cephfs-busybox drpc is ready Part-of: RamenDR#1597 Signed-off-by: Nir Soffer <[email protected]>
nirs
added a commit
to nirs/ramen
that referenced
this issue
Nov 4, 2024
Adding the same text to all workload is not helpful. Shorter names are important to make it easy to scan the logs. Currently with test everything with deployment using busybox image. When we will add new workload we can consider changing the names, but we need just one word to make the workload unique. I want to add vm based tests later, but these will run with a special environment, so they can use different test suite. Example logs with this change: 2024-10-31T19:39:11.391+0200 INFO subscr-rbd Protecting workload 2024-10-31T19:39:11.467+0200 INFO appset-rbd Workload running on dr1 2024-10-31T19:39:11.467+0200 INFO appset-rbd Annotating placement 2024-10-31T19:39:11.467+0200 INFO appset-cephfs Workload running on dr2 2024-10-31T19:39:11.467+0200 INFO appset-cephfs Annotating placement 2024-10-31T19:39:11.467+0200 INFO subscr-rbd Workload running on dr1 2024-10-31T19:39:11.467+0200 INFO subscr-rbd Annotating placement 2024-10-31T19:39:11.485+0200 INFO appset-cephfs Creating drpc 2024-10-31T19:39:11.571+0200 INFO appset-rbd Creating drpc 2024-10-31T19:39:11.767+0200 INFO subscr-rbd Creating drpc 2024-10-31T19:40:37.560+0200 INFO disapp-rbd drpc is ready Fixes: RamenDR#1597 Signed-off-by: Nir Soffer <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
The current logs make following single workload events hard:
Example logs
We need this format:
Example logs converted to wanted format:
Notes:
The text was updated successfully, but these errors were encountered: