TMA 4.8 perf converted JSON files #182

Merged: 1 commit into intel:main on May 28, 2024

Conversation

weilinwa (Contributor)

Changes included in this PR:

  1. New changes based on the up-to-date event and metric files released in PR: TMA 4.8 Release #181
  2. Two new fields in event JSON files, Counter and Experimental, from PR: Update field names in counter.json file #165 and PR: Add experimental field to perf json #170
  3. New counter.json files, from PR: Update field names in counter.json file #165

@weilinwa (Contributor Author)

@edwarddavidbaker, could you please also request Ian as a reviewer for this PR? Thanks!

@edwarddavidbaker (Contributor)

> @edwarddavidbaker, could you please also request Ian as a reviewer for this PR? Thanks!

Sure, added Ian to the reviewers.

"MetricName": "llc_data_read_demand_plus_prefetch_miss_latency",
"ScaleUnit": "1ns"
},
{
"BriefDescription": "Average latency of a last level cache (LLC) demand and prefetch data read miss (read memory access) addressed to local memory in nano seconds",
"MetricExpr": "1e9 * (cha@UNC_CHA_TOR_OCCUPANCY.IA_MISS\\,config1\\=0x40432@ / cha@UNC_CHA_TOR_INSERTS.IA_MISS\\,config1\\=0x40432@) / (UNC_CHA_CLOCKTICKS / (#num_cores / #num_packages * #num_packages)) * duration_time",
"MetricExpr": "llc_data_read_demand_plus_prefetch_miss_latency_for_remote_requests",
Contributor

This looks funny: local miss request latency is computed using the remote request miss latency.

Contributor Author

@captain5050, yes, the event data used in these metrics' expressions was incomplete, which caused these metrics to have the same metric expression, so one got rewritten in terms of the other metric's name. We have just updated these metrics. Thanks!
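The failure mode described here (duplicate expressions collapsing into aliases) can be caught mechanically. A rough, hypothetical Python check over a generated metrics JSON file, assuming the list-of-objects layout with MetricName/MetricExpr fields used in these files:

```python
import json
import sys
from collections import defaultdict

def check_metric_aliases(path):
    """Flag metrics whose expression is just another metric's name,
    and groups of metrics that share an identical expression."""
    with open(path) as f:
        metrics = json.load(f)

    names = {m["MetricName"] for m in metrics if "MetricName" in m}
    by_expr = defaultdict(list)

    for m in metrics:
        name = m.get("MetricName")
        expr = m.get("MetricExpr", "").strip()
        if not name or not expr:
            continue
        by_expr[expr].append(name)
        # An expression that is exactly some other metric's name is an alias.
        if expr in names and expr != name:
            print(f"{name}: aliased to {expr}")

    for expr, group in by_expr.items():
        if len(group) > 1:
            print(f"identical expression shared by: {', '.join(group)}")

if __name__ == "__main__":
    for p in sys.argv[1:]:
        check_metric_aliases(p)
```

Running it over each converted metrics file would flag pairs like the local/remote latency metrics discussed above.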

@@ -241,13 +241,13 @@
},
{
"BriefDescription": "Memory read that miss the last level cache (LLC) addressed to local DRAM as a percentage of total memory read accesses, does not include LLC prefetches.",
"MetricExpr": "cha@UNC_CHA_TOR_INSERTS.IA_MISS\\,config1\\=0x40432@ / (cha@UNC_CHA_TOR_INSERTS.IA_MISS\\,config1\\=0x40432@ + cha@UNC_CHA_TOR_INSERTS.IA_MISS\\,config1\\=0x40431@)",
"MetricExpr": "numa_reads_addressed_to_remote_dram",
"MetricName": "numa_reads_addressed_to_local_dram",
Contributor

The same local aliased to remote issue appears here.

@@ -156,31 +156,31 @@
},
{
"BriefDescription": "Ratio of number of code read requests missing last level core cache (includes demand w/ prefetches) to the total number of completed instructions",
"MetricExpr": "(cbox@UNC_C_TOR_INSERTS.MISS_OPCODE\\,filter_opc\\=0x181@ + cbox@UNC_C_TOR_INSERTS.MISS_OPCODE\\,filter_opc\\=0x191@) / INST_RETIRED.ANY",
"MetricExpr": "llc_data_read_mpi_demand_plus_prefetch",
Contributor

This metric is written in terms of itself, i.e. it is recursive and would lock up if perf tried to evaluate it. There is a test in perf that detects this:

https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/metricgroup.c#n918

It may be worth copying the json here into a perf tree and running the tests to detect issues like this.
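For readers who don't want to dig into the C, the linked test essentially resolves metric references and rejects any metric whose expression loops back to itself. A simplified, illustrative Python version of that check (the real implementation lives in metricgroup.c; this tokenizer over-approximates what counts as a metric reference):

```python
import json
import re

# Metric names in these files are simple identifiers; this pattern is a
# rough approximation of how references inside MetricExpr are picked out.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*")

def find_recursive_metrics(path):
    """Report metrics whose expression refers, directly or transitively,
    back to the metric itself."""
    with open(path) as f:
        metrics = json.load(f)
    exprs = {m["MetricName"]: m.get("MetricExpr", "")
             for m in metrics if "MetricName" in m}

    def refers_to(name, target, seen):
        if name in seen:
            return False
        seen.add(name)
        for tok in IDENT.findall(exprs.get(name, "")):
            if tok == target:
                return True
            if tok in exprs and refers_to(tok, target, seen):
                return True
        return False

    return [n for n in exprs if refers_to(n, n, set())]
```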

Contributor Author

@captain5050, this is the metric llc_code_read_mpi_demand_plus_prefetch written in terms of llc_data_read_mpi_demand_plus_prefetch. They are two different metrics with very similar names.

But I think this also has the same issue as the SKX metrics. We will work on this one.

Contributor

Thanks, could we squash some (maybe all) of the changes? It is hard to review patches when there are fixes to those patches later in the series.

Contributor Author

@edwarddavidbaker, I guess I could make this PR compare against the TMA-4.8-Release branch instead of the main branch to make it easier to read? What's your suggestion? Thanks!

@edwarddavidbaker (Contributor) May 16, 2024

  • Do we need two pull requests?
  • Ian can comment with his preferences. I typically use interactive rebases (and force-push the branch) to incorporate feedback/tweaks. This in essence creates a new patch set and is the typical workflow on Gerrit. As a concrete example here, all of Caleb's commits would be squashed into a single 4.8 change. Your changes then update the output directory scripts/perf, and can be squashed into a single commit.

Contributor

With Gerrit you can keep the magic Change-Id tag, and Gerrit time-orders the changes uploaded with it. I'm not sure how this works on GitHub. My pain point is trying to review many changes that are too large to open on GitHub, with fixes then layered on top in different patches. This makes it hard to say whether an issue is or isn't still present without mentally squashing the changes.

Contributor Author

One or two PRs, either is fine. But I don't think I have write access to Caleb's branch, so I'm not sure how to push changes to that PR.

Contributor Author

@captain5050, now only the changes to the scripts/perf/ directory remain in this PR. Does this work better?

"MetricName": "llc_data_read_demand_plus_prefetch_miss_latency",
"ScaleUnit": "1ns"
},
{
"BriefDescription": "Average latency of a last level cache (LLC) demand and prefetch data read miss (read memory access) addressed to local memory in nano seconds",
"MetricExpr": "1e9 * (cha@UNC_CHA_TOR_OCCUPANCY.IA_MISS\\,config1\\=0x40432@ / cha@UNC_CHA_TOR_INSERTS.IA_MISS\\,config1\\=0x40432@) / (UNC_CHA_CLOCKTICKS / (source_count(UNC_CHA_CLOCKTICKS) * #num_packages)) * duration_time",
"MetricExpr": "llc_data_read_demand_plus_prefetch_miss_latency_for_remote_requests",
"MetricName": "llc_data_read_demand_plus_prefetch_miss_latency_for_local_requests",
Contributor

This looks like the same issue as on skylakex, which is fixed in a later commit: the local requests metric is aliased to remote requests. Could we make the fix on all architectures and squash the fix commit into this one, so that we don't need to spot bugs and then check for later fixes?

Contributor Author

@captain5050, this error turns out to be caused by the mixed commits in this PR. @edwarddavidbaker just helped me update this PR to a single squashed commit. Hopefully this makes it cleaner and easier to review.

@weilinwa force-pushed the perf_converted_json_tma4.8 branch 2 times, most recently from 5803bf7 to 3990a25 on May 21, 2024 21:48
@weilinwa (Contributor Author)

@captain5050, I put the files into the perf code and ran the perf tests. They passed most of the tests, including the PMU JSON event tests and the Sysfs PMU tests. But the "all metricgroups" test and the "all metrics" test failed at power metrics and at metrics like upi_data_transmit_bw because of missing PMUs or events. I'm not sure whether other metrics would also fail, because both tests end early at the failure. For reference, I saw similar test failures with the original perf pmu-events JSON files, so I suspect this is caused by my system settings.

@captain5050 (Contributor)

> @captain5050, I put the files into the perf code and ran the perf tests. They passed most of the tests, including the PMU JSON event tests and the Sysfs PMU tests. But the "all metricgroups" test and the "all metrics" test failed at power metrics and at metrics like upi_data_transmit_bw because of missing PMUs or events. I'm not sure whether other metrics would also fail, because both tests end early at the failure. For reference, I saw similar test failures with the original perf pmu-events JSON files, so I suspect this is caused by my system settings.

We should be good. IIRC those metrics aren't using JSON events, so this points to a sysfs issue on your test machine.

@@ -117,7 +117,7 @@
"MetricExpr": "cpu_atom@TOPDOWN_BE_BOUND.ALLOC_RESTRICTIONS@ / tma_info_core_slots",
"MetricGroup": "TopdownL3;tma_L3_group;tma_resource_bound_group",
"MetricName": "tma_alloc_restriction",
"MetricThreshold": "tma_alloc_restriction > 0.1",
"MetricThreshold": "tma_alloc_restriction > 0.1 & (tma_resource_bound > 0.2 & tma_backend_bound_aux > 0.2)",
Contributor

Just to note that there are a lot of improved thresholds, but we may get multiplexing issues because of the greater number of metrics/events.

@@ -684,28 +712,28 @@
"PublicDescription": "Branch Misprediction Cost: Fraction of TMA slots wasted per non-speculative branch misprediction (retired JEClear). Related metrics: tma_branch_mispredicts, tma_info_bottleneck_mispredictions, tma_mispredicts_resteers"
},
{
"BriefDescription": "Instructions per retired mispredicts for conditional non-taken branches (lower number means higher occurrence rate).",
"BriefDescription": "Instructions per retired Mispredicts for conditional non-taken branches (lower number means higher occurrence rate).",
Contributor

Nit: the case change seems unnecessary/worse, leading to a large number of changes.

@captain5050 (Contributor) left a comment

There are some seemingly unnecessary case changes, and changes for metric groups where names like BvML aren't explained. That comes from the source spreadsheet. The conversion looks good.

@weilinwa force-pushed the perf_converted_json_tma4.8 branch from 3990a25 to 4d86844 on May 23, 2024 22:39
There is a set of new metric groups added in this release. They map to the
related bottleneck metrics as shown below:

BvMP: tma_info_bottleneck_mispredictions
BvBC: tma_info_bottleneck_big_code
BvFB: tma_info_bottleneck_instruction_fetch_bandwidth
BvMB: tma_info_bottleneck_cache_memory_bandwidth
BvML: tma_info_bottleneck_cache_memory_latency
BvMT: tma_info_bottleneck_memory_data_tlbs
BvMS: tma_info_bottleneck_memory_synchronization
BvCB: tma_info_bottleneck_compute_bound_est
BvIO: tma_info_bottleneck_irregular_overhead
BvOB: tma_info_bottleneck_other_bottlenecks
BvBO: tma_info_bottleneck_branching_overhead
BvUW: tma_info_bottleneck_useful_work
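For convenience, the same mapping as a Python dictionary (taken verbatim from the list above), e.g. for tooling that needs to expand the abbreviations:

```python
# New bottleneck-view (Bv*) metric groups and the bottleneck metric each maps to.
BOTTLENECK_GROUPS = {
    "BvMP": "tma_info_bottleneck_mispredictions",
    "BvBC": "tma_info_bottleneck_big_code",
    "BvFB": "tma_info_bottleneck_instruction_fetch_bandwidth",
    "BvMB": "tma_info_bottleneck_cache_memory_bandwidth",
    "BvML": "tma_info_bottleneck_cache_memory_latency",
    "BvMT": "tma_info_bottleneck_memory_data_tlbs",
    "BvMS": "tma_info_bottleneck_memory_synchronization",
    "BvCB": "tma_info_bottleneck_compute_bound_est",
    "BvIO": "tma_info_bottleneck_irregular_overhead",
    "BvOB": "tma_info_bottleneck_other_bottlenecks",
    "BvBO": "tma_info_bottleneck_branching_overhead",
    "BvUW": "tma_info_bottleneck_useful_work",
}
```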
@weilinwa force-pushed the perf_converted_json_tma4.8 branch from 4d86844 to 59194d4 on May 23, 2024 23:30
@weilinwa (Contributor Author)

> There are some seemingly unnecessary case changes, and changes for metric groups where names like BvML aren't explained. That comes from the source spreadsheet. The conversion looks good.

@captain5050, we've updated the commit message to add explanations of the new metric groups and reverted the case changes. Please take a look at the updates. Thanks!

@edwarddavidbaker (Contributor) left a comment

Thanks Ian and Weilin!

@edwarddavidbaker merged commit f74babd into intel:main on May 28, 2024
3 checks passed
@weilinwa deleted the perf_converted_json_tma4.8 branch on May 29, 2024 20:18
4 participants