From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 37F4FC3DA49 for ; Tue, 16 Jul 2024 12:53:04 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C3D076B00AE; Tue, 16 Jul 2024 08:53:03 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B780A6B00B0; Tue, 16 Jul 2024 08:53:03 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9CA786B00B1; Tue, 16 Jul 2024 08:53:03 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 7BF336B00AE for ; Tue, 16 Jul 2024 08:53:03 -0400 (EDT) Received: from smtpin01.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 198BD14052A for ; Tue, 16 Jul 2024 12:53:03 +0000 (UTC) X-FDA: 82345605846.01.682702D Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by imf28.hostedemail.com (Postfix) with ESMTP id 743BAC001F for ; Tue, 16 Jul 2024 12:53:00 +0000 (UTC) Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=Ce2de+e6; spf=none (imf28.hostedemail.com: domain of kan.liang@linux.intel.com has no SPF policy when checking 198.175.65.18) smtp.mailfrom=kan.liang@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1721134362; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Pp/mw/2+hG0RAExBsOg/iC62NqhepPy3BBhDrMot9IA=; b=3Assg8p3GyFn7SIzH8KKP31+zJtygVOZOg/C+Im5aU59PMeUd8yuuiosqReCcBPby8JY7R ON0cqehRNYn8JfAq/ZDNbynUf+BK1sjByzsM5pmFGuwRgBY46zwORjifscAfByAhge2R8I fxLjzG3suJmbpiI/tV+mR/hxCAeudqk= ARC-Authentication-Results: i=1; imf28.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=Ce2de+e6; spf=none (imf28.hostedemail.com: domain of kan.liang@linux.intel.com has no SPF policy when checking 198.175.65.18) smtp.mailfrom=kan.liang@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1721134362; a=rsa-sha256; cv=none; b=YUM6KJHV0+mX7Bo1mlJH0AxvpNx1y/8/KkY2FJgYRkUzhR8bNbP2ATtj0EOcuoETj67cfp MtDmislWb2dsrdq3Drw6991j8Dl1d9/yq0GY14Vwt/zaQbRS1pl+63FxTA2hrBmw5SYdip fnPFDvEGu4rAsSBGNUWl8lgMiGFAB9E= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1721134381; x=1752670381; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=SSgUscryii1clarjrDEuWe0oAR9tIv+UZaWDyL8CiQM=; b=Ce2de+e6uUWJMcbkoqYtjAjXuT1GVUEl3K848TX6xcDm0tba6jfk/6jo wMxi94f722EJiwCDgHtD9v7FEdI7Tm54H40d0H3WN93+i4EvRHXT0wzg2 Pyce6oB3fKmOJWCTJDah9H3TwAoKHxt4tsrpwgfoHjvA8jOoQEreAKB7Z YNV1Kp8ExwUVnUjXSrjC1bK7iCrCAEJRgHsK1klBw0z9BZotNcd+l0qQN 6LhqyGfJlVGwqHLJ4hu0lENeXMRhH+7dfqBBU8s8jTFu45Wsv2FZ4cgqz lAGlngXl2zO/HtiyTBWf8DeA4XulZ9/I5XUOwbLWcUs3OZkBkrvisfcII A==; X-CSE-ConnectionGUID: 1URAGcXbTCu9lwgA2LGynA== X-CSE-MsgGUID: jRCW7ZhAQCicFuHuNrgvtg== X-IronPort-AV: E=McAfee;i="6700,10204,11134"; a="18700720" X-IronPort-AV: E=Sophos;i="6.09,211,1716274800"; d="scan'208";a="18700720" Received: from orviesa009.jf.intel.com ([10.64.159.149]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Jul 2024 05:52:59 -0700 X-CSE-ConnectionGUID: CyQTy01RQ4S4qOmeiA/5sA== X-CSE-MsgGUID: 9koCZ61vTuW6piZp7MQsJg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,211,1716274800"; d="scan'208";a="50062034" Received: from linux.intel.com ([10.54.29.200]) by orviesa009.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Jul 2024 05:52:59 -0700 Received: from [10.213.174.231] (kliang2-mobl1.ccr.corp.intel.com [10.213.174.231]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by linux.intel.com (Postfix) with ESMTPS id 4883E20B8CD6; Tue, 16 Jul 2024 05:52:57 -0700 (PDT) Message-ID: <7491af6f-7be8-4a7a-8f48-249d26440bbc@linux.intel.com> Date: Tue, 16 Jul 2024 08:52:55 -0400 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [linux-next:master] [perf vendor events] e2641db83f: perf-sanity-tests.perf_all_PMU_test.fail To: Ian Rogers Cc: kernel test robot , oe-lkp@lists.linux.dev, lkp@intel.com, Linux Memory Management List , Namhyung Kim , Weilin Wang , Caleb Biggers , Alexandre Torgue , Maxime Coquelin , linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org References: <202407101021.2c8baddb-oliver.sang@intel.com> <48b4bfb0-d0c7-4d1f-9e52-06e873646366@linux.intel.com> Content-Language: en-US From: "Liang, Kan" In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Rspam-User: X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 743BAC001F X-Stat-Signature: iogkc5uuww9aedgshgm7pot9135e8oek X-HE-Tag: 1721134380-923575 X-HE-Meta: U2FsdGVkX19JcU9HAdubbvW2AWKB319krBkC2hF86NziB+UxTB1N5Tzrw12BsShvbDnTNiWTi+Wp8yqp1QoGqjE1V3JCiLgUxHO8qQs3Zak8C4+PCkbVXTviFQSeO87wE3T7rpd02hyqmdFnCjLlVSkovONpEh99v/QVyIM9EanPNHqbwm9jfDQZOG+JGj9f+1bf/jjRQP8rV+iaGjw8QpzAhFXTsgXk/H3kQU5Uotp7EfKZXukPgUl1aJ+S0vBR2aD9poI6MZV0JpTDwcrh2kWh9ycIHPr8jYikeyXKhoGviNmvRd3SY7jGNPELb31783UcVU9Iuz8oPXmWr0Rzk6FejWoO8Y9sc56hmzlcPh6BuVmKNaCcw5z4KAoE4hiqcHCK03oepYwC82XYNLNkDHikWQPPfwHrcfCeqWbJ+4FdWB2GPzDJ618NEPOYql5zp6v1KHcmYDYumqL2D+BP3ztPKJWJUYs2x+V55DoZL//vMNzyDptKql3R/SozAVlYXr/bmfe43tL4iHdTRjBmbrAC59CE7Bduo9lxRqF5FmfPDEZ1Zl7QwKUJYsUMm6fAAJ68mYjzlF5DXXtPPOPDzpAnUTlmLa35IY6MtnQlezpMOQaxtL6FReHuJ9wTKlSP7FxprRZeTBb6XdEI7uw34CPrdRXExjnq8kyLeVe4vxyHD6DawTdzyIL8g/3cQBFW/TviTDJvY2nONLUVcLvKVgrWzm/1uU9x9q9oe4kj2cMvGUg375QqmQORbQyO2hIkg2g15/3iVLCnGMlkAyJycvIgPSrPaNaIVWMvRgPu7ZRc/jjwoezr4SPM0TjG4VDG0AXRjeWU/h0kXL7aeogpLG5EJN7P+YOqsqJq1cuISDCSLJIZiH9jLyCBDDHjuO+rVJmT2RayO8Yac91x9b4OgnIjtZos+Ok7vE6kTlYj5ErRF4vdoez8jmN2+HMQqhvG7Lu7zFTZLqdpCUmq9kT 5AQQrFz0 07xGU2JbG5zYL4aQbJ5VTjT1l8ajztez1qmjdL2HACxzRJWhPffJbUvGDqE5nmdQ5FACIPqir3TLv1vLpfFjssC6k/c4+LrXc2lrrLaCymA/qC6xs13hRaXqRVdPEga7Ahb2NWWgfQIYWEv4EdgO/qUA5Lfd6FPmsqQIM3gzV7U94OxR6JZ2/B0+a0PE1ivGbQXfZ0LVWO6k/Tr6BzzGFGMCsvcwRywl6VELkqhG8urTEJoNzBKX8th/w4aLREkYLVoCws+0w0B58AGxx3oOt32yXXApFupkDQwRrCxuckJyfz/IlT6tiQw3Xw/MGoUCEKH5KEv+LDgdXg7hIPv6Nr60JpRSOciKujNKm0PakKNErIcLpd+oTf7M5JzxewH8+HXR9OJCuU3xn95v+ywMQzMJk/WMUskOeY1PO X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 2024-07-15 5:48 p.m., Ian Rogers wrote: > On Mon, Jul 15, 2024 at 2:41 PM Liang, Kan wrote: >> >> >> >> On 2024-07-15 4:11 p.m., Ian Rogers wrote: >>> On Mon, Jul 15, 2024 at 1:05 PM Liang, Kan wrote: >>>> >>>> Hi Ian, >>>> >>>> On 2024-07-10 12:59 a.m., kernel test robot wrote: >>>>> >>>>> >>>>> Hello, >>>>> >>>>> kernel test robot noticed "perf-sanity-tests.perf_all_PMU_test.fail" on: >>>>> >>>>> commit: e2641db83f18782f57a0e107c50d2d1731960fb8 ("perf vendor events: Add/update skylake events/metrics") >>>>> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master >>>>> >>>>> [test failed on linux-next/master 82d01fe6ee52086035b201cfa1410a3b04384257] >>>>> >>>>> in testcase: perf-sanity-tests >>>>> version: >>>>> with following parameters: >>>>> >>>>> perf_compiler: gcc >>>>> >>>>> >>>>> >>>>> compiler: gcc-13 >>>>> test machine: 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory >>>>> >>>>> (please refer to attached dmesg/kmsg for entire log/backtrace) >>>>> >>>>> >>>>> we also observed two cases which also failed on parent can pass on this commit. >>>>> FYI. >>>>> >>>>> >>>>> caccae3ce7b988b6 e2641db83f18782f57a0e107c50 >>>>> ---------------- --------------------------- >>>>> fail:runs %reproduction fail:runs >>>>> | | | >>>>> :6 100% 6:6 perf-sanity-tests.perf_all_PMU_test.fail >>>>> :6 100% 6:6 perf-sanity-tests.perf_all_metricgroups_test.pass >>>>> :6 100% 6:6 perf-sanity-tests.perf_all_metrics_test.pass >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> If you fix the issue in a separate patch/commit (i.e. not just a new version of >>>>> the same patch/commit), kindly add following tags >>>>> | Reported-by: kernel test robot >>>>> | Closes: https://lore.kernel.org/oe-lkp/202407101021.2c8baddb-oliver.sang@intel.com >>>>> >>>>> >>>>> >>>>> 2024-07-09 07:09:53 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-e2641db83f18782f57a0e107c50d2d1731960fb8/tools/perf/perf test 105 >>>>> 105: perf all metricgroups test : Ok >>>>> 2024-07-09 07:10:11 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-e2641db83f18782f57a0e107c50d2d1731960fb8/tools/perf/perf test 106 >>>>> 106: perf all metrics test : Ok >>>>> 2024-07-09 07:10:23 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-e2641db83f18782f57a0e107c50d2d1731960fb8/tools/perf/perf test 107 >>>>> 107: perf all libpfm4 events test : Ok >>>>> 2024-07-09 07:10:47 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-e2641db83f18782f57a0e107c50d2d1731960fb8/tools/perf/perf test 108 >>>>> 108: perf all PMU test : FAILED! >>>>> >>>> >>>> The failure is caused by the below change in the e2641db83f18. >>>> >>>> + { >>>> + "BriefDescription": "This 48-bit fixed counter counts the UCLK >>>> cycles", >>>> + "Counter": "FIXED", >>>> + "EventCode": "0xff", >>>> + "EventName": "UNC_CLOCK.SOCKET", >>>> + "PerPkg": "1", >>>> + "PublicDescription": "This 48-bit fixed counter counts the UCLK >>>> cycles.", >>>> + "Unit": "cbox_0" >>>> } >>>> >>>> The other cbox events have the unit name "CBOX", while the fixed counter >>>> has a unit name "cbox_0". So the events_table will maintain separate >>>> entries for cbox and cbox_0. >>>> >>>> The perf_pmus__print_pmu_events() calculate the total number of events, >>>> allocate an aliases buffer, store all the events into the buffer, sort, >>>> and print all the aliases one by one. >>>> >>>> The problem is that the calculated total number of events doesn't match >>>> the stored events on the SKL machine. >>>> >>>> The perf_pmu__num_events() is used to calculate the number of events. It >>>> invokes the pmu_events_table__num_events() to go through the entire >>>> events_table to find all events. Because of the >>>> pmu_uncore_alias_match(), the suffix of uncore PMU will be ignored. So >>>> the events for cbox and cbox_0 are all counted. >>>> >>>> When storing events into the aliases buffer, the >>>> perf_pmu__for_each_event() only process the events for cbox. >>>> >>>> Since a bigger buffer was allocated, the last entry are all 0. >>>> When printing all the aliases, null will be outputed. >>>> >>>> $ perf list pmu >>>> >>>> List of pre-defined events (to be used in -e or -M): >>>> >>>> (null) [Kernel PMU event] >>>> branch-instructions OR cpu/branch-instructions/ [Kernel PMU event] >>>> branch-misses OR cpu/branch-misses/ [Kernel PMU event] >>>> >>>> >>>> I'm thinking of two ways to address it. >>>> One is to only print all the stored events. The below patch can fix it. >>>> >>>> diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c >>>> index 3fcabfd8fca1..2b2f5117ff84 100644 >>>> --- a/tools/perf/util/pmus.c >>>> +++ b/tools/perf/util/pmus.c >>>> @@ -485,6 +485,7 @@ void perf_pmus__print_pmu_events(const struct >>>> print_callbacks *print_cb, void *p >>>> perf_pmu__for_each_event(pmu, skip_duplicate_pmus, &state, >>>> perf_pmus__print_pmu_events__callback); >>>> } >>>> + len = state.index; >>>> qsort(aliases, len, sizeof(struct sevent), cmp_sevent); >>>> for (int j = 0; j < len; j++) { >>>> /* Skip duplicates */ >>>> >>>> The only drawback is that perf list will not show the new cbox_0 event. >>>> (But the event name still works. Users can still apply perf stat -e >>>> unc_clock.socket.) >>>> >>>> Since the cbox_0 event is only available on old machines (SKL and >>>> earlier), people should already use the equivalent kernel event. It >>>> doesn't sounds a big issue for me. I prefer this simple fix. >>>> >>>> I think the other way would be to modify the perf_pmu__for_each_event() >>>> to go through all the possible PMUs. >>>> It seems complicated and may impact others ARCHs (e.g., S390). I haven't >>>> tried it yet. >>>> >>>> What do you think? >>>> Do you see any other ways to address the issue? >>> >>> Ugh. It seems the sizing and then iterating approach is just prone to >>> keep breaking. Perhaps we can switch to realloc-ed arrays to avoid the >>> need for perf_pmu__num_events, which seems to be the source of the >>> problems. >>> >> >> I think a realloc-ed array should have the same drawback as the first >> way, but bad performance. >> Because the pmu_add_cpu_aliases() in the perf_pmu__for_each_event() only >> add the events from the first matched PMU. If we don't fix it, the >> UNC_CLOCK.SOCKET of cbox_0 will never be displayed. > > Ok, but I don't think we need to optimize `perf list` for speed. Fwiw, > I think this was the fix for the last bug in this code: > https://lore.kernel.org/r/20240511003601.2666907-1-irogers@google.com That seems a different issue. The root cause of this issue should be that different methods are used between calculating the total number of events and adding the events. A realloc-ed array should just hide the problem. There will still be an issue if anyone use the pair of perf_pmu__num_events() and pmu_add_cpu_aliases() somewhere else later. It looks like the difference is introduced from the commit e3edd6cf6399 ("perf pmu-events: Reduce processed events by passing PMU"). The pmu_events_table__for_each_event() stops immediately once a pmu is set. But for uncore, especially this case, the method is wrong and mismatch what perf does in the perf_pmu__num_events(). The below patch can fix it. diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py index ac9b7ca41856..97a3b018f865 100755 --- a/tools/perf/pmu-events/jevents.py +++ b/tools/perf/pmu-events/jevents.py @@ -923,7 +923,7 @@ int pmu_events_table__for_each_event(const struct pmu_events_table *table, continue; ret = pmu_events_table__for_each_event_pmu(table, table_pmu, fn, data); - if (pmu || ret) + if (ret) return ret; } return 0; The event can be displayed by the perf list pmu with the patch. $ perf list pmu | grep -A 1 clock.socket unc_clock.socket [This 48-bit fixed counter counts the UCLK cycles. Unit: uncore_cbox_0 & perf test "perf all PMU test" 107: perf all PMU test : Ok I will send the patch shortly. Thanks, Kan