Date: Mon, 15 Jul 2024 16:05:27 -0400
Subject: Re: [linux-next:master] [perf vendor events] e2641db83f: perf-sanity-tests.perf_all_PMU_test.fail
From: "Liang, Kan"
To: kernel test robot, Ian Rogers
Cc: oe-lkp@lists.linux.dev, lkp@intel.com, Linux Memory Management List, Namhyung Kim, Weilin Wang, Caleb Biggers, Alexandre Torgue, Maxime Coquelin, linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
References: <202407101021.2c8baddb-oliver.sang@intel.com>
In-Reply-To: <202407101021.2c8baddb-oliver.sang@intel.com>

Hi Ian,

On 2024-07-10 12:59 a.m., kernel test robot wrote:
>
> Hello,
>
> kernel test robot noticed "perf-sanity-tests.perf_all_PMU_test.fail" on:
>
> commit: e2641db83f18782f57a0e107c50d2d1731960fb8 ("perf vendor events: Add/update skylake events/metrics")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
>
> [test failed on linux-next/master 82d01fe6ee52086035b201cfa1410a3b04384257]
>
> in testcase: perf-sanity-tests
> version:
> with following parameters:
>
> 	perf_compiler: gcc
>
> compiler: gcc-13
> test machine: 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
> we also observed two cases which also failed on parent can pass on this commit.
> FYI.
>
> caccae3ce7b988b6 e2641db83f18782f57a0e107c50
> ---------------- ---------------------------
>        fail:runs  %reproduction    fail:runs
>            |             |             |
>            :6          100%           6:6     perf-sanity-tests.perf_all_PMU_test.fail
>            :6          100%           6:6     perf-sanity-tests.perf_all_metricgroups_test.pass
>            :6          100%           6:6     perf-sanity-tests.perf_all_metrics_test.pass
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot
> | Closes: https://lore.kernel.org/oe-lkp/202407101021.2c8baddb-oliver.sang@intel.com
>
> 2024-07-09 07:09:53 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-e2641db83f18782f57a0e107c50d2d1731960fb8/tools/perf/perf test 105
> 105: perf all metricgroups test : Ok
> 2024-07-09 07:10:11 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-e2641db83f18782f57a0e107c50d2d1731960fb8/tools/perf/perf test 106
> 106: perf all metrics test : Ok
> 2024-07-09 07:10:23 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-e2641db83f18782f57a0e107c50d2d1731960fb8/tools/perf/perf test 107
> 107: perf all libpfm4 events test : Ok
> 2024-07-09 07:10:47 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-e2641db83f18782f57a0e107c50d2d1731960fb8/tools/perf/perf test 108
> 108: perf all PMU test : FAILED!
>

The failure is caused by the below change in the e2641db83f18.

+    {
+        "BriefDescription": "This 48-bit fixed counter counts the UCLK cycles",
+        "Counter": "FIXED",
+        "EventCode": "0xff",
+        "EventName": "UNC_CLOCK.SOCKET",
+        "PerPkg": "1",
+        "PublicDescription": "This 48-bit fixed counter counts the UCLK cycles.",
+        "Unit": "cbox_0"
     }

The other cbox events have the unit name "CBOX", while the fixed counter
has the unit name "cbox_0". So the events_table maintains separate
entries for cbox and cbox_0.

perf_pmus__print_pmu_events() calculates the total number of events,
allocates an aliases buffer, stores all the events into the buffer,
sorts them, and prints the aliases one by one.

The problem is that the calculated total number of events doesn't match
the number of events actually stored on the SKL machine.

perf_pmu__num_events() is used to calculate the number of events. It
invokes pmu_events_table__num_events() to go through the entire
events_table to find all events. Because of pmu_uncore_alias_match(),
the suffix of the uncore PMU is ignored, so the events for both cbox and
cbox_0 are counted.

When storing events into the aliases buffer, perf_pmu__for_each_event()
only processes the events for cbox.

Since a bigger buffer was allocated than was filled, the leftover entries
at the end of the buffer are all 0. When the aliases are printed, (null)
is output.

$ perf list pmu

List of pre-defined events (to be used in -e or -M):

  (null)                                             [Kernel PMU event]
  branch-instructions OR cpu/branch-instructions/    [Kernel PMU event]
  branch-misses OR cpu/branch-misses/                [Kernel PMU event]

I'm thinking of two ways to address it.

One is to print only the stored events. The below patch can fix it.

diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
index 3fcabfd8fca1..2b2f5117ff84 100644
--- a/tools/perf/util/pmus.c
+++ b/tools/perf/util/pmus.c
@@ -485,6 +485,7 @@ void perf_pmus__print_pmu_events(const struct print_callbacks *print_cb, void *p
 		perf_pmu__for_each_event(pmu, skip_duplicate_pmus, &state,
 					 perf_pmus__print_pmu_events__callback);
 	}
+	len = state.index;
 	qsort(aliases, len, sizeof(struct sevent), cmp_sevent);
 	for (int j = 0; j < len; j++) {
 		/* Skip duplicates */

The only drawback is that perf list will not show the new cbox_0 event.
(But the event name still works. Users can still use
perf stat -e unc_clock.socket.)

Since the cbox_0 event is only available on old machines (SKL and
earlier), people should already be using the equivalent kernel event.
It doesn't sound like a big issue to me. I prefer this simple fix.

The other way, I think, would be to modify perf_pmu__for_each_event() to
go through all the possible PMUs. It seems complicated and may impact
other ARCHs (e.g., S390). I haven't tried it yet.

What do you think?
Do you see any other ways to address the issue?

Thanks,
Kan
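
For illustration only, the count/store mismatch described above can be modeled in a few lines of standalone C. This is a toy sketch, not perf code: the table, the toy_* function names, and the placeholder event names event_a/event_b are all hypothetical, and the suffix handling is a simplified stand-in for what pmu_uncore_alias_match() does. It assumes counting uses a loose match that ignores a trailing "_<digits>" suffix, while storing uses an exact unit match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one events_table entry. */
struct toy_event {
	const char *unit;	/* PMU unit name from the JSON "Unit" field */
	const char *name;	/* event name */
};

/* Two ordinary cbox events plus the fixed counter filed under cbox_0. */
static const struct toy_event toy_table[] = {
	{ "cbox",   "event_a" },		/* placeholder name */
	{ "cbox",   "event_b" },		/* placeholder name */
	{ "cbox_0", "unc_clock.socket" },
};

#define TOY_TABLE_LEN (sizeof(toy_table) / sizeof(toy_table[0]))

/* Length of a unit name with any trailing "_<digits>" suffix removed. */
size_t toy_base_len(const char *s)
{
	size_t n = strlen(s);
	size_t i = n;

	while (i > 0 && s[i - 1] >= '0' && s[i - 1] <= '9')
		i--;
	if (i < n && i > 0 && s[i - 1] == '_')
		return i - 1;
	return n;
}

/* Loose match: compare unit names with the numeric suffix ignored. */
int toy_loose_match(const char *a, const char *b)
{
	size_t la = toy_base_len(a);
	size_t lb = toy_base_len(b);

	return la == lb && strncmp(a, b, la) == 0;
}

/* Counting pass: uses the loose match, so cbox_0 events count for cbox. */
size_t toy_count_events(const char *pmu)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_TABLE_LEN; i++)
		if (toy_loose_match(toy_table[i].unit, pmu))
			n++;
	return n;
}

/* Storing pass: exact unit match only; returns how many slots were filled. */
size_t toy_store_events(const char *pmu, const char **buf, size_t cap)
{
	size_t idx = 0;

	for (size_t i = 0; i < TOY_TABLE_LEN && idx < cap; i++)
		if (strcmp(toy_table[i].unit, pmu) == 0)
			buf[idx++] = toy_table[i].name;
	return idx;
}

/* Number of slots left NULL when the buffer is sized by the loose count. */
size_t toy_null_slots(const char *pmu)
{
	const char *buf[TOY_TABLE_LEN] = { NULL };
	size_t counted = toy_count_events(pmu);
	size_t stored = toy_store_events(pmu, buf, TOY_TABLE_LEN);
	size_t nulls = 0;

	assert(stored <= counted);
	for (size_t i = 0; i < counted; i++)
		if (!buf[i])
			nulls++;
	return nulls;
}
```

In this model, iterating up to the counted length visits one zeroed slot for "cbox", which corresponds to the (null) line in the perf list output. Iterating up to the value returned by toy_store_events() instead, analogous to len = state.index in the patch, never touches an unfilled slot.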