From: Ian Rogers
Date: Mon, 15 Jul 2024 13:11:01 -0700
Subject: Re: [linux-next:master] [perf vendor events] e2641db83f: perf-sanity-tests.perf_all_PMU_test.fail
To: "Liang, Kan"
Cc: kernel test robot, oe-lkp@lists.linux.dev, lkp@intel.com, Linux Memory Management List, Namhyung Kim, Weilin Wang, Caleb Biggers, Alexandre Torgue, Maxime Coquelin, linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
References: <202407101021.2c8baddb-oliver.sang@intel.com>

On Mon, Jul 15, 2024 at 1:05 PM Liang, Kan wrote:
>
> Hi Ian,
>
> On 2024-07-10 12:59 a.m., kernel test robot wrote:
> >
> >
> > Hello,
> >
> > kernel test robot noticed "perf-sanity-tests.perf_all_PMU_test.fail" on:
> >
> > commit: e2641db83f18782f57a0e107c50d2d1731960fb8 ("perf vendor events: Add/update skylake events/metrics")
> > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> >
> > [test failed on linux-next/master 82d01fe6ee52086035b201cfa1410a3b04384257]
> >
> > in testcase: perf-sanity-tests
> > version:
> > with following parameters:
> >
> >       perf_compiler: gcc
> >
> >
> >
> > compiler: gcc-13
> > test machine: 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory
> >
> > (please refer to attached dmesg/kmsg for entire log/backtrace)
> >
> >
> > we also observed two cases which also failed on parent can pass on this commit.
> > FYI.
> >
> >
> > caccae3ce7b988b6 e2641db83f18782f57a0e107c50
> > ---------------- ---------------------------
> >        fail:runs  %reproduction    fail:runs
> >            |             |             |
> >            :6          100%           6:6     perf-sanity-tests.perf_all_PMU_test.fail
> >            :6          100%           6:6     perf-sanity-tests.perf_all_metricgroups_test.pass
> >            :6          100%           6:6     perf-sanity-tests.perf_all_metrics_test.pass
> >
> >
> >
> >
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot
> > | Closes: https://lore.kernel.org/oe-lkp/202407101021.2c8baddb-oliver.sang@intel.com
> >
> >
> >
> > 2024-07-09 07:09:53 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-e2641db83f18782f57a0e107c50d2d1731960fb8/tools/perf/perf test 105
> > 105: perf all metricgroups test                                      : Ok
> > 2024-07-09 07:10:11 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-e2641db83f18782f57a0e107c50d2d1731960fb8/tools/perf/perf test 106
> > 106: perf all metrics test                                           : Ok
> > 2024-07-09 07:10:23 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-e2641db83f18782f57a0e107c50d2d1731960fb8/tools/perf/perf test 107
> > 107: perf all libpfm4 events test                                    : Ok
> > 2024-07-09 07:10:47 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-e2641db83f18782f57a0e107c50d2d1731960fb8/tools/perf/perf test 108
> > 108: perf all PMU test                                               : FAILED!
> >
>
> The failure is caused by the below change in the e2641db83f18.
>
> +    {
> +        "BriefDescription": "This 48-bit fixed counter counts the UCLK cycles",
> +        "Counter": "FIXED",
> +        "EventCode": "0xff",
> +        "EventName": "UNC_CLOCK.SOCKET",
> +        "PerPkg": "1",
> +        "PublicDescription": "This 48-bit fixed counter counts the UCLK cycles.",
> +        "Unit": "cbox_0"
>      }
>
> The other cbox events have the unit name "CBOX", while the fixed counter
> has the unit name "cbox_0". So the events_table will maintain separate
> entries for cbox and cbox_0.
>
> perf_pmus__print_pmu_events() calculates the total number of events,
> allocates an aliases buffer, stores all the events into the buffer,
> sorts them, and prints all the aliases one by one.
>
> The problem is that the calculated total number of events doesn't match
> the number of stored events on the SKL machine.
>
> perf_pmu__num_events() is used to calculate the number of events. It
> invokes pmu_events_table__num_events() to go through the entire
> events_table to find all events. Because of
> pmu_uncore_alias_match(), the suffix of an uncore PMU is ignored. So
> the events for cbox and cbox_0 are all counted.
>
> When storing events into the aliases buffer,
> perf_pmu__for_each_event() only processes the events for cbox.
>
> Since a bigger buffer was allocated, the unused entries are all 0.
> When printing all the aliases, null is output for them.
>
> $ perf list pmu
>
> List of pre-defined events (to be used in -e or -M):
>
>   (null)                                             [Kernel PMU event]
>   branch-instructions OR cpu/branch-instructions/    [Kernel PMU event]
>   branch-misses OR cpu/branch-misses/                [Kernel PMU event]
>
>
> I'm thinking of two ways to address it.
> One is to only print the stored events. The below patch can fix it.
>
> diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
> index 3fcabfd8fca1..2b2f5117ff84 100644
> --- a/tools/perf/util/pmus.c
> +++ b/tools/perf/util/pmus.c
> @@ -485,6 +485,7 @@ void perf_pmus__print_pmu_events(const struct print_callbacks *print_cb, void *p
>  		perf_pmu__for_each_event(pmu, skip_duplicate_pmus, &state,
>  					 perf_pmus__print_pmu_events__callback);
>  	}
> +	len = state.index;
>  	qsort(aliases, len, sizeof(struct sevent), cmp_sevent);
>  	for (int j = 0; j < len; j++) {
>  		/* Skip duplicates */
>
> The only drawback is that perf list will not show the new cbox_0 event.
> (But the event name still works. Users can still use perf stat -e
> unc_clock.socket.)
>
> Since the cbox_0 event is only available on old machines (SKL and
> earlier), people should already be using the equivalent kernel event. It
> doesn't sound like a big issue to me. I prefer this simple fix.
>
> I think the other way would be to modify perf_pmu__for_each_event()
> to go through all the possible PMUs.
> It seems complicated and may impact other ARCHs (e.g., S390). I haven't
> tried it yet.
>
> What do you think?
> Do you see any other ways to address the issue?

Ugh. It seems the size-first-then-iterate approach is prone to keep
breaking. Perhaps we can switch to realloc-ed arrays to avoid the need
for perf_pmu__num_events, which seems to be the source of the problems.

Thanks,
Ian
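
A rough illustration of the realloc-ed array idea, not code from the
perf tree: the sketch below grows the buffer as events are stored, so
the count used for sorting and printing is always the number of entries
actually written and no separate sizing pass is needed. The names
struct sevent_sketch, struct sevent_vec and the sevent_vec__*() helpers
are hypothetical, and the real struct sevent carries more than a name.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for perf's struct sevent; only a name is kept. */
struct sevent_sketch {
	const char *name;
};

/* Growable array: nr counts stored entries, cap is the allocated capacity. */
struct sevent_vec {
	struct sevent_sketch *entries;
	size_t nr;
	size_t cap;
};

/* Append one entry, doubling the capacity when full. Returns 0, or -1 on ENOMEM. */
static int sevent_vec__push(struct sevent_vec *vec, const char *name)
{
	if (vec->nr == vec->cap) {
		size_t new_cap = vec->cap ? vec->cap * 2 : 4;
		struct sevent_sketch *tmp;

		tmp = realloc(vec->entries, new_cap * sizeof(*tmp));
		if (!tmp)
			return -1; /* the old buffer is still valid and owned by vec */
		vec->entries = tmp;
		vec->cap = new_cap;
	}
	vec->entries[vec->nr++].name = name;
	return 0;
}

/* Order entries by name, in the spirit of cmp_sevent. */
static int cmp_sevent_sketch(const void *a, const void *b)
{
	const struct sevent_sketch *as = a;
	const struct sevent_sketch *bs = b;

	return strcmp(as->name, bs->name);
}

/* Sort exactly the stored entries; unused capacity is never touched. */
static void sevent_vec__sort(struct sevent_vec *vec)
{
	if (vec->nr > 1)
		qsort(vec->entries, vec->nr, sizeof(*vec->entries), cmp_sevent_sketch);
}

/* Release the buffer and reset the vector to its empty state. */
static void sevent_vec__exit(struct sevent_vec *vec)
{
	free(vec->entries);
	vec->entries = NULL;
	vec->nr = 0;
	vec->cap = 0;
}
```

With something along these lines, the callback handed to
perf_pmu__for_each_event() would presumably push each event, and the
caller would sort and print vec.nr entries, so a counting mismatch like
the cbox/cbox_0 one could not leave zeroed slots in the output.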