From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yosry Ahmed <yosryahmed@google.com>
Date: Thu, 12 Oct 2023 01:01:55 -0700
Subject: Re: [PATCH v2 3/5] mm: memcg: make stats flushing threshold per-memcg
To: Shakeel Butt <shakeelb@google.com>
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin,
 Muchun Song, Ivan Babrou, Tejun Heo, Michal Koutný, Waiman Long,
 kernel-team@cloudflare.com, Wei Xu, Greg Thelen, linux-mm@kvack.org,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20231010032117.1577496-1-yosryahmed@google.com>
 <20231010032117.1577496-4-yosryahmed@google.com>
 <20231011003646.dt5rlqmnq6ybrlnd@google.com>
Content-Type: text/plain; charset="UTF-8"

On Wed, Oct 11, 2023 at 8:13 PM Yosry Ahmed <yosryahmed@google.com> wrote:
>
> On Wed, Oct 11, 2023 at 5:46 AM Shakeel Butt <shakeelb@google.com> wrote:
> >
> > On Tue, Oct 10, 2023 at 6:48 PM Yosry Ahmed <yosryahmed@google.com> wrote:
> > >
> > > On Tue, Oct 10, 2023 at 5:36 PM Shakeel Butt <shakeelb@google.com> wrote:
> > > >
> > > > On Tue, Oct 10, 2023 at 03:21:47PM -0700, Yosry Ahmed wrote:
> > > > [...]
> > > > >
> > > > > I tried this on a machine with 72 cpus (also ixion), running both
> > > > > netserver and netperf in /sys/fs/cgroup/a/b/c/d as follows:
> > > > > # echo "+memory" > /sys/fs/cgroup/cgroup.subtree_control
> > > > > # mkdir /sys/fs/cgroup/a
> > > > > # echo "+memory" > /sys/fs/cgroup/a/cgroup.subtree_control
> > > > > # mkdir /sys/fs/cgroup/a/b
> > > > > # echo "+memory" > /sys/fs/cgroup/a/b/cgroup.subtree_control
> > > > > # mkdir /sys/fs/cgroup/a/b/c
> > > > > # echo "+memory" > /sys/fs/cgroup/a/b/c/cgroup.subtree_control
> > > > > # mkdir /sys/fs/cgroup/a/b/c/d
> > > > > # echo 0 > /sys/fs/cgroup/a/b/c/d/cgroup.procs
> > > > > # ./netserver -6
> > > > >
> > > > > # echo 0 > /sys/fs/cgroup/a/b/c/d/cgroup.procs
> > > > > # for i in $(seq 10); do ./netperf -6 -H ::1 -l 60 -t TCP_SENDFILE --
> > > > > -m 10K; done
> > > >
> > > > You are missing '&' at the end. Use something like below:
> > > >
> > > > #!/bin/bash
> > > > for i in {1..22}
> > > > do
> > > >   /data/tmp/netperf -6 -H ::1 -l 60 -t TCP_SENDFILE -- -m 10K &
> > > > done
> > > > wait
> > > >
> > >
> > > Oh sorry I missed the fact that you are running instances in parallel, my bad.
> > >
> > > So I ran 36 instances on a machine with 72 cpus. I did this 10 times
> > > and got an average from all instances for all runs to reduce noise:
> > >
> > > #!/bin/bash
> > >
> > > ITER=10
> > > NR_INSTANCES=36
> > >
> > > for i in $(seq $ITER); do
> > >   echo "iteration $i"
> > >   for j in $(seq $NR_INSTANCES); do
> > >     echo "iteration $i" >> "out$j"
> > >     ./netperf -6 -H ::1 -l 60 -t TCP_SENDFILE -- -m 10K >> "out$j" &
> > >   done
> > >   wait
> > > done
> > >
> > > cat out* | grep 540000 | awk '{sum += $5} END {print sum/NR}'
> > >
> > > Base: 22169 mbps
> > > Patched: 21331.9 mbps
> > >
> > > The difference is ~3.7% in my runs. I am not sure what's different.
> > > Perhaps it's the number of runs?
> >
> > My base kernel is next-20231009 and I am running experiments with
> > hyperthreading disabled.
>
> Using next-20231009 and a similar 44 core machine with hyperthreading
> disabled, I ran 22 instances of netperf in parallel and got the
> following numbers from averaging 20 runs:
>
> Base: 33076.5 mbps
> Patched: 31410.1 mbps
>
> That's about 5% diff. I guess the number of iterations helps reduce
> the noise? I am not sure.
>
> Please also keep in mind that in this case all netperf instances are
> in the same cgroup and at a 4-level depth. I imagine in a practical
> setup processes would be a little more spread out, which means less
> common ancestors, so less contended atomic operations.

I was curious, so I ran the same test in a cgroup 2 levels deep
(i.e. /sys/fs/cgroup/a/b), which is a much more common setup in my
experience. Here are the numbers:

Base: 40198.0 mbps
Patched: 38629.7 mbps

The regression is reduced to ~3.9%.

What's more interesting is that going from a level 2 cgroup to a level 4
cgroup is already a big hit with or without this patch:

Base: 40198.0 -> 33076.5 mbps (~17.7% regression)
Patched: 38629.7 -> 31410.1 mbps (~18.7% regression)

So going from level 2 to 4 is already a significant regression for
other reasons, e.g. hierarchical charging (see the rough sketch below).
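By hierarchical charging I mean the page_counter walk that every charge
missing the per-CPU stock has to do. A simplified sketch of the loop in
mm/page_counter.c (trimmed of the watermark/protection tracking, so not
the exact code):

/* Simplified from page_counter_try_charge() in mm/page_counter.c. */
bool page_counter_try_charge(struct page_counter *counter,
                             unsigned long nr_pages,
                             struct page_counter **fail)
{
        struct page_counter *c;

        /* One atomic RMW on a shared cacheline per ancestor level. */
        for (c = counter; c; c = c->parent) {
                long new = atomic_long_add_return(nr_pages, &c->usage);

                if (new > c->max) {
                        atomic_long_sub(nr_pages, &c->usage);
                        *fail = c;
                        goto failed;
                }
        }
        return true;

failed:
        /* Unwind the levels that were already charged. */
        for (c = counter; c != *fail; c = c->parent)
                page_counter_cancel(c, nr_pages);
        return false;
}

Two extra levels of nesting mean two extra contended atomics on every
charge that reaches the counters, independent of this series.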
This patch only makes it marginally worse. This puts the numbers more
into perspective imo than comparing values at level 4. What do you think?
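P.S. for anyone skimming the thread: the "contended atomic operations"
mentioned above come from how this patch propagates per-CPU stat update
counts up the hierarchy. A paraphrased sketch from my reading of the
patch, not the exact diff:

/*
 * Each stat update accumulates into a per-CPU counter at every
 * ancestor level; it only touches the shared per-memcg atomic once
 * the per-CPU batch overflows.
 */
static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val)
{
        struct memcg_vmstats_percpu *statc;
        unsigned int stats_updates;

        cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id());

        statc = this_cpu_ptr(memcg->vmstats_percpu);
        for (; statc; statc = statc->parent) {
                stats_updates = READ_ONCE(statc->stats_updates) + abs(val);
                WRITE_ONCE(statc->stats_updates, stats_updates);
                if (stats_updates < MEMCG_CHARGE_BATCH)
                        continue;

                /* The contended part: one shared atomic per ancestor
                 * whenever a CPU's batch overflows. */
                atomic64_add(stats_updates, &statc->vmstats->stats_updates);
                WRITE_ONCE(statc->stats_updates, 0);
        }
}

The deeper the hierarchy and the more CPUs sharing it, the more often
those atomic64_add()s from different CPUs land on the same cachelines,
which is consistent with the smaller regression at depth 2.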