From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <owner-linux-mm@kvack.org>
MIME-Version: 1.0
References: <20231010032117.1577496-4-yosryahmed@google.com>
 <202310202303.c68e7639-oliver.sang@intel.com> <CALvZod5hKvjm3WVSOGc5PpR9eNHFkt=BDmcrBe5CeWgFzP7jgQ@mail.gmail.com>
 <CAJD7tkbjZri4ayBOT9rJ0yMAi__c-1SVmRh_5oXezr7U6dvALg@mail.gmail.com>
 <ZTXLeAAI1chMamkU@feng-clx> <CAJD7tka5UnHBz=eX1LtynAjJ+O_oredMKBBL3kFNfG7PHjuMCw@mail.gmail.com>
 <CAJD7tkYXJ3vcGvteNH98tB_C7OTo718XSxL=mFsUa7kO8vzFzA@mail.gmail.com>
 <ZTdqpcDFVHhFwWMc@xsang-OptiPlex-9020> <CAJD7tka7hmOD6KPmJBJa+TscbYEMmTjS+Jh2utPfTbKkfvwD9A@mail.gmail.com>
 <ZTiw/iIb0SbvN7vh@xsang-OptiPlex-9020>
In-Reply-To: <ZTiw/iIb0SbvN7vh@xsang-OptiPlex-9020>
From: Yosry Ahmed <yosryahmed@google.com>
Date: Tue, 24 Oct 2023 23:22:30 -0700
Message-ID: <CAJD7tkaBnSwarz8yHu9RL_3DtaLRfjrcZ7m0YZZgHJsJdtHaZw@mail.gmail.com>
Subject: Re: [PATCH v2 3/5] mm: memcg: make stats flushing threshold per-memcg
To: Oliver Sang <oliver.sang@intel.com>, Shakeel Butt <shakeelb@google.com>, 
	Johannes Weiner <hannes@cmpxchg.org>
Cc: Feng Tang <feng.tang@intel.com>, "oe-lkp@lists.linux.dev" <oe-lkp@lists.linux.dev>, lkp <lkp@intel.com>, 
	"cgroups@vger.kernel.org" <cgroups@vger.kernel.org>, "linux-mm@kvack.org" <linux-mm@kvack.org>, 
	"Huang, Ying" <ying.huang@intel.com>, "Yin, Fengwei" <fengwei.yin@intel.com>, 
	Andrew Morton <akpm@linux-foundation.org>, Michal Hocko <mhocko@kernel.org>, 
	Roman Gushchin <roman.gushchin@linux.dev>, Muchun Song <muchun.song@linux.dev>, 
	Ivan Babrou <ivan@cloudflare.com>, Tejun Heo <tj@kernel.org>, Michal Koutný <mkoutny@suse.com>, 
	Waiman Long <longman@redhat.com>, 
	"kernel-team@cloudflare.com" <kernel-team@cloudflare.com>, Wei Xu <weixugc@google.com>, 
	Greg Thelen <gthelen@google.com>, 
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>, 
	Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID: <linux-mm.kvack.org>
List-Subscribe: <mailto:majordomo@kvack.org>
List-Unsubscribe: <mailto:majordomo@kvack.org>

On Tue, Oct 24, 2023 at 11:09 PM Oliver Sang <oliver.sang@intel.com> wrote:
>
> hi, Yosry Ahmed,
>
> On Tue, Oct 24, 2023 at 12:14:42AM -0700, Yosry Ahmed wrote:
> > On Mon, Oct 23, 2023 at 11:56 PM Oliver Sang <oliver.sang@intel.com> wrote:
> > >
> > > hi, Yosry Ahmed,
> > >
> > > On Mon, Oct 23, 2023 at 07:13:50PM -0700, Yosry Ahmed wrote:
> > >
> > > ...
> > >
> > > >
> > > > I still could not run the benchmark, but I used a version of
> > > > fallocate1.c that does 1 million iterations. I ran 100 in parallel.
> > > > This showed ~13% regression with the patch, so not the same as the
> > > > will-it-scale version, but it could be an indicator.
> > > >
> > > > With that, I did not see any improvement with the fixlet above or
> > > > ___cacheline_aligned_in_smp. So you can scratch that.
> > > >
> > > > I did, however, see some improvement with reducing the indirection
> > > > layers by moving stats_updates directly into struct mem_cgroup. The
> > > > regression in my manual testing went down to 9%. Still not great, but
> > > > I am wondering how this reflects on the benchmark. If you're able to
> > > > test it that would be great, the diff is below. Meanwhile I am still
> > > > looking for other improvements that can be made.
> > >
> > > we applied previous patch-set as below:
> > >
> > > c5f50d8b23c79 (linux-review/Yosry-Ahmed/mm-memcg-change-flush_next_time-to-flush_last_time/20231010-112257) mm: memcg: restore subtree stats flushing
> > > ac8a48ba9e1ca mm: workingset: move the stats flush into workingset_test_recent()
> > > 51d74c18a9c61 mm: memcg: make stats flushing threshold per-memcg
> > > 130617edc1cd1 mm: memcg: move vmstats structs definition above flushing code
> > > 26d0ee342efc6 mm: memcg: change flush_next_time to flush_last_time
> > > 25478183883e6 Merge branch 'mm-nonmm-unstable' into mm-everything   <---- the base our tool picked for the patch set
> > >
> > > I tried to apply below patch to either 51d74c18a9c61 or c5f50d8b23c79,
> > > but failed. could you guide how to apply this patch?
> > > Thanks
> > >
> >
> > Thanks for looking into this. I rebased the diff on top of
> > c5f50d8b23c79. Please find it attached.
>
> from our tests, this patch has little impact.
>
> it was applied as below ac6a9444dec85:
>
> ac6a9444dec85 (linux-devel/fixup-c5f50d8b23c79) memcg: move stats_updates to struct mem_cgroup
> c5f50d8b23c79 (linux-review/Yosry-Ahmed/mm-memcg-change-flush_next_time-to-flush_last_time/20231010-112257) mm: memcg: restore subtree stats flushing
> ac8a48ba9e1ca mm: workingset: move the stats flush into workingset_test_recent()
> 51d74c18a9c61 mm: memcg: make stats flushing threshold per-memcg
> 130617edc1cd1 mm: memcg: move vmstats structs definition above flushing code
> 26d0ee342efc6 mm: memcg: change flush_next_time to flush_last_time
> 25478183883e6 Merge branch 'mm-nonmm-unstable' into mm-everything
>
> for the first regression reported in original report, data are very close
> for 51d74c18a9c61, c5f50d8b23c79 (patch-set tip, parent of ac6a9444dec85),
> and ac6a9444dec85.
> full comparison is as [1]
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
>   gcc-12/performance/x86_64-rhel-8.3/thread/100%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/fallocate1/will-it-scale
>
> 130617edc1cd1ba1 51d74c18a9c61e7ee33bc90b522 c5f50d8b23c7982ac875791755b ac6a9444dec85dc50c6bfbc4ee7
> ---------------- --------------------------- --------------------------- ---------------------------
>          %stddev     %change         %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \          |                \
>      36509           -25.8%      27079           -25.2%      27305           -25.0%      27383        will-it-scale.per_thread_ops
>
> for the second regression reported in original report, seems a small impact
> from ac6a9444dec85.
> full comparison is as [2]
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
>   gcc-12/performance/x86_64-rhel-8.3/thread/50%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/fallocate1/will-it-scale
>
> 130617edc1cd1ba1 51d74c18a9c61e7ee33bc90b522 c5f50d8b23c7982ac875791755b ac6a9444dec85dc50c6bfbc4ee7
> ---------------- --------------------------- --------------------------- ---------------------------
>          %stddev     %change         %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \          |                \
>      76580           -30.0%      53575           -28.9%      54415           -26.7%      56152        will-it-scale.per_thread_ops
>
> [1]
>

Thanks Oliver for running the numbers. If I understand correctly, the
will-it-scale.fallocate1 microbenchmark is the only one showing a
significant regression here. Is this correct?
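
For reference, the %change columns in the quoted tables can be re-derived
from the per_thread_ops values. The snippet below is only a sanity-check
sketch, not part of the series or of the LKP tooling. The dictionary and
function names are made up for illustration, and it assumes %change is
measured against the 130617edc1cd1 column.

```python
# Sanity-check sketch: recompute the %change columns from the quoted
# will-it-scale.per_thread_ops values. All names here are illustrative.

# nr_task -> {commit: per_thread_ops}, copied from the quoted tables.
PER_THREAD_OPS = {
    "100%": {
        "130617edc1cd1": 36509,  # assumed baseline column
        "51d74c18a9c61": 27079,
        "c5f50d8b23c79": 27305,
        "ac6a9444dec85": 27383,
    },
    "50%": {
        "130617edc1cd1": 76580,  # assumed baseline column
        "51d74c18a9c61": 53575,
        "c5f50d8b23c79": 54415,
        "ac6a9444dec85": 56152,
    },
}


def pct_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def changes_vs_baseline(ops: dict, baseline: str = "130617edc1cd1") -> dict:
    """Percent change of every non-baseline commit, rounded to one decimal."""
    base = ops[baseline]
    return {
        commit: round(pct_change(base, value), 1)
        for commit, value in ops.items()
        if commit != baseline
    }


if __name__ == "__main__":
    for nr_task, ops in PER_THREAD_OPS.items():
        print(nr_task, changes_vs_baseline(ops))
```

Under that assumption the recomputed values match the quoted columns.
Comparing ac6a9444dec85 directly against c5f50d8b23c79 gives roughly +0.3%
in the 100% case and roughly +3.2% in the 50% case.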

In my runs, other, more representative microbenchmarks like
netperf and will-it-scale.page_fault* show minimal regression. I would
expect practical workloads to have high concurrency of page faults or
networking, but maybe not fallocate/ftruncate.
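
To make the fallocate/ftruncate pattern concrete, here is a rough sketch
of the kind of loop in question. It is not the will-it-scale source: the
file size, the iteration count, and the use of os.posix_fallocate()
(rather than calling fallocate(2) directly) are assumptions made to keep
the example self-contained.

```python
import os
import tempfile

FILE_SIZE = 1024 * 1024  # assumed size, not taken from the benchmark


def fallocate_truncate_loop(iterations: int, file_size: int = FILE_SIZE) -> int:
    """Repeatedly allocate space for a temp file and truncate it back to zero.

    Returns the number of completed iterations.
    """
    done = 0
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        for _ in range(iterations):
            os.posix_fallocate(fd, 0, file_size)  # reserve file_size bytes
            os.ftruncate(fd, 0)                   # release them again
            done += 1
    return done


if __name__ == "__main__":
    # Small count for illustration only.
    print(fallocate_truncate_loop(1000))
```

Running many copies of such a loop concurrently is what the 50% and 100%
nr_task configurations approximate. A workload has to spend most of its
time in this allocate/free cycle to resemble the benchmark.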

Oliver, in your experience, how often does such a regression in such a
microbenchmark translate to a real regression that people care about?
(or how often do people dismiss it?)

I tried optimizing this further for the fallocate/ftruncate case but
without luck. I even tried moving stats_updates into cgroup core
(struct cgroup_rstat_cpu) to reuse the existing loop in
cgroup_rstat_updated() -- but it somehow made it worse.

On the other hand, we do have some machines in production running this
series together with a previous optimization for non-hierarchical
stats [1] on an older kernel, and we do see a significant reduction in
CPU time spent on reading the stats. Domenico did a similar experiment
with only this series and reported similar results [2].

Shakeel, Johannes, (and other memcg folks), I personally think the
benefits here outweigh a regression in this particular benchmark, but
I am obviously biased. What do you think?

[1] https://lore.kernel.org/lkml/20230726153223.821757-2-yosryahmed@google.com/
[2] https://lore.kernel.org/lkml/CAFYChMv_kv_KXOMRkrmTN-7MrfgBHMcK3YXv0dPYEL7nK77e2A@mail.gmail.com/