From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6EB8AC678D5 for ; Wed, 8 Mar 2023 18:02:08 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A2B8F280003; Wed, 8 Mar 2023 13:02:07 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 9DAF6280002; Wed, 8 Mar 2023 13:02:07 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8C9A9280003; Wed, 8 Mar 2023 13:02:07 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 7EE70280002 for ; Wed, 8 Mar 2023 13:02:07 -0500 (EST) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id E593B40148 for ; Wed, 8 Mar 2023 18:02:06 +0000 (UTC) X-FDA: 80546499852.30.40224BC Received: from mail-ed1-f54.google.com (mail-ed1-f54.google.com [209.85.208.54]) by imf21.hostedemail.com (Postfix) with ESMTP id 360901C0029 for ; Wed, 8 Mar 2023 18:02:02 +0000 (UTC) Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b="OW1g/V0Z"; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf21.hostedemail.com: domain of yosryahmed@google.com designates 209.85.208.54 as permitted sender) smtp.mailfrom=yosryahmed@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1678298523; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=S4ti/z/gy4KrrUZTxjXuzAZhonovpT4J/QPkichvLcg=; b=o4K3vOrJAHqAliuUOY/BDg3/ZEQjNzXMXd0rpTkiLEg7Raxv1QHcRhAF3bKQJh0CwYPjPF 9B1zUsZtjaaPC8xbyaYpyG3IDw+n5qh2qNY7We7LHKpD8HwdS7GWgYcK0Zr/EklXHM+449 gCQhGmv6JwHrEo4t3ASVp1xVNqzOdLA= ARC-Authentication-Results: i=1; imf21.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b="OW1g/V0Z"; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf21.hostedemail.com: domain of yosryahmed@google.com designates 209.85.208.54 as permitted sender) smtp.mailfrom=yosryahmed@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1678298523; a=rsa-sha256; cv=none; b=3UyqQ8ZZAnvB8IRXUxXFW//dvFdwYuAB5ybQd+bQEkmTRjV26FJsRksLBGbz/WNx45GpK9 H6yiy5Dgh6bUH0VsoqfxexrblcAiJJ4qCbHTH0DXDxd49UYNMoi25tBFeokOd7w1Dns/5L jSzaz8xzMTtcVfidnQ/MzTZF9cZ+tc0= Received: by mail-ed1-f54.google.com with SMTP id da10so69203925edb.3 for ; Wed, 08 Mar 2023 10:02:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678298521; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=S4ti/z/gy4KrrUZTxjXuzAZhonovpT4J/QPkichvLcg=; b=OW1g/V0ZnHiKXAAuMa/g45YdvNwqsGqpiHla8THv1YDkaz9IQf8iPFIObghMKUFgBj h8GtlHUimui34LY3Y3Yhe6QRBKAt/Dr6/Jd90I8NldNmylmZxWLj7uNbuuqlnijZ0P96 qtl80rYWk1BPt88b1z9CICrJySHOMZj0S3ut4xWVeXTnG4KJd/dsdexjxmkhE95SVhKf G175FTaGfT9s6k/ogCXBil6pCss3RH49Zo23fV0QmslbdRaXkFb6m0lq7yOB92BPL0ox +D+9dCp92uwM+0RLJP4aHz8vBs2sWYdEHW+2L1M6bMCifCyHZTacFulKwTtQcE87im6B Ts+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678298521; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=S4ti/z/gy4KrrUZTxjXuzAZhonovpT4J/QPkichvLcg=; b=nZU6YXt4oHoWay9YiOeDhXFKXjt3FFuPhgijA0ttz5Sx4f+UQV1rvCRY0y/9lkmpSg M+319c5mM9zcOs247gGPgbUlZkDxm0SunpWWAnj2MZXGpEAuSkzCnYtoe5MXr6qgjFWH NuJlO6HIx4FVIhZaK9edB0dKJ33rKWfDSw8FXKYQtE2mq0/Ko1w9kIzI3yRgeQlIwI52 fpphhpWsPlDB0r4cJKOpouiykwBIC3PCpHjURx98vkzRB8AyzTNAv4GyH2GQ6VKOcMlz ZXkwNWUgReYN/6TfOsixE4Sc5uaZC8HWywmcC6IoDmRpuWNv1HVUIIZiP4TUg9ezo2Lb ixoQ== X-Gm-Message-State: AO0yUKU9dnLcR7Sn4+ho5CEBMbOZDbkxIrOBoozYhagmwzy/i/XOFdez dGw/QKBXH/gwloNSJl/O7DBoFg26CvcHfoOvyELOiQ== X-Google-Smtp-Source: AK7set8/jHpN5Gwo1YgmZIy+zhgP7w+9nthoGIsgE3sW+C9piQ9Y5z41ZNW8GzKbj+B/VAuIXkh35VxKjD5xkPj0xyg= X-Received: by 2002:a17:906:1618:b0:8b3:8147:6b6e with SMTP id m24-20020a170906161800b008b381476b6emr8727807ejd.10.1678298520898; Wed, 08 Mar 2023 10:02:00 -0800 (PST) MIME-Version: 1.0 References: <20230228085002.2592473-1-yosryahmed@google.com> <20230308160056.GA414058@cmpxchg.org> In-Reply-To: <20230308160056.GA414058@cmpxchg.org> From: Yosry Ahmed Date: Wed, 8 Mar 2023 10:01:24 -0800 Message-ID: Subject: Re: [PATCH v1 0/2] Ignore non-LRU-based reclaim in memcg reclaim To: Johannes Weiner Cc: Alexander Viro , "Darrick J. Wong" , Christoph Lameter , David Rientjes , Joonsoo Kim , Vlastimil Babka , Roman Gushchin , Hyeonggon Yoo <42.hyeyoo@gmail.com>, "Matthew Wilcox (Oracle)" , Miaohe Lin , David Hildenbrand , Peter Xu , NeilBrown , Shakeel Butt , Michal Hocko , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 360901C0029 X-Stat-Signature: 7et4ogf574umpfcyohhzitfx7dp7iqn9 X-HE-Tag: 1678298522-927895 X-HE-Meta: U2FsdGVkX1+GglvjX30psixp7/oSNfFrxE12OdIzZHK5sHPefn8ucL98VKE+D5WZ1rTp0bw1OKbu1TXUY5QFPcXPS2pEV6XsU+Fe8BPvkeqIqmTVK6gypYOd174UoHbu2KpPNWweFVUrTKROD5UjDK1/E660frWad6mRSoj1hAuhPUP7t0v9mLY6QhZyQ3m+QE6LJDsF2chhcPM+Eqd6ebVL95WHKHTri/N5BLEPUW2A91+15gkimg2X5kV9K3e5XKGILjAntu5qT1eCZkxG9UgAsKY5ha67wJvSMMHXViPH/u+R0SirWL84oU7DESYfmGYVBLCNyN3fmfuTBt7ESZH2fd6JNEjQ/wUmIjTKwLhb4w0Eou+UAvgzawNZRrZfqbCHh7f75/8ulH4+Ni81LaUS6FeZcReE/vurgt2/DyFTYlEZm8WGodUrOp6xtWgvF3tfdy1Z20Jc4DbIlI+sShYP3UlEfFVjtffeCWZhYwpKRmExOXsZi5aoOTnlgVRCE8fw6eiX+F/joLE7eoeeRkeycPcevJotPW4O5qQr014I6FY7O2T9OT/oLgeJD+HmbrYoBcLKwzjhf1hTKQwdgkZjSNUScQPdQq3j07w4z7dK8mRATBZJ3cM0nzYDB21UTMAeO9D+5AFOE0+G1dGZR+ZDPDAATxcvtxkMQtpBlB6/cAmr6KZegtpyycgHS5swUazlGMBQs+xtopi7HmOJ6YZVDfpq5chSiZnFtUrw4XGxrxlHspeJCcX99/vb+VXCjFXLgMbTJegPKdfd7yhXttkIXy8BRwB0BFMlCT6RgB5Ct+9EhtN8Ze4CyAQEv7N4+GD56BotOI0AEnj3zq7nY9nr9FMWyYc7VhTqXkqN9B0srUKO71mv+zMg+U0MS7LHWaX9NTSGKHpxv6jUw1rvHixIYYP5Iwzb1jRPsb9JRSMNGTnJZRtZndDuquRDPWcFkxYI2ou9bpDTsaydw1E 3V28gyxd AoHhB01IXmySHv31cnTE6P8QdBK6TUGzGUlKBss1Ph7jkbApBoxdZ+p3dygWm7VhqqfAXK1Md1khdSpCH1vab41q/zsY4vTETh93cjpM2TLOPTcNa1P+TayUIpdfo6dwHy6xmOQQWGta3NI4VyPnXLX8RqwKwg871Enb29FgCGa/8WCVkMXrZQdinMZPtu7RuZ7iZlk6pzstOi/ST1v7FKaOz9L3Mz8iW96xPtZw97qUIca4d1pj5ROlCJuYBanLXWwLuXrDADqVZz7hkranxknlgZHFDEr93zWW3RlgnmRHYOZLI1QtqnuEchLovCiYgUtPOIX5tsgyOxJ4ayZgGyIIhQnSmmqmVaDVQwvbLRDpj3+Nj+PPMq4ohdA== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Mar 8, 2023 at 8:00=E2=80=AFAM Johannes Weiner = wrote: > > Hello Yosry, > > On Tue, Feb 28, 2023 at 08:50:00AM +0000, Yosry Ahmed wrote: > > Reclaimed pages through other means than LRU-based reclaim are tracked > > through reclaim_state in struct scan_control, which is stashed in > > current task_struct. These pages are added to the number of reclaimed > > pages through LRUs. For memcg reclaim, these pages generally cannot be > > linked to the memcg under reclaim and can cause an overestimated count > > of reclaimed pages. This short series tries to address that. > > Could you please add more details on how this manifests as a problem > with real workloads? We haven't observed problems in production workloads, but we have observed problems in testing using memory.reclaim when sometimes a write to memory.reclaim would succeed when we didn't fully reclaim the requested amount. This leads to tests flaking sometimes, and we have to look into the failures to find out if there is a real problem or not. > > > Patch 1 is just refactoring updating reclaim_state into a helper > > function, and renames reclaimed_slab to just reclaimed, with a comment > > describing its true purpose. > > Looking through the code again, I don't think these helpers add value. > > report_freed_pages() is fairly vague. Report to who? It abstracts only > two lines of code, and those two lines are more descriptive of what's > happening than the helper is. Just leave them open-coded. I agree the name is not great, I am usually bad at naming things and hope people would point that out (like you're doing now). The reason I added it is to contain the logic within mm/vmscan.c such that future changes do not have to add noisy diffs to a lot of unrelated files. If you have a better name that makes more sense to you please let me know, otherwise I'm fine dropping the helper as well, no strong opinions here. > > add_non_vmanscan_reclaimed() may or may not add anything. But let's > take a step back. It only has two callsites because lrugen duplicates > the entire reclaim implementation, including the call to shrink_slab() > and the transfer of reclaim_state to sc->nr_reclaimed. > > IMO the resulting code would overall be simpler, less duplicative and > easier to follow if you added a common shrink_slab_reclaim() that > takes sc, handles the transfer, and documents the memcg exception. IIUC you mean something like: void shrink_slab_reclaim(struct scan_control *sc, pg_data_t *pgdat, struct mem_cgroup *memcg) { shrink_slab(sc->gfp_mask, pgdat->node_id, memcg, sc->priority); /* very long comment */ if (current->reclaim_state && !cgroup_reclaim(sc)) { sc->nr_reclaimed +=3D current->reclaim_state->reclaimed; current->reclaim_state->reclaimed =3D 0; } } The difference would be that today we handle the transfer once after we scan all memcgs in classic lruvec, while we do the transfer once per-memcg in lrugen. With this change, we would be doing the transfer once per-memcg for both, but I guess that's not a big deal. What I don't like about this is that it doubles down on associating the counter in reclaim_state with slab, which is the opposite of what patch 1 does (renaming reclaimed_slab to just reclaimed). If we do this, maybe it's better from a consistency perspective to leave it as reclaimed_slab, with a comment announcing that this is mainly used for slab, but others are piggybacking on it. It seems like whatever we do is not going to be ideal in this case, but we should at least be consistent. Either add shrink_slab_reclaim(), leave it as reclaimed_slab and call out other users as piggybackers -- or make it generic and separate from slab completely, with a separate helper to do the transfer. I do not have a strong opinion here, so let me know what you prefer.