From: Yosry Ahmed
Date: Wed, 5 Oct 2022 15:45:27 -0700
Subject: Re: [PATCH v2] mm/vmscan: check references from all memcgs for swapbacked memory
To: Yu Zhao
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, Greg Thelen, David Rientjes, Cgroups, Linux-MM
References: <20221005173713.1308832-1-yosryahmed@google.com>

On Wed, Oct 5, 2022 at 3:23 PM Yu Zhao wrote:
>
> On Wed, Oct 5, 2022 at 3:13 PM Yu Zhao wrote:
> >
> > On Wed, Oct 5, 2022 at 3:02 PM Yosry Ahmed wrote:
> > >
> > > On Wed, Oct 5, 2022 at 1:48 PM Yu Zhao wrote:
> > > >
> > > > On Wed, Oct 5, 2022 at 11:37 AM Yosry Ahmed wrote:
> > > > >
> > > > > During page/folio reclaim, we check if a folio is referenced using
> > > > > folio_referenced() to avoid reclaiming folios that have been recently
> > > > > accessed (hot memory). The rationale is that this memory is likely to be
> > > > > accessed soon, and hence reclaiming it will cause a refault.
> > > > >
> > > > > For memcg reclaim, we currently only check accesses to the folio from
> > > > > processes in the subtree of the target memcg.
> > > > > This behavior was
> > > > > originally introduced by commit bed7161a519a ("Memory controller: make
> > > > > page_referenced() cgroup aware") a long time ago. Back then, refaulted
> > > > > pages would get charged to the memcg of the process that was faulting them
> > > > > in. It made sense to only consider accesses coming from processes in the
> > > > > subtree of target_mem_cgroup. If a page was charged to memcg A but only
> > > > > being accessed by a sibling memcg B, we would reclaim it if memcg A
> > > > > is the reclaim target. memcg B can then fault it back in and get charged
> > > > > for it appropriately.
> > > > >
> > > > > Today, this behavior still makes sense for file pages. However, unlike
> > > > > file pages, when swapbacked pages are refaulted they are charged to the
> > > > > memcg that was originally charged for them during swapping out. Which
> > > > > means that if a swapbacked page is charged to memcg A but only used by
> > > > > memcg B, and we reclaim it from memcg A, it would simply be faulted back
> > > > > in and charged again to memcg A once memcg B accesses it. In that sense,
> > > > > accesses from all memcgs matter equally when considering if a swapbacked
> > > > > page/folio is a viable reclaim target.
>
> I just read the entire commit message (sorry for not doing so
> previously) to figure out where the confusion came from: the above
> claim is wrong for two cases. I'll let you figure out why :)

I missed the cases with dead memcgs. I think the two cases are:

1) If a memcg is dead during swap-out, it looks like the swap charge is
   moved to the parent, so reclaim is effectively recharging to the
   parent. This can be handled by only checking accesses from all memcgs
   if the charged memcg is alive, something like this:

       if (target_memcg && (!folio_test_swapbacked(folio) ||
                            !mem_cgroup_online(folio_memcg(folio))))
               ...

2) If a memcg dies after a page is already swapped out.
   During swap-in, it looks like we then charge the page to the memcg of
   the process that takes the page fault. In this case the patch might
   actually increase zombie memcgs, but only temporarily: the next time
   we try to reclaim the page we will be back in case (1) and reclaim it.
   Also, one might argue that given that the page is relatively hot
   (accessed by a different memcg), and therefore likely to be faulted in
   soon, the chances of the memcg dying between the time the page is
   reclaimed and the time it is faulted back in are slim.

One might also argue that having swapbacked pages charged to one memcg
and accessed by another is generally less common than it is for the
file page cache.

So I *think* with an added check for offline memcgs there shouldn't be
any concerns, unless I got the two cases wrong :)

>
> > > > > Modify folio_referenced() to always consider accesses from all memcgs if
> > > > > the folio is swapbacked.
> > > >
> > > > It seems to me this change can potentially increase the number of
> > > > zombie memcgs. Any risk assessment done on this?
> > >
> > > Do you mind elaborating the case(s) where this could happen? Is this
> > > the cgroup v1 case in mem_cgroup_swapout() where we are reclaiming
> > > from a zombie memcg and swapping out would let us move the charge to
> > > the parent?
> >
> > The scenario is quite straightforward: for a page charged to memcg A
> > and also actively used by memcg B, if we don't ignore the access from
> > memcg B, we won't be able to reclaim it after memcg A is deleted.
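
To make the offline-memcg check from case (1) easier to reason about, here is a minimal user-space sketch of the decision it encodes. Everything in it is an assumption for illustration only: struct mock_memcg, struct mock_folio and filter_refs_by_target() are hypothetical stand-ins, not the kernel's struct mem_cgroup, struct folio or any real helper, and the sketch assumes the folio is always charged to some memcg.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup: only tracks online state. */
struct mock_memcg {
    bool online;
};

/* Hypothetical stand-in for struct folio: backing type and charged memcg. */
struct mock_folio {
    bool swapbacked;
    struct mock_memcg *memcg;   /* assumed non-NULL in this sketch */
};

/*
 * Returns true when reference checks should be limited to the target
 * memcg's subtree, i.e. when the branch elided as "..." in the email
 * would be taken. Returns false when accesses from all memcgs count.
 */
static bool filter_refs_by_target(const struct mock_memcg *target,
                                  const struct mock_folio *folio)
{
    /* Mirrors: target_memcg && (!swapbacked || !online(charged memcg)) */
    return target != NULL &&
           (!folio->swapbacked || !folio->memcg->online);
}
```

Under these assumptions, global reclaim (no target) never filters, file-backed folios keep the existing subtree-only behavior, and swapbacked folios count accesses from all memcgs only while the charged memcg is still online. Once the charged memcg goes offline, the subtree filter applies again, so accesses from outside the target's subtree no longer keep the folio from being reclaimed.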