From: Yang Shi
Date: Mon, 28 Nov 2022 14:24:03 -0800
Subject: Re: [RFC PATCH V1] mm: Disable demotion from proactive reclaim
To: "Huang, Ying"
Cc: Johannes Weiner, Mina Almasry, Yang Shi, Yosry Ahmed, Tim Chen,
    weixugc@google.com, shakeelb@google.com, gthelen@google.com,
    fvdl@google.com, Michal Hocko, Roman Gushchin, Muchun Song,
    Andrew Morton, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
    linux-mm@kvack.org
In-Reply-To: <874juonbmv.fsf@yhuang6-desk2.ccr.corp.intel.com>
References: <20221122203850.2765015-1-almasrymina@google.com>
    <874juonbmv.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Wed, Nov 23, 2022 at 9:52 PM Huang, Ying wrote:
>
> Hi, Johannes,
>
> Johannes Weiner writes:
> [...]
> >
> > The fallback to reclaim actually strikes me as wrong.
> >
> > Think of reclaim as 'demoting' the pages to the storage tier. If we
> > have a RAM -> CXL -> storage hierarchy, we should demote from RAM to
> > CXL and from CXL to storage.
> > If we reclaim a page from RAM, it means
> > we 'demote' it directly from RAM to storage, bypassing potentially a
> > huge amount of pages colder than it in CXL. That doesn't seem right.
> >
> > If demotion fails, IMO it shouldn't satisfy the reclaim request by
> > breaking the layering. Rather it should deflect that pressure to the
> > lower layers to make room. This makes sure we maintain an aging
> > pipeline that honors the memory tier hierarchy.
>
> Yes. I think that we should avoid to fall back to reclaim as much as
> possible too. Now, when we allocate memory for demotion
> (alloc_demote_page()), __GFP_KSWAPD_RECLAIM is used. So, we will trigger
> kswapd reclaim on lower tier node to free some memory to avoid fall back
> to reclaim on current (higher tier) node. This may be not good enough,
> for example, the following patch from Hasan may help via waking up
> kswapd earlier.

For the ideal case, I do agree with Johannes that pages should be
demoted tier by tier rather than reclaimed from the higher tiers. But I
also agree with your premature OOM concern.

>
> https://lore.kernel.org/linux-mm/b45b9bf7cd3e21bca61d82dcd1eb692cd32c122c.1637778851.git.hasanalmaruf@fb.com/
>
> Do you know what is the next step plan for this patch?
>
> Should we do even more?

In my initial implementation I had simple throttle logic for the case
where demotion is not going to succeed because the demotion target does
not have enough free memory (just a watermark check) for migration to
succeed without doing any reclaim. Shall we resurrect that?

Waking kswapd sooner is fine with me, but it may not be enough. For
example, kswapd may not keep up, so premature OOM may still happen on
the higher tiers, or reclaim may still happen. I think throttling the
reclaimer/demoter until kswapd makes progress could avoid both. And
since lower-tier memory is typically much larger than higher-tier
memory, the throttle should happen very rarely IMHO.
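
To make the throttle idea a bit more concrete, here is a rough userspace
sketch of the decision. It is not the earlier implementation and not
kernel code. struct tier_node, its wmark field, choose_demote_action()
and the max_waits bound are all hypothetical names made up for
illustration; the bounded wait followed by a fallback to reclaim is an
assumption that models the "last resort" case raised in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a demotion target node; not a kernel structure. */
struct tier_node {
	unsigned long free_pages;	/* pages currently free on the node */
	unsigned long wmark;		/* assumed free-page floor to stay above */
};

enum demote_action {
	DEMOTE_NOW,		/* target has room: migrate without reclaim */
	THROTTLE_AND_WAIT,	/* target is short: wait for kswapd progress */
	FALLBACK_RECLAIM,	/* waited long enough: reclaim on higher tier */
};

/* True if the target can take nr_pages and still sit at or above wmark. */
bool target_has_room(const struct tier_node *target, unsigned long nr_pages)
{
	assert(target != NULL);
	return target->free_pages >= nr_pages &&
	       target->free_pages - nr_pages >= target->wmark;
}

/*
 * Decide what the reclaimer/demoter should do for a batch of nr_pages.
 * waits_so_far counts earlier throttle rounds for this batch; max_waits
 * is an assumed bound so that the caller cannot stall forever.
 */
enum demote_action choose_demote_action(const struct tier_node *target,
					unsigned long nr_pages,
					unsigned int waits_so_far,
					unsigned int max_waits)
{
	if (target_has_room(target, nr_pages))
		return DEMOTE_NOW;
	if (waits_so_far < max_waits)
		return THROTTLE_AND_WAIT;
	return FALLBACK_RECLAIM;
}
```

A caller would loop on THROTTLE_AND_WAIT, sleeping until kswapd on the
lower tier reports progress, and re-evaluate. If the lower tier really
is much larger, almost every call should return DEMOTE_NOW.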
>
> From another point of view, I still think that we can use falling back
> to reclaim as the last resort to avoid OOM in some special situations,
> for example, most pages in the lowest tier node are mlock() or too hot
> to be reclaimed.
>
> > So I'm hesitant to design cgroup controls around the current behavior.
> >
>
> Best Regards,
> Huang, Ying
>