From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E8A84CD98ED for ; Thu, 5 Sep 2024 18:05:26 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5DD5E6B008C; Thu, 5 Sep 2024 14:05:26 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 58D376B0093; Thu, 5 Sep 2024 14:05:26 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 454666B0095; Thu, 5 Sep 2024 14:05:26 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 27DC26B008C for ; Thu, 5 Sep 2024 14:05:26 -0400 (EDT) Received: from smtpin27.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id A50C514032E for ; Thu, 5 Sep 2024 18:05:25 +0000 (UTC) X-FDA: 82531461810.27.CD455D7 Received: from mail-il1-f180.google.com (mail-il1-f180.google.com [209.85.166.180]) by imf03.hostedemail.com (Postfix) with ESMTP id D4CEC2001F for ; Thu, 5 Sep 2024 18:05:23 +0000 (UTC) Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=N0GH+47i; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf03.hostedemail.com: domain of hughd@google.com designates 209.85.166.180 as permitted sender) smtp.mailfrom=hughd@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1725559444; a=rsa-sha256; cv=none; b=rfEFdr55WAjoX1i9oWF436rcL1LrYionaWC8xYVhu7/mbC3ve00lAK290gqe6HfTrP+C3K Pa79wBMlDcOKvm2SNFqEoV3PpuWsi5shY1ETcZKcMtXxhT2Jx/yuFaf6VmZQVxhvt58iD1 qSoROgQw+pHiAE/hPJ26WgwrsFj7wJg= ARC-Authentication-Results: i=1; imf03.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=N0GH+47i; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf03.hostedemail.com: domain of hughd@google.com designates 209.85.166.180 as permitted sender) smtp.mailfrom=hughd@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1725559444; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=JACy3LsEORLfoHEGD7EKNLc5Lwvyz2oKImXIztt5wps=; b=LGBOkdHX93C3Yx/TXo9C/wR9gp/0h4b1j56UaeYHgJPYjOtkpmdA+mtISGLPduB/QfW/RZ 3FBeEzIxAkyNESPavXgDUSj4BZrdBjAs1nay8iD1A4DPPVZ/R3Bso5ykPkyoOxjJyrSmB6 7fy0YiIxaVq5OFIjfWWGRSzNtA+eJIw= Received: by mail-il1-f180.google.com with SMTP id e9e14a558f8ab-39d47a9ffb9so3761165ab.1 for ; Thu, 05 Sep 2024 11:05:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1725559523; x=1726164323; darn=kvack.org; h=mime-version:references:message-id:in-reply-to:subject:cc:to:from :date:from:to:cc:subject:date:message-id:reply-to; bh=JACy3LsEORLfoHEGD7EKNLc5Lwvyz2oKImXIztt5wps=; b=N0GH+47iikQnva4CzFzvtsbMGu7pEveQRshK6Qqg0KZauL1hdCwLfsHHzpGbGayywu /NLsI9vvHsaS36ruK+ibpgzXpJbca4McYh38gYvEaBEwnfaoRE6UhK1M3s/lMwwizIID /AXNuZGw/18I8H3FXOMXp7bJg+qEHx5VKMeNsh1QxRLrBcRhSo0+CcWVM+RH8401+cNE 2MJ3kNNNj6SEp3ENQ2VxNMHm6IEm+7ikHgYJg2dwhFIJtf/Ah89l0MtoIGzhwAF5x3Bq UfAGXiM0fG0awyPXyzO4t5O/IKc48UG8mcFMxir3kF3Y4ZBwd62ySjvp5AEqaSaHdOBM wzGQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725559523; x=1726164323; h=mime-version:references:message-id:in-reply-to:subject:cc:to:from :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=JACy3LsEORLfoHEGD7EKNLc5Lwvyz2oKImXIztt5wps=; b=P+O0WNFJAo1bC+JcJYrI1Vtuk9TzIw5iQUp045fNtQDxbyoL32viLRie3fcgjsQ+Vo eABYa84QCKKX4XBlVtfxdS5R3tniyoZSFuyxmvhvQDJxB2DWvjr33imEWW5q7+7alu+I R8I/c4LKIIfTR4TDGQs7xsjLT7gO6LAudt+N8xt8iCpqGNW0GZsunjus+sQlmf0PVgW2 tWWHRWJBu1z8jL2xxo9zpOp/aT4boiUKD1UXS2yhU2+LKrZ/lYFWnQSLY3OyZmyafCjr qYLWSGJwKQK0UdlS3jXmSj9VcAX7k30a0MDoKLDggykGjsFiE8untQpN1iZRFCFvBiXA nVNA== X-Forwarded-Encrypted: i=1; AJvYcCVP2Gvc3hiM0mPGniBpl9ARpf1PkuDAeph8SZJIpy6EmJnOT2grmOf/y7YrMZ9wZqhnj6WygHMlDA==@kvack.org X-Gm-Message-State: AOJu0Yzb66ywAfb/PnQfe+VDGwaiQ+bVLJhELlLKl6m1OVyGBMGCgkIf o5xtBQKnXWTm3MzIwruxOA1H61i+rTD63rtvoN1s+v6EB4GYUouYV5aT4bEwNA== X-Google-Smtp-Source: AGHT+IFw+wD+B0uqAGFnbHubKac3f2t2ytAlhpUog4fT3ny3a9crryvc6Qd33jM6Y3o1nron34LN/g== X-Received: by 2002:a05:6e02:1569:b0:3a0:1828:31d9 with SMTP id e9e14a558f8ab-3a018283360mr66660025ab.24.1725559522462; Thu, 05 Sep 2024 11:05:22 -0700 (PDT) Received: from darker.attlocal.net (172-10-233-147.lightspeed.sntcca.sbcglobal.net. [172.10.233.147]) by smtp.gmail.com with ESMTPSA id 41be03b00d2f7-7d5119668aasm510809a12.1.2024.09.05.11.05.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 05 Sep 2024 11:05:21 -0700 (PDT) Date: Thu, 5 Sep 2024 11:05:09 -0700 (PDT) From: Hugh Dickins To: Usama Arif cc: Hugh Dickins , Andrew Morton , Yu Zhao , linux-mm@kvack.org, hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev, roman.gushchin@linux.dev, david@redhat.com, npache@redhat.com, baohua@kernel.org, ryan.roberts@arm.com, rppt@kernel.org, willy@infradead.org, cerasuolodomenico@gmail.com, ryncsn@gmail.com, corbet@lwn.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, kernel-team@meta.com, Shuang Zhai Subject: Re: [PATCH v5 1/6] mm: free zapped tail pages when splitting isolated thp In-Reply-To: <1ffdf94d-ce3f-4dac-8ed3-0681f98beebf@gmail.com> Message-ID: <5efa255b-3689-0c91-1980-c0f0562d84e9@google.com> References: <20240830100438.3623486-1-usamaarif642@gmail.com> <20240830100438.3623486-2-usamaarif642@gmail.com> <1d490ab5-5cf8-4c16-65d0-37a62999fcd5@google.com> <1ffdf94d-ce3f-4dac-8ed3-0681f98beebf@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: D4CEC2001F X-Stat-Signature: 3wzapgkws7wojgu4xgu15uae8ywp9nxs X-Rspam-User: X-HE-Tag: 1725559523-106594 X-HE-Meta: U2FsdGVkX1+26WK8pLLQGy8UeFxMJ5UM75866zQ2zXBMWjwOnInF2DZB7yEdgcpnitFRQGWG66siOnVnu2/4w94HB/sZnkPG+LMH6tFWmPGEk2ckgOEuxKoy9eCrS81nDNbzDFpRp1ELI6/zguGQpvsvdAly2H1CUWwRIR3pnDvK5hzM0jYlDXWohSR9RPCL4ZHTYc2CKiC5lNnYPR4UxS7/UNImwyiIO+8DRgfGhX3AWkQE3ox5XvotgY0rcJ6MmxLdaypgPTxEmxbdkDdcc7PY/NzUn07QVb94JzQcmDnpBz7N9ptqW6qPMFZ5n7fQk2SI5QstegeCrXCUHB3NWPaVruX583fA0mgFB6akjrQsHh4QnjiXBfS6HQNeOeDqpuQyEE5MjlXTMZ6zbf9D1pDoHuQLnL/6CnfYFICTey8rNGHmG4y1An9nYSnrlWIx0pdx93+QbUaWXCOHClCpk0FOhPz7pO12S4XxAZ+CQKn1lumGHVqfUBi1FIq1ywRESv2VcTV/GlNUrMHwZldQd8nK842OUC2SCJIMnv62gTM4hqfyeI6OYdjcztEFDuLR2EJCR9PhZiv14a9wSk+IiJm7UBtyvddUV5JIo7et3gVvlNJF3FXuGqxT2ZNtKiFnk3MgNcyWXNOTH3DK2VdSt89/8sClhOQFYDGp29XxPC7gnY8DTxdvZdoBDXHQvGaTZVgrS8byyKyhXKFWxEfK77Ng/7aucfR0ClfrOkqNe0dMF8kN3Q9adBNRs7+Jaa8+HSyZYlusssewdSWMzaYFDjXYcOJ7TnYDodst6jabnF9X3IV0wPl6XdJ1EFs85NVINcTlDWM6aGTgvvZq5jIeEi2S7b/QPkl6yUjscG76/lVrPpxikpPFWxW58ILQbcvoRkgYHpg1wIZ6o+14ZmgUL0yrg+g/RgY3C/LrH0e0k1wjvmXp0NUG8uH351By1E4rKA/9rWS6Mv490cvucgX WRClp0Qc kchfcVfmf/f/POH5IxxVNXDd6k3+39Sk4hpZ3ALEIavEa5yLhjumZ4aDQWeoMSlm4Y6+PwSJmEGsarOfDJr7qMRRLR6i4R3LTRqH1GIUTcrhEb15IJskAdvNUgKQTUwSCc1/MXZTGCX+h2wNQsh8ZSpOi722bKXsWlOA42DzGB8Ty6V1QU1+bFvqKDt8jOVwSQQigiWNW0SFaOnEvIznAArA5opJe6tAhPJ3L1Wick92MipvBYX51lFLwIsn3lXQ1+CbGkdZVO2U0WOXyg7HvsSZ82J0MBm+UyQjGYOufVjZbaav42SdjiqcAc+lYxc9ibYdCf8vwphkXrPjpR2CMAeNPV+CnfUWWA1wnjMILl8sugD5dvkI4ixz5we7GPn10N/O+LBuXd/eejKNg/y9ujPVxpVyBv0IVdYAUkbHfGuAUMnrdKogwbDKYig== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, 5 Sep 2024, Usama Arif wrote: > On 05/09/2024 09:46, Hugh Dickins wrote: > > On Fri, 30 Aug 2024, Usama Arif wrote: > > > >> From: Yu Zhao > >> > >> If a tail page has only two references left, one inherited from the > >> isolation of its head and the other from lru_add_page_tail() which we > >> are about to drop, it means this tail page was concurrently zapped. > >> Then we can safely free it and save page reclaim or migration the > >> trouble of trying it. > >> > >> Signed-off-by: Yu Zhao > >> Tested-by: Shuang Zhai > >> Acked-by: Johannes Weiner > >> Signed-off-by: Usama Arif > > > > I'm sorry, but I think this patch (just this 1/6) needs to be dropped: > > it is only an optimization, and unless a persuasive performance case > > can be made to extend it, it ought to go (perhaps revisited later). > > > > I am ok for patch 1 only to be dropped. Patches 2-6 are not dependent on it. > > Its an optimization and underused shrinker doesn't depend on it. > Its possible that folio->new_folio below might fix it? But if it doesn't, > I can retry later on to make this work and resend it only if it alone shows > a significant performance improvement. > > Thanks a lot for debugging this! and sorry it caused an issue. > > > > The problem I kept hitting was that all my work, requiring compaction and > > reclaim, got (killably) stuck in or repeatedly calling reclaim_throttle(): > > because nr_isolated_anon had grown high - and remained high even when the > > load had all been killed. > > > > Bisection led to the 2/6 (remap to shared zeropage), but I'd say this 1/6 > > is the one to blame. I was intending to send this patch to "fix" it: > > > > --- a/mm/huge_memory.c > > +++ b/mm/huge_memory.c > > @@ -3295,6 +3295,8 @@ static void __split_huge_page(struct pag > > folio_clear_active(new_folio); > > folio_clear_unevictable(new_folio); > > list_del(&new_folio->lru); > > + node_stat_sub_folio(folio, NR_ISOLATED_ANON + > > + folio_is_file_lru(folio)); > > Maybe this should have been below? (Notice the folio->new_folio) > > + node_stat_sub_folio(new_folio, NR_ISOLATED_ANON + > + folio_is_file_lru(new_folio)); Yes, how very stupid of me (I'm well aware of the earlier bugfixes here, and ought not to have done a blind copy and paste of those lines): thanks. However... it makes no difference. I gave yours a run, expecting a much smaller negative number, but actually it works out much the same: because, of course, by this point in the code "folio" is left pointing to a small folio, and is almost equivalent to new_folio for the node_stat calculations. (I say "almost" because I guess there's a chance that the page at folio got reused as part of a larger folio meanwhile, which would throw the stats off (if they weren't off already).) So, even with your fix to my fix, this code doesn't work. Whereas a revert of this 1/6 does work: nr_isolated_anon and nr_isolated_file come to 0 when the system is at rest, as expected (and as silence from vmstat_refresh confirms - /proc/vmstat itself presents negative stats as 0, in order to hide transient oddities). Hugh > > > if (!folio_batch_add(&free_folios, new_folio)) { > > mem_cgroup_uncharge_folios(&free_folios); > > free_unref_folios(&free_folios); > > > > And that ran nicely, until I terminated the run and did > > grep nr_isolated /proc/sys/vm/stat_refresh /proc/vmstat > > at the end: stat_refresh kindly left a pr_warn in dmesg to say > > nr_isolated_anon -334013737 > > > > My patch is not good enough. IIUC, some split_huge_pagers (reclaim?) > > know how many pages they isolated and decremented the stats by, and > > increment by that same number at the end; whereas other split_huge_pagers > > (migration?) decrement one by one as they go through the list afterwards. > > > > I've run out of time (I'm about to take a break): I gave up researching > > who needs what, and was already feeling this optimization does too much > > second guessing of what's needed (and its array of VM_WARN_ON_ONCE_FOLIOs > > rather admits to that). > > > > And I don't think it's as simple as moving the node_stat_sub_folio() > > into 2/6 where the zero pte is substituted: that would probably handle > > the vast majority of cases, but aren't there others which pass the > > folio_ref_freeze(new_folio, 2) test - the title's zapped tail pages, > > or racily truncated now that the folio has been unlocked, for example? > > > > Hugh