From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yosry Ahmed <yosryahmed@google.com>
Date: Thu, 20 Jul 2023 20:39:42 -0700
Subject: Re: [RFC PATCH v2 3/3] mm: mlock: update mlock_pte_range to handle large folio
To: "Yin, Fengwei"
Cc: Hugh Dickins, Yu Zhao, linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, willy@infradead.org, david@redhat.com, ryan.roberts@arm.com, shy828301@gmail.com
References: <20230712060144.3006358-1-fengwei.yin@intel.com> <208aff10-8a32-6ab8-f03a-7f3c9d3ca0f7@intel.com> <438d6f6d-2571-69d9-844e-9af9e6b4f820@intel.com> <79f6822-f2f8-aba4-b517-b661d07e2d@google.com>
Content-Type: text/plain; charset="UTF-8"
On Thu, Jul 20, 2023 at 8:19 PM Yin, Fengwei wrote:
>
>
>
> On 7/21/2023 9:35 AM, Yosry Ahmed wrote:
> > On Thu, Jul 20, 2023 at 6:12 PM Yin, Fengwei wrote:
> >>
> >>
> >>
> >> On 7/21/2023 4:51 AM, Yosry Ahmed wrote:
> >>> On Thu, Jul 20, 2023 at 5:03 AM Yin, Fengwei wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 7/19/2023 11:44 PM, Yosry Ahmed wrote:
> >>>>> On Wed, Jul 19, 2023 at 7:26 AM Hugh Dickins wrote:
> >>>>>>
> >>>>>> On Wed, 19 Jul 2023, Yin Fengwei wrote:
> >>>>>>>>>>>>>>>> Could this also happen with a normal 4K page? I mean, when the user tries
> >>>>>>>>>>>>>>>> to munlock a normal 4K page and this 4K page is isolated, does it then
> >>>>>>>>>>>>>>>> become an unevictable page?
> >>>>>>>>>>>>>>> Looks like it can be possible. If cpu 1 is in __munlock_folio() and
> >>>>>>>>>>>>>>> cpu 2 is isolating the folio for any purpose:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> cpu 1                          cpu 2
> >>>>>>>>>>>>>>>                                isolate folio
> >>>>>>>>>>>>>>> folio_test_clear_lru() // 0
> >>>>>>>>>>>>>>>                                putback folio // add to unevictable list
> >>>>>>>>>>>>>>> folio_test_clear_mlocked()
> >>>>>>>>>>>> folio_set_lru()
> >>>>>>> Let's wait for the response from Hugh and Yu. :)
> >>>>>>
> >>>>>> I haven't been able to give it enough thought, but I suspect you are right:
> >>>>>> that the current __munlock_folio() is deficient when folio_test_clear_lru()
> >>>>>> fails.
> >>>>>>
> >>>>>> (Though it has not been reported as a problem in practice: perhaps because
> >>>>>> so few places try to isolate from the unevictable "list".)
> >>>>>>
> >>>>>> I forget what my order of development was, but it's likely that I first
> >>>>>> wrote the version for our own internal kernel - which used our original
> >>>>>> lruvec locking, which did not depend on getting PG_lru first (having got
> >>>>>> lru_lock, it checked memcg, then tried again if that had changed).
> >>>>>
> >>>>> Right. Just holding the lruvec lock without clearing PG_lru would not
> >>>>> protect against memcg movement in this case.
> >>>>>
> >>>>>> I was uneasy with the PG_lru aspect of the upstream lru_lock implementation,
> >>>>>> but it turned out to work okay - elsewhere; but it looks as if I missed
> >>>>>> its implication when adapting __munlock_page() for upstream.
> >>>>>>
> >>>>>> If I were trying to fix this __munlock_folio() race myself (sorry, I'm
> >>>>>> not), I would first look at that aspect: instead of folio_test_clear_lru()
> >>>>>> always behaving like a trylock, could a "folio_wait_clear_lru()" or whatever
> >>>>>> spin waiting for PG_lru here?
> >>>>>
> >>>>> +Matthew Wilcox
> >>>>>
> >>>>> It seems to me that before 70dea5346ea3 ("mm/swap: convert lru_add to
> >>>>> a folio_batch"), __pagevec_lru_add_fn() (aka lru_add_fn()) used to do
> >>>>> folio_set_lru() before checking folio_evictable(). While this is
> >>>>> probably extraneous since folio_batch_move_lru() will set it again
> >>>>> afterwards, it's probably harmless given that the lruvec lock is held
> >>>>> throughout (so no one can complete the folio isolation anyway), and
> >>>>> given that there were no problems introduced by this extra
> >>>>> folio_set_lru() as far as I can tell.
> >>>>
> >>>> After checking the related code, yes: it looks fine to move folio_set_lru()
> >>>> before the if (folio_evictable(folio)) check in lru_add_fn(), because the
> >>>> lru lock is held there.
> >>>>
> >>>>>
> >>>>> If we restore folio_set_lru() to lru_add_fn(), and revert 2262ace60713
> >>>>> ("mm/munlock: delete smp_mb() from __pagevec_lru_add_fn()") to restore
> >>>>> the strict ordering between manipulating PG_lru and PG_mlocked, I suppose
> >>>>> we can get away without having to spin. Again, that would only be possible
> >>>>> if reworking mlock_count [1] is acceptable. Otherwise, we can't clear
> >>>>> PG_mlocked before PG_lru in __munlock_folio().
> >>>>
> >>>> What about the following change, which moves the mlocked handling before
> >>>> the lru check in __munlock_folio()?
> >>>
> >>> It seems correct to me on a high level, but I think there is a subtle problem:
> >>>
> >>> We clear PG_mlocked before trying to isolate, to make sure that if
> >>> someone already has the folio isolated they will put it back on an
> >>> evictable list; then, if we are able to isolate the folio ourselves and
> >>> find that the mlock_count is > 0, we set PG_mlocked again.
> >>>
> >>> There is a small window where PG_mlocked might be temporarily cleared
> >>> but the folio is not actually munlocked (i.e. we don't update the
> >>> NR_MLOCK stat). In that window, a racing reclaimer on a different cpu
> >>> may find VM_LOCKED in a different vma, and call mlock_folio(). In
> >>> mlock_folio(), we will call folio_test_set_mlocked(folio) and see that
> >>> PG_mlocked is clear, so we will increment the MLOCK stats, even though
> >>> the folio was already mlocked. This can cause the MLOCK stats to be
> >>> unbalanced (more increments than decrements), no?
> >>
> >> Looks like NR_MLOCK is always tied to the PG_mlocked bit, so it should not
> >> be possible for it to become unbalanced.
> >>
> >> Let's say:
> >>   mlock_folio()         NR_MLOCK increases and mlocked is set
> >>   mlock_folio()         NR_MLOCK unchanged, as the folio is already mlocked
> >>
> >>   __munlock_folio() with isolated folio: NR_MLOCK decreases (to 0) and
> >>                         mlocked is cleared
> >>
> >>   folio_putback_lru()
> >>   reclaimed:            mlock_folio(): NR_MLOCK increases and mlocked is set
> >>
> >>   munlock_folio()       NR_MLOCK decreases (to 0) and mlocked is cleared
> >>   munlock_folio()       NR_MLOCK unchanged, as the folio has mlocked clear
> >
> > Right. The problem with the diff is that we temporarily clear
> > PG_mlocked *without* updating NR_MLOCK.
> >
> > Consider a folio that is mlocked by two vmas: NR_MLOCK = folio_nr_pages.
> >
> > Assume cpu 1 is doing __munlock_folio() from one of the vmas, while
> > cpu 2 is doing reclaim.
> >
> > cpu 1                      cpu 2
> > clear PG_mlocked
> >                            folio_referenced()
> >                            mlock_folio()
> >                            set PG_mlocked
> >                            add to NR_MLOCK
> > mlock_count > 0
> > set PG_mlocked
> > goto out
> >
> > Result: NR_MLOCK = folio_nr_pages * 2.
> >
> > When the folio is munlock()'d later from the second vma, NR_MLOCK will
> > be reduced to folio_nr_pages, but there are no mlocked folios left.
> >
> > This is the scenario that I have in mind. Please correct me if I am wrong.
>
> Yes. This looks possible, even if it may be difficult to hit.
>
> My first thought was that it's not possible, because an unevictable folio
> will not be picked by the reclaimer. But it is a possible case if things
> happen between clear_mlock and test_and_clear_lru:
>   folio_putback_lru() by another isolation user, like migration
>   the reclaimer picks the folio and calls mlock_folio()
>   reclaimer call folio
>
> The fix can be to follow the rule (keep NR_MLOCK coupled to the PG_mlocked
> bit) strictly.

Yeah, probably. I believe restoring the old ordering of manipulating
PG_lru and PG_mlocked with the memory barrier would be a simpler fix,
but this is only possible if the mlock_count rework gets merged.
>
>
> Regards
> Yin, Fengwei
>
>
>
> >>
> >>
> >> Regards
> >> Yin, Fengwei
> >>
> >>>
> >>>>
> >>>> diff --git a/mm/mlock.c b/mm/mlock.c
> >>>> index 0a0c996c5c21..514f0d5bfbfd 100644
> >>>> --- a/mm/mlock.c
> >>>> +++ b/mm/mlock.c
> >>>> @@ -122,7 +122,9 @@ static struct lruvec *__mlock_new_folio(struct folio *folio, struct lruvec *lruv
> >>>>  static struct lruvec *__munlock_folio(struct folio *folio, struct lruvec *lruvec)
> >>>>  {
> >>>>         int nr_pages = folio_nr_pages(folio);
> >>>> -       bool isolated = false;
> >>>> +       bool isolated = false, mlocked = true;
> >>>> +
> >>>> +       mlocked = folio_test_clear_mlocked(folio);
> >>>>
> >>>>         if (!folio_test_clear_lru(folio))
> >>>>                 goto munlock;
> >>>> @@ -134,13 +136,17 @@ static struct lruvec *__munlock_folio(struct folio *folio, struct lruvec *lruvec
> >>>>                 /* Then mlock_count is maintained, but might undercount */
> >>>>                 if (folio->mlock_count)
> >>>>                         folio->mlock_count--;
> >>>> -               if (folio->mlock_count)
> >>>> +               if (folio->mlock_count) {
> >>>> +                       if (mlocked)
> >>>> +                               folio_set_mlocked(folio);
> >>>>                         goto out;
> >>>> +               }
> >>>>         }
> >>>>         /* else assume that was the last mlock: reclaim will fix it if not */
> >>>>
> >>>> munlock:
> >>>> -       if (folio_test_clear_mlocked(folio)) {
> >>>> +       if (mlocked) {
> >>>>                 __zone_stat_mod_folio(folio, NR_MLOCK, -nr_pages);
> >>>>                 if (isolated || !folio_test_unevictable(folio))
> >>>>                         __count_vm_events(UNEVICTABLE_PGMUNLOCKED, nr_pages);
> >>>>
> >>>>
> >>>>>
> >>>>> I am not saying this is necessarily better than spinning, just a note
> >>>>> (and perhaps selfishly making [1] more appealing ;)).
> >>>>>
> >>>>> [1] https://lore.kernel.org/lkml/20230618065719.1363271-1-yosryahmed@google.com/
> >>>>>
> >>>>>>
> >>>>>> Hugh