From: "Huang, Ying" <ying.huang@linux.alibaba.com>
To: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
SeongJae Park <sj@kernel.org>,
David Hildenbrand <david@redhat.com>, Zi Yan <ziy@nvidia.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Matthew Brost <matthew.brost@intel.com>,
Rakie Kim <rakie.kim@sk.com>, Byungchul Park <byungchul@sk.com>,
Gregory Price <gourry@gourry.net>,
Alistair Popple <apopple@nvidia.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
kernel-team@meta.com, Dave Hansen <dave.hansen@linux.intel.com>
Subject: Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
Date: Tue, 05 Aug 2025 09:27:30 +0800
Message-ID: <871ppqy2v1.fsf@DESKTOP-5N7EMDA>
In-Reply-To: <20250804144200.1047918-1-joshua.hahnjy@gmail.com> (Joshua Hahn's message of "Mon, 4 Aug 2025 07:41:59 -0700")
Joshua Hahn <joshua.hahnjy@gmail.com> writes:
> On Mon, 04 Aug 2025 09:24:31 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
>
>> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
>>
>> > On Fri, 01 Aug 2025 08:59:20 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
>> >
>> >> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
>> >>
>> >> > The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
>> >> > memory. Contrary to its user-facing name, it is internally referred to as
>> >> > "node_reclaim_mode".
>> >> >
>> >> > This can be confusing. But because we cannot change the name of the API since
>> >> > it has been in place since at least 2.6, let's try to be more explicit about
>> >> > what the behavior of this API is.
>> >> >
>> >> > Change the description to clarify what zone reclaim entails, and be explicit
>> >> > about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
>> >> > past already [1] [2].
>> >> >
>> >> > [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
>> >> > [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
>> >> >
>> >> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
>> >> > ---
>> >> > include/uapi/linux/mempolicy.h | 8 +++++++-
>> >> > 1 file changed, 7 insertions(+), 1 deletion(-)
>> >> >
>> >> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
>> >> > index 1f9bb10d1a47..6c9c9385ff89 100644
>> >> > --- a/include/uapi/linux/mempolicy.h
>> >> > +++ b/include/uapi/linux/mempolicy.h
>> >> > @@ -66,10 +66,16 @@ enum {
>> >> > #define MPOL_F_MORON (1 << 4) /* Migrate On protnone Reference On Node */
>> >> >
>> >> > /*
>> >> > + * Enabling zone reclaim means the page allocator will attempt to fulfill
>> >> > + * the allocation request on the current node by triggering reclaim and
>> >> > + * trying to shrink the current node.
>> >> > + * Fallback allocations on the next candidates in the zonelist are considered
>> >> > + * when reclaim fails to free up enough memory in the current node/zone.
>> >> > + *
>> >> > * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
>> >> > * ABI. New bits are OK, but existing bits can never change.
>> >>
>> >> As far as I know, sysctl isn't considered kernel ABI now. So, change
>> >> this line too?
>> >
>> > Hi Ying,
>> >
>> > Thank you for reviewing this patch!
>> >
>> > I didn't know that sysctl isn't considered a kernel ABI. If I understand your
>> > suggestion correctly, I can rephrase the comment block above to something like this?
>> >
>> > - * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
>> > - * ABI. New bits are OK, but existing bits can never change.
>> > + * These bit locations are exposed in the vm.zone_reclaim_mode sysctl and
>> > + * in /proc/sys/vm/zone_reclaim_mode. New bits are OK, but existing bits
>> > + * can never change.
>
> Hi Ying,
>
>> Because it's not an ABI, I think that we could avoid saying "never".
>
> My personal opinion is that we should keep this warning, since there has
> already been an example before where a developer tried to remove this bit [1],
> and this broke some behavior for userspace configurations. However, if I
> understand your comment correctly, you are suggesting that we should change
> the wording to not include "never", since sysctls are no longer an ABI (and
> therefore we should be OK to change what the values mean?)
>
> If that is the case, then I can send in another patch since I think the goals
> are a bit different for the two patches. With that said, I think we should
> keep the warning just to avoid any breakages in userspace, even if sysctl
> might not be considered an ABI anymore (also I must have missed this, I didn't
> know this at all!)
Sorry for the confusion. I agree that we shouldn't change the sysctl
interface in most cases. I just thought that we could soften the
wording a little. For example,

  New bits are OK, but existing bits shouldn't be changed.

I think it's still clear that we don't want to change the existing
bits.

However, my English is poor, so my suggestion may not make sense.
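
To illustrate why the existing bits should stay where they are:
administrators write the numeric value directly into the sysctl, so the
meaning of a stored value such as 5 is fixed by the bit positions. Below
is a minimal sketch, assuming the three bit values currently defined in
include/uapi/linux/mempolicy.h. describe_zone_reclaim_mode() is a
hypothetical userspace helper written for this example, not a kernel
function.

```c
#include <stdio.h>
#include <stddef.h>

/* Bit values as currently defined in include/uapi/linux/mempolicy.h. */
#define RECLAIM_ZONE  (1 << 0)  /* enable zone (node) reclaim */
#define RECLAIM_WRITE (1 << 1)  /* write out dirty pages during reclaim */
#define RECLAIM_UNMAP (1 << 2)  /* unmap pages during reclaim */

/*
 * Hypothetical helper (illustration only): render a
 * vm.zone_reclaim_mode value as a "|"-separated list of bit names,
 * or "off" when none of the known bits is set. Returns buf.
 */
const char *describe_zone_reclaim_mode(unsigned int mode,
                                       char *buf, size_t len)
{
    static const struct {
        unsigned int bit;
        const char *name;
    } bits[] = {
        { RECLAIM_ZONE,  "RECLAIM_ZONE"  },
        { RECLAIM_WRITE, "RECLAIM_WRITE" },
        { RECLAIM_UNMAP, "RECLAIM_UNMAP" },
    };
    const unsigned int known = RECLAIM_ZONE | RECLAIM_WRITE | RECLAIM_UNMAP;
    size_t used = 0;

    if (len == 0)
        return buf;

    if (!(mode & known)) {
        snprintf(buf, len, "off");
        return buf;
    }

    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
        int n;

        if (!(mode & bits[i].bit))
            continue;

        n = snprintf(buf + used, len - used, "%s%s",
                     used ? "|" : "", bits[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break;  /* buffer too small; stop appending */
        used += (size_t)n;
    }
    return buf;
}
```

With these definitions, a stored value of 5 decodes to RECLAIM_ZONE plus
RECLAIM_UNMAP. If bit 2 were ever given a different meaning, the same
stored value would silently change behavior for existing
configurations, which is the breakage the comment is warning about.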
>> > Thanks again for your review Ying, I hope you have a good day :-)
>>
>> Welcome! You too!
>>
>> With some trivial tweak, please feel free to add my
>>
>> Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com>
>>
>> in the future version.
>
> Thank you for your review Ying! Since there is a question remaining about what
> to do with the "never" statement, I will wait to send out a v3 with your
> review :-)
---
Best Regards,
Huang, Ying