From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: Lorenzo Stoakes <ljs@kernel.org>, Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
Christian Brauner <brauner@kernel.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Jan Kara <jack@suse.cz>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Vlastimil Babka <vbabka@kernel.org>,
<linux-fsdevel@vger.kernel.org>, <linux-mm@kvack.org>
Subject: Re: [PATCH RFC] fs: drop_caches: introduce per-node drop_caches interface
Date: Thu, 9 Apr 2026 16:21:43 +0800
Message-ID: <f8eec130-efe5-4e08-b5b4-a9701ccc7056@huawei.com>
In-Reply-To: <addRr9Vn_LNiXp11@lucifer>
On 4/9/2026 3:19 PM, Lorenzo Stoakes wrote:
> On Thu, Apr 09, 2026 at 09:06:08AM +0200, Michal Hocko wrote:
>> On Thu 09-04-26 14:35:03, Kefeng Wang wrote:
>>> Add a sysfs interface at /sys/devices/system/node/nodeX/drop_caches
>>> to allow dropping caches on a specific NUMA node.
>>>
>>> The existing global drop_caches mechanism (/proc/sys/vm/drop_caches)
>>> operates across all NUMA nodes indiscriminately, causing,
>>> - Unnecessary eviction of hot cache on some nodes
>>> - Performance degradation for applications with NUMA affinity
>>> - Long times spent on large systems with lots of memory
>>>
>>> By exposing a per-node interface, administrators can,
>>> - Target specific nodes experiencing memory pressure
>>> - Preserve cache on unaffected nodes
>>> - Perform reclamation with finer granularity
>>
>> Quite honestly drop_caches is not the best interface to build any new
>> functionality on top of. It has backfired a lot in the past and we have
>> tried to make it extra clear that this should be used for debugging
>> purposes only. Extending it further sounds like a bad step.
>
> Agreed, there is still _huge_ confusion out there as to what this is for.
>
> (If I hear another person tell me this is a way to 'free up memory' I'll scream
> :)
>
> Really I think it should be seen as a legacy thing for, as Michal says,
> debug purposes (it would have been a good idea to put this in debugfs tbh).
>
> And adding something to sysfs as a permanent, maintained interface for this
> purpose seems problematic.
>
> What is the use-case for this? Why are you dropping the caches? Presumably
> for debugging/perf measuring/etc. purposes? The cover letter is lacking
> that.
Our use case is as follows. For hot-pluggable nodes, we migrate anon
pages to other nodes, but for pagecache we just evict it, since the
pages can be refaulted. For memory tiering, a large amount of cold
memory is stored in the low tier, and we think that evicting pagecache
is better than migrating it back to the high tier. This also avoids the
risk of accessing potentially faulty memory.
>
> I wonder, if this is something we should move forward with, whether we
> might be better off putting it in debugfs as a result?
>
>>
>>> One use case is hot-pluggable NUMA nodes: during hot-remove, simply
>>> dropping pagecache is far more efficient than migrating large amounts
>>> of pages to other nodes, and it also eliminates the risk of accessing
>>> potentially faulty memory.
>>
>> Can the per-node reclaim interface help with this by
>> any means?
>>
>> --
>> Michal Hocko
>> SUSE Labs
>
> Thanks, Lorenzo