From: Honggyu Kim <honggyu.kim@sk.com>
To: SeongJae Park <sj@kernel.org>
Cc: kernel_team@skhynix.com, Sang-Heon Jeon <ekffu200098@gmail.com>,
damon@lists.linux.dev, linux-mm@kvack.org
Subject: Re: [PATCH] mm/damon: update expired description of damos_action
Date: Sun, 3 Aug 2025 13:43:03 +0900
Message-ID: <68a36ae4-5dc0-4023-b850-2af0401d6d75@sk.com>
In-Reply-To: <20250803042208.50634-1-sj@kernel.org>
Hi SeongJae,
On 8/3/2025 1:22 PM, SeongJae Park wrote:
> On Sun, 3 Aug 2025 11:03:12 +0900 Honggyu Kim <honggyu.kim@sk.com> wrote:
>
>> Hi SeongJae and Sang-Heon,
>>
>> On 8/2/2025 1:50 AM, SeongJae Park wrote:
>>> On Sat, 2 Aug 2025 01:11:09 +0900 Sang-Heon Jeon <ekffu200098@gmail.com> wrote:
>>>
>>>> Hi, Honggyu
>>>>
>>>> On Fri, Aug 1, 2025 at 8:35 PM Honggyu Kim <honggyu.kim@sk.com> wrote:
>>>>>
>>>>> Hi Sang-Heon and SeongJae,
>>>>>
>>>>> On 8/1/2025 2:58 AM, SeongJae Park wrote:
>>>>>> Hello Sang-Heon,
>>>>>>
>>>>>> On Thu, 31 Jul 2025 22:22:30 +0900 Sang-Heon Jeon <ekffu200098@gmail.com> wrote:
>>>>>>
>>>>>>> Nowadays, damos operation actions support more various operation set.
>>>>>>> But comments(also, generated documentation) doesn't updated.
>>>>>>> So, fix the comments with current support status.
>>> [...]
>>>>>>> diff --git a/include/linux/damon.h b/include/linux/damon.h
>>> [...]
>>>>>>> * @DAMOS_WILLNEED: Call ``madvise()`` for the region with MADV_WILLNEED.
>>>>>>> * @DAMOS_COLD: Call ``madvise()`` for the region with MADV_COLD.
>>>>>>> - * @DAMOS_PAGEOUT: Call ``madvise()`` for the region with MADV_PAGEOUT.
>>>>>>> + * @DAMOS_PAGEOUT: Reclaim the region.
>>>>>>
>>>>>> Nice!
>>>>>
>>>>> But doesn't it make confusion about whether this pages out to disk or does
>>>>> demotion to the lower tier memory? It's because PAGEOUT action doesn't do
>>>>> demotion, but it looks "reclaim" includes pageout and demotion together in my
>>>>> understanding since /sys/kernel/mm/numa/demotion_enabled was introduced.
>>>
>>> To my understanding, DAMOS_PAGEOUT can also do demotion when demotion_enabled
>>> is set. Am I missing something?
>>
>> Actually no, please see below.
>
> I'm unsure to what point you are saying "no". Are you saying DAMOS_PAGEOUT can
> also do demotion when demotion_enabled is set? Or not? Could you please
> clarify, and add more explanations about why you think so?
I checked it again and found that I had pointed to the wrong place. Please see below.
>
>>
>> do_demote_pass in shrink_folio_list()
>> https://github.com/torvalds/linux/blob/v6.16/mm/vmscan.c#L1122
>>
>> The do_demote_pass is used here.
>> https://github.com/torvalds/linux/blob/v6.16/mm/vmscan.c#L1293-L1302
>>
>> can_demote() implementation returns false when demotion_enabled is on.
>> https://github.com/torvalds/linux/blob/v6.16/mm/vmscan.c#L350-L351
>
> I'm confused again. Isn't it the opposite?
The thing is that the DAMOS_PAGEOUT call sequence is as follows.

  DAMOS_PAGEOUT
    -> damon_pa_pageout
      -> reclaim_pages
        -> reclaim_folio_list

reclaim_folio_list() sets "no_demotion = 1" in its scan_control, then invokes
shrink_folio_list().
https://github.com/torvalds/linux/blob/v6.16/mm/vmscan.c#L2237

Inside shrink_folio_list(), can_demote() is called, and it returns false because
sc->no_demotion is set, even if demotion_enabled is true.
https://github.com/torvalds/linux/blob/v6.16/mm/vmscan.c#L352-L353
>
> ```
> static bool can_demote(int nid, struct scan_control *sc,
>                        struct mem_cgroup *memcg)
> {
>         int demotion_nid;
>
>         if (!numa_demotion_enabled)
>                 return false;
> ```
>
> It returns "false" when demotion_enabled is "off" (unset). Am I reading
> something wrong...?
The full implementation of can_demote is as follows. It checks whether
"no_demotion" is set.

static bool can_demote(int nid, struct scan_control *sc,
		       struct mem_cgroup *memcg)
{
	int demotion_nid;

	if (!numa_demotion_enabled)
		return false;
	if (sc && sc->no_demotion)
		return false;

	demotion_nid = next_demotion_node(nid);
	if (demotion_nid == NUMA_NO_NODE)
		return false;

	/* If demotion node isn't in the cgroup's mems_allowed, fall back */
	return mem_cgroup_node_allowed(memcg, demotion_nid);
}

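
To make the point concrete, here is a minimal userspace sketch of the first two
checks in can_demote(). It is only a model, not kernel code: the reduced
struct scan_control, the next_demotion_node() stub (node 0 demotes to node 1),
and the helpers can_demote_model(), pageout_path_can_demote() and
default_path_can_demote() are all hypothetical stand-ins, and the memcg check is
left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1)

/* Stand-in for the demotion_enabled sysfs knob; not the kernel variable. */
bool numa_demotion_enabled;

/* Reduced stand-in for the kernel's struct scan_control. */
struct scan_control {
	unsigned int no_demotion:1;
};

/* Hypothetical stub: pretend node 0 can demote to node 1, nothing else can. */
static int next_demotion_node(int nid)
{
	return nid == 0 ? 1 : NUMA_NO_NODE;
}

/* Same order of checks as can_demote(), minus the memcg check. */
bool can_demote_model(int nid, const struct scan_control *sc)
{
	if (!numa_demotion_enabled)
		return false;
	if (sc && sc->no_demotion)
		return false;
	return next_demotion_node(nid) != NUMA_NO_NODE;
}

/* Models the DAMOS_PAGEOUT path: reclaim_folio_list() sets no_demotion. */
bool pageout_path_can_demote(int nid)
{
	struct scan_control sc = { .no_demotion = 1 };

	return can_demote_model(nid, &sc);
}

/* Models a reclaim path that leaves no_demotion clear. */
bool default_path_can_demote(int nid)
{
	struct scan_control sc = { .no_demotion = 0 };

	return can_demote_model(nid, &sc);
}
```

With numa_demotion_enabled set to true, pageout_path_can_demote(0) is still
false in this model, while default_path_can_demote(0) is true. That is the
behavior I am describing for DAMOS_PAGEOUT.
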
Hope this is helpful.
>
>>
>> The related commit is as follows.
>> mm/migrate: add sysfs interface to enable reclaim migration
>> https://github.com/torvalds/linux/commit/20b51af15e014cac63b58a4f8b8b323ac35bccce
>>
>>>
>>>>
>>>> My intention was just to synchronize with the Design documentation.
>>>>
>>>> So how about changing the description to `Page out the region`, Would
>>>> this be also confusing?
>>>> I feel like it would be clearer than using word "reclaim"
>>
>> I don't have a good idea, but it looks like a recursive explanation.
>>
>>>
>>> In my opinion, "reclaim" is good.
>>
>> I wish there could be better term that distinguishes between swap out and
>> demotion. In my understanding "reclaim" includes both swap out and demotion.
>
> "Reclaim" also includes writeback operations.
>
> I agree we have many rooms to improve in terms of terminologies. But, I'd
> argue we don't need to have only 1:1 mapping terminologies. Otherwise, maybe we
> don't need any documentation at all but just code. "Reclaim" is a good general
> terminology for describing an effort to get free pages on a memory domain (NUMA
> node, zone, etc), in my opinion.
>
> To be honest, btw, I'm not a fan of "promote/demote", and that was one of the
> reasons I insisted "DAMOS_MIGRATE_{HOT,COLD}" instead of
> "DAMOS_{PROMOTE,DEMOTE}".
I do agree. I feel the current memory tiering system doesn't represent the
tiered memory topology, so promote/demote might not be good terms for now.
>
>>
>> I also found that man page explanation about MADV_PAGEOUT is "reclaim these
>> pages". If this is correct, then maybe demotion isn't included in "reclaim".
>
> I interpret the term "reclaim" on the man page with a flexibility including my
> above definition, and hence I don't think this is contradicting in a very bad
> way against what I'm understanding. That is, I interpret the documentation
> says MADV_PAGEOUT can also do demote pages under certain conditions.
>
> I didn't write the documentation, so I may be completely wrong. I also
> frustratingly lost my resource to validate the main question of this discussion
> (whether DAMOS_PAGEOUT can also do demotion or not), for now. I'd like to
> continue this discussion based on code rather than a documentation that _might_
> be right or wrong, and preferably based on real tests if you or I have a good
> testing setup.
I think the code I showed above explains this well enough.
Thanks,
Honggyu