From: SeongJae Park <sj@kernel.org>
To: Ravi Jonnalagadda <ravis.opensrc@gmail.com>
Cc: SeongJae Park <sj@kernel.org>,
damon@lists.linux.dev, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
akpm@linux-foundation.org, corbet@lwn.net, bijan311@gmail.com,
ajayjoshi@micron.com, honggyu.kim@sk.com, yunjeong.mun@sk.com
Subject: Re: [RFC PATCH v4 4/4] mm/damon: add PA-mode cache for eligible memory detection lag
Date: Mon, 23 Feb 2026 21:54:50 -0800
Message-ID: <20260224055451.58713-1-sj@kernel.org>
In-Reply-To: <20260223123232.12851-5-ravis.opensrc@gmail.com>
On Mon, 23 Feb 2026 12:32:32 +0000 Ravi Jonnalagadda <ravis.opensrc@gmail.com> wrote:
> In PA-mode, DAMON needs time to re-detect hot memory at new physical
> addresses after migration. This causes the goal metrics to temporarily
> show incorrect values until detection catches up.
I agree this can happen, and it could be problematic on some setups.
>
> Add an eligible cache mechanism to compensate for this detection lag:
>
> - Track migration deltas per node using a rolling window that
> automatically expires old data
> - Use direction-aware adjustment: for target nodes (receiving memory),
> use max(detected, predicted) to ensure migrated memory is counted
> even before detection catches up; for source nodes (losing memory),
> use predicted values when detection shows unreliable low values
> - Maintain the zero-sum property across nodes to preserve total
> eligible memory
> - Include cooldown mechanism to keep cache active while detection
> stabilizes after migration stops
> - Add time-based expiry to clear stale cache data when no migration
> occurs for a configured period
>
> The cache uses max_eligible tracking to handle detection oscillation,
> prioritizing peak observed values over potentially stale snapshots.
> A threshold check prevents quota oscillation when detection swings
> between zero and small values.
But I feel this solution might be overfitted to a specific setup.
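To make the quoted direction-aware adjustment concrete for readers of this thread, here is a minimal userspace sketch. It is not the code from the patch: struct eligible_cache, record_migration(), adjust_eligible() and the low_threshold parameter are hypothetical names used only for illustration, and the rolling window, cooldown and time-based expiry are left out.

```c
#include <assert.h>

/* Hypothetical per-node cache entry; not the struct from the patch. */
struct eligible_cache {
	unsigned long snapshot;	/* eligible bytes seen before migration */
	long migrated_delta;	/* net bytes moved in (+) or out (-) */
};

/* Account a migration on both ends so the deltas always sum to zero. */
static void record_migration(struct eligible_cache cache[], int src, int dst,
			     unsigned long bytes)
{
	cache[src].migrated_delta -= (long)bytes;
	cache[dst].migrated_delta += (long)bytes;
}

/* Predicted eligible bytes: snapshot plus net migration, floored at zero. */
static unsigned long predicted_eligible(const struct eligible_cache *c)
{
	long predicted = (long)c->snapshot + c->migrated_delta;

	return predicted > 0 ? (unsigned long)predicted : 0;
}

/*
 * Direction-aware adjustment as described in the quoted text:
 * - target node (net receiver): max(detected, predicted)
 * - source node (net sender): predicted, but only when the detected
 *   value is below low_threshold (an assumed stand-in for "unreliable")
 * - otherwise: trust the detected value
 */
static unsigned long adjust_eligible(const struct eligible_cache *c,
				     unsigned long detected,
				     unsigned long low_threshold)
{
	unsigned long predicted = predicted_eligible(c);

	if (c->migrated_delta > 0)
		return detected > predicted ? detected : predicted;
	if (c->migrated_delta < 0 && detected < low_threshold)
		return predicted;
	return detected;
}
```

With a snapshot of 100 bytes on node 0 and 0 bytes on node 1, recording a 40 byte migration from node 0 to node 1 makes the sketch report 40 for node 1 even while detection there still shows 0, and 60 for node 0 when detection there drops below the threshold.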
>
> Signed-off-by: Ravi Jonnalagadda <ravis.opensrc@gmail.com>
> ---
> include/linux/damon.h | 45 +++++
> mm/damon/core.c | 421 +++++++++++++++++++++++++++++++++++----
> mm/damon/sysfs-schemes.c | 30 +++
> 3 files changed, 460 insertions(+), 36 deletions(-)
The size of the change is quite big. I'm now curious whether the problem is
significant enough to justify a change of this size, and whether this solution
is the only one and the best one.
First of all, I'm curious whether the problem is that significant. I assume you
may have seen the issue on the test setup that you shared in the cover letter.
From my understanding of the cover letter of this patch series, however, you
are testing this on a setup that has two complementary schemes, and you use the
TEMPORAL tuner. The TEMPORAL tuner was motivated by setups that have no factor
that moves the quota goal value without additional intervention. In a setup
with complementary schemes, the schemes become such factors for each other.
In that case, the TEMPORAL tuner might be worse in terms of the size of temporal
oscillations. I don't know the details of your test setup, but I suspect the use
of the TEMPORAL tuner might have made the issue look bigger than it really is.
I also assume real-world users would use DAMON with auto-tuning mostly
because they don't know the access pattern of the system and assume it will be
dynamic. In that case, even if we perfectly solve the issue, some oscillation
will still happen. So I think the issue in the real world might be smaller
than what we can find on some specific test setups.
Meanwhile, the node_[in]eligible_mem_bp concept makes sense to me. I'm worried
that this patch is unnecessarily delaying the progress of the main change.
So, unless we have clear evidence of the significance of this issue, I'd prefer
dropping this patch for now. After that, if the issue turns out to be
significant or this solution proves to be significantly beneficial, either on
your next, more realistic test setup or in real-world usage after the main
change is upstreamed, we can revisit it. What do you think?
Thanks,
SJ
[...]
Thread overview:
2026-02-23 12:32 [RFC PATCH v3 0/4] mm/damon: Introduce node_eligible_mem_bp and node_ineligible_mem_bp Quota Goal Metrics Ravi Jonnalagadda
2026-02-23 12:32 ` [RFC PATCH v3 1/4] mm/damon/sysfs: set goal_tuner after scheme creation Ravi Jonnalagadda
2026-02-24 1:40 ` SeongJae Park
2026-02-23 12:32 ` [RFC PATCH v3 2/4] mm/damon: fix esz=0 quota bypass allowing unlimited migration Ravi Jonnalagadda
2026-02-24 1:54 ` SeongJae Park
2026-02-23 12:32 ` [RFC PATCH v3 3/4] mm/damon: add node_eligible_mem_bp and node_ineligible_mem_bp goal metrics Ravi Jonnalagadda
2026-02-24 4:27 ` SeongJae Park
2026-02-23 12:32 ` [RFC PATCH v4 4/4] mm/damon: add PA-mode cache for eligible memory detection lag Ravi Jonnalagadda
2026-02-24 5:54 ` SeongJae Park [this message]
2026-02-24 5:36 ` [RFC PATCH v3 0/4] mm/damon: Introduce node_eligible_mem_bp and node_ineligible_mem_bp Quota Goal Metrics SeongJae Park