From: Ravi Jonnalagadda <ravis.opensrc@gmail.com>
To: SeongJae Park <sj@kernel.org>
Cc: damon@lists.linux.dev, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
akpm@linux-foundation.org, corbet@lwn.net, bijan311@gmail.com,
ajayjoshi@micron.com, honggyu.kim@sk.com, yunjeong.mun@sk.com
Subject: Re: (sashiko review) [PATCH v6 1/1] mm/damon: add node_eligible_mem_bp and node_ineligible_mem_bp goal metrics
Date: Mon, 6 Apr 2026 12:47:56 -0700 [thread overview]
Message-ID: <CALa+Y14oWqu5+DbkENy7GgBjc=dCbFTaoOCr1i4=9CN-ZNRgEA@mail.gmail.com> (raw)
In-Reply-To: <20260405224550.76218-1-sj@kernel.org>
On Sun, Apr 5, 2026 at 3:45 PM SeongJae Park <sj@kernel.org> wrote:
>
>
> Ravi, thank you for reposting this patch after the rebase. This time sashiko
> was able to review this, and found good points including things that deserve
> another revision of this patch.
>
> Forwarding full sashiko review in a reply format with my inline comments below,
> for sharing details of my view and doing followup discussions via mails. Ravi,
> could you please reply?
>
Thanks, SJ. Providing your comments on top of sashiko's review is very helpful.
> > # review url: https://sashiko.dev/#/patchset/20260405184247.2690-2-ravis.opensrc@gmail.com
> > # start of sashiko.dev inline review
> > commit 5f6ea8da06d136b84d97827777acf2b07047d42c
> > Author: Ravi Jonnalagadda <ravis.opensrc@gmail.com>
> > mm/damon: add node_eligible_mem_bp and node_ineligible_mem_bp goal metrics
> >
> > This commit adds quota goal metrics for tracking scheme-eligible and
> > ineligible memory distribution across NUMA nodes. It iterates over
> > scheme-eligible regions and uses damon_get_folio() to determine placement.
> >
> > > diff --git a/mm/damon/core.c b/mm/damon/core.c
> > > --- a/mm/damon/core.c
> > > +++ b/mm/damon/core.c
> >
> > [ ... ]
> >
> > > +static unsigned long damos_calc_eligible_bytes(struct damon_ctx *c,
> > > + struct damos *s, int nid, unsigned long *total)
> > > +{
> > > + struct damon_target *t;
> > > + struct damon_region *r;
> > > + unsigned long total_eligible = 0;
> > > + unsigned long node_eligible = 0;
> >
> > Can total_eligible and node_eligible overflow on 32-bit architectures?
> > Since these accumulators track raw physical bytes, regions spanning multiple
> > gigabytes might wrap around the 32-bit unsigned long limit. Would it be
> > safer to use u64 or track the counts in pages instead?
>
> Makes sense. Ravi, could we update these to use phys_addr_t?
Agreed. I'll update total_eligible, node_eligible, and the function
return type to use phys_addr_t throughout damos_calc_eligible_bytes().
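To illustrate why (a userspace sketch with made-up sizes, not kernel code): with three 2 GiB regions, a 32-bit unsigned long accumulator wraps while a 64-bit one does not. phys_addr_t is 64 bits wide whenever CONFIG_PHYS_ADDR_T_64BIT is set (e.g. with PAE/LPAE), so it avoids the wrap:

```c
#include <stdint.h>

/* Illustrative only: three hypothetical 2 GiB regions summed the way
 * damos_calc_eligible_bytes() accumulates bytes. On 32-bit kernels,
 * unsigned long is 32 bits wide, so the running total wraps. */
#define REGION_SZ (2ULL << 30) /* 2 GiB */

static uint32_t sum_as_u32(void) /* models a 32-bit unsigned long */
{
	uint32_t total = 0;
	int i;

	for (i = 0; i < 3; i++)
		total += (uint32_t)REGION_SZ;
	return total; /* wrapped: 2 GiB, not 6 GiB */
}

static uint64_t sum_as_u64(void) /* models a 64-bit phys_addr_t */
{
	uint64_t total = 0;
	int i;

	for (i = 0; i < 3; i++)
		total += REGION_SZ;
	return total; /* correct: 6 GiB */
}
```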
>
> >
> > > +
> > > + damon_for_each_target(t, c) {
> > > + damon_for_each_region(r, t) {
> > > + phys_addr_t addr, end_addr;
> > > +
> > > + if (!__damos_valid_target(r, s))
> > > + continue;
> >
> > Should this loop also check damos_core_filter_out()? By only checking
> > __damos_valid_target(), it seems memory explicitly excluded by user-defined
> > region filters (like target ID or memcg limits) is still counted as
> > eligible, which might inflate the total and affect the quota tuning loop.
>
> No, we intentionally defined "eligible" based on only the access pattern. We
> may be able to extend this to apply filters in future, if we find a use case.
> But for now, I'd suggest keeping this as is.
>
Got it. Will keep it as is.
> >
> > > +
> > > + /* Convert from core address units to physical bytes */
> > > + addr = r->ar.start * c->addr_unit;
> > > + end_addr = r->ar.end * c->addr_unit;
> > > + while (addr < end_addr) {
> >
> > Could this unbounded loop cause CPU soft lockups? Since DAMON regions can
> > span hundreds of gigabytes, executing millions of folio lookups sequentially
> > without a cond_resched() could monopolize the CPU.
>
> Good point. Most DAMOS action implementations are also doing cond_resched()
> per region. Ravi, could we add cond_resched() per region to this function,
> too?
Will add cond_resched() at the end of each region's processing.
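To be explicit about the placement, here is a userspace mock (mock_region, mock_cond_resched(), and walk_regions() are illustrative stand-ins, not DAMON APIs) showing the yield point firing once per region rather than once per folio:

```c
#include <stddef.h>

/* Userspace mock, illustrative only: mock_region and
 * mock_cond_resched() stand in for struct damon_region and
 * cond_resched(). */
struct mock_region { unsigned long start, end; };

static int resched_calls;

static void mock_cond_resched(void)
{
	resched_calls++;
}

/* Walk regions the way damos_calc_eligible_bytes() iterates,
 * yielding once after each region's folio walk, not per folio. */
static unsigned long walk_regions(const struct mock_region *rs, size_t n)
{
	unsigned long bytes = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		bytes += rs[i].end - rs[i].start; /* per-folio work elided */
		mock_cond_resched(); /* once per region */
	}
	return bytes;
}

/* Three small regions: expect three reschedule points. */
static int demo(void)
{
	const struct mock_region rs[] = {
		{ 0, 4096 }, { 4096, 8192 }, { 8192, 16384 },
	};

	resched_calls = 0;
	if (walk_regions(rs, 3) != 16384)
		return -1;
	return resched_calls;
}
```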
>
> >
> > > + struct folio *folio;
> > > + unsigned long folio_sz, counted;
> > > +
> > > + folio = damon_get_folio(PHYS_PFN(addr));
> >
> > What happens if this metric is assigned to a DAMON context configured for
> > virtual address space monitoring? If the context uses DAMON_OPS_VADDR,
> > passing a user-space virtual address to PHYS_PFN() might cause invalid
> > memory accesses or out-of-bounds page struct reads. Should this code
> > explicitly verify the operations type first?
>
> Good finding. We intend to support only paddr ops. But there is no guard for
> using this on vaddr ops configuration. Ravi, could we add underlying ops
> check? I think damon_commit_ctx() is a good place to add that. The check
> could be something like below?
>
I plan to add the ops type check directly in the metric functions
(damos_get_node_eligible_mem_bp and its counterpart) rather than in
damon_commit_ctx(). The functions will return 0 early
if c->ops.id != DAMON_OPS_PADDR.
That said, if you prefer the damon_commit_ctx() validation approach to
reject the configuration outright, I can implement it that way instead.
Please let me know your preference.
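The guard itself would be trivial; a userspace mock of the shape I have in mind (mock_ops_id, mock_ctx, and the function names are illustrative stand-ins for enum damon_ops_id and struct damon_ctx from include/linux/damon.h):

```c
/* Userspace mock, illustrative only: the real types live in
 * include/linux/damon.h. */
enum mock_ops_id { MOCK_OPS_VADDR, MOCK_OPS_FVADDR, MOCK_OPS_PADDR };

struct mock_ctx { enum mock_ops_id ops_id; };

/* Sketch of the early-return guard: the metric only makes sense for
 * physical address space monitoring, so non-paddr contexts get 0. */
static unsigned long mock_node_eligible_mem_bp(const struct mock_ctx *c)
{
	if (c->ops_id != MOCK_OPS_PADDR)
		return 0;
	return 10000; /* stand-in for the real per-node calculation */
}

static unsigned long bp_for(enum mock_ops_id id)
{
	struct mock_ctx c = { id };

	return mock_node_eligible_mem_bp(&c);
}
```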
> '''
> --- a/mm/damon/core.c
> +++ b/mm/damon/core.c
> @@ -1515,10 +1515,23 @@ static int damon_commit_sample_control(
> int damon_commit_ctx(struct damon_ctx *dst, struct damon_ctx *src)
> {
> int err;
> + struct damos *scheme;
> + struct damos_quota_goal *goal;
>
> dst->maybe_corrupted = true;
> if (!is_power_of_2(src->min_region_sz))
> return -EINVAL;
> + if (src->ops.id != DAMON_OPS_PADDR) {
> + damon_for_each_scheme(scheme, src) {
> + damos_for_each_quota_goal(goal, &scheme->quota) {
> + switch (goal->metric) {
> + case DAMOS_QUOTA_NODE_ELIGIBLE_MEM_BP:
> + case DAMOS_QUOTA_NODE_INELIGIBLE_MEM_BP:
> + return -EINVAL;
> + default:
> + break;
> + }
> + }
> + }
> + }
>
> err = damon_commit_schemes(dst, src);
> if (err)
> '''
>
> >
> > > + if (!folio) {
> > > + addr += PAGE_SIZE;
> > > + continue;
> > > + }
> > > +
> > > + folio_sz = folio_size(folio);
> > > + /*
> > > + * Clip to region boundaries to avoid counting
> > > + * bytes outside the region when folio spans
> > > + * region boundaries.
> > > + */
> > > + counted = min(folio_sz, (unsigned long)(end_addr - addr));
> >
> > Does this accurately calculate the remaining bytes if the region starts at an
> > unaligned offset within a large folio? folio_size() returns the full size
> > of the folio, so this calculation might overestimate the eligible bytes by
> > not accounting for the initial offset within the folio.
> >
> > > + total_eligible += counted;
> > > + if (folio_nid(folio) == nid)
> > > + node_eligible += counted;
> > > +
> > > + addr += folio_sz;
> >
> > Can this pointer advancement overshoot the end of the current folio? If
> > addr was unaligned, adding the full folio_sz will skip past the beginning
> > of the next contiguous physical memory folio.
>
> Above two points make sense to me. Ravi, could we fix these?
>
Will fix both. I'll compute the exact overlap between the region
[addr, end_addr) and the folio's actual physical range, and advance
addr to the folio's end rather than by folio_sz from an unaligned addr.
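Concretely, something like this userspace sketch (overlap_bytes() and next_addr() are illustrative helpers; in the real code fstart would be derived from the folio's PFN, i.e. addr minus the offset within the folio):

```c
#include <stdint.h>

/* Clip the folio's physical range [fstart, fstart + fsz) against the
 * region's remaining range [addr, end), instead of assuming the
 * folio begins at addr. */
static uint64_t overlap_bytes(uint64_t addr, uint64_t end,
			      uint64_t fstart, uint64_t fsz)
{
	uint64_t fend = fstart + fsz;
	uint64_t lo = addr > fstart ? addr : fstart;
	uint64_t hi = end < fend ? end : fend;

	return hi > lo ? hi - lo : 0;
}

/* Advance the cursor to the folio's end, not addr + fsz, so an
 * unaligned addr cannot overshoot into the next folio. */
static uint64_t next_addr(uint64_t fstart, uint64_t fsz)
{
	return fstart + fsz;
}
```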
> >
> > > + folio_put(folio);
> > > + }
> > > + }
> > > + }
> > > +
> > > + *total = total_eligible;
> > > + return node_eligible;
> > > +}
> >
> > [ ... ]
> >
> > > +static unsigned long damos_get_node_ineligible_mem_bp(struct damon_ctx *c,
> > > + struct damos *s, int nid)
> > > +{
> > > + unsigned long total_eligible = 0;
> > > + unsigned long node_eligible;
> > > +
> > > + if (nid < 0 || nid >= MAX_NUMNODES || !node_online(nid))
> > > + return 0;
> > > +
> > > + node_eligible = damos_calc_eligible_bytes(c, s, nid, &total_eligible);
> > > +
> > > + /* No eligible memory anywhere - ratio is undefined, return 0 */
> > > + if (!total_eligible)
> > > + return 0;
> > > +
> > > + /* Compute ineligible ratio directly: 10000 - eligible_bp */
> > > + return 10000 - mult_frac(node_eligible, 10000, total_eligible);
> > > +}
> >
> > Does this return value match the documented metric? The formula computes the
> > percentage of the system's eligible memory located on other NUMA nodes,
> > rather than the amount of actual ineligible (filtered out) memory residing
> > on the target node. Could this semantic mismatch cause confusion when
> > configuring quota policies?
>
> Nice catch. The name and the documentation are confusing. We actually
> confused a few times in previous revisions, and I'm again confused now. IIUC,
> the current implementation is the intended and right one for the given use
> case, though. If my understanding is correct, how about renaming
> DAMOS_QUOTA_NODE_INELIGIBLE_MEM_BP to
> DAMOS_QUOTA_NODE_ELIGIBLE_MEM_BP_COMPLEMENT, and updating the documentation
> together? Ravi, what do you think?
>
Agreed, the current name is confusing. How about
DAMOS_QUOTA_NODE_ELIGIBLE_MEM_BP_OFFNODE?
The rationale is that this metric measures "eligible memory that is off
this node" (i.e., on other nodes).
I think "offnode" conveys the physical meaning more directly than "complement".
That said, I'm happy to go with "complement" if you prefer; either is
clearer than "ineligible".
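To make the intended semantics concrete, a userspace mirror of the arithmetic (illustrative only; plain 64-bit division stands in for the kernel's mult_frac(), which additionally guards the intermediate product against overflow): with 3 GiB of 12 GiB total eligible memory on the node, the node is 2500 bp eligible and the metric reports 7500 bp "off node":

```c
#include <stdint.h>

/* Basis points of scheme-eligible memory residing on this node. */
static unsigned long eligible_bp(uint64_t node_eligible,
				 uint64_t total_eligible)
{
	if (!total_eligible)
		return 0;
	return (unsigned long)(node_eligible * 10000 / total_eligible);
}

/* The "complement"/"offnode" metric: basis points of eligible
 * memory residing on nodes other than the target node. */
static unsigned long eligible_bp_complement(uint64_t node_eligible,
					    uint64_t total_eligible)
{
	if (!total_eligible)
		return 0; /* ratio undefined, as in the patch */
	return 10000 - eligible_bp(node_eligible, total_eligible);
}
```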
> >
> >
> > # end of sashiko.dev inline review
> > # review url: https://sashiko.dev/#/patchset/20260405184247.2690-2-ravis.opensrc@gmail.com
>
>
> Thanks,
> SJ
>
Best Regards,
Ravi.
> # hkml [1] generated a draft of this mail. You can regenerate
> # this using below command:
> #
> # hkml patch sashiko_dev --for_forwarding \
> # 20260405184247.2690-2-ravis.opensrc@gmail.com
> #