Re: [PATCH 2/2] ksm: Optimize rmap_walk_ksm by passing a suitable address range

From: <xu.xin16@zte.com.cn>
To: <david@kernel.org>, <akpm@linux-foundation.org>
Cc: <chengming.zhou@linux.dev>, <hughd@google.com>,
	<wang.yaxin@zte.com.cn>, <yang.yang29@zte.com.cn>,
	<linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] ksm: Optimize rmap_walk_ksm by passing a suitable address range
Date: Wed, 14 Jan 2026 10:40:59 +0800 (CST)
Message-ID: <202601141040594302w9Pnbc3vzQLMkh8bQ80D@zte.com.cn>
In-Reply-To: <ba03780a-fd65-4a03-97de-bc0905106260@kernel.org>

> > Solution
> > ========
> > In fact, we can significantly improve performance by passing a more precise
> > range based on the given addr. Since the original pages merged by KSM
> > correspond to anonymous VMAs, the page offset can be calculated as
> > pgoff = address >> PAGE_SHIFT. Therefore, we can optimize the call by
> > defining:
> > 
> > 	pgoff_start = rmap_item->address >> PAGE_SHIFT;
> > 	pgoff_end = pgoff_start + folio_nr_pages(folio) - 1;
> > 
> > Performance
> > ===========
> > In our real embedded Linux environment, the measured metrics were as follows:
> > 
> > 1) Time_ms: Maximum time the anon_vma lock is held in a single rmap_walk_ksm.
> > 2) Nr_iteration_total: Maximum number of iterations of the anon_vma_interval_tree_foreach loop.
> > 3) Skip_addr_out_of_range: Maximum number of iterations skipped by the first check (addr against
> >              vma->vm_start and vma->vm_end) in the anon_vma_interval_tree_foreach loop.
> > 4) Skip_mm_mismatch: Maximum number of iterations skipped by the second check (rmap_item->mm == vma->vm_mm)
> >              in the anon_vma_interval_tree_foreach loop.
> > 
> > The result is as follows:
> > 
> >                   Time_ms      Nr_iteration_total    Skip_addr_out_of_range   Skip_mm_mismatch
> > Before patch:     228.65       22169                 22168                    0
> > After patch:      0.396        3                     0                        2
> 
> Nice improvement.
> 
> Can you make your reproducer available?

I'll do my best. The original test data came from real business scenarios,
which are quite complex, so I'll try to reduce this high-latency case to a
simpler, self-contained reproducer.

> 
> > 
> > Co-developed-by: Wang Yaxin <wang.yaxin@zte.com.cn>
> > Signed-off-by: xu xin <xu.xin16@zte.com.cn>
> > ---
> >   mm/ksm.c | 6 +++++-
> >   1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/mm/ksm.c b/mm/ksm.c
> > index 335e7151e4a1..0a074ad8e867 100644
> > --- a/mm/ksm.c
> > +++ b/mm/ksm.c
> > @@ -3172,6 +3172,7 @@ void rmap_walk_ksm(struct folio *folio, struct rmap_walk_control *rwc)
> >   		struct anon_vma_chain *vmac;
> >   		struct vm_area_struct *vma;
> >   		unsigned long addr;
> > +		pgoff_t pgoff_start, pgoff_end;
> > 
> >   		cond_resched();
> >   		if (!anon_vma_trylock_read(anon_vma)) {
> > @@ -3185,8 +3186,11 @@ void rmap_walk_ksm(struct folio *folio, struct rmap_walk_control *rwc)
> >   		/* Ignore the stable/unstable/sqnr flags */
> >   		addr = rmap_item->address & PAGE_MASK;
> > 
> > +		pgoff_start = rmap_item->address >> PAGE_SHIFT;
> > +		pgoff_end = pgoff_start + folio_nr_pages(folio) - 1;
> 
> KSM folios are always order-0, so you can keep it simple and hard-code 
> PAGE_SIZE here.
> 
> You can also initialize both values directly and make them const.

Yes, I'll do it in v2.
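
To illustrate the simplification, here is a minimal userspace sketch, not the
actual v2 code. It assumes 4 KiB pages, and the names SKETCH_PAGE_SHIFT,
struct sketch_rmap_item and sketch_pgoff() are hypothetical stand-ins for the
kernel definitions. Because a KSM folio is a single page, the start and end of
the range collapse into one page offset, and the shift also drops the flag
bits kept in the low bits of the address.

```c
typedef unsigned long pgoff_t;

/* Assumption: 4 KiB pages; the kernel's PAGE_SHIFT is architecture-specific. */
#define SKETCH_PAGE_SHIFT 12UL

/* Hypothetical stand-in for the kernel's rmap_item; only the field used here. */
struct sketch_rmap_item {
	unsigned long address;	/* virtual address, low bits carry KSM flags */
};

/*
 * An order-0 folio covers exactly one page, so the lookup range is
 * [pgoff, pgoff]. Shifting right by the page shift discards the flag
 * bits stored below the page boundary.
 */
static pgoff_t sketch_pgoff(const struct sketch_rmap_item *item)
{
	const pgoff_t pgoff = item->address >> SKETCH_PAGE_SHIFT;

	return pgoff;
}
```

For example, an item whose address field is 0x7f001000 | 0x200 yields the page
offset 0x7f001, which would serve as both ends of the range.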

> 
> > +
> >   		anon_vma_interval_tree_foreach(vmac, &anon_vma->rb_root,
> > -					       0, ULONG_MAX) {
> > +					       pgoff_start, pgoff_end) {
> 
> This is interesting. When we fork() with KSM pages we don't duplicate 
> the rmap items. So we rely on this handling here to find all KSM pages 
> even in child processes without distinct rmap items.
> 
> The important thing is that, whenever we mremap(), we break COW to 
> unshare all KSM pages (see prep_move_vma).
> 
> So, indeed, I would expect that we only ever have to search at 
> rmap->address even in child processes. So makes sense to me.

Thanks!
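
As a rough model of why the narrower range helps, the sketch below counts how
many entries a range query hands to the loop body. It is only an illustration:
a linear scan stands in for the anon_vma interval tree, and struct sketch_vma
and sketch_count_visited() are hypothetical names, not kernel APIs.

```c
#include <stddef.h>

typedef unsigned long pgoff_t;

/*
 * Hypothetical stand-in for a VMA as indexed by the interval tree:
 * it covers page offsets [first, last], both inclusive.
 */
struct sketch_vma {
	pgoff_t first;
	pgoff_t last;
};

/*
 * Count the entries that overlap the query range [start, end].
 * The real interval tree returns only these entries without visiting
 * the non-overlapping ones; the linear scan here just models the result.
 */
static size_t sketch_count_visited(const struct sketch_vma *vmas, size_t n,
				   pgoff_t start, pgoff_t end)
{
	size_t visited = 0;

	for (size_t i = 0; i < n; i++) {
		if (vmas[i].first <= end && vmas[i].last >= start)
			visited++;
	}
	return visited;
}
```

With three entries covering [0x100, 0x1ff], [0x200, 0x2ff] and
[0x7f001, 0x7f010], a query for [0, ULONG_MAX] visits all three, while a query
for [0x7f001, 0x7f001] visits only the last one.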

Thread overview: 5+ messages
     [not found] <20260112215315996jocrkFSqeYfhABkZxqs4T@zte.com.cn>
2026-01-12 13:59 ` [PATCH 1/2] ksm: Initial the addr only once in rmap_walk_ksm xu.xin16
2026-01-12 14:01 ` [PATCH 2/2] ksm: Optimize rmap_walk_ksm by passing a suitable address range xu.xin16
2026-01-12 17:47   ` Andrew Morton
2026-01-12 19:25   ` David Hildenbrand (Red Hat)
2026-01-14  2:40     ` xu.xin16 [this message]
