Re: [PATCH v7 2/2] mm/oom_kill: The OOM reaper traverses the VMA maple tree in reverse order

linux-mm.kvack.org archive mirror

From: "Liam R. Howlett" <Liam.Howlett@oracle.com>
To: Michal Hocko <mhocko@suse.com>
Cc: zhongjinji <zhongjinji@honor.com>,
	rientjes@google.com, shakeel.butt@linux.dev,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, tglx@linutronix.de,
	lorenzo.stoakes@oracle.com, surenb@google.com,
	liulu.liu@honor.com, feng.han@honor.com
Subject: Re: [PATCH v7 2/2] mm/oom_kill: The OOM reaper traverses the VMA maple tree in reverse order
Date: Wed, 3 Sep 2025 15:02:34 -0400
Message-ID: <7rvwvuifkav5oz4ftfuziq23wek2bn6ygvrfotpaweypuy7obv@hjuf3eknscii>
In-Reply-To: <aLg7ajpko2j1qV4h@tiehlicka>

* Michal Hocko <mhocko@suse.com> [250903 08:58]:
> On Wed 03-09-25 17:27:29, zhongjinji wrote:
> > Although the oom_reaper is delayed and it gives the oom victim chance to
> > clean up its address space this might take a while especially for
> > processes with a large address space footprint. In those cases
> > oom_reaper might start racing with the dying task and compete for shared
> > resources - e.g. page table lock contention has been observed.
> > 
> > Reduce those races by reaping the oom victim from the other end of the
> > address space.
> > 
> > It is also a significant improvement for process_mrelease(). When a process
> > is killed, process_mrelease is used to reap the killed process and often
> > runs concurrently with the dying task. The test data shows that after
> > applying the patch, lock contention is greatly reduced during the procedure
> > of reaping the killed process.
> 
> Thank you this is much better!
> 
> > Without the patch:
> > |--99.74%-- oom_reaper
> > |  |--76.67%-- unmap_page_range
> > |  |  |--33.70%-- __pte_offset_map_lock
> > |  |  |  |--98.46%-- _raw_spin_lock
> > |  |  |--27.61%-- free_swap_and_cache_nr
> > |  |  |--16.40%-- folio_remove_rmap_ptes
> > |  |  |--12.25%-- tlb_flush_mmu
> > |  |--12.61%-- tlb_finish_mmu
> > 
> > With the patch:
> > |--98.84%-- oom_reaper
> > |  |--53.45%-- unmap_page_range
> > |  |  |--24.29%-- [hit in function]
> > |  |  |--48.06%-- folio_remove_rmap_ptes
> > |  |  |--17.99%-- tlb_flush_mmu
> > |  |  |--1.72%-- __pte_offset_map_lock
> > |  |--30.43%-- tlb_finish_mmu
> 
> Just curious. Do I read this correctly that the overall speedup is
> mostly eaten by contention over tlb_finish_mmu?

The tlb_finish_mmu() taking less time indicates that it's probably not
doing much work, afaict.  These numbers would be better if exit_mmap()
was also added to show a more complete view of how the system is
affected - I suspect the tlb_finish_mmu time will have disappeared from
that side of things.
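
For anyone comparing the two trees by hand, here is a minimal sketch of
the arithmetic. It assumes each percentage in the quoted output is
relative to its parent entry (the children of unmap_page_range sum to
roughly 90%, which suggests that, but it is an assumption about how the
profile was generated). The helper name absolute_share is made up for
illustration.

```python
# Assumption: every percentage in the quoted trees is relative to its
# parent, so the share of the whole profile is the product along the path.

def absolute_share(*path_percentages):
    """Multiply relative percentages down a call path; result is in percent."""
    share = 100.0
    for pct in path_percentages:
        share *= pct / 100.0
    return share

# Paths copied from the quoted changelog: oom_reaper -> ... -> leaf.
without_patch = {
    "__pte_offset_map_lock": absolute_share(99.74, 76.67, 33.70),
    "tlb_finish_mmu": absolute_share(99.74, 12.61),
}
with_patch = {
    "__pte_offset_map_lock": absolute_share(98.84, 53.45, 1.72),
    "tlb_finish_mmu": absolute_share(98.84, 30.43),
}

for name in without_patch:
    print(f"{name}: {without_patch[name]:.2f}% -> {with_patch[name]:.2f}%")
```

Under that assumption, __pte_offset_map_lock drops from about a quarter
of the reaper's samples to under 1%.  These are shares of the samples in
each profile, not wall-clock time, so a larger share for tlb_finish_mmu
does not by itself mean it got slower.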

The comments in this code contain many arch-specific statements,
which makes me wonder if this is safe (probably?) and beneficial for
everyone.  At the least, it would be worth mentioning which arch was
used for the benchmark - I am guessing arm64 considering the talk of
Android; coincidentally, arm64 would benefit the most fwiu.

mmu_notifier_release(mm) is called early in the exit_mmap() path, which
should cause the mmu notifiers to be non-blocking (according to the
comment in the v6.0 source of exit_mmap() [1]).

> 
> > Signed-off-by: zhongjinji <zhongjinji@honor.com>
> 
> Anyway, the change on its own makes sense to me
> Acked-by: Michal Hocko <mhocko@suse.com>
> 
> Thanks for working on the changelog improvements.

[1]. https://elixir.bootlin.com/linux/v6.0.19/source/mm/mmap.c#L3089

...

Thanks,
Liam

