From: Shakeel Butt <shakeel.butt@linux.dev>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: David Hildenbrand <david@redhat.com>,
Matthew Wilcox <willy@infradead.org>,
SeongJae Park <sj@kernel.org>,
"Liam R. Howlett" <howlett@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Vlastimil Babka <vbabka@suse.cz>,
kernel-team@meta.com, linux-kernel@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [RFC PATCH 00/16] mm/madvise: batch tlb flushes for MADV_DONTNEED and MADV_FREE
Date: Wed, 5 Mar 2025 11:57:34 -0800
Message-ID: <yhfyhovnztn3m224tdbq4hrth3bulq23ym57rp7prvodaapjdo@any7cn33suh3>
In-Reply-To: <815d1f2d-4bc0-40da-ba07-42593ae7ee45@lucifer.local>
On Wed, Mar 05, 2025 at 07:49:50PM +0000, Lorenzo Stoakes wrote:
> On Wed, Mar 05, 2025 at 11:46:31AM -0800, Shakeel Butt wrote:
> > On Wed, Mar 05, 2025 at 08:19:41PM +0100, David Hildenbrand wrote:
> > > On 05.03.25 19:56, Matthew Wilcox wrote:
> > > > On Wed, Mar 05, 2025 at 10:15:55AM -0800, SeongJae Park wrote:
> > > > > For MADV_DONTNEED[_LOCKED] or MADV_FREE madvise requests, tlb flushes
> > > > > can happen for each vma of the given address ranges. Because such tlb
> > > > > flushes are for address ranges of same process, doing those in a batch
> > > > > is more efficient while still being safe. Modify the madvise() and
> > > > > process_madvise() entry-level code paths to do such batched tlb flushes,
> > > > > while the internal unmap logic only gathers the tlb entries to
> > > > > flush.
> > > >
> > > > Do real applications actually do madvise requests that span multiple
> > > > VMAs? It just seems weird to me. Like, each vma comes from a separate
> > > > call to mmap [1], so why would it make sense for an application to
> > > > call madvise() across a VMA boundary?
> > >
> > > I had the same question. If this happens in an app, I would assume that a
> > > single MADV_DONTNEED call would usually not span multiple VMAs, and if it
> > > does, not that many (and that often) that we would really care about it.
> >
> > IMHO madvise() is just an add-on and the real motivation behind this
> > series is your next point.
> >
> > >
> > > OTOH, optimizing tlb flushing when using a vectored MADV_DONTNEED version
> > > would make more sense to me. I don't recall if process_madvise() allows for
> > > that already, and if it does, is this series primarily tackling optimizing
> > > that?
> >
> > Yes process_madvise() allows that and that is what SJ has benchmarked
> > and reported in the cover letter. In addition, we are adding
> > process_madvise() support in jemalloc which will land soon.
> >
>
> Feels like me adjusting that to allow for batched usage for guard regions
> has opened up unexpected avenues, which is really cool to see :)
>
> I presume the usage is intended for PIDFD_SELF usage right?
Yes.
>
> At some point we need to look at allowing larger iovec size. This was
> something I was planning to look at at some point, but my workload is
> really overwhelming + that's low priority for me so happy for you guys to
> handle that if you want.
>
> Can discuss at lsf if you guys will be there also :)
Yup, we will be there and will be happy to discuss.
Also the draft of jemalloc using process_madvise() is at [1].
[1] https://github.com/jemalloc/jemalloc/pull/2794
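
For readers unfamiliar with the vectored interface discussed above, here is a
minimal userspace sketch of an allocator-style caller handing several ranges
of its own address space to process_madvise(MADV_DONTNEED) in one call. It is
an illustration under stated assumptions, not the jemalloc implementation:
dontneed_ranges() and demo_dontneed() are hypothetical names, the pidfd is
obtained with pidfd_open(getpid()) instead of the PIDFD_SELF sentinel so the
code builds against older headers, and a per-range madvise() fallback covers
kernels or sandboxes that reject MADV_DONTNEED through process_madvise().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Fallback syscall numbers for older libc headers (assumption: an
 * architecture using the unified numbering, e.g. x86-64 or arm64). */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_process_madvise
#define SYS_process_madvise 440
#endif

/* Hypothetical helper: drop n ranges of the calling process. Tries one
 * vectored process_madvise() call first; if that is unavailable, refused,
 * or only partially processed, falls back to one madvise() per range.
 * Returns 0 on success, -1 on failure. */
static int dontneed_ranges(const struct iovec *iov, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += iov[i].iov_len;

    int pidfd = (int)syscall(SYS_pidfd_open, (long)getpid(), 0UL);
    if (pidfd >= 0) {
        /* On success the syscall returns the number of bytes advised. */
        long done = syscall(SYS_process_madvise, (long)pidfd, iov,
                            (unsigned long)n, (long)MADV_DONTNEED, 0UL);
        close(pidfd);
        if (done >= 0 && (size_t)done == total)
            return 0;
    }

    /* Fallback: MADV_DONTNEED is idempotent, so redoing every range is safe. */
    for (size_t i = 0; i < n; i++)
        if (madvise(iov[i].iov_base, iov[i].iov_len, MADV_DONTNEED) != 0)
            return -1;
    return 0;
}

/* Hypothetical demo: dirty two anonymous mappings (which may or may not end
 * up as separate VMAs), drop both, and check that they read back as zero.
 * Returns 0 if every checked byte is zero. */
static int demo_dontneed(void)
{
    size_t len = 4 * (size_t)sysconf(_SC_PAGESIZE);
    struct iovec iov[2];
    int bad = 0;

    for (int i = 0; i < 2; i++) {
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        assert(p != MAP_FAILED);
        memset(p, 0xAA, len);
        iov[i].iov_base = p;
        iov[i].iov_len = len;
    }

    if (dontneed_ranges(iov, 2) != 0)
        return -1;

    for (int i = 0; i < 2; i++) {
        const unsigned char *c = iov[i].iov_base;
        bad |= c[0] | c[len - 1];   /* private anonymous pages refault as zero */
        munmap(iov[i].iov_base, len);
    }
    return bad ? 1 : 0;
}
```

Calling demo_dontneed() from main() should return 0 on either path. The point
of the vectored path is that the kernel sees all ranges in a single call,
which is what gives it the opportunity to batch the tlb flushes.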
Thread overview: 48+ messages
2025-03-05 18:15 SeongJae Park
2025-03-05 18:15 ` [RFC PATCH 01/16] mm/madvise: use is_memory_failure() from madvise_do_behavior() SeongJae Park
2025-03-05 20:25 ` Shakeel Butt
2025-03-05 23:13 ` SeongJae Park
2025-03-05 18:15 ` [RFC PATCH 02/16] mm/madvise: split out populate behavior check logic SeongJae Park
2025-03-05 20:32 ` Shakeel Butt
2025-03-05 23:18 ` SeongJae Park
2025-03-05 18:15 ` [RFC PATCH 03/16] mm/madvise: deduplicate madvise_do_behavior() skip case handlings SeongJae Park
2025-03-05 18:15 ` [RFC PATCH 04/16] mm/madvise: remove len parameter of madvise_do_behavior() SeongJae Park
2025-03-05 18:16 ` [RFC PATCH 05/16] mm/madvise: define and use madvise_behavior struct for madvise_do_behavior() SeongJae Park
2025-03-05 21:02 ` Shakeel Butt
2025-03-05 21:40 ` Shakeel Butt
2025-03-05 23:56 ` SeongJae Park
2025-03-06 3:37 ` Shakeel Butt
2025-03-06 4:18 ` SeongJae Park
2025-03-05 18:16 ` [RFC PATCH 06/16] mm/madvise: pass madvise_behavior struct to madvise_vma_behavior() SeongJae Park
2025-03-05 18:16 ` [RFC PATCH 07/16] mm/madvise: make madvise_walk_vmas() visit function receives a void pointer SeongJae Park
2025-03-05 18:16 ` [RFC PATCH 08/16] mm/madvise: pass madvise_behavior struct to madvise_dontneed_free() SeongJae Park
2025-03-05 18:16 ` [RFC PATCH 09/16] mm/memory: split non-tlb flushing part from zap_page_range_single() SeongJae Park
2025-03-06 18:45 ` Shakeel Butt
2025-03-06 19:09 ` SeongJae Park
2025-03-05 18:16 ` [RFC PATCH 10/16] mm/madvise: let madvise_dontneed_single_vma() caller batches tlb flushes SeongJae Park
2025-03-06 18:36 ` Shakeel Butt
2025-03-06 19:10 ` SeongJae Park
2025-03-05 18:16 ` [RFC PATCH 11/16] mm/madvise: let madvise_free_single_vma() " SeongJae Park
2025-03-05 18:16 ` [RFC PATCH 12/16] mm/madvise: batch tlb flushes for process_madvise(MADV_DONTNEED[_LOCKED]) SeongJae Park
2025-03-06 18:36 ` Shakeel Butt
2025-03-06 19:11 ` SeongJae Park
2025-03-05 18:16 ` [RFC PATCH 13/16] mm/madvise: batch tlb flushes for process_madvise(MADV_FREE) SeongJae Park
2025-03-05 18:16 ` [RFC PATCH 14/16] mm/madvise: batch tlb flushes for madvise(MADV_{DONTNEED[_LOCKED],FREE} SeongJae Park
2025-03-05 18:16 ` [RFC PATCH 15/16] mm/madvise: remove !tlb support from madvise_dontneed_single_vma() SeongJae Park
2025-03-06 18:37 ` Shakeel Butt
2025-03-05 18:16 ` [RFC PATCH 16/16] mm/madvise: remove !caller_tlb case of madvise_free_single_vma() SeongJae Park
2025-03-05 18:56 ` [RFC PATCH 00/16] mm/madvise: batch tlb flushes for MADV_DONTNEED and MADV_FREE Matthew Wilcox
2025-03-05 19:19 ` David Hildenbrand
2025-03-05 19:26 ` Lorenzo Stoakes
2025-03-05 19:35 ` David Hildenbrand
2025-03-05 19:39 ` Lorenzo Stoakes
2025-03-05 19:46 ` Shakeel Butt
2025-03-05 19:49 ` David Hildenbrand
2025-03-05 20:59 ` SeongJae Park
2025-03-05 19:49 ` Lorenzo Stoakes
2025-03-05 19:57 ` Shakeel Butt [this message]
2025-03-05 22:46 ` SeongJae Park
2025-03-05 20:22 ` Shakeel Butt
2025-03-05 22:58 ` SeongJae Park
2025-03-05 20:36 ` Nadav Amit
2025-03-05 23:02 ` SeongJae Park