From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@redhat.com>, Zi Yan <ziy@nvidia.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Nico Pache <npache@redhat.com>,
Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
Barry Song <baohua@kernel.org>, Vlastimil Babka <vbabka@suse.cz>,
Jann Horn <jannh@google.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Lance Yang <ioworker0@gmail.com>, SeongJae Park <sj@kernel.org>,
Suren Baghdasaryan <surenb@google.com>
Subject: [PATCH v2 0/5] madvise cleanup
Date: Fri, 20 Jun 2025 16:33:00 +0100
Message-ID: <cover.1750433500.git.lorenzo.stoakes@oracle.com>
This is a series of patches that helps address a number of historic
problems in the madvise() implementation:
* Eliminate the visitor pattern, so that both the anon_vma_name
implementation and ordinary madvise() operations use the same
madvise_vma_behavior() implementation.
* Thread state through the madvise_behavior state object - this object,
very usefully introduced by SJ, is already used to transmit state through
operations. This series extends this by having all madvise() operations
use this, including anon_vma_name.
* Thread range, VMA state through madvise_behavior - This helps avoid a lot
of the confusing code around range and VMA state and again keeps things
consistent and with a single 'source of truth'.
* Address the very strange behaviour around the
struct vm_area_struct **prev pointer that gets passed around - all
read-only users do absolutely nothing with the prev pointer. The only
function that uses it is madvise_update_vma(), and in all cases prev is
reset to the VMA.
Fix this by no longer having anything but madvise_update_vma() reference
prev, and having madvise_walk_vmas() update prev in each
instance. Additionally make it clear that the meaningful change in VMA
state is when madvise_update_vma() potentially merges a VMA, so
explicitly retrieve the VMA in this case.
* Update and clarify the madvise_walk_vmas() function - this is a source of
a great deal of confusion, so simplify, stop using prev = NULL to signify
that the mmap lock has been dropped (!) and make that explicit, and add
some comments to explain what's going on.
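To make the state-threading idea above concrete, here is a minimal userspace
sketch. It is not the kernel code: struct toy_vma, struct toy_range,
struct toy_behavior, toy_vma_behavior() and toy_walk_vmas() are all
hypothetical names invented for illustration, and the field layout is an
assumption rather than the actual struct madvise_behavior. It only shows the
shape of the approach: one state object carries the advice, the current
sub-range and the prev/vma pointers, the per-VMA operation takes nothing but
that object, and only the walker updates prev.

```c
/* Userspace model only: these are NOT the kernel definitions. */
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vm_area_struct. */
struct toy_vma {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
	int advice;		/* last advice applied, 0 if none */
};

/* Hypothetical range type. */
struct toy_range {
	unsigned long start;
	unsigned long end;
};

/* Hypothetical state object: the single source of truth for one call. */
struct toy_behavior {
	int advice;
	struct toy_range range;	/* sub-range clamped to the current VMA */
	struct toy_vma *prev;	/* only ever updated by the walker */
	struct toy_vma *vma;	/* VMA currently being operated on */
};

/* The per-VMA operation reads everything it needs from the state object. */
static int toy_vma_behavior(struct toy_behavior *b)
{
	assert(b->vma);
	assert(b->range.start < b->range.end);
	b->vma->advice = b->advice;
	return 0;
}

/*
 * Walk a sorted, non-overlapping array of VMAs and apply the advice to
 * each one that intersects [start, end).
 * Returns the number of VMAs visited, or -1 on error.
 */
static int toy_walk_vmas(struct toy_vma *vmas, size_t n,
			 unsigned long start, unsigned long end,
			 struct toy_behavior *b)
{
	int visited = 0;

	b->prev = NULL;
	for (size_t i = 0; i < n; i++) {
		struct toy_vma *vma = &vmas[i];

		if (vma->end <= start) {
			/* Entirely before the range: just remember it. */
			b->prev = vma;
			continue;
		}
		if (vma->start >= end)
			break;

		/* Clamp the requested range to this VMA. */
		b->range.start = start > vma->start ? start : vma->start;
		b->range.end = end < vma->end ? end : vma->end;
		b->vma = vma;

		if (toy_vma_behavior(b))
			return -1;

		/* The walker, not the operation, advances prev. */
		b->prev = vma;
		visited++;
	}
	return visited;
}
```

A caller would fill in only the advice field of a struct toy_behavior and pass
it to toy_walk_vmas(). Everything else is derived by the walker, which is the
property the series is after.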
v2:
* Propagated tags (thanks everyone!)
* Don't separate out __MADV_SET_ANON_VMA_NAME and __MADV_SET_CLEAR_VMA_NAME,
just use __MADV_SET_ANON_VMA_NAME as per Zi.
* Eliminate is_anon_vma_name() as no longer necessary, addressing Zi's concern
around naming another way :)
* Put mm_struct abstraction of try_vma_read_lock() into 2/5 from 3/5 as per Zi.
* Added comment about get/put anon_vma_name in madvise_vma_behavior() as per
Vlastimil.
* Renamed have_anon_name to set_new_anon_name to make it clear why we make an
exception to this get/put behaviour in madvise_vma_behavior().
* Reworded 1/5 commit message to make it clearer what's being done as per
Vlastimil.
* Avoid comma-separated decls in struct madvise_behavior_range as per Zi and
Vlastimil.
* Put the fix for a silly development bug (comparing range->start to end
rather than range->end) in 3/5 rather than 4/5 so as to eliminate it
altogether. It was fixed during development, but the fix was not put in
the correct place :) as per Vlastimil.
* Renamed end to last_end in madvise_walk_vmas() and added a comment for
clarity as per Vlastimil.
* Updated madvise_walk_vmas() comment to no longer refer to a visitor
function.
* Separated out prev, vma fields in struct madvise_behavior as per
Vlastimil.
* Added assert on not holding VMA lock whenever mmap lock is dropped and
abstracted to mark_mmap_lock_dropped() so we always assert when we do
this, based on discussion with Vlastimil.
* Removed duplicate comment about weird -ENOMEM unmapped error behaviour.
v1:
https://lore.kernel.org/all/cover.1750363557.git.lorenzo.stoakes@oracle.com/
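As a rough illustration of the mark_mmap_lock_dropped() item in the v2 notes,
the sketch below models the idea in userspace. The names toy_lock_mode,
toy_state, toy_mark_mmap_lock_dropped() and toy_needs_relock() are
hypothetical, a plain assert() stands in for whatever kernel assertion the
series actually uses, and the lock modes are an assumption. The point is only
that "the mmap lock was dropped" becomes an explicit flag set through one
helper that always checks the invariant, rather than something inferred from
prev == NULL.

```c
/* Userspace model only: these are NOT the kernel definitions. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock modes for a single madvise() call. */
enum toy_lock_mode {
	TOY_LOCK_NONE,
	TOY_LOCK_MMAP_READ,
	TOY_LOCK_MMAP_WRITE,
	TOY_LOCK_VMA_READ,
};

/* Hypothetical slice of the per-call state object. */
struct toy_state {
	enum toy_lock_mode lock_mode;
	bool lock_dropped;
};

/*
 * Single place where the flag is set, so the assertion runs every time:
 * dropping the mmap lock makes no sense if only a VMA lock is held.
 */
static void toy_mark_mmap_lock_dropped(struct toy_state *s)
{
	assert(s->lock_mode != TOY_LOCK_VMA_READ);
	s->lock_dropped = true;
}

/* The walker asks the state object directly instead of testing prev. */
static bool toy_needs_relock(const struct toy_state *s)
{
	return s->lock_dropped;
}
```

Routing every lock drop through one helper means a new caller cannot forget
the check, which appears to be the motivation given in the changelog.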
Lorenzo Stoakes (5):
mm/madvise: remove the visitor pattern and thread anon_vma state
mm/madvise: thread mm_struct through madvise_behavior
mm/madvise: thread VMA range state through madvise_behavior
mm/madvise: thread all madvise state through madv_behavior
mm/madvise: eliminate very confusing manipulation of prev VMA
include/linux/huge_mm.h | 9 +-
mm/khugepaged.c | 9 +-
mm/madvise.c | 585 +++++++++++++++++++++-------------------
3 files changed, 313 insertions(+), 290 deletions(-)
--
2.49.0