From: Pedro Falcato <pfalcato@suse.de>
To: Dev Jain <dev.jain@arm.com>
Cc: linux-kernel@vger.kernel.org, stable@vger.kernel.org,
	 David Hildenbrand <david@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 "Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	 Vlastimil Babka <vbabka@suse.cz>, Jann Horn <jannh@google.com>,
	Barry Song <baohua@kernel.org>,
	 "open list:MEMORY MAPPING" <linux-mm@kvack.org>
Subject: Re: [PATCH] mm/mremap: Honour writable bit in mremap pte batching
Date: Tue, 28 Oct 2025 11:48:51 +0000
Message-ID: <jmxnalmkkc5ztfhokqtzqihsdji2gprnv5z4tzruxi6iqgfkni@aerronulpyem>
In-Reply-To: <20251028063952.90313-1-dev.jain@arm.com>

On Tue, Oct 28, 2025 at 12:09:52PM +0530, Dev Jain wrote:
> Currently mremap folio pte batch ignores the writable bit during figuring
> out a set of similar ptes mapping the same folio. Suppose that the first
> pte of the batch is writable while the others are not - set_ptes will
> end up setting the writable bit on the other ptes, which is a violation
> of mremap semantics. Therefore, use FPB_RESPECT_WRITE to check the writable
> bit while determining the pte batch.
>

Hmm, it seems to me like we're doing the wrong thing by default here?
I must admit I haven't followed the contpte work as much as I would've
liked, but it doesn't make much sense to me why FPB_RESPECT_WRITE would
be an option you have to explicitly pass, and where folio_pte_batch (the
"simple" interface) doesn't Just Do The Right Thing for naive callers.

Auditing all callers:
 - khugepaged clears a variable number of ptes
 - memory.c clears a variable number of ptes
 - mempolicy.c grabs folios for migrations
 - mlock.c steps over nr_ptes - 1 ptes, speeding up traversal
 - mremap is borked since we're remapping nr_ptes ptes
 - rmap.c TTU unmaps nr_ptes ptes for a given folio

So while the vast majority of callers don't seem to care, it would make
sense for folio_pte_batch() to work conservatively by default, with
folio_pte_batch_flags() allowing further batching (or maybe we would
add a separate folio_pte_batch_clear() or folio_pte_batch_greedy() or
whatnot).
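
To make the concern concrete, here is a rough userspace sketch of the
failure mode and of the "conservative by default" shape I have in mind.
Everything in it is a toy: struct toy_pte, TOY_FPB_RESPECT_WRITE and the
toy_* helpers are made-up names for illustration, not the kernel's pte_t,
folio_pte_batch() or set_ptes(), and the real batching code checks a lot
more than this.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy PTE for illustration only; NOT the kernel's pte_t. */
struct toy_pte {
	unsigned long pfn;
	bool present;
	bool writable;
};

/* Hypothetical flag, loosely modelled on the idea of FPB_RESPECT_WRITE. */
#define TOY_FPB_RESPECT_WRITE 0x1u

/*
 * Count how many entries starting at ptes[0] are present and map
 * consecutive pfns. With TOY_FPB_RESPECT_WRITE, the batch also stops at
 * the first entry whose writable bit differs from ptes[0].
 */
static size_t toy_pte_batch_flags(const struct toy_pte *ptes, size_t max_nr,
				  unsigned int flags)
{
	size_t nr;

	if (max_nr == 0 || !ptes[0].present)
		return 0;

	for (nr = 1; nr < max_nr; nr++) {
		if (!ptes[nr].present)
			break;
		if (ptes[nr].pfn != ptes[0].pfn + nr)
			break;
		if ((flags & TOY_FPB_RESPECT_WRITE) &&
		    ptes[nr].writable != ptes[0].writable)
			break;
	}
	return nr;
}

/* The "simple" interface, conservative by default (the suggestion above). */
static size_t toy_pte_batch(const struct toy_pte *ptes, size_t max_nr)
{
	return toy_pte_batch_flags(ptes, max_nr, TOY_FPB_RESPECT_WRITE);
}

/*
 * Mimic a batched move where every destination entry inherits the first
 * source entry's permission bits. Returns how many entries were
 * read-only in src but ended up writable in dst.
 */
static size_t toy_move_batch(const struct toy_pte *src, struct toy_pte *dst,
			     size_t nr)
{
	size_t i, upgraded = 0;

	for (i = 0; i < nr; i++) {
		dst[i] = src[0];
		dst[i].pfn = src[0].pfn + i;
		if (!src[i].writable && dst[i].writable)
			upgraded++;
	}
	return upgraded;
}
```

With four present entries at pfns 100..103 where only the first is
writable, toy_pte_batch_flags(src, 4, 0) returns 4 and moving that batch
upgrades 3 read-only entries to writable, whereas toy_pte_batch(src, 4)
returns 1 and upgrades none. A naive caller of the simple interface
cannot trip over this; callers that only clear or skip ptes can still
opt into the greedier behaviour explicitly.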

> Cc: stable@vger.kernel.org #6.17
> Fixes: f822a9a81a31 ("mm: optimize mremap() by PTE batching")
> Reported-by: David Hildenbrand <david@redhat.com>
> Debugged-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Dev Jain <dev.jain@arm.com>

But the solution itself looks okay to me. So, fwiw:

Acked-by: Pedro Falcato <pfalcato@suse.de>

-- 
Pedro

