linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: "Mika Penttilä" <mpenttil@redhat.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	David Hildenbrand <david@redhat.com>,
	Leon Romanovsky <leonro@nvidia.com>,
	Alistair Popple <apopple@nvidia.com>,
	Balbir Singh <balbirs@nvidia.com>
Subject: Re: [RFC PATCH 1/4] mm: use current as mmu notifier's owner
Date: Thu, 14 Aug 2025 10:04:03 -0300	[thread overview]
Message-ID: <20250814130403.GF699432@nvidia.com> (raw)
In-Reply-To: <2da9464b-3b3d-46bd-a68f-bfef1226bbf6@redhat.com>

On Thu, Aug 14, 2025 at 03:53:00PM +0300, Mika Penttilä wrote:
> 
> On 8/14/25 15:40, Jason Gunthorpe wrote:
> > On Thu, Aug 14, 2025 at 10:19:26AM +0300, Mika Penttilä wrote:
> >> When doing migration in combination with device fault handling,
> >> detect the case in the interval notifier.
> >>
> >> Without that, we would livelock with our own invalidations
> >> while migrating and splitting pages during fault handling.
> >>
> >> Note, pgmap_owner, used in some other code paths as owner for filtering,
> >> is not readily available for split path, so use current for this use case.
> >> Also, current and pgmap_owner, both being pointers to memory, can not be
> >> mis-interpreted to each other.
> >>
> >> Cc: David Hildenbrand <david@redhat.com>
> >> Cc: Jason Gunthorpe <jgg@nvidia.com>
> >> Cc: Leon Romanovsky <leonro@nvidia.com>
> >> Cc: Alistair Popple <apopple@nvidia.com>
> >> Cc: Balbir Singh <balbirs@nvidia.com>
> >>
> >> Signed-off-by: Mika Penttilä <mpenttil@redhat.com>
> >> ---
> >>  lib/test_hmm.c   | 5 +++++
> >>  mm/huge_memory.c | 6 +++---
> >>  mm/rmap.c        | 4 ++--
> >>  3 files changed, 10 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/lib/test_hmm.c b/lib/test_hmm.c
> >> index 761725bc713c..cd5c139213be 100644
> >> --- a/lib/test_hmm.c
> >> +++ b/lib/test_hmm.c
> >> @@ -269,6 +269,11 @@ static bool dmirror_interval_invalidate(struct mmu_interval_notifier *mni,
> >>  	    range->owner == dmirror->mdevice)
> >>  		return true;
> >>  
> >> +	if (range->event == MMU_NOTIFY_CLEAR &&
> >> +	    range->owner == current) {
> >> +		return true;
> >> +	}
> > I don't understand this, there is nothing in hmm that says only
> > current can call hmm_range_fault, and indeed most applications won't
> > even gurantee that.
> 
> No it's the opposite, if we are ourselves the invalidator, don't care.

I think you've missed the point, you cannot use range->owner in any
way to tell "we are ourselves the invalidator". It is simply not the
meaning of range->owner.

> > So if this plan relies on something like the above in drivers I don't
> > see how it can work.
> >
> > If this is just some hack for tests, try instead to find a solution
> > that more accurately matches what a real driver should do.
> >
> > But this also seems overall troublesome to your goal, if you do a
> > migrate inside hmm_range_fault() it will generate an invalidation call
> > back and that will increment the seqlock and we will loop
> > hmm_range_fault() again which rewalks.
> 
> That's the problem this solves.
> The semantics is "if we are the invalidator don't wait for invalidate end",
> aka don't mmu_interval_set_seq() that would make hang in the next mmu_interval_read_begin(),
> waiting the invalidate to end

I doubt we can skip mmu_interval_set_seq(), doing so can corrupt concurrent
hmm_range_fault in some other thread.

Nor, as I said, can we actually skip the invalidation toward HW
anyhow.

At the very least this commit message fails to explain the new locking
proposal, or justify why it can work.

Jason


  reply	other threads:[~2025-08-14 13:04 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-14  7:19 [RFC PATCH 0/4] Migrate on fault for device pages Mika Penttilä
2025-08-14  7:19 ` [RFC PATCH 1/4] mm: use current as mmu notifier's owner Mika Penttilä
2025-08-14 12:40   ` Jason Gunthorpe
2025-08-14 12:53     ` Mika Penttilä
2025-08-14 13:04       ` Jason Gunthorpe [this message]
2025-08-14 13:20         ` Mika Penttilä
2025-08-14 14:11           ` Jason Gunthorpe
2025-08-14 17:00             ` Mika Penttilä
2025-08-14 17:20               ` Jason Gunthorpe
2025-08-14 17:45                 ` Mika Penttilä
2025-08-15  5:23                   ` Alistair Popple
2025-08-15  7:11                     ` Mika Penttilä
2025-08-19  4:27                       ` Balbir Singh
2025-08-19  4:33                         ` Mika Penttilä
2025-08-14  7:19 ` [RFC PATCH 2/4] mm: unified fault and migrate device page paths Mika Penttilä
2025-08-21  4:30   ` Balbir Singh
2025-08-21  5:10     ` Mika Penttilä
2025-08-22  5:02       ` Alistair Popple
2025-08-14  7:19 ` [RFC PATCH 3/4] mm:/migrate_device.c: remove migrate_vma_collect_*() functions Mika Penttilä
2025-08-14  7:19 ` [RFC PATCH 4/4] mm: add new testcase for the migrate on fault case Mika Penttilä
2025-08-15 11:36 ` [RFC PATCH 0/4] Migrate on fault for device pages Balbir Singh
2025-08-15 11:44   ` Mika Penttilä

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250814130403.GF699432@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=apopple@nvidia.com \
    --cc=balbirs@nvidia.com \
    --cc=david@redhat.com \
    --cc=leonro@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mpenttil@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox