From: Simona Vetter <simona.vetter@ffwll.ch>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
"Yonatan Maman" <ymaman@nvidia.com>,
kherbst@redhat.com, lyude@redhat.com, dakr@redhat.com,
airlied@gmail.com, simona@ffwll.ch, leon@kernel.org,
jglisse@redhat.com, akpm@linux-foundation.org,
GalShalom@nvidia.com, dri-devel@lists.freedesktop.org,
nouveau@lists.freedesktop.org, linux-kernel@vger.kernel.org,
linux-rdma@vger.kernel.org, linux-mm@kvack.org,
linux-tegra@vger.kernel.org
Subject: Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages
Date: Thu, 30 Jan 2025 11:50:27 +0100
Message-ID: <Z5tZc0OQukfZEr3H@phenom.ffwll.local>
In-Reply-To: <20250129134757.GA2120662@ziepe.ca>
On Wed, Jan 29, 2025 at 09:47:57AM -0400, Jason Gunthorpe wrote:
> On Wed, Jan 29, 2025 at 02:38:58PM +0100, Simona Vetter wrote:
>
> > > The pgmap->owner doesn't *have* to be fixed, certainly during early boot before
> > > you hand out any page references it can be changed. I wouldn't be
> > > surprised if this is useful to some requirements to build up the
> > > private interconnect topology?
> >
> > The trouble I'm seeing is device probe and the fundamental issue that you
> > never know when you're done. And so if we entirely rely on pgmap->owner to
> > figure out the driver private interconnect topology, that's going to be
> > messy. That's why I'm also leaning towards both comparing owners and
> > having an additional check whether the interconnect is actually there or
> > not yet.
>
> Honestly, I'd rather invest more effort into being able to update
> owner for those special corner cases than to slow down the fast path
> in hmm_range_fault..
I'm not sure how you want to make the owner mutable.

The only design that I think is solid is to evict all device private
memory, unregister the dev_pagemap and register a new one with the updated
owner. I think any other approach boils down to the same issue, except we
pretend it's easier and just ignore all the race conditions.

And I've looked at the lifetime fun of unregistering a dev_pagemap for
device hotunplug and pretty firmly concluded it's unfixable and that I
should run away to do something else :-P

An optional callback here (or redoing hmm_range_fault, or whacking the
migration helpers a few times) looks a lot less scary to me than making
pgmap->owner mutable in some fashion.
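
To make the "compare owners, then ask an optional callback" idea concrete,
here is a rough userspace model. This is only a sketch: struct model_pgmap,
struct model_pgmap_ops, interconnect_ready() and model_page_usable() are
hypothetical names made up for illustration, not the kernel's struct
dev_pagemap, its ops, or anything hmm_range_fault does today. The owner
stays fixed after registration, and the callback only answers whether the
private interconnect is actually there yet.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; NOT the kernel's struct dev_pagemap or its ops. */
struct model_pgmap;

struct model_pgmap_ops {
	/* Hypothetical optional hook: is the private link to 'peer' usable now? */
	bool (*interconnect_ready)(const struct model_pgmap *pgmap,
				   const void *peer);
};

struct model_pgmap {
	const void *owner;                 /* set at registration, never changed */
	const struct model_pgmap_ops *ops; /* may be NULL */
};

/* Example callback: treats 'peer' as a pointer to a link-up flag. */
static bool model_link_flag_ready(const struct model_pgmap *pgmap,
				  const void *peer)
{
	(void)pgmap;
	return peer && *(const bool *)peer;
}

/* Can a device private page from 'pgmap' be used in place by the requester? */
static bool model_page_usable(const struct model_pgmap *pgmap,
			      const void *requester_owner, const void *peer)
{
	/* Cheap check first: a different owner is never usable in place. */
	if (pgmap->owner != requester_owner)
		return false;

	/* No callback registered: a matching owner is sufficient. */
	if (!pgmap->ops || !pgmap->ops->interconnect_ready)
		return true;

	/* Same owner, but the link may not be up yet (e.g. probe not done). */
	return pgmap->ops->interconnect_ready(pgmap, peer);
}
```

In this model, drivers that don't register the hook only pay for the pointer
compare, and anything that fails either check would fall back to migrating
to system memory.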
Cheers, Sima
> The notion is that owner should represent a contiguous region of
> connectivity. IMHO you can always create this, but I suppose there
> could be corner cases where you need to split/merge owners.
>
> But again, this isn't set in stone, if someone has a better way to
> match the private interconnects without going to driver callbacks then
> try that too.
>
> I think driver callbacks inside hmm_range_fault should be the last resort..
>
> > You can fake that by doing these checks after hmm_range_fault returned,
> > and if you get a bunch of unsuitable pages, toss it back to
> > hmm_range_fault asking for an unconditional migration to system memory for
> > those. But that's kinda not great and I think goes at least against the
> > spirit of how you want to handle pci p2p in step 2 below?
>
> Right, hmm_range_fault should return pages that can be used and you
> should not call it twice.
>
> Jason
--
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch