linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Alistair Popple <apopple@nvidia.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	John Hubbard <jhubbard@nvidia.com>,
	nouveau@lists.freedesktop.org, Jason Gunthorpe <jgg@nvidia.com>,
	DRI Development <dri-devel@lists.freedesktop.org>,
	Karol Herbst <kherbst@redhat.com>, Lyude Paul <lyude@redhat.com>,
	Danilo Krummrich <dakr@kernel.org>
Subject: Re: [Question] Are "device exclusive non-swap entries" / "SVM atomics in Nouveau" still getting used in practice?
Date: Tue, 28 Jan 2025 21:24:33 +0100	[thread overview]
Message-ID: <ded68896-d682-4fb3-8693-4657aa90b313@redhat.com> (raw)
In-Reply-To: <Z5k6w1OZ1ttgTGRo@phenom.ffwll.local>

On 28.01.25 21:14, Simona Vetter wrote:
> On Tue, Jan 28, 2025 at 11:09:24AM +1100, Alistair Popple wrote:
>> On Fri, Jan 24, 2025 at 06:54:02PM +0100, David Hildenbrand wrote:
>>>>>> On integrated GPUs the GPU is tied into the coherency
>>>>>> fabric, so there it's not needed.
>>>>>>
>>>>>> I think the more fundamental issue with both this function here and
>>>>>> with forced migration to device memory is that there's no guarantee it
>>>>>> will work out.
>>>>>
>>>>> Yes, in particular with device-exclusive, it doesn't really work with THP
>>>>> and is limited to anonymous memory. I have patches to at least make it
>>>>> work reliably with THP.
>>>>
>>>> I should have crawled through the implementation first before replying.
>>>> Since it only looks at folio_mapcount(), make_device_exclusive() should at
>>>> least in theory work reliably on anon memory and not be impacted by
>>>> elevated refcounts due to migration/ksm/thp/whatever.
>>>
>>> Yes, there is -- in theory -- nothing blocking the conversion except the
>>> folio lock. That's different from page migration.
>>
>> Indeed - this was the entire motivation for make_device_exclusive(): we
>> needed a way to reliably exclude CPU access, one that couldn't be blocked in
>> the same way page migration can be (otherwise we could have just migrated to
>> a device page, even if that might have added unwanted overhead).
> 
> The folio_trylock worries me a bit. I guess this is to avoid deadlocks
> when locking multiple folios, but I think at least on the first one we
> need an unconditional folio_lock to guarantee forward progress.

At least on the hmm path I was able to trigger -EBUSY a couple of times 
due to concurrent swapout, but the hmm-tests selftest fails immediately 
instead of retrying.
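
For illustration, here is a minimal sketch of what a retrying caller could 
look like, loosely modeled on the nouveau SVM atomics path. The helper 
name, the owner argument and the retry bound are made up for the sketch; 
only make_device_exclusive_range(), the "caller holds mmap_read_lock()" 
rule and the "successful pages come back locked and referenced" convention 
are taken from the existing code.

#include <linux/mm.h>
#include <linux/rmap.h>
#include <linux/sched.h>

/*
 * Hypothetical helper: convert one page to a device-exclusive entry so the
 * device can perform an atomic on it, retrying if a concurrent user (e.g.,
 * swapout holding the folio lock) made the conversion fail this time.
 */
static struct page *sketch_map_atomic_page(struct mm_struct *mm,
					   unsigned long addr, void *owner)
{
	struct page *page;
	int retries = 10;	/* arbitrary bound for the sketch */
	int ret;

	do {
		page = NULL;
		mmap_read_lock(mm);
		ret = make_device_exclusive_range(mm, addr, addr + PAGE_SIZE,
						  &page, owner);
		mmap_read_unlock(mm);
		if (ret > 0 && page)
			return page;	/* locked + referenced for the caller */
		cond_resched();		/* give swapout/reclaim a chance to finish */
	} while (--retries);

	return NULL;
}

Once the atomic sequence is done, the caller would unlock_page() and 
put_page() the returned page, as nouveau does today.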

> Since
> atomics can't cross 4k boundaries (or the hw is just really broken) this
> should be enough to avoid being stuck in a livelock. I'm also not seeing
> any other reason why a folio_lock shouldn't work here, but then my
> understanding of mm/ stuff is really just scratching the surface.
> 
> I did crawl through all the other code and it looks like everything else
> is unconditional locks. So looks all good and I didn't spot anything else
> that seemed problematic.
> 
> Somewhat aside, I do wonder whether we really want to require callers to
> hold the mmap lock, or whether, with all the work towards lockless
> fastpaths, that shouldn't instead just be an implementation detail.

We might be able to use the VMA lock in the future, but that will 
require GUP support and a bunch more. Until then, the mmap_lock in read 
mode is required.

I was not able to convince myself that we'll really need the folio lock, 
but that's also a separate discussion.

> At least for the
> gpu hmm code I've seen, I've tried to push hard towards a world where the
> gpu side does not rely on mmap_read_lock being held at all, to
> future-proof this all. And currently we only have one caller of
> make_device_exclusive_range(), so it would be simple to do.

We could likely move the mmap_lock into that function, but avoiding it 
is more effort.
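
Just to spell out what making the lock an implementation detail could look 
like, a trivial sketch (the name is made up; whether we'd additionally want 
to keep a variant for callers that already hold the lock is a separate 
question):

/*
 * Sketch only: a wrapper that takes the mmap_lock itself, so callers no
 * longer need to hold it across the call. The in-tree function currently
 * expects the caller to hold mmap_read_lock().
 */
static int make_device_exclusive_range_locked(struct mm_struct *mm,
					      unsigned long start,
					      unsigned long end,
					      struct page **pages, void *owner)
{
	int ret;

	mmap_read_lock(mm);
	ret = make_device_exclusive_range(mm, start, end, pages, owner);
	mmap_read_unlock(mm);

	return ret;
}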

In any case, I'll send something out probably tomorrow to fix page 
migration/swapout of pages with device-exclusive entries and a bunch of 
other things (THP, interaction with hugetlb, ...).

-- 
Cheers,

David / dhildenb




Thread overview: 16+ messages
2025-01-23 10:20 David Hildenbrand
2025-01-23 15:08 ` Simona Vetter
2025-01-24 10:44   ` David Hildenbrand
2025-01-24 14:11     ` Jason Gunthorpe
2025-01-24 14:39       ` David Hildenbrand
2025-01-24 15:28     ` Simona Vetter
2025-01-24 17:54       ` David Hildenbrand
2025-01-28  0:09         ` Alistair Popple
2025-01-28 20:14           ` Simona Vetter
2025-01-28 20:24             ` David Hildenbrand [this message]
2025-01-29 10:48               ` Simona Vetter
2025-01-29 11:28                 ` Simona Vetter
2025-01-29 11:31                   ` David Hildenbrand
2025-01-29 14:05                     ` Simona Vetter
2025-01-29 16:13                       ` David Hildenbrand
2025-01-30  8:55                         ` Simona Vetter
