From: Ralph Campbell <rcampbell@nvidia.com>
To: Jerome Glisse <jglisse@redhat.com>, Jason Gunthorpe <jgg@ziepe.ca>
Cc: <linux-rdma@vger.kernel.org>, <linux-mm@kvack.org>,
John Hubbard <jhubbard@nvidia.com>
Subject: Re: [RFC PATCH 00/11] mm/hmm: Various revisions from a locking/code review
Date: Fri, 24 May 2019 10:47:16 -0700
Message-ID: <7f82b770-85a3-9b01-48b2-9e458191b8d6@nvidia.com>
In-Reply-To: <20190524164902.GA3346@redhat.com>
On 5/24/19 9:49 AM, Jerome Glisse wrote:
> On Fri, May 24, 2019 at 11:36:49AM -0300, Jason Gunthorpe wrote:
>> On Thu, May 23, 2019 at 12:34:25PM -0300, Jason Gunthorpe wrote:
>>> From: Jason Gunthorpe <jgg@mellanox.com>
>>>
>>> This patch series arose out of discussions with Jerome when looking at the
>>> ODP changes, particularly informed by use after free races we have already
>>> found and fixed in the ODP code (thanks to syzkaller) working with mmu
>>> notifiers, and the discussion with Ralph on how to resolve the lifetime model.
>>
>> So the last big difference with ODP's flow is how 'range->valid'
>> works.
>>
>> In ODP this was done using the rwsem umem->umem_rwsem which is
>> obtained for read in invalidate_start and released in invalidate_end.
>>
>> Then any other threads that wish to only work on a umem which is not
>> undergoing invalidation will obtain the write side of the lock, and
>> within that lock's critical section the virtual address range is known
>> to not be invalidating.
>>
>> I cannot understand how hmm gets to the same approach. It has
>> range->valid, but it is not locked by anything that I can see, so when
>> we test it in places like hmm_range_fault it seems useless..
>>
>> Jerome, how does this work?
>>
>> I have a feeling we should copy the approach from ODP and use an
>> actual lock here.
>
> range->valid is used to bail out early if invalidation is happening in
> hmm_range_fault() to avoid doing useless work. The synchronization
> is explained in the documentation:
>
>
> Locking within the sync_cpu_device_pagetables() callback is the most important
> aspect the driver must respect in order to keep things properly synchronized.
> The usage pattern is::
>
>  int driver_populate_range(...)
>  {
>      struct hmm_range range;
>      ...
>
>      range.start = ...;
>      range.end = ...;
>      range.pfns = ...;
>      range.flags = ...;
>      range.values = ...;
>      range.pfn_shift = ...;
>      hmm_range_register(&range);
>
>      /*
>       * Just wait for range to be valid, safe to ignore return value as we
>       * will use the return value of hmm_range_snapshot() below under the
>       * mmap_sem to ascertain the validity of the range.
>       */
>      hmm_range_wait_until_valid(&range, TIMEOUT_IN_MSEC);
>
>  again:
>      down_read(&mm->mmap_sem);
>      ret = hmm_range_snapshot(&range);
>      if (ret) {
>          up_read(&mm->mmap_sem);
>          if (ret == -EAGAIN) {
>              /*
>               * No need to check hmm_range_wait_until_valid() return value
>               * on retry we will get proper error with hmm_range_snapshot()
>               */
>              hmm_range_wait_until_valid(&range, TIMEOUT_IN_MSEC);
>              goto again;
>          }
>          hmm_range_unregister(&range);
>          return ret;
>      }
>      take_lock(driver->update);
>      if (!hmm_range_valid(&range)) {
>          release_lock(driver->update);
>          up_read(&mm->mmap_sem);
>          goto again;
>      }
>
>      // Use pfns array content to update device page table
>
>      hmm_range_unregister(&range);
>      release_lock(driver->update);
>      up_read(&mm->mmap_sem);
>      return 0;
>  }
>
> The driver->update lock is the same lock that the driver takes inside its
> sync_cpu_device_pagetables() callback. That lock must be held before calling
> hmm_range_valid() to avoid any race with a concurrent CPU page table update.
>
>
> Cheers,
> Jérôme

Given the above, the following patch looks necessary to me.
Also, drivers/gpu/drm/nouveau/nouveau_svm.c does not check the return
value before calling up_read(&mm->mmap_sem), so it would release
mmap_sem a second time whenever hmm_range_fault() has already dropped
it on the -EAGAIN path. In any case, it is better to keep the mmap_sem
lock/unlock paired in the caller.

diff --git a/mm/hmm.c b/mm/hmm.c
index 836adf613f81..8b6ef97a8d71 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -1092,10 +1092,8 @@ long hmm_range_fault(struct hmm_range *range, bool block)
 
 	do {
 		/* If range is no longer valid force retry. */
-		if (!range->valid) {
-			up_read(&hmm->mm->mmap_sem);
+		if (!range->valid)
 			return -EAGAIN;
-		}
 
 		vma = find_vma(hmm->mm, start);
 		if (vma == NULL || (vma->vm_flags & device_vma))