From: Alistair Popple <apopple@nvidia.com>
To: "Kasireddy, Vivek" <vivek.kasireddy@intel.com>
Cc: "dri-devel@lists.freedesktop.org"
<dri-devel@lists.freedesktop.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
David Hildenbrand <david@redhat.com>,
Mike Kravetz <mike.kravetz@oracle.com>,
Hugh Dickins <hughd@google.com>, Peter Xu <peterx@redhat.com>,
Jason Gunthorpe <jgg@nvidia.com>,
Gerd Hoffmann <kraxel@redhat.com>,
"Kim, Dongwon" <dongwon.kim@intel.com>,
"Chang, Junxiao" <junxiao.chang@intel.com>
Subject: Re: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)
Date: Thu, 20 Jul 2023 19:00:52 +1000 [thread overview]
Message-ID: <87pm4nj6s5.fsf@nvdebian.thelocal> (raw)
In-Reply-To: <IA0PR11MB718527A20068383829DFF6A3F83EA@IA0PR11MB7185.namprd11.prod.outlook.com>
"Kasireddy, Vivek" <vivek.kasireddy@intel.com> writes:
> Hi Alistair,
>
>>
>> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> > index 64a3239b6407..1f2f0209101a 100644
>> > --- a/mm/hugetlb.c
>> > +++ b/mm/hugetlb.c
>> > @@ -6096,8 +6096,12 @@ vm_fault_t hugetlb_fault(struct mm_struct
>> *mm, struct vm_area_struct *vma,
>> > * hugetlb_no_page will drop vma lock and hugetlb fault
>> > * mutex internally, which make us return immediately.
>> > */
>> > - return hugetlb_no_page(mm, vma, mapping, idx, address,
>> ptep,
>> > + ret = hugetlb_no_page(mm, vma, mapping, idx, address,
>> ptep,
>> > entry, flags);
>> > + if (!ret)
>> > + mmu_notifier_update_mapping(vma->vm_mm,
>> address,
>> > + pte_pfn(*ptep));
>>
>> The next patch ends up calling pfn_to_page() on the result of
>> pte_pfn(*ptep). I don't think that's safe because couldn't the PTE have
>> already changed and/or the new page have been freed?
> Yeah, that might be possible; I believe the right thing to do would be:
> - return hugetlb_no_page(mm, vma, mapping, idx, address, ptep,
> + ret = hugetlb_no_page(mm, vma, mapping, idx, address, ptep,
> entry, flags);
> + if (!ret) {
> + ptl = huge_pte_lock(h, mm, ptep);
> + mmu_notifier_update_mapping(vma->vm_mm, address,
> + pte_pfn(*ptep));
> + spin_unlock(ptl);
> + }
Yes, although obviously as I think you point out below you wouldn't be
able to take any sleeping locks in mmu_notifier_update_mapping().
> In which case I'd need to make a similar change in the shmem path as well.
> And, also redo (or eliminate) the locking in udmabuf (patch) which seems a
> bit excessive on a second look given our use-case (where reads and writes do
> not happen simultaneously due to fence synchronization in the guest driver).
I'm not at all familiar with the udmabuf use case but that sounds
brittle and effectively makes this notifier udmabuf specific right?
The name gives the impression it is more general though. I have
contemplated adding a notifier for PTE updates for drivers using
hmm_range_fault() as it would save some expensive device faults and it
this could be useful for that.
So if we're adding a notifier for PTE updates I think it would be good
if it covered all cases and was robust enough to allow mirroring of the
correct PTE value (ie. by being called under PTL or via some other
synchronisation like hmm_range_fault()).
Thanks.
> Thanks,
> Vivek
>
>>
>> > + return ret;
>> >
>> > ret = 0;
>> >
>> > @@ -6223,6 +6227,9 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm,
>> struct vm_area_struct *vma,
>> > */
>> > if (need_wait_lock)
>> > folio_wait_locked(folio);
>> > + if (!ret)
>> > + mmu_notifier_update_mapping(vma->vm_mm, address,
>> > + pte_pfn(*ptep));
>> > return ret;
>> > }
>> >
>> > diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
>> > index 50c0dde1354f..6421405334b9 100644
>> > --- a/mm/mmu_notifier.c
>> > +++ b/mm/mmu_notifier.c
>> > @@ -441,6 +441,23 @@ void __mmu_notifier_change_pte(struct
>> mm_struct *mm, unsigned long address,
>> > srcu_read_unlock(&srcu, id);
>> > }
>> >
>> > +void __mmu_notifier_update_mapping(struct mm_struct *mm, unsigned
>> long address,
>> > + unsigned long pfn)
>> > +{
>> > + struct mmu_notifier *subscription;
>> > + int id;
>> > +
>> > + id = srcu_read_lock(&srcu);
>> > + hlist_for_each_entry_rcu(subscription,
>> > + &mm->notifier_subscriptions->list, hlist,
>> > + srcu_read_lock_held(&srcu)) {
>> > + if (subscription->ops->update_mapping)
>> > + subscription->ops->update_mapping(subscription,
>> mm,
>> > + address, pfn);
>> > + }
>> > + srcu_read_unlock(&srcu, id);
>> > +}
>> > +
>> > static int mn_itree_invalidate(struct mmu_notifier_subscriptions
>> *subscriptions,
>> > const struct mmu_notifier_range *range)
>> > {
>> > diff --git a/mm/shmem.c b/mm/shmem.c
>> > index 2f2e0e618072..e59eb5fafadb 100644
>> > --- a/mm/shmem.c
>> > +++ b/mm/shmem.c
>> > @@ -77,6 +77,7 @@ static struct vfsmount *shm_mnt;
>> > #include <linux/fcntl.h>
>> > #include <uapi/linux/memfd.h>
>> > #include <linux/rmap.h>
>> > +#include <linux/mmu_notifier.h>
>> > #include <linux/uuid.h>
>> >
>> > #include <linux/uaccess.h>
>> > @@ -2164,8 +2165,12 @@ static vm_fault_t shmem_fault(struct vm_fault
>> *vmf)
>> > gfp, vma, vmf, &ret);
>> > if (err)
>> > return vmf_error(err);
>> > - if (folio)
>> > + if (folio) {
>> > vmf->page = folio_file_page(folio, vmf->pgoff);
>> > + if (ret == VM_FAULT_LOCKED)
>> > + mmu_notifier_update_mapping(vma->vm_mm, vmf-
>> >address,
>> > + page_to_pfn(vmf->page));
>> > + }
>> > return ret;
>> > }
next prev parent reply other threads:[~2023-07-20 9:01 UTC|newest]
Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-18 8:28 [RFC v1 0/3] udmabuf: Replace pages when there is FALLOC_FL_PUNCH_HOLE in memfd Vivek Kasireddy
2023-07-18 8:28 ` [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages) Vivek Kasireddy
2023-07-18 15:36 ` Jason Gunthorpe
2023-07-19 0:05 ` Kasireddy, Vivek
2023-07-19 0:24 ` Jason Gunthorpe
2023-07-19 6:19 ` Kasireddy, Vivek
2023-07-19 2:08 ` Alistair Popple
2023-07-20 7:43 ` Kasireddy, Vivek
2023-07-20 9:00 ` Alistair Popple [this message]
2023-07-24 7:54 ` Kasireddy, Vivek
2023-07-24 13:35 ` Jason Gunthorpe
2023-07-24 20:32 ` Kasireddy, Vivek
2023-07-25 4:30 ` Hugh Dickins
2023-07-25 22:24 ` Kasireddy, Vivek
2023-07-27 21:43 ` Peter Xu
2023-07-29 0:08 ` Kasireddy, Vivek
2023-07-31 17:05 ` Peter Xu
2023-08-01 7:11 ` Kasireddy, Vivek
2023-08-01 21:57 ` Peter Xu
2023-08-03 8:08 ` Kasireddy, Vivek
2023-08-03 13:02 ` Peter Xu
2023-07-25 12:36 ` Jason Gunthorpe
2023-07-25 22:44 ` Kasireddy, Vivek
2023-07-25 22:53 ` Jason Gunthorpe
2023-07-27 7:34 ` Kasireddy, Vivek
2023-07-27 11:58 ` Jason Gunthorpe
2023-07-29 0:46 ` Kasireddy, Vivek
2023-07-30 23:09 ` Jason Gunthorpe
2023-08-01 5:32 ` Kasireddy, Vivek
2023-08-01 12:19 ` Jason Gunthorpe
2023-08-01 12:22 ` David Hildenbrand
2023-08-01 12:23 ` Jason Gunthorpe
2023-08-01 12:26 ` David Hildenbrand
2023-08-01 12:26 ` Jason Gunthorpe
2023-08-01 12:28 ` David Hildenbrand
2023-08-01 17:53 ` Kasireddy, Vivek
2023-08-01 18:19 ` Jason Gunthorpe
2023-08-03 7:35 ` Kasireddy, Vivek
2023-08-03 12:14 ` Jason Gunthorpe
2023-08-03 12:32 ` David Hildenbrand
2023-08-04 0:14 ` Alistair Popple
2023-08-04 6:39 ` Kasireddy, Vivek
2023-08-04 7:23 ` David Hildenbrand
2023-08-04 21:53 ` Kasireddy, Vivek
2023-08-04 12:49 ` Jason Gunthorpe
2023-08-08 7:37 ` Kasireddy, Vivek
2023-08-08 12:42 ` Jason Gunthorpe
2023-08-16 6:43 ` Kasireddy, Vivek
2023-08-21 9:02 ` Alistair Popple
2023-08-22 6:14 ` Kasireddy, Vivek
2023-08-22 8:15 ` Alistair Popple
2023-08-24 6:48 ` Kasireddy, Vivek
2023-08-28 4:38 ` Kasireddy, Vivek
2023-08-30 16:02 ` Jason Gunthorpe
2023-07-25 3:38 ` Alistair Popple
2023-07-24 13:36 ` Alistair Popple
2023-07-24 13:37 ` Jason Gunthorpe
2023-07-24 20:42 ` Kasireddy, Vivek
2023-07-25 3:14 ` Alistair Popple
2023-07-18 8:28 ` [RFC v1 2/3] udmabuf: Replace pages when there is FALLOC_FL_PUNCH_HOLE in memfd Vivek Kasireddy
2023-08-02 12:40 ` Daniel Vetter
2023-08-03 8:24 ` Kasireddy, Vivek
2023-08-03 8:32 ` Daniel Vetter
2023-07-18 8:28 ` [RFC v1 3/3] selftests/dma-buf/udmabuf: Add tests for huge pages and FALLOC_FL_PUNCH_HOLE Vivek Kasireddy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87pm4nj6s5.fsf@nvdebian.thelocal \
--to=apopple@nvidia.com \
--cc=david@redhat.com \
--cc=dongwon.kim@intel.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=hughd@google.com \
--cc=jgg@nvidia.com \
--cc=junxiao.chang@intel.com \
--cc=kraxel@redhat.com \
--cc=linux-mm@kvack.org \
--cc=mike.kravetz@oracle.com \
--cc=peterx@redhat.com \
--cc=vivek.kasireddy@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox