From: Jerome Glisse <jglisse@redhat.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
"Peter Xu" <peterx@redhat.com>,
"Peter Zijlstra" <peterz@infradead.org>,
"Ingo Molnar" <mingo@redhat.com>,
"Arnaldo Carvalho de Melo" <acme@kernel.org>,
"Alexander Shishkin" <alexander.shishkin@linux.intel.com>,
"Jiri Olsa" <jolsa@redhat.com>,
"Namhyung Kim" <namhyung@kernel.org>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Radim Krčmář" <rkrcmar@redhat.com>,
"Michal Hocko" <mhocko@kernel.org>,
kvm@vger.kernel.org
Subject: Re: [RFC PATCH 1/4] uprobes: use set_pte_at() not set_pte_at_notify()
Date: Mon, 11 Feb 2019 14:27:56 -0500
Message-ID: <20190211192755.GC3908@redhat.com>
In-Reply-To: <20190202005022.GC12463@redhat.com>
Background: we are discussing __replace_page() in
kernel/events/uprobes.c
and whether it can be called on a page that can be written to through
its virtual address mapping.
On Fri, Feb 01, 2019 at 07:50:22PM -0500, Andrea Arcangeli wrote:
> On Thu, Jan 31, 2019 at 01:37:03PM -0500, Jerome Glisse wrote:
> > @@ -207,8 +207,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
> >
> > flush_cache_page(vma, addr, pte_pfn(*pvmw.pte));
> > ptep_clear_flush_notify(vma, addr, pvmw.pte);
> > - set_pte_at_notify(mm, addr, pvmw.pte,
> > - mk_pte(new_page, vma->vm_page_prot));
> > + set_pte_at(mm, addr, pvmw.pte, mk_pte(new_page, vma->vm_page_prot));
> >
> > page_remove_rmap(old_page, false);
> > if (!page_mapped(old_page))
>
> This seems racy by design in the way it copies the page, if the vma
> mapping isn't readonly to begin with (in which case it'd be ok to
> change the pfn with change_pte too, it'd be a from read-only to
> read-only change which is ok).
>
> If the code copies a writable page there's no much issue if coherency
> is lost by other means too.
I am not sure the race exists, but I am not familiar with the uprobes
code. Maybe the page is already write protected, in which case the copy
is fine. That is likely the case, but there is no check for it. Maybe
there should be one?
Maybe someone familiar with this code can chime in.
>
> Said that this isn't a worthwhile optimization for uprobes so because
> of the lack of explicit read-only enforcement, I agree it's simpler to
> skip change_pte above.
>
> It's orthogonal, but in this function the
> mmu_notifier_invalidate_range_end(&range); can be optimized to
> mmu_notifier_invalidate_range_only_end(&range); otherwise there's no
> point to retain the _notify in ptep_clear_flush_notify.
We need to keep the _notify for the IOMMU, otherwise it would break.
The IOMMU can walk the page table at any time, so we must first
clear the pte, then notify the IOMMU so that it flushes the TLB on all
the devices that might have a TLB entry. Only then can we set the new pte.
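To illustrate the ordering, here is a minimal user-space sketch. It is a
toy model and not kernel code: struct toy_pte, struct toy_dev_tlb,
toy_dev_translate() and toy_replace_page() are all made-up names, and the
single cached entry only stands in for a device TLB. It assumes the device
fills its TLB by walking the same page table on a miss.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model, NOT kernel code. Every name below is hypothetical. */
struct toy_pte { unsigned long pfn; bool present; };
struct toy_dev_tlb { unsigned long pfn; bool valid; };

/* Device access: on a TLB miss the device walks the shared page table. */
static bool toy_dev_translate(struct toy_dev_tlb *tlb,
			      const struct toy_pte *pte, unsigned long *pfn)
{
	if (!tlb->valid) {
		if (!pte->present)
			return false;	/* device faults and has to retry */
		tlb->pfn = pte->pfn;
		tlb->valid = true;
	}
	*pfn = tlb->pfn;
	return true;
}

/*
 * Replace old pfn 100 with new pfn 200 and return the pfn the device
 * uses afterwards. With flush == false the device TLB invalidation
 * (the "_notify" step in this model) is skipped.
 */
static unsigned long toy_replace_page(bool flush)
{
	struct toy_pte pte = { .pfn = 100, .present = true };
	struct toy_dev_tlb tlb = { .pfn = 0, .valid = false };
	unsigned long pfn = 0;
	bool ok;

	ok = toy_dev_translate(&tlb, &pte, &pfn);	/* device caches pfn 100 */
	assert(ok && pfn == 100);

	pte.present = false;		/* 1. clear the pte */
	if (flush) {
		tlb.valid = false;	/* 2. notify: device drops its entry */
		/* An access in this window faults instead of using a stale pfn. */
		ok = toy_dev_translate(&tlb, &pte, &pfn);
		assert(!ok);
	}
	pte.pfn = 200;			/* 3. only now set the new pte */
	pte.present = true;

	ok = toy_dev_translate(&tlb, &pte, &pfn);
	assert(ok);
	return pfn;
}
```

In this model toy_replace_page(true) leaves the device on the new page,
while toy_replace_page(false) leaves it using the old one, which is the
breakage I am worried about if the _notify is dropped.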
But yes, mmu_notifier_invalidate_range_end() can be optimized to
mmu_notifier_invalidate_range_only_end(). I will do a separate patch
for this, as it is orthogonal, as you pointed out :)
Cheers,
Jérôme