From: David Hildenbrand <david@redhat.com>
To: Nadav Amit <namit@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>, Peter Xu <peterx@redhat.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Andy Lutomirski <luto@kernel.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Will Deacon <will@kernel.org>, Yu Zhao <yuzhao@google.com>,
	Nick Piggin <npiggin@gmail.com>,
	"x86@kernel.org" <x86@kernel.org>
Subject: Re: [PATCH 2/2] mm/mprotect: do not flush on permission promotion
Date: Thu, 7 Oct 2021 19:07:24 +0200
Message-ID: <1952fc7c-fb21-7d0e-661b-afa59b4580e5@redhat.com>
In-Reply-To: <5356D62E-1900-4E92-AF23-AA5625EFFD92@vmware.com>

On 07.10.21 18:16, Nadav Amit wrote:
> 
>> On Oct 7, 2021, at 5:13 AM, David Hildenbrand <david@redhat.com> wrote:
>>
>> On 25.09.21 22:54, Nadav Amit wrote:
>>> From: Nadav Amit <namit@vmware.com>
>>> Currently, using mprotect() or uffd to unprotect a memory region
>>> causes a TLB flush. At least on x86, when protection is only
>>> promoted, no TLB flush is needed.
>>> Add an arch-specific pte_may_need_flush() which tells whether a TLB
>>> flush is needed based on the old PTE and the new one. Implement an x86
>>> pte_may_need_flush().
>>> For x86, PTE protection promotion or changes of software bits do not
>>> require a flush. Also add logic that considers the dirty-bit. Changes to
>>> the access-bit do not trigger a TLB flush, although architecturally they
>>> should, as Linux considers the access-bit as a hint.
>>
>> Is the added LOC worth the benefit? IOW, do we have some benchmark that really benefits from that?
> 
> So you ask whether the added ~10 LOC (net) is worth the benefit?

I read "3 files changed, 46 insertions(+), 1 deletion(-)" as optimizing
something without proof, so I naturally have to ask. This is just the
usual "when we optimize, we show numbers to prove it" comment.

> 
> Let’s start with the cost of this patch.
> 
> If you ask about complexity, I think that it is a rather simple
> patch and documented as needed. Please be more concrete if you
> think otherwise.

It is most certainly added complexity, although documented cleanly.
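
For readers weighing that complexity, here is a minimal user-space sketch
of the kind of check the quoted patch description outlines. It is not the
code from the patch: the function name model_pte_may_need_flush() and the
PTE_* macros are hypothetical, the bit positions follow the x86-64 PTE
layout, and the rules for the dirty bit, the NX bit, and non-present
entries are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86-64 PTE bit positions; the macro names are hypothetical. */
#define PTE_PRESENT	(1ULL << 0)
#define PTE_RW		(1ULL << 1)
#define PTE_ACCESSED	(1ULL << 5)
#define PTE_DIRTY	(1ULL << 6)
#define PTE_SOFT	(7ULL << 9)	/* bits 9-11, available to software */
#define PTE_NX		(1ULL << 63)

/*
 * Model of a "may need flush" decision based on the old and new PTE.
 * Assumption: only demotions and other hardware-visible changes need
 * a TLB flush.
 */
static bool model_pte_may_need_flush(uint64_t oldpte, uint64_t newpte)
{
	uint64_t diff = oldpte ^ newpte;

	/* Assumption: a non-present PTE is never cached in the TLB. */
	if (!(oldpte & PTE_PRESENT))
		return false;

	/* Software bits are invisible to hardware; accessed is a hint. */
	diff &= ~(PTE_SOFT | PTE_ACCESSED);

	/* Promotion: write permission was added. */
	if ((diff & PTE_RW) && (newpte & PTE_RW))
		diff &= ~PTE_RW;

	/* Promotion: execute permission was added (NX cleared). */
	if ((diff & PTE_NX) && !(newpte & PTE_NX))
		diff &= ~PTE_NX;

	/* Assumption: setting dirty is harmless, clearing it is not. */
	if ((diff & PTE_DIRTY) && (newpte & PTE_DIRTY))
		diff &= ~PTE_DIRTY;

	/* Anything left is a demotion or another change that matters. */
	return diff != 0;
}
```

In this model, going from read-only to read-write reports that no flush
is needed, while the reverse transition still requires one.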

> 
> If you ask about the runtime overhead, my experience is that
> such code, which mostly does bit operations, has negligible cost.
> The execution time of mprotect code, and other similar pieces of
> code, is mostly dominated by walking the page-tables & getting
> the pages (which might require cold or random memory accesses),
> acquiring the locks, and of course the TLB flushes that this
> patch tries to eliminate.

I'm absolutely not concerned about runtime overhead :)

> 
> As for the benefit: a TLB flush of a single PTE on x86 has an
> overhead of ~200 cycles. If a TLB shootdown is needed, for instance
> in multithreaded applications, this overhead can grow to a few
> microseconds or even more, depending on the number of sockets,
> whether the workload runs in a VM (and worse if CPUs are
> overcommitted) and so on.
> 
> This overhead is completely unnecessary on many occasions, for
> example if you run mprotect() to add permissions or, as I noted
> in my case, do something similar using userfaultfd. Note that the
> potentially unnecessary TLB flush/shootdown takes place while
> you hold the mmap-lock for write in the case of mprotect(),
> thereby potentially preventing other threads from making
> progress during that time.
> 
> On my in-development workload it was a considerable overhead
> (I didn’t collect numbers though). Basically, I track dirty
> pages using uffd, and every page-fault that can be easily
> resolved by unprotecting causes a TLB flush/shootdown.

Any numbers would be helpful.
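
A microbenchmark along the following lines could provide such numbers.
This is only a sketch under stated assumptions: bench_mprotect_toggle()
is a hypothetical helper, it runs single-threaded (so it exercises local
TLB flushes, not cross-CPU shootdowns), and it times the whole
protect/unprotect/touch loop rather than isolating the flush itself.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/*
 * Toggle an anonymous mapping between read-only and read-write and
 * return the average time in nanoseconds per protect/unprotect pair,
 * or -1.0 on error. Hypothetical helper, for illustration only.
 */
static double bench_mprotect_toggle(size_t pages, unsigned int iters)
{
	long psz = sysconf(_SC_PAGESIZE);
	struct timespec t0, t1;
	unsigned int n;
	size_t len, i;
	char *buf;

	if (psz <= 0 || pages == 0 || iters == 0)
		return -1.0;

	len = pages * (size_t)psz;
	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1.0;

	/* Fault every page in so the loop works on populated PTEs. */
	for (i = 0; i < pages; i++)
		((volatile char *)buf)[i * (size_t)psz] = 1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (n = 0; n < iters; n++) {
		/* Demotion: write access is removed. */
		if (mprotect(buf, len, PROT_READ))
			goto fail;
		/* Promotion: the case the patch wants to make cheaper. */
		if (mprotect(buf, len, PROT_READ | PROT_WRITE))
			goto fail;
		/* Touch the pages again so translations are in use. */
		for (i = 0; i < pages; i++)
			((volatile char *)buf)[i * (size_t)psz] = 1;
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	munmap(buf, len);
	return ((double)(t1.tv_sec - t0.tv_sec) * 1e9 +
		(double)(t1.tv_nsec - t0.tv_nsec)) / iters;
fail:
	munmap(buf, len);
	return -1.0;
}
```

Calling, for example, bench_mprotect_toggle(1, 100000) on kernels with
and without the patch would give a first comparison; a multithreaded
variant would be needed to see the shootdown cost.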

> 
> If you want, I will write a microbenchmark and give you numbers.
> If you look for further optimizations (although you did not indicate
> so), such as doing the TLB batching from do_mprotect_key(),
> (i.e. batching across VMAs), we can discuss it and apply it on
> top of these patches.

I think this patch by itself is sufficient if we can show a benefit. I do
wonder whether existing benchmarks could already show one; I feel like
they should if this makes a difference. Excessive mprotect() usage
(protect<>unprotect) isn't unusual.

-- 
Thanks,

David / dhildenb



Thread overview: 17+ messages
2021-09-25 20:54 [PATCH 0/2] mm/mprotect: avoid unnecessary TLB flushes Nadav Amit
2021-09-25 20:54 ` [PATCH 1/2] mm/mprotect: use mmu_gather Nadav Amit
2021-10-03 12:10   ` Peter Zijlstra
2021-10-04 19:24     ` Nadav Amit
2021-10-05  6:53       ` Peter Zijlstra
2021-10-05 16:34         ` Nadav Amit
2021-10-11  3:45   ` Nadav Amit
2021-10-12 10:16   ` Peter Xu
2021-10-12 17:31     ` Nadav Amit
2021-10-12 23:20       ` Peter Xu
2021-10-13 15:59         ` Nadav Amit
2021-09-25 20:54 ` [PATCH 2/2] mm/mprotect: do not flush on permission promotion Nadav Amit
2021-10-07 12:13   ` David Hildenbrand
2021-10-07 16:16     ` Nadav Amit
2021-10-07 17:07       ` David Hildenbrand [this message]
2021-10-08  6:06         ` Nadav Amit
2021-10-08  7:35           ` David Hildenbrand
