From: Dave Hansen <dave.hansen@intel.com>
To: Anshuman Khandual <anshuman.khandual@arm.com>,
	linux-mm@kvack.org, akpm@linux-foundation.org
Cc: mhocko@kernel.org, kirill@shutemov.name,
	kirill.shutemov@linux.intel.com, vbabka@suse.cz,
	will.deacon@arm.com, catalin.marinas@arm.com
Subject: Re: [RFC 0/4] mm: Introduce lazy exec permission setting on a page
Date: Mon, 18 Feb 2019 10:20:15 -0800
Message-ID: <d4fcaa44-344e-e8f1-f01f-e2f25f46fffb@intel.com>
In-Reply-To: <3da12849-bc56-cb9b-f13f-e15d42416223@arm.com>

On 2/18/19 12:31 AM, Anshuman Khandual wrote:
>> Ahh, got it.  I also assume that the Accessed bit on these platforms is
>> also managed similar to how we do it on x86 such that it can't be used
>> to drive invalidation decisions?
> 
> Drive I-cache invalidation ? Could you please elaborate on this. Is not that
> the access bit mechanism is to identify dirty pages after write faults when
> it is SW updated or write accesses when HW updated. In SW updated method, given
> PTE goes through pte_young() during page fault. Then how to differentiate exec
> fault/access from an write fault/access and decide to invalidate the I-cache.
> Just being curious.

Let's say this was on x86 where the Accessed bit is set by the hardware
on any access.  Let's also say that Linux invalidated the TLB any time
that bit was cleared in software (it doesn't, but let's pretend it did).

In that case, if we needed to do icache invalidation, we could optimize
it by only invalidating the icache when we see the Accessed bit set.
That's because any execution would first set the Accessed bit before the
icache was populated.
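
As a rough illustration of that hypothetical, here is a toy model in C. It is only a sketch of the reasoning, not kernel code: the PTE layout, the TOY_PTE_ACCESSED bit, and the toy_* helpers are all made up for the example, and it bakes in the pretend assumption above that clearing the bit also invalidates the TLB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy PTE layout: bit 0 stands in for the Accessed bit. */
#define TOY_PTE_ACCESSED (1u << 0)

/* Counts how many icache invalidations the toy model has issued. */
static unsigned int toy_icache_invals;

/* Stand-in for a real icache invalidation; here it only counts calls. */
static void toy_invalidate_icache(void)
{
	toy_icache_invals++;
}

/*
 * Called when a page's contents change (for example after migration).
 * Returns true if an invalidation was issued.
 */
static bool toy_sync_icache(uint32_t *pte)
{
	assert(pte != NULL);

	/* Accessed clear: nothing executed from the page, so skip the work. */
	if (!(*pte & TOY_PTE_ACCESSED))
		return false;

	toy_invalidate_icache();

	/* Model assumption: clearing the bit also invalidates the TLB. */
	*pte &= ~TOY_PTE_ACCESSED;
	return true;
}
```

With a PTE whose Accessed bit is clear, toy_sync_icache() does nothing; once the bit has been set by an access, the next call issues exactly one invalidation and clears the bit again.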

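So, my question was whether the Accessed bit on these platforms could be
used to drive icache invalidation decisions in that way.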

>>>> Any idea which one it is?
>>>
>>> I am not sure about this particular reported case. But was able to reproduce
>>> the problem through a test case where a buffer was mapped with R|W|X, get it
>>> faulted/mapped through write, migrate and then execute from it.
>>
>> Could you make sure, please?
> 
> The test in the report [1] does not create any explicit PROT_EXEC maps and just
> attempts to migrate all pages of the process (which has 10 child processes)
> including the exec pages. So the only exec mappings would be the primary text
> segment and the mapped shared glibc segment. Looks like the shared libraries
> have some mapped pages.

Yeah, but the executable ones are also read-only in your example:

> $cat /proc/[PID]/numa_maps  | grep libc
> 
> ffffaa4c9000 default file=/lib/aarch64-linux-gnu/libc-2.28.so mapped=150 mapmax=57 N0=150 kernelpagesize_kB=4

^ These are all page-cache, executable and read-only.

> ffffaa621000 default file=/lib/aarch64-linux-gnu/libc-2.28.so
> ffffaa630000 default file=/lib/aarch64-linux-gnu/libc-2.28.so anon=4 dirty=4 mapmax=11 N0=4 kernelpagesize_kB=4
> ffffaa634000 default file=/lib/aarch64-linux-gnu/libc-2.28.so anon=2 dirty=2 mapmax=11 N0=2 kernelpagesize_kB=4

This last one is the only read-write one and it's not executable.


>> Write and Execute at the same time are generally a "bad idea".  Given
> 
> But wont this be the case for all run-time generate code which gets written to a
> buffer before being executed from there.

No.  They usually are r=1,w=1,x=0, then transition to r=1,w=0,x=1.  It's
never simultaneously executable and writable.
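
A minimal userspace sketch of that transition, assuming Linux with GCC or Clang. The helper names jit_alloc_rw() and jit_publish() are made up for illustration; mmap(), mprotect() and __builtin___clear_cache() are the real interfaces. The sketch only flips the permissions and never jumps into the buffer, since the bytes to execute would be architecture-specific.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: map an anonymous buffer with r=1,w=1,x=0. */
static void *jit_alloc_rw(size_t len)
{
	void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return buf == MAP_FAILED ? NULL : buf;
}

/*
 * Hypothetical helper: copy the generated code in while the buffer is
 * still writable, then switch it to r=1,w=0,x=1. Returns 0 on success.
 */
static int jit_publish(void *buf, size_t len, const void *code, size_t code_len)
{
	assert(code_len <= len);
	memcpy(buf, code, code_len);

	/* Make the new instructions visible to instruction fetch. */
	__builtin___clear_cache((char *)buf, (char *)buf + code_len);

	/* Drop write, add exec: the mapping is never W and X at once. */
	return mprotect(buf, len, PROT_READ | PROT_EXEC);
}
```

A caller would allocate with jit_alloc_rw(), emit code into a scratch array, and call jit_publish() once before executing from the buffer. To patch the code later, it would mprotect() back to PROT_READ | PROT_WRITE first, so the two permissions still never overlap.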

>> the hardware, I'm not surprised that this problem pops up, but it would
>> be great to find out if this is a real application, or a "doctor it
>> hurts when I do this."
> 
> Is not that a problem though :)

The point is that it's not a real-world problem.  You can certainly
expose this, but do *real* apps do this rather than something entirely
synthetic?

>> This set generally seems to be assuming an environment with "lots of
>> migration, and not much execution".  That seems like a kinda odd
>> situation to me.
> 
> Irrespective of the reported problem which is user driven, there are many kernel
> triggered migrations which can accumulate I-cache invalidation cost over time on
> a memory heavy system with high number of exec enabled user pages. Will that be
> such a rare situation !
> 
> [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-December/620357.html

I translate "trivial C application" to "highly synthetic microbenchmark".

I suspect what's happening here is that somebody wrote a microbenchmark
that worked well on x86, although it was being rather stupid.  Somebody
got an arm system, and voila: it's slower.  Someone says "Oh, this arm
system is slower than x86!"

Again, the big question: do you have real-world applications with
writable, executable pages?  The kernel essentially has *zero* of these
because they're such a massive security risk.  Adding this feature will
encourage folks to replicate this massive security risk in userspace.

Seems like a bad idea.

