linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Maxwell Bland <mbland@motorola.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: "linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	"will@kernel.org" <will@kernel.org>,
	"dennis@kernel.org" <dennis@kernel.org>,
	"tj@kernel.org" <tj@kernel.org>, "cl@linux.com" <cl@linux.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"shikemeng@huaweicloud.com" <shikemeng@huaweicloud.com>,
	"david@redhat.com" <david@redhat.com>,
	"rppt@kernel.org" <rppt@kernel.org>,
	"anshuman.khandual@arm.com" <anshuman.khandual@arm.com>,
	"willy@infradead.org" <willy@infradead.org>,
	"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
	"rick.p.edgecombe@intel.com" <rick.p.edgecombe@intel.com>,
	"pcc@google.com" <pcc@google.com>,
	"rmk+kernel@armlinux.org.uk" <rmk+kernel@armlinux.org.uk>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"gshan@redhat.com" <gshan@redhat.com>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"Jonathan.Cameron@huawei.com" <Jonathan.Cameron@huawei.com>,
	"james.morse@arm.com" <james.morse@arm.com>,
	Andrew Wheeler <awheeler@motorola.com>
Subject: Re: [PATCH] arm64: allow post-init vmalloc PXNTable
Date: Tue, 13 Feb 2024 19:15:07 +0000	[thread overview]
Message-ID: <SEZPR03MB67865557061A2494B1A1243CB44F2@SEZPR03MB6786.apcprd03.prod.outlook.com> (raw)
In-Reply-To: <ZcurbvkUR-BoGTxu@FVFF77S0Q05N.cambridge.arm.com>

> From: Mark Rutland <mark.rutland@arm.com> On Tue, Feb 13, 2024 at 10:05:45AM
> -0600, Maxwell Bland wrote:
> > VMALLOC_START ffff800080000000 VMALLOC_END fffffbfff0000000 _text
> > ffffb6c0c1400000 _end        ffffb6c0c3e40000
> >
> > Setting VMALLOC_END to _text in init would resolve this issue with the
> > caveat of a sizeable reduction in the size of available vmalloc memory due
> > to requirements on aslr randomness. However, there are circumstances where
> > this trade-off is necessary: in particular, hypervisor-level security
> > monitors where 1) the microarchitecture contains race conditions on PTE
> > level updates or 2) a per-PTE update verifier comes at a significant hit to
> > performance.
>
> Which "hypervisor-level security monitors" are you referring to?

Right now there are around 4 or 5 different attempts (from what I know: Moto,
Samsung, MediaTek, and Qualcomm) at making page tables immutable and reducing
the kernel threat surface to just dynamically allocated structs, e.g.
file_operations, in ARM, a revival of some of the ideas of:

https://wenboshen.org/publications/papers/tz-rkp-ccs14.pdf

Which are no longer possible to enforce for a number of reasons.  As related to
this patch in particular: the performance hits involved in per-PTE update
verification are huge.

My goal is ultimately to prevent modern exploits like:

https://github.com/chompie1337/s8_2019_2215_poc

which modify dynamically allocated pointers, but trying to protect against these
exploits is disingenuous without first being able to enforce PXN on non-code
pages, i.e. there is a reason we do this in mm initialization, but we need to
enforce or support the enforcement of PXNTable dynamically too.

> We don't support any of those upstream AFAIK.

As is hopefully apparent from the above, though it will help downstream systems,
I do not see this patch as a support issue so much as a legitimate security
feature. There is the matter of deciding which subsystem should be responsible.

The generic vmalloc interface should provide a strong distinction between code
and data allocations, but enforcing this would become the responsibility of each
microarchitecture regardless.

>
> How much VA space are you potentially throwing away?
>

This is rough, I admit. )-: On the order of 70,000 GB, likely more in practice:
it restricts vmalloc to the region before _text. You may be thinking, "that is
ridiculous, c'mon Maxwell", and you would be right, but I was OK with this
trade-off for Moto systems, and was thinking the approach keeps the patch
changes small and simple.

I had a hard time thinking of a better way to do this while avoiding duplication
of vmalloc code into arm64 land. Potentially, though, it would be OK to add an
additional field to the generic vmalloc interface? I may need to reach out for
help here: maybe the solution to the issue will come more readily to those with
more experience.

> How does this work with other allocations of executable memory? e.g.  modules,
> BPF?

It should work.

- arch/arm64/kernel/module.c uses __vmalloc_node_range with module_alloc_base
  and module_alloc_end, bypassing the generic vmalloc_node region, and these
  variables are decided based on a random offset between _text and _end.
- kernel/bpf/core.c uses bpf_jit_alloc_exec to create executable code regions,
  which is a wrapper for module_alloc. In the interpreted BPF case, we do not
  need to worry since the pages storing interpreted code are NX and can be
  marked PXNTable regardless.

> I'm not keen on this as-is.

That's OK, so long as we agree enforcing PXNTable dynamically would be a good
thing. I look forward to your thoughts on the above, and I will go back and
iterate.

Working with IT to fix the email formatting now, so I will hopefully be able to
post a fetchable and runnable version of my initial patch shortly.

  reply	other threads:[~2024-02-13 19:15 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-13 16:05 Maxwell Bland
2024-02-13 16:58 ` Greg KH
2024-02-13 17:16   ` [External] " Maxwell Bland
2024-02-13 17:30     ` Greg KH
2024-02-13 20:18       ` Maxwell Bland
2024-02-13 17:48 ` Mark Rutland
2024-02-13 19:15   ` Maxwell Bland [this message]
2024-02-13 19:35     ` Maxwell Bland

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=SEZPR03MB67865557061A2494B1A1243CB44F2@SEZPR03MB6786.apcprd03.prod.outlook.com \
    --to=mbland@motorola.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=awheeler@motorola.com \
    --cc=catalin.marinas@arm.com \
    --cc=cl@linux.com \
    --cc=david@redhat.com \
    --cc=dennis@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=gshan@redhat.com \
    --cc=james.morse@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mark.rutland@arm.com \
    --cc=pcc@google.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=rmk+kernel@armlinux.org.uk \
    --cc=rppt@kernel.org \
    --cc=ryan.roberts@arm.com \
    --cc=shikemeng@huaweicloud.com \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox