From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 38320C3DA4A for ; Fri, 9 Aug 2024 17:17:02 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id CA1CB6B00A3; Fri, 9 Aug 2024 13:17:01 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C531D6B00A4; Fri, 9 Aug 2024 13:17:01 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B1B106B00A5; Fri, 9 Aug 2024 13:17:01 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 94BFC6B00A3 for ; Fri, 9 Aug 2024 13:17:01 -0400 (EDT) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 2DA93C1026 for ; Fri, 9 Aug 2024 17:17:01 +0000 (UTC) X-FDA: 82433362242.19.E486BF2 Received: from mail-oa1-f41.google.com (mail-oa1-f41.google.com [209.85.160.41]) by imf29.hostedemail.com (Postfix) with ESMTP id 45AF0120033 for ; Fri, 9 Aug 2024 17:16:59 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=chromium.org header.s=google header.b=JvI9BidH; dmarc=pass (policy=none) header.from=chromium.org; spf=pass (imf29.hostedemail.com: domain of jeffxu@chromium.org designates 209.85.160.41 as permitted sender) smtp.mailfrom=jeffxu@chromium.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1723223765; a=rsa-sha256; cv=none; b=j+e5CVMWAmp5SKCCLjWDFWDKYxQKiR7a+RVBqrExdfmU/d5mNqmiStSZq9Ur4gZbkTv2ee dCJw6KAkUBwFMjdGCg36+NLUc4iWVU3T3oSYM7c+Qxbc+XMW+p/sJU2AoouuyS1tmARC+R qGFHtIFx3lrNx4PuLW5gPo4U9tDes2o= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=chromium.org header.s=google header.b=JvI9BidH; dmarc=pass (policy=none) header.from=chromium.org; spf=pass (imf29.hostedemail.com: domain of jeffxu@chromium.org designates 209.85.160.41 as permitted sender) smtp.mailfrom=jeffxu@chromium.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1723223765; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=v/fAYz42YIT47Goo/L2J1KADvnu11AHx4qa4iqxm2BE=; b=EN0fXRaoawrSpEv8WhX1pVQ7nUbIA+xYxB4G0/+C8xUi0LevDqL+sgZKOV8XioSkJJ05kz aID/MNDF81a6vkWH57Nsf2ZbEeA1eP2qc31DI+TnWCIRFj1YmMG+KjRT3gxfvk4sS4JffW 5DkhgI9nsspaOr8t13FQFUpfBv5GYHo= Received: by mail-oa1-f41.google.com with SMTP id 586e51a60fabf-2611da054fbso1520779fac.1 for ; Fri, 09 Aug 2024 10:16:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; t=1723223818; x=1723828618; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=v/fAYz42YIT47Goo/L2J1KADvnu11AHx4qa4iqxm2BE=; b=JvI9BidHjl2x3a9aeA+qXJNzn2fu55hhOIi/0nECPmLnVsuSVwjHKZ7fG+BDp8LT3K NYRcS/F0/LL2G24SuHXFCSt3arvk3j9ZMqp3jlSVi2bLdnOmbstmnS1WPuN1TDY+l9TL SMb+IhGWB8ePPhuYbicQM4vuZ5AO2QkQ+AlcY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723223818; x=1723828618; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=v/fAYz42YIT47Goo/L2J1KADvnu11AHx4qa4iqxm2BE=; b=Dlieatz1Uj0D6oNPuaMkjm3qoT3hLKY2PzEvQbrqu9uqnxiF/GA1P6+ot4EovIyFcA NWLjRxwSNGm5Zr2HxnYMwfYsX7GF39PdPBc1LMscUPL0Sb/WHDg1dqUXSD2ULvZ7WyYU 7OgavjxnzsxK+9SxKMZoSSxbJvExQrzafBPJidcoajDfmZfUjc8VC9oCupAd3zp4D4z8 /qiEz+O58z5f66iBSKDT1nkDozcR7nR33kvR4FvK/9ROcmjYZq3plhykMSdRQrp5q0kg g7aKZ0lw9idS2hz+bIvtZ6bPJuvhubhanwcNGHxTSdDuoT2l57VielEcKnkUVbpz9n1U OQVA== X-Forwarded-Encrypted: i=1; AJvYcCU1aEOGOuJ1vdmmMlBdUQ73iWzmV5zQDCZnUCr13EID3XBsy9Gl5ohZBaxz0okx/DCyvBlBuSDUq5ct/RQ7yqLnx+E= X-Gm-Message-State: AOJu0YyOlpcJ8UvaOmNTdUwRaE1i51SrAcZEDJ0ZQXCZKtSPOONNC3uY ohTunb+XNc6NHJDIiUEkMb7IoKetFrHP9mjsij6eZJOIKkAJO4VbJb3/xncUH6qyqXNDWTucQ+k VBGgXspBcM0Tg11lNBmCcjtF+pF/dYrLwhVAD X-Google-Smtp-Source: AGHT+IF5FbaYM01wWxPU0jt/1SPd+pa+ksh5VmBehXTLUGrOb3ooRbFC680Hubd+zcObF+xh/FidrIvTSqNSkWcanCk= X-Received: by 2002:a05:6871:3a25:b0:261:1be0:b5b9 with SMTP id 586e51a60fabf-26c6291fc7fmr2528193fac.0.1723223817876; Fri, 09 Aug 2024 10:16:57 -0700 (PDT) MIME-Version: 1.0 References: <20240802061318.2140081-1-aruna.ramakrishna@oracle.com> <20240802061318.2140081-2-aruna.ramakrishna@oracle.com> In-Reply-To: <20240802061318.2140081-2-aruna.ramakrishna@oracle.com> From: Jeff Xu Date: Fri, 9 Aug 2024 10:16:47 -0700 Message-ID: Subject: Re: [PATCH v8 1/5] x86/pkeys: Add PKRU as a parameter in signal handling functions To: Aruna Ramakrishna Cc: linux-kernel@vger.kernel.org, x86@kernel.org, dave.hansen@linux.intel.com, tglx@linutronix.de, mingo@kernel.org, linux-mm@kvack.org, keith.lucas@oracle.com, rick.p.edgecombe@intel.com, jorgelo@chromium.org, keescook@chromium.org, sroettger@google.com, jannh@google.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: 45AF0120033 X-Stat-Signature: rhrrpm9xeaimnokfakn8iaej977y4iit X-Rspam-User: X-HE-Tag: 1723223819-302455 X-HE-Meta: U2FsdGVkX19w/yom6i+VXMIj4FjHFzHetnR7v+yIYeLFxLa/+/Sd8UW1y1tTbuzxpn7uwaLclXRF4dwS3zuvWsQFwown0z7TLhLsI4P6HN9xpDuv5Nubw8Tjw7BfVDMhEhfXUDxd+jAmscFxs/A1xeOQQ1Dhow7fA2p39CmBEQSwEYq9DJWwNuUJjXv3uMAhbgepaOJlsGFksm1JtTgOmJlWtkPQdCzdCBypa6nokymlFq+aW6q6PXpSjCN8GwcWPlKfgs7uhLSLBk6uZr+K/C1GWJajZSp4w8UwIIdfR+qZGScEOVgqd0R0+uTH5uSg+ayRuEZ9F6X3vFqOFUONrMAGRKlt1FOCch4FjASM0chKQ86Rb5qJVEK4Tz25K6FM0AVy3s0N1xxusyCagibqMsdJCs0ePmPhvSvruZHzFX0CrbpVlLpFDFzAPEmCOiq4HFlJL6rC/dArWFy7SDXXd8fwbHYuTd8jlXIU1gWqj4SSYe5YzhAZS2zeaYg/UY7C2jCynrkyLRXoXwv8nxbNm8JAlp5OSPugkbJw6chbh89Bwg3/a3Suf57Bm+cW8GH+xDWVBJabPmBS4mWs0Gp9/ek9q0FqwzZM6pTFnKZhE0cgSeJNdYsBXHJ4dWWUDHt3894XymRdhxkwNaYXGj58j/nlVMJlviQGGVO2sFLvFQ+enQ1oN0uEAU5fklDCSGOphu6ko++ah1NYgPY3zfyOCNmmhHxbE5vmrn2ee0XVoAZeVBs98JFiChhtJGa7dxbGe0zPUgmboc6cXd8wFPlZ496mCMYI7dGXWCL7BNwYPdyfEL02Hzxv1Kjb+4lXQ8/YVoPSs1s+HhZ39rmRtE2tl8sekWEzTOyhJJZUv9kp47HmZJ3ndGUIePDfxVX32S/9NPhXETfYy4Gtw/FZ2eD3l1vT05p5zY9iSN2T0oj2mX1dQ3pPlyuUTD7mXA8Cw7Wo7T7nSrDW4Q1SZZNSyRo Xp1+Xeoe snEDLqaW9aclWtoRX27fnGjIMNw1a8lu5RP68bv+pTn9qKXYu2hFUyhyzBtvzY7PscOuza8l/9cxM74lpOUVLrvOb9mC+TVAsp6sc88WrqFiw65T32JBNQzQJveem3MQ7JdPVIwDmi3fbF+vzift+LfdkXpfe8jHz7UEEJO6mkmYzBHQcjMnlt9whgJhuYiQJW+iH4WWEtP7ab4cudLmVLY9Mho8Djwjwdyp9sqRtjhkZUfQIQO9ujJbvfkcqagust0r/Yj0woYKlIywwuFRzmAi38g== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Hi Aruna, On Thu, Aug 1, 2024 at 11:13=E2=80=AFPM Aruna Ramakrishna wrote: > > Problem description: > Let's assume there's a multithreaded application that runs untrusted > user code. Each thread has its stack/code protected by a non-zero pkey, > and the PKRU register is set up such that only that particular non-zero > pkey is enabled. Each thread also sets up an alternate signal stack to > handle signals, which is protected by pkey zero. The pkeys man page > documents that the PKRU will be reset to init_pkru when the signal > handler is invoked, which means that pkey zero access will be enabled. > But this reset happens after the kernel attempts to push fpu state > to the alternate stack, which is not (yet) accessible by the kernel, > which leads to a new SIGSEGV being sent to the application, terminating > it. > > Enabling both the non-zero pkey (for the thread) and pkey zero in > userspace will not work for this use case. We cannot have the alt stack > writeable by all - the rationale here is that the code running in that > thread (using a non-zero pkey) is untrusted and should not have access > to the alternate signal stack (that uses pkey zero), to prevent the > return address of a function from being changed. The expectation is that > kernel should be able to set up the alternate signal stack and deliver > the signal to the application even if pkey zero is explicitly disabled > by the application. The signal handler accessibility should not be > dictated by whatever PKRU value the thread sets up. > While the above use case is correct, there are also other use cases that can benefit from this patch: Setup: The thread's normal operation has RW permission to PKEY 0, and RO or None to PKEY 1. The Thread uses sigaltstack() to register a mapping which has PKEY1 assigne= d. Before this patch: When the kernel is dispatching the signal, the thread will get SIGSEGV inside get_sigframe, because the thread either doesn't have read access or write access to the mapping which is protected by PKEY1. Because of SIGSEGV, the signal is not delivered to userspace. After this patch: There won't be SIGSEGV and the signal is delivered to userspace. The scenario is useful to protect signal stack: Jann Horn had this idea originally, and Chrome is planning to use it in V8 [1]. There were discussions in the past to enable this feature [2] Is that possible to include above in the cover letter to support this patch series for upstream? (upstream does like to see there are multiple user cases) Thanks -Jeff [1] https://docs.google.com/document/d/1DjPhBq-5gRKtTeaknQDTWfvqRBCONmYqkU1= I6k-3Ai8/edit?resourcekey=3D0-GGQta3_yhKqK7xV5SxIrVQ&tab=3Dt.0 [2] https://lore.kernel.org/lkml/202208221331.71C50A6F@keescook/ > Solution: > The PKRU register is managed by XSAVE, which means the sigframe contents > must match the register contents - which is not the case here. We want > the sigframe to contain the user-defined PKRU value (so that it is > restored correctly from sigcontext) but the actual register must be > reset to init_pkru so that the alt stack is accessible and the signal > can be delivered to the application. It seems that the proper fix here > would be to remove PKRU from the XSAVE framework and manage it > separately, which is quite complicated. As a workaround, do this: > > orig_pkru =3D rdpkru(); > wrpkru(orig_pkru & init_pkru_value); > xsave_to_user_sigframe(); > put_user(pkru_sigframe_addr, orig_pkru) > > This change is split over multiple patches. > > In preparation for writing PKRU to sigframe in a later patch, pass in > PKRU as an additional parameter down the chain from get_sigframe(): > get_sigframe() > copy_fpstate_to_sigframe() > copy_fpregs_to_sigframe() > > There are no functional changes in this patch. > > Signed-off-by: Aruna Ramakrishna > --- > arch/x86/include/asm/fpu/signal.h | 2 +- > arch/x86/kernel/fpu/signal.c | 6 +++--- > arch/x86/kernel/signal.c | 3 ++- > 3 files changed, 6 insertions(+), 5 deletions(-) > > diff --git a/arch/x86/include/asm/fpu/signal.h b/arch/x86/include/asm/fpu= /signal.h > index 611fa41711af..eccc75bc9c4f 100644 > --- a/arch/x86/include/asm/fpu/signal.h > +++ b/arch/x86/include/asm/fpu/signal.h > @@ -29,7 +29,7 @@ fpu__alloc_mathframe(unsigned long sp, int ia32_frame, > > unsigned long fpu__get_fpstate_size(void); > > -extern bool copy_fpstate_to_sigframe(void __user *buf, void __user *fp, = int size); > +extern bool copy_fpstate_to_sigframe(void __user *buf, void __user *fp, = int size, u32 pkru); > extern void fpu__clear_user_states(struct fpu *fpu); > extern bool fpu__restore_sig(void __user *buf, int ia32_frame); > > diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c > index 247f2225aa9f..2b3b9e140dd4 100644 > --- a/arch/x86/kernel/fpu/signal.c > +++ b/arch/x86/kernel/fpu/signal.c > @@ -156,7 +156,7 @@ static inline bool save_xstate_epilog(void __user *bu= f, int ia32_frame, > return !err; > } > > -static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf= ) > +static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf= , u32 pkru) > { > if (use_xsave()) > return xsave_to_user_sigframe(buf); > @@ -185,7 +185,7 @@ static inline int copy_fpregs_to_sigframe(struct xreg= s_state __user *buf) > * For [f]xsave state, update the SW reserved fields in the [f]xsave fra= me > * indicating the absence/presence of the extended state to the user. > */ > -bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int= size) > +bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int= size, u32 pkru) > { > struct task_struct *tsk =3D current; > struct fpstate *fpstate =3D tsk->thread.fpu.fpstate; > @@ -228,7 +228,7 @@ bool copy_fpstate_to_sigframe(void __user *buf, void = __user *buf_fx, int size) > fpregs_restore_userregs(); > > pagefault_disable(); > - ret =3D copy_fpregs_to_sigframe(buf_fx); > + ret =3D copy_fpregs_to_sigframe(buf_fx, pkru); > pagefault_enable(); > fpregs_unlock(); > > diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c > index 31b6f5dddfc2..1f1e8e0ac5a3 100644 > --- a/arch/x86/kernel/signal.c > +++ b/arch/x86/kernel/signal.c > @@ -84,6 +84,7 @@ get_sigframe(struct ksignal *ksig, struct pt_regs *regs= , size_t frame_size, > unsigned long math_size =3D 0; > unsigned long sp =3D regs->sp; > unsigned long buf_fx =3D 0; > + u32 pkru =3D read_pkru(); > > /* redzone */ > if (!ia32_frame) > @@ -139,7 +140,7 @@ get_sigframe(struct ksignal *ksig, struct pt_regs *re= gs, size_t frame_size, > } > > /* save i387 and extended state */ > - if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, ma= th_size)) > + if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, ma= th_size, pkru)) > return (void __user *)-1L; > > return (void __user *)sp; > -- > 2.39.3 >