From: Jeff Xu
Date: Mon, 10 Feb 2025 14:46:37 -0800
Subject: Re: [PATCH v8 3/5] x86/pkeys: Update PKRU to enable all pkeys before XSAVE
To: Dmitry Vyukov
Cc: aruna.ramakrishna@oracle.com, mathieu.desnoyers@efficios.com,
    peterz@infradead.org, paulmck@kernel.org, boqun.feng@gmail.com,
    dave.hansen@linux.intel.com, jannh@google.com, jorgelo@chromium.org,
    keescook@chromium.org, keith.lucas@oracle.com,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, mingo@kernel.org,
    rick.p.edgecombe@intel.com, sroettger@google.com, tglx@linutronix.de,
    x86@kernel.org
References: <20240802061318.2140081-4-aruna.ramakrishna@oracle.com>
    <20250204100134.1843654-1-dvyukov@google.com>

Hi Dmitry

On Thu, Feb 6, 2025 at 10:06 AM Dmitry Vyukov wrote:
>
> On Tue, 4 Feb 2025 at 11:02, Dmitry Vyukov wrote:
> >
> > Re commit 70044df250d022572e26cd301bddf75eac1fe50e:
> > https://lore.kernel.org/all/20240802061318.2140081-4-aruna.ramakrishna@oracle.com/
> >
> > > If the alternate signal stack is protected by a different pkey than the
> > > current execution stack, copying
> > > xsave data to the sigaltstack will fail
> > > if its pkey is not enabled in the PKRU register.
> > >
> > > We do not know which pkey was used by the application for the altstack,
> > > so enable all pkeys before xsave.
> > >
> > > But this updated PKRU value is also pushed onto the sigframe, which
> > > means the register value restored from sigcontext will be different from
> > > the user-defined one, which is unexpected. Fix that by overwriting the
> > > PKRU value on the sigframe with the original, user-defined PKRU.
> >
> > Hi,
> >
> > This unfortunately seems to be broken for rseq user-space writes.
> > If the signal is caused by rseq struct being inaccessible due to PKEYs,
> > we try to write to rseq again at setup_rt_frame->rseq_signal_deliver,
> > which happens _before_ sig_prepare_pkru and won't succeed
> > (PKEY is still inaccessible, hard kills the process).
> > Any PKEY sandbox would want to restrict untrusted access to rseq
> > as well (otherwise allows easy sandbox escapes).
> >
> > If we do sig_prepare_pkru before rseq_signal_deliver (and generally
> > before any copy_to_userpace), then user-space handler gets SIGSEGV
> > and could unregister rseq and retry.
> >
> > However, I am not sure if it's the best solution performance-
> > and complexity-wise (for user-space). A better solution may be to
> > change __rseq_handle_notify_resume to temporarily switch to default
> > PKEY if user accesses fail.
> > Rseq is similar to signals in this respect. Since rseq updates
> > happen asynchronously with respect to user-space control flow,
> > if a program uses rseq and ever makes rseq inaccessible with PKEYs,
> > it's in trouble and will be randomly killed.
> > Since rseq updates are as asynchronous as signals, they shouldn't
> > assume PKEY is set to default value that allows access
> > to rseq descriptor.
> >
> > Thoughts?
>
> Another question about switching to pkey 0 and not switching back on all errors.
> Can it create security problems by allowing sandboxed code to escape?
>
Sandbox escape would be bad; we wouldn't want the calling thread to get
PKRU = 0 in any error path.

> Namely, here:
>
> +       /* Update PKRU to enable access to the alternate signal stack. */
> +       pkru = sig_prepare_pkru();
>         /* save i387 and extended state */
> -       if (!copy_fpstate_to_sigframe(*fpstate, (void __user
> *)buf_fx, math_size, pkru))
> +       if (!copy_fpstate_to_sigframe(*fpstate, (void __user
> *)buf_fx, math_size, pkru)) {
> +               /*
> +                * Restore PKRU to the original, user-defined value; disable
> +                * extra pkeys enabled for the alternate signal stack, if any.
> +                */
> +               write_pkru(pkru);
>                 return (void __user *)-1L;
> +       }
>
> we restore to the original pkru on this error, but there are other
> failure paths later, e.g.:
> https://elixir.bootlin.com/linux/v6.13.1/source/arch/x86/kernel/signal_64.c#L199
>
> on these error paths we will eventually get here to force_sig(SIGSEGV):
> https://elixir.bootlin.com/linux/v6.13.1/source/kernel/signal.c#L1685
> which just sends SIGSEGV and is not fatal.
>
> So hypothetically, if there is a SIGUSR1 handler without SA_ONSTACK,
> which fails, but SIGSEGV handler has SA_ONSTACK and doesn't fail, this
> will result in resetting PKRU to 0 without restoring it back.
> Or sandboxed code somehow arranges for the first signal setup for other reasons.
>
Can you walk me through the setup and steps that led to this situation?

> This is, of course, a tricky attack vector, and the program must
> resume after SIGSEGV somehow (there are some such cases, e.g. mmapping
> something lazily and retrying), but with security you never know how
> creative an attacker can get and what you are missing that they are
> not missing. So it looks safer to restore to the original PKRU on all
> errors.

Thanks
-Jeff
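
To make the error-path concern quoted above concrete, here is a toy user-space model in C. It is only a sketch under stated assumptions, not kernel code: every `model_*` name is hypothetical, the "register" is a plain variable, and the two boolean flags stand in for the xsave copy and for some later frame-setup step failing. It assumes, as the thread describes, that preparing PKRU means saving the old value and writing 0. Whether the real kernel paths behave this way is the open question in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy stand-in for the per-thread PKRU register (hypothetical, not the
 * real register). 0x55555554 sets the access-disable bit for pkeys
 * 1..15 and leaves pkey 0 accessible.
 */
static uint32_t model_pkru = 0x55555554u;

/* PKRU holds two bits per pkey; the low bit of each pair disables access. */
static bool model_pkey_access_allowed(uint32_t pkru, int pkey)
{
    assert(pkey >= 0 && pkey < 16);
    return ((pkru >> (2 * pkey)) & 1u) == 0;
}

/* Models the step described in the thread: save PKRU, then enable all pkeys. */
static uint32_t model_sig_prepare_pkru(void)
{
    uint32_t orig = model_pkru;

    model_pkru = 0; /* no access-disable or write-disable bits set */
    return orig;
}

/*
 * Hypothetical model of signal frame setup.
 *   fpstate_ok            - the xsave copy to the signal frame succeeds
 *   frame_ok              - a later frame-setup step succeeds
 *   restore_on_all_errors - the behavior suggested in the quoted text
 * Returns true on success. What PKRU the handler sees on success is out
 * of scope for this sketch.
 */
static bool model_setup_frame(bool fpstate_ok, bool frame_ok,
                              bool restore_on_all_errors)
{
    uint32_t orig = model_sig_prepare_pkru();

    if (!fpstate_ok) {
        model_pkru = orig; /* the error path the quoted diff already covers */
        return false;
    }
    if (!frame_ok) {
        if (restore_on_all_errors)
            model_pkru = orig;
        return false; /* without the restore, model_pkru stays 0 */
    }
    return true;
}
```

Starting from the default state, `model_setup_frame(true, false, false)` fails and leaves `model_pkru` at 0, so `model_pkey_access_allowed(model_pkru, 1)` becomes true. With the last argument set to `true`, the same failure leaves `model_pkru` at 0x55555554.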