From: Dmitry Vyukov
Date: Thu, 6 Feb 2025 19:06:15 +0100
Subject: Re: [PATCH v8 3/5] x86/pkeys: Update PKRU to enable all pkeys before XSAVE
To: aruna.ramakrishna@oracle.com, mathieu.desnoyers@efficios.com,
    peterz@infradead.org, paulmck@kernel.org, boqun.feng@gmail.com
Cc: dave.hansen@linux.intel.com, jannh@google.com, jeffxu@chromium.org,
    jorgelo@chromium.org, keescook@chromium.org, keith.lucas@oracle.com,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, mingo@kernel.org,
    rick.p.edgecombe@intel.com, sroettger@google.com, tglx@linutronix.de,
    x86@kernel.org
References: <20240802061318.2140081-4-aruna.ramakrishna@oracle.com>
    <20250204100134.1843654-1-dvyukov@google.com>
In-Reply-To: <20250204100134.1843654-1-dvyukov@google.com>

On Tue, 4 Feb 2025 at 11:02, Dmitry Vyukov wrote:
>
> Re commit 70044df250d022572e26cd301bddf75eac1fe50e:
> https://lore.kernel.org/all/20240802061318.2140081-4-aruna.ramakrishna@oracle.com/
>
> > If the alternate signal stack is protected by a different pkey than the
> > current execution stack, copying xsave data to the sigaltstack will fail
> > if its pkey is not enabled in the PKRU register.
> >
> > We do not know which pkey was used by the application for the altstack,
> > so enable all pkeys before xsave.
> >
> > But this updated PKRU value is also pushed onto the sigframe, which
> > means the register value restored from sigcontext will be different from
> > the user-defined one, which is unexpected. Fix that by overwriting the
> > PKRU value on the sigframe with the original, user-defined PKRU.
>
> Hi,
>
> This unfortunately seems to be broken for rseq user-space writes.
> If the signal is caused by the rseq struct being inaccessible due to PKEYs,
> we try to write to rseq again at setup_rt_frame->rseq_signal_deliver,
> which happens _before_ sig_prepare_pkru and won't succeed
> (the PKEY is still inaccessible, which hard-kills the process).
> Any PKEY sandbox would want to restrict untrusted access to rseq
> as well (otherwise it allows easy sandbox escapes).
>
> If we do sig_prepare_pkru before rseq_signal_deliver (and generally
> before any copy to user space), then the user-space handler gets SIGSEGV
> and could unregister rseq and retry.
>
> However, I am not sure if it's the best solution performance-wise
> and complexity-wise (for user-space). A better solution may be to
> change __rseq_handle_notify_resume to temporarily switch to the default
> PKEY if user accesses fail.
> Rseq is similar to signals in this respect. Since rseq updates
> happen asynchronously with respect to user-space control flow,
> if a program uses rseq and ever makes rseq inaccessible with PKEYs,
> it's in trouble and will be randomly killed.
> Since rseq updates are asynchronous like signals, they shouldn't
> assume the PKEY is set to the default value that allows access
> to the rseq descriptor.
>
> Thoughts?

Another question, about switching to pkey 0 and not switching back on
all errors. Can it create security problems by allowing sandboxed code
to escape?

Namely, here:

+       /* Update PKRU to enable access to the alternate signal stack. */
+       pkru = sig_prepare_pkru();
        /* save i387 and extended state */
-       if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size, pkru))
+       if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size, pkru)) {
+               /*
+                * Restore PKRU to the original, user-defined value; disable
+                * extra pkeys enabled for the alternate signal stack, if any.
+                */
+               write_pkru(pkru);
                return (void __user *)-1L;
+       }

we restore the original PKRU on this error, but there are other
failure paths later, e.g.:
https://elixir.bootlin.com/linux/v6.13.1/source/arch/x86/kernel/signal_64.c#L199

On these error paths we will eventually get here, to force_sig(SIGSEGV):
https://elixir.bootlin.com/linux/v6.13.1/source/kernel/signal.c#L1685
which just sends SIGSEGV and is not fatal.

So hypothetically, if there is a SIGUSR1 handler without SA_ONSTACK
whose setup fails, but the SIGSEGV handler has SA_ONSTACK and doesn't
fail, this will result in resetting PKRU to 0 without restoring it back.
Or the sandboxed code somehow arranges for the first signal setup to fail
for other reasons.

This is, of course, a tricky attack vector, and the program must resume
after SIGSEGV somehow (there are some such cases, e.g. mmapping something
lazily and retrying), but with security you never know how creative
an attacker can get and what you are missing that they are not missing.

So it looks safer to restore the original PKRU on all errors.
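
The difference between restoring PKRU on one error path and on all of
them can be sketched as a small user-space model. This is not kernel
code: the model_* names, the enum, and the two failure points are
hypothetical, and the only behavior taken from the thread is that
sig_prepare_pkru() leaves PKRU at 0 and returns the original value.

```c
#include <stdint.h>

/* Stand-in for the PKRU register; the real one is per-thread CPU state. */
static uint32_t model_pkru;

static uint32_t model_read_pkru(void) { return model_pkru; }
static void model_write_pkru(uint32_t value) { model_pkru = value; }

/* Models sig_prepare_pkru(): enable all pkeys (PKRU = 0), return the old value. */
static uint32_t model_sig_prepare_pkru(void)
{
    uint32_t orig = model_read_pkru();

    model_write_pkru(0);
    return orig;
}

/* Hypothetical failure points during signal frame setup. */
enum model_fail { MODEL_FAIL_NONE, MODEL_FAIL_FPSTATE, MODEL_FAIL_LATER };

/* Restores PKRU only when saving FP state fails, as in the quoted hunk. */
static int model_setup_partial(enum model_fail fail)
{
    uint32_t pkru = model_sig_prepare_pkru();

    if (fail == MODEL_FAIL_FPSTATE) {
        model_write_pkru(pkru);
        return -1;
    }
    if (fail == MODEL_FAIL_LATER)
        return -1;          /* PKRU stays 0: all pkeys remain enabled */
    return 0;
}

/* Restores PKRU on every error path through one shared exit label. */
static int model_setup_all(enum model_fail fail)
{
    uint32_t pkru = model_sig_prepare_pkru();

    if (fail == MODEL_FAIL_FPSTATE)
        goto err_restore;
    if (fail == MODEL_FAIL_LATER)
        goto err_restore;
    return 0;               /* success: handler setup proceeds with PKRU = 0 */

err_restore:
    model_write_pkru(pkru);
    return -1;
}
```

Starting from a model PKRU of 0x55555554, a MODEL_FAIL_LATER failure
leaves the register at 0 with model_setup_partial() and back at
0x55555554 with model_setup_all(). The single exit label keeps the
restore in one place, so a failure path added later cannot skip it.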