Date: Mon, 5 Aug 2024 21:37:48 -0700
From: Kees Cook
To: Andrew Zaborowski
Cc: linux-edac@vger.kernel.org, linux-mm@kvack.org, Tony Luck, Eric Biederman, Borislav Petkov, Mathieu Desnoyers, Peter Zijlstra, "Paul E. McKenney", Boqun Feng
Subject: Re: [RESEND][PATCH 3/3] rseq: Ensure SIGBUS delivered on memory failure
Message-ID: <202408052136.119CD53B@keescook>
References: <20240723144752.1478226-1-andrew.zaborowski@intel.com> <20240723144752.1478226-3-andrew.zaborowski@intel.com>
In-Reply-To: <20240723144752.1478226-3-andrew.zaborowski@intel.com>

On Tue, Jul 23, 2024 at 04:47:52PM +0200, Andrew Zaborowski wrote:
> Uncorrected memory errors for user pages are signaled to processes
> using SIGBUS or, if the error happens in a syscall, an error retval
> from the syscall.  The SIGBUS is documented in
> Documentation/mm/hwpoison.rst#failure-recovery-modes
>
> Once a user task sets t->rseq in the rseq() syscall, if the kernel
> cannot access the memory pointed to by t->rseq->rseq_cs, that initial
> rseq() and all future syscalls should return an error so understandably
> the code just kills the task.
>
> To ensure that SIGBUS is used set the new t->kill_on_efault flag and
> run queued task work on rseq_get_rseq_cs() errors to give memory_failure
> the chance to run.
>
> Note: the rseq checks run inside resume_user_mode_work() so whenever
> _TIF_NOTIFY_RESUME is set.  They do not run on every syscall exit so
> I'm not concerned that these extra flag operations are in a hot path,
> except with CONFIG_DEBUG_RSEQ.
>
> Signed-off-by: Andrew Zaborowski
> ---
>  kernel/rseq.c | 25 +++++++++++++++++++++----

Can an rseq maintainer please review this? I can carry it via the execve
tree with the related patches...

-Kees

>  1 file changed, 21 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/rseq.c b/kernel/rseq.c
> index 9de6e35fe..c5809cd13 100644
> --- a/kernel/rseq.c
> +++ b/kernel/rseq.c
> @@ -13,6 +13,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>
>  #define CREATE_TRACE_POINTS
> @@ -320,6 +321,8 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
>  	if (unlikely(t->flags & PF_EXITING))
>  		return;
>
> +	t->kill_on_efault = true;
> +
>  	/*
>  	 * regs is NULL if and only if the caller is in a syscall path.  Skip
>  	 * fixup and leave rseq_cs as is so that rseq_sycall() will detect and
> @@ -330,13 +333,18 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
>  		if (unlikely(ret < 0))
>  			goto error;
>  	}
> -	if (unlikely(rseq_update_cpu_node_id(t)))
> -		goto error;
> -	return;
> +	if (likely(!rseq_update_cpu_node_id(t)))
> +		goto out;
>
>  error:
> +	/* Allow task work to override signr */
> +	task_work_run();
> +
>  	sig = ksig ? ksig->sig : 0;
>  	force_sigsegv(sig);
> +
> +out:
> +	t->kill_on_efault = false;
>  }
>
>  #ifdef CONFIG_DEBUG_RSEQ
> @@ -353,8 +361,17 @@ void rseq_syscall(struct pt_regs *regs)
>
>  	if (!t->rseq)
>  		return;
> -	if (rseq_get_rseq_cs(t, &rseq_cs) || in_rseq_cs(ip, &rseq_cs))
> +
> +	t->kill_on_efault = true;
> +
> +	if (rseq_get_rseq_cs(t, &rseq_cs) || in_rseq_cs(ip, &rseq_cs)) {
> +		/* Allow task work to override signr */
> +		task_work_run();
> +
>  		force_sig(SIGSEGV);
> +	}
> +
> +	t->kill_on_efault = false;
>  }
>
>  #endif
> --
> 2.43.0
>

-- 
Kees Cook
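
For context on what the SIGBUS-versus-SIGSEGV distinction in the quoted patch means to user space, here is a minimal sketch, assuming Linux, of how a process could tell a memory-failure SIGBUS apart from other SIGBUS causes by checking si_code for BUS_MCEERR_AR or BUS_MCEERR_AO. The names classify_sigbus(), enum bus_kind, on_sigbus() and install_sigbus_handler() are hypothetical and exist only for this illustration; none of this is part of the patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Fallbacks using the Linux UAPI values, in case libc does not expose them. */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif
#ifndef BUS_MCEERR_AO
#define BUS_MCEERR_AO 5
#endif

/* Hypothetical classification, used only in this sketch. */
enum bus_kind {
	BUS_KIND_OTHER,               /* not SIGBUS, or SIGBUS for another reason */
	BUS_KIND_MCE_ACTION_REQUIRED, /* BUS_MCEERR_AR: poisoned memory was consumed */
	BUS_KIND_MCE_ACTION_OPTIONAL, /* BUS_MCEERR_AO: early notification only */
};

/* Map a siginfo_t onto the categories above. */
static enum bus_kind classify_sigbus(const siginfo_t *info)
{
	assert(info != NULL);

	if (info->si_signo != SIGBUS)
		return BUS_KIND_OTHER;

	switch (info->si_code) {
	case BUS_MCEERR_AR:
		return BUS_KIND_MCE_ACTION_REQUIRED;
	case BUS_MCEERR_AO:
		return BUS_KIND_MCE_ACTION_OPTIONAL;
	default:
		return BUS_KIND_OTHER;
	}
}

/* Last classification seen by the handler. */
static volatile sig_atomic_t last_bus_kind = BUS_KIND_OTHER;

/* The handler only records the kind; real recovery logic is out of scope. */
static void on_sigbus(int signo, siginfo_t *info, void *ucontext)
{
	(void)signo;
	(void)ucontext;
	last_bus_kind = classify_sigbus(info);
}

/* Install the handler with SA_SIGINFO so si_code is available. */
static int install_sigbus_handler(void)
{
	struct sigaction sa = {0};

	sa.sa_sigaction = on_sigbus;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGBUS, &sa, NULL);
}
```

A task killed with SIGSEGV never reaches a handler like this with an MCE si_code, so it cannot tell that a hardware memory error was the cause. Note also that simply returning from the handler after an action-required error would normally re-execute the faulting access, so a real program would have to discard or remap the affected memory, or exit.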