Date: Mon, 23 Jan 2023 12:19:30 +0000
From: Catalin Marinas
To: David Hildenbrand
Cc: Joey Gouly, Andrew Morton, Lennart Poettering,
    Zbigniew Jędrzejewski-Szmek, Alexander Viro, Kees Cook,
    Szabolcs Nagy, Mark Brown, Jeremy Linton, Topi Miettinen,
    linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
    linux-kernel@vger.kernel.org, linux-abi-devel@lists.sourceforge.net,
    nd@arm.com, shuah@kernel.org
Subject: Re: [PATCH v2 1/2] mm: Implement memory-deny-write-execute as a prctl
References: <20230119160344.54358-1-joey.gouly@arm.com>
    <20230119160344.54358-2-joey.gouly@arm.com>
    <4a1faf67-178e-c9ba-0db1-cf90408b0d7d@redhat.com>
In-Reply-To: <4a1faf67-178e-c9ba-0db1-cf90408b0d7d@redhat.com>

On Mon, Jan 23, 2023 at 12:45:50PM +0100, David Hildenbrand wrote:
> On 19.01.23 17:03, Joey Gouly wrote:
> > The aim of such policy is to prevent a user task from creating an
> > executable mapping that is also writeable.
> >
> > An example of mmap() returning -EACCES if the policy is enabled:
> >
> >     mmap(0, size, PROT_READ | PROT_WRITE | PROT_EXEC, flags, 0, 0);
> >
> > Similarly, mprotect() would return -EACCES below:
> >
> >     addr = mmap(0, size, PROT_READ | PROT_EXEC, flags, 0, 0);
> >     mprotect(addr, size, PROT_READ | PROT_WRITE | PROT_EXEC);
> >
> > The BPF filter that systemd MDWE uses is stateless, and disallows
> > mprotect() with PROT_EXEC completely. This new prctl allows PROT_EXEC to
> > be enabled if it was already PROT_EXEC, which allows the following case:
> >
> >     addr = mmap(0, size, PROT_READ | PROT_EXEC, flags, 0, 0);
> >     mprotect(addr, size, PROT_READ | PROT_EXEC | PROT_BTI);
> >
> > where PROT_BTI enables Branch Target Identification on arm64.
> >
> > Signed-off-by: Joey Gouly
> > Co-developed-by: Catalin Marinas
> > Signed-off-by: Catalin Marinas
> > Cc: Andrew Morton
> > ---
> >  include/linux/mman.h           | 34 ++++++++++++++++++++++++++++++++++
> >  include/linux/sched/coredump.h |  6 +++++-
> >  include/uapi/linux/prctl.h     |  6 ++++++
> >  kernel/sys.c                   | 33 +++++++++++++++++++++++++++++++++
> >  mm/mmap.c                      | 10 ++++++++++
> >  mm/mprotect.c                  |  5 +++++
> >  6 files changed, 93 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/mman.h b/include/linux/mman.h
> > index 58b3abd457a3..cee1e4b566d8 100644
> > --- a/include/linux/mman.h
> > +++ b/include/linux/mman.h
> > @@ -156,4 +156,38 @@ calc_vm_flag_bits(unsigned long flags)
> >  }
> >  unsigned long vm_commit_limit(void);
> > +
> > +/*
> > + * Denies creating a writable executable mapping or gaining executable permissions.
> > + *
> > + * This denies the following:
> > + *
> > + *     a) mmap(PROT_WRITE | PROT_EXEC)
> > + *
> > + *     b) mmap(PROT_WRITE)
> > + *        mprotect(PROT_EXEC)
> > + *
> > + *     c) mmap(PROT_WRITE)
> > + *        mprotect(PROT_READ)
> > + *        mprotect(PROT_EXEC)
> > + *
> > + * But allows the following:
> > + *
> > + *     d) mmap(PROT_READ | PROT_EXEC)
> > + *        mmap(PROT_READ | PROT_EXEC | PROT_BTI)
> > + */
>
> Shouldn't we clear VM_MAYEXEC at mmap() time such that we cannot set VM_EXEC
> anymore? In an ideal world, there would be no further mprotect changes
> required.

I don't think it works for this scenario. We don't want to disable
PROT_EXEC entirely, only disallow it if the mapping is not already
executable. The below should be allowed:

    addr = mmap(0, size, PROT_READ | PROT_EXEC, flags, 0, 0);
    mprotect(addr, size, PROT_READ | PROT_EXEC | PROT_BTI);

but IIUC what you meant, it would fail if we cleared VM_MAYEXEC at
mmap() time.

We could clear VM_MAYEXEC if the mapping was made VM_WRITE (either by
mmap() or mprotect()) but IIRC we concluded that this should be an
additional prctl() flag. This series aims to be pretty much a drop-in
replacement for systemd's MDWE seccomp feature (but allowing
PROT_BTI).

-- 
Catalin
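
For illustration only, the rule discussed above ("never writable and
executable at once, and never gain PROT_EXEC on a mapping that was not
already executable") can be modelled in plain userspace C. This is a
rough sketch and not the helper from the patch: the function names and
the MODEL_PROT_BTI constant are made up here, and the model looks only
at PROT_* bits rather than at real VMA flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/mman.h>	/* PROT_READ, PROT_WRITE, PROT_EXEC */

/*
 * Assumption: stand-in for the arm64-only PROT_BTI bit so that the
 * model builds on any architecture. The value is illustrative.
 */
#define MODEL_PROT_BTI	0x10

/* Hypothetical model: would a fresh mmap() with 'prot' be refused? */
bool mdwe_model_denies_mmap(int prot)
{
	/* A new mapping must not be writable and executable at once. */
	return (prot & PROT_WRITE) && (prot & PROT_EXEC);
}

/*
 * Hypothetical model: would mprotect() from 'old_prot' to 'new_prot'
 * be refused?
 */
bool mdwe_model_denies_mprotect(int old_prot, int new_prot)
{
	/* Same W+X rule as for mmap(). */
	if ((new_prot & PROT_WRITE) && (new_prot & PROT_EXEC))
		return true;

	/* A mapping that is not executable must not gain PROT_EXEC. */
	if (!(old_prot & PROT_EXEC) && (new_prot & PROT_EXEC))
		return true;

	return false;
}

/* Walks through cases a) to d) from the comment in the quoted patch. */
void mdwe_model_examples(void)
{
	/* a) mmap(PROT_WRITE | PROT_EXEC) is denied */
	assert(mdwe_model_denies_mmap(PROT_WRITE | PROT_EXEC));

	/* b) mmap(PROT_WRITE) is fine, the later mprotect(PROT_EXEC) is not */
	assert(!mdwe_model_denies_mmap(PROT_WRITE));
	assert(mdwe_model_denies_mprotect(PROT_WRITE, PROT_EXEC));

	/* c) dropping PROT_WRITE first does not help */
	assert(!mdwe_model_denies_mprotect(PROT_WRITE, PROT_READ));
	assert(mdwe_model_denies_mprotect(PROT_READ, PROT_EXEC));

	/* d) already executable, so adding the BTI bit is allowed */
	assert(!mdwe_model_denies_mmap(PROT_READ | PROT_EXEC));
	assert(!mdwe_model_denies_mprotect(PROT_READ | PROT_EXEC,
			PROT_READ | PROT_EXEC | MODEL_PROT_BTI));
}
```

Clearing VM_MAYEXEC at mmap() time would correspond to refusing every
mprotect() that asks for PROT_EXEC, which in this model would also
reject case d). The second check depends on old_prot precisely so that
case d) stays allowed.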