Date: Wed, 18 Oct 2023 16:17:44 +0100
From: Matthew Wilcox
To: Jeff Xu
Cc: Theo de Raadt, Linus Torvalds, jeffxu@chromium.org,
    akpm@linux-foundation.org, keescook@chromium.org, sroettger@google.com,
    jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org,
    linux-kselftest@vger.kernel.org, linux-mm@kvack.org, jannh@google.com,
    surenb@google.com, alex.sierra@amd.com, apopple@nvidia.com,
    aneesh.kumar@linux.ibm.com, axelrasmussen@google.com,
    ben@decadent.org.uk, catalin.marinas@arm.com, david@redhat.com,
    dwmw@amazon.co.uk, ying.huang@intel.com, hughd@google.com,
    joey.gouly@arm.com, corbet@lwn.net, wangkefeng.wang@huawei.com,
    Liam.Howlett@oracle.com, lstoakes@gmail.com, mawupeng1@huawei.com,
    linmiaohe@huawei.com, namit@vmware.com, peterx@redhat.com,
    peterz@infradead.org, ryan.roberts@arm.com, shr@devkernel.io,
    vbabka@suse.cz, xiujianfeng@huawei.com, yu.ma@intel.com,
    zhangpeng362@huawei.com, dave.hansen@intel.com, luto@kernel.org,
    linux-hardening@vger.kernel.org
Subject: Re: [RFC PATCH v1 0/8] Introduce mseal() syscall
References: <20231016143828.647848-1-jeffxu@chromium.org>
    <55960.1697566804@cvs.openbsd.org> <95482.1697587015@cvs.openbsd.org>

On Tue, Oct 17, 2023 at 08:18:47PM -0700, Jeff Xu wrote:
> In practice: libc could do below:
> #define MM_IMMUTABLE
> (MM_SEAL_MPROTECT|MM_SEAL_MUNMAP|MM_SEAL_MREMAP|MM_SEAL_MMAP)
> mseal(add,len, MM_IMMUTABLE)
> it will be equivalent to BSD's immutable().

No, it wouldn't, because you've carefully listed the syscalls you're
blocking instead of understanding the _concept_ of what you need to
block.

> In linux cases, I think, eventually, mseal() will have a bigger scope than
> BSD's mimmutable(). VMA's metadata(vm_area_struct) contains a lot
> of control info, depending on application's needs, mseal() can be
> expanded to seal individual control info.
>
> For example, in madvice(2) case:
> As Jann point out in [1] and I quote:
> "you'd probably also want to block destructive madvise() operations
> that can effectively alter region contents by discarding pages and
> such, ..."
>
> Another example: if an application wants to keep a memory always
> present in RAM, for whatever the reason, it can call seal the mlock().
>
> To handle those two new cases. mseal() could add two more bits:
> MM_SEAL_MADVICE, MM_SEAL_MLOCK.

Yes, thank you for demonstrating that you have no idea what you need
to block.

> It is practical to keep syscall extentable, when the business logic is the same.

I concur with Theo & Linus. You don't know what you're doing. I think
the underlying idea of mimmutable() is good, but how you've split it up
and how you've implemented it is terrible.

Let's start with the purpose. The point of mimmutable/mseal/whatever is
to fix the mapping of an address range to its underlying object, be it
a particular file mapping or anonymous memory.
After the call succeeds, it must not be possible to make any address in
that virtual range point into any other object.

The secondary purpose is to lock down permissions on that range.
Possibly to fix them where they are, possibly to allow RW->RO
transitions.

With those purposes in mind, you should be able to deduce for any
syscall or any madvise(), ... whether it should be allowed.

Look, I appreciate this is only your second set of patches to Linux
and you've taken on a big job. But that's all the more reason you
should listen to people who are trying to help you.
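
To make the deduction described above concrete, here is a minimal sketch
in C. It is not taken from the patch set or from any kernel code: every
name in it (op_effect, seal_policy, sealed_range_allows) is hypothetical,
and treating a page discard as a content change is an assumption based on
the madvise() remark quoted earlier. The point is only that the check asks
what an operation would do to the range, not which syscall it came from.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of what an operation would do to a sealed
 * range. Illustrative only; none of these names exist in the kernel. */
enum op_effect {
    OP_NO_CHANGE        = 0,      /* leaves mapping, contents, permissions alone */
    OP_CHANGES_MAPPING  = 1 << 0, /* an address would point at another object */
    OP_DISCARDS_CONTENT = 1 << 1, /* pages dropped (assumed to count as a change) */
    OP_ADDS_PERMISSION  = 1 << 2, /* e.g. RO->RW */
    OP_DROPS_PERMISSION = 1 << 3  /* e.g. RW->RO */
};

#define OP_ALL_EFFECTS (OP_CHANGES_MAPPING | OP_DISCARDS_CONTENT | \
                        OP_ADDS_PERMISSION | OP_DROPS_PERMISSION)

/* Hypothetical policy knob for the secondary purpose: either permissions
 * are fixed where they are, or RW->RO style transitions stay allowed. */
struct seal_policy {
    bool allow_permission_drop;
};

/* Returns true if an operation with the given effects may proceed on a
 * sealed range under the given policy. */
static inline bool sealed_range_allows(unsigned int effects,
                                       struct seal_policy policy)
{
    /* Reject effect bits this sketch does not know about. */
    assert((effects & ~(unsigned int)OP_ALL_EFFECTS) == 0);

    /* Primary purpose: the range must keep pointing at the same object. */
    if (effects & (OP_CHANGES_MAPPING | OP_DISCARDS_CONTENT))
        return false;

    /* Secondary purpose: permissions may never be widened. */
    if (effects & OP_ADDS_PERMISSION)
        return false;

    /* Narrowing permissions is allowed only if the policy says so. */
    if ((effects & OP_DROPS_PERMISSION) && !policy.allow_permission_drop)
        return false;

    return true;
}
```

Under these assumptions, a new syscall or a new madvise() flag would not
need a new seal bit. You would classify its effects once and the same
predicate gives the answer.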