References: <20240122152905.2220849-1-jeffxu@chromium.org>
 <726.1705938579@cvs.openbsd.org>
 <86181.1705962897@cvs.openbsd.org>
 <20240123173320.2xl3wygzbxnrei2c@revolver>
 <85359.1706036321@cvs.openbsd.org>
In-Reply-To: <85359.1706036321@cvs.openbsd.org>
From: Jeff Xu
Date: Wed, 24 Jan 2024 10:56:08 -0800
Subject: Re: [PATCH v7 0/4] Introduce mseal()
To: "Liam R. Howlett", Jeff Xu, akpm@linux-foundation.org,
 keescook@chromium.org, jannh@google.com, sroettger@google.com,
 willy@infradead.org, gregkh@linuxfoundation.org,
 torvalds@linux-foundation.org, usama.anjum@collabora.com,
 rdunlap@infradead.org, jeffxu@google.com, jorgelo@chromium.org,
 groeck@chromium.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-mm@kvack.org,
 pedro.falcato@gmail.com, dave.hansen@intel.com,
 linux-hardening@vger.kernel.org

On Tue, Jan 23, 2024 at 10:58 AM Theo de Raadt wrote:
>
> It's the same with MAP_MSEALABLE. I don't get it. So now there are 3
> memory types:
> - cannot be sealed, ever
> - not yet sealed
> - sealed
>
> What purpose does the first type serve? Please explain the use case.
>
> Today, processes have control over their entire address space.
>
> What is the purpose of "permissions cannot be locked". Please supply
> an example. If I am wrong, I'd like to know where I went wrong.
>
The Linux example is in the open discussion section of the V3 and V4
cover letters [1] [2].

[1] https://lore.kernel.org/linux-mm/20231212231706.2680890-1-jeffxu@chromium.org/T/
[2] https://lore.kernel.org/linux-mm/20240104185138.169307-3-jeffxu@chromium.org/T/

Copied below for ease of reading.

------------------------------------------------------------------------

During the development of V3, I had new questions and thoughts that I
wished to discuss.

1> shm/aio
From reading the code, it seems to me that aio/shm can mmap/munmap
mappings on behalf of userspace, e.g. ksys_shmdt() in shm.c. The
lifetime of those mappings is not tied to the lifetime of the process.
If that memory is sealed from userspace, then the unmap will fail.
This isn’t a huge problem, since the memory will eventually be freed at
exit or exec. However, it feels like the solution is not complete,
because of the leaks in VMA address space during the lifetime of the
process.

2> Brk (heap/stack)
Currently, userspace applications can seal parts of the heap by calling
malloc() and mseal(). This raises the question of what the expected
behavior is when sealing the heap is attempted.

Let's assume the following calls from user space:

ptr = malloc(size);
mprotect(ptr, size, RO);
mseal(ptr, size, SEAL_PROT_PKEY);
free(ptr);

Technically, before mseal() is added, the user can change the
protection of the heap by calling mprotect(RO). As long as the user
changes the protection back to RW before free(), the memory can be
reused.

With mseal() in the picture, however, the heap is then partially
sealed. The user can still free it, but the memory remains RO, and the
result of brk-shrink is nondeterministic, depending on whether munmap()
tries to free the sealed memory (brk uses munmap to shrink the heap).

3> The above two cases lead to the third topic:
There is one option to address the problems mentioned above.

Option 1: A “MAP_SEALABLE” flag in mmap().
If a mapping is created without this flag, the mseal() operation will
fail. Applications that are not concerned with sealing will expect
their behavior to be unchanged. For those that are concerned, adding a
flag at mmap time to opt in is not difficult. For the short term, this
solves problems 1 and 2 above. The memory in shm/aio/brk will not have
the MAP_SEALABLE flag at mmap(), and the same is true for the heap.

If we choose not to go with this path, all mappings will be sealable by
default. We could document the above-mentioned limitations so devs are
more careful when choosing what memory to seal.
I think denial of service through mseal() by an attacker is probably
not a concern: if attackers have access to mseal() and unsealed memory,
then they can also do other harmful things to the memory, such as
munmap, etc.

4>
I think it might be possible to seal the stack or other special
mappings created at runtime (vdso, vsyscall, vvar). This means we can
enforce and seal W^X for certain types of applications. For instance,
the stack is typically used in read-write mode, but in some cases, it
can become executable. To defend against unintended addition of the
executable bit to the stack, we could let the application seal it.

Sealing the heap (for adding X) requires special handling, since the
heap can shrink, and shrink is implemented through munmap().

Indeed, it might be possible that all virtual memory accessible to user
space, regardless of its usage pattern, could be sealed. However, this
would require additional research and development work.

------------------------------------------------------------------------
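
To make the three memory types concrete, here is a minimal userspace
sketch of the Option 1 semantics described above. It is a toy model
only, not kernel code: every toy_* name is hypothetical, the "sealable"
argument stands in for the proposed MAP_SEALABLE flag, and the return
values are simplified to 0 and -1 rather than real errno codes.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are kernel or libc APIs. */
enum toy_seal_state {
    TOY_NOT_SEALABLE, /* created without the opt-in flag: cannot be sealed */
    TOY_SEALABLE,     /* opted in at map time, not yet sealed */
    TOY_SEALED        /* sealed: unmap and protection changes are refused */
};

struct toy_mapping {
    enum toy_seal_state state;
    bool mapped;
    bool writable;
};

/* Stand-in for mmap(); 'sealable' models the proposed MAP_SEALABLE flag. */
struct toy_mapping toy_map(bool sealable)
{
    struct toy_mapping m = {
        .state = sealable ? TOY_SEALABLE : TOY_NOT_SEALABLE,
        .mapped = true,
        .writable = true,
    };
    return m;
}

/* Stand-in for mseal(): fails unless the mapping opted in at map time. */
int toy_seal(struct toy_mapping *m)
{
    if (!m->mapped || m->state == TOY_NOT_SEALABLE)
        return -1;
    m->state = TOY_SEALED;
    return 0;
}

/* Stand-in for mprotect(): refused once the mapping is sealed. */
int toy_protect(struct toy_mapping *m, bool writable)
{
    if (!m->mapped || m->state == TOY_SEALED)
        return -1;
    m->writable = writable;
    return 0;
}

/* Stand-in for munmap(): refused once the mapping is sealed. */
int toy_unmap(struct toy_mapping *m)
{
    if (!m->mapped || m->state == TOY_SEALED)
        return -1;
    m->mapped = false;
    return 0;
}

/* Replays the call sequence from topic 2 for both kinds of mapping. */
void toy_heap_scenario(void)
{
    /* Heap-like memory mapped without the flag: sealing is rejected,
     * so a later shrink (unmap) still succeeds. */
    struct toy_mapping heap = toy_map(false);
    assert(toy_protect(&heap, false) == 0);
    assert(toy_seal(&heap) == -1);
    assert(toy_unmap(&heap) == 0);

    /* Opted-in memory: after sealing it stays mapped and read-only. */
    struct toy_mapping opted_in = toy_map(true);
    assert(toy_protect(&opted_in, false) == 0);
    assert(toy_seal(&opted_in) == 0);
    assert(toy_protect(&opted_in, true) == -1);
    assert(toy_unmap(&opted_in) == -1);
}
```

In this model the "cannot be sealed, ever" type is what keeps the
shm/aio and brk paths working: their later unmap cannot be blocked by a
seal, because the seal was never allowed in the first place.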