References: <20240709130513.98102-1-Jason@zx2c4.com> <20240709130513.98102-2-Jason@zx2c4.com> <378f23cb-362e-413a-b221-09a5352e79f2@redhat.com> <9b400450-46bc-41c7-9e89-825993851101@redhat.com>
From: Linus Torvalds
Date: Thu, 11 Jul 2024 10:57:17 -0700
Subject: Re: [PATCH v22 1/4] mm: add MAP_DROPPABLE for designating always lazily freeable mappings
To: "Jason A. Donenfeld"
Cc: David Hildenbrand, linux-kernel@vger.kernel.org, patches@lists.linux.dev, tglx@linutronix.de, linux-crypto@vger.kernel.org, linux-api@vger.kernel.org, x86@kernel.org, Greg Kroah-Hartman, Adhemerval Zanella Netto, "Carlos O'Donell", Florian Weimer, Arnd Bergmann, Jann Horn, Christian Brauner, David Hildenbrand, linux-mm@kvack.org

On Thu, 11 Jul 2024 at 10:09, Jason A. Donenfeld wrote:
>
> When I was working on this patchset this year with the syscall, this is
> similar somewhat to the initial approach I was taking with setting up a
> special mapping. It turned into kind of a mess and I couldn't get it
> working. There's a lot of functionality built around anonymous pages
> that would need to be duplicated (I think?).

Yeah, I was kind of assuming that. You'd need to handle VM_DROPPABLE
in the fault path specially, the way we currently split up based on
vma_is_anonymous(), eg

        if (vma_is_anonymous(vmf->vma))
                return do_anonymous_page(vmf);
        else
                return do_fault(vmf);

in do_pte_missing() etc.

I don't actually think it would be too hard, but it's a more
"conceptual" change, and it's probably not worth it.

> Alright, an hour later of fiddling, and it doesn't actually work (yet?)
> -- the selftest fails. A diff follows below.

May I suggest a slightly different approach: do what we did for
"pte_mkwrite()".

It needed the vma too, for not too dissimilar reasons: special dirty
bit handling for the shadow stack.
See

  bb3aadf7d446 ("x86/mm: Start actually marking _PAGE_SAVED_DIRTY")
  b497e52ddb2a ("x86/mm: Teach pte_mkwrite() about stack memory")

and now we have "pte_mkwrite_novma()" with the old semantics for the
legacy cases that didn't get converted - whether it's because the
architecture doesn't have the issue, or because it's a kernel pte.

And the conversion was actually quite pain-free, because we have

  #ifndef pte_mkwrite
  static inline pte_t pte_mkwrite(pte_t pte, struct vm_area_struct *vma)
  {
        return pte_mkwrite_novma(pte);
  }
  #endif

so all any architecture that didn't want this needed to do was to
rename their pte_mkwrite() to pte_mkwrite_novma() and they were done.
In fact, that was done first as basically semantically no-op patches:

  2f0584f3f4bd ("mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()")
  6ecc21bb432d ("mm: Move pte/pmd_mkwrite() callers with no VMA to _novma()")
  161e393c0f63 ("mm: Make pte_mkwrite() take a VMA")

which made this all very pain-free (and was largely a sed script, I think).

> -           !pte_dirty(pte) && !PageDirty(page))
> +           !pte_dirty(pte) && !PageDirty(page) &&
> +           !(vma->vm_flags & VM_DROPPABLE))

So instead of this kind of thing, we'd have

> -           !pte_dirty(pte) && !PageDirty(page))
> +           !pte_dirty(pte, vma) && !PageDirty(page) &&

and the advantage here is that you can't miss anybody by mistake. The
compiler will be very unhappy if you don't pass in the vma, and then
any places that would be converted to "pte_dirty_novma()"

We don't actually have all that many users of pte_dirty(), so it
doesn't look too nasty. And if we make the pte_dirty() semantics
depend on the vma, I really think we should do it the same way we did
pte_mkwrite().

Long-term, maybe we should just aim to always pass in the vma to the
pte_xyz() functions, but...

             Linus
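
For readers who want to see the shape of the proposal, here is a minimal
userspace sketch of the same "#ifndef" fallback applied to pte_dirty(),
mirroring the pte_mkwrite() wrapper quoted in the message. It is not kernel
code and not a patch from this thread: pte_t, struct vm_area_struct, the
MOCK_PAGE_DIRTY bit and pte_dirty_demo() are all stand-ins invented for
illustration, and the sketch deliberately does not guess what a
VM_DROPPABLE-aware override would return.

```c
#include <assert.h>
#include <stdbool.h>

/* Mock types: assumptions for illustration, not the kernel definitions. */
typedef struct { unsigned long val; } pte_t;
struct vm_area_struct { unsigned long vm_flags; };

/* Hypothetical dirty bit; the real bit position is architecture-specific. */
#define MOCK_PAGE_DIRTY 0x40UL

/* Old semantics: the answer depends on the pte alone. */
static inline bool pte_dirty_novma(pte_t pte)
{
	return (pte.val & MOCK_PAGE_DIRTY) != 0;
}

/*
 * Generic fallback in the style of the pte_mkwrite() wrapper: an
 * architecture that does not care about the vma only has to provide
 * pte_dirty_novma(). One that does care would supply its own
 * pte_dirty() and define a macro of the same name so that this
 * fallback is skipped.
 */
#ifndef pte_dirty
static inline bool pte_dirty(pte_t pte, struct vm_area_struct *vma)
{
	(void)vma; /* unused in the fallback */
	return pte_dirty_novma(pte);
}
#endif

/* Small self-check; returns 0 when the fallback matches the old semantics. */
int pte_dirty_demo(void)
{
	struct vm_area_struct vma = { 0 };
	pte_t clean = { 0 };
	pte_t dirty = { MOCK_PAGE_DIRTY };

	assert(!pte_dirty(clean, &vma));
	assert(pte_dirty(dirty, &vma) == pte_dirty_novma(dirty));
	return 0;
}
```

With this layout, a caller that still writes pte_dirty(pte) fails to compile,
which is the "can't miss anybody by mistake" property the message relies on;
call sites that genuinely have no vma would be switched to pte_dirty_novma()
explicitly.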