From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 35DD4C433EF for ; Fri, 1 Apr 2022 19:15:38 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id AA5236B0071; Fri, 1 Apr 2022 15:15:27 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A53566B0072; Fri, 1 Apr 2022 15:15:27 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8F55E8D0001; Fri, 1 Apr 2022 15:15:27 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0212.hostedemail.com [216.40.44.212]) by kanga.kvack.org (Postfix) with ESMTP id 797CA6B0071 for ; Fri, 1 Apr 2022 15:15:27 -0400 (EDT) Received: from smtpin31.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 2707B183182C4 for ; Fri, 1 Apr 2022 19:15:17 +0000 (UTC) X-FDA: 79309263474.31.E2C4CA0 Received: from mail-pg1-f170.google.com (mail-pg1-f170.google.com [209.85.215.170]) by imf24.hostedemail.com (Postfix) with ESMTP id 954EC18000E for ; Fri, 1 Apr 2022 19:15:16 +0000 (UTC) Received: by mail-pg1-f170.google.com with SMTP id q19so3097572pgm.6 for ; Fri, 01 Apr 2022 12:15:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=74mB6MqkSJBkwlaVi13tzJeL9OKs1IHJIH4CRNeep8g=; b=bJTyRdDTmlbuSOM0B2IOeSL9/WFOBdcTPTkBu4BZi3VTXTZXyU9USVbd0ARBpqBd/o Xhqm+ZokB9H18FXjbbbKA3nLPKuk8jlb1DV9o4R5Shg4d7X2Zd9deT8e+k5Wcvp8TurP 7FXpoK+YCocWG8BZCLk/6mOj8pmP2DTACrdxvZe8W597X5HAdipYg2QW/3kQ8Hh8kNfw ddNYibFysh1gqMIgVAdqbXiMWNdskLI9YyNFUwamOWZgIQlW7hYipsYzaSZAs9wNLwDh wJvQMCRCBmmEzHUKRTYCPGPum6YvplBPsG//0oQb/q9Es28OAkEP2D8h/i4/NbjA1iCK Gf/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=74mB6MqkSJBkwlaVi13tzJeL9OKs1IHJIH4CRNeep8g=; b=Wf3LwlUVYkaM4UBFQY0TgDPmawVSlFliHgt8kM+Six4FP8LYscgo+BPyIUZg7SjcxK Boa9zYcW1+qKZFdWQuKNHhdcSW3VoJn1KPI1u8GDUy9zpkwI16EYrwVFl/N7o2Q6EU6B WCK7pjmnKyHagMsEl48flnd6VXGDKfid+0aCwxL/nkQAcS3Ar4LKoFnhFaFm6BAQpUlw Vyw64cUCcpTUS38gjc/6tkFJRqaH2zGUdYJ1U+WKdMn5fnSoCm2NGsMT08H1qtAAW5Pf Aqi3J/dD+TKkuX028NJNQNLJOxbi5Z+ACkZXB3/EFi4/CNLzZP/MKy8oT4NWVRL4zKY5 5BlQ== X-Gm-Message-State: AOAM533yccFTFc0OvNBFZQw3OYlk0Rum4O6C5lUqvYdT7HkE4xoqXPrW XldCqzXn5/3STGJBx5nSf4c= X-Google-Smtp-Source: ABdhPJyLNOCEvxq6jQfaw9AVNQZAkvBxjWn0h+EdPMHZI8Q6BSfpdilPDso/5du4exJGhiBafhi/+w== X-Received: by 2002:a63:3cf:0:b0:382:9343:bbd3 with SMTP id 198-20020a6303cf000000b003829343bbd3mr16248441pgd.511.1648840515432; Fri, 01 Apr 2022 12:15:15 -0700 (PDT) Received: from smtpclient.apple ([66.170.99.1]) by smtp.gmail.com with ESMTPSA id s10-20020a056a00178a00b004fda49fb25dsm3962041pfg.9.2022.04.01.12.15.14 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Fri, 01 Apr 2022 12:15:14 -0700 (PDT) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3696.80.82.1.1\)) Subject: Re: [PATCH v1 mmotm] mm/mprotect: try avoiding write faults for exclusive anonynmous pages when changing protection From: Nadav Amit In-Reply-To: <20220401101334.68859-1-david@redhat.com> Date: Fri, 1 Apr 2022 12:15:13 -0700 Cc: LKML , linux-mm , Linus Torvalds , Andrew Morton , Dave Hansen , Andrea Arcangeli , Peter Xu , Yang Shi , Hugh Dickins , Mel Gorman , "Edgecombe, Rick P" Content-Transfer-Encoding: quoted-printable Message-Id: <8A6AF878-D5D7-4D88-A736-0FEF71439D44@gmail.com> References: <20220401101334.68859-1-david@redhat.com> To: David Hildenbrand X-Mailer: Apple Mail (2.3696.80.82.1.1) X-Stat-Signature: f9esg57jfppdto79s5zwswyi5cwc4reg Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=gmail.com header.s=20210112 header.b=bJTyRdDT; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf24.hostedemail.com: domain of nadav.amit@gmail.com designates 209.85.215.170 as permitted sender) smtp.mailfrom=nadav.amit@gmail.com X-Rspam-User: X-Rspamd-Server: rspam11 X-Rspamd-Queue-Id: 954EC18000E X-HE-Tag: 1648840516-171066 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: [ +Rick ] > On Apr 1, 2022, at 3:13 AM, David Hildenbrand = wrote: >=20 > Similar to our MM_CP_DIRTY_ACCT handling for shared, writable = mappings, we > can try mapping anonymous pages writable if they are exclusive, > the PTE is already dirty, and no special handling applies. Mapping the > PTE writable is essentially the same thing the write fault handler = would do > in this case. In general I am all supportive for such a change. I do have some mostly-minor concerns. >=20 > +static inline bool can_change_pte_writable(struct vm_area_struct = *vma, > + unsigned long addr, pte_t = pte, > + unsigned long cp_flags) > +{ > + struct page *page; > + > + if ((vma->vm_flags & VM_SHARED) && !(cp_flags & = MM_CP_DIRTY_ACCT)) > + /* > + * MM_CP_DIRTY_ACCT is only expressive for shared = mappings; > + * without MM_CP_DIRTY_ACCT, there is nothing to do. > + */ > + return false; > + > + if (!(vma->vm_flags & VM_WRITE)) > + return false; > + > + if (pte_write(pte) || pte_protnone(pte) || !pte_dirty(pte)) > + return false; If pte_write() is already try then return false? I understand you want to do so because the page is already writable, but it is confusing. In addition, I am not sure about the pte_dirty() check is really robust. I mean I think it is ok, but is there any issue with shadow-stack?=20 And this also assumes the kernel does not clear the dirty bit without clearing the present, as otherwise the note in Intel SDM section 4.8 ("Accessed and Dirty Flags=E2=80=9D) will be relevant and dirty bit = might be set unnecessarily. I think it is ok. > + > + /* Do we need write faults for softdirty tracking? */ > + if (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && !pte_soft_dirty(pte) && > + (vma->vm_flags & VM_SOFTDIRTY)) If !IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) then VM_SOFTDIRTY =3D=3D 0. So I = do not think the IS_ENABLED() is necessary (unless you think it is clearer this way). > + return false; > + > + /* Do we need write faults for uffd-wp tracking? */ > + if (userfaultfd_pte_wp(vma, pte)) > + return false; > + > + if (!(vma->vm_flags & VM_SHARED)) { > + /* > + * We can only special-case on exclusive anonymous = pages, > + * because we know that our write-fault handler = similarly would > + * map them writable without any additional checks while = holding > + * the PT lock. > + */ > + page =3D vm_normal_page(vma, addr, pte); I guess we cannot call vm_normal_page() twice, once for prot_numa and = once here, in practice... > + if (!page || !PageAnon(page) || = !PageAnonExclusive(page)) > + return false; > + } > + > + return true; > +} Note that there is a small downside to all of that. Assume you = mprotect() a single page from RO to RW and you have many threads. With my pending patch you would avoid the TLB shootdown (and get a PF). With this patch you would get a TLB shootdown and save the PF. IOW, I think it is worthy to skip the shootdown as well in such a case and instead flush the TLB on spurious page-faults. But I guess that=E2=80=99s = for another patch.