Subject: Re: [RFC PATCH 03/14] mm/mprotect: allow exclusive anon pages to be writable
From: Nadav Amit
Date: Wed, 20 Jul 2022 10:25:46 -0700
To: David Hildenbrand
Cc: Linux MM, LKML, Andrew Morton, Mike Rapoport, Axel Rasmussen,
    Andrea Arcangeli, Andrew Cooper, Andy Lutomirski, Dave Hansen,
    Peter Xu, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao,
    Nick Piggin
In-Reply-To: <23a9d678-487e-5940-4cde-dc53d920fb48@redhat.com>
References: <20220718120212.3180-1-namit@vmware.com>
    <20220718120212.3180-4-namit@vmware.com>
    <23a9d678-487e-5940-4cde-dc53d920fb48@redhat.com>

On Jul 20, 2022, at 8:19 AM, David Hildenbrand wrote:

> On 18.07.22 14:02, Nadav Amit wrote:
>> From: Nadav Amit
>>
>> Anonymous pages might have the dirty bit clear, but this should not
>> prevent mprotect from making them writable if they are exclusive.
>> Therefore, skip the test whether the page is dirty in this case.
>>
>> Cc: Andrea Arcangeli
>> Cc: Andrew Cooper
>> Cc: Andrew Morton
>> Cc: Andy Lutomirski
>> Cc: Dave Hansen
>> Cc: David Hildenbrand
>> Cc: Peter Xu
>> Cc: Peter Zijlstra
>> Cc: Thomas Gleixner
>> Cc: Will Deacon
>> Cc: Yu Zhao
>> Cc: Nick Piggin
>> Signed-off-by: Nadav Amit
>> ---
>>  mm/mprotect.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>> index 34c2dfb68c42..da5b9bf8204f 100644
>> --- a/mm/mprotect.c
>> +++ b/mm/mprotect.c
>> @@ -45,7 +45,7 @@ static inline bool can_change_pte_writable(struct vm_area_struct *vma,
>>
>>      VM_BUG_ON(!(vma->vm_flags & VM_WRITE) || pte_write(pte));
>>
>> -    if (pte_protnone(pte) || !pte_dirty(pte))
>> +    if (pte_protnone(pte))
>>          return false;
>>
>>      /* Do we need write faults for softdirty tracking? */
>> @@ -66,7 +66,8 @@ static inline bool can_change_pte_writable(struct vm_area_struct *vma,
>>          page = vm_normal_page(vma, addr, pte);
>>          if (!page || !PageAnon(page) || !PageAnonExclusive(page))
>>              return false;
>> -    }
>> +    } else if (!pte_dirty(pte))
>> +        return false;
>>
>>      return true;
>>  }
>
> When I wrote that code, I was wondering how often that would actually
> happen in practice -- and if we care about optimizing that. Do you have
> a gut feeling in which scenarios this would happen and if we care?
>
> If the page is in the swapcache and was swapped out, you'd be requiring
> a writeback even though nobody modified the page and possibly isn't
> going to do so in the near future.

So here is my due diligence: I did not really encounter a scenario in
which it showed up. When I looked at your code, I assumed this was an
oversight and not a deliberate decision. For me the issue is more the
discrepancy between how a given page is handled before and after it
was paged out.
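
For readers following along, the behavioral change in the quoted hunks can
be modeled in userspace roughly as below. This is only a sketch under
assumptions, not the kernel code: `struct pte_model` and both function names
are hypothetical, and the `needs_tracking` field stands in for the tracking
checks that the hunk context elides (only the softdirty comment is visible
in the diff).

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the state the real check inspects. */
struct pte_model {
    bool protnone;       /* models pte_protnone(pte) */
    bool dirty;          /* models pte_dirty(pte) */
    bool needs_tracking; /* assumption: collapses the elided tracking checks */
    bool vma_shared;     /* models vma->vm_flags & VM_SHARED */
    bool anon_exclusive; /* models page && PageAnon() && PageAnonExclusive() */
};

/* Before the patch: a clean PTE is never made writable by mprotect. */
static bool can_change_writable_old(const struct pte_model *m)
{
    if (m->protnone || !m->dirty)
        return false;
    if (m->needs_tracking)
        return false;
    if (!m->vma_shared && !m->anon_exclusive)
        return false;
    return true;
}

/* After the patch: the dirty test only applies to shared mappings. */
static bool can_change_writable_new(const struct pte_model *m)
{
    if (m->protnone)
        return false;
    if (m->needs_tracking)
        return false;
    if (!m->vma_shared) {
        if (!m->anon_exclusive)
            return false;
    } else if (!m->dirty) {
        return false;
    }
    return true;
}
```

In this model the two versions differ only for a clean PTE in a private
mapping whose page is anonymous and exclusive: the old check refuses to make
it writable, the new one allows it. Shared mappings behave the same in both.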

The way that I see it, there is a tradeoff in the way the dirty-bit
should be handled:

(1) Writable-clean PTEs introduce some non-negligible overhead.
(2) Marking a PTE dirty speculatively would require a writeback.

… But this tradeoff should not affect whether a PTE is writable, i.e.,
mapping the PTE as writable-clean should not cause a writeback. In other
words, if you are concerned about unnecessary writebacks, which I think
is a fair concern, then do not set the dirty-bit. In a later patch I try
to avoid TLB flushes on clean-writable entries that are write-protected.

So I do not think that the writeback you mentioned should be a real
issue. Yet if you think that using the fact that the page is not dirty is
a good heuristic to avoid future TLB flushes (for P->NP; as I said there
is a solution for RW->RO), or if you are concerned about the cost of
vm_normal_page(), perhaps those are valid concerns (although I do not
think so).

--

[ Regarding (1): After some discussions with Peter and reading more code,
I thought at some point that perhaps avoiding writable-clean PTEs as much
as possible makes sense [*], since setting the dirty-bit costs ~550
cycles and a page fault is not a lot more than 1000. But with all the
mitigations (and after adding IBRS for retbleed) page-fault entry is kind
of expensive.

[*] At least on x86 ]
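
To make the arithmetic in the bracketed note concrete, here is a tiny
sketch. The function name and the `mitigation_cycles` parameter are
hypothetical; only the ~550 and ~1000 cycle figures come from the text
above, and no mitigation cost was measured here.

```c
/*
 * Extra cycles paid on the first write when the PTE is kept read-only
 * (write fault) instead of writable-clean (hardware sets the dirty bit).
 * mitigation_cycles is an assumed, unmeasured add-on to fault entry.
 */
static long fault_extra_cycles(long dirty_set_cycles, long fault_cycles,
                               long mitigation_cycles)
{
    return fault_cycles + mitigation_cycles - dirty_set_cycles;
}
```

With the figures quoted above and no mitigation overhead,
fault_extra_cycles(550, 1000, 0) gives 450, which is why the gap once
looked small; any nonzero mitigation cost widens it by the same amount.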