Date: Fri, 16 Feb 2024 12:30:12 +0000
From: Catalin Marinas
To: Ryan Roberts
Cc: Will Deacon, Ard Biesheuvel, Marc Zyngier, James Morse, Andrey Ryabinin, Andrew Morton, Matthew Wilcox, Mark Rutland, David Hildenbrand, Kefeng Wang, John Hubbard, Zi Yan, Barry Song <21cnbao@gmail.com>, Alistair Popple, Yang Shi, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, "H. Peter Anvin", linux-arm-kernel@lists.infradead.org, x86@kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6 13/18] arm64/mm: Implement new wrprotect_ptes() batch API
References: <20240215103205.2607016-1-ryan.roberts@arm.com> <20240215103205.2607016-14-ryan.roberts@arm.com>
In-Reply-To: <20240215103205.2607016-14-ryan.roberts@arm.com>

On Thu, Feb 15, 2024 at 10:32:00AM +0000, Ryan Roberts wrote:
> Optimize the contpte implementation to fix some of the fork performance
> regression introduced by the initial contpte commit. Subsequent patches
> will solve it entirely.
>
> During fork(), any private memory in the parent must be write-protected.
> Previously this was done 1 PTE at a time. But the core-mm supports
> batched wrprotect via the new wrprotect_ptes() API. So let's implement
> that API and for fully covered contpte mappings, we no longer need to
> unfold the contpte. This has 2 benefits:
>
>   - reduced unfolding, reduces the number of tlbis that must be issued.
>   - The memory remains contpte-mapped ("folded") in the parent, so it
>     continues to benefit from the more efficient use of the TLB after
>     the fork.
>
> The optimization to wrprotect a whole contpte block without unfolding is
> possible thanks to the tightening of the Arm ARM in respect to the
> definition and behaviour when 'Misprogramming the Contiguous bit'. See
> section D21194 at https://developer.arm.com/documentation/102105/ja-07/
>
> Tested-by: John Hubbard
> Signed-off-by: Ryan Roberts

Acked-by: Catalin Marinas
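
For readers following the thread, the difference between per-PTE and batched write-protection described in the commit message can be sketched with a small userspace model. This is not the kernel code from the patch: the bit layout, the 16-entry block size, the toy_* names and the "one TLB invalidation per unfolded block" counter are all assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: hypothetical bit layout, not the arm64 PTE encoding. */
#define TOY_NR_PTES   64u
#define TOY_CONT_PTES 16u          /* assumed entries per contiguous block */
#define TOY_PTE_WRITE (1u << 0)
#define TOY_PTE_CONT  (1u << 1)

struct toy_mm {
	uint32_t pte[TOY_NR_PTES];
	unsigned int tlbi_count;   /* modelled cost: one per unfolded block */
};

/* Start with every entry writable and folded into contiguous blocks. */
static void toy_init(struct toy_mm *mm)
{
	for (size_t i = 0; i < TOY_NR_PTES; i++)
		mm->pte[i] = TOY_PTE_WRITE | TOY_PTE_CONT;
	mm->tlbi_count = 0;
}

/* Clear the contiguous bit on the whole block that contains idx. */
static void toy_unfold(struct toy_mm *mm, size_t idx)
{
	size_t start = idx - (idx % TOY_CONT_PTES);

	if (!(mm->pte[idx] & TOY_PTE_CONT))
		return;            /* block is already unfolded */
	for (size_t i = start; i < start + TOY_CONT_PTES; i++)
		mm->pte[i] &= ~TOY_PTE_CONT;
	mm->tlbi_count++;
}

/* Per-PTE path: the block must be unfolded before one entry changes. */
static void toy_wrprotect_one(struct toy_mm *mm, size_t idx)
{
	assert(idx < TOY_NR_PTES);
	toy_unfold(mm, idx);
	mm->pte[idx] &= ~TOY_PTE_WRITE;
}

/*
 * Batched path: only blocks that the range covers partially are unfolded.
 * Fully covered blocks keep the contiguous bit, because every entry in
 * the block receives the same permission change.
 */
static void toy_wrprotect_batch(struct toy_mm *mm, size_t idx, size_t nr)
{
	size_t end = idx + nr;

	assert(end <= TOY_NR_PTES);
	if (nr == 0)
		return;
	if (idx % TOY_CONT_PTES)
		toy_unfold(mm, idx);       /* range starts mid-block */
	if (end % TOY_CONT_PTES)
		toy_unfold(mm, end - 1);   /* range ends mid-block */
	for (size_t i = idx; i < end; i++)
		mm->pte[i] &= ~TOY_PTE_WRITE;
}
```

In this model, write-protecting entries 0..15 one at a time leaves the block unfolded and costs one modelled invalidation, while toy_wrprotect_batch(&mm, 0, 16) costs none and keeps the block folded. A partial range such as toy_wrprotect_batch(&mm, 20, 4) still unfolds the block it touches.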