Date: Fri, 16 Feb 2024 12:30:48 +0000
From: Catalin Marinas
To: Ryan Roberts
Cc: Will Deacon, Ard Biesheuvel, Marc Zyngier, James Morse,
 Andrey Ryabinin, Andrew Morton, Matthew Wilcox, Mark Rutland,
 David Hildenbrand, Kefeng Wang, John Hubbard, Zi Yan,
 Barry Song <21cnbao@gmail.com>, Alistair Popple, Yang Shi,
 Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 "H. Peter Anvin", linux-arm-kernel@lists.infradead.org,
 x86@kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6 14/18] arm64/mm: Implement new [get_and_]clear_full_ptes() batch APIs
References: <20240215103205.2607016-1-ryan.roberts@arm.com>
 <20240215103205.2607016-15-ryan.roberts@arm.com>
In-Reply-To: <20240215103205.2607016-15-ryan.roberts@arm.com>

On Thu, Feb 15, 2024 at 10:32:01AM +0000, Ryan Roberts wrote:
> Optimize the contpte implementation to fix some of the
> exit/munmap/dontneed performance regression introduced by the initial
> contpte commit. Subsequent patches will solve it entirely.
>
> During exit(), munmap() or madvise(MADV_DONTNEED), mappings must be
> cleared. Previously this was done 1 PTE at a time. But the core-mm
> supports batched clear via the new [get_and_]clear_full_ptes() APIs. So
> let's implement those APIs and for fully covered contpte mappings, we no
> longer need to unfold the contpte. This significantly reduces unfolding
> operations, reducing the number of tlbis that must be issued.
>
> Tested-by: John Hubbard
> Signed-off-by: Ryan Roberts

Acked-by: Catalin Marinas