From: Catalin Marinas
To: ryan.roberts@arm.com, will@kernel.org, dev.jain@arm.com, Barry Song, Lance Yang, Xavier Xia
Cc: akpm@linux-foundation.org, david@redhat.com, gshan@redhat.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, willy@infradead.org, xavier_qy@163.com, ziy@nvidia.com
Subject: Re: [PATCH v7] arm64/mm: Optimize loop to reduce redundant operations of contpte_ptep_get
Date: Thu, 3 Jul 2025 20:04:57 +0100
Message-Id: <175156948681.3519813.8652806937156134172.b4-ty@arm.com>
In-Reply-To: <20250624152549.2647828-1-xavier.qyxia@gmail.com>
References: <20250624152549.2647828-1-xavier.qyxia@gmail.com>

On Tue, 24 Jun 2025 23:25:49 +0800, Xavier Xia wrote:
> This commit optimizes the contpte_ptep_get and contpte_ptep_get_lockless
> function by adding early termination logic. It checks if the dirty and
> young bits of orig_pte are already set and skips redundant bit-setting
> operations during the loop. This reduces unnecessary iterations and
> improves performance.
>
> In order to verify the optimization performance, a test function has been
> designed. The function's execution time and instruction statistics have
> been traced using perf, and the following are the operation results on a
> certain Qualcomm mobile phone chip:
>
> [...]

Applied to arm64 (for-next/misc), thanks!

[1/1] arm64/mm: Optimize loop to reduce redundant operations of contpte_ptep_get
      https://git.kernel.org/arm64/c/093ae7a033cf

--
Catalin