Message-ID: <7952474c-3eaf-4abf-b8de-e8a72b4bec0c@arm.com>
Date: Thu, 8 May 2025 09:31:37 +0530
Subject: Re: [PATCH v2 2/2] mm: Optimize mremap() by PTE batching
To: Zi Yan
Cc: akpm@linux-foundation.org, Liam.Howlett@oracle.com,
 lorenzo.stoakes@oracle.com, vbabka@suse.cz, jannh@google.com,
 pfalcato@suse.de, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 david@redhat.com, peterx@redhat.com, ryan.roberts@arm.com, mingo@kernel.org,
 libang.li@antgroup.com, maobibo@loongson.cn, zhengqi.arch@bytedance.com,
 baohua@kernel.org, anshuman.khandual@arm.com, willy@infradead.org,
 ioworker0@gmail.com, yang@os.amperecomputing.com,
 baolin.wang@linux.alibaba.com, hughd@google.com
References: <20250507060256.78278-1-dev.jain@arm.com>
 <20250507060256.78278-3-dev.jain@arm.com>
 <74FDF9E1-3148-460B-8E3C-5EE156A3FA93@nvidia.com>
From: Dev Jain
In-Reply-To: <74FDF9E1-3148-460B-8E3C-5EE156A3FA93@nvidia.com>

On 08/05/25 7:30 am, Zi Yan wrote:
> On 7 May 2025, at 2:02, Dev Jain wrote:
>
>> To use PTE batching, we want to determine whether the folio mapped by
>> the PTE is large, thus requiring the use of vm_normal_folio(). We want
>> to avoid the cost of vm_normal_folio() if the code path doesn't already
>> require the folio. For arm64, pte_batch_hint() does the job.
>> To generalize
>> this hint, add a helper which will determine whether two consecutive PTEs
>> point to consecutive PFNs, in which case there is a high probability that
>> the underlying folio is large.
>> Next, use folio_pte_batch() to optimize move_ptes(). On arm64, if the ptes
>> are painted with the contig bit, then ptep_get() will iterate through all 16
>> entries to collect a/d bits. Hence this optimization will result in a 16x
>> reduction in the number of ptep_get() calls. Next, ptep_get_and_clear()
>> will eventually call contpte_try_unfold() on every contig block, thus
>> flushing the TLB for the complete large folio range. Instead, use
>> get_and_clear_full_ptes() so as to elide TLBIs on each contig block, and only
>> do them on the starting and ending contig block.
>>
>> Signed-off-by: Dev Jain
>> ---
>>  include/linux/pgtable.h | 29 +++++++++++++++++++++++++++++
>>  mm/mremap.c             | 37 ++++++++++++++++++++++++++++++-------
>>  2 files changed, 59 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>> index b50447ef1c92..38dab1f562ed 100644
>> --- a/include/linux/pgtable.h
>> +++ b/include/linux/pgtable.h
>> @@ -369,6 +369,35 @@ static inline pgd_t pgdp_get(pgd_t *pgdp)
>>  }
>>  #endif
>>
>> +/**
>> + * maybe_contiguous_pte_pfns - Hint whether the page mapped by the pte belongs
>> + * to a large folio.
>> + * @ptep: Pointer to the page table entry.
>> + * @pte: The page table entry.
>> + *
>> + * This helper is invoked when the caller wants to batch over a set of ptes
>> + * mapping a large folio, but the concerned code path does not already have
>> + * the folio. We want to avoid the cost of vm_normal_folio() only to find that
>> + * the underlying folio was small; i.e keep the small folio case as fast as
>> + * possible.
>> + *
>> + * The caller must ensure that ptep + 1 exists.
>
> ptep points to an entry in a PTE page. As long as it is not pointing
> to the last entry, ptep+1 should always exist.
> With PTRS_PER_PTE and
> sizeof(pte_t), you can check ptep address to figure out whether it
> is the last entry of a PTE page, right? Let me know if I misunderstand
> anything.

Sounds correct to me.

>
>
> --
> Best Regards,
> Yan, Zi