From: Dev Jain <dev.jain@arm.com>
To: Anshuman Khandual, akpm@linux-foundation.org
Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, vbabka@suse.cz,
 jannh@google.com, pfalcato@suse.de, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, david@redhat.com, peterx@redhat.com,
 ryan.roberts@arm.com, mingo@kernel.org, libang.li@antgroup.com,
 maobibo@loongson.cn, zhengqi.arch@bytedance.com,
 baohua@kernel.org, willy@infradead.org, ioworker0@gmail.com,
 yang@os.amperecomputing.com
Subject: Re: [PATCH 3/3] mm: Optimize mremap() by PTE batching
Date: Tue, 6 May 2025 15:50:15 +0530
Message-ID: <35966495-6922-4e18-a852-efb5d159a343@arm.com>
References: <20250506050056.59250-1-dev.jain@arm.com>
 <20250506050056.59250-4-dev.jain@arm.com>

On 06/05/25 3:40 pm, Anshuman Khandual wrote:
> On 5/6/25 10:30, Dev Jain wrote:
>> Use folio_pte_batch() to optimize move_ptes(). Use get_and_clear_full_ptes()
>> so as to elide the per-PTE TLBIs on each contig block that
>> ptep_get_and_clear() previously issued.
>>
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>>   mm/mremap.c | 24 +++++++++++++++++++-----
>>   1 file changed, 19 insertions(+), 5 deletions(-)
>>
>> diff --git a/mm/mremap.c b/mm/mremap.c
>> index 1a08a7c3b92f..3621c07d8eea 100644
>> --- a/mm/mremap.c
>> +++ b/mm/mremap.c
>> @@ -176,7 +176,7 @@ static int move_ptes(struct pagetable_move_control *pmc,
>>   	struct vm_area_struct *vma = pmc->old;
>>   	bool need_clear_uffd_wp = vma_has_uffd_without_event_remap(vma);
>>   	struct mm_struct *mm = vma->vm_mm;
>> -	pte_t *old_ptep, *new_ptep, pte;
>> +	pte_t *old_ptep, *new_ptep, old_pte, pte;
>>   	pmd_t dummy_pmdval;
>>   	spinlock_t *old_ptl, *new_ptl;
>>   	bool force_flush = false;
>> @@ -185,6 +185,7 @@ static int move_ptes(struct pagetable_move_control *pmc,
>>   	unsigned long old_end = old_addr + extent;
>>   	unsigned long len = old_end - old_addr;
>>   	int err = 0;
>> +	int nr;
>>
>>   	/*
>>   	 * When need_rmap_locks is true, we take the i_mmap_rwsem and anon_vma
>> @@ -237,10 +238,14 @@ static int move_ptes(struct pagetable_move_control *pmc,
>>
>>   	for (; old_addr < old_end; old_ptep++, old_addr += PAGE_SIZE,
>>   				   new_ptep++, new_addr += PAGE_SIZE) {
>> -		if (pte_none(ptep_get(old_ptep)))
>> +		const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
>> +		int max_nr = (old_end - old_addr) >> PAGE_SHIFT;
>> +
>> +		nr = 1;
>> +		old_pte = ptep_get(old_ptep);
>> +		if (pte_none(old_pte))
>>   			continue;
>>
>> -		pte = ptep_get_and_clear(mm, old_addr, old_ptep);
>>   		/*
>>   		 * If we are remapping a valid PTE, make sure
>>   		 * to flush TLB before we drop the PTL for the
>> @@ -252,8 +257,17 @@ static int move_ptes(struct pagetable_move_control *pmc,
>>   		 * the TLB entry for the old mapping has been
>>   		 * flushed.
>>   		 */
>> -		if (pte_present(pte))
>> +		if (pte_present(old_pte)) {
>> +			if ((max_nr != 1) && maybe_contiguous_pte_pfns(old_ptep, old_pte)) {
>
> maybe_contiguous_pte_pfns() cost will be applicable for memory
> areas greater than a single PAGE_SIZE (i.e. max_nr != 1)? This
> helper extracts an additional consecutive pte, ensures that it
> is valid and mapped, and extracts the pfn before comparing for
> the span.
>
> There is some cost associated with the above code sequence, which
> looks justified for sequential access of memory buffers that have
> consecutive physical memory backing.

I did not see any regression for the simple case of mremapping base pages.

> But what happens when such
> buffers are less probable, will those buffers take a performance
> hit for all the comparisons that just turn out to be negative?

When would that be the case? We are remapping consecutive ptes to
consecutive ptes.

>
>> +				struct folio *folio = vm_normal_folio(vma, old_addr, old_pte);
>> +
>> +				if (folio && folio_test_large(folio))
>> +					nr = folio_pte_batch(folio, old_addr, old_ptep,
>> +							old_pte, max_nr, fpb_flags, NULL, NULL, NULL);
>> +			}
>>   			force_flush = true;
>> +		}
>> +		pte = get_and_clear_full_ptes(mm, old_addr, old_ptep, nr, 0);
>>   		pte = move_pte(pte, old_addr, new_addr);
>>   		pte = move_soft_dirty_pte(pte);
>>
>> @@ -266,7 +280,7 @@ static int move_ptes(struct pagetable_move_control *pmc,
>>   			else if (is_swap_pte(pte))
>>   				pte = pte_swp_clear_uffd_wp(pte);
>>   		}
>> -		set_pte_at(mm, new_addr, new_ptep, pte);
>> +		set_ptes(mm, new_addr, new_ptep, pte, nr);
>>   	}
>>   }
>>
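
P.S. For anyone skimming the thread: going by the description above, the
gate being debated amounts to roughly the following. This is a minimal
sketch reconstructed from the discussion, not the actual helper (which is
introduced earlier in this series and may differ in detail):

/*
 * Rough sketch, inferred from Anshuman's description: peek at the
 * next PTE and check that it is present and maps the physically
 * contiguous next pfn. Only when this cheap check passes do we pay
 * for vm_normal_folio() and the full folio_pte_batch() walk.
 */
static inline bool maybe_contiguous_pte_pfns(pte_t *ptep, pte_t pte)
{
	pte_t next_pte = ptep_get(ptep + 1);

	return pte_present(next_pte) &&
	       pte_pfn(next_pte) == pte_pfn(pte) + 1;
}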
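
To make the cost question measurable, a userspace harness along these
lines can time the move for both cases. This is an illustrative sketch
only: the 4K page size, 64 MiB buffer, and the one-page misalignment
trick are my assumptions, not the exact test behind the numbers quoted
above.

#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <sys/mman.h>

#define SZ		(64UL << 20)	/* 64 MiB payload */
#define PAGE		4096UL		/* assumed base page size */
#define PMD_SPAN	(2UL << 20)	/* assumed PMD coverage */

/* mmap an anonymous region whose start is PMD-aligned (sketch only;
 * the unaligned head of the over-allocation is simply leaked). */
static char *map_pmd_aligned(size_t len, int prot)
{
	char *raw = mmap(NULL, len + PMD_SPAN, prot,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return NULL;
	return (char *)(((uintptr_t)raw + PMD_SPAN - 1) & ~(PMD_SPAN - 1));
}

int main(void)
{
	char *src = map_pmd_aligned(SZ, PROT_READ | PROT_WRITE);
	char *dst = map_pmd_aligned(SZ + PAGE, PROT_NONE);
	struct timespec t0, t1;
	void *moved;

	if (!src || !dst) {
		perror("mmap");
		return 1;
	}

	/* MADV_HUGEPAGE encourages large folios so batching can kick in;
	 * switch to MADV_NOHUGEPAGE to time the base-page case. */
	madvise(src, SZ, MADV_HUGEPAGE);
	memset(src, 1, SZ);		/* fault the whole range in */

	/*
	 * src is PMD-aligned but dst + PAGE is not: with differing
	 * alignment the kernel cannot move whole page tables via
	 * move_normal_pmd() and must take the move_ptes() loop that
	 * this patch batches.
	 */
	clock_gettime(CLOCK_MONOTONIC, &t0);
	moved = mremap(src, SZ, SZ, MREMAP_MAYMOVE | MREMAP_FIXED,
		       dst + PAGE);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	if (moved == MAP_FAILED) {
		perror("mremap");
		return 1;
	}

	printf("mremap of %lu MiB took %ld ns\n", SZ >> 20,
	       (t1.tv_sec - t0.tv_sec) * 1000000000L +
	       (t1.tv_nsec - t0.tv_nsec));
	return 0;
}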