Date: Thu, 8 May 2025 10:32:58 +0530
From: Dev Jain
To: David Hildenbrand, akpm@linux-foundation.org
Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, vbabka@suse.cz,
 jannh@google.com, pfalcato@suse.de, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, peterx@redhat.com, ryan.roberts@arm.com,
 mingo@kernel.org, libang.li@antgroup.com, maobibo@loongson.cn,
 zhengqi.arch@bytedance.com, baohua@kernel.org, anshuman.khandual@arm.com,
 willy@infradead.org, ioworker0@gmail.com, yang@os.amperecomputing.com
Subject: Re: [PATCH 2/3] mm: Add generic helper to hint a large folio
References: <20250506050056.59250-1-dev.jain@arm.com>
 <20250506050056.59250-3-dev.jain@arm.com>
 <887fb371-409e-4dad-b4ff-38b85bfddf95@redhat.com>
In-Reply-To: <887fb371-409e-4dad-b4ff-38b85bfddf95@redhat.com>

On 07/05/25 3:33 pm, David Hildenbrand wrote:
> On 06.05.25 07:00, Dev Jain wrote:
>> To use PTE batching, we want to determine whether the folio mapped by
>> the PTE is large, thus requiring the use of vm_normal_folio(). We want
>> to avoid the cost of vm_normal_folio() if the code path doesn't already
>> require the folio. For arm64, pte_batch_hint() does the job.
>> To generalize this hint, add a helper which will determine whether two
>> consecutive PTEs point to consecutive PFNs, in which case there is a
>> high probability that the underlying folio is large.
>>
>> Signed-off-by: Dev Jain
>> ---
>>   include/linux/pgtable.h | 16 ++++++++++++++++
>>   1 file changed, 16 insertions(+)
>>
>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>> index b50447ef1c92..28e21fcc7837 100644
>> --- a/include/linux/pgtable.h
>> +++ b/include/linux/pgtable.h
>> @@ -369,6 +369,22 @@ static inline pgd_t pgdp_get(pgd_t *pgdp)
>>   }
>>   #endif
>> +/* Caller must ensure that ptep + 1 exists */
>> +static inline bool maybe_contiguous_pte_pfns(pte_t *ptep, pte_t pte)
>> +{
>> +    pte_t *next_ptep, next_pte;
>> +
>> +    if (pte_batch_hint(ptep, pte) != 1)
>> +        return true;
>> +
>> +    next_ptep = ptep + 1;
>> +    next_pte = ptep_get(next_ptep);
>> +    if (!pte_present(next_pte))
>> +        return false;
>> +
>> +    return unlikely(pte_pfn(next_pte) - pte_pfn(pte) == PAGE_SIZE);
>> +}
>
> So, where we want to use that is:
>
> if (pte_present(old_pte)) {
>     if ((max_nr != 1) && maybe_contiguous_pte_pfns(old_ptep, old_pte)) {
>         struct folio *folio = vm_normal_folio(vma, old_addr, old_pte);
>
>         if (folio && folio_test_large(folio))
>             nr = folio_pte_batch(folio, old_addr, old_ptep,
>                          old_pte, max_nr, fpb_flags, NULL, NULL, NULL);
>     }
> }
>
> where we won't need the folio later. But want it all part of the same
> folio?
>
>
> And the simpler version would be
>
>
> if (pte_present(old_pte)) {
>     if (max_nr != 1) {
>         struct folio *folio = vm_normal_folio(vma, old_addr, old_pte);
>
>         if (folio && folio_test_large(folio))
>             nr = folio_pte_batch(folio, old_addr, old_ptep,
>                          old_pte, max_nr, fpb_flags, NULL, NULL, NULL);
>     }
> }
>
>
> Two things come to mind:
>
> (1) Do we *really* care about the vm_normal_folio() + folio_test_large()
> call that much, that you have to add this optimization ahead of
> times ? :)

For my mprotect series, I see a regression of almost
(7.7 - 7.65)/7.7 = 0.65% for the small folio case. I am happy to remove
this micro-optimization if that is the preference.

>
> (2) Do we really need "must be part of the same folio", or could be just
> batch over present ptes that map consecutive PFNs? In that case, a
> helper that avoids folio_pte_batch() completely might be better.
>

I am not sure I follow. folio_pte_batch() seems to be the simplest thing
we can do, and it is what the code does elsewhere; I am not aware of any
alternative.
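
For reference, one reading of point (2) is a helper that walks forward
over present PTEs for as long as the PFNs stay consecutive, without
looking up the folio at all. Below is a minimal userspace sketch of that
idea, not kernel code: struct toy_pte and toy_pfn_batch() are
hypothetical names made up for illustration, and the model tracks only a
present bit and a PFN. A real helper would presumably also have to
compare the remaining PTE bits (permissions, dirty, young and so on),
which this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a PTE. This is NOT the kernel's pte_t. */
struct toy_pte {
    bool present;
    unsigned long pfn;
};

/*
 * Count how many entries, starting at ptes[0], are present and map
 * consecutive PFNs. At most max_nr entries are examined.
 *
 * The caller must guarantee that ptes[0] is present and that
 * ptes[0..max_nr-1] are valid to read. The result is always >= 1.
 */
size_t toy_pfn_batch(const struct toy_pte *ptes, size_t max_nr)
{
    size_t nr = 1;

    assert(max_nr >= 1 && ptes[0].present);

    /* PFNs count pages, so entry nr must map exactly pfn0 + nr. */
    while (nr < max_nr && ptes[nr].present &&
           ptes[nr].pfn == ptes[0].pfn + nr)
        nr++;

    return nr;
}
```

With entries mapping PFNs 100, 101 and 102 followed by a non-present
entry, toy_pfn_batch() returns 3 when max_nr is 5 and 2 when max_nr is 2.
Whether batching across folio boundaries like this is acceptable for the
callers in question is exactly the open question in (2).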