From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20250627115510.3273675-1-david@redhat.com> <20250627115510.3273675-4-david@redhat.com>
In-Reply-To: <20250627115510.3273675-4-david@redhat.com>
From: Lance Yang
Date: Fri, 27 Jun 2025 22:19:32 +0800
Subject: Re: [PATCH v1 3/4] mm: split folio_pte_batch() into folio_pte_batch() and folio_pte_batch_ext()
To: David Hildenbrand
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton,
Howlett" , Lorenzo Stoakes , Vlastimil Babka , Jann Horn , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Zi Yan , Matthew Brost , Joshua Hahn , Rakie Kim , Byungchul Park , Gregory Price , Ying Huang , Alistair Popple , Pedro Falcato , Rik van Riel , Harry Yoo Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 15AB71C0008 X-Rspam-User: X-Rspamd-Server: rspam06 X-Stat-Signature: c55jrsa7wuj4ktk9st8ymdycyo97xm4z X-HE-Tag: 1751033991-407334 X-HE-Meta: U2FsdGVkX18ZZCgW+sNd0sJBMarx7BYt5jgGF6Po+ToDc2R65SHS1+wo2+ecQmMZVYSeSlrG5sbrtTLkRdzi+ojjOyjWoxlXskN2qWx5tDXLoges0qjZSv7+iQqarS/IYBAvjb8CY7DTnX/T3wepLgXsT16cgjoaSmflbCCWuh6fPHaKeqzAdlC52zqFVpmfaoziW24raHSkIm/jfSwafz3AbMvf2i7n/x60SX6VePRVnu6j5fpooGls+Lk9RSlJ28fakfXU3GHKXtGZ2BsJG300RsmzFCNMqqHVUFwlxkYa2m+8VwdZnhbFWQlgRB0a7YzZ/13KBdrcj4/J/++X7laJjyr2yL2XuGN8gxTUSXXLd1y7Rne4d+oTurGjWUAxYH953MlChlmNG2X6qoFEGmBP4SmXrWARAy3ZqBwK/Usz8mgsXh0aJSWB1HjbLW9pXvQWJm7Wq9YVeBTkoZJaiFrSSksPxHJ4xZiFfIzQ0HGcWQycKhnf+W/E7X5kuteGGgFZs1bq0ULLDuGjPTNunvgr3QNz1r25xFAix9BVF2tJnz1dlFDkeUAeavkzaUkmO5x62CIAUueepCUNuZmn4Xna2LyRvKT4i1cw8OTYWB0VYtDCeb7nWQI+li5xP88etRzAT3lx8oFHGpZk+bAt3HzKEWTgkXHCdvlmsjOWnG/2JMoDPqf8Zv83siUrNXLIcBhPv4An3Otr03Djh9FRwRXPFNatD5ZKQxkg5RYQtiD8x2DapVm35Ra5hFBtb94jlcNVgkTimKVtzAqWme1fxAkVjm7vHb++jc+GVP+ZeFjTzCax94lCkgW/W2D8Wd/BtcKuQW418y2vF3tZQJYs5KGvHzntrL2K4c2i4IXTCsNQ2eHDA98VT/Euzx7jmkXOzhymrKCRr+ycfsU5N9Qu/ZM1EmssRIY1SEH6T/bO2FHliZkbwgKqdmLaH5OenmByE2kQmmYBopgM6CVA1KT kHy2jdzp UNR2t4A4UXB9uhbmsmV+L5Tox19HRnyo9c13HPhWkKU0ZTIwJ2opyGihphWAwuoWm/s3aaBRVyyUmOBWuAKydsk8ZTOHO2eq3u8LSZEsRmnXKZ3fXoxkF0oc4K59Xzbs6GNIbycwuRdmNT+/uM9u1s0KqepK/0vbTIzgGzC+P7eO7rFK9pL6feihaQY16aRncnQHvKSab0qhkIY7UCZ8ciJJ+HNmg+h3nZwiqd0sb9a877waIk80htZ0j8exMnd8vXIltDTo9OamJ538ButDIvL37co33vLoyhpXO53uG0oHCGIXxZvSZ47degQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Jun 27, 2025 at 7:55=E2=80=AFPM David Hildenbrand wrote: > > Many users (including upcoming ones) don't really need the flags etc, > and can live with a function call. > > So let's provide a basic, non-inlined folio_pte_batch(). > > In zap_present_ptes(), where we care about performance, the compiler > already seem to generate a call to a common inlined folio_pte_batch() > variant, shared with fork() code. So calling the new non-inlined variant > should not make a difference. It's always an interesting dance with the compiler when it comes to inlinin= g, isn't it? We want the speed of 'inline' for critical paths, but also a comp= act binary for the common case ... This split is a nice solution to the classic 'inline' vs. code size dilemma= ;p Thanks, Lance > > While at it, drop the "addr" parameter that is unused. 
Thanks,
Lance

>
> While at it, drop the "addr" parameter that is unused.
>
> Signed-off-by: David Hildenbrand
> ---
>  mm/internal.h  | 11 ++++++++---
>  mm/madvise.c   |  4 ++--
>  mm/memory.c    |  6 ++----
>  mm/mempolicy.c |  3 +--
>  mm/mlock.c     |  3 +--
>  mm/mremap.c    |  3 +--
>  mm/rmap.c      |  3 +--
>  mm/util.c      | 29 +++++++++++++++++++++++++++++
>  8 files changed, 45 insertions(+), 17 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index ca6590c6d9eab..6000b683f68ee 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -218,9 +218,8 @@ static inline pte_t __pte_batch_clear_ignored(pte_t pte, fpb_t flags)
>  }
>
>  /**
> - * folio_pte_batch - detect a PTE batch for a large folio
> + * folio_pte_batch_ext - detect a PTE batch for a large folio
>   * @folio: The large folio to detect a PTE batch for.
> - * @addr: The user virtual address the first page is mapped at.
>   * @ptep: Page table pointer for the first entry.
>   * @pte: Page table entry for the first page.
>   * @max_nr: The maximum number of table entries to consider.
> @@ -243,9 +242,12 @@ static inline pte_t __pte_batch_clear_ignored(pte_t pte, fpb_t flags)
>   * must be limited by the caller so scanning cannot exceed a single VMA and
>   * a single page table.
>   *
> + * This function will be inlined to optimize based on the input parameters;
> + * consider using folio_pte_batch() instead if applicable.
> + *
>   * Return: the number of table entries in the batch.
>   */
> -static inline unsigned int folio_pte_batch(struct folio *folio, unsigned long addr,
> +static inline unsigned int folio_pte_batch_ext(struct folio *folio,
>                 pte_t *ptep, pte_t pte, unsigned int max_nr, fpb_t flags,
>                 bool *any_writable, bool *any_young, bool *any_dirty)
>  {
> @@ -293,6 +295,9 @@ static inline unsigned int folio_pte_batch(struct folio *folio, unsigned long ad
>         return min(nr, max_nr);
>  }
>
> +unsigned int folio_pte_batch(struct folio *folio, pte_t *ptep, pte_t pte,
> +               unsigned int max_nr);
> +
>  /**
>   * pte_move_swp_offset - Move the swap entry offset field of a swap pte
>   *  forward or backward by delta
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 661bb743d2216..9b9c35a398ed0 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -349,8 +349,8 @@ static inline int madvise_folio_pte_batch(unsigned long addr, unsigned long end,
>  {
>         int max_nr = (end - addr) / PAGE_SIZE;
>
> -       return folio_pte_batch(folio, addr, ptep, pte, max_nr, 0, NULL,
> -                              any_young, any_dirty);
> +       return folio_pte_batch_ext(folio, ptep, pte, max_nr, 0, NULL,
> +                                  any_young, any_dirty);
>  }
>
>  static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
> diff --git a/mm/memory.c b/mm/memory.c
> index ab2d6c1425691..43d35d6675f2e 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -995,7 +995,7 @@ copy_present_ptes(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
>                 if (vma_soft_dirty_enabled(src_vma))
>                         flags |= FPB_HONOR_SOFT_DIRTY;
>
> -               nr = folio_pte_batch(folio, addr, src_pte, pte, max_nr, flags,
> +               nr = folio_pte_batch_ext(folio, src_pte, pte, max_nr, flags,
>                                      &any_writable, NULL, NULL);
>                 folio_ref_add(folio, nr);
>                 if (folio_test_anon(folio)) {
> @@ -1564,9 +1564,7 @@ static inline int zap_present_ptes(struct mmu_gather *tlb,
>          * by keeping the batching logic separate.
>          */
>         if (unlikely(folio_test_large(folio) && max_nr != 1)) {
> -               nr = folio_pte_batch(folio, addr, pte, ptent, max_nr, 0,
> -                                    NULL, NULL, NULL);
> -
> +               nr = folio_pte_batch(folio, pte, ptent, max_nr);
>                 zap_present_folio_ptes(tlb, vma, folio, page, pte, ptent, nr,
>                                        addr, details, rss, force_flush,
>                                        force_break, any_skipped);
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 2a25eedc3b1c0..eb83cff7db8c3 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -711,8 +711,7 @@ static int queue_folios_pte_range(pmd_t *pmd, unsigned long addr,
>                 if (!folio || folio_is_zone_device(folio))
>                         continue;
>                 if (folio_test_large(folio) && max_nr != 1)
> -                       nr = folio_pte_batch(folio, addr, pte, ptent,
> -                                            max_nr, 0, NULL, NULL, NULL);
> +                       nr = folio_pte_batch(folio, pte, ptent, max_nr);
>                 /*
>                  * vm_normal_folio() filters out zero pages, but there might
>                  * still be reserved folios to skip, perhaps in a VDSO.
> diff --git a/mm/mlock.c b/mm/mlock.c
> index 2238cdc5eb1c1..a1d93ad33c6db 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -313,8 +313,7 @@ static inline unsigned int folio_mlock_step(struct folio *folio,
>         if (!folio_test_large(folio))
>                 return 1;
>
> -       return folio_pte_batch(folio, addr, pte, ptent, count, 0, NULL,
> -                              NULL, NULL);
> +       return folio_pte_batch(folio, pte, ptent, count);
>  }
>
>  static inline bool allow_mlock_munlock(struct folio *folio,
> diff --git a/mm/mremap.c b/mm/mremap.c
> index d4d3ffc931502..1f5bebbb9c0cb 100644
> --- a/mm/mremap.c
> +++ b/mm/mremap.c
> @@ -182,8 +182,7 @@ static int mremap_folio_pte_batch(struct vm_area_struct *vma, unsigned long addr
>         if (!folio || !folio_test_large(folio))
>                 return 1;
>
> -       return folio_pte_batch(folio, addr, ptep, pte, max_nr, 0, NULL,
> -                              NULL, NULL);
> +       return folio_pte_batch(folio, ptep, pte, max_nr);
>  }
>
>  static int move_ptes(struct pagetable_move_control *pmc,
> diff --git a/mm/rmap.c b/mm/rmap.c
> index a29d7d29c7283..6658968600b72 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1859,8 +1859,7 @@ static inline bool can_batch_unmap_folio_ptes(unsigned long addr,
>         if (pte_pfn(pte) != folio_pfn(folio))
>                 return false;
>
> -       return folio_pte_batch(folio, addr, ptep, pte, max_nr, 0, NULL,
> -                              NULL, NULL) == max_nr;
> +       return folio_pte_batch(folio, ptep, pte, max_nr) == max_nr;
>  }
>
>  /*
> diff --git a/mm/util.c b/mm/util.c
> index 0b270c43d7d12..d29dcc135ad28 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -1171,3 +1171,32 @@ int compat_vma_mmap_prepare(struct file *file, struct vm_area_struct *vma)
>         return 0;
>  }
>  EXPORT_SYMBOL(compat_vma_mmap_prepare);
> +
> +#ifdef CONFIG_MMU
> +/**
> + * folio_pte_batch - detect a PTE batch for a large folio
> + * @folio: The large folio to detect a PTE batch for.
> + * @ptep: Page table pointer for the first entry.
> + * @pte: Page table entry for the first page.
> + * @max_nr: The maximum number of table entries to consider.
> + *
> + * This is a simplified variant of folio_pte_batch_ext().
> + *
> + * Detect a PTE batch: consecutive (present) PTEs that map consecutive
> + * pages of the same large folio in a single VMA and a single page table.
> + *
> + * All PTEs inside a PTE batch have the same PTE bits set, excluding the PFN,
> + * the accessed bit, writable bit, dirty bit and soft-dirty bit.
> + *
> + * ptep must map any page of the folio. max_nr must be at least one and
> + * must be limited by the caller so scanning cannot exceed a single VMA and
> + * a single page table.
> + *
> + * Return: the number of table entries in the batch.
> + */
> +unsigned int folio_pte_batch(struct folio *folio, pte_t *ptep, pte_t pte,
> +               unsigned int max_nr)
> +{
> +       return folio_pte_batch_ext(folio, ptep, pte, max_nr, 0, NULL, NULL, NULL);
> +}
> +#endif /* CONFIG_MMU */
> --
> 2.49.0
>
>
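As an aside for readers skimming the kernel-doc above: the "same PTE bits,
excluding PFN/accessed/writable/dirty/soft-dirty" rule can be pictured as a
masked comparison over plain integers. A toy sketch, illustrative only -- the
masks and 64-bit "entries" below are invented, not a real PTE layout:

#include <stdint.h>
#include <stdio.h>

/* Bits ignored when comparing: a pfn-like low byte plus two flag bits. */
#define IGNORED_BITS    (0xffULL | (1ULL << 8) | (1ULL << 9))

static unsigned int batch_len(const uint64_t *entries, unsigned int max_nr)
{
        uint64_t expected = entries[0] & ~IGNORED_BITS;
        unsigned int nr = 1;

        /* Entries batch while the masked bits match and the low byte stays
         * consecutive -- loosely analogous to consecutive PTEs of one folio. */
        while (nr < max_nr && (entries[nr] & ~IGNORED_BITS) == expected &&
               (entries[nr] & 0xff) == ((entries[0] + nr) & 0xff))
                nr++;
        return nr;
}

int main(void)
{
        /* Same upper bits, consecutive low bytes, differing flag bits. */
        uint64_t entries[] = { 0xabc000, 0xabc101, 0xabc302, 0xdef003 };

        printf("%u\n", batch_len(entries, 4));  /* prints 3 */
        return 0;
}

Only the comparison rule is modeled here; the real helpers of course walk
actual page-table entries under the PTL.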