References: <20250604141958.111300-1-lorenzo.stoakes@oracle.com>
In-Reply-To: <20250604141958.111300-1-lorenzo.stoakes@oracle.com>
From: Suren Baghdasaryan <surenb@google.com>
Date: Wed, 4 Jun 2025 07:39:41 -0700
Subject: Re: [PATCH v2] mm/pagewalk: split walk_page_range_novma() into kernel/user parts
To: Lorenzo Stoakes
Cc: Andrew Morton, Barry Song, David Hildenbrand,
Howlett" , Vlastimil Babka , Mike Rapoport , Michal Hocko , Muchun Song , Oscar Salvador , Huacai Chen , WANG Xuerui , Jonas Bonn , Stefan Kristiansson , Stafford Horne , Paul Walmsley , Palmer Dabbelt , Albert Ou , Alexandre Ghiti , Jann Horn , loongarch@lists.linux.dev, linux-kernel@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-riscv@lists.infradead.org, linux-mm@kvack.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 6D3FB2000A X-Stat-Signature: 8oqdas3oj58x5mhe1mbztbtzo7hyxe7r X-Rspam-User: X-HE-Tag: 1749047998-203459 X-HE-Meta: U2FsdGVkX1/xLWDz5xpjNKcv847S+xZo+gKOMlWVWoF1E6WA/APTeA7F/NuSfbiAtWF+p6QvLHn74cPK5tEyrxsb2SvKe2nyk+BFt5s/fCNZh4jU6ESxJndotSzpwMjnkI64Lu93MKWeSztpWzV88m6vWAtiYcvWZQ3gwWNs1O9w97kyF/JhUl+agADxn3n0WCK9g6gt0l8ek4RumROT2NyQ1A4Ti7xOm7VGXNwn0tp2JSg28FExKLwwRLH96ZApS4XmV3hDu1C2HbIrrwiHbVz6I2c7KO6g644Lab+S4TzXaeKAypcXCh2CEkujR6weLsLRlbUfd6nGZeHTiox4XZ3el88clokzi0chUnTTReGYMn+lfCYpPVXxrt5b7vniYF+eFIFRYe363PIYleynCOTrCVvLXBWXMk5a4bl3s5lnRme7osSHTTjt1lsFsKFeGBZfL1P0OIYu85THF0nrZj1HSWDLxv8DE/F7cB/LMlR35Ll3GNsZYDdECR07gwa9CNY3FpMcKCk02mBDxA8oTcczLq7z/5wbi19h7Gf//CaNAyk9dIrZdmNS5edx+ZZAXYbgvPAiKFPlAZwcosYyl8pvZt1/8oxu/ru+z1EGafrByGAGolSRIiMrPPjJkfXip0McgnTmx3hkFa12qKyoYIOk/HsFhISxs3k777XlKN50env+Ytnm/kw+U9PD34RsuZtVqZT8+LHrkK8THLl54BXCtjf7ncoDXURHTbxMp8Kovu+PO2ibmS1z9TE1M8yn8nJKVpKAq9+9aYPsMBEOs6nYWHYZY8tPWehF2ZwI4uqHZ51K0LaokxTVO42r6x+3sSUUbxWy5vvnRB47wn7CGmwknL/8FSx3Hh77+nhWs78u2y31eVfaTdTPtPbVTDu4giiEagN0IgB5kY10erw+4X6DylhLBeCFyzGXw7tyaGtsZRj7GVRp3hs6LRRvmMHW89O2T8jVzI6Z3WICuHz Re35ej00 ya9HXJMRor9zxw8jgaHgx0IMtri5Moy4UNc+Bv3VS09oc8+SwKgjhMsDEkBvVCa/935GkRjpr4oW9Kt/wvgktCj2Ga/BKGYOSi7s+MrdbAjCtzARimA1dpiCxL5gutiMxLz3w9K1HecZd5BSG6IdTBngrWZkVEGq6Rxd6k70N7Q6L+j4uWRZJvaokK5JTxDQ/F83gWo95rxqGwantcWEhCPm76RFQ/jsBbXiapXToRo0uzV1c0gKZ/F1UlrSjxibZdJyMZNYIGbhW3I9Km1SWwxHy6I1t2zmgfesWYmWxacTlHbF8vU8DTu1c9g== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, Jun 4, 2025 at 7:21=E2=80=AFAM Lorenzo Stoakes wrote: > > The walk_page_range_novma() function is rather confusing - it supports tw= o > modes, one used often, the other used only for debugging. > > The first mode is the common case of traversal of kernel page tables, whi= ch > is what nearly all callers use this for. > > Secondly it provides an unusual debugging interface that allows for the > traversal of page tables in a userland range of memory even for that memo= ry > which is not described by a VMA. > > It is far from certain that such page tables should even exist, but perha= ps > this is precisely why it is useful as a debugging mechanism. > > As a result, this is utilised by ptdump only. Historically, things were > reversed - ptdump was the only user, and other parts of the kernel evolve= d > to use the kernel page table walking here. > > Since we have some complicated and confusing locking rules for the novma > case, it makes sense to separate the two usages into their own functions. > > Doing this also provide self-documentation as to the intent of the caller= - > are they doing something rather unusual or are they simply doing a standa= rd > kernel page table walk? > > We therefore establish two separate functions - walk_page_range_debug() f= or > this single usage, and walk_kernel_page_table_range() for general kernel > page table walking. 
>
> We additionally make walk_page_range_debug() internal to mm.
>
> Note that ptdump uses precisely the same function for kernel walking as a
> convenience, so we permit this but make it very explicit by having
> walk_page_range_debug() invoke walk_kernel_page_table_range() in this case.
>
> Signed-off-by: Lorenzo Stoakes
> Acked-by: Mike Rapoport (Microsoft)

Reviewed-by: Suren Baghdasaryan

> ---
> v2:
> * Renamed walk_page_range_novma() to walk_page_range_debug() as per David.
> * Moved walk_page_range_debug() definition to mm/internal.h as per Mike.
> * Renamed walk_page_range_kernel() to walk_kernel_page_table_range() as
>   per David.
>
> v1 resend:
> * Actually cc'd lists...
> * Fixed mistake in walk_page_range_novma() not handling kernel mappings and
>   updated the commit message to reference this.
> * Added Mike's off-list Acked-by.
> * Fixed up comments as per Mike.
> * Added some historic flavour to the commit message as per Mike.
> https://lore.kernel.org/all/20250603192213.182931-1-lorenzo.stoakes@oracle.com/
>
> v1:
> (accidentally sent off-list due to error in scripting)
>
>  arch/loongarch/mm/pageattr.c |  2 +-
>  arch/openrisc/kernel/dma.c   |  4 +-
>  arch/riscv/mm/pageattr.c     |  8 +--
>  include/linux/pagewalk.h     |  7 ++-
>  mm/hugetlb_vmemmap.c         |  2 +-
>  mm/internal.h                |  4 ++
>  mm/pagewalk.c                | 98 ++++++++++++++++++++++++------------
>  mm/ptdump.c                  |  3 +-
>  8 files changed, 82 insertions(+), 46 deletions(-)
>
> diff --git a/arch/loongarch/mm/pageattr.c b/arch/loongarch/mm/pageattr.c
> index 99165903908a..f5e910b68229 100644
> --- a/arch/loongarch/mm/pageattr.c
> +++ b/arch/loongarch/mm/pageattr.c
> @@ -118,7 +118,7 @@ static int __set_memory(unsigned long addr, int numpages, pgprot_t set_mask, pgp
>                 return 0;
>
>         mmap_write_lock(&init_mm);
> -       ret = walk_page_range_novma(&init_mm, start, end, &pageattr_ops, NULL, &masks);
> +       ret = walk_kernel_page_table_range(start, end, &pageattr_ops, NULL, &masks);
>         mmap_write_unlock(&init_mm);
>
>         flush_tlb_kernel_range(start, end);
> diff --git a/arch/openrisc/kernel/dma.c b/arch/openrisc/kernel/dma.c
> index 3a7b5baaa450..af932a4ad306 100644
> --- a/arch/openrisc/kernel/dma.c
> +++ b/arch/openrisc/kernel/dma.c
> @@ -72,7 +72,7 @@ void *arch_dma_set_uncached(void *cpu_addr, size_t size)
>          * them and setting the cache-inhibit bit.
>          */
>         mmap_write_lock(&init_mm);
> -       error = walk_page_range_novma(&init_mm, va, va + size,
> +       error = walk_kernel_page_table_range(va, va + size,
>                         &set_nocache_walk_ops, NULL, NULL);
>         mmap_write_unlock(&init_mm);
>
> @@ -87,7 +87,7 @@ void arch_dma_clear_uncached(void *cpu_addr, size_t size)
>
>         mmap_write_lock(&init_mm);
>         /* walk_page_range shouldn't be able to fail here */
> -       WARN_ON(walk_page_range_novma(&init_mm, va, va + size,
> +       WARN_ON(walk_kernel_page_table_range(va, va + size,
>                         &clear_nocache_walk_ops, NULL, NULL));
>         mmap_write_unlock(&init_mm);
>  }
> diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c
> index d815448758a1..3f76db3d2769 100644
> --- a/arch/riscv/mm/pageattr.c
> +++ b/arch/riscv/mm/pageattr.c
> @@ -299,7 +299,7 @@ static int __set_memory(unsigned long addr, int numpages, pgprot_t set_mask,
>                 if (ret)
>                         goto unlock;
>
> -               ret = walk_page_range_novma(&init_mm, lm_start, lm_end,
> +               ret = walk_kernel_page_table_range(lm_start, lm_end,
>                                             &pageattr_ops, NULL, &masks);
>                 if (ret)
>                         goto unlock;
> @@ -317,13 +317,13 @@ static int __set_memory(unsigned long addr, int numpages, pgprot_t set_mask,
>                 if (ret)
>                         goto unlock;
>
> -               ret = walk_page_range_novma(&init_mm, lm_start, lm_end,
> +               ret = walk_kernel_page_table_range(lm_start, lm_end,
>                                             &pageattr_ops, NULL, &masks);
>                 if (ret)
>                         goto unlock;
>         }
>
> -       ret = walk_page_range_novma(&init_mm, start, end, &pageattr_ops, NULL,
> +       ret = walk_kernel_page_table_range(start, end, &pageattr_ops, NULL,
>                                     &masks);
>
> unlock:
> @@ -335,7 +335,7 @@ static int __set_memory(unsigned long addr, int numpages, pgprot_t set_mask,
>          */
>         flush_tlb_all();
>  #else
> -       ret = walk_page_range_novma(&init_mm, start, end, &pageattr_ops, NULL,
> +       ret = walk_kernel_page_table_range(start, end, &pageattr_ops, NULL,
>                                     &masks);
>
>         mmap_write_unlock(&init_mm);
> diff --git a/include/linux/pagewalk.h b/include/linux/pagewalk.h
> index 9700a29f8afb..8ac2f6d6d2a3 100644
> --- a/include/linux/pagewalk.h
> +++ b/include/linux/pagewalk.h
> @@ -129,10 +129,9 @@ struct mm_walk {
>  int walk_page_range(struct mm_struct *mm, unsigned long start,
>                     unsigned long end, const struct mm_walk_ops *ops,
>                     void *private);
> -int walk_page_range_novma(struct mm_struct *mm, unsigned long start,
> -                         unsigned long end, const struct mm_walk_ops *ops,
> -                         pgd_t *pgd,
> -                         void *private);
> +int walk_kernel_page_table_range(unsigned long start,
> +               unsigned long end, const struct mm_walk_ops *ops,
> +               pgd_t *pgd, void *private);
>  int walk_page_range_vma(struct vm_area_struct *vma, unsigned long start,
>                         unsigned long end, const struct mm_walk_ops *ops,
>                         void *private);
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index 27245e86df25..ba0fb1b6a5a8 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -166,7 +166,7 @@ static int vmemmap_remap_range(unsigned long start, unsigned long end,
>         VM_BUG_ON(!PAGE_ALIGNED(start | end));
>
>         mmap_read_lock(&init_mm);
> -       ret = walk_page_range_novma(&init_mm, start, end, &vmemmap_remap_ops,
> +       ret = walk_kernel_page_table_range(start, end, &vmemmap_remap_ops,
>                                     NULL, walk);
>         mmap_read_unlock(&init_mm);
>         if (ret)
> diff --git a/mm/internal.h b/mm/internal.h
> index 6b8ed2017743..43788d0de6e3 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1605,6 +1605,10 @@ static inline void accept_page(struct page *page)
>  int walk_page_range_mm(struct mm_struct *mm, unsigned long start,
>                        unsigned long end, const struct mm_walk_ops *ops,
>                        void *private);
> +int walk_page_range_debug(struct mm_struct *mm, unsigned long start,
> +                         unsigned long end, const struct mm_walk_ops *ops,
> +                         pgd_t *pgd,
> +                         void *private);
>
>  /* pt_reclaim.c */
>  bool try_get_and_clear_pmd(struct mm_struct *mm, pmd_t *pmd, pmd_t *pmdval);
> diff --git a/mm/pagewalk.c b/mm/pagewalk.c
> index e478777c86e1..057a125c3bc0 100644
> --- a/mm/pagewalk.c
> +++ b/mm/pagewalk.c
> @@ -584,9 +584,28 @@ int walk_page_range(struct mm_struct *mm, unsigned long start,
>         return walk_page_range_mm(mm, start, end, ops, private);
>  }
>
> +static int __walk_page_range_novma(struct mm_struct *mm, unsigned long start,
> +               unsigned long end, const struct mm_walk_ops *ops,
> +               pgd_t *pgd, void *private)
> +{
> +       struct mm_walk walk = {
> +               .ops            = ops,
> +               .mm             = mm,
> +               .pgd            = pgd,
> +               .private        = private,
> +               .no_vma         = true
> +       };
> +
> +       if (start >= end || !walk.mm)
> +               return -EINVAL;
> +       if (!check_ops_valid(ops))
> +               return -EINVAL;
> +
> +       return walk_pgd_range(start, end, &walk);
> +}
> +
>  /**
> - * walk_page_range_novma - walk a range of pagetables not backed by a vma
> - * @mm:         mm_struct representing the target process of page table walk
> + * walk_kernel_page_table_range - walk a range of kernel pagetables.
>   * @start:      start address of the virtual address range
>   * @end:        end address of the virtual address range
>   * @ops:        operation to call during the walk
> @@ -596,56 +615,69 @@ int walk_page_range(struct mm_struct *mm, unsigned long start,
>   * Similar to walk_page_range() but can walk any page tables even if they are
>   * not backed by VMAs. Because 'unusual' entries may be walked this function
>   * will also not lock the PTEs for the pte_entry() callback. This is useful for
> - * walking the kernel pages tables or page tables for firmware.
> + * walking kernel pages tables or page tables for firmware.
>   *
>   * Note: Be careful to walk the kernel pages tables, the caller may be need to
>   * take other effective approaches (mmap lock may be insufficient) to prevent
>   * the intermediate kernel page tables belonging to the specified address range
>   * from being freed (e.g. memory hot-remove).
>   */
> -int walk_page_range_novma(struct mm_struct *mm, unsigned long start,
> +int walk_kernel_page_table_range(unsigned long start, unsigned long end,
> +       const struct mm_walk_ops *ops, pgd_t *pgd, void *private)
> +{
> +       struct mm_struct *mm = &init_mm;
> +
> +       /*
> +        * Kernel intermediate page tables are usually not freed, so the mmap
> +        * read lock is sufficient. But there are some exceptions.
> +        * E.g. memory hot-remove. In which case, the mmap lock is insufficient
> +        * to prevent the intermediate kernel pages tables belonging to the
> +        * specified address range from being freed. The caller should take
> +        * other actions to prevent this race.
> +        */
> +       mmap_assert_locked(mm);
> +
> +       return __walk_page_range_novma(mm, start, end, ops, pgd, private);
> +}
> +
> +/**
> + * walk_page_range_debug - walk a range of pagetables not backed by a vma
> + * @mm:         mm_struct representing the target process of page table walk
> + * @start:      start address of the virtual address range
> + * @end:        end address of the virtual address range
> + * @ops:        operation to call during the walk
> + * @pgd:        pgd to walk if different from mm->pgd
> + * @private:    private data for callbacks' usage
> + *
> + * Similar to walk_page_range() but can walk any page tables even if they are
> + * not backed by VMAs. Because 'unusual' entries may be walked this function
> + * will also not lock the PTEs for the pte_entry() callback.
> + *
> + * This is for debugging purposes ONLY.
> + */
> +int walk_page_range_debug(struct mm_struct *mm, unsigned long start,
>                           unsigned long end, const struct mm_walk_ops *ops,
>                           pgd_t *pgd,
>                           void *private)
>  {
> -       struct mm_walk walk = {
> -               .ops            = ops,
> -               .mm             = mm,
> -               .pgd            = pgd,
> -               .private        = private,
> -               .no_vma         = true
> -       };
> -
> -       if (start >= end || !walk.mm)
> -               return -EINVAL;
> -       if (!check_ops_valid(ops))
> -               return -EINVAL;
> +       /*
> +        * For convenience, we allow this function to also traverse kernel
> +        * mappings.
> +        */
> +       if (mm == &init_mm)
> +               return walk_kernel_page_table_range(start, end, ops, pgd, private);
>
>         /*
> -        * 1) For walking the user virtual address space:
> -        *
>          * The mmap lock protects the page walker from changes to the page
>          * tables during the walk. However a read lock is insufficient to
>          * protect those areas which don't have a VMA as munmap() detaches
>          * the VMAs before downgrading to a read lock and actually tearing
>          * down PTEs/page tables. In which case, the mmap write lock should
> -        * be hold.
> -        *
> -        * 2) For walking the kernel virtual address space:
> -        *
> -        * The kernel intermediate page tables usually do not be freed, so
> -        * the mmap map read lock is sufficient. But there are some exceptions.
> -        * E.g. memory hot-remove. In which case, the mmap lock is insufficient
> -        * to prevent the intermediate kernel pages tables belonging to the
> -        * specified address range from being freed. The caller should take
> -        * other actions to prevent this race.
> +        * be held.
>          */
> -       if (mm == &init_mm)
> -               mmap_assert_locked(walk.mm);
> -       else
> -               mmap_assert_write_locked(walk.mm);
> +       mmap_assert_write_locked(mm);
>
> -       return walk_pgd_range(start, end, &walk);
> +       return __walk_page_range_novma(mm, start, end, ops, pgd, private);
>  }
>
>  int walk_page_range_vma(struct vm_area_struct *vma, unsigned long start,
> diff --git a/mm/ptdump.c b/mm/ptdump.c
> index 9374f29cdc6f..61a352aa12ed 100644
> --- a/mm/ptdump.c
> +++ b/mm/ptdump.c
> @@ -4,6 +4,7 @@
>  #include
>  #include
>  #include
> +#include "internal.h"
>
>  #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
>  /*
> @@ -177,7 +178,7 @@ void ptdump_walk_pgd(struct ptdump_state *st, struct mm_struct *mm, pgd_t *pgd)
>
>         mmap_write_lock(mm);
>         while (range->start != range->end) {
> -               walk_page_range_novma(mm, range->start, range->end,
> +               walk_page_range_debug(mm, range->start, range->end,
>                                       &ptdump_ops, pgd, st);
>                 range++;
>         }
> --
> 2.49.0
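
For anyone skimming the thread, the split boils down to roughly the following
call pattern. This is a condensed sketch only, not part of the patch; the
example_* helpers, their arguments and the lock choices simply mirror the
hunks above (walk_kernel_page_table_range() is declared in
<linux/pagewalk.h>, walk_page_range_debug() only in mm/internal.h, i.e. it is
mm-internal and used by ptdump):

  /* Common case: walk kernel page tables (pageattr/dma/vmemmap style). */
  static int example_kernel_walk(unsigned long start, unsigned long end,
                                 const struct mm_walk_ops *ops, void *priv)
  {
          int ret;

          /*
           * The caller holds the init_mm mmap lock; a read lock is enough
           * for the assert in walk_kernel_page_table_range(), the pageattr
           * and dma callers above take the write lock.
           */
          mmap_write_lock(&init_mm);
          ret = walk_kernel_page_table_range(start, end, ops, NULL, priv);
          mmap_write_unlock(&init_mm);

          return ret;
  }

  /*
   * Debug-only case (ptdump): may walk user ranges not backed by a VMA, so
   * the mmap write lock is required; it falls through to the kernel walk
   * when mm == &init_mm.
   */
  static int example_debug_walk(struct mm_struct *mm, unsigned long start,
                                unsigned long end,
                                const struct mm_walk_ops *ops, pgd_t *pgd,
                                void *priv)
  {
          int ret;

          mmap_write_lock(mm);
          ret = walk_page_range_debug(mm, start, end, ops, pgd, priv);
          mmap_write_unlock(mm);

          return ret;
  }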