Date: Wed, 28 Jul 2021 09:53:26 +0200
Message-ID: <20210728095326.Horde.k1npSPaQKh2i7W3XoBsdiQ3@messagerie.c-s.fr>
From: Christophe Leroy
To: Gavin Shan
Cc: shan.gavin@gmail.com, chuhu@redhat.com, akpm@linux-foundation.org, will@kernel.org, catalin.marinas@arm.com, cai@lca.pw, aneesh.kumar@linux.ibm.com, gerald.schaefer@linux.ibm.com, anshuman.khandual@arm.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v4 12/12] mm/debug_vm_pgtable: Fix corrupted page flag
References: <20210727061401.592616-1-gshan@redhat.com> <20210727061401.592616-13-gshan@redhat.com>
In-Reply-To: <20210727061401.592616-13-gshan@redhat.com>

Gavin Shan wrote:

> In page table entry modifying tests, set_xxx_at() are used to populate
> the page table entries. On ARM64, PG_arch_1 (PG_dcache_clean) flag is
> set to the target page flag if execution permission is given. The logic
> exists since commit 4f04d8f00545 ("arm64: MMU definitions"). The page
> flag is kept when the page is free'd to buddy's free area list. However,
> it will trigger page checking failure when it's pulled from the buddy's
> free area list, as the following warning messages indicate.
>
>    BUG: Bad page state in process memhog  pfn:08000
>    page:0000000015c0a628 refcount:0 mapcount:0 \
>        mapping:0000000000000000 index:0x1 pfn:0x8000
>    flags: 0x7ffff8000000800(arch_1|node=0|zone=0|lastcpupid=0xfffff)
>    raw: 07ffff8000000800 dead000000000100 dead000000000122 0000000000000000
>    raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
>    page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag(s) set
>
> This fixes the issue by clearing PG_arch_1 through flush_dcache_page()
> after set_xxx_at() is called. For architectures other than ARM64, the
> unexpected overhead of cache flushing is acceptable.
>
> Signed-off-by: Gavin Shan

Maybe a Fixes: tag would be good to have.

And would it be possible to have this fix as the first patch of the
series, so that it can be applied to stable without applying the whole
series?

Christophe

> ---
>  mm/debug_vm_pgtable.c | 55 +++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 51 insertions(+), 4 deletions(-)
>
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 162ff6329f7b..d2c2d23e542e 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -29,6 +29,8 @@
>  #include
>  #include
>  #include
> +
> +#include
>  #include
>  #include
>
> @@ -119,19 +121,28 @@ static void __init pte_basic_tests(struct pgtable_debug_args *args, int idx)
>
>  static void __init pte_advanced_tests(struct pgtable_debug_args *args)
>  {
> +	struct page *page;
>  	pte_t pte;
>
>  	/*
>  	 * Architectures optimize set_pte_at by avoiding TLB flush.
>  	 * This requires set_pte_at to be not used to update an
>  	 * existing pte entry. Clear pte before we do set_pte_at
> +	 *
> +	 * flush_dcache_page() is called after set_pte_at() to clear
> +	 * PG_arch_1 for the page on ARM64. The page flag isn't cleared
> +	 * when it's released and page allocation check will fail when
> +	 * the page is allocated again. For architectures other than ARM64,
> +	 * the unexpected overhead of cache flushing is acceptable.
>  	 */
> -	if (args->pte_pfn == ULONG_MAX)
> +	page = (args->pte_pfn != ULONG_MAX) ? pfn_to_page(args->pte_pfn) : NULL;
> +	if (!page)
>  		return;
>
>  	pr_debug("Validating PTE advanced\n");
>  	pte = pfn_pte(args->pte_pfn, args->page_prot);
>  	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
> +	flush_dcache_page(page);
>  	ptep_set_wrprotect(args->mm, args->vaddr, args->ptep);
>  	pte = ptep_get(args->ptep);
>  	WARN_ON(pte_write(pte));
> @@ -143,6 +154,7 @@ static void __init pte_advanced_tests(struct pgtable_debug_args *args)
>  	pte = pte_wrprotect(pte);
>  	pte = pte_mkclean(pte);
>  	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
> +	flush_dcache_page(page);
>  	pte = pte_mkwrite(pte);
>  	pte = pte_mkdirty(pte);
>  	ptep_set_access_flags(args->vma, args->vaddr, args->ptep, pte, 1);
> @@ -155,6 +167,7 @@ static void __init pte_advanced_tests(struct pgtable_debug_args *args)
>  	pte = pfn_pte(args->pte_pfn, args->page_prot);
>  	pte = pte_mkyoung(pte);
>  	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
> +	flush_dcache_page(page);
>  	ptep_test_and_clear_young(args->vma, args->vaddr, args->ptep);
>  	pte = ptep_get(args->ptep);
>  	WARN_ON(pte_young(pte));
> @@ -213,15 +226,24 @@ static void __init pmd_basic_tests(struct pgtable_debug_args *args, int idx)
>
>  static void __init pmd_advanced_tests(struct pgtable_debug_args *args)
>  {
> +	struct page *page;
>  	pmd_t pmd;
>  	unsigned long vaddr = args->vaddr;
>
>  	if (!has_transparent_hugepage())
>  		return;
>
> -	if (args->pmd_pfn == ULONG_MAX)
> +	page = (args->pmd_pfn != ULONG_MAX) ? pfn_to_page(args->pmd_pfn) : NULL;
> +	if (!page)
>  		return;
>
> +	/*
> +	 * flush_dcache_page() is called after set_pmd_at() to clear
> +	 * PG_arch_1 for the page on ARM64. The page flag isn't cleared
> +	 * when it's released and page allocation check will fail when
> +	 * the page is allocated again. For architectures other than ARM64,
> +	 * the unexpected overhead of cache flushing is acceptable.
> +	 */
>  	pr_debug("Validating PMD advanced\n");
>  	/* Align the address wrt HPAGE_PMD_SIZE */
>  	vaddr &= HPAGE_PMD_MASK;
> @@ -230,6 +252,7 @@ static void __init pmd_advanced_tests(struct pgtable_debug_args *args)
>
>  	pmd = pfn_pmd(args->pmd_pfn, args->page_prot);
>  	set_pmd_at(args->mm, vaddr, args->pmdp, pmd);
> +	flush_dcache_page(page);
>  	pmdp_set_wrprotect(args->mm, vaddr, args->pmdp);
>  	pmd = READ_ONCE(*args->pmdp);
>  	WARN_ON(pmd_write(pmd));
> @@ -241,6 +264,7 @@ static void __init pmd_advanced_tests(struct pgtable_debug_args *args)
>  	pmd = pmd_wrprotect(pmd);
>  	pmd = pmd_mkclean(pmd);
>  	set_pmd_at(args->mm, vaddr, args->pmdp, pmd);
> +	flush_dcache_page(page);
>  	pmd = pmd_mkwrite(pmd);
>  	pmd = pmd_mkdirty(pmd);
>  	pmdp_set_access_flags(args->vma, vaddr, args->pmdp, pmd, 1);
> @@ -253,6 +277,7 @@ static void __init pmd_advanced_tests(struct pgtable_debug_args *args)
>  	pmd = pmd_mkhuge(pfn_pmd(args->pmd_pfn, args->page_prot));
>  	pmd = pmd_mkyoung(pmd);
>  	set_pmd_at(args->mm, vaddr, args->pmdp, pmd);
> +	flush_dcache_page(page);
>  	pmdp_test_and_clear_young(args->vma, vaddr, args->pmdp);
>  	pmd = READ_ONCE(*args->pmdp);
>  	WARN_ON(pmd_young(pmd));
> @@ -339,21 +364,31 @@ static void __init pud_basic_tests(struct pgtable_debug_args *args, int idx)
>
>  static void __init pud_advanced_tests(struct pgtable_debug_args *args)
>  {
> +	struct page *page;
>  	unsigned long vaddr = args->vaddr;
>  	pud_t pud;
>
>  	if (!has_transparent_hugepage())
>  		return;
>
> -	if (args->pud_pfn == ULONG_MAX)
> +	page = (args->pud_pfn != ULONG_MAX) ? pfn_to_page(args->pud_pfn) : NULL;
> +	if (!page)
>  		return;
>
> +	/*
> +	 * flush_dcache_page() is called after set_pud_at() to clear
> +	 * PG_arch_1 for the page on ARM64. The page flag isn't cleared
> +	 * when it's released and page allocation check will fail when
> +	 * the page is allocated again. For architectures other than ARM64,
> +	 * the unexpected overhead of cache flushing is acceptable.
> +	 */
>  	pr_debug("Validating PUD advanced\n");
>  	/* Align the address wrt HPAGE_PUD_SIZE */
>  	vaddr &= HPAGE_PUD_MASK;
>
>  	pud = pfn_pud(args->pud_pfn, args->page_prot);
>  	set_pud_at(args->mm, vaddr, args->pudp, pud);
> +	flush_dcache_page(page);
>  	pudp_set_wrprotect(args->mm, vaddr, args->pudp);
>  	pud = READ_ONCE(*args->pudp);
>  	WARN_ON(pud_write(pud));
> @@ -367,6 +402,7 @@ static void __init pud_advanced_tests(struct pgtable_debug_args *args)
>  	pud = pud_wrprotect(pud);
>  	pud = pud_mkclean(pud);
>  	set_pud_at(args->mm, vaddr, args->pudp, pud);
> +	flush_dcache_page(page);
>  	pud = pud_mkwrite(pud);
>  	pud = pud_mkdirty(pud);
>  	pudp_set_access_flags(args->vma, vaddr, args->pudp, pud, 1);
> @@ -382,6 +418,7 @@ static void __init pud_advanced_tests(struct pgtable_debug_args *args)
>  	pud = pfn_pud(args->pud_pfn, args->page_prot);
>  	pud = pud_mkyoung(pud);
>  	set_pud_at(args->mm, vaddr, args->pudp, pud);
> +	flush_dcache_page(page);
>  	pudp_test_and_clear_young(args->vma, vaddr, args->pudp);
>  	pud = READ_ONCE(*args->pudp);
>  	WARN_ON(pud_young(pud));
> @@ -594,16 +631,26 @@ static void __init pgd_populate_tests(struct pgtable_debug_args *args) { }
>
>  static void __init pte_clear_tests(struct pgtable_debug_args *args)
>  {
> +	struct page *page;
>  	pte_t pte = pfn_pte(args->pte_pfn, args->page_prot);
>
> -	if (args->pte_pfn == ULONG_MAX)
> +	page = (args->pte_pfn != ULONG_MAX) ? pfn_to_page(args->pte_pfn) : NULL;
> +	if (!page)
>  		return;
>
> +	/*
> +	 * flush_dcache_page() is called after set_pte_at() to clear
> +	 * PG_arch_1 for the page on ARM64. The page flag isn't cleared
> +	 * when it's released and page allocation check will fail when
> +	 * the page is allocated again. For architectures other than ARM64,
> +	 * the unexpected overhead of cache flushing is acceptable.
> +	 */
>  	pr_debug("Validating PTE clear\n");
>  #ifndef CONFIG_RISCV
>  	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
>  #endif
>  	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
> +	flush_dcache_page(page);
>  	barrier();
>  	pte_clear(args->mm, args->vaddr, args->ptep);
>  	pte = ptep_get(args->ptep);
> --
> 2.23.0
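
For readers following the thread, the failure described in the commit message can be sketched as a small userspace toy model. This is not kernel code: struct toy_page, TOY_PG_ARCH_1 and every toy_*() helper are hypothetical stand-ins invented for illustration, and the bit value 0x800 is only assumed from the flags dump quoted above. The model captures the sequence only: setting an executable entry marks the page, freeing keeps the flag, and the allocation-time check rejects the page unless the flag was cleared in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these names mimic the kernel ones but are hypothetical. */
#define TOY_PG_ARCH_1	(1UL << 11)	/* assumed bit, 0x800 as in the quoted dump */

struct toy_page {
	unsigned long flags;
};

/* Stand-in for set_pte_at() on ARM64: an executable mapping marks the page. */
static void toy_set_pte_at(struct toy_page *page, bool exec)
{
	assert(page != NULL);
	if (exec)
		page->flags |= TOY_PG_ARCH_1;
}

/* Stand-in for flush_dcache_page() on ARM64: drops the flag again. */
static void toy_flush_dcache_page(struct toy_page *page)
{
	assert(page != NULL);
	page->flags &= ~TOY_PG_ARCH_1;
}

/* Stand-in for the allocation-time check: true means the page looks sane. */
static bool toy_page_ok_at_prep(const struct toy_page *page)
{
	return (page->flags & TOY_PG_ARCH_1) == 0;
}

/*
 * One test round: map the page, optionally flush, then "free" it (the flag
 * is kept, as the commit message describes) and check it on reallocation.
 */
static bool toy_round_trip(bool flush_after_set)
{
	struct toy_page page = { .flags = 0 };

	toy_set_pte_at(&page, true);
	if (flush_after_set)
		toy_flush_dcache_page(&page);
	/* Freeing to the buddy list does not touch page.flags in this model. */
	return toy_page_ok_at_prep(&page);
}
```

In this model toy_round_trip(false) corresponds to the "Bad page state" case and returns false, while toy_round_trip(true) mirrors the patched tests and returns true.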