Date: Wed, 22 May 2024 12:13:45 -0400
From: Peter Xu
To: David Hildenbrand, "Aneesh Kumar K.V", Gavin Shan, Anshuman Khandual
Cc: Mikhail Gavrilov, Pavel Tatashin, axelrasmussen@google.com, nadav.amit@gmail.com, Andrew Morton, Linux Memory Management List, Linux List Kernel Mailing
Subject: Re: 6.10/bisected/regression - commit 8430557fc584 cause warning at mm/page_table_check.c:198 __page_table_check_ptes_set+0x306
References: <03faa624-1685-4a21-81fc-cc9e8b760e97@redhat.com>

On Wed, May 22, 2024 at 12:10:30PM -0400, Peter Xu wrote:
> On Wed, May 22, 2024 at 05:34:21PM +0200, David Hildenbrand wrote:
> > On 22.05.24 17:18, Peter Xu wrote:
> > > On Wed, May 22, 2024 at 09:48:51AM +0200, David Hildenbrand wrote:
> > > > On 22.05.24 00:36, Peter Xu wrote:
> > > > > On Wed, May 22, 2024 at 03:21:04AM +0500, Mikhail Gavrilov wrote:
> > > > > > On Wed, May 22, 2024 at 2:37 AM Peter Xu wrote:
> > > > > > > Hmm I still cannot reproduce. Weird.
> > > > > > >
> > > > > > > Would it be possible for you to identify which line in debug_vm_pgtable.c
> > > > > > > triggered that issue?
> > > > > > >
> > > > > > > I think it should be some set_pte_at() but I'm not sure, as there aren't a
> > > > > > > lot and all of them look benign so far. It could be that I missed
> > > > > > > something important.
> > > > > >
> > > > > > I hope this helps:
> > > > >
> > > > > Thanks for offering this; it's just that, for some reason, it doesn't look
> > > > > consistent with what was reported.
> > > > >
> > > > > >
> > > > > > > sh /usr/src/kernels/(uname -r)/scripts/faddr2line /lib/debug/lib/modules/(uname -r)/vmlinux debug_vm_pgtable+0x1c04
> > > > > > debug_vm_pgtable+0x1c04/0x3360:
> > > > > > native_ptep_get_and_clear at arch/x86/include/asm/pgtable_64.h:94
> > > > > >  (inlined by) ptep_get_and_clear at arch/x86/include/asm/pgtable.h:1262
> > > > > >  (inlined by) ptep_clear at include/linux/pgtable.h:509
> > > > >
> > > > > This is a pte_clear(), and pte_clear() shouldn't even do the set() checks,
> > > > > and shouldn't stumble over what I added.
> > > > >
> > > > > IOW, it doesn't match the real stack dump reported previously:
> > > > >
> > > > > [    5.581003]  ? __page_table_check_ptes_set+0x306/0x3c0
> > > > > [    5.581274]  ? __pfx___page_table_check_ptes_set+0x10/0x10
> > > > > [    5.581544]  ? __pfx_check_pgprot+0x10/0x10
> > > > > [    5.581806]  set_ptes.constprop.0+0x66/0xd0
> > > > > [    5.582072]  ? __pfx_set_ptes.constprop.0+0x10/0x10
> > > > > [    5.582333]  ? __pfx_pte_val+0x10/0x10
> > > > > [    5.582595]  debug_vm_pgtable+0x1c04/0x3360
> > > > >
> > > >
> > > > Staring at pte_clear_tests():
> > > >
> > > > #ifndef CONFIG_RISCV
> > > > 	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
> > > > #endif
> > > > 	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
> > > >
> > > > So we set random PTE bits, probably setting the present, uffd and write bits
> > > > at the same time. That doesn't make too much sense when we want to check
> > > > that such combinations cannot exist.
> > >
> > > Here the issue is that I don't think it should set the W bit anyway, as we
> > > init page_prot to be RWX but !shared:
> > >
> > > 	args->page_prot = vm_get_page_prot(VM_ACCESS_FLAGS);
> > >
> > > On x86_64 (Mikhail's system) it should have the W bit cleared afaict, and
> > > RANDOM_ORVALUE won't touch bit W due to S390_SKIP_MASK (which contains
> > > bit W / bit 1, which is another "accident"..). So even with that it
> > > should not trigger.. I think that's also why I cannot reproduce this
> > > problem locally.
> >
> > Why oh why are the skip masks applied independently of the architecture?
> >
> > _PAGE_RW should indeed be masked out of RANDOM_ORVALUE.
> >
> > But with shadow stacks we consider a PTE writable (see
> > pte_write()->pte_shstk()) if
> > (1) X86_FEATURE_SHSTK is enabled
> > (2) _PAGE_RW is clear
> > (3) _PAGE_DIRTY is set
> >
> > _PAGE_DIRTY is bit 6.
> >
> > Likely your CPU does not support shadow stacks.
>
> Good point. My host has it, but I tested in a VM, which doesn't. I
> suppose we can wait and double check whether Mikhail sees the issue
> go away with the patch provided.
>
> In this case, instead of continuing to fiddle with which random bits to
> apply and doing further work on top of per-arch random bits, I'd hope we can
> simply drop that random mechanism, as I don't think the entries will be
> pxx_none() at that point anyway. I attached a patch I plan to post. Does it
> look reasonable?
>
> I also copied Anshuman, Gavin and Aneesh.

No I didn't..
this one will..

>
> Thanks,
>
> ===8<===
> From c10cde00b14d2d305390dd418a8a8855d3e6437f Mon Sep 17 00:00:00 2001
> From: Peter Xu
> Date: Wed, 22 May 2024 12:04:33 -0400
> Subject: [PATCH] drop RANDOM_ORVALUE bits
>
> Signed-off-by: Peter Xu
> ---
>  mm/debug_vm_pgtable.c | 30 ++++--------------------------
>  1 file changed, 4 insertions(+), 26 deletions(-)
>
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index f1c9a2c5abc0..b5d7be05063a 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -40,22 +40,7 @@
>   * Please refer Documentation/mm/arch_pgtable_helpers.rst for the semantics
>   * expectations that are being validated here. All future changes in here
>   * or the documentation need to be in sync.
> - *
> - * On s390 platform, the lower 4 bits are used to identify given page table
> - * entry type. But these bits might affect the ability to clear entries with
> - * pxx_clear() because of how dynamic page table folding works on s390. So
> - * while loading up the entries do not change the lower 4 bits. It does not
> - * have affect any other platform. Also avoid the 62nd bit on ppc64 that is
> - * used to mark a pte entry.
>   */
> -#define S390_SKIP_MASK GENMASK(3, 0)
> -#if __BITS_PER_LONG == 64
> -#define PPC64_SKIP_MASK GENMASK(62, 62)
> -#else
> -#define PPC64_SKIP_MASK 0x0
> -#endif
> -#define ARCH_SKIP_MASK (S390_SKIP_MASK | PPC64_SKIP_MASK)
> -#define RANDOM_ORVALUE (GENMASK(BITS_PER_LONG - 1, 0) & ~ARCH_SKIP_MASK)
>  #define RANDOM_NZVALUE GENMASK(7, 0)
>
>  struct pgtable_debug_args {
> @@ -511,8 +496,7 @@ static void __init pud_clear_tests(struct pgtable_debug_args *args)
>  		return;
>
>  	pr_debug("Validating PUD clear\n");
> -	pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
> -	WRITE_ONCE(*args->pudp, pud);
> +	WARN_ON(pud_none(pud));
>  	pud_clear(args->pudp);
>  	pud = READ_ONCE(*args->pudp);
>  	WARN_ON(!pud_none(pud));
> @@ -548,8 +532,7 @@ static void __init p4d_clear_tests(struct pgtable_debug_args *args)
>  		return;
>
>  	pr_debug("Validating P4D clear\n");
> -	p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
> -	WRITE_ONCE(*args->p4dp, p4d);
> +	WARN_ON(p4d_none(p4d));
>  	p4d_clear(args->p4dp);
>  	p4d = READ_ONCE(*args->p4dp);
>  	WARN_ON(!p4d_none(p4d));
> @@ -582,8 +565,7 @@ static void __init pgd_clear_tests(struct pgtable_debug_args *args)
>  		return;
>
>  	pr_debug("Validating PGD clear\n");
> -	pgd = __pgd(pgd_val(pgd) | RANDOM_ORVALUE);
> -	WRITE_ONCE(*args->pgdp, pgd);
> +	WARN_ON(pgd_none(pgd));
>  	pgd_clear(args->pgdp);
>  	pgd = READ_ONCE(*args->pgdp);
>  	WARN_ON(!pgd_none(pgd));
> @@ -634,9 +616,6 @@ static void __init pte_clear_tests(struct pgtable_debug_args *args)
>  	if (WARN_ON(!args->ptep))
>  		return;
>
> -#ifndef CONFIG_RISCV
> -	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
> -#endif
>  	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
>  	flush_dcache_page(page);
>  	barrier();
> @@ -650,8 +629,7 @@ static void __init pmd_clear_tests(struct pgtable_debug_args *args)
>  	pmd_t pmd = READ_ONCE(*args->pmdp);
>
>  	pr_debug("Validating PMD clear\n");
> -	pmd = __pmd(pmd_val(pmd) | RANDOM_ORVALUE);
> -	WRITE_ONCE(*args->pmdp, pmd);
> +	WARN_ON(pmd_none(pmd));
>  	pmd_clear(args->pmdp);
>  	pmd = READ_ONCE(*args->pmdp);
>  	WARN_ON(!pmd_none(pmd));
> --
> 2.45.0
>
> --
> Peter Xu

--
Peter Xu