Date: Wed, 22 May 2024 12:10:30 -0400
From: Peter Xu
To: David Hildenbrand
Cc: Mikhail Gavrilov, Pavel Tatashin, axelrasmussen@google.com, nadav.amit@gmail.com, Andrew Morton, Linux Memory Management List, Linux List Kernel Mailing
Subject: Re: 6.10/bisected/regression - commit 8430557fc584 cause warning at mm/page_table_check.c:198 __page_table_check_ptes_set+0x306
References: <03faa624-1685-4a21-81fc-cc9e8b760e97@redhat.com>
In-Reply-To: <03faa624-1685-4a21-81fc-cc9e8b760e97@redhat.com>

On Wed, May 22, 2024 at 05:34:21PM +0200, David Hildenbrand wrote:
> On 22.05.24 17:18, Peter Xu wrote:
> > On Wed, May 22, 2024 at 09:48:51AM +0200, David Hildenbrand wrote:
> > > On 22.05.24 00:36, Peter Xu wrote:
> > > > On Wed, May 22, 2024 at 03:21:04AM +0500, Mikhail Gavrilov wrote:
> > > > > On Wed, May 22, 2024 at 2:37 AM Peter Xu wrote:
> > > > > > Hmm I still cannot reproduce. Weird.
> > > > > >
> > > > > > Would it be possible for you to identify which line in debug_vm_pgtable.c
> > > > > > triggered that issue?
> > > > > >
> > > > > > I think it should be some set_pte_at() but I'm not sure, as there aren't a
> > > > > > lot and all of them look benign so far. It could be that I missed
> > > > > > something important.
> > > > >
> > > > > I hope it's helps:
> > > >
> > > > Thanks for offering this, it's just that it doesn't look coherent with what
> > > > was reported for some reason.
> > > >
> > > > >
> > > > > > sh /usr/src/kernels/(uname -r)/scripts/faddr2line /lib/debug/lib/modules/(uname -r)/vmlinux debug_vm_pgtable+0x1c04
> > > > > debug_vm_pgtable+0x1c04/0x3360:
> > > > > native_ptep_get_and_clear at arch/x86/include/asm/pgtable_64.h:94
> > > > > (inlined by) ptep_get_and_clear at arch/x86/include/asm/pgtable.h:1262
> > > > > (inlined by) ptep_clear at include/linux/pgtable.h:509
> > > >
> > > > This is a pte_clear(), and pte_clear() shouldn't even do the set() checks,
> > > > and shouldn't stumble over what I added.
> > > >
> > > > IOW, it doesn't match with the real stack dump previously:
> > > >
> > > > [ 5.581003] ? __page_table_check_ptes_set+0x306/0x3c0
> > > > [ 5.581274] ? __pfx___page_table_check_ptes_set+0x10/0x10
> > > > [ 5.581544] ? __pfx_check_pgprot+0x10/0x10
> > > > [ 5.581806] set_ptes.constprop.0+0x66/0xd0
> > > > [ 5.582072] ? __pfx_set_ptes.constprop.0+0x10/0x10
> > > > [ 5.582333] ? __pfx_pte_val+0x10/0x10
> > > > [ 5.582595] debug_vm_pgtable+0x1c04/0x3360
> > > >
> > >
> > > Staring at pte_clear_tests():
> > >
> > > #ifndef CONFIG_RISCV
> > > 	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
> > > #endif
> > > 	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
> > >
> > > So we set random PTE bits, probably setting the present, uffd and write bit
> > > at the same time. That doesn't make too much sense when we want to perform
> > > that such combinations cannot exist.
> >
> > Here the issue is I don't think it should set W bit anyway, as we init
> > page_prot to be RWX but !shared:
> >
> > 	args->page_prot = vm_get_page_prot(VM_ACCESS_FLAGS);
> >
> > On x86_64 (Mikhail's system) it should have W bit cleared afaict, meanwhile
> > the RANDOM_ORVALUE won't touch bit W due to S390_SKIP_MASK (which contains
> > bit W / bit 1, which is another "accident"..). Then even if with that it
> > should not trigger.. I think that's also why I cannot reproduce this
> > problem locally.
>
> Why oh why are skip mask applied independently of the architecture.
>
> While _PAGE_RW should indeed be masked out by RANDOM_ORVALUE.
>
> But with shadow stacks we consider a PTE writable (see
> pte_write()->pte_shstk()) if
> (1) X86_FEATURE_SHSTK is enabled
> (2) _PAGE_RW is clear
> (3) _PAGE_DIRTY is set
>
> _PAGE_DIRTY is bit 6.
>
> Likely your CPU does not support shadow stacks.

Good point. My host has it, but I tested in a VM, which doesn't.

I suppose we can wait and double check whether Mikhail sees the issue go
away with the patch provided.

In this case, instead of continuing to fiddle with which random bits to
apply, and then doing further work on top of per-arch random bits, I'd hope
we can simply drop that random mechanism, as I don't think the entries will
be pxx_none() now.

I attached a patch I plan to post. Does it look reasonable?

I also copied Anshuman, Gavin and Aneesh.
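
For readers following along, the shadow-stack condition discussed above can be
sketched as a small stand-alone userspace model. This is not kernel code: the
MODEL_* macros and model_* helpers are hypothetical names made up for the
illustration, the bit positions are the ones quoted in this thread (W is
bit 1, _PAGE_DIRTY is bit 6), and the writable check simply follows the three
conditions David listed rather than the actual pte_write() implementation.

```c
/*
 * Stand-alone userspace model, NOT kernel code. Bit positions follow the
 * thread (W is bit 1, _PAGE_DIRTY is bit 6); every name is hypothetical.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 64-bit mask with bits h..l set, in the spirit of the kernel's GENMASK() */
#define MODEL_GENMASK64(h, l) \
	((~UINT64_C(0) >> (63 - (h))) & (~UINT64_C(0) << (l)))

#define MODEL_PAGE_RW		(UINT64_C(1) << 1)
#define MODEL_PAGE_DIRTY	(UINT64_C(1) << 6)

/* Mirrors the masks removed by the patch, assuming a 64-bit build */
#define MODEL_S390_SKIP_MASK	MODEL_GENMASK64(3, 0)
#define MODEL_PPC64_SKIP_MASK	MODEL_GENMASK64(62, 62)
#define MODEL_RANDOM_ORVALUE \
	(MODEL_GENMASK64(63, 0) & \
	 ~(MODEL_S390_SKIP_MASK | MODEL_PPC64_SKIP_MASK))

/* Conditions (1)-(3) from the thread: feature on, RW clear, DIRTY set */
bool model_pte_shstk(uint64_t pte, bool shstk_enabled)
{
	return shstk_enabled &&
	       (pte & (MODEL_PAGE_RW | MODEL_PAGE_DIRTY)) == MODEL_PAGE_DIRTY;
}

/* Writable if RW is set, or if the entry looks like a shadow-stack entry */
bool model_pte_write(uint64_t pte, bool shstk_enabled)
{
	return (pte & MODEL_PAGE_RW) || model_pte_shstk(pte, shstk_enabled);
}

/* What the old tests did before calling set_pte_at() */
uint64_t model_apply_random_orvalue(uint64_t pte)
{
	/* bits 0..3 are skipped, so the W bit is never set by the OR */
	assert((MODEL_RANDOM_ORVALUE & MODEL_PAGE_RW) == 0);
	return pte | MODEL_RANDOM_ORVALUE;
}
```

Starting from an entry with W clear, model_apply_random_orvalue() leaves bit 1
alone but sets bit 6, so model_pte_write() reports the entry as writable only
when shstk_enabled is true. That would be consistent with the warning showing
up on a shadow-stack-capable host but not in a VM without the feature.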

Thanks,

===8<===
From c10cde00b14d2d305390dd418a8a8855d3e6437f Mon Sep 17 00:00:00 2001
From: Peter Xu
Date: Wed, 22 May 2024 12:04:33 -0400
Subject: [PATCH] drop RANDOM_ORVALUE bits

Signed-off-by: Peter Xu
---
 mm/debug_vm_pgtable.c | 30 ++++--------------------------
 1 file changed, 4 insertions(+), 26 deletions(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index f1c9a2c5abc0..b5d7be05063a 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -40,22 +40,7 @@
  * Please refer Documentation/mm/arch_pgtable_helpers.rst for the semantics
  * expectations that are being validated here. All future changes in here
  * or the documentation need to be in sync.
- *
- * On s390 platform, the lower 4 bits are used to identify given page table
- * entry type. But these bits might affect the ability to clear entries with
- * pxx_clear() because of how dynamic page table folding works on s390. So
- * while loading up the entries do not change the lower 4 bits. It does not
- * have affect any other platform. Also avoid the 62nd bit on ppc64 that is
- * used to mark a pte entry.
  */
-#define S390_SKIP_MASK		GENMASK(3, 0)
-#if __BITS_PER_LONG == 64
-#define PPC64_SKIP_MASK		GENMASK(62, 62)
-#else
-#define PPC64_SKIP_MASK		0x0
-#endif
-#define ARCH_SKIP_MASK (S390_SKIP_MASK | PPC64_SKIP_MASK)
-#define RANDOM_ORVALUE (GENMASK(BITS_PER_LONG - 1, 0) & ~ARCH_SKIP_MASK)
 #define RANDOM_NZVALUE	GENMASK(7, 0)
 
 struct pgtable_debug_args {
@@ -511,8 +496,7 @@ static void __init pud_clear_tests(struct pgtable_debug_args *args)
 		return;
 
 	pr_debug("Validating PUD clear\n");
-	pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
-	WRITE_ONCE(*args->pudp, pud);
+	WARN_ON(pud_none(pud));
 	pud_clear(args->pudp);
 	pud = READ_ONCE(*args->pudp);
 	WARN_ON(!pud_none(pud));
@@ -548,8 +532,7 @@ static void __init p4d_clear_tests(struct pgtable_debug_args *args)
 		return;
 
 	pr_debug("Validating P4D clear\n");
-	p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
-	WRITE_ONCE(*args->p4dp, p4d);
+	WARN_ON(p4d_none(p4d));
 	p4d_clear(args->p4dp);
 	p4d = READ_ONCE(*args->p4dp);
 	WARN_ON(!p4d_none(p4d));
@@ -582,8 +565,7 @@ static void __init pgd_clear_tests(struct pgtable_debug_args *args)
 		return;
 
 	pr_debug("Validating PGD clear\n");
-	pgd = __pgd(pgd_val(pgd) | RANDOM_ORVALUE);
-	WRITE_ONCE(*args->pgdp, pgd);
+	WARN_ON(pgd_none(pgd));
 	pgd_clear(args->pgdp);
 	pgd = READ_ONCE(*args->pgdp);
 	WARN_ON(!pgd_none(pgd));
@@ -634,9 +616,6 @@ static void __init pte_clear_tests(struct pgtable_debug_args *args)
 	if (WARN_ON(!args->ptep))
 		return;
 
-#ifndef CONFIG_RISCV
-	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
-#endif
 	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
 	flush_dcache_page(page);
 	barrier();
@@ -650,8 +629,7 @@ static void __init pmd_clear_tests(struct pgtable_debug_args *args)
 	pmd_t pmd = READ_ONCE(*args->pmdp);
 
 	pr_debug("Validating PMD clear\n");
-	pmd = __pmd(pmd_val(pmd) | RANDOM_ORVALUE);
-	WRITE_ONCE(*args->pmdp, pmd);
+	WARN_ON(pmd_none(pmd));
 	pmd_clear(args->pmdp);
 	pmd = READ_ONCE(*args->pmdp);
 	WARN_ON(!pmd_none(pmd));
--
2.45.0

--
Peter Xu