linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Ryan Roberts <ryan.roberts@arm.com>
To: "Russell King (Oracle)" <linux@armlinux.org.uk>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>,
	kernel test robot <lkp@intel.com>,
	linux-mm@kvack.org, llvm@lists.linux.dev,
	oe-kbuild-all@lists.linux.dev,
	Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@redhat.com>,
	"Mike Rapoport (IBM)" <rppt@kernel.org>,
	Arnd Bergmann <arnd@arndb.de>,
	x86@kernel.org, linux-m68k@lists.linux-m68k.org,
	linux-fsdevel@vger.kernel.org, kasan-dev@googlegroups.com,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Dimitri Sivanich <dimitri.sivanich@hpe.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Muchun Song <muchun.song@linux.dev>,
	Andrey Ryabinin <ryabinin.a.a@gmail.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Dennis Zhou <dennis@kernel.org>, Tejun Heo <tj@kernel.org>,
	Christoph Lameter <cl@linux-foundation.org>,
	Uladzislau Rezki <urezki@gmail.com>,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries
Date: Mon, 23 Sep 2024 16:21:18 +0100	[thread overview]
Message-ID: <ebf8d9c6-867d-4e50-9e98-5d7f854278d8@arm.com> (raw)
In-Reply-To: <Zu1EwTItDrnkTVTB@shell.armlinux.org.uk>

>> Let's just rewind a bit. This thread exists because the kernel test robot failed
>> to compile pgd_none_or_clear_bad() (a core-mm function) for the arm architecture
>> after Anshuman changed the direct pgd dereference to pgdp_get(). The reason
>> compilation failed is because arm defines its own pgdp_get() override, but it is
>> broken (there is a typo).
> 
> Let's not rewind, because had you fully read and digested my reply, you
> would have seen why this isn't a problem... but let me spell it out.
> 
>>
>> Code before Anshuman's change:
>>
>> static inline int pgd_none_or_clear_bad(pgd_t *pgd)
>> {
>> 	if (pgd_none(*pgd))
>> 		return 1;
>> 	if (unlikely(pgd_bad(*pgd))) {
>> 		pgd_clear_bad(pgd);
>> 		return 1;
>> 	}
>> 	return 0;
>> }
> 
> This isn't a problem as the code stands. While there is a dereference
> in C, that dereference is a simple struct copy, something that we use
> everywhere in the kernel. However, that is as far as it goes, because
> neither pgd_none() and pgd_bad() make use of their argument, and thus
> the compiler will optimise it away, resulting in no actual access to
> the page tables - _as_ _intended_.

Right. Are you saying you depend upon those loads being optimized away for
correctness or performance reasons?

> 
> If these are going to be converted to pgd_get(), then we need pgd_get()
> to _also_ be optimised away, 

OK, agreed.

So perhaps the best approach is to modify the existing default pxdp_get()
implementations to just do a C dereference. That will ensure that there are no
intended consequences, unlike moving to READ_ONCE() by default. Then riscv
(which I think is the only arch to actually use pxdp_get() currently?) will need
its own pxdp_get() overrides, which use READ_ONCE(). arm64 would also define its
own overrides in terms of READ_ONCE() to ensure single copy atomicity in the
presence of HW updates.

How does that sound to you?

> and if e.g. this is the only place that
> pgd_get() is going to be used, the suggestion I made in my previous
> email is entirely reasonable, since we know that the result of pgd_get()
> will not actually be used.

I guess you could do that as an arm-specific override, but I don't think it adds
anything over using my proposed reworked default? Your call.

> 
>> As an aside, the kernel also dereferences p4d, pud, pmd and pte pointers in
>> various circumstances.
> 
> I already covered these in my previous reply.
> 
>> And other changes in this series are also replacing those
>> direct dereferences with calls to similar helpers. The fact that these are all
>> folded (by a custom arm implementation if I've understood the below correctly)
>> just means that each dereference is returning what you would call the pmd from
>> the HW perspective, I think?
> 
> It'll "return" the first of each pair of level-1 page table entries,
> which is pgd[0] or *p4d, *pud, *pmd - but all of these except *pmd
> need to be optimised away, so throwing lots of READ_ONCE() around
> this code without considering this is certainly the wrong approach.

Yep, got it.

> 
>>>> The core-mm today
>>>> dereferences pgd pointers (and p4d, pud, pmd pointers) directly in its code. See
>>>> follow_pfnmap_start(),
>>>
>>> Doesn't seem to exist at least not in 6.11.
>>
>> Appologies, I'm on mm-unstable and that isn't upstream yet. See follow_pte() in
>> v6.11 or __apply_to_page_range(), or pgd_none_or_clear_bad() as per above.
> 
> Looking at follow_pte(), it's not a problem.
> 
> I think we wouldn't be having this conversation before:
> 
> commit a32618d28dbe6e9bf8ec508ccbc3561a7d7d32f0
> Author: Russell King <rmk+kernel@arm.linux.org.uk>
> Date:   Tue Nov 22 17:30:28 2011 +0000
> 
>     ARM: pgtable: switch to use pgtable-nopud.h
> 
> where:
> -#define pgd_none(pgd)          (0)
> -#define pgd_bad(pgd)           (0)
> 
> existed before this commit - and thus the dereference in things like:
> 
> 	pgd_none(*pgd)
> 
> wouldn't even be visible to beyond the preprocessor step.
> 



  reply	other threads:[~2024-09-23 15:21 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-17  7:31 [PATCH V2 0/7] mm: Use pxdp_get() for accessing page table entries Anshuman Khandual
2024-09-17  7:31 ` [PATCH V2 1/7] m68k/mm: Change pmd_val() Anshuman Khandual
2024-09-17  8:40   ` Ryan Roberts
2024-09-17 10:20   ` David Hildenbrand
2024-09-17 10:27     ` Ryan Roberts
2024-09-17 10:30       ` David Hildenbrand
2024-09-17  7:31 ` [PATCH V2 2/7] x86/mm: Drop page table entry address output from pxd_ERROR() Anshuman Khandual
2024-09-17 10:22   ` David Hildenbrand
2024-09-17 11:19     ` Dave Hansen
2024-09-17 11:25       ` Anshuman Khandual
2024-09-17 11:31       ` David Hildenbrand
2024-09-17  7:31 ` [PATCH V2 3/7] mm: Use ptep_get() for accessing PTE entries Anshuman Khandual
2024-09-17  8:44   ` Ryan Roberts
2024-09-17 10:28   ` David Hildenbrand
2024-09-18  6:32     ` Anshuman Khandual
2024-09-19  8:04       ` David Hildenbrand
2024-09-19  9:20         ` Anshuman Khandual
2024-09-17  7:31 ` [PATCH V2 4/7] mm: Use pmdp_get() for accessing PMD entries Anshuman Khandual
2024-09-17 10:05   ` Ryan Roberts
2024-09-18 18:57   ` kernel test robot
2024-09-19  7:21     ` Anshuman Khandual
2024-09-18 19:07   ` kernel test robot
2024-09-19  7:12     ` Anshuman Khandual
2024-09-17  7:31 ` [PATCH V2 5/7] mm: Use pudp_get() for accessing PUD entries Anshuman Khandual
2024-09-17  7:31 ` [PATCH V2 6/7] mm: Use p4dp_get() for accessing P4D entries Anshuman Khandual
2024-09-17  7:31 ` [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries Anshuman Khandual
2024-09-18 20:30   ` kernel test robot
2024-09-19  7:55     ` Anshuman Khandual
2024-09-19  9:11       ` Russell King (Oracle)
2024-09-19 15:48         ` Ryan Roberts
2024-09-19 17:06           ` Russell King (Oracle)
2024-09-19 17:49             ` Ryan Roberts
2024-09-19 20:25               ` Russell King (Oracle)
2024-09-20  6:57                 ` Ryan Roberts
2024-09-20  9:47                   ` Russell King (Oracle)
2024-09-23 15:21                     ` Ryan Roberts [this message]
2024-09-25 10:05 ` [PATCH V2 0/7] mm: Use pxdp_get() for accessing page table entries Christophe Leroy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ebf8d9c6-867d-4e50-9e98-5d7f854278d8@arm.com \
    --to=ryan.roberts@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=arnd@arndb.de \
    --cc=cl@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=dennis@kernel.org \
    --cc=dimitri.sivanich@hpe.com \
    --cc=hch@infradead.org \
    --cc=kasan-dev@googlegroups.com \
    --cc=linmiaohe@huawei.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-m68k@lists.linux-m68k.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=lkp@intel.com \
    --cc=llvm@lists.linux.dev \
    --cc=muchun.song@linux.dev \
    --cc=oe-kbuild-all@lists.linux.dev \
    --cc=rppt@kernel.org \
    --cc=ryabinin.a.a@gmail.com \
    --cc=tj@kernel.org \
    --cc=urezki@gmail.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox