From: Dave Hansen <dave.hansen@intel.com>
To: "Vishal Moola (Oracle)" <vishal.moola@gmail.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org,
	"Mike Rapoport (Microsoft)" <rppt@kernel.org>
Cc: akpm@linux-foundation.org,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH v3 1/3] x86/mm/pat: Convert pte code to use ptdescs
Date: Tue, 3 Feb 2026 09:23:47 -0800
Message-ID: <e029f69b-6d99-4029-94f5-caafb5eb767c@intel.com>
In-Reply-To: <20260202172005.683870-2-vishal.moola@gmail.com>

On 2/2/26 09:20, Vishal Moola (Oracle) wrote:
> In order to separately allocate ptdescs from pages, we need all allocation
> and free sites to use the appropriate functions. Convert these pte
> allocation/free sites to use ptdescs.

Imperative voice, please.

> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> index 6c6eb486f7a6..f9f9d4ca8e71 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -1408,7 +1408,7 @@ static bool try_to_free_pte_page(pte_t *pte)
>  		if (!pte_none(pte[i]))
>  			return false;
>  
> -	free_page((unsigned long)pte);
> +	pagetable_free(virt_to_ptdesc((void *)pte));
>  	return true;
>  }

This looks wrong to me, or at least it suggests that the API needs
improvement. Most callers are going to have a pointer to the page table
that they've been modifying. They're not going to have a ptdesc handy.

So I think this needs to look like:

	pagetable_free(pte);

You can convert to ptdescs internally or do whatever you want with
ptdesc sanity checks, but the API needs to be on writeable pointers. If
the API takes a const pointer that requires callers to cast it, I think
the API is broken.
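
A minimal userspace sketch of that shape, for illustration only. Nothing
here is kernel code: every *_model name is hypothetical, and the lookup
table merely stands in for whatever virt_to_ptdesc() does internally. The
point is that the caller passes the table pointer it already has, and the
descriptor lookup plus any sanity check stays inside the free function.

```c
/* Userspace model, not kernel code: every *_model/MODEL_ name is hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE  4096
#define MODEL_MAX_TABLES 16

/* Stand-in for struct ptdesc: tracks one page-table page. */
struct ptdesc_model {
	void *table;	/* address of the page, NULL if the slot is unused */
};

static struct ptdesc_model model_descs[MODEL_MAX_TABLES];

/* Stand-in for virt_to_ptdesc(): find the descriptor for a table pointer. */
static struct ptdesc_model *virt_to_ptdesc_model(const void *table)
{
	if (!table)
		return NULL;
	for (size_t i = 0; i < MODEL_MAX_TABLES; i++)
		if (model_descs[i].table == table)
			return &model_descs[i];
	return NULL;
}

/* Allocate a zeroed table page and return the pointer callers will modify. */
static void *pagetable_alloc_model(void)
{
	for (size_t i = 0; i < MODEL_MAX_TABLES; i++) {
		if (!model_descs[i].table) {
			model_descs[i].table = calloc(1, MODEL_PAGE_SIZE);
			return model_descs[i].table;
		}
	}
	return NULL;
}

/*
 * Free by table pointer. The descriptor conversion and the sanity check
 * happen here, so callers never cast or convert anything themselves.
 */
static int pagetable_free_model(void *table)
{
	struct ptdesc_model *ptdesc = virt_to_ptdesc_model(table);

	if (!ptdesc)
		return -1;	/* not a tracked page-table page */
	free(ptdesc->table);
	ptdesc->table = NULL;
	return 0;
}
```

With this shape, the call site in try_to_free_pte_page() would reduce to
a single call taking pte, with no cast and no explicit descriptor.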

> @@ -1537,12 +1537,15 @@ static void unmap_pud_range(p4d_t *p4d, unsigned long start, unsigned long end)
>  	 */
>  }
>  
> -static int alloc_pte_page(pmd_t *pmd)
> +static int alloc_pte_ptdesc(pmd_t *pmd)

Why change the name? Nobody cares what this is doing internally.

>  {
> -	pte_t *pte = (pte_t *)get_zeroed_page(GFP_KERNEL);
> -	if (!pte)
> +	pte_t *pte;
> +	struct ptdesc *ptdesc = pagetable_alloc(GFP_KERNEL | __GFP_ZERO, 0);
> +
> +	if (!ptdesc)
>  		return -1;

This also looks wrong.

What kind of maniac is ever going to allocate page tables without
__GFP_ZERO? __GFP_ZERO really should be a part of pagetable_alloc(),
don't you think?
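
As a rough userspace illustration of folding the zeroing into the
allocator (not kernel code; MODEL_GFP_ZERO and the *_model functions are
hypothetical stand-ins, and the 0xAA poison only models "contents are
garbage unless zeroing was requested"):

```c
/* Userspace model, not kernel code: every *_model/MODEL_ name is hypothetical. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096
#define MODEL_GFP_ZERO  0x1u	/* stand-in for __GFP_ZERO */

/* Low-level allocator: poisons the page unless zeroing is requested. */
static void *raw_page_alloc_model(unsigned int flags)
{
	unsigned char *page = malloc(MODEL_PAGE_SIZE);

	if (!page)
		return NULL;
	memset(page, (flags & MODEL_GFP_ZERO) ? 0x00 : 0xAA, MODEL_PAGE_SIZE);
	return page;
}

/* Page-table allocator: always adds the zero flag, whatever the caller passed. */
static void *pagetable_alloc_model(unsigned int flags)
{
	return raw_page_alloc_model(flags | MODEL_GFP_ZERO);
}

/* Returns 1 if every byte of the page is zero, 0 otherwise. */
static int page_is_zeroed_model(const void *page)
{
	const unsigned char *p = page;

	for (size_t i = 0; i < MODEL_PAGE_SIZE; i++)
		if (p[i])
			return 0;
	return 1;
}
```

Callers then pass only the flags that actually vary between call sites,
and no call site can forget the zeroing.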

> +	pte = (pte_t *) ptdesc_address(ptdesc);
>  	set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
>  	return 0;
>  }

Why is there a cast here? ptdesc_address() returns void*, no?

Also, if there were a ptdesc_pa(), this could be:

static int alloc_pte_ptdesc(pmd_t *pmd)
{
	struct ptdesc *ptdesc = pagetable_alloc(GFP_KERNEL, 0);

	if (!ptdesc)
		return -1;

	set_pmd(pmd, __pmd(ptdesc_pa(ptdesc) | _KERNPG_TABLE));
	return 0;
}

This *should* be a very common pattern. After you allocate a page table
page, you almost always need its physical address because it's going to
get pointed to by another page table or a hardware register.
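
To make the helper itself concrete, here is a small userspace model of
what a ptdesc_pa() could boil down to. It is a sketch under assumptions,
not kernel code: the *_model names, the static arena posing as a direct
map, MODEL_PHYS_BASE, and the flag value are all hypothetical.

```c
/* Userspace model, not kernel code: every *_model/MODEL_ name is hypothetical. */
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE   4096u
#define MODEL_PHYS_BASE   0x100000u	/* pretend the arena sits at 1 MiB physical */
#define MODEL_TABLE_FLAGS 0x63u		/* arbitrary low bits standing in for _KERNPG_TABLE */

/* A tiny "direct map": two pages whose model-physical addresses are known. */
static unsigned char model_arena[2 * MODEL_PAGE_SIZE];

/* Stand-in for struct ptdesc: remembers where its table page lives. */
struct ptdesc_model {
	void *table;
};

/* Stand-in for __pa(): offset within the arena plus the model physical base. */
static uint64_t pa_model(const void *virt)
{
	return MODEL_PHYS_BASE +
	       (uint64_t)((const unsigned char *)virt - model_arena);
}

/* The proposed helper: descriptor in, physical address of its table out. */
static uint64_t ptdesc_pa_model(const struct ptdesc_model *ptdesc)
{
	return pa_model(ptdesc->table);
}

/* Build the value a parent entry would hold to point at this table. */
static uint64_t make_table_entry_model(const struct ptdesc_model *ptdesc)
{
	return ptdesc_pa_model(ptdesc) | MODEL_TABLE_FLAGS;
}
```

The caller never touches the virtual address at all, which is why the
intermediate pte variable and its cast disappear.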

To me, it doesn't look like the ptdesc API is very mature yet, or at
least it hasn't been expanded for ease of use by actual users. I don't
want to grow its use in arch/x86 until it's a wee bit more mature.


