From: Gregory Price <gourry@gourry.net>
To: Zi Yan <ziy@nvidia.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@meta.com, akpm@linux-foundation.org, vbabka@suse.cz,
	surenb@google.com, mhocko@suse.com, jackmanb@google.com,
	hannes@cmpxchg.org, richard.weiyang@gmail.com, osalvador@suse.de,
	rientjes@google.com, david@redhat.com, joshua.hahnjy@gmail.com,
	fvdl@google.com
Subject: Re: [PATCH v5] page_alloc: allow migration of smaller hugepages during contig_alloc
Date: Thu, 18 Dec 2025 15:42:22 -0500
Message-ID: <aURnLqMziaOilLCu@gourry-fedora-PF4VCD3F>
In-Reply-To: <0E77F151-99B0-4F67-814A-4D79439C9A88@nvidia.com>

On Thu, Dec 18, 2025 at 02:45:37PM -0500, Zi Yan wrote:
> 
> That can save another scan? And caller can pass hugetlb_search_result if
> they care and check its value if pfn_range_valid_contig() returns false.
> 

Well, first, I've generally seen output parameters like this discouraged
for something this trivial.  But that aside...

We have to scan again either way if we want to prefer allocating
non-hugetlb regions in different memory blocks first.  This is what Mel
was pointing out (we should touch every OTHER block before we attempt
HugeTLB migrations).

The best optimization you could hope for is something like the sketch
below.  Honestly, though, it is ugly and racy (zone contents may have
changed between scans), and if you're already in the slow, reliable path
then we should just be slow and re-scan the non-hugetlb sections as well.

Other than this being ugly, I don't have strong feelings.  If people
would prefer the second pass to ONLY touch hugetlb sections, I'll ship
this.

static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
                                   unsigned long nr_pages, bool search_hugetlb,
                                   bool *hugetlb_found)
{
        unsigned long i, end_pfn = start_pfn + nr_pages;
        bool hugetlb = false;

        for (i = start_pfn; i < end_pfn; i++) {
                ...
                if (PageHuge(page)) {
                        /* Tell the caller a second pass may be worthwhile. */
                        if (hugetlb_found)
                                *hugetlb_found = true;

                        /* First pass: reject any range containing hugetlb. */
                        if (!search_hugetlb)
                                return false;

                        ...
                        hugetlb = true;
                }
        }
        /*
         * If we're searching for hugetlb regions, only return those.
         * Otherwise, only return regions without hugetlb reservations.
         */
        return !search_hugetlb || hugetlb;
}


struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
                                       int nid, nodemask_t *nodemask)
{
        bool search_hugetlb = false;
        bool hugetlb_found = false;

retry:
        zonelist = node_zonelist(nid, gfp_mask);
        for_each_zone_zonelist_nodemask(zone, z, zonelist,
                                        gfp_zone(gfp_mask), nodemask) {
                spin_lock_irqsave(&zone->lock, flags);

                pfn = ALIGN(zone->zone_start_pfn, nr_pages);
                while (zone_spans_last_pfn(zone, pfn, nr_pages)) {
                        if (pfn_range_valid_contig(zone, pfn, nr_pages,
                                                   search_hugetlb,
                                                   &hugetlb_found)) {
                                ...
                        }
                        ...
                }
                ...
        }
        /* Second pass: only retry if the first pass saw hugetlb pages. */
        if (IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION) &&
            !search_hugetlb && hugetlb_found) {
                search_hugetlb = true;
                goto retry;
        }
        return NULL;
}
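
For anyone who wants to poke at the ordering outside the kernel, here is
a minimal user-space sketch of the two-pass idea.  It is not kernel code
and makes several assumptions: struct toy_range, toy_range_valid() and
toy_alloc_contig() are hypothetical names, each candidate range is
reduced to two flags, and there is no locking or migration at all.  It
only models "try every non-hugetlb range first, then retry accepting
hugetlb ranges if any were seen".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one aligned candidate PFN range. */
struct toy_range {
        bool busy;      /* contains something that can never be used */
        bool hugetlb;   /* contains at least one hugetlb page */
};

/*
 * Pass 1 (search_hugetlb == false) accepts only ranges without hugetlb.
 * Pass 2 (search_hugetlb == true) accepts only ranges with hugetlb.
 * Simplification: a busy range is rejected before hugetlb is noted.
 */
static bool toy_range_valid(const struct toy_range *r, bool search_hugetlb,
                            bool *hugetlb_found)
{
        if (r->busy)
                return false;

        if (r->hugetlb) {
                if (hugetlb_found)
                        *hugetlb_found = true;
                if (!search_hugetlb)
                        return false;
        }
        return !search_hugetlb || r->hugetlb;
}

/* Returns the index of the chosen range, or -1 if none qualifies. */
static int toy_alloc_contig(const struct toy_range *ranges, size_t n)
{
        bool search_hugetlb = false;
        bool hugetlb_found = false;
        size_t i;

        assert(ranges != NULL || n == 0);

retry:
        for (i = 0; i < n; i++) {
                if (toy_range_valid(&ranges[i], search_hugetlb,
                                    &hugetlb_found))
                        return (int)i;
        }
        /* Only pay for a second scan if the first one saw hugetlb. */
        if (!search_hugetlb && hugetlb_found) {
                search_hugetlb = true;
                goto retry;
        }
        return -1;
}
```

Given a usable hugetlb range at index 0 and a usable plain range at
index 1, toy_alloc_contig() picks index 1; the hugetlb range is only
chosen when no plain range qualifies.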

~Gregory

