linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: "Li, Tianyou" <tianyou.li@intel.com>,
	Oscar Salvador <osalvador@suse.de>,
	Mike Rapoport <rppt@kernel.org>,
	Wei Yang <richard.weiyang@gmail.com>
Cc: linux-mm@kvack.org, Yong Hu <yong.hu@intel.com>,
	Nanhai Zou <nanhai.zou@intel.com>, Yuan Liu <yuan1.liu@intel.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Qiuxu Zhuo <qiuxu.zhuo@intel.com>,
	Yu C Chen <yu.c.chen@intel.com>, Pan Deng <pan.deng@intel.com>,
	Chen Zhang <zhangchen.kidd@jd.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v7 2/2] mm/memory hotplug/unplug: Optimize zone->contiguous update when changes pfn range
Date: Thu, 15 Jan 2026 18:00:38 +0100	[thread overview]
Message-ID: <4d47cd01-034d-4886-80c4-3861d2181e18@kernel.org> (raw)
In-Reply-To: <d97a8744-6d49-4582-82fc-a38ee2ae06fc@intel.com>

On 1/15/26 16:09, Li, Tianyou wrote:
> Sorry for my delayed response. Appreciated for all your suggestions
> David. My thoughts inlined for your kind consideration.
> 
> On 1/14/2026 7:13 PM, David Hildenbrand (Red Hat) wrote:
>>>>
>>>> This is nasty. I would wish we could just leave that code path alone.
>>>>
>>>> In particular: I am 99% sure that we never ever run into this case in
>>>> practice.
>>>>
>>>> E.g., on x86, we can have up to 2 GiB memory blocks. But the memmap of
>>>> that is 64/4096*2GiB == 32 MB ... and a memory section is 128 MiB.
>>>>
>>>>
>>>> As commented on patch #1, we should drop the set_zone_contiguous() in
>>>> this function either way and let online_pages() deal with it.
>>>>
>>>> We just have to make sure that we don't create some inconsistencies by
>>>> doing that.
>>>>
>>>> Can you double-check?
>>
>> I thought about this some more, and it's all a bit nasty. We have to
>> get this right.
>>
>> Losing the optimization for memmap_on_memory users indicates that we
>> are doing the wrong thing.
>>
>> You could introduce the set_zone_contiguous() in this patch here. But
>> then, I think instead of
>>
>> +    /*
>> +     * If the allocated memmap pages are not in a full section, keep the
>> +     * contiguous state as ZONE_CONTIG_NO.
>> +     */
>> +    if (IS_ALIGNED(end_pfn, PAGES_PER_SECTION))
>> +        new_contiguous_state = zone_contig_state_after_growing(zone,
>> +                                pfn, nr_pages);
>> +
>>
>> We'd actually unconditionally have to do that, no?
>>
> That's a great idea! Probably we need to move
> the adjust_present_page_count right after mhp_init_memmap_on_memory to
> keep the present_pages in sync.
> 
> When the zone is contiguous previously, there probably 2 situations:
> 
> 1. If new added range is at the start of the zone, then
> after mhp_init_memmap_on_memory, the zone contiguous will be false at
> fast path. It should be expected as long as the
> adjust_present_page_count was called before the online_pages,
> so zone_contig_state_after_growing in online_pages will return
> ZONE_CONTIG_MAYBE, which is expected. Then the set_zone_contiguous in
> online_pages will get correct result.
> 
> 2. If new added range is at the end of the zone, then
> the zone_contig_state_after_growing will return ZONE_CONTIG_YES or
> ZONE_CONTIG_NO, regardless of the memory section online or not. When the
> contiguous check comes into the online_pages, it will follow the fast
> path and get the correct contiguous state.
> 
> When the zone is not contiguous previously, in any case the new zone
> contiguous state will be false in mhp_init_memmap_on_memory, because
> either the nr_vmemmap_pages can not fill the hole or the memory section
> is not online. If we update the present_pages correctly, then in
> online_pages, the zone_contig_state_after_growing could have the chance
> to return ZONE_CONTIG_MAYBE, and since all memory sections are onlined,
> the set_zone_contiguous will get the correct result.
> 
> 
> I am not sure if you'd like to consider another option: could we
> encapsulate the mhp_init_memmap_on_memory and online_pages into one
> function eg. online_memory_block_pages, and offline_memory_block_pages
> correspondingly as well, in the memory_hotplug.c. So we can check the
> zone contiguous state as the whole for the new added range.
> 
> int online_memory_block_pages(unsigned long start_pfn, unsigned long
> nr_pages,
>               unsigned long nr_vmemmap_pages, struct zone *zone,
>               struct memory_group *group)
> {
>       bool contiguous = zone->contiguous;
>       enum zone_contig_state new_contiguous_state;
>       int ret;
> 
>       /*
>        * Calculate the new zone contig state before move_pfn_range_to_zone()
>        * sets the zone temporarily to non-contiguous.
>        */
>       new_contiguous_state = zone_contig_state_after_growing(zone, start_pfn,
>                                   nr_pages);

Should we clear the flag here?

> 
>       if (nr_vmemmap_pages) {
>           ret = mhp_init_memmap_on_memory(start_pfn, nr_vmemmap_pages,
>                           zone);
>           if (ret)
>               goto restore_zone_contig;
>       }
> 
>       ret = online_pages(start_pfn + nr_vmemmap_pages,
>                  nr_pages - nr_vmemmap_pages, zone, group);
>       if (ret) {
>           if (nr_vmemmap_pages)
>               mhp_deinit_memmap_on_memory(start_pfn, nr_vmemmap_pages);
>           goto restore_zone_contig;
>       }
> 
>       /*
>        * Account once onlining succeeded. If the zone was unpopulated, it is
>        * now already properly populated.
>        */
>       if (nr_vmemmap_pages)
>           adjust_present_page_count(pfn_to_page(start_pfn), mem->group,
>                         nr_vmemmap_pages);
> 
>       /*
>        * Now that the ranges are indicated as online, check whether the whole
>        * zone is contiguous.
>        */
>       set_zone_contiguous(zone, new_contiguous_state);
>       return 0;
> 
> restore_zone_contig:
>       zone->contiguous = contiguous;
>       return ret;
> }

That is even better, although it sucks to have to handle it on that 
level, and that it's so far away from actual zone resizing code.

Should we do the same on the offlining path?

-- 
Cheers

David


  reply	other threads:[~2026-01-15 17:00 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-22 14:58 [PATCH v7 0/2] Optimize zone->contiguous update and issue fix Tianyou Li
2025-12-22 14:58 ` [PATCH v7 1/2] mm/memory hotplug: fix zone->contiguous always false when hotplug Tianyou Li
2026-01-06 20:03   ` David Hildenbrand (Red Hat)
2026-01-08  7:35     ` Li, Tianyou
2025-12-22 14:58 ` [PATCH v7 2/2] mm/memory hotplug/unplug: Optimize zone->contiguous update when changes pfn range Tianyou Li
2026-01-06 20:18   ` David Hildenbrand (Red Hat)
2026-01-08  8:13     ` Li, Tianyou
2026-01-08  8:23     ` Li, Tianyou
2026-01-14 11:13       ` David Hildenbrand (Red Hat)
2026-01-15 15:09         ` Li, Tianyou
2026-01-15 17:00           ` David Hildenbrand (Red Hat) [this message]
2026-01-16 11:58             ` Li, Tianyou
2026-01-06 20:25   ` David Hildenbrand (Red Hat)
2026-01-08  8:31     ` Li, Tianyou
2026-01-05 12:21 ` [PATCH v7 0/2] Optimize zone->contiguous update and issue fix Li, Tianyou
2026-01-05 16:27   ` David Hildenbrand (Red Hat)
2026-01-06  3:48     ` Li, Tianyou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4d47cd01-034d-4886-80c4-3861d2181e18@kernel.org \
    --to=david@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nanhai.zou@intel.com \
    --cc=osalvador@suse.de \
    --cc=pan.deng@intel.com \
    --cc=qiuxu.zhuo@intel.com \
    --cc=richard.weiyang@gmail.com \
    --cc=rppt@kernel.org \
    --cc=tianyou.li@intel.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=yong.hu@intel.com \
    --cc=yu.c.chen@intel.com \
    --cc=yuan1.liu@intel.com \
    --cc=zhangchen.kidd@jd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox