* Re: [PATCH] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-17 3:30 [PATCH] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range Tianyou Li
@ 2025-11-17 2:38 ` Li, Tianyou
2025-11-17 11:57 ` David Hildenbrand (Red Hat)
2025-11-18 5:13 ` Mike Rapoport
2 siblings, 0 replies; 23+ messages in thread
From: Li, Tianyou @ 2025-11-17 2:38 UTC (permalink / raw)
To: David Hildenbrand, Oscar Salvador
Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Chen Zhang, linux-kernel
Hi All,
Rebased with latest master. Add "Reported-by" from customer. Looking
forward to any comments/suggestions. Thanks.
Regards,
Tianyou
On 11/17/2025 11:30 AM, Tianyou Li wrote:
> When invoke move_pfn_range_to_zone, it will update the zone->contiguous by
> checking the new zone's pfn range from the beginning to the end, regardless
> the previous state of the old zone. When the zone's pfn range is large, the
> cost of traversing the pfn range to update the zone->contiguous could be
> significant.
>
> Add fast paths to quickly detect cases where zone is definitely not
> contiguous without scanning the new zone. The cases are: when the new range
> did not overlap with previous range, the contiguous should be false; if the
> new range adjacent with the previous range, just need to check the new
> range; if the new added pages could not fill the hole of previous zone, the
> contiguous should be false.
>
> The following test cases of memory hotplug for a VM [1], tested in the
> environment [2], show that this optimization can significantly reduce the
> memory hotplug time [3].
>
> +----------------+------+---------------+--------------+----------------+
> | | Size | Time (before) | Time (after) | Time Reduction |
> | +------+---------------+--------------+----------------+
> | Memory Hotplug | 256G | 10s | 3s | 70% |
> | +------+---------------+--------------+----------------+
> | | 512G | 33s | 8s | 76% |
> +----------------+------+---------------+--------------+----------------+
>
> [1] Qemu commands to hotplug 512G memory for a VM:
> object_add memory-backend-ram,id=hotmem0,size=512G,share=on
> device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
> qom-set vmem1 requested-size 512G
>
> [2] Hardware : Intel Icelake server
> Guest Kernel : v6.18-rc2
> Qemu : v9.0.0
>
> Launch VM :
> qemu-system-x86_64 -accel kvm -cpu host \
> -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
> -drive file=./seed.img,format=raw,if=virtio \
> -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
> -m 2G,slots=10,maxmem=2052472M \
> -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
> -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
> -nographic -machine q35 \
> -nic user,hostfwd=tcp::3000-:22
>
> Guest kernel auto-onlines newly added memory blocks:
> echo online > /sys/devices/system/memory/auto_online_blocks
>
> [3] The time from typing the QEMU commands in [1] to when the output of
> 'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
> memory is recognized.
>
> Reported-by: Nanhai Zou <nanhai.zou@intel.com>
> Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
> Tested-by: Yuan Liu <yuan1.liu@intel.com>
> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
> Reviewed-by: Pan Deng <pan.deng@intel.com>
> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
> Signed-off-by: Tianyou Li <tianyou.li@intel.com>
> ---
> mm/internal.h | 3 +++
> mm/memory_hotplug.c | 48 ++++++++++++++++++++++++++++++++++++++++++++-
> mm/mm_init.c | 31 ++++++++++++++++++++++-------
> 3 files changed, 74 insertions(+), 8 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 1561fc2ff5b8..734caae6873c 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -734,6 +734,9 @@ void set_zone_contiguous(struct zone *zone);
> bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
> unsigned long nr_pages);
>
> +bool check_zone_contiguous(struct zone *zone, unsigned long start_pfn,
> + unsigned long nr_pages);
> +
> static inline void clear_zone_contiguous(struct zone *zone)
> {
> zone->contiguous = false;
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 0be83039c3b5..96c003271b8e 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -723,6 +723,47 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
>
> }
>
> +static void __meminit update_zone_contiguous(struct zone *zone,
> + bool old_contiguous, unsigned long old_start_pfn,
> + unsigned long old_nr_pages, unsigned long old_absent_pages,
> + unsigned long new_start_pfn, unsigned long new_nr_pages)
> +{
> + unsigned long old_end_pfn = old_start_pfn + old_nr_pages;
> + unsigned long new_end_pfn = new_start_pfn + new_nr_pages;
> + unsigned long new_filled_pages = 0;
> +
> + /*
> + * If the moved pfn range does not intersect with the old zone span,
> + * the contiguous property is surely false.
> + */
> + if (new_end_pfn < old_start_pfn || new_start_pfn > old_end_pfn)
> + return;
> +
> + /*
> + * If the moved pfn range is adjacent to the old zone span,
> + * check the range to the left or to the right
> + */
> + if (new_end_pfn == old_start_pfn || new_start_pfn == old_end_pfn) {
> + zone->contiguous = old_contiguous &&
> + check_zone_contiguous(zone, new_start_pfn, new_nr_pages);
> + return;
> + }
> +
> + /*
> + * If old zone's hole larger than the new filled pages, the contiguous
> + * property is surely false.
> + */
> + new_filled_pages = new_end_pfn - old_start_pfn;
> + if (new_start_pfn > old_start_pfn)
> + new_filled_pages -= new_start_pfn - old_start_pfn;
> + if (new_end_pfn > old_end_pfn)
> + new_filled_pages -= new_end_pfn - old_end_pfn;
> + if (new_filled_pages < old_absent_pages)
> + return;
> +
> + set_zone_contiguous(zone);
> +}
> +
> #ifdef CONFIG_ZONE_DEVICE
> static void section_taint_zone_device(unsigned long pfn)
> {
> @@ -752,6 +793,10 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> {
> struct pglist_data *pgdat = zone->zone_pgdat;
> int nid = pgdat->node_id;
> + bool old_contiguous = zone->contiguous;
> + unsigned long old_start_pfn = zone->zone_start_pfn;
> + unsigned long old_nr_pages = zone->spanned_pages;
> + unsigned long old_absent_pages = zone->spanned_pages - zone->present_pages;
>
> clear_zone_contiguous(zone);
>
> @@ -783,7 +828,8 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> MEMINIT_HOTPLUG, altmap, migratetype,
> isolate_pageblock);
>
> - set_zone_contiguous(zone);
> + update_zone_contiguous(zone, old_contiguous, old_start_pfn, old_nr_pages,
> + old_absent_pages, start_pfn, nr_pages);
> }
>
> struct auto_movable_stats {
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index 7712d887b696..04fdd949fe49 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -2263,26 +2263,43 @@ void __init init_cma_pageblock(struct page *page)
> }
> #endif
>
> -void set_zone_contiguous(struct zone *zone)
> +/*
> + * Check if all pageblocks in the given PFN range belong to the given zone.
> + * The given range is expected to be within the zone's pfn range, otherwise
> + * false is returned.
> + */
> +bool check_zone_contiguous(struct zone *zone, unsigned long start_pfn,
> + unsigned long nr_pages)
> {
> - unsigned long block_start_pfn = zone->zone_start_pfn;
> + unsigned long end_pfn = start_pfn + nr_pages;
> + unsigned long block_start_pfn = start_pfn;
> unsigned long block_end_pfn;
>
> + if (start_pfn < zone->zone_start_pfn || end_pfn > zone_end_pfn(zone))
> + return false;
> +
> block_end_pfn = pageblock_end_pfn(block_start_pfn);
> - for (; block_start_pfn < zone_end_pfn(zone);
> + for (; block_start_pfn < end_pfn;
> block_start_pfn = block_end_pfn,
> block_end_pfn += pageblock_nr_pages) {
>
> - block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
> + block_end_pfn = min(block_end_pfn, end_pfn);
>
> if (!__pageblock_pfn_to_page(block_start_pfn,
> block_end_pfn, zone))
> - return;
> + return false;
> cond_resched();
> }
>
> - /* We confirm that there is no hole */
> - zone->contiguous = true;
> + return true;
> +}
> +
> +void set_zone_contiguous(struct zone *zone)
> +{
> + unsigned long start_pfn = zone->zone_start_pfn;
> + unsigned long nr_pages = zone->spanned_pages;
> +
> + zone->contiguous = check_zone_contiguous(zone, start_pfn, nr_pages);
> }
>
> /*
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-17 3:30 [PATCH] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range Tianyou Li
2025-11-17 2:38 ` Li, Tianyou
@ 2025-11-17 11:57 ` David Hildenbrand (Red Hat)
2025-11-18 9:07 ` Li, Tianyou
2025-11-18 5:13 ` Mike Rapoport
2 siblings, 1 reply; 23+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-17 11:57 UTC (permalink / raw)
To: Tianyou Li, Oscar Salvador
Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Chen Zhang, linux-kernel
On 17.11.25 04:30, Tianyou Li wrote:
Sorry for the late review!
> When invoke move_pfn_range_to_zone, it will update the zone->contiguous by
> checking the new zone's pfn range from the beginning to the end, regardless
> the previous state of the old zone. When the zone's pfn range is large, the
> cost of traversing the pfn range to update the zone->contiguous could be
> significant.
Right, unfortunately we have to iterate pageblocks.
We know that hotplugged sections always belong to the same zone, so we
could optimize for them as well. Only early section have to walk pageblocks.
if (early_section(__pfn_to_section(pfn)))
We could also walk memory blocks I guess (for_each_memory_block). If
mem->zone != NULL, we know the whole block spans a single zone.
Memory blocks are as small as 128 MiB on x86-64, with pageblocks being 2
MiB we would walk 64 pageblocks.
(I think we can also walk MAX_PAGE_ORDER chunks instead of pageblock chunks)
>
> Add fast paths to quickly detect cases where zone is definitely not
> contiguous without scanning the new zone. The cases are: when the new range
> did not overlap with previous range, the contiguous should be false; if the
> new range adjacent with the previous range, just need to check the new
> range; if the new added pages could not fill the hole of previous zone, the
> contiguous should be false.
>
> The following test cases of memory hotplug for a VM [1], tested in the
> environment [2], show that this optimization can significantly reduce the
> memory hotplug time [3].
>
> +----------------+------+---------------+--------------+----------------+
> | | Size | Time (before) | Time (after) | Time Reduction |
> | +------+---------------+--------------+----------------+
> | Memory Hotplug | 256G | 10s | 3s | 70% |
> | +------+---------------+--------------+----------------+
> | | 512G | 33s | 8s | 76% |
> +----------------+------+---------------+--------------+----------------+
Did not expect that to be the most expensive part, nice!
>
> [1] Qemu commands to hotplug 512G memory for a VM:
> object_add memory-backend-ram,id=hotmem0,size=512G,share=on
> device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
> qom-set vmem1 requested-size 512G
>
> [2] Hardware : Intel Icelake server
> Guest Kernel : v6.18-rc2
> Qemu : v9.0.0
>
> Launch VM :
> qemu-system-x86_64 -accel kvm -cpu host \
> -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
> -drive file=./seed.img,format=raw,if=virtio \
> -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
> -m 2G,slots=10,maxmem=2052472M \
> -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
> -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
> -nographic -machine q35 \
> -nic user,hostfwd=tcp::3000-:22
>
> Guest kernel auto-onlines newly added memory blocks:
> echo online > /sys/devices/system/memory/auto_online_blocks
>
> [3] The time from typing the QEMU commands in [1] to when the output of
> 'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
> memory is recognized.
>
> Reported-by: Nanhai Zou <nanhai.zou@intel.com>
> Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
> Tested-by: Yuan Liu <yuan1.liu@intel.com>
> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
> Reviewed-by: Pan Deng <pan.deng@intel.com>
> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
> Signed-off-by: Tianyou Li <tianyou.li@intel.com>
> ---
> mm/internal.h | 3 +++
> mm/memory_hotplug.c | 48 ++++++++++++++++++++++++++++++++++++++++++++-
> mm/mm_init.c | 31 ++++++++++++++++++++++-------
> 3 files changed, 74 insertions(+), 8 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 1561fc2ff5b8..734caae6873c 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -734,6 +734,9 @@ void set_zone_contiguous(struct zone *zone);
> bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
> unsigned long nr_pages);
>
> +bool check_zone_contiguous(struct zone *zone, unsigned long start_pfn,
> + unsigned long nr_pages);
> +
> static inline void clear_zone_contiguous(struct zone *zone)
> {
> zone->contiguous = false;
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 0be83039c3b5..96c003271b8e 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -723,6 +723,47 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
>
> }
>
> +static void __meminit update_zone_contiguous(struct zone *zone,
> + bool old_contiguous, unsigned long old_start_pfn,
> + unsigned long old_nr_pages, unsigned long old_absent_pages,
> + unsigned long new_start_pfn, unsigned long new_nr_pages)
Is "old" the old zone range and "new", the new part we are adding?
In that case, old vs. new is misleading, could be interpreted as "old
spanned zone range" and "new spanned zone range".
Why are we passing in old_absent_pages and not simply calculate it based
on zone->present pages in here?
> +{
> + unsigned long old_end_pfn = old_start_pfn + old_nr_pages;
> + unsigned long new_end_pfn = new_start_pfn + new_nr_pages;
Can both be const.
> + unsigned long new_filled_pages = 0;
> +
> + /*
> + * If the moved pfn range does not intersect with the old zone span,
> + * the contiguous property is surely false.
> + */
> + if (new_end_pfn < old_start_pfn || new_start_pfn > old_end_pfn)
> + return;
> +
> + /*
> + * If the moved pfn range is adjacent to the old zone span,
> + * check the range to the left or to the right
> + */
> + if (new_end_pfn == old_start_pfn || new_start_pfn == old_end_pfn) {
> + zone->contiguous = old_contiguous &&
> + check_zone_contiguous(zone, new_start_pfn, new_nr_pages);
It's sufficient to check that a single pageblock at the old start/end
(depending where we're adding) has the same zone already.
Why are we checking the new range we are adding? That doesn't make sense
unless I am missing something. We know that that one is contiguous.
> + return;
> + }
> +
> + /*
> + * If old zone's hole larger than the new filled pages, the contiguous
> + * property is surely false.
> + */
> + new_filled_pages = new_end_pfn - old_start_pfn;
> + if (new_start_pfn > old_start_pfn)
> + new_filled_pages -= new_start_pfn - old_start_pfn;
> + if (new_end_pfn > old_end_pfn)
> + new_filled_pages -= new_end_pfn - old_end_pfn;
> + if (new_filled_pages < old_absent_pages)
> + return;
I don't quite like the dependence on present pages here. But I guess
there is no other simple way to just detect that there is a large hole
in there that cannot possibly get closed.
> +
> + set_zone_contiguous(zone);
> +}
> +
> #ifdef CONFIG_ZONE_DEVICE
> static void section_taint_zone_device(unsigned long pfn)
> {
> @@ -752,6 +793,10 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> {
> struct pglist_data *pgdat = zone->zone_pgdat;
> int nid = pgdat->node_id;
> + bool old_contiguous = zone->contiguous;
> + unsigned long old_start_pfn = zone->zone_start_pfn;
> + unsigned long old_nr_pages = zone->spanned_pages;
> + unsigned long old_absent_pages = zone->spanned_pages - zone->present_pages;
>
> clear_zone_contiguous(zone);
>
> @@ -783,7 +828,8 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> MEMINIT_HOTPLUG, altmap, migratetype,
> isolate_pageblock);
>
> - set_zone_contiguous(zone);
> + update_zone_contiguous(zone, old_contiguous, old_start_pfn, old_nr_pages,
> + old_absent_pages, start_pfn, nr_pages);
> }
>
> struct auto_movable_stats {
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index 7712d887b696..04fdd949fe49 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -2263,26 +2263,43 @@ void __init init_cma_pageblock(struct page *page)
> }
> #endif
>
> -void set_zone_contiguous(struct zone *zone)
> +/*
> + * Check if all pageblocks in the given PFN range belong to the given zone.
> + * The given range is expected to be within the zone's pfn range, otherwise
> + * false is returned.
> + */
> +bool check_zone_contiguous(struct zone *zone, unsigned long start_pfn,
> + unsigned long nr_pages)
> {
> - unsigned long block_start_pfn = zone->zone_start_pfn;
> + unsigned long end_pfn = start_pfn + nr_pages;
> + unsigned long block_start_pfn = start_pfn;
Can be const.
--
Cheers
David
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-17 11:57 ` David Hildenbrand (Red Hat)
@ 2025-11-18 9:07 ` Li, Tianyou
0 siblings, 0 replies; 23+ messages in thread
From: Li, Tianyou @ 2025-11-18 9:07 UTC (permalink / raw)
To: David Hildenbrand (Red Hat), Oscar Salvador
Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Chen Zhang, linux-kernel
Very appreciated for your timely review and insightful comments, David.
On 11/17/2025 7:57 PM, David Hildenbrand (Red Hat) wrote:
> On 17.11.25 04:30, Tianyou Li wrote:
>
> Sorry for the late review!
>
>> When invoke move_pfn_range_to_zone, it will update the
>> zone->contiguous by
>> checking the new zone's pfn range from the beginning to the end,
>> regardless
>> the previous state of the old zone. When the zone's pfn range is
>> large, the
>> cost of traversing the pfn range to update the zone->contiguous could be
>> significant.
>
> Right, unfortunately we have to iterate pageblocks.
>
> We know that hotplugged sections always belong to the same zone, so we
> could optimize for them as well. Only early section have to walk
> pageblocks.
>
> if (early_section(__pfn_to_section(pfn)))
>
> We could also walk memory blocks I guess (for_each_memory_block). If
> mem->zone != NULL, we know the whole block spans a single zone.
>
>
> Memory blocks are as small as 128 MiB on x86-64, with pageblocks being
> 2 MiB we would walk 64 pageblocks.
>
> (I think we can also walk MAX_PAGE_ORDER chunks instead of pageblock
> chunks)
>
This actually point to another optimization opportunity that reduce the
memory access even further to get better performance beyond this patch.
This patch is to avoid the contiguous check as much as possible if we
can deduce the result of contiguous with given conditions. I must
confess I did not know that we can walk through memory blocks in some
situations. Allow me to think through the idea and create another patch
to optimize the slow path. At same time, would you mind to review the
patch v2 for the fast paths situation separately?
>
>>
>> Add fast paths to quickly detect cases where zone is definitely not
>> contiguous without scanning the new zone. The cases are: when the new
>> range
>> did not overlap with previous range, the contiguous should be false;
>> if the
>> new range adjacent with the previous range, just need to check the new
>> range; if the new added pages could not fill the hole of previous
>> zone, the
>> contiguous should be false.
>>
>> The following test cases of memory hotplug for a VM [1], tested in the
>> environment [2], show that this optimization can significantly reduce
>> the
>> memory hotplug time [3].
>>
>> +----------------+------+---------------+--------------+----------------+
>>
>> | | Size | Time (before) | Time (after) | Time
>> Reduction |
>> | +------+---------------+--------------+----------------+
>> | Memory Hotplug | 256G | 10s | 3s | 70% |
>> | +------+---------------+--------------+----------------+
>> | | 512G | 33s | 8s | 76% |
>> +----------------+------+---------------+--------------+----------------+
>>
>
> Did not expect that to be the most expensive part, nice!
>
Thanks.
>>
>> [1] Qemu commands to hotplug 512G memory for a VM:
>> object_add memory-backend-ram,id=hotmem0,size=512G,share=on
>> device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
>> qom-set vmem1 requested-size 512G
>>
>> [2] Hardware : Intel Icelake server
>> Guest Kernel : v6.18-rc2
>> Qemu : v9.0.0
>>
>> Launch VM :
>> qemu-system-x86_64 -accel kvm -cpu host \
>> -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
>> -drive file=./seed.img,format=raw,if=virtio \
>> -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
>> -m 2G,slots=10,maxmem=2052472M \
>> -device
>> pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
>> -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
>> -nographic -machine q35 \
>> -nic user,hostfwd=tcp::3000-:22
>>
>> Guest kernel auto-onlines newly added memory blocks:
>> echo online > /sys/devices/system/memory/auto_online_blocks
>>
>> [3] The time from typing the QEMU commands in [1] to when the output of
>> 'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
>> memory is recognized.
>>
>> Reported-by: Nanhai Zou <nanhai.zou@intel.com>
>> Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
>> Tested-by: Yuan Liu <yuan1.liu@intel.com>
>> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
>> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>> Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
>> Reviewed-by: Pan Deng <pan.deng@intel.com>
>> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
>> Signed-off-by: Tianyou Li <tianyou.li@intel.com>
>> ---
>> mm/internal.h | 3 +++
>> mm/memory_hotplug.c | 48 ++++++++++++++++++++++++++++++++++++++++++++-
>> mm/mm_init.c | 31 ++++++++++++++++++++++-------
>> 3 files changed, 74 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/internal.h b/mm/internal.h
>> index 1561fc2ff5b8..734caae6873c 100644
>> --- a/mm/internal.h
>> +++ b/mm/internal.h
>> @@ -734,6 +734,9 @@ void set_zone_contiguous(struct zone *zone);
>> bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
>> unsigned long nr_pages);
>> +bool check_zone_contiguous(struct zone *zone, unsigned long
>> start_pfn,
>> + unsigned long nr_pages);
>> +
>> static inline void clear_zone_contiguous(struct zone *zone)
>> {
>> zone->contiguous = false;
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 0be83039c3b5..96c003271b8e 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -723,6 +723,47 @@ static void __meminit resize_pgdat_range(struct
>> pglist_data *pgdat, unsigned lon
>> }
>> +static void __meminit update_zone_contiguous(struct zone *zone,
>> + bool old_contiguous, unsigned long old_start_pfn,
>> + unsigned long old_nr_pages, unsigned long old_absent_pages,
>> + unsigned long new_start_pfn, unsigned long new_nr_pages)
>
> Is "old" the old zone range and "new", the new part we are adding?
>
> In that case, old vs. new is misleading, could be interpreted as "old
> spanned zone range" and "new spanned zone range".
>
>
Agreed, the naming struggles. Could I rename the 'old' to 'origin' that
indicates the original spanned zone range and removes the 'new_' to
indicate the added/moved pfn range? Will that be more descriptive to fit
in this situation?
> Why are we passing in old_absent_pages and not simply calculate it
> based on zone->present pages in here?
Will do in patch v2. Previously I am a bit conservative to use
zone->present_pages directly in the update_zone_contiguous function
because implicitly it creates a dependency that zone->present_pages did
not get updated in any of the functions before it. Allow me to send the
patch v2 for your review.
>
>> +{
>> + unsigned long old_end_pfn = old_start_pfn + old_nr_pages;
>> + unsigned long new_end_pfn = new_start_pfn + new_nr_pages;
>
> Can both be const.
>
Thanks, will do in patch v2.
>> + unsigned long new_filled_pages = 0;
>> +
>> + /*
>> + * If the moved pfn range does not intersect with the old zone
>> span,
>> + * the contiguous property is surely false.
>> + */
>> + if (new_end_pfn < old_start_pfn || new_start_pfn > old_end_pfn)
>> + return;
>> +
>> + /*
>> + * If the moved pfn range is adjacent to the old zone span,
>> + * check the range to the left or to the right
>> + */
>> + if (new_end_pfn == old_start_pfn || new_start_pfn == old_end_pfn) {
>> + zone->contiguous = old_contiguous &&
>> + check_zone_contiguous(zone, new_start_pfn, new_nr_pages);
>
> It's sufficient to check that a single pageblock at the old start/end
> (depending where we're adding) has the same zone already.
>
> Why are we checking the new range we are adding? That doesn't make
> sense unless I am missing something. We know that that one is contiguous.
>
You are right. The memmap_init_range makes the new ranges contiguous
with the same zone as the original zone span, because it passed with the
nid and zone_idx from the original zone. In that case, should we just
inherited the original contiguous property, probably even need not to
check additional pageblocks?
A quick test with your idea shows significant improvements, for 256G
configuration, the time reduced from 3s to 2s, and for 512G
configuration, the time reduced from 8s to 6s. The new result as below:
+----------------+------+---------------+--------------+----------------+
| | Size | Time (before) | Time (after) | Time Reduction |
| +------+---------------+--------------+----------------+
| Memory Hotplug | 256G | 10s | 2s | 80% |
| +------+---------------+--------------+----------------+
| | 512G | 33s | 6s | 81% |
+----------------+------+---------------+--------------+----------------+
>> + return;
>> + }
>> +
>> + /*
>> + * If old zone's hole larger than the new filled pages, the
>> contiguous
>> + * property is surely false.
>> + */
>> + new_filled_pages = new_end_pfn - old_start_pfn;
>> + if (new_start_pfn > old_start_pfn)
>> + new_filled_pages -= new_start_pfn - old_start_pfn;
>> + if (new_end_pfn > old_end_pfn)
>> + new_filled_pages -= new_end_pfn - old_end_pfn;
>> + if (new_filled_pages < old_absent_pages)
>> + return;
>
> I don't quite like the dependence on present pages here. But I guess
> there is no other simple way to just detect that there is a large hole
> in there that cannot possibly get closed.
>
Yes, I did not find a better solution to cover this situation simply,
and I'd like to cover most of cases include inside/overlap/span of the
original zone.
>> +
>> + set_zone_contiguous(zone);
>> +}
>> +
>> #ifdef CONFIG_ZONE_DEVICE
>> static void section_taint_zone_device(unsigned long pfn)
>> {
>> @@ -752,6 +793,10 @@ void move_pfn_range_to_zone(struct zone *zone,
>> unsigned long start_pfn,
>> {
>> struct pglist_data *pgdat = zone->zone_pgdat;
>> int nid = pgdat->node_id;
>> + bool old_contiguous = zone->contiguous;
>> + unsigned long old_start_pfn = zone->zone_start_pfn;
>> + unsigned long old_nr_pages = zone->spanned_pages;
>> + unsigned long old_absent_pages = zone->spanned_pages -
>> zone->present_pages;
>> clear_zone_contiguous(zone);
>> @@ -783,7 +828,8 @@ void move_pfn_range_to_zone(struct zone *zone,
>> unsigned long start_pfn,
>> MEMINIT_HOTPLUG, altmap, migratetype,
>> isolate_pageblock);
>> - set_zone_contiguous(zone);
>> + update_zone_contiguous(zone, old_contiguous, old_start_pfn,
>> old_nr_pages,
>> + old_absent_pages, start_pfn, nr_pages);
>> }
>> struct auto_movable_stats {
>> diff --git a/mm/mm_init.c b/mm/mm_init.c
>> index 7712d887b696..04fdd949fe49 100644
>> --- a/mm/mm_init.c
>> +++ b/mm/mm_init.c
>> @@ -2263,26 +2263,43 @@ void __init init_cma_pageblock(struct page
>> *page)
>> }
>> #endif
>> -void set_zone_contiguous(struct zone *zone)
>> +/*
>> + * Check if all pageblocks in the given PFN range belong to the
>> given zone.
>> + * The given range is expected to be within the zone's pfn range,
>> otherwise
>> + * false is returned.
>> + */
>> +bool check_zone_contiguous(struct zone *zone, unsigned long start_pfn,
>> + unsigned long nr_pages)
>> {
>> - unsigned long block_start_pfn = zone->zone_start_pfn;
>> + unsigned long end_pfn = start_pfn + nr_pages;
>> + unsigned long block_start_pfn = start_pfn;
>
> Can be const.
>
Yes, will do in patch v2.
Thanks & Regards,
Tianyou
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-17 3:30 [PATCH] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range Tianyou Li
2025-11-17 2:38 ` Li, Tianyou
2025-11-17 11:57 ` David Hildenbrand (Red Hat)
@ 2025-11-18 5:13 ` Mike Rapoport
2025-11-18 9:28 ` Li, Tianyou
` (2 more replies)
2 siblings, 3 replies; 23+ messages in thread
From: Mike Rapoport @ 2025-11-18 5:13 UTC (permalink / raw)
To: Tianyou Li
Cc: David Hildenbrand, Oscar Salvador, linux-mm, Yong Hu, Nanhai Zou,
Yuan Liu, Tim Chen, Qiuxu Zhuo, Yu C Chen, Pan Deng, Chen Zhang,
linux-kernel
On Mon, Nov 17, 2025 at 11:30:52AM +0800, Tianyou Li wrote:
> When invoke move_pfn_range_to_zone, it will update the zone->contiguous by
> checking the new zone's pfn range from the beginning to the end, regardless
> the previous state of the old zone. When the zone's pfn range is large, the
> cost of traversing the pfn range to update the zone->contiguous could be
> significant.
>
> Add fast paths to quickly detect cases where zone is definitely not
> contiguous without scanning the new zone. The cases are: when the new range
> did not overlap with previous range, the contiguous should be false; if the
> new range adjacent with the previous range, just need to check the new
> range; if the new added pages could not fill the hole of previous zone, the
> contiguous should be false.
>
> The following test cases of memory hotplug for a VM [1], tested in the
> environment [2], show that this optimization can significantly reduce the
> memory hotplug time [3].
>
> +----------------+------+---------------+--------------+----------------+
> | | Size | Time (before) | Time (after) | Time Reduction |
> | +------+---------------+--------------+----------------+
> | Memory Hotplug | 256G | 10s | 3s | 70% |
> | +------+---------------+--------------+----------------+
> | | 512G | 33s | 8s | 76% |
> +----------------+------+---------------+--------------+----------------+
>
> [1] Qemu commands to hotplug 512G memory for a VM:
> object_add memory-backend-ram,id=hotmem0,size=512G,share=on
> device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
> qom-set vmem1 requested-size 512G
>
> [2] Hardware : Intel Icelake server
> Guest Kernel : v6.18-rc2
> Qemu : v9.0.0
>
> Launch VM :
> qemu-system-x86_64 -accel kvm -cpu host \
> -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
> -drive file=./seed.img,format=raw,if=virtio \
> -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
> -m 2G,slots=10,maxmem=2052472M \
> -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
> -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
> -nographic -machine q35 \
> -nic user,hostfwd=tcp::3000-:22
>
> Guest kernel auto-onlines newly added memory blocks:
> echo online > /sys/devices/system/memory/auto_online_blocks
>
> [3] The time from typing the QEMU commands in [1] to when the output of
> 'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
> memory is recognized.
>
> Reported-by: Nanhai Zou <nanhai.zou@intel.com>
> Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
> Tested-by: Yuan Liu <yuan1.liu@intel.com>
> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
> Reviewed-by: Pan Deng <pan.deng@intel.com>
> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
> Signed-off-by: Tianyou Li <tianyou.li@intel.com>
> ---
> mm/internal.h | 3 +++
> mm/memory_hotplug.c | 48 ++++++++++++++++++++++++++++++++++++++++++++-
> mm/mm_init.c | 31 ++++++++++++++++++++++-------
> 3 files changed, 74 insertions(+), 8 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 1561fc2ff5b8..734caae6873c 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -734,6 +734,9 @@ void set_zone_contiguous(struct zone *zone);
> bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
> unsigned long nr_pages);
>
> +bool check_zone_contiguous(struct zone *zone, unsigned long start_pfn,
> + unsigned long nr_pages);
> +
> static inline void clear_zone_contiguous(struct zone *zone)
> {
> zone->contiguous = false;
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 0be83039c3b5..96c003271b8e 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -723,6 +723,47 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
>
> }
>
> +static void __meminit update_zone_contiguous(struct zone *zone,
> + bool old_contiguous, unsigned long old_start_pfn,
> + unsigned long old_nr_pages, unsigned long old_absent_pages,
> + unsigned long new_start_pfn, unsigned long new_nr_pages)
> +{
> + unsigned long old_end_pfn = old_start_pfn + old_nr_pages;
> + unsigned long new_end_pfn = new_start_pfn + new_nr_pages;
> + unsigned long new_filled_pages = 0;
> +
> + /*
> + * If the moved pfn range does not intersect with the old zone span,
> + * the contiguous property is surely false.
> + */
> + if (new_end_pfn < old_start_pfn || new_start_pfn > old_end_pfn)
> + return;
> +
> + /*
> + * If the moved pfn range is adjacent to the old zone span,
> + * check the range to the left or to the right
> + */
> + if (new_end_pfn == old_start_pfn || new_start_pfn == old_end_pfn) {
> + zone->contiguous = old_contiguous &&
> + check_zone_contiguous(zone, new_start_pfn, new_nr_pages);
> + return;
The check for adjacency of the new range to the zone can be moved to the
beginning of move_pfn_range_to_zone() and it will already optimize the
common case when we hotplug memory to a contiguous zone.
> + }
> +
> + /*
> + * If old zone's hole larger than the new filled pages, the contiguous
> + * property is surely false.
> + */
> + new_filled_pages = new_end_pfn - old_start_pfn;
> + if (new_start_pfn > old_start_pfn)
> + new_filled_pages -= new_start_pfn - old_start_pfn;
> + if (new_end_pfn > old_end_pfn)
> + new_filled_pages -= new_end_pfn - old_end_pfn;
> + if (new_filled_pages < old_absent_pages)
> + return;
Let's just check that we don't add enough pages to cover the hole
if (nr_new_pages < old_absent_pages)
return;
and if we do go to the slow path and walk the pageblocks.
> +
> + set_zone_contiguous(zone);
> +}
> +
--
Sincerely yours,
Mike.
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-18 5:13 ` Mike Rapoport
@ 2025-11-18 9:28 ` Li, Tianyou
2025-11-18 9:35 ` Li, Tianyou
2025-11-19 4:07 ` [PATCH v2] " Tianyou Li
2 siblings, 0 replies; 23+ messages in thread
From: Li, Tianyou @ 2025-11-18 9:28 UTC (permalink / raw)
To: Mike Rapoport
Cc: David Hildenbrand, Oscar Salvador, linux-mm, Yong Hu, Nanhai Zou,
Yuan Liu, Tim Chen, Qiuxu Zhuo, Yu C Chen, Pan Deng, Chen Zhang,
linux-kernel
[-- Attachment #1: Type: text/plain, Size: 6616 bytes --]
Thanks for your comments Mike. Appreciated.
On 11/18/2025 1:13 PM, Mike Rapoport wrote:
> On Mon, Nov 17, 2025 at 11:30:52AM +0800, Tianyou Li wrote:
>> When invoke move_pfn_range_to_zone, it will update the zone->contiguous by
>> checking the new zone's pfn range from the beginning to the end, regardless
>> the previous state of the old zone. When the zone's pfn range is large, the
>> cost of traversing the pfn range to update the zone->contiguous could be
>> significant.
>>
>> Add fast paths to quickly detect cases where zone is definitely not
>> contiguous without scanning the new zone. The cases are: when the new range
>> did not overlap with previous range, the contiguous should be false; if the
>> new range adjacent with the previous range, just need to check the new
>> range; if the new added pages could not fill the hole of previous zone, the
>> contiguous should be false.
>>
>> The following test cases of memory hotplug for a VM [1], tested in the
>> environment [2], show that this optimization can significantly reduce the
>> memory hotplug time [3].
>>
>> +----------------+------+---------------+--------------+----------------+
>> | | Size | Time (before) | Time (after) | Time Reduction |
>> | +------+---------------+--------------+----------------+
>> | Memory Hotplug | 256G | 10s | 3s | 70% |
>> | +------+---------------+--------------+----------------+
>> | | 512G | 33s | 8s | 76% |
>> +----------------+------+---------------+--------------+----------------+
>>
>> [1] Qemu commands to hotplug 512G memory for a VM:
>> object_add memory-backend-ram,id=hotmem0,size=512G,share=on
>> device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
>> qom-set vmem1 requested-size 512G
>>
>> [2] Hardware : Intel Icelake server
>> Guest Kernel : v6.18-rc2
>> Qemu : v9.0.0
>>
>> Launch VM :
>> qemu-system-x86_64 -accel kvm -cpu host \
>> -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
>> -drive file=./seed.img,format=raw,if=virtio \
>> -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
>> -m 2G,slots=10,maxmem=2052472M \
>> -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
>> -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
>> -nographic -machine q35 \
>> -nic user,hostfwd=tcp::3000-:22
>>
>> Guest kernel auto-onlines newly added memory blocks:
>> echo online > /sys/devices/system/memory/auto_online_blocks
>>
>> [3] The time from typing the QEMU commands in [1] to when the output of
>> 'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
>> memory is recognized.
>>
>> Reported-by: Nanhai Zou<nanhai.zou@intel.com>
>> Reported-by: Chen Zhang<zhangchen.kidd@jd.com>
>> Tested-by: Yuan Liu<yuan1.liu@intel.com>
>> Reviewed-by: Tim Chen<tim.c.chen@linux.intel.com>
>> Reviewed-by: Qiuxu Zhuo<qiuxu.zhuo@intel.com>
>> Reviewed-by: Yu C Chen<yu.c.chen@intel.com>
>> Reviewed-by: Pan Deng<pan.deng@intel.com>
>> Reviewed-by: Nanhai Zou<nanhai.zou@intel.com>
>> Signed-off-by: Tianyou Li<tianyou.li@intel.com>
>> ---
>> mm/internal.h | 3 +++
>> mm/memory_hotplug.c | 48 ++++++++++++++++++++++++++++++++++++++++++++-
>> mm/mm_init.c | 31 ++++++++++++++++++++++-------
>> 3 files changed, 74 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/internal.h b/mm/internal.h
>> index 1561fc2ff5b8..734caae6873c 100644
>> --- a/mm/internal.h
>> +++ b/mm/internal.h
>> @@ -734,6 +734,9 @@ void set_zone_contiguous(struct zone *zone);
>> bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
>> unsigned long nr_pages);
>>
>> +bool check_zone_contiguous(struct zone *zone, unsigned long start_pfn,
>> + unsigned long nr_pages);
>> +
>> static inline void clear_zone_contiguous(struct zone *zone)
>> {
>> zone->contiguous = false;
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 0be83039c3b5..96c003271b8e 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -723,6 +723,47 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
>>
>> }
>>
>> +static void __meminit update_zone_contiguous(struct zone *zone,
>> + bool old_contiguous, unsigned long old_start_pfn,
>> + unsigned long old_nr_pages, unsigned long old_absent_pages,
>> + unsigned long new_start_pfn, unsigned long new_nr_pages)
>> +{
>> + unsigned long old_end_pfn = old_start_pfn + old_nr_pages;
>> + unsigned long new_end_pfn = new_start_pfn + new_nr_pages;
>> + unsigned long new_filled_pages = 0;
>> +
>> + /*
>> + * If the moved pfn range does not intersect with the old zone span,
>> + * the contiguous property is surely false.
>> + */
>> + if (new_end_pfn < old_start_pfn || new_start_pfn > old_end_pfn)
>> + return;
>> +
>> + /*
>> + * If the moved pfn range is adjacent to the old zone span,
>> + * check the range to the left or to the right
>> + */
>> + if (new_end_pfn == old_start_pfn || new_start_pfn == old_end_pfn) {
>> + zone->contiguous = old_contiguous &&
>> + check_zone_contiguous(zone, new_start_pfn, new_nr_pages);
>> + return;
> The check for adjacency of the new range to the zone can be moved to the
> beginning of move_pfn_range_to_zone() and it will already optimize the
> common case when we hotplug memory to a contiguous zone.
Do you mean we can separate the update_zone_contiguous logic into two
parts, one for fast path at the beginning of the move_pfn_range_to_zone,
and the other for slow path after the memmep_init_range?
>> + }
>> +
>> + /*
>> + * If old zone's hole larger than the new filled pages, the contiguous
>> + * property is surely false.
>> + */
>> + new_filled_pages = new_end_pfn - old_start_pfn;
>> + if (new_start_pfn > old_start_pfn)
>> + new_filled_pages -= new_start_pfn - old_start_pfn;
>> + if (new_end_pfn > old_end_pfn)
>> + new_filled_pages -= new_end_pfn - old_end_pfn;
>> + if (new_filled_pages < old_absent_pages)
>> + return;
> Let's just check that we don't add enough pages to cover the hole
>
> if (nr_new_pages < old_absent_pages)
> return;
>
> and if we do go to the slow path and walk the pageblocks.
I'd like to avoid of the slow path as much as possible. The check 'if
(nr_new_pages < old_absent_pages)' is more strict if overlap happens. I
am OK to simplify it if there is no overlap cases or to reduce the
maintaining efforts.
Thanks & Regards,
Tianyou
>> +
>> + set_zone_contiguous(zone);
>> +}
>> +
[-- Attachment #2: Type: text/html, Size: 8303 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-18 5:13 ` Mike Rapoport
2025-11-18 9:28 ` Li, Tianyou
@ 2025-11-18 9:35 ` Li, Tianyou
2025-11-18 10:31 ` Li, Tianyou
2025-11-19 4:07 ` [PATCH v2] " Tianyou Li
2 siblings, 1 reply; 23+ messages in thread
From: Li, Tianyou @ 2025-11-18 9:35 UTC (permalink / raw)
To: Mike Rapoport
Cc: David Hildenbrand, Oscar Salvador, linux-mm, Yong Hu, Nanhai Zou,
Yuan Liu, Tim Chen, Qiuxu Zhuo, Yu C Chen, Pan Deng, Chen Zhang,
linux-kernel
Thanks for your comments Mike. Appreciated.
On 11/18/2025 1:13 PM, Mike Rapoport wrote:
> On Mon, Nov 17, 2025 at 11:30:52AM +0800, Tianyou Li wrote:
>> When invoke move_pfn_range_to_zone, it will update the zone->contiguous by
>> checking the new zone's pfn range from the beginning to the end, regardless
>> the previous state of the old zone. When the zone's pfn range is large, the
>> cost of traversing the pfn range to update the zone->contiguous could be
>> significant.
>>
>> Add fast paths to quickly detect cases where zone is definitely not
>> contiguous without scanning the new zone. The cases are: when the new range
>> did not overlap with previous range, the contiguous should be false; if the
>> new range adjacent with the previous range, just need to check the new
>> range; if the new added pages could not fill the hole of previous zone, the
>> contiguous should be false.
>>
>> The following test cases of memory hotplug for a VM [1], tested in the
>> environment [2], show that this optimization can significantly reduce the
>> memory hotplug time [3].
>>
>> +----------------+------+---------------+--------------+----------------+
>> | | Size | Time (before) | Time (after) | Time Reduction |
>> | +------+---------------+--------------+----------------+
>> | Memory Hotplug | 256G | 10s | 3s | 70% |
>> | +------+---------------+--------------+----------------+
>> | | 512G | 33s | 8s | 76% |
>> +----------------+------+---------------+--------------+----------------+
>>
>> [1] Qemu commands to hotplug 512G memory for a VM:
>> object_add memory-backend-ram,id=hotmem0,size=512G,share=on
>> device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
>> qom-set vmem1 requested-size 512G
>>
>> [2] Hardware : Intel Icelake server
>> Guest Kernel : v6.18-rc2
>> Qemu : v9.0.0
>>
>> Launch VM :
>> qemu-system-x86_64 -accel kvm -cpu host \
>> -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
>> -drive file=./seed.img,format=raw,if=virtio \
>> -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
>> -m 2G,slots=10,maxmem=2052472M \
>> -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
>> -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
>> -nographic -machine q35 \
>> -nic user,hostfwd=tcp::3000-:22
>>
>> Guest kernel auto-onlines newly added memory blocks:
>> echo online > /sys/devices/system/memory/auto_online_blocks
>>
>> [3] The time from typing the QEMU commands in [1] to when the output of
>> 'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
>> memory is recognized.
>>
>> Reported-by: Nanhai Zou <nanhai.zou@intel.com>
>> Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
>> Tested-by: Yuan Liu <yuan1.liu@intel.com>
>> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
>> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>> Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
>> Reviewed-by: Pan Deng <pan.deng@intel.com>
>> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
>> Signed-off-by: Tianyou Li <tianyou.li@intel.com>
>> ---
>> mm/internal.h | 3 +++
>> mm/memory_hotplug.c | 48 ++++++++++++++++++++++++++++++++++++++++++++-
>> mm/mm_init.c | 31 ++++++++++++++++++++++-------
>> 3 files changed, 74 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/internal.h b/mm/internal.h
>> index 1561fc2ff5b8..734caae6873c 100644
>> --- a/mm/internal.h
>> +++ b/mm/internal.h
>> @@ -734,6 +734,9 @@ void set_zone_contiguous(struct zone *zone);
>> bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
>> unsigned long nr_pages);
>>
>> +bool check_zone_contiguous(struct zone *zone, unsigned long start_pfn,
>> + unsigned long nr_pages);
>> +
>> static inline void clear_zone_contiguous(struct zone *zone)
>> {
>> zone->contiguous = false;
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 0be83039c3b5..96c003271b8e 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -723,6 +723,47 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
>>
>> }
>>
>> +static void __meminit update_zone_contiguous(struct zone *zone,
>> + bool old_contiguous, unsigned long old_start_pfn,
>> + unsigned long old_nr_pages, unsigned long old_absent_pages,
>> + unsigned long new_start_pfn, unsigned long new_nr_pages)
>> +{
>> + unsigned long old_end_pfn = old_start_pfn + old_nr_pages;
>> + unsigned long new_end_pfn = new_start_pfn + new_nr_pages;
>> + unsigned long new_filled_pages = 0;
>> +
>> + /*
>> + * If the moved pfn range does not intersect with the old zone span,
>> + * the contiguous property is surely false.
>> + */
>> + if (new_end_pfn < old_start_pfn || new_start_pfn > old_end_pfn)
>> + return;
>> +
>> + /*
>> + * If the moved pfn range is adjacent to the old zone span,
>> + * check the range to the left or to the right
>> + */
>> + if (new_end_pfn == old_start_pfn || new_start_pfn == old_end_pfn) {
>> + zone->contiguous = old_contiguous &&
>> + check_zone_contiguous(zone, new_start_pfn, new_nr_pages);
>> + return;
> The check for adjacency of the new range to the zone can be moved to the
> beginning of move_pfn_range_to_zone() and it will already optimize the
> common case when we hotplug memory to a contiguous zone.
Do you mean we can separate the update_zone_contiguous logic into two
parts, one for fast path at the beginning of the move_pfn_range_to_zone,
and the other for slow path after the memmep_init_range?
>> + }
>> +
>> + /*
>> + * If old zone's hole larger than the new filled pages, the contiguous
>> + * property is surely false.
>> + */
>> + new_filled_pages = new_end_pfn - old_start_pfn;
>> + if (new_start_pfn > old_start_pfn)
>> + new_filled_pages -= new_start_pfn - old_start_pfn;
>> + if (new_end_pfn > old_end_pfn)
>> + new_filled_pages -= new_end_pfn - old_end_pfn;
>> + if (new_filled_pages < old_absent_pages)
>> + return;
> Let's just check that we don't add enough pages to cover the hole
>
> if (nr_new_pages < old_absent_pages)
> return;
>
> and if we do go to the slow path and walk the pageblocks.
I'd like to avoid of the slow path as much as possible. The check 'if
(nr_new_pages < old_absent_pages)' is more strict if overlap happens. I
am OK to simplify it if there is no overlap cases or to reduce the
maintaining efforts.
Thanks & Regards,
Tianyou
>> +
>> + set_zone_contiguous(zone);
>> +}
>> +
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-18 9:35 ` Li, Tianyou
@ 2025-11-18 10:31 ` Li, Tianyou
0 siblings, 0 replies; 23+ messages in thread
From: Li, Tianyou @ 2025-11-18 10:31 UTC (permalink / raw)
To: Mike Rapoport
Cc: David Hildenbrand, Oscar Salvador, linux-mm, Yong Hu, Nanhai Zou,
Yuan Liu, Tim Chen, Qiuxu Zhuo, Yu C Chen, Pan Deng, Chen Zhang,
linux-kernel
On 11/18/2025 5:35 PM, Li, Tianyou wrote:
> Thanks for your comments Mike. Appreciated.
>
>
> On 11/18/2025 1:13 PM, Mike Rapoport wrote:
>> On Mon, Nov 17, 2025 at 11:30:52AM +0800, Tianyou Li wrote:
>>> When invoke move_pfn_range_to_zone, it will update the
>>> zone->contiguous by
>>> checking the new zone's pfn range from the beginning to the end,
>>> regardless
>>> the previous state of the old zone. When the zone's pfn range is
>>> large, the
>>> cost of traversing the pfn range to update the zone->contiguous
>>> could be
>>> significant.
>>>
>>> Add fast paths to quickly detect cases where zone is definitely not
>>> contiguous without scanning the new zone. The cases are: when the
>>> new range
>>> did not overlap with previous range, the contiguous should be false;
>>> if the
>>> new range adjacent with the previous range, just need to check the new
>>> range; if the new added pages could not fill the hole of previous
>>> zone, the
>>> contiguous should be false.
>>>
>>> The following test cases of memory hotplug for a VM [1], tested in the
>>> environment [2], show that this optimization can significantly
>>> reduce the
>>> memory hotplug time [3].
>>>
>>> +----------------+------+---------------+--------------+----------------+
>>>
>>> | | Size | Time (before) | Time (after) | Time
>>> Reduction |
>>> | +------+---------------+--------------+----------------+
>>> | Memory Hotplug | 256G | 10s | 3s | 70% |
>>> | +------+---------------+--------------+----------------+
>>> | | 512G | 33s | 8s | 76% |
>>> +----------------+------+---------------+--------------+----------------+
>>>
>>>
>>> [1] Qemu commands to hotplug 512G memory for a VM:
>>> object_add memory-backend-ram,id=hotmem0,size=512G,share=on
>>> device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
>>> qom-set vmem1 requested-size 512G
>>>
>>> [2] Hardware : Intel Icelake server
>>> Guest Kernel : v6.18-rc2
>>> Qemu : v9.0.0
>>>
>>> Launch VM :
>>> qemu-system-x86_64 -accel kvm -cpu host \
>>> -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
>>> -drive file=./seed.img,format=raw,if=virtio \
>>> -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
>>> -m 2G,slots=10,maxmem=2052472M \
>>> -device
>>> pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
>>> -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
>>> -nographic -machine q35 \
>>> -nic user,hostfwd=tcp::3000-:22
>>>
>>> Guest kernel auto-onlines newly added memory blocks:
>>> echo online > /sys/devices/system/memory/auto_online_blocks
>>>
>>> [3] The time from typing the QEMU commands in [1] to when the output of
>>> 'grep MemTotal /proc/meminfo' on Guest reflects that all
>>> hotplugged
>>> memory is recognized.
>>>
>>> Reported-by: Nanhai Zou <nanhai.zou@intel.com>
>>> Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
>>> Tested-by: Yuan Liu <yuan1.liu@intel.com>
>>> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
>>> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>>> Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
>>> Reviewed-by: Pan Deng <pan.deng@intel.com>
>>> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
>>> Signed-off-by: Tianyou Li <tianyou.li@intel.com>
>>> ---
>>> mm/internal.h | 3 +++
>>> mm/memory_hotplug.c | 48
>>> ++++++++++++++++++++++++++++++++++++++++++++-
>>> mm/mm_init.c | 31 ++++++++++++++++++++++-------
>>> 3 files changed, 74 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/mm/internal.h b/mm/internal.h
>>> index 1561fc2ff5b8..734caae6873c 100644
>>> --- a/mm/internal.h
>>> +++ b/mm/internal.h
>>> @@ -734,6 +734,9 @@ void set_zone_contiguous(struct zone *zone);
>>> bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
>>> unsigned long nr_pages);
>>> +bool check_zone_contiguous(struct zone *zone, unsigned long
>>> start_pfn,
>>> + unsigned long nr_pages);
>>> +
>>> static inline void clear_zone_contiguous(struct zone *zone)
>>> {
>>> zone->contiguous = false;
>>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>>> index 0be83039c3b5..96c003271b8e 100644
>>> --- a/mm/memory_hotplug.c
>>> +++ b/mm/memory_hotplug.c
>>> @@ -723,6 +723,47 @@ static void __meminit resize_pgdat_range(struct
>>> pglist_data *pgdat, unsigned lon
>>> }
>>> +static void __meminit update_zone_contiguous(struct zone *zone,
>>> + bool old_contiguous, unsigned long old_start_pfn,
>>> + unsigned long old_nr_pages, unsigned long
>>> old_absent_pages,
>>> + unsigned long new_start_pfn, unsigned long new_nr_pages)
>>> +{
>>> + unsigned long old_end_pfn = old_start_pfn + old_nr_pages;
>>> + unsigned long new_end_pfn = new_start_pfn + new_nr_pages;
>>> + unsigned long new_filled_pages = 0;
>>> +
>>> + /*
>>> + * If the moved pfn range does not intersect with the old zone
>>> span,
>>> + * the contiguous property is surely false.
>>> + */
>>> + if (new_end_pfn < old_start_pfn || new_start_pfn > old_end_pfn)
>>> + return;
>>> +
>>> + /*
>>> + * If the moved pfn range is adjacent to the old zone span,
>>> + * check the range to the left or to the right
>>> + */
>>> + if (new_end_pfn == old_start_pfn || new_start_pfn ==
>>> old_end_pfn) {
>>> + zone->contiguous = old_contiguous &&
>>> + check_zone_contiguous(zone, new_start_pfn, new_nr_pages);
>>> + return;
>> The check for adjacency of the new range to the zone can be moved to the
>> beginning of move_pfn_range_to_zone() and it will already optimize the
>> common case when we hotplug memory to a contiguous zone.
>
>
> Do you mean we can separate the update_zone_contiguous logic into two
> parts, one for fast path at the beginning of the
> move_pfn_range_to_zone, and the other for slow path after the
> memmep_init_range?
>
Re-think your idea, it's doable consider the check_zone_contiguous is
not necessary. We can have a function check_zone_contiguous_fast, which
need to pass the zone, start_pfn and nr_pages, return a boolean value to
indicate the fast path or not. The code changes minimized. Will send the
patch v2 soon.
>
>>> + }
>>> +
>>> + /*
>>> + * If old zone's hole larger than the new filled pages, the
>>> contiguous
>>> + * property is surely false.
>>> + */
>>> + new_filled_pages = new_end_pfn - old_start_pfn;
>>> + if (new_start_pfn > old_start_pfn)
>>> + new_filled_pages -= new_start_pfn - old_start_pfn;
>>> + if (new_end_pfn > old_end_pfn)
>>> + new_filled_pages -= new_end_pfn - old_end_pfn;
>>> + if (new_filled_pages < old_absent_pages)
>>> + return;
>> Let's just check that we don't add enough pages to cover the hole
>>
>> if (nr_new_pages < old_absent_pages)
>> return;
>>
>> and if we do go to the slow path and walk the pageblocks.
>
>
> I'd like to avoid of the slow path as much as possible. The check 'if
> (nr_new_pages < old_absent_pages)' is more strict if overlap happens.
> I am OK to simplify it if there is no overlap cases or to reduce the
> maintaining efforts.
>
>
> Thanks & Regards,
>
> Tianyou
>
>
>>> +
>>> + set_zone_contiguous(zone);
>>> +}
>>> +
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-18 5:13 ` Mike Rapoport
2025-11-18 9:28 ` Li, Tianyou
2025-11-18 9:35 ` Li, Tianyou
@ 2025-11-19 4:07 ` Tianyou Li
2025-11-19 3:13 ` Li, Tianyou
2025-11-19 11:42 ` Wei Yang
2 siblings, 2 replies; 23+ messages in thread
From: Tianyou Li @ 2025-11-19 4:07 UTC (permalink / raw)
To: David Hildenbrand, Oscar Salvador, Mike Rapoport
Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Tianyou Li, Chen Zhang, linux-kernel
When invoke move_pfn_range_to_zone, it will update the zone->contiguous by
checking the new zone's pfn range from the beginning to the end, regardless
the previous state of the old zone. When the zone's pfn range is large, the
cost of traversing the pfn range to update the zone->contiguous could be
significant.
Add fast paths to quickly detect cases where zone is definitely not
contiguous without scanning the new zone. The cases are: when the new range
did not overlap with previous range, the contiguous should be false; if the
new range adjacent with the previous range, just need to check the new
range; if the new added pages could not fill the hole of previous zone, the
contiguous should be false.
The following test cases of memory hotplug for a VM [1], tested in the
environment [2], show that this optimization can significantly reduce the
memory hotplug time [3].
+----------------+------+---------------+--------------+----------------+
| | Size | Time (before) | Time (after) | Time Reduction |
| +------+---------------+--------------+----------------+
| Memory Hotplug | 256G | 10s | 2s | 80% |
| +------+---------------+--------------+----------------+
| | 512G | 33s | 6s | 81% |
+----------------+------+---------------+--------------+----------------+
[1] Qemu commands to hotplug 512G memory for a VM:
object_add memory-backend-ram,id=hotmem0,size=512G,share=on
device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
qom-set vmem1 requested-size 512G
[2] Hardware : Intel Icelake server
Guest Kernel : v6.18-rc2
Qemu : v9.0.0
Launch VM :
qemu-system-x86_64 -accel kvm -cpu host \
-drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
-drive file=./seed.img,format=raw,if=virtio \
-smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
-m 2G,slots=10,maxmem=2052472M \
-device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
-device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
-nographic -machine q35 \
-nic user,hostfwd=tcp::3000-:22
Guest kernel auto-onlines newly added memory blocks:
echo online > /sys/devices/system/memory/auto_online_blocks
[3] The time from typing the QEMU commands in [1] to when the output of
'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
memory is recognized.
Reported-by: Nanhai Zou <nanhai.zou@intel.com>
Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
Tested-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
Reviewed-by: Pan Deng <pan.deng@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Yuan Liu <yuan1.liu@intel.com>
Signed-off-by: Tianyou Li <tianyou.li@intel.com>
---
mm/memory_hotplug.c | 57 ++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 54 insertions(+), 3 deletions(-)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0be83039c3b5..8f126f20ca47 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -723,6 +723,57 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
}
+static bool __meminit check_zone_contiguous_fast(struct zone *zone,
+ unsigned long start_pfn, unsigned long nr_pages)
+{
+ const unsigned long end_pfn = start_pfn + nr_pages;
+ unsigned long nr_filled_pages;
+
+ /*
+ * Given the moved pfn range's contiguous property is always true,
+ * under the conditional of empty zone, the contiguous property should
+ * be true.
+ */
+ if (zone_is_empty(zone)) {
+ zone->contiguous = true;
+ return true;
+ }
+
+ /*
+ * If the moved pfn range does not intersect with the original zone span,
+ * the contiguous property is surely false.
+ */
+ if (end_pfn < zone->zone_start_pfn || start_pfn > zone_end_pfn(zone)) {
+ zone->contiguous = false;
+ return true;
+ }
+
+ /*
+ * If the moved pfn range is adjacent to the original zone span, given
+ * the moved pfn range's contiguous property is always true, the zone's
+ * contiguous property inherited from the original value.
+ */
+ if (end_pfn == zone->zone_start_pfn || start_pfn == zone_end_pfn(zone))
+ return true;
+
+ /*
+ * If the original zone's hole larger than the new filled pages, the
+ * contiguous property is surely false.
+ */
+ nr_filled_pages = end_pfn - zone->zone_start_pfn;
+ if (start_pfn > zone->zone_start_pfn)
+ nr_filled_pages -= start_pfn - zone->zone_start_pfn;
+ if (end_pfn > zone_end_pfn(zone))
+ nr_filled_pages -= end_pfn - zone_end_pfn(zone);
+ if (nr_filled_pages < (zone->spanned_pages - zone->present_pages)) {
+ zone->contiguous = false;
+ return true;
+ }
+
+ clear_zone_contiguous(zone);
+ return false;
+}
+
#ifdef CONFIG_ZONE_DEVICE
static void section_taint_zone_device(unsigned long pfn)
{
@@ -752,8 +803,7 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
{
struct pglist_data *pgdat = zone->zone_pgdat;
int nid = pgdat->node_id;
-
- clear_zone_contiguous(zone);
+ const bool fast_path = check_zone_contiguous_fast(zone, start_pfn, nr_pages);
if (zone_is_empty(zone))
init_currently_empty_zone(zone, start_pfn, nr_pages);
@@ -783,7 +833,8 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
MEMINIT_HOTPLUG, altmap, migratetype,
isolate_pageblock);
- set_zone_contiguous(zone);
+ if (!fast_path)
+ set_zone_contiguous(zone);
}
struct auto_movable_stats {
--
2.47.1
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH v2] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-19 4:07 ` [PATCH v2] " Tianyou Li
@ 2025-11-19 3:13 ` Li, Tianyou
2025-11-28 11:49 ` David Hildenbrand (Red Hat)
2025-11-19 11:42 ` Wei Yang
1 sibling, 1 reply; 23+ messages in thread
From: Li, Tianyou @ 2025-11-19 3:13 UTC (permalink / raw)
To: David Hildenbrand, Oscar Salvador, Mike Rapoport
Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Chen Zhang, linux-kernel
Hi All,
Patch v2 with the changes suggested from David and Mike, which
simplified the code and also performance get further improved on
Icelake. Appreciated for your review. Thanks.
Regards,
Tianyou
On 11/19/2025 12:07 PM, Tianyou Li wrote:
> When invoke move_pfn_range_to_zone, it will update the zone->contiguous by
> checking the new zone's pfn range from the beginning to the end, regardless
> the previous state of the old zone. When the zone's pfn range is large, the
> cost of traversing the pfn range to update the zone->contiguous could be
> significant.
>
> Add fast paths to quickly detect cases where zone is definitely not
> contiguous without scanning the new zone. The cases are: when the new range
> did not overlap with previous range, the contiguous should be false; if the
> new range adjacent with the previous range, just need to check the new
> range; if the new added pages could not fill the hole of previous zone, the
> contiguous should be false.
>
> The following test cases of memory hotplug for a VM [1], tested in the
> environment [2], show that this optimization can significantly reduce the
> memory hotplug time [3].
>
> +----------------+------+---------------+--------------+----------------+
> | | Size | Time (before) | Time (after) | Time Reduction |
> | +------+---------------+--------------+----------------+
> | Memory Hotplug | 256G | 10s | 2s | 80% |
> | +------+---------------+--------------+----------------+
> | | 512G | 33s | 6s | 81% |
> +----------------+------+---------------+--------------+----------------+
>
> [1] Qemu commands to hotplug 512G memory for a VM:
> object_add memory-backend-ram,id=hotmem0,size=512G,share=on
> device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
> qom-set vmem1 requested-size 512G
>
> [2] Hardware : Intel Icelake server
> Guest Kernel : v6.18-rc2
> Qemu : v9.0.0
>
> Launch VM :
> qemu-system-x86_64 -accel kvm -cpu host \
> -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
> -drive file=./seed.img,format=raw,if=virtio \
> -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
> -m 2G,slots=10,maxmem=2052472M \
> -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
> -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
> -nographic -machine q35 \
> -nic user,hostfwd=tcp::3000-:22
>
> Guest kernel auto-onlines newly added memory blocks:
> echo online > /sys/devices/system/memory/auto_online_blocks
>
> [3] The time from typing the QEMU commands in [1] to when the output of
> 'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
> memory is recognized.
>
> Reported-by: Nanhai Zou <nanhai.zou@intel.com>
> Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
> Tested-by: Yuan Liu <yuan1.liu@intel.com>
> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
> Reviewed-by: Pan Deng <pan.deng@intel.com>
> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
> Reviewed-by: Yuan Liu <yuan1.liu@intel.com>
> Signed-off-by: Tianyou Li <tianyou.li@intel.com>
> ---
> mm/memory_hotplug.c | 57 ++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 54 insertions(+), 3 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 0be83039c3b5..8f126f20ca47 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -723,6 +723,57 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
>
> }
>
> +static bool __meminit check_zone_contiguous_fast(struct zone *zone,
> + unsigned long start_pfn, unsigned long nr_pages)
> +{
> + const unsigned long end_pfn = start_pfn + nr_pages;
> + unsigned long nr_filled_pages;
> +
> + /*
> + * Given the moved pfn range's contiguous property is always true,
> + * under the conditional of empty zone, the contiguous property should
> + * be true.
> + */
> + if (zone_is_empty(zone)) {
> + zone->contiguous = true;
> + return true;
> + }
> +
> + /*
> + * If the moved pfn range does not intersect with the original zone span,
> + * the contiguous property is surely false.
> + */
> + if (end_pfn < zone->zone_start_pfn || start_pfn > zone_end_pfn(zone)) {
> + zone->contiguous = false;
> + return true;
> + }
> +
> + /*
> + * If the moved pfn range is adjacent to the original zone span, given
> + * the moved pfn range's contiguous property is always true, the zone's
> + * contiguous property inherited from the original value.
> + */
> + if (end_pfn == zone->zone_start_pfn || start_pfn == zone_end_pfn(zone))
> + return true;
> +
> + /*
> + * If the original zone's hole larger than the new filled pages, the
> + * contiguous property is surely false.
> + */
> + nr_filled_pages = end_pfn - zone->zone_start_pfn;
> + if (start_pfn > zone->zone_start_pfn)
> + nr_filled_pages -= start_pfn - zone->zone_start_pfn;
> + if (end_pfn > zone_end_pfn(zone))
> + nr_filled_pages -= end_pfn - zone_end_pfn(zone);
> + if (nr_filled_pages < (zone->spanned_pages - zone->present_pages)) {
> + zone->contiguous = false;
> + return true;
> + }
> +
> + clear_zone_contiguous(zone);
> + return false;
> +}
> +
> #ifdef CONFIG_ZONE_DEVICE
> static void section_taint_zone_device(unsigned long pfn)
> {
> @@ -752,8 +803,7 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> {
> struct pglist_data *pgdat = zone->zone_pgdat;
> int nid = pgdat->node_id;
> -
> - clear_zone_contiguous(zone);
> + const bool fast_path = check_zone_contiguous_fast(zone, start_pfn, nr_pages);
>
> if (zone_is_empty(zone))
> init_currently_empty_zone(zone, start_pfn, nr_pages);
> @@ -783,7 +833,8 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> MEMINIT_HOTPLUG, altmap, migratetype,
> isolate_pageblock);
>
> - set_zone_contiguous(zone);
> + if (!fast_path)
> + set_zone_contiguous(zone);
> }
>
> struct auto_movable_stats {
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH v2] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-19 3:13 ` Li, Tianyou
@ 2025-11-28 11:49 ` David Hildenbrand (Red Hat)
2025-11-28 13:33 ` Li, Tianyou
0 siblings, 1 reply; 23+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-28 11:49 UTC (permalink / raw)
To: Li, Tianyou, Oscar Salvador, Mike Rapoport
Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Chen Zhang, linux-kernel
On 11/19/25 04:13, Li, Tianyou wrote:
> Hi All,
>
> Patch v2 with the changes suggested from David and Mike, which
> simplified the code and also performance get further improved on
> Icelake. Appreciated for your review. Thanks.
Now about to review v3 :)
--
Cheers
David
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-28 11:49 ` David Hildenbrand (Red Hat)
@ 2025-11-28 13:33 ` Li, Tianyou
0 siblings, 0 replies; 23+ messages in thread
From: Li, Tianyou @ 2025-11-28 13:33 UTC (permalink / raw)
To: David Hildenbrand (Red Hat), Oscar Salvador, Mike Rapoport
Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Chen Zhang, linux-kernel
Thanks David. Appreciated.
Regards,
Tianyou
On 11/28/2025 7:49 PM, David Hildenbrand (Red Hat) wrote:
> On 11/19/25 04:13, Li, Tianyou wrote:
>> Hi All,
>>
>> Patch v2 with the changes suggested from David and Mike, which
>> simplified the code and also performance get further improved on
>> Icelake. Appreciated for your review. Thanks.
>
> Now about to review v3 :)
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-19 4:07 ` [PATCH v2] " Tianyou Li
2025-11-19 3:13 ` Li, Tianyou
@ 2025-11-19 11:42 ` Wei Yang
2025-11-19 12:41 ` Li, Tianyou
2025-11-19 14:06 ` [PATCH v3] " Tianyou Li
1 sibling, 2 replies; 23+ messages in thread
From: Wei Yang @ 2025-11-19 11:42 UTC (permalink / raw)
To: Tianyou Li
Cc: David Hildenbrand, Oscar Salvador, Mike Rapoport, linux-mm,
Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo, Yu C Chen,
Pan Deng, Chen Zhang, linux-kernel
On Wed, Nov 19, 2025 at 12:07:18PM +0800, Tianyou Li wrote:
>When invoke move_pfn_range_to_zone, it will update the zone->contiguous by
>checking the new zone's pfn range from the beginning to the end, regardless
>the previous state of the old zone. When the zone's pfn range is large, the
>cost of traversing the pfn range to update the zone->contiguous could be
>significant.
>
>Add fast paths to quickly detect cases where zone is definitely not
>contiguous without scanning the new zone. The cases are: when the new range
>did not overlap with previous range, the contiguous should be false; if the
>new range adjacent with the previous range, just need to check the new
>range; if the new added pages could not fill the hole of previous zone, the
>contiguous should be false.
>
>The following test cases of memory hotplug for a VM [1], tested in the
>environment [2], show that this optimization can significantly reduce the
>memory hotplug time [3].
>
>+----------------+------+---------------+--------------+----------------+
>| | Size | Time (before) | Time (after) | Time Reduction |
>| +------+---------------+--------------+----------------+
>| Memory Hotplug | 256G | 10s | 2s | 80% |
>| +------+---------------+--------------+----------------+
>| | 512G | 33s | 6s | 81% |
>+----------------+------+---------------+--------------+----------------+
>
Nice
>[1] Qemu commands to hotplug 512G memory for a VM:
> object_add memory-backend-ram,id=hotmem0,size=512G,share=on
> device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
> qom-set vmem1 requested-size 512G
>
>[2] Hardware : Intel Icelake server
> Guest Kernel : v6.18-rc2
> Qemu : v9.0.0
>
> Launch VM :
> qemu-system-x86_64 -accel kvm -cpu host \
> -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
> -drive file=./seed.img,format=raw,if=virtio \
> -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
> -m 2G,slots=10,maxmem=2052472M \
> -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
> -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
> -nographic -machine q35 \
> -nic user,hostfwd=tcp::3000-:22
>
> Guest kernel auto-onlines newly added memory blocks:
> echo online > /sys/devices/system/memory/auto_online_blocks
>
>[3] The time from typing the QEMU commands in [1] to when the output of
> 'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
> memory is recognized.
>
>Reported-by: Nanhai Zou <nanhai.zou@intel.com>
>Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
>Tested-by: Yuan Liu <yuan1.liu@intel.com>
>Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
>Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
>Reviewed-by: Pan Deng <pan.deng@intel.com>
>Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
>Reviewed-by: Yuan Liu <yuan1.liu@intel.com>
>Signed-off-by: Tianyou Li <tianyou.li@intel.com>
>---
> mm/memory_hotplug.c | 57 ++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 54 insertions(+), 3 deletions(-)
>
>diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>index 0be83039c3b5..8f126f20ca47 100644
>--- a/mm/memory_hotplug.c
>+++ b/mm/memory_hotplug.c
>@@ -723,6 +723,57 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
>
> }
>
>+static bool __meminit check_zone_contiguous_fast(struct zone *zone,
>+ unsigned long start_pfn, unsigned long nr_pages)
>+{
>+ const unsigned long end_pfn = start_pfn + nr_pages;
>+ unsigned long nr_filled_pages;
>+
>+ /*
>+ * Given the moved pfn range's contiguous property is always true,
>+ * under the conditional of empty zone, the contiguous property should
>+ * be true.
>+ */
>+ if (zone_is_empty(zone)) {
>+ zone->contiguous = true;
>+ return true;
>+ }
>+
>+ /*
>+ * If the moved pfn range does not intersect with the original zone span,
>+ * the contiguous property is surely false.
>+ */
>+ if (end_pfn < zone->zone_start_pfn || start_pfn > zone_end_pfn(zone)) {
>+ zone->contiguous = false;
>+ return true;
>+ }
>+
>+ /*
>+ * If the moved pfn range is adjacent to the original zone span, given
>+ * the moved pfn range's contiguous property is always true, the zone's
>+ * contiguous property inherited from the original value.
>+ */
>+ if (end_pfn == zone->zone_start_pfn || start_pfn == zone_end_pfn(zone))
>+ return true;
>+
>+ /*
>+ * If the original zone's hole larger than the new filled pages, the
>+ * contiguous property is surely false.
>+ */
>+ nr_filled_pages = end_pfn - zone->zone_start_pfn;
>+ if (start_pfn > zone->zone_start_pfn)
>+ nr_filled_pages -= start_pfn - zone->zone_start_pfn;
>+ if (end_pfn > zone_end_pfn(zone))
>+ nr_filled_pages -= end_pfn - zone_end_pfn(zone);
>+ if (nr_filled_pages < (zone->spanned_pages - zone->present_pages)) {
>+ zone->contiguous = false;
>+ return true;
>+ }
>+
Mike's suggestion is easier for me to understand :-)
>+ clear_zone_contiguous(zone);
>+ return false;
>+}
>+
> #ifdef CONFIG_ZONE_DEVICE
> static void section_taint_zone_device(unsigned long pfn)
> {
>@@ -752,8 +803,7 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> {
> struct pglist_data *pgdat = zone->zone_pgdat;
> int nid = pgdat->node_id;
>-
>- clear_zone_contiguous(zone);
>+ const bool fast_path = check_zone_contiguous_fast(zone, start_pfn, nr_pages);
>
> if (zone_is_empty(zone))
> init_currently_empty_zone(zone, start_pfn, nr_pages);
>@@ -783,7 +833,8 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> MEMINIT_HOTPLUG, altmap, migratetype,
> isolate_pageblock);
>
>- set_zone_contiguous(zone);
>+ if (!fast_path)
>+ set_zone_contiguous(zone);
> }
>
> struct auto_movable_stats {
>--
>2.47.1
>
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH v2] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-19 11:42 ` Wei Yang
@ 2025-11-19 12:41 ` Li, Tianyou
2025-11-19 12:44 ` Wei Yang
2025-11-19 14:06 ` [PATCH v3] " Tianyou Li
1 sibling, 1 reply; 23+ messages in thread
From: Li, Tianyou @ 2025-11-19 12:41 UTC (permalink / raw)
To: Wei Yang
Cc: David Hildenbrand, Oscar Salvador, Mike Rapoport, linux-mm,
Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo, Yu C Chen,
Pan Deng, Chen Zhang, linux-kernel
On 11/19/2025 7:42 PM, Wei Yang wrote:
> On Wed, Nov 19, 2025 at 12:07:18PM +0800, Tianyou Li wrote:
>> When invoke move_pfn_range_to_zone, it will update the zone->contiguous by
>> checking the new zone's pfn range from the beginning to the end, regardless
>> the previous state of the old zone. When the zone's pfn range is large, the
>> cost of traversing the pfn range to update the zone->contiguous could be
>> significant.
>>
>> Add fast paths to quickly detect cases where zone is definitely not
>> contiguous without scanning the new zone. The cases are: when the new range
>> did not overlap with previous range, the contiguous should be false; if the
>> new range adjacent with the previous range, just need to check the new
>> range; if the new added pages could not fill the hole of previous zone, the
>> contiguous should be false.
>>
>> The following test cases of memory hotplug for a VM [1], tested in the
>> environment [2], show that this optimization can significantly reduce the
>> memory hotplug time [3].
>>
>> +----------------+------+---------------+--------------+----------------+
>> | | Size | Time (before) | Time (after) | Time Reduction |
>> | +------+---------------+--------------+----------------+
>> | Memory Hotplug | 256G | 10s | 2s | 80% |
>> | +------+---------------+--------------+----------------+
>> | | 512G | 33s | 6s | 81% |
>> +----------------+------+---------------+--------------+----------------+
>>
> Nice
Thanks for your time to review.
>> [1] Qemu commands to hotplug 512G memory for a VM:
>> object_add memory-backend-ram,id=hotmem0,size=512G,share=on
>> device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
>> qom-set vmem1 requested-size 512G
>>
>> [2] Hardware : Intel Icelake server
>> Guest Kernel : v6.18-rc2
>> Qemu : v9.0.0
>>
>> Launch VM :
>> qemu-system-x86_64 -accel kvm -cpu host \
>> -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
>> -drive file=./seed.img,format=raw,if=virtio \
>> -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
>> -m 2G,slots=10,maxmem=2052472M \
>> -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
>> -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
>> -nographic -machine q35 \
>> -nic user,hostfwd=tcp::3000-:22
>>
>> Guest kernel auto-onlines newly added memory blocks:
>> echo online > /sys/devices/system/memory/auto_online_blocks
>>
>> [3] The time from typing the QEMU commands in [1] to when the output of
>> 'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
>> memory is recognized.
>>
>> Reported-by: Nanhai Zou <nanhai.zou@intel.com>
>> Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
>> Tested-by: Yuan Liu <yuan1.liu@intel.com>
>> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
>> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>> Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
>> Reviewed-by: Pan Deng <pan.deng@intel.com>
>> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
>> Reviewed-by: Yuan Liu <yuan1.liu@intel.com>
>> Signed-off-by: Tianyou Li <tianyou.li@intel.com>
>> ---
>> mm/memory_hotplug.c | 57 ++++++++++++++++++++++++++++++++++++++++++---
>> 1 file changed, 54 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 0be83039c3b5..8f126f20ca47 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -723,6 +723,57 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
>>
>> }
>>
>> +static bool __meminit check_zone_contiguous_fast(struct zone *zone,
>> + unsigned long start_pfn, unsigned long nr_pages)
>> +{
>> + const unsigned long end_pfn = start_pfn + nr_pages;
>> + unsigned long nr_filled_pages;
>> +
>> + /*
>> + * Given the moved pfn range's contiguous property is always true,
>> + * under the conditional of empty zone, the contiguous property should
>> + * be true.
>> + */
>> + if (zone_is_empty(zone)) {
>> + zone->contiguous = true;
>> + return true;
>> + }
>> +
>> + /*
>> + * If the moved pfn range does not intersect with the original zone span,
>> + * the contiguous property is surely false.
>> + */
>> + if (end_pfn < zone->zone_start_pfn || start_pfn > zone_end_pfn(zone)) {
>> + zone->contiguous = false;
>> + return true;
>> + }
>> +
>> + /*
>> + * If the moved pfn range is adjacent to the original zone span, given
>> + * the moved pfn range's contiguous property is always true, the zone's
>> + * contiguous property inherited from the original value.
>> + */
>> + if (end_pfn == zone->zone_start_pfn || start_pfn == zone_end_pfn(zone))
>> + return true;
>> +
>> + /*
>> + * If the original zone's hole larger than the new filled pages, the
>> + * contiguous property is surely false.
>> + */
>> + nr_filled_pages = end_pfn - zone->zone_start_pfn;
>> + if (start_pfn > zone->zone_start_pfn)
>> + nr_filled_pages -= start_pfn - zone->zone_start_pfn;
>> + if (end_pfn > zone_end_pfn(zone))
>> + nr_filled_pages -= end_pfn - zone_end_pfn(zone);
>> + if (nr_filled_pages < (zone->spanned_pages - zone->present_pages)) {
>> + zone->contiguous = false;
>> + return true;
>> + }
>> +
> Mike's suggestion is easier for me to understand :-)
OK :-), with the clear votes now, I will change it in patch v3 real
quick. Thanks.
>> + clear_zone_contiguous(zone);
>> + return false;
>> +}
>> +
>> #ifdef CONFIG_ZONE_DEVICE
>> static void section_taint_zone_device(unsigned long pfn)
>> {
>> @@ -752,8 +803,7 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
>> {
>> struct pglist_data *pgdat = zone->zone_pgdat;
>> int nid = pgdat->node_id;
>> -
>> - clear_zone_contiguous(zone);
>> + const bool fast_path = check_zone_contiguous_fast(zone, start_pfn, nr_pages);
>>
>> if (zone_is_empty(zone))
>> init_currently_empty_zone(zone, start_pfn, nr_pages);
>> @@ -783,7 +833,8 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
>> MEMINIT_HOTPLUG, altmap, migratetype,
>> isolate_pageblock);
>>
>> - set_zone_contiguous(zone);
>> + if (!fast_path)
>> + set_zone_contiguous(zone);
>> }
>>
>> struct auto_movable_stats {
>> --
>> 2.47.1
>>
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH v2] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-19 12:41 ` Li, Tianyou
@ 2025-11-19 12:44 ` Wei Yang
2025-11-19 13:16 ` Li, Tianyou
0 siblings, 1 reply; 23+ messages in thread
From: Wei Yang @ 2025-11-19 12:44 UTC (permalink / raw)
To: Li, Tianyou
Cc: Wei Yang, David Hildenbrand, Oscar Salvador, Mike Rapoport,
linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Chen Zhang, linux-kernel
On Wed, Nov 19, 2025 at 08:41:11PM +0800, Li, Tianyou wrote:
>
>On 11/19/2025 7:42 PM, Wei Yang wrote:
>> On Wed, Nov 19, 2025 at 12:07:18PM +0800, Tianyou Li wrote:
>> > When invoke move_pfn_range_to_zone, it will update the zone->contiguous by
>> > checking the new zone's pfn range from the beginning to the end, regardless
>> > the previous state of the old zone. When the zone's pfn range is large, the
>> > cost of traversing the pfn range to update the zone->contiguous could be
>> > significant.
>> >
>> > Add fast paths to quickly detect cases where zone is definitely not
>> > contiguous without scanning the new zone. The cases are: when the new range
>> > did not overlap with previous range, the contiguous should be false; if the
>> > new range adjacent with the previous range, just need to check the new
>> > range; if the new added pages could not fill the hole of previous zone, the
>> > contiguous should be false.
>> >
>> > The following test cases of memory hotplug for a VM [1], tested in the
>> > environment [2], show that this optimization can significantly reduce the
>> > memory hotplug time [3].
>> >
>> > +----------------+------+---------------+--------------+----------------+
>> > | | Size | Time (before) | Time (after) | Time Reduction |
>> > | +------+---------------+--------------+----------------+
>> > | Memory Hotplug | 256G | 10s | 2s | 80% |
>> > | +------+---------------+--------------+----------------+
>> > | | 512G | 33s | 6s | 81% |
>> > +----------------+------+---------------+--------------+----------------+
>> >
>> Nice
>
>
>Thanks for your time to review.
>
>
>> > [1] Qemu commands to hotplug 512G memory for a VM:
>> > object_add memory-backend-ram,id=hotmem0,size=512G,share=on
>> > device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
>> > qom-set vmem1 requested-size 512G
>> >
>> > [2] Hardware : Intel Icelake server
>> > Guest Kernel : v6.18-rc2
>> > Qemu : v9.0.0
>> >
>> > Launch VM :
>> > qemu-system-x86_64 -accel kvm -cpu host \
>> > -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
>> > -drive file=./seed.img,format=raw,if=virtio \
>> > -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
>> > -m 2G,slots=10,maxmem=2052472M \
>> > -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
>> > -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
>> > -nographic -machine q35 \
>> > -nic user,hostfwd=tcp::3000-:22
>> >
>> > Guest kernel auto-onlines newly added memory blocks:
>> > echo online > /sys/devices/system/memory/auto_online_blocks
>> >
>> > [3] The time from typing the QEMU commands in [1] to when the output of
>> > 'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
>> > memory is recognized.
>> >
>> > Reported-by: Nanhai Zou <nanhai.zou@intel.com>
>> > Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
>> > Tested-by: Yuan Liu <yuan1.liu@intel.com>
>> > Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
>> > Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>> > Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
>> > Reviewed-by: Pan Deng <pan.deng@intel.com>
>> > Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
>> > Reviewed-by: Yuan Liu <yuan1.liu@intel.com>
>> > Signed-off-by: Tianyou Li <tianyou.li@intel.com>
>> > ---
>> > mm/memory_hotplug.c | 57 ++++++++++++++++++++++++++++++++++++++++++---
>> > 1 file changed, 54 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> > index 0be83039c3b5..8f126f20ca47 100644
>> > --- a/mm/memory_hotplug.c
>> > +++ b/mm/memory_hotplug.c
>> > @@ -723,6 +723,57 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
>> >
>> > }
>> >
>> > +static bool __meminit check_zone_contiguous_fast(struct zone *zone,
>> > + unsigned long start_pfn, unsigned long nr_pages)
>> > +{
>> > + const unsigned long end_pfn = start_pfn + nr_pages;
>> > + unsigned long nr_filled_pages;
>> > +
>> > + /*
>> > + * Given the moved pfn range's contiguous property is always true,
>> > + * under the conditional of empty zone, the contiguous property should
>> > + * be true.
>> > + */
>> > + if (zone_is_empty(zone)) {
>> > + zone->contiguous = true;
>> > + return true;
>> > + }
>> > +
>> > + /*
>> > + * If the moved pfn range does not intersect with the original zone span,
>> > + * the contiguous property is surely false.
>> > + */
>> > + if (end_pfn < zone->zone_start_pfn || start_pfn > zone_end_pfn(zone)) {
>> > + zone->contiguous = false;
>> > + return true;
>> > + }
>> > +
>> > + /*
>> > + * If the moved pfn range is adjacent to the original zone span, given
>> > + * the moved pfn range's contiguous property is always true, the zone's
>> > + * contiguous property inherited from the original value.
>> > + */
>> > + if (end_pfn == zone->zone_start_pfn || start_pfn == zone_end_pfn(zone))
>> > + return true;
>> > +
>> > + /*
>> > + * If the original zone's hole larger than the new filled pages, the
>> > + * contiguous property is surely false.
>> > + */
>> > + nr_filled_pages = end_pfn - zone->zone_start_pfn;
>> > + if (start_pfn > zone->zone_start_pfn)
>> > + nr_filled_pages -= start_pfn - zone->zone_start_pfn;
>> > + if (end_pfn > zone_end_pfn(zone))
>> > + nr_filled_pages -= end_pfn - zone_end_pfn(zone);
>> > + if (nr_filled_pages < (zone->spanned_pages - zone->present_pages)) {
>> > + zone->contiguous = false;
>> > + return true;
>> > + }
>> > +
>> Mike's suggestion is easier for me to understand :-)
>
>
>OK :-), with the clear votes now, I will change it in patch v3 real quick.
>Thanks.
>
Thanks for your effort.
While maybe wait a little for v3, let's see other's comment on v2 :-)
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH v2] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-19 12:44 ` Wei Yang
@ 2025-11-19 13:16 ` Li, Tianyou
0 siblings, 0 replies; 23+ messages in thread
From: Li, Tianyou @ 2025-11-19 13:16 UTC (permalink / raw)
To: Wei Yang
Cc: David Hildenbrand, Oscar Salvador, Mike Rapoport, linux-mm,
Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo, Yu C Chen,
Pan Deng, Chen Zhang, linux-kernel
On 11/19/2025 8:44 PM, Wei Yang wrote:
> On Wed, Nov 19, 2025 at 08:41:11PM +0800, Li, Tianyou wrote:
>> On 11/19/2025 7:42 PM, Wei Yang wrote:
>>> On Wed, Nov 19, 2025 at 12:07:18PM +0800, Tianyou Li wrote:
>>>> When invoke move_pfn_range_to_zone, it will update the zone->contiguous by
>>>> checking the new zone's pfn range from the beginning to the end, regardless
>>>> the previous state of the old zone. When the zone's pfn range is large, the
>>>> cost of traversing the pfn range to update the zone->contiguous could be
>>>> significant.
>>>>
>>>> Add fast paths to quickly detect cases where zone is definitely not
>>>> contiguous without scanning the new zone. The cases are: when the new range
>>>> did not overlap with previous range, the contiguous should be false; if the
>>>> new range adjacent with the previous range, just need to check the new
>>>> range; if the new added pages could not fill the hole of previous zone, the
>>>> contiguous should be false.
>>>>
>>>> The following test cases of memory hotplug for a VM [1], tested in the
>>>> environment [2], show that this optimization can significantly reduce the
>>>> memory hotplug time [3].
>>>>
>>>> +----------------+------+---------------+--------------+----------------+
>>>> | | Size | Time (before) | Time (after) | Time Reduction |
>>>> | +------+---------------+--------------+----------------+
>>>> | Memory Hotplug | 256G | 10s | 2s | 80% |
>>>> | +------+---------------+--------------+----------------+
>>>> | | 512G | 33s | 6s | 81% |
>>>> +----------------+------+---------------+--------------+----------------+
>>>>
>>> Nice
>>
>> Thanks for your time to review.
>>
>>
>>>> [1] Qemu commands to hotplug 512G memory for a VM:
>>>> object_add memory-backend-ram,id=hotmem0,size=512G,share=on
>>>> device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
>>>> qom-set vmem1 requested-size 512G
>>>>
>>>> [2] Hardware : Intel Icelake server
>>>> Guest Kernel : v6.18-rc2
>>>> Qemu : v9.0.0
>>>>
>>>> Launch VM :
>>>> qemu-system-x86_64 -accel kvm -cpu host \
>>>> -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
>>>> -drive file=./seed.img,format=raw,if=virtio \
>>>> -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
>>>> -m 2G,slots=10,maxmem=2052472M \
>>>> -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
>>>> -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
>>>> -nographic -machine q35 \
>>>> -nic user,hostfwd=tcp::3000-:22
>>>>
>>>> Guest kernel auto-onlines newly added memory blocks:
>>>> echo online > /sys/devices/system/memory/auto_online_blocks
>>>>
>>>> [3] The time from typing the QEMU commands in [1] to when the output of
>>>> 'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
>>>> memory is recognized.
>>>>
>>>> Reported-by: Nanhai Zou <nanhai.zou@intel.com>
>>>> Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
>>>> Tested-by: Yuan Liu <yuan1.liu@intel.com>
>>>> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
>>>> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>>>> Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
>>>> Reviewed-by: Pan Deng <pan.deng@intel.com>
>>>> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
>>>> Reviewed-by: Yuan Liu <yuan1.liu@intel.com>
>>>> Signed-off-by: Tianyou Li <tianyou.li@intel.com>
>>>> ---
>>>> mm/memory_hotplug.c | 57 ++++++++++++++++++++++++++++++++++++++++++---
>>>> 1 file changed, 54 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>>>> index 0be83039c3b5..8f126f20ca47 100644
>>>> --- a/mm/memory_hotplug.c
>>>> +++ b/mm/memory_hotplug.c
>>>> @@ -723,6 +723,57 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
>>>>
>>>> }
>>>>
>>>> +static bool __meminit check_zone_contiguous_fast(struct zone *zone,
>>>> + unsigned long start_pfn, unsigned long nr_pages)
>>>> +{
>>>> + const unsigned long end_pfn = start_pfn + nr_pages;
>>>> + unsigned long nr_filled_pages;
>>>> +
>>>> + /*
>>>> + * Given the moved pfn range's contiguous property is always true,
>>>> + * under the conditional of empty zone, the contiguous property should
>>>> + * be true.
>>>> + */
>>>> + if (zone_is_empty(zone)) {
>>>> + zone->contiguous = true;
>>>> + return true;
>>>> + }
>>>> +
>>>> + /*
>>>> + * If the moved pfn range does not intersect with the original zone span,
>>>> + * the contiguous property is surely false.
>>>> + */
>>>> + if (end_pfn < zone->zone_start_pfn || start_pfn > zone_end_pfn(zone)) {
>>>> + zone->contiguous = false;
>>>> + return true;
>>>> + }
>>>> +
>>>> + /*
>>>> + * If the moved pfn range is adjacent to the original zone span, given
>>>> + * the moved pfn range's contiguous property is always true, the zone's
>>>> + * contiguous property inherited from the original value.
>>>> + */
>>>> + if (end_pfn == zone->zone_start_pfn || start_pfn == zone_end_pfn(zone))
>>>> + return true;
>>>> +
>>>> + /*
>>>> + * If the original zone's hole larger than the new filled pages, the
>>>> + * contiguous property is surely false.
>>>> + */
>>>> + nr_filled_pages = end_pfn - zone->zone_start_pfn;
>>>> + if (start_pfn > zone->zone_start_pfn)
>>>> + nr_filled_pages -= start_pfn - zone->zone_start_pfn;
>>>> + if (end_pfn > zone_end_pfn(zone))
>>>> + nr_filled_pages -= end_pfn - zone_end_pfn(zone);
>>>> + if (nr_filled_pages < (zone->spanned_pages - zone->present_pages)) {
>>>> + zone->contiguous = false;
>>>> + return true;
>>>> + }
>>>> +
>>> Mike's suggestion is easier for me to understand :-)
>>
>> OK :-), with the clear votes now, I will change it in patch v3 real quick.
>> Thanks.
>>
> Thanks for your effort.
>
> While maybe wait a little for v3, let's see other's comment on v2 :-)
Thanks. While I am less worried if it might change back -:) In the
scenarios we tested, the overlap case did not happen. I agree Mike's
suggestion do simplify the code and reduce the maintaining efforts.
Regards,
Tianyou
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v3] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-19 11:42 ` Wei Yang
2025-11-19 12:41 ` Li, Tianyou
@ 2025-11-19 14:06 ` Tianyou Li
2025-11-20 12:00 ` Mike Rapoport
2025-11-28 12:01 ` David Hildenbrand (Red Hat)
1 sibling, 2 replies; 23+ messages in thread
From: Tianyou Li @ 2025-11-19 14:06 UTC (permalink / raw)
To: David Hildenbrand, Oscar Salvador, Mike Rapoport, Wei Yang
Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Tianyou Li, Chen Zhang, linux-kernel
When invoke move_pfn_range_to_zone, it will update the zone->contiguous by
checking the new zone's pfn range from the beginning to the end, regardless
the previous state of the old zone. When the zone's pfn range is large, the
cost of traversing the pfn range to update the zone->contiguous could be
significant.
Add fast paths to quickly detect cases where zone is definitely not
contiguous without scanning the new zone. The cases are: when the new range
did not overlap with previous range, the contiguous should be false; if the
new range adjacent with the previous range, just need to check the new
range; if the new added pages could not fill the hole of previous zone, the
contiguous should be false.
The following test cases of memory hotplug for a VM [1], tested in the
environment [2], show that this optimization can significantly reduce the
memory hotplug time [3].
+----------------+------+---------------+--------------+----------------+
| | Size | Time (before) | Time (after) | Time Reduction |
| +------+---------------+--------------+----------------+
| Memory Hotplug | 256G | 10s | 2s | 80% |
| +------+---------------+--------------+----------------+
| | 512G | 33s | 6s | 81% |
+----------------+------+---------------+--------------+----------------+
[1] Qemu commands to hotplug 512G memory for a VM:
object_add memory-backend-ram,id=hotmem0,size=512G,share=on
device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
qom-set vmem1 requested-size 512G
[2] Hardware : Intel Icelake server
Guest Kernel : v6.18-rc2
Qemu : v9.0.0
Launch VM :
qemu-system-x86_64 -accel kvm -cpu host \
-drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
-drive file=./seed.img,format=raw,if=virtio \
-smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
-m 2G,slots=10,maxmem=2052472M \
-device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
-device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
-nographic -machine q35 \
-nic user,hostfwd=tcp::3000-:22
Guest kernel auto-onlines newly added memory blocks:
echo online > /sys/devices/system/memory/auto_online_blocks
[3] The time from typing the QEMU commands in [1] to when the output of
'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
memory is recognized.
Reported-by: Nanhai Zou <nanhai.zou@intel.com>
Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
Tested-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
Reviewed-by: Pan Deng <pan.deng@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Yuan Liu <yuan1.liu@intel.com>
Signed-off-by: Tianyou Li <tianyou.li@intel.com>
---
mm/memory_hotplug.c | 51 ++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 48 insertions(+), 3 deletions(-)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0be83039c3b5..aed1827a2778 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -723,6 +723,51 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
}
+static bool __meminit check_zone_contiguous_fast(struct zone *zone,
+ unsigned long start_pfn, unsigned long nr_pages)
+{
+ const unsigned long end_pfn = start_pfn + nr_pages;
+
+ /*
+ * Given the moved pfn range's contiguous property is always true,
+ * under the conditional of empty zone, the contiguous property should
+ * be true.
+ */
+ if (zone_is_empty(zone)) {
+ zone->contiguous = true;
+ return true;
+ }
+
+ /*
+ * If the moved pfn range does not intersect with the original zone span,
+ * the contiguous property is surely false.
+ */
+ if (end_pfn < zone->zone_start_pfn || start_pfn > zone_end_pfn(zone)) {
+ zone->contiguous = false;
+ return true;
+ }
+
+ /*
+ * If the moved pfn range is adjacent to the original zone span, given
+ * the moved pfn range's contiguous property is always true, the zone's
+ * contiguous property inherited from the original value.
+ */
+ if (end_pfn == zone->zone_start_pfn || start_pfn == zone_end_pfn(zone))
+ return true;
+
+ /*
+ * If the original zone's hole larger than the moved pages in the range,
+ * the contiguous property is surely false.
+ */
+ if (nr_pages < (zone->spanned_pages - zone->present_pages)) {
+ zone->contiguous = false;
+ return true;
+ }
+
+ clear_zone_contiguous(zone);
+ return false;
+}
+
#ifdef CONFIG_ZONE_DEVICE
static void section_taint_zone_device(unsigned long pfn)
{
@@ -752,8 +797,7 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
{
struct pglist_data *pgdat = zone->zone_pgdat;
int nid = pgdat->node_id;
-
- clear_zone_contiguous(zone);
+ const bool fast_path = check_zone_contiguous_fast(zone, start_pfn, nr_pages);
if (zone_is_empty(zone))
init_currently_empty_zone(zone, start_pfn, nr_pages);
@@ -783,7 +827,8 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
MEMINIT_HOTPLUG, altmap, migratetype,
isolate_pageblock);
- set_zone_contiguous(zone);
+ if (!fast_path)
+ set_zone_contiguous(zone);
}
struct auto_movable_stats {
--
2.47.1
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH v3] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-19 14:06 ` [PATCH v3] " Tianyou Li
@ 2025-11-20 12:00 ` Mike Rapoport
2025-11-20 14:21 ` Li, Tianyou
2025-11-28 12:01 ` David Hildenbrand (Red Hat)
1 sibling, 1 reply; 23+ messages in thread
From: Mike Rapoport @ 2025-11-20 12:00 UTC (permalink / raw)
To: Tianyou Li
Cc: David Hildenbrand, Oscar Salvador, Wei Yang, linux-mm, Yong Hu,
Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo, Yu C Chen, Pan Deng,
Chen Zhang, linux-kernel
Hi,
Please start a new thread when sending a new version of a patch next time.
And as Wei mentioned, wait a bit for the discussion on vN to settle before
sending vN+1.
On Wed, Nov 19, 2025 at 10:06:57PM +0800, Tianyou Li wrote:
> When invoke move_pfn_range_to_zone, it will update the zone->contiguous by
> checking the new zone's pfn range from the beginning to the end, regardless
> the previous state of the old zone. When the zone's pfn range is large, the
> cost of traversing the pfn range to update the zone->contiguous could be
> significant.
>
> Add fast paths to quickly detect cases where zone is definitely not
> contiguous without scanning the new zone. The cases are: when the new range
> did not overlap with previous range, the contiguous should be false; if the
> new range adjacent with the previous range, just need to check the new
> range; if the new added pages could not fill the hole of previous zone, the
> contiguous should be false.
>
> The following test cases of memory hotplug for a VM [1], tested in the
> environment [2], show that this optimization can significantly reduce the
> memory hotplug time [3].
>
> +----------------+------+---------------+--------------+----------------+
> | | Size | Time (before) | Time (after) | Time Reduction |
> | +------+---------------+--------------+----------------+
> | Memory Hotplug | 256G | 10s | 2s | 80% |
> | +------+---------------+--------------+----------------+
> | | 512G | 33s | 6s | 81% |
> +----------------+------+---------------+--------------+----------------+
>
> [1] Qemu commands to hotplug 512G memory for a VM:
> object_add memory-backend-ram,id=hotmem0,size=512G,share=on
> device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
> qom-set vmem1 requested-size 512G
>
> [2] Hardware : Intel Icelake server
> Guest Kernel : v6.18-rc2
> Qemu : v9.0.0
>
> Launch VM :
> qemu-system-x86_64 -accel kvm -cpu host \
> -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
> -drive file=./seed.img,format=raw,if=virtio \
> -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
> -m 2G,slots=10,maxmem=2052472M \
> -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
> -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
> -nographic -machine q35 \
> -nic user,hostfwd=tcp::3000-:22
>
> Guest kernel auto-onlines newly added memory blocks:
> echo online > /sys/devices/system/memory/auto_online_blocks
>
> [3] The time from typing the QEMU commands in [1] to when the output of
> 'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
> memory is recognized.
>
> Reported-by: Nanhai Zou <nanhai.zou@intel.com>
> Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
> Tested-by: Yuan Liu <yuan1.liu@intel.com>
> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
> Reviewed-by: Pan Deng <pan.deng@intel.com>
> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
> Reviewed-by: Yuan Liu <yuan1.liu@intel.com>
> Signed-off-by: Tianyou Li <tianyou.li@intel.com>
> ---
> mm/memory_hotplug.c | 51 ++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 48 insertions(+), 3 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 0be83039c3b5..aed1827a2778 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -723,6 +723,51 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
>
> }
>
> +static bool __meminit check_zone_contiguous_fast(struct zone *zone,
> + unsigned long start_pfn, unsigned long nr_pages)
> +{
> + const unsigned long end_pfn = start_pfn + nr_pages;
> +
> + /*
> + * Given the moved pfn range's contiguous property is always true,
> + * under the conditional of empty zone, the contiguous property should
> + * be true.
> + */
> + if (zone_is_empty(zone)) {
> + zone->contiguous = true;
I don't think it's safe to set zone->contiguous until the end of
move_pfn_range_to_zone(). See commit feee6b298916 ("mm/memory_hotplug:
shrink zones when offlining memory").
check_zone_contiguous_fast() should only check if the zone remains
contiguous after hotplug or it's certainly discontinuous, but should not
set zone->contiguous. It still must be cleared before resizing the zone and
set after the initialization of the memory map.
> + return true;
> + }
> +
> + /*
> + * If the moved pfn range does not intersect with the original zone span,
> + * the contiguous property is surely false.
> + */
> + if (end_pfn < zone->zone_start_pfn || start_pfn > zone_end_pfn(zone)) {
> + zone->contiguous = false;
> + return true;
> + }
> +
> + /*
> + * If the moved pfn range is adjacent to the original zone span, given
> + * the moved pfn range's contiguous property is always true, the zone's
> + * contiguous property inherited from the original value.
> + */
> + if (end_pfn == zone->zone_start_pfn || start_pfn == zone_end_pfn(zone))
> + return true;
> +
> + /*
> + * If the original zone's hole larger than the moved pages in the range,
> + * the contiguous property is surely false.
> + */
> + if (nr_pages < (zone->spanned_pages - zone->present_pages)) {
> + zone->contiguous = false;
> + return true;
> + }
> +
> + clear_zone_contiguous(zone);
> + return false;
> +}
> +
> #ifdef CONFIG_ZONE_DEVICE
> static void section_taint_zone_device(unsigned long pfn)
> {
> @@ -752,8 +797,7 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> {
> struct pglist_data *pgdat = zone->zone_pgdat;
> int nid = pgdat->node_id;
> -
> - clear_zone_contiguous(zone);
> + const bool fast_path = check_zone_contiguous_fast(zone, start_pfn, nr_pages);
>
> if (zone_is_empty(zone))
> init_currently_empty_zone(zone, start_pfn, nr_pages);
> @@ -783,7 +827,8 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> MEMINIT_HOTPLUG, altmap, migratetype,
> isolate_pageblock);
>
> - set_zone_contiguous(zone);
> + if (!fast_path)
> + set_zone_contiguous(zone);
> }
>
> struct auto_movable_stats {
> --
> 2.47.1
>
--
Sincerely yours,
Mike.
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH v3] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-20 12:00 ` Mike Rapoport
@ 2025-11-20 14:21 ` Li, Tianyou
0 siblings, 0 replies; 23+ messages in thread
From: Li, Tianyou @ 2025-11-20 14:21 UTC (permalink / raw)
To: Mike Rapoport
Cc: David Hildenbrand, Oscar Salvador, Wei Yang, linux-mm, Yong Hu,
Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo, Yu C Chen, Pan Deng,
Chen Zhang, linux-kernel
Hi Mike,
Thanks for your review. Appreciated.
On 11/20/2025 8:00 PM, Mike Rapoport wrote:
> Hi,
>
> Please start a new thread when sending a new version of a patch next time.
Got it.
> And as Wei mentioned, wait a bit for the discussion on vN to settle before
> sending vN+1.
Will do. Thanks.
> On Wed, Nov 19, 2025 at 10:06:57PM +0800, Tianyou Li wrote:
>> When invoke move_pfn_range_to_zone, it will update the zone->contiguous by
>> checking the new zone's pfn range from the beginning to the end, regardless
>> the previous state of the old zone. When the zone's pfn range is large, the
>> cost of traversing the pfn range to update the zone->contiguous could be
>> significant.
>>
>> Add fast paths to quickly detect cases where zone is definitely not
>> contiguous without scanning the new zone. The cases are: when the new range
>> did not overlap with previous range, the contiguous should be false; if the
>> new range adjacent with the previous range, just need to check the new
>> range; if the new added pages could not fill the hole of previous zone, the
>> contiguous should be false.
>>
>> The following test cases of memory hotplug for a VM [1], tested in the
>> environment [2], show that this optimization can significantly reduce the
>> memory hotplug time [3].
>>
>> +----------------+------+---------------+--------------+----------------+
>> | | Size | Time (before) | Time (after) | Time Reduction |
>> | +------+---------------+--------------+----------------+
>> | Memory Hotplug | 256G | 10s | 2s | 80% |
>> | +------+---------------+--------------+----------------+
>> | | 512G | 33s | 6s | 81% |
>> +----------------+------+---------------+--------------+----------------+
>>
>> [1] Qemu commands to hotplug 512G memory for a VM:
>> object_add memory-backend-ram,id=hotmem0,size=512G,share=on
>> device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
>> qom-set vmem1 requested-size 512G
>>
>> [2] Hardware : Intel Icelake server
>> Guest Kernel : v6.18-rc2
>> Qemu : v9.0.0
>>
>> Launch VM :
>> qemu-system-x86_64 -accel kvm -cpu host \
>> -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
>> -drive file=./seed.img,format=raw,if=virtio \
>> -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
>> -m 2G,slots=10,maxmem=2052472M \
>> -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
>> -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
>> -nographic -machine q35 \
>> -nic user,hostfwd=tcp::3000-:22
>>
>> Guest kernel auto-onlines newly added memory blocks:
>> echo online > /sys/devices/system/memory/auto_online_blocks
>>
>> [3] The time from typing the QEMU commands in [1] to when the output of
>> 'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
>> memory is recognized.
>>
>> Reported-by: Nanhai Zou <nanhai.zou@intel.com>
>> Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
>> Tested-by: Yuan Liu <yuan1.liu@intel.com>
>> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
>> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>> Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
>> Reviewed-by: Pan Deng <pan.deng@intel.com>
>> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
>> Reviewed-by: Yuan Liu <yuan1.liu@intel.com>
>> Signed-off-by: Tianyou Li <tianyou.li@intel.com>
>> ---
>> mm/memory_hotplug.c | 51 ++++++++++++++++++++++++++++++++++++++++++---
>> 1 file changed, 48 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 0be83039c3b5..aed1827a2778 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -723,6 +723,51 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
>>
>> }
>>
>> +static bool __meminit check_zone_contiguous_fast(struct zone *zone,
>> + unsigned long start_pfn, unsigned long nr_pages)
>> +{
>> + const unsigned long end_pfn = start_pfn + nr_pages;
>> +
>> + /*
>> + * Given the moved pfn range's contiguous property is always true,
>> + * under the conditional of empty zone, the contiguous property should
>> + * be true.
>> + */
>> + if (zone_is_empty(zone)) {
>> + zone->contiguous = true;
> I don't think it's safe to set zone->contiguous until the end of
> move_pfn_range_to_zone(). See commit feee6b298916 ("mm/memory_hotplug:
> shrink zones when offlining memory").
>
> check_zone_contiguous_fast() should only check if the zone remains
> contiguous after hotplug or it's certainly discontinuous, but should not
> set zone->contiguous. It still must be cleared before resizing the zone and
> set after the initialization of the memory map.
Thanks for the pointer. Allow me to learn more about the context and get
back to you soon. Thanks.
>> + return true;
>> + }
>> +
>> + /*
>> + * If the moved pfn range does not intersect with the original zone span,
>> + * the contiguous property is surely false.
>> + */
>> + if (end_pfn < zone->zone_start_pfn || start_pfn > zone_end_pfn(zone)) {
>> + zone->contiguous = false;
>> + return true;
>> + }
>> +
>> + /*
>> + * If the moved pfn range is adjacent to the original zone span, given
>> + * the moved pfn range's contiguous property is always true, the zone's
>> + * contiguous property inherited from the original value.
>> + */
>> + if (end_pfn == zone->zone_start_pfn || start_pfn == zone_end_pfn(zone))
>> + return true;
>> +
>> + /*
>> + * If the original zone's hole larger than the moved pages in the range,
>> + * the contiguous property is surely false.
>> + */
>> + if (nr_pages < (zone->spanned_pages - zone->present_pages)) {
>> + zone->contiguous = false;
>> + return true;
>> + }
>> +
>> + clear_zone_contiguous(zone);
>> + return false;
>> +}
>> +
>> #ifdef CONFIG_ZONE_DEVICE
>> static void section_taint_zone_device(unsigned long pfn)
>> {
>> @@ -752,8 +797,7 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
>> {
>> struct pglist_data *pgdat = zone->zone_pgdat;
>> int nid = pgdat->node_id;
>> -
>> - clear_zone_contiguous(zone);
>> + const bool fast_path = check_zone_contiguous_fast(zone, start_pfn, nr_pages);
>>
>> if (zone_is_empty(zone))
>> init_currently_empty_zone(zone, start_pfn, nr_pages);
>> @@ -783,7 +827,8 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
>> MEMINIT_HOTPLUG, altmap, migratetype,
>> isolate_pageblock);
>>
>> - set_zone_contiguous(zone);
>> + if (!fast_path)
>> + set_zone_contiguous(zone);
>> }
>>
>> struct auto_movable_stats {
>> --
>> 2.47.1
>>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-19 14:06 ` [PATCH v3] " Tianyou Li
2025-11-20 12:00 ` Mike Rapoport
@ 2025-11-28 12:01 ` David Hildenbrand (Red Hat)
2025-11-28 15:17 ` Li, Tianyou
1 sibling, 1 reply; 23+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-28 12:01 UTC (permalink / raw)
To: Tianyou Li, Oscar Salvador, Mike Rapoport, Wei Yang
Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Chen Zhang, linux-kernel
On 11/19/25 15:06, Tianyou Li wrote:
> When invoke move_pfn_range_to_zone, it will update the zone->contiguous by
> checking the new zone's pfn range from the beginning to the end, regardless
> the previous state of the old zone. When the zone's pfn range is large, the
> cost of traversing the pfn range to update the zone->contiguous could be
> significant.
>
> Add fast paths to quickly detect cases where zone is definitely not
> contiguous without scanning the new zone. The cases are: when the new range
> did not overlap with previous range, the contiguous should be false; if the
> new range adjacent with the previous range, just need to check the new
> range; if the new added pages could not fill the hole of previous zone, the
> contiguous should be false.
>
> The following test cases of memory hotplug for a VM [1], tested in the
> environment [2], show that this optimization can significantly reduce the
> memory hotplug time [3].
>
> +----------------+------+---------------+--------------+----------------+
> | | Size | Time (before) | Time (after) | Time Reduction |
> | +------+---------------+--------------+----------------+
> | Memory Hotplug | 256G | 10s | 2s | 80% |
> | +------+---------------+--------------+----------------+
> | | 512G | 33s | 6s | 81% |
> +----------------+------+---------------+--------------+----------------+
>
> [1] Qemu commands to hotplug 512G memory for a VM:
> object_add memory-backend-ram,id=hotmem0,size=512G,share=on
> device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
> qom-set vmem1 requested-size 512G
>
> [2] Hardware : Intel Icelake server
> Guest Kernel : v6.18-rc2
> Qemu : v9.0.0
>
> Launch VM :
> qemu-system-x86_64 -accel kvm -cpu host \
> -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
> -drive file=./seed.img,format=raw,if=virtio \
> -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
> -m 2G,slots=10,maxmem=2052472M \
> -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
> -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
> -nographic -machine q35 \
> -nic user,hostfwd=tcp::3000-:22
>
> Guest kernel auto-onlines newly added memory blocks:
> echo online > /sys/devices/system/memory/auto_online_blocks
>
> [3] The time from typing the QEMU commands in [1] to when the output of
> 'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
> memory is recognized.
>
> Reported-by: Nanhai Zou <nanhai.zou@intel.com>
> Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
> Tested-by: Yuan Liu <yuan1.liu@intel.com>
> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
> Reviewed-by: Pan Deng <pan.deng@intel.com>
> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
> Reviewed-by: Yuan Liu <yuan1.liu@intel.com>
> Signed-off-by: Tianyou Li <tianyou.li@intel.com>
> ---
> mm/memory_hotplug.c | 51 ++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 48 insertions(+), 3 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 0be83039c3b5..aed1827a2778 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -723,6 +723,51 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
>
> }
>
> +static bool __meminit check_zone_contiguous_fast(struct zone *zone,
> + unsigned long start_pfn, unsigned long nr_pages)
> +{
> + const unsigned long end_pfn = start_pfn + nr_pages;
> +
> + /*
> + * Given the moved pfn range's contiguous property is always true,
> + * under the conditional of empty zone, the contiguous property should
> + * be true.
> + */
> + if (zone_is_empty(zone)) {
> + zone->contiguous = true;
> + return true;
> + }
> +
> + /*
> + * If the moved pfn range does not intersect with the original zone span,
> + * the contiguous property is surely false.
> + */
> + if (end_pfn < zone->zone_start_pfn || start_pfn > zone_end_pfn(zone)) {
> + zone->contiguous = false;
> + return true;
> + }
> +
> + /*
> + * If the moved pfn range is adjacent to the original zone span, given
> + * the moved pfn range's contiguous property is always true, the zone's
> + * contiguous property inherited from the original value.
> + */
> + if (end_pfn == zone->zone_start_pfn || start_pfn == zone_end_pfn(zone))
> + return true;
> +
> + /*
> + * If the original zone's hole larger than the moved pages in the range,
> + * the contiguous property is surely false.
> + */
> + if (nr_pages < (zone->spanned_pages - zone->present_pages)) {
> + zone->contiguous = false;
> + return true;
> + }
> +
> + clear_zone_contiguous(zone);
> + return false;
> +}
> +
> #ifdef CONFIG_ZONE_DEVICE
> static void section_taint_zone_device(unsigned long pfn)
> {
> @@ -752,8 +797,7 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> {
> struct pglist_data *pgdat = zone->zone_pgdat;
> int nid = pgdat->node_id;
> -
> - clear_zone_contiguous(zone);
> + const bool fast_path = check_zone_contiguous_fast(zone, start_pfn, nr_pages);
>
> if (zone_is_empty(zone))
> init_currently_empty_zone(zone, start_pfn, nr_pages);
> @@ -783,7 +827,8 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> MEMINIT_HOTPLUG, altmap, migratetype,
> isolate_pageblock);
>
> - set_zone_contiguous(zone);
> + if (!fast_path)
> + set_zone_contiguous(zone);
> }
>
> struct auto_movable_stats {
Agreed with Mike that we should keep clearing+resetting the bit.
Also, I don't particularly enjoy the "fast_path" terminology. Probably we
want in the end something high-level like:
bool definetly_contig;
definetly_contig = clear_zone_contiguous_for_growing(zone, start_pfn, nr_pages);
...
set_zone_contiguous(zone, definetly_contig);
We could do something similar on the removal path then, where the zone
will for sure stay contiguous if we are removing the first/last part.
bool definetly_contig;
stays_contiguous = clear_zone_contiguous_for_shrinking(zone, start_pfn, nr_pages);
...
set_zone_contiguous(zone, definetly_contig);
If we can come up for a better name for definetly_contig that would be nice.
--
Cheers
David
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH v3] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-28 12:01 ` David Hildenbrand (Red Hat)
@ 2025-11-28 15:17 ` Li, Tianyou
2025-11-28 16:04 ` David Hildenbrand (Red Hat)
0 siblings, 1 reply; 23+ messages in thread
From: Li, Tianyou @ 2025-11-28 15:17 UTC (permalink / raw)
To: David Hildenbrand (Red Hat), Oscar Salvador, Mike Rapoport, Wei Yang
Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Chen Zhang, linux-kernel
Thanks for your review David.
On 11/28/2025 8:01 PM, David Hildenbrand (Red Hat) wrote:
> On 11/19/25 15:06, Tianyou Li wrote:
>> When invoke move_pfn_range_to_zone, it will update the
>> zone->contiguous by
>> checking the new zone's pfn range from the beginning to the end,
>> regardless
>> the previous state of the old zone. When the zone's pfn range is
>> large, the
>> cost of traversing the pfn range to update the zone->contiguous could be
>> significant.
>>
>> Add fast paths to quickly detect cases where zone is definitely not
>> contiguous without scanning the new zone. The cases are: when the new
>> range
>> did not overlap with previous range, the contiguous should be false;
>> if the
>> new range adjacent with the previous range, just need to check the new
>> range; if the new added pages could not fill the hole of previous
>> zone, the
>> contiguous should be false.
>>
>> The following test cases of memory hotplug for a VM [1], tested in the
>> environment [2], show that this optimization can significantly reduce
>> the
>> memory hotplug time [3].
>>
>> +----------------+------+---------------+--------------+----------------+
>>
>> | | Size | Time (before) | Time (after) | Time
>> Reduction |
>> | +------+---------------+--------------+----------------+
>> | Memory Hotplug | 256G | 10s | 2s | 80% |
>> | +------+---------------+--------------+----------------+
>> | | 512G | 33s | 6s | 81% |
>> +----------------+------+---------------+--------------+----------------+
>>
>>
>> [1] Qemu commands to hotplug 512G memory for a VM:
>> object_add memory-backend-ram,id=hotmem0,size=512G,share=on
>> device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
>> qom-set vmem1 requested-size 512G
>>
>> [2] Hardware : Intel Icelake server
>> Guest Kernel : v6.18-rc2
>> Qemu : v9.0.0
>>
>> Launch VM :
>> qemu-system-x86_64 -accel kvm -cpu host \
>> -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
>> -drive file=./seed.img,format=raw,if=virtio \
>> -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
>> -m 2G,slots=10,maxmem=2052472M \
>> -device
>> pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
>> -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
>> -nographic -machine q35 \
>> -nic user,hostfwd=tcp::3000-:22
>>
>> Guest kernel auto-onlines newly added memory blocks:
>> echo online > /sys/devices/system/memory/auto_online_blocks
>>
>> [3] The time from typing the QEMU commands in [1] to when the output of
>> 'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
>> memory is recognized.
>>
>> Reported-by: Nanhai Zou <nanhai.zou@intel.com>
>> Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
>> Tested-by: Yuan Liu <yuan1.liu@intel.com>
>> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
>> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>> Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
>> Reviewed-by: Pan Deng <pan.deng@intel.com>
>> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
>> Reviewed-by: Yuan Liu <yuan1.liu@intel.com>
>> Signed-off-by: Tianyou Li <tianyou.li@intel.com>
>> ---
>> mm/memory_hotplug.c | 51 ++++++++++++++++++++++++++++++++++++++++++---
>> 1 file changed, 48 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 0be83039c3b5..aed1827a2778 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -723,6 +723,51 @@ static void __meminit resize_pgdat_range(struct
>> pglist_data *pgdat, unsigned lon
>> }
>> +static bool __meminit check_zone_contiguous_fast(struct zone *zone,
>> + unsigned long start_pfn, unsigned long nr_pages)
>> +{
>> + const unsigned long end_pfn = start_pfn + nr_pages;
>> +
>> + /*
>> + * Given the moved pfn range's contiguous property is always true,
>> + * under the conditional of empty zone, the contiguous property
>> should
>> + * be true.
>> + */
>> + if (zone_is_empty(zone)) {
>> + zone->contiguous = true;
>> + return true;
>> + }
>> +
>> + /*
>> + * If the moved pfn range does not intersect with the original
>> zone span,
>> + * the contiguous property is surely false.
>> + */
>> + if (end_pfn < zone->zone_start_pfn || start_pfn >
>> zone_end_pfn(zone)) {
>> + zone->contiguous = false;
>> + return true;
>> + }
>> +
>> + /*
>> + * If the moved pfn range is adjacent to the original zone span,
>> given
>> + * the moved pfn range's contiguous property is always true, the
>> zone's
>> + * contiguous property inherited from the original value.
>> + */
>> + if (end_pfn == zone->zone_start_pfn || start_pfn ==
>> zone_end_pfn(zone))
>> + return true;
>> +
>> + /*
>> + * If the original zone's hole larger than the moved pages in
>> the range,
>> + * the contiguous property is surely false.
>> + */
>> + if (nr_pages < (zone->spanned_pages - zone->present_pages)) {
>> + zone->contiguous = false;
>> + return true;
>> + }
>> +
>> + clear_zone_contiguous(zone);
>> + return false;
>> +}
>> +
>> #ifdef CONFIG_ZONE_DEVICE
>> static void section_taint_zone_device(unsigned long pfn)
>> {
>> @@ -752,8 +797,7 @@ void move_pfn_range_to_zone(struct zone *zone,
>> unsigned long start_pfn,
>> {
>> struct pglist_data *pgdat = zone->zone_pgdat;
>> int nid = pgdat->node_id;
>> -
>> - clear_zone_contiguous(zone);
>> + const bool fast_path = check_zone_contiguous_fast(zone,
>> start_pfn, nr_pages);
>> if (zone_is_empty(zone))
>> init_currently_empty_zone(zone, start_pfn, nr_pages);
>> @@ -783,7 +827,8 @@ void move_pfn_range_to_zone(struct zone *zone,
>> unsigned long start_pfn,
>> MEMINIT_HOTPLUG, altmap, migratetype,
>> isolate_pageblock);
>> - set_zone_contiguous(zone);
>> + if (!fast_path)
>> + set_zone_contiguous(zone);
>> }
>> struct auto_movable_stats {
>
> Agreed with Mike that we should keep clearing+resetting the bit.
Got it. Worked with Yuan Liu to understand the risk that if set the
zone->contiguous before the pfn range fully initialized. It seems
pageblock_pfn_to_page code path could be affect thus potentially it is
not safe.
> Also, I don't particularly enjoy the "fast_path" terminology. Probably we
> want in the end something high-level like:
>
>
> bool definetly_contig;
>
> definetly_contig = clear_zone_contiguous_for_growing(zone, start_pfn,
> nr_pages);
>
> ...
>
> set_zone_contiguous(zone, definetly_contig);
>
>
> We could do something similar on the removal path then, where the zone
> will for sure stay contiguous if we are removing the first/last part.
>
>
> bool definetly_contig;
>
> stays_contiguous = clear_zone_contiguous_for_shrinking(zone,
> start_pfn, nr_pages);
>
> ...
>
> set_zone_contiguous(zone, definetly_contig);
>
>
>
> If we can come up for a better name for definetly_contig that would be
> nice.
>
Instead of a bool value, could the clear_zone_contiguous_for_growing
and clear_zone_contiguous_for_shrinking return a enum value to indicate
one of the three states: 1. DEFINITELY_CONTIGUOUS;
2. DEFINITELY_NOT_CONTIGUOUS; 3. UNDETERMINED_CONTIGUOUS? The
set_zone_contiguous took the state and skip the contiguous check if
DEFINITELY_CONTIGUOUS or DEFINITELY_NOT_CONTIGUOUS.
Regards,
Tianyou
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH v3] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-28 15:17 ` Li, Tianyou
@ 2025-11-28 16:04 ` David Hildenbrand (Red Hat)
2025-12-01 12:28 ` Li, Tianyou
0 siblings, 1 reply; 23+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-28 16:04 UTC (permalink / raw)
To: Li, Tianyou, Oscar Salvador, Mike Rapoport, Wei Yang
Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Chen Zhang, linux-kernel
>
> Instead of a bool value, could the clear_zone_contiguous_for_growing
> and clear_zone_contiguous_for_shrinking return a enum value to indicate
> one of the three states: 1. DEFINITELY_CONTIGUOUS;
> 2. DEFINITELY_NOT_CONTIGUOUS; 3. UNDETERMINED_CONTIGUOUS? The
> set_zone_contiguous took the state and skip the contiguous check if
> DEFINITELY_CONTIGUOUS or DEFINITELY_NOT_CONTIGUOUS.
I had the exact same thought while writing my rely, so it's worth
investigating.
If that helps to come up with even better+descriptive variable/function
names, even better :)
--
Cheers
David
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3] mm/memory hotplug/unplug: Optimize zone->contiguous update when move pfn range
2025-11-28 16:04 ` David Hildenbrand (Red Hat)
@ 2025-12-01 12:28 ` Li, Tianyou
0 siblings, 0 replies; 23+ messages in thread
From: Li, Tianyou @ 2025-12-01 12:28 UTC (permalink / raw)
To: David Hildenbrand (Red Hat), Oscar Salvador, Mike Rapoport, Wei Yang
Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Chen Zhang, linux-kernel
Thanks David for your time to review.
On 11/29/2025 12:04 AM, David Hildenbrand (Red Hat) wrote:
>>
>> Instead of a bool value, could the clear_zone_contiguous_for_growing
>> and clear_zone_contiguous_for_shrinking return a enum value to indicate
>> one of the three states: 1. DEFINITELY_CONTIGUOUS;
>> 2. DEFINITELY_NOT_CONTIGUOUS; 3. UNDETERMINED_CONTIGUOUS? The
>> set_zone_contiguous took the state and skip the contiguous check if
>> DEFINITELY_CONTIGUOUS or DEFINITELY_NOT_CONTIGUOUS.
>
> I had the exact same thought while writing my rely, so it's worth
> investigating.
>
> If that helps to come up with even better+descriptive
> variable/function names, even better :)
>
I've created a patch v4 for review in a new thread as previously
suggested, Yuan Liu added the test result for memory plug and unplug.
Welcome for any comments or suggestions. Appreciated.
Regards,
Tianyou
^ permalink raw reply [flat|nested] 23+ messages in thread