* [PATCH v4] mm/memory hotplug/unplug: Optimize zone->contiguous update when changing pfn range
From: Tianyou Li @ 2025-12-01 13:22 UTC
To: David Hildenbrand, Oscar Salvador, Mike Rapoport, Wei Yang
Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Tianyou Li, Chen Zhang, linux-kernel
When move_pfn_range_to_zone() or remove_pfn_range_from_zone() is invoked,
it updates zone->contiguous by checking the new zone's pfn range from
beginning to end, regardless of the previous state of the old zone. When
the zone's pfn range is large, the cost of traversing the pfn range to
update zone->contiguous can be significant.

Add fast paths to quickly detect cases where the zone is definitely not
contiguous without scanning the new zone. The cases are: if the new range
does not overlap the previous range, contiguous must be false; if the new
range is adjacent to the previous range, the result depends only on the
previous zone's state; if the newly added pages cannot fill the hole of
the previous zone, contiguous must be false.
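
As a worked illustration of the last check (numbers invented for this
example, not taken from the measurements below): a zone spanning 2097152
pfns (8 GiB with 4 KiB pages) with 1572864 present pages has a hole of
524288 pages, so hot-adding only 262144 pages (1 GiB) cannot possibly
fill that hole and the zone can be marked non-contiguous without
scanning a single pageblock.
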
The following test cases of memory hotplug for a VM [1], tested in the
environment [2], show that this optimization can significantly reduce the
memory hotplug time [3].
+----------------+------+---------------+--------------+----------------+
|                | Size | Time (before) | Time (after) | Time Reduction |
|                +------+---------------+--------------+----------------+
| Plug Memory    | 256G | 10s           | 2s           | 80%            |
|                +------+---------------+--------------+----------------+
|                | 512G | 33s           | 6s           | 81%            |
+----------------+------+---------------+--------------+----------------+

+----------------+------+---------------+--------------+----------------+
|                | Size | Time (before) | Time (after) | Time Reduction |
|                +------+---------------+--------------+----------------+
| Unplug Memory  | 256G | 10s           | 2s           | 80%            |
|                +------+---------------+--------------+----------------+
|                | 512G | 34s           | 6s           | 82%            |
+----------------+------+---------------+--------------+----------------+
[1] Qemu commands to hotplug 256G/512G memory for a VM:
object_add memory-backend-ram,id=hotmem0,size=256G/512G,share=on
device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
qom-set vmem1 requested-size 256G/512G (Plug Memory)
qom-set vmem1 requested-size 0G (Unplug Memory)
[2] Hardware : Intel Icelake server
Guest Kernel : v6.18-rc2
Qemu : v9.0.0
Launch VM :
qemu-system-x86_64 -accel kvm -cpu host \
-drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
-drive file=./seed.img,format=raw,if=virtio \
-smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
-m 2G,slots=10,maxmem=2052472M \
-device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
-device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
-nographic -machine q35 \
-nic user,hostfwd=tcp::3000-:22
Guest kernel auto-onlines newly added memory blocks:
echo online > /sys/devices/system/memory/auto_online_blocks
[3] The time from typing the QEMU commands in [1] to when the output of
'grep MemTotal /proc/meminfo' on the guest shows that all hotplugged
memory is recognized.
Reported-by: Nanhai Zou <nanhai.zou@intel.com>
Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
Tested-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
Reviewed-by: Pan Deng <pan.deng@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Yuan Liu <yuan1.liu@intel.com>
Signed-off-by: Tianyou Li <tianyou.li@intel.com>
---
mm/internal.h | 8 ++++-
mm/memory_hotplug.c | 79 ++++++++++++++++++++++++++++++++++++++++++---
mm/mm_init.c | 36 +++++++++++++--------
3 files changed, 103 insertions(+), 20 deletions(-)
diff --git a/mm/internal.h b/mm/internal.h
index 1561fc2ff5b8..a94928520a55 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -730,7 +730,13 @@ static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
}
-void set_zone_contiguous(struct zone *zone);
+enum zone_contiguous_state {
+ CONTIGUOUS_DEFINITELY_NOT = 0,
+ CONTIGUOUS_DEFINITELY = 1,
+ CONTIGUOUS_UNDETERMINED = 2,
+};
+
+void set_zone_contiguous(struct zone *zone, enum zone_contiguous_state state);
bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
unsigned long nr_pages);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0be83039c3b5..b74e558ce822 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -544,6 +544,32 @@ static void update_pgdat_span(struct pglist_data *pgdat)
pgdat->node_spanned_pages = node_end_pfn - node_start_pfn;
}
+static enum zone_contiguous_state __meminit clear_zone_contiguous_for_shrinking(
+ struct zone *zone, unsigned long start_pfn, unsigned long nr_pages)
+{
+ const unsigned long end_pfn = start_pfn + nr_pages;
+ enum zone_contiguous_state result = CONTIGUOUS_UNDETERMINED;
+
+ /*
+ * If the removed pfn range is strictly inside the original zone span,
+ * the contiguous property is surely false.
+ */
+ if (start_pfn > zone->zone_start_pfn && end_pfn < zone_end_pfn(zone))
+ result = CONTIGUOUS_DEFINITELY_NOT;
+
+ /*
+ * If the removed pfn range is at the beginning or end of the
+ * original zone span, the contiguous property is preserved when
+ * the original zone is contiguous.
+ */
+ else if (start_pfn == zone->zone_start_pfn || end_pfn == zone_end_pfn(zone))
+ result = zone->contiguous ?
+ CONTIGUOUS_DEFINITELY : CONTIGUOUS_UNDETERMINED;
+
+ clear_zone_contiguous(zone);
+ return result;
+}
+
void remove_pfn_range_from_zone(struct zone *zone,
unsigned long start_pfn,
unsigned long nr_pages)
@@ -551,6 +577,7 @@ void remove_pfn_range_from_zone(struct zone *zone,
const unsigned long end_pfn = start_pfn + nr_pages;
struct pglist_data *pgdat = zone->zone_pgdat;
unsigned long pfn, cur_nr_pages;
+ enum zone_contiguous_state contiguous_state = CONTIGUOUS_UNDETERMINED;
/* Poison struct pages because they are now uninitialized again. */
for (pfn = start_pfn; pfn < end_pfn; pfn += cur_nr_pages) {
@@ -571,12 +598,13 @@ void remove_pfn_range_from_zone(struct zone *zone,
if (zone_is_zone_device(zone))
return;
- clear_zone_contiguous(zone);
+ contiguous_state = clear_zone_contiguous_for_shrinking(
+ zone, start_pfn, nr_pages);
shrink_zone_span(zone, start_pfn, start_pfn + nr_pages);
update_pgdat_span(pgdat);
- set_zone_contiguous(zone);
+ set_zone_contiguous(zone, contiguous_state);
}
/**
@@ -736,6 +764,47 @@ static inline void section_taint_zone_device(unsigned long pfn)
}
#endif
+static enum zone_contiguous_state __meminit clear_zone_contiguous_for_growing(
+ struct zone *zone, unsigned long start_pfn, unsigned long nr_pages)
+{
+ const unsigned long end_pfn = start_pfn + nr_pages;
+ enum zone_contiguous_state result = CONTIGUOUS_UNDETERMINED;
+
+ /*
+ * Given that the moved pfn range is itself always contiguous, if the
+ * zone is empty then the resulting zone contiguous property must
+ * be true.
+ */
+ if (zone_is_empty(zone))
+ result = CONTIGUOUS_DEFINITELY;
+
+ /*
+ * If the moved pfn range does not intersect with the original zone span,
+ * the contiguous property is surely false.
+ */
+ else if (end_pfn < zone->zone_start_pfn || start_pfn > zone_end_pfn(zone))
+ result = CONTIGUOUS_DEFINITELY_NOT;
+
+ /*
+ * If the moved pfn range is adjacent to the original zone span, then
+ * since the moved pfn range is itself always contiguous, the zone's
+ * contiguous property is inherited from the original value.
+ */
+ else if (end_pfn == zone->zone_start_pfn || start_pfn == zone_end_pfn(zone))
+ result = zone->contiguous ?
+ CONTIGUOUS_DEFINITELY : CONTIGUOUS_DEFINITELY_NOT;
+
+ /*
+ * If the original zone's hole is larger than the number of moved
+ * pages in the range, the contiguous property is surely false.
+ */
+ else if (nr_pages < (zone->spanned_pages - zone->present_pages))
+ result = CONTIGUOUS_DEFINITELY_NOT;
+
+ clear_zone_contiguous(zone);
+ return result;
+}
+
/*
* Associate the pfn range with the given zone, initializing the memmaps
* and resizing the pgdat/zone data to span the added pages. After this
@@ -752,8 +821,8 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
{
struct pglist_data *pgdat = zone->zone_pgdat;
int nid = pgdat->node_id;
-
- clear_zone_contiguous(zone);
+ const enum zone_contiguous_state contiguous_state =
+ clear_zone_contiguous_for_growing(zone, start_pfn, nr_pages);
if (zone_is_empty(zone))
init_currently_empty_zone(zone, start_pfn, nr_pages);
@@ -783,7 +852,7 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
MEMINIT_HOTPLUG, altmap, migratetype,
isolate_pageblock);
- set_zone_contiguous(zone);
+ set_zone_contiguous(zone, contiguous_state);
}
struct auto_movable_stats {
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 7712d887b696..06db3fcf7f95 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -2263,26 +2263,34 @@ void __init init_cma_pageblock(struct page *page)
}
#endif
-void set_zone_contiguous(struct zone *zone)
+void set_zone_contiguous(struct zone *zone, enum zone_contiguous_state state)
{
unsigned long block_start_pfn = zone->zone_start_pfn;
unsigned long block_end_pfn;
- block_end_pfn = pageblock_end_pfn(block_start_pfn);
- for (; block_start_pfn < zone_end_pfn(zone);
- block_start_pfn = block_end_pfn,
- block_end_pfn += pageblock_nr_pages) {
+ if (state == CONTIGUOUS_DEFINITELY) {
+ zone->contiguous = true;
+ return;
+ } else if (state == CONTIGUOUS_DEFINITELY_NOT) {
+ // zone contiguous has already cleared as false, just return.
+ return;
+ } else if (state == CONTIGUOUS_UNDETERMINED) {
+ block_end_pfn = pageblock_end_pfn(block_start_pfn);
+ for (; block_start_pfn < zone_end_pfn(zone);
+ block_start_pfn = block_end_pfn,
+ block_end_pfn += pageblock_nr_pages) {
- block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
+ block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
- if (!__pageblock_pfn_to_page(block_start_pfn,
- block_end_pfn, zone))
- return;
- cond_resched();
- }
+ if (!__pageblock_pfn_to_page(block_start_pfn,
+ block_end_pfn, zone))
+ return;
+ cond_resched();
+ }
- /* We confirm that there is no hole */
- zone->contiguous = true;
+ /* We confirm that there is no hole */
+ zone->contiguous = true;
+ }
}
/*
@@ -2348,7 +2356,7 @@ void __init page_alloc_init_late(void)
shuffle_free_memory(NODE_DATA(nid));
for_each_populated_zone(zone)
- set_zone_contiguous(zone);
+ set_zone_contiguous(zone, CONTIGUOUS_UNDETERMINED);
/* Initialize page ext after all struct pages are initialized. */
if (deferred_struct_pages)
--
2.47.1
* Re: [PATCH v4] mm/memory hotplug/unplug: Optimize zone->contiguous update when changing pfn range
From: David Hildenbrand (Red Hat) @ 2025-12-01 18:54 UTC
To: Tianyou Li, Oscar Salvador, Mike Rapoport, Wei Yang
Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Chen Zhang, linux-kernel
On 12/1/25 14:22, Tianyou Li wrote:
> When move_pfn_range_to_zone() or remove_pfn_range_from_zone() is invoked,
> it updates zone->contiguous by checking the new zone's pfn range from
> beginning to end, regardless of the previous state of the old zone. When
> the zone's pfn range is large, the cost of traversing the pfn range to
> update zone->contiguous can be significant.
>
> Add fast paths to quickly detect cases where the zone is definitely not
> contiguous without scanning the new zone. The cases are: if the new range
> does not overlap the previous range, contiguous must be false; if the new
> range is adjacent to the previous range, the result depends only on the
> previous zone's state; if the newly added pages cannot fill the hole of
> the previous zone, contiguous must be false.
>
> The following test cases of memory hotplug for a VM [1], tested in the
> environment [2], show that this optimization can significantly reduce the
> memory hotplug time [3].
>
> +----------------+------+---------------+--------------+----------------+
> |                | Size | Time (before) | Time (after) | Time Reduction |
> |                +------+---------------+--------------+----------------+
> | Plug Memory    | 256G | 10s           | 2s           | 80%            |
> |                +------+---------------+--------------+----------------+
> |                | 512G | 33s           | 6s           | 81%            |
> +----------------+------+---------------+--------------+----------------+
>
> +----------------+------+---------------+--------------+----------------+
> |                | Size | Time (before) | Time (after) | Time Reduction |
> |                +------+---------------+--------------+----------------+
> | Unplug Memory  | 256G | 10s           | 2s           | 80%            |
> |                +------+---------------+--------------+----------------+
> |                | 512G | 34s           | 6s           | 82%            |
> +----------------+------+---------------+--------------+----------------+
>
> [1] Qemu commands to hotplug 256G/512G memory for a VM:
> object_add memory-backend-ram,id=hotmem0,size=256G/512G,share=on
> device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
> qom-set vmem1 requested-size 256G/512G (Plug Memory)
> qom-set vmem1 requested-size 0G (Unplug Memory)
>
> [2] Hardware : Intel Icelake server
> Guest Kernel : v6.18-rc2
> Qemu : v9.0.0
>
> Launch VM :
> qemu-system-x86_64 -accel kvm -cpu host \
> -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
> -drive file=./seed.img,format=raw,if=virtio \
> -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
> -m 2G,slots=10,maxmem=2052472M \
> -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
> -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
> -nographic -machine q35 \
> -nic user,hostfwd=tcp::3000-:22
>
> Guest kernel auto-onlines newly added memory blocks:
> echo online > /sys/devices/system/memory/auto_online_blocks
>
> [3] The time from typing the QEMU commands in [1] to when the output of
> 'grep MemTotal /proc/meminfo' on the guest shows that all hotplugged
> memory is recognized.
>
> Reported-by: Nanhai Zou <nanhai.zou@intel.com>
> Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
> Tested-by: Yuan Liu <yuan1.liu@intel.com>
> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
> Reviewed-by: Pan Deng <pan.deng@intel.com>
> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
> Reviewed-by: Yuan Liu <yuan1.liu@intel.com>
> Signed-off-by: Tianyou Li <tianyou.li@intel.com>
> ---
> mm/internal.h | 8 ++++-
> mm/memory_hotplug.c | 79 ++++++++++++++++++++++++++++++++++++++++++---
> mm/mm_init.c | 36 +++++++++++++--------
> 3 files changed, 103 insertions(+), 20 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 1561fc2ff5b8..a94928520a55 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -730,7 +730,13 @@ static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
> return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
> }
>
> -void set_zone_contiguous(struct zone *zone);
> +enum zone_contiguous_state {
> + CONTIGUOUS_DEFINITELY_NOT = 0,
> + CONTIGUOUS_DEFINITELY = 1,
> + CONTIGUOUS_UNDETERMINED = 2,
No need for the values.
> +};
I don't like that the defines don't match the enum name (zone_c... vs.
CONT... ).
Essentially you want a "yes / no / maybe" tristate. I don't think we
have an existing type for that, unfortunately.
enum zone_contig_state {
ZONE_CONTIG_YES,
ZONE_CONTIG_NO,
ZONE_CONTIG_MAYBE,
};
Maybe someone reading along has a better idea.
> +
> +void set_zone_contiguous(struct zone *zone, enum zone_contiguous_state state);
> bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
> unsigned long nr_pages);
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 0be83039c3b5..b74e558ce822 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -544,6 +544,32 @@ static void update_pgdat_span(struct pglist_data *pgdat)
> pgdat->node_spanned_pages = node_end_pfn - node_start_pfn;
> }
>
> +static enum zone_contiguous_state __meminit clear_zone_contiguous_for_shrinking(
> + struct zone *zone, unsigned long start_pfn, unsigned long nr_pages)
> +{
> + const unsigned long end_pfn = start_pfn + nr_pages;
> + enum zone_contiguous_state result = CONTIGUOUS_UNDETERMINED;
> +
> + /*
> + * If the removed pfn range inside the original zone span, the contiguous
> + * property is surely false.
> + */
> + if (start_pfn > zone->zone_start_pfn && end_pfn < zone_end_pfn(zone))
> + result = CONTIGUOUS_DEFINITELY_NOT;
> +
> + /*
> + * If the removed pfn range is at the beginning or end of the
> + * original zone span, the contiguous property is preserved when
> + * the original zone is contiguous.
> + */
> + else if (start_pfn == zone->zone_start_pfn || end_pfn == zone_end_pfn(zone))
> + result = zone->contiguous ?
> + CONTIGUOUS_DEFINITELY : CONTIGUOUS_UNDETERMINED;
> +
See my comment below on how to make this readable.
> + clear_zone_contiguous(zone);
> + return result;
> +}
> +
> void remove_pfn_range_from_zone(struct zone *zone,
> unsigned long start_pfn,
> unsigned long nr_pages)
> @@ -551,6 +577,7 @@ void remove_pfn_range_from_zone(struct zone *zone,
> const unsigned long end_pfn = start_pfn + nr_pages;
> struct pglist_data *pgdat = zone->zone_pgdat;
> unsigned long pfn, cur_nr_pages;
> + enum zone_contiguous_state contiguous_state = CONTIGUOUS_UNDETERMINED;
>
> /* Poison struct pages because they are now uninitialized again. */
> for (pfn = start_pfn; pfn < end_pfn; pfn += cur_nr_pages) {
> @@ -571,12 +598,13 @@ void remove_pfn_range_from_zone(struct zone *zone,
> if (zone_is_zone_device(zone))
> return;
>
> - clear_zone_contiguous(zone);
> + contiguous_state = clear_zone_contiguous_for_shrinking(
> + zone, start_pfn, nr_pages);
>
Reading this again, I wonder whether it would be nicer to have something
like:
new_contig_state = zone_contig_state_after_shrinking();
clear_zone_contiguous(zone);
or sth like that. Similar for the growing case.
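
Spelled out at the call site, that could look roughly like this (a
sketch only; zone_contig_state_after_shrinking() is the placeholder
name above, not an existing function):

	/* Decide the future state first, then clear it for the resize. */
	new_contig_state = zone_contig_state_after_shrinking(zone,
					start_pfn, nr_pages);
	clear_zone_contiguous(zone);

	shrink_zone_span(zone, start_pfn, start_pfn + nr_pages);
	update_pgdat_span(pgdat);

	set_zone_contiguous(zone, new_contig_state);
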
> shrink_zone_span(zone, start_pfn, start_pfn + nr_pages);
> update_pgdat_span(pgdat);
>
> - set_zone_contiguous(zone);
> + set_zone_contiguous(zone, contiguous_state);
> }
>
> /**
> @@ -736,6 +764,47 @@ static inline void section_taint_zone_device(unsigned long pfn)
> }
> #endif
>
> +static enum zone_contiguous_state __meminit clear_zone_contiguous_for_growing(
> + struct zone *zone, unsigned long start_pfn, unsigned long nr_pages)
> +{
> + const unsigned long end_pfn = start_pfn + nr_pages;
> + enum zone_contiguous_state result = CONTIGUOUS_UNDETERMINED;
> +
> + /*
> + * Given the moved pfn range's contiguous property is always true,
> + * under the conditional of empty zone, the contiguous property should
> + * be true.
> + */
I don't think that comment is required.
> + if (zone_is_empty(zone))
> + result = CONTIGUOUS_DEFINITELY;
> +
> + /*
> + * If the moved pfn range does not intersect with the original zone span,
> + * the contiguous property is surely false.
> + */
> + else if (end_pfn < zone->zone_start_pfn || start_pfn > zone_end_pfn(zone))
> + result = CONTIGUOUS_DEFINITELY_NOT;
> +
> + /*
> + * If the moved pfn range is adjacent to the original zone span, given
> + * the moved pfn range's contiguous property is always true, the zone's
> + * contiguous property inherited from the original value.
> + */
> + else if (end_pfn == zone->zone_start_pfn || start_pfn == zone_end_pfn(zone))
> + result = zone->contiguous ?
> + CONTIGUOUS_DEFINITELY : CONTIGUOUS_DEFINITELY_NOT;
> +
> + /*
> + * If the original zone's hole larger than the moved pages in the range,
> + * the contiguous property is surely false.
> + */
> + else if (nr_pages < (zone->spanned_pages - zone->present_pages))
> + result = CONTIGUOUS_DEFINITELY_NOT;
> +
This is a bit unreadable :)
if (zone_is_empty(zone)) {
result = CONTIGUOUS_DEFINITELY;
} else if (...) {
/* ... */
...
} else if (...) {
...
}
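
Filled in with the predicates from your patch, that shape could read
roughly as follows (same checks as v4, only restructured; end_pfn =
start_pfn + nr_pages as in the helper):

	if (zone_is_empty(zone)) {
		result = CONTIGUOUS_DEFINITELY;
	} else if (end_pfn < zone->zone_start_pfn ||
		   start_pfn > zone_end_pfn(zone)) {
		/* No intersection with the old span, so a hole remains. */
		result = CONTIGUOUS_DEFINITELY_NOT;
	} else if (end_pfn == zone->zone_start_pfn ||
		   start_pfn == zone_end_pfn(zone)) {
		/* Adjacent to the old span, so inherit the old state. */
		result = zone->contiguous ? CONTIGUOUS_DEFINITELY :
					    CONTIGUOUS_DEFINITELY_NOT;
	} else if (nr_pages < (zone->spanned_pages - zone->present_pages)) {
		/* Too few new pages to fill the existing hole. */
		result = CONTIGUOUS_DEFINITELY_NOT;
	}
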
> + clear_zone_contiguous(zone);
> + return result;
> +}
> +
> /*
> * Associate the pfn range with the given zone, initializing the memmaps
> * and resizing the pgdat/zone data to span the added pages. After this
> @@ -752,8 +821,8 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> {
> struct pglist_data *pgdat = zone->zone_pgdat;
> int nid = pgdat->node_id;
> -
> - clear_zone_contiguous(zone);
> + const enum zone_contiguous_state contiguous_state =
> + clear_zone_contiguous_for_growing(zone, start_pfn, nr_pages);
>
> if (zone_is_empty(zone))
> init_currently_empty_zone(zone, start_pfn, nr_pages);
> @@ -783,7 +852,7 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> MEMINIT_HOTPLUG, altmap, migratetype,
> isolate_pageblock);
>
> - set_zone_contiguous(zone);
> + set_zone_contiguous(zone, contiguous_state);
> }
>
> struct auto_movable_stats {
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index 7712d887b696..06db3fcf7f95 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -2263,26 +2263,34 @@ void __init init_cma_pageblock(struct page *page)
> }
> #endif
>
> -void set_zone_contiguous(struct zone *zone)
> +void set_zone_contiguous(struct zone *zone, enum zone_contiguous_state state)
> {
> unsigned long block_start_pfn = zone->zone_start_pfn;
> unsigned long block_end_pfn;
>
> - block_end_pfn = pageblock_end_pfn(block_start_pfn);
> - for (; block_start_pfn < zone_end_pfn(zone);
> - block_start_pfn = block_end_pfn,
> - block_end_pfn += pageblock_nr_pages) {
> + if (state == CONTIGUOUS_DEFINITELY) {
> + zone->contiguous = true;
> + return;
> + } else if (state == CONTIGUOUS_DEFINITELY_NOT) {
> + // zone contiguous has already cleared as false, just return.
> + return;
> + } else if (state == CONTIGUOUS_UNDETERMINED) {
> + block_end_pfn = pageblock_end_pfn(block_start_pfn);
> + for (; block_start_pfn < zone_end_pfn(zone);
> + block_start_pfn = block_end_pfn,
> + block_end_pfn += pageblock_nr_pages) {
>
> - block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
> + block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
>
> - if (!__pageblock_pfn_to_page(block_start_pfn,
> - block_end_pfn, zone))
> - return;
> - cond_resched();
> - }
> + if (!__pageblock_pfn_to_page(block_start_pfn,
> + block_end_pfn, zone))
> + return;
> + cond_resched();
> + }
>
> - /* We confirm that there is no hole */
> - zone->contiguous = true;
> + /* We confirm that there is no hole */
> + zone->contiguous = true;
> + }
> }
>
switch (state) {
case CONTIGUOUS_DEFINITELY:
zone->contiguous = true;
return;
case CONTIGUOUS_DEFINITELY_NOT:
return;
default:
break;
}
... unchanged logic.
> /*
> @@ -2348,7 +2356,7 @@ void __init page_alloc_init_late(void)
> shuffle_free_memory(NODE_DATA(nid));
>
> for_each_populated_zone(zone)
> - set_zone_contiguous(zone);
> + set_zone_contiguous(zone, CONTIGUOUS_UNDETERMINED);
>
> /* Initialize page ext after all struct pages are initialized. */
> if (deferred_struct_pages)
--
Cheers
David
* Re: [PATCH v4] mm/memory hotplug/unplug: Optimize zone->contiguous update when changing pfn range
From: Li, Tianyou @ 2025-12-01 23:40 UTC
To: David Hildenbrand (Red Hat), Oscar Salvador, Mike Rapoport, Wei Yang
Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Chen Zhang, linux-kernel
Thanks David for the detailed review comments. I will change the code
accordingly in one or two days, in case any other review changes need
to be accommodated. Much appreciated.
On 12/2/2025 2:54 AM, David Hildenbrand (Red Hat) wrote:
> On 12/1/25 14:22, Tianyou Li wrote:
>> When move_pfn_range_to_zone() or remove_pfn_range_from_zone() is
>> invoked, it updates zone->contiguous by checking the new zone's pfn
>> range from beginning to end, regardless of the previous state of the
>> old zone. When the zone's pfn range is large, the cost of traversing
>> the pfn range to update zone->contiguous can be significant.
>>
>> Add fast paths to quickly detect cases where the zone is definitely
>> not contiguous without scanning the new zone. The cases are: if the
>> new range does not overlap the previous range, contiguous must be
>> false; if the new range is adjacent to the previous range, the result
>> depends only on the previous zone's state; if the newly added pages
>> cannot fill the hole of the previous zone, contiguous must be false.
>>
>> The following test cases of memory hotplug for a VM [1], tested in the
>> environment [2], show that this optimization can significantly reduce
>> the
>> memory hotplug time [3].
>>
>> +----------------+------+---------------+--------------+----------------+
>> |                | Size | Time (before) | Time (after) | Time Reduction |
>> |                +------+---------------+--------------+----------------+
>> | Plug Memory    | 256G | 10s           | 2s           | 80%            |
>> |                +------+---------------+--------------+----------------+
>> |                | 512G | 33s           | 6s           | 81%            |
>> +----------------+------+---------------+--------------+----------------+
>>
>> +----------------+------+---------------+--------------+----------------+
>> |                | Size | Time (before) | Time (after) | Time Reduction |
>> |                +------+---------------+--------------+----------------+
>> | Unplug Memory  | 256G | 10s           | 2s           | 80%            |
>> |                +------+---------------+--------------+----------------+
>> |                | 512G | 34s           | 6s           | 82%            |
>> +----------------+------+---------------+--------------+----------------+
>>
>>
>> [1] Qemu commands to hotplug 256G/512G memory for a VM:
>> object_add memory-backend-ram,id=hotmem0,size=256G/512G,share=on
>> device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
>> qom-set vmem1 requested-size 256G/512G (Plug Memory)
>> qom-set vmem1 requested-size 0G (Unplug Memory)
>>
>> [2] Hardware : Intel Icelake server
>> Guest Kernel : v6.18-rc2
>> Qemu : v9.0.0
>>
>> Launch VM :
>> qemu-system-x86_64 -accel kvm -cpu host \
>> -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
>> -drive file=./seed.img,format=raw,if=virtio \
>> -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
>> -m 2G,slots=10,maxmem=2052472M \
>> -device
>> pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
>> -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
>> -nographic -machine q35 \
>> -nic user,hostfwd=tcp::3000-:22
>>
>> Guest kernel auto-onlines newly added memory blocks:
>> echo online > /sys/devices/system/memory/auto_online_blocks
>>
>> [3] The time from typing the QEMU commands in [1] to when the output of
>> 'grep MemTotal /proc/meminfo' on the guest shows that all hotplugged
>> memory is recognized.
>>
>> Reported-by: Nanhai Zou <nanhai.zou@intel.com>
>> Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
>> Tested-by: Yuan Liu <yuan1.liu@intel.com>
>> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
>> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>> Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
>> Reviewed-by: Pan Deng <pan.deng@intel.com>
>> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
>> Reviewed-by: Yuan Liu <yuan1.liu@intel.com>
>> Signed-off-by: Tianyou Li <tianyou.li@intel.com>
>> ---
>> mm/internal.h | 8 ++++-
>> mm/memory_hotplug.c | 79 ++++++++++++++++++++++++++++++++++++++++++---
>> mm/mm_init.c | 36 +++++++++++++--------
>> 3 files changed, 103 insertions(+), 20 deletions(-)
>>
>> diff --git a/mm/internal.h b/mm/internal.h
>> index 1561fc2ff5b8..a94928520a55 100644
>> --- a/mm/internal.h
>> +++ b/mm/internal.h
>> @@ -730,7 +730,13 @@ static inline struct page
>> *pageblock_pfn_to_page(unsigned long start_pfn,
>> return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
>> }
>> -void set_zone_contiguous(struct zone *zone);
>> +enum zone_contiguous_state {
>> + CONTIGUOUS_DEFINITELY_NOT = 0,
>> + CONTIGUOUS_DEFINITELY = 1,
>> + CONTIGUOUS_UNDETERMINED = 2,
>
> No need for the values.
>
Got it.
>> +};
>
> I don't like that the defines don't match the enum name (zone_c... vs.
> CONT... ).
>
> Essentially you want a "yes / no / maybe" tristate. I don't think we
> have an existing type for that, unfortunately.
>
> enum zone_contig_state {
> ZONE_CONTIG_YES,
> ZONE_CONTIG_NO,
> ZONE_CONTIG_MAYBE,
> };
>
> Maybe someone reading along has a better idea.
>
I agree it's better. Will wait for a day or two to make the change.
>> +
>> +void set_zone_contiguous(struct zone *zone, enum
>> zone_contiguous_state state);
>> bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
>> unsigned long nr_pages);
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 0be83039c3b5..b74e558ce822 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -544,6 +544,32 @@ static void update_pgdat_span(struct pglist_data
>> *pgdat)
>> pgdat->node_spanned_pages = node_end_pfn - node_start_pfn;
>> }
>> +static enum zone_contiguous_state __meminit
>> clear_zone_contiguous_for_shrinking(
>> + struct zone *zone, unsigned long start_pfn, unsigned long
>> nr_pages)
>> +{
>> + const unsigned long end_pfn = start_pfn + nr_pages;
>> + enum zone_contiguous_state result = CONTIGUOUS_UNDETERMINED;
>> +
>> + /*
>> + * If the removed pfn range inside the original zone span, the
>> contiguous
>> + * property is surely false.
>> + */
>> + if (start_pfn > zone->zone_start_pfn && end_pfn <
>> zone_end_pfn(zone))
>> + result = CONTIGUOUS_DEFINITELY_NOT;
>> +
>> + /*
>> + * If the removed pfn range is at the beginning or end of the
>> + * original zone span, the contiguous property is preserved when
>> + * the original zone is contiguous.
>> + */
>> + else if (start_pfn == zone->zone_start_pfn || end_pfn ==
>> zone_end_pfn(zone))
>> + result = zone->contiguous ?
>> + CONTIGUOUS_DEFINITELY : CONTIGUOUS_UNDETERMINED;
>> +
>
> See my comment below on how to make this readable.
>
>> + clear_zone_contiguous(zone);
>> + return result;
>> +}
>> +
>> void remove_pfn_range_from_zone(struct zone *zone,
>> unsigned long start_pfn,
>> unsigned long nr_pages)
>> @@ -551,6 +577,7 @@ void remove_pfn_range_from_zone(struct zone *zone,
>> const unsigned long end_pfn = start_pfn + nr_pages;
>> struct pglist_data *pgdat = zone->zone_pgdat;
>> unsigned long pfn, cur_nr_pages;
>> + enum zone_contiguous_state contiguous_state =
>> CONTIGUOUS_UNDETERMINED;
>> /* Poison struct pages because they are now uninitialized
>> again. */
>> for (pfn = start_pfn; pfn < end_pfn; pfn += cur_nr_pages) {
>> @@ -571,12 +598,13 @@ void remove_pfn_range_from_zone(struct zone *zone,
>> if (zone_is_zone_device(zone))
>> return;
>> - clear_zone_contiguous(zone);
>> + contiguous_state = clear_zone_contiguous_for_shrinking(
>> + zone, start_pfn, nr_pages);
>
> Reading this again, I wonder whether it would be nicer to have
> something like:
>
> new_contig_state = zone_contig_state_after_shrinking();
> clear_zone_contiguous(zone);
>
> or sth like that. Similar for the growing case.
>
In both the shrinking and growing cases, separate clear_zone_contiguous
from the zone state check logic, right?
>> shrink_zone_span(zone, start_pfn, start_pfn + nr_pages);
>> update_pgdat_span(pgdat);
>> - set_zone_contiguous(zone);
>> + set_zone_contiguous(zone, contiguous_state);
>> }
>> /**
>> @@ -736,6 +764,47 @@ static inline void
>> section_taint_zone_device(unsigned long pfn)
>> }
>> #endif
>> +static enum zone_contiguous_state __meminit
>> clear_zone_contiguous_for_growing(
>> + struct zone *zone, unsigned long start_pfn, unsigned long
>> nr_pages)
>> +{
>> + const unsigned long end_pfn = start_pfn + nr_pages;
>> + enum zone_contiguous_state result = CONTIGUOUS_UNDETERMINED;
>> +
>> + /*
>> + * Given the moved pfn range's contiguous property is always true,
>> + * under the conditional of empty zone, the contiguous property
>> should
>> + * be true.
>> + */
>
> I don't think that comment is required.
>
Got it.
>> + if (zone_is_empty(zone))
>> + result = CONTIGUOUS_DEFINITELY;
>> +
>> + /*
>> + * If the moved pfn range does not intersect with the original
>> zone span,
>> + * the contiguous property is surely false.
>> + */
>> + else if (end_pfn < zone->zone_start_pfn || start_pfn >
>> zone_end_pfn(zone))
>> + result = CONTIGUOUS_DEFINITELY_NOT;
>> +
>> + /*
>> + * If the moved pfn range is adjacent to the original zone span,
>> given
>> + * the moved pfn range's contiguous property is always true, the
>> zone's
>> + * contiguous property inherited from the original value.
>> + */
>> + else if (end_pfn == zone->zone_start_pfn || start_pfn ==
>> zone_end_pfn(zone))
>> + result = zone->contiguous ?
>> + CONTIGUOUS_DEFINITELY : CONTIGUOUS_DEFINITELY_NOT;
>> +
>> + /*
>> + * If the original zone's hole larger than the moved pages in
>> the range,
>> + * the contiguous property is surely false.
>> + */
>> + else if (nr_pages < (zone->spanned_pages - zone->present_pages))
>> + result = CONTIGUOUS_DEFINITELY_NOT;
>> +
>
> This is a bit unreadable :)
>
> if (zone_is_empty(zone)) {
> result = CONTIGUOUS_DEFINITELY;
> } else if (...) {
> /* ... */
> ...
> } else if (...) {
> ...
> }
>
Yes, I was thinking of that during coding. I kept it that way because it
looked better. I agree that would not be good for maintaining the code.
Thanks.
>> + clear_zone_contiguous(zone);
>> + return result;
>> +}
>> +
>> /*
>> * Associate the pfn range with the given zone, initializing the
>> memmaps
>> * and resizing the pgdat/zone data to span the added pages. After
>> this
>> @@ -752,8 +821,8 @@ void move_pfn_range_to_zone(struct zone *zone,
>> unsigned long start_pfn,
>> {
>> struct pglist_data *pgdat = zone->zone_pgdat;
>> int nid = pgdat->node_id;
>> -
>> - clear_zone_contiguous(zone);
>> + const enum zone_contiguous_state contiguous_state =
>> + clear_zone_contiguous_for_growing(zone, start_pfn, nr_pages);
>> if (zone_is_empty(zone))
>> init_currently_empty_zone(zone, start_pfn, nr_pages);
>> @@ -783,7 +852,7 @@ void move_pfn_range_to_zone(struct zone *zone,
>> unsigned long start_pfn,
>> MEMINIT_HOTPLUG, altmap, migratetype,
>> isolate_pageblock);
>> - set_zone_contiguous(zone);
>> + set_zone_contiguous(zone, contiguous_state);
>> }
>> struct auto_movable_stats {
>> diff --git a/mm/mm_init.c b/mm/mm_init.c
>> index 7712d887b696..06db3fcf7f95 100644
>> --- a/mm/mm_init.c
>> +++ b/mm/mm_init.c
>> @@ -2263,26 +2263,34 @@ void __init init_cma_pageblock(struct page
>> *page)
>> }
>> #endif
>> -void set_zone_contiguous(struct zone *zone)
>> +void set_zone_contiguous(struct zone *zone, enum
>> zone_contiguous_state state)
>> {
>> unsigned long block_start_pfn = zone->zone_start_pfn;
>> unsigned long block_end_pfn;
>> - block_end_pfn = pageblock_end_pfn(block_start_pfn);
>> - for (; block_start_pfn < zone_end_pfn(zone);
>> - block_start_pfn = block_end_pfn,
>> - block_end_pfn += pageblock_nr_pages) {
>> + if (state == CONTIGUOUS_DEFINITELY) {
>> + zone->contiguous = true;
>> + return;
>> + } else if (state == CONTIGUOUS_DEFINITELY_NOT) {
>> + // zone contiguous has already cleared as false, just return.
>> + return;
>> + } else if (state == CONTIGUOUS_UNDETERMINED) {
>> + block_end_pfn = pageblock_end_pfn(block_start_pfn);
>> + for (; block_start_pfn < zone_end_pfn(zone);
>> + block_start_pfn = block_end_pfn,
>> + block_end_pfn += pageblock_nr_pages) {
>> - block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
>> + block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
>> - if (!__pageblock_pfn_to_page(block_start_pfn,
>> - block_end_pfn, zone))
>> - return;
>> - cond_resched();
>> - }
>> + if (!__pageblock_pfn_to_page(block_start_pfn,
>> + block_end_pfn, zone))
>> + return;
>> + cond_resched();
>> + }
>> - /* We confirm that there is no hole */
>> - zone->contiguous = true;
>> + /* We confirm that there is no hole */
>> + zone->contiguous = true;
>> + }
>> }
>
>
> switch (state) {
> case CONTIGUOUS_DEFINITELY:
> zone->contiguous = true;
> return;
> case CONTIGUOUS_DEFINITELY_NOT:
> return;
> default:
> break;
> }
>
> ... unchanged logic.
>
Will do. Thanks.
>> /*
>> @@ -2348,7 +2356,7 @@ void __init page_alloc_init_late(void)
>> shuffle_free_memory(NODE_DATA(nid));
>> for_each_populated_zone(zone)
>> - set_zone_contiguous(zone);
>> + set_zone_contiguous(zone, CONTIGUOUS_UNDETERMINED);
>> /* Initialize page ext after all struct pages are
>> initialized. */
>> if (deferred_struct_pages)
>
>
Regards,
Tianyou
* Re: [PATCH v4] mm/memory hotplug/unplug: Optimize zone->contiguous update when changing pfn range
From: David Hildenbrand (Red Hat) @ 2025-12-02 10:24 UTC
To: Li, Tianyou, Oscar Salvador, Mike Rapoport, Wei Yang
Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Chen Zhang, linux-kernel
>>> +};
>>
>> I don't like that the defines don't match the enum name (zone_c... vs.
>> CONT... ).
>>
>> Essentially you want a "yes / no / maybe" tristate. I don't think we
>> have an existing type for that, unfortunately.
>>
>> enum zone_contig_state {
>> ZONE_CONTIG_YES,
>> ZONE_CONTIG_NO,
>> ZONE_CONTIG_MAYBE,
>> };
>>
>> Maybe someone reading along has a better idea.
>>
>
> I agree it's better. Will wait for a day or two to make the change.
>
Yes, good idea. No need to rush at this point because the merge window
just opened up.
>
>>> +
>>> +void set_zone_contiguous(struct zone *zone, enum
>>> zone_contiguous_state state);
>>> bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
>>> unsigned long nr_pages);
>>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>>> index 0be83039c3b5..b74e558ce822 100644
>>> --- a/mm/memory_hotplug.c
>>> +++ b/mm/memory_hotplug.c
>>> @@ -544,6 +544,32 @@ static void update_pgdat_span(struct pglist_data
>>> *pgdat)
>>> pgdat->node_spanned_pages = node_end_pfn - node_start_pfn;
>>> }
>>> +static enum zone_contiguous_state __meminit
>>> clear_zone_contiguous_for_shrinking(
>>> + struct zone *zone, unsigned long start_pfn, unsigned long
>>> nr_pages)
>>> +{
>>> + const unsigned long end_pfn = start_pfn + nr_pages;
>>> + enum zone_contiguous_state result = CONTIGUOUS_UNDETERMINED;
>>> +
>>> + /*
>>> + * If the removed pfn range inside the original zone span, the
>>> contiguous
>>> + * property is surely false.
>>> + */
>>> + if (start_pfn > zone->zone_start_pfn && end_pfn <
>>> zone_end_pfn(zone))
>>> + result = CONTIGUOUS_DEFINITELY_NOT;
>>> +
>>> + /*
>>> + * If the removed pfn range is at the beginning or end of the
>>> + * original zone span, the contiguous property is preserved when
>>> + * the original zone is contiguous.
>>> + */
>>> + else if (start_pfn == zone->zone_start_pfn || end_pfn ==
>>> zone_end_pfn(zone))
>>> + result = zone->contiguous ?
>>> + CONTIGUOUS_DEFINITELY : CONTIGUOUS_UNDETERMINED;
>>> +
>>
>> See my comment below on how to make this readable.
>>
>>> + clear_zone_contiguous(zone);
>>> + return result;
>>> +}
>>> +
>>> void remove_pfn_range_from_zone(struct zone *zone,
>>> unsigned long start_pfn,
>>> unsigned long nr_pages)
>>> @@ -551,6 +577,7 @@ void remove_pfn_range_from_zone(struct zone *zone,
>>> const unsigned long end_pfn = start_pfn + nr_pages;
>>> struct pglist_data *pgdat = zone->zone_pgdat;
>>> unsigned long pfn, cur_nr_pages;
>>> + enum zone_contiguous_state contiguous_state =
>>> CONTIGUOUS_UNDETERMINED;
>>> /* Poison struct pages because they are now uninitialized
>>> again. */
>>> for (pfn = start_pfn; pfn < end_pfn; pfn += cur_nr_pages) {
>>> @@ -571,12 +598,13 @@ void remove_pfn_range_from_zone(struct zone *zone,
>>> if (zone_is_zone_device(zone))
>>> return;
>>> - clear_zone_contiguous(zone);
>>> + contiguous_state = clear_zone_contiguous_for_shrinking(
>>> + zone, start_pfn, nr_pages);
>>
>> Reading this again, I wonder whether it would be nicer to have
>> something like:
>>
>> new_contig_state = zone_contig_state_after_shrinking();
>> clear_zone_contiguous(zone);
>>
>> or sth like that. Similar for the growing case.
>>
>
> In both the shrinking and growing cases, separate clear_zone_contiguous
> from the zone state check logic, right?
Yes, I think that makes it look a bit nicer.
--
Cheers
David
* Re: [PATCH v4] mm/memory hotplug/unplug: Optimize zone->contiguous update when changing pfn range
From: Li, Tianyou @ 2025-12-02 10:41 UTC
To: David Hildenbrand (Red Hat), Oscar Salvador, Mike Rapoport, Wei Yang
Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
Yu C Chen, Pan Deng, Chen Zhang, linux-kernel
On 12/2/2025 6:24 PM, David Hildenbrand (Red Hat) wrote:
>>>> +};
>>>
>>> I don't like that the defines don't match the enum name (zone_c... vs.
>>> CONT... ).
>>>
>>> Essentially you want a "yes / no / maybe" tristate. I don't think we
>>> have an existing type for that, unfortunately.
>>>
>>> enum zone_contig_state {
>>> ZONE_CONTIG_YES,
>>> ZONE_CONTIG_NO,
>>> ZONE_CONTIG_MAYBE,
>>> };
>>>
>>> Maybe someone reading along has a better idea.
>>>
>>
>> I agree it's better. Will wait for a day or two to make the change.
>>
>
> Yes, good idea. No need to rush at this point because the merge
> window just opened up.
>
Got it. Allow me to take one more day to complete patch v5 with
sufficient testing.
>>
>>>> +
>>>> +void set_zone_contiguous(struct zone *zone, enum
>>>> zone_contiguous_state state);
>>>> bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
>>>> unsigned long nr_pages);
>>>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>>>> index 0be83039c3b5..b74e558ce822 100644
>>>> --- a/mm/memory_hotplug.c
>>>> +++ b/mm/memory_hotplug.c
>>>> @@ -544,6 +544,32 @@ static void update_pgdat_span(struct pglist_data
>>>> *pgdat)
>>>> pgdat->node_spanned_pages = node_end_pfn - node_start_pfn;
>>>> }
>>>> +static enum zone_contiguous_state __meminit
>>>> clear_zone_contiguous_for_shrinking(
>>>> + struct zone *zone, unsigned long start_pfn, unsigned long
>>>> nr_pages)
>>>> +{
>>>> + const unsigned long end_pfn = start_pfn + nr_pages;
>>>> + enum zone_contiguous_state result = CONTIGUOUS_UNDETERMINED;
>>>> +
>>>> + /*
>>>> + * If the removed pfn range inside the original zone span, the
>>>> contiguous
>>>> + * property is surely false.
>>>> + */
>>>> + if (start_pfn > zone->zone_start_pfn && end_pfn <
>>>> zone_end_pfn(zone))
>>>> + result = CONTIGUOUS_DEFINITELY_NOT;
>>>> +
>>>> + /*
>>>> + * If the removed pfn range is at the beginning or end of the
>>>> + * original zone span, the contiguous property is preserved when
>>>> + * the original zone is contiguous.
>>>> + */
>>>> + else if (start_pfn == zone->zone_start_pfn || end_pfn ==
>>>> zone_end_pfn(zone))
>>>> + result = zone->contiguous ?
>>>> + CONTIGUOUS_DEFINITELY : CONTIGUOUS_UNDETERMINED;
>>>> +
>>>
>>> See my comment below on how to make this readable.
>>>
>>>> + clear_zone_contiguous(zone);
>>>> + return result;
>>>> +}
>>>> +
>>>> void remove_pfn_range_from_zone(struct zone *zone,
>>>> unsigned long start_pfn,
>>>> unsigned long nr_pages)
>>>> @@ -551,6 +577,7 @@ void remove_pfn_range_from_zone(struct zone *zone,
>>>> const unsigned long end_pfn = start_pfn + nr_pages;
>>>> struct pglist_data *pgdat = zone->zone_pgdat;
>>>> unsigned long pfn, cur_nr_pages;
>>>> + enum zone_contiguous_state contiguous_state =
>>>> CONTIGUOUS_UNDETERMINED;
>>>> /* Poison struct pages because they are now uninitialized
>>>> again. */
>>>> for (pfn = start_pfn; pfn < end_pfn; pfn += cur_nr_pages) {
>>>> @@ -571,12 +598,13 @@ void remove_pfn_range_from_zone(struct zone
>>>> *zone,
>>>> if (zone_is_zone_device(zone))
>>>> return;
>>>> - clear_zone_contiguous(zone);
>>>> + contiguous_state = clear_zone_contiguous_for_shrinking(
>>>> + zone, start_pfn, nr_pages);
>>>
>>> Reading this again, I wonder whether it would be nicer to have
>>> something like:
>>>
>>> new_contig_state = zone_contig_state_after_shrinking();
>>> clear_zone_contiguous(zone);
>>>
>>> or sth like that. Similar for the growing case.
>>>
>>
>> In both the shrinking and growing cases, separate clear_zone_contiguous
>> from the zone state check logic, right?
>
> Yes, I think that makes it look a bit nicer.
>
Thanks for the confirmation David. Noted and will do. Thanks.
* Re: [PATCH v4] mm/memory hotplug/unplug: Optimize zone->contiguous update when changing pfn range
From: Mike Rapoport @ 2025-12-02 10:57 UTC
To: David Hildenbrand (Red Hat)
Cc: Tianyou Li, Oscar Salvador, Wei Yang, linux-mm, Yong Hu,
Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo, Yu C Chen, Pan Deng,
Chen Zhang, linux-kernel
On Mon, Dec 01, 2025 at 07:54:34PM +0100, David Hildenbrand (Red Hat) wrote:
> On 12/1/25 14:22, Tianyou Li wrote:
...
> > void remove_pfn_range_from_zone(struct zone *zone,
> > unsigned long start_pfn,
> > unsigned long nr_pages)
> > @@ -551,6 +577,7 @@ void remove_pfn_range_from_zone(struct zone *zone,
> > const unsigned long end_pfn = start_pfn + nr_pages;
> > struct pglist_data *pgdat = zone->zone_pgdat;
> > unsigned long pfn, cur_nr_pages;
> > + enum zone_contiguous_state contiguous_state = CONTIGUOUS_UNDETERMINED;
> > /* Poison struct pages because they are now uninitialized again. */
> > for (pfn = start_pfn; pfn < end_pfn; pfn += cur_nr_pages) {
> > @@ -571,12 +598,13 @@ void remove_pfn_range_from_zone(struct zone *zone,
> > if (zone_is_zone_device(zone))
> > return;
> > - clear_zone_contiguous(zone);
> > + contiguous_state = clear_zone_contiguous_for_shrinking(
> > + zone, start_pfn, nr_pages);
>
> Reading this again, I wonder whether it would be nicer to have something
> like:
>
> new_contig_state = zone_contig_state_after_shrinking();
> clear_zone_contiguous(zone);
+1
> > +static enum zone_contiguous_state __meminit clear_zone_contiguous_for_growing(
> > + struct zone *zone, unsigned long start_pfn, unsigned long nr_pages)
> > +{
> > + const unsigned long end_pfn = start_pfn + nr_pages;
> > + enum zone_contiguous_state result = CONTIGUOUS_UNDETERMINED;
> > +
> > + /*
> > + * Given the moved pfn range's contiguous property is always true,
> > + * under the conditional of empty zone, the contiguous property should
> > + * be true.
> > + */
>
> I don't think that comment is required.
>
> > + if (zone_is_empty(zone))
> > + result = CONTIGUOUS_DEFINITELY;
> > +
> > + /*
> > + * If the moved pfn range does not intersect with the original zone span,
> > + * the contiguous property is surely false.
> > + */
> > + else if (end_pfn < zone->zone_start_pfn || start_pfn > zone_end_pfn(zone))
> > + result = CONTIGUOUS_DEFINITELY_NOT;
> > +
> > + /*
> > + * If the moved pfn range is adjacent to the original zone span, given
> > + * the moved pfn range's contiguous property is always true, the zone's
> > + * contiguous property inherited from the original value.
> > + */
> > + else if (end_pfn == zone->zone_start_pfn || start_pfn == zone_end_pfn(zone))
> > + result = zone->contiguous ?
> > + CONTIGUOUS_DEFINITELY : CONTIGUOUS_DEFINITELY_NOT;
> > +
> > + /*
> > + * If the original zone's hole larger than the moved pages in the range,
> > + * the contiguous property is surely false.
> > + */
> > + else if (nr_pages < (zone->spanned_pages - zone->present_pages))
> > + result = CONTIGUOUS_DEFINITELY_NOT;
> > +
>
> This is a bit unreadable :)
>
> if (zone_is_empty(zone)) {
> result = CONTIGUOUS_DEFINITELY;
> } else if (...) {
> /* ... */
> ...
> } else if (...) {
> ...
> }
If we update the zone state outside this function, it can be even simpler:
if (zone_is_empty(zone))
return CONTIGUOUS_DEFINITELY;
if (nr_pages < (zone->spanned_pages - zone->present_pages))
return CONTIGUOUS_DEFINITELY_NOT;
etc.
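
Completed in that early-return style for the growing case it would be
something like (a sketch; predicates as in v4, with
clear_zone_contiguous() then done by the caller):

	if (zone_is_empty(zone))
		return CONTIGUOUS_DEFINITELY;

	if (end_pfn < zone->zone_start_pfn || start_pfn > zone_end_pfn(zone))
		return CONTIGUOUS_DEFINITELY_NOT;

	if (end_pfn == zone->zone_start_pfn || start_pfn == zone_end_pfn(zone))
		return zone->contiguous ?
			CONTIGUOUS_DEFINITELY : CONTIGUOUS_DEFINITELY_NOT;

	if (nr_pages < (zone->spanned_pages - zone->present_pages))
		return CONTIGUOUS_DEFINITELY_NOT;

	return CONTIGUOUS_UNDETERMINED;
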
> > + clear_zone_contiguous(zone);
> > + return result;
> > +}
> > +
> > /*
> > * Associate the pfn range with the given zone, initializing the memmaps
> > * and resizing the pgdat/zone data to span the added pages. After this
> > @@ -752,8 +821,8 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> > {
> > struct pglist_data *pgdat = zone->zone_pgdat;
> > int nid = pgdat->node_id;
> > -
> > - clear_zone_contiguous(zone);
> > + const enum zone_contiguous_state contiguous_state =
> > + clear_zone_contiguous_for_growing(zone, start_pfn, nr_pages);
> > if (zone_is_empty(zone))
> > init_currently_empty_zone(zone, start_pfn, nr_pages);
> > @@ -783,7 +852,7 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> > MEMINIT_HOTPLUG, altmap, migratetype,
> > isolate_pageblock);
> > - set_zone_contiguous(zone);
> > + set_zone_contiguous(zone, contiguous_state);
> > }
> > struct auto_movable_stats {
> > diff --git a/mm/mm_init.c b/mm/mm_init.c
> > index 7712d887b696..06db3fcf7f95 100644
> > --- a/mm/mm_init.c
> > +++ b/mm/mm_init.c
> > @@ -2263,26 +2263,34 @@ void __init init_cma_pageblock(struct page *page)
> > }
> > #endif
> > -void set_zone_contiguous(struct zone *zone)
> > +void set_zone_contiguous(struct zone *zone, enum zone_contiguous_state state)
> > {
> > unsigned long block_start_pfn = zone->zone_start_pfn;
> > unsigned long block_end_pfn;
> > - block_end_pfn = pageblock_end_pfn(block_start_pfn);
> > - for (; block_start_pfn < zone_end_pfn(zone);
> > - block_start_pfn = block_end_pfn,
> > - block_end_pfn += pageblock_nr_pages) {
> > + if (state == CONTIGUOUS_DEFINITELY) {
> > + zone->contiguous = true;
> > + return;
> > + } else if (state == CONTIGUOUS_DEFINITELY_NOT) {
> > + // zone contiguous has already cleared as false, just return.
Please no C++ style comments.
> > + return;
> > + } else if (state == CONTIGUOUS_UNDETERMINED) {
> > + block_end_pfn = pageblock_end_pfn(block_start_pfn);
> > + for (; block_start_pfn < zone_end_pfn(zone);
> > + block_start_pfn = block_end_pfn,
> > + block_end_pfn += pageblock_nr_pages) {
> > - block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
> > + block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
> > - if (!__pageblock_pfn_to_page(block_start_pfn,
> > - block_end_pfn, zone))
> > - return;
> > - cond_resched();
> > - }
> > + if (!__pageblock_pfn_to_page(block_start_pfn,
> > + block_end_pfn, zone))
> > + return;
> > + cond_resched();
> > + }
> > - /* We confirm that there is no hole */
> > - zone->contiguous = true;
> > + /* We confirm that there is no hole */
> > + zone->contiguous = true;
> > + }
> > }
>
>
> switch (state) {
> case CONTIGUOUS_DEFINITELY:
> zone->contiguous = true;
> return;
> case CONTIGUOUS_DEFINITELY_NOT:
> return;
> default:
> break;
> }
> ... unchanged logic.
I was going to suggest dropping the 'else' instead:
if (state == CONTIGUOUS_DEFINITELY) {
zone->contiguous = true;
return;
}
if (state == CONTIGUOUS_DEFINITELY_NOT)
return;
... unchanged logic.
but I don't feel strongly about it.
--
Sincerely yours,
Mike.
* Re: [PATCH v4] mm/memory hotplug/unplug: Optimize zone->contiguous update when changing pfn range
From: Li, Tianyou @ 2025-12-03 9:35 UTC
To: Mike Rapoport, David Hildenbrand (Red Hat)
Cc: Oscar Salvador, Wei Yang, linux-mm, Yong Hu, Nanhai Zou,
Yuan Liu, Tim Chen, Qiuxu Zhuo, Yu C Chen, Pan Deng, Chen Zhang,
linux-kernel
On 12/2/2025 6:57 PM, Mike Rapoport wrote:
> On Mon, Dec 01, 2025 at 07:54:34PM +0100, David Hildenbrand (Red Hat) wrote:
>> On 12/1/25 14:22, Tianyou Li wrote:
> ...
>
>>> void remove_pfn_range_from_zone(struct zone *zone,
>>> unsigned long start_pfn,
>>> unsigned long nr_pages)
>>> @@ -551,6 +577,7 @@ void remove_pfn_range_from_zone(struct zone *zone,
>>> const unsigned long end_pfn = start_pfn + nr_pages;
>>> struct pglist_data *pgdat = zone->zone_pgdat;
>>> unsigned long pfn, cur_nr_pages;
>>> + enum zone_contiguous_state contiguous_state = CONTIGUOUS_UNDETERMINED;
>>> /* Poison struct pages because they are now uninitialized again. */
>>> for (pfn = start_pfn; pfn < end_pfn; pfn += cur_nr_pages) {
>>> @@ -571,12 +598,13 @@ void remove_pfn_range_from_zone(struct zone *zone,
>>> if (zone_is_zone_device(zone))
>>> return;
>>> - clear_zone_contiguous(zone);
>>> + contiguous_state = clear_zone_contiguous_for_shrinking(
>>> + zone, start_pfn, nr_pages);
>> Reading this again, I wonder whether it would be nicer to have something
>> like:
>>
>> new_contig_state = zone_contig_state_after_shrinking();
>> clear_zone_contiguous(zone);
> +1
>
>>> +static enum zone_contiguous_state __meminit clear_zone_contiguous_for_growing(
>>> + struct zone *zone, unsigned long start_pfn, unsigned long nr_pages)
>>> +{
>>> + const unsigned long end_pfn = start_pfn + nr_pages;
>>> + enum zone_contiguous_state result = CONTIGUOUS_UNDETERMINED;
>>> +
>>> + /*
>>> + * Given the moved pfn range's contiguous property is always true,
>>> + * under the conditional of empty zone, the contiguous property should
>>> + * be true.
>>> + */
>> I don't think that comment is required.
>>
>>> + if (zone_is_empty(zone))
>>> + result = CONTIGUOUS_DEFINITELY;
>>> +
>>> + /*
>>> + * If the moved pfn range does not intersect with the original zone span,
>>> + * the contiguous property is surely false.
>>> + */
>>> + else if (end_pfn < zone->zone_start_pfn || start_pfn > zone_end_pfn(zone))
>>> + result = CONTIGUOUS_DEFINITELY_NOT;
>>> +
>>> + /*
>>> + * If the moved pfn range is adjacent to the original zone span, given
>>> + * the moved pfn range's contiguous property is always true, the zone's
>>> + * contiguous property inherited from the original value.
>>> + */
>>> + else if (end_pfn == zone->zone_start_pfn || start_pfn == zone_end_pfn(zone))
>>> + result = zone->contiguous ?
>>> + CONTIGUOUS_DEFINITELY : CONTIGUOUS_DEFINITELY_NOT;
>>> +
>>> + /*
>>> + * If the original zone's hole larger than the moved pages in the range,
>>> + * the contiguous property is surely false.
>>> + */
>>> + else if (nr_pages < (zone->spanned_pages - zone->present_pages))
>>> + result = CONTIGUOUS_DEFINITELY_NOT;
>>> +
>> This is a bit unreadable :)
>>
>> if (zone_is_empty(zone)) {
>> result = CONTIGUOUS_DEFINITELY;
>> } else if (...) {
>> /* ... */
>> ...
>> } else if (...) {
>> ...
>> }
> If we update the zone state outside this function, it can be even simpler:
>
> if (zone_is_empty(zone))
> return CONTIGUOUS_DEFINITELY;
>
> if (nr_pages < (zone->spanned_pages - zone->present_pages))
> return CONTIGUOUS_DEFINITELY_NOT;
>
> etc.
>
Thanks Mike. It's doable, considering clear_zone_contiguous is moved out.
>>> + clear_zone_contiguous(zone);
>>> + return result;
>>> +}
>>> +
>>> /*
>>> * Associate the pfn range with the given zone, initializing the memmaps
>>> * and resizing the pgdat/zone data to span the added pages. After this
>>> @@ -752,8 +821,8 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
>>> {
>>> struct pglist_data *pgdat = zone->zone_pgdat;
>>> int nid = pgdat->node_id;
>>> -
>>> - clear_zone_contiguous(zone);
>>> + const enum zone_contiguous_state contiguous_state =
>>> + clear_zone_contiguous_for_growing(zone, start_pfn, nr_pages);
>>> if (zone_is_empty(zone))
>>> init_currently_empty_zone(zone, start_pfn, nr_pages);
>>> @@ -783,7 +852,7 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
>>> MEMINIT_HOTPLUG, altmap, migratetype,
>>> isolate_pageblock);
>>> - set_zone_contiguous(zone);
>>> + set_zone_contiguous(zone, contiguous_state);
>>> }
>>> struct auto_movable_stats {
>>> diff --git a/mm/mm_init.c b/mm/mm_init.c
>>> index 7712d887b696..06db3fcf7f95 100644
>>> --- a/mm/mm_init.c
>>> +++ b/mm/mm_init.c
>>> @@ -2263,26 +2263,34 @@ void __init init_cma_pageblock(struct page *page)
>>> }
>>> #endif
>>> -void set_zone_contiguous(struct zone *zone)
>>> +void set_zone_contiguous(struct zone *zone, enum zone_contiguous_state state)
>>> {
>>> unsigned long block_start_pfn = zone->zone_start_pfn;
>>> unsigned long block_end_pfn;
>>> - block_end_pfn = pageblock_end_pfn(block_start_pfn);
>>> - for (; block_start_pfn < zone_end_pfn(zone);
>>> - block_start_pfn = block_end_pfn,
>>> - block_end_pfn += pageblock_nr_pages) {
>>> + if (state == CONTIGUOUS_DEFINITELY) {
>>> + zone->contiguous = true;
>>> + return;
>>> + } else if (state == CONTIGUOUS_DEFINITELY_NOT) {
>>> + // zone contiguous has already cleared as false, just return.
> Please no C++ style comments.
Got it. Thanks.
>>> + return;
>>> + } else if (state == CONTIGUOUS_UNDETERMINED) {
>>> + block_end_pfn = pageblock_end_pfn(block_start_pfn);
>>> + for (; block_start_pfn < zone_end_pfn(zone);
>>> + block_start_pfn = block_end_pfn,
>>> + block_end_pfn += pageblock_nr_pages) {
>>> - block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
>>> + block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
>>> - if (!__pageblock_pfn_to_page(block_start_pfn,
>>> - block_end_pfn, zone))
>>> - return;
>>> - cond_resched();
>>> - }
>>> + if (!__pageblock_pfn_to_page(block_start_pfn,
>>> + block_end_pfn, zone))
>>> + return;
>>> + cond_resched();
>>> + }
>>> - /* We confirm that there is no hole */
>>> - zone->contiguous = true;
>>> + /* We confirm that there is no hole */
>>> + zone->contiguous = true;
>>> + }
>>> }
>>
>> switch (state) {
>> case CONTIGUOUS_DEFINITELY:
>> zone->contiguous = true;
>> return;
>> case CONTIGUOUS_DEFINITELY_NOT:
>> return;
>> default:
>> break;
>> }
>> ... unchanged logic.
> I was going to suggest dropping the 'else' instead:
>
> if (state == CONTIGUOUS_DEFINITELY) {
> zone->contiguous = true;
> return;
> }
>
> if (state == CONTIGUOUS_DEFINITELY_NOT)
> return;
>
> ... unchanged logic.
>
> but I don't feel strongly about it.
>
Got it. Thanks.