linux-mm.kvack.org archive mirror
* [PATCH] mm: fix huge page table not free after memory unplug
@ 2025-12-22  4:11 Yuan Liu
  2025-12-23  1:15 ` Andrew Morton
  2025-12-23 20:01 ` Mike Rapoport
  0 siblings, 2 replies; 3+ messages in thread
From: Yuan Liu @ 2025-12-22  4:11 UTC
  To: baolu.lu, akpm, david, rppt; +Cc: linux-mm, yuan1.liu

Newly plugged memory is marked as prot_sethuge via phys_pmd_init
without PG_head being set. During memory unplug, free_hugepage_table
frees the page table as 2M, but pagetable_free handles it as 4K, so
the rest of the 2M allocation is never freed.

The memory unplug test case for a VM [1], run in the environment [2],
shows the following results.

+-------------------------+------------+--------------+
| System memory           | After plug | After unplug |
| (via free -h)           | of 256GB   | of 256GB     |
+-------------------------+------------+--------------+
| Page table freed as 4K  | 257GB      | 5.6GB        |
+-------------------------+------------+--------------+
| Page table freed as 2M  | 257GB      | 1.7GB        |
+-------------------------+------------+--------------+

[1] QEMU monitor commands to plug and then unplug 256G of memory for a VM:
    object_add memory-backend-ram,id=hotmem0,size=256G,share=on
    device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
    qom-set vmem1 requested-size 256G (Plug Memory)
    qom-set vmem1 requested-size 0G (Unplug Memory)

[2] Hardware     : Intel Icelake server
    Guest Kernel : v6.19-rc1
    Qemu         : v9.0.0

Launch VM:
    qemu-system-x86_64 -accel kvm -cpu host \
    -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
    -drive file=./seed.img,format=raw,if=virtio \
    -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
    -m 2G,slots=10,maxmem=2052472M \
    -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
    -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
    -nographic -machine q35 \
    -nic user,hostfwd=tcp::3000-:22

   Guest kernel auto-onlines newly added memory blocks:
   echo online > /sys/devices/system/memory/auto_online_blocks

Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
---
 arch/x86/mm/init_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 9983017ecbe0..1044aafd5d94 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1028,7 +1028,7 @@ static void __meminit free_pagetable(struct page *page, int order)
 		free_reserved_pages(page, nr_pages);
 #endif
 	} else {
-		pagetable_free(page_ptdesc(page));
+		__free_pages(page, order);
 	}
 }
 
-- 
2.47.3




* Re: [PATCH] mm: fix huge page table not free after memory unplug
From: Andrew Morton @ 2025-12-23  1:15 UTC
  To: Yuan Liu; +Cc: baolu.lu, david, rppt, linux-mm

On Sun, 21 Dec 2025 23:11:17 -0500 Yuan Liu <yuan1.liu@intel.com> wrote:

> newly plugged memory is marked as prot_sethuge via phys_pmd_init
> without setting PG_head. During memory unplug, free_hugepage_table
> frees the page table as 2M, but pagetable_free handles it as 4K.
> 
> The following test case of memory unplug for a VM [1], tested in
> the environment [2], show that results.
> 
> +-----------------------+------+------+
> |Check System Memory    |Plug  |Unplug|
> |via free -h            |256GB |256GB |
> +-----------------------+------+------+
> | Free 4K page table    |257GB |5.6GB |
> +-----------------------+------+------+
> | Free 2M page table    |257GB |1.7GB |
> +-----------------------+------+------+
> 
> [1] Qemu commands to unhotplug 256G memory for a VM:
>     object_add memory-backend-ram,id=hotmem0,size=256G,share=on
>     device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
>     qom-set vmem1 requested-size 256G (Plug Memory)
>     qom-set vmem1 requested-size 0G (Unplug Memory)
> 
> [2] Hardware     : Intel Icelake server
>     Guest Kernel : v6.19-rc1
>     Qemu         : v9.0.0
> 
> Launch VM:
>     qemu-system-x86_64 -accel kvm -cpu host \
>     -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
>     -drive file=./seed.img,format=raw,if=virtio \
>     -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
>     -m 2G,slots=10,maxmem=2052472M \
>     -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
>     -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
>     -nographic -machine q35 \
>     -nic user,hostfwd=tcp::3000-:22
> 
>    Guest kernel auto-onlines newly added memory blocks:
>    echo online > /sys/devices/system/memory/auto_online_blocks
> 
> ...
>
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1028,7 +1028,7 @@ static void __meminit free_pagetable(struct page *page, int order)
>  		free_reserved_pages(page, nr_pages);
>  #endif
>  	} else {
> -		pagetable_free(page_ptdesc(page));
> +		__free_pages(page, order);
>  	}
>  }

This reverts half of bf9e4e30f353 ("x86/mm: use pagetable_free()").

What about the other half?  The below change that patch made to
arch/x86/mm/pat/set_memory.c - is that OK?

--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -429,7 +429,7 @@ static void cpa_collapse_large_pages(struct cpa_data *cpa)
 
 	list_for_each_entry_safe(ptdesc, tmp, &pgtables, pt_list) {
 		list_del(&ptdesc->pt_list);
-		__free_page(ptdesc_page(ptdesc));
+		pagetable_free(ptdesc);
 	}
 }




* Re: [PATCH] mm: fix huge page table not free after memory unplug
From: Mike Rapoport @ 2025-12-23 20:01 UTC
  To: Yuan Liu; +Cc: baolu.lu, akpm, david, linux-mm, Dave Hansen

On Sun, Dec 21, 2025 at 11:11:17PM -0500, Yuan Liu wrote:
> newly plugged memory is marked as prot_sethuge via phys_pmd_init
> without setting PG_head. During memory unplug, free_hugepage_table
> frees the page table as 2M, but pagetable_free handles it as 4K.
> 
> The following test case of memory unplug for a VM [1], tested in
> the environment [2], show that results.
> 
> +-----------------------+------+------+
> |Check System Memory    |Plug  |Unplug|
> |via free -h            |256GB |256GB |
> +-----------------------+------+------+
> | Free 4K page table    |257GB |5.6GB |
> +-----------------------+------+------+
> | Free 2M page table    |257GB |1.7GB |
> +-----------------------+------+------+
> 
> [1] Qemu commands to unhotplug 256G memory for a VM:
>     object_add memory-backend-ram,id=hotmem0,size=256G,share=on
>     device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
>     qom-set vmem1 requested-size 256G (Plug Memory)
>     qom-set vmem1 requested-size 0G (Unplug Memory)
> 
> [2] Hardware     : Intel Icelake server
>     Guest Kernel : v6.19-rc1
>     Qemu         : v9.0.0
> 
> Launch VM:
>     qemu-system-x86_64 -accel kvm -cpu host \
>     -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
>     -drive file=./seed.img,format=raw,if=virtio \
>     -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
>     -m 2G,slots=10,maxmem=2052472M \
>     -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
>     -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
>     -nographic -machine q35 \
>     -nic user,hostfwd=tcp::3000-:22
> 
>    Guest kernel auto-onlines newly added memory blocks:
>    echo online > /sys/devices/system/memory/auto_online_blocks
> 
> Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
> ---
>  arch/x86/mm/init_64.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 9983017ecbe0..1044aafd5d94 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1028,7 +1028,7 @@ static void __meminit free_pagetable(struct page *page, int order)
>  		free_reserved_pages(page, nr_pages);
>  #endif
>  	} else {
> -		pagetable_free(page_ptdesc(page));
> +		__free_pages(page, order);

The issue is that the page table page does not have proper compound_head
set, so the fix should address that rather than partially revert the commit
that introduced pagetable_free() here.

>  	}
>  }
>  
> -- 
> 2.47.3
> 

-- 
Sincerely yours,
Mike.

