* [PATCH 1/2] mm: hugetlb_vmemmap: fix hugetlb page number decrease failed on movable nodes
From: Yuan Can @ 2023-09-05 3:13 UTC (permalink / raw)
To: mike.kravetz, muchun.song, akpm, linux-mm; +Cc: wangkefeng.wang, yuancan
Decreasing the number of hugetlb pages failed with the following message:
sh: page allocation failure: order:0, mode:0x204cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL|__GFP_THISNODE)
CPU: 1 PID: 112 Comm: sh Not tainted 6.5.0-rc7-... #45
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace.part.6+0x84/0xe4
show_stack+0x18/0x24
dump_stack_lvl+0x48/0x60
dump_stack+0x18/0x24
warn_alloc+0x100/0x1bc
__alloc_pages_slowpath.constprop.107+0xa40/0xad8
__alloc_pages+0x244/0x2d0
hugetlb_vmemmap_restore+0x104/0x1e4
__update_and_free_hugetlb_folio+0x44/0x1f4
update_and_free_hugetlb_folio+0x20/0x68
update_and_free_pages_bulk+0x4c/0xac
set_max_huge_pages+0x198/0x334
nr_hugepages_store_common+0x118/0x178
nr_hugepages_store+0x18/0x24
kobj_attr_store+0x18/0x2c
sysfs_kf_write+0x40/0x54
kernfs_fop_write_iter+0x164/0x1dc
vfs_write+0x3a8/0x460
ksys_write+0x6c/0x100
__arm64_sys_write+0x1c/0x28
invoke_syscall+0x44/0x100
el0_svc_common.constprop.1+0x6c/0xe4
do_el0_svc+0x38/0x94
el0_svc+0x28/0x74
el0t_64_sync_handler+0xa0/0xc4
el0t_64_sync+0x174/0x178
Mem-Info:
...
The reason is that the hugetlb pages being released were allocated from
movable nodes, and with hugetlb_optimize_vmemmap enabled, vmemmap pages
need to be allocated from the same node while the hugetlb pages are being
released. With GFP_KERNEL and __GFP_THISNODE set, allocating from a movable
node always fails. Fix this problem by removing __GFP_THISNODE.
Signed-off-by: Yuan Can <yuancan@huawei.com>
---
mm/hugetlb_vmemmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index c2007ef5e9b0..0485e471d224 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -386,7 +386,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
struct list_head *list)
{
- gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;
+ gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
int nid = page_to_nid((struct page *)start);
struct page *page, *next;
--
2.17.1
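To make the failure mode concrete, here is a small userspace sketch of the allocator behavior described in the changelog. The flag values, node model, and helpers below are simplified stand-ins invented for illustration, not the kernel's real gfp machinery: a node whose memory is entirely in ZONE_MOVABLE can only satisfy requests that carry __GFP_MOVABLE, GFP_KERNEL does not carry it, and __GFP_THISNODE forbids falling back to another node.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel's gfp flags (illustrative only). */
#define SKETCH_GFP_KERNEL   0x1u  /* normal zones only, never ZONE_MOVABLE */
#define SKETCH_GFP_MOVABLE  0x2u  /* allocation may also use ZONE_MOVABLE  */
#define SKETCH_GFP_THISNODE 0x4u  /* no fallback to other nodes            */

struct sketch_node {
	bool movable_only;	/* all memory on this node is in ZONE_MOVABLE */
};

/* Can a node serve this request? Movable-only nodes need __GFP_MOVABLE. */
static inline bool sketch_node_can_serve(const struct sketch_node *node,
					 unsigned int gfp)
{
	return !node->movable_only || (gfp & SKETCH_GFP_MOVABLE);
}

/* Returns true if an allocation preferring @nid can succeed at all. */
bool sketch_alloc_ok(unsigned int gfp, const struct sketch_node *nodes,
		     int nr_nodes, int nid)
{
	/* Try the preferred node first. */
	if (sketch_node_can_serve(&nodes[nid], gfp))
		return true;
	/* __GFP_THISNODE: no fallback allowed, so the request fails. */
	if (gfp & SKETCH_GFP_THISNODE)
		return false;
	/* Otherwise fall back to any node able to serve the request. */
	for (int i = 0; i < nr_nodes; i++)
		if (sketch_node_can_serve(&nodes[i], gfp))
			return true;
	return false;
}
```

In this model, a GFP_KERNEL | __GFP_THISNODE request targeting a movable-only node can never succeed, while dropping __GFP_THISNODE lets the request fall back to another node, which is the behavior the patch relies on.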
* [PATCH 2/2] mm: hugetlb_vmemmap: allow alloc_vmemmap_page_list() ignore watermarks
From: Yuan Can @ 2023-09-05 3:13 UTC (permalink / raw)
To: mike.kravetz, muchun.song, akpm, linux-mm; +Cc: wangkefeng.wang, yuancan
alloc_vmemmap_page_list() is called when a hugetlb page is freed, and more
memory is returned to the buddy allocator once it succeeds, so add
__GFP_MEMALLOC to allow it to ignore watermarks.
Signed-off-by: Yuan Can <yuancan@huawei.com>
---
mm/hugetlb_vmemmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 0485e471d224..dc0b9247a1f9 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -386,7 +386,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
struct list_head *list)
{
- gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
+ gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_MEMALLOC;
unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
int nid = page_to_nid((struct page *)start);
struct page *page, *next;
--
2.17.1
* Re: [PATCH 2/2] mm: hugetlb_vmemmap: allow alloc_vmemmap_page_list() ignore watermarks
From: Muchun Song @ 2023-09-05 6:59 UTC (permalink / raw)
To: Yuan Can; +Cc: Mike Kravetz, Andrew Morton, Linux-MM, Kefeng Wang
> On Sep 5, 2023, at 11:13, Yuan Can <yuancan@huawei.com> wrote:
>
> alloc_vmemmap_page_list() is called when a hugetlb page is freed, and more
> memory is returned to the buddy allocator once it succeeds, so add
> __GFP_MEMALLOC to allow it to ignore watermarks.
The kernel documentation about __GFP_MEMALLOC says:
 * %__GFP_MEMALLOC allows access to all memory. This should only be used when
 * the caller guarantees the allocation will allow more memory to be freed
 * very shortly e.g. process exiting or swapping. Users either should
 * be the MM or co-ordinating closely with the VM (e.g. swap over NFS).
 * Users of this flag have to be extremely careful to not deplete the reserve
 * completely and implement a throttling mechanism which controls the
 * consumption of the reserve based on the amount of freed memory.
 * Usage of a pre-allocated pool (e.g. mempool) should be always considered
 * before using this flag.
I think we may deplete the reserve memory if a 1GB page is freed. It will
be even worse if the recent patchset[1] is merged, because the vmemmap pages
will be freed in batches, meaning that memory will not be freed in a very
short time (the cover letter has some numbers). So NACK.
[1] https://lore.kernel.org/linux-mm/20230825190436.55045-1-mike.kravetz@oracle.com/
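The throttled reserve that the documentation asks for could look roughly like the userspace sketch below. The structure, names, and one-for-one throttling policy here are invented purely for illustration; an in-kernel user would more likely build on mempool_create() as the documentation suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Toy fixed-size reserve with throttling (illustrative only). */
struct sketch_pool {
	size_t reserved;	/* pages still available in the reserve */
	size_t owed;		/* pages handed out but not yet repaid  */
};

/* Take one page from the reserve; refuse once too much is outstanding,
 * forcing the caller to free memory back before taking more. */
int sketch_pool_take(struct sketch_pool *p)
{
	if (p->reserved == 0 || p->owed >= p->reserved)
		return -1;	/* throttled: repay freed memory first */
	p->reserved--;
	p->owed++;
	return 0;
}

/* The caller freed @pages back, replenishing the reserve. */
void sketch_pool_payback(struct sketch_pool *p, size_t pages)
{
	p->reserved += pages;
	p->owed -= p->owed < pages ? p->owed : pages;
}
```

The point of the sketch is only the shape of the contract: consumption of the reserve is bounded by how much memory the caller has actually freed back, which is exactly the guarantee a bare __GFP_MEMALLOC user cannot provide.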
>
> Signed-off-by: Yuan Can <yuancan@huawei.com>
> ---
> mm/hugetlb_vmemmap.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index 0485e471d224..dc0b9247a1f9 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -386,7 +386,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
> static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
> struct list_head *list)
> {
> - gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
> + gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_MEMALLOC;
> unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
> int nid = page_to_nid((struct page *)start);
> struct page *page, *next;
> --
> 2.17.1
>
* Re: [PATCH 1/2] mm: hugetlb_vmemmap: fix hugetlb page number decrease failed on movable nodes
From: Muchun Song @ 2023-09-05 9:06 UTC (permalink / raw)
To: Yuan Can; +Cc: Mike Kravetz, Andrew Morton, Linux-MM, wangkefeng.wang
> On Sep 5, 2023, at 11:13, Yuan Can <yuancan@huawei.com> wrote:
>
> Decreasing the number of hugetlb pages failed with the following message:
>
> sh: page allocation failure: order:0, mode:0x204cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL|__GFP_THISNODE)
> CPU: 1 PID: 112 Comm: sh Not tainted 6.5.0-rc7-... #45
> Hardware name: linux,dummy-virt (DT)
> Call trace:
> dump_backtrace.part.6+0x84/0xe4
> show_stack+0x18/0x24
> dump_stack_lvl+0x48/0x60
> dump_stack+0x18/0x24
> warn_alloc+0x100/0x1bc
> __alloc_pages_slowpath.constprop.107+0xa40/0xad8
> __alloc_pages+0x244/0x2d0
> hugetlb_vmemmap_restore+0x104/0x1e4
> __update_and_free_hugetlb_folio+0x44/0x1f4
> update_and_free_hugetlb_folio+0x20/0x68
> update_and_free_pages_bulk+0x4c/0xac
> set_max_huge_pages+0x198/0x334
> nr_hugepages_store_common+0x118/0x178
> nr_hugepages_store+0x18/0x24
> kobj_attr_store+0x18/0x2c
> sysfs_kf_write+0x40/0x54
> kernfs_fop_write_iter+0x164/0x1dc
> vfs_write+0x3a8/0x460
> ksys_write+0x6c/0x100
> __arm64_sys_write+0x1c/0x28
> invoke_syscall+0x44/0x100
> el0_svc_common.constprop.1+0x6c/0xe4
> do_el0_svc+0x38/0x94
> el0_svc+0x28/0x74
> el0t_64_sync_handler+0xa0/0xc4
> el0t_64_sync+0x174/0x178
> Mem-Info:
> ...
>
> The reason is that the hugetlb pages being released are allocated from
> movable nodes, and with hugetlb_optimize_vmemmap enabled, vmemmap pages
> need to be allocated from the same node during the hugetlb pages
Thanks for your fix. I think this is a real-world issue, so it is better
to add a Fixes tag to indicate backporting. Thanks.
> releasing. With GFP_KERNEL and __GFP_THISNODE set, allocating from a
> movable node always fails. Fix this problem by removing __GFP_THISNODE.
>
> Signed-off-by: Yuan Can <yuancan@huawei.com>
> ---
> mm/hugetlb_vmemmap.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index c2007ef5e9b0..0485e471d224 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -386,7 +386,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
> static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
> struct list_head *list)
> {
> - gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;
> + gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
There is a small change for the non-movable case after this patch: we first
try to allocate memory from the preferred node (the same as before), and if
that fails, we now fall back to other nodes. For me, it makes sense. At least
those huge pages could be freed once other nodes could satisfy the allocation
of vmemmap pages.
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Thanks.
> unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
> int nid = page_to_nid((struct page *)start);
> struct page *page, *next;
> --
> 2.17.1
>
>
* Re: [PATCH 1/2] mm: hugetlb_vmemmap: fix hugetlb page number decrease failed on movable nodes
From: Kefeng Wang @ 2023-09-05 10:43 UTC (permalink / raw)
To: Muchun Song, Yuan Can; +Cc: Mike Kravetz, Andrew Morton, Linux-MM
On 2023/9/5 17:06, Muchun Song wrote:
>
>
>> On Sep 5, 2023, at 11:13, Yuan Can <yuancan@huawei.com> wrote:
>>
>> Decreasing the number of hugetlb pages failed with the following message:
>>
>> sh: page allocation failure: order:0, mode:0x204cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL|__GFP_THISNODE)
>> CPU: 1 PID: 112 Comm: sh Not tainted 6.5.0-rc7-... #45
>> Hardware name: linux,dummy-virt (DT)
>> Call trace:
>> dump_backtrace.part.6+0x84/0xe4
>> show_stack+0x18/0x24
>> dump_stack_lvl+0x48/0x60
>> dump_stack+0x18/0x24
>> warn_alloc+0x100/0x1bc
>> __alloc_pages_slowpath.constprop.107+0xa40/0xad8
>> __alloc_pages+0x244/0x2d0
>> hugetlb_vmemmap_restore+0x104/0x1e4
>> __update_and_free_hugetlb_folio+0x44/0x1f4
>> update_and_free_hugetlb_folio+0x20/0x68
>> update_and_free_pages_bulk+0x4c/0xac
>> set_max_huge_pages+0x198/0x334
>> nr_hugepages_store_common+0x118/0x178
>> nr_hugepages_store+0x18/0x24
>> kobj_attr_store+0x18/0x2c
>> sysfs_kf_write+0x40/0x54
>> kernfs_fop_write_iter+0x164/0x1dc
>> vfs_write+0x3a8/0x460
>> ksys_write+0x6c/0x100
>> __arm64_sys_write+0x1c/0x28
>> invoke_syscall+0x44/0x100
>> el0_svc_common.constprop.1+0x6c/0xe4
>> do_el0_svc+0x38/0x94
>> el0_svc+0x28/0x74
>> el0t_64_sync_handler+0xa0/0xc4
>> el0t_64_sync+0x174/0x178
>> Mem-Info:
>> ...
>>
>> The reason is that the hugetlb pages being released are allocated from
>> movable nodes, and with hugetlb_optimize_vmemmap enabled, vmemmap pages
>> need to be allocated from the same node during the hugetlb pages
>
> Thanks for your fix. I think this is a real-world issue, so it is better
> to add a Fixes tag to indicate backporting. Thanks.
>
>> releasing. With GFP_KERNEL and __GFP_THISNODE set, allocating from a
>> movable node always fails. Fix this problem by removing __GFP_THISNODE.
The Fixes tag should be ad2fa3717b74 ("mm: hugetlb: alloc the vmemmap pages
associated with each HugeTLB page").
>>
>> Signed-off-by: Yuan Can <yuancan@huawei.com>
>> ---
>> mm/hugetlb_vmemmap.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>> index c2007ef5e9b0..0485e471d224 100644
>> --- a/mm/hugetlb_vmemmap.c
>> +++ b/mm/hugetlb_vmemmap.c
>> @@ -386,7 +386,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
>> static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
>> struct list_head *list)
>> {
>> - gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;
>> + gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
>
> There is a small change for the non-movable case after this patch: we first
> try to allocate memory from the preferred node (the same as before), and if
> that fails, we now fall back to other nodes. For me, it makes sense. At least
> those huge pages could be freed once other nodes could satisfy the allocation
> of vmemmap pages.
>
> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
>
> Thanks.
>
>> unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
>> int nid = page_to_nid((struct page *)start);
>> struct page *page, *next;
>> --
>> 2.17.1
>>
>>
>
* Re: [PATCH 1/2] mm: hugetlb_vmemmap: fix hugetlb page number decrease failed on movable nodes
From: Yuan Can @ 2023-09-05 12:41 UTC (permalink / raw)
To: Muchun Song; +Cc: Mike Kravetz, Andrew Morton, Linux-MM, wangkefeng.wang
On 2023/9/5 17:06, Muchun Song wrote:
>
>> On Sep 5, 2023, at 11:13, Yuan Can <yuancan@huawei.com> wrote:
>>
>> Decreasing the number of hugetlb pages failed with the following message:
>>
>> sh: page allocation failure: order:0, mode:0x204cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL|__GFP_THISNODE)
>> CPU: 1 PID: 112 Comm: sh Not tainted 6.5.0-rc7-... #45
>> Hardware name: linux,dummy-virt (DT)
>> Call trace:
>> dump_backtrace.part.6+0x84/0xe4
>> show_stack+0x18/0x24
>> dump_stack_lvl+0x48/0x60
>> dump_stack+0x18/0x24
>> warn_alloc+0x100/0x1bc
>> __alloc_pages_slowpath.constprop.107+0xa40/0xad8
>> __alloc_pages+0x244/0x2d0
>> hugetlb_vmemmap_restore+0x104/0x1e4
>> __update_and_free_hugetlb_folio+0x44/0x1f4
>> update_and_free_hugetlb_folio+0x20/0x68
>> update_and_free_pages_bulk+0x4c/0xac
>> set_max_huge_pages+0x198/0x334
>> nr_hugepages_store_common+0x118/0x178
>> nr_hugepages_store+0x18/0x24
>> kobj_attr_store+0x18/0x2c
>> sysfs_kf_write+0x40/0x54
>> kernfs_fop_write_iter+0x164/0x1dc
>> vfs_write+0x3a8/0x460
>> ksys_write+0x6c/0x100
>> __arm64_sys_write+0x1c/0x28
>> invoke_syscall+0x44/0x100
>> el0_svc_common.constprop.1+0x6c/0xe4
>> do_el0_svc+0x38/0x94
>> el0_svc+0x28/0x74
>> el0t_64_sync_handler+0xa0/0xc4
>> el0t_64_sync+0x174/0x178
>> Mem-Info:
>> ...
>>
>> The reason is that the hugetlb pages being released are allocated from
>> movable nodes, and with hugetlb_optimize_vmemmap enabled, vmemmap pages
>> need to be allocated from the same node during the hugetlb pages
> Thanks for your fix. I think this is a real-world issue, so it is better
> to add a Fixes tag to indicate backporting. Thanks.
>
>> releasing. With GFP_KERNEL and __GFP_THISNODE set, allocating from a
>> movable node always fails. Fix this problem by removing __GFP_THISNODE.
>>
>> Signed-off-by: Yuan Can <yuancan@huawei.com>
>> ---
>> mm/hugetlb_vmemmap.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>> index c2007ef5e9b0..0485e471d224 100644
>> --- a/mm/hugetlb_vmemmap.c
>> +++ b/mm/hugetlb_vmemmap.c
>> @@ -386,7 +386,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
>> static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
>> struct list_head *list)
>> {
>> - gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;
>> + gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
> There is a small change for the non-movable case after this patch: we first
> try to allocate memory from the preferred node (the same as before), and if
> that fails, we now fall back to other nodes. For me, it makes sense. At least
> those huge pages could be freed once other nodes could satisfy the allocation
> of vmemmap pages.
>
> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
>
> Thanks.
Thanks for the review. I will send the v2 patch with the Fixes tag and your
Reviewed-by soon.
--
Best regards,
Yuan Can
* Re: [PATCH 1/2] mm: hugetlb_vmemmap: fix hugetlb page number decrease failed on movable nodes
From: Mike Kravetz @ 2023-09-06 0:28 UTC (permalink / raw)
To: Muchun Song
Cc: Yuan Can, Andrew Morton, Linux-MM, wangkefeng.wang,
David Hildenbrand, Michal Hocko
On 09/05/23 17:06, Muchun Song wrote:
>
>
> > On Sep 5, 2023, at 11:13, Yuan Can <yuancan@huawei.com> wrote:
> >
> > Decreasing the number of hugetlb pages failed with the following message:
> >
> > sh: page allocation failure: order:0, mode:0x204cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL|__GFP_THISNODE)
> > CPU: 1 PID: 112 Comm: sh Not tainted 6.5.0-rc7-... #45
> > Hardware name: linux,dummy-virt (DT)
> > Call trace:
> > dump_backtrace.part.6+0x84/0xe4
> > show_stack+0x18/0x24
> > dump_stack_lvl+0x48/0x60
> > dump_stack+0x18/0x24
> > warn_alloc+0x100/0x1bc
> > __alloc_pages_slowpath.constprop.107+0xa40/0xad8
> > __alloc_pages+0x244/0x2d0
> > hugetlb_vmemmap_restore+0x104/0x1e4
> > __update_and_free_hugetlb_folio+0x44/0x1f4
> > update_and_free_hugetlb_folio+0x20/0x68
> > update_and_free_pages_bulk+0x4c/0xac
> > set_max_huge_pages+0x198/0x334
> > nr_hugepages_store_common+0x118/0x178
> > nr_hugepages_store+0x18/0x24
> > kobj_attr_store+0x18/0x2c
> > sysfs_kf_write+0x40/0x54
> > kernfs_fop_write_iter+0x164/0x1dc
> > vfs_write+0x3a8/0x460
> > ksys_write+0x6c/0x100
> > __arm64_sys_write+0x1c/0x28
> > invoke_syscall+0x44/0x100
> > el0_svc_common.constprop.1+0x6c/0xe4
> > do_el0_svc+0x38/0x94
> > el0_svc+0x28/0x74
> > el0t_64_sync_handler+0xa0/0xc4
> > el0t_64_sync+0x174/0x178
> > Mem-Info:
> > ...
> >
> > The reason is that the hugetlb pages being released are allocated from
> > movable nodes, and with hugetlb_optimize_vmemmap enabled, vmemmap pages
> > need to be allocated from the same node during the hugetlb pages
>
> Thanks for your fix. I think this is a real-world issue, so it is better
> to add a Fixes tag to indicate backporting. Thanks.
>
I thought we might get the same error (unable to allocate on a movable
node) when creating the hugetlb page. Why? Because we replace the head
vmemmap page. However, I see that a failure to allocate there is not a
fatal error, and we fall back to the currently mapped page. We also pass
__GFP_NOWARN to that allocation attempt, so there will be no report of the
failure.
We might want to change this as well?
> > releasing. With GFP_KERNEL and __GFP_THISNODE set, allocating from a
> > movable node always fails. Fix this problem by removing __GFP_THISNODE.
> >
> > Signed-off-by: Yuan Can <yuancan@huawei.com>
> > ---
> > mm/hugetlb_vmemmap.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> > index c2007ef5e9b0..0485e471d224 100644
> > --- a/mm/hugetlb_vmemmap.c
> > +++ b/mm/hugetlb_vmemmap.c
> > @@ -386,7 +386,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
> > static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
> > struct list_head *list)
> > {
> > - gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;
> > + gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
>
> There is a small change for the non-movable case after this patch: we first
> try to allocate memory from the preferred node (the same as before), and if
> that fails, we now fall back to other nodes. For me, it makes sense. At least
> those huge pages could be freed once other nodes could satisfy the allocation
> of vmemmap pages.
>
> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
This looks reasonable to me as well.
Cc'ing David and Michal as they are expert in hotplug.
--
Mike Kravetz
>
> Thanks.
>
> > unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
> > int nid = page_to_nid((struct page *)start);
> > struct page *page, *next;
> > --
> > 2.17.1
* Re: [PATCH 1/2] mm: hugetlb_vmemmap: fix hugetlb page number decrease failed on movable nodes
From: Muchun Song @ 2023-09-06 2:32 UTC (permalink / raw)
To: Mike Kravetz
Cc: Yuan Can, Andrew Morton, Linux-MM, Kefeng Wang,
David Hildenbrand, Michal Hocko
> On Sep 6, 2023, at 08:28, Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 09/05/23 17:06, Muchun Song wrote:
>>
>>
>>> On Sep 5, 2023, at 11:13, Yuan Can <yuancan@huawei.com> wrote:
>>>
>>> Decreasing the number of hugetlb pages failed with the following message:
>>>
>>> sh: page allocation failure: order:0, mode:0x204cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL|__GFP_THISNODE)
>>> CPU: 1 PID: 112 Comm: sh Not tainted 6.5.0-rc7-... #45
>>> Hardware name: linux,dummy-virt (DT)
>>> Call trace:
>>> dump_backtrace.part.6+0x84/0xe4
>>> show_stack+0x18/0x24
>>> dump_stack_lvl+0x48/0x60
>>> dump_stack+0x18/0x24
>>> warn_alloc+0x100/0x1bc
>>> __alloc_pages_slowpath.constprop.107+0xa40/0xad8
>>> __alloc_pages+0x244/0x2d0
>>> hugetlb_vmemmap_restore+0x104/0x1e4
>>> __update_and_free_hugetlb_folio+0x44/0x1f4
>>> update_and_free_hugetlb_folio+0x20/0x68
>>> update_and_free_pages_bulk+0x4c/0xac
>>> set_max_huge_pages+0x198/0x334
>>> nr_hugepages_store_common+0x118/0x178
>>> nr_hugepages_store+0x18/0x24
>>> kobj_attr_store+0x18/0x2c
>>> sysfs_kf_write+0x40/0x54
>>> kernfs_fop_write_iter+0x164/0x1dc
>>> vfs_write+0x3a8/0x460
>>> ksys_write+0x6c/0x100
>>> __arm64_sys_write+0x1c/0x28
>>> invoke_syscall+0x44/0x100
>>> el0_svc_common.constprop.1+0x6c/0xe4
>>> do_el0_svc+0x38/0x94
>>> el0_svc+0x28/0x74
>>> el0t_64_sync_handler+0xa0/0xc4
>>> el0t_64_sync+0x174/0x178
>>> Mem-Info:
>>> ...
>>>
>>> The reason is that the hugetlb pages being released are allocated from
>>> movable nodes, and with hugetlb_optimize_vmemmap enabled, vmemmap pages
>>> need to be allocated from the same node during the hugetlb pages
>>
>> Thanks for your fix. I think this is a real-world issue, so it is better
>> to add a Fixes tag to indicate backporting. Thanks.
>>
>
> I thought we might get the same error (unable to allocate on a movable
> node) when creating the hugetlb page. Why? Because we replace the head
> vmemmap page. However, I see that a failure to allocate there is not a
> fatal error, and we fall back to the currently mapped page. We also pass
> __GFP_NOWARN to that allocation attempt, so there will be no report of the
> failure.
>
> We might want to change this as well?
I think yes. I also thought about this yesterday, but I think
this one is not a fatal error; it should be an improvement patch.
So it is better not to fold this change into this patch (a bug fix one).
Thanks.
>
>>> releasing. With GFP_KERNEL and __GFP_THISNODE set, allocating from a
>>> movable node always fails. Fix this problem by removing __GFP_THISNODE.
>>>
>>> Signed-off-by: Yuan Can <yuancan@huawei.com>
>>> ---
>>> mm/hugetlb_vmemmap.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>>> index c2007ef5e9b0..0485e471d224 100644
>>> --- a/mm/hugetlb_vmemmap.c
>>> +++ b/mm/hugetlb_vmemmap.c
>>> @@ -386,7 +386,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
>>> static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
>>> struct list_head *list)
>>> {
>>> - gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;
>>> + gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
>>
>> There is a small change for the non-movable case after this patch: we first
>> try to allocate memory from the preferred node (the same as before), and if
>> that fails, we now fall back to other nodes. For me, it makes sense. At least
>> those huge pages could be freed once other nodes could satisfy the allocation
>> of vmemmap pages.
>>
>> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
>
> This looks reasonable to me as well.
>
> Cc'ing David and Michal as they are expert in hotplug.
> --
> Mike Kravetz
>
>>
>> Thanks.
>>
>>> unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
>>> int nid = page_to_nid((struct page *)start);
>>> struct page *page, *next;
>>> --
>>> 2.17.1
* Re: [PATCH 1/2] mm: hugetlb_vmemmap: fix hugetlb page number decrease failed on movable nodes
From: Yuan Can @ 2023-09-06 2:59 UTC (permalink / raw)
To: Muchun Song, Mike Kravetz
Cc: Andrew Morton, Linux-MM, Kefeng Wang, David Hildenbrand, Michal Hocko
On 2023/9/6 10:32, Muchun Song wrote:
>
>> On Sep 6, 2023, at 08:28, Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>
>> On 09/05/23 17:06, Muchun Song wrote:
>>>
>>>> On Sep 5, 2023, at 11:13, Yuan Can <yuancan@huawei.com> wrote:
>>>>
>>>> Decreasing the number of hugetlb pages failed with the following message:
>>>>
>>>> sh: page allocation failure: order:0, mode:0x204cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL|__GFP_THISNODE)
>>>> CPU: 1 PID: 112 Comm: sh Not tainted 6.5.0-rc7-... #45
>>>> Hardware name: linux,dummy-virt (DT)
>>>> Call trace:
>>>> dump_backtrace.part.6+0x84/0xe4
>>>> show_stack+0x18/0x24
>>>> dump_stack_lvl+0x48/0x60
>>>> dump_stack+0x18/0x24
>>>> warn_alloc+0x100/0x1bc
>>>> __alloc_pages_slowpath.constprop.107+0xa40/0xad8
>>>> __alloc_pages+0x244/0x2d0
>>>> hugetlb_vmemmap_restore+0x104/0x1e4
>>>> __update_and_free_hugetlb_folio+0x44/0x1f4
>>>> update_and_free_hugetlb_folio+0x20/0x68
>>>> update_and_free_pages_bulk+0x4c/0xac
>>>> set_max_huge_pages+0x198/0x334
>>>> nr_hugepages_store_common+0x118/0x178
>>>> nr_hugepages_store+0x18/0x24
>>>> kobj_attr_store+0x18/0x2c
>>>> sysfs_kf_write+0x40/0x54
>>>> kernfs_fop_write_iter+0x164/0x1dc
>>>> vfs_write+0x3a8/0x460
>>>> ksys_write+0x6c/0x100
>>>> __arm64_sys_write+0x1c/0x28
>>>> invoke_syscall+0x44/0x100
>>>> el0_svc_common.constprop.1+0x6c/0xe4
>>>> do_el0_svc+0x38/0x94
>>>> el0_svc+0x28/0x74
>>>> el0t_64_sync_handler+0xa0/0xc4
>>>> el0t_64_sync+0x174/0x178
>>>> Mem-Info:
>>>> ...
>>>>
>>>> The reason is that the hugetlb pages being released are allocated from
>>>> movable nodes, and with hugetlb_optimize_vmemmap enabled, vmemmap pages
>>>> need to be allocated from the same node during the hugetlb pages
>>> Thanks for your fix. I think this is a real-world issue, so it is better
>>> to add a Fixes tag to indicate backporting. Thanks.
>>>
>> I thought we might get the same error (unable to allocate on a movable
>> node) when creating the hugetlb page. Why? Because we replace the head
>> vmemmap page. However, I see that a failure to allocate there is not a
>> fatal error, and we fall back to the currently mapped page. We also pass
>> __GFP_NOWARN to that allocation attempt, so there will be no report of the
>> failure.
>>
>> We might want to change this as well?
> I think yes. I also thought about this yesterday, but I think
> this one is not a fatal error; it should be an improvement patch.
> So it is better not to fold this change into this patch (a bug fix one).
>
> Thanks.
Sure, let me send another patch passing __GFP_NOWARN.
--
Best regards,
Yuan Can
* Re: [PATCH 1/2] mm: hugetlb_vmemmap: fix hugetlb page number decrease failed on movable nodes
From: David Hildenbrand @ 2023-09-06 7:25 UTC (permalink / raw)
To: Mike Kravetz, Muchun Song
Cc: Yuan Can, Andrew Morton, Linux-MM, wangkefeng.wang, Michal Hocko
>>> releasing. With GFP_KERNEL and __GFP_THISNODE set, allocating from a
>>> movable node always fails. Fix this problem by removing __GFP_THISNODE.
>>>
>>> Signed-off-by: Yuan Can <yuancan@huawei.com>
>>> ---
>>> mm/hugetlb_vmemmap.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>>> index c2007ef5e9b0..0485e471d224 100644
>>> --- a/mm/hugetlb_vmemmap.c
>>> +++ b/mm/hugetlb_vmemmap.c
>>> @@ -386,7 +386,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
>>> static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
>>> struct list_head *list)
>>> {
>>> - gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;
>>> + gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
>>
>> There is a small change for the non-movable case after this patch: we first
>> try to allocate memory from the preferred node (the same as before), and if
>> that fails, we now fall back to other nodes. For me, it makes sense. At least
>> those huge pages could be freed once other nodes could satisfy the allocation
>> of vmemmap pages.
>>
>> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
>
> This looks reasonable to me as well.
>
> Cc'ing David and Michal as they are expert in hotplug.
IIUC, we still won't allocate from ZONE_MOVABLE / MIGRATE_CMA (due to
GFP_KERNEL), so it should be fine.
--
Cheers,
David / dhildenb
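David's point can be sketched in the same spirit: zone selection depends only on the gfp flags, and a request without __GFP_MOVABLE is never placed in ZONE_MOVABLE. The helper below is an invented simplification (the real kernel's gfp_zone() uses a packed lookup table and more zones), shown only to illustrate the rule.

```c
#include <assert.h>

/* Illustrative zone indices and flag, not the kernel's actual values. */
enum sketch_zone { SKETCH_ZONE_NORMAL, SKETCH_ZONE_MOVABLE };
#define SKETCH_GFP_MOVABLE 0x2u

/* Highest zone an allocation may come from: only requests carrying
 * __GFP_MOVABLE are allowed to land in ZONE_MOVABLE. */
enum sketch_zone sketch_gfp_zone(unsigned int gfp_flags)
{
	if (gfp_flags & SKETCH_GFP_MOVABLE)
		return SKETCH_ZONE_MOVABLE;
	return SKETCH_ZONE_NORMAL;
}
```

A GFP_KERNEL-style request (no __GFP_MOVABLE bit) therefore never draws from ZONE_MOVABLE or CMA pageblocks even after __GFP_THISNODE is dropped, which is why the fallback is safe.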