* [PATCH 1/2] mm/cma: drop incorrect alignment check in cma_init_reserved_mem
@ 2024-04-04 16:25 Frank van der Linden
2024-04-04 16:25 ` [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid Frank van der Linden
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Frank van der Linden @ 2024-04-04 16:25 UTC (permalink / raw)
To: linux-mm, muchun.song, akpm
Cc: linux-kernel, Frank van der Linden, Marek Szyprowski, David Hildenbrand
cma_init_reserved_mem uses IS_ALIGNED to check if the size
represented by one bit in the cma allocation bitmask is
aligned with CMA_MIN_ALIGNMENT_BYTES (pageblock size).
However, this is too strict, as this will fail if
order_per_bit > pageblock_order, which is a valid configuration.
We could check IS_ALIGNED both ways, but since both numbers are
powers of two, no check is needed at all.
Signed-off-by: Frank van der Linden <fvdl@google.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: David Hildenbrand <david@redhat.com>
Fixes: de9e14eebf33 ("drivers: dma-contiguous: add initialization from device tree")
---
mm/cma.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/mm/cma.c b/mm/cma.c
index 01f5a8f71ddf..3e9724716bad 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -182,10 +182,6 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
if (!size || !memblock_is_region_reserved(base, size))
return -EINVAL;
- /* alignment should be aligned with order_per_bit */
- if (!IS_ALIGNED(CMA_MIN_ALIGNMENT_PAGES, 1 << order_per_bit))
- return -EINVAL;
-
/* ensure minimal alignment required by mm core */
if (!IS_ALIGNED(base | size, CMA_MIN_ALIGNMENT_BYTES))
return -EINVAL;
--
2.44.0.478.gd926399ef9-goog
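
[Editor's note: a minimal sketch of the "powers of two" argument above, not
part of the patch. It assumes the usual IS_ALIGNED semantics ("x is a
multiple of a") and example orders picked for illustration; for two powers
of two the larger is always a multiple of the smaller, so the dropped
one-directional check could only ever reject configurations the patch
considers valid, i.e. order_per_bit > pageblock_order.

	/* Illustrative only -- not kernel code. */
	#include <assert.h>

	#define IS_ALIGNED(x, a)	(((x) & ((a) - 1)) == 0)

	int main(void)
	{
		unsigned long pageblock_pages = 1UL << 9;  /* pageblock_order = 9 (assumed) */
		unsigned long small_bit = 1UL << 5;        /* order_per_bit < pageblock_order */
		unsigned long large_bit = 1UL << 18;       /* order_per_bit > pageblock_order */

		/* The removed check passes here ... */
		assert(IS_ALIGNED(pageblock_pages, small_bit));
		/* ... but would have rejected this valid configuration ... */
		assert(!IS_ALIGNED(pageblock_pages, large_bit));
		/* ... even though alignment holds in the other direction: */
		assert(IS_ALIGNED(large_bit, pageblock_pages));
		return 0;
	}

The remaining base/size check against CMA_MIN_ALIGNMENT_BYTES, visible in the
diff context above, is unaffected.]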
* [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid
2024-04-04 16:25 [PATCH 1/2] mm/cma: drop incorrect alignment check in cma_init_reserved_mem Frank van der Linden
@ 2024-04-04 16:25 ` Frank van der Linden
2024-04-04 18:56 ` Roman Gushchin
` (3 more replies)
2024-04-04 20:15 ` [PATCH 1/2] mm/cma: drop incorrect alignment check in cma_init_reserved_mem Andrew Morton
[not found] ` <93eccef7-a559-4ad8-be0f-8cc99c00bd09@redhat.com>
2 siblings, 4 replies; 18+ messages in thread
From: Frank van der Linden @ 2024-04-04 16:25 UTC (permalink / raw)
To: linux-mm, muchun.song, akpm
Cc: linux-kernel, Frank van der Linden, Roman Gushchin
The hugetlb_cma code passes 0 in the order_per_bit argument to
cma_declare_contiguous_nid (the alignment, computed using the
page order, is correctly passed in).
This causes a bit in the cma allocation bitmap to always represent
a 4k page, making the bitmaps potentially very large, and slower.
So, correctly pass in the order instead.
Signed-off-by: Frank van der Linden <fvdl@google.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
---
mm/hugetlb.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 23ef240ba48a..6dc62d8b2a3a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -7873,9 +7873,9 @@ void __init hugetlb_cma_reserve(int order)
* huge page demotion.
*/
res = cma_declare_contiguous_nid(0, size, 0,
- PAGE_SIZE << HUGETLB_PAGE_ORDER,
- 0, false, name,
- &hugetlb_cma[nid], nid);
+ PAGE_SIZE << HUGETLB_PAGE_ORDER,
+ HUGETLB_PAGE_ORDER, false, name,
+ &hugetlb_cma[nid], nid);
if (res) {
pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
res, nid);
--
2.44.0.478.gd926399ef9-goog
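
[Editor's note: a minimal sketch of what order_per_bit means for the bitmap;
names and numbers here are assumptions for illustration, and the concrete
64G figures appear in Frank's reply to Andrew further down the thread.

	/*
	 * One bitmap bit covers 2^order_per_bit base pages, so a CMA area of
	 * 'size' bytes needs:
	 *
	 *	bits  = size >> (PAGE_SHIFT + order_per_bit)
	 *	bytes = bits / 8
	 *
	 * Passing 0 instead of HUGETLB_PAGE_ORDER therefore inflates the
	 * bitmap by a factor of 2^HUGETLB_PAGE_ORDER -- 512x with 2 MiB huge
	 * pages on 4 KiB base pages -- which is the "potentially very large"
	 * in the changelog.
	 */
]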
* Re: [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid
2024-04-04 16:25 ` [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid Frank van der Linden
@ 2024-04-04 18:56 ` Roman Gushchin
2024-04-04 19:40 ` Frank van der Linden
2024-04-04 20:13 ` David Hildenbrand
` (2 subsequent siblings)
3 siblings, 1 reply; 18+ messages in thread
From: Roman Gushchin @ 2024-04-04 18:56 UTC (permalink / raw)
To: Frank van der Linden; +Cc: linux-mm, muchun.song, akpm, linux-kernel
On Thu, Apr 04, 2024 at 04:25:15PM +0000, Frank van der Linden wrote:
> The hugetlb_cma code passes 0 in the order_per_bit argument to
> cma_declare_contiguous_nid (the alignment, computed using the
> page order, is correctly passed in).
>
> This causes a bit in the cma allocation bitmap to always represent
> a 4k page, making the bitmaps potentially very large, and slower.
>
> So, correctly pass in the order instead.
>
> Signed-off-by: Frank van der Linden <fvdl@google.com>
> Cc: Roman Gushchin <roman.gushchin@linux.dev>
> Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Hi Frank,
there is a comment just above your changes which explains why order_per_bit is 0.
Is this not true anymore? If so, please, fix the comment too. Please, clarify.
Thanks!
* Re: [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid
2024-04-04 18:56 ` Roman Gushchin
@ 2024-04-04 19:40 ` Frank van der Linden
2024-04-04 20:45 ` Roman Gushchin
0 siblings, 1 reply; 18+ messages in thread
From: Frank van der Linden @ 2024-04-04 19:40 UTC (permalink / raw)
To: Roman Gushchin; +Cc: linux-mm, muchun.song, akpm, linux-kernel
On Thu, Apr 4, 2024 at 11:56 AM Roman Gushchin <roman.gushchin@linux.dev> wrote:
>
> On Thu, Apr 04, 2024 at 04:25:15PM +0000, Frank van der Linden wrote:
> > The hugetlb_cma code passes 0 in the order_per_bit argument to
> > cma_declare_contiguous_nid (the alignment, computed using the
> > page order, is correctly passed in).
> >
> > This causes a bit in the cma allocation bitmap to always represent
> > a 4k page, making the bitmaps potentially very large, and slower.
> >
> > So, correctly pass in the order instead.
> >
> > Signed-off-by: Frank van der Linden <fvdl@google.com>
> > Cc: Roman Gushchin <roman.gushchin@linux.dev>
> > Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
>
> Hi Frank,
>
> there is a comment just above your changes which explains why order_per_bit is 0.
> Is this not true anymore? If so, please, fix the comment too. Please, clarify.
>
> Thanks!
Hi Roman,
I'm assuming you're referring to this comment:
/*
* Note that 'order per bit' is based on smallest size that
* may be returned to CMA allocator in the case of
* huge page demotion.
*/
That comment was added in a01f43901cfb9 ("hugetlb: be sure to free
demoted CMA pages to CMA").
It talks about HUGETLB_PAGE_ORDER being the minimum order being given
back to the CMA allocator (after hugetlb demotion), therefore
order_per_bit must be HUGETLB_PAGE_ORDER. See the commit message for
that commit:
"Therefore, at region setup time we use HUGETLB_PAGE_ORDER as the
smallest possible huge page size that can be given back to CMA."
But the commit, while correctly changing the alignment, left the
order_per_bit argument at 0, even though it clearly intended to set
it at HUGETLB_PAGE_ORDER. The confusion may have been that
cma_declare_contiguous_nid has 9 arguments, several of which can be
left at 0 meaning 'use default', so it's easy to misread.
In other words, the comment was correct, but the code was not. After
this patch, comment and code match.
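
[Editor's note: for reference, the prototype being discussed looks roughly
like the following (paraphrased from include/linux/cma.h; check the tree for
the exact declaration). With alignment and order_per_bit sitting next to each
other, and several parameters legitimately left at 0, it is easy to see how
the mix-up happened.

	int __init cma_declare_contiguous_nid(phys_addr_t base,
			phys_addr_t size, phys_addr_t limit,
			phys_addr_t alignment, unsigned int order_per_bit,
			bool fixed, const char *name, struct cma **res_cma,
			int nid);

	/*
	 * The hugetlb_cma call in question maps as:
	 *	base          = 0	(let CMA pick)
	 *	size          = size
	 *	limit         = 0	(no limit)
	 *	alignment     = PAGE_SIZE << HUGETLB_PAGE_ORDER	(correct)
	 *	order_per_bit = 0	(should have been HUGETLB_PAGE_ORDER)
	 */
]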
* Re: [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid
2024-04-04 16:25 ` [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid Frank van der Linden
2024-04-04 18:56 ` Roman Gushchin
@ 2024-04-04 20:13 ` David Hildenbrand
2024-04-04 20:52 ` Roman Gushchin
2024-04-04 21:44 ` Frank van der Linden
2024-04-04 20:17 ` Andrew Morton
2024-04-07 8:02 ` Muchun Song
3 siblings, 2 replies; 18+ messages in thread
From: David Hildenbrand @ 2024-04-04 20:13 UTC (permalink / raw)
To: Frank van der Linden, linux-mm, muchun.song, akpm
Cc: linux-kernel, Roman Gushchin
On 04.04.24 18:25, Frank van der Linden wrote:
> The hugetlb_cma code passes 0 in the order_per_bit argument to
> cma_declare_contiguous_nid (the alignment, computed using the
> page order, is correctly passed in).
>
> This causes a bit in the cma allocation bitmap to always represent
> a 4k page, making the bitmaps potentially very large, and slower.
>
> So, correctly pass in the order instead.
>
> Signed-off-by: Frank van der Linden <fvdl@google.com>
> Cc: Roman Gushchin <roman.gushchin@linux.dev>
> Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
It might be suboptimal, but do we call it a "BUG" that needs "fixing". I
know, controversial :)
> ---
> mm/hugetlb.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 23ef240ba48a..6dc62d8b2a3a 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -7873,9 +7873,9 @@ void __init hugetlb_cma_reserve(int order)
> * huge page demotion.
> */
> res = cma_declare_contiguous_nid(0, size, 0,
> - PAGE_SIZE << HUGETLB_PAGE_ORDER,
> - 0, false, name,
> - &hugetlb_cma[nid], nid);
> + PAGE_SIZE << HUGETLB_PAGE_ORDER,
> + HUGETLB_PAGE_ORDER, false, name,
> + &hugetlb_cma[nid], nid);
> if (res) {
> pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
> res, nid);
... I'm afraid this is not completely correct.
For example, on arm64, HUGETLB_PAGE_ORDER is essentially PMD_ORDER.
... but we do support smaller hugetlb sizes than that (cont-pte hugetlb
size is 64 KiB, not 2 MiB -- PMD -- on a 4k kernel)
--
Cheers,
David / dhildenb
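
[Editor's note: for readers unfamiliar with the arm64 layout, the hugetlb
sizes David refers to on a 4 KiB-granule kernel are, to the best of the
editor's recollection (verify against the arch code):

	/*
	 *	cont-PTE hugetlb:  64 KiB  -> order 4
	 *	PMD hugetlb:        2 MiB  -> order 9  (HUGETLB_PAGE_ORDER)
	 *	cont-PMD hugetlb:  32 MiB  -> order 13
	 *	PUD hugetlb:        1 GiB  -> order 18
	 */

The cont-PTE size being smaller than HUGETLB_PAGE_ORDER is the concern here;
Frank's follow-up below explains why it does not matter for this code path.]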
* Re: [PATCH 1/2] mm/cma: drop incorrect alignment check in cma_init_reserved_mem
2024-04-04 16:25 [PATCH 1/2] mm/cma: drop incorrect alignment check in cma_init_reserved_mem Frank van der Linden
2024-04-04 16:25 ` [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid Frank van der Linden
@ 2024-04-04 20:15 ` Andrew Morton
2024-04-04 20:45 ` Frank van der Linden
[not found] ` <93eccef7-a559-4ad8-be0f-8cc99c00bd09@redhat.com>
2 siblings, 1 reply; 18+ messages in thread
From: Andrew Morton @ 2024-04-04 20:15 UTC (permalink / raw)
To: Frank van der Linden
Cc: linux-mm, muchun.song, linux-kernel, Marek Szyprowski, David Hildenbrand
On Thu, 4 Apr 2024 16:25:14 +0000 Frank van der Linden <fvdl@google.com> wrote:
> cma_init_reserved_mem uses IS_ALIGNED to check if the size
> represented by one bit in the cma allocation bitmask is
> aligned with CMA_MIN_ALIGNMENT_BYTES (pageblock size).
>
> However, this is too strict, as this will fail if
> order_per_bit > pageblock_order, which is a valid configuration.
>
> We could check IS_ALIGNED both ways, but since both numbers are
> powers of two, no check is needed at all.
What are the userspace visible effects of this bug?
* Re: [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid
2024-04-04 16:25 ` [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid Frank van der Linden
2024-04-04 18:56 ` Roman Gushchin
2024-04-04 20:13 ` David Hildenbrand
@ 2024-04-04 20:17 ` Andrew Morton
2024-04-04 21:58 ` Frank van der Linden
2024-04-07 8:02 ` Muchun Song
3 siblings, 1 reply; 18+ messages in thread
From: Andrew Morton @ 2024-04-04 20:17 UTC (permalink / raw)
To: Frank van der Linden; +Cc: linux-mm, muchun.song, linux-kernel, Roman Gushchin
On Thu, 4 Apr 2024 16:25:15 +0000 Frank van der Linden <fvdl@google.com> wrote:
> The hugetlb_cma code passes 0 in the order_per_bit argument to
> cma_declare_contiguous_nid (the alignment, computed using the
> page order, is correctly passed in).
>
> This causes a bit in the cma allocation bitmap to always represent
> a 4k page, making the bitmaps potentially very large, and slower.
>
> So, correctly pass in the order instead.
Ditto. Should we backport this? Can we somewhat quantify "potentially very",
and understand under what circumstances this might occur?
* Re: [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid
2024-04-04 19:40 ` Frank van der Linden
@ 2024-04-04 20:45 ` Roman Gushchin
0 siblings, 0 replies; 18+ messages in thread
From: Roman Gushchin @ 2024-04-04 20:45 UTC (permalink / raw)
To: Frank van der Linden; +Cc: linux-mm, muchun.song, akpm, linux-kernel
On Thu, Apr 04, 2024 at 12:40:58PM -0700, Frank van der Linden wrote:
> On Thu, Apr 4, 2024 at 11:56 AM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> >
> > On Thu, Apr 04, 2024 at 04:25:15PM +0000, Frank van der Linden wrote:
> > > The hugetlb_cma code passes 0 in the order_per_bit argument to
> > > cma_declare_contiguous_nid (the alignment, computed using the
> > > page order, is correctly passed in).
> > >
> > > This causes a bit in the cma allocation bitmap to always represent
> > > a 4k page, making the bitmaps potentially very large, and slower.
> > >
> > > So, correctly pass in the order instead.
> > >
> > > Signed-off-by: Frank van der Linden <fvdl@google.com>
> > > Cc: Roman Gushchin <roman.gushchin@linux.dev>
> > > Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
> >
> > Hi Frank,
> >
> > there is a comment just above your changes which explains why order_per_bit is 0.
> > Is this not true anymore? If so, please, fix the comment too. Please, clarify.
> >
> > Thanks!
>
> Hi Roman,
>
> I'm assuming you're referring to this comment:
>
> /*
> * Note that 'order per bit' is based on smallest size that
> * may be returned to CMA allocator in the case of
> * huge page demotion.
> */
>
> That comment was added in a01f43901cfb9 ("hugetlb: be sure to free
> demoted CMA pages to CMA").
>
> It talks about HUGETLB_PAGE_ORDER being the minimum order being given
> back to the CMA allocator (after hugetlb demotion), therefore
> order_per_bit must be HUGETLB_PAGE_ORDER. See the commit message for
> that commit:
>
> "Therefore, at region setup time we use HUGETLB_PAGE_ORDER as the
> smallest possible huge page size that can be given back to CMA."
>
> But the commit, while correctly changing the alignment, left the
> order_per_bit argument at 0, even though it clearly intended to set
> it at HUGETLB_PAGE_ORDER. The confusion may have been that
> cma_declare_contiguous_nid has 9 arguments, several of which can be
> left at 0 meaning 'use default', so it's easy to misread.
>
> In other words, the comment was correct, but the code was not. After
> this patch, comment and code match.
Indeed, the mentioned commit adding a comment that was not aligned
with the code was confusing. It all makes sense now, thank you for
the explanation!
Please, feel free to add
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
for your patch.
Thanks!
* Re: [PATCH 1/2] mm/cma: drop incorrect alignment check in cma_init_reserved_mem
2024-04-04 20:15 ` [PATCH 1/2] mm/cma: drop incorrect alignment check in cma_init_reserved_mem Andrew Morton
@ 2024-04-04 20:45 ` Frank van der Linden
0 siblings, 0 replies; 18+ messages in thread
From: Frank van der Linden @ 2024-04-04 20:45 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, muchun.song, linux-kernel, Marek Szyprowski, David Hildenbrand
On Thu, Apr 4, 2024 at 1:15 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Thu, 4 Apr 2024 16:25:14 +0000 Frank van der Linden <fvdl@google.com> wrote:
>
> > cma_init_reserved_mem uses IS_ALIGNED to check if the size
> > represented by one bit in the cma allocation bitmask is
> > aligned with CMA_MIN_ALIGNMENT_BYTES (pageblock size).
> >
> > However, this is too strict, as this will fail if
> > order_per_bit > pageblock_order, which is a valid configuration.
> >
> > We could check IS_ALIGNED both ways, but since both numbers are
> > powers of two, no check is needed at all.
>
> What are the userspace visible effects of this bug?
None that I know of. This bug was exposed because I made the hugetlb
code correctly pass the right order_per_bit argument (see the
accompanying hugetlb cma fix), which then tripped this check when I
backported it to an older kernel, passing an order of 30 (1G hugetlb
page) as order_per_bit. This actually won't happen for 6.9-rc, since
the (intended) order_per_bit was reduced to HUGETLB_PAGE_ORDER because
of hugetlb page demotion.
So, no user visible effects. However, if the other fix is going to be
backported, this one is a prereq.
- Frank
* Re: [PATCH 1/2] mm/cma: drop incorrect alignment check in cma_init_reserved_mem
[not found] ` <93eccef7-a559-4ad8-be0f-8cc99c00bd09@redhat.com>
@ 2024-04-04 20:48 ` Frank van der Linden
0 siblings, 0 replies; 18+ messages in thread
From: Frank van der Linden @ 2024-04-04 20:48 UTC (permalink / raw)
To: David Hildenbrand
Cc: linux-mm, muchun.song, akpm, linux-kernel, Marek Szyprowski
On Thu, Apr 4, 2024 at 1:05 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 04.04.24 18:25, Frank van der Linden wrote:
> > cma_init_reserved_mem uses IS_ALIGNED to check if the size
> > represented by one bit in the cma allocation bitmask is
> > aligned with CMA_MIN_ALIGNMENT_BYTES (pageblock size).
>
> I recall the important part is that our area always covers full
> pageblocks (CMA_MIN_ALIGNMENT_BYTES), because we cannot have "partial
> CMA" pageblocks.
>
> Internally, allocating from multiple pageblocks should just work.
>
> It's late in Germany, hopefully I am not missing something
>
> Acked-by: David Hildenbrand <david@redhat.com>
>
> >
> > However, this is too strict, as this will fail if
> > order_per_bit > pageblock_order, which is a valid configuration.
> >
> > We could check IS_ALIGNED both ways, but since both numbers are
> > powers of two, no check is needed at all.
> >
> > Signed-off-by: Frank van der Linden <fvdl@google.com>
> > Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> > Cc: David Hildenbrand <david@redhat.com>
> > Fixes: de9e14eebf33 ("drivers: dma-contiguous: add initialization from device tree")
>
> Is there a real setup/BUG we are fixing? Why did we not stumble over
> that earlier?
>
> If so, please describe that in the patch description.
Nobody stumbled over it because the only user of CMA that should have
passed in an order_per_bit large enough to trigger this was
hugetlb_cma. However, because of a bug, it didn't :) When I fixed
that, I noticed that this check fired.
- Frank
* Re: [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid
2024-04-04 20:13 ` David Hildenbrand
@ 2024-04-04 20:52 ` Roman Gushchin
2024-04-04 22:02 ` Frank van der Linden
2024-04-04 21:44 ` Frank van der Linden
1 sibling, 1 reply; 18+ messages in thread
From: Roman Gushchin @ 2024-04-04 20:52 UTC (permalink / raw)
To: David Hildenbrand
Cc: Frank van der Linden, linux-mm, muchun.song, akpm, linux-kernel
On Thu, Apr 04, 2024 at 10:13:21PM +0200, David Hildenbrand wrote:
> On 04.04.24 18:25, Frank van der Linden wrote:
> > The hugetlb_cma code passes 0 in the order_per_bit argument to
> > cma_declare_contiguous_nid (the alignment, computed using the
> > page order, is correctly passed in).
> >
> > This causes a bit in the cma allocation bitmap to always represent
> > a 4k page, making the bitmaps potentially very large, and slower.
> >
> > So, correctly pass in the order instead.
> >
> > Signed-off-by: Frank van der Linden <fvdl@google.com>
> > Cc: Roman Gushchin <roman.gushchin@linux.dev>
> > Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
>
> It might be suboptimal, but do we call it a "BUG" that needs "fixing". I
> know, controversial :)
We probably should not rush with a stable backport, especially given your
next comment on page sizes on arm.
* Re: [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid
2024-04-04 20:13 ` David Hildenbrand
2024-04-04 20:52 ` Roman Gushchin
@ 2024-04-04 21:44 ` Frank van der Linden
2024-04-04 22:22 ` Frank van der Linden
2024-04-08 8:15 ` David Hildenbrand
1 sibling, 2 replies; 18+ messages in thread
From: Frank van der Linden @ 2024-04-04 21:44 UTC (permalink / raw)
To: David Hildenbrand
Cc: linux-mm, muchun.song, akpm, linux-kernel, Roman Gushchin
On Thu, Apr 4, 2024 at 1:13 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 04.04.24 18:25, Frank van der Linden wrote:
> > The hugetlb_cma code passes 0 in the order_per_bit argument to
> > cma_declare_contiguous_nid (the alignment, computed using the
> > page order, is correctly passed in).
> >
> > This causes a bit in the cma allocation bitmap to always represent
> > a 4k page, making the bitmaps potentially very large, and slower.
> >
> > So, correctly pass in the order instead.
> >
> > Signed-off-by: Frank van der Linden <fvdl@google.com>
> > Cc: Roman Gushchin <roman.gushchin@linux.dev>
> > Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
>
> It might be suboptimal, but do we call it a "BUG" that needs "fixing". I
> know, controversial :)
>
> > ---
> > mm/hugetlb.c | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 23ef240ba48a..6dc62d8b2a3a 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -7873,9 +7873,9 @@ void __init hugetlb_cma_reserve(int order)
> > * huge page demotion.
> > */
> > res = cma_declare_contiguous_nid(0, size, 0,
> > - PAGE_SIZE << HUGETLB_PAGE_ORDER,
> > - 0, false, name,
> > - &hugetlb_cma[nid], nid);
> > + PAGE_SIZE << HUGETLB_PAGE_ORDER,
> > + HUGETLB_PAGE_ORDER, false, name,
> > + &hugetlb_cma[nid], nid);
> > if (res) {
> > pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
> > res, nid);
>
> ... I'm afraid this is not completely correct.
>
> For example, on arm64, HUGETLB_PAGE_ORDER is essentially PMD_ORDER.
>
> ... but we do support smaller hugetlb sizes than that (cont-pte hugetlb
> size is 64 KiB, not 2 MiB -- PMD -- on a 4k kernel)
Right, smaller hugetlb page sizes exist. But, the value here is not
intended to represent the minimum hugetlb page size - it's the minimum
hugetlb page size that we can demote a CMA-allocated hugetlb page to.
See:
a01f43901cfb ("hugetlb: be sure to free demoted CMA pages to CMA")
So, this just restricts demotion of the gigantic, CMA-allocated pages
to HUGETLB_PAGE_ORDER-sized chunks.
- Frank
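
[Editor's note: the connection between demotion and order_per_bit, as the
editor understands it: CMA can only track and release memory at the
granularity of one bitmap bit, i.e. 2^order_per_bit pages, so order_per_bit
must not exceed the order of the smallest piece that is ever handed back.
With demotion capped at HUGETLB_PAGE_ORDER, a demoted 1 GiB page becomes
1 GiB / 2 MiB = 512 pieces of exactly that order, so HUGETLB_PAGE_ORDER is
the largest order_per_bit that stays safe.]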
* Re: [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid
2024-04-04 20:17 ` Andrew Morton
@ 2024-04-04 21:58 ` Frank van der Linden
0 siblings, 0 replies; 18+ messages in thread
From: Frank van der Linden @ 2024-04-04 21:58 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, muchun.song, linux-kernel, Roman Gushchin
On Thu, Apr 4, 2024 at 1:17 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Thu, 4 Apr 2024 16:25:15 +0000 Frank van der Linden <fvdl@google.com> wrote:
>
> > The hugetlb_cma code passes 0 in the order_per_bit argument to
> > cma_declare_contiguous_nid (the alignment, computed using the
> > page order, is correctly passed in).
> >
> > This causes a bit in the cma allocation bitmap to always represent
> > a 4k page, making the bitmaps potentially very large, and slower.
> >
> > So, correctly pass in the order instead.
>
> Ditto. Should we backport this? Can we somewhat quantify "potentially very",
> and understand under what circumstances this might occur?
It would create bitmaps that would be pretty big. E.g. for a 4k page
size on x86, hugetlb_cma=64G would mean a bitmap size of (64G / 4k) /
8 == 2M. With HUGETLB_PAGE_ORDER as order_per_bit, as intended, this
would be (64G / 2M) / 8 == 4k. So, that's quite a difference :)
Also, this restricted the hugetlb_cma area to ((PAGE_SIZE <<
MAX_PAGE_ORDER) * 8) * PAGE_SIZE (e.g. 128G on x86), since
bitmap_alloc uses normal page allocation, and is thus restricted by
MAX_PAGE_ORDER. Specifying anything above that would fail the CMA
initialization.
- Frank
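
[Editor's note: a quick check of the 128G figure above, using assumed x86
defaults (PAGE_SIZE = 4 KiB, MAX_PAGE_ORDER = 10); back-of-the-envelope
arithmetic only, not kernel code.

	#include <stdio.h>

	int main(void)
	{
		unsigned long long page_size = 4096;
		unsigned int max_page_order = 10;	/* assumed x86 default */

		/* largest bitmap the page allocator can hand out */
		unsigned long long max_bitmap = page_size << max_page_order;	/* 4 MiB */

		/* with order_per_bit = 0, each bit covers one base page, so
		 * the largest hugetlb_cma area that bitmap can describe is: */
		unsigned long long max_area = max_bitmap * 8 * page_size;

		printf("%llu GiB\n", max_area >> 30);	/* prints: 128 GiB */
		return 0;
	}
]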
* Re: [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid
2024-04-04 20:52 ` Roman Gushchin
@ 2024-04-04 22:02 ` Frank van der Linden
2024-04-04 22:20 ` Andrew Morton
0 siblings, 1 reply; 18+ messages in thread
From: Frank van der Linden @ 2024-04-04 22:02 UTC (permalink / raw)
To: Roman Gushchin
Cc: David Hildenbrand, linux-mm, muchun.song, akpm, linux-kernel
Rushing is never good, of course, but see my reply to David - while
smaller hugetlb page sizes than HUGETLB_PAGE_ORDER exist, that's not
the issue in that particular code path.
The only restriction for backports is, I think, that the two patches
need to go together.
I have backported them to 6.6 (which was just a clean apply), and
5.10, which doesn't have hugetlb page demotion, so it actually can
pass the full 1G as order_per_bit. That works fine if you also apply
the CMA align check fix, but would fail otherwise.
- Frank
On Thu, Apr 4, 2024 at 1:52 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
>
> On Thu, Apr 04, 2024 at 10:13:21PM +0200, David Hildenbrand wrote:
> > On 04.04.24 18:25, Frank van der Linden wrote:
> > > The hugetlb_cma code passes 0 in the order_per_bit argument to
> > > cma_declare_contiguous_nid (the alignment, computed using the
> > > page order, is correctly passed in).
> > >
> > > This causes a bit in the cma allocation bitmap to always represent
> > > a 4k page, making the bitmaps potentially very large, and slower.
> > >
> > > So, correctly pass in the order instead.
> > >
> > > Signed-off-by: Frank van der Linden <fvdl@google.com>
> > > Cc: Roman Gushchin <roman.gushchin@linux.dev>
> > > Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
> >
> > It might be suboptimal, but do we call it a "BUG" that needs "fixing". I
> > know, controversial :)
>
> We probably should not rush with a stable backporting, especially given your
> next comment on page sizes on arm.
* Re: [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid
2024-04-04 22:02 ` Frank van der Linden
@ 2024-04-04 22:20 ` Andrew Morton
0 siblings, 0 replies; 18+ messages in thread
From: Andrew Morton @ 2024-04-04 22:20 UTC (permalink / raw)
To: Frank van der Linden
Cc: Roman Gushchin, David Hildenbrand, linux-mm, muchun.song, linux-kernel
On Thu, 4 Apr 2024 15:02:34 -0700 Frank van der Linden <fvdl@google.com> wrote:
> Rushing is never good, of course, but see my reply to David - while
> smaller hugetlb page sizes than HUGETLB_PAGE_ORDER exist, that's not
> the issue in that particular code path.
>
> The only restriction for backports is, I think, that the two patches
> need to go together.
>
> I have backported them to 6.6 (which was just a clean apply), and
> 5.10, which doesn't have hugetlb page demotion, so it actually can
> pass the full 1G as order_per_bit. That works fine if you also apply
> the CMA align check fix, but would fail otherwise.
OK, thanks. I added cc:stable to both patches and added this:
: It would create bitmaps that would be pretty big. E.g. for a 4k page
: size on x86, hugetlb_cma=64G would mean a bitmap size of (64G / 4k) / 8
: == 2M. With HUGETLB_PAGE_ORDER as order_per_bit, as intended, this
: would be (64G / 2M) / 8 == 4k. So, that's quite a difference.
:
: Also, this restricted the hugetlb_cma area to ((PAGE_SIZE <<
: MAX_PAGE_ORDER) * 8) * PAGE_SIZE (e.g. 128G on x86), since
: bitmap_alloc uses normal page allocation, and is thus restricted by
: MAX_PAGE_ORDER. Specifying anything above that would fail the CMA
: initialization.
to the [2/2] changelog.
For extra test & review I'll leave them in mm-[un]stable so they go
into mainline for 6.10-rc1 which will then trigger the backporting
process. This can of course all be altered...
* Re: [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid
2024-04-04 21:44 ` Frank van der Linden
@ 2024-04-04 22:22 ` Frank van der Linden
2024-04-08 8:15 ` David Hildenbrand
1 sibling, 0 replies; 18+ messages in thread
From: Frank van der Linden @ 2024-04-04 22:22 UTC (permalink / raw)
To: David Hildenbrand
Cc: linux-mm, muchun.song, akpm, linux-kernel, Roman Gushchin
On Thu, Apr 4, 2024 at 2:44 PM Frank van der Linden <fvdl@google.com> wrote:
>
> On Thu, Apr 4, 2024 at 1:13 PM David Hildenbrand <david@redhat.com> wrote:
> >
> > On 04.04.24 18:25, Frank van der Linden wrote:
> > > The hugetlb_cma code passes 0 in the order_per_bit argument to
> > > cma_declare_contiguous_nid (the alignment, computed using the
> > > page order, is correctly passed in).
> > >
> > > This causes a bit in the cma allocation bitmap to always represent
> > > a 4k page, making the bitmaps potentially very large, and slower.
> > >
> > > So, correctly pass in the order instead.
> > >
> > > Signed-off-by: Frank van der Linden <fvdl@google.com>
> > > Cc: Roman Gushchin <roman.gushchin@linux.dev>
> > > Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
> >
> > It might be suboptimal, but do we call it a "BUG" that needs "fixing". I
> > know, controversial :)
> >
> > > ---
> > > mm/hugetlb.c | 6 +++---
> > > 1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index 23ef240ba48a..6dc62d8b2a3a 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -7873,9 +7873,9 @@ void __init hugetlb_cma_reserve(int order)
> > > * huge page demotion.
> > > */
> > > res = cma_declare_contiguous_nid(0, size, 0,
> > > - PAGE_SIZE << HUGETLB_PAGE_ORDER,
> > > - 0, false, name,
> > > - &hugetlb_cma[nid], nid);
> > > + PAGE_SIZE << HUGETLB_PAGE_ORDER,
> > > + HUGETLB_PAGE_ORDER, false, name,
> > > + &hugetlb_cma[nid], nid);
> > > if (res) {
> > > pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
> > > res, nid);
> >
> > ... I'm afraid this is not completely correct.
> >
> > For example, on arm64, HUGETLB_PAGE_ORDER is essentially PMD_ORDER.
> >
> > ... but we do support smaller hugetlb sizes than that (cont-pte hugetlb
> > size is 64 KiB, not 2 MiB -- PMD -- on a 4k kernel)
>
> Right, smaller hugetlb page sizes exist. But, the value here is not
> intended to represent the minimum hugetlb page size - it's the minimum
> hugetlb page size that we can demote a CMA-allocated hugetlb page to.
> See:
>
> a01f43901cfb ("hugetlb: be sure to free demoted CMA pages to CMA")
>
> So, this just restricts demotion of the gigantic, CMA-allocated pages
> to HUGETLB_PAGE_ORDER-sized chunks.
>
> - Frank
Just to clarify what I'm saying here: the restriction of the size you
can demote a CMA-allocated gigantic page to was already there; my
patch doesn't change anything in that regard.
* Re: [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid
2024-04-04 16:25 ` [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid Frank van der Linden
` (2 preceding siblings ...)
2024-04-04 20:17 ` Andrew Morton
@ 2024-04-07 8:02 ` Muchun Song
3 siblings, 0 replies; 18+ messages in thread
From: Muchun Song @ 2024-04-07 8:02 UTC (permalink / raw)
To: Frank van der Linden
Cc: Linux-MM, Andrew Morton, linux-kernel, Roman Gushchin
> On Apr 5, 2024, at 00:25, Frank van der Linden <fvdl@google.com> wrote:
>
> The hugetlb_cma code passes 0 in the order_per_bit argument to
> cma_declare_contiguous_nid (the alignment, computed using the
> page order, is correctly passed in).
>
> This causes a bit in the cma allocation bitmap to always represent
> a 4k page, making the bitmaps potentially very large, and slower.
>
> So, correctly pass in the order instead.
>
> Signed-off-by: Frank van der Linden <fvdl@google.com>
> Cc: Roman Gushchin <roman.gushchin@linux.dev>
> Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
> ---
> mm/hugetlb.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 23ef240ba48a..6dc62d8b2a3a 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -7873,9 +7873,9 @@ void __init hugetlb_cma_reserve(int order)
> * huge page demotion.
> */
> res = cma_declare_contiguous_nid(0, size, 0,
> - PAGE_SIZE << HUGETLB_PAGE_ORDER,
> - 0, false, name,
> - &hugetlb_cma[nid], nid);
> + PAGE_SIZE << HUGETLB_PAGE_ORDER,
> + HUGETLB_PAGE_ORDER, false, name,
IIUC, we could take the optimization further and change order_per_bit to
'MAX_PAGE_ORDER + 1', since only gigantic hugetlb pages can be allocated
from the CMA pool, meaning any gigantic page is at least
2^(MAX_PAGE_ORDER + 1) base pages in size.
Thanks.
> + &hugetlb_cma[nid], nid);
> if (res) {
> pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
> res, nid);
> --
> 2.44.0.478.gd926399ef9-goog
>
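
[Editor's note: back-of-the-envelope numbers for this suggestion, assuming
x86 defaults (PAGE_SHIFT = 12, HUGETLB_PAGE_ORDER = 9, MAX_PAGE_ORDER = 10)
and the 64G example used elsewhere in the thread:

	64G >> (12 + 9)  = 32768 bits	(order_per_bit = HUGETLB_PAGE_ORDER)
	64G >> (12 + 11) =  8192 bits	(order_per_bit = MAX_PAGE_ORDER + 1)

i.e. a further 4x reduction. Note, though, that the in-code comment quoted
earlier ties order_per_bit to the smallest size demotion may return to CMA,
so the suggestion would need to be checked against that path.]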
* Re: [PATCH 2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid
2024-04-04 21:44 ` Frank van der Linden
2024-04-04 22:22 ` Frank van der Linden
@ 2024-04-08 8:15 ` David Hildenbrand
1 sibling, 0 replies; 18+ messages in thread
From: David Hildenbrand @ 2024-04-08 8:15 UTC (permalink / raw)
To: Frank van der Linden
Cc: linux-mm, muchun.song, akpm, linux-kernel, Roman Gushchin
On 04.04.24 23:44, Frank van der Linden wrote:
> On Thu, Apr 4, 2024 at 1:13 PM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 04.04.24 18:25, Frank van der Linden wrote:
>>> The hugetlb_cma code passes 0 in the order_per_bit argument to
>>> cma_declare_contiguous_nid (the alignment, computed using the
>>> page order, is correctly passed in).
>>>
>>> This causes a bit in the cma allocation bitmap to always represent
>>> a 4k page, making the bitmaps potentially very large, and slower.
>>>
>>> So, correctly pass in the order instead.
>>>
>>> Signed-off-by: Frank van der Linden <fvdl@google.com>
>>> Cc: Roman Gushchin <roman.gushchin@linux.dev>
>>> Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
>>
>> It might be suboptimal, but do we call it a "BUG" that needs "fixing". I
>> know, controversial :)
>>
>>> ---
>>> mm/hugetlb.c | 6 +++---
>>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>> index 23ef240ba48a..6dc62d8b2a3a 100644
>>> --- a/mm/hugetlb.c
>>> +++ b/mm/hugetlb.c
>>> @@ -7873,9 +7873,9 @@ void __init hugetlb_cma_reserve(int order)
>>> * huge page demotion.
>>> */
>>> res = cma_declare_contiguous_nid(0, size, 0,
>>> - PAGE_SIZE << HUGETLB_PAGE_ORDER,
>>> - 0, false, name,
>>> - &hugetlb_cma[nid], nid);
>>> + PAGE_SIZE << HUGETLB_PAGE_ORDER,
>>> + HUGETLB_PAGE_ORDER, false, name,
>>> + &hugetlb_cma[nid], nid);
>>> if (res) {
>>> pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
>>> res, nid);
>>
>> ... I'm afraid this is not completely correct.
>>
>> For example, on arm64, HUGETLB_PAGE_ORDER is essentially PMD_ORDER.
>>
>> ... but we do support smaller hugetlb sizes than that (cont-pte hugetlb
>> size is 64 KiB, not 2 MiB -- PMD -- on a 4k kernel)
>
> Right, smaller hugetlb page sizes exist. But, the value here is not
> intended to represent the minimum hugetlb page size - it's the minimum
> hugetlb page size that we can demote a CMA-allocated hugetlb page to.
> See:
>
> a01f43901cfb ("hugetlb: be sure to free demoted CMA pages to CMA")
>
> So, this just restricts demotion of the gigantic, CMA-allocated pages
> to HUGETLB_PAGE_ORDER-sized chunks.
Right, now my memory comes back. In v1 of that patch set, Mike did
support demoting to any smaller hugetlb order.
And because that smallest hugetlb order is not known when we call
cma_declare_contiguous_nid(), he used to pass "0" for the order.
In v2, he simplified that and limited it to HUGETLB_PAGE_ORDER.
He didn't update the order here, though.
Acked-by: David Hildenbrand <david@redhat.com>
Note that I don't think any of these patches are CC-stable material.
--
Cheers,
David / dhildenb