linux-mm.kvack.org archive mirror
* [RFC] mm/page_isolation: Fix an infinite loop in isolate_single_pageblock()
@ 2022-05-30 11:50 Anshuman Khandual
From: Anshuman Khandual @ 2022-05-30 11:50 UTC
  To: linux-mm; +Cc: Anshuman Khandual, Andrew Morton, Zi Yan, linux-kernel

HugeTLB allocation (32MB pages with a 4K base page size) via sysfs on an
arm64 platform gets stuck in isolate_single_pageblock() because of an
infinite loop. Because head_pfn always evaluates to the same value, so does
pfn, and the outer loop never exits. Dropping the relevant code block, which
seems redundant, makes the problem go away.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Zi Yan <ziy@nvidia.com>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity")
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
I am not sure about this fix, and did not find much time today to debug
any further. There have been many code changes around this function
recently. The problem is present on the latest mainline kernel.

- Anshuman

 mm/page_isolation.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 6021f8444b5a..b0922fee75c1 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -389,10 +389,6 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
 			struct page *head = compound_head(page);
 			unsigned long head_pfn = page_to_pfn(head);
 
-			if (head_pfn + nr_pages <= boundary_pfn) {
-				pfn = head_pfn + nr_pages;
-				continue;
-			}
 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
 			/*
 			 * hugetlb, lru compound (THP), and movable compound pages
-- 
2.20.1




* Re: [RFC] mm/page_isolation: Fix an infinite loop in isolate_single_pageblock()
@ 2022-05-30 13:53 ` Zi Yan
From: Zi Yan @ 2022-05-30 13:53 UTC
  To: Anshuman Khandual; +Cc: linux-mm, Andrew Morton, linux-kernel


On 30 May 2022, at 7:50, Anshuman Khandual wrote:

> HugeTLB allocation (32MB pages with a 4K base page size) via sysfs on an
> arm64 platform gets stuck in isolate_single_pageblock() because of an
> infinite loop. Because head_pfn always evaluates to the same value, so does
> pfn, and the outer loop never exits. Dropping the relevant code block, which
> seems redundant, makes the problem go away.

Thanks for the report.

>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity")
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
> I am not sure about this fix, and did not find much time today to debug
> any further. There have been many code changes around this function
> recently. The problem is present on the latest mainline kernel.
>
> - Anshuman
>
>  mm/page_isolation.c | 4 ----
>  1 file changed, 4 deletions(-)
>
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 6021f8444b5a..b0922fee75c1 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -389,10 +389,6 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>  			struct page *head = compound_head(page);
>  			unsigned long head_pfn = page_to_pfn(head);
>
> -			if (head_pfn + nr_pages <= boundary_pfn) {
> -				pfn = head_pfn + nr_pages;
> -				continue;
> -			}
>  #if defined CONFIG_COMPACTION || defined CONFIG_CMA
>  			/*
>  			 * hugetlb, lru compound (THP), and movable compound pages
> -- 
> 2.20.1

Can you try the patch below to see if it fixes the issue? Thanks.

diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 6021f8444b5a..d200d41ad0d3 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -385,9 +385,9 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
                 * above do the rest. If migration is not possible, just fail.
                 */
                if (PageCompound(page)) {
-                       unsigned long nr_pages = compound_nr(page);
                        struct page *head = compound_head(page);
                        unsigned long head_pfn = page_to_pfn(head);
+                       unsigned long nr_pages = compound_nr(head);

                        if (head_pfn + nr_pages <= boundary_pfn) {
                                pfn = head_pfn + nr_pages;


--
Best Regards,
Yan, Zi



* Re: [RFC] mm/page_isolation: Fix an infinite loop in isolate_single_pageblock()
@ 2022-05-31  2:22   ` Anshuman Khandual
From: Anshuman Khandual @ 2022-05-31  2:22 UTC
  To: Zi Yan; +Cc: linux-mm, Andrew Morton, linux-kernel



On 5/30/22 19:23, Zi Yan wrote:
> On 30 May 2022, at 7:50, Anshuman Khandual wrote:
> 
>> HugeTLB allocation (32MB pages with a 4K base page size) via sysfs on an
>> arm64 platform gets stuck in isolate_single_pageblock() because of an
>> infinite loop. Because head_pfn always evaluates to the same value, so does
>> pfn, and the outer loop never exits. Dropping the relevant code block, which
>> seems redundant, makes the problem go away.
> 
> Thanks for the report.
> 
>>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: linux-mm@kvack.org
>> Cc: linux-kernel@vger.kernel.org
>> Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity")
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>> I am not sure about this fix, and did not find much time today to debug
>> any further. There have been many code changes around this function
>> recently. The problem is present on the latest mainline kernel.
>>
>> - Anshuman
>>
>>  mm/page_isolation.c | 4 ----
>>  1 file changed, 4 deletions(-)
>>
>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>> index 6021f8444b5a..b0922fee75c1 100644
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -389,10 +389,6 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>>  			struct page *head = compound_head(page);
>>  			unsigned long head_pfn = page_to_pfn(head);
>>
>> -			if (head_pfn + nr_pages <= boundary_pfn) {
>> -				pfn = head_pfn + nr_pages;
>> -				continue;
>> -			}
>>  #if defined CONFIG_COMPACTION || defined CONFIG_CMA
>>  			/*
>>  			 * hugetlb, lru compound (THP), and movable compound pages
>> -- 
>> 2.20.1
> 
> Can you try the patch below to see if it fixes the issue? Thanks.
> 
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 6021f8444b5a..d200d41ad0d3 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -385,9 +385,9 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>                  * above do the rest. If migration is not possible, just fail.
>                  */
>                 if (PageCompound(page)) {
> -                       unsigned long nr_pages = compound_nr(page);
>                         struct page *head = compound_head(page);
>                         unsigned long head_pfn = page_to_pfn(head);
> +                       unsigned long nr_pages = compound_nr(head);
> 
>                         if (head_pfn + nr_pages <= boundary_pfn) {
>                                 pfn = head_pfn + nr_pages;
> 
> 

Yes, this does solve the problem. I guess nr_pages should have been derived
from the compound head itself for it to be meaningful (i.e. > 1). I assume
you will send a fix patch with an appropriate write-up that describes this
problem.

- Anshuman
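
For readers following the thread, here is a minimal userspace sketch of why
taking nr_pages from a tail page can spin forever while taking it from the
head does not. It is not the kernel code: struct model_compound,
model_compound_nr() and model_scan() are hypothetical names invented for this
illustration, the iteration cap stands in for the hang, and it assumes that
compound_nr() reported 1 for any non-head page at the time, which is what the
observation above (nr_pages only being meaningful, i.e. > 1, when derived
from the head) suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one compound page covering [head_pfn, head_pfn + nr). */
struct model_compound {
	unsigned long head_pfn;
	unsigned long nr;	/* base pages in the compound page */
};

/*
 * ASSUMPTION: mimics a compound_nr() that reports 1 for any page that is
 * not the compound head.
 */
static unsigned long model_compound_nr(const struct model_compound *c,
				       unsigned long pfn)
{
	return pfn == c->head_pfn ? c->nr : 1;
}

/*
 * Replays only the skip branch of the loop. Returns the number of
 * iterations needed to reach boundary_pfn, or -1 once max_iter is
 * exceeded (standing in for the hang).
 */
static long model_scan(const struct model_compound *c, unsigned long start_pfn,
		       unsigned long boundary_pfn, bool nr_from_head,
		       long max_iter)
{
	unsigned long pfn = start_pfn;
	long iter = 0;

	/* The model only covers a compound page ending at or before the boundary. */
	assert(start_pfn >= c->head_pfn);
	assert(c->head_pfn + c->nr <= boundary_pfn);

	while (pfn < boundary_pfn) {
		unsigned long nr_pages;

		if (++iter > max_iter)
			return -1;

		if (pfn >= c->head_pfn + c->nr) {
			pfn++;	/* past the compound page: step one base page */
			continue;
		}

		/* Buggy variant asks the current (possibly tail) page; fixed asks the head. */
		nr_pages = model_compound_nr(c, nr_from_head ? c->head_pfn : pfn);

		/* Always true here given the asserts; kept to mirror the diff. */
		if (c->head_pfn + nr_pages <= boundary_pfn) {
			pfn = c->head_pfn + nr_pages;
			continue;
		}
		break;
	}
	return iter;
}
```

With a made-up compound page of 8192 base pages at pfn 0 and a scan starting
on the tail page at pfn 512, model_scan(&c, 512, 8192, false, 1000) gives up
with -1 because pfn keeps being reset to head_pfn + 1, whereas
model_scan(&c, 512, 8192, true, 1000) finishes in a single iteration.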

