linux-mm.kvack.org archive mirror
* [PATCH] Revert "Revert "mm/compaction: fix set skip in fast_find_migrateblock""
@ 2023-04-26 15:03 Baolin Wang
  2023-04-26 15:10 ` Vlastimil Babka
  0 siblings, 1 reply; 4+ messages in thread
From: Baolin Wang @ 2023-04-26 15:03 UTC
  To: akpm; +Cc: mgorman, vbabka, baolin.wang, linux-mm, linux-kernel

This reverts commit 95e7a450b8190673675836bfef236262ceff084a.

When I tested thpscale with the v6.3 kernel, I found that compaction
efficiency had regressed significantly compared to the v6.2-rc1 kernel.
See the numbers below:
                               v6.2-rc1                  v6.3
Percentage huge-3        81.35 (   0.00%)       32.97 ( -59.47%)
Percentage huge-5        89.92 (   0.00%)       41.70 ( -53.63%)
Percentage huge-7        92.41 (   0.00%)       34.08 ( -63.12%)
Percentage huge-12       90.29 (   0.00%)       41.10 ( -54.49%)
Percentage huge-18       82.38 (   0.00%)       41.24 ( -49.95%)
Percentage huge-24       80.34 (   0.00%)       35.99 ( -55.20%)
Percentage huge-30       88.90 (   0.00%)       44.20 ( -50.28%)
Percentage huge-32       90.69 (   0.00%)       79.57 ( -12.25%)

Ops Compaction stalls                 113790.00      207099.00
Ops Compaction success                 33983.00       19488.00
Ops Compaction failures                79807.00      187611.00
Ops Compaction efficiency                 29.86           9.41

After some investigation, I found that commit 95e7a450b819
("Revert "mm/compaction: fix set skip in fast_find_migrateblock"") caused
the regression. That commit reverted commit 7efc3b726103 ("mm/compaction:
fix set skip in fast_find_migrateblock") to fix a CPU stall caused by
compaction getting stuck repeating fast_find_migrateblock().

The compaction stalling issue is now addressed by commit cfccd2e63e7e
("mm, compaction: finish pageblocks on complete migration failure"). So
we should revert the temporary fix in commit 95e7a450b819, since the
fast pfn found by fast_find_migrateblock() really does help to isolate
some migratable pages.

After reverting the commit, the regression is gone:
                               v6.2-rc1                  v6.3           v6.3_patched
Percentage huge-3        81.35 (   0.00%)       32.97 ( -59.47%)       87.78 (   7.90%)
Percentage huge-5        89.92 (   0.00%)       41.70 ( -53.63%)       89.68 (  -0.27%)
Percentage huge-7        92.41 (   0.00%)       34.08 ( -63.12%)       85.89 (  -7.05%)
Percentage huge-12       90.29 (   0.00%)       41.10 ( -54.49%)       94.10 (   4.22%)
Percentage huge-18       82.38 (   0.00%)       41.24 ( -49.95%)       85.06 (   3.25%)
Percentage huge-24       80.34 (   0.00%)       35.99 ( -55.20%)       84.38 (   5.02%)
Percentage huge-30       88.90 (   0.00%)       44.20 ( -50.28%)       95.54 (   7.48%)
Percentage huge-32       90.69 (   0.00%)       79.57 ( -12.25%)       92.30 (   1.78%)

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/compaction.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 33650541bebc..567c8d41d01e 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1860,7 +1860,6 @@ static unsigned long fast_find_migrateblock(struct compact_control *cc)
 					pfn = cc->zone->zone_start_pfn;
 				cc->fast_search_fail = 0;
 				found_block = true;
-				set_pageblock_skip(freepage);
 				break;
 			}
 		}
-- 
2.27.0
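
The "Ops Compaction efficiency" row in the description above is consistent
with successes divided by stalls, and the bracketed figures in the
"Percentage huge" rows with the relative change against the v6.2-rc1
baseline. A minimal sketch of that arithmetic follows. The function names
are hypothetical (they come from neither thpscale nor the kernel), and the
formulas are inferred from the posted numbers rather than taken from the
benchmark's source.

```c
/*
 * Hypothetical helpers for re-deriving the figures quoted in the patch
 * description. The formulas are an assumption inferred from the posted
 * numbers, not the benchmark's actual reporting code.
 */

/* Compaction efficiency in percent: successful compactions per stall. */
double compaction_efficiency(double success, double stalls)
{
	if (stalls <= 0.0)
		return 0.0;	/* avoid dividing by zero when nothing stalled */
	return 100.0 * success / stalls;
}

/* Relative change of `value` against `baseline`, in percent. */
double percent_change(double baseline, double value)
{
	if (baseline == 0.0)
		return 0.0;	/* change against a zero baseline is undefined */
	return 100.0 * (value - baseline) / baseline;
}
```

With the posted values, compaction_efficiency(33983, 113790) gives about
29.86 and compaction_efficiency(19488, 207099) about 9.41, matching the
table. Likewise, percent_change(81.35, 32.97) gives about -59.47 and
percent_change(81.35, 87.78) about 7.90 for the huge-3 row.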




* Re: [PATCH] Revert "Revert "mm/compaction: fix set skip in fast_find_migrateblock""
  2023-04-26 15:03 [PATCH] Revert "Revert "mm/compaction: fix set skip in fast_find_migrateblock"" Baolin Wang
@ 2023-04-26 15:10 ` Vlastimil Babka
  2023-04-26 15:33   ` Mel Gorman
  0 siblings, 1 reply; 4+ messages in thread
From: Vlastimil Babka @ 2023-04-26 15:10 UTC
  To: Baolin Wang, akpm; +Cc: mgorman, linux-mm, linux-kernel

On 4/26/23 17:03, Baolin Wang wrote:
> This reverts commit 95e7a450b8190673675836bfef236262ceff084a.
> 
> When I tested thpscale with the v6.3 kernel, I found that compaction
> efficiency had regressed significantly compared to the v6.2-rc1 kernel.
> See the numbers below:
>                                v6.2-rc1                  v6.3
> Percentage huge-3        81.35 (   0.00%)       32.97 ( -59.47%)
> Percentage huge-5        89.92 (   0.00%)       41.70 ( -53.63%)
> Percentage huge-7        92.41 (   0.00%)       34.08 ( -63.12%)
> Percentage huge-12       90.29 (   0.00%)       41.10 ( -54.49%)
> Percentage huge-18       82.38 (   0.00%)       41.24 ( -49.95%)
> Percentage huge-24       80.34 (   0.00%)       35.99 ( -55.20%)
> Percentage huge-30       88.90 (   0.00%)       44.20 ( -50.28%)
> Percentage huge-32       90.69 (   0.00%)       79.57 ( -12.25%)
> 
> Ops Compaction stalls                 113790.00      207099.00
> Ops Compaction success                 33983.00      19488.00
> Ops Compaction failures                79807.00      187611.00
> Ops Compaction efficiency                 29.86          9.41
> 
> After some investigation, I found that commit 95e7a450b819
> ("Revert "mm/compaction: fix set skip in fast_find_migrateblock"") caused
> the regression. That commit reverted commit 7efc3b726103 ("mm/compaction:
> fix set skip in fast_find_migrateblock") to fix a CPU stall caused by
> compaction getting stuck repeating fast_find_migrateblock().
> 
> And now the compaction stalling issue is addressed by commit cfccd2e63e7e
> ("mm, compaction: finish pageblocks on complete migration failure"). So

IIRC at that time I was pointing out some scenarios that could make the
problem appear even after that commit, and we wanted to revisit that
when Mel is back.

> we should revert the temporary fix by commit 95e7a450b819, since the
> fast pfn found by fast_find_migrateblock() really can help to isolate
> some migratable pages.

So thanks for the reminder, yet we should make sure the fix is complete
before removing the workaround.

> After reverting the commit, the regression is gone:
>                                v6.2-rc1                  v6.3           v6.3_patched
> Percentage huge-3        81.35 (   0.00%)       32.97 ( -59.47%)       87.78 (   7.90%)
> Percentage huge-5        89.92 (   0.00%)       41.70 ( -53.63%)       89.68 (  -0.27%)
> Percentage huge-7        92.41 (   0.00%)       34.08 ( -63.12%)       85.89 (  -7.05%)
> Percentage huge-12       90.29 (   0.00%)       41.10 ( -54.49%)       94.10 (   4.22%)
> Percentage huge-18       82.38 (   0.00%)       41.24 ( -49.95%)       85.06 (   3.25%)
> Percentage huge-24       80.34 (   0.00%)       35.99 ( -55.20%)       84.38 (   5.02%)
> Percentage huge-30       88.90 (   0.00%)       44.20 ( -50.28%)       95.54 (   7.48%)
> Percentage huge-32       90.69 (   0.00%)       79.57 ( -12.25%)       92.30 (   1.78%)
> 
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
>  mm/compaction.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 33650541bebc..567c8d41d01e 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1860,7 +1860,6 @@ static unsigned long fast_find_migrateblock(struct compact_control *cc)
>  					pfn = cc->zone->zone_start_pfn;
>  				cc->fast_search_fail = 0;
>  				found_block = true;
> -				set_pageblock_skip(freepage);
>  				break;
>  			}
>  		}



* Re: [PATCH] Revert "Revert "mm/compaction: fix set skip in fast_find_migrateblock""
  2023-04-26 15:10 ` Vlastimil Babka
@ 2023-04-26 15:33   ` Mel Gorman
  2023-04-27  0:57     ` Baolin Wang
  0 siblings, 1 reply; 4+ messages in thread
From: Mel Gorman @ 2023-04-26 15:33 UTC
  To: Vlastimil Babka; +Cc: Baolin Wang, akpm, linux-mm, linux-kernel

On Wed, Apr 26, 2023 at 05:10:14PM +0200, Vlastimil Babka wrote:
> On 4/26/23 17:03, Baolin Wang wrote:
> > This reverts commit 95e7a450b8190673675836bfef236262ceff084a.
> > 
> > When I tested thpscale with the v6.3 kernel, I found that compaction
> > efficiency had regressed significantly compared to the v6.2-rc1 kernel.
> > See the numbers below:
> >                                v6.2-rc1                  v6.3
> > Percentage huge-3        81.35 (   0.00%)       32.97 ( -59.47%)
> > Percentage huge-5        89.92 (   0.00%)       41.70 ( -53.63%)
> > Percentage huge-7        92.41 (   0.00%)       34.08 ( -63.12%)
> > Percentage huge-12       90.29 (   0.00%)       41.10 ( -54.49%)
> > Percentage huge-18       82.38 (   0.00%)       41.24 ( -49.95%)
> > Percentage huge-24       80.34 (   0.00%)       35.99 ( -55.20%)
> > Percentage huge-30       88.90 (   0.00%)       44.20 ( -50.28%)
> > Percentage huge-32       90.69 (   0.00%)       79.57 ( -12.25%)
> > 
> > Ops Compaction stalls                 113790.00      207099.00
> > Ops Compaction success                 33983.00      19488.00
> > Ops Compaction failures                79807.00      187611.00
> > Ops Compaction efficiency                 29.86          9.41
> > 
> > After some investigation, I found that commit 95e7a450b819
> > ("Revert "mm/compaction: fix set skip in fast_find_migrateblock"") caused
> > the regression. That commit reverted commit 7efc3b726103 ("mm/compaction:
> > fix set skip in fast_find_migrateblock") to fix a CPU stall caused by
> > compaction getting stuck repeating fast_find_migrateblock().
> > 
> > And now the compaction stalling issue is addressed by commit cfccd2e63e7e
> > ("mm, compaction: finish pageblocks on complete migration failure"). So
> 
> IIRC at that time I was pointing out some scenarios that could make the
> problem appear even after that commit, and we wanted to revisit that
> when Mel is back.
> 

Yes, I've prototyped the fix against 6.3-rc7 and the revert is at the
end but the revert on its own has the potential for causing problems. The
series needs to be rebased, retested and posted. What I last tested
should show up shortly at

https://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git/ mm-follupfastmigrate-v1r1

-- 
Mel Gorman
SUSE Labs



* Re: [PATCH] Revert "Revert "mm/compaction: fix set skip in fast_find_migrateblock""
  2023-04-26 15:33   ` Mel Gorman
@ 2023-04-27  0:57     ` Baolin Wang
  0 siblings, 0 replies; 4+ messages in thread
From: Baolin Wang @ 2023-04-27  0:57 UTC
  To: Mel Gorman, Vlastimil Babka; +Cc: akpm, linux-mm, linux-kernel



On 4/26/2023 11:33 PM, Mel Gorman wrote:
> On Wed, Apr 26, 2023 at 05:10:14PM +0200, Vlastimil Babka wrote:
>> On 4/26/23 17:03, Baolin Wang wrote:
>>> This reverts commit 95e7a450b8190673675836bfef236262ceff084a.
>>>
>>> When I tested thpscale with the v6.3 kernel, I found that compaction
>>> efficiency had regressed significantly compared to the v6.2-rc1 kernel.
>>> See the numbers below:
>>>                                v6.2-rc1                  v6.3
>>> Percentage huge-3        81.35 (   0.00%)       32.97 ( -59.47%)
>>> Percentage huge-5        89.92 (   0.00%)       41.70 ( -53.63%)
>>> Percentage huge-7        92.41 (   0.00%)       34.08 ( -63.12%)
>>> Percentage huge-12       90.29 (   0.00%)       41.10 ( -54.49%)
>>> Percentage huge-18       82.38 (   0.00%)       41.24 ( -49.95%)
>>> Percentage huge-24       80.34 (   0.00%)       35.99 ( -55.20%)
>>> Percentage huge-30       88.90 (   0.00%)       44.20 ( -50.28%)
>>> Percentage huge-32       90.69 (   0.00%)       79.57 ( -12.25%)
>>>
>>> Ops Compaction stalls                 113790.00      207099.00
>>> Ops Compaction success                 33983.00      19488.00
>>> Ops Compaction failures                79807.00      187611.00
>>> Ops Compaction efficiency                 29.86          9.41
>>>
>>> After some investigation, I found that commit 95e7a450b819
>>> ("Revert "mm/compaction: fix set skip in fast_find_migrateblock"") caused
>>> the regression. That commit reverted commit 7efc3b726103 ("mm/compaction:
>>> fix set skip in fast_find_migrateblock") to fix a CPU stall caused by
>>> compaction getting stuck repeating fast_find_migrateblock().
>>>
>>> And now the compaction stalling issue is addressed by commit cfccd2e63e7e
>>> ("mm, compaction: finish pageblocks on complete migration failure"). So
>>
>> IIRC at that time I was pointing out some scenarios that could make the
>> problem appear even after that commit, and we wanted to revisit that
>> when Mel is back.

Ah, I missed that. I will check the previous discussion.

> Yes, I've prototyped the fix against 6.3-rc7 and the revert is at the
> end but the revert on its own has the potential for causing problems. The
> series needs to be rebased, retested and posted. What I last tested
> should show up shortly at
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git/ mm-follupfastmigrate-v1r1

Thanks.

