linux-mm.kvack.org archive mirror
* Re: [PATCH v3 3/6] mm/compaction: correctly return failure with bogus compound_order in strict mode
       [not found] ` <20230901155141.249860-4-shikemeng@huaweicloud.com>
@ 2023-09-01  9:17   ` Mel Gorman
  2023-09-01  9:32     ` Kemeng Shi
  0 siblings, 1 reply; 7+ messages in thread
From: Mel Gorman @ 2023-09-01  9:17 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: akpm, baolin.wang, david, willy, linux-mm, linux-kernel

On Fri, Sep 01, 2023 at 11:51:38PM +0800, Kemeng Shi wrote:
> In strict mode, we should return 0 if there is any hole in the
> pageblock. If we successfully isolate pages at the beginning of the
> pageblock and then read a bogus compound_order outside the pageblock
> from the next page, we will abort the search loop with blockpfn >
> end_pfn. Although we limit blockpfn to end_pfn, we still treat this as
> a successful isolation in strict mode, since blockpfn is not < end_pfn,
> and return the partially isolated pages. isolate_freepages_range may
> then succeed unexpectedly with a hole in the isolated range.
> 
> Fixes: 9fcd6d2e052e ("mm, compaction: skip compound pages by order in free scanner")
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
>  mm/compaction.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index a40550a33aee..9ecbfbc695e5 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -626,11 +626,12 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>  		if (PageCompound(page)) {
>  			const unsigned int order = compound_order(page);
>  
> -			if (likely(order <= MAX_ORDER)) {
> +			if (blockpfn + (1UL << order) <= end_pfn) {
>  				blockpfn += (1UL << order) - 1;
>  				page += (1UL << order) - 1;
>  				nr_scanned += (1UL << order) - 1;
>  			}
> +
>  			goto isolate_fail;
>  		}
>  
> @@ -678,8 +679,7 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>  		spin_unlock_irqrestore(&cc->zone->lock, flags);
>  
>  	/*
> -	 * There is a tiny chance that we have read bogus compound_order(),
> -	 * so be careful to not go outside of the pageblock.
> +	 * Be careful to not go outside of the pageblock.
>  	 */
>  	if (unlikely(blockpfn > end_pfn))
>  		blockpfn = end_pfn;

Is this check still necessary after the first hunk?

-- 
Mel Gorman
SUSE Labs



* Re: [PATCH v3 3/6] mm/compaction: correctly return failure with bogus compound_order in strict mode
  2023-09-01  9:17   ` [PATCH v3 3/6] mm/compaction: correctly return failure with bogus compound_order in strict mode Mel Gorman
@ 2023-09-01  9:32     ` Kemeng Shi
  2023-09-01 10:01       ` Mel Gorman
  0 siblings, 1 reply; 7+ messages in thread
From: Kemeng Shi @ 2023-09-01  9:32 UTC (permalink / raw)
  To: Mel Gorman; +Cc: akpm, baolin.wang, david, willy, linux-mm, linux-kernel



on 9/1/2023 5:17 PM, Mel Gorman wrote:
> On Fri, Sep 01, 2023 at 11:51:38PM +0800, Kemeng Shi wrote:
>> In strict mode, we should return 0 if there is any hole in the
>> pageblock. If we successfully isolate pages at the beginning of the
>> pageblock and then read a bogus compound_order outside the pageblock
>> from the next page, we will abort the search loop with blockpfn >
>> end_pfn. Although we limit blockpfn to end_pfn, we still treat this as
>> a successful isolation in strict mode, since blockpfn is not < end_pfn,
>> and return the partially isolated pages. isolate_freepages_range may
>> then succeed unexpectedly with a hole in the isolated range.
>>
>> Fixes: 9fcd6d2e052e ("mm, compaction: skip compound pages by order in free scanner")
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
>> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> ---
>>  mm/compaction.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index a40550a33aee..9ecbfbc695e5 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -626,11 +626,12 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>>  		if (PageCompound(page)) {
>>  			const unsigned int order = compound_order(page);
>>  
>> -			if (likely(order <= MAX_ORDER)) {
>> +			if (blockpfn + (1UL << order) <= end_pfn) {
>>  				blockpfn += (1UL << order) - 1;
>>  				page += (1UL << order) - 1;
>>  				nr_scanned += (1UL << order) - 1;
>>  			}
>> +
>>  			goto isolate_fail;
>>  		}
>>  
>> @@ -678,8 +679,7 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>>  		spin_unlock_irqrestore(&cc->zone->lock, flags);
>>  
>>  	/*
>> -	 * There is a tiny chance that we have read bogus compound_order(),
>> -	 * so be careful to not go outside of the pageblock.
>> +	 * Be careful to not go outside of the pageblock.
>>  	 */
>>  	if (unlikely(blockpfn > end_pfn))
>>  		blockpfn = end_pfn;
> 
> Is this check still necessary after the first hunk?
> 
Actually, I removed this check in the first version, but Baolin thought
the check is cheap and that removing it is not worth it. More discussion
can be found in [1]. Thanks!

[1] https://lore.kernel.org/all/a8edac8d-8e22-89cf-2c8c-217a54608d27@linux.alibaba.com/
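
(For context, a minimal sketch of the strict-mode tail of
isolate_freepages_block after this patch, simplified from
mm/compaction.c with the surrounding code elided; the exact code in any
given tree may differ:)

	/*
	 * Defensive clamp: a racily read bogus compound_order() could in
	 * principle have pushed blockpfn past the pageblock, so never
	 * report pfns outside of it.
	 */
	if (unlikely(blockpfn > end_pfn))
		blockpfn = end_pfn;

	...

	/*
	 * In strict mode any gap is a failure: if the scan did not cover
	 * the whole range (blockpfn < end_pfn), report zero isolated
	 * pages so that isolate_freepages_range fails instead of
	 * returning a range with holes.
	 */
	if (strict && blockpfn < end_pfn)
		total_isolated = 0;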




* Re: [PATCH v3 3/6] mm/compaction: correctly return failure with bogus compound_order in strict mode
  2023-09-01  9:32     ` Kemeng Shi
@ 2023-09-01 10:01       ` Mel Gorman
  0 siblings, 0 replies; 7+ messages in thread
From: Mel Gorman @ 2023-09-01 10:01 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: akpm, baolin.wang, david, willy, linux-mm, linux-kernel

On Fri, Sep 01, 2023 at 05:32:49PM +0800, Kemeng Shi wrote:
> 
> 
> on 9/1/2023 5:17 PM, Mel Gorman wrote:
> > On Fri, Sep 01, 2023 at 11:51:38PM +0800, Kemeng Shi wrote:
> >> In strict mode, we should return 0 if there is any hole in the
> >> pageblock. If we successfully isolate pages at the beginning of the
> >> pageblock and then read a bogus compound_order outside the pageblock
> >> from the next page, we will abort the search loop with blockpfn >
> >> end_pfn. Although we limit blockpfn to end_pfn, we still treat this as
> >> a successful isolation in strict mode, since blockpfn is not < end_pfn,
> >> and return the partially isolated pages. isolate_freepages_range may
> >> then succeed unexpectedly with a hole in the isolated range.
> >>
> >> Fixes: 9fcd6d2e052e ("mm, compaction: skip compound pages by order in free scanner")
> >> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> >> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> >> ---
> >>  mm/compaction.c | 6 +++---
> >>  1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/mm/compaction.c b/mm/compaction.c
> >> index a40550a33aee..9ecbfbc695e5 100644
> >> --- a/mm/compaction.c
> >> +++ b/mm/compaction.c
> >> @@ -626,11 +626,12 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
> >>  		if (PageCompound(page)) {
> >>  			const unsigned int order = compound_order(page);
> >>  
> >> -			if (likely(order <= MAX_ORDER)) {
> >> +			if (blockpfn + (1UL << order) <= end_pfn) {
> >>  				blockpfn += (1UL << order) - 1;
> >>  				page += (1UL << order) - 1;
> >>  				nr_scanned += (1UL << order) - 1;
> >>  			}
> >> +
> >>  			goto isolate_fail;
> >>  		}
> >>  
> >> @@ -678,8 +679,7 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
> >>  		spin_unlock_irqrestore(&cc->zone->lock, flags);
> >>  
> >>  	/*
> >> -	 * There is a tiny chance that we have read bogus compound_order(),
> >> -	 * so be careful to not go outside of the pageblock.
> >> +	 * Be careful to not go outside of the pageblock.
> >>  	 */
> >>  	if (unlikely(blockpfn > end_pfn))
> >>  		blockpfn = end_pfn;
> > 
> > Is this check still necessary after the first hunk?
> > 
> Actually, I removed this check in the first version, but Baolin thought
> the check is cheap and that removing it is not worth it. More discussion
> can be found in [1]. Thanks!
> 

Ok, fair enough. While I think the check is redundant right now, it's a
reasonable defensive check and this is not a fast path, so

Acked-by: Mel Gorman <mgorman@techsingularity.net>

-- 
Mel Gorman
SUSE Labs



* [PATCH v3 0/6] Fixes and cleanups to compaction
@ 2023-09-01 15:51 Kemeng Shi
       [not found] ` <20230901155141.249860-4-shikemeng@huaweicloud.com>
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Kemeng Shi @ 2023-09-01 15:51 UTC (permalink / raw)
  To: akpm, baolin.wang, mgorman, david, willy, linux-mm, linux-kernel

Hi all, this is another series of fixes and cleanups to compaction.
Patches 1-2 fix and clean up the freepage list operations.
Patches 3-4 fix and clean up the isolation of freepages.
Patch 5 improves the comment of is_via_compact_memory.
Patch 6 factors out the code that checks whether compaction should run
for the target allocation order.
More details can be found in the respective patches. Thanks!

v2->v3:
-Collect RVB and ACK from Baolin and Mel
-Avoid blockpfn outside pageblock in original likely(order <= MAX_ORDER)
block in patch 3
-Move comment into __reset_isolation_suitable in patch 4
-Improve indentation in patch 6

v1->v2:
-Collect RVB from Baolin.
-Keep pfn inside of pageblock in patch 3.
-Only improve comment of is_via_compact_memory in patch 6.
-Squash patch 8 and patch 9 into patch 7 and use ALLOC_WMARK_MIN
instead of magic number 0.

Kemeng Shi (6):
  mm/compaction: use correct list in move_freelist_{head}/{tail}
  mm/compaction: call list_is_{first}/{last} more intuitively in
    move_freelist_{head}/{tail}
  mm/compaction: correctly return failure with bogus compound_order in
    strict mode
  mm/compaction: remove repeat compact_blockskip_flush check in
    reset_isolation_suitable
  mm/compaction: improve comment of is_via_compact_memory
  mm/compaction: factor out code to test if we should run compaction for
    target order

 mm/compaction.c | 91 ++++++++++++++++++++++++++++---------------------
 1 file changed, 52 insertions(+), 39 deletions(-)

-- 
2.30.0




* [PATCH v3 2/6] mm/compaction: call list_is_{first}/{last} more intuitively in move_freelist_{head}/{tail}
  2023-09-01 15:51 [PATCH v3 0/6] Fixes and cleanups to compaction Kemeng Shi
       [not found] ` <20230901155141.249860-4-shikemeng@huaweicloud.com>
@ 2023-09-01 15:51 ` Kemeng Shi
  2023-09-01 15:51 ` [PATCH v3 4/6] mm/compaction: remove repeat compact_blockskip_flush check in reset_isolation_suitable Kemeng Shi
  2023-09-01 15:51 ` [PATCH v3 6/6] mm/compaction: factor out code to test if we should run compaction for target order Kemeng Shi
  3 siblings, 0 replies; 7+ messages in thread
From: Kemeng Shi @ 2023-09-01 15:51 UTC (permalink / raw)
  To: akpm, baolin.wang, mgorman, david, willy, linux-mm, linux-kernel

We use move_freelist_head after list_for_each_entry_reverse to skip
recently scanned pages, and there is no need to do an actual move if
all freepages were searched in list_for_each_entry_reverse, i.e. when
freepage points to the first page in the freelist. It is more intuitive
to call list_is_first with the list entry as the first argument and the
list head as the second argument to check whether the entry is the
first one in the list, instead of calling list_is_last with the entry
and head passed in reverse order.

Similarly, calling list_is_last in move_freelist_tail is more
intuitive.
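
(For reference, the two helpers as defined in include/linux/list.h
around the time of this series; paraphrased, so check your tree:)

	static inline int list_is_first(const struct list_head *list,
					const struct list_head *head)
	{
		/* First entry: its prev pointer wraps back to the head. */
		return list->prev == head;
	}

	static inline int list_is_last(const struct list_head *list,
				       const struct list_head *head)
	{
		/* Last entry: its next pointer wraps forward to the head. */
		return list->next == head;
	}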

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
---
 mm/compaction.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index e3ee1bc1c0ad..a40550a33aee 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1395,7 +1395,7 @@ move_freelist_head(struct list_head *freelist, struct page *freepage)
 {
 	LIST_HEAD(sublist);
 
-	if (!list_is_last(freelist, &freepage->buddy_list)) {
+	if (!list_is_first(&freepage->buddy_list, freelist)) {
 		list_cut_before(&sublist, freelist, &freepage->buddy_list);
 		list_splice_tail(&sublist, freelist);
 	}
@@ -1412,7 +1412,7 @@ move_freelist_tail(struct list_head *freelist, struct page *freepage)
 {
 	LIST_HEAD(sublist);
 
-	if (!list_is_first(freelist, &freepage->buddy_list)) {
+	if (!list_is_last(&freepage->buddy_list, freelist)) {
 		list_cut_position(&sublist, freelist, &freepage->buddy_list);
 		list_splice_tail(&sublist, freelist);
 	}
-- 
2.30.0




* [PATCH v3 4/6] mm/compaction: remove repeat compact_blockskip_flush check in reset_isolation_suitable
  2023-09-01 15:51 [PATCH v3 0/6] Fixes and cleanups to compaction Kemeng Shi
       [not found] ` <20230901155141.249860-4-shikemeng@huaweicloud.com>
  2023-09-01 15:51 ` [PATCH v3 2/6] mm/compaction: call list_is_{first}/{last} more intuitively in move_freelist_{head}/{tail} Kemeng Shi
@ 2023-09-01 15:51 ` Kemeng Shi
  2023-09-01 15:51 ` [PATCH v3 6/6] mm/compaction: factor out code to test if we should run compaction for target order Kemeng Shi
  3 siblings, 0 replies; 7+ messages in thread
From: Kemeng Shi @ 2023-09-01 15:51 UTC (permalink / raw)
  To: akpm, baolin.wang, mgorman, david, willy, linux-mm, linux-kernel

We already have the compact_blockskip_flush check in
__reset_isolation_suitable, so remove the repeated check before calling
__reset_isolation_suitable in reset_isolation_suitable.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
---
 mm/compaction.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 9ecbfbc695e5..c377d78e0f15 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -382,6 +382,7 @@ static void __reset_isolation_suitable(struct zone *zone)
 	bool source_set = false;
 	bool free_set = false;
 
+	/* Only flush if a full compaction finished recently */
 	if (!zone->compact_blockskip_flush)
 		return;
 
@@ -434,9 +435,7 @@ void reset_isolation_suitable(pg_data_t *pgdat)
 		if (!populated_zone(zone))
 			continue;
 
-		/* Only flush if a full compaction finished recently */
-		if (zone->compact_blockskip_flush)
-			__reset_isolation_suitable(zone);
+		__reset_isolation_suitable(zone);
 	}
 }
 
-- 
2.30.0




* [PATCH v3 6/6] mm/compaction: factor out code to test if we should run compaction for target order
  2023-09-01 15:51 [PATCH v3 0/6] Fixes and cleanups to compaction Kemeng Shi
                   ` (2 preceding siblings ...)
  2023-09-01 15:51 ` [PATCH v3 4/6] mm/compaction: remove repeat compact_blockskip_flush check in reset_isolation_suitable Kemeng Shi
@ 2023-09-01 15:51 ` Kemeng Shi
  3 siblings, 0 replies; 7+ messages in thread
From: Kemeng Shi @ 2023-09-01 15:51 UTC (permalink / raw)
  To: akpm, baolin.wang, mgorman, david, willy, linux-mm, linux-kernel

We always do the zone_watermark_ok check and the compaction_suitable
check together to test if compaction for the target order should be
run. Factor this code out into a helper to remove the repetition.
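
(For context, the watermark plumbing the new helper relies on,
paraphrased from include/linux/mmzone.h and mm/internal.h as of this
series; exact definitions may differ in other trees:)

	/* include/linux/mmzone.h: per-zone watermark accessors */
	#define wmark_pages(z, i)	(z->_watermark[i] + z->watermark_boost)
	#define min_wmark_pages(z)	wmark_pages(z, WMARK_MIN)

	/*
	 * mm/internal.h: the ALLOC_WMARK_* flags index into the watermark
	 * array, so passing ALLOC_WMARK_MIN makes
	 * wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK) reduce to
	 * min_wmark_pages(zone), matching the open-coded checks this
	 * patch removes from kcompactd.
	 */
	#define ALLOC_WMARK_MIN		WMARK_MIN
	#define ALLOC_WMARK_MASK	(ALLOC_NO_WATERMARKS - 1)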

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/compaction.c | 66 +++++++++++++++++++++++++++++--------------------
 1 file changed, 39 insertions(+), 27 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index ff3426a0d9c5..1940e8693638 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2378,6 +2378,30 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
 	return false;
 }
 
+/*
+ * Should we run compaction for the target allocation order?
+ * Return COMPACT_SUCCESS if allocation for the target order can already
+ * be satisfied.
+ * Return COMPACT_SKIPPED if compaction for the target order is likely to fail.
+ * Return COMPACT_CONTINUE if compaction for the target order should be run.
+ */
+static enum compact_result
+compaction_suit_allocation_order(struct zone *zone, unsigned int order,
+				 int highest_zoneidx, unsigned int alloc_flags)
+{
+	unsigned long watermark;
+
+	watermark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
+	if (zone_watermark_ok(zone, order, watermark, highest_zoneidx,
+			      alloc_flags))
+		return COMPACT_SUCCESS;
+
+	if (!compaction_suitable(zone, order, highest_zoneidx))
+		return COMPACT_SKIPPED;
+
+	return COMPACT_CONTINUE;
+}
+
 static enum compact_result
 compact_zone(struct compact_control *cc, struct capture_control *capc)
 {
@@ -2403,19 +2427,11 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
 	cc->migratetype = gfp_migratetype(cc->gfp_mask);
 
 	if (!is_via_compact_memory(cc->order)) {
-		unsigned long watermark;
-
-		/* Allocation can already succeed, nothing to do */
-		watermark = wmark_pages(cc->zone,
-					cc->alloc_flags & ALLOC_WMARK_MASK);
-		if (zone_watermark_ok(cc->zone, cc->order, watermark,
-				      cc->highest_zoneidx, cc->alloc_flags))
-			return COMPACT_SUCCESS;
-
-		/* Compaction is likely to fail */
-		if (!compaction_suitable(cc->zone, cc->order,
-					 cc->highest_zoneidx))
-			return COMPACT_SKIPPED;
+		ret = compaction_suit_allocation_order(cc->zone, cc->order,
+						       cc->highest_zoneidx,
+						       cc->alloc_flags);
+		if (ret != COMPACT_CONTINUE)
+			return ret;
 	}
 
 	/*
@@ -2914,6 +2930,7 @@ static bool kcompactd_node_suitable(pg_data_t *pgdat)
 	int zoneid;
 	struct zone *zone;
 	enum zone_type highest_zoneidx = pgdat->kcompactd_highest_zoneidx;
+	enum compact_result ret;
 
 	for (zoneid = 0; zoneid <= highest_zoneidx; zoneid++) {
 		zone = &pgdat->node_zones[zoneid];
@@ -2921,14 +2938,10 @@ static bool kcompactd_node_suitable(pg_data_t *pgdat)
 		if (!populated_zone(zone))
 			continue;
 
-		/* Allocation can already succeed, check other zones */
-		if (zone_watermark_ok(zone, pgdat->kcompactd_max_order,
-				      min_wmark_pages(zone),
-				      highest_zoneidx, 0))
-			continue;
-
-		if (compaction_suitable(zone, pgdat->kcompactd_max_order,
-					highest_zoneidx))
+		ret = compaction_suit_allocation_order(zone,
+				pgdat->kcompactd_max_order,
+				highest_zoneidx, ALLOC_WMARK_MIN);
+		if (ret == COMPACT_CONTINUE)
 			return true;
 	}
 
@@ -2951,6 +2964,8 @@ static void kcompactd_do_work(pg_data_t *pgdat)
 		.ignore_skip_hint = false,
 		.gfp_mask = GFP_KERNEL,
 	};
+	enum compact_result ret;
+
 	trace_mm_compaction_kcompactd_wake(pgdat->node_id, cc.order,
 							cc.highest_zoneidx);
 	count_compact_event(KCOMPACTD_WAKE);
@@ -2965,12 +2980,9 @@ static void kcompactd_do_work(pg_data_t *pgdat)
 		if (compaction_deferred(zone, cc.order))
 			continue;
 
-		/* Allocation can already succeed, nothing to do */
-		if (zone_watermark_ok(zone, cc.order,
-				      min_wmark_pages(zone), zoneid, 0))
-			continue;
-
-		if (!compaction_suitable(zone, cc.order, zoneid))
+		ret = compaction_suit_allocation_order(zone,
+				cc.order, zoneid, ALLOC_WMARK_MIN);
+		if (ret != COMPACT_CONTINUE)
 			continue;
 
 		if (kthread_should_stop())
-- 
2.30.0



