* Re: [PATCH v3 3/6] mm/compaction: correctly return failure with bogus compound_order in strict mode
[not found] ` <20230901155141.249860-4-shikemeng@huaweicloud.com>
@ 2023-09-01 9:17 ` Mel Gorman
2023-09-01 9:32 ` Kemeng Shi
0 siblings, 1 reply; 7+ messages in thread
From: Mel Gorman @ 2023-09-01 9:17 UTC
To: Kemeng Shi; +Cc: akpm, baolin.wang, david, willy, linux-mm, linux-kernel
On Fri, Sep 01, 2023 at 11:51:38PM +0800, Kemeng Shi wrote:
> In strict mode, we should return 0 if there is any hole in the
> pageblock. If we successfully isolate pages at the beginning of the
> pageblock and then read a bogus compound_order that reaches outside
> the pageblock on the next page, we abort the scan loop with
> blockpfn > end_pfn. Although blockpfn is then limited to end_pfn,
> strict mode still treats this as a successful isolation because
> blockpfn is not < end_pfn, and the partially isolated pages are
> returned. isolate_freepages_range may then succeed unexpectedly on a
> range containing a hole.
>
> Fixes: 9fcd6d2e052e ("mm, compaction: skip compound pages by order in free scanner")
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
> mm/compaction.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index a40550a33aee..9ecbfbc695e5 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -626,11 +626,12 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
> if (PageCompound(page)) {
> const unsigned int order = compound_order(page);
>
> - if (likely(order <= MAX_ORDER)) {
> + if (blockpfn + (1UL << order) <= end_pfn) {
> blockpfn += (1UL << order) - 1;
> page += (1UL << order) - 1;
> nr_scanned += (1UL << order) - 1;
> }
> +
> goto isolate_fail;
> }
>
> @@ -678,8 +679,7 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
> spin_unlock_irqrestore(&cc->zone->lock, flags);
>
> /*
> - * There is a tiny chance that we have read bogus compound_order(),
> - * so be careful to not go outside of the pageblock.
> + * Be careful to not go outside of the pageblock.
> */
> if (unlikely(blockpfn > end_pfn))
> blockpfn = end_pfn;
Is this check still necessary after the first hunk?
--
Mel Gorman
SUSE Labs
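
For illustration, here is a minimal userspace sketch of the failure mode
described in the patch above. It is not the kernel code: the block size,
the order_of[] array and the scan() helper are stand-ins, and the old
order <= MAX_ORDER test is treated as always passing, which it does for
any plausible bogus value.

#include <stdio.h>
#include <stdbool.h>

#define BLOCK_PAGES 512UL

/* order_of[pfn] != 0 marks a page that claims to be compound */
static unsigned int order_of[BLOCK_PAGES];

/* strict-mode scan: abort on the first page that cannot be isolated */
static unsigned long scan(unsigned long end, bool bounded_skip)
{
	unsigned long pfn, isolated = 0;

	for (pfn = 0; pfn < end; pfn++) {
		unsigned int order = order_of[pfn];

		if (order) {
			/*
			 * Old: skip whenever the order looked plausible.
			 * New: skip only if the compound page fits in the
			 * block, mirroring the first hunk of the patch.
			 */
			if (!bounded_skip || pfn + (1UL << order) <= end)
				pfn += (1UL << order) - 1;
			break;		/* isolate_fail + strict -> break */
		}
		isolated++;		/* a plain free page is "isolated" */
	}
	if (pfn > end)			/* the clamp the patch keeps */
		pfn = end;
	/* strict verdict: blockpfn < end_pfn means total failure */
	return pfn < end ? 0 : isolated;
}

int main(void)
{
	order_of[10] = 10;	/* bogus racy order: 1 << 10 pages */

	printf("old check: isolated=%lu (hole reported as success)\n",
	       scan(BLOCK_PAGES, false));
	printf("new check: isolated=%lu (failure reported)\n",
	       scan(BLOCK_PAGES, true));
	return 0;
}

With the old condition the scan jumps to pfn 1033, is clamped back to
512 and reports 10 isolated pages for a block containing a hole; with
the bounded skip it stops at pfn 10 and correctly returns 0.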
* Re: [PATCH v3 3/6] mm/compaction: correctly return failure with bogus compound_order in strict mode
2023-09-01 9:17 ` [PATCH v3 3/6] mm/compaction: correctly return failure with bogus compound_order in strict mode Mel Gorman
@ 2023-09-01 9:32 ` Kemeng Shi
2023-09-01 10:01 ` Mel Gorman
0 siblings, 1 reply; 7+ messages in thread
From: Kemeng Shi @ 2023-09-01 9:32 UTC
To: Mel Gorman; +Cc: akpm, baolin.wang, david, willy, linux-mm, linux-kernel
On 9/1/2023 5:17 PM, Mel Gorman wrote:
> On Fri, Sep 01, 2023 at 11:51:38PM +0800, Kemeng Shi wrote:
>> In strict mode, we should return 0 if there is any hole in the
>> pageblock. If we successfully isolate pages at the beginning of the
>> pageblock and then read a bogus compound_order that reaches outside
>> the pageblock on the next page, we abort the scan loop with
>> blockpfn > end_pfn. Although blockpfn is then limited to end_pfn,
>> strict mode still treats this as a successful isolation because
>> blockpfn is not < end_pfn, and the partially isolated pages are
>> returned. isolate_freepages_range may then succeed unexpectedly on a
>> range containing a hole.
>>
>> Fixes: 9fcd6d2e052e ("mm, compaction: skip compound pages by order in free scanner")
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
>> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> ---
>> mm/compaction.c | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index a40550a33aee..9ecbfbc695e5 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -626,11 +626,12 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>> if (PageCompound(page)) {
>> const unsigned int order = compound_order(page);
>>
>> - if (likely(order <= MAX_ORDER)) {
>> + if (blockpfn + (1UL << order) <= end_pfn) {
>> blockpfn += (1UL << order) - 1;
>> page += (1UL << order) - 1;
>> nr_scanned += (1UL << order) - 1;
>> }
>> +
>> goto isolate_fail;
>> }
>>
>> @@ -678,8 +679,7 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>> spin_unlock_irqrestore(&cc->zone->lock, flags);
>>
>> /*
>> - * There is a tiny chance that we have read bogus compound_order(),
>> - * so be careful to not go outside of the pageblock.
>> + * Be careful to not go outside of the pageblock.
>> */
>> if (unlikely(blockpfn > end_pfn))
>> blockpfn = end_pfn;
>
> Is this check still necessary after the first hunk?
>
Actually, I removed this check in the first version, but Baolin thought
the check is cheap to keep and that removing it is not worth it. More
discussion can be found in [1]. Thanks!
[1] https://lore.kernel.org/all/a8edac8d-8e22-89cf-2c8c-217a54608d27@linux.alibaba.com/
* Re: [PATCH v3 3/6] mm/compaction: correctly return failure with bogus compound_order in strict mode
2023-09-01 9:32 ` Kemeng Shi
@ 2023-09-01 10:01 ` Mel Gorman
0 siblings, 0 replies; 7+ messages in thread
From: Mel Gorman @ 2023-09-01 10:01 UTC
To: Kemeng Shi; +Cc: akpm, baolin.wang, david, willy, linux-mm, linux-kernel
On Fri, Sep 01, 2023 at 05:32:49PM +0800, Kemeng Shi wrote:
>
>
> On 9/1/2023 5:17 PM, Mel Gorman wrote:
> > On Fri, Sep 01, 2023 at 11:51:38PM +0800, Kemeng Shi wrote:
> >> In strict mode, we should return 0 if there is any hole in the
> >> pageblock. If we successfully isolate pages at the beginning of the
> >> pageblock and then read a bogus compound_order that reaches outside
> >> the pageblock on the next page, we abort the scan loop with
> >> blockpfn > end_pfn. Although blockpfn is then limited to end_pfn,
> >> strict mode still treats this as a successful isolation because
> >> blockpfn is not < end_pfn, and the partially isolated pages are
> >> returned. isolate_freepages_range may then succeed unexpectedly on a
> >> range containing a hole.
> >>
> >> Fixes: 9fcd6d2e052e ("mm, compaction: skip compound pages by order in free scanner")
> >> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> >> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> >> ---
> >> mm/compaction.c | 6 +++---
> >> 1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/mm/compaction.c b/mm/compaction.c
> >> index a40550a33aee..9ecbfbc695e5 100644
> >> --- a/mm/compaction.c
> >> +++ b/mm/compaction.c
> >> @@ -626,11 +626,12 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
> >> if (PageCompound(page)) {
> >> const unsigned int order = compound_order(page);
> >>
> >> - if (likely(order <= MAX_ORDER)) {
> >> + if (blockpfn + (1UL << order) <= end_pfn) {
> >> blockpfn += (1UL << order) - 1;
> >> page += (1UL << order) - 1;
> >> nr_scanned += (1UL << order) - 1;
> >> }
> >> +
> >> goto isolate_fail;
> >> }
> >>
> >> @@ -678,8 +679,7 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
> >> spin_unlock_irqrestore(&cc->zone->lock, flags);
> >>
> >> /*
> >> - * There is a tiny chance that we have read bogus compound_order(),
> >> - * so be careful to not go outside of the pageblock.
> >> + * Be careful to not go outside of the pageblock.
> >> */
> >> if (unlikely(blockpfn > end_pfn))
> >> blockpfn = end_pfn;
> >
> > Is this check still necessary after the first hunk?
> >
> Actually, I removed this check in the first version, but Baolin thought
> the check is cheap to keep and that removing it is not worth it. More
> discussion can be found in [1]. Thanks!
>
Ok, fair enough. While I think the check is redundant right now, it's a
reasonable defensive check and this is not a fast path, so
Acked-by: Mel Gorman <mgorman@techsingularity.net>
--
Mel Gorman
SUSE Labs
* [PATCH v3 0/6] Fixes and cleanups to compaction
@ 2023-09-01 15:51 Kemeng Shi
[not found] ` <20230901155141.249860-4-shikemeng@huaweicloud.com>
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Kemeng Shi @ 2023-09-01 15:51 UTC
To: akpm, baolin.wang, mgorman, david, willy, linux-mm, linux-kernel
Hi all, this is another series of fixes and cleanups to compaction.
Patches 1-2 fix and clean up freepage list operations.
Patches 3-4 fix and clean up the isolation of freepages.
Patch 6 factors out the code that checks whether compaction is needed
for a target allocation order.
More details can be found in the respective patches. Thanks!
v2->v3:
-Collect RVB and ACK from Baolin and Mel
-Avoid blockpfn outside pageblock in original likely(order <= MAX_ORDER)
block in patch 3
-Move comment into __reset_isolation_suitable in patch 4
-Improve indentation in patch 6
v1->v2:
-Collect RVB from Baolin.
-Keep pfn inside of pageblock in patch 3.
-Only improve comment of is_via_compact_memory in patch 6.
-Squash patch 8 and patch 9 into patch 7 and use ALLOC_WMARK_MIN
instead of magic number 0.
Kemeng Shi (6):
mm/compaction: use correct list in move_freelist_{head}/{tail}
mm/compaction: call list_is_{first}/{last} more intuitively in
move_freelist_{head}/{tail}
mm/compaction: correctly return failure with bogus compound_order in
strict mode
mm/compaction: remove repeat compact_blockskip_flush check in
reset_isolation_suitable
mm/compaction: improve comment of is_via_compact_memory
mm/compaction: factor out code to test if we should run compaction for
target order
mm/compaction.c | 91 ++++++++++++++++++++++++++++---------------------
1 file changed, 52 insertions(+), 39 deletions(-)
--
2.30.0
* [PATCH v3 2/6] mm/compaction: call list_is_{first}/{last} more intuitively in move_freelist_{head}/{tail}
2023-09-01 15:51 [PATCH v3 0/6] Fixes and cleanups to compaction Kemeng Shi
[not found] ` <20230901155141.249860-4-shikemeng@huaweicloud.com>
@ 2023-09-01 15:51 ` Kemeng Shi
2023-09-01 15:51 ` [PATCH v3 4/6] mm/compaction: remove repeat compact_blockskip_flush check in reset_isolation_suitable Kemeng Shi
2023-09-01 15:51 ` [PATCH v3 6/6] mm/compaction: factor out code to test if we should run compaction for target order Kemeng Shi
3 siblings, 0 replies; 7+ messages in thread
From: Kemeng Shi @ 2023-09-01 15:51 UTC
To: akpm, baolin.wang, mgorman, david, willy, linux-mm, linux-kernel
We use move_freelist_head after list_for_each_entry_reverse to skip
recent pages, and there is no need to do an actual move if all freepages
were searched in list_for_each_entry_reverse, i.e. the freepage points
to the first page in the freelist. It is more intuitive to call
list_is_first with the list entry as the first argument and the list
head as the second argument to check whether the entry is the first one,
instead of calling list_is_last with the entry and head passed in
reverse order. Similarly, calling list_is_last in move_freelist_tail is
more intuitive.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/compaction.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index e3ee1bc1c0ad..a40550a33aee 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1395,7 +1395,7 @@ move_freelist_head(struct list_head *freelist, struct page *freepage)
{
LIST_HEAD(sublist);
- if (!list_is_last(freelist, &freepage->buddy_list)) {
+ if (!list_is_first(&freepage->buddy_list, freelist)) {
list_cut_before(&sublist, freelist, &freepage->buddy_list);
list_splice_tail(&sublist, freelist);
}
@@ -1412,7 +1412,7 @@ move_freelist_tail(struct list_head *freelist, struct page *freepage)
{
LIST_HEAD(sublist);
- if (!list_is_first(freelist, &freepage->buddy_list)) {
+ if (!list_is_last(&freepage->buddy_list, freelist)) {
list_cut_position(&sublist, freelist, &freepage->buddy_list);
list_splice_tail(&sublist, freelist);
}
--
2.30.0
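
A small userspace sketch of the argument order being fixed here; the
mini list_head and the two helpers mirror their definitions in
include/linux/list.h, while main() is purely illustrative.

#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

/* an entry is first iff its prev pointer is the list head */
static int list_is_first(const struct list_head *list,
			 const struct list_head *head)
{
	return list->prev == head;
}

/* an entry is last iff its next pointer is the list head */
static int list_is_last(const struct list_head *list,
			const struct list_head *head)
{
	return list->next == head;
}

int main(void)
{
	struct list_head head, a, b;

	/* build head <-> a <-> b <-> head */
	head.next = &a; a.prev = &head;
	a.next = &b;    b.prev = &a;
	b.next = &head; head.prev = &b;

	printf("a first? %d, b last? %d\n",
	       list_is_first(&a, &head), list_is_last(&b, &head));
	return 0;
}

The old calls passed the head as the entry, e.g.
list_is_last(freelist, &freepage->buddy_list); in a circular list that
happens to be equivalent (the head's next is the entry iff the entry is
first), but it reads backwards, which is what this patch untangles.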
* [PATCH v3 4/6] mm/compaction: remove repeat compact_blockskip_flush check in reset_isolation_suitable
2023-09-01 15:51 [PATCH v3 0/6] Fixes and cleanups to compaction Kemeng Shi
[not found] ` <20230901155141.249860-4-shikemeng@huaweicloud.com>
2023-09-01 15:51 ` [PATCH v3 2/6] mm/compaction: call list_is_{first}/{last} more intuitively in move_freelist_{head}/{tail} Kemeng Shi
@ 2023-09-01 15:51 ` Kemeng Shi
2023-09-01 15:51 ` [PATCH v3 6/6] mm/compaction: factor out code to test if we should run compaction for target order Kemeng Shi
3 siblings, 0 replies; 7+ messages in thread
From: Kemeng Shi @ 2023-09-01 15:51 UTC
To: akpm, baolin.wang, mgorman, david, willy, linux-mm, linux-kernel
The compact_blockskip_flush check already exists in
__reset_isolation_suitable, so remove the repeated check before the
call to __reset_isolation_suitable in reset_isolation_suitable.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/compaction.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 9ecbfbc695e5..c377d78e0f15 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -382,6 +382,7 @@ static void __reset_isolation_suitable(struct zone *zone)
bool source_set = false;
bool free_set = false;
+ /* Only flush if a full compaction finished recently */
if (!zone->compact_blockskip_flush)
return;
@@ -434,9 +435,7 @@ void reset_isolation_suitable(pg_data_t *pgdat)
if (!populated_zone(zone))
continue;
- /* Only flush if a full compaction finished recently */
- if (zone->compact_blockskip_flush)
- __reset_isolation_suitable(zone);
+ __reset_isolation_suitable(zone);
}
}
--
2.30.0
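
The pattern here is hoisting a caller-side guard into the callee. A
minimal sketch of the resulting shape; the structure and signatures are
simplified stand-ins for the real pgdat/zone iteration.

#include <stdbool.h>
#include <stdio.h>

struct zone { bool compact_blockskip_flush; };

static void __reset_isolation_suitable(struct zone *z)
{
	/* the guard now lives here, once, as in the patch */
	if (!z->compact_blockskip_flush)
		return;
	z->compact_blockskip_flush = false;
	puts("reset");
}

static void reset_isolation_suitable(struct zone *zones, int n)
{
	for (int i = 0; i < n; i++)
		__reset_isolation_suitable(&zones[i]); /* no duplicate check */
}

int main(void)
{
	struct zone zones[2] = { { true }, { false } };

	reset_isolation_suitable(zones, 2);	/* prints "reset" once */
	return 0;
}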
* [PATCH v3 6/6] mm/compaction: factor out code to test if we should run compaction for target order
2023-09-01 15:51 [PATCH v3 0/6] Fixes and cleanups to compaction Kemeng Shi
` (2 preceding siblings ...)
2023-09-01 15:51 ` [PATCH v3 4/6] mm/compaction: remove repeat compact_blockskip_flush check in reset_isolation_suitable Kemeng Shi
@ 2023-09-01 15:51 ` Kemeng Shi
3 siblings, 0 replies; 7+ messages in thread
From: Kemeng Shi @ 2023-09-01 15:51 UTC
To: akpm, baolin.wang, mgorman, david, willy, linux-mm, linux-kernel
We always do the zone_watermark_ok check and the compaction_suitable
check together to test whether compaction should be run for the target
order. Factor this code out to remove the repetition.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
mm/compaction.c | 66 +++++++++++++++++++++++++++++--------------------
1 file changed, 39 insertions(+), 27 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index ff3426a0d9c5..1940e8693638 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2378,6 +2378,30 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
return false;
}
+/*
> + * Should we run compaction for the target allocation order?
> + * Return COMPACT_SUCCESS if allocation for the target order can already be
> + * satisfied
> + * Return COMPACT_SKIPPED if compaction for the target order is likely to fail
> + * Return COMPACT_CONTINUE if compaction for the target order should be run
+ */
+static enum compact_result
+compaction_suit_allocation_order(struct zone *zone, unsigned int order,
+ int highest_zoneidx, unsigned int alloc_flags)
+{
+ unsigned long watermark;
+
+ watermark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
+ if (zone_watermark_ok(zone, order, watermark, highest_zoneidx,
+ alloc_flags))
+ return COMPACT_SUCCESS;
+
+ if (!compaction_suitable(zone, order, highest_zoneidx))
+ return COMPACT_SKIPPED;
+
+ return COMPACT_CONTINUE;
+}
+
static enum compact_result
compact_zone(struct compact_control *cc, struct capture_control *capc)
{
@@ -2403,19 +2427,11 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
cc->migratetype = gfp_migratetype(cc->gfp_mask);
if (!is_via_compact_memory(cc->order)) {
- unsigned long watermark;
-
- /* Allocation can already succeed, nothing to do */
- watermark = wmark_pages(cc->zone,
- cc->alloc_flags & ALLOC_WMARK_MASK);
- if (zone_watermark_ok(cc->zone, cc->order, watermark,
- cc->highest_zoneidx, cc->alloc_flags))
- return COMPACT_SUCCESS;
-
- /* Compaction is likely to fail */
- if (!compaction_suitable(cc->zone, cc->order,
- cc->highest_zoneidx))
- return COMPACT_SKIPPED;
+ ret = compaction_suit_allocation_order(cc->zone, cc->order,
+ cc->highest_zoneidx,
+ cc->alloc_flags);
+ if (ret != COMPACT_CONTINUE)
+ return ret;
}
/*
@@ -2914,6 +2930,7 @@ static bool kcompactd_node_suitable(pg_data_t *pgdat)
int zoneid;
struct zone *zone;
enum zone_type highest_zoneidx = pgdat->kcompactd_highest_zoneidx;
+ enum compact_result ret;
for (zoneid = 0; zoneid <= highest_zoneidx; zoneid++) {
zone = &pgdat->node_zones[zoneid];
@@ -2921,14 +2938,10 @@ static bool kcompactd_node_suitable(pg_data_t *pgdat)
if (!populated_zone(zone))
continue;
- /* Allocation can already succeed, check other zones */
- if (zone_watermark_ok(zone, pgdat->kcompactd_max_order,
- min_wmark_pages(zone),
- highest_zoneidx, 0))
- continue;
-
- if (compaction_suitable(zone, pgdat->kcompactd_max_order,
- highest_zoneidx))
+ ret = compaction_suit_allocation_order(zone,
+ pgdat->kcompactd_max_order,
+ highest_zoneidx, ALLOC_WMARK_MIN);
+ if (ret == COMPACT_CONTINUE)
return true;
}
@@ -2951,6 +2964,8 @@ static void kcompactd_do_work(pg_data_t *pgdat)
.ignore_skip_hint = false,
.gfp_mask = GFP_KERNEL,
};
+ enum compact_result ret;
+
trace_mm_compaction_kcompactd_wake(pgdat->node_id, cc.order,
cc.highest_zoneidx);
count_compact_event(KCOMPACTD_WAKE);
@@ -2965,12 +2980,9 @@ static void kcompactd_do_work(pg_data_t *pgdat)
if (compaction_deferred(zone, cc.order))
continue;
- /* Allocation can already succeed, nothing to do */
- if (zone_watermark_ok(zone, cc.order,
- min_wmark_pages(zone), zoneid, 0))
- continue;
-
- if (!compaction_suitable(zone, cc.order, zoneid))
+ ret = compaction_suit_allocation_order(zone,
+ cc.order, zoneid, ALLOC_WMARK_MIN);
+ if (ret != COMPACT_CONTINUE)
continue;
if (kthread_should_stop())
--
2.30.0
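
A compact sketch of the three-way contract that
compaction_suit_allocation_order introduces; the enum values and the two
predicates below are simplified stand-ins, not the kernel's definitions.

#include <stdbool.h>
#include <stdio.h>

enum compact_result { COMPACT_SKIPPED, COMPACT_CONTINUE, COMPACT_SUCCESS };

static bool zone_watermark_ok(int order)   { return order < 2; } /* stub */
static bool compaction_suitable(int order) { return order < 8; } /* stub */

static enum compact_result suit_allocation_order(int order)
{
	if (zone_watermark_ok(order))
		return COMPACT_SUCCESS;	/* allocation already satisfiable */
	if (!compaction_suitable(order))
		return COMPACT_SKIPPED;	/* compaction likely to fail */
	return COMPACT_CONTINUE;	/* worth running compaction */
}

int main(void)
{
	static const char * const name[] = {
		"SKIPPED", "CONTINUE", "SUCCESS"
	};

	for (int order = 0; order <= 9; order += 3)
		printf("order %d -> %s\n", order,
		       name[suit_allocation_order(order)]);
	return 0;
}

All three call sites in the patch then reduce to a single call followed
by a check against COMPACT_CONTINUE.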