From: Zi Yan <zi.yan@sent.com>
To: Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@redhat.com>,
linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org,
Qian Cai <quic_qiancai@quicinc.com>,
Vlastimil Babka <vbabka@suse.cz>,
Mel Gorman <mgorman@techsingularity.net>,
Eric Ren <renzhengeek@gmail.com>, Mike Rapoport <rppt@kernel.org>,
Oscar Salvador <osalvador@suse.de>,
Christophe Leroy <christophe.leroy@csgroup.eu>,
Zi Yan <ziy@nvidia.com>, Doug Berger <opendmb@gmail.com>
Subject: [PATCH 2/2] mm: split free page with properly free memory accounting and without race
Date: Thu, 26 May 2022 19:15:31 -0400
Message-ID: <20220526231531.2404977-2-zi.yan@sent.com>
In-Reply-To: <20220526231531.2404977-1-zi.yan@sent.com>
From: Zi Yan <ziy@nvidia.com>
In isolate_single_pageblock(), free pages are checked without holding the
zone lock, so a free page can go away or change its order before
split_free_page() takes the zone lock. Check the free page and its order
again in split_free_page() with the zone lock held, and return -ENOENT if
they no longer match. On that error, isolate_single_pageblock() rechecks
the same pfn instead of skipping past it.

In addition, split_free_page() deleted the free page from the free list
without updating the free page accounting. Add the missing accounting
code.

Also change the type of the order parameter of split_free_page() from int
to unsigned int.
Link: https://lore.kernel.org/lkml/20220525103621.987185e2ca0079f7b97b856d@linux-foundation.org/
Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity")
Reported-by: Doug Berger <opendmb@gmail.com>
Link: https://lore.kernel.org/linux-mm/c3932a6f-77fe-29f7-0c29-fe6b1c67ab7b@gmail.com/
Signed-off-by: Zi Yan <ziy@nvidia.com>
---
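For reviewers, the recheck-under-lock pattern and the accounting fix can be
illustrated with a small userspace sketch. It is not the kernel code: struct
toy_zone, struct toy_page and toy_take_free_page() are hypothetical names,
and a C11 atomic_flag spinlock stands in for zone->lock. The point is only
that the state observed before taking the lock is validated again once the
lock is held, and that the free counter changes together with the removal.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a free page tracked by a zone. */
struct toy_page {
	bool is_free;		/* plays the role of PageBuddy() */
	unsigned int order;	/* plays the role of buddy_order() */
};

/* Hypothetical stand-in for struct zone: one lock, one free counter. */
struct toy_zone {
	atomic_flag lock;	/* toy spinlock, stands in for zone->lock */
	long nr_free_pages;
};

static void toy_zone_lock(struct toy_zone *zone)
{
	while (atomic_flag_test_and_set_explicit(&zone->lock,
						 memory_order_acquire))
		;	/* spin until the flag is released */
}

static void toy_zone_unlock(struct toy_zone *zone)
{
	atomic_flag_clear_explicit(&zone->lock, memory_order_release);
}

/*
 * Remove a free page from the zone, but only if it still looks the way
 * the caller saw it before the lock was taken.
 * Returns 0 on success, -ENOENT if the page changed in the meantime.
 */
static int toy_take_free_page(struct toy_zone *zone, struct toy_page *page,
			      unsigned int expected_order)
{
	int ret = 0;

	toy_zone_lock(zone);

	/* The caller's check was lockless, so repeat it under the lock. */
	if (!page->is_free || page->order != expected_order) {
		ret = -ENOENT;
		goto out;
	}

	/* Accounting changes together with the removal, under the same lock. */
	assert(zone->nr_free_pages >= (1L << expected_order));
	zone->nr_free_pages -= 1L << expected_order;
	page->is_free = false;
out:
	toy_zone_unlock(zone);
	return ret;
}
```

A caller that gets -ENOENT should look at the same page again rather than
advance by the stale order, which is what the page_isolation.c hunk does
with "continue".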
mm/internal.h | 4 ++--
mm/page_alloc.c | 24 ++++++++++++++++++++----
mm/page_isolation.c | 10 +++++++---
3 files changed, 29 insertions(+), 9 deletions(-)
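As a worked example of the order arithmetic in the split loop, here is a
userspace sketch that only computes the chunk orders and frees nothing.
split_orders(), lowest_set_bit() and highest_set_bit() are hypothetical
helpers; the latter two approximate the kernel's __ffs() and __fls() with
GCC/Clang builtins. The two advance steps (pfn += ..., split_pfn_offset
-= ...) fall between the hunks of this diff and are assumed here.

```c
#include <assert.h>

/* Userspace stand-ins for __ffs()/__fls(); both require x != 0. */
static unsigned int lowest_set_bit(unsigned long x)
{
	return (unsigned int)__builtin_ctzl(x);
}

static unsigned int highest_set_bit(unsigned long x)
{
	return (unsigned int)(8 * sizeof(x) - 1 - __builtin_clzl(x));
}

/*
 * Record the order of each chunk produced when a free page of the given
 * order starting at pfn_start is split at split_pfn_offset.
 * Returns the number of chunks written to out[] (at most max_out).
 */
static int split_orders(unsigned long pfn_start, unsigned int order,
			unsigned long split_pfn_offset,
			unsigned int *out, int max_out)
{
	unsigned long pfn = pfn_start;
	unsigned long end = pfn_start + (1UL << order);
	int n = 0;

	if (split_pfn_offset == 0)
		return 0;	/* nothing to split */

	/* The split point must lie inside the page. */
	assert(split_pfn_offset < (1UL << order));

	while (pfn < end && n < max_out) {
		/* Largest order allowed by the alignment of pfn ... */
		unsigned int by_align = pfn ? lowest_set_bit(pfn) : order;
		/* ... and by the distance left to the split point or the end. */
		unsigned int by_dist = highest_set_bit(split_pfn_offset);
		unsigned int chunk = by_align < by_dist ? by_align : by_dist;

		out[n++] = chunk;
		pfn += 1UL << chunk;
		split_pfn_offset -= 1UL << chunk;

		/* First part done: now cover the rest of the page. */
		if (split_pfn_offset == 0)
			split_pfn_offset = end - pfn;
	}
	return n;
}
```

With these assumptions, an order-10 page at pfn 1024 split at offset 512
yields two order-9 chunks, and an order-4 page at pfn 0 split at offset 6
yields chunks of order 2, 1, 1 and 3.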
diff --git a/mm/internal.h b/mm/internal.h
index 20e0a990da40..7cf12a15475b 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -374,8 +374,8 @@ extern void *memmap_alloc(phys_addr_t size, phys_addr_t align,
phys_addr_t min_addr,
int nid, bool exact_nid);
-void split_free_page(struct page *free_page,
- int order, unsigned long split_pfn_offset);
+int split_free_page(struct page *free_page,
+ unsigned int order, unsigned long split_pfn_offset);
#if defined CONFIG_COMPACTION || defined CONFIG_CMA
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 355bd017b185..2717d6dede99 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1112,30 +1112,44 @@ static inline void __free_one_page(struct page *page,
* @order: the order of the page
* @split_pfn_offset: split offset within the page
*
+ * Return -ENOENT if the free page is changed, otherwise 0
+ *
* It is used when the free page crosses two pageblocks with different migratetypes
* at split_pfn_offset within the page. The split free page will be put into
* separate migratetype lists afterwards. Otherwise, the function achieves
* nothing.
*/
-void split_free_page(struct page *free_page,
- int order, unsigned long split_pfn_offset)
+int split_free_page(struct page *free_page,
+ unsigned int order, unsigned long split_pfn_offset)
{
struct zone *zone = page_zone(free_page);
unsigned long free_page_pfn = page_to_pfn(free_page);
unsigned long pfn;
unsigned long flags;
int free_page_order;
+ int mt;
+ int ret = 0;
if (split_pfn_offset == 0)
- return;
+ return ret;
spin_lock_irqsave(&zone->lock, flags);
+
+ if (!PageBuddy(free_page) || buddy_order(free_page) != order) {
+ ret = -ENOENT;
+ goto out;
+ }
+
+ mt = get_pageblock_migratetype(free_page);
+ if (likely(!is_migrate_isolate(mt)))
+ __mod_zone_freepage_state(zone, -(1UL << order), mt);
+
del_page_from_free_list(free_page, zone, order);
for (pfn = free_page_pfn;
pfn < free_page_pfn + (1UL << order);) {
int mt = get_pfnblock_migratetype(pfn_to_page(pfn), pfn);
- free_page_order = min_t(int,
+ free_page_order = min_t(unsigned int,
pfn ? __ffs(pfn) : order,
__fls(split_pfn_offset));
__free_one_page(pfn_to_page(pfn), pfn, zone, free_page_order,
@@ -1146,7 +1160,9 @@ void split_free_page(struct page *free_page,
if (split_pfn_offset == 0)
split_pfn_offset = (1UL << order) - (pfn - free_page_pfn);
}
+out:
spin_unlock_irqrestore(&zone->lock, flags);
+ return ret;
}
/*
* A bad page could be due to a number of fields. Instead of multiple branches,
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index fbd820b21292..6021f8444b5a 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -371,9 +371,13 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
if (PageBuddy(page)) {
int order = buddy_order(page);
- if (pfn + (1UL << order) > boundary_pfn)
- split_free_page(page, order, boundary_pfn - pfn);
- pfn += (1UL << order);
+ if (pfn + (1UL << order) > boundary_pfn) {
+ /* free page changed before split, check it again */
+ if (split_free_page(page, order, boundary_pfn - pfn))
+ continue;
+ }
+
+ pfn += 1UL << order;
continue;
}
/*
--
2.35.1