From: Zi Yan <ziy@nvidia.com>
To: David Hildenbrand <david@redhat.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org,
Vlastimil Babka <vbabka@suse.cz>,
Mel Gorman <mgorman@techsingularity.net>,
Eric Ren <renzhengeek@gmail.com>, Mike Rapoport <rppt@kernel.org>,
Oscar Salvador <osalvador@suse.de>,
Christophe Leroy <christophe.leroy@csgroup.eu>
Subject: Re: [PATCH v8 2/5] mm: page_isolation: check specified range for unmovable pages
Date: Mon, 21 Mar 2022 14:23:07 -0400
Message-ID: <3379379B-489B-460F-8B01-9A1D584A5036@nvidia.com>
In-Reply-To: <44a512ba-1707-d9c7-7df3-b81af9b5f0fb@redhat.com>
On 21 Mar 2022, at 13:30, David Hildenbrand wrote:
> On 17.03.22 16:37, Zi Yan wrote:
>> From: Zi Yan <ziy@nvidia.com>
>>
>> Enable set_migratetype_isolate() to check specified sub-range for
>> unmovable pages during isolation. Page isolation is done
>> at max(MAX_ORDER_NR_PAGES, pageblock_nr_pages) granularity, but not all
>> pages within that granularity are intended to be isolated. For example,
>> alloc_contig_range(), which uses page isolation, allows ranges without
>> alignment. This commit makes unmovable page check only look for
>> interesting pages, so that page isolation can succeed for any
>> non-overlapping ranges.
>>
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>> ---
>> include/linux/page-isolation.h | 10 +++++
>> mm/page_alloc.c | 13 +------
>> mm/page_isolation.c | 69 ++++++++++++++++++++--------------
>> 3 files changed, 51 insertions(+), 41 deletions(-)
>>
>> diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
>> index e14eddf6741a..eb4a208fe907 100644
>> --- a/include/linux/page-isolation.h
>> +++ b/include/linux/page-isolation.h
>> @@ -15,6 +15,16 @@ static inline bool is_migrate_isolate(int migratetype)
>> {
>> return migratetype == MIGRATE_ISOLATE;
>> }
>> +static inline unsigned long pfn_max_align_down(unsigned long pfn)
>> +{
>> + return ALIGN_DOWN(pfn, MAX_ORDER_NR_PAGES);
>> +}
>> +
>> +static inline unsigned long pfn_max_align_up(unsigned long pfn)
>> +{
>> + return ALIGN(pfn, MAX_ORDER_NR_PAGES);
>> +}
>> +
>> #else
>> static inline bool has_isolate_pageblock(struct zone *zone)
>> {
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 6de57d058d3d..680580a40a35 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -8937,16 +8937,6 @@ void *__init alloc_large_system_hash(const char *tablename,
>> }
>>
>> #ifdef CONFIG_CONTIG_ALLOC
>> -static unsigned long pfn_max_align_down(unsigned long pfn)
>> -{
>> - return ALIGN_DOWN(pfn, MAX_ORDER_NR_PAGES);
>> -}
>> -
>> -static unsigned long pfn_max_align_up(unsigned long pfn)
>> -{
>> - return ALIGN(pfn, MAX_ORDER_NR_PAGES);
>> -}
>> -
>> #if defined(CONFIG_DYNAMIC_DEBUG) || \
>> (defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
>> /* Usage: See admin-guide/dynamic-debug-howto.rst */
>> @@ -9091,8 +9081,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
>> * put back to page allocator so that buddy can use them.
>> */
>>
>> - ret = start_isolate_page_range(pfn_max_align_down(start),
>> - pfn_max_align_up(end), migratetype, 0);
>> + ret = start_isolate_page_range(start, end, migratetype, 0);
>> if (ret)
>> return ret;
>
> Shouldn't we similarly adjust undo_isolate_page_range()? IOW, all users
> of pfn_max_align_down()/pfn_max_align_up() would be gone from that file
> and you can move these defines into mm/page_isolation.c instead of
> include/linux/page-isolation.h?
undo_isolate_page_range() faces a much simpler situation: it only needs
to unset the migratetype, so we can just pass a pageblock_nr_pages-aligned
range to it. For start_isolate_page_range(), start and end are also used by
has_unmovable_pages() for precise unmovable page identification, so
they cannot be aligned to pageblock_nr_pages. But for readability and symmetry,
yes, I can change undo_isolate_page_range() too.
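To illustrate the distinction drawn above, here is a rough userspace sketch (not kernel code). It assumes an alignment unit of 1024 pages, and the names iso_align_down(), iso_align_up(), struct iso_range and iso_isolate_range() are hypothetical stand-ins for ALIGN_DOWN(), ALIGN() and the pfn_max_align_*() helpers in the patch. The migratetype is changed over the aligned range, while the unmovable-page check keeps the caller's unaligned [start, end).

```c
/* Assumed alignment unit for illustration; the real value depends on the kernel config. */
#define ISO_ALIGN_NR_PAGES 1024UL

/* Half-open pfn range [start, end). */
struct iso_range {
	unsigned long start;
	unsigned long end;
};

/* Round x down to a multiple of a (a must be a power of two). */
static unsigned long iso_align_down(unsigned long x, unsigned long a)
{
	return x & ~(a - 1);
}

/* Round x up to a multiple of a (a must be a power of two). */
static unsigned long iso_align_up(unsigned long x, unsigned long a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Range whose pageblocks have their migratetype set (and later unset).
 * The unmovable-page check would still use the original, unaligned
 * [start_pfn, end_pfn) passed in by the caller.
 */
static struct iso_range iso_isolate_range(unsigned long start_pfn,
					  unsigned long end_pfn)
{
	struct iso_range r = {
		.start = iso_align_down(start_pfn, ISO_ALIGN_NR_PAGES),
		.end = iso_align_up(end_pfn, ISO_ALIGN_NR_PAGES),
	};

	return r;
}
```

With these assumed values, a request for [1500, 2100) would have its migratetype changed over [1024, 3072), while only [1500, 2100) needs to be free of unmovable pages.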
>
> Maybe perform this change in a separate patch for
> start_isolate_page_range() and undo_isolate_page_range() ?
The change is trivial enough to be folded into this one.
>
>>
>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>> index b34f1310aeaa..419c805dbdcd 100644
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -16,7 +16,8 @@
>> #include <trace/events/page_isolation.h>
>>
>> /*
>> - * This function checks whether pageblock includes unmovable pages or not.
>> + * This function checks whether pageblock within [start_pfn, end_pfn) includes
>> + * unmovable pages or not.
>
> I think we still want to limit that to a single pageblock (see below),
> as we're going to isolate individual pageblocks. Then an updated
> description could be:
>
> "This function checks whether the range [start_pfn, end_pfn) includes
> unmovable pages or not. The range must fall into a single pageblock and
> consequently belong to a single zone."
>
Sure.
>> *
>> * PageLRU check without isolation or lru_lock could race so that
>> * MIGRATE_MOVABLE block might include unmovable pages. And __PageMovable
>> @@ -28,27 +29,26 @@
>> * cannot get removed (e.g., via memory unplug) concurrently.
>> *
>> */
>> -static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>> - int migratetype, int flags)
>> +static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long end_pfn,
>> + int migratetype, int flags)
>> {
>> - unsigned long iter = 0;
>> - unsigned long pfn = page_to_pfn(page);
>> - unsigned long offset = pfn % pageblock_nr_pages;
>> + unsigned long pfn = start_pfn;
>>
>> - if (is_migrate_cma_page(page)) {
>> - /*
>> - * CMA allocations (alloc_contig_range) really need to mark
>> - * isolate CMA pageblocks even when they are not movable in fact
>> - * so consider them movable here.
>> - */
>> - if (is_migrate_cma(migratetype))
>> - return NULL;
>
> If we're really dealing with a range that falls into a single pageblock,
> then you can leave the is_migrate_cma_page() in place and also lookup
> the zone only once. That should speed up things and minimize the
> required changes.
>
> You can then further add VM_BUG_ON()s that make sure that start_pfn and
> end_pfn-1 belong to a single pageblock.
Sure.
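A minimal userspace sketch of the single-pageblock condition suggested above, assuming 512 pages per pageblock. SPB_PAGEBLOCK_NR_PAGES and spb_range_in_single_pageblock() are hypothetical names; in the kernel the result would presumably feed a VM_BUG_ON() rather than be returned.

```c
#include <stdbool.h>

/* Assumed pageblock size for illustration; the real value is config-dependent. */
#define SPB_PAGEBLOCK_NR_PAGES 512UL

/*
 * Return true if the non-empty half-open range [start_pfn, end_pfn)
 * lies within one pageblock, i.e. its first pfn (start_pfn) and its
 * last pfn (end_pfn - 1) share the same pageblock index.
 */
static bool spb_range_in_single_pageblock(unsigned long start_pfn,
					  unsigned long end_pfn)
{
	if (start_pfn >= end_pfn)
		return false;

	return start_pfn / SPB_PAGEBLOCK_NR_PAGES ==
	       (end_pfn - 1) / SPB_PAGEBLOCK_NR_PAGES;
}
```

The comparison uses end_pfn - 1 because the range is half-open: a range ending exactly on a pageblock boundary still belongs to the preceding pageblock.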
>
>> + for (pfn = start_pfn; pfn < end_pfn; pfn++) {
>> + struct page *page = pfn_to_page(pfn);
>> + struct zone *zone = page_zone(page);
>>
>> - return page;
>> - }
>> + if (is_migrate_cma_page(page)) {
>> + /*
>> + * CMA allocations (alloc_contig_range) really need to mark
>> + * isolate CMA pageblocks even when they are not movable in fact
>> + * so consider them movable here.
>> + */
>> + if (is_migrate_cma(migratetype))
>> + return NULL;
>>
>> - for (; iter < pageblock_nr_pages - offset; iter++) {
>> - page = pfn_to_page(pfn + iter);
>> + return page;
>> + }
>>
>> /*
>> * Both, bootmem allocations and memory holes are marked
>> @@ -85,7 +85,7 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>> }
>>
>> skip_pages = compound_nr(head) - (page - head);
>> - iter += skip_pages - 1;
>> + pfn += skip_pages - 1;
>> continue;
>> }
>>
>> @@ -97,7 +97,7 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>> */
>> if (!page_ref_count(page)) {
>> if (PageBuddy(page))
>> - iter += (1 << buddy_order(page)) - 1;
>> + pfn += (1 << buddy_order(page)) - 1;
>> continue;
>> }
>>
>> @@ -134,7 +134,13 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>> return NULL;
>> }
>>
>> -static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags)
>> +/*
>> + * This function sets pageblock migratetype to isolate if no unmovable page is
>> + * present in [start_pfn, end_pfn). The pageblock must intersect with
>> + * [start_pfn, end_pfn).
>> + */
>> +static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags,
>> + unsigned long start_pfn, unsigned long end_pfn)
>> {
>> struct zone *zone = page_zone(page);
>> struct page *unmovable;
>> @@ -155,8 +161,13 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
>> /*
>> * FIXME: Now, memory hotplug doesn't call shrink_slab() by itself.
>> * We just check MOVABLE pages.
>> + *
>> + * Pass the intersection of [start_pfn, end_pfn) and the page's pageblock
>> + * to avoid redundant checks.
>> */
>
> I think I'd prefer some helper variables for readability.
Will do.
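One way the helper variables might look, as a userspace sketch only. It assumes 512 pages per pageblock; chk_align_up(), struct chk_range and chk_check_range() are hypothetical names, and the actual next revision of the patch may be structured differently. The arithmetic mirrors the max()/min() expression in the hunk below.

```c
/* Assumed pageblock size for illustration; the real value is config-dependent. */
#define CHK_PAGEBLOCK_NR_PAGES 512UL

/* Half-open pfn range [start, end). */
struct chk_range {
	unsigned long start;
	unsigned long end;
};

/* Round x up to a multiple of a (a must be a power of two). */
static unsigned long chk_align_up(unsigned long x, unsigned long a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Intersect the requested range [start_pfn, end_pfn) with the pageblock
 * that contains page_pfn, starting no earlier than page_pfn itself.
 */
static struct chk_range chk_check_range(unsigned long page_pfn,
					unsigned long start_pfn,
					unsigned long end_pfn)
{
	/* First pfn past the pageblock that contains page_pfn. */
	unsigned long block_end = chk_align_up(page_pfn + 1,
					       CHK_PAGEBLOCK_NR_PAGES);
	unsigned long check_start = page_pfn > start_pfn ? page_pfn : start_pfn;
	unsigned long check_end = block_end < end_pfn ? block_end : end_pfn;
	struct chk_range r = { .start = check_start, .end = check_end };

	return r;
}
```

For a requested range [1500, 2100) under these assumptions, the pageblock starting at pfn 1024 would be checked over [1500, 1536), the one at 1536 over [1536, 2048), and the one at 2048 over [2048, 2100).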
>
>> - unmovable = has_unmovable_pages(zone, page, migratetype, isol_flags);
>> + unmovable = has_unmovable_pages(max(page_to_pfn(page), start_pfn),
>> + min(ALIGN(page_to_pfn(page) + 1, pageblock_nr_pages), end_pfn),
>> + migratetype, isol_flags);
>> if (!unmovable) {
>> unsigned long nr_pages;
>> int mt = get_pageblock_migratetype(page);
>> @@ -267,7 +278,6 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
>> * be MIGRATE_ISOLATE.
>> * @start_pfn: The lower PFN of the range to be isolated.
>> * @end_pfn: The upper PFN of the range to be isolated.
>> - * start_pfn/end_pfn must be aligned to pageblock_order.
>> * @migratetype: Migrate type to set in error recovery.
>> * @flags: The following flags are allowed (they can be combined in
>> * a bit mask)
>> @@ -309,15 +319,16 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>> unsigned long pfn;
>> struct page *page;
>>
>> - BUG_ON(!IS_ALIGNED(start_pfn, pageblock_nr_pages));
>> - BUG_ON(!IS_ALIGNED(end_pfn, pageblock_nr_pages));
>> + unsigned long isolate_start = pfn_max_align_down(start_pfn);
>> + unsigned long isolate_end = pfn_max_align_up(end_pfn);
>>
>> - for (pfn = start_pfn;
>> - pfn < end_pfn;
>> + for (pfn = isolate_start;
>> + pfn < isolate_end;
>> pfn += pageblock_nr_pages) {
>> page = __first_valid_page(pfn, pageblock_nr_pages);
>> - if (page && set_migratetype_isolate(page, migratetype, flags)) {
>> - undo_isolate_page_range(start_pfn, pfn, migratetype);
>> + if (page && set_migratetype_isolate(page, migratetype, flags,
>> + start_pfn, end_pfn)) {
>> + undo_isolate_page_range(isolate_start, pfn, migratetype);
>> return -EBUSY;
>> }
>> }
>
>
> --
> Thanks,
>
> David / dhildenb
Thanks for your review.
--
Best Regards,
Yan, Zi
Thread overview:
2022-03-17 15:37 [PATCH v8 0/5] Use pageblock_order for cma and alloc_contig_range alignment Zi Yan
2022-03-17 15:37 ` [PATCH v8 1/5] mm: page_isolation: move has_unmovable_pages() to mm/page_isolation.c Zi Yan
2022-03-17 15:37 ` [PATCH v8 2/5] mm: page_isolation: check specified range for unmovable pages Zi Yan
2022-03-21 17:30 ` David Hildenbrand
2022-03-21 18:23 ` Zi Yan [this message]
2022-03-22 16:42 ` David Hildenbrand
2022-03-22 21:42 ` Zi Yan
2022-03-17 15:37 ` [PATCH v8 3/5] mm: make alloc_contig_range work at pageblock granularity Zi Yan
2022-03-17 15:37 ` [PATCH v8 4/5] mm: cma: use pageblock_order as the single alignment Zi Yan
2022-03-17 15:37 ` [PATCH v8 5/5] drivers: virtio_mem: use pageblock size as the minimum virtio_mem size Zi Yan