Re: [PATCH v10 2/5] mm: page_isolation: check specified range for unmovable pages

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Zi Yan <ziy@nvidia.com>
To: David Hildenbrand <david@redhat.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	Vlastimil Babka <vbabka@suse.cz>,
	Mel Gorman <mgorman@techsingularity.net>,
	Eric Ren <renzhengeek@gmail.com>, Mike Rapoport <rppt@kernel.org>,
	Oscar Salvador <osalvador@suse.de>,
	Christophe Leroy <christophe.leroy@csgroup.eu>
Subject: Re: [PATCH v10 2/5] mm: page_isolation: check specified range for unmovable pages
Date: Tue, 12 Apr 2022 13:41:41 -0400	[thread overview]
Message-ID: <D53C41E3-E934-4577-9286-387AF3DDBE03@nvidia.com> (raw)
In-Reply-To: <afa115e6-d2ae-eeff-ba31-f8b2a5715b95@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 7012 bytes --]

On 12 Apr 2022, at 11:06, David Hildenbrand wrote:

> On 12.04.22 17:01, Zi Yan wrote:
>> On 12 Apr 2022, at 10:49, David Hildenbrand wrote:
>>
>>> On 12.04.22 16:07, Zi Yan wrote:
>>>> On 12 Apr 2022, at 9:10, David Hildenbrand wrote:
>>>>
>>>>> On 06.04.22 17:18, Zi Yan wrote:
>>>>>> From: Zi Yan <ziy@nvidia.com>
>>>>>>
>>>>>> Enable set_migratetype_isolate() to check specified sub-range for
>>>>>> unmovable pages during isolation. Page isolation is done
>>>>>> at MAX_ORDER_NR_PAEGS granularity, but not all pages within that
>>>>>> granularity are intended to be isolated. For example,
>>>>>> alloc_contig_range(), which uses page isolation, allows ranges without
>>>>>> alignment. This commit makes unmovable page check only look for
>>>>>> interesting pages, so that page isolation can succeed for any
>>>>>> non-overlapping ranges.
>>>>>>
>>>>>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>>>>>> ---
>>>>>
>>>>> [...]
>>>>>
>>>>>>  /*
>>>>>> - * This function checks whether pageblock includes unmovable pages or not.
>>>>>> + * This function checks whether the range [start_pfn, end_pfn) includes
>>>>>> + * unmovable pages or not. The range must fall into a single pageblock and
>>>>>> + * consequently belong to a single zone.
>>>>>>   *
>>>>>>   * PageLRU check without isolation or lru_lock could race so that
>>>>>>   * MIGRATE_MOVABLE block might include unmovable pages. And __PageMovable
>>>>>> @@ -28,12 +30,14 @@
>>>>>>   * cannot get removed (e.g., via memory unplug) concurrently.
>>>>>>   *
>>>>>>   */
>>>>>> -static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>> -				 int migratetype, int flags)
>>>>>> +static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long end_pfn,
>>>>>> +				int migratetype, int flags)
>>>>>>  {
>>>>>> -	unsigned long iter = 0;
>>>>>> -	unsigned long pfn = page_to_pfn(page);
>>>>>> -	unsigned long offset = pfn % pageblock_nr_pages;
>>>>>> +	unsigned long pfn = start_pfn;
>>>>>> +	struct page *page = pfn_to_page(pfn);
>>>>>
>>>>>
>>>>> Just do
>>>>>
>>>>> struct page *page = pfn_to_page(start_pfn);
>>>>> struct zone *zone = page_zone(page);
>>>>>
>>>>> here. No need to lookup the zone again in the loop because, as you
>>>>> document "must ... belong to a single zone.".
>>>>>
>>>>> Then, there is also no need to initialize "pfn" here. In the loop header
>>>>> is sufficient.
>>>>>
>>>>
>>>> Sure.
>>>>
>>>>>> +
>>>>>> +	VM_BUG_ON(ALIGN_DOWN(start_pfn, pageblock_nr_pages) !=
>>>>>> +		  ALIGN_DOWN(end_pfn - 1, pageblock_nr_pages));
>>>>>>
>>>>>>  	if (is_migrate_cma_page(page)) {
>>>>>>  		/*
>>>>>> @@ -47,8 +51,11 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>>  		return page;
>>>>>>  	}
>>>>>>
>>>>>> -	for (; iter < pageblock_nr_pages - offset; iter++) {
>>>>>> -		page = pfn_to_page(pfn + iter);
>>>>>> +	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
>>>>>> +		struct zone *zone;
>>>>>> +
>>>>>> +		page = pfn_to_page(pfn);
>>>>>> +		zone = page_zone(page);
>>>>>>
>>>>>>  		/*
>>>>>>  		 * Both, bootmem allocations and memory holes are marked
>>>>>> @@ -85,7 +92,7 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>>  			}
>>>>>>
>>>>>>  			skip_pages = compound_nr(head) - (page - head);
>>>>>> -			iter += skip_pages - 1;
>>>>>> +			pfn += skip_pages - 1;
>>>>>>  			continue;
>>>>>>  		}
>>>>>>
>>>>>> @@ -97,7 +104,7 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>>  		 */
>>>>>>  		if (!page_ref_count(page)) {
>>>>>>  			if (PageBuddy(page))
>>>>>> -				iter += (1 << buddy_order(page)) - 1;
>>>>>> +				pfn += (1 << buddy_order(page)) - 1;
>>>>>>  			continue;
>>>>>>  		}
>>>>>>
>>>>>> @@ -134,11 +141,18 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>>  	return NULL;
>>>>>>  }
>>>>>>
>>>>>> -static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags)
>>>>>> +/*
>>>>>> + * This function set pageblock migratetype to isolate if no unmovable page is
>>>>>> + * present in [start_pfn, end_pfn). The pageblock must intersect with
>>>>>> + * [start_pfn, end_pfn).
>>>>>> + */
>>>>>> +static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags,
>>>>>> +			unsigned long start_pfn, unsigned long end_pfn)
>>>>>
>>>>> I think we might be able do better, eventually not passing start_pfn at
>>>>> all. Hmm.
>>>>
>>>> IMHO, having start_pfn and end_pfn in the parameter list would make the
>>>> interface easier to understand. Otherwise if we remove start_pfn,
>>>> the caller needs to adjust @page to be within the range of [start_pfn,
>>>> end_pfn)
>>>>
>>>>>
>>>>> I think we want to pull out the
>>>>> start_isolate_page_range()/undo_isolate_page_range() interface change
>>>>> into a separate patch.
>>>>
>>>> You mean a patch just adding
>>>>
>>>> unsigned long isolate_start = pfn_max_align_down(start_pfn);
>>>> unsigned long isolate_end = pfn_max_align_up(end_pfn);
>>>>
>>>> in start_isolate_page_range()/undo_isolate_page_range()?
>>>>
>>>> Yes I can do that.
>>>
>>> I think we have to be careful with memory onlining/offlining. There are
>>> corner cases where we get called with only pageblock alignment and
>>> must not adjust the range.
>>
>> In the patch below, you added a new set of start_isolate_pageblocks()
>> and undo_isolate_pageblocks(), where start_isolate_pageblocks() still
>> calls set_migratetype_isolate() and noted their range should not be
>> adjusted. But in my patch, set_migratetype_isolate() adjusts
>> the range for has_unmovable_pages() check. Do you mean
>
> Right, that's broken in your patch. Memory onlining/offlining behavior
> changed recently when "vmemmap on memory" was added. The start range
> might only be aligned to pageblocks but not MAX_ORDER -1 -- and we must
> not align u..
>
> The core thing is that there are two types of users: memory offlining
> that knows what it's doing when it aligns to less then MAX_ORDER -1 ,
> and range allocators, that just pass in the range of interest.

Oh, you mean the pfn_max_align_down() and pfn_max_align_up() are wrong
for memory onlining/offlining callers. Got it. And in patch 3, this is
not a concern any more, since we move to pageblock_nr_pages alignment.

>
>> start_isolate_pageblocks() should call a different version of
>> set_migratetype_isolate() without range adjustment? That can be done
>> with an additional parameter in start_isolate_page_range(), like
>> bool strict, right?
>
> Random boolean flags are in general frowned upon ;)

I misunderstood about the alignment	issue. No need for this additional
parameter.

Thanks. Will take your patch and adjust this one based on yours.

I will wait for your reviews on patch 3 and onwards before sending
out a new version.

--
Best Regards,
Yan, Zi

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 854 bytes --]

next prev parent reply	other threads:[~2022-04-12 17:41 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-06 15:18 [PATCH v10 0/5] Use pageblock_order for cma and alloc_contig_range alignment Zi Yan
2022-04-06 15:18 ` [PATCH v10 1/5] mm: page_isolation: move has_unmovable_pages() to mm/page_isolation.c Zi Yan
2022-04-06 15:18 ` [PATCH v10 2/5] mm: page_isolation: check specified range for unmovable pages Zi Yan
2022-04-12 13:10   ` David Hildenbrand
2022-04-12 14:07     ` Zi Yan
2022-04-12 14:49       ` David Hildenbrand
2022-04-12 15:01         ` Zi Yan
2022-04-12 15:06           ` David Hildenbrand
2022-04-12 17:41             ` Zi Yan [this message]
2022-04-12 19:34               ` Zi Yan
2022-04-06 15:18 ` [PATCH v10 3/5] mm: make alloc_contig_range work at pageblock granularity Zi Yan
2022-04-06 15:18 ` [PATCH v10 4/5] mm: cma: use pageblock_order as the single alignment Zi Yan
2022-04-06 15:18 ` [PATCH v10 5/5] drivers: virtio_mem: use pageblock size as the minimum virtio_mem size Zi Yan
2022-04-12 12:35 ` [PATCH v10 0/5] Use pageblock_order for cma and alloc_contig_range alignment David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D53C41E3-E934-4577-9286-387AF3DDBE03@nvidia.com \
    --to=ziy@nvidia.com \
    --cc=christophe.leroy@csgroup.eu \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=osalvador@suse.de \
    --cc=renzhengeek@gmail.com \
    --cc=rppt@kernel.org \
    --cc=vbabka@suse.cz \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox