linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Zi Yan <ziy@nvidia.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: David Hildenbrand <david@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linuxppc-dev@lists.ozlabs.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Oscar Salvador <osalvador@suse.de>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Nicholas Piggin <npiggin@gmail.com>,
	Christophe Leroy <christophe.leroy@csgroup.eu>,
	Naveen N Rao <naveen@kernel.org>,
	Madhavan Srinivasan <maddy@linux.ibm.com>
Subject: Re: [PATCH RESEND v2 4/6] mm/page_alloc: sort out the alloc_contig_range() gfp flags mess
Date: Tue, 03 Dec 2024 10:49:20 -0500	[thread overview]
Message-ID: <498871B1-D26C-4934-8E89-C6C8ECE8872A@nvidia.com> (raw)
In-Reply-To: <04c1d28e-bbea-4499-9a6d-770f9ca53ba9@suse.cz>

On 3 Dec 2024, at 9:24, Vlastimil Babka wrote:

> On 12/3/24 15:12, David Hildenbrand wrote:
>> On 03.12.24 14:55, Vlastimil Babka wrote:
>>> On 12/3/24 10:47, David Hildenbrand wrote:
>>>> It's all a bit complicated for alloc_contig_range(). For example, we don't
>>>> support many flags, so let's start bailing out on unsupported
>>>> ones -- ignoring the placement hints, as we are already given the range
>>>> to allocate.
>>>>
>>>> While we currently set cc.gfp_mask, in __alloc_contig_migrate_range() we
>>>> simply create yet another GFP mask whereby we ignore the reclaim flags
>>>> specify by the caller. That looks very inconsistent.
>>>>
>>>> Let's clean it up, constructing the gfp flags used for
>>>> compaction/migration exactly once. Update the documentation of the
>>>> gfp_mask parameter for alloc_contig_range() and alloc_contig_pages().
>>>>
>>>> Acked-by: Zi Yan <ziy@nvidia.com>
>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>
>>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>>>
>>>> +	/*
>>>> +	 * Flags to control page compaction/migration/reclaim, to free up our
>>>> +	 * page range. Migratable pages are movable, __GFP_MOVABLE is implied
>>>> +	 * for them.
>>>> +	 *
>>>> +	 * Traditionally we always had __GFP_HARDWALL|__GFP_RETRY_MAYFAIL set,
>>>> +	 * keep doing that to not degrade callers.
>>>> +	 */
>>>
>>> Wonder if we could revisit that eventually. Why limit migration targets by
>>> cpuset via __GFP_HARDWALL if we were not called with __GFP_HARDWALL? And why
>>> weaken the attempts with __GFP_RETRY_MAYFAIL if we didn't specify it?
>>
>> See below.
>>
>>>
>>> Unless I'm missing something, cc->gfp is only checked for __GFP_FS and
>>> __GFP_NOWARN in few places, so it's mostly migration_target_control the
>>> callers could meaningfully influence.
>>
>> Note the fist change in the file, where we now use the mask instead of coming up
>> with another one out of the blue. :)
>
> I know. What I wanted to say - cc->gfp is on its own only checked in few
> places, but now since we also translate it to migration_target_control's
> gfp_mask, it's mostly that part the caller might influence with the passed
> flags. But we still impose own additions to it, limiting that influence.
>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index ce7589a4ec01..54594cc4f650 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6294,7 +6294,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
>>   	int ret = 0;
>>   	struct migration_target_control mtc = {
>>   		.nid = zone_to_nid(cc->zone),
>> -		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
>> +		.gfp_mask = cc->gfp_mask,
>>   		.reason = MR_CONTIG_RANGE,
>>   	};
>>
>> GFP_USER contains __GFP_HARDWALL. I am not sure if that matters here, but
>
> Yeah wonder if GFP_USER was used specifically for that part, or just randomly :)
>
>> likely the thing we are assuming here is that we are migrating a page, and
>> usually, these are user allocation (except maybe balloon and some other non-lru
>> movable things).
>
> Yeah and user allocations obey cpuset and mempolicies etc. But these are
> likely somebody elses allocations that were done according to their
> policies. With our migration we might be actually violating those, which
> probably can't be helped (is at least migration within the same node
> preferred? hmm). But it doesn't seem to me that our caller's restrictions
> (if those exist, would be enforced by __GFP_HARDWALL) are that relevant for
> somebody else's pages?

Yeah, I was wondering why current_gfp_context() is used to adjust gfp_mask,
since current context might not be relevant. But I see it is used in
the original code, so I did not ask. If current context is irrelevant w.r.t
the to-be-migrated pages, should current_gfp_context() part be removed?

Ideally, to respect the to-be-migrated page’s gfp, we might need to go through
rmap to find its corresponding vma and possible task struct, but that seems
overly complicated.


Best Regards,
Yan, Zi


  reply	other threads:[~2024-12-03 15:49 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-03  9:47 [PATCH RESEND v2 0/6] mm/page_alloc: gfp flags cleanups for alloc_contig_*() David Hildenbrand
2024-12-03  9:47 ` [PATCH RESEND v2 1/6] mm/page_isolation: don't pass gfp flags to isolate_single_pageblock() David Hildenbrand
2024-12-03 13:31   ` Vlastimil Babka
2024-12-03 15:30   ` Oscar Salvador
2024-12-03 21:44   ` Vishal Moola
2024-12-03  9:47 ` [PATCH RESEND v2 2/6] mm/page_isolation: don't pass gfp flags to start_isolate_page_range() David Hildenbrand
2024-12-03 13:32   ` Vlastimil Babka
2024-12-03 15:32   ` Oscar Salvador
2024-12-03 21:44   ` Vishal Moola
2024-12-03  9:47 ` [PATCH RESEND v2 3/6] mm/page_alloc: make __alloc_contig_migrate_range() static David Hildenbrand
2024-12-03 13:33   ` Vlastimil Babka
2024-12-03 15:33   ` Oscar Salvador
2024-12-03 21:45   ` Vishal Moola
2024-12-03  9:47 ` [PATCH RESEND v2 4/6] mm/page_alloc: sort out the alloc_contig_range() gfp flags mess David Hildenbrand
2024-12-03 13:55   ` Vlastimil Babka
2024-12-03 14:12     ` David Hildenbrand
2024-12-03 14:24       ` Vlastimil Babka
2024-12-03 15:49         ` Zi Yan [this message]
2024-12-03 19:07           ` David Hildenbrand
2024-12-03 19:19         ` David Hildenbrand
2024-12-04  8:54           ` Vlastimil Babka
2024-12-04  8:59           ` Oscar Salvador
2024-12-04  9:03             ` Vlastimil Babka
2024-12-04  9:15               ` Oscar Salvador
2024-12-04  9:28                 ` David Hildenbrand
2024-12-04 10:04                   ` Oscar Salvador
2024-12-04 11:05                     ` David Hildenbrand
2024-12-04  9:00   ` Oscar Salvador
2024-12-03  9:47 ` [PATCH RESEND v2 5/6] mm/page_alloc: forward the gfp flags from alloc_contig_range() to post_alloc_hook() David Hildenbrand
2024-12-03 14:36   ` Vlastimil Babka
2024-12-04  9:03   ` Oscar Salvador
2024-12-03  9:47 ` [PATCH RESEND v2 6/6] powernv/memtrace: use __GFP_ZERO with alloc_contig_pages() David Hildenbrand
2024-12-03 14:39   ` Vlastimil Babka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=498871B1-D26C-4934-8E89-C6C8ECE8872A@nvidia.com \
    --to=ziy@nvidia.com \
    --cc=akpm@linux-foundation.org \
    --cc=christophe.leroy@csgroup.eu \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=maddy@linux.ibm.com \
    --cc=mpe@ellerman.id.au \
    --cc=naveen@kernel.org \
    --cc=npiggin@gmail.com \
    --cc=osalvador@suse.de \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox