linux-mm.kvack.org archive mirror
From: Wei Yang <richard.weiyang@gmail.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>,
	akpm@linux-foundation.org, vbabka@suse.cz,
	mgorman@techsingularity.net, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
Date: Tue, 7 Feb 2017 23:32:47 +0800
Message-ID: <20170207153247.GB31837@WeideMBP.lan>
In-Reply-To: <20170207094557.GE5065@dhcp22.suse.cz>


On Tue, Feb 07, 2017 at 10:45:57AM +0100, Michal Hocko wrote:
>On Mon 06-02-17 23:43:14, Wei Yang wrote:
>> The whole memory space is divided into several zones and nodes may have no
>> page in some zones. In this case, the __absent_pages_in_range() would
>> return 0, since the range it is searching for is an empty range.
>> 
>> Also this happens more often to those nodes with higher memory range when
>> there are more nodes, which is a trend for future architectures.
>
>I do not understand this part. Why would we see more zones with zero pfn
>range in higher memory ranges?
>

Based on my understanding, zone boundaries are fixed addresses. For example, on
x86_64, ZONE_DMA is < 16M and ZONE_DMA32 is < 4G. Similar rules apply to
sparc, ia64 and s390, as shown in the comment on the zone definitions.

For example, consider a server with 8 NUMA nodes and 4T of memory. Those zone
boundaries may all sit within the first node's range, so the nodes with higher
memory ranges may all sit in the last zone, which is ZONE_NORMAL I think.
During memory initialization, for each node we still iterate over each zone
and calculate the memory range in each zone. As a result, those nodes with
higher memory ranges will see several empty zones.
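
To make the layout above concrete, here is a minimal user-space sketch (not
kernel code). It assumes 4K pages, 8 equally sized nodes covering 4T with no
holes, and only three zones; the names zone_pages_in_node() and
count_empty_zones() are made up for this illustration.

```c
#include <stdio.h>

/* Assumed layout for illustration only: 4K pages, 8 equal nodes, 4T total. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_NR_NODES   8
#define SKETCH_NR_ZONES   3

/* Exclusive end PFN of each zone, derived from the fixed address limits. */
const unsigned long long sketch_zone_end_pfn[SKETCH_NR_ZONES] = {
	(16ULL << 20) >> SKETCH_PAGE_SHIFT,	/* ZONE_DMA    < 16M */
	(4ULL << 30) >> SKETCH_PAGE_SHIFT,	/* ZONE_DMA32  < 4G  */
	(4ULL << 40) >> SKETCH_PAGE_SHIFT,	/* ZONE_NORMAL up to 4T */
};

/* Pages of zone z that fall inside node nid, after clamping to the node. */
unsigned long long zone_pages_in_node(int nid, int z)
{
	unsigned long long node_pages =
		sketch_zone_end_pfn[SKETCH_NR_ZONES - 1] / SKETCH_NR_NODES;
	unsigned long long node_start = (unsigned long long)nid * node_pages;
	unsigned long long node_end = node_start + node_pages;
	unsigned long long start = z ? sketch_zone_end_pfn[z - 1] : 0;
	unsigned long long end = sketch_zone_end_pfn[z];

	if (start < node_start)
		start = node_start;
	if (end > node_end)
		end = node_end;

	/* No overlap between the zone and the node means an empty zone. */
	return end > start ? end - start : 0;
}

/* Count (node, zone) pairs that contain no page at all. */
int count_empty_zones(void)
{
	int nid, z, empty = 0;

	for (nid = 0; nid < SKETCH_NR_NODES; nid++)
		for (z = 0; z < SKETCH_NR_ZONES; z++)
			if (zone_pages_in_node(nid, z) == 0)
				empty++;
	return empty;
}

/* Print the per-node result so the empty zones are visible. */
void print_zone_layout(void)
{
	int nid, z;

	for (nid = 0; nid < SKETCH_NR_NODES; nid++)
		for (z = 0; z < SKETCH_NR_ZONES; z++)
			printf("node %d zone %d: %llu pages\n",
			       nid, z, zone_pages_in_node(nid, z));
}
```

Under these assumptions only node 0 has pages in the first two zones; every
other node sees them as empty ranges, yet the initialization still visits them.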

>> This patch checks the zone range after clamp and adjustment, return 0 if
>> the range is an empty range.
>
>I assume the whole point of this patch is to save
>__absent_pages_in_range which iterates over all memblock regions, right?

Yes, you are right. Since we know there is no overlap, it is not necessary to
do the iteration on memblock.
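
A rough user-space model of the saving, just as a sketch: the region table,
the walk counter and the function names below are invented for illustration
and only mimic the shape of __absent_pages_in_range(), they are not the kernel
implementation.

```c
#include <stddef.h>

/* Hypothetical stand-in for a memblock memory region, in PFNs. */
struct model_pfn_range {
	unsigned long start;
	unsigned long end;	/* exclusive */
};

/* Made-up present-memory map with one hole at [100, 150). */
const struct model_pfn_range model_regions[] = {
	{ 0, 100 }, { 150, 300 }, { 300, 1000 },
};

/* Counts how many regions were visited, to make the cost visible. */
unsigned long model_regions_walked;

/* Pages in [start, end) not covered by any region; walks every region. */
unsigned long model_absent_pages_in_range(unsigned long start,
					  unsigned long end)
{
	unsigned long absent = end - start;
	size_t i;

	for (i = 0; i < sizeof(model_regions) / sizeof(model_regions[0]); i++) {
		unsigned long s = model_regions[i].start;
		unsigned long e = model_regions[i].end;

		model_regions_walked++;
		if (s < start)
			s = start;
		if (e > end)
			e = end;
		if (s < e)
			absent -= e - s;
	}
	return absent;
}

/* Same result, but an empty range returns before touching any region. */
unsigned long model_zone_absent_pages(unsigned long zone_start_pfn,
				      unsigned long zone_end_pfn)
{
	if (zone_start_pfn == zone_end_pfn)
		return 0;
	return model_absent_pages_in_range(zone_start_pfn, zone_end_pfn);
}
```

In this model both paths return 0 for an empty range; the only difference is
that the early return leaves model_regions_walked unchanged.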

>Is there any reason why for_each_mem_pfn_range cannot be changed to
>honor the given start/end pfns instead? I can imagine that a small zone
>would see similar pointless iterations...
>

Hmm... No special reason, I just had not thought about this implementation.
Actually I did the same thing as zone_spanned_pages_in_node(), which also
returns 0 when there is no overlap.

BTW, I don't quite get your point. Do you wish to put the check in the
for_each_mem_pfn_range() definition?

>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> ---
>>  mm/page_alloc.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>> 
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 6de9440e3ae2..51c60c0eadcb 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -5521,6 +5521,11 @@ static unsigned long __meminit zone_absent_pages_in_node(int nid,
>>  	adjust_zone_range_for_zone_movable(nid, zone_type,
>>  			node_start_pfn, node_end_pfn,
>>  			&zone_start_pfn, &zone_end_pfn);
>> +
>> +	/* If this node has no page within this zone, return 0. */
>> +	if (zone_start_pfn == zone_end_pfn)
>> +		return 0;
>> +
>>  	nr_absent = __absent_pages_in_range(nid, zone_start_pfn, zone_end_pfn);
>>  
>>  	/*
>> -- 
>> 2.11.0
>> 
>
>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

