linux-mm.kvack.org archive mirror
From: Wei Yang <richard.weiyang@gmail.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@gmail.com>,
	linux-mm@kvack.org, akpm@linux-foundation.org,
	mgorman@techsingularity.net, osalvador@suse.de
Subject: Re: [PATCH] mm, page_alloc: clear zone_movable_pfn if the node doesn't have ZONE_MOVABLE
Date: Tue, 18 Dec 2018 20:27:43 +0000
Message-ID: <20181218202743.i5wvlzipzdl54fuq@master>
In-Reply-To: <20181218144724.GM30879@dhcp22.suse.cz>

On Tue, Dec 18, 2018 at 03:47:24PM +0100, Michal Hocko wrote:
>On Tue 18-12-18 14:39:43, Wei Yang wrote:
>> On Tue, Dec 18, 2018 at 01:14:51PM +0100, Michal Hocko wrote:
>> >On Mon 17-12-18 14:18:02, Wei Yang wrote:
>> >> On Mon, Dec 17, 2018 at 11:25:34AM +0100, Michal Hocko wrote:
>> >> >On Sun 16-12-18 20:56:24, Wei Yang wrote:
>> >> >> A non-zero zone_movable_pfn indicates that this node has ZONE_MOVABLE, but
>> >> >> the current implementation doesn't comply with this rule when the kernel
>> >> >> parameter "kernelcore=" is used.
>> >> >> 
>> >> >> The current implementation doesn't harm the system, since the value in
>> >> >> zone_movable_pfn is out of the range of the current zone. However, the user
>> >> >> would see this message during bootup even if that node doesn't have
>> >> >> ZONE_MOVABLE.
>> >> >> 
>> >> >>     Movable zone start for each node
>> >> >>       Node 0: 0x0000000080000000
>> >> >
>> >> >I am sorry but the above description confuses me more than it helps.
>> >> >Could you start over again and describe the user visible problem, then
>> >> >follow up with the underlying bug and finally continue with a proposed
>> >> >fix?
>> >> 
>> >> Yep, how about this one:
>> >> 
>> >> For example, a machine with 8G RAM, 2 nodes with 4G on each, if we pass
>> >
>> >Did you mean 2G on each? Because your nodes do have 2GB each.
>> >
>> >> "kernelcore=2G" as kernel parameter, the dmesg looks like:
>> >> 
>> >>      Movable zone start for each node
>> >>        Node 0: 0x0000000080000000
>> >>        Node 1: 0x0000000100000000
>> >> 
>> >> This looks like both Node 0 and 1 have ZONE_MOVABLE, while the following
>> >> dmesg shows only Node 1 has ZONE_MOVABLE.
>> >
>> >Well, the documentation says
>> >	kernelcore=	[KNL,X86,IA-64,PPC]
>> >			Format: nn[KMGTPE] | nn% | "mirror"
>> >			This parameter specifies the amount of memory usable by
>> >			the kernel for non-movable allocations.  The requested
>> >			amount is spread evenly throughout all nodes in the
>> >			system as ZONE_NORMAL.  The remaining memory is used for
>> >			movable memory in its own zone, ZONE_MOVABLE.  In the
>> >			event, a node is too small to have both ZONE_NORMAL and
>> >			ZONE_MOVABLE, kernelcore memory will take priority and
>> >			other nodes will have a larger ZONE_MOVABLE.
>> 
>> Yes, the current behavior is a little bit different.
>
>Then it is either a bug in implementation or documentation.
>
>> 
>> When you look at find_usable_zone_for_movable(), ZONE_MOVABLE is in the
>> highest zone, which means that if a node doesn't have the highest zone, all
>> its memory belongs to kernelcore.
>
>Each node can have all zones. DMA and DMA32 are address-range specific,
>but there is always a NORMAL zone to hold kernel memory irrespective of
>the pfn range.
>
>> 
>> Looks like a design decision?
>> 
>> >
>> >>      On node 0 totalpages: 524190
>> >>        DMA zone: 64 pages used for memmap
>> >>        DMA zone: 21 pages reserved
>> >>        DMA zone: 3998 pages, LIFO batch:0
>> >>        DMA32 zone: 8128 pages used for memmap
>> >>        DMA32 zone: 520192 pages, LIFO batch:63
>> >>      
>> >>      On node 1 totalpages: 524255
>> >>        DMA32 zone: 4096 pages used for memmap
>> >>        DMA32 zone: 262111 pages, LIFO batch:63
>> >>        Movable zone: 4096 pages used for memmap
>> >>        Movable zone: 262144 pages, LIFO batch:63
>> >
>> >So assuming you really have 4GB in total and 2GB should be in kernel
>> >zones, then each node should put half of it into kernel zones, with the
>> >remaining 2G evenly distributed to movable zones. So something seems
>> >broken here.
>> 
>> If we really had this implemented, we would have the following memory
>> layout.
>> 
>> 
>>     +---------+------+---------+--------+------------+
>>     |DMA      |DMA32 |Movable  |DMA32   |Movable     |
>>     +---------+------+---------+--------+------------+
>>     |<        Node 0          >|<      Node 1       >|
>> 
>> This means the zones are not monotonically increasing.
>> 
>> Is this what we expect now? If it is, we really have something broken.
>
>Absolutely. Each node can have all zones as mentioned above.
>

Ok, it seems the implementation is not correct now.
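
To make the dmesg above concrete, here is a rough sketch (Python, only for
illustration, not kernel code) that decodes the two "Movable zone start"
values. It assumes 4 KiB pages, and the node end PFNs are my own guess from
the quoted page counts (node 0 ending at 2G, node 1 at 5G), so treat those
numbers and the helper names as hypothetical.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages (x86)

# "Movable zone start for each node" values quoted from the dmesg above.
movable_start_addr = {0: 0x0000000080000000, 1: 0x0000000100000000}

# Assumption: end PFN of each node, guessed from the quoted page counts
# (node 0 ends near 2 GiB, node 1 near 5 GiB). Not stated in the thread.
assumed_node_end_pfn = {0: 0x80000, 1: 0x140000}


def addr_to_pfn(addr):
    """Convert a physical address to a page frame number."""
    return addr >> PAGE_SHIFT


def has_movable_zone(node):
    """A movable zone is non-empty only if it starts below the node's end."""
    return addr_to_pfn(movable_start_addr[node]) < assumed_node_end_pfn[node]


for node in sorted(movable_start_addr):
    pfn = addr_to_pfn(movable_start_addr[node])
    print(f"Node {node}: start pfn {pfn:#x}, "
          f"has ZONE_MOVABLE: {has_movable_zone(node)}")
```

With these assumptions node 0's start PFN equals its assumed end PFN, so the
reported value covers no usable range, while node 1 really has a movable zone.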

BTW, would this eat into the lower zones' memory? For example, would we end up
with less DMA32?
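
As a rough model of what the documentation describes, the sketch below
spreads kernelcore evenly over the nodes and leaves the rest for
ZONE_MOVABLE. This is a simplification: it assumes 4 KiB pages,
spread_kernelcore() is a made-up helper, and it works on the page counts from
the dmesg instead of PFN ranges as the real code does.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def spread_kernelcore(node_pages, kernelcore_pages):
    """Split kernelcore_pages evenly across nodes (simplified model).

    A node smaller than its share contributes all of its pages, and the
    shortfall is spread again over the nodes that still have room.
    Returns (kernel_pages_per_node, movable_pages_per_node).
    """
    kernel = [0] * len(node_pages)
    remaining = kernelcore_pages
    while remaining > 0:
        open_nodes = [i for i, total in enumerate(node_pages)
                      if kernel[i] < total]
        if not open_nodes:
            break  # request is larger than all memory in the system
        share = -(-remaining // len(open_nodes))  # ceiling division
        for i in open_nodes:
            take = min(share, node_pages[i] - kernel[i], remaining)
            kernel[i] += take
            remaining -= take
    movable = [total - k for total, k in zip(node_pages, kernel)]
    return kernel, movable


# Page counts per node taken from the quoted dmesg; kernelcore=2G.
node_pages = [524190, 524255]
kernelcore_pages = (2 << 30) // PAGE_SIZE

kernel, movable = spread_kernelcore(node_pages, kernelcore_pages)
for node, (k, m) in enumerate(zip(kernel, movable)):
    print(f"Node {node}: kernelcore {k} pages, movable {m} pages")
```

In this model node 0 would keep only 262144 pages for kernel zones instead of
all 524190, which is why I ask whether the lower zones, such as DMA32, would
get smaller.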

>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

Thread overview: 9+ messages
2018-12-16 12:56 Wei Yang
2018-12-17 10:25 ` Michal Hocko
2018-12-17 14:18   ` Wei Yang
2018-12-18 12:14     ` Michal Hocko
2018-12-18 14:39       ` Wei Yang
2018-12-18 14:47         ` Michal Hocko
2018-12-18 20:27           ` Wei Yang [this message]
2018-12-19  6:56             ` Michal Hocko
2018-12-19 12:56               ` Wei Yang