linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Zaslonko Mikhail <zaslonko@linux.bm.com>
To: Pasha Tatashin <Pavel.Tatashin@microsoft.com>,
	Gerald Schaefer <gerald.schaefer@de.ibm.com>,
	Michal Hocko <mhocko@kernel.org>
Cc: Mikhail Zaslonko <zaslonko@linux.ibm.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	schwidefsky@de.ibm.com
Subject: Re: [PATCH] memory_hotplug: fix the panic when memory end is not on the section boundary
Date: Wed, 24 Oct 2018 21:28:51 +0200	[thread overview]
Message-ID: <83c1d934-c74c-c469-8472-de21ebbf6e46@linux.bm.com> (raw)
In-Reply-To: <38ce1d0b-14bd-9a4a-1061-62c366cb11b5@microsoft.com>

Hello,

I hope it's still possible to revive this thread. Please find my comments below.

On 12.09.2018 16:40, Pasha Tatashin wrote:
> 
> 
> On 9/12/18 10:27 AM, Gerald Schaefer wrote:
>> On Wed, 12 Sep 2018 15:39:33 +0200
>> Michal Hocko <mhocko@kernel.org> wrote:
>>
>>> On Wed 12-09-18 15:03:56, Gerald Schaefer wrote:
>>> [...]
>>>> BTW, those sysfs attributes are world-readable, so anyone can trigger
>>>> the panic by simply reading them, or just run lsmem (also available for
>>>> x86 since util-linux 2.32). OK, you need a special not-memory-block-aligned
>>>> mem= parameter and DEBUG_VM for poison check, but w/o DEBUG_VM you would
>>>> still access uninitialized struct pages. This sounds very wrong, and I
>>>> think it really should be fixed.
>>>
>>> Ohh, absolutely. Nobody is questioning that. The thing is that the
>>> code has been likely always broken. We just haven't noticed because
>>> those unitialized parts where zeroed previously. Now that the implicit
>>> zeroying is gone it is just visible.
>>>
>>> All that I am arguing is that there are many places which assume
>>> pageblocks to be fully initialized and plugging one place that blows up
>>> at the time is just whack a mole. We need to address this much earlier.
>>> E.g. by allowing only full pageblocks when adding a memory range.
>>
>> Just to make sure we are talking about the same thing: when you say
>> "pageblocks", do you mean the MAX_ORDER_NR_PAGES / pageblock_nr_pages
>> unit of pages, or do you mean the memory (hotplug) block unit?
> 
>  From early discussion, it was about pageblock_nr_pages not about
> memory_block_size_bytes
> 
>>
>> I do not see any issue here with MAX_ORDER_NR_PAGES / pageblock_nr_pages
>> pageblocks, and if there was such an issue, of course you are right that
>> this would affect many places. If there was such an issue, I would also
>> assume that we would see the new page poison warning in many other places.
>>
>> The bug that Mikhails patch would fix only affects code that operates
>> on / iterates through memory (hotplug) blocks, and that does not happen
>> in many places, only in the two functions that his patch fixes.
> 
> Just to be clear, so memory is pageblock_nr_pages aligned, yet
> memory_block are larger and panic is still triggered?
> 
> I ask, because 3075M is not 128M aligned.
> 
>>
>> When you say "address this much earlier", do you mean changing the way
>> that free_area_init_core()/memmap_init() initialize struct pages, i.e.
>> have them not use zone->spanned_pages as limit, but rather align that
>> up to the memory block (not pageblock) boundary?
>>
> 
> This was my initial proposal, to fix memmap_init() and initialize struct
> pages beyond the "end", and before the "start" to cover the whole
> section. But, I think Michal suggested (and he might correct me) to
> simply ignore unaligned memory to section memory much earlier: so
> anything that does not align to sparse order is not added at all to the
> system.
> 

I tried both approaches but each of them has issues.

First I tried to ignore unaligned memory early by adjusting memory_end value. 
But the thing is that kernel mem parameter parsing and memory_end calculation 
take place in the architecture code and adjusting it afterwards in common code 
might be too late in my view. Also with this approach we might lose the memory 
up to the entire section(256Mb on s390) just because of unfortunate alignment.

Another approach was "to fix memmap_init() and initialize struct pages beyond 
the end". Since struct pages are allocated section-wise we can try to
round the size parameter passed to the memmap_init() function up to the 
section boundary thus forcing the mapping initialization for the entire 
section. But then it leads to another VM_BUG_ON panic due to zone_spans_pfn() 
sanity check triggered for the first page of each page block from 
set_pageblock_migratetype() function.
     page dumped because: VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn))
      Call Trace: 

      ([<00000000003013f8>] set_pfnblock_flags_mask+0xe8/0x140) 

       [<00000000003014aa>] set_pageblock_migratetype+0x5a/0x70 

       [<0000000000bef706>] memmap_init_zone+0x25e/0x2e0 

       [<00000000010fc3d8>] free_area_init_node+0x530/0x558 

       [<00000000010fcf02>] free_area_init_nodes+0x81a/0x8f0 

       [<00000000010e7fdc>] paging_init+0x124/0x130 

       [<00000000010e4dfa>] setup_arch+0xbf2/0xcc8 

       [<00000000010de9e6>] start_kernel+0x7e/0x588 

       [<000000000010007c>] startup_continue+0x7c/0x300 

      INFO: lockdep is turned off. 

      Last Breaking-Event-Address: 

       [<00000000003013f8>] set_pfnblock_flags_mask+0xe8/0x140
We might ignore this check for the struct pages beyond the "end" but I'm not 
sure about further implications.
Why don't we stay for now with my original proposal fixing specific functions 
for memory hotplug sysfs handlers. Please, tell me what you think.

Thanks,
Mikhail Zaslonko

      parent reply	other threads:[~2018-10-24 19:29 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-10 12:35 Mikhail Zaslonko
2018-09-10 13:17 ` Michal Hocko
2018-09-10 13:46   ` Pasha Tatashin
2018-09-10 13:59     ` Michal Hocko
2018-09-10 14:11       ` Pasha Tatashin
2018-09-10 14:19         ` Michal Hocko
2018-09-10 14:32           ` Pasha Tatashin
2018-09-10 14:41             ` Michal Hocko
2018-09-10 15:26               ` Pasha Tatashin
2018-09-11  9:16                 ` Michal Hocko
2018-09-12 14:28                 ` Gerald Schaefer
2018-09-11 14:08     ` Zaslonko Mikhail
2018-09-11 14:06   ` Zaslonko Mikhail
2018-09-12 12:21     ` Michal Hocko
2018-09-12 13:03   ` Gerald Schaefer
2018-09-12 13:39     ` Michal Hocko
2018-09-12 14:27       ` Gerald Schaefer
2018-09-12 14:40         ` Pasha Tatashin
2018-09-12 15:51           ` Gerald Schaefer
2018-10-24 19:28           ` Zaslonko Mikhail [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=83c1d934-c74c-c469-8472-de21ebbf6e46@linux.bm.com \
    --to=zaslonko@linux.bm.com \
    --cc=Pavel.Tatashin@microsoft.com \
    --cc=akpm@linux-foundation.org \
    --cc=gerald.schaefer@de.ibm.com \
    --cc=heiko.carstens@de.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=schwidefsky@de.ibm.com \
    --cc=zaslonko@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox