From: Pavel Tatashin <pasha.tatashin@oracle.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
tglx@linutronix.de, willy@infradead.org, mingo@redhat.com,
axboe@kernel.dk, gregkh@linuxfoundation.org, davem@davemloft.net,
viro@zeniv.linux.org.uk, airlied@gmail.com,
Tejun Heo <tj@kernel.org>,
tytso@google.com, snitzer@redhat.com,
Linux Memory Management List <linux-mm@kvack.org>,
neelx@redhat.com, mgorman@techsingularity.net
Subject: Re: Instability in current -git tree
Date: Fri, 13 Jul 2018 23:03:43 -0400 [thread overview]
Message-ID: <CAGM2rebeo3UUo2bL6kXCMGhuM36wjF5CfvqGG_3rpCfBs5S2wA@mail.gmail.com> (raw)
In-Reply-To: <CA+55aFwPAwczHS3XKkEnjY02PaDf2mWrcqx_hket4Ce3nScsSg@mail.gmail.com>
On Fri, Jul 13, 2018 at 10:40 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Fri, Jul 13, 2018 at 5:47 PM Pavel Tatashin
> <pasha.tatashin@oracle.com> wrote:
> >
> > The commit intends to zero memmap (struct pages) for every hole in
> > e820 ranges by marking them reserved in memblock. Later
> > zero_resv_unavail() walks through memmap ranges and zeroes struct
> > pages for every page that is reserved, but does not have a physical
> > backing known by kernel.
>
> Ahh. That just looks incoredibly buggy.
>
> You can't just memset() the 'struct page' to zero after it's been set up.
That should not be happening, unless there is a bug.
zero_resv_unavail() is supposed be zeroing struct pages that were not
setup.
Struct pages are configured here:
free_area_init_nodes()
free_area_init_node()
free_area_init_core()
memmap_init()
memmap_init_zone()
if (pfn_valid(pfn)) <--- Actually few more checks:
early_pfn_valid(pfn), early_pfn_in_nid(pfn, nid)
__init_single_page(pfn_to_page(pfn))
mm_zero_struct_page(page);
set the other fields in struct page
zero_resv_unavail(); <-- is called at the end of free_area_init_nodes()
if (!pfn_valid(pfn))
mm_zero_struct_page(pfn_to_page(pfn)); <- So the idea that
we zero only the part of memmap that
was not initialized in __init_single_page().
We want to zero those struct pages so we do not have uninitialized
data accessed by various parts of the code that rounds down large
pages and access the first page in section without verifying that the
page is valid. The example of this is described in commit that
introduced zero_resv_unavail()
>
> That also zeroes page->flags, but page->flags contains things like the
> zone and node ID.
>
> That would explain the completely bogus "DMA" zone. That's not the
> real zone, it's just that page_zonenr() returns 0 because of an
> incorrect clearing of page->flags.
>
> And it would also completely bugger pfn_to_page() for
> CONFIG_DISCONTIGMEM, because the way that works is that it looks up
> the node using page_to_nid(), and then looks up the pfn by using the
> per-node pglist_data ->node_mem_map (that the 'struct page' is
> supposed to be a pointer into).
>
> So zerong the page->flags after it has been set up is completely wrong
> as far as I can see. It literally invalidates 'struct page' and makes
> various core VM function assumptions stop working.
>
> As an example, it makes "page_to_pfn -> pfn_to_page" not be the
> identity transformation, which could result in totally random
> behavior.
>
> And it definitely explains the whole "oh, now the zone ID doesn't
> match" issue. It &*should* have been zone #1 ("DMA32"), but it got
> cleared and that made it zone #0 ("DMA").
>
> So yeah, this looks like the cause of it. And it could result in any
> number of really odd problems, so this could easily explain the syzbot
> failures and reboots at bootup too. Who knows what happens when
> pfn_to_page() doesn't work any more.
>
> Should we perhaps just revert
>
> 124049decbb1 x86/e820: put !E820_TYPE_RAM regions into memblock.reserved
> f7f99100d8d9 mm: stop zeroing memory during allocation in vmemmap
>
> it still reverts fairly cleanly (there's a trivial conflict with the
> older commit).
>
> Linus
>
next prev parent reply other threads:[~2018-07-14 3:04 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CA+55aFyARQV302+mXNYznrOOjzW+yxbcv+=OkD43dG6G1ktoMQ@mail.gmail.com>
[not found] ` <alpine.DEB.2.21.1807140031440.2644@nanos.tec.linutronix.de>
[not found] ` <CA+55aFzBx1haeM2QSFvhaW2t_HVK78Y=bKvsiJmOZztwkZ-y7Q@mail.gmail.com>
[not found] ` <CA+55aFzVGa57apuzDMBLgWQQRcm3BNBs1UEg-G_2o7YW1i=o2Q@mail.gmail.com>
[not found] ` <CA+55aFy9NJZeqT7h_rAgbKUZLjzfxvDPwneFQracBjVhY53aQQ@mail.gmail.com>
2018-07-13 23:48 ` Andrew Morton
2018-07-13 23:51 ` Linus Torvalds
2018-07-13 23:58 ` Andrew Morton
2018-07-14 0:19 ` Pavel Tatashin
2018-07-14 0:28 ` Linus Torvalds
2018-07-14 0:46 ` Pavel Tatashin
2018-07-14 2:40 ` Linus Torvalds
2018-07-14 3:03 ` Pavel Tatashin [this message]
2018-07-14 3:25 ` Linus Torvalds
2018-07-14 3:28 ` Pavel Tatashin
2018-07-14 13:39 ` Pavel Tatashin
2018-07-14 17:11 ` Linus Torvalds
2018-07-14 17:29 ` Linus Torvalds
2018-07-16 12:06 ` Michal Hocko
2018-07-16 12:09 ` Pavel Tatashin
2018-07-16 12:29 ` Michal Hocko
2018-07-16 13:26 ` Pavel Tatashin
2018-07-16 14:12 ` Michal Hocko
2018-07-16 13:39 ` Oscar Salvador
2018-07-14 3:04 ` Linus Torvalds
2018-07-14 0:20 ` Linus Torvalds
2018-07-14 9:28 ` Ard Biesheuvel
2018-07-17 2:59 ` Ard Biesheuvel
2018-07-17 3:14 ` Ard Biesheuvel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAGM2rebeo3UUo2bL6kXCMGhuM36wjF5CfvqGG_3rpCfBs5S2wA@mail.gmail.com \
--to=pasha.tatashin@oracle.com \
--cc=airlied@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=axboe@kernel.dk \
--cc=davem@davemloft.net \
--cc=gregkh@linuxfoundation.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@techsingularity.net \
--cc=mingo@redhat.com \
--cc=neelx@redhat.com \
--cc=snitzer@redhat.com \
--cc=tglx@linutronix.de \
--cc=tj@kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=tytso@google.com \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox