From: Pavel Tatashin <pasha.tatashin@oracle.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	tglx@linutronix.de, willy@infradead.org, mingo@redhat.com,
	axboe@kernel.dk, gregkh@linuxfoundation.org, davem@davemloft.net,
	viro@zeniv.linux.org.uk, Dave Airlie <airlied@gmail.com>,
	Tejun Heo <tj@kernel.org>, Theodore Tso <tytso@google.com>,
	snitzer@redhat.com,
	Linux Memory Management List <linux-mm@kvack.org>,
	neelx@redhat.com, mgorman@techsingularity.net
Subject: Re: Instability in current -git tree
Date: Sat, 14 Jul 2018 09:39:29 -0400
Message-ID: <CAGM2reb2Zk6t=QJtJZPRGwovKKR9bdm+fzgmA_7CDVfDTjSgKA@mail.gmail.com>
In-Reply-To: <CA+55aFxetyCqX2EzFBDdHtriwt6UDYcm0chHGQUdPX20qNHb4Q@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2098 bytes --]

Hi Linus,

I have attached a temporary fix. I was unable to reproduce the problem,
so the patch is untested, but it should fix the issue.

Reverting "f7f99100d8d9 mm: stop zeroing memory during allocation in
vmemmap" would introduce a significant boot performance regression, as
we would zero the whole memmap twice during boot.

Later, I will introduce a more detailed fix that will get rid of
zero_resv_unavail() entirely, and instead will zero skipped struct
pages in memmap_init_zone(), where it should be done.

Thank you,
Pavel

On Fri, Jul 13, 2018 at 11:25 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Fri, Jul 13, 2018 at 8:04 PM Pavel Tatashin
> <pasha.tatashin@oracle.com> wrote:
> >
> > > You can't just memset() the 'struct page' to zero after it's been set up.
> >
> > That should not be happening, unless there is a bug.
>
> Well, it does seem to happen. My memory stress-tester has been running
> for about half an hour now with the revert I posted - it used to
> trigger the problem in maybe ~5 minutes before.
>
> So I do think that revert fixes it for me. No guarantees, but since I
> figured out how to trigger it, it's been fairly reliable.
>
> > We want to zero those struct pages so we do not have uninitialized
> > data accessed by various parts of the code that rounds down large
> > pages and access the first page in section without verifying that the
> > page is valid. The example of this is described in commit that
> > introduced zero_resv_unavail()
>
> I'm attaching the relevant (?) parts of dmesg, which has the node
> ranges, maybe you can see what the problem with the code is.
>
> (NOTE! This dmesg is with that "mem=6G" command line option, which causes that
>
>   e820: remove [mem 0x180000000-0xfffffffffffffffe] usable
>
> line - that's just because it's my stress-test boot. It happens with
> or without it, but without the "mem=6G" it took days to trigger).
>
> I'm more than willing to test patches (either for added information or
> for testing fixes), although I think I'm getting off the computer for
> today.
>
>                 Linus

[-- Attachment #2: 0001-mm-zero-unavailable-pages-before-memmap-init.patch --]
[-- Type: text/x-patch, Size: 2245 bytes --]

From 95259841ef79cc17c734a994affa3714479753e3 Mon Sep 17 00:00:00 2001
From: Pavel Tatashin <pasha.tatashin@oracle.com>
Date: Sat, 14 Jul 2018 09:15:07 -0400
Subject: [PATCH] mm: zero unavailable pages before memmap init

We must zero struct pages for ranges that are not backed by physical
memory, or that the kernel does not have access to.

Recently, there was a change which zeroed all memmap for all holes in e820.
Unfortunately, it introduced a bug that is discussed here:

https://www.spinics.net/lists/linux-mm/msg156764.html

Linus also saw this bug on his machine, and confirmed that reverting
commit 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into memblock.reserved")
fixes the issue.

The problem is that we incorrectly zero some struct pages after they have
been set up.

The fix is to zero unavailable struct pages before the struct pages are
initialized.

A more detailed fix should come later that would avoid double zeroing
cases: one in __init_single_page(), the other one in zero_resv_unavail().

Fixes: 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into memblock.reserved")

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
---
 mm/page_alloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1521100f1e63..5d800d61ddb7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6847,6 +6847,7 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
 	/* Initialise every node */
 	mminit_verify_pageflags_layout();
 	setup_nr_node_ids();
+	zero_resv_unavail();
 	for_each_online_node(nid) {
 		pg_data_t *pgdat = NODE_DATA(nid);
 		free_area_init_node(nid, NULL,
@@ -6857,7 +6858,6 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
 			node_set_state(nid, N_MEMORY);
 		check_for_memory(pgdat, nid);
 	}
-	zero_resv_unavail();
 }
 
 static int __init cmdline_parse_core(char *p, unsigned long *core,
@@ -7033,9 +7033,9 @@ void __init set_dma_reserve(unsigned long new_dma_reserve)
 
 void __init free_area_init(unsigned long *zones_size)
 {
+	zero_resv_unavail();
 	free_area_init_node(0, zones_size,
 			__pa(PAGE_OFFSET) >> PAGE_SHIFT, NULL);
-	zero_resv_unavail();
 }
 
 static int page_alloc_cpu_dead(unsigned int cpu)
-- 
2.18.0
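
The ordering issue the patch addresses can be illustrated with a small
userspace sketch. This is not kernel code: struct mock_page, the helper
names, and the page ranges are all hypothetical, and the overlap between
the "initialized" range and the "unavailable" range is simply assumed for
illustration. It only shows that zeroing after initialization clobbers
whatever the two ranges share, while zeroing first does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct page; not the real layout. */
struct mock_page {
	unsigned long flags;
	int refcount;
};

#define NR_PAGES		16
#define MOCK_PG_INITIALIZED	0x1UL

static struct mock_page memmap[NR_PAGES];

/* Plays the role of memmap initialization: sets up pages in [start, end). */
static void mock_init_range(size_t start, size_t end)
{
	assert(start <= end && end <= NR_PAGES);
	for (size_t i = start; i < end; i++) {
		memmap[i].flags = MOCK_PG_INITIALIZED;
		memmap[i].refcount = 1;
	}
}

/* Plays the role of zero_resv_unavail(): clears pages in [start, end). */
static void mock_zero_range(size_t start, size_t end)
{
	assert(start <= end && end <= NR_PAGES);
	memset(&memmap[start], 0, (end - start) * sizeof(memmap[0]));
}

/*
 * Runs both steps in the requested order and returns how many pages of
 * the initialized range [0, 12) are still correctly set up afterwards.
 * The "unavailable" range [8, 16) is assumed to overlap it.
 */
static size_t surviving_pages(int zero_first)
{
	size_t count = 0;

	/* Start from garbage, as the memmap is no longer zeroed at allocation. */
	memset(memmap, 0xff, sizeof(memmap));

	if (zero_first) {
		mock_zero_range(8, NR_PAGES);	/* order used by the patch */
		mock_init_range(0, 12);
	} else {
		mock_init_range(0, 12);		/* order before the patch */
		mock_zero_range(8, NR_PAGES);
	}

	for (size_t i = 0; i < 12; i++)
		if (memmap[i].flags == MOCK_PG_INITIALIZED &&
		    memmap[i].refcount == 1)
			count++;
	return count;
}
```

Calling surviving_pages(0) models the old order, where the overlapping
pages end up zeroed; surviving_pages(1) models the order after the patch,
where every page of the initialized range keeps its setup.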

