Subject: Re: Instability in current -git tree
From: Pavel Tatashin
Date: Mon, 16 Jul 2018 08:09:19 -0400
To: Michal Hocko
Cc: Linus Torvalds, Andrew Morton, tglx@linutronix.de, willy@infradead.org, mingo@redhat.com, axboe@kernel.dk, gregkh@linuxfoundation.org, davem@davemloft.net, viro@zeniv.linux.org.uk, Dave Airlie, Tejun Heo, Theodore Tso, snitzer@redhat.com, Linux Memory Management List, neelx@redhat.com, mgorman@techsingularity.net
In-Reply-To: <20180716120642.GN17280@dhcp22.suse.cz>
References: <20180713164804.fc2c27ccbac4c02ca2c8b984@linux-foundation.org> <20180713165812.ec391548ffeead96725d044c@linux-foundation.org> <9b93d48c-b997-01f7-2fd6-6e35301ef263@oracle.com> <5edf2d71-f548-98f9-16dd-b7fed29f4869@oracle.com> <20180716120642.GN17280@dhcp22.suse.cz>

On 07/16/2018 08:06 AM, Michal Hocko wrote:
> On Sat 14-07-18 09:39:29, Pavel Tatashin wrote:
> [...]
>> From 95259841ef79cc17c734a994affa3714479753e3 Mon Sep 17 00:00:00 2001
>> From: Pavel Tatashin
>> Date: Sat, 14 Jul 2018 09:15:07 -0400
>> Subject: [PATCH] mm: zero unavailable pages before memmap init
>>
>> We must zero struct pages for memory that is not backed by physical memory,
>> or kernel does not have access to.
>>
>> Recently, there was a change which zeroed all memmap for all holes in e820.
>> Unfortunately, it introduced a bug that is discussed here:
>>
>> https://www.spinics.net/lists/linux-mm/msg156764.html
>>
>> Linus, also saw this bug on his machine, and confirmed that pulling
>> commit 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into memblock.reserved")
>> fixes the issue.
>>
>> The problem is that we incorrectly zero some struct pages after they were
>> setup.
>
> I am sorry but I simply do not see it. zero_resv_unavail should be
> touching only reserved memory ranges and those are not initialized
> anywhere. So who has reused them and put them to normal available
> memory to be initialized by free_area_init_node[s]?
>
> The patch itself should be safe because reserved and available memory
> ranges should be disjoint so the ordering shouldn't matter. The fact
> that it matters is the crux thing to understand and document. So the
> change looks good to me but I do not understand _why_ it makes any
> difference. There must be somebody to put (memblock) reserved memory
> available to the page allocator behind our backs.

That's exactly right, and I am also not sure why this is happening. There
must be some overlap between the ranges that just should not exist. I will
study it later.

For now, I need to figure out what is happening with the x86-32 failure
that is caused by my fix.

Pavel
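
The ordering argument in this thread can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: `toy_page`, `pfn_range`, `init_range`, `zero_range`, `ranges_overlap`, and `run_order` are hypothetical names invented here, and the model captures nothing more than "zeroing a reserved range" versus "initializing an available range" over a toy memmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NPAGES 16

/* Toy stand-in for struct page: only records whether it was set up. */
struct toy_page { int initialized; };

/* Half-open page frame range [start, end). */
struct pfn_range { size_t start, end; };

/* Models memmap init of an available range. */
static void init_range(struct toy_page *map, struct pfn_range r)
{
    for (size_t pfn = r.start; pfn < r.end; pfn++)
        map[pfn].initialized = 1;
}

/* Models zeroing the struct pages of a reserved/unavailable range. */
static void zero_range(struct toy_page *map, struct pfn_range r)
{
    for (size_t pfn = r.start; pfn < r.end; pfn++)
        memset(&map[pfn], 0, sizeof(map[pfn]));
}

/* True when the two half-open ranges share at least one pfn. */
static bool ranges_overlap(struct pfn_range a, struct pfn_range b)
{
    return a.start < b.end && b.start < a.end;
}

/*
 * Runs both steps in the requested order on a fresh toy memmap and
 * returns how many pages of the available range remain initialized.
 */
static size_t run_order(struct pfn_range avail, struct pfn_range resv,
                        bool zero_first)
{
    struct toy_page map[NPAGES];
    size_t count = 0;

    assert(avail.start <= avail.end && avail.end <= NPAGES);
    assert(resv.start <= resv.end && resv.end <= NPAGES);
    memset(map, 0, sizeof(map));

    if (zero_first) {
        zero_range(map, resv);
        init_range(map, avail);
    } else {
        init_range(map, avail);
        zero_range(map, resv);
    }

    for (size_t pfn = avail.start; pfn < avail.end; pfn++)
        count += map[pfn].initialized;
    return count;
}
```

With disjoint ranges, for example available [0, 8) and reserved [8, 12), both orders leave all 8 available pages initialized. With overlapping ranges, for example available [0, 8) and reserved [6, 10), zeroing first still leaves 8, but zeroing last leaves only 6. In this model, ordering can only matter when the ranges overlap, which is the situation that still needs explaining.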