Subject: Re: [PATCH v3] arm64: mm: Fix NOMAP page initialization
References: <20161216165437.21612-1-rrichter@cavium.com>
From: Hanjun Guo
Date: Thu, 5 Jan 2017 10:03:48 +0800
To: Ard Biesheuvel, Robert Richter
Cc: Russell King, Catalin Marinas, Will Deacon, David Daney, Mark Rutland, James Morse, Yisheng Xie, "linux-arm-kernel@lists.infradead.org", "linux-kernel@vger.kernel.org", "linux-mm@kvack.org"

On 2017/1/4 21:56, Ard Biesheuvel wrote:
> On 16 December 2016 at 16:54, Robert Richter wrote:
>> On ThunderX systems with certain memory configurations we see the
>> following BUG_ON():
>>
>>  kernel BUG at mm/page_alloc.c:1848!
>>
>> This happens for some configs with 64k page size enabled. The BUG_ON()
>> checks if start and end page of a memmap range belongs to the same
>> zone.
>>
>> The BUG_ON() check fails if a memory zone contains NOMAP regions. In
>> this case the node information of those pages is not initialized. This
>> causes an inconsistency of the page links with wrong zone and node
>> information for that pages. NOMAP pages from node 1 still point to the
>> mem zone from node 0 and have the wrong nid assigned.
>>
>> The reason for the mis-configuration is a change in pfn_valid() which
>> reports pages marked NOMAP as invalid:
>>
>>  68709f45385a arm64: only consider memblocks with NOMAP cleared for linear mapping
>>
>> This causes pages marked as nomap being no longer reassigned to the
>> new zone in memmap_init_zone() by calling __init_single_pfn().
>>
>> Fixing this by implementing an arm64 specific early_pfn_valid(). This
>> causes all pages of sections with memory including NOMAP ranges to be
>> initialized by __init_single_page() and ensures consistency of page
>> links to zone, node and section.
>>
>
> I like this solution a lot better than the first one, but I am still
> somewhat uneasy about having the kernel reason about attributes of
> pages it should not touch in the first place. But the fact that
> early_pfn_valid() is only used a single time in the whole kernel does
> give some confidence that we are not simply moving the problem
> elsewhere.
>
> Given that you are touching arch/arm/ as well as arch/arm64, could you
> explain why only arm64 needs this treatment? Is it simply because we
> don't have NUMA support there?
>
> Considering that Hisilicon D05 suffered from the same issue, I would
> like to get some coverage there as well. Hanjun, is this something you
> can arrange? Thanks

Sure, we will test this patch with LTP MM stress test (which triggers
the bug on D05), and give the feedback.

Thanks
Hanjun
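
For readers following the thread, the failure described in the quoted patch text can be illustrated with a small userspace model. This is only a sketch, not kernel code: every toy_* name, the 8-page layout and the choice of pfns 5 and 6 as NOMAP are made up for illustration, and the model of the proposed early_pfn_valid() (accept any pfn that has memory) is an assumption based on the patch description, not on the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8

/* Toy stand-in for struct page: only the node link matters here. */
struct toy_page {
    int nid;
};

/* Hypothetical layout: every pfn below NPAGES has memory, 5 and 6 are NOMAP. */
static bool toy_has_memory(size_t pfn) { return pfn < NPAGES; }
static bool toy_is_nomap(size_t pfn) { return pfn == 5 || pfn == 6; }

/* Models pfn_valid() after 68709f45385a: NOMAP pages are reported invalid. */
static bool toy_pfn_valid(size_t pfn)
{
    return toy_has_memory(pfn) && !toy_is_nomap(pfn);
}

/* Assumed model of the proposed early_pfn_valid(): NOMAP pages still count. */
static bool toy_early_pfn_valid(size_t pfn)
{
    return toy_has_memory(pfn);
}

/* Models the memmap_init_zone() loop: pages failing the check are skipped. */
static void toy_memmap_init(struct toy_page *map, size_t start, size_t end,
                            int nid, bool (*valid)(size_t))
{
    assert(start <= end && end <= NPAGES);
    for (size_t pfn = start; pfn < end; pfn++) {
        if (!valid(pfn))
            continue; /* page keeps whatever node link it had before */
        map[pfn].nid = nid;
    }
}

/* Counts pages in [start, end) whose node link differs from nid. */
static size_t toy_count_stale(const struct toy_page *map, size_t start,
                              size_t end, int nid)
{
    size_t stale = 0;
    for (size_t pfn = start; pfn < end; pfn++)
        if (map[pfn].nid != nid)
            stale++;
    return stale;
}
```

Starting from a map where every page points at node 0, calling toy_memmap_init(map, 4, 8, 1, toy_pfn_valid) leaves the two NOMAP pages on node 0, which is the inconsistency the BUG_ON() trips over. Repeating the call with toy_early_pfn_valid reassigns them as well.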