From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E6552C433EF for ; Mon, 6 Dec 2021 13:06:53 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4C09C6B007B; Mon, 6 Dec 2021 08:06:43 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 46FA76B007D; Mon, 6 Dec 2021 08:06:43 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3377D6B007E; Mon, 6 Dec 2021 08:06:43 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0081.hostedemail.com [216.40.44.81]) by kanga.kvack.org (Postfix) with ESMTP id 25E686B007B for ; Mon, 6 Dec 2021 08:06:43 -0500 (EST) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id DEA29182012B3 for ; Mon, 6 Dec 2021 13:06:32 +0000 (UTC) X-FDA: 78887393424.23.0644BCD Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by imf01.hostedemail.com (Postfix) with ESMTP id 49A3A40006 for ; Mon, 6 Dec 2021 13:06:32 +0000 (UTC) Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id 010FF212C5; Mon, 6 Dec 2021 13:06:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1638795991; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=9K7RftGjQ4l6lY7PF5hHrou9HVqN2l0LIhFrO5xCxwU=; b=cFv0qpNtVh2cPCHRk4PnwSEC5n8ocHd7vzCeqS0VGoOgyU+Kzhl7X7pjg+vdqRFIq8MLiV qzb7gpJ1oyvWMjwSwYI79OSpsA7iwkRdAFmt4Y0OcaE66+ihENUM2t0dhLrXkdJLFFxNT/ mYFJtO13EQiel8pCaDRHhQ34aBtDkZg= Received: from suse.cz (unknown [10.100.201.86]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id B2487A3B8F; Mon, 6 Dec 2021 13:06:30 +0000 (UTC) Date: Mon, 6 Dec 2021 14:06:27 +0100 From: Michal Hocko To: David Hildenbrand Cc: Nico Pache , linux-kernel@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, shakeelb@google.com, ktkhai@virtuozzo.com, shy828301@gmail.com, guro@fb.com, vbabka@suse.cz, vdavydov.dev@gmail.com, raquini@redhat.com Subject: Re: [RFC PATCH 2/2] mm/vmscan.c: Prevent allocating shrinker_info on offlined nodes Message-ID: References: <20211206033338.743270-1-npache@redhat.com> <20211206033338.743270-3-npache@redhat.com> <840cb3d0-61fe-b6cb-9918-69146ba06cf7@redhat.com> <51c65635-1dae-6ba4-daf9-db9df0ec35d8@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <51c65635-1dae-6ba4-daf9-db9df0ec35d8@redhat.com> X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: 49A3A40006 X-Stat-Signature: c9jqrngps4ekdf1kjgnkusd14j957ent Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=suse.com header.s=susede1 header.b=cFv0qpNt; spf=pass (imf01.hostedemail.com: domain of mhocko@suse.com designates 195.135.220.28 as permitted sender) smtp.mailfrom=mhocko@suse.com; dmarc=pass (policy=quarantine) header.from=suse.com X-HE-Tag: 1638795992-280106 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Mon 06-12-21 13:43:27, David Hildenbrand wrote: [...] > > Now practically speaking !node_online should not apear node_online (note > > I am attentionally avoiding to say offline and online as that has a > > completely different semantic) shouldn't really happen for some > > architectures. x86 should allocate pgdat for each possible node. I do > > not know what was the architecture in this case but we already have > > another report for x86 that remains unexplained. > > So we'd allocate the pgdat although all we want is just a zonelist. The > obvious alternative is to implement the fallback where reasonable -- for > example, in the page allocator. It knows the fallback order: > build_zonelists(). That's pretty much all we need the preferred_nid for. > > So just making prepare_alloc_pages()/node_zonelist() deal with a missing > pgdat could make sense as well. Something like: > > > diff --git a/include/linux/gfp.h b/include/linux/gfp.h > index b976c4177299..2d2649e78766 100644 > --- a/include/linux/gfp.h > +++ b/include/linux/gfp.h > @@ -508,9 +508,14 @@ static inline int gfp_zonelist(gfp_t flags) > * > * For the case of non-NUMA systems the NODE_DATA() gets optimized to > * &contig_page_data at compile-time. > + * > + * If the node does not have a pgdat yet, returns the zonelist of the > + * first online node. > */ > static inline struct zonelist *node_zonelist(int nid, gfp_t flags) > { > + if (unlikely(!NODE_DATA(nid))) > + nid = first_online_node; > return NODE_DATA(nid)->node_zonelists + gfp_zonelist(flags); > } This is certainly possible. But it a) adds a branch to the hotpath and b) it doesn't solve any other potential dereference of garbage node. > But of course, there might be value in a proper node-aware fallback list > as we have in build_zonelists() -- but it remains questionable if the > difference for these corner cases would be relevant in practice. Only the platform knows the proper node topology and that includes memory less nodes. So they should be setting up a node properly and we shouldn't be dealing with this at the allocator nor any other code. > Further, if we could have thousands of nodes, we'd have to update each > and every one when building zone lists ... Why would that be a practical problem? > Removing the hotadd_new_pgdat() stuff does sound appealing, though. Yes our hotplug code could be simplified as well. All we really need is an arch code to initialize all the possible nodes. -- Michal Hocko SUSE Labs