Date: Wed, 8 Dec 2021 08:54:21 +0100
From: Michal Hocko
To: Nico Pache
Cc: Andrew Morton, linux-kernel@vger.kernel.org, linux-mm@kvack.org, shakeelb@google.com, ktkhai@virtuozzo.com, shy828301@gmail.com, guro@fb.com, vbabka@suse.cz, vdavydov.dev@gmail.com, raquini@redhat.com, david@redhat.com
Subject: Re: [PATCH v2 1/1] mm/vmscan.c: Prevent allocating shrinker_info on offlined nodes
In-Reply-To: <4c4b4db2-27b9-6001-5bae-ccc500695b42@redhat.com>

On Tue 07-12-21 19:40:33, Nico Pache wrote:
>
>
> On 12/7/21 18:44, Andrew Morton wrote:
> > On Tue, 7 Dec 2021 17:40:13 -0500 Nico Pache wrote:
> >
> >> We have run into a panic caused by a shrinker allocation being attempted
> >> on an offlined node.
> >>
> >> Our crash analysis has determined that the issue originates from trying
> >> to allocate pages on an offlined node in expand_one_shrinker_info. This
> >> function makes the incorrect assumption that we can allocate on any node.
> >> To correct this we make sure the node is online before attempting an
> >> allocation. If it is not online choose the closest node.
> >
> > This isn't fully accurate, is it? We could allocate on a node which is
> > presently offline but which was previously onlined, by testing
> > NODE_DATA(nid).
>
> Thanks for the review! I took your changes below into consideration for my V3.
>
> My knowledge of offlined/onlined nodes is quite limited but after looking into
> it it doesn't seem like anything clears the state of NODE_DATA(nid) after a
> try_offline_node is attempted. So theoretically the panic we saw would not
> happen. What is the expected behavior of trying to allocate a page on an offline
> node?

The allocation falls back (in zonelist order) to another node. If
__GFP_THISNODE is specified, the allocation simply fails.
-- 
Michal Hocko
SUSE Labs
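
To make the fallback rule above concrete, here is a small user-space toy
model, not kernel code. Every toy_* name, the per-node fallback table and the
TOY_GFP_THISNODE flag are hypothetical stand-ins invented for illustration.
The real page allocator walks zonelists of zones, not a table of nodes, so
treat this only as a sketch of the two outcomes: fall back to the next node
in order, or fail when the caller insisted on one node.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_NODES    4
#define TOY_NO_NODE      (-1)
#define TOY_GFP_THISNODE 0x1u   /* hypothetical stand-in for __GFP_THISNODE */

/* Hypothetical, heavily simplified view of a NUMA node. */
struct toy_node {
    bool online;
    int  free_pages;
};

/*
 * Hypothetical fallback order per node, nearest first.
 * Entry 0 of each row is the requested node itself.
 */
static const int toy_zonelist[TOY_MAX_NODES][TOY_MAX_NODES] = {
    {0, 1, 2, 3},
    {1, 0, 3, 2},
    {2, 3, 0, 1},
    {3, 2, 1, 0},
};

/*
 * Try to take one page for node `nid`.
 * Returns the node that satisfied the request, or TOY_NO_NODE on failure.
 */
int toy_alloc_page(struct toy_node *nodes, int nid, unsigned int flags)
{
    assert(nid >= 0 && nid < TOY_MAX_NODES);

    for (int i = 0; i < TOY_MAX_NODES; i++) {
        int cand = toy_zonelist[nid][i];

        if (nodes[cand].online && nodes[cand].free_pages > 0) {
            nodes[cand].free_pages--;
            return cand;
        }

        /* With the "this node only" flag, never look past the first entry. */
        if (flags & TOY_GFP_THISNODE)
            break;
    }
    return TOY_NO_NODE;
}
```

In this model, a request for an offline node 0 with no flags is served by
the next online node in node 0's list that still has free pages, while the
same request with TOY_GFP_THISNODE returns TOY_NO_NODE.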