From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 72FD2C433F5 for ; Mon, 6 Dec 2021 21:29:09 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0103C6B0072; Mon, 6 Dec 2021 16:28:59 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id EFF4D6B0083; Mon, 6 Dec 2021 16:28:58 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DC81C6B0085; Mon, 6 Dec 2021 16:28:58 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (relay030.a.hostedemail.com [64.99.140.30]) by kanga.kvack.org (Postfix) with ESMTP id CCF446B0072 for ; Mon, 6 Dec 2021 16:28:58 -0500 (EST) Received: from smtpin01.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 8BFA7208E3 for ; Mon, 6 Dec 2021 21:28:48 +0000 (UTC) X-FDA: 78888659136.01.ABD17BF Received: from mail-ed1-f52.google.com (mail-ed1-f52.google.com [209.85.208.52]) by imf17.hostedemail.com (Postfix) with ESMTP id 25D18F0001CD for ; Mon, 6 Dec 2021 21:28:47 +0000 (UTC) Received: by mail-ed1-f52.google.com with SMTP id r25so48352309edq.7 for ; Mon, 06 Dec 2021 13:28:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=UrzG1F2FOa3VtmEiI4VOyaEzgpwlPGnaXDRCIF3mM/U=; b=eSxgxzc1WnpZmWh38Ufecg6Ba+9/74i9ytV3nuEJx4uwWE05riNrdYZQocDSYDj7w/ DQgXLTlCPzP4KkQgJ2aFaS1js7tJ2rNT3D5EZs2WJEmkgId4Q3cBF4PYvTUyxWA8QJuD 5vyaBVsqTIGR3RHGmECMC1yjjtWXHZk4waYtRvXF0Po1GeaBHq8X98mm6TgtgZyW7kWv v60mIlb9tYppOiYyFqXxFDI0+tbUZu6CeKdflEEscBIx62oOhtbWN04Z8+Hl23gLra0Y R/79O4vVCm4ran9ZZxgSoRrntvYd/ucXGuLv+E4FO1+WoSRTGXMfx9zEgXRupiI9Zqiw pv+A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=UrzG1F2FOa3VtmEiI4VOyaEzgpwlPGnaXDRCIF3mM/U=; b=ahmmZZp8D4PzL9NX940ktdzhQz32qJcodKAqBO5eacp1aEhK0oaEoUPeK6sFhXUfp1 hnvnM+SAD2gCGgD3giK+Y+hXY8K9ZDVe+2LLlGONYXOON/LWLcqNtcuqt+29Ibko2obW 1HSZdW4x/dUm6xxgMxpuXxCFyRcjUmouSbSqMaanelDpPllRo9Q7SZ4n9EOycDdSvW1L NWqpqCIoNKAuH7XcNZm5ryesNpVin2kUZXlcbYK6gL7kFsm64PZF0STcgg+udBMpa4cl AMVGDLapjm2hy/xFd+2qrXGWSnNYM98Cc2hlEmKZ3zKp+n3ygk+ylilVLx0TF4RvvY3p +IgA== X-Gm-Message-State: AOAM5316FkxVtw4IAM7tmsQtD3U6MWrmSyGju9awifvgJ/Ij3vNXkkvS jpalyOd0zEept4essXOHuRli80RQMVb7PnClPEg= X-Google-Smtp-Source: ABdhPJyADNf6T4Ies4DR3lGsTriyK7xqtDItPSxSe1FWodoGiJVmcPcC2cedfVtcVGj+RKEYxjYtlzbcXbYUmhRpT30= X-Received: by 2002:a17:906:538d:: with SMTP id g13mr49050917ejo.62.1638826126701; Mon, 06 Dec 2021 13:28:46 -0800 (PST) MIME-Version: 1.0 References: <20211206033338.743270-1-npache@redhat.com> <20211206033338.743270-3-npache@redhat.com> <24b4455c-aff9-ca9f-e29f-350833e7a0d1@virtuozzo.com> In-Reply-To: From: Yang Shi Date: Mon, 6 Dec 2021 13:28:34 -0800 Message-ID: Subject: Re: [RFC PATCH 2/2] mm/vmscan.c: Prevent allocating shrinker_info on offlined nodes To: David Hildenbrand Cc: Kirill Tkhai , Michal Hocko , Nico Pache , Linux Kernel Mailing List , Linux MM , Andrew Morton , Shakeel Butt , Roman Gushchin , Vlastimil Babka , Vladimir Davydov , raquini@redhat.com Content-Type: text/plain; charset="UTF-8" Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=gmail.com header.s=20210112 header.b=eSxgxzc1; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf17.hostedemail.com: domain of shy828301@gmail.com designates 209.85.208.52 as permitted sender) smtp.mailfrom=shy828301@gmail.com X-Rspamd-Server: rspam08 X-Rspamd-Queue-Id: 25D18F0001CD X-Stat-Signature: kq1m4r79ho471xw3fbkp663i1g4qf7s8 X-HE-Tag: 1638826127-458653 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Mon, Dec 6, 2021 at 11:01 AM David Hildenbrand wrote: > > On 06.12.21 19:42, Yang Shi wrote: > > On Mon, Dec 6, 2021 at 5:19 AM Kirill Tkhai wrote: > >> > >> On 06.12.2021 13:45, David Hildenbrand wrote: > >>>> This doesn't seen complete. Slab shrinkers are used in the reclaim > >>>> context. Previously offline nodes could be onlined later and this would > >>>> lead to NULL ptr because there is no hook to allocate new shrinker > >>>> infos. This would be also really impractical because this would have to > >>>> update all existing memcgs... > >>> > >>> Instead of going through the trouble of updating... > >>> > >>> ... maybe just keep for_each_node() and check if the target node is > >>> offline. If it's offline, just allocate from the first online node. > >>> After all, we're not using __GFP_THISNODE, so there are no guarantees > >>> either way ... > >> > >> Hm, can't we add shrinker maps allocation to __try_online_node() in addition > >> to this patch? > > > > I think the below fix (an example, doesn't cover all affected > > callsites) should be good enough for now? It doesn't touch the hot > > path of the page allocator. > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c > > index fb9584641ac7..1252a33f7c28 100644 > > --- a/mm/vmscan.c > > +++ b/mm/vmscan.c > > @@ -222,13 +222,15 @@ static int expand_one_shrinker_info(struct > > mem_cgroup *memcg, > > int size = map_size + defer_size; > > > > for_each_node(nid) { > > + int tmp = nid; > > pn = memcg->nodeinfo[nid]; > > old = shrinker_info_protected(memcg, nid); > > /* Not yet online memcg */ > > if (!old) > > return 0; > > - > > - new = kvmalloc_node(sizeof(*new) + size, GFP_KERNEL, nid); > > + if (!node_online(nid)) > > + tmp = -1; > > + new = kvmalloc_node(sizeof(*new) + size, GFP_KERNEL, tmp); > > if (!new) > > return -ENOMEM; > > > > It used to use kvmalloc instead of kvmalloc_node(). The commit > > 86daf94efb11d7319fbef5e480018c4807add6ef ("mm/memcontrol.c: allocate > > shrinker_map on appropriate NUMA node") changed to use *_node() > > version. The justification was that "kswapd is always bound to > > specific node. So allocate shrinker_map from the related NUMA node to > > respect its NUMA locality." There is no kswapd for offlined node, so > > just allocate shrinker info on node 0. This is also what > > alloc_mem_cgroup_per_node_info() does. > > Yes, that's what I refer to as fixing it in the caller -- similar to > [1]. Michals point is to not require such node_online() checks at all, > neither in the caller nor in the buddy. > > I see 2 options short-term > > 1) What we have in [1]. > 2) What I proposed in [2], fixing it for all such instances until we > have something better. > > Long term I tend to agree that what Michal proposes is better. > > Short term I tend to like [2], because it avoids having to mess with all > such instances to eventually get it right and the temporary overhead > until we have the code reworked should be really negligible ... Thanks, David. Basically either option looks fine to me. But I'm a little bit concerned about [2]. It silently changes the node requested by the callers. It actually papers over potential bugs? And what if the callers specify __GFP_THISNODE (I didn't search if such callers really exist in the current code)? How's about a helper function, for example, called kvmalloc_best_node()? It does: void * kvmalloc_best_node(unsigned long size, int flag, int nid) { bool onlined = node_online(nid); WARN_ON_ONCE((flag & __GFP_THISNODE) && !onlined); if (!onlined) nid = -1; return kvmalloc_node(size, GFP_xxx, nid); } > > > > [1] https://lkml.kernel.org/r/20211108202325.20304-1-amakhalov@vmware.com > [2] > https://lkml.kernel.org/r/51c65635-1dae-6ba4-daf9-db9df0ec35d8@redhat.com > > -- > Thanks, > > David / dhildenb >