From: Tianxianting
To: Michal Hocko
CC: "cl@linux.com", "penberg@kernel.org", "rientjes@google.com", "iamjoonsoo.kim@lge.com", "akpm@linux-foundation.org", "linux-mm@kvack.org", "linux-kernel@vger.kernel.org", "kuba@kernel.org", "alexei.starovoitov@gmail.com"
Subject: RE: [PATCH] mm: Make allocator take care of memoryless numa node
Date: Sun, 18 Oct 2020 14:18:37 +0000
Message-ID: <10ae851702e346369db44e1ec9c830fb@h3c.com>
References: <20201012082739.15661-1-tian.xianting@h3c.com> <20201012150554.GE29725@dhcp22.suse.cz>
In-Reply-To: <20201012150554.GE29725@dhcp22.suse.cz>

Thanks for the comments.
I found that in the current code there are two places that call
local_memory_node(node) before calling kzalloc_node(). I think we can
remove them?

-----Original Message-----
From: Michal Hocko [mailto:mhocko@suse.com]
Sent: Monday, October 12, 2020 11:06 PM
To: tianxianting (RD)
Cc: cl@linux.com; penberg@kernel.org; rientjes@google.com; iamjoonsoo.kim@lge.com; akpm@linux-foundation.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org; kuba@kernel.org; alexei.starovoitov@gmail.com
Subject: Re: [PATCH] mm: Make allocator take care of memoryless numa node

On Mon 12-10-20 16:27:39, Xianting Tian wrote:
> In architecture like powerpc, we can have cpus without any local
> memory attached to it. In such cases the node does not have real memory.

Yes, this is normal (unfortunately).

> In many places of current kernel code, it doesn't judge whether the
> node is memoryless numa node before calling allocator interface.

And that is correct. It shouldn't make any assumption about the memory on
a given node because that memory might be depleted (similar to no memory)
or it can disappear at any moment because of memory offlining.

> This patch is to use local_memory_node(), which is guaranteed to have
> memory, in allocator interface. local_memory_node() is a noop in other
> architectures that don't support memoryless nodes.
>
> As the call path:
>	alloc_pages_node
>	    __alloc_pages_node
>	        __alloc_pages_nodemask
> and __alloc_pages_node,__alloc_pages_nodemask may be called directly,
> so only add local_memory_node() in __alloc_pages_nodemask.

The page allocator should deal with memoryless nodes just fine. It has
zonelists constructed for each possible node, and it will automatically
fall back to the node which is closest to the requested node.
local_memory_node might be an incorrect choice from the topology POV.

What kind of problem are you trying to fix?

> Signed-off-by: Xianting Tian
> ---
>  include/linux/slab.h |  3 +++
>  mm/page_alloc.c      |  1 +
>  mm/slab.c            |  6 +++++-
>  mm/slob.c            |  1 +
>  mm/slub.c            | 10 ++++++++--
>  5 files changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 24df2393e..527e811e0 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -574,6 +574,7 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
>  						flags, node, size);
>  	}
>  #endif
> +	node = local_memory_node(node);
>  	return __kmalloc_node(size, flags, node);
>  }
>
> @@ -626,6 +627,8 @@ static inline void *kmalloc_array_node(size_t n, size_t size, gfp_t flags,
>  		return NULL;
>  	if (__builtin_constant_p(n) && __builtin_constant_p(size))
>  		return kmalloc_node(bytes, flags, node);
> +
> +	node = local_memory_node(node);
>  	return __kmalloc_node(bytes, flags, node);
>  }
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6866533de..be63c62c2 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4878,6 +4878,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
>  		return NULL;
>  	}
>
> +	preferred_nid = local_memory_node(preferred_nid);
>  	gfp_mask &= gfp_allowed_mask;
>  	alloc_mask = gfp_mask;
>  	if (!prepare_alloc_pages(gfp_mask, order, preferred_nid, nodemask, &ac, &alloc_mask, &alloc_flags))
> diff --git a/mm/slab.c b/mm/slab.c
> index f658e86ec..263c2f2e1 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -3575,7 +3575,10 @@ EXPORT_SYMBOL(kmem_cache_alloc_trace);
>   */
>  void *kmem_cache_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid)
>  {
> -	void *ret = slab_alloc_node(cachep, flags, nodeid, _RET_IP_);
> +	void *ret;
> +
> +	nodeid = local_memory_node(nodeid);
> +	ret = slab_alloc_node(cachep, flags, nodeid, _RET_IP_);
>
>  	trace_kmem_cache_alloc_node(_RET_IP_, ret,
>  				    cachep->object_size, cachep->size,
> @@ -3593,6 +3596,7 @@ void *kmem_cache_alloc_node_trace(struct kmem_cache *cachep,
>  {
>  	void *ret;
>
> +	nodeid = local_memory_node(nodeid);
>  	ret = slab_alloc_node(cachep, flags, nodeid, _RET_IP_);
>
>  	ret = kasan_kmalloc(cachep, ret, size, flags);
> diff --git a/mm/slob.c b/mm/slob.c
> index 7cc9805c8..1f1c25e06 100644
> --- a/mm/slob.c
> +++ b/mm/slob.c
> @@ -636,6 +636,7 @@ EXPORT_SYMBOL(__kmalloc_node);
>
>  void *kmem_cache_alloc_node(struct kmem_cache *cachep, gfp_t gfp, int node)
>  {
> +	node = local_memory_node(node);
>  	return slob_alloc_node(cachep, gfp, node);
>  }
>  EXPORT_SYMBOL(kmem_cache_alloc_node);
> diff --git a/mm/slub.c b/mm/slub.c
> index 6d3574013..6e5e12b04 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2921,7 +2921,10 @@ EXPORT_SYMBOL(kmem_cache_alloc_trace);
>  #ifdef CONFIG_NUMA
>  void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t gfpflags, int node)
>  {
> -	void *ret = slab_alloc_node(s, gfpflags, node, _RET_IP_);
> +	void *ret;
> +
> +	node = local_memory_node(node);
> +	ret = slab_alloc_node(s, gfpflags, node, _RET_IP_);
>
>  	trace_kmem_cache_alloc_node(_RET_IP_, ret,
>  				    s->object_size, s->size, gfpflags, node);
> @@ -2935,7 +2938,10 @@ void *kmem_cache_alloc_node_trace(struct kmem_cache *s,
>  				    gfp_t gfpflags,
>  				    int node, size_t size)
>  {
> -	void *ret = slab_alloc_node(s, gfpflags, node, _RET_IP_);
> +	void *ret;
> +
> +	node = local_memory_node(node);
> +	ret = slab_alloc_node(s, gfpflags, node, _RET_IP_);
>
>  	trace_kmalloc_node(_RET_IP_, ret,
>  			   size, s->size, gfpflags, node);
> --
> 2.17.1
>

--
Michal Hocko
SUSE Labs
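
The fallback behaviour described in the reply (per-node zonelists, with an
automatic fall back to the closest node that has memory) can be illustrated
with a small userspace sketch. This is a toy model, not kernel code: the
names toy_node, toy_distance, toy_build_fallback_order and toy_alloc_page
are hypothetical, the distance table is made up, and the real allocator
works on zones and builds its lists ahead of time rather than on every
allocation.

```c
#include <assert.h>

#define TOY_MAX_NODES 4

/* Hypothetical stand-in for a NUMA node: only tracks free pages. */
struct toy_node {
	long free_pages;
};

/* Made-up distance table (smaller means closer); nodes 0/1 and 2/3 pair up. */
static const int toy_distance[TOY_MAX_NODES][TOY_MAX_NODES] = {
	{ 10, 20, 40, 40 },
	{ 20, 10, 40, 40 },
	{ 40, 40, 10, 20 },
	{ 40, 40, 20, 10 },
};

/*
 * Fill order[] with all node ids sorted by distance from the preferred
 * node, nearest first. Ties go to the lower node id.
 */
static void toy_build_fallback_order(int preferred, int order[TOY_MAX_NODES])
{
	int used[TOY_MAX_NODES] = { 0 };

	for (int i = 0; i < TOY_MAX_NODES; i++) {
		int best = -1;

		for (int n = 0; n < TOY_MAX_NODES; n++) {
			if (used[n])
				continue;
			if (best < 0 ||
			    toy_distance[preferred][n] < toy_distance[preferred][best])
				best = n;
		}
		used[best] = 1;
		order[i] = best;
	}
}

/*
 * Take one page from the nearest node that still has free pages.
 * Returns the node id that served the request, or -1 if all are empty.
 * A memoryless or depleted preferred node is simply skipped.
 */
static int toy_alloc_page(struct toy_node nodes[TOY_MAX_NODES], int preferred)
{
	int order[TOY_MAX_NODES];

	assert(preferred >= 0 && preferred < TOY_MAX_NODES);
	toy_build_fallback_order(preferred, order);

	for (int i = 0; i < TOY_MAX_NODES; i++) {
		int n = order[i];

		if (nodes[n].free_pages > 0) {
			nodes[n].free_pages--;
			return n;
		}
	}
	return -1;
}
```

In this model a caller never needs to translate a memoryless node id up
front: with node 2 holding no pages, a request that prefers node 2 is served
from node 3 (its nearest neighbour in the toy table) and moves on to nodes 0
and 1 only once node 3 is depleted.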