Subject: Re: [RFC 1/2] mm, slub: prevent kmalloc_node crashes and memory leaks
From: Vlastimil Babka
To: bharata@linux.ibm.com
Cc: linux-mm@kvack.org, Sachin Sant, Srikar Dronamraju, Mel Gorman,
 Michael Ellerman, Michal Hocko, Christopher Lameter,
 linuxppc-dev@lists.ozlabs.org, Joonsoo Kim, Pekka Enberg,
 David Rientjes, Kirill Tkhai, Nathan Lynch
Date: Wed, 18 Mar 2020 17:10:19 +0100
Message-ID: <148f1b95-86e7-b98a-1446-46ecb42f5610@suse.cz>
In-Reply-To: <20200318160610.GD26049@in.ibm.com>
References: <20200318144220.18083-1-vbabka@suse.cz> <20200318160610.GD26049@in.ibm.com>

On 3/18/20 5:06 PM, Bharata B Rao wrote:
> On Wed, Mar 18, 2020 at 03:42:19PM +0100, Vlastimil Babka wrote:
>> This is a PowerPC platform with the following NUMA topology:
>> 
>> available: 2 nodes (0-1)
>> node 0 cpus:
>> node 0 size: 0 MB
>> node 0 free: 0 MB
>> node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
>> node 1 size: 35247 MB
>> node 1 free: 30907 MB
>> node distances:
>> node   0   1
>>   0:  10  40
>>   1:  40  10
>> 
>> possible numa nodes: 0-31
>> 
>> A related issue was reported by Bharata [3], where a similar PowerPC
>> configuration, but without patch [2], ends up allocating large amounts of
>> pages in kmalloc-1k and kmalloc-512. This seems to have the same underlying
>> issue with node_to_mem_node() not behaving as expected, and might also
>> lead to an infinite loop with CONFIG_SLUB_CPU_PARTIAL.
> 
> This patch doesn't fix the issue of kmalloc caches consuming more
> memory for the above-mentioned topology. Also, CONFIG_SLUB_CPU_PARTIAL is
> set here and I have not observed the infinite loop so far.

OK, that means something is wrong with my analysis.

> Or, are you expecting your fix to work on top of Srikar's other patchset
> https://lore.kernel.org/linuxppc-dev/20200311110237.5731-1-srikar@linux.vnet.ibm.com/t/#u ?

No, I hoped it would work on mainline.

> With the above patchset, no fix is required to address the increased memory
> consumption of kmalloc caches, because that patchset prevents such a
> topology from occurring, thereby making it impossible for the problem to
> surface (or at least impossible for the specific topology that I mentioned).

Right, but I hope to fix it nevertheless.

>> diff --git a/mm/slub.c b/mm/slub.c
>> index 17dc00e33115..4d798cacdae1 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -1511,7 +1511,7 @@ static inline struct page *alloc_slab_page(struct kmem_cache *s,
>>  	struct page *page;
>>  	unsigned int order = oo_order(oo);
>>  
>> -	if (node == NUMA_NO_NODE)
>> +	if (node == NUMA_NO_NODE || !node_online(node))
>>  		page = alloc_pages(flags, order);
>>  	else
>>  		page = __alloc_pages_node(node, flags, order);
>> @@ -1973,8 +1973,6 @@ static void *get_partial(struct kmem_cache *s, gfp_t flags, int node,
>>  
>>  	if (node == NUMA_NO_NODE)
>>  		searchnode = numa_mem_id();
>> -	else if (!node_present_pages(node))
>> -		searchnode = node_to_mem_node(node);
> 
> We still come here with the memory-less node=0 (and not NUMA_NO_NODE), fail
> to find a partial slab, go back and allocate a new one, thereby continuously
> increasing the number of newly allocated slabs.
>> 
>>  	object = get_partial_node(s, get_node(s, searchnode), c, flags);
>>  	if (object || node != NUMA_NO_NODE)
>> @@ -2568,12 +2566,15 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
>>  redo:
>>  
>>  	if (unlikely(!node_match(page, node))) {
>> -		int searchnode = node;
>> -
>> -		if (node != NUMA_NO_NODE && !node_present_pages(node))
>> -			searchnode = node_to_mem_node(node);
>> -
>> -		if (unlikely(!node_match(page, searchnode))) {
>> +		/*
>> +		 * node_match() false implies node != NUMA_NO_NODE
>> +		 * but if the node is not online or has no pages, just
>> +		 * ignore the constraint
>> +		 */
>> +		if ((!node_online(node) || !node_present_pages(node))) {
>> +			node = NUMA_NO_NODE;
>> +			goto redo;
> 
> Many calls for allocating a slab object from the memory-less node 0 in my
> case don't even hit the above check, because they get short-circuited by
> the goto new_slab label present a few lines above. Hence I don't see any
> reduction in the amount of slab memory with this fix.

Thanks a lot for the info, I will try again :)

> Regards,
> Bharata.