From: Chen Jun
Subject: [PATCH] mm/slub: Reduce memory consumption in extreme scenarios
Date: Tue, 14 Mar 2023 12:34:03 +0000
Message-ID: <20230314123403.100158-1-chenjun102@huawei.com>

When kmalloc_node() is called without __GFP_THISNODE and the target node
lacks sufficient memory, SLUB allocates a new folio from a node other
than the requested one, instead of taking a partial slab from that node.

However, since the allocated folio does not belong to the requested
node, it is deactivated and added to the partial slab list of the node
it belongs to.

This behavior can result in excessive memory usage when the requested
node has insufficient memory, as SLUB will repeatedly allocate folios
from other nodes without reusing the previously allocated ones.
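The waste described above can be illustrated with a toy userspace
model. This is only a sketch under strong assumptions: the target node
is treated as exhausted from the first request, per-CPU slabs and
locking are ignored, and toy_slabs_used() and TOY_OBJS_PER_SLAB are
hypothetical names that do not exist in mm/slub.c. The value 32 is the
objects-per-slab column of the kmalloc-256 slabinfo line quoted in this
mail. The model does not reproduce the measured slabinfo figures.

```c
#include <stdbool.h>

/* Hypothetical constant: objects per slab, as in the kmalloc-256 line. */
#define TOY_OBJS_PER_SLAB 32u

/*
 * Count how many slabs a run of requests consumes on a remote node.
 *
 * reuse_remote_partial == false models the behaviour described above:
 * each request gets a fresh remote slab, and the free objects left in
 * the previous slab are never handed out again.
 *
 * reuse_remote_partial == true models an allocator that first looks
 * for a partially used remote slab before allocating a new one.
 */
static unsigned long toy_slabs_used(unsigned long requests,
				    bool reuse_remote_partial)
{
	unsigned long slabs = 0;
	unsigned int free_in_partial = 0;

	for (unsigned long i = 0; i < requests; i++) {
		if (reuse_remote_partial && free_in_partial > 0) {
			/* Serve the request from the existing partial slab. */
			free_in_partial--;
			continue;
		}
		/* Allocate a new slab and hand out one object from it. */
		slabs++;
		free_in_partial = TOY_OBJS_PER_SLAB - 1;
	}
	return slabs;
}
```

In this model the no-reuse case costs one slab per request, while the
reuse case costs one slab per TOY_OBJS_PER_SLAB requests.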

To prevent memory wastage, when
(node != NUMA_NO_NODE) && !(gfpflags & __GFP_THISNODE):

1) try to get a partial slab from the target node with __GFP_THISNODE.
2) if 1) failed, try to allocate a new slab from the target node with
   __GFP_THISNODE.
3) if 2) failed, retry 1) and 2) without the __GFP_THISNODE constraint.

When node == NUMA_NO_NODE || (gfpflags & __GFP_THISNODE), the behavior
remains unchanged.

Tested on qemu with 4 NUMA nodes, each with 1G of memory, using a test
module (ko) that calls kmalloc_node(196, GFP_KERNEL, 3)
(4 * 1024 + 4) * 1024 times.

Before this patch, cat /proc/slabinfo shows:
kmalloc-256  4200530 13519712  256  32  2 : tunables..

After this patch, cat /proc/slabinfo shows:
kmalloc-256  4200558  4200768  256  32  2 : tunables..

Signed-off-by: Chen Jun
---
 mm/slub.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 39327e98fce3..32e436957e03 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2384,7 +2384,7 @@ static void *get_partial(struct kmem_cache *s, int node, struct partial_context
 		searchnode = numa_mem_id();
 
 	object = get_partial_node(s, get_node(s, searchnode), pc);
-	if (object || node != NUMA_NO_NODE)
+	if (object || (node != NUMA_NO_NODE && (pc->flags & __GFP_THISNODE)))
 		return object;
 
 	return get_any_partial(s, pc);
@@ -3069,6 +3069,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 	struct slab *slab;
 	unsigned long flags;
 	struct partial_context pc;
+	bool try_thisnode = true;
 
 	stat(s, ALLOC_SLOWPATH);
 
@@ -3181,8 +3182,18 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 	}
 
 new_objects:
-
 	pc.flags = gfpflags;
+
+	/*
+	 * when (node != NUMA_NO_NODE) && !(gfpflags & __GFP_THISNODE)
+	 * 1) try to get a partial slab from target node with __GFP_THISNODE.
+	 * 2) if 1) failed, try to allocate a new slab from target node with
+	 *    __GFP_THISNODE.
+	 * 3) if 2) failed, retry 1) and 2) without __GFP_THISNODE constraint.
+	 */
+	if (node != NUMA_NO_NODE && !(gfpflags & __GFP_THISNODE) && try_thisnode)
+		pc.flags |= __GFP_THISNODE;
+
 	pc.slab = &slab;
 	pc.orig_size = orig_size;
 	freelist = get_partial(s, node, &pc);
@@ -3190,10 +3201,15 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 		goto check_new_slab;
 
 	slub_put_cpu_ptr(s->cpu_slab);
-	slab = new_slab(s, gfpflags, node);
+	slab = new_slab(s, pc.flags, node);
 	c = slub_get_cpu_ptr(s->cpu_slab);
 
 	if (unlikely(!slab)) {
+		if (node != NUMA_NO_NODE && !(gfpflags & __GFP_THISNODE) && try_thisnode) {
+			try_thisnode = false;
+			goto new_objects;
+		}
+
 		slab_out_of_memory(s, gfpflags, node);
 		return NULL;
 	}
-- 
2.17.1
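
For readers who want to follow the retry order without the surrounding
kernel context, the control flow of the new_objects path in the patch
can be sketched in userspace as below. This is a simplified model, not
the kernel implementation: every toy_* and TOY_* name is hypothetical,
the four booleans in struct toy_state stand in for the real partial
lists and page allocator, and "target" simply means the node the caller
asked for. Only the try_thisnode logic mirrors the diff above.

```c
#include <stdbool.h>

#define TOY_NO_NODE	(-1)	/* stands in for NUMA_NO_NODE */
#define TOY_THISNODE	0x1u	/* stands in for __GFP_THISNODE */

/* Hypothetical view of what memory is available where. */
struct toy_state {
	bool target_has_partial;
	bool target_has_free_pages;
	bool other_has_partial;
	bool other_has_free_pages;
};

/* Where the object finally came from in this model. */
enum toy_source {
	TOY_FAIL,
	TOY_TARGET_PARTIAL,
	TOY_TARGET_NEW,
	TOY_OTHER_PARTIAL,
	TOY_OTHER_NEW,
};

/* Mirrors the patched get_partial(): stay on the target node only when
 * a node was requested and the (possibly augmented) flags are strict. */
static bool toy_get_partial(const struct toy_state *st, int node,
			    unsigned int flags, enum toy_source *src)
{
	if (st->target_has_partial) {
		*src = TOY_TARGET_PARTIAL;
		return true;
	}
	if (node != TOY_NO_NODE && (flags & TOY_THISNODE))
		return false;
	if (st->other_has_partial) {
		*src = TOY_OTHER_PARTIAL;
		return true;
	}
	return false;
}

/* Models new_slab(): a strict request may not fall back to other nodes. */
static bool toy_new_slab(const struct toy_state *st, unsigned int flags,
			 enum toy_source *src)
{
	if (st->target_has_free_pages) {
		*src = TOY_TARGET_NEW;
		return true;
	}
	if (!(flags & TOY_THISNODE) && st->other_has_free_pages) {
		*src = TOY_OTHER_NEW;
		return true;
	}
	return false;
}

/* The three steps from the patch description, with one relaxed retry. */
static enum toy_source toy_slab_alloc(const struct toy_state *st,
				      unsigned int gfpflags, int node)
{
	bool try_thisnode = true;
	enum toy_source src = TOY_FAIL;
	unsigned int flags;

retry:
	flags = gfpflags;
	if (node != TOY_NO_NODE && !(gfpflags & TOY_THISNODE) && try_thisnode)
		flags |= TOY_THISNODE;	/* first pass: target node only */

	if (toy_get_partial(st, node, flags, &src))
		return src;
	if (toy_new_slab(st, flags, &src))
		return src;

	if (node != TOY_NO_NODE && !(gfpflags & TOY_THISNODE) && try_thisnode) {
		try_thisnode = false;	/* second pass: caller's own flags */
		goto retry;
	}
	return TOY_FAIL;
}
```

In this model, a request for node 3 with an exhausted target and a
partial slab elsewhere returns TOY_OTHER_PARTIAL rather than
TOY_OTHER_NEW, which is the reuse the patch is after. A caller that
passes TOY_THISNODE itself never reaches the relaxed pass.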