From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0862FC3601E for ; Thu, 10 Apr 2025 19:15:09 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B65952800C4; Thu, 10 Apr 2025 15:15:02 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B59F82800AB; Thu, 10 Apr 2025 15:15:02 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6DE322800C0; Thu, 10 Apr 2025 15:15:02 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 455A66B0379 for ; Thu, 10 Apr 2025 15:15:02 -0400 (EDT) Received: from smtpin17.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id DF87B1A06E1 for ; Thu, 10 Apr 2025 19:15:02 +0000 (UTC) X-FDA: 83319086844.17.35F66F8 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) by imf18.hostedemail.com (Postfix) with ESMTP id DC6791C0007 for ; Thu, 10 Apr 2025 19:15:00 +0000 (UTC) Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2023-11-20 header.b=CMQZDzdw; spf=pass (imf18.hostedemail.com: domain of sidhartha.kumar@oracle.com designates 205.220.177.32 as permitted sender) smtp.mailfrom=sidhartha.kumar@oracle.com; dmarc=pass (policy=reject) header.from=oracle.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1744312501; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=HZV89/87me5vO2rrtx5oiznojRMUfJzE0drJ0U7p2nE=; b=6dADLXDtRbKq9Sfv48vyZ+rs/tUzBealdjYjWiMGmRFZOUGy769bTqnToC3Yndndpvqg/3 sBgFY3gv/8+y9QRQmF0uB0m1JgfqxI9JD+PS7dMewvvPOcB660u2KPnwDQoZHDnsp1bzfC p04N9vw3iYAXwD31hFJ0hpbAh8j9FAM= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2023-11-20 header.b=CMQZDzdw; spf=pass (imf18.hostedemail.com: domain of sidhartha.kumar@oracle.com designates 205.220.177.32 as permitted sender) smtp.mailfrom=sidhartha.kumar@oracle.com; dmarc=pass (policy=reject) header.from=oracle.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1744312501; a=rsa-sha256; cv=none; b=W8ZhdRKCkyMaPCc84OnUlxAt/oVl/w1EiHorMJaHqEc4TYP4IBz2/t0W3Vf+snAps1wD+N /vPJqupODmyZHhbN/t3dweQv3GL8dpPSfBqmghudnc2vCFpqXA74hQohWYcNET7rjGCPtn DrNVxgVBaCbFAuB2zK207RCjWokTeFY= Received: from pps.filterd (m0246631.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 53AJ7ApT023189; Thu, 10 Apr 2025 19:14:56 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=corp-2023-11-20; bh=HZV89 /87me5vO2rrtx5oiznojRMUfJzE0drJ0U7p2nE=; b=CMQZDzdwJQA4DFqAHaJJS fOzQ9tJtLbtaLSDr7E2chz+9sxm2dVbqSijnZU1cEV+rqkyHzjNyTdmrtlW/J+jN VRaqVo8GjByo14VxE6i7uzsftMBX9lrJXzt/Ete3X/h5lKe27pgePZpMfA461O2x 600eTGjTHMb4/oUeHe2+C2Y+Ie8JhzAO9w63FiRb7icdUYAc4NavJA6gVCgd7UXw LaknZuHLMZ466i7k5jNuKIKzT8NitZ+kEV7iZ1DVcT88ovrwVVWHI+R/gPfk7Lej EODaL/zD73aYHZKyhlbDZRlllt9X6dJYJeBwtmuwWBU6sOcoC1CqGwxMV1ff+zid g== Received: from iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta01.appoci.oracle.com [130.35.100.223]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 45xkuy00g0-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 10 Apr 2025 19:14:55 +0000 (GMT) Received: from pps.filterd (iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 53AIwiO7023934; Thu, 10 Apr 2025 19:14:55 GMT Received: from pps.reinject (localhost [127.0.0.1]) by iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 45ttyk1gjp-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 10 Apr 2025 19:14:55 +0000 Received: from iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 53AJEn92040606; Thu, 10 Apr 2025 19:14:54 GMT Received: from sidhakum-ubuntu.osdevelopmeniad.oraclevcn.com (sidhakum-ubuntu.allregionaliads.osdevelopmeniad.oraclevcn.com [100.100.250.108]) by iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 45ttyk1gg6-4; Thu, 10 Apr 2025 19:14:54 +0000 From: Sidhartha Kumar To: linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org Cc: linux-mm@kvack.org, akpm@linux-foundation.org, liam.howlett@oracle.com, willy@infradead.org, Sidhartha Kumar , "Liam R . Howlett" Subject: [PATCH v5 3/6] maple_tree: use vacant nodes to reduce worst case allocations Date: Thu, 10 Apr 2025 19:14:43 +0000 Message-ID: <20250410191446.2474640-4-sidhartha.kumar@oracle.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250410191446.2474640-1-sidhartha.kumar@oracle.com> References: <20250410191446.2474640-1-sidhartha.kumar@oracle.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1095,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-04-10_05,2025-04-10_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 malwarescore=0 bulkscore=0 mlxscore=0 mlxlogscore=999 suspectscore=0 adultscore=0 spamscore=0 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2502280000 definitions=main-2504100139 X-Proofpoint-GUID: HKoDVbyOLI-CV0R6je0n84BL7VEbT_My X-Proofpoint-ORIG-GUID: HKoDVbyOLI-CV0R6je0n84BL7VEbT_My X-Rspamd-Queue-Id: DC6791C0007 X-Rspam-User: X-Rspamd-Server: rspam02 X-Stat-Signature: qeq8bbhetafd4nqcouarezx4g3kx5exe X-HE-Tag: 1744312500-824160 X-HE-Meta: U2FsdGVkX1/ZtjlDSrmOjLY83wY+ls/WfzJuM58dVtcWhZPPUUn6Yck0+MESzMT3gCT33i73Z5w17dSp0enaoDouFCl16wIHyp6Hz8zDp3KmzZe7Eu81NfxoANX8k8HN2SdGbn7UBoTKWMc4TFOR8fOlhZSltPROat0kyIrZIQrtg4R8K4CiTtZLnguvAc+O2KORO7mtj9oHDEyQV2bcjkcflevkXezJBLKAbKo8cGRM6PBWNnHlRpM4AlA5+UdMyQMYz4eEd1zkhAa9ha2gXooCifKyJWfTHgBRSfIfOgkl7rNtEvpXZ0Qjfb4AS32sHQqZNX3h1JM0ay114BuK+7CkNBxJ6vRb0pew3Jp/Y9U8TUDcvkoxwGcxVBNgov0bVwi7EhtAEdsn+3UOVzjtl3iOqU/ftt4KLFCouuEyxchvDwZA/tRkZ59v1frCJo5bJp44pDSAn2SKUSYy3uNtabP8D0KSIPkmqHkUcY5mzB3jQncmfEa1huP7KH5QWW9ifK9ScZlTLRIiB2D0QO/3n+qZqVLTJ4N7Pu2ysD5FjL1uEIOxezs4r7tO+Gjfs5efpmtMAUeY84WKIH/QK2wey0rMYOf2IJXJykZqsCCOTuItqQ87Ki7I24npu43+2Xb6DdbtIY6pdj5SwI9zPBG6oaiZOfx88I8Fhfp4wSUH9aFXigv55PJI8fRu1PkonfP5w8hAKYGpuXeWICvVYXQI6Lqt7qSo0ltBtubsOSDSRRoBcfhl+pdCZlPpB14wtAPK26PD2hfLfo9hqDyWuxmMbv//3xMXa2z7xpt5+WcXR61rFSMarXVYH5skOkSiid2sehOBpMdU2YZFwXFXzu8VQ/rXzFDfefmgE4r1yJIxr7X8Vnh5K2mI5WfCuSQ40aO+trxv3HWTnYseG5cZUNSsC07TB1jqNNUY8y2F9o9pCKWuU189ZzjCZBrvktMiNKB1mkYiVdFxcfrfpSeVG8Z wxsl7uhy RuAOblf9JK+owb9He+IAgqOXDqKPneGedpWmF3HMZdv2d4z+SSAgl+RuOEfRxpJGvY0tfNSetnzwjyVrdjGeauSzLoHgquCr2lq6vG8QO8L/VSQY+RgXeW571bz1fVDYhbTw74maaEhCzuviJPtQ9V1MbVFJw44miiqJ1hWOUroNQFsZCxBIdxZ9cdWRS53mMcntHvk1GUeVOtMF7rRTCKQK26klB1Pze9oIi8EkmyRyxYrILN8gylYpPiTdK7W0mq+9fGmvq3Zoo0luSDiU09esr3tSfsmG3S3Zp X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: In order to determine the store type for a maple tree operation, a walk of the tree is done through mas_wr_walk(). This function descends the tree until a spanning write is detected or we reach a leaf node. While descending, keep track of the height at which we encounter a node with available space. This is done by checking if mas->end is less than the number of slots a given node type can fit. Now that the height of the vacant node is tracked, we can use the difference between the height of the tree and the height of the vacant node to know how many levels we will have to propagate creating new nodes. Update mas_prealloc_calc() to consider the vacant height and reduce the number of worst-case allocations. Rebalancing and spanning stores are not supported and fall back to using the full height of the tree for allocations. Update preallocation testing assertions to take into account vacant height. Reviewed-by: Liam R. Howlett Signed-off-by: Sidhartha Kumar --- include/linux/maple_tree.h | 2 + lib/maple_tree.c | 13 ++++-- tools/testing/radix-tree/maple.c | 79 ++++++++++++++++++++++++++++---- 3 files changed, 82 insertions(+), 12 deletions(-) diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h index cbbcd18d4186..657adb33e61e 100644 --- a/include/linux/maple_tree.h +++ b/include/linux/maple_tree.h @@ -463,6 +463,7 @@ struct ma_wr_state { void __rcu **slots; /* mas->node->slots pointer */ void *entry; /* The entry to write */ void *content; /* The existing entry that is being overwritten */ + unsigned char vacant_height; /* Height of lowest node with free space */ }; #define mas_lock(mas) spin_lock(&((mas)->tree->ma_lock)) @@ -498,6 +499,7 @@ struct ma_wr_state { .mas = ma_state, \ .content = NULL, \ .entry = wr_entry, \ + .vacant_height = 0 \ } #define MA_TOPIARY(name, tree) \ diff --git a/lib/maple_tree.c b/lib/maple_tree.c index 195b19505b39..3f794ef072f4 100644 --- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -3537,6 +3537,9 @@ static bool mas_wr_walk(struct ma_wr_state *wr_mas) if (ma_is_leaf(wr_mas->type)) return true; + if (mas->end < mt_slots[wr_mas->type] - 1) + wr_mas->vacant_height = mas->depth + 1; + mas_wr_walk_traverse(wr_mas); } @@ -4152,7 +4155,9 @@ static inline void mas_wr_prealloc_setup(struct ma_wr_state *wr_mas) static inline int mas_prealloc_calc(struct ma_wr_state *wr_mas, void *entry) { struct ma_state *mas = wr_mas->mas; - int ret = mas_mt_height(mas) * 3 + 1; + unsigned char height = mas_mt_height(mas); + int ret = height * 3 + 1; + unsigned char delta = height - wr_mas->vacant_height; switch (mas->store_type) { case wr_invalid: @@ -4170,13 +4175,13 @@ static inline int mas_prealloc_calc(struct ma_wr_state *wr_mas, void *entry) ret = 0; break; case wr_spanning_store: - ret = mas_mt_height(mas) * 3 + 1; + WARN_ON_ONCE(ret != height * 3 + 1); break; case wr_split_store: - ret = mas_mt_height(mas) * 2 + 1; + ret = delta * 2 + 1; break; case wr_rebalance: - ret = mas_mt_height(mas) * 2 - 1; + ret = height * 2 + 1; break; case wr_node_store: ret = mt_in_rcu(mas->tree) ? 1 : 0; diff --git a/tools/testing/radix-tree/maple.c b/tools/testing/radix-tree/maple.c index e0f8fabe8821..e37a3ab2e921 100644 --- a/tools/testing/radix-tree/maple.c +++ b/tools/testing/radix-tree/maple.c @@ -35475,15 +35475,65 @@ static void check_dfs_preorder(struct maple_tree *mt) } /* End of depth first search tests */ +/* get height of the lowest non-leaf node with free space */ +static unsigned char get_vacant_height(struct ma_wr_state *wr_mas, void *entry) +{ + struct ma_state *mas = wr_mas->mas; + char vacant_height = 0; + enum maple_type type; + unsigned long *pivots; + unsigned long min = 0; + unsigned long max = ULONG_MAX; + unsigned char offset; + + /* start traversal */ + mas_reset(mas); + mas_start(mas); + if (!xa_is_node(mas_root(mas))) + return 0; + + type = mte_node_type(mas->node); + wr_mas->type = type; + while (!ma_is_leaf(type)) { + mas_node_walk(mas, mte_to_node(mas->node), type, &min, &max); + offset = mas->offset; + mas->end = mas_data_end(mas); + pivots = ma_pivots(mte_to_node(mas->node), type); + + if (pivots) { + if (offset) + min = pivots[mas->offset - 1]; + if (offset < mas->end) + max = pivots[mas->offset]; + } + wr_mas->r_max = offset < mas->end ? pivots[offset] : mas->max; + + /* detect spanning write */ + if (mas_is_span_wr(wr_mas)) + break; + + if (mas->end < mt_slot_count(mas->node) - 1) + vacant_height = mas->depth + 1; + + mas_descend(mas); + type = mte_node_type(mas->node); + mas->depth++; + } + + return vacant_height; +} + /* Preallocation testing */ static noinline void __init check_prealloc(struct maple_tree *mt) { unsigned long i, max = 100; unsigned long allocated; unsigned char height; + unsigned char vacant_height; struct maple_node *mn; void *ptr = check_prealloc; MA_STATE(mas, mt, 10, 20); + MA_WR_STATE(wr_mas, &mas, ptr); mt_set_non_kernel(1000); for (i = 0; i <= max; i++) @@ -35494,8 +35544,9 @@ static noinline void __init check_prealloc(struct maple_tree *mt) MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); allocated = mas_allocated(&mas); height = mas_mt_height(&mas); + vacant_height = get_vacant_height(&wr_mas, ptr); MT_BUG_ON(mt, allocated == 0); - MT_BUG_ON(mt, allocated != 1 + height * 3); + MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3); mas_destroy(&mas); allocated = mas_allocated(&mas); MT_BUG_ON(mt, allocated != 0); @@ -35503,8 +35554,9 @@ static noinline void __init check_prealloc(struct maple_tree *mt) MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); allocated = mas_allocated(&mas); height = mas_mt_height(&mas); + vacant_height = get_vacant_height(&wr_mas, ptr); MT_BUG_ON(mt, allocated == 0); - MT_BUG_ON(mt, allocated != 1 + height * 3); + MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3); MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); mas_destroy(&mas); allocated = mas_allocated(&mas); @@ -35514,7 +35566,8 @@ static noinline void __init check_prealloc(struct maple_tree *mt) MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); allocated = mas_allocated(&mas); height = mas_mt_height(&mas); - MT_BUG_ON(mt, allocated != 1 + height * 3); + vacant_height = get_vacant_height(&wr_mas, ptr); + MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3); mn = mas_pop_node(&mas); MT_BUG_ON(mt, mas_allocated(&mas) != allocated - 1); mn->parent = ma_parent_ptr(mn); @@ -35527,7 +35580,8 @@ static noinline void __init check_prealloc(struct maple_tree *mt) MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); allocated = mas_allocated(&mas); height = mas_mt_height(&mas); - MT_BUG_ON(mt, allocated != 1 + height * 3); + vacant_height = get_vacant_height(&wr_mas, ptr); + MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3); mn = mas_pop_node(&mas); MT_BUG_ON(mt, mas_allocated(&mas) != allocated - 1); MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); @@ -35540,7 +35594,8 @@ static noinline void __init check_prealloc(struct maple_tree *mt) MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); allocated = mas_allocated(&mas); height = mas_mt_height(&mas); - MT_BUG_ON(mt, allocated != 1 + height * 3); + vacant_height = get_vacant_height(&wr_mas, ptr); + MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3); mn = mas_pop_node(&mas); MT_BUG_ON(mt, mas_allocated(&mas) != allocated - 1); mas_push_node(&mas, mn); @@ -35553,7 +35608,8 @@ static noinline void __init check_prealloc(struct maple_tree *mt) MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); allocated = mas_allocated(&mas); height = mas_mt_height(&mas); - MT_BUG_ON(mt, allocated != 1 + height * 3); + vacant_height = get_vacant_height(&wr_mas, ptr); + MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3); mas_store_prealloc(&mas, ptr); MT_BUG_ON(mt, mas_allocated(&mas) != 0); @@ -35578,7 +35634,8 @@ static noinline void __init check_prealloc(struct maple_tree *mt) MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); allocated = mas_allocated(&mas); height = mas_mt_height(&mas); - MT_BUG_ON(mt, allocated != 1 + height * 2); + vacant_height = get_vacant_height(&wr_mas, ptr); + MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 2); mas_store_prealloc(&mas, ptr); MT_BUG_ON(mt, mas_allocated(&mas) != 0); mt_set_non_kernel(1); @@ -35595,8 +35652,14 @@ static noinline void __init check_prealloc(struct maple_tree *mt) MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); allocated = mas_allocated(&mas); height = mas_mt_height(&mas); + vacant_height = get_vacant_height(&wr_mas, ptr); MT_BUG_ON(mt, allocated == 0); - MT_BUG_ON(mt, allocated != 1 + height * 3); + /* + * vacant height cannot be used to compute the number of nodes needed + * as the root contains two entries which means it is on the verge of + * insufficiency. The worst case full height of the tree is needed. + */ + MT_BUG_ON(mt, allocated != height * 3 + 1); mas_store_prealloc(&mas, ptr); MT_BUG_ON(mt, mas_allocated(&mas) != 0); mas_set_range(&mas, 0, 200); -- 2.43.0