From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C991FD60D0C for ; Tue, 19 Nov 2024 02:30:55 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6170B6B0096; Mon, 18 Nov 2024 21:30:55 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 59F3B6B0098; Mon, 18 Nov 2024 21:30:55 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4192A6B0099; Mon, 18 Nov 2024 21:30:55 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 1977B6B0096 for ; Mon, 18 Nov 2024 21:30:55 -0500 (EST) Received: from smtpin27.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 95EBAAC9AE for ; Tue, 19 Nov 2024 02:30:54 +0000 (UTC) X-FDA: 82801264686.27.60D5450 Received: from mail-ej1-f47.google.com (mail-ej1-f47.google.com [209.85.218.47]) by imf18.hostedemail.com (Postfix) with ESMTP id 6B16C1C000B for ; Tue, 19 Nov 2024 02:30:30 +0000 (UTC) Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=Efn9l+73; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf18.hostedemail.com: domain of richard.weiyang@gmail.com designates 209.85.218.47 as permitted sender) smtp.mailfrom=richard.weiyang@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1731983269; h=from:from:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=GR5z0Zmm7jGTILB2ZTA1J3Uc3kfsx9w8jQHvhZLeBrg=; b=sxcVj7xi6NNCLinZ5BwrLxy4it6zi7eoVz0fsEbr+zqBqKukKh1cQnldPKEfnnfpVeoK+x ZuQTKIodbVX4xJz0OkH9H1Gqjf7Mpfk/J8vfhvRICMcDsMZkK/gzZVl6LsvF8MkX0MpyI8 bQgIv8GsaAv1aNSCijD0zbWD7Jo8DT4= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=Efn9l+73; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf18.hostedemail.com: domain of richard.weiyang@gmail.com designates 209.85.218.47 as permitted sender) smtp.mailfrom=richard.weiyang@gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1731983269; a=rsa-sha256; cv=none; b=ltb+Y+7Ka8BPqO+1Tr8R7U33Mo3qwcmFqWccxlXl2J0/nCtwYalQauM5feHZ3lsAOxidCJ JE/mRuewhymlNlQi6wc5x2/puLgJWl84L8KwiItAWcJLrZ6OoyLphim5/Fa+vzzGw+dwx7 a/Pdujz+gsKu0If+gXEyAHg140GbzYw= Received: by mail-ej1-f47.google.com with SMTP id a640c23a62f3a-a9ed7d8c86cso766723666b.2 for ; Mon, 18 Nov 2024 18:30:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731983451; x=1732588251; darn=kvack.org; h=user-agent:in-reply-to:content-disposition:mime-version:references :reply-to:message-id:subject:cc:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=GR5z0Zmm7jGTILB2ZTA1J3Uc3kfsx9w8jQHvhZLeBrg=; b=Efn9l+73aRdr81tElqLQtHI/5owkYETRYxe7Kq8OmXvnZvsz9rITYDYTd2t/g/uDfy BJK3GcuDif6qdzVeFwj6gMoqVTF6itOo6mdhmeETTWIrJVK3rG5UAvURLKGsOwn6248p Hoj3xkLmCVqd7hr4xbPZW9gOVRJdXE4fe3eilbdtpdqZwZJicugcNf/aegXDl0kBRxQj UAXMYdguq+vfAOBYvpJFNK7IGNRNlGsHA3dULfONA6AvWNvnFz/Rgw3R0T6pPkXCoWVG bUT5V21c2KvIaZ+pBA1eZNPlZAF3ThYJ9HuFKczp+OzRCkI2QlQuqZfmsAV51SUq1tlW e61Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731983451; x=1732588251; h=user-agent:in-reply-to:content-disposition:mime-version:references :reply-to:message-id:subject:cc:to:from:date:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=GR5z0Zmm7jGTILB2ZTA1J3Uc3kfsx9w8jQHvhZLeBrg=; b=vZOsxuyF649f/DqZYoLtqjplw45y1Cn3+2Ar1M4botuF5sQ1ReN4qcrcamjzy/eCSg FoJhKUT0Paz4jo1VAz1oWfh+wgTrwTOsOs7dmv/bd8fzqprO6v2az81pcSp+EInvhhlz L05vDM3vVB/RpCHcDY35GXr9gHkPHNwuHbP3CwqZ5tn9cRIOWuddbJG3av8iAGaBVPU4 pboOMPQDdQUVc7Koj92FAtXnX9E9CGdxaTZxYClWLC6gIjHWjNLSzkFvCPac9bWoIfZj 2i3mNzdWc3ktms/oEFBvpsfeuNoqfrJuIFSQeRk3wW+obZtYZiuMgM0iYhyOqp1ukwzm b8lg== X-Forwarded-Encrypted: i=1; AJvYcCWvOKHCllvrMxuZmFdkNbnVbpxfTDHZkJv5GVgGbhQrZq3Fl3Ic9A5DzeLXNGu1ebiSzX0+cbLkdg==@kvack.org X-Gm-Message-State: AOJu0YwEoH+qKHdZHscqhsWBnfp3FLFIEhfwbF6d5Va8FLLMBEDRQonQ iFRpwKj9PsjVZqB1tabMUfgKzKDWMxDu2s2oug2x8kH4YwSV0dtS X-Google-Smtp-Source: AGHT+IECCcO1uWUVqVM4ALsnE48fZ2zOS3OG/TR46Myj99u1TJg/iJh5npC+mONFvjHM6HPxnZPNOQ== X-Received: by 2002:a17:906:dac7:b0:aa4:777d:739a with SMTP id a640c23a62f3a-aa4833e9fcbmr1381694766b.8.1731983450744; Mon, 18 Nov 2024 18:30:50 -0800 (PST) Received: from localhost ([185.92.221.13]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-aa20e045270sm595633666b.146.2024.11.18.18.30.47 (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Nov 2024 18:30:48 -0800 (PST) Date: Tue, 19 Nov 2024 02:30:47 +0000 From: Wei Yang To: Sidhartha Kumar Cc: Wei Yang , linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org, linux-mm@kvack.org, akpm@linux-foundation.org, liam.howlett@oracle.com Subject: Re: [PATCH 3/5] maple_tree: use vacant nodes to reduce worst case allocations Message-ID: <20241119023047.sentd5e75yak36za@master> Reply-To: Wei Yang References: <20241114170524.64391-1-sidhartha.kumar@oracle.com> <20241114170524.64391-4-sidhartha.kumar@oracle.com> <20241115075203.ojspk255cw3sr3s3@master> <2aa439f1-d33d-43ee-9945-5ac570506c7e@oracle.com> <20241116014116.a7am7z4p7k33rkl5@master> <74e8665a-3818-4e6c-83d1-9a0220a79e49@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <74e8665a-3818-4e6c-83d1-9a0220a79e49@oracle.com> User-Agent: NeoMutt/20170113 (1.7.2) X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 6B16C1C000B X-Stat-Signature: pp917jm914q3ohy55xyifdeidba636es X-Rspam-User: X-HE-Tag: 1731983430-178175 X-HE-Meta: U2FsdGVkX19uyIm38x7j/nAATZh47B9Ha7AtvD3YrqZ4ouvYpXToiFRoWkLLUJ0y1vcrKLMTJ2VH2CbgqqkbNKOp2JybwjGnyC99hZ/8yx486Pfber29NgfSMome7Jzy/IExBxrp2cXqWe3cWWGHiBjt/HMtgNE9Q9hqp5179IG9/hH14QqR7VdKDzSCGLChWstKZuml0ObO/sHLc+b4NeFpkWvzcUc1C0868+zV96iMoGBCYJrT1nAG2v7AQ8fh8QiRRDF1FCyty7DiBEFn8csRqSfwAvkK6UkQ+3IaoR9ihCa+s0Ibk9gIpjDNh6PbQgO2ER5KBfC59ig2GbvaeD5yYrBhtE42/4QAstF0UEvN8gqsjjN5wjmlZtm2uXHJ+I8ZVJyDozX2yLkfreIvJwqU4v6HCKY/+TWF+40KDKy0aTY/2bgEH0FpR270gfpaoZzIjSdQgSfWRp6NU3tpM6xWiI07nPcvzkzGYcYQOpD2uQvSuBBEVNU9m0pfRwuOTI+bZMowPV0Y3DVkLnHgWsMR7ZY30x7wbJUWnA4P78X2qht1J1Kxi4J3kV1xPEvtFnyWcPCirxxuxUoa+ec4DjNH3qULfPlHOrXbELYESAkDJMwf8a2qcnMhyszf0qH9hP/1O3O3gdLuoOHJRRckwoZRzbfxJwEFa1SWMiZgRawfj527dZdIh04NRjW5T+cJx4RZAsBbZlnvzgM0mnNNZG8mUzd1yckkm7q0Euu+f844LwGe7l0es3kT6gMVimd+e5z81AXkkkZRkTvfxfOC3IOy4wr/Ej1qKsBkUYxbbMphyuccfaVSjBvGu73LkiUK0Z7r1Sdbpv4EsNUFO1J5QZ5/+PBWNk0LVBhTHuPFd/Q8gjtaHSdY52CdHHsDrUj+iSwiSTsYCaO52YOrCZfuWcHuX0L1iu+nCXy+LMxgfzCMqUPq0W6vDVUFgF9dKXTDli3PGj6G0I27N7CWfA8 +JOuaTI/ PpDlUV0IjMatimvTWtLvO5ErqTSDMzpGyjJnadZUdsLfJfSfeqaa/ivMgSIQYPX4yLkRYy8Xnv4wZDlrVKqXt6p1q6t2elfvcZ+1+g23v6kAtnEVwLgsOg3VdkhMIy4dTRbagw1/PpeWKg8eECNKyWv0iPhJP/IXZnmDZ15+9RcQTrGKhL+L3GEFOBkRBTWnd83H3kS1BPBH7Y2M3ipAymmzcBGCmgV8IENM8mSQwUzH8JAIB94eL9PcFIW8nFhlZ6/J0T54a8KRwuv4G83JufGBLBQFwypqHS7dyRu1A2fH7wBBBlaQ6P/evBk1CVYM/0OaFk3TdBhUccGcMl17GNBDhtH2xl1ITABwEDnYLKWGpicIjMzcgfgQW0zMX3f+fq7h+VTHkdLxdkB5dEfGLCZAuHvzkmSj5BZTMMKOPCt80G+ckOwNlamzaLST03kP+GingjmCQMXt2zPu7TCeDbzZ4EPKFJjrqLrAkxcEzwci1vB9l/Lh+Mw6ARte20FatYqg4K5LwU+w21PSCLCXtytRc6X9AMr14iRq8rOCqg9eI1MzDS1DpPRydFF8eMoHx1jqi1AgWlLYKLIPKg2vyJvspiQ17OLeYVUzXAYsAxJVLN2PM0kCtiyPp3X/M03pq8pcOUp+Y9E8bNjuUSXRNmi+uoM8fL8FAQ4f6HOQFLmrrVTB9tKl1nqlbcYT6cxZFVcL7 X-Bogosity: Ham, tests=bogofilter, spamicity=0.003191, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Nov 18, 2024 at 11:36:18AM -0500, Sidhartha Kumar wrote: >On 11/15/24 8:41 PM, Wei Yang wrote: >> On Fri, Nov 15, 2024 at 03:34:55PM -0500, Sidhartha Kumar wrote: >> > On 11/15/24 2:52 AM, Wei Yang wrote: >> > > On Thu, Nov 14, 2024 at 12:05:22PM -0500, Sidhartha Kumar wrote: >> > > > In order to determine the store type for a maple tree operation, a walk >> > > > of the tree is done through mas_wr_walk(). This function descends the >> > > > tree until a spanning write is detected or we reach a leaf node. While >> > > > descending, keep track of the height at which we encounter a node with >> > > > available space. This is done by checking if mas->end is less than the >> > > > number of slots a given node type can fit. >> > > > >> > > > Now that the height of the vacant node is tracked, we can use the >> > > > difference between the height of the tree and the height of the vacant >> > > > node to know how many levels we will have to propagate creating new >> > > > nodes. Update mas_prealloc_calc() to consider the vacant height and >> > > > reduce the number of worst allocations. >> > > > >> > > > Rebalancing stores are not supported and fall back to using the full >> > > > height of the tree for allocations. >> > > > >> > > > Update preallocation testing assertions to take into account vacant >> > > > height. >> > > > >> > > > Signed-off-by: Sidhartha >> > > > --- >> > > > include/linux/maple_tree.h | 2 + >> > > > lib/maple_tree.c | 13 +++-- >> > > > tools/testing/radix-tree/maple.c | 97 +++++++++++++++++++++++++++++--- >> > > > 3 files changed, 100 insertions(+), 12 deletions(-) >> > > > >> > > > diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h >> > > > index cbbcd18d4186..7d777aa2d9ed 100644 >> > > > --- a/include/linux/maple_tree.h >> > > > +++ b/include/linux/maple_tree.h >> > > > @@ -463,6 +463,7 @@ struct ma_wr_state { >> > > > void __rcu **slots; /* mas->node->slots pointer */ >> > > > void *entry; /* The entry to write */ >> > > > void *content; /* The existing entry that is being overwritten */ >> > > > + unsigned char vacant_height; /* Depth of lowest node with free space */ >> > > ^^^ ^^^ >> > > >> > > Would this be a little misleading? >> > > >> > >> > Could you elaborate on how its misleading? >> > >> >> As you mentioned in previous patch, depth and height has different meaning. >> >> Root node has depth of 0 and height of 1. So I may wandering whether this is >> depth or height. >> >> > > > }; >> > > > >> > > > #define mas_lock(mas) spin_lock(&((mas)->tree->ma_lock)) >> > > > @@ -498,6 +499,7 @@ struct ma_wr_state { >> > > > .mas = ma_state, \ >> > > > .content = NULL, \ >> > > > .entry = wr_entry, \ >> > > > + .vacant_height = 0 \ >> > > > } >> > > > >> > > > #define MA_TOPIARY(name, tree) \ >> > > > diff --git a/lib/maple_tree.c b/lib/maple_tree.c >> > > > index 21289e350382..f14d70c171c2 100644 >> > > > --- a/lib/maple_tree.c >> > > > +++ b/lib/maple_tree.c >> > > > @@ -3545,6 +3545,9 @@ static bool mas_wr_walk(struct ma_wr_state *wr_mas) >> > > > if (ma_is_leaf(wr_mas->type)) >> > > > return true; >> > > > >> > > > + if (mas->end < mt_slots[wr_mas->type] - 1) >> > > > + wr_mas->vacant_height = mas->depth + 1; >> > > >> > > For some cases in rebalance, we may split data into three parts, which means >> > > we need 2 extra vacant slot. >> > > >> > > Maybe this check is not accurate? >> > > >> > >> > The triple split scenario which you are describing comes from the spanning >> > store case not on the wr_rebalance case. There is a check before we set >> > vacant height to return if is_span_wr() so I believe this is correct still. >> > >> >> Hmm... I come up with a case in which vacant_height may not high enough. >> >> Here is the subtree where spanning write locates. The first level is the >> parent node of the second level nodes. >> >> vacant node >> +--------+-+-+-------+ >> | |l|r| | >> +--------+-+-+-------+ >> >> l r >> +------+ +----+-------+ +---------+--+ +------+ >> | | | | | | | | | | >> +------+ +----+-------+ +---------+--+ +------+ >> ^ ^ >> | | >> index last >> >> When mas_wr_walk_descend() to node l, mas_is_span_wr() return true since last >> is in the right sibling, node r. Let's say the parent is vacant and l/r is >> leaf. So the request number is (1 * 3) + 1. >> >> Let's assume: >> >> * vacant node is just sufficient >> * l's left part + r's right part is sufficient but not overflow >> >> Then the "new vacant node" would be deficient and needs another round >> rebalance. >> >> For this case, I am afraid it doesn't allocate enough nodes. Or I >> misunderstand this? > >I think you are correct and we need to use the sufficient height which is >introduced in patch 5 in the spanning store case similar to how patch 5 uses >it for the rebalance store case. > > case wr_rebalance: > if (wr_mas->sufficient_height < wr_mas->vacant_height) { > ret = (height - wr_mas->sufficient_height)*2 +1; > break; > } > ret = delta * 2 + 1; > break; > I have thought about this. But the spanning write case seems to be a little special to splitting/rebalance. It could be both require one more extra slot and may need to be rebalanced. We are not sure the exact behavior on converge node. Only check one aspect seems not enough. Do you think so ? -- Wei Yang Help you, Help me