linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Liam R. Howlett" <Liam.Howlett@oracle.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: maple-tree@lists.infradead.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	Suren Baghdasaryan <surenb@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Sidhartha Kumar <sidhartha.kumar@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Alice Ryhl <aliceryhl@google.com>,
	Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	Arnd Bergmann <arnd@arndb.de>,
	Christian Kujau <lists@nerdbynature.de>,
	SeongJae Park <sj@kernel.org>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>
Subject: [PATCH v3 22/30] maple_tree: Use maple copy node for mas_wr_rebalance() operation
Date: Fri, 30 Jan 2026 15:59:27 -0500	[thread overview]
Message-ID: <20260130205935.2559335-23-Liam.Howlett@oracle.com> (raw)
In-Reply-To: <20260130205935.2559335-1-Liam.Howlett@oracle.com>

Stop using the maple big node for rebalance operations by changing to
more align with spanning store.  The rebalance operation needs its own
data calculation in rebalance_data().

In the event of too much data, the rebalance tries to push the data
using push_data_sib().  If there is insufficient data, the rebalance
operation will rebalance against a sibling (found with rebalance_sib()).

The rebalance starts at the leaf and works its way upward in the tree
using rebalance_ascend().  Most of the code is shared with spanning
store such as the copy node having a new root, but is fundamentally
different in that the data must come from a sibling.

A parent maple state is used to track the parent location to avoid
multiple mas_ascend() calls.  The maple state tree location is copied
from the parent to the mas (child) in the ascend step.  Ascending itself
is done in the main loop.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
---
 lib/maple_tree.c | 213 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 206 insertions(+), 7 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 42038e42a4c7e..0d5d68913de44 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -2305,6 +2305,19 @@ static inline void mte_mid_split_check(struct maple_enode **l,
 	*split = mid_split;
 }
 
+static inline void rebalance_sib(struct ma_state *parent, struct ma_state *sib)
+{
+	*sib = *parent;
+	/* Prioritize move right to pull data left */
+	if (sib->offset < sib->end)
+		sib->offset++;
+	else
+		sib->offset--;
+
+	mas_descend(sib);
+	sib->end = mas_data_end(sib);
+}
+
 static inline
 void spanning_sib(struct ma_wr_state *l_wr_mas,
 		struct ma_wr_state *r_wr_mas, struct ma_state *nneighbour)
@@ -2855,6 +2868,112 @@ static inline void cp_data_calc(struct maple_copy *cp,
 	cp->data += r_wr_mas->mas->end - r_wr_mas->offset_end;
 }
 
+static bool data_fits(struct ma_state *sib, struct ma_state *mas,
+		struct maple_copy *cp)
+{
+	unsigned char new_data;
+	enum maple_type type;
+	unsigned char space;
+	unsigned char end;
+
+	type = mte_node_type(mas->node);
+	space = 2 * mt_slots[type];
+	end = sib->end;
+
+	new_data = end + 1 + cp->data;
+	if (new_data > space)
+		return false;
+
+	/*
+	 * This is off by one by design.  The extra space is left to reduce
+	 * jitter in operations that add then remove two entries.
+	 *
+	 * end is an index while new space and data are both sizes.  Adding one
+	 * to end to convert the index to a size means that the below
+	 * calculation should be <=, but we want to keep an extra space in nodes
+	 * to reduce jitter.
+	 *
+	 * Note that it is still possible to get a full node on the left by the
+	 * NULL landing exactly on the split.  The NULL ending of a node happens
+	 * in the dst_setup() function, where we will either increase the split
+	 * by one or decrease it by one, if possible.  In the case of split
+	 * (this case), it is always possible to shift the spilt by one - again
+	 * because there is at least one slot free by the below checking.
+	 */
+	if (new_data < space)
+		return true;
+
+	return false;
+}
+
+static inline void push_data_sib(struct maple_copy *cp, struct ma_state *mas,
+		struct ma_state *sib, struct ma_state *parent)
+{
+
+	if (mte_is_root(mas->node))
+		goto no_push;
+
+
+	*sib = *parent;
+	if (sib->offset) {
+		sib->offset--;
+		mas_descend(sib);
+		sib->end = mas_data_end(sib);
+		if (data_fits(sib, mas, cp))	/* Push left */
+			return;
+
+		*sib = *parent;
+	}
+
+	if (sib->offset >= sib->end)
+		goto no_push;
+
+	sib->offset++;
+	mas_descend(sib);
+	sib->end = mas_data_end(sib);
+	if (data_fits(sib, mas, cp))		/* Push right*/
+		return;
+
+no_push:
+	sib->end = 0;
+}
+
+/*
+ * rebalance_data() - Calculate the @cp data, populate @sib if insufficient or
+ * if the data can be pushed into a sibling.
+ * @cp: The maple copy node
+ * @wr_mas: The left write maple state
+ * @sib: The maple state of the sibling.
+ *
+ * Note: @cp->data is a size and not indexed by 0. @sib->end may be set to 0 to
+ * indicate it will not be used.
+ *
+ */
+static inline void rebalance_data(struct maple_copy *cp,
+		struct ma_wr_state *wr_mas, struct ma_state *sib,
+		struct ma_state *parent)
+{
+	cp_data_calc(cp, wr_mas, wr_mas);
+	sib->end = 0;
+	if (cp->data > mt_slots[wr_mas->type]) {
+		push_data_sib(cp, wr_mas->mas, sib, parent);
+		if (sib->end)
+			goto use_sib;
+	} else if (cp->data <= mt_min_slots[wr_mas->type]) {
+		if ((wr_mas->mas->min != 0) ||
+		    (wr_mas->mas->max != ULONG_MAX)) {
+			rebalance_sib(parent, sib);
+			goto use_sib;
+		}
+	}
+
+	return;
+
+use_sib:
+
+	cp->data += sib->end + 1;
+}
+
 /*
  * spanning_data() - Calculate the @cp data and populate @sib if insufficient
  * @cp: The maple copy node
@@ -3412,6 +3531,55 @@ static bool spanning_ascend(struct maple_copy *cp, struct ma_state *mas,
 	return true;
 }
 
+/*
+ * rebalance_ascend() - Ascend the tree and set up for the next loop - if
+ * necessary
+ *
+ * Return: True if there another rebalancing operation on the next level is
+ * needed, false otherwise.
+ */
+static inline bool rebalance_ascend(struct maple_copy *cp,
+		struct ma_wr_state *wr_mas, struct ma_state *sib,
+		struct ma_state *parent)
+{
+	struct ma_state *mas;
+	unsigned long min, max;
+
+	mas = wr_mas->mas;
+	if (!sib->end) {
+		min = mas->min;
+		max = mas->max;
+	} else if (sib->min > mas->max) { /* Move right succeeded */
+		min = mas->min;
+		max = sib->max;
+		wr_mas->offset_end = parent->offset + 1;
+	} else {
+		min = sib->min;
+		max = mas->max;
+		wr_mas->offset_end = parent->offset;
+		parent->offset--;
+	}
+
+	cp_dst_to_slots(cp, min, max, mas);
+	if (cp_is_new_root(cp, mas))
+		return false;
+
+	if (cp->d_count == 1 && !sib->end) {
+		cp->dst[0].node->parent = ma_parent_ptr(mas_mn(mas)->parent);
+		return false;
+	}
+
+	cp->height++;
+	mas->node = parent->node;
+	mas->offset = parent->offset;
+	mas->min = parent->min;
+	mas->max = parent->max;
+	mas->end = parent->end;
+	mas->depth = parent->depth;
+	wr_mas_setup(wr_mas, mas);
+	return true;
+}
+
 /*
  * mas_rebalance() - Rebalance a given node.
  * @mas: The maple state
@@ -4379,16 +4547,47 @@ static noinline_for_kasan void mas_wr_split(struct ma_wr_state *wr_mas)
  * mas_wr_rebalance() - Insufficient data in one node needs to either get data
  * from a sibling or absorb a sibling all together.
  * @wr_mas: The write maple state
+ *
+ * Rebalance is different than a spanning store in that the write state is
+ * already at the leaf node that's being altered.
  */
-static noinline_for_kasan void mas_wr_rebalance(struct ma_wr_state *wr_mas)
+static void mas_wr_rebalance(struct ma_wr_state *wr_mas)
 {
-	struct maple_big_node b_node;
+	struct maple_enode *old_enode;
+	struct ma_state parent;
+	struct ma_state *mas;
+	struct maple_copy cp;
+	struct ma_state sib;
 
-	trace_ma_write(__func__, wr_mas->mas, 0, wr_mas->entry);
-	memset(&b_node, 0, sizeof(struct maple_big_node));
-	mas_store_b_node(wr_mas, &b_node, wr_mas->offset_end);
-	WARN_ON_ONCE(wr_mas->mas->store_type != wr_rebalance);
-	return mas_rebalance(wr_mas->mas, &b_node);
+	/*
+	 * Rebalancing occurs if a node is insufficient.  Data is rebalanced
+	 * against the node to the right if it exists, otherwise the node to the
+	 * left of this node is rebalanced against this node.  If rebalancing
+	 * causes just one node to be produced instead of two, then the parent
+	 * is also examined and rebalanced if it is insufficient.  Every level
+	 * tries to combine the data in the same way.  If one node contains the
+	 * entire range of the tree, then that node is used as a new root node.
+	 */
+
+	mas = wr_mas->mas;
+	trace_ma_op(TP_FCT, mas);
+	parent = *mas;
+	cp_leaf_init(&cp, mas, wr_mas, wr_mas);
+	do {
+		if (!mte_is_root(parent.node)) {
+			mas_ascend(&parent);
+			parent.end = mas_data_end(&parent);
+		}
+		rebalance_data(&cp, wr_mas, &sib, &parent);
+		multi_src_setup(&cp, wr_mas, wr_mas, &sib);
+		dst_setup(&cp, mas, wr_mas->type);
+		cp_data_write(&cp, mas);
+	} while (rebalance_ascend(&cp, wr_mas, &sib, &parent));
+
+	old_enode = mas->node;
+	mas->node = mt_slot_locked(mas->tree, cp.slot, 0);
+	mas_wmb_replace(mas, old_enode, cp.height);
+	mtree_range_walk(mas);
 }
 
 /*
-- 
2.47.3



  parent reply	other threads:[~2026-01-30 22:29 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-30 20:59 [PATCH v3 00/30] maple_tree: Replace big node with maple copy Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 01/30] maple_tree: Fix mas_dup_alloc() sparse warning Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 02/30] maple_tree: Move mas_spanning_rebalance loop to function Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 03/30] maple_tree: Extract use of big node from mas_wr_spanning_store() Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 04/30] maple_tree: Remove unnecessary assignment of orig_l index Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 05/30] maple_tree: inline mas_spanning_rebalance() into mas_wr_spanning_rebalance() Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 06/30] maple_tree: Make ma_wr_states reliable for reuse in spanning store Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 07/30] maple_tree: Remove l_wr_mas from mas_wr_spanning_rebalance Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 08/30] maple_tree: Don't pass through height in mas_wr_spanning_store Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 09/30] maple_tree: Move maple_subtree_state from mas_wr_spanning_store to mas_wr_spanning_rebalance Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 10/30] maple_tree: Correct right ma_wr_state end pivot in mas_wr_spanning_store() Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 11/30] maple_tree: Introduce maple_copy node and use it in mas_spanning_rebalance() Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 12/30] maple_tree: Testing update for spanning store Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 13/30] maple_tree: Inline mas_spanning_rebalance_loop() into mas_wr_spanning_rebalance() Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 14/30] maple_tree: Change initial big node setup in mas_wr_spanning_rebalance() Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 15/30] maple_tree: Introduce ma_leaf_max_gap() Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 16/30] maple_tree: Add gap support, slot and pivot sizes for maple copy Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 17/30] maple_tree: Start using maple copy node for destination Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 18/30] maple_tree: inline mas_wr_spanning_rebalance() Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 19/30] maple_tree: Remove unnecessary return statements Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 20/30] maple_tree: Separate wr_split_store and wr_rebalance store type code path Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 21/30] maple_tree: Add cp_is_new_root() helper Liam R. Howlett
2026-02-01  0:10   ` SeongJae Park
2026-02-02 14:58     ` Liam R. Howlett
2026-02-02 15:56       ` SeongJae Park
2026-02-02 17:01         ` Liam R. Howlett
2026-02-02 17:53           ` SeongJae Park
2026-02-03 17:26   ` Liam R. Howlett
2026-02-04  6:36     ` SeongJae Park
2026-01-30 20:59 ` Liam R. Howlett [this message]
2026-01-30 20:59 ` [PATCH v3 23/30] maple_tree: Add test for rebalance calculation off-by-one Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 24/30] maple_tree: Add copy_tree_location() helper Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 25/30] maple_tree: Add cp_converged() helper Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 26/30] maple_tree: Use maple copy node for mas_wr_split() Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 27/30] maple_tree: Remove maple big node and subtree structs Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 28/30] maple_tree: Pass maple copy node to mas_wmb_replace() Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 29/30] maple_tree: Don't pass end to mas_wr_append() Liam R. Howlett
2026-01-30 20:59 ` [PATCH v3 30/30] maple_tree: Clean up mas_wr_node_store() Liam R. Howlett
2026-01-31 20:27 ` [PATCH v3 00/30] maple_tree: Replace big node with maple copy Andrew Morton
2026-02-02 15:40   ` Liam R. Howlett
2026-02-02 18:31     ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260130205935.2559335-23-Liam.Howlett@oracle.com \
    --to=liam.howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=aliceryhl@google.com \
    --cc=arnd@arndb.de \
    --cc=geert@linux-m68k.org \
    --cc=kuninori.morimoto.gx@renesas.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lists@nerdbynature.de \
    --cc=maple-tree@lists.infradead.org \
    --cc=sidhartha.kumar@oracle.com \
    --cc=sj@kernel.org \
    --cc=surenb@google.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox