linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] refine mas_end of mas_mab_cp()
@ 2024-09-19 23:48 Wei Yang
  2024-09-19 23:48 ` [PATCH 1/4] maple_tree: not necessary to copy mas->end + 1 Wei Yang
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Wei Yang @ 2024-09-19 23:48 UTC (permalink / raw)
  To: akpm, Liam.Howlett; +Cc: maple-tree, linux-mm, Wei Yang

mas_mab_cp() copy range [mas_start, mas_end] inclusively. For some usage, the
mas_end passed could be refined.

Patch 1: if we know the end of a maple_node, we don't need to copy end + 1.
Patch 2: this one tries to leverage mt_pivot_count() for optimize the code.
Patch 3: the maximum end of a maple_node is mt_pivot_count() instead of
         mt_slot_count()
Patch 4: after above cleanup, we are sure mas_end won't exceed mt_pivots[], so
         we can remove the check in mas_mab_cp()

Wei Yang (4):
  maple_tree: not necessary to copy mas->end + 1
  maple_tree: use mt_pivot_count() to replace mt_slot_count() - 1
  maple_tree: copy to mt_pivot_count() instead of mt_slot_count()
  maple_tree: now we are sure mas_end wouldn't exceed mt_pivots[mt]

 lib/maple_tree.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/4] maple_tree: not necessary to copy mas->end + 1
  2024-09-19 23:48 [PATCH 0/4] refine mas_end of mas_mab_cp() Wei Yang
@ 2024-09-19 23:48 ` Wei Yang
  2024-09-19 23:48 ` [PATCH 2/4] maple_tree: use mt_pivot_count() to replace mt_slot_count() - 1 Wei Yang
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Wei Yang @ 2024-09-19 23:48 UTC (permalink / raw)
  To: akpm, Liam.Howlett; +Cc: maple-tree, linux-mm, Wei Yang

mas_mab_cp() copy range [mas_start, mas_end] inclusively, so we just
need to copy to mas->end which points to the last valid data in maple
node.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 lib/maple_tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index a4796101decc..51f49b29b8b0 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -2099,7 +2099,7 @@ static noinline_for_kasan void mas_store_b_node(struct ma_wr_state *wr_mas,
 		goto b_end;
 
 	/* Copy end data to the end of the node. */
-	mas_mab_cp(mas, slot, mas->end + 1, b_node, ++b_end);
+	mas_mab_cp(mas, slot, mas->end, b_node, ++b_end);
 	b_node->b_end--;
 	return;
 
-- 
2.34.1



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 2/4] maple_tree: use mt_pivot_count() to replace mt_slot_count() - 1
  2024-09-19 23:48 [PATCH 0/4] refine mas_end of mas_mab_cp() Wei Yang
  2024-09-19 23:48 ` [PATCH 1/4] maple_tree: not necessary to copy mas->end + 1 Wei Yang
@ 2024-09-19 23:48 ` Wei Yang
  2024-09-19 23:48 ` [PATCH 3/4] maple_tree: copy to mt_pivot_count() instead of mt_slot_count() Wei Yang
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Wei Yang @ 2024-09-19 23:48 UTC (permalink / raw)
  To: akpm, Liam.Howlett; +Cc: maple-tree, linux-mm, Wei Yang

We have (mt_slot_count() - 1 == mt_pivot_count()), except for
maple_dense node. Since mas_mab_cp() doesn't apply to maple_dense node,
it is fine to use mt_pivot_count() here.

By doing so we can save instruction to decrease 1 from mt_slot_count().
Below is the comparison of generated code.

Current:

.L1595:
  lib/maple_tree.c:3165: 		mas_mab_cp(mas, split + skip, mt_slot_count(mas->node) - 1,
	movl	%ebx, (%esp)	# prephitmp_234,
	movzbl	mt_slots(%edx), %ecx	# mt_slots[_90], mt_slots[_90]
	movl	%esi, 4(%esp)	# _26,
	movzbl	-23(%ebp), %eax	# %sfp,
	movzbl	-21(%ebp), %edx	# %sfp, split
	decb	%cl	# tmp258
	movzbl	%cl, %ecx	# tmp258, tmp260
	addb	%al, %dl	#, split
	movl	-16(%ebp), %eax	# %sfp,
	movzbl	%dl, %edx	# tmp261, tmp262
	call	mas_mab_cp	#

After applying this patch:

.L1595:
  lib/maple_tree.c:3165: 		mas_mab_cp(mas, split + skip, mt_pivot_count(mas->node),
	movl	%ebx, (%esp)	# prephitmp_231,
	movzbl	mt_pivots(%edx), %ecx	# mt_pivots[_87], mt_pivots[_87]
	movl	%esi, 4(%esp)	# _26,
	movzbl	-23(%ebp), %eax	# %sfp,
	movzbl	-21(%ebp), %edx	# %sfp, split
	addb	%al, %dl	#, split
	movl	-16(%ebp), %eax	# %sfp,
	movzbl	%dl, %edx	# tmp256, tmp257
	call	mas_mab_cp	#

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 lib/maple_tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 51f49b29b8b0..8ff075645f21 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -3167,7 +3167,7 @@ static inline void mast_fill_bnode(struct maple_subtree_state *mast,
 		cp = false;
 
 	if (cp)
-		mas_mab_cp(mas, split + skip, mt_slot_count(mas->node) - 1,
+		mas_mab_cp(mas, split + skip, mt_pivot_count(mas->node),
 			   mast->bn, mast->bn->b_end);
 
 	mast->bn->b_end--;
-- 
2.34.1



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 3/4] maple_tree: copy to mt_pivot_count() instead of mt_slot_count()
  2024-09-19 23:48 [PATCH 0/4] refine mas_end of mas_mab_cp() Wei Yang
  2024-09-19 23:48 ` [PATCH 1/4] maple_tree: not necessary to copy mas->end + 1 Wei Yang
  2024-09-19 23:48 ` [PATCH 2/4] maple_tree: use mt_pivot_count() to replace mt_slot_count() - 1 Wei Yang
@ 2024-09-19 23:48 ` Wei Yang
  2024-09-19 23:48 ` [PATCH 4/4] maple_tree: now we are sure mas_end wouldn't exceed mt_pivots[mt] Wei Yang
  2024-09-23 19:57 ` [PATCH 0/4] refine mas_end of mas_mab_cp() Liam R. Howlett
  4 siblings, 0 replies; 6+ messages in thread
From: Wei Yang @ 2024-09-19 23:48 UTC (permalink / raw)
  To: akpm, Liam.Howlett; +Cc: maple-tree, linux-mm, Wei Yang

mas_mab_cp() copy a maple_node's range inclusively, and the maximum
valid data index is (mt_slot_count() - 1) instead of mt_slot_count().

But if using (mt_slot_count() - 1) directly, it would have more
instructions.  The good news is we have defined mt_pivot_count() ==
(mt_slot_count - 1), except for maple_dense node. And mas_mab_cp()
doesn't handle maple_dense node. So it is safe to use it.

By doing so have almost identical generated code:

Current:

  lib/maple_tree.c:2209: 	mas_mab_cp(mast->orig_r, 0, mt_slot_count(mast->orig_r->node),
	movl	%esi, (%esp)	# _135,
	movl	-88(%ebp), %eax	# %sfp, _137
	movzbl	mt_slots(%edx), %ecx	# mt_slots[_142], mt_slots[_142]
	xorl	%edx, %edx	#
	movl	%eax, 4(%esp)	# _137,
	movl	%ebx, %eax	# _138,
	call	mas_mab_cp	#

After applying this patch:

  lib/maple_tree.c:2209: 	mas_mab_cp(mast->orig_r, 0, mt_pivot_count(mast->orig_r->node),
	movl	%esi, (%esp)	# _135,
	movl	-88(%ebp), %eax	# %sfp, _137
	movzbl	mt_pivots(%edx), %ecx	# mt_pivots[_142], mt_pivots[_142]
	xorl	%edx, %edx	#
	movl	%eax, 4(%esp)	# _137,
	movl	%ebx, %eax	# _138,
	call	mas_mab_cp	#

The difference is we access mt_pivots to get ecx instead of mt_slots,
which doesn't expect performance difference.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 lib/maple_tree.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 8ff075645f21..bb4183b973d4 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -2211,7 +2211,7 @@ static inline void mast_rebalance_next(struct maple_subtree_state *mast)
 {
 	unsigned char b_end = mast->bn->b_end;
 
-	mas_mab_cp(mast->orig_r, 0, mt_slot_count(mast->orig_r->node),
+	mas_mab_cp(mast->orig_r, 0, mt_pivot_count(mast->orig_r->node),
 		   mast->bn, b_end);
 	mast->orig_r->last = mast->orig_r->max;
 }
@@ -2692,7 +2692,7 @@ static inline void mast_combine_cp_right(struct maple_subtree_state *mast)
 		return;
 
 	mas_mab_cp(mast->orig_r, mast->orig_r->offset + 1,
-		   mt_slot_count(mast->orig_r->node), mast->bn,
+		   mt_pivot_count(mast->orig_r->node), mast->bn,
 		   mast->bn->b_end);
 	mast->orig_r->last = mast->orig_r->max;
 }
@@ -2963,7 +2963,7 @@ static inline int mas_rebalance(struct ma_state *mas,
 	l_mas = r_mas = *mas;
 
 	if (mas_next_sibling(&r_mas)) {
-		mas_mab_cp(&r_mas, 0, mt_slot_count(r_mas.node), b_node, b_end);
+		mas_mab_cp(&r_mas, 0, mt_pivot_count(r_mas.node), b_node, b_end);
 		r_mas.last = r_mas.index = r_mas.max;
 	} else {
 		mas_prev_sibling(&l_mas);
-- 
2.34.1



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 4/4] maple_tree: now we are sure mas_end wouldn't exceed mt_pivots[mt]
  2024-09-19 23:48 [PATCH 0/4] refine mas_end of mas_mab_cp() Wei Yang
                   ` (2 preceding siblings ...)
  2024-09-19 23:48 ` [PATCH 3/4] maple_tree: copy to mt_pivot_count() instead of mt_slot_count() Wei Yang
@ 2024-09-19 23:48 ` Wei Yang
  2024-09-23 19:57 ` [PATCH 0/4] refine mas_end of mas_mab_cp() Liam R. Howlett
  4 siblings, 0 replies; 6+ messages in thread
From: Wei Yang @ 2024-09-19 23:48 UTC (permalink / raw)
  To: akpm, Liam.Howlett; +Cc: maple-tree, linux-mm, Wei Yang

Now all the cases of mas_mab_cp() is called with mas_end at most equals
mt_pivots[mt]. So we can get piv_end == mas_end.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 lib/maple_tree.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index bb4183b973d4..ef9aba07adbd 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -1907,11 +1907,11 @@ static inline void mas_mab_cp(struct ma_state *mas, unsigned char mas_start,
 	void __rcu **slots;
 	unsigned long *pivots, *gaps;
 	int i = mas_start, j = mab_start;
-	unsigned char piv_end;
 
 	node = mas_mn(mas);
 	mt = mte_node_type(mas->node);
 	pivots = ma_pivots(node, mt);
+
 	if (!i) {
 		b_node->pivot[j] = pivots[i++];
 		if (unlikely(i > mas_end))
@@ -1919,8 +1919,7 @@ static inline void mas_mab_cp(struct ma_state *mas, unsigned char mas_start,
 		j++;
 	}
 
-	piv_end = min(mas_end, mt_pivots[mt]);
-	for (; i < piv_end; i++, j++) {
+	for (; i < mas_end; i++, j++) {
 		b_node->pivot[j] = pivots[i];
 		if (unlikely(!b_node->pivot[j]))
 			goto complete;
-- 
2.34.1



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/4] refine mas_end of mas_mab_cp()
  2024-09-19 23:48 [PATCH 0/4] refine mas_end of mas_mab_cp() Wei Yang
                   ` (3 preceding siblings ...)
  2024-09-19 23:48 ` [PATCH 4/4] maple_tree: now we are sure mas_end wouldn't exceed mt_pivots[mt] Wei Yang
@ 2024-09-23 19:57 ` Liam R. Howlett
  4 siblings, 0 replies; 6+ messages in thread
From: Liam R. Howlett @ 2024-09-23 19:57 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, maple-tree, linux-mm

* Wei Yang <richard.weiyang@gmail.com> [240919 20:40]:
> mas_mab_cp() copy range [mas_start, mas_end] inclusively. For some usage, the
> mas_end passed could be refined.
> 
> Patch 1: if we know the end of a maple_node, we don't need to copy end + 1.

I haven't looked too hard at patch 1, I am currently away.

> Patch 2: this one tries to leverage mt_pivot_count() for optimize the code.
> Patch 3: the maximum end of a maple_node is mt_pivot_count() instead of
>          mt_slot_count()
> Patch 4: after above cleanup, we are sure mas_end won't exceed mt_pivots[], so
>          we can remove the check in mas_mab_cp()

You are changing the generic code to only work for the current node
types and I don't want to do that.  I wrote it to be more generic, so
now we are introducing the requirement that slots are one larger than
pivots.

Again, I haven't looked too closely at this but it seems like you are
making the code depend on the array differences staying constant.

> 
> Wei Yang (4):
>   maple_tree: not necessary to copy mas->end + 1
>   maple_tree: use mt_pivot_count() to replace mt_slot_count() - 1
>   maple_tree: copy to mt_pivot_count() instead of mt_slot_count()
>   maple_tree: now we are sure mas_end wouldn't exceed mt_pivots[mt]
> 
>  lib/maple_tree.c | 15 +++++++--------
>  1 file changed, 7 insertions(+), 8 deletions(-)
> 
> -- 
> 2.34.1
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-09-23 19:58 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-09-19 23:48 [PATCH 0/4] refine mas_end of mas_mab_cp() Wei Yang
2024-09-19 23:48 ` [PATCH 1/4] maple_tree: not necessary to copy mas->end + 1 Wei Yang
2024-09-19 23:48 ` [PATCH 2/4] maple_tree: use mt_pivot_count() to replace mt_slot_count() - 1 Wei Yang
2024-09-19 23:48 ` [PATCH 3/4] maple_tree: copy to mt_pivot_count() instead of mt_slot_count() Wei Yang
2024-09-19 23:48 ` [PATCH 4/4] maple_tree: now we are sure mas_end wouldn't exceed mt_pivots[mt] Wei Yang
2024-09-23 19:57 ` [PATCH 0/4] refine mas_end of mas_mab_cp() Liam R. Howlett

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox