* [PATCH 0/4] refine mas_end of mas_mab_cp()
@ 2024-09-19 23:48 Wei Yang
2024-09-19 23:48 ` [PATCH 1/4] maple_tree: not necessary to copy mas->end + 1 Wei Yang
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Wei Yang @ 2024-09-19 23:48 UTC (permalink / raw)
To: akpm, Liam.Howlett; +Cc: maple-tree, linux-mm, Wei Yang
mas_mab_cp() copy range [mas_start, mas_end] inclusively. For some usage, the
mas_end passed could be refined.
Patch 1: if we know the end of a maple_node, we don't need to copy end + 1.
Patch 2: this one tries to leverage mt_pivot_count() for optimize the code.
Patch 3: the maximum end of a maple_node is mt_pivot_count() instead of
mt_slot_count()
Patch 4: after above cleanup, we are sure mas_end won't exceed mt_pivots[], so
we can remove the check in mas_mab_cp()
Wei Yang (4):
maple_tree: not necessary to copy mas->end + 1
maple_tree: use mt_pivot_count() to replace mt_slot_count() - 1
maple_tree: copy to mt_pivot_count() instead of mt_slot_count()
maple_tree: now we are sure mas_end wouldn't exceed mt_pivots[mt]
lib/maple_tree.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/4] maple_tree: not necessary to copy mas->end + 1
2024-09-19 23:48 [PATCH 0/4] refine mas_end of mas_mab_cp() Wei Yang
@ 2024-09-19 23:48 ` Wei Yang
2024-09-19 23:48 ` [PATCH 2/4] maple_tree: use mt_pivot_count() to replace mt_slot_count() - 1 Wei Yang
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Wei Yang @ 2024-09-19 23:48 UTC (permalink / raw)
To: akpm, Liam.Howlett; +Cc: maple-tree, linux-mm, Wei Yang
mas_mab_cp() copy range [mas_start, mas_end] inclusively, so we just
need to copy to mas->end which points to the last valid data in maple
node.
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
lib/maple_tree.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index a4796101decc..51f49b29b8b0 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -2099,7 +2099,7 @@ static noinline_for_kasan void mas_store_b_node(struct ma_wr_state *wr_mas,
goto b_end;
/* Copy end data to the end of the node. */
- mas_mab_cp(mas, slot, mas->end + 1, b_node, ++b_end);
+ mas_mab_cp(mas, slot, mas->end, b_node, ++b_end);
b_node->b_end--;
return;
--
2.34.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 2/4] maple_tree: use mt_pivot_count() to replace mt_slot_count() - 1
2024-09-19 23:48 [PATCH 0/4] refine mas_end of mas_mab_cp() Wei Yang
2024-09-19 23:48 ` [PATCH 1/4] maple_tree: not necessary to copy mas->end + 1 Wei Yang
@ 2024-09-19 23:48 ` Wei Yang
2024-09-19 23:48 ` [PATCH 3/4] maple_tree: copy to mt_pivot_count() instead of mt_slot_count() Wei Yang
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Wei Yang @ 2024-09-19 23:48 UTC (permalink / raw)
To: akpm, Liam.Howlett; +Cc: maple-tree, linux-mm, Wei Yang
We have (mt_slot_count() - 1 == mt_pivot_count()), except for
maple_dense node. Since mas_mab_cp() doesn't apply to maple_dense node,
it is fine to use mt_pivot_count() here.
By doing so we can save instruction to decrease 1 from mt_slot_count().
Below is the comparison of generated code.
Current:
.L1595:
lib/maple_tree.c:3165: mas_mab_cp(mas, split + skip, mt_slot_count(mas->node) - 1,
movl %ebx, (%esp) # prephitmp_234,
movzbl mt_slots(%edx), %ecx # mt_slots[_90], mt_slots[_90]
movl %esi, 4(%esp) # _26,
movzbl -23(%ebp), %eax # %sfp,
movzbl -21(%ebp), %edx # %sfp, split
decb %cl # tmp258
movzbl %cl, %ecx # tmp258, tmp260
addb %al, %dl #, split
movl -16(%ebp), %eax # %sfp,
movzbl %dl, %edx # tmp261, tmp262
call mas_mab_cp #
After applying this patch:
.L1595:
lib/maple_tree.c:3165: mas_mab_cp(mas, split + skip, mt_pivot_count(mas->node),
movl %ebx, (%esp) # prephitmp_231,
movzbl mt_pivots(%edx), %ecx # mt_pivots[_87], mt_pivots[_87]
movl %esi, 4(%esp) # _26,
movzbl -23(%ebp), %eax # %sfp,
movzbl -21(%ebp), %edx # %sfp, split
addb %al, %dl #, split
movl -16(%ebp), %eax # %sfp,
movzbl %dl, %edx # tmp256, tmp257
call mas_mab_cp #
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
lib/maple_tree.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 51f49b29b8b0..8ff075645f21 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -3167,7 +3167,7 @@ static inline void mast_fill_bnode(struct maple_subtree_state *mast,
cp = false;
if (cp)
- mas_mab_cp(mas, split + skip, mt_slot_count(mas->node) - 1,
+ mas_mab_cp(mas, split + skip, mt_pivot_count(mas->node),
mast->bn, mast->bn->b_end);
mast->bn->b_end--;
--
2.34.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 3/4] maple_tree: copy to mt_pivot_count() instead of mt_slot_count()
2024-09-19 23:48 [PATCH 0/4] refine mas_end of mas_mab_cp() Wei Yang
2024-09-19 23:48 ` [PATCH 1/4] maple_tree: not necessary to copy mas->end + 1 Wei Yang
2024-09-19 23:48 ` [PATCH 2/4] maple_tree: use mt_pivot_count() to replace mt_slot_count() - 1 Wei Yang
@ 2024-09-19 23:48 ` Wei Yang
2024-09-19 23:48 ` [PATCH 4/4] maple_tree: now we are sure mas_end wouldn't exceed mt_pivots[mt] Wei Yang
2024-09-23 19:57 ` [PATCH 0/4] refine mas_end of mas_mab_cp() Liam R. Howlett
4 siblings, 0 replies; 6+ messages in thread
From: Wei Yang @ 2024-09-19 23:48 UTC (permalink / raw)
To: akpm, Liam.Howlett; +Cc: maple-tree, linux-mm, Wei Yang
mas_mab_cp() copy a maple_node's range inclusively, and the maximum
valid data index is (mt_slot_count() - 1) instead of mt_slot_count().
But if using (mt_slot_count() - 1) directly, it would have more
instructions. The good news is we have defined mt_pivot_count() ==
(mt_slot_count - 1), except for maple_dense node. And mas_mab_cp()
doesn't handle maple_dense node. So it is safe to use it.
By doing so have almost identical generated code:
Current:
lib/maple_tree.c:2209: mas_mab_cp(mast->orig_r, 0, mt_slot_count(mast->orig_r->node),
movl %esi, (%esp) # _135,
movl -88(%ebp), %eax # %sfp, _137
movzbl mt_slots(%edx), %ecx # mt_slots[_142], mt_slots[_142]
xorl %edx, %edx #
movl %eax, 4(%esp) # _137,
movl %ebx, %eax # _138,
call mas_mab_cp #
After applying this patch:
lib/maple_tree.c:2209: mas_mab_cp(mast->orig_r, 0, mt_pivot_count(mast->orig_r->node),
movl %esi, (%esp) # _135,
movl -88(%ebp), %eax # %sfp, _137
movzbl mt_pivots(%edx), %ecx # mt_pivots[_142], mt_pivots[_142]
xorl %edx, %edx #
movl %eax, 4(%esp) # _137,
movl %ebx, %eax # _138,
call mas_mab_cp #
The difference is we access mt_pivots to get ecx instead of mt_slots,
which doesn't expect performance difference.
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
lib/maple_tree.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 8ff075645f21..bb4183b973d4 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -2211,7 +2211,7 @@ static inline void mast_rebalance_next(struct maple_subtree_state *mast)
{
unsigned char b_end = mast->bn->b_end;
- mas_mab_cp(mast->orig_r, 0, mt_slot_count(mast->orig_r->node),
+ mas_mab_cp(mast->orig_r, 0, mt_pivot_count(mast->orig_r->node),
mast->bn, b_end);
mast->orig_r->last = mast->orig_r->max;
}
@@ -2692,7 +2692,7 @@ static inline void mast_combine_cp_right(struct maple_subtree_state *mast)
return;
mas_mab_cp(mast->orig_r, mast->orig_r->offset + 1,
- mt_slot_count(mast->orig_r->node), mast->bn,
+ mt_pivot_count(mast->orig_r->node), mast->bn,
mast->bn->b_end);
mast->orig_r->last = mast->orig_r->max;
}
@@ -2963,7 +2963,7 @@ static inline int mas_rebalance(struct ma_state *mas,
l_mas = r_mas = *mas;
if (mas_next_sibling(&r_mas)) {
- mas_mab_cp(&r_mas, 0, mt_slot_count(r_mas.node), b_node, b_end);
+ mas_mab_cp(&r_mas, 0, mt_pivot_count(r_mas.node), b_node, b_end);
r_mas.last = r_mas.index = r_mas.max;
} else {
mas_prev_sibling(&l_mas);
--
2.34.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 4/4] maple_tree: now we are sure mas_end wouldn't exceed mt_pivots[mt]
2024-09-19 23:48 [PATCH 0/4] refine mas_end of mas_mab_cp() Wei Yang
` (2 preceding siblings ...)
2024-09-19 23:48 ` [PATCH 3/4] maple_tree: copy to mt_pivot_count() instead of mt_slot_count() Wei Yang
@ 2024-09-19 23:48 ` Wei Yang
2024-09-23 19:57 ` [PATCH 0/4] refine mas_end of mas_mab_cp() Liam R. Howlett
4 siblings, 0 replies; 6+ messages in thread
From: Wei Yang @ 2024-09-19 23:48 UTC (permalink / raw)
To: akpm, Liam.Howlett; +Cc: maple-tree, linux-mm, Wei Yang
Now all the cases of mas_mab_cp() is called with mas_end at most equals
mt_pivots[mt]. So we can get piv_end == mas_end.
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
lib/maple_tree.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index bb4183b973d4..ef9aba07adbd 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -1907,11 +1907,11 @@ static inline void mas_mab_cp(struct ma_state *mas, unsigned char mas_start,
void __rcu **slots;
unsigned long *pivots, *gaps;
int i = mas_start, j = mab_start;
- unsigned char piv_end;
node = mas_mn(mas);
mt = mte_node_type(mas->node);
pivots = ma_pivots(node, mt);
+
if (!i) {
b_node->pivot[j] = pivots[i++];
if (unlikely(i > mas_end))
@@ -1919,8 +1919,7 @@ static inline void mas_mab_cp(struct ma_state *mas, unsigned char mas_start,
j++;
}
- piv_end = min(mas_end, mt_pivots[mt]);
- for (; i < piv_end; i++, j++) {
+ for (; i < mas_end; i++, j++) {
b_node->pivot[j] = pivots[i];
if (unlikely(!b_node->pivot[j]))
goto complete;
--
2.34.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 0/4] refine mas_end of mas_mab_cp()
2024-09-19 23:48 [PATCH 0/4] refine mas_end of mas_mab_cp() Wei Yang
` (3 preceding siblings ...)
2024-09-19 23:48 ` [PATCH 4/4] maple_tree: now we are sure mas_end wouldn't exceed mt_pivots[mt] Wei Yang
@ 2024-09-23 19:57 ` Liam R. Howlett
4 siblings, 0 replies; 6+ messages in thread
From: Liam R. Howlett @ 2024-09-23 19:57 UTC (permalink / raw)
To: Wei Yang; +Cc: akpm, maple-tree, linux-mm
* Wei Yang <richard.weiyang@gmail.com> [240919 20:40]:
> mas_mab_cp() copy range [mas_start, mas_end] inclusively. For some usage, the
> mas_end passed could be refined.
>
> Patch 1: if we know the end of a maple_node, we don't need to copy end + 1.
I haven't looked too hard at patch 1, I am currently away.
> Patch 2: this one tries to leverage mt_pivot_count() for optimize the code.
> Patch 3: the maximum end of a maple_node is mt_pivot_count() instead of
> mt_slot_count()
> Patch 4: after above cleanup, we are sure mas_end won't exceed mt_pivots[], so
> we can remove the check in mas_mab_cp()
You are changing the generic code to only work for the current node
types and I don't want to do that. I wrote it to be more generic, so
now we are introducing the requirement that slots are one larger than
pivots.
Again, I haven't looked too closely at this but it seems like you are
making the code depend on the array differences staying constant.
>
> Wei Yang (4):
> maple_tree: not necessary to copy mas->end + 1
> maple_tree: use mt_pivot_count() to replace mt_slot_count() - 1
> maple_tree: copy to mt_pivot_count() instead of mt_slot_count()
> maple_tree: now we are sure mas_end wouldn't exceed mt_pivots[mt]
>
> lib/maple_tree.c | 15 +++++++--------
> 1 file changed, 7 insertions(+), 8 deletions(-)
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-09-23 19:58 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-09-19 23:48 [PATCH 0/4] refine mas_end of mas_mab_cp() Wei Yang
2024-09-19 23:48 ` [PATCH 1/4] maple_tree: not necessary to copy mas->end + 1 Wei Yang
2024-09-19 23:48 ` [PATCH 2/4] maple_tree: use mt_pivot_count() to replace mt_slot_count() - 1 Wei Yang
2024-09-19 23:48 ` [PATCH 3/4] maple_tree: copy to mt_pivot_count() instead of mt_slot_count() Wei Yang
2024-09-19 23:48 ` [PATCH 4/4] maple_tree: now we are sure mas_end wouldn't exceed mt_pivots[mt] Wei Yang
2024-09-23 19:57 ` [PATCH 0/4] refine mas_end of mas_mab_cp() Liam R. Howlett
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox