linux-mm.kvack.org archive mirror
* [PATCH] maple_tree: add mas_node_count() before going to slow_path in mas_wr_modify()
@ 2024-06-01  2:55 Jung-JaeJoon
  2024-06-02  2:41 ` Liam R. Howlett
From: Jung-JaeJoon @ 2024-06-01  2:55 UTC
  To: Liam R. Howlett; +Cc: Jung-JaeJoon, maple-tree, linux-mm, linux-kernel

From: Jung-JaeJoon <rgbi3307@gmail.com>

If there are not enough nodes, mas_node_count() sets an error state via mas_set_err()
and returns control flow to the beginning.

In the return flow, mas_nomem() checks the error status, allocates new nodes,
and resumes execution.

In particular, if this happens in mas_split() in the slow_path section of
mas_wr_modify(), unnecessary work is repeated, slowing execution down, as in
the flow below:

_begin:
mas_wr_modify() --> if (new_end >= mt_slots[wr_mas->type]) --> goto slow_path
slow_path:
    --> mas_wr_bnode() --> mas_store_b_node() --> mas_commit_b_node() --> mas_split()
    --> mas_node_count() return to _begin

However, if mas_node_count() is executed before entering slow_path in the above
flow, the nodes are allocated up front and slow_path is not re-entered, which
improves execution efficiency.
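
For reference, the retry flow described above is the normal store pattern; a
simplified sketch (based on mas_store_gfp() in lib/maple_tree.c, with tracing
and some details elided) looks like this:

int mas_store_gfp(struct ma_state *mas, void *entry, gfp_t gfp)
{
	MA_WR_STATE(wr_mas, mas, entry);

retry:
	mas_wr_store_entry(&wr_mas);	/* may hit mas_node_count() and fail */
	if (unlikely(mas_nomem(mas, gfp)))
		goto retry;		/* nodes allocated; redo the whole walk */

	if (unlikely(mas_is_err(mas)))
		return xa_err(mas->node);

	return 0;
}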

Signed-off-by: JaeJoon Jung <rgbi3307@gmail.com>
---
 lib/maple_tree.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 2d7d27e6ae3c..8ffabd73619f 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -4176,8 +4176,13 @@ static inline void mas_wr_modify(struct ma_wr_state *wr_mas)
 	 * path.
 	 */
 	new_end = mas_wr_new_end(wr_mas);
-	if (new_end >= mt_slots[wr_mas->type])
+	if (new_end >= mt_slots[wr_mas->type]) {
+		mas->depth = mas_mt_height(mas);
+		mas_node_count(mas, 1 + mas->depth * 2);
+		if (mas_is_err(mas))
+			return;
 		goto slow_path;
+	}
 
 	/* Attempt to append */
 	if (mas_wr_append(wr_mas, new_end))
-- 
2.17.1




* Re: [PATCH] maple_tree: add mas_node_count() before going to slow_path in mas_wr_modify()
  2024-06-01  2:55 [PATCH] maple_tree: add mas_node_count() before going to slow_path in mas_wr_modify() Jung-JaeJoon
@ 2024-06-02  2:41 ` Liam R. Howlett
  2024-06-02  9:05   ` JaeJoon Jung
From: Liam R. Howlett @ 2024-06-02  2:41 UTC
  To: Jung-JaeJoon; +Cc: maple-tree, linux-mm, linux-kernel

* Jung-JaeJoon <rgbi3307@gmail.com> [240531 22:55]:
> From: Jung-JaeJoon <rgbi3307@gmail.com>
> 
> If there are not enough nodes, mas_node_count() sets an error state via mas_set_err()
> and returns control flow to the beginning.
> 
> In the return flow, mas_nomem() checks the error status, allocates new nodes,
> and resumes execution.
> 
> In particular, if this happens in mas_split() in the slow_path section of
> mas_wr_modify(), unnecessary work is repeated, slowing execution down, as in
> the flow below:
> 
> _begin:
> mas_wr_modify() --> if (new_end >= mt_slots[wr_mas->type]) --> goto slow_path
> slow_path:
>     --> mas_wr_bnode() --> mas_store_b_node() --> mas_commit_b_node() --> mas_split()
>     --> mas_node_count() return to _begin
> 
> However, if mas_node_count() is executed before entering slow_path in the above
> flow, the nodes are allocated up front and slow_path is not re-entered, which
> improves execution efficiency.

Thank you for your patch, but I have to NACK this change.

You are trying to optimise the work done when we are out of memory,
which is a very rare state.  How did you check this works?

If we run out of memory, the code will kick back to mas_nomem() and
may start running in reclaim to free enough memory for the allocations.
There is nothing we can do to make a meaningful change in the speed of
execution at this point. IOW, the duplicate work is the least of our
problems.
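
For context, mas_nomem() looks roughly like this (a simplified sketch of the
lib/maple_tree.c code; the unlocked allocation is where reclaim can kick in):

bool mas_nomem(struct ma_state *mas, gfp_t gfp)
{
	if (likely(mas->node != MA_ERROR(-ENOMEM)))
		return false;

	if (gfpflags_allow_blocking(gfp) && !mt_external_lock(mas->tree)) {
		mtree_unlock(mas->tree);
		mas_alloc_nodes(mas, gfp);	/* may block and enter reclaim */
		mtree_lock(mas->tree);
	} else {
		mas_alloc_nodes(mas, gfp);
	}

	if (!mas_allocated(mas))
		return false;

	mas->status = ma_start;	/* restart the walk with nodes in hand */
	return true;
}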

This change also separates the allocations from the reason we are
allocating - which isn't really apparent from the code.  We could put
in a comment about why we are doing this, but the difference in
execution speed when we are in a low-memory, probably reclaim-retry
situation is not worth this complication.

We also have a feature under discussion on the mailing list, called
"store type", that changes how this works so that preallocations avoid
duplicate work; it is actively being worked on (as noted in the email
to the list) [1].  The key difference is that the store type feature
will allow us to avoid unnecessary work that happens all the time for
preallocations.

[1] http://lists.infradead.org/pipermail/maple-tree/2023-December/003098.html
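
For comparison, the preallocating store pattern that work targets looks
roughly like this (a sketch only, with placeholder tree/index/last/entry
names; locking and error handling elided):

MA_STATE(mas, &tree, index, last);

/* Allocate worst-case nodes up front ... */
if (mas_preallocate(&mas, entry, GFP_KERNEL))
	return -ENOMEM;

/* ... so the store itself cannot fail on allocation. */
mas_store_prealloc(&mas, entry);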

Thanks,
Liam

> 
> Signed-off-by: JaeJoon Jung <rgbi3307@gmail.com>
> ---
>  lib/maple_tree.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> index 2d7d27e6ae3c..8ffabd73619f 100644
> --- a/lib/maple_tree.c
> +++ b/lib/maple_tree.c
> @@ -4176,8 +4176,13 @@ static inline void mas_wr_modify(struct ma_wr_state *wr_mas)
>  	 * path.
>  	 */
>  	new_end = mas_wr_new_end(wr_mas);
> -	if (new_end >= mt_slots[wr_mas->type])
> +	if (new_end >= mt_slots[wr_mas->type]) {
> +		mas->depth = mas_mt_height(mas);
> +		mas_node_count(mas, 1 + mas->depth * 2);
> +		if (mas_is_err(mas))
> +			return;
>  		goto slow_path;
> +	}
>  
>  	/* Attempt to append */
>  	if (mas_wr_append(wr_mas, new_end))
> -- 
> 2.17.1
> 



* Re: [PATCH] maple_tree: add mas_node_count() before going to slow_path in mas_wr_modify()
  2024-06-02  2:41 ` Liam R. Howlett
@ 2024-06-02  9:05   ` JaeJoon Jung
  2024-06-03 12:29     ` Liam R. Howlett
From: JaeJoon Jung @ 2024-06-02  9:05 UTC
  To: Liam R. Howlett, Jung-JaeJoon, maple-tree, linux-mm, linux-kernel

Hello, Liam.
Thank you very much for the detailed answer and explanation.

I tested this patch in userspace.
In user space, this phenomenon always occurs when kmem_cache_alloc()
is executed to allocate a new node.
I will try to test it in more detail in kernel space.
I will also refer to the notes from the email list you shared
and send results once a clearer analysis has been made.

Thanks,
JaeJoon Jung

On Sun, 2 Jun 2024 at 11:41, Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
>
> * Jung-JaeJoon <rgbi3307@gmail.com> [240531 22:55]:
> > From: Jung-JaeJoon <rgbi3307@gmail.com>
> >
> > If there are not enough nodes, mas_node_count() sets an error state via mas_set_err()
> > and returns control flow to the beginning.
> >
> > In the return flow, mas_nomem() checks the error status, allocates new nodes,
> > and resumes execution.
> >
> > In particular, if this happens in mas_split() in the slow_path section of
> > mas_wr_modify(), unnecessary work is repeated, slowing execution down, as in
> > the flow below:
> >
> > _begin:
> > mas_wr_modify() --> if (new_end >= mt_slots[wr_mas->type]) --> goto slow_path
> > slow_path:
> >     --> mas_wr_bnode() --> mas_store_b_node() --> mas_commit_b_node() --> mas_split()
> >     --> mas_node_count() return to _begin
> >
> > However, if mas_node_count() is executed before entering slow_path in the above
> > flow, the nodes are allocated up front and slow_path is not re-entered, which
> > improves execution efficiency.
>
> Thank you for your patch, but I have to NACK this change.
>
> You are trying to optimise the work done when we are out of memory,
> which is a very rare state.  How did you check this works?
>
> If we run out of memory, the code will kick back to mas_nomem() and
> may start running in reclaim to free enough memory for the allocations.
> There is nothing we can do to make a meaningful change in the speed of
> execution at this point. IOW, the duplicate work is the least of our
> problems.
>
> This change also separates the allocations from the reason we are
> allocating - which isn't really apparent from the code.  We could put
> in a comment about why we are doing this, but the difference in
> execution speed when we are in a low-memory, probably reclaim-retry
> situation is not worth this complication.
>
> We also have a feature under discussion on the mailing list, called
> "store type", that changes how this works so that preallocations avoid
> duplicate work; it is actively being worked on (as noted in the email
> to the list) [1].  The key difference is that the store type feature
> will allow us to avoid unnecessary work that happens all the time for
> preallocations.
>
> [1] http://lists.infradead.org/pipermail/maple-tree/2023-December/003098.html
>
> Thanks,
> Liam
>
> >
> > Signed-off-by: JaeJoon Jung <rgbi3307@gmail.com>
> > ---
> >  lib/maple_tree.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> > index 2d7d27e6ae3c..8ffabd73619f 100644
> > --- a/lib/maple_tree.c
> > +++ b/lib/maple_tree.c
> > @@ -4176,8 +4176,13 @@ static inline void mas_wr_modify(struct ma_wr_state *wr_mas)
> >  	 * path.
> >  	 */
> >  	new_end = mas_wr_new_end(wr_mas);
> > -	if (new_end >= mt_slots[wr_mas->type])
> > +	if (new_end >= mt_slots[wr_mas->type]) {
> > +		mas->depth = mas_mt_height(mas);
> > +		mas_node_count(mas, 1 + mas->depth * 2);
> > +		if (mas_is_err(mas))
> > +			return;
> >  		goto slow_path;
> > +	}
> >
> >  	/* Attempt to append */
> >  	if (mas_wr_append(wr_mas, new_end))
> > --
> > 2.17.1
> >



* Re: [PATCH] maple_tree: add mas_node_count() before going to slow_path in mas_wr_modify()
  2024-06-02  9:05   ` JaeJoon Jung
@ 2024-06-03 12:29     ` Liam R. Howlett
  2024-06-04 12:03       ` JaeJoon Jung
From: Liam R. Howlett @ 2024-06-03 12:29 UTC
  To: JaeJoon Jung; +Cc: maple-tree, linux-mm, linux-kernel

* JaeJoon Jung <rgbi3307@gmail.com> [240602 05:06]:
> Hello, Liam.
> Thank you very much for the detailed answer and explanation.
> 
> I tested this patch in userspace.
> In user space, this phenomenon always occurs when kmem_cache_alloc()
> is executed to allocate a new node.

This is expected in the userspace test program.  We test the error path
most frequently, with only a bypass for those who wish to test an
initial success - such as preallocations.  I was concerned about the
testing since your first patch had a syntax error with a correction
quickly after.  It is good news that you were able to find and use the
maple tree testing framework, though.

> I will try to test it in more detail in kernel space.

If you test in kernel space, you will have to hit a low memory scenario
to see a difference.  stress-ng would probably help.

> I will also refer to the notes from the email list you shared
> and send results once a clearer analysis has been made.

I don't think you need to continue with this work as you will find that
the low memory situation is going to be rare and in a very slow path
already.

Thanks,
Liam



* Re: [PATCH] maple_tree: add mas_node_count() before going to slow_path in mas_wr_modify()
  2024-06-03 12:29     ` Liam R. Howlett
@ 2024-06-04 12:03       ` JaeJoon Jung
From: JaeJoon Jung @ 2024-06-04 12:03 UTC
  To: Liam R. Howlett, JaeJoon Jung, maple-tree, linux-mm, linux-kernel

Hello, Liam.
As you pointed out, I tested thoroughly in userspace and in previous
versions of the kernel, but I was surprised that a syntax error slipped
into the newest patch because of my careless mistake.
I will be careful not to make this mistake again.
I am also going to study the low-memory situation in kernel space in more detail.
Thank you very much for your detailed answer and explanation.

Best regards,
JaeJoon Jung

On Mon, 3 Jun 2024 at 21:29, Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
>
> * JaeJoon Jung <rgbi3307@gmail.com> [240602 05:06]:
> > Hello, Liam.
> > Thank you very much for the detailed answer and explanation.
> >
> > I tested this patch in userspace.
> > In user space, this phenomenon always occurs when kmem_cache_alloc()
> > is executed to allocate a new node.
>
> This is expected in the userspace test program.  We test the error path
> most frequently, with only a bypass for those who wish to test an
> initial success - such as preallocations.  I was concerned about the
> testing since your first patch had a syntax error with a correction
> quickly after.  It is good news that you were able to find and use the
> maple tree testing framework, though.
>
> > I will try to test it in more detail in kernel space.
>
> If you test in kernel space, you will have to hit a low memory scenario
> to see a difference.  stress-ng would probably help.
>
> > I will also refer to the notes from the email list you shared
> > and send results once a clearer analysis has been made.
>
> I don't think you need to continue with this work as you will find that
> the low memory situation is going to be rare and in a very slow path
> already.
>
> Thanks,
> Liam



* [PATCH] maple_tree: add mas_node_count() before going to slow_path in mas_wr_modify()
@ 2024-06-01  2:37 Jung-JaeJoon
From: Jung-JaeJoon @ 2024-06-01  2:37 UTC
  To: Liam R. Howlett; +Cc: Jung-JaeJoon, maple-tree, linux-mm, linux-kernel

From: Jung-JaeJoon <rgbi3307@gmail.com>

If there are not enough nodes, mas_node_count() sets an error state via mas_set_err()
and returns control flow to the beginning.

In the return flow, mas_nomem() checks the error status, allocates new nodes,
and resumes execution.

In particular, if this happens in mas_split() in the slow_path section of
mas_wr_modify(), unnecessary work is repeated, slowing execution down, as in
the flow below:

_begin:
mas_wr_modify() --> if (new_end >= mt_slots[wr_mas->type]) --> goto slow_path
slow_path:
    --> mas_wr_bnode() --> mas_store_b_node() --> mas_commit_b_node() --> mas_split()
    --> mas_node_count() return to _begin

However, if mas_node_count() is executed before entering slow_path in the above
flow, the nodes are allocated up front and slow_path is not re-entered, which
improves execution efficiency.

Signed-off-by: JaeJoon Jung <rgbi3307@gmail.com>
---
 lib/maple_tree.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 2d7d27e6ae3c..b42a4e70d229 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -4176,8 +4176,13 @@ static inline void mas_wr_modify(struct ma_wr_state *wr_mas)
 	 * path.
 	 */
 	new_end = mas_wr_new_end(wr_mas);
-	if (new_end >= mt_slots[wr_mas->type])
+	if (new_end >= mt_slots[wr_mas->type]) }
+		mas->depth = mas_mt_height(mas);
+		mas_node_count(mas, 1 + mas->depth * 2);
+		if (mas_is_err(mas))
+			return;
 		goto slow_path;
+	}
 
 	/* Attempt to append */
 	if (mas_wr_append(wr_mas, new_end))
-- 
2.17.1




end of thread

Thread overview: 6+ messages
2024-06-01  2:55 [PATCH] maple_tree: add mas_node_count() before going to slow_path in mas_wr_modify() Jung-JaeJoon
2024-06-02  2:41 ` Liam R. Howlett
2024-06-02  9:05   ` JaeJoon Jung
2024-06-03 12:29     ` Liam R. Howlett
2024-06-04 12:03       ` JaeJoon Jung
  -- strict thread matches above, loose matches on Subject: below --
2024-06-01  2:37 Jung-JaeJoon
