linux-mm.kvack.org archive mirror
* [PATCH] mm/damon/core: remove call_control in inactive contexts
@ 2025-12-28 18:31 SeongJae Park
  2025-12-30  1:45 ` SeongJae Park
  0 siblings, 1 reply; 7+ messages in thread
From: SeongJae Park @ 2025-12-28 18:31 UTC (permalink / raw)
  To: Andrew Morton
  Cc: SeongJae Park, # 6 . 14 . x, damon, linux-kernel, linux-mm, JaeJoon Jung

If damon_call() is executed against a DAMON context that is not running,
the function returns error while keeping the damon_call_control object
linked to the context's call_controls list.  Let's suppose the object is
deallocated after the damon_call(), and yet another damon_call() is
executed against the same context.  The function tries to add the new
damon_call_control object to the call_controls list, which still has the
pointer to the previous damon_call_control object, which is deallocated.
As a result, use-after-free happens.

This can actually be triggered using the DAMON sysfs interface.  It is
not easily exploitable, though, since it requires the sysfs write
permission and a decidedly unusual sequence of file writes.  Please refer
to the report for the detailed reproduction steps.

Fix the issue by making damon_call() clean up the damon_call_control
object before returning the error.

Reported-by: JaeJoon Jung <rgbi3307@gmail.com>
Closes: https://lore.kernel.org/20251224094401.20384-1-rgbi3307@gmail.com
Fixes: 42b7491af14c ("mm/damon/core: introduce damon_call()")
Cc: <stable@vger.kernel.org> # 6.14.x
Signed-off-by: SeongJae Park <sj@kernel.org>
---
 mm/damon/core.c | 31 ++++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/mm/damon/core.c b/mm/damon/core.c
index 2d3e8006db50..65482a0ce20b 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1442,6 +1442,35 @@ bool damon_is_running(struct damon_ctx *ctx)
 	return running;
 }
 
+/*
+ * damon_call_handle_inactive_ctx() - handle a DAMON call request added to
+ *				      an inactive context.
+ * @ctx:	The inactive DAMON context.
+ * @control:	Control variable of the call request.
+ *
+ * This function is called when @control has been added to @ctx but @ctx is
+ * not running (inactive).  Check whether @ctx handled @control, and clean up
+ * @control if it was not handled.
+ *
+ * Returns 0 if @control was handled by @ctx, negative error code otherwise.
+ */
+static int damon_call_handle_inactive_ctx(
+		struct damon_ctx *ctx, struct damon_call_control *control)
+{
+	struct damon_call_control *c;
+
+	mutex_lock(&ctx->call_controls_lock);
+	list_for_each_entry(c, &ctx->call_controls, list) {
+		if (c == control) {
+			list_del(&control->list);
+			mutex_unlock(&ctx->call_controls_lock);
+			return -EINVAL;
+		}
+	}
+	mutex_unlock(&ctx->call_controls_lock);
+	return 0;
+}
+
 /**
  * damon_call() - Invoke a given function on DAMON worker thread (kdamond).
  * @ctx:	DAMON context to call the function for.
@@ -1472,7 +1501,7 @@ int damon_call(struct damon_ctx *ctx, struct damon_call_control *control)
 	list_add_tail(&control->list, &ctx->call_controls);
 	mutex_unlock(&ctx->call_controls_lock);
 	if (!damon_is_running(ctx))
-		return -EINVAL;
+		return damon_call_handle_inactive_ctx(ctx, control);
 	if (control->repeat)
 		return 0;
 	wait_for_completion(&control->completion);

base-commit: a100272b3541cb00a5e29fdc16e428234ebfddd1
-- 
2.47.3



* Re: [PATCH] mm/damon/core: remove call_control in inactive contexts
  2025-12-28 18:31 [PATCH] mm/damon/core: remove call_control in inactive contexts SeongJae Park
@ 2025-12-30  1:45 ` SeongJae Park
  2025-12-30  2:41   ` SeongJae Park
  0 siblings, 1 reply; 7+ messages in thread
From: SeongJae Park @ 2025-12-30  1:45 UTC (permalink / raw)
  To: SeongJae Park
  Cc: Andrew Morton, # 6 . 14 . x, damon, linux-kernel, linux-mm, JaeJoon Jung

On Sun, 28 Dec 2025 10:31:01 -0800 SeongJae Park <sj@kernel.org> wrote:

> [...]
> diff --git a/mm/damon/core.c b/mm/damon/core.c
> index 2d3e8006db50..65482a0ce20b 100644
> --- a/mm/damon/core.c
> +++ b/mm/damon/core.c
> @@ -1442,6 +1442,35 @@ bool damon_is_running(struct damon_ctx *ctx)
>  	return running;
>  }
>  
> +/*
> + * damon_call_handle_inactive_ctx() - handle a DAMON call request added to
> + *				      an inactive context.
> + * @ctx:	The inactive DAMON context.
> + * @control:	Control variable of the call request.
> + *
> + * This function is called when @control has been added to @ctx but @ctx is
> + * not running (inactive).  Check whether @ctx handled @control, and clean up
> + * @control if it was not handled.
> + *
> + * Returns 0 if @control was handled by @ctx, negative error code otherwise.
> + */
> +static int damon_call_handle_inactive_ctx(
> +		struct damon_ctx *ctx, struct damon_call_control *control)
> +{
> +	struct damon_call_control *c;
> +
> +	mutex_lock(&ctx->call_controls_lock);
> +	list_for_each_entry(c, &ctx->call_controls, list) {
> +		if (c == control) {
> +			list_del(&control->list);
> +			mutex_unlock(&ctx->call_controls_lock);
> +			return -EINVAL;
> +		}
> +	}
> +	mutex_unlock(&ctx->call_controls_lock);
> +	return 0;
> +}
> +
>  /**
>   * damon_call() - Invoke a given function on DAMON worker thread (kdamond).
>   * @ctx:	DAMON context to call the function for.
> @@ -1472,7 +1501,7 @@ int damon_call(struct damon_ctx *ctx, struct damon_call_control *control)
>  	list_add_tail(&control->list, &ctx->call_controls);
>  	mutex_unlock(&ctx->call_controls_lock);
>  	if (!damon_is_running(ctx))
> -		return -EINVAL;
> +		return damon_call_handle_inactive_ctx(ctx, control);
>  	if (control->repeat)
>  		return 0;
>  	wait_for_completion(&control->completion);

TL;DR: This patch introduces another UAF bug under a race condition.  I will
send a new version of the fix that solves that issue as well.  Andrew, could
you please drop this from the mm tree for now?

kdamond_fn() resets ->kdamond, which is read by damon_is_running(), and only
then makes the final kdamond_call() invocation to cancel any remaining
damon_call() requests.  Hence, if the above damon_is_running() is invoked
between the ->kdamond reset and the final kdamond_call() invocation,
damon_call_handle_inactive_ctx() and the final kdamond_call() can run
concurrently.

kdamond_call() safely gets a pointer to a damon_call_control object on
ctx->call_controls while holding the lock, accesses the object without the
lock, and only afterwards removes the object from the list under the lock
again.  The intermediate lockless access is safe because kdamond_call() is the
only code that removes items from ctx->call_controls.  This patch breaks that
assumption, because it introduces another piece of code that removes
ctx->call_controls items, namely damon_call_handle_inactive_ctx().

To see this in detail, suppose kdamond_call() has taken the pointer and
released call_controls_lock.  After that, damon_call_handle_inactive_ctx()
finds the object still on ctx->call_controls and removes it from the list.
The damon_call() caller then deallocates the object, and the continued
execution of kdamond_call() accesses the already-freed object.

I will send a new version of this fix soon.


Thanks,
SJ

[...]



* Re: [PATCH] mm/damon/core: remove call_control in inactive contexts
  2025-12-30  1:45 ` SeongJae Park
@ 2025-12-30  2:41   ` SeongJae Park
  2025-12-30  3:45     ` SeongJae Park
  0 siblings, 1 reply; 7+ messages in thread
From: SeongJae Park @ 2025-12-30  2:41 UTC (permalink / raw)
  To: SeongJae Park
  Cc: Andrew Morton, # 6 . 14 . x, damon, linux-kernel, linux-mm, JaeJoon Jung

On Mon, 29 Dec 2025 17:45:30 -0800 SeongJae Park <sj@kernel.org> wrote:

> On Sun, 28 Dec 2025 10:31:01 -0800 SeongJae Park <sj@kernel.org> wrote:
> 
> > [...]
> > Fixes: 42b7491af14c ("mm/damon/core: introduce damon_call()")

The above Fixes: tag is wrong.  At the time of that commit, only a single
damon_call() request was allowed, so a plain pointer was used instead of the
linked list.  In the problematic scenario, a repeated damon_call() would
simply have returned -EBUSY.

The correct Fixes: should be,

Fixes: 004ded6bee11 ("mm/damon: accept parallel damon_call() requests")
Cc: <stable@vger.kernel.org> # 6.17.x

> > Cc: <stable@vger.kernel.org> # 6.14.x
> > Signed-off-by: SeongJae Park <sj@kernel.org>
> > [...]
> 
> [...]
> I will send a new version of this fix soon.

So far, I have two fixup ideas.

The first one is to keep the current code as is, and additionally modify
kdamond_call() to protect all call_control object accesses with
ctx->call_controls_lock.

The second one is to revert this patch and do the DAMON running status
check before adding the damon_call_control object, releasing kdamond_lock
only after the object insertion is done.

I'm in favor of the second one at the moment, as it seems simpler.


Thanks,
SJ



* Re: [PATCH] mm/damon/core: remove call_control in inactive contexts
  2025-12-30  2:41   ` SeongJae Park
@ 2025-12-30  3:45     ` SeongJae Park
  2025-12-31  1:25       ` SeongJae Park
  0 siblings, 1 reply; 7+ messages in thread
From: SeongJae Park @ 2025-12-30  3:45 UTC (permalink / raw)
  To: SeongJae Park
  Cc: Andrew Morton, # 6 . 14 . x, damon, linux-kernel, linux-mm, JaeJoon Jung

On Mon, 29 Dec 2025 18:41:28 -0800 SeongJae Park <sj@kernel.org> wrote:

> On Mon, 29 Dec 2025 17:45:30 -0800 SeongJae Park <sj@kernel.org> wrote:
> 
> > On Sun, 28 Dec 2025 10:31:01 -0800 SeongJae Park <sj@kernel.org> wrote:
> > 
[...]
> > I will send a new version of this fix soon.
> 
> So far, I got two fixup ideas.
> 
> The first one is keeping the current code as is, and additionally modifying
> kdamond_call() to protect all call_control object accesses under
> ctx->call_controls_lock protection.
> 
> The second one is reverting this patch, and doing the DAMON running status
> check before adding the damon_call_control object, but releasing the
> kdamond_lock after the object insertion is done.
> 
> I'm in favor of the second one at the moment, as it seems more simple.

I don't really like either approach, because both implicitly add locking
rules.  If the first approach is taken, damon_call() callers must be aware
that they cannot register callback functions that may hold
call_controls_lock.  If the second approach is taken, we must avoid holding
kdamond_lock while holding the damon_call_control lock.  The second implicit
rule seems easier to keep, but I want to avoid both if possible.

A third idea I just came up with is to keep this patch as is, and move the
final kdamond_call() invocation to _before_ the ctx->kdamond reset.  That
removes the race condition between the final kdamond_call() and
damon_call_handle_inactive_ctx() without introducing new locking rules.


Thanks,
SJ

[...]



* Re: [PATCH] mm/damon/core: remove call_control in inactive contexts
  2025-12-30  3:45     ` SeongJae Park
@ 2025-12-31  1:25       ` SeongJae Park
  2025-12-31  5:27         ` JaeJoon Jung
  0 siblings, 1 reply; 7+ messages in thread
From: SeongJae Park @ 2025-12-31  1:25 UTC (permalink / raw)
  To: SeongJae Park
  Cc: Andrew Morton, # 6 . 14 . x, damon, linux-kernel, linux-mm, JaeJoon Jung

On Mon, 29 Dec 2025 19:45:14 -0800 SeongJae Park <sj@kernel.org> wrote:

> On Mon, 29 Dec 2025 18:41:28 -0800 SeongJae Park <sj@kernel.org> wrote:
> 
> [...]
> 
> The third idea I just got is, keeping this patch as is, and moving the final
> kdamond_call() invocation to be made _before_ the ctx->kdamond reset.  That
> removes the race condition between the final kdamond_call() and
> damon_call_handle_inactive_ctx(), without introducing new locking rules.

I just posted the v2 [1] with the third idea.

[1] https://lore.kernel.org/20251231012315.75835-1-sj@kernel.org


Thanks,
SJ

[...]



* Re: [PATCH] mm/damon/core: remove call_control in inactive contexts
  2025-12-31  1:25       ` SeongJae Park
@ 2025-12-31  5:27         ` JaeJoon Jung
  2025-12-31 15:26           ` SeongJae Park
  0 siblings, 1 reply; 7+ messages in thread
From: JaeJoon Jung @ 2025-12-31  5:27 UTC (permalink / raw)
  To: SeongJae Park; +Cc: Andrew Morton, # 6 . 14 . x, damon, linux-kernel, linux-mm

On Wed, 31 Dec 2025 at 10:25, SeongJae Park <sj@kernel.org> wrote:
>
> [...]
> >
> > The third idea I just got is, keeping this patch as is, and moving the final
> > kdamond_call() invocation to be made _before_ the ctx->kdamond reset.  That
> > removes the race condition between the final kdamond_call() and
> > damon_call_handle_inactive_ctx(), without introducing new locking rules.
>
> I just posted the v2 [1] with the third idea.
>
> [1] https://lore.kernel.org/20251231012315.75835-1-sj@kernel.org

I generally agree with what you've said so far.  However, it's inefficient
to continue executing damon_call_handle_inactive_ctx() while kdamond is
"off".  There's no need to add a new damon_call_handle_inactive_ctx()
function.  As shown below, it's better to call list_add only when kdamond
is "on" (true), and then use the existing code to end with
kdamond_call(ctx, true) when kdamond is "off."

+static void kdamond_call(struct damon_ctx *ctx, bool cancel);
+
 /**
  * damon_call() - Invoke a given function on DAMON worker thread (kdamond).
  * @ctx:       DAMON context to call the function for.
@@ -1496,14 +1475,17 @@ int damon_call(struct damon_ctx *ctx, struct
damon_call_control *control)
        control->canceled = false;
        INIT_LIST_HEAD(&control->list);

-       if (damon_is_running(ctx)) {
-               mutex_lock(&ctx->call_controls_lock);
+       mutex_lock(&ctx->call_controls_lock);
+       if (ctx->kdamond) {
                list_add_tail(&control->list, &ctx->call_controls);
-               mutex_unlock(&ctx->call_controls_lock);
        } else {
-               /* return damon_call_handle_inactive_ctx(ctx, control); */
+               mutex_unlock(&ctx->call_controls_lock);
+               if (!list_empty_careful(&ctx->call_controls))
+                       kdamond_call(ctx, true);
                return -EINVAL;
        }
+       mutex_unlock(&ctx->call_controls_lock);
+
        if (control->repeat)
                return 0;
        wait_for_completion(&control->completion);

Thanks,
JaeJoon

>
>
> Thanks,
> SJ
>
> [...]



* Re: [PATCH] mm/damon/core: remove call_control in inactive contexts
  2025-12-31  5:27         ` JaeJoon Jung
@ 2025-12-31 15:26           ` SeongJae Park
  0 siblings, 0 replies; 7+ messages in thread
From: SeongJae Park @ 2025-12-31 15:26 UTC (permalink / raw)
  To: JaeJoon Jung
  Cc: SeongJae Park, Andrew Morton, # 6 . 14 . x, damon, linux-kernel,
	linux-mm

On Wed, 31 Dec 2025 14:27:54 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:

> On Wed, 31 Dec 2025 at 10:25, SeongJae Park <sj@kernel.org> wrote:
> [...]
> 
> I generally agree with what you've said so far.  However, it's inefficient
> to continue executing damon_call_handle_inactive_ctx() while kdamond is
> "off".  There's no need to add a new damon_call_handle_inactive_ctx()
> function.

As I mentioned before on other threads with you, we care about not only
efficiency but also the maintainability of the code.  The inefficiency you
are describing happens only in corner cases, because damon_call() is not
usually called while kdamond is off.  So the gain from making this efficient
is not that big.

Meanwhile, avoiding it would, as I mentioned in my reply about the first and
the second fix ideas, require adding a locking rule, which makes the code a
bit more difficult to maintain.

I therefore think the v2 is a good tradeoff.

> As shown below, it's better to call list_add only when kdamond
> is "on" (true), and then use the existing code to end with
> kdamond_call(ctx, true) when kdamond is "off."
> 
> +static void kdamond_call(struct damon_ctx *ctx, bool cancel);
> +
>  /**
>   * damon_call() - Invoke a given function on DAMON worker thread (kdamond).
>   * @ctx:       DAMON context to call the function for.
> @@ -1496,14 +1475,17 @@ int damon_call(struct damon_ctx *ctx, struct
> damon_call_control *control)
>         control->canceled = false;
>         INIT_LIST_HEAD(&control->list);
> 
> -       if (damon_is_running(ctx)) {
> -               mutex_lock(&ctx->call_controls_lock);
> +       mutex_lock(&ctx->call_controls_lock);
> +       if (ctx->kdamond) {

This is wrong.  You shouldn't access ctx->kdamond without holding
ctx->kdamond_lock.  Please read the comment about the kdamond_lock field in
damon.h.

>                 list_add_tail(&control->list, &ctx->call_controls);
> -               mutex_unlock(&ctx->call_controls_lock);
>         } else {
> -               /* return damon_call_handle_inactive_ctx(ctx, control); */
> +               mutex_unlock(&ctx->call_controls_lock);
> +               if (!list_empty_careful(&ctx->call_controls))
> +                       kdamond_call(ctx, true);
>                 return -EINVAL;
>         }
> +       mutex_unlock(&ctx->call_controls_lock);
> +
>         if (control->repeat)
>                 return 0;
>         wait_for_completion(&control->completion);


Thanks,
SJ

[...]



end of thread, other threads:[~2025-12-31 15:26 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-12-28 18:31 [PATCH] mm/damon/core: remove call_control in inactive contexts SeongJae Park
2025-12-30  1:45 ` SeongJae Park
2025-12-30  2:41   ` SeongJae Park
2025-12-30  3:45     ` SeongJae Park
2025-12-31  1:25       ` SeongJae Park
2025-12-31  5:27         ` JaeJoon Jung
2025-12-31 15:26           ` SeongJae Park
