* [PATCH] mm/damon/sysfs: preventing duplicated list_add_tail() at the damon_call()
@ 2025-12-24 9:43 JaeJoon Jung
2025-12-25 0:32 ` SeongJae Park
0 siblings, 1 reply; 10+ messages in thread
From: JaeJoon Jung @ 2025-12-24 9:43 UTC (permalink / raw)
To: SeongJae Park; +Cc: JaeJoon Jung, damon, linux-mm, rgbi3307
cd /sys/kernel/mm/damon/admin
echo "off" > kdamonds/0/state
echo "commit" > kdamonds/0/state
echo "commit" > kdamonds/0/state
If you repeat "commit" twice with the kdamonds/0/state set to "off"
with the above command, list_add corruption error occurs as follows:
4-page vmalloc region starting at 0xffffffc600a38000 allocated at
kernel_clone+0x44/0x41e
------------[ cut here ]------------
list_add corruption. prev->next should be next (ffffffd6c7c5a6a8),
but was ffffffc600a3bcc8. (prev=ffffffc600a3bcc8).
WARNING: lib/list_debug.c:32 at __list_add_valid_or_report+
0xd8/0xe2, CPU#0: bash/466
Modules linked in: dwmac_starfive stmmac_platform stmmac pcs_xpcs phylink
CPU: 0 UID: 0 PID: 466 Comm: bash Tainted: G W 6.19.0-rc2+ #1 PREEMPTLAZY
Tainted: [W]=WARN
Hardware name: StarFive VisionFive 2 v1.3B (DT)
epc : __list_add_valid_or_report+0xd8/0xe2
ra : __list_add_valid_or_report+0xd8/0xe2
epc : ffffffff80540bce ra : ffffffff80540bce sp : ffffffc600a3bc00
gp : ffffffff81caec40 tp : ffffffd6c036f080 t0 : 0000000000000000
t1 : 0000000000006000 t2 : 0000000000000002 s0 : ffffffc600a3bc30
s1 : ffffffc600a3bcc8 a0 : ffffffd6fbf49a40 a1 : ffffffd6c036f080
a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000
a5 : 0000000000000000 a6 : 0000000020000000 a7 : 0000000000000001
s2 : ffffffd6c7c5a6a8 s3 : ffffffc600a3bcc8 s4 : ffffffc600a3bcc8
s5 : ffffffd6c7c5a6b8 s6 : ffffffd6c7c5a6a8 s7 : 0000003ff3f32794
s8 : 0000002ab38c9118 s9 : 0000000000000065 s10: 0000003f823a5cb8
s11: 0000003f823264e8 t3 : 0000000000000001 t4 : 0000000000000000
t5 : 00000000fa83b2da t6 : 000000000051df90
status: 0000000200000120 badaddr: 0000000000000000 cause: 0000000000000003
[<ffffffff80540bce>] __list_add_valid_or_report+0xd8/0xe2
[<ffffffff80255c86>] damon_call+0x52/0xe8
[<ffffffff8025c9a8>] damon_sysfs_damon_call+0x60/0x8a
[<ffffffff8025daf4>] state_store+0xfc/0x294
[<ffffffff80dbf1fa>] kobj_attr_store+0xe/0x1a
[<ffffffff802f070c>] sysfs_kf_write+0x42/0x56
[<ffffffff802eef4e>] kernfs_fop_write_iter+0xf4/0x178
[<ffffffff8026545c>] vfs_write+0x1b6/0x3b2
[<ffffffff80265782>] ksys_write+0x52/0xbc
[<ffffffff80265800>] __riscv_sys_write+0x14/0x1c
[<ffffffff80ddf124>] do_trap_ecall_u+0x19c/0x26e
[<ffffffff80deaa38>] handle_exception+0x150/0x15c
---[ end trace 0000000000000000 ]---
-bash: echo: write error: Invalid argument
The cause of the above error is that list_add_tail() is executed
repeatedly while executing damon_call(ctx, control)
in damon_sysfs_damon_call(). The execution flow is summarized below:
damon_sysfs_damon_call()
--> damon_call(ctx, control)
list_add_tail(control, ctx->call_contols);
--> /* list_add corruption error */
if (!damon_is_running)
return -EINVAL;
If you execute damon_call() when damon_sysfs_kdamond_running() is true,
you can prevent the error of duplicate execution of list_add_tail().
Signed-off-by: JaeJoon Jung <rgbi3307@gmail.com>
---
mm/damon/sysfs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/damon/sysfs.c b/mm/damon/sysfs.c
index e2bd2d7becdd..835703c65c12 100644
--- a/mm/damon/sysfs.c
+++ b/mm/damon/sysfs.c
@@ -1686,7 +1686,7 @@ static int damon_sysfs_damon_call(int (*fn)(void *data),
struct damon_call_control call_control = {};
int err;
- if (!kdamond->damon_ctx)
+ if (!damon_sysfs_kdamond_running(kdamond))
return -EINVAL;
call_control.fn = fn;
call_control.data = kdamond;
--
2.43.0
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/damon/sysfs: preventing duplicated list_add_tail() at the damon_call()
2025-12-24 9:43 [PATCH] mm/damon/sysfs: preventing duplicated list_add_tail() at the damon_call() JaeJoon Jung
@ 2025-12-25 0:32 ` SeongJae Park
2025-12-25 2:35 ` JaeJoon Jung
0 siblings, 1 reply; 10+ messages in thread
From: SeongJae Park @ 2025-12-25 0:32 UTC (permalink / raw)
To: JaeJoon Jung; +Cc: SeongJae Park, damon, linux-mm, rgbi3307
Hello JaeJoon,
On Wed, 24 Dec 2025 18:43:58 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> cd /sys/kernel/mm/damon/admin
> echo "off" > kdamonds/0/state
>
> echo "commit" > kdamonds/0/state
> echo "commit" > kdamonds/0/state
>
> If you repeat "commit" twice with the kdamonds/0/state set to "off"
> with the above command, list_add corruption error occurs as follows:
>
> 4-page vmalloc region starting at 0xffffffc600a38000 allocated at
> kernel_clone+0x44/0x41e
> ------------[ cut here ]------------
> list_add corruption. prev->next should be next (ffffffd6c7c5a6a8),
> but was ffffffc600a3bcc8. (prev=ffffffc600a3bcc8).
> WARNING: lib/list_debug.c:32 at __list_add_valid_or_report+
> 0xd8/0xe2, CPU#0: bash/466
> Modules linked in: dwmac_starfive stmmac_platform stmmac pcs_xpcs phylink
> CPU: 0 UID: 0 PID: 466 Comm: bash Tainted: G W 6.19.0-rc2+ #1 PREEMPTLAZY
> Tainted: [W]=WARN
> Hardware name: StarFive VisionFive 2 v1.3B (DT)
> epc : __list_add_valid_or_report+0xd8/0xe2
> ra : __list_add_valid_or_report+0xd8/0xe2
> epc : ffffffff80540bce ra : ffffffff80540bce sp : ffffffc600a3bc00
> gp : ffffffff81caec40 tp : ffffffd6c036f080 t0 : 0000000000000000
> t1 : 0000000000006000 t2 : 0000000000000002 s0 : ffffffc600a3bc30
> s1 : ffffffc600a3bcc8 a0 : ffffffd6fbf49a40 a1 : ffffffd6c036f080
> a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000
> a5 : 0000000000000000 a6 : 0000000020000000 a7 : 0000000000000001
> s2 : ffffffd6c7c5a6a8 s3 : ffffffc600a3bcc8 s4 : ffffffc600a3bcc8
> s5 : ffffffd6c7c5a6b8 s6 : ffffffd6c7c5a6a8 s7 : 0000003ff3f32794
> s8 : 0000002ab38c9118 s9 : 0000000000000065 s10: 0000003f823a5cb8
> s11: 0000003f823264e8 t3 : 0000000000000001 t4 : 0000000000000000
> t5 : 00000000fa83b2da t6 : 000000000051df90
> status: 0000000200000120 badaddr: 0000000000000000 cause: 0000000000000003
> [<ffffffff80540bce>] __list_add_valid_or_report+0xd8/0xe2
> [<ffffffff80255c86>] damon_call+0x52/0xe8
> [<ffffffff8025c9a8>] damon_sysfs_damon_call+0x60/0x8a
> [<ffffffff8025daf4>] state_store+0xfc/0x294
> [<ffffffff80dbf1fa>] kobj_attr_store+0xe/0x1a
> [<ffffffff802f070c>] sysfs_kf_write+0x42/0x56
> [<ffffffff802eef4e>] kernfs_fop_write_iter+0xf4/0x178
> [<ffffffff8026545c>] vfs_write+0x1b6/0x3b2
> [<ffffffff80265782>] ksys_write+0x52/0xbc
> [<ffffffff80265800>] __riscv_sys_write+0x14/0x1c
> [<ffffffff80ddf124>] do_trap_ecall_u+0x19c/0x26e
> [<ffffffff80deaa38>] handle_exception+0x150/0x15c
> ---[ end trace 0000000000000000 ]---
> -bash: echo: write error: Invalid argument
Thank you for finding issue!
Also appreciate for sharing your detailed reproducer. Nevertheless, I think
the reproducer can be more detailed. E.g., you could explicitly explain the
fact that the reproduction step should be executed only after starting DAMON
with the kdamond, and the kernel should run with CONFIG_lIST_HARDENED to get
the output from the kernel log.
>
> The cause of the above error is that list_add_tail() is executed
> repeatedly while executing damon_call(ctx, control)
> in damon_sysfs_damon_call(). The execution flow is summarized below:
>
> damon_sysfs_damon_call()
> --> damon_call(ctx, control)
> list_add_tail(control, ctx->call_contols);
> --> /* list_add corruption error */
> if (!damon_is_running)
> return -EINVAL;
>
> If you execute damon_call() when damon_sysfs_kdamond_running() is true,
> you can prevent the error of duplicate execution of list_add_tail().
The kdamond might be terminated between the damon_call() call and the
damon_is_running() check inside the damon_call() execution. In the case, the
problem may still happen.
The problem happens because damon_call() is not removing the damon_call_control
object before returning the error, right? What about removing the object
before returning the error?
>
> Signed-off-by: JaeJoon Jung <rgbi3307@gmail.com>
Could you please also add Fixes: and Cc: stable@ ?
Thanks,
SJ
[...]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/damon/sysfs: preventing duplicated list_add_tail() at the damon_call()
2025-12-25 0:32 ` SeongJae Park
@ 2025-12-25 2:35 ` JaeJoon Jung
2025-12-25 19:49 ` SeongJae Park
0 siblings, 1 reply; 10+ messages in thread
From: JaeJoon Jung @ 2025-12-25 2:35 UTC (permalink / raw)
To: SeongJae Park; +Cc: damon, linux-mm, rgbi3307
On Thu, 25 Dec 2025 at 09:32, SeongJae Park <sj@kernel.org> wrote:
>
> Hello JaeJoon,
>
> On Wed, 24 Dec 2025 18:43:58 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
>
> > cd /sys/kernel/mm/damon/admin
> > echo "off" > kdamonds/0/state
> >
> > echo "commit" > kdamonds/0/state
> > echo "commit" > kdamonds/0/state
> >
> > If you repeat "commit" twice with the kdamonds/0/state set to "off"
> > with the above command, list_add corruption error occurs as follows:
> >
> > 4-page vmalloc region starting at 0xffffffc600a38000 allocated at
> > kernel_clone+0x44/0x41e
> > ------------[ cut here ]------------
> > list_add corruption. prev->next should be next (ffffffd6c7c5a6a8),
> > but was ffffffc600a3bcc8. (prev=ffffffc600a3bcc8).
> > WARNING: lib/list_debug.c:32 at __list_add_valid_or_report+
> > 0xd8/0xe2, CPU#0: bash/466
> > Modules linked in: dwmac_starfive stmmac_platform stmmac pcs_xpcs phylink
> > CPU: 0 UID: 0 PID: 466 Comm: bash Tainted: G W 6.19.0-rc2+ #1 PREEMPTLAZY
> > Tainted: [W]=WARN
> > Hardware name: StarFive VisionFive 2 v1.3B (DT)
> > epc : __list_add_valid_or_report+0xd8/0xe2
> > ra : __list_add_valid_or_report+0xd8/0xe2
> > epc : ffffffff80540bce ra : ffffffff80540bce sp : ffffffc600a3bc00
> > gp : ffffffff81caec40 tp : ffffffd6c036f080 t0 : 0000000000000000
> > t1 : 0000000000006000 t2 : 0000000000000002 s0 : ffffffc600a3bc30
> > s1 : ffffffc600a3bcc8 a0 : ffffffd6fbf49a40 a1 : ffffffd6c036f080
> > a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000
> > a5 : 0000000000000000 a6 : 0000000020000000 a7 : 0000000000000001
> > s2 : ffffffd6c7c5a6a8 s3 : ffffffc600a3bcc8 s4 : ffffffc600a3bcc8
> > s5 : ffffffd6c7c5a6b8 s6 : ffffffd6c7c5a6a8 s7 : 0000003ff3f32794
> > s8 : 0000002ab38c9118 s9 : 0000000000000065 s10: 0000003f823a5cb8
> > s11: 0000003f823264e8 t3 : 0000000000000001 t4 : 0000000000000000
> > t5 : 00000000fa83b2da t6 : 000000000051df90
> > status: 0000000200000120 badaddr: 0000000000000000 cause: 0000000000000003
> > [<ffffffff80540bce>] __list_add_valid_or_report+0xd8/0xe2
> > [<ffffffff80255c86>] damon_call+0x52/0xe8
> > [<ffffffff8025c9a8>] damon_sysfs_damon_call+0x60/0x8a
> > [<ffffffff8025daf4>] state_store+0xfc/0x294
> > [<ffffffff80dbf1fa>] kobj_attr_store+0xe/0x1a
> > [<ffffffff802f070c>] sysfs_kf_write+0x42/0x56
> > [<ffffffff802eef4e>] kernfs_fop_write_iter+0xf4/0x178
> > [<ffffffff8026545c>] vfs_write+0x1b6/0x3b2
> > [<ffffffff80265782>] ksys_write+0x52/0xbc
> > [<ffffffff80265800>] __riscv_sys_write+0x14/0x1c
> > [<ffffffff80ddf124>] do_trap_ecall_u+0x19c/0x26e
> > [<ffffffff80deaa38>] handle_exception+0x150/0x15c
> > ---[ end trace 0000000000000000 ]---
> > -bash: echo: write error: Invalid argument
>
> Thank you for finding issue!
>
> Also appreciate for sharing your detailed reproducer. Nevertheless, I think
> the reproducer can be more detailed. E.g., you could explicitly explain the
> fact that the reproduction step should be executed only after starting DAMON
> with the kdamond, and the kernel should run with CONFIG_lIST_HARDENED to get
> the output from the kernel log.
Yes, as you said, I ran it under CONFIG_LIST_HARDENED=y condition.
>
> >
> > The cause of the above error is that list_add_tail() is executed
> > repeatedly while executing damon_call(ctx, control)
> > in damon_sysfs_damon_call(). The execution flow is summarized below:
> >
> > damon_sysfs_damon_call()
> > --> damon_call(ctx, control)
> > list_add_tail(control, ctx->call_contols);
> > --> /* list_add corruption error */
> > if (!damon_is_running)
> > return -EINVAL;
> >
> > If you execute damon_call() when damon_sysfs_kdamond_running() is true,
> > you can prevent the error of duplicate execution of list_add_tail().
>
> The kdamond might be terminated between the damon_call() call and the
> damon_is_running() check inside the damon_call() execution. In the case, the
> problem may still happen.
>
> The problem happens because damon_call() is not removing the damon_call_control
> object before returning the error, right? What about removing the object
> before returning the error?
damon_call() is called after damon_start() --> kdamond_fn() is executed,
This is a problem because damon_call() also occurs when kdamond is "off"
only in damon/sysfs. So, my first patch solved the problem, but the
following also worked. I tested it.
And it seems better to keep the existing method of releasing
damon_call_control. Since the damon_call_control structure uses both
static and kmalloc(), it's appropriate to release it in kdamond_fn() according
to the condition control->canceled && control->dealloc_on_cancel.
My previous suggestion regarding this:
https://lore.kernel.org/damon/20251206224724.13832-1-rgbi3307@gmail.com/
diff --git a/mm/damon/core.c b/mm/damon/core.c
index babad37719b6..2ead0bb3c462 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1462,6 +1462,9 @@ bool damon_is_running(struct damon_ctx *ctx)
*/
int damon_call(struct damon_ctx *ctx, struct damon_call_control *control)
{
+ if (!damon_is_running(ctx))
+ return -EINVAL;
+
if (!control->repeat)
init_completion(&control->completion);
control->canceled = false;
@@ -1470,8 +1473,6 @@ int damon_call(struct damon_ctx *ctx, struct
damon_call_control *control)
mutex_lock(&ctx->call_controls_lock);
list_add_tail(&control->list, &ctx->call_controls);
mutex_unlock(&ctx->call_controls_lock);
- if (!damon_is_running(ctx))
- return -EINVAL;
if (control->repeat)
return 0;
wait_for_completion(&control->completion);
>
> >
> > Signed-off-by: JaeJoon Jung <rgbi3307@gmail.com>
>
> Could you please also add Fixes: and Cc: stable@ ?
I don't have much experience with this, so I'm sorry,
but could you please give me an example about this?
>
>
> Thanks,
> SJ
>
> [...]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/damon/sysfs: preventing duplicated list_add_tail() at the damon_call()
2025-12-25 2:35 ` JaeJoon Jung
@ 2025-12-25 19:49 ` SeongJae Park
2025-12-26 1:48 ` JaeJoon Jung
0 siblings, 1 reply; 10+ messages in thread
From: SeongJae Park @ 2025-12-25 19:49 UTC (permalink / raw)
To: JaeJoon Jung; +Cc: SeongJae Park, damon, linux-mm, rgbi3307
On Thu, 25 Dec 2025 11:35:33 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> On Thu, 25 Dec 2025 at 09:32, SeongJae Park <sj@kernel.org> wrote:
> >
> > Hello JaeJoon,
> >
> > On Wed, 24 Dec 2025 18:43:58 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> >
> > > cd /sys/kernel/mm/damon/admin
> > > echo "off" > kdamonds/0/state
> > >
> > > echo "commit" > kdamonds/0/state
> > > echo "commit" > kdamonds/0/state
> > >
> > > If you repeat "commit" twice with the kdamonds/0/state set to "off"
> > > with the above command, list_add corruption error occurs as follows:
> > >
> > > 4-page vmalloc region starting at 0xffffffc600a38000 allocated at
> > > kernel_clone+0x44/0x41e
> > > ------------[ cut here ]------------
> > > list_add corruption. prev->next should be next (ffffffd6c7c5a6a8),
> > > but was ffffffc600a3bcc8. (prev=ffffffc600a3bcc8).
> > > WARNING: lib/list_debug.c:32 at __list_add_valid_or_report+
> > > 0xd8/0xe2, CPU#0: bash/466
> > > Modules linked in: dwmac_starfive stmmac_platform stmmac pcs_xpcs phylink
> > > CPU: 0 UID: 0 PID: 466 Comm: bash Tainted: G W 6.19.0-rc2+ #1 PREEMPTLAZY
> > > Tainted: [W]=WARN
> > > Hardware name: StarFive VisionFive 2 v1.3B (DT)
> > > epc : __list_add_valid_or_report+0xd8/0xe2
> > > ra : __list_add_valid_or_report+0xd8/0xe2
> > > epc : ffffffff80540bce ra : ffffffff80540bce sp : ffffffc600a3bc00
> > > gp : ffffffff81caec40 tp : ffffffd6c036f080 t0 : 0000000000000000
> > > t1 : 0000000000006000 t2 : 0000000000000002 s0 : ffffffc600a3bc30
> > > s1 : ffffffc600a3bcc8 a0 : ffffffd6fbf49a40 a1 : ffffffd6c036f080
> > > a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000
> > > a5 : 0000000000000000 a6 : 0000000020000000 a7 : 0000000000000001
> > > s2 : ffffffd6c7c5a6a8 s3 : ffffffc600a3bcc8 s4 : ffffffc600a3bcc8
> > > s5 : ffffffd6c7c5a6b8 s6 : ffffffd6c7c5a6a8 s7 : 0000003ff3f32794
> > > s8 : 0000002ab38c9118 s9 : 0000000000000065 s10: 0000003f823a5cb8
> > > s11: 0000003f823264e8 t3 : 0000000000000001 t4 : 0000000000000000
> > > t5 : 00000000fa83b2da t6 : 000000000051df90
> > > status: 0000000200000120 badaddr: 0000000000000000 cause: 0000000000000003
> > > [<ffffffff80540bce>] __list_add_valid_or_report+0xd8/0xe2
> > > [<ffffffff80255c86>] damon_call+0x52/0xe8
> > > [<ffffffff8025c9a8>] damon_sysfs_damon_call+0x60/0x8a
> > > [<ffffffff8025daf4>] state_store+0xfc/0x294
> > > [<ffffffff80dbf1fa>] kobj_attr_store+0xe/0x1a
> > > [<ffffffff802f070c>] sysfs_kf_write+0x42/0x56
> > > [<ffffffff802eef4e>] kernfs_fop_write_iter+0xf4/0x178
> > > [<ffffffff8026545c>] vfs_write+0x1b6/0x3b2
> > > [<ffffffff80265782>] ksys_write+0x52/0xbc
> > > [<ffffffff80265800>] __riscv_sys_write+0x14/0x1c
> > > [<ffffffff80ddf124>] do_trap_ecall_u+0x19c/0x26e
> > > [<ffffffff80deaa38>] handle_exception+0x150/0x15c
> > > ---[ end trace 0000000000000000 ]---
> > > -bash: echo: write error: Invalid argument
> >
> > Thank you for finding issue!
> >
> > Also appreciate for sharing your detailed reproducer. Nevertheless, I think
> > the reproducer can be more detailed. E.g., you could explicitly explain the
> > fact that the reproduction step should be executed only after starting DAMON
> > with the kdamond, and the kernel should run with CONFIG_lIST_HARDENED to get
> > the output from the kernel log.
>
> Yes, as you said, I ran it under CONFIG_LIST_HARDENED=y condition.
Thank you for confirming.
>
> >
> > >
> > > The cause of the above error is that list_add_tail() is executed
> > > repeatedly while executing damon_call(ctx, control)
> > > in damon_sysfs_damon_call(). The execution flow is summarized below:
> > >
> > > damon_sysfs_damon_call()
> > > --> damon_call(ctx, control)
> > > list_add_tail(control, ctx->call_contols);
> > > --> /* list_add corruption error */
> > > if (!damon_is_running)
> > > return -EINVAL;
> > >
> > > If you execute damon_call() when damon_sysfs_kdamond_running() is true,
> > > you can prevent the error of duplicate execution of list_add_tail().
> >
> > The kdamond might be terminated between the damon_call() call and the
> > damon_is_running() check inside the damon_call() execution. In the case, the
> > problem may still happen.
> >
> > The problem happens because damon_call() is not removing the damon_call_control
> > object before returning the error, right? What about removing the object
> > before returning the error?
>
> damon_call() is called after damon_start() --> kdamond_fn() is executed,
> This is a problem because damon_call() also occurs when kdamond is "off"
> only in damon/sysfs. So, my first patch solved the problem, but the
> following also worked. I tested it.
>
> And it seems better to keep the existing method of releasing
> damon_call_control. Since the damon_call_control structure uses both
> static and kmalloc(), it's appropriate to release it in kdamond_fn() according
> to the condition control->canceled && control->dealloc_on_cancel.
>
> My previous suggestion regarding this:
> https://lore.kernel.org/damon/20251206224724.13832-1-rgbi3307@gmail.com/
>
> diff --git a/mm/damon/core.c b/mm/damon/core.c
> index babad37719b6..2ead0bb3c462 100644
> --- a/mm/damon/core.c
> +++ b/mm/damon/core.c
> @@ -1462,6 +1462,9 @@ bool damon_is_running(struct damon_ctx *ctx)
> */
> int damon_call(struct damon_ctx *ctx, struct damon_call_control *control)
> {
> + if (!damon_is_running(ctx))
> + return -EINVAL;
> +
> if (!control->repeat)
> init_completion(&control->completion);
> control->canceled = false;
> @@ -1470,8 +1473,6 @@ int damon_call(struct damon_ctx *ctx, struct
> damon_call_control *control)
> mutex_lock(&ctx->call_controls_lock);
> list_add_tail(&control->list, &ctx->call_controls);
> mutex_unlock(&ctx->call_controls_lock);
> - if (!damon_is_running(ctx))
> - return -EINVAL;
> if (control->repeat)
> return 0;
> wait_for_completion(&control->completion);
Let's assume DAMON is terminated between the damon_is_running() and
list_add_tail(). In the case, the control->fn() will never be called back. If
control->repeat is false, this function will even inifnitely wait.
I think we should keep the damon_is_running() as is, but further check if it
was terminated without handling the control object, and remove it from the list
in the case. Like below.
diff --git a/mm/damon/core.c b/mm/damon/core.c
index c3ae55b940cc..23d8602f5a49 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1649,6 +1649,35 @@ bool damon_is_running(struct damon_ctx *ctx)
return running;
}
+/*
+ * damon_call_handle_inactive_ctx() - handle DAMON call request that added to
+ * an inactive context.
+ * @ctx: The inactive DAMON context.
+ * @control: Control variable of the call request.
+ *
+ * This function is called in a case that @control is added to @ctx but @ctx is
+ * not running (inactive). See if @ctx handled @control or not, and cleanup
+ * @control if it was not handled.
+ *
+ * Returns 0 if @control was handled by @ctx, negative error code otherwise.
+ */
+static int damon_call_handle_inactive_ctx(
+ struct damon_ctx *ctx, struct damon_call_control *control)
+{
+ struct damon_call_control *c;
+
+ mutex_lock(&ctx->call_controls_lock);
+ list_for_each_entry(c, &ctx->call_controls, list) {
+ if (c == control) {
+ list_del(&control->list);
+ mutex_unlock(&ctx->call_controls_lock);
+ return -EINVAL;
+ }
+ }
+ mutex_unlock(&ctx->call_controls_lock);
+ return 0;
+}
+
/**
* damon_call() - Invoke a given function on DAMON worker thread (kdamond).
* @ctx: DAMON context to call the function for.
@@ -1679,7 +1708,7 @@ int damon_call(struct damon_ctx *ctx, struct damon_call_control *control)
list_add_tail(&control->list, &ctx->call_controls);
mutex_unlock(&ctx->call_controls_lock);
if (!damon_is_running(ctx))
- return -EINVAL;
+ return damon_call_handle_inactive_ctx(ctx, control);
if (control->repeat)
return 0;
wait_for_completion(&control->completion);
If you don't mind, I'll post the above diff as a patch, adding a 'Reported-by:'
tag for you.
>
> >
> > >
> > > Signed-off-by: JaeJoon Jung <rgbi3307@gmail.com>
> >
> > Could you please also add Fixes: and Cc: stable@ ?
>
> I don't have much experience with this, so I'm sorry,
> but could you please give me an example about this?
No problem.
Please refer to https://docs.kernel.org/process/stable-kernel-rules.html
Thanks,
SJ
[...]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/damon/sysfs: preventing duplicated list_add_tail() at the damon_call()
2025-12-25 19:49 ` SeongJae Park
@ 2025-12-26 1:48 ` JaeJoon Jung
2025-12-26 18:41 ` SeongJae Park
0 siblings, 1 reply; 10+ messages in thread
From: JaeJoon Jung @ 2025-12-26 1:48 UTC (permalink / raw)
To: SeongJae Park; +Cc: damon, linux-mm, rgbi3307
On Fri, 26 Dec 2025 at 04:50, SeongJae Park <sj@kernel.org> wrote:
>
> On Thu, 25 Dec 2025 11:35:33 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
>
> > On Thu, 25 Dec 2025 at 09:32, SeongJae Park <sj@kernel.org> wrote:
> > >
> > > Hello JaeJoon,
> > >
> > > On Wed, 24 Dec 2025 18:43:58 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> > >
> > > > cd /sys/kernel/mm/damon/admin
> > > > echo "off" > kdamonds/0/state
> > > >
> > > > echo "commit" > kdamonds/0/state
> > > > echo "commit" > kdamonds/0/state
> > > >
> > > > If you repeat "commit" twice with the kdamonds/0/state set to "off"
> > > > with the above command, list_add corruption error occurs as follows:
> > > >
> > > > 4-page vmalloc region starting at 0xffffffc600a38000 allocated at
> > > > kernel_clone+0x44/0x41e
> > > > ------------[ cut here ]------------
> > > > list_add corruption. prev->next should be next (ffffffd6c7c5a6a8),
> > > > but was ffffffc600a3bcc8. (prev=ffffffc600a3bcc8).
> > > > WARNING: lib/list_debug.c:32 at __list_add_valid_or_report+
> > > > 0xd8/0xe2, CPU#0: bash/466
> > > > Modules linked in: dwmac_starfive stmmac_platform stmmac pcs_xpcs phylink
> > > > CPU: 0 UID: 0 PID: 466 Comm: bash Tainted: G W 6.19.0-rc2+ #1 PREEMPTLAZY
> > > > Tainted: [W]=WARN
> > > > Hardware name: StarFive VisionFive 2 v1.3B (DT)
> > > > epc : __list_add_valid_or_report+0xd8/0xe2
> > > > ra : __list_add_valid_or_report+0xd8/0xe2
> > > > epc : ffffffff80540bce ra : ffffffff80540bce sp : ffffffc600a3bc00
> > > > gp : ffffffff81caec40 tp : ffffffd6c036f080 t0 : 0000000000000000
> > > > t1 : 0000000000006000 t2 : 0000000000000002 s0 : ffffffc600a3bc30
> > > > s1 : ffffffc600a3bcc8 a0 : ffffffd6fbf49a40 a1 : ffffffd6c036f080
> > > > a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000
> > > > a5 : 0000000000000000 a6 : 0000000020000000 a7 : 0000000000000001
> > > > s2 : ffffffd6c7c5a6a8 s3 : ffffffc600a3bcc8 s4 : ffffffc600a3bcc8
> > > > s5 : ffffffd6c7c5a6b8 s6 : ffffffd6c7c5a6a8 s7 : 0000003ff3f32794
> > > > s8 : 0000002ab38c9118 s9 : 0000000000000065 s10: 0000003f823a5cb8
> > > > s11: 0000003f823264e8 t3 : 0000000000000001 t4 : 0000000000000000
> > > > t5 : 00000000fa83b2da t6 : 000000000051df90
> > > > status: 0000000200000120 badaddr: 0000000000000000 cause: 0000000000000003
> > > > [<ffffffff80540bce>] __list_add_valid_or_report+0xd8/0xe2
> > > > [<ffffffff80255c86>] damon_call+0x52/0xe8
> > > > [<ffffffff8025c9a8>] damon_sysfs_damon_call+0x60/0x8a
> > > > [<ffffffff8025daf4>] state_store+0xfc/0x294
> > > > [<ffffffff80dbf1fa>] kobj_attr_store+0xe/0x1a
> > > > [<ffffffff802f070c>] sysfs_kf_write+0x42/0x56
> > > > [<ffffffff802eef4e>] kernfs_fop_write_iter+0xf4/0x178
> > > > [<ffffffff8026545c>] vfs_write+0x1b6/0x3b2
> > > > [<ffffffff80265782>] ksys_write+0x52/0xbc
> > > > [<ffffffff80265800>] __riscv_sys_write+0x14/0x1c
> > > > [<ffffffff80ddf124>] do_trap_ecall_u+0x19c/0x26e
> > > > [<ffffffff80deaa38>] handle_exception+0x150/0x15c
> > > > ---[ end trace 0000000000000000 ]---
> > > > -bash: echo: write error: Invalid argument
> > >
> > > Thank you for finding issue!
> > >
> > > Also appreciate for sharing your detailed reproducer. Nevertheless, I think
> > > the reproducer can be more detailed. E.g., you could explicitly explain the
> > > fact that the reproduction step should be executed only after starting DAMON
> > > with the kdamond, and the kernel should run with CONFIG_lIST_HARDENED to get
> > > the output from the kernel log.
> >
> > Yes, as you said, I ran it under CONFIG_LIST_HARDENED=y condition.
>
> Thank you for confirming.
>
> >
> > >
> > > >
> > > > The cause of the above error is that list_add_tail() is executed
> > > > repeatedly while executing damon_call(ctx, control)
> > > > in damon_sysfs_damon_call(). The execution flow is summarized below:
> > > >
> > > > damon_sysfs_damon_call()
> > > > --> damon_call(ctx, control)
> > > > list_add_tail(control, ctx->call_contols);
> > > > --> /* list_add corruption error */
> > > > if (!damon_is_running)
> > > > return -EINVAL;
> > > >
> > > > If you execute damon_call() when damon_sysfs_kdamond_running() is true,
> > > > you can prevent the error of duplicate execution of list_add_tail().
> > >
> > > The kdamond might be terminated between the damon_call() call and the
> > > damon_is_running() check inside the damon_call() execution. In the case, the
> > > problem may still happen.
> > >
> > > The problem happens because damon_call() is not removing the damon_call_control
> > > object before returning the error, right? What about removing the object
> > > before returning the error?
> >
> > damon_call() is called after damon_start() --> kdamond_fn() is executed,
> > This is a problem because damon_call() also occurs when kdamond is "off"
> > only in damon/sysfs. So, my first patch solved the problem, but the
> > following also worked. I tested it.
> >
> > And it seems better to keep the existing method of releasing
> > damon_call_control. Since the damon_call_control structure uses both
> > static and kmalloc(), it's appropriate to release it in kdamond_fn() according
> > to the condition control->canceled && control->dealloc_on_cancel.
> >
> > My previous suggestion regarding this:
> > https://lore.kernel.org/damon/20251206224724.13832-1-rgbi3307@gmail.com/
> >
> > diff --git a/mm/damon/core.c b/mm/damon/core.c
> > index babad37719b6..2ead0bb3c462 100644
> > --- a/mm/damon/core.c
> > +++ b/mm/damon/core.c
> > @@ -1462,6 +1462,9 @@ bool damon_is_running(struct damon_ctx *ctx)
> > */
> > int damon_call(struct damon_ctx *ctx, struct damon_call_control *control)
> > {
> > + if (!damon_is_running(ctx))
> > + return -EINVAL;
> > +
> > if (!control->repeat)
> > init_completion(&control->completion);
> > control->canceled = false;
> > @@ -1470,8 +1473,6 @@ int damon_call(struct damon_ctx *ctx, struct
> > damon_call_control *control)
> > mutex_lock(&ctx->call_controls_lock);
> > list_add_tail(&control->list, &ctx->call_controls);
> > mutex_unlock(&ctx->call_controls_lock);
> > - if (!damon_is_running(ctx))
> > - return -EINVAL;
> > if (control->repeat)
> > return 0;
> > wait_for_completion(&control->completion);
>
> Let's assume DAMON is terminated between the damon_is_running() and
> list_add_tail(). In the case, the control->fn() will never be called back. If
> control->repeat is false, this function will even inifnitely wait.
As you said, there are cases where kdamond is terminated(stopped) in
damon_is_running() and list_add_tail(). It may be a very rare case, but
I missed this case.
>
> I think we should keep the damon_is_running() as is, but further check if it
> was terminated without handling the control object, and remove it from the list
> in the case. Like below.
>
> diff --git a/mm/damon/core.c b/mm/damon/core.c
> index c3ae55b940cc..23d8602f5a49 100644
> --- a/mm/damon/core.c
> +++ b/mm/damon/core.c
> @@ -1649,6 +1649,35 @@ bool damon_is_running(struct damon_ctx *ctx)
> return running;
> }
>
> +/*
> + * damon_call_handle_inactive_ctx() - handle DAMON call request that added to
> + * an inactive context.
> + * @ctx: The inactive DAMON context.
> + * @control: Control variable of the call request.
> + *
> + * This function is called in a case that @control is added to @ctx but @ctx is
> + * not running (inactive). See if @ctx handled @control or not, and cleanup
> + * @control if it was not handled.
> + *
> + * Returns 0 if @control was handled by @ctx, negative error code otherwise.
> + */
> +static int damon_call_handle_inactive_ctx(
> + struct damon_ctx *ctx, struct damon_call_control *control)
> +{
> + struct damon_call_control *c;
> +
> + mutex_lock(&ctx->call_controls_lock);
> + list_for_each_entry(c, &ctx->call_controls, list) {
> + if (c == control) {
> + list_del(&control->list);
> + mutex_unlock(&ctx->call_controls_lock);
> + return -EINVAL;
> + }
> + }
> + mutex_unlock(&ctx->call_controls_lock);
> + return 0;
> +}
> +
> /**
> * damon_call() - Invoke a given function on DAMON worker thread (kdamond).
> * @ctx: DAMON context to call the function for.
> @@ -1679,7 +1708,7 @@ int damon_call(struct damon_ctx *ctx, struct damon_call_control *control)
> list_add_tail(&control->list, &ctx->call_controls);
> mutex_unlock(&ctx->call_controls_lock);
> if (!damon_is_running(ctx))
> - return -EINVAL;
> + return damon_call_handle_inactive_ctx(ctx, control);
> if (control->repeat)
> return 0;
> wait_for_completion(&control->completion);
>
However, the damon_call_handle_inactive_ctx() function is to post-process
the duplicate addition of control->list. Rather, it is more efficient to
prevent duplicate additions in advance, as follows:
I have tested the following and it works fine.
@@ -1467,11 +1496,14 @@ int damon_call(struct damon_ctx *ctx, struct
damon_call_control *control)
control->canceled = false;
INIT_LIST_HEAD(&control->list);
- mutex_lock(&ctx->call_controls_lock);
- list_add_tail(&control->list, &ctx->call_controls);
- mutex_unlock(&ctx->call_controls_lock);
- if (!damon_is_running(ctx))
+ if (damon_is_running(ctx)) {
+ mutex_lock(&ctx->call_controls_lock);
+ list_add_tail(&control->list, &ctx->call_controls);
+ mutex_unlock(&ctx->call_controls_lock);
+ } else {
+ /* return damon_call_handle_inactive_ctx(ctx, control); */
return -EINVAL;
+ }
if (control->repeat)
return 0;
wait_for_completion(&control->completion);
> If you don't mind, I'll post the above diff as a patch, adding a 'Reported-by:'
> tag for you.
'Reported-by:' is OK. However, please check the above again.
>
> >
> > >
> > > >
> > > > Signed-off-by: JaeJoon Jung <rgbi3307@gmail.com>
> > >
> > > Could you please also add Fixes: and Cc: stable@ ?
> >
> > I don't have much experience with this, so I'm sorry,
> > but could you please give me an example about this?
>
> No problem.
>
> Please refer to https://docs.kernel.org/process/stable-kernel-rules.html
Thanks,
JaeJoon
>
>
> Thanks,
> SJ
>
> [...]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/damon/sysfs: preventing duplicated list_add_tail() at the damon_call()
2025-12-26 1:48 ` JaeJoon Jung
@ 2025-12-26 18:41 ` SeongJae Park
2025-12-26 23:53 ` JaeJoon Jung
0 siblings, 1 reply; 10+ messages in thread
From: SeongJae Park @ 2025-12-26 18:41 UTC (permalink / raw)
To: JaeJoon Jung; +Cc: SeongJae Park, damon, linux-mm, rgbi3307
On Fri, 26 Dec 2025 10:48:31 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> On Fri, 26 Dec 2025 at 04:50, SeongJae Park <sj@kernel.org> wrote:
> >
> > On Thu, 25 Dec 2025 11:35:33 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> >
> > > On Thu, 25 Dec 2025 at 09:32, SeongJae Park <sj@kernel.org> wrote:
> > > >
> > > > Hello JaeJoon,
> > > >
> > > > On Wed, 24 Dec 2025 18:43:58 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
[...]
> > > diff --git a/mm/damon/core.c b/mm/damon/core.c
> > > index babad37719b6..2ead0bb3c462 100644
> > > --- a/mm/damon/core.c
> > > +++ b/mm/damon/core.c
> > > @@ -1462,6 +1462,9 @@ bool damon_is_running(struct damon_ctx *ctx)
> > > */
> > > int damon_call(struct damon_ctx *ctx, struct damon_call_control *control)
> > > {
> > > + if (!damon_is_running(ctx))
> > > + return -EINVAL;
> > > +
> > > if (!control->repeat)
> > > init_completion(&control->completion);
> > > control->canceled = false;
> > > @@ -1470,8 +1473,6 @@ int damon_call(struct damon_ctx *ctx, struct
> > > damon_call_control *control)
> > > mutex_lock(&ctx->call_controls_lock);
> > > list_add_tail(&control->list, &ctx->call_controls);
> > > mutex_unlock(&ctx->call_controls_lock);
> > > - if (!damon_is_running(ctx))
> > > - return -EINVAL;
> > > if (control->repeat)
> > > return 0;
> > > wait_for_completion(&control->completion);
> >
> > Let's assume DAMON is terminated between the damon_is_running() and
> > list_add_tail(). In the case, the control->fn() will never be called back. If
> > control->repeat is false, this function will even inifnitely wait.
>
> As you said, there are cases where kdamond is terminated(stopped) in
> damon_is_running() and list_add_tail(). It may be a very rare case, but
> I missed this case.
>
> >
> > I think we should keep the damon_is_running() as is, but further check if it
> > was terminated without handling the control object, and remove it from the list
> > in the case. Like below.
[...]
> However, the damon_call_handle_inactive_ctx() function is to post-process
> the duplicate addition of control->list. Rather, it is more efficient to
> prevent duplicate additions in advance, as follows:
> I have tested the following and it works fine.
>
> @@ -1467,11 +1496,14 @@ int damon_call(struct damon_ctx *ctx, struct
> damon_call_control *control)
> control->canceled = false;
> INIT_LIST_HEAD(&control->list);
>
> - mutex_lock(&ctx->call_controls_lock);
> - list_add_tail(&control->list, &ctx->call_controls);
> - mutex_unlock(&ctx->call_controls_lock);
> - if (!damon_is_running(ctx))
> + if (damon_is_running(ctx)) {
> + mutex_lock(&ctx->call_controls_lock);
> + list_add_tail(&control->list, &ctx->call_controls);
> + mutex_unlock(&ctx->call_controls_lock);
> + } else {
> + /* return damon_call_handle_inactive_ctx(ctx, control); */
> return -EINVAL;
> + }
> if (control->repeat)
> return 0;
> wait_for_completion(&control->completion);
I think this is not differnt from your previous suggestion, and thus it has the
same issue. What if DAMON is terminated between damon_is_running() and
list_add_tail() call? Please let me know if I'm missing something.
>
> > If you don't mind, I'll post the above diff as a patch, adding a 'Reported-by:'
> > tag for you.
>
> 'Reported-by:' is OK. However, please check the above again.
Thank you!
Thanks,
SJ
[...]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/damon/sysfs: preventing duplicated list_add_tail() at the damon_call()
2025-12-26 18:41 ` SeongJae Park
@ 2025-12-26 23:53 ` JaeJoon Jung
2025-12-27 17:42 ` SeongJae Park
0 siblings, 1 reply; 10+ messages in thread
From: JaeJoon Jung @ 2025-12-26 23:53 UTC (permalink / raw)
To: SeongJae Park; +Cc: damon, linux-mm, rgbi3307
On Sat, 27 Dec 2025 at 03:41, SeongJae Park <sj@kernel.org> wrote:
>
> On Fri, 26 Dec 2025 10:48:31 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
>
> > On Fri, 26 Dec 2025 at 04:50, SeongJae Park <sj@kernel.org> wrote:
> > >
> > > On Thu, 25 Dec 2025 11:35:33 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> > >
> > > > On Thu, 25 Dec 2025 at 09:32, SeongJae Park <sj@kernel.org> wrote:
> > > > >
> > > > > Hello JaeJoon,
> > > > >
> > > > > On Wed, 24 Dec 2025 18:43:58 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> [...]
> > > > diff --git a/mm/damon/core.c b/mm/damon/core.c
> > > > index babad37719b6..2ead0bb3c462 100644
> > > > --- a/mm/damon/core.c
> > > > +++ b/mm/damon/core.c
> > > > @@ -1462,6 +1462,9 @@ bool damon_is_running(struct damon_ctx *ctx)
> > > > */
> > > > int damon_call(struct damon_ctx *ctx, struct damon_call_control *control)
> > > > {
> > > > + if (!damon_is_running(ctx))
> > > > + return -EINVAL;
> > > > +
> > > > if (!control->repeat)
> > > > init_completion(&control->completion);
> > > > control->canceled = false;
> > > > @@ -1470,8 +1473,6 @@ int damon_call(struct damon_ctx *ctx, struct
> > > > damon_call_control *control)
> > > > mutex_lock(&ctx->call_controls_lock);
> > > > list_add_tail(&control->list, &ctx->call_controls);
> > > > mutex_unlock(&ctx->call_controls_lock);
> > > > - if (!damon_is_running(ctx))
> > > > - return -EINVAL;
> > > > if (control->repeat)
> > > > return 0;
> > > > wait_for_completion(&control->completion);
> > >
> > > Let's assume DAMON is terminated between the damon_is_running() and
> > > list_add_tail(). In the case, the control->fn() will never be called back. If
> > > control->repeat is false, this function will even inifnitely wait.
> >
> > As you said, there are cases where kdamond is terminated(stopped) in
> > damon_is_running() and list_add_tail(). It may be a very rare case, but
> > I missed this case.
> >
> > >
> > > I think we should keep the damon_is_running() as is, but further check if it
> > > was terminated without handling the control object, and remove it from the list
> > > in the case. Like below.
> [...]
> > However, the damon_call_handle_inactive_ctx() function is to post-process
> > the duplicate addition of control->list. Rather, it is more efficient to
> > prevent duplicate additions in advance, as follows:
> > I have tested the following and it works fine.
> >
> > @@ -1467,11 +1496,14 @@ int damon_call(struct damon_ctx *ctx, struct
> > damon_call_control *control)
> > control->canceled = false;
> > INIT_LIST_HEAD(&control->list);
> >
> > - mutex_lock(&ctx->call_controls_lock);
> > - list_add_tail(&control->list, &ctx->call_controls);
> > - mutex_unlock(&ctx->call_controls_lock);
> > - if (!damon_is_running(ctx))
> > + if (damon_is_running(ctx)) {
> > + mutex_lock(&ctx->call_controls_lock);
> > + list_add_tail(&control->list, &ctx->call_controls);
> > + mutex_unlock(&ctx->call_controls_lock);
> > + } else {
> > + /* return damon_call_handle_inactive_ctx(ctx, control); */
> > return -EINVAL;
> > + }
> > if (control->repeat)
> > return 0;
> > wait_for_completion(&control->completion);
>
> I think this is not differnt from your previous suggestion, and thus it has the
> same issue. What if DAMON is terminated between damon_is_running() and
> list_add_tail() call? Please let me know if I'm missing something.
I think it is good idea to insert a barrier() between damon_is_running()
and list_add_tail() to prevent context-switching. What do you think this?
Thanks,
JaeJoon
>
> >
> > > If you don't mind, I'll post the above diff as a patch, adding a 'Reported-by:'
> > > tag for you.
> >
> > 'Reported-by:' is OK. However, please check the above again.
>
> Thank you!
>
>
> Thanks,
> SJ
>
> [...]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/damon/sysfs: preventing duplicated list_add_tail() at the damon_call()
2025-12-26 23:53 ` JaeJoon Jung
@ 2025-12-27 17:42 ` SeongJae Park
2025-12-29 3:38 ` JaeJoon Jung
0 siblings, 1 reply; 10+ messages in thread
From: SeongJae Park @ 2025-12-27 17:42 UTC (permalink / raw)
To: JaeJoon Jung; +Cc: SeongJae Park, damon, linux-mm, rgbi3307
On Sat, 27 Dec 2025 08:53:21 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> On Sat, 27 Dec 2025 at 03:41, SeongJae Park <sj@kernel.org> wrote:
> >
> > On Fri, 26 Dec 2025 10:48:31 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> >
> > > On Fri, 26 Dec 2025 at 04:50, SeongJae Park <sj@kernel.org> wrote:
> > > >
> > > > On Thu, 25 Dec 2025 11:35:33 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> > > >
> > > > > On Thu, 25 Dec 2025 at 09:32, SeongJae Park <sj@kernel.org> wrote:
> > > > > >
> > > > > > Hello JaeJoon,
> > > > > >
> > > > > > On Wed, 24 Dec 2025 18:43:58 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> > [...]
> > > > > diff --git a/mm/damon/core.c b/mm/damon/core.c
> > > > > index babad37719b6..2ead0bb3c462 100644
> > > > > --- a/mm/damon/core.c
> > > > > +++ b/mm/damon/core.c
> > > > > @@ -1462,6 +1462,9 @@ bool damon_is_running(struct damon_ctx *ctx)
> > > > > */
> > > > > int damon_call(struct damon_ctx *ctx, struct damon_call_control *control)
> > > > > {
> > > > > + if (!damon_is_running(ctx))
> > > > > + return -EINVAL;
> > > > > +
> > > > > if (!control->repeat)
> > > > > init_completion(&control->completion);
> > > > > control->canceled = false;
> > > > > @@ -1470,8 +1473,6 @@ int damon_call(struct damon_ctx *ctx, struct
> > > > > damon_call_control *control)
> > > > > mutex_lock(&ctx->call_controls_lock);
> > > > > list_add_tail(&control->list, &ctx->call_controls);
> > > > > mutex_unlock(&ctx->call_controls_lock);
> > > > > - if (!damon_is_running(ctx))
> > > > > - return -EINVAL;
> > > > > if (control->repeat)
> > > > > return 0;
> > > > > wait_for_completion(&control->completion);
> > > >
> > > > Let's assume DAMON is terminated between the damon_is_running() and
> > > > list_add_tail(). In the case, the control->fn() will never be called back. If
> > > > control->repeat is false, this function will even inifnitely wait.
> > >
> > > As you said, there are cases where kdamond is terminated(stopped) in
> > > damon_is_running() and list_add_tail(). It may be a very rare case, but
> > > I missed this case.
> > >
> > > >
> > > > I think we should keep the damon_is_running() as is, but further check if it
> > > > was terminated without handling the control object, and remove it from the list
> > > > in the case. Like below.
> > [...]
> > > However, the damon_call_handle_inactive_ctx() function is to post-process
> > > the duplicate addition of control->list. Rather, it is more efficient to
> > > prevent duplicate additions in advance, as follows:
> > > I have tested the following and it works fine.
> > >
> > > @@ -1467,11 +1496,14 @@ int damon_call(struct damon_ctx *ctx, struct
> > > damon_call_control *control)
> > > control->canceled = false;
> > > INIT_LIST_HEAD(&control->list);
> > >
> > > - mutex_lock(&ctx->call_controls_lock);
> > > - list_add_tail(&control->list, &ctx->call_controls);
> > > - mutex_unlock(&ctx->call_controls_lock);
> > > - if (!damon_is_running(ctx))
> > > + if (damon_is_running(ctx)) {
> > > + mutex_lock(&ctx->call_controls_lock);
> > > + list_add_tail(&control->list, &ctx->call_controls);
> > > + mutex_unlock(&ctx->call_controls_lock);
> > > + } else {
> > > + /* return damon_call_handle_inactive_ctx(ctx, control); */
> > > return -EINVAL;
> > > + }
> > > if (control->repeat)
> > > return 0;
> > > wait_for_completion(&control->completion);
> >
> > I think this is not differnt from your previous suggestion, and thus it has the
> > same issue. What if DAMON is terminated between damon_is_running() and
> > list_add_tail() call? Please let me know if I'm missing something.
>
> I think it is good idea to insert a barrier() between damon_is_running()
> and list_add_tail() to prevent context-switching. What do you think this?
I don't think barrier() works in the way. Please correct me if I'm wrong.
Thanks,
SJ
[...]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/damon/sysfs: preventing duplicated list_add_tail() at the damon_call()
2025-12-27 17:42 ` SeongJae Park
@ 2025-12-29 3:38 ` JaeJoon Jung
2025-12-29 15:14 ` SeongJae Park
0 siblings, 1 reply; 10+ messages in thread
From: JaeJoon Jung @ 2025-12-29 3:38 UTC (permalink / raw)
To: SeongJae Park; +Cc: damon, linux-mm, rgbi3307
On Sun, 28 Dec 2025 at 02:42, SeongJae Park <sj@kernel.org> wrote:
>
> On Sat, 27 Dec 2025 08:53:21 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
>
> > On Sat, 27 Dec 2025 at 03:41, SeongJae Park <sj@kernel.org> wrote:
> > >
> > > On Fri, 26 Dec 2025 10:48:31 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> > >
> > > > On Fri, 26 Dec 2025 at 04:50, SeongJae Park <sj@kernel.org> wrote:
> > > > >
> > > > > On Thu, 25 Dec 2025 11:35:33 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> > > > >
> > > > > > On Thu, 25 Dec 2025 at 09:32, SeongJae Park <sj@kernel.org> wrote:
> > > > > > >
> > > > > > > Hello JaeJoon,
> > > > > > >
> > > > > > > On Wed, 24 Dec 2025 18:43:58 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> > > [...]
> > > > > > diff --git a/mm/damon/core.c b/mm/damon/core.c
> > > > > > index babad37719b6..2ead0bb3c462 100644
> > > > > > --- a/mm/damon/core.c
> > > > > > +++ b/mm/damon/core.c
> > > > > > @@ -1462,6 +1462,9 @@ bool damon_is_running(struct damon_ctx *ctx)
> > > > > > */
> > > > > > int damon_call(struct damon_ctx *ctx, struct damon_call_control *control)
> > > > > > {
> > > > > > + if (!damon_is_running(ctx))
> > > > > > + return -EINVAL;
> > > > > > +
> > > > > > if (!control->repeat)
> > > > > > init_completion(&control->completion);
> > > > > > control->canceled = false;
> > > > > > @@ -1470,8 +1473,6 @@ int damon_call(struct damon_ctx *ctx, struct
> > > > > > damon_call_control *control)
> > > > > > mutex_lock(&ctx->call_controls_lock);
> > > > > > list_add_tail(&control->list, &ctx->call_controls);
> > > > > > mutex_unlock(&ctx->call_controls_lock);
> > > > > > - if (!damon_is_running(ctx))
> > > > > > - return -EINVAL;
> > > > > > if (control->repeat)
> > > > > > return 0;
> > > > > > wait_for_completion(&control->completion);
> > > > >
> > > > > Let's assume DAMON is terminated between the damon_is_running() and
> > > > > list_add_tail(). In the case, the control->fn() will never be called back. If
> > > > > control->repeat is false, this function will even inifnitely wait.
> > > >
> > > > As you said, there are cases where kdamond is terminated(stopped) in
> > > > damon_is_running() and list_add_tail(). It may be a very rare case, but
> > > > I missed this case.
> > > >
> > > > >
> > > > > I think we should keep the damon_is_running() as is, but further check if it
> > > > > was terminated without handling the control object, and remove it from the list
> > > > > in the case. Like below.
> > > [...]
> > > > However, the damon_call_handle_inactive_ctx() function is to post-process
> > > > the duplicate addition of control->list. Rather, it is more efficient to
> > > > prevent duplicate additions in advance, as follows:
> > > > I have tested the following and it works fine.
> > > >
> > > > @@ -1467,11 +1496,14 @@ int damon_call(struct damon_ctx *ctx, struct
> > > > damon_call_control *control)
> > > > control->canceled = false;
> > > > INIT_LIST_HEAD(&control->list);
> > > >
> > > > - mutex_lock(&ctx->call_controls_lock);
> > > > - list_add_tail(&control->list, &ctx->call_controls);
> > > > - mutex_unlock(&ctx->call_controls_lock);
> > > > - if (!damon_is_running(ctx))
> > > > + if (damon_is_running(ctx)) {
> > > > + mutex_lock(&ctx->call_controls_lock);
> > > > + list_add_tail(&control->list, &ctx->call_controls);
> > > > + mutex_unlock(&ctx->call_controls_lock);
> > > > + } else {
> > > > + /* return damon_call_handle_inactive_ctx(ctx, control); */
> > > > return -EINVAL;
> > > > + }
> > > > if (control->repeat)
> > > > return 0;
> > > > wait_for_completion(&control->completion);
> > >
> > > I think this is not differnt from your previous suggestion, and thus it has the
> > > same issue. What if DAMON is terminated between damon_is_running() and
> > > list_add_tail() call? Please let me know if I'm missing something.
> >
> > I think it is good idea to insert a barrier() between damon_is_running()
> > and list_add_tail() to prevent context-switching. What do you think this?
>
> I don't think barrier() works in the way. Please correct me if I'm wrong.
Yes, there is no need to use memory barriers. Since each kdamond runs
its
own damon_ctx, the concurrent access problem can be sufficiently
solved with
mutext_lock. The problem discussed so far can be solved by applying
mutex_lock to both ctx->kdamond and ctx->call_controls.
Please refer to the modified code below:
@@ -1496,14 +1502,15 @@ int damon_call(struct damon_ctx *ctx, struct
damon_call_control *control)
control->canceled = false;
INIT_LIST_HEAD(&control->list);
- if (damon_is_running(ctx)) {
- mutex_lock(&ctx->call_controls_lock);
+ mutex_lock(&ctx->call_controls_lock);
+ if (ctx->kdamond) {
list_add_tail(&control->list, &ctx->call_controls);
- mutex_unlock(&ctx->call_controls_lock);
} else {
- /* return damon_call_handle_inactive_ctx(ctx, control); */
+ mutex_unlock(&ctx->call_controls_lock);
return -EINVAL;
}
+ mutex_unlock(&ctx->call_controls_lock);
+
if (control->repeat)
return 0;
wait_for_completion(&control->completion);
>
>
> Thanks,
> SJ
>
> [...]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/damon/sysfs: preventing duplicated list_add_tail() at the damon_call()
2025-12-29 3:38 ` JaeJoon Jung
@ 2025-12-29 15:14 ` SeongJae Park
0 siblings, 0 replies; 10+ messages in thread
From: SeongJae Park @ 2025-12-29 15:14 UTC (permalink / raw)
To: JaeJoon Jung; +Cc: SeongJae Park, damon, linux-mm, rgbi3307
On Mon, 29 Dec 2025 12:38:58 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> On Sun, 28 Dec 2025 at 02:42, SeongJae Park <sj@kernel.org> wrote:
> >
> > On Sat, 27 Dec 2025 08:53:21 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> >
> > > On Sat, 27 Dec 2025 at 03:41, SeongJae Park <sj@kernel.org> wrote:
> > > >
> > > > On Fri, 26 Dec 2025 10:48:31 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> > > >
> > > > > On Fri, 26 Dec 2025 at 04:50, SeongJae Park <sj@kernel.org> wrote:
> > > > > >
> > > > > > On Thu, 25 Dec 2025 11:35:33 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> > > > > >
> > > > > > > On Thu, 25 Dec 2025 at 09:32, SeongJae Park <sj@kernel.org> wrote:
> > > > > > > >
> > > > > > > > Hello JaeJoon,
> > > > > > > >
> > > > > > > > On Wed, 24 Dec 2025 18:43:58 +0900 JaeJoon Jung <rgbi3307@gmail.com> wrote:
> > > > [...]
> > > > > > > diff --git a/mm/damon/core.c b/mm/damon/core.c
> > > > > > > index babad37719b6..2ead0bb3c462 100644
> > > > > > > --- a/mm/damon/core.c
> > > > > > > +++ b/mm/damon/core.c
> > > > > > > @@ -1462,6 +1462,9 @@ bool damon_is_running(struct damon_ctx *ctx)
> > > > > > > */
> > > > > > > int damon_call(struct damon_ctx *ctx, struct damon_call_control *control)
> > > > > > > {
> > > > > > > + if (!damon_is_running(ctx))
> > > > > > > + return -EINVAL;
> > > > > > > +
> > > > > > > if (!control->repeat)
> > > > > > > init_completion(&control->completion);
> > > > > > > control->canceled = false;
> > > > > > > @@ -1470,8 +1473,6 @@ int damon_call(struct damon_ctx *ctx, struct
> > > > > > > damon_call_control *control)
> > > > > > > mutex_lock(&ctx->call_controls_lock);
> > > > > > > list_add_tail(&control->list, &ctx->call_controls);
> > > > > > > mutex_unlock(&ctx->call_controls_lock);
> > > > > > > - if (!damon_is_running(ctx))
> > > > > > > - return -EINVAL;
> > > > > > > if (control->repeat)
> > > > > > > return 0;
> > > > > > > wait_for_completion(&control->completion);
> > > > > >
> > > > > > Let's assume DAMON is terminated between the damon_is_running() and
> > > > > > list_add_tail(). In the case, the control->fn() will never be called back. If
> > > > > > control->repeat is false, this function will even inifnitely wait.
> > > > >
> > > > > As you said, there are cases where kdamond is terminated(stopped) in
> > > > > damon_is_running() and list_add_tail(). It may be a very rare case, but
> > > > > I missed this case.
> > > > >
> > > > > >
> > > > > > I think we should keep the damon_is_running() as is, but further check if it
> > > > > > was terminated without handling the control object, and remove it from the list
> > > > > > in the case. Like below.
> > > > [...]
> > > > > However, the damon_call_handle_inactive_ctx() function is to post-process
> > > > > the duplicate addition of control->list. Rather, it is more efficient to
> > > > > prevent duplicate additions in advance, as follows:
> > > > > I have tested the following and it works fine.
> > > > >
> > > > > @@ -1467,11 +1496,14 @@ int damon_call(struct damon_ctx *ctx, struct
> > > > > damon_call_control *control)
> > > > > control->canceled = false;
> > > > > INIT_LIST_HEAD(&control->list);
> > > > >
> > > > > - mutex_lock(&ctx->call_controls_lock);
> > > > > - list_add_tail(&control->list, &ctx->call_controls);
> > > > > - mutex_unlock(&ctx->call_controls_lock);
> > > > > - if (!damon_is_running(ctx))
> > > > > + if (damon_is_running(ctx)) {
> > > > > + mutex_lock(&ctx->call_controls_lock);
> > > > > + list_add_tail(&control->list, &ctx->call_controls);
> > > > > + mutex_unlock(&ctx->call_controls_lock);
> > > > > + } else {
> > > > > + /* return damon_call_handle_inactive_ctx(ctx, control); */
> > > > > return -EINVAL;
> > > > > + }
> > > > > if (control->repeat)
> > > > > return 0;
> > > > > wait_for_completion(&control->completion);
> > > >
> > > > I think this is not differnt from your previous suggestion, and thus it has the
> > > > same issue. What if DAMON is terminated between damon_is_running() and
> > > > list_add_tail() call? Please let me know if I'm missing something.
> > >
> > > I think it is good idea to insert a barrier() between damon_is_running()
> > > and list_add_tail() to prevent context-switching. What do you think this?
> >
> > I don't think barrier() works in the way. Please correct me if I'm wrong.
>
> Yes, there is no need to use memory barriers. Since each kdamond runs
> its
> own damon_ctx, the concurrent access problem can be sufficiently
> solved with
> mutext_lock. The problem discussed so far can be solved by applying
> mutex_lock to both ctx->kdamond and ctx->call_controls.
> Please refer to the modified code below:
>
> @@ -1496,14 +1502,15 @@ int damon_call(struct damon_ctx *ctx, struct
> damon_call_control *control)
> control->canceled = false;
> INIT_LIST_HEAD(&control->list);
>
> - if (damon_is_running(ctx)) {
> - mutex_lock(&ctx->call_controls_lock);
> + mutex_lock(&ctx->call_controls_lock);
> + if (ctx->kdamond) {
> list_add_tail(&control->list, &ctx->call_controls);
> - mutex_unlock(&ctx->call_controls_lock);
> } else {
> - /* return damon_call_handle_inactive_ctx(ctx, control); */
> + mutex_unlock(&ctx->call_controls_lock);
> return -EINVAL;
> }
> + mutex_unlock(&ctx->call_controls_lock);
> +
> if (control->repeat)
> return 0;
> wait_for_completion(&control->completion);
This diff assumes holding ctx->call_controls_lock will avoid the context be
terminated, right? But there is no such guarantees.
Thanks,
SJ
[...]
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-12-29 15:14 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-12-24 9:43 [PATCH] mm/damon/sysfs: preventing duplicated list_add_tail() at the damon_call() JaeJoon Jung
2025-12-25 0:32 ` SeongJae Park
2025-12-25 2:35 ` JaeJoon Jung
2025-12-25 19:49 ` SeongJae Park
2025-12-26 1:48 ` JaeJoon Jung
2025-12-26 18:41 ` SeongJae Park
2025-12-26 23:53 ` JaeJoon Jung
2025-12-27 17:42 ` SeongJae Park
2025-12-29 3:38 ` JaeJoon Jung
2025-12-29 15:14 ` SeongJae Park
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox