* [BUG] mm/damon/core: dangling walk_control pointer in damos_walk() on inactive context
@ 2026-02-16 15:18 Raul Pazemecxas De Andrade
0 siblings, 0 replies; 4+ messages in thread
From: Raul Pazemecxas De Andrade @ 2026-02-16 15:18 UTC
To: sj; +Cc: damon, linux-mm, linux-kernel, security
I found a bug in damos_walk() that leaves a dangling walk_control
pointer when called on an inactive context. The pattern is
structurally identical to the bug fixed in commit f9132fbc2e83
("mm/damon/core: remove call_control in inactive contexts") for
damon_call().
## Description
damos_walk() sets ctx->walk_control to point to a caller-provided
stack-allocated control structure (core.c line 1560), then checks
if the DAMON context is running (line 1562). If the context is
inactive, it returns -EINVAL (line 1563) WITHOUT clearing
ctx->walk_control back to NULL.
This leaves a dangling pointer. Subsequent damos_walk() calls see
the stale non-NULL pointer and return -EBUSY, which locks the DAMOS
tried_regions interface until the DAMON context is destroyed.
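The sequence described above can be sketched as a small userspace model. This is only an illustration of the reported ordering, not the kernel source: the `model_*` names are hypothetical, the `running` flag stands in for the "is the context running" check, and locking and the wait for kdamond are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct model_walk_control {
	int unused;
};

struct model_ctx {
	bool running;                            /* models the "context is running" check */
	struct model_walk_control *walk_control; /* models ctx->walk_control */
};

/* Mirrors the reported order of operations: publish the pointer, then check. */
int model_damos_walk(struct model_ctx *ctx, struct model_walk_control *control)
{
	assert(ctx != NULL && control != NULL);

	if (ctx->walk_control)
		return -EBUSY;           /* a walk request is already registered */

	ctx->walk_control = control;     /* published before the running check */

	if (!ctx->running)
		return -EINVAL;          /* early return leaves the pointer set */

	/*
	 * The real function would wait for the worker thread here; the model
	 * simply pretends the walk completed and unregisters the request.
	 */
	ctx->walk_control = NULL;
	return 0;
}
```

In this model, calling `model_damos_walk()` twice on a context whose `running` flag is false returns `-EINVAL` the first time and `-EBUSY` the second time, because the first call leaves `walk_control` set. That matches the two results listed under "Tested output".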
## Affected versions
Introduced in: commit bf0eaba0ff9c ("mm/damon/core: implement damos_walk()")
First affected release: v6.14-rc1
Affected stable releases: v6.14, v6.15, v6.16, v6.17, v6.18, v6.19
Tested on: 6.19.0 (commit ca4ee40bf13d, QEMU/KVM x86_64)
Current mainline: UNFIXED
## Reproduction (confirmed on 6.19.0, CONFIG_DAMON=y CONFIG_DAMON_SYSFS=y)
DAMON=/sys/kernel/mm/damon/admin/kdamonds
# Setup context with scheme
echo 1 > $DAMON/nr_kdamonds
echo 1 > $DAMON/0/contexts/nr_contexts
echo vaddr > $DAMON/0/contexts/0/operations
echo 1 > $DAMON/0/contexts/0/targets/nr_targets
echo $$ > $DAMON/0/contexts/0/targets/0/pid_target
echo 1 > $DAMON/0/contexts/0/schemes/nr_schemes
echo stat > $DAMON/0/contexts/0/schemes/0/action
# Start then stop (ctx stays allocated per sysfs design)
echo on > $DAMON/0/state
sleep 1
echo off > $DAMON/0/state
sleep 1
# Trigger bug: damos_walk() on inactive context
echo "update_schemes_tried_regions" > $DAMON/0/state
# Returns -EINVAL, walk_control left dangling
# Confirm: second call gets -EBUSY (dangling pointer != NULL)
echo "update_schemes_tried_regions" > $DAMON/0/state
# Returns -EBUSY -- interface permanently locked
## Tested output
First call: -EINVAL (Invalid argument)
Second call: -EBUSY (Device or resource busy) <-- BUG confirmed
## Root cause
Commit bf0eaba0ff9c ("mm/damon/core: implement damos_walk()")
introduced this function without cleanup on the -EINVAL error path.
The sibling function damon_call() had the exact same bug and was
fixed in f9132fbc2e83 by adding damon_call_handle_inactive_ctx()
which removes the control object when the context is inactive.
damos_walk() has no equivalent cleanup.
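One possible shape for the missing cleanup, again as a userspace sketch rather than a kernel patch: when the context turns out to be inactive, the request is unregistered before returning. The `sketch_*` names are hypothetical, locking is omitted, and the check that the pointer still refers to this caller's control object is an assumption about what a safe cleanup would need, not something taken from f9132fbc2e83.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct sketch_walk_control {
	int unused;
};

struct sketch_ctx {
	bool running;                             /* models the "context is running" check */
	struct sketch_walk_control *walk_control; /* models ctx->walk_control */
};

/* Same flow as the reported function, plus cleanup on the inactive path. */
int sketch_damos_walk(struct sketch_ctx *ctx, struct sketch_walk_control *control)
{
	assert(ctx != NULL && control != NULL);

	if (ctx->walk_control)
		return -EBUSY;

	ctx->walk_control = control;

	if (!ctx->running) {
		/* Assumed cleanup: only drop the request this caller registered. */
		if (ctx->walk_control == control)
			ctx->walk_control = NULL;
		return -EINVAL;
	}

	/* Stand-in for waiting on the worker thread and completing the walk. */
	ctx->walk_control = NULL;
	return 0;
}
```

With this change, repeated calls on a stopped context keep returning `-EINVAL` instead of switching to `-EBUSY`, and a later call on a running context succeeds. A real fix would need to do the reset under the same lock that protects the pointer.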
## Impact
1. PERMANENT LOCKUP DOS: After on->off->update_schemes_tried_regions,
all future tried_regions queries return -EBUSY until the DAMON
context is destroyed.
2. DANGLING POINTER: ctx->walk_control points to stack memory of a
caller frame that has already returned.
The struct damos_walk_control contains a function pointer
(walk_fn). If any DAMON API consumer reuses the same ctx after
damos_walk() returns -EINVAL and kdamond is restarted, it would
dereference the dangling pointer in damos_walk_call_walk()
(which calls control->walk_fn) or damos_walk_cancel().
Reported-by: Raul Pazemécxas <raul_pazemecxas@hotmail.com>
Best regards,
Raul
* [BUG] mm/damon/core: dangling walk_control pointer in damos_walk() on inactive context
@ 2026-02-16 15:34 Raul Pazemecxas De Andrade
2026-02-16 15:52 ` Greg KH
0 siblings, 1 reply; 4+ messages in thread
From: Raul Pazemecxas De Andrade @ 2026-02-16 15:34 UTC
To: sj; +Cc: security, damon, linux-mm, linux-kernel
Hi,
I found a bug in damos_walk() that leaves a dangling walk_control
pointer when called on an inactive context. The pattern is
structurally identical to the bug fixed in commit f9132fbc2e83
("mm/damon/core: remove call_control in inactive contexts") for
damon_call().
Description
-----------
damos_walk() sets ctx->walk_control to point to a caller-provided
stack-allocated control structure (core.c line 1560), then checks
if the DAMON context is running (line 1562). If the context is
inactive, it returns -EINVAL (line 1563) WITHOUT clearing
ctx->walk_control back to NULL.
This leaves a dangling pointer. Subsequent damos_walk() calls see
the stale non-NULL pointer and return -EBUSY, which locks the DAMOS
tried_regions interface until the DAMON context is destroyed.
Affected versions
-----------------
Introduced in: commit bf0eaba0ff9c ("mm/damon/core: implement damos_walk()")
First affected release: v6.14-rc1
Affected stable releases: v6.14, v6.15, v6.16, v6.17, v6.18, v6.19
Tested on: 6.19.0 (commit ca4ee40bf13d, QEMU/KVM x86_64)
Current mainline: UNFIXED
Reproduction (confirmed on 6.19.0, CONFIG_DAMON=y CONFIG_DAMON_SYSFS=y)
------------------------------------------------------------------------
DAMON=/sys/kernel/mm/damon/admin/kdamonds
# Setup context with scheme
echo 1 > $DAMON/nr_kdamonds
echo 1 > $DAMON/0/contexts/nr_contexts
echo vaddr > $DAMON/0/contexts/0/operations
echo 1 > $DAMON/0/contexts/0/targets/nr_targets
echo $$ > $DAMON/0/contexts/0/targets/0/pid_target
echo 1 > $DAMON/0/contexts/0/schemes/nr_schemes
echo stat > $DAMON/0/contexts/0/schemes/0/action
# Start then stop (ctx stays allocated per sysfs design)
echo on > $DAMON/0/state
sleep 1
echo off > $DAMON/0/state
sleep 1
# Trigger bug: damos_walk() on inactive context
echo "update_schemes_tried_regions" > $DAMON/0/state
# Returns -EINVAL, walk_control left dangling
# Confirm: second call gets -EBUSY (dangling pointer != NULL)
echo "update_schemes_tried_regions" > $DAMON/0/state
# Returns -EBUSY -- interface permanently locked
Tested output
-------------
First call: -EINVAL (Invalid argument)
Second call: -EBUSY (Device or resource busy) <-- BUG confirmed
Root cause
----------
Commit bf0eaba0ff9c ("mm/damon/core: implement damos_walk()")
introduced this function without cleanup on the -EINVAL error path.
The sibling function damon_call() had the exact same bug and was
fixed in f9132fbc2e83 by adding damon_call_handle_inactive_ctx()
which removes the control object when the context is inactive.
damos_walk() has no equivalent cleanup.
Impact
------
1. PERMANENT LOCKUP: After on->off->update_schemes_tried_regions,
all future tried_regions queries return -EBUSY until the DAMON
context is destroyed.
2. DANGLING POINTER: ctx->walk_control points to stack memory of a
caller frame that has already returned.
The struct damos_walk_control contains a function pointer
(walk_fn). If any DAMON API consumer reuses the same ctx after
damos_walk() returns -EINVAL and kdamond is restarted, it would
dereference the dangling pointer in damos_walk_call_walk()
(which calls control->walk_fn) or damos_walk_cancel().
Reported-by: Raul <raul_pazemecxas@hotmail.com>
Best regards,
Raul
* Re: [BUG] mm/damon/core: dangling walk_control pointer in damos_walk() on inactive context
2026-02-16 15:34 Raul Pazemecxas De Andrade
@ 2026-02-16 15:52 ` Greg KH
2026-02-16 16:26 ` Raul Pazemecxas De Andrade
0 siblings, 1 reply; 4+ messages in thread
From: Greg KH @ 2026-02-16 15:52 UTC
To: Raul Pazemecxas De Andrade; +Cc: sj, security, damon, linux-mm, linux-kernel
On Mon, Feb 16, 2026 at 03:34:44PM +0000, Raul Pazemecxas De Andrade wrote:
> Root cause
> ----------
>
> Commit bf0eaba0ff9c ("mm/damon/core: implement damos_walk()")
> introduced this function without cleanup on the -EINVAL error path.
>
> The sibling function damon_call() had the exact same bug and was
> fixed in f9132fbc2e83 by adding damon_call_handle_inactive_ctx()
> which removes the control object when the context is inactive.
> damos_walk() has no equivalent cleanup.
Can you submit a patch to resolve this to get credit for fixing the bug?
thanks,
greg k-h
* RE: [BUG] mm/damon/core: dangling walk_control pointer in damos_walk() on inactive context
2026-02-16 15:52 ` Greg KH
@ 2026-02-16 16:26 ` Raul Pazemecxas De Andrade
0 siblings, 0 replies; 4+ messages in thread
From: Raul Pazemecxas De Andrade @ 2026-02-16 16:26 UTC
To: Greg KH; +Cc: sj, security, damon, linux-mm, linux-kernel
Thanks for your attention, Greg, and congratulations on your work on the kernel.
________________________________________
From: Greg KH <gregkh@linuxfoundation.org>
Sent: Monday, February 16, 2026 12:52
To: Raul Pazemecxas De Andrade <raul_pazemecxas@hotmail.com>
Cc: sj@kernel.org <sj@kernel.org>; security@kernel.org <security@kernel.org>; damon@lists.linux.dev <damon@lists.linux.dev>; linux-mm@kvack.org <linux-mm@kvack.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>
Subject: Re: [BUG] mm/damon/core: dangling walk_control pointer in damos_walk() on inactive context
On Mon, Feb 16, 2026 at 03:34:44PM +0000, Raul Pazemecxas De Andrade wrote:
> Root cause
> ----------
>
> Commit bf0eaba0ff9c ("mm/damon/core: implement damos_walk()")
> introduced this function without cleanup on the -EINVAL error path.
>
> The sibling function damon_call() had the exact same bug and was
> fixed in f9132fbc2e83 by adding damon_call_handle_inactive_ctx()
> which removes the control object when the context is inactive.
> damos_walk() has no equivalent cleanup.
Can you submit a patch to resolve this to get credit for fixing the bug?
thanks,
greg k-h