linux-mm.kvack.org archive mirror
* [BUG] mm/damon/core: dangling walk_control pointer in damos_walk() on inactive context
From: Raul Pazemecxas De Andrade @ 2026-02-16 15:18 UTC
  To: sj; +Cc: damon, linux-mm, linux-kernel, security

I found a bug in damos_walk() that leaves a dangling walk_control
pointer when called on an inactive context. The pattern is
structurally identical to the bug fixed in commit f9132fbc2e83
("mm/damon/core: remove call_control in inactive contexts") for
damon_call().

## Description

damos_walk() sets ctx->walk_control to point to a caller-provided,
stack-allocated control structure (mm/damon/core.c line 1560), then
checks whether the DAMON context is running (line 1562). If the
context is inactive, it returns -EINVAL (line 1563) without resetting
ctx->walk_control to NULL.

This leaves a dangling pointer. Subsequent damos_walk() calls see
the stale non-NULL pointer and return -EBUSY, locking the DAMOS
tried_regions interface until the context is destroyed.
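
The control flow described above can be sketched as a small user-space model. This is only an illustration under stated assumptions: `ctx_model`, `walk_control_model`, `damos_walk_model()` and `replay_two_calls()` are hypothetical stand-ins, not the kernel definitions, and locking and the wait for kdamond are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; NOT the kernel structures. */
struct walk_control_model {
	int unused;
};

struct ctx_model {
	bool running;                            /* models "kdamond is running" */
	struct walk_control_model *walk_control; /* models ctx->walk_control */
};

/* Models the reported order: publish the pointer first, check state second. */
static int damos_walk_model(struct ctx_model *ctx,
			    struct walk_control_model *control)
{
	if (ctx->walk_control)
		return -EBUSY;          /* a walk request is already registered */
	ctx->walk_control = control;    /* pointer is published here */
	if (!ctx->running)
		return -EINVAL;         /* early return leaves the pointer set */
	/* Simplification: treat the walk as completing immediately. */
	ctx->walk_control = NULL;
	return 0;
}

/* Replays the two calls from the report; returns the second call's result. */
static int replay_two_calls(void)
{
	struct ctx_model ctx = { .running = false, .walk_control = NULL };
	struct walk_control_model first = { 0 }, second = { 0 };
	int ret = damos_walk_model(&ctx, &first);

	assert(ret == -EINVAL);         /* first call: context is inactive */
	return damos_walk_model(&ctx, &second);
}
```

In this model the second call fails with -EBUSY only because the first call never withdrew its request, which matches the sequence in the report.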

## Affected versions

Introduced in: commit bf0eaba0ff9c ("mm/damon/core: implement damos_walk()")
First affected release: v6.14-rc1
Affected stable releases: v6.14, v6.15, v6.16, v6.17, v6.18, v6.19
Tested on: 6.19.0 (commit ca4ee40bf13d, QEMU/KVM x86_64)
Current mainline: UNFIXED

## Reproduction (confirmed on 6.19.0, CONFIG_DAMON=y CONFIG_DAMON_SYSFS=y)

  DAMON=/sys/kernel/mm/damon/admin/kdamonds

  # Setup context with scheme
  echo 1 > $DAMON/nr_kdamonds
  echo 1 > $DAMON/0/contexts/nr_contexts
  echo vaddr > $DAMON/0/contexts/0/operations
  echo 1 > $DAMON/0/contexts/0/targets/nr_targets
  echo $$ > $DAMON/0/contexts/0/targets/0/pid_target
  echo 1 > $DAMON/0/contexts/0/schemes/nr_schemes
  echo stat > $DAMON/0/contexts/0/schemes/0/action

  # Start then stop (ctx stays allocated per sysfs design)
  echo on > $DAMON/0/state
  sleep 1
  echo off > $DAMON/0/state
  sleep 1

  # Trigger bug: damos_walk() on inactive context
  echo "update_schemes_tried_regions" > $DAMON/0/state
  # Returns -EINVAL, walk_control left dangling

  # Confirm: second call gets -EBUSY (dangling pointer != NULL)
  echo "update_schemes_tried_regions" > $DAMON/0/state
  # Returns -EBUSY -- interface permanently locked

## Tested output

  First call:  -EINVAL (Invalid argument)
  Second call: -EBUSY (Device or resource busy) <-- BUG confirmed
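
The error codes above are what a write to the state file reports through errno. A minimal sketch of a helper that surfaces them from C follows; `write_cmd()` is a hypothetical name introduced here, and the only assumption about the interface is the sysfs path already used in the reproduction steps.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write cmd to path with a single write() call.
 * Returns 0 on success or a negative errno value on failure.
 */
static int write_cmd(const char *path, const char *cmd)
{
	int fd = open(path, O_WRONLY);
	ssize_t n;
	int ret;

	if (fd < 0)
		return -errno;          /* e.g. -ENOENT if DAMON sysfs is absent */
	n = write(fd, cmd, strlen(cmd));
	ret = (n < 0) ? -errno : 0;     /* capture errno before close() */
	close(fd);
	return ret;
}
```

Calling `write_cmd("/sys/kernel/mm/damon/admin/kdamonds/0/state", "update_schemes_tried_regions")` twice after the on/off sequence should, according to the report, yield -EINVAL and then -EBUSY on an affected kernel.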

## Root cause

Commit bf0eaba0ff9c ("mm/damon/core: implement damos_walk()")
introduced this function without cleanup on the -EINVAL error path.

The sibling function damon_call() had the exact same bug and was
fixed in f9132fbc2e83 by adding damon_call_handle_inactive_ctx()
which removes the control object when the context is inactive.
damos_walk() has no equivalent cleanup.
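
A cleanup along those lines can be sketched in the same kind of user-space model. This is not a proposed patch: `walk_handle_inactive_ctx_model()` is a hypothetical helper named by analogy with damon_call_handle_inactive_ctx(), the structures are simplified stand-ins, and locking is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; NOT the kernel structures. */
struct walk_control_model {
	int unused;
};

struct ctx_model {
	bool running;                            /* models "kdamond is running" */
	struct walk_control_model *walk_control; /* models ctx->walk_control */
};

/* Hypothetical helper: withdraw our request if it is still registered. */
static void walk_handle_inactive_ctx_model(struct ctx_model *ctx,
					   struct walk_control_model *control)
{
	if (ctx->walk_control == control)
		ctx->walk_control = NULL;
}

/* Same flow as the reported function, plus cleanup on the -EINVAL path. */
static int damos_walk_fixed_model(struct ctx_model *ctx,
				  struct walk_control_model *control)
{
	if (ctx->walk_control)
		return -EBUSY;
	ctx->walk_control = control;
	if (!ctx->running) {
		walk_handle_inactive_ctx_model(ctx, control);
		return -EINVAL;
	}
	/* Simplification: treat the walk as completing immediately. */
	ctx->walk_control = NULL;
	return 0;
}

/* Replays two calls on an inactive context; returns the second result. */
static int replay_two_calls_fixed(void)
{
	struct ctx_model ctx = { .running = false, .walk_control = NULL };
	struct walk_control_model first = { 0 }, second = { 0 };
	int ret = damos_walk_fixed_model(&ctx, &first);

	assert(ret == -EINVAL);
	assert(ctx.walk_control == NULL); /* no stale pointer is left behind */
	return damos_walk_fixed_model(&ctx, &second);
}
```

With the cleanup in place, the second call in the model reports -EINVAL again instead of -EBUSY, so the interface becomes usable once the context is restarted.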

## Impact

1. PERSISTENT LOCKUP (DoS): After on->off->update_schemes_tried_regions,
   all later tried_regions queries return -EBUSY until the DAMON
   context is destroyed.

2. DANGLING POINTER: ctx->walk_control points to stack memory that
   is no longer valid once the caller has returned. The struct
   damos_walk_control contains a function pointer (walk_fn). If any
   DAMON API consumer reuses the same ctx after damos_walk() returns
   -EINVAL and kdamond is restarted, the dangling pointer would be
   dereferenced in damos_walk_call_walk() (which calls
   control->walk_fn) or in damos_walk_cancel().

Reported-by: Raul Pazemécxas <raul_pazemecxas@hotmail.com>

Best regards,
Raul

Thread overview: 4+ messages
2026-02-16 15:18 [BUG] mm/damon/core: dangling walk_control pointer in damos_walk() on inactive context Raul Pazemecxas De Andrade
2026-02-16 15:34 Raul Pazemecxas De Andrade
2026-02-16 15:52 ` Greg KH
2026-02-16 16:26   ` Raul Pazemecxas De Andrade
