* [PATCH v2 0/4] mm/damon: fix divide by zero and its samples
@ 2025-07-01 8:19 Honggyu Kim
2025-07-01 8:19 ` [PATCH v2 1/4] samples/damon: fix damon sample prcl for start failure Honggyu Kim
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Honggyu Kim @ 2025-07-01 8:19 UTC (permalink / raw)
To: SeongJae Park, damon; +Cc: Andrew Morton, linux-mm, kernel_team, Honggyu Kim
This includes damon fixes and its samples to make it safer when damon
sample start fails.
It includes the following changes.
- fix unexpected divide by zero crash for zero size regions
- fix bugs for damon samples in case of start failures
Honggyu Kim (4):
samples/damon: fix damon sample prcl for start failure
samples/damon: fix damon sample wsse for start failure
samples/damon: fix damon sample mtier for start failure
mm/damon: fix divide by zero in damon_get_intervals_score()
mm/damon/core.c | 5 ++++-
samples/damon/mtier.c | 8 ++++++--
samples/damon/prcl.c | 8 ++++++--
samples/damon/wsse.c | 8 ++++++--
4 files changed, 22 insertions(+), 7 deletions(-)
base-commit: d0b3b7b22dfa1f4b515fd3a295b3fd958f9e81af
--
2.34.1
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 1/4] samples/damon: fix damon sample prcl for start failure
2025-07-01 8:19 [PATCH v2 0/4] mm/damon: fix divide by zero and its samples Honggyu Kim
@ 2025-07-01 8:19 ` Honggyu Kim
2025-07-01 8:19 ` [PATCH v2 2/4] samples/damon: fix damon sample wsse " Honggyu Kim
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Honggyu Kim @ 2025-07-01 8:19 UTC (permalink / raw)
To: SeongJae Park, damon; +Cc: Andrew Morton, linux-mm, kernel_team, Honggyu Kim
The damon_sample_prcl_start() can fail so we must reset the "enable"
parameter to "false" again for proper rollback.
In such cases, setting Y to "enable" then N triggers the following
crash because damon sample start failed but the "enable" stays as Y.
[ 2441.419649] damon_sample_prcl: start
[ 2454.146817] damon_sample_prcl: stop
[ 2454.146862] ------------[ cut here ]------------
[ 2454.146865] kernel BUG at mm/slub.c:546!
[ 2454.148183] Oops: invalid opcode: 0000 [#1] SMP NOPTI
...
[ 2454.167555] Call Trace:
[ 2454.167822] <TASK>
[ 2454.168061] damon_destroy_ctx+0x78/0x140
[ 2454.168454] damon_sample_prcl_enable_store+0x8d/0xd0
[ 2454.168932] param_attr_store+0xa1/0x120
[ 2454.169315] module_attr_store+0x20/0x50
[ 2454.169695] sysfs_kf_write+0x72/0x90
[ 2454.170065] kernfs_fop_write_iter+0x150/0x1e0
[ 2454.170491] vfs_write+0x315/0x440
[ 2454.170833] ksys_write+0x69/0xf0
[ 2454.171162] __x64_sys_write+0x19/0x30
[ 2454.171525] x64_sys_call+0x18b2/0x2700
[ 2454.171900] do_syscall_64+0x7f/0x680
[ 2454.172258] ? exit_to_user_mode_loop+0xf6/0x180
[ 2454.172694] ? clear_bhb_loop+0x30/0x80
[ 2454.173067] ? clear_bhb_loop+0x30/0x80
[ 2454.173439] entry_SYSCALL_64_after_hwframe+0x76/0x7e
Fixes: 2aca254620a8 ("samples/damon: introduce a skeleton of a smaple DAMON module for proactive reclamation")
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
---
samples/damon/prcl.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/samples/damon/prcl.c b/samples/damon/prcl.c
index 056b1b21a0fe..5597e6a08ab2 100644
--- a/samples/damon/prcl.c
+++ b/samples/damon/prcl.c
@@ -122,8 +122,12 @@ static int damon_sample_prcl_enable_store(
if (enable == enabled)
return 0;
- if (enable)
- return damon_sample_prcl_start();
+ if (enable) {
+ err = damon_sample_prcl_start();
+ if (err)
+ enable = false;
+ return err;
+ }
damon_sample_prcl_stop();
return 0;
}
--
2.34.1
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 2/4] samples/damon: fix damon sample wsse for start failure
2025-07-01 8:19 [PATCH v2 0/4] mm/damon: fix divide by zero and its samples Honggyu Kim
2025-07-01 8:19 ` [PATCH v2 1/4] samples/damon: fix damon sample prcl for start failure Honggyu Kim
@ 2025-07-01 8:19 ` Honggyu Kim
2025-07-01 8:19 ` [PATCH v2 3/4] samples/damon: fix damon sample mtier " Honggyu Kim
2025-07-01 8:19 ` [PATCH v2 4/4] mm/damon: fix divide by zero in damon_get_intervals_score() Honggyu Kim
3 siblings, 0 replies; 7+ messages in thread
From: Honggyu Kim @ 2025-07-01 8:19 UTC (permalink / raw)
To: SeongJae Park, damon; +Cc: Andrew Morton, linux-mm, kernel_team, Honggyu Kim
The damon_sample_wsse_start() can fail so we must reset the "enable"
parameter to "false" again for proper rollback.
In such cases, setting Y to "enable" then N triggers the similar crash
with wsse because damon sample start failed but the "enable" stays as Y.
Fixes: b757c6cfc696 ("samples/damon/wsse: start and stop DAMON as the user requests")
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
---
samples/damon/wsse.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/samples/damon/wsse.c b/samples/damon/wsse.c
index 11be25803274..e20238a249e7 100644
--- a/samples/damon/wsse.c
+++ b/samples/damon/wsse.c
@@ -102,8 +102,12 @@ static int damon_sample_wsse_enable_store(
if (enable == enabled)
return 0;
- if (enable)
- return damon_sample_wsse_start();
+ if (enable) {
+ err = damon_sample_wsse_start();
+ if (err)
+ enable = false;
+ return err;
+ }
damon_sample_wsse_stop();
return 0;
}
--
2.34.1
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 3/4] samples/damon: fix damon sample mtier for start failure
2025-07-01 8:19 [PATCH v2 0/4] mm/damon: fix divide by zero and its samples Honggyu Kim
2025-07-01 8:19 ` [PATCH v2 1/4] samples/damon: fix damon sample prcl for start failure Honggyu Kim
2025-07-01 8:19 ` [PATCH v2 2/4] samples/damon: fix damon sample wsse " Honggyu Kim
@ 2025-07-01 8:19 ` Honggyu Kim
2025-07-01 8:19 ` [PATCH v2 4/4] mm/damon: fix divide by zero in damon_get_intervals_score() Honggyu Kim
3 siblings, 0 replies; 7+ messages in thread
From: Honggyu Kim @ 2025-07-01 8:19 UTC (permalink / raw)
To: SeongJae Park, damon; +Cc: Andrew Morton, linux-mm, kernel_team, Honggyu Kim
The damon_sample_mtier_start() can fail so we must reset the "enable"
parameter to "false" again for proper rollback.
In such cases, setting Y to "enable" then N triggers the similar crash
with mtier because damon sample start failed but the "enable" stays as Y.
Fixes: 82a08bde3cf7 ("samples/damon: implement a DAMON module for memory tiering")
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
---
samples/damon/mtier.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
index 36d2cd933f5a..c94254b77fc9 100644
--- a/samples/damon/mtier.c
+++ b/samples/damon/mtier.c
@@ -164,8 +164,12 @@ static int damon_sample_mtier_enable_store(
if (enable == enabled)
return 0;
- if (enable)
- return damon_sample_mtier_start();
+ if (enable) {
+ err = damon_sample_mtier_start();
+ if (err)
+ enable = false;
+ return err;
+ }
damon_sample_mtier_stop();
return 0;
}
--
2.34.1
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 4/4] mm/damon: fix divide by zero in damon_get_intervals_score()
2025-07-01 8:19 [PATCH v2 0/4] mm/damon: fix divide by zero and its samples Honggyu Kim
` (2 preceding siblings ...)
2025-07-01 8:19 ` [PATCH v2 3/4] samples/damon: fix damon sample mtier " Honggyu Kim
@ 2025-07-01 8:19 ` Honggyu Kim
2025-07-01 17:00 ` SeongJae Park
3 siblings, 1 reply; 7+ messages in thread
From: Honggyu Kim @ 2025-07-01 8:19 UTC (permalink / raw)
To: SeongJae Park, damon
Cc: Andrew Morton, linux-mm, kernel_team, Honggyu Kim, Yunjeong Mun
The current implementation allows having zero size regions with no
special reasons, but damon_get_intervals_score() gets crashed by divide
by zero when the region size is zero.
[ 29.403950] Oops: divide error: 0000 [#1] SMP NOPTI
This patch fixes the bug, but does not disallow zero size regions to
keep the backward compatibility since disallowing zero size regions
might be a breaking change for some users.
Fixes: f04b0fedbe71 ("mm/damon/core: implement intervals auto-tuning")
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Cc: Yunjeong Mun <yunjeong.mun@sk.com>
---
mm/damon/core.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/damon/core.c b/mm/damon/core.c
index b217e0120e09..e274a4d958d6 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1448,7 +1448,10 @@ static unsigned long damon_get_intervals_score(struct damon_ctx *c)
access_events += sz_region * r->nr_accesses;
}
}
- target_access_events = max_access_events * goal_bp / 10000;
+ if (likely(max_access_events) > 0)
+ target_access_events = max_access_events * goal_bp / 10000;
+ else
+ target_access_events = 1;
return access_events * 10000 / target_access_events;
}
--
2.34.1
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 4/4] mm/damon: fix divide by zero in damon_get_intervals_score()
2025-07-01 8:19 ` [PATCH v2 4/4] mm/damon: fix divide by zero in damon_get_intervals_score() Honggyu Kim
@ 2025-07-01 17:00 ` SeongJae Park
2025-07-02 0:08 ` Honggyu Kim
0 siblings, 1 reply; 7+ messages in thread
From: SeongJae Park @ 2025-07-01 17:00 UTC (permalink / raw)
To: Honggyu Kim
Cc: SeongJae Park, damon, Andrew Morton, linux-mm, kernel_team, Yunjeong Mun
Hi Honggyu,
On Tue, 1 Jul 2025 17:19:26 +0900 Honggyu Kim <honggyu.kim@sk.com> wrote:
> The current implementation allows having zero size regions with no
> special reasons, but damon_get_intervals_score() gets crashed by divide
> by zero when the region size is zero.
>
> [ 29.403950] Oops: divide error: 0000 [#1] SMP NOPTI
>
> This patch fixes the bug, but does not disallow zero size regions to
> keep the backward compatibility since disallowing zero size regions
> might be a breaking change for some users.
>
> Fixes: f04b0fedbe71 ("mm/damon/core: implement intervals auto-tuning")
> Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
> Cc: Yunjeong Mun <yunjeong.mun@sk.com>
> ---
> mm/damon/core.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/damon/core.c b/mm/damon/core.c
> index b217e0120e09..e274a4d958d6 100644
> --- a/mm/damon/core.c
> +++ b/mm/damon/core.c
> @@ -1448,7 +1448,10 @@ static unsigned long damon_get_intervals_score(struct damon_ctx *c)
> access_events += sz_region * r->nr_accesses;
> }
> }
> - target_access_events = max_access_events * goal_bp / 10000;
> + if (likely(max_access_events) > 0)
This makes checkpatch.pl complaints as below:
WARNING: Using likely should generally have parentheses around the comparison
#39: FILE: mm/damon/core.c:1451:
+ if (likely(max_access_events) > 0)
This is not performance critical part, so I think we can simply drop likely().
> + target_access_events = max_access_events * goal_bp / 10000;
> + else
> + target_access_events = 1;
But, target_access_events could still be zero if goal_bp is zero or
max_access_events * goal_bp is less thatn 10000?
I'd like to simply do the target_access_events calculation as is, and set
target_access_events as 1, if it is zero. Since this is not a performance
critical path, I think that's ok.
Also, in the previous version, we discussed the reproduction step requires
sample modules. But seems we can also reproduce this with the official DAMON
sysfs ABI by setting the intervals goal zero, and I just confirmed that. Let's
Cc stable@.
> return access_events * 10000 / target_access_events;
> }
>
> --
> 2.34.1
Thanks,
SJ
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 4/4] mm/damon: fix divide by zero in damon_get_intervals_score()
2025-07-01 17:00 ` SeongJae Park
@ 2025-07-02 0:08 ` Honggyu Kim
0 siblings, 0 replies; 7+ messages in thread
From: Honggyu Kim @ 2025-07-02 0:08 UTC (permalink / raw)
To: SeongJae Park; +Cc: kernel_team, damon, Andrew Morton, linux-mm, Yunjeong Mun
Hi SeongJae,
On 7/2/2025 2:00 AM, SeongJae Park wrote:
> Hi Honggyu,
>
> On Tue, 1 Jul 2025 17:19:26 +0900 Honggyu Kim <honggyu.kim@sk.com> wrote:
>
>> The current implementation allows having zero size regions with no
>> special reasons, but damon_get_intervals_score() gets crashed by divide
>> by zero when the region size is zero.
>>
>> [ 29.403950] Oops: divide error: 0000 [#1] SMP NOPTI
>>
>> This patch fixes the bug, but does not disallow zero size regions to
>> keep the backward compatibility since disallowing zero size regions
>> might be a breaking change for some users.
>>
>> Fixes: f04b0fedbe71 ("mm/damon/core: implement intervals auto-tuning")
>> Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
>> Cc: Yunjeong Mun <yunjeong.mun@sk.com>
>> ---
>> mm/damon/core.c | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/damon/core.c b/mm/damon/core.c
>> index b217e0120e09..e274a4d958d6 100644
>> --- a/mm/damon/core.c
>> +++ b/mm/damon/core.c
>> @@ -1448,7 +1448,10 @@ static unsigned long damon_get_intervals_score(struct damon_ctx *c)
>> access_events += sz_region * r->nr_accesses;
>> }
>> }
>> - target_access_events = max_access_events * goal_bp / 10000;
>> + if (likely(max_access_events) > 0)
>
> This makes checkpatch.pl complaints as below:
>
> WARNING: Using likely should generally have parentheses around the comparison
> #39: FILE: mm/damon/core.c:1451:
> + if (likely(max_access_events) > 0)
>
> This is not performance critical part, so I think we can simply drop likely().
>
>> + target_access_events = max_access_events * goal_bp / 10000;
>> + else
>> + target_access_events = 1;
>
> But, target_access_events could still be zero if goal_bp is zero or
> max_access_events * goal_bp is less thatn 10000?
>
> I'd like to simply do the target_access_events calculation as is, and set
> target_access_events as 1, if it is zero. Since this is not a performance
> critical path, I think that's ok.
Sure. I've just made it as you mentioned in v3.
>
> Also, in the previous version, we discussed the reproduction step requires
> sample modules. But seems we can also reproduce this with the official DAMON
> sysfs ABI by setting the intervals goal zero, and I just confirmed that. Let's
> Cc stable@.
I've cc'ed stable@ in v3. Thanks for your help!
Thanks,
Honggyu
>
>> return access_events * 10000 / target_access_events;
>> }
>>
>> --
>> 2.34.1
>
>
> Thanks,
> SJ
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-07-02 0:08 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-07-01 8:19 [PATCH v2 0/4] mm/damon: fix divide by zero and its samples Honggyu Kim
2025-07-01 8:19 ` [PATCH v2 1/4] samples/damon: fix damon sample prcl for start failure Honggyu Kim
2025-07-01 8:19 ` [PATCH v2 2/4] samples/damon: fix damon sample wsse " Honggyu Kim
2025-07-01 8:19 ` [PATCH v2 3/4] samples/damon: fix damon sample mtier " Honggyu Kim
2025-07-01 8:19 ` [PATCH v2 4/4] mm/damon: fix divide by zero in damon_get_intervals_score() Honggyu Kim
2025-07-01 17:00 ` SeongJae Park
2025-07-02 0:08 ` Honggyu Kim
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox