* [PATCH v2 0/4] mm/damon: fix divide by zero and its samples
@ 2025-07-01 8:19 Honggyu Kim
2025-07-01 8:19 ` [PATCH v2 1/4] samples/damon: fix damon sample prcl for start failure Honggyu Kim
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Honggyu Kim @ 2025-07-01 8:19 UTC (permalink / raw)
To: SeongJae Park, damon; +Cc: Andrew Morton, linux-mm, kernel_team, Honggyu Kim
This includes damon fixes and its samples to make it safer when damon
sample start fails.
It includes the following changes.
- fix unexpected divide by zero crash for zero size regions
- fix bugs for damon samples in case of start failures
Honggyu Kim (4):
samples/damon: fix damon sample prcl for start failure
samples/damon: fix damon sample wsse for start failure
samples/damon: fix damon sample mtier for start failure
mm/damon: fix divide by zero in damon_get_intervals_score()
mm/damon/core.c | 5 ++++-
samples/damon/mtier.c | 8 ++++++--
samples/damon/prcl.c | 8 ++++++--
samples/damon/wsse.c | 8 ++++++--
4 files changed, 22 insertions(+), 7 deletions(-)
base-commit: d0b3b7b22dfa1f4b515fd3a295b3fd958f9e81af
--
2.34.1
^ permalink raw reply [flat|nested] 7+ messages in thread* [PATCH v2 1/4] samples/damon: fix damon sample prcl for start failure 2025-07-01 8:19 [PATCH v2 0/4] mm/damon: fix divide by zero and its samples Honggyu Kim @ 2025-07-01 8:19 ` Honggyu Kim 2025-07-01 8:19 ` [PATCH v2 2/4] samples/damon: fix damon sample wsse " Honggyu Kim ` (2 subsequent siblings) 3 siblings, 0 replies; 7+ messages in thread From: Honggyu Kim @ 2025-07-01 8:19 UTC (permalink / raw) To: SeongJae Park, damon; +Cc: Andrew Morton, linux-mm, kernel_team, Honggyu Kim The damon_sample_prcl_start() can fail so we must reset the "enable" parameter to "false" again for proper rollback. In such cases, setting Y to "enable" then N triggers the following crash because damon sample start failed but the "enable" stays as Y. [ 2441.419649] damon_sample_prcl: start [ 2454.146817] damon_sample_prcl: stop [ 2454.146862] ------------[ cut here ]------------ [ 2454.146865] kernel BUG at mm/slub.c:546! [ 2454.148183] Oops: invalid opcode: 0000 [#1] SMP NOPTI ... [ 2454.167555] Call Trace: [ 2454.167822] <TASK> [ 2454.168061] damon_destroy_ctx+0x78/0x140 [ 2454.168454] damon_sample_prcl_enable_store+0x8d/0xd0 [ 2454.168932] param_attr_store+0xa1/0x120 [ 2454.169315] module_attr_store+0x20/0x50 [ 2454.169695] sysfs_kf_write+0x72/0x90 [ 2454.170065] kernfs_fop_write_iter+0x150/0x1e0 [ 2454.170491] vfs_write+0x315/0x440 [ 2454.170833] ksys_write+0x69/0xf0 [ 2454.171162] __x64_sys_write+0x19/0x30 [ 2454.171525] x64_sys_call+0x18b2/0x2700 [ 2454.171900] do_syscall_64+0x7f/0x680 [ 2454.172258] ? exit_to_user_mode_loop+0xf6/0x180 [ 2454.172694] ? clear_bhb_loop+0x30/0x80 [ 2454.173067] ? clear_bhb_loop+0x30/0x80 [ 2454.173439] entry_SYSCALL_64_after_hwframe+0x76/0x7e Fixes: 2aca254620a8 ("samples/damon: introduce a skeleton of a smaple DAMON module for proactive reclamation") Signed-off-by: Honggyu Kim <honggyu.kim@sk.com> Reviewed-by: SeongJae Park <sj@kernel.org> --- samples/damon/prcl.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/samples/damon/prcl.c b/samples/damon/prcl.c index 056b1b21a0fe..5597e6a08ab2 100644 --- a/samples/damon/prcl.c +++ b/samples/damon/prcl.c @@ -122,8 +122,12 @@ static int damon_sample_prcl_enable_store( if (enable == enabled) return 0; - if (enable) - return damon_sample_prcl_start(); + if (enable) { + err = damon_sample_prcl_start(); + if (err) + enable = false; + return err; + } damon_sample_prcl_stop(); return 0; } -- 2.34.1 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 2/4] samples/damon: fix damon sample wsse for start failure 2025-07-01 8:19 [PATCH v2 0/4] mm/damon: fix divide by zero and its samples Honggyu Kim 2025-07-01 8:19 ` [PATCH v2 1/4] samples/damon: fix damon sample prcl for start failure Honggyu Kim @ 2025-07-01 8:19 ` Honggyu Kim 2025-07-01 8:19 ` [PATCH v2 3/4] samples/damon: fix damon sample mtier " Honggyu Kim 2025-07-01 8:19 ` [PATCH v2 4/4] mm/damon: fix divide by zero in damon_get_intervals_score() Honggyu Kim 3 siblings, 0 replies; 7+ messages in thread From: Honggyu Kim @ 2025-07-01 8:19 UTC (permalink / raw) To: SeongJae Park, damon; +Cc: Andrew Morton, linux-mm, kernel_team, Honggyu Kim The damon_sample_wsse_start() can fail so we must reset the "enable" parameter to "false" again for proper rollback. In such cases, setting Y to "enable" then N triggers the similar crash with wsse because damon sample start failed but the "enable" stays as Y. Fixes: b757c6cfc696 ("samples/damon/wsse: start and stop DAMON as the user requests") Signed-off-by: Honggyu Kim <honggyu.kim@sk.com> Reviewed-by: SeongJae Park <sj@kernel.org> --- samples/damon/wsse.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/samples/damon/wsse.c b/samples/damon/wsse.c index 11be25803274..e20238a249e7 100644 --- a/samples/damon/wsse.c +++ b/samples/damon/wsse.c @@ -102,8 +102,12 @@ static int damon_sample_wsse_enable_store( if (enable == enabled) return 0; - if (enable) - return damon_sample_wsse_start(); + if (enable) { + err = damon_sample_wsse_start(); + if (err) + enable = false; + return err; + } damon_sample_wsse_stop(); return 0; } -- 2.34.1 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 3/4] samples/damon: fix damon sample mtier for start failure 2025-07-01 8:19 [PATCH v2 0/4] mm/damon: fix divide by zero and its samples Honggyu Kim 2025-07-01 8:19 ` [PATCH v2 1/4] samples/damon: fix damon sample prcl for start failure Honggyu Kim 2025-07-01 8:19 ` [PATCH v2 2/4] samples/damon: fix damon sample wsse " Honggyu Kim @ 2025-07-01 8:19 ` Honggyu Kim 2025-07-01 8:19 ` [PATCH v2 4/4] mm/damon: fix divide by zero in damon_get_intervals_score() Honggyu Kim 3 siblings, 0 replies; 7+ messages in thread From: Honggyu Kim @ 2025-07-01 8:19 UTC (permalink / raw) To: SeongJae Park, damon; +Cc: Andrew Morton, linux-mm, kernel_team, Honggyu Kim The damon_sample_mtier_start() can fail so we must reset the "enable" parameter to "false" again for proper rollback. In such cases, setting Y to "enable" then N triggers the similar crash with mtier because damon sample start failed but the "enable" stays as Y. Fixes: 82a08bde3cf7 ("samples/damon: implement a DAMON module for memory tiering") Signed-off-by: Honggyu Kim <honggyu.kim@sk.com> Reviewed-by: SeongJae Park <sj@kernel.org> --- samples/damon/mtier.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c index 36d2cd933f5a..c94254b77fc9 100644 --- a/samples/damon/mtier.c +++ b/samples/damon/mtier.c @@ -164,8 +164,12 @@ static int damon_sample_mtier_enable_store( if (enable == enabled) return 0; - if (enable) - return damon_sample_mtier_start(); + if (enable) { + err = damon_sample_mtier_start(); + if (err) + enable = false; + return err; + } damon_sample_mtier_stop(); return 0; } -- 2.34.1 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 4/4] mm/damon: fix divide by zero in damon_get_intervals_score() 2025-07-01 8:19 [PATCH v2 0/4] mm/damon: fix divide by zero and its samples Honggyu Kim ` (2 preceding siblings ...) 2025-07-01 8:19 ` [PATCH v2 3/4] samples/damon: fix damon sample mtier " Honggyu Kim @ 2025-07-01 8:19 ` Honggyu Kim 2025-07-01 17:00 ` SeongJae Park 3 siblings, 1 reply; 7+ messages in thread From: Honggyu Kim @ 2025-07-01 8:19 UTC (permalink / raw) To: SeongJae Park, damon Cc: Andrew Morton, linux-mm, kernel_team, Honggyu Kim, Yunjeong Mun The current implementation allows having zero size regions with no special reasons, but damon_get_intervals_score() gets crashed by divide by zero when the region size is zero. [ 29.403950] Oops: divide error: 0000 [#1] SMP NOPTI This patch fixes the bug, but does not disallow zero size regions to keep the backward compatibility since disallowing zero size regions might be a breaking change for some users. Fixes: f04b0fedbe71 ("mm/damon/core: implement intervals auto-tuning") Signed-off-by: Honggyu Kim <honggyu.kim@sk.com> Cc: Yunjeong Mun <yunjeong.mun@sk.com> --- mm/damon/core.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/mm/damon/core.c b/mm/damon/core.c index b217e0120e09..e274a4d958d6 100644 --- a/mm/damon/core.c +++ b/mm/damon/core.c @@ -1448,7 +1448,10 @@ static unsigned long damon_get_intervals_score(struct damon_ctx *c) access_events += sz_region * r->nr_accesses; } } - target_access_events = max_access_events * goal_bp / 10000; + if (likely(max_access_events) > 0) + target_access_events = max_access_events * goal_bp / 10000; + else + target_access_events = 1; return access_events * 10000 / target_access_events; } -- 2.34.1 ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 4/4] mm/damon: fix divide by zero in damon_get_intervals_score() 2025-07-01 8:19 ` [PATCH v2 4/4] mm/damon: fix divide by zero in damon_get_intervals_score() Honggyu Kim @ 2025-07-01 17:00 ` SeongJae Park 2025-07-02 0:08 ` Honggyu Kim 0 siblings, 1 reply; 7+ messages in thread From: SeongJae Park @ 2025-07-01 17:00 UTC (permalink / raw) To: Honggyu Kim Cc: SeongJae Park, damon, Andrew Morton, linux-mm, kernel_team, Yunjeong Mun Hi Honggyu, On Tue, 1 Jul 2025 17:19:26 +0900 Honggyu Kim <honggyu.kim@sk.com> wrote: > The current implementation allows having zero size regions with no > special reasons, but damon_get_intervals_score() gets crashed by divide > by zero when the region size is zero. > > [ 29.403950] Oops: divide error: 0000 [#1] SMP NOPTI > > This patch fixes the bug, but does not disallow zero size regions to > keep the backward compatibility since disallowing zero size regions > might be a breaking change for some users. > > Fixes: f04b0fedbe71 ("mm/damon/core: implement intervals auto-tuning") > Signed-off-by: Honggyu Kim <honggyu.kim@sk.com> > Cc: Yunjeong Mun <yunjeong.mun@sk.com> > --- > mm/damon/core.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/mm/damon/core.c b/mm/damon/core.c > index b217e0120e09..e274a4d958d6 100644 > --- a/mm/damon/core.c > +++ b/mm/damon/core.c > @@ -1448,7 +1448,10 @@ static unsigned long damon_get_intervals_score(struct damon_ctx *c) > access_events += sz_region * r->nr_accesses; > } > } > - target_access_events = max_access_events * goal_bp / 10000; > + if (likely(max_access_events) > 0) This makes checkpatch.pl complaints as below: WARNING: Using likely should generally have parentheses around the comparison #39: FILE: mm/damon/core.c:1451: + if (likely(max_access_events) > 0) This is not performance critical part, so I think we can simply drop likely(). > + target_access_events = max_access_events * goal_bp / 10000; > + else > + target_access_events = 1; But, target_access_events could still be zero if goal_bp is zero or max_access_events * goal_bp is less thatn 10000? I'd like to simply do the target_access_events calculation as is, and set target_access_events as 1, if it is zero. Since this is not a performance critical path, I think that's ok. Also, in the previous version, we discussed the reproduction step requires sample modules. But seems we can also reproduce this with the official DAMON sysfs ABI by setting the intervals goal zero, and I just confirmed that. Let's Cc stable@. > return access_events * 10000 / target_access_events; > } > > -- > 2.34.1 Thanks, SJ ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 4/4] mm/damon: fix divide by zero in damon_get_intervals_score() 2025-07-01 17:00 ` SeongJae Park @ 2025-07-02 0:08 ` Honggyu Kim 0 siblings, 0 replies; 7+ messages in thread From: Honggyu Kim @ 2025-07-02 0:08 UTC (permalink / raw) To: SeongJae Park; +Cc: kernel_team, damon, Andrew Morton, linux-mm, Yunjeong Mun Hi SeongJae, On 7/2/2025 2:00 AM, SeongJae Park wrote: > Hi Honggyu, > > On Tue, 1 Jul 2025 17:19:26 +0900 Honggyu Kim <honggyu.kim@sk.com> wrote: > >> The current implementation allows having zero size regions with no >> special reasons, but damon_get_intervals_score() gets crashed by divide >> by zero when the region size is zero. >> >> [ 29.403950] Oops: divide error: 0000 [#1] SMP NOPTI >> >> This patch fixes the bug, but does not disallow zero size regions to >> keep the backward compatibility since disallowing zero size regions >> might be a breaking change for some users. >> >> Fixes: f04b0fedbe71 ("mm/damon/core: implement intervals auto-tuning") >> Signed-off-by: Honggyu Kim <honggyu.kim@sk.com> >> Cc: Yunjeong Mun <yunjeong.mun@sk.com> >> --- >> mm/damon/core.c | 5 ++++- >> 1 file changed, 4 insertions(+), 1 deletion(-) >> >> diff --git a/mm/damon/core.c b/mm/damon/core.c >> index b217e0120e09..e274a4d958d6 100644 >> --- a/mm/damon/core.c >> +++ b/mm/damon/core.c >> @@ -1448,7 +1448,10 @@ static unsigned long damon_get_intervals_score(struct damon_ctx *c) >> access_events += sz_region * r->nr_accesses; >> } >> } >> - target_access_events = max_access_events * goal_bp / 10000; >> + if (likely(max_access_events) > 0) > > This makes checkpatch.pl complaints as below: > > WARNING: Using likely should generally have parentheses around the comparison > #39: FILE: mm/damon/core.c:1451: > + if (likely(max_access_events) > 0) > > This is not performance critical part, so I think we can simply drop likely(). > >> + target_access_events = max_access_events * goal_bp / 10000; >> + else >> + target_access_events = 1; > > But, target_access_events could still be zero if goal_bp is zero or > max_access_events * goal_bp is less thatn 10000? > > I'd like to simply do the target_access_events calculation as is, and set > target_access_events as 1, if it is zero. Since this is not a performance > critical path, I think that's ok. Sure. I've just made it as you mentioned in v3. > > Also, in the previous version, we discussed the reproduction step requires > sample modules. But seems we can also reproduce this with the official DAMON > sysfs ABI by setting the intervals goal zero, and I just confirmed that. Let's > Cc stable@. I've cc'ed stable@ in v3. Thanks for your help! Thanks, Honggyu > >> return access_events * 10000 / target_access_events; >> } >> >> -- >> 2.34.1 > > > Thanks, > SJ ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-07-02 0:08 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2025-07-01 8:19 [PATCH v2 0/4] mm/damon: fix divide by zero and its samples Honggyu Kim 2025-07-01 8:19 ` [PATCH v2 1/4] samples/damon: fix damon sample prcl for start failure Honggyu Kim 2025-07-01 8:19 ` [PATCH v2 2/4] samples/damon: fix damon sample wsse " Honggyu Kim 2025-07-01 8:19 ` [PATCH v2 3/4] samples/damon: fix damon sample mtier " Honggyu Kim 2025-07-01 8:19 ` [PATCH v2 4/4] mm/damon: fix divide by zero in damon_get_intervals_score() Honggyu Kim 2025-07-01 17:00 ` SeongJae Park 2025-07-02 0:08 ` Honggyu Kim
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox