* [PATCH 0/2] mm/damon/core: fix handling of zero non-sampling intervals
@ 2024-10-31 18:37 SeongJae Park
2024-10-31 18:37 ` [PATCH 1/2] mm/damon/core: handle zero {aggregation,ops_update} intervals SeongJae Park
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: SeongJae Park @ 2024-10-31 18:37 UTC (permalink / raw)
To: Andrew Morton; +Cc: SeongJae Park, damon, linux-mm, linux-kernel, kernel-team
DAMON's internal intervals accounting logic is not correctly handling
non-sampling intervals of zero values for a wrong assumption. This
could cause unexpected monitoring behavior, and even result in infinite
hang of DAMON sysfs interface user threads in case of zero aggregation
interval. Fix those by updating the intervals accounting logic. For
details of the root case and solutions, please refer to commit messages
of fixes.
SeongJae Park (2):
mm/damon/core: handle zero {aggregation,ops_update} intervals
mm/damon/core: handle zero schemes apply interval
mm/damon/core.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
base-commit: 0c9ffe1412203c72280f67567bb53200f4de44fb
--
2.39.5
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH 1/2] mm/damon/core: handle zero {aggregation,ops_update} intervals
2024-10-31 18:37 [PATCH 0/2] mm/damon/core: fix handling of zero non-sampling intervals SeongJae Park
@ 2024-10-31 18:37 ` SeongJae Park
2024-10-31 18:37 ` [PATCH 2/2] mm/damon/core: handle zero schemes apply interval SeongJae Park
2024-11-01 0:27 ` [PATCH 0/2] mm/damon/core: fix handling of zero non-sampling intervals Andrew Morton
2 siblings, 0 replies; 4+ messages in thread
From: SeongJae Park @ 2024-10-31 18:37 UTC (permalink / raw)
To: Andrew Morton
Cc: SeongJae Park, damon, linux-mm, linux-kernel, kernel-team, stable
DAMON's logics to determine if this is the time to do aggregation and
ops update assumes next_{aggregation,ops_update}_sis are always set
larger than current passed_sample_intervals. And therefore it further
assumes continuously incrementing passed_sample_intervals every sampling
interval will make it reaches to the next_{aggregation,ops_update}_sis
in future. The logic therefore make the action and update
next_{aggregation,ops_updaste}_sis only if passed_sample_intervals is
same to the counts, respectively.
If Aggregation interval or Ops update interval are zero, however,
next_aggregation_sis or next_ops_update_sis are set same to current
passed_sample_intervals, respectively. And passed_sample_intervals is
incremented before doing the next_{aggregation,ops_update}_sis check.
Hence, passed_sample_intervals becomes larger than
next_{aggregation,ops_update}_sis, and the logic says it is not the time
to do the action and update next_{aggregation,ops_update}_sis forever,
until an overflow happens. In other words, DAMON stops doing
aggregations or ops updates effectively forever, and users cannot get
monitoring results.
Based on the documents and the common sense, a reasonable behavior for
such inputs is doing an aggregation and an ops update for every sampling
interval. Handle the case by removing the assumption.
Note that this could incur particular real issue for DAMON sysfs
interface users, in case of zero Aggregation interval. When user starts
DAMON with zero Aggregation interval and asks online DAMON parameter
tuning via DAMON sysfs interface, the request is handled by the
aggregation callback. Until the callback finishes the work, the user
who requested the online tuning just waits. Hence, the user will be
stuck until the passed_sample_intervals overflows.
Fixes: 4472edf63d66 ("mm/damon/core: use number of passed access sampling as a timer")
Cc: <stable@vger.kernel.org> # 6.7.x
Signed-off-by: SeongJae Park <sj@kernel.org>
---
mm/damon/core.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 27745dcf855f..931526fb2d2e 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -2014,7 +2014,7 @@ static int kdamond_fn(void *data)
if (ctx->ops.check_accesses)
max_nr_accesses = ctx->ops.check_accesses(ctx);
- if (ctx->passed_sample_intervals == next_aggregation_sis) {
+ if (ctx->passed_sample_intervals >= next_aggregation_sis) {
kdamond_merge_regions(ctx,
max_nr_accesses / 10,
sz_limit);
@@ -2032,7 +2032,7 @@ static int kdamond_fn(void *data)
sample_interval = ctx->attrs.sample_interval ?
ctx->attrs.sample_interval : 1;
- if (ctx->passed_sample_intervals == next_aggregation_sis) {
+ if (ctx->passed_sample_intervals >= next_aggregation_sis) {
ctx->next_aggregation_sis = next_aggregation_sis +
ctx->attrs.aggr_interval / sample_interval;
@@ -2042,7 +2042,7 @@ static int kdamond_fn(void *data)
ctx->ops.reset_aggregated(ctx);
}
- if (ctx->passed_sample_intervals == next_ops_update_sis) {
+ if (ctx->passed_sample_intervals >= next_ops_update_sis) {
ctx->next_ops_update_sis = next_ops_update_sis +
ctx->attrs.ops_update_interval /
sample_interval;
--
2.39.5
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH 2/2] mm/damon/core: handle zero schemes apply interval
2024-10-31 18:37 [PATCH 0/2] mm/damon/core: fix handling of zero non-sampling intervals SeongJae Park
2024-10-31 18:37 ` [PATCH 1/2] mm/damon/core: handle zero {aggregation,ops_update} intervals SeongJae Park
@ 2024-10-31 18:37 ` SeongJae Park
2024-11-01 0:27 ` [PATCH 0/2] mm/damon/core: fix handling of zero non-sampling intervals Andrew Morton
2 siblings, 0 replies; 4+ messages in thread
From: SeongJae Park @ 2024-10-31 18:37 UTC (permalink / raw)
To: Andrew Morton
Cc: SeongJae Park, damon, linux-mm, linux-kernel, kernel-team, stable
DAMON's logics to determine if this is the time to apply damos schemes
assumes next_apply_sis is always set larger than current
passed_sample_intervals. And therefore assume continuously incrementing
passed_sample_intervals will make it reaches to the next_apply_sis in
future. The logic hence does apply the scheme and update next_apply_sis
only if passed_sample_intervals is same to next_apply_sis.
If Schemes apply interval is set as zero, however, next_apply_sis is set
same to current passed_sample_intervals, respectively. And
passed_sample_intervals is incremented before doing the next_apply_sis
check. Hence, next_apply_sis becomes larger than next_apply_sis, and
the logic says it is not the time to apply schemes and update
next_apply_sis. In other words, DAMON stops applying schemes until
passed_sample_intervals overflows.
Based on the documents and the common sense, a reasonable behavior for
such inputs would be applying the schemes for every sampling interval.
Handle the case by removing the assumption.
Fixes: 42f994b71404 ("mm/damon/core: implement scheme-specific apply interval")
Cc: <stable@vger.kernel.org> # 6.7.x
Signed-off-by: SeongJae Park <sj@kernel.org>
---
mm/damon/core.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 931526fb2d2e..511c3f61ab44 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1412,7 +1412,7 @@ static void damon_do_apply_schemes(struct damon_ctx *c,
damon_for_each_scheme(s, c) {
struct damos_quota *quota = &s->quota;
- if (c->passed_sample_intervals != s->next_apply_sis)
+ if (c->passed_sample_intervals < s->next_apply_sis)
continue;
if (!s->wmarks.activated)
@@ -1636,7 +1636,7 @@ static void kdamond_apply_schemes(struct damon_ctx *c)
bool has_schemes_to_apply = false;
damon_for_each_scheme(s, c) {
- if (c->passed_sample_intervals != s->next_apply_sis)
+ if (c->passed_sample_intervals < s->next_apply_sis)
continue;
if (!s->wmarks.activated)
@@ -1656,9 +1656,9 @@ static void kdamond_apply_schemes(struct damon_ctx *c)
}
damon_for_each_scheme(s, c) {
- if (c->passed_sample_intervals != s->next_apply_sis)
+ if (c->passed_sample_intervals < s->next_apply_sis)
continue;
- s->next_apply_sis +=
+ s->next_apply_sis = c->passed_sample_intervals +
(s->apply_interval_us ? s->apply_interval_us :
c->attrs.aggr_interval) / sample_interval;
}
--
2.39.5
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH 0/2] mm/damon/core: fix handling of zero non-sampling intervals
2024-10-31 18:37 [PATCH 0/2] mm/damon/core: fix handling of zero non-sampling intervals SeongJae Park
2024-10-31 18:37 ` [PATCH 1/2] mm/damon/core: handle zero {aggregation,ops_update} intervals SeongJae Park
2024-10-31 18:37 ` [PATCH 2/2] mm/damon/core: handle zero schemes apply interval SeongJae Park
@ 2024-11-01 0:27 ` Andrew Morton
2 siblings, 0 replies; 4+ messages in thread
From: Andrew Morton @ 2024-11-01 0:27 UTC (permalink / raw)
To: SeongJae Park; +Cc: damon, linux-mm, linux-kernel, kernel-team
On Thu, 31 Oct 2024 11:37:55 -0700 SeongJae Park <sj@kernel.org> wrote:
> DAMON's internal intervals accounting logic is not correctly handling
> non-sampling intervals of zero values for a wrong assumption. This
> could cause unexpected monitoring behavior, and even result in infinite
> hang of DAMON sysfs interface user threads in case of zero aggregation
> interval. Fix those by updating the intervals accounting logic. For
> details of the root case and solutions, please refer to commit messages
> of fixes.
Thanks. fyi, there has been email lossage here.
Only [2/2] hit my inbox. Neither [1/2] not [2/2] are in my linux-mm
folder. I found the whole series in my linux-kernel folder.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2024-11-01 13:55 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-10-31 18:37 [PATCH 0/2] mm/damon/core: fix handling of zero non-sampling intervals SeongJae Park
2024-10-31 18:37 ` [PATCH 1/2] mm/damon/core: handle zero {aggregation,ops_update} intervals SeongJae Park
2024-10-31 18:37 ` [PATCH 2/2] mm/damon/core: handle zero schemes apply interval SeongJae Park
2024-11-01 0:27 ` [PATCH 0/2] mm/damon/core: fix handling of zero non-sampling intervals Andrew Morton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox