From: Waiman Long <llong@redhat.com>
To: Johannes Weiner <hannes@cmpxchg.org>, Waiman Long <llong@redhat.com>
Cc: "Tejun Heo" <tj@kernel.org>, "Michal Hocko" <mhocko@kernel.org>,
"Roman Gushchin" <roman.gushchin@linux.dev>,
"Shakeel Butt" <shakeel.butt@linux.dev>,
"Muchun Song" <muchun.song@linux.dev>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Michal Koutný" <mkoutny@suse.com>,
"Shuah Khan" <shuah@kernel.org>,
linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
linux-mm@kvack.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v2 1/2] memcg: Don't generate low/min events if either low/min or elow/emin is 0
Date: Fri, 4 Apr 2025 14:55:35 -0400 [thread overview]
Message-ID: <c4294852-cc94-401e-8335-02741005e5d7@redhat.com> (raw)
In-Reply-To: <20250404181308.GA300138@cmpxchg.org>
On 4/4/25 2:13 PM, Johannes Weiner wrote:
> On Fri, Apr 04, 2025 at 01:25:33PM -0400, Waiman Long wrote:
>> On 4/4/25 1:12 PM, Tejun Heo wrote:
>>> Hello,
>>>
>>> On Thu, Apr 03, 2025 at 09:24:34PM -0400, Waiman Long wrote:
>>> ...
>>>> The simple and naive fix of changing the operator to ">", however,
>>>> changes the memory reclaim behavior which can lead to other failures
>>>> as low events are needed to facilitate memory reclaim. So we can't do
>>>> that without some relatively riskier changes in memory reclaim.
>>> I'm doubtful using ">" would change reclaim behavior in a meaningful way and
>>> that'd be more straightforward. What do mm people think?
> The knob documentation uses "within low" and "above low" to
> distinguish whether you are protected or not, so at least from a code
> clarity pov, >= makes more sense to me: if your protection is N and
> you use exactly N, you're considered protected.
>
> That also means that by definition an empty cgroup is protected. It's
> not in excess of its protection. The test result isn't wrong.
>
> The real weirdness is issuing a "low reclaim" event when no reclaim is
> going to happen*.
>
> The patch effectively special cases "empty means in excess" to avoid
> the event and fall through to reclaim, which then does nothing as a
> result of its own scan target calculations. That seems convoluted.
>
> Why not skip empty cgroups before running inapplicable checks?
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index b620d74b0f66..260ab238ec22 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -5963,6 +5963,9 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
>
> mem_cgroup_calculate_protection(target_memcg, memcg);
>
> + if (!mem_cgroup_usage(memcg, false))
> + continue;
> +
> if (mem_cgroup_below_min(target_memcg, memcg)) {
> /*
> * Hard protection.
Yes, that should take care of the memcg with no task case.
>
>> I haven't looked deeply into why that is the case, but
>> test_memcg_low/min tests had other failures when I made this change.
> It surprises me as well that it makes any practical difference.
I looked at it again and failure is the same expected memory.current
check in test_memcontrol. If I remove the equal sign, I got errors like:
values_close: child 0 = 8339456, 29MB = 30408704
failed with err = 21
not ok 1 test_memcg_min
So the test is expecting memory.current to have around 29MB, but it got
a lot less (~8MB) in this case. Before removing the equality sign, I
usually got about 25 MB and above for child 0. That is a pretty big
change in behavior, so I didn't make it.
>
> * Waiman points out that the weirdness is seeing low events without
> having a low configured. Eh, this isn't really true with recursive
> propagation; you may or may not have an elow depending on parental
> configuration and sibling behavior.
>
Do you mind if we just don't update the low event count if low isn't
set, but leave the rest the same like
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 91721c8862c3..48a8bfa7d337 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -659,21 +659,25 @@ static inline bool mem_cgroup_unprotected(struct
mem_cgro>
static inline bool mem_cgroup_below_low(struct mem_cgroup *target,
struct mem_cgroup *memcg)
{
+ unsigned long elow;
+
if (mem_cgroup_unprotected(target, memcg))
return false;
- return READ_ONCE(memcg->memory.elow) >=
- page_counter_read(&memcg->memory);
+ elow = READ_ONCE(memcg->memory.elow);
+ return elow && (page_counter_read(&memcg->memory) <= elow);
}
static inline bool mem_cgroup_below_min(struct mem_cgroup *target,
struct mem_cgroup *memcg)
{
+ unsigned long emin;
+
if (mem_cgroup_unprotected(target, memcg))
return false;
- return READ_ONCE(memcg->memory.emin) >=
- page_counter_read(&memcg->memory);
+ emin = READ_ONCE(memcg->memory.emin);
+ return emin && (page_counter_read(&memcg->memory) <= emin);
}
void mem_cgroup_commit_charge(struct folio *folio, struct mem_cgroup
*memcg);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 77d015d5db0c..e8c1838c7962 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4827,7 +4827,8 @@ static int shrink_one(struct lruvec *lruvec,
struct scan_>
if (READ_ONCE(lruvec->lrugen.seg) != MEMCG_LRU_TAIL)
return MEMCG_LRU_TAIL;
- memcg_memory_event(memcg, MEMCG_LOW);
+ if (memcg->memory.low)
+ memcg_memory_event(memcg, MEMCG_LOW);
}
success = try_to_shrink_lruvec(lruvec, sc);
@@ -5902,6 +5903,9 @@ static void shrink_node_memcgs(pg_data_t *pgdat,
struct s>
mem_cgroup_calculate_protection(target_memcg, memcg);
+ if (!mem_cgroup_usage(memcg, false))
+ continue;
+
if (mem_cgroup_below_min(target_memcg, memcg)) {
/*
* Hard protection.
@@ -5919,7 +5923,8 @@ static void shrink_node_memcgs(pg_data_t *pgdat,
struct s>
sc->memcg_low_skipped = 1;
continue;
}
- memcg_memory_event(memcg, MEMCG_LOW);
+ if (memcg->memory.low)
+ memcg_memory_event(memcg, MEMCG_LOW);
}
reclaimed = sc->nr_reclaimed;
Cheers,
Longman
next prev parent reply other threads:[~2025-04-04 18:55 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-04 1:24 Waiman Long
2025-04-04 1:24 ` [PATCH v2 2/2] selftests: memcg: Increase error tolerance of child memory.current check in test_memcg_protection() Waiman Long
2025-04-04 17:12 ` [PATCH v2 1/2] memcg: Don't generate low/min events if either low/min or elow/emin is 0 Tejun Heo
2025-04-04 17:25 ` Waiman Long
2025-04-04 18:13 ` Johannes Weiner
2025-04-04 18:55 ` Waiman Long [this message]
2025-04-04 19:38 ` Johannes Weiner
2025-04-05 18:52 ` Waiman Long
2025-04-04 18:26 ` Michal Koutný
2025-04-04 19:01 ` Waiman Long
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c4294852-cc94-401e-8335-02741005e5d7@redhat.com \
--to=llong@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=cgroups@vger.kernel.org \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=mkoutny@suse.com \
--cc=muchun.song@linux.dev \
--cc=roman.gushchin@linux.dev \
--cc=shakeel.butt@linux.dev \
--cc=shuah@kernel.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox