linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Waiman Long <llong@redhat.com>
To: Johannes Weiner <hannes@cmpxchg.org>, Waiman Long <llong@redhat.com>
Cc: "Tejun Heo" <tj@kernel.org>, "Michal Hocko" <mhocko@kernel.org>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Shakeel Butt" <shakeel.butt@linux.dev>,
	"Muchun Song" <muchun.song@linux.dev>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Shuah Khan" <shuah@kernel.org>,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	linux-mm@kvack.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v2 1/2] memcg: Don't generate low/min events if either low/min or elow/emin is 0
Date: Fri, 4 Apr 2025 14:55:35 -0400	[thread overview]
Message-ID: <c4294852-cc94-401e-8335-02741005e5d7@redhat.com> (raw)
In-Reply-To: <20250404181308.GA300138@cmpxchg.org>

On 4/4/25 2:13 PM, Johannes Weiner wrote:
> On Fri, Apr 04, 2025 at 01:25:33PM -0400, Waiman Long wrote:
>> On 4/4/25 1:12 PM, Tejun Heo wrote:
>>> Hello,
>>>
>>> On Thu, Apr 03, 2025 at 09:24:34PM -0400, Waiman Long wrote:
>>> ...
>>>> The simple and naive fix of changing the operator to ">", however,
>>>> changes the memory reclaim behavior which can lead to other failures
>>>> as low events are needed to facilitate memory reclaim.  So we can't do
>>>> that without some relatively riskier changes in memory reclaim.
>>> I'm doubtful using ">" would change reclaim behavior in a meaningful way and
>>> that'd be more straightforward. What do mm people think?
> The knob documentation uses "within low" and "above low" to
> distinguish whether you are protected or not, so at least from a code
> clarity pov, >= makes more sense to me: if your protection is N and
> you use exactly N, you're considered protected.
>
> That also means that by definition an empty cgroup is protected. It's
> not in excess of its protection. The test result isn't wrong.
>
> The real weirdness is issuing a "low reclaim" event when no reclaim is
> going to happen*.
>
> The patch effectively special cases "empty means in excess" to avoid
> the event and fall through to reclaim, which then does nothing as a
> result of its own scan target calculations. That seems convoluted.
>
> Why not skip empty cgroups before running inapplicable checks?
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index b620d74b0f66..260ab238ec22 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -5963,6 +5963,9 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
>   
>   		mem_cgroup_calculate_protection(target_memcg, memcg);
>   
> +		if (!mem_cgroup_usage(memcg, false))
> +			continue;
> +
>   		if (mem_cgroup_below_min(target_memcg, memcg)) {
>   			/*
>   			 * Hard protection.
Yes, that should take care of the memcg with no task case.
>
>> I haven't looked deeply into why that is the case, but
>> test_memcg_low/min tests had other failures when I made this change.
> It surprises me as well that it makes any practical difference.

I looked at it again and failure is the same expected memory.current 
check in test_memcontrol. If I remove the equal sign, I got errors like:

values_close: child 0 = 8339456, 29MB = 30408704
failed with err = 21
not ok 1 test_memcg_min

So the test is expecting memory.current to have around 29MB, but it got 
a lot less (~8MB) in this case. Before removing the equality sign, I 
usually got about 25 MB and above for child 0. That is a pretty big 
change in behavior, so I didn't make it.

>
> * Waiman points out that the weirdness is seeing low events without
>    having a low configured. Eh, this isn't really true with recursive
>    propagation; you may or may not have an elow depending on parental
>    configuration and sibling behavior.
>
Do you mind if we just don't update the low event count if low isn't 
set, but leave the rest the same like

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 91721c8862c3..48a8bfa7d337 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -659,21 +659,25 @@ static inline bool mem_cgroup_unprotected(struct 
mem_cgro>
  static inline bool mem_cgroup_below_low(struct mem_cgroup *target,
                                         struct mem_cgroup *memcg)
  {
+       unsigned long elow;
+
         if (mem_cgroup_unprotected(target, memcg))
                 return false;

-       return READ_ONCE(memcg->memory.elow) >=
-               page_counter_read(&memcg->memory);
+       elow = READ_ONCE(memcg->memory.elow);
+       return elow && (page_counter_read(&memcg->memory) <= elow);
  }

  static inline bool mem_cgroup_below_min(struct mem_cgroup *target,
                                         struct mem_cgroup *memcg)
  {
+       unsigned long emin;
+
         if (mem_cgroup_unprotected(target, memcg))
                 return false;

-       return READ_ONCE(memcg->memory.emin) >=
-               page_counter_read(&memcg->memory);
+       emin = READ_ONCE(memcg->memory.emin);
+       return emin && (page_counter_read(&memcg->memory) <= emin);
  }

  void mem_cgroup_commit_charge(struct folio *folio, struct mem_cgroup 
*memcg);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 77d015d5db0c..e8c1838c7962 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4827,7 +4827,8 @@ static int shrink_one(struct lruvec *lruvec, 
struct scan_>
                 if (READ_ONCE(lruvec->lrugen.seg) != MEMCG_LRU_TAIL)
                         return MEMCG_LRU_TAIL;

-               memcg_memory_event(memcg, MEMCG_LOW);
+               if (memcg->memory.low)
+                       memcg_memory_event(memcg, MEMCG_LOW);
         }

         success = try_to_shrink_lruvec(lruvec, sc);
@@ -5902,6 +5903,9 @@ static void shrink_node_memcgs(pg_data_t *pgdat, 
struct s>

                 mem_cgroup_calculate_protection(target_memcg, memcg);

+               if (!mem_cgroup_usage(memcg, false))
+                       continue;
+
                 if (mem_cgroup_below_min(target_memcg, memcg)) {
                         /*
                          * Hard protection.
@@ -5919,7 +5923,8 @@ static void shrink_node_memcgs(pg_data_t *pgdat, 
struct s>
                                 sc->memcg_low_skipped = 1;
                                 continue;
                         }
-                       memcg_memory_event(memcg, MEMCG_LOW);
+                       if (memcg->memory.low)
+                               memcg_memory_event(memcg, MEMCG_LOW);
                 }

                 reclaimed = sc->nr_reclaimed;

Cheers,
Longman



  reply	other threads:[~2025-04-04 18:55 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-04  1:24 Waiman Long
2025-04-04  1:24 ` [PATCH v2 2/2] selftests: memcg: Increase error tolerance of child memory.current check in test_memcg_protection() Waiman Long
2025-04-04 17:12 ` [PATCH v2 1/2] memcg: Don't generate low/min events if either low/min or elow/emin is 0 Tejun Heo
2025-04-04 17:25   ` Waiman Long
2025-04-04 18:13     ` Johannes Weiner
2025-04-04 18:55       ` Waiman Long [this message]
2025-04-04 19:38         ` Johannes Weiner
2025-04-05 18:52           ` Waiman Long
2025-04-04 18:26 ` Michal Koutný
2025-04-04 19:01   ` Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c4294852-cc94-401e-8335-02741005e5d7@redhat.com \
    --to=llong@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mkoutny@suse.com \
    --cc=muchun.song@linux.dev \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeel.butt@linux.dev \
    --cc=shuah@kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox