Re: [RFC PATCH] mm, memcg: introduce memory.high.throttle

From: Waiman Long <llong@redhat.com>
To: Roman Gushchin <roman.gushchin@linux.dev>,
	Waiman Long <llong@redhat.com>
Cc: "Michal Hocko" <mhocko@suse.com>, "Tejun Heo" <tj@kernel.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Shakeel Butt" <shakeel.butt@linux.dev>,
	"Muchun Song" <muchun.song@linux.dev>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	linux-mm@kvack.org, linux-doc@vger.kernel.org,
	"Peter Hunt" <pehunt@redhat.com>
Subject: Re: [RFC PATCH] mm, memcg: introduce memory.high.throttle
Date: Thu, 30 Jan 2025 12:19:38 -0500
Message-ID: <a309f420-4a25-4cf5-b6f0-750059c8467c@redhat.com>
In-Reply-To: <Z5uxVzFf7Pk7yk9f@google.com>

On 1/30/25 12:05 PM, Roman Gushchin wrote:
> On Thu, Jan 30, 2025 at 10:05:34AM -0500, Waiman Long wrote:
>> On 1/30/25 3:15 AM, Michal Hocko wrote:
>>> On Wed 29-01-25 14:12:04, Waiman Long wrote:
>>>> Since commit 0e4b01df8659 ("mm, memcg: throttle allocators when failing
>>>> reclaim over memory.high"), the amount of allocator throttling has
>>>> increased substantially. As a result, it can be difficult for a
>>>> misbehaving application that consumes an increasing amount of memory
>>>> to get OOM-killed if memory.high is set. Instead, the application may
>>>> just crawl along, holding close to the allowed memory.high amount of
>>>> memory in its memory cgroup for a very long time, especially if it
>>>> does a lot of memcg charging and uncharging operations.
>>>>
>>>> This behavior makes the upstream Kubernetes community hesitate to
>>>> use memory.high. Instead, they use only memory.max for memory control
>>>> similar to what is being done for cgroup v1 [1].
>>> Why is this a problem for them?
>> My understanding is that a misbehaving container will hold on to a
>> memory.high amount of memory for a long time instead of getting OOM killed
>> sooner, so that the memory could be used more productively elsewhere.
>>>> To allow better control of the amount of throttling and hence the
>>>> speed at which a misbehaving task can be OOM killed, a new single-value
>>>> memory.high.throttle control file is now added. The allowable range
>>>> is 0-32. By default, it has a value of 0, which means maximum throttling
>>>> like before. Any non-zero positive value represents the corresponding
>>>> power of 2 reduction of throttling and makes OOM kills more likely.
>>> I do not like the interface to be honest. It exposes an implementation
>>> detail and casts it into a user API. If we ever need to change the way
>>> the throttling is implemented this will stand in the way because
>>> there will be applications depending on a behavior they were carefully
>>> tuned to.
>>>
>>> It is also not entirely clear how this is supposed to be used in
>>> practice. How do people know what kind of value they should use?
>> Yes, I agree that a user may need to run some trial runs to find a proper
>> value. Perhaps a simpler binary interface of "off" and "on" may be easier to
>> understand and use.
>>>> System administrators can now use this parameter to determine how easily
>>>> they want OOM kills to happen for applications that tend to consume
>>>> a lot of memory, without the need to run a special userspace memory
>>>> management tool to monitor memory consumption when memory.high is set.
>>> Why cannot they achieve the same with the existing events/metrics we
>>> already do provide? Most notably PSI which is properly accounted when
>>> a task is throttled due to memory.high throttling.
>> That will require the use of a userspace management agent that looks for
>> these stalling conditions and does the kill, if necessary. There are
>> certainly users out there that want to get some benefit of using memory.high
>> like early memory reclaim without the trouble of handling these kinds of
>> stalling conditions.
> So you basically want to force the workload into some sort of a proactive
> reclaim but without an artificial slow down?
> It makes some sense to me, but
> 1) Idk if it deserves a new API, because it can be relatively easily implemented
>    in userspace by a daemon which monitors cgroup usage and reclaims the memory
>    if necessary. No kernel changes are needed.
> 2) If a new API is introduced, I think it's better to introduce a new limit,
>    e.g. memory.target, keeping memory.high semantics intact.

Yes, you are right about that. Introducing a new "memory.target" without 
disturbing the existing "memory.high" semantics will work for me too.
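
For reference, the userspace alternative in 1) above could look roughly
like the sketch below. It is only an illustration under stated
assumptions: it polls the cgroup v2 memory.current file and, when usage
exceeds a chosen target, writes the excess to memory.reclaim (available
only on kernels that provide that file). The function names, the target
value and the polling interval are all hypothetical and not part of any
posted patch.

```python
import os
import time


def read_current(cgroup_dir):
    """Return the cgroup's memory usage in bytes, read from memory.current."""
    with open(os.path.join(cgroup_dir, "memory.current")) as f:
        return int(f.read().strip())


def bytes_to_reclaim(current, target):
    """Return how many bytes the usage is above the target (0 if below)."""
    return max(0, current - target)


def reclaim_once(cgroup_dir, target):
    """Ask the kernel to reclaim the excess over target; return that excess."""
    excess = bytes_to_reclaim(read_current(cgroup_dir), target)
    if excess:
        try:
            with open(os.path.join(cgroup_dir, "memory.reclaim"), "w") as f:
                f.write(str(excess))
        except OSError:
            # The write fails with EAGAIN when fewer bytes than requested
            # were reclaimed; this sketch simply retries on the next poll.
            pass
    return excess


def monitor(cgroup_dir, target, interval=1.0):
    """Poll forever (hypothetical policy: fixed target, fixed interval)."""
    while True:
        reclaim_once(cgroup_dir, target)
        time.sleep(interval)
```

A daemon would call monitor() with the cgroup's directory under the
cgroup v2 mount and a target in bytes. Unlike memory.high, nothing here
throttles the workload; it only triggers reclaim from outside.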

Cheers,
Longman
