Re: [PATCH] mm: memcontrol: prevent starvation when writing memory.high

From: Michal Hocko <mhocko@suse.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Tejun Heo <tj@kernel.org>, Roman Gushchin <guro@fb.com>,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] mm: memcontrol: prevent starvation when writing memory.high
Date: Wed, 13 Jan 2021 15:46:54 +0100
Message-ID: <20210113144654.GD22493@dhcp22.suse.cz>
In-Reply-To: <20210112163011.127833-1-hannes@cmpxchg.org>

On Tue 12-01-21 11:30:11, Johannes Weiner wrote:
> When a value is written to a cgroup's memory.high control file, the
> write() context first tries to reclaim the cgroup to size before
> putting the limit in place for the workload. Concurrent charges from
> the workload can keep such a write() looping in reclaim indefinitely.
> 
> In the past, a write to memory.high would first put the limit in place
> for the workload, then do targeted reclaim until the new limit has
> been met - similar to how we do it for memory.max. This wasn't prone
> to the described starvation issue. However, this sequence could cause
> excessive latencies in the workload, when allocating threads could be
> put into long penalty sleeps on the sudden memory.high overage created
> by the write(), before that had a chance to work it off.
> 
> Now that memory_high_write() performs reclaim before enforcing the new
> limit, reflect that the cgroup may well fail to converge due to
> concurrent workload activity. Bail out of the loop after a few tries.

I can see that you have provided some more details in follow-up replies,
but I do not see any explicit argument for why an excessive time spent
by the writer is an actual problem. Could you be more specific?

If the writer is time sensitive then there is a trivial way to
work around that: kill it with a signal (timeout 30s echo ....).

Btw. this behavior has already been considered in
http://lkml.kernel.org/r/20200710122917.GB3022@dhcp22.suse.cz/
"
With this change
the reclaim here might be just playing never ending catch up. On the
plus side a break out from the reclaim loop would just enforce the limit
so if the operation takes too long then the reclaim burden will move
over to consumers eventually. So I do not see any real danger.
"

> Fixes: 536d3bf261a2 ("mm: memcontrol: avoid workload stalls when lowering memory.high")
> Cc: <stable@vger.kernel.org> # 5.8+

Why is this worth backporting to stable? The old and new behaviors
differ, but I do not think either of them is harmful.

> Reported-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

I am not against the patch. The existing interface doesn't provide any
meaningful feedback to userspace anyway. A user would have to recheck
to see the result of the operation. So how hard we try is really an
implementation detail.
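
To illustrate what "recheck" means for a userspace writer, here is a
minimal sketch. It assumes the cgroup v2 files memory.high and
memory.current in a cgroup directory passed in by the caller; the
helper name memcg_set_high_and_check() is hypothetical and not part of
any library. It only shows that a successful write() says nothing about
convergence and that the caller has to compare the usage itself.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: write high_bytes to <cgdir>/memory.high, then
 * read <cgdir>/memory.current to see whether the cgroup is within the
 * new limit. Returns 1 if usage <= high_bytes, 0 if usage is still
 * above it, and -1 on an I/O or parse error.
 */
static int memcg_set_high_and_check(const char *cgdir,
				    unsigned long long high_bytes,
				    unsigned long long *current_out)
{
	char path[4096];
	char buf[64];
	char *end;
	unsigned long long cur;
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path), "%s/memory.high", cgdir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fprintf(f, "%llu\n", high_bytes) < 0) {
		fclose(f);
		return -1;
	}
	/* Buffered data is flushed here; this is where the write can block. */
	if (fclose(f) != 0)
		return -1;

	n = snprintf(path, sizeof(path), "%s/memory.current", cgdir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	cur = strtoull(buf, &end, 10);
	if (end == buf)
		return -1;	/* no digits found */

	if (current_out)
		*current_out = cur;
	return cur <= high_bytes;
}
```

A return value of 0 is not an error here: the limit is in place and the
remaining reclaim work falls to the workload's own allocations.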

> ---
>  mm/memcontrol.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 605f671203ef..63a8d47c1cd3 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6275,7 +6275,6 @@ static ssize_t memory_high_write(struct kernfs_open_file *of,
>  
>  	for (;;) {
>  		unsigned long nr_pages = page_counter_read(&memcg->memory);
> -		unsigned long reclaimed;
>  
>  		if (nr_pages <= high)
>  			break;
> @@ -6289,10 +6288,10 @@ static ssize_t memory_high_write(struct kernfs_open_file *of,
>  			continue;
>  		}
>  
> -		reclaimed = try_to_free_mem_cgroup_pages(memcg, nr_pages - high,
> -							 GFP_KERNEL, true);
> +		try_to_free_mem_cgroup_pages(memcg, nr_pages - high,
> +					     GFP_KERNEL, true);
>  
> -		if (!reclaimed && !nr_retries--)
> +		if (!nr_retries--)
>  			break;
>  	}
>  
> -- 
> 2.30.0
> 
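
For readers following the hunk above, the difference between the two
exit conditions can be sketched in a small userspace simulation. This
is not kernel code: struct sim_cgroup, sim_reclaim() and
sim_high_write() are hypothetical stand-ins, the retry count is
whatever the caller passes in, and the parts of the real loop that lie
outside the quoted hunks are left out. The workload is modelled as
charging a fixed number of pages after every reclaim pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a cgroup under concurrent charging. */
struct sim_cgroup {
	unsigned long usage;		/* pages currently charged */
	unsigned long charge_rate;	/* pages the workload charges per pass */
	unsigned long reclaim_rate;	/* most pages one reclaim pass can free */
};

/*
 * One reclaim pass: free up to reclaim_rate pages toward the target,
 * then let the workload charge again. Returns the pages reclaimed.
 */
static unsigned long sim_reclaim(struct sim_cgroup *cg, unsigned long target)
{
	unsigned long freed = target < cg->reclaim_rate ? target : cg->reclaim_rate;

	if (freed > cg->usage)
		freed = cg->usage;
	cg->usage -= freed;
	cg->usage += cg->charge_rate;
	return freed;
}

/*
 * Model of the reclaim loop in the quoted diff. With patched == false
 * the retry counter only drops when a pass reclaims nothing (the "-"
 * line); with patched == true it drops on every pass (the "+" line).
 * max_passes caps the simulation so the starving case terminates.
 * Returns the number of reclaim passes performed.
 */
static unsigned long sim_high_write(struct sim_cgroup *cg, unsigned long high,
				    int nr_retries, bool patched,
				    unsigned long max_passes)
{
	unsigned long passes = 0;

	assert(cg != NULL);

	while (passes < max_passes) {
		unsigned long nr_pages = cg->usage;
		unsigned long reclaimed;

		if (nr_pages <= high)
			break;

		reclaimed = sim_reclaim(cg, nr_pages - high);
		passes++;

		if (patched) {
			if (!nr_retries--)
				break;
		} else {
			if (!reclaimed && !nr_retries--)
				break;
		}
	}
	return passes;
}
```

With usage 1000, high 500 and charge and reclaim rates both 100, every
pass reclaims something, so the unpatched condition never counts down
and the loop runs until the simulation cap. The patched condition
stops after nr_retries + 1 passes regardless of progress. When the
workload is idle (charge rate 0), both variants converge after 5 passes.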

-- 
Michal Hocko
SUSE Labs
