From: Yang Shi <yang.shi@linux.alibaba.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: hannes@cmpxchg.org, akpm@linux-foundation.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/3] mm: memcontrol: delayed force empty
Date: Thu, 3 Jan 2019 10:40:54 -0800
Message-ID: <6f43e926-3bb5-20d1-2e39-1d30bf7ad375@linux.alibaba.com>
In-Reply-To: <20190103181329.GW31793@dhcp22.suse.cz>

On 1/3/19 10:13 AM, Michal Hocko wrote:
> On Thu 03-01-19 09:33:14, Yang Shi wrote:
>>
>> On 1/3/19 2:12 AM, Michal Hocko wrote:
>>> On Thu 03-01-19 04:05:30, Yang Shi wrote:
>>>> Currently, force empty reclaims memory synchronously when writing to
>>>> memory.force_empty. It may take some time to return, and subsequent
>>>> operations are blocked by it. Although it can be interrupted by a signal,
>>>> it still seems suboptimal.
>>> Why is it suboptimal? We are doing that operation on behalf of the
>>> process requesting it. Why should anybody else pay for it? In other
>>> words why should we hide the overhead?
>> Please see the below explanation.
>>
>>>> Now css offline is handled by worker, and the typical usecase of force
>>>> empty is before memcg offline. So, handling force empty in css offline
>>>> sounds reasonable.
>>> Hmm, so I guess you are talking about
>>> echo 1 > $MEMCG/force_empty
>>> rmdir $MEMCG
>>>
>>> and you are complaining that the operation takes too long. Right? Why do
>>> you care actually?
>> We have some use cases which create and remove memcgs very frequently, and
>> the tasks in the memcg may only access files which are unlikely to be
>> accessed by anyone else. So we prefer to force_empty the memcg before
>> rmdir'ing it to reclaim the page cache, so that it doesn't accumulate and
>> cause unnecessary memory pressure. That memory pressure may trigger direct
>> reclaim, which harms some latency-sensitive applications.
> Yes, this makes sense to me.
>
>> And the create/remove might be run sequentially in a script (there might be
>> a lot of scripts or applications running in parallel to do this), e.g.
>> mkdir cg1
>> do something
>> echo 0 > cg1/memory.force_empty
>> rmdir cg1
>>
>> mkdir cg2
>> ...
>>
>> The creation of the next memcg might be blocked by the force_empty for a
>> long time if there is a lot of page cache, so the overall throughput of the
>> system may suffer.
> Is there any reason for your scripts to be strictly sequential here? In
> other words why cannot you offload those expensive operations to a
> detached context in _userspace_?
I would say it does not have to be strictly sequential. The above script is
just an example to illustrate the pattern. But sometimes it may hit
such a pattern due to the complicated cluster scheduling and container
scheduling in the production environment; for example, the creation
process might be scheduled on the same CPU which is doing force_empty. I
have to say I don't know too much about the internals of the container
scheduling.
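
For illustration only, the userspace offloading asked about above might look
roughly like the sketch below. It pushes the force_empty write and the rmdir
into a background subshell so the caller can create the next memcg right
away. The helper name teardown_detached and the CG_ROOT default (the usual
cgroup v1 memory controller mount point) are assumptions, not anything taken
from the patch series or from our production scripts.

```shell
#!/bin/sh
# Assumption: cgroup v1 memory controller mounted at the usual location.
# Override CG_ROOT if the hierarchy lives elsewhere.
CG_ROOT="${CG_ROOT:-/sys/fs/cgroup/memory}"

# Hypothetical helper: reclaim the page cache charged to a memcg and remove
# the memcg, without making the caller wait for the reclaim to finish.
teardown_detached() {
    cg="$CG_ROOT/$1"
    (
        # Writing to memory.force_empty reclaims synchronously, but only
        # this background subshell blocks on it.
        echo 0 > "$cg/memory.force_empty" &&
        rmdir "$cg"
    ) &
}

# Intended usage (needs privileges on a real cgroup hierarchy):
#   mkdir "$CG_ROOT/cg1"
#   ... do something ...
#   teardown_detached cg1
#   mkdir "$CG_ROOT/cg2"    # not blocked by the reclaim of cg1
```

This only moves the wait out of the creating process. The reclaim itself still
runs on whichever CPU the background subshell is scheduled on.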
Thanks,
Yang