linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Shakeel Butt <shakeelb@google.com>
To: Yang Shi <shy828301@gmail.com>, Fam Zheng <zhengfeiran@bytedance.com>
Cc: Cgroups <cgroups@vger.kernel.org>,
	"Linux MM" <linux-mm@kvack.org>, "Tejun Heo" <tj@kernel.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Li Zefan" <lizefan@huawei.com>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Vladimir Davydov" <vdavydov.dev@gmail.com>,
	duanxiongchun@bytedance.com, 张永肃 <zhangyongsu@bytedance.com>,
	liuxiaozhou@bytedance.com
Subject: Re: memory cgroup pagecache and inode problem
Date: Sun, 20 Jan 2019 15:20:11 -0800	[thread overview]
Message-ID: <CALvZod6eTa7vvf2c41yx3Yew7RYa56WvNE+HwmFUDzXCsO7PVg@mail.gmail.com> (raw)
In-Reply-To: <20190120231551.213847-1-shakeelb@google.com>

On Sun, Jan 20, 2019 at 3:16 PM Shakeel Butt <shakeelb@google.com> wrote:
>
> On Wed, Jan 16, 2019 at 9:07 PM Yang Shi <shy828301@gmail.com> wrote:
> ...
> > > > You mean it solves the problem by retrying more times?  Actually, I'm
> > > > not sure if you have swap setup in your test, but force_empty does do
> > > > swap if swap is on. This may cause it can't reclaim all the page cache
> > > > in 5 retries.  I have a patch within that series to skip swap.
> > >
> > > Basically yes, retrying solves the problem. But compared to immediate retries, a scheduled retry in a few seconds is much more effective.
> >
> > This may suggest doing force_empty in a worker is more effective in
> > fact. Not sure if this is good enough to convince Johannes or not.
> >
>
> From what I understand what we actually want is to force_empty an
> offlined memcg. How about we change the semantics of force_empty on
> root_mem_cgroup? Currently force_empty on root_mem_cgroup returns
> -EINVAL. Rather than that, let's do force_empty on all offlined memcgs
> if user does force_empty on root_mem_cgroup. Something like following.
>

Basically we don't need to add more complexity in kernel like
async/workers/timeouts/workqueues to run force_empty, if we expose a
way to force_empty offlined memcgs.

> ---
>  mm/memcontrol.c | 22 +++++++++++++++-------
>  1 file changed, 15 insertions(+), 7 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index a4ac554be7e8..51daa2935c41 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2898,14 +2898,16 @@ static inline bool memcg_has_children(struct mem_cgroup *memcg)
>   *
>   * Caller is responsible for holding css reference for memcg.
>   */
> -static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
> +static int mem_cgroup_force_empty(struct mem_cgroup *memcg, bool online)
>  {
>         int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
>
>         /* we call try-to-free pages for make this cgroup empty */
> -       lru_add_drain_all();
>
> -       drain_all_stock(memcg);
> +       if (online) {
> +               lru_add_drain_all();
> +               drain_all_stock(memcg);
> +       }
>
>         /* try to free all pages in this cgroup */
>         while (nr_retries && page_counter_read(&memcg->memory)) {
> @@ -2915,7 +2917,7 @@ static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
>                         return -EINTR;
>
>                 progress = try_to_free_mem_cgroup_pages(memcg, 1,
> -                                                       GFP_KERNEL, true);
> +                                                       GFP_KERNEL, online);
>                 if (!progress) {
>                         nr_retries--;
>                         /* maybe some writeback is necessary */
> @@ -2932,10 +2934,16 @@ static ssize_t mem_cgroup_force_empty_write(struct kernfs_open_file *of,
>                                             loff_t off)
>  {
>         struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
> +       struct mem_cgroup *mi;
>
> -       if (mem_cgroup_is_root(memcg))
> -               return -EINVAL;
> -       return mem_cgroup_force_empty(memcg) ?: nbytes;
> +       if (mem_cgroup_is_root(memcg)) {
> +               for_each_mem_cgroup_tree(mi, memcg) {
> +                       if (!mem_cgroup_online(mi))
> +                               mem_cgroup_force_empty(mi, false);
> +               }
> +               return 0;
> +       }
> +       return mem_cgroup_force_empty(memcg, true) ?: nbytes;
>  }
>
>  static u64 mem_cgroup_hierarchy_read(struct cgroup_subsys_state *css,
> --
> 2.20.1.321.g9e740568ce-goog
>

  parent reply	other threads:[~2019-01-20 23:20 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <15614FDC-198E-449B-BFAF-B00D6EF61155@bytedance.com>
2019-01-04  4:44 ` Fam Zheng
2019-01-04  5:00   ` Yang Shi
2019-01-04  5:12     ` Fam Zheng
2019-01-04 19:36       ` Yang Shi
2019-01-07  5:10         ` Fam Zheng
2019-01-07  8:53           ` Michal Hocko
2019-01-07  9:01             ` Fam Zheng
2019-01-07  9:13               ` Michal Hocko
2019-01-09  4:33               ` Fam Zheng
2019-01-10  5:36           ` Yang Shi
2019-01-10  8:30             ` Fam Zheng
2019-01-10  8:41               ` Michal Hocko
2019-01-16  0:50               ` Yang Shi
2019-01-16  3:52                 ` Fam Zheng
2019-01-16  7:06                   ` Michal Hocko
2019-01-16 21:08                     ` Yang Shi
2019-01-16 21:06                   ` Yang Shi
2019-01-17  2:41                     ` Fam Zheng
2019-01-17  5:06                       ` Yang Shi
2019-01-19  3:17                         ` 段熊春
2019-01-20 23:15                         ` Shakeel Butt
2019-01-20 23:15                           ` Shakeel Butt
2019-01-20 23:20                           ` Shakeel Butt [this message]
2019-01-21 10:27                           ` Michal Hocko
2019-01-04  9:04 ` Michal Hocko
2019-01-04 10:02   ` Fam Zheng
2019-01-04 10:12     ` Michal Hocko
2019-01-04 10:35       ` Fam Zheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALvZod6eTa7vvf2c41yx3Yew7RYa56WvNE+HwmFUDzXCsO7PVg@mail.gmail.com \
    --to=shakeelb@google.com \
    --cc=cgroups@vger.kernel.org \
    --cc=duanxiongchun@bytedance.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-mm@kvack.org \
    --cc=liuxiaozhou@bytedance.com \
    --cc=lizefan@huawei.com \
    --cc=mhocko@kernel.org \
    --cc=shy828301@gmail.com \
    --cc=tj@kernel.org \
    --cc=vdavydov.dev@gmail.com \
    --cc=zhangyongsu@bytedance.com \
    --cc=zhengfeiran@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox