linux-mm.kvack.org archive mirror
From: Shakeel Butt <shakeelb@google.com>
To: Yang Shi <shy828301@gmail.com>, Fam Zheng <zhengfeiran@bytedance.com>
Cc: cgroups@vger.kernel.org, "Linux MM" <linux-mm@kvack.org>,
	tj@kernel.org, "Johannes Weiner" <hannes@cmpxchg.org>,
	lizefan@huawei.com, "Michal Hocko" <mhocko@kernel.org>,
	"Vladimir Davydov" <vdavydov.dev@gmail.com>,
	duanxiongchun@bytedance.com, 张永肃 <zhangyongsu@bytedance.com>,
	liuxiaozhou@bytedance.com, "Shakeel Butt" <shakeelb@google.com>
Subject: memory cgroup pagecache and inode problem
Date: Sun, 20 Jan 2019 15:15:51 -0800
Message-ID: <20190120231551.213847-1-shakeelb@google.com>
In-Reply-To: <CAHbLzkoRGk9nE6URO9xJKaAQ+8HDPJQosJuPyR1iYuaUBroDMg@mail.gmail.com>

On Wed, Jan 16, 2019 at 9:07 PM Yang Shi <shy828301@gmail.com> wrote:
...
> > > You mean it solves the problem by retrying more times?  Actually, I'm
> > > not sure if you have swap set up in your test, but force_empty does do
> > > swap if swap is on. This may cause it to fail to reclaim all the page
> > > cache within 5 retries.  I have a patch within that series to skip swap.
> >
> > Basically yes, retrying solves the problem. But compared to immediate retries, a scheduled retry in a few seconds is much more effective.
>
> This may suggest that doing force_empty in a worker is in fact more
> effective. Not sure if this is enough to convince Johannes.
>
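
(For illustration, the scheduled-retry idea above could look roughly
like the sketch below. The force_empty_work member and the
memcg_try_force_empty() helper are hypothetical, just to show the
shape of a delayed_work based retry; they are not part of the kernel
or of the patch that follows.)

#include <linux/memcontrol.h>
#include <linux/workqueue.h>
#include <linux/jiffies.h>

/* Hypothetical helper: one reclaim pass, returns true once empty. */
static bool memcg_try_force_empty(struct mem_cgroup *memcg);

static void force_empty_workfn(struct work_struct *work)
{
	/* Assumes a hypothetical delayed_work member in struct
	 * mem_cgroup. */
	struct mem_cgroup *memcg = container_of(to_delayed_work(work),
						struct mem_cgroup,
						force_empty_work);

	/* Retry a few seconds later instead of immediately; the delay
	 * gives writeback a chance to make progress. */
	if (!memcg_try_force_empty(memcg))
		schedule_delayed_work(&memcg->force_empty_work, 5 * HZ);
}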

From what I understand, what we actually want is to force_empty an
offlined memcg. How about we change the semantics of force_empty on
root_mem_cgroup? Currently force_empty on root_mem_cgroup returns
-EINVAL. Instead, let's do force_empty on all offlined memcgs when the
user does force_empty on root_mem_cgroup. For offlined memcgs the
reclaim also skips swap, since the last argument to
try_to_free_mem_cgroup_pages (may_swap) is false in that case, which
addresses the swap concern quoted above. Something like the following.
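
From userspace, a system-wide cleanup of zombie memcgs would then be a
single write to the root's force_empty file. A minimal illustrative
snippet, assuming cgroup v1 with the memory controller mounted at
/sys/fs/cgroup/memory:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* With the patch below applied, writing to the root's
	 * force_empty reclaims from all offlined descendant memcgs
	 * instead of returning -EINVAL. */
	int fd = open("/sys/fs/cgroup/memory/memory.force_empty",
		      O_WRONLY);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (write(fd, "1", 1) < 0)
		perror("write");
	close(fd);
	return 0;
}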

---
 mm/memcontrol.c | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a4ac554be7e8..51daa2935c41 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2898,14 +2898,16 @@ static inline bool memcg_has_children(struct mem_cgroup *memcg)
  *
  * Caller is responsible for holding css reference for memcg.
  */
-static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
+static int mem_cgroup_force_empty(struct mem_cgroup *memcg, bool online)
 {
 	int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
 
 	/* we call try-to-free pages for make this cgroup empty */
-	lru_add_drain_all();
 
-	drain_all_stock(memcg);
+	if (online) {
+		lru_add_drain_all();
+		drain_all_stock(memcg);
+	}
 
 	/* try to free all pages in this cgroup */
 	while (nr_retries && page_counter_read(&memcg->memory)) {
@@ -2915,7 +2917,7 @@ static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
 			return -EINTR;
 
 		progress = try_to_free_mem_cgroup_pages(memcg, 1,
-							GFP_KERNEL, true);
+							GFP_KERNEL, online);
 		if (!progress) {
 			nr_retries--;
 			/* maybe some writeback is necessary */
@@ -2932,10 +2934,16 @@ static ssize_t mem_cgroup_force_empty_write(struct kernfs_open_file *of,
 					    loff_t off)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
+	struct mem_cgroup *mi;
 
-	if (mem_cgroup_is_root(memcg))
-		return -EINVAL;
-	return mem_cgroup_force_empty(memcg) ?: nbytes;
+	if (mem_cgroup_is_root(memcg)) {
+		for_each_mem_cgroup_tree(mi, memcg) {
+			if (!mem_cgroup_online(mi))
+				mem_cgroup_force_empty(mi, false);
+		}
+		return nbytes;
+	}
+	return mem_cgroup_force_empty(memcg, true) ?: nbytes;
 }
 
 static u64 mem_cgroup_hierarchy_read(struct cgroup_subsys_state *css,
-- 
2.20.1.321.g9e740568ce-goog


Thread overview: 28+ messages
     [not found] <15614FDC-198E-449B-BFAF-B00D6EF61155@bytedance.com>
2019-01-04  4:44 ` Fam Zheng
2019-01-04  5:00   ` Yang Shi
2019-01-04  5:12     ` Fam Zheng
2019-01-04 19:36       ` Yang Shi
2019-01-07  5:10         ` Fam Zheng
2019-01-07  8:53           ` Michal Hocko
2019-01-07  9:01             ` Fam Zheng
2019-01-07  9:13               ` Michal Hocko
2019-01-09  4:33               ` Fam Zheng
2019-01-10  5:36           ` Yang Shi
2019-01-10  8:30             ` Fam Zheng
2019-01-10  8:41               ` Michal Hocko
2019-01-16  0:50               ` Yang Shi
2019-01-16  3:52                 ` Fam Zheng
2019-01-16  7:06                   ` Michal Hocko
2019-01-16 21:08                     ` Yang Shi
2019-01-16 21:06                   ` Yang Shi
2019-01-17  2:41                     ` Fam Zheng
2019-01-17  5:06                       ` Yang Shi
2019-01-19  3:17                         ` 段熊春
2019-01-20 23:15                         ` Shakeel Butt [this message]
2019-01-20 23:20                           ` Shakeel Butt
2019-01-21 10:27                           ` Michal Hocko
2019-01-04  9:04 ` Michal Hocko
2019-01-04 10:02   ` Fam Zheng
2019-01-04 10:12     ` Michal Hocko
2019-01-04 10:35       ` Fam Zheng
