From: Shakeel Butt <shakeelb@google.com>
To: Yang Shi <shy828301@gmail.com>, Fam Zheng <zhengfeiran@bytedance.com>
Cc: cgroups@vger.kernel.org, "Linux MM" <linux-mm@kvack.org>,
tj@kernel.org, "Johannes Weiner" <hannes@cmpxchg.org>,
lizefan@huawei.com, "Michal Hocko" <mhocko@kernel.org>,
"Vladimir Davydov" <vdavydov.dev@gmail.com>,
duanxiongchun@bytedance.com, 张永肃 <zhangyongsu@bytedance.com>,
liuxiaozhou@bytedance.com, "Shakeel Butt" <shakeelb@google.com>
Subject: memory cgroup pagecache and inode problem
Date: Sun, 20 Jan 2019 15:15:51 -0800 [thread overview]
Message-ID: <20190120231551.213847-1-shakeelb@google.com> (raw)
In-Reply-To: <CAHbLzkoRGk9nE6URO9xJKaAQ+8HDPJQosJuPyR1iYuaUBroDMg@mail.gmail.com>
On Wed, Jan 16, 2019 at 9:07 PM Yang Shi <shy828301@gmail.com> wrote:
...
> > > You mean it solves the problem by retrying more times? Actually, I'm
> > > not sure if you have swap set up in your test, but force_empty does
> > > swap if swap is on. This may prevent it from reclaiming all the page
> > > cache in 5 retries. I have a patch in that series to skip swap.
> >
> > Basically yes, retrying solves the problem. But compared to immediate retries, a scheduled retry in a few seconds is much more effective.
>
> This may suggest that doing force_empty in a worker is in fact more
> effective. Not sure if this is good enough to convince Johannes.
>
From what I understand, what we actually want is to force_empty an
offlined memcg. How about we change the semantics of force_empty on
the root_mem_cgroup? Currently, force_empty on the root_mem_cgroup
returns -EINVAL. Instead, let's run force_empty on all offlined memcgs
when the user does force_empty on the root_mem_cgroup. Something like
the following.
---
mm/memcontrol.c | 22 +++++++++++++++-------
1 file changed, 15 insertions(+), 7 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a4ac554be7e8..51daa2935c41 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2898,14 +2898,16 @@ static inline bool memcg_has_children(struct mem_cgroup *memcg)
*
* Caller is responsible for holding css reference for memcg.
*/
-static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
+static int mem_cgroup_force_empty(struct mem_cgroup *memcg, bool online)
{
int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
/* we call try-to-free pages for make this cgroup empty */
- lru_add_drain_all();
- drain_all_stock(memcg);
+ if (online) {
+ lru_add_drain_all();
+ drain_all_stock(memcg);
+ }
/* try to free all pages in this cgroup */
while (nr_retries && page_counter_read(&memcg->memory)) {
@@ -2915,7 +2917,7 @@ static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
return -EINTR;
progress = try_to_free_mem_cgroup_pages(memcg, 1,
- GFP_KERNEL, true);
+ GFP_KERNEL, online);
if (!progress) {
nr_retries--;
/* maybe some writeback is necessary */
@@ -2932,10 +2934,16 @@ static ssize_t mem_cgroup_force_empty_write(struct kernfs_open_file *of,
loff_t off)
{
struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
+ struct mem_cgroup *mi;
- if (mem_cgroup_is_root(memcg))
- return -EINVAL;
- return mem_cgroup_force_empty(memcg) ?: nbytes;
+ if (mem_cgroup_is_root(memcg)) {
+ for_each_mem_cgroup_tree(mi, memcg) {
+ if (!mem_cgroup_online(mi))
+ mem_cgroup_force_empty(mi, false);
+ }
+ return 0;
+ }
+ return mem_cgroup_force_empty(memcg, true) ?: nbytes;
}
static u64 mem_cgroup_hierarchy_read(struct cgroup_subsys_state *css,
--
2.20.1.321.g9e740568ce-goog
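For reference, the proposed interface change would be exercised from
userspace roughly like this. This is only a sketch: it assumes cgroup v1
with the memory controller mounted at /sys/fs/cgroup/memory, and the
cgroup name "test" and the file "bigfile" are illustrative, not from the
patch itself.

```shell
# Create a child memcg and charge some page cache to it (illustrative
# names; requires root and a v1 memory controller mount).
mkdir /sys/fs/cgroup/memory/test
echo $$ > /sys/fs/cgroup/memory/test/cgroup.procs
cat bigfile > /dev/null                  # page cache now charged to "test"

# Move back to the root and remove the child; the memcg goes offline
# but its page cache keeps it pinned in memory.
echo $$ > /sys/fs/cgroup/memory/cgroup.procs
rmdir /sys/fs/cgroup/memory/test

# Today this write returns -EINVAL on the root. With the patch above,
# it would instead run force_empty on every offlined memcg in the tree.
echo 1 > /sys/fs/cgroup/memory/memory.force_empty
```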
Thread overview: 28+ messages
[not found] <15614FDC-198E-449B-BFAF-B00D6EF61155@bytedance.com>
2019-01-04 4:44 ` Fam Zheng
2019-01-04 5:00 ` Yang Shi
2019-01-04 5:12 ` Fam Zheng
2019-01-04 19:36 ` Yang Shi
2019-01-07 5:10 ` Fam Zheng
2019-01-07 8:53 ` Michal Hocko
2019-01-07 9:01 ` Fam Zheng
2019-01-07 9:13 ` Michal Hocko
2019-01-09 4:33 ` Fam Zheng
2019-01-10 5:36 ` Yang Shi
2019-01-10 8:30 ` Fam Zheng
2019-01-10 8:41 ` Michal Hocko
2019-01-16 0:50 ` Yang Shi
2019-01-16 3:52 ` Fam Zheng
2019-01-16 7:06 ` Michal Hocko
2019-01-16 21:08 ` Yang Shi
2019-01-16 21:06 ` Yang Shi
2019-01-17 2:41 ` Fam Zheng
2019-01-17 5:06 ` Yang Shi
2019-01-19 3:17 ` 段熊春
2019-01-20 23:15 ` Shakeel Butt [this message]
  2019-01-20 23:20                     ` Shakeel Butt
2019-01-21 10:27 ` Michal Hocko
2019-01-04 9:04 ` Michal Hocko
2019-01-04 10:02 ` Fam Zheng
2019-01-04 10:12 ` Michal Hocko
2019-01-04 10:35 ` Fam Zheng