Subject: Re: [PATCH v2] mm: terminate the reclaim early when direct reclaiming
From: Zhaoyang Huang
Date: Tue, 31 Jul 2018 19:58:20 +0800
To: Michal Hocko
Cc: Steven Rostedt, Ingo Molnar, Johannes Weiner, Vladimir Davydov,
 "open list:MEMORY MANAGEMENT", LKML, kernel-patch-test@lists.linaro.org
In-Reply-To: <20180731111924.GI4557@dhcp22.suse.cz>

On Tue, Jul 31, 2018 at 7:19 PM Michal Hocko wrote:
>
> On Tue 31-07-18 19:09:28, Zhaoyang Huang wrote:
> > This patch tries to let direct reclaim finish earlier than it
> > currently does. The problem comes from our observation that direct
> > reclaim takes a long time to finish when memcg is enabled. By
> > debugging, we found that the reason is that the soft limit is too
> > low to meet the loop-end criteria. So we add two checks, using the
> > same criteria as in shrink_lruvec, to judge whether enough memory
> > has been reclaimed:
> > 1. for each memcg soft limit reclaim;
> > 2. before starting the global reclaim in shrink_zone.
>
> Then I would really recommend not using the soft limit at all. It has
> always been aggressive. I proposed making it less so in the past, but
> we decided to keep the current behavior because we simply do not know
> whether somebody depends on it. Your changelog doesn't really tell
> the whole story. Why is this a problem all of a sudden? Nothing has
> really changed recently AFAICT. The cgroup v1 interface is mostly for
> backward compatibility; we have much better ways to accomplish
> workload isolation in cgroup v2.
>
> So why does it matter all of a sudden?
>
> Besides that, EXPORT_SYMBOL for such low-level functionality as
> memory reclaim is a big no-no.
>
> So without a much better explanation, and with a low-level symbol
> exported, it's a NAK from me.
>
My test workload is from an Android system, where multimedia apps
require many pages. We observed one thread of the process trapped in
mem_cgroup_soft_limit_reclaim within direct reclaim, which also
blocked other threads of the process in mmap or do_page_fault (on a
semaphore?). Furthermore, we observed other long direct reclaims
related to the soft limit, which presumably cause page thrashing, as
the allocating memcg itself is the rightmost node of the rb_tree.
Besides, even without the soft limit, shouldn't direct reclaim check
the watermark first, before shrink_node, since a concurrent kswapd may
already have reclaimed enough pages for the allocation?
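To illustrate the idea, an untested sketch (not part of the patch
below; the helper name and the zone_watermark_ok()-based check are
only my assumption of how such a bail-out could look):

/*
 * Hypothetical helper, for illustration only: returns true when every
 * eligible zone of the zonelist is already above its low watermark,
 * e.g. because a concurrent kswapd has reclaimed enough pages for
 * this allocation, so direct reclaim could stop early.
 */
static bool direct_reclaim_watermarks_ok(struct zonelist *zonelist,
					 gfp_t gfp_mask, int order)
{
	enum zone_type highidx = gfp_zone(gfp_mask);
	struct zoneref *z;
	struct zone *zone;

	for_each_zone_zonelist(zone, z, zonelist, highidx) {
		if (!managed_zone(zone))
			continue;
		if (!zone_watermark_ok(zone, order, low_wmark_pages(zone),
				       highidx, 0))
			return false;
	}
	return true;
}

shrink_zones() could consult such a check at the top of each pass and
return immediately when it holds, instead of entering shrink_node().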
> >
> > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@spreadtrum.com>
> > ---
> >  include/linux/memcontrol.h |  3 ++-
> >  mm/memcontrol.c            |  3 +++
> >  mm/vmscan.c                | 38 +++++++++++++++++++++++++++++++++++++-
> >  3 files changed, 42 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > index 6c6fb11..a7e82c7 100644
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -325,7 +325,8 @@ void mem_cgroup_cancel_charge(struct page *page, struct mem_cgroup *memcg,
> >  void mem_cgroup_uncharge_list(struct list_head *page_list);
> >
> >  void mem_cgroup_migrate(struct page *oldpage, struct page *newpage);
> > -
> > +bool direct_reclaim_reach_watermark(pg_data_t *pgdat, unsigned long nr_reclaimed,
> > +			unsigned long nr_scanned, gfp_t gfp_mask, int order);
> >  static struct mem_cgroup_per_node *
> >  mem_cgroup_nodeinfo(struct mem_cgroup *memcg, int nid)
> >  {
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 8c0280b..e4efd46 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -2577,6 +2577,9 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
> >  		    (next_mz == NULL ||
> >  		     loop > MEM_CGROUP_MAX_SOFT_LIMIT_RECLAIM_LOOPS))
> >  			break;
> > +		if (direct_reclaim_reach_watermark(pgdat, nr_reclaimed,
> > +					*total_scanned, gfp_mask, order))
> > +			break;
> >  	} while (!nr_reclaimed);
> >  	if (next_mz)
> >  		css_put(&next_mz->memcg->css);
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 03822f8..19503f3 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2518,6 +2518,34 @@ static bool pgdat_memcg_congested(pg_data_t *pgdat, struct mem_cgroup *memcg)
> >  		(memcg && memcg_congested(pgdat, memcg));
> >  }
> >
> > +bool direct_reclaim_reach_watermark(pg_data_t *pgdat, unsigned long nr_reclaimed,
> > +			unsigned long nr_scanned, gfp_t gfp_mask,
> > +			int order)
> > +{
> > +	struct scan_control sc = {
> > +		.gfp_mask = gfp_mask,
> > +		.order = order,
> > +		.priority = DEF_PRIORITY,
> > +		.nr_reclaimed = nr_reclaimed,
> > +		.nr_scanned = nr_scanned,
> > +	};
> > +
> > +	/* Only direct reclaim is allowed to bail out early here. */
> > +	if (current_is_kswapd())
> > +		return false;
> > +	if (!IS_ENABLED(CONFIG_COMPACTION))
> > +		return false;
> > +	/*
> > +	 * Add 1 to nr_reclaimed and nr_scanned so that
> > +	 * should_continue_reclaim() cannot return false just because both
> > +	 * are zero; the decision is then made by compaction_suitable(),
> > +	 * which judges whether we have reclaimed enough pages to pass the
> > +	 * watermark, so no further memcg needs to be checked.
> > +	 */
> > +	if (!should_continue_reclaim(pgdat,
> > +				sc.nr_reclaimed + 1, sc.nr_scanned + 1, &sc))
> > +		return true;
> > +	return false;
> > +}
> > +EXPORT_SYMBOL(direct_reclaim_reach_watermark);
> > +
> >  static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
> >  {
> >  	struct reclaim_state *reclaim_state = current->reclaim_state;
> > @@ -2802,7 +2830,15 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> >  			sc->nr_scanned += nr_soft_scanned;
> >  			/* need some check to avoid more shrink_zone() */
> >  		}
> > -
> > +		/*
> > +		 * We may already have stolen enough pages from soft limit
> > +		 * reclaim, so return early if this is direct reclaim.
> > +		 */
> > +		if (direct_reclaim_reach_watermark(zone->zone_pgdat,
> > +					sc->nr_reclaimed, sc->nr_scanned,
> > +					sc->gfp_mask, sc->order)) {
> > +			sc->gfp_mask = orig_mask;
> > +			return;
> > +		}
> >  		/* See comment about same check for global reclaim above */
> >  		if (zone->zone_pgdat == last_pgdat)
> >  			continue;
> > --
> > 1.9.1
>
> --
> Michal Hocko
> SUSE Labs