linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Zhaoyang Huang <huangzhaoyang@gmail.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	"open list:MEMORY MANAGEMENT" <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	kernel-patch-test@lists.linaro.org
Subject: Re: [PATCH v2] mm: terminate the reclaim early when direct reclaiming
Date: Tue, 31 Jul 2018 19:58:20 +0800	[thread overview]
Message-ID: <CAGWkznGrc4cgMN4P5OJKGi_UV6kU_6yjV9XcPHv5MVRn11+pzw@mail.gmail.com> (raw)
In-Reply-To: <20180731111924.GI4557@dhcp22.suse.cz>

On Tue, Jul 31, 2018 at 7:19 PM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Tue 31-07-18 19:09:28, Zhaoyang Huang wrote:
> > This patch try to let the direct reclaim finish earlier than it used
> > to be. The problem comes from We observing that the direct reclaim
> > took a long time to finish when memcg is enabled. By debugging, we
> > find that the reason is the softlimit is too low to meet the loop
> > end criteria. So we add two barriers to judge if it has reclaimed
> > enough memory as same criteria as it is in shrink_lruvec:
> > 1. for each memcg softlimit reclaim.
> > 2. before starting the global reclaim in shrink_zone.
>
> Then I would really recommend to not use soft limit at all. It has
> always been aggressive. I have propose to make it less so in the past we
> have decided to go that way because we simply do not know whether
> somebody depends on that behavior. Your changelog doesn't really tell
> the whole story. Why is this a problem all of the sudden? Nothing has
> really changed recently AFAICT. Cgroup v1 interface is mostly for
> backward compatibility, we have much better ways to accomplish
> workloads isolation in cgroup v2.
>
> So why does it matter all of the sudden?
>
> Besides that EXPORT_SYMBOL for such a low level functionality as the
> memory reclaim is a big no-no.
>
> So without a much better explanation and with a low level symbol
> exported NAK from me.
>
My test workload is from Android system, where the multimedia apps
require much pages. We observed that one thread of the process trapped
into mem_cgroup_soft_limit_reclaim within direct reclaim and also
blocked other thread in mmap or do_page_fault(by semphore?).
Furthermore, we also observed other long time direct reclaim related
with soft limit which are supposed to cause page thrash as the
allocator itself is the most right of the rb_tree . Besides, even
without the soft_limit, shall the 'direct reclaim' check the watermark
firstly before shrink_node, for the concurrent kswapd may have
reclaimed enough pages for allocation.
> >
> > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@spreadtrum.com>
> > ---
> >  include/linux/memcontrol.h |  3 ++-
> >  mm/memcontrol.c            |  3 +++
> >  mm/vmscan.c                | 38 +++++++++++++++++++++++++++++++++++++-
> >  3 files changed, 42 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > index 6c6fb11..a7e82c7 100644
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -325,7 +325,8 @@ void mem_cgroup_cancel_charge(struct page *page, struct mem_cgroup *memcg,
> >  void mem_cgroup_uncharge_list(struct list_head *page_list);
> >
> >  void mem_cgroup_migrate(struct page *oldpage, struct page *newpage);
> > -
> > +bool direct_reclaim_reach_watermark(pg_data_t *pgdat, unsigned long nr_reclaimed,
> > +                     unsigned long nr_scanned, gfp_t gfp_mask, int order);
> >  static struct mem_cgroup_per_node *
> >  mem_cgroup_nodeinfo(struct mem_cgroup *memcg, int nid)
> >  {
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 8c0280b..e4efd46 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -2577,6 +2577,9 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
> >                       (next_mz == NULL ||
> >                       loop > MEM_CGROUP_MAX_SOFT_LIMIT_RECLAIM_LOOPS))
> >                       break;
> > +             if (direct_reclaim_reach_watermark(pgdat, nr_reclaimed,
> > +                                     *total_scanned, gfp_mask, order))
> > +                     break;
> >       } while (!nr_reclaimed);
> >       if (next_mz)
> >               css_put(&next_mz->memcg->css);
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 03822f8..19503f3 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2518,6 +2518,34 @@ static bool pgdat_memcg_congested(pg_data_t *pgdat, struct mem_cgroup *memcg)
> >               (memcg && memcg_congested(pgdat, memcg));
> >  }
> >
> > +bool direct_reclaim_reach_watermark(pg_data_t *pgdat, unsigned long nr_reclaimed,
> > +             unsigned long nr_scanned, gfp_t gfp_mask,
> > +             int order)
> > +{
> > +     struct scan_control sc = {
> > +             .gfp_mask = gfp_mask,
> > +             .order = order,
> > +             .priority = DEF_PRIORITY,
> > +             .nr_reclaimed = nr_reclaimed,
> > +             .nr_scanned = nr_scanned,
> > +     };
> > +     if (!current_is_kswapd())
> > +             return false;
> > +     if (!IS_ENABLED(CONFIG_COMPACTION))
> > +             return false;
> > +     /*
> > +      * In fact, we add 1 to nr_reclaimed and nr_scanned to let should_continue_reclaim
> > +      * NOT return by finding they are zero, which means compaction_suitable()
> > +      * takes effect here to judge if we have reclaimed enough pages for passing
> > +      * the watermark and no necessary to check other memcg anymore.
> > +      */
> > +     if (!should_continue_reclaim(pgdat,
> > +                             sc.nr_reclaimed + 1, sc.nr_scanned + 1, &sc))
> > +             return true;
> > +     return false;
> > +}
> > +EXPORT_SYMBOL(direct_reclaim_reach_watermark);
> > +
> >  static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
> >  {
> >       struct reclaim_state *reclaim_state = current->reclaim_state;
> > @@ -2802,7 +2830,15 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> >                       sc->nr_scanned += nr_soft_scanned;
> >                       /* need some check for avoid more shrink_zone() */
> >               }
> > -
> > +             /*
> > +              * we maybe have stolen enough pages from soft limit reclaim, so we return
> > +              * back if we are direct reclaim
> > +              */
> > +             if (direct_reclaim_reach_watermark(zone->zone_pgdat, sc->nr_reclaimed,
> > +                                             sc->nr_scanned, sc->gfp_mask, sc->order)) {
> > +                     sc->gfp_mask = orig_mask;
> > +                     return;
> > +             }
> >               /* See comment about same check for global reclaim above */
> >               if (zone->zone_pgdat == last_pgdat)
> >                       continue;
> > --
> > 1.9.1
>
> --
> Michal Hocko
> SUSE Labs

  reply	other threads:[~2018-07-31 11:58 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-31 11:09 Zhaoyang Huang
2018-07-31 11:19 ` Michal Hocko
2018-07-31 11:58   ` Zhaoyang Huang [this message]
2018-07-31 12:06     ` Michal Hocko
2018-07-31 17:50 ` Johannes Weiner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGWkznGrc4cgMN4P5OJKGi_UV6kU_6yjV9XcPHv5MVRn11+pzw@mail.gmail.com \
    --to=huangzhaoyang@gmail.com \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-patch-test@lists.linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mingo@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox