Re: [PATCH] mm/compaction: use proper zoneid for compaction_suitable()

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Yafang Shao <laoar.shao@gmail.com>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux MM <linux-mm@kvack.org>,  Vlastimil Babka <vbabka@suse.cz>,
	Arnd Bergmann <arnd@arndb.de>,
	 Paul Gortmaker <paul.gortmaker@windriver.com>,
	Rik van Riel <riel@redhat.com>,
	 Yafang Shao <shaoyafang@didiglobal.com>
Subject: Re: [PATCH] mm/compaction: use proper zoneid for compaction_suitable()
Date: Fri, 26 Jul 2019 19:13:44 +0800	[thread overview]
Message-ID: <CALOAHbDxUENTiPm18Ntd=sOAakxbQaZRnktOY9Jok-0+RTwG5g@mail.gmail.com> (raw)
In-Reply-To: <20190726102602.GD2739@techsingularity.net>

On Fri, Jul 26, 2019 at 6:26 PM Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Fri, Jul 26, 2019 at 06:06:48PM +0800, Yafang Shao wrote:
> > On Fri, Jul 26, 2019 at 3:09 PM Mel Gorman <mgorman@techsingularity.net> wrote:
> > >
> > > On Thu, Jul 25, 2019 at 09:50:21AM -0400, Yafang Shao wrote:
> > > > By now there're three compaction paths,
> > > > - direct compaction
> > > > - kcompactd compcation
> > > > - proc triggered compaction
> > > > When we do compaction in all these paths, we will use compaction_suitable()
> > > > to check whether a zone is suitable to do compaction.
> > > >
> > > > There're some issues around the usage of compaction_suitable().
> > > > We don't use the proper zoneid in kcompactd_node_suitable() when try to
> > > > wakeup kcompactd. In the kcompactd compaction paths, we call
> > > > compaction_suitable() twice and the zoneid isn't proper in the second call.
> > > > For proc triggered compaction, the classzone_idx is always zero.
> > > >
> > > > In order to fix these issues, I change the type of classzone_idx in the
> > > > struct compact_control from const int to int and assign the proper zoneid
> > > > before calling compact_zone().
> > > >
> > >
> > > What is actually fixed by this?
> > >
> >
> > Recently there's a page alloc failure on our server because the
> > compaction can't satisfy it.
>
> That could be for a wide variety of reasons. There are limits to how
> aggressive compaction will be but if there are unmovable pages preventing
> the allocation, no amount of cleverness with the wakeups will change that.
>

Yes, we should know whether it is lack of movable pages or the
compaction can't catch up first.
I think it would be better if there're some debugging facilities could
help us do that.

> > This issue is unproducible, so I have to view the compaction code and
> > find out the possible solutions.
>
> For high allocation success rates, the focus should be on strictness of
> fragmentation control (hard, multiple tradeoffs) or increasing the number
> of pages that can be moved (very hard, multiple tradeoffs).
>

Agreed, that's a tradeoff.

> > When I'm reading these compaction code, I find some  misuse of
> > compaction_suitable().
> > But after you point out, I find that I missed something.
> > The classzone_idx should represent the alloc request, otherwise we may
> > do unnecessary compaction on a zone.
> > Thanks a lot for your explaination.
> >
>
> Exactly.
>
> > Hi Andrew,
> >
> > Pls. help drop this patch. Sorry about that.
>
> Agreed but there is no need to apologise. The full picture of this problem
> is not obvious, not described anywhere and it's extremely difficult to
> test and verify.
>
> > > > <SNIP>
> > > > @@ -2535,7 +2535,7 @@ static void kcompactd_do_work(pg_data_t *pgdat)
> > > >                                                       cc.classzone_idx);
> > > >       count_compact_event(KCOMPACTD_WAKE);
> > > >
> > > > -     for (zoneid = 0; zoneid <= cc.classzone_idx; zoneid++) {
> > > > +     for (zoneid = 0; zoneid <= pgdat->kcompactd_classzone_idx; zoneid++) {
> > > >               int status;
> > > >
> > > >               zone = &pgdat->node_zones[zoneid];
> > >
> > > This variable can be updated by a wakeup while the loop is executing
> > > making the loop more difficult to reason about given the exit conditions
> > > can change.
> > >
> >
> > Thanks for your point out.
> >
> > But seems there're still issues event without my change ?
> > For example,
> > If we call wakeup_kcompactd() while the kcompactd is running,
> > we just modify the kcompactd_max_order and kcompactd_classzone_idx and
> > then return.
> > Then in another path, the wakeup_kcompactd() is called again,
> > so kcompactd_classzone_idx and kcompactd_max_order will be override,
> > that means the previous wakeup is missed.
> > Right ?
> >
>
> That's intended. When kcompactd wakes up, it takes a snapshot of what is
> requested and works on that. Other requests can update the requirements for
> a future compaction request if necessary. One could argue that the wakeup
> is missed but really it's "defer that request to some kcompactd activity
> in the future". If kcompactd loops until there are no more requests, it
> can consume an excessive amount of CPU due to requests continually keeping
> it awake. kcompactd is best-effort to reduce the amount of direct stalls
> due to compaction but an allocation request always faces the possibility
> that it may stall because a kernel thread has not made enough progress
> or failed.
>
> FWIW, similar problems hit kswapd in the past where allocation requests
> could artifically keep it awake consuming 100% of CPU.
>

Understood. Thanks for your explanation again.

Thanks
Yafang

     prev parent reply	other threads:[~2019-07-26 11:14 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-25 13:50 Yafang Shao
2019-07-26  7:09 ` Mel Gorman
2019-07-26 10:06   ` Yafang Shao
2019-07-26 10:26     ` Mel Gorman
2019-07-26 11:13       ` Yafang Shao [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALOAHbDxUENTiPm18Ntd=sOAakxbQaZRnktOY9Jok-0+RTwG5g@mail.gmail.com' \
    --to=laoar.shao@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=paul.gortmaker@windriver.com \
    --cc=riel@redhat.com \
    --cc=shaoyafang@didiglobal.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox