linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Nicholas Piggin <npiggin@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-mm <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Johannes Weiner <hannes@cmpxchg.org>, Jan Kara <jack@suse.cz>,
	Mel Gorman <mgorman@techsingularity.net>,
	Peter Zijlstra <peterz@infradead.org>,
	Rik van Riel <riel@redhat.com>, Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH 2/2] mm: add PageWaiters bit to indicate waitqueue should be checked
Date: Fri, 4 Nov 2016 13:40:49 +1100	[thread overview]
Message-ID: <20161104134049.6c7d394b@roar.ozlabs.ibm.com> (raw)
In-Reply-To: <CA+55aFyzf8r2q-HLfADcz74H-My_GY-z15yLrwH-KUqd486Q0A@mail.gmail.com>

On Thu, 3 Nov 2016 08:49:14 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Wed, Nov 2, 2016 at 8:46 PM, Nicholas Piggin <npiggin@gmail.com> wrote:
> >
> > If you don't have that, then a long-waiting waiter for some
> > unrelated page can prevent other pages from getting back to
> > the fastpath.
> >
> > Contention bit is already explicitly not precise with this patch
> > (false positive possible), but in general the next wakeup will
> > clean it up. Without page_match, that's not always possible.  
> 
> Do we care?
> 
> The point is, it's rare, and if there are no numbers to say that it's
> an issue, we shouldn't create the complication. Numbers talk,
> handwaving "this might be an issue" walks.

Well you could have hundreds of waiters on pages with highly threaded
IO (say, a file server), which will cause collisions in the hash table.
I can just try to force that to happen and show up that 2.2% again.

Actaully it would be more than 2.2% with my patch as is, because it no
longer does an unlocked waitqueue_active() check if the waiters bit was
set (because with my approach the lock will always be required if only
to clear the bit after checking the waitqueue). If we avoid clearing
dangling bity there, we'll then have to reintroduce that test.

> That said, at least it isn't a big complexity that will hurt, and it's
> very localized.

I thought so :)

> 
> >> Also, it would be lovely to get numbers against the plain 4.8
> >> situation with the per-zone waitqueues. Maybe that used to help your
> >> workload, so the 2.2% improvement might be partly due to me breaking
> >> performance on your machine.  
> >
> > Oh yeah that'll hurt a bit. The hash will get spread over non-local
> > nodes now. I think it was only a 2 socket system, but remote memory
> > still takes a latency hit. Hmm, I think keeping the zone waitqueue
> > just for pages would be reasonable, because they're a special case?  
> 
> HELL NO!
> 
> Christ. That zone crap may have helped some very few NUMA machines,
> but it *hurt* normal machines.

Oh I missed why they hurt small systems -- where did you see that
slowdown? I agree that's a serious concern. I'll go back and read the
thread again.

> So no way in hell are we re-introducing that ugly, complex, fragile
> crap that actually slows down the normal case on real loads (not
> microbenchmarks). It was a mistake from the very beginning.

For the generic bit wait stuff, sure. For page waiters you always
have the page, there's no translation so I don't see the fragility.

> No, the reason I'd like to hear about numbers is that while I *know*
> that removing the crazy zone code helped on normal machines (since I
> could test that case myself), I still am interested in whether the
> zone removal hurt on some machines (probably not two-node ones,
> though: Mel already tested that on x86), I'd like to know what the
> situation is with the contention bit.
> 
> I'm pretty sure that with the contention bit, the zone crud is
> entirely immaterial (since we no longer actually hit the waitqueue
> outside of IO), but my "I'm pretty sure" comes back to the "handwaving
> walks" issue.

I do worry about pushing large amounts of IO, not even on huge NUMA
machines, but 2-4 socket. Then again, it *tends* to be that you don't
wait on every single page, but rather batches of them at a time.

> 
> So numbers would be really good.

I'll try to come up with some.

Thanks,
Nick

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2016-11-04  2:41 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-02  7:03 [RFC][PATCH 0/2] optimise unlock_page / end_page_writeback Nicholas Piggin
2016-11-02  7:03 ` [PATCH 1/2] mm: Use owner_priv bit for PageSwapCache, valid when PageSwapBacked Nicholas Piggin
2016-11-04  2:20   ` Hugh Dickins
2016-11-11  0:58     ` Nicholas Piggin
2016-11-02  7:03 ` [PATCH 2/2] mm: add PageWaiters bit to indicate waitqueue should be checked Nicholas Piggin
2016-11-02  7:31   ` Kirill A. Shutemov
2016-11-02  7:50     ` Nicholas Piggin
2016-11-02  7:58       ` Kirill A. Shutemov
2016-11-02  8:12         ` Nicholas Piggin
2016-11-02  8:33           ` Kirill A. Shutemov
2016-11-02  8:40             ` Nicholas Piggin
2016-11-02  9:04               ` Kirill A. Shutemov
2016-11-02 15:18   ` Linus Torvalds
2016-11-03  3:46     ` Nicholas Piggin
2016-11-03 15:49       ` Linus Torvalds
2016-11-04  2:40         ` Nicholas Piggin [this message]
2016-11-04  7:29           ` Nicholas Piggin
2016-11-04 15:59             ` Linus Torvalds
2016-11-07  3:04               ` Nicholas Piggin
2016-11-04  2:31   ` [lkp] [mm] 731b9bc419: kernel BUG at include/linux/page-flags.h:259! kernel test robot
2016-11-04  2:47     ` Nicholas Piggin
2016-12-21 15:19 [PATCH 0/2] respin of PageWaiters patch Nicholas Piggin
2016-12-21 15:19 ` [PATCH 2/2] mm: add PageWaiters bit to indicate waitqueue should be checked Nicholas Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161104134049.6c7d394b@roar.ozlabs.ibm.com \
    --to=npiggin@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox