From: Linus Torvalds <torvalds@linux-foundation.org>
To: Nicholas Piggin <npiggin@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>,
Bob Peterson <rpeterso@redhat.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Steven Whitehouse <swhiteho@redhat.com>,
Andrew Lutomirski <luto@kernel.org>,
Andreas Gruenbacher <agruenba@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
linux-mm <linux-mm@kvack.org>,
Mel Gorman <mgorman@techsingularity.net>
Subject: Re: [PATCH 2/2] mm: add PageWaiters indicating tasks are waiting for a page bit
Date: Wed, 28 Dec 2016 20:16:56 -0800
Message-ID: <CA+55aFxGz8R8J9jLvKpLUgyhWVYcgtObhbHBP7eZzZyc05AODw@mail.gmail.com>
In-Reply-To: <20161229140837.5fff906d@roar.ozlabs.ibm.com>
[-- Attachment #1: Type: text/plain, Size: 864 bytes --]
On Wed, Dec 28, 2016 at 8:08 PM, Nicholas Piggin <npiggin@gmail.com> wrote:
>
> Okay. The name could be a bit better though I think, for readability.
> Just a BUILD_BUG_ON if it is not constant and correct bit numbers?
I have a slightly edited patch - moved the comments around and added
some new comments (about the sign bit, and also about how the
smp_mb() shouldn't be necessary even for the non-atomic fallback).
I also did a BUILD_BUG_ON(), except the other way around - keeping it
about the sign bit in the byte, and just verifying that yes,
PG_waiters is that sign bit.
> BTW. I just notice in your patch too that you didn't use "nr" in the
> generic version.
And I fixed that too.
Of course, I didn't test the changes (apart from building it). But
I've been running the previous version since yesterday, so far no
issues.
Linus
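A minimal user-space sketch of the "clear the lock bit and test the sign bit of the same byte" idea described above, written with C11 atomics rather than the kernel's bitops. The names FLAG_LOCKED, FLAG_WAITERS and clear_bit_unlock_is_negative_u8() are hypothetical and exist only for this illustration; the static assertion plays the role of the BUILD_BUG_ON() mentioned in the message. This is not the kernel implementation, only a model of the same single read-modify-write.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout mirroring the patch: lock in bit 0, waiters in bit 7. */
enum { FLAG_LOCKED = 0, FLAG_WAITERS = 7 };

/* Compile-time check in the spirit of BUILD_BUG_ON(PG_waiters != 7). */
_Static_assert(FLAG_WAITERS == 7, "waiters flag must be the sign bit of the byte");

/*
 * Clear bit 'nr' with release ordering and report whether the resulting
 * byte still has its top (sign) bit set. One atomic read-modify-write
 * does both jobs, so no separate load of the waiters bit is needed.
 */
static inline bool clear_bit_unlock_is_negative_u8(unsigned int nr,
						   _Atomic uint8_t *flags)
{
	assert(nr < 8);

	uint8_t mask = (uint8_t)~(1u << nr);
	uint8_t old = atomic_fetch_and_explicit(flags, mask, memory_order_release);

	/* The new value is old & mask; bit 7 of it is the sign bit. */
	return ((old & mask) & 0x80u) != 0;
}
```

A caller would invoke clear_bit_unlock_is_negative_u8(FLAG_LOCKED, &flags) on unlock and take the wake-up path only when it returns true.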
[-- Attachment #2: patch.diff --]
[-- Type: text/plain, Size: 3744 bytes --]
arch/x86/include/asm/bitops.h | 13 +++++++++++++
include/linux/page-flags.h | 2 +-
mm/filemap.c | 36 +++++++++++++++++++++++++++++++-----
3 files changed, 45 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index 68557f52b961..854022772c5b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -139,6 +139,19 @@ static __always_inline void __clear_bit(long nr, volatile unsigned long *addr)
asm volatile("btr %1,%0" : ADDR : "Ir" (nr));
}
+static __always_inline bool clear_bit_unlock_is_negative_byte(long nr, volatile unsigned long *addr)
+{
+ bool negative;
+ asm volatile(LOCK_PREFIX "andb %2,%1\n\t"
+ CC_SET(s)
+ : CC_OUT(s) (negative), ADDR
+ : "ir" ((char) ~(1 << nr)) : "memory");
+ return negative;
+}
+
+// Let everybody know we have it
+#define clear_bit_unlock_is_negative_byte clear_bit_unlock_is_negative_byte
+
/*
* __clear_bit_unlock - Clears a bit in memory
* @nr: Bit to clear
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index c56b39890a41..6b5818d6de32 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -73,13 +73,13 @@
*/
enum pageflags {
PG_locked, /* Page is locked. Don't touch. */
- PG_waiters, /* Page has waiters, check its waitqueue */
PG_error,
PG_referenced,
PG_uptodate,
PG_dirty,
PG_lru,
PG_active,
+ PG_waiters, /* Page has waiters, check its waitqueue. Must be bit #7 and in the same byte as "PG_locked" */
PG_slab,
PG_owner_priv_1, /* Owner use. If pagecache, fs may use*/
PG_arch_1,
diff --git a/mm/filemap.c b/mm/filemap.c
index 82f26cde830c..6b1d96f86a9c 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -912,6 +912,29 @@ void add_page_wait_queue(struct page *page, wait_queue_t *waiter)
}
EXPORT_SYMBOL_GPL(add_page_wait_queue);
+#ifndef clear_bit_unlock_is_negative_byte
+
+/*
+ * PG_waiters is the high bit in the same byte as PG_locked.
+ *
+ * On x86 (and on many other architectures), we can clear PG_locked and
+ * test the sign bit at the same time. But if the architecture does
+ * not support that special operation, we just do this all by hand
+ * instead.
+ *
+ * The read of PG_waiters has to be after (or concurrently with) PG_locked
+ * being cleared, but a memory barrier should be unnecessary since it is
+ * in the same byte as PG_locked.
+ */
+static inline bool clear_bit_unlock_is_negative_byte(long nr, volatile void *mem)
+{
+ clear_bit_unlock(nr, mem);
+ /* smp_mb__after_atomic(); */
+ return test_bit(PG_waiters, mem);
+}
+
+#endif
+
/**
* unlock_page - unlock a locked page
* @page: the page
@@ -921,16 +944,19 @@ EXPORT_SYMBOL_GPL(add_page_wait_queue);
* mechanism between PageLocked pages and PageWriteback pages is shared.
* But that's OK - sleepers in wait_on_page_writeback() just go back to sleep.
*
- * The mb is necessary to enforce ordering between the clear_bit and the read
- * of the waitqueue (to avoid SMP races with a parallel wait_on_page_locked()).
+ * Note that this depends on PG_waiters being the sign bit in the byte
+ * that contains PG_locked - thus the BUILD_BUG_ON(). That allows us to
+ * clear the PG_locked bit and test PG_waiters at the same time fairly
+ * portably (architectures that do LL/SC can test any bit, while x86 can
+ * test the sign bit).
*/
void unlock_page(struct page *page)
{
+ BUILD_BUG_ON(PG_waiters != 7);
page = compound_head(page);
VM_BUG_ON_PAGE(!PageLocked(page), page);
- clear_bit_unlock(PG_locked, &page->flags);
- smp_mb__after_atomic();
- wake_up_page(page, PG_locked);
+ if (clear_bit_unlock_is_negative_byte(PG_locked, &page->flags))
+ wake_up_page_bit(page, PG_locked);
}
EXPORT_SYMBOL(unlock_page);
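For readers who want to experiment with the generic fallback and the new unlock_page() flow outside the kernel, here is a rough user-space model using C11 atomics. struct fake_page, clear_then_test_waiters() and fake_unlock_page() are hypothetical names invented for this sketch, and the wakeups counter merely stands in for wake_up_page_bit(). The C11 memory model is not the kernel's, so treat this as an illustration of the reasoning only: because both bits live in the same atomic object, per-location coherence means the later load cannot observe a value older than the clear, which is the argument the patch comment makes for dropping the barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout mirroring the patch: lock in bit 0, waiters in bit 7. */
enum { FLAG_LOCKED = 0, FLAG_WAITERS = 7 };

struct fake_page {
	atomic_ulong flags;
	unsigned int wakeups;	/* counts trips through the stand-in wake-up path */
};

/*
 * Model of the generic fallback: a release-ordered clear followed by a
 * plain test of the waiters bit in the same word, with no full barrier
 * in between.
 */
static inline bool clear_then_test_waiters(unsigned int nr, atomic_ulong *flags)
{
	atomic_fetch_and_explicit(flags, ~(1UL << nr), memory_order_release);
	return (atomic_load_explicit(flags, memory_order_relaxed) >> FLAG_WAITERS) & 1UL;
}

/* Model of the new unlock path: wake only when the waiters bit is set. */
static void fake_unlock_page(struct fake_page *page)
{
	/* Stand-in for VM_BUG_ON_PAGE(!PageLocked(page), page). */
	assert(atomic_load_explicit(&page->flags, memory_order_relaxed) &
	       (1UL << FLAG_LOCKED));

	if (clear_then_test_waiters(FLAG_LOCKED, &page->flags))
		page->wakeups++;
}
```

In this model the common uncontended unlock touches only the flags word and never reaches the wake-up path, which is the point of testing the waiters bit as part of the unlock.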