* [PATCH next 1/3] slub defrag: unpin writeback pages
@ 2008-10-05 2:25 Hugh Dickins
2008-10-05 2:27 ` [PATCH next 2/3] slub defrag: dma_kmalloc_cache add_tail Hugh Dickins
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: Hugh Dickins @ 2008-10-05 2:25 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Pekka Enberg, Andrew Morton, linux-mm
A repetitive swapping load on powerpc G5 went progressively slower after
nine hours: Inactive(file) was rising, as if inactive file pages pinned.
Yes, slub defrag's kick_buffers() was forgetting to put_page() whenever
it met a page already under writeback.
That PageWriteback test should be made while PageLocked in trigger_write(),
just as it is in try_to_free_buffers() - if there are complex reasons why
that's not actually necessary, I'd rather not have to think through them.
A preliminary check before taking the lock? No, it's not that important.
And trigger_write() must remember to unlock_page() in each of the cases
where it doesn't reach the writepage().
Signed-off-by: Hugh Dickins <hugh@veritas.com>
---
Andrew, I mentioned at KS that I'd just completed a slow -mm bisection,
which had mysteriously converged on one of your patches: that was your
vm-dont-run-touch_buffer-during-buffercache-lookups.patch
In fact that turned out just to speed up the rate at which writebacks
were pinning these pages: I've nothing against (nor for!) your patch.
I'm wondering whether kick_buffers() ought to check PageLRU: I've no
evidence it's needed, but coming to writepage() or try_to_free_buffers()
from this direction (by virtue of having a buffer_head in a partial slab
page) is new, and I worry it might cause some ordering problem; whereas
if the page is PageLRU, then it is already fair game for such reclaim.
fs/buffer.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
--- 2.6.27-rc7-mmotm/fs/buffer.c 2008-09-26 13:18:50.000000000 +0100
+++ linux/fs/buffer.c 2008-10-03 19:43:44.000000000 +0100
@@ -3354,13 +3354,16 @@ static void trigger_write(struct page *p
.for_reclaim = 0
};
+ if (PageWriteback(page))
+ goto unlock;
+
if (!mapping->a_ops->writepage)
/* No write method for the address space */
- return;
+ goto unlock;
if (!clear_page_dirty_for_io(page))
/* Someone else already triggered a write */
- return;
+ goto unlock;
rc = mapping->a_ops->writepage(page, &wbc);
if (rc < 0)
@@ -3368,7 +3371,7 @@ static void trigger_write(struct page *p
return;
if (rc == AOP_WRITEPAGE_ACTIVATE)
- unlock_page(page);
+unlock: unlock_page(page);
}
/*
@@ -3420,7 +3423,7 @@ static void kick_buffers(struct kmem_cac
for (i = 0; i < nr; i++) {
page = v[i];
- if (!page || PageWriteback(page))
+ if (!page)
continue;
if (trylock_page(page)) {
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH next 2/3] slub defrag: dma_kmalloc_cache add_tail
2008-10-05 2:25 [PATCH next 1/3] slub defrag: unpin writeback pages Hugh Dickins
@ 2008-10-05 2:27 ` Hugh Dickins
2008-10-06 7:46 ` Pekka Enberg
2008-10-05 2:28 ` [PATCH next 3/3] slub defrag: slabinfo help trivia Hugh Dickins
` (2 subsequent siblings)
3 siblings, 1 reply; 10+ messages in thread
From: Hugh Dickins @ 2008-10-05 2:27 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Pekka Enberg, Andrew Morton, linux-mm
Why did that slowdown from mispinned pages manifest only on the G5?
Because something in my x86_32 and x86_64 configs (CONFIG_BLK_DEV_SR
I believe) is giving me a kmalloc_dma-512 cache, and dma_kmalloc_cache()
had not been updated to satisfy the assumption in kmem_cache_defrag(),
that defragmentable caches come first in the list.
So, any DMAable cache was preventing all slub defragmentation: which
looks like it's not been getting the testing exposure it deserves.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
---
mm/slub.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- 2.6.27-rc7-mmotm/mm/slub.c 2008-09-26 13:18:53.000000000 +0100
+++ linux/mm/slub.c 2008-10-04 20:10:46.000000000 +0100
@@ -2636,7 +2636,7 @@ static noinline struct kmem_cache *dma_k
goto unlock_out;
}
- list_add(&s->list, &slab_caches);
+ list_add_tail(&s->list, &slab_caches);
kmalloc_caches_dma[index] = s;
schedule_work(&sysfs_add_work);
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH next 3/3] slub defrag: slabinfo help trivia
2008-10-05 2:25 [PATCH next 1/3] slub defrag: unpin writeback pages Hugh Dickins
2008-10-05 2:27 ` [PATCH next 2/3] slub defrag: dma_kmalloc_cache add_tail Hugh Dickins
@ 2008-10-05 2:28 ` Hugh Dickins
2008-10-06 7:40 ` Pekka Enberg
2008-10-06 7:53 ` [PATCH next 1/3] slub defrag: unpin writeback pages Pekka Enberg
2008-10-07 13:24 ` Christoph Lameter
3 siblings, 1 reply; 10+ messages in thread
From: Hugh Dickins @ 2008-10-05 2:28 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Pekka Enberg, Andrew Morton, linux-mm
Fix typo on --display-defrag line of slabinfo's help message;
fix misaligning tabs to spaces on two other lines of that message.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
---
Documentation/vm/slabinfo.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- 2.6.27-rc7-mmotm/Documentation/vm/slabinfo.c 2008-09-26 13:18:44.000000000 +0100
+++ linux/Documentation/vm/slabinfo.c 2008-10-04 19:47:40.000000000 +0100
@@ -117,14 +117,14 @@ void usage(void)
"-e|--empty Show empty slabs\n"
"-f|--first-alias Show first alias\n"
"-F|--defrag Show defragmentable caches\n"
- "-G:--display-defrag Display defrag counters\n"
+ "-G|--display-defrag Display defrag counters\n"
"-h|--help Show usage information\n"
"-i|--inverted Inverted list\n"
"-l|--slabs Show slabs\n"
"-n|--numa Show NUMA information\n"
- "-o|--ops Show kmem_cache_ops\n"
+ "-o|--ops Show kmem_cache_ops\n"
"-s|--shrink Shrink slabs\n"
- "-r|--report Detailed report on single slabs\n"
+ "-r|--report Detailed report on single slabs\n"
"-S|--Size Sort by size\n"
"-t|--tracking Show alloc/free information\n"
"-T|--Totals Show summary information\n"
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH next 3/3] slub defrag: slabinfo help trivia
2008-10-05 2:28 ` [PATCH next 3/3] slub defrag: slabinfo help trivia Hugh Dickins
@ 2008-10-06 7:40 ` Pekka Enberg
0 siblings, 0 replies; 10+ messages in thread
From: Pekka Enberg @ 2008-10-06 7:40 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Christoph Lameter, Andrew Morton, linux-mm
On Sun, 2008-10-05 at 03:28 +0100, Hugh Dickins wrote:
> Fix typo on --display-defrag line of slabinfo's help message;
> fix misaligning tabs to spaces on two other lines of that message.
>
> Signed-off-by: Hugh Dickins <hugh@veritas.com>
Applied, thanks!
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH next 2/3] slub defrag: dma_kmalloc_cache add_tail
2008-10-05 2:27 ` [PATCH next 2/3] slub defrag: dma_kmalloc_cache add_tail Hugh Dickins
@ 2008-10-06 7:46 ` Pekka Enberg
2008-10-07 13:25 ` Christoph Lameter
0 siblings, 1 reply; 10+ messages in thread
From: Pekka Enberg @ 2008-10-06 7:46 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Christoph Lameter, Andrew Morton, linux-mm
On Sun, 2008-10-05 at 03:27 +0100, Hugh Dickins wrote:
> Why did that slowdown from mispinned pages manifest only on the G5?
>
> Because something in my x86_32 and x86_64 configs (CONFIG_BLK_DEV_SR
> I believe) is giving me a kmalloc_dma-512 cache, and dma_kmalloc_cache()
> had not been updated to satisfy the assumption in kmem_cache_defrag(),
> that defragmentable caches come first in the list.
>
> So, any DMAable cache was preventing all slub defragmentation: which
> looks like it's not been getting the testing exposure it deserves.
>
> Signed-off-by: Hugh Dickins <hugh@veritas.com>
Applied, thanks!
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH next 1/3] slub defrag: unpin writeback pages
2008-10-05 2:25 [PATCH next 1/3] slub defrag: unpin writeback pages Hugh Dickins
2008-10-05 2:27 ` [PATCH next 2/3] slub defrag: dma_kmalloc_cache add_tail Hugh Dickins
2008-10-05 2:28 ` [PATCH next 3/3] slub defrag: slabinfo help trivia Hugh Dickins
@ 2008-10-06 7:53 ` Pekka Enberg
2008-10-07 13:24 ` Christoph Lameter
3 siblings, 0 replies; 10+ messages in thread
From: Pekka Enberg @ 2008-10-06 7:53 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Christoph Lameter, Andrew Morton, linux-mm
On Sun, 2008-10-05 at 03:25 +0100, Hugh Dickins wrote:
> A repetitive swapping load on powerpc G5 went progressively slower after
> nine hours: Inactive(file) was rising, as if inactive file pages pinned.
> Yes, slub defrag's kick_buffers() was forgetting to put_page() whenever
> it met a page already under writeback.
>
> That PageWriteback test should be made while PageLocked in trigger_write(),
> just as it is in try_to_free_buffers() - if there are complex reasons why
> that's not actually necessary, I'd rather not have to think through them.
> A preliminary check before taking the lock? No, it's not that important.
>
> And trigger_write() must remember to unlock_page() in each of the cases
> where it doesn't reach the writepage().
>
> Signed-off-by: Hugh Dickins <hugh@veritas.com>
Looks good to me. I've applied the patch now but would really like an
ACK from Christoph/Andrew as well. Thanks again, Hugh!
Pekka
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH next 1/3] slub defrag: unpin writeback pages
2008-10-05 2:25 [PATCH next 1/3] slub defrag: unpin writeback pages Hugh Dickins
` (2 preceding siblings ...)
2008-10-06 7:53 ` [PATCH next 1/3] slub defrag: unpin writeback pages Pekka Enberg
@ 2008-10-07 13:24 ` Christoph Lameter
2008-10-07 15:34 ` Hugh Dickins
3 siblings, 1 reply; 10+ messages in thread
From: Christoph Lameter @ 2008-10-07 13:24 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Pekka Enberg, Andrew Morton, linux-mm
Hugh Dickins wrote:
> A repetitive swapping load on powerpc G5 went progressively slower after
> nine hours: Inactive(file) was rising, as if inactive file pages pinned.
> Yes, slub defrag's kick_buffers() was forgetting to put_page() whenever
> it met a page already under writeback.
Thanks for finding that.
> That PageWriteback test should be made while PageLocked in trigger_write(),
> just as it is in try_to_free_buffers() - if there are complex reasons why
> that's not actually necessary, I'd rather not have to think through them.
> A preliminary check before taking the lock? No, it's not that important.
The writeback check in kick_buffers() is a performance optimization. If the
page is under writeback then there is no point in trying to kick out the page.
That will only succeed after writeback is complete.
If a page is under writeback then try_to_free_buffers() will fail immediately.
So no need to check under pagelock.
> And trigger_write() must remember to unlock_page() in each of the cases
> where it doesn't reach the writepage().
Ack.
> --- 2.6.27-rc7-mmotm/fs/buffer.c 2008-09-26 13:18:50.000000000 +0100
> +++ linux/fs/buffer.c 2008-10-03 19:43:44.000000000 +0100
> @@ -3354,13 +3354,16 @@ static void trigger_write(struct page *p
> .for_reclaim = 0
> };
>
> + if (PageWriteback(page))
> + goto unlock;
> +
Is that necessary? Wont writepage do the appropriate thing?
> /*
> @@ -3420,7 +3423,7 @@ static void kick_buffers(struct kmem_cac
> for (i = 0; i < nr; i++) {
> page = v[i];
>
> - if (!page || PageWriteback(page))
> + if (!page)
> continue;
Thats just an optimization. No need to lock a page if its under writeback
which would make try_to_free_buffers() fail.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH next 2/3] slub defrag: dma_kmalloc_cache add_tail
2008-10-06 7:46 ` Pekka Enberg
@ 2008-10-07 13:25 ` Christoph Lameter
0 siblings, 0 replies; 10+ messages in thread
From: Christoph Lameter @ 2008-10-07 13:25 UTC (permalink / raw)
To: Pekka Enberg; +Cc: Hugh Dickins, Andrew Morton, linux-mm
Thanks for finding the DMA cache issues. DMA caches are rarely used since Andi
tried to get rid of all of them for x86.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH next 1/3] slub defrag: unpin writeback pages
2008-10-07 13:24 ` Christoph Lameter
@ 2008-10-07 15:34 ` Hugh Dickins
2008-10-07 15:41 ` Christoph Lameter
0 siblings, 1 reply; 10+ messages in thread
From: Hugh Dickins @ 2008-10-07 15:34 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Pekka Enberg, Andrew Morton, linux-mm
On Tue, 7 Oct 2008, Christoph Lameter wrote:
> Hugh Dickins wrote:
>
> > That PageWriteback test should be made while PageLocked in trigger_write(),
> > just as it is in try_to_free_buffers() - if there are complex reasons why
> > that's not actually necessary, I'd rather not have to think through them.
> > A preliminary check before taking the lock? No, it's not that important.
>
> The writeback check in kick_buffers() is a performance optimization. If the
> page is under writeback then there is no point in trying to kick out the page.
> That will only succeed after writeback is complete.
I think it's more than a performance optimization: I may have missed
somewhere, but the only place I can find which calls ->writepage without
checking PageWriteback first, while holding PageLock across the calls,
is mm/page-writeback.c write_one_page() - which seems to be for internal
use by filesystems, which could make the check themselves - but every
existing caller passes wait arg 1 to wait for PageWriteback to clear.
I wrote that while running a test, my version of fs/buffer.c with
the PageWriteback test removed from trigger_write(): that's now hit
kernel BUG at fs/buffer.c:1757! which is BUG_ON(PageWriteback(page))
in __block_write_full_page(). Fair enough, though what I'd been
looking forward to was the
if (!test_clear_page_writeback(page))
BUG();
in mm/filemap.c end_page_writeback().
>
> If a page is under writeback then try_to_free_buffers() will fail immediately.
> So no need to check under pagelock.
What I meant was that try_to_free_buffers() is already checking it
under pagelock, so let's be symmetrical and check it under pagelock
in trigger_write(), rather than moving the test within kick_buffers().
> > --- 2.6.27-rc7-mmotm/fs/buffer.c 2008-09-26 13:18:50.000000000 +0100
> > +++ linux/fs/buffer.c 2008-10-03 19:43:44.000000000 +0100
> > @@ -3354,13 +3354,16 @@ static void trigger_write(struct page *p
> > .for_reclaim = 0
> > };
> >
> > + if (PageWriteback(page))
> > + goto unlock;
> > +
>
> Is that necessary? Wont writepage do the appropriate thing?
Some might, but in general no: my BUG came below blkdev_writepage().
> > /*
> > @@ -3420,7 +3423,7 @@ static void kick_buffers(struct kmem_cac
> > for (i = 0; i < nr; i++) {
> > page = v[i];
> >
> > - if (!page || PageWriteback(page))
> > + if (!page)
> > continue;
>
> Thats just an optimization. No need to lock a page if its under writeback
> which would make try_to_free_buffers() fail.
My first version of the patch did go on to say
- if (trylock_page(page)) {
+ if (!PageWriteback(page) && trylock_page(page)) {
But since the cacheline is already dirty (from get_page_unless_zero),
and it's only a trylock, and we (almost certainly) need to repeat the
PageWriteback test once we've got the lock, and it does not hit this
case often enough for you to have noticed the missing put_page() bug,
I decided to save icache instead by just removing your optimization.
Hugh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH next 1/3] slub defrag: unpin writeback pages
2008-10-07 15:34 ` Hugh Dickins
@ 2008-10-07 15:41 ` Christoph Lameter
0 siblings, 0 replies; 10+ messages in thread
From: Christoph Lameter @ 2008-10-07 15:41 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Pekka Enberg, Andrew Morton, linux-mm
Hugh Dickins wrote:
> But since the cacheline is already dirty (from get_page_unless_zero),
> and it's only a trylock, and we (almost certainly) need to repeat the
> PageWriteback test once we've got the lock, and it does not hit this
> case often enough for you to have noticed the missing put_page() bug,
> I decided to save icache instead by just removing your optimization.
Ok. Acked-by: Christoph Lameter <cl@linux-foundation.org>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2008-10-07 15:41 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-10-05 2:25 [PATCH next 1/3] slub defrag: unpin writeback pages Hugh Dickins
2008-10-05 2:27 ` [PATCH next 2/3] slub defrag: dma_kmalloc_cache add_tail Hugh Dickins
2008-10-06 7:46 ` Pekka Enberg
2008-10-07 13:25 ` Christoph Lameter
2008-10-05 2:28 ` [PATCH next 3/3] slub defrag: slabinfo help trivia Hugh Dickins
2008-10-06 7:40 ` Pekka Enberg
2008-10-06 7:53 ` [PATCH next 1/3] slub defrag: unpin writeback pages Pekka Enberg
2008-10-07 13:24 ` Christoph Lameter
2008-10-07 15:34 ` Hugh Dickins
2008-10-07 15:41 ` Christoph Lameter
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox