* [BUG] SLOB's krealloc() seems bust
@ 2008-10-07 13:57 Peter Zijlstra
2008-10-07 14:07 ` Christoph Lameter
0 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2008-10-07 13:57 UTC (permalink / raw)
To: Matt Mackall
Cc: linux-mm, Nick Piggin, Linus Torvalds, Ingo Molnar, linux-kernel
My test box would crash 50% of the bootups like so:
[ 12.002323] BUG: unable to handle kernel paging request at ffff88047bdd513c
[ 12.003310] IP: [<ffffffff80261f17>] load_module+0x6c8/0x1b2a
[ 12.003310] PGD 202063 PUD 0
[ 12.003310] Oops: 0000 [1] PREEMPT SMP
[ 12.003310] CPU 0
[ 12.003310] Modules linked in: kvm_amd kvm sg sr_mod serio_raw k8temp floppy pcspkr button cdrom shpchp
[ 12.003310] Pid: 1219, comm: modprobe Not tainted 2.6.27-rc9 #452
Which points us to the percpu_modalloc() code.
After adding some printk's to get some insight into the matter I got the
following:
[ 10.058055] percpu_modalloc: pcpu_size: ffff88007d82c0e8 size: 8 align: 8 name: kvm_amd
[ 10.066042] percpu_modalloc: pcpu_size[0] = -37536 ptr: ffffffff80757000 extra: 0
[ 10.073505] percpu_modalloc: pcpu_size[1] = 8192 ptr: ffffffff807602a0 extra: 0
[ 10.080795] split_block: pcpu_size: ffff88007d82c0e8 i: 1 size: 8
[ 10.086875] split_block: pcpu_size: ffff88007bdd5140
[ 10.091828] split_block: pcpu_size[0] = 2078109024
[ 10.096607] split_block: pcpu_size[1] = -30720
[ 10.101039] split_block: pcpu_size[0] = 2078109024
[ 10.105817] split_block: pcpu_size[1] = -30720
[ 10.110249] split_block: pcpu_size[2] = -30720
[ 10.114682] split_block: pcpu_size[1] = 8 pcpu_size[2] = -30728
Which basically shows us that the content of the pcpu_size[] array got
corrupted after the krealloc() call in split_block().
Which made me look at which slab allocator I had selected, which turned
out to be SLOB (from testing the network swap stuff).
Flipping it back to SLUB seems to cure the issue...
Will put poking at SLOB on the todo list somewhere, feel free to beat me
to it ;-)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 13:57 [BUG] SLOB's krealloc() seems bust Peter Zijlstra
@ 2008-10-07 14:07 ` Christoph Lameter
2008-10-07 14:13 ` Peter Zijlstra
2008-10-07 15:00 ` Matt Mackall
0 siblings, 2 replies; 27+ messages in thread
From: Christoph Lameter @ 2008-10-07 14:07 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Matt Mackall, linux-mm, Nick Piggin, Linus Torvalds, Ingo Molnar,
linux-kernel
> Which basically shows us that the content of the pcpu_size[] array got
> corrupted after the krealloc() call in split_block().
>
> Which made me look at which slab allocator I had selected, which turned
> out to be SLOB (from testing the network swap stuff).
krealloc() is in generic core code (mm/util.c) and is the same for all allocators.
krealloc uses ksize() which is somewhat dicey for SLOB because it only works
on kmalloc'ed memory. Is the krealloc used on memory allocated with kmalloc()?
Slob's ksize could use a BUG_ON for the case in which ksize() is used on
kmem_cache_alloc'd memory.
/* can't use ksize for kmem_cache_alloc memory, only kmalloc */
size_t ksize(const void *block)
{
struct slob_page *sp;
BUG_ON(!block);
if (unlikely(block == ZERO_SIZE_PTR))
return 0;
sp = (struct slob_page *)virt_to_page(block);
Add a BUG_ON(!kmalloc_cache(sp))?
if (slob_page(sp))
return ((slob_t *)block - 1)->units + SLOB_UNIT;
^^^^^^^ Is this correct?
else
return sp->page.private;
}
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 14:07 ` Christoph Lameter
@ 2008-10-07 14:13 ` Peter Zijlstra
2008-10-07 14:26 ` Christoph Lameter
2008-10-07 15:00 ` Matt Mackall
1 sibling, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2008-10-07 14:13 UTC (permalink / raw)
To: Christoph Lameter
Cc: Matt Mackall, linux-mm, Nick Piggin, Linus Torvalds, Ingo Molnar,
linux-kernel
On Tue, 2008-10-07 at 09:07 -0500, Christoph Lameter wrote:
> > Which basically shows us that the content of the pcpu_size[] array got
> > corrupted after the krealloc() call in split_block().
> >
> > Which made me look at which slab allocator I had selected, which turned
> > out to be SLOB (from testing the network swap stuff).
>
> krealloc() is in generic core code (mm/util.c) and is the same for all allocators.
Joy :/
> krealloc uses ksize() which is somewhat dicey for SLOB because it only works
> on kmalloc'ed memory. Is the krealloc used on memory allocated with kmalloc()?
> Slob's ksize could use a BUG_ON for the case in which ksize() is used on
> kmem_cache_alloc'd memory.
kernel/module.c: perpcu_modinit() reads:
pcpu_size = kmalloc(sizeof(pcpu_size[0]) * pcpu_num_allocated,
GFP_KERNEL);
kernel/module.c: split_block() reads:
new = krealloc(pcpu_size, sizeof(new[0])*pcpu_num_allocated*2,
GFP_KERNEL);
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 14:13 ` Peter Zijlstra
@ 2008-10-07 14:26 ` Christoph Lameter
0 siblings, 0 replies; 27+ messages in thread
From: Christoph Lameter @ 2008-10-07 14:26 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Matt Mackall, linux-mm, Nick Piggin, Linus Torvalds, Ingo Molnar,
linux-kernel
Peter Zijlstra wrote:
> kernel/module.c: perpcu_modinit() reads:
>
> pcpu_size = kmalloc(sizeof(pcpu_size[0]) * pcpu_num_allocated,
> GFP_KERNEL);
>
> kernel/module.c: split_block() reads:
>
> new = krealloc(pcpu_size, sizeof(new[0])*pcpu_num_allocated*2,
> GFP_KERNEL);
>
That code is on the way out btw. cpu_alloc support removes this code.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 14:07 ` Christoph Lameter
2008-10-07 14:13 ` Peter Zijlstra
@ 2008-10-07 15:00 ` Matt Mackall
2008-10-07 15:20 ` Christoph Lameter
2008-10-07 16:10 ` Peter Zijlstra
1 sibling, 2 replies; 27+ messages in thread
From: Matt Mackall @ 2008-10-07 15:00 UTC (permalink / raw)
To: Christoph Lameter
Cc: Peter Zijlstra, linux-mm, Nick Piggin, Linus Torvalds,
Ingo Molnar, linux-kernel
On Tue, 2008-10-07 at 09:07 -0500, Christoph Lameter wrote:
> > Which basically shows us that the content of the pcpu_size[] array got
> > corrupted after the krealloc() call in split_block().
> >
> > Which made me look at which slab allocator I had selected, which turned
> > out to be SLOB (from testing the network swap stuff).
>
> krealloc() is in generic core code (mm/util.c) and is the same for all allocators.
>
> krealloc uses ksize() which is somewhat dicey for SLOB because it only works
> on kmalloc'ed memory. Is the krealloc used on memory allocated with kmalloc()?
> Slob's ksize could use a BUG_ON for the case in which ksize() is used on
> kmem_cache_alloc'd memory.
>
> /* can't use ksize for kmem_cache_alloc memory, only kmalloc */
> size_t ksize(const void *block)
> {
> struct slob_page *sp;
>
> BUG_ON(!block);
> if (unlikely(block == ZERO_SIZE_PTR))
> return 0;
>
> sp = (struct slob_page *)virt_to_page(block);
>
>
> Add a BUG_ON(!kmalloc_cache(sp))?
We can't dynamically determine whether a pointer points to a kmalloced
object or not. kmem_cache_alloc objects have no header and live on the
same pages as kmalloced ones.
> if (slob_page(sp))
> return ((slob_t *)block - 1)->units + SLOB_UNIT;
> ^^^^^^^ Is this correct?
The cast? Yes. We want to look at the slob_t object header immediately
before the object pointer.
But the rest of the statement looks completely broken. If our SLOB
object is 3 units (6 bytes), with a usuable size of 4 bytes, the above
will report 3 + 2 = 5 bytes. Instead we want (3 - 1) * 2 = 4, something
more like:
return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
It's a bit amazing no one's hit this before, but I guess that's because
for allocations > 6 bytes, it will under-report buffer sizes, avoiding
overruns. And for the < 6 byte cases, we've just been getting lucky.
Give this a try, please:
diff -r 5e32b09a1b2b mm/slob.c
--- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500
+++ b/mm/slob.c Tue Oct 07 10:00:16 2008 -0500
@@ -515,7 +515,7 @@
sp = (struct slob_page *)virt_to_page(block);
if (slob_page(sp))
- return ((slob_t *)block - 1)->units + SLOB_UNIT;
+ return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
else
return sp->page.private;
}
--
Mathematics is the supreme nostalgia of our time.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 15:00 ` Matt Mackall
@ 2008-10-07 15:20 ` Christoph Lameter
2008-10-07 15:58 ` Matt Mackall
2008-10-07 16:10 ` Peter Zijlstra
1 sibling, 1 reply; 27+ messages in thread
From: Christoph Lameter @ 2008-10-07 15:20 UTC (permalink / raw)
To: Matt Mackall
Cc: Peter Zijlstra, linux-mm, Nick Piggin, Linus Torvalds,
Ingo Molnar, linux-kernel
Matt Mackall wrote:
> We can't dynamically determine whether a pointer points to a kmalloced
> object or not. kmem_cache_alloc objects have no header and live on the
> same pages as kmalloced ones.
Could you do a heuristic check? Assume that this is a kmalloc object and then
verify the values in the small control block? If the values are out of line
then this cannot be a kmalloc'ed object.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 15:20 ` Christoph Lameter
@ 2008-10-07 15:58 ` Matt Mackall
0 siblings, 0 replies; 27+ messages in thread
From: Matt Mackall @ 2008-10-07 15:58 UTC (permalink / raw)
To: Christoph Lameter
Cc: Peter Zijlstra, linux-mm, Nick Piggin, Linus Torvalds,
Ingo Molnar, linux-kernel
On Tue, 2008-10-07 at 10:20 -0500, Christoph Lameter wrote:
> Matt Mackall wrote:
>
> > We can't dynamically determine whether a pointer points to a kmalloced
> > object or not. kmem_cache_alloc objects have no header and live on the
> > same pages as kmalloced ones.
>
> Could you do a heuristic check? Assume that this is a kmalloc object and then
> verify the values in the small control block? If the values are out of line
> then this cannot be a kmalloc'ed object.
The control block is two bytes, so it doesn't have a lot of redundancy.
Best we can do is check that it doesn't claim the object runs off the
page. Or, for simplicity, isn't bigger than a page. On 32-bit x86,
that's equivalent to checking the top 5 bits of ->units are clear.
But it makes more sense to just do the check in SLUB. First, SLUB can
actually do the check reliably. Second, someone adding a bogus ksize
call to a random piece of kernel code is more likely to be using SLUB
when they do it. And third, it doesn't negatively impact SLOB's size.
In other words, SLUB is effectively SLOB's debugging switch when it
comes to external problems.
--
Mathematics is the supreme nostalgia of our time.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 15:00 ` Matt Mackall
2008-10-07 15:20 ` Christoph Lameter
@ 2008-10-07 16:10 ` Peter Zijlstra
2008-10-07 16:37 ` Linus Torvalds
2008-10-07 16:37 ` Matt Mackall
1 sibling, 2 replies; 27+ messages in thread
From: Peter Zijlstra @ 2008-10-07 16:10 UTC (permalink / raw)
To: Matt Mackall
Cc: Christoph Lameter, linux-mm, Nick Piggin, Linus Torvalds,
Ingo Molnar, linux-kernel
On Tue, 2008-10-07 at 10:00 -0500, Matt Mackall wrote:
> Give this a try, please:
>
> diff -r 5e32b09a1b2b mm/slob.c
> --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500
> +++ b/mm/slob.c Tue Oct 07 10:00:16 2008 -0500
> @@ -515,7 +515,7 @@
>
> sp = (struct slob_page *)virt_to_page(block);
> if (slob_page(sp))
> - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
> else
> return sp->page.private;
> }
That seems to make it work again! (4 reboots, 0 crashes)
Tested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Thanks!
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 16:10 ` Peter Zijlstra
@ 2008-10-07 16:37 ` Linus Torvalds
2008-10-07 16:37 ` Matt Mackall
1 sibling, 0 replies; 27+ messages in thread
From: Linus Torvalds @ 2008-10-07 16:37 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Matt Mackall, Christoph Lameter, linux-mm, Nick Piggin,
Ingo Molnar, linux-kernel
On Tue, 7 Oct 2008, Peter Zijlstra wrote:
>
> Tested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Heh. Can we get a sign-off and a nice commit message?
Linus
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 16:10 ` Peter Zijlstra
2008-10-07 16:37 ` Linus Torvalds
@ 2008-10-07 16:37 ` Matt Mackall
2008-10-07 16:57 ` Pekka Enberg
2008-10-07 17:57 ` Linus Torvalds
1 sibling, 2 replies; 27+ messages in thread
From: Matt Mackall @ 2008-10-07 16:37 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Christoph Lameter, linux-mm, Nick Piggin, Linus Torvalds,
Ingo Molnar, linux-kernel, akpm, Pekka J Enberg
On Tue, 2008-10-07 at 18:10 +0200, Peter Zijlstra wrote:
> On Tue, 2008-10-07 at 10:00 -0500, Matt Mackall wrote:
> > Give this a try, please:
...
> That seems to make it work again! (4 reboots, 0 crashes)
Thanks, Peter. I know we're way late in the 2.6.27 cycle, so I'll leave
it to Linus and Andrew to decide how to queue this up. I'm obligated to
mention it's theoretically possible that there's a path where this is
exploitable, but of course only on systems where SLOB is in use.
SLOB: fix bogus ksize calculation
SLOB's ksize calculation was braindamaged and generally harmlessly
underreported the allocation size. But for very small buffers, it could
in fact overreport them, leading code depending on krealloc to overrun
the allocation and trample other data.
Signed-off-by: Matt Mackall <mpm@selenic.com>
Tested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
diff -r 5e32b09a1b2b mm/slob.c
--- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500
+++ b/mm/slob.c Tue Oct 07 11:27:47 2008 -0500
@@ -515,7 +515,7 @@
sp = (struct slob_page *)virt_to_page(block);
if (slob_page(sp))
- return ((slob_t *)block - 1)->units + SLOB_UNIT;
+ return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
else
return sp->page.private;
}
--
Mathematics is the supreme nostalgia of our time.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 16:37 ` Matt Mackall
@ 2008-10-07 16:57 ` Pekka Enberg
2008-10-07 17:13 ` Matt Mackall
2008-10-07 17:57 ` Linus Torvalds
1 sibling, 1 reply; 27+ messages in thread
From: Pekka Enberg @ 2008-10-07 16:57 UTC (permalink / raw)
To: Matt Mackall
Cc: Peter Zijlstra, Christoph Lameter, linux-mm, Nick Piggin,
Linus Torvalds, Ingo Molnar, linux-kernel, akpm
Hi Matt,
On Tue, Oct 7, 2008 at 7:37 PM, Matt Mackall <mpm@selenic.com> wrote:
> SLOB: fix bogus ksize calculation
>
> SLOB's ksize calculation was braindamaged and generally harmlessly
> underreported the allocation size. But for very small buffers, it could
> in fact overreport them, leading code depending on krealloc to overrun
> the allocation and trample other data.
>
> Signed-off-by: Matt Mackall <mpm@selenic.com>
> Tested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
>
> diff -r 5e32b09a1b2b mm/slob.c
> --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500
> +++ b/mm/slob.c Tue Oct 07 11:27:47 2008 -0500
> @@ -515,7 +515,7 @@
>
> sp = (struct slob_page *)virt_to_page(block);
> if (slob_page(sp))
> - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
Hmm. I don't understand why we do the "minus one" thing here. Aren't
we underestimating the size now? Side note, why aren't we using
slob_units() here?
> else
> return sp->page.private;
> }
>
>
>
>
>
> --
> Mathematics is the supreme nostalgia of our time.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 16:57 ` Pekka Enberg
@ 2008-10-07 17:13 ` Matt Mackall
2008-10-07 17:31 ` Pekka Enberg
0 siblings, 1 reply; 27+ messages in thread
From: Matt Mackall @ 2008-10-07 17:13 UTC (permalink / raw)
To: Pekka Enberg
Cc: Peter Zijlstra, Christoph Lameter, linux-mm, Nick Piggin,
Linus Torvalds, Ingo Molnar, linux-kernel, akpm
On Tue, 2008-10-07 at 19:57 +0300, Pekka Enberg wrote:
> Hi Matt,
>
> On Tue, Oct 7, 2008 at 7:37 PM, Matt Mackall <mpm@selenic.com> wrote:
> > SLOB: fix bogus ksize calculation
> >
> > SLOB's ksize calculation was braindamaged and generally harmlessly
> > underreported the allocation size. But for very small buffers, it could
> > in fact overreport them, leading code depending on krealloc to overrun
> > the allocation and trample other data.
> >
> > Signed-off-by: Matt Mackall <mpm@selenic.com>
> > Tested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> >
> > diff -r 5e32b09a1b2b mm/slob.c
> > --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500
> > +++ b/mm/slob.c Tue Oct 07 11:27:47 2008 -0500
> > @@ -515,7 +515,7 @@
> >
> > sp = (struct slob_page *)virt_to_page(block);
> > if (slob_page(sp))
> > - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> > + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
>
> Hmm. I don't understand why we do the "minus one" thing here. Aren't
> we underestimating the size now?
The first -1 takes us to the object header in front of the object
pointer. The second -1 subtracts out the size of the header.
But it's entirely possible I'm off by one, so I'll double-check. Nick?
--
Mathematics is the supreme nostalgia of our time.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 17:13 ` Matt Mackall
@ 2008-10-07 17:31 ` Pekka Enberg
2008-10-07 23:08 ` Matt Mackall
0 siblings, 1 reply; 27+ messages in thread
From: Pekka Enberg @ 2008-10-07 17:31 UTC (permalink / raw)
To: Matt Mackall
Cc: Peter Zijlstra, Christoph Lameter, linux-mm, Nick Piggin,
Linus Torvalds, Ingo Molnar, linux-kernel, akpm
Hi Matt,
On Tue, Oct 7, 2008 at 8:13 PM, Matt Mackall <mpm@selenic.com> wrote:
>> > @@ -515,7 +515,7 @@
>> >
>> > sp = (struct slob_page *)virt_to_page(block);
>> > if (slob_page(sp))
>> > - return ((slob_t *)block - 1)->units + SLOB_UNIT;
>> > + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
>>
>> Hmm. I don't understand why we do the "minus one" thing here. Aren't
>> we underestimating the size now?
>
> The first -1 takes us to the object header in front of the object
> pointer. The second -1 subtracts out the size of the header.
>
> But it's entirely possible I'm off by one, so I'll double-check. Nick?
Yeah, I was referring to the second subtraction. Looking at
slob_page_alloc(), for example, we compare the return value of
slob_units() to SLOB_UNITS(size), so I don't think we count the header
in ->units. I mean, we ought to be seeing the subtraction elsewhere in
the code as well, no?
Pekka
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 16:37 ` Matt Mackall
2008-10-07 16:57 ` Pekka Enberg
@ 2008-10-07 17:57 ` Linus Torvalds
2008-10-07 18:11 ` Peter Zijlstra
1 sibling, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2008-10-07 17:57 UTC (permalink / raw)
To: Matt Mackall
Cc: Peter Zijlstra, Christoph Lameter, linux-mm, Nick Piggin,
Ingo Molnar, linux-kernel, akpm, Pekka J Enberg
On Tue, 7 Oct 2008, Matt Mackall wrote:
>
> Thanks, Peter. I know we're way late in the 2.6.27 cycle, so I'll leave
> it to Linus and Andrew to decide how to queue this up.
Well, since it seems to be clearly broken without it, I'd take it, but now
I'm kind of waiting for the resolution on whether that second "-1" is
correct or not.
>From a quick look at mm/slob.c I see Pekka's point that ->units does look
like the real size in units, not the "size plus header", and that the
second -1 may be bogus.
But I don't know the code.
Peter - can you check with that
> if (slob_page(sp))
> - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
thing using
- return ((slob_t *)block - 1)->units + SLOB_UNIT;
+ return ((slob_t *)block - 1)->units * SLOB_UNIT;
instead?
Linus
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 17:57 ` Linus Torvalds
@ 2008-10-07 18:11 ` Peter Zijlstra
2008-10-07 18:18 ` Linus Torvalds
0 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2008-10-07 18:11 UTC (permalink / raw)
To: Linus Torvalds
Cc: Matt Mackall, Christoph Lameter, linux-mm, Nick Piggin,
Ingo Molnar, linux-kernel, akpm, Pekka J Enberg
On Tue, 2008-10-07 at 10:57 -0700, Linus Torvalds wrote:
> Peter - can you check with that
>
> > if (slob_page(sp))
> > - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> > + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
>
> thing using
>
> - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> + return ((slob_t *)block - 1)->units * SLOB_UNIT;
>
> instead?
went splat on the second run...
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 18:11 ` Peter Zijlstra
@ 2008-10-07 18:18 ` Linus Torvalds
2008-10-08 19:51 ` Matt Mackall
0 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2008-10-07 18:18 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Matt Mackall, Christoph Lameter, linux-mm, Nick Piggin,
Ingo Molnar, linux-kernel, akpm, Pekka J Enberg
On Tue, 7 Oct 2008, Peter Zijlstra wrote:
> On Tue, 2008-10-07 at 10:57 -0700, Linus Torvalds wrote:
>
> > Peter - can you check with that
> >
> > > if (slob_page(sp))
> > > - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> > > + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
> >
> > thing using
> >
> > - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> > + return ((slob_t *)block - 1)->units * SLOB_UNIT;
> >
> > instead?
>
> went splat on the second run...
Well, that makes it simple. I'll take Matt's patch as being "tested", and
somebody can hopefully explain where the extra unit comes from later.
Linus
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 17:31 ` Pekka Enberg
@ 2008-10-07 23:08 ` Matt Mackall
2008-10-08 4:22 ` Nick Piggin
0 siblings, 1 reply; 27+ messages in thread
From: Matt Mackall @ 2008-10-07 23:08 UTC (permalink / raw)
To: Pekka Enberg
Cc: Peter Zijlstra, Christoph Lameter, linux-mm, Nick Piggin,
Linus Torvalds, Ingo Molnar, linux-kernel, akpm
On Tue, 2008-10-07 at 20:31 +0300, Pekka Enberg wrote:
> Hi Matt,
>
> On Tue, Oct 7, 2008 at 8:13 PM, Matt Mackall <mpm@selenic.com> wrote:
> >> > @@ -515,7 +515,7 @@
> >> >
> >> > sp = (struct slob_page *)virt_to_page(block);
> >> > if (slob_page(sp))
> >> > - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> >> > + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
> >>
> >> Hmm. I don't understand why we do the "minus one" thing here. Aren't
> >> we underestimating the size now?
> >
> > The first -1 takes us to the object header in front of the object
> > pointer. The second -1 subtracts out the size of the header.
> >
> > But it's entirely possible I'm off by one, so I'll double-check. Nick?
>
> Yeah, I was referring to the second subtraction. Looking at
> slob_page_alloc(), for example, we compare the return value of
> slob_units() to SLOB_UNITS(size), so I don't think we count the header
> in ->units. I mean, we ought to be seeing the subtraction elsewhere in
> the code as well, no?
Ok, I've looked a bit closer at it and I think we need a different fix.
The underlying allocator, slob_alloc, takes a size in bytes and returns
an object of that size, with the first word containing the number of
slob_t units.
kmalloc calls slob_alloc after adding on some space for header and
architecture padding. This space is not necessarily 1 slob unit:
unsigned int *m;
int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
...
m = slob_alloc(size + align, gfp, align, node);
*m = size;
return (void *)m + align;
Note that we overwrite the header with our own size -in bytes-.
kfree does the reverse:
int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
unsigned int *m = (unsigned int *)(block - align);
slob_free(m, *m + align);
That second line is locating the kmalloc header. All looks good.
The MINALIGN business was introduced by Nick with:
slob: improved alignment handling
but seems to have missed ksize, which should now be doing the following
to match:
diff -r 5e32b09a1b2b mm/slob.c
--- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500
+++ b/mm/slob.c Tue Oct 07 18:05:15 2008 -0500
@@ -514,9 +514,11 @@
return 0;
sp = (struct slob_page *)virt_to_page(block);
- if (slob_page(sp))
- return ((slob_t *)block - 1)->units + SLOB_UNIT;
- else
+ if (slob_page(sp)) {
+ int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
+ unsigned int *m = (unsigned int *)(block - align);
+ return SLOB_UNITS(*m); /* round up */
+ } else
return sp->page.private;
}
That leaves the question of why this morning's patch worked at all,
given that it was based on how SLOB worked before Nick's patch. But I
haven't finished working through that. Peter, can I get you to test the
above?
--
Mathematics is the supreme nostalgia of our time.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 23:08 ` Matt Mackall
@ 2008-10-08 4:22 ` Nick Piggin
2008-10-08 4:46 ` Matt Mackall
0 siblings, 1 reply; 27+ messages in thread
From: Nick Piggin @ 2008-10-08 4:22 UTC (permalink / raw)
To: Matt Mackall
Cc: Pekka Enberg, Peter Zijlstra, Christoph Lameter, linux-mm,
Linus Torvalds, Ingo Molnar, linux-kernel, akpm
On Wednesday 08 October 2008 10:08, Matt Mackall wrote:
> On Tue, 2008-10-07 at 20:31 +0300, Pekka Enberg wrote:
> > Hi Matt,
> >
> > On Tue, Oct 7, 2008 at 8:13 PM, Matt Mackall <mpm@selenic.com> wrote:
> > >> > @@ -515,7 +515,7 @@
> > >> >
> > >> > sp = (struct slob_page *)virt_to_page(block);
> > >> > if (slob_page(sp))
> > >> > - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> > >> > + return (((slob_t *)block - 1)->units - 1) *
> > >> > SLOB_UNIT;
> > >>
> > >> Hmm. I don't understand why we do the "minus one" thing here. Aren't
> > >> we underestimating the size now?
> > >
> > > The first -1 takes us to the object header in front of the object
> > > pointer. The second -1 subtracts out the size of the header.
> > >
> > > But it's entirely possible I'm off by one, so I'll double-check. Nick?
> >
> > Yeah, I was referring to the second subtraction. Looking at
> > slob_page_alloc(), for example, we compare the return value of
> > slob_units() to SLOB_UNITS(size), so I don't think we count the header
> > in ->units. I mean, we ought to be seeing the subtraction elsewhere in
> > the code as well, no?
>
> Ok, I've looked a bit closer at it and I think we need a different fix.
>
> The underlying allocator, slob_alloc, takes a size in bytes and returns
> an object of that size, with the first word containing the number of
> slob_t units.
>
> kmalloc calls slob_alloc after adding on some space for header and
> architecture padding. This space is not necessarily 1 slob unit:
>
> unsigned int *m;
> int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
> ...
> m = slob_alloc(size + align, gfp, align, node);
> *m = size;
> return (void *)m + align;
>
> Note that we overwrite the header with our own size -in bytes-.
> kfree does the reverse:
Right.
> int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
> unsigned int *m = (unsigned int *)(block - align);
> slob_free(m, *m + align);
>
> That second line is locating the kmalloc header. All looks good.
>
> The MINALIGN business was introduced by Nick with:
>
> slob: improved alignment handling
>
> but seems to have missed ksize, which should now be doing the following
> to match:
>
> diff -r 5e32b09a1b2b mm/slob.c
> --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500
> +++ b/mm/slob.c Tue Oct 07 18:05:15 2008 -0500
> @@ -514,9 +514,11 @@
> return 0;
>
> sp = (struct slob_page *)virt_to_page(block);
> - if (slob_page(sp))
> - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> - else
> + if (slob_page(sp)) {
> + int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
> + unsigned int *m = (unsigned int *)(block - align);
> + return SLOB_UNITS(*m); /* round up */
> + } else
> return sp->page.private;
> }
Yes, I came up with nearly the same patch before reading this
--- linux-2.6/mm/slob.c 2008-10-08 14:43:17.000000000 +1100
+++ suth/mm/slob.c 2008-10-08 15:11:06.000000000 +1100
@@ -514,9 +514,11 @@ size_t ksize(const void *block)
return 0;
sp = (struct slob_page *)virt_to_page(block);
- if (slob_page(sp))
- return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
- else
+ if (slob_page(sp)) {
+ int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
+ unsigned int *m = (unsigned int *)(block - align);
+ return *m + align;
+ } else
return sp->page.private;
}
However, mine is lifted directly from kfree, wheras you do something a
bit different. Hmm, ksize arguably could be used to find the underlying
allocated slab size in order to use a little bit more than we'd asked
for. So probably we should really just `return *m` (don't round up or
add any padding).
> That leaves the question of why this morning's patch worked at all,
> given that it was based on how SLOB worked before Nick's patch. But I
> haven't finished working through that. Peter, can I get you to test the
> above?
I didn't have ksize in my slob user test harness, but added a couple of
tests in there, and indeed ksize was returning complete garbage both
before and after the latest patch to slob. I'd say it was simply luck.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-08 4:22 ` Nick Piggin
@ 2008-10-08 4:46 ` Matt Mackall
2008-10-08 4:54 ` Nick Piggin
0 siblings, 1 reply; 27+ messages in thread
From: Matt Mackall @ 2008-10-08 4:46 UTC (permalink / raw)
To: Nick Piggin
Cc: Pekka Enberg, Peter Zijlstra, Christoph Lameter, linux-mm,
Linus Torvalds, Ingo Molnar, linux-kernel, akpm
On Wed, 2008-10-08 at 15:22 +1100, Nick Piggin wrote:
> On Wednesday 08 October 2008 10:08, Matt Mackall wrote:
> > On Tue, 2008-10-07 at 20:31 +0300, Pekka Enberg wrote:
> > > Hi Matt,
> > >
> > > On Tue, Oct 7, 2008 at 8:13 PM, Matt Mackall <mpm@selenic.com> wrote:
> > > >> > @@ -515,7 +515,7 @@
> > > >> >
> > > >> > sp = (struct slob_page *)virt_to_page(block);
> > > >> > if (slob_page(sp))
> > > >> > - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> > > >> > + return (((slob_t *)block - 1)->units - 1) *
> > > >> > SLOB_UNIT;
> > > >>
> > > >> Hmm. I don't understand why we do the "minus one" thing here. Aren't
> > > >> we underestimating the size now?
> > > >
> > > > The first -1 takes us to the object header in front of the object
> > > > pointer. The second -1 subtracts out the size of the header.
> > > >
> > > > But it's entirely possible I'm off by one, so I'll double-check. Nick?
> > >
> > > Yeah, I was referring to the second subtraction. Looking at
> > > slob_page_alloc(), for example, we compare the return value of
> > > slob_units() to SLOB_UNITS(size), so I don't think we count the header
> > > in ->units. I mean, we ought to be seeing the subtraction elsewhere in
> > > the code as well, no?
> >
> > Ok, I've looked a bit closer at it and I think we need a different fix.
> >
> > The underlying allocator, slob_alloc, takes a size in bytes and returns
> > an object of that size, with the first word containing the number of
> > slob_t units.
> >
> > kmalloc calls slob_alloc after adding on some space for header and
> > architecture padding. This space is not necessarily 1 slob unit:
> >
> > unsigned int *m;
> > int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
> > ...
> > m = slob_alloc(size + align, gfp, align, node);
> > *m = size;
> > return (void *)m + align;
> >
> > Note that we overwrite the header with our own size -in bytes-.
> > kfree does the reverse:
>
> Right.
>
> > int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
> > unsigned int *m = (unsigned int *)(block - align);
> > slob_free(m, *m + align);
> >
> > That second line is locating the kmalloc header. All looks good.
> >
> > The MINALIGN business was introduced by Nick with:
> >
> > slob: improved alignment handling
> >
> > but seems to have missed ksize, which should now be doing the following
> > to match:
> >
> > diff -r 5e32b09a1b2b mm/slob.c
> > --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500
> > +++ b/mm/slob.c Tue Oct 07 18:05:15 2008 -0500
> > @@ -514,9 +514,11 @@
> > return 0;
> >
> > sp = (struct slob_page *)virt_to_page(block);
> > - if (slob_page(sp))
> > - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> > - else
> > + if (slob_page(sp)) {
> > + int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
> > + unsigned int *m = (unsigned int *)(block - align);
> > + return SLOB_UNITS(*m); /* round up */
> > + } else
> > return sp->page.private;
> > }
>
> Yes, I came up with nearly the same patch before reading this
>
> --- linux-2.6/mm/slob.c 2008-10-08 14:43:17.000000000 +1100
> +++ suth/mm/slob.c 2008-10-08 15:11:06.000000000 +1100
> @@ -514,9 +514,11 @@ size_t ksize(const void *block)
> return 0;
>
> sp = (struct slob_page *)virt_to_page(block);
> - if (slob_page(sp))
> - return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
> - else
> + if (slob_page(sp)) {
> + int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
> + unsigned int *m = (unsigned int *)(block - align);
> + return *m + align;
> + } else
> return sp->page.private;
> }
>
> However, mine is lifted directly from kfree, wheras you do something a
> bit different. Hmm, ksize arguably could be used to find the underlying
> allocated slab size in order to use a little bit more than we'd asked
> for. So probably we should really just `return *m` (don't round up or
> add any padding).
Huh? ksize should report how much space is available in the buffer. If
we request 33 bytes from SLUB and it gives us 64, ksize reports 64. If
we request 33 bytes from SLOB and it gives us 34, we should report 34.
> > That leaves the question of why this morning's patch worked at all,
> > given that it was based on how SLOB worked before Nick's patch. But I
> > haven't finished working through that. Peter, can I get you to test the
> > above?
>
> I didn't have ksize in my slob user test harness, but added a couple of
> tests in there, and indeed ksize was returning complete garbage both
> before and after the latest patch to slob. I'd say it was simply luck.
I was going to dig your harness up this morning and realized it was on
my dead laptop. Send me another copy?
--
Mathematics is the supreme nostalgia of our time.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-08 4:46 ` Matt Mackall
@ 2008-10-08 4:54 ` Nick Piggin
2008-10-08 5:11 ` Nick Piggin
0 siblings, 1 reply; 27+ messages in thread
From: Nick Piggin @ 2008-10-08 4:54 UTC (permalink / raw)
To: Matt Mackall
Cc: Pekka Enberg, Peter Zijlstra, Christoph Lameter, linux-mm,
Linus Torvalds, Ingo Molnar, linux-kernel, akpm
On Wednesday 08 October 2008 15:46, Matt Mackall wrote:
> On Wed, 2008-10-08 at 15:22 +1100, Nick Piggin wrote:
> > On Wednesday 08 October 2008 10:08, Matt Mackall wrote:
> > > On Tue, 2008-10-07 at 20:31 +0300, Pekka Enberg wrote:
> > > > Hi Matt,
> > > >
> > > > On Tue, Oct 7, 2008 at 8:13 PM, Matt Mackall <mpm@selenic.com> wrote:
> > > > >> > @@ -515,7 +515,7 @@
> > > > >> >
> > > > >> > sp = (struct slob_page *)virt_to_page(block);
> > > > >> > if (slob_page(sp))
> > > > >> > - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> > > > >> > + return (((slob_t *)block - 1)->units - 1) *
> > > > >> > SLOB_UNIT;
> > > > >>
> > > > >> Hmm. I don't understand why we do the "minus one" thing here.
> > > > >> Aren't we underestimating the size now?
> > > > >
> > > > > The first -1 takes us to the object header in front of the object
> > > > > pointer. The second -1 subtracts out the size of the header.
> > > > >
> > > > > But it's entirely possible I'm off by one, so I'll double-check.
> > > > > Nick?
> > > >
> > > > Yeah, I was referring to the second subtraction. Looking at
> > > > slob_page_alloc(), for example, we compare the return value of
> > > > slob_units() to SLOB_UNITS(size), so I don't think we count the
> > > > header in ->units. I mean, we ought to be seeing the subtraction
> > > > elsewhere in the code as well, no?
> > >
> > > Ok, I've looked a bit closer at it and I think we need a different fix.
> > >
> > > The underlying allocator, slob_alloc, takes a size in bytes and returns
> > > an object of that size, with the first word containing the number of
> > > slob_t units.
> > >
> > > kmalloc calls slob_alloc after adding on some space for header and
> > > architecture padding. This space is not necessarily 1 slob unit:
> > >
> > > unsigned int *m;
> > > int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
> > > ...
> > > m = slob_alloc(size + align, gfp, align, node);
> > > *m = size;
> > > return (void *)m + align;
> > >
> > > Note that we overwrite the header with our own size -in bytes-.
> > > kfree does the reverse:
> >
> > Right.
> >
> > > int align = max(ARCH_KMALLOC_MINALIGN,
> > > ARCH_SLAB_MINALIGN); unsigned int *m = (unsigned int *)(block - align);
> > > slob_free(m, *m + align);
> > >
> > > That second line is locating the kmalloc header. All looks good.
> > >
> > > The MINALIGN business was introduced by Nick with:
> > >
> > > slob: improved alignment handling
> > >
> > > but seems to have missed ksize, which should now be doing the following
> > > to match:
> > >
> > > diff -r 5e32b09a1b2b mm/slob.c
> > > --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500
> > > +++ b/mm/slob.c Tue Oct 07 18:05:15 2008 -0500
> > > @@ -514,9 +514,11 @@
> > > return 0;
> > >
> > > sp = (struct slob_page *)virt_to_page(block);
> > > - if (slob_page(sp))
> > > - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> > > - else
> > > + if (slob_page(sp)) {
> > > + int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
> > > + unsigned int *m = (unsigned int *)(block - align);
> > > + return SLOB_UNITS(*m); /* round up */
> > > + } else
> > > return sp->page.private;
> > > }
> >
> > Yes, I came up with nearly the same patch before reading this
> >
> > --- linux-2.6/mm/slob.c 2008-10-08 14:43:17.000000000 +1100
> > +++ suth/mm/slob.c 2008-10-08 15:11:06.000000000 +1100
> > @@ -514,9 +514,11 @@ size_t ksize(const void *block)
> > return 0;
> >
> > sp = (struct slob_page *)virt_to_page(block);
> > - if (slob_page(sp))
> > - return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
> > - else
> > + if (slob_page(sp)) {
> > + int align = max(ARCH_KMALLOC_MINALIGN,
> > ARCH_SLAB_MINALIGN); + unsigned int *m = (unsigned int
> > *)(block - align); + return *m + align;
> > + } else
> > return sp->page.private;
> > }
> >
> > However, mine is lifted directly from kfree, wheras you do something a
> > bit different. Hmm, ksize arguably could be used to find the underlying
> > allocated slab size in order to use a little bit more than we'd asked
> > for. So probably we should really just `return *m` (don't round up or
> > add any padding).
>
> Huh? ksize should report how much space is available in the buffer. If
> we request 33 bytes from SLUB and it gives us 64, ksize reports 64. If
> we request 33 bytes from SLOB and it gives us 34, we should report 34.
Oh.. hmm yeah right, I didn't realise what you were doing there.
OK, so your patch looks good to me then (provided it is diffed against
the previous one, for Linus).
> > > That leaves the question of why this morning's patch worked at all,
> > > given that it was based on how SLOB worked before Nick's patch. But I
> > > haven't finished working through that. Peter, can I get you to test the
> > > above?
> >
> > I didn't have ksize in my slob user test harness, but added a couple of
> > tests in there, and indeed ksize was returning complete garbage both
> > before and after the latest patch to slob. I'd say it was simply luck.
>
> I was going to dig your harness up this morning and realized it was on
> my dead laptop. Send me another copy?
OK, off-list.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-08 4:54 ` Nick Piggin
@ 2008-10-08 5:11 ` Nick Piggin
2008-10-08 5:15 ` Matt Mackall
0 siblings, 1 reply; 27+ messages in thread
From: Nick Piggin @ 2008-10-08 5:11 UTC (permalink / raw)
To: Matt Mackall
Cc: Pekka Enberg, Peter Zijlstra, Christoph Lameter, linux-mm,
Linus Torvalds, Ingo Molnar, linux-kernel, akpm
On Wednesday 08 October 2008 15:54, Nick Piggin wrote:
> On Wednesday 08 October 2008 15:46, Matt Mackall wrote:
> > On Wed, 2008-10-08 at 15:22 +1100, Nick Piggin wrote:
> > > On Wednesday 08 October 2008 10:08, Matt Mackall wrote:
> > > > diff -r 5e32b09a1b2b mm/slob.c
> > > > --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500
> > > > +++ b/mm/slob.c Tue Oct 07 18:05:15 2008 -0500
> > > > @@ -514,9 +514,11 @@
> > > > return 0;
> > > >
> > > > sp = (struct slob_page *)virt_to_page(block);
> > > > - if (slob_page(sp))
> > > > - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> > > > - else
> > > > + if (slob_page(sp)) {
> > > > + int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
> > > > + unsigned int *m = (unsigned int *)(block - align);
> > > > + return SLOB_UNITS(*m); /* round up */
> > > > + } else
> > > > return sp->page.private;
> > > > }
> > >
> > > Yes, I came up with nearly the same patch before reading this
> > >
> > > --- linux-2.6/mm/slob.c 2008-10-08 14:43:17.000000000 +1100
> > > +++ suth/mm/slob.c 2008-10-08 15:11:06.000000000 +1100
> > > @@ -514,9 +514,11 @@ size_t ksize(const void *block)
> > > return 0;
> > >
> > > sp = (struct slob_page *)virt_to_page(block);
> > > - if (slob_page(sp))
> > > - return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
> > > - else
> > > + if (slob_page(sp)) {
> > > + int align = max(ARCH_KMALLOC_MINALIGN,
> > > ARCH_SLAB_MINALIGN); + unsigned int *m = (unsigned int
> > > *)(block - align); + return *m + align;
> > > + } else
> > > return sp->page.private;
> > > }
> > >
> > > However, mine is lifted directly from kfree, wheras you do something a
> > > bit different. Hmm, ksize arguably could be used to find the underlying
> > > allocated slab size in order to use a little bit more than we'd asked
> > > for. So probably we should really just `return *m` (don't round up or
> > > add any padding).
> >
> > Huh? ksize should report how much space is available in the buffer. If
> > we request 33 bytes from SLUB and it gives us 64, ksize reports 64. If
> > we request 33 bytes from SLOB and it gives us 34, we should report 34.
>
> Oh.. hmm yeah right, I didn't realise what you were doing there.
> OK, so your patch looks good to me then (provided it is diffed against
> the previous one, for Linus).
OK, no, that's why I got confused. SLOB_UNITS will round you up to the
next SLOB_UNIT. You'd then have to multiply by SLOB_UNIT to get back to
bytes.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-08 5:11 ` Nick Piggin
@ 2008-10-08 5:15 ` Matt Mackall
2008-10-08 6:43 ` Peter Zijlstra
0 siblings, 1 reply; 27+ messages in thread
From: Matt Mackall @ 2008-10-08 5:15 UTC (permalink / raw)
To: Nick Piggin
Cc: Pekka Enberg, Peter Zijlstra, Christoph Lameter, linux-mm,
Linus Torvalds, Ingo Molnar, linux-kernel, akpm
On Wed, 2008-10-08 at 16:11 +1100, Nick Piggin wrote:
> On Wednesday 08 October 2008 15:54, Nick Piggin wrote:
> > On Wednesday 08 October 2008 15:46, Matt Mackall wrote:
> > > On Wed, 2008-10-08 at 15:22 +1100, Nick Piggin wrote:
> > > > On Wednesday 08 October 2008 10:08, Matt Mackall wrote:
>
> > > > > diff -r 5e32b09a1b2b mm/slob.c
> > > > > --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500
> > > > > +++ b/mm/slob.c Tue Oct 07 18:05:15 2008 -0500
> > > > > @@ -514,9 +514,11 @@
> > > > > return 0;
> > > > >
> > > > > sp = (struct slob_page *)virt_to_page(block);
> > > > > - if (slob_page(sp))
> > > > > - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> > > > > - else
> > > > > + if (slob_page(sp)) {
> > > > > + int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
> > > > > + unsigned int *m = (unsigned int *)(block - align);
> > > > > + return SLOB_UNITS(*m); /* round up */
> > > > > + } else
> > > > > return sp->page.private;
> > > > > }
> > > >
> > > > Yes, I came up with nearly the same patch before reading this
> > > >
> > > > --- linux-2.6/mm/slob.c 2008-10-08 14:43:17.000000000 +1100
> > > > +++ suth/mm/slob.c 2008-10-08 15:11:06.000000000 +1100
> > > > @@ -514,9 +514,11 @@ size_t ksize(const void *block)
> > > > return 0;
> > > >
> > > > sp = (struct slob_page *)virt_to_page(block);
> > > > - if (slob_page(sp))
> > > > - return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
> > > > - else
> > > > + if (slob_page(sp)) {
> > > > + int align = max(ARCH_KMALLOC_MINALIGN,
> > > > ARCH_SLAB_MINALIGN); + unsigned int *m = (unsigned int
> > > > *)(block - align); + return *m + align;
> > > > + } else
> > > > return sp->page.private;
> > > > }
> > > >
> > > > However, mine is lifted directly from kfree, wheras you do something a
> > > > bit different. Hmm, ksize arguably could be used to find the underlying
> > > > allocated slab size in order to use a little bit more than we'd asked
> > > > for. So probably we should really just `return *m` (don't round up or
> > > > add any padding).
> > >
> > > Huh? ksize should report how much space is available in the buffer. If
> > > we request 33 bytes from SLUB and it gives us 64, ksize reports 64. If
> > > we request 33 bytes from SLOB and it gives us 34, we should report 34.
> >
> > Oh.. hmm yeah right, I didn't realise what you were doing there.
> > OK, so your patch looks good to me then (provided it is diffed against
> > the previous one, for Linus).
>
> OK, no, that's why I got confused. SLOB_UNITS will round you up to the
> next SLOB_UNIT. You'd then have to multiply by SLOB_UNIT to get back to
> bytes.
Damnit, how many ways can we get confused by these little details? I'll
spin a final version and run it against the test harness shortly.
--
Mathematics is the supreme nostalgia of our time.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-08 5:15 ` Matt Mackall
@ 2008-10-08 6:43 ` Peter Zijlstra
2008-10-08 7:25 ` Pekka Enberg
0 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2008-10-08 6:43 UTC (permalink / raw)
To: Matt Mackall
Cc: Nick Piggin, Pekka Enberg, Christoph Lameter, linux-mm,
Linus Torvalds, Ingo Molnar, linux-kernel, akpm
On Wed, 2008-10-08 at 00:15 -0500, Matt Mackall wrote:
> Damnit, how many ways can we get confused by these little details? I'll
> spin a final version and run it against the test harness shortly.
So I'll wait with testing for the next version?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-08 6:43 ` Peter Zijlstra
@ 2008-10-08 7:25 ` Pekka Enberg
2008-10-08 7:37 ` Peter Zijlstra
0 siblings, 1 reply; 27+ messages in thread
From: Pekka Enberg @ 2008-10-08 7:25 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Matt Mackall, Nick Piggin, Christoph Lameter, linux-mm,
Linus Torvalds, Ingo Molnar, linux-kernel, akpm
Hi Peter,
On Wed, 2008-10-08 at 00:15 -0500, Matt Mackall wrote:
>> Damnit, how many ways can we get confused by these little details? I'll
>> spin a final version and run it against the test harness shortly.
On Wed, Oct 8, 2008 at 9:43 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> So I'll wait with testing for the next version?
AFAICT, this should be fine:
http://lkml.org/lkml/2008/10/7/436
You need to revert the patch Linus already committed before applying
that though.
Pekka
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-08 7:25 ` Pekka Enberg
@ 2008-10-08 7:37 ` Peter Zijlstra
2008-10-08 7:39 ` Pekka Enberg
0 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2008-10-08 7:37 UTC (permalink / raw)
To: Pekka Enberg
Cc: Matt Mackall, Nick Piggin, Christoph Lameter, linux-mm,
Linus Torvalds, Ingo Molnar, linux-kernel, akpm
On Wed, 2008-10-08 at 10:25 +0300, Pekka Enberg wrote:
> Hi Peter,
>
> On Wed, 2008-10-08 at 00:15 -0500, Matt Mackall wrote:
> >> Damnit, how many ways can we get confused by these little details? I'll
> >> spin a final version and run it against the test harness shortly.
>
> On Wed, Oct 8, 2008 at 9:43 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> > So I'll wait with testing for the next version?
>
> AFAICT, this should be fine:
>
> http://lkml.org/lkml/2008/10/7/436
>
> You need to revert the patch Linus already committed before applying
> that though.
Or not have pulled it yet ;-)
Anyway, that patch seems to be good on 4 boots.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-08 7:37 ` Peter Zijlstra
@ 2008-10-08 7:39 ` Pekka Enberg
0 siblings, 0 replies; 27+ messages in thread
From: Pekka Enberg @ 2008-10-08 7:39 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Matt Mackall, Nick Piggin, Christoph Lameter, linux-mm,
Linus Torvalds, Ingo Molnar, linux-kernel, akpm
On Wed, 2008-10-08 at 09:37 +0200, Peter Zijlstra wrote:
> On Wed, 2008-10-08 at 10:25 +0300, Pekka Enberg wrote:
> > Hi Peter,
> >
> > On Wed, 2008-10-08 at 00:15 -0500, Matt Mackall wrote:
> > >> Damnit, how many ways can we get confused by these little details? I'll
> > >> spin a final version and run it against the test harness shortly.
> >
> > On Wed, Oct 8, 2008 at 9:43 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> > > So I'll wait with testing for the next version?
> >
> > AFAICT, this should be fine:
> >
> > http://lkml.org/lkml/2008/10/7/436
> >
> > You need to revert the patch Linus already committed before applying
> > that though.
>
> Or not have pulled it yet ;-)
>
> Anyway, that patch seems to be good on 4 boots.
Thanks for testing, Peter!
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust
2008-10-07 18:18 ` Linus Torvalds
@ 2008-10-08 19:51 ` Matt Mackall
0 siblings, 0 replies; 27+ messages in thread
From: Matt Mackall @ 2008-10-08 19:51 UTC (permalink / raw)
To: Linus Torvalds
Cc: Peter Zijlstra, Christoph Lameter, linux-mm, Nick Piggin,
Ingo Molnar, linux-kernel, akpm, Pekka J Enberg
On Tue, 2008-10-07 at 11:18 -0700, Linus Torvalds wrote:
>
> On Tue, 7 Oct 2008, Peter Zijlstra wrote:
>
> > On Tue, 2008-10-07 at 10:57 -0700, Linus Torvalds wrote:
> >
> > > Peter - can you check with that
> > >
> > > > if (slob_page(sp))
> > > > - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> > > > + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
> > >
> > > thing using
> > >
> > > - return ((slob_t *)block - 1)->units + SLOB_UNIT;
> > > + return ((slob_t *)block - 1)->units * SLOB_UNIT;
> > >
> > > instead?
> >
> > went splat on the second run...
>
> Well, that makes it simple. I'll take Matt's patch as being "tested", and
> somebody can hopefully explain where the extra unit comes from later.
Ok, I think we've gotten to the bottom of this. Here's an incremental
patch that doesn't work by dumb luck. Please apply.
SLOB: fix bogus ksize calculation fix
This fixes the previous fix, which was completely wrong on closer
inspection. This version has been manually tested with a user-space
test harness and generates sane values. A nearly identical patch has
been boot-tested.
The problem arose from changing how kmalloc/kfree handled alignment
padding without updating ksize to match. This brings it in sync.
Signed-off-by: Matt Mackall <mpm@selenic.com>
diff -r 3dd2424d4c32 -r 73d55a1b6c10 mm/slob.c
--- a/mm/slob.c Tue Oct 07 23:00:11 2008 +0000
+++ b/mm/slob.c Wed Oct 08 14:48:45 2008 -0500
@@ -514,9 +514,11 @@
return 0;
sp = (struct slob_page *)virt_to_page(block);
- if (slob_page(sp))
- return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT;
- else
+ if (slob_page(sp)) {
+ int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
+ unsigned int *m = (unsigned int *)(block - align);
+ return SLOB_UNITS(*m) * SLOB_UNIT;
+ } else
return sp->page.private;
}
--
Mathematics is the supreme nostalgia of our time.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2008-10-08 19:51 UTC | newest]
Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-10-07 13:57 [BUG] SLOB's krealloc() seems bust Peter Zijlstra
2008-10-07 14:07 ` Christoph Lameter
2008-10-07 14:13 ` Peter Zijlstra
2008-10-07 14:26 ` Christoph Lameter
2008-10-07 15:00 ` Matt Mackall
2008-10-07 15:20 ` Christoph Lameter
2008-10-07 15:58 ` Matt Mackall
2008-10-07 16:10 ` Peter Zijlstra
2008-10-07 16:37 ` Linus Torvalds
2008-10-07 16:37 ` Matt Mackall
2008-10-07 16:57 ` Pekka Enberg
2008-10-07 17:13 ` Matt Mackall
2008-10-07 17:31 ` Pekka Enberg
2008-10-07 23:08 ` Matt Mackall
2008-10-08 4:22 ` Nick Piggin
2008-10-08 4:46 ` Matt Mackall
2008-10-08 4:54 ` Nick Piggin
2008-10-08 5:11 ` Nick Piggin
2008-10-08 5:15 ` Matt Mackall
2008-10-08 6:43 ` Peter Zijlstra
2008-10-08 7:25 ` Pekka Enberg
2008-10-08 7:37 ` Peter Zijlstra
2008-10-08 7:39 ` Pekka Enberg
2008-10-07 17:57 ` Linus Torvalds
2008-10-07 18:11 ` Peter Zijlstra
2008-10-07 18:18 ` Linus Torvalds
2008-10-08 19:51 ` Matt Mackall
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox