* [BUG] SLOB's krealloc() seems bust @ 2008-10-07 13:57 Peter Zijlstra 2008-10-07 14:07 ` Christoph Lameter 0 siblings, 1 reply; 27+ messages in thread From: Peter Zijlstra @ 2008-10-07 13:57 UTC (permalink / raw) To: Matt Mackall Cc: linux-mm, Nick Piggin, Linus Torvalds, Ingo Molnar, linux-kernel My test box would crash 50% of the bootups like so: [ 12.002323] BUG: unable to handle kernel paging request at ffff88047bdd513c [ 12.003310] IP: [<ffffffff80261f17>] load_module+0x6c8/0x1b2a [ 12.003310] PGD 202063 PUD 0 [ 12.003310] Oops: 0000 [1] PREEMPT SMP [ 12.003310] CPU 0 [ 12.003310] Modules linked in: kvm_amd kvm sg sr_mod serio_raw k8temp floppy pcspkr button cdrom shpchp [ 12.003310] Pid: 1219, comm: modprobe Not tainted 2.6.27-rc9 #452 Which points us to the percpu_modalloc() code. After adding some printk's to get some insight into the matter I got the following: [ 10.058055] percpu_modalloc: pcpu_size: ffff88007d82c0e8 size: 8 align: 8 name: kvm_amd [ 10.066042] percpu_modalloc: pcpu_size[0] = -37536 ptr: ffffffff80757000 extra: 0 [ 10.073505] percpu_modalloc: pcpu_size[1] = 8192 ptr: ffffffff807602a0 extra: 0 [ 10.080795] split_block: pcpu_size: ffff88007d82c0e8 i: 1 size: 8 [ 10.086875] split_block: pcpu_size: ffff88007bdd5140 [ 10.091828] split_block: pcpu_size[0] = 2078109024 [ 10.096607] split_block: pcpu_size[1] = -30720 [ 10.101039] split_block: pcpu_size[0] = 2078109024 [ 10.105817] split_block: pcpu_size[1] = -30720 [ 10.110249] split_block: pcpu_size[2] = -30720 [ 10.114682] split_block: pcpu_size[1] = 8 pcpu_size[2] = -30728 Which basically shows us that the content of the pcpu_size[] array got corrupted after the krealloc() call in split_block(). Which made me look at which slab allocator I had selected, which turned out to be SLOB (from testing the network swap stuff). Flipping it back to SLUB seems to cure the issue... Will put poking at SLOB on the todo list somewhere, feel free to beat me to it ;-) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 13:57 [BUG] SLOB's krealloc() seems bust Peter Zijlstra @ 2008-10-07 14:07 ` Christoph Lameter 2008-10-07 14:13 ` Peter Zijlstra 2008-10-07 15:00 ` Matt Mackall 0 siblings, 2 replies; 27+ messages in thread From: Christoph Lameter @ 2008-10-07 14:07 UTC (permalink / raw) To: Peter Zijlstra Cc: Matt Mackall, linux-mm, Nick Piggin, Linus Torvalds, Ingo Molnar, linux-kernel > Which basically shows us that the content of the pcpu_size[] array got > corrupted after the krealloc() call in split_block(). > > Which made me look at which slab allocator I had selected, which turned > out to be SLOB (from testing the network swap stuff). krealloc() is in generic core code (mm/util.c) and is the same for all allocators. krealloc uses ksize() which is somewhat dicey for SLOB because it only works on kmalloc'ed memory. Is the krealloc used on memory allocated with kmalloc()? Slob's ksize could use a BUG_ON for the case in which ksize() is used on kmem_cache_alloc'd memory. /* can't use ksize for kmem_cache_alloc memory, only kmalloc */ size_t ksize(const void *block) { struct slob_page *sp; BUG_ON(!block); if (unlikely(block == ZERO_SIZE_PTR)) return 0; sp = (struct slob_page *)virt_to_page(block); Add a BUG_ON(!kmalloc_cache(sp))? if (slob_page(sp)) return ((slob_t *)block - 1)->units + SLOB_UNIT; ^^^^^^^ Is this correct? else return sp->page.private; } -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 14:07 ` Christoph Lameter @ 2008-10-07 14:13 ` Peter Zijlstra 2008-10-07 14:26 ` Christoph Lameter 2008-10-07 15:00 ` Matt Mackall 1 sibling, 1 reply; 27+ messages in thread From: Peter Zijlstra @ 2008-10-07 14:13 UTC (permalink / raw) To: Christoph Lameter Cc: Matt Mackall, linux-mm, Nick Piggin, Linus Torvalds, Ingo Molnar, linux-kernel On Tue, 2008-10-07 at 09:07 -0500, Christoph Lameter wrote: > > Which basically shows us that the content of the pcpu_size[] array got > > corrupted after the krealloc() call in split_block(). > > > > Which made me look at which slab allocator I had selected, which turned > > out to be SLOB (from testing the network swap stuff). > > krealloc() is in generic core code (mm/util.c) and is the same for all allocators. Joy :/ > krealloc uses ksize() which is somewhat dicey for SLOB because it only works > on kmalloc'ed memory. Is the krealloc used on memory allocated with kmalloc()? > Slob's ksize could use a BUG_ON for the case in which ksize() is used on > kmem_cache_alloc'd memory. kernel/module.c: perpcu_modinit() reads: pcpu_size = kmalloc(sizeof(pcpu_size[0]) * pcpu_num_allocated, GFP_KERNEL); kernel/module.c: split_block() reads: new = krealloc(pcpu_size, sizeof(new[0])*pcpu_num_allocated*2, GFP_KERNEL); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 14:13 ` Peter Zijlstra @ 2008-10-07 14:26 ` Christoph Lameter 0 siblings, 0 replies; 27+ messages in thread From: Christoph Lameter @ 2008-10-07 14:26 UTC (permalink / raw) To: Peter Zijlstra Cc: Matt Mackall, linux-mm, Nick Piggin, Linus Torvalds, Ingo Molnar, linux-kernel Peter Zijlstra wrote: > kernel/module.c: perpcu_modinit() reads: > > pcpu_size = kmalloc(sizeof(pcpu_size[0]) * pcpu_num_allocated, > GFP_KERNEL); > > kernel/module.c: split_block() reads: > > new = krealloc(pcpu_size, sizeof(new[0])*pcpu_num_allocated*2, > GFP_KERNEL); > That code is on the way out btw. cpu_alloc support removes this code. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 14:07 ` Christoph Lameter 2008-10-07 14:13 ` Peter Zijlstra @ 2008-10-07 15:00 ` Matt Mackall 2008-10-07 15:20 ` Christoph Lameter 2008-10-07 16:10 ` Peter Zijlstra 1 sibling, 2 replies; 27+ messages in thread From: Matt Mackall @ 2008-10-07 15:00 UTC (permalink / raw) To: Christoph Lameter Cc: Peter Zijlstra, linux-mm, Nick Piggin, Linus Torvalds, Ingo Molnar, linux-kernel On Tue, 2008-10-07 at 09:07 -0500, Christoph Lameter wrote: > > Which basically shows us that the content of the pcpu_size[] array got > > corrupted after the krealloc() call in split_block(). > > > > Which made me look at which slab allocator I had selected, which turned > > out to be SLOB (from testing the network swap stuff). > > krealloc() is in generic core code (mm/util.c) and is the same for all allocators. > > krealloc uses ksize() which is somewhat dicey for SLOB because it only works > on kmalloc'ed memory. Is the krealloc used on memory allocated with kmalloc()? > Slob's ksize could use a BUG_ON for the case in which ksize() is used on > kmem_cache_alloc'd memory. > > /* can't use ksize for kmem_cache_alloc memory, only kmalloc */ > size_t ksize(const void *block) > { > struct slob_page *sp; > > BUG_ON(!block); > if (unlikely(block == ZERO_SIZE_PTR)) > return 0; > > sp = (struct slob_page *)virt_to_page(block); > > > Add a BUG_ON(!kmalloc_cache(sp))? We can't dynamically determine whether a pointer points to a kmalloced object or not. kmem_cache_alloc objects have no header and live on the same pages as kmalloced ones. > if (slob_page(sp)) > return ((slob_t *)block - 1)->units + SLOB_UNIT; > ^^^^^^^ Is this correct? The cast? Yes. We want to look at the slob_t object header immediately before the object pointer. But the rest of the statement looks completely broken. If our SLOB object is 3 units (6 bytes), with a usuable size of 4 bytes, the above will report 3 + 2 = 5 bytes. Instead we want (3 - 1) * 2 = 4, something more like: return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; It's a bit amazing no one's hit this before, but I guess that's because for allocations > 6 bytes, it will under-report buffer sizes, avoiding overruns. And for the < 6 byte cases, we've just been getting lucky. Give this a try, please: diff -r 5e32b09a1b2b mm/slob.c --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500 +++ b/mm/slob.c Tue Oct 07 10:00:16 2008 -0500 @@ -515,7 +515,7 @@ sp = (struct slob_page *)virt_to_page(block); if (slob_page(sp)) - return ((slob_t *)block - 1)->units + SLOB_UNIT; + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; else return sp->page.private; } -- Mathematics is the supreme nostalgia of our time. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 15:00 ` Matt Mackall @ 2008-10-07 15:20 ` Christoph Lameter 2008-10-07 15:58 ` Matt Mackall 2008-10-07 16:10 ` Peter Zijlstra 1 sibling, 1 reply; 27+ messages in thread From: Christoph Lameter @ 2008-10-07 15:20 UTC (permalink / raw) To: Matt Mackall Cc: Peter Zijlstra, linux-mm, Nick Piggin, Linus Torvalds, Ingo Molnar, linux-kernel Matt Mackall wrote: > We can't dynamically determine whether a pointer points to a kmalloced > object or not. kmem_cache_alloc objects have no header and live on the > same pages as kmalloced ones. Could you do a heuristic check? Assume that this is a kmalloc object and then verify the values in the small control block? If the values are out of line then this cannot be a kmalloc'ed object. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 15:20 ` Christoph Lameter @ 2008-10-07 15:58 ` Matt Mackall 0 siblings, 0 replies; 27+ messages in thread From: Matt Mackall @ 2008-10-07 15:58 UTC (permalink / raw) To: Christoph Lameter Cc: Peter Zijlstra, linux-mm, Nick Piggin, Linus Torvalds, Ingo Molnar, linux-kernel On Tue, 2008-10-07 at 10:20 -0500, Christoph Lameter wrote: > Matt Mackall wrote: > > > We can't dynamically determine whether a pointer points to a kmalloced > > object or not. kmem_cache_alloc objects have no header and live on the > > same pages as kmalloced ones. > > Could you do a heuristic check? Assume that this is a kmalloc object and then > verify the values in the small control block? If the values are out of line > then this cannot be a kmalloc'ed object. The control block is two bytes, so it doesn't have a lot of redundancy. Best we can do is check that it doesn't claim the object runs off the page. Or, for simplicity, isn't bigger than a page. On 32-bit x86, that's equivalent to checking the top 5 bits of ->units are clear. But it makes more sense to just do the check in SLUB. First, SLUB can actually do the check reliably. Second, someone adding a bogus ksize call to a random piece of kernel code is more likely to be using SLUB when they do it. And third, it doesn't negatively impact SLOB's size. In other words, SLUB is effectively SLOB's debugging switch when it comes to external problems. -- Mathematics is the supreme nostalgia of our time. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 15:00 ` Matt Mackall 2008-10-07 15:20 ` Christoph Lameter @ 2008-10-07 16:10 ` Peter Zijlstra 2008-10-07 16:37 ` Linus Torvalds 2008-10-07 16:37 ` Matt Mackall 1 sibling, 2 replies; 27+ messages in thread From: Peter Zijlstra @ 2008-10-07 16:10 UTC (permalink / raw) To: Matt Mackall Cc: Christoph Lameter, linux-mm, Nick Piggin, Linus Torvalds, Ingo Molnar, linux-kernel On Tue, 2008-10-07 at 10:00 -0500, Matt Mackall wrote: > Give this a try, please: > > diff -r 5e32b09a1b2b mm/slob.c > --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500 > +++ b/mm/slob.c Tue Oct 07 10:00:16 2008 -0500 > @@ -515,7 +515,7 @@ > > sp = (struct slob_page *)virt_to_page(block); > if (slob_page(sp)) > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; > else > return sp->page.private; > } That seems to make it work again! (4 reboots, 0 crashes) Tested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Thanks! -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 16:10 ` Peter Zijlstra @ 2008-10-07 16:37 ` Linus Torvalds 2008-10-07 16:37 ` Matt Mackall 1 sibling, 0 replies; 27+ messages in thread From: Linus Torvalds @ 2008-10-07 16:37 UTC (permalink / raw) To: Peter Zijlstra Cc: Matt Mackall, Christoph Lameter, linux-mm, Nick Piggin, Ingo Molnar, linux-kernel On Tue, 7 Oct 2008, Peter Zijlstra wrote: > > Tested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Heh. Can we get a sign-off and a nice commit message? Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 16:10 ` Peter Zijlstra 2008-10-07 16:37 ` Linus Torvalds @ 2008-10-07 16:37 ` Matt Mackall 2008-10-07 16:57 ` Pekka Enberg 2008-10-07 17:57 ` Linus Torvalds 1 sibling, 2 replies; 27+ messages in thread From: Matt Mackall @ 2008-10-07 16:37 UTC (permalink / raw) To: Peter Zijlstra Cc: Christoph Lameter, linux-mm, Nick Piggin, Linus Torvalds, Ingo Molnar, linux-kernel, akpm, Pekka J Enberg On Tue, 2008-10-07 at 18:10 +0200, Peter Zijlstra wrote: > On Tue, 2008-10-07 at 10:00 -0500, Matt Mackall wrote: > > Give this a try, please: ... > That seems to make it work again! (4 reboots, 0 crashes) Thanks, Peter. I know we're way late in the 2.6.27 cycle, so I'll leave it to Linus and Andrew to decide how to queue this up. I'm obligated to mention it's theoretically possible that there's a path where this is exploitable, but of course only on systems where SLOB is in use. SLOB: fix bogus ksize calculation SLOB's ksize calculation was braindamaged and generally harmlessly underreported the allocation size. But for very small buffers, it could in fact overreport them, leading code depending on krealloc to overrun the allocation and trample other data. Signed-off-by: Matt Mackall <mpm@selenic.com> Tested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> diff -r 5e32b09a1b2b mm/slob.c --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500 +++ b/mm/slob.c Tue Oct 07 11:27:47 2008 -0500 @@ -515,7 +515,7 @@ sp = (struct slob_page *)virt_to_page(block); if (slob_page(sp)) - return ((slob_t *)block - 1)->units + SLOB_UNIT; + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; else return sp->page.private; } -- Mathematics is the supreme nostalgia of our time. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 16:37 ` Matt Mackall @ 2008-10-07 16:57 ` Pekka Enberg 2008-10-07 17:13 ` Matt Mackall 2008-10-07 17:57 ` Linus Torvalds 1 sibling, 1 reply; 27+ messages in thread From: Pekka Enberg @ 2008-10-07 16:57 UTC (permalink / raw) To: Matt Mackall Cc: Peter Zijlstra, Christoph Lameter, linux-mm, Nick Piggin, Linus Torvalds, Ingo Molnar, linux-kernel, akpm Hi Matt, On Tue, Oct 7, 2008 at 7:37 PM, Matt Mackall <mpm@selenic.com> wrote: > SLOB: fix bogus ksize calculation > > SLOB's ksize calculation was braindamaged and generally harmlessly > underreported the allocation size. But for very small buffers, it could > in fact overreport them, leading code depending on krealloc to overrun > the allocation and trample other data. > > Signed-off-by: Matt Mackall <mpm@selenic.com> > Tested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> > > diff -r 5e32b09a1b2b mm/slob.c > --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500 > +++ b/mm/slob.c Tue Oct 07 11:27:47 2008 -0500 > @@ -515,7 +515,7 @@ > > sp = (struct slob_page *)virt_to_page(block); > if (slob_page(sp)) > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; Hmm. I don't understand why we do the "minus one" thing here. Aren't we underestimating the size now? Side note, why aren't we using slob_units() here? > else > return sp->page.private; > } > > > > > > -- > Mathematics is the supreme nostalgia of our time. > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 16:57 ` Pekka Enberg @ 2008-10-07 17:13 ` Matt Mackall 2008-10-07 17:31 ` Pekka Enberg 0 siblings, 1 reply; 27+ messages in thread From: Matt Mackall @ 2008-10-07 17:13 UTC (permalink / raw) To: Pekka Enberg Cc: Peter Zijlstra, Christoph Lameter, linux-mm, Nick Piggin, Linus Torvalds, Ingo Molnar, linux-kernel, akpm On Tue, 2008-10-07 at 19:57 +0300, Pekka Enberg wrote: > Hi Matt, > > On Tue, Oct 7, 2008 at 7:37 PM, Matt Mackall <mpm@selenic.com> wrote: > > SLOB: fix bogus ksize calculation > > > > SLOB's ksize calculation was braindamaged and generally harmlessly > > underreported the allocation size. But for very small buffers, it could > > in fact overreport them, leading code depending on krealloc to overrun > > the allocation and trample other data. > > > > Signed-off-by: Matt Mackall <mpm@selenic.com> > > Tested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> > > > > diff -r 5e32b09a1b2b mm/slob.c > > --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500 > > +++ b/mm/slob.c Tue Oct 07 11:27:47 2008 -0500 > > @@ -515,7 +515,7 @@ > > > > sp = (struct slob_page *)virt_to_page(block); > > if (slob_page(sp)) > > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > > + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; > > Hmm. I don't understand why we do the "minus one" thing here. Aren't > we underestimating the size now? The first -1 takes us to the object header in front of the object pointer. The second -1 subtracts out the size of the header. But it's entirely possible I'm off by one, so I'll double-check. Nick? -- Mathematics is the supreme nostalgia of our time. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 17:13 ` Matt Mackall @ 2008-10-07 17:31 ` Pekka Enberg 2008-10-07 23:08 ` Matt Mackall 0 siblings, 1 reply; 27+ messages in thread From: Pekka Enberg @ 2008-10-07 17:31 UTC (permalink / raw) To: Matt Mackall Cc: Peter Zijlstra, Christoph Lameter, linux-mm, Nick Piggin, Linus Torvalds, Ingo Molnar, linux-kernel, akpm Hi Matt, On Tue, Oct 7, 2008 at 8:13 PM, Matt Mackall <mpm@selenic.com> wrote: >> > @@ -515,7 +515,7 @@ >> > >> > sp = (struct slob_page *)virt_to_page(block); >> > if (slob_page(sp)) >> > - return ((slob_t *)block - 1)->units + SLOB_UNIT; >> > + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; >> >> Hmm. I don't understand why we do the "minus one" thing here. Aren't >> we underestimating the size now? > > The first -1 takes us to the object header in front of the object > pointer. The second -1 subtracts out the size of the header. > > But it's entirely possible I'm off by one, so I'll double-check. Nick? Yeah, I was referring to the second subtraction. Looking at slob_page_alloc(), for example, we compare the return value of slob_units() to SLOB_UNITS(size), so I don't think we count the header in ->units. I mean, we ought to be seeing the subtraction elsewhere in the code as well, no? Pekka -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 17:31 ` Pekka Enberg @ 2008-10-07 23:08 ` Matt Mackall 2008-10-08 4:22 ` Nick Piggin 0 siblings, 1 reply; 27+ messages in thread From: Matt Mackall @ 2008-10-07 23:08 UTC (permalink / raw) To: Pekka Enberg Cc: Peter Zijlstra, Christoph Lameter, linux-mm, Nick Piggin, Linus Torvalds, Ingo Molnar, linux-kernel, akpm On Tue, 2008-10-07 at 20:31 +0300, Pekka Enberg wrote: > Hi Matt, > > On Tue, Oct 7, 2008 at 8:13 PM, Matt Mackall <mpm@selenic.com> wrote: > >> > @@ -515,7 +515,7 @@ > >> > > >> > sp = (struct slob_page *)virt_to_page(block); > >> > if (slob_page(sp)) > >> > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > >> > + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; > >> > >> Hmm. I don't understand why we do the "minus one" thing here. Aren't > >> we underestimating the size now? > > > > The first -1 takes us to the object header in front of the object > > pointer. The second -1 subtracts out the size of the header. > > > > But it's entirely possible I'm off by one, so I'll double-check. Nick? > > Yeah, I was referring to the second subtraction. Looking at > slob_page_alloc(), for example, we compare the return value of > slob_units() to SLOB_UNITS(size), so I don't think we count the header > in ->units. I mean, we ought to be seeing the subtraction elsewhere in > the code as well, no? Ok, I've looked a bit closer at it and I think we need a different fix. The underlying allocator, slob_alloc, takes a size in bytes and returns an object of that size, with the first word containing the number of slob_t units. kmalloc calls slob_alloc after adding on some space for header and architecture padding. This space is not necessarily 1 slob unit: unsigned int *m; int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); ... m = slob_alloc(size + align, gfp, align, node); *m = size; return (void *)m + align; Note that we overwrite the header with our own size -in bytes-. kfree does the reverse: int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); unsigned int *m = (unsigned int *)(block - align); slob_free(m, *m + align); That second line is locating the kmalloc header. All looks good. The MINALIGN business was introduced by Nick with: slob: improved alignment handling but seems to have missed ksize, which should now be doing the following to match: diff -r 5e32b09a1b2b mm/slob.c --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500 +++ b/mm/slob.c Tue Oct 07 18:05:15 2008 -0500 @@ -514,9 +514,11 @@ return 0; sp = (struct slob_page *)virt_to_page(block); - if (slob_page(sp)) - return ((slob_t *)block - 1)->units + SLOB_UNIT; - else + if (slob_page(sp)) { + int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); + unsigned int *m = (unsigned int *)(block - align); + return SLOB_UNITS(*m); /* round up */ + } else return sp->page.private; } That leaves the question of why this morning's patch worked at all, given that it was based on how SLOB worked before Nick's patch. But I haven't finished working through that. Peter, can I get you to test the above? -- Mathematics is the supreme nostalgia of our time. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 23:08 ` Matt Mackall @ 2008-10-08 4:22 ` Nick Piggin 2008-10-08 4:46 ` Matt Mackall 0 siblings, 1 reply; 27+ messages in thread From: Nick Piggin @ 2008-10-08 4:22 UTC (permalink / raw) To: Matt Mackall Cc: Pekka Enberg, Peter Zijlstra, Christoph Lameter, linux-mm, Linus Torvalds, Ingo Molnar, linux-kernel, akpm On Wednesday 08 October 2008 10:08, Matt Mackall wrote: > On Tue, 2008-10-07 at 20:31 +0300, Pekka Enberg wrote: > > Hi Matt, > > > > On Tue, Oct 7, 2008 at 8:13 PM, Matt Mackall <mpm@selenic.com> wrote: > > >> > @@ -515,7 +515,7 @@ > > >> > > > >> > sp = (struct slob_page *)virt_to_page(block); > > >> > if (slob_page(sp)) > > >> > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > > >> > + return (((slob_t *)block - 1)->units - 1) * > > >> > SLOB_UNIT; > > >> > > >> Hmm. I don't understand why we do the "minus one" thing here. Aren't > > >> we underestimating the size now? > > > > > > The first -1 takes us to the object header in front of the object > > > pointer. The second -1 subtracts out the size of the header. > > > > > > But it's entirely possible I'm off by one, so I'll double-check. Nick? > > > > Yeah, I was referring to the second subtraction. Looking at > > slob_page_alloc(), for example, we compare the return value of > > slob_units() to SLOB_UNITS(size), so I don't think we count the header > > in ->units. I mean, we ought to be seeing the subtraction elsewhere in > > the code as well, no? > > Ok, I've looked a bit closer at it and I think we need a different fix. > > The underlying allocator, slob_alloc, takes a size in bytes and returns > an object of that size, with the first word containing the number of > slob_t units. > > kmalloc calls slob_alloc after adding on some space for header and > architecture padding. This space is not necessarily 1 slob unit: > > unsigned int *m; > int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); > ... > m = slob_alloc(size + align, gfp, align, node); > *m = size; > return (void *)m + align; > > Note that we overwrite the header with our own size -in bytes-. > kfree does the reverse: Right. > int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); > unsigned int *m = (unsigned int *)(block - align); > slob_free(m, *m + align); > > That second line is locating the kmalloc header. All looks good. > > The MINALIGN business was introduced by Nick with: > > slob: improved alignment handling > > but seems to have missed ksize, which should now be doing the following > to match: > > diff -r 5e32b09a1b2b mm/slob.c > --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500 > +++ b/mm/slob.c Tue Oct 07 18:05:15 2008 -0500 > @@ -514,9 +514,11 @@ > return 0; > > sp = (struct slob_page *)virt_to_page(block); > - if (slob_page(sp)) > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > - else > + if (slob_page(sp)) { > + int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); > + unsigned int *m = (unsigned int *)(block - align); > + return SLOB_UNITS(*m); /* round up */ > + } else > return sp->page.private; > } Yes, I came up with nearly the same patch before reading this --- linux-2.6/mm/slob.c 2008-10-08 14:43:17.000000000 +1100 +++ suth/mm/slob.c 2008-10-08 15:11:06.000000000 +1100 @@ -514,9 +514,11 @@ size_t ksize(const void *block) return 0; sp = (struct slob_page *)virt_to_page(block); - if (slob_page(sp)) - return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; - else + if (slob_page(sp)) { + int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); + unsigned int *m = (unsigned int *)(block - align); + return *m + align; + } else return sp->page.private; } However, mine is lifted directly from kfree, wheras you do something a bit different. Hmm, ksize arguably could be used to find the underlying allocated slab size in order to use a little bit more than we'd asked for. So probably we should really just `return *m` (don't round up or add any padding). > That leaves the question of why this morning's patch worked at all, > given that it was based on how SLOB worked before Nick's patch. But I > haven't finished working through that. Peter, can I get you to test the > above? I didn't have ksize in my slob user test harness, but added a couple of tests in there, and indeed ksize was returning complete garbage both before and after the latest patch to slob. I'd say it was simply luck. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-08 4:22 ` Nick Piggin @ 2008-10-08 4:46 ` Matt Mackall 2008-10-08 4:54 ` Nick Piggin 0 siblings, 1 reply; 27+ messages in thread From: Matt Mackall @ 2008-10-08 4:46 UTC (permalink / raw) To: Nick Piggin Cc: Pekka Enberg, Peter Zijlstra, Christoph Lameter, linux-mm, Linus Torvalds, Ingo Molnar, linux-kernel, akpm On Wed, 2008-10-08 at 15:22 +1100, Nick Piggin wrote: > On Wednesday 08 October 2008 10:08, Matt Mackall wrote: > > On Tue, 2008-10-07 at 20:31 +0300, Pekka Enberg wrote: > > > Hi Matt, > > > > > > On Tue, Oct 7, 2008 at 8:13 PM, Matt Mackall <mpm@selenic.com> wrote: > > > >> > @@ -515,7 +515,7 @@ > > > >> > > > > >> > sp = (struct slob_page *)virt_to_page(block); > > > >> > if (slob_page(sp)) > > > >> > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > > > >> > + return (((slob_t *)block - 1)->units - 1) * > > > >> > SLOB_UNIT; > > > >> > > > >> Hmm. I don't understand why we do the "minus one" thing here. Aren't > > > >> we underestimating the size now? > > > > > > > > The first -1 takes us to the object header in front of the object > > > > pointer. The second -1 subtracts out the size of the header. > > > > > > > > But it's entirely possible I'm off by one, so I'll double-check. Nick? > > > > > > Yeah, I was referring to the second subtraction. Looking at > > > slob_page_alloc(), for example, we compare the return value of > > > slob_units() to SLOB_UNITS(size), so I don't think we count the header > > > in ->units. I mean, we ought to be seeing the subtraction elsewhere in > > > the code as well, no? > > > > Ok, I've looked a bit closer at it and I think we need a different fix. > > > > The underlying allocator, slob_alloc, takes a size in bytes and returns > > an object of that size, with the first word containing the number of > > slob_t units. > > > > kmalloc calls slob_alloc after adding on some space for header and > > architecture padding. This space is not necessarily 1 slob unit: > > > > unsigned int *m; > > int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); > > ... > > m = slob_alloc(size + align, gfp, align, node); > > *m = size; > > return (void *)m + align; > > > > Note that we overwrite the header with our own size -in bytes-. > > kfree does the reverse: > > Right. > > > int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); > > unsigned int *m = (unsigned int *)(block - align); > > slob_free(m, *m + align); > > > > That second line is locating the kmalloc header. All looks good. > > > > The MINALIGN business was introduced by Nick with: > > > > slob: improved alignment handling > > > > but seems to have missed ksize, which should now be doing the following > > to match: > > > > diff -r 5e32b09a1b2b mm/slob.c > > --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500 > > +++ b/mm/slob.c Tue Oct 07 18:05:15 2008 -0500 > > @@ -514,9 +514,11 @@ > > return 0; > > > > sp = (struct slob_page *)virt_to_page(block); > > - if (slob_page(sp)) > > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > > - else > > + if (slob_page(sp)) { > > + int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); > > + unsigned int *m = (unsigned int *)(block - align); > > + return SLOB_UNITS(*m); /* round up */ > > + } else > > return sp->page.private; > > } > > Yes, I came up with nearly the same patch before reading this > > --- linux-2.6/mm/slob.c 2008-10-08 14:43:17.000000000 +1100 > +++ suth/mm/slob.c 2008-10-08 15:11:06.000000000 +1100 > @@ -514,9 +514,11 @@ size_t ksize(const void *block) > return 0; > > sp = (struct slob_page *)virt_to_page(block); > - if (slob_page(sp)) > - return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; > - else > + if (slob_page(sp)) { > + int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); > + unsigned int *m = (unsigned int *)(block - align); > + return *m + align; > + } else > return sp->page.private; > } > > However, mine is lifted directly from kfree, wheras you do something a > bit different. Hmm, ksize arguably could be used to find the underlying > allocated slab size in order to use a little bit more than we'd asked > for. So probably we should really just `return *m` (don't round up or > add any padding). Huh? ksize should report how much space is available in the buffer. If we request 33 bytes from SLUB and it gives us 64, ksize reports 64. If we request 33 bytes from SLOB and it gives us 34, we should report 34. > > That leaves the question of why this morning's patch worked at all, > > given that it was based on how SLOB worked before Nick's patch. But I > > haven't finished working through that. Peter, can I get you to test the > > above? > > I didn't have ksize in my slob user test harness, but added a couple of > tests in there, and indeed ksize was returning complete garbage both > before and after the latest patch to slob. I'd say it was simply luck. I was going to dig your harness up this morning and realized it was on my dead laptop. Send me another copy? -- Mathematics is the supreme nostalgia of our time. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-08 4:46 ` Matt Mackall @ 2008-10-08 4:54 ` Nick Piggin 2008-10-08 5:11 ` Nick Piggin 0 siblings, 1 reply; 27+ messages in thread From: Nick Piggin @ 2008-10-08 4:54 UTC (permalink / raw) To: Matt Mackall Cc: Pekka Enberg, Peter Zijlstra, Christoph Lameter, linux-mm, Linus Torvalds, Ingo Molnar, linux-kernel, akpm On Wednesday 08 October 2008 15:46, Matt Mackall wrote: > On Wed, 2008-10-08 at 15:22 +1100, Nick Piggin wrote: > > On Wednesday 08 October 2008 10:08, Matt Mackall wrote: > > > On Tue, 2008-10-07 at 20:31 +0300, Pekka Enberg wrote: > > > > Hi Matt, > > > > > > > > On Tue, Oct 7, 2008 at 8:13 PM, Matt Mackall <mpm@selenic.com> wrote: > > > > >> > @@ -515,7 +515,7 @@ > > > > >> > > > > > >> > sp = (struct slob_page *)virt_to_page(block); > > > > >> > if (slob_page(sp)) > > > > >> > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > > > > >> > + return (((slob_t *)block - 1)->units - 1) * > > > > >> > SLOB_UNIT; > > > > >> > > > > >> Hmm. I don't understand why we do the "minus one" thing here. > > > > >> Aren't we underestimating the size now? > > > > > > > > > > The first -1 takes us to the object header in front of the object > > > > > pointer. The second -1 subtracts out the size of the header. > > > > > > > > > > But it's entirely possible I'm off by one, so I'll double-check. > > > > > Nick? > > > > > > > > Yeah, I was referring to the second subtraction. Looking at > > > > slob_page_alloc(), for example, we compare the return value of > > > > slob_units() to SLOB_UNITS(size), so I don't think we count the > > > > header in ->units. I mean, we ought to be seeing the subtraction > > > > elsewhere in the code as well, no? > > > > > > Ok, I've looked a bit closer at it and I think we need a different fix. > > > > > > The underlying allocator, slob_alloc, takes a size in bytes and returns > > > an object of that size, with the first word containing the number of > > > slob_t units. > > > > > > kmalloc calls slob_alloc after adding on some space for header and > > > architecture padding. This space is not necessarily 1 slob unit: > > > > > > unsigned int *m; > > > int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); > > > ... > > > m = slob_alloc(size + align, gfp, align, node); > > > *m = size; > > > return (void *)m + align; > > > > > > Note that we overwrite the header with our own size -in bytes-. > > > kfree does the reverse: > > > > Right. > > > > > int align = max(ARCH_KMALLOC_MINALIGN, > > > ARCH_SLAB_MINALIGN); unsigned int *m = (unsigned int *)(block - align); > > > slob_free(m, *m + align); > > > > > > That second line is locating the kmalloc header. All looks good. > > > > > > The MINALIGN business was introduced by Nick with: > > > > > > slob: improved alignment handling > > > > > > but seems to have missed ksize, which should now be doing the following > > > to match: > > > > > > diff -r 5e32b09a1b2b mm/slob.c > > > --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500 > > > +++ b/mm/slob.c Tue Oct 07 18:05:15 2008 -0500 > > > @@ -514,9 +514,11 @@ > > > return 0; > > > > > > sp = (struct slob_page *)virt_to_page(block); > > > - if (slob_page(sp)) > > > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > > > - else > > > + if (slob_page(sp)) { > > > + int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); > > > + unsigned int *m = (unsigned int *)(block - align); > > > + return SLOB_UNITS(*m); /* round up */ > > > + } else > > > return sp->page.private; > > > } > > > > Yes, I came up with nearly the same patch before reading this > > > > --- linux-2.6/mm/slob.c 2008-10-08 14:43:17.000000000 +1100 > > +++ suth/mm/slob.c 2008-10-08 15:11:06.000000000 +1100 > > @@ -514,9 +514,11 @@ size_t ksize(const void *block) > > return 0; > > > > sp = (struct slob_page *)virt_to_page(block); > > - if (slob_page(sp)) > > - return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; > > - else > > + if (slob_page(sp)) { > > + int align = max(ARCH_KMALLOC_MINALIGN, > > ARCH_SLAB_MINALIGN); + unsigned int *m = (unsigned int > > *)(block - align); + return *m + align; > > + } else > > return sp->page.private; > > } > > > > However, mine is lifted directly from kfree, wheras you do something a > > bit different. Hmm, ksize arguably could be used to find the underlying > > allocated slab size in order to use a little bit more than we'd asked > > for. So probably we should really just `return *m` (don't round up or > > add any padding). > > Huh? ksize should report how much space is available in the buffer. If > we request 33 bytes from SLUB and it gives us 64, ksize reports 64. If > we request 33 bytes from SLOB and it gives us 34, we should report 34. Oh.. hmm yeah right, I didn't realise what you were doing there. OK, so your patch looks good to me then (provided it is diffed against the previous one, for Linus). > > > That leaves the question of why this morning's patch worked at all, > > > given that it was based on how SLOB worked before Nick's patch. But I > > > haven't finished working through that. Peter, can I get you to test the > > > above? > > > > I didn't have ksize in my slob user test harness, but added a couple of > > tests in there, and indeed ksize was returning complete garbage both > > before and after the latest patch to slob. I'd say it was simply luck. > > I was going to dig your harness up this morning and realized it was on > my dead laptop. Send me another copy? OK, off-list. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-08 4:54 ` Nick Piggin @ 2008-10-08 5:11 ` Nick Piggin 2008-10-08 5:15 ` Matt Mackall 0 siblings, 1 reply; 27+ messages in thread From: Nick Piggin @ 2008-10-08 5:11 UTC (permalink / raw) To: Matt Mackall Cc: Pekka Enberg, Peter Zijlstra, Christoph Lameter, linux-mm, Linus Torvalds, Ingo Molnar, linux-kernel, akpm On Wednesday 08 October 2008 15:54, Nick Piggin wrote: > On Wednesday 08 October 2008 15:46, Matt Mackall wrote: > > On Wed, 2008-10-08 at 15:22 +1100, Nick Piggin wrote: > > > On Wednesday 08 October 2008 10:08, Matt Mackall wrote: > > > > diff -r 5e32b09a1b2b mm/slob.c > > > > --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500 > > > > +++ b/mm/slob.c Tue Oct 07 18:05:15 2008 -0500 > > > > @@ -514,9 +514,11 @@ > > > > return 0; > > > > > > > > sp = (struct slob_page *)virt_to_page(block); > > > > - if (slob_page(sp)) > > > > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > > > > - else > > > > + if (slob_page(sp)) { > > > > + int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); > > > > + unsigned int *m = (unsigned int *)(block - align); > > > > + return SLOB_UNITS(*m); /* round up */ > > > > + } else > > > > return sp->page.private; > > > > } > > > > > > Yes, I came up with nearly the same patch before reading this > > > > > > --- linux-2.6/mm/slob.c 2008-10-08 14:43:17.000000000 +1100 > > > +++ suth/mm/slob.c 2008-10-08 15:11:06.000000000 +1100 > > > @@ -514,9 +514,11 @@ size_t ksize(const void *block) > > > return 0; > > > > > > sp = (struct slob_page *)virt_to_page(block); > > > - if (slob_page(sp)) > > > - return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; > > > - else > > > + if (slob_page(sp)) { > > > + int align = max(ARCH_KMALLOC_MINALIGN, > > > ARCH_SLAB_MINALIGN); + unsigned int *m = (unsigned int > > > *)(block - align); + return *m + align; > > > + } else > > > return sp->page.private; > > > } > > > > > > However, mine is lifted directly from kfree, wheras you do something a > > > bit different. Hmm, ksize arguably could be used to find the underlying > > > allocated slab size in order to use a little bit more than we'd asked > > > for. So probably we should really just `return *m` (don't round up or > > > add any padding). > > > > Huh? ksize should report how much space is available in the buffer. If > > we request 33 bytes from SLUB and it gives us 64, ksize reports 64. If > > we request 33 bytes from SLOB and it gives us 34, we should report 34. > > Oh.. hmm yeah right, I didn't realise what you were doing there. > OK, so your patch looks good to me then (provided it is diffed against > the previous one, for Linus). OK, no, that's why I got confused. SLOB_UNITS will round you up to the next SLOB_UNIT. You'd then have to multiply by SLOB_UNIT to get back to bytes. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-08 5:11 ` Nick Piggin @ 2008-10-08 5:15 ` Matt Mackall 2008-10-08 6:43 ` Peter Zijlstra 0 siblings, 1 reply; 27+ messages in thread From: Matt Mackall @ 2008-10-08 5:15 UTC (permalink / raw) To: Nick Piggin Cc: Pekka Enberg, Peter Zijlstra, Christoph Lameter, linux-mm, Linus Torvalds, Ingo Molnar, linux-kernel, akpm On Wed, 2008-10-08 at 16:11 +1100, Nick Piggin wrote: > On Wednesday 08 October 2008 15:54, Nick Piggin wrote: > > On Wednesday 08 October 2008 15:46, Matt Mackall wrote: > > > On Wed, 2008-10-08 at 15:22 +1100, Nick Piggin wrote: > > > > On Wednesday 08 October 2008 10:08, Matt Mackall wrote: > > > > > > diff -r 5e32b09a1b2b mm/slob.c > > > > > --- a/mm/slob.c Fri Oct 03 14:04:43 2008 -0500 > > > > > +++ b/mm/slob.c Tue Oct 07 18:05:15 2008 -0500 > > > > > @@ -514,9 +514,11 @@ > > > > > return 0; > > > > > > > > > > sp = (struct slob_page *)virt_to_page(block); > > > > > - if (slob_page(sp)) > > > > > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > > > > > - else > > > > > + if (slob_page(sp)) { > > > > > + int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); > > > > > + unsigned int *m = (unsigned int *)(block - align); > > > > > + return SLOB_UNITS(*m); /* round up */ > > > > > + } else > > > > > return sp->page.private; > > > > > } > > > > > > > > Yes, I came up with nearly the same patch before reading this > > > > > > > > --- linux-2.6/mm/slob.c 2008-10-08 14:43:17.000000000 +1100 > > > > +++ suth/mm/slob.c 2008-10-08 15:11:06.000000000 +1100 > > > > @@ -514,9 +514,11 @@ size_t ksize(const void *block) > > > > return 0; > > > > > > > > sp = (struct slob_page *)virt_to_page(block); > > > > - if (slob_page(sp)) > > > > - return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; > > > > - else > > > > + if (slob_page(sp)) { > > > > + int align = max(ARCH_KMALLOC_MINALIGN, > > > > ARCH_SLAB_MINALIGN); + unsigned int *m = (unsigned int > > > > *)(block - align); + return *m + align; > > > > + } else > > > > return sp->page.private; > > > > } > > > > > > > > However, mine is lifted directly from kfree, wheras you do something a > > > > bit different. Hmm, ksize arguably could be used to find the underlying > > > > allocated slab size in order to use a little bit more than we'd asked > > > > for. So probably we should really just `return *m` (don't round up or > > > > add any padding). > > > > > > Huh? ksize should report how much space is available in the buffer. If > > > we request 33 bytes from SLUB and it gives us 64, ksize reports 64. If > > > we request 33 bytes from SLOB and it gives us 34, we should report 34. > > > > Oh.. hmm yeah right, I didn't realise what you were doing there. > > OK, so your patch looks good to me then (provided it is diffed against > > the previous one, for Linus). > > OK, no, that's why I got confused. SLOB_UNITS will round you up to the > next SLOB_UNIT. You'd then have to multiply by SLOB_UNIT to get back to > bytes. Damnit, how many ways can we get confused by these little details? I'll spin a final version and run it against the test harness shortly. -- Mathematics is the supreme nostalgia of our time. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-08 5:15 ` Matt Mackall @ 2008-10-08 6:43 ` Peter Zijlstra 2008-10-08 7:25 ` Pekka Enberg 0 siblings, 1 reply; 27+ messages in thread From: Peter Zijlstra @ 2008-10-08 6:43 UTC (permalink / raw) To: Matt Mackall Cc: Nick Piggin, Pekka Enberg, Christoph Lameter, linux-mm, Linus Torvalds, Ingo Molnar, linux-kernel, akpm On Wed, 2008-10-08 at 00:15 -0500, Matt Mackall wrote: > Damnit, how many ways can we get confused by these little details? I'll > spin a final version and run it against the test harness shortly. So I'll wait with testing for the next version? -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-08 6:43 ` Peter Zijlstra @ 2008-10-08 7:25 ` Pekka Enberg 2008-10-08 7:37 ` Peter Zijlstra 0 siblings, 1 reply; 27+ messages in thread From: Pekka Enberg @ 2008-10-08 7:25 UTC (permalink / raw) To: Peter Zijlstra Cc: Matt Mackall, Nick Piggin, Christoph Lameter, linux-mm, Linus Torvalds, Ingo Molnar, linux-kernel, akpm Hi Peter, On Wed, 2008-10-08 at 00:15 -0500, Matt Mackall wrote: >> Damnit, how many ways can we get confused by these little details? I'll >> spin a final version and run it against the test harness shortly. On Wed, Oct 8, 2008 at 9:43 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > So I'll wait with testing for the next version? AFAICT, this should be fine: http://lkml.org/lkml/2008/10/7/436 You need to revert the patch Linus already committed before applying that though. Pekka -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-08 7:25 ` Pekka Enberg @ 2008-10-08 7:37 ` Peter Zijlstra 2008-10-08 7:39 ` Pekka Enberg 0 siblings, 1 reply; 27+ messages in thread From: Peter Zijlstra @ 2008-10-08 7:37 UTC (permalink / raw) To: Pekka Enberg Cc: Matt Mackall, Nick Piggin, Christoph Lameter, linux-mm, Linus Torvalds, Ingo Molnar, linux-kernel, akpm On Wed, 2008-10-08 at 10:25 +0300, Pekka Enberg wrote: > Hi Peter, > > On Wed, 2008-10-08 at 00:15 -0500, Matt Mackall wrote: > >> Damnit, how many ways can we get confused by these little details? I'll > >> spin a final version and run it against the test harness shortly. > > On Wed, Oct 8, 2008 at 9:43 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > So I'll wait with testing for the next version? > > AFAICT, this should be fine: > > http://lkml.org/lkml/2008/10/7/436 > > You need to revert the patch Linus already committed before applying > that though. Or not have pulled it yet ;-) Anyway, that patch seems to be good on 4 boots. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-08 7:37 ` Peter Zijlstra @ 2008-10-08 7:39 ` Pekka Enberg 0 siblings, 0 replies; 27+ messages in thread From: Pekka Enberg @ 2008-10-08 7:39 UTC (permalink / raw) To: Peter Zijlstra Cc: Matt Mackall, Nick Piggin, Christoph Lameter, linux-mm, Linus Torvalds, Ingo Molnar, linux-kernel, akpm On Wed, 2008-10-08 at 09:37 +0200, Peter Zijlstra wrote: > On Wed, 2008-10-08 at 10:25 +0300, Pekka Enberg wrote: > > Hi Peter, > > > > On Wed, 2008-10-08 at 00:15 -0500, Matt Mackall wrote: > > >> Damnit, how many ways can we get confused by these little details? I'll > > >> spin a final version and run it against the test harness shortly. > > > > On Wed, Oct 8, 2008 at 9:43 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > > So I'll wait with testing for the next version? > > > > AFAICT, this should be fine: > > > > http://lkml.org/lkml/2008/10/7/436 > > > > You need to revert the patch Linus already committed before applying > > that though. > > Or not have pulled it yet ;-) > > Anyway, that patch seems to be good on 4 boots. Thanks for testing, Peter! Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 16:37 ` Matt Mackall 2008-10-07 16:57 ` Pekka Enberg @ 2008-10-07 17:57 ` Linus Torvalds 2008-10-07 18:11 ` Peter Zijlstra 1 sibling, 1 reply; 27+ messages in thread From: Linus Torvalds @ 2008-10-07 17:57 UTC (permalink / raw) To: Matt Mackall Cc: Peter Zijlstra, Christoph Lameter, linux-mm, Nick Piggin, Ingo Molnar, linux-kernel, akpm, Pekka J Enberg On Tue, 7 Oct 2008, Matt Mackall wrote: > > Thanks, Peter. I know we're way late in the 2.6.27 cycle, so I'll leave > it to Linus and Andrew to decide how to queue this up. Well, since it seems to be clearly broken without it, I'd take it, but now I'm kind of waiting for the resolution on whether that second "-1" is correct or not. >From a quick look at mm/slob.c I see Pekka's point that ->units does look like the real size in units, not the "size plus header", and that the second -1 may be bogus. But I don't know the code. Peter - can you check with that > if (slob_page(sp)) > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; thing using - return ((slob_t *)block - 1)->units + SLOB_UNIT; + return ((slob_t *)block - 1)->units * SLOB_UNIT; instead? Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 17:57 ` Linus Torvalds @ 2008-10-07 18:11 ` Peter Zijlstra 2008-10-07 18:18 ` Linus Torvalds 0 siblings, 1 reply; 27+ messages in thread From: Peter Zijlstra @ 2008-10-07 18:11 UTC (permalink / raw) To: Linus Torvalds Cc: Matt Mackall, Christoph Lameter, linux-mm, Nick Piggin, Ingo Molnar, linux-kernel, akpm, Pekka J Enberg On Tue, 2008-10-07 at 10:57 -0700, Linus Torvalds wrote: > Peter - can you check with that > > > if (slob_page(sp)) > > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > > + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; > > thing using > > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > + return ((slob_t *)block - 1)->units * SLOB_UNIT; > > instead? went splat on the second run... -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 18:11 ` Peter Zijlstra @ 2008-10-07 18:18 ` Linus Torvalds 2008-10-08 19:51 ` Matt Mackall 0 siblings, 1 reply; 27+ messages in thread From: Linus Torvalds @ 2008-10-07 18:18 UTC (permalink / raw) To: Peter Zijlstra Cc: Matt Mackall, Christoph Lameter, linux-mm, Nick Piggin, Ingo Molnar, linux-kernel, akpm, Pekka J Enberg On Tue, 7 Oct 2008, Peter Zijlstra wrote: > On Tue, 2008-10-07 at 10:57 -0700, Linus Torvalds wrote: > > > Peter - can you check with that > > > > > if (slob_page(sp)) > > > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > > > + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; > > > > thing using > > > > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > > + return ((slob_t *)block - 1)->units * SLOB_UNIT; > > > > instead? > > went splat on the second run... Well, that makes it simple. I'll take Matt's patch as being "tested", and somebody can hopefully explain where the extra unit comes from later. Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] SLOB's krealloc() seems bust 2008-10-07 18:18 ` Linus Torvalds @ 2008-10-08 19:51 ` Matt Mackall 0 siblings, 0 replies; 27+ messages in thread From: Matt Mackall @ 2008-10-08 19:51 UTC (permalink / raw) To: Linus Torvalds Cc: Peter Zijlstra, Christoph Lameter, linux-mm, Nick Piggin, Ingo Molnar, linux-kernel, akpm, Pekka J Enberg On Tue, 2008-10-07 at 11:18 -0700, Linus Torvalds wrote: > > On Tue, 7 Oct 2008, Peter Zijlstra wrote: > > > On Tue, 2008-10-07 at 10:57 -0700, Linus Torvalds wrote: > > > > > Peter - can you check with that > > > > > > > if (slob_page(sp)) > > > > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > > > > + return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; > > > > > > thing using > > > > > > - return ((slob_t *)block - 1)->units + SLOB_UNIT; > > > + return ((slob_t *)block - 1)->units * SLOB_UNIT; > > > > > > instead? > > > > went splat on the second run... > > Well, that makes it simple. I'll take Matt's patch as being "tested", and > somebody can hopefully explain where the extra unit comes from later. Ok, I think we've gotten to the bottom of this. Here's an incremental patch that doesn't work by dumb luck. Please apply. SLOB: fix bogus ksize calculation fix This fixes the previous fix, which was completely wrong on closer inspection. This version has been manually tested with a user-space test harness and generates sane values. A nearly identical patch has been boot-tested. The problem arose from changing how kmalloc/kfree handled alignment padding without updating ksize to match. This brings it in sync. Signed-off-by: Matt Mackall <mpm@selenic.com> diff -r 3dd2424d4c32 -r 73d55a1b6c10 mm/slob.c --- a/mm/slob.c Tue Oct 07 23:00:11 2008 +0000 +++ b/mm/slob.c Wed Oct 08 14:48:45 2008 -0500 @@ -514,9 +514,11 @@ return 0; sp = (struct slob_page *)virt_to_page(block); - if (slob_page(sp)) - return (((slob_t *)block - 1)->units - 1) * SLOB_UNIT; - else + if (slob_page(sp)) { + int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); + unsigned int *m = (unsigned int *)(block - align); + return SLOB_UNITS(*m) * SLOB_UNIT; + } else return sp->page.private; } -- Mathematics is the supreme nostalgia of our time. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2008-10-08 19:51 UTC | newest] Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2008-10-07 13:57 [BUG] SLOB's krealloc() seems bust Peter Zijlstra 2008-10-07 14:07 ` Christoph Lameter 2008-10-07 14:13 ` Peter Zijlstra 2008-10-07 14:26 ` Christoph Lameter 2008-10-07 15:00 ` Matt Mackall 2008-10-07 15:20 ` Christoph Lameter 2008-10-07 15:58 ` Matt Mackall 2008-10-07 16:10 ` Peter Zijlstra 2008-10-07 16:37 ` Linus Torvalds 2008-10-07 16:37 ` Matt Mackall 2008-10-07 16:57 ` Pekka Enberg 2008-10-07 17:13 ` Matt Mackall 2008-10-07 17:31 ` Pekka Enberg 2008-10-07 23:08 ` Matt Mackall 2008-10-08 4:22 ` Nick Piggin 2008-10-08 4:46 ` Matt Mackall 2008-10-08 4:54 ` Nick Piggin 2008-10-08 5:11 ` Nick Piggin 2008-10-08 5:15 ` Matt Mackall 2008-10-08 6:43 ` Peter Zijlstra 2008-10-08 7:25 ` Pekka Enberg 2008-10-08 7:37 ` Peter Zijlstra 2008-10-08 7:39 ` Pekka Enberg 2008-10-07 17:57 ` Linus Torvalds 2008-10-07 18:11 ` Peter Zijlstra 2008-10-07 18:18 ` Linus Torvalds 2008-10-08 19:51 ` Matt Mackall
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox