* unaligned accesses in SLAB etc.
From: David Miller @ 2014-10-12  2:15 UTC
To: linux-kernel
Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm

I'm getting tons of the following on sparc64:

[603965.383447] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0
[603965.396987] Kernel unaligned access at TPC[546b60] free_block+0xa0/0x1a0
[603965.410523] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0
[603965.424061] Kernel unaligned access at TPC[546b60] free_block+0xa0/0x1a0
[603965.437617] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0
[603970.554394] log_unaligned: 333 callbacks suppressed
[603970.564041] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0
[603970.577576] Kernel unaligned access at TPC[546b60] free_block+0xa0/0x1a0
[603970.591122] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0
[603970.604669] Kernel unaligned access at TPC[546b60] free_block+0xa0/0x1a0
[603970.618216] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0
[603976.515633] log_unaligned: 31 callbacks suppressed
[603976.525092] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
[603976.540196] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
[603976.555308] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
[603976.570411] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
[603976.585526] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
[603982.476424] log_unaligned: 43 callbacks suppressed
[603982.485881] Kernel unaligned access at TPC[549378] kmem_cache_alloc+0xd8/0x1e0
[603982.501590] Kernel unaligned access at TPC[5470a8] kmem_cache_free+0xc8/0x200
[603982.501605] Kernel unaligned access at TPC[549378] kmem_cache_alloc+0xd8/0x1e0
[603982.530382] Kernel unaligned access at TPC[5470a8] kmem_cache_free+0xc8/0x200
[603982.544820] Kernel unaligned access at TPC[549378] kmem_cache_alloc+0xd8/0x1e0
[603987.567130] log_unaligned: 11 callbacks suppressed
[603987.576582] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
[603987.591696] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
[603987.606811] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
[603987.621904] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
[603987.637017] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body
to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: unaligned accesses in SLAB etc.
From: David Miller @ 2014-10-12 17:20 UTC
To: linux-kernel
Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, mroos, sparclinux

From: David Miller <davem@davemloft.net>
Date: Sat, 11 Oct 2014 22:15:10 -0400 (EDT)

>
> I'm getting tons of the following on sparc64:
>
> [603965.383447] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0
> [603965.396987] Kernel unaligned access at TPC[546b60] free_block+0xa0/0x1a0
> [603965.410523] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0

The unaligned accesses are happening in the SLAB_OBJ_PFMEMALLOC code,
which assumes that all object pointers are "unsigned long" aligned:

static inline void set_obj_pfmemalloc(void **objp)
{
	*objp = (void *)((unsigned long)*objp | SLAB_OBJ_PFMEMALLOC);
	return;
}

etc. etc.

But that code has been there working forever.  Something changed
recently such that this assumption no longer holds.

In all of the cases, the address is 4-byte aligned but not 8-byte
aligned.  And they are vmalloc addresses.

Which made me suspect the percpu commit:

====================
commit bf0dea23a9c094ae869a88bb694fbe966671bf6d
Author: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Date:   Thu Oct 9 15:26:27 2014 -0700

    mm/slab: use percpu allocator for cpu cache
====================

And indeed, reverting this commit fixes the problem.

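To make the alignment assumption in set_obj_pfmemalloc() concrete, here is a minimal userspace sketch, not the kernel code. It tags the low bit of a pointer stored in a slot and checks whether a slot address is suitably aligned for a pointer-sized load and store. The names OBJ_PFMEMALLOC_FLAG, set_obj_flag(), clear_obj_flag(), obj_flag_is_set() and is_aligned() are hypothetical, and the flag value 1 is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for SLAB_OBJ_PFMEMALLOC; the value 1 is assumed. */
#define OBJ_PFMEMALLOC_FLAG ((uintptr_t)1)

/* Set the flag bit in the pointer stored in the slot *objp.
 * The read-modify-write of *objp is a pointer-sized access, so the
 * slot itself must be pointer-aligned on strict-alignment CPUs. */
static inline void set_obj_flag(void **objp)
{
	*objp = (void *)((uintptr_t)*objp | OBJ_PFMEMALLOC_FLAG);
}

/* Clear the flag bit again, recovering the original pointer. */
static inline void clear_obj_flag(void **objp)
{
	*objp = (void *)((uintptr_t)*objp & ~OBJ_PFMEMALLOC_FLAG);
}

/* Test whether a (possibly tagged) pointer carries the flag. */
static inline bool obj_flag_is_set(void *obj)
{
	return ((uintptr_t)obj & OBJ_PFMEMALLOC_FLAG) != 0;
}

/* True when addr is a multiple of align; align must be a power of two. */
static inline bool is_aligned(uintptr_t addr, size_t align)
{
	return (addr & (align - 1)) == 0;
}
```

With these helpers, an address such as 0x1004 passes is_aligned(addr, 4) but fails is_aligned(addr, 8), which is the "4-byte aligned but not 8-byte aligned" situation described above for slots that hold 8-byte pointers.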
* Re: unaligned accesses in SLAB etc.
From: mroos @ 2014-10-13 20:22 UTC
To: David Miller
Cc: Linux Kernel list, cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, sparclinux

> From: David Miller <davem@davemloft.net>
> Date: Sat, 11 Oct 2014 22:15:10 -0400 (EDT)
>
> >
> > I'm getting tons of the following on sparc64:
> >
> > [603965.383447] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0
> > [603965.396987] Kernel unaligned access at TPC[546b60] free_block+0xa0/0x1a0
> > [603965.410523] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0

> In all of the cases, the address is 4-byte aligned but not 8-byte
> aligned.  And they are vmalloc addresses.
>
> Which made me suspect the percpu commit:
>
> ====================
> commit bf0dea23a9c094ae869a88bb694fbe966671bf6d
> Author: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Date:   Thu Oct 9 15:26:27 2014 -0700
>
>     mm/slab: use percpu allocator for cpu cache
> ====================
>
> And indeed, reverting this commit fixes the problem.

I tested Joonsoo Kim's fix and it gets rid of the kernel unaligned
access messages, yes.

But the instability on UltraSparc II era machines still remains -
occasional Bus Errors during kernel compilation, messages like this:

sh[11771]: segfault at ffd6a4d1 ip 00000000f7cc5714 (rpc 00000000f7cc562c) sp 00000000ffd69d90 error 30002 in libc-2.19.so[f7c44000+16a000]

-- 
Meelis Roos (mroos@linux.ee)

* Re: unaligned accesses in SLAB etc.
From: Joonsoo Kim @ 2014-10-13 23:52 UTC
To: mroos
Cc: David Miller, Linux Kernel list, cl, penberg, rientjes, akpm, linux-mm, sparclinux

On Mon, Oct 13, 2014 at 11:22:37PM +0300, mroos@linux.ee wrote:
> > From: David Miller <davem@davemloft.net>
> > Date: Sat, 11 Oct 2014 22:15:10 -0400 (EDT)
> >
> > >
> > > I'm getting tons of the following on sparc64:
> > >
> > > [603965.383447] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0
> > > [603965.396987] Kernel unaligned access at TPC[546b60] free_block+0xa0/0x1a0
> > > [603965.410523] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0
>
> > In all of the cases, the address is 4-byte aligned but not 8-byte
> > aligned.  And they are vmalloc addresses.
> >
> > Which made me suspect the percpu commit:
> >
> > ====================
> > commit bf0dea23a9c094ae869a88bb694fbe966671bf6d
> > Author: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > Date:   Thu Oct 9 15:26:27 2014 -0700
> >
> >     mm/slab: use percpu allocator for cpu cache
> > ====================
> >
> > And indeed, reverting this commit fixes the problem.
>
> I tested Joonsoo Kim's fix and it gets rid of the kernel unaligned
> access messages, yes.
>
> But the instability on UltraSparc II era machines still remains -
> occasional Bus Errors during kernel compilation, messages like this:
>
> sh[11771]: segfault at ffd6a4d1 ip 00000000f7cc5714 (rpc 00000000f7cc562c) sp 00000000ffd69d90 error 30002 in libc-2.19.so[f7c44000+16a000]

Hello, Meelis.

Thanks for testing.
I'd like to know whether your other problem is related to commit
bf0dea23a9c0 ("mm/slab: use percpu allocator for cpu cache"). So,
if the commit is reverted, is your other problem also gone
completely?

Thanks.

* Re: unaligned accesses in SLAB etc.
From: David Miller @ 2014-10-14  0:04 UTC
To: iamjoonsoo.kim
Cc: mroos, linux-kernel, cl, penberg, rientjes, akpm, linux-mm, sparclinux

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Date: Tue, 14 Oct 2014 08:52:19 +0900

> I'd like to know whether your other problem is related to commit
> bf0dea23a9c0 ("mm/slab: use percpu allocator for cpu cache"). So,
> if the commit is reverted, is your other problem also gone
> completely?

The other problem has been present forever.

* Re: unaligned accesses in SLAB etc.
From: Joonsoo Kim @ 2014-10-14  0:14 UTC
To: David Miller
Cc: mroos, linux-kernel, cl, penberg, rientjes, akpm, linux-mm, sparclinux

On Mon, Oct 13, 2014 at 08:04:16PM -0400, David Miller wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Date: Tue, 14 Oct 2014 08:52:19 +0900
>
> > I'd like to know whether your other problem is related to commit
> > bf0dea23a9c0 ("mm/slab: use percpu allocator for cpu cache"). So,
> > if the commit is reverted, is your other problem also gone
> > completely?
>
> The other problem has been present forever.

Okay.
Thanks for notifying me.

Thanks.

* Re: unaligned accesses in SLAB etc.
From: mroos @ 2014-10-14 21:19 UTC
To: David Miller
Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm, linux-mm, sparclinux

> > I'd like to know whether your other problem is related to commit
> > bf0dea23a9c0 ("mm/slab: use percpu allocator for cpu cache"). So,
> > if the commit is reverted, is your other problem also gone
> > completely?
>
> The other problem has been present forever.

Umm? I am afraid I have been describing it badly. This random
SIGBUS+SIGSEGV problem is new - I have not seen it before.

I have been able to do kernel compiles for years on sparc64 (modulo
specific bugs in specific configurations) and 3.17 + start/end swap
patch also seems stable for most machines. With yesterday's git + align
patch, it dies with SIGBUS multiple times during compilation so it's a
new regression for me.

Will try reverting that commit tomorrow.

My only other current sparc64 problems that I am seeing - V210/V440 die
during bootup if compiled with gcc 4.9 and V480 dies with FATAL
exceptions during bootups since previous kernel release. Maybe also
exit_mmap warning - I do not know if they have been fixed, I see them
rarely.

-- 
Meelis Roos (mroos@linux.ee)

* Re: unaligned accesses in SLAB etc.
From: David Miller @ 2014-10-14 21:32 UTC
To: mroos
Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm, linux-mm, sparclinux

From: mroos@linux.ee
Date: Wed, 15 Oct 2014 00:19:36 +0300 (EEST)

>> > I'd like to know whether your other problem is related to commit
>> > bf0dea23a9c0 ("mm/slab: use percpu allocator for cpu cache"). So,
>> > if the commit is reverted, is your other problem also gone
>> > completely?
>>
>> The other problem has been present forever.
>
> Umm? I am afraid I have been describing it badly. This random
> SIGBUS+SIGSEGV problem is new - I have not seen it before.

Sorry, I thought it was the same bug that causes git corruptions
for you.  I misunderstood.

> I have been able to do kernel compiles for years on sparc64 (modulo
> specific bugs in specific configurations) and 3.17 + start/end swap
> patch also seems stable for most machines. With yesterday's git + align
> patch, it dies with SIGBUS multiple times during compilation so it's a
> new regression for me.
>
> Will try reverting that commit tomorrow.

If that fails, please try to bisect, it will help us a lot.

> My only other current sparc64 problems that I am seeing - V210/V440 die
> during bootup if compiled with gcc 4.9 and V480 dies with FATAL
> exceptions during bootups since previous kernel release. Maybe also
> exit_mmap warning - I do not know if they have been fixed, I see them
> rarely.

The gcc-4.9 case is interesting, are you saying that a gcc-4.9 compiled
kernel works fine on other systems?

* Re: unaligned accesses in SLAB etc.
From: Meelis Roos @ 2014-10-15  8:04 UTC
To: David Miller
Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm, linux-mm, sparclinux

> > My only other current sparc64 problems that I am seeing - V210/V440 die
> > during bootup if compiled with gcc 4.9 and V480 dies with FATAL
> > exceptions during bootups since previous kernel release. Maybe also
> > exit_mmap warning - I do not know if they have been fixed, I see them
> > rarely.
>
> The gcc-4.9 case is interesting, are you saying that a gcc-4.9 compiled
> kernel works fine on other systems?

Yes, all USII based systems work fine with Debian gcc-4.9, as does
T2000. Of USIII* systems, V210 and V440 exhibit the boot hang with
gcc-4.9 and V480 crashes with FATAL exception during boot that is
probably earlier than the gcc boot hang so I do not know about V480 and
gcc-4.9. V240 not tested because of fan failures, V245 is in the queue
for setup but not tested so far.

-- 
Meelis Roos (mroos@linux.ee)

* Re: unaligned accesses in SLAB etc.
From: David Miller @ 2014-10-15 18:36 UTC
To: mroos
Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm, linux-mm, sparclinux

From: Meelis Roos <mroos@linux.ee>
Date: Wed, 15 Oct 2014 11:04:49 +0300 (EEST)

>> > My only other current sparc64 problems that I am seeing - V210/V440 die
>> > during bootup if compiled with gcc 4.9 and V480 dies with FATAL
>> > exceptions during bootups since previous kernel release. Maybe also
>> > exit_mmap warning - I do not know if they have been fixed, I see them
>> > rarely.
>>
>> The gcc-4.9 case is interesting, are you saying that a gcc-4.9 compiled
>> kernel works fine on other systems?
>
> Yes, all USII based systems work fine with Debian gcc-4.9, as does
> T2000. Of USIII* systems, V210 and V440 exhibit the boot hang with
> gcc-4.9 and V480 crashes with FATAL exception during boot that is
> probably earlier than the gcc boot hang so I do not know about V480 and
> gcc-4.9. V240 not tested because of fan failures, V245 is in the queue
> for setup but not tested so far.

Ok, on the V210/V440 can you boot with "-p" on the kernel boot command
line and post the log?  Let's start by seeing how far it gets, maybe
we can figure out roughly where it dies.

A boot hang should be relatively easy to diagnose and pinpoint.

Thanks.

* Re: unaligned accesses in SLAB etc.
From: Meelis Roos @ 2014-10-15 20:11 UTC
To: David Miller
Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm, linux-mm, sparclinux

> >> The gcc-4.9 case is interesting, are you saying that a gcc-4.9 compiled
> >> kernel works fine on other systems?
> >
> > Yes, all USII based systems work fine with Debian gcc-4.9, as does
> > T2000. Of USIII* systems, V210 and V440 exhibit the boot hang with
> > gcc-4.9 and V480 crashes with FATAL exception during boot that is
> > probably earlier than the gcc boot hang so I do not know about V480 and
> > gcc-4.9. V240 not tested because of fan failures, V245 is in the queue
> > for setup but not tested so far.
>
> Ok, on the V210/V440 can you boot with "-p" on the kernel boot command
> line and post the log?  Let's start by seeing how far it gets, maybe
> we can figure out roughly where it dies.

http://www.spinics.net/lists/sparclinux/msg12238.html and
http://www.spinics.net/lists/sparclinux/msg12468.html are my relevant
posts about it. Should I get something more? It would be easy because of
ALOM.

-- 
Meelis Roos (mroos@linux.ee)

* Re: unaligned accesses in SLAB etc.
From: David Miller @ 2014-10-16  3:11 UTC
To: mroos
Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm, linux-mm, sparclinux

From: Meelis Roos <mroos@linux.ee>
Date: Wed, 15 Oct 2014 23:11:34 +0300 (EEST)

>> >> The gcc-4.9 case is interesting, are you saying that a gcc-4.9 compiled
>> >> kernel works fine on other systems?
>> >
>> > Yes, all USII based systems work fine with Debian gcc-4.9, as does
>> > T2000. Of USIII* systems, V210 and V440 exhibit the boot hang with
>> > gcc-4.9 and V480 crashes with FATAL exception during boot that is
>> > probably earlier than the gcc boot hang so I do not know about V480 and
>> > gcc-4.9. V240 not tested because of fan failures, V245 is in the queue
>> > for setup but not tested so far.
>>
>> Ok, on the V210/V440 can you boot with "-p" on the kernel boot command
>> line and post the log?  Let's start by seeing how far it gets, maybe
>> we can figure out roughly where it dies.
>
> http://www.spinics.net/lists/sparclinux/msg12238.html and
> http://www.spinics.net/lists/sparclinux/msg12468.html are my relevant
> posts about it. Should I get something more? It would be easy because of
> ALOM.

Less information than I had hoped :-/

I thought it was hanging "during boot" meaning before we try to
execute userspace.  When in fact it seems to die exactly when we start
running the init process.

Wrt. disassembly of fault_in_user_windows(), that's not likely the
cause because if it were being miscompiled it would equally not work
on the other systems.

Something in the UltraSPARC-III specific code paths is going wrong
(either it is miscompiled, or the code makes an assumption that isn't
valid which has happened in the past).

Do you happen to have both gcc-4.9 and a previously working compiler
on these systems?  If you do, we can build a kernel with gcc-4.9 and
then selectively compile certain files with the older working
compiler to narrow down what compiles into something non-working with
gcc-4.9

I would start with the following files:

	arch/sparc/mm/init_64.c
	arch/sparc/mm/tlb.c
	arch/sparc/mm/tsb.c
	arch/sparc/mm/fault_64.c

And failing that, go for various files under arch/sparc/kernel/ such
as:

	arch/sparc/kernel/process_64.c
	arch/sparc/kernel/smp_64.c
	arch/sparc/kernel/sys_sparc_64.c
	arch/sparc/kernel/sys_sparc32.c
	arch/sparc/kernel/traps_64.c

Hopefully, this should be a simple matter of doing a complete build
with gcc-4.9, then removing the object file we want to selectively
build with the older compiler and then going:

	make CC="gcc-4.6" arch/sparc/mm/init_64.o

then relinking with plain 'make'.

If the build system rebuilds the object file on you when you try
to relink the final kernel image, we'll have to do some of this
by hand to make the test.

Thanks.

* Re: unaligned accesses in SLAB etc.
From: Meelis Roos @ 2014-10-16  7:22 UTC
To: David Miller
Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm, linux-mm, sparclinux

> Do you happen to have both gcc-4.9 and a previously working compiler
> on these systems?  If you do, we can build a kernel with gcc-4.9 and
> then selectively compile certain files with the older working
> compiler to narrow down what compiles into something non-working with
> gcc-4.9

Yes, I kept gcc-4.6 to help resolve it.

[...]

> Hopefully, this should be a simple matter of doing a complete build
> with gcc-4.9, then removing the object file we want to selectively
> build with the older compiler and then going:
>
> 	make CC="gcc-4.6" arch/sparc/mm/init_64.o
>
> then relinking with plain 'make'.
>
> If the build system rebuilds the object file on you when you try
> to relink the final kernel image, we'll have to do some of this
> by hand to make the test.

Unfortunately it starts a full rebuild with plain make after compiling
some files with gcc-4.6 - detects CC change?

-- 
Meelis Roos (mroos@linux.ee)

* Re: unaligned accesses in SLAB etc.
From: Meelis Roos @ 2014-10-16 20:11 UTC
To: David Miller
Cc: iamjoonsoo.kim, Linux Kernel list, cl, penberg, rientjes, akpm, linux-mm, sparclinux

> > Hopefully, this should be a simple matter of doing a complete build
> > with gcc-4.9, then removing the object file we want to selectively
> > build with the older compiler and then going:
> >
> > 	make CC="gcc-4.6" arch/sparc/mm/init_64.o
> >
> > then relinking with plain 'make'.
> >
> > If the build system rebuilds the object file on you when you try
> > to relink the final kernel image, we'll have to do some of this
> > by hand to make the test.
>
> Unfortunately it starts a full rebuild with plain make after compiling
> some files with gcc-4.6 - detects CC change?

Figured out from make V=1 how to call gcc-4.6 directly. So far my
bisection shows that it is one or more of arch/sparc/kernel/*.c,
probably more than one - both halves of it failed. Still bisecting.

-- 
Meelis Roos (mroos@linux.ee)

* Re: unaligned accesses in SLAB etc.
From: David Miller @ 2014-10-16 20:18 UTC
To: mroos
Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm, linux-mm, sparclinux

From: Meelis Roos <mroos@linux.ee>
Date: Thu, 16 Oct 2014 23:11:49 +0300 (EEST)

>> > Hopefully, this should be a simple matter of doing a complete build
>> > with gcc-4.9, then removing the object file we want to selectively
>> > build with the older compiler and then going:
>> >
>> > 	make CC="gcc-4.6" arch/sparc/mm/init_64.o
>> >
>> > then relinking with plain 'make'.
>> >
>> > If the build system rebuilds the object file on you when you try
>> > to relink the final kernel image, we'll have to do some of this
>> > by hand to make the test.
>>
>> Unfortunately it starts a full rebuild with plain make after compiling
>> some files with gcc-4.6 - detects CC change?
>
> Figured out from make V=1 how to call gcc-4.6 directly. So far my
> bisection shows that it is one or more of arch/sparc/kernel/*.c,
> probably more than one - both halves of it failed. Still bisecting.

Thanks a lot for working this out.

I'm going to also try to setup a test environment so I can try
this gcc-4.9 stuff on my T4-2 as well.

* Re: unaligned accesses in SLAB etc.
From: Meelis Roos @ 2014-10-16  7:02 UTC
To: David Miller
Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm, linux-mm, sparclinux

> >> > I'd like to know whether your other problem is related to commit
> >> > bf0dea23a9c0 ("mm/slab: use percpu allocator for cpu cache"). So,
> >> > if the commit is reverted, is your other problem also gone
> >> > completely?
> >>
> >> The other problem has been present forever.
> >
> > Umm? I am afraid I have been describing it badly. This random
> > SIGBUS+SIGSEGV problem is new - I have not seen it before.
>
> Sorry, I thought it was the same bug that causes git corruptions
> for you.  I misunderstood.
>
> > I have been able to do kernel compiles for years on sparc64 (modulo
> > specific bugs in specific configurations) and 3.17 + start/end swap
> > patch also seems stable for most machines. With yesterday's git + align
> > patch, it dies with SIGBUS multiple times during compilation so it's a
> > new regression for me.
> >
> > Will try reverting that commit tomorrow.
>
> If that fails, please try to bisect, it will help us a lot.

Commit bf0dea23a9c0 is working OK with no revert needed (checked out
this revision and it tested OK).

So far I know that the breakage seems to have happened between
cadbb58039f7cab1def9c931012ab04c953a6997 (first sparc commit of the
batch, working OK on V100) and bdcf81b658ebc4c2640c3c2c55c8b31c601b6996
(last sparc commit before the merge, breaks on E3500). Will continue
bisecting the sparc64 commits.

Also, I noticed that when the problem happens, it's deterministic - with
some kernels, sshd dies reproducibly on login. With most kernels,
building kernel breaks in one specific location, not randomly.

scripts/Makefile.build:352: recipe for target 'sound/modules.order' failed
make[1]: *** [sound/modules.order] Bus error
make[1]: *** Deleting file 'sound/modules.order'
Makefile:929: recipe for target 'sound' failed

Will tell when I get more details.

-- 
Meelis Roos (mroos@linux.ee)

* Re: unaligned accesses in SLAB etc.
From: David Miller @ 2014-10-16 20:07 UTC
To: mroos
Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm, linux-mm, sparclinux

From: Meelis Roos <mroos@linux.ee>
Date: Thu, 16 Oct 2014 10:02:57 +0300 (EEST)

> scripts/Makefile.build:352: recipe for target 'sound/modules.order' failed
> make[1]: *** [sound/modules.order] Bus error
> make[1]: *** Deleting file 'sound/modules.order'
> Makefile:929: recipe for target 'sound' failed

I just reproduced this on my Sun Blade 2500, so it can trigger on UltraSPARC-IIIi
systems too.

* Re: unaligned accesses in SLAB etc.
From: Meelis Roos @ 2014-10-16 20:16 UTC
To: David Miller
Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm, linux-mm, sparclinux

> > scripts/Makefile.build:352: recipe for target 'sound/modules.order' failed
> > make[1]: *** [sound/modules.order] Bus error
> > make[1]: *** Deleting file 'sound/modules.order'
> > Makefile:929: recipe for target 'sound' failed
>
> I just reproduced this on my Sun Blade 2500, so it can trigger on UltraSPARC-IIIi
> systems too.

My bisection led to the following commit but it seems irrelevant (I
have no sun4v on these machines):

4ccb9272892c33ef1c19a783cfa87103b30c2784 is the first bad commit
commit 4ccb9272892c33ef1c19a783cfa87103b30c2784
Author: bob picco <bpicco@meloft.net>
Date:   Tue Sep 16 09:26:47 2014 -0400

    sparc64: sun4v TLB error power off events

However, the following chunk sounds slightly suspicious:

+       if (fault_code & FAULT_CODE_BAD_RA)
+               goto do_sigbus;
+

because SIGBUS is what I got. For some machines, it killed checkroot
during startup, for some shells under some circumstances, for some sshd.

-- 
Meelis Roos (mroos@linux.ee)

* Re: unaligned accesses in SLAB etc.
  2014-10-16 20:16 ` Meelis Roos
@ 2014-10-16 20:20 ` David Miller
  2014-10-16 20:50   ` David Miller
  0 siblings, 1 reply; 34+ messages in thread
From: David Miller @ 2014-10-16 20:20 UTC (permalink / raw)
  To: mroos
  Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm,
      linux-mm, sparclinux

From: Meelis Roos <mroos@linux.ee>
Date: Thu, 16 Oct 2014 23:16:44 +0300 (EEST)

>> > scripts/Makefile.build:352: recipe for target 'sound/modules.order' failed
>> > make[1]: *** [sound/modules.order] Bus error
>> > make[1]: *** Deleting file 'sound/modules.order'
>> > Makefile:929: recipe for target 'sound' failed
>>
>> I just reproduced this on my Sun Blade 2500, so it can trigger on UltraSPARC-IIIi
>> systems too.
>
> My bisection led to the following commit but it seems irrelevant (I
> have no sun4v on these machines):
>
> 4ccb9272892c33ef1c19a783cfa87103b30c2784 is the first bad commit
> commit 4ccb9272892c33ef1c19a783cfa87103b30c2784
> Author: bob picco <bpicco@meloft.net>
> Date:   Tue Sep 16 09:26:47 2014 -0400
>
>     sparc64: sun4v TLB error power off events
>
>
> However, the following chunk sounds slightly suspicious:
>
> +       if (fault_code & FAULT_CODE_BAD_RA)
> +               goto do_sigbus;
> +
>
> because SIGBUS is what I got. For some machines, it killed checkroot
> during startup, for some shells under some circumstances, for some sshd.

Good catch!

So I'm going to audit all the code paths to make sure we don't put garbage
into the fault_code value.

^ permalink raw reply	[flat|nested] 34+ messages in thread
* Re: unaligned accesses in SLAB etc.
  2014-10-16 20:20 ` David Miller
@ 2014-10-16 20:50 ` David Miller
  2014-10-17 11:12   ` Meelis Roos
  0 siblings, 1 reply; 34+ messages in thread
From: David Miller @ 2014-10-16 20:50 UTC (permalink / raw)
  To: mroos
  Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm,
      linux-mm, sparclinux

From: David Miller <davem@redhat.com>
Date: Thu, 16 Oct 2014 16:20:01 -0400 (EDT)

> So I'm going to audit all the code paths to make sure we don't put garbage
> into the fault_code value.

There are two code paths where we can put garbage into the fault_code
value. And for the dtlb_prot.S case, the value we put in there is
TLB_TAG_ACCESS which is 0x30, which includes bit 0x20 which is that
FAULT_CODE_BAD_RA indication which is erroneously triggering.

The other path is via hugepage TLB misses, for the situation where
we haven't allocated the huge TSB for the thread yet. That might
explain some other longer-term problems we've had.

I'm about to test the following fix:

diff --git a/arch/sparc/kernel/dtlb_prot.S b/arch/sparc/kernel/dtlb_prot.S
index b2c2c5b..d668ca14 100644
--- a/arch/sparc/kernel/dtlb_prot.S
+++ b/arch/sparc/kernel/dtlb_prot.S
@@ -24,11 +24,11 @@
         mov             TLB_TAG_ACCESS, %g4             ! For reload of vaddr

 /* PROT ** ICACHE line 2: More real fault processing */
+        ldxa            [%g4] ASI_DMMU, %g5             ! Put tagaccess in %g5
         bgu,pn          %xcc, winfix_trampoline         ! Yes, perform winfixup
-         ldxa           [%g4] ASI_DMMU, %g5             ! Put tagaccess in %g5
-        ba,pt           %xcc, sparc64_realfault_common  ! Nope, normal fault
          mov            FAULT_CODE_DTLB | FAULT_CODE_WRITE, %g4
-        nop
+        ba,pt           %xcc, sparc64_realfault_common  ! Nope, normal fault
+         nop
         nop
         nop
         nop
diff --git a/arch/sparc/kernel/tsb.S b/arch/sparc/kernel/tsb.S
index 14158d4..be98685 100644
--- a/arch/sparc/kernel/tsb.S
+++ b/arch/sparc/kernel/tsb.S
@@ -162,10 +162,10 @@ tsb_miss_page_table_walk_sun4v_fastpath:
         nop
         .previous

-        rdpr    %tl, %g3
-        cmp     %g3, 1
+        rdpr    %tl, %g7
+        cmp     %g7, 1
         bne,pn  %xcc, winfix_trampoline
-         nop
+         mov    %g3, %g4
         ba,pt   %xcc, etrap
          rd     %pc, %g7
         call    hugetlb_setup

^ permalink raw reply	[flat|nested] 34+ messages in thread
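[Editorial sketch, not part of the thread.] The bit arithmetic behind the dtlb_prot.S diagnosis can be checked in a few lines of C. Only the values 0x30 for TLB_TAG_ACCESS and 0x20 for FAULT_CODE_BAD_RA come from the message above; the values used here for FAULT_CODE_WRITE and FAULT_CODE_DTLB, and the helper name would_sigbus(), are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Values stated in the thread. */
#define TLB_TAG_ACCESS     0x30u
#define FAULT_CODE_BAD_RA  0x20u

/* Assumed values, for illustration only. */
#define FAULT_CODE_WRITE   0x01u
#define FAULT_CODE_DTLB    0x02u

/* Hypothetical helper mirroring the check quoted from the bisected commit:
 *     if (fault_code & FAULT_CODE_BAD_RA) goto do_sigbus;
 */
bool would_sigbus(unsigned int fault_code)
{
    return (fault_code & FAULT_CODE_BAD_RA) != 0;
}

/* Usage: a register still holding TLB_TAG_ACCESS trips the check,
 * while the intended fault code does not. */
void fault_code_example(void)
{
    assert(would_sigbus(TLB_TAG_ACCESS));
    assert(!would_sigbus(FAULT_CODE_DTLB | FAULT_CODE_WRITE));
}
```

Because 0x30 is 0x20 | 0x10, any path that leaves TLB_TAG_ACCESS in the register later read as fault_code will look like a bad-RA fault to that check.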
* Re: unaligned accesses in SLAB etc.
  2014-10-16 20:50 ` David Miller
@ 2014-10-17 11:12 ` Meelis Roos
  2014-10-18 17:59   ` David Miller
  0 siblings, 1 reply; 34+ messages in thread
From: Meelis Roos @ 2014-10-17 11:12 UTC (permalink / raw)
  To: David Miller
  Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm,
      linux-mm, sparclinux

> From: David Miller <davem@redhat.com>
> Date: Thu, 16 Oct 2014 16:20:01 -0400 (EDT)
>
> > So I'm going to audit all the code paths to make sure we don't put garbage
> > into the fault_code value.
>
> There are two code paths where we can put garbage into the fault_code
> value. And for the dtlb_prot.S case, the value we put in there is
> TLB_TAG_ACCESS which is 0x30, which includes bit 0x20 which is that
> FAULT_CODE_BAD_RA indication which is erroneously triggering.
>
> The other path is via hugepage TLB misses, for the situation where
> we haven't allocated the huge TSB for the thread yet. That might
> explain some other longer-term problems we've had.
>
> I'm about to test the following fix:

Thank you - it seems to work fine for me on E3500 on top of
3.17.0-07551-g052db7e + slab alignment fix.

However, on top of mainline HEAD 3.17.0-09670-g0429fbc it explodes with
scheduler BUG - just reported to LKML + sched maintainers.

--
Meelis Roos (mroos@linux.ee)

^ permalink raw reply	[flat|nested] 34+ messages in thread
* Re: unaligned accesses in SLAB etc.
  2014-10-17 11:12 ` Meelis Roos
@ 2014-10-18 17:59 ` David Miller
  2014-10-18 18:23   ` David Miller
  0 siblings, 1 reply; 34+ messages in thread
From: David Miller @ 2014-10-18 17:59 UTC (permalink / raw)
  To: mroos
  Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm,
      linux-mm, sparclinux

From: Meelis Roos <mroos@linux.ee>
Date: Fri, 17 Oct 2014 14:12:09 +0300 (EEST)

> However, on top of mainline HEAD 3.17.0-09670-g0429fbc it explodes with
> scheduler BUG - just reported to LKML + sched maintainers.

task_stack_end_corrupted() cannot work properly on sparc64.

It stores the magic value at "task_thread_info(p) + 1", but on sparc64
that's where we store the nested array of FPU register saves.

In fact this facility could be corrupting FPU register state in
certain circumstances.

The current sparc64 design is intentional, the CPU stack grows down
toward the thread_info, and the FPU stack saving area grows up from
the end of thread_info.

I don't want to define the array size of the fpregs save area
explicitly and thereby placing an artificial limit there.

^ permalink raw reply	[flat|nested] 34+ messages in thread
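[Editorial sketch, not part of the thread.] The overlap described above can be imitated in user space. The struct below is a made-up stand-in for thread_info with a zero-length trailing save area; the field names, the magic constant and the function name are assumptions for illustration. It relies on the common ABI behaviour that the size of such a struct equals the offset of its trailing array, which is what makes "thread_info + 1" and "fpregs[0]" the same address.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative stand-in for the stack-end marker; treat the value as an assumption. */
#define STACK_END_MAGIC_SKETCH 0x57AC6E9DUL

/* Hypothetical, simplified thread_info: the save area starts where the
 * struct ends, as with the original "fpregs[0]" declaration. */
struct thread_info_sketch {
    unsigned long flags;      /* placeholder for the real fields */
    unsigned long fpregs[];   /* save area grows up from here */
};

/* Returns 1 if writing the magic at "ti + 1" overwrote fpregs[0],
 * 0 if it did not, -1 on allocation failure. */
int magic_clobbers_fpregs(void)
{
    struct thread_info_sketch *ti =
        malloc(sizeof(*ti) + 4 * sizeof(unsigned long));
    if (!ti)
        return -1;

    ti->fpregs[0] = 0x1234UL;                      /* pretend saved FPU state */

    unsigned long *stack_end = (unsigned long *)(ti + 1);
    *stack_end = STACK_END_MAGIC_SKETCH;           /* what a stack-end check would store */

    int clobbered = (ti->fpregs[0] == STACK_END_MAGIC_SKETCH);
    free(ti);
    return clobbered;
}

/* Usage: on typical ABIs the store lands on the first save slot. */
void stack_end_example(void)
{
    assert(magic_clobbers_fpregs() == 1);
}
```

This only models the address arithmetic; the real kernel layout has more fields and a 64-byte alignment on the save area.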
* Re: unaligned accesses in SLAB etc.
  2014-10-18 17:59 ` David Miller
@ 2014-10-18 18:23 ` David Miller
  2014-10-19 12:31   ` Meelis Roos
  2014-10-19 15:32   ` Sam Ravnborg
  0 siblings, 2 replies; 34+ messages in thread
From: David Miller @ 2014-10-18 18:23 UTC (permalink / raw)
  To: mroos
  Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm,
      linux-mm, sparclinux

From: David Miller <davem@davemloft.net>
Date: Sat, 18 Oct 2014 13:59:07 -0400 (EDT)

> I don't want to define the array size of the fpregs save area
> explicitly and thereby placing an artificial limit there.

Nevermind, it seems we have a hard limit of 7 FPU save areas anyways.

Meelis, please try this patch:

diff --git a/arch/sparc/include/asm/thread_info_64.h b/arch/sparc/include/asm/thread_info_64.h
index f85dc85..cc6275c 100644
--- a/arch/sparc/include/asm/thread_info_64.h
+++ b/arch/sparc/include/asm/thread_info_64.h
@@ -63,7 +63,8 @@ struct thread_info {
         struct pt_regs          *kern_una_regs;
         unsigned int            kern_una_insn;

-        unsigned long           fpregs[0] __attribute__ ((aligned(64)));
+        unsigned long           fpregs[(7 * 256) / sizeof(unsigned long)]
+                __attribute__ ((aligned(64)));
 };

 #endif /* !(__ASSEMBLY__) */

^ permalink raw reply	[flat|nested] 34+ messages in thread
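[Editorial sketch, not part of the thread.] The array bound in the patch works out to 7 * 256 = 1792 bytes regardless of the width of unsigned long, and sizing the array explicitly moves "thread_info + 1" past the whole save area. The struct and function names below are hypothetical, the 256-byte per-area figure is read off the patch expression, and the attribute syntax assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

#define FPU_SAVE_AREAS 7     /* hard limit mentioned in the message */
#define FPU_SAVE_BYTES 256   /* per-area size, as implied by the patch expression */

/* Hypothetical cut-down thread_info with the explicitly sized save area. */
struct thread_info_sized {
    unsigned int  kern_una_insn;
    unsigned long fpregs[(FPU_SAVE_AREAS * FPU_SAVE_BYTES) / sizeof(unsigned long)]
        __attribute__((aligned(64)));
};

/* Total bytes reserved for FPU saves. */
size_t fpregs_bytes(void)
{
    return sizeof(((struct thread_info_sized *)0)->fpregs);
}

/* Offset of the save area inside the struct. */
size_t fpregs_offset(void)
{
    return offsetof(struct thread_info_sized, fpregs);
}

/* Returns 1 when the end of the struct lies at or beyond the end of fpregs. */
int stack_end_is_past_fpregs(void)
{
    return sizeof(struct thread_info_sized) >= fpregs_offset() + fpregs_bytes();
}

/* Usage: 1792 bytes reserved, 64-byte aligned, fully inside the struct. */
void fpregs_size_example(void)
{
    assert(fpregs_bytes() == 1792);
    assert(fpregs_offset() % 64 == 0);
    assert(stack_end_is_past_fpregs());
}
```

With the array inside sizeof(struct thread_info), a marker stored just past the struct no longer shares memory with saved FPU state.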
* Re: unaligned accesses in SLAB etc.
  2014-10-18 18:23 ` David Miller
@ 2014-10-19 12:31 ` Meelis Roos
  2014-10-19 17:12   ` Meelis Roos
  2014-10-19 15:32 ` Sam Ravnborg
  1 sibling, 1 reply; 34+ messages in thread
From: Meelis Roos @ 2014-10-19 12:31 UTC (permalink / raw)
  To: David Miller
  Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm,
      linux-mm, sparclinux

> > I don't want to define the array size of the fpregs save area
> > explicitly and thereby placing an artificial limit there.
>
> Nevermind, it seems we have a hard limit of 7 FPU save areas anyways.
>
> Meelis, please try this patch:

Works fine with 3.17.0-09670-g0429fbc + fault patch.

Will try current git next to find any new problems :)

--
Meelis Roos (mroos@linux.ee)

^ permalink raw reply	[flat|nested] 34+ messages in thread
* Re: unaligned accesses in SLAB etc.
  2014-10-19 12:31 ` Meelis Roos
@ 2014-10-19 17:12 ` Meelis Roos
  2014-10-19 17:18   ` David Miller
  0 siblings, 1 reply; 34+ messages in thread
From: Meelis Roos @ 2014-10-19 17:12 UTC (permalink / raw)
  To: David Miller
  Cc: iamjoonsoo.kim, Linux Kernel list, cl, penberg, rientjes, akpm,
      linux-mm, sparclinux

> > > I don't want to define the array size of the fpregs save area
> > > explicitly and thereby placing an artificial limit there.
> >
> > Nevermind, it seems we have a hard limit of 7 FPU save areas anyways.
> >
> > Meelis, please try this patch:
>
> Works fine with 3.17.0-09670-g0429fbc + fault patch.
>
> Will try current git next to find any new problems :)

Works on all 3 machines, with latest git (only had to apply the no-ipv6
patch on one of them). Thank you for the good work!

--
Meelis Roos (mroos@linux.ee)

^ permalink raw reply	[flat|nested] 34+ messages in thread
* Re: unaligned accesses in SLAB etc.
  2014-10-19 17:12 ` Meelis Roos
@ 2014-10-19 17:18 ` David Miller
  0 siblings, 0 replies; 34+ messages in thread
From: David Miller @ 2014-10-19 17:18 UTC (permalink / raw)
  To: mroos
  Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm,
      linux-mm, sparclinux

From: Meelis Roos <mroos@linux.ee>
Date: Sun, 19 Oct 2014 20:12:43 +0300 (EEST)

>> > > I don't want to define the array size of the fpregs save area
>> > > explicitly and thereby placing an artificial limit there.
>> >
>> > Nevermind, it seems we have a hard limit of 7 FPU save areas anyways.
>> >
>> > Meelis, please try this patch:
>>
>> Works fine with 3.17.0-09670-g0429fbc + fault patch.
>>
>> Will try current git next to find any new problems :)
>
> Works on all 3 machines, with latest git (only had to apply the no-ipv6
> patch on one of them). Thank you for the good work!

Thanks for testing.

Hopefully we can kill the gcc-4.9 bug next, and then see if that
exit_mmap() crash is still happening.

^ permalink raw reply	[flat|nested] 34+ messages in thread
* Re: unaligned accesses in SLAB etc.
  2014-10-18 18:23 ` David Miller
  2014-10-19 12:31   ` Meelis Roos
@ 2014-10-19 15:32   ` Sam Ravnborg
  2014-10-19 17:27     ` David Miller
  1 sibling, 1 reply; 34+ messages in thread
From: Sam Ravnborg @ 2014-10-19 15:32 UTC (permalink / raw)
  To: David Miller
  Cc: mroos, iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm,
      linux-mm, sparclinux

On Sat, Oct 18, 2014 at 02:23:35PM -0400, David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Sat, 18 Oct 2014 13:59:07 -0400 (EDT)
>
> > I don't want to define the array size of the fpregs save area
> > explicitly and thereby placing an artificial limit there.
>
> Nevermind, it seems we have a hard limit of 7 FPU save areas anyways.
>
> Meelis, please try this patch:
>
> diff --git a/arch/sparc/include/asm/thread_info_64.h b/arch/sparc/include/asm/thread_info_64.h
> index f85dc85..cc6275c 100644
> --- a/arch/sparc/include/asm/thread_info_64.h
> +++ b/arch/sparc/include/asm/thread_info_64.h
> @@ -63,7 +63,8 @@ struct thread_info {
>          struct pt_regs          *kern_una_regs;
>          unsigned int            kern_una_insn;
>
> -        unsigned long           fpregs[0] __attribute__ ((aligned(64)));
> +        unsigned long           fpregs[(7 * 256) / sizeof(unsigned long)]

This part:

> +                __attribute__ ((aligned(64)));

Could be written as __aligned(64)

	Sam

^ permalink raw reply	[flat|nested] 34+ messages in thread
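[Editorial sketch, not part of the thread.] The short form suggested above is a macro wrapper around the same GCC attribute, so the two spellings should produce identical layout. The macro is defined locally here (guarded, since some C libraries already provide it), and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Local definition for user-space illustration; the kernel supplies its own. */
#ifndef __aligned
#define __aligned(x) __attribute__((aligned(x)))
#endif

/* Same member declared with the long-hand attribute... */
struct long_form {
    char          tag;
    unsigned long regs[4] __attribute__ ((aligned(64)));
};

/* ...and with the short-hand macro. */
struct short_form {
    char          tag;
    unsigned long regs[4] __aligned(64);
};

/* Returns 1 when both spellings place the member at the same offset. */
int same_layout(void)
{
    return offsetof(struct long_form, regs) == offsetof(struct short_form, regs)
        && sizeof(struct long_form) == sizeof(struct short_form);
}

/* Usage: the member is pushed out to the 64-byte boundary either way. */
void aligned_macro_example(void)
{
    assert(same_layout());
    assert(offsetof(struct short_form, regs) == 64);
}
```

The change is purely cosmetic, which matches how it is treated in the replies that follow.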
* Re: unaligned accesses in SLAB etc.
  2014-10-19 15:32 ` Sam Ravnborg
@ 2014-10-19 17:27 ` David Miller
  2014-10-19 19:55   ` Sam Ravnborg
  0 siblings, 1 reply; 34+ messages in thread
From: David Miller @ 2014-10-19 17:27 UTC (permalink / raw)
  To: sam
  Cc: mroos, iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm,
      linux-mm, sparclinux

From: Sam Ravnborg <sam@ravnborg.org>
Date: Sun, 19 Oct 2014 17:32:20 +0200

> This part:
>
>> +                __attribute__ ((aligned(64)));
>
> Could be written as __aligned(64)

I'll try to remember to sweep this up in sparc-next, thanks Sam.

We probably use this long-hand form in a lot of other places in
the sparc code too, so I'll try to do a full sweep.

Thanks again.

^ permalink raw reply	[flat|nested] 34+ messages in thread
* Re: unaligned accesses in SLAB etc.
  2014-10-19 17:27 ` David Miller
@ 2014-10-19 19:55 ` Sam Ravnborg
  0 siblings, 0 replies; 34+ messages in thread
From: Sam Ravnborg @ 2014-10-19 19:55 UTC (permalink / raw)
  To: David Miller
  Cc: mroos, iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm,
      linux-mm, sparclinux

On Sun, Oct 19, 2014 at 01:27:37PM -0400, David Miller wrote:
> From: Sam Ravnborg <sam@ravnborg.org>
> Date: Sun, 19 Oct 2014 17:32:20 +0200
>
> > This part:
> >
> >> +                __attribute__ ((aligned(64)));
> >
> > Could be written as __aligned(64)
>
> I'll try to remember to sweep this up in sparc-next, thanks Sam.
>
> We probably use this long-hand form in a lot of other places in
> the sparc code too, so I'll try to do a full sweep.

Another related one would be a full sweep of "__asm__ __volatile__"
to the shorter version "asm volatile".

The latter is used in a few places in sparc already - so the toolchain
supports it. I got hits in:

include/asm/irqflags_32.h:      asm volatile("rd        %%psr, %0" : "=r" (flags));
include/asm/processor_64.h:#define cpu_relax()  asm volatile("\n99:\n\t" \
kernel/kprobes.c:       asm volatile(".global kretprobe_trampoline\n"

But this would touch 93 files. That's too much crunch :-(

	Sam

^ permalink raw reply	[flat|nested] 34+ messages in thread
* Re: unaligned accesses in SLAB etc.
  2014-10-16 20:07 ` David Miller
  2014-10-16 20:16   ` Meelis Roos
@ 2014-10-16 20:20   ` Meelis Roos
  2014-10-16 20:40     ` Meelis Roos
  1 sibling, 1 reply; 34+ messages in thread
From: Meelis Roos @ 2014-10-16 20:20 UTC (permalink / raw)
  To: David Miller
  Cc: iamjoonsoo.kim, linux-kernel, cl, penberg, rientjes, akpm,
      linux-mm, sparclinux

> I just reproduced this on my Sun Blade 2500, so it can trigger on UltraSPARC-IIIi
> systems too.

I looked it up - V210 and V440 are also IIIi, not plain III. So I do not
have information about real USIII, sorry for confusion.

--
Meelis Roos (mroos@linux.ee)

^ permalink raw reply	[flat|nested] 34+ messages in thread
* Re: unaligned accesses in SLAB etc.
  2014-10-16 20:20 ` Meelis Roos
@ 2014-10-16 20:40 ` Meelis Roos
  0 siblings, 0 replies; 34+ messages in thread
From: Meelis Roos @ 2014-10-16 20:40 UTC (permalink / raw)
  To: David Miller
  Cc: iamjoonsoo.kim, Linux Kernel list, cl, penberg, rientjes, akpm,
      linux-mm, sparclinux

> > I just reproduced this on my Sun Blade 2500, so it can trigger on UltraSPARC-IIIi
> > systems too.
>
> I looked it up - V210 and V440 are also IIIi, not plain III. So I do not
> have information about real USIII, sorry for confusion.

Brr, I just understood I confused 2 problems with the same subject. You
are talking about SIGBUS problem that is also happening on IIIi, my last
comment is about gcc-4.9 problem so please just ignore it.

--
Meelis Roos (mroos@linux.ee)

^ permalink raw reply	[flat|nested] 34+ messages in thread
* Re: unaligned accesses in SLAB etc.
  2014-10-12  2:15 unaligned accesses in SLAB etc David Miller
  2014-10-12 17:20 ` David Miller
@ 2014-10-12 17:22 ` Joonsoo Kim
  2014-10-12 17:30   ` David Miller
  1 sibling, 1 reply; 34+ messages in thread
From: Joonsoo Kim @ 2014-10-12 17:22 UTC (permalink / raw)
  To: David Miller
  Cc: LKML, Christoph Lameter, Pekka Enberg, David Rientjes,
      Joonsoo Kim, Andrew Morton, Linux Memory Management List

2014-10-12 11:15 GMT+09:00 David Miller <davem@davemloft.net>:
>
> I'm getting tons of the following on sparc64:
>
> [603965.383447] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0
> [603965.396987] Kernel unaligned access at TPC[546b60] free_block+0xa0/0x1a0
> [603965.410523] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0
> [603965.424061] Kernel unaligned access at TPC[546b60] free_block+0xa0/0x1a0
> [603965.437617] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0
> [603970.554394] log_unaligned: 333 callbacks suppressed
> [603970.564041] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0
> [603970.577576] Kernel unaligned access at TPC[546b60] free_block+0xa0/0x1a0
> [603970.591122] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0
> [603970.604669] Kernel unaligned access at TPC[546b60] free_block+0xa0/0x1a0
> [603970.618216] Kernel unaligned access at TPC[546b58] free_block+0x98/0x1a0
> [603976.515633] log_unaligned: 31 callbacks suppressed
> [603976.525092] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
> [603976.540196] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
> [603976.555308] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
> [603976.570411] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
> [603976.585526] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
> [603982.476424] log_unaligned: 43 callbacks suppressed
> [603982.485881] Kernel unaligned access at TPC[549378] kmem_cache_alloc+0xd8/0x1e0
> [603982.501590] Kernel unaligned access at TPC[5470a8] kmem_cache_free+0xc8/0x200
> [603982.501605] Kernel unaligned access at TPC[549378] kmem_cache_alloc+0xd8/0x1e0
> [603982.530382] Kernel unaligned access at TPC[5470a8] kmem_cache_free+0xc8/0x200
> [603982.544820] Kernel unaligned access at TPC[549378] kmem_cache_alloc+0xd8/0x1e0
> [603987.567130] log_unaligned: 11 callbacks suppressed
> [603987.576582] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
> [603987.591696] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
> [603987.606811] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
> [603987.621904] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0
> [603987.637017] Kernel unaligned access at TPC[548080] cache_alloc_refill+0x180/0x3a0

Hello,

Could you test below patch?
If it fixes your problem, I will send it with proper description.

Thanks.

---------->8----------------
diff --git a/mm/slab.c b/mm/slab.c
index 154aac8..eb2b2ea 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1992,7 +1992,7 @@ static struct array_cache __percpu *alloc_kmem_cache_cpus(
        struct array_cache __percpu *cpu_cache;

        size = sizeof(void *) * entries + sizeof(struct array_cache);
-       cpu_cache = __alloc_percpu(size, 0);
+       cpu_cache = __alloc_percpu(size, sizeof(void *));

        if (!cpu_cache)
                return NULL;

^ permalink raw reply	[flat|nested] 34+ messages in thread
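[Editorial sketch, not part of the thread.] The reason the alignment argument matters can be shown with plain address arithmetic: the per-CPU block holds a small header followed by an array of pointers, so the pointer array is only naturally aligned if the block itself starts on a pointer-aligned address. The struct layout, the function names and the example base addresses below are all assumptions for illustration; nothing here calls the kernel's allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct array_cache: a header followed by
 * an array of object pointers. */
struct array_cache_sketch {
    unsigned int avail;
    unsigned int limit;
    unsigned int batchcount;
    unsigned int touched;
    void        *entry[];
};

/* Same size formula as in the patch context. */
size_t cpu_cache_size(size_t entries)
{
    return sizeof(void *) * entries + sizeof(struct array_cache_sketch);
}

/* Returns 1 if entry[0] would be pointer-aligned for a block at 'base'. */
int entries_aligned(uintptr_t base)
{
    uintptr_t first = base + offsetof(struct array_cache_sketch, entry);
    return first % _Alignof(void *) == 0;
}

/* Usage: a pointer-aligned base keeps entry[] aligned; a base that is off
 * by half a pointer does not, which is the kind of placement that would
 * show up as unaligned loads and stores on sparc64. */
void percpu_alignment_example(void)
{
    assert(entries_aligned((uintptr_t)0x1000));
    assert(!entries_aligned((uintptr_t)0x1000 + _Alignof(void *) / 2));
    assert(cpu_cache_size(0) == sizeof(struct array_cache_sketch));
}
```

Requesting sizeof(void *) alignment from the allocator, as the patch does, rules out the second case.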
* Re: unaligned accesses in SLAB etc.
  2014-10-12 17:22 ` Joonsoo Kim
@ 2014-10-12 17:30 ` David Miller
  2014-10-12 17:43   ` Joonsoo Kim
  0 siblings, 1 reply; 34+ messages in thread
From: David Miller @ 2014-10-12 17:30 UTC (permalink / raw)
  To: js1304
  Cc: linux-kernel, cl, penberg, rientjes, iamjoonsoo.kim, akpm,
      linux-mm

From: Joonsoo Kim <js1304@gmail.com>
Date: Mon, 13 Oct 2014 02:22:15 +0900

> Could you test below patch?
> If it fixes your problem, I will send it with proper description.

It works, I just tested using ARCH_KMALLOC_MINALIGN which would be
better.

^ permalink raw reply	[flat|nested] 34+ messages in thread
* Re: unaligned accesses in SLAB etc.
  2014-10-12 17:30 ` David Miller
@ 2014-10-12 17:43 ` Joonsoo Kim
  0 siblings, 0 replies; 34+ messages in thread
From: Joonsoo Kim @ 2014-10-12 17:43 UTC (permalink / raw)
  To: David Miller
  Cc: LKML, Christoph Lameter, Pekka Enberg, David Rientjes,
      Joonsoo Kim, Andrew Morton, Linux Memory Management List

2014-10-13 2:30 GMT+09:00 David Miller <davem@davemloft.net>:
> From: Joonsoo Kim <js1304@gmail.com>
> Date: Mon, 13 Oct 2014 02:22:15 +0900
>
>> Could you test below patch?
>> If it fixes your problem, I will send it with proper description.
>
> It works, I just tested using ARCH_KMALLOC_MINALIGN which would be
> better.

Oops. resend with whole Cc list.

Thanks for testing.
ARCH_KMALLOC_MINALIGN is for object alignment, but, current problem
is caused by alignment of cpu cache array. I think that my fix is more
proper in this situation.

I will send fix tomorrow, because I'd like to test more and it's 2:42 am. :)

Thanks.

^ permalink raw reply	[flat|nested] 34+ messages in thread
end of thread, other threads:[~2014-10-19 19:56 UTC | newest]

Thread overview: 34+ messages
2014-10-12  2:15 unaligned accesses in SLAB etc David Miller
2014-10-12 17:20 ` David Miller
2014-10-13 20:22 ` mroos
2014-10-13 23:52 ` Joonsoo Kim
2014-10-14  0:04 ` David Miller
2014-10-14  0:14 ` Joonsoo Kim
2014-10-14 21:19 ` mroos
2014-10-14 21:32 ` David Miller
2014-10-15  8:04 ` Meelis Roos
2014-10-15 18:36 ` David Miller
2014-10-15 20:11 ` Meelis Roos
2014-10-16  3:11 ` David Miller
2014-10-16  7:22 ` Meelis Roos
2014-10-16 20:11 ` Meelis Roos
2014-10-16 20:18 ` David Miller
2014-10-16  7:02 ` Meelis Roos
2014-10-16 20:07 ` David Miller
2014-10-16 20:16 ` Meelis Roos
2014-10-16 20:20 ` David Miller
2014-10-16 20:50 ` David Miller
2014-10-17 11:12 ` Meelis Roos
2014-10-18 17:59 ` David Miller
2014-10-18 18:23 ` David Miller
2014-10-19 12:31 ` Meelis Roos
2014-10-19 17:12 ` Meelis Roos
2014-10-19 17:18 ` David Miller
2014-10-19 15:32 ` Sam Ravnborg
2014-10-19 17:27 ` David Miller
2014-10-19 19:55 ` Sam Ravnborg
2014-10-16 20:20 ` Meelis Roos
2014-10-16 20:40 ` Meelis Roos
2014-10-12 17:22 ` Joonsoo Kim
2014-10-12 17:30 ` David Miller
2014-10-12 17:43 ` Joonsoo Kim