* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review [not found] ` <718b8afe-222f-4b3a-96d3-93af0e4ceff1@roeck-us.net> @ 2024-08-06 2:40 ` Linus Torvalds 2024-08-06 11:02 ` Vlastimil Babka 0 siblings, 1 reply; 21+ messages in thread From: Linus Torvalds @ 2024-08-06 2:40 UTC (permalink / raw) To: Guenter Roeck, Vlastimil Babka; +Cc: linux-kernel, Linux-MM [ Let's drop random people and bring in Vlastimil ] Vlastimil, it turns out that the "this patch" is entirely a red herring, and the problem comes and goes randomly with just some code layout issues. See http://server.roeck-us.net/qemu/parisc64-6.10.3/ for more detail, particularly you'll see the "log.bad.gz" with the full log. See also https://lore.kernel.org/all/87y15a4p4h.ffs@tglx/ for this thread. I don't think this is really a slub issue, since it only happens on parisc, but maybe you can see what would make parisc different, and what could possibly make it all timing- or layout-dependent. Linus On Sun, 4 Aug 2024 at 11:36, Guenter Roeck <linux@roeck-us.net> wrote: > > With this patch in v6.10.3, all my parisc64 qemu tests get stuck with repeated error messages > > [ 0.000000] ============================================================================= > [ 0.000000] BUG kmem_cache_node (Not tainted): objects 21 > max 16 > [ 0.000000] ----------------------------------------------------------------------------- > > This never stops until the emulation aborts. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review 2024-08-06 2:40 ` [PATCH 6.10 000/809] 6.10.3-rc3 review Linus Torvalds @ 2024-08-06 11:02 ` Vlastimil Babka 2024-08-06 17:33 ` Thomas Gleixner [not found] ` <f63c6789-b01a-4d76-b7c9-74c04867bc13@roeck-us.net> 0 siblings, 2 replies; 21+ messages in thread From: Vlastimil Babka @ 2024-08-06 11:02 UTC (permalink / raw) To: Linus Torvalds, Guenter Roeck; +Cc: linux-kernel, Linux-MM, Thomas Gleixner On 8/6/24 04:40, Linus Torvalds wrote: > [ Let's drop random people and bring in Vlastimil ] tglx was reproducing it so I add him back > Vlastimil, > it turns out that the "this patch" is entirely a red herring, and the > problem comes and goes randomly with just some code layout issues. See > > http://server.roeck-us.net/qemu/parisc64-6.10.3/ > > for more detail, particularly you'll see the "log.bad.gz" with the full log. [ 0.000000] BUG kmem_cache_node (Not tainted): objects 21 > max 16 [ 0.000000] Slab 0x0000000041ed0000 objects=21 used=5 fp=0x00000000434003d0 flags=0x200(workingset|section=0|zone=0) flags tell us this came from the partial list (workingset), there's no head flag so order-0 since the error was detected it basically throws the slab page away and tries another one [ 0.000000] BUG kmem_cache (Tainted: G B ): objects 25 > max 16 [ 0.000000] Slab 0x0000000041ed0080 objects=25 used=6 fp=0x0000000043402790 flags=0x240(workingset|head|section=0|zone=0) this was also from the partial list but head flag so at least order-1, two things are weird: - max=16 is same as above even though it should be at least double as slab page's order is larger - objects=25 also isn't at least twice than objects=21 All the following are: [ 0.000000] BUG kmem_cache (Tainted: G B ): objects 25 > max 16 [ 0.000000] Slab 0x0000000041ed0300 objects=25 used=1 fp=0x000000004340c150 flags=0x40(head|section=0|zone=0) we depleted the partial list so it's allocating new slab pages, that are also at least order-1 It looks like maxobj calculation is bogus, would be useful to see what values it calculates from. I'm attaching a diff, but maybe it will also hide the issue... If someone has a /proc/slabinfo from a working boot with otherwise same config it might be also enough to guess what values should be expected there, at least the s-size. objects=21 vs 25 also seem odd though used=5 with used=6 in the first two also suggests we already passed this code successfully for creating a number of kmalloc caches and only then it started failing, that's also weird. > See also > > https://lore.kernel.org/all/87y15a4p4h.ffs@tglx/ > > for this thread. > > I don't think this is really a slub issue, since it only happens on > parisc, but maybe you can see what would make parisc different, and > what could possibly make it all timing- or layout-dependent. > > Linus > > On Sun, 4 Aug 2024 at 11:36, Guenter Roeck <linux@roeck-us.net> wrote: >> >> With this patch in v6.10.3, all my parisc64 qemu tests get stuck with repeated error messages >> >> [ 0.000000] ============================================================================= >> [ 0.000000] BUG kmem_cache_node (Not tainted): objects 21 > max 16 >> [ 0.000000] ----------------------------------------------------------------------------- >> >> This never stops until the emulation aborts. diff --git a/mm/slub.c b/mm/slub.c index 4927edec6a8c..ec4ed5215f2f 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -1386,8 +1386,8 @@ static int check_slab(struct kmem_cache *s, struct slab *slab) maxobj = order_objects(slab_order(slab), s->size); if (slab->objects > maxobj) { - slab_err(s, slab, "objects %u > max %u", - slab->objects, maxobj); + slab_err(s, slab, "objects %u > max %u (order %d size %u)", + slab->objects, maxobj, slab_order(slab), s->size); return 0; } if (slab->inuse > slab->objects) { ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review 2024-08-06 11:02 ` Vlastimil Babka @ 2024-08-06 17:33 ` Thomas Gleixner [not found] ` <90e02d99-37a2-437e-ad42-44b80c4e94f6@suse.cz> [not found] ` <f63c6789-b01a-4d76-b7c9-74c04867bc13@roeck-us.net> 1 sibling, 1 reply; 21+ messages in thread From: Thomas Gleixner @ 2024-08-06 17:33 UTC (permalink / raw) To: Vlastimil Babka, Linus Torvalds, Guenter Roeck; +Cc: linux-kernel, Linux-MM On Tue, Aug 06 2024 at 13:02, Vlastimil Babka wrote: > On 8/6/24 04:40, Linus Torvalds wrote: > It looks like maxobj calculation is bogus, would be useful to see what values it > calculates from. I'm attaching a diff, but maybe it will also hide the issue... It does hide it :( > If someone has a /proc/slabinfo from a working boot with otherwise same config > it might be also enough to guess what values should be expected there, > at least the s-size. > > objects=21 vs 25 also seem odd though > > used=5 with used=6 in the first two also suggests we already passed this code > successfully for creating a number of kmalloc caches and only then it started > failing, that's also weird. I added a printk() to check_slab() and on the non-failing boot this looks like: [ 0.000000] c 000000004017b0f8 c 0000000041ed0000 objects 21 max 21 order 0 size 192, inuse 2 [ 0.000000] c 000000004017b1c8 c 0000000041ed0080 objects 25 max 25 order 1 size 320, inuse 1 [ 0.000000] c 0000000043402010 c 0000000041ed0080 objects 25 max 25 order 1 size 320, inuse 2 [ 0.000000] c 0000000043402010 c 0000000041ed0080 objects 25 max 25 order 1 size 320, inuse 3 [ 0.000000] c 0000000043402150 c 0000000041ed0000 objects 21 max 21 order 0 size 192, inuse 3 [ 0.000000] c 0000000043402010 c 0000000041ed0080 objects 25 max 25 order 1 size 320, inuse 4 [ 0.000000] c 0000000043402150 c 0000000041ed0000 objects 21 max 21 order 0 size 192, inuse 4 [ 0.000000] c 0000000043402010 c 0000000041ed0080 objects 25 max 25 order 1 size 320, inuse 5 [ 0.000000] c 0000000043402150 c 0000000041ed0000 objects 21 max 21 order 0 size 192, inuse 5 [ 0.000000] c 0000000043402010 c 0000000041ed0080 objects 25 max 25 order 1 size 320, inuse 6 [ 0.000000] c 0000000043402150 c 0000000041ed0000 objects 21 max 21 order 0 size 192, inuse 6 I did some more experiments to figure out why adding or removing text cures it. The minimal change which makes it boot again is: asmlinkage __visible void __softirq_entry __do_softirq(void) { + current->flags &= ~PF_MEMALLOC; handle_softirqs(false); } That results in the following System.map delta: --- ../upstream.txt 2024-08-06 16:52:49.746528992 +0200 +++ ../build-misc/System.map 2024-08-06 19:02:32.652201977 +0200 @@ -47600,15 +47600,15 @@ 0000000041218c30 T __do_softirq 0000000041218c30 T __irqentry_text_end 0000000041218c30 T __softirqentry_text_start -0000000041218c70 T $$divoI -0000000041218c70 T __softirqentry_text_end -00000000412190d0 T $$divI_2 -00000000412190d0 T $$divide_by_constant -00000000412190e0 T $$divI_4 -00000000412190f0 T $$divI_8 -0000000041219100 T $$divI_16 -00000000412192d8 T $$divI_17 -000000004121930c T $$divU_17 +0000000041218c80 T $$divoI +0000000041218c80 T __softirqentry_text_end +00000000412190e0 T $$divI_2 +00000000412190e0 T $$divide_by_constant +00000000412190f0 T $$divI_4 +0000000041219100 T $$divI_8 +0000000041219110 T $$divI_16 +00000000412192e8 T $$divI_17 +000000004121931c T $$divU_17 000000004121a000 D __start_opd 000000004121a000 D _etext 000000004121a000 D _sdata So this change adds 16 bytes to __softirq() which moves the division functions up by 16 bytes. That's all it takes to make the stupid go away.... I wonder whether this is some qemu stupid. Thanks, tglx ^ permalink raw reply [flat|nested] 21+ messages in thread
[parent not found: <90e02d99-37a2-437e-ad42-44b80c4e94f6@suse.cz>]
[parent not found: <87frrh44mf.ffs@tglx>]
[parent not found: <76c643ee-17d6-463b-8ee1-4e30b0133671@roeck-us.net>]
[parent not found: <87plqjz6aa.ffs@tglx>]
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review [not found] ` <87plqjz6aa.ffs@tglx> @ 2024-08-08 15:53 ` Linus Torvalds 2024-08-08 16:12 ` Thomas Gleixner [not found] ` <cffe30ed-43a3-46ac-ad03-afb7633f17e5@roeck-us.net> 1 sibling, 1 reply; 21+ messages in thread From: Linus Torvalds @ 2024-08-08 15:53 UTC (permalink / raw) To: Thomas Gleixner Cc: Guenter Roeck, Vlastimil Babka, linux-kernel, Linux-MM, Helge Deller, linux-parisc On Thu, 8 Aug 2024 at 02:57, Thomas Gleixner <tglx@linutronix.de> wrote: > > Careful vs. the pr_once(). It's not necessarily the first allocation > which trips up. I removed slab_err() in that condition and just printed > the data: > > [ 0.000000] Order: 1 Size: 384 Nobj: 21 Maxobj: 16 21 Inuse: 14 > [ 0.000000] Order: 0 Size: 168 Nobj: 24 Maxobj: 16 24 Inuse: 1 > [ 0.000000] Order: 1 Size: 320 Nobj: 25 Maxobj: 16 25 Inuse: 18 > [ 0.000000] Order: 1 Size: 320 Nobj: 25 Maxobj: 16 25 Inuse: 19 > [ 0.000000] Order: 1 Size: 320 Nobj: 25 Maxobj: 16 25 Inuse: 20 > [ 0.000000] Order: 0 Size: 160 Nobj: 25 Maxobj: 16 25 Inuse: 5 > [ 0.000000] Order: 2 Size: 672 Nobj: 24 Maxobj: 16 24 Inuse: 1 > [ 0.000000] Order: 3 Size: 1536 Nobj: 21 Maxobj: 16 21 Inuse: 1 > [ 0.000000] Order: 3 Size: 1536 Nobj: 21 Maxobj: 16 21 Inuse: 2 > [ 0.000000] Order: 3 Size: 1536 Nobj: 21 Maxobj: 16 21 Inuse: 10 > > The maxobj column shows the failed result and the result from the second > invocation inside of the printk(). Hmm. There's a few patterns there: - the incorrect Maxobj is always 16, with wildly different sizes. - the correct value is always in that 21-25 range and neither of these are particularly common cases for slab objects (well, at least on x86-64). I actually went into the gcc sources to look at the libgcc routines for the hppa $$divU routine, but apart from checking for trivial powers-of-two and for divisions with small divisor values (<=17), all it is ends up being a series of "ds" (divide step) and "addc" instructions. I don't see how that could possibly mess up. It does end up with the final addc in the delay slot of the return, but that's normal parisc behavior (and here by "normal" I mean "it's a really messed up instruction set that did everything wrong, including branch delay slots") I do note that the $$divU function (which is what this all should use) oddly doesn't show up as defined in 'nm' for me when I look at Guenter's vmlinux file. So there's some odd linker thing going on, and it *only* affects the $$div* functions. Thomas' System.map shows some of the same effects, ie it shows $$divoI (signed integer divide with overflow checking), but doesn't show $$divU that is right after it. The reason I was looking was exactly because this should be using $$divU, and clearly code alignment is implicated somehow, but the exact alignment of $$divU wasn't obvious. But it looks like "$$divU" should be somewhere between $$divoI and $$divl_2, and in Guenter's bad case that's 0000000041218c70 T $$divoI 00000000412190d0 T $$divI_2 so *maybe* $$divU is around a page boundary? 0000000041218xxx turning into 0000000041219000? Some ITLB fill issue together with that delayed branch and a qemu bug? Linus ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review 2024-08-08 15:53 ` Linus Torvalds @ 2024-08-08 16:12 ` Thomas Gleixner 2024-08-08 16:33 ` Linus Torvalds 0 siblings, 1 reply; 21+ messages in thread From: Thomas Gleixner @ 2024-08-08 16:12 UTC (permalink / raw) To: Linus Torvalds Cc: Guenter Roeck, Vlastimil Babka, linux-kernel, Linux-MM, Helge Deller, linux-parisc On Thu, Aug 08 2024 at 08:53, Linus Torvalds wrote: > On Thu, 8 Aug 2024 at 02:57, Thomas Gleixner <tglx@linutronix.de> wrote: > Hmm. There's a few patterns there: > > - the incorrect Maxobj is always 16, with wildly different sizes. Which means that the size value is rounded up to the next power of 2 >> [ 0.000000] Order: 1 Size: 384 Nobj: 21 Maxobj: 16 21 Inuse: 14 8192/16 = 512 >> [ 0.000000] Order: 0 Size: 168 Nobj: 24 Maxobj: 16 24 Inuse: 1 4096/16 = 256 >> [ 0.000000] Order: 3 Size: 1536 Nobj: 21 Maxobj: 16 21 Inuse: 1 32768/16 = 2048 >> The maxobj column shows the failed result and the result from the second >> invocation inside of the printk(). > I actually went into the gcc sources to look at the libgcc routines > for the hppa $$divU routine, but apart from checking for trivial > powers-of-two and for divisions with small divisor values (<=17), all > it is ends up being a series of "ds" (divide step) and "addc" > instructions. I don't see how that could possibly mess up. It does end > up with the final addc in the delay slot of the return, but that's > normal parisc behavior (and here by "normal" I mean "it's a really > messed up instruction set that did everything wrong, including branch > delay slots") > > I do note that the $$divU function (which is what this all should use) > oddly doesn't show up as defined in 'nm' for me when I look at > Guenter's vmlinux file. So there's some odd linker thing going on, and > it *only* affects the $$div* functions. > > Thomas' System.map shows some of the same effects, ie it shows $$divoI > (signed integer divide with overflow checking), but doesn't show > $$divU that is right after it. The reason I was looking was exactly > because this should be using $$divU, and clearly code alignment is > implicated somehow, but the exact alignment of $$divU wasn't obvious. > > But it looks like "$$divU" should be somewhere between $$divoI and > $$divl_2, and in Guenter's bad case that's > > 0000000041218c70 T $$divoI > 00000000412190d0 T $$divI_2 > > so *maybe* $$divU is around a page boundary? 0000000041218xxx turning > into 0000000041219000? It uses $$divU which is at $$divoI + 0x250. I validated that in the disassembly. Thanks, tglx ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review 2024-08-08 16:12 ` Thomas Gleixner @ 2024-08-08 16:33 ` Linus Torvalds 2024-08-08 17:48 ` Thomas Gleixner 0 siblings, 1 reply; 21+ messages in thread From: Linus Torvalds @ 2024-08-08 16:33 UTC (permalink / raw) To: Thomas Gleixner Cc: Guenter Roeck, Vlastimil Babka, linux-kernel, Linux-MM, Helge Deller, linux-parisc On Thu, 8 Aug 2024 at 09:12, Thomas Gleixner <tglx@linutronix.de> wrote: > > > > But it looks like "$$divU" should be somewhere between $$divoI and > > $$divl_2, and in Guenter's bad case that's > > > > 0000000041218c70 T $$divoI > > 00000000412190d0 T $$divI_2 > > > > so *maybe* $$divU is around a page boundary? 0000000041218xxx turning > > into 0000000041219000? > > It uses $$divU which is at $$divoI + 0x250. I validated that in the > disassembly. Well, that does support "maybe we have a page crosser issue", but it's not quite at the delayed branch. Because that would mean that $$divU starts at 0x41218ec0, and that means that there are 80 instructions from the start of $$divU to the end of that 0x41218xxx page. And if I counted instructions right (I don't have a disassembler, so I'm just looking at the libgcc sources), that puts the page crosser not quite at the delayed branch slot, but it does put it somewhere roughly at or around ds temp,arg1,temp /* 29th divide step */ addc retreg,retreg,retreg /* shift retreg with/into carry */ so it's around the last few bits of the result. The ones we get wrong. Which is intriguing, but honestly, I don't see how we could get itlb misses horribly wrong and not crash left and right. The $$divU routine is unusual in that it uses that millicode calling convention, but I think that's just a different register for the return address. And it obviously depends on the carry flag, which is pretty unusual. Maybe if the ITLB fill messes up C, it wouldn't show up in other areas? But the $$divU result error is more than one single bit getting cleared. Linus ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review 2024-08-08 16:33 ` Linus Torvalds @ 2024-08-08 17:48 ` Thomas Gleixner 2024-08-08 18:19 ` Linus Torvalds 0 siblings, 1 reply; 21+ messages in thread From: Thomas Gleixner @ 2024-08-08 17:48 UTC (permalink / raw) To: Linus Torvalds Cc: Guenter Roeck, Vlastimil Babka, linux-kernel, Linux-MM, Helge Deller, linux-parisc On Thu, Aug 08 2024 at 09:33, Linus Torvalds wrote: > On Thu, 8 Aug 2024 at 09:12, Thomas Gleixner <tglx@linutronix.de> wrote: >> It uses $$divU which is at $$divoI + 0x250. I validated that in the >> disassembly. > > Well, that does support "maybe we have a page crosser issue", but it's > not quite at the delayed branch. > > Because that would mean that $$divU starts at 0x41218ec0, and that > means that there are 80 instructions from the start of $$divU to the > end of that 0x41218xxx page. > > And if I counted instructions right (I don't have a disassembler, so > I'm just looking at the libgcc sources), that puts the page crosser > not quite at the delayed branch slot, but it does put it somewhere > roughly at or around > > ds temp,arg1,temp /* 29th divide step */ > addc retreg,retreg,retreg /* shift retreg with/into carry */ > > so it's around the last few bits of the result. The ones we get wrong. > > Which is intriguing, but honestly, I don't see how we could get itlb > misses horribly wrong and not crash left and right. Here is the disassembly from my latest crashing debug kernel which shifts it up a couple of pages. Add 0x10 or sub 0x20 to make it work. 4121dec0: 37 21 3f ff ldo -1(r25),r1 4121dec4: 08 39 22 00 and,= r25,r1,r0 4121dec8: e8 00 00 88 b,l 4121df14 <$$divoI+0x2a4>,r0 4121decc: b3 20 20 00 addi,tc,= 0,r25,r0 4121ded0: 08 1a 02 5d copy r26,ret1 4121ded4: d3 21 39 f0 extrw,u,= r25,15,16,r1 4121ded8: d3 bd 19 f0 extrw,u ret1,15,16,ret1 4121dedc: 08 39 02 59 or r25,r1,r25 4121dee0: 34 1a 01 98 ldi cc,r26 4121dee4: d3 21 3a f8 extrw,u,= r25,23,8,r1 4121dee8: d3 bd 1a e8 extrw,u ret1,23,24,ret1 4121deec: 08 39 02 59 or r25,r1,r25 4121def0: 34 01 01 54 ldi aa,r1 4121def4: d3 20 3b 7c extrw,u,= r25,27,4,r0 4121def8: d3 bd 1b 64 extrw,u ret1,27,28,ret1 4121defc: 0b 59 22 00 and,= r25,r26,r0 4121df00: d3 bd 1b a2 extrw,u ret1,29,30,ret1 4121df04: 08 39 22 00 and,= r25,r1,r0 4121df08: d3 bd 1b c1 extrw,u ret1,30,31,ret1 4121df0c: e8 40 c0 02 bv,n r0(rp) 4121df10: 08 00 02 40 nop 4121df18: 97 21 00 00 subi 0,r25,r1 4121df1c: 08 20 04 40 ds r0,r1,r0 4121df20: 0b 5a 06 1d add r26,r26,ret1 4121df24: 0b 20 04 41 ds r0,r25,r1 4121df28: 0b bd 07 1d add,c ret1,ret1,ret1 4121df2c: 0b 21 04 41 ds r1,r25,r1 4121df30: 0b bd 07 1d add,c ret1,ret1,ret1 4121df34: 0b 21 04 41 ds r1,r25,r1 4121df38: 0b bd 07 1d add,c ret1,ret1,ret1 4121df3c: 0b 21 04 41 ds r1,r25,r1 4121df40: 0b bd 07 1d add,c ret1,ret1,ret1 4121df44: 0b 21 04 41 ds r1,r25,r1 4121df48: 0b bd 07 1d add,c ret1,ret1,ret1 4121df4c: 0b 21 04 41 ds r1,r25,r1 4121df50: 0b bd 07 1d add,c ret1,ret1,ret1 4121df54: 0b 21 04 41 ds r1,r25,r1 4121df58: 0b bd 07 1d add,c ret1,ret1,ret1 4121df5c: 0b 21 04 41 ds r1,r25,r1 4121df60: 0b bd 07 1d add,c ret1,ret1,ret1 4121df64: 0b 21 04 41 ds r1,r25,r1 4121df68: 0b bd 07 1d add,c ret1,ret1,ret1 4121df6c: 0b 21 04 41 ds r1,r25,r1 4121df70: 0b bd 07 1d add,c ret1,ret1,ret1 4121df74: 0b 21 04 41 ds r1,r25,r1 4121df78: 0b bd 07 1d add,c ret1,ret1,ret1 4121df7c: 0b 21 04 41 ds r1,r25,r1 4121df80: 0b bd 07 1d add,c ret1,ret1,ret1 4121df84: 0b 21 04 41 ds r1,r25,r1 4121df88: 0b bd 07 1d add,c ret1,ret1,ret1 4121df8c: 0b 21 04 41 ds r1,r25,r1 4121df90: 0b bd 07 1d add,c ret1,ret1,ret1 4121df94: 0b 21 04 41 ds r1,r25,r1 4121df98: 0b bd 07 1d add,c ret1,ret1,ret1 4121df9c: 0b 21 04 41 ds r1,r25,r1 4121dfa0: 0b bd 07 1d add,c ret1,ret1,ret1 4121dfa4: 0b 21 04 41 ds r1,r25,r1 4121dfa8: 0b bd 07 1d add,c ret1,ret1,ret1 4121dfac: 0b 21 04 41 ds r1,r25,r1 4121dfb0: 0b bd 07 1d add,c ret1,ret1,ret1 4121dfb4: 0b 21 04 41 ds r1,r25,r1 4121dfb8: 0b bd 07 1d add,c ret1,ret1,ret1 4121dfbc: 0b 21 04 41 ds r1,r25,r1 4121dfc0: 0b bd 07 1d add,c ret1,ret1,ret1 4121dfc4: 0b 21 04 41 ds r1,r25,r1 4121dfc8: 0b bd 07 1d add,c ret1,ret1,ret1 4121dfcc: 0b 21 04 41 ds r1,r25,r1 4121dfd0: 0b bd 07 1d add,c ret1,ret1,ret1 4121dfd4: 0b 21 04 41 ds r1,r25,r1 4121dfd8: 0b bd 07 1d add,c ret1,ret1,ret1 4121dfdc: 0b 21 04 41 ds r1,r25,r1 4121dfe0: 0b bd 07 1d add,c ret1,ret1,ret1 4121dfe4: 0b 21 04 41 ds r1,r25,r1 4121dfe8: 0b bd 07 1d add,c ret1,ret1,ret1 4121dfec: 0b 21 04 41 ds r1,r25,r1 4121dff0: 0b bd 07 1d add,c ret1,ret1,ret1 4121dff4: 0b 21 04 41 ds r1,r25,r1 4121dff8: 0b bd 07 1d add,c ret1,ret1,ret1 4121dffc: 0b 21 04 41 ds r1,r25,r1 4121e000: 0b bd 07 1d add,c ret1,ret1,ret1 4121e004: 0b 21 04 41 ds r1,r25,r1 4121e008: 0b bd 07 1d add,c ret1,ret1,ret1 4121e00c: 0b 21 04 41 ds r1,r25,r1 4121e010: 0b bd 07 1d add,c ret1,ret1,ret1 4121e014: 0b 21 04 41 ds r1,r25,r1 4121e018: 0b bd 07 1d add,c ret1,ret1,ret1 4121e01c: 0b 21 04 41 ds r1,r25,r1 4121e020: e8 40 c0 00 bv r0(rp) 4121e024: 0b bd 07 1d add,c ret1,ret1,ret1 4121e028: f3 20 0c 00 depd,* r0,31,32,r25 4121e02c: 8f 20 61 10 cmpib,> 0,r25,4121e0bc <$$divoI+0x44c> 4121e030: 08 00 02 40 nop 4121e034: e8 19 40 00 blr r25,r0 4121e038: 08 00 02 40 nop 4121e03c: b3 20 20 00 addi,tc,= 0,r25,r0 4121e040: 08 00 02 40 nop 4121e044: e8 40 c0 00 bv r0(rp) 4121e048: 08 1a 02 5d copy r26,ret1 4121e04c: e8 40 c0 00 bv r0(rp) 4121e050: d3 5d 1b c1 extrw,u r26,30,31,ret1 4121e054: e8 00 01 c2 b,l,n 4121e13c <$$divI_16+0x3c>,r0 4121e058: 08 00 02 40 nop 4121e05c: e8 40 c0 00 bv r0(rp) 4121e060: d3 5d 1b a2 extrw,u r26,29,30,ret1 4121e064: e8 00 02 2a b,l,n 4121e180 <$$divI_16+0x80>,r0 4121e068: 08 00 02 40 nop 4121e06c: e8 00 02 aa b,l,n 4121e1c8 <$$divI_16+0xc8>,r0 4121e070: 08 00 02 40 nop 4121e074: e8 00 06 9a b,l,n 4121e3c8 <$$divU_17+0xbc>,r0 4121e078: 08 00 02 40 nop 4121e07c: e8 40 c0 00 bv r0(rp) 4121e080: d3 5d 1b 83 extrw,u r26,28,29,ret1 4121e084: e8 00 07 12 b,l,n 4121e414 <$$divU_17+0x108>,r0 4121e088: 08 00 02 40 nop 4121e08c: e8 00 02 9a b,l,n 4121e1e0 <$$divI_16+0xe0>,r0 4121e090: 08 00 02 40 nop 4121e094: e8 1f 1d 0d b,l 4121df20 <$$divoI+0x2b0>,r0 4121e098: 08 20 04 40 ds r0,r1,r0 4121e09c: e8 00 03 fa b,l,n 4121e2a0 <$$divI_16+0x1a0>,r0 4121e0a0: 08 00 02 40 nop 4121e0a4: e8 1f 1c ed b,l 4121df20 <$$divoI+0x2b0>,r0 4121e0a8: 08 20 04 40 ds r0,r1,r0 4121e0ac: e8 00 07 02 b,l,n 4121e434 <$$divU_17+0x128>,r0 4121e0b0: 08 00 02 40 nop 4121e0b4: e8 00 04 22 b,l,n 4121e2cc <$$divI_16+0x1cc>,r0 4121e0b8: 08 00 02 40 nop 4121e0bc: 0b 3a 04 00 sub r26,r25,r0 4121e0c0: e8 40 c0 00 bv r0(rp) 4121e0c4: 08 00 07 1d add,c r0,r0,ret1 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review 2024-08-08 17:48 ` Thomas Gleixner @ 2024-08-08 18:19 ` Linus Torvalds 2024-08-08 20:52 ` Guenter Roeck 2024-09-03 7:54 ` Helge Deller 0 siblings, 2 replies; 21+ messages in thread From: Linus Torvalds @ 2024-08-08 18:19 UTC (permalink / raw) To: Thomas Gleixner Cc: Guenter Roeck, Vlastimil Babka, linux-kernel, Linux-MM, Helge Deller, linux-parisc On Thu, 8 Aug 2024 at 10:48, Thomas Gleixner <tglx@linutronix.de> wrote: > > Here is the disassembly from my latest crashing debug kernel which > shifts it up a couple of pages. Add 0x10 or sub 0x20 to make it work. Looks like I was off by an instruction, it's the 28th divide-step (not 29) that does the page crosser: > 4121dffc: 0b 21 04 41 ds r1,r25,r1 > 4121e000: 0b bd 07 1d add,c ret1,ret1,ret1 but my parisc knowledge is not good enough to even guess at what could go wrong. And I have no actual reason to believe this has *anything* to do with an itlb miss, except for that whole "exact placement seems to matter, and it crosses a page boundary" detail. None of this makes sense. I think we'll have to wait for Helge. It's not like parisc is a huge concern, and for all we know this is all a qemu bug to begin with. Linus ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review 2024-08-08 18:19 ` Linus Torvalds @ 2024-08-08 20:52 ` Guenter Roeck 2024-08-08 21:50 ` John David Anglin 2024-08-08 22:15 ` Richard Henderson 2024-09-03 7:54 ` Helge Deller 1 sibling, 2 replies; 21+ messages in thread From: Guenter Roeck @ 2024-08-08 20:52 UTC (permalink / raw) To: Linus Torvalds, Thomas Gleixner Cc: Vlastimil Babka, linux-kernel, Linux-MM, Helge Deller, linux-parisc, Richard Henderson On 8/8/24 11:19, Linus Torvalds wrote: > On Thu, 8 Aug 2024 at 10:48, Thomas Gleixner <tglx@linutronix.de> wrote: >> >> Here is the disassembly from my latest crashing debug kernel which >> shifts it up a couple of pages. Add 0x10 or sub 0x20 to make it work. > > Looks like I was off by an instruction, it's the 28th divide-step (not > 29) that does the page crosser: > >> 4121dffc: 0b 21 04 41 ds r1,r25,r1 >> 4121e000: 0b bd 07 1d add,c ret1,ret1,ret1 > > but my parisc knowledge is not good enough to even guess at what could go wrong. > > And I have no actual reason to believe this has *anything* to do with > an itlb miss, except for that whole "exact placement seems to matter, > and it crosses a page boundary" detail. > > None of this makes sense. I think we'll have to wait for Helge. It's > not like parisc is a huge concern, and for all we know this is all a > qemu bug to begin with. > Copying Richard Henderson who recently made a number of changes to the parisc/hppa qemu implementation (which unfortunately didn't fix the problem). Guenter ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review 2024-08-08 20:52 ` Guenter Roeck @ 2024-08-08 21:50 ` John David Anglin 2024-08-08 22:29 ` John David Anglin 2024-08-09 0:50 ` Guenter Roeck 2024-08-08 22:15 ` Richard Henderson 1 sibling, 2 replies; 21+ messages in thread From: John David Anglin @ 2024-08-08 21:50 UTC (permalink / raw) To: Guenter Roeck, Linus Torvalds, Thomas Gleixner Cc: Vlastimil Babka, linux-kernel, Linux-MM, Helge Deller, linux-parisc, Richard Henderson On 2024-08-08 4:52 p.m., Guenter Roeck wrote: > On 8/8/24 11:19, Linus Torvalds wrote: >> On Thu, 8 Aug 2024 at 10:48, Thomas Gleixner <tglx@linutronix.de> wrote: >>> >>> Here is the disassembly from my latest crashing debug kernel which >>> shifts it up a couple of pages. Add 0x10 or sub 0x20 to make it work. >> >> Looks like I was off by an instruction, it's the 28th divide-step (not >> 29) that does the page crosser: >> >>> 4121dffc: 0b 21 04 41 ds r1,r25,r1 >>> 4121e000: 0b bd 07 1d add,c ret1,ret1,ret1 I think this macro might clobber the C/B bits on a ITLB missing: /* This is for ILP32 PA2.0 only. The TLB insertion needs * to extend into I/O space if the address is 0xfXXXXXXX * so we extend the f's into the top word of the pte in * this case */ .macro f_extend pte,tmp extrd,s \pte,42,4,\tmp addi,<> 1,\tmp,%r0 extrd,s \pte,63,25,\pte .endm The addi instruction affects the C/B bits. However, it is only used for 32-bit PA 2.0 kernels. A second tmp register would be needed to change the addi to an add logical. The mode likely problem is the shladd instruction in the following macro in entry.S: .macro L2_ptep pmd,pte,index,va,fault #if CONFIG_PGTABLE_LEVELS == 3 extru_safe \va,31-ASM_PMD_SHIFT,ASM_BITS_PER_PMD,\index #else extru_safe \va,31-ASM_PGDIR_SHIFT,ASM_BITS_PER_PGD,\index #endif dep %r0,31,PAGE_SHIFT,\pmd /* clear offset */ #if CONFIG_PGTABLE_LEVELS < 3 copy %r0,\pte #endif ldw,s \index(\pmd),\pmd bb,>=,n \pmd,_PxD_PRESENT_BIT,\fault dep %r0,31,PxD_FLAG_SHIFT,\pmd /* clear flags */ SHLREG \pmd,PxD_VALUE_SHIFT,\pmd extru_safe \va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index dep %r0,31,PAGE_SHIFT,\pmd /* clear offset */ shladd \index,BITS_PER_PTE_ENTRY,\pmd,\pmd /* pmd is now pte */ .endm I believe the shladd instruction should be changed to shladd,l (shift left and add logical). Dave -- John David Anglin dave.anglin@bell.net ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review 2024-08-08 21:50 ` John David Anglin @ 2024-08-08 22:29 ` John David Anglin 2024-08-08 23:33 ` Linus Torvalds 2024-08-09 0:56 ` Guenter Roeck 2024-08-09 0:50 ` Guenter Roeck 1 sibling, 2 replies; 21+ messages in thread From: John David Anglin @ 2024-08-08 22:29 UTC (permalink / raw) To: Guenter Roeck, Linus Torvalds, Thomas Gleixner Cc: Vlastimil Babka, linux-kernel, Linux-MM, Helge Deller, linux-parisc, Richard Henderson On 2024-08-08 5:50 p.m., John David Anglin wrote: > The mode likely problem is the shladd instruction in the following macro in entry.S: > > .macro L2_ptep pmd,pte,index,va,fault > #if CONFIG_PGTABLE_LEVELS == 3 > extru_safe \va,31-ASM_PMD_SHIFT,ASM_BITS_PER_PMD,\index > #else > extru_safe \va,31-ASM_PGDIR_SHIFT,ASM_BITS_PER_PGD,\index > #endif > dep %r0,31,PAGE_SHIFT,\pmd /* clear offset */ > #if CONFIG_PGTABLE_LEVELS < 3 > copy %r0,\pte > #endif > ldw,s \index(\pmd),\pmd > bb,>=,n \pmd,_PxD_PRESENT_BIT,\fault > dep %r0,31,PxD_FLAG_SHIFT,\pmd /* clear flags */ > SHLREG \pmd,PxD_VALUE_SHIFT,\pmd > extru_safe \va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index > dep %r0,31,PAGE_SHIFT,\pmd /* clear offset */ > shladd \index,BITS_PER_PTE_ENTRY,\pmd,\pmd /* pmd is now pte */ > .endm > > I believe the shladd instruction should be changed to shladd,l (shift left and add logical). diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S index ab23e61a6f01..1ec60406f841 100644 --- a/arch/parisc/kernel/entry.S +++ b/arch/parisc/kernel/entry.S @@ -399,7 +399,7 @@ SHLREG \pmd,PxD_VALUE_SHIFT,\pmd extru_safe \va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index dep %r0,31,PAGE_SHIFT,\pmd /* clear offset */ - shladd \index,BITS_PER_PTE_ENTRY,\pmd,\pmd /* pmd is now pte */ + shladd,l \index,BITS_PER_PTE_ENTRY,\pmd,\pmd /* pmd is now pte */ .endm /* Look up PTE in a 3-Level scheme. */ Boots okay. Fixing the addi instruction is harder and it would take some time to test. Dave -- John David Anglin dave.anglin@bell.net ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review 2024-08-08 22:29 ` John David Anglin @ 2024-08-08 23:33 ` Linus Torvalds 2024-08-09 0:33 ` John David Anglin 2024-08-09 0:56 ` Guenter Roeck 1 sibling, 1 reply; 21+ messages in thread From: Linus Torvalds @ 2024-08-08 23:33 UTC (permalink / raw) To: John David Anglin Cc: Guenter Roeck, Thomas Gleixner, Vlastimil Babka, linux-kernel, Linux-MM, Helge Deller, linux-parisc, Richard Henderson On Thu, 8 Aug 2024 at 15:30, John David Anglin <dave.anglin@bell.net> wrote: > > > I believe the shladd instruction should be changed to shladd,l (shift left and add logical). > > diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S > index ab23e61a6f01..1ec60406f841 100644 > --- a/arch/parisc/kernel/entry.S > +++ b/arch/parisc/kernel/entry.S > @@ -399,7 +399,7 @@ > - shladd \index,BITS_PER_PTE_ENTRY,\pmd,\pmd /* pmd is now pte */ > + shladd,l \index,BITS_PER_PTE_ENTRY,\pmd,\pmd /* pmd is now pte */ This doesn't seem wrong, but doesn't RFIR already restore the status word? So even if the itlb fill modifies C/B, I don't see why that should actually matter. But again, parisc is very much not one of the architectures I've ever worked with, so.. Linus ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review 2024-08-08 23:33 ` Linus Torvalds @ 2024-08-09 0:33 ` John David Anglin 0 siblings, 0 replies; 21+ messages in thread From: John David Anglin @ 2024-08-09 0:33 UTC (permalink / raw) To: Linus Torvalds Cc: Guenter Roeck, Thomas Gleixner, Vlastimil Babka, linux-kernel, Linux-MM, Helge Deller, linux-parisc, Richard Henderson On 2024-08-08 7:33 p.m., Linus Torvalds wrote: > On Thu, 8 Aug 2024 at 15:30, John David Anglin <dave.anglin@bell.net> wrote: >>> I believe the shladd instruction should be changed to shladd,l (shift left and add logical). >> diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S >> index ab23e61a6f01..1ec60406f841 100644 >> --- a/arch/parisc/kernel/entry.S >> +++ b/arch/parisc/kernel/entry.S >> @@ -399,7 +399,7 @@ >> - shladd \index,BITS_PER_PTE_ENTRY,\pmd,\pmd /* pmd is now pte */ >> + shladd,l \index,BITS_PER_PTE_ENTRY,\pmd,\pmd /* pmd is now pte */ > This doesn't seem wrong, but doesn't RFIR already restore the status word? Yes. > > So even if the itlb fill modifies C/B, I don't see why that should > actually matter. Probably won't help unless there's a bug in qemu rfir emulation. Dave -- John David Anglin dave.anglin@bell.net ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review 2024-08-08 22:29 ` John David Anglin 2024-08-08 23:33 ` Linus Torvalds @ 2024-08-09 0:56 ` Guenter Roeck 1 sibling, 0 replies; 21+ messages in thread From: Guenter Roeck @ 2024-08-09 0:56 UTC (permalink / raw) To: John David Anglin, Linus Torvalds, Thomas Gleixner Cc: Vlastimil Babka, linux-kernel, Linux-MM, Helge Deller, linux-parisc, Richard Henderson On 8/8/24 15:29, John David Anglin wrote: > On 2024-08-08 5:50 p.m., John David Anglin wrote: >> The mode likely problem is the shladd instruction in the following macro in entry.S: >> >> .macro L2_ptep pmd,pte,index,va,fault >> #if CONFIG_PGTABLE_LEVELS == 3 >> extru_safe \va,31-ASM_PMD_SHIFT,ASM_BITS_PER_PMD,\index >> #else >> extru_safe \va,31-ASM_PGDIR_SHIFT,ASM_BITS_PER_PGD,\index >> #endif >> dep %r0,31,PAGE_SHIFT,\pmd /* clear offset */ >> #if CONFIG_PGTABLE_LEVELS < 3 >> copy %r0,\pte >> #endif >> ldw,s \index(\pmd),\pmd >> bb,>=,n \pmd,_PxD_PRESENT_BIT,\fault >> dep %r0,31,PxD_FLAG_SHIFT,\pmd /* clear flags */ >> SHLREG \pmd,PxD_VALUE_SHIFT,\pmd >> extru_safe \va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index >> dep %r0,31,PAGE_SHIFT,\pmd /* clear offset */ >> shladd \index,BITS_PER_PTE_ENTRY,\pmd,\pmd /* pmd is now pte */ >> .endm >> >> I believe the shladd instruction should be changed to shladd,l (shift left and add logical). > diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S > index ab23e61a6f01..1ec60406f841 100644 > --- a/arch/parisc/kernel/entry.S > +++ b/arch/parisc/kernel/entry.S > @@ -399,7 +399,7 @@ > SHLREG \pmd,PxD_VALUE_SHIFT,\pmd > extru_safe \va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index > dep %r0,31,PAGE_SHIFT,\pmd /* clear offset */ > - shladd \index,BITS_PER_PTE_ENTRY,\pmd,\pmd /* pmd is now pte */ > + shladd,l \index,BITS_PER_PTE_ENTRY,\pmd,\pmd /* pmd is now pte */ > .endm > > /* Look up PTE in a 3-Level scheme. */ > > Boots okay. Fixing the addi instruction is harder and it would take some time to test. > Odd, it doesn't help for me. Does it crash for you without the above change ? Or, in other words, is divI at the objecting location ? Guenter ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review 2024-08-08 21:50 ` John David Anglin 2024-08-08 22:29 ` John David Anglin @ 2024-08-09 0:50 ` Guenter Roeck 1 sibling, 0 replies; 21+ messages in thread From: Guenter Roeck @ 2024-08-09 0:50 UTC (permalink / raw) To: John David Anglin, Linus Torvalds, Thomas Gleixner Cc: Vlastimil Babka, linux-kernel, Linux-MM, Helge Deller, linux-parisc, Richard Henderson On 8/8/24 14:50, John David Anglin wrote: > On 2024-08-08 4:52 p.m., Guenter Roeck wrote: >> On 8/8/24 11:19, Linus Torvalds wrote: >>> On Thu, 8 Aug 2024 at 10:48, Thomas Gleixner <tglx@linutronix.de> wrote: >>>> >>>> Here is the disassembly from my latest crashing debug kernel which >>>> shifts it up a couple of pages. Add 0x10 or sub 0x20 to make it work. >>> >>> Looks like I was off by an instruction, it's the 28th divide-step (not >>> 29) that does the page crosser: >>> >>>> 4121dffc: 0b 21 04 41 ds r1,r25,r1 >>>> 4121e000: 0b bd 07 1d add,c ret1,ret1,ret1 > I think this macro might clobber the C/B bits on a ITLB missing: > > /* This is for ILP32 PA2.0 only. The TLB insertion needs > * to extend into I/O space if the address is 0xfXXXXXXX > * so we extend the f's into the top word of the pte in > * this case */ > .macro f_extend pte,tmp > extrd,s \pte,42,4,\tmp > addi,<> 1,\tmp,%r0 > extrd,s \pte,63,25,\pte > .endm > > The addi instruction affects the C/B bits. However, it is only used for 32-bit PA 2.0 kernels. > A second tmp register would be needed to change the addi to an add logical. > > The mode likely problem is the shladd instruction in the following macro in entry.S: > > .macro L2_ptep pmd,pte,index,va,fault > #if CONFIG_PGTABLE_LEVELS == 3 > extru_safe \va,31-ASM_PMD_SHIFT,ASM_BITS_PER_PMD,\index > #else > extru_safe \va,31-ASM_PGDIR_SHIFT,ASM_BITS_PER_PGD,\index > #endif > dep %r0,31,PAGE_SHIFT,\pmd /* clear offset */ > #if CONFIG_PGTABLE_LEVELS < 3 > copy %r0,\pte > #endif > ldw,s \index(\pmd),\pmd > bb,>=,n \pmd,_PxD_PRESENT_BIT,\fault > dep %r0,31,PxD_FLAG_SHIFT,\pmd /* clear flags */ > SHLREG \pmd,PxD_VALUE_SHIFT,\pmd > extru_safe \va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index > dep %r0,31,PAGE_SHIFT,\pmd /* clear offset */ > shladd \index,BITS_PER_PTE_ENTRY,\pmd,\pmd /* pmd is now pte */ > .endm > > I believe the shladd instruction should be changed to shladd,l (shift left and add logical). > That doesn't help, at least not in qemu. Guenter ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review 2024-08-08 20:52 ` Guenter Roeck 2024-08-08 21:50 ` John David Anglin @ 2024-08-08 22:15 ` Richard Henderson 1 sibling, 0 replies; 21+ messages in thread From: Richard Henderson @ 2024-08-08 22:15 UTC (permalink / raw) To: Guenter Roeck, Linus Torvalds, Thomas Gleixner Cc: Vlastimil Babka, linux-kernel, Linux-MM, Helge Deller, linux-parisc On 8/9/24 06:52, Guenter Roeck wrote: > On 8/8/24 11:19, Linus Torvalds wrote: >> On Thu, 8 Aug 2024 at 10:48, Thomas Gleixner <tglx@linutronix.de> wrote: >>> >>> Here is the disassembly from my latest crashing debug kernel which >>> shifts it up a couple of pages. Add 0x10 or sub 0x20 to make it work. >> >> Looks like I was off by an instruction, it's the 28th divide-step (not >> 29) that does the page crosser: >> >>> 4121dffc: 0b 21 04 41 ds r1,r25,r1 >>> 4121e000: 0b bd 07 1d add,c ret1,ret1,ret1 >> >> but my parisc knowledge is not good enough to even guess at what could go wrong. >> >> And I have no actual reason to believe this has *anything* to do with >> an itlb miss, except for that whole "exact placement seems to matter, >> and it crosses a page boundary" detail. >> >> None of this makes sense. I think we'll have to wait for Helge. It's >> not like parisc is a huge concern, and for all we know this is all a >> qemu bug to begin with. >> > > Copying Richard Henderson who recently made a number of changes to the > parisc/hppa qemu implementation (which unfortunately didn't fix the problem). Wow, that's quite the agile bug you've got there. You can eliminate one class of qemu bug by attempting to reproduce in qemu-linux-user: arrange for the page crossing at the appropriate spot and see if the split between two translation blocks causes carry flag weirdness. If that doesn't reproduce, then I'd be likely to blame something in the exception delivery or return process. Still could be a qemu problem, but it would be something in the system emulation of the exception path. It should be possible to write a small system mode test case for this hypothesis. Ideally the itlb miss handler would be as simple a possible, e.g. computing an identity mapping rather than using real page tables. Only after that would I start digging into the linux kernel's exception paths. r~ ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review 2024-08-08 18:19 ` Linus Torvalds 2024-08-08 20:52 ` Guenter Roeck @ 2024-09-03 7:54 ` Helge Deller 2024-09-03 14:13 ` Guenter Roeck 2024-09-03 18:43 ` Linus Torvalds 1 sibling, 2 replies; 21+ messages in thread From: Helge Deller @ 2024-09-03 7:54 UTC (permalink / raw) To: Linus Torvalds, Richard Henderson, Guenter Roeck Cc: Vlastimil Babka, linux-kernel, Linux-MM, linux-parisc, Thomas Gleixner On 8/8/24 20:19, Linus Torvalds wrote: > On Thu, 8 Aug 2024 at 10:48, Thomas Gleixner <tglx@linutronix.de> wrote: >> >> Here is the disassembly from my latest crashing debug kernel which >> shifts it up a couple of pages. Add 0x10 or sub 0x20 to make it work. > > Looks like I was off by an instruction, it's the 28th divide-step (not > 29) that does the page crosser: > >> 4121dffc: 0b 21 04 41 ds r1,r25,r1 >> 4121e000: 0b bd 07 1d add,c ret1,ret1,ret1 > > but my parisc knowledge is not good enough to even guess at what could go wrong. > > And I have no actual reason to believe this has *anything* to do with > an itlb miss, except for that whole "exact placement seems to matter, > and it crosses a page boundary" detail. Well, you were on the right track :-) Guenters kernel from http://server.roeck-us.net/qemu/parisc64-6.10.3/ boots nicely on my physical C3700 machine, but crashes with Qemu. So, it's not some bug in the kernel ITLB miss handler or other Linux kernel code. Instead it's a Qemu bug, which gets triggered by the page boundary crossing of: 41218ffc: 0b 21 04 41 ds r1,r25,r1 41219000: 0b bd 07 1d add,c ret1,ret1,ret1 During the ITLB miss, the carry bits and the PSW-V-bit (from the divide step) are saved in the IPSW register and restored upon irq return. During packaging the bits there is a qemu coding bug, where we missed to handle the PSW-V-bit as 32-bit value even on a 64-bit CPU. The (copy&pasted) patch below fixes the crash for me. Helge diff --git a/target/hppa/helper.c b/target/hppa/helper.c index b79ddd8184..d4b1a3cd5a 100644 --- a/target/hppa/helper.c +++ b/target/hppa/helper.c @@ -53,7 +53,7 @@ target_ulong cpu_hppa_get_psw(CPUHPPAState *env) } psw |= env->psw_n * PSW_N; - psw |= (env->psw_v < 0) * PSW_V; + psw |= ((env->psw_v >> 31) & 1) * PSW_V; psw |= env->psw | env->psw_xb; return psw; ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review 2024-09-03 7:54 ` Helge Deller @ 2024-09-03 14:13 ` Guenter Roeck 2024-09-03 18:43 ` Linus Torvalds 1 sibling, 0 replies; 21+ messages in thread From: Guenter Roeck @ 2024-09-03 14:13 UTC (permalink / raw) To: Helge Deller Cc: Linus Torvalds, Richard Henderson, Vlastimil Babka, linux-kernel, Linux-MM, linux-parisc, Thomas Gleixner On Tue, Sep 03, 2024 at 09:54:19AM +0200, Helge Deller wrote: > On 8/8/24 20:19, Linus Torvalds wrote: > > On Thu, 8 Aug 2024 at 10:48, Thomas Gleixner <tglx@linutronix.de> wrote: > > > > > > Here is the disassembly from my latest crashing debug kernel which > > > shifts it up a couple of pages. Add 0x10 or sub 0x20 to make it work. > > > > Looks like I was off by an instruction, it's the 28th divide-step (not > > 29) that does the page crosser: > > > > > 4121dffc: 0b 21 04 41 ds r1,r25,r1 > > > 4121e000: 0b bd 07 1d add,c ret1,ret1,ret1 > > > > but my parisc knowledge is not good enough to even guess at what could go wrong. > > > > And I have no actual reason to believe this has *anything* to do with > > an itlb miss, except for that whole "exact placement seems to matter, > > and it crosses a page boundary" detail. > > Well, you were on the right track :-) > > Guenters kernel from > http://server.roeck-us.net/qemu/parisc64-6.10.3/ > boots nicely on my physical C3700 machine, but crashes with Qemu. > > So, it's not some bug in the kernel ITLB miss handler or other > Linux kernel code. > > Instead it's a Qemu bug, which gets triggered by the page > boundary crossing of: > 41218ffc: 0b 21 04 41 ds r1,r25,r1 > 41219000: 0b bd 07 1d add,c ret1,ret1,ret1 > > During the ITLB miss, the carry bits and the PSW-V-bit > (from the divide step) are saved in the IPSW register and restored > upon irq return. > > During packaging the bits there is a qemu coding bug, where we missed > to handle the PSW-V-bit as 32-bit value even on a 64-bit CPU. > The (copy&pasted) patch below fixes the crash for me. > Yes, that works for me as well. Thanks a lot for the fix! Guenter > Helge > > diff --git a/target/hppa/helper.c b/target/hppa/helper.c > index b79ddd8184..d4b1a3cd5a 100644 > --- a/target/hppa/helper.c > +++ b/target/hppa/helper.c > @@ -53,7 +53,7 @@ target_ulong cpu_hppa_get_psw(CPUHPPAState *env) > } > > psw |= env->psw_n * PSW_N; > - psw |= (env->psw_v < 0) * PSW_V; > + psw |= ((env->psw_v >> 31) & 1) * PSW_V; > psw |= env->psw | env->psw_xb; > > return psw; > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review 2024-09-03 7:54 ` Helge Deller 2024-09-03 14:13 ` Guenter Roeck @ 2024-09-03 18:43 ` Linus Torvalds 1 sibling, 0 replies; 21+ messages in thread From: Linus Torvalds @ 2024-09-03 18:43 UTC (permalink / raw) To: Helge Deller Cc: Richard Henderson, Guenter Roeck, Vlastimil Babka, linux-kernel, Linux-MM, linux-parisc, Thomas Gleixner On Tue, 3 Sept 2024 at 00:54, Helge Deller <deller@gmx.de> wrote: > > During packaging the bits there is a qemu coding bug, where we missed > to handle the PSW-V-bit as 32-bit value even on a 64-bit CPU. Well, that was a fun bug. And by "fun" I obviously mean "let's hope to never do this again". Linus ^ permalink raw reply [flat|nested] 21+ messages in thread
[parent not found: <cffe30ed-43a3-46ac-ad03-afb7633f17e5@roeck-us.net>]
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review [not found] ` <cffe30ed-43a3-46ac-ad03-afb7633f17e5@roeck-us.net> @ 2024-08-08 15:58 ` John David Anglin 0 siblings, 0 replies; 21+ messages in thread From: John David Anglin @ 2024-08-08 15:58 UTC (permalink / raw) To: Guenter Roeck, Thomas Gleixner, Vlastimil Babka, Linus Torvalds Cc: linux-kernel, Linux-MM, Helge Deller, linux-parisc On 2024-08-08 10:59 a.m., Guenter Roeck wrote: >> Let's wait for PARISC people to run it on actual hardware. >> > > Agreed. I suspect that there is a carry or upper 32 bit of a register not > handled properly, but I have no idea where that might be or why that would > only be seen if the div functions are located in a certain address range. I'm doubtful there's a coding issue in $$divoI. The routine was written by HP and it's used on both HP-UX and Linux. The routine can be found in libgcc/config/pa/milli64.S The routine can trap: Divide by zero is trapped. Divide of -2**31 by -1 is trapped for $$divoI but not for $$divI. $$divoI is a millicode routine. Not sure what calls it. gcc doesn't call it. gcc uses $$divI. It appears you are testing a 64-bit kernel. There might be issues calling millicode routines when branch distance exceeds approximately 4 MB. Millicode routines have a special calling sequence. You could try building kernel with -mlong-calls. Kernel will get bugger and slower with this option. Dave -- John David Anglin dave.anglin@bell.net ^ permalink raw reply [flat|nested] 21+ messages in thread
[parent not found: <f63c6789-b01a-4d76-b7c9-74c04867bc13@roeck-us.net>]
[parent not found: <CAHk-=wjmumbT73xLkSAnnxDwaFE__Ny=QCp6B_LE2aG1SUqiTg@mail.gmail.com>]
* Re: [PATCH 6.10 000/809] 6.10.3-rc3 review [not found] ` <CAHk-=wjmumbT73xLkSAnnxDwaFE__Ny=QCp6B_LE2aG1SUqiTg@mail.gmail.com> @ 2024-08-06 17:49 ` Linus Torvalds 0 siblings, 0 replies; 21+ messages in thread From: Linus Torvalds @ 2024-08-06 17:49 UTC (permalink / raw) To: Guenter Roeck, Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Christian Borntraeger Cc: Vlastimil Babka, linux-kernel, Linux-MM, Thomas Gleixner [ Adding s390 people, this is strange ] New people, see https://lore.kernel.org/all/CAHk-=wjmumbT73xLkSAnnxDwaFE__Ny=QCp6B_LE2aG1SUqiTg@mail.gmail.com/ for context. There's a heisenbug that depends on random code layout issues on s390. On Tue, 6 Aug 2024 at 10:34, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > Hmm. Do we have some alignment confusion? > > The alignment rules for 192 are to align it to 64-byte boundaries > (because that's the largest power of two that divides it), and that > means it stays at 192, and that would give 21 objects per 4kB page. > > But if we use the "align up to next power of two", you get 256 bytes, > and 16 objects per page. > > And that 21-vs-16 confusion would seem to match this pretty well: > > [ 0.000000] BUG kmem_cache_node (Not tainted): objects 21 > max 16 > > which makes me wonder... I'd suspect commit ad59baa31695 ("slab, rust: extend kmalloc() alignment guarantees to remove Rust padding"), perhaps with some odd s390 code generation issue for 'ffs()'. IOW, this new code in mm/slab_common.c if (flags & SLAB_KMALLOC) align = max(align, 1U << (ffs(size) - 1)); might not match some other alignment code. Or maybe it's the s390 ffs(). It looks like static inline int ffs(int word) { unsigned long mask = 2 * BITS_PER_LONG - 1; unsigned int val = (unsigned int)word; return (1 + (__flogr(-val & val) ^ (BITS_PER_LONG - 1))) & mask; } where s390 has this very odd "flogr" instruction ("find last one G register"?) for the non-constant case. That uses a "union register_pair" but only ever uses the "even" register without ever using the full 128-bit part or the odd register. So the other register in the register pair is uninitialized. Does that cause random compiler issues based on register allocation? Just for fun, does something like this make any difference? --- a/arch/s390/include/asm/bitops.h +++ b/arch/s390/include/asm/bitops.h @@ -305,6 +305,7 @@ static inline unsigned char __flogr(unsigned long word) union register_pair rp; rp.even = word; + rp.odd = 0; asm volatile( " flogr %[rp],%[rp]\n" : [rp] "+d" (rp.pair) : : "cc"); Thomas notices that the special "div by constant" routines moved around, and I'm not seeing how *that* would matter, but it's all obviously very strange. Linus ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2024-09-03 18:43 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20240731095022.970699670@linuxfoundation.org>
[not found] ` <718b8afe-222f-4b3a-96d3-93af0e4ceff1@roeck-us.net>
2024-08-06 2:40 ` [PATCH 6.10 000/809] 6.10.3-rc3 review Linus Torvalds
2024-08-06 11:02 ` Vlastimil Babka
2024-08-06 17:33 ` Thomas Gleixner
[not found] ` <90e02d99-37a2-437e-ad42-44b80c4e94f6@suse.cz>
[not found] ` <87frrh44mf.ffs@tglx>
[not found] ` <76c643ee-17d6-463b-8ee1-4e30b0133671@roeck-us.net>
[not found] ` <87plqjz6aa.ffs@tglx>
2024-08-08 15:53 ` Linus Torvalds
2024-08-08 16:12 ` Thomas Gleixner
2024-08-08 16:33 ` Linus Torvalds
2024-08-08 17:48 ` Thomas Gleixner
2024-08-08 18:19 ` Linus Torvalds
2024-08-08 20:52 ` Guenter Roeck
2024-08-08 21:50 ` John David Anglin
2024-08-08 22:29 ` John David Anglin
2024-08-08 23:33 ` Linus Torvalds
2024-08-09 0:33 ` John David Anglin
2024-08-09 0:56 ` Guenter Roeck
2024-08-09 0:50 ` Guenter Roeck
2024-08-08 22:15 ` Richard Henderson
2024-09-03 7:54 ` Helge Deller
2024-09-03 14:13 ` Guenter Roeck
2024-09-03 18:43 ` Linus Torvalds
[not found] ` <cffe30ed-43a3-46ac-ad03-afb7633f17e5@roeck-us.net>
2024-08-08 15:58 ` John David Anglin
[not found] ` <f63c6789-b01a-4d76-b7c9-74c04867bc13@roeck-us.net>
[not found] ` <CAHk-=wjmumbT73xLkSAnnxDwaFE__Ny=QCp6B_LE2aG1SUqiTg@mail.gmail.com>
2024-08-06 17:49 ` Linus Torvalds
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox