[Oops] vfree abort in bpf_jit_free with memcg

linux-mm.kvack.org archive mirror

* [Oops] vfree abort in bpf_jit_free with memcg_data value 0xffff
@ 2024-06-03  9:10 Peng Fan
  2024-06-04  0:50 ` Roman Gushchin
  0 siblings, 1 reply; 4+ messages in thread
From: Peng Fan @ 2024-06-03  9:10 UTC (permalink / raw)
  To: linux-mm, bpf, daniel, ast, zlim.lnx, cgroups, hannes, mhocko,
	roman.gushchin, shakeelb, muchun.song

Hi All,

We are running a 6.6 kernel on the NXP i.MX95 platform and have hit an issue that
is very hard to reproduce. The panic log is at the end. I checked the registers
and the source code:

static inline struct obj_cgroup *__folio_objcg(struct folio *folio)                                 
{                                                                                                   
        unsigned long memcg_data = folio->memcg_data;                                               
                                                                                                    
        VM_BUG_ON_FOLIO(folio_test_slab(folio), folio);                                             
        VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJCGS, folio);                                     
        VM_BUG_ON_FOLIO(!(memcg_data & MEMCG_DATA_KMEM), folio);                                    
                                                                                                    
        return (struct obj_cgroup *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK);                          
}  

The memcg_data value is 0xffff, held in register x1. This looks like an invalid value.
Register x0 is x1 & ~3, i.e. 0xfffc.
The panic happens at PC ffff800080305894, which is 'ldr     x0, [x0, #16]'; the load
address 0xfffc + 0x10 = 0x1000c matches the faulting virtual address in the log.
I do not have a good idea of how to fix the issue; please take a look and suggest
something if you have time.
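
To make the arithmetic explicit, here is a minimal user-space sketch (not kernel
code) that replays it. The flag values are assumed to mirror the 6.6 definitions
used by __folio_objcg(), the 16-byte offset is simply the immediate in the
faulting 'ldr' instruction, and the helper names objcg_from_memcg_data() and
faulting_address() are made up for illustration.

```c
#include <stdint.h>

/* Assumed to mirror the 6.6 definitions; the mask matches "x1 & ~3". */
#define MEMCG_DATA_OBJCGS      (1ULL << 0)
#define MEMCG_DATA_KMEM        (1ULL << 1)
#define MEMCG_DATA_FLAGS_MASK  (MEMCG_DATA_OBJCGS | MEMCG_DATA_KMEM)

/* Offset used by "ldr x0, [x0, #16]" at the faulting PC. */
#define OBJCG_LOAD_OFFSET      16ULL

/* Hypothetical helper: the pointer value __folio_objcg() would return. */
static uint64_t objcg_from_memcg_data(uint64_t memcg_data)
{
        return memcg_data & ~MEMCG_DATA_FLAGS_MASK;
}

/* Hypothetical helper: the address the faulting load dereferences. */
static uint64_t faulting_address(uint64_t memcg_data)
{
        return objcg_from_memcg_data(memcg_data) + OBJCG_LOAD_OFFSET;
}
```

With memcg_data = 0xffff, objcg_from_memcg_data() gives 0xfffc (the x0 value in
the register dump) and faulting_address() gives 0x1000c. Note that 0xffff has
both low flag bits set.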

[   12.843675] Unable to handle kernel paging request at virtual address 000000000001000c
[   12.849981] audit: type=1334 audit(1709988536.322:30): prog-id=3 op=UNLOAD
[   12.857888] Mem abort info:
[   12.867630]   ESR = 0x0000000096000004
[   12.871368]   EC = 0x25: DABT (current EL), IL = 32 bits
[   12.876675]   SET = 0, FnV = 0
[   12.879732]   EA = 0, S1PTW = 0
[   12.882860]   FSC = 0x04: level 0 translation fault
[   12.887730] Data abort info:
[   12.890599]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[   12.896076]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[   12.901120]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[   12.906424] user pgtable: 4k pages, 48-bit VAs, pgdp=00000001008de000
[   12.912854] [000000000001000c] pgd=0000000000000000, p4d=0000000000000000
[   12.919642] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[   12.925900] Modules linked in:
[   12.928942] CPU: 4 PID: 131 Comm: kworker/4:2 Not tainted 6.6.23-06226-g41e0f501b547-dirty #248
[   12.937625] Hardware name: NXP i.MX95 19X19 board (DT)
[   12.942748] Workqueue: events bpf_prog_free_deferred
[   12.947713] pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   12.954663] pc : vfree+0x114/0x2e0
[   12.958060] lr : vfree+0x78/0x2e0
[   12.961362] sp : ffff80008459bd10
[   12.964664] x29: ffff80008459bd10 x28: 0000000000000000 x27: 0000000000000000
[   12.969128] watchdog: watchdog0: watchdog did not stop!
[   12.971788] x26: 0000000000000000 x25: ffff0000808b5a00 x24: ffff000080090805
[   12.971795] x23: ffff000084bcdc08 x22: 0000000000000000 x21: ffff00008493c6c0
[   12.971802] x20: fffffc000100005e x19: 0000000000000000 x18: 0000000000000000
[   12.971808] x17: ffff800084ec1000 x16: ffff00008465f208
[   12.991063] systemd-shutdown[1]: Using hardware watchdog 'i.MX7ULP watchdog timer', version 0, device /dev/watchdog0
[   12.991246]  x15: 0000000000000000
[   13.017453] x14: 0000000000000000 x13: ffff80008f001000 x12: ffff000084647a00
[   13.024577] x11: ffff000080b9d1f8 x10: ffff0000846479d8 x9 : ffff8000803057f8
[   13.031701] x8 : ffff80008459bcf0 x7 : 0000000000000001 x6 : ffff800082b84d38
[   13.038825] x5 : 0000000000000000 x4 : 0000000080000000 x3 : ffff80008377d000
[   13.045949] x2 : 0000000000000001 x1 : 000000000000ffff x0 : 000000000000fffc
[   13.047210] systemd-shutdown[1]: Watchdog running with a timeout of 1min.
[   13.053073] Call trace:
[   13.053076]  vfree+0x114/0x2e0
[   13.053083]  bpf_jit_free+0x54/0xb8
[   13.068804]  bpf_prog_free_deferred+0x16c/0x1a0
[   13.073328]  process_one_work+0x148/0x3b8
[   13.077332]  worker_thread+0x32c/0x450
[   13.081076]  kthread+0x11c/0x128
[   13.084300]  ret_from_fork+0x10/0x20
[   13.087874] Code: a9425bf5 a8c57bfd d50323bf d65f03c0 (f9400800)
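
For readers unfamiliar with the "Mem abort info" lines, the fields are bit
slices of the ESR value. Below is a minimal user-space sketch of that decoding,
assuming the usual ESR_EL1 layout (EC in bits [31:26], IL in bit 25, ISS in
bits [24:0], and for data aborts WnR in ISS bit 6 and the fault status code in
ISS bits [5:0]). The function names are illustrative only.

```c
#include <stdint.h>

/* Exception class: ESR bits [31:26]. */
static unsigned int esr_ec(uint64_t esr)
{
        return (unsigned int)((esr >> 26) & 0x3f);
}

/* Instruction length bit: ESR bit 25 (1 means a 32-bit instruction). */
static unsigned int esr_il(uint64_t esr)
{
        return (unsigned int)((esr >> 25) & 0x1);
}

/* Instruction specific syndrome: ESR bits [24:0]. */
static uint32_t esr_iss(uint64_t esr)
{
        return (uint32_t)(esr & 0x1ffffff);
}

/* Data aborts only: write-not-read flag, ISS bit 6. */
static unsigned int esr_wnr(uint64_t esr)
{
        return (unsigned int)((esr >> 6) & 0x1);
}

/* Data aborts only: fault status code, ISS bits [5:0]. */
static unsigned int esr_fsc(uint64_t esr)
{
        return (unsigned int)(esr & 0x3f);
}
```

Applied to 0x96000004 this yields EC = 0x25, IL = 1, ISS = 0x4, WnR = 0 and
FSC = 0x04, i.e. a read that took a level 0 translation fault, consistent with
the decoded lines in the log.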


Part of the objdump code:
ffff8000803057f4:       97f8c73d        bl      ffff8000801374e8 <__rcu_read_lock>                  
ffff8000803057f8:       f9400681        ldr     x1, [x20, #8]                                       
ffff8000803057fc:       d1000420        sub     x0, x1, #0x1                                        
ffff800080305800:       f240003f        tst     x1, #0x1                                            
ffff800080305804:       9a941000        csel    x0, x0, x20, ne  // ne = any                        
ffff800080305808:       f9401c01        ldr     x1, [x0, #56]                                       
ffff80008030580c:       927ef420        and     x0, x1, #0xfffffffffffffffc                         
ffff800080305810:       37080421        tbnz    w1, #1, ffff800080305894 <vfree+0x114>              
ffff800080305814:       b40000e0        cbz     x0, ffff800080305830 <vfree+0xb0>                   
ffff800080305818:       d53b4236        mrs     x22, daif                                           
ffff80008030581c:       d50343df        msr     daifset, #0x3                                       
ffff800080305820:       12800002        mov     w2, #0xffffffff                 // #-1              
ffff800080305824:       528005c1        mov     w1, #0x2e                       // #46              
ffff800080305828:       94015eac        bl      ffff80008035d2d8 <__mod_memcg_state>                
ffff80008030582c:       d51b4236        msr     daif, x22                                           
ffff800080305830:       97f8eafa        bl      ffff800080140418 <__rcu_read_unlock>                
ffff800080305834:       aa1403e0        mov     x0, x20                                             
ffff800080305838:       52800001        mov     w1, #0x0                        // #0               
ffff80008030583c:       94001847        bl      ffff80008030b958 <__free_pages>                     
ffff800080305840:       11000673        add     w19, w19, #0x1                                      
ffff800080305844:       b9402ea0        ldr     w0, [x21, #44]                                      
ffff800080305848:       f94012a1        ldr     x1, [x21, #32]                     
......
ffff80008030588c:       d50323bf        autiasp                                                     
ffff800080305890:       d65f03c0        ret                                                         
ffff800080305894:       f9400800        ldr     x0, [x0, #16]                                       
ffff800080305898:       17ffffdf        b       ffff800080305814 <vfree+0x94>                       
ffff80008030589c:       a90363f7        stp     x23, x24, [sp, #48]    
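
Read as C, the instructions from ffff8000803057f8 to ffff800080305810 plus the
out-of-line load at ffff800080305894 appear to do the following. This is a
user-space model only: struct fake_page, its word layout and the model_*()
names are hypothetical, and the offsets 8, 56 and 16 are just the immediates
visible in the disassembly.

```c
#include <stdint.h>

/* Stand-in for a 64-byte page descriptor, viewed as eight 64-bit words. */
struct fake_page {
        uint64_t word[8];       /* word[1] is offset 8, word[7] is offset 56 */
};

/* ldr x1,[x20,#8]; sub; tst; csel: if bit 0 is set, follow to the head. */
static const struct fake_page *model_compound_head(const struct fake_page *page)
{
        uint64_t head = page->word[1];

        if (head & 1)
                return (const struct fake_page *)(uintptr_t)(head - 1);
        return page;
}

/*
 * ldr x1,[x0,#56]; and x0,x1,#~3; tbnz w1,#1: compute the value that is
 * left in x0 when the cbz at ffff800080305814 is reached.
 */
static uint64_t model_memcg_ptr(const struct fake_page *page)
{
        const struct fake_page *head = model_compound_head(page);
        uint64_t memcg_data = head->word[7];
        uint64_t ptr = memcg_data & ~3ULL;

        if (memcg_data & 2)     /* bit 1 set: one more load at ptr + 16 */
                ptr = *(const uint64_t *)(uintptr_t)(ptr + 16);
        return ptr;
}
```

The word-indexed layout is deliberate: it avoids claiming anything about the
real field names while keeping the same offsets as the machine code. Only pass
the model pointers to real objects, since the bit-1 path dereferences ptr + 16
just as the faulting instruction does.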

Thanks
Peng.       



* Re: [Oops] vfree abort in bpf_jit_free with memcg_data value 0xffff
  2024-06-03  9:10 [Oops] vfree abort in bpf_jit_free with memcg_data value 0xffff Peng Fan
@ 2024-06-04  0:50 ` Roman Gushchin
  2024-06-04  2:20   ` Peng Fan
  0 siblings, 1 reply; 4+ messages in thread
From: Roman Gushchin @ 2024-06-04  0:50 UTC (permalink / raw)
  To: Peng Fan
  Cc: linux-mm, bpf, daniel, ast, zlim.lnx, cgroups, hannes, mhocko,
	shakeelb, muchun.song

On Mon, Jun 03, 2024 at 09:10:43AM +0000, Peng Fan wrote:
> Hi All,
> 
> We are running 6.6 kernel on NXP i.MX95 platform, and meet an issue very
> hard to reproduce. Panic log in the end. I check the registers and source code.

Hi!

Do you know by any chance whether the issue is reproducible on newer kernels?

At first glance, I doubt it's a generic memory accounting
issue, otherwise we'd see a lot more instances of it. So my guess is that it's
something related to the BPF JIT code. There seem to have been heavy
changes since 6.6, which is why I'm asking about newer kernels.

Thanks!


* RE: [Oops] vfree abort in bpf_jit_free with memcg_data value 0xffff
  2024-06-04  0:50 ` Roman Gushchin
@ 2024-06-04  2:20   ` Peng Fan
  2024-06-04 14:52     ` Peng Fan
  0 siblings, 1 reply; 4+ messages in thread
From: Peng Fan @ 2024-06-04  2:20 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: linux-mm, bpf, daniel, ast, zlim.lnx, cgroups, hannes, mhocko,
	shakeelb, muchun.song

Hi Roman,

> Subject: Re: [Oops] vfree abort in bpf_jit_free with memcg_data value 0xffff
> 
> On Mon, Jun 03, 2024 at 09:10:43AM +0000, Peng Fan wrote:
> > Hi All,
> >
> > We are running 6.6 kernel on NXP i.MX95 platform, and meet an issue
> > very hard to reproduce. Panic log in the end. I check the registers and
> source code.
> 
> Hi!
> 
> Do you know by a chance if the issue is reproducible on newer kernels?
> 
> From a very first glance, I doubt it's a generic memory accounting issue,
> otherwise we'd see a lot more instances of it. So my guess it something
> related to bpf jit code. It seems like there were heavy changes since 6.6, this
> is why I'm asking about newer kernels.

I do not have a full test environment with a newer kernel; the i.MX95 platform
has not landed in the upstream repo yet.

After enabling DEBUG_VM, I got a new dump in virt_to_phys, so I am wondering
whether DMA is corrupting memory. I am redoing the test with the DPU disabled
to see how it goes.

[    2.992655] ------------[ cut here ]------------                                                 
[    3.003764] virt_to_phys used for non-linear address: 00000000897eac93 (0xffff800086001000)      
[    3.004944] sysctr_timer_read_write:10024 retry: 1                                                
[    3.012196] WARNING: CPU: 0 PID: 11 at arch/arm64/mm/physaddr.c:12 __virt_to_phys+0x68/0x98      
[    3.025243] Modules linked in:                                                                   
[    3.028312] CPU: 0 PID: 11 Comm: kworker/u12:0 Not tainted 6.6.23-06226-g4986cc3e1b75-dirty #251 
[    3.037098] Hardware name: NXP i.MX95 19X19 board (DT)                                              
[    3.042239] Workqueue: events_unbound deferred_probe_work_func                                   
[    3.044953] sysctr_timer_read_write:10024 retry: 1                                               
[    3.048079] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)                      
[    3.059796] pc : __virt_to_phys+0x68/0x98                                                        
[    3.063809] lr : __virt_to_phys+0x68/0x98                                                        
[    3.067839] sp : ffff800082de3990                                                                
[    3.071141] x29: ffff800082de3990 x28: 0000000000000000 x27: 0000000034325258                           
[    3.078282] x26: ffff000084748000 x25: ffff0000818ba800 x24: ffff00008471dc00                    
[    3.084954] sysctr_timer_read_write:10024 retry: 1                                                      
[    3.085423] x23: 0000000000000000 x22: ffff0000818ba200 x21: ffff00008080bc00                    
[    3.097323] x20: ffff0000847345c0 x19: ffff800086001000 x18: 0000000000000006                    
[    3.104447] x17: 6666783028203339 x16: 6361653739383030 x15: 303030303030203a                    
[    3.111588] x14: 7373657264646120 x13: 2930303031303036 x12: 3830303038666666                    
[    3.118712] x11: 6678302820333963 x10: 0000000000000a90 x9 : ffff8000800e04a0                    
[    3.120954] sysctr_timer_read_write:10024 retry: 1                                               
[    3.125836] x8 : ffff0000803d28f0 x7 : 000000006273d88e x6 : 0000000000000400                    
[    3.137736] x5 : 00000000410fd050 x4 : 0000000000f0000f x3 : 0000000000200000                    
[    3.144894] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000803d1e00                    
[    3.152036] Call trace:                                                                          
[    3.154489]  __virt_to_phys+0x68/0x98                                                            
[    3.158163]  drm_fbdev_dma_helper_fb_probe+0x138/0x238                                           
[    3.163294]  __drm_fb_helper_initial_config_and_unlock+0x2b0/0x4c0                               
[    3.169012] sysctr_timer_read_write:10024 retry: 1                                               
[    3.169498]  drm_fb_helper_initial_config+0x4c/0x68                                              
[    3.177000] sysctr_timer_read_write:10024 retry: 1                                               
[    3.179136]  drm_fbdev_dma_client_hotplug+0x8c/0xe0                                              
[    3.188773]  drm_client_register+0x60/0xb0                                                       
[    3.192881]  drm_fbdev_dma_setup+0x94/0x148                                                      
[    3.197059]  dpu95_probe+0xc4/0x130                                                              
[    3.200577]  platform_probe+0x70/0xd0                                                            
[    3.204252]  really_probe+0x150/0x2c0   
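
The warning text says the address passed to virt_to_phys() is not a linear-map
address. A user-space sketch of that kind of range check follows, assuming
48-bit kernel VAs (as the first oops reports) and assuming the arm64 linear map
spans from -(1 << VA_BITS) up to -(1 << (VA_BITS - 1)). The MODEL_* macros and
is_linear_map_address() are illustrative names, not the kernel's.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_VA_BITS      48
/* Assumed layout: linear map covers [MODEL_PAGE_OFFSET, MODEL_PAGE_END). */
#define MODEL_PAGE_OFFSET  (-(UINT64_C(1) << MODEL_VA_BITS))
#define MODEL_PAGE_END     (-(UINT64_C(1) << (MODEL_VA_BITS - 1)))

/* True when addr falls inside the modelled linear map. */
static bool is_linear_map_address(uint64_t addr)
{
        /* Unsigned wraparound turns this into a single-comparison range check. */
        return (addr - MODEL_PAGE_OFFSET) < (MODEL_PAGE_END - MODEL_PAGE_OFFSET);
}
```

Under these assumptions, 0xffff800086001000 (x19 in the warning) fails the
check because it sits at or above the end of the linear map, while an address
such as 0xffff0000803d1e00 (x0 in the same dump) passes.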

Thanks
Peng
> 
> Thanks!



* RE: [Oops] vfree abort in bpf_jit_free with memcg_data value 0xffff
  2024-06-04  2:20   ` Peng Fan
@ 2024-06-04 14:52     ` Peng Fan
  0 siblings, 0 replies; 4+ messages in thread
From: Peng Fan @ 2024-06-04 14:52 UTC (permalink / raw)
  To: Peng Fan, Roman Gushchin
  Cc: linux-mm, bpf, daniel, ast, zlim.lnx, cgroups, hannes, mhocko,
	shakeelb, muchun.song

> Subject: RE: [Oops] vfree abort in bpf_jit_free with memcg_data value 0xffff
> 
> Hi Roman,
> 
> > Subject: Re: [Oops] vfree abort in bpf_jit_free with memcg_data value
> > 0xffff
> >
> > On Mon, Jun 03, 2024 at 09:10:43AM +0000, Peng Fan wrote:
> > > Hi All,
> > >
> > > We are running 6.6 kernel on NXP i.MX95 platform, and meet an issue
> > > very hard to reproduce. Panic log in the end. I check the registers
> > > and
> > source code.
> >
> > Hi!
> >
> > Do you know by a chance if the issue is reproducible on newer kernels?
> >
> > From a very first glance, I doubt it's a generic memory accounting
> > issue, otherwise we'd see a lot more instances of it. So my guess it
> > something related to bpf jit code. It seems like there were heavy
> > changes since 6.6, this is why I'm asking about newer kernels.
> 
> I not have a full test environment with newer kernel, the i.MX95 platform has
> not been landed in upstream repo.
> 
> After I enable DEBUG_VM, I have a new dump in virt_to_phys: I am thinking
> whether the dma corrupt memory. And with disabling DPU, I am redoing the
> test, and see how it goes.

After addressing the virt_to_phys issue, I can still see bpf_jit_free trigger
a kernel panic.

Is there any suggestion for how I could reproduce this issue sooner?
Currently I am running a Linux reboot test, but it needs several hours or more
to reproduce the issue.

Thanks,
Peng.


