linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH -next] bpf, test_run: fix alignment problem in bpf_prog_test_run_skb()
       [not found] ` <eca17bfb-c75f-5db1-f194-5b00c2a0c6f2@iogearbox.net>
@ 2022-11-02  2:59   ` zhongbaisong
  2022-11-02  4:05     ` Jakub Kicinski
  0 siblings, 1 reply; 5+ messages in thread
From: zhongbaisong @ 2022-11-02  2:59 UTC (permalink / raw)
  To: Daniel Borkmann, edumazet, davem, kuba, pabeni
  Cc: linux-kernel, bpf, netdev, ast, song, yhs, haoluo,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, Linux MM,
	kasan-dev, elver, glider, dvyukov



On 2022/11/2 0:45, Daniel Borkmann wrote:
> [ +kfence folks ]

+ cc: Alexander Potapenko, Marco Elver, Dmitry Vyukov

Do you have any suggestions about this problem?

Thanks,

.

> 
> On 11/1/22 5:04 AM, Baisong Zhong wrote:
>> Recently, we got a syzkaller problem because of aarch64
>> alignment fault if KFENCE enabled.
>>
>> When the size from user bpf program is an odd number, like
>> 399, 407, etc, it will cause skb shard info's alignment access,
>> as seen below:
>>
>> BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0 
>> net/core/skbuff.c:1032
>>
>> Use-after-free read at 0xffff6254fffac077 (in kfence-#213):
>>   __lse_atomic_add arch/arm64/include/asm/atomic_lse.h:26 [inline]
>>   arch_atomic_add arch/arm64/include/asm/atomic.h:28 [inline]
>>   arch_atomic_inc include/linux/atomic-arch-fallback.h:270 [inline]
>>   atomic_inc include/asm-generic/atomic-instrumented.h:241 [inline]
>>   __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
>>   skb_clone+0xf4/0x214 net/core/skbuff.c:1481
>>   ____bpf_clone_redirect net/core/filter.c:2433 [inline]
>>   bpf_clone_redirect+0x78/0x1c0 net/core/filter.c:2420
>>   bpf_prog_d3839dd9068ceb51+0x80/0x330
>>   bpf_dispatcher_nop_func include/linux/bpf.h:728 [inline]
>>   bpf_test_run+0x3c0/0x6c0 net/bpf/test_run.c:53
>>   bpf_prog_test_run_skb+0x638/0xa7c net/bpf/test_run.c:594
>>   bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
>>   __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
>>   __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
>>
>> kfence-#213: 0xffff6254fffac000-0xffff6254fffac196, size=407, 
>> cache=kmalloc-512
>>
>> allocated by task 15074 on cpu 0 at 1342.585390s:
>>   kmalloc include/linux/slab.h:568 [inline]
>>   kzalloc include/linux/slab.h:675 [inline]
>>   bpf_test_init.isra.0+0xac/0x290 net/bpf/test_run.c:191
>>   bpf_prog_test_run_skb+0x11c/0xa7c net/bpf/test_run.c:512
>>   bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
>>   __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
>>   __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
>>   __arm64_sys_bpf+0x50/0x60 kernel/bpf/syscall.c:4381
>>
>> To fix the problem, we round up allocations with kmalloc_size_roundup()
>> so that build_skb()'s use of kize() is always alignment and no special
>> handling of the memory is needed by KFENCE.
>>
>> Fixes: 1cf1cae963c2 ("bpf: introduce BPF_PROG_TEST_RUN command")
>> Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
>> ---
>>   net/bpf/test_run.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
>> index 13d578ce2a09..058b67108873 100644
>> --- a/net/bpf/test_run.c
>> +++ b/net/bpf/test_run.c
>> @@ -774,6 +774,7 @@ static void *bpf_test_init(const union bpf_attr 
>> *kattr, u32 user_size,
>>       if (user_size > size)
>>           return ERR_PTR(-EMSGSIZE);
>> +    size = kmalloc_size_roundup(size);
>>       data = kzalloc(size + headroom + tailroom, GFP_USER);
> 
> The fact that you need to do this roundup on call sites feels broken, no?
> Was there some discussion / consensus that now all k*alloc() call sites
> would need to be fixed up? Couldn't this be done transparently in k*alloc()
> when KFENCE is enabled? I presume there may be lots of other such occasions
> in the kernel where similar issue triggers, fixing up all call-sites feels
> like ton of churn compared to api-internal, generic fix.
> 
>>       if (!data)
>>           return ERR_PTR(-ENOMEM);
>>
> 
> Thanks,
> Daniel
>




^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH -next] bpf, test_run: fix alignment problem in bpf_prog_test_run_skb()
  2022-11-02  2:59   ` [PATCH -next] bpf, test_run: fix alignment problem in bpf_prog_test_run_skb() zhongbaisong
@ 2022-11-02  4:05     ` Jakub Kicinski
  2022-11-02  4:27       ` Kees Cook
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2022-11-02  4:05 UTC (permalink / raw)
  To: zhongbaisong
  Cc: Daniel Borkmann, edumazet, davem, pabeni, linux-kernel, bpf,
	netdev, ast, song, yhs, haoluo, Alexander Potapenko, Marco Elver,
	Dmitry Vyukov, Linux MM, kasan-dev, Kees Cook

On Wed, 2 Nov 2022 10:59:44 +0800 zhongbaisong wrote:
> On 2022/11/2 0:45, Daniel Borkmann wrote:
> > [ +kfence folks ]  
> 
> + cc: Alexander Potapenko, Marco Elver, Dmitry Vyukov
> 
> Do you have any suggestions about this problem?

+ Kees who has been sending similar patches for drivers

> > On 11/1/22 5:04 AM, Baisong Zhong wrote:  
> >> Recently, we got a syzkaller problem because of aarch64
> >> alignment fault if KFENCE enabled.
> >>
> >> When the size from user bpf program is an odd number, like
> >> 399, 407, etc, it will cause skb shard info's alignment access,
> >> as seen below:
> >>
> >> BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0 
> >> net/core/skbuff.c:1032
> >>
> >> Use-after-free read at 0xffff6254fffac077 (in kfence-#213):
> >>   __lse_atomic_add arch/arm64/include/asm/atomic_lse.h:26 [inline]
> >>   arch_atomic_add arch/arm64/include/asm/atomic.h:28 [inline]
> >>   arch_atomic_inc include/linux/atomic-arch-fallback.h:270 [inline]
> >>   atomic_inc include/asm-generic/atomic-instrumented.h:241 [inline]
> >>   __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
> >>   skb_clone+0xf4/0x214 net/core/skbuff.c:1481
> >>   ____bpf_clone_redirect net/core/filter.c:2433 [inline]
> >>   bpf_clone_redirect+0x78/0x1c0 net/core/filter.c:2420
> >>   bpf_prog_d3839dd9068ceb51+0x80/0x330
> >>   bpf_dispatcher_nop_func include/linux/bpf.h:728 [inline]
> >>   bpf_test_run+0x3c0/0x6c0 net/bpf/test_run.c:53
> >>   bpf_prog_test_run_skb+0x638/0xa7c net/bpf/test_run.c:594
> >>   bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
> >>   __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
> >>   __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
> >>
> >> kfence-#213: 0xffff6254fffac000-0xffff6254fffac196, size=407, 
> >> cache=kmalloc-512
> >>
> >> allocated by task 15074 on cpu 0 at 1342.585390s:
> >>   kmalloc include/linux/slab.h:568 [inline]
> >>   kzalloc include/linux/slab.h:675 [inline]
> >>   bpf_test_init.isra.0+0xac/0x290 net/bpf/test_run.c:191
> >>   bpf_prog_test_run_skb+0x11c/0xa7c net/bpf/test_run.c:512
> >>   bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
> >>   __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
> >>   __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
> >>   __arm64_sys_bpf+0x50/0x60 kernel/bpf/syscall.c:4381
> >>
> >> To fix the problem, we round up allocations with kmalloc_size_roundup()
> >> so that build_skb()'s use of kize() is always alignment and no special
> >> handling of the memory is needed by KFENCE.
> >>
> >> Fixes: 1cf1cae963c2 ("bpf: introduce BPF_PROG_TEST_RUN command")
> >> Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
> >> ---
> >>   net/bpf/test_run.c | 1 +
> >>   1 file changed, 1 insertion(+)
> >>
> >> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> >> index 13d578ce2a09..058b67108873 100644
> >> --- a/net/bpf/test_run.c
> >> +++ b/net/bpf/test_run.c
> >> @@ -774,6 +774,7 @@ static void *bpf_test_init(const union bpf_attr 
> >> *kattr, u32 user_size,
> >>       if (user_size > size)
> >>           return ERR_PTR(-EMSGSIZE);
> >> +    size = kmalloc_size_roundup(size);
> >>       data = kzalloc(size + headroom + tailroom, GFP_USER);  
> > 
> > The fact that you need to do this roundup on call sites feels broken, no?
> > Was there some discussion / consensus that now all k*alloc() call sites
> > would need to be fixed up? Couldn't this be done transparently in k*alloc()
> > when KFENCE is enabled? I presume there may be lots of other such occasions
> > in the kernel where similar issue triggers, fixing up all call-sites feels
> > like ton of churn compared to api-internal, generic fix.
> >   
> >>       if (!data)
> >>           return ERR_PTR(-ENOMEM);
> >>  
> > 
> > Thanks,
> > Daniel
> >  
> 
> 



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH -next] bpf, test_run: fix alignment problem in bpf_prog_test_run_skb()
  2022-11-02  4:05     ` Jakub Kicinski
@ 2022-11-02  4:27       ` Kees Cook
  2022-11-02  4:37         ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Kees Cook @ 2022-11-02  4:27 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: zhongbaisong, Daniel Borkmann, edumazet, davem, pabeni,
	linux-kernel, bpf, netdev, ast, song, yhs, haoluo,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, Linux MM,
	kasan-dev

On Tue, Nov 01, 2022 at 09:05:42PM -0700, Jakub Kicinski wrote:
> On Wed, 2 Nov 2022 10:59:44 +0800 zhongbaisong wrote:
> > On 2022/11/2 0:45, Daniel Borkmann wrote:
> > > [ +kfence folks ]  
> > 
> > + cc: Alexander Potapenko, Marco Elver, Dmitry Vyukov
> > 
> > Do you have any suggestions about this problem?
> 
> + Kees who has been sending similar patches for drivers
> 
> > > On 11/1/22 5:04 AM, Baisong Zhong wrote:  
> > >> Recently, we got a syzkaller problem because of aarch64
> > >> alignment fault if KFENCE enabled.
> > >>
> > >> When the size from user bpf program is an odd number, like
> > >> 399, 407, etc, it will cause skb shard info's alignment access,
> > >> as seen below:
> > >>
> > >> BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0 
> > >> net/core/skbuff.c:1032
> > >>
> > >> Use-after-free read at 0xffff6254fffac077 (in kfence-#213):
> > >>   __lse_atomic_add arch/arm64/include/asm/atomic_lse.h:26 [inline]
> > >>   arch_atomic_add arch/arm64/include/asm/atomic.h:28 [inline]
> > >>   arch_atomic_inc include/linux/atomic-arch-fallback.h:270 [inline]
> > >>   atomic_inc include/asm-generic/atomic-instrumented.h:241 [inline]
> > >>   __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
> > >>   skb_clone+0xf4/0x214 net/core/skbuff.c:1481
> > >>   ____bpf_clone_redirect net/core/filter.c:2433 [inline]
> > >>   bpf_clone_redirect+0x78/0x1c0 net/core/filter.c:2420
> > >>   bpf_prog_d3839dd9068ceb51+0x80/0x330
> > >>   bpf_dispatcher_nop_func include/linux/bpf.h:728 [inline]
> > >>   bpf_test_run+0x3c0/0x6c0 net/bpf/test_run.c:53
> > >>   bpf_prog_test_run_skb+0x638/0xa7c net/bpf/test_run.c:594
> > >>   bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
> > >>   __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
> > >>   __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
> > >>
> > >> kfence-#213: 0xffff6254fffac000-0xffff6254fffac196, size=407, 
> > >> cache=kmalloc-512
> > >>
> > >> allocated by task 15074 on cpu 0 at 1342.585390s:
> > >>   kmalloc include/linux/slab.h:568 [inline]
> > >>   kzalloc include/linux/slab.h:675 [inline]
> > >>   bpf_test_init.isra.0+0xac/0x290 net/bpf/test_run.c:191
> > >>   bpf_prog_test_run_skb+0x11c/0xa7c net/bpf/test_run.c:512
> > >>   bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
> > >>   __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
> > >>   __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
> > >>   __arm64_sys_bpf+0x50/0x60 kernel/bpf/syscall.c:4381
> > >>
> > >> To fix the problem, we round up allocations with kmalloc_size_roundup()
> > >> so that build_skb()'s use of kize() is always alignment and no special
> > >> handling of the memory is needed by KFENCE.
> > >>
> > >> Fixes: 1cf1cae963c2 ("bpf: introduce BPF_PROG_TEST_RUN command")
> > >> Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
> > >> ---
> > >>   net/bpf/test_run.c | 1 +
> > >>   1 file changed, 1 insertion(+)
> > >>
> > >> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> > >> index 13d578ce2a09..058b67108873 100644
> > >> --- a/net/bpf/test_run.c
> > >> +++ b/net/bpf/test_run.c
> > >> @@ -774,6 +774,7 @@ static void *bpf_test_init(const union bpf_attr 
> > >> *kattr, u32 user_size,
> > >>       if (user_size > size)
> > >>           return ERR_PTR(-EMSGSIZE);
> > >> +    size = kmalloc_size_roundup(size);
> > >>       data = kzalloc(size + headroom + tailroom, GFP_USER);  
> > > 
> > > The fact that you need to do this roundup on call sites feels broken, no?
> > > Was there some discussion / consensus that now all k*alloc() call sites
> > > would need to be fixed up? Couldn't this be done transparently in k*alloc()
> > > when KFENCE is enabled? I presume there may be lots of other such occasions
> > > in the kernel where similar issue triggers, fixing up all call-sites feels
> > > like ton of churn compared to api-internal, generic fix.

I hope I answer this in more detail here:
https://lore.kernel.org/lkml/202211010937.4631CB1B0E@keescook/

The problem is that ksize() should never have existed in the first
place. :P Every runtime bounds checker has tripped over it, and with
the addition of the __alloc_size attribute, I had to start ripping
ksize() out: it can't be used to pretend an allocation grew in size.
Things need to either preallocate more or go through *realloc() like
everything else. Luckily, ksize() is rare.

FWIW, the above fix doesn't look correct to me -- I would expect this to
be:

	size_t alloc_size;
	...
	alloc_size = kmalloc_size_roundup(size + headroom + tailroom);
	data = kzalloc(alloc_size, GFP_USER);

-- 
Kees Cook


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH -next] bpf, test_run: fix alignment problem in bpf_prog_test_run_skb()
  2022-11-02  4:27       ` Kees Cook
@ 2022-11-02  4:37         ` Eric Dumazet
  2022-11-02  7:19           ` zhongbaisong
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2022-11-02  4:37 UTC (permalink / raw)
  To: Kees Cook
  Cc: Jakub Kicinski, zhongbaisong, Daniel Borkmann, davem, pabeni,
	linux-kernel, bpf, netdev, ast, song, yhs, haoluo,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, Linux MM,
	kasan-dev

On Tue, Nov 1, 2022 at 9:27 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Tue, Nov 01, 2022 at 09:05:42PM -0700, Jakub Kicinski wrote:
> > On Wed, 2 Nov 2022 10:59:44 +0800 zhongbaisong wrote:
> > > On 2022/11/2 0:45, Daniel Borkmann wrote:
> > > > [ +kfence folks ]
> > >
> > > + cc: Alexander Potapenko, Marco Elver, Dmitry Vyukov
> > >
> > > Do you have any suggestions about this problem?
> >
> > + Kees who has been sending similar patches for drivers
> >
> > > > On 11/1/22 5:04 AM, Baisong Zhong wrote:
> > > >> Recently, we got a syzkaller problem because of aarch64
> > > >> alignment fault if KFENCE enabled.
> > > >>
> > > >> When the size from user bpf program is an odd number, like
> > > >> 399, 407, etc, it will cause skb shard info's alignment access,
> > > >> as seen below:
> > > >>
> > > >> BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0
> > > >> net/core/skbuff.c:1032
> > > >>
> > > >> Use-after-free read at 0xffff6254fffac077 (in kfence-#213):
> > > >>   __lse_atomic_add arch/arm64/include/asm/atomic_lse.h:26 [inline]
> > > >>   arch_atomic_add arch/arm64/include/asm/atomic.h:28 [inline]
> > > >>   arch_atomic_inc include/linux/atomic-arch-fallback.h:270 [inline]
> > > >>   atomic_inc include/asm-generic/atomic-instrumented.h:241 [inline]
> > > >>   __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
> > > >>   skb_clone+0xf4/0x214 net/core/skbuff.c:1481
> > > >>   ____bpf_clone_redirect net/core/filter.c:2433 [inline]
> > > >>   bpf_clone_redirect+0x78/0x1c0 net/core/filter.c:2420
> > > >>   bpf_prog_d3839dd9068ceb51+0x80/0x330
> > > >>   bpf_dispatcher_nop_func include/linux/bpf.h:728 [inline]
> > > >>   bpf_test_run+0x3c0/0x6c0 net/bpf/test_run.c:53
> > > >>   bpf_prog_test_run_skb+0x638/0xa7c net/bpf/test_run.c:594
> > > >>   bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
> > > >>   __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
> > > >>   __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
> > > >>
> > > >> kfence-#213: 0xffff6254fffac000-0xffff6254fffac196, size=407,
> > > >> cache=kmalloc-512
> > > >>
> > > >> allocated by task 15074 on cpu 0 at 1342.585390s:
> > > >>   kmalloc include/linux/slab.h:568 [inline]
> > > >>   kzalloc include/linux/slab.h:675 [inline]
> > > >>   bpf_test_init.isra.0+0xac/0x290 net/bpf/test_run.c:191
> > > >>   bpf_prog_test_run_skb+0x11c/0xa7c net/bpf/test_run.c:512
> > > >>   bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
> > > >>   __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
> > > >>   __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
> > > >>   __arm64_sys_bpf+0x50/0x60 kernel/bpf/syscall.c:4381
> > > >>
> > > >> To fix the problem, we round up allocations with kmalloc_size_roundup()
> > > >> so that build_skb()'s use of kize() is always alignment and no special
> > > >> handling of the memory is needed by KFENCE.
> > > >>
> > > >> Fixes: 1cf1cae963c2 ("bpf: introduce BPF_PROG_TEST_RUN command")
> > > >> Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
> > > >> ---
> > > >>   net/bpf/test_run.c | 1 +
> > > >>   1 file changed, 1 insertion(+)
> > > >>
> > > >> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> > > >> index 13d578ce2a09..058b67108873 100644
> > > >> --- a/net/bpf/test_run.c
> > > >> +++ b/net/bpf/test_run.c
> > > >> @@ -774,6 +774,7 @@ static void *bpf_test_init(const union bpf_attr
> > > >> *kattr, u32 user_size,
> > > >>       if (user_size > size)
> > > >>           return ERR_PTR(-EMSGSIZE);
> > > >> +    size = kmalloc_size_roundup(size);
> > > >>       data = kzalloc(size + headroom + tailroom, GFP_USER);
> > > >
> > > > The fact that you need to do this roundup on call sites feels broken, no?
> > > > Was there some discussion / consensus that now all k*alloc() call sites
> > > > would need to be fixed up? Couldn't this be done transparently in k*alloc()
> > > > when KFENCE is enabled? I presume there may be lots of other such occasions
> > > > in the kernel where similar issue triggers, fixing up all call-sites feels
> > > > like ton of churn compared to api-internal, generic fix.
>
> I hope I answer this in more detail here:
> https://lore.kernel.org/lkml/202211010937.4631CB1B0E@keescook/
>
> The problem is that ksize() should never have existed in the first
> place. :P Every runtime bounds checker has tripped over it, and with
> the addition of the __alloc_size attribute, I had to start ripping
> ksize() out: it can't be used to pretend an allocation grew in size.
> Things need to either preallocate more or go through *realloc() like
> everything else. Luckily, ksize() is rare.
>
> FWIW, the above fix doesn't look correct to me -- I would expect this to
> be:
>
>         size_t alloc_size;
>         ...
>         alloc_size = kmalloc_size_roundup(size + headroom + tailroom);
>         data = kzalloc(alloc_size, GFP_USER);

Making sure the struct skb_shared_info is aligned to a cache line does
not need kmalloc_size_roundup().

What is needed is to adjust @size so that (@size + @headroom) is a
multiple of SMP_CACHE_BYTES


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH -next] bpf, test_run: fix alignment problem in bpf_prog_test_run_skb()
  2022-11-02  4:37         ` Eric Dumazet
@ 2022-11-02  7:19           ` zhongbaisong
  0 siblings, 0 replies; 5+ messages in thread
From: zhongbaisong @ 2022-11-02  7:19 UTC (permalink / raw)
  To: Eric Dumazet, Kees Cook
  Cc: Jakub Kicinski, Daniel Borkmann, davem, pabeni, linux-kernel,
	bpf, netdev, ast, song, yhs, haoluo, Alexander Potapenko,
	Marco Elver, Dmitry Vyukov, Linux MM, kasan-dev



On 2022/11/2 12:37, Eric Dumazet wrote:
> On Tue, Nov 1, 2022 at 9:27 PM Kees Cook <keescook@chromium.org> wrote:
>>
>> On Tue, Nov 01, 2022 at 09:05:42PM -0700, Jakub Kicinski wrote:
>>> On Wed, 2 Nov 2022 10:59:44 +0800 zhongbaisong wrote:
>>>> On 2022/11/2 0:45, Daniel Borkmann wrote:
>>>>> [ +kfence folks ]
>>>>
>>>> + cc: Alexander Potapenko, Marco Elver, Dmitry Vyukov
>>>>
>>>> Do you have any suggestions about this problem?
>>>
>>> + Kees who has been sending similar patches for drivers
>>>
>>>>> On 11/1/22 5:04 AM, Baisong Zhong wrote:
>>>>>> Recently, we got a syzkaller problem because of aarch64
>>>>>> alignment fault if KFENCE enabled.
>>>>>>
>>>>>> When the size from user bpf program is an odd number, like
>>>>>> 399, 407, etc, it will cause skb shard info's alignment access,
>>>>>> as seen below:
>>>>>>
>>>>>> BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0
>>>>>> net/core/skbuff.c:1032
>>>>>>
>>>>>> Use-after-free read at 0xffff6254fffac077 (in kfence-#213):
>>>>>>    __lse_atomic_add arch/arm64/include/asm/atomic_lse.h:26 [inline]
>>>>>>    arch_atomic_add arch/arm64/include/asm/atomic.h:28 [inline]
>>>>>>    arch_atomic_inc include/linux/atomic-arch-fallback.h:270 [inline]
>>>>>>    atomic_inc include/asm-generic/atomic-instrumented.h:241 [inline]
>>>>>>    __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
>>>>>>    skb_clone+0xf4/0x214 net/core/skbuff.c:1481
>>>>>>    ____bpf_clone_redirect net/core/filter.c:2433 [inline]
>>>>>>    bpf_clone_redirect+0x78/0x1c0 net/core/filter.c:2420
>>>>>>    bpf_prog_d3839dd9068ceb51+0x80/0x330
>>>>>>    bpf_dispatcher_nop_func include/linux/bpf.h:728 [inline]
>>>>>>    bpf_test_run+0x3c0/0x6c0 net/bpf/test_run.c:53
>>>>>>    bpf_prog_test_run_skb+0x638/0xa7c net/bpf/test_run.c:594
>>>>>>    bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
>>>>>>    __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
>>>>>>    __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
>>>>>>
>>>>>> kfence-#213: 0xffff6254fffac000-0xffff6254fffac196, size=407,
>>>>>> cache=kmalloc-512
>>>>>>
>>>>>> allocated by task 15074 on cpu 0 at 1342.585390s:
>>>>>>    kmalloc include/linux/slab.h:568 [inline]
>>>>>>    kzalloc include/linux/slab.h:675 [inline]
>>>>>>    bpf_test_init.isra.0+0xac/0x290 net/bpf/test_run.c:191
>>>>>>    bpf_prog_test_run_skb+0x11c/0xa7c net/bpf/test_run.c:512
>>>>>>    bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
>>>>>>    __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
>>>>>>    __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
>>>>>>    __arm64_sys_bpf+0x50/0x60 kernel/bpf/syscall.c:4381
>>>>>>
>>>>>> To fix the problem, we round up allocations with kmalloc_size_roundup()
>>>>>> so that build_skb()'s use of kize() is always alignment and no special
>>>>>> handling of the memory is needed by KFENCE.
>>>>>>
>>>>>> Fixes: 1cf1cae963c2 ("bpf: introduce BPF_PROG_TEST_RUN command")
>>>>>> Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
>>>>>> ---
>>>>>>    net/bpf/test_run.c | 1 +
>>>>>>    1 file changed, 1 insertion(+)
>>>>>>
>>>>>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
>>>>>> index 13d578ce2a09..058b67108873 100644
>>>>>> --- a/net/bpf/test_run.c
>>>>>> +++ b/net/bpf/test_run.c
>>>>>> @@ -774,6 +774,7 @@ static void *bpf_test_init(const union bpf_attr
>>>>>> *kattr, u32 user_size,
>>>>>>        if (user_size > size)
>>>>>>            return ERR_PTR(-EMSGSIZE);
>>>>>> +    size = kmalloc_size_roundup(size);
>>>>>>        data = kzalloc(size + headroom + tailroom, GFP_USER);
>>>>>
>>>>> The fact that you need to do this roundup on call sites feels broken, no?
>>>>> Was there some discussion / consensus that now all k*alloc() call sites
>>>>> would need to be fixed up? Couldn't this be done transparently in k*alloc()
>>>>> when KFENCE is enabled? I presume there may be lots of other such occasions
>>>>> in the kernel where similar issue triggers, fixing up all call-sites feels
>>>>> like ton of churn compared to api-internal, generic fix.
>>
>> I hope I answer this in more detail here:
>> https://lore.kernel.org/lkml/202211010937.4631CB1B0E@keescook/
>>
>> The problem is that ksize() should never have existed in the first
>> place. :P Every runtime bounds checker has tripped over it, and with
>> the addition of the __alloc_size attribute, I had to start ripping
>> ksize() out: it can't be used to pretend an allocation grew in size.
>> Things need to either preallocate more or go through *realloc() like
>> everything else. Luckily, ksize() is rare.
>>
>> FWIW, the above fix doesn't look correct to me -- I would expect this to
>> be:
>>
>>          size_t alloc_size;
>>          ...
>>          alloc_size = kmalloc_size_roundup(size + headroom + tailroom);
>>          data = kzalloc(alloc_size, GFP_USER);
> 
> Making sure the struct skb_shared_info is aligned to a cache line does
> not need kmalloc_size_roundup().
> 
> What is needed is to adjust @size so that (@size + @headroom) is a
> multiple of SMP_CACHE_BYTES

ok, I'll fix it and send v2.

Thanks

.


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-11-02  7:20 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20221101040440.3637007-1-zhongbaisong@huawei.com>
     [not found] ` <eca17bfb-c75f-5db1-f194-5b00c2a0c6f2@iogearbox.net>
2022-11-02  2:59   ` [PATCH -next] bpf, test_run: fix alignment problem in bpf_prog_test_run_skb() zhongbaisong
2022-11-02  4:05     ` Jakub Kicinski
2022-11-02  4:27       ` Kees Cook
2022-11-02  4:37         ` Eric Dumazet
2022-11-02  7:19           ` zhongbaisong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox