* Re: [PATCH -next] bpf, test_run: fix alignment problem in bpf_prog_test_run_skb()
[not found] ` <eca17bfb-c75f-5db1-f194-5b00c2a0c6f2@iogearbox.net>
@ 2022-11-02 2:59 ` zhongbaisong
2022-11-02 4:05 ` Jakub Kicinski
0 siblings, 1 reply; 5+ messages in thread
From: zhongbaisong @ 2022-11-02 2:59 UTC (permalink / raw)
To: Daniel Borkmann, edumazet, davem, kuba, pabeni
Cc: linux-kernel, bpf, netdev, ast, song, yhs, haoluo,
Alexander Potapenko, Marco Elver, Dmitry Vyukov, Linux MM,
kasan-dev, elver, glider, dvyukov
On 2022/11/2 0:45, Daniel Borkmann wrote:
> [ +kfence folks ]
+ cc: Alexander Potapenko, Marco Elver, Dmitry Vyukov
Do you have any suggestions about this problem?
Thanks,
.
>
> On 11/1/22 5:04 AM, Baisong Zhong wrote:
>> Recently, we got a syzkaller problem because of aarch64
>> alignment fault if KFENCE enabled.
>>
>> When the size from user bpf program is an odd number, like
>> 399, 407, etc, it will cause skb shard info's alignment access,
>> as seen below:
>>
>> BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0
>> net/core/skbuff.c:1032
>>
>> Use-after-free read at 0xffff6254fffac077 (in kfence-#213):
>> __lse_atomic_add arch/arm64/include/asm/atomic_lse.h:26 [inline]
>> arch_atomic_add arch/arm64/include/asm/atomic.h:28 [inline]
>> arch_atomic_inc include/linux/atomic-arch-fallback.h:270 [inline]
>> atomic_inc include/asm-generic/atomic-instrumented.h:241 [inline]
>> __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
>> skb_clone+0xf4/0x214 net/core/skbuff.c:1481
>> ____bpf_clone_redirect net/core/filter.c:2433 [inline]
>> bpf_clone_redirect+0x78/0x1c0 net/core/filter.c:2420
>> bpf_prog_d3839dd9068ceb51+0x80/0x330
>> bpf_dispatcher_nop_func include/linux/bpf.h:728 [inline]
>> bpf_test_run+0x3c0/0x6c0 net/bpf/test_run.c:53
>> bpf_prog_test_run_skb+0x638/0xa7c net/bpf/test_run.c:594
>> bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
>> __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
>> __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
>>
>> kfence-#213: 0xffff6254fffac000-0xffff6254fffac196, size=407,
>> cache=kmalloc-512
>>
>> allocated by task 15074 on cpu 0 at 1342.585390s:
>> kmalloc include/linux/slab.h:568 [inline]
>> kzalloc include/linux/slab.h:675 [inline]
>> bpf_test_init.isra.0+0xac/0x290 net/bpf/test_run.c:191
>> bpf_prog_test_run_skb+0x11c/0xa7c net/bpf/test_run.c:512
>> bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
>> __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
>> __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
>> __arm64_sys_bpf+0x50/0x60 kernel/bpf/syscall.c:4381
>>
>> To fix the problem, we round up allocations with kmalloc_size_roundup()
>> so that build_skb()'s use of kize() is always alignment and no special
>> handling of the memory is needed by KFENCE.
>>
>> Fixes: 1cf1cae963c2 ("bpf: introduce BPF_PROG_TEST_RUN command")
>> Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
>> ---
>> net/bpf/test_run.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
>> index 13d578ce2a09..058b67108873 100644
>> --- a/net/bpf/test_run.c
>> +++ b/net/bpf/test_run.c
>> @@ -774,6 +774,7 @@ static void *bpf_test_init(const union bpf_attr
>> *kattr, u32 user_size,
>> if (user_size > size)
>> return ERR_PTR(-EMSGSIZE);
>> + size = kmalloc_size_roundup(size);
>> data = kzalloc(size + headroom + tailroom, GFP_USER);
>
> The fact that you need to do this roundup on call sites feels broken, no?
> Was there some discussion / consensus that now all k*alloc() call sites
> would need to be fixed up? Couldn't this be done transparently in k*alloc()
> when KFENCE is enabled? I presume there may be lots of other such occasions
> in the kernel where similar issue triggers, fixing up all call-sites feels
> like ton of churn compared to api-internal, generic fix.
>
>> if (!data)
>> return ERR_PTR(-ENOMEM);
>>
>
> Thanks,
> Daniel
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH -next] bpf, test_run: fix alignment problem in bpf_prog_test_run_skb()
2022-11-02 2:59 ` [PATCH -next] bpf, test_run: fix alignment problem in bpf_prog_test_run_skb() zhongbaisong
@ 2022-11-02 4:05 ` Jakub Kicinski
2022-11-02 4:27 ` Kees Cook
0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2022-11-02 4:05 UTC (permalink / raw)
To: zhongbaisong
Cc: Daniel Borkmann, edumazet, davem, pabeni, linux-kernel, bpf,
netdev, ast, song, yhs, haoluo, Alexander Potapenko, Marco Elver,
Dmitry Vyukov, Linux MM, kasan-dev, Kees Cook
On Wed, 2 Nov 2022 10:59:44 +0800 zhongbaisong wrote:
> On 2022/11/2 0:45, Daniel Borkmann wrote:
> > [ +kfence folks ]
>
> + cc: Alexander Potapenko, Marco Elver, Dmitry Vyukov
>
> Do you have any suggestions about this problem?
+ Kees who has been sending similar patches for drivers
> > On 11/1/22 5:04 AM, Baisong Zhong wrote:
> >> Recently, we got a syzkaller problem because of aarch64
> >> alignment fault if KFENCE enabled.
> >>
> >> When the size from user bpf program is an odd number, like
> >> 399, 407, etc, it will cause skb shard info's alignment access,
> >> as seen below:
> >>
> >> BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0
> >> net/core/skbuff.c:1032
> >>
> >> Use-after-free read at 0xffff6254fffac077 (in kfence-#213):
> >> __lse_atomic_add arch/arm64/include/asm/atomic_lse.h:26 [inline]
> >> arch_atomic_add arch/arm64/include/asm/atomic.h:28 [inline]
> >> arch_atomic_inc include/linux/atomic-arch-fallback.h:270 [inline]
> >> atomic_inc include/asm-generic/atomic-instrumented.h:241 [inline]
> >> __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
> >> skb_clone+0xf4/0x214 net/core/skbuff.c:1481
> >> ____bpf_clone_redirect net/core/filter.c:2433 [inline]
> >> bpf_clone_redirect+0x78/0x1c0 net/core/filter.c:2420
> >> bpf_prog_d3839dd9068ceb51+0x80/0x330
> >> bpf_dispatcher_nop_func include/linux/bpf.h:728 [inline]
> >> bpf_test_run+0x3c0/0x6c0 net/bpf/test_run.c:53
> >> bpf_prog_test_run_skb+0x638/0xa7c net/bpf/test_run.c:594
> >> bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
> >> __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
> >> __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
> >>
> >> kfence-#213: 0xffff6254fffac000-0xffff6254fffac196, size=407,
> >> cache=kmalloc-512
> >>
> >> allocated by task 15074 on cpu 0 at 1342.585390s:
> >> kmalloc include/linux/slab.h:568 [inline]
> >> kzalloc include/linux/slab.h:675 [inline]
> >> bpf_test_init.isra.0+0xac/0x290 net/bpf/test_run.c:191
> >> bpf_prog_test_run_skb+0x11c/0xa7c net/bpf/test_run.c:512
> >> bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
> >> __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
> >> __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
> >> __arm64_sys_bpf+0x50/0x60 kernel/bpf/syscall.c:4381
> >>
> >> To fix the problem, we round up allocations with kmalloc_size_roundup()
> >> so that build_skb()'s use of kize() is always alignment and no special
> >> handling of the memory is needed by KFENCE.
> >>
> >> Fixes: 1cf1cae963c2 ("bpf: introduce BPF_PROG_TEST_RUN command")
> >> Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
> >> ---
> >> net/bpf/test_run.c | 1 +
> >> 1 file changed, 1 insertion(+)
> >>
> >> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> >> index 13d578ce2a09..058b67108873 100644
> >> --- a/net/bpf/test_run.c
> >> +++ b/net/bpf/test_run.c
> >> @@ -774,6 +774,7 @@ static void *bpf_test_init(const union bpf_attr
> >> *kattr, u32 user_size,
> >> if (user_size > size)
> >> return ERR_PTR(-EMSGSIZE);
> >> + size = kmalloc_size_roundup(size);
> >> data = kzalloc(size + headroom + tailroom, GFP_USER);
> >
> > The fact that you need to do this roundup on call sites feels broken, no?
> > Was there some discussion / consensus that now all k*alloc() call sites
> > would need to be fixed up? Couldn't this be done transparently in k*alloc()
> > when KFENCE is enabled? I presume there may be lots of other such occasions
> > in the kernel where similar issue triggers, fixing up all call-sites feels
> > like ton of churn compared to api-internal, generic fix.
> >
> >> if (!data)
> >> return ERR_PTR(-ENOMEM);
> >>
> >
> > Thanks,
> > Daniel
> >
>
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH -next] bpf, test_run: fix alignment problem in bpf_prog_test_run_skb()
2022-11-02 4:05 ` Jakub Kicinski
@ 2022-11-02 4:27 ` Kees Cook
2022-11-02 4:37 ` Eric Dumazet
0 siblings, 1 reply; 5+ messages in thread
From: Kees Cook @ 2022-11-02 4:27 UTC (permalink / raw)
To: Jakub Kicinski
Cc: zhongbaisong, Daniel Borkmann, edumazet, davem, pabeni,
linux-kernel, bpf, netdev, ast, song, yhs, haoluo,
Alexander Potapenko, Marco Elver, Dmitry Vyukov, Linux MM,
kasan-dev
On Tue, Nov 01, 2022 at 09:05:42PM -0700, Jakub Kicinski wrote:
> On Wed, 2 Nov 2022 10:59:44 +0800 zhongbaisong wrote:
> > On 2022/11/2 0:45, Daniel Borkmann wrote:
> > > [ +kfence folks ]
> >
> > + cc: Alexander Potapenko, Marco Elver, Dmitry Vyukov
> >
> > Do you have any suggestions about this problem?
>
> + Kees who has been sending similar patches for drivers
>
> > > On 11/1/22 5:04 AM, Baisong Zhong wrote:
> > >> Recently, we got a syzkaller problem because of aarch64
> > >> alignment fault if KFENCE enabled.
> > >>
> > >> When the size from user bpf program is an odd number, like
> > >> 399, 407, etc, it will cause skb shard info's alignment access,
> > >> as seen below:
> > >>
> > >> BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0
> > >> net/core/skbuff.c:1032
> > >>
> > >> Use-after-free read at 0xffff6254fffac077 (in kfence-#213):
> > >> __lse_atomic_add arch/arm64/include/asm/atomic_lse.h:26 [inline]
> > >> arch_atomic_add arch/arm64/include/asm/atomic.h:28 [inline]
> > >> arch_atomic_inc include/linux/atomic-arch-fallback.h:270 [inline]
> > >> atomic_inc include/asm-generic/atomic-instrumented.h:241 [inline]
> > >> __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
> > >> skb_clone+0xf4/0x214 net/core/skbuff.c:1481
> > >> ____bpf_clone_redirect net/core/filter.c:2433 [inline]
> > >> bpf_clone_redirect+0x78/0x1c0 net/core/filter.c:2420
> > >> bpf_prog_d3839dd9068ceb51+0x80/0x330
> > >> bpf_dispatcher_nop_func include/linux/bpf.h:728 [inline]
> > >> bpf_test_run+0x3c0/0x6c0 net/bpf/test_run.c:53
> > >> bpf_prog_test_run_skb+0x638/0xa7c net/bpf/test_run.c:594
> > >> bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
> > >> __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
> > >> __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
> > >>
> > >> kfence-#213: 0xffff6254fffac000-0xffff6254fffac196, size=407,
> > >> cache=kmalloc-512
> > >>
> > >> allocated by task 15074 on cpu 0 at 1342.585390s:
> > >> kmalloc include/linux/slab.h:568 [inline]
> > >> kzalloc include/linux/slab.h:675 [inline]
> > >> bpf_test_init.isra.0+0xac/0x290 net/bpf/test_run.c:191
> > >> bpf_prog_test_run_skb+0x11c/0xa7c net/bpf/test_run.c:512
> > >> bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
> > >> __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
> > >> __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
> > >> __arm64_sys_bpf+0x50/0x60 kernel/bpf/syscall.c:4381
> > >>
> > >> To fix the problem, we round up allocations with kmalloc_size_roundup()
> > >> so that build_skb()'s use of kize() is always alignment and no special
> > >> handling of the memory is needed by KFENCE.
> > >>
> > >> Fixes: 1cf1cae963c2 ("bpf: introduce BPF_PROG_TEST_RUN command")
> > >> Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
> > >> ---
> > >> net/bpf/test_run.c | 1 +
> > >> 1 file changed, 1 insertion(+)
> > >>
> > >> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> > >> index 13d578ce2a09..058b67108873 100644
> > >> --- a/net/bpf/test_run.c
> > >> +++ b/net/bpf/test_run.c
> > >> @@ -774,6 +774,7 @@ static void *bpf_test_init(const union bpf_attr
> > >> *kattr, u32 user_size,
> > >> if (user_size > size)
> > >> return ERR_PTR(-EMSGSIZE);
> > >> + size = kmalloc_size_roundup(size);
> > >> data = kzalloc(size + headroom + tailroom, GFP_USER);
> > >
> > > The fact that you need to do this roundup on call sites feels broken, no?
> > > Was there some discussion / consensus that now all k*alloc() call sites
> > > would need to be fixed up? Couldn't this be done transparently in k*alloc()
> > > when KFENCE is enabled? I presume there may be lots of other such occasions
> > > in the kernel where similar issue triggers, fixing up all call-sites feels
> > > like ton of churn compared to api-internal, generic fix.
I hope I answer this in more detail here:
https://lore.kernel.org/lkml/202211010937.4631CB1B0E@keescook/
The problem is that ksize() should never have existed in the first
place. :P Every runtime bounds checker has tripped over it, and with
the addition of the __alloc_size attribute, I had to start ripping
ksize() out: it can't be used to pretend an allocation grew in size.
Things need to either preallocate more or go through *realloc() like
everything else. Luckily, ksize() is rare.
FWIW, the above fix doesn't look correct to me -- I would expect this to
be:
size_t alloc_size;
...
alloc_size = kmalloc_size_roundup(size + headroom + tailroom);
data = kzalloc(alloc_size, GFP_USER);
--
Kees Cook
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH -next] bpf, test_run: fix alignment problem in bpf_prog_test_run_skb()
2022-11-02 4:27 ` Kees Cook
@ 2022-11-02 4:37 ` Eric Dumazet
2022-11-02 7:19 ` zhongbaisong
0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2022-11-02 4:37 UTC (permalink / raw)
To: Kees Cook
Cc: Jakub Kicinski, zhongbaisong, Daniel Borkmann, davem, pabeni,
linux-kernel, bpf, netdev, ast, song, yhs, haoluo,
Alexander Potapenko, Marco Elver, Dmitry Vyukov, Linux MM,
kasan-dev
On Tue, Nov 1, 2022 at 9:27 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Tue, Nov 01, 2022 at 09:05:42PM -0700, Jakub Kicinski wrote:
> > On Wed, 2 Nov 2022 10:59:44 +0800 zhongbaisong wrote:
> > > On 2022/11/2 0:45, Daniel Borkmann wrote:
> > > > [ +kfence folks ]
> > >
> > > + cc: Alexander Potapenko, Marco Elver, Dmitry Vyukov
> > >
> > > Do you have any suggestions about this problem?
> >
> > + Kees who has been sending similar patches for drivers
> >
> > > > On 11/1/22 5:04 AM, Baisong Zhong wrote:
> > > >> Recently, we got a syzkaller problem because of aarch64
> > > >> alignment fault if KFENCE enabled.
> > > >>
> > > >> When the size from user bpf program is an odd number, like
> > > >> 399, 407, etc, it will cause skb shard info's alignment access,
> > > >> as seen below:
> > > >>
> > > >> BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0
> > > >> net/core/skbuff.c:1032
> > > >>
> > > >> Use-after-free read at 0xffff6254fffac077 (in kfence-#213):
> > > >> __lse_atomic_add arch/arm64/include/asm/atomic_lse.h:26 [inline]
> > > >> arch_atomic_add arch/arm64/include/asm/atomic.h:28 [inline]
> > > >> arch_atomic_inc include/linux/atomic-arch-fallback.h:270 [inline]
> > > >> atomic_inc include/asm-generic/atomic-instrumented.h:241 [inline]
> > > >> __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
> > > >> skb_clone+0xf4/0x214 net/core/skbuff.c:1481
> > > >> ____bpf_clone_redirect net/core/filter.c:2433 [inline]
> > > >> bpf_clone_redirect+0x78/0x1c0 net/core/filter.c:2420
> > > >> bpf_prog_d3839dd9068ceb51+0x80/0x330
> > > >> bpf_dispatcher_nop_func include/linux/bpf.h:728 [inline]
> > > >> bpf_test_run+0x3c0/0x6c0 net/bpf/test_run.c:53
> > > >> bpf_prog_test_run_skb+0x638/0xa7c net/bpf/test_run.c:594
> > > >> bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
> > > >> __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
> > > >> __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
> > > >>
> > > >> kfence-#213: 0xffff6254fffac000-0xffff6254fffac196, size=407,
> > > >> cache=kmalloc-512
> > > >>
> > > >> allocated by task 15074 on cpu 0 at 1342.585390s:
> > > >> kmalloc include/linux/slab.h:568 [inline]
> > > >> kzalloc include/linux/slab.h:675 [inline]
> > > >> bpf_test_init.isra.0+0xac/0x290 net/bpf/test_run.c:191
> > > >> bpf_prog_test_run_skb+0x11c/0xa7c net/bpf/test_run.c:512
> > > >> bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
> > > >> __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
> > > >> __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
> > > >> __arm64_sys_bpf+0x50/0x60 kernel/bpf/syscall.c:4381
> > > >>
> > > >> To fix the problem, we round up allocations with kmalloc_size_roundup()
> > > >> so that build_skb()'s use of kize() is always alignment and no special
> > > >> handling of the memory is needed by KFENCE.
> > > >>
> > > >> Fixes: 1cf1cae963c2 ("bpf: introduce BPF_PROG_TEST_RUN command")
> > > >> Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
> > > >> ---
> > > >> net/bpf/test_run.c | 1 +
> > > >> 1 file changed, 1 insertion(+)
> > > >>
> > > >> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> > > >> index 13d578ce2a09..058b67108873 100644
> > > >> --- a/net/bpf/test_run.c
> > > >> +++ b/net/bpf/test_run.c
> > > >> @@ -774,6 +774,7 @@ static void *bpf_test_init(const union bpf_attr
> > > >> *kattr, u32 user_size,
> > > >> if (user_size > size)
> > > >> return ERR_PTR(-EMSGSIZE);
> > > >> + size = kmalloc_size_roundup(size);
> > > >> data = kzalloc(size + headroom + tailroom, GFP_USER);
> > > >
> > > > The fact that you need to do this roundup on call sites feels broken, no?
> > > > Was there some discussion / consensus that now all k*alloc() call sites
> > > > would need to be fixed up? Couldn't this be done transparently in k*alloc()
> > > > when KFENCE is enabled? I presume there may be lots of other such occasions
> > > > in the kernel where similar issue triggers, fixing up all call-sites feels
> > > > like ton of churn compared to api-internal, generic fix.
>
> I hope I answer this in more detail here:
> https://lore.kernel.org/lkml/202211010937.4631CB1B0E@keescook/
>
> The problem is that ksize() should never have existed in the first
> place. :P Every runtime bounds checker has tripped over it, and with
> the addition of the __alloc_size attribute, I had to start ripping
> ksize() out: it can't be used to pretend an allocation grew in size.
> Things need to either preallocate more or go through *realloc() like
> everything else. Luckily, ksize() is rare.
>
> FWIW, the above fix doesn't look correct to me -- I would expect this to
> be:
>
> size_t alloc_size;
> ...
> alloc_size = kmalloc_size_roundup(size + headroom + tailroom);
> data = kzalloc(alloc_size, GFP_USER);
Making sure the struct skb_shared_info is aligned to a cache line does
not need kmalloc_size_roundup().
What is needed is to adjust @size so that (@size + @headroom) is a
multiple of SMP_CACHE_BYTES
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH -next] bpf, test_run: fix alignment problem in bpf_prog_test_run_skb()
2022-11-02 4:37 ` Eric Dumazet
@ 2022-11-02 7:19 ` zhongbaisong
0 siblings, 0 replies; 5+ messages in thread
From: zhongbaisong @ 2022-11-02 7:19 UTC (permalink / raw)
To: Eric Dumazet, Kees Cook
Cc: Jakub Kicinski, Daniel Borkmann, davem, pabeni, linux-kernel,
bpf, netdev, ast, song, yhs, haoluo, Alexander Potapenko,
Marco Elver, Dmitry Vyukov, Linux MM, kasan-dev
On 2022/11/2 12:37, Eric Dumazet wrote:
> On Tue, Nov 1, 2022 at 9:27 PM Kees Cook <keescook@chromium.org> wrote:
>>
>> On Tue, Nov 01, 2022 at 09:05:42PM -0700, Jakub Kicinski wrote:
>>> On Wed, 2 Nov 2022 10:59:44 +0800 zhongbaisong wrote:
>>>> On 2022/11/2 0:45, Daniel Borkmann wrote:
>>>>> [ +kfence folks ]
>>>>
>>>> + cc: Alexander Potapenko, Marco Elver, Dmitry Vyukov
>>>>
>>>> Do you have any suggestions about this problem?
>>>
>>> + Kees who has been sending similar patches for drivers
>>>
>>>>> On 11/1/22 5:04 AM, Baisong Zhong wrote:
>>>>>> Recently, we got a syzkaller problem because of aarch64
>>>>>> alignment fault if KFENCE enabled.
>>>>>>
>>>>>> When the size from user bpf program is an odd number, like
>>>>>> 399, 407, etc, it will cause skb shard info's alignment access,
>>>>>> as seen below:
>>>>>>
>>>>>> BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0
>>>>>> net/core/skbuff.c:1032
>>>>>>
>>>>>> Use-after-free read at 0xffff6254fffac077 (in kfence-#213):
>>>>>> __lse_atomic_add arch/arm64/include/asm/atomic_lse.h:26 [inline]
>>>>>> arch_atomic_add arch/arm64/include/asm/atomic.h:28 [inline]
>>>>>> arch_atomic_inc include/linux/atomic-arch-fallback.h:270 [inline]
>>>>>> atomic_inc include/asm-generic/atomic-instrumented.h:241 [inline]
>>>>>> __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
>>>>>> skb_clone+0xf4/0x214 net/core/skbuff.c:1481
>>>>>> ____bpf_clone_redirect net/core/filter.c:2433 [inline]
>>>>>> bpf_clone_redirect+0x78/0x1c0 net/core/filter.c:2420
>>>>>> bpf_prog_d3839dd9068ceb51+0x80/0x330
>>>>>> bpf_dispatcher_nop_func include/linux/bpf.h:728 [inline]
>>>>>> bpf_test_run+0x3c0/0x6c0 net/bpf/test_run.c:53
>>>>>> bpf_prog_test_run_skb+0x638/0xa7c net/bpf/test_run.c:594
>>>>>> bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
>>>>>> __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
>>>>>> __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
>>>>>>
>>>>>> kfence-#213: 0xffff6254fffac000-0xffff6254fffac196, size=407,
>>>>>> cache=kmalloc-512
>>>>>>
>>>>>> allocated by task 15074 on cpu 0 at 1342.585390s:
>>>>>> kmalloc include/linux/slab.h:568 [inline]
>>>>>> kzalloc include/linux/slab.h:675 [inline]
>>>>>> bpf_test_init.isra.0+0xac/0x290 net/bpf/test_run.c:191
>>>>>> bpf_prog_test_run_skb+0x11c/0xa7c net/bpf/test_run.c:512
>>>>>> bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
>>>>>> __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
>>>>>> __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
>>>>>> __arm64_sys_bpf+0x50/0x60 kernel/bpf/syscall.c:4381
>>>>>>
>>>>>> To fix the problem, we round up allocations with kmalloc_size_roundup()
>>>>>> so that build_skb()'s use of kize() is always alignment and no special
>>>>>> handling of the memory is needed by KFENCE.
>>>>>>
>>>>>> Fixes: 1cf1cae963c2 ("bpf: introduce BPF_PROG_TEST_RUN command")
>>>>>> Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
>>>>>> ---
>>>>>> net/bpf/test_run.c | 1 +
>>>>>> 1 file changed, 1 insertion(+)
>>>>>>
>>>>>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
>>>>>> index 13d578ce2a09..058b67108873 100644
>>>>>> --- a/net/bpf/test_run.c
>>>>>> +++ b/net/bpf/test_run.c
>>>>>> @@ -774,6 +774,7 @@ static void *bpf_test_init(const union bpf_attr
>>>>>> *kattr, u32 user_size,
>>>>>> if (user_size > size)
>>>>>> return ERR_PTR(-EMSGSIZE);
>>>>>> + size = kmalloc_size_roundup(size);
>>>>>> data = kzalloc(size + headroom + tailroom, GFP_USER);
>>>>>
>>>>> The fact that you need to do this roundup on call sites feels broken, no?
>>>>> Was there some discussion / consensus that now all k*alloc() call sites
>>>>> would need to be fixed up? Couldn't this be done transparently in k*alloc()
>>>>> when KFENCE is enabled? I presume there may be lots of other such occasions
>>>>> in the kernel where similar issue triggers, fixing up all call-sites feels
>>>>> like ton of churn compared to api-internal, generic fix.
>>
>> I hope I answer this in more detail here:
>> https://lore.kernel.org/lkml/202211010937.4631CB1B0E@keescook/
>>
>> The problem is that ksize() should never have existed in the first
>> place. :P Every runtime bounds checker has tripped over it, and with
>> the addition of the __alloc_size attribute, I had to start ripping
>> ksize() out: it can't be used to pretend an allocation grew in size.
>> Things need to either preallocate more or go through *realloc() like
>> everything else. Luckily, ksize() is rare.
>>
>> FWIW, the above fix doesn't look correct to me -- I would expect this to
>> be:
>>
>> size_t alloc_size;
>> ...
>> alloc_size = kmalloc_size_roundup(size + headroom + tailroom);
>> data = kzalloc(alloc_size, GFP_USER);
>
> Making sure the struct skb_shared_info is aligned to a cache line does
> not need kmalloc_size_roundup().
>
> What is needed is to adjust @size so that (@size + @headroom) is a
> multiple of SMP_CACHE_BYTES
ok, I'll fix it and send v2.
Thanks
.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2022-11-02 7:20 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20221101040440.3637007-1-zhongbaisong@huawei.com>
[not found] ` <eca17bfb-c75f-5db1-f194-5b00c2a0c6f2@iogearbox.net>
2022-11-02 2:59 ` [PATCH -next] bpf, test_run: fix alignment problem in bpf_prog_test_run_skb() zhongbaisong
2022-11-02 4:05 ` Jakub Kicinski
2022-11-02 4:27 ` Kees Cook
2022-11-02 4:37 ` Eric Dumazet
2022-11-02 7:19 ` zhongbaisong
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox