From: Björn Töpel
Date: Thu, 10 Feb 2022 17:18:52 +0100
Subject: Re: [syzbot] WARNING: kmalloc bug in xdp_umem_create (2)
To: Daniel Borkmann
Cc: Willy Tarreau, syzbot, akpm@linux-foundation.org, Andrii Nakryiko,
 Alexei Starovoitov, Björn Töpel, bpf, David Miller,
 fgheet255t@gmail.com, Jesper Dangaard Brouer, John Fastabend,
 Jonathan Lemon, Martin KaFai Lau, KP Singh, Jakub Kicinski, LKML,
 linux-mm@kvack.org, "Karlsson, Magnus", mudongliangabcd@gmail.com,
 Netdev, Song Liu, syzkaller-bugs@googlegroups.com, Linus Torvalds,
 Yonghong Song
References: <000000000000a3571605d27817b5@google.com>
 <0000000000001f60ef05d7a3c6ad@google.com>
 <20220210081125.GA4616@1wt.eu>
 <359ee592-747f-8610-4180-5e1d2aba1b77@iogearbox.net>
In-Reply-To: <359ee592-747f-8610-4180-5e1d2aba1b77@iogearbox.net>

On Thu, 10 Feb 2022 at 09:35, Daniel Borkmann wrote:
>
> On 2/10/22 9:11 AM, Willy Tarreau wrote:
> > On Wed, Feb 09, 2022 at 10:08:07PM -0800, syzbot wrote:
> >> syzbot has bisected this issue to:
> >>
> >> commit 7661809d493b426e979f39ab512e3adf41fbcc69
> >> Author: Linus Torvalds
> >> Date:   Wed Jul 14 16:45:49 2021 +0000
> >>
> >>     mm: don't allow oversized kvmalloc() calls
> >>
> >> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13bc74c2700000
> >> start commit:   f4bc5bbb5fef Merge tag 'nfsd-5.17-2' of git://git.kernel.o..
> >> git tree:       upstream
> >> final oops:     https://syzkaller.appspot.com/x/report.txt?x=107c74c2700000
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=17bc74c2700000
> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=5707221760c00a20
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=11421fbbff99b989670e
> >> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12e514a4700000
> >> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15fcdf8a700000
> >>
> >> Reported-by: syzbot+11421fbbff99b989670e@syzkaller.appspotmail.com
> >> Fixes: 7661809d493b ("mm: don't allow oversized kvmalloc() calls")
> >>
> >> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
> >
> > Interesting, so in fact syzkaller has shown that the aforementioned
> > patch does its job well and has spotted a call path by which a single
> > userland setsockopt() can request more than 2 GB allocation in the
> > kernel. Most likely that's in fact what needs to be addressed.
> >
> > FWIW the call trace at the URL above is:
> >
> > Call Trace:
> >  kvmalloc include/linux/mm.h:806 [inline]
> >  kvmalloc_array include/linux/mm.h:824 [inline]
> >  kvcalloc include/linux/mm.h:829 [inline]
> >  xdp_umem_pin_pages net/xdp/xdp_umem.c:102 [inline]
> >  xdp_umem_reg net/xdp/xdp_umem.c:219 [inline]
> >  xdp_umem_create+0x6a5/0xf00 net/xdp/xdp_umem.c:252
> >  xsk_setsockopt+0x604/0x790 net/xdp/xsk.c:1068
> >  __sys_setsockopt+0x1fd/0x4e0 net/socket.c:2176
> >  __do_sys_setsockopt net/socket.c:2187 [inline]
> >  __se_sys_setsockopt net/socket.c:2184 [inline]
> >  __x64_sys_setsockopt+0xb5/0x150 net/socket.c:2184
> >  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> >  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
> >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > and the meaningful part of the repro is:
> >
> >   syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
> >   syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
> >   syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
> >   intptr_t res = 0;
> >   res = syscall(__NR_socket, 0x2cul, 3ul, 0);
> >   if (res != -1)
> >     r[0] = res;
> >   *(uint64_t*)0x20000080 = 0;
> >   *(uint64_t*)0x20000088 = 0xfff02000000;
> >   *(uint32_t*)0x20000090 = 0x800;
> >   *(uint32_t*)0x20000094 = 0;
> >   *(uint32_t*)0x20000098 = 0;
> >   syscall(__NR_setsockopt, r[0], 0x11b, 4, 0x20000080ul, 0x20ul);
>
> Bjorn had a comment back then when the issue was first raised here:
>
>   https://lore.kernel.org/bpf/3f854ca9-f5d6-4065-c7b1-5e5b25ea742f@iogearbox.net/
>
> There was earlier discussion from Andrew to potentially retire the warning:
>
>   https://lore.kernel.org/bpf/20211201202905.b9892171e3f5b9a60f9da251@linux-foundation.org/
>
> Bjorn / Magnus / Andrew, anyone planning to follow-up on this issue?
>

Honestly, I would need some guidance on how to progress.
I could just change from U32_MAX to INT_MAX, but as I stated earlier
(lore link above), that has a hacky feeling to it. Andrew's mail didn't
really lead to a consensus.

From my perspective, the code isn't broken once the memcg limits are
taken into consideration. Introducing a LARGE flag or a new
"_yes_this_can_be_huge_but_it_is_ok()" version would make sense if this
problem is applicable to more users in the kernel.

So, thoughts? ;-)


Björn