From: "Zach O'Keefe"
Date: Mon, 13 Mar 2023 12:38:25 -0700
Subject: Re: [syzbot] [mm?] kernel BUG in hpage_collapse_scan_file
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com, Matthew Wilcox
References: <000000000000226a6105f6954b47@google.com> <20230313191557.6lm53ndvwoxtf5dz@google.com>
In-Reply-To: <20230313191557.6lm53ndvwoxtf5dz@google.com>

On Mon, Mar 13, 2023 at 12:16 PM Zach O'Keefe wrote:
>
> On Mar 10 17:02, Zach O'Keefe wrote:
> > On Fri, Mar 10, 2023 at 4:52 PM syzbot wrote:
> > >
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit:    857f1268a591 Merge tag 'objtool-core-2023-03-02' of git://..
> > > git tree:       upstream
> > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=168e1032c80000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=f763d89e26d3d4c4
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=9578faa5475acb35fa50
> > > compiler:       Debian clang version 15.0.7, GNU ld (GNU Binutils for Debian) 2.35.2
> > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=179e4e12c80000
> > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=119cce98c80000
> > >
> > > Downloadable assets:
> > > disk image: https://storage.googleapis.com/syzbot-assets/b3b7a7e333f1/disk-857f1268.raw.xz
> > > vmlinux: https://storage.googleapis.com/syzbot-assets/5940be1cf171/vmlinux-857f1268.xz
> > > kernel image: https://storage.googleapis.com/syzbot-assets/986015398e4a/bzImage-857f1268.xz
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+9578faa5475acb35fa50@syzkaller.appspotmail.com
> > >
> > > ------------[ cut here ]------------
> > > kernel BUG at mm/khugepaged.c:1823!
> > > invalid opcode: 0000 [#1] PREEMPT SMP KASAN
> > > CPU: 1 PID: 5097 Comm: syz-executor220 Not tainted 6.2.0-syzkaller-13154-g857f1268a591 #0
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/16/2023
> > > RIP: 0010:collapse_file mm/khugepaged.c:1823 [inline]
> > > RIP: 0010:hpage_collapse_scan_file+0x67c8/0x7580 mm/khugepaged.c:2233
> > > Code: 00 00 89 de e8 c9 66 a3 ff 31 ff 89 de e8 c0 66 a3 ff 45 84 f6 0f 85 28 0d 00 00 e8 22 64 a3 ff e9 dc f7 ff ff e8 18 64 a3 ff <0f> 0b f3 0f 1e fa e8 0d 64 a3 ff e9 93 f6 ff ff f3 0f 1e fa 4c 89
> > > RSP: 0018:ffffc90003dff4e0 EFLAGS: 00010093
> > > RAX: ffffffff81e95988 RBX: 00000000000001c1 RCX: ffff8880205b3a80
> > > RDX: 0000000000000000 RSI: 00000000000001c0 RDI: 00000000000001c1
> > > RBP: ffffc90003dff830 R08: ffffffff81e90e67 R09: fffffbfff1a433c3
> > > R10: 0000000000000000 R11: dffffc0000000001 R12: 0000000000000000
> > > R13: ffffc90003dff6c0 R14: 00000000000001c0 R15: 0000000000000000
> > > FS: 00007fdbae5ee700(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
> > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 00007fdbae6901e0 CR3: 000000007b2dd000 CR4: 00000000003506e0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > Call Trace:
> > >
> > >  madvise_collapse+0x721/0xf50 mm/khugepaged.c:2693
> > >  madvise_vma_behavior mm/madvise.c:1086 [inline]
> > >  madvise_walk_vmas mm/madvise.c:1260 [inline]
> > >  do_madvise+0x9e5/0x4680 mm/madvise.c:1439
> > >  __do_sys_madvise mm/madvise.c:1452 [inline]
> > >  __se_sys_madvise mm/madvise.c:1450 [inline]
> > >  __x64_sys_madvise+0xa5/0xb0 mm/madvise.c:1450
> > >  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > >  do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
> > >  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > > RIP: 0033:0x7fdbae65dc39
> > > Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 11 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
> > > RSP: 002b:00007fdbae5ee2f8 EFLAGS: 00000246 ORIG_RAX: 000000000000001c
> > > RAX: ffffffffffffffda RBX: 00007fdbae6e64b8 RCX: 00007fdbae65dc39
> > > RDX: 0000000000000019 RSI: 000000000060005f RDI: 0000000020000000
> > > RBP: 00007fdbae6e64b0 R08: 0000000000000001 R09: 0000000000000033
> > > R10: 0000000000000000 R11: 0000000000000246 R12: 00007fdbae5ee300
> > > R13: 0000000000000001 R14: 00007fdbae5ee400 R15: 0000000000022000
> > >
> > > Modules linked in:
> > > ---[ end trace 0000000000000000 ]---
> > > RIP: 0010:collapse_file mm/khugepaged.c:1823 [inline]
> > > RIP: 0010:hpage_collapse_scan_file+0x67c8/0x7580 mm/khugepaged.c:2233
> > > Code: 00 00 89 de e8 c9 66 a3 ff 31 ff 89 de e8 c0 66 a3 ff 45 84 f6 0f 85 28 0d 00 00 e8 22 64 a3 ff e9 dc f7 ff ff e8 18 64 a3 ff <0f> 0b f3 0f 1e fa e8 0d 64 a3 ff e9 93 f6 ff ff f3 0f 1e fa 4c 89
> > > RSP: 0018:ffffc90003dff4e0 EFLAGS: 00010093
> > > RAX: ffffffff81e95988 RBX: 00000000000001c1 RCX: ffff8880205b3a80
> > > RDX: 0000000000000000 RSI: 00000000000001c0 RDI: 00000000000001c1
> > > RBP: ffffc90003dff830 R08: ffffffff81e90e67 R09: fffffbfff1a433c3
> > > R10: 0000000000000000 R11: dffffc0000000001 R12: 0000000000000000
> > > R13: ffffc90003dff6c0 R14: 00000000000001c0 R15: 0000000000000000
> > > FS: 00007fdbae5ee700(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
> > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 00007fdbae6901e0 CR3: 000000007b2dd000 CR4: 00000000003506e0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > >
> > >
> > > ---
> > > This report is generated by a bot. It may contain errors.
> > > See https://goo.gl/tpsmEJ for more information about syzbot.
> > > syzbot engineers can be reached at syzkaller@googlegroups.com.
> > >
> > > syzbot will keep track of this issue. See:
> > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> > > syzbot can test patches for this issue, for details see:
> > > https://goo.gl/tpsmEJ#testing-patches
> >
> > I had a look at this, and the issue stems from a failed (due to
> > error injection here) xas_store() in collapse_file() (in this report,
> > specifically was picking on shmem after MADV_REMOVE punch). This puts
> > the xa_state into an error state (-ENOMEM) and the subsequent
> > xas_next() will (a) not increment xas->xa_index (which trips the
> > VM_BUG_ON), and (b) return NULL (which is confusing, since AFAIU,
> > that's a "valid" entry for a truncated page cache entry, but also
> > being used to indicate error).
> >
> > I think the right thing to do is to check xas_invalid() at the top of
> > the loop, or to check the return value of all those xas_store()s and
> > take appropriate action. There is also the possibility this never
> > occurs in practice due to the "Ensure we have slots for all the pages
> > in the range" check at the top of the function, and that we are only
> > able to trip this from error injection.
>
> Right, so looking a bit more into this this morning, my last question about
> whether the xas_create_range() check at the top of collapse_file() guarantees
> us the needed slots (and that syzbot was only able to trip this due to error
> injection) has a plain answer: it does not. We are actually attempting to
> allocate memory here, so clearly the slots weren't already available - duh.
>
> Now, why isn't that well-intentioned pre-reservation enough? Well, we are
> dropping the xarray lock ~ every iteration of the for-loop, then relocking it
> to store the hugepage at the current index. While the lock is dropped, there
> isn't anything protecting us from racing with page_cache_delete() -- here,
> from
>
>   __filemap_remove_folio()
>   truncate_inode_folio()
>   shmem_undo_range()
>   shmem_truncate_range()
>   vfs_fallocate()
>   madvise_remove()
>
> which can then remove slots out from under us:
>
>   xas_delete_node()
>   update_node()
>   xas_store()
>   page_cache_delete()
>
> So, I think this code needs to be guarded against concurrent slot removal.
>
> I think just giving up is the best (i.e. simplest) route (vs taking some
> additional measures to serialize vs concurrent removal). One concern is that
> if we've encountered an ENOMEM situation where xas_store() is failing, then
> the rollback code also won't work correctly. However, the rollback xas_store()
> will either replace the current hpage entry with the previous entry, or
> replace it with a NULL entry (had it been a hole previously) -- neither of
> which will involve any additional allocations -- so we're safe.
>
> Patch to fix this should be following in the next day or so.
>
>

Also, to be clear, the concurrent removal isn't actually a problem on
its own; it's only concurrent removal + subsequent inability to
allocate the missing xarray slot that is the issue.
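
To make the failure mode discussed above a bit more concrete, here is a small
userspace toy model in C. It is not kernel code and does not use the real
xarray API: toy_cursor, toy_store(), toy_next() and toy_walk() are
hypothetical names made up for this sketch. It only mimics the behaviour
described in this thread, under the assumption that a failed store leaves the
cursor in a sticky error state and that advancing in that state returns NULL
without moving the index.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_NSLOTS 8
#define TOY_NO_FAIL (~0UL)       /* hypothetical: never inject a failure */
#define TOY_INVARIANT_BROKEN 1   /* hypothetical: stands in for the BUG */

/* Toy cursor: a position plus a sticky error, nothing more. */
struct toy_cursor {
	unsigned long index;
	int err;                 /* 0, or a negative errno after a failed store */
};

static void *toy_slots[TOY_NSLOTS];
static int toy_marker;           /* dummy entry to store */

/* Store at the cursor; inject_fail simulates a failed slot allocation. */
static void toy_store(struct toy_cursor *c, void *entry, int inject_fail)
{
	if (c->err)
		return;          /* already in error state: do nothing */
	if (inject_fail) {
		c->err = -ENOMEM;
		return;
	}
	assert(c->index < TOY_NSLOTS);
	toy_slots[c->index] = entry;
}

/* Advance the cursor; in error state, stay put and return NULL. */
static void *toy_next(struct toy_cursor *c)
{
	if (c->err)
		return NULL;
	c->index++;
	return c->index < TOY_NSLOTS ? toy_slots[c->index] : NULL;
}

/*
 * Walk [start, end), storing into each slot. The store at fail_at fails.
 * With check_err set, give up as soon as the cursor is in error state.
 * Returns 0, a negative errno, or TOY_INVARIANT_BROKEN if the loop index
 * and the cursor index have drifted apart.
 */
static int toy_walk(unsigned long start, unsigned long end,
		    unsigned long fail_at, int check_err)
{
	struct toy_cursor c = { .index = start, .err = 0 };
	unsigned long i;

	for (i = start; i < end; i++) {
		if (c.index != i)
			return TOY_INVARIANT_BROKEN;
		toy_store(&c, &toy_marker, i == fail_at);
		if (check_err && c.err)
			return c.err;
		toy_next(&c);
	}
	return c.err;
}
```

With check_err set, toy_walk() gives up with -ENOMEM as soon as the store
fails, which corresponds to the "just give up" route. With it clear, the next
iteration sees a stale cursor index, which is the toy analogue of tripping the
VM_BUG_ON.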