From: Linus Torvalds
Date: Sun, 1 Jan 2023 17:40:07 -0800
Subject: Re: [syzbot] [ntfs3?] INFO: task hung in do_user_addr_fault (3)
To: Hillf Danton
Cc: syzbot, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Tetsuo Handa, Waiman Long, Matthew Wilcox, syzkaller-bugs@googlegroups.com
References: <00000000000060d41f05f139aa44@google.com> <20230102005409.3474-1-hdanton@sina.com>
In-Reply-To: <20230102005409.3474-1-hdanton@sina.com>

On Sun, Jan 1, 2023 at 4:54 PM Hillf Danton wrote:
>
> >  ni_lock fs/ntfs3/ntfs_fs.h:1122 [inline]

Something holds the ni_lock, so this process has blocked on it, and
this all happens inside mmap():

> >  attr_data_get_block+0x4a6/0x2e40 fs/ntfs3/attrib.c:919
> >  ntfs_file_mmap+0x4cc/0x780 fs/ntfs3/file.c:296
> >  call_mmap include/linux/fs.h:2191 [inline]
> >  mmap_region+0x1022/0x1e60 mm/mmap.c:2621
> >  do_mmap+0x8d9/0xf30 mm/mmap.c:1411
> >  vm_mmap_pgoff+0x1e5/0x2f0 mm/util.c:520

so this code holds the mmap_lock for writing, which is why all those
other processes are hung on getting it for reading for page faults
etc.

End result: ignore all those page fault processes, this mmap_lock ->
ni_lock explains them all, and they aren't the cause.

> >  folio_wait_bit_common+0x8ca/0x1390 mm/filemap.c:1297
> >  folio_lock include/linux/pagemap.h:938 [inline]
> >  truncate_inode_pages_range+0xc8d/0x1650 mm/truncate.c:421
> >  truncate_inode_pages mm/truncate.c:448 [inline]
> >  truncate_pagecache mm/truncate.c:743 [inline]
> >  truncate_setsize+0xcb/0xf0 mm/truncate.c:768
> >  ntfs_truncate fs/ntfs3/file.c:395 [inline]

.. and this thread is waiting on the page lock (well, folio, same
thing), and the IO apparently isn't completing.

And that seems to be because this one is busy reading the page, and
blocked on that same ni_lock:

> > task:syz-executor394 state:D stack:24072 pid:6048 ppid:5125 flags:0x00004004
> > Call Trace:
> >
> >  ni_lock fs/ntfs3/ntfs_fs.h:1122 [inline]
> >  attr_data_get_block+0x4a6/0x2e40 fs/ntfs3/attrib.c:919
> >  ntfs_get_block_vbo+0x374/0xd20 fs/ntfs3/inode.c:573
> >  do_mpage_readpage+0x98b/0x1bb0 fs/mpage.c:208
> >  mpage_read_folio+0x103/0x1d0 fs/mpage.c:379

But our debugging output looks a bit bogus:

> > Showing all locks held in the system:

> > 3 locks held by syz-executor394/5214:
> >  #0: ffff88801ee04460 (sb_writers#9){.+.+}-{0:0}, at: do_sendfile+0x61c/0xfd0 fs/read_write.c:1254
> >  #1: ffff888073930ca0 (mapping.invalidate_lock#3){.+.+}-{3:3}, at: filemap_invalidate_lock_shared include/linux/fs.h:811 [inline]
> >  #1: ffff888073930ca0 (mapping.invalidate_lock#3){.+.+}-{3:3}, at: filemap_update_page+0x72/0x550 mm/filemap.c:2478
> >  #2: ffff888073930860 (&ni->ni_lock/4){+.+.}-{3:3}, at: ni_lock fs/ntfs3/ntfs_fs.h:1122 [inline]
> >  #2: ffff888073930860 (&ni->ni_lock/4){+.+.}-{3:3}, at: attr_data_get_block+0x4a6/0x2e40 fs/ntfs3/attrib.c:919

It's showing 394/5214 as "holding" the lock, even though it's just
waiting for it - it's the one doing the readpage.

I think it's just because lockdep ends up adding the lock to the queue
before it actually gets the lock, so anybody pending will be shown as
"holding" it.

And the 5221 one:

> > 2 locks held by syz-executor394/5221:
> >  #0: ffff88802c7bc758 (&mm->mmap_lock){++++}-{3:3}, at: mmap_write_lock_killable include/linux/mmap_lock.h:87 [inline]
> >  #0: ffff88802c7bc758 (&mm->mmap_lock){++++}-{3:3}, at: vm_mmap_pgoff+0x18f/0x2f0 mm/util.c:518
> >  #1: ffff888073930860 (&ni->ni_lock/4){+.+.}-{3:3}, at: ni_lock fs/ntfs3/ntfs_fs.h:1122 [inline]
> >  #1: ffff888073930860 (&ni->ni_lock/4){+.+.}-{3:3}, at: attr_data_get_block+0x4a6/0x2e40 fs/ntfs3/attrib.c:919

is that mmap() one, which is waiting for the ni_lock too (while
holding the mmap_lock, which is why the page faulters are all blocked).

But 5222 is interesting: it is the truncate one, it's waiting for
the page lock, and it really does seem to hold the ni_lock:

> > 3 locks held by syz-executor394/5222:
> >  #0: ffff88801ee04460 (sb_writers#9){.+.+}-{0:0}, at: mnt_want_write+0x3b/0x80 fs/namespace.c:508
> >  #1: ffff888073930b00 (&sb->s_type->i_mutex_key#14){+.+.}-{3:3}, at: inode_lock include/linux/fs.h:756 [inline]
> >  #1: ffff888073930b00 (&sb->s_type->i_mutex_key#14){+.+.}-{3:3}, at: do_truncate+0x205/0x300 fs/open.c:63
> >  #2: ffff888073930860 (&ni->ni_lock/4){+.+.}-{3:3}, at: ni_lock fs/ntfs3/ntfs_fs.h:1122 [inline]
> >  #2: ffff888073930860 (&ni->ni_lock/4){+.+.}-{3:3}, at: ntfs_truncate fs/ntfs3/file.c:393 [inline]
> >  #2: ffff888073930860 (&ni->ni_lock/4){+.+.}-{3:3}, at: ntfs3_setattr+0x596/0xca0 fs/ntfs3/file.c:696

So I think that we have:

 - ntfs_truncate() gets the ni_lock (fs/ntfs3/file.c:393)

 - it then - while holding that lock - calls (on line 395):

     truncate_setsize ->
       truncate_pagecache ->
         truncate_inode_pages ->
           truncate_inode_pages_range ->
             folio_lock

but that deadlocks on another process that wants to read that page,
and that needs ni_lock to do so.

So yes, it does look like an ntfs3 deadlock involving ni_lock.

Anyway, the above is just me trying to make sense of the call traces
and trying to cut out all the noise. I might have misread something.

             Linus
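
The reading above can be checked mechanically with a small wait-for
walk. What follows is only a userspace sketch in C, not kernel code and
not anything lockdep does: the struct task model, the helper names
holder_of() and in_deadlock_cycle(), and the lock labels are all made
up for illustration. It also assumes, as the analysis above does, that
the readpage task owns the folio lock while it waits for ni_lock, and
that 5214's ni_lock entry is a pending acquisition rather than a held
lock.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: a task owns some locks and is blocked on at most one. */
enum { MAX_HELD = 4 };

struct task {
    const char *name;            /* label taken from the report */
    const char *held[MAX_HELD];  /* locks owned; unused slots are NULL */
    const char *wants;           /* lock the task is blocked on, or NULL */
};

/* The four roles from the report, with "pending" entries moved to wants. */
static const struct task report_tasks[] = {
    { "5222 truncate",  { "i_mutex", "ni_lock" }, "folio_lock" },
    { "5214 readpage",  { "folio_lock" },         "ni_lock"    },
    { "5221 mmap",      { "mmap_lock" },          "ni_lock"    },
    { "page faulter",   { NULL },                 "mmap_lock"  },
};
static const int report_ntasks = 4;

/* Return the index of the task that owns `lock`, or -1 if nobody does. */
static int holder_of(const struct task *t, int n, const char *lock)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < MAX_HELD && t[i].held[j]; j++)
            if (strcmp(t[i].held[j], lock) == 0)
                return i;
    return -1;
}

/*
 * Follow "blocked on a lock owned by" edges starting at task `start`.
 * Returns 1 if the chain leads back to `start`, i.e. the task is part
 * of a deadlock cycle. Returns 0 if the chain ends, or if the task is
 * merely stuck behind a cycle it is not a member of.
 */
static int in_deadlock_cycle(const struct task *t, int n, int start)
{
    assert(start >= 0 && start < n);

    int cur = start;
    for (int steps = 0; steps < n; steps++) {
        if (!t[cur].wants)
            return 0;                 /* chain ends at a runnable task */
        cur = holder_of(t, n, t[cur].wants);
        if (cur < 0)
            return 0;                 /* wanted lock has no owner here */
        if (cur == start)
            return 1;                 /* came back around: a cycle */
    }
    return 0;                         /* stuck behind someone else's cycle */
}
```

With this table, in_deadlock_cycle() reports the truncate and readpage
tasks (indices 0 and 1) as members of the cycle, while the mmap task
and the page faulter (indices 2 and 3) are only queued up behind it,
which matches the "they aren't the cause" conclusion.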