From: Baokun Li
To: Ming Lei, Matthew Wilcox
CC: Theodore Ts'o, Andreas Dilger, Andrew Morton, Dave Chinner, Eric Sandeen, Christoph Hellwig, Zhang Yi, yangerkun, Baokun Li
Subject: Re: [ext4 io hang] buffered write io hang in balance_dirty_pages
Date: Thu, 27 Apr 2023 15:33:41 +0800
Message-ID: <321a58db-da64-fea4-64b2-1dd6ae5e4976@huawei.com>
Sender: owner-linux-mm@kvack.org

On 2023/4/27 14:36, Baokun Li wrote:
> On 2023/4/27 12:50, Ming Lei wrote:
>> Hello Matthew,
>>
>> On Thu, Apr 27, 2023 at 04:58:36AM +0100, Matthew Wilcox wrote:
>>> On Thu, Apr 27, 2023 at 10:20:28AM +0800, Ming Lei wrote:
>>>> Hello Guys,
>>>>
>>>> I got one report in which buffered write IO hangs in balance_dirty_pages,
>>>> after one nvme block device is unplugged physically, then umount can't
>>>> succeed.
>>> That's a feature, not a bug ... the dd should continue indefinitely?
>> Can you explain what the feature is?
>> And not see such 'issue' or 'feature' on xfs.
>>
>> The device has been gone, so IMO it is reasonable to see FS buffered
>> write IO failed. Actually dmesg has shown that 'EXT4-fs (nvme0n1):
>> Remounting filesystem read-only'. Seems these things may confuse user.
>
> The reason for this difference is that ext4 and xfs handle errors
> differently.
>
> ext4 remounts the filesystem as read-only or even just continues,
> vfs_write does not check for these.
>
> xfs shuts down the filesystem, so it returns a failure at
> xfs_file_write_iter when it finds an error.
>
> ``` ext4
> ksys_write
>  vfs_write
>   ext4_file_write_iter
>    ext4_buffered_write_iter
>     ext4_write_checks
>      file_modified
>       file_modified_flags
>        __file_update_time
>         inode_update_time
>          generic_update_time
>           __mark_inode_dirty
>            ext4_dirty_inode ---> 2. void func, No propagating errors out
>             __ext4_journal_start_sb
>              ext4_journal_check_start ---> 1. Error found, remount-ro
>     generic_perform_write ---> 3. No error sensed, continue
>      balance_dirty_pages_ratelimited
>       balance_dirty_pages_ratelimited_flags
>        balance_dirty_pages
>         // 4. Sleeping waiting for dirty pages to be freed
>         __set_current_state(TASK_KILLABLE)
>         io_schedule_timeout(pause);
> ```
>
> ``` xfs
> ksys_write
>  vfs_write
>   xfs_file_write_iter
>    if (xfs_is_shutdown(ip->i_mount))
>      return -EIO;    ---> dd fail
> ```
>>> balance_dirty_pages() is sleeping in KILLABLE state, so kill -9 of
>>> the dd process should succeed.
>> Yeah, dd can be killed, however it may be any application(s), :-)
>>
>> Fortunately it won't cause trouble during reboot/power off, given
>> userspace will be killed at that time.
>>
>> Thanks,
>> Ming
>>
> Don't worry about that, we always set the current thread to TASK_KILLABLE
> while waiting in balance_dirty_pages().

On second thought, we can check whether the file system has become
read-only each time ext4_file_write_iter() is called, even if the fs was
not read-only when the file was opened. This would end the write early
and free up resources, as xfs does.

The patch is below. Does anyone have any other thoughts?

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index d101b3b0c7da..d2966268ee41 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -699,6 +699,8 @@ ext4_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
         if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb))))
                 return -EIO;
+       if (unlikely(sb_rdonly(inode->i_sb)))
+               return -EROFS;
 #ifdef CONFIG_FS_DAX
         if (IS_DAX(inode))

-- 
With Best Regards,
Baokun Li
.
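
For anyone who wants to reason about the control flow outside the kernel, here is a rough user-space sketch of the behaviour described in this thread. It is only a model under stated assumptions: `struct model_sb`, `model_dirty_inode()` and `model_write_iter()` are hypothetical names made up for illustration, not kernel APIs, and the flags merely stand in for ext4_forced_shutdown() and sb_rdonly(). It shows how a void timestamp-update callback swallows the remount-ro condition, and how an up-front read-only check would let the next write fail with -EROFS.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for a superblock; not a kernel type. */
struct model_sb {
    bool device_gone;     /* backing device was unplugged */
    bool forced_shutdown; /* stands in for ext4_forced_shutdown() */
    bool rdonly;          /* stands in for sb_rdonly() */
};

/*
 * Models the void ext4_dirty_inode() step from the trace: the journal
 * start notices the error and the fs goes read-only, but the void
 * return type gives the caller no way to see that.
 */
void model_dirty_inode(struct model_sb *sb)
{
    if (sb->device_gone)
        sb->rdonly = true;
}

/*
 * Models the write entry point. check_rdonly == false mimics the
 * current behaviour; check_rdonly == true mimics the proposed check.
 * Returns the number of bytes "written" or a negative errno.
 */
ssize_t model_write_iter(struct model_sb *sb, size_t len, bool check_rdonly)
{
    assert(sb != NULL);

    if (sb->forced_shutdown)
        return -EIO;
    if (check_rdonly && sb->rdonly)
        return -EROFS;

    model_dirty_inode(sb); /* error is swallowed here */
    return (ssize_t)len;   /* pages are dirtied; throttling would follow */
}
```

In this model the write that first trips the error still succeeds even with the check enabled, because the read-only flag is only set during that call; it is the following write that returns -EROFS instead of dirtying more pages.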