Date: Thu, 4 May 2023 09:59:52 -0600
From: Keith Busch
To: Ming Lei
Cc: Theodore Ts'o, linux-ext4@vger.kernel.org, Andreas Dilger,
 linux-block@vger.kernel.org, Andrew Morton,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Dave Chinner,
 Eric Sandeen, Christoph Hellwig, Zhang Yi
Subject: Re: [ext4 io hang] buffered write io hang in balance_dirty_pages

On Thu, Apr 27, 2023 at 10:20:28AM +0800, Ming Lei wrote:
> Hello Guys,
>
> I got one report in which buffered write IO hangs in balance_dirty_pages,
> after one nvme block device is unplugged physically, then umount can't
> succeed.
>
> Turns out it is one long-term issue, and it can be triggered at least
> since v5.14 until the latest v6.3.
>
> And the issue can be reproduced reliably in KVM guest:
>
> 1) run the following script inside guest:
>
> mkfs.ext4 -F /dev/nvme0n1
> mount /dev/nvme0n1 /mnt
> dd if=/dev/zero of=/mnt/z.img&
> sleep 10
> echo 1 > /sys/block/nvme0n1/device/device/remove
>
> 2) dd hang is observed and /dev/nvme0n1 is gone actually

Sorry to jump in so late.

For an ungraceful nvme removal, like a surprise hot unplug, the driver
sets the capacity to 0, which effectively ends all dirty page writers
that could stall forward progress on the removal. That 0 capacity
should also cause 'dd' to exit.

But this is not an ungraceful removal, so we're not getting that forced
behavior. Could we use the same capacity trick here after flushing any
outstanding dirty pages?
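
A minimal sketch for anyone reproducing this: one way to see from
userspace whether the capacity was zeroed is to read the disk's sysfs
size file, which reports the size in 512-byte sectors. The helper name
capacity_state and the SYSFS_ROOT override are hypothetical, introduced
here only for illustration and testing. The sketch assumes the size
file is still readable at the moment of the check; once the removal
finishes the sysfs entry disappears, which the helper reports as "gone".

```shell
#!/bin/sh
# Hypothetical helper, not part of the original report.
# Prints "zero", "nonzero", or "gone" for the named block device.
# SYSFS_ROOT is an assumed override so the function can be pointed at a
# test directory; it defaults to /sys/block.
capacity_state() {
    dev="$1"
    root="${SYSFS_ROOT:-/sys/block}"
    size_file="$root/$dev/size"

    # The sysfs entry may already be gone once the removal completes.
    if [ ! -r "$size_file" ]; then
        echo "gone"
        return 0
    fi

    # sysfs reports the disk size in 512-byte sectors.
    sectors=$(cat "$size_file")
    if [ "$sectors" -eq 0 ]; then
        echo "zero"
    else
        echo "nonzero"
    fi
}
```

Running `capacity_state nvme0n1` in the guest right after the remove
step should show which of the three states the disk is in while 'dd' is
still stuck.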