Date: Thu, 4 May 2023 17:21:22 +0100
From: Matthew Wilcox
To: Keith Busch
Cc: Ming Lei, Theodore Ts'o, linux-ext4@vger.kernel.org, Andreas Dilger, linux-block@vger.kernel.org, Andrew Morton, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Dave Chinner, Eric Sandeen, Christoph Hellwig, Zhang Yi
Subject: Re: [ext4 io hang] buffered write io hang in balance_dirty_pages

On Thu, May 04, 2023 at 09:59:52AM -0600, Keith Busch wrote:
> On Thu, Apr 27, 2023 at 10:20:28AM +0800, Ming Lei wrote:
> > Hello Guys,
> >
> > I got one report in which buffered write IO hangs in balance_dirty_pages,
> > after one nvme block device is unplugged physically, then umount can't
> > succeed.
> >
> > Turns out it is one long-term issue, and it can be triggered at least
> > since v5.14 until the latest v6.3.
> >
> > And the issue can be reproduced reliably in KVM guest:
> >
> > 1) run the following script inside guest:
> >
> > mkfs.ext4 -F /dev/nvme0n1
> > mount /dev/nvme0n1 /mnt
> > dd if=/dev/zero of=/mnt/z.img&
> > sleep 10
> > echo 1 > /sys/block/nvme0n1/device/device/remove
> >
> > 2) dd hang is observed and /dev/nvme0n1 is gone actually
>
> Sorry to jump in so late.
>
> For an ungraceful nvme removal, like a surprise hot unplug, the driver
> sets the capacity to 0 and that effectively ends all dirty page writers
> that could stall forward progress on the removal. And that 0 capacity
> should also cause 'dd' to exit.
>
> But this is not an ungraceful removal, so we're not getting that forced
> behavior. Could we use the same capacity trick here after flushing any
> outstanding dirty pages?

There's a filesystem mounted on that block device, though. I don't
think the filesystem is going to notice the underlying block device
capacity change and break out of any of these functions.
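
To see whether the writer is still stuck after the removal step in the
reproducer, it may help to sample the dirty page counters and the state
of the dd process. The following is only a sketch and is not taken from
the report: dirty_kb and task_state are hypothetical helper names, and
it assumes the usual Linux /proc/meminfo and /proc/<pid>/status layout.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the Dirty and
# Writeback counters, in kB, from a meminfo-format file.
# Defaults to /proc/meminfo when no file is given.
dirty_kb() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" { print $1, $2 }' \
        "${1:-/proc/meminfo}"
}

# Hypothetical helper: print the one-letter scheduler state of a PID
# as reported in /proc/<pid>/status ("D" is uninterruptible sleep).
task_state() {
    awk '$1 == "State:" { print $2 }' "/proc/$1/status"
}
```

Assuming the dd from the script was started in the same shell, running
"dirty_kb; task_state $!" a few times after the remove should show
whether Dirty keeps shrinking and whether dd sits in D state. A flat
Dirty count with dd in D would be consistent with the hang described
above, but it does not by itself show where the writer is blocked.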