Date: Tue, 15 Nov 2022 00:41:42 -0800
From: Christoph Hellwig
To: Dave Chinner
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 3/9] xfs: punching delalloc extents on write failure is racy
References: <20221115013043.360610-1-david@fromorbit.com> <20221115013043.360610-4-david@fromorbit.com>
In-Reply-To: <20221115013043.360610-4-david@fromorbit.com>

On Tue, Nov 15, 2022 at 12:30:37PM +1100, Dave Chinner wrote:
> From: Dave Chinner
>
> xfs_buffered_write_iomap_end() has a comment about the safety of
> punching delalloc extents based on holding the IOLOCK_EXCL. This
> comment is wrong, and punching delalloc extents is not race free.
>
> When we punch out a delalloc extent after a write failure in
> xfs_buffered_write_iomap_end(), we punch out the page cache with
> truncate_pagecache_range() before we punch out the delalloc extents.
> At this point, we only hold the IOLOCK_EXCL, so there is nothing
> stopping mmap() write faults racing with this cleanup operation,
> reinstantiating a folio over the range we are about to punch and
> hence requiring the delalloc extent to be kept.
>
> If this race condition is hit, we can end up with a dirty page in
> the page cache that has no delalloc extent or space reservation
> backing it. This leads to bad things happening at writeback time.
>
> To avoid this race condition, we need the page cache truncation to
> be atomic w.r.t. the extent manipulation. We can do this by holding
> the mapping->invalidate_lock exclusively across this operation -
> this will prevent new pages from being inserted into the page cache
> whilst we are removing the pages and the backing extent and space
> reservation.
>
> Taking the mapping->invalidate_lock exclusively in the buffered
> write IO path is safe - it naturally nests inside the IOLOCK (see
> truncate and fallocate paths). iomap_zero_range() can be called from
> under the mapping->invalidate_lock (from the truncate path via
> either xfs_zero_eof() or xfs_truncate_page()), but iomap_zero_iter()
> will not instantiate new delalloc pages (because it skips holes) and
> hence will not ever need to punch out delalloc extents on failure.
>
> Fix the locking issue, and clean up the code logic a little to avoid
> unnecessary work if we didn't allocate the delalloc extent or wrote
> the entire region we allocated.
>
> Signed-off-by: Dave Chinner

> +	filemap_invalidate_lock(inode->i_mapping);
> +	truncate_pagecache_range(VFS_I(ip), XFS_FSB_TO_B(mp, start_fsb),
> +				 XFS_FSB_TO_B(mp, end_fsb) - 1);

No need to use VFS_I() here; the inode is already passed as a function
argument.

Otherwise looks good:

Reviewed-by: Christoph Hellwig
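
To spell out the VFS_I() remark, here is a minimal userspace sketch, not
kernel code. The structs are simplified stand-ins for the kernel's
struct inode and struct xfs_inode, and punch_target_via_ip(),
punch_target_direct() and check_same_target() are hypothetical helpers
made up for illustration. It only shows that converting the inode
argument to an xfs_inode and back with VFS_I() returns the same pointer,
so passing the argument directly should be equivalent.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins; NOT the kernel definitions. */
struct inode {
	unsigned long i_ino;
};

struct xfs_inode {
	unsigned long i_flags;	/* placeholder so i_vnode is not at offset 0 */
	struct inode i_vnode;	/* VFS inode embedded in the XFS inode */
};

/* XFS inode -> embedded VFS inode. */
static inline struct inode *VFS_I(struct xfs_inode *ip)
{
	return &ip->i_vnode;
}

/* VFS inode -> containing XFS inode (open-coded container_of). */
static inline struct xfs_inode *XFS_I(struct inode *inode)
{
	return (struct xfs_inode *)((char *)inode -
				    offsetof(struct xfs_inode, i_vnode));
}

/* Hypothetical: the inode the quoted hunk passes, via the round trip. */
static inline struct inode *punch_target_via_ip(struct inode *inode)
{
	struct xfs_inode *ip = XFS_I(inode);

	return VFS_I(ip);
}

/* Hypothetical: the suggested form, using the argument as-is. */
static inline struct inode *punch_target_direct(struct inode *inode)
{
	return inode;
}

/* Both spellings must name the same object. */
static inline void check_same_target(struct inode *inode)
{
	assert(punch_target_via_ip(inode) == punch_target_direct(inode));
}
```

Under those assumptions the call in the hunk could take inode as its
first argument, matching the inode->i_mapping use on the line above it.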