Date: Wed, 1 May 2024 16:51:15 +1000
From: Dave Chinner
To: Zhang Yi
Cc: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, tytso@mit.edu,
    adilger.kernel@dilger.ca, jack@suse.cz, ritesh.list@gmail.com,
    hch@infradead.org, djwong@kernel.org, willy@infradead.org,
    zokeefe@google.com, yi.zhang@huawei.com, chengzhihao1@huawei.com,
    yukuai3@huawei.com, wangkefeng.wang@huawei.com
Subject: Re: [PATCH v4 02/34] ext4: check the extent status again before
    inserting delalloc block
References: <20240410142948.2817554-1-yi.zhang@huaweicloud.com>
    <20240410142948.2817554-3-yi.zhang@huaweicloud.com>
In-Reply-To: <20240410142948.2817554-3-yi.zhang@huaweicloud.com>

On Wed, Apr 10, 2024 at 10:29:16PM +0800, Zhang Yi wrote:
> From: Zhang Yi
>
> Now we lookup extent status entry without holding the i_data_sem before
> inserting delalloc block, it works fine in buffered write path and
> because it holds i_rwsem and folio lock, and the mmap path holds folio
> lock, so the found extent locklessly couldn't be modified concurrently.
> But it could be raced by fallocate since it allocates blocks without
> holding i_rwsem and folio lock.
>
> ext4_page_mkwrite()             ext4_fallocate()
>  block_page_mkwrite()
>   ext4_da_map_blocks()
>    //find hole in extent status tree
>                                  ext4_alloc_file_blocks()
>                                   ext4_map_blocks()
>                                    //allocate block and unwritten extent
>    ext4_insert_delayed_block()
>     ext4_da_reserve_space()
>      //reserve one more block
>     ext4_es_insert_delayed_block()
>      //drop unwritten extent and add delayed extent by mistake

Shouldn't this be serialised by the file invalidation lock? Hole
punching via fallocate must do this to avoid data use-after-free
bugs w.r.t. racing page faults, and all the other fallocate ops need
to serialise page faults to avoid page cache level data corruption.
Yet here we see a problem resulting from a fallocate operation
racing with a page fault...

Ah, I see that the invalidation lock is only picked up deep inside
ext4_punch_hole(), ext4_collapse_range(), ext4_insert_range() and
ext4_zero_range(). They all do the same flush, lock, and dio wait
preamble, but each does it just a little bit differently. The
allocation path does it just a little bit differently again and does
not take the invalidation lock...

Perhaps the ext4 fallocate code should be factored so that all the
fallocate operations run the same flush, lock and wait code rather
than having 5 slightly different copies of the same code?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
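
To make the factoring suggestion a bit more concrete, here is a rough
userspace model of the idea. It is not ext4 code. Every name in it
(model_inode, falloc_prepare(), falloc_finish(), model_fallocate()) is
hypothetical, the locks are plain flags rather than real rwsems, and the
step order (flush, take i_rwsem, wait for DIO, take the invalidation
lock) is an assumption drawn from the preamble described above. It only
shows the shape: one shared preamble and one shared teardown, with every
operation, including plain allocation, going through both.

```c
/* Userspace model only: none of these helpers exist in ext4 under these names. */
#include <assert.h>
#include <stdbool.h>

enum falloc_op {
	OP_ALLOCATE,
	OP_PUNCH_HOLE,
	OP_COLLAPSE_RANGE,
	OP_INSERT_RANGE,
	OP_ZERO_RANGE,
	OP_COUNT
};

struct model_inode {
	bool rwsem_held;      /* stands in for i_rwsem */
	bool invalidate_held; /* stands in for the invalidation lock */
	int flushes;          /* dirty range writeback passes */
	int dio_waits;        /* waits for in-flight direct I/O */
	int ops_done;         /* operations run under full exclusion */
};

/* The single shared preamble: flush, lock, wait for DIO, block page faults. */
static void falloc_prepare(struct model_inode *inode)
{
	inode->flushes++;              /* flush dirty data in the range */
	assert(!inode->rwsem_held);
	inode->rwsem_held = true;      /* exclude other writers */
	inode->dio_waits++;            /* drain direct I/O */
	assert(!inode->invalidate_held);
	inode->invalidate_held = true; /* exclude page faults */
}

/* The single shared teardown, releasing in reverse order. */
static void falloc_finish(struct model_inode *inode)
{
	inode->invalidate_held = false;
	inode->rwsem_held = false;
}

/* Per-operation work; it can rely on page faults being excluded. */
static void falloc_do_op(struct model_inode *inode, enum falloc_op op)
{
	(void)op; /* the real per-op extent manipulation would go here */
	assert(inode->rwsem_held && inode->invalidate_held);
	inode->ops_done++;
}

/* Every operation, allocation included, runs the same preamble. */
int model_fallocate(struct model_inode *inode, enum falloc_op op)
{
	if ((unsigned)op >= (unsigned)OP_COUNT)
		return -1;

	falloc_prepare(inode);
	falloc_do_op(inode, op);
	falloc_finish(inode);
	return 0;
}
```

In this shape the allocation path cannot skip the invalidation lock,
because there is only one place where the locks are taken.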