From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 1 Jan 2024 14:11:02 +0000
From: Matthew Wilcox <willy@infradead.org>
To: Hillf Danton
Cc: Genes Lists, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: 6.6.8 stable: crash in folio_mark_dirty
References: <8bb29431064fc1f70a42edef75a8788dd4a0eecc.camel@sapience.com> <20231231012846.2355-1-hdanton@sina.com> <20240101015504.2446-1-hdanton@sina.com> <20240101113316.2595-1-hdanton@sina.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20240101113316.2595-1-hdanton@sina.com>

On Mon, Jan 01, 2024 at 07:33:16PM +0800, Hillf Danton wrote:
> On Mon, 1 Jan 2024 09:07:52 +0000 Matthew Wilcox
> > On Mon, Jan 01, 2024 at 09:55:04AM +0800, Hillf Danton wrote:
> > > On Sun, 31 Dec 2023 13:07:03 +0000 Matthew Wilcox
> > > > I don't think this can happen.  Look at the call trace;
> > > > block_dirty_folio() is called from unmap_page_range().  That means
> > > > the page is in the page tables.  We unmap the pages in a folio from
> > > > the page tables before we set folio->mapping to NULL.  Look at
> > > > invalidate_inode_pages2_range() for example:
> > > >
> > > >                 unmap_mapping_pages(mapping, indices[i],
> > > >                                 (1 + end - indices[i]), false);
> > > >                 folio_lock(folio);
> > > >                 folio_wait_writeback(folio);
> > > >                 if (folio_mapped(folio))
> > > >                         unmap_mapping_folio(folio);
> > > >                 BUG_ON(folio_mapped(folio));
> > > >                 if (!invalidate_complete_folio2(mapping, folio))
> > >
> > > What is missed here is the same check [1] in
> > > invalidate_inode_pages2_range(), so I built no wheel.
> > >
> > >                 folio_lock(folio);
> > >                 if (unlikely(folio->mapping != mapping)) {
> > >                         folio_unlock(folio);
> > >                         continue;
> > >                 }
> > >
> > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/truncate.c#n658
> >
> > That's entirely different.  That's checking in the truncate path whether
> > somebody else already truncated this page.  What I was showing was why
> > a page found through a page table walk cannot have been truncated (which
> > is actually quite interesting, because it's the page table lock that
> > prevents the race).
>
> Feel free to shed light on how ptl protects folio->mapping.

The documentation for __folio_mark_dirty() hints at it:

 * The caller must hold folio_memcg_lock().  Most callers have the folio
 * locked.  A few have the folio blocked from truncation through other
 * means (eg zap_vma_pages() has it mapped and is holding the page table
 * lock).  This can also be called from mark_buffer_dirty(), which I
 * cannot prove is always protected against truncate.
Re-reading that now, I _think_ mark_buffer_dirty() always has to be
called with a reference on the bufferhead, which means that a racing
truncate will fail due to:

	invalidate_inode_pages2_range
	  -> invalidate_complete_folio2
	    -> filemap_release_folio
	      -> try_to_free_buffers
	        -> drop_buffers
	          -> buffer_busy

From an mm point of view, what is implicit is that truncate calls:

	unmap_mapping_folio
	  -> unmap_mapping_range_tree
	    -> unmap_mapping_range_vma
	      -> zap_page_range_single
	        -> unmap_single_vma
	          -> unmap_page_range
	            -> zap_p4d_range
	              -> zap_pud_range
	                -> zap_pmd_range
	                  -> zap_pte_range
	                    -> pte_offset_map_lock()

So a truncate will take the page lock, then spin on the pte lock until
the racing munmap() has finished (OK, this was an exit(), not a munmap(),
but exit() does an implicit munmap()).