Date: Fri, 20 Oct 2023 08:40:09 -0700
From: "Darrick J. Wong"
To: Chandan Babu R
Cc: akpm@linux-foundation.org, Shiyang Ruan, linux-fsdevel@vger.kernel.org,
 nvdimm@lists.linux.dev, linux-xfs@vger.kernel.org, linux-mm@kvack.org,
 dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
 mcgrof@kernel.org
Subject: Re: [PATCH v15] mm, pmem, xfs: Introduce MF_MEM_PRE_REMOVE for unbind
Message-ID: <20231020154009.GS3195650@frogsfrogsfrogs>
References: <20230828065744.1446462-1-ruansy.fnst@fujitsu.com>
 <20230928103227.250550-1-ruansy.fnst@fujitsu.com>
 <875y31wr2d.fsf@debian-BULLSEYE-live-builder-AMD64>
In-Reply-To: <875y31wr2d.fsf@debian-BULLSEYE-live-builder-AMD64>

On Fri, Oct 20, 2023 at 03:26:32PM +0530, Chandan Babu R wrote:
> On Thu, Sep 28, 2023 at 06:32:27 PM +0800, Shiyang Ruan wrote:
> > ====
> > Changes since v14:
> >  1. added/fixed code comments per Dan's comments
> > ====
> >
> > Now, if we suddenly remove a PMEM device(by calling unbind) which
> > contains FSDAX while programs are still accessing data in this device,
> > e.g.:
> > ```
> >  $FSSTRESS_PROG -d $SCRATCH_MNT -n 99999 -p 4 &
> >  # $FSX_PROG -N 1000000 -o 8192 -l 500000 $SCRATCH_MNT/t001 &
> >  echo "pfn1.1" > /sys/bus/nd/drivers/nd_pmem/unbind
> > ```
> > it could come into an unacceptable state:
> >   1. device has gone but mount point still exists, and umount will fail
> >      with "target is busy"
> >   2. programs will hang and cannot be killed
> >   3. may crash with NULL pointer dereference
> >
> > To fix this, we introduce a MF_MEM_PRE_REMOVE flag to let it know that we
> > are going to remove the whole device, and make sure all related processes
> > could be notified so that they could end up gracefully.
> >
> > This patch is inspired by Dan's "mm, dax, pmem: Introduce
> > dev_pagemap_failure()"[1].  With the help of dax_holder and
> > ->notify_failure() mechanism, the pmem driver is able to ask filesystem
> > on it to unmap all files in use, and notify processes who are using
> > those files.
> >
> > Call trace:
> > trigger unbind
> >  -> unbind_store()
> >   -> ... (skip)
> >    -> devres_release_all()
> >     -> kill_dax()
> >      -> dax_holder_notify_failure(dax_dev, 0, U64_MAX, MF_MEM_PRE_REMOVE)
> >       -> xfs_dax_notify_failure()
> >       `-> freeze_super()             // freeze (kernel call)
> >       `-> do xfs rmap
> >       ` -> mf_dax_kill_procs()
> >       `  -> collect_procs_fsdax()    // all associated processes
> >       `   -> unmap_and_kill()
> >       ` -> invalidate_inode_pages2_range() // drop file's cache
> >       `-> thaw_super()               // thaw (both kernel & user call)
> >
> > Introduce MF_MEM_PRE_REMOVE to let filesystem know this is a remove
> > event.  Use the exclusive freeze/thaw[2] to lock the filesystem to prevent
> > new dax mapping from being created.  Do not shutdown filesystem directly
> > if configuration is not supported, or if failure range includes metadata
> > area.  Make sure all files and processes(not only the current progress)
> > are handled correctly.  Also drop the cache of associated files before
> > pmem is removed.
> >
> > [1]: https://lore.kernel.org/linux-mm/161604050314.1463742.14151665140035795571.stgit@dwillia2-desk3.amr.corp.intel.com/
> > [2]: https://lore.kernel.org/linux-xfs/169116275623.3187159.16862410128731457358.stg-ugh@frogsfrogsfrogs/
> >
> > Signed-off-by: Shiyang Ruan
> > Reviewed-by: Darrick J. Wong
> > Acked-by: Dan Williams
>
> Hi Andrew,
>
> Shiyang had indicated that this patch has been added to
> akpm/mm-hotfixes-unstable branch. However, I don't see the patch listed in
> that branch.
>
> I am about to start collecting XFS patches for v6.7 cycle. Please let me know
> if you have any objections with me taking this patch via the XFS tree.

V15 was dropped from his tree on 28 Sept., you might as well pull it
into your own tree for 6.7.  It's been testing fine on my trees for the
past 3 weeks.

https://lore.kernel.org/mm-commits/20230928172815.EE6AFC433C8@smtp.kernel.org/

--D

>
> --
> Chandan