From: Chandan Babu R
To: Shiyang Ruan
Cc: akpm@linux-foundation.org, "Darrick J. Wong", linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev, linux-xfs@vger.kernel.org, linux-mm@kvack.org, dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz, mcgrof@kernel.org
Subject: Re: [PATCH v15] mm, pmem, xfs: Introduce MF_MEM_PRE_REMOVE for unbind
Date: Mon, 23 Oct 2023 17:51:49 +0530
In-reply-to: <834497bc-0876-43bb-bd67-154ad7f26af3@fujitsu.com>
Message-ID: <87edhlzfyi.fsf@debian-BULLSEYE-live-builder-AMD64>
References: <20230828065744.1446462-1-ruansy.fnst@fujitsu.com>
 <20230928103227.250550-1-ruansy.fnst@fujitsu.com>
 <875y31wr2d.fsf@debian-BULLSEYE-live-builder-AMD64>
 <20231020154009.GS3195650@frogsfrogsfrogs>
 <87msw9zvpk.fsf@debian-BULLSEYE-live-builder-AMD64>
 <834497bc-0876-43bb-bd67-154ad7f26af3@fujitsu.com>

On Mon, Oct 23, 2023 at 03:26:52 PM +0800, Shiyang Ruan wrote:
> On 2023/10/23 14:40, Chandan Babu R wrote:
>> On Fri, Oct 20, 2023 at 08:40:09 AM -0700, Darrick J. Wong wrote:
>>> On Fri, Oct 20, 2023 at 03:26:32PM +0530, Chandan Babu R wrote:
>>>> On Thu, Sep 28, 2023 at 06:32:27 PM +0800, Shiyang Ruan wrote:
>>>>> ====
>>>>> Changes since v14:
>>>>>  1. added/fixed code comments per Dan's comments
>>>>> ====
>>>>>
>>>>> Now, if we suddenly remove a PMEM device (by calling unbind) which
>>>>> contains FSDAX while programs are still accessing data in this device,
>>>>> e.g.:
>>>>> ```
>>>>>  $FSSTRESS_PROG -d $SCRATCH_MNT -n 99999 -p 4 &
>>>>>  # $FSX_PROG -N 1000000 -o 8192 -l 500000 $SCRATCH_MNT/t001 &
>>>>>  echo "pfn1.1" > /sys/bus/nd/drivers/nd_pmem/unbind
>>>>> ```
>>>>> it could come into an unacceptable state:
>>>>>   1. device has gone but mount point still exists, and umount will fail
>>>>>      with "target is busy"
>>>>>   2. programs will hang and cannot be killed
>>>>>   3. may crash with NULL pointer dereference
>>>>>
>>>>> To fix this, we introduce a MF_MEM_PRE_REMOVE flag to let it know that we
>>>>> are going to remove the whole device, and make sure all related processes
>>>>> could be notified so that they could end up gracefully.
>>>>>
>>>>> This patch is inspired by Dan's "mm, dax, pmem: Introduce
>>>>> dev_pagemap_failure()"[1].
>>>>> With the help of dax_holder and
>>>>> ->notify_failure() mechanism, the pmem driver is able to ask filesystem
>>>>> on it to unmap all files in use, and notify processes who are using
>>>>> those files.
>>>>>
>>>>> Call trace:
>>>>> trigger unbind
>>>>>  -> unbind_store()
>>>>>   -> ... (skip)
>>>>>    -> devres_release_all()
>>>>>     -> kill_dax()
>>>>>      -> dax_holder_notify_failure(dax_dev, 0, U64_MAX, MF_MEM_PRE_REMOVE)
>>>>>       -> xfs_dax_notify_failure()
>>>>>       `-> freeze_super()             // freeze (kernel call)
>>>>>       `-> do xfs rmap
>>>>>       ` -> mf_dax_kill_procs()
>>>>>       `  -> collect_procs_fsdax()    // all associated processes
>>>>>       `  -> unmap_and_kill()
>>>>>       ` -> invalidate_inode_pages2_range() // drop file's cache
>>>>>       `-> thaw_super()               // thaw (both kernel & user call)
>>>>>
>>>>> Introduce MF_MEM_PRE_REMOVE to let filesystem know this is a remove
>>>>> event.  Use the exclusive freeze/thaw[2] to lock the filesystem to prevent
>>>>> new dax mapping from being created.  Do not shutdown filesystem directly
>>>>> if configuration is not supported, or if failure range includes metadata
>>>>> area.  Make sure all files and processes (not only the current process)
>>>>> are handled correctly.  Also drop the cache of associated files before
>>>>> pmem is removed.
>>>>>
>>>>> [1]: https://lore.kernel.org/linux-mm/161604050314.1463742.14151665140035795571.stgit@dwillia2-desk3.amr.corp.intel.com/
>>>>> [2]: https://lore.kernel.org/linux-xfs/169116275623.3187159.16862410128731457358.stg-ugh@frogsfrogsfrogs/
>>>>>
>>>>> Signed-off-by: Shiyang Ruan
>>>>> Reviewed-by: Darrick J. Wong
>>>>> Acked-by: Dan Williams
>>>>
>>>> Hi Andrew,
>>>>
>>>> Shiyang had indicated that this patch has been added to
>>>> akpm/mm-hotfixes-unstable branch. However, I don't see the patch listed in
>>>> that branch.
>>>>
>>>> I am about to start collecting XFS patches for v6.7 cycle. Please let me know
>>>> if you have any objections with me taking this patch via the XFS tree.
>>>
>>> V15 was dropped from his tree on 28 Sept., you might as well pull it
>>> into your own tree for 6.7.  It's been testing fine on my trees for the
>>> past 3 weeks.
>>>
>>> https://lore.kernel.org/mm-commits/20230928172815.EE6AFC433C8@smtp.kernel.org/
>> Shiyang, this patch does not apply cleanly on v6.6-rc7. Can you please rebase
>> the patch on v6.6-rc7 and send it to the mailing list?
>
> Sure.  I have rebased it and sent a v15.1.  Please check it:
>
> https://lore.kernel.org/linux-xfs/20231023072046.1626474-1-ruansy.fnst@fujitsu.com/

Thank you. I have applied the patch to my local Git tree.

-- 
Chandan