Date: Wed, 6 Apr 2022 13:39:00 -0700
From: "Darrick J. Wong"
To: Dan Williams
Cc: Jane Chu, Christoph Hellwig, Shiyang Ruan, Linux Kernel Mailing List, linux-xfs, Linux NVDIMM, Linux MM, linux-fsdevel, david
Subject: Re: [PATCH v11 1/8] dax: Introduce holder for dax_device
Message-ID: <20220406203900.GR27690@magnolia>

On Tue, Apr 05, 2022 at 06:22:48PM -0700, Dan Williams wrote:
> On Tue, Apr 5, 2022 at 5:55 PM Jane Chu wrote:
> >
> > On 3/30/2022 9:18 AM, Darrick J. Wong wrote:
> > > On Wed, Mar 30, 2022 at 08:49:29AM -0700, Christoph Hellwig wrote:
> > >> On Wed, Mar 30, 2022 at 06:58:21PM +0800, Shiyang Ruan wrote:
> > >>> As the code I pasted before, pmem driver will subtract its ->data_offset,
> > >>> which is byte-based. And the filesystem who implements ->notify_failure()
> > >>> will calculate the offset in unit of byte again.
> > >>>
> > >>> So, leave its function signature byte-based, to avoid repeated conversions.
> > >>
> > >> I'm actually fine either way, so I'll wait for Dan to comment.
> > >
> > > FWIW I'd convinced myself that the reason for using byte units is to
> > > make it possible to reduce the pmem failure blast radius to subpage
> > > units... but then I've also been distracted for months. :/
> > >
> >
> > Yes, thanks Darrick! I recall that.
> > Maybe just add a comment about why byte unit is used?
>
> I think we start with page failure notification and then figure out
> how to get finer grained through the dax interface in follow-on
> changes. Otherwise, for finer grained error handling support,
> memory_failure() would also need to be converted to stop upcasting
> cache-line granularity to page granularity failures. The native MCE
> notification communicates a 'struct mce' that can be in terms of
> sub-page bytes, but the memory management implications are all page
> based. I assume the FS implications are all FS-block-size based?

I wouldn't necessarily make that assumption -- for regular files, the
user program is in a better position to figure out how to reset the file
contents.

For fs metadata, it really depends. In principle, if (say) we could get
byte granularity poison info, we could look up the space usage within
the block to decide if the poisoned part was actually free space, in
which case we can correct the problem by (re)zeroing the affected bytes
to clear the poison.

Obviously, if the blast radius hits the internal space info or something
that was storing useful data, then you'd have to rebuild the whole block
(or the whole data structure), but that's not necessarily a given.

--D
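
To make the metadata case above a bit more concrete, here is a rough
userspace sketch of the decision, not kernel code and not anything from
the patch series. Everything in it is hypothetical: struct used_range,
enum poison_action, and handle_poison() are made-up names, and it
assumes the caller has already described every in-use byte range in the
block (including the space info itself) as a used_range. A plain
memset() stands in for whatever the real poison-clearing write path
would be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: one in-use byte range inside a metadata block. */
struct used_range {
	size_t start;	/* first in-use byte offset within the block */
	size_t len;	/* number of in-use bytes */
};

/* Hypothetical outcome of looking at a byte-granular poison report. */
enum poison_action {
	POISON_REZERO,	/* only free space was hit; zeroed in place */
	POISON_REBUILD,	/* live data or space info was hit */
};

/* Do the half-open ranges [a, a + alen) and [b, b + blen) overlap? */
static bool ranges_overlap(size_t a, size_t alen, size_t b, size_t blen)
{
	return a < b + blen && b < a + alen;
}

/*
 * Decide what to do about poisoned bytes [poison_off, poison_off +
 * poison_len) in a block whose in-use ranges are listed in used[].
 * If the poison touches any in-use range, report that the block needs
 * rebuilding and leave it untouched; otherwise zero the poisoned bytes.
 */
static enum poison_action
handle_poison(unsigned char *block, size_t block_size,
	      const struct used_range *used, size_t nr_used,
	      size_t poison_off, size_t poison_len)
{
	size_t i;

	/* The poisoned range must be non-empty and lie inside the block. */
	assert(poison_len > 0);
	assert(poison_off < block_size);
	assert(poison_len <= block_size - poison_off);

	for (i = 0; i < nr_used; i++) {
		if (used[i].len == 0)
			continue;	/* empty ranges cannot be hit */
		if (ranges_overlap(poison_off, poison_len,
				   used[i].start, used[i].len))
			return POISON_REBUILD;
	}

	/* Free space only: stand-in for the real poison-clearing write. */
	memset(block + poison_off, 0, poison_len);
	return POISON_REZERO;
}
```

With page-granular reports the poisoned range would span at least a
whole page, so for typical block sizes the overlap check would almost
always land on POISON_REBUILD; the free-space shortcut only pays off if
the notification carries sub-block byte ranges.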