From: Mathieu Desnoyers
Date: Fri, 2 Feb 2024 15:02:50 -0500
Subject: Re: [RFC PATCH v3 2/4] dax: Check for data cache aliasing at runtime
To: Dan Williams, Arnd Bergmann, Dave Chinner
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Linus Torvalds,
 linux-mm@kvack.org, linux-arch@vger.kernel.org, Vishal Verma,
 Dave Jiang, Matthew Wilcox, Russell King, nvdimm@lists.linux.dev,
 linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 dm-devel@lists.linux.dev
Message-ID: <5e838147-524c-40e5-b106-e388bf4e549b@efficios.com>
In-Reply-To: <65bd457460fb1_719322942@dwillia2-mobl3.amr.corp.intel.com.notmuch>
References: <20240131162533.247710-1-mathieu.desnoyers@efficios.com>
 <20240131162533.247710-3-mathieu.desnoyers@efficios.com>
 <65bab567665f3_37ad2943c@dwillia2-xfh.jf.intel.com.notmuch>
 <0a38176b-c453-4be0-be83-f3e1bb897973@efficios.com>
 <65bac71a9659b_37ad29428@dwillia2-xfh.jf.intel.com.notmuch>
 <65bd284165177_2d43c29443@dwillia2-mobl3.amr.corp.intel.com.notmuch>
 <6bdf6085-101d-47ef-86f4-87936622345a@efficios.com>
 <65bd457460fb1_719322942@dwillia2-mobl3.amr.corp.intel.com.notmuch>

On 2024-02-02 14:41, Dan Williams wrote:
> Mathieu Desnoyers wrote:
>> On 2024-02-02 12:37, Dan Williams wrote:
>>> Mathieu Desnoyers wrote:
>> [...]
>>>>
>>>
>>>> The alternative route I intend to take is to audit all callers
>>>> of alloc_dax() and make sure they all save the alloc_dax() return
>>>> value in a struct dax_device * local variable first for the sake
>>>> of checking for IS_ERR(). This will leave the xyz->dax_dev pointer
>>>> initialized to NULL in the error case and simplify the rest of
>>>> error checking.
>>>
>>> I could maybe get on board with that, but it needs a comment somewhere
>>> about the asymmetric subtlety.
>>
>> Is this "somewhere" at every alloc_dax() call site, or do you have
>> something else in mind ?
>
> At least kill_dax() should mention the asymmetry I think.

Here is what I intend to add:

  * Note, because alloc_dax() returns an ERR_PTR() on error, callers
  * typically store its result into a local variable in order to check
  * the result. Therefore, care must be taken to populate the struct
  * device dax_dev field to make sure the dax_dev is not leaked.

>
>>> The real question is what to do about device-dax. I *think* it is not
>>> affected by cpu_dcache aliasing because it never accesses user mappings
>>> through a kernel alias. I doubt device-dax is in use on these platforms,
>>> but we might need another fixup for that if someone screams about the
>>> alloc_dax() behavior change making them lose device-dax access.
>>
>> By "device-dax", I understand you mean drivers/dax/Kconfig:DEV_DAX.
>>
>> Based on your analysis, is alloc_dax() still the right spot where
>> to place this runtime check ? Which call sites are responsible
>> for invoking alloc_dax() for device-dax ?
>
> That is in devm_create_dev_dax().
>
>> If we know which call sites do not intend to use the kernel linear
>> mapping, we could introduce a flag (or a new variant of the alloc_dax()
>> API) that would either enforce or skip the check.
>
> Hmmm, it looks like there is already a natural flag for that. If
> alloc_dax() is passed a NULL operations pointer it means there are no
> kernel usages of the aliased mapping. That actually fits rather nicely.

Good, I was reaching the same conclusion when I received your reply.
I'll do that. It ends up being:

         /*
          * Unavailable on architectures with virtually aliased data caches,
          * except for device-dax (NULL operations pointer), which does
          * not use aliased mappings from the kernel.
          */
         if (ops && cpu_dcache_is_aliasing())
                 return ERR_PTR(-EOPNOTSUPP);

>
> [..]
>>>>> @@ -804,6 +808,15 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs)
>>>>>  	if (!IS_ENABLED(CONFIG_FUSE_DAX))
>>>>>  		return 0;
>>>>>
>>>>> +	dax_dev = alloc_dax(fs, &virtio_fs_dax_ops);
>>>>> +	if (IS_ERR(dax_dev)) {
>>>>> +		int rc = PTR_ERR(dax_dev);
>>>>> +
>>>>> +		if (rc == -EOPNOTSUPP)
>>>>> +			return 0;
>>>>> +		return rc;
>>>>> +	}
>>>>
>>>> What is gained by moving this allocation here ?
>>>
>>> The gain is to fail early in virtio_fs_setup_dax() since the fundamental
>>> dependency of alloc_dax() success is not met. For example why let the
>>> setup progress to devm_memremap_pages() when alloc_dax() is going to
>>> return ERR_PTR(-EOPNOTSUPP).
>>
>> What I don't know is whether there is a dependency requiring to do
>> devm_request_mem_region(), devm_kzalloc(), devm_memremap_pages()
>> before calling alloc_dax() ?
>>
>> Those 3 calls are used to populate:
>>
>>           fs->window_phys_addr = (phys_addr_t) cache_reg.addr;
>>           fs->window_len = (phys_addr_t) cache_reg.len;
>>
>> and then alloc_dax() takes "fs" as private data parameter. So it's
>> unclear to me whether we can swap the invocation order. I suspect
>> that it is not an issue because it is only used to populate
>> dax_dev->private, but I prefer to confirm this with you just to be
>> on the safe side.
>
> Thanks for that. All of those need to be done before the fs goes live
> later in virtio_device_ready(), but before that point nothing should be
> calling into virtio_fs_dax_ops, so as far as I can see it is safe to
> change the order.

Sounds good, I'll do that.

I will soon be ready to send out an RFC v4, which is still only
compile-tested.

Do you happen to have some kind of test suite you can use to
automate some of the runtime testing?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com