From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B9F3C5472C for ; Sun, 25 Aug 2024 18:32:00 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6A97C8D002D; Sun, 25 Aug 2024 14:31:51 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 65B138D0029; Sun, 25 Aug 2024 14:31:51 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 548F18D002D; Sun, 25 Aug 2024 14:31:51 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 373C78D0029 for ; Sun, 25 Aug 2024 14:31:51 -0400 (EDT) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id DAC1180849 for ; Sun, 25 Aug 2024 18:31:50 +0000 (UTC) X-FDA: 82491611580.20.FC8FD0A Received: from nyc.source.kernel.org (nyc.source.kernel.org [147.75.193.91]) by imf06.hostedemail.com (Postfix) with ESMTP id 5298918000E for ; Sun, 25 Aug 2024 18:31:49 +0000 (UTC) Authentication-Results: imf06.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=qRXiXOve; spf=pass (imf06.hostedemail.com: domain of helgaas@kernel.org designates 147.75.193.91 as permitted sender) smtp.mailfrom=helgaas@kernel.org; dmarc=pass (policy=quarantine) header.from=kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1724610624; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:dkim-signature; bh=rlIrbkt1m3VdIuQRwcyTPHPfkqvxLMUzFQ0alyGnQvA=; b=1k1bvRpJauuLP4VERfsli5TSwQ8vXEKofV/e3IfQ0GWHtxUGecdgDkZ60A6d2n1Uu91Q1i 4DYgza+bqYsumpzG6S009OU2qsf3m1YTmAxO8YNxCHo3CJsEtncfo/4cRezLeuePifUZbI /+ML0UJtPmWr2PeYa9IUcu8Cgb43bII= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1724610624; a=rsa-sha256; cv=none; b=xK8XS+DA+CQRX7RLtNI93zh0r4swnRgOlh/gVzQOzGq0/VZ1bljD/7Y8HXuDqEUF7M1X+o hZK5HIY8BnVaK8wuEH1F/mqoATzPWmoSIdyUo8A37QHgo77M7/rxsDwgeIThVqJ+RmzE+U bKUTBO2/xQQ5BuWdFH4ZLGiRKSo4Lfg= ARC-Authentication-Results: i=1; imf06.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=qRXiXOve; spf=pass (imf06.hostedemail.com: domain of helgaas@kernel.org designates 147.75.193.91 as permitted sender) smtp.mailfrom=helgaas@kernel.org; dmarc=pass (policy=quarantine) header.from=kernel.org Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by nyc.source.kernel.org (Postfix) with ESMTP id DCACEA41DF8; Wed, 21 Aug 2024 18:46:10 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 00449C32781; Wed, 21 Aug 2024 18:46:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1724265977; bh=cqQ+efIJ7BbTVi8BZHPCeDnba7cGvozIJaWcOMkL4Bs=; h=Date:From:To:Cc:Subject:In-Reply-To:From; b=qRXiXOveQ14RLiahwyTqoh0cx2pVLkxXvMpA09lX0Dh3EUM8hpIfFmTIM6onqfJY5 rOHFMpWtoIgdPtwWB1lqI/xEnxaZ/3DlRaM9c1PblTV/8x+vwUUpo/z740OEXFUXz2 W5k1NgvBpMa2fD7jvpL2dqLvPFBu016cf+exjZ+kjEcnLER4o0j8k3LQg7AEShtY5/ wonnPLL6cmGJ5a3akVX9qBf0zuvbgCyalBw6Ms3d9rnVrFb1ZMGCgXx2PUusdF48Xa JQXEL2JhAtKkpmJ3QFS0DmNEjYa1Jru6U3C4fHKPJr9SoaqY5Qf5A8BLx63alXCGnc U1Vzcft7T6ePw== Date: Wed, 21 Aug 2024 13:46:15 -0500 From: Bjorn Helgaas To: Huang Ying Cc: Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-cxl@vger.kernel.org, Dan Williams , David Hildenbrand , Davidlohr Bueso , Jonathan Cameron , Dave Jiang , Alison Schofield , Vishal Verma , Ira Weiny , Alistair Popple , Andy Shevchenko , Bjorn Helgaas , Baoquan He Subject: Re: [PATCH -v2] Resource: fix region_intersects() for CXL memory Message-ID: <20240821184615.GA262749@bhelgaas> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20240819023413.1109779-1-ying.huang@intel.com> X-Stat-Signature: fb93ib3k74dtmxbudo5bzc8iymwzsqi8 X-Rspamd-Queue-Id: 5298918000E X-Rspam-User: X-Rspamd-Server: rspam08 X-HE-Tag: 1724610709-610547 X-HE-Meta: U2FsdGVkX1+YTefHk1okUsHmSUkcgkG5NUPoiuaxoU11bqRpqmmFvheowz+HsTRIsyCcHke8I5qo5qs4hyzbzxFk3vA+XfTee8pch6VqZy/ZHHi7AeRb61hlPGXChlPOA96h2iPDFxYkNM/7cg9iHTskqHDpD9HE9EAJDsu7+4qvVzrvV/xK2c+btZ6Y/SEQ9qtIRPPf4oIJiLzLP5kNTqN4ojlj3oolVpVYzhmvueI01EkAsyaRVuTA7K3eumG9oKckbom3cvuQwhIxfvJ14hiHEXNuw7+GKMnDVsIq8ZFeN5Ru87OVcmE2F2f4IGxHggFgbnY6ycmLPAJ7MI68wfXnZd6hVZRXv5jra9Dc8Dj6zLl5c36GP2CsQhBY7ypuLCsSWfTQ0igaDsI48vSriSQ7DUMbt20/iM3vSihZGKBDNyzLoE+BoW/11QpKkC15ATPG3Cc3K3eOtzYxODLFE49hliHKirYPe85TZwplc65A8QU/EzXNuu3f7FckyYA/2Xckab5OilciT1l5BCf65/I4FAKllkd9Lk2e3EoAoj1D1KJK7wFrhxaiP7MNw+h4s5mLV3qIY7ujlqNoLMnvj1de6ShZhNpzdy9+Q+70kbG+/N0QVumizO/KLjnvZRbOj5qD79Q65u6l8zZHzFGkNYr/XdN9lGjE2WF6MwgTUI6v1J1+kOxHpgnvr20Uzu+9FMDsUOhv9aOnBQdsMN+I8F2tOBUTWNzlA9Sb4xlT4YDNIBZddBrYBWb23rfcAJYoaq1ENzQp14tklb7sfoUefBOAY6vE4d3g3Mvu5+uCoiU69Zt/+Lyk9JOSLJN9ZD3uxzlEaT7POwJGzJwvCbqX8lMCggRacJ/uHelf965xIWdJVS/FG63qlCk1gTC6CAjseVzbU1K3meWZTXEyPSEdSpyWrVfniqMWsoZ+qNT/aeibhMk8OASMJgMUIGsBApOC/en7E+HvcdVKu2+vKM2 MU2wGe5K hIpdMsAWxUuIwHfozdYZSca4uU+XPGVK0ZlPeAWgTvbUU5BN457KYLFzshu2BdSF37Mke7BvrmSZwWjxppQ4z3ffH+B2sRwAtrrdwJR5j/tO5o6YHyGJCBKfoqseuTgYJUn266tzdAIGAMmZtsFBPiGocF+Beg7wjR80EVhqSw+IF1dUFXSYtoaN64sdLQ2NOkgPwzpPQBkOUJEeYOUO1NZDYZP385m1V3RjHsKPf91jegLyA+Id09rZpoH3Y51WVv4SdmTFwYNx4/JX9uCswa1K/LNd1i426kUyMJRhzeM1dCTwg564V/VUMT/W1FxnSlcUp7J4hZG+eLwZ8h+U5Bdd8J9dn1RkN9EwFTlxuJouQJlfLlHLJWjdm+g== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Aug 19, 2024 at 10:34:13AM +0800, Huang Ying wrote: > On a system with CXL memory installed, the resource tree (/proc/iomem) > related to CXL memory looks like something as follows. > > 490000000-50fffffff : CXL Window 0 > 490000000-50fffffff : region0 > 490000000-50fffffff : dax0.0 > 490000000-50fffffff : System RAM (kmem) I think the subject is too specific (the problem is something to do with the tree topology, not the fact that it's "CXL memory") and at the same time not specific enough ("fix" doesn't say anything about what was wrong or how it is fixed). IMO it could be improved by saying something about what is different about CXL, e.g., maybe it could mention checking children in addition to top-level resources. > When the following command line is run to try writing some memory in > CXL memory range, > > $ dd if=data of=/dev/mem bs=1k seek=19136512 count=1 > dd: error writing '/dev/mem': Bad address > 1+0 records in > 0+0 records out > 0 bytes copied, 0.0283507 s, 0.0 kB/s Took me a minute, but I guess the connection is that 19136512 * 1k = 0x490000000, which is the beginning of the CXL Window. > the command fails as expected. However, the error code is wrong. It > should be "Operation not permitted" instead of "Bad address". And, > the following warning is reported in kernel log. This intro makes it sound like the problem being solved is the error code being wrong. But it seems like a more serious problem than that. > ioremap on RAM at 0x0000000490000000 - 0x0000000490000fff Incidental: it seems a little weird that this warning only exists on x86 and mips (and powerpc32 has a similar warning with different wording), but I assume we don't want to ioremap RAM on *any* architecture? > WARNING: CPU: 2 PID: 416 at arch/x86/mm/ioremap.c:216 __ioremap_caller.constprop.0+0x131/0x35d > Modules linked in: cxl_pmem libnvdimm cbc encrypted_keys cxl_pmu > CPU: 2 UID: 0 PID: 416 Comm: dd Not tainted 6.11.0-rc3-kvm #40 > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 > RIP: 0010:__ioremap_caller.constprop.0+0x131/0x35d > ... > In the above resource tree, "System RAM" is a descendant of "CXL > Window 0" instead of a top level resource. So, region_intersects() > will report no System RAM resources in the CXL memory region > incorrectly, because it only checks the top level resources. > Consequently, devmem_is_allowed() will return 1 (allow access via > /dev/mem) for CXL memory region incorrectly. Fortunately, ioremap() > doesn't allow to map System RAM and reject the access. > > However, region_intersects() needs to be fixed to work correctly with > the resources tree with CXL Window as above. To fix it, if we found a > unmatched resource in the top level, we will continue to search > matched resources in its descendant resources. So, we will not miss > any matched resources in resource tree anymore. In the new > implementation, > > |------------- "CXL Window 0" ------------| > |-- "System RAM" --| > > will look as if > > |-- "System RAM" --||-- "CXL Window 0a" --| Where did "0a" come from? The /proc/iomem above mentioned "CXL Window 0"; is the "a" spurious? Same question applies to the code comment below. > in effect. > + * |------------- "CXL Window 0" ------------| > + * |-- "System RAM" --| > + * > + * looks as if > + * > + * |-- "System RAM" --||-- "CXL Window 0a" --| > + * > + * in effect.