From: Yang Shi
Date: Thu, 3 Oct 2024 18:10:42 -0700
Subject: Re: [PATCH] tdx, memory hotplug: Check whole hot-adding memory range for TDX
To: Dan Williams
Cc: "Huang, Ying", David Hildenbrand, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, "Kirill A . Shutemov", x86@kernel.org, Andrew Morton, Oscar Salvador, linux-coco@lists.linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Kai Huang, "H. Peter Anvin", Andy Lutomirski
References: <20240930055112.344206-1-ying.huang@intel.com> <8734lgpuoi.fsf@yhuang6-desk2.ccr.corp.intel.com> <66ff297119b92_964f2294c6@dwillia2-xfh.jf.intel.com.notmuch>
In-Reply-To: <66ff297119b92_964f2294c6@dwillia2-xfh.jf.intel.com.notmuch>

On Thu, Oct 3, 2024 at 4:32 PM Dan Williams wrote:
>
> Yang Shi wrote:
> > On Mon, Sep 30, 2024 at 4:54 PM Huang, Ying wrote:
> > >
> > > Hi, David,
> > >
> > > Thanks a lot for comments!
> > >
> > > David Hildenbrand writes:
> > >
> > > > On 30.09.24 07:51, Huang Ying wrote:
> > > >> On systems with TDX (Trust Domain eXtensions) enabled, memory ranges
> > > >> hot-added must be checked for compatibility by TDX. This is currently
> > > >> implemented through memory hotplug notifiers for each memory_block.
> > > >> If a memory range which isn't TDX compatible is hot-added, for
> > > >> example, some CXL memory, the command line as follows,
> > > >>   $ echo 1 > /sys/devices/system/node/nodeX/memoryY/online
> > > >> will report something like,
> > > >>   bash: echo: write error: Operation not permitted
> > > >> If pr_debug() is enabled, the error message like below will be shown
> > > >> in the kernel log,
> > > >>   online_pages [mem 0xXXXXXXXXXX-0xXXXXXXXXXX] failed
> > > >> Both are too general to root cause the problem. This will confuse
> > > >> users. One solution is to print some error messages in the TDX memory
> > > >> hotplug notifier. However, memory hotplug notifiers are called for
> > > >> each memory block, so this may lead to a large volume of messages in
> > > >> the kernel log if a large number of memory blocks are onlined with a
> > > >> script or automatically. For example, the typical size of memory
> > > >> block is 128MB on x86_64, when online 64GB CXL memory, 512 messages
> > > >> will be logged.
> > > >
> > > > ratelimiting would likely help here a lot, but I agree that it is
> > > > suboptimal.
> > > >
> > > >> Therefore, in this patch, the whole hot-adding memory range is checked
> > > >> for TDX compatibility through a newly added architecture specific
> > > >> function (arch_check_hotplug_memory_range()). If rejected, the memory
> > > >> hot-adding will be aborted with a proper kernel log message. Which
> > > >> looks like something as below,
> > > >>   virt/tdx: Reject hot-adding memory range: 0xXXXXXXXX-0xXXXXXXXX
> > > >>   for TDX compatibility.
> > > >>
> > > >> The target use case is to support CXL memory on TDX enabled systems.
> > > >> If the CXL memory isn't compatible with TDX, the whole CXL memory
> > > >> range hot-adding will be rejected. While the CXL memory can still be
> > > >> used via devdax interface.
> > > >
> > > > I'm curious, why can that memory be used through devdax but not
> > > > through the buddy? I'm probably missing something important :)
> > >
> > > Because only TDX compatible memory can be used for TDX guest. The buddy
> > > is used to allocate memory for TDX guest. While devdax will not be used
> > > for that.
> >
> > Sorry for chiming in late. I think CXL also faces the similar problem
> > on the platform with MTE (memory tagging extension on ARM64). AFAIK,
> > we can't have MTE on CXL, so CXL has to stay as dax device if MTE is
> > enabled.
> >
> > We should need a similar mechanism to prevent users from hot-adding
> > CXL memory if MTE is on. But not like TDX I don't think we have a
> > simple way to tell whether the pfn belongs to CXL or not. Please
> > correct me if I'm wrong. I'm wondering whether we can find a more
> > common way to tell memory hotplug to not hot-add some region. For
> > example, a special flag in struct resource. off the top of my head.
> >
> > No solid idea yet, I'm definitely seeking some advice.
>
> Could the ARM version of arch_check_hotplug_memory_range() check if MTE
> is enabled in the CPU and then ask the CXL subsystem if the address range is
> backed by a topology that supports MTE?

The kernel can tell whether MTE is really enabled. For the CXL part,
IIUC that relies on the CXL subsystem being able to tell whether that
range can support MTE or not, right? Or the CXL subsystem could tell us
whether the range is a CXL memory range or not, and then we can just
refuse MTE for all CXL regions for now. Does CXL support this now?

>
> However, why would it be ok to access CXL memory without MTE via devdax,
> but not as online page allocator memory?

CXL memory can be onlined as system RAM as long as MTE is not enabled.
If MTE is enabled, it can only be used as a devdax device.

>
> If the goal is to simply deny any and all non-MTE supported CXL region
> from attaching then that could probably be handled as a modification to
> the "cxl_acpi" driver to deny region creation unless it supports
> everything the CPU expects from "memory".

I'm not quite familiar with the details of the CXL driver. What did you
mean by "deny region creation"? As long as the CXL memory can still be
used as a devdax device, it should be fine.
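
For illustration, the kind of whole-range check discussed in this thread could be modeled roughly as below. This is a userspace sketch only, not kernel code: check_hotplug_memory_range(), range_is_cxl(), the cxl_ranges table and the mte_enabled flag are all hypothetical stand-ins. It also assumes the CXL subsystem can report which physical ranges it backs, which is exactly the open question here.

```c
/*
 * Userspace model only, not kernel code. Every name below is hypothetical
 * and exists purely to illustrate a once-per-range hot-add check.
 */
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct mem_range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

/* Assumption: the CXL subsystem could expose its ranges in some such form. */
const struct mem_range cxl_ranges[] = {
	{ 0x1000000000ULL, 0x2000000000ULL },	/* made-up 64GB CXL window */
};

/* Assumption: stands in for however the arch reports that MTE is in use. */
bool mte_enabled = true;

/* Two half-open ranges overlap if each one starts before the other ends. */
bool ranges_overlap(const struct mem_range *a, const struct mem_range *b)
{
	return a->start < b->end && b->start < a->end;
}

/* True if any part of the range falls inside a known CXL window. */
bool range_is_cxl(const struct mem_range *r)
{
	size_t n = sizeof(cxl_ranges) / sizeof(cxl_ranges[0]);

	for (size_t i = 0; i < n; i++)
		if (ranges_overlap(r, &cxl_ranges[i]))
			return true;
	return false;
}

/*
 * Called once for the whole hot-added range rather than once per memory
 * block. Returns 0 to allow the hot-add, -EPERM to reject it.
 */
int check_hotplug_memory_range(uint64_t start, uint64_t size)
{
	struct mem_range r = { start, start + size };

	if (mte_enabled && range_is_cxl(&r))
		return -EPERM;
	return 0;
}
```

With the made-up table, a 128MB block at 0x1000000000 is rejected with -EPERM while MTE is modeled as on, and a block outside the CXL window is accepted. Because the check runs once per hot-add request, a rejected 64GB range would produce one log message rather than 512.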