From: "Ho-Ren (Jack) Chuang" <horenchuang@bytedance.com>
Date: Mon, 4 Mar 2024 22:18:51 -0800
Subject: Re: [External] Re: [PATCH v1 0/1] Improved Memory Tier Creation for CPUless NUMA Nodes
To: "Huang, Ying" <ying.huang@intel.com>
Cc: Gregory Price, aneesh.kumar@linux.ibm.com, mhocko@suse.com, tj@kernel.org, john@jagalactic.com, Eishan Mirakhur, Vinicius Tavares Petrucci, Ravis OpenSrc, Alistair Popple, "Rafael J. Wysocki", Len Brown, Andrew Morton, Dave Jiang, Dan Williams, Jonathan Cameron, linux-acpi@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, "Ho-Ren (Jack) Chuang", linux-cxl@vger.kernel.org, qemu-devel@nongnu.org
References: <20240301082248.3456086-1-horenchuang@bytedance.com> <87frx6btqp.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <87frx6btqp.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Sun, Mar 3, 2024 at 6:47 PM Huang, Ying <ying.huang@intel.com> wrote:
>
> "Ho-Ren (Jack) Chuang" <horenchuang@bytedance.com> writes:
>
> > The memory tiering component in the kernel is functionally useless for
> > CPUless memory/non-DRAM devices like CXL1.1 type3 memory because the nodes
> > are lumped together in the DRAM tier.
> > https://lore.kernel.org/linux-mm/PH0PR08MB7955E9F08CCB64F23963B5C3A860A@PH0PR08MB7955.namprd08.prod.outlook.com/T/
>
> I think that it's unfair to call it "useless".  Yes, it doesn't work if
> the CXL memory devices are not enumerated via drivers/dax/kmem.c.  So,
> please be specific about in which cases it doesn't work instead of the
> too general "useless".

Thank you. I didn't mean anything specific; I simply reused phrasing from
our discussion of the previous patchset. I will change it to the following
in v2:

"At boot time, current memory tiering assigns all detected memory nodes
to the same DRAM tier. As a result, CPUless memory/non-DRAM devices,
such as CXL1.1 type3 memory, cannot be assigned to the correct memory tier,
and pages cannot be migrated between different types of memory."

Please let me know if this is specific enough.

> > This patchset automatically resolves the issues. It delays the initialization
> > of memory tiers for CPUless NUMA nodes until they obtain HMAT information
> > at boot time, eliminating the need for user intervention.
> > If no HMAT is specified, it falls back to using `default_dram_type`.
> >
> > Example use case:
> > We have CXL memory on the host, and we create VMs with a new system memory
> > device backed by host CXL memory. We inject CXL memory performance attributes
> > through QEMU, and the guest now sees memory nodes with performance attributes
> > in HMAT. With this change, we enable the guest kernel to construct
> > the correct memory tiering for the memory nodes.
> >
> > Ho-Ren (Jack) Chuang (1):
> >   memory tier: acpi/hmat: create CPUless memory tiers after obtaining
> >     HMAT info
> >
> >  drivers/acpi/numa/hmat.c     |  3 ++
> >  include/linux/memory-tiers.h |  6 +++
> >  mm/memory-tiers.c            | 76 ++++++++++++++++++++++++++++++++----
> >  3 files changed, 77 insertions(+), 8 deletions(-)
>
> --
> Best Regards,
> Huang, Ying

--
Best regards,
Ho-Ren (Jack) Chuang
莊賀任