Subject: Re: [RFC] Memory tiering kernel alignment
From: John Groves
To: David Rientjes
Cc: John Hubbard, Zi Yan, Bharata B Rao, Dave Jiang, Aneesh Kumar K.V, "Huang, Ying", Alistair Popple, Christoph Lameter, Andrew Morton, Linus Torvalds, Dave Hansen, Mel Gorman, Jon Grimm, Gregory Price, Brian Morris, Wei Xu, Johannes Weiner, linux-mm@kvack.org
Date: Fri, 26 Jan 2024 14:31:00 +0000
In-Reply-To: <75f21150-1e12-4f4b-e578-e170e4fea18b@google.com>
References: <75f21150-1e12-4f4b-e578-e170e4fea18b@google.com>
Message-ID: <0100018d462e5e47-5a13cb79-3eda-4d3c-a88c-812fb8f4e557-000000@email.amazonses.com>

On 24/01/25 10:26AM, David Rientjes wrote:
> Hi everybody,
>
> There is a lot of excitement around upcoming CXL type 3 memory expansion
> devices and their cost savings potential. As the industry starts to
> adopt this technology, one of the key components in strategic planning is
> how the upstream Linux kernel will support various tiered configurations
> to meet various user needs. I think it goes without saying that this is
> quite interesting to cloud providers as well as other hyperscalers :)
>
> I think this discussion would benefit from a collaborative approach
> between various stakeholders and interested parties. The reason is
> that there are several different use cases that need different support
> models, but also because there is great incentive toward moving "with"
> upstream Linux for this support rather than having multiple parties
> bringing up their own stacks only to find that they are diverging from
> upstream rather than converging with it.
>
> I'm interested to learn if there is interest in forming a "Linux Memory
> Tiering Work Group" to share ideas, discuss multi-faceted approaches, and
> keep track of work items?
>
> Some recent discussions have proven that there is widespread interest in
> some very foundational topics for this technology such as:
>
>  - Decoupling CPU balancing from memory balancing (or obsoleting CPU
>    balancing entirely)
>
>    + John Hubbard notes this would be useful for GPUs:
>
>       a) GPUs have their own processors that are invisible to the kernel's
>          NUMA "which tasks are active on which NUMA nodes" calculations,
>          and
>
>       b) Similar to where CXL is generally going, we have already built
>          fully memory-coherent hardware, which includes memory-only NUMA
>          nodes.
>
>  - In-kernel hot memory abstraction, informed by hardware hinting drivers
>    (incl some architectures like Power10), usable as a NUMA Balancing
>    backend for promotion and other areas of the kernel like transparent
>    hugepage utilization
>
>  - NUMA and memory tiering enlightenment for accelerators, such as for
>    optimal use of GPU memory, extremely important for a cloud provider
>    (hint hint :)
>
>  - Asynchronous memory promotion independent of task_numa_fault() while
>    considering the cost of page migration (due to identifying cold memory)
>
> It looks like there is already some interest in such a working group that
> would have a biweekly discussion of shared interests with the goal of
> accelerating design, development, testing, and division of work:
>
> Alistair Popple
> Aneesh Kumar K V
> Brian Morris
> Christoph Lameter
> Dan Williams
> Gregory Price
> Grimm, Jon
> Huang, Ying
> Johannes Weiner
> John Hubbard
> Zi Yan
>
> Specifically for the in-kernel hot memory abstraction topic, Google and
> Meta recently published an OCP base specification "Hyperscale CXL Tiered
> Memory Expander Specification" available at
> https://drive.google.com/file/d/1fFfU7dFmCyl6V9-9qiakdWaDr9d38ewZ/view?usp=drive_link
> that would be great to discuss.
>
> There is also on-going work in the CXL Consortium to standardize some of
> the abstractions for CXL 3.1.
>
> If folks are interested in this topic and your name doesn't appear above
> (I already got you :), please:
>
>  - reply-all to this email to express interest and expand upon the list
>    of topics above to represent additional areas of interest that should
>    be included, *or*
>
>  - email me privately to express interest to make sure you are included
>
> Perhaps I'm overly optimistic, but one thing that would be absolutely
> *amazing* would be if we all have a very clear and understandable vision
> for how Linux will support the wide variety of use cases, even before
> that work is fully implemented (or even designed), by LSF/MM/BPF 2024
> time in May.
>
> Thanks!
>

Please add me to the CXL interested parties list.

John Groves (jgroves@micron.com / John@Jagalactic.com)