Date: Thu, 25 Jan 2024 12:04:37 -0800 (PST)
From: David Rientjes
To: Matthew Wilcox
cc: John Hubbard, Zi Yan, Bharata B Rao, Dave Jiang, "Aneesh Kumar K.V",
    "Huang, Ying", Alistair Popple, Christoph Lameter, Andrew Morton,
    Linus Torvalds, Dave Hansen, Mel Gorman, Jon Grimm, Gregory Price,
    Brian Morris, Wei Xu, Johannes Weiner, linux-mm@kvack.org
Subject: Re: [RFC] Memory tiering kernel alignment
Message-ID: <2b29dd3d-bb2c-6a8c-94d2-d5c2e035516a@google.com>
References: <75f21150-1e12-4f4b-e578-e170e4fea18b@google.com>

On Thu, 25 Jan 2024, Matthew Wilcox wrote:

> On Thu, Jan 25, 2024 at 10:26:19AM -0800, David Rientjes wrote:
> > There is a lot of excitement around upcoming CXL type 3 memory expansion
> > devices and their cost savings potential.
> > As the industry starts to
> > adopt this technology, one of the key components in strategic planning is
> > how the upstream Linux kernel will support various tiered configurations
> > to meet various user needs. I think it goes without saying that this is
> > quite interesting to cloud providers as well as other hyperscalers :)
>
> I'm not excited. I'm disappointed that people are falling for this scam.
> CXL is the ATM of this decade. The protocol is not fit for the purpose
> of accessing remote memory, adding 10ns just for an encode/decode cycle.
> Hands up everybody who's excited about memory latency increasing by 17%.
>

Right, I don't think that anybody is claiming that we can leverage locally
attached CXL memory as though it were DRAM on the same or remote socket, or
that there won't be a noticeable impact on application performance while the
memory is still across the device.

It does offer several cost savings benefits for offloading cold memory,
though, if locally attached, and I think the support for that use case is
inevitable -- in fact, Linux already has some sophisticated support for the
locally attached use case.

> Then there are the lies from the vendors who want you to buy switches.
> Not one of them are willing to guarantee you the worst case latency
> through their switches.
>

I should have prefaced this thread by saying "locally attached CXL memory
expansion", because that's the primary focus of many of the folks on this
email thread :)

FWIW, I fully agree with your evaluation for memory pooling and some of
the extensions provided by CXL 2.0. I think that a lot of the pooling
concepts are currently being overhyped; that's just my personal opinion.
Happy to talk about the advantages and disadvantages (as well as the use
cases), but I remain unconvinced on memory pooling use cases.

> The concept is wrong. Nobody wants to tie all of their machines together
> into a giant single failure domain. There's no possible redundancy
> here.
> Availability is diminished; how do you upgrade firmware on a
> switch without taking it down? Nobody can answer my contentions about
> contention either; preventing a single machine from hogging access to
> a single CXL endpoint seems like an unsolved problem.
>
> CXL is great for its real purpose of attaching GPUs and migrating memory
> back and forth in a software-transparent way. We should support that,
> and nothing more.
>
> We should reject this technology before it harms our kernel and the
> entire industry. There's a reason that SGI died. Nobody wants to buy
> single image machines the size of a data centre.
>