Date: Thu, 9 May 2024 20:10:07 -0700 (PDT)
From: David Rientjes
To: "Huang, Ying"
cc: lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org, Michal Hocko,
    Dan Williams, John Hubbard, Zi Yan, Bharata B Rao, Dave Jiang,
    "Aneesh Kumar K.V", Alistair Popple, Christoph Lameter, Andrew Morton,
    Linus Torvalds, Dave Hansen, Mel Gorman, Jon Grimm, Gregory Price,
    Wei Xu, Johannes Weiner, SeongJae Park, David Hildenbrand,
    Davidlohr Bueso, Yuanchu Xie
Subject: Re: [LSF/MM/BPF TOPIC] Locally attached memory tiering
In-Reply-To: <87msp1kkj2.fsf@yhuang6-desk2.ccr.corp.intel.com>
Message-ID: <4e326d6a-b4d2-0617-97fe-e2d8c6458c68@google.com>
References: <87msp1kkj2.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Wed, 8 May 2024, Huang, Ying wrote:

> > Hi all,
> >
> > I think it would be very worthwhile to have a block set aside for
> > discussion on locally attached memory tiering extensions at LSF/MM/BPF
> > 2024.
> >
> > Primarily interested in discussing Linux enlightenment for CXL 1.1 and
> > later type-3 memory expansion devices (CXL.mem).  I think we could touch
> > on CXL 2.0 and later memory pooling architectures if we have time and
> > there is interest, but the primary focus here would be locally attached.
> >
> > Based on the premise for a Memory Tiering Working Group[1], there is
> > widespread interest in the foundational topics for generally useful Linux
> > enlightenment:
> >
> >  - Decoupling CPU balancing from memory balancing (or obsoleting CPU
> >    balancing entirely)
> >
> >    + John Hubbard notes this would be useful for GPUs:
> >
> >       a) GPUs have their own processors that are invisible to the kernel's
> >          NUMA "which tasks are active on which NUMA nodes" calculations,
> >          and
> >
> >       b) Similar to where CXL is generally going, we have already built
> >          fully memory-coherent hardware, which includes memory-only NUMA
> >          nodes.
> >
> >  - In-kernel hot memory abstraction, informed by hardware hinting drivers
> >    (incl some architectures like Power10), usable as a NUMA Balancing
> >    backend for promotion and other areas of the kernel like transparent
> >    hugepage utilization
> >
> >  - NUMA and memory tiering enlightenment for accelerators, such as for
> >    optimal use of GPU memory, extremely important for a cloud provider
> >    (hint hint :)
> >
> >  - Asynchronous memory promotion independent of task_numa_fault() while
> >    considering the cost of page migration (due to identifying cold memory)
> >
> >  - What role userspace plays in this decision-making and how we can
> >    extend the default policy and mechanisms in the kernel to allow for it
> >    if necessary
> >
> > Additional topics that you find interesting are also very helpful!
>
> In addition to the hot memory identification and promotion, I think that
> we should consider the cold memory identification and demotion too as a
> full solution.
> The existing method based on the page table accessed bit
> may be good enough, but we still need to consider the full solution in
> the context of the general NUMA balancing.
>

I think that's a great suggestion!  We'll be able to cover the approach
taken by workingset reporting[*], which is quite powerful for the purposes
of proactive reclaim through memory.reclaim and would also be very useful
for identifying cold memory for the purposes of demotion.

 [*] https://lore.kernel.org/linux-mm/20240504073011.4000534-1-yuanchu@google.com/T/

> > I'm biased toward a generally useful solution that would leverage the
> > kernel as the ultimate source of truth for page hotness that can be
> > extended for multiple use cases, one of which is memory tiering support.
> > But certainly if there are other approaches, we can discuss that as well.
> >
> > A few main goals from this discussion:
> >
> >  - Ensure that proposals address, or can be extended to address, the
> >    emerging needs of the various use cases that users may have
> >
> >  - Surface any constraints that stakeholders may find to be prohibitive
> >    for support in the core MM subsystem
> >
> >  - Alignment and division of work for developers who are actively looking
> >    to contribute to this area
> >
> > As I'm just one of many stakeholders for this discussion, I'd nominate
> > Michal Hocko to moderate it if he's willing to do so.  If he's so willing,
> > we'd be in good hands :)
> >
> > [1] https://lore.kernel.org/linux-mm/45d850ec-623b-7c07-c266-e948cdbf1f62@linux.com/T/
>
> --
> Best Regards,
> Huang, Ying
>