Date: Mon, 6 May 2024 20:37:19 -0700 (PDT)
From: David Rientjes
To: lsf-pc@lists.linux-foundation.org
cc: linux-mm@kvack.org, Michal Hocko, Dan Williams, John Hubbard, Zi Yan,
    Bharata B Rao, Dave Jiang, "Aneesh Kumar K.V", "Huang, Ying",
    Alistair Popple, Christoph Lameter, Andrew Morton, Linus Torvalds,
    Dave Hansen, Mel Gorman, Jon Grimm, Gregory Price, Wei Xu,
    Johannes Weiner, SeongJae Park, David Hildenbrand, Davidlohr Bueso
Subject: [LSF/MM/BPF TOPIC] Locally attached memory tiering

Hi all,

I think it would be very worthwhile to have a block set aside for
discussion on locally attached memory tiering extensions at LSF/MM/BPF
2024.

I'm primarily interested in discussing Linux enlightenment for CXL 1.1
and later type-3 memory expansion devices (CXL.mem). I think we could
touch on CXL 2.0 and later memory pooling architectures if we have time
and there is interest, but the primary focus here would be locally
attached memory.

Based on the premise for a Memory Tiering Working Group[1], there is
widespread interest in the foundational topics for generally useful
Linux enlightenment:

 - Decoupling CPU balancing from memory balancing (or obsoleting CPU
   balancing entirely)

   + John Hubbard notes this would be useful for GPUs:

      a) GPUs have their own processors that are invisible to the
         kernel's NUMA "which tasks are active on which NUMA nodes"
         calculations, and

      b) Similar to where CXL is generally going, we have already built
         fully memory-coherent hardware, which includes memory-only NUMA
         nodes.

 - In-kernel hot memory abstraction, informed by hardware hinting
   drivers (including some architectures like Power10), usable as a
   NUMA Balancing backend for promotion and in other areas of the
   kernel, such as transparent hugepage utilization

 - NUMA and memory tiering enlightenment for accelerators, such as for
   optimal use of GPU memory, extremely important for a cloud provider
   (hint hint :)

 - Asynchronous memory promotion independent of task_numa_fault() while
   considering the cost of page migration (due to identifying cold
   memory)

 - What role userspace plays in this decision-making and how we can
   extend the default policy and mechanisms in the kernel to allow for
   it if necessary

Additional topics that you find interesting are also very helpful!

I'm biased toward a generally useful solution that would leverage the
kernel as the ultimate source of truth for page hotness and that can be
extended for multiple use cases, one of which is memory tiering support.
But certainly if there are other approaches, we can discuss those as
well.

A few main goals from this discussion:

 - Ensure that proposals address, or can be extended to address, the
   emerging needs of the various use cases that users may have

 - Surface any constraints that stakeholders may find to be prohibitive
   for support in the core MM subsystem

 - Alignment and division of work for developers who are actively
   looking to contribute to this area

As I'm just one of many stakeholders for this discussion, I'd nominate
Michal Hocko to moderate it if he's willing to do so. If he's so
willing, we'd be in good hands :)

[1] https://lore.kernel.org/linux-mm/45d850ec-623b-7c07-c266-e948cdbf1f62@linux.com/T/