From: Yang Shi
Date: Fri, 28 Apr 2023 19:26:52 -0700
Subject: Re: [LSF/MM/BPF TOPIC] The future of memory tiering
To: David Rientjes
Cc: Michal Hocko, Dan Williams, lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org, Wei Xu, Frank van der Linden, Johannes Weiner, Dave Hansen, Huang Ying, "Aneesh Kumar K.V", Davidlohr Bueso, Jon Grimm
In-Reply-To: <7443f0e6-6be2-3320-60d9-03da0cca2987@google.com>
References: <7443f0e6-6be2-3320-60d9-03da0cca2987@google.com>

Hi David,

Thanks for proposing this topic. I'd like to join the discussion. Some
inline comments below.

On Wed, Apr 26, 2023 at 9:30 PM David Rientjes wrote:
>
> Hi everybody,
>
> As requested, sending along a last minute topic suggestion for
> consideration for LSF/MM/BPF 2023 :)
>
> For a sizable set of emerging technologies, memory tiering presents one of
> the most formidable challenges and exciting opportunities for the MM
> subsystem today.
>
> "Memory tiering" can mean many different things based on the user: from
> traditional every day NUMA, to swap (to zswap), to NVDIMMs, to HBM, to
> locally attached CXL memory, to memory borrowing over PCIe, to memory
> pooling with disaggregation, and beyond.
>
> Just as NUMA started out only being useful for the supercomputers, memory
> tiering will likely evolve over the next five years to take on an
> expanding set of use cases, and likely with rapidly increasing adoption
> even beyond hyperscalers.
>
> I think a discussion about memory tiering would be highly valuable.  A few
> key questions that I think can drive this discussion:
>
>  - What are the various form factors that must be supported as short-term
>    goals as well as need to be supported 5+ years into the future?
>
>  - What incremental changes need to be made on top of NUMA support to
>    fully support the wide range of use cases that will be coming?  (Is
>    memory tiering support built entirely upon NUMA?)

AFAICT, per the earlier discussion, NUMA distance may not be enough to
rank memory devices into tiers properly. We may need to figure out one
or more better metrics.

>
>  - What is the minimum viable *default* support that the MM subsystem
>    should provide for tiered configs?  What are the set of optimizations
>    that should be left to userspace or BPF to control?
>
>  - What are the various page promotion techniques that we must plan for
>    beyond traditional NUMA balancing that will allow us to exploit
>    hardware innovation?
>
> (And I'm sure there are more topics of discussion that others would
> readily add.  It would be great to have additional ideas in replies.)
>
> A key challenge in all of this is to make memory tiering support in the
> upstream kernel compatible with the roadmaps of various CPU vendors.  A
> key goal is to ensure the end user benefits from all of this rapid
> innovation with generalized support that is well abstracted and allows for
> extensibility.