Date: Mon, 1 May 2023 10:16:49 -0300
From: Jason Gunthorpe
To: David Rientjes
Cc: Michal Hocko, Dan Williams, lsf-pc@lists.linux-foundation.org,
 linux-mm@kvack.org, Wei Xu, Frank van der Linden, Johannes Weiner,
 Dave Hansen, Huang Ying, "Aneesh Kumar K.V", Yang Shi, Davidlohr Bueso,
 Jon Grimm, John Hubbard
Subject: Re: [LSF/MM/BPF TOPIC] The future of memory tiering
References: <7443f0e6-6be2-3320-60d9-03da0cca2987@google.com>
In-Reply-To: <7443f0e6-6be2-3320-60d9-03da0cca2987@google.com>

On Wed, Apr 26, 2023 at 09:30:54PM -0700, David Rientjes wrote:
> Hi everybody,
>
> As requested, sending along a last minute topic suggestion for
> consideration for LSF/MM/BPF 2023 :)
>
> For a sizable set of emerging technologies, memory tiering presents one of
> the most formidable challenges and exciting opportunities for the MM
> subsystem today.
>
> "Memory tiering" can mean many different things based on the user: from
> traditional everyday NUMA, to swap (to zswap), to NVDIMMs, to HBM, to
> locally attached CXL memory, to memory borrowing over PCIe, to memory
> pooling with disaggregation, and beyond.
>
> Just as NUMA started out only being useful for the supercomputers, memory
> tiering will likely evolve over the next five years to take on an
> expanding set of use cases, and likely with rapidly increasing adoption
> even beyond hyperscalers.
>
> I think a discussion about memory tiering would be highly valuable. A few
> key questions that I think can drive this discussion:
>
>  - What are the various form factors that must be supported as short-term
>    goals as well as need to be supported 5+ years into the future?
>
>  - What incremental changes need to be made on top of NUMA support to
>    fully support the wide range of use cases that will be coming? (Is
>    memory tiering support built entirely upon NUMA?)
>
>  - What is the minimum viable *default* support that the MM subsystem
>    should provide for tiered configs? What are the set of optimizations
>    that should be left to userspace or BPF to control?
>
>  - What are the various page promotion techniques that we must plan for
>    beyond traditional NUMA balancing that will allow us to exploit
>    hardware innovation?
>
> (And I'm sure there are more topics of discussion that others would
> readily add. It would be great to have additional ideas in replies.)
>
> A key challenge in all of this is to make memory tiering support in the
> upstream kernel compatible with the roadmaps of various CPU vendors. A
> key goal is to ensure the end user benefits from all of this rapid
> innovation with generalized support that is well abstracted and allows for
> extensibility.

I'm interested in this too. Memory pools with strong locality to specific
compute blocks are becoming an increasingly common feature in supercomputer
build-outs.
It would be great to see a comprehensive approach to this in the mm, not
just a solution for the "external slower DRAM" case.

Jason