From: Gregory Price
Date: Fri, 20 Dec 2024 09:13:50 -0500
To: Hyeonggon Yoo
Cc: Joshua Hahn, "gourry@gourry.net", kernel_team@skhynix.com,
    42.hyeyoo@gmail.com, "rafael@kernel.org", "lenb@kernel.org",
    "gregkh@linuxfoundation.org", "akpm@linux-foundation.org",
    Honggyu Kim, "ying.huang@linux.alibaba.com", Rakie Kim,
    "dan.j.williams@intel.com", "Jonathan.Cameron@huawei.com",
    "dave.jiang@intel.com", "horen.chuang@linux.dev",
    "hannes@cmpxchg.org", "linux-mm@kvack.org",
    "linux-kernel@vger.kernel.org", "linux-acpi@vger.kernel.org",
    "kernel-team@meta.com"
Subject: Re: [External Mail] [RFC PATCH v2] Weighted interleave auto-tuning
References: <20241219191845.3506370-1-joshua.hahnjy@gmail.com>
    <3682b9cf-213c-497d-ab81-f70e1a785716@sk.com>
In-Reply-To: <3682b9cf-213c-497d-ab81-f70e1a785716@sk.com>

On Fri, Dec 20, 2024 at 05:25:28PM +0900, Hyeonggon Yoo wrote:
> On 2024-12-20 4:18 AM, Joshua Hahn wrote:

... snip ...

>
> By the way, this might be out of scope, but let me ask for my own
> learning.
>
> We have a server with 2 sockets, each attached with local DRAM and CXL
> memory (and thus 4 NUMA nodes).
> When accessing remote socket's memory
> (either CXL or not), the bandwidth is limited by the interconnect's
> bandwidth.
>
> On this server, ideally weighted interleaving should be configured
> within a socket (e.g. local NUMA node + local CXL node) because
> weighted interleaving does not consider the bandwidth when accessed
> from a remote socket.
>
> So, the question is: On systems with multiple sockets (and CXL mem
> attached to each socket), do you always assume the admin must bind to
> a specific socket for optimal performance or is there any plan to
> mitigate this problem without binding tasks to a socket?
>

There was a long discussion about this when initially implementing the
weighted interleave mechanism.

The answer is basically that interleave/weighted-interleave is
suboptimal for this scenario for a few reasons.

1) The "effective bandwidth" of a given node is relative *to the task*

   Imagine:

   A----B
   |    |
   C    D

   Task 1 on A has a different effective bandwidth from A->D than
   Task 2 running on B. There's no good way for us to capture this
   information in global weights because...

2) We initially explored implementing a matrix of weights (cpu-relative)

   This had little support - so it was simplified to a single array.

3) We also explored task-local weights to allow capturing this info.

   This required new syscalls, and likewise had little support.

4) It's unclear how we can actually acquire cross-connect bandwidth
   information anyway, and it's further unclear how this would be used
   in an automated fashion to do "something reasonable" for the user.

5) The actual use cases for weighted-interleave on multi-socket systems
   were questionable due to the above - so we more or less discarded
   the idea as untenable at best (or at least in need of much more
   thought)

So in short, yes, if the admin wants to make good use of (weighted)
interleave, they should bind to one socket and its attached CXL memory
only - otherwise the hidden chokepoint of the cross-socket interconnect
may bite them.

For now the best we can do is create global-relative weights, which
mathematically reduce according to bandwidth within a nodemask if the
task binds itself to a single socket.

~Gregory
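
As a rough illustration of the closing point about global weights
reducing within a nodemask, here is a minimal Python sketch. It is not
the kernel's implementation: the weight values, the node numbering
(nodes 0 and 1 as DRAM on sockets 0 and 1, nodes 2 and 3 as the CXL
memory attached to them) and the helper names `shares_within` and
`interleave` are all assumptions made up for the example.

```python
from fractions import Fraction
from itertools import cycle, islice

# Hypothetical global weights, one per NUMA node, assumed to be derived
# from each node's local bandwidth. The numbers are invented.
# Assumed layout: node 0 = DRAM on socket 0, node 1 = DRAM on socket 1,
#                 node 2 = CXL on socket 0,  node 3 = CXL on socket 1.
GLOBAL_WEIGHTS = {0: 3, 1: 3, 2: 1, 3: 1}


def shares_within(weights, nodemask):
    """Return the fraction of pages each node in `nodemask` receives."""
    total = sum(weights[n] for n in nodemask)
    return {n: Fraction(weights[n], total) for n in sorted(nodemask)}


def interleave(weights, nodemask, npages):
    """Place `npages` pages, giving each node weights[n] consecutive
    pages per round and visiting nodes in ascending order."""
    order = [n for n in sorted(nodemask) for _ in range(weights[n])]
    return list(islice(cycle(order), npages))


# A task bound to socket 0 and its attached CXL node only.
socket0 = {0, 2}
print(shares_within(GLOBAL_WEIGHTS, socket0))
print(interleave(GLOBAL_WEIGHTS, socket0, 8))
```

With these made-up numbers, a task bound to nodes {0, 2} places three
pages on local DRAM for every one on local CXL, which is the same 3:1
ratio the global weights encode for those two nodes. The sketch has no
way to express that node 1 or node 3 is slower when reached across the
interconnect from socket 0, which is the limitation described above.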