Date: Fri, 7 Feb 2025 16:20:24 +0900
From: Byungchul Park
To: Matthew Wilcox
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>, lsf-pc@lists.linux-foundation.org,
    linux-mm@kvack.org, linux-cxl@vger.kernel.org, Honggyu Kim,
    kernel_team@skhynix.com
Subject: Re: [LSF/MM/BPF TOPIC] Restricting or migrating unmovable kernel allocations from slow tier
Message-ID: <20250207072024.GA48419@system.software.com>

On Sat, Feb 01, 2025 at 02:04:17PM +0000, Matthew Wilcox wrote:
> On Sat, Feb 01, 2025 at 10:29:23PM +0900, Hyeonggon Yoo wrote:
> > The Linux kernel supports hot-plugging CXL memory via dax/kmem functionality.
> > The hot-plugged memory allows either unmovable kernel allocations
> > (ZONE_NORMAL), or restricts them to movable allocations (ZONE_MOVABLE)
> > depending on the hot-plug policy.
>
> This all seems like a grand waste of time. Don't do that. Don't allow
> kernel allocations from CXL at all. Don't build systems that have
> vast quantities of CXL memory (or if you do, expose it as really fast
> swap, not as memory).
>
> All of the CXL topics I see this year are "It really hurts performance
> when ..." and my reaction is "Yes, I told you it would hurt and you did
> it anyway". Just stop doing it. CXL is this decade's Infiniband / ATM
> / (name your favourite misguided dead technology here). You can't stop
> other people from doing foolish things, but you don't have to join in.
> And we don't have to take stupid patches.

Hyeonggon and I described the topic based on what we observed in a CXL
memory environment, but fundamentally it is not only a CXL memory issue.
It is also a heterogeneous memory issue and a ZONE_NORMAL cost issue, as
you and others mentioned. Let me clarify:

   1. Allow kernel objects to be movable:

      a. ZONE_NORMAL cost will be reduced (less reclaim and OOM).
      b. A given ZONE_NORMAL can cover a larger total memory.
      c. A smaller ZONE_NORMAL is sufficient.
      d. Needs additional consideration of when (or what) to move.

   2. Never allow kernel objects to be movable:

      a. ZONE_NORMAL cost stays high (premature reclaim and OOM).
      b. A given ZONE_NORMAL can cover only a smaller total memory.
      c. A bigger ZONE_NORMAL is required.

   3. Allow ZONE_NORMAL in non-DRAM:

      a. Mitigates ZONE_NORMAL cost (less reclaim and OOM).
      b. Brings follow-on issues, e.g. hot-unplug.
      c. Option 1: Do not restrict the ZONE_NORMAL size.
      d. Option 2: Restrict the size to a budget sufficient to cover its
         capacity.
      e. Option 3: ?

   4. Never allow ZONE_NORMAL in non-DRAM:

      a. ZONE_NORMAL cost should be low enough to cover non-DRAM as well.
      b. Any effort to reduce ZONE_NORMAL cost should be welcome.
      c. Matthew's work would mitigate the cost.
      d. Allowing kernel objects to be movable would help here too.

Plus, I think Matthew's effort to reduce ZONE_NORMAL cost is amazing and
I hope it succeeds. However, ZONE_NORMAL cost can be reduced in many
ways, and all such efforts can be considered meaningful. We can start
with the easiest objects, e.g. page tables, struct page, and kernel
stacks, and move on to harder ones, while the struct page cost is being
reduced by Matthew's work at the same time.

When it comes to this topic, the most important thing is a *direction*
agreed on by the community, so that we can start the work under that
*direction*.

	Byungchul
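
For a rough sense of scale of the struct page part of the ZONE_NORMAL
cost discussed above, here is a minimal back-of-envelope sketch. It
assumes 4 KiB base pages and a 64-byte struct page (typical on x86-64;
neither number comes from this thread), and it ignores page tables,
kernel stacks, and every other unmovable allocation, so treat it as an
illustration and not as a measurement.

```python
# Back-of-envelope estimate only. Both constants are assumptions
# (typical x86-64 values), not figures taken from this thread.
PAGE_SIZE = 4096          # assumed base page size in bytes
STRUCT_PAGE_SIZE = 64     # assumed sizeof(struct page) in bytes

GiB = 1 << 30
TiB = 1 << 40


def struct_page_bytes(capacity_bytes: int) -> int:
    """Bytes of struct page metadata needed to describe capacity_bytes."""
    nr_pages = capacity_bytes // PAGE_SIZE
    return nr_pages * STRUCT_PAGE_SIZE


if __name__ == "__main__":
    # Print the metadata footprint for a few example capacities.
    for capacity in (256 * GiB, 1 * TiB, 4 * TiB):
        meta = struct_page_bytes(capacity)
        print(f"{capacity // GiB:5d} GiB of memory -> "
              f"{meta / GiB:.1f} GiB of struct page")
```

Under these assumptions the metadata is 64/4096 of the capacity, about
1.6%, so 1 TiB of added memory needs about 16 GiB of unmovable memory
for struct page alone, wherever that memory ends up being placed.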