Date: Fri, 7 Feb 2025 03:57:45 -0500
From: Gregory Price
To: Byungchul Park
Cc: Matthew Wilcox, Hyeonggon Yoo <42.hyeyoo@gmail.com>,
    lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
    linux-cxl@vger.kernel.org, Honggyu Kim, kernel_team@skhynix.com
Subject: Re: [LSF/MM/BPF TOPIC] Restricting or migrating unmovable kernel allocations from slow tier
References: <20250207072024.GA48419@system.software.com>
In-Reply-To: <20250207072024.GA48419@system.software.com>

On Fri, Feb 07, 2025 at 04:20:24PM +0900, Byungchul Park wrote:
> On Sat, Feb 01, 2025 at 02:04:17PM +0000, Matthew Wilcox wrote:
>
> We can work with from the easiest object
> e.g. page table

It's more efficient and easier to change page sizes than it is to make
page tables migratable.  It's also easier to reclaim cold pages, which
eat up significantly more memory than the page tables mapping them (a
page table describes pages at ~8 bytes per page).

Also, there's quite a bit of literature showing that page tables landing
on remote nodes (cross-socket) hurts performance.  Putting them on CXL
makes the problem worse.

> struct page,

`struct page` is a structure that describes a physically addressed page.
It is commonly accessed by simply doing `pfn_to_page()`, which is a
fairly simple conversion (a bit more complex in sparsemem w/ sections).

This is used in a lockless manner to acquire page references all over
the kernel.  Making that migratable is... ambitious, to say the least.

> and kernel stack,

The default kernel stack size is about 16KB.  You'd need roughly 100,000
threads to eat up 1.5GB, and 2048 threads only eat 32MB.

It's not an interesting amount of memory if you have a 20TB system.

> When it comes to this topic, the most important thing is the collected
> *direction* from the community so that we can start the work under the
> *direction*.
>

My thoughts here are that memory tiering is the wrong tool for the
problem you are trying to solve.

Maybe there's a world in which we propose a ZONE_MEMDESC which is
exclusively used for `struct page` for a node.  At least then you could
design CXL capacities *around* that.

~Gregory
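
For reference, the back-of-envelope numbers in this message (kernel stack
totals, and page table cost at ~8 bytes per page) can be checked with a
short sketch.  It assumes 4KB base pages, 8-byte last-level page table
entries, and 16KB kernel stacks; it ignores upper page table levels, and
the constant and function names are made up for illustration rather than
taken from the kernel.

```python
# Back-of-envelope overheads. All constants are assumptions for illustration
# (x86-64-style defaults), not values read from a running kernel.
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB
TIB = 1024 * GIB

PAGE_SIZE = 4 * KIB      # assumed base page size
PTE_SIZE = 8             # assumed bytes per last-level page table entry
STACK_SIZE = 16 * KIB    # assumed kernel stack size per thread


def stack_bytes(threads: int) -> int:
    """Total kernel stack memory for a given number of threads."""
    return threads * STACK_SIZE


def pte_bytes(mapped_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """Last-level page table bytes needed to map `mapped_bytes` once."""
    pages = -(-mapped_bytes // page_size)  # ceiling division
    return pages * PTE_SIZE


if __name__ == "__main__":
    print(f"2048 threads:    {stack_bytes(2048) / MIB:.0f} MiB of stacks")
    print(f"100,000 threads: {stack_bytes(100_000) / GIB:.2f} GiB of stacks")
    print(f"PTEs for 1 TiB @ 4K: {pte_bytes(TIB) / GIB:.0f} GiB")
    print(f"PTEs for 1 TiB @ 2M: {pte_bytes(TIB, 2 * MIB) / MIB:.0f} MiB")
```

Under those assumptions, 2048 threads come to 32 MiB of stacks and
100,000 threads to about 1.53 GiB, while mapping 1 TiB once costs 2 GiB
of last-level entries with 4K pages versus 4 MiB with 2M pages, which is
why changing page sizes is the cheaper lever.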