Date: Mon, 31 Mar 2025 12:53:06 -0400
From: Johannes Weiner
To: Yosry Ahmed
Cc: Nhat Pham, linux-mm@kvack.org, akpm@linux-foundation.org,
 chengming.zhou@linux.dev, sj@kernel.org, kernel-team@meta.com,
 linux-kernel@vger.kernel.org, gourry@gourry.net, willy@infradead.org,
 ying.huang@linux.alibaba.com, jonathan.cameron@huawei.com,
 dan.j.williams@intel.com, linux-cxl@vger.kernel.org, minchan@kernel.org,
 senozhatsky@chromium.org
Subject: Re: [RFC PATCH 0/2] zswap: fix placement inversion in memory
 tiering systems
Message-ID: <20250331165306.GC2110528@cmpxchg.org>
References: <20250329110230.2459730-1-nphamcs@gmail.com>
 <2759fa95d0071f3c5e33a9c6369f0d0bcecd76b7@linux.dev>
In-Reply-To: <2759fa95d0071f3c5e33a9c6369f0d0bcecd76b7@linux.dev>

On Sat, Mar 29, 2025 at 07:53:23PM +0000, Yosry Ahmed wrote:
> March 29, 2025 at 1:02 PM, "Nhat Pham" wrote:
>
> > Currently, systems with CXL-based memory tiering can encounter the
> > following inversion with zswap: the coldest pages demoted to the CXL
> > tier can return to the high tier when they are zswapped out,
> > creating memory pressure on the high tier.
> >
> > This happens because zsmalloc, zswap's backend memory allocator, does
> > not enforce any memory policy. If the task reclaiming memory follows
> > the local-first policy for example, the memory requested for zswap can
> > be served by the upper tier, leading to the aforementioned inversion.
> >
> > This RFC fixes this inversion by adding a new memory allocation mode
> > for zswap (exposed through a zswap sysfs knob), intended for
> > hosts with CXL, where the memory for the compressed object is requested
> > preferentially from the same node that the original page resides on.
>
> I didn't look too closely, but why not just prefer the same node by
> default? Why is a knob needed?

+1

It should really be the default. Even on regular NUMA setups this
behavior makes more sense. Consider a direct reclaimer scanning nodes
in order of allocation preference. If it ventures into remote nodes,
the memory it compresses there should stay there. Trying to shift
those contents over to the reclaiming thread's preferred node further
*increases* its local pressure, provoking more spills. The remote
node is also the most likely to refault this data again. This is just
bad for everybody.

> Or maybe if there's a way to tell the "tier" of the node we can
> prefer to allocate from the same "tier"?

Presumably, other nodes in the same tier would come first in the
fallback zonelist of that node, so page_to_nid() should just work.
I wouldn't complicate this until somebody has real systems where it
does the wrong thing.

My vote is to stick with page_to_nid(), but do it unconditionally.
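
To make the placement argument concrete, here is a minimal userspace
sketch. It is a toy model and not kernel code: it assumes two nodes with
simple free-page counters, and every name in it (toy_page,
toy_page_to_nid(), toy_alloc_on_node(), the two compress_*() helpers) is
hypothetical and exists only for illustration. It contrasts following
the reclaiming task's local-first preference with keeping the compressed
copy on the node the original page lives on.

```c
#include <assert.h>

/* Toy model only: two nodes, each with a counter of free pages. */
#define NR_NODES 2

struct toy_page {
	int nid;	/* node the uncompressed page resides on */
};

static int free_pages[NR_NODES];

/* Hypothetical stand-in for the kernel's page_to_nid(). */
static int toy_page_to_nid(const struct toy_page *page)
{
	return page->nid;
}

/*
 * Take one page of backing storage, preferring node @nid and falling
 * back to the remaining nodes in order when @nid has nothing free.
 * Returns the node that served the request, or -1 if all are empty.
 */
static int toy_alloc_on_node(int nid)
{
	assert(nid >= 0 && nid < NR_NODES);

	for (int i = 0; i < NR_NODES; i++) {
		int candidate = (nid + i) % NR_NODES;

		if (free_pages[candidate] > 0) {
			free_pages[candidate]--;
			return candidate;
		}
	}
	return -1;
}

/* Policy A: allocate wherever the reclaiming task prefers (local-first). */
static int compress_task_local(const struct toy_page *page, int task_nid)
{
	(void)page;
	return toy_alloc_on_node(task_nid);
}

/* Policy B: keep the compressed copy on the page's own node. */
static int compress_same_node(const struct toy_page *page, int task_nid)
{
	(void)task_nid;
	return toy_alloc_on_node(toy_page_to_nid(page));
}
```

With a task running on node 0 that reclaims a page from node 1, policy A
consumes node 0's free memory (the inversion described above), while
policy B leaves node 0 alone and only falls back to it once node 1 has
nothing left to give.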