From: Yosry Ahmed <yosryahmed@google.com>
Date: Wed, 17 Jul 2024 11:05:06 -0700
Subject: Re: [PATCH v2 0/6] mm: zswap: global shrinker fix and proactive shrink
To: Nhat Pham
Cc: Takero Funaki, Johannes Weiner, Chengming Zhou, Jonathan Corbet, Andrew Morton, Domenico Cerasuolo, linux-mm@kvack.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20240706022523.1104080-1-flintglass@gmail.com>
On Wed, Jul 17, 2024 at 10:49 AM Nhat Pham wrote:
>
> On Tue, Jul 16, 2024 at 7:53 PM Yosry Ahmed wrote:
> >
> > [..]
> >
> > > > My concern is that we are knowingly (and perhaps unnecessarily)
> > > > creating an LRU inversion here - preferring swapping out the rejected
> > > > pages over the colder pages in the zswap pool. Shouldn't it be the
> > > > other way around? For instance, can we spiral into the following
> > > > scenario:
> > > >
> > > > 1. zswap pool becomes full.
> > > > 2. Memory is still tight, so anonymous memory will be reclaimed.
> > > > zswap keeps rejecting incoming pages, and putting a hold on the
> > > > global shrinker.
> > > > 3. The pages that are swapped out are warmer than the ones stored in
> > > > the zswap pool, so they will be more likely to be swapped in (which,
> > > > IIUC, will also further delay the global shrinker).
> > > >
> > > > and the cycle keeps going on and on?
> > >
> > > I agree this does not follow LRU, but I think the LRU priority
> > > inversion is unavoidable once the pool limit is hit.
> > > The accept_thr_percent should be lowered to reduce the probability of
> > > LRU inversion if it matters. (it is why I implemented proactive
> > > shrinker.)
> >
> > Why?
> >
> > Let's take a step back. You are suggesting that we throttle zswap
> > writeback to allow reclaim to swap out warmer pages to the swap device.
> > As Nhat said, we are proliferating LRU inversion instead of fixing it.
> >
> > I think I had a similar discussion with Johannes about this before,
> > and we discussed that if zswap becomes full, we should instead
> > throttle reclaim and allow zswap writeback to proceed (i.e. the
> > opposite of what this series is doing). This would be similar to how
> > we throttle reclaim today to wait for dirty pages to be written back.
>
> I completely agree with this analysis and proposal - it's somewhat
> similar to what I have in mind, but more fleshed out :)
>
> > This should reduce/fix the LRU inversion instead of proliferating it,
> > and it should reduce the total amount of IO as colder pages should go
> > to disk while warmer pages go to zswap. I am wondering if we can reuse
> > the reclaim_throttle() mechanism here.
> >
> > One concern I have is that we will also throttle file pages if we use
> > reclaim_throttle(), since I don't see per-type throttling there. This
> > could be fine, since we similarly throttle zswap reclaim if there are
> > too many dirty file pages.
> > I am not super familiar with reclaim throttling, so maybe I missed
> > something obvious or there is a better way, but I believe that from a
> > high level this should be the right way to go.
>
> I don't think we have any infrastructure for anon-only throttling in
> vmscan logic, but it sounds trivial to implement if necessary :)
>
> > I actually think if we do this properly, and throttle reclaim when
> > zswap becomes full, we may be able to drop the acceptance hysteresis
> > and rely on the throttling mechanism to make sure we stop reclaim
> > until we free up enough space in zswap to avoid consistently hitting
> > the limit, but this could be a future extension.
>
> Agree - this hysteresis heuristic needs to die.
>
> IMHO, I think we should still have the proactive global shrinking
> action that Takero is proposing in patch 3. The throttling is nice,
> but it'd be even nicer if we can get ahead of that :)

I have always thought that the shrinker should play this role in one
way or another. Instead of an arbitrary watermark and asynchronous
work, it incrementally pushes the zswap LRU toward disk as reclaim
activity increases. Is the point behind proactive shrinking to reduce
the latency in the reclaim path?