From mboxrd@z Thu Jan 1 00:00:00 1970
From: Takero Funaki <flintglass@gmail.com>
Date: Fri, 19 Jul 2024 23:55:04 +0900
Subject: Re: [PATCH v2 0/6] mm: zswap: global shrinker fix and proactive shrink
To: Yosry Ahmed
Cc: Nhat Pham, Johannes Weiner, Chengming Zhou, Jonathan Corbet,
	Andrew Morton, Domenico Cerasuolo, linux-mm@kvack.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20240706022523.1104080-1-flintglass@gmail.com>
Thank you all for your reviews and comments.

On Thu, Jul 18, 2024 at 3:05, Yosry Ahmed wrote:
>
> On Wed, Jul 17, 2024 at 10:49 AM Nhat Pham wrote:
> >
> > On Tue, Jul 16, 2024 at 7:53 PM Yosry Ahmed wrote:
> > >
> > > [..]
> > > > >
> > > > > My concern is that we are knowingly (and perhaps unnecessarily)
> > > > > creating an LRU inversion here - preferring swapping out the rejected
> > > > > pages over the colder pages in the zswap pool. Shouldn't it be the
> > > > > other way around? For instance, can we spiral into the following
> > > > > scenario:
> > > > >
> > > > > 1. zswap pool becomes full.
> > > > > 2. Memory is still tight, so anonymous memory will be reclaimed. zswap
> > > > > keeps rejecting incoming pages, and putting a hold on the global
> > > > > shrinker.
> > > > > 3. The pages that are swapped out are warmer than the ones stored in
> > > > > the zswap pool, so they will be more likely to be swapped in (which,
> > > > > IIUC, will also further delay the global shrinker).
> > > > >
> > > > > and the cycle keeps going on and on?
> > > >
> > > > I agree this does not follow LRU, but I think the LRU priority
> > > > inversion is unavoidable once the pool limit is hit.
> > > > The accept_thr_percent should be lowered to reduce the probability of
> > > > LRU inversion if it matters. (it is why I implemented the proactive
> > > > shrinker.)
> > >
> > > Why?
> > >
> > > Let's take a step back. You are suggesting that we throttle zswap
> > > writeback to allow reclaim to swap out warmer pages to the swap device.
> > > As Nhat said, we are proliferating LRU inversion instead of fixing it.
> > >
> > > I think I had a similar discussion with Johannes about this before,
> > > and we discussed that if zswap becomes full, we should instead
> > > throttle reclaim and allow zswap writeback to proceed (i.e. the
> > > opposite of what this series is doing). This would be similar to how
> > > we throttle reclaim today to wait for dirty pages to be written back.
> > >
> >
> > I completely agree with this analysis and proposal - it's somewhat
> > similar to what I have in mind, but more fleshed out :)
> >
> > > This should reduce/fix the LRU inversion instead of proliferating it,
> > > and it should reduce the total amount of IO as colder pages should go
> > > to disk while warmer pages go to zswap. I am wondering if we can reuse
> > > the reclaim_throttle() mechanism here.
> > >
> > > One concern I have is that we will also throttle file pages if we use
> > > reclaim_throttle(), since I don't see per-type throttling there. This
> > > could be fine, since we similarly throttle zswap reclaim if there are
> > > too many dirty file pages. I am not super familiar with reclaim
> > > throttling, so maybe I missed something obvious or there is a better
> > > way, but I believe that from a high level this should be the right way
> > > to go.
> >
> > I don't think we have any infrastructure for anon-only throttling in
> > vmscan logic, but it sounds trivial to implement if necessary :)
> >
> > >
> > > I actually think if we do this properly, and throttle reclaim when
> > > zswap becomes full, we may be able to drop the acceptance hysteresis
> > > and rely on the throttling mechanism to make sure we stop reclaim
> > > until we free up enough space in zswap to avoid consistently hitting
> > > the limit, but this could be a future extension.
> >
> > Agree - this hysteresis heuristic needs to die.
> >
> > IMHO, I think we should still have the proactive global shrinking
> > action that Takero is proposing in patch 3. The throttling is nice,
> > but it'd be even nicer if we can get ahead of that :)
>
> I have always thought that the shrinker should play this role in one
> way or another. Instead of an arbitrary watermark and asynchronous
> work, it incrementally pushes the zswap LRU toward disk as reclaim
> activity increases.
>
> Is the point behind proactive shrinking to reduce the latency in
> the reclaim path?

For proactive shrinking, I thought the latency and throughput of
pageout should be prioritized, assuming that delaying reclaim progress
by rejection or synchronous writeback is not always acceptable.
Similarly, patch 6 accepted breaking LRU priority to avoid degrading
pageout performance compared to zswap-disabled systems.

But it seems like zswap prefers the LRU heuristics and a larger pool.
The shrinker should write back synchronously after the pool limit is
hit, up to the max pool size, and zswap should apply backpressure to
reclaim, right? If so, my proposal is in the opposite direction.

I will submit patches 1 and 2 as v3.
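
For readers who want to poke at the hysteresis window being discussed,
below is a minimal userspace toy model, not the kernel implementation:
the constants and the toy_* names are invented for the sketch. It only
shows the behavior under discussion: once the pool hits its limit,
stores are rejected (those warmer pages fall through to the swap
device) until usage drains back below the accept threshold.

/*
 * Toy userspace model of zswap's acceptance hysteresis.
 * Not kernel code; numbers and names are illustrative only.
 */
#include <stdbool.h>
#include <stdio.h>

static unsigned long max_pool_pages = 1000;	/* stand-in for the pool limit */
static unsigned int accept_thr_percent = 90;	/* like accept_threshold_percent */
static unsigned long pool_pages;
static bool pool_reached_full;

static unsigned long accept_thr_pages(void)
{
	return max_pool_pages * accept_thr_percent / 100;
}

/* Returns true if the page is kept in the pool, false if rejected. */
static bool toy_zswap_store(void)
{
	if (pool_reached_full) {
		if (pool_pages >= accept_thr_pages())
			return false;	/* rejected: reclaim swaps this page out */
		pool_reached_full = false;
	}
	if (pool_pages >= max_pool_pages) {
		pool_reached_full = true;
		return false;
	}
	pool_pages++;
	return true;
}

/* One step of shrinker writeback: the coldest entry goes to disk. */
static void toy_zswap_writeback_one(void)
{
	if (pool_pages)
		pool_pages--;
}

int main(void)
{
	int rejected = 0;

	/* Reclaim tries to store more pages than the pool can hold. */
	for (int i = 0; i < 1200; i++)
		if (!toy_zswap_store())
			rejected++;	/* these warmer pages went to disk */
	printf("pool=%lu rejected=%d full=%d\n",
	       pool_pages, rejected, pool_reached_full);

	/* Shrinker drains the pool until it is below the accept threshold. */
	while (pool_pages >= accept_thr_pages())
		toy_zswap_writeback_one();
	printf("accepting again: %d\n", toy_zswap_store());
	return 0;
}

If I read mm/zswap.c correctly, the rough counterpart of the flag above
is zswap_pool_reached_full together with the accept-threshold check in
zswap_store(); the toy only illustrates the window in which rejected,
warmer pages bypass the pool while colder entries stay compressed.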