From mboxrd@z Thu Jan 1 00:00:00 1970
From: Nhat Pham <nphamcs@gmail.com>
Date: Thu, 18 Jan 2024 10:32:51 -0800
Subject: Re: [PATCH] mm/zswap: Improve with alloc_workqueue() call
To: Johannes Weiner
Cc: Yosry Ahmed, Ronald Monthero, sjenning@redhat.com, ddstreet@ieee.org,
 vitaly.wool@konsulko.com, akpm@linux-foundation.org, chrisl@kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20240116133145.12454-1-debug.penguin32@gmail.com>
 <20240118161601.GJ939255@cmpxchg.org>
 <20240118173927.GL939255@cmpxchg.org>
In-Reply-To: <20240118173927.GL939255@cmpxchg.org>
Content-Type: text/plain; charset="UTF-8"
On Thu, Jan 18, 2024 at 9:39 AM Johannes Weiner wrote:
>
> On Thu, Jan 18, 2024 at 09:06:43AM -0800, Yosry Ahmed wrote:
> > > > > On a different note, I wonder if it would help to perform synchronous
> > > > > reclaim here instead.
> > > > > With our current design, the zswap store failure
> > > > > (due to global limit hit) would leave the incoming page going to swap
> > > > > instead, creating an LRU inversion. Not sure if that's ideal.
> > > >
> > > > The global shrink path keeps reclaiming until zswap can accept again
> > > > (by default, that means reclaiming 10% of the total limit). I think
> > > > this is too expensive to be done synchronously.
> > >
> > > That thresholding code is a bit weird right now.
> > >
> > > It wakes the shrinker and rejects at the same time. We're guaranteed
> > > to see rejections, even if the shrinker has no trouble flushing some
> > > entries a split second later.
> > >
> > > It would make more sense to wake the shrinker at e.g. 95% full and
> > > have it run until 90%.

Yep, we should be reclaiming zswap objects well ahead of the pool limit.
Hence the new shrinker, which is memory pressure-driven (i.e. independent
of zswap's internal limits), and will typically be triggered even when the
pool is not full. In my experiments, I never observed the pool becoming
full with the default settings. I'd be happy to extend it (or build in
extra shrinking logic) to cover these pool limits too, if that turns out
to be necessary.

> > >
> > > But with that in place we also *should* do synchronous reclaim once we
> > > hit 100%. Just enough to make room for the store. This is important to
> > > catch the case where reclaim rate exceeds swapout rate. Rejecting and
> > > going to swap means the reclaimer will be throttled down to IO rate
> > > anyway, and the app latency isn't any worse. But this way we keep the
> > > pipeline alive, and keep swapping out the oldest zswap entries,
> > > instead of rejecting and swapping what would be the hottest ones.
> >
> > I fully agree with the thresholding code being weird, and with waking
> > up the shrinker before the pool is full.
> > What I don't understand is
> > how we can do synchronous reclaim when we hit 100% and still respect
> > the acceptance threshold :/
> >
> > Are you proposing we change the semantics of the acceptance threshold
> > to begin with?
>
> I kind of am. It's worth looking at the history of this knob.
>
> It was added in 2020 by 45190f01dd402112d3d22c0ddc4152994f9e1e55, and
> from the changelogs and the code in this patch I do not understand how
> this was supposed to work.
>
> It also *didn't* work for very basic real world applications. See
> Domenico's follow-up (e0228d590beb0d0af345c58a282f01afac5c57f3), which
> effectively reverted it to get halfway reasonable behavior.
>
> If there are no good usecases for this knob, then I think it makes
> sense to phase it out again.

Yeah, this was my original proposal - remove this knob altogether :)

From a cursory read, it seems zswap originally tried to shrink one
"object" synchronously, then checked whether the pool size was back under
the limit. That is indeed insufficient. However, I'm not quite convinced
by the solution (hysteresis) either.

Maybe we can synchronously shrink a la Domenico, i.e. until the pool can
accept new pages, but this time capacity-based (say, until we are under
the limit plus some headroom - one page, for example)? This is just so
that the immediate incoming zswap store succeeds - we can still have the
shrinker freeing up space later on (or maybe keep an asynchronous
pool-limit-based shrinker around).