From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2FCCAC3DA7F for ; Mon, 5 Aug 2024 23:58:46 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B1F5B6B007B; Mon, 5 Aug 2024 19:58:45 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id ACFA56B0082; Mon, 5 Aug 2024 19:58:45 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 996CE6B0083; Mon, 5 Aug 2024 19:58:45 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 74B6F6B007B for ; Mon, 5 Aug 2024 19:58:45 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 22BC6A4ACF for ; Mon, 5 Aug 2024 23:58:45 +0000 (UTC) X-FDA: 82419859410.14.EDA92EE Received: from mail-ej1-f51.google.com (mail-ej1-f51.google.com [209.85.218.51]) by imf15.hostedemail.com (Postfix) with ESMTP id 4C033A000A for ; Mon, 5 Aug 2024 23:58:43 +0000 (UTC) Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=hBdSlCzv; spf=pass (imf15.hostedemail.com: domain of yosryahmed@google.com designates 209.85.218.51 as permitted sender) smtp.mailfrom=yosryahmed@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1722902262; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=cQU7vbPtaOjGnv/hbhN+ll2t7sFr0pUqNwdbWYE25jw=; b=HegovrKnHUaCf5IswqO5xExHI5XDcUZ5wJ34w17b/pTGUnZiUGYbDyaHnKw/OEu/9JfvwM bYJUhiwqyJ14hNS76Y9BPwz3j7I5jRp6YTwIouRfrxl9OLbya37Q76mkGYZHUvvKoAxN7t LTQwQWOytCcF4SXTDdot5+G1+s10VOo= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1722902262; a=rsa-sha256; cv=none; b=qophxvmJqBLKL5JHFsc/vwFqjkdHBp+RD5MNux8KdzzgQKwb7qBlzJTvMxN5KXQS5DGGC2 EREynH2kugr2paTgu0Xb8Ip5J+d2RRTiGec4g2p0xlcL77WBTB3pYNLgzPIoZEpLEYhS/w 4+UgIG+kRIF6TPvvkbOcu6OBTvbu8Dc= ARC-Authentication-Results: i=1; imf15.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=hBdSlCzv; spf=pass (imf15.hostedemail.com: domain of yosryahmed@google.com designates 209.85.218.51 as permitted sender) smtp.mailfrom=yosryahmed@google.com; dmarc=pass (policy=reject) header.from=google.com Received: by mail-ej1-f51.google.com with SMTP id a640c23a62f3a-a7d2a9a23d9so12828066b.3 for ; Mon, 05 Aug 2024 16:58:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1722902322; x=1723507122; darn=kvack.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=cQU7vbPtaOjGnv/hbhN+ll2t7sFr0pUqNwdbWYE25jw=; b=hBdSlCzv8IMe0vu79JDkH42LXWJLp4dNj1VIhc65q15vkcDzec4IHaNR2PX6V9LOly 6ryoSFZyyjKIbANCaqs4DzvoqImYUf7tubBssWjIjVGPoog/9aje8NspRXz3jlvQLuCc GfvAE5D6uD8XCohKdCA0Yr0dAuFzAq2oflLmsREO8XwrNnSENdST0LWacBukrByi8wuG M4b5zy9pt8sOtGs6KmNiqruIL4Kl5X2lzifrygDKDsMtQLkIS5RC57fxGvHoyrZ2TXVa x5Ug45Vq3OO/b25uo43StqfPu6TcqnOxCpDEdOP3eXdNWaiLEGmt2DtJWfbbjL8g2CNY Q7ew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1722902322; x=1723507122; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=cQU7vbPtaOjGnv/hbhN+ll2t7sFr0pUqNwdbWYE25jw=; b=Pu8g2t/lR0HDZZjenQJKCo1lRS4PbTencuiERl9V7P4QGXtsia0p10++7qFraO8+c0 VM8ekWVqIm71tbOIWxve9MrSbU6sCQC/GWYnl21Bof4K4w8f4WrMDX0mw/2L6QBR0lVX exNvpdwSDUzavWyuj4g2Ze4sVg1ct49xEnCu6C8SrM0Xy4/pRof7atioyopD7d6lhZq2 /9Qpxv+aW/cMcQJzOxqYdz/GJla6EzWvJk6VUwLiLAIPRfS1R6t19YmIomnSZwe7ZOBY LVriiUO8yzG2fdkpGFcQ3HRlmPbrYDEWXacMf28vkhn6AxnvLt0uPlqLwFhyow9KVLJ8 tlfw== X-Forwarded-Encrypted: i=1; AJvYcCWHylak5UD9dw/tvc11YaTeb9AfO84bfMMH9THViA2cjeCblx6kdmwI+vbVNmtvRBvT0k1To7V5ZPyilnpIbaAZaWY= X-Gm-Message-State: AOJu0Ywwhd/wq+zZZ63h0GJFpwu1xz1SiQgHE8bWryVqZSOECvpn+OPA fhbpN5JmPRtBMYHXShoqSsyIwizcobKYXvPcHCuoPGC9h+AO2XGsUfoc75VBCPdoSlq+QhOwxsD KP0dC6bcYM47O6eB0hFRgePUp0VdcmneLL21/ X-Google-Smtp-Source: AGHT+IFhzRWuwR+Bscc36sL5v+77kZ17PKyFf336ahwHyrg3XZAIIO4MGEqH31ONzuCB1w/8YnlnkSEK4Uq5DsRxr2s= X-Received: by 2002:a17:907:7e87:b0:a7a:ab8a:391 with SMTP id a640c23a62f3a-a7dc509fc5dmr1018125166b.45.1722902321079; Mon, 05 Aug 2024 16:58:41 -0700 (PDT) MIME-Version: 1.0 References: <20240730222707.2324536-1-nphamcs@gmail.com> <20240730222707.2324536-2-nphamcs@gmail.com> In-Reply-To: From: Yosry Ahmed Date: Mon, 5 Aug 2024 16:58:04 -0700 Message-ID: Subject: Re: [PATCH v2 1/2] zswap: implement a second chance algorithm for dynamic zswap shrinker To: Nhat Pham Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, shakeel.butt@linux.dev, linux-mm@kvack.org, kernel-team@meta.com, linux-kernel@vger.kernel.org, flintglass@gmail.com, chengming.zhou@linux.dev Content-Type: text/plain; charset="UTF-8" X-Rspamd-Queue-Id: 4C033A000A X-Stat-Signature: hky9rgfrhnn496sok64ey8a9u3td7uqr X-Rspamd-Server: rspam09 X-Rspam-User: X-HE-Tag: 1722902323-248651 X-HE-Meta: U2FsdGVkX1/i0hsGJbd2ffBSZcS0Vlk+COLHYkeNexVfUCF0CZiEUhmrhaHyCyW0Dc0cc6InPYdu1KFi/F60QUjNNNPC5o7Mn7RGdRf7TqekpeKg/WFRq3jA539CAkg/ZCH1xJvTQAV5f42VVbA5oAVQHRqLqvoByDU0Fq9zOtSV8PqY4mV1Oku0athOcyLq7NGI02BhA0euL3Yn/yq262lrAwdzmyYmrH0VfPMozceUx1SYtEtOfN2uWiO/ssXqHDDTig90QOIWiXqnUEHf1lD9UsvSWc68jCSm9V6gX2C0JEVLzNMcRo2eRsL45s0NSSU+IpiLEPt5GifrfNQJfr4VOpr+6K40RtkTPm3AxCdNc19VdEFMr5THjfFY119h+JT6W3p2mxJ8E3fDdWYvNVT69t6bmGwpglDwhbZQw3Nz9Y+fS2biplHnlIfZWZkKxZVIidKvFEzsAXMA2CWioBfcIE0ZFBUXdJJTRtpH21eH3PsPEmWiCmClKmYHsYLTVvm5zPzYMHIEV5wRqnf6er7of6hBdqTgi0OVLOZ5OCFRLa+mObZxulTf+AQT7lS5g1u09KuzzO60MP2ditU2pRLcRGHUHCiVyBZrsD9oZ7B64zjZmHqWVTG8VPDbgi4jcWwfBhp/wcARTMcHn3RS47OvyCBkl7Lrdjkzp33ymraG1ojcKO6yXWCp2VD7zJlHTrvGX4aGH4sJl3bSPi3UwsSBA/UmDgoV2E1Lgy3EECd7VZYivC5J+FBYiR8z/O2mWkHMUGVTANqVjTTh+JloISAEVEDEuKR4RTmmr6aXjHw1Xkd/nmGDGkjOOulddNg2V7e9HtWXEgvViiNVv9TmabLKGuU5L6T+jVbv8Yf2Tozli1OB4dyqGF8Ia0Rz0VFLGWh8L3MDzo8P4mxjhG4UtKh0Nogu9JPhwS7niF0hajcmTFf1i5gkHI0pNHuWQmgWAkt/MillEMQ/0OfWzGL oibQfMLM XFN7RxtLc7hOGm4WxYl0xB/Kc11YKowCEAKXHIhafTzqaRfp4AsJJh22mCiBSuKML/PmKQkyrAI6UAv7eS4NsHLCr1QAk2nbxd7BMDdMopL5bMElxBe0dLR+U1Y1Pqm3Fg/15Y+Jmv81KHlMme4f+jNzQ6wGSRzsGK/f4GpsBHkXOglI= X-Bogosity: Ham, tests=bogofilter, spamicity=0.001438, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: [..] > > > @@ -1167,25 +1189,6 @@ static unsigned long zswap_shrinker_scan(struct shrinker *shrinker, > > > return SHRINK_STOP; > > > } > > > > > > - nr_protected = > > > - atomic_long_read(&lruvec->zswap_lruvec_state.nr_zswap_protected); > > > - lru_size = list_lru_shrink_count(&zswap_list_lru, sc); > > > - > > > - /* > > > - * Abort if we are shrinking into the protected region. > > > - * > > > - * This short-circuiting is necessary because if we have too many multiple > > > - * concurrent reclaimers getting the freeable zswap object counts at the > > > - * same time (before any of them made reasonable progress), the total > > > - * number of reclaimed objects might be more than the number of unprotected > > > - * objects (i.e the reclaimers will reclaim into the protected area of the > > > - * zswap LRU). > > > - */ > > > - if (nr_protected >= lru_size - sc->nr_to_scan) { > > > - sc->nr_scanned = 0; > > > - return SHRINK_STOP; > > > - } > > > - > > > > Do we need a similar mechanism to protect against concurrent shrinkers > > quickly consuming nr_swapins? > > Not for nr_swapins consumption per se, and the original reason why I > included this (racy) check is just so that concurrent reclaimers do > not disrespect the protection scheme. We had no guarantee that we > wouldn't just reclaim into the protected region (well even with this > racy check technically). With the second chance scheme, a "protected" > page (i.e with its referenced bit set) would not be reclaimed right > away - a shrinker encountering it would have to "age" it first (by > unsetting the referenced bit), so the intended protection is enforced. > > That said, I do believe we need a mechanism to limit the concurrency > here. The amount of pages aged/reclaimed should scale (linearly? > proportionally?) with the reclaim pressure, i.e more reclaimers == > more pages reclaimed/aged, so the current behavior is desired. > However, at some point, if we have more shrinkers than there are work > assigned to each of them, we might be unnecessarily wasting resources > (and potentially building up the nr_deferred counter that we discussed > in v1 of the patch series). Additionally, we might be overshrinking in > a very short amount of time, without letting the system have the > chance to react and provide feedback (through swapins/refaults) to the > memory reclaimers. > > But let's do this as a follow-up work :) It seems orthogonal to what > we have here. Agreed, as long as the data shows we don't regress by removing this part I am fine with doing this as a follow-up work. > > > > - * Subtract the lru size by an estimate of the number of pages > > > - * that should be protected. > > > + * Subtract the lru size by the number of pages that are recently swapped > > > > nit: I don't think "subtract by" is correct, it's usually "subtract > > from". So maybe "Subtract the number of pages that are recently > > swapped in from the lru size"? Also, should we remain consistent about > > mentioning that these are disk swapins throughout all the comments to > > keep things clear? > > Yeah I should be clearer here - it should be swapped in from disk, or > more generally (accurately?) swapped in from the backing swap device > (but the latter can change once we decoupled swap from zswap). Or > maybe swapped in from the secondary tier? > > Let's just not overthink and go with swapped in from disk for now :) Agreed :) I will take a look at the new version soon, thanks for working on this.