From: Chris Li
Date: Sun, 19 Nov 2023 00:23:49 -0800
Subject: Re: [External] Re: [PATCH] mm:zswap: fix zswap entry reclamation failure in two scenarios
To: Zhongkun He
Cc: Yosry Ahmed, Andrew Morton, Johannes Weiner, Nhat Pham, Seth Jennings, Dan Streetman, Vitaly Wool, linux-mm, LKML, Ying
References: <20231113130601.3350915-1-hezhongkun.hzk@bytedance.com>

Hi Zhongkun,

On Fri, Nov 17, 2023 at 5:46 PM Zhongkun He wrote:
> > Can you help me understand how much memory you can free from this
> > patch? For example, are we talking about a few pages or a few GB?
> >
> > Where does the freed memory come from?
> > If the memory comes from zswap entry struct. Due to the slab allocator
> > fragmentation. It would take a lot of zswap entries to have meaningful
> > memory reclaimed from the slab allocator.
> >
> > If the memory comes from the swap cached pages, that would be much
> > more meaningful. But that is not what this patch is doing, right?
> >
> > Chris
>
> It's my bad for putting two cases together. The memory released in both
> cases comes from zswap entry struct and zswap compressed page.

Thanks for the clarification. Keep in mind that freeing memory from a
zswap entry and the zpool does not directly translate into freed pages.
If a page still holds other zswap entries or zsmalloc objects that have
not been freed, that page will not be returned to the system. That is
the fragmentation cost I was talking about.

With this consideration, do you know how many extra pages this patch can
release back to the system in your use case? If the difference is very
small, it might not be worth the extra complexity to release those.

> The original intention of this patch is to solve the problem that
> shrink_work() fails to reclaim memory in two situations.
>
> For case (1), the zswap_writeback_entry() will failed for the
> __read_swap_cache_async return NULL because the swap has been
> freed but cached in swap_slots_cache, so the memory come from
> the zswap entry struct and compressed page.

In those cases, if we drop the swap_slots_cache, it will also free
those zswap entries and compressed pages (zpool), right?

> Count = SWAP_BATCH * ncpu.

That is the upper limit. Not all CPUs have their swap batches fully loaded.

> Solution: move the zswap_invalidate() out of batches, free it once the swap
> count equal to 0.

Per the previous discussion, this will have an impact on the
swap_slots_cache behavior.
We need some data points for a cost-benefit analysis.

> For case (2), the zswap_writeback_entry() will failed for !page_was_allocated
> because zswap_load will have two copies of the same page in memory
> (compressed and uncompressed) after faulting in a page from zswap when
> zswap_exclusive_loads disabled. The amount of memory is greater but depends
> on the usage.

That basically disables the future swap-out optimization that skips the
page IO write if the page hasn't changed. If the system is low on
memory, that is justifiable. Again, it seems we can have a pass to drop
the compressed memory if the swap count is zero (and mark the page dirty).

> > Why do we need to release them?
> Consider this scenario, there is a lot of data cached in memory and zswap,
> hit the limit, and shrink_worker will fail. The new coming data will be written

Yes, the shrink_worker will need to allocate a page to store the
uncompressed data for writeback.

> directly to swap due to zswap_store failure. Should we free the last one
> to store the latest one in zswap.

By "the last one", do you mean the most recent zswap entry written into
zswap? Well, you need to allocate a page to write it out, and that is an
async process. Shrinking the zpool at that point is already too late.

> According to the previous discussion, the writeback is inevitable.
> So I want to make zswap_exclusive_loads_enabled the default behavior
> or make it the only way to do zswap loads. It only makes sense when

We need some data points for how often we swap a page out to zswap again
in cases where the zswap write could be saved by reusing the existing
compressed data.

It is totally possible that this page IO write-out optimization is not
worthwhile for zswap.
We need some data to support that claim.

> the page is read and no longer dirty. If the page is read frequently, it
> should stay in cache rather than zswap. The benefit of doing this is
> very small, i.e. two copies of the same page in memory.

If the benefit of doing this is very small, that seems to be an
argument against this patch?
Again, we need some data points for a cost-benefit analysis.

Chris