From: Nhat Pham
Date: Sat, 18 Nov 2023 13:43:52 -0500
Subject: Re: [External] Re: [PATCH] mm:zswap: fix zswap entry reclamation failure in two scenarios
To: Zhongkun He
Cc: Chris Li, Yosry Ahmed, Andrew Morton, Johannes Weiner, Seth Jennings, Dan Streetman, Vitaly Wool, linux-mm, LKML, Ying
References: <20231113130601.3350915-1-hezhongkun.hzk@bytedance.com>

On Fri, Nov 17, 2023 at 8:46 PM Zhongkun He wrote:
>
> Hi Chris, thanks for your time.
>
> >
> > On Fri, Nov 17, 2023 at 1:56 AM Zhongkun He
> > wrote:
> > > Hi Chris, thanks for your feedback. I have the same concerns,
> > > maybe we should just move the zswap_invalidate() out of batches,
> > > as Yosry mentioned above.
> >
> > As I replied in the previous email, I just want to understand the
> > other side effects of the change better.
> >
> > To me, this patch is actually freeing memory that does not require
> > an actual page IO write from zswap, which means the memory is from
> > some kind of cache. It would be interesting if we could avoid
> > complicating the writeback path further. Instead, we can drop that
> > memory from the different caches if needed. I assume those caches are
> > doing something useful in the common case. If not, we should have a
> > patch to remove these caches instead. Not sure how big a mess it would
> > be to separate the write path from dropping the caches.
> >
> > While you are here, I have some questions for you.
> >
> > Can you help me understand how much memory you can free with this
> > patch? For example, are we talking about a few pages or a few GB?
> >
> > Where does the freed memory come from?
> > If the memory comes from the zswap entry struct, then due to slab
> > allocator fragmentation it would take a lot of zswap entries to have
> > meaningful memory reclaimed from the slab allocator.
> >
> > If the memory comes from the swap cached pages, that would be much
> > more meaningful. But that is not what this patch is doing, right?
> >
> > Chris
>
> It's my bad for putting two cases together. The memory released in both
> cases comes from the zswap entry struct and the zswap compressed page.
>
> The original intention of this patch is to solve the problem that
> shrink_work() fails to reclaim memory in two situations.
>
> For case (1), zswap_writeback_entry() will fail because
> __read_swap_cache_async() returns NULL: the swap slot has been freed
> but is still cached in swap_slots_cache. So the memory comes from
> the zswap entry struct and the compressed page.
> Count = SWAP_BATCH * ncpu.
> Solution: move zswap_invalidate() out of the batches and free the entry
> once the swap count equals 0.
>
> For case (2), zswap_writeback_entry() will fail on !page_was_allocated,
> because zswap_load() leaves two copies of the same page in memory
> (compressed and uncompressed) after faulting in a page from zswap when
> zswap_exclusive_loads is disabled. The amount of memory is greater but
> depends on the usage.
>
> Why do we need to release them?
> Consider this scenario: there is a lot of data cached in memory and
> zswap, zswap hits the limit, and shrink_worker fails. New incoming data
> will then be written directly to swap because zswap_store fails. Should
> we free the last one to store the latest one in zswap?

Shameless plug: zswap is much less likely to hit the limit (global or
cgroup) with the shrinker enabled ;) It will proactively reclaim the
objects way ahead of the limit.

It comes with its own can of worms, of course - it's unlikely to work
for all workloads in its current form, but perhaps it is worth
experimenting with/improving upon?

>
> According to the previous discussion, the writeback is inevitable.
> So I want to make zswap_exclusive_loads_enabled the default behavior
> or make it the only way to do zswap loads. It only makes sense when
> the page is read and no longer dirty. If the page is read frequently, it
> should stay in cache rather than zswap. The benefit of doing this is
> very small, i.e. two copies of the same page in memory.
>
> Thanks again.
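
To put rough numbers on the two cases quoted above, here is a
back-of-the-envelope sketch in Python. It is only a toy model, not kernel
code: SWAP_BATCH = 64, a 4 KiB page size, and "a compressed object never
exceeds one page" are assumptions, and case1_upper_bound() and ToyZswap
are hypothetical names made up for illustration.

```python
import zlib

SWAP_BATCH = 64    # assumption: per-CPU swap slot cache batch size
PAGE_SIZE = 4096   # assumption: page size in bytes


def case1_upper_bound(ncpu, batch=SWAP_BATCH, page_size=PAGE_SIZE):
    """Case (1): Count = SWAP_BATCH * ncpu, plus a byte ceiling.

    The byte figure assumes no compressed object is larger than one page,
    so it is an upper bound, not a measurement.
    """
    entries = batch * ncpu
    return entries, entries * page_size


class ToyZswap:
    """Hypothetical model of case (2); not the kernel data structure."""

    def __init__(self, exclusive_loads):
        self.exclusive_loads = exclusive_loads
        self.pool = {}  # swap offset -> compressed bytes

    def store(self, offset, page):
        self.pool[offset] = zlib.compress(page)

    def load(self, offset):
        page = zlib.decompress(self.pool[offset])
        if self.exclusive_loads:
            # Exclusive load: drop the compressed copy once faulted in.
            del self.pool[offset]
        # Otherwise the compressed copy stays next to the uncompressed one.
        return page


if __name__ == "__main__":
    for ncpu in (8, 128):
        entries, nbytes = case1_upper_bound(ncpu)
        print(f"{ncpu} CPUs: <= {entries} entries, <= {nbytes // 1024} KiB")

    for exclusive in (False, True):
        z = ToyZswap(exclusive)
        z.store(0, bytes(PAGE_SIZE))
        z.load(0)
        print(f"exclusive_loads={exclusive}: compressed copies left = {len(z.pool)}")
```

Under those assumptions, case (1) on a 128-CPU machine pins at most 8192
entries (at most 32 MiB of compressed data), whereas case (2) scales with
however many pages have been faulted back in while exclusive loads are
disabled.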