From: Zhongkun He
Date: Tue, 2 Jan 2024 19:39:00 +0800
Subject: Re: [External] Re: [PATCH] mm: zswap: fix the lack of page lru flag in zswap_writeback_entry
To: Nhat Pham
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, yosryahmed@google.com, sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20231024142706.195517-1-hezhongkun.hzk@bytedance.com>

On Sat, Dec 30, 2023 at 10:09 AM Nhat Pham wrote:
>
> On Tue, Oct 24, 2023 at 7:27 AM Zhongkun He wrote:
> >
>
> My apologies for the delayed response. I have a couple of questions.
>
> > The zswap_writeback_entry() will add a page to the swap cache, decompress
> > the entry data into the page, and issue a bio write to write the page back
> > to the swap device. Move the page to the tail of lru list through
> > SetPageReclaim(page) and folio_rotate_reclaimable().
> >
> > Currently, about half of the pages will fail to move to the tail of lru
>
> May I ask what's the downstream effect of this? i.e so what if it
> fails? And yes, as Andrew pointed out, it'd be nice if the patch
> changelog spells out any observable or measurable change from the
> user's POV.
>

The swap cache page used to decompress the zswap_entry should be moved to
the tail of the inactive list after end_writeback so that it can be freed
promptly. That is the intent of the following code in
zswap_writeback_entry():

   /* move it to the tail of the inactive list after end_writeback */
   SetPageReclaim(page);

After the writeback is over, folio_rotate_reclaimable() fails because the
page is not on the LRU list yet; it is still sitting in one of the per-CPU
folio batches. As a result, setting SetPageReclaim(page) does not achieve
its goal, and the pages are not freed in time.

> > list because there is no LRU flag in page which is not in the LRU list but
> > the cpu_fbatches. So fix it.
>
> This sentence is a bit confusing to me. Does this mean the page
> currently being processed for writeback is not in the LRU list
> (!PageLRU(page)), but IN one of the cpu folio batches? Which makes
> folio_rotate_reclaimable() fails on this page later on in the
> _swap_writepage() path? (hence the necessity of lru_add_drain()?)
>

Yes, exactly.

> Let me know if I'm misunderstanding the intention of this patch. I
> know it's a bit pedantic, but spelling things out (ideally in the
> changelog itself) will help the reviewers, as well as future
> contributors who want to study the codebase and make changes to it.
>

Sorry, my bad.
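
The failure mode discussed above can be illustrated with a small user-space
toy model. This is only a sketch, not kernel code: the names Page, ToyLRU,
add(), drain() and rotate_reclaimable() are hypothetical stand-ins for a
folio, the inactive list plus one per-CPU folio batch, the batched LRU add,
lru_add_drain() and folio_rotate_reclaimable(). The real kernel function
checks more conditions than the two modelled here.

```python
from collections import deque


class Page:
    """Toy stand-in for a page/folio (hypothetical, not a kernel type)."""

    def __init__(self, name):
        self.name = name
        self.on_lru = False   # models the PageLRU flag
        self.reclaim = False  # models the PageReclaim flag


class ToyLRU:
    """Toy inactive list with a single batch in front of it."""

    def __init__(self):
        # Left end is the tail (reclaimed first), right end is the head.
        self.inactive = deque()
        # Models one per-CPU folio batch: pages queued but not yet on the LRU.
        self.batch = []

    def add(self, page):
        # New pages are only queued; the LRU flag is not set yet.
        self.batch.append(page)

    def drain(self):
        # Models lru_add_drain(): move batched pages onto the list head.
        for page in self.batch:
            page.on_lru = True
            self.inactive.append(page)
        self.batch.clear()

    def rotate_reclaimable(self, page):
        # Models the LRU-flag check: a page that is still in the batch
        # cannot be rotated to the tail.
        if not (page.reclaim and page.on_lru):
            return False
        self.inactive.remove(page)
        self.inactive.appendleft(page)
        page.reclaim = False
        return True


if __name__ == "__main__":
    lru = ToyLRU()
    for name in ("old1", "old2"):
        lru.add(Page(name))
    lru.drain()

    wb = Page("writeback")
    lru.add(wb)
    wb.reclaim = True
    print("rotate before drain:", lru.rotate_reclaimable(wb))
    lru.drain()
    print("rotate after drain: ", lru.rotate_reclaimable(wb))
    print("tail of list:       ", lru.inactive[0].name)
```

In this model the rotation only succeeds once the batch has been drained,
which is the behaviour the patch aims for by making sure the page is on the
LRU list before writeback completes.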
> >
> > Signed-off-by: Zhongkun He
>
> Thanks and look forward to your response,
> Nhat
>
> P/S: Have a nice holiday season and happy new year!

Here are the steps and results of the performance test:

1: zswap + zram (simplified model with no IO)
2: disable zswap/parameters/same_filled_pages_enabled (the stress workload
   produces same-filled pages)
3: stress --vm 1 --vm-bytes 2g --vm-hang 0   (2Gi anon pages)
4: To release zswap_entry quickly, I used my previous patch (I will send
   it again later):
   https://lore.kernel.org/all/20231025095248.458789-1-hezhongkun.hzk@bytedance.com/

Performance result:
   reclaim 1Gi zswap_entry

time echo 1 > writeback_time_threshold
(releases any zswap_entry that has not been accessed for more than 1 second)

        Base        With this patch
real    0m1.015s    0m1.043s
user    0m0.000s    0m0.001s
sys     0m1.013s    0m1.040s

So no obvious performance regression was found.

After writeback, we perform the following step to reclaim memory again:
echo 1g > memory.reclaim

Base:
        total    used     reclaim    total    used
Mem:    38Gi     2.5Gi    ---->      38Gi     1.5Gi
Swap:   5.0Gi    1.0Gi    ---->      5Gi      1.5Gi
used memory -1Gi, swap +0.5Gi

This means that half of the pages failed to move to the tail of the LRU
list, so an additional 0.5Gi of anon pages had to be swapped out.

With this patch:
        total    used     reclaim    total    used
Mem:    38Gi     2.6Gi    ---->      38Gi     1.6Gi
Swap:   5.0Gi    1.0Gi    ---->      5Gi      1Gi
used memory -1Gi, swap +0Gi

This means that we freed all the pages that were moved to the tail of the
LRU list by zswap_writeback_entry() and folio_rotate_reclaimable().

Thanks for your time, Nhat and Andrew. Happy New Year!
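
The accounting behind the two reclaim tables can be checked with a few lines
of Python. This is only a sketch of the arithmetic: the helper name
reclaim_delta() is hypothetical, the inputs are the "used" columns from the
tables above, and reading the swap growth as "reclaim that had to come from
anon pages instead of written-back pages" follows the interpretation given
in this email rather than anything measured separately.

```python
from decimal import Decimal


def reclaim_delta(mem_before, mem_after, swap_before, swap_after):
    """Return (memory change, swap change) in Gi for one reclaim run.

    Arguments are the "used" values in Gi, passed as strings so that
    Decimal keeps them exact.
    """
    mem = Decimal(mem_after) - Decimal(mem_before)
    swap = Decimal(swap_after) - Decimal(swap_before)
    return mem, swap


if __name__ == "__main__":
    runs = {
        "base": reclaim_delta("2.5", "1.5", "1.0", "1.5"),
        "with this patch": reclaim_delta("2.6", "1.6", "1.0", "1.0"),
    }
    for name, (mem, swap) in runs.items():
        # Both runs free 1Gi; only the base run pushes extra data to swap.
        print(f"{name}: memory {mem:+}Gi, swap {swap:+}Gi")
```

Both runs free the same 1Gi of memory, so the difference between them shows
up only in the swap column.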