From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 311B2C47073 for ; Sun, 7 Jan 2024 18:53:23 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C20A76B0089; Sun, 7 Jan 2024 13:53:22 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id BD0766B008A; Sun, 7 Jan 2024 13:53:22 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A98326B008C; Sun, 7 Jan 2024 13:53:22 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 9B85D6B0089 for ; Sun, 7 Jan 2024 13:53:22 -0500 (EST) Received: from smtpin01.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 599CD140390 for ; Sun, 7 Jan 2024 18:53:22 +0000 (UTC) X-FDA: 81653413044.01.8936C2F Received: from mail-io1-f42.google.com (mail-io1-f42.google.com [209.85.166.42]) by imf27.hostedemail.com (Postfix) with ESMTP id 7661540024 for ; Sun, 7 Jan 2024 18:53:20 +0000 (UTC) Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b="e/ycd4E8"; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf27.hostedemail.com: domain of nphamcs@gmail.com designates 209.85.166.42 as permitted sender) smtp.mailfrom=nphamcs@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1704653600; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=4Tq/x7+l+dEZblRxj/FIt/GF5Ojn1xhSqyYD3OqRmVk=; b=NQoL8ib8wDM1A7MhgB/QOkNq1WrrkppWzu0d2tlmhomt92XexuglV47B+3PDujI5hWvD1e S2xV+WND4WqPEoZDk20xHJneSITnZ1h71qdmoBeQxgmXq5L9Siu5VDUJzGaxQUnZKBVfCV ceF++ntfvOg5b+kUjcbkW2YtvJ/b5R0= ARC-Authentication-Results: i=1; imf27.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b="e/ycd4E8"; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf27.hostedemail.com: domain of nphamcs@gmail.com designates 209.85.166.42 as permitted sender) smtp.mailfrom=nphamcs@gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1704653600; a=rsa-sha256; cv=none; b=WhninZiUN2BvXP7SXJwi53M7Xg86xAXirAENRI8HSQMvBzkzLIjt0ORTnq9xeocVgChGF3 +YqiSxysFMWGkvsslAAF7zKK3FJxfiuTSutYgJ0i+qxIW2Zd0cyVjUy6819OFraFEkbIuh qGhz+1g5o+ccRlZdxSXlFGmbFEEJLFQ= Received: by mail-io1-f42.google.com with SMTP id ca18e2360f4ac-7b7fdde8b26so64443339f.1 for ; Sun, 07 Jan 2024 10:53:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1704653599; x=1705258399; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=4Tq/x7+l+dEZblRxj/FIt/GF5Ojn1xhSqyYD3OqRmVk=; b=e/ycd4E8ZbDHoWLoUiqvn4yCzOpygMeCtsxZZ0iIaOfafq4CXztYVps9hXGddBJvHD 0r5Aa8u/1Cbg4hgJUi2nZabCSpQlmZk6BfwUjAPwwKse6u7qJYHtNYqFRZuEQIH10deJ w4Yxd0CHhZLg5NUjTAWv3+/a0occ0vTCEhy56dcIUNjSq4X095SUw3eug4AAAJlVGHOc hyXATsH2uFnroHd/9FoEad53EYU5gPXVYy9rjBEZCKuMbNaJjKoPu0pgtnkoS1Qi+CSZ YCEAnQXIFvYdXXBh8GdWJNkscXTZcKDPh+kAo3vFlIg3Ad8VgtOGKyOwzVJf2KyZdZfx Ihkw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1704653599; x=1705258399; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=4Tq/x7+l+dEZblRxj/FIt/GF5Ojn1xhSqyYD3OqRmVk=; b=IXTIYJx6T4WiZQ/zU08gGlwUOXryxPayl16XrJtBK1wb14IrQpianDFBmUN/2Um7sq qr77z/kmbeJApNx90erQA0SrbcLluL3k3G+IlNf0DudOfYiDY62jzfVmvtYE8R3/xGrR U8fIOs/cMsb7IsFZfTPh8doMnpA+eqdM6qVRSe+19juBvqcHihfFIFK+szqVNX+vKhmp 0Nd/Zeq3rK5wd+GLnmxcP0eynGjIEOnwX9VENwP/MnLae/Laeynw/vxY8UFGWgdgms/7 osHriA8zn0OEKLbHAzhP9+hI82cRfOsSnpcGWlbuy0d7I/M13LU8yOfrenGYI/tBzXwv P0Cg== X-Gm-Message-State: AOJu0YxgVXA88LconiphAGU7i4WKbQ5u6jvf+9oTMPOnJXgPgQ7dse8q JjdewlvPKsqcbQ0NbYW3L42mlJ6JDrrRerzik1k= X-Google-Smtp-Source: AGHT+IEbVK1rN7WHGe/bkWjVE/28Dbr1kjqmBa2h1jnCGJW+8BzruTIY4XSVLdnajEzuApp5+kyc8Vhjp6MSjSKdOqg= X-Received: by 2002:a5d:9508:0:b0:7ba:9c04:2e03 with SMTP id d8-20020a5d9508000000b007ba9c042e03mr3118070iom.27.1704653599406; Sun, 07 Jan 2024 10:53:19 -0800 (PST) MIME-Version: 1.0 References: <20231024142706.195517-1-hezhongkun.hzk@bytedance.com> In-Reply-To: From: Nhat Pham Date: Sun, 7 Jan 2024 10:53:08 -0800 Message-ID: Subject: Re: [External] Re: [PATCH] mm: zswap: fix the lack of page lru flag in zswap_writeback_entry To: Zhongkun He Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, yosryahmed@google.com, sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Chris Li Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: 7661540024 X-Stat-Signature: o6g44rypf885b9x5q18gzgpy686gejfz X-HE-Tag: 1704653600-630985 X-HE-Meta: U2FsdGVkX1/eSi9Civv+7Lk2CBADq+70jptzMGffGGdtd/j6Y/udSdnj571LX8Q1sKkOM6/TDb4Gb/7Ef38wxhizQPJCKObfZH27zoRIzAcWqnmtzt/Ka5H/Gegds2LTBVFRfUxw9wUwj+EGRaY6FZ/Lks0Kqe5eKmJ6P7yy5SajtBQ1qvd2bS57aEV7auVVNbTDDJfVApz/Sd0EzJVcA9IIX34FWXi05suijTTUlEp9GA2n0SfchFgYrWsylRu0ISv8Y6ylyyeqNAWO34Of8BdDhPfI8MzlMLrtY7hGCVWZehPlkgIwcJeqmZMOODtJwkGIpPudzo1fFccHyvvj1C+DS9HDSfIqlgeIpruM3q0PidDMO+oqZtjuuy5TP5ZI2XV6DZOXlkej0E2wFAkVF5Li9asptWdfavYmvQ0xbURL3W8UZGoaLUtOm68mZ2byYl126r9cWIRgT+BHw1BpPuVpy5nBivrbv2Ls3OFCC1A0Dt+GItzWnvzNFml0goxytNmCqPC0fvZDr0iGQn3tTaZwg6czuap3zYiNsyopW6CzhSIbSa9UkBW5pOgkosU5Ke8lp1wjlhMrhcDkPStNA/zgRnjgfeDiX0rFafNoQGab0mZ9lfNIERCsCYgI2aBQeLjKb90SkuHdi/f8WptPxvw+VPRdahL7OTLJGaW00T9F/bt3Zmx/rrHLt4AtZFU7fgsf3XMhYvp2MzTxufeWZdAcaHG4qFldrJqadDCIC2g3CGE/G4zH6ALJtcfqdz7qe8E4tP0KPIhXWd/WJc6Kuw6kIDDp6mYmyb9rfZJ7Hl2W8XvpXxg+Akc5LvoxtXeC67GYyzeJziQ3CITlnZzKu9pO7BDt9NaR7HKiUn1aGBwzdZycKWVCt7X8McS50WspQShC/7Mi7S6r3BVyzfQHJew2CZX6ivuHDLBeCFTo3xcJCARp7foV74XeRshwMz+kopGxZHt2Zj0v3lyn5nr Jwp1d5Gx 3h5IPN1co2HmoxDL9VWpb+/e6TLJIy8FGvDEirty5TByYvFFFPjB0NziThDfbP6mjwFd83tID7rVKqbJYXu+Wrc8mcGE21Rrkc8qWSY9kuBw4LbBWHQ5QBo9XPPp03rad2wEG9Qz5B5Gfl0fRLLoVWqUhVj5eO+ZSj8UODxE3ywX5c5C/dtWkfalMjfjQFueOGO6KyeunNQH01XvSL4h4K8J68/G1YStvyLaJ9ju/13JnsAi2LHlJINPqJkRDQuh5wooIykXM5lzfNs+hC5Nh3XQl+J3t5PpB3bAH0MxFeiOTm99wbHhIGWCTvLQJYPiaPjf5ofdg8rWo7DtJAQxfUzbkwQ0pFqolwbmhWPksykc6rhnv6XjUJHsXqcMco5+ceSlalYKAhof5n8U8WN+xeMoHUanBGb+cKP7XFYSQI9uFVBz/nXbdUcM7ZBCZHJYm4DGOdcmRxJPBE22vNBEVMK4ncPc+l+/3uPyTMlLJiHGvXbs= X-Bogosity: Ham, tests=bogofilter, spamicity=0.001182, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Jan 5, 2024 at 6:10=E2=80=AFAM Zhongkun He wrote: > > > > > Hmm originally I was thinking of doing an (unconditional) > > lru_add_drain() outside of zswap_writeback_entry() - once in > > shrink_worker() and/or zswap_shrinker_scan(), before we write back any > > of the entries. Not sure if it would work/help here tho - haven't > > tested that idea yet. > > > > The pages are allocated by __read_swap_cache_async() in > zswap_writeback_entry() and it must be newly allocated=EF=BC=8C > not cached in swap. > Please see the code below in zswap_writeback_entry() > > page =3D __read_swap_cache_async(swpentry, GFP_KERNEL, mpol, > NO_INTERLEAVE_INDEX, &page_was_allocated); > if (!page) { > goto fail;} > /* Found an existing page, we raced with load/swapin */ > if (!page_was_allocated) { > put_page(page); > ret =3D -EEXIST; > goto fail; > } > > So when it comes to SetPageReclaim(page), > The page has just been allocated and is still in the percpu batch, > which has not been added to the LRU. > > Therefore=EF=BC=8Clru_add_drain() did not work outside the > zswap_writeback_entry(=EF=BC=89 You're right hmmm. > > > > > > > New test: > > > This patch will add the execution of folio_rotate_reclaimable(not exe= cuted > > > without this patch) and lru_add_drain,including percpu lock competiti= on. > > > I bind a new task to allocate memory and use the same batch lock to c= ompete > > > with the target process, on the same CPU. > > > context: > > > 1:stress --vm 1 --vm-bytes 1g (bind to cpu0) > > > 2:stress --vm 1 --vm-bytes 5g --vm-hang 0=EF=BC=88bind to cpu0=EF=BC= =89 > > > 3:reclaim pages, and writeback 5G zswap_entry in cpu0 and node 0. > > > > > > Average time of five tests > > > > > > Base patch patch + compete > > > 4.947 5.0676 5.1336 > > > +2.4% +3.7% > > > compete means: a new stress run in cpu0 to compete with the writeback= process. > > > PID USER %CPU %MEM TIME+ COMMAND = P > > > 1367 root 49.5 0.0 1:09.17 bash =EF=BC=88writ= eback=EF=BC=89 0 > > > 1737 root 49.5 2.2 0:27.46 stress (use percp= u > > > lock) 0 > > > > > > around 2.4% increase in real time,including the execution of > > > folio_rotate_reclaimable(not executed without this patch) and lru_add= _drain,but > > > no lock contentions. > > > > Hmm looks like the regression is still there, no? > > Yes, it cannot be eliminated. Yeah this solution is not convincing to me. The overall effect of this patch is we're trading runtime to save some swap usage. That seems backward to me? Are there any other observable benefits to this? Otherwise, unless the benchmark is an adversarial workload that is not representative of the common case (and you'll need to show me some alternative benchmark numbers or justifications to convince me here), IMHO this is a risky change to merge. This is not a feature that is gated behind a flag that users can safely ignore/disable. > > > > > > > > > around 1.3% additional increase in real time with lock contentions o= n the same > > > cpu. > > > > > > There is another option here, which is not to move the page to the > > > tail of the inactive > > > list after end_writeback and delete the following code in > > > zswap_writeback_entry(), > > > which did not work properly. But the pages will not be released first= . > > > > > > /* move it to the tail of the inactive list after end_writeback */ > > > SetPageReclaim(page); > > > > Or only SetPageReclaim on pages on LRU? > > No, all the pages are newly allocted and not on LRU. > > This patch should add lru_add_drain() directly, remove the > if statement. > The purpose of writing back data is to release the page, > so I think it is necessary to fix it. > > Thanks for your time, Nhat. Is there a way to detect these kinds of folios at folio_rotate_reclaimable() callsite? i.e if this is a zswap-owned page etc. etc.?