From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zhongkun He <hezhongkun.hzk@bytedance.com>
Date: Wed, 3 Jan 2024 22:12:43 +0800
Subject: Re: [External] Re: [PATCH] mm: zswap: fix the lack of page lru flag in zswap_writeback_entry
To: Nhat Pham
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, yosryahmed@google.com, sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Chris Li
References: <20231024142706.195517-1-hezhongkun.hzk@bytedance.com>
Content-Type: text/plain; charset="UTF-8"
> That's around 2.7% increase in real time, no? Admittedly, this
> micro-benchmark is too small to conclude either way, but the data
> doesn't seem to be in your favor.
>
> I'm a bit concerned about the overhead here, given that with this
> patch we will drain the per-cpu batch on every written-back entry.
> That's quite a high frequency, especially since we're moving towards
> more writeback (either with the new zswap shrinker, or your time
> threshold-based writeback mechanism).
> For instance, there seems to be
> some (local/per-cpu) locking involved, no? Could there be some form of
> lock contention there (especially since with the new shrinker, you
> can have a lot of concurrent writeback contexts)?
>
> Furthermore, note that a writeback from zswap to swap saves less
> memory than a writeback from memory to swap, so the effect of the
> extra overhead will be even more pronounced. That is, the amount of
> extra work done (from this change) to save one unit of memory would be
> even larger than if we called lru_add_drain() every time we swap out a
> page (from memory -> swap). And I'm pretty sure we don't call
> lru_add_drain() every time we swap out a page - I believe we call
> lru_add_drain() every time we perform a shrink action, e.g. in
> shrink_inactive_list(). That's much coarser in granularity than here.
>
> Also, IIUC, the more often we perform lru_add_drain(), the less
> batching benefit we obtain. IOW, the overhead of maintaining the
> batch becomes higher relative to the performance gains from
> batching.
>
> Maybe I'm missing something - could you walk me through how
> lru_add_drain() is fine here, from this POV? Thanks!
>
> > After writeback, we perform the following steps to release the memory again:
> >
> >     echo 1g > memory.reclaim
> >
> > Base:
> >            total    used    reclaim    total    used
> > Mem:       38Gi     2.5Gi    ---->     38Gi     1.5Gi
> > Swap:      5.0Gi    1.0Gi    ---->     5Gi      1.5Gi
> >
> > used memory: -1G, swap: +0.5G
> > This means that half of the pages failed to move to the tail of the lru
> > list, so we need to release an additional 0.5Gi of anon pages to swap space.
> >
> > With this patch:
> >            total    used    reclaim    total    used
> > Mem:       38Gi     2.6Gi    ---->     38Gi     1.6Gi
> > Swap:      5.0Gi    1.0Gi    ---->     5Gi      1Gi
> >
> > used memory: -1Gi, swap: +0Gi
> > This means that we release all the pages which have been added to the tail
> > of the lru list in zswap_writeback_entry() and folio_rotate_reclaimable().
>
> OTOH, this suggests that we're onto something.
> Swap usage seems to
> decrease quite a bit. Sounds like a real problem that this patch is
> tackling.
> (Please add this benchmark result to future changelog. It'll help
> demonstrate the problem.)

Yes.

> I'm inclined to ack this patch, but it'd be nice if you can assuage my
> concerns above (with some justification and/or larger benchmark).

OK, thanks.

> (Or perhaps, we have to drain, but less frequently/higher up the stack?)

I've reviewed the code again and have no better idea. It would be great
if you have any suggestions.

New test:
This patch adds the execution of folio_rotate_reclaimable() (not executed
without this patch) and lru_add_drain(), including per-cpu batch lock
competition. I bound a new task to the same CPU to allocate memory, so it
competes with the target process for the same per-cpu batch lock.

Context:
1: stress --vm 1 --vm-bytes 1g (bound to cpu0)
2: stress --vm 1 --vm-bytes 5g --vm-hang 0 (bound to cpu0)
3: reclaim pages, and write back 5G of zswap_entry on cpu0 and node 0.

Average time of five tests:

        Base      patch     patch + compete
        4.947     5.0676    5.1336
                  +2.4%     +3.7%

"compete" means a new stress task runs on cpu0 to compete with the
writeback process:

  PID USER      %CPU %MEM     TIME+ COMMAND                  P
 1367 root      49.5  0.0   1:09.17 bash (writeback)         0
 1737 root      49.5  2.2   0:27.46 stress (use percpu lock) 0

That is around a 2.4% increase in real time, including the execution of
folio_rotate_reclaimable() (not executed without this patch) and
lru_add_drain(), but with no lock contention; and around a 1.3%
additional increase in real time with lock contention on the same cpu.

There is another option here: do not move the page to the tail of the
inactive list after end_writeback, and delete the following code in
zswap_writeback_entry(), which did not work properly. But then the pages
will not be released first.

	/* move it to the tail of the inactive list after end_writeback */
	SetPageReclaim(page);

Thanks,
Zhongkun

> Thanks,
> Nhat
>
> > Thanks for your time Nhat and Andrew.
> > Happy New Year!