From: Barry Song <21cnbao@gmail.com>
Date: Tue, 12 Nov 2024 10:37:37 +1300
Subject: Re: [PATCH RFC v2 0/2] mTHP-friendly compression in zsmalloc and zram based on multi-pages
To: Nhat Pham
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, axboe@kernel.dk,
    bala.seshasayee@linux.intel.com, chrisl@kernel.org, david@redhat.com,
    hannes@cmpxchg.org, kanchana.p.sridhar@intel.com, kasong@tencent.com,
    linux-block@vger.kernel.org, minchan@kernel.org,
    senozhatsky@chromium.org, surenb@google.com, terrelln@fb.com,
    v-songbaohua@oppo.com, wajdi.k.feghali@intel.com, willy@infradead.org,
    ying.huang@intel.com, yosryahmed@google.com, yuzhao@google.com,
    zhengtangquan@oppo.com, zhouchengming@bytedance.com,
    usamaarif642@gmail.com, ryan.roberts@arm.com
References: <20241107101005.69121-1-21cnbao@gmail.com>

On Tue, Nov 12, 2024 at 8:30 AM Nhat Pham wrote:
>
> On Thu, Nov 7, 2024 at 2:10 AM Barry Song <21cnbao@gmail.com> wrote:
> >
> > From: Barry Song
> >
> > When large folios are compressed at a larger granularity, we observe
> > a notable reduction in CPU usage and a significant improvement in
> > compression ratios.
> >
> > mTHP's ability to be swapped out without splitting and swapped back in
> > as a whole allows compression and decompression at larger granularities.
> >
> > This patchset enhances zsmalloc and zram by adding support for dividing
> > large folios into multi-page blocks, typically configured with a
> > 2-order granularity. Without this patchset, a large folio is always
> > divided into `nr_pages` 4KiB blocks.
> >
> > The granularity can be set using the `ZSMALLOC_MULTI_PAGES_ORDER`
> > setting, where the default of 2 allows all anonymous THP to benefit.
> >
> > Examples include:
> > * A 16KiB large folio will be compressed and stored as a single 16KiB
> >   block.
> > * A 64KiB large folio will be compressed and stored as four 16KiB
> >   blocks.
> >
> > For example, swapping out and swapping in 100MiB of typical anonymous
> > data 100 times (with 16KB mTHP enabled) using zstd yields the following
> > results:
> >
> >                         w/o patches        w/ patches
> > swap-out time(ms)       68711              49908
> > swap-in time(ms)        30687              20685
> > compression ratio       20.49%             16.9%
>
> The data looks very promising :) My understanding is it also results
> in memory saving as well right? Since zstd operates better on bigger
> inputs.
>
> Is there any end-to-end benchmarking? My intuition is that this patch
> series overall will improve the situations, assuming we don't fallback
> to individual zero order page swapin too often, but it'd be nice if
> there is some data backing this intuition (especially with the
> upstream setup, i.e without any private patches). If the fallback
> scenario happens frequently, the patch series can make a page fault
> more expensive (since we have to decompress the entire chunk, and
> discard everything but the single page being loaded in), so it might
> make a difference.
>
> Not super qualified to comment on zram changes otherwise - just a
> casual observer to see if we can adopt this for zswap.
> zswap has the added complexity of not supporting THP zswap in (until
> Usama's patch series lands), and the presence of mixed backing states
> (due to zswap writeback), increasing the likelihood of fallback :)

Correct. As I mentioned to Usama[1], this could be a problem, and we are
collecting data.

The simplest approach to work around the issue is to fall back to four
small folios instead of just one, which would prevent the need for three
extra decompressions.

[1] https://lore.kernel.org/linux-mm/CAGsJ_4yuZLOE0_yMOZj=KkRTyTotHw4g5g-t91W=MvS5zA4rYw@mail.gmail.com/

Thanks
Barry
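
For reference, the "three extra decompressions" in the fallback case above can be
sketched with a toy model. This is only an illustration in Python, not kernel
code: the function name and the `fill_whole_block` flag are hypothetical, and it
assumes an order-2 (16KiB) block as in the cover letter, where every swap-in of a
page must decompress the whole block that contains it.

```python
def count_decompressions(faulted_pages, order=2, fill_whole_block=False):
    """Count block decompressions for a sequence of page faults.

    Toy model (hypothetical, not the kernel implementation):
    - pages are grouped into compressed blocks of (1 << order) pages;
    - a fault on a non-resident page decompresses its whole block;
    - with fill_whole_block=False only the faulting page is kept
      (fallback to a single small folio);
    - with fill_whole_block=True every page of the block is kept
      (fallback to several small folios filled from one decompression).
    """
    pages_per_block = 1 << order
    resident = set()
    decompressions = 0
    for page in faulted_pages:
        if page in resident:
            continue  # already swapped in, no work needed
        decompressions += 1
        if fill_whole_block:
            first = (page // pages_per_block) * pages_per_block
            resident.update(range(first, first + pages_per_block))
        else:
            resident.add(page)
    return decompressions


# Touch all four pages of one 16KiB block.
single = count_decompressions(range(4))
batched = count_decompressions(range(4), fill_whole_block=True)
print(single, batched, single - batched)
```

Under these assumptions, touching all four pages of one block costs four
decompressions with the single-page fallback and one with the four-small-folio
fallback. Real workloads may not touch every page of a block, so this only shows
the worst case for the single-page path.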