From: Nhat Pham
Date: Wed, 7 May 2025 08:11:36 -0700
Subject: Re: [PATCH] mm: Add Kcompressd for accelerated memory compression
To: Barry Song <21cnbao@gmail.com>
Cc: Andrew Morton, Qun-Wei Lin, Mike Rapoport, Matthias Brugger, AngeloGioacchino Del Regno, Sergey Senozhatsky, Minchan Kim, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-mediatek@lists.infradead.org, Casper Li, Chinwen Chang, Andrew Yang, James Hsu
References: <20250430082651.3152444-1-qun-wei.lin@mediatek.com> <20250430145106.8ce79a05d35cec72aa02baa6@linux-foundation.org>
On Wed, Apr 30, 2025 at 3:50 PM Barry Song <21cnbao@gmail.com> wrote:
>
> On Thu, May 1, 2025 at 9:51 AM Andrew Morton wrote:
> >
> > On Wed, 30 Apr 2025 16:26:41 +0800 Qun-Wei Lin wrote:
> >
> > > This patch series introduces a new mechanism called kcompressd to
> > > improve the efficiency of memory reclaiming in the operating system.
> > >
> > > Problem:
> > > In the current system, the kswapd thread is responsible for both scanning
> > > the LRU pages and handling memory compression tasks (such as those
> > > involving ZSWAP/ZRAM, if enabled). This combined responsibility can lead
> > > to significant performance bottlenecks, especially under high memory
> > > pressure. The kswapd thread becomes a single point of contention, causing
> > > delays in memory reclaiming and overall system performance degradation.
> > >
> > > Solution:
> > > Introduced kcompressd to handle asynchronous compression during memory
> > > reclaim, improving efficiency by offloading compression tasks from
> > > kswapd. This allows kswapd to focus on its primary task of page reclaim
> > > without being burdened by the additional overhead of compression.
> > >
> > > In our handheld devices, we found that applying this mechanism under high
> > > memory pressure scenarios can increase the rate of pgsteal_anon per second
> > > by over 260% compared to the situation with only kswapd. Additionally, we
> > > observed a reduction of over 50% in page allocation stall occurrences,
> > > further demonstrating the effectiveness of kcompressd in alleviating memory
> > > pressure and improving system responsiveness.
> >
> > It's a significant change and I'm thinking that broader performance
> > testing across a broader range of machines is needed before we can
> > confidently upstream such a change.
>
> We ran the same test on our phones and saw the same results as Qun-Wei.
> The async compression significantly reduces allocation stalls and improves
> reclamation speed. However, I agree that broader testing is needed, and
> we’ll also need the zswap team’s help with testing zswap cases.

The warning aside (which I got around by setting and unsetting
PF_MEMALLOC in kcompressd()), I ran kernel build tests with zswap.
There is not much performance difference with and without kcompressd.
More than anything, that probably means kernel building is a mediocre
benchmark for this.

Ideally, I want to experiment with some real workloads, but that is a
bit more involved to set up, unfortunately :(

I can try again once you have sent a v2 that incorporates our review
comments, at least to make sure everything is stable and there is no
obvious regression. Hopefully I can set up a proper experiment at some
point too...

>
> >
> > Also, it's presumably a small net loss on single-CPU machines (do these
> > exist any more?). Is it hard to disable this feature on such machines?
>
> A net loss is possible, but kswapd can sometimes enter sleep contexts,
> allowing the parallel kcompressd thread to continue compression.
> This could actually be a win. But I agree that additional testing on
> single-CPU machines may be necessary.
>
> It could be disabled by the following if we discover any regression on
> single-CPU machines?
>
> if (num_online_cpus() == 1)
>         return false;
>
> >
> > >
> > > +static bool swap_sched_async_compress(struct folio *folio)
> > > +{
> > > +       struct swap_info_struct *sis = swp_swap_info(folio->swap);
> > > +       int nid = numa_node_id();
> > > +       pg_data_t *pgdat = NODE_DATA(nid);
> > > +
> > > +       if (unlikely(!pgdat->kcompressd))
> > > +               return false;
> > > +
> > > +       if (!current_is_kswapd())
> > > +               return false;
> > > +
> > > +       if (!folio_test_anon(folio))
> > > +               return false;
> >
> > Are you sure the above three tests are really needed?
>
> Currently, it runs as a per-node thread mainly to accelerate asynchronous
> reclamation, which effectively reduces direct reclamation. Since direct
> reclamation already follows the slow path, asynchronous compression offers
> limited additional benefit in that context. Moreover, it's difficult to
> determine the optimal number of threads for direct reclamation, whereas the
> compression in the current direct reclamation allows it to utilize all CPUs.
>
> The first condition checks whether kcompressd is present. The second
> ensures that we're in kswapd asynchronous reclamation, not direct
> reclamation. The third condition might be optimized or dropped, at least for
> swap-backed shmem, and similar cases.
>
> Thanks
> Barry
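
For reference, the gating discussed above (the three checks in
swap_sched_async_compress() plus the suggested num_online_cpus() == 1
bail-out) can be modeled in plain userspace C. This is only a rough sketch
under assumptions, not code from the patch: struct reclaim_ctx and
should_offload_compression() are hypothetical names, the fields merely stand
in for pgdat->kcompressd, current_is_kswapd(), folio_test_anon() and
num_online_cpus(), and placing the single-CPU check next to the other three
is a guess about where it would live.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace model; none of these names exist in the kernel.
 * Each field stands in for one kernel-side query mentioned in the thread.
 */
struct reclaim_ctx {
	bool node_has_kcompressd;  /* models pgdat->kcompressd != NULL */
	bool caller_is_kswapd;     /* models current_is_kswapd() */
	bool folio_is_anon;        /* models folio_test_anon(folio) */
	unsigned int online_cpus;  /* models num_online_cpus() */
};

/* Returns true when compression would be handed off to kcompressd. */
bool should_offload_compression(const struct reclaim_ctx *ctx)
{
	assert(ctx != NULL);

	/* No per-node kcompressd thread: compress synchronously. */
	if (!ctx->node_has_kcompressd)
		return false;

	/* Direct reclaim keeps compressing inline on the calling CPU. */
	if (!ctx->caller_is_kswapd)
		return false;

	/* The posted patch only offloads anonymous folios. */
	if (!ctx->folio_is_anon)
		return false;

	/* Suggested bail-out for single-CPU machines (assumed placement). */
	if (ctx->online_cpus == 1)
		return false;

	return true;
}
```

With the first three fields set to true, the model returns true for
online_cpus = 4 and false for online_cpus = 1; clearing any one of the three
flags also returns false, which mirrors the early returns in the quoted hunk.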