Date: Thu, 22 Feb 2024 05:56:04 +0000
Subject: Re: [RFC] Analyzing zpool allocators / Removing zbud and z3fold
From: Yosry Ahmed
To: Chengming Zhou
Cc: Andrew Morton, Vitaly Wool, Miaohe Lin, Johannes Weiner, Nhat Pham,
    Linux-MM, Linux Kernel Mailing List, Christoph Hellwig,
    Sergey Senozhatsky, Minchan Kim, Chris Down, Seth Jennings,
    Dan Streetman, Chris Li

On Thu, Feb 22, 2024 at 11:54:44AM +0800, Chengming Zhou wrote:
> On 2024/2/9 11:27, Yosry Ahmed wrote:
> > Hey folks,
> >
> > This is a follow up on my previously sent RFC patch to deprecate
> > z3fold [1]. This is an RFC without code, I thought I could get some
> > discussion going before writing (or rather deleting) more code.
> > I went back to do some analysis on the 3 zpool allocators: zbud, zsmalloc,
> > and z3fold.
>
> This is a great analysis! Sorry for being late to see it.
>
> I want to vote for this direction, zram has been using zsmalloc directly,
> zswap can also do this, which is simpler and we can just maintain and optimize
> only one allocator. The only evident downside is dependence on MMU, right?

AFAICT, yes. I saw a lot of positive responses when I sent an RFC to
mark z3fold as deprecated, but there were some opposing opinions as
well, which is why I did this simple analysis. I was hoping we can make
forward progress with that, but was disappointed it didn't get as much
attention as the deprecation RFC :)

>
> And I'm trying to optimize the scalability performance for zsmalloc now,
> which is bad so zswap has to use 32 pools to workaround it. (zram only use
> one pool, should also have the scalability problem on big server, maybe
> have to use many zram block devices to workaround it too.)

That's slightly orthogonal. Zsmalloc is not really showing worse
performance than other allocators, so this should be a separate effort.

>
> But too many pools would cause more memory waste and more fragmentation,
> so the resulted compression ratio is not good enough.
>
> As for the MMU dependence, we can actually avoid it? Maybe I missed something,
> we can get object's memory vecs from zsmalloc, then send it to decompress,
> which should support length(memory vecs) > 1?

IIUC the dependency on MMU is due to the use of kmap() APIs and the fact
that we may be using highmem pages. I think we may be able to work
around that dependency but I didn't look closely. Hopefully Minchan or
Sergey could shed more light on this.

>
> >
> > [1]https://lore.kernel.org/linux-mm/20240112193103.3798287-1-yosryahmed@google.com/
> >
> > In this analysis, for each of the allocators I ran a kernel build test
> > on tmpfs in a limit cgroup 5 times and captured:
> > (a) The build times.
> > (b) zswap_load() and zswap_store() latencies using bpftrace.
> > (c) The maximum size of the zswap pool from /proc/meminfo::Zswapped.
>
> Here should use /proc/meminfo::Zswap, right?
> Zswap is the sum of pool pages size, Zswapped is the swapped/compressed pages.

Oh yes, it is /proc/meminfo::Zswap actually. I miswrote it in my email.
Thanks!
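
For anyone wanting to check the Zswap/Zswapped distinction on their own
machine, here is a minimal sketch. It is not the script used for the
analysis in this thread. It parses /proc/meminfo text and reports Zswap
(the compressed pool size), Zswapped (the size of the original pages
stored), and their ratio. The names parse_meminfo() and zswap_usage() are
made up for illustration, and capturing the maximum pool size as in (c)
would additionally require sampling this periodically during the build.

```python
def parse_meminfo(text):
    """Map /proc/meminfo field names to integer sizes in kB.

    Only lines of the form "Name:   <number> kB" are kept; counters
    without a kB unit (e.g. HugePages_Total) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def zswap_usage(text):
    """Return (pool_kb, stored_kb, ratio) from /proc/meminfo text.

    pool_kb   -- "Zswap":    memory consumed by the compressed pool
    stored_kb -- "Zswapped": uncompressed size of the pages stored
    ratio     -- stored_kb / pool_kb, or None if it cannot be computed
    """
    info = parse_meminfo(text)
    pool_kb = info.get("Zswap")
    stored_kb = info.get("Zswapped")
    ratio = None
    if pool_kb and stored_kb is not None:
        ratio = stored_kb / pool_kb
    return pool_kb, stored_kb, ratio


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            print(zswap_usage(f.read()))
    except OSError:
        # Not on Linux, or /proc is unavailable.
        print("could not read /proc/meminfo")
```

On kernels that do not export these two fields, both values come back as
None, so the caller can tell "zswap is empty" (zeros) apart from "not
reported".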