From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 10 Mar 2022 12:27:34 +0200
From: Mike Rapoport
To: Ananda
Cc: linux-mm@kvack.org, vitaly.wool@konsulko.com, vbabka@suse.cz, akpm@linux-foundation.org
Subject: Re: [PATCH v3] mm: add ztree - new allocator for use via zpool API
References: <20220307142724.14519-1-a.badmaev@clicknet.pro>
In-Reply-To: <20220307142724.14519-1-a.badmaev@clicknet.pro>

On Mon, Mar 07, 2022 at 05:27:24PM +0300, Ananda wrote:
> From: Ananda Badmaev
> 
> Ztree stores an integer number of compressed objects per ztree block.
> These blocks consist of several physical pages (from 1 to 8) and are
> arranged in trees. The range from 0 to PAGE_SIZE is divided into a
> number of intervals corresponding to the number of trees, and each tree
> only operates on objects whose size falls in its interval. Thus the
> block trees are isolated from each other, which makes it possible to
> operate simultaneously on several objects from different trees.
> Blocks make it possible to densely arrange objects of various sizes,
> resulting in low internal fragmentation. This allocator also tries to
> fill incomplete blocks instead of adding new ones, in many cases
> providing a compression ratio substantially higher than z3fold and
> zbud. Apart from greater flexibility, ztree is significantly superior
> to other zpool backends with regard to worst-case execution times,
> allowing for better response time and real-time characteristics of
> the whole system.
> 
> Signed-off-by: Ananda Badmaev
> ---
> 
> v2: fixed compiler warnings
> 
> v3: added documentation and a const modifier to struct tree_descr
> 
>  Documentation/vm/ztree.rst | 104 +++++
>  MAINTAINERS                |   7 +
>  mm/Kconfig                 |  18 +
>  mm/Makefile                |   1 +
>  mm/ztree.c                 | 754 +++++++++++++++++++++++++++++++++++++
>  5 files changed, 884 insertions(+)
>  create mode 100644 Documentation/vm/ztree.rst
>  create mode 100644 mm/ztree.c

There are a lot of style issues, please run scripts/checkpatch.pl.

> diff --git a/Documentation/vm/ztree.rst b/Documentation/vm/ztree.rst
> new file mode 100644
> index 000000000000..78cad0a6d616
> --- /dev/null
> +++ b/Documentation/vm/ztree.rst
> @@ -0,0 +1,104 @@
> +.. _ztree:
> +
> +=====
> +ztree
> +=====
> +
> +Ztree stores an integer number of compressed objects per ztree block.
> +These blocks consist of several consecutive physical pages (from 1 to
> +8) and are arranged in trees. The range from 0 to PAGE_SIZE is divided
> +into a number of intervals corresponding to the number of trees, and
> +each tree only operates on objects whose size falls in its interval.
> +Thus the block trees are isolated from each other, which makes it
> +possible to operate simultaneously on several objects from different
> +trees.
> +
> +Blocks make it possible to densely arrange objects of various sizes,
> +resulting in low internal fragmentation. This allocator also tries to
> +fill incomplete blocks instead of adding new ones, in many cases
> +providing a compression ratio substantially higher than z3fold and
> +zbud. Apart from greater flexibility, ztree is significantly superior
> +to other zpool backends with regard to worst-case execution times,
> +allowing for better response time and real-time characteristics of
> +the whole system.
> +
> +Like z3fold and zsmalloc, ztree_alloc() does not return a
> +dereferenceable pointer. Instead, it returns an unsigned long handle
> +which encodes the actual location of the allocated object.
> +
> +Unlike the others, ztree works well with objects of various sizes -
> +both highly compressed and poorly compressed, including cases where
> +both types are present.
> +
> +Tests
> +=====

I don't think the sections below belong to the Documentation. IMO they
are more suitable to the changelog.

> +
> +Test platform
> +-------------
> +
> +QEMU arm64 virtual board with Debian 11.
> +
> +Kernel
> +------
> +
> +Linux 5.17-rc6 with the ztree and zram-over-zpool patches.
> +Additionally, counters and time measurements using ktime_get_ns()
> +have been added to the zpool API.
> +
> +Tools
> +-----
> +
> +ZRAM disks of size 1000M/1500M/2G, fio 3.25.
> +
> +Test description
> +----------------
> +
> +Run 2 fio scripts in parallel - one with VALUE=50, the other with
> +VALUE=70. This emulates page content heterogeneity.
> +
> +fio --bs=4k --randrepeat=1 --randseed=100 --refill_buffers \
> +    --scramble_buffers=1 --buffer_compress_percentage=VALUE \
> +    --direct=1 --loops=1 --numjobs=1 --filename=/dev/zram0 \
> +    --name=seq-write --rw=write --stonewall --name=seq-read \
> +    --rw=read --stonewall --name=seq-readwrite --rw=rw --stonewall \
> +    --name=rand-readwrite --rw=randrw --stonewall
> +
> +Results
> +-------
> +
> +ztree
> +~~~~~
> +
> +* average malloc time (us): 3.8
> +* average free time (us): 3.1
> +* average map time (us): 4.5
> +* average unmap time (us): 1.2
> +* worst zpool op time (us): ~2200
> +* total zpool ops exceeding 1000 us: 29
> +
> +zsmalloc
> +~~~~~~~~
> +
> +* average malloc time (us): 10.3
> +* average free time (us): 6.5
> +* average map time (us): 3.2
> +* average unmap time (us): 1.2
> +* worst zpool op time (us): ~6200
> +* total zpool ops exceeding 1000 us: 1031
> +
> +z3fold
> +~~~~~~
> +
> +* average malloc time (us): 20.8
> +* average free time (us): 29.9
> +* average map time (us): 3.4
> +* average unmap time (us): 1.4
> +* worst zpool op time (us): ~4900
> +* total zpool ops exceeding 1000 us: 100
> +
> +zbud
> +~~~~
> +
> +* average malloc time (us): 8.1
> +* average free time (us): 4.0
> +* average map time (us): 0.3
> +* average unmap time (us): 0.3
> +* worst zpool op time (us): ~9400
> +* total zpool ops exceeding 1000 us: 727

-- 
Sincerely yours,
Mike.