From: Nhat Pham <nphamcs@gmail.com>
Date: Tue, 1 Apr 2025 11:24:15 -0700
Subject: Re: [PATCH] mm: add zblock allocator
To: Vitaly Wool
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, Igor Belousov
In-Reply-To: <20250401171754.2686501-1-vitaly.wool@konsulko.se>
On Tue, Apr 1, 2025 at 10:18 AM Vitaly Wool wrote:
>
> zblock is a special purpose allocator for storing compressed pages.
> It stores an integer number of compressed objects per block. These
> blocks consist of several physical pages (2**n, i.e. 1/2/4/8).

Haven't taken a close look yet, but as a general principle I don't
mind having a separate allocator for a separate use case. Some quick
notes (will do a careful review later):

>
> With zblock, it is possible to densely arrange objects of various sizes
> resulting in low internal fragmentation. Also this allocator tries to
> fill incomplete blocks instead of adding new ones, in many cases
> providing a compression ratio substantially higher than z3fold and zbud
> (though lower than zsmalloc's).

Do we have data for comparison here?
>
> zblock does not require MMU to operate and also is superior to zsmalloc

This is not an actual meaningful distinction. CONFIG_SWAP depends on
CONFIG_MMU:

menuconfig SWAP
        bool "Support for paging of anonymous memory (swap)"
        depends on MMU && BLOCK && !ARCH_NO_SWAP

> with regard to average performance and worst execution times, thus
> allowing for better response time and real-time characteristics of the
> whole system.

By performance, do you mean latency, throughput, or storage density?

>
> E.g. on a series of stress-ng tests run on a Raspberry Pi 5, we get
> 5-10% higher value for bogo ops/s in zblock/zsmalloc comparison.
>
> Signed-off-by: Vitaly Wool
> Signed-off-by: Igor Belousov
> ---
>  Documentation/mm/zblock.rst |  22 ++
>  MAINTAINERS                 |   7 +
>  mm/Kconfig                  |   8 +
>  mm/Makefile                 |   1 +
>  mm/zblock.c                 | 492 ++++++++++++++++++++++++++++++++++++
>  5 files changed, 530 insertions(+)
>  create mode 100644 Documentation/mm/zblock.rst
>  create mode 100644 mm/zblock.c
>
> diff --git a/Documentation/mm/zblock.rst b/Documentation/mm/zblock.rst
> new file mode 100644
> index 000000000000..754b3dbb9e94
> --- /dev/null
> +++ b/Documentation/mm/zblock.rst
> @@ -0,0 +1,22 @@
> +======
> +zblock
> +======
> +
> +zblock is a special purpose allocator for storing compressed pages.
> +It stores an integer number of compressed objects per block. These
> +blocks consist of several physical pages (2**n, i.e. 1/2/4/8).
> +
> +With zblock, it is possible to densely arrange objects of various sizes
> +resulting in low internal fragmentation. Also this allocator tries to
> +fill incomplete blocks instead of adding new ones, in many cases
> +providing a compression ratio substantially higher than z3fold and zbud
> +(though lower than zsmalloc's).
> +
> +zblock does not require MMU to operate and also is superior to zsmalloc

Same note as above.
> +with regard to average performance and worst execution times, thus
> +allowing for better response time and real-time characteristics of the
> +whole system.
> +
> +E.g. on a series of stress-ng tests run on a Raspberry Pi 5, we get
> +5-10% higher value for bogo ops/s in zblock/zsmalloc comparison.
> +
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 991a33bad10e..166e9bfa04dc 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -26313,6 +26313,13 @@ F:      Documentation/networking/device_drivers/hamradio/z8530drv.rst
>  F:      drivers/net/hamradio/*scc.c
>  F:      drivers/net/hamradio/z8530.h
>
> +ZBLOCK COMPRESSED SLAB MEMORY ALLOCATOR
> +M:      Vitaly Wool
> +L:      linux-mm@kvack.org
> +S:      Maintained
> +F:      Documentation/mm/zblock.rst
> +F:      mm/zblock.c
> +
>  ZBUD COMPRESSED PAGE ALLOCATOR
>  M:      Seth Jennings
>  M:      Dan Streetman
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 1b501db06417..26b79e3c1300 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -193,6 +193,14 @@ config Z3FOLD_DEPRECATED
>           page. It is a ZBUD derivative so the simplicity and determinism are
>           still there.
>
> +config ZBLOCK
> +        tristate "Fast compression allocator with high density"
> +        depends on ZPOOL
> +        help
> +          A special purpose allocator for storing compressed pages.
> +          It is designed to store same-size compressed pages in blocks of
> +          physical pages.
> +
>  config Z3FOLD
>         tristate
>         default y if Z3FOLD_DEPRECATED=y
> diff --git a/mm/Makefile b/mm/Makefile
> index 850386a67b3e..2018455b7baa 100644
> --- a/mm/Makefile
> +++ b/mm/Makefile
> @@ -116,6 +116,7 @@ obj-$(CONFIG_ZPOOL)     += zpool.o
>  obj-$(CONFIG_ZBUD)      += zbud.o
>  obj-$(CONFIG_ZSMALLOC)  += zsmalloc.o
>  obj-$(CONFIG_Z3FOLD)    += z3fold.o
> +obj-$(CONFIG_ZBLOCK)    += zblock.o
>  obj-$(CONFIG_GENERIC_EARLY_IOREMAP) += early_ioremap.o
>  obj-$(CONFIG_CMA)       += cma.o
>  obj-$(CONFIG_NUMA) += numa.o
> diff --git a/mm/zblock.c b/mm/zblock.c
> new file mode 100644
> index 000000000000..a6778653c451
> --- /dev/null
> +++ b/mm/zblock.c
> @@ -0,0 +1,492 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * zblock.c
> + *
> + * Author: Vitaly Wool
> + * Based on the work from Ananda Badmaev
> + * Copyright (C) 2022-2024, Konsulko AB.
> + *
> + * Zblock is a small object allocator with the intention to serve as a
> + * zpool backend. It operates on page blocks which consist of number
> + * of physical pages being a power of 2 and store integer number of
> + * compressed pages per block which results in determinism and simplicity.
> + *
> + * zblock doesn't export any API and is meant to be used via zpool API.
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#define SLOT_FREE 0
> +#define BIT_SLOT_OCCUPIED 0
> +#define BIT_SLOT_MAPPED 1
> +
> +#define SLOT_BITS 5
> +#define MAX_SLOTS (1 << SLOT_BITS)
> +#define SLOT_MASK ((0x1UL << SLOT_BITS) - 1)
> +
> +#define ZBLOCK_HEADER_SIZE      round_up(sizeof(struct zblock_block), sizeof(long))
> +#define BLOCK_DATA_SIZE(order) ((PAGE_SIZE << order) - ZBLOCK_HEADER_SIZE)
> +#define SLOT_SIZE(nslots, order) (round_down((BLOCK_DATA_SIZE(order) / nslots), sizeof(long)))
> +
> +#define BLOCK_CACHE_SIZE 32
> +
> +struct zblock_pool;
> +
> +/**
> + * struct zblock_block - block metadata
> + * Block consists of several (1/2/4/8) pages and contains fixed
> + * integer number of slots for allocating compressed pages.
> + *
> + * free_slots: number of free slots in the block
> + * slot_info:  contains data about free/occupied slots
> + * cache_idx:  index of the block in cache
> + */
> +struct zblock_block {
> +        atomic_t free_slots;
> +        u64 slot_info[1];
> +        int cache_idx;
> +};
> +
> +/**
> + * struct block_desc - general metadata for block lists
> + * Each block list stores only blocks of corresponding type which means
> + * that all blocks in it have the same number and size of slots.
> + * All slots are aligned to size of long.
> + *
> + * slot_size:       size of slot for this list
> + * slots_per_block: number of slots per block for this list
> + * order:           order for __get_free_pages
> + */
> +static const struct block_desc {
> +        const unsigned int slot_size;
> +        const unsigned short slots_per_block;
> +        const unsigned short order;
> +} block_desc[] = {
> +        { SLOT_SIZE(32, 0), 32, 0 },
> +        { SLOT_SIZE(22, 0), 22, 0 },
> +        { SLOT_SIZE(17, 0), 17, 0 },
> +        { SLOT_SIZE(13, 0), 13, 0 },
> +        { SLOT_SIZE(11, 0), 11, 0 },
> +        { SLOT_SIZE(9, 0), 9, 0 },
> +        { SLOT_SIZE(8, 0), 8, 0 },
> +        { SLOT_SIZE(14, 1), 14, 1 },
> +        { SLOT_SIZE(12, 1), 12, 1 },
> +        { SLOT_SIZE(11, 1), 11, 1 },
> +        { SLOT_SIZE(10, 1), 10, 1 },
> +        { SLOT_SIZE(9, 1), 9, 1 },
> +        { SLOT_SIZE(8, 1), 8, 1 },
> +        { SLOT_SIZE(15, 2), 15, 2 },
> +        { SLOT_SIZE(14, 2), 14, 2 },
> +        { SLOT_SIZE(13, 2), 13, 2 },
> +        { SLOT_SIZE(12, 2), 12, 2 },
> +        { SLOT_SIZE(11, 2), 11, 2 },
> +        { SLOT_SIZE(10, 2), 10, 2 },
> +        { SLOT_SIZE(9, 2), 9, 2 },
> +        { SLOT_SIZE(8, 2), 8, 2 },
> +        { SLOT_SIZE(15, 3), 15, 3 },
> +        { SLOT_SIZE(14, 3), 14, 3 },
> +        { SLOT_SIZE(13, 3), 13, 3 },
> +        { SLOT_SIZE(12, 3), 12, 3 },
> +        { SLOT_SIZE(11, 3), 11, 3 },
> +        { SLOT_SIZE(10, 3), 10, 3 },
> +        { SLOT_SIZE(9, 3), 9, 3 },
> +        { SLOT_SIZE(7, 3), 7, 3 }
> +};
> +
> +/**
> + * struct block_list - stores metadata of particular list
> + * lock:        protects block_cache
> + * block_cache: blocks with free slots
> + * block_count: total number of blocks in the list
> + */
> +struct block_list {
> +        spinlock_t lock;
> +        struct zblock_block *block_cache[BLOCK_CACHE_SIZE];
> +        unsigned long block_count;
> +};
> +
> +/**
> + * struct zblock_pool - stores metadata for each zblock pool
> + * @block_lists:        array of block lists
> + * @zpool:              zpool driver
> + * @alloc_flag:         protects block allocation from memory leak
> + *
> + * This structure is allocated at pool creation time and maintains metadata
> + * for a particular zblock pool.
> + */
> +struct zblock_pool {
> +        struct block_list block_lists[ARRAY_SIZE(block_desc)];
> +        struct zpool *zpool;
> +        atomic_t alloc_flag;
> +};
> +
> +/*****************
> + * Helpers
> + *****************/
> +
> +static int cache_insert_block(struct zblock_block *block, struct block_list *list)
> +{
> +        unsigned int i, min_free_slots = atomic_read(&block->free_slots);
> +        int min_index = -1;
> +
> +        if (WARN_ON(block->cache_idx != -1))
> +                return -EINVAL;
> +
> +        min_free_slots = atomic_read(&block->free_slots);
> +        for (i = 0; i < BLOCK_CACHE_SIZE; i++) {
> +                if (!list->block_cache[i] || !atomic_read(&(list->block_cache[i])->free_slots)) {
> +                        min_index = i;
> +                        break;
> +                }
> +                if (atomic_read(&(list->block_cache[i])->free_slots) < min_free_slots) {
> +                        min_free_slots = atomic_read(&(list->block_cache[i])->free_slots);
> +                        min_index = i;
> +                }
> +        }
> +        if (min_index >= 0) {
> +                if (list->block_cache[min_index])
> +                        (list->block_cache[min_index])->cache_idx = -1;
> +                list->block_cache[min_index] = block;
> +                block->cache_idx = min_index;
> +        }
> +        return min_index < 0 ? min_index : 0;
> +}
> +
> +static struct zblock_block *cache_find_block(struct block_list *list)
> +{
> +        int i;
> +        struct zblock_block *z = NULL;
> +
> +        for (i = 0; i < BLOCK_CACHE_SIZE; i++) {
> +                if (list->block_cache[i] &&
> +                    atomic_dec_if_positive(&list->block_cache[i]->free_slots) >= 0) {
> +                        z = list->block_cache[i];
> +                        break;
> +                }
> +        }
> +        return z;
> +}
> +
> +static int cache_remove_block(struct block_list *list, struct zblock_block *block)
> +{
> +        int idx = block->cache_idx;
> +
> +        block->cache_idx = -1;
> +        if (idx >= 0)
> +                list->block_cache[idx] = NULL;
> +        return idx < 0 ? idx : 0;
> +}
> +
> +/*
> + * Encodes the handle of a particular slot in the pool using metadata
> + */
> +static inline unsigned long metadata_to_handle(struct zblock_block *block,
> +                                unsigned int block_type, unsigned int slot)
> +{
> +        return (unsigned long)(block) + (block_type << SLOT_BITS) + slot;
> +}
> +
> +/* Returns block, block type and slot in the pool corresponding to handle */
> +static inline struct zblock_block *handle_to_metadata(unsigned long handle,
> +                                unsigned int *block_type, unsigned int *slot)
> +{
> +        *block_type = (handle & (PAGE_SIZE - 1)) >> SLOT_BITS;
> +        *slot = handle & SLOT_MASK;
> +        return (struct zblock_block *)(handle & PAGE_MASK);
> +}
> +
> +
> +/*
> + * allocate new block and add it to corresponding block list
> + */
> +static struct zblock_block *alloc_block(struct zblock_pool *pool,
> +                                        int block_type, gfp_t gfp,
> +                                        unsigned long *handle)
> +{
> +        struct zblock_block *block;
> +        struct block_list *list;
> +
> +        block = (void *)__get_free_pages(gfp, block_desc[block_type].order);
> +        if (!block)
> +                return NULL;
> +
> +        list = &(pool->block_lists)[block_type];
> +
> +        /* init block data */
> +        memset(&block->slot_info, 0, sizeof(block->slot_info));
> +        atomic_set(&block->free_slots, block_desc[block_type].slots_per_block - 1);
> +        block->cache_idx = -1;
> +        set_bit(BIT_SLOT_OCCUPIED, (unsigned long *)block->slot_info);
> +        *handle = metadata_to_handle(block, block_type, 0);
> +
> +        spin_lock(&list->lock);
> +        cache_insert_block(block, list);
> +        list->block_count++;
> +        spin_unlock(&list->lock);
> +        return block;
> +}
> +
> +/*****************
> + * API Functions
> + *****************/
> +/**
> + * zblock_create_pool() - create a new zblock pool
> + * @gfp:        gfp flags when allocating the zblock pool structure
> + * @ops:        user-defined operations for the zblock pool
> + *
> + * Return: pointer to the new zblock pool or NULL if the metadata allocation
> + * failed.
> + */
> +static struct zblock_pool *zblock_create_pool(gfp_t gfp)
> +{
> +        struct zblock_pool *pool;
> +        struct block_list *list;
> +        int i, j;
> +
> +        pool = kmalloc(sizeof(struct zblock_pool), gfp);
> +        if (!pool)
> +                return NULL;
> +
> +        /* init each block list */
> +        for (i = 0; i < ARRAY_SIZE(block_desc); i++) {
> +                list = &(pool->block_lists)[i];
> +                spin_lock_init(&list->lock);
> +                for (j = 0; j < BLOCK_CACHE_SIZE; j++)
> +                        list->block_cache[j] = NULL;
> +                list->block_count = 0;
> +        }
> +        atomic_set(&pool->alloc_flag, 0);
> +        return pool;
> +}
> +
> +/**
> + * zblock_destroy_pool() - destroys an existing zblock pool
> + * @pool:       the zblock pool to be destroyed
> + *
> + */
> +static void zblock_destroy_pool(struct zblock_pool *pool)
> +{
> +        kfree(pool);
> +}
> +
> +
> +/**
> + * zblock_alloc() - allocates a slot of appropriate size
> + * @pool:       zblock pool from which to allocate
> + * @size:       size in bytes of the desired allocation
> + * @gfp:        gfp flags used if the pool needs to grow
> + * @handle:     handle of the new allocation
> + *
> + * Return: 0 if success and handle is set, otherwise -EINVAL if the size or
> + * gfp arguments are invalid or -ENOMEM if the pool was unable to allocate
> + * a new slot.
> + */
> +static int zblock_alloc(struct zblock_pool *pool, size_t size, gfp_t gfp,
> +                        unsigned long *handle)
> +{
> +        unsigned int block_type, slot;
> +        struct zblock_block *block;
> +        struct block_list *list;
> +
> +        if (!size)
> +                return -EINVAL;
> +
> +        if (size > PAGE_SIZE)
> +                return -ENOSPC;
> +
> +        /* find basic block type with suitable slot size */
> +        for (block_type = 0; block_type < ARRAY_SIZE(block_desc); block_type++) {
> +                if (size <= block_desc[block_type].slot_size)
> +                        break;
> +        }
> +        list = &(pool->block_lists[block_type]);
> +
> +check:
> +        /* check if there are free slots in cache */
> +        spin_lock(&list->lock);
> +        block = cache_find_block(list);
> +        spin_unlock(&list->lock);
> +        if (block)
> +                goto found;
> +
> +        /* not found block with free slots try to allocate new empty block */
> +        if (atomic_cmpxchg(&pool->alloc_flag, 0, 1))
> +                goto check;
> +        block = alloc_block(pool, block_type, gfp, handle);
> +        atomic_set(&pool->alloc_flag, 0);
> +        if (block)
> +                return 0;
> +        return -ENOMEM;
> +
> +found:
> +        /* find the first free slot in block */
> +        for (slot = 0; slot < block_desc[block_type].slots_per_block; slot++) {
> +                if (!test_and_set_bit(slot*2 + BIT_SLOT_OCCUPIED,
> +                                      (unsigned long *)&block->slot_info))
> +                        break;
> +        }
> +        *handle = metadata_to_handle(block, block_type, slot);
> +        return 0;
> +}
> +
> +/**
> + * zblock_free() - frees the allocation associated with the given handle
> + * @pool:       pool in which the allocation resided
> + * @handle:     handle associated with the allocation returned by zblock_alloc()
> + *
> + */
> +static void zblock_free(struct zblock_pool *pool, unsigned long handle)
> +{
> +        unsigned int slot, block_type;
> +        struct zblock_block *block;
> +        struct block_list *list;
> +
> +        block = handle_to_metadata(handle, &block_type, &slot);
> +        list = &(pool->block_lists[block_type]);
> +
> +        spin_lock(&list->lock);
> +        /* if all slots in block are empty delete whole block */
> +        if (atomic_inc_return(&block->free_slots) == block_desc[block_type].slots_per_block) {
> +                list->block_count--;
> +                cache_remove_block(list, block);
> +                spin_unlock(&list->lock);
> +                free_pages((unsigned long)block, block_desc[block_type].order);
> +                return;
> +        }
> +
> +        if (atomic_read(&block->free_slots) < block_desc[block_type].slots_per_block/2
> +                        && block->cache_idx == -1)
> +                cache_insert_block(block, list);
> +        spin_unlock(&list->lock);
> +
> +        clear_bit(slot*2 + BIT_SLOT_OCCUPIED, (unsigned long *)block->slot_info);
> +}
> +
> +/**
> + * zblock_map() - maps the allocation associated with the given handle
> + * @pool:       pool in which the allocation resides
> + * @handle:     handle associated with the allocation to be mapped
> + *
> + *
> + * Returns: a pointer to the mapped allocation
> + */
> +static void *zblock_map(struct zblock_pool *pool, unsigned long handle)
> +{
> +        unsigned int block_type, slot;
> +        struct zblock_block *block;
> +        void *p;
> +
> +        block = handle_to_metadata(handle, &block_type, &slot);
> +        p = (void *)block + ZBLOCK_HEADER_SIZE + slot * block_desc[block_type].slot_size;
> +        return p;
> +}
> +
> +/**
> + * zblock_unmap() - unmaps the allocation associated with the given handle
> + * @pool:       pool in which the allocation resides
> + * @handle:     handle associated with the allocation to be unmapped
> + */
> +static void zblock_unmap(struct zblock_pool *pool, unsigned long handle)
> +{
> +}
> +
> +/**
> + * zblock_get_total_pages() - gets the zblock pool size in pages
> + * @pool:       pool being queried
> + *
> + * Returns: size in pages of the given pool.
> + */
> +static u64 zblock_get_total_pages(struct zblock_pool *pool)
> +{
> +        u64 total_size;
> +        int i;
> +
> +        total_size = 0;
> +        for (i = 0; i < ARRAY_SIZE(block_desc); i++)
> +                total_size += pool->block_lists[i].block_count << block_desc[i].order;
> +
> +        return total_size;
> +}
> +
> +/*****************
> + * zpool
> + ****************/
> +
> +static void *zblock_zpool_create(const char *name, gfp_t gfp)
> +{
> +        return zblock_create_pool(gfp);
> +}
> +
> +static void zblock_zpool_destroy(void *pool)
> +{
> +        zblock_destroy_pool(pool);
> +}
> +
> +static int zblock_zpool_malloc(void *pool, size_t size, gfp_t gfp,
> +                        unsigned long *handle)
> +{
> +        return zblock_alloc(pool, size, gfp, handle);
> +}
> +
> +static void zblock_zpool_free(void *pool, unsigned long handle)
> +{
> +        zblock_free(pool, handle);
> +}
> +
> +static void *zblock_zpool_map(void *pool, unsigned long handle,
> +                        enum zpool_mapmode mm)
> +{
> +        return zblock_map(pool, handle);
> +}
> +
> +static void zblock_zpool_unmap(void *pool, unsigned long handle)
> +{
> +        zblock_unmap(pool, handle);
> +}
> +
> +static u64 zblock_zpool_total_pages(void *pool)
> +{
> +        return zblock_get_total_pages(pool);
> +}
> +
> +static struct zpool_driver zblock_zpool_driver = {
> +        .type =         "zblock",
> +        .owner =        THIS_MODULE,
> +        .create =       zblock_zpool_create,
> +        .destroy =      zblock_zpool_destroy,
> +        .malloc =       zblock_zpool_malloc,
> +        .free =         zblock_zpool_free,
> +        .map =          zblock_zpool_map,
> +        .unmap =        zblock_zpool_unmap,
> +        .total_pages =  zblock_zpool_total_pages,
> +};
> +
> +MODULE_ALIAS("zpool-zblock");
> +
> +static int __init init_zblock(void)
> +{
> +        pr_info("loaded\n");
> +        zpool_register_driver(&zblock_zpool_driver);
> +        return 0;
> +}
> +
> +static void __exit exit_zblock(void)
> +{
> +        zpool_unregister_driver(&zblock_zpool_driver);
> +        pr_info("unloaded\n");
> +}
> +
> +module_init(init_zblock);
> +module_exit(exit_zblock);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Vitaly Wool ");
> +MODULE_DESCRIPTION("Block allocator for compressed pages");
> --
> 2.39.2
>