Date: Mon, 7 Apr 2025 11:54:09 -0400
From: Johannes Weiner <hannes@cmpxchg.org>
To: Vitaly Wool
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	Nhat Pham, Shakeel Butt, Igor Belousov
Subject: Re: [PATCH v2] mm: add zblock allocator
Message-ID: <20250407155409.GB827@cmpxchg.org>
References: <20250404192813.925835-1-vitaly.wool@konsulko.se>
In-Reply-To: <20250404192813.925835-1-vitaly.wool@konsulko.se>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Fri, Apr 04, 2025 at 09:28:13PM +0200, Vitaly Wool wrote:
> @@ -0,0 +1,24 @@
> +. SPDX-License-Identifier: GPL-2.0
> +
> +======
> +zblock
> +======
> +
> +zblock is a special purpose allocator for storing compressed pages.
> +It stores integer number of compressed objects per its block. These
> +blocks consist of several physical pages (2**n, i. e. 1/2/4/8).
> +
> +With zblock, it is possible to densely arrange objects of various sizes
> +resulting in low internal fragmentation. Also this allocator tries to
> +fill incomplete blocks instead of adding new ones, in many cases
> +providing a compression ratio substantially higher than z3fold and zbud
> +(though lower than zmalloc's).

This last part seems to be incorrect based on Igor's measurements.

> +zblock does not require MMU to operate and also is superior to zsmalloc
> +with regard to average performance and worst execution times, thus
> +allowing for better response time and real-time characteristics of the
> +whole system.

Please delete the MMU comment here and in the changelog. As Nhat said,
this is a pointless advantage of a swap backend allocator, as swapping
itself requires an MMU.

Please also add that highmem is not supported.

Please also add that migration is not supported, which has a negative
impact on the kernel's ability to produce higher-order pages. This is
particularly unfortunate because zblock itself wants higher-order pages.

> +E. g. on a series of stress-ng tests run on a Raspberry Pi 5, we get
> +5-10% higher value for bogo ops/s in zblock/zsmalloc comparison.
> +
> diff --git a/MAINTAINERS b/MAINTAINERS
> index d5dfb9186962..19283e6a08eb 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -26556,6 +26556,13 @@ F: Documentation/networking/device_drivers/hamradio/z8530drv.rst
> F: drivers/net/hamradio/*scc.c
> F: drivers/net/hamradio/z8530.h
>
> +ZBLOCK COMPRESSED SLAB MEMORY ALLOCATOR
> +M: Vitaly Wool
> +L: linux-mm@kvack.org
> +S: Maintained
> +F: Documentation/mm/zblock.rst
> +F: mm/zblock.[ch]
> +
> ZBUD COMPRESSED PAGE ALLOCATOR
> M: Seth Jennings
> M: Dan Streetman

It looks like you're using an old tree. Can you please rebase on top of
the mm tree? There have been some significant changes to how zswap is
calling into zpool since the other allocators have been deleted.

> @@ -0,0 +1,418 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * zblock.c
> + *
> + * Author: Vitaly Wool
> + * Based on the work from Ananda Badmaev
> + * Copyright (C) 2022-2025, Konsulko AB.
> + *
> + * Zblock is a small object allocator with the intention to serve as a
> + * zpool backend. It operates on page blocks which consist of number
> + * of physical pages being a power of 2 and store integer number of
> + * compressed pages per block which results in determinism and simplicity.
> + *
> + * zblock doesn't export any API and is meant to be used via zpool API.
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include "zblock.h"
> +
> +static struct rb_root block_desc_tree = RB_ROOT;
> +
> +/* add a new block to the list */
> +static inline void add_block(struct zblock_block *block,
> +			struct block_list *block_list)
> +{
> +	list_add(&block->link, &block_list->list);
> +}
> +
> +/*
> + * Find a block with at least one free slot and claim it.
> + * We make sure that the first block, if exists, will always work.
> + */
> +static inline struct zblock_block *find_block(struct block_list *block_list)
> +{
> +	struct list_head *l = &block_list->list;
> +
> +	if (!list_empty(l)) {
> +		struct zblock_block *z = list_first_entry(l, typeof(*z), link);
> +
> +		if (z->free_slots > 0) {
> +			if (--z->free_slots == 0)
> +				list_move_tail(&z->link, l);
> +			return z;
> +		}
> +	}
> +	return NULL;
> +}
> +
> +/* remove block from the list */
> +static inline void remove_block(struct zblock_block *block)
> +{
> +	list_del_init(&block->link);
> +}
> +
> +/* Encodes the handle of a particular slot in the pool using metadata */
> +static inline unsigned long metadata_to_handle(struct zblock_block *block,
> +			unsigned int block_type, unsigned int slot)
> +{
> +	return (unsigned long)(block) | (block_type << SLOT_BITS) | slot;
> +}
> +
> +/* Returns block, block type and slot in the pool corresponding to handle */
> +static inline struct zblock_block *handle_to_metadata(unsigned long handle,
> +			unsigned int *block_type, unsigned int *slot)
> +{
> +	*block_type = (handle & (PAGE_SIZE - 1)) >> SLOT_BITS;
> +	*slot = handle & SLOT_MASK;
> +	return (struct zblock_block *)(handle & PAGE_MASK);
> +}
> +
> +
> +/*
> + * allocate new block and add it to corresponding block list
> + */
> +static struct zblock_block *alloc_block(struct zblock_pool *pool,
> +					int block_type, gfp_t gfp,
> +					unsigned long *handle)
> +{
> +	struct zblock_block *block;
> +	struct block_list *block_list;
> +
> +	block = (void *)__get_free_pages(gfp, block_desc[block_type].order);

This gfp comes from zswap, which currently does:

	gfp = GFP_NOWAIT | __GFP_NORETRY | __GFP_HIGHMEM | __GFP_MOVABLE;

So you might get highmem here that cannot be dereferenced and crash.

__GFP_MOVABLE is also wrong, as you're not implementing migration.
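
FWIW, a minimal way to avoid both issues on the zblock side (just a
sketch, not necessarily the final answer) would be to mask off what
zblock cannot honor before going to the page allocator:

	/*
	 * Sketch: zblock dereferences the block directly and doesn't
	 * implement migration, so strip highmem and movable placement
	 * regardless of what the caller passed in.
	 */
	gfp &= ~(__GFP_HIGHMEM | __GFP_MOVABLE);
	block = (void *)__get_free_pages(gfp, block_desc[block_type].order);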
> +	if (!block)
> +		return NULL;
> +
> +	block_list = &pool->block_lists[block_type];
> +
> +	/* init block data */
> +	block->free_slots = block_desc[block_type].slots_per_block - 1;
> +	memset(&block->slot_info, 0, sizeof(block->slot_info));
> +	set_bit(0, block->slot_info);
> +	*handle = metadata_to_handle(block, block_type, 0);
> +
> +	spin_lock(&block_list->lock);
> +	add_block(block, block_list);

Use list_add() directly here.

> +	block_list->block_count++;
> +	spin_unlock(&block_list->lock);
> +	return block;
> +}
> +
> +/*****************
> + * API Functions
> + *****************/
> +/**
> + * zblock_create_pool() - create a new zblock pool
> + * @gfp: gfp flags when allocating the zblock pool structure
> + * @ops: user-defined operations for the zblock pool
> + *
> + * Return: pointer to the new zblock pool or NULL if the metadata allocation
> + * failed.
> + */
> +static struct zblock_pool *zblock_create_pool(gfp_t gfp)
> +{
> +	struct zblock_pool *pool;
> +	struct block_list *block_list;
> +	int i;
> +
> +	pool = kmalloc(sizeof(struct zblock_pool), gfp);
> +	if (!pool)
> +		return NULL;
> +
> +	/* init each block list */
> +	for (i = 0; i < ARRAY_SIZE(block_desc); i++) {
> +		block_list = &pool->block_lists[i];
> +		spin_lock_init(&block_list->lock);
> +		INIT_LIST_HEAD(&block_list->list);
> +		block_list->block_count = 0;
> +	}
> +	return pool;
> +}
> +
> +/**
> + * zblock_destroy_pool() - destroys an existing zblock pool
> + * @pool: the zblock pool to be destroyed
> + *
> + */
> +static void zblock_destroy_pool(struct zblock_pool *pool)
> +{
> +	kfree(pool);
> +}
> +
> +
> +/**
> + * zblock_alloc() - allocates a slot of appropriate size
> + * @pool: zblock pool from which to allocate
> + * @size: size in bytes of the desired allocation
> + * @gfp: gfp flags used if the pool needs to grow
> + * @handle: handle of the new allocation
> + *
> + * Return: 0 if success and handle is set, otherwise -EINVAL if the size or
> + * gfp arguments are invalid or -ENOMEM if the pool was unable to allocate
> + * a new slot.
> + */
> +static int zblock_alloc(struct zblock_pool *pool, size_t size, gfp_t gfp,
> +			unsigned long *handle)
> +{
> +	int block_type = -1;
> +	unsigned int slot;
> +	struct zblock_block *block;
> +	struct block_list *block_list;
> +
> +	if (!size)
> +		return -EINVAL;
> +
> +	if (size > PAGE_SIZE)
> +		return -ENOSPC;
> +
> +	/* find basic block type with suitable slot size */
> +	if (size < block_desc[0].slot_size)
> +		block_type = 0;
> +	else {
> +		struct block_desc_node *block_node;
> +		struct rb_node *node = block_desc_tree.rb_node;
> +
> +		while (node) {
> +			block_node = container_of(node, typeof(*block_node), node);
> +			if (size < block_node->this_slot_size)
> +				node = node->rb_left;
> +			else if (size >= block_node->next_slot_size)
> +				node = node->rb_right;
> +			else {
> +				block_type = block_node->block_idx + 1;
> +				break;
> +			}
> +		}
> +	}
> +	if (WARN_ON(block_type < 0 || block_type >= ARRAY_SIZE(block_desc)))
> +		return -EINVAL;
> +
> +	block_list = &pool->block_lists[block_type];
> +
> +	spin_lock(&block_list->lock);
> +	block = find_block(block_list);
> +	spin_unlock(&block_list->lock);
> +	if (block)
> +		goto found;
> +	/* not found block with free slots try to allocate new empty block */
> +	block = alloc_block(pool, block_type, gfp, handle);
> +	return block ? 0 : -ENOMEM;
> +
> +found:
> +	/* find the first free slot in block */
> +	for (slot = find_first_zero_bit(block->slot_info,
> +			block_desc[block_type].slots_per_block);
> +	     slot < block_desc[block_type].slots_per_block;
> +	     slot = find_next_zero_bit(block->slot_info,
> +			block_desc[block_type].slots_per_block,
> +			slot)) {
> +		if (!test_and_set_bit(slot, block->slot_info))
> +			break;
> +		barrier();

Ah, so find_block() reserves a slot in block->free_slots. You can race
with another allocation for the exact bit, but there is one bit
guaranteed to remain. Clever. This could use a comment.
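
Something along these lines above the loop would do (the wording is
just a suggestion, of course):

	/*
	 * find_block() has already reserved one of this block's free
	 * slots for us by decrementing free_slots under the list lock.
	 * Concurrent allocators may beat us to any individual bit, but
	 * each of them holds its own reservation, so at least one zero
	 * bit is guaranteed to remain until we claim ours.
	 */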
> +	}
> +	BUG_ON(slot >= block_desc[block_type].slots_per_block);
> +	*handle = metadata_to_handle(block, block_type, slot);
> +	return 0;
> +}
> +
> +/**
> + * zblock_free() - frees the allocation associated with the given handle
> + * @pool: pool in which the allocation resided
> + * @handle: handle associated with the allocation returned by zblock_alloc()
> + *
> + */
> +static void zblock_free(struct zblock_pool *pool, unsigned long handle)
> +{
> +	unsigned int slot, block_type;
> +	struct zblock_block *block;
> +	struct block_list *block_list;
> +
> +	block = handle_to_metadata(handle, &block_type, &slot);
> +	block_list = &pool->block_lists[block_type];
> +
> +	spin_lock(&block_list->lock);
> +	/* if all slots in block are empty delete whole block */
> +	if (++block->free_slots == block_desc[block_type].slots_per_block) {
> +		block_list->block_count--;
> +		remove_block(block);
> +		spin_unlock(&block_list->lock);
> +		free_pages((unsigned long)block, block_desc[block_type].order);
> +		return;
> +	}
> +	spin_unlock(&block_list->lock);
> +
> +	clear_bit(slot, block->slot_info);

The list handling here is unusual. find_block() only checks the first
block in the list. Filling a block puts it at the tail, but partially
freeing a block doesn't rotate it back to the head.

AFAICS this can lead to an ugly edge case where you have an unbounded
amount of free memory trapped in partial blocks, but you keep
allocating new blocks:

You start with [new block]. Once full, list_move_tail() is a nop. You
allocate a new one, [partial]->[full]. Keep going, and you can have
[partial]->[full]->[full]->[full]->[full] until it's hundreds of
gigabytes.

The workload now unmaps randomly, partially draining many blocks. So
you have [partial]->[full]->[partial]->... with many gigabytes of free
slots in partially used blocks.

Once the first partial block fills up, it rotates to the tail:
[full]->[partial]->...->[full]. find_block() will fail. You allocate a
new block, fill it, move it to the tail. The old [full] is back at the
head. Rinse, repeat.

You could make find_block() keep going, but it might have to go through
a million [full] blocks before finding a partial one. This is the
reason why other allocator schemes, like zsmalloc e.g., have separate
partial and full lists.
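
To make that concrete, here is a rough sketch of what I mean. This is
hypothetical: the partial/full field names are made up, and the locking
and the rest of the conversion are elided.

	struct block_list {
		spinlock_t lock;
		struct list_head partial;	/* blocks with at least one free slot */
		struct list_head full;		/* blocks with no free slots */
		unsigned long block_count;
	};

	static struct zblock_block *find_block(struct block_list *block_list)
	{
		struct zblock_block *z;

		/* only blocks with free slots ever live on the partial list */
		if (list_empty(&block_list->partial))
			return NULL;
		z = list_first_entry(&block_list->partial, typeof(*z), link);
		if (--z->free_slots == 0)
			list_move(&z->link, &block_list->full);
		return z;
	}

zblock_free() then does the reverse transition: when ++block->free_slots
takes a block from zero free slots back to one, list_move() it from the
full list to the partial one. That way freed slots always become visible
to find_block() again, no matter how many full blocks have accumulated.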