Message-ID: <4d3e5d4b-502b-459b-9779-c0bf55ef2a03@suse.cz>
Date: Tue, 6 May 2025 14:01:48 +0200
Subject: Re: [PATCH 6/6] slab: Introduce kmalloc_nolock() and kfree_nolock().
To: Alexei Starovoitov, bpf@vger.kernel.org, linux-mm@kvack.org
Cc: harry.yoo@oracle.com, shakeel.butt@linux.dev, mhocko@suse.com,
 bigeasy@linutronix.de, andrii@kernel.org, memxor@gmail.com,
 akpm@linux-foundation.org, peterz@infradead.org, rostedt@goodmis.org,
 hannes@cmpxchg.org, willy@infradead.org
References: <20250501032718.65476-1-alexei.starovoitov@gmail.com>
 <20250501032718.65476-7-alexei.starovoitov@gmail.com>
From: Vlastimil Babka
In-Reply-To: <20250501032718.65476-7-alexei.starovoitov@gmail.com>

On 5/1/25 05:27, Alexei Starovoitov wrote:
> From: Alexei Starovoitov
>
> kmalloc_nolock() relies on ability of local_lock to detect the situation
> when it's locked.
> In !PREEMPT_RT local_lock_is_locked() is true only when NMI happened in
> irq saved region that protects _that specific_ per-cpu kmem_cache_cpu.
> In that case retry the operation in a different kmalloc bucket.
> The second attempt will likely succeed, since this cpu locked
> different kmem_cache_cpu.
> When lock_local_is_locked() sees locked memcg_stock.stock_lock
> fallback to atomic operations.
>
> Similarly, in PREEMPT_RT local_lock_is_locked() returns true when
> per-cpu rt_spin_lock is locked by current task. In this case re-entrance
> into the same kmalloc bucket is unsafe, and kmalloc_nolock() tries
> a different bucket that is most likely is not locked by current
> task. Though it may be locked by a different task it's safe to
> rt_spin_lock() on it.
>
> Similar to alloc_pages_nolock() the kmalloc_nolock() returns NULL
> immediately if called from hard irq or NMI in PREEMPT_RT.
>
> Signed-off-by: Alexei Starovoitov

In general I'd prefer if we could avoid local_lock_is_locked() usage
outside of debugging code. It just feels hacky given we have
local_trylock() operations. But I can see how this makes things simpler,
so it's probably acceptable.

> @@ -2458,13 +2468,21 @@ static void *setup_object(struct kmem_cache *s, void *object)
>   * Slab allocation and freeing
>   */
>  static inline struct slab *alloc_slab_page(gfp_t flags, int node,
> -		struct kmem_cache_order_objects oo)
> +		struct kmem_cache_order_objects oo,
> +		bool allow_spin)
>  {
>  	struct folio *folio;
>  	struct slab *slab;
>  	unsigned int order = oo_order(oo);
>
> -	if (node == NUMA_NO_NODE)
> +	if (unlikely(!allow_spin)) {
> +		struct page *p = alloc_pages_nolock(__GFP_COMP, node, order);
> +
> +		if (p)
> +			/* Make the page frozen. Drop refcnt to zero. */
> +			put_page_testzero(p);

This is dangerous. Once we create a refcounted (non-frozen) page,
someone else (a pfn scanner like compaction) can do a
get_page_unless_zero(), so the refcount becomes 2. Then we decrement
the refcount here to 1, the pfn scanner realizes it's not a page it can
work with, does put_page(), and frees it under us.

The solution is to split out alloc_frozen_pages_nolock() to use from
here, and make alloc_pages_nolock() use it too and then set refcounted.
(A rough sketch of what I mean is further below.)

> +		folio = (struct folio *)p;
> +	} else if (node == NUMA_NO_NODE)
>  		folio = (struct folio *)alloc_frozen_pages(flags, order);
>  	else
>  		folio = (struct folio *)__alloc_frozen_pages(flags, order, node, NULL);
> @@ -3958,8 +3989,28 @@ static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
>  	 */
>  	c = slub_get_cpu_ptr(s->cpu_slab);
>  #endif
> +	if (unlikely(!gfpflags_allow_spinning(gfpflags))) {
> +		struct slab *slab;
> +
> +		slab = c->slab;
> +		if (slab && !node_match(slab, node))
> +			/* In trylock mode numa node is a hint */
> +			node = NUMA_NO_NODE;
> +
> +		if (!local_lock_is_locked(&s->cpu_slab->lock)) {
> +			lockdep_assert_not_held(this_cpu_ptr(&s->cpu_slab->lock));
> +		} else {
> +			/*
> +			 * EBUSY is an internal signal to kmalloc_nolock() to
> +			 * retry a different bucket. It's not propagated further.
> +			 */
> +			p = ERR_PTR(-EBUSY);
> +			goto out;

Am I right in my reasoning as follows?

- If we're on RT and "in_nmi() || in_hardirq()" is true then
  kmalloc_nolock_noprof() would return NULL immediately and we never
  reach this code
- local_lock_is_locked() on RT tests if the current process is the lock
  owner. This means (in absence of double locking bugs) that we locked
  it as task (or hardirq) and now we're either in_hardirq() (doesn't
  change current AFAIK?) preempting task, or in_nmi() preempting task or
  hardirq.
- so local_lock_is_locked() will never be true here on RT

> +		}
> +	}
>
>  	p = ___slab_alloc(s, gfpflags, node, addr, c, orig_size);
> +out:
>  #ifdef CONFIG_PREEMPT_COUNT
>  	slub_put_cpu_ptr(s->cpu_slab);
>  #endif
> @@ -4162,8 +4213,9 @@ bool slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
>  		if (p[i] && init && (!kasan_init ||
>  				     !kasan_has_integrated_init()))
>  			memset(p[i], 0, zero_size);
> -		kmemleak_alloc_recursive(p[i], s->object_size, 1,
> -					 s->flags, init_flags);
> +		if (gfpflags_allow_spinning(flags))
> +			kmemleak_alloc_recursive(p[i], s->object_size, 1,
> +						 s->flags, init_flags);
>  		kmsan_slab_alloc(s, p[i], init_flags);
>  		alloc_tagging_slab_alloc_hook(s, p[i], flags);
>  	}
> @@ -4354,6 +4406,88 @@ void *__kmalloc_noprof(size_t size, gfp_t flags)
>  }
>  EXPORT_SYMBOL(__kmalloc_noprof);
>
> +/**
> + * kmalloc_nolock - Allocate an object of given size from any context.
> + * @size: size to allocate
> + * @gfp_flags: GFP flags. Only __GFP_ACCOUNT, __GFP_ZERO allowed.
> + * @node: node number of the target node.
> + *
> + * Return: pointer to the new object or NULL in case of error.
> + * NULL does not mean EBUSY or EAGAIN. It means ENOMEM.
> + * There is no reason to call it again and expect !NULL.
> + */
> +void *kmalloc_nolock_noprof(size_t size, gfp_t gfp_flags, int node)
> +{
> +	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_NOMEMALLOC | gfp_flags;
> +	struct kmem_cache *s;
> +	bool can_retry = true;
> +	void *ret = ERR_PTR(-EBUSY);
> +
> +	VM_WARN_ON_ONCE(gfp_flags & ~(__GFP_ACCOUNT | __GFP_ZERO));
> +
> +	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
> +		return NULL;
> +	if (unlikely(!size))
> +		return ZERO_SIZE_PTR;
> +
> +	if (!USE_LOCKLESS_FAST_PATH() && (in_nmi() || in_hardirq()))
> +		/* kmalloc_nolock() in PREEMPT_RT is not supported from irq */
> +		return NULL;
> +retry:
> +	s = kmalloc_slab(size, NULL, alloc_gfp, _RET_IP_);

The idea of retrying on a different bucket is based on wrong assumptions
and thus won't work as you expect. kmalloc_slab() doesn't select buckets
truly randomly, but deterministically via hashing from a random per-boot
seed and the _RET_IP_, as the security hardening goal is to make
different kmalloc() callsites get different caches with high
probability. (A rough sketch of that index computation is below, after
the rest of my comments on this function.)

And I also wouldn't recommend changing this for the
kmalloc_nolock_noprof() case, as that could make the hardening weaker,
and it also wouldn't help for kernels that don't have it enabled,
anyway.

> +
> +	if (!(s->flags & __CMPXCHG_DOUBLE))
> +		/*
> +		 * kmalloc_nolock() is not supported on architectures that
> +		 * don't implement cmpxchg16b.
> +		 */
> +		return NULL;
> +
> +	/*
> +	 * Do not call slab_alloc_node(), since trylock mode isn't
> +	 * compatible with slab_pre_alloc_hook/should_failslab and
> +	 * kfence_alloc.
> +	 *
> +	 * In !PREEMPT_RT ___slab_alloc() manipulates (freelist,tid) pair
> +	 * in irq saved region. It assumes that the same cpu will not
> +	 * __update_cpu_freelist_fast() into the same (freelist,tid) pair.
> +	 * Therefore use in_nmi() to check whether particular bucket is in
> +	 * irq protected section.
> +	 */
> +	if (!in_nmi() || !local_lock_is_locked(&s->cpu_slab->lock))
> +		ret = __slab_alloc_node(s, alloc_gfp, node, _RET_IP_, size);

Hm this is somewhat subtle. We're testing the local lock without having
the cpu explicitly pinned. But the test only happens in_nmi(), which
implicitly is a context that won't migrate, so it should work I think,
but maybe it should be made more explicit in the comment?
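To make the alloc_frozen_pages_nolock() suggestion above more concrete,
roughly something like this (just a sketch; alloc_frozen_pages_nolock()
doesn't exist yet and its body would be whatever alloc_pages_nolock()
does today, minus making the page refcounted):

/*
 * The frozen variant returns the page with refcount 0, so a pfn scanner
 * can never observe a transient refcounted state. The existing public
 * API then becomes a trivial wrapper.
 */
struct page *alloc_pages_nolock(gfp_t gfp_flags, int nid, unsigned int order)
{
	struct page *page = alloc_frozen_pages_nolock(gfp_flags, nid, order);

	if (page)
		set_page_refcounted(page);
	return page;
}

and alloc_slab_page() would call alloc_frozen_pages_nolock() directly,
with no put_page_testzero() dance needed.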
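As for the bucket selection, this is roughly (from memory, not the
exact code) what the index computation boils down to when
CONFIG_RANDOM_KMALLOC_CACHES is enabled:

	/*
	 * kmalloc_type(): the caller address is hashed together with a
	 * per-boot random seed, so a given callsite maps to the same
	 * bucket on every call for the whole uptime.
	 */
	return KMALLOC_RANDOM_START +
	       hash_64(caller ^ random_kmalloc_seed,
		       ilog2(RANDOM_KMALLOC_CACHES_NR + 1));

Retrying with the same _RET_IP_ therefore just recomputes the same
index, and without the config option there's only the single
KMALLOC_NORMAL bucket per size to begin with.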
>  /*
>   * Fastpath with forced inlining to produce a kfree and kmem_cache_free that
>   * can perform fastpath freeing without additional function calls.
> @@ -4605,10 +4762,36 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
>  	barrier();
>
>  	if (unlikely(slab != c->slab)) {

Note this unlikely() is actually a lie. It's in fact unlikely that the
free will happen on the same cpu with the same slab still being c->slab,
unless it's a free that shortly follows a temporary object allocation.

> -		__slab_free(s, slab, head, tail, cnt, addr);
> +		/* cnt == 0 signals that it's called from kfree_nolock() */
> +		if (unlikely(!cnt)) {
> +			/*
> +			 * Use llist in cache_node ?
> +			 * struct kmem_cache_node *n = get_node(s, slab_nid(slab));
> +			 */
> +			/*
> +			 * __slab_free() can locklessly cmpxchg16 into a slab,
> +			 * but then it might need to take spin_lock or local_lock
> +			 * in put_cpu_partial() for further processing.
> +			 * Avoid the complexity and simply add to a deferred list.
> +			 */
> +			llist_add(head, &s->defer_free_objects);
> +		} else {
> +			free_deferred_objects(&s->defer_free_objects, addr);

So I'm a bit wary that this is actually rather a fast path that might
contend on the defer_free_objects list from all cpus. I'm wondering if
we could make the list part of kmem_cache_cpu to distribute it, and
hook the flushing e.g. to places where we do deactivate_slab(), which
should be a much slower path, and also free_to_partial_list() to handle
SLUB_TINY/caches with debugging enabled. (A rough sketch of what I mean
is at the end of this mail.)

> +			__slab_free(s, slab, head, tail, cnt, addr);
> +		}
>  		return;
>  	}
>
> +	if (unlikely(!cnt)) {
> +		if ((in_nmi() || !USE_LOCKLESS_FAST_PATH()) &&
> +		    local_lock_is_locked(&s->cpu_slab->lock)) {
> +			llist_add(head, &s->defer_free_objects);
> +			return;
> +		}
> +		cnt = 1;
> +		kasan_slab_free(s, head, false, false, true);
> +	}
> +
>  	if (USE_LOCKLESS_FAST_PATH()) {
>  		freelist = READ_ONCE(c->freelist);
>
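Finally, the deferred-list idea above as a very rough sketch (field and
helper names are made up here, and the SLUB_TINY/kasan/accounting
details are ignored):

struct kmem_cache_cpu {
	/* ... existing fields ... */
	struct llist_head defer_free;	/* objects queued by kfree_nolock() */
};

static void flush_deferred_free(struct kmem_cache *s, struct kmem_cache_cpu *c)
{
	struct llist_node *head, *pos, *t;

	if (llist_empty(&c->defer_free))
		return;

	head = llist_del_all(&c->defer_free);
	llist_for_each_safe(pos, t, head)
		/* we're in a context where taking the locks is fine again */
		__slab_free(s, virt_to_slab(pos), pos, pos, 1, _RET_IP_);
}

Called from e.g. deactivate_slab() and free_to_partial_list(), the
flushing would piggyback on paths that are slow anyway instead of
running on every do_slab_free().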