From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1D4C2C02181 for ; Fri, 24 Jan 2025 12:48:55 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 8783128004D; Fri, 24 Jan 2025 07:48:54 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 801166B0128; Fri, 24 Jan 2025 07:48:54 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 67A4528004D; Fri, 24 Jan 2025 07:48:54 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 457F96B00A4 for ; Fri, 24 Jan 2025 07:48:54 -0500 (EST) Received: from smtpin09.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id E12A21C8850 for ; Fri, 24 Jan 2025 12:48:03 +0000 (UTC) X-FDA: 83042322846.09.4B2C035 Received: from mail-lf1-f42.google.com (mail-lf1-f42.google.com [209.85.167.42]) by imf23.hostedemail.com (Postfix) with ESMTP id D3921140015 for ; Fri, 24 Jan 2025 12:48:01 +0000 (UTC) Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b="g3/9eQxr"; spf=pass (imf23.hostedemail.com: domain of urezki@gmail.com designates 209.85.167.42 as permitted sender) smtp.mailfrom=urezki@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1737722882; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=4BzrtAZUmYmJhcFIblC4VJWVTxNvUlbeOqI5x0bXKmI=; b=xX/MBDD6ozju2X++mhV9JJ/k8inoIw2OriVJ2cw7niAmwlpcokYgUTMi4jfbvZqfWeFKr0 5Iq/KUVm+Yhne8qG92m3cAtInhJ7IuXiUnGFDuulhkIkXUZ033bYQgTz/r6ZwBqLDMY36I +CgVu0iQWTX6Ur8VIWsOgyDBle/4Bh0= ARC-Authentication-Results: i=1; imf23.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b="g3/9eQxr"; spf=pass (imf23.hostedemail.com: domain of urezki@gmail.com designates 209.85.167.42 as permitted sender) smtp.mailfrom=urezki@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1737722882; a=rsa-sha256; cv=none; b=O/Gt0VnoYHvPUb9WWtA/41uVggXrR+pXKH9BEpYOKy1/n8JU5gUTcIt2dRZvu5c6KeF788 rZJRfhtWNsZ37yeZeMG9D7TGeOYe2Zla+rx0mElMfgC69zRu/k3+hGQwUfjqJ8sipsFMjI Cv8+NgztMgOV7eCQXVva66NbGEVurn4= Received: by mail-lf1-f42.google.com with SMTP id 2adb3069b0e04-53f22fd6887so2256523e87.2 for ; Fri, 24 Jan 2025 04:48:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1737722880; x=1738327680; darn=kvack.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:date:from:from:to:cc:subject:date:message-id:reply-to; bh=4BzrtAZUmYmJhcFIblC4VJWVTxNvUlbeOqI5x0bXKmI=; b=g3/9eQxrQ2X4kMldZBDLCj8wCbQNf6+/NnjZoJ3XBoeqmqvu987X3uBlDy/6tGY67t OGUMWL65Ssl1pd4CvBE8I58zA1X89neAXuKK/caJd8zFbUGllq/7phDBLOUoafyLHN2n Q/lN8N/mIT/M75RPJ8x1HEJwGWAAWVGjRTN1qbTN0Hnig01HM5XYub57MInSHjGqfuWu cvKEH30nSaUB5O/D1npYuvQDjtW9YHOJbO9nc5JxdWR1KCXfOTMpn0h4LOG3oiuUPO1h jLs3KrKjQTI4Qx2M+jgP2f8BiQo83S0Rq0KD/qtZs9jzgirIc0ukNPQmu1KqbJgDtede blxg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1737722880; x=1738327680; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:date:from:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=4BzrtAZUmYmJhcFIblC4VJWVTxNvUlbeOqI5x0bXKmI=; b=v7vYdt7d48m8vQaci68XJBVBtoXxnVayT8ynQbuN+ORtWE0eDhUJR1osrt+WLCp+Li 0o0N+qMh8WwcoxiBXa9bkXSq7uonZ1onvHUyyL+BgzGmyr6r/9izYTSIRSzqj2xBstYk ZJK9MpN6ZcexbQSrYQ/YNdQoHZN7jlI+tgjK6mbploRizS6xmaGw2fDioU1J9q76+Vwd NUHsYfkUy3xams9Agi3iA9KuHfdUeTRaIZe16b/SZ27KowsRMFFhZ/e93vhfi7dEvPL8 EPHUJIfssup2cXNEEoVsBIdKN9/6Y+90UAy22YUw4LDb9mUoFKmUaxNKxTc1ChM8RBdb i+Eg== X-Forwarded-Encrypted: i=1; AJvYcCUZHVWThG5u7n2aB0y+iCzxudLxnl67nnAori1f2Ic6GEx7eGGkv6VtzYsqmuFrilENOWGcgj5Xrg==@kvack.org X-Gm-Message-State: AOJu0YxqZBtU2ogGZjWGHwBIM/QZNxh4ZZj6cUbpgZz23eki+GDNLMYa 1+xMbzsG3ZwCJgHubqHSWVdgHyLljQwyCGbhmort9pcTLA2Os6E+ X-Gm-Gg: ASbGnctmA21DJvaF4E3Iv5bTafmQ1bBiHJOAllaCbLxa6Mxg9Qqn1a6OkrIprdEVEvI 0Geo3kLr9pEFQSdOJKGLyDU14lOWf0gB07rQlj1xT9T1JeRCRCznSv2CbgGoGcIquvQAOGuFzep 84yVxLuU9Dc2XL3xy2A1kO/+r8XRXTBHZYwsNBDBIUiirZnR0thaNCH6e4Oz+N2hlLWOq4pt/6x 1qlE3E+V1YXo99udtU9TzVAdf5xgk7lB9SGR/Yig3etZvacIYQv4xt3DOq5hG6Cuvy08zD87QrF fL4PK8G6vUW2vT0M2LnJI7wCt+QMJMZp17c= X-Google-Smtp-Source: AGHT+IHfDCKln2WQAdqNwoqpNnunQ0kWDLLio3VptmybfPuU4j38QgVw6I8QCrCL2FcPVriNtBeZnA== X-Received: by 2002:ac2:520b:0:b0:542:98bb:566f with SMTP id 2adb3069b0e04-5439c287e74mr8146119e87.52.1737722879720; Fri, 24 Jan 2025 04:47:59 -0800 (PST) Received: from pc636 (host-217-213-93-172.mobileonline.telia.com. [217.213.93.172]) by smtp.gmail.com with ESMTPSA id 2adb3069b0e04-543c822f827sm288753e87.71.2025.01.24.04.47.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 Jan 2025 04:47:59 -0800 (PST) From: Uladzislau Rezki X-Google-Original-From: Uladzislau Rezki Date: Fri, 24 Jan 2025 13:47:56 +0100 To: Vlastimil Babka Cc: Christoph Lameter , David Rientjes , "Paul E. McKenney" , Joel Fernandes , Josh Triplett , Boqun Feng , Uladzislau Rezki , Andrew Morton , Roman Gushchin , Hyeonggon Yoo <42.hyeyoo@gmail.com>, linux-mm@kvack.org, Steven Rostedt , Mathieu Desnoyers , Lai Jiangshan , Zqiang , rcu@vger.kernel.org Subject: Re: [PATCH RFC 3/4] rcu, slab: use a regular callback function for kvfree_rcu Message-ID: References: <20250123-slub-tiny-kfree_rcu-v1-0-0e386ef1541a@suse.cz> <20250123-slub-tiny-kfree_rcu-v1-3-0e386ef1541a@suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20250123-slub-tiny-kfree_rcu-v1-3-0e386ef1541a@suse.cz> X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: D3921140015 X-Stat-Signature: 9gzbt8bqzfxk58y6j67jktpqqrsygyzb X-Rspam-User: X-HE-Tag: 1737722881-532877 X-HE-Meta: U2FsdGVkX18ogwUyhV5gtLFUP6bf005zpNPRUbW3BkbKY7nOOGoVctLznPa8PLElisZ0HEXXKRz2brqB/Y8nBJTHhj6I3TznYwnKJK4l4MfVfyI7Iz5gZdvrlF3jNBqpO7xqleLqRapdwZYpCT/crgILhvVmc9eBrDSJlUYdyZXSqCiYWAtwJEo8R+8X0b12qoeXHUbpPtJGGBHJ4pzwh1MLpCpp34dF6EGbwhRKyqhU72+yMDOi2D7Zmai5yrX1RIdx/UDz7vSZnUYrb/twTm+WMyLd4rrFBzHHgWqjRqR/2dprNeW/U+1SyfE2gk93xNa8LjQzMYWxrjVDXtWdCfFvIhLvDFS2DIaslIwenDBPzelTa6eWgn2MkRWXCtEiirKMJZT/plCtIqy2WkIb2BzOyDNhhraNEzzBp2ko91VawkFNCA0+WgLroB1Qt75r8p/1qew1zJY5loRZX5sDsZONaRchMFWn+aXte5IUrW1uGRz903PgLqBSLFrlq3QK85lNuf4HBD3EHhCxgITRnzfbcaFo3eRhvmZE7Rc7CMoH6E9f+KHFeZEObtHGIKnOYUgMuwEBHy39H2iMSHl1qY9MQRAimnEeiRZFMzP3ZC81haZ2809kOXrYnfbHg6nIWI26rw0it0sKk95RzGRiXrZaE2+y7oqTjHJf22HQdfcbezY1DC7YwFRfoPKo5I+omo62tE6zIyMcjt5g0bbXFAg0V0dqor0+ND9gjygZSQfsf384aUdUIZNfdraYdjSuFQDXDCz1YWgVn264ny04Qw53KHk1fglST27s3yUvNmLlsi7/yCRr9Ha4TxWKYAWKpsthHhzT5H4Pn6DblqpmgOLxzFIGi8e2A/eeRSW60gvjRrR/pNewaX2OHhqOR5Pl92wV9Ha49+9wcobq7eXssRJjTwqHj5topald6be2O3BzM+K6MXKERnkYP/Fs+h2O+xkFNqUStmoAqibIxou N7OkvLJu oN2w67Vxzxdq/qTszrJny0iqgkRwBb+AavqeHBXdff/p4FdHR5OLk7TxogVH0pvHGu4RVmIjbV8ZZEJBFGj1JbI+wSVJKwmqxvjv7J05cA4AYFeKReERf7kW8miJsW19Rj9Kwzt+nAI75OTBEwlOBAgr+P9ZMF2PzjIF1ZscRyqTw45N5g8Jpz/zS8XoU4/wxAay/1vksd7KaCCoQGNslwY8MCq+ve/vi7eLdeLGm5CSjbpjCFX7DoKTXpdHHfiZrMldQR8VkV+HDNwkKWBYO5UI5/B/6BP35GYROSdxlGfb7FkZlHivn5bLFxBtaiLuu8byatPBNEMSs4tAO2XI2Noy+ob12honVM/7bS7pvbY0XLqN8DgGU7ODsh+XXoRQd/0UsD2/qLU0dCD5hxKlEPo70/jKlEMZnOBB1ApQOXIPCDYFv6gYkpXsAtuOmXwu15npUMGeHpOHpAFIKGcknMxekgcDD9U+T+/oucKWsFuY29JAWGwgMn2+yfTFmNvBnik7/yMSf7YhnVHM= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Jan 23, 2025 at 11:37:20AM +0100, Vlastimil Babka wrote: > RCU has been special-casing callback function pointers that are integers > lower than 4096 as offsets of rcu_head for kvfree() instead. The tree > RCU implementation no longer does that as the batched kvfree_rcu() is > not a simple call_rcu(). The tiny RCU still does, and the plan is also > to make tree RCU use call_rcu() for SLUB_TINY configurations. > > Instead of teaching tree RCU again to special case the offsets, let's > remove the special casing completely. Since there's no SLOB anymore, it > is possible to create a callback function that can take a pointer to a > middle of slab object with unknown offset and determine the object's > pointer before freeing it, so implement that as kvfree_rcu_cb(). > > Large kmalloc and vmalloc allocations are handled simply by aligning > down to page size. For that we retain the requirement that the offset is > smaller than 4096. But we can remove __is_kvfree_rcu_offset() completely > and instead just opencode the condition in the BUILD_BUG_ON() check. > > Signed-off-by: Vlastimil Babka > --- > include/linux/rcupdate.h | 24 +++++++++--------------- > kernel/rcu/tiny.c | 13 ------------- > mm/slab.h | 2 ++ > mm/slab_common.c | 5 +---- > mm/slub.c | 42 ++++++++++++++++++++++++++++++++++++++++++ > 5 files changed, 54 insertions(+), 32 deletions(-) > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h > index 3f70d1c8144426f40553c8c589f07097ece8a706..7ff16a70ca1c0fb1012c4118388f60687c5e5b3f 100644 > --- a/include/linux/rcupdate.h > +++ b/include/linux/rcupdate.h > @@ -1025,12 +1025,6 @@ static inline notrace void rcu_read_unlock_sched_notrace(void) > #define RCU_POINTER_INITIALIZER(p, v) \ > .p = RCU_INITIALIZER(v) > > -/* > - * Does the specified offset indicate that the corresponding rcu_head > - * structure can be handled by kvfree_rcu()? > - */ > -#define __is_kvfree_rcu_offset(offset) ((offset) < 4096) > - > /** > * kfree_rcu() - kfree an object after a grace period. > * @ptr: pointer to kfree for double-argument invocations. > @@ -1041,11 +1035,11 @@ static inline notrace void rcu_read_unlock_sched_notrace(void) > * when they are used in a kernel module, that module must invoke the > * high-latency rcu_barrier() function at module-unload time. > * > - * The kfree_rcu() function handles this issue. Rather than encoding a > - * function address in the embedded rcu_head structure, kfree_rcu() instead > - * encodes the offset of the rcu_head structure within the base structure. > - * Because the functions are not allowed in the low-order 4096 bytes of > - * kernel virtual memory, offsets up to 4095 bytes can be accommodated. > + * The kfree_rcu() function handles this issue. In order to have a universal > + * callback function handling different offsets of rcu_head, the callback needs > + * to determine the starting address of the freed object, which can be a large > + * kmalloc of vmalloc allocation. To allow simply aligning the pointer down to > + * page boundary for those, only offsets up to 4095 bytes can be accommodated. > * If the offset is larger than 4095 bytes, a compile-time error will > * be generated in kvfree_rcu_arg_2(). If this error is triggered, you can > * either fall back to use of call_rcu() or rearrange the structure to > @@ -1091,10 +1085,10 @@ void kvfree_call_rcu(struct rcu_head *head, void *ptr); > do { \ > typeof (ptr) ___p = (ptr); \ > \ > - if (___p) { \ > - BUILD_BUG_ON(!__is_kvfree_rcu_offset(offsetof(typeof(*(ptr)), rhf))); \ > - kvfree_call_rcu(&((___p)->rhf), (void *) (___p)); \ > - } \ > + if (___p) { \ > + BUILD_BUG_ON(offsetof(typeof(*(ptr)), rhf) >= 4096); \ > + kvfree_call_rcu(&((___p)->rhf), (void *) (___p)); \ > + } \ > Why removing the macro? At least __is_kvfree_rcu_offset() makes it clear what and why + it has a nice comment what it is used for. 4096 looks like hard-coded value. Or you do not want that someone else started to use that macro in another places? > } while (0) > > #define kvfree_rcu_arg_1(ptr) \ > diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c > index 0ec27093d0e14a4b1060ea08932c4ac13f9b0f26..77e0db0221364376a99ebeb17485650879385a6e 100644 > --- a/kernel/rcu/tiny.c > +++ b/kernel/rcu/tiny.c > @@ -88,12 +88,6 @@ static inline bool rcu_reclaim_tiny(struct rcu_head *head) > unsigned long offset = (unsigned long)head->func; > > rcu_lock_acquire(&rcu_callback_map); > - if (__is_kvfree_rcu_offset(offset)) { > - trace_rcu_invoke_kvfree_callback("", head, offset); > - kvfree((void *)head - offset); > - rcu_lock_release(&rcu_callback_map); > - return true; > - } > > trace_rcu_invoke_callback("", head); > f = head->func; > @@ -159,10 +153,6 @@ void synchronize_rcu(void) > } > EXPORT_SYMBOL_GPL(synchronize_rcu); > > -static void tiny_rcu_leak_callback(struct rcu_head *rhp) > -{ > -} > - > /* > * Post an RCU callback to be invoked after the end of an RCU grace > * period. But since we have but one CPU, that would be after any > @@ -178,9 +168,6 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func) > pr_err("%s(): Double-freed CB %p->%pS()!!! ", __func__, head, head->func); > mem_dump_obj(head); > } > - > - if (!__is_kvfree_rcu_offset((unsigned long)head->func)) > - WRITE_ONCE(head->func, tiny_rcu_leak_callback); > return; > } > > diff --git a/mm/slab.h b/mm/slab.h > index e9fd9bf0bfa65b343a4ae0ecd5b4c2a325b04883..2f01c7317988ce036f0b22807403226a59f0f708 100644 > --- a/mm/slab.h > +++ b/mm/slab.h > @@ -604,6 +604,8 @@ void __memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, > void **p, int objects, struct slabobj_ext *obj_exts); > #endif > > +void kvfree_rcu_cb(struct rcu_head *head); > + > size_t __ksize(const void *objp); > > static inline size_t slab_ksize(const struct kmem_cache *s) > diff --git a/mm/slab_common.c b/mm/slab_common.c > index 330cdd8ebc5380090ee784c58e8ca1d1a52b3758..f13d2c901daf1419993620459fbd5845eecb85f1 100644 > --- a/mm/slab_common.c > +++ b/mm/slab_common.c > @@ -1532,9 +1532,6 @@ kvfree_rcu_list(struct rcu_head *head) > rcu_lock_acquire(&rcu_callback_map); > trace_rcu_invoke_kvfree_callback("slab", head, offset); > > - if (!WARN_ON_ONCE(!__is_kvfree_rcu_offset(offset))) > - kvfree(ptr); > - This is not correct unless i miss something. Why do you remove this? > > diff --git a/mm/slub.c b/mm/slub.c > index c2151c9fee228d121a9cbcc220c3ae054769dacf..651381bf05566e88de8493e0550f121d23b757a1 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -19,6 +19,7 @@ > #include > #include > #include "slab.h" > +#include > #include > #include > #include > @@ -4732,6 +4733,47 @@ static void free_large_kmalloc(struct folio *folio, void *object) > folio_put(folio); > } > > +void kvfree_rcu_cb(struct rcu_head *head) > +{ > + void *obj = head; > + struct folio *folio; > + struct slab *slab; > + struct kmem_cache *s; > + void *slab_addr; > + > + if (unlikely(is_vmalloc_addr(obj))) { > + obj = (void *) PAGE_ALIGN_DOWN((unsigned long)obj); > + vfree(obj); > + return; > + } > + > + folio = virt_to_folio(obj); > + if (unlikely(!folio_test_slab(folio))) { > + /* > + * rcu_head offset can be only less than page size so no need to > + * consider folio order > + */ > + obj = (void *) PAGE_ALIGN_DOWN((unsigned long)obj); > + free_large_kmalloc(folio, obj); > + return; > + } > + > + slab = folio_slab(folio); > + s = slab->slab_cache; > + slab_addr = folio_address(folio); > + > + if (is_kfence_address(obj)) { > + obj = kfence_object_start(obj); > + } else { > + unsigned int idx = __obj_to_index(s, slab_addr, obj); > + > + obj = slab_addr + s->size * idx; > + obj = fixup_red_left(s, obj); > + } > + > + slab_free(s, slab, obj, _RET_IP_); > +} > + Tiny computer case. Just small nit, maybe remove unlikely() but you decide :) -- Uladzislau Rezki