From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3ED5CC02181 for ; Fri, 24 Jan 2025 14:19:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B13F1280053; Fri, 24 Jan 2025 09:19:40 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id AC3C16B0135; Fri, 24 Jan 2025 09:19:40 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9649B280053; Fri, 24 Jan 2025 09:19:40 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 786D96B00A9 for ; Fri, 24 Jan 2025 09:19:40 -0500 (EST) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 168B3C13D9 for ; Fri, 24 Jan 2025 14:19:40 +0000 (UTC) X-FDA: 83042553720.14.CC366B5 Received: from mail-lj1-f182.google.com (mail-lj1-f182.google.com [209.85.208.182]) by imf10.hostedemail.com (Postfix) with ESMTP id F2DBCC0008 for ; Fri, 24 Jan 2025 14:19:37 +0000 (UTC) Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=VxTQXg6N; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf10.hostedemail.com: domain of urezki@gmail.com designates 209.85.208.182 as permitted sender) smtp.mailfrom=urezki@gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1737728378; a=rsa-sha256; cv=none; b=6COmMILQ7SRdT7lvB0yxFxKJ5rjoaI8OHxQnaGdDK+rEb9vvasmo0dl4Xbl8baidj9F18x Khv6fWySpFjV6zAEZacgbTFo2DRRIX7kOA4HD6OlcidkJXXVrqByyCkw+uFYdmb9Y8b2v/ 2IZ3lYFc3+kFnIZGpTBG5+8mPufvEf0= ARC-Authentication-Results: i=1; imf10.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=VxTQXg6N; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf10.hostedemail.com: domain of urezki@gmail.com designates 209.85.208.182 as permitted sender) smtp.mailfrom=urezki@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1737728378; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=s4yfLGMF4DRRR4uiSdjsJIEX/vDRt9QEVPWz6odNLKc=; b=pK+QbYmFwS0zk1qxSbszO1ayymGXU8PCunR1bxzaNfgWS6J+qV4nQd6hcBDt9Al/M1PjiH Q0Avkpzx7g9J1lInCPV4ifSBPM92gC678T4Q9PAmZKNB1LV8Uw8yR4NbOxKuJPFvAsyJ78 /2pIedwH1Uq/7NO1BUUr3Ar6sG5iz9Y= Received: by mail-lj1-f182.google.com with SMTP id 38308e7fff4ca-3022484d4e4so23103781fa.1 for ; Fri, 24 Jan 2025 06:19:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1737728376; x=1738333176; darn=kvack.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:date:from:from:to:cc:subject:date:message-id:reply-to; bh=s4yfLGMF4DRRR4uiSdjsJIEX/vDRt9QEVPWz6odNLKc=; b=VxTQXg6NKayEyBaLsdhysPDO2YsCvDVOlMNeDxXcQGElLGrRrgksPTKS1UD4vfUNYm iRJJUNykC7YP0bujcPv7zXn9GqswC5L2hjqRfO/UyTrLRC90/XiKO78dwEIS0gEqLr5+ JqQh8nKoqSR4xiUZsC+slVX5v6IVKqgYbFIyVq5AOzcodkJ56WhNfSQZDwQwiIEIX+TD 97VQ00ySsEhVpB4OLA5tBQOeZbOjUbcwlKP+I1IT8AO3qNRLDudhIY+ZFmKUWMjxGN48 l7T4e/dWkJhL2uIbR5Yw40NW9Fv+zuKIQXUdsTra/yLwsdFgLc4bHO8Ke1GITxwVPKWh a+zw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1737728376; x=1738333176; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:date:from:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=s4yfLGMF4DRRR4uiSdjsJIEX/vDRt9QEVPWz6odNLKc=; b=Q7sbMkCBSMrxSoFaoP8FNwAOsATGHT8X6fQkNvRvxjmxwA3bSmpQm5ELGTjaH1F8uW suQLlhXBNmEEONnWARk31fHVT9OOyyQfSsH+yTBUPJdHvbvfNSMpVSFpJOWEbmDfoURG iB5u9cSB8jbO/yddPoHDkjaRpQpu+KH3tvwh7S1LmlaSX7DymsMq8YMJglRE6qSLChCi BYyKYfvQih5eN935APERdvccdl2NhpI66kkd4TiXpW1WdB4SGE5gjBJ2HvrRuRV4GiHE 0DGa2J0pnDM4GfikxrsbVpZxV2ud63rPUQtjeff3nZi3ITt/EbycsHQyuPP1DiyeMfaw kXNA== X-Forwarded-Encrypted: i=1; AJvYcCXgiWVb0xzzJOUV9MzQC/iLXgRlk/ZjTj81hj1QynDe4bpv12gMu4wHxJkRWR8R5C2T045g92WF4Q==@kvack.org X-Gm-Message-State: AOJu0YyKbsbpb2Oinx3ihaVDQi7DzxDOsoissiVTBYiDl+RCY7pwXr0u mhEzulILCRzEpefIyvt0z8wAg5zyKQuehifjyw0twpseZCJ8GPl7 X-Gm-Gg: ASbGncu77K+WgbLfOtYrnf8VpLseCALxQTtIUEJgESkqBBjzDLFQBEx9pteiUvTtd8F jxOSs5RqjXoypgpxLr7Sbey+JHu3Kszgh8XdXZ0wteE2l6fAT2F7NIise3lDlWRyxPa3DXEhWnv 5pnVR8qOl0+ZqaNXl9XzAHlGgi9djtAwqxHGFi7xpt0FPFOxWOfv9ddhfC7ZvWdI9ghFfOS01IF 0LePkAXnQiBx6vW0HzQSX3zh7be1S1h8R8nL3JAwz2dLuCoeoEXWbK0bXuCkk4oYZKneS6d64KT +v1fLbkTLPS9hozlLqMV3gt+OMIbrVcqMpPKr7ZhOePeEQ== X-Google-Smtp-Source: AGHT+IE6TMIsR+8dZQqKWBCDc/C/osFhUt1LkBxJItfLi+LIuu7MXwX133y9q3x+eQYeYB0WY1sdIQ== X-Received: by 2002:ac2:48a1:0:b0:540:1d6c:f1af with SMTP id 2adb3069b0e04-5439c27d0d8mr9216959e87.44.1737728375627; Fri, 24 Jan 2025 06:19:35 -0800 (PST) Received: from pc636 (host-217-213-93-172.mobileonline.telia.com. [217.213.93.172]) by smtp.gmail.com with ESMTPSA id 2adb3069b0e04-543c822f965sm307666e87.91.2025.01.24.06.19.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 Jan 2025 06:19:34 -0800 (PST) From: Uladzislau Rezki X-Google-Original-From: Uladzislau Rezki Date: Fri, 24 Jan 2025 15:19:32 +0100 To: Vlastimil Babka Cc: Uladzislau Rezki , Christoph Lameter , David Rientjes , "Paul E. McKenney" , Joel Fernandes , Josh Triplett , Boqun Feng , Andrew Morton , Roman Gushchin , Hyeonggon Yoo <42.hyeyoo@gmail.com>, linux-mm@kvack.org, Steven Rostedt , Mathieu Desnoyers , Lai Jiangshan , Zqiang , rcu@vger.kernel.org Subject: Re: [PATCH RFC 3/4] rcu, slab: use a regular callback function for kvfree_rcu Message-ID: References: <20250123-slub-tiny-kfree_rcu-v1-0-0e386ef1541a@suse.cz> <20250123-slub-tiny-kfree_rcu-v1-3-0e386ef1541a@suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Rspam-User: X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: F2DBCC0008 X-Stat-Signature: mgah7n7zcudnobu4h1dmgqci9gj33wa8 X-HE-Tag: 1737728377-900816 X-HE-Meta: U2FsdGVkX18odUiuNonmXLQ5q9O5hH+XpLk88ul067ZzQqUS+hG2hi6iLpwXfhHMo0QyjUC3J3W7oM9qUGNHM0Chut68UPQixpQ0HWb+tGCgfELqO6/U55EQvCGL4slKI8k3jBXeoopLlxD3u/b2E3F/bryMVvY3jgPg3zvgGzVreS83SrpByAr/9Yi6QLyPlkLpZpyvvFQxupIl9XOkwzgiknBR2f8enDdC5X4LCS995T8UoexhYpy9mIRCG+VBlMkx5QpvshgILZUfyIXUAvj9nhqtt2+NtTk1YQmFrHVhynPToi3HFFV2G2h/gxTri19Cz38Id4EPP4JWCy3m9lMuXKE43ofPqz7lNv4nxGDR+RaXr4DzhfNBa7REHtfXeZLx0//bsJ1d7rHONg+r8jDVqnR25FdCn3m+PxQEJFVsHBpgmEsGkJAY8tbfEzjMX9hEFBDUUHb68Juf1t9/tjOlwOcrfziqcM8KHXlckkWL3aSitIAkvH2jwt8mDzIb29JUTDfmj6lgLRNWMTP75P9L5HiFNA79GOeoo+qtod5MdNMGsHe5OhPH/UTdpoiNBk397Kt7LlNNA90gggq9d0PoWk1Eep0wax4hxbj3oZj2Uu24KJyYksAAGXqq1yZ3fljtvQEEXxwFbw2sEbrKhc62cKPibYKsqf69UthUti8rvwiZdv4BQtcnNI28wA70wXYUeRLQp8R68VNJj/i58FE8cYU/AqRF0hpNePd3y8hc+2fa0QOfOPYTZu/1beXpHNw4fjsPUj21LwHb81SXS3opzdBfsKAr7qYUWioyAiy7qDgeef3DgC+BDVi6KheBVAyTmLddbL97dZIbtGGRRVJ9xfQkw2tH9y4696EXPhfUqhc5G+WrbJ/GFkXAfOjZ4068Bx5mx6lDP/oHrdwML7VcxPIIY6rzjykBLWaQ3a5D/zQ53nusCL42dCS/zTToDGk0zr9Yrm4ptOJCbja yk29WqaR 6TmFRSTOo+WFUOt9M8VCpd9BD0lxRzdKnNJtYIBOKUYxwMBkXUFgH0P6wnE0N7dTsvOZTdkKCCHWulxvFyscwq8Y/SGVdRoEGCWH5RfC1Jj3i6ijzZ0xyyuPqsDpzO9hha8hUc8heeHqNb6p/asjTI5zzGQ4Yw4rwSFFpk8b/nTlwA0EuL6rsC5gdPO5AXGzB6XyNm4sUsTbbpYP3i8JW0ZANCwcp+tGKi0ywouxiA2sVbWiYAzSYqPFXMNTmtYe+D7O75/qIjElEi2foZlRziXySCclyE1DweBijU7gjuo3rSbh7AsrRk7tX/InOhEYDWyB93Ccxu5TDEkGi9P4wy4gJNvtLL0xvv/sqHsvt7WzwUwNarissit/4eFaaR+ey/ZryFWH95u+YmMtPPRhZDGDMpo5VYvdMBcTTUjRJxBJtyUAkJ257n6pkZIoAwMJWLk7YrBvQ0AEXSdSK5P4Cwue1vWZT7UhJYoz7qNfjzXDj8n1PsTHIVEfHfxPTaeSdv7fnp61ugyc/pH4= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Jan 24, 2025 at 03:16:05PM +0100, Vlastimil Babka wrote: > On 1/24/25 13:47, Uladzislau Rezki wrote: > > On Thu, Jan 23, 2025 at 11:37:20AM +0100, Vlastimil Babka wrote: > >> RCU has been special-casing callback function pointers that are integers > >> lower than 4096 as offsets of rcu_head for kvfree() instead. The tree > >> RCU implementation no longer does that as the batched kvfree_rcu() is > >> not a simple call_rcu(). The tiny RCU still does, and the plan is also > >> to make tree RCU use call_rcu() for SLUB_TINY configurations. > >> > >> Instead of teaching tree RCU again to special case the offsets, let's > >> remove the special casing completely. Since there's no SLOB anymore, it > >> is possible to create a callback function that can take a pointer to a > >> middle of slab object with unknown offset and determine the object's > >> pointer before freeing it, so implement that as kvfree_rcu_cb(). > >> > >> Large kmalloc and vmalloc allocations are handled simply by aligning > >> down to page size. For that we retain the requirement that the offset is > >> smaller than 4096. But we can remove __is_kvfree_rcu_offset() completely > >> and instead just opencode the condition in the BUILD_BUG_ON() check. > >> > >> Signed-off-by: Vlastimil Babka > >> --- > >> include/linux/rcupdate.h | 24 +++++++++--------------- > >> kernel/rcu/tiny.c | 13 ------------- > >> mm/slab.h | 2 ++ > >> mm/slab_common.c | 5 +---- > >> mm/slub.c | 42 ++++++++++++++++++++++++++++++++++++++++++ > >> 5 files changed, 54 insertions(+), 32 deletions(-) > >> > >> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h > >> index 3f70d1c8144426f40553c8c589f07097ece8a706..7ff16a70ca1c0fb1012c4118388f60687c5e5b3f 100644 > >> --- a/include/linux/rcupdate.h > >> +++ b/include/linux/rcupdate.h > >> @@ -1025,12 +1025,6 @@ static inline notrace void rcu_read_unlock_sched_notrace(void) > >> #define RCU_POINTER_INITIALIZER(p, v) \ > >> .p = RCU_INITIALIZER(v) > >> > >> -/* > >> - * Does the specified offset indicate that the corresponding rcu_head > >> - * structure can be handled by kvfree_rcu()? > >> - */ > >> -#define __is_kvfree_rcu_offset(offset) ((offset) < 4096) > >> - > >> /** > >> * kfree_rcu() - kfree an object after a grace period. > >> * @ptr: pointer to kfree for double-argument invocations. > >> @@ -1041,11 +1035,11 @@ static inline notrace void rcu_read_unlock_sched_notrace(void) > >> * when they are used in a kernel module, that module must invoke the > >> * high-latency rcu_barrier() function at module-unload time. > >> * > >> - * The kfree_rcu() function handles this issue. Rather than encoding a > >> - * function address in the embedded rcu_head structure, kfree_rcu() instead > >> - * encodes the offset of the rcu_head structure within the base structure. > >> - * Because the functions are not allowed in the low-order 4096 bytes of > >> - * kernel virtual memory, offsets up to 4095 bytes can be accommodated. > >> + * The kfree_rcu() function handles this issue. In order to have a universal > >> + * callback function handling different offsets of rcu_head, the callback needs > >> + * to determine the starting address of the freed object, which can be a large > >> + * kmalloc of vmalloc allocation. To allow simply aligning the pointer down to > >> + * page boundary for those, only offsets up to 4095 bytes can be accommodated. > >> * If the offset is larger than 4095 bytes, a compile-time error will > >> * be generated in kvfree_rcu_arg_2(). If this error is triggered, you can > >> * either fall back to use of call_rcu() or rearrange the structure to > >> @@ -1091,10 +1085,10 @@ void kvfree_call_rcu(struct rcu_head *head, void *ptr); > >> do { \ > >> typeof (ptr) ___p = (ptr); \ > >> \ > >> - if (___p) { \ > >> - BUILD_BUG_ON(!__is_kvfree_rcu_offset(offsetof(typeof(*(ptr)), rhf))); \ > >> - kvfree_call_rcu(&((___p)->rhf), (void *) (___p)); \ > >> - } \ > >> + if (___p) { \ > >> + BUILD_BUG_ON(offsetof(typeof(*(ptr)), rhf) >= 4096); \ > >> + kvfree_call_rcu(&((___p)->rhf), (void *) (___p)); \ > >> + } \ > >> > > Why removing the macro? At least __is_kvfree_rcu_offset() makes it > > clear what and why + it has a nice comment what it is used for. 4096 > > looks like hard-coded value. > > Hmm but it's ultimately shorter. We could add a short comment to > kvfree_rcu_arg_2() referring to the longer explanation in the kfree_rcu() > comment? > Sounds good to place or keep the comment. > > Or you do not want that someone else started to use that macro in > > another places? > > And also that, since this has to be exposed in a "public" header. > > >> diff --git a/mm/slab.h b/mm/slab.h > >> index e9fd9bf0bfa65b343a4ae0ecd5b4c2a325b04883..2f01c7317988ce036f0b22807403226a59f0f708 100644 > >> --- a/mm/slab.h > >> +++ b/mm/slab.h > >> @@ -604,6 +604,8 @@ void __memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, > >> void **p, int objects, struct slabobj_ext *obj_exts); > >> #endif > >> > >> +void kvfree_rcu_cb(struct rcu_head *head); > >> + > >> size_t __ksize(const void *objp); > >> > >> static inline size_t slab_ksize(const struct kmem_cache *s) > >> diff --git a/mm/slab_common.c b/mm/slab_common.c > >> index 330cdd8ebc5380090ee784c58e8ca1d1a52b3758..f13d2c901daf1419993620459fbd5845eecb85f1 100644 > >> --- a/mm/slab_common.c > >> +++ b/mm/slab_common.c > >> @@ -1532,9 +1532,6 @@ kvfree_rcu_list(struct rcu_head *head) > >> rcu_lock_acquire(&rcu_callback_map); > >> trace_rcu_invoke_kvfree_callback("slab", head, offset); > >> > >> - if (!WARN_ON_ONCE(!__is_kvfree_rcu_offset(offset))) > >> - kvfree(ptr); > >> - > > This is not correct unless i miss something. Why do you remove this? > > Oops, meant to remove just the warn check, and unconditionally call > kvfree(), thanks for noticing :) > > The warning could really only trigger due to a use-after-free corruption of > the ptr pointer, because the BUILD_BUG_ON() would catch misuse with a too > large offset, so we don't need to keep it. > > >> +void kvfree_rcu_cb(struct rcu_head *head) > >> +{ > >> + void *obj = head; > >> + struct folio *folio; > >> + struct slab *slab; > >> + struct kmem_cache *s; > >> + void *slab_addr; > >> + > >> + if (unlikely(is_vmalloc_addr(obj))) { > >> + obj = (void *) PAGE_ALIGN_DOWN((unsigned long)obj); > >> + vfree(obj); > >> + return; > >> + } > >> + > >> + folio = virt_to_folio(obj); > >> + if (unlikely(!folio_test_slab(folio))) { > >> + /* > >> + * rcu_head offset can be only less than page size so no need to > >> + * consider folio order > >> + */ > >> + obj = (void *) PAGE_ALIGN_DOWN((unsigned long)obj); > >> + free_large_kmalloc(folio, obj); > >> + return; > >> + } > >> + > >> + slab = folio_slab(folio); > >> + s = slab->slab_cache; > >> + slab_addr = folio_address(folio); > >> + > >> + if (is_kfence_address(obj)) { > >> + obj = kfence_object_start(obj); > >> + } else { > >> + unsigned int idx = __obj_to_index(s, slab_addr, obj); > >> + > >> + obj = slab_addr + s->size * idx; > >> + obj = fixup_red_left(s, obj); > >> + } > >> + > >> + slab_free(s, slab, obj, _RET_IP_); > >> +} > >> + > > Tiny computer case. Just small nit, maybe remove unlikely() but you decide :) > > Force of habbit :) > :) -- Uladzislau Rezki