From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id CDF87C52D7C for ; Mon, 12 Aug 2024 08:00:19 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 43F7E6B00A1; Mon, 12 Aug 2024 04:00:19 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 3EEEC6B00B3; Mon, 12 Aug 2024 04:00:19 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2B5DF6B00B4; Mon, 12 Aug 2024 04:00:19 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 0AF486B00A1 for ; Mon, 12 Aug 2024 04:00:19 -0400 (EDT) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 8794E16038F for ; Mon, 12 Aug 2024 08:00:18 +0000 (UTC) X-FDA: 82442845716.20.8D4DD19 Received: from out30-131.freemail.mail.aliyun.com (out30-131.freemail.mail.aliyun.com [115.124.30.131]) by imf11.hostedemail.com (Postfix) with ESMTP id 9638340018 for ; Mon, 12 Aug 2024 08:00:13 +0000 (UTC) Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux.alibaba.com header.s=default header.b=ty31G37+; spf=pass (imf11.hostedemail.com: domain of dtcccc@linux.alibaba.com designates 115.124.30.131 as permitted sender) smtp.mailfrom=dtcccc@linux.alibaba.com; dmarc=pass (policy=none) header.from=linux.alibaba.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1723449581; a=rsa-sha256; cv=none; b=h7tG1LW2Ntv5ahsydJSy/8ttztrJnqUy6BH6rkfp+Pnq5ajclgau+wCkOrSW7PbiUk3QJR qxzeE0f46PzvIGTGa0xbCF/RxBgS28iuyN5V4qnbHn4I+zfVvUqczoHSB2ka93d/wDy+yZ X2BziXmPmQtoEuTccUATWVGkb2dnr1g= ARC-Authentication-Results: i=1; imf11.hostedemail.com; dkim=pass header.d=linux.alibaba.com header.s=default header.b=ty31G37+; spf=pass (imf11.hostedemail.com: domain of dtcccc@linux.alibaba.com designates 115.124.30.131 as permitted sender) smtp.mailfrom=dtcccc@linux.alibaba.com; dmarc=pass (policy=none) header.from=linux.alibaba.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1723449581; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=F8QjTcNkTYwR3QaJrDLOWR/4DZgIIKZr5o7qbqN+Pjw=; b=q69uCgv/K6m6fEZ9R7FoJY7TWKIWr/HO7BGvHWmw5G7TYJjs+sTDX02Lx2HbJIV11ZSC35 vYF9NCGwQ6KI6jtGQBx4xHjDALswgGsr+HtLedPadOIkjVrtZPzCEkX/XpSdyVyRrOLFUb Vgg6jpABJC4Ki07KWE53emu7VAwBauU= DKIM-Signature:v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1723449604; h=Message-ID:Date:MIME-Version:Subject:To:From:Content-Type; bh=F8QjTcNkTYwR3QaJrDLOWR/4DZgIIKZr5o7qbqN+Pjw=; b=ty31G37+AIw6fO5xmaMSAduiA33FSzba5ynlqbtjFQUI0iDQqt9yr/F9+o6WKCSucwIBirav4Eszvmh1q19dRgaFexYaRnwVV7W24IhhCHBb+d5rmIm3vZydUXbjFGwY5pYjpvUx0tWrzMqCZTiAotm6Gg2p7Xi+cDtBrDwhxOI= Received: from 30.97.49.58(mailfrom:dtcccc@linux.alibaba.com fp:SMTPD_---0WCbyGqV_1723449602) by smtp.aliyun-inc.com; Mon, 12 Aug 2024 16:00:03 +0800 Message-ID: <673dba4d-7d45-4e2e-8d2f-b969be0732c1@linux.alibaba.com> Date: Mon, 12 Aug 2024 16:00:02 +0800 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH] kfence: Save freeing stack trace at calling time instead of freeing time Content-Language: en-US To: Marco Elver Cc: linux-kernel@vger.kernel.org, Alexander Potapenko , Dmitry Vyukov , Andrew Morton , kasan-dev@googlegroups.com, linux-mm@kvack.org References: <20240812065947.6104-1-dtcccc@linux.alibaba.com> From: Tianchen Ding In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Stat-Signature: p55ae7nkorp7emeh5uahodnic4pb9ea4 X-Rspamd-Queue-Id: 9638340018 X-Rspam-User: X-Rspamd-Server: rspam10 X-HE-Tag: 1723449613-53516 X-HE-Meta: U2FsdGVkX1/0mU2gsPhDbKDoOJSyy3P8SQyPUaHdFrZTRALp1nsBZh0zqnmiDhMD4WSJu40w5V/jEjVsFuwd8pDxLDFwKLbofRXQKRu5OFgHlUxkpNknvbEvi1fb+23txnCr4U2R3yDcX1gpClwBTR/nIRIFe+H4jJ1tSltid40VVz4p3MwpXqIvNFMSaMLudxcXNopoFtTJaK7Ytw1UHyt30Gk5PwJdwcCx2GU9IdCkSA7u7/TEdYEMImjDaNdzEn3nfxcoP+oYMroN43BY0hS32F0z/JPD9aYL4NyVvK38rsotr3k9PUx53weGDTCAFgLP0cleJNhMNllk3E2av/vAbFyvU+z/uhTlI0z5ebaBQJwQ+36/XiAnGq80OGsTEjJZa1GYZdCphdDVLlyTlkitOJ8NxIYSoP/64GrRwNRC/rmC1uBnKJ/kSIwARMW2MOg28DMcXKJdMtTA1XLWEJLcatYQFovtKCdJa6BkjWpS4oFdCLadeHHY7MN1PjreyNhvnYVbP2z4n6GxPHH+rHlseGIxqp/o478lpaAPytgJ9QVRZPRqbXZvC/7LxGTEzM0TacVQl1MuPihu8uvY3kPEyE6IS9F5SVwJEk8Y7PvutNVR+RPXFVA24CL4Uw/EMbV0wIdH3f/a2vA/nWYkkmzlO/m15wTF9OUBr42z9ieZMfxRdbBXO+TVFZYm7yH5th51d7xIomgJ/rPTXTCyJj+f+SGtqHB9eAcuTZZiH+N6vxaglCVdumCyar49Ett5E99e+NDXzmjUB3ZMxwIt8+9MtTWgmqIiYbLXzd+8QKOEVOXNF/2ZFCIvOmF67+e7HJHDjFClelEVAEa+ctyDJFaOePxEvP1BGVZtrMkhY2u2veIxmcUIQaez6dnb/hj+06Uqu0njeRGOkN88KL6nSShSmYqgMva61TBn6tkpz/QRwwC/rpGpWRdCvup1EY8P3sZHrifKVzrzVlOAQ2D tHTGfGjf gV0UtSqp8XIs4HcvfHkvGBLh49f0H/QZr/CDx7I0TqGi3ySBSmk5Ve3LLLGeB0SwkOAlW+XVEU1M3f5qqc7al2bclaoCs5ydEGHE1/OIswoadyXlIJ93WrAhBf1oQmOSVk0N6o7/We4+HyVsmhYh01xA6Wzjh98TTG5mhgGRGorRDPeLkZkYyCCbt/XiqSh9Gph4NIIoejsOyb7z4ZfWgtpG/8/B4fO/J+Uh4lbFOiTwpQtIzDHEqtouu7QOkA8FLS4R4PksXAqcnxzCdD5kaiw8l9Mg5Od5Ygpv2sUj1KnHmatZpcSNldagqlvqlD+OA/Kd8PVGrVZTKqVI78U8eCJY32e6n8CtRmFrLXoFtMF4g7qpNPT/EtFVNiMlh5MYr7ff3u8G31a08MXDTQ7e/1xDjiS+yvc7KAO87mlskP+QoM+w= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 2024/8/12 15:49, Marco Elver wrote: > On Mon, 12 Aug 2024 at 09:00, Tianchen Ding wrote: >> >> For kmem_cache with SLAB_TYPESAFE_BY_RCU, the freeing trace stack at >> calling kmem_cache_free() is more useful. While the following stack is >> meaningless and provides no help: >> freed by task 46 on cpu 0 at 656.840729s: >> rcu_do_batch+0x1ab/0x540 >> nocb_cb_wait+0x8f/0x260 >> rcu_nocb_cb_kthread+0x25/0x80 >> kthread+0xd2/0x100 >> ret_from_fork+0x34/0x50 >> ret_from_fork_asm+0x1a/0x30 >> >> Signed-off-by: Tianchen Ding >> --- >> I'm not sure whether we should keep KFENCE_OBJECT_FREED info remained >> (maybe the exact free time can be helpful?). But add a new kfence_track >> will cost more memory, so I prefer to reuse free_track and drop the info >> when when KFENCE_OBJECT_RCU_FREEING -> KFENCE_OBJECT_FREED. > > I think the current version is fine. In the SLAB_TYPESAFE_BY_RCU cases > it would always print the stack trace of RCU internals, so it's never > really useful (as you say above). > > Have you encountered a bug where you were debugging a UAF like this? Yes. We are debugging a UAF about struct anon_vma in an old kernel. (finally we found this may be related to commit 2555283eb40d) struct anon_vma has SLAB_TYPESAFE_BY_RCU, so we found the freeing stack is useless. > If not, what prompted you to send this patch? > > Did you run the KFENCE test suite? Yes. All passed. > >> --- >> mm/kfence/core.c | 35 ++++++++++++++++++++++++++--------- >> mm/kfence/kfence.h | 1 + >> mm/kfence/report.c | 7 ++++--- >> 3 files changed, 31 insertions(+), 12 deletions(-) >> >> diff --git a/mm/kfence/core.c b/mm/kfence/core.c >> index c5cb54fc696d..89469d4f2d95 100644 >> --- a/mm/kfence/core.c >> +++ b/mm/kfence/core.c >> @@ -269,6 +269,13 @@ static inline unsigned long metadata_to_pageaddr(const struct kfence_metadata *m >> return pageaddr; >> } >> >> +static bool kfence_obj_inuse(const struct kfence_metadata *meta) > > Other tiny helpers add "inline" so that the compiler is more likely to > inline this. In optimized kernels it should do so by default, but with > some heavily instrumented kernels we need to lower the inlining > threshold - adding "inline" does that. > > Also, note we have KFENCE_OBJECT_UNUSED state, so the > kfence_obj_inuse() helper name would suggest to me that it's all other > states. > > If the object is being freed with RCU, it is still technically > allocated and _usable_ until the next RCU grace period. So maybe > kfence_obj_allocated() is a more accurate name? > >> +{ >> + enum kfence_object_state state = READ_ONCE(meta->state); >> + >> + return state == KFENCE_OBJECT_ALLOCATED || state == KFENCE_OBJECT_RCU_FREEING; >> +} >> + >> /* >> * Update the object's metadata state, including updating the alloc/free stacks >> * depending on the state transition. >> @@ -278,10 +285,14 @@ metadata_update_state(struct kfence_metadata *meta, enum kfence_object_state nex >> unsigned long *stack_entries, size_t num_stack_entries) >> { >> struct kfence_track *track = >> - next == KFENCE_OBJECT_FREED ? &meta->free_track : &meta->alloc_track; >> + next == KFENCE_OBJECT_ALLOCATED ? &meta->alloc_track : &meta->free_track; >> >> lockdep_assert_held(&meta->lock); >> >> + /* Stack has been saved when calling rcu, skip. */ >> + if (READ_ONCE(meta->state) == KFENCE_OBJECT_RCU_FREEING) >> + goto out; >> + >> if (stack_entries) { >> memcpy(track->stack_entries, stack_entries, >> num_stack_entries * sizeof(stack_entries[0])); >> @@ -297,6 +308,7 @@ metadata_update_state(struct kfence_metadata *meta, enum kfence_object_state nex >> track->cpu = raw_smp_processor_id(); >> track->ts_nsec = local_clock(); /* Same source as printk timestamps. */ >> >> +out: >> /* >> * Pairs with READ_ONCE() in >> * kfence_shutdown_cache(), >> @@ -502,7 +514,7 @@ static void kfence_guarded_free(void *addr, struct kfence_metadata *meta, bool z >> >> raw_spin_lock_irqsave(&meta->lock, flags); >> >> - if (meta->state != KFENCE_OBJECT_ALLOCATED || meta->addr != (unsigned long)addr) { >> + if (!kfence_obj_inuse(meta) || meta->addr != (unsigned long)addr) { >> /* Invalid or double-free, bail out. */ >> atomic_long_inc(&counters[KFENCE_COUNTER_BUGS]); >> kfence_report_error((unsigned long)addr, false, NULL, meta, >> @@ -780,7 +792,7 @@ static void kfence_check_all_canary(void) >> for (i = 0; i < CONFIG_KFENCE_NUM_OBJECTS; i++) { >> struct kfence_metadata *meta = &kfence_metadata[i]; >> >> - if (meta->state == KFENCE_OBJECT_ALLOCATED) >> + if (kfence_obj_inuse(meta)) >> check_canary(meta); >> } >> } >> @@ -1006,12 +1018,11 @@ void kfence_shutdown_cache(struct kmem_cache *s) >> * the lock will not help, as different critical section >> * serialization will have the same outcome. >> */ >> - if (READ_ONCE(meta->cache) != s || >> - READ_ONCE(meta->state) != KFENCE_OBJECT_ALLOCATED) >> + if (READ_ONCE(meta->cache) != s || !kfence_obj_inuse(meta)) >> continue; >> >> raw_spin_lock_irqsave(&meta->lock, flags); >> - in_use = meta->cache == s && meta->state == KFENCE_OBJECT_ALLOCATED; >> + in_use = meta->cache == s && kfence_obj_inuse(meta); >> raw_spin_unlock_irqrestore(&meta->lock, flags); >> >> if (in_use) { >> @@ -1145,6 +1156,7 @@ void *kfence_object_start(const void *addr) >> void __kfence_free(void *addr) >> { >> struct kfence_metadata *meta = addr_to_metadata((unsigned long)addr); >> + unsigned long flags; > > This flags variable does not need to be scoped for the whole function. > It can just be scoped within the if-branch where it's needed (at least > I don't see other places besides there where it's used). > >> #ifdef CONFIG_MEMCG >> KFENCE_WARN_ON(meta->obj_exts.objcg); >> @@ -1154,9 +1166,14 @@ void __kfence_free(void *addr) >> * the object, as the object page may be recycled for other-typed >> * objects once it has been freed. meta->cache may be NULL if the cache >> * was destroyed. >> + * Save the stack trace here. It is more useful. > > "It is more useful." adds no value to the comment. > > I would say something like: "Save the stack trace here so that reports > show where the user freed the object." > >> */ >> - if (unlikely(meta->cache && (meta->cache->flags & SLAB_TYPESAFE_BY_RCU))) >> + if (unlikely(meta->cache && (meta->cache->flags & SLAB_TYPESAFE_BY_RCU))) { >> + raw_spin_lock_irqsave(&meta->lock, flags); >> + metadata_update_state(meta, KFENCE_OBJECT_RCU_FREEING, NULL, 0); >> + raw_spin_unlock_irqrestore(&meta->lock, flags); >> call_rcu(&meta->rcu_head, rcu_guarded_free); >> + } > > Wrong if-else style. Turn the whole thing into > > if (...) { > ... > } else { > kfence_guarded_free(...); > } > > So it looks balanced. > >> else >> kfence_guarded_free(addr, meta, false); >> } >> @@ -1182,14 +1199,14 @@ bool kfence_handle_page_fault(unsigned long addr, bool is_write, struct pt_regs >> int distance = 0; >> >> meta = addr_to_metadata(addr - PAGE_SIZE); >> - if (meta && READ_ONCE(meta->state) == KFENCE_OBJECT_ALLOCATED) { >> + if (meta && kfence_obj_inuse(meta)) { >> to_report = meta; >> /* Data race ok; distance calculation approximate. */ >> distance = addr - data_race(meta->addr + meta->size); >> } >> >> meta = addr_to_metadata(addr + PAGE_SIZE); >> - if (meta && READ_ONCE(meta->state) == KFENCE_OBJECT_ALLOCATED) { >> + if (meta && kfence_obj_inuse(meta)) { >> /* Data race ok; distance calculation approximate. */ >> if (!to_report || distance > data_race(meta->addr) - addr) >> to_report = meta; >> diff --git a/mm/kfence/kfence.h b/mm/kfence/kfence.h >> index db87a05047bd..dfba5ea06b01 100644 >> --- a/mm/kfence/kfence.h >> +++ b/mm/kfence/kfence.h >> @@ -38,6 +38,7 @@ >> enum kfence_object_state { >> KFENCE_OBJECT_UNUSED, /* Object is unused. */ >> KFENCE_OBJECT_ALLOCATED, /* Object is currently allocated. */ >> + KFENCE_OBJECT_RCU_FREEING, /* Object was allocated, and then being freed by rcu. */ >> KFENCE_OBJECT_FREED, /* Object was allocated, and then freed. */ >> }; >> >> diff --git a/mm/kfence/report.c b/mm/kfence/report.c >> index 73a6fe42845a..451991a3a8f2 100644 >> --- a/mm/kfence/report.c >> +++ b/mm/kfence/report.c >> @@ -114,7 +114,8 @@ static void kfence_print_stack(struct seq_file *seq, const struct kfence_metadat >> >> /* Timestamp matches printk timestamp format. */ >> seq_con_printf(seq, "%s by task %d on cpu %d at %lu.%06lus (%lu.%06lus ago):\n", >> - show_alloc ? "allocated" : "freed", track->pid, >> + show_alloc ? "allocated" : meta->state == KFENCE_OBJECT_RCU_FREEING ? >> + "rcu freeing" : "freed", track->pid, >> track->cpu, (unsigned long)ts_sec, rem_nsec / 1000, >> (unsigned long)interval_nsec, rem_interval_nsec / 1000); >> >> @@ -149,7 +150,7 @@ void kfence_print_object(struct seq_file *seq, const struct kfence_metadata *met >> >> kfence_print_stack(seq, meta, true); >> >> - if (meta->state == KFENCE_OBJECT_FREED) { >> + if (meta->state == KFENCE_OBJECT_FREED || meta->state == KFENCE_OBJECT_RCU_FREEING) { >> seq_con_printf(seq, "\n"); >> kfence_print_stack(seq, meta, false); >> } >> @@ -318,7 +319,7 @@ bool __kfence_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *sla >> kpp->kp_slab_cache = meta->cache; >> kpp->kp_objp = (void *)meta->addr; >> kfence_to_kp_stack(&meta->alloc_track, kpp->kp_stack); >> - if (meta->state == KFENCE_OBJECT_FREED) >> + if (meta->state == KFENCE_OBJECT_FREED || meta->state == KFENCE_OBJECT_RCU_FREEING) >> kfence_to_kp_stack(&meta->free_track, kpp->kp_free_stack); >> /* get_stack_skipnr() ensures the first entry is outside allocator. */ >> kpp->kp_ret = kpp->kp_stack[0]; >> -- >> 2.39.3 >> >> -- >> You received this message because you are subscribed to the Google Groups "kasan-dev" group. >> To unsubscribe from this group and stop receiving emails from it, send an email to kasan-dev+unsubscribe@googlegroups.com. >> To view this discussion on the web visit https://groups.google.com/d/msgid/kasan-dev/20240812065947.6104-1-dtcccc%40linux.alibaba.com. Thanks for your comments. I'll fix them and send v2 later.