From: Joel Fernandes <joel@joelfernandes.org>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Paasch <cpaasch@apple.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, MPTCP Upstream <mptcp@lists.linux.dev>,
rcu@vger.kernel.org
Subject: Re: kmemleak handling of kfree_rcu
Date: Tue, 5 Sep 2023 11:17:25 +0000 [thread overview]
Message-ID: <20230905111725.GA3737639@google.com> (raw)
In-Reply-To: <ZPZKsJUPVeHCsLQB@arm.com>
On Mon, Sep 04, 2023 at 10:22:56PM +0100, Catalin Marinas wrote:
> Hi Christoph,
>
> Please try not to send html, many servers block such emails.
>
> Also adding the RCU list.
>
> On Wed, Aug 30, 2023 at 09:37:23AM -0700, Christoph Paasch wrote:
> > for the MPTCP upstream project, we are running syzkaller [1] continuously to
> > qualify our kernel changes.
> >
> > We found one issue with kmemleak and its handling of kfree_rcu.
> >
> > Specifically, it looks like kmemleak falsely reports memory-leaks when the
> > object is being freed by kfree_rcu after a certain grace-period.
> >
> > For example, https://github.com/multipath-tcp/mptcp_net-next/issues/398#
> > issuecomment-1584819482 shows how the syzkaller program reliably produces a
> > kmemleak report, although the object is not leaked (we confirmed that by simply
> > increasing MSECS_MIN_AGE to something larger than the grace-period).
>
> Not sure which RCU variant you are using but most likely the false
> positives are caused by the original reference to the object being lost
> and the pointer added to a new location that kmemleak does not track
> (e.g. bnode->records[] in the tree-based variant).
>
> A quick attempt (untested, not even compiled):
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
I am not sure if that works. Correct me if I'm wrong but the issue is that
the allocated pointer being RCU-freed is no longer reachable by kmemleak (it
is scanned but not reachable), however it might still be reachable via an
RCU reader. In such a situation, it is a false-positive.
The correct fix then should probably be to mark the object as
kmemleak_not_leak() until a grace period elapses. This will cause the object
to not be reported but still be scanned until eventually the lower layers
will remove the object from kmemleak-tracking after the grace period. Per the
docs also, that API is used to prevent false-positives.
Instead what the below diff appears to do is to mark the bnode cache as a
kmemleak object itself, which smells a bit wrong. The bnode is not an
allocated object in the traditional sense, it is simple an internal data
structure. That may not solve the issue as the false-positive unreachable
object is still unreachable.
Or did I misunderstand how kmemleak works? (Quite possible).
thanks,
- Joel
> index 1449cb69a0e0..681a1eb7560a 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -53,6 +53,7 @@
> #include <linux/sysrq.h>
> #include <linux/kprobes.h>
> #include <linux/gfp.h>
> +#include <linux/kmemleak.h>
> #include <linux/oom.h>
> #include <linux/smpboot.h>
> #include <linux/jiffies.h>
> @@ -2934,6 +2935,7 @@ drain_page_cache(struct kfree_rcu_cpu *krcp)
>
> llist_for_each_safe(pos, n, page_list) {
> free_page((unsigned long)pos);
> + kmemleak_free(pos);
> freed++;
> }
>
> @@ -2962,8 +2964,16 @@ kvfree_rcu_bulk(struct kfree_rcu_cpu *krcp,
> rcu_state.name, bnode->records[i], 0);
>
> vfree(bnode->records[i]);
> + /* avoid false negatives */
> + kmemleak_erase(&bnode->records[i]);
> }
> }
> + /*
> + * Avoid kmemleak false negatives when such pointers are later
> + * re-allocated.
> + */
> + for (i = 0; i < bnode->nr_records; i++)
> + kmemleak_erase(&bnode->records[i]);
> rcu_lock_release(&rcu_callback_map);
> }
>
> @@ -2972,8 +2982,10 @@ kvfree_rcu_bulk(struct kfree_rcu_cpu *krcp,
> bnode = NULL;
> raw_spin_unlock_irqrestore(&krcp->lock, flags);
>
> - if (bnode)
> + if (bnode) {
> free_page((unsigned long) bnode);
> + kmemleak_free(bnode);
> + }
>
> cond_resched_tasks_rcu_qs();
> }
> @@ -3241,6 +3253,12 @@ static void fill_page_cache_func(struct work_struct *work)
> free_page((unsigned long) bnode);
> break;
> }
> +
> + /*
> + * Scan the bnode->records[] array until the objects are
> + * actually freed.
> + */
> + kmemleak_alloc(bnode, PAGE_SIZE, 0, GFP_KERNEL);
> }
>
> atomic_set(&krcp->work_in_progress, 0);
> @@ -3308,6 +3326,11 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
> // scenarios.
> bnode = (struct kvfree_rcu_bulk_data *)
> __get_free_page(GFP_KERNEL | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
> + /*
> + * Scan the bnode->records[] array until the objects are
> + * actually freed.
> + */
> + kmemleak_alloc(bnode, PAGE_SIZE, 0, GFP_KERNEL);
> raw_spin_lock_irqsave(&(*krcp)->lock, *flags);
> }
>
>
> --
> Catalin
next prev parent reply other threads:[~2023-09-05 11:17 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-30 16:37 Christoph Paasch
2023-09-04 21:22 ` Catalin Marinas
2023-09-05 11:17 ` Joel Fernandes [this message]
2023-09-05 14:41 ` Catalin Marinas
2023-09-06 14:35 ` Joel Fernandes
2023-09-06 17:15 ` Catalin Marinas
2023-09-06 19:11 ` Paul E. McKenney
2023-09-06 21:37 ` Catalin Marinas
2023-09-06 22:02 ` Paul E. McKenney
2023-09-06 23:10 ` Joel Fernandes
2023-09-12 13:15 ` Matthieu Baerts
2023-09-12 13:32 ` Joel Fernandes
2023-09-05 21:22 ` Christoph Paasch
2023-09-06 17:21 ` Catalin Marinas
2023-09-10 23:10 ` Joel Fernandes
2023-09-11 17:41 ` Christoph Paasch
2023-09-12 9:52 ` Catalin Marinas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230905111725.GA3737639@google.com \
--to=joel@joelfernandes.org \
--cc=akpm@linux-foundation.org \
--cc=catalin.marinas@arm.com \
--cc=cpaasch@apple.com \
--cc=linux-mm@kvack.org \
--cc=mptcp@lists.linux.dev \
--cc=rcu@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox