linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Joel Fernandes <joel@joelfernandes.org>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Paasch <cpaasch@apple.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, MPTCP Upstream <mptcp@lists.linux.dev>,
	rcu@vger.kernel.org
Subject: Re: kmemleak handling of kfree_rcu
Date: Tue, 5 Sep 2023 11:17:25 +0000	[thread overview]
Message-ID: <20230905111725.GA3737639@google.com> (raw)
In-Reply-To: <ZPZKsJUPVeHCsLQB@arm.com>

On Mon, Sep 04, 2023 at 10:22:56PM +0100, Catalin Marinas wrote:
> Hi Christoph,
> 
> Please try not to send html, many servers block such emails.
> 
> Also adding the RCU list.
> 
> On Wed, Aug 30, 2023 at 09:37:23AM -0700, Christoph Paasch wrote:
> > for the MPTCP upstream project, we are running syzkaller [1] continuously to
> > qualify our kernel changes.
> > 
> > We found one issue with kmemleak and its handling of kfree_rcu.
> > 
> > Specifically, it looks like kmemleak falsely reports memory-leaks when the
> > object is being freed by kfree_rcu after a certain grace-period.
> > 
> > For example, https://github.com/multipath-tcp/mptcp_net-next/issues/398#
> > issuecomment-1584819482 shows how the syzkaller program reliably produces a
> > kmemleak report, although the object is not leaked (we confirmed that by simply
> > increasing MSECS_MIN_AGE to something larger than the grace-period).
> 
> Not sure which RCU variant you are using but most likely the false
> positives are caused by the original reference to the object being lost
> and the pointer added to a new location that kmemleak does not track
> (e.g. bnode->records[] in the tree-based variant).
> 
> A quick attempt (untested, not even compiled):
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c

I am not sure if that works. Correct me if I'm wrong but the issue is that
the allocated pointer being RCU-freed is no longer reachable by kmemleak (it
is scanned but not reachable), however it might still be reachable via an
RCU reader. In such a situation, it is a false-positive.

The correct fix then should probably be to mark the object as
kmemleak_not_leak() until a grace period elapses. This will cause the object
to not be reported but still be scanned until eventually the lower layers
will remove the object from kmemleak-tracking after the grace period. Per the
docs also, that API is used to prevent false-positives.

Instead what the below diff appears to do is to mark the bnode cache as a
kmemleak object itself, which smells a bit wrong. The bnode is not an
allocated object in the traditional sense, it is simple an internal data
structure. That may not solve the issue as the false-positive unreachable
object is still unreachable.

Or did I misunderstand how kmemleak works? (Quite possible).

thanks,

 - Joel


> index 1449cb69a0e0..681a1eb7560a 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -53,6 +53,7 @@
>  #include <linux/sysrq.h>
>  #include <linux/kprobes.h>
>  #include <linux/gfp.h>
> +#include <linux/kmemleak.h>
>  #include <linux/oom.h>
>  #include <linux/smpboot.h>
>  #include <linux/jiffies.h>
> @@ -2934,6 +2935,7 @@ drain_page_cache(struct kfree_rcu_cpu *krcp)
>  
>  	llist_for_each_safe(pos, n, page_list) {
>  		free_page((unsigned long)pos);
> +		kmemleak_free(pos);
>  		freed++;
>  	}
>  
> @@ -2962,8 +2964,16 @@ kvfree_rcu_bulk(struct kfree_rcu_cpu *krcp,
>  					rcu_state.name, bnode->records[i], 0);
>  
>  				vfree(bnode->records[i]);
> +				/* avoid false negatives */
> +				kmemleak_erase(&bnode->records[i]);
>  			}
>  		}
> +		/*
> +		 * Avoid kmemleak false negatives when such pointers are later
> +		 * re-allocated.
> +		 */
> +		for (i = 0; i < bnode->nr_records; i++)
> +			kmemleak_erase(&bnode->records[i]);
>  		rcu_lock_release(&rcu_callback_map);
>  	}
>  
> @@ -2972,8 +2982,10 @@ kvfree_rcu_bulk(struct kfree_rcu_cpu *krcp,
>  		bnode = NULL;
>  	raw_spin_unlock_irqrestore(&krcp->lock, flags);
>  
> -	if (bnode)
> +	if (bnode) {
>  		free_page((unsigned long) bnode);
> +		kmemleak_free(bnode);
> +	}
>  
>  	cond_resched_tasks_rcu_qs();
>  }
> @@ -3241,6 +3253,12 @@ static void fill_page_cache_func(struct work_struct *work)
>  			free_page((unsigned long) bnode);
>  			break;
>  		}
> +
> +		/*
> +		 * Scan the bnode->records[] array until the objects are
> +		 * actually freed.
> +		 */
> +		kmemleak_alloc(bnode, PAGE_SIZE, 0, GFP_KERNEL);
>  	}
>  
>  	atomic_set(&krcp->work_in_progress, 0);
> @@ -3308,6 +3326,11 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
>  			// scenarios.
>  			bnode = (struct kvfree_rcu_bulk_data *)
>  				__get_free_page(GFP_KERNEL | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
> +			/*
> +			 * Scan the bnode->records[] array until the objects are
> +			 * actually freed.
> +			 */
> +			kmemleak_alloc(bnode, PAGE_SIZE, 0, GFP_KERNEL);
>  			raw_spin_lock_irqsave(&(*krcp)->lock, *flags);
>  		}
>  
> 
> -- 
> Catalin


  reply	other threads:[~2023-09-05 11:17 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-30 16:37 Christoph Paasch
2023-09-04 21:22 ` Catalin Marinas
2023-09-05 11:17   ` Joel Fernandes [this message]
2023-09-05 14:41     ` Catalin Marinas
2023-09-06 14:35       ` Joel Fernandes
2023-09-06 17:15         ` Catalin Marinas
2023-09-06 19:11           ` Paul E. McKenney
2023-09-06 21:37             ` Catalin Marinas
2023-09-06 22:02               ` Paul E. McKenney
2023-09-06 23:10                 ` Joel Fernandes
2023-09-12 13:15           ` Matthieu Baerts
2023-09-12 13:32             ` Joel Fernandes
2023-09-05 21:22   ` Christoph Paasch
2023-09-06 17:21     ` Catalin Marinas
2023-09-10 23:10       ` Joel Fernandes
2023-09-11 17:41         ` Christoph Paasch
2023-09-12  9:52           ` Catalin Marinas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230905111725.GA3737639@google.com \
    --to=joel@joelfernandes.org \
    --cc=akpm@linux-foundation.org \
    --cc=catalin.marinas@arm.com \
    --cc=cpaasch@apple.com \
    --cc=linux-mm@kvack.org \
    --cc=mptcp@lists.linux.dev \
    --cc=rcu@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox