linux-mm.kvack.org archive mirror
From: Konstantin Khlebnikov <koct9i@gmail.com>
To: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>, Ohad Ben-Cohen <ohad@wizery.com>,
	Matthew Wilcox <willy@linux.intel.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Stable <stable@vger.kernel.org>
Subject: Re: [PATCH 1/5] radix-tree: Fix race in gang lookup
Date: Thu, 4 Feb 2016 11:44:02 +0300
Message-ID: <CALYGNiPkrB4JauWkTNsqxH++iiCGtaRWQGFMW2BU7VDfz-rq=A@mail.gmail.com>
In-Reply-To: <CALYGNiOksSkSzJWz3JPPozfeAaHPWOQZFgDzSr-MnR9zVBTncw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 5224 bytes --]

On Thu, Feb 4, 2016 at 12:37 AM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
> On Thu, Jan 28, 2016 at 12:17 AM, Matthew Wilcox
> <matthew.r.wilcox@intel.com> wrote:
>> From: Matthew Wilcox <willy@linux.intel.com>
>>
>> If the indirect_ptr bit is set on a slot, that indicates we need to
>> redo the lookup.  Introduce a new function radix_tree_iter_retry()
>> which forces the loop to retry the lookup by setting 'slot' to NULL and
>> turning the iterator back to point at the problematic entry.
>>
>> This is a pretty rare problem to hit at the moment; the lookup has to
>> race with a grow of the radix tree from a height of 0.  The consequences
>> of hitting this race are that gang lookup could return a pointer to a
>> radix_tree_node instead of a pointer to whatever the user had inserted
>> in the tree.
>>
>> Fixes: cebbd29e1c2f ("radix-tree: rewrite gang lookup using iterator")
>> Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
>> Cc: stable@vger.kernel.org
>> ---
>>  include/linux/radix-tree.h | 16 ++++++++++++++++
>>  lib/radix-tree.c           | 12 ++++++++++--
>>  2 files changed, 26 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h
>> index f9a3da5bf892..db0ed595749b 100644
>> --- a/include/linux/radix-tree.h
>> +++ b/include/linux/radix-tree.h
>> @@ -387,6 +387,22 @@ void **radix_tree_next_chunk(struct radix_tree_root *root,
>>                              struct radix_tree_iter *iter, unsigned flags);
>>
>>  /**
>> + * radix_tree_iter_retry - retry this chunk of the iteration
>> + * @iter:      iterator state
>> + *
>> + * If we iterate over a tree protected only by the RCU lock, a race
>> + * against deletion or creation may result in seeing a slot for which
>> + * radix_tree_deref_retry() returns true.  If so, call this function
>> + * and continue the iteration.
>> + */
>> +static inline __must_check
>> +void **radix_tree_iter_retry(struct radix_tree_iter *iter)
>> +{
>> +       iter->next_index = iter->index;
>> +       return NULL;
>> +}
>> +
>> +/**
>>   * radix_tree_chunk_size - get current chunk size
>>   *
>>   * @iter:      pointer to radix tree iterator
>> diff --git a/lib/radix-tree.c b/lib/radix-tree.c
>> index a25f635dcc56..65422ac17114 100644
>> --- a/lib/radix-tree.c
>> +++ b/lib/radix-tree.c
>> @@ -1105,9 +1105,13 @@ radix_tree_gang_lookup(struct radix_tree_root *root, void **results,
>>                 return 0;
>>
>>         radix_tree_for_each_slot(slot, root, &iter, first_index) {
>> -               results[ret] = indirect_to_ptr(rcu_dereference_raw(*slot));
>> +               results[ret] = rcu_dereference_raw(*slot);
>>                 if (!results[ret])
>>                         continue;
>> +               if (radix_tree_is_indirect_ptr(results[ret])) {
>> +                       slot = radix_tree_iter_retry(&iter);
>> +                       continue;
>> +               }
>>                 if (++ret == max_items)
>>                         break;
>>         }
>
> Looks like your fix doesn't work.
>
> After radix_tree_iter_retry(), radix_tree_for_each_slot() will call
> radix_tree_next_slot(), which isn't safe to call with a NULL slot.
>
> #define radix_tree_for_each_slot(slot, root, iter, start) \
> for (slot = radix_tree_iter_init(iter, start) ; \
>     slot || (slot = radix_tree_next_chunk(root, iter, 0)) ; \
>     slot = radix_tree_next_slot(slot, iter, 0))
>
> The tagged iterator works because the restart happens only at the
> root, where the tags are filled with a single bit.
>
> Quick (untested) fix for that:
>
> --- a/include/linux/radix-tree.h
> +++ b/include/linux/radix-tree.h
> @@ -457,9 +457,9 @@ radix_tree_next_slot(void **slot, struct radix_tree_iter *iter, unsigned flags)
>                         return slot + offset + 1;
>                 }
>         } else {
> -               unsigned size = radix_tree_chunk_size(iter) - 1;
> +               int size = radix_tree_chunk_size(iter) - 1;
>
> -               while (size--) {
> +               while (size-- > 0) {
>                         slot++;
>                         iter->index++;
>                         if (likely(*slot))
>
>

Yep, the kernel crashes. A test is in the attachment.

fix: https://lkml.kernel.org/r/145457528789.31321.4441662473067711123.stgit@zurg
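
For reference, here is a minimal user-space sketch of why the unsigned
counter in radix_tree_next_slot() is a problem after a retry. A
"continue" in a for loop still runs the third clause, so
radix_tree_next_slot() is called with the NULL slot. The sketch assumes
the chunk size is computed as next_index - index, which makes it 0 right
after radix_tree_iter_retry() sets next_index = index. The fake_* and
steps_* names are hypothetical stand-ins, not kernel code, and the step
limit exists only so that the demonstration terminates.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the two iterator fields that matter here. */
struct fake_iter {
	unsigned long index;
	unsigned long next_index;
};

/* Assumption: the chunk size is the distance from index to next_index. */
unsigned fake_chunk_size(const struct fake_iter *iter)
{
	return iter->next_index - iter->index;
}

/*
 * Old form: "unsigned size = chunk_size - 1; while (size--)".
 * With a chunk size of 0 the subtraction wraps to UINT_MAX, so the loop
 * keeps stepping. Returns the number of steps taken, capped at limit.
 */
unsigned long steps_unsigned(const struct fake_iter *iter, unsigned long limit)
{
	unsigned size = fake_chunk_size(iter) - 1;
	unsigned long steps = 0;

	while (size--) {
		if (++steps == limit)
			break;
	}
	return steps;
}

/*
 * Form from the quick fix: a signed counter and "while (size-- > 0)".
 * The cast is done before the subtraction here so that the sketch does
 * not rely on an implementation-defined conversion.
 */
unsigned long steps_signed(const struct fake_iter *iter, unsigned long limit)
{
	int size = (int)fake_chunk_size(iter) - 1;
	unsigned long steps = 0;

	while (size-- > 0) {
		if (++steps == limit)
			break;
	}
	return steps;
}

/* Print both step counts for an iterator in the "just retried" state. */
void show_retry_case(void)
{
	struct fake_iter it = { .index = 5, .next_index = 5 };

	printf("unsigned: %lu steps, signed: %lu steps\n",
	       steps_unsigned(&it, 1000), steps_signed(&it, 1000));
}
```

In the kernel each of those extra steps does slot++ and dereferences the
result, starting from a NULL slot, which would explain the crash. For a
normal chunk (for example next_index - index == 3) both forms take the
same two steps, so the signed version should not change the common path.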

>> @@ -1184,9 +1188,13 @@ radix_tree_gang_lookup_tag(struct radix_tree_root *root, void **results,
>>                 return 0;
>>
>>         radix_tree_for_each_tagged(slot, root, &iter, first_index, tag) {
>> -               results[ret] = indirect_to_ptr(rcu_dereference_raw(*slot));
>> +               results[ret] = rcu_dereference_raw(*slot);
>>                 if (!results[ret])
>>                         continue;
>> +               if (radix_tree_is_indirect_ptr(results[ret])) {
>> +                       slot = radix_tree_iter_retry(&iter);
>> +                       continue;
>> +               }
>>                 if (++ret == max_items)
>>                         break;
>>         }
>> --
>> 2.7.0.rc3
>>

[-- Attachment #2: radix-tree-test-radix_tree_iter_retry --]
[-- Type: application/octet-stream, Size: 2218 bytes --]

radix-tree: test radix_tree_iter_retry

From: Konstantin Khlebnikov <koct9i@gmail.com>

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
---
 lib/radix-tree.c |   62 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 6b79e9026e24..f489334b9cb7 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -1491,6 +1491,66 @@ static int radix_tree_callback(struct notifier_block *nfb,
        return NOTIFY_OK;
 }
 
+static void test_iter_retry(void)
+{
+	RADIX_TREE(root, GFP_KERNEL);
+	void *ptr = (void *)4ul;
+	struct radix_tree_iter iter;
+	void **slot;
+	bool first;
+
+	radix_tree_insert(&root, 0, ptr);
+	radix_tree_tag_set(&root, 0, 0);
+
+	first = true;
+	radix_tree_for_each_tagged(slot, &root, &iter, 0, 0) {
+		printk("tagged %lu %p\n", iter.index, *slot);
+		if (first) {
+			radix_tree_insert(&root, 1, ptr);
+			radix_tree_tag_set(&root, 1, 0);
+			first = false;
+		}
+		if (radix_tree_deref_retry(*slot)) {
+			printk("retry %lu\n", iter.index);
+			slot = radix_tree_iter_retry(&iter);
+			continue;
+		}
+	}
+	radix_tree_delete(&root, 1);
+
+	first = true;
+	radix_tree_for_each_slot(slot, &root, &iter, 0) {
+		printk("slot %lu %p\n", iter.index, *slot);
+		if (first) {
+			radix_tree_insert(&root, 1, ptr);
+			first = false;
+		}
+		if (radix_tree_deref_retry(*slot)) {
+			printk("retry %lu\n", iter.index);
+			slot = radix_tree_iter_retry(&iter);
+			continue;
+		}
+	}
+	radix_tree_delete(&root, 1);
+
+	first = true;
+	radix_tree_for_each_contig(slot, &root, &iter, 0) {
+		printk("contig %lu %p\n", iter.index, *slot);
+		if (first) {
+			radix_tree_insert(&root, 1, ptr);
+			first = false;
+		}
+		if (radix_tree_deref_retry(*slot)) {
+			printk("retry %lu\n", iter.index);
+			slot = radix_tree_iter_retry(&iter);
+			continue;
+		}
+	}
+
+	radix_tree_delete(&root, 0);
+	radix_tree_delete(&root, 1);
+}
+
 void __init radix_tree_init(void)
 {
 	radix_tree_node_cachep = kmem_cache_create("radix_tree_node",
@@ -1499,4 +1559,6 @@ void __init radix_tree_init(void)
 			radix_tree_node_ctor);
 	radix_tree_init_maxindex();
 	hotcpu_notifier(radix_tree_callback, 0);
+
+	test_iter_retry();
 }
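
The test above inserts index 1 while iterating over a tree that holds
only index 0, which forces the tree to grow from height 0. A rough
user-space model of what the iterator then sees in the root slot is
sketched below. It assumes that the low bit of a slot value marks a
pointer to an internal node rather than a user entry, which is why the
test uses (void *)4ul as an entry with that bit clear. All fake_* names
are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: bit 0 of a slot value means "this points to a node". */
#define FAKE_INDIRECT_PTR 1UL

/* Hypothetical two-slot node and root, just enough to show one grow. */
struct fake_node {
	void *slots[2];
};

struct fake_root {
	void *rnode;
};

/* Tag a node pointer so readers can tell it apart from a user entry. */
void *fake_ptr_to_indirect(struct fake_node *node)
{
	return (void *)((uintptr_t)node | FAKE_INDIRECT_PTR);
}

/* Non-zero if the slot value is a tagged node pointer. */
int fake_is_indirect_ptr(const void *entry)
{
	return (int)((uintptr_t)entry & FAKE_INDIRECT_PTR);
}

/*
 * Grow from height 0: the entry stored directly in the root moves into
 * slot 0 of a new node, and the root now holds the tagged node pointer.
 */
void fake_grow(struct fake_root *root, struct fake_node *node)
{
	node->slots[0] = root->rnode;
	node->slots[1] = NULL;
	root->rnode = fake_ptr_to_indirect(node);
}

/*
 * What a lockless reader does with a value it loaded from a slot: hand
 * back user entries, and report a retry for tagged node pointers.
 */
void *fake_deref(void *entry, int *retry)
{
	*retry = fake_is_indirect_ptr(entry);
	return *retry ? NULL : entry;
}
```

A reader that loaded the root slot before fake_grow() gets the entry;
one that loads it afterwards gets the tagged pointer and has to restart
the lookup, which is the case each of the three loops in the test is
meant to hit.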

Thread overview: 16+ messages
2016-01-27 21:17 [PATCH 0/5] Fix races & improve the radix tree iterator patterns Matthew Wilcox
2016-01-27 21:17 ` [PATCH 1/5] radix-tree: Fix race in gang lookup Matthew Wilcox
2016-02-03 21:37   ` Konstantin Khlebnikov
2016-02-04  8:44     ` Konstantin Khlebnikov [this message]
2016-03-04 13:21   ` zhong jiang
2016-01-27 21:17 ` [PATCH 2/5] hwspinlock: Fix race between radix tree insertion and lookup Matthew Wilcox
2016-01-27 21:17 ` [PATCH 3/5] btrfs: Use radix_tree_iter_retry() Matthew Wilcox
2016-02-01 14:34   ` David Sterba
2016-01-27 21:17 ` [PATCH 4/5] mm: " Matthew Wilcox
2016-01-29 14:45   ` Vlastimil Babka
2016-01-29 14:50     ` Matthew Wilcox
2016-02-19 18:02   ` Sasha Levin
2016-01-27 21:17 ` [PATCH 5/5] radix-tree,shmem: Introduce radix_tree_iter_next() Matthew Wilcox
2016-02-04  8:50   ` Konstantin Khlebnikov
2016-01-28  7:17 ` [PATCH 0/5] Fix races & improve the radix tree iterator patterns Konstantin Khlebnikov
2016-02-03  6:27   ` Konstantin Khlebnikov
