* [PATCH 2/2] test_maple_tree: Add more testing for mas_empty_area()
2023-03-03 2:15 [PATCH 1/2] maple_tree: Fix mas_skip_node() end slot detection Liam R. Howlett
@ 2023-03-03 2:15 ` Liam R. Howlett
2023-03-03 8:51 ` [PATCH 1/2] maple_tree: Fix mas_skip_node() end slot detection Snild Dolkow
2023-03-07 13:05 ` Peng Zhang
2 siblings, 0 replies; 9+ messages in thread
From: Liam R. Howlett @ 2023-03-03 2:15 UTC (permalink / raw)
To: maple-tree, linux-mm, linux-kernel, Andrew Morton, Snild Dolkow
Cc: Liam R. Howlett, Stable
Test robust filling of an entire area of the tree, then test one beyond.
This is to test the walking back up the tree at the end of nodes and
error condition.
Test inspired by the reproducer code provided by Snild Dolkow.
Cc: Snild Dolkow <snild@sony.com>
Link: https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.com/
Cc: <Stable@vger.kernel.org>
Fixes: e15e06a83923 ("lib/test_maple_tree: add testing for maple tree")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
---
lib/test_maple_tree.c | 35 +++++++++++++++++++++++++++++++++++
1 file changed, 35 insertions(+)
diff --git a/lib/test_maple_tree.c b/lib/test_maple_tree.c
index 3d19b1f78d71..fa86874763f0 100644
--- a/lib/test_maple_tree.c
+++ b/lib/test_maple_tree.c
@@ -2670,6 +2670,36 @@ static noinline void check_empty_area_window(struct maple_tree *mt)
rcu_read_unlock();
}
+static noinline void check_empty_area_fill(struct maple_tree *mt)
+{
+ int loop, shift;
+ unsigned long max = 0x25D78000;
+ unsigned long size;
+ MA_STATE(mas, mt, 0, 0);
+
+ mt_set_non_kernel(99999);
+ for (shift = 12; shift <= 16; shift++) {
+ loop = 5000;
+ size = 1 << shift;
+ while (loop--) {
+ mas_lock(&mas);
+ MT_BUG_ON(mt, mas_empty_area(&mas, 0, max, size) != 0);
+ MT_BUG_ON(mt, mas.last != mas.index + size - 1);
+ mas_store_gfp(&mas, &check_empty_area_fill, GFP_KERNEL);
+ mas_unlock(&mas);
+ mas_reset(&mas);
+ }
+ }
+
+ /* No space left. */
+ size = 0x1000;
+ rcu_read_lock();
+ MT_BUG_ON(mt, mas_empty_area(&mas, 0, max, size) != -EBUSY);
+ rcu_read_unlock();
+
+ mt_set_non_kernel(0);
+}
+
static DEFINE_MTREE(tree);
static int maple_tree_seed(void)
{
@@ -2926,6 +2956,11 @@ static int maple_tree_seed(void)
check_empty_area_window(&tree);
mtree_destroy(&tree);
+ mt_init_flags(&tree, MT_FLAGS_ALLOC_RANGE);
+ check_empty_area_fill(&tree);
+ mtree_destroy(&tree);
+
+
#if defined(BENCH)
skip:
#endif
--
2.39.2
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH 1/2] maple_tree: Fix mas_skip_node() end slot detection
2023-03-03 2:15 [PATCH 1/2] maple_tree: Fix mas_skip_node() end slot detection Liam R. Howlett
2023-03-03 2:15 ` [PATCH 2/2] test_maple_tree: Add more testing for mas_empty_area() Liam R. Howlett
@ 2023-03-03 8:51 ` Snild Dolkow
2023-03-07 13:05 ` Peng Zhang
2 siblings, 0 replies; 9+ messages in thread
From: Snild Dolkow @ 2023-03-03 8:51 UTC (permalink / raw)
To: Liam R. Howlett, maple-tree, linux-mm, linux-kernel, Andrew Morton; +Cc: Stable
Thanks, this is a significant improvement. Applying it on top of v6.1.12
allows my reproducer to pass most of the time (running as init in qemu).
Unfortunately, it's still failing around 10% of the time:
> $ for x in $(seq 100); do qemu-system-x86_64 -nographic -no-reboot -append 'console=ttyS0 panic=-1' -kernel arch/x86/boot/bzImage -initrd initrd/initrd.gz; done | tee qemu.log
> [...] > $ egrep -o 'Failed|Success' qemu.log | sort | uniq -c
> 11 Failed
> 89 Success
The failures now happen later, around 25 MiB:
> $ grep Failed qemu.log
> Failed. m=0xffffffffffffffff size=8192 (1<<13) i=1050 errno=12 total_leaks=29081600 (27 MiB)
> Failed. m=0xffffffffffffffff size=8192 (1<<13) i=332 errno=12 total_leaks=23199744 (22 MiB)
> Failed. m=0xffffffffffffffff size=8192 (1<<13) i=838 errno=12 total_leaks=27344896 (26 MiB)
> Failed. m=0xffffffffffffffff size=8192 (1<<13) i=282 errno=12 total_leaks=22790144 (21 MiB)
> Failed. m=0xffffffffffffffff size=8192 (1<<13) i=695 errno=12 total_leaks=26173440 (24 MiB)
> Failed. m=0xffffffffffffffff size=8192 (1<<13) i=1064 errno=12 total_leaks=29196288 (27 MiB)
> Failed. m=0xffffffffffffffff size=8192 (1<<13) i=608 errno=12 total_leaks=25460736 (24 MiB)
> Failed. m=0xffffffffffffffff size=8192 (1<<13) i=443 errno=12 total_leaks=24109056 (22 MiB)
> Failed. m=0xffffffffffffffff size=8192 (1<<13) i=549 errno=12 total_leaks=24977408 (23 MiB)
> Failed. m=0xffffffffffffffff size=8192 (1<<13) i=630 errno=12 total_leaks=25640960 (24 MiB)
> Failed. m=0xffffffffffffffff size=8192 (1<<13) i=820 errno=12 total_leaks=27197440 (25 MiB)
Just to make sure, I went back to e15e06a8 and ran the same loop.
> $ egrep -o 'Failed|Success' qemu.log | sort | uniq -c
> 100 Success
And with the patches applied on top of master (ee3f96b1):
> $ egrep -o 'Failed|Success' qemu.log | sort | uniq -c
> 10 Failed
> 90 Success
//Snild
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] maple_tree: Fix mas_skip_node() end slot detection
2023-03-03 2:15 [PATCH 1/2] maple_tree: Fix mas_skip_node() end slot detection Liam R. Howlett
2023-03-03 2:15 ` [PATCH 2/2] test_maple_tree: Add more testing for mas_empty_area() Liam R. Howlett
2023-03-03 8:51 ` [PATCH 1/2] maple_tree: Fix mas_skip_node() end slot detection Snild Dolkow
@ 2023-03-07 13:05 ` Peng Zhang
2023-03-07 14:30 ` Snild Dolkow
2 siblings, 1 reply; 9+ messages in thread
From: Peng Zhang @ 2023-03-07 13:05 UTC (permalink / raw)
To: Liam R. Howlett
Cc: Stable, maple-tree, linux-mm, linux-kernel, Andrew Morton, Snild Dolkow
Hi, Liam,
> mas_skip_node() is used to move the maple state to the node with a
> higher limit. It does this by walking up the tree and increasing the
> slot count. Since slot count may not be able to be increased, it may
> need to walk up multiple times to find room to walk right to a higher
> limit node. The limit of slots that was being used was the node limit
> and not the last location of data in the node. This would cause the
> maple state to be shifted outside actual data and enter an error state,
> thus returning -EBUSY.
>
> The result of the incorrect error state means that mas_awalk() would
> return an error instead of finding the allocation space.
>
> The fix is to use mas_data_end() in mas_skip_node() to detect the nodes
> data end point and continue walking the tree up until it is safe to move
> to a node with a higher limit.
>
> mas_skip_node() may also be passed a maple state in an error state from
> mas_anode_descend() when no allocations are available. Return on such
> an error state immediately.
>
> Reported-by: Snild Dolkow <snild@sony.com>
> Link: https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.com/
> Cc: <Stable@vger.kernel.org>
> Fixes: 54a611b60590 ("Maple Tree: add new data structure")
> Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com>
> ---
> lib/maple_tree.c | 25 ++++++++++---------------
> 1 file changed, 10 insertions(+), 15 deletions(-)
>
> diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> index 2be86368237d..2efe854946d6 100644
> --- a/lib/maple_tree.c
> +++ b/lib/maple_tree.c
> @@ -5188,34 +5188,29 @@ static inline bool mas_rewind_node(struct ma_state *mas)
> */
> static inline bool mas_skip_node(struct ma_state *mas)
> {
> - unsigned char slot, slot_count;
> unsigned long *pivots;
> enum maple_type mt;
>
> - mt = mte_node_type(mas->node);
> - slot_count = mt_slots[mt] - 1;
> + if (mas_is_err(mas))
> + return false;
> +
> do {
> if (mte_is_root(mas->node)) {
> - slot = mas->offset;
> - if (slot > slot_count) {
> + if (mas->offset >= mas_data_end(mas)) {
> mas_set_err(mas, -EBUSY);
> return false;
> }
> } else {
> mas_ascend(mas);
> - slot = mas->offset;
> - mt = mte_node_type(mas->node);
> - slot_count = mt_slots[mt] - 1;
> }
> - } while (slot > slot_count);
> + } while (mas->offset >= mas_data_end(mas));
>
> - mas->offset = ++slot;
> + mt = mte_node_type(mas->node);
> pivots = ma_pivots(mas_mn(mas), mt);
> - if (slot > 0)
> - mas->min = pivots[slot - 1] + 1;
> -
> - if (slot <= slot_count)
> - mas->max = pivots[slot];
> + mas->min = pivots[mas->offset] + 1;
> + mas->offset++;
> + if (mas->offset < mt_slots[mt])
> + mas->max = pivots[mas->offset];
There is a bug here, the assignment of mas->min and mas->max is wrong.
The assignment will make them represent the range of a child node, but
it should represent the range of the current node. After mas_ascend()
returns, mas-min and mas->max already represent the range of the current
node, so we should delete these assignments of mas->min and mas->max.
>
> return true;
> }
Sincerely yours,
Peng.
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH 1/2] maple_tree: Fix mas_skip_node() end slot detection
2023-03-07 13:05 ` Peng Zhang
@ 2023-03-07 14:30 ` Snild Dolkow
2023-03-07 16:10 ` Liam R. Howlett
2023-03-07 16:16 ` Peng Zhang
0 siblings, 2 replies; 9+ messages in thread
From: Snild Dolkow @ 2023-03-07 14:30 UTC (permalink / raw)
To: Peng Zhang, Liam R. Howlett
Cc: Stable, maple-tree, linux-mm, linux-kernel, Andrew Morton
On 2023-03-07 14:05, Peng Zhang wrote:
> Hi, Liam,
>> - } while (slot > slot_count);
>> + } while (mas->offset >= mas_data_end(mas));
>> - mas->offset = ++slot;
>> + mt = mte_node_type(mas->node);
>> pivots = ma_pivots(mas_mn(mas), mt);
>> - if (slot > 0)
>> - mas->min = pivots[slot - 1] + 1;
>> -
>> - if (slot <= slot_count)
>> - mas->max = pivots[slot];
>> + mas->min = pivots[mas->offset] + 1;
>> + mas->offset++;
>> + if (mas->offset < mt_slots[mt])
>> + mas->max = pivots[mas->offset];
> There is a bug here, the assignment of mas->min and mas->max is wrong.
> The assignment will make them represent the range of a child node, but
> it should represent the range of the current node. After mas_ascend()
> returns, mas-min and mas->max already represent the range of the current
> node, so we should delete these assignments of mas->min and mas->max.
Thanks for your suggestion, Peng. Applying it literally by removing only
the min/max assignments:
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 6fc1ad42b409..9b6e581cf83f 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -5118,10 +5118,7 @@ static inline bool mas_skip_node
mt = mte_node_type(mas->node);
pivots = ma_pivots(mas_mn(mas), mt);
- mas->min = pivots[mas->offset] + 1;
mas->offset++;
- if (mas->offset < mt_slots[mt])
- mas->max = pivots[mas->offset];
return true;
}
This allowed my test to pass 100/100 runs. Still in qemu with the test
as init, so not really stressed in any way except that specific usecase.
//Snild
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH 1/2] maple_tree: Fix mas_skip_node() end slot detection
2023-03-07 14:30 ` Snild Dolkow
@ 2023-03-07 16:10 ` Liam R. Howlett
2023-03-07 16:16 ` Peng Zhang
1 sibling, 0 replies; 9+ messages in thread
From: Liam R. Howlett @ 2023-03-07 16:10 UTC (permalink / raw)
To: Snild Dolkow
Cc: Peng Zhang, Stable, maple-tree, linux-mm, linux-kernel, Andrew Morton
* Snild Dolkow <snild@sony.com> [230307 09:31]:
> On 2023-03-07 14:05, Peng Zhang wrote:
> > Hi, Liam,
> > > - } while (slot > slot_count);
> > > + } while (mas->offset >= mas_data_end(mas));
> > > - mas->offset = ++slot;
> > > + mt = mte_node_type(mas->node);
> > > pivots = ma_pivots(mas_mn(mas), mt);
> > > - if (slot > 0)
> > > - mas->min = pivots[slot - 1] + 1;
> > > -
> > > - if (slot <= slot_count)
> > > - mas->max = pivots[slot];
> > > + mas->min = pivots[mas->offset] + 1;
> > > + mas->offset++;
> > > + if (mas->offset < mt_slots[mt])
> > > + mas->max = pivots[mas->offset];
> > There is a bug here, the assignment of mas->min and mas->max is wrong.
> > The assignment will make them represent the range of a child node, but
> > it should represent the range of the current node. After mas_ascend()
> > returns, mas-min and mas->max already represent the range of the current
> > node, so we should delete these assignments of mas->min and mas->max.
>
>
> Thanks for your suggestion, Peng. Applying it literally by removing only the
> min/max assignments:
>
> diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> index 6fc1ad42b409..9b6e581cf83f 100644
> --- a/lib/maple_tree.c
> +++ b/lib/maple_tree.c
> @@ -5118,10 +5118,7 @@ static inline bool mas_skip_node
>
> mt = mte_node_type(mas->node);
> pivots = ma_pivots(mas_mn(mas), mt);
> - mas->min = pivots[mas->offset] + 1;
> mas->offset++;
> - if (mas->offset < mt_slots[mt])
> - mas->max = pivots[mas->offset];
>
> return true;
> }
>
>
> This allowed my test to pass 100/100 runs. Still in qemu with the test as
> init, so not really stressed in any way except that specific usecase.
>
Yes, I have a v2 that removes these lines and some extra unused lines.
I've been working on a recreation to add to the testing before I send v2
out. I had 100% success running my testing over the weekend, but I
wanted to have a testcase before sending the change.
Thanks,
Liam
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] maple_tree: Fix mas_skip_node() end slot detection
2023-03-07 14:30 ` Snild Dolkow
2023-03-07 16:10 ` Liam R. Howlett
@ 2023-03-07 16:16 ` Peng Zhang
2023-03-07 16:21 ` Peng Zhang
1 sibling, 1 reply; 9+ messages in thread
From: Peng Zhang @ 2023-03-07 16:16 UTC (permalink / raw)
To: Snild Dolkow, Peng Zhang, Liam R. Howlett
Cc: Stable, maple-tree, linux-mm, linux-kernel, Andrew Morton
Hi, Snild,
在 2023/3/7 22:30, Snild Dolkow 写道:
> On 2023-03-07 14:05, Peng Zhang wrote:
>> Hi, Liam,
>>> - } while (slot > slot_count);
>>> + } while (mas->offset >= mas_data_end(mas));
>>> - mas->offset = ++slot;
>>> + mt = mte_node_type(mas->node);
>>> pivots = ma_pivots(mas_mn(mas), mt);
>>> - if (slot > 0)
>>> - mas->min = pivots[slot - 1] + 1;
>>> -
>>> - if (slot <= slot_count)
>>> - mas->max = pivots[slot];
>>> + mas->min = pivots[mas->offset] + 1;
>>> + mas->offset++;
>>> + if (mas->offset < mt_slots[mt])
>>> + mas->max = pivots[mas->offset];
>> There is a bug here, the assignment of mas->min and mas->max is wrong.
>> The assignment will make them represent the range of a child node,
>> but it should represent the range of the current node. After
>> mas_ascend() returns, mas-min and mas->max already represent the
>> range of the current node, so we should delete these assignments of
>> mas->min and mas->max.
>
>
> Thanks for your suggestion, Peng. Applying it literally by removing
> only the min/max assignments:
>
> diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> index 6fc1ad42b409..9b6e581cf83f 100644
> --- a/lib/maple_tree.c
> +++ b/lib/maple_tree.c
> @@ -5118,10 +5118,7 @@ static inline bool mas_skip_node
>
> mt = mte_node_type(mas->node);
> pivots = ma_pivots(mas_mn(mas), mt);
> - mas->min = pivots[mas->offset] + 1;
> mas->offset++;
> - if (mas->offset < mt_slots[mt])
> - mas->max = pivots[mas->offset];
>
> return true;
> }
>
>
> This allowed my test to pass 100/100 runs. Still in qemu with the test
> as init, so not really stressed in any way except that specific usecase.
>
> //Snild
Thanks for the test, I'm happy if it happens to fix your problem. So a
patch was made.
This patch needs to be applied after Liam's patch.
Sincerely yours,
Peng.
^ permalink raw reply [flat|nested] 9+ messages in thread