linux-mm.kvack.org archive mirror
* [PATCH v4 0/5] Fix and cleanups to xarray
@ 2024-12-18 15:46 Kemeng Shi
  2024-12-18 15:46 ` [PATCH v4 1/5] Xarray: Do not return sibling entries from xas_find_marked() Kemeng Shi
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Kemeng Shi @ 2024-12-18 15:46 UTC (permalink / raw)
  To: akpm, willy; +Cc: linux-kernel, linux-fsdevel, linux-mm, linux-nfs

v3->v4:
-Correct the changelog in patch 1 by using nfs as the low-level filesystem
in the example of how the bug could be triggered in theory.

v2->v3:
-Describe the impact of the fixed issues in the changelogs.

v1->v2:
-Drop patch "Xarray: skip unneeded xas_store() and xas_clear_mark() in
__xa_alloc()"

This series contains some random fixes and cleanups to xarray. Patches 1-2
are fixes and patches 3-5 are cleanups. More details can be found in the
respective patches. Thanks!

Kemeng Shi (5):
  Xarray: Do not return sibling entries from xas_find_marked()
  Xarray: move forward index correctly in xas_pause()
  Xarray: distinguish large entries correctly in xas_split_alloc()
  Xarray: remove repeat check in xas_squash_marks()
  Xarray: use xa_mark_t in xas_squash_marks() to keep code consistent

 lib/test_xarray.c                     | 35 +++++++++++++++++++++++++++
 lib/xarray.c                          | 26 +++++++++++---------
 tools/testing/radix-tree/multiorder.c |  4 +++
 3 files changed, 54 insertions(+), 11 deletions(-)

-- 
2.30.0




* [PATCH v4 1/5] Xarray: Do not return sibling entries from xas_find_marked()
  2024-12-18 15:46 [PATCH v4 0/5] Fix and cleanups to xarray Kemeng Shi
@ 2024-12-18 15:46 ` Kemeng Shi
  2024-12-18 15:46 ` [PATCH v4 2/5] Xarray: move forward index correctly in xas_pause() Kemeng Shi
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Kemeng Shi @ 2024-12-18 15:46 UTC (permalink / raw)
  To: akpm, willy; +Cc: linux-kernel, linux-fsdevel, linux-mm, linux-nfs

Similar to the issue fixed in commit cbc02854331ed ("XArray: Do not return
sibling entries from xa_load()"), we may return sibling entries from
xas_find_marked() as follows:
    Thread A:               Thread B:
                            xa_store_range(xa, entry, 6, 7, gfp);
                            xa_set_mark(xa, 6, mark);
    XA_STATE(xas, xa, 6);
    xas_find_marked(&xas, 7, mark);
    offset = xas_find_chunk(xas, advance, mark);
    [offset is 6 which points to a valid entry]
                            xa_store_range(xa, entry, 4, 7, gfp);
    entry = xa_entry(xa, node, 6);
    [entry is a sibling of 4]
    if (!xa_is_node(entry))
        return entry;

Skip sibling entries as xas_find() does, so that callers never see a
sibling entry from xas_find_marked(). Otherwise a caller may treat the
sibling entry as a valid entry and crash the kernel.
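
The effect of the added check can be illustrated with a small userspace
model. This is only a sketch, not the kernel code: the toy_* names, the
8-slot node and the single mark bitmap are assumptions made for the
example. Only the sibling encoding (an internal entry with low bits 10
that names the canonical slot) follows include/linux/xarray.h.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_CHUNK_SIZE 8	/* assumed small node size, illustration only */

/* Internal entries carry the low bits 10, mirroring xa_mk_internal(). */
static void *toy_mk_internal(unsigned long v)
{
	return (void *)((v << 2) | 2);
}

static int toy_is_internal(const void *entry)
{
	return ((uintptr_t)entry & 3) == 2;
}

/* A sibling entry is an internal entry naming the canonical slot offset. */
static void *toy_mk_sibling(unsigned int offset)
{
	return toy_mk_internal(offset);
}

static int toy_is_sibling(const void *entry)
{
	return toy_is_internal(entry) &&
	       (uintptr_t)entry < (uintptr_t)toy_mk_sibling(TOY_CHUNK_SIZE - 1);
}

struct toy_node {
	void *slots[TOY_CHUNK_SIZE];
	unsigned long marks;		/* one mark bit per slot */
};

/* Scan marked slots from @start; optionally skip sibling entries. */
static void *toy_find_marked(const struct toy_node *node, unsigned int start,
			     int skip_siblings)
{
	unsigned int i;

	for (i = start; i < TOY_CHUNK_SIZE; i++) {
		void *entry = node->slots[i];

		if (!(node->marks & (1UL << i)) || !entry)
			continue;
		if (skip_siblings && toy_is_sibling(entry))
			continue;
		return entry;
	}
	return NULL;
}

/*
 * Model the state a racing reader may observe: the mark bit for slot 6 was
 * seen while slot 6 held a real entry, but by the time the slot is read it
 * has become a sibling of slot 4. Returns 1 if a sibling leaks to the caller.
 */
static int toy_race_leaks_sibling(int skip_siblings)
{
	static int item;
	struct toy_node node = { .marks = 1UL << 6 };
	void *found;

	node.slots[4] = &item;
	node.slots[5] = toy_mk_sibling(4);
	node.slots[6] = toy_mk_sibling(4);
	node.slots[7] = toy_mk_sibling(4);

	found = toy_find_marked(&node, 6, skip_siblings);
	return found != NULL && toy_is_sibling(found);
}
```

In this model, toy_race_leaks_sibling(0) reports that a sibling entry
reaches the caller, while toy_race_leaks_sibling(1) does not, which is the
behaviour the two added lines in xas_find_marked() are meant to give.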

In addition, the load_race() test is modified to catch this issue; the
modified load_race() only passes once this fix is applied.

Here is an example of how this bug could be triggered in theory in nfs,
which enables large folios in the mapping.
Let's take a look at the code paths involved in the race:
1. How pages could be created and dirtied in nfs.
write
 ksys_write
  vfs_write
   new_sync_write
    nfs_file_write
     generic_perform_write
      nfs_write_begin
       fgf_set_order
        __filemap_get_folio
      nfs_write_end
       nfs_update_folio
        nfs_writepage_setup
	 nfs_mark_request_dirty
	  filemap_dirty_folio
	   __folio_mark_dirty
	    __xa_set_mark

2. How dirty pages could be deleted in nfs.
ioctl
 do_vfs_ioctl
  file_ioctl
   ioctl_preallocate
    vfs_fallocate
     nfs42_fallocate
      nfs42_proc_deallocate
       truncate_pagecache_range
        truncate_inode_pages_range
	 truncate_inode_folio
	  filemap_remove_folio
	   page_cache_delete
	    xas_store(&xas, NULL);

3. How dirty pages could be searched locklessly.
sync_file_range
 ksys_sync_file_range
  __filemap_fdatawrite_range
   filemap_fdatawrite_wbc
    do_writepages
     writeback_use_writepage
      writeback_iter
       writeback_get_folio
        filemap_get_folios_tag
         find_get_entry
          folio = xas_find_marked()
          folio_try_get(folio)

In theory, the kernel will crash as follows:
1.Create               2.Search             3.Delete
/* write page 2,3 */
write
 ...
  nfs_write_begin
   fgf_set_order
   __filemap_get_folio
    ...
     /* index = 2, order = 1 */
     xa_store(&xas, folio)
  nfs_write_end
   ...
    __folio_mark_dirty

                       /* sync page 2 and page 3 */
                       sync_file_range
                        ...
                         find_get_entry
                          folio = xas_find_marked()
                          /* offset will be 2 */
                          offset = xas_find_chunk()

                                             /* delete page 2 and page 3 */
                                             ioctl
                                              ...
                                               xas_store(&xas, NULL);

/* write page 0-3 */
write
 ...
  nfs_write_begin
   fgf_set_order
   __filemap_get_folio
    ...
     /* index = 0, order = 2 */
     xa_store(&xas, folio)
  nfs_write_end
   ...
    __folio_mark_dirty

                          /* get sibling entry from offset 2 */
                          entry = xa_entry(.., 2)
                          /* use sibling entry as folio and crash kernel */
                          folio_try_get(folio)

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 lib/xarray.c                          | 2 ++
 tools/testing/radix-tree/multiorder.c | 4 ++++
 2 files changed, 6 insertions(+)

diff --git a/lib/xarray.c b/lib/xarray.c
index 32d4bac8c94c..fa87949719a0 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -1382,6 +1382,8 @@ void *xas_find_marked(struct xa_state *xas, unsigned long max, xa_mark_t mark)
 		entry = xa_entry(xas->xa, xas->xa_node, xas->xa_offset);
 		if (!entry && !(xa_track_free(xas->xa) && mark == XA_FREE_MARK))
 			continue;
+		if (xa_is_sibling(entry))
+			continue;
 		if (!xa_is_node(entry))
 			return entry;
 		xas->xa_node = xa_to_node(entry);
diff --git a/tools/testing/radix-tree/multiorder.c b/tools/testing/radix-tree/multiorder.c
index cffaf2245d4f..eaff1b036989 100644
--- a/tools/testing/radix-tree/multiorder.c
+++ b/tools/testing/radix-tree/multiorder.c
@@ -227,6 +227,7 @@ static void *load_creator(void *ptr)
 			unsigned long index = (3 << RADIX_TREE_MAP_SHIFT) -
 						(1 << order);
 			item_insert_order(tree, index, order);
+			xa_set_mark(tree, index, XA_MARK_1);
 			item_delete_rcu(tree, index);
 		}
 	}
@@ -242,8 +243,11 @@ static void *load_worker(void *ptr)
 
 	rcu_register_thread();
 	while (!stop_iteration) {
+		unsigned long find_index = (2 << RADIX_TREE_MAP_SHIFT) + 1;
 		struct item *item = xa_load(ptr, index);
 		assert(!xa_is_internal(item));
+		item = xa_find(ptr, &find_index, index, XA_MARK_1);
+		assert(!xa_is_internal(item));
 	}
 	rcu_unregister_thread();
 
-- 
2.30.0




* [PATCH v4 2/5] Xarray: move forward index correctly in xas_pause()
  2024-12-18 15:46 [PATCH v4 0/5] Fix and cleanups to xarray Kemeng Shi
  2024-12-18 15:46 ` [PATCH v4 1/5] Xarray: Do not return sibling entries from xas_find_marked() Kemeng Shi
@ 2024-12-18 15:46 ` Kemeng Shi
  2025-01-27 16:21   ` Geert Uytterhoeven
  2024-12-18 15:46 ` [PATCH v4 3/5] Xarray: distinguish large entries correctly in xas_split_alloc() Kemeng Shi
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Kemeng Shi @ 2024-12-18 15:46 UTC (permalink / raw)
  To: akpm, willy; +Cc: linux-kernel, linux-fsdevel, linux-mm, linux-nfs

After xas_load(), xas->xa_index could point to the middle of the found
multi-index entry, so the bits of xas->xa_index below node->shift may be
non-zero. A subsequent xas_pause() then advances xas->xa_index by a
multiple of (1 << node->shift) without masking those low bits, and thus
skips some indices unexpectedly.

Consider the following case:
Assume XA_CHUNK_SHIFT is 4.
xa_store_range(xa, 16, 31, ...)
xa_store(xa, 32, ...)
XA_STATE(xas, xa, 17);
xas_for_each(&xas,...)
xas_load(&xas)
/* xas->xa_index = 17, xas->xa_offset = 1, xas->xa_node->shift = 4 */
xas_pause()
/* xas->xa_index = 33, xas->xa_offset = 2, xas->xa_node->shift = 4 */
As we can see, index 32 is skipped unexpectedly.

Fix this by masking the bits below node->shift when advancing the index in
xas_pause().
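
The arithmetic can be checked in isolation. The helper below is a
hypothetical userspace sketch (toy_pause_advance() does not exist in the
kernel); it only reproduces the two statements that update xas->xa_index
in xas_pause(), with the mask made optional so both behaviours can be
compared.

```c
/*
 * Advance @index from slot @cur_offset to slot @next_offset of a node with
 * the given @shift. With @mask_low_bits set, the bits below @shift are
 * cleared first, as the fix does; without it they are carried along.
 */
static unsigned long toy_pause_advance(unsigned long index, unsigned int shift,
				       unsigned int cur_offset,
				       unsigned int next_offset,
				       int mask_low_bits)
{
	if (mask_low_bits)
		index &= ~0UL << shift;
	index += (unsigned long)(next_offset - cur_offset) << shift;
	return index;
}
```

With the values from the case above, toy_pause_advance(17, 4, 1, 2, 0)
yields 33, while toy_pause_advance(17, 4, 1, 2, 1) yields 32, so the entry
stored at index 32 is no longer skipped.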

For now, this does not cause serious problems. Only minor problems could
happen, such as cachestat reporting fewer page statuses than expected.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 lib/test_xarray.c | 35 +++++++++++++++++++++++++++++++++++
 lib/xarray.c      |  1 +
 2 files changed, 36 insertions(+)

diff --git a/lib/test_xarray.c b/lib/test_xarray.c
index d5c5cbba33ed..6932a26f4927 100644
--- a/lib/test_xarray.c
+++ b/lib/test_xarray.c
@@ -1448,6 +1448,41 @@ static noinline void check_pause(struct xarray *xa)
 	XA_BUG_ON(xa, count != order_limit);
 
 	xa_destroy(xa);
+
+	index = 0;
+	for (order = XA_CHUNK_SHIFT; order > 0; order--) {
+		XA_BUG_ON(xa, xa_store_order(xa, index, order,
+					xa_mk_index(index), GFP_KERNEL));
+		index += 1UL << order;
+	}
+
+	index = 0;
+	count = 0;
+	xas_set(&xas, 0);
+	rcu_read_lock();
+	xas_for_each(&xas, entry, ULONG_MAX) {
+		XA_BUG_ON(xa, entry != xa_mk_index(index));
+		index += 1UL << (XA_CHUNK_SHIFT - count);
+		count++;
+	}
+	rcu_read_unlock();
+	XA_BUG_ON(xa, count != XA_CHUNK_SHIFT);
+
+	index = 0;
+	count = 0;
+	xas_set(&xas, XA_CHUNK_SIZE / 2 + 1);
+	rcu_read_lock();
+	xas_for_each(&xas, entry, ULONG_MAX) {
+		XA_BUG_ON(xa, entry != xa_mk_index(index));
+		index += 1UL << (XA_CHUNK_SHIFT - count);
+		count++;
+		xas_pause(&xas);
+	}
+	rcu_read_unlock();
+	XA_BUG_ON(xa, count != XA_CHUNK_SHIFT);
+
+	xa_destroy(xa);
+
 }
 
 static noinline void check_move_tiny(struct xarray *xa)
diff --git a/lib/xarray.c b/lib/xarray.c
index fa87949719a0..d0732c5b8403 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -1147,6 +1147,7 @@ void xas_pause(struct xa_state *xas)
 			if (!xa_is_sibling(xa_entry(xas->xa, node, offset)))
 				break;
 		}
+		xas->xa_index &= ~0UL << node->shift;
 		xas->xa_index += (offset - xas->xa_offset) << node->shift;
 		if (xas->xa_index == 0)
 			xas->xa_node = XAS_BOUNDS;
-- 
2.30.0




* [PATCH v4 3/5] Xarray: distinguish large entries correctly in xas_split_alloc()
  2024-12-18 15:46 [PATCH v4 0/5] Fix and cleanups to xarray Kemeng Shi
  2024-12-18 15:46 ` [PATCH v4 1/5] Xarray: Do not return sibling entries from xas_find_marked() Kemeng Shi
  2024-12-18 15:46 ` [PATCH v4 2/5] Xarray: move forward index correctly in xas_pause() Kemeng Shi
@ 2024-12-18 15:46 ` Kemeng Shi
  2024-12-18 15:46 ` [PATCH v4 4/5] Xarray: remove repeat check in xas_squash_marks() Kemeng Shi
  2024-12-18 15:46 ` [PATCH v4 5/5] Xarray: use xa_mark_t in xas_squash_marks() to keep code consistent Kemeng Shi
  4 siblings, 0 replies; 9+ messages in thread
From: Kemeng Shi @ 2024-12-18 15:46 UTC (permalink / raw)
  To: akpm, willy; +Cc: linux-kernel, linux-fsdevel, linux-mm, linux-nfs

Splitting does not support large entries that need two or more additional
levels of xa_node. In the case "xas->xa_shift + 2 * XA_CHUNK_SHIFT == order",
two levels of xa_node are also needed, so treat the entry as a large entry
in that case as well.

As the max order of a folio in the pagecache (MAX_PAGECACHE_ORDER) is <=
(XA_CHUNK_SHIFT * 2 - 1), this change is more of a cleanup...
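
The boundary case can be seen by evaluating the old and new conditions
side by side. This is a sketch under stated assumptions: the toy_* names
are made up for the example, and TOY_CHUNK_SHIFT is fixed at 6 (the value
reported in the m68k debug output later in this thread).

```c
#define TOY_CHUNK_SHIFT 6	/* assumed value of XA_CHUNK_SHIFT */

/* Condition before this patch: equality is not treated as "too large". */
static int toy_old_rejects(unsigned int xa_shift, unsigned int order)
{
	return xa_shift + 2 * TOY_CHUNK_SHIFT < order;
}

/* Condition after this patch: the equality case is rejected as well. */
static int toy_new_rejects(unsigned int xa_shift, unsigned int order)
{
	return xa_shift + 2 * TOY_CHUNK_SHIFT <= order;
}
```

For xa_shift = 0 and order = 12, toy_old_rejects() returns 0 and
toy_new_rejects() returns 1. For order = 11, which is
2 * TOY_CHUNK_SHIFT - 1, neither condition rejects the split.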

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 lib/xarray.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/xarray.c b/lib/xarray.c
index d0732c5b8403..3fac3f2cea9d 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -1022,7 +1022,7 @@ void xas_split_alloc(struct xa_state *xas, void *entry, unsigned int order,
 	unsigned int mask = xas->xa_sibs;
 
 	/* XXX: no support for splitting really large entries yet */
-	if (WARN_ON(xas->xa_shift + 2 * XA_CHUNK_SHIFT < order))
+	if (WARN_ON(xas->xa_shift + 2 * XA_CHUNK_SHIFT <= order))
 		goto nomem;
 	if (xas->xa_shift + XA_CHUNK_SHIFT > order)
 		return;
-- 
2.30.0




* [PATCH v4 4/5] Xarray: remove repeat check in xas_squash_marks()
  2024-12-18 15:46 [PATCH v4 0/5] Fix and cleanups to xarray Kemeng Shi
                   ` (2 preceding siblings ...)
  2024-12-18 15:46 ` [PATCH v4 3/5] Xarray: distinguish large entries correctly in xas_split_alloc() Kemeng Shi
@ 2024-12-18 15:46 ` Kemeng Shi
  2024-12-18 15:46 ` [PATCH v4 5/5] Xarray: use xa_mark_t in xas_squash_marks() to keep code consistent Kemeng Shi
  4 siblings, 0 replies; 9+ messages in thread
From: Kemeng Shi @ 2024-12-18 15:46 UTC (permalink / raw)
  To: akpm, willy; +Cc: linux-kernel, linux-fsdevel, linux-mm, linux-nfs

The caller of xas_squash_marks() has already ensured that xas->xa_sibs is
non-zero. Remove the repeated check of xas->xa_sibs in xas_squash_marks().

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 lib/xarray.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/lib/xarray.c b/lib/xarray.c
index 3fac3f2cea9d..4231af284bd8 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -128,9 +128,6 @@ static void xas_squash_marks(const struct xa_state *xas)
 	unsigned int mark = 0;
 	unsigned int limit = xas->xa_offset + xas->xa_sibs + 1;
 
-	if (!xas->xa_sibs)
-		return;
-
 	do {
 		unsigned long *marks = xas->xa_node->marks[mark];
 		if (find_next_bit(marks, limit, xas->xa_offset + 1) == limit)
-- 
2.30.0




* [PATCH v4 5/5] Xarray: use xa_mark_t in xas_squash_marks() to keep code consistent
  2024-12-18 15:46 [PATCH v4 0/5] Fix and cleanups to xarray Kemeng Shi
                   ` (3 preceding siblings ...)
  2024-12-18 15:46 ` [PATCH v4 4/5] Xarray: remove repeat check in xas_squash_marks() Kemeng Shi
@ 2024-12-18 15:46 ` Kemeng Shi
  4 siblings, 0 replies; 9+ messages in thread
From: Kemeng Shi @ 2024-12-18 15:46 UTC (permalink / raw)
  To: akpm, willy; +Cc: linux-kernel, linux-fsdevel, linux-mm, linux-nfs

Except for xas_squash_marks(), all functions use the xa_mark_t type to
iterate over all possible marks. Use xa_mark_t in xas_squash_marks() as well
to keep the code consistent.
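
The loop shape used by the other functions can be sketched in userspace as
follows. The names are hypothetical: toy_squash_marks() works on plain
unsigned long bitmaps instead of node_marks(), TOY_MARK_MAX stands in for
XA_MARK_MAX, and sibs is assumed to be smaller than the number of bits in
an unsigned long.

```c
#define TOY_MARK_MAX 2u	/* stands in for XA_MARK_MAX (marks 0, 1 and 2) */

/*
 * For each mark, if any sibling slot (offset + 1 .. offset + sibs) has the
 * mark set, move it to the canonical slot at @offset and clear the siblings.
 */
static void toy_squash_marks(unsigned long marks[TOY_MARK_MAX + 1],
			     unsigned int offset, unsigned int sibs)
{
	unsigned long sib_mask = ((1UL << sibs) - 1) << (offset + 1);
	unsigned int mark = 0;

	for (;;) {
		if (marks[mark] & sib_mask) {
			marks[mark] |= 1UL << offset;
			marks[mark] &= ~sib_mask;
		}
		/* Test for the last mark before incrementing past it. */
		if (mark == TOY_MARK_MAX)
			break;
		mark++;
	}
}
```

Checking for the last mark before the increment lets the loop cover the
inclusive range of marks without ever producing a value beyond the
maximum, which is what makes this form usable with a dedicated mark type.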

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 lib/xarray.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/lib/xarray.c b/lib/xarray.c
index 4231af284bd8..a74795911f1c 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -125,16 +125,20 @@ static inline void node_mark_all(struct xa_node *node, xa_mark_t mark)
  */
 static void xas_squash_marks(const struct xa_state *xas)
 {
-	unsigned int mark = 0;
+	xa_mark_t mark = 0;
 	unsigned int limit = xas->xa_offset + xas->xa_sibs + 1;
 
-	do {
-		unsigned long *marks = xas->xa_node->marks[mark];
-		if (find_next_bit(marks, limit, xas->xa_offset + 1) == limit)
-			continue;
-		__set_bit(xas->xa_offset, marks);
-		bitmap_clear(marks, xas->xa_offset + 1, xas->xa_sibs);
-	} while (mark++ != (__force unsigned)XA_MARK_MAX);
+	for (;;) {
+		unsigned long *marks = node_marks(xas->xa_node, mark);
+
+		if (find_next_bit(marks, limit, xas->xa_offset + 1) != limit) {
+			__set_bit(xas->xa_offset, marks);
+			bitmap_clear(marks, xas->xa_offset + 1, xas->xa_sibs);
+		}
+		if (mark == XA_MARK_MAX)
+			break;
+		mark_inc(mark);
+	}
 }
 
 /* extracts the offset within this node from the index */
-- 
2.30.0




* Re: [PATCH v4 2/5] Xarray: move forward index correctly in xas_pause()
  2024-12-18 15:46 ` [PATCH v4 2/5] Xarray: move forward index correctly in xas_pause() Kemeng Shi
@ 2025-01-27 16:21   ` Geert Uytterhoeven
  2025-02-06  6:13     ` Kemeng Shi
  0 siblings, 1 reply; 9+ messages in thread
From: Geert Uytterhoeven @ 2025-01-27 16:21 UTC (permalink / raw)
  To: Kemeng Shi
  Cc: akpm, willy, linux-kernel, linux-fsdevel, linux-mm, linux-nfs,
	linux-m68k

Hi Kemeng,

On Wed, 18 Dec 2024 at 07:58, Kemeng Shi <shikemeng@huaweicloud.com> wrote:
> After xas_load(), xas->index could point to mid of found multi-index entry
> and xas->index's bits under node->shift maybe non-zero. The afterward
> xas_pause() will move forward xas->index with xa->node->shift with bits
> under node->shift un-masked and thus skip some index unexpectedly.
>
> Consider following case:
> Assume XA_CHUNK_SHIFT is 4.
> xa_store_range(xa, 16, 31, ...)
> xa_store(xa, 32, ...)
> XA_STATE(xas, xa, 17);
> xas_for_each(&xas,...)
> xas_load(&xas)
> /* xas->index = 17, xas->xa_offset = 1, xas->xa_node->xa_shift = 4 */
> xas_pause()
> /* xas->index = 33, xas->xa_offset = 2, xas->xa_node->xa_shift = 4 */
> As we can see, index of 32 is skipped unexpectedly.
>
> Fix this by mask bit under node->xa_shift when move forward index in
> xas_pause().
>
> For now, this will not cause serious problems. Only minor problem
> like cachestat return less number of page status could happen.
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>

Thanks for your patch, which is now commit c9ba5249ef8b080c ("Xarray:
move forward index correctly in xas_pause()") upstream.

> --- a/lib/test_xarray.c
> +++ b/lib/test_xarray.c
> @@ -1448,6 +1448,41 @@ static noinline void check_pause(struct xarray *xa)
>         XA_BUG_ON(xa, count != order_limit);
>
>         xa_destroy(xa);
> +
> +       index = 0;
> +       for (order = XA_CHUNK_SHIFT; order > 0; order--) {
> +               XA_BUG_ON(xa, xa_store_order(xa, index, order,
> +                                       xa_mk_index(index), GFP_KERNEL));
> +               index += 1UL << order;
> +       }
> +
> +       index = 0;
> +       count = 0;
> +       xas_set(&xas, 0);
> +       rcu_read_lock();
> +       xas_for_each(&xas, entry, ULONG_MAX) {
> +               XA_BUG_ON(xa, entry != xa_mk_index(index));
> +               index += 1UL << (XA_CHUNK_SHIFT - count);
> +               count++;
> +       }
> +       rcu_read_unlock();
> +       XA_BUG_ON(xa, count != XA_CHUNK_SHIFT);
> +
> +       index = 0;
> +       count = 0;
> +       xas_set(&xas, XA_CHUNK_SIZE / 2 + 1);
> +       rcu_read_lock();
> +       xas_for_each(&xas, entry, ULONG_MAX) {
> +               XA_BUG_ON(xa, entry != xa_mk_index(index));
> +               index += 1UL << (XA_CHUNK_SHIFT - count);
> +               count++;
> +               xas_pause(&xas);
> +       }
> +       rcu_read_unlock();
> +       XA_BUG_ON(xa, count != XA_CHUNK_SHIFT);
> +
> +       xa_destroy(xa);
> +
>  }

On m68k, the last four XA_BUG_ON() checks above are triggered when
running the test.  With extra debug prints added:

    entry = 00000002 xa_mk_index(index) = 000000c1
    entry = 00000002 xa_mk_index(index) = 000000e1
    entry = 00000002 xa_mk_index(index) = 000000f1
    ...
    entry = 000000e2 xa_mk_index(index) = fffff0ff
    entry = 000000f9 xa_mk_index(index) = fffff8ff
    entry = 000000f2 xa_mk_index(index) = fffffcff
    count = 63 XA_CHUNK_SHIFT = 6
    entry = 00000081 xa_mk_index(index) = 00000001
    entry = 00000002 xa_mk_index(index) = 00000081
    entry = 00000002 xa_mk_index(index) = 000000c1
    ...
    entry = 000000e2 xa_mk_index(index) = ffffe0ff
    entry = 000000f9 xa_mk_index(index) = fffff0ff
    entry = 000000f2 xa_mk_index(index) = fffff8ff
     count = 62 XA_CHUNK_SHIFT = 6

On arm32, the test succeeds, so it's probably not a 32-vs-64-bit issue.
Perhaps a big-endian or alignment issue (alignof(int/long) = 2)?

> --- a/lib/xarray.c
> +++ b/lib/xarray.c
> @@ -1147,6 +1147,7 @@ void xas_pause(struct xa_state *xas)
>                         if (!xa_is_sibling(xa_entry(xas->xa, node, offset)))
>                                 break;
>                 }
> +               xas->xa_index &= ~0UL << node->shift;
>                 xas->xa_index += (offset - xas->xa_offset) << node->shift;
>                 if (xas->xa_index == 0)
>                         xas->xa_node = XAS_BOUNDS;

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds



* Re: [PATCH v4 2/5] Xarray: move forward index correctly in xas_pause()
  2025-01-27 16:21   ` Geert Uytterhoeven
@ 2025-02-06  6:13     ` Kemeng Shi
  2025-02-06  7:34       ` Geert Uytterhoeven
  0 siblings, 1 reply; 9+ messages in thread
From: Kemeng Shi @ 2025-02-06  6:13 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: akpm, willy, linux-kernel, linux-fsdevel, linux-mm, linux-nfs,
	linux-m68k



on 1/28/2025 12:21 AM, Geert Uytterhoeven wrote:
> Hi Kemeng,
> 
> On Wed, 18 Dec 2024 at 07:58, Kemeng Shi <shikemeng@huaweicloud.com> wrote:
>> After xas_load(), xas->index could point to mid of found multi-index entry
>> and xas->index's bits under node->shift maybe non-zero. The afterward
>> xas_pause() will move forward xas->index with xa->node->shift with bits
>> under node->shift un-masked and thus skip some index unexpectedly.
>>
>> Consider following case:
>> Assume XA_CHUNK_SHIFT is 4.
>> xa_store_range(xa, 16, 31, ...)
>> xa_store(xa, 32, ...)
>> XA_STATE(xas, xa, 17);
>> xas_for_each(&xas,...)
>> xas_load(&xas)
>> /* xas->index = 17, xas->xa_offset = 1, xas->xa_node->xa_shift = 4 */
>> xas_pause()
>> /* xas->index = 33, xas->xa_offset = 2, xas->xa_node->xa_shift = 4 */
>> As we can see, index of 32 is skipped unexpectedly.
>>
>> Fix this by mask bit under node->xa_shift when move forward index in
>> xas_pause().
>>
>> For now, this will not cause serious problems. Only minor problem
>> like cachestat return less number of page status could happen.
>>
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> 
> Thanks for your patch, which is now commit c9ba5249ef8b080c ("Xarray:
> move forward index correctly in xas_pause()") upstream.
> 
>> --- a/lib/test_xarray.c
>> +++ b/lib/test_xarray.c
>> @@ -1448,6 +1448,41 @@ static noinline void check_pause(struct xarray *xa)
>>         XA_BUG_ON(xa, count != order_limit);
>>
>>         xa_destroy(xa);
>> +
>> +       index = 0;
>> +       for (order = XA_CHUNK_SHIFT; order > 0; order--) {
>> +               XA_BUG_ON(xa, xa_store_order(xa, index, order,
>> +                                       xa_mk_index(index), GFP_KERNEL));
>> +               index += 1UL << order;
>> +       }
>> +
>> +       index = 0;
>> +       count = 0;
>> +       xas_set(&xas, 0);
>> +       rcu_read_lock();
>> +       xas_for_each(&xas, entry, ULONG_MAX) {
>> +               XA_BUG_ON(xa, entry != xa_mk_index(index));
>> +               index += 1UL << (XA_CHUNK_SHIFT - count);
>> +               count++;
>> +       }
>> +       rcu_read_unlock();
>> +       XA_BUG_ON(xa, count != XA_CHUNK_SHIFT);
>> +
>> +       index = 0;
>> +       count = 0;
>> +       xas_set(&xas, XA_CHUNK_SIZE / 2 + 1);
>> +       rcu_read_lock();
>> +       xas_for_each(&xas, entry, ULONG_MAX) {
>> +               XA_BUG_ON(xa, entry != xa_mk_index(index));
>> +               index += 1UL << (XA_CHUNK_SHIFT - count);
>> +               count++;
>> +               xas_pause(&xas);
>> +       }
>> +       rcu_read_unlock();
>> +       XA_BUG_ON(xa, count != XA_CHUNK_SHIFT);
>> +
>> +       xa_destroy(xa);
>> +
>>  }
> 
> On m68k, the last four XA_BUG_ON() checks above are triggered when
> running the test.  With extra debug prints added:
> 
>     entry = 00000002 xa_mk_index(index) = 000000c1
>     entry = 00000002 xa_mk_index(index) = 000000e1
>     entry = 00000002 xa_mk_index(index) = 000000f1
>     ...
>     entry = 000000e2 xa_mk_index(index) = fffff0ff
>     entry = 000000f9 xa_mk_index(index) = fffff8ff
>     entry = 000000f2 xa_mk_index(index) = fffffcff
>     count = 63 XA_CHUNK_SHIFT = 6
>     entry = 00000081 xa_mk_index(index) = 00000001
>     entry = 00000002 xa_mk_index(index) = 00000081
>     entry = 00000002 xa_mk_index(index) = 000000c1
>     ...
>     entry = 000000e2 xa_mk_index(index) = ffffe0ff
>     entry = 000000f9 xa_mk_index(index) = fffff0ff
>     entry = 000000f2 xa_mk_index(index) = fffff8ff
>      count = 62 XA_CHUNK_SHIFT = 6
> 
> On arm32, the test succeeds, so it's probably not a 32-vs-64-bit issue.
> Perhaps a big-endian or alignment issue (alignof(int/long) = 2)?
Hi Geert,
Sorry for the late reply. After checking the debug info and the code, I think
the test fails because CONFIG_XARRAY_MULTI is off. I guess
CONFIG_XARRAY_MULTI is on for arm32 and off for m68k, so the test results
differ. Luckily it's only a problem in the test code.
I will send a patch to correct the test code soon. Thanks!

Kemeng

> 
>> --- a/lib/xarray.c
>> +++ b/lib/xarray.c
>> @@ -1147,6 +1147,7 @@ void xas_pause(struct xa_state *xas)
>>                         if (!xa_is_sibling(xa_entry(xas->xa, node, offset)))
>>                                 break;
>>                 }
>> +               xas->xa_index &= ~0UL << node->shift;
>>                 xas->xa_index += (offset - xas->xa_offset) << node->shift;
>>                 if (xas->xa_index == 0)
>>                         xas->xa_node = XAS_BOUNDS;
> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
> 




* Re: [PATCH v4 2/5] Xarray: move forward index correctly in xas_pause()
  2025-02-06  6:13     ` Kemeng Shi
@ 2025-02-06  7:34       ` Geert Uytterhoeven
  0 siblings, 0 replies; 9+ messages in thread
From: Geert Uytterhoeven @ 2025-02-06  7:34 UTC (permalink / raw)
  To: Kemeng Shi
  Cc: akpm, willy, linux-kernel, linux-fsdevel, linux-mm, linux-nfs,
	linux-m68k

Hi Kemeng,

On Thu, 6 Feb 2025 at 07:13, Kemeng Shi <shikemeng@huaweicloud.com> wrote:
> on 1/28/2025 12:21 AM, Geert Uytterhoeven wrote:
> > On Wed, 18 Dec 2024 at 07:58, Kemeng Shi <shikemeng@huaweicloud.com> wrote:
> >> After xas_load(), xas->index could point to mid of found multi-index entry
> >> and xas->index's bits under node->shift maybe non-zero. The afterward
> >> xas_pause() will move forward xas->index with xa->node->shift with bits
> >> under node->shift un-masked and thus skip some index unexpectedly.
> >>
> >> Consider following case:
> >> Assume XA_CHUNK_SHIFT is 4.
> >> xa_store_range(xa, 16, 31, ...)
> >> xa_store(xa, 32, ...)
> >> XA_STATE(xas, xa, 17);
> >> xas_for_each(&xas,...)
> >> xas_load(&xas)
> >> /* xas->index = 17, xas->xa_offset = 1, xas->xa_node->xa_shift = 4 */
> >> xas_pause()
> >> /* xas->index = 33, xas->xa_offset = 2, xas->xa_node->xa_shift = 4 */
> >> As we can see, index of 32 is skipped unexpectedly.
> >>
> >> Fix this by mask bit under node->xa_shift when move forward index in
> >> xas_pause().
> >>
> >> For now, this will not cause serious problems. Only minor problem
> >> like cachestat return less number of page status could happen.
> >>
> >> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> >
> > Thanks for your patch, which is now commit c9ba5249ef8b080c ("Xarray:
> > move forward index correctly in xas_pause()") upstream.
> >
> >> --- a/lib/test_xarray.c
> >> +++ b/lib/test_xarray.c
> >> @@ -1448,6 +1448,41 @@ static noinline void check_pause(struct xarray *xa)
> >>         XA_BUG_ON(xa, count != order_limit);
> >>
> >>         xa_destroy(xa);
> >> +
> >> +       index = 0;
> >> +       for (order = XA_CHUNK_SHIFT; order > 0; order--) {
> >> +               XA_BUG_ON(xa, xa_store_order(xa, index, order,
> >> +                                       xa_mk_index(index), GFP_KERNEL));
> >> +               index += 1UL << order;
> >> +       }
> >> +
> >> +       index = 0;
> >> +       count = 0;
> >> +       xas_set(&xas, 0);
> >> +       rcu_read_lock();
> >> +       xas_for_each(&xas, entry, ULONG_MAX) {
> >> +               XA_BUG_ON(xa, entry != xa_mk_index(index));
> >> +               index += 1UL << (XA_CHUNK_SHIFT - count);
> >> +               count++;
> >> +       }
> >> +       rcu_read_unlock();
> >> +       XA_BUG_ON(xa, count != XA_CHUNK_SHIFT);
> >> +
> >> +       index = 0;
> >> +       count = 0;
> >> +       xas_set(&xas, XA_CHUNK_SIZE / 2 + 1);
> >> +       rcu_read_lock();
> >> +       xas_for_each(&xas, entry, ULONG_MAX) {
> >> +               XA_BUG_ON(xa, entry != xa_mk_index(index));
> >> +               index += 1UL << (XA_CHUNK_SHIFT - count);
> >> +               count++;
> >> +               xas_pause(&xas);
> >> +       }
> >> +       rcu_read_unlock();
> >> +       XA_BUG_ON(xa, count != XA_CHUNK_SHIFT);
> >> +
> >> +       xa_destroy(xa);
> >> +
> >>  }
> >
> > On m68k, the last four XA_BUG_ON() checks above are triggered when
> > running the test.  With extra debug prints added:
> >
> >     entry = 00000002 xa_mk_index(index) = 000000c1
> >     entry = 00000002 xa_mk_index(index) = 000000e1
> >     entry = 00000002 xa_mk_index(index) = 000000f1
> >     ...
> >     entry = 000000e2 xa_mk_index(index) = fffff0ff
> >     entry = 000000f9 xa_mk_index(index) = fffff8ff
> >     entry = 000000f2 xa_mk_index(index) = fffffcff
> >     count = 63 XA_CHUNK_SHIFT = 6
> >     entry = 00000081 xa_mk_index(index) = 00000001
> >     entry = 00000002 xa_mk_index(index) = 00000081
> >     entry = 00000002 xa_mk_index(index) = 000000c1
> >     ...
> >     entry = 000000e2 xa_mk_index(index) = ffffe0ff
> >     entry = 000000f9 xa_mk_index(index) = fffff0ff
> >     entry = 000000f2 xa_mk_index(index) = fffff8ff
> >      count = 62 XA_CHUNK_SHIFT = 6
> >
> > On arm32, the test succeeds, so it's probably not a 32-vs-64-bit issue.
> > Perhaps a big-endian or alignment issue (alignof(int/long) = 2)?
> Hi Geert,
> Sorry for late reply. After check the debug info and the code, I think
> the test is failed because CONFIG_XARRAY_MULTI is off. I guess
> CONFIG_XARRAY_MULTI is on arm32 and is off on m68k so the test result
> diffs. Luckly it's only a problem of of test code.
> I will send patch to correct the test code soon. Thanks!

You are right: CONFIG_XARRAY_MULTI is enabled in my arm32 build,
but not in my m68k build.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds



end of thread, other threads:[~2025-02-06  7:34 UTC | newest]

Thread overview: 9+ messages
2024-12-18 15:46 [PATCH v4 0/5] Fix and cleanups to xarray Kemeng Shi
2024-12-18 15:46 ` [PATCH v4 1/5] Xarray: Do not return sibling entries from xas_find_marked() Kemeng Shi
2024-12-18 15:46 ` [PATCH v4 2/5] Xarray: move forward index correctly in xas_pause() Kemeng Shi
2025-01-27 16:21   ` Geert Uytterhoeven
2025-02-06  6:13     ` Kemeng Shi
2025-02-06  7:34       ` Geert Uytterhoeven
2024-12-18 15:46 ` [PATCH v4 3/5] Xarray: distinguish large entries correctly in xas_split_alloc() Kemeng Shi
2024-12-18 15:46 ` [PATCH v4 4/5] Xarray: remove repeat check in xas_squash_marks() Kemeng Shi
2024-12-18 15:46 ` [PATCH v4 5/5] Xarray: use xa_mark_t in xas_squash_marks() to keep code consistent Kemeng Shi
