linux-mm.kvack.org archive mirror
* [RFC PATCH v2] mm/fake-numa: fix under-allocation detection in uniform split
@ 2026-04-16 10:25 Sang-Heon Jeon
  2026-04-16 10:36 ` Sang-Heon Jeon
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Sang-Heon Jeon @ 2026-04-16 10:25 UTC
  To: akpm, rppt, djbw, mingo
  Cc: linux-mm, Sang-Heon Jeon, Donghyeon Lee, Munhui Chae

When splitting NUMA nodes uniformly, split_nodes_size_interleave_uniform()
returns the next absolute node ID, not the number of nodes created.

The existing under-allocation check compares the next absolute node ID
(ret) with the requested count (n), which only works when nid starts at 0.

For example, on a system with 2 physical NUMA nodes (node 0: 2GB, node
1: 128MB) and numa=fake=8U, 8 fake nodes are successfully created from
node 0 and split_nodes_size_interleave_uniform() returns 8. For node 1,
fake node nid starts at 8, but only 4 fake nodes are created because
FAKE_NODE_MIN_SIZE is currently 32MB, and
split_nodes_size_interleave_uniform() returns 12. With the existing
check, "ret < n" (12 < 8) is false, so the under-allocation is not
detected.

Fix the under-allocation check to compare the number of nodes actually
created (ret - nid) against the requested count (n).

Also, fix the outdated comment to match the actual return value.

Signed-off-by: Sang-Heon Jeon <ekffu200098@gmail.com>
Reported-by: Donghyeon Lee <asd142513@gmail.com>
Reported-by: Munhui Chae <mochae@student.42seoul.kr>
Fixes: cc9aec03e58f ("x86/numa_emulation: Introduce uniform split capability") # 4.19
---
 mm/numa_emulation.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/numa_emulation.c b/mm/numa_emulation.c
index 703c8fa05048..c1d0a76aef64 100644
--- a/mm/numa_emulation.c
+++ b/mm/numa_emulation.c
@@ -214,7 +214,7 @@ static u64 uniform_size(u64 max_addr, u64 base, u64 hole, int nr_nodes)
  * Sets up fake nodes of `size' interleaved over physical nodes ranging from
  * `addr' to `max_addr'.
  *
- * Returns zero on success or negative on error.
+ * Returns absolute node ID on success or negative on error.
  */
 static int __init split_nodes_size_interleave_uniform(struct numa_meminfo *ei,
 					      struct numa_meminfo *pi,
@@ -416,7 +416,7 @@ void __init numa_emulation(struct numa_meminfo *numa_meminfo, int numa_dist_cnt)
 					n, &pi.blk[0], nid);
 			if (ret < 0)
 				break;
-			if (ret < n) {
+			if (ret - nid < n) {
 				pr_info("%s: phys: %d only got %d of %ld nodes, failing\n",
 						__func__, i, ret, n);
 				ret = -1;
-- 
2.43.0
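
The arithmetic in the commit message can be replayed with a small
standalone C sketch. This is not kernel code: max_fake_nodes() and
model_split_ret() are hypothetical helpers, the 32MB minimum is taken
from the commit message, and the model ignores memory holes and
alignment, which the real split function has to handle.

```c
/* Assumed from the commit message: minimum fake node size of 32MB. */
#define FAKE_NODE_MIN_SIZE_MB 32L

/* Hypothetical helper: upper bound on fake nodes that fit in size_mb. */
static long max_fake_nodes(long size_mb)
{
	return size_mb / FAKE_NODE_MIN_SIZE_MB;
}

/*
 * Hypothetical model of the split helper's return value: the next
 * absolute node ID after creating as many of the n requested nodes as fit.
 */
static int model_split_ret(int nid, long n, long size_mb)
{
	long fit = max_fake_nodes(size_mb);
	long created = n < fit ? n : fit;

	return nid + (int)created;
}

/* Check before the patch: compares an absolute node ID with a count. */
static int old_check_fails(int ret, long n)
{
	return ret < n;
}

/* Check after the patch: compares the created count with the request. */
static int new_check_fails(int ret, int nid, long n)
{
	return ret - nid < n;
}
```

With n = 8, the model returns 8 for the 2GB node and 12 for the 128MB
node starting at nid 8. The old predicate then evaluates 12 < 8 and
reports nothing, while the patched one evaluates 4 < 8 and reports the
under-allocation.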




* Re: [RFC PATCH v2] mm/fake-numa: fix under-allocation detection in uniform split
  2026-04-16 10:25 [RFC PATCH v2] mm/fake-numa: fix under-allocation detection in uniform split Sang-Heon Jeon
@ 2026-04-16 10:36 ` Sang-Heon Jeon
  2026-04-16 14:29 ` Mike Rapoport
  2026-04-16 14:36 ` Mike Rapoport
  2 siblings, 0 replies; 8+ messages in thread
From: Sang-Heon Jeon @ 2026-04-16 10:36 UTC
  To: akpm, rppt, djbw, mingo; +Cc: linux-mm, Donghyeon Lee, Munhui Chae

Hello,

The changelog was accidentally omitted from the v2 patch, so I am adding
it here.

---
Changes from v1 [1]
- Merge the patchset into one patch.
- Change base from linux-next to mm-unstable

[1] https://lore.kernel.org/all/20260413154438.396031-1-ekffu200098@gmail.com/
---

Best Regards,
Sang-Heon Jeon



* Re: [RFC PATCH v2] mm/fake-numa: fix under-allocation detection in uniform split
  2026-04-16 10:25 [RFC PATCH v2] mm/fake-numa: fix under-allocation detection in uniform split Sang-Heon Jeon
  2026-04-16 10:36 ` Sang-Heon Jeon
@ 2026-04-16 14:29 ` Mike Rapoport
  2026-04-16 15:10   ` Sang-Heon Jeon
  2026-04-16 14:36 ` Mike Rapoport
  2 siblings, 1 reply; 8+ messages in thread
From: Mike Rapoport @ 2026-04-16 14:29 UTC
  To: Sang-Heon Jeon; +Cc: akpm, djbw, mingo, linux-mm, Donghyeon Lee, Munhui Chae

Hi,

On Thu, Apr 16, 2026 at 07:25:58PM +0900, Sang-Heon Jeon wrote:
> Subject: [RFC PATCH v2] mm/fake-numa: fix under-allocation detection in uniform split

RFC in the subject means you don't intend for the patch to be included, but
rather want to share it to get early feedback.

I believe this is not the case here :)

-- 
Sincerely yours,
Mike.



* Re: [RFC PATCH v2] mm/fake-numa: fix under-allocation detection in uniform split
  2026-04-16 10:25 [RFC PATCH v2] mm/fake-numa: fix under-allocation detection in uniform split Sang-Heon Jeon
  2026-04-16 10:36 ` Sang-Heon Jeon
  2026-04-16 14:29 ` Mike Rapoport
@ 2026-04-16 14:36 ` Mike Rapoport
  2026-04-16 15:25   ` Sang-Heon Jeon
  2 siblings, 1 reply; 8+ messages in thread
From: Mike Rapoport @ 2026-04-16 14:36 UTC
  To: Sang-Heon Jeon; +Cc: akpm, djbw, mingo, linux-mm, Donghyeon Lee, Munhui Chae

On Thu, Apr 16, 2026 at 07:25:58PM +0900, Sang-Heon Jeon wrote:
> When split NUMA node uniformly, split_nodes_size_interleave_uniform()
> returns the next absolute node ID, not the number of nodes created.
> 
> The existing under-allocation detection logic compares next absolute node
> ID (ret) and request count (n), which only works when nid starts at 0.
> 
> For example, on a system with 2 physical NUMA nodes (node 0: 2GB, node
> 1: 128MB) and numa=fake=8U, 8 fake nodes are successfully created from
> node 0 and split_nodes_size_interleave_uniform() returns 8. For node 1,
> fake node nid starts at 8, but only 4 fake nodes are created due to
> current FAKE_NODE_MIN_SIZE being 32MB, and
> split_nodes_size_interleave_uniform() returns 12. By existing
> under-allocation detection logic, "ret < n" (12 < 8) is false, so the
> under-allocation will not be detected.
> 
> Fix under-allocation detection logic to compare the number of actually
> created nodes (ret - nid) against the request count (n).
> 
> Also, fix the outdated comment to match the actual return value.
> 
> Signed-off-by: Sang-Heon Jeon <ekffu200098@gmail.com>
> Reported-by: Donghyeon Lee <asd142513@gmail.com>
> Reported-by: Munhui Chae <mochae@student.42seoul.kr>
> Fixes: cc9aec03e58f ("x86/numa_emulation: Introduce uniform split capability") # 4.19
> ---
>  mm/numa_emulation.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/numa_emulation.c b/mm/numa_emulation.c
> index 703c8fa05048..c1d0a76aef64 100644
> --- a/mm/numa_emulation.c
> +++ b/mm/numa_emulation.c
> @@ -214,7 +214,7 @@ static u64 uniform_size(u64 max_addr, u64 base, u64 hole, int nr_nodes)
>   * Sets up fake nodes of `size' interleaved over physical nodes ranging from
>   * `addr' to `max_addr'.
>   *
> - * Returns zero on success or negative on error.
> + * Returns absolute node ID on success or negative on error.
>   */
>  static int __init split_nodes_size_interleave_uniform(struct numa_meminfo *ei,
>  					      struct numa_meminfo *pi,
> @@ -416,7 +416,7 @@ void __init numa_emulation(struct numa_meminfo *numa_meminfo, int numa_dist_cnt)
>  					n, &pi.blk[0], nid);
>  			if (ret < 0)
>  				break;
> -			if (ret < n) {
> +			if (ret - nid < n) {
>  				pr_info("%s: phys: %d only got %d of %ld nodes, failing\n",
>  						__func__, i, ret, n);

The error message also should be updated, now it prints the last node ID
rather than number of created nodes. I think it's worse creating a
temporary variable for ret - nid to make the code clearer.

I'd also recommend running qemu without and with your patch and verifying
it works as intended.

>  				ret = -1;
> -- 
> 2.43.0
> 

-- 
Sincerely yours,
Mike.
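
The review suggestion above (a temporary variable for ret - nid, and a
message that prints the created count) might look roughly like the
following userspace sketch. report_under_allocation() is a hypothetical
helper written for illustration; it is not the code from any later
version of the patch, and it uses snprintf() where the kernel uses
pr_info().

```c
#include <stdio.h>

/*
 * Hypothetical helper: formats the under-allocation message using the
 * number of nodes created rather than the raw return value.
 * Returns 1 if the split under-allocated, 0 otherwise.
 */
static int report_under_allocation(char *buf, size_t len, int phys,
				   int ret, int nid, long n)
{
	int created = ret - nid;	/* temporary variable from the review */

	if (created >= n) {
		if (len)
			buf[0] = '\0';	/* nothing to report */
		return 0;
	}

	snprintf(buf, len, "%s: phys: %d only got %d of %ld nodes, failing",
		 "numa_emulation", phys, created, n);
	return 1;
}
```

For phys 1 with ret = 8, nid = 5 and n = 5, this formats "only got 3 of
5 nodes" instead of printing the absolute node ID 8.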



* Re: [RFC PATCH v2] mm/fake-numa: fix under-allocation detection in uniform split
  2026-04-16 14:29 ` Mike Rapoport
@ 2026-04-16 15:10   ` Sang-Heon Jeon
  0 siblings, 0 replies; 8+ messages in thread
From: Sang-Heon Jeon @ 2026-04-16 15:10 UTC
  To: Mike Rapoport; +Cc: akpm, djbw, mingo, linux-mm, Donghyeon Lee, Munhui Chae

Hello,

On Thu, Apr 16, 2026 at 11:29 PM Mike Rapoport <rppt@kernel.org> wrote:
>
> Hi,
>
> On Thu, Apr 16, 2026 at 07:25:58PM +0900, Sang-Heon Jeon wrote:
> > Subject: [RFC PATCH v2] mm/fake-numa: fix under-allocation detection in uniform split
>
> RFC in the subject means you don't intend for the patch to be included, but
> rather want to share it to get early feedback.
>
> I believe this is not the case here :)

You're right. When I sent the v1 patch, I wanted to check whether the
approach was correct.
Anyway, there seem to be no disagreements on that part now, so I'll
remove RFC in the next version.
I really appreciate your attention :)

Best Regards,
Sang-Heon Jeon



* Re: [RFC PATCH v2] mm/fake-numa: fix under-allocation detection in uniform split
  2026-04-16 14:36 ` Mike Rapoport
@ 2026-04-16 15:25   ` Sang-Heon Jeon
  2026-04-16 18:25     ` Donghyeon Lee
  2026-04-16 18:45     ` Mike Rapoport
  0 siblings, 2 replies; 8+ messages in thread
From: Sang-Heon Jeon @ 2026-04-16 15:25 UTC
  To: Mike Rapoport; +Cc: akpm, djbw, mingo, linux-mm, Donghyeon Lee, Munhui Chae

On Thu, Apr 16, 2026 at 11:36 PM Mike Rapoport <rppt@kernel.org> wrote:
>
> On Thu, Apr 16, 2026 at 07:25:58PM +0900, Sang-Heon Jeon wrote:
> > When split NUMA node uniformly, split_nodes_size_interleave_uniform()
> > returns the next absolute node ID, not the number of nodes created.
> >
> > The existing under-allocation detection logic compares next absolute node
> > ID (ret) and request count (n), which only works when nid starts at 0.
> >
> > For example, on a system with 2 physical NUMA nodes (node 0: 2GB, node
> > 1: 128MB) and numa=fake=8U, 8 fake nodes are successfully created from
> > node 0 and split_nodes_size_interleave_uniform() returns 8. For node 1,
> > fake node nid starts at 8, but only 4 fake nodes are created due to
> > current FAKE_NODE_MIN_SIZE being 32MB, and
> > split_nodes_size_interleave_uniform() returns 12. By existing
> > under-allocation detection logic, "ret < n" (12 < 8) is false, so the
> > under-allocation will not be detected.
> >
> > Fix under-allocation detection logic to compare the number of actually
> > created nodes (ret - nid) against the request count (n).
> >
> > Also, fix the outdated comment to match the actual return value.
> >
> > Signed-off-by: Sang-Heon Jeon <ekffu200098@gmail.com>
> > Reported-by: Donghyeon Lee <asd142513@gmail.com>
> > Reported-by: Munhui Chae <mochae@student.42seoul.kr>
> > Fixes: cc9aec03e58f ("x86/numa_emulation: Introduce uniform split capability") # 4.19
> > ---
> >  mm/numa_emulation.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/numa_emulation.c b/mm/numa_emulation.c
> > index 703c8fa05048..c1d0a76aef64 100644
> > --- a/mm/numa_emulation.c
> > +++ b/mm/numa_emulation.c
> > @@ -214,7 +214,7 @@ static u64 uniform_size(u64 max_addr, u64 base, u64 hole, int nr_nodes)
> >   * Sets up fake nodes of `size' interleaved over physical nodes ranging from
> >   * `addr' to `max_addr'.
> >   *
> > - * Returns zero on success or negative on error.
> > + * Returns absolute node ID on success or negative on error.
> >   */
> >  static int __init split_nodes_size_interleave_uniform(struct numa_meminfo *ei,
> >                                             struct numa_meminfo *pi,
> > @@ -416,7 +416,7 @@ void __init numa_emulation(struct numa_meminfo *numa_meminfo, int numa_dist_cnt)
> >                                       n, &pi.blk[0], nid);
> >                       if (ret < 0)
> >                               break;
> > -                     if (ret < n) {
> > +                     if (ret - nid < n) {
> >                               pr_info("%s: phys: %d only got %d of %ld nodes, failing\n",
> >                                               __func__, i, ret, n);
>
> The error message also should be updated, now it prints the last node ID
> rather than number of created nodes. I think it's worse creating a
> temporary variable for ret - nid to make the code clearer.

Is "worse" a typo for "worth"?

And thanks for catching that. I totally agree that the error message
needs to be updated.

> I'd also recommend running qemu without and with your patch and verifying
> it works as intended.

In the next version, I'll try to include QEMU-based test results.

> >                               ret = -1;
> > --
> > 2.43.0
> >
>
> --
> Sincerely yours,
> Mike.

Best Regards,
Sang-Heon Jeon



* Re: [RFC PATCH v2] mm/fake-numa: fix under-allocation detection in uniform split
  2026-04-16 15:25   ` Sang-Heon Jeon
@ 2026-04-16 18:25     ` Donghyeon Lee
  2026-04-16 18:45     ` Mike Rapoport
  1 sibling, 0 replies; 8+ messages in thread
From: Donghyeon Lee @ 2026-04-16 18:25 UTC
  To: Sang-Heon Jeon; +Cc: Mike Rapoport, akpm, djbw, mingo, linux-mm, Munhui Chae

Hello,

Please note that this message was written with the help of machine
translation.

Below are the QEMU-based before/after test results for the patch.

The test scenarios were as follows:

1. Test with the unpatched kernel.
   Assign 128MB to NUMA node0 and 2GB to node1, then split each
   physical node into 5 uniform fake nodes.

   Since node0 does not have enough memory to be split into 5 parts,
   this case is expected to fail.

2. The reverse of case 1.
   Assign 2GB to node0 and 128MB to node1, and again attempt to split
   both into 5 parts.

   node1 is expected to fail the 5-way split, but since its fake node
   IDs start from 5, the failure goes undetected and fewer than the
   intended 10 nodes are created.

3. Test with the patched kernel under the same setup as case 2.

   Since node1 fails the 5-way split, NUMA emulation is expected to be
   disabled.

- TEST 1 ----------------------------------------------------------------

qemu-system-x86_64 \
        -drive file=disk.qcow2,if=virtio,index=0,media=disk,format=qcow2 \
        -bios /usr/share/ovmf/x64/OVMF.4m.fd -nographic -s \
        -machine hmat=on \
        -m 2176M,slots=2,maxmem=4G \
        -object memory-backend-ram,size=128M,id=m0 \
        -object memory-backend-ram,size=2G,id=m1 \
        -numa node,nodeid=0,memdev=m0 \
        -numa node,nodeid=1,memdev=m1 \
        -smp 2,sockets=2,maxcpus=2 \
        -numa cpu,node-id=0,socket-id=0 \
        -numa cpu,node-id=1,socket-id=1

~ # cat /proc/cmdline
\bzImage console=ttyS0 root=/dev/vda2 rw nokaslr norandmaps numa=fake=5U

~ # ls -d /sys/devices/system/node/node*
/sys/devices/system/node/node0  /sys/devices/system/node/node1

~ # cat /sys/devices/system/node/node?/meminfo | grep MemTotal
Node 0 MemTotal:          83024 kB
Node 1 MemTotal:        1992536 kB

~ # dmesg | grep failing
[    0.037511] numa_emulation: phys: 0 only got 4 of 5 nodes, failing

-------------------------------------------------------------------------

As expected, the NUMA nodes were not split.


- TEST 2 ----------------------------------------------------------------

qemu-system-x86_64 \
        -drive file=disk.qcow2,if=virtio,index=0,media=disk,format=qcow2 \
        -bios /usr/share/ovmf/x64/OVMF.4m.fd -nographic -s \
        -machine hmat=on \
        -m 2176M,slots=2,maxmem=4G \
        -object memory-backend-ram,size=2G,id=m0 \
        -object memory-backend-ram,size=128M,id=m1 \
        -numa node,nodeid=0,memdev=m0 \
        -numa node,nodeid=1,memdev=m1 \
        -smp 2,sockets=2,maxcpus=2 \
        -numa cpu,node-id=0,socket-id=0 \
        -numa cpu,node-id=1,socket-id=1

~ # cat /proc/cmdline
\bzImage console=ttyS0 root=/dev/vda2 rw nokaslr norandmaps numa=fake=5U

~ # ls -d /sys/devices/system/node/node*
/sys/devices/system/node/node0  /sys/devices/system/node/node4
/sys/devices/system/node/node1  /sys/devices/system/node/node5
/sys/devices/system/node/node2  /sys/devices/system/node/node6
/sys/devices/system/node/node3  /sys/devices/system/node/node7

~ # cat /sys/devices/system/node/node?/meminfo | grep MemTotal
Node 0 MemTotal:         371796 kB
Node 1 MemTotal:         419824 kB
Node 2 MemTotal:         419824 kB
Node 3 MemTotal:         417776 kB
Node 4 MemTotal:         323564 kB
Node 5 MemTotal:          30496 kB
Node 6 MemTotal:          32752 kB
Node 7 MemTotal:          59432 kB

-------------------------------------------------------------------------

As expected, although node1 failed the 5-way split, the overall
emulation did not fail. Instead, it resulted in a partial/incomplete
split.


- TEST 3 ----------------------------------------------------------------

qemu-system-x86_64 \
        -drive file=disk.qcow2,if=virtio,index=0,media=disk,format=qcow2 \
        -bios /usr/share/ovmf/x64/OVMF.4m.fd -nographic -s \
        -machine hmat=on \
        -m 2176M,slots=2,maxmem=4G \
        -object memory-backend-ram,size=2G,id=m0 \
        -object memory-backend-ram,size=128M,id=m1 \
        -numa node,nodeid=0,memdev=m0 \
        -numa node,nodeid=1,memdev=m1 \
        -smp 2,sockets=2,maxcpus=2 \
        -numa cpu,node-id=0,socket-id=0 \
        -numa cpu,node-id=1,socket-id=1

~ # cat /proc/cmdline
\bzImage console=ttyS0 root=/dev/vda2 rw nokaslr norandmaps numa=fake=5U

~ # ls -d /sys/devices/system/node/node*
/sys/devices/system/node/node0  /sys/devices/system/node/node1

~ # cat /sys/devices/system/node/node?/meminfo | grep MemTotal
Node 0 MemTotal:        1952844 kB
Node 1 MemTotal:         122712 kB

~ # dmesg | grep failing
[    0.036879] numa_emulation: phys: 1 only got 8 of 5 nodes, failing

-------------------------------------------------------------------------

With the patch applied, the emulation now fails as intended.

The log message is still from before the logging fix, so it prints the
odd message saying it "only got 8 of 5 nodes" before failing :)

Best Regards,
Donghyeon Lee
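
The three runs above can be replayed through both checks with a short
table-driven C sketch. The (nid, ret) pairs are inferred from the logs
and node listings in this message (for example, nid = 5 and ret = 8 for
node1 in tests 2 and 3), so they are assumptions rather than values
printed by the kernel; struct split_result and first_flagged() are
hypothetical names.

```c
#include <stddef.h>

/* One call to the split helper, as inferred from the QEMU runs. */
struct split_result {
	int nid;	/* first fake node ID handed to the call */
	int ret;	/* returned value: next absolute node ID */
};

/* Check used by the unpatched kernel. */
static int old_detects(struct split_result r, long n)
{
	return r.ret < n;
}

/* Check proposed by the patch. */
static int new_detects(struct split_result r, long n)
{
	return r.ret - r.nid < n;
}

/* Index of the first physical node flagged by the check, or -1 if none. */
static int first_flagged(const struct split_result *runs, size_t count,
			 long n, int (*detects)(struct split_result, long))
{
	for (size_t i = 0; i < count; i++)
		if (detects(runs[i], n))
			return (int)i;
	return -1;
}
```

Feeding in { {0, 4} } for test 1 flags phys 0 under both checks, which
is why the unpatched kernel already failed there. Feeding in
{ {0, 5}, {5, 8} } for tests 2 and 3 flags nothing under the old check
and phys 1 under the new one.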




* Re: [RFC PATCH v2] mm/fake-numa: fix under-allocation detection in uniform split
  2026-04-16 15:25   ` Sang-Heon Jeon
  2026-04-16 18:25     ` Donghyeon Lee
@ 2026-04-16 18:45     ` Mike Rapoport
  1 sibling, 0 replies; 8+ messages in thread
From: Mike Rapoport @ 2026-04-16 18:45 UTC
  To: Sang-Heon Jeon; +Cc: akpm, djbw, mingo, linux-mm, Donghyeon Lee, Munhui Chae

On Fri, Apr 17, 2026 at 12:25:54AM +0900, Sang-Heon Jeon wrote:
> On Thu, Apr 16, 2026 at 11:36 PM Mike Rapoport <rppt@kernel.org> wrote:
> >
> > On Thu, Apr 16, 2026 at 07:25:58PM +0900, Sang-Heon Jeon wrote:
> > > When split NUMA node uniformly, split_nodes_size_interleave_uniform()
> > > returns the next absolute node ID, not the number of nodes created.
> > >
> > > The existing under-allocation detection logic compares next absolute node
> > > ID (ret) and request count (n), which only works when nid starts at 0.
> > >
> > > For example, on a system with 2 physical NUMA nodes (node 0: 2GB, node
> > > 1: 128MB) and numa=fake=8U, 8 fake nodes are successfully created from
> > > node 0 and split_nodes_size_interleave_uniform() returns 8. For node 1,
> > > fake node nid starts at 8, but only 4 fake nodes are created due to
> > > current FAKE_NODE_MIN_SIZE being 32MB, and
> > > split_nodes_size_interleave_uniform() returns 12. By existing
> > > under-allocation detection logic, "ret < n" (12 < 8) is false, so the
> > > under-allocation will not be detected.
> > >
> > > Fix under-allocation detection logic to compare the number of actually
> > > created nodes (ret - nid) against the request count (n).
> > >
> > > Also, fix the outdated comment to match the actual return value.
> > >
> > > Signed-off-by: Sang-Heon Jeon <ekffu200098@gmail.com>
> > > Reported-by: Donghyeon Lee <asd142513@gmail.com>
> > > Reported-by: Munhui Chae <mochae@student.42seoul.kr>
> > > Fixes: cc9aec03e58f ("x86/numa_emulation: Introduce uniform split capability") # 4.19
> > > ---
> > >  mm/numa_emulation.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/mm/numa_emulation.c b/mm/numa_emulation.c
> > > index 703c8fa05048..c1d0a76aef64 100644
> > > --- a/mm/numa_emulation.c
> > > +++ b/mm/numa_emulation.c
> > > @@ -214,7 +214,7 @@ static u64 uniform_size(u64 max_addr, u64 base, u64 hole, int nr_nodes)
> > >   * Sets up fake nodes of `size' interleaved over physical nodes ranging from
> > >   * `addr' to `max_addr'.
> > >   *
> > > - * Returns zero on success or negative on error.
> > > + * Returns absolute node ID on success or negative on error.
> > >   */
> > >  static int __init split_nodes_size_interleave_uniform(struct numa_meminfo *ei,
> > >                                             struct numa_meminfo *pi,
> > > @@ -416,7 +416,7 @@ void __init numa_emulation(struct numa_meminfo *numa_meminfo, int numa_dist_cnt)
> > >                                       n, &pi.blk[0], nid);
> > >                       if (ret < 0)
> > >                               break;
> > > -                     if (ret < n) {
> > > +                     if (ret - nid < n) {
> > >                               pr_info("%s: phys: %d only got %d of %ld nodes, failing\n",
> > >                                               __func__, i, ret, n);
> >
> > The error message also should be updated, now it prints the last node ID
> > rather than number of created nodes. I think it's worse creating a
> > temporary variable for ret - nid to make the code clearer.
> 
> Is "worse" a typo of "worth"?

Yes, my bad :)
 
> And thanks for catching. I totally agree that the error message needs
> to be updated.
> 
> > I'd also recommend running qemu without and with your patch and verifying
> > it works as intended.
> 
> In the next version, I'll try to include qemu based test results.

Thanks!
 
> Best Regards,
> Sang-Heon Jeon

-- 
Sincerely yours,
Mike.


