linux-mm.kvack.org archive mirror
* [PATCH] mm/zone_device: Do not touch device folio after calling ->folio_free()
@ 2026-04-10 23:03 Matthew Brost
  2026-04-10 23:26 ` Matthew Brost
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Matthew Brost @ 2026-04-10 23:03 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: David Hildenbrand, Oscar Salvador, Andrew Morton, Balbir Singh,
	linux-mm, linux-cxl, linux-kernel

The contents of a device folio can immediately change after calling
->folio_free(), as the folio may be reallocated by a driver with a
different order. Instead of touching the folio again to extract the
pgmap, use the local stack variable when calling percpu_ref_put_many().

Cc: David Hildenbrand <david@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: linux-mm@kvack.org
Cc: linux-cxl@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Fixes: d245f9b4ab80 ("mm/zone_device: support large zone device private folios")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>

---
Stack trace:

[  631.875165] [IGT] xe_exec_system_allocator: starting subtest threads-many-new-prefetch
[  632.282992] Oops: general protection fault, probably for non-canonical address 0x900000000000000: 0000 [#1] SMP NOPTI
[  632.293469] CPU: 8 UID: 0 PID: 59267 Comm: xe_exec_system_ Not tainted 7.0.0-rc7-xe+ #281 PREEMPT(full)
[  632.316023] RIP: 0010:free_zone_device_folio+0x149/0x240
[  632.339782] RSP: 0000:ffffc90023d1fd00 EFLAGS: 00010206
[  632.344947] RAX: 0900000000000000 RBX: 0000000000000001 RCX: 0000000094472d4d
[  632.351991] RDX: ffffffff8155c76f RSI: 000000006f2213bf RDI: 000000008e84943a
[  632.359042] RBP: ffffea0ff4030001 R08: 0000000000000000 R09: 0000000000000001
[  632.366094] R10: 0000000000000028 R11: 0000000000000000 R12: ffff88811828e400
[  632.373145] R13: 0000000000000000 R14: 000fffffc0000000 R15: 0000000000100073
[  632.380194] FS:  00007f2f0fdfe6c0(0000) GS:ffff88890a7e7000(0000) knlGS:0000000000000000
[  632.388186] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  632.393870] CR2: 00007f2f002e90f8 CR3: 0000000106708002 CR4: 0000000000f70ef0
[  632.400919] PKRU: 55555554
[  632.403605] Call Trace:
[  632.406039]  <TASK>
[  632.408131]  do_swap_page+0x146d/0x18c0
[  632.411938]  ? __pte_offset_map+0x3e/0x190
[  632.415994]  __handle_mm_fault+0x6e8/0x8d0
[  632.420053]  handle_mm_fault+0xbf/0x250
[  632.423855]  ? lock_mm_and_find_vma+0x41/0x6f0
[  632.428256]  do_user_addr_fault+0x168/0x690
[  632.432399]  exc_page_fault+0x74/0x200
[  632.436117]  asm_exc_page_fault+0x26/0x30
[  632.440092] RIP: 0033:0x5587554ff70d
[  632.462142] RSP: 002b:00007f2f0fdfc970 EFLAGS: 00010246
[  632.467308] RAX: 0000000000003fc0 RBX: 00007f2f082e1fc0 RCX: 00007f2f12b3287d
[  632.474355] RDX: 0000000000000000 RSI: 00000000c048644a RDI: 0000000000000003
[  632.481404] RBP: 00007f2f082e1fc0 R08: 00007f2f0fdfc958 R09: 0000000000000066
[  632.488450] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
[  632.495495] R13: 00007f2f082de000 R14: 0000000000c00002 R15: 00007f2f1319e000
[  632.502547]  </TASK>
---
 mm/memremap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memremap.c b/mm/memremap.c
index ac7be07e3361..053842d45cb1 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -454,7 +454,7 @@ void free_zone_device_folio(struct folio *folio)
 		if (WARN_ON_ONCE(!pgmap->ops || !pgmap->ops->folio_free))
 			break;
 		pgmap->ops->folio_free(folio);
-		percpu_ref_put_many(&folio->pgmap->ref, nr);
+		percpu_ref_put_many(&pgmap->ref, nr);
 		break;
 
 	case MEMORY_DEVICE_GENERIC:
-- 
2.34.1
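
The ordering problem described in the commit message can be illustrated with a small userspace sketch. This is not kernel code: `struct owner`, `struct object`, `object_free()`, `release_object()`, and `demo_release()` are hypothetical stand-ins for the pgmap, the device folio, ->folio_free(), and free_zone_device_folio(). The sketch assumes the free callback may recycle the object immediately, and shows why the owner pointer has to be read into a local before the callback runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dev_pagemap: holds a reference count. */
struct owner {
	long refs;
};

/* Hypothetical stand-in for a device folio: points back at its owner. */
struct object {
	struct owner *owner;
	unsigned int order;
};

/*
 * Stand-in for ->folio_free(): models a driver that recycles the object
 * straight away, so its fields no longer describe the old allocation.
 */
static void object_free(struct object *obj)
{
	obj->owner = NULL;
	obj->order = 0;
}

/* Mirrors the fixed ordering: snapshot what is needed, then free. */
static void release_object(struct object *obj)
{
	struct owner *owner = obj->owner;	/* read before the callback */
	long nr = 1L << obj->order;		/* likewise for the page count */

	object_free(obj);
	/* Reading obj->owner here would follow the clobbered pointer. */
	owner->refs -= nr;
}

/* Usage: returns the owner's remaining reference count. */
static long demo_release(void)
{
	/* One initial reference plus four for an order-2 object. */
	struct owner pool = { .refs = 1 + 4 };
	struct object obj = { .owner = &pool, .order = 2 };

	release_object(&obj);
	assert(obj.owner == NULL);	/* the object really was recycled */
	return pool.refs;
}
```

Calling `demo_release()` from a `main()` exercises the path. Swapping the local `owner` for `obj->owner` after `object_free()` turns the final decrement into a NULL dereference in this model, which is the same class of bug as the oops above.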




* Re: [PATCH] mm/zone_device: Do not touch device folio after calling ->folio_free()
  2026-04-10 23:03 [PATCH] mm/zone_device: Do not touch device folio after calling ->folio_free() Matthew Brost
@ 2026-04-10 23:26 ` Matthew Brost
  2026-04-12  1:32 ` Balbir Singh
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Matthew Brost @ 2026-04-10 23:26 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: David Hildenbrand, Oscar Salvador, Andrew Morton, Balbir Singh,
	linux-mm, linux-cxl, linux-kernel

On Fri, Apr 10, 2026 at 04:03:46PM -0700, Matthew Brost wrote:
> The contents of a device folio can immediately change after calling
> ->folio_free(), as the folio may be reallocated by a driver with a
> different order. Instead of touching the folio again to extract the
> pgmap, use the local stack variable when calling percpu_ref_put_many().
> 
> Cc: David Hildenbrand <david@kernel.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Balbir Singh <balbirs@nvidia.com>
> Cc: linux-mm@kvack.org
> Cc: linux-cxl@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org

Cc: <stable@vger.kernel.org>

> Fixes: d245f9b4ab80 ("mm/zone_device: support large zone device private folios")
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> 
> ---
>  mm/memremap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memremap.c b/mm/memremap.c
> index ac7be07e3361..053842d45cb1 100644
> --- a/mm/memremap.c
> +++ b/mm/memremap.c
> @@ -454,7 +454,7 @@ void free_zone_device_folio(struct folio *folio)
>  		if (WARN_ON_ONCE(!pgmap->ops || !pgmap->ops->folio_free))
>  			break;
>  		pgmap->ops->folio_free(folio);
> -		percpu_ref_put_many(&folio->pgmap->ref, nr);
> +		percpu_ref_put_many(&pgmap->ref, nr);
>  		break;
>  
>  	case MEMORY_DEVICE_GENERIC:
> -- 
> 2.34.1
> 



* Re: [PATCH] mm/zone_device: Do not touch device folio after calling ->folio_free()
  2026-04-10 23:03 [PATCH] mm/zone_device: Do not touch device folio after calling ->folio_free() Matthew Brost
  2026-04-10 23:26 ` Matthew Brost
@ 2026-04-12  1:32 ` Balbir Singh
  2026-04-12  4:39 ` Vishal Moola
  2026-04-13  4:06 ` Alistair Popple
  3 siblings, 0 replies; 7+ messages in thread
From: Balbir Singh @ 2026-04-12  1:32 UTC (permalink / raw)
  To: Matthew Brost, intel-xe, dri-devel
  Cc: David Hildenbrand, Oscar Salvador, Andrew Morton, linux-mm,
	linux-cxl, linux-kernel

On 4/11/26 09:03, Matthew Brost wrote:
> The contents of a device folio can immediately change after calling
> ->folio_free(), as the folio may be reallocated by a driver with a
> different order. Instead of touching the folio again to extract the
> pgmap, use the local stack variable when calling percpu_ref_put_many().
> 
> Cc: David Hildenbrand <david@kernel.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Balbir Singh <balbirs@nvidia.com>
> Cc: linux-mm@kvack.org
> Cc: linux-cxl@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Fixes: d245f9b4ab80 ("mm/zone_device: support large zone device private folios")
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> 
> ---
>  mm/memremap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memremap.c b/mm/memremap.c
> index ac7be07e3361..053842d45cb1 100644
> --- a/mm/memremap.c
> +++ b/mm/memremap.c
> @@ -454,7 +454,7 @@ void free_zone_device_folio(struct folio *folio)
>  		if (WARN_ON_ONCE(!pgmap->ops || !pgmap->ops->folio_free))
>  			break;
>  		pgmap->ops->folio_free(folio);
> -		percpu_ref_put_many(&folio->pgmap->ref, nr);
> +		percpu_ref_put_many(&pgmap->ref, nr);
>  		break;
>  
>  	case MEMORY_DEVICE_GENERIC:


Reviewed-by: Balbir Singh <balbirs@nvidia.com>



* Re: [PATCH] mm/zone_device: Do not touch device folio after calling ->folio_free()
  2026-04-10 23:03 [PATCH] mm/zone_device: Do not touch device folio after calling ->folio_free() Matthew Brost
  2026-04-10 23:26 ` Matthew Brost
  2026-04-12  1:32 ` Balbir Singh
@ 2026-04-12  4:39 ` Vishal Moola
  2026-04-13  4:06 ` Alistair Popple
  3 siblings, 0 replies; 7+ messages in thread
From: Vishal Moola @ 2026-04-12  4:39 UTC (permalink / raw)
  To: Matthew Brost
  Cc: intel-xe, dri-devel, David Hildenbrand, Oscar Salvador,
	Andrew Morton, Balbir Singh, linux-mm, linux-cxl, linux-kernel

On Fri, Apr 10, 2026 at 04:03:46PM -0700, Matthew Brost wrote:
> The contents of a device folio can immediately change after calling
> ->folio_free(), as the folio may be reallocated by a driver with a
> different order. Instead of touching the folio again to extract the
> pgmap, use the local stack variable when calling percpu_ref_put_many().
> 
> Cc: David Hildenbrand <david@kernel.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Balbir Singh <balbirs@nvidia.com>
> Cc: linux-mm@kvack.org
> Cc: linux-cxl@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Fixes: d245f9b4ab80 ("mm/zone_device: support large zone device private folios")
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>

With the cc-stable:

Reviewed-by: Vishal Moola <vishal.moola@gmail.com>

> ---
>  mm/memremap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memremap.c b/mm/memremap.c
> index ac7be07e3361..053842d45cb1 100644
> --- a/mm/memremap.c
> +++ b/mm/memremap.c
> @@ -454,7 +454,7 @@ void free_zone_device_folio(struct folio *folio)
>  		if (WARN_ON_ONCE(!pgmap->ops || !pgmap->ops->folio_free))
>  			break;
>  		pgmap->ops->folio_free(folio);
> -		percpu_ref_put_many(&folio->pgmap->ref, nr);
> +		percpu_ref_put_many(&pgmap->ref, nr);
>  		break;
>  
>  	case MEMORY_DEVICE_GENERIC:
> -- 
> 2.34.1
> 



* Re: [PATCH] mm/zone_device: Do not touch device folio after calling ->folio_free()
  2026-04-10 23:03 [PATCH] mm/zone_device: Do not touch device folio after calling ->folio_free() Matthew Brost
                   ` (2 preceding siblings ...)
  2026-04-12  4:39 ` Vishal Moola
@ 2026-04-13  4:06 ` Alistair Popple
  2026-04-16  8:52   ` David Hildenbrand (Arm)
  3 siblings, 1 reply; 7+ messages in thread
From: Alistair Popple @ 2026-04-13  4:06 UTC (permalink / raw)
  To: Matthew Brost
  Cc: intel-xe, dri-devel, David Hildenbrand, Oscar Salvador,
	Andrew Morton, Balbir Singh, linux-mm, linux-cxl, linux-kernel

On 2026-04-11 at 09:03 +1000, Matthew Brost <matthew.brost@intel.com> wrote...
> The contents of a device folio can immediately change after calling
> ->folio_free(), as the folio may be reallocated by a driver with a
> different order. Instead of touching the folio again to extract the
> pgmap, use the local stack variable when calling percpu_ref_put_many().
> 
> Cc: David Hildenbrand <david@kernel.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Balbir Singh <balbirs@nvidia.com>
> Cc: linux-mm@kvack.org
> Cc: linux-cxl@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Fixes: d245f9b4ab80 ("mm/zone_device: support large zone device private folios")
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> 
> ---
> Stack trace:
> 
> [  631.875165] [IGT] xe_exec_system_allocator: starting subtest threads-many-new-prefetch
> [  632.282992] Oops: general protection fault, probably for non-canonical address 0x900000000000000: 0000 [#1] SMP NOPTI
> [  632.293469] CPU: 8 UID: 0 PID: 59267 Comm: xe_exec_system_ Not tainted 7.0.0-rc7-xe+ #281 PREEMPT(full)
> [  632.316023] RIP: 0010:free_zone_device_folio+0x149/0x240
> [  632.339782] RSP: 0000:ffffc90023d1fd00 EFLAGS: 00010206
> [  632.344947] RAX: 0900000000000000 RBX: 0000000000000001 RCX: 0000000094472d4d
> [  632.351991] RDX: ffffffff8155c76f RSI: 000000006f2213bf RDI: 000000008e84943a
> [  632.359042] RBP: ffffea0ff4030001 R08: 0000000000000000 R09: 0000000000000001
> [  632.366094] R10: 0000000000000028 R11: 0000000000000000 R12: ffff88811828e400
> [  632.373145] R13: 0000000000000000 R14: 000fffffc0000000 R15: 0000000000100073
> [  632.380194] FS:  00007f2f0fdfe6c0(0000) GS:ffff88890a7e7000(0000) knlGS:0000000000000000
> [  632.388186] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  632.393870] CR2: 00007f2f002e90f8 CR3: 0000000106708002 CR4: 0000000000f70ef0
> [  632.400919] PKRU: 55555554
> [  632.403605] Call Trace:
> [  632.406039]  <TASK>
> [  632.408131]  do_swap_page+0x146d/0x18c0
> [  632.411938]  ? __pte_offset_map+0x3e/0x190
> [  632.415994]  __handle_mm_fault+0x6e8/0x8d0
> [  632.420053]  handle_mm_fault+0xbf/0x250
> [  632.423855]  ? lock_mm_and_find_vma+0x41/0x6f0
> [  632.428256]  do_user_addr_fault+0x168/0x690
> [  632.432399]  exc_page_fault+0x74/0x200
> [  632.436117]  asm_exc_page_fault+0x26/0x30
> [  632.440092] RIP: 0033:0x5587554ff70d
> [  632.462142] RSP: 002b:00007f2f0fdfc970 EFLAGS: 00010246
> [  632.467308] RAX: 0000000000003fc0 RBX: 00007f2f082e1fc0 RCX: 00007f2f12b3287d
> [  632.474355] RDX: 0000000000000000 RSI: 00000000c048644a RDI: 0000000000000003
> [  632.481404] RBP: 00007f2f082e1fc0 R08: 00007f2f0fdfc958 R09: 0000000000000066
> [  632.488450] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
> [  632.495495] R13: 00007f2f082de000 R14: 0000000000c00002 R15: 00007f2f1319e000
> [  632.502547]  </TASK>

I'm not sure, but I think Andrew likes the stack traces included in the actual
commit messages. I've certainly found it helpful when debugging traces reported
from the field so would prefer it there.

> ---
>  mm/memremap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memremap.c b/mm/memremap.c
> index ac7be07e3361..053842d45cb1 100644
> --- a/mm/memremap.c
> +++ b/mm/memremap.c
> @@ -454,7 +454,7 @@ void free_zone_device_folio(struct folio *folio)
>  		if (WARN_ON_ONCE(!pgmap->ops || !pgmap->ops->folio_free))
>  			break;
>  		pgmap->ops->folio_free(folio);
> -		percpu_ref_put_many(&folio->pgmap->ref, nr);
> +		percpu_ref_put_many(&pgmap->ref, nr);

It's a pity this was open-coded rather than implementing put_dev_pagemap_many(),
which would make it clearer what this is doing, but that's unrelated to this
issue and on me for not catching it when reviewing. So for this fix:

Reviewed-by: Alistair Popple <apopple@nvidia.com>

>  		break;
>  
>  	case MEMORY_DEVICE_GENERIC:
> -- 
> 2.34.1
> 
> 
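
The helper suggested above might look roughly like the following userspace model. Every name here (`struct pagemap_model`, `model_ref_put_many()`, `model_put_pagemap_many()`, `demo_put_many()`) is hypothetical and only stands in for the kernel types. The NULL check is an assumption, modelled on how single-reference put helpers are commonly written, and is not taken from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a pgmap and its reference count. */
struct pagemap_model {
	unsigned long refs;
};

/* Models percpu_ref_put_many(): drop nr references in one step. */
static void model_ref_put_many(unsigned long *refs, unsigned long nr)
{
	assert(*refs >= nr);	/* dropping more than is held is a bug */
	*refs -= nr;
}

/*
 * Hypothetical wrapper in the spirit of put_dev_pagemap_many(): callers
 * pass the pagemap itself, so the call site says what is being released
 * instead of reaching into the embedded counter.
 */
static void model_put_pagemap_many(struct pagemap_model *pgmap,
				   unsigned long nr)
{
	if (pgmap)
		model_ref_put_many(&pgmap->refs, nr);
}

/* Usage: returns the number of references left after the puts. */
static unsigned long demo_put_many(void)
{
	/* One initial reference plus 512 for a large folio. */
	struct pagemap_model pgmap = { .refs = 1 + 512 };

	model_put_pagemap_many(&pgmap, 512);
	model_put_pagemap_many(NULL, 512);	/* tolerated as a no-op */
	return pgmap.refs;
}
```

The wrapper takes the pagemap pointer directly, so a caller has to already hold it in a local, which is the ordering this patch enforces by hand.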



* Re: [PATCH] mm/zone_device: Do not touch device folio after calling ->folio_free()
  2026-04-13  4:06 ` Alistair Popple
@ 2026-04-16  8:52   ` David Hildenbrand (Arm)
  2026-04-16 23:40     ` Alistair Popple
  0 siblings, 1 reply; 7+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-16  8:52 UTC (permalink / raw)
  To: Alistair Popple, Matthew Brost
  Cc: intel-xe, dri-devel, Oscar Salvador, Andrew Morton, Balbir Singh,
	linux-mm, linux-cxl, linux-kernel

On 4/13/26 06:06, Alistair Popple wrote:
> On 2026-04-11 at 09:03 +1000, Matthew Brost <matthew.brost@intel.com> wrote...
>> The contents of a device folio can immediately change after calling
>> ->folio_free(), as the folio may be reallocated by a driver with a
>> different order. Instead of touching the folio again to extract the
>> pgmap, use the local stack variable when calling percpu_ref_put_many().
>>
>> Cc: David Hildenbrand <david@kernel.org>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Balbir Singh <balbirs@nvidia.com>
>> Cc: linux-mm@kvack.org
>> Cc: linux-cxl@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Fixes: d245f9b4ab80 ("mm/zone_device: support large zone device private folios")
>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>
>> ---
>> Stack trace:
>>
>> [  631.875165] [IGT] xe_exec_system_allocator: starting subtest threads-many-new-prefetch
>> [  632.282992] Oops: general protection fault, probably for non-canonical address 0x900000000000000: 0000 [#1] SMP NOPTI
>> [  632.293469] CPU: 8 UID: 0 PID: 59267 Comm: xe_exec_system_ Not tainted 7.0.0-rc7-xe+ #281 PREEMPT(full)
>> [  632.316023] RIP: 0010:free_zone_device_folio+0x149/0x240
>> [  632.339782] RSP: 0000:ffffc90023d1fd00 EFLAGS: 00010206
>> [  632.344947] RAX: 0900000000000000 RBX: 0000000000000001 RCX: 0000000094472d4d
>> [  632.351991] RDX: ffffffff8155c76f RSI: 000000006f2213bf RDI: 000000008e84943a
>> [  632.359042] RBP: ffffea0ff4030001 R08: 0000000000000000 R09: 0000000000000001
>> [  632.366094] R10: 0000000000000028 R11: 0000000000000000 R12: ffff88811828e400
>> [  632.373145] R13: 0000000000000000 R14: 000fffffc0000000 R15: 0000000000100073
>> [  632.380194] FS:  00007f2f0fdfe6c0(0000) GS:ffff88890a7e7000(0000) knlGS:0000000000000000
>> [  632.388186] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [  632.393870] CR2: 00007f2f002e90f8 CR3: 0000000106708002 CR4: 0000000000f70ef0
>> [  632.400919] PKRU: 55555554
>> [  632.403605] Call Trace:
>> [  632.406039]  <TASK>
>> [  632.408131]  do_swap_page+0x146d/0x18c0
>> [  632.411938]  ? __pte_offset_map+0x3e/0x190
>> [  632.415994]  __handle_mm_fault+0x6e8/0x8d0
>> [  632.420053]  handle_mm_fault+0xbf/0x250
>> [  632.423855]  ? lock_mm_and_find_vma+0x41/0x6f0
>> [  632.428256]  do_user_addr_fault+0x168/0x690
>> [  632.432399]  exc_page_fault+0x74/0x200
>> [  632.436117]  asm_exc_page_fault+0x26/0x30
>> [  632.440092] RIP: 0033:0x5587554ff70d
>> [  632.462142] RSP: 002b:00007f2f0fdfc970 EFLAGS: 00010246
>> [  632.467308] RAX: 0000000000003fc0 RBX: 00007f2f082e1fc0 RCX: 00007f2f12b3287d
>> [  632.474355] RDX: 0000000000000000 RSI: 00000000c048644a RDI: 0000000000000003
>> [  632.481404] RBP: 00007f2f082e1fc0 R08: 00007f2f0fdfc958 R09: 0000000000000066
>> [  632.488450] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
>> [  632.495495] R13: 00007f2f082de000 R14: 0000000000c00002 R15: 00007f2f1319e000
>> [  632.502547]  </TASK>
> 
> I'm not sure, but I think Andrew likes the stack traces included in the actual
> commit messages. I've certainly found it helpful when debugging traces reported
> from the field so would prefer it there.

Agreed.

> 
>> ---
>>  mm/memremap.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/memremap.c b/mm/memremap.c
>> index ac7be07e3361..053842d45cb1 100644
>> --- a/mm/memremap.c
>> +++ b/mm/memremap.c
>> @@ -454,7 +454,7 @@ void free_zone_device_folio(struct folio *folio)
>>  		if (WARN_ON_ONCE(!pgmap->ops || !pgmap->ops->folio_free))
>>  			break;
>>  		pgmap->ops->folio_free(folio);
>> -		percpu_ref_put_many(&folio->pgmap->ref, nr);
>> +		percpu_ref_put_many(&pgmap->ref, nr);
> 

I assume the ref keeps the pgmap alive, such that it cannot go away after
the folio_free().

Acked-by: David Hildenbrand (Arm) <david@kernel.org>

-- 
Cheers,

David



* Re: [PATCH] mm/zone_device: Do not touch device folio after calling ->folio_free()
  2026-04-16  8:52   ` David Hildenbrand (Arm)
@ 2026-04-16 23:40     ` Alistair Popple
  0 siblings, 0 replies; 7+ messages in thread
From: Alistair Popple @ 2026-04-16 23:40 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: Matthew Brost, intel-xe, dri-devel, Oscar Salvador,
	Andrew Morton, Balbir Singh, linux-mm, linux-cxl, linux-kernel

On 2026-04-16 at 18:52 +1000, "David Hildenbrand (Arm)" <david@kernel.org> wrote...
> On 4/13/26 06:06, Alistair Popple wrote:
> > On 2026-04-11 at 09:03 +1000, Matthew Brost <matthew.brost@intel.com> wrote...
> >> The contents of a device folio can immediately change after calling
> >> ->folio_free(), as the folio may be reallocated by a driver with a
> >> different order. Instead of touching the folio again to extract the
> >> pgmap, use the local stack variable when calling percpu_ref_put_many().
> >>
> >> Cc: David Hildenbrand <david@kernel.org>
> >> Cc: Oscar Salvador <osalvador@suse.de>
> >> Cc: Andrew Morton <akpm@linux-foundation.org>
> >> Cc: Balbir Singh <balbirs@nvidia.com>
> >> Cc: linux-mm@kvack.org
> >> Cc: linux-cxl@vger.kernel.org
> >> Cc: linux-kernel@vger.kernel.org
> >> Fixes: d245f9b4ab80 ("mm/zone_device: support large zone device private folios")
> >> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> >>
> >> ---
> >> Stack trace:
> >>
> >> [  631.875165] [IGT] xe_exec_system_allocator: starting subtest threads-many-new-prefetch
> >> [  632.282992] Oops: general protection fault, probably for non-canonical address 0x900000000000000: 0000 [#1] SMP NOPTI
> >> [  632.293469] CPU: 8 UID: 0 PID: 59267 Comm: xe_exec_system_ Not tainted 7.0.0-rc7-xe+ #281 PREEMPT(full)
> >> [  632.316023] RIP: 0010:free_zone_device_folio+0x149/0x240
> >> [  632.339782] RSP: 0000:ffffc90023d1fd00 EFLAGS: 00010206
> >> [  632.344947] RAX: 0900000000000000 RBX: 0000000000000001 RCX: 0000000094472d4d
> >> [  632.351991] RDX: ffffffff8155c76f RSI: 000000006f2213bf RDI: 000000008e84943a
> >> [  632.359042] RBP: ffffea0ff4030001 R08: 0000000000000000 R09: 0000000000000001
> >> [  632.366094] R10: 0000000000000028 R11: 0000000000000000 R12: ffff88811828e400
> >> [  632.373145] R13: 0000000000000000 R14: 000fffffc0000000 R15: 0000000000100073
> >> [  632.380194] FS:  00007f2f0fdfe6c0(0000) GS:ffff88890a7e7000(0000) knlGS:0000000000000000
> >> [  632.388186] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> [  632.393870] CR2: 00007f2f002e90f8 CR3: 0000000106708002 CR4: 0000000000f70ef0
> >> [  632.400919] PKRU: 55555554
> >> [  632.403605] Call Trace:
> >> [  632.406039]  <TASK>
> >> [  632.408131]  do_swap_page+0x146d/0x18c0
> >> [  632.411938]  ? __pte_offset_map+0x3e/0x190
> >> [  632.415994]  __handle_mm_fault+0x6e8/0x8d0
> >> [  632.420053]  handle_mm_fault+0xbf/0x250
> >> [  632.423855]  ? lock_mm_and_find_vma+0x41/0x6f0
> >> [  632.428256]  do_user_addr_fault+0x168/0x690
> >> [  632.432399]  exc_page_fault+0x74/0x200
> >> [  632.436117]  asm_exc_page_fault+0x26/0x30
> >> [  632.440092] RIP: 0033:0x5587554ff70d
> >> [  632.462142] RSP: 002b:00007f2f0fdfc970 EFLAGS: 00010246
> >> [  632.467308] RAX: 0000000000003fc0 RBX: 00007f2f082e1fc0 RCX: 00007f2f12b3287d
> >> [  632.474355] RDX: 0000000000000000 RSI: 00000000c048644a RDI: 0000000000000003
> >> [  632.481404] RBP: 00007f2f082e1fc0 R08: 00007f2f0fdfc958 R09: 0000000000000066
> >> [  632.488450] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
> >> [  632.495495] R13: 00007f2f082de000 R14: 0000000000c00002 R15: 00007f2f1319e000
> >> [  632.502547]  </TASK>
> > 
> > I'm not sure, but I think Andrew likes the stack traces included in the actual
> > commit messages. I've certainly found it helpful when debugging traces reported
> > from the field so would prefer it there.
> 
> Agreed.
> 
> > 
> >> ---
> >>  mm/memremap.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/mm/memremap.c b/mm/memremap.c
> >> index ac7be07e3361..053842d45cb1 100644
> >> --- a/mm/memremap.c
> >> +++ b/mm/memremap.c
> >> @@ -454,7 +454,7 @@ void free_zone_device_folio(struct folio *folio)
> >>  		if (WARN_ON_ONCE(!pgmap->ops || !pgmap->ops->folio_free))
> >>  			break;
> >>  		pgmap->ops->folio_free(folio);
> >> -		percpu_ref_put_many(&folio->pgmap->ref, nr);
> >> +		percpu_ref_put_many(&pgmap->ref, nr);
> > 
> 
> I assume the ref keeps the pgmap alive, such that it cannot go away after
> the folio_free().

Drivers keep the pgmap alive by holding the initial pgmap->ref from the
percpu_ref_init() initialisation in memremap_pages(). They release it when done
with the range as part of memunmap_pages(), which the driver should only call
once all pages have been freed (at least for PRIVATE/COHERENT pages).

In practice we could drop the whole pgmap->ref counting for ZONE_DEVICE pages
(certainly for the PRIVATE and COHERENT variants anyway). Now that they are
refcounted normally, we could just scan the page range as a BUG/WARN_ON check to
see if any are in use. I haven't really felt the need to do that though, because
the check already exists and scanning the whole pgmap range for pages with a
non-zero refcount would be slow just for a debug check.

 - Alistair

> Acked-by: David Hildenbrand (Arm) <david@kernel.org>
> 
> -- 
> Cheers,
> 
> David
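
The lifetime rule described in this message can be modelled in userspace as follows. This is a deliberately simplified sketch: `struct pgmap_model` and the `model_*()` functions are hypothetical, a plain counter replaces the percpu ref, and teardown is reduced to clearing a flag when the count reaches zero. It only shows that dropping the per-page references on the free path cannot take the count to zero while the driver still holds its initial reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a pgmap and its reference count. */
struct pgmap_model {
	long refs;
	bool live;
};

/* Models memremap_pages(): the driver starts out holding one reference. */
static void model_remap(struct pgmap_model *p)
{
	p->refs = 1;
	p->live = true;
}

/* Models taking one reference per page in use (a modelling assumption). */
static void model_get_many(struct pgmap_model *p, long nr)
{
	assert(p->live);
	p->refs += nr;
}

/* Models dropping nr references; the pgmap goes away at zero. */
static void model_put_many(struct pgmap_model *p, long nr)
{
	assert(p->refs >= nr);
	p->refs -= nr;
	if (p->refs == 0)
		p->live = false;
}

/* Models memunmap_pages(): the driver gives up its initial reference. */
static void model_unmap(struct pgmap_model *p)
{
	model_put_many(p, 1);
}

/* Usage: true if the pgmap outlived the free and died on unmap. */
static bool demo_lifetime(void)
{
	struct pgmap_model p;
	bool alive_after_free;

	model_remap(&p);
	model_get_many(&p, 4);		/* an order-2 folio is in use */
	model_put_many(&p, 4);		/* the folio_free() path */
	alive_after_free = p.live;	/* initial reference still held */
	model_unmap(&p);
	return alive_after_free && !p.live;
}
```

In this model the pgmap is still live after the last page reference is dropped and only goes away in `model_unmap()`, which matches the argument that using the cached pgmap pointer after ->folio_free() is safe.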


