* Re: [PATCH 3/3] ptdesc: Account page tables to memcgs again
2026-02-25 16:22 ` [PATCH 3/3] ptdesc: Account page tables to memcgs again Matthew Wilcox (Oracle)
@ 2026-02-25 16:55 ` Shakeel Butt
2026-02-25 21:01 ` Matthew Wilcox
2026-02-25 20:57 ` Matthew Wilcox
` (2 subsequent siblings)
3 siblings, 1 reply; 11+ messages in thread
From: Shakeel Butt @ 2026-02-25 16:55 UTC (permalink / raw)
To: Matthew Wilcox (Oracle)
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, cgroups, linux-mm,
Axel Rasmussen
On Wed, Feb 25, 2026 at 04:22:17PM +0000, Matthew Wilcox (Oracle) wrote:
> Commit f0c92726e89f removed the accounting of page tables to memcgs.
> Reintroduce it.
>
> Fixes: f0c92726e89f (ptdesc: remove references to folios from __pagetable_ctor() and pagetable_dtor())
> Reported-by: Axel Rasmussen <axelrasmussen@google.com>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
> include/linux/mm.h | 15 +++++++++++++--
> include/linux/mm_types.h | 6 +++---
> 2 files changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 5be3d8a8f806..34bc6f00ed7b 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3519,21 +3519,32 @@ static inline unsigned long ptdesc_nr_pages(const struct ptdesc *ptdesc)
> return compound_nr(ptdesc_page(ptdesc));
> }
>
> +static inline struct mem_cgroup *pagetable_memcg(const struct ptdesc *ptdesc)
> +{
> +#ifdef CONFIG_MEMCG
> + return ptdesc->pt_memcg;
> +#else
> + return NULL;
> +#endif
> +}
> +
> static inline void __pagetable_ctor(struct ptdesc *ptdesc)
> {
> pg_data_t *pgdat = NODE_DATA(memdesc_nid(ptdesc->pt_flags));
> + struct mem_cgroup *memcg = pagetable_memcg(ptdesc);
>
> __SetPageTable(ptdesc_page(ptdesc));
> - mod_node_page_state(pgdat, NR_PAGETABLE, ptdesc_nr_pages(ptdesc));
> + memcg_stat_mod(memcg, pgdat, NR_PAGETABLE, ptdesc_nr_pages(ptdesc));
> }
>
> static inline void pagetable_dtor(struct ptdesc *ptdesc)
> {
> pg_data_t *pgdat = NODE_DATA(memdesc_nid(ptdesc->pt_flags));
> + struct mem_cgroup *memcg = pagetable_memcg(ptdesc);
>
> ptlock_free(ptdesc);
> __ClearPageTable(ptdesc_page(ptdesc));
> - mod_node_page_state(pgdat, NR_PAGETABLE, -ptdesc_nr_pages(ptdesc));
> + memcg_stat_mod(memcg, pgdat, NR_PAGETABLE, -ptdesc_nr_pages(ptdesc));
> }
>
> static inline void pagetable_dtor_free(struct ptdesc *ptdesc)
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 3cc8ae722886..e9b1da04938a 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -564,7 +564,7 @@ FOLIO_MATCH(compound_head, _head_3);
> * @ptl: Lock for the page table.
> * @__page_type: Same as page->page_type. Unused for page tables.
> * @__page_refcount: Same as page refcount.
> - * @pt_memcg_data: Memcg data. Tracked for page tables here.
> + * @pt_memcg: Memcg that this page table belongs to.
> *
> * This struct overlays struct page for now. Do not modify without a good
> * understanding of the issues.
> @@ -602,7 +602,7 @@ struct ptdesc {
> unsigned int __page_type;
> atomic_t __page_refcount;
> #ifdef CONFIG_MEMCG
> - unsigned long pt_memcg_data;
> + struct mem_cgroup *pt_memcg;
This is kernel memory, so this would be struct obj_cgroup * instead of struct
mem_cgroup pointer. We will need something similar to __folio_objcg(), maybe
__ptdesc_objcg() and then call obj_cgroup_memcg() on it. Basically how
folio_memcg() handles the kernel memory.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] ptdesc: Account page tables to memcgs again
2026-02-25 16:55 ` Shakeel Butt
@ 2026-02-25 21:01 ` Matthew Wilcox
2026-02-26 0:00 ` Shakeel Butt
0 siblings, 1 reply; 11+ messages in thread
From: Matthew Wilcox @ 2026-02-25 21:01 UTC (permalink / raw)
To: Shakeel Butt
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, cgroups, linux-mm,
Axel Rasmussen
On Wed, Feb 25, 2026 at 08:55:54AM -0800, Shakeel Butt wrote:
> > #ifdef CONFIG_MEMCG
> > - unsigned long pt_memcg_data;
> > + struct mem_cgroup *pt_memcg;
>
> This is kernel memory, so this would be struct obj_cgroup * instead of struct
> mem_cgroup pointer. We will need something similar to __folio_objcg(), maybe
> __ptdesc_objcg() and then call obj_cgroup_memcg() on it. Basically how
> folio_memcg() handles the kernel memory.
Why would we want to do that instead of just stashing a pointer to the
memcg in the ptdesc? I feel very stupid about the differences between
all of these things and would dearly love to read some documentation to
learn.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] ptdesc: Account page tables to memcgs again
2026-02-25 21:01 ` Matthew Wilcox
@ 2026-02-26 0:00 ` Shakeel Butt
0 siblings, 0 replies; 11+ messages in thread
From: Shakeel Butt @ 2026-02-26 0:00 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, cgroups, linux-mm,
Axel Rasmussen
On Wed, Feb 25, 2026 at 09:01:18PM +0000, Matthew Wilcox wrote:
> On Wed, Feb 25, 2026 at 08:55:54AM -0800, Shakeel Butt wrote:
> > > #ifdef CONFIG_MEMCG
> > > - unsigned long pt_memcg_data;
> > > + struct mem_cgroup *pt_memcg;
> >
> > This is kernel memory, so this would be struct obj_cgroup * instead of struct
> > mem_cgroup pointer. We will need something similar to __folio_objcg(), maybe
> > __ptdesc_objcg() and then call obj_cgroup_memcg() on it. Basically how
> > folio_memcg() handles the kernel memory.
>
> Why would we want to do that instead of just stashing a pointer to the
> memcg in the ptdesc?
Not the memcg pointer but the objcg pointer; a bit of background first, though.
Underneath, we are using alloc_pages_noprof(__GFP_ACCOUNT) and __free_pages() for
ptdesc, so the allocation path looks like the following:
  alloc_pages_noprof(__GFP_ACCOUNT)
    ...
    -> __alloc_frozen_pages_noprof(__GFP_ACCOUNT)
      -> __memcg_kmem_charge_page()
        -> page_set_objcg(page, objcg)
and page_set_objcg() is defined as
static void page_set_objcg(struct page *page, const struct obj_cgroup *objcg)
{
	page->memcg_data = (unsigned long)objcg | MEMCG_DATA_KMEM;
}
page->memcg_data overlaps with ptdesc->pt_memcg_data, so we need to mask off
MEMCG_DATA_KMEM to get the objcg pointer.
If we want to store a pointer in struct ptdesc then we can't use the raw page
allocator/free functions. We have to allocate without __GFP_ACCOUNT and then do
the charging in __pagetable_ctor and the uncharging in pagetable_dtor explicitly.
BTW we are trying to migrate from memcg pointers to objcg pointers in most
places due to the zombie issue.
> I feel very stupid about the differences between
> all of these things and would dearly love to read some documentation to
> learn.
Unfortunately we don't have good documentation, just code.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] ptdesc: Account page tables to memcgs again
2026-02-25 16:22 ` [PATCH 3/3] ptdesc: Account page tables to memcgs again Matthew Wilcox (Oracle)
2026-02-25 16:55 ` Shakeel Butt
@ 2026-02-25 20:57 ` Matthew Wilcox
2026-02-25 21:48 ` Axel Rasmussen
2026-03-05 7:00 ` kernel test robot
3 siblings, 0 replies; 11+ messages in thread
From: Matthew Wilcox @ 2026-02-25 20:57 UTC (permalink / raw)
To: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
cgroups, linux-mm
Cc: Axel Rasmussen
On Wed, Feb 25, 2026 at 04:22:17PM +0000, Matthew Wilcox (Oracle) wrote:
> static inline void __pagetable_ctor(struct ptdesc *ptdesc)
> {
> pg_data_t *pgdat = NODE_DATA(memdesc_nid(ptdesc->pt_flags));
> + struct mem_cgroup *memcg = pagetable_memcg(ptdesc);
>
> __SetPageTable(ptdesc_page(ptdesc));
> - mod_node_page_state(pgdat, NR_PAGETABLE, ptdesc_nr_pages(ptdesc));
> + memcg_stat_mod(memcg, pgdat, NR_PAGETABLE, ptdesc_nr_pages(ptdesc));
> }
It occurs to me that we're not holding the rcu_read_lock() here
(whereas we do for the other two callers). I'm not quite clear
on what the rcu read lock is protecting here -- can it be that the
memcg is rcu-freed while a page table belongs to it? Or does the task
existing prevent the memcg from being freed?
(is there documentation on this that I've been unable to find?)
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] ptdesc: Account page tables to memcgs again
2026-02-25 16:22 ` [PATCH 3/3] ptdesc: Account page tables to memcgs again Matthew Wilcox (Oracle)
2026-02-25 16:55 ` Shakeel Butt
2026-02-25 20:57 ` Matthew Wilcox
@ 2026-02-25 21:48 ` Axel Rasmussen
2026-03-05 7:00 ` kernel test robot
3 siblings, 0 replies; 11+ messages in thread
From: Axel Rasmussen @ 2026-02-25 21:48 UTC (permalink / raw)
To: Matthew Wilcox (Oracle)
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
cgroups, linux-mm
On Wed, Feb 25, 2026 at 8:23 AM Matthew Wilcox (Oracle)
<willy@infradead.org> wrote:
>
> Commit f0c92726e89f removed the accounting of page tables to memcgs.
> Reintroduce it.
>
> Fixes: f0c92726e89f (ptdesc: remove references to folios from __pagetable_ctor() and pagetable_dtor())
> Reported-by: Axel Rasmussen <axelrasmussen@google.com>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
> include/linux/mm.h | 15 +++++++++++++--
> include/linux/mm_types.h | 6 +++---
> 2 files changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 5be3d8a8f806..34bc6f00ed7b 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3519,21 +3519,32 @@ static inline unsigned long ptdesc_nr_pages(const struct ptdesc *ptdesc)
> return compound_nr(ptdesc_page(ptdesc));
> }
>
> +static inline struct mem_cgroup *pagetable_memcg(const struct ptdesc *ptdesc)
> +{
> +#ifdef CONFIG_MEMCG
> + return ptdesc->pt_memcg;
I think this is buggy and we need to decode the "real" pointer from memcg_data?
I applied this series (cleanly) on top of torvalds/master
(7dff99b354601dd01829e1511711846e04340a69) and when I boot I get:
[ 3.315420] BUG: kernel NULL pointer dereference, address: 00000000000004e8
[ 3.316955] #PF: supervisor read access in kernel mode
[ 3.318100] #PF: error_code(0x0000) - not-present page
[ 3.319302] PGD 0 P4D 0
[ 3.319877] Oops: Oops: 0000 [#1] SMP NOPTI
[ 3.320829] CPU: 2 UID: 0 PID: 157 Comm: systemd Not tainted
7.0.0-smp-DEV #2 PREEMPTLAZY
[ 3.322665] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.17.0-debian-1.17.0-1 04/01/2014
[ 3.324772] RIP: 0010:memcg_stat_mod+0x2c/0x90
[ 3.325784] Code: 40 d6 0f 1f 44 00 00 55 41 56 53 48 89 cb 89 d5
48 85 ff 74 3d 66 90 48 63 86 c0 19 00 00 4c 8b b4 c7 90 08 00 00 49
83 c6 48 <49> 8b be a0 04 00 00 48 39 f7 75 2d 48 63 d3 8f
[ 3.329919] RSP: 0018:ffff9b62c0817de0 EFLAGS: 00010206
[ 3.331110] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000001
[ 3.332718] RDX: 0000000000000025 RSI: ffff98d33fffdcc0 RDI: ffff98cc08b8d142
[ 3.334322] RBP: 0000000000000025 R08: 0000000000007fff R09: ffffffff99079980
[ 3.335917] R10: 0000000000017ffd R11: 00000000ffff7fff R12: ffff98cc0310c138
[ 3.337522] R13: 00007ffc318c77d8 R14: 0000000000000048 R15: ffff98cc009e2280
[ 3.339118] FS: 00007f2fffd3d400(0000) GS:ffff98d385556000(0000)
knlGS:0000000000000000
[ 3.340915] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.342208] CR2: 00000000000004e8 CR3: 00000001089ad000 CR4: 0000000000350ef0
[ 3.343804] Call Trace:
[ 3.344383] <TASK>
[ 3.344872] pgd_alloc+0x5d/0x1d0
[ 3.345643] mm_init+0x1df/0x3b0
[ 3.346395] alloc_bprm+0x10b/0x1c0
[ 3.347231] do_execveat_common+0x9b/0x300
[ 3.348162] __x64_sys_execve+0x41/0x60
[ 3.349020] do_syscall_64+0xe0/0x8a0
[ 3.349860] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 3.351009] RIP: 0033:0x7f30004f423b
[ 3.351831] Code: 0f 1e fa 48 8b 05 85 1d 10 00 48 8b 10 e9 0d 00
00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa b8 3b 00 00
00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c5 1a 10 08
[ 3.356028] RSP: 002b:00007f2fff657e68 EFLAGS: 00000202 ORIG_RAX:
000000000000003b
[ 3.357707] RAX: ffffffffffffffda RBX: 00007ffc318c6b90 RCX: 00007f30004f423b
[ 3.359321] RDX: 00007ffc318c77d8 RSI: 00007ffc318c6e80 RDI: 00007ffc318c6e60
[ 3.360894] RBP: 00007f2fff657ff0 R08: 00007ffc318c68c0 R09: 0000000000000000
[ 3.362483] R10: 0000000000000008 R11: 0000000000000202 R12: 00007ffc318c68c0
[ 3.364061] R13: 0000000000000040 R14: 0000000000000001 R15: 00007f2fff657f20
[ 3.365657] </TASK>
[ 3.366177] Modules linked in: xhci_pci xhci_hcd virtio_net
net_failover failover virtio_blk virtio_balloon uhci_hcd ohci_pci
ohci_hcd evdev ehci_pci ehci_hcd 9pnet_virtio 9p 9pnet netfs
[ 3.369780] CR2: 00000000000004e8
[ 3.370543] ---[ end trace 0000000000000000 ]---
[ 3.371578] RIP: 0010:memcg_stat_mod+0x2c/0x90
[ 3.372584] Code: 40 d6 0f 1f 44 00 00 55 41 56 53 48 89 cb 89 d5
48 85 ff 74 3d 66 90 48 63 86 c0 19 00 00 4c 8b b4 c7 90 08 00 00 49
83 c6 48 <49> 8b be a0 04 00 00 48 39 f7 75 2d 48 63 d3 8f
[ 3.376675] RSP: 0018:ffff9b62c0817de0 EFLAGS: 00010206
[ 3.377838] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000001
[ 3.379437] RDX: 0000000000000025 RSI: ffff98d33fffdcc0 RDI: ffff98cc08b8d142
[ 3.380994] RBP: 0000000000000025 R08: 0000000000007fff R09: ffffffff99079980
[ 3.382586] R10: 0000000000017ffd R11: 00000000ffff7fff R12: ffff98cc0310c138
[ 3.384188] R13: 00007ffc318c77d8 R14: 0000000000000048 R15: ffff98cc009e2280
[ 3.385761] FS: 00007f2fffd3d400(0000) GS:ffff98d385556000(0000)
knlGS:0000000000000000
[ 3.387554] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.388836] CR2: 00000000000004e8 CR3: 00000001089ad000 CR4: 0000000000350ef0
[ 3.390449] Kernel panic - not syncing: Fatal exception
[ 3.391806] Kernel Offset: 0x16200000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 3.394178] Rebooting in 10 seconds..
> +#else
> + return NULL;
> +#endif
> +}
> +
> static inline void __pagetable_ctor(struct ptdesc *ptdesc)
> {
> pg_data_t *pgdat = NODE_DATA(memdesc_nid(ptdesc->pt_flags));
> + struct mem_cgroup *memcg = pagetable_memcg(ptdesc);
>
> __SetPageTable(ptdesc_page(ptdesc));
> - mod_node_page_state(pgdat, NR_PAGETABLE, ptdesc_nr_pages(ptdesc));
> + memcg_stat_mod(memcg, pgdat, NR_PAGETABLE, ptdesc_nr_pages(ptdesc));
> }
>
> static inline void pagetable_dtor(struct ptdesc *ptdesc)
> {
> pg_data_t *pgdat = NODE_DATA(memdesc_nid(ptdesc->pt_flags));
> + struct mem_cgroup *memcg = pagetable_memcg(ptdesc);
>
> ptlock_free(ptdesc);
> __ClearPageTable(ptdesc_page(ptdesc));
> - mod_node_page_state(pgdat, NR_PAGETABLE, -ptdesc_nr_pages(ptdesc));
> + memcg_stat_mod(memcg, pgdat, NR_PAGETABLE, -ptdesc_nr_pages(ptdesc));
Re: the RCU read lock discussion, I spotted that too. I'm also not
100% clear on whether or not it's required. folio_memcg says:
"For a kmem folio a caller should hold an rcu read lock to protect
memcg associated with a kmem folio from being released."
But on the other hand get_mem_cgroup_from_folio seems to think it's
fine to unconditionally call folio_memcg without an RCU read lock, it
seems to think we only need one whilst acquiring a reference, and once
we have that we can unlock. (Not that that helps us greatly, I don't
think we want ptdescs to hold a reference for their entire lifetime.)
> }
>
> static inline void pagetable_dtor_free(struct ptdesc *ptdesc)
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 3cc8ae722886..e9b1da04938a 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -564,7 +564,7 @@ FOLIO_MATCH(compound_head, _head_3);
> * @ptl: Lock for the page table.
> * @__page_type: Same as page->page_type. Unused for page tables.
> * @__page_refcount: Same as page refcount.
> - * @pt_memcg_data: Memcg data. Tracked for page tables here.
> + * @pt_memcg: Memcg that this page table belongs to.
> *
> * This struct overlays struct page for now. Do not modify without a good
> * understanding of the issues.
> @@ -602,7 +602,7 @@ struct ptdesc {
> unsigned int __page_type;
> atomic_t __page_refcount;
> #ifdef CONFIG_MEMCG
> - unsigned long pt_memcg_data;
> + struct mem_cgroup *pt_memcg;
> #endif
> };
>
> @@ -617,7 +617,7 @@ TABLE_MATCH(rcu_head, pt_rcu_head);
> TABLE_MATCH(page_type, __page_type);
> TABLE_MATCH(_refcount, __page_refcount);
> #ifdef CONFIG_MEMCG
> -TABLE_MATCH(memcg_data, pt_memcg_data);
> +TABLE_MATCH(memcg_data, pt_memcg);
> #endif
> #undef TABLE_MATCH
> static_assert(sizeof(struct ptdesc) <= sizeof(struct page));
> --
> 2.47.3
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] ptdesc: Account page tables to memcgs again
2026-02-25 16:22 ` [PATCH 3/3] ptdesc: Account page tables to memcgs again Matthew Wilcox (Oracle)
` (2 preceding siblings ...)
2026-02-25 21:48 ` Axel Rasmussen
@ 2026-03-05 7:00 ` kernel test robot
3 siblings, 0 replies; 11+ messages in thread
From: kernel test robot @ 2026-03-05 7:00 UTC (permalink / raw)
To: Matthew Wilcox (Oracle)
Cc: oe-lkp, lkp, Axel Rasmussen, linux-mm, Johannes Weiner,
Michal Hocko, Roman Gushchin, Shakeel Butt, cgroups,
Matthew Wilcox (Oracle),
oliver.sang
Hello,
kernel test robot noticed "BUG:kernel_NULL_pointer_dereference,address" on:
commit: 1445ef3d5f2fefd1fcedb68cff3fff0a33994791 ("[PATCH 3/3] ptdesc: Account page tables to memcgs again")
url: https://github.com/intel-lab-lkp/linux/commits/Matthew-Wilcox-Oracle/memcg-Add-memcg_stat_mod/20260226-003144
base: https://git.kernel.org/cgit/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/all/20260225162319.315281-4-willy@infradead.org/
patch subject: [PATCH 3/3] ptdesc: Account page tables to memcgs again
in testcase: boot
config: x86_64-randconfig-r052-20250414
compiler: gcc-14
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 32G
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202603051407.fde83fdb-lkp@intel.com
[ 14.109191][ T1] BUG: kernel NULL pointer dereference, address: 0000000000000880
[ 14.109653][ T1] #PF: supervisor read access in kernel mode
[ 14.109989][ T1] #PF: error_code(0x0000) - not-present page
[ 14.110322][ T1] PGD 12a8ff067 P4D 12a8ff067 PUD 0
[ 14.110622][ T1] Oops: Oops: 0000 [#1] SMP
[ 14.110878][ T1] CPU: 0 UID: 0 PID: 1 Comm: systemd Not tainted 7.0.0-rc1-00154-g1445ef3d5f2f #1 PREEMPT(full)
[ 14.111462][ T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 14.112082][ T1] RIP: 0010:mem_cgroup_lruvec (include/linux/memcontrol.h:729 (discriminator 1))
[ 14.112396][ T1] Code: 74 09 48 8d 83 40 22 00 00 eb 1f 4d 85 e4 75 07 4c 8b 25 bd c3 1e 02 48 63 83 b8 1e 00 00 49 8b 84 c4 58 07 00 00 48 83 c0 40 <48> 39 98 40 08 00 00 74 07 48 89 98 40 08 00 00 5b 41 5c 5d c3 cc
All code
========
0: 74 09 je 0xb
2: 48 8d 83 40 22 00 00 lea 0x2240(%rbx),%rax
9: eb 1f jmp 0x2a
b: 4d 85 e4 test %r12,%r12
e: 75 07 jne 0x17
10: 4c 8b 25 bd c3 1e 02 mov 0x21ec3bd(%rip),%r12 # 0x21ec3d4
17: 48 63 83 b8 1e 00 00 movslq 0x1eb8(%rbx),%rax
1e: 49 8b 84 c4 58 07 00 mov 0x758(%r12,%rax,8),%rax
25: 00
26: 48 83 c0 40 add $0x40,%rax
2a:* 48 39 98 40 08 00 00 cmp %rbx,0x840(%rax) <-- trapping instruction
31: 74 07 je 0x3a
33: 48 89 98 40 08 00 00 mov %rbx,0x840(%rax)
3a: 5b pop %rbx
3b: 41 5c pop %r12
3d: 5d pop %rbp
3e: c3 ret
3f: cc int3
Code starting with the faulting instruction
===========================================
0: 48 39 98 40 08 00 00 cmp %rbx,0x840(%rax)
7: 74 07 je 0x10
9: 48 89 98 40 08 00 00 mov %rbx,0x840(%rax)
10: 5b pop %rbx
11: 41 5c pop %r12
13: 5d pop %rbp
14: c3 ret
15: cc int3
[ 14.113448][ T1] RSP: 0018:ffff888101467c10 EFLAGS: 00210202
[ 14.113783][ T1] RAX: 0000000000000040 RBX: ffffffff83708880 RCX: 0000000000000001
[ 14.114217][ T1] RDX: 0000000000000001 RSI: ffffffff83708880 RDI: ffff88812a845f82
[ 14.114652][ T1] RBP: ffff888101467c20 R08: 0000000000000000 R09: 0000000000000000
[ 14.115087][ T1] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88812a845f82
[ 14.115979][ T1] R13: ffff888105fdd3a8 R14: ffff888105fdd740 R15: ffff888101442000
[ 14.116421][ T1] FS: 0000000000000000(0000) GS:ffff88889bc98000(0063) knlGS:00000000f72f8840
[ 14.116911][ T1] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[ 14.117276][ T1] CR2: 0000000000000880 CR3: 000000012b816000 CR4: 00000000000406f0
[ 14.117713][ T1] Call Trace:
[ 14.117901][ T1] <TASK>
[ 14.118070][ T1] memcg_stat_mod (mm/memcontrol.c:804)
[ 14.118328][ T1] __pagetable_ctor (include/linux/mm.h:3547)
[ 14.118595][ T1] pgd_alloc (include/asm-generic/pgalloc.h:291 arch/x86/mm/pgtable.c:314 arch/x86/mm/pgtable.c:328)
[ 14.118863][ T1] mm_init+0x210/0x390
[ 14.119129][ T1] dup_mm+0x45/0xe0
[ 14.119401][ T1] copy_process (kernel/fork.c:1587 (discriminator 1) kernel/fork.c:2228 (discriminator 1))
[ 14.119661][ T1] ? free_filename (fs/namei.c:148)
[ 14.119930][ T1] kernel_clone (include/linux/random.h:26 kernel/fork.c:2660)
[ 14.120200][ T1] __do_compat_sys_ia32_clone (arch/x86/kernel/sys_ia32.c:255)
[ 14.120519][ T1] __ia32_compat_sys_ia32_clone (arch/x86/kernel/sys_ia32.c:240)
[ 14.120837][ T1] ia32_sys_call (kbuild/obj/consumer/x86_64-randconfig-r052-20250414/./arch/x86/include/generated/asm/syscalls_32.h:121)
[ 14.121105][ T1] __do_fast_syscall_32 (arch/x86/entry/syscall_32.c:83 arch/x86/entry/syscall_32.c:307)
[ 14.121398][ T1] do_fast_syscall_32 (arch/x86/entry/syscall_32.c:332 (discriminator 1))
[ 14.121671][ T1] do_SYSENTER_32 (arch/x86/entry/syscall_32.c:371)
[ 14.121926][ T1] entry_SYSENTER_compat_after_hwframe (arch/x86/entry/entry_64_compat.S:127)
[ 14.122298][ T1] RIP: 0023:0xf7f9c38c
[ 14.122527][ T1] Code: d2 74 05 c1 e8 0c 89 02 8b 5d fc 31 c0 c9 c3 cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 58 b8
All code
========
0: d2 74 05 c1 shlb %cl,-0x3f(%rbp,%rax,1)
4: e8 0c 89 02 8b call 0xffffffff8b028915
9: 5d pop %rbp
a: fc cld
b: 31 c0 xor %eax,%eax
d: c9 leave
e: c3 ret
f: cc int3
10: 90 nop
11: 90 nop
12: 90 nop
13: 90 nop
14: 90 nop
15: 90 nop
16: 90 nop
17: 90 nop
18: 90 nop
19: 90 nop
1a: 90 nop
1b: 90 nop
1c: 90 nop
1d: 90 nop
1e: 0f 1f 00 nopl (%rax)
21: 51 push %rcx
22: 52 push %rdx
23: 55 push %rbp
24: 89 e5 mov %esp,%ebp
26: 0f 34 sysenter
28: cd 80 int $0x80
2a:* 5d pop %rbp <-- trapping instruction
2b: 5a pop %rdx
2c: 59 pop %rcx
2d: c3 ret
2e: cc int3
2f: 90 nop
30: 90 nop
31: 90 nop
32: 90 nop
33: 90 nop
34: 90 nop
35: 90 nop
36: 90 nop
37: 90 nop
38: 90 nop
39: 90 nop
3a: 90 nop
3b: 90 nop
3c: 90 nop
3d: 90 nop
3e: 58 pop %rax
3f: b8 .byte 0xb8
Code starting with the faulting instruction
===========================================
0: 5d pop %rbp
1: 5a pop %rdx
2: 59 pop %rcx
3: c3 ret
4: cc int3
5: 90 nop
6: 90 nop
7: 90 nop
8: 90 nop
9: 90 nop
a: 90 nop
b: 90 nop
c: 90 nop
d: 90 nop
e: 90 nop
f: 90 nop
10: 90 nop
11: 90 nop
12: 90 nop
13: 90 nop
14: 58 pop %rax
15: b8 .byte 0xb8
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260305/202603051407.fde83fdb-lkp@intel.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 11+ messages in thread