From: Harry Yoo <harry.yoo@oracle.com>
To: David Wang <00107082@163.com>
Cc: akpm@linux-foundation.org, surenb@google.com,
kent.overstreet@linux.dev, oliver.sang@intel.com,
cachen@purestorage.com, linux-mm@kvack.org,
oe-lkp@lists.linux.dev, stable@vger.kernel.org
Subject: Re: [PATCH v3] lib/alloc_tag: do not acquire non-existent lock in alloc_tag_top_users()
Date: Tue, 24 Jun 2025 18:09:54 +0900
Message-ID: <aFprYu5H_ztouxw2@hyeyoo>
In-Reply-To: <4f12c217.7a79.197a1070f55.Coremail.00107082@163.com>
On Tue, Jun 24, 2025 at 04:21:23PM +0800, David Wang wrote:
> At 2025-06-24 15:25:13, "Harry Yoo" <harry.yoo@oracle.com> wrote:
> >alloc_tag_top_users() attempts to lock alloc_tag_cttype->mod_lock
> >even when the alloc_tag_cttype is not allocated because:
> >
> > 1) alloc tagging is disabled because mem profiling is disabled
> > (!alloc_tag_cttype)
> > 2) alloc tagging is enabled, but not yet initialized (!alloc_tag_cttype)
> > 3) alloc tagging is enabled, but failed initialization
> > (!alloc_tag_cttype or IS_ERR(alloc_tag_cttype))
> >
> >In all cases, alloc_tag_cttype is not allocated, and therefore
> >alloc_tag_top_users() should not attempt to acquire the semaphore.
> >
> >This leads to a crash on memory allocation failure by attempting to
> >acquire a non-existent semaphore:
> >
> > Oops: general protection fault, probably for non-canonical address 0xdffffc000000001b: 0000 [#3] SMP KASAN NOPTI
> > KASAN: null-ptr-deref in range [0x00000000000000d8-0x00000000000000df]
> > CPU: 2 UID: 0 PID: 1 Comm: systemd Tainted: G D 6.16.0-rc2 #1 VOLUNTARY
> > Tainted: [D]=DIE
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > RIP: 0010:down_read_trylock+0xaa/0x3b0
> > Code: d0 7c 08 84 d2 0f 85 a0 02 00 00 8b 0d df 31 dd 04 85 c9 75 29 48 b8 00 00 00 00 00 fc ff df 48 8d 6b 68 48 89 ea 48 c1 ea 03 <80> 3c 02 00 0f 85 88 02 00 00 48 3b 5b 68 0f 85 53 01 00 00 65 ff
> > RSP: 0000:ffff8881002ce9b8 EFLAGS: 00010016
> > RAX: dffffc0000000000 RBX: 0000000000000070 RCX: 0000000000000000
> > RDX: 000000000000001b RSI: 000000000000000a RDI: 0000000000000070
> > RBP: 00000000000000d8 R08: 0000000000000001 R09: ffffed107dde49d1
> > R10: ffff8883eef24e8b R11: ffff8881002cec20 R12: 1ffff11020059d37
> > R13: 00000000003fff7b R14: ffff8881002cec20 R15: dffffc0000000000
> > FS: 00007f963f21d940(0000) GS:ffff888458ca6000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007f963f5edf71 CR3: 000000010672c000 CR4: 0000000000350ef0
> > Call Trace:
> > <TASK>
> > codetag_trylock_module_list+0xd/0x20
> > alloc_tag_top_users+0x369/0x4b0
> > __show_mem+0x1cd/0x6e0
> > warn_alloc+0x2b1/0x390
> > __alloc_frozen_pages_noprof+0x12b9/0x21a0
> > alloc_pages_mpol+0x135/0x3e0
> > alloc_slab_page+0x82/0xe0
> > new_slab+0x212/0x240
> > ___slab_alloc+0x82a/0xe00
> > </TASK>
> >
> >As David Wang points out, this issue became easier to trigger after commit
> >780138b12381 ("alloc_tag: check mem_profiling_support in alloc_tag_init").
> >
> >Before the commit, the issue occurred only when allocating and
> >initializing alloc_tag_cttype failed, or when a memory allocation failed
> >before alloc_tag_init() was called. After the commit, it can be easily
> >triggered when memory profiling is compiled in but disabled at boot.
> >
> >To properly determine whether alloc_tag_init() has been called and
> >its data structures initialized, verify that alloc_tag_cttype is a valid
> >pointer before acquiring the semaphore. If the variable is NULL or an error
> >value, it has not been properly initialized. In such a case, just skip
> >and do not attempt to acquire the semaphore.
> >
> >Reported-by: kernel test robot <oliver.sang@intel.com>
> >Closes: https://lore.kernel.org/oe-lkp/202506181351.bba867dd-lkp@intel.com
> >Closes: https://lore.kernel.org/oe-lkp/202506131711.5b41931c-lkp@intel.com
> >Fixes: 780138b12381 ("alloc_tag: check mem_profiling_support in alloc_tag_init")
> >Fixes: 1438d349d16b ("lib: add memory allocations report in show_mem()")
> >Cc: stable@vger.kernel.org
> >Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> >---
> >
> >@Suren: I did not add another pr_warn() because every error path in
> >alloc_tag_init() already has pr_err().
> >
> >v2 -> v3:
> >- Added another Closes: tag (David)
> >- Moved the condition into a standalone if block for better readability
> > (Suren)
> >- Typo fix (Suren)
> >
> > lib/alloc_tag.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> >diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> >index 41ccfb035b7b..e9b33848700a 100644
> >--- a/lib/alloc_tag.c
> >+++ b/lib/alloc_tag.c
> >@@ -127,6 +127,9 @@ size_t alloc_tag_top_users(struct codetag_bytes *tags, size_t count, bool can_sl
> > struct codetag_bytes n;
> > unsigned int i, nr = 0;
> >
> >+ if (IS_ERR_OR_NULL(alloc_tag_cttype))
>
> Should a warning be added here, indicating that the codetag module is not ready yet and that the memory allocation failure happened during boot?
> if (mem_profiling_support) pr_warn("...

I think you're saying we should print a warning when alloc tagging
can't provide "top users".

There are three different reasons why it can't provide them:

  1) alloc_tag_cttype is not ready yet, or mem profiling is disabled.
  2) the context can't sleep and trylock failed.
  3) no alloc tags exist.

I think that makes sense, but it would be a new feature (a separate
patch) rather than -stable material.

If you're interested in doing this, please feel free to proceed.

It would look like this:

sh invoked oom-killer: gfp_mask=0x140dca(GFP_HIGHUSER_MOVABLE|_0
[... snip ...]
Mem-Info:
active_anon:467412 inactive_anon:0 isolated_anon:0
active_file:0 inactive_file:0 isolated_file:0
unevictable:0 dirty:0 writeback:0
slab_reclaimable:872 slab_unreclaimable:4769
mapped:833 shmem:49252 pagetables:930
sec_pagetables:0 bounce:0
kernel_misc_reclaimable:0
free:15416 free_pcp:7714 free_cma:0
[... snip ...]
0 pages in swap cache
Free swap = 0kB
Total swap = 0kB
524158 pages RAM
0 pages HighMem/MovableOnly
22064 pages reserved
0 pages cma reserved
0 pages hwpoisoned
No memory allocation info available: Alloc tagging subsystem not initialized || Context cannot sleep || No alloc tags recorded
[ pid ] uid tgid total_vm rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
[ 105] 0 105 3065 1003 402 0 601 65536 0 0 systemd-udevd
[ 171] 0 171 775610 418124 417661 0 463 3416064 0 0 sh
oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0-1,global_oom,task_memcg=/,task=sh,pid=171,uid=0
Out of memory: Killed process 171 (sh) total-vm:3102440kB, anon-rss:1670644kB, file-rss:0kB, shmem-rss:1852kB, UID:0 pgtables:3336kB oom_score_adj:0
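
If one wanted to report which of the three cases applies instead of
printing all of them, the decision could be sketched roughly as below.
This is a userspace model, not kernel code and not part of this patch:
struct fake_cttype, enum top_users_status, classify_top_users() and
top_users_reason() are hypothetical names made up for illustration, and
is_err_or_null() only imitates the kernel's IS_ERR_OR_NULL() convention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace imitation of the kernel's error-pointer convention:
 * the top MAX_ERRNO addresses encode negative errno values. */
#define MAX_ERRNO 4095

static bool is_err_or_null(const void *ptr)
{
	return ptr == NULL || (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical reason codes; none of these names exist in the kernel. */
enum top_users_status {
	TOP_USERS_OK,
	TOP_USERS_NOT_INITIALIZED,	/* cttype is NULL or an error pointer */
	TOP_USERS_LOCK_BUSY,		/* cannot sleep and trylock failed */
	TOP_USERS_NO_TAGS,		/* lock taken, but nothing recorded */
};

/* Hypothetical stand-in for the state alloc_tag_top_users() relies on. */
struct fake_cttype {
	bool lock_held;		/* models the lock being held by someone else */
	size_t nr_tags;		/* number of recorded alloc tags */
};

static enum top_users_status
classify_top_users(const struct fake_cttype *cttype, bool can_sleep)
{
	/* Must come first: nothing below may dereference a bad pointer. */
	if (is_err_or_null(cttype))
		return TOP_USERS_NOT_INITIALIZED;

	/* A sleeping caller would wait for the lock; only trylock can fail. */
	if (!can_sleep && cttype->lock_held)
		return TOP_USERS_LOCK_BUSY;

	if (cttype->nr_tags == 0)
		return TOP_USERS_NO_TAGS;

	return TOP_USERS_OK;
}

/* Returns the text to print, or NULL when top users are available. */
static const char *top_users_reason(enum top_users_status st)
{
	switch (st) {
	case TOP_USERS_OK:
		return NULL;
	case TOP_USERS_NOT_INITIALIZED:
		return "Alloc tagging subsystem not initialized";
	case TOP_USERS_LOCK_BUSY:
		return "Context cannot sleep";
	case TOP_USERS_NO_TAGS:
		return "No alloc tags recorded";
	}
	assert(0 && "unknown status");
	return NULL;
}
```

The ordering is the point of the sketch: the pointer check has to run
before anything touches the lock, which is the same ordering the
IS_ERR_OR_NULL() check in this patch enforces.
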
> >+ return 0;
> >+
> > if (can_sleep)
> > codetag_lock_module_list(alloc_tag_cttype, true);
> > else if (!codetag_trylock_module_list(alloc_tag_cttype))
> >--
> >2.43.0
--
Cheers,
Harry / Hyeonggon