linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Suren Baghdasaryan <surenb@google.com>
To: David Wang <00107082@163.com>
Cc: Harry Yoo <harry.yoo@oracle.com>,
	akpm@linux-foundation.org,  kent.overstreet@linux.dev,
	oliver.sang@intel.com, cachen@purestorage.com,
	 linux-mm@kvack.org, oe-lkp@lists.linux.dev,
	stable@vger.kernel.org
Subject: Re: [PATCH v3] lib/alloc_tag: do not acquire non-existent lock in alloc_tag_top_users()y
Date: Tue, 24 Jun 2025 08:04:43 -0700	[thread overview]
Message-ID: <CAJuCfpHVP16=uvE8OHpUo5+k8EOGo=fzHUZffTVbDoYW92L_EQ@mail.gmail.com> (raw)
In-Reply-To: <44eb4892.b434.197a2681c42.Coremail.00107082@163.com>

On Tue, Jun 24, 2025 at 7:47 AM David Wang <00107082@163.com> wrote:
>
>
> At 2025-06-24 22:24:33, "Harry Yoo" <harry.yoo@oracle.com> wrote:
> >On Tue, Jun 24, 2025 at 09:15:55PM +0800, David Wang wrote:
> >>
> >> At 2025-06-24 19:28:18, "David Wang" <00107082@163.com> wrote:
> >> >
> >> >At 2025-06-24 18:59:52, "Harry Yoo" <harry.yoo@oracle.com> wrote:
> >> >>On Tue, Jun 24, 2025 at 05:30:02PM +0800, David Wang wrote:
> >> >>>
> >> >>> At 2025-06-24 17:09:54, "Harry Yoo" <harry.yoo@oracle.com> wrote:
> >> >>> >On Tue, Jun 24, 2025 at 04:21:23PM +0800, David Wang wrote:
> >> >>> >> At 2025-06-24 15:25:13, "Harry Yoo" <harry.yoo@oracle.com> wrote:
> >> >>> >> >alloc_tag_top_users() attempts to lock alloc_tag_cttype->mod_lock
> >> >>> >> >even when the alloc_tag_cttype is not allocated because:
> >> >>> >> >
> >> >>> >> >  1) alloc tagging is disabled because mem profiling is disabled
> >> >>> >> >     (!alloc_tag_cttype)
> >> >>> >> >  2) alloc tagging is enabled, but not yet initialized (!alloc_tag_cttype)
> >> >>> >> >  3) alloc tagging is enabled, but failed initialization
> >> >>> >> >     (!alloc_tag_cttype or IS_ERR(alloc_tag_cttype))
> >> >>> >> >
> >> >>> >> >In all cases, alloc_tag_cttype is not allocated, and therefore
> >> >>> >> >alloc_tag_top_users() should not attempt to acquire the semaphore.
> >> >>> >> >
> >> >>> >> >This leads to a crash on memory allocation failure by attempting to
> >> >>> >> >acquire a non-existent semaphore:
> >> >>> >> >
> >> >>> >> >  Oops: general protection fault, probably for non-canonical address 0xdffffc000000001b: 0000 [#3] SMP KASAN NOPTI
> >> >>> >> >  KASAN: null-ptr-deref in range [0x00000000000000d8-0x00000000000000df]
> >> >>> >> >  CPU: 2 UID: 0 PID: 1 Comm: systemd Tainted: G      D             6.16.0-rc2 #1 VOLUNTARY
> >> >>> >> >  Tainted: [D]=DIE
> >> >>> >> >  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> >> >>> >> >  RIP: 0010:down_read_trylock+0xaa/0x3b0
> >> >>> >> >  Code: d0 7c 08 84 d2 0f 85 a0 02 00 00 8b 0d df 31 dd 04 85 c9 75 29 48 b8 00 00 00 00 00 fc ff df 48 8d 6b 68 48 89 ea 48 c1 ea 03 <80> 3c 02 00 0f 85 88 02 00 00 48 3b 5b 68 0f 85 53 01 00 00 65 ff
> >> >>> >> >  RSP: 0000:ffff8881002ce9b8 EFLAGS: 00010016
> >> >>> >> >  RAX: dffffc0000000000 RBX: 0000000000000070 RCX: 0000000000000000
> >> >>> >> >  RDX: 000000000000001b RSI: 000000000000000a RDI: 0000000000000070
> >> >>> >> >  RBP: 00000000000000d8 R08: 0000000000000001 R09: ffffed107dde49d1
> >> >>> >> >  R10: ffff8883eef24e8b R11: ffff8881002cec20 R12: 1ffff11020059d37
> >> >>> >> >  R13: 00000000003fff7b R14: ffff8881002cec20 R15: dffffc0000000000
> >> >>> >> >  FS:  00007f963f21d940(0000) GS:ffff888458ca6000(0000) knlGS:0000000000000000
> >> >>> >> >  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> >>> >> >  CR2: 00007f963f5edf71 CR3: 000000010672c000 CR4: 0000000000350ef0
> >> >>> >> >  Call Trace:
> >> >>> >> >   <TASK>
> >> >>> >> >   codetag_trylock_module_list+0xd/0x20
> >> >>> >> >   alloc_tag_top_users+0x369/0x4b0
> >> >>> >> >   __show_mem+0x1cd/0x6e0
> >> >>> >> >   warn_alloc+0x2b1/0x390
> >> >>> >> >   __alloc_frozen_pages_noprof+0x12b9/0x21a0
> >> >>> >> >   alloc_pages_mpol+0x135/0x3e0
> >> >>> >> >   alloc_slab_page+0x82/0xe0
> >> >>> >> >   new_slab+0x212/0x240
> >> >>> >> >   ___slab_alloc+0x82a/0xe00
> >> >>> >> >   </TASK>
> >> >>> >> >
> >> >>> >> >As David Wang points out, this issue became easier to trigger after commit
> >> >>> >> >780138b12381 ("alloc_tag: check mem_profiling_support in alloc_tag_init").
> >> >>> >> >
> >> >>> >> >Before the commit, the issue occurred only when it failed to allocate
> >> >>> >> >and initialize alloc_tag_cttype or if a memory allocation fails before
> >> >>> >> >alloc_tag_init() is called. After the commit, it can be easily triggered
> >> >>> >> >when memory profiling is compiled but disabled at boot.
> >> >>> >> >
> >> >>> >> >To properly determine whether alloc_tag_init() has been called and
> >> >>> >> >its data structures initialized, verify that alloc_tag_cttype is a valid
> >> >>> >> >pointer before acquiring the semaphore. If the variable is NULL or an error
> >> >>> >> >value, it has not been properly initialized. In such a case, just skip
> >> >>> >> >and do not attempt to acquire the semaphore.
> >> >>> >> >
> >> >>> >> >Reported-by: kernel test robot <oliver.sang@intel.com>
> >> >>> >> >Closes: https://urldefense.com/v3/__https://lore.kernel.org/oe-lkp/202506181351.bba867dd-lkp@intel.com__;!!ACWV5N9M2RV99hQ!PxJNKp4Bj6h0XIWpRXcmFeIz51jORtRRAo1j23ZnRgvTm0E0Mp5l6UrLNCkiHww6AVWOSfbDDdBwKgJ9_Q$
> >> >>> >> >Closes: https://urldefense.com/v3/__https://lore.kernel.org/oe-lkp/202506131711.5b41931c-lkp@intel.com__;!!ACWV5N9M2RV99hQ!PxJNKp4Bj6h0XIWpRXcmFeIz51jORtRRAo1j23ZnRgvTm0E0Mp5l6UrLNCkiHww6AVWOSfbDDdC-7OiUsg$
> >> >>> >> >Fixes: 780138b12381 ("alloc_tag: check mem_profiling_support in alloc_tag_init")
> >> >>> >> >Fixes: 1438d349d16b ("lib: add memory allocations report in show_mem()")
> >> >>> >> >Cc: stable@vger.kernel.org
> >> >>> >> >Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> >> >>> >> >---
> >> >>> >> >
> >> >>> >> >@Suren: I did not add another pr_warn() because every error path in
> >> >>> >> >alloc_tag_init() already has pr_err().
> >> >>> >> >
> >> >>> >> >v2 -> v3:
> >> >>> >> >- Added another Closes: tag (David)
> >> >>> >> >- Moved the condition into a standalone if block for better readability
> >> >>> >> >  (Suren)
> >> >>> >> >- Typo fix (Suren)
> >> >>> >> >
> >> >>> >> > lib/alloc_tag.c | 3 +++
> >> >>> >> > 1 file changed, 3 insertions(+)
> >> >>> >> >
> >> >>> >> >diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> >> >>> >> >index 41ccfb035b7b..e9b33848700a 100644
> >> >>> >> >--- a/lib/alloc_tag.c
> >> >>> >> >+++ b/lib/alloc_tag.c
> >> >>> >> >@@ -127,6 +127,9 @@ size_t alloc_tag_top_users(struct codetag_bytes *tags, size_t count, bool can_sl
> >> >>> >> >     struct codetag_bytes n;
> >> >>> >> >     unsigned int i, nr = 0;
> >> >>> >> >
> >> >>> >> >+    if (IS_ERR_OR_NULL(alloc_tag_cttype))
> >> >>> >>
> >> >>> >> Should a warning  added here? indicating  codetag module not ready yet and the memory failure happened during boot:
> >> >>> >>  if (mem_profiling_support) pr_warn("...
> >> >>> >
> >> >>> >I think you're saying we need to print a warning when alloc tagging
> >> >>> >can't provide "top users".
> >> >>>
> >> >>> I just meant printing a warning when show_mem is needed before codetag module initialized,
> >> >>> as reported in https://urldefense.com/v3/__https://lore.kernel.org/oe-lkp/202506181351.bba867dd-lkp@intel.com/__;!!ACWV5N9M2RV99hQ!J2waTUro8owaYlpAZJ6fnrHZvcGMbY6qAO5QvvIGZzUv-ryWjCjhO-maTOolfpPvPSr6CpqOgkRalCwJow$
> >> >>> where mem_profiling_support is 1, but alloc_tag_cttype is still NULL.
> >> >>> This can tell we do have a memory failure during boot before codetag_init, even with memory profiling activated.
> >> >>
> >> >>Ok. You didn't mean that.
> >> >>
> >> >>But still I think it's better to handle all cases and print distinct
> >> >>warnings, rather than handling only the specific case where memory profiling
> >> >>is enabled but not yet initialized.
> >> >>
> >> >>Users will want to know why allocation information is not available,
> >> >>and there can be multiple reasons including the one you mentioned.
> >> >>
> >> >>What do you think?
> >> >
> >> >I am not sure....
> >> >I think most cases you mentioned is just a pr_info,  those are expected behavior or designed that way.
> >> >But when  mem_profiling_support==1 && alloc_tag_cttype==NULL, this is an unexpected behavior, which is a pr_warn.
> >>
> >> Put it in a clearer way, so far we have identified two "error" conditions:
> >> 1.  mem_profiling_support=1 but initialization for alloc_tag_cttype failed,  "alloc_tag_init() already has pr_err()", as you mentioned.
> >
> >Yes, and this is helpful because it is not expected to fail.
> >
> >> 2.  mem_profiling_support=1 , but codetag module have not been init yet.  I  suggested adding a pr_warn here.
> >
> >But in this case, I'm not sure what's the point of the pr_warn() is.
> >"Memory allocations are not expected fail before alloc_tag_init()"?
> >That's a weird assumption to write as code. I'd rather handle it
> >silently without informing the user.
> >
> >Yes, we've identified the error condition, but it’s not an error anymore
> >because this patch fixes it. If it's not an error, users don't need to
> >be aware of the case.
> >
> >I don't understand what makes this case special that the user needs to
> >be specifically informed about it, while they aren't informed when
> >memory allocation info is unavailable for other reasons.
> >As a user, I only care why there is no memory allocation info available.
>
> My point is just that we are not expecting anyone calls alloc_tag_top_users() before
> alloc_tag_init(), when that happened, measures, such as late_initcall if possible, can be taken
> to fix it. and a warning message is easier to catch.
> (This is not just for explaining why no memory profiling information shows up)

I wouldn't rule out the possibility of
show_mem()->alloc_tag_top_users() being called during early boot and
before alloc_tag_init(). Such a thing can happen during early boot,
however I would not call that an error and if that happens and
allocinfo data is missing from the report I don't think that would
surprise the user, so not sure it's worth detecting and reporting this
case.

>
> >
> >--
> >Cheers,
> >Harry / Hyeonggon


  reply	other threads:[~2025-06-24 15:05 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-24  7:25 [PATCH v3] lib/alloc_tag: do not acquire non-existent lock in alloc_tag_top_users() Harry Yoo
2025-06-24  8:21 ` David Wang
2025-06-24  9:09   ` [PATCH v3] lib/alloc_tag: do not acquire non-existent lock in alloc_tag_top_users()y Harry Yoo
2025-06-24  9:30     ` David Wang
2025-06-24 10:59       ` Harry Yoo
2025-06-24 11:28         ` David Wang
2025-06-24 13:15           ` David Wang
2025-06-24 14:24             ` Harry Yoo
2025-06-24 14:47               ` David Wang
2025-06-24 15:04                 ` Suren Baghdasaryan [this message]
2025-06-24 13:25 ` Re:[PATCH v3] lib/alloc_tag: do not acquire non-existent lock in alloc_tag_top_users() David Wang
2025-06-24 13:50   ` [PATCH " Harry Yoo
2025-06-24 14:00     ` David Wang
2025-06-24 14:13       ` Harry Yoo
2025-06-24 14:28         ` David Wang
2025-06-24 14:57           ` Suren Baghdasaryan
2025-06-24 15:14             ` Harry Yoo
2025-06-24 15:38               ` Suren Baghdasaryan
2025-06-24 18:00                 ` Harry Yoo
2025-06-25  0:52                   ` Suren Baghdasaryan
2025-06-24 15:23             ` David Wang
2025-06-30  6:25 ` Raghavendra K T

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJuCfpHVP16=uvE8OHpUo5+k8EOGo=fzHUZffTVbDoYW92L_EQ@mail.gmail.com' \
    --to=surenb@google.com \
    --cc=00107082@163.com \
    --cc=akpm@linux-foundation.org \
    --cc=cachen@purestorage.com \
    --cc=harry.yoo@oracle.com \
    --cc=kent.overstreet@linux.dev \
    --cc=linux-mm@kvack.org \
    --cc=oe-lkp@lists.linux.dev \
    --cc=oliver.sang@intel.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox