linux-mm.kvack.org archive mirror
From: Baoquan He <bhe@redhat.com>
To: Andrey Konovalov <andreyknvl@gmail.com>
Cc: kasan-dev@googlegroups.com, ryabinin.a.a@gmail.com,
	glider@google.com, dvyukov@google.com, vincenzo.frascino@arm.com,
	linux-mm@kvack.org,
	Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Subject: Re: System is broken in KASAN sw_tags mode during bootup
Date: Sat, 13 Sep 2025 16:18:46 +0800
Message-ID: <aMUo5rXaXOU2nNh4@MiWiFi-R3L-srv>
In-Reply-To: <CA+fCnZebyMgWWEOW_ZxiGwnkiXqXX6XK5NJv-uWXAxdN+JxsSw@mail.gmail.com>

On 09/06/25 at 07:23pm, Andrey Konovalov wrote:
> On Mon, Aug 18, 2025 at 1:16 PM Baoquan He <bhe@redhat.com> wrote:
> >
> > Hi,
> >
> > This can be reproduced stably on hpe-apollo arm64 system with the latest
> > upstream kernel. I have this system at hand now, the boot log and kernel
> > config are attached for reference.
> >
> > [   89.257633] ==================================================================
> > [   89.257646] BUG: KASAN: invalid-access in pcpu_alloc_noprof+0x42c/0x9a8
> > [   89.257672] Write of size 528 at addr ddfffd7fbdc00000 by task systemd/1
> > [   89.257685] Pointer tag: [dd], memory tag: [ca]
> > [   89.257692]
> > [   89.257703] CPU: 108 UID: 0 PID: 1 Comm: systemd Not tainted 6.17.0-rc2 #1 PREEMPT(voluntary)
> > [   89.257719] Hardware name: HPE Apollo 70             /C01_APACHE_MB         , BIOS L50_5.13_1.16 07/29/2020
> > [   89.257726] Call trace:
> > [   89.257731]  show_stack+0x30/0x90 (C)
> > [   89.257753]  dump_stack_lvl+0x7c/0xa0
> > [   89.257769]  print_address_description.isra.0+0x90/0x2b8
> > [   89.257789]  print_report+0x120/0x208
> > [   89.257804]  kasan_report+0xc8/0x110
> > [   89.257823]  kasan_check_range+0x7c/0xa0
> > [   89.257835]  __asan_memset+0x30/0x68
> > [   89.257847]  pcpu_alloc_noprof+0x42c/0x9a8
> > [   89.257859]  mem_cgroup_alloc+0x2bc/0x560
> > [   89.257873]  mem_cgroup_css_alloc+0x78/0x780
> > [   89.257893]  cgroup_apply_control_enable+0x230/0x578
> > [   89.257914]  cgroup_mkdir+0xf0/0x330
> > [   89.257928]  kernfs_iop_mkdir+0xb0/0x120
> > [   89.257947]  vfs_mkdir+0x250/0x380
> > [   89.257965]  do_mkdirat+0x254/0x298
> > [   89.257979]  __arm64_sys_mkdirat+0x80/0xc0
> > [   89.257994]  invoke_syscall.constprop.0+0x88/0x148
> > [   89.258011]  el0_svc_common.constprop.0+0x78/0x148
> > [   89.258025]  do_el0_svc+0x38/0x50
> > [   89.258037]  el0_svc+0x3c/0x168
> > [   89.258050]  el0t_64_sync_handler+0xa0/0xf0
> > [   89.258063]  el0t_64_sync+0x1b0/0x1b8
> > [   89.258076]
> > [   89.258080] The buggy address belongs to a 0-page vmalloc region starting at 0xcafffd7fbdc00000 allocated at pcpu_get_vm_areas+0x0/0x1da0
> > [   89.258111] The buggy address belongs to the physical page:
> > [   89.258117] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x881ddac
> > [   89.258129] flags: 0xa5c00000000000(node=1|zone=2|kasantag=0x5c)
> > [   89.258148] raw: 00a5c00000000000 0000000000000000 dead000000000122 0000000000000000
> > [   89.258160] raw: 0000000000000000 f3ff000813efa600 00000001ffffffff 0000000000000000
> > [   89.258168] raw: 00000000000fffff 0000000000000000
> > [   89.258173] page dumped because: kasan: bad access detected
> > [   89.258178]
> > [   89.258181] Memory state around the buggy address:
> > [   89.258192] Unable to handle kernel paging request at virtual address ffff7fd7fbdbffe0
> > [   89.258199] KASAN: probably wild-memory-access in range [0xfffffd7fbdbffe00-0xfffffd7fbdbffe0f]
> > [   89.258207] Mem abort info:
> > [   89.258211]   ESR = 0x0000000096000007
> > [   89.258216]   EC = 0x25: DABT (current EL), IL = 32 bits
> > [   89.258223]   SET = 0, FnV = 0
> > [   89.258228]   EA = 0, S1PTW = 0
> > [   89.258232]   FSC = 0x07: level 3 translation fault
> > [   89.258238] Data abort info:
> > [   89.258241]   ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
> > [   89.258246]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> > [   89.258252]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> > [   89.258260] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000008ff8b8f000
> > [   89.258267] [ffff7fd7fbdbffe0] pgd=1000008ff0275403, p4d=1000008ff0275403, pud=1000008ff0274403, pmd=1000000899079403, pte=0000000000000000
> > [   89.258296] Internal error: Oops: 0000000096000007 [#1]  SMP
> > [   89.540859] Modules linked in: i2c_dev
> > [   89.544619] CPU: 108 UID: 0 PID: 1 Comm: systemd Not tainted 6.17.0-rc2 #1 PREEMPT(voluntary)
> > [   89.553234] Hardware name: HPE Apollo 70             /C01_APACHE_MB         , BIOS L50_5.13_1.16 07/29/2020
> > [   89.562970] pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [   89.569933] pc : __pi_memcpy_generic+0x24/0x230
> > [   89.574472] lr : kasan_metadata_fetch_row+0x20/0x30
> > [   89.579350] sp : ffff8000859d76c0
> > [   89.582660] x29: ffff8000859d76c0 x28: 0000000000000100 x27: ffff008ec626d800
> > [   89.589807] x26: 0000000000000210 x25: 0000000000000000 x24: fffffd7fbdbfff00
> > [   89.596952] x23: ffff8000826cbeb8 x22: fffffd7fbdc00000 x21: 00000000fffffffe
> > [   89.604097] x20: ffff800082682ee0 x19: fffffd7fbdbffe00 x18: 00000000049016ff
> > [   89.611242] x17: 3030303030303030 x16: 2066666666666666 x15: 6631303030303030
> > [   89.618386] x14: 0000000000000001 x13: 0000000000000001 x12: 0000000000000001
> > [   89.625530] x11: 687420646e756f72 x10: 0000000000000020 x9 : 0000000000000000
> > [   89.632674] x8 : ffff78000859d766 x7 : 0000000000000000 x6 : 000000000000003a
> > [   89.639818] x5 : ffff8000859d7728 x4 : ffff7fd7fbdbfff0 x3 : efff800000000000
> > [   89.646963] x2 : 0000000000000010 x1 : ffff7fd7fbdbffe0 x0 : ffff8000859d7718
> > [   89.654107] Call trace:
> > [   89.656549]  __pi_memcpy_generic+0x24/0x230 (P)
> > [   89.661086]  print_report+0x180/0x208
> > [   89.664753]  kasan_report+0xc8/0x110
> > [   89.668333]  kasan_check_range+0x7c/0xa0
> > [   89.672258]  __asan_memset+0x30/0x68
> > [   89.675836]  pcpu_alloc_noprof+0x42c/0x9a8
> > [   89.679935]  mem_cgroup_alloc+0x2bc/0x560
> > [   89.683947]  mem_cgroup_css_alloc+0x78/0x780
> > [   89.688222]  cgroup_apply_control_enable+0x230/0x578
> > [   89.693191]  cgroup_mkdir+0xf0/0x330
> > [   89.696771]  kernfs_iop_mkdir+0xb0/0x120
> > [   89.700697]  vfs_mkdir+0x250/0x380
> > [   89.704103]  do_mkdirat+0x254/0x298
> > [   89.707596]  __arm64_sys_mkdirat+0x80/0xc0
> > [   89.711697]  invoke_syscall.constprop.0+0x88/0x148
> > [   89.716491]  el0_svc_common.constprop.0+0x78/0x148
> > [   89.721286]  do_el0_svc+0x38/0x50
> > [   89.724602]  el0_svc+0x3c/0x168
> > [   89.727746]  el0t_64_sync_handler+0xa0/0xf0
> > [   89.731933]  el0t_64_sync+0x1b0/0x1b8
> > [   89.735603] Code: f100805f 540003c8 f100405f 540000c3 (a9401c26)
> > [   89.741695] ---[ end trace 0000000000000000 ]---
> > [   89.746308] note: systemd[1] exi
> > =========================
> 
> Might be the same issue as the one being fixed by Maciej here:
> 
> https://lore.kernel.org/all/bcf18f220ef3b40e02f489fdb90fc7a5a153a383.1756151769.git.maciej.wieczor-retman@intel.com/
> https://lore.kernel.org/all/3339d11e69c9127108fe8ef80a069b7b3bb07175.1756151769.git.maciej.wieczor-retman@intel.com/
> 
> Perhaps it makes sense to split that fix out of the series and submit
> separately.

Thanks for the information. I finally got a machine to reproduce the
issue and test the patches. Oddly, the problem could not be reproduced
at first on the latest 6.17.0-rc5+; I am not sure whether I got a step
wrong. After starting over, I can reproduce it reliably, and I can
confirm that Maciej's two patches fix it.
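
For reference, the addresses in the quoted report are consistent with
the usual sw_tags shadow arithmetic. The sketch below is only an
illustration in Python, not kernel code: the helper names are made up
to mirror the kernel's, the shadow offset is assumed to be the value
seen in x3 of the register dump, and the scale shift of 4 is assumed
from sw_tags using one shadow byte per 16 bytes of memory.

```python
# Illustration only: helper names are hypothetical, constants are assumptions.
KASAN_SHADOW_SCALE_SHIFT = 4               # assumed: 16 bytes per shadow byte
KASAN_SHADOW_OFFSET = 0xEFFF800000000000   # assumed: value seen in x3
MASK64 = (1 << 64) - 1


def pointer_tag(addr: int) -> int:
    """Return the top byte of a 64-bit address, where sw_tags keeps the tag."""
    return (addr >> 56) & 0xFF


def reset_tag(addr: int) -> int:
    """Set the top byte to 0xff, the native kernel tag."""
    return addr | (0xFF << 56)


def mem_to_shadow(addr: int) -> int:
    """Map an address to its shadow byte: (addr >> shift) + offset."""
    shifted = reset_tag(addr) >> KASAN_SHADOW_SCALE_SHIFT
    return (shifted + KASAN_SHADOW_OFFSET) & MASK64


if __name__ == "__main__":
    buggy = 0xDDFFFD7FBDC00000      # "Write of size 528 at addr ..."
    row_start = 0xFFFFFD7FBDBFFE00  # start of the "wild-memory-access" range

    print(f"pointer tag: {pointer_tag(buggy):02x}")          # dd
    print(f"shadow of row: {mem_to_shadow(row_start):016x}")  # ffff7fd7fbdbffe0
```

The second value equals the faulting address in the oops
(ffff7fd7fbdbffe0), which suggests the second fault happened while the
report code was reading shadow memory for the rows just below the
buggy address.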

Will reply to Maciej's patches to add my Tested-by.

Thanks
Baoquan


