From: Jan Stancek <jstancek@redhat.com>
To: Mike Rapoport <rppt@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>,
linux-mm@kvack.org,
Linux ARM <linux-arm-kernel@lists.infradead.org>,
Jonathan.Cameron@huawei.com, dan.j.williams@intel.com,
David Hildenbrand <david@redhat.com>,
linux-tegra@vger.kernel.org,
Thierry Reding <thierry.reding@gmail.com>,
Jonathan Hunter <jonathanh@nvidia.com>
Subject: Re: [bug] aarch64 host no longer boots after 767507654c22 ("arch_numa: switch over to numa_memblks")
Date: Tue, 29 Oct 2024 22:03:31 +0100
Message-ID: <CAASaF6x8cMJUTYpm0dedkh9boXgVaL9WvxaG-r+aHY+6QOpw6Q@mail.gmail.com>
In-Reply-To: <ZyELRjoDLmHeVvSR@kernel.org>
On Tue, Oct 29, 2024 at 5:24 PM Mike Rapoport <rppt@kernel.org> wrote:
>
> On Tue, Oct 29, 2024 at 04:43:39PM +0100, Jan Stancek wrote:
> > On Tue, Oct 29, 2024 at 4:07 PM Zi Yan <ziy@nvidia.com> wrote:
> > >
> > > +tegra mailing list and maintainers
> > >
> > > On 29 Oct 2024, at 8:47, Jan Stancek wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm seeing a regression on Nvidia IGX system, which no longer boots.
> > > >
> > > > bisect points at commit 767507654c22 ("arch_numa: switch over to numa_memblks").
> > > > It hangs very early, with 4k or 64k pages, with no kernel messages printed:
> > > >
> > > > EFI stub: Booting Linux Kernel...
> > > > EFI stub: Using DTB from configuration table
> > > > EFI stub: Exiting boot services...
> > > > <hangs here>
> > > >
> > >
> > > Is it possible to have earlycon output? It is hard to debug without any
> > > information except kernel fails to boot.
> >
> > I know it was a long shot, so far I haven't had luck getting it to work.
>
> Does it boot with numa=off and numa=fake?
No, it doesn't.
>
> In the log from successful boot it seems there is no NUMA information in
> the device tree, can you send the device tree as well please?
https://people.redhat.com/jstancek/aarch64_numa_boot/device_tree
Regards,
Jan
>
> > > Since the previous commit boots and I assume both kernels are compiled
> > > with the same gcc toolchain, this should not be caused by the binutils
> > > bug in 2.42[1]. Is your binutils version 2.42?
> >
> > Yes, both are compiled locally, with binutils 2.41
> >
> > >
> > > Thanks.
> > >
> > >
> > > [1] https://sourceware.org/bugzilla/show_bug.cgi?id=31924
> > >
> > > > Here's a log from successful boot with previous commit:
> > > > https://people.redhat.com/jstancek/aarch64_numa_boot/console-log-good.txt
> > > > and config: https://people.redhat.com/jstancek/aarch64_numa_boot/config
> > > >
> > > > # lscpu
> > > > Architecture: aarch64
> > > > CPU op-mode(s): 32-bit, 64-bit
> > > > Byte Order: Little Endian
> > > > CPU(s): 12
> > > > On-line CPU(s) list: 0-11
> > > > Vendor ID: ARM
> > > > BIOS Vendor ID: NVIDIA
> > > > Model name: Cortex-A78AE
> > > > BIOS Model name: Not Specified Not Specified CPU @ 0.0GHz
> > > > BIOS CPU family: 257
> > > > Model: 1
> > > > Thread(s) per core: 1
> > > > Core(s) per cluster: 12
> > > > Socket(s): 1
> > > > Cluster(s): 1
> > > > Stepping: r0p1
> > > > CPU(s) scaling MHz: 100%
> > > > CPU max MHz: 1971.2000
> > > > CPU min MHz: 115.2000
> > > > BogoMIPS: 62.50
> > > > Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32
> > > > atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp uscat ilrcpc
> > > > flagm paca pacg
> > > > Caches (sum of all):
> > > > L1d: 768 KiB (12 instances)
> > > > L1i: 768 KiB (12 instances)
> > > > L2: 3 MiB (12 instances)
> > > > L3: 6 MiB (3 instances)
> > > > NUMA:
> > > > NUMA node(s): 1
> > > > NUMA node0 CPU(s): 0-11
> > > > Vulnerabilities:
> > > > Gather data sampling: Not affected
> > > > Itlb multihit: Not affected
> > > > L1tf: Not affected
> > > > Mds: Not affected
> > > > Meltdown: Not affected
> > > > Mmio stale data: Not affected
> > > > Reg file data sampling: Not affected
> > > > Retbleed: Not affected
> > > > Spec rstack overflow: Not affected
> > > > Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
> > > > Spectre v1: Mitigation; __user pointer sanitization
> > > > Spectre v2: Mitigation; CSV2, BHB
> > > > Srbds: Not affected
> > > > Tsx async abort: Not affected
> > > >
> > > > Regards,
> > > > Jan
> > >
> > >
> > > Best Regards,
> > > Yan, Zi
> > >
> >
>
> --
> Sincerely yours,
> Mike.
>