From: Dan Williams <dan.j.williams@intel.com>
To: Michal Hocko <mhocko@suse.com>
Cc: David Hildenbrand <david@redhat.com>,
Linux MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
Oscar Salvador <osalvador@suse.de>
Subject: Re: uninitialized pmem struct pages
Date: Tue, 5 Jan 2021 00:27:34 -0800
Message-ID: <CAPcyv4jKKWqjgdpi3yiPCaFdfHYzPDrgAc1YvELEPogD3go2PA@mail.gmail.com>
In-Reply-To: <20210105081654.GU13207@dhcp22.suse.cz>
On Tue, Jan 5, 2021 at 12:17 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Tue 05-01-21 09:01:00, Michal Hocko wrote:
> > On Mon 04-01-21 16:44:52, David Hildenbrand wrote:
> > > On 04.01.21 16:43, David Hildenbrand wrote:
> > > > On 04.01.21 16:33, Michal Hocko wrote:
> > > >> On Mon 04-01-21 16:15:23, David Hildenbrand wrote:
> > > >>> On 04.01.21 16:10, Michal Hocko wrote:
> > > >> [...]
> > > >>> Do the physical addresses you see fall into the same section as boot
> > > >>> memory? Or what's around these addresses?
> > > >>
> > > >> Yes I am getting a garbage for the first struct page belonging to the
> > > >> pmem section [1]
> > > >> [ 0.020161] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x603fffffff]
> > > >> [ 0.020163] ACPI: SRAT: Node 4 PXM 4 [mem 0x6060000000-0x11d5fffffff] non-volatile
> > > >>
> > > >> The pfn without the initialized struct page is 0x6060000. This is a
> > > >> first pfn in a section.
> > > >
> > > > Okay, so we're not dealing with the "early section" mess I described,
> > > > different story.
> > > >
> > > > Due to [1], is_mem_section_removable() called
> > > > pfn_to_page(PHYS_PFN(0x6060000)). page_zone(page) made it crash, as not
> > > > initialized.
> > > >
> > > > Let's assume this is indeed a reserved pfn in the altmap. What's the
> > > > actual address of the memmap?
> > > >
> > > > I do wonder what hosts pfn_to_page(PHYS_PFN(0x6060000)) - is it actually
> > > > part of the actual altmap (i.e. > 0x6060000) or maybe even self-hosted?
> > > >
> > > > If it's not self-hosted, initializing the relevant memmaps should work
> > > > just fine I guess. Otherwise things get more complicated.
> > >
> > > Oh, I forgot: pfn_to_online_page() should at least in your example make
> > > sure other pfn walkers are safe. It was just an issue of
> > > is_mem_section_removable().
> >
> > Hmm, I suspect you are right. I haven't put this together, thanks! The memory
> > section is indeed marked offline so pfn_to_online_page would indeed bail
> > out:
> > crash> p (0x6060000>>15)
> > $3 = 3084
> > crash> p mem_section[3084/128][3084 & 127]
> > $4 = {
> > section_mem_map = 18446736128020054019,
> > usage = 0xffff902dcf956680,
> > page_ext = 0x0,
> > pad = 0
> > }
> > crash> p 18446736128020054019 & (1UL<<2)
> > $5 = 0
> >
> > That makes it considerably less of a problem than I thought!
>
> Forgot to add that those who are running kernels without 53cdc1cb29e8
> ("drivers/base/memory.c: indicate all memory blocks as removable") for
> some reason can fix the crash by the following simple patch.
>
> Index: linux-5.3-users_mhocko_SLE15-SP2_for-next/drivers/base/memory.c
> ===================================================================
> --- linux-5.3-users_mhocko_SLE15-SP2_for-next.orig/drivers/base/memory.c
> +++ linux-5.3-users_mhocko_SLE15-SP2_for-next/drivers/base/memory.c
> @@ -152,9 +152,14 @@ static ssize_t removable_show(struct dev
> goto out;
>
> for (i = 0; i < sections_per_block; i++) {
> - if (!present_section_nr(mem->start_section_nr + i))
> + unsigned long nr = mem->start_section_nr + i;
> + if (!present_section_nr(nr))
> continue;
> - pfn = section_nr_to_pfn(mem->start_section_nr + i);
> + if (!online_section_nr()) {

I assume that's online_section_nr(nr) in the version that compiles?

This makes sense because the memory block size is larger than the
section size. I suspect you have 1GB memory block size on this system,
but since the System RAM and PMEM collide at a 512MB alignment in a
memory block you end up walking the back end of the last 512MB of the
System RAM memory block and run into the offline PMEM section.
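
To make the arithmetic concrete, here is a rough userspace sketch, not
kernel code. It assumes x86-64 with 4 KiB pages and 128 MiB sections
(which matches the ">> 15" in the crash session above) and takes the
memory block size as a parameter, since 1GB is only my guess. All of
the ex_* names are made up for illustration.

```c
#include <stdint.h>

/* Assumed geometry: 4 KiB pages, 128 MiB sections (x86-64 defaults). */
#define EX_PAGE_SHIFT        12
#define EX_SECTION_SIZE_BITS 27
#define EX_PFN_SECTION_SHIFT (EX_SECTION_SIZE_BITS - EX_PAGE_SHIFT)

/* Section number that contains a given pfn. */
static uint64_t ex_pfn_to_section_nr(uint64_t pfn)
{
	return pfn >> EX_PFN_SECTION_SHIFT;
}

/* Number of sections spanned by one memory block (power-of-two size). */
static uint64_t ex_sections_per_block(uint64_t block_size)
{
	return block_size >> EX_SECTION_SIZE_BITS;
}

/* Physical start of the memory block that contains phys. */
static uint64_t ex_block_start(uint64_t phys, uint64_t block_size)
{
	return phys & ~(block_size - 1);
}

/* Byte offset of phys within its memory block. */
static uint64_t ex_offset_in_block(uint64_t phys, uint64_t block_size)
{
	return phys & (block_size - 1);
}
```

With a 1GB block size, ex_offset_in_block(0x6060000000, 1UL << 30) is
512MB, so the first PMEM section is the fifth of the eight sections in
its memory block rather than the first.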

So, I don't think it's pfn_to_online_page that necessarily needs to
know how to disambiguate each page; it's the things that walk sections
and memory blocks and expect them to be consistent over the span.
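
As a rough illustration of what a per-section check in such a walker
could look like, here is a small userspace model, not the actual
patch. The flag bits are my assumption about the section_mem_map
encoding; the online bit is the same (1UL<<2) tested in the crash
session above. The ex_* names are hypothetical, and skipping offline
sections is just one possible policy.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed low flag bits of section_mem_map. */
#define EX_SECTION_MARKED_PRESENT (1ULL << 0)
#define EX_SECTION_HAS_MEM_MAP    (1ULL << 1)
#define EX_SECTION_IS_ONLINE      (1ULL << 2)

/* Minimal stand-in for struct mem_section: only the encoded memmap word. */
struct ex_section {
	uint64_t section_mem_map;
};

static bool ex_present_section(const struct ex_section *s)
{
	return s->section_mem_map & EX_SECTION_MARKED_PRESENT;
}

static bool ex_online_section(const struct ex_section *s)
{
	return s->section_mem_map & EX_SECTION_IS_ONLINE;
}

/*
 * Walk the n sections of one memory block and count those whose struct
 * pages are safe to inspect. Sections that are absent or present but
 * offline are skipped, because their memmap may be uninitialized.
 */
static size_t ex_count_walkable(const struct ex_section *sec, size_t n)
{
	size_t walkable = 0;

	for (size_t i = 0; i < n; i++) {
		if (!ex_present_section(&sec[i]))
			continue;
		if (!ex_online_section(&sec[i]))
			continue;
		walkable++;
	}
	return walkable;
}
```

The point is that the online check is made for every section in the
span, not once for the memory block as a whole.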