linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Mike Rapoport <rppt@kernel.org>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linux-mm@kvack.org, Alexander Potapenko <glider@google.com>,
	Marco Elver <elver@google.com>,
	Dmitry Vyukov <dvyukov@google.com>
Subject: Re: [PATCH] mm: Fix memblock_free_late() when using deferred struct page
Date: Mon, 16 Feb 2026 17:28:34 +0200	[thread overview]
Message-ID: <aZM3ota-JFJPsa1M@kernel.org> (raw)
In-Reply-To: <1e83d9c3cf11ba825237dbc7d6a70ba47ab328cc.camel@kernel.crashing.org>

On Mon, Feb 16, 2026 at 03:53:33PM +1100, Benjamin Herrenschmidt wrote:
> (stripping history)
> 
> So I went into a big refresher (or learning exercise since there's
> quite a bit here that I never really looked at before either).
> 
> So here is a break down, in chronological order, of the setup and
> initialization of the memory map, and how the reserve business
> interacts with it as I understand it from reading the code.
> 
> Please correct me if I missed or misunderstood something :-) 

Your description is correct. There are some things that moved from arch to
mm_core_init() just recently, but the order remains the same.

> Also maybe this is worth turning into a piece of doc ?

Care to send a patch? ;-)
 
> Then some conclusions (I think I know why the patches crashed).
 
(trimmed down the description)
 
>  * One thing I have NOT yet figured out ... do we have a problem if the
> page is in a hole that lands outside of a zone boundary ? I haven't
> really got my head deep down into the details of zone initializations
> (especially as we adjust the boundaries here or there), so this could
> be a problem.

Zones boundaries are determined by addressing constraints (DMA, DMA32,
NORMAL), node assignment and actual usable memory in memblock.memory.

For a machine like t3a.nano there will be ZONE_DMA up to 16M and ZONE_DMA32
from 16M till the end of the usable memory so there should be no problem
there :)

In the general case, I believe it is possible that some reserved memory
will not be covered by a zone, for example if the reserved memory is beyond
the end of the last zone.

I think that this is a really pathological case and we can dismiss it for
now. 
 
> 99) Conclusion :-)
> ------------------
> 
> Nothing firm yet here but a few hints at what could possibly go wrong
> and one obvious issue with the previous patch(es).
> 
> First the obvious ... the proposed patch that just makes
> memblock_free_late() call free_reserved_page() is missing a call to
> pfn_valid(). Without this, it can (and will) hit holes in the mem_map,
> and that's probably one of the crashes I reported.

It can, not sure it will, but we want to stay on the safe side :)
 
> Now, it would be nice to then go allocate those missing bits of
> mem_map, because I really don't want to give up on that memory. Small
> instances are a thing and with the current price of DRAM, a fairly
> relevant one :-) But I'll look at that later.
> 
> My original patch had the exact same issue btw.
> 
> The other potential issue, for which I welcome your input as I'm
> running short on time for the day is ... the impact to zones. I see a
> possibility for those pages to be outside of any zone's
> zone_start_pfn/spanned_pages range ... or not ? As I said, I didn't get
> my head yet around the zones init and spanning adjustments that
> happens, so I don't know if we really have potentially "holes" here or
> not.
> 
> This leads to the question... could we work around a lot of those
> issues easily by making the early efi_reserve_boot_services() *also*
> add the regions to memblock.memory in addition to memblock.reserve ?
> ie, those regions are marked as boot services code/data, so they must
> be memory to begin with, and that's all early enough that we can do it.

Adding the EFI boot services to memblock.memory would certainly solve both
potentially missing memory map and (unlikely) missing zone span.

More broadly, it would have been nice if e820/efi would add *anything* that
lives in DRAM to memblock.memory, but last time I tried I could not
convince x86 folks that even memory that's unusable to kernel is still
memory :)
 
> We should still add the missing pfn_valid() of course, if anything for
> the sake of any other caller of memblock_free_late() ... or we could
> change memblock_free_late() to only consider ranges that are both
> reserved *and* in memblock.memory. You mentioned that might be slow
> though.

memblock_free_late() can't consider memblock.memory and memblock.reserved
because it might be called after they were discarded. 
And pfn_valid() should protect against accesses to non-existent memory map.
 
> Opinions ?
> 
> Cheers,
> Ben.

-- 
Sincerely yours,
Mike.


  reply	other threads:[~2026-02-16 15:28 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-03  8:02 Benjamin Herrenschmidt
2026-02-03 18:40 ` Mike Rapoport
2026-02-03 19:53   ` Benjamin Herrenschmidt
2026-02-04  7:39     ` Mike Rapoport
2026-02-04  9:02       ` Benjamin Herrenschmidt
2026-02-06 10:33         ` Mike Rapoport
2026-02-10  1:04           ` Benjamin Herrenschmidt
2026-02-10  2:10             ` Benjamin Herrenschmidt
2026-02-10  6:17               ` Benjamin Herrenschmidt
2026-02-10  8:34                 ` Benjamin Herrenschmidt
2026-02-10 14:32                   ` Mike Rapoport
2026-02-10 23:23                     ` Benjamin Herrenschmidt
2026-02-11  5:20                       ` Mike Rapoport
2026-02-16  5:34                       ` Benjamin Herrenschmidt
2026-02-16  6:51                         ` Benjamin Herrenschmidt
2026-02-16  4:53                     ` Benjamin Herrenschmidt
2026-02-16 15:28                       ` Mike Rapoport [this message]
2026-02-16 10:36           ` Alexander Potapenko
2026-02-17  8:28 ` [PATCH v2] " Benjamin Herrenschmidt
2026-02-17 12:32   ` Mike Rapoport
2026-02-17 22:00     ` Benjamin Herrenschmidt
2026-02-17 21:47   ` Benjamin Herrenschmidt
2026-02-18  0:15     ` Benjamin Herrenschmidt
2026-02-18  8:05       ` Mike Rapoport
2026-02-19  2:48         ` Benjamin Herrenschmidt
2026-02-19 10:16           ` Mike Rapoport
2026-02-19 22:46             ` Benjamin Herrenschmidt
2026-02-20  4:57               ` Benjamin Herrenschmidt
2026-02-20  9:09                 ` Mike Rapoport
2026-02-20  9:00               ` Mike Rapoport
2026-02-20  5:12             ` Benjamin Herrenschmidt
2026-02-20  5:15             ` Benjamin Herrenschmidt
2026-02-20  5:47             ` Benjamin Herrenschmidt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aZM3ota-JFJPsa1M@kernel.org \
    --to=rppt@kernel.org \
    --cc=benh@kernel.crashing.org \
    --cc=dvyukov@google.com \
    --cc=elver@google.com \
    --cc=glider@google.com \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox