linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Mike Rapoport <rppt@kernel.org>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linux-mm@kvack.org
Subject: Re: [PATCH v2] mm: Fix memblock_free_late() when using deferred struct page
Date: Thu, 19 Feb 2026 12:16:40 +0200	[thread overview]
Message-ID: <aZbjCFqlqdFwKSHR@kernel.org> (raw)
In-Reply-To: <6453da0558ba20d5c87e730bdfedd47966977931.camel@kernel.crashing.org>

On Thu, Feb 19, 2026 at 01:48:16PM +1100, Benjamin Herrenschmidt wrote:
> On Wed, 2026-02-18 at 10:05 +0200, Mike Rapoport wrote:
> > Apparently we do miss some piece of the puzzle, otherwise you'd see no
> > crashes :)
> > 
> > I think an easy and backportable fix would be to make
> > efi_free_boot_services() an initcall, so that it will surely run after
> > deferred pages are initialized.
> > And since the boot services memory is not memblock_alloc()ed but rather
> > memblock_reserve()ed, it should be freed with free_reserved_area().
> > 
> > With the symptom fixed, we can audit memblock_free_late() and
> > free_reserved_area() callers and see how to make this all less messy and
> > more robust.
> 
> I will play around. The biggest issue I see with this is that
> efi_free_boot_services() also manipulates the efi memmap, and that's
> done without any locking whatsoever.
> 
> I'm semi tempted to split that part. We can unmap the boot services
> from EFI memory in the current spot, and defer the actual freeing.

Let's split it. EFI does weird things with memory already, like mremapping
normal memory for example.

Here's my take on the split. Lightly tested on qemu and recovered ~45M of
ram with the OVMF version I have :)

From fdfbda756d6107a7bc7c3ad4eb589af810ddba49 Mon Sep 17 00:00:00 2001
From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
Date: Thu, 19 Feb 2026 11:22:53 +0200
Subject: [PATCH] x86/efi: defer freeing of boot services memory

efi_free_boot_services() frees memory occupied by EFI_BOOT_SERVICES_CODE
and EFI_BOOT_SERVICES_DATA using memblock_free_late().

There are two issue with that: memblock_free_late() should be used for
memory allocated with memblock_alloc() while the memory reserved with
memblock_reserve() should be freed with free_reserved_area().

More acutely, with CONFIG_DEFERRED_STRUCT_PAGE_INIT=y
efi_free_boot_services() is called before deferred initialization of the
memory map is complete. The freeing path

If the freed memory resides in the areas that memory map for them is
still uninitialized, they won't be actually freed because
memblock_free_late() calls memblock_free_pages() and the latter skips
uninitialized pages.

Using free_reserved_area() at this point is also problematic because
__free_page() accesses the buddy of the freed page and that again might
end up in uninitialized part of the memory map.

Delaying the entire efi_free_boot_services() could be problematic
because in addition to freeing boot services memory it updates
efi.memmap without any synchronization and that's undesirable late in
boot when there is concurrency.

More robust approach is to only defer freeing of the EFI boot services memory.

Make efi_free_boot_services() collect ranges that should be freed into
an array and add an initcall efi_free_boot_services_memory() that walks
that array and actually frees the memory using free_reserved_area().

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
 arch/x86/platform/efi/quirks.c | 42 +++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index 553f330198f2..bba1fb57a4bd 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -404,17 +404,32 @@ static void __init efi_unmap_pages(efi_memory_desc_t *md)
 		pr_err("Failed to unmap VA mapping for 0x%llx\n", va);
 }
 
+struct efi_freeable_range {
+	u64 start;
+	u64 end;
+};
+
+static struct efi_freeable_range *ranges_to_free;
+
 void __init efi_free_boot_services(void)
 {
 	struct efi_memory_map_data data = { 0 };
 	efi_memory_desc_t *md;
 	int num_entries = 0;
+	int idx = 0;
 	void *new, *new_md;
 
 	/* Keep all regions for /sys/kernel/debug/efi */
 	if (efi_enabled(EFI_DBG))
 		return;
 
+	ranges_to_free = kzalloc(sizeof(*ranges_to_free) * efi.memmap.nr_map,
+				 GFP_KERNEL);
+	if (!ranges_to_free) {
+		pr_err("Failed to allocate storage for freeable EFI regions\n");
+		return;
+	}
+
 	for_each_efi_memory_desc(md) {
 		unsigned long long start = md->phys_addr;
 		unsigned long long size = md->num_pages << EFI_PAGE_SHIFT;
@@ -471,7 +486,15 @@ void __init efi_free_boot_services(void)
 			start = SZ_1M;
 		}
 
-		memblock_free_late(start, size);
+		/*
+		 * With CONFIG_DEFERRED_STRUCT_PAGE_INIT parts of the memory
+		 * map are still not initialized and we can't reliably free
+		 * memory here.
+		 * Queue the ranges to free at a later point.
+		 */
+		ranges_to_free[idx].start = start;
+		ranges_to_free[idx].end = start + size;
+		idx++;
 	}
 
 	if (!num_entries)
@@ -512,6 +535,23 @@ void __init efi_free_boot_services(void)
 	}
 }
 
+static int __init efi_free_boot_services_memory(void)
+{
+	struct efi_freeable_range *range = ranges_to_free;
+
+	while (range->start) {
+		void *start = phys_to_virt(range->start);
+		void *end = phys_to_virt(range->end);
+
+		free_reserved_area(start, end, -1, NULL);
+		range++;
+	}
+	kfree(ranges_to_free);
+
+	return 0;
+}
+late_initcall(efi_free_boot_services_memory);
+
 /*
  * A number of config table entries get remapped to virtual addresses
  * after entering EFI virtual mode. However, the kexec kernel requires

base-commit: 05f7e89ab9731565d8a62e3b5d1ec206485eeb0b
-- 
2.51.0

 
> Cheers,
> Ben.
> 

-- 
Sincerely yours,
Mike.


  reply	other threads:[~2026-02-19 10:16 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-03  8:02 [PATCH] " Benjamin Herrenschmidt
2026-02-03 18:40 ` Mike Rapoport
2026-02-03 19:53   ` Benjamin Herrenschmidt
2026-02-04  7:39     ` Mike Rapoport
2026-02-04  9:02       ` Benjamin Herrenschmidt
2026-02-06 10:33         ` Mike Rapoport
2026-02-10  1:04           ` Benjamin Herrenschmidt
2026-02-10  2:10             ` Benjamin Herrenschmidt
2026-02-10  6:17               ` Benjamin Herrenschmidt
2026-02-10  8:34                 ` Benjamin Herrenschmidt
2026-02-10 14:32                   ` Mike Rapoport
2026-02-10 23:23                     ` Benjamin Herrenschmidt
2026-02-11  5:20                       ` Mike Rapoport
2026-02-16  5:34                       ` Benjamin Herrenschmidt
2026-02-16  6:51                         ` Benjamin Herrenschmidt
2026-02-16  4:53                     ` Benjamin Herrenschmidt
2026-02-16 15:28                       ` Mike Rapoport
2026-02-16 10:36           ` Alexander Potapenko
2026-02-17  8:28 ` [PATCH v2] " Benjamin Herrenschmidt
2026-02-17 12:32   ` Mike Rapoport
2026-02-17 22:00     ` Benjamin Herrenschmidt
2026-02-17 21:47   ` Benjamin Herrenschmidt
2026-02-18  0:15     ` Benjamin Herrenschmidt
2026-02-18  8:05       ` Mike Rapoport
2026-02-19  2:48         ` Benjamin Herrenschmidt
2026-02-19 10:16           ` Mike Rapoport [this message]
2026-02-19 22:46             ` Benjamin Herrenschmidt
2026-02-20  4:57               ` Benjamin Herrenschmidt
2026-02-20  9:09                 ` Mike Rapoport
2026-02-20  9:00               ` Mike Rapoport
2026-02-20  5:12             ` Benjamin Herrenschmidt
2026-02-20  5:15             ` Benjamin Herrenschmidt
2026-02-20  5:47             ` Benjamin Herrenschmidt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aZbjCFqlqdFwKSHR@kernel.org \
    --to=rppt@kernel.org \
    --cc=benh@kernel.crashing.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox