linux-mm.kvack.org archive mirror
From: Mike Rapoport <rppt@kernel.org>
To: Aaron Thompson <dev@aaront.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/1] Fix memblock_free_late() deferred init bug, redux
Date: Tue, 7 Feb 2023 10:02:32 +0200	[thread overview]
Message-ID: <Y+IFmA1FVNRtpEFZ@kernel.org> (raw)
In-Reply-To: <20230206071211.3157-1-dev@aaront.org>

Hi Aaron,

On Mon, Feb 06, 2023 at 07:12:09AM +0000, Aaron Thompson wrote:
> Hi Mike,
> 
> Unfortunately my attempted bugfix 115d9d77bb0f ("mm: Always release pages to the
> buddy allocator in memblock_free_late()") is itself buggy. It's true that all
> reserved pages are initialized by the time memblock_free_late() is called, but
> if the pages being freed are in the deferred init range, __free_one_page() might
> access nearby uninitialized pages when trying to coalesce buddies, in which case
> badness ensues :(
> 
> deferred_init_maxorder() handles this by initializing a max-order-sized block of
> pages before freeing any of them. We could change memblock_free_late() to do
> that, but it seems to me that having memblock_free_late() get involved in
> initializing and freeing free pages would be a bit messy. I think it will be
> simpler to free the reserved pages later, as part of deferred init or after.
> 
> I can see a few ways to accomplish that:
> 
> 1. Have memblock_free_late() remove the range from memblock.reserved. Deferred
>    init will then handle that range as just another free range, so the pages
>    will be initialized and freed by deferred_init_maxorder().
> 
>    This is the simplest fix, but the problem is that the pages will be
>    initialized twice, first by memmap_init_reserved_pages() and again by
>    deferred_init_maxorder(). It looks risky to me to blindly zero out an
>    already-initialized page struct, but if we could say for certain that the
>    page structs for memblock-reserved ranges aren't actually used, at least
>    until after deferred init is done, then this could work. I don't know the
>    usage of page structs well enough to say.
> 
> 2. Take 1 and fix the double-init problem. In addition to removing the range
>    from memblock.reserved, also set a flag on the range in memblock.memory that
>    indicates the pages for that range have already been initialized.
>    deferred_init_maxorder() would skip initializing pages for ranges with the
>    flag set, but it would still free them.
> 
>    This seems like a bit of a conceptual stretch of the memblock region flags
>    since this is not a property of the memory itself but rather of the page
>    structs corresponding to that memory. But it gets the job done.
> 
> 3. Defer the freeing of reserved pages until after deferred init is completely
>    done. Have memblock_free_late() set a flag on the range in memblock.reserved,
>    and have memblock_discard() call __free_pages_core() on those ranges.
> 
>    I think this would be a reasonable use of flags on reserved regions. They are
>    not currently used.

I think 3 is the most straightforward as a concept. It'll need some care
for ARCH_KEEP_MEMBLOCK architectures, e.g. arm64, though.
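For what it's worth, a rough shape of option 3 in pseudocode (the flag name
and helpers are hypothetical, and this glosses over locking and exactly
where the late free would hook in on ARCH_KEEP_MEMBLOCK):

```c
/* memblock_free_late(): if deferred init hasn't finished yet, tag the
 * region instead of freeing it now. MEMBLOCK_RSRV_DEFERRED_FREE is a
 * hypothetical flag on memblock.reserved regions. */
if (deferred_init_pending) {
	rgn->flags |= MEMBLOCK_RSRV_DEFERRED_FREE;
	return;
}

/* memblock_discard() -- or a late hook on ARCH_KEEP_MEMBLOCK, where the
 * memblock data is never discarded -- frees the marked ranges once
 * deferred init is done. */
for (i = 0; i < memblock.reserved.cnt; i++) {
	struct memblock_region *r = &memblock.reserved.regions[i];

	if (r->flags & MEMBLOCK_RSRV_DEFERRED_FREE)
		free_reserved_range(r->base, r->size);	/* hypothetical
							   helper built on
							   __free_pages_core() */
}
```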

I can also think of a fourth option:

4. Extend initialization of the memory map around the reserved regions in
memmap_init_reserved_pages()/reserve_bootmem_region(). If these functions
initialize the memory map of the entire pageblock surrounding the reserved
range, __free_one_page() will certainly access initialized struct pages.
 
> The included patch implements option 1 because it is the simplest, but it should
> not be used if the double-init of the page structs is unsafe. In my testing I
> verified that the count, mapcount, and lru list head of all pages are at their
> defaults when memblock_free_late() is called by efi_free_boot_services(), but
> that's obviously not conclusive. I have draft versions of 2 and 3 that I can
> finish up quickly if either of those are preferable.

At this point of the release cycle I prefer to revert 115d9d77bb0f ("mm:
Always release pages to the buddy allocator in memblock_free_late()") and
to work on the proper fix for the next release.
 
> Please let me know what you think, and sorry for introducing this bug.
> 
> Thanks,
> Aaron
> 
> Aaron Thompson (1):
>   mm: Defer freeing reserved pages in memblock_free_late()
> 
>  mm/internal.h                     |  2 ++
>  mm/memblock.c                     | 36 ++++++++++++++++++++-----------
>  mm/page_alloc.c                   | 17 +++++++++++++++
>  tools/testing/memblock/internal.h |  7 +++---
>  4 files changed, 47 insertions(+), 15 deletions(-)
> 
> -- 
> 2.30.2
> 
> 

-- 
Sincerely yours,
Mike.


Thread overview: 4+ messages
2023-02-06  7:12 Aaron Thompson
2023-02-06  7:12 ` [PATCH 1/1] mm: Defer freeing reserved pages in memblock_free_late() Aaron Thompson
2023-02-07  8:02 ` Mike Rapoport [this message]
2023-02-07  8:28   ` [PATCH 0/1] Fix memblock_free_late() deferred init bug, redux Aaron Thompson
