From: Naoya Horiguchi <naoya.horiguchi@linux.dev>
To: Miaohe Lin <linmiaohe@huawei.com>
Cc: akpm@linux-foundation.org, naoya.horiguchi@nec.com,
david@redhat.com, osalvador@suse.de, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC] mm/memory-failure.c: fix memory failure race with memory offline
Date: Mon, 28 Feb 2022 21:04:48 +0900 [thread overview]
Message-ID: <20220228120448.GA1142923@u2004> (raw)
In-Reply-To: <20220226094034.23938-1-linmiaohe@huawei.com>
On Sat, Feb 26, 2022 at 05:40:34PM +0800, Miaohe Lin wrote:
> There is a theoretical race window between memory failure and memory
> offline. Think about the below scene:
>
> CPU A CPU B
> memory_failure offline_pages
> mutex_lock(&mf_mutex);
> TestSetPageHWPoison(p)
> start_isolate_page_range
> has_unmovable_pages
> --PageHWPoison is movable
> do {
> scan_movable_pages
> do_migrate_range
> --PageHWPoison isn't migrated
> }
> test_pages_isolated
> --PageHWPoison is isolated
> remove_memory
> access page... bang
> ...
>
> When PageHWPoison is set, the page could be offlined anytime regardless
> of the page refcnt. It's bacause start_isolate_page_range treats HWPoison
> page as movable and already isolated, so the page range can be successfully
> isolated. soft_offline_page and unpoison_memory have the similar race. Fix
> this by using get_online_mems + put_online_mems pair to guard aginst memory
> offline when doing memory failure.
Sounds convincing to me. Thanks for identifying. Is this problem reproduced
in your testing, or just picked out by code inspection?
>
> There is a even worse race window. If the page refcnt is held, then memory
> failure happens, the page could be offlined while it's still in use. So The
> assumption that a page can not be offlined when the page refcnt is held is
> now broken. This theoretical race window could happen in every vm activity.
> But this race window might be too small to fix.
Yes, hwpoisoned pages can now be offlined while they have refcount > 0.
I think that they need to be categorize into two groups based on
whether the error page was successfully handled or not.
If a error page is successfully handled, then page_handle_poison() succeeded
and the refcount should be one (which is held only by hwpoison subsystem
itself, so there should be no other reference to it), so it should be safely
offlined as we do now. But if error handling failed, the hwpoisoned page
have any value and there could remain some reference to it. So we might
better to make offline_pages() fail for such "failed handling" pages, or for
all hwpoisoned pages with refcount more than one?
>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
> mm/memory-failure.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 5444a8ef4867..b85232a64104 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1702,6 +1702,7 @@ int memory_failure(unsigned long pfn, int flags)
> if (!sysctl_memory_failure_recovery)
> panic("Memory failure on page %lx", pfn);
>
> + get_online_mems();
> mutex_lock(&mf_mutex);
>
> p = pfn_to_online_page(pfn);
> @@ -1894,11 +1895,13 @@ int memory_failure(unsigned long pfn, int flags)
> identify_page_state:
> res = identify_page_state(pfn, p, page_flags);
> mutex_unlock(&mf_mutex);
> + put_online_mems();
> return res;
> unlock_page:
> unlock_page(p);
> unlock_mutex:
> mutex_unlock(&mf_mutex);
> + put_online_mems();
> return res;
> }
> EXPORT_SYMBOL_GPL(memory_failure);
> @@ -2058,6 +2061,7 @@ int unpoison_memory(unsigned long pfn)
> if (!pfn_valid(pfn))
> return -ENXIO;
>
> + get_online_mems();
> p = pfn_to_page(pfn);
> page = compound_head(p);
>
> @@ -2114,6 +2118,7 @@ int unpoison_memory(unsigned long pfn)
>
> unlock_mutex:
> mutex_unlock(&mf_mutex);
> + put_online_mems();
> return ret;
> }
> EXPORT_SYMBOL(unpoison_memory);
> @@ -2278,10 +2283,12 @@ int soft_offline_page(unsigned long pfn, int flags)
> if (flags & MF_COUNT_INCREASED)
> ref_page = pfn_to_page(pfn);
>
> + get_online_mems();
I felt that {get,put}_online_mems() can be put together with takeing/freeing
mf_mutex with some wrapper lock/unlock function, which slightly improves
code readability (developers won't have to care about two locking separately).
That requires moving mutex_lock(&mf_mutex), which could have non-trivial
change, but maybe that's not so bad.
Thanks,
Naoya Horiguchi
> /* Only online pages can be soft-offlined (esp., not ZONE_DEVICE). */
> page = pfn_to_online_page(pfn);
> if (!page) {
> put_ref_page(ref_page);
> + put_online_mems();
> return -EIO;
> }
>
> @@ -2291,13 +2298,12 @@ int soft_offline_page(unsigned long pfn, int flags)
> pr_info("%s: %#lx page already poisoned\n", __func__, pfn);
> put_ref_page(ref_page);
> mutex_unlock(&mf_mutex);
> + put_online_mems();
> return 0;
> }
>
> retry:
> - get_online_mems();
> ret = get_hwpoison_page(page, flags);
> - put_online_mems();
>
> if (ret > 0) {
> ret = soft_offline_in_use_page(page);
> @@ -2310,6 +2316,7 @@ int soft_offline_page(unsigned long pfn, int flags)
> }
>
> mutex_unlock(&mf_mutex);
> + put_online_mems();
>
> return ret;
> }
> ---
> 2.23.0
>
next prev parent reply other threads:[~2022-02-28 12:05 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-26 9:40 Miaohe Lin
2022-02-28 12:04 ` Naoya Horiguchi [this message]
2022-03-01 3:32 ` Miaohe Lin
2022-03-01 9:53 ` David Hildenbrand
2022-03-01 13:22 ` Miaohe Lin
2022-03-10 13:04 ` Miaohe Lin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220228120448.GA1142923@u2004 \
--to=naoya.horiguchi@linux.dev \
--cc=akpm@linux-foundation.org \
--cc=david@redhat.com \
--cc=linmiaohe@huawei.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=naoya.horiguchi@nec.com \
--cc=osalvador@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox