From: Miaohe Lin <linmiaohe@huawei.com>
To: <akpm@linux-foundation.org>, <naoya.horiguchi@nec.com>,
	<david@redhat.com>, <osalvador@suse.de>
Cc: <linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>,
	<linmiaohe@huawei.com>
Subject: [PATCH RFC] mm/memory-failure.c: fix memory failure race with memory offline
Date: Sat, 26 Feb 2022 17:40:34 +0800
Message-ID: <20220226094034.23938-1-linmiaohe@huawei.com>

There is a theoretical race window between memory failure and memory
offline. Consider the following scenario:

  CPU A					  CPU B
memory_failure				offline_pages
  mutex_lock(&mf_mutex);
  TestSetPageHWPoison(p)
					  start_isolate_page_range
					    has_unmovable_pages
					      --PageHWPoison is movable
					  do {
					    scan_movable_pages
					    do_migrate_range
					      --PageHWPoison isn't migrated
					  }
					  test_pages_isolated
					    --PageHWPoison is isolated
					remove_memory
  access page... bang
  ...

When PageHWPoison is set, the page can be offlined at any time regardless
of its refcount, because start_isolate_page_range() treats a HWPoison page
as movable and already isolated, so the page range is isolated
successfully. soft_offline_page() and unpoison_memory() are subject to a
similar race. Fix this by using a get_online_mems()/put_online_mems() pair
to guard against memory offline while handling a memory failure.
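
For context, the special-casing that lets the offline path succeed looks
roughly like the below (a simplified sketch of the v5.17-era checks in
mm/page_alloc.c and mm/memory_hotplug.c, not the verbatim source):

  /* has_unmovable_pages(): a HWPoison page does not block isolation */
  if ((flags & MEMORY_OFFLINE) && PageHWPoison(page))
  	continue;

  /* do_migrate_range(): a HWPoison page is unmapped but not migrated */
  if (PageHWPoison(page)) {
  	if (page_mapped(page))
  		try_to_unmap(page, TTU_IGNORE_MLOCK);
  	continue;
  }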

There is an even worse race window: if the page refcount is held when
memory failure strikes, the page can be offlined while it is still in use.
The assumption that a page cannot be offlined while its refcount is held
is thus broken. This theoretical race window could affect any VM activity,
but it may be too small to be worth fixing.
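
For reference, the guard maps onto mem_hotplug_lock: get_online_mems()
and put_online_mems() take its read side, while offline_pages() runs with
the write side held, so holding the read side excludes a concurrent
offline. Simplified (mem_hotplug_begin() also takes cpus_read_lock()):

  /* mm/memory_hotplug.c */
  void get_online_mems(void)
  {
  	percpu_down_read(&mem_hotplug_lock);
  }

  void put_online_mems(void)
  {
  	percpu_up_read(&mem_hotplug_lock);
  }

  /* offline_pages() runs under the write side: */
  mem_hotplug_begin();	/* percpu_down_write(&mem_hotplug_lock) */
  ...
  mem_hotplug_done();	/* percpu_up_write(&mem_hotplug_lock) */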

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/memory-failure.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 5444a8ef4867..b85232a64104 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1702,6 +1702,7 @@ int memory_failure(unsigned long pfn, int flags)
 	if (!sysctl_memory_failure_recovery)
 		panic("Memory failure on page %lx", pfn);
 
+	get_online_mems();
 	mutex_lock(&mf_mutex);
 
 	p = pfn_to_online_page(pfn);
@@ -1894,11 +1895,13 @@ int memory_failure(unsigned long pfn, int flags)
 identify_page_state:
 	res = identify_page_state(pfn, p, page_flags);
 	mutex_unlock(&mf_mutex);
+	put_online_mems();
 	return res;
 unlock_page:
 	unlock_page(p);
 unlock_mutex:
 	mutex_unlock(&mf_mutex);
+	put_online_mems();
 	return res;
 }
 EXPORT_SYMBOL_GPL(memory_failure);
@@ -2058,6 +2061,7 @@ int unpoison_memory(unsigned long pfn)
 	if (!pfn_valid(pfn))
 		return -ENXIO;
 
+	get_online_mems();
 	p = pfn_to_page(pfn);
 	page = compound_head(p);
 
@@ -2114,6 +2118,7 @@ int unpoison_memory(unsigned long pfn)
 
 unlock_mutex:
 	mutex_unlock(&mf_mutex);
+	put_online_mems();
 	return ret;
 }
 EXPORT_SYMBOL(unpoison_memory);
@@ -2278,10 +2283,12 @@ int soft_offline_page(unsigned long pfn, int flags)
 	if (flags & MF_COUNT_INCREASED)
 		ref_page = pfn_to_page(pfn);
 
+	get_online_mems();
 	/* Only online pages can be soft-offlined (esp., not ZONE_DEVICE). */
 	page = pfn_to_online_page(pfn);
 	if (!page) {
 		put_ref_page(ref_page);
+		put_online_mems();
 		return -EIO;
 	}
 
@@ -2291,13 +2298,12 @@ int soft_offline_page(unsigned long pfn, int flags)
 		pr_info("%s: %#lx page already poisoned\n", __func__, pfn);
 		put_ref_page(ref_page);
 		mutex_unlock(&mf_mutex);
+		put_online_mems();
 		return 0;
 	}
 
 retry:
-	get_online_mems();
 	ret = get_hwpoison_page(page, flags);
-	put_online_mems();
 
 	if (ret > 0) {
 		ret = soft_offline_in_use_page(page);
@@ -2310,6 +2316,7 @@ int soft_offline_page(unsigned long pfn, int flags)
 	}
 
 	mutex_unlock(&mf_mutex);
+	put_online_mems();
 
 	return ret;
 }
-- 
2.23.0



Thread overview: 6+ messages
2022-02-26  9:40 Miaohe Lin [this message]
2022-02-28 12:04 ` Naoya Horiguchi
2022-03-01  3:32   ` Miaohe Lin
2022-03-01  9:53 ` David Hildenbrand
2022-03-01 13:22   ` Miaohe Lin
2022-03-10 13:04   ` Miaohe Lin
