* Re:[RFC] a question about reuse hwpoison page in soft_offline_page()
@ 2018-07-06 9:59 裘稀石(稀石)
2018-07-07 5:04 ` 回复:[RFC] " 裘稀石(稀石)
2018-07-09 0:38 ` Re:[RFC] " Naoya Horiguchi
0 siblings, 2 replies; 3+ messages in thread
From: 裘稀石(稀石) @ 2018-07-06 9:59 UTC (permalink / raw)
To: Naoya Horiguchi; +Cc: linux-mm, linux-kernel, 陈义全
[-- Attachment #1: Type: text/plain, Size: 4218 bytes --]
Hi Naoya,
How about this case, we only trigger soft offline page, but someone killed later.
As the race I said before, then someone may use the hwpoisoned hugetlb page again.
Please see the following.
soft offline:
get_any_page - find the hugetlb is free
process A:
do_page_fault - handle_mm_fault - hugetlb_fault - hugetlb_no_page - alloc_huge_page
soft offline:
soft_offline_free_page - set hwpoison flag
process B:
mmap the hugetlb file from A, hugetlb_fault - hugetlb_no_page - find_lock_page
it find hwpoison flag is already set, so ret = VM_FAULT_HWPOISON
then mm_fault_error - do_sigbus - mce kill
process B was killed by soft offline, right?
Thanks,
Xishi Qiu
------------------------------------------------------------------
发件人:Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
发送时间:2018年7月6日(星期五) 16:19
收件人:裘稀石(稀石) <xishi.qiuxishi@alibaba-inc.com>
抄 送:linux-mm <linux-mm@kvack.org>; linux-kernel <linux-kernel@vger.kernel.org>; 陈义全 <zy.zhengyi@alibaba-inc.com>
主 题:Re: [RFC] a question about reuse hwpoison page in soft_offline_page()
On Fri, Jul 06, 2018 at 11:37:41AM +0800, 裘稀石(稀石) wrote:
> This patch add05cec
> (mm: soft-offline: don't free target page in successful page migration) removes
> set_migratetype_isolate() and unset_migratetype_isolate() in soft_offline_page
> ().
>
> And this patch 243abd5b
> (mm: hugetlb: prevent reuse of hwpoisoned free hugepages) changes
> if (!is_migrate_isolate_page(page)) to if (!PageHWPoison(page)), so it could
> prevent someone
> reuse the free hugetlb again after set the hwpoison flag
> in soft_offline_free_page()
>
> My question is that if someone reuse the free hugetlb again before
> soft_offline_free_page() and
> after get_any_page(), then it uses the hopoison page, and this may trigger mce
> kill later, right?
Hi Xishi,
Thank you for pointing out the issue. That's nice catch.
I think that the race condition itself could happen, but it doesn't lead
to MCE kill because PageHWPoison is not visible to HW which triggers MCE.
PageHWPoison flag is just a flag in struct page to report the memory error
from kernel to userspace. So even if a CPU is accessing to the page whose
struct page has PageHWPoison set, that doesn't cause a MCE unless the page
is physically broken.
The type of memory error that soft offline tries to handle is corrected
one which is not a failure yet although it's starting to wear.
So such PageHWPoison page can be reused, but that's not critical because
the page is freed at some point afterword and error containment completes.
However, I noticed that there's a small pain in free hugetlb case.
We call dissolve_free_huge_page() in soft_offline_free_page() which moves
the PageHWPoison flag from the head page to the raw error page.
If the reported race happens, dissolve_free_huge_page() just return without
doing any dissolve work because "if (PageHuge(page) && !page_count(page))"
block is skipped.
The hugepage is allocated and used as usual, but the contaiment doesn't
complete as expected in the normal page, because free_huge_pages() doesn't
call dissolve_free_huge_page() for hwpoison hugepage. This is not critical
because such error hugepage just reside in free hugepage list. But this
might looks like a kind of memory leak. And even worse when hugepage pool
is shrinked and the hwpoison hugepage is freed, the PageHWPoison flag is
still on the head page which is unlikely to be an actual error page.
So I think we need improvement here, how about the fix like below?
(not tested yet, sorry)
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1883,6 +1883,11 @@ static void soft_offline_free_page(struct page *page)
struct page *head = compound_head(page);
if (!TestSetPageHWPoison(head)) {
+ if (page_count(head)) {
+ ClearPageHWPoison(head);
+ return;
+ }
+
num_poisoned_pages_inc();
if (PageHuge(head))
dissolve_free_huge_page(page);
Thanks,
Naoya Horiguchi
[-- Attachment #2: Type: text/html, Size: 10102 bytes --]
^ permalink raw reply [flat|nested] 3+ messages in thread* 回复:[RFC] a question about reuse hwpoison page in soft_offline_page()
2018-07-06 9:59 Re:[RFC] a question about reuse hwpoison page in soft_offline_page() 裘稀石(稀石)
@ 2018-07-07 5:04 ` 裘稀石(稀石)
2018-07-09 0:38 ` Re:[RFC] " Naoya Horiguchi
1 sibling, 0 replies; 3+ messages in thread
From: 裘稀石(稀石) @ 2018-07-07 5:04 UTC (permalink / raw)
To: 裘稀石(稀石), Naoya Horiguchi
Cc: linux-mm, linux-kernel, 陈义全
[-- Attachment #1: Type: text/plain, Size: 3500 bytes --]
Hi Naoya,
1) If someone mmap the hugetlb but not alloc memory yet, so h->resv_huge_pages
increased.
2) Then soft_offline_page - soft_offline_free_page - dissolve_free_huge_page
will failed.
3) Later someone access the hugetlb file, so trigger page fault, but return no mem,
so someone has been killed.
In this case, the hwpoisoned hugetlb still in the huge free list, but it can not be alloced.
And soft offline trigger kill something later, does it handle properly?
Thanks,
Xishi Qiu> This patch add05cec
> (mm: soft-offline: don't free target page in successful page migration) removes
> set_migratetype_isolate() and unset_migratetype_isolate() in soft_offline_page
> ().
>
> And this patch 243abd5b
> (mm: hugetlb: prevent reuse of hwpoisoned free hugepages) changes
> if (!is_migrate_isolate_page(page)) to if (!PageHWPoison(page)), so it could
> prevent someone
> reuse the free hugetlb again after set the hwpoison flag
> in soft_offline_free_page()
>
> My question is that if someone reuse the free hugetlb again before
> soft_offline_free_page() and
> after get_any_page(), then it uses the hopoison page, and this may trigger mce
> kill later, right?
Hi Xishi,
Thank you for pointing out the issue. That's nice catch.
I think that the race condition itself could happen, but it doesn't lead
to MCE kill because PageHWPoison is not visible to HW which triggers MCE.
PageHWPoison flag is just a flag in struct page to report the memory error
from kernel to userspace. So even if a CPU is accessing to the page whose
struct page has PageHWPoison set, that doesn't cause a MCE unless the page
is physically broken.
The type of memory error that soft offline tries to handle is corrected
one which is not a failure yet although it's starting to wear.
So such PageHWPoison page can be reused, but that's not critical because
the page is freed at some point afterword and error containment completes.
However, I noticed that there's a small pain in free hugetlb case.
We call dissolve_free_huge_page() in soft_offline_free_page() which moves
the PageHWPoison flag from the head page to the raw error page.
If the reported race happens, dissolve_free_huge_page() just return without
doing any dissolve work because "if (PageHuge(page) && !page_count(page))"
block is skipped.
The hugepage is allocated and used as usual, but the contaiment doesn't
complete as expected in the normal page, because free_huge_pages() doesn't
call dissolve_free_huge_page() for hwpoison hugepage. This is not critical
because such error hugepage just reside in free hugepage list. But this
might looks like a kind of memory leak. And even worse when hugepage pool
is shrinked and the hwpoison hugepage is freed, the PageHWPoison flag is
still on the head page which is unlikely to be an actual error page.
So I think we need improvement here, how about the fix like below?
(not tested yet, sorry)
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1883,6 +1883,11 @@ static void soft_offline_free_page(struct page *page)
struct page *head = compound_head(page);
if (!TestSetPageHWPoison(head)) {
+ if (page_count(head)) {
+ ClearPageHWPoison(head);
+ return;
+ }
+
num_poisoned_pages_inc();
if (PageHuge(head))
dissolve_free_huge_page(page);
Thanks,
Naoya Horiguchi
[-- Attachment #2: Type: text/html, Size: 7391 bytes --]
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: Re:[RFC] a question about reuse hwpoison page in soft_offline_page()
2018-07-06 9:59 Re:[RFC] a question about reuse hwpoison page in soft_offline_page() 裘稀石(稀石)
2018-07-07 5:04 ` 回复:[RFC] " 裘稀石(稀石)
@ 2018-07-09 0:38 ` Naoya Horiguchi
1 sibling, 0 replies; 3+ messages in thread
From: Naoya Horiguchi @ 2018-07-09 0:38 UTC (permalink / raw)
To: 裘稀石(稀石)
Cc: linux-mm, linux-kernel, 陈义全
On Fri, Jul 06, 2018 at 05:59:15PM +0800, 裘稀石(稀石) wrote:
>
> Hi Naoya,
>
> How about this case, we only trigger soft offline page, but someone killed
> later.
> As the race I said before, then someone may use the hwpoisoned hugetlb page
> again.
> Please see the following.
>
> soft offline:
> get_any_page - find the hugetlb is free
> process A:
> do_page_fault - handle_mm_fault - hugetlb_fault - hugetlb_no_page
> - alloc_huge_page
> soft offline:
> soft_offline_free_page - set hwpoison flag
> process B:
> mmap the hugetlb file from A, hugetlb_fault - hugetlb_no_page
> - find_lock_page
> it find hwpoison flag is already set, so ret = VM_FAULT_HWPOISON
> then mm_fault_error - do_sigbus - mce kill
> process B was killed by soft offline, right?
Right, this and your another email shows how things go bad.
Soft offline handler is simply racy now, so we had better cancel it if
the target page was allocated during soft offline handling by double
checking as I mentioned.
Thanks,
Naoya Horiguchi
>
> Thanks,
> Xishi Qiu
>
> ------------------------------------------------------------------
> 发件人:Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> 发送时间:2018年7月6日(星期五) 16:19
> 收件人:裘稀石(稀石) <xishi.qiuxishi@alibaba-inc.com>
> 抄 送:linux-mm <linux-mm@kvack.org>; linux-kernel
> <linux-kernel@vger.kernel.org>; 陈义全 <zy.zhengyi@alibaba-inc.com>
> 主 题:Re: [RFC] a question about reuse hwpoison page in soft_offline_page
> ()
>
> On Fri, Jul 06, 2018 at 11:37:41AM +0800, 裘稀石(稀石) wrote:
> > This patch add05cec
> > (mm: soft-offline: don't free target page in successful page migration)
> removes
> > set_migratetype_isolate() and unset_migratetype_isolate()
> in soft_offline_page
> > ().
> >
> > And this patch 243abd5b
> > (mm: hugetlb: prevent reuse of hwpoisoned free hugepages) changes
> > if (!is_migrate_isolate_page(page)) to if (!PageHWPoison
> (page)), so it could
> > prevent someone
> > reuse the free hugetlb again after set the hwpoison flag
> > in soft_offline_free_page()
> >
> > My question is that if someone reuse the free hugetlb again before
> > soft_offline_free_page() and
> > after get_any_page
> (), then it uses the hopoison page, and this may trigger mce
> > kill later, right?
>
> Hi Xishi,
>
> Thank you for pointing out the issue. That's nice catch.
>
> I think that the race condition itself could happen, but it doesn't lead
> to MCE kill because PageHWPoison is not visible to HW which triggers MCE.
> PageHWPoison flag is just a flag in struct page to report the memory error
> from kernel to userspace. So even if a CPU is accessing to the page whose
> struct page has PageHWPoison set, that doesn't cause a MCE unless the page
> is physically broken.
> The type of memory error that soft offline tries to handle is corrected
> one which is not a failure yet although it's starting to wear.
> So such PageHWPoison page can be reused, but that's not critical because
> the page is freed at some point afterword and error containment completes.
>
> However, I noticed that there's a small pain in free hugetlb case.
> We call dissolve_free_huge_page() in soft_offline_free_page() which moves
> the PageHWPoison flag from the head page to the raw error page.
> If the reported race happens, dissolve_free_huge_page() just return without
> doing any dissolve work because "if (PageHuge(page) && !page_count(page))"
> block is skipped.
> The hugepage is allocated and used as usual, but the contaiment doesn't
> complete as expected in the normal page, because free_huge_pages() doesn't
> call dissolve_free_huge_page() for hwpoison hugepage. This is not critical
> because such error hugepage just reside in free hugepage list. But this
> might looks like a kind of memory leak. And even worse when hugepage pool
> is shrinked and the hwpoison hugepage is freed, the PageHWPoison flag is
> still on the head page which is unlikely to be an actual error page.
>
> So I think we need improvement here, how about the fix like below?
>
> (not tested yet, sorry)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1883,6 +1883,11 @@ static void soft_offline_free_page
> (struct page *page)
> struct page *head = compound_head(page);
>
> if (!TestSetPageHWPoison(head)) {
> + if (page_count(head)) {
> + ClearPageHWPoison(head);
> + return;
> + }
> +
> num_poisoned_pages_inc();
> if (PageHuge(head))
> dissolve_free_huge_page(page);
>
> Thanks,
> Naoya Horiguchi
>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2018-07-09 0:47 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-06 9:59 Re:[RFC] a question about reuse hwpoison page in soft_offline_page() 裘稀石(稀石)
2018-07-07 5:04 ` 回复:[RFC] " 裘稀石(稀石)
2018-07-09 0:38 ` Re:[RFC] " Naoya Horiguchi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox