From: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
To: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>,
Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@redhat.com>,
Mike Kravetz <mike.kravetz@oracle.com>,
Yang Shi <shy828301@gmail.com>,
Oscar Salvador <osalvador@suse.de>,
Muchun Song <songmuchun@bytedance.com>,
Jane Chu <jane.chu@oracle.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH v2 1/4] mm,hwpoison,hugetlb,memory_hotplug: hotremove memory section with hwpoisoned hugepage
Date: Tue, 6 Sep 2022 06:14:18 +0000
Message-ID: <20220906061417.GA1406504@hori.linux.bs1.fc.nec.co.jp>
In-Reply-To: <3e302aa5-d63d-097e-2cb7-831b7c99e736@huawei.com>
On Tue, Sep 06, 2022 at 10:59:58AM +0800, Miaohe Lin wrote:
> On 2022/9/5 14:21, Naoya Horiguchi wrote:
> > From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> >
> > HWPoisoned page is not supposed to be accessed once marked, but currently
> > such accesses can happen during memory hotremove because do_migrate_range()
> > can be called before dissolve_free_huge_pages() is called.
> >
> > Move dissolve_free_huge_pages() before scan_movable_pages(). Recently
> > delayed dissolve has been implemented, so the dissolving can turn
> > a hwpoisoned hugepage into 4kB hwpoison page, which memory hotplug can
> > handle safely.
>
> Yes, thanks for your work, Naoya. ;)
>
> >
> > Reported-by: Miaohe Lin <linmiaohe@huawei.com>
> > Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > ---
> > mm/memory_hotplug.c | 22 +++++++++++-----------
> > 1 file changed, 11 insertions(+), 11 deletions(-)
> >
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index fad6d1f2262a..c24735d63b25 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -1880,6 +1880,17 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
> >
> > cond_resched();
> >
> > + /*
> > + * Dissolve free hugepages in the memory block before doing
> > + * offlining actually in order to make hugetlbfs's object
> > + * counting consistent.
> > + */
> > + ret = dissolve_free_huge_pages(start_pfn, end_pfn);
> > + if (ret) {
> > + reason = "failure to dissolve huge pages";
> > + goto failed_removal_isolated;
> > + }
>
> This change has a side effect. If hugetlb pages are in use, dissolve_free_huge_pages() will always return -EBUSY,
> even if those pages can be migrated. So we fail to hotremove the memory even though it could be offlined.
> Or am I missing something?

Thank you for the comment, you're right. (Looking over my test results more
carefully, they did show failures for the related cases; I somehow overlooked
them, really sorry.) So my second thought is that we keep offline_pages()
as is, and insert a few lines in do_migrate_range() to handle the case of
a hwpoisoned hugepage, like below:

@@ -1642,6 +1642,8 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 		if (PageHuge(page)) {
 			pfn = page_to_pfn(head) + compound_nr(head) - 1;
+			if (PageHWPoison(head))
+				continue;
 			isolate_hugetlb(head, &source);
 			continue;
 		} else if (PageTransHuge(page))

This is slightly different from your original suggestion
https://lore.kernel.org/linux-mm/20220421135129.19767-1-linmiaohe@huawei.com/T
because, as discussed in that thread, the existing "if (PageHWPoison(page))"
branch in this function can't be used for hugetlb. We could adjust that branch
to handle hugetlb, but first separating the hugetlb code from the rest looks
less complicated to me.

If you have any suggestions on this, please let me know.

Thanks,
Naoya Horiguchi
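
For readers following the thread, the two behaviours discussed above can be
sketched in a small userspace model. This is only an illustration, not kernel
code: struct model_page, HPAGE_NR, head_of() and the model_*() functions are
hypothetical stand-ins, and the model assumes huge pages are aligned to
HPAGE_NR and that the hwpoison mark sits on the head page. It shows (a) why
dissolving up front fails with -EBUSY once any in-use hugepage is in the
range, and (b) how the proposed check lets do_migrate_range() step over a
hwpoisoned hugepage without isolating it.

```c
/* Userspace model only: none of these types or helpers are the kernel's. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define HPAGE_NR 4	/* hypothetical: base pages per huge page in this model */

struct model_page {
	bool huge;	/* page belongs to a hugetlb page */
	bool in_use;	/* page (or, for huge pages, the head) is allocated */
	bool hwpoison;	/* page (or, for huge pages, the head) is hwpoisoned */
};

/* Head pfn of the huge page containing pfn (HPAGE_NR-aligned in this model). */
static size_t head_of(size_t pfn)
{
	return pfn - (pfn % HPAGE_NR);
}

/*
 * Models the side effect pointed out in review: dissolving before migration
 * reports -EBUSY for any in-use huge page, even one that could be migrated.
 */
static int model_dissolve_free_huge_pages(const struct model_page *pages,
					  size_t start, size_t end)
{
	for (size_t pfn = start; pfn < end; pfn++)
		if (pages[pfn].huge && pages[head_of(pfn)].in_use)
			return -EBUSY;
	return 0;
}

/*
 * Models the proposed do_migrate_range() change: a hwpoisoned huge page is
 * skipped as a whole. Returns how many pages (huge or base) were queued.
 */
static size_t model_do_migrate_range(const struct model_page *pages,
				     size_t start, size_t end)
{
	size_t queued = 0;

	assert(start <= end);
	for (size_t pfn = start; pfn < end; pfn++) {
		if (pages[pfn].huge) {
			size_t head = head_of(pfn);

			/* Jump to the last subpage; the loop increment moves past it. */
			pfn = head + HPAGE_NR - 1;
			if (pages[head].hwpoison)
				continue;	/* never touch the poisoned huge page */
			if (pages[head].in_use)
				queued++;	/* stands in for isolate_hugetlb() */
			continue;
		}
		if (pages[pfn].in_use && !pages[pfn].hwpoison)
			queued++;
	}
	return queued;
}
```

In this model, a range holding one healthy in-use hugepage and one hwpoisoned
hugepage makes the up-front dissolve fail, while the migration loop queues
only the healthy one and leaves the poisoned one untouched.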