linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
To: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@redhat.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Yang Shi <shy828301@gmail.com>,
	Oscar Salvador <osalvador@suse.de>,
	Muchun Song <songmuchun@bytedance.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>
Subject: Re: [RFC PATCH v1 1/4] mm, hwpoison, hugetlb: introduce SUBPAGE_INDEX_HWPOISON to save raw error page
Date: Wed, 27 Apr 2022 13:03:43 +0000	[thread overview]
Message-ID: <20220427130342.GA3915117@hori.linux.bs1.fc.nec.co.jp> (raw)
In-Reply-To: <5b956156-887b-1d91-7831-28a66db53c6a@huawei.com>

On Wed, Apr 27, 2022 at 03:11:31PM +0800, Miaohe Lin wrote:
> On 2022/4/27 12:28, Naoya Horiguchi wrote:
> > From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > 
> > When handling memory error on a hugetlb page, the error handler tries to
> > dissolve and turn it into 4kB pages.  If it's successfully dissolved,
> > PageHWPoison flag is moved to the raw error page, so but that's all
> 
> s/so but/so/

Fixed, thank you.

> 
> > right.  However, dissolve sometimes fails, then the error page is left
> > as hwpoisoned hugepage. It's useful if we can retry to dissolve it to
> > save healthy pages, but that's not possible now because the information
> > about where the raw error page is lost.
> > 
> > Use the private field of a tail page to keep that information.  The code
> 
> Only one raw error page is saved now. Should this be ok? I think so as memory
> failure should be rare anyway?

This is a good point.  It might be rare, but maybe we need some consideration
on it. Some ideas in my mind below ...

- using struct page of all subpages is not compatible with hugetlb_free_vmemmap,
  so it's not desirable.
- defining a linked list starting from hpage[SUBPAGE_INDEX_HWPOISON].private
  might be a solution to save the multiple offsets.
- hacking bits in hpage[SUBPAGE_INDEX_HWPOISON].private field to save offset
  info in compressed format.  For example, for 2MB hugepage there could be
  512 offset numbers, so we can save one offset with 9 bits subfield.
  So we can save upto 7 offsets in the field.  This is not flexible and
  still can't handle many errors.
- maintaining global data structure to save the pfn of all hwpoison pages
  in the system. This might sound overkilling for the current purpose,
  but this data structure might be helpful for other purpose, so in the long
  run someone might get interested in it.

> 
> > path of shrinking hugepage pool used this info to try delayed dissolve.
> > 
> > Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > ---
> >  include/linux/hugetlb.h | 24 ++++++++++++++++++++++++
> >  mm/hugetlb.c            |  9 +++++++++
> >  mm/memory-failure.c     |  2 ++
> >  3 files changed, 35 insertions(+)
> > 
> > diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> > index ac2a1d758a80..689e69cb556b 100644
> > --- a/include/linux/hugetlb.h
> > +++ b/include/linux/hugetlb.h
> > @@ -42,6 +42,9 @@ enum {
> >  	SUBPAGE_INDEX_CGROUP,		/* reuse page->private */
> >  	SUBPAGE_INDEX_CGROUP_RSVD,	/* reuse page->private */
> >  	__MAX_CGROUP_SUBPAGE_INDEX = SUBPAGE_INDEX_CGROUP_RSVD,
> > +#endif
> > +#ifdef CONFIG_CGROUP_HUGETLB
> > +	SUBPAGE_INDEX_HWPOISON,
> >  #endif
> 
> Do we rely on the CONFIG_CGROUP_HUGETLB to store the raw error page?

No. I meant CONFIG_MEMORY_FAILURE.
# I just copied and pasted the #ifdef line just above, and forget to update
# the CONFIG_* part :(

Thanks,
Naoya Horiguchi

  reply	other threads:[~2022-04-27 13:03 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-27  4:28 [RFC PATCH v1 0/4] mm, hwpoison: improve handling workload related to hugetlb and memory_hotplug Naoya Horiguchi
2022-04-27  4:28 ` [RFC PATCH v1 1/4] mm, hwpoison, hugetlb: introduce SUBPAGE_INDEX_HWPOISON to save raw error page Naoya Horiguchi
2022-04-27  7:11   ` Miaohe Lin
2022-04-27 13:03     ` HORIGUCHI NAOYA(堀口 直也) [this message]
2022-04-28  3:14       ` Miaohe Lin
2022-05-12 22:31   ` Jane Chu
2022-05-12 22:49     ` HORIGUCHI NAOYA(堀口 直也)
2022-04-27  4:28 ` [RFC PATCH v1 2/4] mm,hwpoison,hugetlb,memory_hotplug: hotremove memory section with hwpoisoned hugepage Naoya Horiguchi
2022-04-29  8:49   ` Miaohe Lin
2022-05-09  7:55     ` HORIGUCHI NAOYA(堀口 直也)
2022-05-09  8:57       ` Miaohe Lin
2022-04-27  4:28 ` [RFC PATCH v1 3/4] mm, hwpoison: add parameter unpoison to get_hwpoison_huge_page() Naoya Horiguchi
2022-04-27  4:28 ` [RFC PATCH v1 4/4] mm, memory_hotplug: fix inconsistent num_poisoned_pages on memory hotremove Naoya Horiguchi
2022-04-28  3:20   ` Miaohe Lin
2022-04-28  4:05     ` HORIGUCHI NAOYA(堀口 直也)
2022-04-28  7:16       ` Miaohe Lin
2022-05-09 13:34         ` Naoya Horiguchi
2022-04-27 10:48 ` [RFC PATCH v1 0/4] mm, hwpoison: improve handling workload related to hugetlb and memory_hotplug David Hildenbrand
2022-04-27 12:20   ` Oscar Salvador
2022-04-27 12:20   ` HORIGUCHI NAOYA(堀口 直也)
2022-04-28  8:44     ` David Hildenbrand
2022-05-09  7:29       ` HORIGUCHI NAOYA(堀口 直也)
2022-05-09  9:04         ` Miaohe Lin
2022-05-09  9:58           ` Oscar Salvador
2022-05-09 10:53             ` Miaohe Lin
2022-05-11 15:11               ` David Hildenbrand
2022-05-11 16:10                 ` HORIGUCHI NAOYA(堀口 直也)
2022-05-11 16:22                   ` David Hildenbrand
2022-05-12  3:04                     ` Miaohe Lin
2022-05-12  6:35                     ` HORIGUCHI NAOYA(堀口 直也)
2022-05-12  7:28                       ` David Hildenbrand
2022-05-12 11:13                         ` Miaohe Lin
2022-05-12 12:59                           ` David Hildenbrand
2022-05-16  3:25                             ` Miaohe Lin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220427130342.GA3915117@hori.linux.bs1.fc.nec.co.jp \
    --to=naoya.horiguchi@nec.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=linmiaohe@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=naoya.horiguchi@linux.dev \
    --cc=osalvador@suse.de \
    --cc=shy828301@gmail.com \
    --cc=songmuchun@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox