linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Naoya Horiguchi <naoya.horiguchi@linux.dev>
To: Miaohe Lin <linmiaohe@huawei.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@redhat.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Yang Shi <shy828301@gmail.com>,
	Oscar Salvador <osalvador@suse.de>,
	Muchun Song <songmuchun@bytedance.com>,
	Jane Chu <jane.chu@oracle.com>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 1/4] mm,hwpoison,hugetlb,memory_hotplug: hotremove memory section with hwpoisoned hugepage
Date: Wed, 28 Sep 2022 10:26:47 +0900	[thread overview]
Message-ID: <20220928012647.GA597297@u2004.lan> (raw)
In-Reply-To: <91a4759f-88e4-f9ac-aff5-41d2db5ecfdd@huawei.com>

On Sat, Sep 24, 2022 at 07:43:16PM +0800, Miaohe Lin wrote:
> On 2022/9/21 17:13, Naoya Horiguchi wrote:
> > From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > 
> > HWPoisoned page is not supposed to be accessed once marked, but currently
> > such accesses can happen during memory hotremove because do_migrate_range()
> > can be called before dissolve_free_huge_pages() is called.
> > 
> > Clear HPageMigratable for hwpoisoned hugepages to prevent them from being
> > migrated.  This should be done in hugetlb_lock to avoid race against
> > isolate_hugetlb().
> > 
> > get_hwpoison_huge_page() needs to have a flag to show it's called from
> > unpoison to take refcount of hwpoisoned hugepages, so add it.
> > 
> > Reported-by: Miaohe Lin <linmiaohe@huawei.com>
> > Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> 
> Thanks for your work, Naoya. Maybe something to improve below.
> 
> > ---
> > ChangeLog v2 -> v3
> > - move to the approach of clearing HPageMigratable instead of shifting
> >   dissolve_free_huge_pages.
> > ---
> >  include/linux/hugetlb.h |  4 ++--
> >  mm/hugetlb.c            |  4 ++--
> >  mm/memory-failure.c     | 12 ++++++++++--
> >  3 files changed, 14 insertions(+), 6 deletions(-)
> > 
> 
> <snip>
> 
> > @@ -7267,7 +7267,7 @@ int get_hwpoison_huge_page(struct page *page, bool *hugetlb)
> >  		*hugetlb = true;
> >  		if (HPageFreed(page))
> >  			ret = 0;
> > -		else if (HPageMigratable(page))
> > +		else if (HPageMigratable(page) || unpoison)
> 
> Is unpoison_memory() expected to restore the HPageMigratable flag as well ?

No it isn't. When unpoison_memory() unpoisons a hugepage, the hugepage
is sent back to free hugepage pool, so I think that there's no need to
restore HPageMigratable for it.

> 
> >  			ret = get_page_unless_zero(page);
> >  		else
> >  			ret = -EBUSY;
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index 145bb561ddb3..5942e1c0407e 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -1244,7 +1244,7 @@ static int __get_hwpoison_page(struct page *page, unsigned long flags)
> >  	int ret = 0;
> >  	bool hugetlb = false;
> >  
> > -	ret = get_hwpoison_huge_page(head, &hugetlb);
> > +	ret = get_hwpoison_huge_page(head, &hugetlb, false);
> >  	if (hugetlb)
> >  		return ret;
> >  
> > @@ -1334,7 +1334,7 @@ static int __get_unpoison_page(struct page *page)
> >  	int ret = 0;
> >  	bool hugetlb = false;
> >  
> > -	ret = get_hwpoison_huge_page(head, &hugetlb);
> > +	ret = get_hwpoison_huge_page(head, &hugetlb, true);
> >  	if (hugetlb)
> >  		return ret;
> >  
> > @@ -1815,6 +1815,13 @@ int __get_huge_page_for_hwpoison(unsigned long pfn, int flags)
> >  		goto out;
> >  	}
> >  
> > +	/*
> > +	 * Clearing HPageMigratable for hwpoisoned hugepages to prevent them
> > +	 * from being migrated by memory hotremove.
> > +	 */
> > +	if (count_increased)
> > +		ClearHPageMigratable(head);
> 
> I believe this can prevent hwpoisoned hugepages from being migrated though there still be some windows.

I'm not sure of "still racy" part, so could you elaborate it?
Main scenario this patch tries to handle is like below:

  CPU 0                                   CPU 1                                   
  get_huge_page_for_hwpoison                                                      
    // take hugetlb_lock                                                          
    __get_huge_page_for_hwpoison                                                  
                                          scan_movable_pages                      
                                            if HPageMigratable                    
                                              goto found                          
                                          do_migrate_range                        
      if HPageMigratable                                                          
        get_page_unless_zero                                                      
      hugetlb_set_page_hwpoison                                                   
      ClearHPageMigratable                                                        
    // release hugetlb_lock                                                       
                                            isolate_hugetlb                       
                                              // take hugetlb_lock                
                                              if !HPageMigratable                 
                                                // fails to isolate the hwpoisoned hugetlb.

Maybe the following could be possible.

  CPU 0                                   CPU 1
                                          scan_movable_pages
                                            if HPageMigratable
                                              goto found
                                          do_migrate_range
                                            isolate_hugetlb
                                              // the hugetlb is isolated,
                                              // but it's not hwpoisoned yet.
  get_huge_page_for_hwpoison
    // take hugetlb_lock
    __get_huge_page_for_hwpoison
      if HPageMigratable
        get_page_unless_zero
      hugetlb_set_page_hwpoison
      ClearHPageMigratable
    // release hugetlb_lock

In this case, the hugepage is maybe broken but not marked as hwpoison yet,
so it's not detectable.

> 
> > +
> >  	return ret;
> >  out:
> >  	if (count_increased)
> > @@ -1862,6 +1869,7 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
> >  
> >  	if (hwpoison_filter(p)) {
> >  		hugetlb_clear_page_hwpoison(head);
> > +		SetHPageMigratable(head);
> 
> Would we set HPageMigratable flag for free hugetlb pages here? IIUC, they're not expected to have this flag set.

Thank you, you're right.  This should be done in "if (res == 1)" block.
(hwpoison_filter often bothers me ...)

> 
> Thanks,
> Miaohe Lin

Thank you very much!
- Naoya Horiguchi


  reply	other threads:[~2022-09-28  1:27 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-21  9:13 [PATCH v3 0/4] mm, hwpoison: improve handling workload related to hugetlb and memory_hotplug Naoya Horiguchi
2022-09-21  9:13 ` [PATCH v3 1/4] mm,hwpoison,hugetlb,memory_hotplug: hotremove memory section with hwpoisoned hugepage Naoya Horiguchi
2022-09-24 11:43   ` Miaohe Lin
2022-09-28  1:26     ` Naoya Horiguchi [this message]
2022-09-28  9:32       ` Miaohe Lin
2022-10-07  0:45         ` HORIGUCHI NAOYA(堀口 直也)
2022-10-08  2:33           ` Miaohe Lin
2022-09-21  9:13 ` [PATCH v3 2/4] mm/hwpoison: move definitions of num_poisoned_pages_* to memory-failure.c Naoya Horiguchi
2022-09-24 11:53   ` Miaohe Lin
2022-09-28  2:05     ` Naoya Horiguchi
2022-09-28  7:56       ` Miaohe Lin
2022-09-21  9:13 ` [PATCH v3 3/4] mm/hwpoison: pass pfn to num_poisoned_pages_*() Naoya Horiguchi
2022-09-21  9:13 ` [PATCH v3 4/4] mm/hwpoison: introduce per-memory_block hwpoison counter Naoya Horiguchi
2022-09-23  8:26   ` [PATCH v4 4/4] mm/hwpoison: introduce per-memory_block hwpoison counter counter Naoya Horiguchi
2022-09-23 14:12     ` [PATCH v5 " Naoya Horiguchi
2022-09-24 12:27       ` Miaohe Lin
2022-10-07  0:47         ` HORIGUCHI NAOYA(堀口 直也)
2022-09-26  8:05       ` David Hildenbrand
2022-10-07  0:52         ` HORIGUCHI NAOYA(堀口 直也)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220928012647.GA597297@u2004.lan \
    --to=naoya.horiguchi@linux.dev \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=jane.chu@oracle.com \
    --cc=linmiaohe@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=naoya.horiguchi@nec.com \
    --cc=osalvador@suse.de \
    --cc=shy828301@gmail.com \
    --cc=songmuchun@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox