linux-mm.kvack.org archive mirror
From: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
To: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@redhat.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Yang Shi <shy828301@gmail.com>,
	Oscar Salvador <osalvador@suse.de>,
	Muchun Song <songmuchun@bytedance.com>,
	Jane Chu <jane.chu@oracle.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v5 4/4] mm/hwpoison: introduce per-memory_block hwpoison counter
Date: Fri, 7 Oct 2022 00:47:00 +0000
Message-ID: <20221007004700.GB3227576@hori.linux.bs1.fc.nec.co.jp>
In-Reply-To: <542473d1-b687-55b8-24d1-96af715aed56@huawei.com>

On Sat, Sep 24, 2022 at 08:27:35PM +0800, Miaohe Lin wrote:
> On 2022/9/23 22:12, Naoya Horiguchi wrote:
> > There seems to be another build error on aarch64 with MEMORY_HOTPLUG disabled:
> > https://lore.kernel.org/lkml/20220923110144.GA1413812@ik1-406-35019.vs.sakura.ne.jp/
> > So let me revise this patch again to handle it.
> > 
> > - Naoya Horiguchi
> > 
> > ---
> > From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > Date: Fri, 23 Sep 2022 22:51:20 +0900
> > Subject: [PATCH v5 4/4] mm/hwpoison: introduce per-memory_block hwpoison counter
> > 
> > Currently the PageHWPoison flag does not behave well when experiencing memory
> > hotremove/hotplug.  Any data field in struct page is unreliable when the
> > associated memory is offlined, and the current mechanism can't tell whether
> > a memory section is onlined because a new memory device is installed or
> > because previous failed offline operations are undone.  Especially if
> > there's hwpoisoned memory, it's unclear what the best option is.
> > 
> > So introduce a new mechanism to make struct memory_block remember that
> > a memory block has hwpoisoned memory inside it, and make any online event
> > fail if the onlined memory block contains hwpoison.  struct memory_block
> > is freed and reallocated over ACPI-based hotremove/hotplug, but not over
> > sysfs-based hotremove/hotplug.  So it's desirable to implement the hwpoison
> > counter on this struct.
> > 
> > Note that clear_hwpoisoned_pages() is relocated so that it is called earlier,
> > just before unregistering struct memory_block.  Otherwise, the
> > per-memory_block hwpoison counter is freed first and we fail to adjust the
> > global hwpoison counter properly.
> > 
> > Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > Reported-by: kernel test robot <lkp@intel.com>
> 
> LGTM with some nits below. Thanks.
> 
> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Thank you.

> 
> > ---
> > ChangeLog v4 -> v5:
> > - add Reported-by of lkp bot,
> > - check both CONFIG_MEMORY_FAILURE and CONFIG_MEMORY_HOTPLUG in introduced #ifdefs,
> >   intending to fix "undefined reference" errors in aarch64.
> > 
> > ChangeLog v3 -> v4:
> > - fix build error (https://lore.kernel.org/linux-mm/202209231134.tnhKHRfg-lkp@intel.com/)
> >   by using memblk_nr_poison() to access the member ->nr_hwpoison
> > ---
> >  drivers/base/memory.c  | 34 ++++++++++++++++++++++++++++++++++
> >  include/linux/memory.h |  3 +++
> >  include/linux/mm.h     | 24 ++++++++++++++++++++++++
> >  mm/internal.h          |  8 --------
> >  mm/memory-failure.c    | 31 ++++++++++---------------------
> >  mm/sparse.c            |  2 --
> >  6 files changed, 71 insertions(+), 31 deletions(-)
> > 
> > diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> > index 9aa0da991cfb..99e0e789616c 100644
> > --- a/drivers/base/memory.c
> > +++ b/drivers/base/memory.c
> > @@ -183,6 +183,9 @@ static int memory_block_online(struct memory_block *mem)
> >  	struct zone *zone;
> >  	int ret;
> >  
> > +	if (memblk_nr_poison(start_pfn))
> > +		return -EHWPOISON;
> > +
> >  	zone = zone_for_pfn_range(mem->online_type, mem->nid, mem->group,
> >  				  start_pfn, nr_pages);
> >  
> > @@ -864,6 +867,7 @@ void remove_memory_block_devices(unsigned long start, unsigned long size)
> >  		mem = find_memory_block_by_id(block_id);
> >  		if (WARN_ON_ONCE(!mem))
> >  			continue;
> > +		clear_hwpoisoned_pages(memblk_nr_poison(start));
> 
> clear_hwpoisoned_pages() no longer seems like a proper name? The PageHWPoison info is kept now. But this should be trivial.
> 

Right, I think that the name num_poisoned_pages_sub() is clear enough, so
I'll open-code this function, i.e. call num_poisoned_pages_sub() directly.
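
To illustrate why the global adjustment has to happen before the memory
block is unregistered, here is a minimal user-space sketch.  It is not
kernel code: struct model_memory_block, model_num_poisoned_pages, and all
model_*() helpers are hypothetical stand-ins.  Only the ordering (read the
per-block count, subtract it from the global count, then free the block)
reflects the patch description.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's global hwpoison page counter. */
static atomic_long model_num_poisoned_pages;

/* Hypothetical stand-in for struct memory_block; only the counter is kept. */
struct model_memory_block {
	atomic_long nr_hwpoison;
};

/* Allocate a block with a zeroed counter; returns NULL on failure. */
static struct model_memory_block *model_alloc_block(void)
{
	struct model_memory_block *mem = malloc(sizeof(*mem));

	if (mem)
		atomic_init(&mem->nr_hwpoison, 0);
	return mem;
}

/* Account one poisoned page in both the global and the per-block counter. */
static void model_poison_page(struct model_memory_block *mem)
{
	atomic_fetch_add(&model_num_poisoned_pages, 1);
	atomic_fetch_add(&mem->nr_hwpoison, 1);
}

/*
 * Remove a block.  The per-block count must be read and subtracted from the
 * global counter before the block is freed; after free() it is gone.
 */
static void model_remove_block(struct model_memory_block *mem)
{
	assert(mem != NULL);
	atomic_fetch_sub(&model_num_poisoned_pages,
			 atomic_load(&mem->nr_hwpoison));
	free(mem);
}
```

In this model, poisoning two pages in a block and then removing the block
brings the global counter back to zero.  Freeing the block first would
leave no per-block count to subtract.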

> >  		unregister_memory_block_under_nodes(mem);
> >  		remove_memory_block(mem);
> >  	}
> > @@ -1164,3 +1168,33 @@ int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func,
> >  	}
> >  	return ret;
> >  }
> > +
> > +#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_MEMORY_HOTPLUG)
> > +void memblk_nr_poison_inc(unsigned long pfn)
> > +{
> > +	const unsigned long block_id = pfn_to_block_id(pfn);
> > +	struct memory_block *mem = find_memory_block_by_id(block_id);
> > +
> > +	if (mem)
> > +		atomic_long_inc(&mem->nr_hwpoison);
> > +}
> > +
> > +void memblk_nr_poison_sub(unsigned long pfn, long i)
> > +{
> > +	const unsigned long block_id = pfn_to_block_id(pfn);
> > +	struct memory_block *mem = find_memory_block_by_id(block_id);
> > +
> > +	if (mem)
> > +		atomic_long_sub(i, &mem->nr_hwpoison);
> > +}
> > +
> > +unsigned long memblk_nr_poison(unsigned long pfn)
> 
> memblk_nr_poison() is only used inside this file. Make it static?

Thanks, I'll add it.

Thanks,
Naoya Horiguchi
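
For readers following the thread, the per-memory_block counter and the
online-time check can be sketched in user space as below.  This is only an
illustration under stated assumptions: the block table, MODEL_PAGES_PER_BLOCK,
MODEL_NR_BLOCKS, and every model_*() name are hypothetical, and the lookup by
division merely stands in for pfn_to_block_id()/find_memory_block_by_id().
The helper that reads the counter is static, as suggested in the review.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* Linux errno value; fallback for other platforms */
#endif

/* Assumed sizes, chosen for illustration only. */
#define MODEL_PAGES_PER_BLOCK 32768UL
#define MODEL_NR_BLOCKS 4UL

/* Hypothetical stand-in for struct memory_block. */
struct model_block {
	int present;			/* nonzero once the block is registered */
	atomic_long nr_hwpoison;	/* poisoned pages inside this block */
};

static struct model_block model_blocks[MODEL_NR_BLOCKS];

/* Register block @id with a zeroed counter. */
static void model_add_block(unsigned long id)
{
	assert(id < MODEL_NR_BLOCKS);
	atomic_store(&model_blocks[id].nr_hwpoison, 0);
	model_blocks[id].present = 1;
}

/* Map a pfn to its block, or NULL if no such block is registered. */
static struct model_block *model_find_block(unsigned long pfn)
{
	unsigned long id = pfn / MODEL_PAGES_PER_BLOCK;

	if (id >= MODEL_NR_BLOCKS || !model_blocks[id].present)
		return NULL;
	return &model_blocks[id];
}

static void model_nr_poison_inc(unsigned long pfn)
{
	struct model_block *mem = model_find_block(pfn);

	if (mem)
		atomic_fetch_add(&mem->nr_hwpoison, 1);
}

static void model_nr_poison_sub(unsigned long pfn, long i)
{
	struct model_block *mem = model_find_block(pfn);

	if (mem)
		atomic_fetch_sub(&mem->nr_hwpoison, i);
}

/* Returns 0 when the pfn has no registered block. */
static unsigned long model_nr_poison(unsigned long pfn)
{
	struct model_block *mem = model_find_block(pfn);

	return mem ? (unsigned long)atomic_load(&mem->nr_hwpoison) : 0;
}

/* Refuse to online a block that still records poisoned pages. */
static int model_block_online(unsigned long start_pfn)
{
	if (model_nr_poison(start_pfn))
		return -EHWPOISON;
	return 0;
}
```

With block 0 registered, model_block_online(0) succeeds until a pfn in that
block is passed to model_nr_poison_inc(), after which it returns -EHWPOISON
until the count is subtracted again.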
