From: Muchun Song <songmuchun@bytedance.com>
To: zhenwei pi <pizhenwei@bytedance.com>
Cc: akpm@linux-foundation.org, naoya.horiguchi@nec.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	david@redhat.com, linmiaohe@huawei.com
Subject: Re: [PATCH v4 1/2] mm/memory-failure: introduce "hwpoisoned-pages" entry
Date: Tue, 14 Jun 2022 13:12:19 +0800
Message-ID: <YqgYs75fD019NkUd@FVFYT0MHHV2J.usts.net>
In-Reply-To: <20220614043830.99607-2-pizhenwei@bytedance.com>

On Tue, Jun 14, 2022 at 12:38:29PM +0800, zhenwei pi wrote:
> Add a new debug entry to show the number of hwpoisoned pages. And
> use module_get/module_put to manage this kernel module, don't allow
> to remove this module unless hwpoisoned-pages is zero.
> 
> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
> ---
>  Documentation/vm/hwpoison.rst |  4 ++++
>  mm/hwpoison-inject.c          | 19 ++++++++++++++++++-
>  2 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/vm/hwpoison.rst b/Documentation/vm/hwpoison.rst
> index c742de1769d1..c832a8b192d4 100644
> --- a/Documentation/vm/hwpoison.rst
> +++ b/Documentation/vm/hwpoison.rst
> @@ -155,6 +155,10 @@ Testing
>  	flag bits are defined in include/linux/kernel-page-flags.h and
>  	documented in Documentation/admin-guide/mm/pagemap.rst
>  
> +  hwpoisoned-pages

This name looks a bit odd to me. IIUC, it counts **software**-poisoned
pages rather than **hardware**-poisoned ones, so the "hw" prefix may not be
suitable. How about "poisoned-pages" (slightly simpler), "poisoned-pfns"
(consistent with "corrupt-pfn" and "unpoison-pfn"), or "swpoisoned-pages"
("sw" for software)?

> +	The number of hwpoisoned pages. The hwpoison kernel module can not be
> +	removed unless this count is zero.
> +
>  * Architecture specific MCE injector
>  
>    x86 has mce-inject, mce-test
> diff --git a/mm/hwpoison-inject.c b/mm/hwpoison-inject.c
> index 5c0cddd81505..9e522ecedeef 100644
> --- a/mm/hwpoison-inject.c
> +++ b/mm/hwpoison-inject.c
> @@ -10,6 +10,7 @@
>  #include "internal.h"
>  
>  static struct dentry *hwpoison_dir;
> +static atomic_t hwpoisoned_pages;
>  
>  static int hwpoison_inject(void *data, u64 val)
>  {
> @@ -49,15 +50,28 @@ static int hwpoison_inject(void *data, u64 val)
>  inject:
>  	pr_info("Injecting memory failure at pfn %#lx\n", pfn);
>  	err = memory_failure(pfn, 0);
> +	if (!err) {
> +		WARN_ON(!try_module_get(THIS_MODULE));

__module_get() is enough here, since we already hold a refcount from open
time. As written, this WARN_ON() can only trigger if something unexpected
happens.
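
To illustrate the distinction, here is a rough userspace model, not kernel
code. The struct and the model_* helpers are hypothetical names made up for
this sketch, and the real module refcount is simplified to a plain int plus
a "live" flag. It only shows why an unconditional get is safe when the
caller already holds a reference, and how pairing get/put with
poison/unpoison keeps the module pinned:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a module; not the kernel's struct module. */
struct model_module {
	int refcnt;	/* references currently held */
	bool live;	/* false once unloading has started */
};

/* Models try_module_get(): may fail, so the caller must check the result. */
static bool model_try_get(struct model_module *m)
{
	if (!m->live)
		return false;
	m->refcnt++;
	return true;
}

/*
 * Models __module_get(): unconditional. Only valid when the caller
 * already holds a reference (here: the one taken at open time).
 */
static void model_get(struct model_module *m)
{
	assert(m->refcnt > 0);	/* precondition, checked only in this model */
	m->refcnt++;
}

/* Models module_put(): drops one reference. */
static void model_put(struct model_module *m)
{
	assert(m->refcnt > 0);
	m->refcnt--;
}

/* In this model, unloading is allowed only when no references remain. */
static bool model_can_unload(const struct model_module *m)
{
	return m->refcnt == 0;
}
```

In this model, open takes a reference with model_try_get(), a successful
inject adds one with model_get(), and a successful unpoison drops it with
model_put(). model_can_unload() stays false after the file is closed until
the poisoned page has been unpoisoned.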

> +		atomic_inc(&hwpoisoned_pages);
> +	}
> +
>  	return (err == -EOPNOTSUPP) ? 0 : err;
>  }
>  
>  static int hwpoison_unpoison(void *data, u64 val)
>  {
> +	int ret;
> +
>  	if (!capable(CAP_SYS_ADMIN))
>  		return -EPERM;
>  
> -	return unpoison_memory(val);
> +	ret = unpoison_memory(val);
> +	if (!ret) {
> +		atomic_dec(&hwpoisoned_pages);
> +		module_put(THIS_MODULE);
> +	}
> +
> +	return ret;
>  }
>  
>  DEFINE_DEBUGFS_ATTRIBUTE(hwpoison_fops, NULL, hwpoison_inject, "%lli\n");
> @@ -99,6 +113,9 @@ static int pfn_inject_init(void)
>  	debugfs_create_u64("corrupt-filter-flags-value", 0600, hwpoison_dir,
>  			   &hwpoison_filter_flags_value);
>  
> +	debugfs_create_atomic_t("hwpoisoned-pages", 0400, hwpoison_dir,
> +			   &hwpoisoned_pages);
> +
>  #ifdef CONFIG_MEMCG
>  	debugfs_create_u64("corrupt-filter-memcg", 0600, hwpoison_dir,
>  			   &hwpoison_filter_memcg);
> -- 
> 2.20.1
> 
> 

