linux-mm.kvack.org archive mirror
From: 罗飞 <luofei@unicloud.com>
To: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
Cc: "tony.luck@intel.com" <tony.luck@intel.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH v2] hw/poison: Add in-use hugepage filter judgement and avoid filter page impact on mce handler
Date: Fri, 18 Feb 2022 06:30:50 +0000
Message-ID: <1bdf929216be4816bad82c8902cd174c@unicloud.com>
In-Reply-To: <20220218030814.GA2955567@hori.linux.bs1.fc.nec.co.jp>

>> After successfully obtaining the reference count of the huge
>> page, it is still necessary to call hwpoison_filter() to make a
>> filter judgement; otherwise the filtered hugepage will be unmapped
>> and the related process may be killed.
>>
>> Also, when the huge page meets the filter conditions, this should
>> not be treated as successful memory_failure() processing by the
>> mce handler. A distinct value should be returned to inform the
>> caller; otherwise the caller assumes the error page has been
>> identified and isolated, which may lead to calling set_mce_nospec()
>> to change page attributes, etc.
>>
>> Signed-off-by: luofei <luofei@unicloud.com>
>
>This patch seems to do two separate things (introducing MF_MCE_HANDLE,
>and adding hwpoison_filter() in memory_failure_hugetlb()), so could you
>separate the patch into two?

Yes, these two things are not closely related. I will submit two separate
patches to describe them. :)

>> -
>> -     /*
>> -      * -EHWPOISON from memory_failure() means that it already sent SIGBUS
>> -      * to the current process with the proper error info, so no need to
>> -      * send SIGBUS here again.
>> -      */
>> -     if (ret == -EHWPOISON)
>> +     } else if (ret == -EHWPOISON || ret == 1)
>> +             /*
>> +              * -EHWPOISON from memory_failure() means that it already sent SIGBUS
>> +              * to the current process with the proper error info, so no need to
>> +              * send SIGBUS here again.
>> +              *
>> +              * 1 means it's a filter page, no need to deal with.
>> +              */
>
>The new return code 1 seems to be handled in the same manner as -EHWPOISON,
>so how about simply using -EHWPOISON as return code for the new case?
>Then, the meaning of -EHWPOISON at this context would change like below:
>
>        /*
>-        * -EHWPOISON from memory_failure() means that it already sent SIGBUS
>-        * to the current process with the proper error info, so no need to
>-        * send SIGBUS here again.
>+        * -EHWPOISON from memory_failure() means that memory_failure() did
>+        * not handle the error event for the following reason:
>+        *   - SIGBUS has already been sent to the current process with the
>+        *     proper error info, or
>+        *   - hwpoison_filter() filtered the event,
>+        * so no need to deal with it more.
>          */


Yes, -EHWPOISON can cover this case as well, so I will just use -EHWPOISON
to simplify. :)


Thanks
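
For illustration only, the simplification agreed on above can be modelled
in userspace C. This is a rough sketch, not the eventual patch: the enum
and function names are hypothetical, and the code only mirrors the
decision kill_me_maybe() makes on the return value of memory_failure()
once -EHWPOISON covers both "SIGBUS already sent" and "hwpoison_filter()
filtered the event".

```c
#include <errno.h>

/* EHWPOISON is 133 in asm-generic/errno.h (used on x86); define it
 * here only if the C library headers do not provide it. */
#ifndef EHWPOISON
#define EHWPOISON 133
#endif

/* Hypothetical names for the three outcomes, for illustration only. */
enum mce_next_step {
	MCE_STEP_ISOLATE,	/* ret == 0: call set_mce_nospec() and return */
	MCE_STEP_NOTHING,	/* ret == -EHWPOISON: nothing more to do */
	MCE_STEP_NOT_RECOVERED,	/* anything else: "Memory error not recovered" path */
};

/*
 * Model of the decision taken on memory_failure()'s return value.
 * With the reviewer's suggestion there is no separate "ret == 1"
 * case: a filtered page is reported as -EHWPOISON as well.
 */
enum mce_next_step mce_next_step_for(int ret)
{
	if (ret == 0)
		return MCE_STEP_ISOLATE;
	if (ret == -EHWPOISON)
		return MCE_STEP_NOTHING;
	return MCE_STEP_NOT_RECOVERED;
}
```

Under this model a return value of 1 has no special meaning any more, so
the caller needs only one comparison to skip both already-handled cases.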

________________________________
From: HORIGUCHI NAOYA(堀口 直也) <naoya.horiguchi@nec.com>
Sent: February 18, 2022 11:08:14
To: 罗飞
Cc: tony.luck@intel.com; bp@alien8.de; tglx@linutronix.de; mingo@redhat.com; dave.hansen@linux.intel.com; x86@kernel.org; akpm@linux-foundation.org; hpa@zytor.com; linux-edac@vger.kernel.org; linux-kernel@vger.kernel.org; linux-mm@kvack.org
Subject: Re: [PATCH v2] hw/poison: Add in-use hugepage filter judgement and avoid filter page impact on mce handler

On Wed, Feb 16, 2022 at 10:00:38PM -0500, luofei wrote:
> After successfully obtaining the reference count of the huge
> page, it is still necessary to call hwpoison_filter() to make a
> filter judgement; otherwise the filtered hugepage will be unmapped
> and the related process may be killed.
>
> Also, when the huge page meets the filter conditions, this should
> not be treated as successful memory_failure() processing by the
> mce handler. A distinct value should be returned to inform the
> caller; otherwise the caller assumes the error page has been
> identified and isolated, which may lead to calling set_mce_nospec()
> to change page attributes, etc.
>
> Signed-off-by: luofei <luofei@unicloud.com>

This patch seems to do two separate things (introducing MF_MCE_HANDLE,
and adding hwpoison_filter() in memory_failure_hugetlb()), so could you
separate the patch into two?

> ---
>  arch/x86/kernel/cpu/mce/core.c | 22 +++++++++++-----------
>  include/linux/mm.h             |  1 +
>  mm/memory-failure.c            | 25 +++++++++++++++++++++++--
>  3 files changed, 35 insertions(+), 13 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 5818b837fd4d..c2b99c60225f 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -612,7 +612,7 @@ static int uc_decode_notifier(struct notifier_block *nb, unsigned long val,
>                return NOTIFY_DONE;
>
>        pfn = mce->addr >> PAGE_SHIFT;
> -     if (!memory_failure(pfn, 0)) {
> +     if (!memory_failure(pfn, MF_MCE_HANDLE)) {
>                set_mce_nospec(pfn, whole_page(mce));
>                mce->kflags |= MCE_HANDLED_UC;
>        }
> @@ -1286,7 +1286,7 @@ static void kill_me_now(struct callback_head *ch)
>  static void kill_me_maybe(struct callback_head *cb)
>  {
>        struct task_struct *p = container_of(cb, struct task_struct, mce_kill_me);
> -     int flags = MF_ACTION_REQUIRED;
> +     int flags = MF_ACTION_REQUIRED | MF_MCE_HANDLE;
>        int ret;
>
>        p->mce_count = 0;
> @@ -1300,14 +1300,14 @@ static void kill_me_maybe(struct callback_head *cb)
>                set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
>                sync_core();
>                return;
> -     }
> -
> -     /*
> -      * -EHWPOISON from memory_failure() means that it already sent SIGBUS
> -      * to the current process with the proper error info, so no need to
> -      * send SIGBUS here again.
> -      */
> -     if (ret == -EHWPOISON)
> +     } else if (ret == -EHWPOISON || ret == 1)
> +             /*
> +              * -EHWPOISON from memory_failure() means that it already sent SIGBUS
> +              * to the current process with the proper error info, so no need to
> +              * send SIGBUS here again.
> +              *
> +              * 1 means it's a filter page, no need to deal with.
> +              */

The new return code 1 seems to be handled in the same manner as -EHWPOISON,
so how about simply using -EHWPOISON as return code for the new case?
Then, the meaning of -EHWPOISON at this context would change like below:

         /*
-        * -EHWPOISON from memory_failure() means that it already sent SIGBUS
-        * to the current process with the proper error info, so no need to
-        * send SIGBUS here again.
+        * -EHWPOISON from memory_failure() means that memory_failure() did
+        * not handle the error event for the following reason:
+        *   - SIGBUS has already been sent to the current process with the
+        *     proper error info, or
+        *   - hwpoison_filter() filtered the event,
+        * so no need to deal with it more.
          */


Thanks,
Naoya Horiguchi

>                return;
>
>        pr_err("Memory error not recovered");
> @@ -1320,7 +1320,7 @@ static void kill_me_never(struct callback_head *cb)
>
>        p->mce_count = 0;
>        pr_err("Kernel accessed poison in user space at %llx\n", p->mce_addr);
> -     if (!memory_failure(p->mce_addr >> PAGE_SHIFT, 0))
> +     if (!memory_failure(p->mce_addr >> PAGE_SHIFT, MF_MCE_HANDLE))
>                set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
>  }
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 213cc569b192..f4703f948e9a 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3188,6 +3188,7 @@ enum mf_flags {
>        MF_MUST_KILL = 1 << 2,
>        MF_SOFT_OFFLINE = 1 << 3,
>        MF_UNPOISON = 1 << 4,
> +     MF_MCE_HANDLE = 1 << 5,
>  };
>  extern int memory_failure(unsigned long pfn, int flags);
>  extern void memory_failure_queue(unsigned long pfn, int flags);
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 97a9ed8f87a9..1a0bd91a685b 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1526,7 +1526,10 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
>                                if (TestClearPageHWPoison(head))
>                                        num_poisoned_pages_dec();
>                                unlock_page(head);
> -                             return 0;
> +                             if (flags & MF_MCE_HANDLE)
> +                                     return 1;
> +                             else
> +                                     return 0;
>                        }
>                        unlock_page(head);
>                        res = MF_FAILED;
> @@ -1545,6 +1548,17 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
>        lock_page(head);
>        page_flags = head->flags;
>
> +     if (hwpoison_filter(p)) {
> +             if (TestClearPageHWPoison(head))
> +                     num_poisoned_pages_dec();
> +             put_page(p);
> +             if (flags & MF_MCE_HANDLE)
> +                     res = 1;
> +             else
> +                     res = 0;
> +             goto out;
> +     }
> +
>        /*
>         * TODO: hwpoison for pud-sized hugetlb doesn't work right now, so
>         * simply disable it. In order to make it work properly, we need
> @@ -1613,7 +1627,10 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
>                goto out;
>
>        if (hwpoison_filter(page)) {
> -             rc = 0;
> +             if (flags & MF_MCE_HANDLE)
> +                     rc = 1;
> +             else
> +                     rc = 0;
>                goto unlock;
>        }
>
> @@ -1837,6 +1854,10 @@ int memory_failure(unsigned long pfn, int flags)
>                        num_poisoned_pages_dec();
>                unlock_page(p);
>                put_page(p);
> +             if (flags & MF_MCE_HANDLE)
> +                     res = 1;
> +             else
> +                     res = 0;
>                goto unlock_mutex;
>        }
>
> --
> 2.27.0
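
As a reading aid, the v2 patch quoted above repeats the same
"if (flags & MF_MCE_HANDLE) ... = 1; else ... = 0;" pattern at each
filter site. A minimal userspace sketch of that pattern follows. The
helper name mf_filtered_result() is hypothetical and does not appear in
the patch; the value of MF_ACTION_REQUIRED is not shown in the quoted
hunk and is assumed here to be 1 << 1. The review in this thread
proposes returning -EHWPOISON instead, so this only models v2 as posted.

```c
/* Flag bits as shown in the quoted include/linux/mm.h hunk. */
enum mf_flags {
	MF_ACTION_REQUIRED = 1 << 1,	/* assumption: not shown in the hunk */
	MF_MUST_KILL = 1 << 2,
	MF_SOFT_OFFLINE = 1 << 3,
	MF_UNPOISON = 1 << 4,
	MF_MCE_HANDLE = 1 << 5,		/* flag proposed by the v2 patch */
};

/*
 * Hypothetical helper: the value v2 returns when hwpoison_filter()
 * filters a page. MCE callers (which pass MF_MCE_HANDLE) get 1 so they
 * do not treat the page as isolated; all other callers still get 0.
 */
int mf_filtered_result(int flags)
{
	return (flags & MF_MCE_HANDLE) ? 1 : 0;
}
```

With such a helper, each filter site in the sketch would reduce to a
single assignment such as "res = mf_filtered_result(flags);".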

Thread overview: 3+ messages
2022-02-17  3:00 luofei
2022-02-18  3:08 ` HORIGUCHI NAOYA(堀口 直也)
2022-02-18  6:30   ` 罗飞 [this message]
