From: John Hubbard <jhubbard@nvidia.com>
To: Axel Rasmussen <axelrasmussen@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Andy Lutomirski <luto@kernel.org>,
"Aneesh Kumar K.V" <aneesh.kumar@kernel.org>,
Borislav Petkov <bp@alien8.de>,
Christophe Leroy <christophe.leroy@csgroup.eu>,
Dave Hansen <dave.hansen@linux.intel.com>,
David Hildenbrand <david@redhat.com>,
"H. Peter Anvin" <hpa@zytor.com>, Helge Deller <deller@gmx.de>,
Ingo Molnar <mingo@redhat.com>,
"James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>,
Liu Shixin <liushixin2@huawei.com>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Michael Ellerman <mpe@ellerman.id.au>,
Muchun Song <muchun.song@linux.dev>,
"Naveen N. Rao" <naveen.n.rao@linux.ibm.com>,
Nicholas Piggin <npiggin@gmail.com>,
Oscar Salvador <osalvador@suse.de>, Peter Xu <peterx@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Suren Baghdasaryan <surenb@google.com>,
Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
x86@kernel.org
Subject: Re: [PATCH 1/1] arch/fault: don't print logs for simulated poison errors
Date: Thu, 9 May 2024 14:30:21 -0700
Message-ID: <d04a838b-848d-405d-9317-40282cd58c36@nvidia.com>
In-Reply-To: <20240509203907.504891-2-axelrasmussen@google.com>

On 5/9/24 1:39 PM, Axel Rasmussen wrote:
> For real MCEs, various architectures print log messages when poisoned
> memory is accessed (which results in a SIGBUS). These messages can be
> important for users to understand the issue.
>
> On the other hand, we have the userfaultfd UFFDIO_POISON operation,
> which can "simulate" memory poisoning. That particular process will get
> SIGBUS on access to the memory, but this effect is tied to an MM, rather
> than being global like a real poison event. So, we don't want to log
> about this case to the global kernel log; instead, let the process
> itself log or whatever else it wants to do. This avoids spamming the
> kernel log, and avoids e.g. drowning out real events with simulated
> ones.
>
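To illustrate the "let the process itself log" point above, here is a minimal userspace sketch of a SIGBUS handler that does its own logging. It is not part of the patch. The helper names (is_poison_sigbus, sigbus_handler, install_sigbus_logger) are hypothetical, and it assumes the fault path delivers SIGBUS with si_code BUS_MCEERR_AR, as the arch do_sigbus() paths do for hwpoison faults.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: report whether a signal describes an access to
 * poisoned memory. Assumption: both real and simulated poison arrive as
 * SIGBUS with one of the BUS_MCEERR_* codes.
 */
static int is_poison_sigbus(const siginfo_t *info)
{
	return info->si_signo == SIGBUS &&
	       (info->si_code == BUS_MCEERR_AR ||
		info->si_code == BUS_MCEERR_AO);
}

/*
 * Log from the process itself, using only async-signal-safe calls, then
 * exit. Returning instead would re-run the faulting access and fault again.
 */
static void sigbus_handler(int signo, siginfo_t *info, void *ucontext)
{
	static const char poison_msg[] = "SIGBUS: access to poisoned memory\n";
	static const char other_msg[] = "SIGBUS: other bus error\n";
	ssize_t n;

	(void)signo;
	(void)ucontext;

	if (is_poison_sigbus(info))
		n = write(STDERR_FILENO, poison_msg, sizeof(poison_msg) - 1);
	else
		n = write(STDERR_FILENO, other_msg, sizeof(other_msg) - 1);
	(void)n;

	_exit(EXIT_FAILURE);
}

/* Install the handler; returns 0 on success, -1 on error (see sigaction(2)). */
static int install_sigbus_logger(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = sigbus_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);

	return sigaction(SIGBUS, &sa, NULL);
}
```

A process that uses UFFDIO_POISON could call install_sigbus_logger() at startup. A real handler would probably also record info->si_addr, and might recover with siglongjmp() rather than exit.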
> To identify this situation, add a new VM_FAULT_HWPOISON_SIM flag. This
> is expected to be set *in addition to* one of the existing
> VM_FAULT_HWPOISON or VM_FAULT_HWPOISON_LARGE flags (which are mutually
> exclusive).
>
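The flag semantics described above can be modeled in plain userspace C. This is only a sketch of the decision the arch fault handlers make, not kernel code: the flag values are copied from the enum vm_fault_reason hunk in this patch, while is_hwpoison_fault() and should_log_mce() are hypothetical helper names used for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the enum vm_fault_reason hunk in this patch. */
enum {
	VM_FAULT_HWPOISON       = 0x000010,
	VM_FAULT_HWPOISON_LARGE = 0x000020,
	VM_FAULT_HWPOISON_SIM   = 0x000080,
};

/* The new bit is a modifier, so it must not overlap the bits it qualifies. */
static_assert((VM_FAULT_HWPOISON_SIM &
	       (VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_LARGE)) == 0,
	      "HWPOISON_SIM must be a distinct bit");

/* Hypothetical helper: true for any poison fault, real or simulated. */
static bool is_hwpoison_fault(unsigned int fault)
{
	return (fault & (VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_LARGE)) != 0;
}

/*
 * Hypothetical helper mirroring the check added to the arch handlers:
 * print the "MCE: Killing ..." message only for real poison events.
 */
static bool should_log_mce(unsigned int fault)
{
	return is_hwpoison_fault(fault) && !(fault & VM_FAULT_HWPOISON_SIM);
}
```

With this model, VM_FAULT_HWPOISON alone is logged, while VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_SIM (what handle_pte_marker() returns after this patch) is not. SIGBUS delivery does not depend on the SIM bit in either case.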
> Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
> ---
> arch/parisc/mm/fault.c | 7 +++++--
> arch/powerpc/mm/fault.c | 6 ++++--
> arch/x86/mm/fault.c | 6 ++++--
> include/linux/mm_types.h | 5 +++++
> mm/hugetlb.c | 3 ++-
> mm/memory.c | 2 +-
> 6 files changed, 21 insertions(+), 8 deletions(-)
>

This completely fixes the uffd-unit-test behavior. I also did a quick
test run to confirm.

Reviewed-by: John Hubbard <jhubbard@nvidia.com>

thanks,
--
John Hubbard
NVIDIA

> diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c
> index c39de84e98b0..e5370bcadf27 100644
> --- a/arch/parisc/mm/fault.c
> +++ b/arch/parisc/mm/fault.c
> @@ -400,9 +400,12 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
> #ifdef CONFIG_MEMORY_FAILURE
> if (fault & (VM_FAULT_HWPOISON|VM_FAULT_HWPOISON_LARGE)) {
> unsigned int lsb = 0;
> - printk(KERN_ERR
> +
> + if (!(fault & VM_FAULT_HWPOISON_SIM)) {
> + pr_err(
> "MCE: Killing %s:%d due to hardware memory corruption fault at %08lx\n",
> - tsk->comm, tsk->pid, address);
> + tsk->comm, tsk->pid, address);
> + }
> /*
> * Either small page or large page may be poisoned.
> * In other words, VM_FAULT_HWPOISON_LARGE and
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 53335ae21a40..ac5e8a3c7fba 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -140,8 +140,10 @@ static int do_sigbus(struct pt_regs *regs, unsigned long address,
> if (fault & (VM_FAULT_HWPOISON|VM_FAULT_HWPOISON_LARGE)) {
> unsigned int lsb = 0; /* shutup gcc */
>
> - pr_err("MCE: Killing %s:%d due to hardware memory corruption fault at %lx\n",
> - current->comm, current->pid, address);
> + if (!(fault & VM_FAULT_HWPOISON_SIM)) {
> + pr_err("MCE: Killing %s:%d due to hardware memory corruption fault at %lx\n",
> + current->comm, current->pid, address);
> + }
>
> if (fault & VM_FAULT_HWPOISON_LARGE)
> lsb = hstate_index_to_shift(VM_FAULT_GET_HINDEX(fault));
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index e4f3c7721f45..16d077a3ad14 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -928,9 +928,11 @@ do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
> struct task_struct *tsk = current;
> unsigned lsb = 0;
>
> - pr_err_ratelimited(
> + if (!(fault & VM_FAULT_HWPOISON_SIM)) {
> + pr_err_ratelimited(
> "MCE: Killing %s:%d due to hardware memory corruption fault at %lx\n",
> - tsk->comm, tsk->pid, address);
> + tsk->comm, tsk->pid, address);
> + }
> if (fault & VM_FAULT_HWPOISON_LARGE)
> lsb = hstate_index_to_shift(VM_FAULT_GET_HINDEX(fault));
> if (fault & VM_FAULT_HWPOISON)
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 5240bd7bca33..7f8fc3efc5b2 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -1226,6 +1226,9 @@ typedef __bitwise unsigned int vm_fault_t;
> * @VM_FAULT_HWPOISON_LARGE: Hit poisoned large page. Index encoded
> * in upper bits
> * @VM_FAULT_SIGSEGV: segmentation fault
> + * @VM_FAULT_HWPOISON_SIM: Hit poisoned PTE marker; this indicates a
> + * simulated poison (e.g. via userfaultfd's
> + * UFFDIO_POISON), not a "real" hwerror.
> * @VM_FAULT_NOPAGE: ->fault installed the pte, not return page
> * @VM_FAULT_LOCKED: ->fault locked the returned page
> * @VM_FAULT_RETRY: ->fault blocked, must retry
> @@ -1245,6 +1248,7 @@ enum vm_fault_reason {
> VM_FAULT_HWPOISON = (__force vm_fault_t)0x000010,
> VM_FAULT_HWPOISON_LARGE = (__force vm_fault_t)0x000020,
> VM_FAULT_SIGSEGV = (__force vm_fault_t)0x000040,
> + VM_FAULT_HWPOISON_SIM = (__force vm_fault_t)0x000080,
> VM_FAULT_NOPAGE = (__force vm_fault_t)0x000100,
> VM_FAULT_LOCKED = (__force vm_fault_t)0x000200,
> VM_FAULT_RETRY = (__force vm_fault_t)0x000400,
> @@ -1270,6 +1274,7 @@ enum vm_fault_reason {
> { VM_FAULT_HWPOISON, "HWPOISON" }, \
> { VM_FAULT_HWPOISON_LARGE, "HWPOISON_LARGE" }, \
> { VM_FAULT_SIGSEGV, "SIGSEGV" }, \
> + { VM_FAULT_HWPOISON_SIM, "HWPOISON_SIM" }, \
> { VM_FAULT_NOPAGE, "NOPAGE" }, \
> { VM_FAULT_LOCKED, "LOCKED" }, \
> { VM_FAULT_RETRY, "RETRY" }, \
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 65456230cc71..2b4e0173e806 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -6485,7 +6485,8 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> pte_marker_get(pte_to_swp_entry(entry));
>
> if (marker & PTE_MARKER_POISONED) {
> - ret = VM_FAULT_HWPOISON_LARGE |
> + ret = VM_FAULT_HWPOISON_SIM |
> + VM_FAULT_HWPOISON_LARGE |
> VM_FAULT_SET_HINDEX(hstate_index(h));
> goto out_mutex;
> }
> diff --git a/mm/memory.c b/mm/memory.c
> index d2155ced45f8..29a833b996ae 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3910,7 +3910,7 @@ static vm_fault_t handle_pte_marker(struct vm_fault *vmf)
>
> /* Higher priority than uffd-wp when data corrupted */
> if (marker & PTE_MARKER_POISONED)
> - return VM_FAULT_HWPOISON;
> + return VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_SIM;
>
> if (pte_marker_entry_uffd_wp(entry))
> return pte_marker_handle_uffd_wp(vmf);