* DAX error inject/page poison
From: Mike Kravetz @ 2017-09-19 23:15 UTC
To: linux-nvdimm, linux-mm, linux-kernel
Cc: Dan Williams, Ross Zwisler, Vishal L Verma
We were trying to simulate pmem errors in an environment where a DAX
filesystem is used (ext4 although I suspect it does not matter). The
sequence attempted on a DAX filesystem is:
- Populate a file in the DAX filesystem
- mmap the file
- madvise(MADV_HWPOISON)
The madvise operation fails with EFAULT. This appears to come from
get_user_pages() as there are no struct pages for such mappings?
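The sequence above can be sketched in C roughly as follows. This is a minimal sketch, not the test program actually used: the helper names poison_first_page() and explain_hwpoison_errno() are hypothetical, the path is assumed to point at a file on the DAX mount, and the errno explanations only summarise what madvise(2) documents plus the EFAULT reported here. Note that MADV_HWPOISON needs CAP_SYS_ADMIN and, on ordinary memory, really does take the page out of service.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_HWPOISON
#define MADV_HWPOISON 100 /* Linux value; fallback for older headers */
#endif

/*
 * Hypothetical helper: populate one page of `path`, map it shared, and
 * ask the kernel to poison it. Returns 0 on success, otherwise the errno
 * of the first call that failed.
 */
int poison_first_page(const char *path)
{
    long page = sysconf(_SC_PAGESIZE);
    int err = 0;
    void *map;
    int fd;

    if (page <= 0)
        return EINVAL;

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return errno;

    /* Size the file, then populate it by writing through the mapping. */
    if (ftruncate(fd, page) != 0) {
        err = errno;
        goto out;
    }
    map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        err = errno;
        goto out;
    }
    memset(map, 0xab, (size_t)page);

    /* This is the call that fails with EFAULT on the DAX mapping. */
    if (madvise(map, (size_t)page, MADV_HWPOISON) != 0)
        err = errno;

    munmap(map, (size_t)page);
out:
    close(fd);
    return err;
}

/* Hypothetical helper: turn the result into a short explanation. */
const char *explain_hwpoison_errno(int err)
{
    switch (err) {
    case 0:
        return "page poisoned";
    case EPERM:
        return "caller lacks CAP_SYS_ADMIN";
    case EINVAL:
        return "advice rejected (for example, no CONFIG_MEMORY_FAILURE)";
    case EFAULT:
        return "EFAULT, as seen on the DAX mapping";
    default:
        return strerror(err);
    }
}
```

Calling poison_first_page() on a file in the DAX filesystem and passing the result to explain_hwpoison_errno() would show which of these cases applies.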
The idea is to make sure an application can recover from such errors
by hole punching and repopulating with another page.
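The recovery step could look something like the sketch below. It assumes the application already knows the file offset and length of the damaged range and has a good copy of the data; replace_range() is a hypothetical name, and whether punching a hole actually clears the underlying media error on a given platform is an assumption, not something verified here.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: drop the blocks backing [off, off + len) and
 * write replacement data there, so the filesystem has to allocate new
 * blocks. Returns 0 on success or an errno value on failure.
 */
int replace_range(int fd, off_t off, const void *data, size_t len)
{
    const char *p = data;

    /* Punch a hole; KEEP_SIZE is mandatory together with PUNCH_HOLE. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  off, (off_t)len) != 0)
        return errno;

    /* Repopulate the range, coping with short writes and EINTR. */
    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return errno;
        }
        p += n;
        off += n;
        len -= (size_t)n;
    }

    /* Make sure the new data is durable before reporting success. */
    return fsync(fd) != 0 ? errno : 0;
}
```

Both the offset and the length would normally be aligned to the filesystem block size, since a hole punch only frees whole blocks.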
A couple of questions:
It seems like madvise(MADV_HWPOISON) is not going to work (ever?) in
such situations. If so, should we perhaps add an IS_DAX-like check and
return something like EINVAL? Or, at least, document the expected behavior?
If madvise(MADV_HWPOISON) will not work, how can one inject errors to
test error handling code?
--
Mike Kravetz
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: DAX error inject/page poison
From: Dan Williams @ 2017-09-19 23:45 UTC
To: Mike Kravetz
Cc: linux-nvdimm, linux-mm, linux-kernel, Ross Zwisler, Vishal L Verma
On Tue, Sep 19, 2017 at 4:15 PM, Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> We were trying to simulate pmem errors in an environment where a DAX
> filesystem is used (ext4 although I suspect it does not matter). The
> sequence attempted on a DAX filesystem is:
> - Populate a file in the DAX filesystem
> - mmap the file
> - madvise(MADV_HWPOISON)
>
> The madvise operation fails with EFAULT. This appears to come from
> get_user_pages() as there are no struct pages for such mappings?
>
> The idea is to make sure an application can recover from such errors
> by hole punching and repopulating with another page.
>
> A couple of questions:
> It seems like madvise(MADV_HWPOISON) is not going to work (ever?) in
> such situations. If so, should we perhaps add an IS_DAX-like check and
> return something like EINVAL? Or, at least, document the expected behavior?
The MADV_HWPOISON machinery assumes normal memory pages, not DAX and
certainly not the special ZONE_DEVICE pages we allocate for the
purpose of DMA. Returning EINVAL seems like the right thing to do
since there is no facility in the kernel to soft offline a DAX page.
In other words, MADV_HWPOISON is for emulating errors in volatile
memory that might be transient until the next reboot, whereas DAX
errors cause permanent data loss in filesystem files, so the error
injection and handling models need to be different.
> If madvise(MADV_HWPOISON) will not work, how can one inject errors to
> test error handling code?
Similar to "hdparm --make-bad-sector", we need a platform-specific
facility to inject a hard memory error at a given physical persistent
memory address. In the case of an ACPI 6.2-based platform, that
mechanism is: "Section 9.20.7.9 Function Index 7 - ARS Error Inject".