* [PATCH 4.9-stable -- 5.19-stable] mm/hugetlb: fix hugetlb not supporting softdirty tracking
[not found] <1661424546448@kroah.com>
@ 2022-08-25 14:32 ` David Hildenbrand
2022-08-29 8:13 ` Greg KH
0 siblings, 1 reply; 2+ messages in thread
From: David Hildenbrand @ 2022-08-25 14:32 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, gregkh, David Hildenbrand, Mike Kravetz, Peter Feiner,
Kirill A . Shutemov, Cyrill Gorcunov, Pavel Emelyanov, Jamie Liu,
Hugh Dickins, Naoya Horiguchi, Bjorn Helgaas, Muchun Song,
Peter Xu, stable
commit f96f7a40874d7c746680c0b9f57cef2262ae551f upstream.
Patch series "mm/hugetlb: fix write-fault handling for shared mappings", v2.
I observed that hugetlb does not support/expect write-faults in shared
mappings that would have to map the R/O-mapped page writable -- and I
found two case where we could currently get such faults and would
erroneously map an anon page into a shared mapping.
Reproducers part of the patches.
I propose to backport both fixes to stable trees. The first fix needs a
small adjustment.
This patch (of 2):
Staring at hugetlb_wp(), one might wonder where all the logic for shared
mappings is when stumbling over a write-protected page in a shared
mapping. In fact, there is none, and so far we thought we could get away
with that because e.g., mprotect() should always do the right thing and
map all pages directly writable.
Looks like we were wrong:
--------------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <sys/mman.h>
#define HUGETLB_SIZE (2 * 1024 * 1024u)
static void clear_softdirty(void)
{
int fd = open("/proc/self/clear_refs", O_WRONLY);
const char *ctrl = "4";
int ret;
if (fd < 0) {
fprintf(stderr, "open(clear_refs) failed\n");
exit(1);
}
ret = write(fd, ctrl, strlen(ctrl));
if (ret != strlen(ctrl)) {
fprintf(stderr, "write(clear_refs) failed\n");
exit(1);
}
close(fd);
}
int main(int argc, char **argv)
{
char *map;
int fd;
fd = open("/dev/hugepages/tmp", O_RDWR | O_CREAT);
if (!fd) {
fprintf(stderr, "open() failed\n");
return -errno;
}
if (ftruncate(fd, HUGETLB_SIZE)) {
fprintf(stderr, "ftruncate() failed\n");
return -errno;
}
map = mmap(NULL, HUGETLB_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
if (map == MAP_FAILED) {
fprintf(stderr, "mmap() failed\n");
return -errno;
}
*map = 0;
if (mprotect(map, HUGETLB_SIZE, PROT_READ)) {
fprintf(stderr, "mmprotect() failed\n");
return -errno;
}
clear_softdirty();
if (mprotect(map, HUGETLB_SIZE, PROT_READ|PROT_WRITE)) {
fprintf(stderr, "mmprotect() failed\n");
return -errno;
}
*map = 0;
return 0;
}
--------------------------------------------------------------------------
Above test fails with SIGBUS when there is only a single free hugetlb page.
# echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
# ./test
Bus error (core dumped)
And worse, with sufficient free hugetlb pages it will map an anonymous page
into a shared mapping, for example, messing up accounting during unmap
and breaking MAP_SHARED semantics:
# echo 2 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
# ./test
# cat /proc/meminfo | grep HugePages_
HugePages_Total: 2
HugePages_Free: 1
HugePages_Rsvd: 18446744073709551615
HugePages_Surp: 0
Reason in this particular case is that vma_wants_writenotify() will
return "true", removing VM_SHARED in vma_set_page_prot() to map pages
write-protected. Let's teach vma_wants_writenotify() that hugetlb does not
support softdirty tracking.
Link: https://lkml.kernel.org/r/20220811103435.188481-1-david@redhat.com
Link: https://lkml.kernel.org/r/20220811103435.188481-2-david@redhat.com
Fixes: 64e455079e1b ("mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Jamie Liu <jamieliu@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org> [3.18+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
mm/mmap.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index 7c59ec73acc3..3b284b091bb7 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1693,8 +1693,12 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot)
pgprot_val(vm_pgprot_modify(vm_page_prot, vm_flags)))
return 0;
- /* Do we need to track softdirty? */
- if (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && !(vm_flags & VM_SOFTDIRTY))
+ /*
+ * Do we need to track softdirty? hugetlb does not support softdirty
+ * tracking yet.
+ */
+ if (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && !(vm_flags & VM_SOFTDIRTY) &&
+ !is_vm_hugetlb_page(vma))
return 1;
/* Specialty mapping? */
--
2.37.1
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: [PATCH 4.9-stable -- 5.19-stable] mm/hugetlb: fix hugetlb not supporting softdirty tracking
2022-08-25 14:32 ` [PATCH 4.9-stable -- 5.19-stable] mm/hugetlb: fix hugetlb not supporting softdirty tracking David Hildenbrand
@ 2022-08-29 8:13 ` Greg KH
0 siblings, 0 replies; 2+ messages in thread
From: Greg KH @ 2022-08-29 8:13 UTC (permalink / raw)
To: David Hildenbrand
Cc: linux-kernel, linux-mm, Mike Kravetz, Peter Feiner,
Kirill A . Shutemov, Cyrill Gorcunov, Pavel Emelyanov, Jamie Liu,
Hugh Dickins, Naoya Horiguchi, Bjorn Helgaas, Muchun Song,
Peter Xu, stable
On Thu, Aug 25, 2022 at 04:32:58PM +0200, David Hildenbrand wrote:
> commit f96f7a40874d7c746680c0b9f57cef2262ae551f upstream.
>
> Patch series "mm/hugetlb: fix write-fault handling for shared mappings", v2.
>
> I observed that hugetlb does not support/expect write-faults in shared
> mappings that would have to map the R/O-mapped page writable -- and I
> found two case where we could currently get such faults and would
> erroneously map an anon page into a shared mapping.
>
> Reproducers part of the patches.
>
> I propose to backport both fixes to stable trees. The first fix needs a
> small adjustment.
>
> This patch (of 2):
>
> Staring at hugetlb_wp(), one might wonder where all the logic for shared
> mappings is when stumbling over a write-protected page in a shared
> mapping. In fact, there is none, and so far we thought we could get away
> with that because e.g., mprotect() should always do the right thing and
> map all pages directly writable.
>
> Looks like we were wrong:
>
> --------------------------------------------------------------------------
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <errno.h>
> #include <sys/mman.h>
>
> #define HUGETLB_SIZE (2 * 1024 * 1024u)
>
> static void clear_softdirty(void)
> {
> int fd = open("/proc/self/clear_refs", O_WRONLY);
> const char *ctrl = "4";
> int ret;
>
> if (fd < 0) {
> fprintf(stderr, "open(clear_refs) failed\n");
> exit(1);
> }
> ret = write(fd, ctrl, strlen(ctrl));
> if (ret != strlen(ctrl)) {
> fprintf(stderr, "write(clear_refs) failed\n");
> exit(1);
> }
> close(fd);
> }
>
> int main(int argc, char **argv)
> {
> char *map;
> int fd;
>
> fd = open("/dev/hugepages/tmp", O_RDWR | O_CREAT);
> if (!fd) {
> fprintf(stderr, "open() failed\n");
> return -errno;
> }
> if (ftruncate(fd, HUGETLB_SIZE)) {
> fprintf(stderr, "ftruncate() failed\n");
> return -errno;
> }
>
> map = mmap(NULL, HUGETLB_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
> if (map == MAP_FAILED) {
> fprintf(stderr, "mmap() failed\n");
> return -errno;
> }
>
> *map = 0;
>
> if (mprotect(map, HUGETLB_SIZE, PROT_READ)) {
> fprintf(stderr, "mmprotect() failed\n");
> return -errno;
> }
>
> clear_softdirty();
>
> if (mprotect(map, HUGETLB_SIZE, PROT_READ|PROT_WRITE)) {
> fprintf(stderr, "mmprotect() failed\n");
> return -errno;
> }
>
> *map = 0;
>
> return 0;
> }
> --------------------------------------------------------------------------
>
> Above test fails with SIGBUS when there is only a single free hugetlb page.
> # echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> # ./test
> Bus error (core dumped)
>
> And worse, with sufficient free hugetlb pages it will map an anonymous page
> into a shared mapping, for example, messing up accounting during unmap
> and breaking MAP_SHARED semantics:
> # echo 2 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> # ./test
> # cat /proc/meminfo | grep HugePages_
> HugePages_Total: 2
> HugePages_Free: 1
> HugePages_Rsvd: 18446744073709551615
> HugePages_Surp: 0
>
> Reason in this particular case is that vma_wants_writenotify() will
> return "true", removing VM_SHARED in vma_set_page_prot() to map pages
> write-protected. Let's teach vma_wants_writenotify() that hugetlb does not
> support softdirty tracking.
>
> Link: https://lkml.kernel.org/r/20220811103435.188481-1-david@redhat.com
> Link: https://lkml.kernel.org/r/20220811103435.188481-2-david@redhat.com
> Fixes: 64e455079e1b ("mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared")
> Signed-off-by: David Hildenbrand <david@redhat.com>
> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Peter Feiner <pfeiner@google.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Cyrill Gorcunov <gorcunov@openvz.org>
> Cc: Pavel Emelyanov <xemul@parallels.com>
> Cc: Jamie Liu <jamieliu@google.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Muchun Song <songmuchun@bytedance.com>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: <stable@vger.kernel.org> [3.18+]
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
> mm/mmap.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
Now queued up, thanks.
greg k-h
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2022-08-29 8:14 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <1661424546448@kroah.com>
2022-08-25 14:32 ` [PATCH 4.9-stable -- 5.19-stable] mm/hugetlb: fix hugetlb not supporting softdirty tracking David Hildenbrand
2022-08-29 8:13 ` Greg KH
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox