From: Linus Torvalds <torvalds@linux-foundation.org>
To: David Hildenbrand <david@redhat.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
linux-s390 <linux-s390@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Buffered I/O broken on s390x with page faults disabled (gfs2)
Date: Tue, 8 Mar 2022 11:27:38 -0800 [thread overview]
Message-ID: <CAHk-=wjs2Jf3LzqCPmfkXd=ULPyCrrGEF7rR6TYzz1RPF+qh3Q@mail.gmail.com> (raw)
In-Reply-To: <CAHk-=wjHsQywXgNe9D+MQCiMhpyB2Gs5M78CGCpTr9BSeP71bw@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 1079 bytes --]
On Tue, Mar 8, 2022 at 9:40 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Hmm. The futex code actually uses "fixup_user_fault()" for this case.
> Maybe fault_in_safe_writeable() should do that?
.. paging all the bits back in, I'm reminded that one of the worries
was "what about large areas".
But I really think that the solution should be that we limit the size
of fault_in_safe_writeable() to just a couple of pages.
Even if you were to fault in gigabytes, page-out can undo it anyway,
so there is no semantic reason why that function should ever do more
than a few pages to make sure. There's already even a comment about
how there's no guarantee that the pages will stay.
Side note: the current GUP-based fault_in_safe_writeable() is buggy in
another way anyway: it doesn't work right for stack extending
accesses.
So I think the fix for this all might be something like the attached
(TOTALLY UNTESTED)!
Comments? Andreas, mind (carefully - maybe it is totally broken and
does unspeakable acts to your pets) testing this?
Linus
[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 1972 bytes --]
mm/gup.c | 40 ++++++++++++----------------------------
1 file changed, 12 insertions(+), 28 deletions(-)
diff --git a/mm/gup.c b/mm/gup.c
index a9d4d724aef7..9e085e7b9c28 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1745,44 +1745,28 @@ EXPORT_SYMBOL(fault_in_writeable);
size_t fault_in_safe_writeable(const char __user *uaddr, size_t size)
{
unsigned long start = (unsigned long)untagged_addr(uaddr);
- unsigned long end, nstart, nend;
+ unsigned long end, nstart;
struct mm_struct *mm = current->mm;
- struct vm_area_struct *vma = NULL;
- int locked = 0;
+ const unsigned int fault_flags = FAULT_FLAG_WRITE | FAULT_FLAG_KILLABLE;
+ const size_t max_size = 4 * PAGE_SIZE;
nstart = start & PAGE_MASK;
- end = PAGE_ALIGN(start + size);
+ end = PAGE_ALIGN(start + min(size, max_size));
if (end < nstart)
end = 0;
- for (; nstart != end; nstart = nend) {
- unsigned long nr_pages;
- long ret;
- if (!locked) {
- locked = 1;
- mmap_read_lock(mm);
- vma = find_vma(mm, nstart);
- } else if (nstart >= vma->vm_end)
- vma = vma->vm_next;
- if (!vma || vma->vm_start >= end)
- break;
- nend = end ? min(end, vma->vm_end) : vma->vm_end;
- if (vma->vm_flags & (VM_IO | VM_PFNMAP))
- continue;
- if (nstart < vma->vm_start)
- nstart = vma->vm_start;
- nr_pages = (nend - nstart) / PAGE_SIZE;
- ret = __get_user_pages_locked(mm, nstart, nr_pages,
- NULL, NULL, &locked,
- FOLL_TOUCH | FOLL_WRITE);
- if (ret <= 0)
+ mmap_read_lock(mm);
+ for (; nstart != end; nstart += PAGE_SIZE) {
+ if (fixup_user_fault(mm, nstart, fault_flags, NULL))
break;
- nend = nstart + ret * PAGE_SIZE;
}
- if (locked)
- mmap_read_unlock(mm);
+ mmap_read_unlock(mm);
+
+ /* If we got all of our (truncated) fault-in, we claim we got it all */
if (nstart == end)
return 0;
+
+ /* .. otherwise we'll use the original untruncated size */
return size - min_t(size_t, nstart - start, size);
}
EXPORT_SYMBOL(fault_in_safe_writeable);
next prev parent reply other threads:[~2022-03-08 19:28 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-07 22:52 Andreas Gruenbacher
2022-03-07 23:18 ` Linus Torvalds
2022-03-08 8:21 ` David Hildenbrand
2022-03-08 8:37 ` David Hildenbrand
2022-03-08 12:11 ` David Hildenbrand
2022-03-08 12:24 ` David Hildenbrand
2022-03-08 13:20 ` Gerald Schaefer
2022-03-08 13:32 ` David Hildenbrand
2022-03-08 14:14 ` Gerald Schaefer
2022-03-08 17:23 ` David Hildenbrand
2022-03-08 17:26 ` Linus Torvalds
2022-03-08 17:40 ` Linus Torvalds
2022-03-08 19:27 ` Linus Torvalds [this message]
2022-03-08 20:03 ` Linus Torvalds
2022-03-08 23:24 ` Andreas Gruenbacher
2022-03-09 0:22 ` Linus Torvalds
2022-03-09 18:42 ` Andreas Gruenbacher
2022-03-09 19:08 ` Linus Torvalds
2022-03-09 20:57 ` Andreas Gruenbacher
2022-03-09 21:08 ` Andreas Gruenbacher
2022-03-10 12:13 ` Filipe Manana
2022-03-09 19:21 ` Linus Torvalds
2022-03-09 19:35 ` Andreas Gruenbacher
2022-03-09 20:18 ` Linus Torvalds
2022-03-09 20:36 ` Andreas Gruenbacher
2022-03-09 20:48 ` Linus Torvalds
2022-03-09 20:54 ` Linus Torvalds
2022-03-10 17:13 ` David Hildenbrand
2022-03-10 18:00 ` Andreas Gruenbacher
2022-03-10 18:35 ` Linus Torvalds
2022-03-10 18:38 ` David Hildenbrand
2022-03-10 18:47 ` Andreas Gruenbacher
2022-03-10 19:22 ` Linus Torvalds
2022-03-10 19:56 ` Linus Torvalds
2022-03-10 20:23 ` Andreas Gruenbacher
2022-03-08 17:47 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAHk-=wjs2Jf3LzqCPmfkXd=ULPyCrrGEF7rR6TYzz1RPF+qh3Q@mail.gmail.com' \
--to=torvalds@linux-foundation.org \
--cc=agruenba@redhat.com \
--cc=david@redhat.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-s390@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox