* [PATCH] mm: Make sendfile(2) killable
@ 2015-10-12 12:45 Jan Kara
2015-10-15 20:46 ` Andrew Morton
0 siblings, 1 reply; 4+ messages in thread
From: Jan Kara @ 2015-10-12 12:45 UTC (permalink / raw)
To: linux-mm; +Cc: Andrew Morton, linux-fsdevel, Al Viro, Dmitry Vyukov, Jan Kara
Currently the simple program below issues a sendfile(2) system call which
takes about 62 days to complete in my test KVM instance.
	int fd;
	off_t off = 0;

	fd = open("file", O_RDWR | O_TRUNC | O_SYNC | O_CREAT, 0644);
	ftruncate(fd, 2);
	lseek(fd, 0, SEEK_END);
	sendfile(fd, fd, &off, 0xfffffff);
Now you should not ask the kernel to do stupid stuff like copying 256MB in
2-byte chunks and calling fsync(2) after each chunk, but if you do, the
sysadmin should have a way to stop you.
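As a rough sanity check on those numbers (an illustrative sketch, not part of the patch): 0xfffffff bytes copied in 2-byte chunks is about 134 million iterations, so 62 days works out to roughly 40 ms per chunk. The helper name seconds_per_chunk() is made up for this example.

```c
#include <stdint.h>

/*
 * Hypothetical helper: average time spent on each chunk when a copy of
 * total_bytes, done chunk_bytes at a time, takes total_days to finish.
 */
static double seconds_per_chunk(uint64_t total_bytes, uint64_t chunk_bytes,
				double total_days)
{
	/* Round up so that a trailing partial chunk is counted too. */
	uint64_t chunks = (total_bytes + chunk_bytes - 1) / chunk_bytes;

	return total_days * 86400.0 / (double)chunks;
}
```

Calling seconds_per_chunk(0xfffffff, 2, 62.0) gives a value close to 0.04, i.e. about 40 ms for each 2-byte write plus its sync.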
We actually do have a check for fatal_signal_pending() in
generic_perform_write() which triggers in this path. However, because we
always succeed in writing something before the check is done, we return a
value > 0 from generic_perform_write() and thus the information about the
signal gets lost.
Fix the problem by doing the signal check before writing anything. That
way generic_perform_write() returns -EINTR, the error gets propagated up
and the sendfile loop terminates early.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jan Kara <jack@suse.com>
---
mm/filemap.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index 1cc5467cf36c..327910c2400c 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2488,6 +2488,11 @@ again:
 			break;
 		}
 
+		if (fatal_signal_pending(current)) {
+			status = -EINTR;
+			break;
+		}
+
 		status = a_ops->write_begin(file, mapping, pos, bytes, flags,
 						&page, &fsdata);
 		if (unlikely(status < 0))
@@ -2525,10 +2530,6 @@ again:
 		written += copied;
 
 		balance_dirty_pages_ratelimited(mapping);
-		if (fatal_signal_pending(current)) {
-			status = -EINTR;
-			break;
-		}
 	} while (iov_iter_count(i));
 
 	return written ? written : status;
--
1.7.12.4
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [PATCH] mm: Make sendfile(2) killable
2015-10-12 12:45 [PATCH] mm: Make sendfile(2) killable Jan Kara
@ 2015-10-15 20:46 ` Andrew Morton
2015-10-16 6:40 ` Jan Kara
0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2015-10-15 20:46 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-mm, linux-fsdevel, Al Viro, Dmitry Vyukov
On Mon, 12 Oct 2015 14:45:23 +0200 Jan Kara <jack@suse.com> wrote:
> Currently a simple program below issues a sendfile(2) system call which
> takes about 62 days to complete in my test KVM instance.
Geeze some people are impatient.
> int fd;
> off_t off = 0;
>
> fd = open("file", O_RDWR | O_TRUNC | O_SYNC | O_CREAT, 0644);
> ftruncate(fd, 2);
> lseek(fd, 0, SEEK_END);
> sendfile(fd, fd, &off, 0xfffffff);
>
> Now you should not ask kernel to do a stupid stuff like copying 256MB in
> 2-byte chunks and call fsync(2) after each chunk but if you do, sysadmin
> should have a way to stop you.
>
> We actually do have a check for fatal_signal_pending() in
> generic_perform_write() which triggers in this path however because we
> always succeed in writing something before the check is done, we return
> value > 0 from generic_perform_write() and thus the information about
> signal gets lost.
ah.
> Fix the problem by doing the signal check before writing anything. That
> way generic_perform_write() returns -EINTR, the error gets propagated up
> and the sendfile loop terminates early.
>
> ...
>
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2488,6 +2488,11 @@ again:
> break;
> }
>
> + if (fatal_signal_pending(current)) {
> + status = -EINTR;
> + break;
> + }
> +
> status = a_ops->write_begin(file, mapping, pos, bytes, flags,
> &page, &fsdata);
> if (unlikely(status < 0))
> @@ -2525,10 +2530,6 @@ again:
> written += copied;
>
> balance_dirty_pages_ratelimited(mapping);
> - if (fatal_signal_pending(current)) {
> - status = -EINTR;
> - break;
> - }
> } while (iov_iter_count(i));
>
> return written ? written : status;
This won't work, will it? If user hits ^C after we've written a few
pages, `written' is non-zero and the same thing happens?
* Re: [PATCH] mm: Make sendfile(2) killable
2015-10-15 20:46 ` Andrew Morton
@ 2015-10-16 6:40 ` Jan Kara
2015-10-16 21:05 ` Andrew Morton
0 siblings, 1 reply; 4+ messages in thread
From: Jan Kara @ 2015-10-16 6:40 UTC (permalink / raw)
To: Andrew Morton; +Cc: Jan Kara, linux-mm, linux-fsdevel, Al Viro, Dmitry Vyukov
On Thu 15-10-15 13:46:44, Andrew Morton wrote:
> On Mon, 12 Oct 2015 14:45:23 +0200 Jan Kara <jack@suse.com> wrote:
>
> > Currently a simple program below issues a sendfile(2) system call which
> > takes about 62 days to complete in my test KVM instance.
>
> Geeze some people are impatient.
>
> > int fd;
> > off_t off = 0;
> >
> > fd = open("file", O_RDWR | O_TRUNC | O_SYNC | O_CREAT, 0644);
> > ftruncate(fd, 2);
> > lseek(fd, 0, SEEK_END);
> > sendfile(fd, fd, &off, 0xfffffff);
> >
> > Now you should not ask kernel to do a stupid stuff like copying 256MB in
> > 2-byte chunks and call fsync(2) after each chunk but if you do, sysadmin
> > should have a way to stop you.
> >
> > We actually do have a check for fatal_signal_pending() in
> > generic_perform_write() which triggers in this path however because we
> > always succeed in writing something before the check is done, we return
> > value > 0 from generic_perform_write() and thus the information about
> > signal gets lost.
>
> ah.
>
> > Fix the problem by doing the signal check before writing anything. That
> > way generic_perform_write() returns -EINTR, the error gets propagated up
> > and the sendfile loop terminates early.
> >
> > ...
> >
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -2488,6 +2488,11 @@ again:
> > break;
> > }
> >
> > + if (fatal_signal_pending(current)) {
> > + status = -EINTR;
> > + break;
> > + }
> > +
> > status = a_ops->write_begin(file, mapping, pos, bytes, flags,
> > &page, &fsdata);
> > if (unlikely(status < 0))
> > @@ -2525,10 +2530,6 @@ again:
> > written += copied;
> >
> > balance_dirty_pages_ratelimited(mapping);
> > - if (fatal_signal_pending(current)) {
> > - status = -EINTR;
> > - break;
> > - }
> > } while (iov_iter_count(i));
> >
> > return written ? written : status;
>
> This won't work, will it? If user hits ^C after we've written a few
> pages, `written' is non-zero and the same thing happens?
It does work - I've tested it :). Sure, the generic_perform_write() call
that is running when the signal is delivered will return with a value > 0.
But the interesting thing is what happens after that: either we return to
userspace (and then we are fine) or generic_perform_write() gets called
again because there's more to write and *that* call will return -EINTR
which ends up terminating the whole sendfile syscall.

Actually there is one general lesson to be learned here: when you check for
a fatal signal and bail out, it's better to do it before doing any work.
That way things keep working even if the function is called in a loop.
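
The effect of the ordering can be modelled in userspace. What follows is a minimal sketch under stated assumptions, not the kernel code: fake_signal_pending stands in for fatal_signal_pending(current), the two write_check_*() functions imitate generic_perform_write() copying one chunk per iteration, and caller_loop() imitates a caller that keeps calling until it sees an error, as the sendfile loop does. All of these names are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for fatal_signal_pending(current). */
static bool fake_signal_pending;

/* Old ordering: copy a chunk first, look at the signal afterwards. */
static long write_check_after(size_t chunks)
{
	long written = 0, status = 0;

	while (chunks > 0) {
		written++;		/* one chunk "written" */
		chunks--;
		if (fake_signal_pending) {
			status = -EINTR;
			break;
		}
	}
	/* written is already non-zero here, so -EINTR is never returned. */
	return written ? written : status;
}

/* New ordering: look at the signal before copying anything. */
static long write_check_before(size_t chunks)
{
	long written = 0, status = 0;

	while (chunks > 0) {
		if (fake_signal_pending) {
			status = -EINTR;
			break;
		}
		written++;
		chunks--;
	}
	return written ? written : status;
}

typedef long (*write_fn)(size_t chunks);

/*
 * Issue up to total_calls one-chunk writes and stop at the first error.
 * The fake signal is raised just before call number signal_at (0-based).
 * Returns how many write calls were made.
 */
static size_t caller_loop(write_fn write_chunk, size_t total_calls,
			  size_t signal_at)
{
	size_t calls = 0;

	fake_signal_pending = false;
	while (calls < total_calls) {
		if (calls == signal_at)
			fake_signal_pending = true;
		long ret = write_chunk(1);
		calls++;
		if (ret <= 0)
			break;
	}
	return calls;
}
```

With one chunk per call, caller_loop(write_check_after, 1000, 3) makes all 1000 calls because every call reports progress, while caller_loop(write_check_before, 1000, 3) stops on the first call issued after the signal is raised.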
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: [PATCH] mm: Make sendfile(2) killable
2015-10-16 6:40 ` Jan Kara
@ 2015-10-16 21:05 ` Andrew Morton
0 siblings, 0 replies; 4+ messages in thread
From: Andrew Morton @ 2015-10-16 21:05 UTC (permalink / raw)
To: Jan Kara; +Cc: Jan Kara, linux-mm, linux-fsdevel, Al Viro, Dmitry Vyukov
On Fri, 16 Oct 2015 08:40:27 +0200 Jan Kara <jack@suse.cz> wrote:
> > > balance_dirty_pages_ratelimited(mapping);
> > > - if (fatal_signal_pending(current)) {
> > > - status = -EINTR;
> > > - break;
> > > - }
> > > } while (iov_iter_count(i));
> > >
> > > return written ? written : status;
> >
> > This won't work, will it? If user hits ^C after we've written a few
> > pages, `written' is non-zero and the same thing happens?
>
> It does work - I've tested it :). Sure, the generic_perform_write() call
> that is running when the signal is delivered will return with value > 0.
> But the interesting thing is what happens after that: Either we return to
> userspace (and then we are fine) or generic_perform_write() gets called
> again because there's more to write and *that* call will return -EINTR
> which ends up terminating the whole sendfile syscall.
OK. I guess that's better behaviour than overwriting a non-zero
`written' when signalled.
I'm going to tag this one for -stable. It's a bit of a DoS.