From: Linus Torvalds <torvalds@linux-foundation.org>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: a.p.zijlstra@chello.nl, salikhmetov@gmail.com,
linux-mm@kvack.org, jakob@unthought.net,
linux-kernel@vger.kernel.org, valdis.kletnieks@vt.edu,
riel@redhat.com, ksm@42.dk, staubach@redhat.com,
jesper.juhl@gmail.com, akpm@linux-foundation.org,
protasnb@gmail.com, r.e.wolff@bitwizard.nl,
hidave.darkstar@gmail.com, hch@infradead.org
Subject: Re: [PATCH -v8 3/4] Enable the MS_ASYNC functionality in sys_msync()
Date: Wed, 23 Jan 2008 13:00:57 -0800 (PST) [thread overview]
Message-ID: <alpine.LFD.1.00.0801231248060.2803@woody.linux-foundation.org> (raw)
In-Reply-To: <E1JHlh8-0003s8-Bb@pomaz-ex.szeredi.hu>
On Wed, 23 Jan 2008, Miklos Szeredi wrote:
>
> > [ Side note: it is quite possible that we should not do the
> > SYNC_FILE_RANGE_WAIT_BEFORE on MS_ASYNC, and just skip over pages that
> > are busily under writeback already.
>
> MS_ASYNC is not supposed to wait, so SYNC_FILE_RANGE_WAIT_BEFORE
> probably should not be used in that case.
Well, that would leave the page dirty (including in the page tables) if it
was under page writeback when the MS_ASYNC happened.
So I agree, we shouldn't necessarily wait, but if we want the page tables
to be cleaned, right now we need to.
> What would be perfect, is if we had a sync mode, that on encountering
> a page currently under writeback, would just do a page_mkclean() on
> it, so we still receive a page fault next time one of the mappings is
> dirtied, so the times can be updated.
>
> Would there be any difficulties with that?
It would require fairly invasive changes. Right now the actual page
writeback does effectively:
...
if (wbc->sync_mode != WB_SYNC_NONE)
wait_on_page_writeback(page);
if (PageWriteback(page) ||
!clear_page_dirty_for_io(page)) {
unlock_page(page);
continue;
}
ret = (*writepage)(page, wbc, data);
...
and that "clear_page_dirty_for_io()" really does clear *all* the dirty
bits, so we absolutely must start writepage() when we have done that. And
that, in turn, requires that we're not already under writeback.
Is it possible to fix? Sure. We'd have to split up
clear_page_dirty_for_io() to do it, and do the
if (mapping && mapping_cap_account_dirty(mapping))
..
part first (before the PageWriteback() tests), and then doing the
if (TestClearPageDirty(page))
...
parts later (after checking that that we're not under page-writeback).
So it's not horribly hard, but it's kind of a separate issue right now.
And while the *generic* page-writeback is easy enough to fix, I worry
about low-level filesystems that have their own "writepages()"
implementation. They could easily get that wrong.
So right now it seems that waiting for writeback to finish is the right
and safe thing to do (and even so, I'm not actually willing to commit my
suggested patch in 2.6.24, I think this needs more thinking about)
Linus
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2008-01-23 21:00 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-01-22 23:21 [PATCH -v8 0/4] Fixing the issue with memory-mapped file times Anton Salikhmetov
2008-01-22 23:21 ` [PATCH -v8 1/4] Massive code cleanup of sys_msync() Anton Salikhmetov
2008-01-22 23:21 ` [PATCH -v8 2/4] Update ctime and mtime for memory-mapped files Anton Salikhmetov
2008-01-23 18:03 ` Linus Torvalds
2008-01-23 23:14 ` Anton Salikhmetov
2008-01-22 23:21 ` [PATCH -v8 3/4] Enable the MS_ASYNC functionality in sys_msync() Anton Salikhmetov
2008-01-23 8:47 ` Peter Zijlstra
2008-01-23 8:51 ` Peter Zijlstra
2008-01-23 9:34 ` Miklos Szeredi
2008-01-23 9:51 ` Miklos Szeredi
2008-01-23 13:09 ` Anton Salikhmetov
2008-01-23 12:53 ` Anton Salikhmetov
2008-01-23 9:41 ` Miklos Szeredi
2008-01-23 17:05 ` Linus Torvalds
2008-01-23 17:26 ` Anton Salikhmetov
2008-01-23 17:41 ` Peter Zijlstra
2008-01-23 19:35 ` Linus Torvalds
2008-01-23 19:55 ` Miklos Szeredi
2008-01-23 21:00 ` Linus Torvalds [this message]
2008-01-23 21:16 ` Miklos Szeredi
2008-01-23 21:36 ` Linus Torvalds
2008-01-23 22:29 ` Hugh Dickins
2008-01-23 22:41 ` Linus Torvalds
2008-01-24 0:03 ` Hugh Dickins
2008-01-24 0:05 ` Miklos Szeredi
2008-01-24 0:11 ` Linus Torvalds
2008-01-24 1:36 ` Nick Piggin
2008-01-24 18:56 ` Matt Mackall
2008-01-22 23:21 ` [PATCH -v8 4/4] The design document for memory-mapped file times update Anton Salikhmetov
2008-01-23 9:26 ` Miklos Szeredi
2008-01-23 10:37 ` Anton Salikhmetov
2008-01-23 10:53 ` Miklos Szeredi
2008-01-23 11:16 ` Miklos Szeredi
2008-01-23 12:25 ` Anton Salikhmetov
2008-01-23 13:55 ` Miklos Szeredi
2008-01-25 16:27 ` Randy Dunlap
2008-01-25 16:40 ` Anton Salikhmetov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LFD.1.00.0801231248060.2803@woody.linux-foundation.org \
--to=torvalds@linux-foundation.org \
--cc=a.p.zijlstra@chello.nl \
--cc=akpm@linux-foundation.org \
--cc=hch@infradead.org \
--cc=hidave.darkstar@gmail.com \
--cc=jakob@unthought.net \
--cc=jesper.juhl@gmail.com \
--cc=ksm@42.dk \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=miklos@szeredi.hu \
--cc=protasnb@gmail.com \
--cc=r.e.wolff@bitwizard.nl \
--cc=riel@redhat.com \
--cc=salikhmetov@gmail.com \
--cc=staubach@redhat.com \
--cc=valdis.kletnieks@vt.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox