From: Hugh Dickins <hugh@veritas.com>
To: Erez Zadok <ezk@cs.sunysb.edu>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>,
Ryan Finnie <ryan@finnie.org>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
cjwatson@ubuntu.com, linux-mm@kvack.org
Subject: Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland
Date: Thu, 25 Oct 2007 17:40:35 +0100 (BST)
Message-ID: <Pine.LNX.4.64.0710251649430.6433@blonde.wat.veritas.com>
In-Reply-To: <200710222104.l9ML4L1D002031@agora.fsl.cs.sunysb.edu>
On Mon, 22 Oct 2007, Erez Zadok wrote:
> In message <Pine.LNX.4.64.0710222101420.23513@blonde.wat.veritas.com>, Hugh Dickins writes:
> >
> > Only ramdisk and shmem have been returning AOP_WRITEPAGE_ACTIVATE.
> > Both of those set BDI_CAP_NO_WRITEBACK. ramdisk never returned it
> > if !wbc->for_reclaim. I contend that shmem shouldn't either: it's
> > a special code to get the LRU rotation right, not useful elsewhere.
> > Though Documentation/filesystems/vfs.txt does imply wider use.
>
> Yes, based on vfs.txt I figured unionfs should return
> AOP_WRITEPAGE_ACTIVATE.
unionfs_writepage returns it in two different cases: when it can't
find the underlying page; and when the underlying writepage returns
it. I'd say it's wrong to return it in either case.
In the first case, you don't really want your page put back to the head
of the active list, you want to come back to try it again quite soon
(I think): so you should just redirty and unlock and pretend success.
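As a rough illustration of "redirty and unlock and pretend success", here is
a user-space sketch. The struct page and the helpers are simplified stand-ins,
not the kernel definitions, and stacked_writepage_missing_lower is a
hypothetical name. In the kernel the redirty step would more likely go through
redirty_page_for_writepage(); that detail is an assumption, not something
stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-ins for kernel objects; not the real definitions. */
struct page {
	bool locked;
	bool dirty;
};

static void set_page_dirty(struct page *page)
{
	page->dirty = true;
}

static void unlock_page(struct page *page)
{
	assert(page->locked);	/* unlocking an unlocked page is a bug */
	page->locked = false;
}

/*
 * Hypothetical stacked ->writepage. "lower_page" is NULL when the
 * underlying page cannot be found. The upper page arrives locked,
 * with its dirty flag already cleared by the caller.
 */
int stacked_writepage_missing_lower(struct page *page, struct page *lower_page)
{
	if (lower_page == NULL) {
		set_page_dirty(page);	/* keep the data, retry quite soon */
		unlock_page(page);	/* success path leaves the page unlocked */
		return 0;		/* pretend success: no AOP_WRITEPAGE_ACTIVATE */
	}

	/* Normal path, heavily simplified: hand the data to the lower page. */
	set_page_dirty(lower_page);
	unlock_page(page);
	return 0;
}
```

The point of the sketch is only the NULL branch: the page stays dirty, ends
up unlocked, and the caller sees 0.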
ramdisk uses AOP_WRITEPAGE_ACTIVATE (A_W_A for short) because none of
its pages will ever become freeable (and its comment points out it'd be
better if they weren't even on the LRUs - I think several people have
recently been putting forward patches to keep such timewasters off the
LRUs).
shmem uses A_W_A when there's no swap (left), or when the underlying
shm is marked as locked in memory: in each case, best to move on to
look for other pages to swap out. (But I'm not quite convincing myself
that the temporarily out-of-swap case is different from yours above.)
It's about fixing some horrid busy loops where vmscan kept going
over the same hopeless pages repeatedly, instead of moving on to
better candidates. Oh, there's a third case, when move_to_swap_cache
fails: that's rare, and I think I was just too lazy to separate them.
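To make the shmem cases concrete, here is a user-space sketch of the decision,
not the real shmem_writepage. The struct swap_state, its fields, and the name
shmem_like_writepage are hypothetical; the types are simplified stand-ins. It
assumes the convention that a page returned with AOP_WRITEPAGE_ACTIVATE is
still locked, and it folds in the contention that the code should only be
returned when wbc->for_reclaim is set.

```c
#include <assert.h>
#include <stdbool.h>

#define AOP_WRITEPAGE_ACTIVATE 0x80000	/* value as in include/linux/fs.h */

/* User-space stand-ins for kernel objects; not the real definitions. */
struct page {
	bool locked;
	bool dirty;
};

struct writeback_control {
	bool for_reclaim;
};

/* Hypothetical summary of the conditions named in the text. */
struct swap_state {
	bool swap_available;	/* false: no swap (left) */
	bool locked_in_memory;	/* true: shm marked as locked in memory */
};

static void unlock_page(struct page *page)
{
	assert(page->locked);
	page->locked = false;
}

int shmem_like_writepage(struct page *page,
			 const struct writeback_control *wbc,
			 const struct swap_state *st)
{
	if (st->swap_available && !st->locked_in_memory) {
		/* The real code would move the page to swap here. */
		unlock_page(page);
		return 0;
	}

	/* Hopeless for now: keep the data and let the scanner move on. */
	page->dirty = true;
	if (wbc->for_reclaim)
		return AOP_WRITEPAGE_ACTIVATE;	/* page is left locked */

	/* Not reclaim (sync, msync, fsync): never leak the special code. */
	unlock_page(page);
	return 0;
}
```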
In your second case, I fail to see why the unionfs level should
mimic the lower level: you've successfully copied data and marked
the lower level pages as dirty, vmscan will come back to those in
due course, but it's just a waste of time for it to come back to
the unionfs pages again - isn't it?
> But, now that unionfs has ->writepages which won't
> even call the lower writepage if BDI_CAP_NO_WRITEBACK is on, then perhaps I
> no longer need unionfs_writepage to bother checking for
> AOP_WRITEPAGE_ACTIVATE, or even return it up?
unionfs_writepages handles the issue of A_W_A leaking to userspace
through sync/msync/fsync, as does Pekka & Andrew's patch to
write_cache_pages, as does my patch to shmem_writepage. And I'm
contending that unionfs_writepage should in no case return A_W_A up.
But so long as A_W_A is still defined, unionfs_writepage does
still need to check for it after calling the lower level ->writepage
(because it needs to do the missing unlock_page): unionfs_writepages
prevents unionfs_writepage being called on the normal writeout path,
but it's still getting called by vmscan under memory pressure.
(I'm in the habit of saying "vmscan" rather than naming the functions
in question, because every few months someone restructures that file
and changes their names. I exaggerate, but it's happened often enough.)
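The check described above (doing the missing unlock_page when the lower
->writepage returns A_W_A, without passing the code up) might look roughly
like this in user space. This is a sketch, not the unionfs code: the types are
simplified stand-ins, lower_activates and lower_ok are hypothetical lower
filesystems, and it assumes the missing unlock refers to the lower page, which
is left locked when A_W_A is returned.

```c
#include <assert.h>
#include <stdbool.h>

#define AOP_WRITEPAGE_ACTIVATE 0x80000	/* value as in include/linux/fs.h */

/* User-space stand-ins for kernel objects; not the real definitions. */
struct page {
	bool locked;
	bool dirty;
};

struct writeback_control {
	bool for_reclaim;
};

typedef int (*writepage_fn)(struct page *, struct writeback_control *);

static void lock_page(struct page *page)
{
	assert(!page->locked);
	page->locked = true;
}

static void unlock_page(struct page *page)
{
	assert(page->locked);
	page->locked = false;
}

/* Hypothetical lower fs: redirties and returns A_W_A, page still locked. */
int lower_activates(struct page *page, struct writeback_control *wbc)
{
	(void)wbc;
	page->dirty = true;
	return AOP_WRITEPAGE_ACTIVATE;
}

/* Hypothetical lower fs: writes the page out and unlocks it. */
int lower_ok(struct page *page, struct writeback_control *wbc)
{
	(void)wbc;
	unlock_page(page);
	return 0;
}

/* The upper page arrives locked; the lower page is locked here. */
int stacked_writepage(struct page *page, struct page *lower_page,
		      writepage_fn lower_writepage,
		      struct writeback_control *wbc)
{
	int err;

	lock_page(lower_page);
	err = lower_writepage(lower_page, wbc);
	if (err == AOP_WRITEPAGE_ACTIVATE) {
		unlock_page(lower_page);	/* the missing unlock */
		err = 0;			/* do not return A_W_A up */
	}
	unlock_page(page);
	return err;
}
```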
> But, a future file system _could_ return AOP_WRITEPAGE_ACTIVATE w/o setting
> BDI_CAP_NO_WRITEBACK, right? In that case, unionfs will still need to
> handle AOP_WRITEPAGE_ACTIVATE in ->writepage, right?
For so long as AOP_WRITEPAGE_ACTIVATE exists, unionfs_writepage needs to
check for it coming from the lower level ->writepage, as I said above.
But your/Pekka's unionfs_writepages doesn't need to worry about it
at all, because Andrew/Pekka's write_cache_pages fix prevents it
leaking up in the !reclaim case (as does my shmem_writepage fix):
please remove that AOP_WRITEPAGE_ACTIVATE comment from unionfs_writepages.
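For reference, the kind of handling that keeps A_W_A from leaking out of a
write_cache_pages-style loop can be sketched in user space as below. This is
an illustration of the idea only, not the actual patch: write_pages_sketch and
always_activate are hypothetical names, and the types are simplified
stand-ins for the kernel ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define AOP_WRITEPAGE_ACTIVATE 0x80000	/* value as in include/linux/fs.h */

/* User-space stand-ins for kernel objects; not the real definitions. */
struct page {
	bool locked;
	bool dirty;
};

struct writeback_control {
	bool for_reclaim;
};

typedef int (*writepage_fn)(struct page *, struct writeback_control *);

static void unlock_page(struct page *page)
{
	assert(page->locked);
	page->locked = false;
}

/* Hypothetical ->writepage that always redirties and returns A_W_A. */
int always_activate(struct page *page, struct writeback_control *wbc)
{
	(void)wbc;
	page->dirty = true;
	return AOP_WRITEPAGE_ACTIVATE;	/* page is left locked */
}

/* Write each page; swallow A_W_A so it never reaches the caller. */
int write_pages_sketch(struct page **pages, size_t nr,
		       writepage_fn writepage,
		       struct writeback_control *wbc)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		pages[i]->locked = true;	/* stands in for lock_page() */
		ret = writepage(pages[i], wbc);
		if (ret == AOP_WRITEPAGE_ACTIVATE) {
			unlock_page(pages[i]);
			ret = 0;
		}
		if (ret)
			break;			/* a real error stops the walk */
	}
	return ret;
}
```

With the special code absorbed at this level, a ->writepages built on such a
loop has nothing left to check for.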
Hugh