From: Andrew Morton <akpm@osdl.org>
To: Hugh Dickins <hugh@veritas.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Mike Waychison <mikew@google.com>,
linux-mm@kvack.org,
Linux Kernel list <linux-kernel@vger.kernel.org>,
Linus Torvalds <torvalds@osdl.org>
Subject: Re: [RFC] page fault retry with NOPAGE_RETRY
Date: Sat, 23 Sep 2006 12:46:18 -0700
Message-ID: <20060923124618.e5ef3a51.akpm@osdl.org>
In-Reply-To: <Pine.LNX.4.64.0609231421110.25804@blonde.wat.veritas.com>
On Sat, 23 Sep 2006 15:21:40 +0100 (BST)
Hugh Dickins <hugh@veritas.com> wrote:
> On Wed, 20 Sep 2006, Andrew Morton wrote:
> > On Wed, 20 Sep 2006 16:54:59 +1000
> > Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> > >
> > > That's what I don't understand... where is the actual race that can
> > > cause the livelock you are mentioning.
> >
> > Suppose a program (let's call it "DoS") is written which sits in a loop
> > doing fadvise(FADV_DONTNEED) against some parts of /lib/libc.so.
>
> I agree there's an issue here, but I believe you're attacking the wrong
> end, thereby complicating and uglifying the pagefault path (in every
> arch) with your proposed arg block and retry limitation.
"simplifying and cleaning up the pagefault path (in every arch) with my
proposed arg block and retry improvement".
We're presently passing four to six arguments down many layers of
function calls. Those could be replaced with a single arg.
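
To make the single-arg idea concrete, here is a minimal sketch in plain
userspace C. The struct `fault_args`, its fields, and the three layer
functions are hypothetical names invented for illustration; they are not
taken from any posted patch or from the kernel's actual fault path.

```c
#include <stdbool.h>

/* Assumption for this sketch only: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical argument block: every field name is illustrative. */
struct fault_args {
	unsigned long address;	/* in: faulting address */
	unsigned long pgoff;	/* filled in by an intermediate layer */
	bool write_access;	/* in: true for a write fault */
	bool handled;		/* out: set by the innermost layer */
};

/* Innermost layer: reads what it needs from the block, records the result. */
static int lowest_layer(struct fault_args *fa)
{
	fa->handled = true;
	return 0;
}

/* Intermediate layer: derives a value once and stores it in the block. */
static int middle_layer(struct fault_args *fa)
{
	fa->pgoff = fa->address >> SKETCH_PAGE_SHIFT;
	return lowest_layer(fa);
}

/* Entry point: one pointer travels down, however many fields there are. */
static int top_layer(struct fault_args *fa)
{
	fa->handled = false;
	return middle_layer(fa);
}
```

Adding a field later (a retry counter, say) then touches the struct and
the layers that use it, not every prototype in between.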
> (Maybe one day there will be need for such an arg block,
> but I don't see that yet.)
I agree it's marginal.
> Isn't the real problem that fadvise(FADV_DONTNEED) is much more
> powerful than it should be? Whereas madvise(MADV_DONTNEED) is simply
> releasing pages from my address space, fadvise(FADV_DONTNEED) is going
> so far as to remove them from pagecache (if nothing at that instant
> prevents): forcing others into I/O. Why should I be allowed to
> invalidate pagecache useful to others so quickly?
>
> Shouldn't it merely, say, move the pages in its range to the inactive
> list, giving other processes a chance to reassert an interest in them?
> May not turn out as easy as that, I admit.
Could be, although that would cause inodes to remain unreclaimable.
> I'm fine with your idea of dropping mmap_sem while nopage waits on I/O,
> I'm fine with your idea of an mm mmap transaction count, so nopage can
> just reget mmap_sem without backing out when nothing changed meanwhile.
>
> But I do think Ben should have the simple NOPAGE_RETRY he proposed,
> going right back out to userspace; and that should be enough for your
> case too (the mmap transaction count would make its use a rarity).
Perhaps we should concentrate on that for now. Did we have a patch to look
at?
> > So I think there's a nasty DoS here if we permit infinite retries. But
> > it's not just that - there might be other situations under really heavy
> > memory pressure where livelocks like this can occur.
>
> filemap_nopage would want to mark_page_accessed() before returning
> NOPAGE_RETRY, but if that's not good enough to hold the page in cache
> before the retried fault grabs it, your memory pressure is already
> into thrashing. I believe the livelock is peculiar to FADV_DONTNEED.
Maybe. Putting a potential infinite loop like this into the pagefault path
gives me the creeps.

Bear in mind that direct-io (both block and NFS) shoots down pagecache too.
It has to.
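
As a rough illustration of what a retry limit means for the infinite-loop
worry above, here is a userspace sketch. Everything in it is an
assumption made for the example: the enum values are stand-ins (only the
name NOPAGE_RETRY comes from this thread), the limit of 3 is arbitrary,
and what the fault path should really do once the limit is hit is left
open.

```c
/* Arbitrary illustrative limit, not a value proposed in the thread. */
#define MAX_FAULT_RETRIES 3

/* Stand-in result codes for this sketch only. */
enum nopage_result { NOPAGE_OK, NOPAGE_RETRY, NOPAGE_GAVE_UP };

/* Hypothetical handler type: one attempt at servicing the fault. */
typedef enum nopage_result (*nopage_fn)(void *ctx);

/*
 * Call fn until it stops asking for a retry, but never more than
 * MAX_FAULT_RETRIES times. *attempts reports how many calls were made.
 */
static enum nopage_result fault_bounded(nopage_fn fn, void *ctx,
					unsigned int *attempts)
{
	unsigned int i;

	*attempts = 0;
	for (i = 0; i < MAX_FAULT_RETRIES; i++) {
		enum nopage_result r;

		(*attempts)++;
		r = fn(ctx);
		if (r != NOPAGE_RETRY)
			return r;
	}
	/* Limit reached: the caller must fall back to some other path. */
	return NOPAGE_GAVE_UP;
}

/* Models the livelock case: the page is shot down before every retry. */
static enum nopage_result always_retry(void *ctx)
{
	(void)ctx;
	return NOPAGE_RETRY;
}

/* Models the common case: the second attempt finds the page. */
static enum nopage_result ok_on_second_try(void *ctx)
{
	int *calls = ctx;

	return ++(*calls) >= 2 ? NOPAGE_OK : NOPAGE_RETRY;
}
```

With `always_retry` the loop ends after three attempts instead of
spinning forever; with `ok_on_second_try` it returns after two.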