From: Nick Piggin <npiggin@suse.de>
To: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: linux-mm <linux-mm@kvack.org>, Rik van Riel <riel@redhat.com>
Subject: Re: RFC: Noreclaim with "Keep Mlocked Pages off the LRU"
Date: Wed, 29 Aug 2007 06:38:03 +0200
Message-ID: <20070829043803.GD25335@wotan.suse.de>
In-Reply-To: <1188312766.5079.77.camel@localhost>
On Tue, Aug 28, 2007 at 10:52:46AM -0400, Lee Schermerhorn wrote:
> On Tue, 2007-08-28 at 02:06 +0200, Nick Piggin wrote:
> >
> > I don't have a problem with having a more unified approach, although if
> > we did that, then I'd prefer just to do it more simply and not special
> > case mlocked pages _at all_. I.e., just slowly try to reclaim them and
> > eventually when everybody unlocks them, you will notice sooner or later.
>
> I didn't think I was special casing mlocked pages. I wanted to treat
> all !page_reclaimable() pages the same--i.e., put them on the noreclaim
> list.
But you are keeping track of the mlock count? Why not simply call
try_to_unmap and see if they are still mlocked?
> > But once you do the code for mlock refcounting, that's most of the hard
> > part done so you may as well remove them completely from the LRU, no?
> > Then they become more or less transparent to the rest of the VM as well.
>
> Well, no. Depending on the reason for !reclaimable, the page would go
> on the noreclaim list or just be dropped--special handling. More
> importantly [for me], we still have to handle them specially in
> migration, dumping them back onto the LRU so that we can arbitrate
> access. If I'm ever successful in getting automatic/lazy page migration
> +replication accepted, I don't want that overhead in
> auto-migration/replication.
Oh OK. I don't know if there should be a whole lot of overhead involved
with that, though. I can't remember exactly what the problems were here
with my mlock patch, but I think it could have been made more optimal.
> > Could be possible. Tricky though. Probably take less code to use
> > ->lru ;)
>
> Oh, certainly less code to use any separate field. But the lru list
> field is the only link we have in the page struct, and a lot of VM
> depends on being able to pass around lists of pages. I'd hate to lose
> that for mlocked pages, or to have to dump the lock count and
> reestablish it in those cases, like migration, where we need to put the
> page on a list.
Hmm, yes. Migration could possibly use a singly linked list.
But I'm only saying it _could_ be possible to do mlocked accounting
efficiently with one of the LRU pointers -- I would prefer the idea
of just using a single bit for example, if that is sufficient. It
should cut down on code.
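
To make the two options concrete, here is a rough userspace sketch. It is
a toy model only: struct toy_page, TOY_PG_MLOCKED and the toy_mlock_*
helpers are made-up names for illustration, not the kernel's struct page
and not code from either patch. Variant A keeps a lock count in the space
of one lru pointer while the page is off the LRU; variant B keeps only a
single flag bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a doubly linked list link; not the kernel definition. */
struct toy_list_head {
	struct toy_list_head *next, *prev;
};

#define TOY_PG_MLOCKED (1UL << 0)	/* hypothetical flag bit */

/* Toy page: the lru link space is reused while the page is off the LRU. */
struct toy_page {
	unsigned long flags;
	union {
		struct toy_list_head lru;		/* valid while on a list */
		struct {
			unsigned long mlock_count;	/* overlays lru.next */
			void *unused;			/* overlays lru.prev */
		} off_lru;
	} u;
};

/* Variant A: count lockers in the space of one lru pointer. */
static void toy_mlock_count_inc(struct toy_page *p)
{
	if (!(p->flags & TOY_PG_MLOCKED)) {
		/* First locker: the page would leave the LRU here. */
		p->flags |= TOY_PG_MLOCKED;
		p->u.off_lru.mlock_count = 0;
	}
	p->u.off_lru.mlock_count++;
}

/* Returns true when the last locker is gone and the page may rejoin the LRU. */
static bool toy_mlock_count_dec(struct toy_page *p)
{
	assert(p->flags & TOY_PG_MLOCKED);
	assert(p->u.off_lru.mlock_count > 0);
	if (--p->u.off_lru.mlock_count == 0) {
		p->flags &= ~TOY_PG_MLOCKED;
		return true;
	}
	return false;
}

/* Variant B: a single bit and no count. */
static void toy_mlock_bit_set(struct toy_page *p)
{
	p->flags |= TOY_PG_MLOCKED;
}

/* After clearing, the caller has to recheck whether another mapping
 * still locks the page, since no count is kept. */
static void toy_mlock_bit_clear(struct toy_page *p)
{
	p->flags &= ~TOY_PG_MLOCKED;
}
```

In variant A the count and the list link share storage, so the page
cannot be put on any list (for migration, say) without first saving or
dropping the count. Variant B leaves the lru field free, at the cost of
having to rediscover remaining lockers on unlock.
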
> > I don't know. I'd have thought efficient mlock handling might be useful
> > for realtime systems, probably many of which would be 32-bit.
>
> I agree. I just wonder if those systems have a sufficient number of
> pages that they're suffering from the long lru lists with a large
> fraction of unreclaimable pages... If we do want to support keeping
> nonreclaimable pages off the [in]active lists for these systems, we'll
> need to find a place for the flag[s].
That's true, they will have a lot fewer pages (and probably won't
be using highmem).
> > Are you seeing mlock pinning heaps of memory in the field?
>
> It is a common usage to mlock() large shared memory areas, as well as
> entire tasks [MCL_CURRENT|MCL_FUTURE]. I think it would be even
> more frequent if one could inherit MCL_FUTURE across fork and exec.
> Then one could write/enhance a prefix command, like numactl and taskset,
> to enable locking of unmodified applications. I prototyped this once,
> but never updated it to do the mlock accounting [e.g., down in
> copy_page_range() during fork()] for your patch.
>
> What we see more of is folks just figuring that they've got sufficient
> memory [100s of GB] for their apps and shared memory areas, so they
> don't add enough swap to back all of the anon and shmem regions. Then,
> when they get under memory pressure--e.g., the old "backup ate my
> pagecache" scenario--the system more or less live-locks in vmscan
> shuffling non-reclaimable [unswappable] pages. A large number of
> mlocked pages on the LRU produces the same symptom; as do excessively
> long anon_vma lists and huge i_mmap trees--the latter seen with some
> large Oracle workloads.
OK, thanks for the background.
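
For reference, locking an entire task from userspace is done with
mlockall(); its flags are spelled MCL_CURRENT and MCL_FUTURE. A minimal
sketch follows. The helper names lock_whole_task() and
unlock_whole_task() are made up for illustration. The call commonly
fails with EPERM or ENOMEM when RLIMIT_MEMLOCK is too low, so the helper
returns the errno value instead of aborting.

```c
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>

/* Lock all current and future mappings of the calling process.
 * Returns 0 on success, otherwise the errno value from mlockall(). */
static int lock_whole_task(void)
{
	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
		int err = errno;

		assert(err > 0);	/* errno is set on failure */
		return err;
	}
	return 0;
}

/* Undo the above. Returns 0 on success, otherwise the errno value. */
static int unlock_whole_task(void)
{
	if (munlockall() != 0)
		return errno;
	return 0;
}
```

These locks are not inherited by a child created with fork() and are
dropped on execve(), which is why a numactl-style prefix command cannot
simply call this and then exec an unmodified application.
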
Thanks,
Nick