Re: [patch] not to disturb page LRU state when unmapping memory range

From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Andrew Morton <akpm@osdl.org>
Cc: Hugh Dickins <hugh@veritas.com>, Ken Chen <kenchen@google.com>,
	linux-mm@kvack.org
Subject: Re: [patch] not to disturb page LRU state when unmapping memory range
Date: Thu, 01 Feb 2007 00:52:14 +0100
Message-ID: <1170287534.10924.103.camel@lappy>
In-Reply-To: <20070131144855.8fe255ff.akpm@osdl.org>

On Wed, 2007-01-31 at 14:48 -0800, Andrew Morton wrote:
> On Wed, 31 Jan 2007 23:25:00 +0100
> Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> 
> > On Wed, 2007-01-31 at 14:04 -0800, Andrew Morton wrote:
> > 
> > > > Andrew, any strong opinions?
> > > 
> > > Not really.  If we change something in there, some workloads will get
> > > better, some will get worse and most will be unaffected and any regressions
> > > we cause won't be known until six months later.  The usual deal.
> > > 
> > > Remember that all this info is supposed to be estimating what is likely to
> > > happen to this page in the future - we're not interested in what happened
> > > in the past, per-se.
> > > 
> > > I'd have thought that if multiple processes are touching the same
> > > page, this is a reason to think that the page will be required again in the
> > > immediate future.  But you seem to think otherwise?
> > 
> > Yes, why would unmapping a range make the pages more likely to be used
> > in the immediate future than otherwise indicated by their individual
> > young bits?
> > 
> > Even the opposite was suggested, that unmapping a range makes it less
> > likely to be used again.
> 
> Ah, yes, well, that's different.
> 
> Our handling of page referenced information is basically random: we had
> something in place in 2.4.midway, then use-once went in and churned things
> around, then we turned VM upside-down in 2.5 and I basically tried to keep
> what we then had in an unaltered state in the fond belief that someone
> would one day get down and actually apply some design and thought to what
> we're doing.  That has yet to happen.
> 
> Take a simple mmap+pagefault+munmap path.  The initial fault will leave the
> page pte-referenced+PageReferenced+!PageActive. 

Assuming a major fault; a minor fault might well map an active page.

>  If the vm scanner sees the
> page it will become !pte-referenced+!PageReferenced+PageActive.  If it gets
> unmapped it becomes !PageReferenced+PageActive.

The scanner does:

1) referenced,   inactive -> unreferenced, active
2) referenced,   active   -> unreferenced, active

3) unreferenced, active   -> unreferenced, inactive
4) unreferenced, inactive -> reclaimed

> These things at least seem to be somewhat consistent.  But I'm not sure
> there's any logic behind it.

Seems rather logical: it is a two-level state machine, and each clock
period you either promote or demote depending on activity.
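
As a minimal sketch, the four scanner transitions listed above can be
modelled as a small function. This is a toy model of that table only, not
the kernel's reclaim code; the names `scan` and `periods_until_reclaim` are
made up for illustration, and `referenced` stands for
(pte young || PageReferenced).

```python
RECLAIMED = "reclaimed"


def scan(referenced, active):
    """One clock period of the scanner, following the four-row table.

    Returns the new (referenced, active) pair, or RECLAIMED.
    """
    if referenced:
        # Rows 1 and 2: a referenced page ends up unreferenced and active.
        return (False, True)
    if active:
        # Row 3: an idle active page is demoted to the inactive list.
        return (False, False)
    # Row 4: an idle inactive page is reclaimed.
    return RECLAIMED


def periods_until_reclaim(referenced, active):
    """Count scanner passes until reclaim, assuming no further accesses."""
    state = (referenced, active)
    periods = 0
    while state != RECLAIMED:
        state = scan(*state)
        periods += 1
    return periods
```

In this model a page that is never touched again is reclaimed after three
passes if it starts out referenced, two if it starts unreferenced and
active, and one if it starts unreferenced and inactive.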

> Perhaps we're approaching this from the wrong direction.  Rather than
> looking at the code and saying "hey, we should change that", we should be
> looking at workloads and seeing how they can be improved.  Perhaps.

Whichever way I turn it, it remains a shot in the dark. But I get the
idea.

> In the above (simple, common) scenario the proposed
> s/mark_page_accessed/SetPageReferenced/ change will cause the page to end
> up PageReferenced+!PageActive. 

How so? It will not demote the page to inactive.

Today unmap can promote a page to active; with the change it cannot.
Neither variant will ever demote, only page reclaim does that.

currently with mark_page_accessed:

 referenced := (pte young || PageReferenced) 

1 active pte

  referenced (pte, !PG_referenced), inactive -> referenced,   inactive
  referenced (pte,  PG_referenced), inactive -> unreferenced, active
  *,                                active   -> referenced,   active

2 active ptes

  referenced (pte, !PG_referenced), inactive -> unreferenced, active
  referenced (pte,  PG_referenced), inactive -> referenced,   active
  *,                                active   -> referenced,   active

3+ active ptes

  *, * -> referenced, active

which I find quite horrid for unmap...

Or, with the proposed SetPageReferenced:

1+ active pte(s)
  referenced (pte, !PG_referenced), * -> referenced (PG_referenced), *
  referenced (pte,  PG_referenced), * -> referenced (PG_referenced), *

It's actually an identity map: it just moves pte young bits into the
referenced bit, which is all the same to page_referenced().
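
The two sets of transitions can be checked with a small model. This is
only a sketch: it tracks nothing but the PG_referenced and PG_active flags,
ignores everything else the real functions look at, and assumes unmap
calls the chosen function once per young pte. The `Page` class and the
`unmap` helper are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    referenced: bool  # models PG_referenced
    active: bool      # models PG_active


def mark_page_accessed(page):
    """Current behaviour: a second reference promotes an inactive page."""
    if not page.active and page.referenced:
        # inactive + referenced -> active + unreferenced
        return Page(referenced=False, active=True)
    # Otherwise just record the reference; the list does not change.
    return Page(referenced=True, active=page.active)


def set_page_referenced(page):
    """Proposed behaviour: record the reference, never change lists."""
    return Page(referenced=True, active=page.active)


def unmap(page, young_ptes, on_young):
    """Apply on_young once for every young pte that is torn down."""
    for _ in range(young_ptes):
        page = on_young(page)
    return page
```

For example, `unmap(Page(False, False), 2, mark_page_accessed)` yields an
active page, while the same call with `set_page_referenced` leaves the page
on the inactive list with PG_referenced set, for any number of young ptes.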

>  ie: it ends up on the inactive list and not
> the active list.  <tests it, confirms>. 

It will stay on whatever list it was on.

>  That's a substantial change in
> behaviour: inactive-list pages are considerably more reclaimable than
> active-list ones and we might well alter things for people by making this
> change.  Whether that alteration is net-good or net-bad is unknown ;)

It's quite a change indeed, but either I'm not quite parsing what you're
saying and we're in violent agreement, or I should go to sleep ;-)

I hope this state machinery makes sense; I feel asleep already.

> We don't _have_ to use live applications.  Often they are hard to set up,
> and do complex things and are hard to understand.  A more controllable and
> ultimately more useful result could be achieved by defining *workloads*:
> particular scenarios for the VM.  Then write simple and easily observable
> testcases for each scenario.  That's basically what people do, I think, but
> it's all a bit ad-hoc and uncoordinated.

I have started writing an application that can perform simple patterns,
perhaps we should discuss interesting patterns during the VM summit.
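
To make "simple patterns" concrete, here is a rough sketch of the
mmap+pagefault+munmap path mentioned earlier, written as a userspace
script. It is not the application referred to above; the helper names
`make_file` and `touch_and_unmap` are hypothetical, and whether each access
is a major or a minor fault depends on what is already in the page cache.

```python
import mmap
import os
import tempfile


def make_file(npages):
    """Create a temporary file of npages pages and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * (npages * mmap.PAGESIZE))
    return path


def touch_and_unmap(path, npages):
    """Map npages of the file, read one byte per page, then unmap."""
    length = npages * mmap.PAGESIZE
    touched = 0
    with open(path, "rb") as f:
        m = mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ)
        try:
            for offset in range(0, length, mmap.PAGESIZE):
                _ = m[offset]  # first access to the page takes a fault
                touched += 1
        finally:
            m.close()  # drops the mapping, i.e. the munmap step
    return touched
```

A test case would call `touch_and_unmap(make_file(n), n)` under memory
pressure and then observe, by whatever means is available, which pages
survive reclaim.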
