From: Andrea Arcangeli <andrea@suse.de>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, Nick Piggin <npiggin@suse.de>,
	Martin Bligh <mbligh@mbligh.org>
Subject: Re: make swappiness safer to use
Date: Wed, 1 Aug 2007 01:23:06 +0200
Message-ID: <20070731232306.GY6910@v2.random>
In-Reply-To: <20070731160943.30e9c13a.akpm@linux-foundation.org>

On Tue, Jul 31, 2007 at 04:09:43PM -0700, Andrew Morton wrote:
> On Tue, 31 Jul 2007 23:52:28 +0200
> Andrea Arcangeli <andrea@suse.de> wrote:
> 
> > I think the prev_priority can also be nuked since it wastes 4 bytes
> > per zone (that would be an incremental patch, but I'm waiting for
> > nr_scan_[in]active to be nuked first for similar reasons). Clearly
> > somebody at some point noticed how broken that thing was and had
> > to add min(priority, prev_priority) to give it some reliability, but
> > they didn't go the last mile and nuke prev_priority too. Calculating
> > distress only as a function of the non-racy priority is correct and
> > surely more than enough, without adding randomness to the equation.
> 
> I don't recall seeing any such patch and I suspect it'd cause problems
> anyway.
> 
> If we were to base swap_tendency purely on sc->priority then the VM would
> incorrectly fail to deactivate mapped pages until the scanning had reached
> a sufficiently high (ie: low) scanning priority.
> 
> The net effect would be that each time some process runs
> shrink_active_list(), some pages would be incorrectly retained on the
> active list and after a while, the code would start moving mapped pages down
> to the inactive list.
> 
> In fact, I think that was (effectively) the behaviour which we had in
> there, and it caused problems with some workload which Martin was looking
> at and things got better when we fixed it.
> 
> 
> Anyway, we can say more if we see the patch (or, more accurately, the
> analysis which comes with that patch).

My reasoning for prev_priority not being such a great feature is that,
of the two, sc->priority is far more important because it is set for
the current run, while prev_priority is set later. (Originally only
prev_priority was used as the failsafe for the swappiness logic; these
days sc->priority is mixed in too, clearly because prev_priority alone
was not enough.) But my whole dislike for those prev_* things is that
they're all SMP racy. So your beloved prev_priority will go back to 12
if a new try_to_free_pages runs with a different gfpmask and/or a
different allocation order, screwing the task on the other CPU that is
having such a hard time finding unmapped pages to free because it has
a stricter gfpmask (perhaps not allowed to eat into dcache/icache) or
a bigger order (perhaps even looping nearly forever thanks to the
order <= PAGE_ALLOC_COSTLY_ORDER check). So I have a hard time
appreciating the prev_priority thing because, like nr_scan_[in]active,
it's imperfect.
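
To make the dependency concrete, here is a minimal sketch of how
distress feeds the decision to deactivate mapped pages. It is written
from memory of mm/vmscan.c of this era, not copied from the tree. The
helper names, the standalone form and the sample mapped_ratio of 40
used below are assumptions for illustration only; DEF_PRIORITY is 12
and the default swappiness is 60.

```c
#define DEF_PRIORITY 12   /* starting priority; lower number = more pressure */

/* distress: 0 when reclaim is easy, 100 when priority has dropped to 0 */
static int distress(int prev_priority, int priority)
{
	int p = prev_priority < priority ? prev_priority : priority;

	return 100 >> p;
}

/* Nonzero when mapped pages should be deactivated too (hypothetical helper). */
static int reclaim_mapped(int mapped_ratio, int swappiness,
			  int prev_priority, int priority)
{
	int swap_tendency = mapped_ratio / 2
			  + distress(prev_priority, priority)
			  + swappiness;

	return swap_tendency >= 100;
}
```

With mapped_ratio 40 and swappiness 60, a scan at priority 2 gives
distress 25 and swap_tendency 105, so mapped pages get deactivated. A
later pass that starts at priority 12 only does the same if
prev_priority still holds 2; once another reclaimer has written 12
back, distress is 0, swap_tendency is 80, and mapped pages stay put.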

Comments like this one also show the whole imperfection:

	 /* Now that we've scanned all the zones at this priority level, note
	  * that level within the zone so that the next thread

That's a lie: there's no such thing as a "next thread". All threads
may be running in parallel on multiple CPUs, or they may be context
switching. The comment would be remotely correct only if there were a
big global semaphore around the VM, which will never happen.
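
The same point as a toy model: a simplified replay of the two stores
to prev_priority as I understand them. All names here (struct
zone_sketch, note_scanning_priority, finish_reclaim,
replay_interleaving) are made up for the example, and there is no real
concurrency, just one possible interleaving written out in order.

```c
#define DEF_PRIORITY 12

/* Hypothetical stand-in for the one field of struct zone that matters here. */
struct zone_sketch {
	int prev_priority;
};

/* During scanning: remember the lowest (most desperate) priority seen. */
static void note_scanning_priority(struct zone_sketch *zone, int priority)
{
	if (priority < zone->prev_priority)
		zone->prev_priority = priority;
}

/* At the end of a reclaim run: store the final priority unconditionally. */
static void finish_reclaim(struct zone_sketch *zone, int priority)
{
	zone->prev_priority = priority;
}

/*
 * One possible interleaving, replayed sequentially:
 * task A (strict gfpmask) has struggled down to priority 2, then task B
 * (lenient gfpmask) finishes a run that succeeded at DEF_PRIORITY.
 * Returns what task A finds in prev_priority afterwards.
 */
static int replay_interleaving(void)
{
	struct zone_sketch zone = { .prev_priority = DEF_PRIORITY };

	note_scanning_priority(&zone, 2);	/* task A, CPU 0 */
	finish_reclaim(&zone, DEF_PRIORITY);	/* task B, CPU 1 */

	return zone.prev_priority;
}
```

In this replay the last writer wins: task B's store puts
prev_priority back to 12, and the history of task A's hard time is
gone even though task A is still struggling.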

It's really in the same category as nr_scan_[in]active, and my
dislike for those things is exactly the same, motivated by mostly
the same reasons.

