From: Andrew Morton <akpm@linux-foundation.org>
To: kosaki.motohiro@jp.fujitsu.com, nickpiggin@yahoo.com.au,
linux-mm@kvack.org, riel@redhat.com, lee.schermerhorn@hp.com
Subject: Re: vmscan-give-referenced-active-and-unmapped-pages-a-second-trip-around-the-lru.patch
Date: Fri, 10 Oct 2008 15:33:46 -0700 [thread overview]
Message-ID: <20081010153346.e25b90f7.akpm@linux-foundation.org> (raw)
In-Reply-To: <20081010152540.79ed64cb.akpm@linux-foundation.org>
On Fri, 10 Oct 2008 15:25:40 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:
> On Fri, 10 Oct 2008 15:17:01 -0700
> Andrew Morton <akpm@linux-foundation.org> wrote:
>
> > On Wed, 8 Oct 2008 19:03:07 +0900 (JST)
> > KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> >
> > > Hi
> > >
> > > Nick, Andrew, thank you very much for the good advice.
> > > Your help sped up my investigation.
> > >
> > >
> > > > This patch, like I said when it was first merged, has the problem that
> > > > it can cause large stalls when reclaiming pages.
> > > >
> > > > I actually myself tried a similar thing a long time ago. The problem is
> > > > that after a long period of no reclaiming, your file pages can all end
> > > > up being active and referenced. When the first guy wants to reclaim a
> > > > page, it might have to scan through gigabytes of file pages before being
> > > > able to reclaim a single one.
> > >
> > > I fully agree with this opinion.
> > > Having all pages stay on the active list is awful.
> > >
> > > In addition, my measurements show this patch causes a latency regression on a really heavy I/O workload.
> > >
> > > 2.6.27-rc8: Throughput 13.4231 MB/sec 4000 clients 4000 procs max_latency=1421988.159 ms
> > > + patch : Throughput 12.0953 MB/sec 4000 clients 4000 procs max_latency=1731244.847 ms
> > >
> > >
> > > > While it would be really nice to be able to just lazily set PageReferenced
> > > > and nothing else in mark_page_accessed, and then do file page aging based
> > > > on the referenced bit, the fact is that we virtually have O(1) reclaim
> > > > for file pages now, and this can make it much more like O(n) (in worst case,
> > > > especially).
> > > >
> > > > I don't think it is right to say "we broke aging and this patch fixes it".
> > > > It's all a big crazy heuristic. Who's to say that the previous behaviour
> > > > wasn't better and this patch breaks it? :)
> > > >
> > > > Anyway, I don't think it is exactly productive to keep patches like this in
> > > > the tree (that doesn't seem ever intended to be merged) while there are
> > > > other big changes to reclaim there.
> >
> > Well yes. I've been hanging onto these in the hope that someone would
> > work out whether they are changes which we should make.
> >
> >
> > > > Same for vm-dont-run-touch_buffer-during-buffercache-lookups.patch
> > >
> > > I measured it too:
> > >
> > > 2.6.27-rc8: Throughput 13.4231 MB/sec 4000 clients 4000 procs max_latency=1421988.159 ms
> > > + patch : Throughput 11.8494 MB/sec 4000 clients 4000 procs max_latency=3463217.227 ms
> > >
> > > dbench latency increased by about 2.5x.
> > >
> > > So, the patch description already describes this risk:
> > > dropping metadata can decrease performance significantly.
> > > That risk just materialized here, imho.
> >
> > Oh well, that'll suffice, thanks - I'll drop them.
>
> Which means that after vmscan-split-lru-lists-into-anon-file-sets.patch,
> shrink_active_list() simply does
>
> 	while (!list_empty(&l_hold)) {
> 		cond_resched();
> 		page = lru_to_page(&l_hold);
> 		list_move(&page->lru, &l_inactive);
> 	}
>
> yes?
>
> We might even be able to list_splice those pages..
OK, that wasn't a particularly good time to drop those patches.
Here's how shrink_active_list() ended up:
static void shrink_active_list(unsigned long nr_pages, struct zone *zone,
			struct scan_control *sc, int priority, int file)
{
	unsigned long pgmoved;
	int pgdeactivate = 0;
	unsigned long pgscanned;
	LIST_HEAD(l_hold);	/* The pages which were snipped off */
	LIST_HEAD(l_active);
	LIST_HEAD(l_inactive);
	struct page *page;
	struct pagevec pvec;
	enum lru_list lru;

	lru_add_drain();
	spin_lock_irq(&zone->lru_lock);
	pgmoved = sc->isolate_pages(nr_pages, &l_hold, &pgscanned, sc->order,
					ISOLATE_ACTIVE, zone,
					sc->mem_cgroup, 1, file);
	/*
	 * zone->pages_scanned is used for detecting a zone's OOM state.
	 * mem_cgroup remembers nr_scan by itself.
	 */
	if (scan_global_lru(sc)) {
		zone->pages_scanned += pgscanned;
		zone->recent_scanned[!!file] += pgmoved;
	}

	if (file)
		__mod_zone_page_state(zone, NR_ACTIVE_FILE, -pgmoved);
	else
		__mod_zone_page_state(zone, NR_ACTIVE_ANON, -pgmoved);
	spin_unlock_irq(&zone->lru_lock);

	pgmoved = 0;
	while (!list_empty(&l_hold)) {
		cond_resched();
		page = lru_to_page(&l_hold);
		list_del(&page->lru);
		if (unlikely(!page_evictable(page, NULL))) {
			putback_lru_page(page);
			continue;
		}
		list_add(&page->lru, &l_inactive);
		if (!page_mapping_inuse(page)) {
			/*
			 * Bypass use-once, make the next access count. See
			 * mark_page_accessed and shrink_page_list.
			 */
			SetPageReferenced(page);
		}
	}

	/*
	 * Count the referenced pages as rotated, even when they are moved
	 * to the inactive list.  This helps balance scan pressure between
	 * file and anonymous pages in get_scan_ratio.
	 */
	zone->recent_rotated[!!file] += pgmoved;

	/*
	 * Now put the pages back on the appropriate [file or anon] inactive
	 * and active lists.
	 */
	pagevec_init(&pvec, 1);
	pgmoved = 0;
	lru = LRU_BASE + file * LRU_FILE;
	spin_lock_irq(&zone->lru_lock);
	while (!list_empty(&l_inactive)) {
		page = lru_to_page(&l_inactive);
		prefetchw_prev_lru_page(page, &l_inactive, flags);
		VM_BUG_ON(PageLRU(page));
		SetPageLRU(page);
		VM_BUG_ON(!PageActive(page));
		ClearPageActive(page);

		list_move(&page->lru, &zone->lru[lru].list);
		mem_cgroup_move_lists(page, lru);
		pgmoved++;
		if (!pagevec_add(&pvec, page)) {
			__mod_zone_page_state(zone, NR_LRU_BASE + lru, pgmoved);
			spin_unlock_irq(&zone->lru_lock);
			pgdeactivate += pgmoved;
			pgmoved = 0;
			if (buffer_heads_over_limit)
				pagevec_strip(&pvec);
			__pagevec_release(&pvec);
			spin_lock_irq(&zone->lru_lock);
		}
	}
	__mod_zone_page_state(zone, NR_LRU_BASE + lru, pgmoved);
	pgdeactivate += pgmoved;
	if (buffer_heads_over_limit) {
		spin_unlock_irq(&zone->lru_lock);
		pagevec_strip(&pvec);
		spin_lock_irq(&zone->lru_lock);
	}

	pgmoved = 0;
	lru = LRU_ACTIVE + file * LRU_FILE;
	while (!list_empty(&l_active)) {
		page = lru_to_page(&l_active);
		prefetchw_prev_lru_page(page, &l_active, flags);
		VM_BUG_ON(PageLRU(page));
		SetPageLRU(page);
		VM_BUG_ON(!PageActive(page));
		list_move(&page->lru, &zone->lru[lru].list);
		mem_cgroup_move_lists(page, lru);
		pgmoved++;
		if (!pagevec_add(&pvec, page)) {
			__mod_zone_page_state(zone, NR_LRU_BASE + lru, pgmoved);
			pgmoved = 0;
			spin_unlock_irq(&zone->lru_lock);
			if (vm_swap_full())
				pagevec_swap_free(&pvec);
			__pagevec_release(&pvec);
			spin_lock_irq(&zone->lru_lock);
		}
	}
	__mod_zone_page_state(zone, NR_LRU_BASE + lru, pgmoved);
	__count_zone_vm_events(PGREFILL, zone, pgscanned);
	__count_vm_events(PGDEACTIVATE, pgdeactivate);
	spin_unlock_irq(&zone->lru_lock);
	if (vm_swap_full())
		pagevec_swap_free(&pvec);
	pagevec_release(&pvec);
}
Note the first `pgmoved = 0` before the l_hold loop: nothing in that loop increments it any more, so the recent_rotated[] update below it always adds zero. erk.