From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
To: Larry Woodman <lwoodman@redhat.com>
Cc: kosaki.motohiro@jp.fujitsu.com, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, akpm@linux-foundation.org,
Hugh Dickins <hugh.dickins@tiscali.co.uk>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Rik van Riel <riel@redhat.com>,
Andrea Arcangeli <aarcange@redhat.com>
Subject: Re: [RFC] high system time & lock contention running large mixed workload
Date: Tue, 1 Dec 2009 21:23:23 +0900 (JST)
Message-ID: <20091201102645.5C0A.A69D9226@jp.fujitsu.com>
In-Reply-To: <1259618429.2345.3.camel@dhcp-100-19-198.bos.redhat.com>
(Cc'ing some related people)
> The cause was determined to be the unconditional call to
> page_referenced() for every mapped page encountered in
> shrink_active_list(). page_referenced() takes the anon_vma->lock and
> calls page_referenced_one() for each vma. page_referenced_one() then
> calls page_check_address() which takes the pte_lockptr spinlock. If
> several CPUs are doing this at the same time there is a lot of
> pte_lockptr spinlock contention with the anon_vma->lock held. This
> causes contention on the anon_vma->lock, stalling in the fo and very
> high system time.
>
> Before the splitLRU patch shrink_active_list() would only call
> page_referenced() when reclaim_mapped got set. reclaim_mapped only got
> set when the priority worked its way from 12 all the way to 7. This
> prevented page_referenced() from being called from shrink_active_list()
> until the system was really struggling to reclaim memory.
>
> One way to prevent this is to change page_check_address() to execute a
> spin_trylock(ptl) when it was called by shrink_active_list() and simply
> fail if it could not get the pte_lockptr spinlock. This will make
> shrink_active_list() consider the page not referenced and allow the
> anon_vma->lock to be dropped much quicker.
>
> The attached patch does just that, thoughts???

At first look,

  - We certainly have to fix this issue.
  - But your patch is a bit risky.

Your patch treats a trylock(pte-lock) failure as "not accessed". But
lock contention generally implies that a contending peer exists, i.e.
the page typically has its reference bit set. Then the next
shrink_inactive_list() moves it back to the active list. That's a
suboptimal result.

However, we can't simply treat lock contention as "page is referenced"
either. If we did, the system would easily go into OOM.

So, would

	if (priority < DEF_PRIORITY - 2)
		page_referenced()
	else
		page_referenced_trylock()

be better?

In a typical workload, vmscan almost always runs at DEF_PRIORITY only.
So, if the priority == DEF_PRIORITY case doesn't cause heavy lock
contention, the system doesn't need to care about the contention.
Anyway, we can't avoid contention if the system is under heavy memory
pressure.
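
To make the idea above concrete, here is a rough userspace sketch, not
kernel code. A pthread mutex stands in for the pte lock, and struct
fake_page, the fake_page_referenced*() helpers, check_referenced() and
the REF_SKIPPED result are all hypothetical names made up for
illustration. How the caller should treat REF_SKIPPED is deliberately
left open, since neither "referenced" nor "not referenced" is obviously
right.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define DEF_PRIORITY 12	/* same value as the kernel's default scan priority */

/* Hypothetical stand-in for a mapped page; not a kernel structure. */
struct fake_page {
	pthread_mutex_t ptl;	/* models the pte_lockptr spinlock */
	bool referenced;	/* models the pte accessed bit */
};

/* REF_SKIPPED is an assumption of this sketch: the lock was contended. */
enum ref_result { REF_NO, REF_YES, REF_SKIPPED };

/* Test and clear the bit; the caller must hold p->ptl. */
static enum ref_result test_and_clear_referenced(struct fake_page *p)
{
	bool was_referenced = p->referenced;

	p->referenced = false;
	return was_referenced ? REF_YES : REF_NO;
}

/* Blocking variant: always waits for the lock. */
static enum ref_result fake_page_referenced(struct fake_page *p)
{
	enum ref_result ret;

	assert(p != NULL);
	pthread_mutex_lock(&p->ptl);
	ret = test_and_clear_referenced(p);
	pthread_mutex_unlock(&p->ptl);
	return ret;
}

/* Trylock variant: gives up at once and leaves the bit untouched. */
static enum ref_result fake_page_referenced_trylock(struct fake_page *p)
{
	enum ref_result ret;

	assert(p != NULL);
	if (pthread_mutex_trylock(&p->ptl) != 0)
		return REF_SKIPPED;
	ret = test_and_clear_referenced(p);
	pthread_mutex_unlock(&p->ptl);
	return ret;
}

/* Light pressure (priority near DEF_PRIORITY) must not block on the lock. */
static enum ref_result check_referenced(struct fake_page *p, int priority)
{
	if (priority < DEF_PRIORITY - 2)
		return fake_page_referenced(p);
	return fake_page_referenced_trylock(p);
}
```

With this shape, scans at DEF_PRIORITY never wait on a contended lock,
and the blocking path is only taken once the priority has dropped below
DEF_PRIORITY - 2.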

Btw, the current shrink_active_list() has an unnecessary
page_mapping_inuse() call. It prevents dropping the reference bit from
unmapped cache pages. That means we protect unmapped cache pages more
than mapped pages. That is strange.

Unfortunately, I don't have enough development time today. I'll work
on it tomorrow.