linux-mm.kvack.org archive mirror
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: kosaki.motohiro@jp.fujitsu.com, Rik van Riel <riel@redhat.com>,
	Jeff Dike <jdike@addtoit.com>, Avi Kivity <avi@redhat.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	"Yu, Wilfred" <wilfred.yu@intel.com>,
	"Kleen, Andi" <andi.kleen@intel.com>,
	Hugh Dickins <hugh.dickins@tiscali.co.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <cl@linux-foundation.org>,
	Mel Gorman <mel@csn.ul.ie>, LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>
Subject: Re: [RFC] respect the referenced bit of KVM guest pages?
Date: Wed, 19 Aug 2009 00:57:54 +0900 (JST)
Message-ID: <20090818234310.A64B.A69D9226@jp.fujitsu.com>
In-Reply-To: <20090816112910.GA3208@localhost>

> > Yes it does. I said 'mostly' because there is a small hole that an
> > unevictable page may be scanned but still not moved to unevictable
> > list: when a page is mapped in two places, the first pte has the
> > referenced bit set, the _second_ VMA has VM_LOCKED bit set, then
> > page_referenced() will return 1 and shrink_page_list() will move it
> > into active list instead of unevictable list. Shall we fix this rare
> > case?
> 
> How about this fix?

Good catch.
Yes, this is a rare case, but I also don't think your patch introduces
a performance regression.

However, I think your patch has one bug.

> 
> ---
> mm: stop circulating of referenced mlocked pages
> 
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> ---
> 
> --- linux.orig/mm/rmap.c	2009-08-16 19:11:13.000000000 +0800
> +++ linux/mm/rmap.c	2009-08-16 19:22:46.000000000 +0800
> @@ -358,6 +358,7 @@ static int page_referenced_one(struct pa
>  	 */
>  	if (vma->vm_flags & VM_LOCKED) {
>  		*mapcount = 1;	/* break early from loop */
> +		*vm_flags |= VM_LOCKED;
>  		goto out_unmap;
>  	}
>  
> @@ -482,6 +483,8 @@ static int page_referenced_file(struct p
>  	}
>  
>  	spin_unlock(&mapping->i_mmap_lock);
> +	if (*vm_flags & VM_LOCKED)
> +		referenced = 0;
>  	return referenced;
>  }
>  

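page_referenced_file()?
That function only handles file-backed pages. Anonymous pages go through
page_referenced_anon(), so they would miss the reset.
I think we should change page_referenced(), which both paths go through.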
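
To illustrate the point, here is a minimal user-space sketch (not kernel
code). Every name in it (mapping_model, page_model, walk() and the two
referenced_reset_in_*() helpers) is hypothetical, and the VM_LOCKED value
is just an illustrative bit. It only models where the "referenced = 0"
reset is placed: in the file-only walker, or in the common entry point.

```c
#include <assert.h>
#include <stddef.h>

#define VM_LOCKED 0x2000UL	/* illustrative bit; only its presence matters */

/* Hypothetical stand-in for one mapping (pte + VMA) of a page. */
struct mapping_model {
	unsigned long vm_flags;	/* flags of the VMA that maps the page */
	int young;		/* 1 if this pte has the referenced bit set */
};

/* Hypothetical stand-in for a page and its reverse mappings. */
struct page_model {
	int is_anon;		/* 1: anonymous page, 0: file-backed page */
	const struct mapping_model *maps;
	size_t nr_maps;
};

/* Shared walk: count referenced ptes, stop at the first VM_LOCKED VMA. */
static int walk(const struct page_model *page, unsigned long *vm_flags)
{
	int referenced = 0;

	for (size_t i = 0; i < page->nr_maps; i++) {
		if (page->maps[i].vm_flags & VM_LOCKED) {
			*vm_flags |= VM_LOCKED;	/* remember the mlocked VMA */
			break;			/* break early from loop */
		}
		referenced += page->maps[i].young;
	}
	return referenced;
}

/* Models a reset that lives only in the file-backed walker. */
static int referenced_reset_in_file_walker(const struct page_model *page,
					   unsigned long *vm_flags)
{
	int referenced;

	assert(vm_flags != NULL);
	*vm_flags = 0;
	referenced = walk(page, vm_flags);
	if (!page->is_anon && (*vm_flags & VM_LOCKED))
		referenced = 0;	/* anonymous pages never reach this */
	return referenced;
}

/* Models a reset in the common entry point, after either walker ran. */
static int referenced_reset_in_common_caller(const struct page_model *page,
					     unsigned long *vm_flags)
{
	int referenced;

	assert(vm_flags != NULL);
	*vm_flags = 0;
	referenced = walk(page, vm_flags);
	if (*vm_flags & VM_LOCKED)
		referenced = 0;	/* covers anonymous and file-backed pages */
	return referenced;
}
```

In this model, an anonymous page whose first mapping is referenced and
whose second mapping is VM_LOCKED still reports "referenced" with the
file-only reset, and reports 0 with the reset in the common caller.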


Instead, how about this?
==============================================

Subject: [PATCH] mm: stop circulating of referenced mlocked pages

Currently, the mlock() system call doesn't guarantee that the page is
marked PG_mlocked, because some races prevent the page from being grabbed.
In that case, vmscan moves the page to the unevictable LRU instead.

However, Wu Fengguang recently pointed out that the current vmscan logic
isn't efficient: an mlocked page can circulate between the active and
inactive lists, because vmscan checks whether the page is referenced
_before_ culling mlocked pages.

In addition, vmscan should mark the page PG_mlocked when it culls an
mlocked page. Otherwise the VM statistics show strange numbers.

This patch does both.

Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 mm/internal.h |    5 +++--
 mm/rmap.c     |    8 +++++++-
 mm/vmscan.c   |    2 +-
 3 files changed, 11 insertions(+), 4 deletions(-)

Index: b/mm/internal.h
===================================================================
--- a/mm/internal.h	2009-06-26 21:06:43.000000000 +0900
+++ b/mm/internal.h	2009-08-18 23:31:11.000000000 +0900
@@ -91,7 +91,8 @@ static inline void unevictable_migrate_p
  * to determine if it's being mapped into a LOCKED vma.
  * If so, mark page as mlocked.
  */
-static inline int is_mlocked_vma(struct vm_area_struct *vma, struct page *page)
+static inline int try_set_page_mlocked(struct vm_area_struct *vma,
+				       struct page *page)
 {
 	VM_BUG_ON(PageLRU(page));
 
@@ -144,7 +145,7 @@ static inline void mlock_migrate_page(st
 }
 
 #else /* CONFIG_HAVE_MLOCKED_PAGE_BIT */
-static inline int is_mlocked_vma(struct vm_area_struct *v, struct page *p)
+static inline int try_set_page_mlocked(struct vm_area_struct *v, struct page *p)
 {
 	return 0;
 }
Index: b/mm/rmap.c
===================================================================
--- a/mm/rmap.c	2009-08-18 19:48:14.000000000 +0900
+++ b/mm/rmap.c	2009-08-18 23:47:34.000000000 +0900
@@ -362,7 +362,9 @@ static int page_referenced_one(struct pa
 	 * unevictable list.
 	 */
 	if (vma->vm_flags & VM_LOCKED) {
-		*mapcount = 1;	/* break early from loop */
+		*mapcount = 1;		/* break early from loop */
+		*vm_flags |= VM_LOCKED;	/* prevent moving to the active list */
+		try_set_page_mlocked(vma, page);
 		goto out_unmap;
 	}
 
@@ -531,6 +533,9 @@ int page_referenced(struct page *page,
 	if (page_test_and_clear_young(page))
 		referenced++;
 
+	if (unlikely(*vm_flags & VM_LOCKED))
+		referenced = 0;
+
 	return referenced;
 }
 
@@ -784,6 +789,7 @@ static int try_to_unmap_one(struct page 
 	 */
 	if (!(flags & TTU_IGNORE_MLOCK)) {
 		if (vma->vm_flags & VM_LOCKED) {
+			try_set_page_mlocked(vma, page);
 			ret = SWAP_MLOCK;
 			goto out_unmap;
 		}
Index: b/mm/vmscan.c
===================================================================
--- a/mm/vmscan.c	2009-08-18 19:48:14.000000000 +0900
+++ b/mm/vmscan.c	2009-08-18 23:30:51.000000000 +0900
@@ -2666,7 +2666,7 @@ int page_evictable(struct page *page, st
 	if (mapping_unevictable(page_mapping(page)))
 		return 0;
 
-	if (PageMlocked(page) || (vma && is_mlocked_vma(vma, page)))
+	if (PageMlocked(page) || (vma && try_set_page_mlocked(vma, page)))
 		return 0;
 
 	return 1;
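
For readers who want to see the effect without reading vmscan, the
following is a rough user-space sketch of the ordering problem described
in the changelog. It is not kernel code: scan_page_model(),
page_referenced_model(), any_vma_locked(), the verdict enum and the
with_fix switch are all hypothetical names, and the model assumes the
scan does the reference check first and only afterwards notices a
VM_LOCKED VMA.

```c
#include <assert.h>
#include <stddef.h>

#define VM_LOCKED 0x2000UL	/* illustrative bit; only its presence matters */

/* What the modelled scan decides to do with the page. */
enum verdict { ACTIVATE, CULL_MLOCKED, RECLAIM };

/* Hypothetical stand-in for one mapping (pte + VMA) of a page. */
struct mapping_model {
	unsigned long vm_flags;	/* flags of the VMA that maps the page */
	int young;		/* 1 if this pte has the referenced bit set */
};

/* Reference check; with_fix != 0 models the behaviour after the patch. */
static int page_referenced_model(const struct mapping_model *maps, size_t n,
				 unsigned long *vm_flags, int with_fix)
{
	int referenced = 0;

	assert(vm_flags != NULL);
	*vm_flags = 0;
	for (size_t i = 0; i < n; i++) {
		if (maps[i].vm_flags & VM_LOCKED) {
			if (with_fix)
				*vm_flags |= VM_LOCKED;
			break;	/* break early from loop */
		}
		referenced += maps[i].young;
	}
	if (with_fix && (*vm_flags & VM_LOCKED))
		referenced = 0;	/* an mlocked page must not look referenced */
	return referenced;
}

/* Unmap attempt: reports whether any VMA mapping the page is locked. */
static int any_vma_locked(const struct mapping_model *maps, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (maps[i].vm_flags & VM_LOCKED)
			return 1;
	return 0;
}

/* One pass over a page: the reference check runs before the mlock cull. */
static enum verdict scan_page_model(const struct mapping_model *maps,
				    size_t n, int with_fix)
{
	unsigned long vm_flags;

	if (page_referenced_model(maps, n, &vm_flags, with_fix))
		return ACTIVATE;	/* back to the active list */
	if (any_vma_locked(maps, n))
		return CULL_MLOCKED;	/* moved to the unevictable list */
	return RECLAIM;
}
```

With a page mapped by a referenced pte first and a VM_LOCKED VMA second,
the model without the fix keeps returning ACTIVATE on every pass, which
is the circulation. With the fix the page is culled on the first pass.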








--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

