Re: [PATCH 3/4] memcg: avoid account not-on-LRU pages

From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: nishimura@mxp.nes.nec.co.jp,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"balbir@linux.vnet.ibm.com" <balbir@linux.vnet.ibm.com>,
	"xemul@openvz.org" <xemul@openvz.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 3/4] memcg: avoid account not-on-LRU pages
Date: Mon, 29 Sep 2008 20:19:06 +0900
Message-ID: <20080929201906.896b9f3d.nishimura@mxp.nes.nec.co.jp>
In-Reply-To: <20080929192339.327ca142.kamezawa.hiroyu@jp.fujitsu.com>

On Mon, 29 Sep 2008 19:23:39 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> Some not-on-LRU pages can be mapped, but they are not worth accounting
> (because we can't shrink them and would need dirty code to handle the
> special case). We'd like to rely on the usual objrmap/radix-tree protocol
> and don't want to account pages that are outside the VM's control.
> 
> When special_mapping_fault() is called, page->mapping tends to be NULL
> and the page is charged as an anonymous page.
> insert_page() also handles some special pages from drivers.
> 
> This patch avoids accounting such special pages.
> 
> Changelog: v5 -> v6
>   - modified Documentation.
>   - fixed to charge only when a page is newly allocated.
> 
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> 
>  Documentation/controllers/memory.txt |   24 ++++++++++++++++--------
>  mm/memory.c                          |   29 +++++++++++++----------------
>  mm/rmap.c                            |    4 ++--
>  3 files changed, 31 insertions(+), 26 deletions(-)
> 
> Index: mmotm-2.6.27-rc7+/mm/memory.c
> ===================================================================
> --- mmotm-2.6.27-rc7+.orig/mm/memory.c
> +++ mmotm-2.6.27-rc7+/mm/memory.c
> @@ -1323,18 +1323,14 @@ static int insert_page(struct vm_area_st
>  	pte_t *pte;
>  	spinlock_t *ptl;
>  
> -	retval = mem_cgroup_charge(page, mm, GFP_KERNEL);
> -	if (retval)
> -		goto out;
> -
>  	retval = -EINVAL;
>  	if (PageAnon(page))
> -		goto out_uncharge;
> +		goto out;
>  	retval = -ENOMEM;
>  	flush_dcache_page(page);
>  	pte = get_locked_pte(mm, addr, &ptl);
>  	if (!pte)
> -		goto out_uncharge;
> +		goto out;
>  	retval = -EBUSY;
>  	if (!pte_none(*pte))
>  		goto out_unlock;
> @@ -1350,8 +1346,6 @@ static int insert_page(struct vm_area_st
>  	return retval;
>  out_unlock:
>  	pte_unmap_unlock(pte, ptl);
> -out_uncharge:
> -	mem_cgroup_uncharge_page(page);
>  out:
>  	return retval;
>  }
> @@ -2463,6 +2457,7 @@ static int __do_fault(struct mm_struct *
>  	struct page *page;
>  	pte_t entry;
>  	int anon = 0;
> +	int charged = 0;
>  	struct page *dirty_page = NULL;
>  	struct vm_fault vmf;
>  	int ret;
> @@ -2503,6 +2498,12 @@ static int __do_fault(struct mm_struct *
>  				ret = VM_FAULT_OOM;
>  				goto out;
>  			}
> +			if (mem_cgroup_charge(page, mm, GFP_KERNEL)) {
> +				ret = VM_FAULT_OOM;
> +				page_cache_release(page);
> +				goto out;
> +			}
> +			charged = 1;
>  			/*
>  			 * Don't let another task, with possibly unlocked vma,
>  			 * keep the mlocked page.
> @@ -2543,11 +2544,6 @@ static int __do_fault(struct mm_struct *
>  
>  	}
>  
> -	if (mem_cgroup_charge(page, mm, GFP_KERNEL)) {
> -		ret = VM_FAULT_OOM;
> -		goto out;
> -	}
> -
>  	page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
>  
>  	/*
> @@ -2585,10 +2581,11 @@ static int __do_fault(struct mm_struct *
>  		/* no need to invalidate: a not-present page won't be cached */
>  		update_mmu_cache(vma, address, entry);
>  	} else {
> -		mem_cgroup_uncharge_page(page);
> -		if (anon)
> +		if (charged)
> +			mem_cgroup_uncharge_page(page);
> +		if (anon) {
>  			page_cache_release(page);
> -		else
> +		} else
>  			anon = 1; /* no anon but release faulted_page */
>  	}
>

checkpatch reports a warning here.

I think it should look like this:

@@ -2585,7 +2581,8 @@ static int __do_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		/* no need to invalidate: a not-present page won't be cached */
 		update_mmu_cache(vma, address, entry);
 	} else {
-		mem_cgroup_uncharge_page(page);
+		if (charged)
+			mem_cgroup_uncharge_page(page);
 		if (anon)
 			page_cache_release(page);
 		else

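For illustration, the "uncharge only what was charged" pattern in this hunk
can be modeled in a small userspace sketch. This is not kernel code: the
names toy_cgroup, toy_charge(), toy_uncharge() and toy_fault(), the limit
check, and the return values are all hypothetical and exist only to show
how the charged flag keeps the failure path balanced.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a memory cgroup's usage counter. */
struct toy_cgroup {
	long usage;
	long limit;
};

/* Returns 0 on success, -1 if the charge would exceed the limit. */
static int toy_charge(struct toy_cgroup *cg)
{
	if (cg->usage >= cg->limit)
		return -1;
	cg->usage++;
	return 0;
}

static void toy_uncharge(struct toy_cgroup *cg)
{
	assert(cg->usage > 0);	/* an unbalanced uncharge is a bug */
	cg->usage--;
}

/*
 * Models the shape of the fault path: only a newly allocated page is
 * charged, and the "pte changed under us" path undoes the charge only
 * if this call made one.
 * Returns 0 if mapped, 1 if the race was lost, -1 if the charge failed.
 */
static int toy_fault(struct toy_cgroup *cg, bool new_page, bool pte_changed)
{
	bool charged = false;

	if (new_page) {
		if (toy_charge(cg))
			return -1;	/* analogous to VM_FAULT_OOM */
		charged = true;
	}

	if (pte_changed) {
		if (charged)
			toy_uncharge(cg);
		return 1;
	}
	return 0;
}
```

With an unconditional uncharge in the pte_changed branch, a fault on a page
that was never charged would hit the assertion in toy_uncharge(); the flag
is what prevents that.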

Thanks,
Daisuke Nishimura.

> Index: mmotm-2.6.27-rc7+/mm/rmap.c
> ===================================================================
> --- mmotm-2.6.27-rc7+.orig/mm/rmap.c
> +++ mmotm-2.6.27-rc7+/mm/rmap.c
> @@ -725,8 +725,8 @@ void page_remove_rmap(struct page *page,
>  			page_clear_dirty(page);
>  			set_page_dirty(page);
>  		}
> -
> -		mem_cgroup_uncharge_page(page);
> +		if (PageAnon(page))
> +			mem_cgroup_uncharge_page(page);
>  		__dec_zone_page_state(page,
>  			PageAnon(page) ? NR_ANON_PAGES : NR_FILE_MAPPED);
>  		/*
> Index: mmotm-2.6.27-rc7+/Documentation/controllers/memory.txt
> ===================================================================
> --- mmotm-2.6.27-rc7+.orig/Documentation/controllers/memory.txt
> +++ mmotm-2.6.27-rc7+/Documentation/controllers/memory.txt
> @@ -112,14 +112,22 @@ the per cgroup LRU.
>  
>  2.2.1 Accounting details
>  
> -All mapped pages (RSS) and unmapped user pages (Page Cache) are accounted.
> -RSS pages are accounted at the time of page_add_*_rmap() unless they've already
> -been accounted for earlier. A file page will be accounted for as Page Cache;
> -it's mapped into the page tables of a process, duplicate accounting is carefully
> -avoided. Page Cache pages are accounted at the time of add_to_page_cache().
> -The corresponding routines that remove a page from the page tables or removes
> -a page from Page Cache is used to decrement the accounting counters of the
> -cgroup.
> +All mapped anon pages (RSS) and cache pages (Page Cache) are accounted.
> +(some pages which never be reclaimable and will not be on global LRU
> + are not accounted. we just accounts pages under usual vm management.)
> +
> +RSS pages are accounted at page_fault unless they've already been accounted
> +for earlier. A file page will be accounted for as Page Cache when it's
> +inserted into inode (radix-tree). While it's mapped into the page tables of
> +processes, duplicate accounting is carefully avoided.
> +
> +A RSS page is unaccounted when it's fully unmapped. A PageCache page is
> +unaccounted when it's removed from radix-tree.
> +
> +At page migration, accounting information is kept.
> +
> +Note: we just account pages-on-lru because our purpose is to control amount
> +of used pages. not-on-lru pages are tend to be out-of-control from vm view.
>  
>  2.3 Shared Page Accounting
>  
> 
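
The accounting rules described in the quoted Documentation hunk can be
summarized in a toy userspace model. It is only a sketch under stated
assumptions: toy_page, toy_usage and every toy_*() helper are hypothetical
names, and the model ignores locking, migration and swap cache entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical page kinds: anon (RSS), page cache, driver/special. */
enum toy_kind { TOY_ANON, TOY_CACHE, TOY_SPECIAL };

struct toy_page {
	enum toy_kind kind;
	int mapcount;		/* number of page-table mappings */
	bool in_radix_tree;	/* page cache membership */
	bool charged;		/* already accounted to the cgroup? */
};

static long toy_usage;		/* pages accounted to the single toy cgroup */

/* Account a page once; repeated calls do not double-charge. */
static void toy_account(struct toy_page *p)
{
	if (!p->charged) {
		p->charged = true;
		toy_usage++;
	}
}

static void toy_unaccount(struct toy_page *p)
{
	if (p->charged) {
		p->charged = false;
		toy_usage--;
	}
}

/* Page cache pages are accounted when inserted into the radix-tree. */
static void toy_add_to_cache(struct toy_page *p)
{
	assert(p->kind == TOY_CACHE);
	p->in_radix_tree = true;
	toy_account(p);
}

/* ...and unaccounted when removed from it. */
static void toy_remove_from_cache(struct toy_page *p)
{
	assert(p->kind == TOY_CACHE);
	p->in_radix_tree = false;
	toy_unaccount(p);
}

/* Mapping charges anon pages only; cache and special pages are skipped. */
static void toy_map(struct toy_page *p)
{
	p->mapcount++;
	if (p->kind == TOY_ANON)
		toy_account(p);
}

/* An anon page is unaccounted once it is fully unmapped. */
static void toy_unmap(struct toy_page *p)
{
	assert(p->mapcount > 0);
	if (--p->mapcount == 0 && p->kind == TOY_ANON)
		toy_unaccount(p);
}
```

In this model, mapping a TOY_SPECIAL page never changes toy_usage, which
mirrors the intent of not accounting pages that are not on the LRU.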

Thread overview: 17+ messages
2008-09-29 10:19 [PATCH 0/4] memcg: ready-to-go series (was memcg update v6) KAMEZAWA Hiroyuki
2008-09-29 10:21 ` [PATCH 1/4] memcg: account swap cache under lock KAMEZAWA Hiroyuki
2008-09-29 11:33   ` Daisuke Nishimura
2008-09-30  8:05   ` Balbir Singh
2008-09-29 10:22 ` [PATCH 2/4] memcg: set page->mapping NULL before uncharge KAMEZAWA Hiroyuki
2008-09-29 11:39   ` Daisuke Nishimura
2008-10-01  3:50   ` Balbir Singh
2008-09-29 10:23 ` [PATCH 3/4] memcg: avoid account not-on-LRU pages KAMEZAWA Hiroyuki
2008-09-29 11:19   ` Daisuke Nishimura [this message]
2008-09-29 11:59   ` kamezawa.hiroyu
2008-09-30  1:17   ` [PATCH/stylefix " KAMEZAWA Hiroyuki
2008-10-01  3:49     ` Balbir Singh
2008-10-01  4:50       ` KAMEZAWA Hiroyuki
2008-09-30  8:14   ` [PATCH " Balbir Singh
2008-09-29 10:24 ` [PATCH 4/4] memcg: optimze cpustat KAMEZAWA Hiroyuki
2008-09-29 11:44   ` Daisuke Nishimura
2008-10-06 17:15 ` [PATCH 0/4] memcg: ready-to-go series (was memcg update v6) Balbir Singh
