Date: Fri, 11 Dec 2009 09:51:59 +0900
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [RFC mm][PATCH 2/5] percpu cached mm counter
Message-Id: <20091211095159.6472a009.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <28c262360912101640y4b90db76w61a7a5dab5f8e796@mail.gmail.com>
References: <20091210163115.463d96a3.kamezawa.hiroyu@jp.fujitsu.com>
	<20091210163448.338a0bd2.kamezawa.hiroyu@jp.fujitsu.com>
	<28c262360912101640y4b90db76w61a7a5dab5f8e796@mail.gmail.com>
To: Minchan Kim
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	cl@linux-foundation.org,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	mingo@elte.hu

On Fri, 11 Dec 2009 09:40:07 +0900
Minchan Kim wrote:

> >  static inline unsigned long get_mm_counter(struct mm_struct *mm, int member)
> >  {
> > -       return (unsigned long)atomic_long_read(&(mm)->counters[member]);
> > +       long ret;
> > +       /*
> > +        * Because this counter is loosely synchronized with percpu cached
> > +        * information, it's possible that value gets to be minus. For user's
> > +        * convenience/sanity, avoid returning minus.
> > +        */
> > +       ret = atomic_long_read(&(mm)->counters[member]);
> > +       if (unlikely(ret < 0))
> > +               return 0;
> > +       return (unsigned long)ret;
> >  }
>
> Now, your sync point is only at task switching time,
> so we can't show the exact number if much counting on the mm happens
> in a short time (i.e., before the next context switch).
> Doesn't that matter?
>

I think it's not a problem, for 2 reasons.

1. Consider servers which require continuous memory usage monitoring
   with ps/top: when there are 2000 processes, "ps -elf" takes 0.8sec.
   Because system admins know that gathering process information consumes
   some amount of cpu resource, they will not do that so frequently. (I hope)

2. When chains of page faults occur continuously over a period, a monitor
   of memory usage just sees a snapshot of the current numbers, and
   "a snapshot of what moment" is always random. No one can get a precise
   number in that kind of situation.
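
To illustrate, a minimal user-space sketch of the situation above; it is
not the kernel code, and all names in it (global_counter, cpu_cache,
flush_cpu) are made up for illustration. It shows how per-cpu deltas
flushed in an arbitrary order can make the shared counter transiently
negative, which is exactly why the get_mm_counter() above clamps to 0:

#include <stdio.h>

static long global_counter;     /* plays the role of mm->counters[member] */
static long cpu_cache[2];       /* per-cpu cached deltas, not yet synced  */

static void flush_cpu(int cpu)  /* what the sync at task switch would do  */
{
        global_counter += cpu_cache[cpu];
        cpu_cache[cpu] = 0;
}

static unsigned long read_counter(void)
{
        long ret = global_counter;

        /* same clamp as get_mm_counter() above: never report minus */
        if (ret < 0)
                return 0;
        return (unsigned long)ret;
}

int main(void)
{
        cpu_cache[0] = +3;      /* CPU0 faulted 3 pages in          */
        cpu_cache[1] = -3;      /* CPU1 unmapped those same 3 pages */

        flush_cpu(1);           /* the decrements happen to land first */
        printf("raw %ld, clamped %lu\n", global_counter, read_counter());

        flush_cpu(0);           /* the increments catch up later */
        printf("raw %ld, clamped %lu\n", global_counter, read_counter());
        return 0;
}

This prints "raw -3, clamped 0" and then "raw 0, clamped 0": the raw
value is briefly wrong, but never reported as a huge unsigned number.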

> >
> >  static inline void add_mm_counter(struct mm_struct *mm, int member, long value)
> >
> > Index: mmotm-2.6.32-Dec8/kernel/sched.c
> > ===================================================================
> > --- mmotm-2.6.32-Dec8.orig/kernel/sched.c
> > +++ mmotm-2.6.32-Dec8/kernel/sched.c
> > @@ -2858,6 +2858,7 @@ context_switch(struct rq *rq, struct tas
> >         trace_sched_switch(rq, prev, next);
> >         mm = next->mm;
> >         oldmm = prev->active_mm;
> > +
> >         /*
> >          * For paravirt, this is coupled with an exit in switch_to to
> >          * combine the page table reload and the switch backend into
> >
> > @@ -5477,6 +5478,11 @@ need_resched_nonpreemptible:
> >
> >         if (sched_feat(HRTICK))
> >                 hrtick_clear(rq);
> > +       /*
> > +        * sync/invaldidate per-cpu cached mm related information
> > +        * before taling rq->lock. (see include/linux/mm.h)
>
> taling => taking
>
> > +        */
> > +       sync_mm_counters_atomic();
>
> It's my above concern:
> before the process is scheduled out, we could get wrong info.
> Isn't that a realistic problem?
>

I think not, for now.

Thanks,
-Kame
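
As a footnote, a minimal user-space sketch of the trade-off discussed
above; shared_counter, fault_in_page() and sync_at_switch() are made-up
stand-ins, not the actual kernel implementation. Updates stay in a
task-local cache and only reach the shared counter when the switch-time
sync runs, so a reader in between sees a stale value, which is the
staleness being judged acceptable in this thread:

#include <stdio.h>

static long shared_counter;     /* plays the role of the mm-wide counter */

struct task {
        long cached;            /* task-local cache of not-yet-synced updates */
};

static void fault_in_page(struct task *t)
{
        t->cached++;            /* fast path: the shared counter is untouched */
}

static void sync_at_switch(struct task *prev)
{
        /* what sync_mm_counters_atomic() is called for in the hunk above */
        shared_counter += prev->cached;
        prev->cached = 0;
}

int main(void)
{
        struct task t = { 0 };
        int i;

        for (i = 0; i < 1000; i++)
                fault_in_page(&t);

        /* a reader before the switch sees a stale value (Minchan's concern) */
        printf("before switch: %ld\n", shared_counter);

        sync_at_switch(&t);
        printf("after switch:  %ld\n", shared_counter);
        return 0;
}

Here the reader sees 0 until the switch-time sync runs, then 1000: the
per-task caching makes the fault path cheap at the cost of counters
that are only as fresh as the last context switch.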