From: Denis Vlasenko <vda@port.imtp.ilyichevsk.odessa.ua>
To: William Lee Irwin III <wli@holomorphy.com>,
	Rik van Riel <riel@conectiva.com.br>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC][PATCH] iowait statistics
Date: Wed, 15 May 2002 15:17:26 -0200
Message-ID: <200205151214.g4FCEqY13273@Port.imtp.ilyichevsk.odessa.ua>
In-Reply-To: <20020514165414.GC27957@holomorphy.com>

On 14 May 2002 14:54, William Lee Irwin III wrote:
> On Tue, 14 May 2002, William Lee Irwin III wrote:
> >> This appears to be global across all cpu's. Maybe nr_iowait_tasks
> >> should be accounted on a per-cpu basis, where
>
> On Tue, May 14, 2002 at 01:36:00PM -0300, Rik van Riel wrote:
> > While your proposal should work, somehow I doubt it's worth
> > the complexity. It's just a statistic to help sysadmins ;)
>
> I reserved judgment on that in order to present a possible mechanism.
> I'm not sure it is either; we'll know it matters if sysadmins scream.

Hi Rik,

Since you are working on this piece of the kernel:

I was investigating why top sometimes shows an idle % like
9384729374923.43%. It was caused by the idle count in /proc/stat
sometimes going backward.
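
To illustrate why a backward step of a single tick can show up as an absurd percentage, here is a minimal sketch. It assumes a monitoring tool computes the idle delta between two /proc/stat samples in unsigned arithmetic; idle_percent() is a hypothetical helper for illustration, not code taken from top.

```c
#include <assert.h>

/*
 * Hypothetical helper (not from top): percentage of idle ticks between
 * two /proc/stat samples, computed with unsigned arithmetic.
 */
double idle_percent(unsigned long idle_prev, unsigned long idle_now,
		    unsigned long total_prev, unsigned long total_now)
{
	/* If idle went backward, this subtraction wraps to a huge value. */
	unsigned long d_idle = idle_now - idle_prev;
	unsigned long d_total = total_now - total_prev;

	assert(d_total != 0);
	return 100.0 * (double)d_idle / (double)d_total;
}
```

With a well-behaved pair of samples the result stays within 0..100. If idle_now is one less than idle_prev, d_idle wraps around to ULONG_MAX and the "percentage" becomes astronomically large.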

I found the race responsible for that and have a fix for it
(attached below). It checks for a change in jiffies and regenerates
the stats if a jiffies++ hit us in the middle.

Unfortunately it is for the UP case only; on SMP the race still exists,
even with an SMP kernel on a UP box.

Why: system/user/idle[/iowait] stats are collected at the timer int
on UP but _at the local APIC int_ on SMP, so watching jiffies does
not tell us whether the counters changed under us.

It can be fixed for SMP:
* add spinlock
or
* add per_cpu_idle, account it too at timer/APIC int
  and get rid of idle % calculations for /proc/stat
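
The second option can be sketched as follows. This is an illustrative userspace model only: struct cpu_ticks, account_tick(), total_idle() and SKETCH_NR_CPUS are hypothetical names, not the kernel's kstat layout. The point is that idle becomes a counter that only ever increments, so a reader never derives it from jiffies and cannot see it step backward, although the four counters together are still not an atomic snapshot.

```c
#include <assert.h>

#define SKETCH_NR_CPUS 4	/* assumption: small fixed CPU count for the sketch */

enum tick_kind { TICK_USER, TICK_NICE, TICK_SYSTEM, TICK_IDLE };

/* Hypothetical per-CPU counters, with idle stored instead of derived. */
struct cpu_ticks {
	unsigned long user, nice, system, idle;
};

static struct cpu_ticks ticks[SKETCH_NR_CPUS];

/* Called once per tick on the CPU that took the timer/APIC interrupt. */
void account_tick(int cpu, enum tick_kind kind)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	switch (kind) {
	case TICK_USER:   ticks[cpu].user++;   break;
	case TICK_NICE:   ticks[cpu].nice++;   break;
	case TICK_SYSTEM: ticks[cpu].system++; break;
	case TICK_IDLE:   ticks[cpu].idle++;   break;
	}
}

/* Reader side: idle is read directly, never computed from jiffies. */
unsigned long total_idle(int ncpus)
{
	unsigned long sum = 0;
	int i;

	assert(ncpus >= 0 && ncpus <= SKETCH_NR_CPUS);
	for (i = 0; i < ncpus; i++)
		sum += ticks[i].idle;
	return sum;
}
```

The cost is one extra increment per tick per CPU; in exchange, each reported field is monotonic on its own.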

As a user, I vote for glitchless statistics even if they
cost an extra i++ at every timer int on every CPU.

Now you hear the very first scream :-)
--
vda

--- fs/proc/proc_misc.c.orig	Wed Nov 21 03:29:09 2001
+++ fs/proc/proc_misc.c	Thu Apr 25 13:57:55 2002
@@ -239,38 +239,47 @@
 				 int count, int *eof, void *data)
 {
 	int i, len;
-	extern unsigned long total_forks;
-	unsigned long jif = jiffies;
-	unsigned int sum = 0, user = 0, nice = 0, system = 0;
+	extern unsigned long total_forks; /*FIXME: move into a .h */
+	unsigned long jif, sum, user, nice, system;
 	int major, disk;

-	for (i = 0 ; i < smp_num_cpus; i++) {
-		int cpu = cpu_logical_map(i), j;
-
-		user += kstat.per_cpu_user[cpu];
-		nice += kstat.per_cpu_nice[cpu];
-		system += kstat.per_cpu_system[cpu];
+	do {
+		jif=jiffies;
+		sum = user = nice = system = 0;
+		for (i = 0 ; i < smp_num_cpus; i++) {
+			int cpu = cpu_logical_map(i), j;
+			user += kstat.per_cpu_user[cpu];
+			nice += kstat.per_cpu_nice[cpu];
+			system += kstat.per_cpu_system[cpu];
 #if !defined(CONFIG_ARCH_S390)
-		for (j = 0 ; j < NR_IRQS ; j++)
-			sum += kstat.irqs[cpu][j];
+			for (j = 0 ; j < NR_IRQS ; j++)
+				sum += kstat.irqs[cpu][j];
 #endif
-	}
-
-	len = sprintf(page, "cpu  %u %u %u %lu\n", user, nice, system,
-		      jif * smp_num_cpus - (user + nice + system));
-	for (i = 0 ; i < smp_num_cpus; i++)
-		len += sprintf(page + len, "cpu%d %u %u %u %lu\n",
-			i,
-			kstat.per_cpu_user[cpu_logical_map(i)],
-			kstat.per_cpu_nice[cpu_logical_map(i)],
-			kstat.per_cpu_system[cpu_logical_map(i)],
-			jif - (  kstat.per_cpu_user[cpu_logical_map(i)] \
-				   + kstat.per_cpu_nice[cpu_logical_map(i)] \
-				   + kstat.per_cpu_system[cpu_logical_map(i)]));
+		}
+
+		len = sprintf(page, "cpu  %lu %lu %lu %lu\n",
+			    user, nice, system,
+			    jif*smp_num_cpus - (user+nice+system)
+			    );
+		for (i = 0 ; i < smp_num_cpus; i++) {
+			int cpu = cpu_logical_map(i);
+			len += sprintf(page + len, "cpu%d %lu %lu %lu %lu\n",
+				i,
+				(unsigned long)kstat.per_cpu_user[cpu],
+				(unsigned long)kstat.per_cpu_nice[cpu],
+				(unsigned long)kstat.per_cpu_system[cpu],
+				jif - ( kstat.per_cpu_user[cpu]
+					+ kstat.per_cpu_nice[cpu]
+					+ kstat.per_cpu_system[cpu]));
+		}
+	} while(jif!=jiffies); /* regenerate if there was a timer interrupt */
+				/* TODO: check SMP case: SMP uses local APIC ints
+				for kstat updates, not a timer int... */
+
 	len += sprintf(page + len,
 		"page %u %u\n"
 		"swap %u %u\n"
-		"intr %u",
+		"intr %lu",
 			kstat.pgpgin >> 1,
 			kstat.pgpgout >> 1,
 			kstat.pswpin,
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/
