From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
To: Marcelo Tosatti <marcelo@kvack.org>
Cc: kosaki.motohiro@jp.fujitsu.com, linux-mm@kvack.org,
"Daniel Sp蚣g" <daniel.spang@gmail.com>,
"Rik van Riel" <riel@redhat.com>,
"Andrew Morton" <akpm@linux-foundation.org>
Subject: [RFC][patch 1/2] mem notifications v3 improvement for large system
Date: Tue, 25 Dec 2007 19:31:14 +0900 [thread overview]
Message-ID: <20071225182144.D26D.KOSAKI.MOTOHIRO@jp.fujitsu.com> (raw)
In-Reply-To: <20071225164832.D267.KOSAKI.MOTOHIRO@jp.fujitsu.com>
Hi
I tried resolve too few notification problem.
mem_notify_status global variable mean wakeup 1 process.
it is too few.
improvement step1:
- add read method and wake up all process.
1. run >10000 process test
console1# LANG=C; while [ 1 ] ;do sleep 1; date; vmstat 1 1 -S M -a; done
console2# sh m.sh 12500
result:
- wakeup all unoccur neither thundering herd nor soft lock-up.
- no swap out occured.
- but too much free ;-)
in my test-case, over 5GB freed.
Wed Dec 26 03:19:20 JST 2007
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free inact active si so bi bo in cs us sy id wa st
7 0 0 605 209 12778 0 0 143 11 1458 183 14 10 76 1 0
Wed Dec 26 03:19:21 JST 2007
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free inact active si so bi bo in cs us sy id wa st
6 0 0 2687 209 10769 0 0 142 11 1459 188 14 10 75 1 0
Wed Dec 26 03:19:22 JST 2007
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free inact active si so bi bo in cs us sy id wa st
2 0 0 4560 209 8968 0 0 142 11 1459 191 14 10 75 1 0
Wed Dec 26 03:19:23 JST 2007
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free inact active si so bi bo in cs us sy id wa st
1 0 0 5857 209 7724 0 0 142 11 1457 192 14 10 75 1 0
Wed Dec 26 03:19:24 JST 2007
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free inact active si so bi bo in cs us sy id wa st
1 0 0 5872 209 7724 0 0 141 11 1454 192 14 10 75 1 0
Wed Dec 26 03:19:25 JST 2007
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free inact active si so bi bo in cs us sy id wa st
1 0 0 5884 209 7724 0 0 141 11 1451 192 14 10 75 1 0
Wed Dec 26 03:19:26 JST 2007
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free inact active si so bi bo in cs us sy id wa st
1 0 0 5895 209 7724 0 0 140 11 1448 191 14 10 75 1 0
Wed Dec 26 03:19:27 JST 2007
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free inact active si so bi bo in cs us sy id wa st
1 0 0 5904 209 7724 0 0 140 11 1445 191 14 10 75 1 0
Wed Dec 26 03:19:28 JST 2007
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free inact active si so bi bo in cs us sy id wa st
1 0 0 5912 209 7724 0 0 140 11 1442 190 13 10 75 1 0
Wed Dec 26 03:19:29 JST 2007
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free inact active si so bi bo in cs us sy id wa st
1 0 0 5920 209 7724 0 0 139 11 1439 190 13 10 75 1 0
Wed Dec 26 03:19:30 JST 2007
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free inact active si so bi bo in cs us sy id wa st
1 1 0 5929 209 7724 0 0 139 11 1436 189 13 10 75 1 0
Wed Dec 26 03:19:32 JST 2007
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free inact active si so bi bo in cs us sy id wa st
1 0 0 5935 209 7724 0 0 139 11 1433 189 13 10 75 1 0
Wed Dec 26 03:19:33 JST 2007
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free inact active si so bi bo in cs us sy id wa st
1 0 0 5940 209 7724 0 0 138 11 1430 188 13 10 75 1 0
Wed Dec 26 03:19:34 JST 2007
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free inact active si so bi bo in cs us sy id wa st
2 1 0 5948 209 7725 0 0 138 11 1427 188 13 10 75 1 0
Wed Dec 26 03:19:35 JST 2007
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free inact active si so bi bo in cs us sy id wa st
0 0 0 5676 209 8005 0 0 138 11 1425 188 13 10 75 1 0
Wed Dec 26 03:19:36 JST 2007
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free inact active si so bi bo in cs us sy id wa st
0 1 0 5676 209 8006 0 0 137 11 1422 188 13 10 75 1 0
Index: linux-2.6.23-mem_notify_v3/mm/mem_notify.c
===================================================================
--- linux-2.6.23-mem_notify_v3.orig/mm/mem_notify.c
+++ linux-2.6.23-mem_notify_v3/mm/mem_notify.c
@@ -13,7 +13,11 @@
#include <linux/percpu.h>
#include <linux/timer.h>
-static unsigned long mem_notify_status = 0;
+struct mem_notify_file_info {
+ long last_event;
+};
+
+atomic_t mem_notify_event = ATOMIC_INIT(0);
static DECLARE_WAIT_QUEUE_HEAD(mem_wait);
static DEFINE_PER_CPU(unsigned long, last_mem_notify) = INITIAL_JIFFIES;
@@ -28,53 +32,81 @@ void mem_notify_userspace(void)
if (time_after(now, target)) {
__get_cpu_var(last_mem_notify) = now;
- mem_notify_status = 1;
+ atomic_inc(&mem_notify_event);
wake_up(&mem_wait);
}
}
static int mem_notify_open(struct inode *inode, struct file *file)
{
- return 0;
+ struct mem_notify_file_info *ptr;
+ int err = 0;
+
+ ptr = kmalloc(sizeof(*ptr), GFP_KERNEL);
+ if (!ptr) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ ptr->last_event = atomic_read(&mem_notify_event);
+ file->private_data = ptr;
+
+out:
+ return err;
}
static int mem_notify_release(struct inode *inode, struct file *file)
{
+ kfree(file->private_data);
+
return 0;
}
static unsigned int mem_notify_poll(struct file *file, poll_table *wait)
{
unsigned int val = 0;
+ struct zone *zone;
+ int pages_high, pages_free, pages_reserve;
+ struct mem_notify_file_info *file_info = file->private_data;
poll_wait(file, &mem_wait, wait);
- if (mem_notify_status) {
- struct zone *zone;
- int pages_high, pages_free, pages_reserve;
-
- mem_notify_status = 0;
-
- /* check if its not a spurious/stale notification */
- pages_high = pages_free = pages_reserve = 0;
- for_each_zone(zone) {
- if (!populated_zone(zone) || is_highmem(zone))
- continue;
- pages_high += zone->pages_high;
- pages_free += zone_page_state(zone, NR_FREE_PAGES);
- pages_reserve += zone->lowmem_reserve[MAX_NR_ZONES-1];
- }
+ if (file_info->last_event == atomic_read(&mem_notify_event))
+ goto out;
- if (pages_free < (pages_high+pages_reserve)*2)
- val = POLLIN;
+ /* check if its not a spurious/stale notification */
+ pages_high = pages_free = pages_reserve = 0;
+ for_each_zone(zone) {
+ if (!populated_zone(zone) || is_highmem(zone))
+ continue;
+ pages_high += zone->pages_high;
+ pages_free += zone_page_state(zone, NR_FREE_PAGES);
+ pages_reserve += zone->lowmem_reserve[MAX_NR_ZONES-1];
}
-
+
+ if (pages_free < (pages_high+pages_reserve)*2)
+ val = POLLIN;
+
+out:
return val;
}
+static ssize_t mem_notify_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct mem_notify_file_info *file_info = file->private_data;
+ if (!file_info)
+ return -EINVAL;
+
+ file_info->last_event = atomic_read(&mem_notify_event);
+
+ return 0;
+}
+
struct file_operations mem_notify_fops = {
.open = mem_notify_open,
.release = mem_notify_release,
.poll = mem_notify_poll,
+ .read = mem_notify_read,
};
EXPORT_SYMBOL(mem_notify_fops);
/kosaki
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2007-12-25 10:31 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-12-24 20:32 [PATCH] mem notifications v3 Marcelo Tosatti
2007-12-25 3:47 ` KOSAKI Motohiro
2007-12-25 4:56 ` [RFC] add poll_wait_exclusive() API KOSAKI Motohiro
2007-12-27 21:05 ` Marcelo Tosatti
2007-12-25 8:31 ` [PATCH] mem notifications v3 KOSAKI Motohiro
2007-12-25 10:31 ` KOSAKI Motohiro [this message]
2007-12-27 21:04 ` [RFC][patch 1/2] mem notifications v3 improvement for large system Marcelo Tosatti
2007-12-28 0:38 ` KOSAKI Motohiro
2007-12-25 10:31 ` [RFC][patch 2/2] " KOSAKI Motohiro
2007-12-25 10:41 ` KOSAKI Motohiro
2007-12-27 4:49 ` [RFC][patch] mem_notify more faster reduce load average KOSAKI Motohiro
2007-12-27 20:13 ` [PATCH] mem notifications v3 Marcelo Tosatti
2007-12-28 1:44 ` KOSAKI Motohiro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20071225182144.D26D.KOSAKI.MOTOHIRO@jp.fujitsu.com \
--to=kosaki.motohiro@jp.fujitsu.com \
--cc=akpm@linux-foundation.org \
--cc=daniel.spang@gmail.com \
--cc=linux-mm@kvack.org \
--cc=marcelo@kvack.org \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox