linux-mm.kvack.org archive mirror
* [RFC][PATCH 0/8] mem_notify v5
@ 2008-01-24  4:18 KOSAKI Motohiro
  2008-01-24  4:19 ` [RFC][PATCH 1/8] mem_notify v5: introduce poll_wait_exclusive() new API KOSAKI Motohiro
                   ` (7 more replies)
  0 siblings, 8 replies; 11+ messages in thread
From: KOSAKI Motohiro @ 2008-01-24  4:18 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: kosaki.motohiro, Marcelo Tosatti, Daniel Spang, Rik van Riel,
	Andrew Morton, Alan Cox

Hi

/dev/mem_notify is a low memory notification device.
It can avoid swapping and OOM by cooperating with user processes.

You need not be annoyed by OOM any longer :)
Any comments are welcome!

patch list
	[1/8] introduce poll_wait_exclusive() new API
	[2/8] introduce wake_up_locked_nr() new API
	[3/8] introduce /dev/mem_notify new device (the core of this patch series)
	[4/8] memory_pressure_notify() caller
	[5/8] add new mem_notify field to /proc/zoneinfo
	[6/8] (optional) fixed incorrect shrink_zone
	[7/8] ignore very small zone for prevent incorrect low mem notify.
	[8/8] support fasync feature


related discussion:
--------------------------------------------------------------
  LKML OOM notifications requirement discussion
     http://www.gossamer-threads.com/lists/linux/kernel/832802?nohighlight=1#832802
  OOM notifications patch [Marcelo Tosatti]
     http://marc.info/?l=linux-kernel&m=119273914027743&w=2
  mem notifications v3 [Marcelo Tosatti]
     http://marc.info/?l=linux-mm&m=119852828327044&w=2
  Thrashing notification patch  [Daniel Spang]
     http://marc.info/?l=linux-mm&m=119427416315676&w=2
  mem notification v4 [kosaki]
     http://marc.info/?l=linux-mm&m=120035840523718&w=2


Changelog
-------------------------------------------------
  v4 -> v5 (by KOSAKI Motohiro)
    o rebase to 2.6.24-rc8-mm1
    o change display order of /proc/zoneinfo
    o ignore very small zone
    o support fcntl(F_SETFL, FASYNC)
    o fix some trivial bugs.

  v3 -> v4 (by KOSAKI Motohiro)
    o rebase to 2.6.24-rc6-mm1
    o avoid wake up all.
    o add judgement point to __free_one_page().
    o add zone awareness.

  v2 -> v3 (by Marcelo Tosatti)
    o changes the notification point to happen whenever
      the VM moves an anonymous page to the inactive list.
    o implement notification rate limit.

  v1(oom notify) -> v2 (by Marcelo Tosatti)
    o name change
    o notify timing change from just swap thrashing to
      just before thrashing.
    o also works with swapless device.





--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* [RFC][PATCH 1/8] mem_notify v5: introduce poll_wait_exclusive() new API
  2008-01-24  4:18 [RFC][PATCH 0/8] mem_notify v5 KOSAKI Motohiro
@ 2008-01-24  4:19 ` KOSAKI Motohiro
  2008-01-24  4:20 ` [RFC][PATCH 2/8] mem_notify v5: introduce wake_up_locked_nr() " KOSAKI Motohiro
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: KOSAKI Motohiro @ 2008-01-24  4:19 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: kosaki.motohiro, Marcelo Tosatti, Daniel Spang, Rik van Riel,
	Andrew Morton, Alan Cox

There are two ways to add an item to a wait queue:
  1. add_wait_queue()
  2. add_wait_queue_exclusive()
and add_wait_queue_exclusive() is a very useful API.

Unfortunately, poll_wait() has no exclusive counterpart.
This means there is no way to wake up only one of the polling processes:
wake_up() wakes every process sleeping via poll_wait(), not just one.

This patch introduces a new API, poll_wait_exclusive(), which allows waking up only one process.

<example of usage>
unsigned int kosaki_poll(struct file *file,
		         struct poll_table_struct *wait)
{
	poll_wait_exclusive(file, &kosaki_wait_queue, wait);
	if (data_exist)
		return POLLIN | POLLRDNORM;
	return 0;
}
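
The difference between the two kinds of waiters can be illustrated in user
space. The following is only a sketch, not kernel code: struct waiter and
model_wake_up() are hypothetical names, and the queue is modelled as a plain
array. It assumes the usual wait-queue rule that a wake-up wakes every
non-exclusive waiter it passes and stops after nr_exclusive exclusive
waiters, with 0 meaning "no limit".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of a wait-queue entry. */
struct waiter {
	int exclusive;	/* 1 if queued the way poll_wait_exclusive() would */
	int woken;	/* set to 1 by model_wake_up() */
};

/*
 * Walk the queue in order and wake each waiter.  The walk stops once
 * nr_exclusive exclusive waiters have been woken; nr_exclusive == 0
 * means "no limit".  Returns the number of waiters woken.
 */
static int model_wake_up(struct waiter *q, size_t len, int nr_exclusive)
{
	int woken = 0;
	size_t i;

	assert(q != NULL || len == 0);

	for (i = 0; i < len; i++) {
		q[i].woken = 1;
		woken++;
		if (q[i].exclusive && !--nr_exclusive)
			break;
	}
	return woken;
}
```

With three non-exclusive waiters, model_wake_up(q, 3, 1) wakes all three.
With three exclusive waiters it wakes only the first one, which is the
behaviour poll_wait_exclusive() is meant to make available to poll users.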


Signed-off-by: Marcelo Tosatti <marcelo@kvack.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

---
 fs/eventpoll.c       |    7 +++++--
 fs/select.c          |    9 ++++++---
 include/linux/poll.h |   11 +++++++++--
 3 files changed, 20 insertions(+), 7 deletions(-)



Index: linux-2.6.24-rc6-mm1-memnotify/fs/eventpoll.c
===================================================================
--- linux-2.6.24-rc6-mm1-memnotify.orig/fs/eventpoll.c	2008-01-17 18:28:15.000000000 +0900
+++ linux-2.6.24-rc6-mm1-memnotify/fs/eventpoll.c	2008-01-17 18:55:47.000000000 +0900
@@ -675,7 +675,7 @@ out_unlock:
  * target file wakeup lists.
  */
 static void ep_ptable_queue_proc(struct file *file, wait_queue_head_t *whead,
-				 poll_table *pt)
+				 poll_table *pt, int exclusive)
 {
 	struct epitem *epi = ep_item_from_epqueue(pt);
 	struct eppoll_entry *pwq;
@@ -684,7 +684,10 @@ static void ep_ptable_queue_proc(struct 
 		init_waitqueue_func_entry(&pwq->wait, ep_poll_callback);
 		pwq->whead = whead;
 		pwq->base = epi;
-		add_wait_queue(whead, &pwq->wait);
+		if (exclusive)
+			add_wait_queue_exclusive(whead, &pwq->wait);
+		else
+			add_wait_queue(whead, &pwq->wait);
 		list_add_tail(&pwq->llink, &epi->pwqlist);
 		epi->nwait++;
 	} else {
Index: linux-2.6.24-rc6-mm1-memnotify/fs/select.c
===================================================================
--- linux-2.6.24-rc6-mm1-memnotify.orig/fs/select.c	2008-01-17 18:28:23.000000000 +0900
+++ linux-2.6.24-rc6-mm1-memnotify/fs/select.c	2008-01-17 18:55:47.000000000 +0900
@@ -48,7 +48,7 @@ struct poll_table_page {
  * poll table.
  */
 static void __pollwait(struct file *filp, wait_queue_head_t *wait_address,
-		       poll_table *p);
+		       poll_table *p, int exclusive);
 
 void poll_initwait(struct poll_wqueues *pwq)
 {
@@ -117,7 +117,7 @@ static struct poll_table_entry *poll_get
 
 /* Add a new entry */
 static void __pollwait(struct file *filp, wait_queue_head_t *wait_address,
-				poll_table *p)
+		       poll_table *p, int exclusive)
 {
 	struct poll_table_entry *entry = poll_get_entry(p);
 	if (!entry)
@@ -126,7 +126,10 @@ static void __pollwait(struct file *filp
 	entry->filp = filp;
 	entry->wait_address = wait_address;
 	init_waitqueue_entry(&entry->wait, current);
-	add_wait_queue(wait_address, &entry->wait);
+	if (exclusive)
+		add_wait_queue_exclusive(wait_address, &entry->wait);
+	else
+		add_wait_queue(wait_address, &entry->wait);
 }
 
 #define FDS_IN(fds, n)		(fds->in + n)
Index: linux-2.6.24-rc6-mm1-memnotify/include/linux/poll.h
===================================================================
--- linux-2.6.24-rc6-mm1-memnotify.orig/include/linux/poll.h	2008-01-17 18:28:32.000000000 +0900
+++ linux-2.6.24-rc6-mm1-memnotify/include/linux/poll.h	2008-01-17 18:55:47.000000000 +0900
@@ -28,7 +28,8 @@ struct poll_table_struct;
 /* 
  * structures and helpers for f_op->poll implementations
  */
-typedef void (*poll_queue_proc)(struct file *, wait_queue_head_t *, struct poll_table_struct *);
+typedef void (*poll_queue_proc)(struct file *, wait_queue_head_t *,
+				struct poll_table_struct *, int);
 
 typedef struct poll_table_struct {
 	poll_queue_proc qproc;
@@ -37,7 +38,13 @@ typedef struct poll_table_struct {
 static inline void poll_wait(struct file * filp, wait_queue_head_t * wait_address, poll_table *p)
 {
 	if (p && wait_address)
-		p->qproc(filp, wait_address, p);
+		p->qproc(filp, wait_address, p, 0);
+}
+
+static inline void poll_wait_exclusive(struct file *filp, wait_queue_head_t *wait_address, poll_table *p)
+{
+	if (p && wait_address)
+		p->qproc(filp, wait_address, p, 1);
 }
 
 static inline void init_poll_funcptr(poll_table *pt, poll_queue_proc qproc)



* [RFC][PATCH 2/8] mem_notify v5: introduce wake_up_locked_nr() new API
  2008-01-24  4:18 [RFC][PATCH 0/8] mem_notify v5 KOSAKI Motohiro
  2008-01-24  4:19 ` [RFC][PATCH 1/8] mem_notify v5: introduce poll_wait_exclusive() new API KOSAKI Motohiro
@ 2008-01-24  4:20 ` KOSAKI Motohiro
  2008-01-24  4:21 ` [RFC][PATCH 3/8] mem_notify v5: introduce /dev/mem_notify new device (the core of this patch series) KOSAKI Motohiro
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: KOSAKI Motohiro @ 2008-01-24  4:20 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: kosaki.motohiro, Marcelo Tosatti, Daniel Spang, Rik van Riel,
	Andrew Morton, Alan Cox

Introduce the new APIs wake_up_locked_nr() and wake_up_locked_all().
They are similar to wake_up_nr() and wake_up_all(), but they do not take
the wait queue lock; the caller must already hold it.
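
The naming convention can be sketched in user space as follows. This is a
model only: struct model_wq, model_wake_up_nr() and
model_wake_up_locked_nr() are hypothetical names, and a plain flag stands in
for the spinlock inside wait_queue_head_t. It shows that the _locked variant
does the same work but relies on the caller already holding the lock.

```c
#include <assert.h>

/* Hypothetical stand-in for wait_queue_head_t. */
struct model_wq {
	int locked;	/* 1 while the queue lock is held */
	int sleepers;	/* number of exclusive waiters currently asleep */
};

/* Caller must hold the lock.  Wakes up to nr sleepers; nr == 0 wakes all. */
static int model_wake_up_locked_nr(struct model_wq *q, int nr)
{
	int n;

	assert(q->locked);	/* the _locked variant never takes the lock */
	n = (nr == 0 || nr > q->sleepers) ? q->sleepers : nr;
	q->sleepers -= n;
	return n;
}

/* Unlocked variant: take the lock, do the same work, drop the lock. */
static int model_wake_up_nr(struct model_wq *q, int nr)
{
	int n;

	assert(!q->locked);
	q->locked = 1;
	n = model_wake_up_locked_nr(q, nr);
	q->locked = 0;
	return n;
}
```

A caller that already holds the lock for other reasons, as
__memory_pressure_notify() does later in this series, would use the _locked
form to avoid taking the same lock twice.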

Signed-off-by: Marcelo Tosatti <marcelo@kvack.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

---
 include/linux/wait.h |    7 +++++--
 kernel/sched.c       |    5 +++--
 2 files changed, 8 insertions(+), 4 deletions(-)

Index: linux-2.6.24-rc6-mm1-memnotify/include/linux/wait.h
===================================================================
--- linux-2.6.24-rc6-mm1-memnotify.orig/include/linux/wait.h	2008-01-17 18:28:33.000000000 +0900
+++ linux-2.6.24-rc6-mm1-memnotify/include/linux/wait.h	2008-01-17 18:56:16.000000000 +0900
@@ -142,7 +142,7 @@ static inline void __remove_wait_queue(w
 }
 
 void FASTCALL(__wake_up(wait_queue_head_t *q, unsigned int mode, int nr, void *key));
-extern void FASTCALL(__wake_up_locked(wait_queue_head_t *q, unsigned int mode));
+void FASTCALL(__wake_up_locked(wait_queue_head_t *q, unsigned int mode, int nr, void *key));
 extern void FASTCALL(__wake_up_sync(wait_queue_head_t *q, unsigned int mode, int nr));
 void FASTCALL(__wake_up_bit(wait_queue_head_t *, void *, int));
 int FASTCALL(__wait_on_bit(wait_queue_head_t *, struct wait_bit_queue *, int (*)(void *), unsigned));
@@ -155,7 +155,10 @@ wait_queue_head_t *FASTCALL(bit_waitqueu
 #define wake_up(x)			__wake_up(x, TASK_NORMAL, 1, NULL)
 #define wake_up_nr(x, nr)		__wake_up(x, TASK_NORMAL, nr, NULL)
 #define wake_up_all(x)			__wake_up(x, TASK_NORMAL, 0, NULL)
-#define wake_up_locked(x)		__wake_up_locked((x), TASK_NORMAL)
+
+#define wake_up_locked(x)		__wake_up_locked((x), TASK_NORMAL, 1, NULL)
+#define wake_up_locked_nr(x, nr)	__wake_up_locked((x), TASK_NORMAL, nr, NULL)
+#define wake_up_locked_all(x)		__wake_up_locked((x), TASK_NORMAL, 0, NULL)
 
 #define wake_up_interruptible(x)	__wake_up(x, TASK_INTERRUPTIBLE, 1, NULL)
 #define wake_up_interruptible_nr(x, nr)	__wake_up(x, TASK_INTERRUPTIBLE, nr, NULL)
Index: linux-2.6.24-rc6-mm1-memnotify/kernel/sched.c
===================================================================
--- linux-2.6.24-rc6-mm1-memnotify.orig/kernel/sched.c	2008-01-17 18:31:12.000000000 +0900
+++ linux-2.6.24-rc6-mm1-memnotify/kernel/sched.c	2008-01-17 18:56:16.000000000 +0900
@@ -3837,9 +3837,10 @@ EXPORT_SYMBOL(__wake_up);
 /*
  * Same as __wake_up but called with the spinlock in wait_queue_head_t held.
  */
-void __wake_up_locked(wait_queue_head_t *q, unsigned int mode)
+void __wake_up_locked(wait_queue_head_t *q, unsigned int mode,
+		      int nr_exclusive, void *key)
 {
-	__wake_up_common(q, mode, 1, 0, NULL);
+	__wake_up_common(q, mode, nr_exclusive, 0, key);
 }
 
 /**



* [RFC][PATCH 3/8] mem_notify v5: introduce /dev/mem_notify new device (the core of this patch series)
  2008-01-24  4:18 [RFC][PATCH 0/8] mem_notify v5 KOSAKI Motohiro
  2008-01-24  4:19 ` [RFC][PATCH 1/8] mem_notify v5: introduce poll_wait_exclusive() new API KOSAKI Motohiro
  2008-01-24  4:20 ` [RFC][PATCH 2/8] mem_notify v5: introduce wake_up_locked_nr() " KOSAKI Motohiro
@ 2008-01-24  4:21 ` KOSAKI Motohiro
  2008-01-24 12:19   ` Daniel Spång
  2008-01-24  4:22 ` [RFC][PATCH 4/8] mem_notify v5: memory_pressure_notify() caller KOSAKI Motohiro
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 11+ messages in thread
From: KOSAKI Motohiro @ 2008-01-24  4:21 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: kosaki.motohiro, Marcelo Tosatti, Daniel Spang, Rik van Riel,
	Andrew Morton, Alan Cox

This is the core of the patch series.
Add the /dev/mem_notify device, which notifies user processes of low memory.
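
As a rough illustration of how many waiters a single notification wakes, the
calculation in __memory_pressure_notify() from this patch can be restated as
a stand-alone function. The name mem_notify_nr_wakeup() is made up for this
sketch; the formula (one sixteenth of the watchers plus one, capped at 100)
is taken from the patch itself.

```c
#include <assert.h>

/* Number of waiters to wake for a given number of open watchers. */
static int mem_notify_nr_wakeup(int nr_watcher)
{
	int nr_wakeup;

	assert(nr_watcher >= 0);

	nr_wakeup = (nr_watcher >> 4) + 1;	/* 1/16 of the watchers, plus one */
	if (nr_wakeup > 100)			/* never wake more than 100 at once */
		nr_wakeup = 100;
	return nr_wakeup;
}
```

So a system with up to 15 watchers wakes one process per notification, and
the number grows slowly with the number of watchers instead of waking all of
them at once.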

<usage example>

        fd = open("/dev/mem_notify", O_RDONLY);
        if (fd < 0) {
                exit(1);
        }
        pollfds.fd = fd;
        pollfds.events = POLLIN;
        pollfds.revents = 0;
        err = poll(&pollfds, 1, -1); // wake up at low memory

        ...
</usage example>
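
A slightly fuller version of the usage example, with error handling, might
look like the sketch below. The helper name wait_for_low_memory() and its
return convention are assumptions made for illustration; only open() and
poll() with POLLIN on the device come from the example above.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/*
 * Open the notification device at path and wait up to timeout_ms
 * milliseconds (-1 waits forever).
 * Returns 1 on a low memory notification, 0 on timeout, -1 on error.
 */
static int wait_for_low_memory(const char *path, int timeout_ms)
{
	struct pollfd pfd;
	int fd, ret;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	ret = poll(&pfd, 1, timeout_ms);
	close(fd);

	if (ret < 0)
		return -1;
	if (ret == 0)
		return 0;
	/* Anything other than POLLIN (e.g. POLLERR) is treated as an error. */
	return (pfd.revents & POLLIN) ? 1 : -1;
}
```

An application would call wait_for_low_memory("/dev/mem_notify", -1) and
release caches when it returns 1. A long-running process would normally keep
the descriptor open and call poll() in a loop; the helper opens and closes
it on every call only to keep the sketch short.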


Signed-off-by: Marcelo Tosatti <marcelo@kvack.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

---
 Documentation/devices.txt  |    1 
 drivers/char/mem.c         |    6 ++
 include/linux/mem_notify.h |   42 ++++++++++++++++
 include/linux/mmzone.h     |    1 
 mm/Makefile                |    2 
 mm/mem_notify.c            |  114 +++++++++++++++++++++++++++++++++++++++++++++
 mm/page_alloc.c            |    1 
 7 files changed, 166 insertions(+), 1 deletion(-)

Index: b/drivers/char/mem.c
===================================================================
--- a/drivers/char/mem.c	2008-01-23 19:21:34.000000000 +0900
+++ b/drivers/char/mem.c	2008-01-23 21:12:44.000000000 +0900
@@ -34,6 +34,8 @@
 # include <linux/efi.h>
 #endif
 
+extern struct file_operations mem_notify_fops;
+
 /*
  * Architectures vary in how they handle caching for addresses
  * outside of main memory.
@@ -869,6 +871,9 @@ static int memory_open(struct inode * in
 			filp->f_op = &oldmem_fops;
 			break;
 #endif
+		case 13:
+			filp->f_op = &mem_notify_fops;
+			break;
 		default:
 			return -ENXIO;
 	}
@@ -901,6 +906,7 @@ static const struct {
 #ifdef CONFIG_CRASH_DUMP
 	{12,"oldmem",    S_IRUSR | S_IWUSR | S_IRGRP, &oldmem_fops},
 #endif
+	{13,"mem_notify", S_IRUGO, &mem_notify_fops},
 };
 
 static struct class *mem_class;
Index: b/include/linux/mem_notify.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ b/include/linux/mem_notify.h	2008-01-23 23:09:32.000000000 +0900
@@ -0,0 +1,42 @@
+/*
+ * Notify applications of memory pressure via /dev/mem_notify
+ *
+ * Copyright (C) 2008 Marcelo Tosatti <marcelo@kvack.org>,
+ *                    KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
+ *
+ * Released under the GPL, see the file COPYING for details.
+ */
+
+#ifndef _LINUX_MEM_NOTIFY_H
+#define _LINUX_MEM_NOTIFY_H
+
+#define MEM_NOTIFY_FREQ (HZ/5)
+
+extern atomic_long_t last_mem_notify;
+
+extern void __memory_pressure_notify(struct zone *zone, int pressure);
+
+
+static inline void memory_pressure_notify(struct zone *zone, int pressure)
+{
+	unsigned long target;
+	unsigned long pages_high, pages_free, pages_reserve;
+
+	if (pressure) {
+		target = atomic_long_read(&last_mem_notify) + MEM_NOTIFY_FREQ;
+		if (likely(time_before(jiffies, target)))
+			return;
+
+		pages_high = zone->pages_high;
+		pages_free = zone_page_state(zone, NR_FREE_PAGES);
+		pages_reserve = zone->lowmem_reserve[MAX_NR_ZONES-1];
+		if (unlikely(pages_free > (pages_high+pages_reserve)*2))
+			return;
+
+	} else if (likely(!zone->mem_notify_status))
+		return;
+
+	__memory_pressure_notify(zone, pressure);
+}
+
+#endif /* _LINUX_MEM_NOTIFY_H */
Index: b/include/linux/mmzone.h
===================================================================
--- a/include/linux/mmzone.h	2008-01-23 19:22:56.000000000 +0900
+++ b/include/linux/mmzone.h	2008-01-23 21:12:44.000000000 +0900
@@ -283,6 +283,7 @@ struct zone {
 	 */
 	int prev_priority;
 
+	int mem_notify_status;
 
 	ZONE_PADDING(_pad2_)
 	/* Rarely used or read-mostly fields */
Index: b/mm/Makefile
===================================================================
--- a/mm/Makefile	2008-01-23 19:22:28.000000000 +0900
+++ b/mm/Makefile	2008-01-23 21:12:44.000000000 +0900
@@ -11,7 +11,7 @@ obj-y			:= bootmem.o filemap.o mempool.o
 			   page_alloc.o page-writeback.o pdflush.o \
 			   readahead.o swap.o truncate.o vmscan.o \
 			   prio_tree.o util.o mmzone.o vmstat.o backing-dev.o \
-			   page_isolation.o $(mmu-y)
+			   page_isolation.o mem_notify.o $(mmu-y)
 
 obj-$(CONFIG_PROC_PAGE_MONITOR) += pagewalk.o
 obj-$(CONFIG_BOUNCE)	+= bounce.o
Index: b/mm/mem_notify.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ b/mm/mem_notify.c	2008-01-23 23:09:31.000000000 +0900
@@ -0,0 +1,114 @@
+/*
+ * Notify applications of memory pressure via /dev/mem_notify
+ *
+ * Copyright (C) 2008 Marcelo Tosatti <marcelo@kvack.org>,
+ *                    KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
+ *
+ * Released under the GPL, see the file COPYING for details.
+ */
+
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/wait.h>
+#include <linux/poll.h>
+#include <linux/timer.h>
+#include <linux/spinlock.h>
+#include <linux/mm.h>
+#include <linux/vmstat.h>
+#include <linux/percpu.h>
+#include <linux/timer.h>
+
+#include <asm/atomic.h>
+
+#define PROC_WAKEUP_GUARD  (10*HZ)
+
+struct mem_notify_file_info {
+	unsigned long last_proc_notify;
+};
+
+static DECLARE_WAIT_QUEUE_HEAD(mem_wait);
+static atomic_long_t nr_under_memory_pressure_zones = ATOMIC_LONG_INIT(0);
+static atomic_t nr_watcher_task = ATOMIC_INIT(0);
+
+atomic_long_t last_mem_notify = ATOMIC_LONG_INIT(INITIAL_JIFFIES);
+
+void __memory_pressure_notify(struct zone* zone, int pressure)
+{
+	int nr_wakeup;
+	unsigned long flags;
+
+	spin_lock_irqsave(&mem_wait.lock, flags);
+
+	if (pressure != zone->mem_notify_status) {
+		long val = pressure ? 1 : -1;
+		atomic_long_add(val, &nr_under_memory_pressure_zones);
+		zone->mem_notify_status = pressure;
+	}
+
+	if (pressure) {
+		int nr_watcher = atomic_read(&nr_watcher_task);
+
+		nr_wakeup = (nr_watcher >> 4) + 1;
+		if (unlikely(nr_wakeup > 100))
+			nr_wakeup = 100;
+
+		atomic_long_set(&last_mem_notify, jiffies);
+		wake_up_locked_nr(&mem_wait, nr_wakeup);
+	}
+
+	spin_unlock_irqrestore(&mem_wait.lock, flags);
+}
+
+static int mem_notify_open(struct inode *inode, struct file *file)
+{
+	struct mem_notify_file_info *info;
+	int    err = 0;
+
+	info = kmalloc(sizeof(*info), GFP_KERNEL);
+        if (!info) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	info->last_proc_notify = INITIAL_JIFFIES;
+	file->private_data = info;
+	atomic_inc(&nr_watcher_task);
+out:
+        return err;
+}
+
+static int mem_notify_release(struct inode *inode, struct file *file)
+{
+	kfree(file->private_data);
+	atomic_dec(&nr_watcher_task);
+	return 0;
+}
+
+static unsigned int mem_notify_poll(struct file *file, poll_table *wait)
+{
+	struct mem_notify_file_info *info = file->private_data;
+	unsigned long now = jiffies;
+	unsigned long timeout;
+	unsigned int retval = 0;
+
+	poll_wait_exclusive(file, &mem_wait, wait);
+
+	timeout = info->last_proc_notify + PROC_WAKEUP_GUARD;
+	if (time_before(now, timeout))
+		goto out;
+
+	if (atomic_long_read(&nr_under_memory_pressure_zones) != 0) {
+		info->last_proc_notify = now;
+		retval = POLLIN;
+	}
+
+out:
+	return retval;
+}
+
+struct file_operations mem_notify_fops = {
+	.open = mem_notify_open,
+	.release = mem_notify_release,
+	.poll = mem_notify_poll,
+};
+EXPORT_SYMBOL(mem_notify_fops);
Index: b/mm/page_alloc.c
===================================================================
--- a/mm/page_alloc.c	2008-01-23 19:22:28.000000000 +0900
+++ b/mm/page_alloc.c	2008-01-23 23:09:42.000000000 +0900
@@ -3458,6 +3458,7 @@ static void __meminit free_area_init_cor
 		zone->zone_pgdat = pgdat;
 
 		zone->prev_priority = DEF_PRIORITY;
+		zone->mem_notify_status = 0;
 
 		zone_pcp_init(zone);
 		INIT_LIST_HEAD(&zone->active_list);
Index: b/Documentation/devices.txt
===================================================================
--- a/Documentation/devices.txt	2008-01-23 19:22:33.000000000 +0900
+++ b/Documentation/devices.txt	2008-01-23 21:12:44.000000000 +0900
@@ -96,6 +96,7 @@ Your cooperation is appreciated.
 		 11 = /dev/kmsg		Writes to this come out as printk's
 		 12 = /dev/oldmem	Used by crashdump kernels to access
 					the memory of the kernel that crashed.
+		 13 = /dev/mem_notify   Low memory notification.
 
   1 block	RAM disk
 		  0 = /dev/ram0		First RAM disk



* [RFC][PATCH 4/8] mem_notify v5: memory_pressure_notify() caller
  2008-01-24  4:18 [RFC][PATCH 0/8] mem_notify v5 KOSAKI Motohiro
                   ` (2 preceding siblings ...)
  2008-01-24  4:21 ` [RFC][PATCH 3/8] mem_notify v5: introduce /dev/mem_notify new device (the core of this patch series) KOSAKI Motohiro
@ 2008-01-24  4:22 ` KOSAKI Motohiro
  2008-01-24  4:22 ` [RFC][PATCH 5/8] mem_notify v5: add new mem_notify field to /proc/zoneinfo KOSAKI Motohiro
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: KOSAKI Motohiro @ 2008-01-24  4:22 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: kosaki.motohiro, Marcelo Tosatti, Daniel Spang, Rik van Riel,
	Andrew Morton, Alan Cox

The notification happens whenever the VM moves an anonymous page to
the inactive list. This is a pretty good indication that there are
unused anonymous pages present which will very likely be swapped out
soon.

The zone is judged to be out of trouble in the following situations:
 o memory pressure decreases and the VM stops moving anonymous pages to the inactive list.
 o free pages rise above (pages_high+lowmem_reserve)*2.
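
The second condition can be written out as a small stand-alone sketch. The
function names model_notify_threshold() and crossed_out_of_trouble() are
hypothetical; the threshold formula and the "was at or below, is now above"
test follow the __free_one_page() hunk in this patch. The additional check
that the zone is currently marked as under pressure is left out here.

```c
#include <assert.h>

/* Threshold used by the patch: (pages_high + lowmem_reserve) * 2. */
static unsigned long model_notify_threshold(unsigned long pages_high,
					    unsigned long lowmem_reserve)
{
	return (pages_high + lowmem_reserve) * 2;
}

/*
 * Returns 1 when a free operation moves the zone from at-or-below the
 * threshold to above it, i.e. the moment the zone gets out of trouble.
 */
static int crossed_out_of_trouble(unsigned long prev_free,
				  unsigned long now_free,
				  unsigned long threshold)
{
	assert(now_free >= prev_free);	/* freeing pages never lowers the count */
	return prev_free <= threshold && now_free > threshold;
}
```

Testing for the crossing rather than just "free > threshold" means the
out-of-trouble notification is attempted once, at the transition, and not on
every later free.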


ChangeLog:
	v5: add out-of-trouble notification at the exit of balance_pgdat().


Signed-off-by: Marcelo Tosatti <marcelo@kvack.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

---
 mm/page_alloc.c |   12 ++++++++++++
 mm/vmscan.c     |   26 ++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

Index: b/mm/vmscan.c
===================================================================
--- a/mm/vmscan.c	2008-01-23 22:06:08.000000000 +0900
+++ b/mm/vmscan.c	2008-01-23 22:07:57.000000000 +0900
@@ -39,6 +39,7 @@
 #include <linux/kthread.h>
 #include <linux/freezer.h>
 #include <linux/memcontrol.h>
+#include <linux/mem_notify.h>
 
 #include <asm/tlbflush.h>
 #include <asm/div64.h>
@@ -1089,10 +1090,14 @@ static void shrink_active_list(unsigned 
 	struct page *page;
 	struct pagevec pvec;
 	int reclaim_mapped = 0;
+	bool inactivated_anon = 0;
 
 	if (sc->may_swap)
 		reclaim_mapped = calc_reclaim_mapped(sc, zone, priority);
 
+	if (!reclaim_mapped)
+		memory_pressure_notify(zone, 0);
+
 	lru_add_drain();
 	spin_lock_irq(&zone->lru_lock);
 	pgmoved = sc->isolate_pages(nr_pages, &l_hold, &pgscanned, sc->order,
@@ -1116,6 +1121,13 @@ static void shrink_active_list(unsigned 
 			if (!reclaim_mapped ||
 			    (total_swap_pages == 0 && PageAnon(page)) ||
 			    page_referenced(page, 0, sc->mem_cgroup)) {
+				/* deal with the case where there is no
+				 * swap but an anonymous page would be
+				 * moved to the inactive list.
+				 */
+				if (!total_swap_pages && reclaim_mapped &&
+				    PageAnon(page))
+					inactivated_anon = 1;
 				list_add(&page->lru, &l_active);
 				continue;
 			}
@@ -1123,8 +1135,12 @@ static void shrink_active_list(unsigned 
 			list_add(&page->lru, &l_active);
 			continue;
 		}
+		if (PageAnon(page))
+			inactivated_anon = 1;
 		list_add(&page->lru, &l_inactive);
 	}
+	if (inactivated_anon)
+		memory_pressure_notify(zone, 1);
 
 	pagevec_init(&pvec, 1);
 	pgmoved = 0;
@@ -1158,6 +1174,8 @@ static void shrink_active_list(unsigned 
 		pagevec_strip(&pvec);
 		spin_lock_irq(&zone->lru_lock);
 	}
+	if (!reclaim_mapped)
+		memory_pressure_notify(zone, 0);
 
 	pgmoved = 0;
 	while (!list_empty(&l_active)) {
@@ -1659,6 +1677,14 @@ out:
 		goto loop_again;
 	}
 
+	for (i = pgdat->nr_zones - 1; i >= 0; i--) {
+		struct zone *zone = pgdat->node_zones + i;
+
+		if (!populated_zone(zone))
+			continue;
+		memory_pressure_notify(zone, 0);
+	}
+
 	return nr_reclaimed;
 }
 
Index: b/mm/page_alloc.c
===================================================================
--- a/mm/page_alloc.c	2008-01-23 22:06:08.000000000 +0900
+++ b/mm/page_alloc.c	2008-01-23 23:09:32.000000000 +0900
@@ -44,6 +44,7 @@
 #include <linux/fault-inject.h>
 #include <linux/page-isolation.h>
 #include <linux/memcontrol.h>
+#include <linux/mem_notify.h>
 
 #include <asm/tlbflush.h>
 #include <asm/div64.h>
@@ -435,6 +436,8 @@ static inline void __free_one_page(struc
 	unsigned long page_idx;
 	int order_size = 1 << order;
 	int migratetype = get_pageblock_migratetype(page);
+	unsigned long prev_free;
+	unsigned long notify_threshold;
 
 	if (unlikely(PageCompound(page)))
 		destroy_compound_page(page, order);
@@ -444,6 +447,7 @@ static inline void __free_one_page(struc
 	VM_BUG_ON(page_idx & (order_size - 1));
 	VM_BUG_ON(bad_range(zone, page));
 
+	prev_free = zone_page_state(zone, NR_FREE_PAGES);
 	__mod_zone_page_state(zone, NR_FREE_PAGES, order_size);
 	while (order < MAX_ORDER-1) {
 		unsigned long combined_idx;
@@ -465,6 +469,14 @@ static inline void __free_one_page(struc
 	list_add(&page->lru,
 		&zone->free_area[order].free_list[migratetype]);
 	zone->free_area[order].nr_free++;
+
+	notify_threshold = (zone->pages_high +
+			    zone->lowmem_reserve[MAX_NR_ZONES-1]) * 2;
+
+	if (unlikely((zone->mem_notify_status == 1) &&
+		     (prev_free <= notify_threshold) &&
+		     (zone_page_state(zone, NR_FREE_PAGES) > notify_threshold)))
+		memory_pressure_notify(zone, 0);
 }
 
 static inline int free_pages_check(struct page *page)



* [RFC][PATCH 5/8] mem_notify v5: add new mem_notify field to /proc/zoneinfo
  2008-01-24  4:18 [RFC][PATCH 0/8] mem_notify v5 KOSAKI Motohiro
                   ` (3 preceding siblings ...)
  2008-01-24  4:22 ` [RFC][PATCH 4/8] mem_notify v5: memory_pressure_notify() caller KOSAKI Motohiro
@ 2008-01-24  4:22 ` KOSAKI Motohiro
  2008-01-24  4:23 ` [RFC][PATCH 6/8] mem_notify v5: (optional) fixed incorrect shrink_zone KOSAKI Motohiro
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: KOSAKI Motohiro @ 2008-01-24  4:22 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: kosaki.motohiro, Marcelo Tosatti, Daniel Spang, Rik van Riel,
	Andrew Morton, Alan Cox

Show the new mem_notify_status member of struct zone in /proc/zoneinfo.

ChangeLog:
	v5: move the field to the end of the output.
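
For user-space tools that want to read the new field, a parsing sketch is
shown below. It assumes the field is printed on its own line as
"mem_notify_status: <integer>", which is what the seq_printf() format in
this patch suggests; the function name parse_mem_notify_status() is
hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse one line of /proc/zoneinfo-style text.
 * Returns 1 and stores the value in *status if the line is a
 * mem_notify_status line, 0 otherwise.
 */
static int parse_mem_notify_status(const char *line, int *status)
{
	assert(line != NULL && status != NULL);

	/* The leading space in the format skips the line's indentation. */
	return sscanf(line, " mem_notify_status: %i", status) == 1;
}
```

A caller would read /proc/zoneinfo line by line, for example with fgets(),
and pass each line to this function.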


Signed-off-by: Marcelo Tosatti <marcelo@kvack.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

---
 mm/vmstat.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Index: b/mm/vmstat.c
===================================================================
--- a/mm/vmstat.c	2008-01-23 22:06:05.000000000 +0900
+++ b/mm/vmstat.c	2008-01-23 22:08:00.000000000 +0900
@@ -795,10 +795,12 @@ static void zoneinfo_show_print(struct s
 	seq_printf(m,
 		   "\n  all_unreclaimable: %u"
 		   "\n  prev_priority:     %i"
-		   "\n  start_pfn:         %lu",
-			   zone_is_all_unreclaimable(zone),
+		   "\n  start_pfn:         %lu"
+		   "\n  mem_notify_status: %i",
+		   zone_is_all_unreclaimable(zone),
 		   zone->prev_priority,
-		   zone->zone_start_pfn);
+		   zone->zone_start_pfn,
+		   zone->mem_notify_status);
 	seq_putc(m, '\n');
 }
 




* [RFC][PATCH 6/8] mem_notify v5: (optional) fixed incorrect shrink_zone
  2008-01-24  4:18 [RFC][PATCH 0/8] mem_notify v5 KOSAKI Motohiro
                   ` (4 preceding siblings ...)
  2008-01-24  4:22 ` [RFC][PATCH 5/8] mem_notify v5: add new mem_notify field to /proc/zoneinfo KOSAKI Motohiro
@ 2008-01-24  4:23 ` KOSAKI Motohiro
  2008-01-24  4:24 ` [RFC][PATCH 7/8] mem_notify v5: ignore very small zone for prevent incorrect low mem notify KOSAKI Motohiro
  2008-01-24  4:26 ` [RFC][PATCH 8/8] mem_notify v5: support fasync feature KOSAKI Motohiro
  7 siblings, 0 replies; 11+ messages in thread
From: KOSAKI Motohiro @ 2008-01-24  4:23 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: kosaki.motohiro, Marcelo Tosatti, Daniel Spang, Rik van Riel,
	Andrew Morton, Alan Cox

On x86, ZONE_DMA is very small and is often not used at all.

Unfortunately, when NR_ACTIVE==0 and NR_INACTIVE==0, shrink_zone() still
tries to reclaim 1 page because of the "+ 1" in:

    zone->nr_scan_active +=
        (zone_page_state(zone, NR_ACTIVE) >> priority) + 1;
                                                        ^^^^^

This causes an unnecessary low memory notification ;-)
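
The arithmetic can be checked with a small stand-alone sketch. The function
names scan_increment_old() and scan_increment_new() are made up for
illustration; the first restates the expression quoted above, the second
restates the clamped version introduced by the diff below.

```c
#include <assert.h>

/* Before the patch: always at least 1, even for an empty list. */
static unsigned long scan_increment_old(unsigned long list_pages, int priority)
{
	assert(priority >= 0 && priority < (int)(8 * sizeof(unsigned long)));
	return (list_pages >> priority) + 1;
}

/* After the patch: never more than the list actually holds. */
static unsigned long scan_increment_new(unsigned long list_pages, int priority)
{
	unsigned long tmp = scan_increment_old(list_pages, priority);

	if (tmp > list_pages)
		tmp = list_pages;
	return tmp;
}
```

For an empty list at priority 12, the old expression yields 1 while the
clamped one yields 0, so an empty zone no longer accumulates scan work.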


ChangeLog
	v5: new

---
 mm/vmscan.c |   21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

Index: b/mm/vmscan.c
===================================================================
--- a/mm/vmscan.c	2008-01-18 14:18:27.000000000 +0900
+++ b/mm/vmscan.c	2008-01-18 14:49:06.000000000 +0900
@@ -948,7 +948,7 @@ static inline void note_zone_scanning_pr
 
 static inline int zone_is_near_oom(struct zone *zone)
 {
-	return zone->pages_scanned >= (zone_page_state(zone, NR_ACTIVE)
+	return zone->pages_scanned > (zone_page_state(zone, NR_ACTIVE)
 				+ zone_page_state(zone, NR_INACTIVE))*3;
 }
 
@@ -1214,18 +1214,29 @@ static unsigned long shrink_zone(int pri
 	unsigned long nr_inactive;
 	unsigned long nr_to_scan;
 	unsigned long nr_reclaimed = 0;
+	unsigned long tmp;
+	unsigned long zone_active;
+	unsigned long zone_inactive;
 
 	if (scan_global_lru(sc)) {
 		/*
 		 * Add one to nr_to_scan just to make sure that the kernel
 		 * will slowly sift through the active list.
 		 */
-		zone->nr_scan_active +=
-			(zone_page_state(zone, NR_ACTIVE) >> priority) + 1;
+		zone_active = zone_page_state(zone, NR_ACTIVE);
+		tmp = (zone_active >> priority) + 1;
+		if (unlikely(tmp > zone_active))
+			tmp = zone_active;
+		zone->nr_scan_active += tmp;
 		nr_active = zone->nr_scan_active;
-		zone->nr_scan_inactive +=
-			(zone_page_state(zone, NR_INACTIVE) >> priority) + 1;
+
+		zone_inactive = zone_page_state(zone, NR_INACTIVE);
+		tmp = (zone_inactive >> priority) + 1;
+		if (unlikely(tmp > zone_inactive))
+			tmp = zone_inactive;
+		zone->nr_scan_inactive += tmp;
 		nr_inactive = zone->nr_scan_inactive;
+
 		if (nr_inactive >= sc->swap_cluster_max)
 			zone->nr_scan_inactive = 0;
 		else



* [RFC][PATCH 7/8] mem_notify v5: ignore very small zone for prevent incorrect low mem notify.
  2008-01-24  4:18 [RFC][PATCH 0/8] mem_notify v5 KOSAKI Motohiro
                   ` (5 preceding siblings ...)
  2008-01-24  4:23 ` [RFC][PATCH 6/8] mem_notify v5: (optional) fixed incorrect shrink_zone KOSAKI Motohiro
@ 2008-01-24  4:24 ` KOSAKI Motohiro
  2008-01-24  4:26 ` [RFC][PATCH 8/8] mem_notify v5: support fasync feature KOSAKI Motohiro
  7 siblings, 0 replies; 11+ messages in thread
From: KOSAKI Motohiro @ 2008-01-24  4:24 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: kosaki.motohiro, Marcelo Tosatti, Daniel Spang, Rik van Riel,
	Andrew Morton, Alan Cox

On x86, ZONE_DMA is very small, and it causes undesirable
low memory notifications. It should be ignored.

On some other architectures, however, ZONE_DMA covers 4GB,
which is too large to ignore.

Therefore, whether a zone is ignored is decided by its size: a zone
holding less than 1/10 of its node's present pages is ignored.
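
The size rule can be checked with a small userspace sketch. zone_is_ignored() is a hypothetical helper that only mirrors the comparison this patch adds to free_area_init_core(); the page counts in the note after it are illustrative assumptions (4KB pages), not measurements.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not part of the patch): true when a zone would get
 * mem_notify_status = -1, i.e. when it holds less than one tenth of the
 * node's present pages.
 */
static bool zone_is_ignored(unsigned long zone_present_pages,
			    unsigned long node_present_pages)
{
	return zone_present_pages < node_present_pages / 10;
}
```

For example, a 16MB ZONE_DMA (4096 pages) on a 1GB node (262144 pages) is ignored, while a 4GB ZONE_DMA (1048576 pages) on a 16GB node (4194304 pages) is not.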

ChangeLog:
	v5: new


Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

---
 include/linux/mem_notify.h |    3 +++
 mm/page_alloc.c            |    6 +++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

Index: b/include/linux/mem_notify.h
===================================================================
--- a/include/linux/mem_notify.h	2008-01-23 22:06:04.000000000 +0900
+++ b/include/linux/mem_notify.h	2008-01-23 22:08:02.000000000 +0900
@@ -22,6 +22,9 @@ static inline void memory_pressure_notif
 	unsigned long target;
 	unsigned long pages_high, pages_free, pages_reserve;
 
+	if (unlikely(zone->mem_notify_status == -1))
+		return;
+
 	if (pressure) {
 		target = atomic_long_read(&last_mem_notify) + MEM_NOTIFY_FREQ;
 		if (likely(time_before(jiffies, target)))
Index: b/mm/page_alloc.c
===================================================================
--- a/mm/page_alloc.c	2008-01-23 22:07:57.000000000 +0900
+++ b/mm/page_alloc.c	2008-01-23 22:08:02.000000000 +0900
@@ -3470,7 +3470,11 @@ static void __meminit free_area_init_cor
 		zone->zone_pgdat = pgdat;
 
 		zone->prev_priority = DEF_PRIORITY;
-		zone->mem_notify_status = 0;
+
+		if (zone->present_pages < (pgdat->node_present_pages / 10))
+			zone->mem_notify_status = -1;
+		else
+			zone->mem_notify_status = 0;
 
 		zone_pcp_init(zone);
 		INIT_LIST_HEAD(&zone->active_list);



* [RFC][PATCH 8/8] mem_notify v5: support fasync feature
  2008-01-24  4:18 [RFC][PATCH 0/8] mem_notify v5 KOSAKI Motohiro
                   ` (6 preceding siblings ...)
  2008-01-24  4:24 ` [RFC][PATCH 7/8] mem_notify v5: ignore very small zone for prevent incorrect low mem notify KOSAKI Motohiro
@ 2008-01-24  4:26 ` KOSAKI Motohiro
  7 siblings, 0 replies; 11+ messages in thread
From: KOSAKI Motohiro @ 2008-01-24  4:26 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: kosaki.motohiro, Marcelo Tosatti, Daniel Spang, Rik van Riel,
	Andrew Morton, Alan Cox

Implement FASYNC capability for /dev/mem_notify.

<usage example>
	fd = open("/dev/mem_notify", O_RDONLY);

	fcntl(fd, F_SETOWN, getpid());

	flags = fcntl(fd, F_GETFL);
	fcntl(fd, F_SETFL, flags|FASYNC);  /* receive SIGIO when memory is low */
</usage example>
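
A slightly fuller userspace sketch of the same setup, with error checking and a signal handler. watch_with_sigio(), on_sigio() and low_memory_seen are hypothetical names chosen for illustration; O_ASYNC is the same flag as FASYNC on Linux. /dev/mem_notify exists only on a kernel with this patch series applied.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* makes O_ASYNC visible on glibc */
#endif
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by the handler; the main loop is expected to poll and clear it. */
static volatile sig_atomic_t low_memory_seen;

static void on_sigio(int sig)
{
	(void)sig;
	low_memory_seen = 1;	/* only async-signal-safe work here */
}

/*
 * Hypothetical helper: install a SIGIO handler and enable asynchronous
 * notification on fd. Returns 0 on success, -1 on failure (errno is set).
 */
static int watch_with_sigio(int fd)
{
	struct sigaction sa;
	int flags;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigio;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGIO, &sa, NULL) < 0)
		return -1;

	/* deliver SIGIO for this descriptor to the current process */
	if (fcntl(fd, F_SETOWN, getpid()) < 0)
		return -1;

	flags = fcntl(fd, F_GETFL);
	if (flags < 0)
		return -1;

	/* turning O_ASYNC on is what invokes the driver's .fasync method */
	return fcntl(fd, F_SETFL, flags | O_ASYNC) < 0 ? -1 : 0;
}
```

A caller would open("/dev/mem_notify", O_RDONLY), pass the descriptor to watch_with_sigio(), and release caches from its main loop whenever low_memory_seen becomes nonzero.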


ChangeLog
	v5: new



Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

---
 mm/mem_notify.c |   95 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 90 insertions(+), 5 deletions(-)

Index: b/mm/mem_notify.c
===================================================================
--- a/mm/mem_notify.c	2008-01-23 23:09:08.000000000 +0900
+++ b/mm/mem_notify.c	2008-01-23 23:09:27.000000000 +0900
@@ -23,18 +23,58 @@
 #define PROC_WAKEUP_GUARD  (10*HZ)
 
 struct mem_notify_file_info {
-	unsigned long last_proc_notify;
+	unsigned long     last_proc_notify;
+	struct file      *file;
+
+	struct list_head  fa_list;
+	int	          fa_fd;
 };
 
 static DECLARE_WAIT_QUEUE_HEAD(mem_wait);
 static atomic_long_t nr_under_memory_pressure_zones = ATOMIC_LONG_INIT(0);
 static atomic_t nr_watcher_task = ATOMIC_INIT(0);
+static LIST_HEAD(mem_notify_fasync_list);
+static DEFINE_SPINLOCK(mem_notify_fasync_lock);
+static atomic_t nr_fasync_task = ATOMIC_INIT(0);
 
 atomic_long_t last_mem_notify = ATOMIC_LONG_INIT(INITIAL_JIFFIES);
 
+
+static void mem_notify_kill_fasync_nr(int sig, int band, int nr)
+{
+	struct mem_notify_file_info *iter, *saved_iter;
+	LIST_HEAD(l_fired);
+
+	if (!nr)
+		return;
+
+	spin_lock(&mem_notify_fasync_lock);
+
+	list_for_each_entry_safe_reverse(iter, saved_iter, &mem_notify_fasync_list, fa_list) {
+		struct fown_struct * fown;
+
+		fown = &iter->file->f_owner;
+		if (!(sig == SIGURG && fown->signum == 0))
+			send_sigio(fown, iter->fa_fd, band);
+
+		list_del(&iter->fa_list);
+		list_add(&iter->fa_list, &l_fired);
+		if(!--nr)
+			break;
+	}
+
+	/* rotate moving for FIFO wakeup */
+	list_splice(&l_fired, &mem_notify_fasync_list);
+
+	spin_unlock(&mem_notify_fasync_lock);
+}
+
+
 void __memory_pressure_notify(struct zone* zone, int pressure)
 {
 	int nr_wakeup;
+	int nr_poll_wakeup = 0;
+	int nr_fasync_wakeup = 0;
 	int flags;
 
 	spin_lock_irqsave(&mem_wait.lock, flags);
@@ -47,13 +87,18 @@ void __memory_pressure_notify(struct zon
 
 	if (pressure) {
 		int nr_watcher = atomic_read(&nr_watcher_task);
+		int nr_fasync = atomic_read(&nr_fasync_task);
 
 		nr_wakeup = (nr_watcher >> 4) + 1;
 		if (unlikely(nr_wakeup > 100))
 			nr_wakeup = 100;
 
+ 		nr_fasync_wakeup = nr_wakeup * nr_fasync/nr_watcher;
+ 		nr_poll_wakeup = nr_wakeup - nr_fasync_wakeup;
+
 		atomic_long_set(&last_mem_notify, jiffies);
-		wake_up_locked_nr(&mem_wait, nr_wakeup);
+		wake_up_locked_nr(&mem_wait, nr_poll_wakeup);
+ 		mem_notify_kill_fasync_nr(SIGIO, POLL_IN, nr_fasync_wakeup);
 	}
 
 	spin_unlock_irqrestore(&mem_wait.lock, flags);
@@ -71,6 +116,9 @@ static int mem_notify_open(struct inode 
 	}
 
 	info->last_proc_notify = INITIAL_JIFFIES;
+	INIT_LIST_HEAD(&info->fa_list);
+	info->file = file;
+	info->fa_fd = -1;
 	file->private_data = info;
 	atomic_inc(&nr_watcher_task);
 out:
@@ -79,7 +127,16 @@ out:
 
 static int mem_notify_release(struct inode *inode, struct file *file)
 {
-	kfree(file->private_data);
+	struct mem_notify_file_info *info = file->private_data;
+
+	spin_lock(&mem_notify_fasync_lock);
+	if (!list_empty(&info->fa_list)) {
+		list_del(&info->fa_list);
+		atomic_dec(&nr_fasync_task);
+	}
+	spin_unlock(&mem_notify_fasync_lock);
+
+	kfree(info);
 	atomic_dec(&nr_watcher_task);
 	return 0;
 }
@@ -106,9 +163,37 @@ out:
 	return retval;
 }
 
+static int mem_notify_fasync(int fd, struct file *filp, int on)
+{
+	struct mem_notify_file_info *info = filp->private_data;
+	int result = 0;
+
+	spin_lock(&mem_notify_fasync_lock);
+	if (on) {
+		if (list_empty(&info->fa_list)) {
+			info->fa_fd = fd;
+			list_add(&info->fa_list, &mem_notify_fasync_list);
+			result = 1;
+		} else {
+			info->fa_fd = fd;
+		}
+	} else {
+		if (!list_empty(&info->fa_list)) {
+			list_del_init(&info->fa_list);
+			info->fa_fd = -1;
+			result = -1;
+		}
+	}
+	if (result != 0)
+		atomic_add(result, &nr_fasync_task);
+	spin_unlock(&mem_notify_fasync_lock);
+	return abs(result);
+}
+
 struct file_operations mem_notify_fops = {
-	.open = mem_notify_open,
+	.open    = mem_notify_open,
 	.release = mem_notify_release,
-	.poll = mem_notify_poll,
+	.poll    = mem_notify_poll,
+	.fasync  = mem_notify_fasync,
 };
 EXPORT_SYMBOL(mem_notify_fops);
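
The way __memory_pressure_notify() divides one batch of wakeups between poll() waiters and FASYNC owners can be sketched in userspace as follows. struct wakeup_split and split_wakeups() are hypothetical names. The early return for nr_watcher == 0 is an addition of this sketch, not part of the patch: the patch divides by nr_watcher directly, which looks unsafe if the function can run while no process has the device open.

```c
/* Hypothetical result type: how many tasks of each kind to notify. */
struct wakeup_split {
	int poll;	/* waiters woken through the wait queue */
	int fasync;	/* owners sent SIGIO */
};

static struct wakeup_split split_wakeups(int nr_watcher, int nr_fasync)
{
	struct wakeup_split s = { 0, 0 };
	int nr_wakeup;

	/* guard added in this sketch only; avoids dividing by zero */
	if (nr_watcher <= 0)
		return s;

	/* as in the patch: 1/16 of the watchers plus one, at most 100 */
	nr_wakeup = (nr_watcher >> 4) + 1;
	if (nr_wakeup > 100)
		nr_wakeup = 100;

	/* FASYNC users get a share proportional to their number */
	s.fasync = nr_wakeup * nr_fasync / nr_watcher;
	s.poll = nr_wakeup - s.fasync;
	return s;
}
```

With 32 watchers of which 16 use FASYNC, three tasks are notified per event: one by SIGIO and two through poll().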



* Re: [RFC][PATCH 3/8] mem_notify v5: introduce /dev/mem_notify new device (the core of this patch series)
  2008-01-24  4:21 ` [RFC][PATCH 3/8] mem_notify v5: introduce /dev/mem_notify new device (the core of this patch series) KOSAKI Motohiro
@ 2008-01-24 12:19   ` Daniel Spång
  2008-01-25  3:33     ` KOSAKI Motohiro
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Spång @ 2008-01-24 12:19 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: linux-mm, linux-kernel, Marcelo Tosatti, Rik van Riel,
	Andrew Morton, Alan Cox

Hi KOSAKI,

On 1/24/08, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> +#define PROC_WAKEUP_GUARD  (10*HZ)
[...]
> +       timeout = info->last_proc_notify + PROC_WAKEUP_GUARD;

If only one or a few processes are using the system I think 10 seconds
is a little long time to wait before they get the notification again.
Can we decrease this value? Or make it configurable under /proc? Or
make it lower with fewer users? Something like:

timeout = info->last_proc_notify + min(mem_notify_users, PROC_WAKEUP_GUARD);
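
One possible reading of this formula, as a userspace sketch: the formula as written compares a user count with a jiffies value, so the sketch assumes one second of guard time per watching process, capped at the existing 10 seconds. wakeup_guard(), nr_users and the one-second step are all assumptions made for illustration; hz stands for the kernel's HZ.

```c
/*
 * Hypothetical helper: guard interval in jiffies before the same process
 * may be notified again. Assumes nr_users * hz does not overflow.
 */
static unsigned long wakeup_guard(unsigned long nr_users, unsigned long hz)
{
	unsigned long max_guard = 10 * hz;	/* PROC_WAKEUP_GUARD */
	unsigned long guard = nr_users * hz;	/* assumed: 1 second per user */

	return guard < max_guard ? guard : max_guard;
}
```

Under these assumptions a single watcher would be re-notified after 1 second, and ten or more watchers would keep the current 10 second guard.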

Cheers,
Daniel


* Re: [RFC][PATCH 3/8] mem_notify v5: introduce /dev/mem_notify new device (the core of this patch series)
  2008-01-24 12:19   ` Daniel Spång
@ 2008-01-25  3:33     ` KOSAKI Motohiro
  0 siblings, 0 replies; 11+ messages in thread
From: KOSAKI Motohiro @ 2008-01-25  3:33 UTC (permalink / raw)
  To: "Daniel Spång"
  Cc: kosaki.motohiro, linux-mm, linux-kernel, Marcelo Tosatti,
	Rik van Riel, Andrew Morton, Alan Cox

Hi Daniel

> > +#define PROC_WAKEUP_GUARD  (10*HZ)
> [...]
> > +       timeout = info->last_proc_notify + PROC_WAKEUP_GUARD;
> 
> If only one or a few processes are using the system I think 10 seconds
> is a little long time to wait before they get the notification again.
> Can we decrease this value? Or make it configurable under /proc? Or
> make it lower with fewer users? Something like:

Oh, that is a very interesting issue.
Thank you for pointing it out.

After thinking it over, I realize my current implementation is quite stupid.
The current worst cases are below.

  1. low end
     - many processes that each use only a little memory (sh, cp, etc.) exist.
     - one memory-hungry process exists (perhaps a fat browser)
       and it is watching /dev/mem_notify.

  2. high end
     - many processes that each use only a little memory (sh, cp, etc.) exist.
     - one memory-hungry process exists (perhaps a DB)
       and it is watching /dev/mem_notify.

The point is that only one process watches /dev/mem_notify, not a few processes.
I will fix it with pleasure.


> timeout = info->last_proc_notify + min(mem_notify_users, PROC_WAKEUP_GUARD);

I like this formula.
The remaining problem is deciding the default value when only one process
watches /dev/mem_notify.

What do you think?
And if my low-end worst case doesn't match yours,
could you please explain your usage in more detail?


BTW:
Eventually we will add a /proc configuration knob,
but I think it is too early.
A configurable parameter sometimes prevents discussion of a better default value.
Instead, I hope we can adjust the default value to fit your usage.


- kosaki


