* [patch 0/9] oom killer serialization
@ 2007-09-20 20:23 David Rientjes
2007-09-20 20:23 ` [patch 1/9] oom: move prototypes to appropriate header file David Rientjes
2007-09-21 9:12 ` [patch 0/9] oom killer serialization Andrew Morton
0 siblings, 2 replies; 34+ messages in thread
From: David Rientjes @ 2007-09-20 20:23 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andrea Arcangeli, Christoph Lameter, Rik van Riel, linux-mm
Third version of the OOM serialization patchset. Zone locking is now
done with a newly-introduced flag in struct zone and the per-cpuset
oom_kill_asking_task has been extracted out to become a full-fledged
sysctl.
Thanks to Christoph Lameter and Paul Jackson for their help and review of
this patchset.
Applied on 2.6.23-rc7.
---
Documentation/sysctl/vm.txt | 22 +++++++++
drivers/char/sysrq.c | 1 +
include/linux/cpuset.h | 12 ++---
include/linux/mmzone.h | 33 ++++++++++++--
include/linux/oom.h | 23 +++++++++-
include/linux/swap.h | 5 --
kernel/cpuset.c | 70 ++++-----------------------
kernel/sysctl.c | 9 ++++
mm/oom_kill.c | 107 +++++++++++++++++++++++++++++++------------
mm/page_alloc.c | 23 +++++++--
mm/vmscan.c | 25 +++++-----
mm/vmstat.c | 2 +-
12 files changed, 206 insertions(+), 126 deletions(-)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [patch 1/9] oom: move prototypes to appropriate header file
2007-09-20 20:23 [patch 0/9] oom killer serialization David Rientjes
@ 2007-09-20 20:23 ` David Rientjes
2007-09-20 20:23 ` [patch 2/9] oom: move constraints to enum David Rientjes
2007-09-21 9:12 ` [patch 0/9] oom killer serialization Andrew Morton
1 sibling, 1 reply; 34+ messages in thread
From: David Rientjes @ 2007-09-20 20:23 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andrea Arcangeli, Christoph Lameter, Rik van Riel, linux-mm
Move the OOM killer's extern function prototypes to include/linux/oom.h
and include it where necessary.
Cc: Andrea Arcangeli <andrea@suse.de>
Acked-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
drivers/char/sysrq.c | 1 +
include/linux/oom.h | 11 ++++++++++-
include/linux/swap.h | 5 -----
mm/page_alloc.c | 1 +
4 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/char/sysrq.c b/drivers/char/sysrq.c
--- a/drivers/char/sysrq.c
+++ b/drivers/char/sysrq.c
@@ -36,6 +36,7 @@
#include <linux/kexec.h>
#include <linux/irq.h>
#include <linux/hrtimer.h>
+#include <linux/oom.h>
#include <asm/ptrace.h>
#include <asm/irq_regs.h>
diff --git a/include/linux/oom.h b/include/linux/oom.h
--- a/include/linux/oom.h
+++ b/include/linux/oom.h
@@ -1,10 +1,19 @@
#ifndef __INCLUDE_LINUX_OOM_H
#define __INCLUDE_LINUX_OOM_H
+#include <linux/sched.h>
+
/* /proc/<pid>/oom_adj set to -17 protects from the oom-killer */
#define OOM_DISABLE (-17)
/* inclusive */
#define OOM_ADJUST_MIN (-16)
#define OOM_ADJUST_MAX 15
-#endif
+#ifdef __KERNEL__
+
+extern void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask, int order);
+extern int register_oom_notifier(struct notifier_block *nb);
+extern int unregister_oom_notifier(struct notifier_block *nb);
+
+#endif /* __KERNEL__ */
+#endif /* __INCLUDE_LINUX_OOM_H */
diff --git a/include/linux/swap.h b/include/linux/swap.h
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -158,11 +158,6 @@ struct swap_list_t {
/* Swap 50% full? Release swapcache more aggressively.. */
#define vm_swap_full() (nr_swap_pages*2 < total_swap_pages)
-/* linux/mm/oom_kill.c */
-extern void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask, int order);
-extern int register_oom_notifier(struct notifier_block *nb);
-extern int unregister_oom_notifier(struct notifier_block *nb);
-
/* linux/mm/memory.c */
extern void swapin_readahead(swp_entry_t, unsigned long, struct vm_area_struct *);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -41,6 +41,7 @@
#include <linux/pfn.h>
#include <linux/backing-dev.h>
#include <linux/fault-inject.h>
+#include <linux/oom.h>
#include <asm/tlbflush.h>
#include <asm/div64.h>
* [patch 2/9] oom: move constraints to enum
2007-09-20 20:23 ` [patch 1/9] oom: move prototypes to appropriate header file David Rientjes
@ 2007-09-20 20:23 ` David Rientjes
2007-09-20 20:23 ` [patch 3/9] oom: change all_unreclaimable zone member to flags David Rientjes
0 siblings, 1 reply; 34+ messages in thread
From: David Rientjes @ 2007-09-20 20:23 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andrea Arcangeli, Christoph Lameter, Rik van Riel, linux-mm
The OOM killer's CONSTRAINT definitions are really more appropriate in an
enum, so define them in include/linux/oom.h.
Cc: Andrea Arcangeli <andrea@suse.de>
Acked-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
include/linux/oom.h | 9 +++++++++
mm/oom_kill.c | 12 +++---------
2 files changed, 12 insertions(+), 9 deletions(-)
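A minimal userspace sketch of what the enum conversion buys, for readers following along. Only the enum itself is taken from the patch; `oom_constraint_name()` is a hypothetical helper introduced here for illustration. One detail that is easy to miss: the old #defines numbered the constraints 1, 2, 3 (with MEMORY_POLICY before CPUSET), while the enum starts at 0 and orders CPUSET first, so code should compare against the names rather than rely on the numeric values.

```c
/* Userspace sketch; only the enum itself comes from the patch. */
enum oom_constraint {
	CONSTRAINT_NONE,
	CONSTRAINT_CPUSET,
	CONSTRAINT_MEMORY_POLICY,
};

/* Hypothetical helper (not in the patch): printable name of a constraint. */
static const char *oom_constraint_name(enum oom_constraint c)
{
	switch (c) {
	case CONSTRAINT_NONE:
		return "none";
	case CONSTRAINT_CPUSET:
		return "cpuset";
	case CONSTRAINT_MEMORY_POLICY:
		return "mempolicy";
	}
	/* Reached only for values outside the enum. */
	return "unknown";
}
```

Because the switch operates on the enum type and has no default label, gcc's -Wswitch (part of -Wall) should warn if a constraint is added later without a matching case, which plain integer #defines cannot offer.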
diff --git a/include/linux/oom.h b/include/linux/oom.h
--- a/include/linux/oom.h
+++ b/include/linux/oom.h
@@ -11,6 +11,15 @@
#ifdef __KERNEL__
+/*
+ * Types of limitations to the nodes from which allocations may occur
+ */
+enum oom_constraint {
+ CONSTRAINT_NONE,
+ CONSTRAINT_CPUSET,
+ CONSTRAINT_MEMORY_POLICY,
+};
+
extern void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask, int order);
extern int register_oom_notifier(struct notifier_block *nb);
extern int unregister_oom_notifier(struct notifier_block *nb);
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -164,16 +164,10 @@ unsigned long badness(struct task_struct *p, unsigned long uptime)
}
/*
- * Types of limitations to the nodes from which allocations may occur
- */
-#define CONSTRAINT_NONE 1
-#define CONSTRAINT_MEMORY_POLICY 2
-#define CONSTRAINT_CPUSET 3
-
-/*
* Determine the type of allocation constraint.
*/
-static inline int constrained_alloc(struct zonelist *zonelist, gfp_t gfp_mask)
+static inline enum oom_constraint constrained_alloc(struct zonelist *zonelist,
+ gfp_t gfp_mask)
{
#ifdef CONFIG_NUMA
struct zone **z;
@@ -400,7 +394,7 @@ void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask, int order)
struct task_struct *p;
unsigned long points = 0;
unsigned long freed = 0;
- int constraint;
+ enum oom_constraint constraint;
blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
if (freed > 0)
* [patch 3/9] oom: change all_unreclaimable zone member to flags
2007-09-20 20:23 ` [patch 2/9] oom: move constraints to enum David Rientjes
@ 2007-09-20 20:23 ` David Rientjes
2007-09-20 20:23 ` [patch 4/9] oom: add per-zone locking David Rientjes
` (2 more replies)
0 siblings, 3 replies; 34+ messages in thread
From: David Rientjes @ 2007-09-20 20:23 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andrea Arcangeli, Christoph Lameter, Rik van Riel, linux-mm
Convert the int all_unreclaimable member of struct zone to an unsigned
long flags word. The word can hold several per-zone flags, so both
all_unreclaimable and the reclaim_in_progress counter are removed from
struct zone and become flag bits.
Flags are set and cleared with:
    zone_set_flag(struct zone *zone, zone_flags_t flag)
    zone_clear_flag(struct zone *zone, zone_flags_t flag)
This patch defines the first two zone flags, ZONE_ALL_UNRECLAIMABLE and
ZONE_RECLAIM_LOCKED, which take over from zone->all_unreclaimable and
zone->reclaim_in_progress, respectively, and converts all current users
that set or clear either one to the new interface. Note that
ZONE_RECLAIM_LOCKED is a single bit: it records that reclaim is in
progress but, unlike the old atomic_t, does not count reclaimers.
Helper functions are defined to test the flags:
    int zone_is_all_unreclaimable(const struct zone *zone)
    int zone_is_reclaim_locked(const struct zone *zone)
All flag operations are atomic because some existing readers do not
take zone->lock.
Cc: Andrea Arcangeli <andrea@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
include/linux/mmzone.h | 28 ++++++++++++++++++++++++----
mm/page_alloc.c | 8 ++++----
mm/vmscan.c | 25 +++++++++++++------------
mm/vmstat.c | 2 +-
4 files changed, 42 insertions(+), 21 deletions(-)
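The flag interface described above can be modelled in userspace as follows. This is a sketch under stated assumptions, not the kernel code: `struct zone_model` and `zone_test_flag()` are hypothetical stand-ins, and C11 `atomic_fetch_or()` / `atomic_fetch_and()` take the place of the kernel's `set_bit()` / `clear_bit()`.

```c
#include <stdatomic.h>

typedef enum {
	ZONE_ALL_UNRECLAIMABLE,	/* all pages pinned */
	ZONE_RECLAIM_LOCKED,	/* prevents concurrent reclaim */
} zone_flags_t;

/* Hypothetical stand-in for struct zone: only the flags word. */
struct zone_model {
	atomic_ulong flags;
};

/* Atomically set one flag bit (models set_bit()). */
static inline void zone_set_flag(struct zone_model *zone, zone_flags_t flag)
{
	atomic_fetch_or(&zone->flags, 1UL << flag);
}

/* Atomically clear one flag bit (models clear_bit()). */
static inline void zone_clear_flag(struct zone_model *zone, zone_flags_t flag)
{
	atomic_fetch_and(&zone->flags, ~(1UL << flag));
}

/* Read one flag bit without taking any lock (models test_bit()). */
static inline int zone_test_flag(struct zone_model *zone, zone_flags_t flag)
{
	return (int)((atomic_load(&zone->flags) >> flag) & 1UL);
}
```

One behaviour worth checking against this model: if two reclaimers both set ZONE_RECLAIM_LOCKED, the first one to finish clears the bit while the second is still running. The counter this bit replaces would have read 1 at that point.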
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -232,10 +232,7 @@ struct zone {
unsigned long nr_scan_active;
unsigned long nr_scan_inactive;
unsigned long pages_scanned; /* since last reclaim */
- int all_unreclaimable; /* All pages pinned */
-
- /* A count of how many reclaimers are scanning this zone */
- atomic_t reclaim_in_progress;
+ unsigned long flags; /* zone flags, see below */
/* Zone statistics */
atomic_long_t vm_stat[NR_VM_ZONE_STAT_ITEMS];
@@ -313,6 +310,29 @@ struct zone {
const char *name;
} ____cacheline_internodealigned_in_smp;
+typedef enum {
+ ZONE_ALL_UNRECLAIMABLE, /* all pages pinned */
+ ZONE_RECLAIM_LOCKED, /* prevents concurrent reclaim */
+} zone_flags_t;
+
+static inline void zone_set_flag(struct zone *zone, zone_flags_t flag)
+{
+ set_bit(flag, &zone->flags);
+}
+static inline void zone_clear_flag(struct zone *zone, zone_flags_t flag)
+{
+ clear_bit(flag, &zone->flags);
+}
+
+static inline int zone_is_all_unreclaimable(const struct zone *zone)
+{
+ return test_bit(ZONE_ALL_UNRECLAIMABLE, &zone->flags);
+}
+static inline int zone_is_reclaim_locked(const struct zone *zone)
+{
+ return test_bit(ZONE_RECLAIM_LOCKED, &zone->flags);
+}
+
/*
* The "priority" of VM scanning is how much of the queues we will scan in one
* go. A value of 12 for DEF_PRIORITY implies that we will scan 1/4096th of the
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -479,7 +479,7 @@ static void free_pages_bulk(struct zone *zone, int count,
struct list_head *list, int order)
{
spin_lock(&zone->lock);
- zone->all_unreclaimable = 0;
+ zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
zone->pages_scanned = 0;
while (count--) {
struct page *page;
@@ -496,7 +496,7 @@ static void free_pages_bulk(struct zone *zone, int count,
static void free_one_page(struct zone *zone, struct page *page, int order)
{
spin_lock(&zone->lock);
- zone->all_unreclaimable = 0;
+ zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
zone->pages_scanned = 0;
__free_one_page(page, zone, order);
spin_unlock(&zone->lock);
@@ -1617,7 +1617,7 @@ void show_free_areas(void)
K(zone_page_state(zone, NR_INACTIVE)),
K(zone->present_pages),
zone->pages_scanned,
- (zone->all_unreclaimable ? "yes" : "no")
+ (zone_is_all_unreclaimable(zone) ? "yes" : "no")
);
printk("lowmem_reserve[]:");
for (i = 0; i < MAX_NR_ZONES; i++)
@@ -2978,7 +2978,7 @@ static void __meminit free_area_init_core(struct pglist_data *pgdat,
zone->nr_scan_active = 0;
zone->nr_scan_inactive = 0;
zap_zone_vm_stats(zone);
- atomic_set(&zone->reclaim_in_progress, 0);
+ zone->flags = 0;
if (!size)
continue;
diff --git a/mm/vmscan.c b/mm/vmscan.c
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1067,7 +1067,7 @@ static unsigned long shrink_zone(int priority, struct zone *zone,
unsigned long nr_to_scan;
unsigned long nr_reclaimed = 0;
- atomic_inc(&zone->reclaim_in_progress);
+ zone_set_flag(zone, ZONE_RECLAIM_LOCKED);
/*
* Add one to `nr_to_scan' just to make sure that the kernel will
@@ -1108,7 +1108,7 @@ static unsigned long shrink_zone(int priority, struct zone *zone,
throttle_vm_writeout(sc->gfp_mask);
- atomic_dec(&zone->reclaim_in_progress);
+ zone_clear_flag(zone, ZONE_RECLAIM_LOCKED);
return nr_reclaimed;
}
@@ -1146,7 +1146,7 @@ static unsigned long shrink_zones(int priority, struct zone **zones,
note_zone_scanning_priority(zone, priority);
- if (zone->all_unreclaimable && priority != DEF_PRIORITY)
+ if (zone_is_all_unreclaimable(zone) && priority != DEF_PRIORITY)
continue; /* Let kswapd poll it */
sc->all_unreclaimable = 0;
@@ -1327,7 +1327,8 @@ loop_again:
if (!populated_zone(zone))
continue;
- if (zone->all_unreclaimable && priority != DEF_PRIORITY)
+ if (zone_is_all_unreclaimable(zone) &&
+ priority != DEF_PRIORITY)
continue;
if (!zone_watermark_ok(zone, order, zone->pages_high,
@@ -1362,7 +1363,8 @@ loop_again:
if (!populated_zone(zone))
continue;
- if (zone->all_unreclaimable && priority != DEF_PRIORITY)
+ if (zone_is_all_unreclaimable(zone) &&
+ priority != DEF_PRIORITY)
continue;
if (!zone_watermark_ok(zone, order, zone->pages_high,
@@ -1377,12 +1379,13 @@ loop_again:
lru_pages);
nr_reclaimed += reclaim_state->reclaimed_slab;
total_scanned += sc.nr_scanned;
- if (zone->all_unreclaimable)
+ if (zone_is_all_unreclaimable(zone))
continue;
if (nr_slab == 0 && zone->pages_scanned >=
(zone_page_state(zone, NR_ACTIVE)
+ zone_page_state(zone, NR_INACTIVE)) * 6)
- zone->all_unreclaimable = 1;
+ zone_set_flag(zone,
+ ZONE_ALL_UNRECLAIMABLE);
/*
* If we've done a decent amount of scanning and
* the reclaim ratio is low, start doing writepage
@@ -1548,7 +1551,7 @@ static unsigned long shrink_all_zones(unsigned long nr_pages, int prio,
if (!populated_zone(zone))
continue;
- if (zone->all_unreclaimable && prio != DEF_PRIORITY)
+ if (zone_is_all_unreclaimable(zone) && prio != DEF_PRIORITY)
continue;
/* For pass = 0 we don't shrink the active list */
@@ -1871,10 +1874,8 @@ int zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
* not have reclaimable pages and if we should not delay the allocation
* then do not scan.
*/
- if (!(gfp_mask & __GFP_WAIT) ||
- zone->all_unreclaimable ||
- atomic_read(&zone->reclaim_in_progress) > 0 ||
- (current->flags & PF_MEMALLOC))
+ if (!(gfp_mask & __GFP_WAIT) || zone_is_all_unreclaimable(zone) ||
+ zone_is_reclaim_locked(zone) || (current->flags & PF_MEMALLOC))
return 0;
/*
diff --git a/mm/vmstat.c b/mm/vmstat.c
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -604,7 +604,7 @@ static int zoneinfo_show(struct seq_file *m, void *arg)
"\n all_unreclaimable: %u"
"\n prev_priority: %i"
"\n start_pfn: %lu",
- zone->all_unreclaimable,
+ zone_is_all_unreclaimable(zone),
zone->prev_priority,
zone->zone_start_pfn);
spin_unlock_irqrestore(&zone->lock, flags);
* [patch 4/9] oom: add per-zone locking
2007-09-20 20:23 ` [patch 3/9] oom: change all_unreclaimable zone member to flags David Rientjes
@ 2007-09-20 20:23 ` David Rientjes
2007-09-20 20:23 ` [patch 5/9] oom: serialize out of memory calls David Rientjes
2007-09-20 21:59 ` [patch 4/9] oom: add per-zone locking Christoph Lameter
2007-09-20 21:56 ` [patch 3/9] oom: change all_unreclaimable zone member to flags Christoph Lameter
2007-09-21 8:55 ` Andrew Morton
2 siblings, 2 replies; 34+ messages in thread
From: David Rientjes @ 2007-09-20 20:23 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andrea Arcangeli, Christoph Lameter, Rik van Riel, linux-mm
OOM killer synchronization should be done with zone granularity so that
memory policy and cpuset allocations lock only their own zones and
parallel kills remain possible for OOM conditions that may exist
elsewhere in the system. DMA allocations can also be targeted at the
zone level, which would not be possible if locking were done per node
or globally.
Synchronization shall be done with a variation of "trylocks." The goal
is to put the current task to sleep and restart the failed allocation
attempt later if the trylock fails, and to invoke the OOM killer only
if it succeeds.
Each zone in the zonelist that __alloc_pages() was called with is checked
for the newly-introduced ZONE_OOM_LOCKED flag. If any zone has this flag
set, the "trylock" to serialize the OOM killer fails and
try_set_zone_oom() returns zero. Otherwise, ZONE_OOM_LOCKED is set on
every zone in the zonelist and try_set_zone_oom() returns non-zero.
Cc: Andrea Arcangeli <andrea@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
include/linux/mmzone.h | 5 ++++
include/linux/oom.h | 3 ++
mm/oom_kill.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 60 insertions(+), 0 deletions(-)
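The all-or-nothing trylock described above can be sketched in userspace like this. It is a model, not the kernel code: `struct zone_model` and the `_model` function names are hypothetical, a pthread mutex stands in for zone_scan_mutex, and the zonelist is a plain NULL-terminated array. The sketch uses for loops, so an empty list is handled; the patch uses do/while and assumes at least one zone.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct zone: only the OOM lock flag. */
struct zone_model {
	bool oom_locked;
};

/* Serializes the check-then-set sequence, like zone_scan_mutex. */
static pthread_mutex_t zone_scan_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Returns 0 if any zone in the NULL-terminated list is already locked,
 * leaving all flags untouched. Otherwise locks every zone and returns 1.
 */
static int try_set_zone_oom_model(struct zone_model **zones)
{
	struct zone_model **z;
	int ret = 1;

	pthread_mutex_lock(&zone_scan_mutex);
	for (z = zones; *z != NULL; z++) {
		if ((*z)->oom_locked) {
			ret = 0;
			goto out;
		}
	}
	/* No zone was locked: claim them all while still under the mutex. */
	for (z = zones; *z != NULL; z++)
		(*z)->oom_locked = true;
out:
	pthread_mutex_unlock(&zone_scan_mutex);
	return ret;
}

/* Unlocks every zone in the list; call only after a successful trylock. */
static void clear_zonelist_oom_model(struct zone_model **zones)
{
	struct zone_model **z;

	pthread_mutex_lock(&zone_scan_mutex);
	for (z = zones; *z != NULL; z++)
		(*z)->oom_locked = false;
	pthread_mutex_unlock(&zone_scan_mutex);
}
```

Doing the scan and the set under one mutex is what keeps two callers with overlapping zonelists from both succeeding. Because a failed attempt sets nothing, a caller that lost the race has nothing to undo and must not call the clear function.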
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -313,6 +313,7 @@ struct zone {
typedef enum {
ZONE_ALL_UNRECLAIMABLE, /* all pages pinned */
ZONE_RECLAIM_LOCKED, /* prevents concurrent reclaim */
+ ZONE_OOM_LOCKED, /* zone is in OOM killer zonelist */
} zone_flags_t;
static inline void zone_set_flag(struct zone *zone, zone_flags_t flag)
@@ -332,6 +333,10 @@ static inline int zone_is_reclaim_locked(const struct zone *zone)
{
return test_bit(ZONE_RECLAIM_LOCKED, &zone->flags);
}
+static inline int zone_is_oom_locked(const struct zone *zone)
+{
+ return test_bit(ZONE_OOM_LOCKED, &zone->flags);
+}
/*
* The "priority" of VM scanning is how much of the queues we will scan in one
diff --git a/include/linux/oom.h b/include/linux/oom.h
--- a/include/linux/oom.h
+++ b/include/linux/oom.h
@@ -20,6 +20,9 @@ enum oom_constraint {
CONSTRAINT_MEMORY_POLICY,
};
+extern int try_set_zone_oom(struct zonelist *zonelist);
+extern void clear_zonelist_oom(struct zonelist *zonelist);
+
extern void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask, int order);
extern int register_oom_notifier(struct notifier_block *nb);
extern int unregister_oom_notifier(struct notifier_block *nb);
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -27,6 +27,7 @@
#include <linux/notifier.h>
int sysctl_panic_on_oom;
+static DEFINE_MUTEX(zone_scan_mutex);
/* #define DEBUG */
/**
@@ -381,6 +382,57 @@ int unregister_oom_notifier(struct notifier_block *nb)
}
EXPORT_SYMBOL_GPL(unregister_oom_notifier);
+/*
+ * Try to acquire the OOM killer lock for the zones in zonelist. Returns zero
+ * if a parallel OOM killing is already taking place that includes a zone in
+ * the zonelist. Otherwise, locks all zones in the zonelist and returns 1.
+ */
+int try_set_zone_oom(struct zonelist *zonelist)
+{
+ struct zone **z;
+ int ret = 1;
+
+ z = zonelist->zones;
+
+ mutex_lock(&zone_scan_mutex);
+ do {
+ if (zone_is_oom_locked(*z)) {
+ ret = 0;
+ goto out;
+ }
+ } while (*(++z) != NULL);
+
+ /*
+ * Lock each zone in the zonelist under zone_scan_mutex so a parallel
+ * invocation of try_set_zone_oom() doesn't succeed when it shouldn't.
+ */
+ z = zonelist->zones;
+ do {
+ zone_set_flag(*z, ZONE_OOM_LOCKED);
+ } while (*(++z) != NULL);
+out:
+ mutex_unlock(&zone_scan_mutex);
+ return ret;
+}
+
+/*
+ * Clears the ZONE_OOM_LOCKED flag for all zones in the zonelist so that failed
+ * allocation attempts with zonelists containing them may now call the OOM
+ * killer again, if necessary.
+ */
+void clear_zonelist_oom(struct zonelist *zonelist)
+{
+ struct zone **z;
+
+ z = zonelist->zones;
+
+ mutex_lock(&zone_scan_mutex);
+ do {
+ zone_clear_flag(*z, ZONE_OOM_LOCKED);
+ } while (*(++z) != NULL);
+ mutex_unlock(&zone_scan_mutex);
+}
+
/**
* out_of_memory - kill the "best" process when we run out of memory
*
* [patch 5/9] oom: serialize out of memory calls
2007-09-20 20:23 ` [patch 4/9] oom: add per-zone locking David Rientjes
@ 2007-09-20 20:23 ` David Rientjes
2007-09-20 20:23 ` [patch 6/9] oom: add oom_kill_asking_task sysctl David Rientjes
` (2 more replies)
2007-09-20 21:59 ` [patch 4/9] oom: add per-zone locking Christoph Lameter
1 sibling, 3 replies; 34+ messages in thread
From: David Rientjes @ 2007-09-20 20:23 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andrea Arcangeli, Christoph Lameter, Rik van Riel, linux-mm
Before invoking the OOM killer, a final allocation attempt is made with a
very high watermark. The zonelist's OOM lock is taken before this
attempt rather than after it; otherwise a task could acquire the lock
and invoke the OOM killer even though a parallel kill has already freed
enough memory for the allocation to succeed. If the lock is contended,
the task is put to sleep and the allocation is retried when it is
rescheduled.
Cc: Andrea Arcangeli <andrea@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
mm/page_alloc.c | 14 ++++++++++++--
1 files changed, 12 insertions(+), 2 deletions(-)
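The control flow this patch adds to the allocator slow path can be modelled as below. The point of the sketch is that every exit taken after a successful trylock drops the lock again. All names are hypothetical, the lock is reduced to a single flag, and MODEL_COSTLY_ORDER is an assumed stand-in for PAGE_ALLOC_COSTLY_ORDER.

```c
/* Assumed stand-in for PAGE_ALLOC_COSTLY_ORDER. */
#define MODEL_COSTLY_ORDER 3

enum slowpath_result {
	SLOWPATH_GOT_PAGE,	/* models "goto got_pg" */
	SLOWPATH_NOPAGE,	/* models "goto nopage" */
	SLOWPATH_RESTART,	/* models "goto restart" */
};

/* Hypothetical model state: one lock flag and a count of OOM kills. */
struct oom_model {
	int locked;
	int kills;
};

static int model_trylock(struct oom_model *m)
{
	if (m->locked)
		return 0;
	m->locked = 1;
	return 1;
}

static void model_unlock(struct oom_model *m)
{
	m->locked = 0;
}

/*
 * One pass through the OOM branch. last_try_succeeds models the result of
 * the final high-watermark allocation attempt.
 */
static enum slowpath_result oom_slowpath_model(struct oom_model *m,
					       int last_try_succeeds,
					       unsigned int order)
{
	if (!model_trylock(m))
		return SLOWPATH_RESTART;	/* the patch sleeps one tick first */

	if (last_try_succeeds) {
		model_unlock(m);
		return SLOWPATH_GOT_PAGE;
	}

	/* The OOM killer will not help higher order allocations. */
	if (order > MODEL_COSTLY_ORDER) {
		model_unlock(m);
		return SLOWPATH_NOPAGE;
	}

	m->kills++;			/* models out_of_memory() */
	model_unlock(m);
	return SLOWPATH_RESTART;
}
```

A task that finds the lock held restarts without killing anything, which is the serialization the series is after.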
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1353,6 +1353,11 @@ nofail_alloc:
if (page)
goto got_pg;
} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
+ if (!try_set_zone_oom(zonelist)) {
+ schedule_timeout_uninterruptible(1);
+ goto restart;
+ }
+
/*
* Go through the zonelist yet one more time, keep
* very high watermark here, this is only to catch
@@ -1361,14 +1366,19 @@ nofail_alloc:
*/
page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, order,
zonelist, ALLOC_WMARK_HIGH|ALLOC_CPUSET);
- if (page)
+ if (page) {
+ clear_zonelist_oom(zonelist);
goto got_pg;
+ }
/* The OOM killer will not help higher order allocs so fail */
- if (order > PAGE_ALLOC_COSTLY_ORDER)
+ if (order > PAGE_ALLOC_COSTLY_ORDER) {
+ clear_zonelist_oom(zonelist);
goto nopage;
+ }
out_of_memory(zonelist, gfp_mask, order);
+ clear_zonelist_oom(zonelist);
goto restart;
}
* [patch 6/9] oom: add oom_kill_asking_task sysctl
2007-09-20 20:23 ` [patch 5/9] oom: serialize out of memory calls David Rientjes
@ 2007-09-20 20:23 ` David Rientjes
2007-09-20 20:23 ` [patch 7/9] oom: suppress extraneous stack and memory dump David Rientjes
` (2 more replies)
2007-09-20 21:59 ` [patch 5/9] oom: serialize out of memory calls Christoph Lameter
2007-09-21 9:01 ` Andrew Morton
2 siblings, 3 replies; 34+ messages in thread
From: David Rientjes @ 2007-09-20 20:23 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andrea Arcangeli, Christoph Lameter, Rik van Riel, linux-mm
Adds a new sysctl, 'oom_kill_asking_task', which, when set to non-zero,
makes the OOM killer kill the OOM-triggering task instead of scanning
through the tasklist to find a memory-hogging target. This is helpful
for systems with an insanely large number of tasks, where scanning the
tasklist significantly degrades performance.
Cc: Andrea Arcangeli <andrea@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
Documentation/sysctl/vm.txt | 22 ++++++++++++++++++++++
kernel/sysctl.c | 9 +++++++++
mm/oom_kill.c | 13 ++++++++-----
3 files changed, 39 insertions(+), 5 deletions(-)
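How the two sysctls interact with the allocation constraint can be summarized as a small decision function. This is a sketch of the switch in out_of_memory() as it reads with this patch applied; `oom_decide()` and `enum oom_action` are hypothetical names, and the `panic_on_oom == 2` check is taken from the existing code ahead of the switch.

```c
enum oom_constraint {
	CONSTRAINT_NONE,
	CONSTRAINT_CPUSET,
	CONSTRAINT_MEMORY_POLICY,
};

/* Hypothetical summary of what out_of_memory() ends up doing. */
enum oom_action {
	OOM_PANIC,
	OOM_KILL_CURRENT,
	OOM_SCAN_TASKLIST,
};

static enum oom_action oom_decide(enum oom_constraint constraint,
				  int panic_on_oom, int kill_asking_task)
{
	/* Compulsory panic is checked before the constraint is considered. */
	if (panic_on_oom == 2)
		return OOM_PANIC;

	switch (constraint) {
	case CONSTRAINT_MEMORY_POLICY:
		return OOM_KILL_CURRENT;
	case CONSTRAINT_NONE:
		if (panic_on_oom)
			return OOM_PANIC;
		/* fall through */
	case CONSTRAINT_CPUSET:
		if (kill_asking_task)
			return OOM_KILL_CURRENT;
		return OOM_SCAN_TASKLIST;
	}
	return OOM_SCAN_TASKLIST;
}
```

Two consequences follow from the fall-through. panic_on_oom set to 1 wins over oom_kill_asking_task only for unconstrained allocations. And with the sysctl left at its default of 0, a cpuset-constrained OOM now scans the tasklist, where it previously always killed current.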
diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -31,6 +31,7 @@ Currently, these files are in /proc/sys/vm:
- min_unmapped_ratio
- min_slab_ratio
- panic_on_oom
+- oom_kill_asking_task
- mmap_min_address
- numa_zonelist_order
@@ -220,6 +221,27 @@ The default value is 0.
1 and 2 are for failover of clustering. Please select either
according to your policy of failover.
+=============================================================
+
+oom_kill_asking_task
+
+This enables or disables killing the OOM-triggering task in
+out-of-memory situations.
+
+If this is set to zero, the OOM killer will scan through the entire
+tasklist and select a task based on heuristics to kill. This normally
+selects a rogue memory-hogging task that frees up a large amount of
+memory when killed.
+
+If this is set to non-zero, the OOM killer simply kills the task that
+triggered the out-of-memory condition. This avoids the expensive
+tasklist scan.
+
+If panic_on_oom is selected, it takes precedence over whatever value
+is used in oom_kill_asking_task.
+
+The default value is 0.
+
==============================================================
mmap_min_addr
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -63,6 +63,7 @@ extern int print_fatal_signals;
extern int sysctl_overcommit_memory;
extern int sysctl_overcommit_ratio;
extern int sysctl_panic_on_oom;
+extern int sysctl_oom_kill_asking_task;
extern int max_threads;
extern int core_uses_pid;
extern int suid_dumpable;
@@ -798,6 +799,14 @@ static ctl_table vm_table[] = {
.proc_handler = &proc_dointvec,
},
{
+ .ctl_name = CTL_UNNUMBERED,
+ .procname = "oom_kill_asking_task",
+ .data = &sysctl_oom_kill_asking_task,
+ .maxlen = sizeof(sysctl_oom_kill_asking_task),
+ .mode = 0644,
+ .proc_handler = &proc_dointvec,
+ },
+ {
.ctl_name = VM_OVERCOMMIT_RATIO,
.procname = "overcommit_ratio",
.data = &sysctl_overcommit_ratio,
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -27,6 +27,7 @@
#include <linux/notifier.h>
int sysctl_panic_on_oom;
+int sysctl_oom_kill_asking_task;
static DEFINE_MUTEX(zone_scan_mutex);
/* #define DEBUG */
@@ -478,14 +479,16 @@ void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask, int order)
"No available memory (MPOL_BIND)");
break;
- case CONSTRAINT_CPUSET:
- oom_kill_process(current, points,
- "No available memory in cpuset");
- break;
-
case CONSTRAINT_NONE:
if (sysctl_panic_on_oom)
panic("out of memory. panic_on_oom is selected\n");
+ /* Fall-through */
+ case CONSTRAINT_CPUSET:
+ if (sysctl_oom_kill_asking_task) {
+ oom_kill_process(current, points,
+ "Out of memory (oom_kill_asking_task)");
+ break;
+ }
retry:
/*
* Rambo mode: Shoot down a process and hope it solves whatever
* [patch 7/9] oom: suppress extraneous stack and memory dump
2007-09-20 20:23 ` [patch 6/9] oom: add oom_kill_asking_task sysctl David Rientjes
@ 2007-09-20 20:23 ` David Rientjes
2007-09-20 20:23 ` [patch 8/9] oom: compare cpuset mems_allowed instead of exclusive ancestors David Rientjes
2007-09-20 22:00 ` [patch 7/9] oom: suppress extraneous stack and memory dump Christoph Lameter
2007-09-20 22:03 ` [patch 6/9] oom: add oom_kill_asking_task sysctl Christoph Lameter
2007-09-21 9:05 ` Andrew Morton
2 siblings, 2 replies; 34+ messages in thread
From: David Rientjes @ 2007-09-20 20:23 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andrea Arcangeli, Christoph Lameter, Rik van Riel, linux-mm
Suppresses the extraneous stack and memory dump when a parallel OOM
killing has been found. There's no need to fill the ring buffer with
this information if it's already been printed and the condition that
triggered the previous OOM killer has not yet been alleviated.
Cc: Andrea Arcangeli <andrea@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
mm/oom_kill.c | 27 ++++++++++++++-------------
1 files changed, 14 insertions(+), 13 deletions(-)
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -340,12 +340,20 @@ static int oom_kill_task(struct task_struct *p)
return 0;
}
-static int oom_kill_process(struct task_struct *p, unsigned long points,
- const char *message)
+static int oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
+ unsigned long points, const char *message)
{
struct task_struct *c;
struct list_head *tsk;
+ if (printk_ratelimit()) {
+ printk(KERN_WARNING "%s invoked oom-killer: "
+ "gfp_mask=0x%x, order=%d, oomkilladj=%d\n",
+ current->comm, gfp_mask, order, current->oomkilladj);
+ dump_stack();
+ show_mem();
+ }
+
/*
* If the task is already exiting, don't alarm the sysadmin or kill
* its children or threads, just set TIF_MEMDIE so it can die quickly
@@ -454,14 +462,6 @@ void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask, int order)
/* Got some memory back in the last second. */
return;
- if (printk_ratelimit()) {
- printk(KERN_WARNING "%s invoked oom-killer: "
- "gfp_mask=0x%x, order=%d, oomkilladj=%d\n",
- current->comm, gfp_mask, order, current->oomkilladj);
- dump_stack();
- show_mem();
- }
-
if (sysctl_panic_on_oom == 2)
panic("out of memory. Compulsory panic_on_oom is selected.\n");
@@ -475,7 +475,7 @@ void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask, int order)
switch (constraint) {
case CONSTRAINT_MEMORY_POLICY:
- oom_kill_process(current, points,
+ oom_kill_process(current, gfp_mask, order, points,
"No available memory (MPOL_BIND)");
break;
@@ -485,7 +485,7 @@ void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask, int order)
/* Fall-through */
case CONSTRAINT_CPUSET:
if (sysctl_oom_kill_asking_task) {
- oom_kill_process(current, points,
+ oom_kill_process(current, gfp_mask, order, points,
"Out of memory (oom_kill_asking_task)");
break;
}
@@ -506,7 +506,8 @@ retry:
panic("Out of memory and no killable processes...\n");
}
- if (oom_kill_process(p, points, "Out of memory"))
+ if (oom_kill_process(p, gfp_mask, order, points,
+ "Out of memory"))
goto retry;
break;
* [patch 8/9] oom: compare cpuset mems_allowed instead of exclusive ancestors
2007-09-20 20:23 ` [patch 7/9] oom: suppress extraneous stack and memory dump David Rientjes
@ 2007-09-20 20:23 ` David Rientjes
2007-09-20 20:23 ` [patch 9/9] oom: do not take callback_mutex David Rientjes
2007-09-20 22:01 ` [patch 8/9] oom: compare cpuset mems_allowed instead of exclusive ancestors Christoph Lameter
2007-09-20 22:00 ` [patch 7/9] oom: suppress extraneous stack and memory dump Christoph Lameter
1 sibling, 2 replies; 34+ messages in thread
From: David Rientjes @ 2007-09-20 20:23 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andrea Arcangeli, Christoph Lameter, Rik van Riel, linux-mm
Instead of testing for overlap in the memory nodes of the nearest
exclusive ancestor of both current and the candidate task, it is better
to simply test for intersection between the two tasks' mems_allowed in
their task descriptors. This does not require taking callback_mutex
since the result is only used as a hint in the badness scoring.
Tasks whose mems_allowed does not intersect that of the current task
are not explicitly excluded from being OOM killed because it is quite
possible that the candidate task allocated memory there before and has
since changed its mems_allowed.
Cc: Andrea Arcangeli <andrea@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
include/linux/cpuset.h | 6 ++++--
kernel/cpuset.c | 43 +++++++++++--------------------------------
mm/oom_kill.c | 2 +-
3 files changed, 16 insertions(+), 35 deletions(-)
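The intersection test described above can be sketched in userspace with a plain bitmask per task, one bit per memory node. Everything here is a model under stated assumptions: `struct task_model` and the `_model` functions are hypothetical, and in the kernel the comparison would presumably be done with nodes_intersects() on the two nodemask_t values. The divisor applied when the masks do not intersect is an assumption about how badness() uses the hint, not something this excerpt shows.

```c
#include <stdint.h>

/* Hypothetical stand-in for task_struct: one bit per allowed memory node. */
struct task_model {
	uint64_t mems_allowed;
};

/* Non-zero if the two tasks share at least one allowed memory node. */
static int mems_allowed_intersects_model(const struct task_model *tsk1,
					 const struct task_model *tsk2)
{
	return (tsk1->mems_allowed & tsk2->mems_allowed) != 0;
}

/* Assumed penalty for candidates with no node in common with current. */
#define MODEL_NO_OVERLAP_DIVISOR 8

/*
 * Use the intersection as a hint only: a non-overlapping candidate keeps
 * a reduced score instead of being excluded outright.
 */
static unsigned long adjust_points_model(unsigned long points,
					 const struct task_model *current_task,
					 const struct task_model *candidate)
{
	if (!mems_allowed_intersects_model(current_task, candidate))
		points /= MODEL_NO_OVERLAP_DIVISOR;
	return points;
}
```

Scaling the score down rather than zeroing it matches the reasoning in the description: a task with disjoint mems_allowed may still hold memory on the nodes in question from before its mask changed.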
diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -45,7 +45,8 @@ static int inline cpuset_zone_allowed_hardwall(struct zone *z, gfp_t gfp_mask)
__cpuset_zone_allowed_hardwall(z, gfp_mask);
}
-extern int cpuset_excl_nodes_overlap(const struct task_struct *p);
+extern int cpuset_mems_allowed_intersects(const struct task_struct *tsk1,
+ const struct task_struct *tsk2);
#define cpuset_memory_pressure_bump() \
do { \
@@ -113,7 +114,8 @@ static inline int cpuset_zone_allowed_hardwall(struct zone *z, gfp_t gfp_mask)
return 1;
}
-static inline int cpuset_excl_nodes_overlap(const struct task_struct *p)
+static inline int cpuset_mems_allowed_intersects(const struct task_struct *tsk1,
+ const struct task_struct *tsk2)
{
return 1;
}
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -2566,41 +2566,20 @@ int cpuset_mem_spread_node(void)
EXPORT_SYMBOL_GPL(cpuset_mem_spread_node);
/**
- * cpuset_excl_nodes_overlap - Do we overlap @p's mem_exclusive ancestors?
- * @p: pointer to task_struct of some other task.
- *
- * Description: Return true if the nearest mem_exclusive ancestor
- * cpusets of tasks @p and current overlap. Used by oom killer to
- * determine if task @p's memory usage might impact the memory
- * available to the current task.
- *
- * Call while holding callback_mutex.
+ * cpuset_mems_allowed_intersects - Does @tsk1's mems_allowed intersect @tsk2's?
+ * @tsk1: pointer to task_struct of some task.
+ * @tsk2: pointer to task_struct of some other task.
+ *
+ * Description: Return true if @tsk1's mems_allowed intersects the
+ * mems_allowed of @tsk2. Used by the OOM killer to determine if
+ * one of the task's memory usage might impact the memory available
+ * to the other.
**/
-int cpuset_excl_nodes_overlap(const struct task_struct *p)
+int cpuset_mems_allowed_intersects(const struct task_struct *tsk1,
+ const struct task_struct *tsk2)
{
- const struct cpuset *cs1, *cs2; /* my and p's cpuset ancestors */
- int overlap = 1; /* do cpusets overlap? */
-
- task_lock(current);
- if (current->flags & PF_EXITING) {
- task_unlock(current);
- goto done;
- }
- cs1 = nearest_exclusive_ancestor(current->cpuset);
- task_unlock(current);
-
- task_lock((struct task_struct *)p);
- if (p->flags & PF_EXITING) {
- task_unlock((struct task_struct *)p);
- goto done;
- }
- cs2 = nearest_exclusive_ancestor(p->cpuset);
- task_unlock((struct task_struct *)p);
-
- overlap = nodes_intersects(cs1->mems_allowed, cs2->mems_allowed);
-done:
- return overlap;
+ return nodes_intersects(tsk1->mems_allowed, tsk2->mems_allowed);
}
/*
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -143,7 +143,7 @@ unsigned long badness(struct task_struct *p, unsigned long uptime)
* because p may have allocated or otherwise mapped memory on
* this node before. However it will be less likely.
*/
- if (!cpuset_excl_nodes_overlap(p))
+ if (!cpuset_mems_allowed_intersects(current, p))
points /= 8;
/*
* [patch 9/9] oom: do not take callback_mutex
2007-09-20 20:23 ` [patch 8/9] oom: compare cpuset mems_allowed instead of exclusive ancestors David Rientjes
@ 2007-09-20 20:23 ` David Rientjes
2007-09-20 22:04 ` Christoph Lameter
2007-09-20 22:01 ` [patch 8/9] oom: compare cpuset mems_allowed instead of exclusive ancestors Christoph Lameter
1 sibling, 1 reply; 34+ messages in thread
From: David Rientjes @ 2007-09-20 20:23 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andrea Arcangeli, Christoph Lameter, Rik van Riel, linux-mm
Since no task descriptor's 'cpuset' field is dereferenced in the
execution of the OOM killer anymore, it is no longer necessary to take
callback_mutex.
Cc: Andrea Arcangeli <andrea@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
include/linux/cpuset.h | 6 ------
kernel/cpuset.c | 27 ---------------------------
mm/oom_kill.c | 3 ---
3 files changed, 0 insertions(+), 36 deletions(-)
diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -59,9 +59,6 @@ extern void __cpuset_memory_pressure_bump(void);
extern const struct file_operations proc_cpuset_operations;
extern char *cpuset_task_status_allowed(struct task_struct *task, char *buffer);
-extern void cpuset_lock(void);
-extern void cpuset_unlock(void);
-
extern int cpuset_mem_spread_node(void);
static inline int cpuset_do_page_mem_spread(void)
@@ -128,9 +125,6 @@ static inline char *cpuset_task_status_allowed(struct task_struct *task,
return buffer;
}
-static inline void cpuset_lock(void) {}
-static inline void cpuset_unlock(void) {}
-
static inline int cpuset_mem_spread_node(void)
{
return 0;
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -2501,33 +2501,6 @@ int __cpuset_zone_allowed_hardwall(struct zone *z, gfp_t gfp_mask)
}
/**
- * cpuset_lock - lock out any changes to cpuset structures
- *
- * The out of memory (oom) code needs to mutex_lock cpusets
- * from being changed while it scans the tasklist looking for a
- * task in an overlapping cpuset. Expose callback_mutex via this
- * cpuset_lock() routine, so the oom code can lock it, before
- * locking the task list. The tasklist_lock is a spinlock, so
- * must be taken inside callback_mutex.
- */
-
-void cpuset_lock(void)
-{
- mutex_lock(&callback_mutex);
-}
-
-/**
- * cpuset_unlock - release lock on cpuset changes
- *
- * Undo the lock taken in a previous cpuset_lock() call.
- */
-
-void cpuset_unlock(void)
-{
- mutex_unlock(&callback_mutex);
-}
-
-/**
* cpuset_mem_spread_node() - On which node to begin search for a page
*
* If a task is marked PF_SPREAD_PAGE or PF_SPREAD_SLAB (as for
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -470,7 +470,6 @@ void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask, int order)
* NUMA) that may require different handling.
*/
constraint = constrained_alloc(zonelist, gfp_mask);
- cpuset_lock();
read_lock(&tasklist_lock);
switch (constraint) {
@@ -502,7 +501,6 @@ retry:
/* Found nothing?!?! Either we hang forever, or we panic. */
if (!p) {
read_unlock(&tasklist_lock);
- cpuset_unlock();
panic("Out of memory and no killable processes...\n");
}
@@ -515,7 +513,6 @@ retry:
out:
read_unlock(&tasklist_lock);
- cpuset_unlock();
/*
* Give "p" a good chance of killing itself before we
* Re: [patch 3/9] oom: change all_unreclaimable zone member to flags
2007-09-20 20:23 ` [patch 3/9] oom: change all_unreclaimable zone member to flags David Rientjes
2007-09-20 20:23 ` [patch 4/9] oom: add per-zone locking David Rientjes
@ 2007-09-20 21:56 ` Christoph Lameter
2007-09-20 21:58 ` David Rientjes
2007-09-21 8:55 ` Andrew Morton
2 siblings, 1 reply; 34+ messages in thread
From: Christoph Lameter @ 2007-09-20 21:56 UTC (permalink / raw)
To: David Rientjes; +Cc: Andrew Morton, Andrea Arcangeli, Rik van Riel, linux-mm
> All flag operators are of the atomic variety because there are currently
> readers that are implemented that do not take zone->lock.
Acked-by: Christoph Lameter <clameter@sgi.com>
Additional work needed though: The setting of the reclaim flag can be
removed from outside of zone reclaim. A testset when zone reclaim starts
and a clear when it ends is enough.
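A userspace sketch of the test-and-set idea, using C11 atomics in place of the kernel bitops. The struct zone here, the bit value, and the function names are hypothetical; the point is only that a single atomic test-and-set on entry plus a clear on exit is enough to keep a second reclaimer out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define ZONE_RECLAIM_LOCKED (1u << 1)	/* hypothetical bit position */

/* Hypothetical zone with only the flags word. */
struct zone {
	atomic_uint flags;
};

/* Atomically set the bit; true if this caller was the one to set it. */
static bool zone_reclaim_trylock(struct zone *z)
{
	unsigned int old;

	assert(z != NULL);
	old = atomic_fetch_or(&z->flags, ZONE_RECLAIM_LOCKED);
	return !(old & ZONE_RECLAIM_LOCKED);
}

/* Clear the bit when reclaim ends. */
static void zone_reclaim_unlock(struct zone *z)
{
	atomic_fetch_and(&z->flags, ~ZONE_RECLAIM_LOCKED);
}

/* Returns 1 if reclaim ran, 0 if another caller already holds the flag. */
static int zone_reclaim_sketch(struct zone *z)
{
	if (!zone_reclaim_trylock(z))
		return 0;
	/* ... the actual reclaim work would go here ... */
	zone_reclaim_unlock(z);
	return 1;
}
```

No caller outside zone_reclaim_sketch() needs to touch the flag, which is the simplification being suggested.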
* Re: [patch 3/9] oom: change all_unreclaimable zone member to flags
2007-09-20 21:56 ` [patch 3/9] oom: change all_unreclaimable zone member to flags Christoph Lameter
@ 2007-09-20 21:58 ` David Rientjes
0 siblings, 0 replies; 34+ messages in thread
From: David Rientjes @ 2007-09-20 21:58 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Andrew Morton, Andrea Arcangeli, Rik van Riel, linux-mm
On Thu, 20 Sep 2007, Christoph Lameter wrote:
> Additional work needed though: The setting of the reclaim flag can be
> removed from outside of zone reclaim. A testset when zone reclaim starts
> and a clear when it ends is enough.
>
Ok, I'll queue this for after we get this patchset merged into -mm.
* Re: [patch 4/9] oom: add per-zone locking
2007-09-20 20:23 ` [patch 4/9] oom: add per-zone locking David Rientjes
2007-09-20 20:23 ` [patch 5/9] oom: serialize out of memory calls David Rientjes
@ 2007-09-20 21:59 ` Christoph Lameter
2007-09-20 22:03 ` David Rientjes
1 sibling, 1 reply; 34+ messages in thread
From: Christoph Lameter @ 2007-09-20 21:59 UTC (permalink / raw)
To: David Rientjes; +Cc: Andrew Morton, Andrea Arcangeli, Rik van Riel, linux-mm
On Thu, 20 Sep 2007, David Rientjes wrote:
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -27,6 +27,7 @@
> #include <linux/notifier.h>
>
> int sysctl_panic_on_oom;
> +static DEFINE_MUTEX(zone_scan_mutex);
> /* #define DEBUG */
Use testset/testclear bitops instead of adding a lock?
* Re: [patch 5/9] oom: serialize out of memory calls
2007-09-20 20:23 ` [patch 5/9] oom: serialize out of memory calls David Rientjes
2007-09-20 20:23 ` [patch 6/9] oom: add oom_kill_asking_task sysctl David Rientjes
@ 2007-09-20 21:59 ` Christoph Lameter
2007-09-21 9:01 ` Andrew Morton
2 siblings, 0 replies; 34+ messages in thread
From: Christoph Lameter @ 2007-09-20 21:59 UTC (permalink / raw)
To: David Rientjes; +Cc: Andrew Morton, Andrea Arcangeli, Rik van Riel, linux-mm
* Re: [patch 7/9] oom: suppress extraneous stack and memory dump
2007-09-20 20:23 ` [patch 7/9] oom: suppress extraneous stack and memory dump David Rientjes
2007-09-20 20:23 ` [patch 8/9] oom: compare cpuset mems_allowed instead of exclusive ancestors David Rientjes
@ 2007-09-20 22:00 ` Christoph Lameter
1 sibling, 0 replies; 34+ messages in thread
From: Christoph Lameter @ 2007-09-20 22:00 UTC (permalink / raw)
To: David Rientjes; +Cc: Andrew Morton, Andrea Arcangeli, Rik van Riel, linux-mm
* Re: [patch 8/9] oom: compare cpuset mems_allowed instead of exclusive ancestors
2007-09-20 20:23 ` [patch 8/9] oom: compare cpuset mems_allowed instead of exclusive ancestors David Rientjes
2007-09-20 20:23 ` [patch 9/9] oom: do not take callback_mutex David Rientjes
@ 2007-09-20 22:01 ` Christoph Lameter
1 sibling, 0 replies; 34+ messages in thread
From: Christoph Lameter @ 2007-09-20 22:01 UTC (permalink / raw)
To: David Rientjes; +Cc: Andrew Morton, Andrea Arcangeli, Rik van Riel, linux-mm
* Re: [patch 4/9] oom: add per-zone locking
2007-09-20 21:59 ` [patch 4/9] oom: add per-zone locking Christoph Lameter
@ 2007-09-20 22:03 ` David Rientjes
2007-09-20 22:05 ` Christoph Lameter
0 siblings, 1 reply; 34+ messages in thread
From: David Rientjes @ 2007-09-20 22:03 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Andrew Morton, Andrea Arcangeli, Rik van Riel, linux-mm
On Thu, 20 Sep 2007, Christoph Lameter wrote:
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -27,6 +27,7 @@
> > #include <linux/notifier.h>
> >
> > int sysctl_panic_on_oom;
> > +static DEFINE_MUTEX(zone_scan_mutex);
> > /* #define DEBUG */
>
> Use testset/testclear bitops instead of adding a lock?
>
That doesn't work nicely, unfortunately, because then we need to unlock
all zones that we've locked so far in try_set_zone_oom() if we find one
that is already ZONE_OOM_LOCKED during the scan of the zonelist.
* Re: [patch 6/9] oom: add oom_kill_asking_task sysctl
2007-09-20 20:23 ` [patch 6/9] oom: add oom_kill_asking_task sysctl David Rientjes
2007-09-20 20:23 ` [patch 7/9] oom: suppress extraneous stack and memory dump David Rientjes
@ 2007-09-20 22:03 ` Christoph Lameter
2007-09-20 22:07 ` David Rientjes
2007-09-21 9:05 ` Andrew Morton
2 siblings, 1 reply; 34+ messages in thread
From: Christoph Lameter @ 2007-09-20 22:03 UTC (permalink / raw)
To: David Rientjes
Cc: Andrew Morton, Andrea Arcangeli, Rik van Riel, linux-mm, pj
Acked-by: Christoph Lameter <clameter@sgi.com>
Maybe we need this also for unconstrained allocations?
* Re: [patch 9/9] oom: do not take callback_mutex
2007-09-20 20:23 ` [patch 9/9] oom: do not take callback_mutex David Rientjes
@ 2007-09-20 22:04 ` Christoph Lameter
0 siblings, 0 replies; 34+ messages in thread
From: Christoph Lameter @ 2007-09-20 22:04 UTC (permalink / raw)
To: David Rientjes; +Cc: Andrew Morton, Andrea Arcangeli, Rik van Riel, linux-mm
* Re: [patch 4/9] oom: add per-zone locking
2007-09-20 22:03 ` David Rientjes
@ 2007-09-20 22:05 ` Christoph Lameter
2007-09-20 22:12 ` David Rientjes
0 siblings, 1 reply; 34+ messages in thread
From: Christoph Lameter @ 2007-09-20 22:05 UTC (permalink / raw)
To: David Rientjes; +Cc: Andrew Morton, Andrea Arcangeli, Rik van Riel, linux-mm
On Thu, 20 Sep 2007, David Rientjes wrote:
> On Thu, 20 Sep 2007, Christoph Lameter wrote:
>
> > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > > --- a/mm/oom_kill.c
> > > +++ b/mm/oom_kill.c
> > > @@ -27,6 +27,7 @@
> > > #include <linux/notifier.h>
> > >
> > > int sysctl_panic_on_oom;
> > > +static DEFINE_MUTEX(zone_scan_mutex);
> > > /* #define DEBUG */
> >
> > Use testset/testclear bitops instead of adding a lock?
> >
>
> That doesn't work nicely, unfortunately, because then we need to unlock
> all zones that we've locked so far in try_set_zone_oom() if we find one
> that is already ZONE_OOM_LOCKED during the scan of the zonelist.
You need that lock release function anyway for when the OOM killing is
done.
* Re: [patch 6/9] oom: add oom_kill_asking_task sysctl
2007-09-20 22:03 ` [patch 6/9] oom: add oom_kill_asking_task sysctl Christoph Lameter
@ 2007-09-20 22:07 ` David Rientjes
2007-09-20 22:09 ` Christoph Lameter
0 siblings, 1 reply; 34+ messages in thread
From: David Rientjes @ 2007-09-20 22:07 UTC (permalink / raw)
To: Christoph Lameter
Cc: Andrew Morton, Andrea Arcangeli, Rik van Riel, linux-mm, pj
On Thu, 20 Sep 2007, Christoph Lameter wrote:
> Acked-by: Christoph Lameter <clameter@sgi.com>
>
> Maybe we need this also for unconstrained allocations?
>
It already is; here's the relevant code. CONSTRAINT_NONE falls through to
check sysctl_oom_kill_asking_task. CONSTRAINT_MEMORY_POLICY will be
modified in a separate patchset since it doesn't have anything to do with
the serialization.
[ Ok, well modifying CONSTRAINT_CPUSET didn't really have anything to do
with serialization either, but it's included in this patchset so we can
eliminate the need to take callback_mutex. ]
@@ -478,14 +479,16 @@ void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask, int order)
"No available memory (MPOL_BIND)");
break;
- case CONSTRAINT_CPUSET:
- oom_kill_process(current, points,
- "No available memory in cpuset");
- break;
-
case CONSTRAINT_NONE:
if (sysctl_panic_on_oom)
panic("out of memory. panic_on_oom is selected\n");
+ /* Fall-through */
+ case CONSTRAINT_CPUSET:
+ if (sysctl_oom_kill_asking_task) {
+ oom_kill_process(current, points,
+ "Out of memory (oom_kill_asking_task)");
+ break;
+ }
retry:
/*
* Rambo mode: Shoot down a process and hope it solves whatever
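The control flow of that switch can be sketched as a pure function, which makes the fall-through easier to see. This is a userspace model under stated assumptions: enum oom_action, struct oom_sysctls and oom_pick_action() are hypothetical names, and the CONSTRAINT_MEMORY_POLICY arm is assumed to kill current, which the hunk above only hints at through its context lines.

```c
#include <assert.h>
#include <stddef.h>

enum oom_constraint {
	CONSTRAINT_NONE,
	CONSTRAINT_CPUSET,
	CONSTRAINT_MEMORY_POLICY,
};

/* Hypothetical: what out_of_memory() would end up doing. */
enum oom_action {
	OOM_ACTION_PANIC,
	OOM_ACTION_KILL_CURRENT,
	OOM_ACTION_SCAN_TASKLIST,
};

/* Hypothetical bundle of the two sysctls consulted by the switch. */
struct oom_sysctls {
	int panic_on_oom;
	int oom_kill_asking_task;
};

static enum oom_action oom_pick_action(enum oom_constraint constraint,
				       const struct oom_sysctls *s)
{
	assert(s != NULL);
	switch (constraint) {
	case CONSTRAINT_MEMORY_POLICY:
		/* Assumed: the MPOL_BIND case kills the allocating task. */
		return OOM_ACTION_KILL_CURRENT;
	case CONSTRAINT_NONE:
		if (s->panic_on_oom)
			return OOM_ACTION_PANIC;
		/* fall through */
	case CONSTRAINT_CPUSET:
		if (s->oom_kill_asking_task)
			return OOM_ACTION_KILL_CURRENT;
		break;
	}
	/* Neither shortcut applied: pick a victim from the tasklist. */
	return OOM_ACTION_SCAN_TASKLIST;
}
```

Note that in this model panic_on_oom is only consulted for CONSTRAINT_NONE, while oom_kill_asking_task applies to both the unconstrained and the cpuset-constrained case.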
* Re: [patch 6/9] oom: add oom_kill_asking_task sysctl
2007-09-20 22:07 ` David Rientjes
@ 2007-09-20 22:09 ` Christoph Lameter
0 siblings, 0 replies; 34+ messages in thread
From: Christoph Lameter @ 2007-09-20 22:09 UTC (permalink / raw)
To: David Rientjes
Cc: Andrew Morton, Andrea Arcangeli, Rik van Riel, linux-mm, pj
On Thu, 20 Sep 2007, David Rientjes wrote:
> It already is; here's the relevant code. CONSTRAINT_NONE falls through to
> check sysctl_oom_kill_asking_task. CONSTRAINT_MEMORY_POLICY will be
> modified in a separate patchset since it doesn't have anything to do with
> the serialization.
>
> [ Ok, well modifying CONSTRAINT_CPUSET didn't really have anything to do
> with serialization either, but it's included in this patchset so we can
> eliminate the need to take callback_mutex. ]
Good work.
* Re: [patch 4/9] oom: add per-zone locking
2007-09-20 22:05 ` Christoph Lameter
@ 2007-09-20 22:12 ` David Rientjes
2007-09-20 22:26 ` Christoph Lameter
0 siblings, 1 reply; 34+ messages in thread
From: David Rientjes @ 2007-09-20 22:12 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Andrew Morton, Andrea Arcangeli, Rik van Riel, linux-mm
On Thu, 20 Sep 2007, Christoph Lameter wrote:
> > > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > > > --- a/mm/oom_kill.c
> > > > +++ b/mm/oom_kill.c
> > > > @@ -27,6 +27,7 @@
> > > > #include <linux/notifier.h>
> > > >
> > > > int sysctl_panic_on_oom;
> > > > +static DEFINE_MUTEX(zone_scan_mutex);
> > > > /* #define DEBUG */
> > >
> > > Use testset/testclear bitops instead of adding a lock?
> > >
> >
> > That doesn't work nicely, unfortunately, because then we need to unlock
> > all zones that we've locked so far in try_set_zone_oom() if we find one
> > that is already ZONE_OOM_LOCKED during the scan of the zonelist.
>
> You need that lock release function anyway for when the OOM killing is
> done.
>
It doesn't matter. You would then need the following in __alloc_pages():
if (!try_set_zone_oom(zonelist)) {
clear_zonelist_oom(zonelist);
schedule_timeout_uninterruptible(1);
goto restart;
}
or a call to clear_zonelist_oom() before returning 0 in
try_set_zone_oom().
But that races with another thread that is also trying an allocation
attempt and you end up clearing the ZONE_OOM_LOCKED bits that it has
already set in its call to try_set_zone_oom().
try_set_zone_oom() is a critical section because all ZONE_OOM_LOCKED bits
for each zone in the zonelist need to be set upon return, we can't allow
it to race with an exiting OOM killer calling clear_zonelist_oom().
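To make the critical-section argument concrete, here is a userspace sketch with a pthread mutex standing in for zone_scan_mutex. The body of patch 4 is not quoted in this part of the thread, so the two-pass structure below (check every zone first, then set every bit) is an assumption about how such a function could look, and the NULL-terminated array of zone pointers is a simplified stand-in for a zonelist.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define ZONE_OOM_LOCKED 0x1u	/* hypothetical bit position */

/* Hypothetical zone; flags is only touched under zone_scan_mutex. */
struct zone {
	unsigned int flags;
};

static pthread_mutex_t zone_scan_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Returns 1 with every zone in the list marked ZONE_OOM_LOCKED, or 0
 * with no zone modified. Nothing is ever half-locked, so there is
 * nothing to roll back.
 */
static int try_set_zone_oom(struct zone **zonelist)
{
	struct zone **z;
	int ret = 1;

	assert(zonelist != NULL);
	pthread_mutex_lock(&zone_scan_mutex);
	for (z = zonelist; *z != NULL; z++) {
		if ((*z)->flags & ZONE_OOM_LOCKED) {
			ret = 0;
			goto out;
		}
	}
	for (z = zonelist; *z != NULL; z++)
		(*z)->flags |= ZONE_OOM_LOCKED;
out:
	pthread_mutex_unlock(&zone_scan_mutex);
	return ret;
}

/* Called by the task that won, once its OOM kill is done. */
static void clear_zonelist_oom(struct zone **zonelist)
{
	struct zone **z;

	assert(zonelist != NULL);
	pthread_mutex_lock(&zone_scan_mutex);
	for (z = zonelist; *z != NULL; z++)
		(*z)->flags &= ~ZONE_OOM_LOCKED;
	pthread_mutex_unlock(&zone_scan_mutex);
}
```

Because both functions take the same lock, a failing try_set_zone_oom() can never observe, or leave behind, a partially locked zonelist.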
* Re: [patch 4/9] oom: add per-zone locking
2007-09-20 22:12 ` David Rientjes
@ 2007-09-20 22:26 ` Christoph Lameter
2007-09-20 22:48 ` David Rientjes
0 siblings, 1 reply; 34+ messages in thread
From: Christoph Lameter @ 2007-09-20 22:26 UTC (permalink / raw)
To: David Rientjes; +Cc: Andrew Morton, Andrea Arcangeli, Rik van Riel, linux-mm
On Thu, 20 Sep 2007, David Rientjes wrote:
> It doesn't matter. You would then need the following in __alloc_pages():
>
> if (!try_set_zone_oom(zonelist)) {
> clear_zonelist_oom(zonelist);
> schedule_timeout_uninterruptible(1);
> goto restart;
> }
>
> or a call to clear_zonelist_oom() before returning 0 in
> try_set_zone_oom().
Yup.
> But that races with another thread that is also trying an allocation
> attempt and you end up clearing the ZONE_OOM_LOCKED bits that it has
> already set in its call to try_set_zone_oom().
Well if you remember how far you got with locking and just undo those
then you are fine.
The global lock there just spooks me. If a large number of processors get
in there (say 1000 or so in the case of a global oom) then there is
already an issue of getting the lock from node 0. The bits in the zone
are distributed over all of the nodes in the system.
* Re: [patch 4/9] oom: add per-zone locking
2007-09-20 22:26 ` Christoph Lameter
@ 2007-09-20 22:48 ` David Rientjes
2007-09-21 8:59 ` Andrew Morton
0 siblings, 1 reply; 34+ messages in thread
From: David Rientjes @ 2007-09-20 22:48 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Andrew Morton, Andrea Arcangeli, Rik van Riel, linux-mm
On Thu, 20 Sep 2007, Christoph Lameter wrote:
> > But that races with another thread that is also trying an allocation
> > attempt and you end up clearing the ZONE_OOM_LOCKED bits that it has
> > already set in its call to try_set_zone_oom().
>
> Well if you remember how far you got with locking and just undo those
> then you are fine.
>
No, you're not.
If you're locking your zones and find one that is already ZONE_OOM_LOCKED
and then try to unlock those you've already done, you can race and another
task in try_set_zone_oom() can fail because it found one of those zones
that you're about to unlock. Then both of these calls to
try_set_zone_oom() return 0, both tasks are put to sleep, and the OOM
killer is never called.
Granted, this will eventually work itself out but probably after putting
each task to sleep several times and wasting plenty of time when we're in
an OOM condition.
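The failure mode can be replayed deterministically in userspace. The sketch below is hypothetical throughout: it models the lockless "set bits, undo on conflict" variant with C11 atomics, and replay_mutual_failure() hand-interleaves two tasks whose zonelists are assumed to visit the same two zones in opposite order. It is one possible interleaving, not a claim about how often it would occur.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define ZONE_OOM_LOCKED 0x1u	/* hypothetical bit position */

struct zone {
	atomic_uint flags;
};

/* Atomically set the bit; returns nonzero if it was already set. */
static int zone_test_and_set_oom(struct zone *z)
{
	return (atomic_fetch_or(&z->flags, ZONE_OOM_LOCKED) &
		ZONE_OOM_LOCKED) != 0;
}

static void zone_clear_oom(struct zone *z)
{
	atomic_fetch_and(&z->flags, ~ZONE_OOM_LOCKED);
}

/* Lockless variant: on conflict, undo only the bits this caller set. */
static int try_set_zone_oom_lockless(struct zone **zonelist)
{
	size_t i;

	assert(zonelist != NULL);
	for (i = 0; zonelist[i] != NULL; i++) {
		if (zone_test_and_set_oom(zonelist[i])) {
			while (i-- > 0)
				zone_clear_oom(zonelist[i]);
			return 0;
		}
	}
	return 1;
}

/*
 * Hand-interleaved replay, starting from two unlocked zones.
 * Task A walks {z0, z1}, task B walks {z1, z0}.
 * Returns how many of the two tasks ended up owning their zonelist.
 */
static int replay_mutual_failure(struct zone *z0, struct zone *z1)
{
	int a_ok, b_ok;

	zone_test_and_set_oom(z0);		/* A takes z0 */
	zone_test_and_set_oom(z1);		/* B takes z1 */
	a_ok = !zone_test_and_set_oom(z1);	/* A finds z1 busy */
	b_ok = !zone_test_and_set_oom(z0);	/* B finds z0 busy */
	if (!a_ok)
		zone_clear_oom(z0);		/* A undoes its bit */
	if (!b_ok)
		zone_clear_oom(z1);		/* B undoes its bit */
	return a_ok + b_ok;
}
```

In this replay both tasks back off and no zone is left locked, so neither proceeds to the OOM killer on that attempt.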
> The global lock there just spooks me. If a large number of processors get
> in there (say 1000 or so in the case of a global oom) then there is
> already an issue of getting the lock from node 0. The bits in the zone
> are distributed over all of the nodes in the system.
>
It's no harder to acquire than callback_mutex was. It's far better
to include this global lock so the state of the zones is always correct
after releasing it than to have 1000 processors clearing and setting
ZONE_OOM_LOCKED bits for lengthy zonelists and all racing with each other
so no zonelist is ever fully locked.
* Re: [patch 3/9] oom: change all_unreclaimable zone member to flags
2007-09-20 20:23 ` [patch 3/9] oom: change all_unreclaimable zone member to flags David Rientjes
2007-09-20 20:23 ` [patch 4/9] oom: add per-zone locking David Rientjes
2007-09-20 21:56 ` [patch 3/9] oom: change all_unreclaimable zone member to flags Christoph Lameter
@ 2007-09-21 8:55 ` Andrew Morton
2 siblings, 0 replies; 34+ messages in thread
From: Andrew Morton @ 2007-09-21 8:55 UTC (permalink / raw)
To: David Rientjes
Cc: Andrea Arcangeli, Christoph Lameter, Rik van Riel, linux-mm
On Thu, 20 Sep 2007 13:23:17 -0700 (PDT) David Rientjes <rientjes@google.com> wrote:
> @@ -1871,10 +1874,8 @@ int zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
> * not have reclaimable pages and if we should not delay the allocation
> * then do not scan.
> */
> - if (!(gfp_mask & __GFP_WAIT) ||
> - zone->all_unreclaimable ||
> - atomic_read(&zone->reclaim_in_progress) > 0 ||
> - (current->flags & PF_MEMALLOC))
> + if (!(gfp_mask & __GFP_WAIT) || zone_is_all_unreclaimable(zone) ||
> + zone_is_reclaim_locked(zone) || (current->flags & PF_MEMALLOC))
> return 0;
It would be nice to convert this somewhat crappy code to use
test_and_set_bit(ZONE_RECLAIM_LOCKED) sometime.
* Re: [patch 4/9] oom: add per-zone locking
2007-09-20 22:48 ` David Rientjes
@ 2007-09-21 8:59 ` Andrew Morton
0 siblings, 0 replies; 34+ messages in thread
From: Andrew Morton @ 2007-09-21 8:59 UTC (permalink / raw)
To: David Rientjes
Cc: Christoph Lameter, Andrea Arcangeli, Rik van Riel, linux-mm
On Thu, 20 Sep 2007 15:48:36 -0700 (PDT) David Rientjes <rientjes@google.com> wrote:
> > The global lock there just spooks me. If a large number of processors get
> > in there (say 1000 or so in the case of a global oom) then there is
> > already an issue of getting the lock from node 0. The bits in the zone
> > are distributed over all of the nodes in the system.
> >
>
> It's no harder to acquire than callback_mutex was. It's far better
> to include this global lock so the state of the zones is always correct
> after releasing it than to have 1000 processors clearing and setting
> ZONE_OOM_LOCKED bits for lengthy zonelists and all racing with each other
> so no zonelist is ever fully locked.
It'd be better to use a spinlock than a sleeping lock: same speed in the
uncontended case, heaps faster in the contended case.
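For readers outside the kernel, the difference is that a spinlock busy-waits instead of putting the caller to sleep, which suits a critical section as short as a zonelist scan. A minimal userspace model with C11 atomic_flag follows; the sketch_ names are hypothetical and this is not the kernel's spinlock implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal spinlock built on a C11 atomic_flag. */
struct sketch_spinlock {
	atomic_flag locked;
};

#define SKETCH_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once; true if the lock was acquired. */
static bool sketch_spin_trylock(struct sketch_spinlock *l)
{
	assert(l != NULL);
	return !atomic_flag_test_and_set_explicit(&l->locked,
						  memory_order_acquire);
}

/* Busy-wait until the lock is acquired; never sleeps. */
static void sketch_spin_lock(struct sketch_spinlock *l)
{
	while (!sketch_spin_trylock(l))
		;	/* spin */
}

static void sketch_spin_unlock(struct sketch_spinlock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

In the uncontended case both kinds of lock cost roughly one atomic operation; the sketch only illustrates the contended path, where the waiter spins rather than being scheduled out.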
* Re: [patch 5/9] oom: serialize out of memory calls
2007-09-20 20:23 ` [patch 5/9] oom: serialize out of memory calls David Rientjes
2007-09-20 20:23 ` [patch 6/9] oom: add oom_kill_asking_task sysctl David Rientjes
2007-09-20 21:59 ` [patch 5/9] oom: serialize out of memory calls Christoph Lameter
@ 2007-09-21 9:01 ` Andrew Morton
2007-09-21 20:04 ` David Rientjes
2 siblings, 1 reply; 34+ messages in thread
From: Andrew Morton @ 2007-09-21 9:01 UTC (permalink / raw)
To: David Rientjes
Cc: Andrea Arcangeli, Christoph Lameter, Rik van Riel, linux-mm
On Thu, 20 Sep 2007 13:23:20 -0700 (PDT) David Rientjes <rientjes@google.com> wrote:
> Before invoking the OOM killer, a final allocation attempt with a very
> high watermark is attempted. Serialization needs to occur at this point
> or it may be possible that the allocation could succeed after acquiring
> the lock. If the lock is contended, the task is put to sleep and the
> allocation attempt is retried when rescheduled.
Am having trouble understanding this description. How can it ever be a
problem if an allocation succeeds??
Want to have another go, please?
* Re: [patch 6/9] oom: add oom_kill_asking_task sysctl
2007-09-20 20:23 ` [patch 6/9] oom: add oom_kill_asking_task sysctl David Rientjes
2007-09-20 20:23 ` [patch 7/9] oom: suppress extraneous stack and memory dump David Rientjes
2007-09-20 22:03 ` [patch 6/9] oom: add oom_kill_asking_task sysctl Christoph Lameter
@ 2007-09-21 9:05 ` Andrew Morton
2 siblings, 0 replies; 34+ messages in thread
From: Andrew Morton @ 2007-09-21 9:05 UTC (permalink / raw)
To: David Rientjes
Cc: Andrea Arcangeli, Christoph Lameter, Rik van Riel, linux-mm
On Thu, 20 Sep 2007 13:23:21 -0700 (PDT) David Rientjes <rientjes@google.com> wrote:
> Adds a new sysctl, 'oom_kill_asking_task', which will automatically kill
> the OOM-triggering task instead of scanning through the tasklist to find
> a memory-hogging target.
I find the name a bit cheesy. I renamed it to oom_kill_allocating_task,
but that's still not quite right. Really should be
oom_kill_allocation_attempting_task, but sheesh.
* Re: [patch 0/9] oom killer serialization
2007-09-20 20:23 [patch 0/9] oom killer serialization David Rientjes
2007-09-20 20:23 ` [patch 1/9] oom: move prototypes to appropriate header file David Rientjes
@ 2007-09-21 9:12 ` Andrew Morton
2007-09-21 9:21 ` David Rientjes
2007-09-21 19:15 ` Christoph Lameter
1 sibling, 2 replies; 34+ messages in thread
From: Andrew Morton @ 2007-09-21 9:12 UTC (permalink / raw)
To: David Rientjes
Cc: Andrea Arcangeli, Christoph Lameter, Rik van Riel, linux-mm
On Thu, 20 Sep 2007 13:23:13 -0700 (PDT) David Rientjes <rientjes@google.com> wrote:
> Third version of the OOM serialization patchset.
What's the relationship between this patch series and Andrea's monster
oomkiller patchset? Looks like teeny-subset-plus-other-stuff?
Are all attributions on all those patches appropriately set?
* Re: [patch 0/9] oom killer serialization
2007-09-21 9:12 ` [patch 0/9] oom killer serialization Andrew Morton
@ 2007-09-21 9:21 ` David Rientjes
2007-09-21 19:13 ` David Rientjes
2007-09-21 19:15 ` Christoph Lameter
1 sibling, 1 reply; 34+ messages in thread
From: David Rientjes @ 2007-09-21 9:21 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andrea Arcangeli, Christoph Lameter, Rik van Riel, linux-mm
On Fri, 21 Sep 2007, Andrew Morton wrote:
> On Thu, 20 Sep 2007 13:23:13 -0700 (PDT) David Rientjes <rientjes@google.com> wrote:
>
> > Third version of the OOM serialization patchset.
>
> What's the relationship between this patch series and Andrea's monster
> oomkiller patchset? Looks like teeny-subset-plus-other-stuff?
>
This provides serialization for system-wide, mempolicy-constrained, and
cpuset-constrained OOM kills which was a small subset of Andrea's 24-patch
series posted August 22.
It replaces the following patches from Andrea:
[PATCH 04 of 24] serialize oom killer
[PATCH 12 of 24] show mem information only when a task is actually being killed
And the following patches from me:
[PATCH 21 of 24] select process to kill for cpusets
[PATCH 22 of 24] extract select helper function
[PATCH 23 of 24] serialize for cpusets
[PATCH 24 of 24] add oom_kill_asking_task flag
> Are all attributions on all those patches appropriately set?
>
Yes.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [patch 0/9] oom killer serialization
2007-09-21 9:21 ` David Rientjes
@ 2007-09-21 19:13 ` David Rientjes
0 siblings, 0 replies; 34+ messages in thread
From: David Rientjes @ 2007-09-21 19:13 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andrea Arcangeli, Christoph Lameter, Rik van Riel, linux-mm
On Fri, 21 Sep 2007, David Rientjes wrote:
> This provides serialization for system-wide, mempolicy-constrained, and
> cpuset-constrained OOM kills which was a small subset of Andrea's 24-patch
> series posted August 22.
>
> It replaces the following patches from Andrea:
> [PATCH 04 of 24] serialize oom killer
> [PATCH 12 of 24] show mem information only when a task is actually being killed
>
It also replaces
[PATCH 19 of 24] cacheline align VM_is_OOM to prevent false sharing
since locking isn't globally done with VM_is_OOM anymore.
Also, the patch
[PATCH 17 of 24] apply the anti deadlock features only to global oom
will no longer need to move the global locking mechanism since it's now
non-existent, but the deadlock feature is still appropriate in the
CONSTRAINT_NONE (i.e. global) case.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [patch 0/9] oom killer serialization
2007-09-21 9:12 ` [patch 0/9] oom killer serialization Andrew Morton
2007-09-21 9:21 ` David Rientjes
@ 2007-09-21 19:15 ` Christoph Lameter
1 sibling, 0 replies; 34+ messages in thread
From: Christoph Lameter @ 2007-09-21 19:15 UTC (permalink / raw)
To: Andrew Morton; +Cc: David Rientjes, Andrea Arcangeli, Rik van Riel, linux-mm
On Fri, 21 Sep 2007, Andrew Morton wrote:
> What's the relationship between this patch series and Andrea's monster
> oomkiller patchset? Looks like teeny-subset-plus-other-stuff?
I think we need to know from Andrea if our work addresses all the issues
that he has seen.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [patch 5/9] oom: serialize out of memory calls
2007-09-21 9:01 ` Andrew Morton
@ 2007-09-21 20:04 ` David Rientjes
0 siblings, 0 replies; 34+ messages in thread
From: David Rientjes @ 2007-09-21 20:04 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andrea Arcangeli, Christoph Lameter, Rik van Riel, linux-mm
On Fri, 21 Sep 2007, Andrew Morton wrote:
> > Before invoking the OOM killer, a final allocation attempt with a very
> > high watermark is attempted. Serialization needs to occur at this point
> > or it may be possible that the allocation could succeed after acquiring
> > the lock. If the lock is contended, the task is put to sleep and the
> > allocation attempt is retried when rescheduled.
>
> Am having trouble understanding this description. How can it ever be a
> problem if an allocation succeeds??
>
> Want to have another go, please?
>
Ok, please replace the description in
oom-serialize-out-of-memory-calls.patch with this:
A final allocation attempt with a very high watermark needs to be
attempted before invoking out_of_memory(). OOM killer serialization needs
to occur before this final attempt, otherwise a task attempting to OOM-lock
all zones in its zonelist may spin and acquire the lock unnecessarily
after the OOM condition has already been alleviated.
If the final allocation does succeed, the zonelist is simply OOM-unlocked
and __alloc_pages() returns the page. Otherwise, the OOM killer is
invoked.
If the task cannot acquire OOM-locks on all zones in its zonelist, it is
put to sleep and the allocation is retried when it gets rescheduled. One
of its zones is already marked as being in the OOM killer so it'll
hopefully be getting some free memory soon, at least enough to satisfy a
high watermark allocation attempt. This prevents needlessly killing a
task when the OOM condition would have already been alleviated if it had
simply been given enough time.
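
The ordering described above (lock first, then the final high-watermark attempt, then either unlock-and-return or invoke the OOM killer) can be sketched in userspace C. This is a single-threaded model under stated assumptions, not the patch itself: struct zone here is a toy, and try_oom_lock(), oom_unlock(), alloc_high_wmark() and last_ditch_alloc() are hypothetical names. The real code would need to protect the lock scan against concurrent callers, and releasing the locks after the OOM killer runs is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy zone: one flag for "in the OOM killer" plus a free page count. */
struct zone {
	bool oom_locked;
	unsigned long free_pages;
};

enum alloc_result { ALLOC_OK, ALLOC_RETRY, ALLOC_OOM_KILL };

/* All-or-nothing: fail if any zone in the zonelist is already OOM-locked. */
static bool try_oom_lock(struct zone **zl, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (zl[i]->oom_locked)
			return false;
	for (size_t i = 0; i < n; i++)
		zl[i]->oom_locked = true;
	return true;
}

static void oom_unlock(struct zone **zl, size_t n)
{
	for (size_t i = 0; i < n; i++)
		zl[i]->oom_locked = false;
}

/* Final attempt: succeed only from a zone that is above the high watermark. */
static bool alloc_high_wmark(struct zone **zl, size_t n, unsigned long high_wmark)
{
	for (size_t i = 0; i < n; i++) {
		if (zl[i]->free_pages > high_wmark) {
			zl[i]->free_pages--;
			return true;
		}
	}
	return false;
}

static enum alloc_result last_ditch_alloc(struct zone **zl, size_t n,
					  unsigned long high_wmark)
{
	/* Contended: some zone is already being handled; sleep and retry. */
	if (!try_oom_lock(zl, n))
		return ALLOC_RETRY;

	/* The OOM condition may already be gone; no kill is needed then. */
	if (alloc_high_wmark(zl, n, high_wmark)) {
		oom_unlock(zl, n);
		return ALLOC_OK;
	}

	/* The real path would invoke out_of_memory() at this point. */
	oom_unlock(zl, n);
	return ALLOC_OOM_KILL;
}
```

A caller would loop on ALLOC_RETRY after sleeping, which models the "put to sleep and retried when rescheduled" behaviour: the task that already holds the locks is expected to free memory, so the retry may succeed without a second kill.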
^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2007-09-21 20:04 UTC | newest]
Thread overview: 34+ messages
2007-09-20 20:23 [patch 0/9] oom killer serialization David Rientjes
2007-09-20 20:23 ` [patch 1/9] oom: move prototypes to appropriate header file David Rientjes
2007-09-20 20:23 ` [patch 2/9] oom: move constraints to enum David Rientjes
2007-09-20 20:23 ` [patch 3/9] oom: change all_unreclaimable zone member to flags David Rientjes
2007-09-20 20:23 ` [patch 4/9] oom: add per-zone locking David Rientjes
2007-09-20 20:23 ` [patch 5/9] oom: serialize out of memory calls David Rientjes
2007-09-20 20:23 ` [patch 6/9] oom: add oom_kill_asking_task sysctl David Rientjes
2007-09-20 20:23 ` [patch 7/9] oom: suppress extraneous stack and memory dump David Rientjes
2007-09-20 20:23 ` [patch 8/9] oom: compare cpuset mems_allowed instead of exclusive ancestors David Rientjes
2007-09-20 20:23 ` [patch 9/9] oom: do not take callback_mutex David Rientjes
2007-09-20 22:04 ` Christoph Lameter
2007-09-20 22:01 ` [patch 8/9] oom: compare cpuset mems_allowed instead of exclusive ancestors Christoph Lameter
2007-09-20 22:00 ` [patch 7/9] oom: suppress extraneous stack and memory dump Christoph Lameter
2007-09-20 22:03 ` [patch 6/9] oom: add oom_kill_asking_task sysctl Christoph Lameter
2007-09-20 22:07 ` David Rientjes
2007-09-20 22:09 ` Christoph Lameter
2007-09-21 9:05 ` Andrew Morton
2007-09-20 21:59 ` [patch 5/9] oom: serialize out of memory calls Christoph Lameter
2007-09-21 9:01 ` Andrew Morton
2007-09-21 20:04 ` David Rientjes
2007-09-20 21:59 ` [patch 4/9] oom: add per-zone locking Christoph Lameter
2007-09-20 22:03 ` David Rientjes
2007-09-20 22:05 ` Christoph Lameter
2007-09-20 22:12 ` David Rientjes
2007-09-20 22:26 ` Christoph Lameter
2007-09-20 22:48 ` David Rientjes
2007-09-21 8:59 ` Andrew Morton
2007-09-20 21:56 ` [patch 3/9] oom: change all_unreclaimable zone member to flags Christoph Lameter
2007-09-20 21:58 ` David Rientjes
2007-09-21 8:55 ` Andrew Morton
2007-09-21 9:12 ` [patch 0/9] oom killer serialization Andrew Morton
2007-09-21 9:21 ` David Rientjes
2007-09-21 19:13 ` David Rientjes
2007-09-21 19:15 ` Christoph Lameter