From: Andrea Arcangeli <andrea@suse.de>
To: linux-mm@kvack.org
Subject: [PATCH 01 of 16] remove nr_scan_inactive/active
Date: Fri, 08 Jun 2007 22:02:59 +0200 [thread overview]
Message-ID: <8e38f7656968417dfee0.1181332979@v2.random> (raw)
In-Reply-To: <patchbomb.1181332978@v2.random>
# HG changeset patch
# User Andrea Arcangeli <andrea@suse.de>
# Date 1181332959 -7200
# Node ID 8e38f7656968417dfee09fbb6450a8f1e70f8b21
# Parent 8b84ac74c8464bb6e4a2c08ff2a656d06c8667ca
remove nr_scan_inactive/active
The older atomic_add/atomic_set were pointless (atomic_set vs atomic_add would
race), but removing them didn't actually remove the race, the race is still
there, for the same reasons atomic_add/set couldn't prevent it. This is really
the kind of code that I dislike because it's sort of buggy, and it shouldn't be
making any measurable difference and when it does something for real it can
only hurt!
The real focus is on shrink_zone (ignore the other places where it's being used
that are even less interesting). Assume two tasks adds to nr_scan_*active at
the same time (first line of the old buggy code), they'll effectively double their
scan rate, for no good reason. What can happen is that instead of scanning
nr_entries each, they'll scan nr_entries*2 each. The more CPUs the bigger the
race and the higher the multiplication effect and the harder it will be to
detect oom. In the case that nr_*active < sc->swap_cluster_max, regardless of
whatever future invocation of alloc_pages, we'll be going down in the
priorities in the current alloc_pages invocation if the DEF_PRIORITY was too
high to make any work, so again accumulating the nr_scan_*active doesn't seem
interesting even when it's smaller than sc->swap_cluster_max. Each task should
work for itself without much care of what the others are doing.
Signed-off-by: Andrea Arcangeli <andrea@suse.de>
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -220,8 +220,6 @@ struct zone {
spinlock_t lru_lock;
struct list_head active_list;
struct list_head inactive_list;
- unsigned long nr_scan_active;
- unsigned long nr_scan_inactive;
unsigned long pages_scanned; /* since last reclaim */
int all_unreclaimable; /* All pages pinned */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2649,8 +2649,6 @@ static void __meminit free_area_init_cor
zone_pcp_init(zone);
INIT_LIST_HEAD(&zone->active_list);
INIT_LIST_HEAD(&zone->inactive_list);
- zone->nr_scan_active = 0;
- zone->nr_scan_inactive = 0;
zap_zone_vm_stats(zone);
atomic_set(&zone->reclaim_in_progress, 0);
if (!size)
diff --git a/mm/vmscan.c b/mm/vmscan.c
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -915,20 +915,11 @@ static unsigned long shrink_zone(int pri
* Add one to `nr_to_scan' just to make sure that the kernel will
* slowly sift through the active list.
*/
- zone->nr_scan_active +=
- (zone_page_state(zone, NR_ACTIVE) >> priority) + 1;
- nr_active = zone->nr_scan_active;
- if (nr_active >= sc->swap_cluster_max)
- zone->nr_scan_active = 0;
- else
+ nr_active = zone_page_state(zone, NR_ACTIVE) >> priority;
+ if (nr_active < sc->swap_cluster_max)
nr_active = 0;
-
- zone->nr_scan_inactive +=
- (zone_page_state(zone, NR_INACTIVE) >> priority) + 1;
- nr_inactive = zone->nr_scan_inactive;
- if (nr_inactive >= sc->swap_cluster_max)
- zone->nr_scan_inactive = 0;
- else
+ nr_inactive = zone_page_state(zone, NR_INACTIVE) >> priority;
+ if (nr_inactive < sc->swap_cluster_max)
nr_inactive = 0;
while (nr_active || nr_inactive) {
@@ -1392,22 +1383,14 @@ static unsigned long shrink_all_zones(un
/* For pass = 0 we don't shrink the active list */
if (pass > 0) {
- zone->nr_scan_active +=
- (zone_page_state(zone, NR_ACTIVE) >> prio) + 1;
- if (zone->nr_scan_active >= nr_pages || pass > 3) {
- zone->nr_scan_active = 0;
- nr_to_scan = min(nr_pages,
- zone_page_state(zone, NR_ACTIVE));
+ nr_to_scan = (zone_page_state(zone, NR_ACTIVE) >> prio) + 1;
+ if (nr_to_scan >= nr_pages || pass > 3) {
shrink_active_list(nr_to_scan, zone, sc, prio);
}
}
- zone->nr_scan_inactive +=
- (zone_page_state(zone, NR_INACTIVE) >> prio) + 1;
- if (zone->nr_scan_inactive >= nr_pages || pass > 3) {
- zone->nr_scan_inactive = 0;
- nr_to_scan = min(nr_pages,
- zone_page_state(zone, NR_INACTIVE));
+ nr_to_scan = (zone_page_state(zone, NR_INACTIVE) >> prio) + 1;
+ if (nr_to_scan >= nr_pages || pass > 3) {
ret += shrink_inactive_list(nr_to_scan, zone, sc);
if (ret >= nr_pages)
return ret;
diff --git a/mm/vmstat.c b/mm/vmstat.c
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -554,7 +554,7 @@ static int zoneinfo_show(struct seq_file
"\n min %lu"
"\n low %lu"
"\n high %lu"
- "\n scanned %lu (a: %lu i: %lu)"
+ "\n scanned %lu"
"\n spanned %lu"
"\n present %lu",
zone_page_state(zone, NR_FREE_PAGES),
@@ -562,7 +562,6 @@ static int zoneinfo_show(struct seq_file
zone->pages_low,
zone->pages_high,
zone->pages_scanned,
- zone->nr_scan_active, zone->nr_scan_inactive,
zone->spanned_pages,
zone->present_pages);
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2007-06-08 20:06 UTC|newest]
Thread overview: 77+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-06-08 20:02 [PATCH 00 of 16] OOM related fixes Andrea Arcangeli
2007-06-08 20:02 ` Andrea Arcangeli [this message]
2007-06-10 17:36 ` [PATCH 01 of 16] remove nr_scan_inactive/active Rik van Riel
2007-06-10 18:17 ` Andrea Arcangeli
2007-06-11 14:58 ` Rik van Riel
2007-06-26 17:08 ` Rik van Riel
2007-06-26 17:55 ` Andrew Morton
2007-06-26 19:02 ` Rik van Riel
2007-06-28 22:44 ` Rik van Riel
2007-06-28 22:57 ` Andrew Morton
2007-06-28 23:04 ` Rik van Riel
2007-06-28 23:13 ` Andrew Morton
2007-06-28 23:16 ` Rik van Riel
2007-06-28 23:29 ` Andrew Morton
2007-06-29 0:00 ` Rik van Riel
2007-06-29 0:19 ` Andrew Morton
2007-06-29 0:45 ` Rik van Riel
2007-06-29 1:12 ` Andrew Morton
2007-06-29 1:20 ` Rik van Riel
2007-06-29 1:29 ` Andrew Morton
2007-06-28 23:25 ` Andrea Arcangeli
2007-06-29 0:12 ` Andrew Morton
2007-06-29 13:38 ` Lee Schermerhorn
2007-06-29 14:12 ` Andrea Arcangeli
2007-06-29 14:59 ` Rik van Riel
2007-06-29 22:39 ` "Noreclaim Infrastructure" [was Re: [PATCH 01 of 16] remove nr_scan_inactive/active] Lee Schermerhorn
2007-06-29 22:42 ` RFC "Noreclaim Infrastructure - patch 1/3 basic infrastructure" Lee Schermerhorn
2007-06-29 22:44 ` RFC "Noreclaim Infrastructure patch 2/3 - noreclaim statistics..." Lee Schermerhorn
2007-06-29 22:49 ` "Noreclaim - client patch 3/3 - treat pages w/ excessively references anon_vma as nonreclaimable" Lee Schermerhorn
2007-06-26 20:37 ` [PATCH 01 of 16] remove nr_scan_inactive/active Andrea Arcangeli
2007-06-26 20:57 ` Rik van Riel
2007-06-26 22:21 ` Andrea Arcangeli
2007-06-08 20:03 ` [PATCH 02 of 16] avoid oom deadlock in nfs_create_request Andrea Arcangeli
2007-06-10 17:38 ` Rik van Riel
2007-06-10 18:27 ` Andrea Arcangeli
2007-06-08 20:03 ` [PATCH 03 of 16] prevent oom deadlocks during read/write operations Andrea Arcangeli
2007-06-08 20:03 ` [PATCH 04 of 16] serialize oom killer Andrea Arcangeli
2007-06-09 6:43 ` Peter Zijlstra
2007-06-09 15:27 ` Andrea Arcangeli
2007-06-08 20:03 ` [PATCH 05 of 16] avoid selecting already killed tasks Andrea Arcangeli
2007-06-08 20:03 ` [PATCH 06 of 16] reduce the probability of an OOM livelock Andrea Arcangeli
2007-06-08 20:03 ` [PATCH 07 of 16] balance_pgdat doesn't return the number of pages freed Andrea Arcangeli
2007-06-08 20:03 ` [PATCH 08 of 16] don't depend on PF_EXITING tasks to go away Andrea Arcangeli
2007-06-08 20:03 ` [PATCH 09 of 16] fallback killing more tasks if tif-memdie doesn't " Andrea Arcangeli
2007-06-08 21:57 ` Christoph Lameter
2007-06-08 20:03 ` [PATCH 10 of 16] stop useless vm trashing while we wait the TIF_MEMDIE task to exit Andrea Arcangeli
2007-06-08 21:48 ` Christoph Lameter
2007-06-09 1:59 ` Andrea Arcangeli
2007-06-09 3:01 ` Christoph Lameter
2007-06-09 14:05 ` Andrea Arcangeli
2007-06-09 14:38 ` Andrea Arcangeli
2007-06-11 16:07 ` Christoph Lameter
2007-06-11 16:50 ` Andrea Arcangeli
2007-06-11 16:57 ` Christoph Lameter
2007-06-11 17:51 ` Andrea Arcangeli
2007-06-11 17:56 ` Christoph Lameter
2007-06-11 18:22 ` Andrea Arcangeli
2007-06-11 18:39 ` Christoph Lameter
2007-06-11 18:58 ` Andrea Arcangeli
2007-06-11 19:25 ` Christoph Lameter
2007-06-11 16:04 ` Christoph Lameter
2007-06-08 20:03 ` [PATCH 11 of 16] the oom schedule timeout isn't needed with the VM_is_OOM logic Andrea Arcangeli
2007-06-08 20:03 ` [PATCH 12 of 16] show mem information only when a task is actually being killed Andrea Arcangeli
2007-06-08 20:03 ` [PATCH 13 of 16] simplify oom heuristics Andrea Arcangeli
2007-06-08 20:03 ` [PATCH 14 of 16] oom select should only take rss into account Andrea Arcangeli
2007-06-10 17:17 ` Rik van Riel
2007-06-10 17:30 ` Andrea Arcangeli
2007-06-08 20:03 ` [PATCH 15 of 16] limit reclaim if enough pages have been freed Andrea Arcangeli
2007-06-10 17:20 ` Rik van Riel
2007-06-10 17:32 ` Andrea Arcangeli
2007-06-10 17:52 ` Rik van Riel
2007-06-11 16:23 ` Christoph Lameter
2007-06-11 16:57 ` Rik van Riel
2007-06-08 20:03 ` [PATCH 16 of 16] avoid some lock operation in vm fast path Andrea Arcangeli
2007-06-08 21:26 ` [PATCH 00 of 16] OOM related fixes William Lee Irwin III
2007-06-09 14:55 ` Andrea Arcangeli
2007-06-12 8:58 ` Petr Tesarik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8e38f7656968417dfee0.1181332979@v2.random \
--to=andrea@suse.de \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox