Re: [PATCH 1/5] vmscan: remove all_unreclaimable check from direct reclaim path completely

From: Minchan Kim <minchan.kim@gmail.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>, Oleg Nesterov <oleg@redhat.com>,
	linux-mm <linux-mm@kvack.org>, Andrey Vagin <avagin@openvz.org>,
	Hugh Dickins <hughd@google.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Nick Piggin <npiggin@kernel.dk>,
	Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [PATCH 1/5] vmscan: remove all_unreclaimable check from direct reclaim path completely
Date: Wed, 23 Mar 2011 15:59:04 +0900
Message-ID: <AANLkTim1HcdkPcxnWrv+VbMUSh3kQBC=-myZ-j-a8Wiy@mail.gmail.com>
In-Reply-To: <20110323142133.1AC6.A69D9226@jp.fujitsu.com>

On Wed, Mar 23, 2011 at 2:21 PM, KOSAKI Motohiro
<kosaki.motohiro@jp.fujitsu.com> wrote:
> Hi Minchan,
>
>> > zone->all_unreclaimable and zone->pages_scanned are neither atomic
>> > variables nor protected by a lock. Therefore a zone can end up in a state
>> > where zone->pages_scanned=0 and zone->all_unreclaimable=1. In this case,
>>
>> Possible although it's very rare.
>
> Can you test Andrey's case yourself on an x86 box? It seems
> reproducible.
>
>> > the current all_unreclaimable() returns false even though
>> > zone->all_unreclaimable=1.
>>
>> The case is very rare since we reset zone->all_unreclaimable to zero
>> right before resetting zone->pages_scanned to zero.
>> But I admit it's possible.
>
> Please apply this patch and run the oom-killer. You may see the following
> pages_scanned:0 and all_unreclaimable:yes combination, like below
> (but you may need >30min).
>
>        Node 0 DMA free:4024kB min:40kB low:48kB high:60kB active_anon:11804kB
>        inactive_anon:0kB active_file:0kB inactive_file:4kB unevictable:0kB
>        isolated(anon):0kB isolated(file):0kB present:15676kB mlocked:0kB
>        dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB
>        slab_unreclaimable:0kB kernel_stack:0kB pagetables:68kB unstable:0kB
>        bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
>
>
>>
>>         CPU 0                                           CPU 1
>> free_pcppages_bulk                              balance_pgdat
>>         zone->all_unreclaimable = 0
>>                                                         zone->all_unreclaimable = 1
>>         zone->pages_scanned = 0
>> >
>> > Is this ignorable minor issue? No. Unfortunatelly, x86 has very
>> > small dma zone and it become zone->all_unreclamble=1 easily. and
>> > if it becase all_unreclaimable, it never return all_unreclaimable=0
>>         ^^^^^ it's very important verb.    ^^^^^ return? reset?
>>
>>         I can't understand your point due to the typo. Please correct the typo.
>>
>> > beucase it typicall don't have reclaimable pages.
>>
>> If the DMA zone has very few or zero reclaimable pages,
>> zone_reclaimable() can easily return false, so all_unreclaimable() could return
>> true. Eventually the oom-killer might work.
>
> The point is, vmscan has the following all_unreclaimable check in several places.
>
>                        if (zone->all_unreclaimable && priority != DEF_PRIORITY)
>                                continue;
>
> But if the zone has only a few LRU pages, get_scan_count(DEF_PRIORITY) returns
> a {0, 0, 0, 0} array. That means the zone will never scan LRU pages anymore, and
> therefore the wrongly small pages_scanned can't be corrected.
>
> Then the false negative from all_unreclaimable() can't be corrected either.
>
>
> Btw, why does get_scan_count() return 0 instead of 1? Why don't we round up?
> The git log says it is intentional.
>
>        commit e0f79b8f1f3394bb344b7b83d6f121ac2af327de
>        Author: Johannes Weiner <hannes@saeurebad.de>
>        Date:   Sat Oct 18 20:26:55 2008 -0700
>
>            vmscan: don't accumulate scan pressure on unrelated lists
>
>>
>> In my test I saw the livelock too, so apparently we have a problem.
>> I couldn't dig into it recently because of other urgent work.
>> I think you know the root cause, but the description in this patch isn't enough
>> to persuade me.
>>
>> Could you explain the root cause in detail?
>
> If you have another idea for a fix, please let me know. :)
>
>
>
>

Okay, I get it.

The problem is as follows.
Because of the race between free_pcppages_bulk and balance_pgdat, a zone
can end up with zone->all_unreclaimable = 1 and zone->pages_scanned = 0.
The DMA zone has few LRU pages, and with no swap and heavy memory
pressure there could be just a single page on the inactive file list, as
in your example. (Anon LRU pages don't matter on a swapless system.)
In that case, shrink_zones doesn't scan the page at all until priority
reaches 0, because get_scan_count does scan >>= priority (which is mostly
zero). And even when priority reaches 0, nr_scan_try_batch returns zero
until the saved count reaches 32. So to scan that page we need at least
32 passes over priorities 12..0. If the system is under a fork bomb, that
is almost a livelock.
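
To make that arithmetic concrete, here is a minimal userspace sketch
(not kernel code). It assumes DEF_PRIORITY is 12 and the batch threshold
is 32, as described above. nr_scan_try_batch below is a simplified model
of the helper in mm/vmscan.c, sweeps_until_scan is a hypothetical name I
made up for illustration, and the anon/file fraction scaling done by
get_scan_count is left out.

```c
#define DEF_PRIORITY     12  /* starting reclaim priority (assumed value) */
#define SWAP_CLUSTER_MAX 32  /* batch threshold mentioned in the mail */

/*
 * Simplified model of the batching helper: small scan requests are
 * accumulated in *nr_saved_scan and nothing is returned until the
 * accumulated count reaches SWAP_CLUSTER_MAX.
 */
static unsigned long nr_scan_try_batch(unsigned long nr_to_scan,
                                       unsigned long *nr_saved_scan)
{
    unsigned long nr;

    *nr_saved_scan += nr_to_scan;
    nr = *nr_saved_scan;
    if (nr >= SWAP_CLUSTER_MAX)
        *nr_saved_scan = 0;   /* batch is full: hand it out and reset */
    else
        nr = 0;               /* keep saving, scan nothing this time */
    return nr;
}

/*
 * Hypothetical helper: count how many full sweeps over priorities
 * DEF_PRIORITY..0 are needed before any page on a list of lru_pages
 * pages is actually scanned. Returns 0 for an empty list.
 */
static unsigned int sweeps_until_scan(unsigned long lru_pages)
{
    unsigned long saved = 0;
    unsigned int sweeps = 0;

    if (lru_pages == 0)
        return 0;

    for (;;) {
        sweeps++;
        for (int priority = DEF_PRIORITY; priority >= 0; priority--) {
            /* Models "scan >>= priority": zero for tiny lists. */
            unsigned long scan = lru_pages >> priority;

            if (nr_scan_try_batch(scan, &saved) > 0)
                return sweeps;
        }
    }
}
```

With a single page on the list, only the priority 0 step contributes
anything (one page per sweep), so sweeps_until_scan(1) comes out as 32
in this model, which matches the estimate above.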

If that is right, how about this?

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 148c6e6..34983e1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1973,6 +1973,9 @@ static void shrink_zones(int priority, struct zonelist *zonelist,

 static bool zone_reclaimable(struct zone *zone)
 {
+       if (zone->all_unreclaimable)
+               return false;
+
        return zone->pages_scanned < zone_reclaimable_pages(zone) * 6;
 }
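
As a rough illustration of what the change above does to the racy state,
here is a small userspace model (not kernel code). struct zone_model and
its reclaimable_pages field are assumptions standing in for struct zone
and zone_reclaimable_pages(); the _old/_new suffixes are only there to
compare the check before and after the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few struct zone fields involved. */
struct zone_model {
    bool all_unreclaimable;          /* models zone->all_unreclaimable */
    unsigned long pages_scanned;     /* models zone->pages_scanned */
    unsigned long reclaimable_pages; /* models zone_reclaimable_pages(zone) */
};

/* The check as it is before the patch: looks only at pages_scanned. */
static bool zone_reclaimable_old(const struct zone_model *zone)
{
    return zone->pages_scanned < zone->reclaimable_pages * 6;
}

/* The check with the proposed early return on all_unreclaimable. */
static bool zone_reclaimable_new(const struct zone_model *zone)
{
    if (zone->all_unreclaimable)
        return false;

    return zone->pages_scanned < zone->reclaimable_pages * 6;
}
```

For a zone left in the racy state (all_unreclaimable set, pages_scanned
back at 0, one reclaimable page), the old check still reports the zone as
reclaimable while the new one does not. For a zone without the flag set,
both checks agree.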


-- 
Kind regards,
Minchan Kim
