From: Ying Han <yinghan@google.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>, Mel Gorman <mel@csn.ul.ie>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Rik van Riel <riel@redhat.com>, Hillf Danton <dhillf@gmail.com>,
	Hugh Dickins <hughd@google.com>,
	Dan Magenheimer <dan.magenheimer@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org
Subject: Re: [PATCH V5 5/5] mm: memcg discount pages under softlimit from per-zone reclaimable_pages
Date: Mon, 25 Jun 2012 14:00:52 -0700
Message-ID: <CALWz4izhVnyCgnv4b+QVOONm_-1edDBd_ejWNmLp5tJZczKZwQ@mail.gmail.com>
In-Reply-To: <20120619120523.GD27816@cmpxchg.org>

On Tue, Jun 19, 2012 at 5:05 AM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> On Mon, Jun 18, 2012 at 09:47:31AM -0700, Ying Han wrote:
>> The function zone_reclaimable() marks zone->all_unreclaimable based on
>> per-zone pages_scanned and reclaimable_pages. If all_unreclaimable is true,
>> alloc_pages could go to OOM instead of getting stuck in page reclaim.
>
> There is no zone->all_unreclaimable at this point, you removed it in
> the previous patch.
>
>> In memcg kernel, cgroup under its softlimit is not targeted under global
>> reclaim. So we need to remove those pages from reclaimable_pages, otherwise
>> it will cause reclaim mechanism to get stuck trying to reclaim from
>> all_unreclaimable zone.
>
> Can't you check if zone->pages_scanned changed in between reclaim
> runs?
>
> Or sum up the scanned and reclaimable pages encountered while
> iterating the hierarchy during regular reclaim and then use those
> numbers in the equation instead of the per-zone counters?
>
> Walking the full global hierarchy in all the places where we check if
> a zone is reclaimable is a scalability nightmare.

One way to solve this is to record the per-zone reclaimable pages (the
sum of the reclaimable pages of all memcgs above their softlimit) after
each shrink_zone(). That function already walks the memcg hierarchy and
checks the softlimit, so we don't need to do it again. The new value,
pages_reclaimed, is recorded per-zone, and the caller can compare it
with zone->pages_scanned.
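
To make the idea concrete, here is a minimal user-space sketch, not
kernel code. The struct and function names (memcg_sketch, zone_sketch,
shrink_zone_sketch, zone_reclaimable_sketch) are hypothetical stand-ins,
the "scan" is simulated, and the factor of 6 is assumed to mirror the
existing zone_reclaimable() heuristic. It only shows the sum being
cached during the single hierarchy walk and reused by the check:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; not the kernel's structures. */
struct memcg_sketch {
	unsigned long usage;      /* pages charged to the group */
	unsigned long soft_limit; /* soft limit, in pages */
	unsigned long file_lru;   /* file LRU pages in this zone */
};

struct zone_sketch {
	unsigned long pages_scanned;
	unsigned long pages_reclaimable; /* cached by the last walk */
};

static bool over_soft_limit(const struct memcg_sketch *m)
{
	return m->usage > m->soft_limit;
}

/* Stand-in for shrink_zone(): one walk, which also caches the sum. */
static void shrink_zone_sketch(struct zone_sketch *zone,
			       const struct memcg_sketch *groups, size_t n)
{
	unsigned long reclaimable = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!over_soft_limit(&groups[i]))
			continue; /* under softlimit: neither counted nor scanned */
		reclaimable += groups[i].file_lru;
		/* Simulation only: pretend every counted page was scanned. */
		zone->pages_scanned += groups[i].file_lru;
	}
	zone->pages_reclaimable = reclaimable;
}

/* The check reads the cached value instead of walking the hierarchy. */
static bool zone_reclaimable_sketch(const struct zone_sketch *zone)
{
	return zone->pages_scanned < zone->pages_reclaimable * 6;
}
```

With one group above its softlimit holding 40 file pages, the cached
value is 40 and the zone stays "reclaimable" until pages_scanned
reaches 240, without any extra hierarchy walk in the check itself.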

While running tests on the patch, I found that I cannot reproduce the
problem (machine hang while over-committing the softlimit) even
without the patch. I then realized that the problem only exists in our
internal version, which does not have the "sc->priority < DEF_PRIORITY
- 2" check to bypass the softlimit check. We left that check out to
guarantee no global pressure on high-priority memcgs. In that case,
global reclaim can never steal pages from memcgs that are all under
their softlimit, and the system can easily hang.

This is not the case in the version I am posting here. The patch
series guarantees that reclaim does not loop over memcgs that are all
under their softlimit by:
1. detecting whether no memcg is above its softlimit and, if so,
   skipping the softlimit check
2. only checking the softlimit if the priority is >= DEF_PRIORITY - 2
In summary, the problem described in this patch doesn't exist in the
posted version, so I am thinking of dropping this patch from my next
posting. Please comment.

--Ying

>> @@ -100,18 +100,36 @@ static __always_inline enum lru_list page_lru(struct page *page)
>>       return lru;
>>  }
>>
>> +static inline unsigned long get_lru_size(struct lruvec *lruvec,
>> +                                      enum lru_list lru)
>> +{
>> +     if (!mem_cgroup_disabled())
>> +             return mem_cgroup_get_lru_size(lruvec, lru);
>> +
>> +     return zone_page_state(lruvec_zone(lruvec), NR_LRU_BASE + lru);
>> +}
>> +
>>  static inline unsigned long zone_reclaimable_pages(struct zone *zone)
>>  {
>> -     int nr;
>> +     int nr = 0;
>> +     struct mem_cgroup *memcg;
>> +
>> +     memcg = mem_cgroup_iter(NULL, NULL, NULL);
>> +     do {
>> +             struct lruvec *lruvec = mem_cgroup_zone_lruvec(zone, memcg);
>>
>> -     nr = zone_page_state(zone, NR_ACTIVE_FILE) +
>> -          zone_page_state(zone, NR_INACTIVE_FILE);
>> +             if (should_reclaim_mem_cgroup(memcg)) {
>> +                     nr += get_lru_size(lruvec, LRU_INACTIVE_FILE) +
>> +                           get_lru_size(lruvec, LRU_ACTIVE_FILE);
>
> Sometimes, the number of reclaimable pages DO include those of groups
> for which should_reclaim_mem_cgroup() is false: when the priority
> level is <= DEF_PRIORITY - 2, as you defined in 1/5!  This means that
> you consider pages you just scanned unreclaimable, which can result in
> the zone being unreclaimable after the DEF_PRIORITY - 2 cycle, no?


