From: Aaditya Kumar <aaditya.kumar.30@gmail.com>
To: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>,
	Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Mel Gorman <mel@csn.ul.ie>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	tim.bird@am.sony.com, frank.rowand@am.sony.com,
	takuzo.ohara@ap.sony.com, kan.iibuchi@jp.sony.com,
	aaditya.kumar@ap.sony.com
Subject: Re: Accounting problem of MIGRATE_ISOLATED freed page
Date: Fri, 22 Jun 2012 13:26:54 +0530
Message-ID: <CAEtiSasQco=GPCwENUY5ND7uvsMrH0a-uTZ6o9GNwmC5dgsPkA@mail.gmail.com>
In-Reply-To: <CAHGf_=oo5GrsbjTRPF2vC-g8R1XVOhjLAMQg6ik49-fr8D=Q+g@mail.gmail.com>

On Fri, Jun 22, 2012 at 12:52 PM, KOSAKI Motohiro
<kosaki.motohiro@gmail.com> wrote:
>> Let me summarize again.
>>
>> The problem:
>>
>> When hotplug offlining happens on zone A, pages start being freed into buddy as MIGRATE_ISOLATE type.
>> (MIGRATE_ISOLATE is a very ironic type: the pages are apparently in buddy, but we can't allocate them.)
>> When a memory shortage happens during hotplug offlining, the current task starts to reclaim and wakes up kswapd.
>> Kswapd checks the watermark, then goes to sleep BECAUSE the current zone_watermark_ok_safe doesn't take
>> the MIGRATE_ISOLATE freed page count into account. The current task continues in the direct reclaim path without kswapd's help.
>> The problem is that zone->all_unreclaimable is set only by kswapd, so the current task would loop forever
>> like below.
>>
>> __alloc_pages_slowpath
>> restart:
>>        wake_all_kswapd
>> rebalance:
>>        __alloc_pages_direct_reclaim
>>                do_try_to_free_pages
>>                        if global_reclaim && !all_unreclaimable
>>                                return 1; /* It means we did did_some_progress */
>>        skip __alloc_pages_may_oom
>>        should_alloc_retry
>>                goto rebalance;
>>
>> If we apply KOSAKI's patch[1], which doesn't depend on kswapd for setting zone->all_unreclaimable,
>> we can solve this problem by killing some task. But it still doesn't wake up kswapd.
>> That could still be a problem if another subsystem needs a GFP_ATOMIC request.
>> So kswapd should consider MIGRATE_ISOLATE when it calculates free pages before going to sleep.
>
> I agree. And I believe we should remove the rebalance label, and an
> allocation retry should always wake up kswapd,
> because wake_all_kswapd is unreliable: it has no guarantee of
> succeeding in waking up kswapd. So this
> micro-optimization is NOT an optimization, just a source of trouble. Our
> memory reclaim logic has a lot of races
> by design, so no reclaim code should assume that someone else works fine.
>

I think this is a better approach. Since MIGRATE_ISOLATE is really a
temporary phenomenon, it makes sense to just retry the allocation.
One issue with this approach, however, is that it does not exactly work
for PAGE_ALLOC_COSTLY_ORDER allocations. But given the
frequency of such allocations, I think it may be an acceptable
compromise to handle such requests by OOM when many MIGRATE_ISOLATE
pages are present.
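
To make the retry idea concrete, here is a minimal user-space sketch, not
kernel code. Everything in it (struct toy_zone, toy_alloc, the counters) is
hypothetical and exists only for illustration. It assumes that every retry
wakes kswapd, that requests with order > PAGE_ALLOC_COSTLY_ORDER give up
instead of retrying, and that isolation ends after a fixed number of retries,
at which point the isolated pages become allocatable again.

```c
#include <assert.h>

#define PAGE_ALLOC_COSTLY_ORDER 3	/* same value as the kernel constant */

enum alloc_result { ALLOC_OK, ALLOC_OOM };

/* Hypothetical toy zone: only the counters needed for this illustration. */
struct toy_zone {
	long nr_free;			/* free pages, isolated ones included */
	long nr_isolated_free;		/* free pages in MIGRATE_ISOLATE blocks */
	int isolation_retries_left;	/* retries until isolation ends (assumed) */
	int kswapd_wakeups;		/* how often the retry path woke kswapd */
};

static enum alloc_result toy_alloc(struct toy_zone *z, unsigned int order)
{
	long want = 1L << order;

	assert(z->nr_isolated_free <= z->nr_free);

	for (;;) {
		/* No separate "rebalance" entry: every pass wakes kswapd. */
		z->kswapd_wakeups++;

		/* Isolated pages are on the free lists but not allocatable. */
		if (z->nr_free - z->nr_isolated_free >= want) {
			z->nr_free -= want;
			return ALLOC_OK;
		}

		/* Assumed compromise: costly orders are not retried. */
		if (order > PAGE_ALLOC_COSTLY_ORDER)
			return ALLOC_OOM;

		/* Isolation already over and still no memory: give up. */
		if (z->isolation_retries_left == 0)
			return ALLOC_OOM;

		/* Isolation is temporary; model it ending after N retries. */
		if (--z->isolation_retries_left == 0)
			z->nr_isolated_free = 0;
	}
}
```

In this model, with 64 free pages that are all isolated and three retries
left, an order-0 request succeeds on the fourth pass, while an order-4
request returns ALLOC_OOM on the first pass.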

What do you think?

>
>> First, I tried to solve this problem with this patch:
>> https://lkml.org/lkml/2012/6/20/30
>> The patch's goal was to NOT increase nr_free and NR_FREE_PAGES when we free a page into MIGRATE_ISOLATE.
>> It adds a little overhead when freeing higher-order pages, but I think that's not a big deal.
>> The bigger problem is the duplicated code for handling only MIGRATE_ISOLATE freed pages.
>>
>> The second approach, suggested by KOSAKI, is what you mentioned.
>> But the concern with the second approach is how to make sure increments and decrements of nr_isolated_areas match.
>> I mean, how to make sure nr_isolated_areas is zero when isolation is done.
>> Of course, we can investigate all current callers and make sure they don't make a mistake
>> now. But it's very error-prone if we consider future users.
>> So we might need test_set_pageblock_migratetype(page, MIGRATE_ISOLATE);
>>
>> IMHO, the ideal solution is to remove the MIGRATE_ISOLATE type from buddy entirely.
>> For that, isolating pages already freed in the buddy allocator is no problem, but the concern is how to handle
>> pages freed later by do_migrate_range in memory_hotplug.c.
>> We can create a custom putback_lru_pages:
>>
>> put_page_hotplug(page)
>> {
>>        struct zone *zone = page_zone(page);
>>        int migratetype = get_pageblock_migratetype(page);
>>        VM_BUG_ON(migratetype != MIGRATE_ISOLATE);
>>        __page_cache_release(page);
>>        free_one_page(zone, page, 0, MIGRATE_ISOLATE);
>> }
>>
>> putback_lru_pages_hotplug(&source)
>> {
>>        foreach page from source
>>                put_page_hotplug(page)
>> }
>>
>> do_migrate_range()
>> {
>>        migrate_pages(&source);
>>        putback_lru_pages_hotplug(&source);
>> }
>>
>> I hope this summary helps you, Kame. If I missed something, please let me know.
>
> I disagree with this, because memory hotplug intentionally doesn't use
> stop_machine. That is because
> we don't stop any system service while memory is being unplugged. That
> means various subsystems
> try to allocate memory during page migration for memory unplug. IOW,
> we shouldn't assume do_migrate_page()
> is the only caller.

