Re: [RFC]pagealloc: compensate a task for direct page reclaim

From: Minchan Kim <minchan.kim@gmail.com>
To: Shaohua Li <shaohua.li@intel.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mel@csn.ul.ie>
Subject: Re: [RFC]pagealloc: compensate a task for direct page reclaim
Date: Fri, 17 Sep 2010 13:47:56 +0900
Message-ID: <AANLkTi=XHGxcxcz82ccrLSUrS9NcXb6qBh0TcnGkPzYB@mail.gmail.com>
In-Reply-To: <20100917023457.GA26307@sli10-conroe.sh.intel.com>

On Fri, Sep 17, 2010 at 11:34 AM, Shaohua Li <shaohua.li@intel.com> wrote:
> On Thu, Sep 16, 2010 at 11:00:10PM +0800, Minchan Kim wrote:
>> On Thu, Sep 16, 2010 at 07:26:36PM +0800, Shaohua Li wrote:
>> > A task enters into direct page reclaim, free some memory. But sometimes
>> > the task can't get a free page after direct page reclaim because
>> > other tasks take them (this is quite common in a multi-task workload
>> > in my test). This behavior will bring extra latency to the task and is
>> > unfair. Since the task already gets penalty, we'd better give it a compensation.
>> > If a task frees some pages from direct page reclaim, we cache one freed page,
>> > and the task will get it soon. We only consider order 0 allocation, because
>> > it's hard to cache order > 0 page.
>> >
>> > Below is a trace output when a task frees some pages in try_to_free_pages(), but
>> > get_page_from_freelist() can't get a page in direct page reclaim.
>> >
>> > <...>-809   [004]   730.218991: __alloc_pages_nodemask: progress 147, order 0, pid 809, comm mmap_test
>> > <...>-806   [001]   730.237969: __alloc_pages_nodemask: progress 147, order 0, pid 806, comm mmap_test
>> > <...>-810   [005]   730.237971: __alloc_pages_nodemask: progress 147, order 0, pid 810, comm mmap_test
>> > <...>-809   [004]   730.237972: __alloc_pages_nodemask: progress 147, order 0, pid 809, comm mmap_test
>> > <...>-811   [006]   730.241409: __alloc_pages_nodemask: progress 147, order 0, pid 811, comm mmap_test
>> > <...>-809   [004]   730.241412: __alloc_pages_nodemask: progress 147, order 0, pid 809, comm mmap_test
>> > <...>-812   [007]   730.241435: __alloc_pages_nodemask: progress 147, order 0, pid 812, comm mmap_test
>> > <...>-809   [004]   730.245036: __alloc_pages_nodemask: progress 147, order 0, pid 809, comm mmap_test
>> > <...>-809   [004]   730.260360: __alloc_pages_nodemask: progress 147, order 0, pid 809, comm mmap_test
>> > <...>-805   [000]   730.260362: __alloc_pages_nodemask: progress 147, order 0, pid 805, comm mmap_test
>> > <...>-811   [006]   730.263877: __alloc_pages_nodemask: progress 147, order 0, pid 811, comm mmap_test
>> >
>>
>> The idea is good.
>>
>> I think we need to reserve at least one page for direct reclaimer who make the effort so that
>> it can reduce latency of stalled process.
>>
>> But I don't like this implementation.
>>
>> 1. It selects random page of reclaimed pages as cached page.
>> This doesn't consider requestor's migratetype so that it causes fragment problem in future.
> maybe we can limit the migratetype to MIGRATE_MOVABLE, which is the most common case.
>
>> 2. It skips buddy allocator. It means we lost coalescence chance so that fragement problem
>> would be severe than old.
> we only cache order 0 allocation, which doesn't enter lumpy reclaim, so this sounds not
> an issue to me.

I mean the following.

Old behavior:

1) A reclaimed order-0 page is returned to the buddy allocator.
2) If we are lucky, it fills the hole for order-1, so the page is
promoted to an order-1 page.
3) If we are lucky again, it fills the hole for order-2, so the page
is promoted to an order-2 page.
4) This repeats up to some order.
5) Finally, alloc_page allocates from buddy one order-0 page (i.e. one
that did not coalesce) out of all the pages the direct reclaimer
reclaimed.

But your patch loses that chance for the cached page.

Of course, if none of the reclaimed pages is left on the order-0 list
(i.e. all of them were coalesced), a bigger page has to be broken up
to get an order-0 page. But that is unlikely.
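
To make the coalescing argument concrete, here is a toy sketch in C. It
is not the kernel's __free_one_page(); every toy_* name and the 8-page
"zone" are made up for illustration. It only shows the buddy rule: a
freed block merges with its buddy when that buddy is free at the same
order, and the merge repeats upward.

```c
#include <assert.h>

#define TOY_MAX_ORDER 3                    /* largest block: 2^3 = 8 pages */
#define TOY_NPAGES    (1 << TOY_MAX_ORDER) /* hypothetical 8-page zone */
#define TOY_NOT_FREE  (-1)

/*
 * toy_free_order[pfn] is the order of the free block that starts at pfn,
 * or TOY_NOT_FREE if no free block starts there.
 */
static int toy_free_order[TOY_NPAGES];

/* Mark every page as allocated. */
static void toy_init(void)
{
	for (int pfn = 0; pfn < TOY_NPAGES; pfn++)
		toy_free_order[pfn] = TOY_NOT_FREE;
}

/*
 * Give a block back to the toy buddy lists and merge it upward as long
 * as its buddy is free at the same order. Returns the order the block
 * ends up at.
 */
static int toy_free(int pfn, int order)
{
	assert(pfn >= 0 && pfn < TOY_NPAGES);
	assert((pfn & ((1 << order) - 1)) == 0);   /* block must be aligned */

	while (order < TOY_MAX_ORDER) {
		int buddy = pfn ^ (1 << order);    /* flip the order bit */

		if (toy_free_order[buddy] != order)
			break;                     /* buddy busy or split */

		toy_free_order[buddy] = TOY_NOT_FREE;
		pfn &= buddy;                      /* merged block starts lower */
		order++;
	}
	toy_free_order[pfn] = order;
	return order;
}

/* Count free blocks of exactly the given order. */
static int toy_count_free(int order)
{
	int n = 0;

	for (int pfn = 0; pfn < TOY_NPAGES; pfn++)
		if (toy_free_order[pfn] == order)
			n++;
	return n;
}
```

With pages 1, 2 and 3 already free, calling toy_free(0, 0) merges page 0
up to one order-2 block. If page 0 were instead held back as the cached
page and never passed to toy_free(), the largest free block would stay at
order-1 (pages 2-3), which is the lost chance described above.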

-- 
Kind regards,
Minchan Kim
