From: mel@skynet.ie (Mel Gorman)
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: nicolas.mailhot@laposte.net, clameter@sgi.com, apw@shadowen.org,
akpm@linux-foundation.org, linux-mm@kvack.org
Subject: Re: [PATCH 2/2] Only check absolute watermarks for ALLOC_HIGH and ALLOC_HARDER allocations
Date: Wed, 16 May 2007 15:00:39 +0100
Message-ID: <20070516140038.GA10225@skynet.ie>
In-Reply-To: <464B089C.9070805@yahoo.com.au>

On (16/05/07 23:35), Nick Piggin didst pronounce:
> Mel Gorman wrote:
> >On (16/05/07 22:14), Nick Piggin didst pronounce:
> >
> >>Mel Gorman wrote:
> >>
> >>>zone_watermark_ok() checks if there are enough free pages including a
> >>>reserve.
> >>>High-order allocations additionally check if there are enough free
> >>>high-order
> >>>pages in relation to the watermark adjusted based on the requested size.
> >>>If
> >>>there are not enough free high-order pages available, 0 is returned so
> >>>that
> >>>the caller enters direct reclaim.
> >>>
> >>>ALLOC_HIGH and ALLOC_HARDER allocations are allowed to dip further into
> >>>the reserves but also take into account if the number of free high-order
> >>>pages meet the adjusted watermarks. As these allocations cannot sleep,
> >>
> >>Why can't ALLOC_HIGH or ALLOC_HARDER sleep? This patch seems wrong to
> >>me.
> >>
> >
> >
> >In page_alloc.c
> >
> > if ((unlikely(rt_task(p)) && !in_interrupt()) || !wait)
> > alloc_flags |= ALLOC_HARDER;
> >
> >See the !wait part.
>
> And the || part.
>
I doubt a rt_task is thrilled to be entering direct reclaim.
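
To make the two readings of that condition concrete, here is a minimal userspace sketch of how the flags are derived. It is a model, not the kernel code: struct alloc_ctx and model_alloc_flags() are hypothetical names, the flag values are illustrative, and the booleans stand in for rt_task(p), in_interrupt(), the wait flag and __GFP_HIGH.

```c
#include <stdbool.h>

/* Illustrative values for this sketch; only their distinctness matters. */
#define ALLOC_WMARK_MIN 0x02
#define ALLOC_HARDER    0x10
#define ALLOC_HIGH      0x20

/* Hypothetical stand-in for the caller state the allocator inspects. */
struct alloc_ctx {
	bool rt_task;      /* models rt_task(p) */
	bool in_interrupt; /* models in_interrupt() */
	bool wait;         /* caller is allowed to sleep */
	bool gfp_high;     /* models __GFP_HIGH in the gfp mask */
};

/* Derive the allocation flags the way the quoted condition does. */
static int model_alloc_flags(const struct alloc_ctx *c)
{
	int alloc_flags = ALLOC_WMARK_MIN;

	/* Either an rt task outside interrupt context, or a caller that cannot sleep. */
	if ((c->rt_task && !c->in_interrupt) || !c->wait)
		alloc_flags |= ALLOC_HARDER;

	/* __GFP_HIGH callers may dip further into the reserve. */
	if (c->gfp_high)
		alloc_flags |= ALLOC_HIGH;

	return alloc_flags;
}
```

Under this model a caller that cannot sleep always gets ALLOC_HARDER, but so does an rt task that is allowed to sleep, which is the case the "||" remark points at.
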
>
> >The ALLOC_HIGH applies to __GFP_HIGH allocations which are allowed to
> >dip into emergency pools and go below the reserve.
>
> And some of them can sleep too.
>
If you feel very strongly about it, I can back out the ALLOC_HIGH part for
__GFP_HIGH allocations, but at a glance the users of __GFP_HIGH do not look
too keen on sleeping:

  drivers/block/rd.c
    Comment:
      Deep badness. rd_blkdev_pagecache_IO() needs to allocate
      pagecache pages within a request_fn. We cannot recur back
      into the filesystem which is mounted atop the ramdisk
  fs/ext4/writeback.c
    Using __GFP_HIGH when allocating bios
  kernel/power/swap.c
    Using __GFP_HIGH when allocating bios

The change is still obeying watermarks, just at order-0 instead of
strictly observing the higher orders.
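
As a rough illustration of what obeying the watermarks "at order-0" means, the sketch below models the check in userspace. It is a simplification under stated assumptions, not the patch itself: struct zone_model, watermark_ok_model() and the relax_for_urgent switch are hypothetical names, and the flag values are illustrative. With the switch off, each order below the requested one is checked against a halved watermark; with it on, ALLOC_HIGH and ALLOC_HARDER callers stop after the order-0 check.

```c
#include <assert.h>

#define MAX_ORDER    11
#define ALLOC_HARDER 0x10 /* illustrative value */
#define ALLOC_HIGH   0x20 /* illustrative value */

/* Hypothetical, simplified zone: only the counters the check reads. */
struct zone_model {
	long free_pages;         /* total free pages in the zone */
	long nr_free[MAX_ORDER]; /* number of free blocks at each order */
	long lowmem_reserve;     /* reserve kept back for lower zones */
};

/* Returns 1 if an allocation of the given order may proceed, 0 otherwise. */
static int watermark_ok_model(const struct zone_model *z, int order, long mark,
			      int alloc_flags, int relax_for_urgent)
{
	long min = mark;
	long free_pages;
	int o;

	assert(order >= 0 && order < MAX_ORDER);

	/* Pages that would remain free after taking this block. */
	free_pages = z->free_pages - (1L << order) + 1;

	/* Urgent classes are allowed deeper into the reserve. */
	if (alloc_flags & ALLOC_HIGH)
		min -= min / 2;
	if (alloc_flags & ALLOC_HARDER)
		min -= min / 4;

	/* The order-0 check applies to every caller in both variants. */
	if (free_pages <= min + z->lowmem_reserve)
		return 0;

	/* Assumed patched behaviour: urgent callers skip the per-order checks. */
	if (relax_for_urgent && (alloc_flags & (ALLOC_HIGH | ALLOC_HARDER)))
		return 1;

	for (o = 0; o < order; o++) {
		/* Blocks of this order cannot satisfy a larger request. */
		free_pages -= z->nr_free[o] << o;
		/* Require fewer free pages at each higher order. */
		min >>= 1;
		if (free_pages <= min)
			return 0;
	}
	return 1;
}
```

For example, a zone with 1000 free pages, of which 984 are order-0 and 16 sit in two order-3 blocks, fails an order-3 ALLOC_HARDER request against a mark of 128 in the strict variant, although suitable blocks exist. The relaxed variant lets it through, while a zone holding only 50 free pages is still refused.
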
>
> >>>they cannot enter direct reclaim so the allocation can fail even though
> >>>the pages are available and the number of free pages is well above the
> >>>watermark for order-0.
> >>>
> >>>This patch alters the behaviour of zone_watermark_ok() slightly.
> >>>Watermarks
> >>>are still obeyed but when an allocator is flagged ALLOC_HIGH or
> >>>ALLOC_HARDER,
> >>>we only check that there is sufficient memory over the reserve to satisfy
> >>>the allocation, allocation size is ignored. This patch also documents
> >>>better what zone_watermark_ok() is doing.
> >>
> >>This is wrong because now you lose the buffering of higher order pages
> >>for more urgent allocation classes against less urgent ones.
> >>
> >
> >
> >ALLOC_HARDER is an urgent allocation class.
>
> And HIGH is even more, and MEMALLOC even more again.
>
HIGH => ALLOC_HIGH => obey watermarks at order-0
Somewhat counter-intuitively, with the current code, if the allocation is
really high priority but can sleep, it can actually allocate without any
watermarks at all.
>
> >>Think of how the order-0 allocation buffering works with the watermarks
> >>and consider that we're trying to do the same exact thing for higher order
> >>allocations here.
> >>
> >
> >
> >What actually happens is that high-order allocations fail even though
> >the watermarks are met because they cannot enter direct reclaim.
>
> Yeah, they fail leaving some spare for more urgent allocations. Like
> how the order-0 allocations work.
order-0 watermarks are still in place. After the patch, it is still not
possible for the allocations to break the watermarks there.
> They should also kick kswapd to start freeing pages _before_ they start
> failing too.
>
Should perhaps, but from what I read, kswapd is only kicked into action
when the first allocation attempt has already failed.
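
The ordering described here can be sketched as a small trace model. This is an assumption about the shape of the allocator path as this message reads it, not kernel code: model_alloc_path() and the step names are hypothetical, and the no-watermark branch for memory-reclaiming tasks is left out for brevity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical step names for the stages of one allocation attempt. */
enum alloc_step {
	STEP_TRY_LOW_WMARK,  /* first attempt, against the low watermark */
	STEP_WAKE_KSWAPD,    /* background reclaim is kicked */
	STEP_TRY_MIN_WMARK,  /* retry against the min watermark plus flags */
	STEP_DIRECT_RECLAIM, /* caller reclaims pages itself (may sleep) */
	STEP_FAIL,
	STEP_SUCCESS
};

#define MAX_STEPS 6

/*
 * Record the steps taken for one allocation. first_ok and second_ok say
 * whether the respective watermark checks pass; can_wait says whether
 * the caller may sleep. Returns the number of steps written to trace.
 */
static size_t model_alloc_path(bool first_ok, bool second_ok, bool can_wait,
			       enum alloc_step trace[MAX_STEPS])
{
	size_t n = 0;

	trace[n++] = STEP_TRY_LOW_WMARK;
	if (first_ok) {
		trace[n++] = STEP_SUCCESS;
		return n;
	}

	/* kswapd is only woken once the first attempt has failed. */
	trace[n++] = STEP_WAKE_KSWAPD;

	trace[n++] = STEP_TRY_MIN_WMARK;
	if (second_ok) {
		trace[n++] = STEP_SUCCESS;
		return n;
	}

	/* Callers that cannot sleep have no way to wait for reclaim. */
	if (!can_wait) {
		trace[n++] = STEP_FAIL;
		return n;
	}

	trace[n++] = STEP_DIRECT_RECLAIM;
	return n;
}
```

In this model an atomic caller whose two watermark checks both fail ends in STEP_FAIL right after kswapd has been woken, before background reclaim has had a chance to free anything.
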
--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab