From: Badari Pulavarty <pbadari@gmail.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Hugh Dickins <hugh@veritas.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm <linux-mm@kvack.org>
Subject: Re: grow_dev_page's __GFP_MOVABLE
Date: Thu, 13 Mar 2008 07:05:58 -0800 [thread overview]
Message-ID: <1205420758.19403.6.camel@dyn9047017100.beaverton.ibm.com> (raw)
In-Reply-To: <20080313120755.GC12351@csn.ul.ie>
On Thu, 2008-03-13 at 12:07 +0000, Mel Gorman wrote:
> On (12/03/08 18:11), Hugh Dickins didst pronounce:
> > On Wed, 12 Mar 2008, Mel Gorman wrote:
> > > On (11/03/08 21:33), Hugh Dickins didst pronounce:
> > > >
> > > > I'm (slightly) worried by your __GFP_MOVABLE in grow_dev_page:
> > > > is it valid, given that we come here for filesystem metadata pages
> > > > - don't we?
> > >
> > > This is a tricky one and the second time it's come up. The pages allocated
> > > here end up on the page cache and had a similar life-cycle to other LRU-pages
> > > in the majority of cases when I checked at the time. The allocations are
> > > labeled MOVABLE, but in this case they can be cleaned and moved to disk
> > > rather than movable by page migration. Strictly, one would argue that
> > > they could be marked RECLAIMABLE but it increases the number of pageblocks
> > > used by RECLAIMABLE allocations quite considerably and they have a very
> > > different lifecycle which in itself is bad (spreads difficult to reclaim
> > > allocations wider than necessary).
> >
> > I was finding this ever so hard to understand, but now think I was blocked
> > by the misapprehension that a filesystem would hold all the metadata pages
> > associated with an inode in core while that file was open. That might be
> > true of some primitive filesystems, but could hardly be true of a grownup
> > filesystem.
>
> That is my current understanding.
>
> > Though even so, I'd expect different kinds of metadata pages
> > to have significantly different lifecycles, and quite dependent on the
> > filesystem in question e.g. superblock pages are held in core? inodes?
> >
>
> This is probably true. To be right, every caller that enters this path should
> be updated separetly.
>
> > But you found that the majority are not, their counts merely raised while
> > being accessed by the filesystem, so made the decision to treat them all
> > as MOVABLE because most can be reclaimed from the pagecache in the same
> > way as pagecache pages, which needs significantly less effort than
> > RECLAIMing from slab.
>
> Yes, this is correct. For some filesystems, the pages with buffers can
> also be migrated (ext2, ext3, ext4, gfs2, ntfs, ocfs2, xfs) but it's not
> universal.
>
> > Too bad about the obstacle to defragmentation
> > that the held ones would pose.
>
> Yeah and I suspect if this is going to hit as a bug report, it will be
> related to memory hot-remove. At some point, I'll may have to bite
> the bullet and set this place to GFP_NOFS, distinguish between the
> different types of caller.
>
> > Okay (if I'm getting it right): you
> > have to choose one way or the other, you've chosen this way, fine.
> > And my argument by analogy with __GFP_HIGHMEM was just bogus.
> >
> > (I guess this is why you added GFP_HIGHUSER_PAGECACHE etc., which to
> > my mind just obfuscate things further, and intend a patch to remove.)
> >
>
> Ironically, that was originally introduced to make things easier to
> read.
>
> > Though, what prevents them from being genuinely MOVABLE while they're
> > not transiently in use by the filesystem?
>
> Some of them are. The address_space will have a migratepage() helper if
> they are really movable.
>
> >And why does block_dev.c
> > set mapping gfp_mask to GFP_USER instead of (not yet defined)
> > GFP_USER_MOVABLE (ah, GFP_USER_PAGECACHE is defined, but unused)?
> >
>
> When I last checked, the blockdev address_space did not implement migratepage()
> so pages allocated on its behalf were not movable. I cannot recall if
> they ended up on the LRU where they could be reclaimed as normal or not.
>
> > > Similarly, leaving them GFP_NOFS would
> > > scatter allocations like page table pages wider than expected.
> >
> > Yes, I accept we should do better than GFP_NOFS there: but I'm
> > now not seeing why it isn't just &~__GFP_FS, with block_dev.c
> > supplying the MOVABLE.
> >
>
> I don't have a quick answer. I've added to the to-do list to revisit
> this and see can it be done better.
Mel,
All I can say is, marking grow_dev_page() __GFP_MOVABLE is causing
nothing but trouble in my hotplug memory remove testing :(
I constantly see that even though memblock is marked "removable", I
can't move the allocations. Most of the times these allocations came
from grow_dev_pages or its friends :(
Either these pages are not movable/migratable or code is not working
or filesystem/block device is holding them up :(
memory offlining 0x8000 to 0x9000 failed
page_owner shows:
Page allocated via order 0, mask 0x120050
PFN 30625 Block 7 type 2 Flags L
[0xc0000000000c511c] .alloc_pages_current+208
[0xc0000000001049d8] .__find_get_block_slow+88
[0xc0000000004f0bbc] .__wait_on_bit+232
[0xc0000000000994ec] .__page_cache_alloc+24
[0xc000000000104fd8] .__find_get_block+272
[0xc00000000009a124] .find_or_create_page+76
[0xc0000000001063fc] .unlock_buffer+48
[0xc000000000105280] .__getblk+312
Thanks,
Badari
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2008-03-13 15:03 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-03-11 21:33 Hugh Dickins
2008-03-12 14:08 ` Mel Gorman
2008-03-12 18:11 ` Hugh Dickins
2008-03-13 12:07 ` Mel Gorman
2008-03-13 15:05 ` Badari Pulavarty [this message]
2008-03-13 15:44 ` Mel Gorman
2008-03-14 0:50 ` Badari Pulavarty
2008-03-14 11:47 ` Mel Gorman
2008-03-14 16:05 ` Badari Pulavarty
2008-03-14 18:52 ` Badari Pulavarty
2008-03-17 10:54 ` Andy Whitcroft
2008-03-17 14:04 ` Badari Pulavarty
2008-03-18 9:34 ` Mel Gorman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1205420758.19403.6.camel@dyn9047017100.beaverton.ibm.com \
--to=pbadari@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=hugh@veritas.com \
--cc=linux-mm@kvack.org \
--cc=mel@csn.ul.ie \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox