linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Mel Gorman <mel@csn.ul.ie>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, hartleys@visionengravers.com,
	mcrapet@gmail.com, linux@arm.linux.org.uk,
	fred99@carolina.rr.com, linux-arm-kernel@lists.arm.linux.org.uk,
	Badari Pulavarty <pbadari@us.ibm.com>,
	Yasunori Goto <y-goto@jp.fujitsu.com>
Subject: Re: [PATCH] Double check memmap is actually valid with a memmap has unexpected holes
Date: Wed, 6 May 2009 17:50:43 +0200	[thread overview]
Message-ID: <20090506155043.GA3084@cmpxchg.org> (raw)
In-Reply-To: <20090506143059.GB20709@csn.ul.ie>

On Wed, May 06, 2009 at 03:31:00PM +0100, Mel Gorman wrote:
> On Tue, May 05, 2009 at 01:06:53PM +0200, Johannes Weiner wrote:
> > Hi Mel,
> > 
> > On Tue, May 05, 2009 at 09:29:44AM +0100, Mel Gorman wrote:
> > > pfn_valid() is meant to be able to tell if a given PFN has valid memmap
> > > associated with it or not. In FLATMEM, it is expected that holes always
> > > have valid memmap as long as there is valid PFNs either side of the hole.
> > > In SPARSEMEM, it is assumed that a valid section has a memmap for the
> > > entire section.
> > > 
> > > However, ARM and maybe other embedded architectures in the future free
> > > memmap backing holes to save memory on the assumption the memmap is never
> > > used. The page_zone() linkages are then broken even though pfn_valid()
> > > returns true. A walker of the full memmap in this case must do additional
> > > check to ensure the memmap they are looking at is sane by making sure the
> > > zone and PFN linkages are still valid. This is expensive, but walkers of
> > > the full memmap are extremely rare.
> > > 
> > > This was caught before for FLATMEM and hacked around but it hits again
> > > for SPARSEMEM because the page_zone() linkages can look ok where the PFN
> > > linkages are totally screwed. This looks like a hatchet job but the reality
> > > is that any clean solution would end up consuming all the memory saved
> > > by punching these unexpected holes in the memmap. For example, we tried
> > > marking the memmap within the section invalid but the section size exceeds
> > > the size of the hole in most cases so pfn_valid() starts returning false
> > > where valid memmap exists. Shrinking the size of the section would increase
> > > memory consumption offsetting the gains.
> > > 
> > > This patch identifies when an architecture is punching unexpected holes
> > > in the memmap that the memory model cannot automatically detect. When set,
> > > walkers of the full memmap must call memmap_valid_within() for each PFN and
> > > passing in what it expects the page and zone to be for that PFN. If it finds
> > > the linkages to be broken, it assumes the memmap is invalid for that PFN.
> > > 
> > > Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> > 
> > I think we also need to fix up show_mem(). 
> 
> As it turns out, ARM has its own show_mem(). I don't see how, but ARM
> must not be using lib/show_mem.c even though it compiles it.

It's some linker magic for lib/.  It compiles both but treats the
library version as weak symbol (or something).

But with the zone-walking show_mem, I think it should be able to use
the generic version.

> > Attached is a
> > compile-tested patch, please have a look.  I am not sure about memory
> > hotplug issues but on a quick glance the vmstat stuff seems to be
> > optimistic as well.
> > 
> > ---
> > From: Johannes Weiner <hannes@cmpxchg.org>
> > Subject: lib: adjust show_mem() to support memmap holes
> > 
> > Some architectures free the backing of mem_map holes.  pfn_valid() is
> > not able to report this properly, so a stronger check is needed if the
> > caller is about to use the page descriptor derived from a pfn.
> > 
> > Change the node walking to zone walking and use memmap_valid_within()
> > to check for holes.  This is reliable as it additionally checks for
> > page_zone() and page_to_pfn() coherency.
> > 
> > Not-yet-signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > ---
> >  lib/show_mem.c |   21 +++++++++------------
> >  1 files changed, 9 insertions(+), 12 deletions(-)
> > 
> > diff --git a/lib/show_mem.c b/lib/show_mem.c
> > index 238e72a..ed3c3ec 100644
> > --- a/lib/show_mem.c
> > +++ b/lib/show_mem.c
> > @@ -11,29 +11,27 @@
> >  
> >  void show_mem(void)
> >  {
> > -	pg_data_t *pgdat;
> >  	unsigned long total = 0, reserved = 0, shared = 0,
> >  		nonshared = 0, highmem = 0;
> > +	struct zone *zone;
> >  
> >  	printk(KERN_INFO "Mem-Info:\n");
> >  	show_free_areas();
> >  
> > -	for_each_online_pgdat(pgdat) {
> > -		unsigned long i, flags;
> > +	for_each_populated_zone(zone) {
> > +		unsigned long start = zone->zone_start_pfn;
> > +		unsigned long end = start + zone->spanned_pages;
> 
> The patch appears to be doing two things
> 
> o Scanning zones instead of pgdats
> o Adding the use of memmap_valid_within()
> 
> Scanning zones instead of pgdats seems like a good idea on its own and should
> be split out for separate consideration.

Good idea.  Will do.

> > +		unsigned long pfn;
> >  
> > -		pgdat_resize_lock(pgdat, &flags);
> 
> How sure are you about removing the acquisition of this lock?  If anything,
> it appears that pagetypeinfo_showblockcount_print() should be taking this lock.

I'm completely unsure about it.

<adds memory hotplug guys to CC>

> > -		for (i = 0; i < pgdat->node_spanned_pages; i++) {
> > -			struct page *page;
> > -			unsigned long pfn = pgdat->node_start_pfn + i;
> > +		for (pfn = start; pfn < end; pfn++) {
> > +			struct page *page = pfn_to_page(pfn);
> >  
> 
> You need to check pfn_valid() before using pfn_to_page().
> 
> > -			if (unlikely(!(i % MAX_ORDER_NR_PAGES)))
> > +			if (unlikely(!(pfn % MAX_ORDER_NR_PAGES)))
> >  				touch_nmi_watchdog();
> >  
> > -			if (!pfn_valid(pfn))
> > +			if (!memmap_valid_within(pfn, page, zone))
> >  				continue;
> >  
> 
> You need both the pfn_valid() check and the memmap_valid_within() as
> memmap_valid_within() unconditionally returns 1 for most architectures. If
> you applied this patch as-is, memory holes in a zone will cause big problems -
> random results at best and invalid memory references at worst.

Ah, zone holes.  More reading up needed.  Thanks, Mel.

	Hannes

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2009-05-06 15:52 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-05-05  8:29 Mel Gorman
2009-05-05  8:36 ` Russell King - ARM Linux
2009-05-05  8:49   ` Mel Gorman
2009-05-13 16:34     ` [PATCH] Double check memmap is actually valid with a memmap has unexpected holes V2 Mel Gorman
2009-05-13 19:48       ` Andrew Morton
2009-05-14  8:39         ` Mel Gorman
2009-05-17 16:27           ` Russell King - ARM Linux
2009-05-05 11:06 ` [PATCH] Double check memmap is actually valid with a memmap has unexpected holes Johannes Weiner
2009-05-06 14:31   ` Mel Gorman
2009-05-06 15:50     ` Johannes Weiner [this message]
2009-05-07  5:29       ` Yasunori Goto
2009-05-14 17:02       ` Paul Mundt
2009-05-06 19:20     ` Russell King - ARM Linux

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090506155043.GA3084@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linux-foundation.org \
    --cc=fred99@carolina.rr.com \
    --cc=hartleys@visionengravers.com \
    --cc=linux-arm-kernel@lists.arm.linux.org.uk \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux@arm.linux.org.uk \
    --cc=mcrapet@gmail.com \
    --cc=mel@csn.ul.ie \
    --cc=pbadari@us.ibm.com \
    --cc=y-goto@jp.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox