linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrea Arcangeli <andrea@novell.com>
To: Dave Hansen <haveblue@us.ibm.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andi Kleen <ak@suse.de>, Andrew Morton <akpm@osdl.org>
Subject: Re: fix iounmap and a pageattr memleak (x86 and x86-64)
Date: Tue, 2 Nov 2004 23:07:20 +0100	[thread overview]
Message-ID: <20041102220720.GV3571@dualathlon.random> (raw)
In-Reply-To: <4187FA6D.3070604@us.ibm.com>

On Tue, Nov 02, 2004 at 01:21:49PM -0800, Dave Hansen wrote:
> This patch:
> 
> >From: Andrea Arcangeli <andrea@novell.com>
> >
> >- fix silent memleak in the pageattr code that I found while searching
> >  for the bug Andi fixed in the second patch below (basically reference
> >  counting in split page was done on the pmd instead of the pte).
> >
> >- Part of this patch is also needed to make the above work on x86 
> >(otherwise
> >  one of my new above BUGS() will trigger signalling the fact a bug was
> >  there).  The below patch creates a subtle dependency that (_PAGE_PCD << 
> >  24)
> >  must not be zero.  It's not the cleanest thing ever, but since it's an
> >  hardware bitflag I doubt it's going to break.
> >
> >Signed-off-by: Andi Kleen <ak@suse.de>
> >Signed-off-by: Andrea Arcangeli <andrea@novell.com>
> >Signed-off-by: Andrew Morton <akpm@osdl.org>
> >---
> >
> > 25-akpm/arch/i386/mm/ioremap.c    |    4 ++--
> > 25-akpm/arch/i386/mm/pageattr.c   |   13 +++++++------
> > 25-akpm/arch/x86_64/mm/ioremap.c  |   14 +++++++-------
> > 25-akpm/arch/x86_64/mm/pageattr.c |   23 ++++++++++++++---------
> > 4 files changed, 30 insertions(+), 24 deletions(-)
> 
> is hitting this BUG() during bootup:
> 
>        /* memleak and potential failed 2M page regeneration */
>        BUG_ON(!page_count(kpte_page));
> 
> in 2.6.10-rc1-mm2.
> 
> Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
> Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
> Memory: 511144k/524288k available (1856k kernel code, 12608k reserved, 
> 1186k data, 164k init, 0k highmem)
> Checking if this processor honours the WP bit even in supervisor mode... Ok.
> Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
> ------------[ cut here ]------------
> kernel BUG at arch/i386/mm/pageattr.c:136!
> invalid operand: 0000 [#1]
> SMP DEBUG_PAGEALLOC
> Modules linked in:
> CPU:    0
> EIP:    0060:[<c0113f48>]    Not tainted VLI
> EFLAGS: 00010046   (2.6.10-rc1-mm2)
> EIP is at __change_page_attr+0x28c/0x358
> eax: ffffffff   ebx: 017ff163   ecx: 00000000   edx: c10001e0
> esi: 00000000   edi: c000fff8   ebp: c10001e0   esp: c03f9d98
> ds: 007b   es: 007b   ss: 0068
> Process swapper (pid: 0, threadinfo=c03f9000 task=c0345b40)
> Stack: c102ffe0 00000000 00000000 00000046 c0113ceb c17f9000 c102ff20
>        c102ff20 017ff163 00000000 017ff163 00000000 c0113ceb c17fe000
>        c102ffc0 c102ffc0 c1000000 c17ff000 c000fff8 017ff163 00000000
>        017ff163 00000000 00000000
> Call Trace:
>  [<c0113ceb>] __change_page_attr+0x2f/0x358
>  [<c0113ceb>] __change_page_attr+0x2f/0x358
>  [<c011404a>] change_page_attr+0x36/0x54
>  [<c0114148>] kernel_map_pages+0x30/0x5f
>  [<c0137d80>] __alloc_pages+0x340/0x350
>  [<c0137dad>] __get_free_pages+0x1d/0x30
>  [<c013adfa>] kmem_getpages+0x26/0xd4
>  [<c013c221>] cache_grow+0xb1/0x150
>  [<c013c84a>] cache_alloc_refill+0x232/0x280
>  [<c013ccbe>] kmem_cache_alloc+0x5a/0x78
>  [<c01d4970>] idr_pre_get+0x1c/0x44
>  [<c0181b60>] sysfs_fill_super+0x0/0xa4
>  [<c01572cc>] set_anon_super+0x10/0xb8
>  [<c0156cff>] sget+0xb3/0x148
>  [<c0181b60>] sysfs_fill_super+0x0/0xa4
>  [<c0157636>] get_sb_single+0x26/0x8c
>  [<c0157608>] compare_single+0x0/0x8
>  [<c01572bc>] set_anon_super+0x0/0xb8
>  [<c0181c1d>] sysfs_get_sb+0x19/0x1d
>  [<c0181b60>] sysfs_fill_super+0x0/0xa4
>  [<c01576ea>] do_kern_mount+0x4e/0xd0
>  [<c015777d>] kern_mount+0x11/0x15
>  [<c0409962>] sysfs_init+0x1e/0x50
>  [<c0409430>] mnt_init+0xb4/0xc0
>  [<c040917a>] vfs_caches_init+0x7e/0x94
>  [<c03fa831>] start_kernel+0x12d/0x150
> Code: c7 0f 75 f5 f0 ff 4d 04 eb 08 0f 0b 85 00 88 c1 2d c0 8b 45 00 89 
> ea f6 c4 80 74 07 8b 55 0c 8d 74 26 00 8b 42 04 83 f8 ff 75 08 <0f> 0b 
> 88 00 88 c1 2d c0 a1 ac 6c 34 c0 a8 08 0f 84 aa 00 00 00
> 
> I'm tracking down now to see exactly what's going on.  This just a 
> regular, plain 4-way x86 box with 4GB of RAM.  Removing that BUG_ON() 
> lets me boot just fine.

you've a debugging option enabled.

I'm afraid somebody wrote common code with the hope that
change_page_attr had a natural universal API.

change_page_attr API has alwasy required you a certain level of symmetry
to work. Now I made it completley symmetric just to make it simpler to
use and I added BUG where the previous code would screwup.

If you do something like this you'll trigger a BUG_ON too:

	change_page_attr(page, PAGE_KERNEL_NOCACHE)
	change_page_attr(page, PAGE_KERNEL)
	change_page_attr(page, PAGE_KERNEL)

the last one will BUG().

At least the new API with these changes will not leak memory anymore as
far as you retain simmetry (even if you run the change_page_attr on the
same page from different context).

I suspect that such debugging option has never worked right.

pageattr will need fixing so that to provide a natural universal API
that keeps track of each page. Previously this would have failed:

A       change_page_attr(page1, UNCACHED)    -> pte page_count = 2 
B       change_page_attr(page1, WHATEVER)    -> pte page_count still 2 
C       change_page_attr(page2, UNCACHED)    -> pte page count = 3 
A       change_page_attr(page1, PAGE_KERNEL) -> pte page count = 2 
B       change_page_attr(page1, PAGE_KERNEL) -> pte page count = 1 ->screwup

Now the above can work fine. But still examples like the above one will
trigger bugs since they're simply violating the current restricted
change_page_attr API.

I believe if you disable the DEBUG_PAGEALLOC it should work fine again.

Still I recommend investigating _why_ debug_pagealloc is violating the
API. It might not be necessary to wait for the pageattr universal
feature to make DEBUG_PAGEALLOC work safe.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>

  reply	other threads:[~2004-11-02 22:07 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-11-02 21:21 Dave Hansen
2004-11-02 22:07 ` Andrea Arcangeli [this message]
2004-11-02 22:21   ` Dave Hansen
2004-11-02 22:29     ` Andrew Morton
2004-11-02 22:34       ` Dave Hansen
2004-11-03  0:54     ` Andrea Arcangeli
2004-11-02 22:45   ` Dave Hansen
2004-11-02 23:00     ` Dave Hansen
2004-11-03  1:35       ` Andrea Arcangeli
2004-11-03  1:43         ` Dave Hansen
2004-11-03  2:26           ` Andrea Arcangeli
2004-11-03  2:48             ` Dave Hansen
2004-11-03  3:05               ` Andrea Arcangeli
2004-11-03 19:37                 ` Dave Hansen
2004-11-05  0:02                 ` Dave Hansen
2004-11-05  0:40                   ` Dave Hansen
2004-11-05  0:53                     ` Andrea Arcangeli
2004-11-05  1:55                       ` Dave Hansen
2004-11-05  2:08                         ` Andrea Arcangeli
2004-11-05  2:23                           ` Dave Hansen
2004-11-05  4:03                             ` Andrea Arcangeli
2004-11-05  4:20                               ` Andrea Arcangeli
2004-11-02 23:04     ` Andrew Morton
2004-11-03  1:40       ` Andrea Arcangeli
2004-11-02 22:34 ` Jason Baron
2004-11-02 23:12   ` Andrea Arcangeli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20041102220720.GV3571@dualathlon.random \
    --to=andrea@novell.com \
    --cc=ak@suse.de \
    --cc=akpm@osdl.org \
    --cc=haveblue@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox