From: Badari Pulavarty <pbadari@us.ibm.com>
To: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Linux Kernel ML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>, Yinghai Lu <yhlu.kernel@gmail.com>,
	Andrew Morton <akpm@osdl.org>
Subject: Re: [PATCH 0/3 (RFC)](memory hotplug) freeing pages allocated by bootmem for hotremove
Date: Fri, 21 Mar 2008 09:05:37 -0800
Message-ID: <1206119137.8476.1.camel@dyn9047017100.beaverton.ibm.com>
In-Reply-To: <20080314231112.20D7.E1E9C6FF@jp.fujitsu.com>

On Fri, 2008-03-14 at 23:36 +0900, Yasunori Goto wrote:
> Hello.
> 
> I would like to post a patch set to free pages which are allocated by bootmem
> for memory-hotremove.
> 
> My basic idea is to use the remaining members of struct page to remember
> information about the users of bootmem (section number or node id).
> When a section is being removed, the kernel can check it.
> With this information, some issues can be solved.
> 
>   1) When the memmap of the section being removed is allocated on another
>      section by bootmem, it should/can be freed.
>   2) When the memmap of the section being removed is allocated on the
>      same section, it shouldn't be freed, because the section has to be
>      offlined already and all its pages must be isolated from the
>      page allocator. The kernel keeps it as is.
>   3) When the section being removed holds another section's memmap,
>      the kernel will be able to easily show the user which section should
>      be removed before it. (Not implemented yet)
>   4) In case 2) above, the page migrator will be able to check for and skip
>      the memmap during page isolation when pages are offlined.
>      Current page migration fails in this case because the page is
>      just a reserved page and the migrator can't tell whether such pages
>      can be removed or not. With this patch, it will be able to.
>      (Not implemented yet.)
>   5) Node information like pgdat has similar issues, but these
>      can be solved this way too.
>      (Not implemented yet, but the node id is remembered in the pages.)
> 
> Fortunately, the current bootmem allocator just keeps PageReserved flags
> and doesn't use any other members of struct page. The users of
> bootmem don't use them either.
> 
> This patch set needs Badari-san's generic __remove_pages() support patch.
> http://linux.derkeiler.com/Mailing-Lists/Kernel/2008-03/msg02881.html
> 
> I think this patch set is not perfect, because some section/node
> information is smaller than one page, and the bootmem allocator may
> mix in other data. This patch is still a trial.
> But I suppose this is a good start for everyone to understand what is necessary.
> 
> Please comment.
> 
> Other Todo:
>   - for SPARSEMEM_VMEMMAP.
>     Freeing vmemmap's pages is more difficult than normal sparsemem,
>     because not only the memmap's pages but also pages like page tables must
>     be removed too. If removing section has pages for , then it must
>     be migrated too. A relocatable page table is necessary.
>     
>   - Compile with other configs.
>     This version is just for requesting comments.
>     If this way is accepted, I'll check it.
>   - Follow the bootmem fix by Yinghai Lu-san.
>     (This patch set is still for 2.6.25-rc3-mm1 with Badari-san's patch.)
> 
> Thanks.
> 
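
One way to picture cases 1) and 2) of the quoted proposal is the minimal
Python model below. It is not kernel code and is not taken from the patch
set: `BootmemPage`, `host_section`, `owner_section` and
`can_free_memmap_page` are hypothetical names, and refusing to free pages
that describe a different section is an assumption made for the sketch.

```python
from dataclasses import dataclass


# Hypothetical model, not kernel code: each bootmem-allocated page remembers
# which section's memmap it holds, as the quoted proposal suggests.
@dataclass(frozen=True)
class BootmemPage:
    host_section: int   # section whose memory physically contains this page
    owner_section: int  # section whose memmap is stored in this page


def can_free_memmap_page(page: BootmemPage, removing_section: int) -> bool:
    """Return True when the page may be freed while removing_section goes away."""
    if page.owner_section != removing_section:
        # Assumption: a page describing some other section is left alone.
        return False
    # Case 1: the memmap lives on another section, so it should/can be freed.
    # Case 2: the memmap lives on the section being removed; that section is
    # already offlined and isolated from the page allocator, so keep it as is.
    return page.host_section != removing_section
```

For example, `can_free_memmap_page(BootmemPage(host_section=3, owner_section=5), 5)`
models case 1), while a page with `host_section=5, owner_section=5` models
case 2).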

Do you have any updates to this? I am getting the following boot panic
while testing it. Before I debug it, I want to make sure it's not already
fixed. Please let me know.

Thanks,
Badari

Linux version 2.6.25-rc5-mm1 (root@elm3b155) (gcc version 3.3.3 (SuSE Linux)) #2 SMP Fri Mar 21 07:48:29 PST 2008
[boot]0012 Setup Arch
NUMA associativity depth for CPU/Memory: 3
adding cpu 0 to node 0
node 0
NODE_DATA() = c000000071fea100
start_paddr = 0
end_paddr = 72000000
bootmap_paddr = 71fdb000
reserve_bootmem 0 7cc000
reserve_bootmem 23d0000 10000
reserve_bootmem 77b6000 84a000
reserve_bootmem 71fdb000 f000
reserve_bootmem 71fea100 1e00
reserve_bootmem 71febf68 14098
PCI host bridge /pci@800000020000002  ranges:
  IO 0x000003fe00200000..0x000003fe002fffff -> 0x0000000000000000
 MEM 0x0000040080000000..0x00000400bfffffff -> 0x00000000c0000000
PCI host bridge /pci@800000020000003  ranges:
  IO 0x000003fe00700000..0x000003fe007fffff -> 0x0000000000000000
 MEM 0x00000401c0000000..0x00000401ffffffff -> 0x00000000c0000000
EEH: PCI Enhanced I/O Error Handling Enabled
PPC64 nvram contains 7168 bytes
Zone PFN ranges:
  DMA             0 ->   466944
  Normal     466944 ->   466944
Movable zone start PFN for each node
  Node 0: 262144
early_node_map[1] active PFN ranges
    0:        0 ->   466944
[boot]0015 Setup Done
Built 1 zonelists in Node order, mobility grouping on.  Total pages: 451440
Policy zone: DMA
Kernel command line: root=/dev/sda3 selinux=0 elevator=cfq numa=debug kernelcore=1024M
[boot]0020 XICS Init
[boot]0021 XICS Done
PID hash table entries: 4096 (order: 12, 32768 bytes)
clocksource: timebase mult[1352e86] shift[22] registered
Console: colour dummy device 80x25
console handover: boot [udbg-1] -> real [hvc0]
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
freeing bootmem node 0
Unable to handle kernel paging request for data at address 0xcf7f80000000000c
Faulting instruction address: 0xc0000000000ce3e8
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=32 NUMA pSeries
Modules linked in:
NIP: c0000000000ce3e8 LR: c0000000000cf714 CTR: 800000000013f270
REGS: c0000000007639f0 TRAP: 0300   Not tainted  (2.6.25-rc5-mm1)
MSR: 8000000000009032 <EE,ME,IR,DR>  CR: 44002022  XER: 00000008
DAR: cf7f80000000000c, DSISR: 0000000042010000
TASK = c000000000689910[0] 'swapper' THREAD: c000000000760000 CPU: 0
GPR00: fffffffffffffffd c000000000763c70 c000000000761be0 0000000000000000
GPR04: cf7f800000000000 0000000000000000 0000000000000000 0000000000000001
GPR08: 0000000000000000 fffffffffffffffe 0000000000000088 cf00000000000000
GPR12: 0000000000004000 c00000000068a380 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 4000000001c00000 0000000000000000 0000000002241ed8 0000000000000000
GPR24: 0000000002242148 0000000000000000 c000000071feb000 0000000000000000
GPR28: c000000071feb000 0000000000000001 c0000000006e2bd8 cf7f800000000000
NIP [c0000000000ce3e8] .set_page_bootmem_info+0x10/0x38
LR [c0000000000cf714] .register_page_bootmem_info_section+0xc4/0x17c
Call Trace:
[c000000000763c70] [000000000000001a] 0x1a (unreliable)
[c000000000763d10] [c0000000000cf8f0] .register_page_bootmem_info_node+0x124/0x158
[c000000000763dc0] [c0000000006290e4] .free_all_bootmem_node+0x1c/0x3c
[c000000000763e50] [c00000000061d618] .mem_init+0xbc/0x260
[c000000000763ee0] [c00000000060bbcc] .start_kernel+0x2f4/0x3f4
[c000000000763f90] [c000000000008594] .start_here_common+0x54/0xc0
Instruction dump:
eb61ffd8 eb81ffe0 eba1ffe8 7c0803a6 ebc1fff0 ebe1fff8 7d808120 4e800020
2fa50000 3920fffe 3800fffd 409e000c <9124000c> 48000008 9004000c 38000800
---[ end trace 31fd0ba7d8756001 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
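
As a reading aid for the oops above: the word in angle brackets in the
instruction dump is the faulting instruction. Assuming the standard PowerPC
D-form encoding (primary opcode 36 is `stw`), the sketch below splits that
word into its fields and recomputes the effective address from GPR04. The
helper name `decode_ppc_dform` is made up for this example. The result
suggests the fault is a 32-bit store of r9 at offset 12 from r4, that is,
a write through the page pointer handed to set_page_bootmem_info; it does
not by itself explain why that address is unmapped.

```python
def decode_ppc_dform(word: int) -> dict:
    """Split a 32-bit PowerPC D-form instruction word into its fields."""
    disp = word & 0xFFFF
    if disp & 0x8000:          # the displacement is a signed 16-bit value
        disp -= 0x10000
    return {
        "opcode": (word >> 26) & 0x3F,   # primary opcode; 36 is stw
        "rs": (word >> 21) & 0x1F,       # source register
        "ra": (word >> 16) & 0x1F,       # base address register
        "disp": disp,
    }


fields = decode_ppc_dform(0x9124000C)   # the word marked <...> in the dump
gpr04 = 0xCF7F800000000000              # GPR04 from the register dump
effective_address = (gpr04 + fields["disp"]) & 0xFFFFFFFFFFFFFFFF
```

Here `fields` comes out as opcode 36, rs 9, ra 4, disp 12, and
`effective_address` equals the DAR value cf7f80000000000c reported in the
oops.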


