From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Rik van Riel <riel@redhat.com>,
Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Nick Piggin <npiggin@suse.de>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
kernel-testers@vger.kernel.org
Subject: Re: [PATCH][RFC] fix kernel BUG at mm/migrate.c:719! in 2.6.26-rc5-mm3
Date: Tue, 17 Jun 2008 13:46:38 -0400
Message-ID: <1213724798.8707.41.camel@lts-notebook>
In-Reply-To: <20080617163501.7cf411ee.nishimura@mxp.nes.nec.co.jp>
On Tue, 2008-06-17 at 16:35 +0900, Daisuke Nishimura wrote:
> Hi.
>
> I got this bug while migrating pages only a few times
> via memory_migrate of cpuset.
Ah, I did test migration fairly heavily, but not by moving cpusets.
>
> Unfortunately, even with this patch applied,
> I got a bad_page problem after hundreds of page migrations
> (I'll report it in another mail).
> But I believe something like this patch is needed anyway.
Agreed. See comments below.
>
> ------------[ cut here ]------------
> kernel BUG at mm/migrate.c:719!
> invalid opcode: 0000 [1] SMP
> last sysfs file: /sys/devices/system/cpu/cpu3/cache/index1/shared_cpu_map
> CPU 0
> Modules linked in: ipv6 autofs4 hidp rfcomm l2cap bluetooth sunrpc dm_mirror dm_log dm_multipath dm_mod sbs sbshc button battery acpi_memhotplug ac parport_pc lp parport floppy serio_raw rtc_cmos rtc_core rtc_lib 8139too pcspkr 8139cp mii ata_piix libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd [last unloaded: microcode]
> Pid: 3096, comm: switch.sh Not tainted 2.6.26-rc5-mm3 #1
> RIP: 0010:[<ffffffff8029bb85>] [<ffffffff8029bb85>] migrate_pages+0x33e/0x49f
> RSP: 0018:ffff81002f463bb8 EFLAGS: 00010202
> RAX: 0000000000000000 RBX: ffffe20000c17500 RCX: 0000000000000034
> RDX: ffffe20000c17500 RSI: ffffe200010003c0 RDI: ffffe20000c17528
> RBP: ffffe200010003c0 R08: 8000000000000000 R09: 304605894800282f
> R10: 282f87058b480028 R11: 0028304005894800 R12: ffff81003f90a5d8
> R13: 0000000000000000 R14: ffffe20000bf4cc0 R15: ffff81002f463c88
> FS: 00007ff9386576f0(0000) GS:ffffffff8061d800(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00007ff938669000 CR3: 000000002f458000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process switch.sh (pid: 3096, threadinfo ffff81002f462000, task ffff81003e99cf10)
> Stack: 0000000000000001 ffffffff80290777 0000000000000000 0000000000000000
> ffff81002f463c88 ffff81000000ea18 ffff81002f463c88 000000000000000c
> ffff81002f463ca8 00007ffffffff000 00007fff649f6000 0000000000000004
> Call Trace:
> [<ffffffff80290777>] ? new_node_page+0x0/0x2f
> [<ffffffff80291611>] ? do_migrate_pages+0x19b/0x1e7
> [<ffffffff802315c7>] ? set_cpus_allowed_ptr+0xe6/0xf3
> [<ffffffff8025c827>] ? cpuset_migrate_mm+0x58/0x8f
> [<ffffffff8025d0fd>] ? cpuset_attach+0x8b/0x9e
> [<ffffffff8025a3e1>] ? cgroup_attach_task+0x3a3/0x3f5
> [<ffffffff80276cb5>] ? __alloc_pages_internal+0xe2/0x3d1
> [<ffffffff8025af06>] ? cgroup_common_file_write+0x150/0x1dd
> [<ffffffff8025aaf4>] ? cgroup_file_write+0x54/0x150
> [<ffffffff8029f839>] ? vfs_write+0xad/0x136
> [<ffffffff8029fd76>] ? sys_write+0x45/0x6e
> [<ffffffff8020bef2>] ? tracesys+0xd5/0xda
>
>
> Code: 4c 48 8d 7b 28 e8 cc 87 09 00 48 83 7b 18 00 75 30 48 8b 03 48 89 da 25 00 40 00 00 48 85 c0 74 04 48 8b 53 10 83 7a 08 01 74 04 <0f> 0b eb fe 48 89 df e8 5e 50 fd ff 48 89 df e8 7d d6 fd ff eb
> RIP [<ffffffff8029bb85>] migrate_pages+0x33e/0x49f
> RSP <ffff81002f463bb8>
> Clocksource tsc unstable (delta = 438246251 ns)
> ---[ end trace ce4e6053f7b9bba1 ]---
>
>
> This bug is caused by VM_BUG_ON() in unmap_and_move().
>
> unmap_and_move()
> 710         if (rc != -EAGAIN) {
> 711                 /*
> 712                  * A page that has been migrated has all references
> 713                  * removed and will be freed. A page that has not been
> 714                  * migrated will have kepts its references and be
> 715                  * restored.
> 716                  */
> 717                 list_del(&page->lru);
> 718                 if (!page->mapping) {
> 719                         VM_BUG_ON(page_count(page) != 1);
> 720                         unlock_page(page);
> 721                         put_page(page); /* just free the old page */
> 722                         goto end_migration;
> 723                 } else
> 724                         unlock = putback_lru_page(page);
> 725         }
I think that at least part of your patch, below, should fix this
problem. See comments there.

Now I wonder if the assertion that newpage count == 1 could be violated?
I don't see how. We've just allocated and filled it and haven't
unlocked it yet, so we should hold the only reference. Do you agree?
>
> I think the page count is not necessarily 1 here, because
> migration_entry_wait increases page count and waits for the
> page to be unlocked.
> So, if the old page is accessed between migrate_page_move_mapping,
> which checks the page count, and remove_migration_ptes, page count
> would not be 1 here.
>
> Actually, just commenting out get/put_page from migration_entry_wait
> works well in my environment (hundreds of page migrations succeeded),
> but modifying migration_entry_wait this way is not good, I think.
>
>
> This patch depends on Lee Schermerhorn's fix for double unlock_page.
>
> This patch also fixes a race between migration_entry_wait and
> page_freeze_refs in migrate_page_move_mapping.
>
>
> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
>
> ---
> diff -uprN linux-2.6.26-rc5-mm3/mm/migrate.c linux-2.6.26-rc5-mm3-test/mm/migrate.c
> --- linux-2.6.26-rc5-mm3/mm/migrate.c 2008-06-17 15:31:23.000000000 +0900
> +++ linux-2.6.26-rc5-mm3-test/mm/migrate.c 2008-06-17 13:59:15.000000000 +0900
> @@ -232,6 +232,7 @@ void migration_entry_wait(struct mm_stru
> swp_entry_t entry;
> struct page *page;
>
> +retry:
> ptep = pte_offset_map_lock(mm, pmd, address, &ptl);
> pte = *ptep;
> if (!is_swap_pte(pte))
> @@ -243,11 +244,20 @@ void migration_entry_wait(struct mm_stru
>
> page = migration_entry_to_page(entry);
>
> - get_page(page);
> - pte_unmap_unlock(ptep, ptl);
> - wait_on_page_locked(page);
> - put_page(page);
> - return;
> + /*
> + * page count might be set to zero by page_freeze_refs()
> + * in migrate_page_move_mapping().
> + */
> + if (get_page_unless_zero(page)) {
> + pte_unmap_unlock(ptep, ptl);
> + wait_on_page_locked(page);
> + put_page(page);
> + return;
> + } else {
> + pte_unmap_unlock(ptep, ptl);
> + goto retry;
> + }
> +
I'm not sure about this part. If it IS needed, I think it would be
needed independently of the unevictable/putback_lru_page() changes, as
this race must have already existed.

However, unmap_and_move() replaces the migration entries with bona fide
ptes referencing the new page before freeing the old page, so I think
we're OK without this change.
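
As an illustration of the race being discussed in this hunk, here is a minimal userspace sketch using C11 atomics. It is not kernel code: `struct model_page` and the `model_*` functions are hypothetical stand-ins, and the behaviour of the freeze step (set the count to zero only if it matches the expected value) is assumed from the comment in the quoted patch rather than taken from the -mm source.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the reference count. */
struct model_page {
	atomic_int count;
};

/* Models get_page_unless_zero(): take a reference only if count != 0. */
static bool model_get_unless_zero(struct model_page *p)
{
	int old = atomic_load(&p->count);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&p->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Models the freeze step: zero the count only if it equals 'expected'. */
static bool model_freeze_refs(struct model_page *p, int expected)
{
	return atomic_compare_exchange_strong(&p->count, &expected, 0);
}

/* Models the unfreeze step: restore a known count. */
static void model_unfreeze_refs(struct model_page *p, int count)
{
	atomic_store(&p->count, count);
}

/*
 * Walks through one interleaving: the migrating side freezes the count,
 * a waiter fails to take a reference while it is frozen, and succeeds
 * once the count is restored. Returns the final count.
 */
static int model_demo(void)
{
	struct model_page page;
	bool ok;

	atomic_init(&page.count, 2);	/* starting value is an assumption */

	ok = model_freeze_refs(&page, 2);
	assert(ok);
	ok = model_get_unless_zero(&page);	/* frozen: must fail */
	assert(!ok);

	model_unfreeze_refs(&page, 2);
	ok = model_get_unless_zero(&page);	/* unfrozen: must succeed */
	assert(ok);

	return atomic_load(&page.count);
}
```

In this model a plain unconditional increment during the frozen window would leave the count at 1 instead of 0, which is the situation the retry loop in the quoted hunk is meant to avoid. Whether that window is actually reachable is the question raised in the reply above.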
> out:
> pte_unmap_unlock(ptep, ptl);
> }
> @@ -715,13 +725,7 @@ unlock:
> * restored.
> */
> list_del(&page->lru);
> - if (!page->mapping) {
> - VM_BUG_ON(page_count(page) != 1);
> - unlock_page(page);
> - put_page(page); /* just free the old page */
> - goto end_migration;
> - } else
> - unlock = putback_lru_page(page);
> + unlock = putback_lru_page(page);
> }
>
> if (unlock)
I agree with this part. I came to the same conclusion looking at the
code. If we just changed the if() and VM_BUG_ON() to:

	if (!page->mapping && page_count(page) == 1) { ...

we'd be doing exactly what putback_lru_page() is doing. So, this code
was always unnecessary, duplicate code [that I was trying to avoid :(].
So, just let putback_lru_page() handle this condition and conditionally
unlock_page().
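
The difference between the original check and the condition suggested above can be sketched as a small decision model. This is a userspace illustration only: `struct model_page`, `enum outcome`, and both functions are hypothetical names, and what happens on the putback path is deliberately left opaque because it depends on putback_lru_page() itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two struct page fields the check reads. */
struct model_page {
	void *mapping;	/* NULL once the old page has lost its mapping */
	int count;	/* models page_count(page) */
};

enum outcome {
	OUTCOME_BUG,	 /* VM_BUG_ON() would fire */
	OUTCOME_FREED,	 /* unlock_page() + put_page(): old page is freed */
	OUTCOME_PUTBACK	 /* page is handed to putback_lru_page() */
};

/* Models the original code at mm/migrate.c lines 718-724. */
static enum outcome old_cleanup(const struct model_page *page)
{
	if (page->mapping == NULL) {
		if (page->count != 1)
			return OUTCOME_BUG;
		return OUTCOME_FREED;
	}
	return OUTCOME_PUTBACK;
}

/* Models the suggested condition: the count is tested, not asserted. */
static enum outcome suggested_cleanup(const struct model_page *page)
{
	if (page->mapping == NULL && page->count == 1)
		return OUTCOME_FREED;
	return OUTCOME_PUTBACK;
}

/* The two variants differ only when an extra reference is held. */
static void model_self_check(void)
{
	struct model_page extra_ref = { .mapping = NULL, .count = 2 };

	assert(old_cleanup(&extra_ref) == OUTCOME_BUG);
	assert(suggested_cleanup(&extra_ref) == OUTCOME_PUTBACK);
}
```

The only input on which the two variants disagree is an unmapped page with a transient extra reference, such as the one taken in migration_entry_wait(); there the original code hits the BUG and the suggested condition falls through to the putback path.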
I'm testing with my stress load with the 2nd part of the patch above and
it's holding up OK. Of course, I didn't hit the problem before. I'll
try your duplicator script and see what happens.
Regards,
Lee