From: Hillf Danton <dhillf@gmail.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: hugetlb oops on 3.1.0-rc8-devel
Date: Wed, 2 Nov 2011 20:06:46 +0800 [thread overview]
Message-ID: <CAJd=RBDdirdNiPMVcYLNFO5Ho+pRGCfh_RRA7_re+76Ds+H0pw@mail.gmail.com> (raw)
In-Reply-To: <CALCETrW1mpVCz2tO5roaz1r6vnno+srHR-dHA6_pkRi2qiCfdw@mail.gmail.com>
On Wed, Nov 2, 2011 at 6:20 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> (Disclaimer: I have my file_update_time patch running in this kernel.
> But I doubt that's the problem, since it has nothing to due with
> hugepages.)
>
> My system just oopsed, saying this:
>
> ... lots more than just this ...
> [334297.627797] PID 25627 killed due to inadequate hugepage pool
> [334298.130665] PID 25628 killed due to inadequate hugepage
> pool"Developers ("Developers (AMA)" <dev@amacapital.net>, AMA)"
> <dev@amacapital.net>,
> [334298.993760] PID 25630 killed due to inadequate hugepage pool
> [334299.373970] PID 25631 killed due to inadequate hugepage pool
> [334301.073572] PID 25633 killed due to inadequate hugepage pool
> [334301.460784] PID 25634 killed due to inadequate hugepage pool
> [334302.602704] PID 25636 killed due to inadequate hugepage pool
> [334302.911221] PID 25637 killed due to inadequate hugepage pool
> [334304.119890] PID 25639 killed due to inadequate hugepage pool
> [334304.636179] PID 25640 killed due to inadequate hugepage pool
> [334305.143010] PID 25641 killed due to inadequate hugepage pool
> [334305.934934] PID 25642 killed due to inadequate hugepage pool
> [334306.595565] PID 25644 killed due to inadequate hugepage pool
> [334308.043332] PID 25646 killed due to inadequate hugepage pool
> [334308.606766] PID 25647 killed due to inadequate hugepage pool
> [334309.628432] PID 25649 killed due to inadequate hugepage pool
> [334309.961794] PID 25650 killed due to inadequate hugepage pool
> [334311.170115] PID 25652 killed due to inadequate hugepage pool
> [334311.287254] ------------[ cut here ]------------
> [334311.287278] kernel BUG at mm/hugetlb.c:2407!
> [334311.287296] invalid opcode: 0000 [#1] SMP
> [334311.287315] CPU 6
> [334311.287326] Modules linked in: ppdev parport_pc lp parport
> des_generic ecb md4 nls_utf8 cifs fscache tcp_lp fuse acpi_cpufreq
> mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter
> ip6_tables w83627ehf hwmon_vid coretemp btrfs zlib_deflate
> lzo_compress libcrc32c snd_hda_codec_hdmi snd_hda_codec_realtek vfat
> fat joydev snd_hda_intel snd_hda_codec snd_hwdep snd_seq
> snd_seq_device snd_pcm snd_timer serio_raw snd i2c_i801 pcspkr
> microcode e1000e soundcore xhci_hcd mei(C) snd_page_alloc iTCO_wdt
> iTCO_vendor_support virtio_net virtio_ring virtio kvm_intel kvm ipv6
> xts gf128mul dm_crypt firewire_ohci firewire_core crc_itu_t i915
> drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded:
> scsi_wait_scan]
> [334311.287682]
> [334311.287693] Pid: 22419, comm: lk_index_and_co Tainted: G C
> 3.1.0-rc8-devel+ #416 /DQ67SW
> [334311.287743] RIP: 0010:[<ffffffff810e3137>] [<ffffffff810e3137>]
> hugetlb_cow+0x209/0x355
> [334311.287791] RSP: 0018:ffff880147cbdcb8 EFLAGS: 00010202
> [334311.287813] RAX: 0000000000000001 RBX: ffff88017e668498 RCX:
> 0000000000000038
> [334311.287840] RDX: 0000000000000000 RSI: ffff88017e6684e8 RDI:
> ffffea0005110000
> [334311.287868] RBP: ffff880147cbdd78 R08: 0000000000000000 R09:
> ffff88023e38c600
> [334311.287896] R10: 0000000000000001 R11: ffff88022cceb038 R12:
> ffffea0005110000
> [334311.287925] R13: 00007f1604600000 R14: ffffffff81bf8c20 R15:
> ffff88022cc1c280
> [334311.287954] FS: 00007f160b955740(0000) GS:ffff88023e380000(0000)
> knlGS:0000000000000000
> [334311.287983] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [334311.288007] CR2: 00007f9c18508000 CR3: 00000001549cd000 CR4:
> 00000000000406e0
> [334311.288035] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 00000"Developers (AMA)" <dev@amacapital.net>, 00000000000
> [334311.288063] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400"Developers (AMA)" <dev@amacapital.net>,
> [334311.288092] Process [random internal program] (pid: 22419,
> threadinfo ffff880147cbc000, task ffff88022f185a80)
> [334311.288126] Stack:
> [334311.288137] ffffffff81a33800 ffff88010c210808 ffff880213c88540
> ffffffff81bf8c20
> [334311.288172] 0000000000000000 ffff8801498a6118 80000001444000e5
> ffff88010c210848
> [334311.288206] 0000000000000001 00007f160461acd0 ffff880147cbdd18
> ffff88017e6684e8
> [334311.288241] Call Trace:
> [334311.288256] [<ffffffff810e36e6>] hugetlb_fault+0x3fd/0x49d
> [334311.288280] [<ffffffff810d17c2>] handle_mm_fault+0x81/0x1ae
> [334311.288305] [<ffffffff8101c9b5>] ? x2apic_send_IPI_mask+0x13/0x15
> [334311.288331] [<ffffffff81440f39>] do_page_fault+0x31f/0x366
> [334311.288357] [<ffffffff8103e9ab>] ? wake_up_new_task+0xb4/0xcd
> [334311.288382] [<ffffffff811c9820>] ? security_file_alloc+0x16/0x18
> [334311.288408] [<ffffffff810457f0>] ? do_fork+0x18c/0x225
> [334311.288432] [<ffffffff8143df2e>] ? _raw_spin_lock+0xe/0x10
> [334311.288456] [<ffffffff8143df2e>] ? _raw_spin_lock+0xe/0x10
> [334311.288480] [<ffffffff8143e4af>] page_fault+0x1f/0x30
> [334311.288500] Code: ff ff 48 8d 75 98 48 89 c7 e8 3b 54 fe ff 48 85
> c0 75 b6 48 8b bd 78 ff ff ff e8 2d 9f 35 00 4c 89 e7 e8 ed e5 ff ff
> ff c8 74 02 <0f> 0b 49 8d 7f 5c e8 de ad 35 00 e9 58 fe ff ff 48 8b bd
> 78 ff
> [334311.288652] RIP [<ffffffff810e3137>] hugetlb_cow+0x209/0x355
> [334311.288676] RSP <ffff880147cbdcb8>
> [334311.400050] ---[ end trace de2300cc5bd541be ]---
>
> After the oops, accesses to /proc/22418/cmdline (and other things in
> that directory) hang. 22418 is not the oopsing process or one of its
> children, so this is strange.
>
> The program that oopsed is written in python and does very little
> other than firing off subprocesses (in C) and having a replacement
> malloc LD_PRELOADed that is backed by MAP_PRIVATE | MAP_ANONYMOUS |
> MAP_FIXED | MAP_HUGETLB mappings. This is clearly a bad idea and I'll
> stop doing it; nonetheless, it shouldn't oops.
>
> The line that crashed is BUG_ON(page_count(old_page) != 1) in hugetlb_cow.
>
Hello Andy
Would you please try the following patch?
Thanks
Hillf
--- a/mm/hugetlb.c Sat Aug 13 11:45:14 2011
+++ b/mm/hugetlb.c Wed Nov 2 20:12:00 2011
@@ -2422,6 +2422,8 @@ retry_avoidcopy:
* anon_vma prepared.
*/
if (unlikely(anon_vma_prepare(vma))) {
+ page_cache_release(new_page);
+ page_cache_release(old_page);
/* Caller expects lock to be held */
spin_lock(&mm->page_table_lock);
return VM_FAULT_OOM;
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2011-11-02 12:06 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-11-01 22:20 Andy Lutomirski
2011-11-02 12:06 ` Hillf Danton [this message]
2011-11-02 17:18 ` Andy Lutomirski
2011-12-06 20:16 ` David Rientjes
2011-12-06 20:43 ` Andy Lutomirski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAJd=RBDdirdNiPMVcYLNFO5Ho+pRGCfh_RRA7_re+76Ds+H0pw@mail.gmail.com' \
--to=dhillf@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=luto@amacapital.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox