From: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
To: Greg KH <greg@kroah.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	LKML <linux-kernel@vger.kernel.org>,
	stable <stable@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	linux-mm@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Rik van Riel <riel@redhat.com>
Subject: Re: divide error: 0000 [#1] SMP in task_numa_migrate - handle_mm_fault vanilla 4.4.6
Date: Mon, 21 Mar 2016 11:52:23 +0100
Message-ID: <56EFD267.9070609@profihost.ag>
In-Reply-To: <20160320214130.GB23920@kroah.com>


On 20.03.2016 at 22:41, Greg KH wrote:
> On Sun, Mar 20, 2016 at 10:27:23PM +0100, Stefan Priebe wrote:
>>
>> On 19.03.2016 at 23:26, Vlastimil Babka wrote:
>>> On 03/17/2016 07:45 PM, Greg KH wrote:
>>>> On Thu, Mar 17, 2016 at 07:38:03PM +0100, Stefan Priebe wrote:
>>>>> Hi,
>>>>>
>>>>> while running qemu 2.5 on a host running 4.4.6, the host system has
>>>>> crashed (load > 200) 3 times in the last 3 days.
>>>>>
>>>>> Always with this stack trace (copy saved here:
>>>>> http://pastebin.com/raw/bCWTLKyt):
>>>>>
>>>>> [69068.874268] divide error: 0000 [#1] SMP
>>>>> [69068.875242] Modules linked in: ebtable_filter ebtables ip6t_REJECT
>>>>> nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter
>>>>> ip6_tables ipt_REJECT nf_reject_ipv4 xt_physdev xt_comment
>>>>> nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_mark xt_set xt_addrtype
>>>>> xt_conntrack nf_conntrack ip_set_hash_net ip_set vhost_net tun vhost
>>>>> macvtap macvlan kvm_intel nfnetlink_log kvm nfnetlink irqbypass
>>>>> netconsole dlm xt_multiport iptable_filter ip_tables x_tables iscsi_tcp
>>>>> libiscsi_tcp libiscsi scsi_transport_iscsi nfsd auth_rpcgss oid_registry
>>>>> bonding coretemp 8021q garp fuse i2c_i801 i7core_edac edac_core
>>>>> i5500_temp button btrfs xor raid6_pq dm_mod raid1 md_mod usb_storage
>>>>> ohci_hcd bcache sg usbhid sd_mod ata_generic uhci_hcd ehci_pci ehci_hcd
>>>>> usbcore ata_piix usb_common igb i2c_algo_bit mpt3sas raid_class ixgbe
>>>>> scsi_transport_sas i2c_core mdio ptp pps_core
>>>>> [69068.895604] CPU: 14 PID: 6673 Comm: ceph-osd Not tainted 4.4.6+7-ph #1
>>>>> [69068.897052] Hardware name: Supermicro X8DT3/X8DT3, BIOS 2.1 03/17/2012
>>>>> [69068.898578] task: ffff880fc7f28000 ti: ffff880fda2c4000 task.ti: ffff880fda2c4000
>>>>> [69068.900377] RIP: 0010:[<ffffffff860b372c>]  [<ffffffff860b372c>] task_h_load+0xcc/0x100
>>>
>>> decodecode says:
>>>
>>>   27:   48 83 c1 01             add    $0x1,%rcx
>>>   2b:*  48 f7 f1                div    %rcx             <-- trapping instruction
>>>
>>> This suggests the CONFIG_FAIR_GROUP_SCHED version of task_h_load:
>>>
>>>         update_cfs_rq_h_load(cfs_rq);
>>>         return div64_ul(p->se.avg.load_avg * cfs_rq->h_load,
>>>                         cfs_rq_load_avg(cfs_rq) + 1);
>>>
>>> So the load avg is -1, thus after adding 1 we get division by 0, huh?
>>
>> Yes, CONFIG_FAIR_GROUP_SCHED is set. I have now cherry-picked all the
>> commits up to 4.5 for fair.c:
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/log/kernel/sched/fair.c?h=v4.5
>>
>> It didn't happen again with v4.4.6 + 4.5 patches for fair.c
> 
> Ok, that's a lot of patches, how about figuring out which single patch,
> or smallest number of patches, makes things work again?

Will do, but it seems most of those 9 patches build on each other, so it
won't be easy.

Stefan

> 
> thanks,
> 
> greg k-h
> 
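
A note on the arithmetic in Vlastimil's reading of the trap quoted above: if
the value returned by cfs_rq_load_avg() is an unsigned 64-bit quantity with
all bits set (the "-1" case), adding 1 wraps to 0 and the div instruction
traps. The user-space sketch below illustrates only that wrap-around. It
assumes a 64-bit unsigned long as on x86_64; h_load_divisor() and
task_h_load_checked() are hypothetical names, not kernel functions, and the
guard shown is not the change that went into fair.c.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * User-space sketch, not kernel code. The parameters are hypothetical
 * stand-ins for p->se.avg.load_avg, cfs_rq->h_load and the result of
 * cfs_rq_load_avg(cfs_rq) in the quoted task_h_load() snippet.
 */

/* Divisor as computed in the quoted snippet: load average plus one. */
static uint64_t h_load_divisor(uint64_t cfs_rq_load_avg)
{
    /* Unsigned arithmetic wraps around: UINT64_MAX + 1 == 0. */
    return cfs_rq_load_avg + 1;
}

/*
 * Guarded variant of the division. Returns false instead of dividing
 * when the divisor has wrapped to zero; on x86 an unguarded div by zero
 * raises the divide error seen in the oops.
 */
static bool task_h_load_checked(uint64_t se_load_avg, uint64_t h_load,
                                uint64_t cfs_rq_load_avg, uint64_t *out)
{
    uint64_t divisor = h_load_divisor(cfs_rq_load_avg);

    if (divisor == 0)
        return false;

    *out = (se_load_avg * h_load) / divisor;
    return true;
}
```

Calling task_h_load_checked() with (uint64_t)-1 as the third argument takes
the early-return path, which is the case the trapping div in the oops would
correspond to if the "-1" hypothesis holds.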

