linux-mm.kvack.org archive mirror
From: "zhangsha (A)" <zhangsha.zhang@huawei.com>
To: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	"dan.j.williams@intel.com" <dan.j.williams@intel.com>
Cc: "Wanghui (John)" <john.wanghui@huawei.com>,
	"Zhangyanfei (UVP)" <yanfei.zhang@huawei.com>,
	guijianfeng <guijianfeng@huawei.com>,
	"Wencongyang (UVP)" <wencongyang2@huawei.com>
Subject: [Problem] ndctl command hangs forever when reinitializing pmem device after vm destroyed
Date: Fri, 10 Aug 2018 03:49:23 +0000
Message-ID: <FC1AAE34B870124C835BDA1138D00F5C7C8BAD87@dggema521-mbs.china.huawei.com>

Hi all,
The ndctl process gets stuck in D state (uninterruptible sleep)
when I try to reinitialize the dax device after the VM has been destroyed.

Stack of the hung ndctl process:
[<ffffffffa02c0029>] dax_pmem_percpu_kill+0x29/0x50 [dax_pmem]
[<ffffffff81454715>] devm_action_release+0x15/0x20
[<ffffffff814552cf>] release_nodes+0x1cf/0x220
[<ffffffff8145542c>] devres_release_all+0x3c/0x60
[<ffffffff81450bea>] __device_release_driver+0x8a/0xf0
[<ffffffff81450c73>] device_release_driver+0x23/0x30
[<ffffffff8144f647>] driver_unbind+0xf7/0x120
[<ffffffff8144ea87>] drv_attr_store+0x27/0x40
[<ffffffff81295ecb>] sysfs_write_file+0xcb/0x140
[<ffffffff812159e0>] vfs_write+0xc0/0x1f0
[<ffffffff8121650f>] SyS_write+0x7f/0xe0
[<ffffffff816c22ef>] system_call_fastpath+0x1c/0x21
[<ffffffffffffffff>] 0xffffffffffffffff
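
For anyone confirming the "D" state reported above: it is the state character that follows the parenthesised command name in /proc/<pid>/stat. Below is a minimal sketch of extracting it from one such line. The helper name stat_line_state() and the sample lines in the usage note are hypothetical, not taken from the affected host.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the process state character from one /proc/<pid>/stat line,
 * or '\0' if the line does not look like "pid (comm) state ...".
 * The command name may itself contain spaces or ')' characters, so
 * search for the LAST ')' rather than splitting on whitespace.
 * Hypothetical helper for illustration only.
 */
static char stat_line_state(const char *line)
{
    assert(line != NULL);

    const char *close = strrchr(line, ')');
    if (close == NULL || close[1] != ' ' || close[2] == '\0')
        return '\0';
    return close[2];
}
```

With a made-up line such as "4242 (ndctl) D 1 4242 4242 0", the helper returns 'D', the uninterruptible sleep state.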

I can reproduce this problem reliably with the following steps:
1) initialize the device: "ndctl create-namespace --mode dax --map=mem -e namespace0.0 -f"
2) create the VM (command as follows) and wait for the guest OS to start up
   "/usr/bin/qemu-kvm -name guest=suse12sp2-wj,debug-threads=on -machine pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off,nvdimm=on -cpu host,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff -m size=16777216k,slots=4,maxmem=75497472k -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -numa node,nodeid=0,cpus=0-3,mem=16384 -object memory-backend-file,id=memnvdimm0,prealloc=yes,mem-path=/dev/dax0.0,share=yes,size=8587837440,align=2097152 -device nvdimm,node=0,label-size=131072,memdev=memnvdimm0,id=nvdimm0,slot=0 -uuid 39ce74f4-9cb6-49cf-8890-949864ee1a99 -no-user-config -nodefaults -rtc base=utc -no-hpet -no-shutdown -boot menu=on,strict=on -device pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x7 -device pci-bridge,chassis_nr=1,id=pci.2,bus=pci.0,addr=0x8 -device pci-bridge,chassis_nr=1,id=pci.3,bus=pci.0,addr=0x9 -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.3,addr=0x1 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x19 -drive file=/Images/zsha/images/EulerOS310.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.2,addr=0x1,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -k en-us -device cirrus-vga,id=video0,vgamem_mb=16,bus=pci.0,addr=0x2 -device ivshmem,id=ivshmem0,shm=i-00000006.kboxram,size=16m,role=master,bus=pci.0,addr=0x3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x1e -device pvpanic -msg timestamp=on -vnc :9"
3) destroy the VM: "kill -15 `pidof qemu-kvm`"
4) reinitialize the device, then the command hangs: "ndctl create-namespace --mode dax --map=mem -e namespace0.0 -f"

I've tested this with a CentOS 3.10.0-862 kernel, a Fedora 4.16.x kernel and an upstream 4.18.0-rc6 kernel; they all exhibit the same behavior.

By adding some logging, I found that gup_pte_range() (get_page -> get_zone_device_page)
raises the refcount of device dax0.0 to 161 while the VM is starting.
However, zap_pte_range() gets a NULL page from vm_normal_page(),
so the kernel never drops the refcount back to zero when the VM is destroyed.
Because of this, in dax_pmem_percpu_kill() (dax_pmem_percpu_exit),
percpu_ref_put() never takes the branch that releases the device,
and wait_for_completion() never returns.
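
The reasoning above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct toy_ref and its helpers are hypothetical names, not the kernel's percpu_ref API, and the value of 160 pins in the usage note is chosen so the toy counter reads 161 (whether the observed 161 includes an initial reference is an assumption).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the device reference; NOT the kernel's struct percpu_ref. */
struct toy_ref {
    long count;      /* outstanding references, including the initial one */
    bool killed;     /* teardown has started */
    bool completed;  /* models complete(): a waiter would be woken */
};

static void toy_ref_init(struct toy_ref *r)
{
    r->count = 1;
    r->killed = false;
    r->completed = false;
}

/* Models a page pin taken on the guest fault path. */
static void toy_ref_get(struct toy_ref *r)
{
    assert(!r->killed);
    r->count++;
}

/* Only the put that reaches zero signals completion. */
static void toy_ref_put(struct toy_ref *r)
{
    assert(r->count > 0);
    if (--r->count == 0)
        r->completed = true;
}

/* Models teardown: drop the initial reference, then the caller waits. */
static void toy_ref_kill(struct toy_ref *r)
{
    r->killed = true;
    toy_ref_put(r);
}

/*
 * pins: references taken while the guest was running.
 * unmap_drops_pins: whether the unmap path finds the pages and puts them.
 * Returns true if the waiter in the unbind path would be woken.
 */
static bool toy_unbind_would_finish(long pins, bool unmap_drops_pins)
{
    struct toy_ref ref;

    toy_ref_init(&ref);
    for (long i = 0; i < pins; i++)
        toy_ref_get(&ref);
    if (unmap_drops_pins)
        for (long i = 0; i < pins; i++)
            toy_ref_put(&ref);
    toy_ref_kill(&ref);
    return ref.completed;
}
```

In this model toy_unbind_would_finish(160, true) returns true, while toy_unbind_would_finish(160, false) returns false: the leftover pins keep the counter above zero, so the completion is never signalled and the waiter sleeps forever.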

Stack where the refcount of dax0.0 is increased:
[<ffffffff81072c90>] gup_pte_range+0x170/0x380
[<ffffffff8107312f>] gup_pud_range+0x12f/0x1e0
[<ffffffff8107339b>] __get_user_pages_fast+0xcb/0x140
[<ffffffffa057695b>] __gfn_to_pfn_memslot+0x46b/0x490 [kvm]
[<ffffffffa0593e2e>] try_async_pf+0x6e/0x2a0 [kvm]
[<ffffffffa0578dd8>] ? kvm_host_page_size+0x88/0x90 [kvm]
[<ffffffffa059b66a>] tdp_page_fault+0x13a/0x280 [kvm]
[<ffffffffa053c663>] ? vmx_vcpu_run+0x2f3/0xa40 [kvm_intel]
[<ffffffffa059570a>] kvm_mmu_page_fault+0x2a/0x140 [kvm]
[<ffffffffa0532346>] handle_ept_violation+0x96/0x170 [kvm_intel]
[<ffffffffa053ab7c>] vmx_handle_exit+0x2bc/0xc40 [kvm_intel]
[<ffffffffa053c66f>] ? vmx_vcpu_run+0x2ff/0xa40 [kvm_intel]
[<ffffffffa053c663>] ? vmx_vcpu_run+0x2f3/0xa40 [kvm_intel]
[<ffffffffa053c66f>] ? vmx_vcpu_run+0x2ff/0xa40 [kvm_intel]
[<ffffffffa053c663>] ? vmx_vcpu_run+0x2f3/0xa40 [kvm_intel]
[<ffffffffa0538ec8>] ? vmx_hwapic_irr_update+0xb8/0xc0 [kvm_intel]
[<ffffffffa0589b21>] vcpu_enter_guest+0x7d1/0x1300 [kvm]
[<ffffffffa05913b8>] kvm_arch_vcpu_ioctl_run+0x328/0x480 [kvm]
[<ffffffffa0577191>] kvm_vcpu_ioctl+0x2b1/0x660 [kvm]
[<ffffffff81229ec8>] do_vfs_ioctl+0x2e8/0x4d0
[<ffffffff8122a151>] SyS_ioctl+0xa1/0xc0
[<ffffffff816c22ef>] system_call_fastpath+0x1c/0x21

Any reply will be appreciated, and thanks for all your help.

B.R.
Sha Zhang
