From: Friedrich Weber <f.weber@proxmox.com>
To: kvm@vger.kernel.org
Cc: Sean Christopherson <seanjc@google.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: Temporary KVM guest hangs connected to KSM and NUMA balancer
Date: Thu, 11 Jan 2024 13:43:15 +0100 [thread overview]
Message-ID: <2c73db3f-2465-49e1-9d57-ef8d978849b6@proxmox.com> (raw)
In-Reply-To: <832697b9-3652-422d-a019-8c0574a188ac@proxmox.com>
Hi,
On 04/01/2024 14:42, Friedrich Weber wrote:
> f47e5bbb ("KVM: x86/mmu: Zap only TDP MMU leafs in zap range and
> mmu_notifier unmap")
As this commit mentions tdp_mmu_zap_leafs, I ran the reproducer again on
610a9b8f (6.7-rc8) while tracing >500ms calls to that function (printing
a stacktrace) and task_numa_work via a bpftrace script [1].
Again, there are several invocations of task_numa_work that take >500ms
(the VM appears to be frozen during that time). For instance, one
invocation that takes ~20 seconds:
[1704971602] task_numa_work (tid=52035) took 19995 ms
For this particular thread and in the 20 seconds before that, there were
8 invocations of tdp_mmu_zap_leafs with >500ms:
[1704971584] tdp_mmu_zap_leafs (tid=52035) took 2291 ms
[1704971586] tdp_mmu_zap_leafs (tid=52035) took 2343 ms
[1704971589] tdp_mmu_zap_leafs (tid=52035) took 2316 ms
[1704971590] tdp_mmu_zap_leafs (tid=52035) took 1663 ms
[1704971591] tdp_mmu_zap_leafs (tid=52035) took 682 ms
[1704971594] tdp_mmu_zap_leafs (tid=52035) took 2706 ms
[1704971597] tdp_mmu_zap_leafs (tid=52035) took 3132 ms
[1704971602] tdp_mmu_zap_leafs (tid=52035) took 4846 ms
They roughly sum up to 20s. The stacktrace is the same for all:
bpf_prog_5ca52691cb9e9fbd_tdp_mmu_zap_lea+345
bpf_prog_5ca52691cb9e9fbd_tdp_mmu_zap_lea+345
bpf_trampoline_380104735946+208
tdp_mmu_zap_leafs+5
kvm_unmap_gfn_range+347
kvm_mmu_notifier_invalidate_range_start+394
__mmu_notifier_invalidate_range_start+156
change_protection+3908
change_prot_numa+105
task_numa_work+1029
bpf_trampoline_6442457341+117
task_numa_work+9
xfer_to_guest_mode_handle_work+261
kvm_arch_vcpu_ioctl_run+1553
kvm_vcpu_ioctl+667
__x64_sys_ioctl+164
do_syscall_64+96
entry_SYSCALL_64_after_hwframe+110
AFAICT this pattern repeats several times. I uploaded the last 150kb of
the bpftrace output to [2]. I can provide the full output if needed.
To me, not knowing much about the KVM/KSM/NUMA balancer interplay, this
looks like task_numa_work triggers several invocations of
tdp_mmu_zap_leafs, each of which takes an unusually long time. If anyone
has a hunch why this might happen, or an idea where to look next, it
would be much appreciated.
Best,
Friedrich
[1]
kfunc:tdp_mmu_zap_leafs { @start_zap[tid] = nsecs; }
kretfunc:tdp_mmu_zap_leafs /@start_zap[tid]/ {
$diff = nsecs - @start_zap[tid];
if ($diff > 500000000) { // 500ms
time("[%s] ");
printf("tdp_mmu_zap_leafs (tid=%d) took %d ms\n", tid, $diff / 1000000);
print(kstack());
}
delete(@start_zap[tid]);
}
kfunc:task_numa_work { @start_numa[tid] = nsecs; }
kretfunc:task_numa_work /@start_numa[tid]/ {
$diff = nsecs - @start_numa[tid];
if ($diff > 500000000) { // 500ms
time("[%s] ");
printf("task_numa_work (tid=%d) took %d ms\n", tid, $diff / 1000000);
}
delete(@start_numa[tid]);
}
[2] https://paste.debian.net/1303767/
next prev parent reply other threads:[~2024-01-11 12:43 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-04 13:42 Friedrich Weber
2024-01-11 12:43 ` Friedrich Weber [this message]
2024-01-11 16:00 ` Sean Christopherson
2024-01-12 16:08 ` Friedrich Weber
2024-01-16 15:37 ` Friedrich Weber
2024-01-16 17:20 ` Sean Christopherson
2024-01-17 13:09 ` Friedrich Weber
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2c73db3f-2465-49e1-9d57-ef8d978849b6@proxmox.com \
--to=f.weber@proxmox.com \
--cc=akpm@linux-foundation.org \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=pbonzini@redhat.com \
--cc=seanjc@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox