From: zhenwei pi <pizhenwei@bytedance.com>
To: David Hildenbrand <david@redhat.com>,
Peter Xu <peterx@redhat.com>, Jue Wang <juew@google.com>,
Paolo Bonzini <pbonzini@redhat.com>
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
jasowang@redhat.com, LKML <linux-kernel@vger.kernel.org>,
"Linux MM" <linux-mm@kvack.org>,
mst@redhat.com,
"HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>,
qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org
Subject: Re: Re: [PATCH 0/3] recover hardware corrupted page by virtio balloon
Date: Mon, 30 May 2022 19:33:35 +0800 [thread overview]
Message-ID: <4b0c3e37-b882-681a-36fc-16cee7e1fff0@bytedance.com> (raw)
In-Reply-To: <0d266c61-605d-ce0c-4274-b0c7e10f845a@redhat.com>
On 5/30/22 15:41, David Hildenbrand wrote:
> On 27.05.22 08:32, zhenwei pi wrote:
>> On 5/27/22 02:37, Peter Xu wrote:
>>> On Wed, May 25, 2022 at 01:16:34PM -0700, Jue Wang wrote:
>>>> The hypervisor _must_ emulate poisons identified in guest physical
>>>> address space (they could be transported from the source VM); this is
>>>> to prevent silent data corruption in the guest. With a paravirtual
>>>> approach like this patch series, the hypervisor can clear some of the
>>>> poisoned HVAs knowing for certain that the guest OS has isolated the
>>>> poisoned page. I wonder how much value it provides to the guest if the
>>>> guest and workload are _not_ in pressing need of the extra KB/MB
>>>> worth of memory.
>>>
>>> I'm curious about the same thing: how does unpoisoning help here? The
>>> reasoning behind it would be good material for the next cover letter.
>>>
>>> Shouldn't we instead consider migrating serious workloads off the host
>>> as soon as there are signs of more severe hardware issues?
>>>
>>> Thanks,
>>>
>>
>> I maintain 1,000,000+ virtual machines. From my experience: UEs
>> (uncorrectable errors) are quite unusual and occur randomly, and I have
>> not hit a UE storm in the past years. Memory also shows no obvious
>> performance drop after hitting a UE.
>>
>> I have hit several CE (correctable error) storms, where memory
>> performance drops a lot. But I can't find an obvious relationship
>> between UEs and CEs.
>>
>> So from my point of view, fixing the corrupted pages for a VM seems
>> good enough. And yes, unpoisoning several pages does not help
>> significantly, but it is still a chance to make virtualization better.
>>
>
> I'm curious why we should care about resurrecting a handful of poisoned
> pages in a VM. The cover letter doesn't touch on that.
>
> IOW, I'm missing the motivation why we should add additional
> code+complexity to unpoison pages at all.
>
> If we're talking about individual 4k pages, it's certainly sub-optimal,
> but does it matter in practice? I could understand if we're losing
> megabytes of memory. But then, I assume the workload might be seriously
> harmed either way already?
>
Yes, resurrecting a handful of poisoned pages does not help
significantly; in some ways it is just nice to have. :D

The case that matters is huge pages. Suppose a VM's RAM is backed by 2M
huge pages. Once an MCE occurs (@HVAy in [HVAx, HVAz)), the whole 2M
range [HVAx, HVAz) becomes inaccessible in the hypervisor, but the guest
poisons only the 4K page @GPAy in [GPAx, GPAz); it may still hit another
511 MCEs ([GPAx, GPAz) except GPAy). This is the worst case, so I want
to add a '__le32 corrupted_pages' field to struct virtio_balloon_config.
It would be used in a next step: the host reports all 512 4K
'corrupted_pages' to the guest, so the guest has a chance to isolate the
other 511 pages ahead of time. Since the guest actually loses the whole
2M, fixing 512 * 4K does seem to help significantly.
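
To make that concrete, a minimal sketch of where the proposed field
could sit. The existing fields follow include/uapi/linux/virtio_balloon.h
(the free_page_hint_cmd_id union with its legacy name is elided);
'corrupted_pages' and its placement are only this proposal, not merged
code:

struct virtio_balloon_config {
	/* Number of pages host wants Guest to give up. */
	__le32 num_pages;
	/* Number of pages we've actually got in balloon. */
	__le32 actual;
	/* Free page hint command id */
	__le32 free_page_hint_cmd_id;
	/* Stores PAGE_POISON if page poisoning is in use */
	__le32 poison_val;
	/* Proposed (this thread, not merged): number of corrupted 4K
	 * pages the host wants the guest to isolate ahead of time,
	 * e.g. 2M / 4K = 512 when one MCE takes out a whole 2M huge
	 * page on the host. */
	__le32 corrupted_pages;
};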
>
> I assume when talking about "the performance memory drops a lot", you
> imply that this patch set can mitigate that performance drop?
>
> But why do you see a performance drop? Because we might lose some
> possible THP candidates (in the host or the guest) and you want to plug
> those holes? I assume you'll see a performance drop simply because
> poisoning memory is expensive, including migrating pages around on CE.
>
> If you have some numbers to share, especially before/after this change,
> that would be great.
>
A CE storm leads to two problems I have seen:
1. Memory bandwidth slows down to 10%~20% of normal, and the CPU's
cycles per instruction increase a lot.
2. The THR interrupts (/proc/interrupts) fire frequently, so the CPU has
to spend a lot of time handling IRQs (a small sketch for watching that
counter follows below).

But no corrupted page occurs, so migrating the VM to another healthy
host seems the better choice. This patch series does not handle the CE
storm case.
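
For reference, a minimal sketch of watching the THR counter grow during
a CE storm. It assumes the usual /proc/interrupts layout, where the THR
row carries the per-CPU Threshold APIC interrupt counts:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
	FILE *f = fopen("/proc/interrupts", "r");
	char line[4096];
	unsigned long long total = 0;

	if (!f) {
		perror("/proc/interrupts");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		char *p = line;

		/* The row label is right-aligned, so skip leading spaces. */
		while (*p == ' ')
			p++;
		if (strncmp(p, "THR:", 4) != 0)
			continue;
		p += 4;
		/* Sum the per-CPU columns; strtoull stops converting once
		 * it reaches the trailing text description. */
		for (;;) {
			char *end;
			unsigned long long v = strtoull(p, &end, 10);

			if (end == p)
				break;
			total += v;
			p = end;
		}
	}
	fclose(f);
	printf("THR interrupts (all CPUs): %llu\n", total);
	return 0;
}

Sampling this total in a loop makes a CE storm visible as a rapidly
growing count.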
--
zhenwei pi