From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 634ADC433F5 for ; Wed, 10 Nov 2021 13:12:08 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id DD0E861168 for ; Wed, 10 Nov 2021 13:12:07 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org DD0E861168 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 512866B0072; Wed, 10 Nov 2021 08:12:07 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 49A7F6B0073; Wed, 10 Nov 2021 08:12:07 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 314146B0074; Wed, 10 Nov 2021 08:12:07 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0147.hostedemail.com [216.40.44.147]) by kanga.kvack.org (Postfix) with ESMTP id 1BA836B0072 for ; Wed, 10 Nov 2021 08:12:07 -0500 (EST) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id C6D3182F76AC for ; Wed, 10 Nov 2021 13:12:06 +0000 (UTC) X-FDA: 78793058652.18.FCF4D07 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf19.hostedemail.com (Postfix) with ESMTP id 5A93CB0000BC for ; Wed, 10 Nov 2021 13:11:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1636549918; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: 
in-reply-to:in-reply-to:references:references; bh=gjoz7kCYCB4Z7jluCp8DD21sU5jB+TvJsaYgLpUuza4=; b=QReBDF+yQRT+SfwgQcX/Ca1aItBvviGA+j84K+HpfPTh3qk4b+Qcwsh4ITd3HMnUxpzgqN TJkomQSammGMYemH4HirSv/grGBQxi8hY/l0NTtjHmnHq5SeBkiFjpVMp4AFesI72jYHcI LPsAyAq4ntOjBl3GuCRSo9KIzWSugF4= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1636549924; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=gjoz7kCYCB4Z7jluCp8DD21sU5jB+TvJsaYgLpUuza4=; b=JytSe41ZdY6a9+NdDi7baz4hO6l0fKE1RBF+D6zJ99/wHGume8JIhU9rqL0f6tLvP+Td1p wNt3V2J4x4whVpBFsECrflR/zK0pSDXWKYwe/ucsnIMJaH6F2zreykTcJInrarzSIs4tef ADR/cZUkwIfg4QbH3oFK0N4mgHMkotg= Received: from mail-wr1-f70.google.com (mail-wr1-f70.google.com [209.85.221.70]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-550-VVFyyUMANZ26eUYtvPoHSg-1; Wed, 10 Nov 2021 08:11:56 -0500 X-MC-Unique: VVFyyUMANZ26eUYtvPoHSg-1 Received: by mail-wr1-f70.google.com with SMTP id p3-20020a056000018300b00186b195d4ddso411040wrx.15 for ; Wed, 10 Nov 2021 05:11:56 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=gjoz7kCYCB4Z7jluCp8DD21sU5jB+TvJsaYgLpUuza4=; b=0h1JMZHElBhyvTwMtNtYLFtfWSwt5P9dd4CUHlXv0oLYMl3RDU4eOqChQ6sxocRf/M wqDPtqq2OwO9tlhsceYaQQSvps9hLYRYwls4scI25AxHFgLBb447n5fMK0XlQ+XjBhvl cV1TYD85nQsqIpsE4fhj0I9KyOTW7nrYEfez1PhNsCQLuaQzB+qXbpSyuN0FxId142wG lVVJnQJUpxLO24/h44HNnQI1m8viKmCgr5OX0MMS8ipZewtOITKcyOt+8zJBnVYmqP4u qtnei6gB7zntHUELLnXGhR0F+79en3UP0tz93WcsMRrcN2z+f1JExZNtDQMRFDDmBjGF nPOQ== X-Gm-Message-State: AOAM531ohZ7W3lteqBGrkfQ2Jr3zG7eRqjDCyGYkd7hQ6028zeXtI7i3 hRFDoLmaqhdsw4P2wIHLlGgiZ0kHBLc9y94Ci7hTfDtPfCObaShsm+ykk+WKspw4vaHG8ruaI1l l5m0LkRNxyAUiCryw/aVvtu1AcQs= X-Received: by 2002:a05:600c:4153:: with SMTP id 
h19mr16380341wmm.142.1636549915061; Wed, 10 Nov 2021 05:11:55 -0800 (PST) X-Google-Smtp-Source: ABdhPJx6S/ZblmZaOiAkLIwVwhmu4OKo46BjwHporr8/+iq9IXq77QlbHg6kfXQwLcg+P805qjolXj5zncwx+jAoF+Y= X-Received: by 2002:a05:600c:4153:: with SMTP id h19mr16380280wmm.142.1636549914762; Wed, 10 Nov 2021 05:11:54 -0800 (PST) MIME-Version: 1.0 References: <20211108183057.809e428e841088b657a975ec@linux-foundation.org> <20211109023148.b1OlyuiXG%akpm@linux-foundation.org> <20211110072225.GA18768@MiWiFi-R3L-srv> <0c68b366-38f4-94fd-da11-57e40a44cb48@redhat.com> <1cbc6332-8a45-3af1-c648-99437819bb5a@redhat.com> <0c83cb5b-20e0-31cb-b3bf-82d3ca30e08b@redhat.com> In-Reply-To: <0c83cb5b-20e0-31cb-b3bf-82d3ca30e08b@redhat.com> From: Dave Young Date: Wed, 10 Nov 2021 21:11:43 +0800 Message-ID: Subject: Re: [patch 08/87] proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks To: David Hildenbrand Cc: Baoquan He , boris.ostrovsky@oracle.com, Borislav Petkov , Andrew Morton , "H. Peter Anvin" , Jason Wang , jgross@suse.com, linux-mm@kvack.org, mhocko@suse.com, Ingo Molnar , mm-commits@vger.kernel.org, MST , osalvador@suse.de, rafael.j.wysocki@intel.com, rppt@kernel.org, sstabellini@kernel.org, Thomas Gleixner , torvalds@linux-foundation.org, "Goyal, Vivek" X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: multipart/alternative; boundary="00000000000053b8dc05d06ef86b" X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 5A93CB0000BC X-Stat-Signature: og78a4uj97y8iiccwucgwmd5t7a4sksd Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=QReBDF+y; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=JytSe41Z; dmarc=pass (policy=none) header.from=redhat.com; spf=none (imf19.hostedemail.com: domain of ruyang@redhat.com has no SPF policy when checking 170.10.133.124) smtp.mailfrom=ruyang@redhat.com X-HE-Tag: 1636549916-104705 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, 
version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

--00000000000053b8dc05d06ef86b
Content-Type: text/plain; charset="UTF-8"

On Wed, 10 Nov 2021 at 20:06, David Hildenbrand wrote:

> >> "remaining vmcore is zeroed that it is bad and not acceptable for
> >> kdump."
> >>
> >> Which scenario are you concerned about? User space plays stupid games
> >> (unbinding a driver from a virtio-mem device in a *kdump kernel* after
> >> opening /proc/vmcore) and wins stupid prizes (a warning and a vmcore
> >> filled (partially) with zeroes). Why isn't a warning sufficient for
> >> something like that?
> >
> > Hi David,
> >
> > Suppose we have the use case below:
> >
>
> Hi Dave,
>
> thanks for elaborating, it helps a lot to understand your concerns.
>
> > A user plays with the game (probably in the hypervisor part, but the user
> > is not aware that the guest panicked and is in a kdump kernel), then we
> > get a zeroed vmcore. But the panic cannot be easily reproduced any more,
> > then the warning is not useful.
>
> I can only speak about virtio-mem (well, that's the only current known
> "dynamic vmcore_cb registration" user :) ).
>
> virtio-mem devices cannot get hotunplugged in the hypervisor (i.e.,
> QEMU) -- you can only hot(un)plug device memory, but not the device
> itself, it will stick around. Hotunplugging the device is completely
> blocked and not supported.
>
> The reason is simple: unplugging a virtio-mem device will also remove
> the device memory. It's similar to other memory devices, such as DIMMs
> -- I would not recommend forced, physical removal of a DIMM to anybody
> -- not while the OS is running and not while kdump is saving
> /proc/vmcore. Which is also the reason why hypervisors don't generally
> support forced removal of such devices. :)
>
> So for the currently known vmcore_cb users, hypervisor action cannot
> result in driver unbinding and consequently vmcore_cb changes.
>
> Note: virtio-mem-pci devices might eventually get hotplugged while kdump
> is active. I assume we don't disable PCI hotplug in kdump kernels. While
> this will trigger a warning ("Unexpected vmcore callback registration"),
> the vmcore will not be affected and will be complete.
>

Ok, thanks for the details, it sounds safe for the time being then.

> >
> > But if you think user is playing the game in kdump kernel, e.g. in guest
> > os while kdump is saving vmcore then it is nearly not possible to happen.
> > I agree with you it is a very trivial problem.
>
> Yes, that's the only thing I consider can happen. For example, doing a:
>
> # echo 1 > /sys/devices/pci0000\:00/0000\:00\:03.0/remove
>
> in a kdump kernel after opening the vmcore.
>
> >
> > Probably we have some misunderstanding, but it would be good to make it
> > clear :)
>
> Understanding your concern, it could be future proof (for future
> vmcore_cb users?) to fail the ioctls instead of returning 0. But even
> for new memory devices, unplug is usually something to be fenced off by
> the hypervisor, just like not allowing forced DIMM removal.
>

Yes, there could be some future issues, not only for virt users, who knows...

> The only thing I could imagine is having e.g., virtio-balloon device
> register a vmcore_cb dynamically and providing a new mechanism to query
> if a page is backed by a real page in the hypervisor (similar to Xen's
> hypercall). Such a device could be unplugged without harm, as it doesn't
> actually provide device memory.
>
> --
> Thanks,
>
> David / dhildenb
>
>

--00000000000053b8dc05d06ef86b
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


--00000000000053b8dc05d06ef86b--