Date: Thu, 23 Jul 2020 15:39:45 +0200
From: Roger Pau Monné
To: Jürgen Groß
CC: David Hildenbrand, Boris Ostrovsky, Stefano Stabellini, Andrew Morton
Subject: Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
Message-ID: <20200723133945.GG7191@Air-de-Roger>
References: <20200723084523.42109-1-roger.pau@citrix.com>
 <20200723084523.42109-4-roger.pau@citrix.com>
 <21490d49-b2cf-a398-0609-8010bdb0b004@redhat.com>
 <20200723122300.GD7191@Air-de-Roger>
 <404ea76f-c3d8-dbc5-432d-08d84a17f2d7@suse.com>
 <20200723130831.GE7191@Air-de-Roger>
 <76640b3e-f46c-80d5-7714-aa3b731276ab@suse.com>
In-Reply-To: <76640b3e-f46c-80d5-7714-aa3b731276ab@suse.com>
Sender: owner-linux-mm@kvack.org

On Thu, Jul 23, 2020 at 03:20:55PM +0200, Jürgen Groß wrote:
> On 23.07.20 15:08, Roger Pau Monné wrote:
> > On Thu, Jul 23, 2020 at 02:28:13PM +0200, Jürgen Groß wrote:
> > > On 23.07.20 14:23, Roger Pau Monné wrote:
> > > > On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
> > > > > On 23.07.20 10:45, Roger Pau Monne wrote:
> > > > > > Add an extra option to add_memory_resource that overrides the memory
> > > > > > hotplug online behavior in order to force onlining of memory from
> > > > > > add_memory_resource unconditionally.
> > > > > >
> > > > > > This is required for the Xen balloon driver, that must run the
> > > > > > online page callback in order to correctly process the newly added
> > > > > > memory region, note this is an unpopulated region that is used by Linux
> > > > > > to either hotplug RAM or to map foreign pages from other domains, and
> > > > > > hence memory hotplug when running on Xen can be used even without the
> > > > > > user explicitly requesting it, as part of the normal operations of the
> > > > > > OS when attempting to map memory from a different domain.
> > > > > >
> > > > > > Setting a different default value of memhp_default_online_type when
> > > > > > attaching the balloon driver is not a robust solution, as the user (or
> > > > > > distro init scripts) could still change it and thus break the Xen
> > > > > > balloon driver.
> > > > >
> > > > > I think we discussed this a couple of times before (even triggered by my
> > > > > request), and this is the responsibility of user space to configure. Usually
> > > > > distros have udev rules to online memory automatically. Especially, user
> > > > > space should be able to configure *how* to online memory.
> > > >
> > > > Note (as per the commit message) that in the specific case I'm
> > > > referring to the memory hotplugged by the Xen balloon driver will be
> > > > an unpopulated range to be used internally by certain Xen subsystems,
> > > > like the xen-blkback or the privcmd drivers. The addition of such
> > > > blocks of (unpopulated) memory can happen without the user explicitly
> > > > requesting it, and hence not even aware such hotplug process is taking
> > > > place. To be clear: no actual RAM will be added to the system.
> > > >
> > > > Failure to online such blocks using the Xen specific online handler
> > > > (which does not hand back the memory to the allocator in any way)
> > > > will result in the system getting stuck and malfunctioning.
> > > >
> > > > > It's the admin/distro responsibility to configure this properly. In case
> > > > > this doesn't happen (or as you say, users change it), bad luck.
> > > > >
> > > > > E.g., virtio-mem takes care to not add more memory in case it is not
> > > > > getting onlined. I remember hyper-v has similar code to at least wait a
> > > > > bit for memory to get onlined.
> > > >
> > > > I don't think VirtIO or Hyper-V use the hotplug system in the same way
> > > > as Xen, as said this is done to add unpopulated memory regions that
> > > > will be used to map foreign memory (from other domains) by Xen drivers
> > > > on the system.
> > > >
> > > > Maybe this should somehow use a different mechanism to hotplug such
> > > > empty memory blocks? I don't mind doing this differently, but I would
> > > > need some pointers. Allowing user-space to change a (seemingly
> > > > unrelated) parameter and as a result produce failures on Xen drivers
> > > > is not an acceptable solution IMO.
> > >
> > > Maybe we can use the same approach as Xen PV-domains: pre-allocate a
> > > region in the memory map to be used for mapping foreign pages. For the
> > > kernel it will look like pre-ballooned memory, so it will create struct
> > > page for the region (which is what we are after), but it won't give the
> > > memory to the allocator.
> >
> > IMO using something similar to memory hotplug would give us more
> > flexibility, and TBH the logic is already there in the balloon driver.
> > It seems quite wasteful to allocate such region(s) beforehand for all
> > domains, even when most of them won't end up using foreign mappings at
> > all.
>
> We can do it for dom0 only per default, and add a boot parameter e.g.
> for driver domains.
>
> And the logic is already there (just pv-only right now).
>
> >
> > Anyway, I'm going to take a look at how to do that, I guess it's going
> > to involve playing with the memory map and reserving some space.
>
> Look at arch/x86/xen/setup.c (xen_add_extra_mem() and its usage).

Yes, I've taken a look. It's my rough understanding that I would need
to add a hook for HVM/PVH that modifies the memory map in order to add
an extra region (or regions) that would be marked as reserved using
memblock_reserve by xen_add_extra_mem.

Adding such a hook for PVH guests booted using the PVH entry point and
fetching the memory map using the hypercall interface
(mem_map_via_hcall) seems feasible, however I'm not sure dealing with
other guest types is that easy.

> >
> > I suggest we should remove the Xen balloon hotplug logic, as it's not
> > working properly and we don't have a plan to fix it.
>
> I have used memory hotplug successfully not very long ago.

Right, but it requires a certain set of enabled options, which IMO is
not obvious. For example, enabling xen_hotplug_unpopulated without also
setting the default memory hotplug policy to online the added blocks
will result in processes getting stuck. This is IMO too fragile.

Roger.
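
To illustrate the fragile combination described above, here is a rough
user-space sketch in C. It is only an illustration, not part of the
patch series. The helper names (read_first_line, is_fragile_config,
check_system) are made up for this example, and the two paths are
assumptions: the sysctl is assumed to be exposed as
/proc/sys/xen/balloon/hotplug_unpopulated and the default online policy
as /sys/devices/system/memory/auto_online_blocks. It only looks at the
kernel default, so a udev rule that onlines blocks would not be noticed.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Read the first line of a procfs/sysfs file into buf and strip the
 * trailing newline. Returns 0 on success, -1 if the file cannot be read.
 */
int read_first_line(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/*
 * Returns 1 for the combination described in the mail: hotplug of
 * unpopulated memory is enabled while newly added blocks are left
 * offline by default. Returns 0 otherwise.
 */
int is_fragile_config(int hotplug_unpopulated, const char *auto_online)
{
    return hotplug_unpopulated != 0 && strcmp(auto_online, "offline") == 0;
}

/*
 * Check the running system. Returns 1 if the configuration looks
 * fragile, 0 if not, and -1 if either file is missing (for example
 * when not running on Xen). Both paths are assumptions, see above.
 */
int check_system(void)
{
    char unpop[16];
    char policy[32];

    if (read_first_line("/proc/sys/xen/balloon/hotplug_unpopulated",
                        unpop, sizeof(unpop)) ||
        read_first_line("/sys/devices/system/memory/auto_online_blocks",
                        policy, sizeof(policy)))
        return -1;

    return is_fragile_config(atoi(unpop) != 0, policy);
}
```

Calling check_system() from a small main() and printing a warning when
it returns 1 would be enough to flag the setup before foreign mappings
are attempted. The decision is kept in is_fragile_config() so it can be
exercised without a Xen guest.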