Date: Mon, 23 Sep 2019 13:15:59 +0200
From: Michal Hocko
To: David Hildenbrand
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Souptick Joarder,
	linux-hyperv@vger.kernel.org, Andrew Morton, Dan Williams,
	Haiyang Zhang, "K. Y. Srinivasan", Oscar Salvador, Pavel Tatashin,
	Qian Cai, Sasha Levin, Stephen Hemminger, Wei Yang
Subject: Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page()
Message-ID: <20190923111559.GK6016@dhcp22.suse.cz>
References: <20190909114830.662-1-david@redhat.com>
	<20190923085807.GD6016@dhcp22.suse.cz>
In-Reply-To:

On Mon 23-09-19 11:31:30, David Hildenbrand wrote:
> On 23.09.19 10:58, Michal Hocko wrote:
> > On Fri 20-09-19 10:17:54, David Hildenbrand wrote:
> >> On 09.09.19 13:48, David Hildenbrand wrote:
> >>> Based on linux/next + "[PATCH 0/3] Remove __online_page_set_limits()"
> >>>
> >>> Let's replace the __online_page...() functions by generic_online_page().
> >>> Hyper-V only wants to delay the actual onlining of un-backed pages, so we
> >>> can simply re-use the generic function.
> >>>
> >>> Only compile-tested.
> >>>
> >>> Cc: Souptick Joarder
> >>>
> >>> David Hildenbrand (3):
> >>>   mm/memory_hotplug: Export generic_online_page()
> >>>   hv_balloon: Use generic_online_page()
> >>>   mm/memory_hotplug: Remove __online_page_free() and
> >>>     __online_page_increment_counters()
> >>>
> >>>  drivers/hv/hv_balloon.c        |  3 +--
> >>>  include/linux/memory_hotplug.h |  4 +---
> >>>  mm/memory_hotplug.c            | 17 ++---------------
> >>>  3 files changed, 4 insertions(+), 20 deletions(-)
> >>>
> >>
> >> Ping, any comments on this one?
> >
> > Unification makes a lot of sense to me. You can add
> > Acked-by: Michal Hocko
> >
> > I most likely won't surprise you if I ask for more here though ;)
>
> I'm not surprised, but definitely not in a negative sense ;) I was
> asking myself if we could somehow rework this, too.
>
> > I have to confess I really detest the whole concept of a hidden callback
> > with a very weird API. Is there something we can do about it? I do
> > realize that passing a callback in would require cluttering the existing
> > APIs, but maybe we can come up with something more clever. Or maybe the
> > existing external users of the online callback can do their work as a
> > separate step after the onlining is completed - or is that impossible
> > due to locking guarantees?
>
> The use case of this (somewhat special) callback really is to keep
> selected pages (un-backed in the hypervisor) from being put to the buddy
> right away, and instead to defer that (sometimes, defer till infinity
> ;) ) - in particular, to prevent these pages from being touched at all.
> Pages that won't be put to the buddy will usually get PG_offline set;
> Hyper-V and XEN are the only two users I am aware of.
>
> For Hyper-V (and eventually also virtio-mem), it is important to set
> PG_offline before marking the section online (SECTION_IS_ONLINE). Only
> this way is PG_offline reliably set on all pages reachable via
> pfn_to_online_page(), meaning "don't touch this page" - used, e.g., to
> skip such pages when suspending, or by makedumpfile to skip offline
> pages when creating a memory dump.

Thanks for the clarification. I have never really studied what those
callbacks actually do.

> So if we would, e.g., try to piggy-back onto the memory_notify()
> infrastructure, we could
> 1. Online all pages to the buddy (dropping the callback)
> 2. E.g., memory_notify(MEM_ONLINE_PAGES, &arg);
>    -> in the notifier, pull pages from the buddy, mark sections online
> 3. Set all involved sections online (online_mem_sections())

This doesn't really sound any better. For one, pages are immediately
usable once they hit the buddy allocator, so this is racy and thus not
reliable.

> However, I am not sure what would actually happen after 1. - we are only
> holding the device hotplug lock and the memory hotplug lock, so the
> pages can just get allocated. Also, it sounds like more work and code
> for the same end result (okay, only if the rework is really necessary,
> though).
>
> So yeah, while the current callback might not be optimal, I don't see an
> easy and clean way to rework this. With the change in this series we are
> at least able to simply defer doing what would have been done without
> the callback - not perfect, but better.
>
> Do you have anything in mind that could work out and make this nicer?

I am wondering why those pages get onlined at all when they are, in
fact, supposed to be offline.

--
Michal Hocko
SUSE Labs
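
For reference, a minimal sketch of the callback mechanism under
discussion, assuming the memory-hotplug API as of this thread
(set_online_page_callback(), restore_online_page_callback(), and
generic_online_page() as exported by patch 1 of the series);
my_page_is_backed() and the other my_* names are hypothetical stand-ins
for a driver's own bookkeeping:

	/*
	 * Minimal sketch, not the actual Hyper-V code: a balloon-style
	 * driver that wants to defer onlining of pages that are un-backed
	 * in the hypervisor overrides the online-page callback and
	 * forwards all backed pages to generic_online_page().
	 */
	#include <linux/init.h>
	#include <linux/memory_hotplug.h>
	#include <linux/mm.h>
	#include <linux/module.h>
	#include <linux/page-flags.h>

	/* Hypothetical stub; a real driver would consult its bookkeeping. */
	static bool my_page_is_backed(struct page *page)
	{
		return true;
	}

	static void my_online_page(struct page *page, unsigned int order)
	{
		unsigned long i;

		if (my_page_is_backed(page)) {
			/* Backed in the hypervisor: hand it to the buddy as usual. */
			generic_online_page(page, order);
			return;
		}

		/*
		 * Un-backed: set PG_offline so that pfn_to_online_page() users
		 * (suspend, makedumpfile, ...) skip these pages, and defer the
		 * actual onlining - possibly forever.
		 */
		for (i = 0; i < (1UL << order); i++)
			__SetPageOffline(page + i);
	}

	static int __init my_driver_init(void)
	{
		/* Fails if another driver has already registered a callback. */
		return set_online_page_callback(&my_online_page);
	}

	static void __exit my_driver_exit(void)
	{
		restore_online_page_callback(&my_online_page);
	}

	module_init(my_driver_init);
	module_exit(my_driver_exit);
	MODULE_LICENSE("GPL");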