From: David Hildenbrand <david@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Souptick Joarder <jrdr.linux@gmail.com>,
linux-hyperv@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Dan Williams <dan.j.williams@intel.com>,
Haiyang Zhang <haiyangz@microsoft.com>,
"K. Y. Srinivasan" <kys@microsoft.com>,
Oscar Salvador <osalvador@suse.com>,
Pavel Tatashin <pasha.tatashin@soleen.com>, Qian Cai <cai@lca.pw>,
Sasha Levin <sashal@kernel.org>,
Stephen Hemminger <sthemmin@microsoft.com>,
Wei Yang <richard.weiyang@gmail.com>
Subject: Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page()
Date: Mon, 23 Sep 2019 13:43:43 +0200 [thread overview]
Message-ID: <388fdfa4-6cd9-bfcd-3f7b-546e54c11a97@redhat.com> (raw)
In-Reply-To: <88ac3511-4ad8-d5c8-8e6a-0cca0a0f0989@redhat.com>
On 23.09.19 13:34, David Hildenbrand wrote:
> On 23.09.19 13:15, Michal Hocko wrote:
>> On Mon 23-09-19 11:31:30, David Hildenbrand wrote:
>>> On 23.09.19 10:58, Michal Hocko wrote:
>>>> On Fri 20-09-19 10:17:54, David Hildenbrand wrote:
>>>>> On 09.09.19 13:48, David Hildenbrand wrote:
>>>>>> Based on linux/next + "[PATCH 0/3] Remove __online_page_set_limits()"
>>>>>>
>>>>>> Let's replace the __online_page...() functions by generic_online_page().
>>>>>> Hyper-V only wants to delay the actual onlining of un-backed pages, so we
>>>>>> can simpy re-use the generic function.
>>>>>>
>>>>>> Only compile-tested.
>>>>>>
>>>>>> Cc: Souptick Joarder <jrdr.linux@gmail.com>
>>>>>>
>>>>>> David Hildenbrand (3):
>>>>>> mm/memory_hotplug: Export generic_online_page()
>>>>>> hv_balloon: Use generic_online_page()
>>>>>> mm/memory_hotplug: Remove __online_page_free() and
>>>>>> __online_page_increment_counters()
>>>>>>
>>>>>> drivers/hv/hv_balloon.c | 3 +--
>>>>>> include/linux/memory_hotplug.h | 4 +---
>>>>>> mm/memory_hotplug.c | 17 ++---------------
>>>>>> 3 files changed, 4 insertions(+), 20 deletions(-)
>>>>>>
>>>>>
>>>>> Ping, any comments on this one?
>>>>
>>>> Unification makes a lot of sense to me. You can add
>>>> Acked-by: Michal Hocko <mhocko@suse.com>
>>>>
>>>> I will most likely won't surprise if I asked for more here though ;)
>>>
>>> I'm not surprised, but definitely not in a negative sense ;) I was
>>> asking myself if we could somehow rework this, too.
>>>
>>>> I have to confess I really detest the whole concept of a hidden callback
>>>> with a very weird API. Is this something we can do about? I do realize
>>>> that adding a callback would require either cluttering the existing APIs
>>>> but maybe we can come up with something more clever. Or maybe existing
>>>> external users of online callback can do that as a separate step after
>>>> the online is completed - or is this impossible due to locking
>>>> guarantees?
>>>>
>>>
>>> The use case of this (somewhat special) callback really is to avoid
>>> selected (unbacked in the hypervisor) pages to get put to the buddy just
>>> now, but instead to defer that (sometimes, defer till infinity ;) ).
>>> Especially, to hinder these pages from getting touched at all. Pages
>>> that won't be put to the buddy will usually get PG_offline set (e.g.,
>>> Hyper-V and XEN) - the only two users I am aware of.
>>>
>>> For Hyper-V (and also eventually virtio-mem), it is important to set
>>> PG_offline before marking the section to be online (SECTION_IS_ONLINE).
>>> Only this way, PG_offline is properly set on all pfn_to_online_page()
>>> pages, meaning "don't touch this page" - e.g., used to skip over such
>>> pages when suspending or by makedumpfile to skip over such offline pages
>>> when creating a memory dump.
>>
>> Thanks for the clarification. I have never really studied what those
>> callbacks are doing really.
>>
>>> So if we would e.g., try to piggy-back onto the memory_notify()
>>> infrastructure, we could
>>> 1. Online all pages to the buddy (dropping the callback)
>>> 2. E.g., memory_notify(MEM_ONLINE_PAGES, &arg);
>>> -> in the notifier, pull pages from the buddy, mark sections online
>>> 3. Set all involved sections online (online_mem_sections())
>>
>> This doesn't really sound any better. For one pages are immediately
>> usable when they hit the buddy allocator so this is racy and thus not
>> reliable.
>>
>>> However, I am not sure what actually happens after 1. - we are only
>>> holding the device hotplug lock and the memory hotplug lock, so the
>>> pages can just get allocated. Also, it sounds like more work and code
>>> for the same end result (okay, if the rework is really necessary, though).
>>>
>>> So yeah, while the current callback might not be optimal, I don't see an
>>> easy and clean way to rework this. With the change in this series we are
>>> at least able to simply defer doing what would have been done without
>>> the callback - not perfect but better.
>>>
>>> Do you have anything in mind that could work out and make this nicer?
>>
>> I am wondering why those pages get onlined when they are, in fact,
>> supposed to be offline.
>>
>
> It's the current way of emulating sub-memory-block hotplug on top of the
> memory bock device API we have. Hyper-V and XEN have been using that for
> a long time.
>
So one idea would be to let clients set pages to PG_offline during
MEM_GOING_ONLINE. We could then skip any PG_offline pages when onlining
pages, not onlining them to the buddy.
But then, there still has to be a way to online pages when required -
e.g., generic_online_page(). At least the callback could go.
--
Thanks,
David / dhildenb
next prev parent reply other threads:[~2019-09-23 11:43 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-09-09 11:48 David Hildenbrand
2019-09-09 11:48 ` [PATCH v1 1/3] " David Hildenbrand
2019-09-09 11:48 ` [PATCH v1 2/3] hv_balloon: Use generic_online_page() David Hildenbrand
2019-09-09 11:48 ` [PATCH v1 3/3] mm/memory_hotplug: Remove __online_page_free() and __online_page_increment_counters() David Hildenbrand
2019-09-20 8:17 ` [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand
2019-09-23 8:58 ` Michal Hocko
2019-09-23 9:31 ` David Hildenbrand
2019-09-23 11:15 ` Michal Hocko
2019-09-23 11:34 ` David Hildenbrand
2019-09-23 11:43 ` David Hildenbrand [this message]
2019-09-23 12:07 ` Michal Hocko
2019-09-23 12:20 ` David Hildenbrand
2019-09-23 12:30 ` Michal Hocko
2019-09-23 12:34 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=388fdfa4-6cd9-bfcd-3f7b-546e54c11a97@redhat.com \
--to=david@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=cai@lca.pw \
--cc=dan.j.williams@intel.com \
--cc=haiyangz@microsoft.com \
--cc=jrdr.linux@gmail.com \
--cc=kys@microsoft.com \
--cc=linux-hyperv@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=osalvador@suse.com \
--cc=pasha.tatashin@soleen.com \
--cc=richard.weiyang@gmail.com \
--cc=sashal@kernel.org \
--cc=sthemmin@microsoft.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox