From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>,
linux-mm@kvack.org, mpe@ellerman.id.au,
linuxppc-dev@lists.ozlabs.org, mdroth@linux.vnet.ibm.com,
kys@microsoft.com
Subject: Re: [RFC PATCH] memory-hotplug: Use dev_online for memhp_auto_offline
Date: Fri, 24 Feb 2017 15:10:29 +0100 [thread overview]
Message-ID: <87efyny90q.fsf@vitty.brq.redhat.com> (raw)
In-Reply-To: <20170224133714.GH19161@dhcp22.suse.cz> (Michal Hocko's message of "Fri, 24 Feb 2017 14:37:14 +0100")
Michal Hocko <mhocko@kernel.org> writes:
> On Thu 23-02-17 19:14:27, Vitaly Kuznetsov wrote:
>> Michal Hocko <mhocko@kernel.org> writes:
>>
>> > On Thu 23-02-17 17:36:38, Vitaly Kuznetsov wrote:
>> >> Michal Hocko <mhocko@kernel.org> writes:
>> > [...]
>> >> > Is a grow from 256M -> 128GB really something that happens in real life?
>> >> > Don't get me wrong but to me this sounds quite exaggerated. Hotmem add
>> >> > which is an operation which has to allocate memory has to scale with the
>> >> > currently available memory IMHO.
>> >>
>> >> With virtual machines this is very real and not exaggerated at
>> >> all. E.g. Hyper-V host can be tuned to automatically add new memory when
>> >> guest is running out of it. Even 100 blocks can represent an issue.
>> >
>> > Do you have any reference to a bug report. I am really curious because
>> > something really smells wrong and it is not clear that the chosen
>> > solution is really the best one.
>>
>> Unfortunately I'm not aware of any publicly posted bug reports (CC:
>> K. Y. - he may have a reference) but I think I still remember everything
>> correctly. Not sure how deep you want me to go into details though...
>
> As much as possible to understand what was really going on...
>
>> Virtual guests under stress were getting into OOM easily and the OOM
>> killer was even killing the udev process trying to online the
>> memory.
>
> Do you happen to have any OOM report? I am really surprised that udev
> would be an oom victim because that process is really small. Who is
> consuming all the memory then?
It's been a while since I worked on this and unfortunatelly I don't have
a log. From what I remember, the kernel itself was consuming all memory
so *all* processes were victims.
>
> Have you measured how much memory do we need to allocate to add one
> memblock?
No, it's actually a good idea if we decide to do some sort of pre-allocation.
Just did a quick (and probably dirty) test, increasing guest memory from
4G to 8G (32 x 128mb blocks) require 68Mb of memory, so it's roughly 2Mb
per block. It's really easy to trigger OOM for small guests.
>
>> There was a workaround for the issue added to the hyper-v driver
>> doing memory add:
>>
>> hv_mem_hot_add(...) {
>> ...
>> add_memory(....);
>> wait_for_completion_timeout(..., 5*HZ);
>> ...
>> }
>
> I can still see
> /*
> * Wait for the memory block to be onlined when memory onlining
> * is done outside of kernel (memhp_auto_online). Since the hot
> * add has succeeded, it is ok to proceed even if the pages in
> * the hot added region have not been "onlined" within the
> * allowed time.
> */
> if (dm_device.ha_waiting)
> wait_for_completion_timeout(&dm_device.ol_waitevent,
> 5*HZ);
>
See
dm_device.ha_waiting = !memhp_auto_online;
30 lines above. The workaround is still there for udev case and it is
still equaly bad.
>> the completion was done by observing for the MEM_ONLINE event. This, of
>> course, was slowing things down significantly and waiting for a
>> userspace action in kernel is not a nice thing to have (not speaking
>> about all other memory adding methods which had the same issue). Just
>> removing this wait was leading us to the same OOM as the hypervisor was
>> adding more and more memory and eventually even add_memory() was
>> failing, udev and other processes were killed,...
>
> Yes, I agree that waiting on a user action from the kernel is very far
> from ideal.
>
>> With the feature in place we have new memory available right after we do
>> add_memory(), everything is serialized.
>
> What prevented you from onlining the memory explicitly from
> hv_mem_hot_add path? Why do you need a user visible policy for that at
> all? You could also add a parameter to add_memory that would do the same
> thing. Or am I missing something?
We have different mechanisms for adding memory, I'm aware of at least 3:
ACPI, Xen, Hyper-V. The issue I'm addressing is general enough, I'm
pretty sure I can reproduce the issue on Xen, for example - just boot a
small guest and try adding tons of memory. Why should we have different
defaults for different technologies?
And, BTW, the link to the previous discussion:
https://groups.google.com/forum/#!msg/linux.kernel/AxvyuQjr4GY/TLC-K0sL_NEJ
--
Vitaly
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2017-02-24 14:10 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-02-21 17:22 Nathan Fontenot
2017-02-22 9:32 ` Vitaly Kuznetsov
2017-02-23 12:56 ` Michal Hocko
2017-02-23 13:31 ` Vitaly Kuznetsov
2017-02-23 15:09 ` Michal Hocko
2017-02-23 15:49 ` Vitaly Kuznetsov
2017-02-23 16:12 ` Michal Hocko
2017-02-23 16:36 ` Vitaly Kuznetsov
2017-02-23 17:41 ` Michal Hocko
2017-02-23 18:14 ` Vitaly Kuznetsov
2017-02-24 13:37 ` Michal Hocko
2017-02-24 14:10 ` Vitaly Kuznetsov [this message]
2017-02-24 14:41 ` Michal Hocko
2017-02-24 15:05 ` Vitaly Kuznetsov
2017-02-24 15:32 ` Michal Hocko
2017-02-24 16:09 ` Vitaly Kuznetsov
2017-02-24 16:23 ` Michal Hocko
2017-02-24 16:40 ` Vitaly Kuznetsov
2017-02-24 16:52 ` Michal Hocko
2017-02-24 17:06 ` Vitaly Kuznetsov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87efyny90q.fsf@vitty.brq.redhat.com \
--to=vkuznets@redhat.com \
--cc=kys@microsoft.com \
--cc=linux-mm@kvack.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mdroth@linux.vnet.ibm.com \
--cc=mhocko@kernel.org \
--cc=mpe@ellerman.id.au \
--cc=nfont@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox