From: David Hildenbrand <david@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-ia64@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
linux-acpi@vger.kernel.org, devel@linuxdriverproject.org,
xen-devel@lists.xenproject.org, x86@kernel.org,
"Andrew Banman" <andrew.banman@hpe.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Andy Lutomirski" <luto@kernel.org>,
"Arun KS" <arunks@codeaurora.org>,
"Balbir Singh" <bsingharora@gmail.com>,
"Benjamin Herrenschmidt" <benh@kernel.crashing.org>,
"Borislav Petkov" <bp@alien8.de>,
"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
"Christophe Leroy" <christophe.leroy@c-s.fr>,
"Dan Williams" <dan.j.williams@intel.com>,
"Dave Hansen" <dave.hansen@linux.intel.com>,
"Dave Jiang" <dave.jiang@intel.com>,
"Fenghua Yu" <fenghua.yu@intel.com>,
"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
"Haiyang Zhang" <haiyangz@microsoft.com>,
"Heiko Carstens" <heiko.carstens@de.ibm.com>,
"H. Peter Anvin" <hpa@zytor.com>,
"Ingo Molnar" <mingo@kernel.org>,
"Ingo Molnar" <mingo@redhat.com>,
"Jan H. Schönherr" <jschoenh@amazon.de>,
"Jérôme Glisse" <jglisse@redhat.com>,
"Jonathan Neuschäfer" <j.neuschaefer@gmx.net>,
"Joonsoo Kim" <iamjoonsoo.kim@lge.com>,
"Juergen Gross" <jgross@suse.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
"K. Y. Srinivasan" <kys@microsoft.com>,
"Len Brown" <lenb@kernel.org>,
"Logan Gunthorpe" <logang@deltatee.com>,
"Martin Schwidefsky" <schwidefsky@de.ibm.com>,
"Mathieu Malaterre" <malat@debian.org>,
"Matthew Wilcox" <willy@infradead.org>,
"Mauricio Faria de Oliveira" <mauricfo@linux.vnet.ibm.com>,
"Michael Ellerman" <mpe@ellerman.id.au>,
"Michael Neuling" <mikey@neuling.org>,
"Michal Suchánek" <msuchanek@suse.de>,
"Mike Rapoport" <rppt@linux.vnet.ibm.com>,
"mike.travis@hpe.com" <mike.travis@hpe.com>,
"Nathan Fontenot" <nfont@linux.vnet.ibm.com>,
"Nicholas Piggin" <npiggin@gmail.com>,
"Oscar Salvador" <osalvador@suse.com>,
"Oscar Salvador" <osalvador@suse.de>,
"Paul Mackerras" <paulus@samba.org>,
"Pavel Tatashin" <pasha.tatashin@oracle.com>,
"Pavel Tatashin" <pasha.tatashin@soleen.com>,
"Pavel Tatashin" <pavel.tatashin@microsoft.com>,
"Peter Zijlstra" <peterz@infradead.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
"Rafael J. Wysocki" <rjw@rjwysocki.net>,
"Rashmica Gupta" <rashmica.g@gmail.com>,
"Rich Felker" <dalias@libc.org>, "Rob Herring" <robh@kernel.org>,
"Stefano Stabellini" <sstabellini@kernel.org>,
"Stephen Hemminger" <sthemmin@microsoft.com>,
"Stephen Rothwell" <sfr@canb.auug.org.au>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Tony Luck" <tony.luck@intel.com>,
"Vasily Gorbik" <gor@linux.ibm.com>,
"Vitaly Kuznetsov" <vkuznets@redhat.com>,
"Wei Yang" <richard.weiyang@gmail.com>,
"Yoshinori Sato" <ysato@users.sourceforge.jp>,
YueHaibing <yuehaibing@huawei.com>
Subject: Re: [PATCH RFCv2 0/4] mm/memory_hotplug: Introduce memory block types
Date: Thu, 20 Dec 2018 14:16:58 +0100
Message-ID: <872b5496-7227-9171-fb3c-ec03cf190302@redhat.com>
In-Reply-To: <20181220130832.GH9104@dhcp22.suse.cz>
On 20.12.18 14:08, Michal Hocko wrote:
> On Thu 20-12-18 13:58:16, David Hildenbrand wrote:
>> On 30.11.18 18:59, David Hildenbrand wrote:
>>> This is the second approach, introducing more meaningful memory block
>>> types and not changing online behavior in the kernel. It is based on
>>> latest linux-next.
>>>
>>> As we found out during discussion, user space should always handle onlining
>>> of memory, in any case. However in order to make smart decisions in user
>>> space about if and how to online memory, we have to export more information
>>> about memory blocks. This way, we can formulate rules in user space.
>>>
>>> One such information is the type of memory block we are talking about.
>>> This helps to answer some questions like:
>>> - Does this memory block belong to a DIMM?
>>> - Can this DIMM theoretically ever be unplugged again?
>>> - Was this memory added by a balloon driver that will rely on balloon
>>>   inflation to remove chunks of that memory again? Which zone is advised?
>>> - Is this special standby memory on s390x that is usually not automatically
>>>   onlined?
>>>
>>> And in short, it helps to answer, to some extent (excluding zone imbalances):
>>> - Should I online this memory block?
>>> - To which zone should I online this memory block?
>>> ... of course, special use cases will result in different answers. But that's
>>> why user space has control of onlining memory.
>>>
>>> More details can be found in Patch 1 and Patch 3.
>>> Tested on x86 with hotplugged DIMMs. Cross-compiled for PPC and s390x.
>>>
>>>
>>> Example:
>>> $ udevadm info -q all -a /sys/devices/system/memory/memory0
>>> KERNEL=="memory0"
>>> SUBSYSTEM=="memory"
>>> DRIVER==""
>>> ATTR{online}=="1"
>>> ATTR{phys_device}=="0"
>>> ATTR{phys_index}=="00000000"
>>> ATTR{removable}=="0"
>>> ATTR{state}=="online"
>>> ATTR{type}=="boot"
>>> ATTR{valid_zones}=="none"
>>> $ udevadm info -q all -a /sys/devices/system/memory/memory90
>>> KERNEL=="memory90"
>>> SUBSYSTEM=="memory"
>>> DRIVER==""
>>> ATTR{online}=="1"
>>> ATTR{phys_device}=="0"
>>> ATTR{phys_index}=="0000005a"
>>> ATTR{removable}=="1"
>>> ATTR{state}=="online"
>>> ATTR{type}=="dimm"
>>> ATTR{valid_zones}=="Normal"
>>>
>>>
>>> RFC -> RFCv2:
>>> - Now also taking care of PPC (somehow missed it :/ )
>>> - Split the series up to some degree (some ideas on how to split up patch 3
>>>   would be very welcome)
>>> - Introduce more memory block types. Turns out abstracting too much was
>>>   rather confusing and not helpful. Properly document them.
>>>
>>> Notes:
>>> - I wanted to convert the enum of types into a named enum but this
>>>   provoked all kinds of different errors. For now, I am doing it just like
>>>   the other types (e.g. online_type) we are using in that context.
>>> - The "removable" property should never have been named like that. It
>>>   should have been "offlinable". Can we still rename that? E.g. boot memory
>>>   is sometimes marked as removable ...
>>>
>>
>>
>> Any feedback regarding the suggested block types would be very much
>> appreciated!
>
> I still do not like this much to be honest. I just didn't get to think
> through this properly. My fear is that this is conflating an actual API
> with the current implementation and as such will cause problems in
> future. But I haven't really looked into your patches closely so I might
> be wrong. Anyway I won't be able to look into it by the end of year.
>
I guess that as long as we have memory block devices and expect user space
to make the decision, we will have this API and the problems that come with it.

I am open to alternatives, and as I said, any feedback on how to sort
this out will be highly appreciated.

I'll be on vacation for the next two weeks, so this can wait. I just
wanted to note that I am still interested in feedback :)
--
Thanks,
David / dhildenb
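
For illustration only, the kind of user-space rule the cover letter has in
mind could look roughly like the sketch below. It assumes a kernel that
exposes the proposed "type" attribute next to the existing "state" attribute
of each memory block. Only the "boot" and "dimm" values appear in the example
output quoted above; the "balloon" and "standby" names, the policy table and
both function names are hypothetical and not taken from the patches. The
sketch only computes a plan and writes nothing to sysfs.

```python
from pathlib import Path
from typing import Dict, Optional

# Hypothetical policy: maps a block type to the value that would be written
# to the block's "state" file, or None to leave the block untouched.
# "balloon" and "standby" are made-up type names for this sketch.
POLICY: Dict[str, Optional[str]] = {
    "dimm": "online_movable",  # assumed preference: keep DIMMs unpluggable
    "balloon": "online",       # assumed: let the kernel choose the zone
    "standby": None,           # assumed: never online automatically
    "boot": None,              # boot memory is already online
}


def choose_online_action(block_type: str, state: str) -> Optional[str]:
    """Return the string to write to 'state', or None to do nothing."""
    if state != "offline":
        return None
    # Unknown types fall through to None, i.e. they are left alone.
    return POLICY.get(block_type)


def plan(sysfs_root: str = "/sys/devices/system/memory") -> Dict[str, str]:
    """Map each memory block name to the action the policy would take."""
    actions: Dict[str, str] = {}
    for block in sorted(Path(sysfs_root).glob("memory[0-9]*")):
        type_file = block / "type"
        if not type_file.exists():  # kernel without the proposed attribute
            continue
        block_type = type_file.read_text().strip()
        state = (block / "state").read_text().strip()
        action = choose_online_action(block_type, state)
        if action is not None:
            actions[block.name] = action
    return actions
```

Applying the plan would mean writing each returned value to the matching
memoryN/state file; whether "online_movable" is the right choice for a given
DIMM still depends on the setup, which is the point of leaving the decision
to user space.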