From: David Hildenbrand <david@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
"Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>,
Andrew Morton <akpm@linux-foundation.org>,
Dan Williams <dan.j.williams@intel.com>,
Alexander Duyck <alexander.h.duyck@linux.intel.com>,
Alexander Potapenko <glider@google.com>,
Andrey Konovalov <andreyknvl@google.com>,
Andy Lutomirski <luto@kernel.org>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Arun KS <arunks@codeaurora.org>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Borislav Petkov <bp@alien8.de>,
Catalin Marinas <catalin.marinas@arm.com>,
Christian Borntraeger <borntraeger@de.ibm.com>,
Christophe Leroy <christophe.leroy@c-s.fr>,
Dave Airlie <airlied@redhat.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Fenghua Yu <fenghua.yu@intel.com>,
Gerald Schaefer <gerald.schaefer@de.ibm.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Halil Pasic <pasic@linux.ibm.com>,
Heiko Carstens <heiko.carstens@de.ibm.com>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
Ira Weiny <ira.weiny@intel.com>, Jason Gunthorpe <jgg@ziepe.ca>,
John Hubbard <jhubbard@nvidia.com>,
Jun Yao <yaojun8558363@gmail.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Logan Gunthorpe <logang@deltatee.com>,
Mark Rutland <mark.rutland@arm.com>,
Masahiro Yamada <yamada.masahiro@socionext.com>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Mel Gorman <mgorman@techsingularity.net>,
Michael Ellerman <mpe@ellerman.id.au>,
Mike Rapoport <rppt@linux.ibm.com>,
Mike Rapoport <rppt@linux.vnet.ibm.com>,
Oscar Salvador <osalvador@suse.com>,
Oscar Salvador <osalvador@suse.de>,
Paul Mackerras <paulus@samba.org>,
Pavel Tatashin <pasha.tatashin@soleen.com>,
Pavel Tatashin <pavel.tatashin@microsoft.com>,
Peter Zijlstra <peterz@infradead.org>, Qian Cai <cai@lca.pw>,
Rich Felker <dalias@libc.org>,
Robin Murphy <robin.murphy@arm.com>,
Souptick Joarder <jrdr.linux@gmail.com>,
Stephen Rothwell <sfr@canb.auug.org.au>,
Steve Capper <steve.capper@arm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Tom Lendacky <thomas.lendacky@amd.com>,
Tony Luck <tony.luck@intel.com>,
Vasily Gorbik <gor@linux.ibm.com>,
Vlastimil Babka <vbabka@suse.cz>,
Wei Yang <richard.weiyang@gmail.com>,
Wei Yang <richardw.yang@linux.intel.com>,
Will Deacon <will@kernel.org>,
Yang Shi <yang.shi@linux.alibaba.com>,
Yoshinori Sato <ysato@users.sourceforge.jp>,
Yu Zhao <yuzhao@google.com>
Subject: Re: [PATCH v3 00/11] mm/memory_hotplug: Shrink zones before removing memory
Date: Thu, 29 Aug 2019 14:08:48 +0200
Message-ID: <90313ec8-a13e-5353-cc25-1c8993d5269c@redhat.com>
In-Reply-To: <ef4a4973-3df9-4368-cf50-463e2970348f@redhat.com>
On 29.08.19 13:43, David Hildenbrand wrote:
> On 29.08.19 13:33, David Hildenbrand wrote:
>> On 29.08.19 10:23, Michal Hocko wrote:
>>> On Thu 29-08-19 09:00:08, David Hildenbrand wrote:
>>>> This is the successor of "[PATCH v2 0/6] mm/memory_hotplug: Consider all
>>>> zones when removing memory". I decided to go one step further and finally
>>>> factor out the shrinking of zones from memory removal code. Zones are now
>>>> fixed up when offlining memory/onlining of memory fails/before removing
>>>> ZONE_DEVICE memory.
>>>
>>> I was about to say Yay! but then reading...
>>
>> Almost ;)
>>
>>>
>>>> Example:
>>>>
>>>> :/# cat /proc/zoneinfo
>>>> Node 1, zone Movable
>>>> spanned 0
>>>> present 0
>>>> managed 0
>>>> :/# echo "online_movable" > /sys/devices/system/memory/memory41/state
>>>> :/# echo "online_movable" > /sys/devices/system/memory/memory43/state
>>>> :/# cat /proc/zoneinfo
>>>> Node 1, zone Movable
>>>> spanned 98304
>>>> present 65536
>>>> managed 65536
>>>> :/# echo 0 > /sys/devices/system/memory/memory43/online
>>>> :/# cat /proc/zoneinfo
>>>> Node 1, zone Movable
>>>> spanned 32768
>>>> present 32768
>>>> managed 32768
>>>> :/# echo 0 > /sys/devices/system/memory/memory41/online
>>>> :/# cat /proc/zoneinfo
>>>> Node 1, zone Movable
>>>> spanned 0
>>>> present 0
>>>> managed 0
>>>
>>> ... this made me realize that you are trying to fix it instead. Could
>>> you explain why do we want to do that? Why don't we simply remove all
>>> that crap? Why do we even care about zone boundaries when offlining or
>>> removing memory? Zone shrinking was mostly necessary with the previous
>>> onlining semantic when the zone type could be only changed on the
>>> boundary or unassociated memory. We can interleave memory zones now
>>> arbitrarily.
>>
>> Last time I asked whether we can just drop all that nasty
>> zone->contiguous handling I was being told that it does have a
>> significant performance impact and is here to stay. The boundaries are a
>> key component to detect whether a zone is contiguous.
>>
>> So yes, while we allow interleaved memory zones, having contiguous zones
>> is beneficial for performance. That's why also memory onlining code will
>> try to online memory as default to the zone that will keep/make zones
>> contiguous.
>>
>> Anyhow, I think with this series most of the zone shrinking code becomes
>> "digestible". Except minor issues with ZONE_DEVICE - which is acceptable.
>>
>
> Also, there are plenty of other users of
> node_spanned_pages/zone_spanned_pages etc.. I don't think this can go -
> not that easy :)
>
... re-reading, your suggestion is to drop the zone _shrinking_ code
only, sorry :) That makes more sense.

This would mean that once a zone was !contiguous, it will always remain
like that. Also, even empty zones after unplug would not result in
zone_empty() == true.

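To make that consequence concrete, here is a minimal user-space sketch (not
kernel code) of a zone span that only grows when shrinking is skipped. All
names (toy_zone, toy_online(), toy_offline(), ...) are made up for
illustration. PAGES_PER_BLOCK is an assumption inferred from the quoted
/proc/zoneinfo example (two online blocks, 65536 present pages), and
"empty" is assumed to mean a spanned size of 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: pages per memory block, inferred from the quoted example. */
#define PAGES_PER_BLOCK 32768UL
#define MAX_BLOCKS 64U

/* Hypothetical toy model; this is not the kernel's struct zone. */
struct toy_zone {
    bool online[MAX_BLOCKS];
    unsigned long start_pfn;
    unsigned long spanned_pages;
    unsigned long present_pages;
};

/* Assumed emptiness check: purely based on the spanned size. */
bool toy_zone_is_empty(const struct toy_zone *z)
{
    return z->spanned_pages == 0;
}

/* The span covers more pages than are present, so there are holes. */
bool toy_zone_has_holes(const struct toy_zone *z)
{
    return z->spanned_pages != z->present_pages;
}

/* Onlining a block grows the span so that it covers the new block. */
void toy_online(struct toy_zone *z, unsigned int block)
{
    unsigned long pfn, end, zone_end;

    assert(block < MAX_BLOCKS);
    if (z->online[block])
        return;

    pfn = block * PAGES_PER_BLOCK;
    end = pfn + PAGES_PER_BLOCK;
    zone_end = z->start_pfn + z->spanned_pages;

    z->online[block] = true;
    z->present_pages += PAGES_PER_BLOCK;

    if (z->spanned_pages == 0) {
        z->start_pfn = pfn;
        zone_end = end;
    } else {
        if (pfn < z->start_pfn)
            z->start_pfn = pfn;
        if (end > zone_end)
            zone_end = end;
    }
    z->spanned_pages = zone_end - z->start_pfn;
}

/* Offlining always drops present pages; the span is only fixed up on request. */
void toy_offline(struct toy_zone *z, unsigned int block, bool shrink)
{
    unsigned int i, first = MAX_BLOCKS, last = 0;

    assert(block < MAX_BLOCKS);
    if (!z->online[block])
        return;

    z->online[block] = false;
    z->present_pages -= PAGES_PER_BLOCK;
    if (!shrink)
        return;

    /* Recompute the span from the blocks that are still online. */
    for (i = 0; i < MAX_BLOCKS; i++) {
        if (!z->online[i])
            continue;
        if (first == MAX_BLOCKS)
            first = i;
        last = i;
    }
    if (first == MAX_BLOCKS) {
        z->start_pfn = 0;
        z->spanned_pages = 0;
        return;
    }
    z->start_pfn = first * PAGES_PER_BLOCK;
    z->spanned_pages = (last - first + 1) * PAGES_PER_BLOCK;
}
```

With shrink == true, onlining blocks 41 and 43 and then offlining them again
walks through the same spanned/present values as the quoted example. With
shrink == false, present drops back to 0 but spanned stays at 98304, so the
toy zone never reports empty and keeps its hole.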
I can see that some users of *_spanned_pages make certain assumptions
based on the size (snapshot, oom killer, ...), but that would already be
wrong in case the zone is very sparse.

I'll prepare something, then we can discuss.

--
Thanks,
David / dhildenb
Thread overview: 22+ messages
2019-08-29 7:00 David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 01/11] mm/memremap: Get rid of memmap_init_zone_device() David Hildenbrand
2019-08-29 16:39 ` Alexander Duyck
2019-08-29 16:55 ` David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 02/11] mm/memory_hotplug: Simplify shrink_pgdat_span() David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 03/11] mm/memory_hotplug: We always have a zone in find_(smallest|biggest)_section_pfn David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 04/11] mm/memory_hotplug: Drop local variables in shrink_zone_span() David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 05/11] mm/memory_hotplug: Optimize zone shrinking code when checking for holes David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 06/11] mm/memory_hotplug: Fix crashes in shrink_zone_span() David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 07/11] mm/memory_hotplug: Exit early in __remove_pages() on BUGs David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 08/11] mm: Exit early in set_zone_contiguous() if already contiguous David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 09/11] mm/memory_hotplug: Remove pages from a zone before removing memory David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 10/11] mm/memory_hotplug: Remove zone parameter from __remove_pages() David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 11/11] mm/memory_hotplug: Cleanup __remove_pages() David Hildenbrand
2019-08-29 8:23 ` [PATCH v3 00/11] mm/memory_hotplug: Shrink zones before removing memory Michal Hocko
2019-08-29 11:33 ` David Hildenbrand
2019-08-29 11:43 ` David Hildenbrand
2019-08-29 12:08 ` David Hildenbrand [this message]
2019-08-29 12:15 ` Michal Hocko
2019-08-29 12:29 ` David Hildenbrand
2019-08-29 15:19 ` Michal Hocko
2019-08-29 15:28 ` David Hildenbrand