From: Michal Hocko <mhocko@kernel.org>
To: David Hildenbrand <david@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
"Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>,
Andrew Morton <akpm@linux-foundation.org>,
Dan Williams <dan.j.williams@intel.com>,
Alexander Duyck <alexander.h.duyck@linux.intel.com>,
Alexander Potapenko <glider@google.com>,
Andrey Konovalov <andreyknvl@google.com>,
Andy Lutomirski <luto@kernel.org>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Arun KS <arunks@codeaurora.org>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Borislav Petkov <bp@alien8.de>,
Catalin Marinas <catalin.marinas@arm.com>,
Christian Borntraeger <borntraeger@de.ibm.com>,
Christophe Leroy <christophe.leroy@c-s.fr>,
Dave Airlie <airlied@redhat.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Fenghua Yu <fenghua.yu@intel.com>,
Gerald Schaefer <gerald.schaefer@de.ibm.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Halil Pasic <pasic@linux.ibm.com>,
Heiko Carstens <heiko.carstens@de.ibm.com>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
Ira Weiny <ira.weiny@intel.com>, Jason Gunthorpe <jgg@ziepe.ca>,
John Hubbard <jhubbard@nvidia.com>,
Jun Yao <yaojun8558363@gmail.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Logan Gunthorpe <logang@deltatee.com>,
Mark Rutland <mark.rutland@arm.com>,
Masahiro Yamada <yamada.masahiro@socionext.com>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Mel Gorman <mgorman@techsingularity.net>,
Michael Ellerman <mpe@ellerman.id.au>,
Mike Rapoport <rppt@linux.ibm.com>,
Mike Rapoport <rppt@linux.vnet.ibm.com>,
Oscar Salvador <osalvador@suse.com>,
Oscar Salvador <osalvador@suse.de>,
Paul Mackerras <paulus@samba.org>,
Pavel Tatashin <pasha.tatashin@soleen.com>,
Pavel Tatashin <pavel.tatashin@microsoft.com>,
Peter Zijlstra <peterz@infradead.org>, Qian Cai <cai@lca.pw>,
Rich Felker <dalias@libc.org>,
Robin Murphy <robin.murphy@arm.com>,
Souptick Joarder <jrdr.linux@gmail.com>,
Stephen Rothwell <sfr@canb.auug.org.au>,
Steve Capper <steve.capper@arm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Tom Lendacky <thomas.lendacky@amd.com>,
Tony Luck <tony.luck@intel.com>,
Vasily Gorbik <gor@linux.ibm.com>,
Vlastimil Babka <vbabka@suse.cz>,
Wei Yang <richard.weiyang@gmail.com>,
Wei Yang <richardw.yang@linux.intel.com>,
Will Deacon <will@kernel.org>,
Yang Shi <yang.shi@linux.alibaba.com>,
Yoshinori Sato <ysato@users.sourceforge.jp>,
Yu Zhao <yuzhao@google.com>
Subject: Re: [PATCH v3 00/11] mm/memory_hotplug: Shrink zones before removing memory
Date: Thu, 29 Aug 2019 17:19:50 +0200
Message-ID: <20190829151950.GI28313@dhcp22.suse.cz>
In-Reply-To: <ac7f1b53-f30d-35d0-375f-18fa6262b059@redhat.com>
On Thu 29-08-19 14:29:22, David Hildenbrand wrote:
> On 29.08.19 14:15, Michal Hocko wrote:
> > On Thu 29-08-19 14:08:48, David Hildenbrand wrote:
> >> On 29.08.19 13:43, David Hildenbrand wrote:
> >>> On 29.08.19 13:33, David Hildenbrand wrote:
> >>>> On 29.08.19 10:23, Michal Hocko wrote:
> >>>>> On Thu 29-08-19 09:00:08, David Hildenbrand wrote:
> >>>>>> This is the successor of "[PATCH v2 0/6] mm/memory_hotplug: Consider all
> >>>>>> zones when removing memory". I decided to go one step further and finally
> >>>>>> factor out the shrinking of zones from memory removal code. Zones are now
> >>>>>> fixed up when offlining memory/onlining of memory fails/before removing
> >>>>>> ZONE_DEVICE memory.
> >>>>>
> >>>>> I was about to say Yay! but then reading...
> >>>>
> >>>> Almost ;)
> >>>>
> >>>>>
> >>>>>> Example:
> >>>>>>
> >>>>>> :/# cat /proc/zoneinfo
> >>>>>> Node 1, zone Movable
> >>>>>> spanned 0
> >>>>>> present 0
> >>>>>> managed 0
> >>>>>> :/# echo "online_movable" > /sys/devices/system/memory/memory41/state
> >>>>>> :/# echo "online_movable" > /sys/devices/system/memory/memory43/state
> >>>>>> :/# cat /proc/zoneinfo
> >>>>>> Node 1, zone Movable
> >>>>>> spanned 98304
> >>>>>> present 65536
> >>>>>> managed 65536
> >>>>>> :/# echo 0 > /sys/devices/system/memory/memory43/online
> >>>>>> :/# cat /proc/zoneinfo
> >>>>>> Node 1, zone Movable
> >>>>>> spanned 32768
> >>>>>> present 32768
> >>>>>> managed 32768
> >>>>>> :/# echo 0 > /sys/devices/system/memory/memory41/online
> >>>>>> :/# cat /proc/zoneinfo
> >>>>>> Node 1, zone Movable
> >>>>>> spanned 0
> >>>>>> present 0
> >>>>>> managed 0
> >>>>>
> >>>>> ... this made me realize that you are trying to fix it instead. Could
> >>>>> you explain why we want to do that? Why don't we simply remove all
> >>>>> that crap? Why do we even care about zone boundaries when offlining or
> >>>>> removing memory? Zone shrinking was mostly necessary with the previous
> >>>>> onlining semantic when the zone type could be only changed on the
> >>>>> boundary or unassociated memory. We can interleave memory zones now
> >>>>> arbitrarily.
> >>>>
> >>>> Last time I asked whether we can just drop all that nasty
> >>>> zone->contiguous handling I was being told that it does have a
> >>>> significant performance impact and is here to stay. The boundaries are a
> >>>> key component to detect whether a zone is contiguous.
> >>>>
> >>>> So yes, while we allow interleaved memory zones, having contiguous zones
> >>>> is beneficial for performance. That's why also memory onlining code will
> >>>> try to online memory as default to the zone that will keep/make zones
> >>>> contiguous.
> >>>>
> >>>> Anyhow, I think with this series most of the zone shrinking code becomes
> >>>> "digestible". Except minor issues with ZONE_DEVICE - which is acceptable.
> >>>>
> >>>
> >>> Also, there are plenty of other users of
> >>> node_spanned_pages/zone_spanned_pages etc. I don't think this can go -
> >>> not that easy :)
> >>>
> >>
> >> ... re-reading, your suggestion is to drop the zone _shrinking_ code
> >> only, sorry :) That makes more sense.
> >>
> >> This would mean that once a zone was !contiguous, it will always remain
> >> like that. Also, even empty zones after unplug would not result in
> >> zone_empty() == true.
> >
> > exactly. We only need to care about not declaring zone !contiguous when
> > offlining from ends but that should be trivial.
>
> That won't help a lot (offlining a DIMM will offline first to last
> memory block, so unlikely we can keep the zone !contiguous). However, we
> could limit zone shrinking to offlining code only (easy) and not perform
> it at all for ZONE_DEVICE memory. That would simplify things *a lot*.
>
> What's your take? Remove it completely or do it only for !ZONE_DEVICE
> memory when offlining/onlining fails?
>
> I think I would prefer to try to shrink for !ZONE_DEVICE memory, then we
> can at least try to keep contiguous set and reset in case it's possible.
I would remove that code altogether if that is possible and doesn't
introduce any side effects I am not aware of right now. All the existing
code has to deal with holes already, so I do not see any reason why it
cannot do the same with holes at both ends.
--
Michal Hocko
SUSE Labs
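
Editorial illustration (not part of the original message, and not kernel
code): the two options discussed above can be compared with a toy model.
The sketch below assumes a zone is just a span of memory blocks plus a set
of online blocks, with 32768 pages per block as in the /proc/zoneinfo
example. The names Zone, online_block, offline_block and PAGES_PER_BLOCK
are hypothetical and chosen only for this sketch. It shows that a walker
which already skips holes in the middle of the span gives the same
"present" count when the holes sit at the ends, and that without shrinking
the span never returns to 0.

```python
from dataclasses import dataclass, field

# Assumption: one memory block holds 32768 pages, matching the
# spanned/present deltas in the /proc/zoneinfo example in this thread.
PAGES_PER_BLOCK = 32768


@dataclass
class Zone:
    """Toy model of a zone: a block span [start, end) plus online blocks."""
    start: int = 0
    end: int = 0  # exclusive
    online: set = field(default_factory=set)

    def online_block(self, block: int) -> None:
        # Onlining grows the span so that it covers the new block.
        if self.start == self.end:
            self.start, self.end = block, block + 1
        else:
            self.start = min(self.start, block)
            self.end = max(self.end, block + 1)
        self.online.add(block)

    def offline_block(self, block: int, shrink: bool) -> None:
        self.online.discard(block)
        if not shrink:
            return  # "remove the shrinking code": the span is left as-is
        if self.online:
            self.start, self.end = min(self.online), max(self.online) + 1
        else:
            self.start = self.end = 0

    def spanned_pages(self) -> int:
        return (self.end - self.start) * PAGES_PER_BLOCK

    def present_pages(self) -> int:
        # Walk the whole span and skip holes. A hole in the middle and a
        # hole at either end are handled by the same check.
        total = 0
        for block in range(self.start, self.end):
            if block not in self.online:
                continue
            total += PAGES_PER_BLOCK
        return total


if __name__ == "__main__":
    for shrink in (True, False):
        zone = Zone()
        zone.online_block(41)
        zone.online_block(43)
        zone.offline_block(43, shrink=shrink)
        print(shrink, zone.spanned_pages(), zone.present_pages())
    # prints "True 32768 32768" then "False 98304 32768"
```

With shrink=True the numbers follow the example from the cover letter.
With shrink=False the present count is identical, but the span stays at
98304 pages even after the last block goes offline, which is the
"empty zones would not look empty" side effect David mentions.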
Thread overview: 22+ messages
2019-08-29 7:00 David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 01/11] mm/memremap: Get rid of memmap_init_zone_device() David Hildenbrand
2019-08-29 16:39 ` Alexander Duyck
2019-08-29 16:55 ` David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 02/11] mm/memory_hotplug: Simplify shrink_pgdat_span() David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 03/11] mm/memory_hotplug: We always have a zone in find_(smallest|biggest)_section_pfn David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 04/11] mm/memory_hotplug: Drop local variables in shrink_zone_span() David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 05/11] mm/memory_hotplug: Optimize zone shrinking code when checking for holes David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 06/11] mm/memory_hotplug: Fix crashes in shrink_zone_span() David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 07/11] mm/memory_hotplug: Exit early in __remove_pages() on BUGs David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 08/11] mm: Exit early in set_zone_contiguous() if already contiguous David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 09/11] mm/memory_hotplug: Remove pages from a zone before removing memory David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 10/11] mm/memory_hotplug: Remove zone parameter from __remove_pages() David Hildenbrand
2019-08-29 7:00 ` [PATCH v3 11/11] mm/memory_hotplug: Cleanup __remove_pages() David Hildenbrand
2019-08-29 8:23 ` [PATCH v3 00/11] mm/memory_hotplug: Shrink zones before removing memory Michal Hocko
2019-08-29 11:33 ` David Hildenbrand
2019-08-29 11:43 ` David Hildenbrand
2019-08-29 12:08 ` David Hildenbrand
2019-08-29 12:15 ` Michal Hocko
2019-08-29 12:29 ` David Hildenbrand
2019-08-29 15:19 ` Michal Hocko [this message]
2019-08-29 15:28 ` David Hildenbrand