Date: Thu, 29 Aug 2019 14:15:15 +0200
From: Michal Hocko
To: David Hildenbrand
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 "Aneesh Kumar K . V", Andrew Morton, Dan Williams, Alexander Duyck,
 Alexander Potapenko, Andrey Konovalov, Andy Lutomirski,
 Anshuman Khandual, Arun KS, Benjamin Herrenschmidt, Borislav Petkov,
 Catalin Marinas, Christian Borntraeger, Christophe Leroy, Dave Airlie,
 Dave Hansen, Fenghua Yu, Gerald Schaefer, Greg Kroah-Hartman,
 Halil Pasic, Heiko Carstens, "H. Peter Anvin", Ingo Molnar, Ira Weiny,
 Jason Gunthorpe, John Hubbard, Jun Yao, "Kirill A. Shutemov",
 Logan Gunthorpe, Mark Rutland, Masahiro Yamada,
 "Matthew Wilcox (Oracle)", Mel Gorman, Michael Ellerman, Mike Rapoport,
 Oscar Salvador, Paul Mackerras, Pavel Tatashin, Peter Zijlstra,
 Qian Cai, Rich Felker, Robin Murphy, Souptick Joarder,
 Stephen Rothwell, Steve Capper, Thomas Gleixner, Tom Lendacky,
 Tony Luck, Vasily Gorbik, Vlastimil Babka, Wei Yang, Will Deacon,
 Yang Shi, Yoshinori Sato, Yu Zhao
Subject: Re: [PATCH v3 00/11] mm/memory_hotplug: Shrink zones before removing memory
Message-ID: <20190829121515.GE28313@dhcp22.suse.cz>
References: <20190829070019.12714-1-david@redhat.com>
 <20190829082323.GT28313@dhcp22.suse.cz>
 <90313ec8-a13e-5353-cc25-1c8993d5269c@redhat.com>
In-Reply-To: <90313ec8-a13e-5353-cc25-1c8993d5269c@redhat.com>

On Thu 29-08-19 14:08:48, David Hildenbrand wrote:
> On 29.08.19 13:43, David Hildenbrand wrote:
> > On 29.08.19 13:33, David Hildenbrand wrote:
> >> On 29.08.19 10:23, Michal Hocko wrote:
> >>> On Thu 29-08-19 09:00:08, David Hildenbrand wrote:
> >>>> This is the successor of "[PATCH v2 0/6] mm/memory_hotplug: Consider all
> >>>> zones when removing memory". I decided to go one step further and finally
> >>>> factor out the shrinking of zones from memory removal code. Zones are now
> >>>> fixed up when offlining memory/onlining of memory fails/before removing
> >>>> ZONE_DEVICE memory.
> >>>
> >>> I was about to say Yay! but then reading...
> >>
> >> Almost ;)
> >>
> >>>
> >>>> Example:
> >>>>
> >>>> :/# cat /proc/zoneinfo
> >>>> Node 1, zone Movable
> >>>>         spanned 0
> >>>>         present 0
> >>>>         managed 0
> >>>> :/# echo "online_movable" > /sys/devices/system/memory/memory41/state
> >>>> :/# echo "online_movable" > /sys/devices/system/memory/memory43/state
> >>>> :/# cat /proc/zoneinfo
> >>>> Node 1, zone Movable
> >>>>         spanned 98304
> >>>>         present 65536
> >>>>         managed 65536
> >>>> :/# echo 0 > /sys/devices/system/memory/memory43/online
> >>>> :/# cat /proc/zoneinfo
> >>>> Node 1, zone Movable
> >>>>         spanned 32768
> >>>>         present 32768
> >>>>         managed 32768
> >>>> :/# echo 0 > /sys/devices/system/memory/memory41/online
> >>>> :/# cat /proc/zoneinfo
> >>>> Node 1, zone Movable
> >>>>         spanned 0
> >>>>         present 0
> >>>>         managed 0
> >>>
> >>> ... this made me realize that you are trying to fix it instead. Could
> >>> you explain why do we want to do that? Why don't we simply remove all
> >>> that crap? Why do we even care about zone boundaries when offlining or
> >>> removing memory? Zone shrinking was mostly necessary with the previous
> >>> onlining semantic when the zone type could be only changed on the
> >>> boundary or unassociated memory. We can interleave memory zones now
> >>> arbitrarily.
> >>
> >> Last time I asked whether we can just drop all that nasty
> >> zone->contiguous handling I was being told that it does have a
> >> significant performance impact and is here to stay. The boundaries are a
> >> key component to detect whether a zone is contiguous.
> >>
> >> So yes, while we allow interleaved memory zones, having contiguous zones
> >> is beneficial for performance. That's why also memory onlining code will
> >> try to online memory as default to the zone that will keep/make zones
> >> contiguous.
> >>
> >> Anyhow, I think with this series most of the zone shrinking code becomes
> >> "digestible". Except minor issues with ZONE_DEVICE - which is acceptable.
> >>
> >
> > Also, there are plenty of other users of
> > node_spanned_pages/zone_spanned_pages etc.. I don't think this can go -
> > not that easy :)
> >
>
> ... re-reading, your suggestion is to drop the zone _shrinking_ code
> only, sorry :) That makes more sense.
>
> This would mean that once a zone was !contiguous, it will always remain
> like that. Also, even empty zones after unplug would not result in
> zone_empty() == true.

exactly. We only need to care about not declaring zone !contiguous when
offlining from ends but that should be trivial.

> I can see that some users of *_spanned_pages make certain assumptions
> based on the size (snapshot, oom killer, ...), but that would already be
> wrong in case the zone is very sparse.

at least oom killer usage is certainly wrong. I will have a look.

> I'll prepare something, then we can discuss.

Thanks!
-- 
Michal Hocko
SUSE Labs
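
For readers following the quoted /proc/zoneinfo example, the spanned and present numbers can be reproduced with a toy model. This is only a sketch, not kernel code: the helper names zone_stats() and is_contiguous() are hypothetical, and the 32768 pages per memory block is an assumption read off the quoted output (one online block shows present 32768). Holes are tracked at memory-block granularity only.

```python
# Toy model of the quoted example; not kernel code, all names hypothetical.
PAGES_PER_BLOCK = 32768  # assumption: taken from "present 32768" with one block online


def zone_stats(online_blocks):
    """Return (spanned, present) page counts for a set of online block ids."""
    if not online_blocks:
        return 0, 0
    first, last = min(online_blocks), max(online_blocks)
    spanned = (last - first + 1) * PAGES_PER_BLOCK  # first..last, holes included
    present = len(online_blocks) * PAGES_PER_BLOCK  # online blocks only
    return spanned, present


def is_contiguous(online_blocks):
    """In this toy model a zone is contiguous when its span has no holes."""
    spanned, present = zone_stats(online_blocks)
    return spanned == present


zone = set()
history = []
steps = [("online", 41), ("online", 43), ("offline", 43), ("offline", 41)]
for action, block in steps:
    if action == "online":
        zone.add(block)
    else:
        zone.discard(block)
    history.append(zone_stats(zone))
    print(action, block, "->", history[-1], "contiguous:", is_contiguous(zone))
```

With blocks 41 and 43 online the span covers three blocks (41 to 43) while only two are present, which gives the 98304/65536 pair from the example. The model recomputes the span on every offline, which corresponds to the shrinking behaviour of the series. Keeping the old span after an offline would correspond to dropping the shrinking code as discussed above.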