Subject: Re: [PATCH v3 00/11] mm/memory_hotplug: Shrink zones before removing memory
From: David Hildenbrand
To: Michal Hocko
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 "Aneesh Kumar K . V", Andrew Morton, Dan Williams, Alexander Duyck,
 Alexander Potapenko, Andrey Konovalov, Andy Lutomirski,
 Anshuman Khandual, Arun KS, Benjamin Herrenschmidt, Borislav Petkov,
 Catalin Marinas, Christian Borntraeger, Christophe Leroy, Dave Airlie,
 Dave Hansen, Fenghua Yu, Gerald Schaefer, Greg Kroah-Hartman,
 Halil Pasic, Heiko Carstens, "H. Peter Anvin", Ingo Molnar, Ira Weiny,
 Jason Gunthorpe, John Hubbard, Jun Yao, "Kirill A. Shutemov",
 Logan Gunthorpe, Mark Rutland, Masahiro Yamada,
 "Matthew Wilcox (Oracle)", Mel Gorman, Michael Ellerman, Mike Rapoport,
 Oscar Salvador, Paul Mackerras, Pavel Tatashin, Peter Zijlstra,
 Qian Cai, Rich Felker, Robin Murphy, Souptick Joarder,
 Stephen Rothwell, Steve Capper, Thomas Gleixner, Tom Lendacky,
 Tony Luck, Vasily Gorbik, Vlastimil Babka, Wei Yang, Will Deacon,
 Yang Shi, Yoshinori Sato, Yu Zhao
References: <20190829070019.12714-1-david@redhat.com>
 <20190829082323.GT28313@dhcp22.suse.cz>
Organization: Red Hat GmbH
Message-ID: <90313ec8-a13e-5353-cc25-1c8993d5269c@redhat.com>
Date: Thu, 29 Aug 2019 14:08:48 +0200

On 29.08.19 13:43, David Hildenbrand wrote:
> On 29.08.19 13:33, David Hildenbrand wrote:
>> On 29.08.19 10:23, Michal Hocko wrote:
>>> On Thu 29-08-19 09:00:08, David Hildenbrand wrote:
>>>> This is the successor of "[PATCH v2 0/6] mm/memory_hotplug: Consider all
>>>> zones when removing memory". I decided to go one step further and finally
>>>> factor out the shrinking of zones from memory removal code. Zones are now
>>>> fixed up when offlining memory/onlining of memory fails/before removing
>>>> ZONE_DEVICE memory.
>>>
>>> I was about to say Yay! but then reading...
>>
>> Almost ;)
>>
>>>
>>>> Example:
>>>>
>>>> :/# cat /proc/zoneinfo
>>>> Node 1, zone  Movable
>>>>         spanned  0
>>>>         present  0
>>>>         managed  0
>>>> :/# echo "online_movable" > /sys/devices/system/memory/memory41/state
>>>> :/# echo "online_movable" > /sys/devices/system/memory/memory43/state
>>>> :/# cat /proc/zoneinfo
>>>> Node 1, zone  Movable
>>>>         spanned  98304
>>>>         present  65536
>>>>         managed  65536
>>>> :/# echo 0 > /sys/devices/system/memory/memory43/online
>>>> :/# cat /proc/zoneinfo
>>>> Node 1, zone  Movable
>>>>         spanned  32768
>>>>         present  32768
>>>>         managed  32768
>>>> :/# echo 0 > /sys/devices/system/memory/memory41/online
>>>> :/# cat /proc/zoneinfo
>>>> Node 1, zone  Movable
>>>>         spanned  0
>>>>         present  0
>>>>         managed  0
>>>
>>> ... this made me realize that you are trying to fix it instead. Could
>>> you explain why do we want to do that? Why don't we simply remove all
>>> that crap? Why do we even care about zone boundaries when offlining or
>>> removing memory? Zone shrinking was mostly necessary with the previous
>>> onlining semantic when the zone type could be only changed on the
>>> boundary or unassociated memory. We can interleave memory zones now
>>> arbitrarily.
>>
>> Last time I asked whether we can just drop all that nasty
>> zone->contiguous handling I was being told that it does have a
>> significant performance impact and is here to stay. The boundaries are a
>> key component to detect whether a zone is contiguous.
>>
>> So yes, while we allow interleaved memory zones, having contiguous zones
>> is beneficial for performance. That's why also memory onlining code will
>> try to online memory as default to the zone that will keep/make zones
>> contiguous.
>>
>> Anyhow, I think with this series most of the zone shrinking code becomes
>> "digestible". Except minor issues with ZONE_DEVICE - which is acceptable.
>>
>
> Also, there are plenty of other users of
> node_spanned_pages/zone_spanned_pages etc.. I don't think this can go -
> not that easy :)
>

... re-reading, your suggestion is to drop the zone _shrinking_ code
only, sorry :) That makes more sense.

This would mean that once a zone was !contiguous, it will always remain
like that. Also, even empty zones after unplug would not result in
zone_empty() == true.

I can see that some users of *_spanned_pages make certain assumptions
based on the size (snapshot, oom killer, ...), but that would already be
wrong in case the zone is very sparse.

I'll prepare something, then we can discuss.

-- 

Thanks,

David / dhildenb
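
The trade-off discussed above can be sketched with a toy userspace model.
This is only an illustration under stated assumptions: memory blocks of
32768 pages (taken from the /proc/zoneinfo example), and a hypothetical
ToyZone class with a simplified "contiguous" check (spanned == present).
None of these names are kernel API. It compares offlining with and
without shrinking the zone span.

```python
from dataclasses import dataclass, field

BLOCK_PAGES = 32768  # pages per memory block, as in the zoneinfo example


@dataclass
class ToyZone:
    """Hypothetical toy zone for illustration; not the kernel's struct zone."""
    shrink_on_offline: bool
    online: set = field(default_factory=set)  # indices of online blocks
    start: int = 0  # first block covered by the span
    end: int = 0    # one past the last block covered by the span

    def online_block(self, blk: int) -> None:
        # Onlining always grows the span to cover the new block.
        if self.spanned == 0:
            self.start, self.end = blk, blk + 1
        else:
            self.start = min(self.start, blk)
            self.end = max(self.end, blk + 1)
        self.online.add(blk)

    def offline_block(self, blk: int) -> None:
        self.online.discard(blk)
        if not self.shrink_on_offline:
            return  # span is left untouched ("drop the shrinking code")
        if self.online:
            self.start, self.end = min(self.online), max(self.online) + 1
        else:
            self.start = self.end = 0

    @property
    def spanned(self) -> int:
        return (self.end - self.start) * BLOCK_PAGES

    @property
    def present(self) -> int:
        return len(self.online) * BLOCK_PAGES

    @property
    def empty(self) -> bool:
        # Assumption: "empty" means a span of zero pages.
        return self.spanned == 0

    @property
    def contiguous(self) -> bool:
        # Simplification: no holes means every spanned page is present.
        return self.present > 0 and self.spanned == self.present


def run(shrink: bool):
    """Replay the example: online blocks 41 and 43, then offline 43 and 41."""
    zone = ToyZone(shrink_on_offline=shrink)
    zone.online_block(41)
    zone.online_block(43)
    snapshots = [(zone.spanned, zone.present)]
    zone.offline_block(43)
    snapshots.append((zone.spanned, zone.present))
    zone.offline_block(41)
    snapshots.append((zone.spanned, zone.present))
    return zone, snapshots


if __name__ == "__main__":
    for shrink in (True, False):
        zone, snapshots = run(shrink)
        print(f"shrink={shrink}: {snapshots}, empty={zone.empty}")
```

With shrinking, the (spanned, present) pairs follow the quoted example:
(98304, 65536), then (32768, 32768), then (0, 0). Without shrinking,
spanned stays at 98304 while present drops to 0, so in this model the
zone never reports as empty after unplug.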