Date: Wed, 15 Apr 2020 10:35:24 +0800
From: Baoquan He
To: David Hildenbrand
Cc: "Eric W. Biederman", Russell King - ARM Linux admin,
    Anshuman Khandual, Catalin Marinas, Bhupesh Sharma,
    kexec@lists.infradead.org, linux-mm@kvack.org, James Morse,
    Andrew Morton, Will Deacon, linux-arm-kernel@lists.infradead.org,
    linuxppc-dev@lists.ozlabs.org, piliu@redhat.com
Subject: Re: [PATCH 1/3] kexec: Prevent removal of memory in use by a loaded kexec image
Message-ID: <20200415023524.GG4247@MiWiFi-R3L-srv>
References: <20200412080836.GM25745@shell.armlinux.org.uk>
    <87wo6klbw0.fsf@x220.int.ebiederm.org>
    <20200413023701.GA20265@MiWiFi-R3L-srv>
    <871rorjzmc.fsf@x220.int.ebiederm.org>
    <20200414064031.GB4247@MiWiFi-R3L-srv>
    <86e96214-7053-340b-5c1a-ff97fb94d8e0@redhat.com>
    <20200414092201.GD4247@MiWiFi-R3L-srv>
    <20200414143912.GE4247@MiWiFi-R3L-srv>
    <0085f460-b0c7-b25f-36a7-fa3bafaab6fe@redhat.com>
In-Reply-To: <0085f460-b0c7-b25f-36a7-fa3bafaab6fe@redhat.com>

On 04/14/20 at 04:49pm, David Hildenbrand wrote:
> >>>>> The root cause is kexec-ed kernel is targeted at hotpluggable memory
> >>>>> region. Just avoiding the movable area can fix it. In kexec_file_load(),
> >>>>> just checking or picking those unmovable region to put kernel/initrd in
> >>>>> function locate_mem_hole_callback() can fix it. The page or pageblock's
> >>>>> zone is movable or not, it's easy to know. This fix doesn't need to
> >>>>> bother other component.
> >>>>
> >>>> I don't fully agree.
> >>>> E.g., just because memory is onlined to ZONE_NORMAL
> >>>> does not imply that it cannot get offlined and removed e.g., this is
> >>>> heavily used on ppc64, with 16MB sections.
> >>>
> >>> Really? I just know there are two kinds of mem hotplug in ppc, but don't
> >>> know the details. So in this case, is there any flag or a way to know
> >>> those memory block are hotpluggable? I am curious how those kernel data
> >>> is avoided to be put in this area. Or ppc just freely uses it for kernel
> >>> data or user space data, then try to migrate when hot remove?
> >>
> >> See
> >> arch/powerpc/platforms/pseries/hotplug-memory.c:dlpar_memory_remove_by_count()
> >>
> >> Under DLPAR, it can remove memory in LMB granularity, which is usually
> >> 16MB (== single section on ppc64). DLPAR will directly online all
> >> hotplugged memory (LMBs) from the kernel using device_online(), which
> >> will go to ZONE_NORMAL.
> >>
> >> When trying to remove memory, it simply scans for offlineable 16MB
> >> memory blocks (==section == LMB), offlines and removes them. No need for
> >> the movable zone and all the involved issues.
> >
> > Yes, this is a different one, thanks for pointing it out. It sounds like
> > balloon driver in virt platform, doesn't it?
>
> With DLPAR there is a hypervisor involved (which manages the actual HW
> DIMMs), so yes.
>
> >
> > Avoiding to put kexec kernel into movable zone can't solve this DLPAR
> > case as you said.
> >
> >>
> >> Now, the interesting question is, can we have LMBs added during boot
> >> (not via add_memory()), that will later be removed via remove_memory().
> >> IIRC, we had BUGs related to that, so I think yes. If a section contains
> >> no unmovable allocations (after boot), it can get removed.
> >
> > I do want to ask this question.
> > If we can add LMB into system RAM, then
> > reload kexec can solve it.
> >
> > Another better way is adding a common function to filter out the
> > movable zone when search position for kexec kernel, use a arch specific
> > function to filter out DLPAR memory blocks for ppc only. Over there,
> > we can simply use for_each_drmem_lmb() to do that.
>
> I was thinking about something similar. Maybe something like a notifier
> that can be used to test if selected memory can be used for kexec

I'm not sure I fully understand the notifier idea. If you mean:

1) Add a common function to pick memory in an unmovable zone;
2) Let DLPAR and balloon drivers register with the notifier;
3) In the common function, ask the registered parties to check whether
   the picked unmovable memory is available for placing the kexec kernel;

then it sounds doable to me, and not complicated.

> images. It would apply to
>
> - arm64 and filter out all hotadded memory (IIRC, only boot memory can
>   be used).

Do you mean hot-added memory after boot can't be recognized and added
into system RAM on arm64?

> - powerpc to filter out all LMBs that can be removed (assuming not all
>   memory corresponds to LMBs that can be removed, otherwise we're in
>   trouble ... :) )
> - virtio-mem to filter out all memory it added.
> - hyper-v to filter out partially backed memory blocks (esp. the last
>   memory block it added and only partially backed it by memory).
>
> This would make it work for kexec_file_load(), however, I do wonder how
> we would want to approach that from userspace kexec-tools when handling
> it from kexec_load().

Let's make kexec_file_load() work first, since this is only the first
step toward making the kexec-ed kernel not break memory hotplug. After
the kexec reboot, KASLR may also place the kernel in a hotpluggable
area.
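
To make the three steps above concrete, here is a rough userspace sketch
of how I picture the flow. It is not kernel code and not a proposed
patch: kexec_mem_veto_fn, kexec_register_veto(), kexec_range_usable(),
kexec_pick_range() and removable_lmb_veto() are all hypothetical names,
and the "movable" flag merely stands in for a real zone check.

```c
/*
 * Userspace illustration only. Every name below is hypothetical and
 * does not exist in the kernel; the LMB address range is made up.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct mem_range {
	uint64_t start;
	uint64_t end;		/* exclusive */
	bool movable;		/* stand-in for "lies in the movable zone" */
};

/* A veto callback returns true if the range must NOT hold a kexec image. */
typedef bool (*kexec_mem_veto_fn)(const struct mem_range *r);

#define MAX_VETO 8
static kexec_mem_veto_fn veto_chain[MAX_VETO];
static size_t veto_count;

/* Step 2: DLPAR, balloon drivers, etc. register a callback. */
static int kexec_register_veto(kexec_mem_veto_fn fn)
{
	assert(fn != NULL);
	if (veto_count >= MAX_VETO)
		return -1;
	veto_chain[veto_count++] = fn;
	return 0;
}

/* Steps 1 and 3: reject movable ranges, then ask every registered party. */
static bool kexec_range_usable(const struct mem_range *r)
{
	if (r->movable)
		return false;
	for (size_t i = 0; i < veto_count; i++)
		if (veto_chain[i](r))
			return false;
	return true;
}

/* Assumed DLPAR-like veto: pretend one 16MB LMB is removable. */
static bool removable_lmb_veto(const struct mem_range *r)
{
	const uint64_t lmb_start = 0x10000000, lmb_end = 0x11000000;

	/* Veto any candidate that overlaps the removable LMB. */
	return r->start < lmb_end && lmb_start < r->end;
}

/* Return the first usable candidate range, or NULL if none qualifies. */
static const struct mem_range *
kexec_pick_range(const struct mem_range *cand, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (kexec_range_usable(&cand[i]))
			return &cand[i];
	return NULL;
}
```

With removable_lmb_veto() registered, a candidate in the movable zone is
rejected by the common check and a candidate overlapping the removable
LMB is rejected by the callback, so kexec_pick_range() falls through to
the next plain unmovable range. In the kernel, the existing notifier
chain infrastructure might serve instead of the hand-rolled array here,
but that is only a guess at this point.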