Date: Mon, 19 Jun 2023 10:04:38 +0200
From: Greg KH
To: mawupeng
Cc: david@redhat.com, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org, richard.weiyang@linux.alibaba.com, mst@redhat.com, jasowang@redhat.com, pankaj.gupta.linux@gmail.com, mhocko@kernel.org, osalvador@suse.de
Subject: Re: [PATCH stable 5.10] mm/memory_hotplug: extend offline_and_remove_memory() to handle more than one memory block
Message-ID: <2023061936-mantra-pancreas-67d4@gregkh>
References: <20230619065121.1720912-1-mawupeng1@huawei.com> <2023061926-monoxide-pastor-fa3b@gregkh>

On Mon, Jun 19, 2023 at 03:53:40PM +0800, mawupeng wrote:
>
>
> On 2023/6/19 15:41, David Hildenbrand wrote:
> > On 19.06.23 09:22, mawupeng wrote:
> >>
> >>
> >> On 2023/6/19 15:16, Greg KH wrote:
> >>> On Mon, Jun 19, 2023 at 02:51:21PM +0800, Wupeng Ma wrote:
> >>>> From: David Hildenbrand
> >>>>
> >>>> commit 8dc4bb58a146655eb057247d7c9d19e73928715b upstream.
> >>>>
> >>>> virtio-mem soon wants to use offline_and_remove_memory() memory that
> >>>> exceeds a single Linux memory block (memory_block_size_bytes()). Let's
> >>>> remove that restriction.
> >>>>
> >>>> Let's remember the old state and try to restore that if anything goes
> >>>> wrong. While re-onlining can, in general, fail, it's highly unlikely to
> >>>> happen (usually only when a notifier fails to allocate memory, and these
> >>>> are rather rare).
> >>>>
> >>>> This will be used by virtio-mem to offline+remove memory ranges that are
> >>>> bigger than a single memory block - for example, with a device block
> >>>> size of 1 GiB (e.g., gigantic pages in the hypervisor) and a Linux memory
> >>>> block size of 128MB.
> >>>>
> >>>> While we could compress the state into 2 bit, using 8 bit is much
> >>>> easier.
> >>>>
> >>>> This handling is similar, but different to acpi_scan_try_to_offline():
> >>>>
> >>>> a) We don't try to offline twice. I am not sure if this CONFIG_MEMCG
> >>>> optimization is still relevant - it should only apply to ZONE_NORMAL
> >>>> (where we have no guarantees). If relevant, we can always add it.
> >>>>
> >>>> b) acpi_scan_try_to_offline() simply onlines all memory in case
> >>>> something goes wrong. It doesn't restore previous online type. Let's do
> >>>> that, so we won't overwrite what e.g., user space configured.
> >>>>
> >>>> Reviewed-by: Wei Yang
> >>>> Cc: "Michael S. Tsirkin"
> >>>> Cc: Jason Wang
> >>>> Cc: Pankaj Gupta
> >>>> Cc: Michal Hocko
> >>>> Cc: Oscar Salvador
> >>>> Cc: Wei Yang
> >>>> Cc: Andrew Morton
> >>>> Signed-off-by: David Hildenbrand
> >>>> Link: https://lore.kernel.org/r/20201112133815.13332-28-david@redhat.com
> >>>> Signed-off-by: Michael S. Tsirkin
> >>>> Acked-by: Andrew Morton
> >>>> Signed-off-by: Ma Wupeng
> >>>> ---
> >>>>   mm/memory_hotplug.c | 105 +++++++++++++++++++++++++++++++++++++-------
> >>>>   1 file changed, 89 insertions(+), 16 deletions(-)
> >>>>
> >>>
> >>> Why is this needed in 5.10.y?  Looks like a new feature to me, what
> >>> problem does it solve there?
> >>>
> >>> thanks,
> >>>
> >>> greg k-h
> >>
> >> It do introduce a new feature. But at the same time, it fix a memleak introduced
> >> in Commit 08b3acd7a68f ("mm/memory_hotplug: Introduce offline_and_remove_memory()"
> >>
> >> Our test find a memleak in init_memory_block, it is clear that mem is never
> >> been released due to wrong refcount. Commit 08b3acd7a68f ("mm/memory_hotplug:
> >> Introduce offline_and_remove_memory()") failed to dec refcount after
> >> find_memory_block which fail to dec refcount to zero in remove memory
> >> causing the leak.
> >>
> >> Commit 8dc4bb58a146 ("mm/memory_hotplug: extend offline_and_remove_memory()
> >> to handle more than one memory block") introduce walk_memory_blocks to
> >> replace find_memory_block which dec refcount by calling put_device after
> >> find_memory_block_by_id. In the way, the memleak is fixed.
> >>
> >> Here is the simplified calltrace:
> >>
> >>    kmem_cache_alloc_trace+0x664/0xed0
> >>    init_memory_block+0x8c/0x170
> >>    create_memory_block_devices+0xa4/0x150
> >>    add_memory_resource+0x188/0x530
> >>    __add_memory+0x78/0x104
> >>    add_memory+0x6c/0xb0
> >>
> >
> > Makes sense to me. Of course, we could think about a simplified stable fix that only drops the ref.
>
> Since the new patch does not introduce any kabi change, maybe we can merge this one?

Stable kernels never care about "kabi"; that is a made-up thing that
only some distros work to enforce.  It has nothing to do with the
community.

And I will always prefer to take the real commit that is in Linus's tree
over any "custom" patch, as custom changes are wrong 90%+ of the time.

thanks,

greg k-h
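
The refcount leak described in the quoted report can be illustrated with a small userspace sketch. This is not kernel code: struct block, block_create(), block_find(), block_put() and offline_and_remove() are hypothetical stand-ins, assumed here only to model a lookup that takes a reference (as the report says find_memory_block does) and a removal path that drops the creation reference. If the lookup reference is never dropped, the count cannot reach zero and the object is never released.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted memory block device. */
struct block {
    int refcount;
    int *freed;     /* set to 1 when the last reference is dropped */
};

/* Create a block holding one initial ("creation") reference. */
static struct block *block_create(int *freed)
{
    struct block *b = malloc(sizeof(*b));

    if (!b)
        return NULL;
    b->refcount = 1;
    b->freed = freed;
    *freed = 0;
    return b;
}

/* Lookup that hands out an extra reference, which the caller must drop. */
static struct block *block_find(struct block *b)
{
    b->refcount++;
    return b;
}

/* Drop one reference; release the block when the count reaches zero. */
static void block_put(struct block *b)
{
    assert(b->refcount > 0);
    if (--b->refcount == 0) {
        *b->freed = 1;
        free(b);
    }
}

/*
 * Model of an offline+remove path.
 * Returns 1 if removal released the block, 0 if it leaked, -1 on
 * allocation failure.
 */
static int offline_and_remove(int drop_lookup_ref)
{
    int freed;
    struct block *b = block_create(&freed);
    struct block *found;

    if (!b)
        return -1;

    found = block_find(b);          /* takes a second reference */
    /* ... offlining would happen here ... */
    if (drop_lookup_ref)
        block_put(found);           /* the step the report says was missing */

    block_put(b);                   /* removal drops the creation reference */

    if (!freed)
        free(b);                    /* tidy up the demo's leaked object */
    return freed;
}
```

With these assumptions, offline_and_remove(0) models the leaky path and returns 0, while offline_and_remove(1) drops the lookup reference first and returns 1.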