Date: Tue, 2 Nov 2021 15:35:32 +0100
From: Michal Hocko
To: Oscar Salvador
Cc: David Hildenbrand, Alexey Makhalov, "linux-mm@kvack.org", Andrew Morton, "linux-kernel@vger.kernel.org", "stable@vger.kernel.org", Oscar Salvador
Subject: Re: [PATCH] mm: fix panic in __alloc_pages
References: <42abfba6-b27e-ca8b-8cdf-883a9398b506@redhat.com> <20211102135201.GA4348@linux>
In-Reply-To: <20211102135201.GA4348@linux>

On Tue 02-11-21 14:52:01, Oscar Salvador wrote:
> On Tue, Nov 02, 2021 at 02:25:03PM +0100, Michal Hocko wrote:
> > I think we want to learn how exactly Alexey brought that cpu up.
> > Because his initial thought on add_cpu resp cpu_up doesn't seem to be correct.
> > Or I am just not following the code properly. Once we know all those
> > details we can get in touch with cpu hotplug maintainers and see what
> > can we do.
>
> I am not really familiar with CPU hot-onlining, but I have been taking a look.
> As with memory, there are two different stages, hot-adding and onlining (and the
> counterparts).
>
> Part of the hot-adding being:
>
> acpi_processor_get_info
>  acpi_processor_hotadd_init
>   arch_register_cpu
>    register_cpu
>
> One of the things that register_cpu() does is to set cpu->dev.bus pointing to
> &cpu_subsys, which is:
>
> struct bus_type cpu_subsys = {
> 	.name = "cpu",
> 	.dev_name = "cpu",
> 	.match = cpu_subsys_match,
> #ifdef CONFIG_HOTPLUG_CPU
> 	.online = cpu_subsys_online,
> 	.offline = cpu_subsys_offline,
> #endif
> };
>
> Then, the onlining part (in case of a udev rule or someone onlining the device)
> would be:
>
> online_store
>  device_online
>   cpu_subsys_online
>    cpu_device_up
>     cpu_up
>      ...
>      online node
>
> Since Alexey disabled the udev rule and no one onlined the CPU, online_store()->
> device_online() wasn't really called.
>
> The following only applies to x86_64:
> I think we got confused because cpu_device_up() is also called from add_cpu(),
> but that is an exported function and x86 does not call add_cpu() unless for
> debugging purposes (check kernel/torture.c and arch/x86/kernel/topology.c).
> It does the onlining through online_store()...
> So we can take add_cpu() off the equation here.

Yes, so the real problem is this (thanks for pointing me to the acpi
code): the cpu->node association is done in acpi_map_cpu2node() and I
suspect this expects that the node is already present, as it gets the
information from the SRAT/PXM tables which are parsed during boot. But I
might be just confused, or maybe VMware just injects new entries here
somehow.

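To make the two stages and the cpu->node association a bit more concrete, here is a toy userspace model in C. It is not kernel code and does not claim to mirror what acpi_map_cpu2node() or cpu_up() really do. Every toy_* name, the table contents and the array sizes are made up for illustration. The sketch assumes that hot-add only records a node taken from a table filled at boot, that a CPU without a table entry keeps a zero default, and that the node is brought online only in the separate onlining step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CPUS  8
#define TOY_MAX_NODES 4
#define TOY_NO_NODE   (-1)

/* Hypothetical stand-in for proximity data parsed from SRAT at boot. */
struct toy_srat_entry {
	int cpu;
	int node;
};

static const struct toy_srat_entry toy_srat[] = {
	{ .cpu = 0, .node = 0 },
	{ .cpu = 1, .node = 0 },
	{ .cpu = 4, .node = 1 },
};

/* Zero-initialised, so an unmapped CPU defaults to node 0 (assumption). */
static int toy_cpu_to_node[TOY_MAX_CPUS];
static bool toy_cpu_registered[TOY_MAX_CPUS];
/* Only node 0 is online after "boot" in this model. */
static bool toy_node_online[TOY_MAX_NODES] = { [0] = true };

/* Return the node recorded for @cpu, or TOY_NO_NODE if there is no entry. */
static int toy_srat_lookup(int cpu)
{
	for (size_t i = 0; i < sizeof(toy_srat) / sizeof(toy_srat[0]); i++) {
		if (toy_srat[i].cpu == cpu)
			return toy_srat[i].node;
	}
	return TOY_NO_NODE;
}

/* Stage 1, hot-add: record the association, do NOT online the node. */
static void toy_hotadd_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_MAX_CPUS);

	int node = toy_srat_lookup(cpu);

	/* No entry found: skip the association and keep the default. */
	if (node != TOY_NO_NODE)
		toy_cpu_to_node[cpu] = node;
	toy_cpu_registered[cpu] = true;
}

/* Stage 2, onlining: only here does the CPU's node come online. */
static bool toy_online_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_MAX_CPUS);

	if (!toy_cpu_registered[cpu])
		return false;
	toy_node_online[toy_cpu_to_node[cpu]] = true;
	return true;
}

/* Would an allocation aimed at @cpu's node find an online node? */
static bool toy_cpu_node_usable(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_MAX_CPUS);
	return toy_node_online[toy_cpu_to_node[cpu]];
}
```

In this model, calling toy_hotadd_cpu(4) without a following toy_online_cpu(4) leaves CPU 4 pointing at node 1 while that node is still offline, so toy_cpu_node_usable(4) is false. That is the kind of window suspected above for a CPU that was hot-added but never onlined.
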
Another interesting thing is that acpi_map_cpu2node() skips over the
association if there is no node found in SRAT, but that should only mean
it would use the default initialization, which should hopefully be 0.

Anyway, I have found in my notes
https://www.spinics.net/lists/kernel/msg3010886.html
which is a slightly different problem, but it has some notes about how
the initialization mess works (that one was boot time though, and hotplug
might actually be different).

I have run out of time for this today, so hopefully somebody can re-learn
that from there...
-- 
Michal Hocko
SUSE Labs